When Harm Is a Feature
What happens when ethical collapse is built into the system?

This month begins with the leak of a 200-page internal Meta document, published by Reuters, detailing the company’s AI behaviour guidelines. The content wasn’t speculative or accidental. It showed what had been approved: chatbot responses that flirt with children, amplify racist pseudoscience, and produce false medical claims, all within the rules.
When this kind of harm is formalised, not flagged, it raises a harder question: what kind of system is this, and what does it mean when harm becomes just another setting?

A Real-World Warning
In August, Reuters reported on a leaked 200-page internal Meta document, which outlined approved AI chatbot behaviours. The investigation quoted directly from the document, which Meta confirmed as authentic.
The contents were staggering. In black-and-white policy, Meta approved scenarios where bots could flirt with underage users, romanticise the bodies of children, and echo racist pseudoscience, all while remaining within what the company deemed “acceptable.”
The document wasn’t theoretical. It was reviewed by Meta’s legal, engineering, and policy teams, including its chief ethicist. These weren’t hallucinations or accidents. They were decisions.
And they were only quietly walked back after being publicly exposed.
What stood out most wasn’t just the deeply harmful content, but how calmly it had been normalised.
Romantic language directed at minors.
Racial superiority arguments.
Bots that generate images of violence, but stop just short of gore.
It was all there, mapped out in a document designed to guide internal enforcement.
The company’s defence? That these outputs weren’t ideal, but they were manageable. That the examples “in question” were removed. That the enforcement, in their words, had simply been “inconsistent.”
But inconsistency isn’t the core issue. Acceptance is.
Because this wasn’t a rogue experiment, it was infrastructure. The standards themselves permitted this behaviour, and someone, likely several someones, signed off.
This is the quiet architecture of harm: not just what the system produces, but what it was built to tolerate.
And here’s what’s most unsettling: These policies didn’t slip through the cracks, they were the cracks.

When harm is normalised, what still interrupts it?
This isn’t a call to fear AI. It’s a call to remember what it can’t do.
It can’t pause.
It can’t push back.
It can’t ask whether something should exist in the first place.
That work is still human.
Whether you're leading, designing, or using systems that touch real lives, here are three questions worth asking, and asking again.
1. What kind of harm are we quietly optimising for?
All systems make trade-offs. Some of them hurt people.
The real question is: which harms get tolerated? Which ones get flagged?
And who decides what's an acceptable cost?
2. Who gets to say “this shouldn’t exist”, and are they protected when they do?
Designers, moderators, junior staff, researchers…
Do they have veto power? Moral voice? Any structural backup?
Or does momentum override caution, again and again?
3. Are we solving for reputation or responsibility?
If something harmful appears, what happens next?
Is it patched? Reframed? Removed from sight?
Or does someone actually stop and ask how it got there, and what else still remains?
Want to check your own system?
If you're building or stewarding tools, teams, or decisions that touch real lives, I’ve built a short diagnostic to help organisations assess how well they’re governing AI and language model tools.
It’s not a judgment, just a prompt for clarity: Are we asking the right questions? Are we protecting the right people?
You can take it here:
AI Safety Readiness Diagnostic
(If you're not in a decision-making role, you can still try it based on what you've seen inside your own system, or even make up a scenario. It helps me test the tool, and it might surface something worth noticing.)

When harm becomes infrastructure
Diane Vaughan, a sociologist known for her work on risk, culture, and systemic failure in complex organisations, described normalised deviance as the slow erosion of safety standards, not through dramatic collapse, but through small compromises that become routine. She coined the term while investigating the Challenger disaster, where NASA engineers had grown accustomed to technical risks that would later prove catastrophic. Over time, practices that would once have been unthinkable became normal, not because people stopped caring, but because the system stopped reacting.
That same pattern shows up far beyond aerospace.
When Meta’s AI chatbot guidelines surfaced, they weren’t framed as a breach. They were policy. Reviewed. Logged. Approved by people with responsibility for safety and ethics. This wasn’t a hallucination. It was architecture. And when that happens inside a system, the most useful question is rarely: How did this get through?
It’s: What made this survivable inside the culture?
This is where Bandura’s concept of moral disengagement comes in. In his research on aggression and self-regulation, he explored how people who consider themselves ethical can still participate in harm. It’s not that they lose their values. It’s that they gradually disconnect from consequence.
Harm becomes abstract.
Responsibility gets distributed.
Language softens what’s real.
“People do not ordinarily engage in reprehensible conduct until they have justified to themselves the morality of their actions.”
— Albert Bandura
Organisations often accelerate this process. The larger the system, the easier it becomes to frame harm as a compliance issue, a policy question, a reputational risk. Decisions are made in documents, not conversations. And when speed or scale is the priority, slowing down for ethics feels inefficient.
At some point, the question shifts from: Should we do this? to: Can we defend it?
That shift doesn’t always feel dramatic. Sometimes it sounds like:
“It’s technically allowed.”
“Legal reviewed it.”
“We’ll revise it later if needed.”
Over time, organisations can stop recognising harm as harm. What once would have triggered concern starts to look like a manageable risk, or just another line item in a workflow. Safety gets reframed as a compliance task. Responsibility becomes something to delegate. And sometimes, as the Meta case showed, harm doesn’t creep in slowly at all. It’s documented, approved, and shipped, not because no one noticed, but because the system had already stopped asking whether it should.
What you end up with is a system that looks efficient, but behaves like it’s missing a layer of conscience. A machine with no internal brake. A team with no cultural permission to say: This shouldn’t exist.
And if that sounds familiar, if you’ve been in rooms where “moving fast” meant skipping something vital, then you already know this isn’t just a tech problem. It’s a trust problem. It’s a leadership problem. It’s a problem of what systems protect by default, and who pays the price when they don’t.
Where this gets real
If your team is navigating trust, decision-making, or unease around AI, you’re not alone. This is exactly the kind of work I now support through organisational diagnostics, advisory, and facilitated reflection. Find the approach here

What the Meta story shows, and what too many systems quietly rely on, is that harm can be fully visible, fully understood, and still allowed.
Not always because people are malicious, but:
Because structure lets them look away.
Because no one’s job is to stop it.
Because it’s easier to adjust the language than to name the truth.
But systems don’t drift toward care on their own, someone has to insist.
Someone has to say:
This isn’t safe.
This isn’t neutral.
This isn’t acceptable just because it’s legal, scalable, or on-brand.
That doesn’t require outrage. It requires accuracy.
And if you’re in the room, in product, in policy, in leadership, in language, you might be the only one who sees it early enough to stop it from becoming policy.
That’s not a burden. That’s the job.
Until next time
You don’t have to catch every failure. But when harm is treated as infrastructure, it’s worth asking what’s being reinforced, and who keeps approving it.
Safety isn’t a feature you can toggle. Ethics isn’t a branding exercise. And responsibility doesn’t scale just because the system does.
If anything in this edition left you uneasy, hold onto that. Not to over-analyse, but because discomfort is often a signal that something real is being named.
That kind of noticing matters. Especially in systems that are built to look smooth on the surface.
Thanks for reading,
Daniel

Where the work leads
If this edition struck a chord, and you’re holding similar questions inside your organisation, I now offer structured support through behavioural insight, strategic advisory, and culture-focused diagnostics around AI and trust.

The Reframing Room
The Reframing Room is a set of structured, psychologically grounded offers to help people navigate tension, change, and emotional stuckness — together. It’s designed for leadership teams, partnerships, boards, communities, and groups of any kind trying to move forward when something feels misaligned. You can explore the full offer, including example sessions and the downloadable brochure, at:
And if you’d like to have a quiet conversation about what’s happening in your group, I’d be happy to listen.

Need a more personal space to work through this?
Some people read this and feel it’s organisational. Others read it and realise they need a space for themselves. To think. To reframe. To practise showing up differently. I offer one-to-one coaching for professionals navigating change, including career coaching, executive communication in English, or the important work of becoming more fully yourself at work. If that’s where you are, you can book a 15-minute intro call here: https://calendar.app.google/rkUSYjRysGgpmV7V9
Or explore more at www.danieldixon.net