AI-first is a consensus machine. Here is what we do instead.
Two teams. One in San Francisco. One in Hangzhou. Both built competing AI systems from scratch. Independent companies. Independent architectures. Independent training data.
Last year, researchers asked both systems to write a product description for an iPhone case. GPT-4o produced: "Elevate your iPhone with our sleek, without compromising bold, eye-catching design." DeepSeek-V3 produced the same phrase, word for word. The measured similarity between the two outputs: 81%.
The research team called it the Artificial Hivemind. It won Best Paper at NeurIPS 2025, the most prestigious AI research conference in the world.
Different tools. Different companies. Different continents. Same words.
The architecture of sameness
Every agency I know is currently repositioning as AI-first. I understand the logic. AI has fundamentally changed how fast work can move. We use it every day across strategy, content, development, and analysis. The productivity shift is real and it is not reversing.
But AI-first as a strategic position misunderstands what AI actually produces.
The research is now clear on this. When researchers at the University of Washington studied over 70 language models using 26,000 real-world queries, the outputs did not spread across a wide conceptual space. They collapsed. Ask any of these models to write a metaphor about time and you get two clusters: time is a river, or time is a weaver. Ask them to write brand copy and they reach for the same constructions across every language, every market, every category.
And the fix most agencies are selling (better prompts, stronger brand guidelines, adjusted parameters) does not work. A PNAS Nexus study published in March 2026 tested every intervention. None of them restored genuine diversity. Raise the temperature setting and you get gibberish. There is no dial that produces original thinking.
The tools are architecturally designed to predict the most probable next word. They are, by construction, consensus machines. AI-first without a human deciding what to say before the model starts writing produces output that sounds like every other company using the same tools.
Which is most of them.
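To make the temperature point concrete, here is a minimal Python sketch of how a temperature setting reshapes next-word probabilities. The candidate words and their scores are invented for illustration, and real models choose among tens of thousands of tokens, so treat this as a toy under those assumptions rather than a description of how any particular model is implemented.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw next-word scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for completing "time is a ..." (not taken from any study)
candidates = ["river", "weaver", "thief", "spreadsheet"]
logits = [4.0, 3.5, 1.0, -2.0]

for t in (0.5, 1.0, 5.0):
    probs = softmax_with_temperature(logits, t)
    print(t, {word: round(p, 3) for word, p in zip(candidates, probs)})
```

In this toy example the ranking never changes: "river" stays on top at every temperature. A higher setting only moves probability toward unlikely words such as "spreadsheet", which is consistent with the pattern described above, where turning the dial adds noise instead of a point of view.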
What we do instead
We decided not to call ourselves AI-first. Not because we use AI less, but because we use it differently.
Before any work begins, we ask three questions. Why are we doing this? What is the real goal? What is the best way to do this?
Sometimes the answer to the third question is AI-first. Sometimes it is AI-assisted. Sometimes it is not using AI at all. The question drives the choice, not the other way around.
This is what we mean by AI-Enabled, Human-Led.
The enabled part is genuine. We use AI across everything we do. It accelerates research, structures thinking, drafts at scale, and finds patterns in data that would take weeks to surface manually. Speed and volume are AI's strongest capabilities and we use both without hesitation.
The led part is where the value lives.
Where AI runs out
When Verizon published their 2025 customer experience report, one number stood out. Customer satisfaction with AI-driven interactions: 60%. With human-led interactions: 88%. That is not a small gap. That is the distance between acceptable and trusted.
It is a number we recognise. We have always believed our CS team is one of the most important assets in the business. They are the nerve connecting us to our clients, and we have no intention of replacing that nerve with a chat flow. We are doing the opposite. In August, we welcome a new colleague to the team.
When Ford ran into quality problems they could not resolve, they rehired 350 veteran engineers. Not because AI failed to work. Because the judgment those engineers carried, built from years of domain experience, could not be transferred to a model.
When Klaviyo surveyed 8,000 consumers across eight countries in 2026, they found that when people identify AI-generated brand content, they are four times more likely to trust the brand less than to trust it more. And half of consumers can now correctly identify it.
These are not arguments against AI. They are arguments for knowing where AI runs out.
Taste
There is a concept we think about a lot at ted&gustaf. We call it taste. We are not the first to talk about taste. I believe Steven Bartlett used the term a while back, and it really stuck with me.
Taste is not aesthetics. It is not having an opinion about fonts or colour palettes. Taste is the accumulated judgment that comes from years of doing the work and understanding what makes something right rather than merely finished. It is the capacity to look at something and know, without being able to fully explain why, that it could be better.
AI can produce finished. AI cannot produce right.
The distance between the two is where competitive advantage lives. It is the final stretch from competent to exceptional. It requires a human who has seen enough, thought enough, and cared enough to know the difference.
The Artificial Hivemind study ended with an observation that stayed with me. "If you want to sound different," the researchers concluded, "you cannot get there by using AI differently. You need to have something different to say before the AI ever touches it."
The point of view has to exist first.
This is what ted&gustaf has spent the last few months thinking about and building toward. An intelligence system that captures how we see the world, what we believe, how we make decisions. Not so AI can replace that thinking. So AI can work from it.
We are AI-Enabled. The models do the work they are built to do.
We are Human-Led. Someone with taste and judgment decides what the work should actually say.
Speed and volume are AI's A-game. Quality, judgment, and getting the right thing done still cost what they always cost.
We think that is worth saying out loud.
Do you agree or do you have another opinion? Please let me know so we can continue the discussion.
References
- Jiang et al., "The Artificial Hivemind" — Best Paper at NeurIPS 2025, University of Washington, Carnegie Mellon University, Allen Institute for AI. arxiv.org/abs/2510.22954
- Wenger & Kenett, "Large language models are homogeneously creative" — PNAS Nexus, March 2026. academic.oup.com/pnasnexus
- The State of Brand, "The great flattening, part 2: the data is worse than the anecdotes" — May 2026. thestateofbrand.com
- Verizon, "2025 CX annual insights report" — Verizon Business, 2025.
- Klaviyo, "2026 AI consumer trends report" — Klaviyo, 2026. klaviyo.com
- The State of Brand, "Ford rehired 350 veteran engineers after AI couldn't fix its quality problems" — June 2026. thestateofbrand.com
Gustaf Lindqvist
Gustaf Lindqvist is the co-founder of ted&gustaf, a digital advisory partner working with ambitious organisations across the Nordics. He spends his time at the intersection of strategy, technology, and brand, helping leaders make better decisions before the work begins.
