TL;DR

A recent Google whitepaper argues that the most critical aspect of AI-driven software development isn’t the AI model itself but the surrounding harness and context engineering. This shift impacts how companies should allocate resources and develop AI systems, emphasizing configuration and verification over model size.

A new Google whitepaper, “The New SDLC With Vibe Coding,” makes a counterintuitive claim: the AI model accounts for only about 10% of an agent’s behavior, while the harness and context engineering account for the remaining 90%. That challenges the common habit of chasing model upgrades and points to a different strategy for building and deploying AI systems.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, emphasizes that the key to effective AI integration is not solely the choice of models but how they are configured and managed within the system. They argue that most failures or misbehaviors in AI agents are configuration issues—missing tools, vague rules, or noisy context—rather than the models themselves.

Public benchmark results show that changing the harness (prompts, tools, and middleware) can significantly improve performance even when the model stays the same. In the example the paper cites, one team moved a coding agent from outside the top 30 into the top five on Terminal Bench 2.0 by changing only the harness, not the model. The practical point is that most of the surface area a team controls lies in configuration and context engineering.

The paper also stresses the importance of context engineering, which involves providing the AI with relevant instructions, knowledge, examples, tools, and guardrails. The authors introduce the concept of Agent Skills, where procedural knowledge is loaded only when needed, reducing token costs and increasing flexibility. This approach shifts focus from clever prompting to strategic context management.
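
The load-on-demand idea behind Agent Skills can be sketched in a few lines. This is a minimal illustration, not the paper's or any vendor's implementation: the `Skill` and `SkillRegistry` names, the loader callables, and the four-characters-per-token estimate are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Skill:
    """Procedural knowledge: a short description that is always visible,
    plus a body that is only loaded when a task needs it."""
    name: str
    description: str
    load_body: Callable[[], str]  # hypothetical loader, e.g. reads a file


@dataclass
class SkillRegistry:
    skills: Dict[str, Skill] = field(default_factory=dict)

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def index(self) -> str:
        """Cheap one-line-per-skill summary that goes into every prompt."""
        return "\n".join(f"{s.name}: {s.description}" for s in self.skills.values())

    def build_context(self, task: str, needed: List[str]) -> str:
        """Always include the index; load full bodies only for needed skills."""
        bodies = [self.skills[name].load_body() for name in needed]
        return "\n\n".join([self.index(), *bodies, f"Task: {task}"])


def rough_tokens(text: str) -> int:
    # Crude assumption: about four characters per token.
    return len(text) // 4


registry = SkillRegistry()
registry.register(Skill("db-migration", "How to write and apply schema migrations",
                        lambda: "MIGRATION STEPS " * 200))
registry.register(Skill("release", "How to cut and tag a release",
                        lambda: "RELEASE STEPS " * 200))

# Load only what the task needs versus loading everything up front.
lazy = registry.build_context("Add a column to users", needed=["db-migration"])
eager = registry.build_context("Add a column to users", needed=["db-migration", "release"])
print(rough_tokens(lazy), rough_tokens(eager))
```

The agent still sees that a release skill exists, because the index is always present, but it only pays the token cost of the full body when a task calls for it.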

Economically, the paper warns that vibe coding—quick prompts and minimal review—may seem cheap but incurs higher operational costs due to token overuse, maintenance, and security vulnerabilities. Conversely, disciplined, agentic engineering requires upfront investment but offers lower marginal costs over time.
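
The trade-off can be made concrete with a simple linear cost model. The sketch below is illustrative only: every dollar figure is a made-up assumption, and the vibe-coding per-feature cost is set to 3× the disciplined one, the low end of the 3–10× range this article cites.

```python
def total_cost(upfront: float, per_feature: float, features: int) -> float:
    """Total cost = one-off investment (CapEx) + running cost per feature (OpEx)."""
    return upfront + per_feature * features


def break_even_features(upfront_a: float, per_feature_a: float,
                        upfront_b: float, per_feature_b: float) -> float:
    """Number of features at which approach B (higher upfront, lower running
    cost) becomes cheaper than approach A."""
    if per_feature_a <= per_feature_b:
        raise ValueError("B never catches up: its per-feature cost is not lower")
    return (upfront_b - upfront_a) / (per_feature_a - per_feature_b)


# Hypothetical figures, chosen only to show the shape of the curve.
VIBE_UPFRONT, VIBE_PER_FEATURE = 0.0, 300.0          # low CapEx, high OpEx
AGENTIC_UPFRONT, AGENTIC_PER_FEATURE = 5000.0, 100.0  # high CapEx, low OpEx

crossover = break_even_features(VIBE_UPFRONT, VIBE_PER_FEATURE,
                                AGENTIC_UPFRONT, AGENTIC_PER_FEATURE)
print(f"Break-even after {crossover:.0f} features")
for n in (10, 25, 50):
    print(n, total_cost(VIBE_UPFRONT, VIBE_PER_FEATURE, n),
          total_cost(AGENTIC_UPFRONT, AGENTIC_PER_FEATURE, n))
```

Under these assumed numbers the two approaches cost the same at 25 features; beyond that point the disciplined approach is cheaper, and the gap widens with every feature shipped.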

At a glance
When: published May 2026
The development: Google’s new whitepaper emphasizes that in AI development, the model constitutes only about 10% of the system’s behavior, with the harness and context engineering comprising 90%.
The Model Is Only 10% — The New SDLC With Vibe Coding
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts; “does it seem to work?”; disposable code; high risk.
Structured AI-Assisted: detailed prompts plus constraints; manual testing; features in real codebases.
Agentic Engineering: formal specs; automated tests, evals, and CI gates; production scale; low risk.
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
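
A rough sketch of how tests and evals can sit side by side in one gate. Everything here is a stand-in chosen for illustration: `slugify` plays the role of deterministic code, the keyword-and-length rubric is a deliberately simple eval, and the 0.8 threshold is an arbitrary assumption.

```python
from typing import List


def slugify(title: str) -> str:
    """Deterministic code under test (stands in for agent-written code)."""
    return "-".join(title.lower().split())


def run_tests() -> bool:
    """Tests: exact assertions on deterministic behavior."""
    return slugify("Hello World") == "hello-world" and slugify("  A  B ") == "a-b"


def eval_summary(summary: str, must_mention: List[str], max_words: int) -> float:
    """Eval: score a free-text output against a rubric, from 0.0 to 1.0."""
    text = summary.lower()
    hits = sum(1 for term in must_mention if term.lower() in text)
    coverage = hits / len(must_mention)
    length_ok = 1.0 if len(summary.split()) <= max_words else 0.0
    return 0.8 * coverage + 0.2 * length_ok


def ci_gate(tests_pass: bool, eval_scores: List[float], threshold: float = 0.8) -> bool:
    """Allow the merge only if tests pass AND the mean eval score clears the bar."""
    if not tests_pass or not eval_scores:
        return False
    return sum(eval_scores) / len(eval_scores) >= threshold


good = eval_summary("The harness, not the model, drives most agent behavior.",
                    must_mention=["harness", "model"], max_words=20)
weak = eval_summary("Models matter.", must_mention=["harness", "model"], max_words=20)

print(ci_gate(run_tests(), [good]))  # passes both checks
print(ci_gate(run_tests(), [weak]))  # tests pass, eval does not
```

In practice the eval side is often a larger rubric or a model-graded check, but the gate has the same shape: both signals must pass before anything ships.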
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10% of agent behavior.
Harness: ~90% (prompts, tools, context, hooks, sandboxes, observability). This is your surface area, not the provider’s.
Example: a coding agent moved from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
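
One way to picture “Agent = Model + Harness” is as two separate objects, where only the harness changes between runs. The sketch below is a toy under stated assumptions: `stub_model` is a fake stand-in for a real model call, and the `Harness` fields cover only part of what the paper lists.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Model = Callable[[str], str]  # hypothetical interface: prompt in, text out


@dataclass
class Harness:
    """Everything around the model that the team controls."""
    system_prompt: str
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    context: List[str] = field(default_factory=list)

    def build_prompt(self, task: str) -> str:
        tool_list = ", ".join(sorted(self.tools)) or "none"
        return "\n".join([self.system_prompt, f"Tools: {tool_list}",
                          *self.context, f"Task: {task}"])


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        return self.model(self.harness.build_prompt(task))


def stub_model(prompt: str) -> str:
    """Fake model: it can only verify its work if a test tool is offered."""
    return "ran tests, fixed bug" if "run_tests" in prompt else "guessed a fix"


bare = Agent(stub_model, Harness("You are a coding agent."))
tuned = Agent(stub_model, Harness(
    "You are a coding agent. Verify before answering.",
    tools={"run_tests": lambda cmd: "ok"},
    context=["Repo uses pytest."],
))

task = "Fix the failing build"
print(bare.run(task))   # same model, missing tool
print(tuned.run(task))  # same model, better harness
```

The model object is identical in both agents; the different outcome comes entirely from the tool and context the harness supplies, which is the configuration-failure point in miniature.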
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: low CapEx, high OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx, low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success, plus intelligent model routing (cheap models for the easy work).
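
Model routing can be sketched as picking the cheapest tier that is allowed to handle a task. The tier names, prices, and keyword-based difficulty heuristic below are all hypothetical; a real router would need a much better difficulty signal.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical tier label, not a real model ID
    cost_per_1k_tokens: float  # hypothetical price
    max_difficulty: int        # hardest task (1-5) this tier should take


TIERS = (
    ModelTier("small", 0.10, 2),
    ModelTier("medium", 0.50, 4),
    ModelTier("large", 2.00, 5),
)


def estimate_difficulty(task: str) -> int:
    """Toy heuristic (an assumption): keyword-based difficulty from 1 to 5."""
    text = task.lower()
    if any(k in text for k in ("architecture", "migrate", "security")):
        return 5
    if any(k in text for k in ("refactor", "debug")):
        return 3
    return 1


def route(task: str, tiers: Sequence[ModelTier] = TIERS) -> ModelTier:
    """Return the cheapest tier whose ceiling covers the task's difficulty."""
    difficulty = estimate_difficulty(task)
    for tier in sorted(tiers, key=lambda t: t.cost_per_1k_tokens):
        if tier.max_difficulty >= difficulty:
            return tier
    # Nothing qualifies: fall back to the most capable tier.
    return max(tiers, key=lambda t: t.max_difficulty)


for task in ("rename a variable", "debug flaky test", "migrate auth to new security model"):
    print(task, "->", route(task).name)
```

The design choice is to make cost the default and capability the exception: easy work never reaches the expensive tier unless the difficulty estimate demands it.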
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR); verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Why Harness and Context Engineering Are Game-Changers

This shift redefines how organizations should approach AI development. Instead of chasing the latest model upgrades, companies need to invest in building robust harnesses and effective context management. This approach offers better control, reduces costs, and enhances system reliability, making AI deployment more sustainable and strategic in the long term.

The Evolution of AI Development Practices

Historically, AI development focused heavily on acquiring larger, more powerful models, with performance improvements tied to model size and complexity. However, recent developments, including the Google whitepaper and industry experiments, reveal that the surrounding system—harness, prompts, tools, and context—is the primary driver of effective AI behavior. This represents a paradigm shift from model-centric to system-centric AI engineering, emphasizing configuration, verification, and strategic context management.

“The model is only 10% of what determines behavior; the harness and context are 90%. This changes the game for AI development.”

— Addy Osmani, co-author of the whitepaper

Unresolved Questions About Implementation and Impact

While the whitepaper provides compelling evidence and a clear framework, it remains unclear how quickly organizations will adopt this approach at scale. Specific best practices for building effective harnesses and context management systems are still evolving, and industry-wide standards are yet to emerge. Additionally, the long-term impact on AI model development strategies and costs requires further observation.

Next Steps for Organizations and Developers

Organizations should evaluate their current AI workflows, focusing on the configuration, tools, and context management rather than solely model upgrades. Developing best practices for harness and context engineering will be critical, along with investing in training teams to master these skills. Industry groups may soon formalize standards and frameworks to guide this transition, and ongoing research will clarify optimal strategies.

Key Questions

Why is the harness more important than the model?

The harness includes prompts, tools, rules, and observability layers that shape the AI’s behavior. Evidence shows that adjusting these components can significantly improve performance, often more than upgrading the model itself.

What is meant by context engineering?

Context engineering involves providing the AI with relevant instructions, knowledge, examples, tools, and guardrails to guide its behavior effectively, reducing reliance on prompt tricks and improving reliability.

Does this mean model size is irrelevant?

Not entirely. The paper attributes only about 10% of system behavior to the model itself; the rest depends on how the model is integrated and managed through configuration and context. The model still matters, but it is the smallest part of the system.

How will this change AI development costs?

Initially, disciplined system design may incur higher upfront costs, but over time, it reduces operational expenses by lowering token usage, maintenance, and security risks.

Is this approach applicable to all AI systems?

The paper’s concepts are mostly tool-agnostic, so they should apply broadly, but the transition will vary with the use case, organization size, and existing infrastructure. Ongoing research will clarify these boundaries.

Source: ThorstenMeyerAI.com
