AI did not introduce the need for context. It exposed how little of it most teams have.
Right now, “context engineering” is having its moment. People talk about RAG, long context windows, tool calling, and standards like the Model Context Protocol. (Model Context Protocol) Those are real advances. But they skip the harder question that determines whether any of it works: what does good context look like in the first place, for humans and for AI?
If you want a clear lens, steal this one: task context versus shared context. Luca Rossi frames it crisply in his Refactoring essay, and it maps cleanly to how high-performing teams have always operated. (Refactoring.fm)
The contractor test
A simple test: imagine the engineer building your next feature is not on your team. They are a contractor you just onboarded, someone smart and capable, but unfamiliar with your norms. That is not a hypothetical anymore. It is increasingly what an AI agent is.
If you hand them a ticket with “Build onboarding flow v2,” plus a Figma link and a couple of requirements, they can produce something. But will it be something you can ship?
They will immediately need answers to questions you probably do not write down:
Do you require tests? Which kinds? What is your definition of “done”? How do you instrument features? Do you have naming conventions? Are feature flags mandatory? How do you handle error states? What is acceptable latency? Which security checks are non-negotiable?
This is where the split matters.
Task context is the “what”
Task context is the job-specific payload. The feature spec. The user story. The acceptance criteria. The UI designs. The edge cases. The constraints that are true this time.
Task context changes constantly. It is supposed to. It is the volatile layer. If you are building a pricing experiment this week and an identity integration next week, task context should look completely different.
When people complain that AI “needs more context,” they often mean task context. That is valid, up to a point. But if you keep stuffing more into the prompt, you are treating the symptom.
Shared context is the “how”
Shared context is everything your team should not have to restate.
It is your working agreements and operating standards. It is the invisible scaffolding that lets an engineer succeed without a manager hovering. It is also the difference between “a prototype that passes a demo” and “a change that fits the system.”
In strong orgs, shared context lives in artifacts you can point to: engineering handbooks, golden paths, paved roads, templates, code review standards, test strategy, observability norms, security posture.
Google’s public engineering practices around code review are a good example of shared context expressed as a standard. The point is not perfection. The point is that code health improves over time because reviewers share an explicit bar. (Google GitHub) Microsoft’s Engineering Fundamentals Playbook is another example, a broad, explicit catalog of how teams work across delivery, quality, and collaboration. (Microsoft GitHub)
AI makes the absence of shared context painful, because AI will happily generate plausible output that violates your unwritten rules.
The north star: shrink the “what,” grow the “how”
Here is the counterintuitive strategy that actually scales: continuously shrink task context and grow shared context. (Refactoring.fm)
Shrinking task context does not mean writing vague tickets. It means you stop using tickets to carry institutional memory. You stop writing “make sure you add feature flags” or “remember to add metrics” or “please follow naming conventions” in every single prompt, PR, or story. Those are not task instructions. They are operating norms.
If you have to remind people to “write tests,” you do not have a testing practice. You have a testing suggestion.
And here is the uncomfortable truth: if you have to remind your AI agent to write tests every time, you do not have AI context engineering. You have prompt heroics.
GitHub’s own Copilot guidance implicitly points in this direction. Copilot is useful for generating tests and repetitive code, but the team still has to be in charge of standards and review. (GitHub Docs) Scaling AI safely is less about clever prompts and more about making the “how” unmissable.
The new industry signal: shared context is becoming machine-readable
The most important shift of the last year is not that models got bigger. It is that teams are starting to package shared context in a way agents can reliably consume.
AGENTS.md is an emerging open format that is effectively a README for coding agents. It exists for one reason: projects need a predictable place to put “how we work here.” (Agents) OpenAI’s Codex documentation goes further and describes how these instruction files can be layered from global guidance to repo guidance to directory-specific overrides. (OpenAI Developers) This is not a small UX detail. This is the shape of the next-generation engineering handbook, one that can be applied automatically at the point of code generation.
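To make that concrete, here is a minimal sketch of what an AGENTS.md file might contain. The format itself is free-form markdown; the specific commands, paths, and standards below are illustrative, not prescribed by the format:

```markdown
# AGENTS.md

## Setup
- Install dependencies with `npm install` before running anything.

## Testing
- Run `npm test` before proposing any change.
- New features require unit tests; bug fixes require a regression test.

## Conventions
- All new user-facing features ship behind a feature flag.
- Emit a metric for every new user action (see docs/observability.md).

## Review
- Keep PRs small and focused; link the ticket in the description.
```

Notice that none of this is task-specific. It is exactly the shared context that would otherwise be restated in every ticket and every prompt.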
On the “tools and data” side, MCP is doing something similar for external context: standardizing how AI apps connect to data sources and tools, so the agent can fetch the right facts instead of hallucinating. (Model Context Protocol)
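As a sketch of what that standardization looks like in practice, many MCP clients are configured with a small JSON block naming the servers an agent may call. The server names, commands, and environment variable below are hypothetical, and the exact top-level key varies by client:

```json
{
  "mcpServers": {
    "ticket-tracker": {
      "command": "npx",
      "args": ["-y", "@example/mcp-ticket-server"],
      "env": { "TRACKER_API_KEY": "<redacted>" }
    },
    "docs-search": {
      "command": "docs-mcp-server",
      "args": ["--root", "./handbook"]
    }
  }
}
```

The point is the shape: the agent does not guess where facts live. It is told, once, in a standard place.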
Put those two together and you get a clean architecture:
Task context answers what you are building right now.
Shared context encodes how you build anything here.
Tool context provides the live facts and actions needed to execute.
Most teams are over-invested in the first and under-invested in the second.
When to use task context, and when it becomes a trap
Use task context aggressively when the work is ambiguous, novel, or decision-heavy. If the work requires tradeoffs, you want the tradeoffs visible. If the work impacts customers, you want explicit success metrics and failure modes. If there are non-obvious constraints, you want them written.
But task context becomes a trap when it is used to compensate for missing standards. If your tickets keep carrying the same reminders, your system is telling you something: the “how” is not shared.
There is also a failure mode that looks like maturity but is not: bloated specs that try to precompute every decision. Humans ignore them. Agents drown in them. A task spec should be a crisp description of intent and constraints, not an attempt to smuggle an entire engineering culture into a single document.
When to use shared context, and why most teams underfund it
Shared context should cover the recurring decisions that cause rework. Every time you see a repeat failure, that is a shared-context gap trying to get your attention.
In practice, shared context should make these things boring:
Testing expectations and commands, including what “good” looks like for unit, integration, and end-to-end coverage.
Observability norms, including logs, metrics, traces, and what gets instrumented by default.
Release discipline, including feature flags, rollout strategy, and how you handle canaries.
Code review standards, including what reviewers must enforce and what they should let slide. (Google GitHub)
Security defaults, including how secrets are handled, how dependencies are introduced, and what checks must pass.
If that sounds like a lot, good. High-performing orgs make this explicit because it reduces cognitive load. It also makes onboarding faster, whether the newcomer is a human engineer or an agent.
How to combine them without creating a mess
The clean combination pattern is simple: task context should point to shared context, not repeat it.
In human terms: “Build the new onboarding flow. Follow our standard release checklist and instrumentation guide.”
In agent terms: your Codex or agent instruction chain carries the shared context, while the prompt carries only what is unique to this task. That is exactly the layering model described for AGENTS.md style guidance. (OpenAI Developers)
This is where teams get leverage. Once shared context is dependable, task prompts can get shorter without quality collapsing. You are not lowering the bar. You are embedding it.
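Side by side, the split might look like this. The file names and constraints are illustrative; the task half is volatile, the shared half is stable:

```markdown
<!-- Task context: the ticket or prompt, unique to this work -->
Build onboarding flow v2 per the attached Figma.
Success metric: activation rate for new signups.
Constraint: must work for both SSO and email/password accounts.

<!-- Shared context: AGENTS.md, applies to everything -->
All features ship behind a flag.
Follow the release checklist in docs/release.md.
Instrument per docs/observability.md.
```

The task half stays short because the shared half is dependable.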
When to avoid both
There are two moments to deliberately avoid context.
The first is exploration. If you are brainstorming product directions, do not force shared delivery standards into the conversation. You want divergent thinking.
The second is when you are trying to diagnose a failure and you do not yet know if it is a task problem or a shared problem. If you immediately patch the output, you will miss the chance to fix the input that caused it.
This is the move most teams fail to make: treat failures as context bugs, not human bugs.
The feedback loop is the real context engine
Rossi’s point that resonates most is that good context engineering is a feedback loop. When something turns out wrong, the question is rarely “why was the output bad?” The real question is “what input allowed this to happen?” (Refactoring.fm)
In mature orgs, that loop looks a lot like incident response, but applied to product delivery:
If a PR shipped without adequate tests, you can fix the PR. But you should also update the shared checklist and the agent guidance, so it does not happen again.
If a feature launched without metrics, you can patch instrumentation. But you should also fix the default template, the definition of done, and the shared playbook.
If AI-generated code repeatedly violates conventions, you can keep correcting it. Or you can encode conventions once, in the place the agent always reads.
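Encoding that third fix is often a one-line change. Here is a hypothetical diff to an AGENTS.md file, closing the loop after an agent repeatedly violated a naming convention:

```diff
 ## Conventions
 - Use the shared logger; never call console.log directly.
+- Name React components in PascalCase and hooks as useXxx;
+  run the linter before committing.
```

One edit, applied at the place the agent always reads, instead of the same correction repeated in every review.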
That is why formats like AGENTS.md matter. They give you a stable place to put the “how,” then evolve it as you learn. (Agents)
The opinionated takeaway
If you want AI to amplify your team, stop treating context as a prompt problem. Treat it as an operating system problem.
Task context is fuel. Shared context is the engine. Tools and live data are the sensors. If you only pour more fuel into a weak engine, you do not go faster. You just make more smoke.
The teams that win with AI will not be the teams with the most clever prompts. They will be the teams that make “how we build” explicit, enforceable, and continuously improved, until both humans and agents can ship high-quality work with minimal hand-holding.