The ⊣ Framework: Balancing AI Agent Capabilities and Human Constraints #71

@terrylica

TL;DR

Maximize AI agent capabilities (Exploration, Autonomy, Idiomatic patterns) while minimizing human constraints (Implementation Prescription, Human Intervention, Bespoke patterns).

The Practice: Start loose, audit carefully, constrain precisely.

| Maximize (LEFT) | Minimize (RIGHT) |
| --- | --- |
| Exploration – Let agents discover state-of-the-art solutions | Implementation Prescription – Avoid prescriptive implementation details |
| Autonomy – Reduce human intervention frequency | Human Intervention – Intervene only when audit reveals issues |
| Idiomatic patterns – Embrace community-tested approaches | Bespoke constraints – Avoid custom frameworks and house styles |

The Core Concept

The Fundamental Tension

In most businesses, margins come from secrets—proprietary knowledge, non-obvious insights, and idiosyncratic processes create advantage. That is not how modern software engineering works.

In software today, the opposite is usually true: the more we conform to shared idioms, the more future-proof we become.

  • Idiomatic UI patterns → easier for users to understand and designers to extend
  • Idiomatic language features → easier for other engineers and tools to reason about
  • Widely adopted libraries used idiomatically → easier to hire for, easier to maintain, easier to replace

So while business advantage is often bespoke, software robustness and maintainability are often idiomatic.

Why This Matters for AI Coding

If we overload an AI agent with bespoke instructions and micro-specifications, we import the "business secret sauce" mindset into a domain where it is often harmful—telling the agent to ignore the ecosystem and instead re-implement our quirks.

As AI coding agents get smarter, the optimal balance tilts further left. We as humans must learn to let go while keeping just the right elements on the right side.

Why Humans Struggle to Maximize Left

Through 1000+ hours of production AI agent usage, a pattern emerged: many constraints we considered 'necessary' were actually encoding personal preferences rather than business requirements. This isn't a failure of individual judgment—it's a manifestation of systematic cognitive biases that affect all software engineers.

The Four Key Biases

Law of the Instrument (Maslow's Hammer)

"If the only tool you have is a hammer, everything looks like a nail." Programmers favor familiar libraries and frameworks regardless of optimal fit for the task at hand. This manifests as: "I'm a React expert, so build it in React"—even when Vue or Svelte might be more idiomatic for the use case. The comfort of familiarity overrides objective evaluation of alternatives.

Not Invented Here (NIH) Syndrome

The tendency to avoid using or buying already existing products, research, standards, or knowledge from external sources. Research shows only 16% of innovation projects remained unaffected by NIH (Hannen et al., 2019), meaning 84% of projects were impacted by this bias. CTOs mandate custom frameworks believing in-house solutions will be "better, quicker, cheaper"—but the NIH effect amounts to a 6.1% decrease in knowledge absorption for every one-point increase in NIH syndrome severity.

IKEA Effect (Ownership Bias)

People place disproportionately high value on products they partially created. In software engineering, this manifests as engineers overvaluing their own code and architecture decisions while resisting alternatives—even when those alternatives are objectively superior. Attachment to suboptimal solutions stifles innovation and hinders project success.

Curse of Knowledge (Expert Bias)

Experts assume others share their specialized knowledge. This bias increases with experience and, critically, does not reduce when people are told about it. Senior CTOs make technology choices assuming teams will "obviously understand" the architecture. In vertical markets, this amplifies dangerously: domain experts conflate business expertise with technical authority—"I know banking deeply, therefore I know how banking software should be architected."

The Quantitative Evidence

The impact of these biases is measurable and costly:

Why Preferences Scale Poorly; Ecosystem Idioms Scale Well

When we ask "How would I implement this?", we import our accumulated preference history—familiar libraries, comfortable patterns, personally-mastered techniques. Each programmer, CTO, and CIO brings a unique set of preferences shaped by their career trajectory. What feels "cleaner" to one engineer becomes a bespoke constraint that fragments the ecosystem.

The core insight: The value of an individual preference is bounded by the team that shares it, while the value of an ecosystem idiom grows with the community behind it. A bespoke framework known to 10 engineers helps those 10. An idiomatic pattern known to 10,000 engineers benefits from 10,000 contributors, 100,000 Stack Overflow answers, and millions of production deployments.

This is why every programmer having their own style, every CTO and CIO having their preferences, creates a systemic problem. "When you're holding a hammer, everything looks like a nail"—this fallacious thinking doesn't just affect individuals; it pollutes entire engineering organizations, especially in vertical markets where concentrated expertise amplifies the impact of individual biases.

The AI Agent Advantage (with Critical Nuance)

AI agents inherit biases from training data and can exhibit analogous biases if not properly configured. This is a critical limitation to acknowledge. However, AI agents can be designed to prioritize ecosystem conventions over personal preferences in ways humans systematically cannot:

No personal ownership bias: AI agents don't suffer the IKEA effect on their generated code. They can discard and regenerate without sunk cost fallacy. There's no ego attachment to "my architecture" or "my framework choice."

No 'how I've always done it': Lacking preference history, AI agents naturally gravitate toward well-documented, widely-adopted patterns when given business requirements without implementation prescription. They don't carry decades of accumulated tool familiarity that narrows their solution space.

Configurable at scale: Unlike human experts, who often resist retraining, AI agents can have their behavior updated organization-wide through configuration changes. When the ecosystem evolves (a new framework becomes standard, a better library emerges), updating AI agents is a configuration file change, not a cultural change management initiative.

Explicit caveat: This advantage requires proper configuration and systematic auditing. AI agents don't automatically avoid bias—they require intentional design to follow ecosystem idioms rather than replicating the bespoke patterns they might encounter in training data.

The Practice: Auditing Constraints

The litmus test for any constraint you give an AI agent:

  1. Does this address a model limitation? (Provable through empirical testing: "AI agents currently struggle with X, so I specify Y")
  2. Or does this encode my preference? ("I prefer library X because I've used it successfully before")

The former is essential. The latter is often technical debt in disguise—it fragments your codebase from ecosystem idioms, making it harder to maintain, harder to hire for, harder to replace.

The most maintainable codebases came from auditing which constraints addressed genuine business requirements versus which encoded human preferences. The latter category can often be replaced with: "Choose the most idiomatic, well-maintained solution for [requirement]." Let the agent explore; audit the results.
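
The litmus test above can be sketched as a small triage over the constraints in a prompt. This is only an illustration under stated assumptions: the `Constraint` class, the `Origin` categories, the `triage` function, and the two sample constraints are all hypothetical names and data, not part of any real tool.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    MODEL_LIMITATION = "model_limitation"  # shown by empirical testing
    BUSINESS_REQUIREMENT = "business"      # e.g. a compliance rule
    PREFERENCE = "preference"              # "I've used it successfully before"


@dataclass(frozen=True)
class Constraint:
    text: str
    origin: Origin
    requirement: str = ""  # the underlying need the constraint serves


def triage(constraints):
    """Keep evidence-backed constraints; rewrite preferences as open requests."""
    kept, rewritten = [], []
    for c in constraints:
        if c.origin is Origin.PREFERENCE:
            rewritten.append(
                f"Choose the most idiomatic, well-maintained solution for {c.requirement}."
            )
        else:
            kept.append(c.text)
    return kept, rewritten


# Hypothetical sample constraints, for illustration only.
constraints = [
    Constraint("Use library X for HTTP calls", Origin.PREFERENCE, "HTTP client calls"),
    Constraint("Store audit logs for 7 years", Origin.BUSINESS_REQUIREMENT),
]
kept, rewritten = triage(constraints)
print(kept)
print(rewritten)
```

The classification itself still has to come from a human or from test evidence; the sketch only makes the consequence of each answer explicit.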


The Three-Axis Framework

Through 1000+ hours of production AI agent usage, three critical tensions have emerged that are not adequately addressed in current industry discourse.

We write "A ⊣ B" to mean "A is constrained by B" (equivalently, "A is bounded by B"). The ⊣ symbol is shorthand for this relationship.

Axis 1: Exploration ⊣ Implementation Prescription

What it means: The agent's ability to explore new methods, libraries, and architectures is constrained by our implementation prescriptions.

Why it matters: If we stuff the prompt with bespoke, highly specific implementation details, we increase Implementation Prescription and shrink Exploration. We prevent the agent from roaming the current ecosystem, discovering state-of-the-art, future-proof idioms.

Why Exploration superiority matters in AI coding:

AI agents can discover not just state-of-the-art solutions, but SOTA + well-maintained + future-proof + idiomatic patterns that humans might overlook.

Example: AI can automatically benchmark and prove that DuckDB outperforms pandas or Polars for specific business use cases—even though pandas and Polars are well-maintained, they may not be the most future-proof or performant choice for your particular workload. Without exploration freedom, you miss these evidence-driven optimizations.

The key advantage: Empirical evidence over assumptions. Let AI explore, benchmark, and prove—don't prescribe solutions upfront.
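
A minimal sketch of the "benchmark and prove" idea, assuming candidate implementations can be wrapped as plain Python callables. The `benchmark` helper and the two stand-in candidates are hypothetical; in practice the candidates would wrap whichever engines are being compared (for example DuckDB, pandas, or Polars) and run against the real workload.

```python
import time
from statistics import median


def benchmark(candidates, workload, repeats=5):
    """Time each candidate on the same workload and return median seconds.

    Raises if candidates disagree on the result, since a speed comparison
    only means something between implementations that compute the same thing.
    """
    timings, results = {}, {}
    for name, fn in candidates.items():
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            results[name] = fn(workload)
            samples.append(time.perf_counter() - start)
        timings[name] = median(samples)
    if len({repr(r) for r in results.values()}) > 1:
        raise ValueError(f"candidates disagree: {results}")
    return timings


# Stand-in candidates; hypothetical placeholders for real engines.
def loop_sum(rows):
    total = 0
    for value in rows:
        total += value
    return total


def builtin_sum(rows):
    return sum(rows)


timings = benchmark(
    {"loop_sum": loop_sum, "builtin_sum": builtin_sum},
    workload=list(range(100_000)),
)
fastest = min(timings, key=timings.get)
print(timings, "fastest:", fastest)
```

Timings vary by machine and workload, so the winner should be read from a run on representative data rather than assumed.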

Common mistakes:

  • Specifying exact library versions or implementation patterns before letting agent research
  • Dictating data structures instead of describing requirements
  • Providing detailed pseudo-code instead of problem description

Optimal balance: State goals and constraints clearly, but let the agent explore implementation approaches.


Axis 2: Autonomy ⊣ Human Intervention

What it means: The agent's autonomy is constrained by the level and frequency of human intervention.

Why it matters: If we intervene at every small decision, require approvals at every step, or constantly redirect it, we reduce Autonomy. That defeats the point of using an autonomous AI to free up human time.

Why Autonomy superiority matters in AI coding:

  1. Reduces cognitive load: Humans can focus on high-level architecture and business logic instead of micro-managing implementation details.

  2. Increases idiomatic adherence: Paradoxically, less human guidance often leads to MORE idiomatic code—because AI agents naturally gravitate toward well-documented, community-tested patterns when not constrained by human quirks.

  3. Enables autonomous oversight: Even quality control should be autonomous. Use adversarial auditor agents to review code, run security scans, check performance—don't manually review every function.

The key advantage: Human cognitive bandwidth is the bottleneck. Automate not just coding but also oversight, reserving human judgment for strategic decisions.

Why this paradox exists: Human guidance often encodes our cognitive biases—the Law of the Instrument (familiar tools), NIH syndrome (custom frameworks), IKEA effect (attachment to our code)—rather than objective requirements. AI agents, when properly configured to follow ecosystem idioms, lack this preference history. That's not a limitation—it's an advantage for long-term maintainability.

Common mistakes:

  • Reviewing and approving every function before the agent can continue
  • Interrupting the agent's work to suggest different approaches mid-task
  • Requiring human decision on minor formatting or naming choices

Optimal balance: Define success criteria upfront, let agent work autonomously, audit systematically afterward.
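
One way to picture "define success criteria upfront, audit afterward" is a set of named checks that are fixed before the run and applied once to the finished result. This is a sketch only: the `audit` function, the artifact fields, and the thresholds are hypothetical.

```python
def audit(artifact, criteria):
    """Run every success criterion and return the names of the ones that fail."""
    return [name for name, check in criteria.items() if not check(artifact)]


# Hypothetical summary of a finished agent run; field names are illustrative.
artifact = {
    "tests_passed": 48,
    "tests_total": 48,
    "p95_latency_ms": 180,
    "new_dependencies": 1,
}

# Criteria are written before the run and describe outcomes, not implementation.
criteria = {
    "all tests pass": lambda a: a["tests_passed"] == a["tests_total"],
    "p95 latency under 200 ms": lambda a: a["p95_latency_ms"] < 200,
    "at most 2 new dependencies": lambda a: a["new_dependencies"] <= 2,
}

failures = audit(artifact, criteria)
print(failures or "all criteria met")
```

Because the criteria never mention how the work is done, the agent stays free to choose its approach while the human reviews only the failures.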


Axis 3: Idiomatic patterns ⊣ Bespoke constraints

What it means: The agent's ability to choose idiomatic, community-tested patterns is constrained by bespoke constraints we impose.

Why it matters: If we insist on custom frameworks, unusual architectures, and local "house styles" that diverge from the ecosystem, we block it from using the idioms that maximize compatibility, tooling support, and long-term maintainability.

Why Idiomatic patterns superiority matters in AI coding:

Idiomatic patterns are more future-proof and easier for LLM agents to maintain because they are copious in LLM training data. Community standards appear millions of times in public codebases, documentation, and discussions—making them well-understood by AI.

Concrete examples of well-trained idiomatic patterns:

  • Semantic Versioning 2.0.0 – universally documented versioning standard
  • OpenAPI Specification v3.2.0 – extensively covered in API documentation
  • Python Enhancement Proposals (PEPs) – deeply embedded in Python corpus
  • C++23 (ISO/IEC 14882:2024) – standardized language features

Bespoke house styles and custom frameworks, by definition, appear rarely or never in training data. This creates a maintainability asymmetry: idiomatic patterns are "native" to AI agents, while bespoke patterns require extensive context and are fragile to changes.

The key advantage: Idiomatic code is "readable" to AI by default. Bespoke code requires constant human translation and maintenance.

Common mistakes:

  • Requiring custom error handling instead of standard library approaches
  • Enforcing company-specific naming conventions that conflict with language idioms
  • Demanding proprietary abstractions instead of well-established patterns

Optimal balance: Adopt ecosystem standards unless there's a compelling business reason for deviation.
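
As one concrete illustration of the first mistake listed above, here is a contrast between a bespoke status-tuple convention and Python's built-in exceptions. The function names and the status-code scheme are hypothetical, invented only for this comparison.

```python
import json


# Bespoke house style: every function returns (status_code, payload),
# and callers must know this project's private code table.
def parse_config_bespoke(text):
    try:
        return (0, json.loads(text))
    except json.JSONDecodeError:
        return (17, None)  # 17 means "bad config" only inside this codebase


# Idiomatic: raise a standard exception, chaining the original as the cause.
def parse_config(text):
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid config: {exc.msg}") from exc


print(parse_config_bespoke("{not json"))
try:
    parse_config("{not json")
except ValueError as err:
    print(type(err).__name__, "caused by", type(err.__cause__).__name__)
```

The second form needs no project-specific documentation: any Python reader, human or agent, already knows how `try`/`except` and exception chaining behave.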


Iterative Boundary Discovery

The framework above suggests maximizing the left side (Exploration, Autonomy, Idiomaticity) while maintaining essential constraints. But how do we know which constraints are essential?

The 5-Step Process

1. Default to maximum left

Begin with minimal implementation prescription, minimal human intervention, minimal bespoke constraints. Let AI agents explore freely within business requirements.

2. Audit outputs systematically

Review generated code, test results, architecture decisions. Don't assume problems—discover them through evidence.

3. Identify boundary conditions

When audit reveals issues (inconsistent patterns, suboptimal choices, misunderstood requirements), these indicate model capability boundaries.

4. Add targeted constraints

Constrain only the specific areas where audit revealed problems. Don't pre-emptively restrict based on assumptions about what might go wrong.

5. Iterate as models improve

As AI models get smarter (new versions, better training), loosen constraints and re-test boundaries. Yesterday's necessary constraint may be tomorrow's unnecessary restriction.
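
The process above can be sketched as a loop. This is a toy model under stated assumptions: `discover_boundaries`, `fake_agent`, and `fake_audit` are hypothetical names, and a real audit would involve tests, reviews, and human judgment rather than a single function.

```python
def discover_boundaries(run_agent, audit, max_rounds=5):
    """Start with no constraints and add one per audited issue until the audit is clean.

    run_agent(constraints) -> output; audit(output) -> list of issue strings.
    Both are supplied by the caller; this loop only encodes the process.
    """
    constraints = []  # step 1: default to maximum left
    for _ in range(max_rounds):
        output = run_agent(constraints)
        issues = audit(output)  # step 2: audit systematically
        if not issues:
            break
        for issue in issues:  # steps 3-4: one targeted constraint per observed issue
            constraint = f"Address: {issue}"
            if constraint not in constraints:
                constraints.append(constraint)
    return constraints


# Toy stand-ins: a hypothetical agent that mishandles time zones unless told.
def fake_agent(constraints):
    return {"handles_timezones": any("time zone" in c for c in constraints)}


def fake_audit(output):
    return [] if output["handles_timezones"] else ["time zone handling is inconsistent"]


print(discover_boundaries(fake_agent, fake_audit))
```

Step 5 corresponds to calling the function again from an empty constraint list whenever the underlying model changes, so that constraints a newer model no longer needs drop out.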

Why This Works

Pre-emptive restriction limits what agents can accomplish. Empirical boundary discovery maximizes agent capabilities while maintaining quality through systematic audit.

Such restriction often stems from cognitive biases—programmer style preferences, executive technology mandates, "hammer sees nail" thinking—rather than empirical evidence of agent limitations. When you feel tempted to pre-emptively restrict, ask: Is this constraint addressing a proven model limitation, or am I encoding my preference history?

You only pay the human intervention cost where it demonstrably adds value.

Key Insight

Each AI model (Anthropic's Claude, OpenAI's GPT, etc.) has unique capability boundaries. You discover these through practice, not prediction.

Start loose, audit carefully, constrain precisely.


Why Vertical Markets Need This Framework More

Vertical market software—built for specific industries like FinTech, HealthTech, LegalTech, Manufacturing—faces a 3-5x amplification of the "preference pollution" problem compared to horizontal platforms. Five compounding factors create this vulnerability:

The Amplification Effect

1. Constrained Talent Pools with Specialized Knowledge

Vertical markets need developers who combine domain knowledge (banking regulations, medical workflows, legal compliance) with fluency in modern technology stacks. When a CTO with domain expertise mandates a bespoke framework, the already-small talent pool shrinks further. Result: Outsized impact of individual preferences, and high bus-factor risk when key people leave.

2. Domain Expert Overconfidence

CTOs with deep domain expertise often conflate business knowledge with technical authority. "I know banking deeply, therefore I know how banking software should be architected" leads to the Curse of Knowledge bias: assuming others share their domain understanding and that domain expertise translates to optimal technical decisions.

3. Regulatory/Compliance Constraints Drive Conservative Choices

SOX, HIPAA, PCI-DSS, and other regulatory requirements create the perception that "standard frameworks can't handle our compliance needs." Reality: Modern frameworks (Spring, .NET, Rails) have mature compliance plugins and extensive audit trail capabilities. But conservative bias leads to custom frameworks built "for auditability," which actually increases long-term risk through reduced ecosystem support.

4. Legacy System Lock-In with Massive Migration Costs

Years—sometimes decades—of business logic become embedded in bespoke code. Migration cost ranges from $1M-$10M+ for large systems, with significant risk of breaking critical business functions. Result: "We can't afford to migrate" becomes a self-fulfilling prophecy as technical debt compounds and migration costs increase further.

5. High Bus-Factor Risk from Concentrated Knowledge

Vertical market experts must know domain + custom framework—an extremely narrow talent pool. When a senior developer leaves: months to find a replacement (if possible), months to train them on both domain and bespoke technology. Compare to horizontal platforms: lose a React developer, hire from millions globally, ramp in weeks using standard patterns.

The Economic Trap

The numbers are stark:

Cost Impact:

  • 42% of developer time (CISQ 2022) spent addressing technical debt instead of creating new value
  • 80% of IT budget spent on maintenance in legacy-heavy vertical markets
  • Migration cost: $1M-$10M+ (per ERP implementation analysis: large enterprises invest from over $1M to more than $10M in comprehensive implementations)

Competitive Impact:

Horizontal platform using React: 100% of developer capacity available for features.

Vertical market with custom COBOL banking system: 80% on maintenance → 20% on features.

Result: Horizontal platforms can innovate 5x faster.
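
The 5x figure follows directly from the two capacity numbers quoted above. A worked check, assuming equal team sizes and that feature output scales with the share of capacity spent on features:

```python
horizontal_feature_share = 1.00    # 100% of capacity available for features
vertical_maintenance_share = 0.80  # 80% of capacity spent on maintenance
vertical_feature_share = 1.00 - vertical_maintenance_share

ratio = horizontal_feature_share / vertical_feature_share
print(f"feature capacity ratio: {ratio:.1f}x")
```

The ratio compares feature capacity only; it says nothing about differences in team size or per-developer productivity.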

The Trap:

High migration cost + High business risk + Limited expertise
→ "We'll maintain the legacy system one more year"
→ Technical debt grows
→ Migration cost increases further
→ Cycle repeats indefinitely

Vertical markets cannot afford the 42% productivity loss from preference-driven technical debt while competing with limited talent pools against horizontal platforms with abundant resources.

Three Case Studies

Case Study 1: Banking/COBOL Legacy

70% of banks globally rely on legacy systems; 43% still use COBOL (language developed in 1959). Original CTO decisions in the 1970s-1990s: "Build core banking in COBOL with our own frameworks for control and auditability."

Cost today: 80% of IT budget spent on "patch and maintain." Hiring crisis: 60% of organizations report that finding COBOL developers is their biggest challenge; the average COBOL programmer is 55 years old, with 10% retiring annually.

Modernization potential: 60-90% reduction in IT operational costs. But migration risk and cost create paralysis.

Case Study 2: Healthcare/HL7-FHIR Adoption Barriers

Healthcare interoperability problem: Different EMR systems can't communicate effectively. Standards exist—HL7 (established) and FHIR (modern, API-based)—but adoption remains low at ~30% of US hospitals.

Why? Hospital CTOs built custom integration frameworks: "Our clinical workflows are unique; generic standards won't work." Result: "Many healthcare organizations still use outdated methods like faxing or manual data entry" despite standard solutions existing.

FHIR migration barrier: "Smaller healthcare organizations struggle with limited resources and expertise to adopt FHIR successfully, as the cost and complexity of transitioning can be significant." The irony: Standard frameworks would REDUCE long-term complexity, but upfront migration cost creates inertia.

Case Study 3: Manufacturing/Custom ERP Implementations

Every manufacturer needs ERP (Enterprise Resource Planning). Choice: Generic ERP (SAP, Oracle) with heavy customization OR industry-specific ERP OR fully bespoke system.

Common decision: Heavily customize generic ERP because "our production processes are unique." Cost: "When custom developments are implemented without a long-term vision, every update requires significant adjustments and fixes, creating unexpected costs and delaying deployment of new features."

Resource drain: "Constant maintenance diverts technical resources from strategic, innovative projects, reducing competitiveness."

Tribal knowledge problem: "In manufacturing, critical knowledge is trapped in people's heads. When they leave, that knowledge leaves too."

AI Agent Value Proposition for Vertical Markets

Why vertical markets should ESPECIALLY care about ecosystem idioms (the LEFT side):

Horizontal platforms can absorb preference pollution because:

  • Large talent pools → easy to hire developers who know "their way"
  • Resources to maintain custom frameworks
  • Community of developers willing to learn company-specific patterns

Vertical markets CANNOT afford this because:

  • Small talent pools → Can't find developers who know domain + custom framework
  • 42% productivity loss → Already disadvantaged, can't compete at 58% capacity
  • 80% budget on maintenance → No resources left for competitive innovation
  • High bus-factor risk → One person leaving creates organizational crisis

Therefore: Vertical markets MUST use ecosystem idioms (LEFT side) because they cannot afford the compounding cost of preference-driven custom frameworks.

Empirically Validated AI Agent ROI:

The specific advantages for vertical markets:

  1. Generate ecosystem-idiomatic code despite limited in-house expertise
  2. Reduce bus-factor risk through standard, well-documented patterns (any new hire can understand them)
  3. Accelerate modernization with 88% automation of legacy system upgrades
  4. Lower hiring barriers by using standard frameworks (hire from larger talent pools)
  5. Free developers for domain work instead of maintaining bespoke technical infrastructure

The Bottom Line: For vertical markets, the LEFT side of the ⊣ Framework is not optional—it's economic survival. AI agents configured to follow ecosystem idioms offer the only scalable path to escape the "preference pollution → technical debt → lock-in" cycle without the resource advantages of horizontal platforms.


Summary: The Left-Right Balance

Maximize LEFT (agent capabilities):

  • Exploration – Let agents discover state-of-the-art solutions
  • Autonomy – Reduce human intervention frequency
  • Idiomatic patterns – Embrace community-tested approaches

Minimize RIGHT (human constraints):

  • Implementation Prescription – Avoid prescriptive implementation details
  • Human Intervention – Intervene only when audit reveals issues
  • Bespoke constraints – Avoid custom frameworks and house styles

The Symbol: A ⊣ B means "A is constrained by B"

The Practice: Start maximally left, audit systematically, constrain only when empirical evidence demands it.


Design Rule for Effective AI Agent Usage

  • Keep goals, constraints, and safety requirements clear
  • Keep implementation details non-bespoke and non-prescriptive where possible

The more we let AI agents operate idiomatically within well-defined boundaries, the more maintainable, future-proof, and effective our codebase becomes.
