中文 | English

Harness Engineering Study Guide

A deep-dive learning archive on Harness Engineering — from concept to practice

Introduction

This is an evolving learning project. Harness Engineering is an engineering paradigm proposed by OpenAI in February 2026: engineers stop writing code and instead design environments, clarify intent, and build feedback loops so AI agents can work reliably.

Humans steer. Agents execute.

This repository documents the full learning journey: reading the original article, breaking down its concepts, forming independent views, running hands-on experiments, and producing shareable work. We hope it helps others exploring AI-native engineering.

Source: OpenAI — Harness Engineering: Harnessing Codex in an Agent-First World

Note: The insights shared here are not universally applicable. Please adapt them to your own context.

⚡ In One Sentence

Traditional:          Humans write code → Machines run code
Harness Engineering:  Humans design constraints → Agents write code → Machines run code

The core shift: an engineer’s output moves from code to constraint systems — AGENTS.md, architecture rules, custom linters, and feedback loops.

🧭 Six Core Concepts

1. Repo as System of Record — If it's not in the repo, it doesn't exist for the agent

Slack threads, Google Docs, knowledge in people’s heads = invisible to the agent. All decisions, specs, and plans must be committed as versioned artifacts.

→ See 01-repo-as-source-of-truth.md

2. Map, Not Manual — AGENTS.md is a table of contents, not an encyclopedia

A ~100-line entry file points to deeper docs. This is progressive disclosure: the agent starts from a small, stable entry point and is guided where to look next. A giant instruction file fails in three ways: it crowds out context, it is impossible to maintain, and it cannot be mechanically verified.

→ See 00-overview.md
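
One way to keep the entry file a map is to check it mechanically. The sketch below is illustrative only: the function name `check_entry_file`, the 100-line budget as a hard limit, and the assumption that pointers are written as markdown links are not taken from the original article.

```python
import re
from pathlib import Path

MAX_LINES = 100  # assumed budget, based on the "~100-line" guideline

# Captures the target of a markdown link, stopping at ")", "#", or whitespace.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)#\s]+)")


def check_entry_file(entry: Path, max_lines: int = MAX_LINES) -> list[str]:
    """Return a list of problems found in the entry file (empty means OK)."""
    text = entry.read_text(encoding="utf-8")
    problems = []

    line_count = len(text.splitlines())
    if line_count > max_lines:
        problems.append(
            f"{entry.name} has {line_count} lines; keep it under {max_lines} "
            "and move detail into linked docs"
        )

    for target in LINK_PATTERN.findall(text):
        if "://" in target:
            continue  # external URLs are out of scope for this sketch
        if not (entry.parent / target).exists():
            problems.append(f"{entry.name} links to missing file: {target}")

    return problems
```

Run in CI, a check like this turns "keep the map small and accurate" into something that can fail a build instead of a convention that drifts.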

3. Mechanical Enforcement — Docs rot; lint rules don't

Custom linters + structural tests = invariant guardians. Lint error messages embed fix instructions so agents can self-correct. Enforce boundaries centrally, allow autonomy locally.

→ See 02-mechanical-enforcement.md
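
As a rough illustration of a lint rule whose error message carries its own fix instruction, here is a minimal sketch built on Python's standard `ast` module. The layer names (`ui`, `db`), the `FORBIDDEN` table, and the suggested "service layer" remedy are hypothetical; a real project would encode its own architecture rules.

```python
import ast

# Hypothetical layering rule for illustration: code in the "ui" layer
# must not import from the "db" layer directly.
FORBIDDEN = {"ui": {"db"}}


def lint_imports(source: str, layer: str, filename: str = "<memory>") -> list[str]:
    """Return lint errors; each message says what is wrong and how to fix it."""
    errors = []
    banned = FORBIDDEN.get(layer, set())

    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules = [node.module]
        else:
            continue

        for module in modules:
            top = module.split(".")[0]
            if top in banned:
                errors.append(
                    f"{filename}:{node.lineno}: layer '{layer}' must not import "
                    f"'{module}'. Fix: call the '{top}' layer through the service "
                    "layer instead of importing it here."
                )
    return errors
```

The boundary is enforced centrally (one table, one check), while everything inside a layer is left to the agent. The "Fix:" clause is what lets an agent self-correct without a human explaining the rule.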

4. Agent Readability — Optimize for the agent's ability to reason

Prefer “boring” technologies (stable APIs, well-represented in training data). Sometimes re-implementing a focused subset is cheaper than wrapping opaque upstream behavior. Make the app launchable per git worktree.

→ See 04-agent-readability.md
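
Making the app launchable per git worktree usually means that two checkouts must not fight over shared resources such as a port. The sketch below shows one possible approach, not the method from the original article: derive a stable port from the worktree path. `BASE_PORT`, `PORT_SPAN`, and `port_for_worktree` are assumed names and values.

```python
import hashlib
from pathlib import Path

BASE_PORT = 20000  # assumed start of a free port range
PORT_SPAN = 10000  # assumed size of that range


def port_for_worktree(worktree: Path) -> int:
    """Map a worktree path to a stable port so each checkout can run its own instance."""
    digest = hashlib.sha256(str(worktree.resolve()).encode("utf-8")).digest()
    # The same path always yields the same port; distinct paths can still
    # collide, so a real launcher should verify the port is free before binding.
    return BASE_PORT + int.from_bytes(digest[:4], "big") % PORT_SPAN
```

An agent working in its own worktree can then start the app on `port_for_worktree(Path.cwd())` without coordinating with agents in other checkouts.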

5. Throughput Changes Merge Philosophy — Correction is cheap; waiting is expensive

Keep PR lifecycles short. Resolve flaky tests with re-runs rather than blocking merges indefinitely. In a system where agent throughput far exceeds human attention, this is usually the right call.

→ See 05-throughput-changes-merge.md
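
The re-run policy can be sketched as a small wrapper. This is an assumption-laden illustration: `run_with_reruns` and the default of three attempts are hypothetical, and `test` stands in for whatever actually executes a test.

```python
from typing import Callable


def run_with_reruns(test: Callable[[], bool], max_attempts: int = 3) -> tuple[bool, int]:
    """Run `test` until it passes or attempts run out; return (passed, attempts_used)."""
    for attempt in range(1, max_attempts + 1):
        if test():
            return True, attempt
    return False, max_attempts
```

Returning the attempt count keeps flakiness visible: a test that only passes on the third try still merges, but it can be logged and handed to a later cleanup task.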

6. Entropy Management = Garbage Collection — Tech debt is a high-interest loan

Agents reproduce existing patterns in the repo — including bad ones. Codify “golden rules” into the repo. Run periodic background tasks to scan for drift, update quality scores, and open targeted refactoring PRs.

→ See 03-entropy-and-garbage-collection.md
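
A periodic drift scan can be as simple as checking files against codified rules and tracking a score over time. The sketch below is a minimal illustration; the two rules in `GOLDEN_RULES`, the restriction to `*.py` files, and the score definition are all assumptions, not the original article's tooling.

```python
import re
from pathlib import Path

# Hypothetical golden rules: regex pattern -> guidance for the fix.
GOLDEN_RULES = {
    r"\bprint\(": "use the shared logger instead of print()",
    r"#\s*TODO\b": "turn TODOs into tracked issues",
}


def scan_for_drift(root: Path) -> dict[str, list[str]]:
    """Return {relative file path: [finding, ...]} for files that break a rule."""
    findings: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*.py")):
        hits = []
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for pattern, advice in GOLDEN_RULES.items():
                if re.search(pattern, line):
                    hits.append(f"line {lineno}: {advice}")
        if hits:
            findings[str(path.relative_to(root))] = hits
    return findings


def quality_score(findings: dict[str, list[str]], total_files: int) -> float:
    """Share of files with no findings, from 0.0 to 1.0."""
    if total_files == 0:
        return 1.0
    return 1 - len(findings) / total_files
```

Each entry in the findings is small and specific enough to become one targeted refactoring PR, which is the "garbage collection" half of the idea.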

🔑 Key Data Points

Metric	Data
Team size	3 → 7 engineers
Time span	5 months
Codebase	~1 million lines
PRs merged	~1,500
PRs per engineer per day	3.5 (still growing after scaling)
Single run duration	6+ hours (often during human sleep)
Efficiency estimate	~1/10 of manual coding time

📂 Repository Structure

harness-engineering/
├── README.md              ← Chinese (primary)
├── README.en.md           ← You are here
├── AGENTS.md              ← Repo navigation entry (for agents)
│
├── concepts/              # Phase 1: Concept notes (7 articles)
│   ├── 00-overview.md     #   Overview of all six concepts
│   ├── 01-repo-as-...     #   Repo as source of truth
│   ├── 02-mechanical-...  #   Mechanical enforcement
│   ├── 03-entropy-...     #   Entropy & garbage collection
│   ├── 04-agent-...       #   Agent readability
│   ├── 05-throughput-...  #   Throughput changes merge philosophy
│   └── 06-harness-...     #   Harness definition (Fowler control-theory extension)
│
├── thinking/              # Phase 2: Independent analysis (5 articles)
├── practice/              # Phase 3: Hands-on experiments (1 Ralph Demo)
├── feedback/              # Phase 4: Lessons learned (1 article)
├── works/                 # Phase 5: Shareable outputs (11 translations)
├── prompts/               # Validated prompts collection
└── references/            # External resource index (18 articles with deep summaries)

Each subdirectory has its own AGENTS.md explaining its purpose and conventions — a direct practice of the “progressive disclosure” principle from the original article.

🚀 Learning Path

Phase 1: Understand core concepts — 7 concept notes covering OpenAI’s six concepts + Fowler’s control-theory extension
Phase 2: Form your own opinions — 5 independent analyses (ongoing)
Phase 3: Pick a small project to practice — Ralph Demo completed (321s, $0.31)
Phase 4: Record feedback & iterations — 1 article (ongoing)
Phase 5: Produce shareable work — 11 professional translations

📚 Research Library

15 core articles + 3 extended readings across three knowledge tracks:

Track	Coverage	Perspectives
AI-Era Harness Engineering	15 articles	OpenAI → Fowler → Anthropic → LangChain → Stanford
Cloud-Native Harness.io	3 articles	CI/CD platform architecture (same name, different meaning)
Extended Reading	3 articles	Mitchell Hashimoto, Context Engineering, Human-Agent collaboration

See articles.md — each article includes core thesis, key data, and cross-article connections.

📖 Translations

11 Chinese translations of key articles:

Translation	Original Author	Source
⭐ Eight Years of Wanting	Lalit Maganti	Personal blog
Inside the Scaffold	Benjamin Rombaut	Huawei / arXiv
Meta-Harness	Yoonho Lee et al.	Stanford / arXiv
Harness Engineering (full)	Birgitta Böckeler	Martin Fowler
Harness Engineering (memo)	Birgitta Böckeler	Martin Fowler
Encoding Team Standards	Rahul Garg	Martin Fowler
Feedback Flywheel	Rahul Garg	Martin Fowler
Scaling Managed Agents	Lance Martin et al.	Anthropic
Agent Evaluation Checklist	LangChain Team	LangChain
Agent-driven Development	Tyler McGoffin	GitHub
Continual Learning	Harrison Chase	LangChain

Source Material

Resource	Description
OpenAI Original Article	The full Harness Engineering exposition

Ralph Series — Harness Engineering in Practice

The “Ralph Wiggum Loop” is the core implementation pattern of Harness Engineering: agents work autonomously in a loop until the task is complete.

Project	Stars	Description
snarktank/ralph	13.6k	Original Ralph: bash script that repeatedly spawns AI with fresh context until all PRD items pass. 6 core tenets
ralph-orchestrator	2.3k	Rust evolution: Hat-based personas + event-driven coordination + multi-backend (Claude/Kiro/Gemini/Codex) + backpressure gates + persistent memory
bmad-ralph	2	BMAD method + Ralph: parallel Claude Code worktrees + three-layer self-healing (retry → restart → diagnose) + SQLite state machine
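
The loop pattern described above can be sketched in a few lines. This is not the code of any project listed in the table: `ralph_loop`, the `items` dictionary standing in for PRD items, and the `run_agent` callable are hypothetical stand-ins for spawning a real agent process.

```python
from typing import Callable


def ralph_loop(
    items: dict[str, bool],
    run_agent: Callable[[str], bool],
    max_iterations: int = 10,
) -> int:
    """Work through unfinished items, one fresh agent run per iteration.

    Returns the number of agent runs used.
    """
    for iteration in range(1, max_iterations + 1):
        pending = [name for name, done in items.items() if not done]
        if not pending:
            return iteration - 1
        # Each call stands in for spawning a new agent with no memory of
        # earlier runs; the only state carried forward is what is recorded
        # in `items`.
        items[pending[0]] = run_agent(pending[0])
    return max_iterations
```

The `max_iterations` cap is a safety assumption added here so the sketch always terminates. Because nothing but `items` survives between iterations, a failed run costs one iteration and leaves no stale context behind.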

Ralph Tenets ↔ Harness Engineering Mapping

Ralph Tenet	Harness Engineering Concept
Fresh Context Is Reliability	Agent Readability — re-read everything each iteration
Backpressure Over Prescription	Mechanical Enforcement — don’t prescribe how; gate bad output
The Plan Is Disposable	Entropy Management — regeneration costs one planning loop
Disk Is State, Git Is Memory	Repo as System of Record — files are the handoff mechanism
Steer With Signals, Not Scripts	Humans Steer — add signs, not scripts
Let Ralph Ralph	Agents Execute — sit on the loop, not in it

Community & Extended

Resource	Description
vibe-coding-cn	Chinese Vibe Coding community guide
Mitchell Hashimoto: Engineer the Harness	Another origin of the “Harness” concept

🤝 Contributing

Contributions via Issues and PRs are welcome:

Add concept notes (concepts/ has gaps to fill)
Share your independent thinking (thinking/)
Contribute practice cases (practice/)
Recommend related resources (references/)

📞 Contact

Channel	Link
GitHub	@deusyu
X (Twitter)	@0xdeusyu
Telegram	@DeusThink
Telegram Group	@talkdeusyu
Telegram Channel	@lovedesuyu
Email	rainman.deus@gmail.com

Star History

If you find this project helpful, please consider giving it a Star ⭐!

📄 License

MIT
