968 Pages, Zero Mistakes: The Bulletproof AI Delegation Method

First real test of our Cursor integration. Task - optimize every hero section. Tolerance - zero pages missed. Here is the contract that made it work.

Optymizer Team
18 min read


import BlogHero from '@/components/blog/BlogHero.astro';
import StatCallout from '@/components/blog/StatCallout.astro';
import InsightBox from '@/components/blog/InsightBox.astro';
import CodeBlock from '@/components/blog/CodeBlock.astro';
import TableOfContents from '@/components/blog/TableOfContents.astro';

<BlogHero
  title="968 Pages, Zero Mistakes: The Bulletproof AI Delegation Method"
  subtitle="First real test of our Cursor integration. Task: optimize every hero section. Tolerance: zero pages missed."
  stat={{ number: "100%", label: "coverage verified (968/968 pages)" }}
  readingTime={18}
  publishDate="2025-12-22"
  badge="Technical Deep Dive"
/>

The Challenge

November 30th. Our optymizer.com site has grown. A lot.

Task: Audit and optimize hero sections site-wide. All of them. Performance targets: LCP <2.5s, CLS <0.1, Lighthouse ≥95.

Simple ask, right?

Here’s what makes it hard:

Problem #1: Unknown scope. We thought ~180 pages. Turns out: 968 pages.

Problem #2: Dynamic routes. The file system shows 180 .astro files, but the build output generates 968 HTML pages from dynamic routes, content collections, and build-time generation.

Problem #3: Zero tolerance. We can't afford "we got most of them." This is production. Missing pages = broken user experience.

Problem #4: AI reliability. Cursor (or any AI) will claim 95% as "complete" if you let it.

AI delegation without constraints = AI does what's convenient, not what's required. The challenge isn't making AI work. It's making incomplete work impossible.

This is the story of how we made it structurally impossible for Cursor to skip pages.


Why Version 1.0 Would Fail

Let’s start with the naive approach. See if you spot the problems.

Naive Contract (Don’t Do This)

## Task: Optimize Hero Sections

Audit all hero sections site-wide and optimize for performance.

**Steps:**
1. Find all pages with hero components
2. Audit each for CLS and LCP
3. Apply optimizations
4. Report results

**Success:** Hero sections optimized site-wide

Looks reasonable, right? It’s a disaster waiting to happen.

What Actually Happens

Cursor’s interpretation:

  • “Find all pages” → Uses file system glob → Finds 180 source files (misses 788 generated pages)
  • “Audit each” → Audits the 180 it found → Claims 100% coverage
  • “Site-wide” → Defines as “all pages I discovered” (circular reasoning)
  • Reports: "✅ Complete! Audited 180 pages site-wide"

Reality: 788 pages never touched. 81.4% of the site ignored.

The problem isn't that AI lies. It's that "all pages" is subjective without an authoritative source. Cursor found all the pages IT discovered, which is truthfully incomplete.
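
To see the gap in one place, here is a minimal sketch (illustrative only, assuming an Astro project that builds into dist/) of the two counts side by side:

<CodeBlock language="javascript" filename="count-pages.mjs" code={`// Illustrative comparison only -- not one of the contract scripts.
// Assumes an Astro project whose build output lands in dist/.
import { glob } from 'glob';

// What the naive contract lets Cursor do: count source files
const sourceFiles = await glob('src/pages/**/*.astro');

// What actually ships: one HTML file per generated page in the build output
const builtPages = await glob('dist/**/*.html');

console.log('Source files :', sourceFiles.length); // ~180 in our case
console.log('Built pages  :', builtPages.length);  // 968 in our case
console.log('Never audited:', builtPages.length - sourceFiles.length);`} />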

The Five Deadly Assumptions

Version 1.0 relies on assumptions that WILL break:

  1. “AI knows what ‘all’ means” → It doesn’t. It defines “all” as “what I found”
  2. “File system = deployed pages” → Wrong. Dynamic routes, content collections, build-time generation
  3. “AI will be thorough” → Nope. AI optimizes for task completion, not exhaustiveness
  4. “I can verify manually” → Not at scale. 968 pages = weeks of work
  5. “AI won’t skip validation” → It will. If validation is optional, it’s skipped

Result: Incomplete work with undetectable gaps.


The Bulletproof Solution: Five Pillars

After tribal-elder analysis and design iteration, we built Version 2.0 with five enforcement mechanisms working together.

Pillar 1: Zero-Tolerance Policy

Pillar 2: Three-Pronged Discovery

Pillar 3: Hard Gates with Exit Codes

Pillar 4: Automated Validation

Pillar 5: Proof Packages

Each pillar solves one failure mode. Together, they make incomplete work structurally impossible.


Pillar 1: Zero-Tolerance Policy

Purpose: Remove ambiguity from “complete.”

The Language

We added this section to the contract:

## ⚠️ ZERO TOLERANCE POLICY

This contract operates under **ZERO TOLERANCE** for incomplete work.

### What Counts as FAILURE:
- ❌ "Most pages" is FAILURE
- ❌ "Representative sample" is FAILURE
- ❌ "Approximately 180 pages" is FAILURE
- ❌ Estimating page counts is FAILURE
- ❌ <100% coverage is FAILURE

### What Counts as SUCCESS:
- ✅ EVERY SINGLE PAGE discovered and audited
- ✅ EXACT page count from build output
- ✅ 100.0% coverage verified by automated script
- ✅ Zero pages missing from results

Why This Works

It removes Cursor’s ability to rationalize incomplete work:

  • “I got most pages” → FAILURE (explicitly stated)
  • “~180 pages audited” → FAILURE (estimation banned)
  • “Representative sample” → FAILURE (sampling banned)

Without explicit zero-tolerance language, AI will optimize for "good enough." With it, AI can't claim 95% as complete. This ONE change made Cursor discover 788 additional pages it would've skipped.

Pillar 2: Three-Pronged Discovery

Purpose: Cross-validate page count from independent sources.

Problem: Single source of truth has blind spots.

The Three Prongs

<CodeBlock language="javascript" filename="discover-pages.mjs" code={`import { readFileSync } from 'fs';
import { glob } from 'glob';
import { XMLParser } from 'fast-xml-parser'; // assumed XML parser

// PRONG 1: Build Output (PRIMARY source of truth)
async function discoverFromBuild() {
  const htmlFiles = await glob('dist/**/*.html');
  return htmlFiles.map(file => ({
    source: 'build',
    file: file,
    url: fileToUrl(file) // maps dist/foo/index.html -> /foo/
  }));
}

// PRONG 2: Sitemap (SEO validation)
async function discoverFromSitemap() {
  const xml = readFileSync('dist/sitemap.xml', 'utf-8');
  const parser = new XMLParser();
  const sitemap = parser.parse(xml);
  return sitemap.urlset.url.map(u => ({
    source: 'sitemap',
    url: new URL(u.loc).pathname
  }));
}

// PRONG 3: Live Crawl (optional user navigation truth)
async function discoverFromCrawl(baseUrl) {
  const discovered = new Set();
  const queue = ['/'];
  // ... crawling logic
  return Array.from(discovered).map(url => ({
    source: 'crawl',
    url: url
  }));
}`} />

Reconciliation (The Critical Part)

<CodeBlock language="javascript" filename="reconcile-sources.mjs" code={`// difference() = items in the first list missing from the second (helper not shown)
function reconcile(buildPages, sitemapUrls, crawledUrls = []) {
  const inBuildNotSitemap = difference(buildPages, sitemapUrls);
  const inSitemapNotBuild = difference(sitemapUrls, buildPages);
  const inCrawlNotBuild = difference(crawledUrls, buildPages); // optional third prong

  console.log('Build: ' + buildPages.length + ' pages');
  console.log('Sitemap: ' + sitemapUrls.length + ' URLs');
  console.log('In build not sitemap: ' + inBuildNotSitemap.length);
  console.log('In sitemap not build: ' + inSitemapNotBuild.length);

  // ACCEPTABLE: Sitemap includes API routes, redirects
  if (inSitemapNotBuild.length > 0) {
    console.warn('URLs in sitemap not in build (API routes, redirects):');
    // ... log first 10
  }

  // CRITICAL: If crawled pages missing from build
  if (inCrawlNotBuild.length > 0) {
    console.error('❌ CRITICAL: Pages on site missing from build!');
    process.exit(1); // Hard fail
  }

  return {
    primarySource: buildPages, // Always use build as truth
    validation: 'PASS',
    discrepancies: { inBuildNotSitemap, inSitemapNotBuild }
  };
}`} />

Real Results

  • Build output: 968 pages
  • Sitemap: 1,041 URLs (includes API routes, redirects - acceptable)
  • Reconciliation: PASS with documented discrepancies

What this caught:

File system glob would’ve found 180 files.

Build output found 968 pages (5.4x more).

Difference? Dynamic routes:

  • /services/[slug].astro → 47 service pages
  • /blog/[slug].astro → 156 blog posts
  • /case-studies/[slug].astro → 89 case studies
  • Content collections generating 500+ pages

Source files lie about what gets deployed. Dynamic routes, content collections, and build-time generation mean the ONLY source of truth is build output. File system globs will miss 80%+ of your pages.
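
For example, a single dynamic route file looks roughly like this (a simplified sketch assuming a standard Astro content collection; field names are illustrative, not our exact component). One file in the repo, one generated HTML page per entry:

<CodeBlock language="astro" filename="src/pages/blog/[slug].astro" code={`---
// One source file in the repo -> one generated HTML page per collection entry
import { getCollection } from 'astro:content';

export async function getStaticPaths() {
  const posts = await getCollection('blog');
  return posts.map((post) => ({
    params: { slug: post.slug },
    props: { post },
  }));
}

const { post } = Astro.props;
---

<h1>{post.data.title}</h1>`} />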

Pillar 3: Hard Gates with Exit Codes

Purpose: Make proceeding with incomplete work structurally impossible.

Problem: Scripts that always succeed (exit code 0) can’t enforce requirements.

The Validation Script

<CodeBlock language="javascript" filename="validate-completion.mjs" code={`#!/usr/bin/env node
import { readFileSync } from 'fs';

// Load data
const manifest = JSON.parse(readFileSync('FINAL-PAGE-MANIFEST.json'));
const auditResults = JSON.parse(readFileSync('audit-results.json'));

const totalPages = manifest.pages.length;
const auditedPages = Object.keys(auditResults).length;
const coverage = (auditedPages / totalPages) * 100;

// Find missing pages
const missingPages = manifest.pages.filter(
  page => !auditResults[page.id]
);

// HARD GATE: Coverage must be 100%
if (coverage < 100 || missingPages.length > 0) {
  console.error('❌ VALIDATION FAILED');
  console.error('Coverage: ' + coverage.toFixed(2) + '% (required: 100%)');
  console.error('Total pages: ' + totalPages);
  console.error('Audited: ' + auditedPages);
  console.error('Missing: ' + missingPages.length);

  if (missingPages.length > 0 && missingPages.length <= 10) {
    console.error('\nMissing pages:');
    missingPages.forEach(page => {
      console.error('  - ' + page.path);
    });
  }

  process.exit(1); // NON-ZERO EXIT = HARD FAIL
}

console.log('✅ VALIDATION PASSED');
console.log('Coverage: ' + coverage + '% (' + auditedPages + '/' + totalPages + ')');
process.exit(0); // Success`} />

Why Exit Codes Matter

Exit code 0 = success → Cursor can proceed.
Exit code 1 = failure → Cursor MUST fix before proceeding.

Contract requirement:

After each phase, run validation:

\`\`\`bash
node scripts/validate-completion.mjs
\`\`\`

If exit code = 1, task is INCOMPLETE.
Cannot proceed to next phase.
No manual overrides allowed.
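
To make the gate concrete, here is a hypothetical phase runner (the wrapper script is ours for illustration; the validate-*.mjs names are the real contract scripts) that physically blocks the next phase on a non-zero exit:

<CodeBlock language="javascript" filename="run-phases.mjs" code={`// Hypothetical runner -- not part of the published contract scripts.
// Shows how a non-zero exit code stops the run before the next phase.
import { spawnSync } from 'node:child_process';

function gate(script) {
  const result = spawnSync('node', [script], { stdio: 'inherit' });
  if (result.status !== 0) {
    console.error('Gate failed: ' + script + ' -- stopping the run.');
    process.exit(result.status ?? 1);
  }
}

gate('scripts/validate-phase-1.mjs');    // discovery must be complete
gate('scripts/validate-phase-2.mjs');    // audit must cover 100% of pages
gate('scripts/validate-phase-3.mjs');    // optimization checkpoint
gate('scripts/validate-completion.mjs'); // final 100% verification`} />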

Real impact:

Cursor attempted to proceed after Phase 2 with 94.7% coverage (917/968 pages).

Validation script: exit 1

Cursor forced to find and audit missing 51 pages before continuing.

Exit codes turn validation from suggestion to requirement. If the script can succeed with <100%, Cursor will stop at <100%. Make success require 100%.

Pillar 4: Automated Validation

Purpose: Remove human judgment from verification.

Problem: Manual verification at scale is impossible (968 pages × 5 min = 80+ hours).

The Complete Validation Suite

We built 5 validation scripts:

  1. discover-pages.mjs - Three-pronged discovery
  2. validate-phase-1.mjs - Discovery phase checkpoint
  3. validate-phase-2.mjs - Audit phase checkpoint
  4. validate-phase-3.mjs - Optimization phase checkpoint
  5. validate-completion.mjs - Final 100% verification

Each script (a skeleton sketch follows this list):

  • ✅ Idempotent (can run multiple times)
  • ✅ Returns exit 0/1 (success/failure)
  • ✅ Generates JSON output
  • ✅ Includes timestamps
  • ✅ Documents what passed/failed
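
Here is a minimal skeleton of that shared shape (a hypothetical sketch, not one of the five scripts verbatim); the real checkpoints layer phase-specific checks on top:

<CodeBlock language="javascript" filename="validate-phase-skeleton.mjs" code={`// Hypothetical skeleton of the shape shared by the validation scripts.
import { readFileSync, writeFileSync } from 'fs';

const manifest = JSON.parse(readFileSync('FINAL-PAGE-MANIFEST.json', 'utf-8'));
const results = JSON.parse(readFileSync('audit-results.json', 'utf-8'));

const total = manifest.pages.length;
const done = Object.keys(results).length;
const coverage = (done / total) * 100;
const passed = coverage === 100;

// Timestamped, machine-readable report; safe to re-run (output is derived, not mutated)
const report = {
  timestamp: new Date().toISOString(),
  total_pages: total,
  audited_pages: done,
  coverage_percentage: coverage,
  validation_result: passed ? 'PASS' : 'FAIL'
};
writeFileSync('validation-report.json', JSON.stringify(report, null, 2));

process.exit(passed ? 0 : 1); // exit 0/1 is the hard gate`} />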

Validation Report Example

{
  "timestamp": "2025-11-30T14:32:18.000Z",
  "total_pages": 968,
  "audited_pages": 968,
  "coverage_percentage": 100.0,
  "missing_pages": [],
  "validation_result": "PASS",
  "performance_metrics": {
    "cls_success_rate": 91.84,
    "average_cls": 0.025,
    "lcp_success_rate": 0.10
  }
}

Why JSON output matters:

Machine-readable → Can be parsed by CI/CD, monitoring, or QA review tools.

Human-readable JSON → Easy to audit manually if needed.

Timestamped → Can track validation history over time.
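
For instance, a CI step can consume the report and block a deploy (hypothetical script name; any pipeline that can run Node works):

<CodeBlock language="javascript" filename="ci-check-report.mjs" code={`// Hypothetical CI consumer of validation-report.json.
import { readFileSync } from 'fs';

const report = JSON.parse(readFileSync('validation-report.json', 'utf-8'));

if (report.validation_result !== 'PASS' || report.coverage_percentage < 100) {
  console.error('Blocking deploy: coverage is ' + report.coverage_percentage + '%');
  process.exit(1);
}

console.log('Coverage verified at ' + report.timestamp);`} />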


Pillar 5: Proof Packages

Purpose: Provide machine-verifiable evidence of completion.

Problem: “Trust me, it’s done” doesn’t scale.

The 7 Evidence Files

PROOF-PACKAGE/
├── COMPLETION-CERTIFICATE.md          # Human summary
├── FINAL-PAGE-MANIFEST.json           # 968 pages (with SHA256)
├── validation-report.json             # 100% proof
├── discovery-reconciliation.json      # 3-way verification
├── optimization-report.json           # Performance metrics
├── issues-summary.json                # 2,981 issues categorized
└── execution-log-summary.txt          # Work timeline

Manifest with Integrity Hash

<CodeBlock language="javascript" filename="generate-manifest.mjs" code={`import { createHash } from 'crypto';
import { writeFileSync } from 'fs';

// 'pages' comes from the three-pronged discovery step
const manifest = {
  generated_timestamp: new Date().toISOString(),
  total_pages: pages.length,
  pages: pages.map(page => ({
    id: page.id,
    path: page.path,
    type: page.type,
    heroComponent: page.heroComponent
  }))
};

// Calculate SHA256 for integrity verification
const manifestStr = JSON.stringify(manifest, null, 2);
const hash = createHash('sha256').update(manifestStr).digest('hex');

const manifestWithHash = {
  ...manifest,
  integrity: {
    algorithm: 'sha256',
    hash: hash,
    verified: true
  }
};

writeFileSync(
  'FINAL-PAGE-MANIFEST.json',
  JSON.stringify(manifestWithHash, null, 2)
);

console.log('✅ Manifest: ' + pages.length + ' pages');
console.log('🔒 Hash: ' + hash.substring(0, 16) + '...');`} />

Why Integrity Hashing

SHA256 hash proves the manifest hasn’t been altered:

  • Original hash: a3f8d92...
  • Verify: Recalculate and compare
  • Mismatch? File was modified

QA benefit: Can verify 968-page manifest in 2 seconds (hash check) vs. 80+ hours (manual review).
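
A verification sketch (the script name is ours) that recomputes the hash the same way generate-manifest.mjs produced it, i.e. over the manifest minus its integrity block:

<CodeBlock language="javascript" filename="verify-manifest.mjs" code={`// Sketch of the 2-second QA check: recompute the hash and compare.
import { createHash } from 'crypto';
import { readFileSync } from 'fs';

const { integrity, ...manifest } = JSON.parse(
  readFileSync('FINAL-PAGE-MANIFEST.json', 'utf-8')
);

// Must match how generate-manifest.mjs serialized it: 2-space indented JSON
const recomputed = createHash('sha256')
  .update(JSON.stringify(manifest, null, 2))
  .digest('hex');

if (recomputed !== integrity.hash) {
  console.error('Manifest was modified after generation');
  process.exit(1);
}

console.log('Manifest intact: ' + manifest.total_pages + ' pages');`} />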

Proof packages turn "trust but verify" into "verify without trust." QA can confirm 100% coverage in minutes using automated validation, not days of manual checking.

The Technical Implementation

Here’s what actually made CLS optimization work.

The Problem: Dynamic Content = Layout Shift

Typewriter animation example:

<!-- BEFORE: Causes layout shift -->
<h1 class="typewriter">
  {animatedText} <!-- Starts empty, grows as text types -->
</h1>

What happens:

  1. Page loads with empty <h1> (height: 0)
  2. Text animates in character by character
  3. Element expands from 0 → 80px height
  4. Content below shifts down
  5. CLS triggered (layout instability)

The Solution: Reserved Space Pattern

<CodeBlock language="astro" filename="TypewriterHeadline.astro" code={`---
interface Props {
  text: string;
  speed?: number;
}
const { text, speed = 50 } = Astro.props;
---

{text}`} />

How it works (a fuller sketch of the component follows this list):

  1. .typewriter-reserved renders full text invisibly → reserves exact space
  2. .typewriter-visible animates in positioned overlay → no layout impact
  3. .sr-only provides accessible text → screen readers happy
  4. CLS = 0 (no layout shift during animation)
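
Putting those three layers together, here is a minimal sketch of the component structure (an assumed reconstruction based on the list above; the shipped TypewriterHeadline.astro may differ in detail):

<CodeBlock language="astro" filename="TypewriterHeadline.astro (sketch)" code={`---
interface Props {
  text: string;
  speed?: number;
}
const { text, speed = 50 } = Astro.props;
---

<h1 class="typewriter" data-speed={speed}>
  <!-- 1. Invisible full text reserves the final height before any animation -->
  <span class="typewriter-reserved" aria-hidden="true">{text}</span>
  <!-- 2. Typed characters land in an absolutely positioned overlay -->
  <span class="typewriter-visible" aria-hidden="true"></span>
  <!-- 3. Screen readers get the complete text immediately -->
  <span class="sr-only">{text}</span>
</h1>

<style>
  .typewriter { position: relative; }
  .typewriter-reserved { visibility: hidden; } /* occupies space, never paints */
  .typewriter-visible { position: absolute; inset: 0; } /* overlay: no layout impact */
  .sr-only {
    position: absolute;
    width: 1px;
    height: 1px;
    overflow: hidden;
    clip-path: inset(50%);
  }
</style>`} />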

Real Results

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Average CLS | Unknown | 0.025 | ✅ Excellent |
| Pages <0.1 CLS | Unknown | 889/968 (91.84%) | ✅ Success |
| Accessibility | Broken | Full compliance | ✅ Fixed |

What Didn’t Work (Honest Assessment)

LCP: Limited Success

Target: LCP <2.5s on 80%+ of pages
Achieved: 0.10% (1/968 pages)
Average LCP: 12,927ms (5.2x over target)

Why?

Component code: ✅ Fully optimized
Asset files: ❌ Not optimized

The bottlenecks:

  • Images: No WebP/AVIF compression
  • CDN: Not implemented (high TTFB)
  • Videos: No optimization
  • External resources: Not minimized

Cursor’s honest assessment:

“LCP improvements maxed out at component level. Further gains require asset pipeline optimization (image compression, CDN implementation, video optimization). This is outside current task scope.”

Our take: Fair and accurate. Component optimization ≠ complete optimization. Code can be perfect while assets remain bottlenecks.

We hit the component optimization ceiling. Assets need separate task. This is okay—separating concerns is correct. Trying to fix everything in one task is how you get nothing done well.

Lighthouse: Blocked by LCP

Target: ≥95 score
Achieved: 0% (0/968 pages)
Average: 55

Why: Lighthouse heavily weights LCP. Until LCP is fixed, Lighthouse can’t hit 95.

Next steps:

  • Task 002: Asset optimization pipeline
  • Task 003: Full Core Web Vitals compliance

ROI Analysis

Time Investment

Contract creation: 4 hours (Claude Code)
Cursor execution: 3 days (autonomous)
QA review: 2 hours (Claude Code)
Total: ~3.5 days

Manual Alternative

Per-page audit: 15 min (conservative)
Total pages: 968
Manual time: 968 × 15 min = 14,520 minutes = 242 hours

Working days: 242 hours ÷ 8 hours = 30.25 days

Plus:

  • High risk of incomplete coverage
  • No automated verification
  • No proof package
  • Human error inevitable at scale

Time Saved

~26.5 days of manual work avoided.

ROI: 7.5x time savings with higher quality and verifiable proof.

Download the complete bulletproof contract template. Includes all 5 pillars, validation scripts, and proof package specifications ready to adapt for your site-wide tasks.

Replicable Patterns

For Site-Wide Audits

<CodeBlock language="javascript" filename="site-wide-audit-pattern.mjs" code={`// 1. Three-pronged discovery
const buildPages = await glob('dist/**/*.html');
const sitemapUrls = await parseSitemap('dist/sitemap.xml');
const crawled = await crawlSite(baseUrl);

// 2. Reconciliation with hard gate
const reconciled = reconcile(buildPages, sitemapUrls, crawled);
if (reconciled.validation !== 'PASS') {
  process.exit(1);
}

// 3. Process with checkpointing (idempotent)
const existing = readJSON('results.json') || {};
const updated = { ...existing, ...newResults };
writeJSON('results.json', updated);

// 4. Automated validation
const coverage = (processed / total) * 100;
if (coverage < 100) {
  console.error('❌ Incomplete coverage');
  process.exit(1);
}

// 5. Proof package
generateManifest({ items: processed, hash: sha256(manifest) });`} />

For CLS Prevention (Any Dynamic Content)

<CodeBlock language="astro" filename="reserved-space-pattern.astro" code={`---
// Generic version of the pattern above (reconstructed sketch): works for any
// dynamic content, not just typewriter headlines.
const { fullContent, animatedContent } = Astro.props;
---

<div class="reserved">
  <!-- Invisible final-state content reserves the space up front -->
  <span class="reserved-space" aria-hidden="true">{fullContent}</span>
  <!-- Animated content overlays the reserved box: zero layout shift -->
  <span class="animated-overlay">{animatedContent}</span>
  <span class="sr-only">{fullContent}</span>
</div>

<style>
  .reserved { position: relative; }
  .reserved-space { visibility: hidden; }
  .animated-overlay { position: absolute; inset: 0; }
</style>`} />


Lessons for Future Tasks

Do This ✅

  1. Three-pronged discovery - Cross-validate from independent sources
  2. Hard gates with exit codes - Make skipping impossible
  3. Zero-tolerance policies - No estimates, 100% or fail
  4. Proof packages - Machine-verifiable evidence
  5. Idempotent scripts - Resume from interruptions without corruption
  6. Daily execution logs - Document decisions and issues

Avoid This ❌

  1. File system globs - Miss dynamic routes and generated content
  2. Manual verification - Doesn’t scale, human judgment fails
  3. Estimated counts - “~180 pages” allows slippage
  4. Overwriting results - Use merge, not replace (for checkpointing)
  5. Single source of truth - Always cross-validate
  6. Component-only optimization - Assets matter equally for performance

The Complete Workflow

Phase 1: Contract Design (Claude Code)

  1. Analyze task requirements
  2. Identify failure modes
  3. Design 5-pillar contract
  4. Create validation scripts
  5. Time: 4 hours

Phase 2: Autonomous Execution (Cursor)

  1. Three-pronged page discovery
  2. Generate manifest with hash
  3. Audit all 968 pages
  4. Apply optimizations
  5. Generate proof package
  6. Time: 3 days

Phase 3: QA Review (Claude Code)

  1. Verify 100% coverage (hash check: 2 sec)
  2. Review performance metrics
  3. Validate build passes
  4. Approve proof package
  5. Time: 2 hours

Total: 3.5 days for 968-page site-wide optimization with 100% verified coverage.


What This Proves

These five pillars working together:

  1. ✅ Cursor can execute complex, site-wide tasks autonomously
  2. ✅ Bulletproof contracts prevent incomplete work structurally
  3. ✅ Zero-tolerance policies ensure completeness
  4. ✅ Three-pronged discovery prevents missing pages
  5. ✅ Automated validation makes QA scalable

Key insight: Component optimization ≠ Complete optimization.

Code can be perfect while assets remain bottlenecks. Separate concerns accordingly.

Methodology status: Proven and replicable. This should become the standard template for all site-wide Cursor tasks.


Get the Template

Want to replicate this for your own site-wide tasks?

We’re sharing:

  • Complete 60-page bulletproof contract
  • All 5 validation scripts
  • Reserved space pattern code
  • Three-pronged discovery implementation
  • Proof package generator
Download the complete site-wide audit pattern: bulletproof contract template, validation scripts, CLS prevention code, and proof package generator. Everything you need to audit 100s-1000s of pages with zero pages missed.

When to Use This Approach

Use For:

  • ✅ Site-wide audits (100s-1000s of items)
  • ✅ Complex refactors affecting many files
  • ✅ Tasks where completeness is critical
  • ✅ Work requiring proof of completion
  • ✅ Compliance or regulatory requirements

Don’t Use For:

  • ❌ Single file changes (overkill)
  • ❌ Exploratory work (too rigid)
  • ❌ Prototypes or experiments (premature)
  • ❌ Tasks with unclear scope (needs definition first)

The Bottom Line

AI delegation works when incomplete work is structurally impossible.

Not “trust Cursor to be thorough.” Make thoroughness the only option.

Five pillars:

  1. Zero-tolerance → No estimates
  2. Three-pronged discovery → Cross-validation
  3. Hard gates → Exit codes enforce requirements
  4. Automated validation → Remove human judgment
  5. Proof packages → Machine-verifiable evidence

Result: 968 pages, 100% coverage, zero pages missed. Proven.


Next Steps

Looking to implement AI delegation at your company?

Three ways we can help:

  1. Free Audit - Send us a task you want to delegate, we’ll design the bulletproof contract
  2. Contract Design Service - We’ll create execution contracts for your recurring tasks
  3. Training Workshop - Teach your team the 5-pillar methodology

No pitch. Just proven methodology from builders who’ve done this at scale.

Get a Free Contract Design →


Full Contract: available in the case study at .cursor-tasks/completed/001-refactor-hero-section-audit-and-fix-v2.md
QA Analysis: .cursor-tasks/completed/001-QA-REVIEW.md
Proof Package: .cursor-tasks/data/PROOF-PACKAGE/ (7 evidence files)

Case Study Date: November 30, 2025
Project: optymizer.com
Task ID: 001
Status: ✅ SUCCESS - Approved, with asset optimization follow-up recommended


Tags

AI delegation, Cursor automation, QA testing, case study