30-day field trial plan · checked May 19, 2026

What I would prove in the first month on Codex deployment.

This page answers the hiring-manager version of the question: not whether I can talk about AI coding, but what evidence I would produce if given a narrow field lane for 30 days.

Figure: AgentProof demo report showing browser evidence from a tested web app.

Deliverable

One instrumented deployment loop

By day 30, the team should have one reusable field kit: a customer-neutral intake, a scope card, two reference demos, a 90-minute workshop script, a verifier-backed handoff receipt, and a product feedback packet.

The public field-kit templates are here: AI Coding Deployment Field Kit. A filled example is here: AI Coding Deployment Sample Run. Runnable receipt proof is here: AI Coding Workflow Receipt Reference. The acceptance scorecard is here: AI Coding Field Trial Scorecard.

This is deliberately smaller than owning a portfolio. It is the fastest truthful path from high-potential candidate to measurable Codex deployment asset.

Acceptance

Scorecard gate

The plan now has a runnable acceptance scorecard. It validates role-need coverage, verifier evidence, release gates, official source URLs, and the no-outbound boundary before the field trial can be called ready.

python3 tools/deployment_receipt.py validate-scorecard --input examples/field_trial_acceptance_scorecard.json
python3 tools/deployment_receipt.py scorecard --input examples/field_trial_acceptance_scorecard.json

Current result: ready, 4.4/5 average, no blocked gates, no outbound before 2026-05-26. This is still public-safe readiness proof, not real customer rollout evidence.
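The gate described above can be sketched roughly as follows. This is a minimal illustration, not the logic of tools/deployment_receipt.py: the JSON field names (role_needs, release_gates, no_outbound_before, outbound), the 4.0 threshold, and the sample data are all assumptions made for the example, and the official-source-URL check is left out for brevity.

```python
from datetime import date


def evaluate_scorecard(card: dict, min_average: float = 4.0) -> dict:
    """Return a readiness verdict for a scorecard dict (hypothetical schema)."""
    needs = card["role_needs"]
    scores = [need["score"] for need in needs]
    average = sum(scores) / len(scores)

    # A role need without verifier evidence cannot count as covered.
    unverified = [need["name"] for need in needs if not need.get("evidence")]

    # Any gate marked "blocked" stops the trial from being called ready.
    blocked = [g["name"] for g in card["release_gates"] if g["status"] == "blocked"]

    # No-outbound boundary: nothing outbound may be dated before the boundary.
    boundary = date.fromisoformat(card["no_outbound_before"])
    early_outbound = [
        item["date"]
        for item in card.get("outbound", [])
        if date.fromisoformat(item["date"]) < boundary
    ]

    problems = unverified + blocked + early_outbound
    ready = not problems and average >= min_average
    return {
        "status": "ready" if ready else "not_ready",
        "average": round(average, 1),
        "unverified": unverified,
        "blocked_gates": blocked,
        "early_outbound": early_outbound,
    }


# Illustrative data only; not the contents of the real scorecard file.
sample_card = {
    "role_needs": [
        {"name": "demos", "score": 4, "evidence": "receipt-001"},
        {"name": "workshops", "score": 5, "evidence": "receipt-002"},
    ],
    "release_gates": [{"name": "public-safe", "status": "passed"}],
    "no_outbound_before": "2026-05-26",
    "outbound": [],
}
result = evaluate_scorecard(sample_card)
print(result["status"], result["average"])
```

Returning the lists of unverified needs, blocked gates, and early outbound items, rather than a bare pass or fail, lets a reviewer see why a scorecard is not ready.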

30 days

The run

Week 1
Field map and intake

Learn the customer motion before building theater.
- task-selection rubric;
- scope card with owner, data boundary, rollback path, acceptance criteria, and verifier;
- reject list for unsafe or unclear demo candidates.
Proof: one reviewer can mark a proposed demo ready, repair, or reject.
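One way to make the ready, repair, or reject decision mechanical is to treat the scope card as a typed record. The sketch below is an assumption about shape, not the field kit's actual template: the class name, the on_reject_list flag, and the rule that any empty field means "repair" are hypothetical.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ScopeCard:
    """Scope card with the five fields named in the Week 1 list."""
    owner: str = ""
    data_boundary: str = ""
    rollback_path: str = ""
    acceptance_criteria: list[str] = field(default_factory=list)
    verifier: str = ""
    on_reject_list: bool = False  # hypothetical flag set by the reviewer


def review(card: ScopeCard) -> tuple[str, list[str]]:
    """Return (decision, missing_fields) for a proposed demo."""
    if card.on_reject_list:
        return "reject", []
    # Any empty required field sends the card back for repair.
    missing = [
        f.name
        for f in fields(card)
        if f.name != "on_reject_list" and not getattr(card, f.name)
    ]
    return ("repair", missing) if missing else ("ready", [])


draft = ScopeCard(
    owner="platform team",
    data_boundary="public repo only",
    acceptance_criteria=["tests pass"],
    verifier="pytest -q",
)
print(review(draft))
```

Here the draft has no rollback path, so the reviewer gets a repair decision that names the missing field instead of a judgment call.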
Week 2
Reference demos

Produce demos that are falsifiable, not merely impressive.
- repo maintenance demo with changed files, command proof, limitation, and rollback note;
- team adoption demo with business context, user, verifier, failure mode, and receipt.
Proof: two demos can be rerun or critiqued by an engineer without the presenter in the room.
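For a demo to be rerun without the presenter, the command proof has to be recorded rather than narrated. A minimal sketch, assuming a receipt is a plain dictionary: the function names and keys below are hypothetical and do not reflect the format used by the receipt reference.

```python
import hashlib
import subprocess
import sys


def command_proof(argv: list[str]) -> dict:
    """Run a command and record what a reviewer needs to compare a rerun."""
    done = subprocess.run(argv, capture_output=True, text=True)
    return {
        "argv": argv,
        "exit_code": done.returncode,
        # Hash the output so the receipt stays small but still comparable.
        "stdout_sha256": hashlib.sha256(done.stdout.encode()).hexdigest(),
    }


def demo_receipt(changed_files, argv, limitation, rollback_note) -> dict:
    """Bundle the four items the repo maintenance demo is expected to carry."""
    return {
        "changed_files": list(changed_files),
        "command_proof": command_proof(argv),
        "limitation": limitation,
        "rollback_note": rollback_note,
    }


receipt = demo_receipt(
    changed_files=["README.md"],
    argv=[sys.executable, "-c", "print('ok')"],
    limitation="illustrative command, not a real test suite",
    rollback_note="git revert the demo commit",
)
print(receipt["command_proof"]["exit_code"])
```

An engineer who reruns the same argv and gets a different exit code or output hash has a concrete point to critique.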
Week 3
Workshop and failure taxonomy

Turn demos into team behavior.
- scope card, context load, live run, failure map, rollout gate, handoff receipt;
- failure buckets for criteria, repo context, environment, verifier, tool mismatch, and review boundary.
Proof: a decision table separates ready, repair, and human-owned task patterns.
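The decision table can be sketched as a lookup from observed failure buckets to one of the three outcomes. The bucket names come from the list above; the mapping itself is an assumption made for illustration (here only a review-boundary failure makes a task human-owned), and a real table would be set by the workshop.

```python
from enum import Enum


class Bucket(Enum):
    CRITERIA = "criteria"
    REPO_CONTEXT = "repo context"
    ENVIRONMENT = "environment"
    VERIFIER = "verifier"
    TOOL_MISMATCH = "tool mismatch"
    REVIEW_BOUNDARY = "review boundary"


# Assumed policy: failures in these buckets cannot be repaired by the agent.
HUMAN_OWNED = {Bucket.REVIEW_BOUNDARY}


def decide(failures: set[Bucket]) -> str:
    """Map the failure buckets seen in a live run to a task-pattern decision."""
    if not failures:
        return "ready"
    if failures & HUMAN_OWNED:
        return "human-owned"
    return "repair"


print(decide(set()))
print(decide({Bucket.ENVIRONMENT, Bucket.VERIFIER}))
print(decide({Bucket.ENVIRONMENT, Bucket.REVIEW_BOUNDARY}))
```

Checking the human-owned set before falling through to "repair" means one boundary failure outweighs any number of fixable ones.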
Week 4
Product feedback packet

Feed field signal back to product and applied teams cleanly.
- adoption blockers, ready patterns, trust failures, missing affordances, and model observations;
- documentation gaps, proposed guide additions, and one prioritized product proposal with evidence.
Proof: a product partner can tell which issue is model, tool, docs, or customer readiness.

Role map

How it matches the posting

- Customers: intake, task selection, and workshop gates turn customer workflow design into a repeatable artifact;
- Demos: two reference demos include context, action, verifier, reviewer decision, and rollback note;
- Workshops: the 90-minute run is built around failure triage and adoption gates, not prompt education;
- Content: the field kit yields guide additions, customer templates, and Cookbook-shaped examples;
- Product: the feedback packet separates model behavior, tool harness, docs gaps, and readiness blockers;
- Safety: every demo carries a data boundary, disallowed action list, verifier, rollback path, and human review rule.

Proof stack

Inspect next

Proof brief · Field kit · Sample run · Receipt reference · Workshop kit · Patterns guide · Ledger inspector · Cleanroom repo

Boundary

Cleanroom line

This page is independent from OpenAI. It uses OpenAI and Codex only to name public role context. It does not use OpenAI logos, product UI, private systems, or affiliation language.

Sources checked: "AI Deployment Engineer - Codex" and "OpenAI interview guide".
