What I would prove in my first month on a Codex deployment
This page answers the hiring-manager version of the question: not whether I can talk about AI coding, but what evidence I would produce if given a narrow field lane for 30 days.
One instrumented deployment loop
By day 30, the team should have one reusable field kit: a customer-neutral intake, a scope card, two reference demos, a 90-minute workshop script, a verifier-backed handoff receipt, and a product feedback packet.
The public field-kit templates are here: AI Coding Deployment Field Kit. A filled example is here: AI Coding Deployment Sample Run. Runnable receipt proof is here: AI Coding Workflow Receipt Reference. The acceptance scorecard is here: AI Coding Field Trial Scorecard.
This is deliberately smaller than owning a portfolio. It is the fastest truthful path from high-potential candidate to measurable Codex deployment asset.
Scorecard gate
The plan now has a runnable acceptance scorecard. It validates role-need coverage, verifier evidence, release gates, official source URLs, and the no-outbound boundary before the field trial can be called ready.
python3 tools/deployment_receipt.py validate-scorecard --input examples/field_trial_acceptance_scorecard.json
python3 tools/deployment_receipt.py scorecard --input examples/field_trial_acceptance_scorecard.json
Current result: ready, 4.4/5 average, no blocked gates, no outbound before 2026-05-26. This is still public-safe readiness proof, not real customer rollout evidence.
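The gate logic described above can be sketched in a few lines. This is a minimal illustration, not the code in tools/deployment_receipt.py: the `Gate` class, the `summarize` function, the field names, and the scores are all hypothetical, since the real scorecard JSON schema is not shown on this page. Only the five check names and the no-outbound date come from the text.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical shape: the real scorecard schema is not shown on this page.
@dataclass
class Gate:
    name: str
    score: float          # reviewer score on a 0-5 scale
    blocked: bool = False


def summarize(gates: list[Gate], today: date, no_outbound_before: date) -> dict:
    """Roll individual gates up into a ready/blocked result."""
    average = round(sum(g.score for g in gates) / len(gates), 1)
    blocked = [g.name for g in gates if g.blocked]
    return {
        "status": "blocked" if blocked else "ready",
        "average": average,
        "blocked_gates": blocked,
        # Outbound contact stays off until the boundary date is reached.
        "outbound_allowed": today >= no_outbound_before,
    }


# Illustrative scores only; these are not the real scorecard values.
example_gates = [
    Gate("role-need coverage", 4),
    Gate("verifier evidence", 4),
    Gate("release gates", 5),
    Gate("official source URLs", 5),
    Gate("no-outbound boundary", 3),
]
result = summarize(example_gates, date(2026, 5, 1), date(2026, 5, 26))
print(result)
```

Keeping the outbound boundary as a separate field, rather than folding it into the status, lets a scorecard read as ready while outbound contact is still held back.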
The run
Week 1: Field map and intake
Learn the customer motion before building theater.
- task-selection rubric;
- scope card with owner, data boundary, rollback path, acceptance criteria, and verifier;
- reject list for unsafe or unclear demo candidates.
Proof: one reviewer can mark a proposed demo ready, repair, or reject.
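One way to make the ready, repair, or reject decision mechanical is to treat the scope card as a record and check it for gaps. The sketch below is an assumption about how that could look: the `ScopeCard` class, the `review` function, and the `on_reject_list` flag are hypothetical names, and only the five card fields come from the list above.

```python
from dataclasses import dataclass

# The five fields named in the Week 1 scope card.
REQUIRED_FIELDS = (
    "owner",
    "data_boundary",
    "rollback_path",
    "acceptance_criteria",
    "verifier",
)


@dataclass
class ScopeCard:
    owner: str = ""
    data_boundary: str = ""
    rollback_path: str = ""
    acceptance_criteria: str = ""
    verifier: str = ""
    on_reject_list: bool = False  # hypothetical flag for unsafe or unclear candidates


def review(card: ScopeCard) -> tuple[str, list[str]]:
    """Return (decision, missing_fields) for a proposed demo."""
    if card.on_reject_list:
        return "reject", []
    missing = [name for name in REQUIRED_FIELDS if not getattr(card, name).strip()]
    return ("repair", missing) if missing else ("ready", [])


# Example: a card with no rollback path is sent back for repair.
draft = ScopeCard(
    owner="platform team",
    data_boundary="public repo only",
    acceptance_criteria="CI passes on the changed files",
    verifier="pytest",
)
print(review(draft))
```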
Week 2: Reference demos
Produce demos that are falsifiable, not merely impressive.
- repo maintenance demo with changed files, command proof, limitation, and rollback note;
- team adoption demo with business context, user, verifier, failure mode, and receipt.
Proof: two demos can be rerun or critiqued by an engineer without the presenter in the room.
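A demo that can be rerun without the presenter needs its command proof stored as data. The following is a minimal sketch under stated assumptions: `DemoReceipt` and `rerun` are hypothetical names, the example command is a stand-in, and the real receipt format used by the field kit may differ.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class DemoReceipt:
    changed_files: list[str]
    command: list[str]          # the command whose result is the proof
    expected_exit_code: int
    limitation: str
    rollback_note: str


def rerun(receipt: DemoReceipt) -> bool:
    """Re-execute the recorded command and compare against the recorded exit code."""
    completed = subprocess.run(receipt.command, capture_output=True, text=True)
    return completed.returncode == receipt.expected_exit_code


# Stand-in command: a real receipt would record the project's own test or lint command.
receipt = DemoReceipt(
    changed_files=["README.md"],
    command=[sys.executable, "-c", "print('checks passed')"],
    expected_exit_code=0,
    limitation="Covers the documented path only.",
    rollback_note="Revert the single commit that touched README.md.",
)
print(rerun(receipt))
```

Recording the command as an argument list rather than a shell string avoids quoting surprises when a reviewer reruns it on another machine.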
Week 3: Workshop and failure taxonomy
Turn demos into team behavior.
- scope card, context load, live run, failure map, rollout gate, handoff receipt;
- failure buckets for criteria, repo context, environment, verifier, tool mismatch, and review boundary.
Proof: a decision table separates ready, repair, and human-owned task patterns.
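The decision table can be expressed as a small lookup over the six failure buckets. The bucket names come from the list above; the mapping from bucket to decision is an illustrative assumption, not the workshop's actual table, and `Bucket` and `decide` are hypothetical names.

```python
from enum import Enum


class Bucket(Enum):
    CRITERIA = "criteria"
    REPO_CONTEXT = "repo context"
    ENVIRONMENT = "environment"
    VERIFIER = "verifier"
    TOOL_MISMATCH = "tool mismatch"
    REVIEW_BOUNDARY = "review boundary"


# Assumption: failures at the review boundary keep the task with a human,
# while every other bucket is treated as repairable.
HUMAN_OWNED_BUCKETS = {Bucket.REVIEW_BOUNDARY}


def decide(observed_failures: set[Bucket]) -> str:
    """Map the failures seen in a live run to ready, repair, or human-owned."""
    if not observed_failures:
        return "ready"
    if observed_failures & HUMAN_OWNED_BUCKETS:
        return "human-owned"
    return "repair"


print(decide({Bucket.ENVIRONMENT, Bucket.VERIFIER}))
```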
Week 4: Product feedback packet
Feed field signal back to product and applied teams cleanly.
- adoption blockers, ready patterns, trust failures, missing affordances, and model observations;
- documentation gaps, proposed guide additions, and one prioritized product proposal with evidence.
Proof: a product partner can tell which issue is model, tool, docs, or customer readiness.
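To let a product partner see at a glance where each issue originates, the packet can group issues by origin and surface the one with the most supporting evidence. This is a sketch only: `FieldIssue`, `build_packet`, and the rule of ranking by evidence count are assumptions made for illustration, and the sample issues are invented placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# The four origins a product partner should be able to tell apart.
ORIGINS = ("model", "tool", "docs", "customer_readiness")


@dataclass
class FieldIssue:
    title: str
    origin: str
    evidence: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.origin not in ORIGINS:
            raise ValueError(f"unknown origin: {self.origin!r}")


def build_packet(issues: list[FieldIssue]) -> dict:
    """Group issue titles by origin and pick one proposal backed by evidence."""
    by_origin: dict[str, list[str]] = defaultdict(list)
    for issue in issues:
        by_origin[issue.origin].append(issue.title)
    # Assumption: the issue with the most evidence items becomes the proposal.
    backed = [issue for issue in issues if issue.evidence]
    top = max(backed, key=lambda issue: len(issue.evidence), default=None)
    return {
        "by_origin": dict(by_origin),
        "prioritized_proposal": top.title if top else None,
    }


# Placeholder issues for illustration, not real field findings.
packet = build_packet([
    FieldIssue("Setup guide skips proxy config", "docs", ["workshop note"]),
    FieldIssue("Verifier output is hard to scan", "tool", ["run log", "reviewer comment"]),
    FieldIssue("No named owner for rollback", "customer_readiness"),
])
print(packet)
```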
How it matches the posting
- Customers: Intake, task selection, and workshop gates turn customer workflow design into a repeatable artifact.
- Demos: Two reference demos include context, action, verifier, reviewer decision, and rollback note.
- Workshops: The 90-minute run is built around failure triage and adoption gates, not prompt education.
- Content: The field kit yields guide additions, customer templates, and Cookbook-shaped examples.
- Product: The feedback packet separates model behavior, tool harness, docs gaps, and readiness blockers.
- Safety: Every demo carries a data boundary, disallowed action list, verifier, rollback path, and human review rule.
Cleanroom line
This page is independent from OpenAI. It uses OpenAI and Codex only to name public role context. It does not use OpenAI logos, product UI, private systems, or affiliation language.
Sources checked: AI Deployment Engineer - Codex and OpenAI interview guide.