Nicholas Dunzelman

30-day field trial plan · checked May 19, 2026

What I would prove in the first month on Codex deployment.

This page answers the hiring-manager version of the question: not whether I can talk about AI coding, but what evidence I would produce if given a narrow field lane for 30 days.

[Image: AgentProof demo report showing browser evidence from a tested web app]

By day 30, the team should have one reusable field kit: customer-neutral intake, scope card, two reference demos, 90-minute workshop script, verifier-backed handoff receipt, and a product feedback packet.

The public field-kit templates are here: AI Coding Deployment Field Kit. A filled example is here: AI Coding Deployment Sample Run. Runnable receipt proof is here: AI Coding Workflow Receipt Reference. The acceptance scorecard is here: AI Coding Field Trial Scorecard.

This is deliberately smaller than owning a portfolio. It is the fastest truthful path from high-potential candidate to measurable Codex deployment asset.

The plan now has a runnable acceptance scorecard. It validates role-need coverage, verifier evidence, release gates, official source URLs, and the no-outbound boundary before the field trial can be called ready.

python3 tools/deployment_receipt.py validate-scorecard --input examples/field_trial_acceptance_scorecard.json
python3 tools/deployment_receipt.py scorecard --input examples/field_trial_acceptance_scorecard.json

Current result: ready, 4.4/5 average, no blocked gates, no outbound before 2026-05-26. This is still public-safe readiness proof, not real customer rollout evidence.
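
To make the shape of that result concrete, here is a minimal sketch of how a readiness summary like this could be computed. It is not the logic of `tools/deployment_receipt.py`: the `Criterion` fields, the `summarize` function, the 4.0 minimum average, and the example scores are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Criterion:
    name: str                   # hypothetical criterion label
    score: float                # reviewer rating on a 0-5 scale
    gate_blocked: bool = False  # True if this criterion blocks release


def summarize(criteria, today, no_outbound_before, min_average=4.0):
    """Roll criteria up into a readiness summary (assumed rule, not the real tool's)."""
    average = round(sum(c.score for c in criteria) / len(criteria), 1)
    blocked = [c.name for c in criteria if c.gate_blocked]
    ready = not blocked and average >= min_average
    return {
        "status": "ready" if ready else "not ready",
        "average": average,
        "blocked_gates": blocked,
        "outbound_allowed": today >= no_outbound_before,
    }


# Illustrative scores only; these are not the contents of the real scorecard file.
example = [
    Criterion("role-need coverage", 5),
    Criterion("verifier evidence", 4),
    Criterion("release gates", 4),
    Criterion("official source URLs", 5),
    Criterion("no-outbound boundary", 4),
]
result = summarize(
    example,
    today=date(2026, 5, 19),
    no_outbound_before=date(2026, 5, 26),
)
print(result)
```

Keeping the outbound date separate from the readiness status means a trial can be marked ready while outbound is still held back, which is the situation the current result describes.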

  1. Week 1

    Field map and intake

    Learn the customer motion before building theater.

    • task-selection rubric;
    • scope card with owner, data boundary, rollback path, acceptance criteria, and verifier;
    • reject list for unsafe or unclear demo candidates.

    Proof: one reviewer can mark a proposed demo ready, repair, or reject.

  2. Week 2

    Reference demos

    Produce demos that are falsifiable, not merely impressive.

    • repo maintenance demo with changed files, command proof, limitation, and rollback note;
    • team adoption demo with business context, user, verifier, failure mode, and receipt.

    Proof: two demos can be rerun or critiqued by an engineer without the presenter in the room.

  3. Week 3

    Workshop and failure taxonomy

    Turn demos into team behavior.

    • workshop script covering scope card, context load, live run, failure map, rollout gate, and handoff receipt;
    • failure buckets for criteria, repo context, environment, verifier, tool mismatch, and review boundary.

    Proof: a decision table separates ready, repair, and human-owned task patterns.

  4. Week 4

    Product feedback packet

    Feed field signal back to product and applied teams cleanly.

    • adoption blockers, ready patterns, trust failures, missing affordances, and model observations;
    • documentation gaps, proposed guide additions, and one prioritized product proposal with evidence.

    Proof: a product partner can tell which issue is model, tool, docs, or customer readiness.
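
The Week 1 scope card and its ready, repair, or reject decision can be sketched as a small data structure. This is one possible reading, not the field kit's actual template: the `ScopeCard` class, the `on_reject_list` flag, and the rule that any missing field means "repair" are assumptions.

```python
from dataclasses import dataclass

# Fields named in the Week 1 scope card bullet.
REQUIRED_FIELDS = (
    "owner",
    "data_boundary",
    "rollback_path",
    "acceptance_criteria",
    "verifier",
)


@dataclass
class ScopeCard:
    owner: str = ""
    data_boundary: str = ""
    rollback_path: str = ""
    acceptance_criteria: str = ""
    verifier: str = ""
    on_reject_list: bool = False  # hypothetical flag for unsafe or unclear candidates


def triage(card):
    """Return (decision, missing_fields) under the assumed rule described above."""
    if card.on_reject_list:
        return "reject", []
    missing = [name for name in REQUIRED_FIELDS if not getattr(card, name).strip()]
    return ("repair", missing) if missing else ("ready", [])


# Hypothetical card: everything filled in except the verifier.
card = ScopeCard(
    owner="platform team lead",
    data_boundary="public sample repo only",
    rollback_path="revert the demo branch",
    acceptance_criteria="test suite passes after the change",
)
print(triage(card))
```

Returning the missing fields alongside the decision gives the reviewer something specific to send back when a card needs repair.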

  1. Customers: Intake, task selection, and workshop gates turn customer workflow design into a repeatable artifact.
  2. Demos: Two reference demos include context, action, verifier, reviewer decision, and rollback note.
  3. Workshops: The 90-minute run is built around failure triage and adoption gates, not prompt education.
  4. Content: The field kit yields guide additions, customer templates, and Cookbook-shaped examples.
  5. Product: The feedback packet separates model behavior, tool harness, docs gaps, and readiness blockers.
  6. Safety: Every demo carries a data boundary, disallowed action list, verifier, rollback path, and human review rule.
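
The separation the feedback packet promises can be sketched as grouping tagged observations into four buckets and refusing anything untagged. The category keys, the `build_packet` function, and the sample notes below are hypothetical, chosen only to illustrate the idea.

```python
# Assumed category keys for the four sources named in the feedback packet.
CATEGORIES = ("model", "tool", "docs", "customer_readiness")


def build_packet(observations):
    """Group (category, note) pairs by category; reject unknown categories."""
    packet = {category: [] for category in CATEGORIES}
    for category, note in observations:
        if category not in packet:
            raise ValueError(f"unknown category: {category!r}")
        packet[category].append(note)
    return packet


# Made-up observations for illustration; not real field data.
packet = build_packet([
    ("docs", "setup guide omits the verifier step"),
    ("tool", "harness times out on large repos"),
    ("docs", "rollback note template is missing"),
])
for category, notes in packet.items():
    print(category, len(notes))
```

Failing loudly on an unknown category forces each observation to be classified before it reaches a product partner, rather than landing in a catch-all bucket.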

This page is independent from OpenAI. It uses OpenAI and Codex only to name public role context. It does not use OpenAI logos, product UI, private systems, or affiliation language.

Sources checked: AI Deployment Engineer - Codex and OpenAI interview guide.