# Community Leaderboard

Generated: 2026-03-07T16:23:17+00:00

## Headline Official Baselines

Policy: headline official baselines come from the `dev`, `dev50`, and `test` suites. Official runs on the `dev10` and `quick` suites are sanity runs that check harness health.
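The policy above can be sketched as a small classifier. This is a minimal illustration, not the harness's actual code: the function name `classify_role`, the constant names, and the rule that any label other than `official` maps to `community` are assumptions inferred from the Label and Role columns in this report.

```python
# Hypothetical sketch of the suite policy; names are illustrative only.
HEADLINE_SUITES = {"dev", "dev50", "test"}
SANITY_SUITES = {"dev10", "quick"}


def classify_role(label: str, suite: str) -> str:
    """Map a submission's label and suite to a leaderboard role."""
    # Assumption: only submissions labeled exactly "official" can be
    # headline or sanity runs; everything else is a community run.
    if label != "official":
        return "community"
    if suite in HEADLINE_SUITES:
        return "headline"
    if suite in SANITY_SUITES:
        return "sanity"
    raise ValueError(f"unknown suite for official run: {suite!r}")


if __name__ == "__main__":
    print(classify_role("official", "dev50"))
    print(classify_role("official", "dev10"))
    print(classify_role("community (not official)", "quick"))
```

Raising on an unknown suite, rather than defaulting to a role, is a design choice made here so that a new suite name cannot silently land in the headline table.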

| Submission | Pack | Suite | Top Worker | Baseline | Mentored | Lift |
| --- | --- | --- | --- | --- | --- | --- |
| official_dev50_signal_qwen_llama | task_pack_v2 | dev50 | qwen2.5-coder:7b | 0.00% | 0.00% | 0.00% |
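This report does not define how Lift is computed. One plausible reading, assumed here, is the absolute difference in percentage points between the Mentored and Baseline pass rates. The helper names `parse_pct` and `lift_points` are hypothetical and exist only for this sketch.

```python
def parse_pct(text: str) -> float:
    """Convert a cell such as '16.67%' to the float 16.67."""
    return float(text.strip().rstrip("%"))


def lift_points(baseline: str, mentored: str) -> float:
    """Assumed definition: Mentored minus Baseline, in percentage points."""
    return round(parse_pct(mentored) - parse_pct(baseline), 2)


if __name__ == "__main__":
    # Headline row in this report: Baseline 0.00%, Mentored 0.00%.
    print(f"{lift_points('0.00%', '0.00%'):.2f}%")
```

Under this assumption the headline row yields a lift of 0.00%, which matches the value shown in the table. If Lift is instead a relative change, the formula would need a division by the baseline and a guard for a zero baseline.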

## Official Sanity Runs

| Submission | Pack | Suite | Top Worker | Baseline | Mentored | Model Errors | Timeouts |
| --- | --- | --- | --- | --- | --- | --- | --- |
| official_dev10_signal_qwen25coder7b_phi3mini_2026-03-06 | task_pack_v2 | dev10 | qwen2.5-coder:7b | 20.00% | 20.00% | 0 | 0 |
| official_dev10_signal_qwen25coder7b_gemma29b_2026-03-06 | task_pack_v2 | dev10 | qwen2.5-coder:7b | 20.00% | 20.00% | 0 | 0 |
| official_dev10_signal_qwen25coder7b_qwen25coder7b_2026-03-06 | task_pack_v2 | dev10 | qwen2.5-coder:7b | 20.00% | 20.00% | 0 | 0 |

## All Verified Submissions

| Submission | Label | Role | Pack | Suite | Top Worker | Baseline | Mentored | Model Errors | Timeouts | Metrics Source | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| official_dev10_signal_qwen25coder7b_phi3mini_2026-03-06 | official | sanity | task_pack_v2 | dev10 | qwen2.5-coder:7b | 20.00% | 20.00% | 0 | 0 | summary,summary | 9321a0092e2a5ee575cba2105e57632964e7dd10 |
| official_dev10_signal_qwen25coder7b_gemma29b_2026-03-06 | official | sanity | task_pack_v2 | dev10 | qwen2.5-coder:7b | 20.00% | 20.00% | 0 | 0 | summary,summary | 9321a0092e2a5ee575cba2105e57632964e7dd10 |
| official_dev10_signal_qwen25coder7b_qwen25coder7b_2026-03-06 | official | sanity | task_pack_v2 | dev10 | qwen2.5-coder:7b | 20.00% | 20.00% | 0 | 0 | summary,summary | 9321a0092e2a5ee575cba2105e57632964e7dd10 |
| community_quick_signal_qwen25coder7b_gemma29b_2026-03-04 | community (not official) | community | task_pack_v2 | quick | qwen2.5-coder:7b | 16.67% | 13.33% | 0 | 0 | summary,summary | 992fbce8336349e4268e273208d3d912dfe9fd95 |
| official_dev50_signal_qwen_llama | official | headline | task_pack_v2 | dev50 | qwen2.5-coder:7b | 0.00% | 0.00% | 0 | 0 | summary,summary | 992fbce8336349e4268e273208d3d912dfe9fd95 |
| community_quick_signal_qwen25coder7b_llama318b_2026-03-03 | community (not official) | community | task_pack_v2 | quick | qwen2.5-coder:7b | 16.67% | 20.00% | 0 | 0 | summary,summary | 9c827e2090de930905752fb58be1cc61de6efec6 |
| community_quick_signal_qwen25coder7b_deepseekcoder67b_2026-03-03 | community (not official) | community | task_pack_v2 | quick | qwen2.5-coder:7b | 16.67% | 13.33% | 9 | 9 | summary,summary | 9c827e2090de930905752fb58be1cc61de6efec6 |
| community_quick_signal_qwen25coder7b_qwen25coder7b_2026-03-03 | community (not official) | community | task_pack_v2 | quick | qwen2.5-coder:7b | 16.67% | 16.67% | 0 | 0 | summary,summary | 9c827e2090de930905752fb58be1cc61de6efec6 |
| community_quick_signal_qwen25coder7b_mistral7b_2026-03-03 | community (not official) | community | task_pack_v2 | quick | qwen2.5-coder:7b | 16.67% | 13.33% | 6 | 6 | summary,summary | 9c827e2090de930905752fb58be1cc61de6efec6 |
