Model Usage and Approval Rates in Aum Golly 3
Production Statistics
The following table summarizes the generation and approval metrics for the production of Aum Golly 3.
| Metric | Count |
|---|---|
| Generation Jobs | 253 |
| Total Generation Steps | 420 |
| Total Variant Poems Generated | 1,260 |
| Final Approved Poems | 41 |
| Overall Approval Rate | 3.25% |
Note: Each generation step produces 3 variant poems (A, B, C). The critic selects 1 of the 3 for further refinement.
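The totals in the table follow from the three-variants-per-step rule in the note. As a minimal sketch of that arithmetic (the constant and function names are illustrative assumptions, not part of the project's tooling):

```python
# Hypothetical helper reproducing the table's arithmetic; names are illustrative.
VARIANTS_PER_STEP = 3  # each generation step yields variants A, B, and C


def overall_approval_rate(steps: int, approved: int) -> float:
    """Return approved poems as a percentage of all generated variants."""
    total_variants = steps * VARIANTS_PER_STEP
    return 100 * approved / total_variants


total_variants = 420 * VARIANTS_PER_STEP
rate = overall_approval_rate(steps=420, approved=41)
print(total_variants, f"{rate:.2f}%")  # prints: 1260 3.25%
```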
Model Performance
Models ranked by approval rate. Claude 3.5 Haiku produced fewer than 10 variants, so it is listed last and excluded from the ranking:
| Model | Variants Generated | Poems Approved | Approval Rate | Usage % |
|---|---|---|---|---|
| Gemini 2.5 Flash | 45 | 4 | 8.9% | 3.7% |
| Claude Sonnet 4.5 | 306 | 19 | 6.2% | 25.4% |
| Gemini 3 Pro | 90 | 4 | 4.4% | 7.5% |
| Claude Opus 4 | 174 | 5 | 2.9% | 14.4% |
| Claude Opus 4.5 | 189 | 4 | 2.1% | 15.7% |
| GPT-5 | 396 | 4 | 1.0% | 32.8% |
| Claude 3.5 Haiku | 6 | 1 | 16.7% | 0.5% |
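The approval rate and usage columns can be recomputed from the two count columns. The sketch below is one way to do so, assuming the usage percentage is each model's share of the variants listed in this table; the per-model counts sum to 1,206, which is consistent with the published percentages, and the remaining 54 of the 1,260 total variants are not broken out here. The names `MIN_VARIANTS`, `counts`, `approval_rate`, and `usage_share` are illustrative.

```python
# Illustrative recomputation of the table; names are assumptions, data is from the table.
MIN_VARIANTS = 10  # ranking threshold stated above the table

# model -> (variants generated, poems approved)
counts = {
    "Gemini 2.5 Flash": (45, 4),
    "Claude Sonnet 4.5": (306, 19),
    "Gemini 3 Pro": (90, 4),
    "Claude Opus 4": (174, 5),
    "Claude Opus 4.5": (189, 4),
    "GPT-5": (396, 4),
    "Claude 3.5 Haiku": (6, 1),
}

# Assumed denominator for usage: variants attributed to the models in this table.
total = sum(variants for variants, _ in counts.values())


def approval_rate(variants: int, approved: int) -> float:
    """Approved poems as a percentage of a model's variants."""
    return 100 * approved / variants


def usage_share(variants: int) -> float:
    """A model's variants as a percentage of all listed variants."""
    return 100 * variants / total


# Rank only models that meet the minimum sample size, best approval rate first.
ranked = sorted(
    (name for name, (variants, _) in counts.items() if variants >= MIN_VARIANTS),
    key=lambda name: approval_rate(*counts[name]),
    reverse=True,
)

for name in ranked:
    variants, approved = counts[name]
    print(f"{name}: {approval_rate(variants, approved):.1f}% approval, "
          f"{usage_share(variants):.1f}% usage")
```

Applying the threshold before sorting keeps a small-sample result such as Claude 3.5 Haiku's 1 of 6 from topping the ranking.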
Insights
- Claude Sonnet 4.5 had the highest absolute number of approved poems (19 of 41) and the second-highest approval rate among models with at least 10 variants (6.2%).
- Gemini 2.5 Flash achieved the highest approval rate among models with at least 10 variants, at 8.9% (4 of 45).
- GPT-5 was the most-used model (32.8% of variants) but had the lowest approval rate at 1.0%.
- The multi-model approach ensured diversity in style and voice across the final collection.