Model Usage and Approval Rates in Aum Golly 3

Production Statistics

The table below summarizes the generation and approval metrics for the creation of Aum Golly 3.

| Metric                        | Count |
|-------------------------------|-------|
| Generation Jobs               | 253   |
| Total Generation Steps        | 420   |
| Total Variant Poems Generated | 1,260 |
| Final Approved Poems          | 41    |
| Overall Approval Rate         | 3.25% |

Note: Each generation step produces 3 variant poems (A, B, C). The critic selects 1 of the 3 for further refinement.
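
As a quick check of the arithmetic behind these totals, here is a minimal Python sketch. The variable names are illustrative and not taken from the project's tooling; the figures are copied from the table above, and the overall approval rate is assumed to be approved poems divided by total variants.

```python
# Figures copied from the Production Statistics table.
generation_steps = 420
variants_per_step = 3  # variants A, B, and C per step
approved_poems = 41

# Every step yields three variants, so the total is a simple product.
total_variants = generation_steps * variants_per_step

# Assumption: approval rate = approved poems / total variants generated.
overall_approval_rate = approved_poems / total_variants * 100

print(total_variants)                   # 1260
print(f"{overall_approval_rate:.2f}%")  # 3.25%
```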

Model Performance

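Models ranked by approval rate among those with at least 10 variants. Claude 3.5 Haiku, with only 6 variants, falls below that threshold and is listed last: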

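| Model             | Variants Generated | Poems Approved | Approval Rate | Usage % |
|-------------------|--------------------|----------------|---------------|---------|
| Gemini 2.5 Flash  | 45                 | 4              | 8.9%          | 3.7%    |
| Claude Sonnet 4.5 | 306                | 19             | 6.2%          | 25.4%   |
| Gemini 3 Pro      | 90                 | 4              | 4.4%          | 7.5%    |
| Claude Opus 4     | 174                | 5              | 2.9%          | 14.4%   |
| Claude Opus 4.5   | 189                | 4              | 2.1%          | 15.7%   |
| GPT-5             | 396                | 4              | 1.0%          | 32.8%   |
| Claude 3.5 Haiku  | 6                  | 1              | 16.7%         | 0.5%    |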
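
The per-model rates and the ranking can be reproduced from the raw counts. The sketch below is illustrative only: the `rows` list, `MIN_VARIANTS`, and `approval_rate` are hypothetical names, and the usage share is assumed to be each model's variants divided by the sum of the listed variant counts.

```python
# (model, variants generated, poems approved), copied from the table above.
rows = [
    ("Gemini 2.5 Flash", 45, 4),
    ("Claude Sonnet 4.5", 306, 19),
    ("Gemini 3 Pro", 90, 4),
    ("Claude Opus 4", 174, 5),
    ("Claude Opus 4.5", 189, 4),
    ("GPT-5", 396, 4),
    ("Claude 3.5 Haiku", 6, 1),
]

MIN_VARIANTS = 10  # ranking threshold stated in the section


def approval_rate(variants: int, approved: int) -> float:
    """Approved poems as a percentage of variants generated."""
    return approved / variants * 100


# Assumption: usage share is relative to the sum of the listed variants.
listed_variants = sum(variants for _, variants, _ in rows)
listed_approved = sum(approved for _, _, approved in rows)

# Rank only the models that meet the minimum-variant threshold.
ranked = sorted(
    (row for row in rows if row[1] >= MIN_VARIANTS),
    key=lambda row: approval_rate(row[1], row[2]),
    reverse=True,
)

for model, variants, approved in ranked:
    rate = approval_rate(variants, approved)
    usage = variants / listed_variants * 100
    print(f"{model:<18} {rate:5.1f}% {usage:5.1f}%")
```

Under these assumptions the ranking matches the table order. The listed variant counts sum to 1,206, and the Usage % column is consistent with shares of that sum rather than of the 1,260 total reported under Production Statistics.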

Insights

  • Claude Sonnet 4.5 had the highest absolute number of approved poems (19) and maintained a strong approval rate of 6.2%.
  • Gemini 2.5 Flash achieved the highest approval rate among models with at least 10 variants, at 8.9% (4 of 45).
  • GPT-5 was the most-used model (32.8% of variants) but had the lowest approval rate at 1.0%.
  • The multi-model approach ensured diversity in style and voice across the final collection.