We are seeking $500K–$1M to build k12eval-bench v1: a 5,000-item, 3-rater gold-standard evaluation set for AI grading in K-12, released under open licenses and made permanently public.
The funded artifact remains public infrastructure. Cograder is the steward, not the owner. Every AI grading system in K-12, including our own, will be measured against the same yardstick.
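For concreteness, here is a minimal sketch of what "the same yardstick" could mean in practice: score each system by its agreement with the 3-rater gold labels. Quadratic weighted kappa is the standard agreement metric in the automated essay scoring literature (it was the official ASAP competition metric), but the metric choice, the gold-label rule, and the data below are illustrative assumptions, not the k12eval-bench spec.

```python
# Illustrative sketch only: scoring an AI grader against 3-rater gold labels.
# QWK is assumed as the metric here because it is standard in AES work;
# k12eval-bench's actual metric suite is defined in the methodology paper.
from statistics import median
from sklearn.metrics import cohen_kappa_score

# Hypothetical data: three human raters per item (the "3-rater" design) and
# one AI system's predicted scores, all on the same 0-4 rubric scale.
rater_scores = [
    [3, 3, 2],  # item 1
    [1, 1, 1],  # item 2
    [4, 3, 4],  # item 3
    [2, 2, 3],  # item 4
]
system_scores = [3, 1, 3, 2]

# One plausible gold-label rule: the median of the three raters' scores.
gold = [int(median(item)) for item in rater_scores]

# Quadratic weighting penalizes large disagreements more than near-misses,
# which suits ordinal rubric scores. 1.0 = perfect agreement, 0 = chance.
qwk = cohen_kappa_score(gold, system_scores, weights="quadratic")
print(f"QWK vs. gold labels: {qwk:.3f}")
```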
Lead funders are recognized in the dataset citation, the published methodology paper, and the bench's permanent attribution.
AI grading is being deployed in K-12 classrooms faster than the field can study it. Districts are buying tools without a shared way to evaluate them. Vendors self-report. Researchers don't have a current corpus to study. State procurement officers don't have a yardstick.
Foundational research exists (ASAP, PERSUADE, ETS, work from the automated essay scoring community), but it predates the LLM era, covers narrow tasks, and was built before AI grading became something teachers actually used at scale.
No commercial actor has both the data and the incentive to build the missing infrastructure alone. Vendors won't release a benchmark that scores them objectively. Districts can't fund this from procurement budgets. Foundation model labs evaluate themselves on their own terms.
This is the kind of gap foundations exist to fill: a public-good infrastructure project that has a clear public benefit, a credible steward willing to release the work openly, and a multi-year scope too large for any single actor to build in isolation.
All funders contribute to the same public-benefit project. Tier defines attribution and engagement, not ownership.
Five workstreams, sequenced over a multi-year program. Detailed budget allocation, milestone schedule, and quarterly reporting cadence are shared with funders during due diligence. The program is designed to support a single Lead funder underwriting the full scope, or a coordinated group of Major and Supporting funders each covering individual workstreams.
All datasets are released under CC BY 4.0. All code, eval scripts, and the methodology framework are released under the MIT license. The bench, the leaderboard, and the annual State of AI Grading report are free to use, cite, and build on. Funders are recognized contributors to permanent public infrastructure.
Cograder, the convening organization, commits in writing that no funded artifact will be commercialized, paywalled, or restricted. The roadmap for v1 includes expanding governance to an independent advisory board with seats reserved for academic researchers, district leaders, and lead funders.
Every inquiry is reviewed personally within 5 business days. Expect a response with the full prospectus, methodology spec, line-item budget, and next steps for due diligence.