Open Datasets

Hacking Education: A Contest for Developers and Data Crunchers

(If this is the first you're hearing about the contest, you should read this first. :)


Overview

To access data about live projects on our site, including their front-end permalinks, photos, and other assets, our JSON API is your best bet. 

Its data is very fresh and tightly integrated with our front-end website. For example, you can let end-users browse classroom projects and click through to our production front-end to donate.

To access every classroom project and donation since the org's inception in 2000 and our last 18 months of site searches, you'll want our open data sets below. 

They include lots of data that's not available via the API, such as every project donation and gift card purchase, as well as all the materials/resources requested by each teacher for each project. As this data has been sanitized for the privacy of our end-users, it's a bit harder to integrate with our production front-end than the JSON API.

You're also encouraged to use any additional third-party data you can get your hands on! We have assembled an informal list of potentially complementary data sets.

Do join our discussion group for Developers and Data Crunchers, even if you're not yet certain you're going to submit an app or analysis into the contest. We'll be using it to answer any questions, communicate with participants, etc.

If you have any questions, you can review our API FAQ, search or ask the discussion group, or get in touch with us directly. Our preference is to answer questions in the discussion group so that other folks with similar questions can benefit.

Thanks in advance for participating. Our contest judges and our entire org look forward to seeing the apps and analyses you create!

-- The DonorsChoose.org Team

The Data

Projects

All classroom projects that have been posted to the site, including school details such as the school's NCES ID (government-issued), lat/long, and city/state/zip.
Data file: ~40MB zip, ~135MB CSV, ~300K records

Donations

All donations, including donor city, state, and partial-zip (when available).
Data file: ~85MB zip, ~260MB CSV, ~1.1M records

Gift cards

All website-purchased gift cards, including donor and recipient city, state, and partial-zip (when available).
Data file: ~3MB zip, ~8MB CSV, ~43K records

Project resources

All materials/resources requested for the classroom projects, including vendor name.
Data file: ~95MB zip, ~275MB CSV, ~1.6M records

Project written requests

Full text of the teacher-written requests accompanying all classroom projects.
Data file: ~200MB zip, ~1GB CSV, ~300K records
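
Because some of these files are large once unpacked (the written requests CSV is about 1GB), it can help to read rows straight out of the zip archive instead of extracting and loading the whole file. The following is a minimal sketch, not an official loader. It assumes each archive contains a single CSV with a header row and UTF-8 text, neither of which is stated above; the function names are made up for illustration.

```python
import csv
import io
import zipfile


def iter_zipped_csv(zip_path, member=None, encoding="utf-8"):
    """Yield each CSV row as a dict, reading straight from the zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds a single CSV, so take the first
        # member unless a name is given explicitly.
        name = member or archive.namelist()[0]
        with archive.open(name) as raw:
            # newline="" lets the csv module handle quoted fields that
            # contain line breaks, which long essay text is likely to have.
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            yield from csv.DictReader(text)


def count_rows(zip_path):
    """Count data rows without holding the whole file in memory."""
    return sum(1 for _ in iter_zipped_csv(zip_path))
```

Calling `count_rows` on a downloaded archive is a quick sanity check against the approximate record counts listed here. If the encoding assumption is wrong, pass a different `encoding` value.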

Search logs

Search queries spanning 12 months, including both keyword searches and any search filters applied.
Data files:
  • Jan - June 2010: ~6MB zip, ~42MB CSV, 531K records, ~2.3M searches
  • July - Sep 2010: ~5MB zip, ~36MB CSV, 448K records, ~1.7M searches
  • Oct - Dec 2010: ~8MB zip, ~56MB CSV, 665K records, ~2.8M searches
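
The record and search counts differ for each file, so it is worth checking how they relate before analysing the logs. The short calculation below uses only the approximate figures listed above. Reading the ratio as "one record stands for several searches" is an inference from those figures, not something the file descriptions state.

```python
# (records, approximate searches) per file, as listed above.
SEARCH_LOG_FILES = {
    "Jan - June 2010": (531_000, 2_300_000),
    "July - Sep 2010": (448_000, 1_700_000),
    "Oct - Dec 2010": (665_000, 2_800_000),
}

total_records = sum(records for records, _ in SEARCH_LOG_FILES.values())
total_searches = sum(searches for _, searches in SEARCH_LOG_FILES.values())
searches_per_record = total_searches / total_records

print(f"{total_records:,} records, ~{total_searches:,} searches")
print(f"~{searches_per_record:.1f} searches per record")
```

On these rounded figures, the three files hold about 1.6M records covering roughly 6.8M searches, or about four searches per record.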

Schema diagram



We have also published some scripts to help you load the data into your db of choice and into partially normalized tables more suitable for exploration with SQL.
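
If you would rather experiment before using those scripts, a generic CSV-to-SQLite loader is enough to start querying with SQL. The sketch below is not one of the published scripts. It creates a table from whatever header row the CSV has and stores every value as text; the table names are placeholders.

```python
import csv
import sqlite3


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header and insert every row as text."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Quote identifiers so unusual column names do not break the SQL.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    # Rows are streamed from the reader; a row whose field count differs
    # from the header will raise an error rather than load silently.
    conn.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), reader
    )
    conn.commit()
    return len(header)
```

For example, `load_csv(sqlite3.connect("contest.db"), "projects", open("projects.csv", newline=""))` would load a projects file, assuming that is what you named the unpacked CSV. Joining donations to projects then depends on the actual column names in the files, which are not listed here, so check the header rows first.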

License


If you'd like to use this data for commercial purposes, get in touch with us and tell us a bit about your plans. Our strong preference is to greenlight your commercial application with no licensing fees, and we have never charged for access to our API or data. We just need to make sure that the application won't run contrary to our org's mission, abuse the rich content that our teachers have created, etc.

Dates of significant web system changes

Some of the web system changes we've made over the years changed the site's user experience, the amount of data that was collected, or the rigor with which our data integrity was enforced. We're mentioning some of these milestones so that if you encounter these changes or their impact at a data level, you'll have some context for what happened.
  • Jan 11, 2007: Launched complete system from-scratch rebuild.
  • Aug 8, 2007: Introduced points system to meter the rate of very expensive or very staff-resource-intensive teacher requests.
  • Sep 5, 2007: Opened to teachers in all 50 states (previously only teachers in a handful of cities and states could use the site).
  • Sep 8, 2008: Introduced check-out carts to enable donors to support more than one project in a single transaction and also include gift card purchases.
  • Feb 18, 2010: Free-form classroom project essays were discontinued and replaced by a series of questions.
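
When comparing records across years, it can be useful to tag each one with the most recent system change that preceded it. The sketch below uses the milestone dates listed above. The short labels and the function name are informal inventions for illustration, and which date column you apply it to depends on the file you are working with.

```python
from bisect import bisect_right
from datetime import date

# Milestone dates copied from the list above; the labels are informal.
MILESTONES = [
    (date(2007, 1, 11), "system rebuild"),
    (date(2007, 8, 8), "points system"),
    (date(2007, 9, 5), "open to all 50 states"),
    (date(2008, 9, 8), "check-out carts"),
    (date(2010, 2, 18), "question-based essays"),
]
_DATES = [day for day, _ in MILESTONES]


def latest_milestone(day):
    """Return the label of the most recent milestone on or before `day`."""
    index = bisect_right(_DATES, day)
    # index == 0 means the date predates every listed milestone.
    return MILESTONES[index - 1][1] if index else None
```

A record dated before Jan 11, 2007 returns `None`, which marks it as coming from the pre-rebuild system.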

Complementary datasets and tools


Thank you to our generous funders: The Anthony E. Meyer Family Foundation and Michael C. Lewis. Also thank you to our judges, our sponsors, Clay Johnson, Frankie Cheung, and Ben Millard.