GitHub README Readiness Research
This notebook component supports the Builder Showcase value proposition by scoring a sample of public GitHub READMEs with the same readiness mindset that the Builder Showcase review pipeline applies.
The goal is to create a public GitHub baseline for project-readiness analysis.
This helps answer:
- How many sampled public repositories have a README?
- How complete are those READMEs?
- Which sections are commonly missing?
- How many projects appear showcase-ready?
- How does the public GitHub baseline compare to Builder Showcase submissions over time?
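As a rough illustration of how the first four questions could be answered for a single README, the sketch below checks Markdown headings against a small checklist and derives a completeness score. The checklist, the regex patterns, the `score_readme` name, and the 0.75 showcase-ready threshold are all assumptions made for this example; they are not the actual Builder Showcase rubric.

```python
import re

# Hypothetical checklist: section name -> pattern matched against heading text.
# The real review-pipeline rubric is not described here, so treat this as a placeholder.
EXPECTED_SECTIONS = {
    "description": r"about|overview|description|introduction",
    "installation": r"install|setup|getting started",
    "usage": r"usage|example|quick ?start",
    "license": r"licen[sc]e",
}

# Matches ATX-style Markdown headings such as "## Usage".
HEADING = re.compile(r"^#{1,6}[ \t]+(.*)$", re.MULTILINE)

# Assumed cut-off for calling a README "showcase-ready".
SHOWCASE_READY_THRESHOLD = 0.75


def score_readme(text):
    """Score one README: which checklist sections appear as headings."""
    if not text or not text.strip():
        # Missing or empty README: every section counts as missing.
        return {
            "has_readme": False,
            "missing_sections": sorted(EXPECTED_SECTIONS),
            "score": 0.0,
            "showcase_ready": False,
        }
    headings = [h.strip().lower() for h in HEADING.findall(text)]
    found = {
        name
        for name, pattern in EXPECTED_SECTIONS.items()
        if any(re.search(pattern, h) for h in headings)
    }
    score = len(found) / len(EXPECTED_SECTIONS)
    return {
        "has_readme": True,
        "missing_sections": sorted(set(EXPECTED_SECTIONS) - found),
        "score": score,
        "showcase_ready": score >= SHOWCASE_READY_THRESHOLD,
    }
```

Heading matching is a deliberately simple heuristic: it ignores Setext-style headings and READMEs written in reStructuredText or plain text, so a real scorer would likely need broader detection.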
Primary output:
data/processed/github_readme_readiness_scores.csv
Recommended notebook:
notebooks/01_github_readme_readiness_scoring.ipynb
Recommended future outputs:
- readiness_level_distribution.csv
- missing_sections_summary.csv
- readme_score_by_language.csv
- readme_score_by_activity.csv
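The first two future outputs could plausibly be derived from the primary CSV with the standard library alone. The sketch below assumes the scores file has a `readiness_level` column and a `missing_sections` column holding semicolon-separated section names; both column names, and the helper names `load_rows`, `summarize`, and `write_counts`, are hypothetical and should be adjusted to the real schema.

```python
import csv
from collections import Counter


def load_rows(path):
    """Read the scores CSV into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


def summarize(rows):
    """Count readiness levels and how often each section is missing.

    Assumes hypothetical columns `readiness_level` and `missing_sections`
    (semicolon-separated); neither name is confirmed by the project.
    """
    levels = Counter()
    missing = Counter()
    for row in rows:
        levels[row["readiness_level"]] += 1
        for section in row["missing_sections"].split(";"):
            section = section.strip()
            if section:  # skip empty entries from READMEs with nothing missing
                missing[section] += 1
    return levels, missing


def write_counts(path, header, counts):
    """Write a Counter as a two-column CSV, most frequent value first."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(counts.most_common())
```

In a notebook, `summarize(load_rows("data/processed/github_readme_readiness_scores.csv"))` would return the two counters, which `write_counts` could then save as `readiness_level_distribution.csv` and `missing_sections_summary.csv`. The per-language and per-activity outputs would need additional repository metadata columns that are not specified here.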
This component should remain CSV-first during early experimentation. If the results are useful, the cleaned and versioned dataset can later be imported into Supabase for Builder Showcase business intelligence, product comparison, and long-term readiness tracking.