howtocodeai.com

Build a 200-Page Website From One JSON File (Programmatic SEO With AI)

The pattern behind every big directory and converter site: a JSON data file, a Python build script, and AI writing both. Generate hundreds of static pages in seconds.

Difficulty: Intermediate
Time: 2–4 hours
You'll need: Claude or ChatGPT · Python 3 (free) · Any static web host
You'll build: A complete static site, one page per item in your dataset (cities, tools, recipes, conversions, anything), generated by a build script you can re-run forever.

Unit-converter sites, 'best X in [city]' directories, and statistics references aren't written page by page. They're generated: a data file holds the facts, a template holds the design, and a script stamps out one HTML page per item in the data. AI can now draft all three parts for you, though you still need to check the facts it produces. This is one of the highest-leverage patterns in web publishing, and it's the architecture this very site runs on.

Step 1 — Pick a dataset, not a topic

Programmatic sites live or die on the data. Good datasets are: structured (every item has the same fields), useful at the individual-item level (someone searches for exactly one item), and big enough to matter (50–500 items). Examples: dog breeds, national parks, Excel functions, state tax rules, historical battles. Pick something you can verify.

Step 2 — Have AI design the schema

I'm building a programmatic site about [national parks]. Design a JSON schema for one entry: include a slug for the URL, a name, 6–8 factual fields a visitor would want, and a 'summary' text field. Then generate 5 complete sample entries so I can see real data in the structure.

Review the 5 samples hard. Fixing the schema now costs one prompt; fixing it after you have 200 entries costs an afternoon.
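
Part of that review can be automated. Here is a minimal sketch of a consistency check, assuming your entries are plain JSON objects and that the first entry has the fields you want. The function name check_entries and the sample field names (area_km2 and so on) are made up for illustration, not part of any real schema.

```python
def check_entries(entries):
    """Return a list of problems: missing/extra fields and duplicate slugs."""
    if not entries:
        return ["dataset is empty"]
    problems = []
    expected = set(entries[0])  # assumption: the first entry defines the schema
    seen_slugs = set()
    for i, entry in enumerate(entries):
        keys = set(entry)
        missing = expected - keys
        extra = keys - expected
        if missing:
            problems.append(f"entry {i}: missing {sorted(missing)}")
        if extra:
            problems.append(f"entry {i}: unexpected {sorted(extra)}")
        slug = entry.get("slug")
        if slug in seen_slugs:
            problems.append(f"entry {i}: duplicate slug {slug!r}")
        seen_slugs.add(slug)
    return problems


# Placeholder entries, not real park data.
samples = [
    {"slug": "example-park-a", "name": "Example Park A", "area_km2": 100, "summary": "..."},
    {"slug": "example-park-b", "name": "Example Park B", "summary": "..."},
]
print(check_entries(samples))
```

Run it on every new batch before you paste it into the main file. An empty list means every entry has the same shape and every slug is unique.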

Step 3 — Fill the dataset

Ask the AI for entries in batches of 10–20 and spot-check facts as you go — AI will confidently invent a park's founding year. Paste batches into one file: data.json. For datasets with hard facts, the better workflow is to find an authoritative source (Wikipedia tables, government CSVs) and ask AI to convert it into your schema rather than recall it from memory.
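
If your source is a CSV, the conversion is small enough to script yourself. The sketch below assumes a CSV with a header row. The column names, the column_map argument, and the helper names slugify and csv_to_entries are all hypothetical, and the sample row is placeholder data. Note that this slugify only keeps ASCII letters and digits, so names with accents would need extra handling.

```python
import csv
import io
import json
import re


def slugify(name):
    """Lowercase, turn runs of non-alphanumerics into hyphens, trim the ends."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def csv_to_entries(csv_text, column_map):
    """Map CSV columns onto schema fields; column_map is {schema_field: csv_column}."""
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = {field: row[col].strip() for field, col in column_map.items()}
        entry["slug"] = slugify(entry["name"])  # assumes the schema has a 'name' field
        entries.append(entry)
    return entries


# Placeholder source text; a real run would read the downloaded CSV file instead.
source = "Park Name,Year Established\nExample Park A,1900\n"
entries = csv_to_entries(source, {"name": "Park Name", "established": "Year Established"})
print(json.dumps(entries, indent=2))
```

Every value comes out as a string, so convert numeric fields (with int() or float()) before writing data.json if your template does arithmetic on them.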

Worth knowing: The data file IS the website. Treat data.json as the single source of truth: never edit generated HTML by hand, because the next build overwrites it. Want a change on every page? Change the template. A change on one page? Change that item's data.

Step 4 — The build script

Write a Python script called build.py using only the standard library. It reads data.json (schema pasted below), and for each entry writes dist/[slug]/index.html from an inline HTML template. Also generate: a homepage listing all entries grouped alphabetically, a sitemap.xml with every URL for the domain [yourdomain.com], and robots.txt. The template should have proper title and meta description tags using each entry's fields, clean modern CSS in a single style.css copied to dist/, and mobile-friendly layout. Here's my schema and two sample entries: [paste]
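
For orientation, here is a minimal sketch of the core loop such a script might contain, using only the standard library. It assumes each entry has slug, name, and summary fields (as requested in the Step 2 prompt). The template is deliberately bare, and the homepage and style.css are left out. What the AI actually returns will differ.

```python
import html
import json
from pathlib import Path

# Illustrative template only; placeholders match the assumed field names.
PAGE = """<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>{name}</title>
<meta name="description" content="{summary}">
<link rel="stylesheet" href="/style.css">
</head>
<body>
<h1>{name}</h1>
<p>{summary}</p>
</body>
</html>
"""


def build(entries, out_dir="dist"):
    """Write out_dir/<slug>/index.html for every entry; return the paths written."""
    out = Path(out_dir)
    written = []
    for entry in entries:
        page_dir = out / entry["slug"]
        page_dir.mkdir(parents=True, exist_ok=True)
        # Escape values so quotes or angle brackets in the data can't break the markup.
        fields = {key: html.escape(str(value)) for key, value in entry.items()}
        path = page_dir / "index.html"
        path.write_text(PAGE.format(**fields), encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    data_file = Path("data.json")
    if data_file.exists():
        entries = json.loads(data_file.read_text(encoding="utf-8"))
        print(f"built {len(build(entries))} pages")
```

The whole pattern is a loop over the data that fills one template per item. Everything else in a real build.py (homepage, sitemap, CSS copy) is a variation on that loop.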

python3 build.py
# → dist/ now contains your entire site
# 200 entries = 200 pages + homepage + sitemap, built in under a second
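
The sitemap and robots.txt from the prompt are just strings built from your slugs. Below is a rough sketch, assuming pages live at /[slug]/ and the site is served over https. The function names are hypothetical, and yourdomain.com is the same placeholder as in the prompt.

```python
from xml.sax.saxutils import escape


def sitemap_xml(base_url, slugs):
    """Return sitemap XML listing the homepage plus one URL per slug."""
    base = base_url.rstrip("/")
    urls = [f"{base}/"] + [f"{base}/{slug}/" for slug in slugs]
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    # escape() protects against '&' or '<' sneaking into a URL.
    lines += [f"  <url><loc>{escape(url)}</loc></url>" for url in urls]
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


def robots_txt(base_url):
    """Allow all crawlers and point them at the sitemap."""
    base = base_url.rstrip("/")
    return f"User-agent: *\nAllow: /\nSitemap: {base}/sitemap.xml\n"


print(sitemap_xml("https://yourdomain.com", ["example-park-a", "example-park-b"]))
print(robots_txt("https://yourdomain.com"))
```

Write both strings into dist/ during the build so they stay in sync with data.json.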

Step 5 — Make each page worth ranking

Sites like this are exactly what Google's spam policies on thin, mass-produced pages (it calls the pattern 'scaled content abuse') are aimed at. The fix is fields that produce genuinely useful pages: comparisons to siblings ('vs. the average park'), specific numbers, FAQs per item. Ask the AI: 'What 3 additional fields would make each page substantially more useful than the Wikipedia entry?' Then add them to the schema and regenerate.
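
Comparison fields like 'vs. the average park' don't need AI at all, because the build script can compute them from the data. Here is one possible sketch, assuming a numeric field on every entry. The field name area_km2 and the helper add_vs_average are hypothetical, and the numbers are placeholders.

```python
from statistics import mean


def add_vs_average(entries, field):
    """Attach '<field>_vs_avg_pct': percent difference from the dataset mean."""
    avg = mean(entry[field] for entry in entries)  # assumes a non-zero average
    for entry in entries:
        entry[f"{field}_vs_avg_pct"] = round((entry[field] - avg) / avg * 100, 1)
    return entries


# Placeholder numbers, not real park data.
parks = [
    {"slug": "example-park-a", "area_km2": 100},
    {"slug": "example-park-b", "area_km2": 300},
]
for park in add_vs_average(parks, "area_km2"):
    print(park["slug"], park["area_km2_vs_avg_pct"])
```

Because the value is derived at build time, it updates itself on every page whenever you add or correct an entry.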

Step 6 — Deploy and iterate forever

Upload dist/ to your host. From now on, your workflow is: edit data.json → run build.py → upload. Adding entry #201 takes one minute. Redesigning all 200 pages takes one template edit. This loop — JSON source of truth, disposable HTML output — is the same pattern professional content sites run at 10,000-page scale.
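
One gap in that loop is that deleting or renaming an entry in data.json leaves its old folder sitting in dist/ unless the build clears it. Here is a small sketch of a check for that, assuming the dist/[slug]/index.html layout from Step 4. The helper name stale_pages is made up for illustration.

```python
from pathlib import Path


def stale_pages(slugs, out_dir="dist"):
    """Return folder names in out_dir that hold an index.html but match no current slug."""
    out = Path(out_dir)
    if not out.is_dir():
        return []
    current = set(slugs)
    return sorted(
        path.name
        for path in out.iterdir()
        if path.is_dir() and (path / "index.html").exists() and path.name not in current
    )


# Usage: pass the slugs currently in data.json and delete (or redirect) whatever comes back.
print(stale_pages(["example-park-a", "example-park-b"]))
```

A simpler alternative is to delete dist/ at the start of every build, which fits the idea that the HTML output is disposable.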

Keep going

Need somewhere to put it live? See where to host AI-built sites. Compare tool costs on the pricing tracker (or stick to the free options), then pick your next build.