💡 idea-box

Stop reading "100 startup ideas" listicles. Start with 1,000 pain points that already have paying users and broken tools.

🔎 Browse the live site · 📊 Download the JSON · 🧠 See the schema · 🤝 Add yours

The pitch in one paragraph

Most "startup idea" lists are vibes. Bullet points, no audience, no numbers, no sources. idea-box is the opposite. Every entry answers six questions a serious builder actually asks:

Who hurts? (the specific role — not "small businesses")
How badly? (pain severity, how often it happens, what breaks)
How many of them are there? (firm counts, industry labor spend)
What are they paying today? (the pricing at incumbent SaaS tools)
Why don't today's tools fix it? (the concrete gap)
Why is now a good time to build? (what infra or rule change made it possible)

Top candidates are then run through a deeper investor-grade check — market sizing triangulated three ways, unit economics, an 8-factor score, and a "what would have to be true" risk list.

A real entry, unpacked

PAIN-0001 — Phone/device repair shop: fragmented quote-to-invoice lifecycle

Who hurts · 1–5 technician repair-shop owner

What breaks · Customer texts asking for an iPhone screen price. Owner manually checks stock in a spreadsheet, quotes via SMS, writes the booking in Google Calendar, logs it in the POS. Nothing is connected — parts get oversold, bookings get missed, everything is re-keyed.

How big · ~150,000 shops in North America alone

What they pay now · ~$200/month to RepairShopr, RepairDesk, or MyGadgetRepairs

Why incumbents miss it · Rigid software, no native WhatsApp/SMS agent, no stock-aware intake, weak omnichannel

Why now · WhatsApp Business API is cheap, LLM voice costs crashed, SMBs expect conversational tools

Sources · Rossmann Forum thread on shop software · CellSmart POS review roundup

Signal strength · 9/10 severity · 9/10 tedium · score 92/100

That's what every one of the 1,000 entries gives you. Not a slogan. Not a vibe.

Who this is for

If you are…	You'll use this to…
A founder hunting for an idea	Skip the vibes lists. Go straight to problems with named customers + known price points.
A VC or angel investor	Pre-triage decks. If the problem isn't in here — or it is but the listed incumbents already solved it — that's a signal.
A product leader at a scaling company	Find the 3 adjacent problems your customers have that you could wrap into a new SKU.
A vertical-AI builder	Map labor budgets to agent workflows. Every entry tags a workflow family (7 of them cover most of B2B).
A researcher or PMM	The highest-density dataset anywhere on where unsolved operational pain maps to buyable software.

What's inside

📦 1,000 entries across 24 batches

Grouped by the kind of builder who'd care:

#	Batch	Entries
01	Repair shops, home services	25
02	Marketing, sales, RevOps	25
03	India / Bharat (UPI, Zepto, Porter…)	25
04	Due diligence, compliance, legal	25
05	Healthcare & medical	20
06	Finance, HR, ops, long-tail	30
07	Off-beat niches (funeral, towing, drone…)	25
08	Reconciliation, automation, ServiceTitan	25
09	YC-style agent wedges	50
10	Real Reddit pain threads	50
11	International (EU · APAC · MENA · LATAM)	50
12	Enterprise IT, gaming, climate	50
13	Business-ops deep	50
14	Direct-to-consumer & e-commerce	50
15	Financial services deep	50
16	Supply chain & logistics	50
17	Scientific, pharma, R&D	50
18	Specialty professional services	50
19	Consumer & prosumer	50
20	AI-native emerging categories	50
21	Healthcare adjacent (payer, pharmacy…)	50
22	Public sector / govtech	50
23	Niche B2B long-tail	50
24	Final curated	50

🔬 The top 50 get a deeper investor-grade check

Each of the 50 highest-scoring entries is run through the AI-cofounder 8-factor rubric. You get:

Market sizing, triangulated 3 ways

Bottom-up — firm count × what they'd pay per year
Value-based — hours saved × hourly cost × 15% capture
Top-down — the incumbent category spend
A convergence check that flags the estimate if the three methods disagree by >2×

Unit economics stub in SaaS / marketplace / D2C / fintech shapes

Annual contract value, gross margin, LTV, CAC, LTV:CAC ratio, payback period

8-layer score (1–5 each)

Problem clarity · Market size · Solution differentiation · Unit econ · P&L viability · Timing & moat · Founder fit · Execution risk

Top 3 riskiest assumptions — the "what would have to be true" list

Verdict

Strong signal · Conditional pass · Fragile · Weak

🖥 Browser UI

Zero-build. Single HTML file. Filter by vertical, workflow family, severity, willingness-to-pay direction, and validation verdict. Try it →

⚙ Tooling

scripts/add-pain.py — drop a CSV, get JSON with auto opportunity-scoring
scripts/validate-s-tier.py — re-run the deep-validation rubric anytime

What's in each entry

Every entry in data/pains-*.json has these fields. Plain English only: what it means and why it matters.

Core fields (on every entry)

Field	What it means	Why a builder cares
`id`	Permanent ID like `PAIN-0001`	Stable reference across versions
`title`	One-line pain in human language	The hook
`vertical`	Industry label (for example: "Device Repair", "Veterinary")	Filters out what you don't want to build in
`workflow_family`	One of 7 kinds of operational work	Maps to the kind of agent you would build
`persona`	The specific buyer role, not a category	You have to know who you are calling
`pain_description`	What breaks, in the persona's words	The narrative that sells your pitch
`pain_severity`	1 to 10, how painful	10 means losing money or triggering regulatory risk
`tedium_score`	1 to 10, how repetitive the task is	10 is pure copy-paste data entry, prime AI replacement
`frequency`	How often the pain happens	"Daily per shop" signals volume pricing
`tam_firms`	Estimated count of target companies	Used for bottom-up market sizing
`tam_labor_spend_usd`	Annual labor cost across this market, in US dollars	Frames the pain as a labor-budget opportunity, not an IT-budget one (Menlo Ventures framing)
`tam_direction`	Growing, stable, or shrinking	Tells you whether you are surfing or swimming
`current_wtp_usd_month`	What customers pay today to existing tools, per month in US dollars	Anchors your pricing ceiling
`incumbent_tools`	Tools and SaaS products already serving this pain	Your competition
`incumbent_gap`	Why today's tools fail these customers	Your wedge
`santifer_pattern`	One-line solution sketch	Starting point, not a spec
`sources`	Reddit threads, forums, reviews	Evidence, not opinion
`verbatim_quotes`	Real user complaints pulled from those threads, quoted as written, with author handle and date	Someone's actual words, not your paraphrase
`reference_build`	URL of an existing working example of this solution pattern, if one exists	Proof it can be done
`why_now`	What changed recently that makes this buildable	The catalyst for today
`opportunity_score`	Composite score from 0 to 100	Sort on this, then read the details

Decision-grade fields (on the top 50 entries, rolling out to others)

Extra evidence that lets you decide whether to actually build. Every field is designed to answer one of four jobs: can I validate this, can I catch the trend early, can I avoid wasting 3 months, and can I reach $50,000 in monthly revenue?

Field	What it means
`specific_prospects`	Named people, companies, subreddits, Facebook groups, and associations you could contact today. For each: name, type, URL, how big, and how to reach them.
`velocity_signal`	Is the pain growing, flat, or cooling? Includes concrete evidence (numbers, dates), the trigger (new API, regulation, cost drop), and your estimated head start before mass competition.
`buying_intent_evidence`	Quotes where someone signals they actively want to pay. Typed as: asking for a recommendation, actively comparing tools, said they will pay, switching away from their current tool, signed up for a waitlist, or funding raised in the category.
`kill_risks`	3 to 5 things that could kill this idea, each tagged kill / serious / manageable with current status.
`fermi_math_to_50k`	Arithmetic path to $50,000 in monthly revenue: realistic year-1 reach, signup rate, monthly price, customers needed, months, and verdict (math works / tight but possible / math does not work).

A quick note on abbreviations

Some fields keep shorthand in their JSON names for backward compatibility, but the browser UI always shows plain English:

tam_firms shows as "Target companies"
tam_labor_spend_usd shows as "Annual labor cost in this market"
tam_direction shows as "Market trend"
current_wtp_usd_month shows as "What they pay today"

Quick start

🌐 Browse the data (recommended)

Live: mothivenkatesh.github.io/idea-box/web/

Prefer local:

git clone https://github.com/mothivenkatesh/idea-box.git
cd idea-box
python -m http.server 8000
# open http://localhost:8000/web/

🔍 Query with jq

# Top 10 India-specific opportunities
jq '[.[] | select(.vertical | test("Bharat"))] | sort_by(-.opportunity_score) | .[0:10] | map({id, title, persona})' \
   data/pains-03-bharat-india.json

# All deep-validated "Conditional pass" candidates
jq '.[] | select(.ai_cofounder_validation.verdict == "Conditional pass") | .title' \
   data/validations-s-tier.json

# Count entries by workflow family across the whole corpus
jq -s 'add | group_by(.workflow_family) | map({family: .[0].workflow_family, count: length})' \
   data/pains-*.json

🐍 Query with Python

import json, glob
pains = []
for f in sorted(glob.glob("data/pains-*.json")):
    pains.extend(json.load(open(f, encoding="utf-8")))

# SMB opportunities where target already pays >= $200/month today
opps = sorted(
    [p for p in pains
     if (p.get("current_wtp_usd_month") or 0) >= 200
     and "smb" in p.get("tags", [])],
    key=lambda p: -(p.get("opportunity_score") or 0)
)[:20]

How the data was built

Synthesized from — and cross-checked against — sources that compound credibility:

Santifer's Business OS case study — 16 years of running a real repair shop, 170 hours/month of operator time saved with the pattern. Applied in 200+ entries.
a16z's vertical-AI + vertical-SaaS thesis — the industry-labor-spend framing
Menlo Ventures — Software Finally Gets to Work — $740B healthcare admin vs $63B IT
Lyzr AI — 101 enterprise use cases + Sierra + Nurix AI — what's actually shipping at enterprise today
ideabrowser.com (Greg Isenberg) + painonsocial.com — the "structured idea card" format
BigIdeasDB — 238K real complaints across Reddit, G2, Capterra, Upwork analyzed
Direct Reddit reading — r/smallbusiness, r/Entrepreneur, r/accounting, r/nocode, r/Airtable, r/HVAC, r/realtors, r/medicine, r/sysadmin, r/construction, and ~20 more
G2 and Capterra 2-star review patterns (gaps hide here)
NAICS labor-spend data for market-sizing triangulation

Who this borrows from (and links back to)

Santifer — the repair-shop pattern that proves the "Airtable + automation + AI agent" bundle works for SMBs
Greg Isenberg's ideabrowser — the card-per-idea discipline
painonsocial.com — Reddit-mined pain-point scoring
BigIdeasDB — the 238K-complaint validation approach
Frappe's Espresso design system — styling for the browser UI

If you found this through any of them, go ⭐ them too.

Roadmap

v1.0 — 1,000 entries, top 50 deep-validated, live browser
v1.1 — verbatim_quotes field on the schema, real quotes seeded for the 50 S-tier entries, empty slots on the other 950 for community fill-in
v1.2 — open contributions via PR + issue templates
v1.3 — deep-validation expanded to top 100
v2.0 — automated pipeline: nightly Reddit pull → candidate entries → human review

Contributing

PRs welcome. See CONTRIBUTING.md. The highest-leverage contributions:

Add a source link to an existing entry — a Reddit thread, G2 review, or forum post that validates the pain
Fill in missing firm counts or labor-spend estimates from NAICS or industry reports
Flag duplicates — open an issue if you find two entries describing the same pain
Submit a new entry via the issue template

If idea-box helps you find your next build, a ⭐ means a lot.

⭐ Star the repo · 🐙 Open an issue · 𝕏 Share

_{MIT-licensed. Built by @mothivenkatesh with the help of Claude Code.}

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.github/ISSUE_TEMPLATE		.github/ISSUE_TEMPLATE
data		data
scripts		scripts
web		web
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
index.html		index.html
schema.json		schema.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

💡 idea-box

Stop reading "100 startup ideas" listicles. Start with 1,000 pain points that already have paying users and broken tools.

The pitch in one paragraph

A real entry, unpacked

Who this is for

What's inside

📦 1,000 entries across 24 batches

🔬 The top 50 get a deeper investor-grade check

🖥 Browser UI

⚙ Tooling

What's in each entry

Core fields (on every entry)

Decision-grade fields (on the top 50 entries, rolling out to others)

A quick note on abbreviations

Quick start

🌐 Browse the data (recommended)

🔍 Query with jq

🐍 Query with Python

How the data was built

Who this borrows from (and links back to)

Roadmap

Contributing

If idea-box helps you find your next build, a ⭐ means a lot.

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

💡 idea-box

Stop reading "100 startup ideas" listicles. Start with 1,000 pain points that already have paying users and broken tools.

The pitch in one paragraph

A real entry, unpacked

Who this is for

What's inside

📦 1,000 entries across 24 batches

🔬 The top 50 get a deeper investor-grade check

🖥 Browser UI

⚙ Tooling

What's in each entry

Core fields (on every entry)

Decision-grade fields (on the top 50 entries, rolling out to others)

A quick note on abbreviations

Quick start

🌐 Browse the data (recommended)

🔍 Query with jq

🐍 Query with Python

How the data was built

Who this borrows from (and links back to)

Roadmap

Contributing

If idea-box helps you find your next build, a ⭐ means a lot.

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages