Skip to content

mothivenkatesh/idea-box

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

8 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ’‘ idea-box

Stop reading "100 startup ideas" listicles. Start with 1,000 pain points that already have paying users and broken tools.


Stars Forks License Pain points Deep-validated


πŸ”Ž Browse the live site Β Β·Β  πŸ“Š Download the JSON Β Β·Β  🧠 See the schema Β Β·Β  🀝 Add yours



The pitch in one paragraph

Most "startup idea" lists are vibes. Bullet points, no audience, no numbers, no sources. idea-box is the opposite. Every entry answers six questions a serious builder actually asks:

  1. Who hurts? (the specific role β€” not "small businesses")
  2. How badly? (pain severity, how often it happens, what breaks)
  3. How many of them are there? (firm counts, industry labor spend)
  4. What are they paying today? (the pricing at incumbent SaaS tools)
  5. Why don't today's tools fix it? (the concrete gap)
  6. Why is now a good time to build? (what infra or rule change made it possible)

Top candidates are then run through a deeper investor-grade check β€” market sizing triangulated three ways, unit economics, an 8-factor score, and a "what would have to be true" risk list.


A real entry, unpacked

PAIN-0001 β€” Phone/device repair shop: fragmented quote-to-invoice lifecycle

  • Who hurts Β· 1–5 technician repair-shop owner
  • What breaks Β· Customer texts asking for an iPhone screen price. Owner manually checks stock in a spreadsheet, quotes via SMS, writes the booking in Google Calendar, logs it in the POS. Nothing is connected β€” parts get oversold, bookings get missed, everything is re-keyed.
  • How big Β· ~150,000 shops in North America alone
  • What they pay now Β· ~$200/month to RepairShopr, RepairDesk, or MyGadgetRepairs
  • Why incumbents miss it Β· Rigid software, no native WhatsApp/SMS agent, no stock-aware intake, weak omnichannel
  • Why now Β· WhatsApp Business API is cheap, LLM voice costs crashed, SMBs expect conversational tools
  • Sources Β· Rossmann Forum thread on shop software Β· CellSmart POS review roundup
  • Signal strength Β· 9/10 severity Β· 9/10 tedium Β· score 92/100

That's what every one of the 1,000 entries gives you. Not a slogan. Not a vibe.


Who this is for

If you are… You'll use this to…
A founder hunting for an idea Skip the vibes lists. Go straight to problems with named customers + known price points.
A VC or angel investor Pre-triage decks. If the problem isn't in here β€” or it is but the listed incumbents already solved it β€” that's a signal.
A product leader at a scaling company Find the 3 adjacent problems your customers have that you could wrap into a new SKU.
A vertical-AI builder Map labor budgets to agent workflows. Every entry tags a workflow family (7 of them cover most of B2B).
A researcher or PMM The highest-density dataset anywhere on where unsolved operational pain maps to buyable software.

What's inside

πŸ“¦ 1,000 entries across 24 batches

Grouped by the kind of builder who'd care:

# Batch Entries
01 Repair shops, home services 25
02 Marketing, sales, RevOps 25
03 India / Bharat (UPI, Zepto, Porter…) 25
04 Due diligence, compliance, legal 25
05 Healthcare & medical 20
06 Finance, HR, ops, long-tail 30
07 Off-beat niches (funeral, towing, drone…) 25
08 Reconciliation, automation, ServiceTitan 25
09 YC-style agent wedges 50
10 Real Reddit pain threads 50
11 International (EU Β· APAC Β· MENA Β· LATAM) 50
12 Enterprise IT, gaming, climate 50
13 Business-ops deep 50
14 Direct-to-consumer & e-commerce 50
15 Financial services deep 50
16 Supply chain & logistics 50
17 Scientific, pharma, R&D 50
18 Specialty professional services 50
19 Consumer & prosumer 50
20 AI-native emerging categories 50
21 Healthcare adjacent (payer, pharmacy…) 50
22 Public sector / govtech 50
23 Niche B2B long-tail 50
24 Final curated 50

πŸ”¬ The top 50 get a deeper investor-grade check

Each of the 50 highest-scoring entries is run through the AI-cofounder 8-factor rubric. You get:

Market sizing, triangulated 3 ways

  • Bottom-up β€” firm count Γ— what they'd pay per year
  • Value-based β€” hours saved Γ— hourly cost Γ— 15% capture
  • Top-down β€” the incumbent category spend
  • A convergence check that flags the estimate if the three methods disagree by >2Γ—

Unit economics stub in SaaS / marketplace / D2C / fintech shapes

  • Annual contract value, gross margin, LTV, CAC, LTV:CAC ratio, payback period

8-layer score (1–5 each)

  • Problem clarity Β· Market size Β· Solution differentiation Β· Unit econ Β· P&L viability Β· Timing & moat Β· Founder fit Β· Execution risk

Top 3 riskiest assumptions β€” the "what would have to be true" list

Verdict

  • Strong signal Β· Conditional pass Β· Fragile Β· Weak

πŸ–₯ Browser UI

Zero-build. Single HTML file. Filter by vertical, workflow family, severity, willingness-to-pay direction, and validation verdict. Try it β†’

βš™ Tooling

  • scripts/add-pain.py β€” drop a CSV, get JSON with auto opportunity-scoring
  • scripts/validate-s-tier.py β€” re-run the deep-validation rubric anytime

What's in each entry

Every entry in data/pains-*.json has these fields. Plain English only: what it means and why it matters.

Core fields (on every entry)

Field What it means Why a builder cares
id Permanent ID like PAIN-0001 Stable reference across versions
title One-line pain in human language The hook
vertical Industry label (for example: "Device Repair", "Veterinary") Filters out what you don't want to build in
workflow_family One of 7 kinds of operational work Maps to the kind of agent you would build
persona The specific buyer role, not a category You have to know who you are calling
pain_description What breaks, in the persona's words The narrative that sells your pitch
pain_severity 1 to 10, how painful 10 means losing money or triggering regulatory risk
tedium_score 1 to 10, how repetitive the task is 10 is pure copy-paste data entry, prime AI replacement
frequency How often the pain happens "Daily per shop" signals volume pricing
tam_firms Estimated count of target companies Used for bottom-up market sizing
tam_labor_spend_usd Annual labor cost across this market, in US dollars Frames the pain as a labor-budget opportunity, not an IT-budget one (Menlo Ventures framing)
tam_direction Growing, stable, or shrinking Tells you whether you are surfing or swimming
current_wtp_usd_month What customers pay today to existing tools, per month in US dollars Anchors your pricing ceiling
incumbent_tools Tools and SaaS products already serving this pain Your competition
incumbent_gap Why today's tools fail these customers Your wedge
santifer_pattern One-line solution sketch Starting point, not a spec
sources Reddit threads, forums, reviews Evidence, not opinion
verbatim_quotes Real user complaints pulled from those threads, quoted as written, with author handle and date Someone's actual words, not your paraphrase
reference_build URL of an existing working example of this solution pattern, if one exists Proof it can be done
why_now What changed recently that makes this buildable The catalyst for today
opportunity_score Composite score from 0 to 100 Sort on this, then read the details

Decision-grade fields (on the top 50 entries, rolling out to others)

Extra evidence that lets you decide whether to actually build. Every field is designed to answer one of four jobs: can I validate this, can I catch the trend early, can I avoid wasting 3 months, and can I reach $50,000 in monthly revenue?

Field What it means
specific_prospects Named people, companies, subreddits, Facebook groups, and associations you could contact today. For each: name, type, URL, how big, and how to reach them.
velocity_signal Is the pain growing, flat, or cooling? Includes concrete evidence (numbers, dates), the trigger (new API, regulation, cost drop), and your estimated head start before mass competition.
buying_intent_evidence Quotes where someone signals they actively want to pay. Typed as: asking for a recommendation, actively comparing tools, said they will pay, switching away from their current tool, signed up for a waitlist, or funding raised in the category.
kill_risks 3 to 5 things that could kill this idea, each tagged kill / serious / manageable with current status.
fermi_math_to_50k Arithmetic path to $50,000 in monthly revenue: realistic year-1 reach, signup rate, monthly price, customers needed, months, and verdict (math works / tight but possible / math does not work).

A quick note on abbreviations

Some fields keep shorthand in their JSON names for backward compatibility, but the browser UI always shows plain English:

  • tam_firms shows as "Target companies"
  • tam_labor_spend_usd shows as "Annual labor cost in this market"
  • tam_direction shows as "Market trend"
  • current_wtp_usd_month shows as "What they pay today"

Quick start

🌐 Browse the data (recommended)

Live: mothivenkatesh.github.io/idea-box/web/

Prefer local:

git clone https://github.com/mothivenkatesh/idea-box.git
cd idea-box
python -m http.server 8000
# open http://localhost:8000/web/

πŸ” Query with jq

# Top 10 India-specific opportunities
jq '[.[] | select(.vertical | test("Bharat"))] | sort_by(-.opportunity_score) | .[0:10] | map({id, title, persona})' \
   data/pains-03-bharat-india.json

# All deep-validated "Conditional pass" candidates
jq '.[] | select(.ai_cofounder_validation.verdict == "Conditional pass") | .title' \
   data/validations-s-tier.json

# Count entries by workflow family across the whole corpus
jq -s 'add | group_by(.workflow_family) | map({family: .[0].workflow_family, count: length})' \
   data/pains-*.json

🐍 Query with Python

import json, glob
pains = []
for f in sorted(glob.glob("data/pains-*.json")):
    pains.extend(json.load(open(f, encoding="utf-8")))

# SMB opportunities where target already pays >= $200/month today
opps = sorted(
    [p for p in pains
     if (p.get("current_wtp_usd_month") or 0) >= 200
     and "smb" in p.get("tags", [])],
    key=lambda p: -(p.get("opportunity_score") or 0)
)[:20]

How the data was built

Synthesized from β€” and cross-checked against β€” sources that compound credibility:


Who this borrows from (and links back to)

If you found this through any of them, go ⭐ them too.


Roadmap

  • v1.0 β€” 1,000 entries, top 50 deep-validated, live browser
  • v1.1 β€” verbatim_quotes field on the schema, real quotes seeded for the 50 S-tier entries, empty slots on the other 950 for community fill-in
  • v1.2 β€” open contributions via PR + issue templates
  • v1.3 β€” deep-validation expanded to top 100
  • v2.0 β€” automated pipeline: nightly Reddit pull β†’ candidate entries β†’ human review

Contributing

PRs welcome. See CONTRIBUTING.md. The highest-leverage contributions:

  1. Add a source link to an existing entry β€” a Reddit thread, G2 review, or forum post that validates the pain
  2. Fill in missing firm counts or labor-spend estimates from NAICS or industry reports
  3. Flag duplicates β€” open an issue if you find two entries describing the same pain
  4. Submit a new entry via the issue template


If idea-box helps you find your next build, a ⭐ means a lot.

⭐ Star the repo Β Β·Β  πŸ™ Open an issue Β Β·Β  𝕏 Share


MIT-licensed. Built by @mothivenkatesh with the help of Claude Code.

About

πŸ’‘ 1,000 structured pain points for vertical AI. Every entry has persona, TAM, WTP, incumbents & the gap β€” plus 50 AI-cofounder validated. Zero-build static browser.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors