A scraper that extracts detailed company data from Thomasnet.com and returns structured business information for research, lead generation, and automation workflows. It collects company profiles, contact details, service categories, and metadata, so industrial data gathering stays fast, reliable, and scalable.
Created by Bitbash, built to showcase our approach to Scraping and Automation!
The Thomasnet.com Scraper collects structured information about manufacturing and industrial companies listed on Thomasnet. It solves the challenge of manually researching suppliers by fully automating discovery and data extraction at scale. This tool is ideal for researchers, lead generators, analysts, procurement teams, and automation engineers.
- Automatically extracts detailed manufacturing company profiles.
- Captures structured and enriched business metadata.
- Supports multiple starting URLs for broad coverage.
- Maintains reliability through concurrency and retry controls.
- Streamlines B2B lead generation and market research operations.
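The concurrency and retry controls listed above are not specified in detail here, so the following is only a minimal sketch of how such controls are commonly built. The names `backoffDelayMs`, `withRetry`, and `mapWithConcurrency`, along with the default delays, are assumptions for illustration and are not taken from this repository.

```javascript
// Hypothetical helpers; none of these names come from the scraper's source.

// Exponential backoff: baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls task() until it resolves or maxRetries extra attempts are used up.
async function withRetry(task, { maxRetries = 3, baseMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries) await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  throw lastError;
}

// Runs worker(item) over items with at most `limit` calls in flight.
// Results keep the order of the input array.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index], index);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}
```

A caller could combine the two, for example `mapWithConcurrency(urls, 5, (url) => withRetry(() => fetchProfile(url)))`, where `fetchProfile` stands in for whatever request function the project actually uses.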
| Feature | Description |
|---|---|
| Multi-URL Scraping | Supports multiple Thomasnet search result URLs to maximize coverage. |
| Detailed Company Extraction | Gathers names, contacts, descriptions, categories, rankings, and personnel. |
| Proxy Support | Routes requests through proxies to keep large-scale collection stable and anonymous. |
| Configurable Performance | Control concurrency, limits, retries, and scraping depth. |
| Clean Structured Output | Delivers standardized JSON ready for pipelines and databases. |
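To make the configurable options above more concrete, here is a hedged sketch of what an input object and its validation might look like. The key names (`startUrls`, `maxItems`, `maxConcurrency`, `maxRetries`, `proxyUrls`), the default values, and the `normalizeInput` function are all assumptions; the real keys in `data/input.sample.json` and `settings.example.json` may differ.

```javascript
// Hypothetical input shape; the real keys in data/input.sample.json may differ.
const DEFAULTS = {
  startUrls: [],      // Thomasnet search result URLs to crawl
  maxItems: 100,      // stop after this many company profiles
  maxConcurrency: 5,  // parallel requests
  maxRetries: 3,      // retries per failed request
  proxyUrls: [],      // optional proxy endpoints
};

// Merges user input over the defaults and rejects obviously invalid values.
function normalizeInput(raw = {}) {
  const input = { ...DEFAULTS, ...raw };
  if (!Array.isArray(input.startUrls) || input.startUrls.length === 0) {
    throw new Error('startUrls must contain at least one URL');
  }
  for (const url of input.startUrls) {
    const { hostname } = new URL(url); // throws on malformed URLs
    if (hostname !== 'thomasnet.com' && !hostname.endsWith('.thomasnet.com')) {
      throw new Error(`Not a Thomasnet URL: ${url}`);
    }
  }
  for (const key of ['maxItems', 'maxConcurrency']) {
    if (!Number.isInteger(input[key]) || input[key] < 1) {
      throw new Error(`${key} must be a positive integer`);
    }
  }
  if (!Number.isInteger(input.maxRetries) || input.maxRetries < 0) {
    throw new Error('maxRetries must be a non-negative integer');
  }
  return input;
}
```

Validating up front means a typo in a URL or a zero concurrency value fails immediately instead of partway through a long run.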
| Field Name | Field Description |
|---|---|
| name | The company’s official name. |
| website | URL of the company’s website. |
| primaryPhone | Main company phone number. |
| description | Company description, capabilities, and services. |
| families | Array of service/product families the company belongs to. |
| headings | Categories or headings under which the company is listed. |
| personnel | Key company staff members and their roles. |
| annualSales | Reported annual sales volume. |
| numberEmployees | Employee count range. |
| address | Full address object including city, state, ZIP, and coordinates. |
| tgramsId | Unique Thomasnet identifier. |
| isAdvertiser | Whether the company advertises on Thomasnet. |
| social | Social networks or related links. |
| item | Nested summary object with alternate or raw indexed fields. |
[
  {
    "__typename": "Company",
    "tgramsId": "30828201",
    "name": "Focus Fab, LLC",
    "website": "http://focusfab.com/",
    "primaryPhone": "(888) 449-4877",
    "description": "Custom CNC machining, fabrication and assembly services...",
    "numberEmployees": "10-49",
    "families": [
      { "id": "157558", "name": "Assembly Services" }
    ],
    "address": {
      "address1": "5540 Parkwood Circle",
      "city": "Bessemer",
      "state": "AL",
      "zip": "35022",
      "country": "USA"
    },
    "personnel": [
      { "name": "David Lomasney", "title": "Owner" }
    ]
  }
]
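As an illustration of how a record like the sample above might be consumed downstream, the sketch below flattens one company object into a single lead row. The function name `toLeadRow` and the choice of output columns are assumptions made for this example; only the input field names come from the sample.

```javascript
// Hypothetical helper: flattens one scraped company record into a flat row.
// Missing optional fields fall back to empty strings.
function toLeadRow(company) {
  const contact = company.personnel?.[0];
  return {
    tgramsId: company.tgramsId,
    name: company.name,
    phone: company.primaryPhone ?? '',
    website: company.website ?? '',
    city: company.address?.city ?? '',
    state: company.address?.state ?? '',
    families: (company.families ?? []).map((family) => family.name).join('; '),
    contactName: contact?.name ?? '',
    contactTitle: contact?.title ?? '',
  };
}

// Abbreviated copy of the sample record shown above.
const sample = {
  tgramsId: '30828201',
  name: 'Focus Fab, LLC',
  website: 'http://focusfab.com/',
  primaryPhone: '(888) 449-4877',
  families: [{ id: '157558', name: 'Assembly Services' }],
  address: { city: 'Bessemer', state: 'AL', zip: '35022', country: 'USA' },
  personnel: [{ name: 'David Lomasney', title: 'Owner' }],
};

const row = toLeadRow(sample);
// row.city is 'Bessemer', row.contactTitle is 'Owner'
```

Flat rows like this are easier to load into a CRM or spreadsheet than the nested JSON.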
Thomasnet.com Scraper/
├── src/
│   ├── main.js
│   ├── utils/
│   │   ├── parser.js
│   │   └── request_handler.js
│   ├── extractors/
│   │   └── thomasnet_company_extractor.js
│   └── config/
│       └── settings.example.json
├── data/
│   ├── input.sample.json
│   └── output.sample.json
├── package.json
└── README.md
- Sales teams use it to collect B2B leads, enabling them to target verified industrial suppliers quickly.
- Researchers use it to analyze manufacturing ecosystem data, improving market intelligence accuracy.
- Procurement departments use it to identify new vendors, helping them expand supply chain options.
- Automation engineers integrate it into pipelines to enrich databases, ensuring fresh, standardized industrial data.
- Competitive analysts use it to track service categories and capabilities, generating insights into market positioning.
Q: Can I scrape multiple regions or categories at once? Yes. Add multiple Thomasnet search URLs to the input, and the scraper will process them sequentially or in parallel depending on your concurrency settings.
Q: Does the scraper handle blocked requests or captchas? The scraper uses proxy rotation and retry logic to reduce failures and improve stability across large datasets.
Q: What output formats are supported? Output is generated in clean JSON, which you can convert to CSV or Excel using any standard tooling.
Q: How large a dataset can I scrape? With proper concurrency and proxy settings, the scraper can handle thousands of companies reliably.
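For the JSON-to-CSV conversion mentioned in the answers above, a dedicated CSV library is usually the safer choice, but a minimal sketch without dependencies could look like the following. `escapeCsv` and `toCsv` are hypothetical names, and the sketch only handles flat columns; nested fields such as `address` or `personnel` would need to be flattened first.

```javascript
// Hypothetical helpers for a minimal, dependency-free CSV export.

// Quotes a value when it contains a comma, quote, or line break.
function escapeCsv(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Builds a CSV string with a header row from an array of flat objects.
function toCsv(rows, columns) {
  const header = columns.map(escapeCsv).join(',');
  const lines = rows.map((row) =>
    columns.map((column) => escapeCsv(row[column])).join(','));
  return [header, ...lines].join('\r\n');
}

const csv = toCsv(
  [{ name: 'Focus Fab, LLC', primaryPhone: '(888) 449-4877' }],
  ['name', 'primaryPhone'],
);
// csv is: name,primaryPhone\r\n"Focus Fab, LLC",(888) 449-4877
```

The company name is quoted because it contains a comma, which is the case a naive `join(',')` would get wrong.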
Primary Metric: Processes an average of 120–180 company profiles per minute depending on concurrency and network conditions.
Reliability Metric: Maintains a 98%+ request success rate with proxy support and retry logic enabled.
Efficiency Metric: Optimized request handling provides high throughput with minimal resource overhead.
Quality Metric: Generates near-complete profile coverage, including headings, families, personnel, and address metadata for the majority of listings.
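As a rough worked example of what the stated 120–180 profiles per minute would mean for planning a run, the sketch below converts a target profile count into a time range. `estimateRunRange` is a hypothetical helper, and the result is only as reliable as the throughput figures above, which depend on concurrency and network conditions.

```javascript
// Hypothetical planning helper based on the throughput range stated above.
// Returns the estimated run time in minutes at the fast and slow rates.
function estimateRunRange(profileCount, slowRate = 120, fastRate = 180) {
  if (profileCount < 0 || slowRate <= 0 || fastRate <= 0) {
    throw new Error('profileCount must be >= 0 and rates must be > 0');
  }
  return {
    bestCaseMinutes: profileCount / fastRate,
    worstCaseMinutes: profileCount / slowRate,
  };
}

const estimate = estimateRunRange(3600);
// estimate is { bestCaseMinutes: 20, worstCaseMinutes: 30 }
```

So a run of 3,600 profiles would take roughly 20 to 30 minutes if the stated rates hold.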
