GitHub - scrapeless-ai/actor-template-ts

Scrape single-page in TypeScript template

A template for scraping data from a single web page in TypeScript (Node.js). The URL of the web page is passed in via the Actor input, which is defined by input_schema.json.

In this template the scraped data are the page headings, but you can easily edit the code to scrape whatever you want from the page.
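As an illustration of what "page headings" could mean in practice, here is a minimal sketch of a heading collector. It is not taken from the template's source: the names `collectHeadings`, `RootLike`, `ElementLike` and `HeadingItem` are hypothetical, and the function only assumes an object with a DOM-style `querySelectorAll` method, which the browser's `document` provides.

```typescript
// Hypothetical minimal shapes; the browser's `document` and `Element`
// satisfy them structurally, so no DOM typings are required here.
interface ElementLike {
  tagName: string;
  textContent: string | null;
}

interface RootLike {
  querySelectorAll(selector: string): ArrayLike<ElementLike>;
}

interface HeadingItem {
  level: string; // "h1" .. "h6"
  text: string;  // trimmed heading text
}

// Collect every non-empty h1-h6 element under `root`, in document order.
function collectHeadings(root: RootLike): HeadingItem[] {
  return Array.from(root.querySelectorAll("h1, h2, h3, h4, h5, h6"))
    .map((el) => ({
      level: el.tagName.toLowerCase(),
      text: (el.textContent ?? "").trim(),
    }))
    .filter((heading) => heading.text.length > 0);
}
```

To scrape something other than headings, the selector string and the mapped fields are the two places that would change.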

Included features

Scrapeless SDK - toolkit for building Actors
Puppeteer - a Node.js library that controls Chrome or Chromium browsers programmatically

How it works

actor.input() gets the input, where the page URL is defined
client.browser.create() gets the browser WebSocket endpoint
page.goto(url) navigates to the target website
actor.addItems() saves the scraped data to the dataset
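The flow of these four steps can be sketched as follows. This is a simplified outline under stated assumptions, not the template's actual code: the interfaces `ActorLike`, `PageLike` and `BrowserLike`, the `connect` callback and the `headings()` helper are hypothetical stand-ins, and the real Scrapeless SDK and Puppeteer signatures may differ.

```typescript
// Hypothetical shapes that mirror the calls named in this README.
interface Heading {
  level: string;
  text: string;
}

interface ActorLike {
  input(): Promise<{ url: string }>;
  addItems(items: Heading[]): Promise<void>;
}

interface PageLike {
  goto(url: string): Promise<unknown>;
  // Stand-in for an in-page evaluation that collects the h1-h6 elements.
  headings(): Promise<Heading[]>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}

// `connect` stands in for obtaining the WebSocket endpoint via
// client.browser.create() and attaching Puppeteer to it.
async function scrapeHeadings(
  actor: ActorLike,
  connect: () => Promise<BrowserLike>,
): Promise<Heading[]> {
  const { url } = await actor.input(); // 1. read the page URL from the input
  const browser = await connect();     // 2. open the remote browser session
  try {
    const page = await browser.newPage();
    await page.goto(url);              // 3. navigate to the target website
    const items = await page.headings();
    await actor.addItems(items);       // 4. save the results to the dataset
    return items;
  } finally {
    await browser.close();             // release the browser even on failure
  }
}
```

Passing the actor and the browser connection in as parameters keeps the sketch independent of any particular SDK version and lets the flow be exercised with in-memory fakes.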

Getting started

Fork or clone the repository to your GitHub account and link your GitHub repository to Scrapeless Actor. Then:

Build the Actor
Run the Actor

Folders and files

.actor
src
.env.example
.gitignore
Dockerfile
README.md
package.json
pnpm-lock.yaml
tsconfig.json
