URL READER

This project reads the content of one or more URLs and returns the title, length, HTML, text, Markdown, and excerpt of each page.

Requires Node.js >=20.11.0.

Installation

yarn add url-reader
# or npm install url-reader

Usage

import URLReader from 'url-reader';

const reader = new URLReader();
await reader.init();

const results = await reader.read({
  urls: ['https://www.google.com'],
  timeout: 10000, // ms, default: 60000
  enableMarkdown: false, // default: true
  runScripts: 'dangerously', // runs scripts embedded in the HTML and fetches remote resources; disabled by default
});

Parsed Result:

interface IReaderResult {
  title: string;
  length: number;
  html: string;
  text: string;
  markdown?: string;
  excerpt: string;
}
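
As a rough illustration of consuming a parsed result, the sketch below redeclares the interface so that it stands alone and adds two helpers, `bestContent` and `summarize`. Both helpers are hypothetical and not part of url-reader. It assumes `markdown` is absent when `enableMarkdown` is false, which the optional field suggests but the text above does not state outright.

```typescript
// Copied from the interface above so this sketch is self-contained.
interface IReaderResult {
  title: string;
  length: number;
  html: string;
  text: string;
  markdown?: string;
  excerpt: string;
}

// Hypothetical helper: prefer Markdown when it is present,
// otherwise fall back to the plain-text content.
function bestContent(result: IReaderResult): string {
  return result.markdown ?? result.text;
}

// Hypothetical helper: a one-line summary built only from documented fields.
function summarize(result: IReaderResult): string {
  return `${result.title} [length: ${result.length}] ${result.excerpt}`;
}
```

Falling back to `text` keeps callers working whether or not Markdown conversion was enabled for a given read.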

Server

Start the server

git clone https://github.com/yokingma/url-reader.git
cd url-reader

# listens on port 3030 by default
yarn install && yarn run start

API

GET /reader?url=https://www.google.com

POST /reader
Body:
{
  "urls": ["https://www.google.com", "https://www.bing.com"]
}
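
One possible way to call these endpoints from TypeScript, using the global `fetch` available in Node 20. The base URL `http://localhost:3030` assumes a server running locally on the default port. `buildReaderUrl`, `readOne`, and `readMany` are hypothetical helper names, and sending the POST body as JSON with a `Content-Type: application/json` header is an assumption. The response shape is not documented here, so the helpers return `unknown`.

```typescript
// Assumed base URL: a server started locally on the default port 3030.
const BASE_URL = 'http://localhost:3030';

// Build the GET /reader URL. URLSearchParams percent-encodes the target URL,
// so a query string inside the target cannot be confused with our own.
function buildReaderUrl(target: string, base: string = BASE_URL): string {
  const endpoint = new URL('/reader', base);
  endpoint.searchParams.set('url', target);
  return endpoint.toString();
}

// GET /reader?url=... for a single page.
async function readOne(target: string): Promise<unknown> {
  const res = await fetch(buildReaderUrl(target));
  if (!res.ok) {
    throw new Error(`GET /reader failed with status ${res.status}`);
  }
  return res.json();
}

// POST /reader with several URLs in the body (JSON encoding is assumed).
async function readMany(urls: string[]): Promise<unknown> {
  const res = await fetch(`${BASE_URL}/reader`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ urls }),
  });
  if (!res.ok) {
    throw new Error(`POST /reader failed with status ${res.status}`);
  }
  return res.json();
}
```

With the server running, `await readOne('https://www.google.com')` would issue the GET request, and `await readMany(['https://www.google.com', 'https://www.bing.com'])` the POST request.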

Docker

docker build -t urlreader . # urlreader is your image's tag name

The service will listen on port 3030.

Tips

Puppeteer: when you install Puppeteer, it automatically downloads a recent version of Chrome for Testing (~170MB macOS, ~282MB Linux, ~280MB Windows) and a chrome-headless-shell binary.

Troubleshooting

Install error with Puppeteer

Error [ERR_TLS_CERT_ALTNAME_INVALID]: Hostname/IP does not match certificate's altnames...

Remove the .npmrc file and reinstall.
