New Website #2950

Merged
merged 97 commits into from
Feb 13, 2023
97 commits
54bc2be
Initial docusaurus setup
fb55 Aug 31, 2022
9919090
Enable algolia
fb55 Sep 2, 2022
39c4a1c
Add footer link
fb55 Sep 2, 2022
66d6153
Merge branch 'main' into docusaurus
fb55 Dec 11, 2022
3a6d14b
Enable dependabot
fb55 Dec 11, 2022
7a972f8
bump deps
fb55 Dec 11, 2022
b0749a7
Update docusaurus.config.js
fb55 Dec 11, 2022
679a696
Merge branch 'main' into docusaurus
fb55 Dec 23, 2022
d6933eb
Update authors.yml
fb55 Dec 23, 2022
4b48670
`npm audit fix`
fb55 Dec 30, 2022
ee5db9d
Merge branch 'main' into docusaurus
fb55 Dec 30, 2022
0cd0336
Set prettier `"proseWrap": "always"`
fb55 Dec 30, 2022
436ee8c
Add documentation comment to Cheerio class
fb55 Dec 30, 2022
e22a1d5
First part of guides
fb55 Dec 30, 2022
6e1a5e3
Use batteries for API docs, categories for loading fns
fb55 Dec 30, 2022
4f3736f
Improve batteries docs
fb55 Dec 30, 2022
9f17ec2
Loading guide fixes
fb55 Dec 30, 2022
06a2112
Add guide for selecting elements
fb55 Dec 30, 2022
237b332
Improve docusaurus config
fb55 Dec 30, 2022
c4e2b50
Move XML namespace comment to guide
fb55 Dec 30, 2022
6855f14
Add keywords
fb55 Dec 30, 2022
bf3fdd0
Fix usernames
fb55 Dec 30, 2022
976b087
Add `extract` guide
fb55 Dec 30, 2022
f9cce9f
Add metadata to guides
fb55 Dec 30, 2022
7ad3b77
Rename guides
fb55 Dec 30, 2022
00790f1
Add traversing guide
fb55 Dec 30, 2022
f970ba5
Small fixes
fb55 Dec 30, 2022
484d8e0
Update traversing.md
fb55 Dec 31, 2022
6d8e6d6
Add manipulation guide
fb55 Dec 31, 2022
ac9ae78
Add API links to manipulation
fb55 Dec 31, 2022
561e779
Update colors
fb55 Dec 31, 2022
b0892c1
Update tagline
fb55 Dec 31, 2022
8acf865
Move fragment mode tip to loading guide
fb55 Dec 31, 2022
03cc928
Customize website
fb55 Dec 31, 2022
c909e9f
Format .tsx files
fb55 Dec 31, 2022
72b63b3
Move extract tutorial to advanced dir
fb55 Jan 1, 2023
c95712f
Create extending-cheerio.md
fb55 Jan 1, 2023
ffd36b3
Update logo.svg
fb55 Jan 1, 2023
088925b
Remove plugin section from README
fb55 Jan 1, 2023
ccf16be
Disable `no-missing-require` for website
fb55 Jan 1, 2023
f64978f
Update logo.svg
fb55 Jan 1, 2023
0c3cffd
Add `@docusaurus/remark-plugin-npm2yarn`
fb55 Jan 8, 2023
2b7c26a
Remove custom sidebar
fb55 Jan 9, 2023
28a13cb
Make `Parse5Options` an interface to have better docs
fb55 Jan 9, 2023
00aaebb
Add `@docusaurus/theme-live-codeblock`, update traversing
fb55 Jan 9, 2023
2060a4d
Use OpenMoji graphics for landing page
fb55 Jan 10, 2023
18a2bec
Better logo
fb55 Jan 10, 2023
f002525
Make feature SVGs adjust to color scheme
fb55 Jan 11, 2023
2ee86a2
Improve copy
fb55 Jan 12, 2023
8250c41
Add custom fonts
fb55 Jan 12, 2023
6e87085
Update favicon.ico
fb55 Jan 12, 2023
4d573e3
Merge branch 'main' into docusaurus
fb55 Jan 12, 2023
cfd6cb3
Remove blogpost stubs
fb55 Jan 12, 2023
21f0e40
Remove docs stubs
fb55 Jan 19, 2023
c4df963
Remove blog link (for now)
fb55 Jan 20, 2023
7c7c822
Move typedoc.json into docusaurus config
fb55 Jan 20, 2023
5756a92
Add TypeDoc plugin for better categories
fb55 Jan 22, 2023
9f3368c
Fix `is` category
fb55 Jan 22, 2023
b39b665
Merge branch 'main' into docusaurus
fb55 Jan 22, 2023
883910c
Start guide on configuring Cheerio
fb55 Jan 23, 2023
5afb95c
Move odd way of loading HTML to CheerioAPI docs
fb55 Jan 23, 2023
b4deede
Update Readme.md
fb55 Jan 23, 2023
a1ed351
Delete History.md
fb55 Jan 23, 2023
83251d0
Finish configuring guide
fb55 Jan 23, 2023
ff4ee06
Write tag names as `<tagname>`
fb55 Jan 23, 2023
3c61234
Add short descriptions for guides
fb55 Jan 23, 2023
3740a2f
Update load.ts
fb55 Jan 23, 2023
8ddc062
Add first draft of announcement blogpost
fb55 Jan 23, 2023
9dd0593
Merge branch 'main' into docusaurus
fb55 Jan 23, 2023
335e65d
Update GitHub pages workflow to build new website
fb55 Jan 23, 2023
b2c2fc8
Update extract.md
fb55 Jan 24, 2023
129592b
Merge branch 'main' into docusaurus
fb55 Jan 25, 2023
e164746
Bump typedoc
fb55 Jan 26, 2023
4011c5f
Bump docusaurus
fb55 Jan 28, 2023
7205ad2
Update src/cheerio.ts
fb55 Jan 28, 2023
a40518a
Add deprecated category
fb55 Jan 31, 2023
f3662fb
Merge branch 'docusaurus' of https://github.com/cheeriojs/cheerio int…
fb55 Jan 31, 2023
b6f35b2
Bump docusaurus
fb55 Feb 8, 2023
3857668
Update Readme.md
fb55 Feb 11, 2023
2b30bc0
Merge branch 'main' into docusaurus
fb55 Feb 11, 2023
488fe03
Bump typedoc
fb55 Feb 11, 2023
e433c71
Add XML to the tagline
fb55 Feb 11, 2023
3088d9b
Improve intro blog post
fb55 Feb 11, 2023
0fbdf34
Add announcement bar
fb55 Feb 11, 2023
bb6df69
Add @docusaurus/plugin-client-redirects
fb55 Feb 11, 2023
5ea184f
Improve redirects
fb55 Feb 11, 2023
323413a
Fix static method categories
fb55 Feb 11, 2023
0c56dd3
Update orange-c.svg
fb55 Feb 11, 2023
3975bf3
Update favicon.ico
fb55 Feb 11, 2023
30bcd41
Update orange-c.svg
fb55 Feb 12, 2023
236c4cc
Update package-lock.json
fb55 Feb 13, 2023
1989e9f
Update favicon.ico
fb55 Feb 13, 2023
8b92b7a
Rename selectors.md to selecting.md
fb55 Feb 13, 2023
ea4335e
New intro
fb55 Feb 13, 2023
fba1a59
Update loading.md
fb55 Feb 13, 2023
0d2d2a6
Merge branch 'main' into docusaurus
fb55 Feb 13, 2023
a57bee6
Update package-lock.json
fb55 Feb 13, 2023
10 changes: 10 additions & 0 deletions .github/dependabot.yml
@@ -6,6 +6,16 @@ updates:
interval: daily
open-pull-requests-limit: 10
versioning-strategy: increase
- package-ecosystem: npm
directory: '/website'
schedule:
interval: daily
open-pull-requests-limit: 4
versioning-strategy: increase
# TODO: We cannot update React to v18. See https://github.com/facebook/docusaurus/issues/7264
ignore:
- dependency-name: 'react'
versions: ['18.x']
- package-ecosystem: 'github-actions'
directory: '/'
schedule:
6 changes: 4 additions & 2 deletions .github/workflows/site.yml
@@ -42,12 +42,14 @@ jobs:
uses: actions/configure-pages@v3
- name: Install dependencies
run: npm ci
- name: Install website dependencies
run: cd website && npm ci
- name: Build docs
run: npm run build:docs
run: cd website && npm run build
- name: Upload artifact
uses: actions/upload-pages-artifact@v1
with:
path: ./docs
path: ./website/build

# Deployment job
deploy:
5 changes: 3 additions & 2 deletions .gitignore
@@ -1,7 +1,8 @@
node_modules
npm-debug.log
.DS_Store
/.netlify/
.docusaurus
.cache-loader
/coverage
/docs
/lib
/website/docs/api
656 changes: 0 additions & 656 deletions History.md

This file was deleted.

198 changes: 22 additions & 176 deletions Readme.md
@@ -1,6 +1,6 @@
<h1 align="center">cheerio</h1>

<h5 align="center">Fast, flexible & lean implementation of core jQuery designed specifically for the server.</h5>
<h5 align="center">The fast, flexible, and elegant library for parsing and manipulating HTML and XML.</h5>

<div align="center">
<a href="https://github.com/cheeriojs/cheerio/actions/workflows/ci.yml">
@@ -32,66 +32,34 @@ $.html();
//=> <html><head></head><body><h2 class="title welcome">Hello there!</h2></body></html>
```

## Note

We are currently working on the 1.0.0 release of cheerio on the `main` branch.
The source code for the last published version, `0.22.0`, can be found
[here](https://github.com/cheeriojs/cheerio/tree/aa90399c9c02f12432bfff97b8f1c7d8ece7c307).

## Installation

`npm install cheerio`

## Features

**&#10084; Familiar syntax:** Cheerio implements a subset of core jQuery.
Cheerio removes all the DOM inconsistencies and browser cruft from the jQuery
library, revealing its truly gorgeous API.
**&#10084; Proven syntax:** Cheerio implements a subset of core jQuery. Cheerio
removes all the DOM inconsistencies and browser cruft from the jQuery library,
revealing its truly gorgeous API.

**&#991; Blazingly fast:** Cheerio works with a very simple, consistent DOM
model. As a result parsing, manipulating, and rendering are incredibly
efficient.

**&#10049; Incredibly flexible:** Cheerio wraps around
[parse5](https://github.com/inikulin/parse5) parser and can optionally use
@FB55's forgiving [htmlparser2](https://github.com/fb55/htmlparser2/). Cheerio
[parse5](https://github.com/inikulin/parse5) for parsing HTML and can optionally
use the forgiving [htmlparser2](https://github.com/fb55/htmlparser2/). Cheerio
can parse nearly any HTML or XML document. Cheerio works in both browser and
Node environments.

## Cheerio is not a web browser

Cheerio parses markup and provides an API for traversing/manipulating the
resulting data structure. It does not interpret the result as a web browser
does. Specifically, it does _not_ produce a visual rendering, apply CSS, load
external resources, or execute JavaScript which is common for a SPA (single page
application). This makes Cheerio **much, much faster than other solutions**. If
your use case requires any of this functionality, you should consider browser
automation software like [Puppeteer](https://github.com/puppeteer/puppeteer) and
[Playwright](https://github.com/microsoft/playwright) or DOM emulation projects
like [JSDom](https://github.com/jsdom/jsdom).
server environments.

## API

### Markup example we'll be using:

```html
<ul id="fruits">
<li class="apple">Apple</li>
<li class="orange">Orange</li>
<li class="pear">Pear</li>
</ul>
```

This is the HTML markup we will be using in all of the API examples.

### Loading

First you need to load in the HTML. This step in jQuery is implicit, since
jQuery operates on the one, baked-in DOM. With Cheerio, we need to pass in the
HTML document.

This is the _preferred_ method:

```js
// ES6 or TypeScript:
import * as cheerio from 'cheerio';
@@ -105,107 +73,20 @@ $.html();
//=> <html><head></head><body><ul id="fruits">...</ul></body></html>
```

Similar to web browser contexts, `load` will introduce `<html>`, `<head>`, and
`<body>` elements if they are not already present. You can set `load`'s third
argument to `false` to disable this.

```js
const $ = cheerio.load('<ul id="fruits">...</ul>', null, false);

$.html();
//=> '<ul id="fruits">...</ul>'
```

Optionally, you can also load in the HTML by passing the string as the context:

```js
$('ul', '<ul id="fruits">...</ul>');
```

Or as the root:

```js
$('li', 'ul', '<ul id="fruits">...</ul>');
```

If you need to modify parsing options for XML input, you may pass an extra
object to `.load()`:

```js
const $ = cheerio.load('<ul id="fruits">...</ul>', {
xml: {
xmlMode: true,
withStartIndices: true,
},
});
```

The options in the `xml` object are taken directly from
[htmlparser2](https://github.com/fb55/htmlparser2/wiki/Parser-options),
therefore any options that can be used in `htmlparser2` are valid in cheerio as
well. When `xml` is set, the default options are:

```js
{
xmlMode: true,
decodeEntities: true, // Decode HTML entities.
withStartIndices: false, // Add a `startIndex` property to nodes.
withEndIndices: false, // Add an `endIndex` property to nodes.
}
```

For a full list of options and their effects, see
[domhandler](https://github.com/fb55/DomHandler) and
[htmlparser2's options](https://github.com/fb55/htmlparser2/wiki/Parser-options).

#### Using `htmlparser2`

Cheerio ships with two parsers, `parse5` and `htmlparser2`. The former is the
default for HTML, the latter the default for XML.

Some users may wish to parse markup with the `htmlparser2` library, and
traverse/manipulate the resulting structure with Cheerio. This may be the case
for those upgrading from pre-1.0 releases of Cheerio (which relied on
`htmlparser2`), for those dealing with invalid markup (because `htmlparser2` is
more forgiving), or for those operating in performance-critical situations
(because `htmlparser2` may be faster in some cases). Note that "more forgiving"
means `htmlparser2` has error-correcting mechanisms that aren't always a match
for the standards observed by web browsers. This behavior may be useful when
parsing non-HTML content.

To support these cases, `load` also accepts a `htmlparser2`-compatible data
structure as its first argument. Users may install `htmlparser2`, use it to
parse input, and pass the result to `load`:

```js
// Usage as of htmlparser2 version 6:
const htmlparser2 = require('htmlparser2');
const dom = htmlparser2.parseDocument(document, options);

const $ = cheerio.load(dom);
```

If you want to save some bytes, you can use Cheerio's _slim_ export, which
always uses `htmlparser2`:

```js
const cheerio = require('cheerio/lib/slim');
```

### Selectors

Cheerio's selector implementation is nearly identical to jQuery's, so the API is
very similar.
Once you've loaded the HTML, you can use jQuery-style selectors to find elements
within the document.

#### \$( selector, [context], [root] )

`selector` searches within the `context` scope which searches within the `root`
scope. `selector` and `context` can be a string expression, DOM Element, array
of DOM elements, or cheerio object. `root` is typically the HTML document
string.
of DOM elements, or cheerio object. `root`, if provided, is typically the HTML
document string.

This selector method is the starting point for traversing and manipulating the
document. Like jQuery, it's the primary method for selecting elements in the
document. Like in jQuery, it's the primary method for selecting elements in the
document.

```js
@@ -219,16 +100,6 @@ $('li[class=orange]').html();
//=> Orange
```

##### XML Namespaces

You can select with XML Namespaces but
[due to the CSS specification](https://www.w3.org/TR/2011/REC-css3-selectors-20110929/#attribute-selectors),
the colon (`:`) needs to be escaped for the selector to be valid.

```js
$('[xml\\:id="main"]');
```
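
As a minimal sketch of the escaping rule above: inside a JavaScript string literal, `\\` yields a single backslash, so the selector engine receives `[xml\:id="main"]`, with the attribute selector closed by `]`. The helper names `escapeNamespaceColon` and `namespacedAttributeSelector` are hypothetical and are not part of Cheerio; the snippet uses plain JavaScript only and does not call the library.

```js
// Hypothetical helper (not part of Cheerio): escape every colon in an
// attribute name so the name is valid inside a CSS attribute selector.
function escapeNamespaceColon(attributeName) {
  return attributeName.replace(/:/g, '\\:');
}

// Hypothetical helper: build an attribute selector for a namespaced attribute.
function namespacedAttributeSelector(attributeName, value) {
  return `[${escapeNamespaceColon(attributeName)}="${value}"]`;
}

const selector = namespacedAttributeSelector('xml:id', 'main');
console.log(selector); // prints [xml\:id="main"]
```

The resulting string could then be passed to `$` in the same way as the hand-written selector shown above.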

### Rendering

When you're ready to render the document, you can call the `html` method on the
@@ -250,47 +121,22 @@ $.root().html();

If you want to render the
[`outerHTML`](https://developer.mozilla.org/en-US/docs/Web/API/Element/outerHTML)
of a selection, you can use the `html` utility function:
of a selection, you can use the `outerHTML` prop:

```js
cheerio.html($('.pear'));
$('.pear').prop('outerHTML');
//=> <li class="pear">Pear</li>
```

You may also render the text content of a Cheerio object using the `text` static
You may also render the text content of a Cheerio object using the `text`
method:

```js
const $ = cheerio.load('This is <em>content</em>.');
cheerio.text($('body'));
$('body').text();
//=> This is content.
```

### Plugins

Once you have loaded a document, you may extend the prototype or the equivalent
`fn` property with custom plugin methods:

```js
const $ = cheerio.load('<html><body>Hello, <b>world</b>!</body></html>');
$.prototype.logHtml = function () {
console.log(this.html());
};

$('body').logHtml(); // logs "Hello, <b>world</b>!" to the console
```

If you're using TypeScript, you should add a type definition for your new
method:

```ts
declare module 'cheerio' {
interface Cheerio<T> {
logHtml(this: Cheerio<T>): void;
}
}
```

### The "DOM Node" object

Cheerio collections are made up of objects that bear some resemblance to
@@ -407,8 +253,8 @@ support for Cheerio and help us maintain and improve this open source project.
This library stands on the shoulders of some incredible developers. A special
thanks to:

**&#8226; @FB55 for node-htmlparser2 & CSSSelect:** Felix has a knack for
writing speedy parsing engines. He completely re-wrote both @tautologistic's
**&#8226; @fb55 for htmlparser2 & css-select:** Felix has a knack for writing
speedy parsing engines. He completely re-wrote both @tautologistic's
`node-htmlparser` and @harry's `node-soupselect` from the ground up, making both
of them much faster and more flexible. Cheerio would not be possible without his
foundational work.
@@ -418,10 +264,10 @@ despite dealing with all the browser inconsistencies the code base is extremely
clean and easy to follow. Much of cheerio's implementation and documentation is
from jQuery. Thanks guys.

**&#8226; @visionmedia:** The style, the structure, the open-source"-ness" of
this library comes from studying TJ's style and using many of his libraries.
This dude consistently pumps out high-quality libraries and has always been more
than willing to help or answer questions. You rock TJ.
**&#8226; @tj:** The style, the structure, the open-source"-ness" of this
library comes from studying TJ's style and using many of his libraries. This
dude consistently pumps out high-quality libraries and has always been more than
willing to help or answer questions. You rock TJ.

## License
