
Cheerio

Cheerio is a server-side implementation of jQuery designed specifically for Node.js applications. It provides a familiar jQuery-like API for parsing, manipulating, and traversing HTML and XML documents without the overhead of a browser environment. Whether you're building web scrapers, processing HTML templates, or transforming markup programmatically, Cheerio makes working with HTML as intuitive as working with the DOM in the browser.

At its core, Cheerio solves the problem of server-side HTML manipulation. While browsers provide native DOM APIs for interacting with HTML, server-side JavaScript environments lack these capabilities. Cheerio bridges this gap by implementing a subset of jQuery's core methods in a lightweight package with no browser dependency. It is built on two established parsing libraries: parse5, a spec-compliant HTML parser that Cheerio uses by default, and htmlparser2, a faster and more forgiving parser that also handles XML. The result keeps the selector-and-method syntax that jQuery users already know.

What sets Cheerio apart is its focus on performance and simplicity. Unlike headless browser solutions that simulate an entire browser environment, Cheerio operates directly on the parsed document tree: it does not render the page, apply CSS, load external resources, or run scripts. This makes it much faster and lighter than a headless browser for HTML manipulation tasks, and it gives you a consistent API for HTML processing that behaves the same way wherever Node.js runs.

Key Features

Quick Start

Get started with Cheerio by installing it via npm:

npm install cheerio

Here's a simple example that demonstrates Cheerio's power for HTML manipulation:

import * as cheerio from 'cheerio';

// Load HTML content
const $ = cheerio.load(`
  <html>
    <head><title>My Page</title></head>
    <body>
      <h1 class="header">Welcome</h1>
      <div class="content">
        <p>Hello <span class="name">World</span>!</p>
        <ul class="list">
          <li>Item 1</li>
          <li>Item 2</li>
        </ul>
      </div>
    </body>
  </html>
`);

// Use jQuery-like selectors to find and modify elements
$('h1').text('Welcome to Cheerio!');
$('.name').text('Everyone');
$('.list').append('<li>Item 3</li>');
$('p').addClass('highlight');

// Extract data from elements
const title = $('title').text();
const items = $('.list li').map((i, el) => $(el).text()).get();

console.log('Page title:', title);
console.log('List items:', items);
console.log('Modified HTML:', $.html());

This example shows how Cheerio makes HTML manipulation intuitive:

  1. Loading HTML: The cheerio.load() function parses your HTML and returns a jQuery-like function ($)
  2. Selecting Elements: Use CSS selectors to target specific elements, just like in jQuery
  3. Modifying Content: Call methods like .text(), .addClass(), and .append() to modify the parsed document; each returns the selection, so calls can be chained
  4. Extracting Data: Use .map() followed by .get() to collect values from multiple elements into a plain array
  5. Outputting Results: Call $.html() to serialize the modified document back to an HTML string

If you have used jQuery before, most of this syntax carries over unchanged, so there is little new to learn before you can be productive with Cheerio.

When to Use Cheerio vs Alternatives

Choose Cheerio when:

Consider alternatives when:

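Cheerio vs Puppeteer/Playwright: Headless browsers can execute JavaScript and simulate user interactions, but they are much heavier and slower because they start and drive a real browser. Cheerio is a good fit when you only need to parse and manipulate HTML. It does not execute JavaScript, so content that a page renders client-side will not be present in what Cheerio sees.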

Cheerio vs Native DOM APIs: Browser environments provide native DOM manipulation, but server-side Node.js doesn't. Cheerio fills this gap with a familiar, jQuery-inspired interface.

Cheerio vs Regular Expressions: While regex can extract data from HTML, it's fragile and error-prone. Cheerio provides robust HTML parsing that handles edge cases and malformed markup gracefully.

Cheerio excels in scenarios where you need fast, reliable HTML processing without the complexity and resource requirements of a full browser environment. Its jQuery-compatible API makes it an excellent choice for developers who want powerful HTML manipulation capabilities with minimal learning overhead.