Cheerio
Cheerio is a library for parsing, traversing, and manipulating HTML and XML in Node.js. It implements a subset of core jQuery, so you get a familiar API without the overhead of a browser environment. Whether you're building web scrapers, processing HTML templates, or transforming markup programmatically, Cheerio makes working with HTML on the server feel much like working with jQuery in the browser.
At its core, Cheerio solves the problem of server-side HTML manipulation. Browsers provide native DOM APIs for interacting with HTML, but Node.js does not ship a DOM. Cheerio bridges this gap by implementing jQuery's most useful methods in a lightweight package. It is built on established parsers: parse5 handles HTML by default, and htmlparser2 handles XML and is available as an alternative HTML parser. The result is fast, standards-aware parsing behind a syntax many JavaScript developers already know.
What sets Cheerio apart is its focus on performance and simplicity. Unlike headless browser solutions that simulate an entire browser environment, Cheerio operates directly on parsed HTML structures, making it incredibly fast for HTML manipulation tasks. It removes browser-specific complexities and inconsistencies, giving you a clean, consistent API for HTML processing that works reliably across different environments.
Key Features
jQuery-Compatible API — Use familiar selectors like $('.class'), $('#id'), and $('element') with methods you already know, including .find(), .text(), .html(), and .attr()
Blazing Fast Performance — Built on optimized parsers with no browser overhead, making it ideal for high-throughput applications like web scraping and batch HTML processing
Flexible HTML Parsing — Handles both well-formed and malformed HTML gracefully, with support for XML mode when strict parsing is needed
Server-Side Optimized — Designed specifically for Node.js environments with no browser dependencies or DOM inconsistencies to worry about
Streaming Support — Process HTML from various sources including strings, buffers, URLs, and streams for maximum flexibility
TypeScript Ready — Full TypeScript support with comprehensive type definitions for better development experience and code safety
Lightweight Footprint — Minimal dependencies and small bundle size make it perfect for serverless functions and resource-constrained environments
Memory Efficient — Direct manipulation of parsed HTML structures without the overhead of maintaining a full browser context
Quick Start
Get started with Cheerio by installing it via npm:
npm install cheerio
Here's a simple example that demonstrates Cheerio's power for HTML manipulation:
import * as cheerio from 'cheerio';
// Load HTML content
const $ = cheerio.load(`
<html>
<head><title>My Page</title></head>
<body>
<h1 class="header">Welcome</h1>
<div class="content">
<p>Hello <span class="name">World</span>!</p>
<ul class="list">
<li>Item 1</li>
<li>Item 2</li>
</ul>
</div>
</body>
</html>
`);
// Use jQuery-like selectors to find and modify elements
$('h1').text('Welcome to Cheerio!');
$('.name').text('Everyone');
$('.list').append('<li>Item 3</li>');
$('p').addClass('highlight');
// Extract data from elements
const title = $('title').text();
const items = $('.list li').map((i, el) => $(el).text()).get();
console.log('Page title:', title);
console.log('List items:', items);
console.log('Modified HTML:', $.html());
This example shows how Cheerio makes HTML manipulation intuitive:
- Loading HTML — The cheerio.load() function parses your HTML and returns a jQuery-like function ($)
- Selecting Elements — Use CSS selectors to target specific elements, just like in jQuery
- Modifying Content — Chain methods like .text(), .addClass(), and .append() to modify your HTML
- Extracting Data — Use methods like .map() to extract information from multiple elements
- Outputting Results — Call $.html() to get the final modified HTML string
The familiar jQuery syntax means there's virtually no learning curve if you've used jQuery before, making Cheerio incredibly approachable for developers of all skill levels.
When to Use Cheerio vs Alternatives
Choose Cheerio when:
- You need to parse and manipulate HTML or XML server-side
- Performance is critical and you're processing many documents
- You want a familiar jQuery-like syntax
- You're building web scrapers or data extraction tools
- You need to transform HTML templates or generate static content
- You're working in resource-constrained environments like serverless functions
Consider alternatives when:
- You need to execute JavaScript within pages (use Puppeteer or Playwright)
- You're working with complex SPAs that require JavaScript rendering
- You need to simulate user interactions like clicks and form submissions
- You rely on browser APIs that page scripts expect, such as localStorage or fetch
Cheerio vs Puppeteer/Playwright: While headless browsers can execute JavaScript and simulate user interactions, they're much heavier and slower. Cheerio is perfect when you only need HTML parsing and manipulation without JavaScript execution.
Cheerio vs Native DOM APIs: Browser environments provide native DOM manipulation, but server-side Node.js doesn't. Cheerio fills this gap with a familiar, jQuery-inspired interface.
Cheerio vs Regular Expressions: While regex can extract data from HTML, it's fragile and error-prone. Cheerio provides robust HTML parsing that handles edge cases and malformed markup gracefully.
Cheerio excels in scenarios where you need fast, reliable HTML processing without the complexity and resource requirements of a full browser environment. Its jQuery-compatible API makes it an excellent choice for developers who want powerful HTML manipulation capabilities with minimal learning overhead.