# Defuddle
Defuddle is an open-source content extraction library by [[Steph Ango]] (kepano) that strips away clutter from web pages — ads, navigation, sidebars, comments — and returns clean HTML or Markdown with extracted metadata. Originally built for the [[Obsidian Web Clipper]] browser extension, it works in browsers, Node.js, and CLI environments.
## What It Does
- Extracts the primary content of a web page, removing noise (headers, footers, sidebars, comments)
- Outputs clean **HTML or Markdown**
- Extracts metadata: author, publication date, description, schema.org data
- Standardizes HTML elements: headings, code blocks, footnotes, math notation, callouts
- Uses a page's mobile styles to infer which elements are non-essential
- Removes fewer uncertain elements than alternatives like Readability
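To make the idea of "removing noise" concrete, here is a minimal sketch that prunes clutter from a toy page tree. It is not Defuddle's algorithm: the `PageNode` type, the `CLUTTER_TAGS` list, and both helper functions are hypothetical names invented for this illustration, and a real extractor works on a DOM with far richer heuristics.

```typescript
// Toy document tree. Real pages are DOM trees; this is only an illustration.
interface PageNode {
  tag: string;
  text?: string;
  children?: PageNode[];
}

// Hypothetical clutter list, NOT Defuddle's actual rules.
const CLUTTER_TAGS = new Set(["nav", "header", "footer", "aside"]);

// Return a copy of the tree with clutter subtrees removed (null if the node itself is clutter).
function pruneClutter(node: PageNode): PageNode | null {
  if (CLUTTER_TAGS.has(node.tag)) return null;
  const children = (node.children ?? [])
    .map((child) => pruneClutter(child))
    .filter((child): child is PageNode => child !== null);
  return { ...node, children };
}

// Concatenate the text of the remaining nodes in document order.
function collectText(node: PageNode): string {
  const own = node.text ? [node.text] : [];
  const nested = (node.children ?? [])
    .map((child) => collectText(child))
    .filter((text) => text.length > 0);
  return [...own, ...nested].join("\n");
}

const page: PageNode = {
  tag: "body",
  children: [
    { tag: "nav", text: "Home | About" },
    {
      tag: "article",
      children: [
        { tag: "h1", text: "Title" },
        { tag: "p", text: "Body text." },
      ],
    },
    { tag: "footer", text: "Copyright" },
  ],
};

const cleaned = pruneClutter(page);
const cleanedText = cleaned ? collectText(cleaned) : "";
console.log(cleanedText);
```

Only the `article` subtree survives here, so the navigation and footer text never reach the output. A tag list alone is far too blunt for real pages, which is why extractors combine several signals.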
## Usage
Available as:
- **Browser library**: drop-in script
- **Node.js module**: works with JSDOM or linkedom for server-side use
- **CLI**: `defuddle parse <source>`, where the source is a URL or a local HTML file, with flags for Markdown output, JSON metadata, and debug mode
```bash
# Example CLI usage: fetch a URL and print the extracted content as Markdown
npx defuddle parse https://example.com --markdown
```
Install via npm:
```bash
npm install defuddle
```
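As a sketch of what a consumer might do with the CLI's JSON output, the snippet below turns a JSON string into a Markdown note with YAML front matter. The field names (`title`, `author`, `published`, `description`, `content`) are assumptions based on the metadata listed above, not a confirmed schema, and `ExtractedPage`, `yamlString`, and `toNote` are hypothetical helpers; check the official documentation for the real output shape.

```typescript
// Assumed subset of the JSON output; these field names are assumptions, not a confirmed schema.
interface ExtractedPage {
  title?: string;
  author?: string;
  published?: string;
  description?: string;
  content: string;
}

// JSON string syntax is valid YAML double-quoted syntax, so this safely quotes
// values that contain colons or quotes.
function yamlString(value: string): string {
  return JSON.stringify(value);
}

// Build a Markdown note: YAML front matter from the metadata, then the content.
function toNote(json: string): string {
  const page = JSON.parse(json) as ExtractedPage;
  const fields: Array<[string, string | undefined]> = [
    ["title", page.title],
    ["author", page.author],
    ["published", page.published],
    ["description", page.description],
  ];
  const frontMatter = fields
    // Skip metadata fields that are missing or empty.
    .filter(
      (entry): entry is [string, string] =>
        typeof entry[1] === "string" && entry[1].length > 0
    )
    .map(([key, value]) => `${key}: ${yamlString(value)}`);
  return ["---", ...frontMatter, "---", "", page.content].join("\n");
}

// Hand-written sample input standing in for real extractor output.
const sample = JSON.stringify({
  title: "Hello: world",
  author: "Jane Doe",
  content: "# Hello\n\nBody.",
});
const note = toNote(sample);
console.log(note);
```

Missing metadata is omitted rather than written as empty keys, which keeps the front matter tidy when a page exposes only some fields.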
## Context
Steph Ango created Defuddle as the extraction engine for [[Obsidian Web Clipper]]. It is designed as a more lenient alternative to Mozilla Readability: when in doubt, it keeps content rather than stripping it.
## References
- Official website: https://defuddle.md
- Pricing: https://defuddle.md/pricing
- Documentation: https://defuddle.md/docs
- Playground: https://defuddle.md/playground
- NPM package: https://www.npmjs.com/package/defuddle
- Source code: https://github.com/kepano/defuddle
- Terms of service: https://defuddle.md/terms
- Privacy policy: https://defuddle.md/privacy
## Related
- [[Steph Ango]]
- [[summarize (CLI)]]
- [[Obsidian Web Clipper]]