-
headless_chrome
Control Chrome programmatically
-
spider
A web crawler and scraper, building blocks for data curation workloads
-
pandoras_pot
Honeypot designed to send huge amounts of data to rude web scrapers
-
scrape-cli
Command-line HTML extraction tool powered by scrape-rs
-
titans
Blazingly Fast scraper
-
select
extract useful data from HTML documents, suitable for web scraping
-
supermarkdown-cli
CLI for supermarkdown HTML to Markdown conversion
-
crawn
web crawling and scraping
-
fa-to-letterboxd
webscraping tool to convert a FilmAffinity profile to Letterboxd's import format
-
eoka
Stealth browser automation for Rust. Puppeteer/Playwright alternative with anti-bot bypass.
-
spider-cloud-cli
The Spider Cloud CLI for web crawling and scraping
-
spider-pipeline
Pipeline implementations for the spider-lib web scraping framework
-
spider-lib
A Rust-based web scraping framework inspired by Scrapy (Python)
-
magpie-bird
eBird Target Bird Scraper
-
reasonkit-web
High-performance MCP server for browser automation, web capture, and content extraction. Rust-powered CDP client for AI agents.
-
webpuppet
Web browser programmatic automation and control library for research, testing, and workflow automation
-
spider-core
Core functionality for the spider-lib web scraping framework
-
reqwest-scraper
Web scraping integration with reqwest
-
extrablatt_v2
News, articles and text scraper
-
serp-sdk
A comprehensive, production-ready Rust SDK for SerpAPI with async support, type safety, and ergonomic APIs. Developed during the Realtime Search AI Hackathon powered by SerpAPI.
-
docbox-database
Docbox database structures, logic, and migrations
-
spider-middleware
Middleware implementations for the spider-lib web scraping framework
-
docbox-core
Docbox core business logic and functionality
-
docbox-storage
Docbox storage layer abstraction
-
silkworm-rs
Async-first web scraping framework (Rust port)
-
discord_rust_scraper
DiscordRustScraper is a powerful Discord data scraper built in Rust, designed to extract and format channel data for further analysis. It efficiently scrapes message history from specified…
-
docbox-search
Docbox multi-backend search abstraction
-
skyscraper
XPath for HTML web scraping
-
spider_worker
The fastest web crawler as a worker or proxy
-
refyne
Official Rust SDK for the Refyne API - LLM-powered web extraction
-
html_transpose
html table transpose library
-
snagger
Grab full text across ?page=N pagination with page count discovery
-
docbox-secrets
Docbox secret management abstraction
-
stud_ip_scraper
Blazingly fast 🚀 library for interacting with Stud.IP 📚
-
lastfm-edit
programmatic access to Last.fm's scrobble editing functionality via web scraping
-
ugc-scraper
Scraper for ugcleague.com
-
prehrajto-core
Core scraping library for prehraj.to
-
iocaine
The deadliest poison known to AI
-
halldyll-core
Core scraping engine for Halldyll - high-performance async web scraper for AI agents
-
uninews
A universal news scraper for extracting content from various news blogs and news sites
-
spider_cli
The fastest web crawler CLI written in Rust
-
shave
shaving data from websites
-
spider-downloader
Downloader component for the spider-lib web scraping framework
-
protozoa
A scraper for various anime websites
-
web-scrape
aids in scraping data from the web
-
typing_test
Typing speed test in rust
-
herolib-web
Web utilities for herolib - HTML/Markdown processing and web scraping
-
scraper-trail
Scraping framework and tools
-
gogoanime-scraper
A blazing fast anime scraper for GoGoAnime
-
docbox-web-scraper
Web scraping logic for docbox, document parsing, favicon resolution, ogp resolution
-
crawly
A lightweight async Web crawler in Rust, optimized for concurrent scraping while respecting robots.txt rules
-
webshot
A command-line tool for automated website screenshots and web scraping
-
knee_scraper
Recursive scraping & downloading media, optionally on word/phrase. 'AI CAPTCHA Solving', and Parses js content for keywords.
-
tagparser
A lightweight Rust library for parsing HTML tags with powerful filtering capabilities
-
rusty-scrap
HTML Scraper
-
progscrape
progscrape.com web application
-
bms_scraper
Package for scraping BMS
-
omnivore-cli
Universal web scraper and code extractor CLI - crawl websites, analyze repositories, build knowledge graphs
-
rustboxd
A Letterboxd web scraper and API client library written in Rust
-
recluse
A web crawler framework for Rust
-
spider_utils
Spider Web Crawler
-
ferrisfetcher
A cutting-edge, high-level web scraping library crafted in Rust
-
scoutlang
A web crawling programming language
-
pxscrape-conglomerate
Conglomerates a bunch of proxies scraped from some VPNs into a common format
-
dyer
designed for reliable, flexible and fast Request-Response based service, including data processing, web-crawling and so on, providing some friendly, flexible, comprehensive features without compromising speed
-
scrapr-core
web scraping library for Python
-
reposcrape
Repository scraper for presenting repositories on a personal website
-
proxy-types
Basic proxy types for my own uses
-
coma
lightweight command-line tool designed for crawling websites
-
pxscrape-urban
UrbanVPN API scraper
-
pxscrape-browsec
Browsec VPN scraper
-
headless_chrome_fork
Control Chrome programmatically
-
voyager
Web crawler and scraper
-
accessibility-tree
Accessibility tree binding CSS styles and vectors to elements. Used mainly for accessibility-rs crate.
-
yfp
A Yahoo finance scraper
-
scraprr
web scraping library for Python
-
crabler
Web scraper for Crabs
-
search_for_llms
A search tool suitable for LLMs, with structured and cleaned search results, written in Rust
-
scout-parser
A web crawling programming language
-
headless_chrome_new
Control Chrome programmatically
-
easy-scraper
HTML scraping library focused on ease of use
-
recursive_scraper
Constant-frequency recursive CLI web scraper with frequency, filtering, file directory, and many other options for scraping HTML, images and other files
-
progscrape-scrapers
progscrape.com service scrapers
-
turboscraper
A high-performance, concurrent web scraping framework for Rust with built-in support for retries, storage backends, and concurrent request handling
-
story-dl
Story web scraping
-
scout-lexer
A web crawling programming language
-
scrapr-bindings
web scraping library for Python
-
headless_chrome_xiaoai
Control Chrome programmatically
-
aniscraper
designed for efficient web scraping and data extraction. It simplifies the process of fetching, parsing, and extracting data from websites.
-
scout-interpreter
A web crawling programming language
-
shavecli
A command line interface for the shave library
-
progscrape-application
progscrape.com application logic
-
nu_plugin_selector
web scraping using css selector
-
swissknife-scraping-sdk
Web scraping SDK - Apify, Browser Use, Firecrawl
-
scrapelect
Interpreter for scrapelect, a CSS-inspired web scraping DSL
-
keyhunter
Check for leaked API keys and secrets on public websites
-
wappu
fast and flexible web scraping library for Rust, designed to efficiently navigate and extract data from websites. Perfect for data mining, content aggregation, and web automation tasks.
-
spider_scraper
A css scraper using html5ever
-
vocalolyrics
Lyrics scraper, primarily for Vocaloid content. By default, atwiki is used as the source. We plan to make other sources selectable, but that is not currently possible
-
frangipani
Scraping framework for rust
-
only_scraper
Only scrape webpages
-
barrido
discover paths in web applications
-
sivasbus
Sivas live public bus data scraper
-
quick_crawler
QuickCrawler is a Rust crate that provides a completely async, declarative web crawler with domain-specific request rate-limiting built-in
-
scihub-scraper-cli
CLI utility to scrape paper information from sci-hub
-
scout-json
JSON representation of ScoutLang AST
-
get-cookies
Get cookies from a pop-up window
-
html-ast
Construct and generate legal html string
-
omnivore-core
Core crawler and knowledge graph engine for Omnivore - web scraping, AI extraction, browser automation
-
favicon-scraper
A favicon scraper that just works
-
html-extractor
extracting data from HTML
-
ugc-scraper-types
Scraper for ugcleague.com - data types
-
scraper-main
The core framework xpath parsing
-
star-scraper
fetch info about GitHub stargazers
-
ntwrk
TODO
-
rezvrh_scraper
Bakalari scraper
-
jsdom
javascript dom parser for web scraping
-
stream_crawler
scraping web pages and extracting URLs and endpoints
-
nyxd-scraper-shared
Common crate for the sqlite and psql Nyxd blockchain scrapers
-
etwin_scraper_tools
Helper functions for scraper
-
web-scraper
that is used to get html from a website, and scrape the content in it
-
apify-client
Typed wrapper for Apify API
-
bvdl
Scrape product information from Bazaarvoice
-
fsolver
wrapper around the FlareSolverr API
-
olx
extracting product information from OLX (www.olx.bg)
-
scr
Most simple site parser and file loader
-
rand_agents
generating random user agent strings
-
kirjat-rs
prices for finnish textbooks from multiple stores
-
aliexpress-scraper
An aliexpress scraper using requests
-
lookout
👀 Declarative, asynchronous scraper utility
-
magnetfinder
Multi-threaded CLI torrent scraper & aggregator
-
spider-macro
Proc-macros for the spider-lib web scraping framework
-
seedframe_webscraper
Webscraper loader integration crate for SeedFrame
-
rustfm-scraper
Scrapes listening history from Last.fm and stores it in a file
-
diffbot
A client library for the Diffbot API
-
fs_scraper
A scraper for FjalorShqip
-
voight_kampff
A user agent checker
-
sws-tree
Slotmap-backed ID-tree
-
twitter-scraper
Twitter scraper, no login required. FOR EDUCATIONAL PURPOSES ONLY
-
canadian_news_scraper
that provides an api which scrapes 3 Canadian News Sites and returns the data
-
url-scraper
HTML URL scraper
-
web-scraper-flows
Web scraper integration for flows.network
-
scrpr
Basic rust scraper and data selector
-
wco-rs
Play Cartoon & Animes from Wcostream
-
scrapman
A high-level declarative web scraping framework
-
globescraper
Scraper lib for Globe Explorer AI engine
-
flatcrawl-crawler
set of webpage crawlers. New crawlers can be easily configured and the output can be written to an AMQP queue.