#web-scraping

  1. headless_chrome

    Control Chrome programmatically

    v1.0.21 69K #dev-tools #puppeteer #headless-browser #web-testing #web-scraping #fetching
  2. spider

    A web crawler and scraper, building blocks for data curation workloads

    v2.44.40 1.9K #web-scraping #headless-chrome #scraper
  3. pandoras_pot

    Honeypot designed to send huge amounts of data to rude web scrapers

    v0.7.2 1.4K #web-scraping #web-server #honeypot #web
  4. scrape-cli

    Command-line HTML extraction tool powered by scrape-rs

    v0.2.2 #css-selectors #html-parser #simd-accelerated #extract #nodejs #html5 #wasm #command-line-tool #batch-processing #web-scraping
  5. titans

    Blazingly Fast scraper

    v0.4.8 1.8K #youtube-downloader #youtube #web-scraping #scraper
  6. select

    extract useful data from HTML documents, suitable for web scraping

    v0.6.1 33K #web-scraping #html #document #extract #node #descendant
  7. supermarkdown-cli

    CLI for supermarkdown HTML to Markdown conversion

    v0.0.5 #render-markdown #convert-html #html-markdown-converter #nodejs #native #web-scraping
  8. crawn

    web crawling and scraping

    v0.2.0 #web-scraping #web-crawler #async #concurrency #cli-concurrency
  9. fa-to-letterboxd

    webscraping tool to convert a FilmAffinity profile to Letterboxd's import format

    v0.1.1 #web-scraping #letterboxd #filmaffinity #chaser-oxide
  10. eoka

    Stealth browser automation for Rust. Puppeteer/Playwright alternative with anti-bot bypass.

    v0.3.6 #web-scraping #playwright #puppeteer #selenium #api-bindings
  11. spider-cloud-cli

    The Spider Cloud CLI for web crawling and scraping

    v0.1.85 #web-scraping #web-indexer #web-crawler
  12. spider-pipeline

    Pipeline implementations for the spider-lib web scraping framework

    v0.1.7 #pipeline #rust #web-scraping #web-crawler #scraper
  13. spider-lib

    A Rust-based web scraping framework inspired by Scrapy (Python)

    v1.2.1 #web-scraping #async #web-crawler #scraper
  14. magpie-bird

    eBird Target Bird Scraper

    v0.4.2 1.1K #species #hotspot #bird #list #location #time-range #magpie #web-scraping #geographic #date-range
  15. reasonkit-web

    High-performance MCP server for browser automation, web capture, and content extraction. Rust-powered CDP client for AI agents.

    v0.1.7 #web-scraping #browser-automation #cdp #mcp #headless-browser
  16. webpuppet

    Web browser programmatic automation and control library for research, testing, and workflow automation

    v0.1.5-alpha #browser-automation #web-scraping #puppeteer
  17. spider-core

    Core functionality for the spider-lib web scraping framework

    v0.2.1 #web-scraping #async #rust #web-crawler #scraper
  18. reqwest-scraper

    Web scraping integration with reqwest

    v0.7.1 1.8K #web-scraping #css-selectors #xpath #integration #json-path #json-response #selector-xpath #proc-macro
  19. extrablatt_v2

    News, articles and text scraper

    v0.5.0 #web-scraping #news #web
  20. serp-sdk

    A comprehensive, production-ready Rust SDK for SerpAPI with async support, type safety, and ergonomic APIs. Developed during the Realtime Search AI Hackathon powered by SerpAPI.

    v0.2.1 #artificial-intelligence #web-scraping #serpapi #google #web
  21. docbox-database

    Docbox database structures, logic, and migrations

    v0.10.2 #database #docbox #migration #email #multi-tenant #document-processing #file-manager #file-processing #web-scraping #attachment
  22. stud_ip_scraper

    Blazingly fast 🚀 library for interacting with Stud.IP 📚

    v2.0.1 #stud-ip #web-scraping #scraper
  23. spider-middleware

    Middleware implementations for the spider-lib web scraping framework

    v0.1.7 #middleware #web-scraping #rust #web-crawler
  24. docbox-core

    Docbox core business logic and functionality

    v0.11.1 #document-processing #web-scraping #database #file-processing #file-manager #access-control #multi-tenant #pdf #secure-storage #libre-office
  25. docbox-storage

    Docbox storage layer abstraction

    v0.7.0 #storage-layer #docbox #search #file-manager #multi-tenant #file-processing #full-text-search #web-scraping #access-control #pdf
  26. discord_rust_scraper

    DiscordRustScraper is a powerful Discord data scraper built in Rust, designed to extract and format channel data for further analysis. It efficiently scrapes message history from specified…

    v1.0.7 650 #web-scraping #discord-bot #bot #discord #discordscraper #scraper
  27. silkworm-rs

    Async-first web scraping framework (Rust port)

    v0.1.0 #web-scraping #web-framework #async #web-crawler
  28. skyscraper

    XPath for HTML web scraping

    v0.7.0-beta.2 1.0K #html-parser #xpath #web-scraping #html-text #text-document #parse-error
  29. docbox-search

    Docbox multi-backend search abstraction

    v0.10.1 #database #docbox #search #file-manager #multi-tenant #file-processing #typesense #multi-backend #request-path #web-scraping
  30. spider_worker

    The fastest web crawler as a worker or proxy

    v2.44.40 #web-scraping #web-crawler #spider-cli
  31. refyne

    Official Rust SDK for the Refyne API - LLM-powered web extraction

    v0.1.1 #web-scraping #web-api #llm #extract #web-extract
  32. html_transpose

    html table transpose library

    v0.1.1 #html-table #table-cell #transpose #html-escaping #merged #transposing #convert-html #web-scraping #2d-grid #html-parser
  33. snagger

    Grab full text across ?page=N pagination with page count discovery

    v0.1.4 #web-scraping #pagination #async #rust
  34. ugc-scraper

    Scraper for ugcleague.com

    v0.4.4 850 #ugc #web-scraping #ugcleague #player #team #steam-id
  35. docbox-secrets

    Docbox secret management abstraction

    v0.5.0 #docbox #search #file-manager #file-processing #secret #multi-tenant #secret-management #full-text-search #web-scraping #access-control
  36. lastfm-edit

    programmatic access to Last.fm's scrobble editing functionality via web scraping

    v4.1.0 #lastfm #editing #album #web-scraping #user-name #artist #fm #scrobble #authentication #music
  37. prehrajto-core

    Core scraping library for prehraj.to

    v0.2.0 #web-scraping #video #async #scraper
  38. iocaine

    The deadliest poison known to AI

    v3.1.0 #poison #reverse-proxy #artificial-intelligence #web-scraping #generator #web-crawler #scripting-engine #costs
  39. halldyll-core

    Core scraping engine for Halldyll - high-performance async web scraper for AI agents

    v0.1.0 #artificial-intelligence #web-scraping #web-crawler #async
  40. uninews

    A universal news scraper for extracting content from various news blogs and news sites

    v0.3.2 #web-scraping #blog-post #news-article #render-markdown #image-url #openai #content-processing #unwanted
  41. spider_cli

    The fastest web crawler CLI written in Rust

    v2.44.40 #web-crawler #web-scraping #crawler
  42. shave

    shaving data from websites

    v0.2.5 460 #web-scraping #image #async #html #api-bindings
  43. spider-downloader

    Downloader component for the spider-lib web scraping framework

    v0.2.0 #web-framework #web-scraping #downloader #component #spider-lib
  44. protozoa

    A scraper for various anime websites

    v0.1.15 1.3K #web-scraping #anime #scraper-for-anime
  45. web-scrape

    aids in scraping data from the web

    v0.9.0 #web-scraping #aids #web-data
  46. typing_test

    Typing speed test in Rust

    v0.3.5 2.1K #speed-test #word #quote #themes #web-scraping #graphics-rendering
  47. herolib-web

    Web utilities for herolib - HTML/Markdown processing and web scraping

    v0.3.13 #web-scraping #web-html #html
  48. scraper-trail

    Scraping framework and tools

    v0.1.0 #web-scraping #store #request-response #archive #apple
  49. gogoanime-scraper

    A blazing fast anime scraper for GoGoAnime

    v1.2.4 1.7K #web-scraping #anime #gogoanime #user
  50. docbox-web-scraper

    Web scraping logic for docbox, document parsing, favicon resolution, ogp resolution

    v0.6.0 #web-scraping #docbox #favicon #cache #document #proxy-server #ogp #web-scraping-logic #document-parser
  51. crawly

    A lightweight async Web crawler in Rust, optimized for concurrent scraping while respecting robots.txt rules

    v0.1.9 500 #web-crawler #robots-txt #web-scraping #rate-limiting #builder-pattern #concurrency #depth-first-search #respecting
  52. webshot

    A command-line tool for automated website screenshots and web scraping

    v0.2.0 #web-scraping #screenshot #browser-automation
  53. knee_scraper

    Recursive scraping and media downloading, optionally filtered by word or phrase, with 'AI CAPTCHA Solving' and parsing of JS content for keywords.

    v0.1.8 450 #web-scraping #media-download #artificial-intelligence #logging #recursion #captcha #web-page #javascript #robots-txt #cookies
  54. tagparser

    A lightweight Rust library for parsing HTML tags with powerful filtering capabilities

    v0.6.0 390 #html-parser #web-scraping #html #web
  55. rusty-scrap

    HTML scraper

    v0.1.11 650 #web-scraping #scrap #html #target #scrapper
  56. progscrape

    progscrape.com web application

    v0.0.2 #web-scraping #stories #keep #web-apps #points #front-page #reddit #ranking #news #hacker
  57. bms_scraper

    Package for scraping BMS

    v0.1.0 #web-scraping #bms #package #localhost #find #database
  58. omnivore-cli

    Universal web scraper and code extractor CLI - crawl websites, analyze repositories, build knowledge graphs

    v0.2.0 #git #web-crawler #code-analysis #web-scraping
  59. rustboxd

    A Letterboxd web scraper and API client library written in Rust

    v0.2.0 #film #letterboxd #api #web-scraping
  60. recluse

    A web crawler framework for Rust

    v0.1.1 #web-scraping #web-framework #scraper
  61. spider_utils

    Spider Web Crawler

    v2.44.39 250 #web-scraping #css-selectors #web-crawler
  62. ferrisfetcher

    A cutting-edge, high-level web scraping library crafted in Rust

    v0.1.0 #web-scraping #web-crawler #scraper
  63. scoutlang

    A web crawling programming language

    v0.7.2 370 #web-crawler #web-scraping #programming-language #web-crawling
  64. pxscrape-conglomerate

    Conglomerates a bunch of proxies scraped from some VPNs into a common format

    v0.2.1 #format #proxies #conglomerate #vpn #scraped #web-scraping #lazy-evaluation #im
  65. dyer

    designed for reliable, flexible and fast Request-Response based service, including data processing, web-crawling and so on, providing some friendly, flexible, comprehensive features without compromising speed

    v3.3.2 53K #web-scraping #data-processing #request-response #web-crawling
  66. scrapr-core

    web scraping library for Python

    v0.1.1 #web-scraping #html-parser #web
  67. reposcrape

    Repository scraper for presenting repositories on a personal website

    v0.1.6 300 #repository #web-scraping #user #presenting #personal
  68. proxy-types

    Basic proxy types for my own uses

    v0.1.0 #proxy #own #web-scraping #lazy-evaluation #im
  69. coma

    lightweight command-line tool designed for crawling websites

    v0.2.3 310 #web-crawler #web-scraping #web-discovery
  70. pxscrape-urban

    UrbanVPN API scraper

    v0.1.1 #web-scraping #urban-vpn #api #package #proxies #lazy-evaluation #im
  71. pxscrape-browsec

    Browsec VPN scraper

    v0.1.0 #web-scraping #browsec #vpn #lazy-evaluation #im
  72. headless_chrome_fork

    Control Chrome programmatically

    v1.0.2 #headless-chrome #headless-browser #dev-tools #puppeteer #web-scraping #web-crawler #fetching
  73. voyager

    Web crawler and scraper

    v0.2.1 #web-crawler #web-scraping #state-machine
  74. accessibility-tree

    Accessibility tree binding CSS styles and vectors to elements. Used mainly for accessibility-rs crate.

    v0.0.14 950 #web-scraping #css #accessibility #css-selectors #accessibility-rs #wcag #vector-elements #bounding-box
  75. yfp

    A Yahoo finance scraper

    v0.2.1 #yahoo-finance #web-scraping #csv #ticker #historical #ohlcv #date
  76. scraprr

    web scraping library for Python

    v0.1.3 #web-scraping #html-parser #web
  77. crabler

    Web scraper for Crabs

    v0.1.28 100 #css #scraper #web-scraper
  78. search_for_llms

    A search tool suitable for LLMs, with structured and cleaned search results, written in Rust

    v0.1.0 #web-scraping #llm #google #google-search #search-web
  79. scout-parser

    A web crawling programming language

    v0.7.2 430 #web-scraping #web-crawling #programming-language
  80. headless_chrome_new

    Control Chrome programmatically

    v1.0.6 #headless-chrome #dev-tools #puppeteer #web-scraping #headless-browser #web-testing #fetching
  81. easy-scraper

    HTML scraping library focused on ease of use

    v0.2.0 130 #web-scraping #html #scraping
  82. recursive_scraper

    Constant-frequency recursive CLI web scraper with frequency, filtering, file directory, and many other options for scraping HTML, images and other files

    v0.6.2 #web-scraping #recursion #web-crawler #web #scraper
  83. progscrape-scrapers

    progscrape.com service scrapers

    v0.0.2 #web-scraping #progscrape #reddit #hacker-news #source #front-page
  84. turboscraper

    A high-performance, concurrent web scraping framework for Rust with built-in support for retries, storage backends, and concurrent request handling

    v0.1.1 190 #web-scraping #web-crawler #web #async
  85. story-dl

    Story web scraping

    v0.6.0 #story #web-scraping #site #epub #json
  86. scout-lexer

    A web crawling programming language

    v0.7.2 #web-crawler #web-scraping #programming-language #web-crawling
  87. scrapr-bindings

    web scraping library for Python

    v0.1.1 #web-scraping #html-parser #web
  88. headless_chrome_xiaoai

    Control Chrome programmatically

    v1.0.16 #headless-chrome #dev-tools #puppeteer #headless-browser #web-scraping #fetching
  89. aniscraper

    designed for efficient web scraping and data extraction. It simplifies the process of fetching, parsing, and extracting data from websites.

    v0.1.2 140 #anime #web-scraping #hianime #zoro #web
  90. scout-interpreter

    A web crawling programming language

    v0.7.2 420 #web-scraping #programming-language #web-crawling
  91. shavecli

    A command line interface for the shave library

    v0.1.1 #web-scraping #html #image #async #web
  92. progscrape-application

    progscrape.com application logic

    v0.0.2 #progscrape #reddit #documentation #stories #source #web-scraping #front-page #hacker #news #ranking
  93. nu_plugin_selector

    web scraping using css selector

    v0.44.0 #web-scraping #css-selectors
  94. swissknife-scraping-sdk

    Web scraping SDK - Apify, Browser Use, Firecrawl

    v0.1.1 #web-scraping #browser #firecrawl #apify #sdk
  95. scrapelect

    Interpreter for scrapelect, a CSS-inspired web scraping DSL

    v0.3.2 160 #web-scraping #dsl #interpreter #json-output #css-selectors #structured-data
  96. keyhunter

    Check for leaked API keys and secrets on public websites

    v0.2.0 250 #api-key #web-scraping #secret #security #web
  98. wappu

    fast and flexible web scraping library for Rust, designed to efficiently navigate and extract data from websites. Perfect for data mining, content aggregation, and web automation tasks.

    v0.3.0 490 #web-scraping #html-parser #web-content #web-crawler #extract #data-mining #web-page #web-data #fetch-and-parse #navigate
  99. spider_scraper

    A css scraper using html5ever

    v0.1.2 1.4K #web-scraping #css-selectors #html-parser #serialization #web-crawler
  100. vocalolyrics

    Lyrics scraper, primarily for Vocaloid content. By default, atwiki is used as the source. We plan to make other sources selectable, but that is not currently possible

    v0.2.4 330 #lyrics #selectable #web-scraping #vocaloid #content
  101. frangipani

    Scraping framework for Rust

    v0.3.1 #web-scraping #continuous-crawler #web-crawler #scraper #scraping
  102. only_scraper

    Only scrape webpages

    v0.1.2 #web-page #web-scraping #task #solution #minimalist
  103. barrido

    discover paths in web applications

    v0.3.2 #web-apps #url-path #web-scraping #find #js #word-list
  104. sivasbus

    Sivas live public bus data scraper

    v0.2.1 #web-scraping #bus #public #live #data
  105. quick_crawler

    QuickCrawler is a Rust crate that provides a completely async, declarative web crawler with domain-specific request rate-limiting built-in

    v0.1.2 #web-crawler #rate-limiting #domain-specific #web-scraping #web-page
  106. scihub-scraper-cli

    CLI utility to scrape paper information from sci-hub

    v0.2.2 #web-scraping #sci-hub #web #scraper
  107. scout-json

    JSON representation of ScoutLang AST

    v0.7.2 110 #web-scraping #programming-language #web-crawling
  108. get-cookies

    Get cookies from a pop-up window

    v0.2.0 500 #cookies #web-scraping #wry #cross-platform-compatibility #user #pop-up #automated-tests
  109. html-ast

    Construct and generate legal html string

    v0.1.0 #html #html-string #web-scraping
  110. omnivore-core

    Core crawler and knowledge graph engine for Omnivore - web scraping, AI extraction, browser automation

    v0.1.1 #web-crawler #knowledge-graph #browser #async #web-scraping
  111. favicon-scraper

    A favicon scraper that just works

    v0.3.1 150 #favicon #scrape #web-scraping #icons #async
  112. html-extractor

    extracting data from HTML

    v1.0.0 100 #web-scraping #web #scraping
  113. ugc-scraper-types

    Scraper for ugcleague.com - data types

    v0.1.2 #ugcleague #ugc-scraper #web-scraping #home #steam-id
  114. scraper-main

    The core framework xpath parsing

    v0.3.1 #xpath #web-scraping
  115. star-scraper

    fetch info about GitHub stargazers

    v0.1.5 #github-repo #web-scraping #info #star #github-token #github-star #statistics #personal-access-token
  116. ntwrk

    TODO

    v0.1.1 #browser-automation #web-crawler #debugging #web-scraping #debugging-tool
  117. rezvrh_scraper

    Bakalari scraper

    v0.1.6 190 #web-scraping #bakalari #authentication
  118. jsdom

    JavaScript DOM parser for web scraping

    v0.0.11-alpha.1 120 #web-scraping #web-crawler
  119. stream_crawler

    scraping web pages and extracting URLs and endpoints

    v0.1.1 #web-crawler #web-scraping #endpoint #web
  120. nyxd-scraper-shared

    Common crate for the sqlite and psql Nyxd blockchain scrapers

    v1.20.4 #mixnet #blockchain #web-scraping #network-level-attackers #nym #psql #requester
  121. etwin_scraper_tools

    Helper functions for scraper

    v0.12.3 #etwin #web-scraping
  122. web-scraper

    that is used to get html from a website, and scrape the content in it

    v0.1.0 #html-content #website #scrape #site #div
  123. apify-client

    Typed wrapper for Apify API

    v0.2.0 #web-scraping #apify #web-automation
  124. bvdl

    Scrape product information from Bazaarvoice

    v0.1.0 #web-scraping #information #scraper
  125. fsolver

    wrapper around the FlareSolverr API

    v1.0.0 #flare-solverr #async #web-scraping #wrapper #web
  126. olx

    extracting product information from OLX (www.olx.bg)

    v0.1.2 #product #search #price #url #extracting #search-query #web-scraping #pagination #command-line-tool
  127. scr

    Most simple site parser and file loader

    v1.0.2 #file-loader #parser #site #web-scraping
  128. rand_agents

    generating random user agent strings

    v1.0.0 #user-agent #random #generator #web-scraping #string #security
  129. kirjat-rs

    prices for Finnish textbooks from multiple stores

    v0.7.1 #price #store #textbooks #finnish #api #web-scraping
  130. aliexpress-scraper

    An aliexpress scraper using requests

    v0.1.4 #aliexpress #web-scraping #detail #product
  131. lookout

    👀 Declarative, asynchronous scraper utility

    v0.1.1 #web-scraping #scraper #scraping
  132. magnetfinder

    Multi-threaded CLI torrent scraper & aggregator

    v0.4.0 #web-scraping #bittorrent
  133. spider-macro

    Proc-macros for the spider-lib web scraping framework

    v0.1.5 #proc-macro #web-scraping #rust
  134. seedframe_webscraper

    Webscraper loader integration crate for SeedFrame

    v0.1.1 #web-scraping #seedframe #interval #css-selectors #integration #config-json #seed-frame #proc-macro #artificial-intelligence
  135. rustfm-scraper

    Scrapes listening history from Last.fm and stores it in a file

    v0.2.23 #web-scraping #lastfm #api
  136. diffbot

    A client library for the Diffbot API

    v1.0.0 #web-scraping #api
  137. fs_scraper

    A scraper for FjalorShqip

    v0.1.1 #web-scraping #fjalor-shqip #fjalorshqip #query #json
  138. voight_kampff

    A user agent checker

    v0.1.2 #user-agent #bot #kampff #voight #checker #web-scraping
  139. sws-tree

    Slotmap-backed ID-tree

    v1.0.0 #id-tree #sws #web-scraping #sitemap #reference #slot-map #csv
  140. twitter-scraper

    Twitter scraper, no login required. FOR EDUCATIONAL PURPOSES ONLY

    v0.3.0 #twitter #web-scraping #required #educational #authentication
  141. canadian_news_scraper

    that provides an api which scrapes 3 Canadian News Sites and returns the data

    v0.1.6 #web-scraping #site #news #canadian #api #cbc #ca #ctv
  142. url-scraper

    HTML URL scraper

    v0.1.2 #web-scraping #html #scraper
  143. web-scraper-flows

    Web scraper integration for flows.network

    v0.1.0 #web-scraping #flows #networking #integration #page #query-parameters #serde-json
  144. scrpr

    Basic Rust scraper and data selector

    v0.1.0 #web-scraping #selectors #scraper
  145. wco-rs

    Play Cartoon & Animes from Wcostream

    v0.1.2 #anime #wcostream #web-scraping #scraper
  146. scrapman

    A high-level declarative web scraping framework

    v0.1.1 #web-framework #web-scraping #declarative #chrome-driver
  147. globescraper

    Scraper lib for Globe Explorer AI engine

    v0.3.2 220 #engine #query #globe #artificial-intelligence #explorer #web-scraping
  148. flatcrawl-crawler

    set of webpage crawlers. New crawlers can be easily configured and the output can be written to an AMQP queue.

    v1.0.0 #amqp #web-crawler #web-scraping #flatcrawl #flats #web-page