#web-crawler

  1. spider_chromiumoxide_cdp

    Contains all the generated types for chromiumoxide

    v0.7.8 51K #chromiumoxide #cdp #generated #dev-tools #protocols #chrome #web-crawler
  2. git-leave

    Check for unsaved or uncommitted changes on your machine

    v1.6.4 1.0K #web-crawler #uncommitted #cli
  3. firecrawl

    Rust SDK for Firecrawl API

    v1.2.1 360 #web-crawler #artificial-intelligence #sdk #ai-api #format #markdown #llm #web-search #structured-data #scrape
  4. crawn

    web crawling and scraping

    v0.2.0 #web-scraping #web-crawler #async #concurrency #cli-concurrency
  5. spider-client

    Spider Cloud client

    v0.1.85 #artificial-intelligence #web-crawler #web-indexer
  6. kodegen_tools_citescrape

    KODEGEN.ᴀɪ: Memory-efficient, Blazing-Fast, MCP tools for code generation agents

    v0.10.21 #web-crawler #mcp #claude #terminal
  7. wdict

    Create dictionaries by scraping webpages or crawling local files

    v0.1.22 #web-page #dictionary #web-crawler #local #word-list
  8. spider-cloud-cli

    The Spider Cloud CLI for web crawling and scraping

    v0.1.85 #web-scraping #web-indexer #web-crawler
  9. spider-pipeline

    Pipeline implementations for the spider-lib web scraping framework

    v0.1.7 #pipeline #rust #web-scraping #web-crawler #scraper
  10. spider-lib

    A Rust-based web scraping framework inspired by Scrapy (Python)

    v1.2.1 #web-scraping #async #web-crawler #scraper
  11. yps

    Yggdrasil Port Scanner

    v0.1.3 180 #tcp #yggdrasil #udp #web-crawler #search
  12. toai

    Path crawler that copies all SRC files into a single output to send to an AI (toai)

    v0.2.0 #artificial-intelligence #path #stdout #src #ignore #web-crawler #relative-path
  13. mq-crawler

    Directory crawler for batch Markdown file processing

    v0.5.13 #web-crawler #query #markdown #jq
  14. fav_core

    Fav's core crate; A collection of traits

    v0.1.8 2.0K #resources #status-flags #web-crawler #fetch #protobuf #cookies #collection-traits #logout #visualize #fetched
  15. spider-core

    Core functionality for the spider-lib web scraping framework

    v0.2.1 #web-scraping #async #rust #web-crawler #scraper
  16. spider_chromiumoxide_types

    Contains the essential types necessary for using chromiumoxide

    v0.7.4 4.7K #chromiumoxide #chromium #dev-tools #protocols #chrome #websocket #async-api #headless-chrome #web-crawler
  17. siteprobe

    CLI tool to fetch URLs from sitemap.xml, check their existence, and generate performance reports

    v1.2.1 #sitemap #web-crawler #performance #http-monitoring #url-checker
  18. spider_transformations

    Transformation utils to use for spider

    v2.37.114 490 #web-crawler #transformation #crawler
  19. product-os-crawler

    Product OS : Crawler is a browser-based crawler that utilises Product OS : Browser to perform advanced URL crawling, leveraging headless browsing and automation

    v0.0.16 450 #product-os #web-crawler
  20. robotstxt

    A native Rust port of Google's robots.txt parser and matcher C++ library

    v0.3.0 6.9K #web-crawler #parser
  21. spider-middleware

    Middleware implementations for the spider-lib web scraping framework

    v0.1.7 #middleware #web-scraping #rust #web-crawler
  22. silkworm-rs

    Async-first web scraping framework (Rust port)

    v0.1.0 #web-scraping #web-framework #async #web-crawler
  23. spider_worker

    The fastest web crawler as a worker or proxy

    v2.44.40 #web-scraping #web-crawler #spider-cli
  24. spider-util

    Shared utility functions and types for the spider-lib ecosystem

    v0.1.7 #spider-lib #bloom-filter #utilities #request-url #ecosystem #web-crawler #scraped #building-block #probabilistic-data-structure
  25. iocaine

    The deadliest poison known to AI

    v3.1.0 #poison #reverse-proxy #artificial-intelligence #web-scraping #generator #web-crawler #scripting-engine #costs
  26. halldyll-core

    Core scraping engine for Halldyll - high-performance async web scraper for AI agents

    v0.1.0 #artificial-intelligence #web-scraping #web-crawler #async
  27. tarzi

    Rust-native lite search for AI applications

    v0.1.9 #web-crawler #rag #search
  28. pulsarss

    RSS Aggregator for Gemini Protocol

    v0.1.4 180 #gemini-protocol #rss #web-crawler #gemini-gemtext
  29. spider_cli

    The fastest web crawler CLI written in Rust

    v2.44.40 #web-crawler #web-scraping #crawler
  30. firecrawl-sdk

    Rust SDK for Firecrawl API

    v0.3.1 550 #sdk #scrape #web-crawler #firecrawl #api #cargo-run
  31. crawlurls

    A fast async Rust crawler that discovers and filters URLs by pattern without scraping content

    v0.1.1 #web-crawler #url #web #rust
  32. json-crawler

    Wrapper for serde_json that provides nicer errors when crawling through large json files

    v0.1.0 270 #serde-json #web-crawler #json-error #youtube-music #pointers
  33. inverta

    A basic search engine that downloads pages and matches your search query against their contents

    v0.1.0 #search-query #search-engine #web-crawler #web-page #tf-idf #stop-words
  34. robotxt

    Robots.txt (or URL exclusion) protocol with the support of crawl-delay, sitemap and universal match extensions

    v0.6.1 2.7K #web-crawler #web-framework #scraper
  35. website_crawler

    gRPC, tokio-based web crawler built with spider

    v0.9.9 #web-indexer #web-crawler #site-map-generator #crawler
  36. seaward

    grep-like tool for the web

    v1.1.0 290 #web-crawler #rustcrawler #cli
  37. crawly

    A lightweight async Web crawler in Rust, optimized for concurrent scraping while respecting robots.txt rules

    v0.1.9 500 #web-crawler #robots-txt #web-scraping #rate-limiting #builder-pattern #concurrency #depth-first-search #respecting
  38. wrake

    Collect links from the given URL

    v0.4.2 #web-crawler #web #crawler
  39. actix_block_ai_crawling

    Actix Middleware that blocks Generative AI crawlers

    v0.2.11 #artificial-intelligence #generative-ai #web-crawler #block #actix-middleware #ip-address #user-agent #openai
  40. ungoliant

    The pipeline for the OSCAR corpus

    v2.0.0 #corpus #common-crawl #oscar #pipeline #web-crawler #fasttext #gz #packaging
  41. omnivore-cli

    Universal web scraper and code extractor CLI - crawl websites, analyze repositories, build knowledge graphs

    v0.2.0 #git #web-crawler #code-analysis #web-scraping
  42. minisearchtk

    Small toolkit for crawling and searching web pages

    v0.2.0 #web-crawler #search #web-page #cache #hash #user-agent #nofollow #statistics #respect #snippets
  43. ferrisfetcher

    A cutting-edge, high-level web scraping library crafted in Rust

    v0.1.0 #web-scraping #web-crawler #scraper
  44. shader-prepper

    Shader include parser and crawler

    v0.3.0-pre.3 5.3K #shader-compiler #web-crawler #build-system #virtual-filesystem #provider #graphics
  45. aquatic-crawler

    Crawler tool for the Aquatic BitTorrent tracker API

    v0.1.0 #bittorrent #web-crawler #parser #magnet #aquatic
  46. spider_utils

    Spider Web Crawler

    v2.44.39 250 #web-scraping #css-selectors #web-crawler
  47. scoutlang

    A web crawling programming language

    v0.7.2 370 #web-crawler #web-scraping #programming-language #web-crawling
  48. capp

    Common things I use to build Rust CLI tools for web crawlers

    v0.4.3 650 #web-crawler #async-executor #async
  49. unobtanium-crawler

    The default web-crawler for unobtanium

    v3.0.0 #web-crawler #unobtanium #index #search-engine
  50. coma

    lightweight command-line tool designed for crawling websites

    v0.2.3 310 #web-crawler #web-scraping #web-discovery
  51. unobtanium

    Opinionated Web search engine library with crawler and viewer companion

    v3.0.0 #search-engine #web-crawler #web-search #database #shared-data-structures
  52. headless_chrome_fork

    Control Chrome programmatically

    v1.0.2 #headless-chrome #headless-browser #dev-tools #puppeteer #web-scraping #web-crawler #fetching
  53. voyager

    Web crawler and scraper

    v0.2.1 #web-crawler #web-scraping #state-machine
  54. mitsuba

    Lightweight 4chan board archive software (like Foolfuuka), in Rust

    v1.10.0 #downloader #archive #web-crawler #web-archive
  55. recursive_scraper

    Constant-frequency recursive CLI web scraper with frequency, filtering, file directory, and many other options for scraping HTML, images and other files

    v0.6.2 #web-scraping #recursion #web-crawler #web #scraper
  56. turboscraper

    A high-performance, concurrent web scraping framework for Rust with built-in support for retries, storage backends, and concurrent request handling

    v0.1.1 190 #web-scraping #web-crawler #web #async
  57. scout-lexer

    A web crawling programming language

    v0.7.2 #web-crawler #web-scraping #programming-language #web-crawling
  58. s5_importer_http

    HTTP importer for S5

    v1.0.0-beta.1 #importer #s5 #base-url #http #web-crawler #parallel-processing #content-length
  59. surly-spider

    A command line interface for crawling websites

    v1.0.2 #command-line-interface #web-crawler #surly #domain #flags
  60. chan-downloader

    CLI to download all images/webms of a 4chan thread

    v0.3.0 #download #web-crawler #4chan #4plebs #cli
  61. firecrawl_rs

    Rust SDK for Firecrawl API

    v0.1.1 #web-crawler #sdk #structured-data #llm #api-sdk #markdown #web-data
  62. wappu

    fast and flexible web scraping library for Rust, designed to efficiently navigate and extract data from websites. Perfect for data mining, content aggregation, and web automation tasks.

    v0.3.0 490 #web-scraping #html-parser #web-content #web-crawler #extract #data-mining #web-page #web-data #fetch-and-parse #navigate
  63. indexea

    OpenAPI of Indexea

    v1.0.0 #oauth #widgets #payment #apps-api #account-api #search-api #logging #invoice #web-crawler #blocklist
  64. spider_scraper

    A CSS scraper using html5ever

    v0.1.2 1.4K #web-scraping #css-selectors #html-parser #serialization #web-crawler
  65. async_job

    async cron job crate for Rust

    v0.1.4 1.8K #cron-job #web-crawler #crawler
  66. frangipani

    Scraping framework for Rust

    v0.3.1 #web-scraping #continuous-crawler #web-crawler #scraper #scraping
  67. quick_crawler

    QuickCrawler is a Rust crate that provides a completely async, declarative web crawler with domain-specific request rate-limiting built-in

    v0.1.2 #web-crawler #rate-limiting #domain-specific #web-scraping #web-page
  68. brchd

    Data exfiltration toolkit

    v0.1.0 #toolkit #exfiltration #upload #uploader #0-1 #web-crawler
  69. spacebar

    An anti-plagiarism tool based on null width characters

    v0.3.0-rc1 #character #database #clipboard #tool #width #blog #http-errors #web-crawler
  70. omnivore-core

    Core crawler and knowledge graph engine for Omnivore - web scraping, AI extraction, browser automation

    v0.1.1 #web-crawler #knowledge-graph #browser #async #web-scraping
  71. maman

    Rust Web Crawler

    v0.13.1 #http #web-crawler #web #crawler
  72. rsfile

    operate files or web pages easily and quickly

    v0.1.2 #web-crawler #file-utility #web-page #text-file #web-page-helper #text-file-helper #csv-file-helper #crawler
  73. spidery

    Rust SDK for Spidery API

    v1.0.0 #sdk #web-crawler #llm #crawl #format #scrape #data-url
  74. web-crawler

    Finds every page, image, and script on a website (and downloads it)

    v0.1.3 #download #web-page #find #image #script
  75. ptt-crawler

    A crawler for the web version of PTT, the largest online community in Taiwan

    v0.1.0 #web-crawler #ptt #crawler
  76. ntwrk

    TODO

    v0.1.1 #browser-automation #web-crawler #debugging #web-scraping #debugging-tool
  77. jsdom

    JavaScript DOM parser for web scraping

    v0.0.11-alpha.1 120 #web-scraping #web-crawler
  78. finde-rs

    Multi-threaded filesystem crawler

    v0.1.4 #web-crawler #thread-pool #channel #cli
  79. stream_crawler

    scraping web pages and extracting URLs and endpoints

    v0.1.1 #web-crawler #web-scraping #endpoint #web
  80. url-crawl

    URL crawler for HTML code

    v0.2.0 #kvarn #url #web-crawler #crawl #push #web-server
  82. krate

    Get information and metadata for published Rust crates

    v1.0.0 #metadata #io-api #contract #data-model #web-crawler
  83. actix-prerender

    Actix middleware that sends requests to Prerender.io or a custom Prerender service URL

    v0.2.4 #web-crawler #service-url #prerender #send #io #actix-web #user-agent #actix-middleware
  84. od-get

    recursively crawling & downloading data from open directories

    v0.3.1 #download #web-crawler #open-directory #recursion-depth #file-pattern #verbosity #logging
  85. rust-rock-rover

    Concert web crawler in Rust

    v0.1.0 #concert #web #web-crawler #cargo-generate #template #git #ci
  86. ssufid

    SSU Announcement Crawler for Everyone

    v0.1.0 #ssu #web-crawler #announcement
  87. kodict

    Korean dictionary implementation and crawler for Rust

    v0.2.1 #dictionary #korean #web-crawler #hangul
  88. doublesite

    Alternative for httrack

    v0.1.0 #content #httrack #loading #website #cli #backup #har #mirroring #web-crawler
  89. wls

    Easily crawl multiple sitemaps and list URLs

    v0.1.0 #sitemap #web-crawler #url
  90. crawl

    Rust crawl

    v0.2.1 #web-crawler #http #spider
  91. lolchive

    local liminal archiver for webpages

    v0.2.0 #web-page #archiver #local #liminal #web-crawler #date
  92. source-demo-tool-crawler

    WIP: a gui tool for opening (editing planned) source engine demo files

    v0.8.2 #demo-file #source-engine #web-crawler #tool #editing #changelog #file-content
  93. spire

    The flexible scraper framework powered by tokio and tower

    v0.1.0 #web-framework #web-crawler #scraper
  94. emails

    A web scraper to extract email addresses from websites

    v1.0.0 #email #web-crawler #web
  95. sws-crawler

    Web crawler with pluggable scraping logic

    v0.1.0 #web-crawler #web-scraping-logic #sws #sitemap #seed #plugable #scrap #web-page
  96. flatcrawl-crawler

    set of webpage crawlers. New crawlers can be easily configured and the output can be written to an AMQP queue.

    v1.0.0 #amqp #web-crawler #web-scraping #flatcrawl #flats #web-page
  97. scraper_query

    Ergonomic Query for HTML with Scraper

    v0.4.0 200 #web-scraping #query #html #document #class #web-crawler
  98. task_deport

    Organize simple task queue

    v0.1.0 #task-queue #redis #in-memory-storage #processing #web-crawler #redis-queue #health-check #health-monitoring #concurrency
  99. gar-crawl

    High level HTML crawler with concise builder

    v0.1.16 #web-crawler #high #level #propagator #builder #allow-list
  100. ac_crawler_types

    normalized types for the anti capital public data crawlers

    v0.1.5 #web-crawler #capital #public #normalized #anti
  101. labisu

    implementing algorithms for finding large bipartite subgraphs

    v0.1.1 #subgraph #web-crawler #bipartite #algorithm #finding #undirected-graph
  102. dblp_crawler

    DBLP Crawler

    v0.1.2 #web-crawler #dblp #chatgpt #database
  103. spire-macros

    Macros for spire

    v0.1.0 #web-framework #web-crawler #scraper
  104. crawler_data_client

    client for programmatic download of crawler data

    v0.0.9 #web-crawler #download #client #market #programmatic #zstd
  105. pop-os/apt-repo-crawler

    crawling through files in an apt repo

    GitHub 0.1.0 #web-crawler #apt #repo
  106. scrupy

    fast, modern spider framework written in and for Rust. The framework implements the functionalities of Scrapy, but is low-level and typesafe. It exposes an elegant API and uses zero unsafe code.

    v0.1.6 #framework #low-level #type-safe #elegant #downloader #scrapy #web-crawler
  107. karkinos

    Powerful and flexible web scraper with YAML configuration, supporting pagination, data transformations, caching, and multiple output formats

    v0.0.1 #web-scraping #web-crawler #html-parser #scraper