Lib.rs

#text-extraction

unpdf

High-performance PDF content extraction to Markdown, text, and JSON

v0.1.6 #pdf #markdown #text-extraction #document-parser #pdf-parser #text-document
keyword_extraction

Collection of algorithms for keyword extraction from text

v1.5.0 110 #extract #tf-idf #algorithm #text-extraction
pdfvec

High-performance PDF text extraction library for vectorization pipelines

v0.1.1 #pdf #vectorization #nlp #text-extraction
pdf_oxide

The Complete PDF Toolkit: extract, create, and edit PDFs. Rust core with bindings for Python, Node, WASM, Go, and more.

v0.3.2 150 #pdf #pdf-parser #text-extraction
docx-lite

Lightweight, fast DOCX text extraction library with minimal dependencies

v0.2.0 13K #docx #text-extraction #parser #word #office
heavy-pdf-parser

Extract text from PDF files with support for multiple output formats

v0.1.0 #pdf #text-extraction #document-processing #rust
epub-parser

Extracts metadata, table of contents, text, cover, and images from EPUB files

v0.3.4 #ebook #epub #text-extraction #metadata #parser
arabic_pdf_to_text

A CLI tool to convert Arabic PDFs to text using Google's Gemini API

v0.1.0 #gemini-api #pdf #arabic #text-extraction
parser-core

Extracts text from various file formats including PDF, DOCX, XLSX, PPTX, images via OCR, and more

v0.1.3 120 #docx #text-parser #pdf #ocr #text-extraction
parser-web

Web API for extracting text from various file formats

v0.1.3 #web-api #pdf #text-extraction #parser

the-daily-stallman

Read the news like Stallman would. No JavaScript required.

v0.3.1 #stallman #text-extraction #rms #news