HeroIndex
A high-performance full-text search server built on Tantivy, exposing an OpenRPC interface over Unix sockets.
Repository: https://forge.ourworld.tf/lhumina_research/hero_index_server
Looking for the client library? See heroindex_client for easy integration into your Rust applications.
Features
Multiple Index Management - Create, delete, and manage multiple search indexes
Dynamic Schemas - Define custom schemas with 10+ field types
Powerful Queries - Full-text, fuzzy, phrase, boolean, range, regex queries
OpenRPC Discovery - Self-documenting API via rpc.discover
Concurrent Connections - Handle multiple clients simultaneously
Fast Fields - Columnar storage for sorting and aggregations
Zero-Copy Search - Efficient memory-mapped index files
Installation
From crates.io
cargo install heroindex
From source
git clone https://forge.ourworld.tf/lhumina_research/hero_index_server.git
cd hero_index_server
cargo build --release
Quick Start
1. Start the Server
heroindex --dir /var/lib/heroindex --socket /tmp/heroindex.sock
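After the server is launched, a client usually has to wait until the socket accepts connections. The sketch below polls the socket path using only the standard library. The helper name `wait_for_socket` and the 50 ms poll interval are illustrative choices, not part of HeroIndex.

```rust
use std::os::unix::net::UnixStream;
use std::path::Path;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `socket_path` until a connection succeeds or `timeout` elapses.
/// Returns the connected stream on success, `None` on timeout.
/// (Hypothetical helper; not provided by heroindex or heroindex_client.)
pub fn wait_for_socket(socket_path: &Path, timeout: Duration) -> Option<UnixStream> {
    let deadline = Instant::now() + timeout;
    loop {
        // connect() fails while the socket file is missing or nobody is listening.
        if let Ok(stream) = UnixStream::connect(socket_path) {
            return Some(stream);
        }
        if Instant::now() >= deadline {
            return None;
        }
        sleep(Duration::from_millis(50));
    }
}
```

A caller might use `wait_for_socket(Path::new("/tmp/heroindex.sock"), Duration::from_secs(5))` right after spawning the server and treat `None` as a startup failure.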
2. Connect with the Client Library
Use heroindex_client to connect:
use heroindex_client::HeroIndexClient;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), heroindex_client::Error> {
    let mut client = HeroIndexClient::connect("/tmp/heroindex.sock").await?;

    // Create an index
    client.db_create("articles", json!({
        "fields": [
            {"name": "title", "type": "text", "stored": true, "indexed": true},
            {"name": "body", "type": "text", "stored": true, "indexed": true}
        ]
    })).await?;

    // Add documents
    client.db_select("articles").await?;
    client.doc_add(json!({"title": "Hello", "body": "World"})).await?;
    client.commit().await?;
    client.reload().await?;

    // Search
    let results = client.search(
        json!({"type": "match", "field": "body", "value": "world"}),
        10, 0
    ).await?;
    println!("Found {} results", results.total_hits);
    Ok(())
}
Command Line Options
heroindex [OPTIONS]

Options:
  -d, --dir <DIR>        Base directory for all indexes
  -s, --socket <SOCKET>  Unix socket path for RPC interface
  -h, --help             Print help
  -V, --version          Print version
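To launch the server from a Rust program (for example in an integration test), the two documented flags can be passed through `std::process::Command`. This is a minimal sketch that assumes the `heroindex` binary is on `PATH`; `server_command` is a hypothetical helper name.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `heroindex` invocation using the
/// documented `--dir` and `--socket` flags.
/// Assumes the `heroindex` binary can be found on PATH.
pub fn server_command(index_dir: &Path, socket_path: &Path) -> Command {
    let mut cmd = Command::new("heroindex");
    cmd.arg("--dir")
        .arg(index_dir)
        .arg("--socket")
        .arg(socket_path);
    cmd
}
```

Calling `.spawn()` on the returned `Command` starts the server as a child process; keeping the resulting `Child` handle lets the caller stop the server later.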
Schema Definition
Define your index schema with these field types:
| Type  | Description                      | Options                          |
|-------|----------------------------------|----------------------------------|
| text  | Full-text searchable (tokenized) | stored, indexed, fast, tokenizer |
| str   | Exact match string (keyword)     | stored, indexed, fast            |
| u64   | Unsigned 64-bit integer          | stored, indexed, fast            |
| i64   | Signed 64-bit integer            | stored, indexed, fast            |
| f64   | 64-bit floating point            | stored, indexed, fast            |
| date  | DateTime (RFC 3339)              | stored, indexed, fast            |
| bool  | Boolean                          | stored, indexed, fast            |
| json  | JSON object                      | stored, indexed                  |
| bytes | Binary data                      | stored, indexed, fast            |
| ip    | IP address                       | stored, indexed, fast            |
Example Schema
{
  "fields": [
    {"name": "id", "type": "str", "stored": true, "indexed": true},
    {"name": "title", "type": "text", "stored": true, "indexed": true, "tokenizer": "en_stem"},
    {"name": "content", "type": "text", "stored": true, "indexed": true},
    {"name": "views", "type": "u64", "stored": true, "indexed": true, "fast": true},
    {"name": "rating", "type": "f64", "stored": true, "indexed": true, "fast": true},
    {"name": "published", "type": "date", "stored": true, "indexed": true, "fast": true},
    {"name": "active", "type": "bool", "stored": true, "indexed": true},
    {"name": "metadata", "type": "json", "stored": true, "indexed": true}
  ]
}
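When schemas are assembled in code rather than written by hand, a small builder keeps the field options consistent. The sketch below uses only the standard library and emits the same JSON shape as the example schema. `FieldDef`, `schema_json`, and the choice to default `stored` and `indexed` to `true` are assumptions made for illustration; in a real project `serde_json` would be the more natural tool.

```rust
/// One schema field, limited to the options listed in the table above.
/// (Illustrative type; not part of heroindex_client.)
pub struct FieldDef {
    pub name: String,
    pub field_type: String,
    pub stored: bool,
    pub indexed: bool,
    pub fast: bool,
    pub tokenizer: Option<String>,
}

/// Quotes a string as a JSON string literal (escapes `\` and `"` only).
fn json_str(s: &str) -> String {
    format!("\"{}\"", s.replace('\\', "\\\\").replace('"', "\\\""))
}

impl FieldDef {
    /// Creates a stored, indexed field without fast storage or a tokenizer.
    pub fn new(name: &str, field_type: &str) -> Self {
        FieldDef {
            name: name.to_string(),
            field_type: field_type.to_string(),
            stored: true,
            indexed: true,
            fast: false,
            tokenizer: None,
        }
    }

    pub fn with_fast(mut self) -> Self {
        self.fast = true;
        self
    }

    pub fn with_tokenizer(mut self, tokenizer: &str) -> Self {
        self.tokenizer = Some(tokenizer.to_string());
        self
    }

    /// Serializes the field; `fast` and `tokenizer` appear only when set.
    pub fn to_json(&self) -> String {
        let mut out = format!(
            "{{\"name\":{},\"type\":{},\"stored\":{},\"indexed\":{}",
            json_str(&self.name),
            json_str(&self.field_type),
            self.stored,
            self.indexed
        );
        if self.fast {
            out.push_str(",\"fast\":true");
        }
        if let Some(t) = &self.tokenizer {
            out.push_str(&format!(",\"tokenizer\":{}", json_str(t)));
        }
        out.push('}');
        out
    }
}

/// Wraps the fields in the `{"fields": [...]}` envelope shown above.
pub fn schema_json(fields: &[FieldDef]) -> String {
    let items: Vec<String> = fields.iter().map(FieldDef::to_json).collect();
    format!("{{\"fields\":[{}]}}", items.join(","))
}
```

For example, `FieldDef::new("views", "u64").with_fast()` corresponds to the `views` entry in the example schema.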
Query Types
Match Query (Full-Text)
{"type": "match", "field": "content", "value": "search terms"}
Term Query (Exact)
{"type": "term", "field": "id", "value": "abc123"}
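The difference between a match query on a tokenized `text` field and a term query on an untokenized `str` field can be sketched as follows. This is a toy model, not the server's implementation: it assumes the default tokenizer splits on non-alphanumeric characters and lowercases tokens (as Tantivy's default tokenizer does), and it assumes a multi-word match succeeds when any query token is present, which this README does not state.

```rust
/// Splits on non-alphanumeric characters and lowercases each token.
/// (Assumed behavior of the default tokenizer; see the note above.)
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Match-style lookup: true when any query token occurs in the field.
pub fn match_hits(field_value: &str, query: &str) -> bool {
    let field_tokens = tokenize(field_value);
    tokenize(query).iter().any(|q| field_tokens.contains(q))
}

/// Term-style lookup on an untokenized field: the value must be identical.
pub fn term_hits(field_value: &str, term: &str) -> bool {
    field_value == term
}
```

Under this model, searching a `text` field for `world` finds a document containing `World`, while a term query for `abc123` does not find a `str` field holding `ABC123`.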
Fuzzy Query (Typo-Tolerant)
{"type": "fuzzy", "field": "title", "value": "serch", "distance": 2}
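To get a feel for the `distance` parameter, the sketch below computes plain Levenshtein edit distance (insertions, deletions, substitutions). It assumes `distance` is a maximum edit distance, as in Tantivy's fuzzy term query; whether the server counts a transposition as one edit or two is not specified here, and this sketch counts it as two.

```rust
/// Levenshtein edit distance between two strings, computed per character.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + if ca == cb { 0 } else { 1 };
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// True when `candidate` is within `max_distance` edits of `term`.
pub fn fuzzy_matches(term: &str, candidate: &str, max_distance: usize) -> bool {
    levenshtein(term, candidate) <= max_distance
}
```

The query value `serch` is one insertion away from `search`, so it falls within the distance of 2 used in the example.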
Phrase Query
{"type": "phrase", "field": "content", "value": "exact phrase match"}
Prefix Query
{"type": "prefix", "field": "title", "value": "hel"}
Range Query
{"type": "range", "field": "views", "gte": 100, "lt": 1000}
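The bounds in the range example read as a half-open interval. The sketch below models that reading, assuming `gte` is an inclusive lower bound, `lt` is an exclusive upper bound, and an omitted bound leaves that side open. These semantics follow common naming conventions and are not confirmed by this README; `U64Range` is an illustrative type.

```rust
/// A range over u64 values with optional bounds.
pub struct U64Range {
    /// Inclusive lower bound ("gte"), if any.
    pub gte: Option<u64>,
    /// Exclusive upper bound ("lt"), if any.
    pub lt: Option<u64>,
}

impl U64Range {
    /// True when `value` satisfies every bound that is present.
    pub fn contains(&self, value: u64) -> bool {
        let above_lower = match self.gte {
            Some(lower) => value >= lower,
            None => true,
        };
        let below_upper = match self.lt {
            Some(upper) => value < upper,
            None => true,
        };
        above_lower && below_upper
    }
}
```

With `gte: Some(100)` and `lt: Some(1000)`, a document with 100 views is included and one with 1000 views is not.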
Regex Query
{"type": "regex", "field": "title", "value": "test.*"}
Boolean Query
{
  "type": "boolean",
  "must": [{"type": "match", "field": "content", "value": "rust"}],
  "should": [{"type": "match", "field": "title", "value": "tutorial"}],
  "must_not": [{"type": "term", "field": "status", "value": "draft"}]
}
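Nested boolean queries are easier to compose from typed values than from hand-written JSON. The sketch below models three of the query types above as a Rust enum and serializes them to the JSON shapes shown. The `Query` enum and its helpers are illustrative and not part of `heroindex_client`; only string values are handled.

```rust
/// A subset of the query types above, as a composable tree.
pub enum Query {
    Match { field: String, value: String },
    Term { field: String, value: String },
    Boolean {
        must: Vec<Query>,
        should: Vec<Query>,
        must_not: Vec<Query>,
    },
}

/// Quotes a string as a JSON string literal (escapes `\` and `"` only).
fn quote(s: &str) -> String {
    format!("\"{}\"", s.replace('\\', "\\\\").replace('"', "\\\""))
}

/// Serializes a leaf query of the given type.
fn leaf(kind: &str, field: &str, value: &str) -> String {
    format!(
        "{{\"type\":\"{}\",\"field\":{},\"value\":{}}}",
        kind,
        quote(field),
        quote(value)
    )
}

/// Serializes a list of sub-queries as a JSON array.
fn array(queries: &[Query]) -> String {
    let items: Vec<String> = queries.iter().map(Query::to_json).collect();
    format!("[{}]", items.join(","))
}

impl Query {
    /// Produces the JSON form used in the examples above.
    pub fn to_json(&self) -> String {
        match self {
            Query::Match { field, value } => leaf("match", field, value),
            Query::Term { field, value } => leaf("term", field, value),
            Query::Boolean { must, should, must_not } => format!(
                "{{\"type\":\"boolean\",\"must\":{},\"should\":{},\"must_not\":{}}}",
                array(must),
                array(should),
                array(must_not)
            ),
        }
    }
}
```

Because `Boolean` holds `Vec<Query>`, clauses can themselves be boolean queries, which keeps deeply nested filters readable.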
RPC Methods
| Method        | Description                    |
|---------------|--------------------------------|
| rpc.discover  | Get OpenRPC schema             |
| server.ping   | Health check                   |
| server.stats  | Server statistics              |
| db.list       | List all databases             |
| db.create     | Create database with schema    |
| db.delete     | Delete a database              |
| db.select     | Select database for operations |
| db.info       | Get database info              |
| schema.get    | Get current schema             |
| doc.add       | Add single document            |
| doc.add_batch | Add multiple documents         |
| doc.delete    | Delete by term                 |
| index.commit  | Commit changes                 |
| index.reload  | Reload to see changes          |
| search.query  | Execute search                 |
| search.count  | Count matches                  |
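For clients that cannot use `heroindex_client`, the methods above can in principle be called directly over the socket. The sketch below builds a JSON-RPC 2.0 request (the envelope OpenRPC describes) and exchanges one message. It assumes newline-delimited framing, which this README does not confirm; check the `rpc.discover` output or the client library source for the actual wire format. `rpc_request` and `rpc_call` are hypothetical helper names.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;

/// Builds a JSON-RPC 2.0 request. `params_json` must already be valid JSON
/// and `method` must not contain characters that need escaping.
pub fn rpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":\"{}\",\"params\":{}}}",
        id, method, params_json
    )
}

/// Sends one request and reads one reply.
/// ASSUMPTION: each message is a single line terminated by '\n'.
pub fn rpc_call(conn: &mut BufReader<UnixStream>, request: &str) -> io::Result<String> {
    let stream = conn.get_mut();
    stream.write_all(request.as_bytes())?;
    stream.write_all(b"\n")?;
    stream.flush()?;
    let mut line = String::new();
    conn.read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}
```

A health check would then look like `rpc_call(&mut conn, &rpc_request(1, "server.ping", "[]"))`, where `conn` wraps a `UnixStream` connected to the server's socket path and the empty parameter list is itself an assumption.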
Performance Tips
Use batch inserts - doc.add_batch is much faster than individual doc.add calls
Commit periodically - Don't commit after every document
Enable fast fields - For fields used in sorting/filtering
Use appropriate tokenizers - en_stem for English, raw for keywords
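The first two tips can be combined into a simple bulk-loading loop: send documents in batches and commit once every few batches. The sketch below is written against a hypothetical `IndexWriter` trait so it stays independent of any particular client; the trait, `bulk_load`, and the parameter names are all assumptions made for illustration. Documents are passed as pre-serialized JSON strings.

```rust
/// Minimal write interface; a stand-in for whatever client is in use.
/// (Hypothetical trait; heroindex_client's actual API may differ.)
pub trait IndexWriter {
    type Error;
    /// Adds a batch of JSON documents in one call (cf. doc.add_batch).
    fn add_batch(&mut self, docs: &[String]) -> Result<(), Self::Error>;
    /// Makes the pending additions durable (cf. index.commit).
    fn commit(&mut self) -> Result<(), Self::Error>;
}

/// Loads `docs` in chunks of `batch_size`, committing after every
/// `batches_per_commit` batches and once more for any remainder.
/// Returns the number of commits issued.
pub fn bulk_load<W: IndexWriter>(
    writer: &mut W,
    docs: &[String],
    batch_size: usize,
    batches_per_commit: usize,
) -> Result<usize, W::Error> {
    // Guard against zero, which would make chunks() panic or never commit.
    let batch_size = batch_size.max(1);
    let batches_per_commit = batches_per_commit.max(1);

    let mut commits = 0;
    let mut pending_batches = 0;
    for batch in docs.chunks(batch_size) {
        writer.add_batch(batch)?;
        pending_batches += 1;
        if pending_batches == batches_per_commit {
            writer.commit()?;
            commits += 1;
            pending_batches = 0;
        }
    }
    // Commit whatever is left after the last full group of batches.
    if pending_batches > 0 {
        writer.commit()?;
        commits += 1;
    }
    Ok(commits)
}
```

Suitable values for `batch_size` and `batches_per_commit` depend on document size and how quickly new documents must become searchable, so they are best found by measurement. Per the RPC table, an `index.reload` is still needed afterwards before the new documents are visible to searches.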
License
MIT License - see LICENSE for details.
Credits
Built on the excellent Tantivy search engine library.