8 unstable releases (3 breaking)
| Version | Released |
|---|---|
| 0.5.3 | Feb 15, 2026 |
| 0.5.2 | Feb 8, 2026 |
| 0.4.0 | Jan 12, 2026 |
| 0.3.1 | Dec 22, 2025 |
| 0.2.0-beta.1 | Dec 12, 2025 |
#726 in Science
286 downloads per month
Used in 2 crates (via lance-context-core)
680KB, 14K SLoC
Lance Graph Query Engine
A graph query engine for Lance datasets with Cypher syntax support. This crate enables querying Lance's columnar datasets using familiar graph query patterns, interpreting tabular data as property graphs.
Features
- Cypher query parsing and AST construction
- Graph configuration for mapping Lance tables to nodes and relationships
- Semantic validation with typed GraphError diagnostics
- Pluggable execution strategies (DataFusion planner by default, simple executor, Lance Native placeholder)
- Async query execution that returns Arrow RecordBatch results
- JSON-serializable parameter binding for reusable query templates
- Logical plan debugging via CypherQuery::explain
Quick Start
use std::collections::HashMap;
use std::sync::Arc;

use arrow_array::{ArrayRef, Int32Array, RecordBatch, StringArray};
use arrow_schema::{DataType, Field, Schema};
use lance_graph::{CypherQuery, ExecutionStrategy, GraphConfig};

let config = GraphConfig::builder()
    .with_node_label("Person", "person_id")
    .with_relationship("KNOWS", "src_person_id", "dst_person_id")
    .build()?;

let schema = Arc::new(Schema::new(vec![
    Field::new("person_id", DataType::Int32, false),
    Field::new("name", DataType::Utf8, false),
    Field::new("age", DataType::Int32, false),
]));
let batch = RecordBatch::try_new(
    schema,
    vec![
        Arc::new(Int32Array::from(vec![1, 2])) as ArrayRef,
        Arc::new(StringArray::from(vec!["Alice", "Bob"])) as ArrayRef,
        Arc::new(Int32Array::from(vec![29, 35])) as ArrayRef,
    ],
)?;

let mut tables = HashMap::new();
tables.insert("Person".to_string(), batch);

let query = CypherQuery::new("MATCH (p:Person) WHERE p.age > $min RETURN p.name")?
    .with_config(config)
    .with_parameter("min", 30);

let runtime = tokio::runtime::Runtime::new()?;

// Use default DataFusion-based execution
let result = runtime.block_on(query.execute(tables.clone(), None))?;

// Opt in to the simple executor if you only need projection/filter support.
let simple = runtime.block_on(query.execute(tables, Some(ExecutionStrategy::Simple)))?;
The query expects a HashMap<String, RecordBatch> keyed by the labels and relationship types referenced in the Cypher text. Each record batch should expose the columns configured through GraphConfig (ID fields, property fields, etc.). Relationship mappings also expect a batch keyed by the relationship type (for example KNOWS) that contains the configured source/target ID columns and any optional property columns.
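Because the table map must line up with both the Cypher text and the GraphConfig, it can help to check the inputs before executing. The following is a minimal sketch using only the standard library: a set of column names stands in for each RecordBatch schema, and missing_inputs is a hypothetical helper, not part of lance-graph.

```rust
use std::collections::{BTreeSet, HashMap};

/// Stand-in for a table's schema: just its column names.
/// (With real data these would come from the RecordBatch schema.)
type Columns = BTreeSet<String>;

/// Hypothetical helper: list every (table, column) pair that is required
/// but absent from `tables`. A table that is missing entirely is reported
/// once, with "*" as the column.
fn missing_inputs(
    tables: &HashMap<String, Columns>,
    required: &[(&str, &[&str])],
) -> Vec<(String, String)> {
    let mut missing = Vec::new();
    for (table, columns) in required {
        match tables.get(*table) {
            None => missing.push((table.to_string(), "*".to_string())),
            Some(present) => {
                for column in columns.iter() {
                    if !present.contains(*column) {
                        missing.push((table.to_string(), column.to_string()));
                    }
                }
            }
        }
    }
    missing
}
```

For the Quick Start configuration, the required list would name Person with person_id and KNOWS with src_person_id and dst_person_id. With only the Person table supplied, as in the Quick Start, the helper reports KNOWS as missing.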
Configuring Graph Mappings
Graph mappings are declared with GraphConfig::builder():
use lance_graph::{GraphConfig, NodeMapping, RelationshipMapping};

let config = GraphConfig::builder()
    .with_node_label("Person", "person_id")
    .with_relationship("KNOWS", "src_person_id", "dst_person_id")
    .build()?;
For finer control, build NodeMapping and RelationshipMapping instances explicitly:
let person = NodeMapping::new("Person", "person_id")
    .with_properties(vec!["name".into(), "age".into()])
    .with_filter("kind = 'person'");

let knows = RelationshipMapping::new("KNOWS", "src_person_id", "dst_person_id")
    .with_properties(vec!["since".into()]);

let config = GraphConfig::builder()
    .with_node_mapping(person)
    .with_relationship_mapping(knows)
    .build()?;
Executing Cypher Queries
- CypherQuery::new parses Cypher text into the internal AST.
- with_config attaches the graph configuration used for validation and execution.
- with_parameter / with_parameters bind JSON-serializable values that can be referenced as $param in the Cypher text.
- execute is asynchronous and returns an Arrow RecordBatch. Pass None for the default DataFusion planner or Some(ExecutionStrategy::Simple) for the single-table executor. ExecutionStrategy::LanceNative is reserved for future native execution support and currently errors.
- explain is asynchronous and returns a formatted string containing the graph logical plan alongside the DataFusion logical and physical plans.
Queries with a single MATCH clause containing a path pattern are planned as joins using the provided mappings. Other queries can opt into the single-table projection/filter pipeline via ExecutionStrategy::Simple when DataFusion's planner is unnecessary.
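To make "planned as joins" concrete, here is a conceptual sketch of what a one-hop path pattern computes: the relationship table is joined to the node table once on the source ID and once on the target ID. It uses plain tuples instead of Arrow data, and the function name and row types are illustrative assumptions. It is not the plan DataFusion actually produces.

```rust
use std::collections::HashMap;

/// One row of a hypothetical Person table: (person_id, name).
type Person = (i32, &'static str);
/// One row of a hypothetical KNOWS table: (src_person_id, dst_person_id).
type Knows = (i32, i32);

/// Conceptual evaluation of
///   MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name
/// as Person JOIN KNOWS JOIN Person, using a hash lookup on person_id.
fn knows_pairs(people: &[Person], knows: &[Knows]) -> Vec<(&'static str, &'static str)> {
    // Index the node table by its ID column once.
    let by_id: HashMap<i32, &'static str> = people.iter().copied().collect();

    knows
        .iter()
        // Inner join: edges whose endpoints are missing from Person are dropped.
        .filter_map(|(src, dst)| Some((*by_id.get(src)?, *by_id.get(dst)?)))
        .collect()
}
```

The ID columns named in the relationship mapping (src_person_id and dst_person_id in the examples above) play the role of the two join keys here.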
A builder (CypherQueryBuilder) is also available for constructing queries programmatically without parsing text.
Supported Cypher Surface
- Node patterns (:Label) with optional variables.
- Relationship patterns with fixed direction and type, including multi-hop paths.
- Property comparisons against literal values with AND/OR/NOT/EXISTS.
- RETURN lists of property accesses, optional DISTINCT, ORDER BY, SKIP (offset), and LIMIT.
- Positional and named parameters (e.g. $min_age).
Basic aggregations like COUNT are supported. Optional matches and subqueries are parsed but not executed yet.
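As a rough guide, the query strings below are written to stay within the surface listed above. They are illustrative only and have not been checked against the parser: the labels, relationship type, and property names are assumptions borrowed from the Quick Start, and the multi-hop example assumes that chained relationship patterns are what "multi-hop" refers to.

```rust
/// Example Cypher texts intended to fit the documented surface.
/// EXAMPLE_QUERIES is a name chosen for this sketch, not a crate item.
const EXAMPLE_QUERIES: &[&str] = &[
    // Node pattern with a variable, a property comparison, a named parameter.
    "MATCH (p:Person) WHERE p.age > $min_age RETURN p.name",
    // Boolean combination of comparisons against literal values.
    "MATCH (p:Person) WHERE p.age >= 18 AND NOT p.name = 'Bob' RETURN p.name, p.age",
    // Relationship pattern with fixed direction and type.
    "MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name",
    // Chained two-hop path with DISTINCT, ORDER BY, SKIP and LIMIT.
    "MATCH (a:Person)-[:KNOWS]->(b:Person)-[:KNOWS]->(c:Person) \
     RETURN DISTINCT c.name ORDER BY c.name SKIP 5 LIMIT 10",
];
```

Each string could be passed to CypherQuery::new; the first one also needs a value bound for min_age before execution.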
Crate Layout
- ast – Cypher AST definitions.
- parser – Nom-based Cypher parser.
- semantic – Lightweight semantic checks on the AST.
- logical_plan – Builders for graph logical plans.
- datafusion_planner – DataFusion-based execution planning.
- simple_executor – Simple single-table executor.
- config – Graph configuration types and builders.
- query – High-level CypherQuery API and runtime.
- error – GraphError and result helpers.
- namespace – Namespace helpers (re-exported from lance-graph-catalog).
- source_catalog – Catalog helpers for looking up table metadata (re-exported from lance-graph-catalog).
lance-graph re-exports the catalog and namespace types from the lance-graph-catalog crate for
API compatibility. You can depend on lance-graph-catalog directly if you only need catalog or
namespace utilities.
Error Handling
Most APIs return Result<T, GraphError>. Errors include parsing failures, missing mappings, and execution issues surfaced from DataFusion.
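A caller will often want to treat these categories differently. The sketch below shows one way to branch on them. SketchError is a stand-in defined only for this example: its variants mirror the three categories named above, and the real GraphError variants and payloads may differ.

```rust
use std::fmt;

/// Stand-in for the crate's error type, used only for this sketch.
/// The real GraphError variants and payloads may differ.
#[derive(Debug)]
enum SketchError {
    Parse(String),
    MissingMapping(String),
    Execution(String),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Parse(msg) => write!(f, "parse error: {}", msg),
            SketchError::MissingMapping(label) => write!(f, "no mapping for {}", label),
            SketchError::Execution(msg) => write!(f, "execution failed: {}", msg),
        }
    }
}

impl std::error::Error for SketchError {}

/// Turn a query outcome (row count on success) into a message for the caller.
fn report(outcome: Result<usize, SketchError>) -> String {
    match outcome {
        Ok(rows) => format!("query returned {} rows", rows),
        // Parse and mapping problems are fixed on the caller's side.
        Err(e @ (SketchError::Parse(_) | SketchError::MissingMapping(_))) => {
            format!("check the query or GraphConfig: {}", e)
        }
        // Everything else originates in the execution engine.
        Err(e) => format!("engine error: {}", e),
    }
}
```

Separating caller-side mistakes from engine failures keeps the first group actionable (fix the Cypher text or the mappings) while the second can be logged or surfaced as-is.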
Testing
cargo test -p lance-graph
Benchmarks
See the repository root README.md for benchmark setup, run commands, and report locations.
Python Bindings
See the Python package docs for setup and development:
- Python package README: python/README.md
- Runnable examples (from repo root): examples/README.md
License
Apache-2.0. See the top-level LICENSE file for details.
Dependencies
~166MB, ~2.5M SLoC