Introducing Graph Concepts in Java With Eclipse JNoSQL, Part 3: Understanding Janus
Explore JanusGraph with Java using Eclipse JNoSQL 1.1.8. Model entities, traverse with Gremlin, and query with Jakarta Data — scalable graph power with clean Java APIs.
Graph databases are increasingly popular in modern applications because they can model complex relationships natively. From recommendation systems to fraud detection, graphs provide a more natural representation of connected data. Our previous articles explored graph databases broadly and delved into Neo4j. In this third part, we focus on JanusGraph, a scalable and distributed graph database.
Unlike Neo4j, JanusGraph supports multiple backends and leverages Apache TinkerPop, a graph computing framework that introduces a standard API and query language (Gremlin) for various databases. This abstraction makes JanusGraph a flexible choice for enterprise applications.
Understanding JanusGraph and TinkerPop
JanusGraph is an open-source, distributed graph database that handles huge volumes of transactional and analytical data. It supports different storage and indexing backends, including Cassandra, HBase, BerkeleyDB, and Elasticsearch.
It implements the TinkerPop framework, which provides two main components:
- Gremlin: A graph traversal language (both declarative and imperative).
- TinkerPop API: A set of interfaces for working with graph databases across different engines.
This allows developers to write database-agnostic code that runs on any TinkerPop-compatible engine.
Gremlin is a functional, step-based language for querying graph structures. It focuses on traversals: the act of walking through a graph. Gremlin supports OLTP (real-time) and OLAP (analytics) use cases across more than 30 graph database vendors.
Feature | SQL | Gremlin |
---|---|---|
Entity Retrieval | SELECT * FROM Book | g.V().hasLabel('Book') |
Filtering | WHERE name = 'Java' | has('name','Java') |
Join/Relationship | JOIN Book_Category ON ... | g.V().hasLabel('Book').out('is').hasLabel('Category') |
Grouping & Count | GROUP BY category_id | group().by('category').by(count()) |
Schema Flexibility | Fixed schema | Dynamic properties, schema-optional |
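To make the mapping in the table more concrete, here is a minimal plain-Java sketch of what those Gremlin steps do conceptually. It does not use TinkerPop: the `TinyGraph`, `Vertex`, and `Edge` types are hypothetical stand-ins invented for illustration, and a real engine evaluates traversals lazily and with indexes rather than by scanning lists.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-ins for illustration only; this is not the TinkerPop API.
final class TinyGraph {

    record Vertex(long id, String label, String name) { }

    record Edge(Vertex from, String label, Vertex to) { }

    private final List<Vertex> vertices;
    private final List<Edge> edges;

    TinyGraph(List<Vertex> vertices, List<Edge> edges) {
        this.vertices = List.copyOf(vertices);
        this.edges = List.copyOf(edges);
    }

    // Roughly g.V().hasLabel(label)
    List<Vertex> hasLabel(String label) {
        return vertices.stream().filter(v -> v.label().equals(label)).toList();
    }

    // Roughly g.V().hasLabel(label).has('name', name).out(edgeLabel)
    List<Vertex> out(String label, String name, String edgeLabel) {
        return edges.stream()
                .filter(e -> e.label().equals(edgeLabel))
                .filter(e -> e.from().label().equals(label) && e.from().name().equals(name))
                .map(Edge::to)
                .toList();
    }

    // Roughly a group-and-count: number of incoming edges per target vertex name
    Map<String, Long> countByTarget(String edgeLabel) {
        return edges.stream()
                .filter(e -> e.label().equals(edgeLabel))
                .collect(Collectors.groupingBy(e -> e.to().name(), TreeMap::new, Collectors.counting()));
    }

    // A tiny sample graph: two books, two categories, three "is" edges
    static TinyGraph sample() {
        var java = new Vertex(1, "Category", "Java");
        var software = new Vertex(2, "Category", "Software");
        var effectiveJava = new Vertex(3, "Book", "Effective Java");
        var cleanArchitecture = new Vertex(4, "Book", "Clean Architecture");
        return new TinyGraph(
                List.of(java, software, effectiveJava, cleanArchitecture),
                List.of(new Edge(effectiveJava, "is", java),
                        new Edge(effectiveJava, "is", software),
                        new Edge(cleanArchitecture, "is", software)));
    }

    public static void main(String[] args) {
        var graph = sample();
        System.out.println(graph.hasLabel("Book"));
        System.out.println(graph.out("Book", "Effective Java", "is"));
        System.out.println(graph.countByTarget("is"));
    }
}
```

The point of the sketch is that a relationship lookup is a walk along edges rather than a join over a link table, which is why the Gremlin column reads as a chain of steps.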
JanusGraph supports both embedded and external configurations. To get started quickly using Cassandra and Elasticsearch, run this docker-compose file:
version: '3.8'
services:
  cassandra:
    image: cassandra:3.11
    ports:
      - "9042:9042"
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.10.2
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
  janusgraph:
    image: janusgraph/janusgraph:latest
    depends_on:
      - cassandra
      - elasticsearch
    ports:
      - "8182:8182"
    environment:
      - gremlin.graph=org.janusgraph.core.ConfiguredGraphFactory
      - storage.backend=cql
      - storage.hostname=cassandra
      - index.search.backend=elasticsearch
      - index.search.hostname=elasticsearch
Alternatively, you can avoid external dependencies entirely for local development or embedded environments. JanusGraph supports embedded mode using BerkeleyDB Java Edition (berkeleyje) for local graph storage, and Lucene as the indexing engine. This is especially useful for quick prototyping or running unit tests without setting up infrastructure.
BerkeleyJE is a fast, embeddable key-value store written in Java, which stores your graph data directly on the local filesystem.
Here’s an example configuration for embedded mode:
storage.backend=berkeleyje
storage.directory=../target/jnosql/berkeleyje
index.search.backend=lucene
index.search.directory=../target/jnosql/lucene
If you want to run against Cassandra and Elasticsearch instead, update the properties as follows:
jnosql.graph.database=janusgraph
storage.backend=cql
storage.hostname=localhost
index.search.backend=elasticsearch
index.search.hostname=localhost
Before modeling the domain, we need to define the structure of the entities that will represent our graph. In this case, we use two vertex types: Book and Category. Each entity is annotated using Jakarta NoSQL annotations and contains a unique ID and a name. These entities will form the foundation of our graph, allowing us to define relationships between books and their associated categories.
@Entity
public class Book {

    @Id
    private Long id;

    @Column
    private String name;
}

@Entity
public class Category {

    @Id
    private Long id;

    @Column
    private String name;
}
Once the entities are defined, the next step is to persist and retrieve them from the database. For this purpose, Eclipse JNoSQL provides the TinkerpopTemplate, which is a specialization of the generic Template interface specifically designed for graph operations using Apache TinkerPop. The service layer encapsulates the logic of querying the database for existing books or categories and inserting new ones if they don't exist. This pattern helps maintain idempotency when saving data, ensuring duplicates aren't created.
@ApplicationScoped
public class BookService {

    @Inject
    private TinkerpopTemplate template;

    public Book save(Book book) {
        return template.select(Book.class).where("name").eq(book.getName()).<Book>singleResult()
                .orElseGet(() -> template.insert(book));
    }

    public Category save(Category category) {
        return template.select(Category.class).where("name").eq(category.getName()).<Category>singleResult()
                .orElseGet(() -> template.insert(category));
    }
}
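The find-or-insert pattern used by the service can be sketched without a database. The `InMemoryStore` class below is a hypothetical stand-in for the template, not part of Eclipse JNoSQL; it only shows why saving the same name twice leaves a single stored entity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for the template; it is not part of Eclipse JNoSQL.
final class InMemoryStore {

    record Book(Long id, String name) { }

    private final Map<String, Book> booksByName = new LinkedHashMap<>();
    private final AtomicLong ids = new AtomicLong();

    // Plays the role of select(...).where("name").eq(name).singleResult()
    Optional<Book> findByName(String name) {
        return Optional.ofNullable(booksByName.get(name));
    }

    // Plays the role of insert(...): assigns an ID and stores the entity
    Book insert(String name) {
        var book = new Book(ids.incrementAndGet(), name);
        booksByName.put(name, book);
        return book;
    }

    // Find-or-insert: only inserts when the lookup comes back empty
    Book save(String name) {
        return findByName(name).orElseGet(() -> insert(name));
    }

    int size() {
        return booksByName.size();
    }

    public static void main(String[] args) {
        var store = new InMemoryStore();
        var first = store.save("Effective Java");
        var second = store.save("Effective Java");
        System.out.println(first.equals(second));
        System.out.println(store.size());
    }
}
```

Keep in mind that the lookup and the insert are two separate steps. Against a real database, two concurrent callers could both miss the lookup and both insert, so this pattern avoids duplicates on repeated runs but is not a substitute for a uniqueness guarantee enforced by the database itself.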
The BookApp class shows full execution: inserting entities, creating relationships (edges), and executing Gremlin queries:
var architectureBooks = template.gremlin("g.V().hasLabel('Category').has('name','Architecture').in('is')").toList();
var highRelevanceBooks = template.gremlin("g.E().hasLabel('is').has('relevance', gte(9)).outV().hasLabel('Book').dedup()").toList();
You can also chain traversals with .traversalVertex() for more fluent pipelines:
List<String> softwareBooks = template.traversalVertex().hasLabel("Category")
        .has("name", "Software")
        .in("is").hasLabel("Book").<Book>result()
        .map(Book::getName).toList();
The complete BookApp class shows how TinkerpopTemplate acts as the bridge between Java and JanusGraph:
public final class BookApp {

    private BookApp() {
    }

    public static void main(String[] args) {
        try (SeContainer container = SeContainerInitializer.newInstance().initialize()) {
            var template = container.select(TinkerpopTemplate.class).get();
            var service = container.select(BookService.class).get();

            // Category vertices (created only if they do not exist yet)
            var software = service.save(Category.of("Software"));
            var java = service.save(Category.of("Java"));
            var architecture = service.save(Category.of("Architecture"));
            var performance = service.save(Category.of("Performance"));

            // Book vertices
            var effectiveJava = service.save(Book.of("Effective Java"));
            var cleanArchitecture = service.save(Book.of("Clean Architecture"));
            var systemDesign = service.save(Book.of("System Design Interview"));
            var javaPerformance = service.save(Book.of("Java Performance"));

            // Edges: book -[is]-> category, each with a relevance property
            template.edge(Edge.source(effectiveJava).label("is").target(java).property("relevance", 10).build());
            template.edge(Edge.source(effectiveJava).label("is").target(software).property("relevance", 9).build());
            template.edge(Edge.source(cleanArchitecture).label("is").target(software).property("relevance", 8).build());
            template.edge(Edge.source(cleanArchitecture).label("is").target(architecture).property("relevance", 10).build());
            template.edge(Edge.source(systemDesign).label("is").target(architecture).property("relevance", 9).build());
            template.edge(Edge.source(systemDesign).label("is").target(software).property("relevance", 7).build());
            template.edge(Edge.source(javaPerformance).label("is").target(performance).property("relevance", 8).build());
            template.edge(Edge.source(javaPerformance).label("is").target(java).property("relevance", 9).build());

            // Fluent traversals
            List<String> softwareCategories = template.traversalVertex().hasLabel("Category")
                    .has("name", "Software")
                    .in("is").hasLabel("Category").<Category>result()
                    .map(Category::getName)
                    .toList();

            List<String> softwareBooks = template.traversalVertex().hasLabel("Category")
                    .has("name", "Software")
                    .in("is").hasLabel("Book").<Book>result()
                    .map(Book::getName)
                    .toList();

            List<String> softwareNoSQLBooks = template.traversalVertex().hasLabel("Category")
                    .has("name", "Software")
                    .in("is")
                    .has("name", "NoSQL")
                    .in("is").<Book>result()
                    .map(Book::getName)
                    .toList();

            System.out.println("The software categories: " + softwareCategories);
            System.out.println("The software books: " + softwareBooks);
            System.out.println("The software and NoSQL books: " + softwareNoSQLBooks);

            // Gremlin queries passed as strings
            System.out.println("\nBooks in 'Architecture' category:");
            var architectureBooks = template.gremlin("g.V().hasLabel('Category').has('name','Architecture').in('is')").toList();
            architectureBooks.forEach(doc -> System.out.println(" - " + doc));

            System.out.println("Categories with more than one book:");
            var commonCategories = template.gremlin("g.V().hasLabel('Category').where(__.in('is').count().is(gt(1)))").toList();
            commonCategories.forEach(doc -> System.out.println(" - " + doc));

            var highRelevanceBooks = template.gremlin("g.E().hasLabel('is').has('relevance', gte(9))" +
                    ".outV().hasLabel('Book').dedup()").toList();
            System.out.println("Books with high relevance:");
            highRelevanceBooks.forEach(doc -> System.out.println(" - " + doc));

            // Parameterized Gremlin query: @name is bound from the map
            System.out.println("\nBooks with name: 'Effective Java':");
            var effectiveJavaBooks = template.gremlin("g.V().hasLabel('Book').has('name', @name)", Collections.singletonMap("name", "Effective Java")).toList();
            effectiveJavaBooks.forEach(doc -> System.out.println(" - " + doc));
        }
    }
}
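To reason about what these queries should return without starting JanusGraph, the following plain-Java sketch rebuilds the eight "is" edges from BookApp and evaluates three of the traversals with streams. The `SampleQueries` class, the `Is` record, and the method names are hypothetical, and the logic only approximates Gremlin semantics; in particular, a real graph does not guarantee result ordering.

```java
import java.util.List;

// Plain-Java approximation of three BookApp queries; the types here are illustrative only.
final class SampleQueries {

    // One "is" edge: book -> category, with a relevance property
    record Is(String book, String category, int relevance) { }

    // The same eight edges that BookApp creates
    static final List<Is> EDGES = List.of(
            new Is("Effective Java", "Java", 10),
            new Is("Effective Java", "Software", 9),
            new Is("Clean Architecture", "Software", 8),
            new Is("Clean Architecture", "Architecture", 10),
            new Is("System Design Interview", "Architecture", 9),
            new Is("System Design Interview", "Software", 7),
            new Is("Java Performance", "Performance", 8),
            new Is("Java Performance", "Java", 9));

    // Approximates g.V().hasLabel('Category').has('name', category).in('is')
    static List<String> booksIn(String category) {
        return EDGES.stream()
                .filter(e -> e.category().equals(category))
                .map(Is::book)
                .toList();
    }

    // Approximates g.V().hasLabel('Category').where(__.in('is').count().is(gt(1)))
    static List<String> categoriesWithMoreThanOneBook() {
        return EDGES.stream()
                .map(Is::category)
                .distinct()
                .filter(c -> booksIn(c).size() > 1)
                .toList();
    }

    // Approximates g.E().hasLabel('is').has('relevance', gte(min)).outV().hasLabel('Book').dedup()
    static List<String> booksWithRelevanceAtLeast(int min) {
        return EDGES.stream()
                .filter(e -> e.relevance() >= min)
                .map(Is::book)
                .distinct()
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(booksIn("Architecture"));
        System.out.println(categoriesWithMoreThanOneBook());
        System.out.println(booksWithRelevanceAtLeast(9));
    }
}
```

With this data, every book has at least one edge with relevance 9 or higher, so the high-relevance query matches all four books, and Performance is the only category with a single book.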
To complement the use of TinkerpopTemplate, Eclipse JNoSQL supports the Jakarta Data specification by enabling repository-based data access. This approach allows developers to define interfaces, such as BookRepository and CategoryRepository, that automatically provide CRUD operations and support custom graph traversals through the @Gremlin annotation.
By combining standard method name queries (e.g., findByName) with expressive Gremlin scripts, we gain both convenience and fine-grained control over graph traversal logic. These repositories are ideal for clean, testable, and declarative access patterns in graph-based applications.
@Repository
public interface BookRepository extends TinkerPopRepository<Book, Long> {

    Optional<Book> findByName(String name);

    @Gremlin("g.V().hasLabel('Book').out('is').hasLabel('Category').has('name','Architecture').in('is').dedup()")
    List<Book> findArchitectureBooks();

    @Gremlin("g.E().hasLabel('is').has('relevance', gte(9)).outV().hasLabel('Book').dedup()")
    List<Book> highRelevanceBooks();
}

@Repository
public interface CategoryRepository extends TinkerPopRepository<Category, Long> {

    Optional<Category> findByName(String name);

    @Gremlin("g.V().hasLabel('Category').where(__.in('is').count().is(gt(1)))")
    List<Category> commonCategories();
}
After defining the repositories, we can build a full application that leverages them to manage data and run queries. The BookApp2 class illustrates this repository-driven execution flow. It uses the repositories to create or fetch vertices (Book and Category) and falls back to GraphTemplate only when inserting edges, since Jakarta Data currently supports querying vertices but not edge creation. This hybrid model provides a clean separation of concerns and reduces boilerplate, making it easier to read, test, and maintain.
public final class BookApp2 {

    private BookApp2() {
    }

    public static void main(String[] args) {
        try (SeContainer container = SeContainerInitializer.newInstance().initialize()) {
            var template = container.select(GraphTemplate.class).get();
            var bookRepository = container.select(BookRepository.class).get();
            var repository = container.select(CategoryRepository.class).get();

            // Fetch each category by name, or save it if it does not exist yet
            var software = repository.findByName("Software").orElseGet(() -> repository.save(Category.of("Software")));
            var java = repository.findByName("Java").orElseGet(() -> repository.save(Category.of("Java")));
            var architecture = repository.findByName("Architecture").orElseGet(() -> repository.save(Category.of("Architecture")));
            var performance = repository.findByName("Performance").orElseGet(() -> repository.save(Category.of("Performance")));

            // Same find-or-save approach for books
            var effectiveJava = bookRepository.findByName("Effective Java").orElseGet(() -> bookRepository.save(Book.of("Effective Java")));
            var cleanArchitecture = bookRepository.findByName("Clean Architecture").orElseGet(() -> bookRepository.save(Book.of("Clean Architecture")));
            var systemDesign = bookRepository.findByName("System Design Interview").orElseGet(() -> bookRepository.save(Book.of("System Design Interview")));
            var javaPerformance = bookRepository.findByName("Java Performance").orElseGet(() -> bookRepository.save(Book.of("Java Performance")));

            // Edges are still created through the template
            template.edge(Edge.source(effectiveJava).label("is").target(java).property("relevance", 10).build());
            template.edge(Edge.source(effectiveJava).label("is").target(software).property("relevance", 9).build());
            template.edge(Edge.source(cleanArchitecture).label("is").target(software).property("relevance", 8).build());
            template.edge(Edge.source(cleanArchitecture).label("is").target(architecture).property("relevance", 10).build());
            template.edge(Edge.source(systemDesign).label("is").target(architecture).property("relevance", 9).build());
            template.edge(Edge.source(systemDesign).label("is").target(software).property("relevance", 7).build());
            template.edge(Edge.source(javaPerformance).label("is").target(performance).property("relevance", 8).build());
            template.edge(Edge.source(javaPerformance).label("is").target(java).property("relevance", 9).build());

            System.out.println("Books in 'Architecture' category:");
            var architectureBooks = bookRepository.findArchitectureBooks();
            architectureBooks.forEach(doc -> System.out.println(" - " + doc));

            System.out.println("Categories with more than one book:");
            var commonCategories = repository.commonCategories();
            commonCategories.forEach(doc -> System.out.println(" - " + doc));

            var highRelevanceBooks = bookRepository.highRelevanceBooks();
            System.out.println("Books with high relevance:");
            highRelevanceBooks.forEach(doc -> System.out.println(" - " + doc));

            // Uses findByName, the method declared on BookRepository above
            var bookByName = bookRepository.findByName("Effective Java");
            System.out.println("Book by name: " + bookByName);
        }
    }
}
JanusGraph, backed by Apache TinkerPop and Gremlin, offers a highly scalable and portable way to model and traverse complex graphs. With Eclipse JNoSQL and Jakarta Data, Java developers can harness powerful graph capabilities while enjoying a clean and modular API. JanusGraph adapts to your architecture, whether embedded or distributed, while keeping your queries expressive and concise.