Introducing Graph Concepts in Java With Eclipse JNoSQL, Part 3: Understanding Janus

Explore JanusGraph with Java using Eclipse JNoSQL 1.1.8. Model entities, traverse with Gremlin, and query with Jakarta Data — scalable graph power with clean Java APIs.

Otavio Santana

May. 29, 25 · Analysis


Graph databases are increasingly popular in modern applications because they model complex relationships natively. From recommendation systems to fraud detection, graphs provide a more natural representation of connected data. Our previous articles explored graph databases broadly and delved into Neo4j. In this third part, we focus on JanusGraph, a scalable and distributed graph database.

Unlike Neo4j, JanusGraph supports multiple backends and leverages Apache TinkerPop, a graph computing framework that introduces a standard API and query language (Gremlin) for various databases. This abstraction makes JanusGraph a flexible choice for enterprise applications.

Understanding JanusGraph and TinkerPop

JanusGraph is an open-source, distributed graph database designed to handle large volumes of transactional and analytical data. It supports pluggable storage backends such as Cassandra, HBase, and BerkeleyDB, and indexing backends such as Elasticsearch.

It implements the TinkerPop framework, which provides two main components:

Gremlin: A graph traversal language (both declarative and imperative).
TinkerPop API: A set of interfaces for working with graph databases across different engines.

This allows developers to write database-agnostic code that runs on any TinkerPop-compatible engine.

Gremlin is a functional, step-based language for querying graph structures. It focuses on traversals: the act of walking through a graph. Gremlin supports OLTP (real-time) and OLAP (analytics) use cases across more than 30 graph database vendors.

Feature	SQL	Gremlin
Entity Retrieval	SELECT * FROM Book	g.V().hasLabel('Book')
Filtering	WHERE name = 'Java'	has('name','Java')
Join/Relationship	JOIN Book_Category ON ...	g.V().hasLabel('Book').out('is').hasLabel('Category')
Grouping & Count	GROUP BY category_id	group().by('category').by(count())
Schema Flexibility	Fixed schema	Dynamic properties, schema-optional
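
To make the step-based style in the table more concrete, here is a minimal plain-Java sketch that mirrors a few Gremlin steps with the Stream API. It is only an analogy: the `Node` and `Link` records, the `GremlinStepsSketch` class, and the sample data are hypothetical stand-ins and are not part of TinkerPop or JanusGraph.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-ins; none of these types come from TinkerPop or JanusGraph.
final class GremlinStepsSketch {
    record Node(String label, String name) {}
    record Link(Node from, String label, Node to) {}

    static List<Link> sampleLinks() {
        Node javaCategory = new Node("Category", "Java");
        Node software = new Node("Category", "Software");
        Node effectiveJava = new Node("Book", "Effective Java");
        Node cleanArchitecture = new Node("Book", "Clean Architecture");
        return List.of(
                new Link(effectiveJava, "is", javaCategory),
                new Link(effectiveJava, "is", software),
                new Link(cleanArchitecture, "is", software));
    }

    // Mirrors g.V().hasLabel('Book').has('name', bookName).out('is').hasLabel('Category')
    static List<String> categoriesOf(List<Link> links, String bookName) {
        return links.stream()
                .filter(l -> l.from().label().equals("Book"))   // hasLabel('Book')
                .filter(l -> l.from().name().equals(bookName))  // has('name', bookName)
                .filter(l -> l.label().equals("is"))            // out('is'): follow outgoing 'is' edges
                .map(Link::to)
                .filter(n -> n.label().equals("Category"))      // hasLabel('Category')
                .map(Node::name)
                .toList();
    }

    // Mirrors a grouped count: how many 'is' edges point at each category
    static Map<String, Long> booksPerCategory(List<Link> links) {
        return links.stream()
                .filter(l -> l.label().equals("is"))
                .collect(Collectors.groupingBy(l -> l.to().name(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Link> links = sampleLinks();
        System.out.println(categoriesOf(links, "Effective Java")); // [Java, Software]
        System.out.println(booksPerCategory(links));               // {Java=1, Software=2}
    }
}
```

Each stream operation plays the role of one traversal step, which is why Gremlin pipelines tend to read naturally to Java developers. A real graph engine walks edges through its indexes and adjacency structures rather than scanning a list, so this sketch says nothing about performance.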

JanusGraph supports both embedded and external configurations. To get started quickly using Cassandra and Elasticsearch, run this docker-compose file:

YAML

version: '3.8'
services:
  cassandra:
    image: cassandra:3.11
    ports:
      - "9042:9042"

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.10.2
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - "9200:9200"

  janusgraph:
    image: janusgraph/janusgraph:latest
    depends_on:
      - cassandra
      - elasticsearch
    ports:
      - "8182:8182"
    environment:
      - gremlin.graph=org.janusgraph.core.ConfiguredGraphFactory
      - storage.backend=cql
      - storage.hostname=cassandra
      - index.search.backend=elasticsearch
      - index.search.hostname=elasticsearch
  

Alternatively, you can avoid external dependencies entirely for local development or embedded environments. JanusGraph supports embedded mode using BerkeleyDB Java Edition (berkeleyje) for local graph storage, and Lucene as the indexing engine. This is especially useful for quick prototyping or running unit tests without setting up infrastructure.

BerkeleyJE is a fast, embeddable key-value store written in Java, which stores your graph data directly on the local filesystem.

Here’s an example configuration for embedded mode:

Properties files

storage.backend=berkeleyje
storage.directory=../target/jnosql/berkeleyje
index.search.backend=lucene
index.search.directory=../target/jnosql/lucene

To run against Cassandra and Elasticsearch instead, update the properties as follows:

Properties files

jnosql.graph.database=janusgraph
storage.backend=cql
storage.hostname=localhost
index.search.backend=elasticsearch
index.search.hostname=localhost
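
Because the two modes differ only in a handful of keys, it can be useful to check which one a given configuration selects, for example to make sure unit tests never point at a shared cluster. The following is a minimal sketch using only `java.util.Properties`; the `StorageModeCheck` class and its `isEmbedded` method are hypothetical helpers and are not part of Eclipse JNoSQL or JanusGraph.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Eclipse JNoSQL or JanusGraph.
final class StorageModeCheck {

    // True when storage.backend selects the in-process BerkeleyDB JE store.
    static boolean isEmbedded(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return "berkeleyje".equals(props.getProperty("storage.backend"));
    }

    public static void main(String[] args) {
        String embedded = "storage.backend=berkeleyje\nstorage.directory=../target/jnosql/berkeleyje\n";
        String remote = "storage.backend=cql\nstorage.hostname=localhost\n";
        System.out.println(isEmbedded(embedded)); // true
        System.out.println(isEmbedded(remote));   // false
    }
}
```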
  

Before modeling the domain, we need to define the structure of the entities that will represent our graph. In this case, we use two vertex types: Book and Category. Each entity is annotated using Jakarta NoSQL annotations and contains a unique ID and a name. These entities will form the foundation of our graph, allowing us to define relationships between books and their associated categories.

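Java

import jakarta.nosql.Column;
import jakarta.nosql.Entity;
import jakarta.nosql.Id;

@Entity
public class Book {
    @Id
    private Long id;
    @Column
    private String name;
    // Getters and the static Book.of(name) factory used in later examples are omitted for brevity.
}

@Entity
public class Category {
    @Id
    private Long id;
    @Column
    private String name;
    // Getters and the static Category.of(name) factory used in later examples are omitted for brevity.
}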
  

Once the entities are defined, the next step is to persist and retrieve them from the database. For this purpose, Eclipse JNoSQL provides the TinkerpopTemplate, which is a specialization of the generic Template interface specifically designed for graph operations using Apache TinkerPop. The service layer encapsulates the logic of querying the database for existing books or categories and inserting new ones if they don't exist. This pattern helps maintain idempotency when saving data, ensuring duplicates aren't created.

Java

@ApplicationScoped
public class BookService {
    @Inject
    private TinkerpopTemplate template;

    public Book save(Book book) {
        return template.select(Book.class).where("name").eq(book.getName()).<Book>singleResult()
                .orElseGet(() -> template.insert(book));
    }

    public Category save(Category category) {
        return template.select(Category.class).where("name").eq(category.getName()).<Category>singleResult()
                .orElseGet(() -> template.insert(category));
    }
}
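
The find-or-insert pattern used by BookService can be illustrated without a database. The sketch below uses a hypothetical `InMemoryBookStore` backed by a map, which is an assumption made purely for illustration and is not an Eclipse JNoSQL type. It shows that calling `save` twice with the same name yields a single stored record.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for the database; not part of Eclipse JNoSQL.
final class InMemoryBookStore {
    record StoredBook(long id, String name) {}

    private final Map<String, StoredBook> byName = new LinkedHashMap<>();
    private final AtomicLong ids = new AtomicLong();

    Optional<StoredBook> findByName(String name) {
        return Optional.ofNullable(byName.get(name));
    }

    StoredBook insert(String name) {
        StoredBook book = new StoredBook(ids.incrementAndGet(), name);
        byName.put(name, book);
        return book;
    }

    // Same shape as BookService.save: look up first, insert only on a miss.
    StoredBook save(String name) {
        return findByName(name).orElseGet(() -> insert(name));
    }

    int size() {
        return byName.size();
    }

    public static void main(String[] args) {
        InMemoryBookStore store = new InMemoryBookStore();
        StoredBook first = store.save("Effective Java");
        StoredBook second = store.save("Effective Java");
        System.out.println(first.equals(second)); // true: the second call found the existing record
        System.out.println(store.size());         // 1
    }
}
```

Note that a lookup followed by an insert is not atomic. If several callers may save the same name concurrently, the check alone does not rule out duplicates, and a uniqueness constraint on the database side would be the more robust guard.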
  

The BookApp class shows full execution: inserting entities, creating relationships (edges), and executing Gremlin queries:

Java

var architectureBooks = template.gremlin("g.V().hasLabel('Category').has('name','Architecture').in('is')").toList();
var highRelevanceBooks = template.gremlin("g.E().hasLabel('is').has('relevance', gte(9)).outV().hasLabel('Book').dedup()").toList();

You can also chain traversals with .traversalVertex() for more fluent pipelines:

Java

List<String> softwareBooks = template.traversalVertex().hasLabel("Category")
    .has("name", "Software")
    .in("is").hasLabel("Book").<Book>result()
    .map(Book::getName).toList();

The complete BookApp class puts these pieces together, with TinkerpopTemplate acting as the bridge between Java and JanusGraph:

Java

public final class BookApp {

    private BookApp() {
    }

    public static void main(String[] args) {

        try (SeContainer container = SeContainerInitializer.newInstance().initialize()) {
            var template = container.select(TinkerpopTemplate.class).get();
            var service = container.select(BookService.class).get();

            var software = service.save(Category.of("Software"));
            var java = service.save(Category.of("Java"));
            var architecture = service.save(Category.of("Architecture"));
            var performance = service.save(Category.of("Performance"));

            var effectiveJava = service.save(Book.of("Effective Java"));
            var cleanArchitecture = service.save(Book.of("Clean Architecture"));
            var systemDesign = service.save(Book.of("System Design Interview"));
            var javaPerformance = service.save(Book.of("Java Performance"));


            template.edge(Edge.source(effectiveJava).label("is").target(java).property("relevance", 10).build());
            template.edge(Edge.source(effectiveJava).label("is").target(software).property("relevance", 9).build());
            template.edge(Edge.source(cleanArchitecture).label("is").target(software).property("relevance", 8).build());
            template.edge(Edge.source(cleanArchitecture).label("is").target(architecture).property("relevance", 10).build());
            template.edge(Edge.source(systemDesign).label("is").target(architecture).property("relevance", 9).build());
            template.edge(Edge.source(systemDesign).label("is").target(software).property("relevance", 7).build());
            template.edge(Edge.source(javaPerformance).label("is").target(performance).property("relevance", 8).build());
            template.edge(Edge.source(javaPerformance).label("is").target(java).property("relevance", 9).build());


            List<String> softwareCategories = template.traversalVertex().hasLabel("Category")
                    .has("name", "Software")
                    .in("is").hasLabel("Category").<Category>result()
                    .map(Category::getName)
                    .toList();

            List<String> softwareBooks = template.traversalVertex().hasLabel("Category")
                    .has("name", "Software")
                    .in("is").hasLabel("Book").<Book>result()
                    .map(Book::getName)
                    .toList();

            List<String> softwareNoSQLBooks = template.traversalVertex().hasLabel("Category")
                    .has("name", "Software")
                    .in("is")
                    .has("name", "NoSQL")
                    .in("is").<Book>result()
                    .map(Book::getName)
                    .toList();

            System.out.println("The software categories: " + softwareCategories);
            System.out.println("The software books: " + softwareBooks);
            System.out.println("The software and NoSQL books: " + softwareNoSQLBooks);


            System.out.println("\nBooks in 'Architecture' category:");
            var architectureBooks = template.gremlin("g.V().hasLabel('Category').has('name','Architecture').in('is')").toList();
            architectureBooks.forEach(doc -> System.out.println(" - " + doc));

            System.out.println("Categories with more than one book:");
            var commonCategories = template.gremlin("g.V().hasLabel('Category').where(__.in('is').count().is(gt(1)))"
            ).toList();
            commonCategories.forEach(doc -> System.out.println(" - " + doc));

            var highRelevanceBooks = template.gremlin( "g.E().hasLabel('is').has('relevance', gte(9))" +
                    ".outV().hasLabel('Book').dedup()").toList();

            System.out.println("Books with high relevance:");
            highRelevanceBooks.forEach(doc -> System.out.println(" - " + doc));

            System.out.println("\nBooks with name: 'Effective Java':");
            var effectiveJavaBooks = template.gremlin("g.V().hasLabel('Book').has('name', @name)", Collections.singletonMap("name", "Effective Java")).toList();
            effectiveJavaBooks.forEach(doc -> System.out.println(" - " + doc));
        }
    }
}
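
To see what the three Gremlin queries in BookApp should return, the sample data can be replayed in plain Java. This is a sketch under stated assumptions: the graph started empty, the program ran once, and the `BookGraphReplay` class with its `Is` record is a hypothetical in-memory model rather than anything provided by JanusGraph or Eclipse JNoSQL.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java replay of BookApp's sample data; hypothetical types, no JanusGraph involved.
final class BookGraphReplay {
    record Is(String book, String category, int relevance) {}

    // The eight 'is' edges created in BookApp, in the same order.
    static final List<Is> EDGES = List.of(
            new Is("Effective Java", "Java", 10),
            new Is("Effective Java", "Software", 9),
            new Is("Clean Architecture", "Software", 8),
            new Is("Clean Architecture", "Architecture", 10),
            new Is("System Design Interview", "Architecture", 9),
            new Is("System Design Interview", "Software", 7),
            new Is("Java Performance", "Performance", 8),
            new Is("Java Performance", "Java", 9));

    // g.V().hasLabel('Category').has('name', category).in('is')
    static List<String> booksIn(String category) {
        return EDGES.stream()
                .filter(e -> e.category().equals(category))
                .map(Is::book)
                .toList();
    }

    // g.V().hasLabel('Category').where(__.in('is').count().is(gt(1)))
    static List<String> categoriesWithMoreThanOneBook() {
        Map<String, Long> incoming = EDGES.stream()
                .collect(Collectors.groupingBy(Is::category, LinkedHashMap::new, Collectors.counting()));
        return incoming.entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .toList();
    }

    // g.E().hasLabel('is').has('relevance', gte(9)).outV().hasLabel('Book').dedup()
    static List<String> highRelevanceBooks() {
        return EDGES.stream()
                .filter(e -> e.relevance() >= 9)
                .map(Is::book)
                .distinct()
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(booksIn("Architecture"));         // [Clean Architecture, System Design Interview]
        System.out.println(categoriesWithMoreThanOneBook()); // [Java, Software, Architecture]
        System.out.println(highRelevanceBooks());            // all four books
    }
}
```

The order shown here follows the insertion order of the edges. A graph database does not guarantee result order unless the traversal sorts explicitly, so only the membership of each result should be compared with the real output.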
  

To complement the use of TinkerpopTemplate, Eclipse JNoSQL supports the Jakarta Data specification by enabling repository-based data access. This approach allows developers to define interfaces, such as BookRepository and CategoryRepository, that automatically provide CRUD operations and support custom graph traversals through the @Gremlin annotation.

By combining standard method name queries (e.g., findByName) with expressive Gremlin scripts, we gain both convenience and fine-grained control over graph traversal logic. These repositories are ideal for clean, testable, and declarative access patterns in graph-based applications.

Java

@Repository
public interface BookRepository extends TinkerPopRepository<Book, Long> {
    Optional<Book> findByName(String name);

    @Gremlin("g.V().hasLabel('Book').out('is').hasLabel('Category').has('name','Architecture').in('is').dedup()")
    List<Book> findArchitectureBooks();

    @Gremlin("g.E().hasLabel('is').has('relevance', gte(9)).outV().hasLabel('Book').dedup()")
    List<Book> highRelevanceBooks();
}

@Repository
public interface CategoryRepository extends TinkerPopRepository<Category, Long> {
    Optional<Category> findByName(String name);

    @Gremlin("g.V().hasLabel('Category').where(__.in('is').count().is(gt(1)))")
    List<Category> commonCategories();
}
  

After defining the repositories, we can build a full application that leverages them to manage data and run queries. The BookApp2 class illustrates this repository-driven execution flow. It uses the repositories to create or fetch vertices (Book and Category) and falls back to GraphTemplate only when inserting edges, since Jakarta Data currently supports querying vertices but not edge creation. This hybrid model provides a clean separation of concerns and reduces boilerplate, making it easier to read, test, and maintain.

Java

public final class BookApp2 {

    private BookApp2() {
    }

    public static void main(String[] args) {

        try (SeContainer container = SeContainerInitializer.newInstance().initialize()) {
            var template = container.select(GraphTemplate.class).get();
            var bookRepository = container.select(BookRepository.class).get();
            var repository = container.select(CategoryRepository.class).get();

            var software = repository.findByName("Software").orElseGet(() -> repository.save(Category.of("Software")));
            var java = repository.findByName("Java").orElseGet(() -> repository.save(Category.of("Java")));
            var architecture = repository.findByName("Architecture").orElseGet(() -> repository.save(Category.of("Architecture")));
            var performance = repository.findByName("Performance").orElseGet(() -> repository.save(Category.of("Performance")));

            var effectiveJava = bookRepository.findByName("Effective Java").orElseGet(() -> bookRepository.save(Book.of("Effective Java")));
            var cleanArchitecture = bookRepository.findByName("Clean Architecture").orElseGet(() -> bookRepository.save(Book.of("Clean Architecture")));
            var systemDesign = bookRepository.findByName("System Design Interview").orElseGet(() -> bookRepository.save(Book.of("System Design Interview")));
            var javaPerformance = bookRepository.findByName("Java Performance").orElseGet(() -> bookRepository.save(Book.of("Java Performance")));


            template.edge(Edge.source(effectiveJava).label("is").target(java).property("relevance", 10).build());
            template.edge(Edge.source(effectiveJava).label("is").target(software).property("relevance", 9).build());
            template.edge(Edge.source(cleanArchitecture).label("is").target(software).property("relevance", 8).build());
            template.edge(Edge.source(cleanArchitecture).label("is").target(architecture).property("relevance", 10).build());
            template.edge(Edge.source(systemDesign).label("is").target(architecture).property("relevance", 9).build());
            template.edge(Edge.source(systemDesign).label("is").target(software).property("relevance", 7).build());
            template.edge(Edge.source(javaPerformance).label("is").target(performance).property("relevance", 8).build());
            template.edge(Edge.source(javaPerformance).label("is").target(java).property("relevance", 9).build());


            System.out.println("Books in 'Architecture' category:");
            var architectureBooks = bookRepository.findArchitectureBooks();
            architectureBooks.forEach(doc -> System.out.println(" - " + doc));

            System.out.println("Categories with more than one book:");
            var commonCategories = repository.commonCategories();
            commonCategories.forEach(doc -> System.out.println(" - " + doc));

            var highRelevanceBooks = bookRepository.highRelevanceBooks();

            System.out.println("Books with high relevance:");
            highRelevanceBooks.forEach(doc -> System.out.println(" - " + doc));

            var bookByName = bookRepository.findByName("Effective Java");
            System.out.println("Book by name: " + bookByName);
        }
    }
}
  

JanusGraph, backed by Apache TinkerPop and Gremlin, offers a highly scalable and portable way to model and traverse complex graphs. With Eclipse JNoSQL and Jakarta Data, Java developers can harness powerful graph capabilities while enjoying a clean and modular API. JanusGraph adapts to your architecture, whether embedded or distributed, while keeping your queries expressive and concise.
