CategoricalData/HydraPop

HydraPop: translingual extensions for Apache TinkerPop

HydraPop connects Apache TinkerPop with the graph programming language Hydra. Hydra is a framework for translingual programming, meaning that Hydra code can be written in multiple languages, and also compiled to multiple languages. Hydra is being explored as a means for providing validation logic and other functionality in a way that is accessible to each Gremlin language variant, and guaranteed to be consistent across all of them.

Java and Python

Most of the logic in this repository is demonstrated in both Java and Python, in parallel. All of the code generated by Hydra has exactly the same behavior in either language, and we connect this code with Apache TinkerPop using thin, language-specific wrappers.

The Java code is the source of truth for example data (graph schemas and graphs). This data is encoded as Hydra Terms, serialized to a language-independent JSON representation, and decoded on the Python side using Hydra's JSON decoders. The validation logic itself is generated by Hydra and is identical in both languages.

Validation

HydraPop can validate a property graph against a GraphSchema using Hydra's built-in support for property graph validation.

Java workflow

  1. Define a schema using the Hydra PG DSL (hydra.pg.dsl)
  2. Load or construct a TinkerPop graph
  3. Convert the TinkerPop graph to a Hydra graph via HydraGremlinBridge.gremlinToHydra
  4. Validate with Validation.validateGraph

Python workflow

  1. Load the schema and graph from JSON (generated from the Java definitions)
  2. Validate with hydra.pg.validation.validate_graph

Test cases

Both the Java and Python test suites exercise the same validation conditions using TinkerPop's built-in Modern graph:

| Test case | Modification |
| --- | --- |
| Valid graph | None |
| Missing required property | Remove a vertex's name property |
| Wrong id type | Add a vertex with a string id where int32 is expected |
| Unknown edge endpoint | Edge references a non-existent vertex |
| Unexpected vertex label | Add a vertex with a label not in the schema |
| Unexpected edge label | Add an edge with a label not in the schema |
| Property value type mismatch | Set a string property to an integer |
| Unexpected property key | Add a property not defined in the schema |
| Wrong in-vertex label | Add an edge whose in-vertex has the wrong label |
| Wrong out-vertex label | Add an edge whose out-vertex has the wrong label |
| Missing required edge property | Add an edge without a required property |
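To make the conditions in the table concrete, here is a minimal, self-contained sketch of a toy validator in plain Python. It is not Hydra's generated validation code and does not use Hydra's data model: the classes `PropType`, `VertexType`, `EdgeType`, the function names, the error strings, and the dict-based graph representation are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PropType:
    py_type: type    # expected Python type of the value
    required: bool   # whether the property must be present


@dataclass
class VertexType:
    label: str
    id_type: type
    props: dict      # property key -> PropType


@dataclass
class EdgeType:
    label: str
    out_label: str   # required label of the out-vertex (edge source)
    in_label: str    # required label of the in-vertex (edge target)
    props: dict      # property key -> PropType


def check_props(where, props, prop_types):
    """Check required keys, unexpected keys, and value types."""
    errors = []
    for key, ptype in prop_types.items():
        if ptype.required and key not in props:
            errors.append(f"{where}: missing required property '{key}'")
    for key, value in props.items():
        if key not in prop_types:
            errors.append(f"{where}: unexpected property key '{key}'")
        elif type(value) is not prop_types[key].py_type:
            errors.append(f"{where}: wrong value type for '{key}'")
    return errors


def validate_graph(vertex_types, edge_types, vertices, edges):
    """vertices: {id: (label, props)}; edges: [(label, out_id, in_id, props)]."""
    errors = []
    for vid, (label, props) in vertices.items():
        vtype = vertex_types.get(label)
        if vtype is None:
            errors.append(f"vertex {vid!r}: unexpected label '{label}'")
            continue
        if type(vid) is not vtype.id_type:
            errors.append(f"vertex {vid!r}: wrong id type")
        errors += check_props(f"vertex {vid!r}", props, vtype.props)
    for label, out_id, in_id, props in edges:
        where = f"edge {label} {out_id!r}->{in_id!r}"
        etype = edge_types.get(label)
        if etype is None:
            errors.append(f"{where}: unexpected label '{label}'")
            continue
        for end, vid, expected in (("out", out_id, etype.out_label),
                                   ("in", in_id, etype.in_label)):
            if vid not in vertices:
                errors.append(f"{where}: unknown {end}-vertex {vid!r}")
            elif vertices[vid][0] != expected:
                errors.append(f"{where}: wrong {end}-vertex label")
        errors += check_props(where, props, etype.props)
    return errors


# Hypothetical schema mirroring the person/software/knows/created types.
NAME = PropType(str, True)
WEIGHT = PropType(float, True)
VERTEX_TYPES = {
    "person": VertexType("person", int, {"name": NAME, "age": PropType(int, False)}),
    "software": VertexType("software", int, {"name": NAME, "lang": PropType(str, True)}),
}
EDGE_TYPES = {
    "knows": EdgeType("knows", "person", "person", {"weight": WEIGHT}),
    "created": EdgeType("created", "person", "software", {"weight": WEIGHT}),
}
# A three-vertex excerpt of the Modern graph.
VERTICES = {
    1: ("person", {"name": "marko", "age": 29}),
    3: ("software", {"name": "lop", "lang": "java"}),
    4: ("person", {"name": "josh", "age": 32}),
}
EDGES = [("knows", 1, 4, {"weight": 1.0}), ("created", 4, 3, {"weight": 0.4})]

print(validate_graph(VERTEX_TYPES, EDGE_TYPES, VERTICES, EDGES))  # prints []
```

Each row of the table corresponds to one branch in this sketch. For example, deleting `name` from vertex 1 triggers the "missing required property" branch, and adding a `created` edge from vertex 4 to vertex 1 triggers the "wrong in-vertex label" branch.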

Build and test

Java

# Build and test
./gradlew build

# Generate JSON example data from Java definitions
./gradlew generateExampleData

# Package JARs for Gremlin Console
./gradlew consoleLibs

Requires: Java 17+, Gradle 8.12.1 (wrapper included).

Note: with the upcoming 0.14 release of Hydra, only Java 11 will be required.

Python

HydraPop requires a local checkout of the Hydra repository as a sibling directory (i.e., ../hydra/) for the PG model and validation modules, which are not yet available as a standalone package; they will be available by the 0.15.x release at the latest.

# Install pixi (if not already installed)
curl -fsSL https://pixi.sh/install.sh | bash

# Install dependencies (pulls hydra-python from the meso-forge conda channel)
pixi install

# Run tests
pixi run test

Requires: pixi, Python 3.12+, local Hydra checkout at ../hydra/

Gremlin example

You can validate a TinkerPop graph interactively from both Java and Python. Both examples use TinkerPop's built-in Modern graph and demonstrate the same workflow: load the graph, validate it against a graph schema, break it, and see the validation error.

Both demos connect to a running Gremlin Server, so validation runs against a live graph database. The Java demo runs in the Gremlin Console and connects through the TinkerPop Java driver; the Python demo connects through gremlinpython.

Gremlin Server setup (for both Java and Python)

Download Gremlin Server (version 3.8.0 or later) and start it with the Modern graph configuration:

bin/gremlin-server.sh conf/gremlin-server-modern.yaml

This starts a server on ws://localhost:8182/gremlin with the Modern graph pre-loaded.
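Before opening a session, it can help to confirm that something is listening on the server port. The following is a small sketch using only the Python standard library; the function name `gremlin_server_reachable` is hypothetical. It checks only that a TCP connection can be opened. It does not perform a WebSocket handshake or verify that the Modern graph is loaded.

```python
import socket


def gremlin_server_reachable(host="localhost", port=8182, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timeout, DNS failure)
        # when nothing is reachable at the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # True if a process accepts connections on localhost:8182, else False.
    print(gremlin_server_reachable())
```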

Java setup

Build the project and collect the JARs needed for the Gremlin Console:

./gradlew consoleLibs

Copy build/console-libs/*.jar into the Gremlin Console's lib/ directory:

cp build/console-libs/*.jar /path/to/apache-tinkerpop-gremlin-console/lib/

Java session

Start the Gremlin Console:

bin/gremlin.sh

Connect to Gremlin Server. We also define a reset() helper that reloads the Modern graph on the server, so each example below can start from a clean state:

import net.fortytwo.hydra.hydrapop.Validate
import org.apache.tinkerpop.gremlin.driver.Cluster
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
import org.apache.tinkerpop.gremlin.structure.T

cluster = Cluster.open('conf/remote-objects.yaml')
g = traversal().withRemote(DriverRemoteConnection.using(cluster, 'g'))
client = cluster.connect()
reset = { client.submit('graph.traversal().V().drop().iterate(); TinkerFactory.generateModern(graph)').all().get() }

Define the schema for the Modern graph:

import hydra.dsl.*
import hydra.pg.dsl.Graphs
import hydra.pg.model.*

personType = Graphs.vertexType("person", LiteralTypes.int32()).property("name", LiteralTypes.string(), true).property("age", LiteralTypes.int32(), false).build()
softwareType = Graphs.vertexType("software", LiteralTypes.int32()).property("name", LiteralTypes.string(), true).property("lang", LiteralTypes.string(), true).build()
knowsType = Graphs.edgeType("knows", LiteralTypes.int32(), "person", "person").property("weight", LiteralTypes.float64(), true).build()
createdType = Graphs.edgeType("created", LiteralTypes.int32(), "person", "software").property("weight", LiteralTypes.float64(), true).build()
vtypes = [:]; vtypes[personType.label] = personType; vtypes[softwareType.label] = softwareType
etypes = [:]; etypes[knowsType.label] = knowsType; etypes[createdType.label] = createdType
schema = new GraphSchema(vtypes, etypes)

Validate the unmodified Modern graph (should pass):

reset()
Validate.validate(schema, g)

Remove a required property:

reset()
g.V(1).properties('name').drop().iterate()
Validate.validate(schema, g)

Add a vertex with an unknown label:

reset()
g.addV('robot').property(T.id, 99).property('name', 'Bender').next()
Validate.validate(schema, g)

Set a property to the wrong type:

reset()
g.V(1).property('name', 999).iterate()
Validate.validate(schema, g)

Add a "created" edge to a person (should be person → software):

reset()
g.V(4).addE('created').to(__.V(1)).property(T.id, 99).property('weight', 0.5d).next()
Validate.validate(schema, g)

Python setup

Install dependencies (includes gremlinpython and hydra-python):

pixi install

Python session

Start a Python REPL with the project dependencies:

pixi run console

Connect to Gremlin Server. As with the Java session, we define a reset() helper that reloads the Modern graph on the server:

from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.driver.client import Client
from gremlin_python.process.traversal import T
from hydrapop.validate import validate

conn = DriverRemoteConnection('ws://localhost:8182/gremlin', 'g')
g = traversal().with_remote(conn)
client = Client('ws://localhost:8182/gremlin', 'g')

def reset():
    client.submit(
        "graph.traversal().V().drop().iterate();"
        "TinkerFactory.generateModern(graph)").all().result()

Define the schema for the Modern graph:

from hydrapop.dsl.pg import vertex_type, edge_type, graph_schema, int32, string, float64

person_type = vertex_type("person", int32()).property("name", string(), True).property("age", int32(), False).build()
software_type = vertex_type("software", int32()).property("name", string(), True).property("lang", string(), True).build()
knows_type = edge_type("knows", int32(), "person", "person").property("weight", float64(), True).build()
created_type = edge_type("created", int32(), "person", "software").property("weight", float64(), True).build()
schema = graph_schema([person_type, software_type], [knows_type, created_type])

Validate the unmodified Modern graph (should pass):

reset()
validate(schema, g)

Remove a required property:

reset()
g.V(1).properties('name').drop().iterate()
validate(schema, g)

Add a vertex with an unknown label:

reset()
g.addV('robot').property(T.id, 99).property('name', 'Bender').iterate()
validate(schema, g)

Set a property to the wrong type:

reset()
g.V(1).property('name', 999).iterate()
validate(schema, g)

Add a "created" edge to a person (should be person → software):

reset()
josh = g.V(4).next()
marko = g.V(1).next()
g.V(josh).addE('created').to(marko).property(T.id, 99).property('weight', 0.5).iterate()
validate(schema, g)
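Each example above repeats the same three steps: reset, mutate, validate. A small helper can make that pattern explicit. This is a sketch, not part of HydraPop: `run_scenario` is a hypothetical name, and because this README does not state whether `validate` returns a result or raises on failure, the helper captures both outcomes rather than assuming one.

```python
def run_scenario(name, reset, mutate, validate):
    """Reset the graph, apply one mutation, validate, and report the outcome.

    Returns (name, "returned", value) or (name, "raised", exception), because
    this sketch does not assume how validate reports failures.
    """
    reset()
    mutate()
    try:
        return (name, "returned", validate())
    except Exception as exc:
        return (name, "raised", exc)


# In the session above, the callables would wrap the existing objects, e.g.:
#   run_scenario("missing name", reset,
#                lambda: g.V(1).properties('name').drop().iterate(),
#                lambda: validate(schema, g))

# Stand-in callables so the sketch runs without a server.
calls = []
outcome = run_scenario(
    "demo",
    reset=lambda: calls.append("reset"),
    mutate=lambda: calls.append("mutate"),
    validate=lambda: "stub result",
)
print(outcome)  # prints ('demo', 'returned', 'stub result')
```

Passing the steps as callables keeps the helper independent of the connection objects, so the same function works with any traversal source and any validation entry point.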
