Memgraph adapter¶

MemgraphGraphDB is a GraphDB backend that connects to a Memgraph server over the Bolt protocol. It is a thin Memgraph dialect over the shared Cypher engine and implements only the Memgraph-specific hooks: the Bolt connection via the neo4j driver, plain list embedding literals, CREATE TEXT INDEX / CREATE VECTOR INDEX DDL, SHOW INDEX INFO / SHOW VECTOR INDEX INFO introspection, and the text_search / vector_search query procedures. All generic orchestration lives in GraphDB. Most applications should start with GraphRAG, which uses any adapter through the backend-agnostic database interface.

Entity merging is delegated to Memgraph's native refactor.merge_nodes procedure (provided by the MAGE library), so the server must run a MAGE-enabled image such as memgraph/memgraph-mage.

Installation¶

pip install 'grawiki[memgraph]'

From a repository checkout use uv sync --extra memgraph instead.

Running a local server¶

A docker-compose.yml at the repository root starts a MAGE-enabled Memgraph (and an optional Memgraph Lab UI at http://localhost:3000):

docker compose up -d

This exposes Bolt on localhost:7687.
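
Before constructing the adapter it can help to confirm that the Bolt port is reachable. The following is a minimal sketch using only the Python standard library. The helper name `bolt_port_open` is hypothetical and not part of grawiki, and a successful TCP connection only shows that something is listening on the port, not that it speaks Bolt.

```python
import socket


def bolt_port_open(host: str = "localhost", port: int = 7687, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `bolt_port_open()` with no arguments probes localhost:7687, the address exposed by the compose file.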

Minimal example¶

from grawiki.db import MemgraphGraphDB

database = MemgraphGraphDB(
    "my_graph",
    host="localhost",
    port=7687,
)

Or via the backend factory:

from grawiki.db import create_graph_db

database = create_graph_db("memgraph", "my_graph", host="localhost", port=7687)

Operational notes¶

close() should be called explicitly in tests and short-lived scripts to release the Bolt connection pool.
GraphRAG.ingest(...), GraphRAG.ingest_text(...), and GraphRAG.remember(...) usually call setup() for you. Direct adapter usage may require an explicit await database.setup(...) before indexing or search operations.
Entity merging requires the refactor module, available in the memgraph/memgraph-mage image. Native merge semantics differ from the generic implementation: refactor.merge_nodes does not drop self-relationships.
Deletions clear node embeddings before removing nodes. Memgraph does not evict a deleted node's vector-index entry on its own, and a stale entry breaks subsequent vector searches on that label, so the adapter nulls embeddings first.

Advanced adapter helpers¶

query(...) executes write-capable Cypher.
ro_query(...) executes read-only Cypher and is the simplest option for ad hoc inspection.

Vector index creation can be tuned through the constructor arguments vector_similarity_function, vector_index_capacity, and vector_index_resize_coefficient.

grawiki.db.memgraph.MemgraphGraphDB ¶

Bases: GraphDB

Graph adapter for a Memgraph server reached over the Bolt protocol.

Parameters:

Name	Type	Description	Default
`graph_name`	`str`	Logical graph name. Accepted for interface symmetry with `FalkorGraphDB`; Memgraph has no FalkorDB-style per-graph namespacing within a single database, so this value is stored but not used for routing.	required
`host`	`str`	Hostname of the Memgraph server. Defaults to `"localhost"`.	`'localhost'`
`port`	`int`	Bolt port of the Memgraph server. Defaults to `7687`.	`7687`
`username`	`str`	Bolt auth username. Defaults to `""` (no authentication).	`''`
`password`	`str`	Bolt auth password. Defaults to `""` (no authentication).	`''`
`database`	`str \| None`	Database name passed to each Bolt session. Defaults to `"memgraph"`. Pass `None` to use the driver's default session without selecting a database (useful for servers that do not support multi-tenancy).	`'memgraph'`
`vector_similarity_function`	`Literal['cosine', 'euclidean']`	Similarity function used for vector indexes. Mapped to Memgraph's `"cos"`/`"l2sq"` metrics.	`'cosine'`
`vector_index_capacity`	`int`	Initial capacity hint for vector indexes.	`1000`
`vector_index_resize_coefficient`	`int`	Growth multiplier applied when a vector index exceeds its capacity.	`2`
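
The vector index parameters above can be illustrated with a short sketch. The metric mapping follows the table (`cosine` to `"cos"`, `euclidean` to `"l2sq"`). The function names are hypothetical, and the capacity arithmetic assumes that each resize simply multiplies the current capacity by the coefficient, which is an assumption about how the growth multiplier is applied rather than a statement about Memgraph internals.

```python
from typing import Literal

# Mapping described in the parameter table: adapter name -> Memgraph metric.
METRIC_BY_SIMILARITY = {"cosine": "cos", "euclidean": "l2sq"}


def memgraph_metric(similarity: Literal["cosine", "euclidean"]) -> str:
    """Translate the constructor argument into the Memgraph metric name."""
    try:
        return METRIC_BY_SIMILARITY[similarity]
    except KeyError:
        raise ValueError(f"unsupported similarity function: {similarity!r}") from None


def capacity_after(resizes: int, capacity: int = 1000, coefficient: int = 2) -> int:
    """Capacity after `resizes` growth steps, assuming plain multiplication per step."""
    return capacity * coefficient ** resizes
```

Under that assumption the defaults grow as 1000, 2000, 4000, and so on, so a larger initial capacity may be worth setting when the expected node count is known up front.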

close ¶

close()

Close the Bolt driver and release its connection pool.
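
One way to guarantee that `close()` runs in a short-lived script is `contextlib.closing`, which works with any object that has a `close()` method and so does not assume the adapter is itself a context manager. The sketch below uses a stand-in class (`FakeDatabase`, hypothetical) so that it runs without a server; in real code the `MemgraphGraphDB` instance would take its place.

```python
from contextlib import closing


class FakeDatabase:
    """Stand-in for an adapter instance; only close() matters here."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def run_short_script(database) -> str:
    # closing() calls database.close() on exit, even if the body raises.
    with closing(database):
        return "finished"
```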

query ¶

query(query, params=None)

Execute a write-capable query.

Parameters:

Name	Type	Description	Default
`query`	`str`	Cypher query to execute.	required
`params`	`dict[str, Any] \| None`	Query parameters.	`None`

Returns:

Type	Description
`Any`	Result exposing a positional `result_set`.

ro_query ¶

ro_query(query, params=None)

Execute a read-only query.

Parameters:

Name	Type	Description	Default
`query`	`str`	Cypher query to execute.	required
`params`	`dict[str, Any] \| None`	Query parameters.	`None`

Returns:

Type	Description
`Any`	Result exposing a positional `result_set`.
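
Because `result_set` is positional, callers need to know the order of the returned columns. The sketch below pairs each row with caller-supplied column names. It assumes `result_set` is a sequence of row sequences; the helper `rows_as_dicts` and the stand-in result object are hypothetical and only illustrate the shape.

```python
from types import SimpleNamespace
from typing import Any, Sequence


def rows_as_dicts(result: Any, columns: Sequence[str]) -> list[dict[str, Any]]:
    """Pair each positional row in result.result_set with the given column names."""
    return [dict(zip(columns, row)) for row in result.result_set]


# Stand-in for what a query such as "MATCH (n) RETURN n.id, n.label" might return.
fake_result = SimpleNamespace(result_set=[["e1", "Person"], ["e2", "City"]])
rows = rows_as_dicts(fake_result, ["id", "label"])
```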

delete_memory `async` ¶

delete_memory(memory_id)

Delete one memory and prune directly-mentioned orphan entities.

Mirrors GraphDB.delete_memory but clears node embeddings before each deletion so vector indexes stay consistent (see _clear_embeddings).

Parameters:

Name	Type	Description	Default
`memory_id`	`str`	Identifier of the memory node to remove.	required
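
The clear-then-delete ordering can be sketched as two Cypher statements issued in sequence. This is not the adapter's actual query text: the property names `id` and `embedding` and the helper `deletion_statements` are assumptions made for illustration.

```python
def deletion_statements(node_id: str) -> list[tuple[str, dict[str, str]]]:
    """Return (cypher, params) pairs in the order described above."""
    params = {"id": node_id}
    return [
        # Step 1: null the embedding so the vector index stops referencing the node.
        ("MATCH (n {id: $id}) SET n.embedding = NULL", params),
        # Step 2: remove the node together with its relationships.
        ("MATCH (n {id: $id}) DETACH DELETE n", params),
    ]
```

Each pair could then be passed to `query(...)` in order; reversing the order would reintroduce the stale-entry problem the adapter is working around.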

merge_entity_nodes `async` ¶

merge_entity_nodes(*, master, duplicate_ids)

Merge duplicate entity nodes into master using refactor.merge_nodes.

Memgraph's MAGE refactor.merge_nodes procedure redirects every relationship from the duplicates onto the master (the first node in the list) and deletes the duplicates. The master's canonical state is rewritten afterwards via GraphDB._update_entity_node, so the result is independent of the procedure's property-merge strategy.

Parameters:

Name	Type	Description	Default
`master`	`Node`	Final persisted state for the surviving master node.	required
`duplicate_ids`	`Sequence[str]`	Entity identifiers to merge into `master` and then delete.	required

Raises:

Type	Description
`ValueError`	Raised when `duplicate_ids` contains `master.id`.

Notes

Native merge semantics differ from the generic implementation in GraphDB.merge_entity_nodes: refactor.merge_nodes does not drop self-relationships, so a duplicate that pointed at the master can yield a self-loop on the merged node.
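
The self-loop case is easy to see in a toy model. The sketch below is a pure-Python simulation of edge redirection, not the MAGE procedure itself; `merge_edges` and the edge tuples are hypothetical. It also mirrors the documented `ValueError` when the master id appears among the duplicates.

```python
from typing import Sequence

Edge = tuple[str, str, str]  # (source_id, label, target_id)


def merge_edges(edges: Sequence[Edge], master_id: str, duplicate_ids: Sequence[str]) -> list[Edge]:
    """Redirect every endpoint that is a duplicate onto the master, keeping self-loops."""
    if master_id in duplicate_ids:
        raise ValueError("duplicate_ids must not contain the master id")
    duplicates = set(duplicate_ids)

    def redirect(node_id: str) -> str:
        return master_id if node_id in duplicates else node_id

    return [(redirect(src), label, redirect(dst)) for src, label, dst in edges]


edges = [("dup", "KNOWS", "master"), ("dup", "LIVES_IN", "city")]
merged = merge_edges(edges, "master", ["dup"])
```

Here the first edge becomes a `KNOWS` relationship from `master` to itself. Callers that do not want such loops would need to remove them after the merge.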

setup `async` ¶

setup(embedding_dimensions=None)

Prepare backend indexes and other database structures.

Parameters:

Name	Type	Description	Default
`embedding_dimensions`	`dict[str, int] \| None`	Mapping from node label to embedding dimensionality for vector indexes that require the dimension to be known ahead of time.	`None`

ensure_indexes `async` ¶

ensure_indexes(*, labels, vector_dims=None)

Ensure full-text and vector indexes exist for labels.

Parameters:

Name	Type	Description	Default
`labels`	`Iterable[str]`	Node labels whose indexes should be created.	required
`vector_dims`	`Mapping[str, int] \| None`	Per-label embedding dimensionality. Labels omitted from the mapping do not get a vector index.	`None`
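
The rule that labels missing from `vector_dims` get no vector index can be sketched as a small planning function. It assumes that every requested label receives a full-text index; the helper name `plan_indexes`, the tuple layout, and the dimension 384 used in the example are illustrative only.

```python
from typing import Iterable, Mapping, Optional

IndexPlan = tuple[str, str, Optional[int]]  # (kind, label, dimension)


def plan_indexes(labels: Iterable[str], vector_dims: Optional[Mapping[str, int]] = None) -> list[IndexPlan]:
    """List the indexes that would be ensured for the given labels."""
    dims = vector_dims or {}
    plan: list[IndexPlan] = []
    for label in labels:
        plan.append(("text", label, None))
        # A vector index is planned only when the dimension is known.
        if label in dims:
            plan.append(("vector", label, dims[label]))
    return plan


plan = plan_indexes(["__chunk__", "__entity__"], {"__chunk__": 384})
```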

upsert_nodes `async` ¶

upsert_nodes(nodes)

Upsert nodes, creating indexes on first use per label.

Parameters:

Name	Type	Description	Default
`nodes`	`Sequence[Node]`	Nodes to create or update. Dispatches on concrete type and label.	required

upsert_relationships `async` ¶

upsert_relationships(rels)

Upsert relationships, dispatching on label for match semantics.

Parameters:

Name	Type	Description	Default
`rels`	`Sequence[Relationship]`	Relationships to create or update.	required

Notes

__has_chunk__ guards both endpoints by label (__document__ -> __chunk__). Other system relationships such as __mentions__ have heterogeneous sources (a __chunk__ or __memory__) and match by id alone. All non-system labels guard both endpoints as __entity__. Every relationship persists the same id, label, and serialized properties fields.
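
The dispatch rules in these notes can be summarized in a few lines. This is a sketch of the rules as stated, not the adapter's code: `endpoint_guards` is a hypothetical helper, and treating any label wrapped in double underscores as a system label is an inference from the names shown here.

```python
from typing import Optional


def endpoint_guards(rel_label: str) -> Optional[tuple[str, str]]:
    """Return (source_label, target_label) guards, or None when matching by id alone."""
    if rel_label == "__has_chunk__":
        return ("__document__", "__chunk__")
    # Assumption: system labels are the ones wrapped in double underscores.
    if rel_label.startswith("__") and rel_label.endswith("__"):
        return None
    # All non-system labels connect entity nodes.
    return ("__entity__", "__entity__")
```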

fulltext_search `async` ¶

fulltext_search(*, labels, query_text, limit=10)

Run a full-text search across one or more node labels.

Parameters:

Name	Type	Description	Default
`labels`	`Sequence[str]`	Labels whose full-text indexes should be queried.	required
`query_text`	`str`	Raw full-text query string.	required
`limit`	`int`	Maximum number of hits to return per label.	`10`

Returns:

Type	Description
`list[NodeHit]`	Flat list of hits across the requested labels.

vector_search `async` ¶

vector_search(*, labels, query_embedding, limit=10)

Run a vector similarity search across one or more node labels.

Parameters:

Name	Type	Description	Default
`labels`	`Sequence[str]`	Labels whose vector indexes should be queried.	required
`query_embedding`	`list[float]`	Pre-computed query embedding.	required
`limit`	`int`	Maximum number of hits to return per label.	`10`

Returns:

Type	Description
`list[NodeHit]`	Flat list of hits across the requested labels.
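
Because `limit` applies per label and the result is a flat list, a search over several labels can return up to `limit * len(labels)` hits. The toy below illustrates that shape with plain strings in place of `NodeHit`; `search_across_labels`, `toy_search`, and the index contents are hypothetical.

```python
from typing import Callable, Sequence


def search_across_labels(
    labels: Sequence[str],
    search_one: Callable[[str, int], list[str]],
    limit: int = 10,
) -> list[str]:
    """Query each label with the same limit and concatenate the hits."""
    hits: list[str] = []
    for label in labels:
        hits.extend(search_one(label, limit))
    return hits


# Toy per-label index standing in for a real text or vector index.
toy_index = {"__chunk__": ["c1", "c2", "c3"], "__entity__": ["e1"]}


def toy_search(label: str, limit: int) -> list[str]:
    return toy_index.get(label, [])[:limit]


hits = search_across_labels(["__chunk__", "__entity__"], toy_search, limit=2)
```

Callers that need a global top-N would have to re-rank and truncate the flat list themselves.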

neighbor_relationships `async` ¶

neighbor_relationships(*, node_ids, limit_per_node=5)

Fetch one-hop relationship context for each seed node.

Parameters:

Name	Type	Description	Default
`node_ids`	`Sequence[str]`	Seed node identifiers.	required
`limit_per_node`	`int`	Maximum number of relationship rows returned for each seed.	`5`

Returns:

Type	Description
`dict[str, list[NeighborRelationship]]`	Relationship context keyed by seed id.

recall_subgraph `async` ¶

recall_subgraph(*, memory_ids, hops=1, limit_per_memory=20)

Fetch a flattened k-hop recall subgraph for memory seeds.

Parameters:

Name	Type	Description	Default
`memory_ids`	`Sequence[str]`	Memory node identifiers used as traversal seeds.	required
`hops`	`int`	Maximum traversal depth in hops. Must be at least `1`.	`1`
`limit_per_memory`	`int`	Maximum number of distinct paths expanded per memory seed before flattening them into relationship rows.	`20`

Returns:

Type	Description
`dict[str, list[NeighborRelationship]]`	Flattened relationship rows keyed by memory id.
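
The interaction of `hops` and `limit_per_memory` can be sketched on an in-memory edge list. This toy follows outgoing edges only and caps the number of paths before flattening them into distinct rows; whether the adapter traverses in both directions or orders paths the same way is not stated here, so treat `recall_rows` as a hypothetical illustration.

```python
from typing import Sequence

Edge = tuple[str, str, str]  # (source_id, label, target_id)


def recall_rows(edges: Sequence[Edge], seed: str, hops: int = 1, limit: int = 20) -> list[Edge]:
    """Expand paths up to `hops` from seed, keep at most `limit`, flatten to rows."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    paths: list[list[Edge]] = []
    frontier: list[tuple[str, list[Edge]]] = [(seed, [])]
    for _ in range(hops):
        next_frontier: list[tuple[str, list[Edge]]] = []
        for node, path in frontier:
            for edge in edges:
                # Follow outgoing edges and never reuse an edge within one path.
                if edge[0] == node and edge not in path:
                    extended = path + [edge]
                    paths.append(extended)
                    next_frontier.append((edge[2], extended))
        frontier = next_frontier
    rows: list[Edge] = []
    for path in paths[:limit]:
        for edge in path:
            if edge not in rows:  # flatten without repeating a relationship
                rows.append(edge)
    return rows


graph = [("m1", "__mentions__", "e1"), ("e1", "KNOWS", "e2"), ("e2", "KNOWS", "e3")]
one_hop = recall_rows(graph, "m1", hops=1)
two_hop = recall_rows(graph, "m1", hops=2)
```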

list_entities `async` ¶

list_entities(*, include_embeddings=False)

Return persisted entity nodes ordered by semantic key then id.

Parameters:

Name	Type	Description	Default
`include_embeddings`	`bool`	Whether to include entity embeddings in the result.	`False`

Returns:

Type	Description
`list[Node]`	Persisted entity nodes.

entity_relationship_counts `async` ¶

entity_relationship_counts(node_ids)

Return total incident relationship counts for entity ids.

Parameters:

Name	Type	Description	Default
`node_ids`	`Sequence[str]`	Entity identifiers whose incident edge counts should be returned.	required

Returns:

Type	Description
`dict[str, int]`	Mapping from entity id to total relationship count. Missing ids still appear with `0`.
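
The zero-filling behavior for missing ids can be shown with a small counter. The sketch counts each endpoint of each edge once; how the adapter counts a self-loop is not stated here, and `relationship_counts` is a hypothetical stand-in operating on plain `(source, target)` pairs.

```python
from collections import Counter
from typing import Sequence


def relationship_counts(edges: Sequence[tuple[str, str]], node_ids: Sequence[str]) -> dict[str, int]:
    """Count incident edges per requested id, reporting 0 for ids with none."""
    counts: Counter[str] = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    # Counter returns 0 for unseen keys, so every requested id appears in the result.
    return {node_id: counts[node_id] for node_id in node_ids}


counts = relationship_counts([("a", "b"), ("a", "c")], ["a", "b", "z"])
```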

save_documents_and_chunks `async` ¶

save_documents_and_chunks(documents, chunks)

Persist source documents and their chunks.

Parameters:

Name	Type	Description	Default
`documents`	`list[Document]`	Source documents to persist.	required
`chunks`	`list[Chunk]`	Source chunks to persist and connect to their parent documents.	required

save_docs_and_chunks_to_db `async` ¶

save_docs_and_chunks_to_db(doc_nodes, chunk_nodes)

Persist prepared document and chunk nodes.

Parameters:

Name	Type	Description	Default
`doc_nodes`	`list[DocumentNode]`	Prepared document nodes ready for persistence.	required
`chunk_nodes`	`list[ChunkNode]`	Prepared chunk nodes ready for persistence.	required

save_entities_and_rels `async` ¶

save_entities_and_rels(owner_ids, owner_graphs)

Persist extracted owner-linked entities and relationships.

Parameters:

Name	Type	Description	Default
`owner_ids`	`Sequence[str]`	Node identifiers that own the extracted graphs, such as chunk or memory ids.	required
`owner_graphs`	`dict[str, KnowledgeGraph]`	Extracted graphs keyed by owner identifier.	required

Raises:

Type	Description
`ValueError`	Raised when a graph references a chunk identifier that is not present in `owner_ids`.
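
A caller can run a similar check before persisting. The sketch below assumes the error concerns keys of `owner_graphs` that are absent from `owner_ids`, which is one reading of the description above; `check_owner_graphs` is a hypothetical helper, and plain dictionaries stand in for `KnowledgeGraph`.

```python
from typing import Any, Mapping, Sequence


def check_owner_graphs(owner_ids: Sequence[str], owner_graphs: Mapping[str, Any]) -> None:
    """Raise ValueError if any graph is keyed by an id not listed in owner_ids."""
    unknown = sorted(set(owner_graphs) - set(owner_ids))
    if unknown:
        raise ValueError(f"graphs reference unknown owner ids: {unknown}")
```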

search `async` ¶

search(query, method, *, limit=10, query_embedding=None)

Search documents, chunks, and entities.

Parameters:

Name	Type	Description	Default
`query`	`str`	Raw user query text.	required
`method`	`Literal['fulltext', 'vector']`	Search strategy to execute.	required
`limit`	`int`	Maximum number of results to return per node family.	`10`
`query_embedding`	`list[float] \| None`	Embedded query vector required for vector search.	`None`

Returns:

Type	Description
`SearchResults`	Search hits grouped by node family.
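
Since vector search needs a pre-computed embedding, a caller-side guard can catch a missing one early. The sketch below is an assumption about reasonable validation, not the adapter's behavior: `check_search_args` is hypothetical, and the exception type raised by `search` itself is not documented here.

```python
from typing import Optional, Sequence


def check_search_args(method: str, query_embedding: Optional[Sequence[float]] = None) -> None:
    """Validate the method name and require an embedding for vector search."""
    if method not in ("fulltext", "vector"):
        raise ValueError(f"unknown search method: {method!r}")
    if method == "vector" and query_embedding is None:
        raise ValueError("vector search requires query_embedding")
```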
