Database Abstractions¶

The database layer is centered on GraphDB, the backend-agnostic Cypher engine. It implements the full persistence and retrieval workflow — index management, node and relationship upserts, full-text and vector search, neighbor and recall-subgraph expansion, entity enumeration and merging, and memory deletion — generically, in terms of a small set of backend dialect hooks.

Every backend GraWiki targets speaks Cypher, so a concrete backend implements only the dialect hooks rather than the whole contract:

_execute_write / _execute_read: run write-capable and read-only queries.
_serialize_embedding_literal: render an embedding as an inline Cypher vector literal.
_create_fulltext_index / _create_vector_index: emit the backend's index DDL.
_index_exists: report whether an index already exists.
_run_fulltext_query / _run_vector_query: invoke the backend's search procedures, returning scores normalized so that higher means a closer match.
close: release backend resources (a no-op by default).
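
The "higher means a closer match" requirement on the search hooks can be illustrated with a small sketch. It assumes a backend that reports cosine distance in the range [0, 2]; the helper name `distance_to_score` is hypothetical and is not part of GraWiki.

```python
def distance_to_score(distance: float) -> float:
    """Map a cosine distance in [0, 2] to a higher-is-better score in [0, 1].

    Assumption: the backend reports cosine distance. Other metrics would
    need a different conversion.
    """
    # Clamp first so small floating-point overshoots stay inside the range.
    clamped = min(max(distance, 0.0), 2.0)
    return 1.0 - clamped / 2.0
```

A backend whose search procedure already returns a similarity would not need this step.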

The hooks are ordinary methods that raise NotImplementedError rather than abstract methods, so lightweight in-memory test doubles can override the high-level methods directly without implementing every hook.
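
The pattern described above can be sketched as follows. `SketchGraphDB` and `InMemoryDouble` are hypothetical stand-ins written for this example; the hook signature and the query string are assumptions, not the real `GraphDB` API.

```python
import asyncio


class SketchGraphDB:
    """Hypothetical stand-in for the engine, not the real GraphDB."""

    async def _execute_read(self, query: str, params: dict) -> list:
        # Dialect hook: an ordinary method that fails only if it is called.
        raise NotImplementedError("backend must implement _execute_read")

    async def list_entities(self) -> list:
        # High-level method expressed in terms of the hook.
        return await self._execute_read("MATCH (n:__entity__) RETURN n", {})


class InMemoryDouble(SketchGraphDB):
    """Test double that overrides the high-level method and skips the hooks."""

    def __init__(self, entities):
        self._entities = list(entities)

    async def list_entities(self) -> list:
        return list(self._entities)


# The double can be instantiated even though no hook is implemented.
double = InMemoryDouble([{"id": "e1"}])
result = asyncio.run(double.list_entities())
```

Had the hooks been declared with `abc.abstractmethod`, instantiating `InMemoryDouble` would raise `TypeError` until every hook was overridden.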

Shared, backend-free node-row helpers used by the engine live in grawiki.db.node_rows (projection expressions and row-to-Node reconstruction); Cypher string builders live in grawiki.db.cypher.

Use this page when implementing a new backend (provide the dialect hooks) or when clarifying which responsibilities belong to the database layer rather than the retrieval layer. For the current concrete implementation, see FalkorDB adapter.

grawiki.db.base ¶

Backend-agnostic graph database engine.

`GraphDB` implements the full Cypher-based persistence and retrieval workflow (upserts, traversals, entity merging, memory deletion, and search dispatch) in terms of a small set of backend dialect hooks. Every graph backend GraWiki targets speaks Cypher, so a concrete backend only needs to implement the hooks (raw query execution, embedding-literal serialization, index DDL and introspection, and the full-text / vector search procedures); all of the generic orchestration is shared here.

NodeHit `dataclass` ¶

Search result pairing a node with scoring metadata.

Parameters:

Name	Type	Description	Default
`node`	`Node`	Matched node. May be a concrete subclass such as `DocumentNode`, `ChunkNode`, or `MemoryNode` (all in `grawiki.graph.models`) depending on the node's label.	required
`score`	`float`	Relevance or similarity score. Adapters and higher-level services may normalize backend-specific distance values into higher-is-better scores. Defaults to `0.0` when no score is reported.	`0.0`
`matched_on`	`str`	Short descriptor of how the hit was matched (for example `"fulltext:content"`, `"vector"`, or `"rapidfuzz"`). Empty when not reported.	`''`

NeighborRelationship `dataclass` ¶

One-hop relationship context around a seed node.

Parameters:

Name	Type	Description	Default
`source_id`	`str`	Identifier of the seed node that was expanded.	required
`source_name`	`str`	Human-readable name of the seed node.	required
`relationship_label`	`str`	Label of the relationship connecting the seed to the target.	required
`target`	`Node`	Neighbor node connected to the seed.	required

GraphDB ¶

Bases: ABC

Generic Cypher graph database engine.

Notes

The contract has two layers. The generic orchestration methods (`upsert_nodes`, `upsert_relationships`, `fulltext_search`, `vector_search`, `neighbor_relationships`, `recall_subgraph`, `list_entities`, `entity_relationship_counts`, `merge_entity_nodes`, `delete_memory`, `ensure_indexes`, `setup`) and the higher-level convenience wrappers (`save_documents_and_chunks`, `save_docs_and_chunks_to_db`, `save_entities_and_rels`, `search`) are fully implemented here in terms of the dialect hooks below.

Concrete backends implement only the dialect hooks: `_execute_write`, `_execute_read`, `_serialize_embedding_literal`, `_create_fulltext_index`, `_create_vector_index`, `_index_exists`, `_run_fulltext_query`, `_run_vector_query`, and (optionally) `close`. They are defined as ordinary methods that raise `NotImplementedError` rather than abstract methods so that lightweight in-memory test doubles can override the generic methods directly without implementing every hook.

close ¶

close()

Release backend resources.

Notes

The default implementation is a no-op, which is correct for in-memory backends. Backends that own external processes or sockets override this.

setup `async` ¶

setup(embedding_dimensions=None)

Prepare backend indexes and other database structures.

Parameters:

Name	Type	Description	Default
`embedding_dimensions`	`dict[str, int] | None`	Mapping from node label to embedding dimensionality for vector indexes that require the dimension to be known ahead of time.	`None`

ensure_indexes `async` ¶

ensure_indexes(*, labels, vector_dims=None)

Ensure full-text and vector indexes exist for labels.

Parameters:

Name	Type	Description	Default
`labels`	`Iterable[str]`	Node labels whose indexes should be created.	required
`vector_dims`	`Mapping[str, int] | None`	Per-label embedding dimensionality. Labels omitted from the mapping do not get a vector index.	`None`
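
One way the documented behavior could be built from the index hooks is sketched below. `IndexSketch` is a hypothetical in-memory stand-in; the hook signatures, the `"fulltext"` / `"vector"` kind strings, and the dimension `384` are assumptions made for illustration.

```python
from __future__ import annotations

import asyncio
from collections.abc import Iterable, Mapping


class IndexSketch:
    """Hypothetical in-memory stand-in that records created indexes."""

    def __init__(self) -> None:
        self.created: list[tuple[str, str]] = []

    async def _index_exists(self, label: str, kind: str) -> bool:
        return (label, kind) in self.created

    async def _create_index(self, label: str, kind: str) -> None:
        self.created.append((label, kind))

    async def ensure_indexes(
        self, *, labels: Iterable[str], vector_dims: Mapping[str, int] | None = None
    ) -> None:
        dims = vector_dims or {}
        for label in labels:
            if not await self._index_exists(label, "fulltext"):
                await self._create_index(label, "fulltext")
            # Labels missing from the mapping get no vector index.
            if label in dims and not await self._index_exists(label, "vector"):
                await self._create_index(label, "vector")


db = IndexSketch()
for _ in range(2):  # the second pass should create nothing new
    asyncio.run(
        db.ensure_indexes(labels=["__chunk__", "__entity__"], vector_dims={"__chunk__": 384})
    )
```

Checking existence before emitting DDL keeps the call idempotent, so it is safe to invoke on every startup.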

upsert_nodes `async` ¶

upsert_nodes(nodes)

Upsert nodes, creating indexes on first use per label.

Parameters:

Name	Type	Description	Default
`nodes`	`Sequence[Node]`	Nodes to create or update. Dispatches on concrete type and label.	required

upsert_relationships `async` ¶

upsert_relationships(rels)

Upsert relationships, dispatching on label for match semantics.

Parameters:

Name	Type	Description	Default
`rels`	`Sequence[Relationship]`	Relationships to create or update.	required

Notes

__has_chunk__ guards both endpoints by label (__document__ -> __chunk__). Other system relationships such as __mentions__ have heterogeneous sources (a __chunk__ or __memory__) and match by id alone. All non-system labels guard both endpoints as __entity__. Every relationship persists the same id, label, and serialized properties fields.
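
The dispatch rules in these notes can be sketched as a small lookup. The function name `endpoint_labels` is hypothetical, and recognizing system labels by their double-underscore wrapping is an assumption of this sketch rather than documented behavior.

```python
from __future__ import annotations


def endpoint_labels(rel_label: str) -> tuple[str | None, str | None]:
    """Return (source_label, target_label) guards; None means match by id alone."""
    if rel_label == "__has_chunk__":
        return ("__document__", "__chunk__")
    # Assumption: system relationship labels are wrapped in double underscores.
    if rel_label.startswith("__") and rel_label.endswith("__"):
        # Other system relationships, e.g. __mentions__, have mixed sources.
        return (None, None)
    # Every non-system relationship connects two entities.
    return ("__entity__", "__entity__")
```

For example, `endpoint_labels("WORKS_AT")` yields entity guards on both endpoints, while `endpoint_labels("__mentions__")` yields no label guards.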

fulltext_search `async` ¶

fulltext_search(*, labels, query_text, limit=10)

Run a full-text search across one or more node labels.

Parameters:

Name	Type	Description	Default
`labels`	`Sequence[str]`	Labels whose full-text indexes should be queried.	required
`query_text`	`str`	Raw full-text query string.	required
`limit`	`int`	Maximum number of hits to return per label.	`10`

Returns:

Type	Description
`list[NodeHit]`	Flat list of hits across the requested labels.

vector_search `async` ¶

vector_search(*, labels, query_embedding, limit=10)

Run a vector similarity search across one or more node labels.

Parameters:

Name	Type	Description	Default
`labels`	`Sequence[str]`	Labels whose vector indexes should be queried.	required
`query_embedding`	`list[float]`	Pre-computed query embedding.	required
`limit`	`int`	Maximum number of hits to return per label.	`10`

Returns:

Type	Description
`list[NodeHit]`	Flat list of hits across the requested labels.
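
The per-label limit and the flat result list can be illustrated with a pure-Python sketch. It assumes cosine similarity as the score and a hypothetical `store` mapping each label to `(node_id, embedding)` pairs; the real method delegates to the backend's vector search procedure instead.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search_sketch(store, *, labels, query_embedding, limit=10):
    """Return a flat list of (node_id, score) hits across all labels."""
    hits = []
    for label in labels:
        scored = [
            (node_id, cosine(query_embedding, embedding))
            for node_id, embedding in store.get(label, [])
        ]
        scored.sort(key=lambda hit: hit[1], reverse=True)
        hits.extend(scored[:limit])  # the limit applies per label, not overall
    return hits
```

With two labels and `limit=10`, the result can therefore contain up to 20 hits.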

neighbor_relationships `async` ¶

neighbor_relationships(*, node_ids, limit_per_node=5)

Fetch one-hop relationship context for each seed node.

Parameters:

Name	Type	Description	Default
`node_ids`	`Sequence[str]`	Seed node identifiers.	required
`limit_per_node`	`int`	Maximum number of relationship rows returned for each seed.	`5`

Returns:

Type	Description
`dict[str, list[NeighborRelationship]]`	Relationship context keyed by seed id.
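
The shape of the return value can be sketched by grouping raw rows by seed. The row dictionaries and the function name `group_by_seed` are hypothetical, and giving seeds without neighbors an empty list is an assumption of this sketch.

```python
def group_by_seed(rows, node_ids, limit_per_node=5):
    """Group relationship rows by seed id, keeping at most limit_per_node each."""
    # Assumption: every requested seed gets a key, even with no neighbors.
    grouped = {node_id: [] for node_id in node_ids}
    for row in rows:
        bucket = grouped.get(row["source_id"])
        # Rows for unrequested seeds are ignored; full buckets stop growing.
        if bucket is not None and len(bucket) < limit_per_node:
            bucket.append(row)
    return grouped
```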

recall_subgraph `async` ¶

recall_subgraph(*, memory_ids, hops=1, limit_per_memory=20)

Fetch a flattened k-hop recall subgraph for memory seeds.

Parameters:

Name	Type	Description	Default
`memory_ids`	`Sequence[str]`	Memory node identifiers used as traversal seeds.	required
`hops`	`int`	Maximum traversal depth in hops. Must be at least `1`.	`1`
`limit_per_memory`	`int`	Maximum number of distinct paths expanded per memory seed before flattening them into relationship rows.	`20`

Returns:

Type	Description
`dict[str, list[NeighborRelationship]]`	Flattened relationship rows keyed by memory id.
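
The idea of expanding paths up to `hops` and then flattening them into relationship rows can be sketched in memory. This is a simplified illustration, not the engine's Cypher: it follows edges in their stored direction only, represents edges as `(source, label, target)` tuples, and the function name is hypothetical.

```python
def recall_subgraph_sketch(edges, memory_id, *, hops=1, limit_per_memory=20):
    """Expand paths from one memory seed and flatten them into distinct edges."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    adjacency = {}
    for source, label, target in edges:
        adjacency.setdefault(source, []).append((source, label, target))

    paths = []
    stack = [(memory_id, [])]  # (current node, edges walked so far)
    while stack and len(paths) < limit_per_memory:
        node, walked = stack.pop()
        for edge in adjacency.get(node, []):
            if edge in walked:
                continue  # never reuse an edge within one path
            path = walked + [edge]
            paths.append(path)
            if len(paths) >= limit_per_memory:
                break
            if len(path) < hops:
                stack.append((edge[2], path))

    # Flatten the paths into distinct relationship rows, preserving order.
    rows = []
    for path in paths:
        for edge in path:
            if edge not in rows:
                rows.append(edge)
    return rows
```

Because the limit counts paths rather than rows, the flattened result can contain fewer rows than paths when paths share edges.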

list_entities `async` ¶

list_entities(*, include_embeddings=False)

Return persisted entity nodes ordered by semantic key then id.

Parameters:

Name	Type	Description	Default
`include_embeddings`	`bool`	Whether to include entity embeddings in the result.	`False`

Returns:

Type	Description
`list[Node]`	Persisted entity nodes.

entity_relationship_counts `async` ¶

entity_relationship_counts(node_ids)

Return total incident relationship counts for entity ids.

Parameters:

Name	Type	Description	Default
`node_ids`	`Sequence[str]`	Entity identifiers whose incident edge counts should be returned.	required

Returns:

Type	Description
`dict[str, int]`	Mapping from entity id to total relationship count. Missing ids still appear with `0`.
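
The "missing ids still appear with `0`" guarantee amounts to seeding the result before counting, as in this sketch. The edge representation as `(source, target)` pairs and the function name are hypothetical; counting a self-loop once is an assumption.

```python
def relationship_counts(edges, node_ids):
    """Count incident edges per requested id, defaulting to 0."""
    # Seeding with zeros guarantees a key for every requested id.
    counts = {node_id: 0 for node_id in node_ids}
    for source, target in edges:
        if source in counts:
            counts[source] += 1
        # Assumption: a self-loop counts as one incident relationship.
        if target in counts and target != source:
            counts[target] += 1
    return counts
```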

merge_entity_nodes `async` ¶

merge_entity_nodes(*, master, duplicate_ids)

Merge duplicate entity nodes into master.

Parameters:

Name	Type	Description	Default
`master`	`Node`	Final persisted state for the surviving master node.	required
`duplicate_ids`	`Sequence[str]`	Entity identifiers to merge into `master` and then delete.	required

Raises:

Type	Description
`ValueError`	Raised when `duplicate_ids` contains `master.id`.
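
The documented precondition can be checked before any query runs. The helper below is a hypothetical sketch; de-duplicating the ids while preserving order is an extra assumption, not something the reference above promises.

```python
def validate_merge(master_id, duplicate_ids):
    """Reject a merge that lists the master among its own duplicates."""
    if master_id in duplicate_ids:
        raise ValueError("duplicate_ids must not contain the master id")
    # Assumption: repeated ids are collapsed, keeping first-seen order.
    return list(dict.fromkeys(duplicate_ids))
```

Failing early here matters because the duplicates are deleted after the merge, which would otherwise remove the surviving node as well.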

delete_memory `async` ¶

delete_memory(memory_id)

Delete one memory and prune directly-mentioned orphan entities.

Parameters:

Name	Type	Description	Default
`memory_id`	`str`	Identifier of the memory node to remove.	required
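
The pruning step can be sketched on an in-memory graph. This sketch assumes an orphan is a directly mentioned entity left with no relationships once the memory is gone; the data shapes and the function name are hypothetical.

```python
def delete_memory_sketch(nodes, edges, memory_id):
    """Return (nodes, edges) after removing one memory and its orphaned mentions.

    nodes is a set of node ids; edges is a list of (source, label, target).
    """
    # Entities the memory mentions directly are the only pruning candidates.
    mentioned = {
        target
        for source, label, target in edges
        if source == memory_id and label == "__mentions__"
    }
    # Drop every relationship that touches the memory.
    remaining = [edge for edge in edges if memory_id not in (edge[0], edge[2])]
    still_connected = {node for source, _, target in remaining for node in (source, target)}
    # Assumption: an orphan is a mentioned entity with no relationships left.
    orphans = mentioned - still_connected
    return nodes - {memory_id} - orphans, remaining
```

Entities that another memory or chunk still references survive, since they remain connected after the deletion.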

save_documents_and_chunks `async` ¶

save_documents_and_chunks(documents, chunks)

Persist source documents and their chunks.

Parameters:

Name	Type	Description	Default
`documents`	`list[Document]`	Source documents to persist.	required
`chunks`	`list[Chunk]`	Source chunks to persist and connect to their parent documents.	required

save_docs_and_chunks_to_db `async` ¶

save_docs_and_chunks_to_db(doc_nodes, chunk_nodes)

Persist prepared document and chunk nodes.

Parameters:

Name	Type	Description	Default
`doc_nodes`	`list[DocumentNode]`	Prepared document nodes ready for persistence.	required
`chunk_nodes`	`list[ChunkNode]`	Prepared chunk nodes ready for persistence.	required

save_entities_and_rels `async` ¶

save_entities_and_rels(owner_ids, owner_graphs)

Persist extracted owner-linked entities and relationships.

Parameters:

Name	Type	Description	Default
`owner_ids`	`Sequence[str]`	Node identifiers that own the extracted graphs, such as chunk or memory ids.	required
`owner_graphs`	`dict[str, KnowledgeGraph]`	Extracted graphs keyed by owner identifier.	required

Raises:

Type	Description
`ValueError`	Raised when a graph references an owner identifier (for example a chunk id) that is not present in `owner_ids`.

search `async` ¶

search(query, method, *, limit=10, query_embedding=None)

Search documents, chunks, and entities.

Parameters:

Name	Type	Description	Default
`query`	`str`	Raw user query text.	required
`method`	`('fulltext', 'vector')`	Search strategy to execute.	required
`limit`	`int`	Maximum number of results to return per node family.	`10`
`query_embedding`	`list[float] | None`	Embedded query vector required for vector search.	`None`

Returns:

Type	Description
`SearchResults`	Search hits grouped by node family.
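
The relationship between `method` and `query_embedding` can be sketched as a small dispatcher. The function name and the choice of `ValueError` are assumptions; the reference above does not state which exception the real method raises.

```python
def resolve_search(method, query, query_embedding=None):
    """Pick the search strategy and the payload it needs."""
    if method == "fulltext":
        return ("fulltext", query)
    if method == "vector":
        # Vector search cannot run on raw text; it needs the embedded query.
        if query_embedding is None:
            raise ValueError("vector search requires query_embedding")
        return ("vector", query_embedding)
    raise ValueError(f"unknown search method: {method!r}")
```

Callers using vector search are expected to embed the query themselves and pass the result in, since the database layer takes a pre-computed vector.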
