Database Abstractions
The database layer is centered on GraphDB, which defines the backend-agnostic persistence, indexing, search, neighbor-expansion, and merge primitives used elsewhere in the project.
Use this page when implementing a new backend or when clarifying which responsibilities belong to the database adapter rather than the retrieval layer. For the current concrete implementation, see FalkorDB adapter.
grawiki.db.base
Backend-agnostic graph database interfaces.
NodeHit
dataclass
Search result pairing a node with scoring metadata.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `node` | `Node` | Matched node. May be a concrete subclass such as :class: | required |
| `score` | `float` | Relevance or similarity score. Adapters and higher-level services may normalize backend-specific distance values into higher-is-better scores. Defaults to | `0.0` |
| `matched_on` | `str` | Short descriptor of how the hit was matched (for example | `''` |
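The note on `score` says adapters may turn backend distances into higher-is-better scores. The following is a minimal sketch of one way to do that, not the project's implementation. The `Node` class is a hypothetical stand-in, the `1 / (1 + distance)` mapping is an assumed normalization, and `"vector"` is an illustrative `matched_on` value.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for the project's Node type."""
    id: str
    name: str = ""


@dataclass
class NodeHit:
    """Mirrors the documented fields: node, score, matched_on."""
    node: Node
    score: float = 0.0
    matched_on: str = ""


def hit_from_distance(node: Node, distance: float, matched_on: str) -> NodeHit:
    """Map a non-negative distance to a score in (0, 1].

    Assumed normalization: a smaller distance yields a higher score.
    """
    return NodeHit(node=node, score=1.0 / (1.0 + distance), matched_on=matched_on)


near = hit_from_distance(Node("n1", "Ada"), 0.0, "vector")
far = hit_from_distance(Node("n2", "Bob"), 3.0, "vector")
```

Any monotonically decreasing mapping would serve the same purpose; what matters to callers is that hits from different backends sort in the same direction.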
NeighborRelationship
dataclass
One-hop relationship context around a seed node.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source_id` | `str` | Identifier of the seed node that was expanded. | required |
| `source_name` | `str` | Human-readable name of the seed node. | required |
| `relationship_label` | `str` | Label of the relationship connecting the seed to the target. | required |
| `target` | `Node` | Neighbor node connected to the seed. | required |
GraphDB
Bases: ABC
Abstract interface for graph database adapters.
Notes
The contract has two layers. Storage-engine primitives (`upsert_nodes`, `upsert_relationships`, `fulltext_search`, `vector_search`, `neighbor_relationships`, `list_entities`, `ensure_indexes`) are the foundational operations every backend must implement. Higher-level convenience methods (`save_documents_and_chunks`, `save_docs_and_chunks_to_db`, `save_entities_and_rels`, `search`) are thin wrappers over the primitives that preserve the legacy API during the migration.
setup
abstractmethod
async
setup(embedding_dimensions=None)
Prepare backend indexes and other database structures.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `embedding_dimensions` | `dict[str, int] \| None` | Mapping from node label to embedding dimensionality for vector indexes that require the dimension to be known ahead of time. | `None` |
ensure_indexes
abstractmethod
async
ensure_indexes(*, labels, vector_dims=None)
Ensure full-text and vector indexes exist for labels.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `labels` | `Iterable[str]` | Node labels whose indexes should be created. | required |
| `vector_dims` | `Mapping[str, int] \| None` | Per-label embedding dimensionality. Labels omitted from the mapping do not get a vector index. | `None` |
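The rule that labels missing from `vector_dims` get no vector index can be illustrated with a small planning helper. This is a sketch only: `plan_indexes` and the shape of its return value are hypothetical, and a real adapter would issue backend index-creation statements instead of building a dictionary.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping


def plan_indexes(
    labels: Iterable[str],
    vector_dims: Mapping[str, int] | None = None,
) -> dict[str, dict]:
    """Describe which indexes each label would receive."""
    dims = vector_dims or {}
    plan = {}
    for label in labels:
        # Every label is listed for full-text indexing; the vector
        # dimension stays None when the label is absent from the mapping.
        plan[label] = {"fulltext": True, "vector_dim": dims.get(label)}
    return plan


plan = plan_indexes(["Entity", "Chunk"], {"Chunk": 384})
```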
fulltext_search
abstractmethod
async
fulltext_search(*, labels, query_text, limit=10)
Run a full-text search across one or more node labels.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `labels` | `Sequence[str]` | Labels whose full-text indexes should be queried. | required |
| `query_text` | `str` | Raw full-text query string. | required |
| `limit` | `int` | Maximum number of hits to return per label. | `10` |
Returns:
| Type | Description |
|---|---|
| `list[NodeHit]` | Flat list of hits across the requested labels. Callers group by the node family / label set when a grouped view is needed. |
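Because the return value is flat, a caller that wants a grouped view has to bucket the hits itself. One possible sketch follows; the `labels` attribute on `Node` and the `group_by_labels` helper are assumptions made for illustration, not part of the documented interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in; the real Node may expose labels differently."""
    id: str
    labels: tuple = ()


@dataclass
class NodeHit:
    node: Node
    score: float = 0.0
    matched_on: str = ""


def group_by_labels(hits):
    """Bucket a flat hit list by label set, best score first in each bucket."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit.node.labels].append(hit)
    for bucket in grouped.values():
        bucket.sort(key=lambda h: h.score, reverse=True)
    return dict(grouped)


flat = [
    NodeHit(Node("c1", ("Chunk",)), 0.4),
    NodeHit(Node("e1", ("Entity",)), 0.9),
    NodeHit(Node("c2", ("Chunk",)), 0.7),
]
grouped = group_by_labels(flat)
```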
vector_search
abstractmethod
async
vector_search(*, labels, query_embedding, limit=10)
Run a vector similarity search across one or more node labels.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `labels` | `Sequence[str]` | Labels whose vector indexes should be queried. | required |
| `query_embedding` | `list[float]` | Pre-computed query embedding. The DB does not embed queries; that concern lives in the retrieval layer. | required |
| `limit` | `int` | Maximum number of hits to return per label. | `10` |
Returns:
| Type | Description |
|---|---|
| `list[NodeHit]` | Flat list of hits across the requested labels. |
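The split of responsibilities (the caller embeds, the database only compares vectors) and the per-label `limit` can be sketched with an in-memory stand-in. Everything here is hypothetical: `InMemoryVectorIndex`, the toy `embed_query`, the cosine scoring, and the `(node_id, score)` tuples returned in place of `NodeHit` objects.

```python
import asyncio
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embed_query(text):
    """Toy embedder owned by the retrieval layer, not by the database."""
    return [float(text.count("a")), float(text.count("b"))]


class InMemoryVectorIndex:
    """Stand-in for an adapter; rows are (label, node_id, embedding)."""

    def __init__(self, rows):
        self._rows = rows

    async def vector_search(self, *, labels, query_embedding, limit=10):
        hits = []
        for label in labels:
            scored = sorted(
                (
                    (cosine(query_embedding, emb), node_id)
                    for row_label, node_id, emb in self._rows
                    if row_label == label
                ),
                reverse=True,
            )
            # The limit applies to each label separately.
            hits.extend((node_id, score) for score, node_id in scored[:limit])
        return hits


db = InMemoryVectorIndex([
    ("Chunk", "c1", [1.0, 0.0]),
    ("Chunk", "c2", [0.0, 1.0]),
    ("Entity", "e1", [1.0, 1.0]),
])
hits = asyncio.run(
    db.vector_search(
        labels=["Chunk", "Entity"],
        query_embedding=embed_query("aaa"),
        limit=1,
    )
)
```

With `limit=1` and two labels, the flat result holds at most two hits, one per label.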
neighbor_relationships
abstractmethod
async
neighbor_relationships(*, node_ids, limit_per_node=5)
Fetch one-hop relationship context for the given seed nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `node_ids` | `Sequence[str]` | Seed node identifiers. | required |
| `limit_per_node` | `int` | Maximum number of one-hop relationships returned for each seed. | `5` |
Returns:
| Type | Description |
|---|---|
| `dict[str, list[NeighborRelationship]]` | Relationship context keyed by seed node identifier. Seed ids with no matching context should still be present with an empty list. |
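The two guarantees described above (a cap per seed, and an entry for every seed even when nothing matches) are easy to miss in an adapter. The sketch below shows them over an in-memory edge list. It is an illustration under stated assumptions: edges are `(source_id, label, target_id)` tuples, only outgoing edges are expanded, and `target` is simplified to an id instead of a `Node`.

```python
from dataclasses import dataclass


@dataclass
class NeighborRelationship:
    source_id: str
    source_name: str
    relationship_label: str
    target: str  # simplified: a node id rather than a Node object


def neighbor_relationships(edges, names, *, node_ids, limit_per_node=5):
    """Collect one-hop context for each seed from an in-memory edge list."""
    # Pre-seed the result so ids with no context still map to [].
    context = {node_id: [] for node_id in node_ids}
    for src, label, dst in edges:
        bucket = context.get(src)
        if bucket is not None and len(bucket) < limit_per_node:
            bucket.append(
                NeighborRelationship(src, names.get(src, ""), label, dst)
            )
    return context


edges = [("a", "KNOWS", "b"), ("a", "KNOWS", "c"), ("a", "LIKES", "d")]
ctx = neighbor_relationships(
    edges, {"a": "Ada"}, node_ids=["a", "z"], limit_per_node=2
)
```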
recall_subgraph
abstractmethod
async
recall_subgraph(*, memory_ids, hops=1, limit_per_memory=20)
Fetch a flattened k-hop recall subgraph for memory seeds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `memory_ids` | `Sequence[str]` | Memory node identifiers used as traversal seeds. | required |
| `hops` | `int` | Maximum traversal depth in hops. Must be at least | `1` |
| `limit_per_memory` | `int` | Maximum number of distinct paths expanded per memory seed before flattening them into relationship rows. | `20` |
Returns:
| Type | Description |
|---|---|
| `dict[str, list[NeighborRelationship]]` | Flattened relationship rows keyed by memory id. Traversal is undirected for discovery, but each row preserves the stored relationship direction. |
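The combination of undirected discovery and direction-preserving rows can be sketched as a breadth-first path expansion over an in-memory edge list. This is one possible reading of the contract, not the adapter's query: edges are assumed to be `(source_id, label, target_id)` tuples, rows are returned as those same tuples instead of `NeighborRelationship` objects, and the path cap is applied in breadth-first order.

```python
def build_adjacency(edges):
    """Index each stored edge under both endpoints for undirected discovery."""
    adjacency = {}
    for edge in edges:
        src, _label, dst = edge
        adjacency.setdefault(src, []).append((dst, edge))
        adjacency.setdefault(dst, []).append((src, edge))
    return adjacency


def expand_paths(adjacency, seed, hops, limit):
    """Breadth-first expansion of simple paths, capped at `limit` paths."""
    paths = []
    frontier = [(seed, (), frozenset([seed]))]
    for _ in range(hops):
        next_frontier = []
        for node, path, seen in frontier:
            for neighbor, edge in adjacency.get(node, []):
                if neighbor in seen:
                    continue  # do not revisit nodes on the same path
                if len(paths) >= limit:
                    return paths
                new_path = path + (edge,)
                paths.append(new_path)
                next_frontier.append((neighbor, new_path, seen | {neighbor}))
        frontier = next_frontier
    return paths


def recall_subgraph(edges, *, memory_ids, hops=1, limit_per_memory=20):
    adjacency = build_adjacency(edges)
    result = {}
    for seed in memory_ids:
        rows = []
        for path in expand_paths(adjacency, seed, hops, limit_per_memory):
            for edge in path:
                if edge not in rows:
                    rows.append(edge)  # stored direction is kept as-is
        result[seed] = rows
    return result


edges = [
    ("m1", "MENTIONS", "e1"),
    ("e2", "WORKS_AT", "e1"),
    ("e2", "LIVES_IN", "e3"),
]
one_hop = recall_subgraph(edges, memory_ids=["m1"], hops=1)
two_hop = recall_subgraph(edges, memory_ids=["m1"], hops=2)
```

In the two-hop result, `e2` is reached by walking the `WORKS_AT` edge backwards, yet the row still reads `("e2", "WORKS_AT", "e1")`.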
list_entities
abstractmethod
async
list_entities(*, include_embeddings=False)
Return persisted entity nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `include_embeddings` | `bool` | Whether entity embeddings should be loaded when available. Callers that only need identifiers and names should keep this disabled to avoid transferring large vectors unnecessarily. | `False` |
Returns:
| Type | Description |
|---|---|
| `list[Node]` | Persisted entity nodes ordered by backend-defined stable ordering. |
entity_relationship_counts
abstractmethod
async
entity_relationship_counts(node_ids)
Return incident relationship counts for entity nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `node_ids` | `Sequence[str]` | Entity identifiers whose incident edge counts should be returned. | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, int]` | Mapping from entity id to total incoming-plus-outgoing relationship count. Missing ids should still appear with |
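An incoming-plus-outgoing count is a node's degree in the relationship graph. The sketch below computes it over `(source_id, label, target_id)` tuples. It assumes that an id with no incident relationships maps to `0`; the text above does not spell out that value, so treat it as a guess made for this example.

```python
from collections import Counter


def entity_relationship_counts(edges, node_ids):
    """Count incident edges (incoming plus outgoing) for each requested id."""
    degree = Counter()
    for src, _label, dst in edges:
        degree[src] += 1  # outgoing side
        degree[dst] += 1  # incoming side
    # Assumption: ids with no incident edges map to 0 (Counter's default).
    return {node_id: degree[node_id] for node_id in node_ids}


edges = [("e1", "KNOWS", "e2"), ("e3", "KNOWS", "e1")]
counts = entity_relationship_counts(edges, ["e1", "e2", "missing"])
```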
upsert_nodes
abstractmethod
async
upsert_nodes(nodes)
Upsert nodes. Dispatches on labels for persistence semantics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `nodes` | `Sequence[Node]` | Nodes to create or update. Each node's label set determines which concrete storage path is used. | required |
upsert_relationships
abstractmethod
async
upsert_relationships(rels)
Upsert relationships between existing nodes (matched by id).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `rels` | `Sequence[Relationship]` | Relationships to create or update. Both endpoints must already exist in the graph and are matched by their | required |
merge_entity_nodes
abstractmethod
async
merge_entity_nodes(*, master, duplicate_ids)
Merge duplicate entity nodes into master.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `master` | `Node` | Final persisted state for the surviving master node. The master is matched by | required |
| `duplicate_ids` | `Sequence[str]` | Entity identifiers to merge into | required |
Raises:
| Type | Description |
|---|---|
| `ValueError` | Raised when |
delete_memory
abstractmethod
async
delete_memory(memory_id)
Delete one memory and prune now-orphaned directly mentioned entities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `memory_id` | `str` | Identifier of the memory node to remove. | required |
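The delete-then-prune behavior can be illustrated on an in-memory graph. The sketch makes several assumptions that the page does not confirm: mentions are stored as edges from the memory to the entity, and an entity counts as orphaned once no relationship of any kind touches it. The function name matches the method, but the `nodes` and `edges` arguments are hypothetical.

```python
def delete_memory(nodes, edges, memory_id):
    """Remove a memory, then prune entities it mentioned that are now isolated.

    `nodes` is a set of ids and `edges` a list of (source, label, target)
    tuples; both are mutated in place. Returns the set of removed ids.
    """
    # Entities directly mentioned by the memory (assumed edge direction).
    mentioned = {dst for src, _label, dst in edges if src == memory_id}

    # Drop the memory node and every edge touching it.
    edges[:] = [e for e in edges if memory_id not in (e[0], e[2])]
    removed = {memory_id} & nodes
    nodes.discard(memory_id)

    # Prune only the directly mentioned entities that have no edges left.
    for entity_id in mentioned:
        if not any(entity_id in (src, dst) for src, _label, dst in edges):
            nodes.discard(entity_id)
            removed.add(entity_id)
    return removed


nodes = {"m1", "m2", "e1", "e2"}
edges = [
    ("m1", "MENTIONS", "e1"),
    ("m1", "MENTIONS", "e2"),
    ("m2", "MENTIONS", "e2"),
]
removed = delete_memory(nodes, edges, "m1")
```

Here `e1` is pruned because only `m1` referenced it, while `e2` survives through its link to `m2`.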
save_documents_and_chunks
async
save_documents_and_chunks(documents, chunks)
Persist source documents and their chunks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `documents` | `list[Document]` | Source documents to persist. | required |
| `chunks` | `list[Chunk]` | Source chunks to persist and connect to their parent documents. | required |
save_docs_and_chunks_to_db
async
save_docs_and_chunks_to_db(doc_nodes, chunk_nodes)
Persist prepared document and chunk nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `doc_nodes` | `list[DocumentNode]` | Prepared document nodes ready for persistence. | required |
| `chunk_nodes` | `list[ChunkNode]` | Prepared chunk nodes ready for persistence. | required |
save_entities_and_rels
async
save_entities_and_rels(owner_ids, owner_graphs)
Persist extracted owner-linked entities and relationships.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `owner_ids` | `Sequence[str]` | Node identifiers that own the extracted graphs, such as chunk or memory ids. | required |
| `owner_graphs` | `dict[str, KnowledgeGraph]` | Extracted graphs keyed by owner identifier. | required |
Raises:
| Type | Description |
|---|---|
| `ValueError` | Raised when a graph references a chunk identifier that is not present in |
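A pre-flight check in the spirit of this error might look like the sketch below. It assumes the comparison is between the keys of `owner_graphs` and the ids in `owner_ids`, which the truncated description above does not confirm. The helper name `check_owner_graphs` and the error message are hypothetical.

```python
def check_owner_graphs(owner_ids, owner_graphs):
    """Raise ValueError when a graph is keyed by an id not in owner_ids.

    Assumption: validation compares owner_graphs keys against owner_ids.
    """
    known = set(owner_ids)
    unknown = sorted(set(owner_graphs) - known)
    if unknown:
        raise ValueError(f"graphs reference unknown owner ids: {unknown}")


# Passes silently: every graph key is a known owner id.
check_owner_graphs(["c1", "c2"], {"c1": object()})

try:
    check_owner_graphs(["c1"], {"c9": object()})
except ValueError as exc:
    message = str(exc)
```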
search
async
search(query, method, *, limit=10, query_embedding=None)
Search documents, chunks, and entities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Raw user query text. | required |
| `method` | `('fulltext', 'vector')` | Search strategy to execute. | `"fulltext"` |
| `limit` | `int` | Maximum number of results to return per node family. | `10` |
| `query_embedding` | `list[float] \| None` | Embedded query vector required for vector search. | `None` |
Returns:
| Type | Description |
|---|---|
| `SearchResults` | Search hits grouped by node family. |
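Since `search` is described as a thin wrapper over the primitives, its dispatch logic can be sketched as follows. This is not the library's code: `FakeDB`, the default label tuple, the flat return value (the real method returns grouped `SearchResults`), and the error messages are all assumptions made so the example runs on its own.

```python
import asyncio


class FakeDB:
    """Hypothetical stand-in that records which primitive was called."""

    async def fulltext_search(self, *, labels, query_text, limit=10):
        return [("fulltext", label, query_text) for label in labels]

    async def vector_search(self, *, labels, query_embedding, limit=10):
        return [("vector", label, len(query_embedding)) for label in labels]


async def search(db, query, method, *, limit=10, query_embedding=None,
                 labels=("Document", "Chunk", "Entity")):
    """Route a query to the matching primitive; labels are illustrative."""
    if method == "fulltext":
        return await db.fulltext_search(
            labels=labels, query_text=query, limit=limit
        )
    if method == "vector":
        if query_embedding is None:
            # The wrapper cannot embed the query itself.
            raise ValueError("vector search requires query_embedding")
        return await db.vector_search(
            labels=labels, query_embedding=query_embedding, limit=limit
        )
    raise ValueError(f"unknown search method: {method!r}")


text_hits = asyncio.run(search(FakeDB(), "ada", "fulltext"))
vector_hits = asyncio.run(
    search(FakeDB(), "ada", "vector", query_embedding=[0.1, 0.2])
)
```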