Deduplication Helpers
These helpers are used by the facade-level deduplication flow to choose a surviving master node, merge labels and properties, and report the result. They are useful when building custom merge workflows on top of the lower-level APIs.
grawiki.similarity.deduplication
Helpers for post-persistence entity deduplication.
MergeReport
dataclass
Summary of one merge decision.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `master_id` | `str` | Identifier of the surviving entity node. | required |
| `duplicate_ids` | `tuple[str, ...]` | Identifiers of duplicate nodes merged into the master. | required |
| `source` | `str` | Candidate source category, e.g. | required |
| `merged_labels` | `tuple[str, ...]` | Alphabetically sorted label set that will remain on the master. | required |
| `property_conflicts` | `tuple[str, ...]` | Property keys for which multiple duplicate values were observed and the master's value was kept. | required |
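The fields above can be pictured as a plain dataclass. The following is only a sketch built from the documented names and types: the class name `MergeReportSketch`, the `frozen=True` setting, and every example value (including the `source` placeholder) are assumptions, not the library's definition.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)  # immutability is an assumption, not documented behavior
class MergeReportSketch:
    """Hypothetical stand-in mirroring the documented MergeReport fields."""

    master_id: str
    duplicate_ids: tuple[str, ...]
    source: str
    merged_labels: tuple[str, ...]
    property_conflicts: tuple[str, ...]


# Example values are invented for illustration only.
report = MergeReportSketch(
    master_id="n1",
    duplicate_ids=("n2", "n3"),
    source="example-source",  # placeholder; real category values are not listed here
    merged_labels=("Person", "Scientist"),
    property_conflicts=("name",),
)
```

All five fields are required, so a report cannot be constructed without stating which node survived and which keys conflicted.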
pick_master
pick_master(nodes, relation_counts)
Return the preferred master node from a duplicate group.
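The selection rule is not spelled out here, so the sketch below shows one plausible policy rather than the library's actual one: prefer the node with the most relations and break ties by the smallest identifier. The function name `pick_master_sketch`, the dict-based node shape with an `"id"` key, and the assumption that `relation_counts` maps node ids to integers are all hypothetical.

```python
from __future__ import annotations

from typing import Any, Mapping, Sequence


def pick_master_sketch(
    nodes: Sequence[Mapping[str, Any]],
    relation_counts: Mapping[str, int],
) -> Mapping[str, Any]:
    """Pick a master by relation count (assumed policy), then by smallest id."""
    if not nodes:
        raise ValueError("a duplicate group must contain at least one node")
    # Negating the count makes min() prefer the best-connected node;
    # the id acts as a deterministic tie-breaker.
    return min(
        nodes,
        key=lambda node: (-relation_counts.get(str(node["id"]), 0), str(node["id"])),
    )


group = [{"id": "b"}, {"id": "a"}, {"id": "c"}]
counts = {"a": 2, "b": 2, "c": 1}
chosen = pick_master_sketch(group, counts)
```

A deterministic tie-breaker matters in this kind of policy because repeated runs over the same duplicate group should choose the same survivor.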
merge_node_properties
merge_node_properties(master, duplicates)
Merge node properties while preserving the master's values on conflicts.
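A master-wins merge can be sketched as follows, assuming properties are plain dictionaries. The function name `merge_properties_sketch`, the tuple return value, and the choice that the first duplicate fills a key the master lacks are assumptions made for illustration; the real helper's inputs and return type may differ.

```python
from __future__ import annotations

from typing import Any, Mapping, Sequence


def merge_properties_sketch(
    master: Mapping[str, Any],
    duplicates: Sequence[Mapping[str, Any]],
) -> tuple[dict[str, Any], tuple[str, ...]]:
    """Return (merged properties, sorted keys where a differing value was seen)."""
    merged = dict(master)  # the master's values are copied first, so they always win
    conflicts: set[str] = set()
    for duplicate in duplicates:
        for key, value in duplicate.items():
            if key not in merged:
                merged[key] = value  # fill a gap left by the master
            elif merged[key] != value:
                conflicts.add(key)  # the value already kept stays in place
    return merged, tuple(sorted(conflicts))


merged, conflicts = merge_properties_sketch(
    {"name": "Ada", "born": 1815},
    [{"name": "Ada L.", "field": "math"}, {"field": "computing", "born": 1815}],
)
```

In this sketch a key counts as a conflict whenever a duplicate disagrees with the value already kept, whether that value came from the master or from an earlier duplicate.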
build_merged_master
build_merged_master(master, duplicates)
Return the final master node state for a merge group.
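As a rough illustration of what a final master state could look like, the sketch below unions labels into an alphabetically sorted tuple (matching the `merged_labels` description) and keeps the master's property values on conflicts. The node shape (a dict with `"id"`, `"labels"`, and `"properties"` keys) and the name `build_merged_master_sketch` are hypothetical.

```python
from __future__ import annotations

from typing import Any, Mapping, Sequence


def build_merged_master_sketch(
    master: Mapping[str, Any],
    duplicates: Sequence[Mapping[str, Any]],
) -> dict[str, Any]:
    """Combine labels and properties into a new master dict (assumed node shape)."""
    labels = set(master.get("labels", ()))
    properties = dict(master.get("properties", {}))
    for duplicate in duplicates:
        labels.update(duplicate.get("labels", ()))
        for key, value in duplicate.get("properties", {}).items():
            properties.setdefault(key, value)  # never overwrite a kept value
    return {
        "id": master["id"],  # the master keeps its own identifier
        "labels": tuple(sorted(labels)),
        "properties": properties,
    }


final = build_merged_master_sketch(
    {"id": "n1", "labels": ["Person"], "properties": {"name": "Ada"}},
    [{"id": "n2", "labels": ["Scientist", "Person"],
      "properties": {"name": "A.", "field": "math"}}],
)
```

Returning a new dictionary instead of mutating the input keeps the original master available for comparison when reporting the merge.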