Wiki source code of Data Model
Version 5.1 by Robert Schaub on 2025/12/14 22:27
| author | version | line-number | content |
|---|---|---|---|
| 1 | = Data Model = | ||
| 2 | |||
| 3 | This page describes the current data model for FactHarbor. | ||
| 4 | |||
| 5 | == Versioning Strategy == | ||
| 6 | |||
| 7 | Every entity in FactHarbor has a full immutable version history. This ensures: | ||
| 8 | * Complete auditability | ||
| 9 | * Ability to reconstruct historical state | ||
| 10 | * Federation-compatible lineage tracking | ||
| 11 | * Transparent evolution of claims, scenarios, and verdicts | ||
| 12 | |||
| 13 | === Core Versioning Principles === | ||
| 14 | |||
| 15 | **Immutability**: | ||
| 16 | * Each version is stored independently | ||
| 17 | * Versions cannot be deleted, only superseded | ||
| 18 | * Historical versions remain accessible | ||
| 19 | |||
| 20 | **Lineage**: | ||
| 21 | * Each version links to its parent via `ParentVersionID` | ||
| 22 | * Forms a directed acyclic graph (DAG) of changes | ||
| 23 | * Supports branching in federated environments | ||
| 24 | |||
| 25 | **Provenance**: | ||
| 26 | * Every version is timestamped (`CreatedAt`) | ||
| 27 | * Author type recorded (`AuthorType`: Human, AI, ExternalNode) | ||
| 28 | * Justification captured (`JustificationText`) | ||
| 29 | * Digital signatures for integrity (`SignatureHash` in Release 1.0) | ||
| 30 | |||
| 31 | **Federation Support**: | ||
| 32 | * Versions can originate from remote nodes | ||
| 33 | * Conflict detection via lineage comparison | ||
| 34 | * Parallel version trees for branching scenarios | ||
| 35 | * Cross-node version synchronization | ||
| 36 | |||
| 37 | === Common Version Fields === | ||
| 38 | |||
| 39 | All versioned entities include: | ||
| 40 | |||
| 41 | * **VersionID**: Unique identifier for this specific version | ||
| 42 | * **ParentVersionID**: Link to previous version (null for first version) | ||
| 43 | * **CreatedAt**: Timestamp (ISO 8601, UTC) | ||
| 44 | * **AuthorType**: Human | AI | ExternalNode | ||
| 45 | * **JustificationText**: Brief explanation of changes | ||
| 46 | * **SignatureHash**: Cryptographic signature (Release 1.0) | ||
| 47 | |||
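As a minimal sketch of how the common fields could be carried by every versioned entity, the following Python dataclass mirrors the list above. The class name `VersionFields`, the helper `new_version`, and the use of UUIDs for `VersionID` are illustrative assumptions, not part of the FactHarbor specification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
import uuid


class AuthorType(Enum):
    HUMAN = "Human"
    AI = "AI"
    EXTERNAL_NODE = "ExternalNode"


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class VersionFields:
    version_id: str
    parent_version_id: Optional[str]      # None for the first version
    created_at: str                       # ISO 8601 timestamp in UTC
    author_type: AuthorType
    justification_text: str
    signature_hash: Optional[str] = None  # reserved for Release 1.0


def new_version(parent_version_id, author_type, justification_text):
    """Hypothetical helper that fills in the common fields for a new version."""
    return VersionFields(
        version_id=str(uuid.uuid4()),     # assumption: UUIDs as identifiers
        parent_version_id=parent_version_id,
        created_at=datetime.now(timezone.utc).isoformat(),
        author_type=author_type,
        justification_text=justification_text,
    )


first = new_version(None, AuthorType.HUMAN, "Initial claim text")
second = new_version(first.version_id, AuthorType.AI, "Clarified wording")
```

Freezing the dataclass is one way to approximate the immutability principle in application code; a real deployment would presumably also enforce it at the storage layer.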
| 48 | ---- | ||
| 49 | |||
| 50 | == Core Data Model Refinements == | ||
| 51 | |||
| 52 | The system relies on the following versioned core entities: | ||
| 53 | |||
| 54 | * **CLAIM_CLUSTER** | ||
| 55 | ** ``ClusterID`` (PK), ``EmbeddingVectorRef``, ``Theme`` | ||
| 56 | ** Groups related claims into topical clusters. | ||
| 57 | ** One Cluster has many Claims. | ||
| 58 | ** A Claim belongs to exactly one primary cluster. | ||
| 59 | |||
| 60 | * **CLAIM / CLAIM_VERSION** | ||
| 61 | ** ``CLAIM`` is the long‑lived anchor for a real‑world claim. | ||
| 62 | ** ``CLAIM_VERSION`` is an immutable snapshot that includes: | ||
| 63 | *** ``ClaimID`` (FK to CLAIM) | ||
| 64 | *** ``VersionID`` (PK) | ||
| 65 | *** ``ParentVersionID`` (FK to prior version, nullable) | ||
| 66 | *** ``Text`` | ||
| 67 | *** ``Domain`` | ||
| 68 | *** ``ClaimType`` (literal, metaphorical, rhetorical, supernatural...) | ||
| 69 | *** ``Evaluability`` (empirical, subjective, non-falsifiable) | ||
| 70 | *** ``SafetyCategory`` (low, medium, high) | ||
| 71 | *** ``CreatedAt``, ``AuthorType``, ``JustificationText`` | ||
| 72 | *** ``Status`` (active, superseded, merged) | ||
| 73 | |||
| 74 | * **SCENARIO / SCENARIO_VERSION** | ||
| 75 | ** ``SCENARIO`` is the anchor for a scenario across time. | ||
| 76 | ** ``SCENARIO_VERSION`` is an immutable snapshot: | ||
| 77 | *** ``ScenarioID`` (FK to SCENARIO) | ||
| 78 | *** ``VersionID`` (PK) | ||
| 79 | *** ``ParentVersionID`` | ||
| 80 | *** ``ClaimID`` (FK to CLAIM) | ||
| 81 | *** ``Definitions`` | ||
| 82 | *** ``Boundaries`` | ||
| 83 | *** ``Assumptions`` | ||
| 84 | *** ``Context`` | ||
| 85 | *** ``EvaluationMethod`` | ||
| 86 | *** ``SafetyClass`` | ||
| 87 | *** ``CreatedAt``, ``AuthorType``, ``JustificationText`` | ||
| 88 | *** ``Status`` (active, superseded, deprecated) | ||
| 89 | |||
| 90 | * **EVIDENCE / EVIDENCE_VERSION** | ||
| 91 | ** ``EVIDENCE`` is the anchor. | ||
| 92 | ** ``EVIDENCE_VERSION`` is the versioned snapshot: | ||
| 93 | *** ``EvidenceID`` (FK to EVIDENCE) | ||
| 94 | *** ``VersionID`` (PK) | ||
| 95 | *** ``ParentVersionID`` | ||
| 96 | *** ``Type`` (paper, dataset, report, transcript, expert...) | ||
| 97 | *** ``Category`` (empirical, historical, rhetorical, dataset, meta-analysis...) | ||
| 98 | *** ``Reliability`` (low, medium, high) | ||
| 99 | *** ``Provenance`` (URL, DOI, source metadata) | ||
| 100 | *** ``ExtractionMethod`` (manual, OCR, API, AKEL) | ||
| 101 | *** ``CreatedAt``, ``AuthorType``, ``JustificationText`` | ||
| 102 | *** ``Status`` (verified, updated, disputed, retracted, superseded) | ||
| 103 | |||
| 104 | * **VERDICT / VERDICT_VERSION** | ||
| 105 | ** ``VERDICT`` is the anchor. | ||
| 106 | ** ``VERDICT_VERSION`` is the snapshot: | ||
| 107 | *** ``VerdictID`` (FK to VERDICT) | ||
| 108 | *** ``VersionID`` (PK) | ||
| 109 | *** ``ParentVersionID`` | ||
| 110 | *** ``ClaimID`` (FK to CLAIM) | ||
| 111 | *** ``ScenarioID`` (FK to SCENARIO) | ||
| 112 | *** ``EvidenceVersionSet`` (list of evidence version IDs used) | ||
| 113 | *** ``LikelihoodRange`` (0–1, with uncertainty bounds) | ||
| 114 | *** ``ExplanationChain`` | ||
| 115 | *** ``UncertaintyFactors`` | ||
| 116 | *** ``CreatedAt``, ``AuthorType``, ``JustificationText`` | ||
| 117 | *** ``Status`` (current, outdated, superseded, retracted) | ||
| 118 | |||
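To illustrate how a `VERDICT_VERSION` snapshot might pin its inputs, here is a hedged Python sketch. The three-part shape of `LikelihoodRange` (lower bound, point estimate, upper bound) is an assumption about what "0–1, with uncertainty bounds" means, and all identifiers in the example are made up.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LikelihoodRange:
    lower: float  # assumed lower uncertainty bound
    point: float  # assumed central estimate
    upper: float  # assumed upper uncertainty bound

    def __post_init__(self):
        # All three values must lie in [0, 1] and be ordered.
        if not (0.0 <= self.lower <= self.point <= self.upper <= 1.0):
            raise ValueError("expected 0 <= lower <= point <= upper <= 1")


@dataclass(frozen=True)
class VerdictVersion:
    version_id: str
    verdict_id: str
    claim_id: str
    scenario_id: str
    evidence_version_set: Tuple[str, ...]  # exact evidence versions used
    likelihood_range: LikelihoodRange
    status: str = "current"


verdict = VerdictVersion(
    version_id="vv-1",
    verdict_id="v-1",
    claim_id="c-1",
    scenario_id="s-1",
    evidence_version_set=("ev-1", "ev-2"),
    likelihood_range=LikelihoodRange(lower=0.55, point=0.7, upper=0.8),
)
```

Storing evidence version IDs rather than evidence IDs is what lets a verdict be reproduced later, even after the underlying evidence has been revised.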
| 119 | ---- | ||
| 120 | |||
| 121 | == Many-to-Many Linking Tables == | ||
| 122 | |||
| 123 | === ScenarioEvidenceLink === | ||
| 124 | |||
| 125 | Links scenario versions to evidence versions with relevance scoring. | ||
| 126 | |||
| 127 | **Fields**: | ||
| 128 | * ``ScenarioID`` | ||
| 129 | * ``ScenarioVersionID`` | ||
| 130 | * ``EvidenceID`` | ||
| 131 | * ``EvidenceVersionID`` | ||
| 132 | * ``RelevanceScore`` (0–1) - How relevant this evidence is to this scenario | ||
| 133 | * ``LinkJustification`` - Brief explanation of relevance | ||
| 134 | |||
| 135 | **Purpose**: | ||
| 136 | * Evidence can be used by multiple scenarios | ||
| 137 | * Scenarios can draw from multiple pieces of evidence | ||
| 138 | * Relevance scoring helps prioritize evidence | ||
| 139 | * Version-specific linking preserves historical accuracy | ||
| 140 | |||
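The link table above could be queried along these lines. This is a sketch only: the function `evidence_for`, the `min_score` parameter, and the sample rows are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioEvidenceLink:
    scenario_id: str
    scenario_version_id: str
    evidence_id: str
    evidence_version_id: str
    relevance_score: float  # 0 to 1
    link_justification: str


def evidence_for(links, scenario_version_id, min_score=0.0):
    """Return evidence version IDs for one scenario version, most relevant first."""
    matching = [
        link for link in links
        if link.scenario_version_id == scenario_version_id
        and link.relevance_score >= min_score
    ]
    matching.sort(key=lambda link: link.relevance_score, reverse=True)
    return [link.evidence_version_id for link in matching]


links = [
    ScenarioEvidenceLink("s-1", "sv-2", "e-1", "ev-1", 0.4, "Background statistics"),
    ScenarioEvidenceLink("s-1", "sv-2", "e-2", "ev-5", 0.9, "Directly measures the outcome"),
    ScenarioEvidenceLink("s-9", "sv-7", "e-2", "ev-5", 0.6, "Same study, different framing"),
]
ranked = evidence_for(links, "sv-2")
```

Note that the same evidence version (`ev-5`) is linked to two scenario versions with different scores, which is the many-to-many case the table exists to support.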
| 141 | === ClaimCluster === | ||
| 142 | |||
| 143 | Semantic clustering of similar claims. | ||
| 144 | |||
| 145 | **Fields**: | ||
| 146 | * ``ClusterID`` (PK) | ||
| 147 | * ``EmbeddingVector`` - Vector representation for semantic search | ||
| 148 | * ``MemberList`` - List of ClaimIDs in this cluster | ||
| 149 | * ``Theme`` - Human-readable theme description | ||
| 150 | |||
| 151 | **Purpose**: | ||
| 152 | * Groups semantically similar claims | ||
| 153 | * Enables efficient search and discovery | ||
| 154 | * Supports cross-node claim alignment | ||
| 155 | * Reduces duplication | ||
| 156 | |||
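One plausible way to use the embedding vectors is nearest-cluster assignment by cosine similarity. The specification does not name a similarity measure or threshold, so both the measure and the `0.8` cutoff below are assumptions for illustration, as are the toy three-dimensional vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_cluster(claim_vector, clusters, threshold=0.8):
    """Return the ClusterID of the most similar cluster, or None if none reaches the threshold."""
    best_id, best_score = None, threshold
    for cluster_id, cluster_vector in clusters.items():
        score = cosine_similarity(claim_vector, cluster_vector)
        if score >= best_score:
            best_id, best_score = cluster_id, score
    return best_id


clusters = {
    "cluster-climate": [0.9, 0.1, 0.0],
    "cluster-vaccines": [0.0, 0.2, 0.9],
}
match = assign_cluster([0.8, 0.2, 0.1], clusters)     # close to the climate vector
no_match = assign_cluster([0.5, 0.5, 0.5], clusters)  # not close enough to either
```

A claim that matches no cluster would presumably seed a new one; how FactHarbor handles that case is not stated here.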
| 157 | ---- | ||
| 158 | |||
| 159 | == Data Model Behavior == | ||
| 160 | |||
| 161 | === Late-Arriving Evidence === | ||
| 162 | |||
| 163 | When new evidence versions appear: | ||
| 164 | |||
| 165 | 1. Existing verdicts are marked as **outdated** | ||
| 166 | 2. Scenario relevance is re-evaluated | ||
| 167 | 3. The re-evaluation engine triggers verdict recomputation | ||
| 168 | 4. New verdict versions are created | ||
| 169 | 5. Users are notified of updates | ||
| 170 | |||
| 171 | **Process**: | ||
| 172 | * New EvidenceVersion imported | ||
| 173 | * System scans related ScenarioEvidenceLinks | ||
| 174 | * Checks if evidence affects existing verdicts | ||
| 175 | * Queues affected verdicts for re-evaluation | ||
| 176 | * AKEL or reviewer creates new VerdictVersion | ||
| 177 | * Old verdicts remain accessible (historical record) | ||
| 178 | |||
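The process above can be sketched as a small function. The dictionary shapes, the function name, and the in-memory queue are hypothetical; the sketch also assumes that `Status` is lifecycle metadata that may change while the rest of a verdict version stays immutable.

```python
from collections import deque


def handle_new_evidence_version(evidence_id, links, verdicts, queue):
    """Mark verdicts affected by a new version of `evidence_id` as outdated and queue them."""
    # Step 1: find scenarios that link to any version of this evidence.
    affected_scenarios = {
        link["scenario_id"] for link in links if link["evidence_id"] == evidence_id
    }
    # Step 2: flag current verdicts for those scenarios and queue them.
    for verdict in verdicts:
        if verdict["scenario_id"] in affected_scenarios and verdict["status"] == "current":
            verdict["status"] = "outdated"  # the old version itself is kept
            queue.append(verdict["version_id"])
    return queue


links = [
    {"scenario_id": "s-1", "evidence_id": "e-1"},
    {"scenario_id": "s-2", "evidence_id": "e-2"},
]
verdicts = [
    {"version_id": "vv-1", "scenario_id": "s-1", "status": "current"},
    {"version_id": "vv-2", "scenario_id": "s-2", "status": "current"},
    {"version_id": "vv-0", "scenario_id": "s-1", "status": "superseded"},
]
pending = handle_new_evidence_version("e-1", links, verdicts, deque())
```

Only `vv-1` is queued: `vv-2` belongs to a scenario that does not use the changed evidence, and `vv-0` was already superseded. Creating the replacement `VerdictVersion` is left to AKEL or a reviewer, as described above.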
| 179 | === Scenario Evolution === | ||
| 180 | |||
| 181 | When a scenario's assumptions or definitions change: | ||
| 182 | |||
| 183 | **Creates new scenario version** (not in-place update): | ||
| 184 | * New ScenarioVersion with updated fields | ||
| 185 | * ParentVersionID points to previous version | ||
| 186 | * All dependent verdicts must be recalculated | ||
| 187 | * Previous scenario versions remain accessible | ||
| 188 | |||
| 189 | **Triggers**: | ||
| 190 | * Refined definitions | ||
| 191 | * Changed assumptions | ||
| 192 | * Expanded or narrowed boundaries | ||
| 193 | * Updated evaluation methods | ||
| 194 | * Safety classification changes | ||
| 195 | |||
| 196 | **Impact**: | ||
| 197 | * Verdicts based on the old scenario version remain valid as historical records | ||
| 198 | * New verdicts required for new scenario version | ||
| 199 | * Users can compare old vs new scenarios | ||
| 200 | * Evidence links may need re-assessment | ||
| 201 | |||
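A hedged sketch of scenario evolution follows: instead of updating a version in place, a new immutable version is derived from the previous one. The reduced field set, the `evolve_scenario` helper, and the `verdict_index` mapping are illustrative assumptions.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ScenarioVersion:
    version_id: str
    parent_version_id: Optional[str]
    scenario_id: str
    assumptions: Tuple[str, ...]
    status: str = "active"


def evolve_scenario(previous, new_version_id, verdict_index, **changes):
    """Derive a new version from `previous` and list verdicts that need recalculation.

    `verdict_index` maps a scenario version ID to the verdict version IDs based on it.
    """
    new_version = replace(
        previous,
        version_id=new_version_id,
        parent_version_id=previous.version_id,  # lineage link to the old version
        **changes,
    )
    to_recalculate = list(verdict_index.get(previous.version_id, []))
    return new_version, to_recalculate


sv1 = ScenarioVersion("sv-1", None, "s-1", ("Adults only",))
sv2, stale = evolve_scenario(
    sv1, "sv-2", {"sv-1": ["vv-1", "vv-4"]},
    assumptions=("Adults only", "Data from 2020 onward"),
)
```

`sv1` is untouched after the call, so verdicts that reference it can still be read in their original context.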
| 202 | === Federated Nodes === | ||
| 203 | |||
| 204 | Each node may share partial data: | ||
| 205 | |||
| 206 | **Claims and scenarios**: Shared if relevant to node's domain | ||
| 207 | |||
| 208 | **Evidence metadata**: Shared, but not always full evidence files | ||
| 209 | |||
| 210 | **Verdict lineage**: Shared only if not locally overridden | ||
| 211 | |||
| 212 | **Version synchronization**: | ||
| 213 | * Remote versions imported with provenance metadata | ||
| 214 | * Conflicts detected via ParentVersionID comparison | ||
| 215 | * Branching allowed for divergent interpretations | ||
| 216 | * Local node retains authority over local versions | ||
| 217 | |||
| 218 | **Trust and acceptance**: | ||
| 219 | * Trusted nodes: auto-import versions | ||
| 220 | * Neutral nodes: import but flag for review | ||
| 221 | * Untrusted nodes: manual import only | ||
| 222 | |||
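The trust rules above amount to a lookup from trust level to import behavior. In the sketch below, the enum names, the registry dictionary, and the choice to treat unknown nodes as untrusted are assumptions rather than specified behavior.

```python
from enum import Enum


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


class ImportAction(Enum):
    AUTO_IMPORT = "auto-import"
    IMPORT_AND_FLAG = "import, flag for review"
    MANUAL_ONLY = "manual import only"


IMPORT_POLICY = {
    TrustLevel.TRUSTED: ImportAction.AUTO_IMPORT,
    TrustLevel.NEUTRAL: ImportAction.IMPORT_AND_FLAG,
    TrustLevel.UNTRUSTED: ImportAction.MANUAL_ONLY,
}


def action_for_node(node_id, trust_registry):
    """Look up how versions from a remote node should be imported."""
    # Assumption: nodes missing from the registry are treated as untrusted.
    level = trust_registry.get(node_id, TrustLevel.UNTRUSTED)
    return IMPORT_POLICY[level]


registry = {"node-a": TrustLevel.TRUSTED, "node-b": TrustLevel.NEUTRAL}
action = action_for_node("node-b", registry)
```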
| 223 | ---- | ||
| 224 | |||
| 225 | == Entity-Relationship Overview == | ||
| 226 | |||
| 227 | **Core relationships**: | ||
| 228 | |||
| 229 | ``` | ||
| 230 | CLAIM_CLUSTER (1) ──< (N) CLAIM | ||
| 231 | CLAIM (1) ──< (N) CLAIM_VERSION | ||
| 232 | CLAIM (1) ──< (N) SCENARIO | ||
| 233 | SCENARIO (1) ──< (N) SCENARIO_VERSION | ||
| 234 | SCENARIO_VERSION (N) >──< (N) EVIDENCE_VERSION [via ScenarioEvidenceLink] | ||
| 235 | SCENARIO_VERSION (1) ──< (N) VERDICT_VERSION | ||
| 236 | VERDICT_VERSION references specific EvidenceVersionSet | ||
| 237 | ``` | ||
| 238 | |||
| 239 | **Version chains**: | ||
| 240 | |||
| 241 | Each entity has a version DAG: | ||
| 242 | ``` | ||
| 243 | Version 1 (ParentVersionID=null) | ||
| 244 | ↓ | ||
| 245 | Version 2 (ParentVersionID=1) | ||
| 246 | ↓ | ||
| 247 | Version 3 (ParentVersionID=2) | ||
| 248 | ``` | ||
| 249 | |||
| 250 | In federated environments, branching may occur: | ||
| 251 | ``` | ||
| 252 | Version 1 | ||
| 253 | ↓ | ||
| 254 | Version 2 | ||
| 255 | ↙ ↘ | ||
| 256 | V3a V3b (parallel branches from different nodes) | ||
| 257 | ``` | ||
| 258 | |||
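Both diagrams can be derived from `ParentVersionID` alone. The sketch below walks a lineage back to the first version and finds branch points (parents with more than one child); the `parents` mapping and the function names are hypothetical, and the sketch assumes a single parent per version, as in the diagrams.

```python
from collections import defaultdict


def lineage(version_id, parents):
    """Follow ParentVersionID links from a version back to the first version."""
    chain = []
    while version_id is not None:
        chain.append(version_id)
        version_id = parents[version_id]
    return chain


def branch_points(parents):
    """Return each parent that has more than one child, with its children sorted."""
    children = defaultdict(list)
    for version_id, parent_id in parents.items():
        if parent_id is not None:
            children[parent_id].append(version_id)
    return {
        parent: sorted(kids) for parent, kids in children.items() if len(kids) > 1
    }


# VersionID -> ParentVersionID, matching the federated diagram above.
parents = {"v1": None, "v2": "v1", "v3a": "v2", "v3b": "v2"}
history = lineage("v3a", parents)
branches = branch_points(parents)
```

Here `branches` reports that `v2` has two children, which is the situation a node would treat as a potential conflict during synchronization.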
| 259 | ---- | ||
| 260 | |||
| 261 | == Related Pages == | ||
| 262 | |||
| 263 | * [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]] | ||
| 264 | * [[AKEL (AI Knowledge Extraction Layer)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] | ||
| 265 | * [[Architecture>>FactHarbor.Specification.Architecture.WebHome]] |