Changes for page Data Model (From Specification Chat)
Last modified by Robert Schaub on 2025/12/24 20:35
From version 5.1
edited by Robert Schaub
on 2025/11/27 12:28
Change comment:
There is no comment for this version
To version 7.1
edited by Robert Schaub
on 2025/11/27 12:41
Change comment:
There is no comment for this version
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -1,3 +1,316 @@
+(((
+
+)))
+
+= 5. Data Model =
+
+The FactHarbor data model centers on four fully versioned, immutable entities:
+
+* **Claim**
+* **Scenario**
+* **Evidence**
+* **Verdict**
+
+These entities form the structured **“truth landscape”** for each claim.
+The model is explicitly **versioned**, **traceable**, and **federation-ready**.
+
+To keep the system auditable and explainable, FactHarbor uses a consistent
+**identity vs. version** pattern:
+
+* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
+  define *what* something is in a stable sense.
+* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
+  define *how that thing looked at a given point in time*.
+
+All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
+mutable identities.
+
+----
+
+= 5.1 Core entities and versioning pattern =
+
+(% class="wikitable" %)
+| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
+| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
+| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
+| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
+| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
+| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
+| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
+
+Key design decisions:
+
+* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
+* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
+  (scenarios live at the *claim* level, not per individual phrasing).
+* Verdicts and Scenario–Evidence links are always attached to **versions**:
+* {{code}}SCENARIO_VERSION{{/code}} +
+{{code}}EVIDENCE_VERSION{{/code}} →
+{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
+* {{code}}SCENARIO_VERSION{{/code}} →
+{{code}}VERDICT_VERSION{{/code}}
+
+This ensures that when a Scenario or Evidence changes, old verdicts and links
+remain intact as historical records and can be revisited.
+
+----
+
+= 5.2 Core Data Model ERD (expanded, versioned) =
+
+The following Mermaid ER diagram shows the main entities and their relationships.
+The convention is that fields ending in {{code}}Id{{/code}} are primary keys,
+and fields with {{code}}...IdFk{{/code}} are foreign keys.
+
+{{comment}} Core Data Model ERD (Mermaid, from /Specification/Diagrams/Data Model) {{/comment}}
+{{include document="FactHarbor.Playground.Core Data Model ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Core Data Model ERD Page (from Specification chat).WebHome"/}}
+
+**Important points:**
+
+* Scenarios and Evidence are **linked via their versions**
+  ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
+* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
+* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
+
+All version entities are immutable: once created, they are never changed, only
+superseded by newer versions.
+
+----
+
+= 5.3 Data Use & Review ERD (expanded, versioned) =
+
+The **Data Use** model captures who does what with which versioned data:
+
+* Users (including technical users)
+* Roles and role assignments
+* Review actions on versioned entities
+
+{{comment}} Data Use ERD (Mermaid, from /Specification/Diagrams/Data Use ERD) {{/comment}}
+{{include document="FactHarbor.Playground.Data Use ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Data Use ERD Page (from Specification chat).WebHome"/}}
+
+= Data Use ERD (Roles, Review & Versioned Entities) =
+
+This diagram shows how users, roles, and review actions relate to the
+versioned core entities.
+
+{{mermaid}}
+erDiagram
+    %% Core clusters shown for context
+    CLAIM_CLUSTER {
+        string ClusterID PK
+        string EmbeddingVectorRef
+        string Theme
+    }
+
+    CLAIM {
+        string ClaimID PK
+        string ClusterID FK
+        string Status
+        datetime CreatedAt
+    }
+
+    CLAIM_VERSION {
+        string ClaimVersionID PK
+        string ClaimID FK
+        string Text
+        string ClaimType
+        string Domain
+        datetime CreatedAt
+    }
+
+    SCENARIO {
+        string ScenarioID PK
+        string ClaimID FK
+        string Name
+        datetime CreatedAt
+    }
+
+    SCENARIO_VERSION {
+        string ScenarioVersionID PK
+        string ScenarioID FK
+        string Definitions
+        string Assumptions
+        string Boundaries
+        datetime CreatedAt
+    }
+
+    EVIDENCE {
+        string EvidenceID PK
+        string SourceType
+        string URL
+        float ReliabilityScore
+    }
+
+    EVIDENCE_VERSION {
+        string EvidenceVersionID PK
+        string EvidenceID FK
+        string Summary
+        float ReliabilityScore
+        datetime CreatedAt
+    }
+
+    VERDICT {
+        string VerdictID PK
+        string ScenarioID FK
+    }
+
+    VERDICT_VERSION {
+        string VerdictVersionID PK
+        string VerdictID FK
+        float Verdict
+        float Confidence
+        string Reasoning
+        datetime CreatedAt
+    }
+
+    %% Users and roles
+    USER {
+        string UserID PK
+        string Handle
+        string Email
+    }
+
+    TECHNICAL_USER {
+        string UserID PK
+        string SystemName
+    }
+
+    CONTRIBUTING_USER {
+        string UserID PK
+        string DisplayName
+    }
+
+    TRUSTED_CONTRIBUTOR {
+        string UserID PK
+        string TrustLevel
+    }
+
+    REVIEWER {
+        string UserID PK
+        string Domain
+    }
+
+    EXPERT {
+        string UserID PK
+        string ExpertiseArea
+    }
+
+    FEDERATION_NODE {
+        string NodeID PK
+        string Region
+    }
+
+    FEDERATION_ADMIN {
+        string UserID PK
+        string Permissions
+    }
+
+    REVIEW_ACTION {
+        string ReviewActionID PK
+        string UserID FK
+        string TargetEntityType
+        string TargetEntityVersionID
+        string ActionType
+        string Comment
+        datetime Timestamp
+    }
+
+    %% Inheritance / specialization (modelled as relationships)
+    USER ||--o{ TECHNICAL_USER : "is a"
+    USER ||--o{ CONTRIBUTING_USER : "is a"
+
+    CONTRIBUTING_USER ||--o{ TRUSTED_CONTRIBUTOR : "subset"
+    CONTRIBUTING_USER ||--o{ REVIEWER : "subset"
+    CONTRIBUTING_USER ||--o{ EXPERT : "subset"
+
+    TECHNICAL_USER ||--o{ FEDERATION_NODE : "operates"
+    TECHNICAL_USER ||--o{ FEDERATION_ADMIN : "administers"
+
+    %% Review actions on versioned entities
+    USER ||--o{ REVIEW_ACTION : performs
+
+    REVIEW_ACTION }o--|| CLAIM_VERSION : reviews
+    REVIEW_ACTION }o--|| SCENARIO_VERSION : reviews
+    REVIEW_ACTION }o--|| EVIDENCE_VERSION : reviews
+    REVIEW_ACTION }o--|| VERDICT_VERSION : reviews
+{{/mermaid}}
+
+{{info}}
+This diagram focuses on *who* uses and reviews *which* versioned entities.
+USER is the base type; TECHNICAL_USER and CONTRIBUTING_USER are specializations.
+Other roles (REVIEWER, EXPERT, TRUSTED_CONTRIBUTOR, FEDERATION_ADMIN, FEDERATION_NODE)
+are modelled as specializations or technical subtypes.
+{{/info}}
+
+
+Notes:
+
+* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
+  SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
+  in {{code}}ROLE{{/code}}.
+* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
+  node-to-node federation agents, batch jobs). All other roles can, in principle,
+  be held by both human and technical users where appropriate.
+* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
+  roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
+  do.
+
+----
+
+= 5.4 Versioning and re-evaluation behavior =
+
+This section ties the data model to the re-evaluation logic
+(described in more detail in the Versioning and Automation chapters).
+
+* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
+* All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
+  that evidence version are candidates for re-assessment.
+* Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
+  are queued for re-evaluation.
+
+* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
+* It may inherit some links from earlier scenarios, or start empty depending
+  on the change classification (cosmetic vs. conceptual).
+* All verdicts for that scenario are recalculated and stored as new
+{{code}}VERDICT_VERSION{{/code}} entries.
+
+* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
+  the reviewer. This preserves a faithful audit trail if data later changes.
+
+* In a federated environment, nodes can choose:
+* which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
+* which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
+  only EVIDENCE_VERSIONs above a reliability threshold, etc.)
+
+----
+
+= 5.5 Behavioral Notes =
+
+== 5.5.1 Late-Arriving Evidence ==
+
+New evidence versions can make existing verdicts **outdated** and may trigger
+re-evaluation cascades. This is handled by the global trigger and automation
+architecture (see the Versioning & Automation chapters).
+
+== 5.5.2 Scenario Evolution ==
+
+Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
+Scenario–Evidence links are re-assessed. Old versions remain available for
+historical comparison and reproducibility.
+
+== 5.5.3 Federation ==
+
+Federated nodes can replicate subsets of the graph, including:
+
+* Claims and Scenarios of local interest
+* Evidence metadata (without full content)
+* Verdict lineages used for local decision-making
+
+Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
+replication logs, and trust rules) are described in the Federation &
+Decentralization chapter and build on top of the core data model defined here.
+
+----
+
 == 1. Overall analysis & review of the data model ==
 
 === 1.1 Strengths of the current design ===
@@ -165,155 +165,3 @@
 )))
 * That’s fine for now; I’ll just clarify that those belong to a “Processing / AKEL” submodel, not the core logical data model.
 )))
-
-= 5. Data Model =
-
-The FactHarbor data model centers on four fully versioned, immutable entities:
-
-* **Claim**
-* **Scenario**
-* **Evidence**
-* **Verdict**
-
-These entities form the structured **“truth landscape”** for each claim.
-The model is explicitly **versioned**, **traceable**, and **federation-ready**.
-
-To keep the system auditable and explainable, FactHarbor uses a consistent
-**identity vs. version** pattern:
-
-* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
-  define *what* something is in a stable sense.
-* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
-  define *how that thing looked at a given point in time*.
-
-All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
-mutable identities.
-
-----
-
-= 5.1 Core entities and versioning pattern =
-
-(% class="wikitable" %)
-| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
-| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
-| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
-| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
-| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
-| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
-| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
-
-Key design decisions:
-
-* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
-* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
-  (scenarios live at the *claim* level, not per individual phrasing).
-* Verdicts and Scenario–Evidence links are always attached to **versions**:
-* {{code}}SCENARIO_VERSION{{/code}} +
-{{code}}EVIDENCE_VERSION{{/code}} →
-{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
-* {{code}}SCENARIO_VERSION{{/code}} →
-{{code}}VERDICT_VERSION{{/code}}
-
-This ensures that when a Scenario or Evidence changes, old verdicts and links
-remain intact as historical records and can be revisited.
-
-----
-
-= 5.2 Core Data Model ERD (expanded, versioned) =
-
-The following Mermaid ER diagram shows the main entities and their relationships.
-The convention is that fields ending in {{code}}Id{{/code}} are primary keys,
-and fields with {{code}}...IdFk{{/code}} are foreign keys.
-
-{{comment}} Core Data Model ERD (Mermaid, from /Specification/Diagrams/Data Model) {{/comment}}
-{{include document="FactHarbor.Playground.Core Data Model ERD Page (from Specification chat).WebHome"/}}
-
-**Important points:**
-
-* Scenarios and Evidence are **linked via their versions**
-  ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
-* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
-* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
-
-All version entities are immutable: once created, they are never changed, only
-superseded by newer versions.
-
-----
-
-= 5.3 Data Use & Review ERD (expanded, versioned) =
-
-The **Data Use** model captures who does what with which versioned data:
-
-* Users (including technical users)
-* Roles and role assignments
-* Review actions on versioned entities
-
-{{comment}} Data Use ERD (Mermaid, from /Specification/Diagrams/Data Use ERD) {{/comment}}
-{{include document="FactHarbor.Playground.Data Use ERD Page (from Specification chat).WebHome"/}}
-
-Notes:
-
-* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
-  SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
-  in {{code}}ROLE{{/code}}.
-* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
-  node-to-node federation agents, batch jobs). All other roles can, in principle,
-  be held by both human and technical users where appropriate.
-* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
-  roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
-  do.
-
-----
-
-= 5.4 Versioning and re-evaluation behavior =
-
-This section ties the data model to the re-evaluation logic
-(described in more detail in the Versioning and Automation chapters).
-
-* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
-* All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
-  that evidence version are candidates for re-assessment.
-* Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
-  are queued for re-evaluation.
-
-* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
-* It may inherit some links from earlier scenarios, or start empty depending
-  on the change classification (cosmetic vs. conceptual).
-* All verdicts for that scenario are recalculated and stored as new
-{{code}}VERDICT_VERSION{{/code}} entries.
-
-* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
-  the reviewer. This preserves a faithful audit trail if data later changes.
-
-* In a federated environment, nodes can choose:
-* which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
-* which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
-  only EVIDENCE_VERSIONs above a reliability threshold, etc.)
-
-----
-
-= 5.5 Behavioral Notes =
-
-== 5.5.1 Late-Arriving Evidence ==
-
-New evidence versions can make existing verdicts **outdated** and may trigger
-re-evaluation cascades. This is handled by the global trigger and automation
-architecture (see the Versioning & Automation chapters).
-
-== 5.5.2 Scenario Evolution ==
-
-Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
-Scenario–Evidence links are re-assessed. Old versions remain available for
-historical comparison and reproducibility.
-
-== 5.5.3 Federation ==
-
-Federated nodes can replicate subsets of the graph, including:
-
-* Claims and Scenarios of local interest
-* Evidence metadata (without full content)
-* Verdict lineages used for local decision-making
-
-Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
-replication logs, and trust rules) are described in the Federation &
-Decentralization chapter and build on top of the core data model defined here.