Changes for page Data Model (From Specification Chat)
Last modified by Robert Schaub on 2025/12/24 20:35
From version 7.1
edited by Robert Schaub
on 2025/11/27 12:41
Change comment:
There is no comment for this version
To version 5.1
edited by Robert Schaub
on 2025/11/27 12:28
Change comment:
There is no comment for this version
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -1,316 +314,3 @@
-(((
-
-)))
-
-= 5. Data Model =
-
-The FactHarbor data model centers on four fully versioned, immutable entities:
-
-* **Claim**
-* **Scenario**
-* **Evidence**
-* **Verdict**
-
-These entities form the structured **“truth landscape”** for each claim.
-The model is explicitly **versioned**, **traceable**, and **federation-ready**.
-
-To keep the system auditable and explainable, FactHarbor uses a consistent
-**identity vs. version** pattern:
-
-* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
-  define *what* something is in a stable sense.
-* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
-  define *how that thing looked at a given point in time*.
-
-All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
-mutable identities.
-
------
-
-= 5.1 Core entities and versioning pattern =
-
-(% class="wikitable" %)
-| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
-| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
-| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
-| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
-| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
-| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
-| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
-
-Key design decisions:
-
-* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
-* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
-  (scenarios live at the *claim* level, not per individual phrasing).
-* Verdicts and Scenario–Evidence links are always attached to **versions**:
-* {{code}}SCENARIO_VERSION{{/code}} +
-{{code}}EVIDENCE_VERSION{{/code}} →
-{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
-* {{code}}SCENARIO_VERSION{{/code}} →
-{{code}}VERDICT_VERSION{{/code}}
-
-This ensures that when a Scenario or Evidence changes, old verdicts and links
-remain intact as historical records and can be revisited.
-
------
-
-= 5.2 Core Data Model ERD (expanded, versioned) =
-
-The following Mermaid ER diagram shows the main entities and their relationships.
-The convention is that fields ending in {{code}}Id{{/code}} are primary keys,
-and fields with {{code}}...IdFk{{/code}} are foreign keys.
-
-{{comment}} Core Data Model ERD (Mermaid, from /Specification/Diagrams/Data Model) {{/comment}}
-{{include document="FactHarbor.Playground.Core Data Model ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Core Data Model ERD Page (from Specification chat).WebHome"/}}
-
-**Important points:**
-
-* Scenarios and Evidence are **linked via their versions**
-  ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
-* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
-* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
-
-All version entities are immutable: once created, they are never changed, only
-superseded by newer versions.
-
------
-
-= 5.3 Data Use & Review ERD (expanded, versioned) =
-
-The **Data Use** model captures who does what with which versioned data:
-
-* Users (including technical users)
-* Roles and role assignments
-* Review actions on versioned entities
-
-{{comment}} Data Use ERD (Mermaid, from /Specification/Diagrams/Data Use ERD) {{/comment}}
-{{include document="FactHarbor.Playground.Data Use ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Data Use ERD Page (from Specification chat).WebHome"/}}
-
-= Data Use ERD (Roles, Review & Versioned Entities) =
-
-This diagram shows how users, roles, and review actions relate to the
-versioned core entities.
-
-{{mermaid}}
-erDiagram
-    %% Core clusters shown for context
-    CLAIM_CLUSTER {
-        string ClusterID PK
-        string EmbeddingVectorRef
-        string Theme
-    }
-
-    CLAIM {
-        string ClaimID PK
-        string ClusterID FK
-        string Status
-        datetime CreatedAt
-    }
-
-    CLAIM_VERSION {
-        string ClaimVersionID PK
-        string ClaimID FK
-        string Text
-        string ClaimType
-        string Domain
-        datetime CreatedAt
-    }
-
-    SCENARIO {
-        string ScenarioID PK
-        string ClaimID FK
-        string Name
-        datetime CreatedAt
-    }
-
-    SCENARIO_VERSION {
-        string ScenarioVersionID PK
-        string ScenarioID FK
-        string Definitions
-        string Assumptions
-        string Boundaries
-        datetime CreatedAt
-    }
-
-    EVIDENCE {
-        string EvidenceID PK
-        string SourceType
-        string URL
-        float ReliabilityScore
-    }
-
-    EVIDENCE_VERSION {
-        string EvidenceVersionID PK
-        string EvidenceID FK
-        string Summary
-        float ReliabilityScore
-        datetime CreatedAt
-    }
-
-    VERDICT {
-        string VerdictID PK
-        string ScenarioID FK
-    }
-
-    VERDICT_VERSION {
-        string VerdictVersionID PK
-        string VerdictID FK
-        float Verdict
-        float Confidence
-        string Reasoning
-        datetime CreatedAt
-    }
-
-    %% Users and roles
-    USER {
-        string UserID PK
-        string Handle
-        string Email
-    }
-
-    TECHNICAL_USER {
-        string UserID PK
-        string SystemName
-    }
-
-    CONTRIBUTING_USER {
-        string UserID PK
-        string DisplayName
-    }
-
-    TRUSTED_CONTRIBUTOR {
-        string UserID PK
-        string TrustLevel
-    }
-
-    REVIEWER {
-        string UserID PK
-        string Domain
-    }
-
-    EXPERT {
-        string UserID PK
-        string ExpertiseArea
-    }
-
-    FEDERATION_NODE {
-        string NodeID PK
-        string Region
-    }
-
-    FEDERATION_ADMIN {
-        string UserID PK
-        string Permissions
-    }
-
-    REVIEW_ACTION {
-        string ReviewActionID PK
-        string UserID FK
-        string TargetEntityType
-        string TargetEntityVersionID
-        string ActionType
-        string Comment
-        datetime Timestamp
-    }
-
-    %% Inheritance / specialization (modelled as relationships)
-    USER ||--o{ TECHNICAL_USER : "is a"
-    USER ||--o{ CONTRIBUTING_USER : "is a"
-
-    CONTRIBUTING_USER ||--o{ TRUSTED_CONTRIBUTOR : "subset"
-    CONTRIBUTING_USER ||--o{ REVIEWER : "subset"
-    CONTRIBUTING_USER ||--o{ EXPERT : "subset"
-
-    TECHNICAL_USER ||--o{ FEDERATION_NODE : "operates"
-    TECHNICAL_USER ||--o{ FEDERATION_ADMIN : "administers"
-
-    %% Review actions on versioned entities
-    USER ||--o{ REVIEW_ACTION : performs
-
-    REVIEW_ACTION }o--|| CLAIM_VERSION : reviews
-    REVIEW_ACTION }o--|| SCENARIO_VERSION : reviews
-    REVIEW_ACTION }o--|| EVIDENCE_VERSION : reviews
-    REVIEW_ACTION }o--|| VERDICT_VERSION : reviews
-{{/mermaid}}
-
-{{info}}
-This diagram focuses on *who* uses and reviews *which* versioned entities.
-USER is the base type; TECHNICAL_USER and CONTRIBUTING_USER are specializations.
-Other roles (REVIEWER, EXPERT, TRUSTED_CONTRIBUTOR, FEDERATION_ADMIN, FEDERATION_NODE)
-are modelled as specializations or technical subtypes.
-{{/info}}
-
-
-Notes:
-
-* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
-  SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
-  in {{code}}ROLE{{/code}}.
-* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
-  node-to-node federation agents, batch jobs). All other roles can, in principle,
-  be held by both human and technical users where appropriate.
-* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
-  roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
-  do.
-
------
-
-= 5.4 Versioning and re-evaluation behavior =
-
-This section ties the data model to the re-evaluation logic
-(described in more detail in the Versioning and Automation chapters).
-
-* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
-* All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
-  that evidence version are candidates for re-assessment.
-* Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
-  are queued for re-evaluation.
-
-* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
-* It may inherit some links from earlier scenarios, or start empty depending
-  on the change classification (cosmetic vs. conceptual).
-* All verdicts for that scenario are recalculated and stored as new
-{{code}}VERDICT_VERSION{{/code}} entries.
-
-* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
-  the reviewer. This preserves a faithful audit trail if data later changes.
-
-* In a federated environment, nodes can choose:
-* which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
-* which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
-  only EVIDENCE_VERSIONs above a reliability threshold, etc.)
-
------
-
-= 5.5 Behavioral Notes =
-
-== 5.5.1 Late-Arriving Evidence ==
-
-New evidence versions can make existing verdicts **outdated** and may trigger
-re-evaluation cascades. This is handled by the global trigger and automation
-architecture (see the Versioning & Automation chapters).
-
-== 5.5.2 Scenario Evolution ==
-
-Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
-Scenario–Evidence links are re-assessed. Old versions remain available for
-historical comparison and reproducibility.
-
-== 5.5.3 Federation ==
-
-Federated nodes can replicate subsets of the graph, including:
-
-* Claims and Scenarios of local interest
-* Evidence metadata (without full content)
-* Verdict lineages used for local decision-making
-
-Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
-replication logs, and trust rules) are described in the Federation &
-Decentralization chapter and build on top of the core data model defined here.
-
------
-
 == 1. Overall analysis & review of the data model ==
 
 === 1.1 Strengths of the current design ===
@@ -478,3 +478,155 @@
 )))
 * That’s fine for now; I’ll just clarify that those belong to a “Processing / AKEL” submodel, not the core logical data model.
 )))
+
+= 5. Data Model =
+
+The FactHarbor data model centers on four fully versioned, immutable entities:
+
+* **Claim**
+* **Scenario**
+* **Evidence**
+* **Verdict**
+
+These entities form the structured **“truth landscape”** for each claim.
+The model is explicitly **versioned**, **traceable**, and **federation-ready**.
+
+To keep the system auditable and explainable, FactHarbor uses a consistent
+**identity vs. version** pattern:
+
+* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
+  define *what* something is in a stable sense.
+* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
+  define *how that thing looked at a given point in time*.
+
+All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
+mutable identities.
+
+----
+
+= 5.1 Core entities and versioning pattern =
+
+(% class="wikitable" %)
+| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
+| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
+| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
+| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
+| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
+| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
+| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
+
+Key design decisions:
+
+* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
+* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
+  (scenarios live at the *claim* level, not per individual phrasing).
+* Verdicts and Scenario–Evidence links are always attached to **versions**:
+* {{code}}SCENARIO_VERSION{{/code}} +
+{{code}}EVIDENCE_VERSION{{/code}} →
+{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
+* {{code}}SCENARIO_VERSION{{/code}} →
+{{code}}VERDICT_VERSION{{/code}}
+
+This ensures that when a Scenario or Evidence changes, old verdicts and links
+remain intact as historical records and can be revisited.
+
+----
+
+= 5.2 Core Data Model ERD (expanded, versioned) =
+
+The following Mermaid ER diagram shows the main entities and their relationships.
+The convention is that fields ending in {{code}}Id{{/code}} are primary keys,
+and fields with {{code}}...IdFk{{/code}} are foreign keys.
+
+{{comment}} Core Data Model ERD (Mermaid, from /Specification/Diagrams/Data Model) {{/comment}}
+{{include document="FactHarbor.Playground.Core Data Model ERD Page (from Specification chat).WebHome"/}}
+
+**Important points:**
+
+* Scenarios and Evidence are **linked via their versions**
+  ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
+* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
+* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
+
+All version entities are immutable: once created, they are never changed, only
+superseded by newer versions.
+
+----
+
+= 5.3 Data Use & Review ERD (expanded, versioned) =
+
+The **Data Use** model captures who does what with which versioned data:
+
+* Users (including technical users)
+* Roles and role assignments
+* Review actions on versioned entities
+
+{{comment}} Data Use ERD (Mermaid, from /Specification/Diagrams/Data Use ERD) {{/comment}}
+{{include document="FactHarbor.Playground.Data Use ERD Page (from Specification chat).WebHome"/}}
+
+Notes:
+
+* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
+  SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
+  in {{code}}ROLE{{/code}}.
+* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
+  node-to-node federation agents, batch jobs). All other roles can, in principle,
+  be held by both human and technical users where appropriate.
+* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
+  roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
+  do.
+
+----
+
+= 5.4 Versioning and re-evaluation behavior =
+
+This section ties the data model to the re-evaluation logic
+(described in more detail in the Versioning and Automation chapters).
+
+* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
+* All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
+  that evidence version are candidates for re-assessment.
+* Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
+  are queued for re-evaluation.
+
+* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
+* It may inherit some links from earlier scenarios, or start empty depending
+  on the change classification (cosmetic vs. conceptual).
+* All verdicts for that scenario are recalculated and stored as new
+{{code}}VERDICT_VERSION{{/code}} entries.
+
+* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
+  the reviewer. This preserves a faithful audit trail if data later changes.
+
+* In a federated environment, nodes can choose:
+* which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
+* which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
+  only EVIDENCE_VERSIONs above a reliability threshold, etc.)
+
+----
+
+= 5.5 Behavioral Notes =
+
+== 5.5.1 Late-Arriving Evidence ==
+
+New evidence versions can make existing verdicts **outdated** and may trigger
+re-evaluation cascades. This is handled by the global trigger and automation
+architecture (see the Versioning & Automation chapters).
+
+== 5.5.2 Scenario Evolution ==
+
+Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
+Scenario–Evidence links are re-assessed. Old versions remain available for
+historical comparison and reproducibility.
+
+== 5.5.3 Federation ==
+
+Federated nodes can replicate subsets of the graph, including:
+
+* Claims and Scenarios of local interest
+* Evidence metadata (without full content)
+* Verdict lineages used for local decision-making
+
+Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
+replication logs, and trust rules) are described in the Federation &
+Decentralization chapter and build on top of the core data model defined here.
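
Illustrative sketch (not part of the recorded page change): the identity vs. version pattern from section 5 and the late-arriving-evidence behavior from sections 5.4 and 5.5.1 could look roughly like the Python below. The names {{code}}Store{{/code}}, {{code}}add_evidence_version{{/code}}, and {{code}}reevaluation_queue{{/code}}, the ID format, and the {{code}}direction{{/code}} values are hypothetical. The sketch also assumes two things the page leaves open: a verdict version carries a direct reference to its scenario version, and "related" verdicts are those whose scenario version is linked to an earlier version of the same evidence.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from datetime import datetime, timezone
from itertools import count


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass(frozen=True)  # frozen: a version is never changed after creation
class EvidenceVersion:
    evidence_version_id: str
    evidence_id: str  # points at the stable EVIDENCE identity
    summary: str
    reliability_score: float
    created_at: datetime = field(default_factory=_now)


@dataclass(frozen=True)
class LinkVersion:  # stands in for SCENARIO_EVIDENCE_LINK_VERSION
    scenario_version_id: str
    evidence_version_id: str
    relevance: float
    direction: str  # hypothetical values, e.g. "supports" / "refutes"


@dataclass(frozen=True)
class VerdictVersion:
    verdict_version_id: str
    scenario_version_id: str  # assumed reference (verdicts are per scenario version)
    verdict: float
    confidence: float


class Store:
    """Hypothetical in-memory store; version entities are append-only."""

    def __init__(self) -> None:
        self.evidence_versions: list[EvidenceVersion] = []
        self.link_versions: list[LinkVersion] = []
        self.verdict_versions: list[VerdictVersion] = []
        self.reevaluation_queue: list[str] = []  # verdict_version_ids
        self._seq = count(1)

    def add_evidence_version(
        self, evidence_id: str, summary: str, reliability_score: float
    ) -> EvidenceVersion:
        # Earlier versions of the same evidence identity (assumed meaning of "related").
        earlier = {
            ev.evidence_version_id
            for ev in self.evidence_versions
            if ev.evidence_id == evidence_id
        }
        new = EvidenceVersion(
            f"EV-{next(self._seq)}", evidence_id, summary, reliability_score
        )
        self.evidence_versions.append(new)  # supersede, never overwrite

        # Scenario versions whose links point at an earlier version of this evidence.
        affected = {
            link.scenario_version_id
            for link in self.link_versions
            if link.evidence_version_id in earlier
        }
        # Queue their verdict versions for re-evaluation; the old records stay intact.
        for vv in self.verdict_versions:
            if (
                vv.scenario_version_id in affected
                and vv.verdict_version_id not in self.reevaluation_queue
            ):
                self.reevaluation_queue.append(vv.verdict_version_id)
        return new
```

In this sketch the queue only marks verdict versions as outdated. Recalculation would append a new {{code}}VerdictVersion{{/code}} rather than modify the old one, which matches the rule that version entities are only superseded.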