Changes for page Data Model (From Specification Chat)
Last modified by Robert Schaub on 2025/12/24 20:35
From version 8.1
edited by Robert Schaub
on 2025/11/27 12:55
Change comment:
There is no comment for this version
To version 4.1
edited by Robert Schaub
on 2025/11/27 12:11
Change comment:
There is no comment for this version
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
Page properties
Content
@@ -1,162 +160,3 @@
-(((
-
-)))
-
-= 5. Data Model =
-
-The FactHarbor data model centers on four fully versioned, immutable entities:
-
-* **Claim**
-* **Scenario**
-* **Evidence**
-* **Verdict**
-
-These entities form the structured **“truth landscape”** for each claim.
-The model is explicitly **versioned**, **traceable**, and **federation-ready**.
-
-To keep the system auditable and explainable, FactHarbor uses a consistent
-**identity vs. version** pattern:
-
-* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
-  define *what* something is in a stable sense.
-* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
-  define *how that thing looked at a given point in time*.
-
-All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
-mutable identities.
-
-----
-
-= 5.1 Core entities and versioning pattern =
-
-(% class="wikitable" %)
-| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
-| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
-| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
-| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
-| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
-| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
-| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
-
-Key design decisions:
-
-* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
-* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
-  (scenarios live at the *claim* level, not per individual phrasing).
-* Verdicts and Scenario–Evidence links are always attached to **versions**:
-* {{code}}SCENARIO_VERSION{{/code}} +
-{{code}}EVIDENCE_VERSION{{/code}} →
-{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
-* {{code}}SCENARIO_VERSION{{/code}} →
-{{code}}VERDICT_VERSION{{/code}}
-
-This ensures that when a Scenario or Evidence changes, old verdicts and links
-remain intact as historical records and can be revisited.
-
-----
-
-= 5.2 Core Data Model ERD (expanded, versioned) =
-
-The following Mermaid ER diagram shows the main entities and their relationships.
-The convention is that fields ending in {{code}}Id{{/code}} are primary keys,
-and fields with {{code}}...IdFk{{/code}} are foreign keys.
-
-{{comment}} Core Data Model ERD (Mermaid, from /Specification/Diagrams/Data Model) {{/comment}}
-{{include document="FactHarbor.Playground.Core Data Model ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Core Data Model ERD Page (from Specification chat).WebHome"/}}
-
-**Important points:**
-
-* Scenarios and Evidence are **linked via their versions**
-  ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
-* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
-* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
-
-All version entities are immutable: once created, they are never changed, only
-superseded by newer versions.
-
-----
-
-= 5.3 Data Use & Review ERD =
-
-The **Data Use** model captures who does what with which versioned data:
-
-* Users (including technical users)
-* Roles and role assignments
-* Review actions on versioned entities
-
-{{comment}} Data Use ERD (Mermaid, from /Specification/Diagrams/Data Use ERD) {{/comment}}
-{{include document="FactHarbor.Playground.Data Use ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Data Use ERD Page (from Specification chat).WebHome"/}}
-
-
-Notes:
-
-* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
-  SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
-  in {{code}}ROLE{{/code}}.
-* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
-  node-to-node federation agents, batch jobs). All other roles can, in principle,
-  be held by both human and technical users where appropriate.
-* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
-  roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
-  do.
-
-----
-
-= 5.4 Versioning and re-evaluation behavior =
-
-This section ties the data model to the re-evaluation logic
-(described in more detail in the Versioning and Automation chapters).
-
-* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
-* All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
-  that evidence version are candidates for re-assessment.
-* Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
-  are queued for re-evaluation.
-
-* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
-* It may inherit some links from earlier scenarios, or start empty depending
-  on the change classification (cosmetic vs. conceptual).
-* All verdicts for that scenario are recalculated and stored as new
-{{code}}VERDICT_VERSION{{/code}} entries.
-
-* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
-  the reviewer. This preserves a faithful audit trail if data later changes.
-
-* In a federated environment, nodes can choose:
-* which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
-* which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
-  only EVIDENCE_VERSIONs above a reliability threshold, etc.)
-
-----
-
-= 5.5 Behavioral Notes =
-
-== 5.5.1 Late-Arriving Evidence ==
-
-New evidence versions can make existing verdicts **outdated** and may trigger
-re-evaluation cascades. This is handled by the global trigger and automation
-architecture (see the Versioning & Automation chapters).
-
-== 5.5.2 Scenario Evolution ==
-
-Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
-Scenario–Evidence links are re-assessed. Old versions remain available for
-historical comparison and reproducibility.
-
-== 5.5.3 Federation ==
-
-Federated nodes can replicate subsets of the graph, including:
-
-* Claims and Scenarios of local interest
-* Evidence metadata (without full content)
-* Verdict lineages used for local decision-making
-
-Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
-replication logs, and trust rules) are described in the Federation &
-Decentralization chapter and build on top of the core data model defined here.
-
-----
-
 == 1. Overall analysis & review of the data model ==
 
 === 1.1 Strengths of the current design ===
@@ -324,3 +324,385 @@
 )))
 * That’s fine for now; I’ll just clarify that those belong to a “Processing / AKEL” submodel, not the core logical data model.
 )))
+
+= 5. Data Model =
+
+The FactHarbor data model centers on four fully versioned, immutable entities:
+
+* **Claim**
+* **Scenario**
+* **Evidence**
+* **Verdict**
+
+These entities form the structured **“truth landscape”** for each claim.
+The model is explicitly **versioned**, **traceable**, and **federation-ready**.
+
+To keep the system auditable and explainable, FactHarbor uses a consistent
+**identity vs. version** pattern:
+
+* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
+  define *what* something is in a stable sense.
+* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
+  define *how that thing looked at a given point in time*.
+
+All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
+mutable identities.
+
+----
+
+= 5.1 Core entities and versioning pattern =
+
+(% class="wikitable" %)
+| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
+| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
+| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
+| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
+| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
+| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
+| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
+
+Key design decisions:
+
+* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
+* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
+  (scenarios live at the *claim* level, not per individual phrasing).
+* Verdicts and Scenario–Evidence links are always attached to **versions**:
+* {{code}}SCENARIO_VERSION{{/code}} +
+{{code}}EVIDENCE_VERSION{{/code}} →
+{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
+* {{code}}SCENARIO_VERSION{{/code}} →
+{{code}}VERDICT_VERSION{{/code}}
+
+This ensures that when a Scenario or Evidence changes, old verdicts and links
+remain intact as historical records and can be revisited.
+
+----
+
+= 5.2 Core Data Model ERD (expanded, versioned) =
+
+The following Mermaid ER diagram shows the main entities and their relationships.
+The convention is that fields ending in {{code}}Id{{/code}} are primary keys,
+and fields with {{code}}...IdFk{{/code}} are foreign keys.
+
+{{mermaid}}
+erDiagram
+    CLAIM_CLUSTER {
+        string ClusterID PK
+        string EmbeddingVectorRef
+        string Theme
+    }
+
+    CLAIM {
+        string ClaimID PK
+        string ClusterID FK
+        string Status
+        datetime CreatedAt
+    }
+
+    CLAIM_VERSION {
+        string ClaimVersionID PK
+        string ClaimID FK
+        string Text
+        string ClaimType
+        string Domain
+        datetime CreatedAt
+    }
+
+    SCENARIO {
+        string ScenarioID PK
+        string ClaimID FK
+        string Name
+        datetime CreatedAt
+    }
+
+    SCENARIO_VERSION {
+        string ScenarioVersionID PK
+        string ScenarioID FK
+        string Definitions
+        string Assumptions
+        string Boundaries
+        datetime CreatedAt
+    }
+
+    EVIDENCE {
+        string EvidenceID PK
+        string SourceType
+        string URL
+        float ReliabilityScore
+    }
+
+    EVIDENCE_VERSION {
+        string EvidenceVersionID PK
+        string EvidenceID FK
+        string Summary
+        float ReliabilityScore
+        datetime CreatedAt
+    }
+
+    SCENARIO_EVIDENCE_LINK {
+        string LinkID PK
+        string ScenarioVersionID FK
+        string EvidenceVersionID FK
+        float Relevance
+        string Direction
+    }
+
+    VERDICT {
+        string VerdictID PK
+        string ScenarioID FK
+    }
+
+    VERDICT_VERSION {
+        string VerdictVersionID PK
+        string VerdictID FK
+        float Verdict
+        float Confidence
+        string Reasoning
+        datetime CreatedAt
+    }
+
+    CLAIM_CLUSTER ||--o{ CLAIM : contains
+    CLAIM ||--o{ CLAIM_VERSION : versions
+
+    CLAIM ||--o{ SCENARIO : has
+    SCENARIO ||--o{ SCENARIO_VERSION : versions
+
+    EVIDENCE ||--o{ EVIDENCE_VERSION : versions
+
+    SCENARIO_VERSION ||--o{ SCENARIO_EVIDENCE_LINK : links
+    EVIDENCE_VERSION ||--o{ SCENARIO_EVIDENCE_LINK : linked
+
+    SCENARIO ||--o{ VERDICT : assessed
+    VERDICT ||--o{ VERDICT_VERSION : versions
+
+{{/mermaid}}
+
+**Important points:**
+
+* Scenarios and Evidence are **linked via their versions**
+  ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
+* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
+* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
+
+All version entities are immutable: once created, they are never changed, only
+superseded by newer versions.
+
+----
+
+= 5.3 Data Use & Review ERD (expanded, versioned) =
+
+The **Data Use** model captures who does what with which versioned data:
+
+* Users (including technical users)
+* Roles and role assignments
+* Review actions on versioned entities
+
+{{mermaid}}
+erDiagram
+    %% Core clusters shown for context
+    CLAIM_CLUSTER {
+        string ClusterID PK
+        string EmbeddingVectorRef
+        string Theme
+    }
+
+    CLAIM {
+        string ClaimID PK
+        string ClusterID FK
+        string Status
+        datetime CreatedAt
+    }
+
+    CLAIM_VERSION {
+        string ClaimVersionID PK
+        string ClaimID FK
+        string Text
+        string ClaimType
+        string Domain
+        datetime CreatedAt
+    }
+
+    SCENARIO {
+        string ScenarioID PK
+        string ClaimID FK
+        string Name
+        datetime CreatedAt
+    }
+
+    SCENARIO_VERSION {
+        string ScenarioVersionID PK
+        string ScenarioID FK
+        string Definitions
+        string Assumptions
+        string Boundaries
+        datetime CreatedAt
+    }
+
+    EVIDENCE {
+        string EvidenceID PK
+        string SourceType
+        string URL
+        float ReliabilityScore
+    }
+
+    EVIDENCE_VERSION {
+        string EvidenceVersionID PK
+        string EvidenceID FK
+        string Summary
+        float ReliabilityScore
+        datetime CreatedAt
+    }
+
+    VERDICT {
+        string VerdictID PK
+        string ScenarioID FK
+    }
+
+    VERDICT_VERSION {
+        string VerdictVersionID PK
+        string VerdictID FK
+        float Verdict
+        float Confidence
+        string Reasoning
+        datetime CreatedAt
+    }
+
+    %% Users and roles
+    USER {
+        string UserID PK
+        string Handle
+        string Email
+    }
+
+    TECHNICAL_USER {
+        string UserID PK
+        string SystemName
+    }
+
+    CONTRIBUTING_USER {
+        string UserID PK
+        string DisplayName
+    }
+
+    TRUSTED_CONTRIBUTOR {
+        string UserID PK
+        string TrustLevel
+    }
+
+    REVIEWER {
+        string UserID PK
+        string Domain
+    }
+
+    EXPERT {
+        string UserID PK
+        string ExpertiseArea
+    }
+
+    FEDERATION_NODE {
+        string NodeID PK
+        string Region
+    }
+
+    FEDERATION_ADMIN {
+        string UserID PK
+        string Permissions
+    }
+
+    REVIEW_ACTION {
+        string ReviewActionID PK
+        string UserID FK
+        string TargetEntityType
+        string TargetEntityVersionID
+        string ActionType
+        string Comment
+        datetime Timestamp
+    }
+
+    %% Inheritance / specialization (modelled as relationships)
+    USER ||--o{ TECHNICAL_USER : "is a"
+    USER ||--o{ CONTRIBUTING_USER : "is a"
+
+    CONTRIBUTING_USER ||--o{ TRUSTED_CONTRIBUTOR : "subset"
+    CONTRIBUTING_USER ||--o{ REVIEWER : "subset"
+    CONTRIBUTING_USER ||--o{ EXPERT : "subset"
+
+    TECHNICAL_USER ||--o{ FEDERATION_NODE : "operates"
+    TECHNICAL_USER ||--o{ FEDERATION_ADMIN : "administers"
+
+    %% Review actions on versioned entities
+    USER ||--o{ REVIEW_ACTION : performs
+
+    REVIEW_ACTION }o--|| CLAIM_VERSION : reviews
+    REVIEW_ACTION }o--|| SCENARIO_VERSION : reviews
+    REVIEW_ACTION }o--|| EVIDENCE_VERSION : reviews
+    REVIEW_ACTION }o--|| VERDICT_VERSION : reviews
+
+{{/mermaid}}
+
+Notes:
+
+* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
+  SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
+  in {{code}}ROLE{{/code}}.
+* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
+  node-to-node federation agents, batch jobs). All other roles can, in principle,
+  be held by both human and technical users where appropriate.
+* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
+  roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
+  do.
+
+----
+
+= 5.4 Versioning and re-evaluation behavior =
+
+This section ties the data model to the re-evaluation logic
+(described in more detail in the Versioning and Automation chapters).
+
+* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
+* All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
+  that evidence version are candidates for re-assessment.
+* Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
+  are queued for re-evaluation.
+
+* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
+* It may inherit some links from earlier scenarios, or start empty depending
+  on the change classification (cosmetic vs. conceptual).
+* All verdicts for that scenario are recalculated and stored as new
+{{code}}VERDICT_VERSION{{/code}} entries.
+
+* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
+  the reviewer. This preserves a faithful audit trail if data later changes.
+
+* In a federated environment, nodes can choose:
+* which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
+* which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
+  only EVIDENCE_VERSIONs above a reliability threshold, etc.)
+
+----
+
+= 5.5 Behavioral Notes =
+
+== 5.5.1 Late-Arriving Evidence ==
+
+New evidence versions can make existing verdicts **outdated** and may trigger
+re-evaluation cascades. This is handled by the global trigger and automation
+architecture (see the Versioning & Automation chapters).
+
+== 5.5.2 Scenario Evolution ==
+
+Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
+Scenario–Evidence links are re-assessed. Old versions remain available for
+historical comparison and reproducibility.
+
+== 5.5.3 Federation ==
+
+Federated nodes can replicate subsets of the graph, including:
+
+* Claims and Scenarios of local interest
+* Evidence metadata (without full content)
+* Verdict lineages used for local decision-making
+
+Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
+replication logs, and trust rules) are described in the Federation &
+Decentralization chapter and build on top of the core data model defined here.
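
Illustration (not part of the page diff): the versioning and re-evaluation behavior described in section 5.4 of the added text can be sketched in Python. This is a minimal sketch under stated assumptions, not FactHarbor's implementation. The class {{code}}VersionStore{{/code}}, its method and attribute names, and the in-memory lists are hypothetical. The {{code}}scenario_version_id{{/code}} field on the verdict record is also an assumption: the ERD links VERDICT to SCENARIO, while the prose attaches verdicts to scenario versions, and the sketch follows the prose. "Related" links are taken to mean links that reference an earlier version of the same evidence identity.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now() -> datetime:
    """Timestamp helper (hypothetical); the ERD only names a CreatedAt field."""
    return datetime.now(timezone.utc)


@dataclass(frozen=True)
class EvidenceVersion:
    """Immutable snapshot of one EVIDENCE identity (subset of the ERD fields)."""
    evidence_version_id: str
    evidence_id: str
    summary: str
    created_at: datetime = field(default_factory=_now)


@dataclass(frozen=True)
class ScenarioEvidenceLink:
    """Binds one scenario version to one evidence version."""
    scenario_version_id: str
    evidence_version_id: str


@dataclass(frozen=True)
class VerdictVersion:
    """Immutable assessment; scenario_version_id is an assumption of this sketch."""
    verdict_version_id: str
    verdict_id: str
    scenario_version_id: str
    confidence: float
    created_at: datetime = field(default_factory=_now)


class VersionStore:
    """Hypothetical append-only, in-memory store for version records."""

    def __init__(self) -> None:
        self.evidence_versions: list[EvidenceVersion] = []
        self.links: list[ScenarioEvidenceLink] = []
        self.verdict_versions: list[VerdictVersion] = []
        self.reevaluation_queue: list[str] = []  # VerdictVersion IDs

    def add_evidence_version(self, new: EvidenceVersion) -> None:
        """Append a new evidence version and queue affected verdicts."""
        # Earlier versions of the same evidence identity.
        earlier_ids = {
            v.evidence_version_id
            for v in self.evidence_versions
            if v.evidence_id == new.evidence_id
        }
        self.evidence_versions.append(new)

        # Scenario versions linked to any earlier version are affected.
        affected = {
            link.scenario_version_id
            for link in self.links
            if link.evidence_version_id in earlier_ids
        }

        # Existing verdict versions are never modified, only queued.
        for verdict in self.verdict_versions:
            if (
                verdict.scenario_version_id in affected
                and verdict.verdict_version_id not in self.reevaluation_queue
            ):
                self.reevaluation_queue.append(verdict.verdict_version_id)


if __name__ == "__main__":
    store = VersionStore()
    store.add_evidence_version(EvidenceVersion("ev-1", "e-1", "first extraction"))
    store.links.append(ScenarioEvidenceLink("sv-1", "ev-1"))
    store.verdict_versions.append(VerdictVersion("vv-1", "v-1", "sv-1", 0.8))
    store.add_evidence_version(EvidenceVersion("ev-2", "e-1", "updated extraction"))
    print(store.reevaluation_queue)  # ['vv-1']
```

The frozen dataclasses stand in for the rule that version entities are never changed, only superseded. The queue holds verdict version IDs rather than mutating the verdicts, so old records stay intact for audit.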