Last modified by Robert Schaub on 2025/12/24 20:35

From version 4.1
edited by Robert Schaub
on 2025/11/27 12:11
Change comment: There is no comment for this version
To version 8.1
edited by Robert Schaub
on 2025/11/27 12:55
Change comment: There is no comment for this version

Details

Page properties
Content
... ... @@ -1,3 +1,162 @@
1 +(((
2 +
3 +)))
4 +
5 += 5. Data Model =
6 +
7 +The FactHarbor data model centers on four fully versioned entities, each stored as a series of immutable versions:
8 +
9 +* **Claim**
10 +* **Scenario**
11 +* **Evidence**
12 +* **Verdict**
13 +
14 +These entities form the structured **“truth landscape”** for each claim.
15 +The model is explicitly **versioned**, **traceable**, and **federation-ready**.
16 +
17 +To keep the system auditable and explainable, FactHarbor uses a consistent
18 +**identity vs. version** pattern:
19 +
20 +* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
21 + define *what* something is in a stable sense.
22 +* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
23 + define *how that thing looked at a given point in time*.
24 +
25 +All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
26 +mutable identities.
27 +
28 +----
29 +
30 += 5.1 Core entities and versioning pattern =
31 +
32 +(% class="wikitable" %)
33 +| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
34 +| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
35 +| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
36 +| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
37 +| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
38 +| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
39 +| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
40 +
41 +Key design decisions:
42 +
43 +* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
44 +* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
45 + (scenarios live at the *claim* level, not per individual phrasing).
46 +* Verdicts and Scenario–Evidence links are always attached to **versions**:
47 +** {{code}}SCENARIO_VERSION{{/code}} +
48 +{{code}}EVIDENCE_VERSION{{/code}} →
49 +{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
50 +** {{code}}SCENARIO_VERSION{{/code}} →
51 +{{code}}VERDICT_VERSION{{/code}}
52 +
53 +This ensures that when a Scenario or Evidence changes, old verdicts and links
54 +remain intact as historical records and can be revisited.
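As an illustration of the identity vs. version pattern described above, the following Python sketch models a scenario identity, two of its versions, and a verdict version bound to the first one. The class names, the snake_case field names, and the {{code}}revise{{/code}} helper are assumptions made for this example and are not part of the FactHarbor schema. For brevity, the sketch stores the scenario version reference directly on the verdict version.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)
class Scenario:
    """Identity entity: says *what* the scenario is, stable across versions."""
    scenario_id: str
    claim_id: str


@dataclass(frozen=True)
class ScenarioVersion:
    """Version entity: an immutable snapshot of the scenario at one point in time."""
    scenario_version_id: str
    scenario_id: str
    definitions: str
    created_at: datetime


@dataclass(frozen=True)
class VerdictVersion:
    """Reasoning is attached to a scenario *version*, not to the identity."""
    verdict_version_id: str
    scenario_version_id: str  # assumption: direct reference, for brevity
    verdict: float
    confidence: float


def revise(previous: ScenarioVersion, new_id: str, definitions: str) -> ScenarioVersion:
    """Create a superseding version; the previous version is left untouched."""
    return ScenarioVersion(new_id, previous.scenario_id, definitions,
                           datetime.now(timezone.utc))


scenario = Scenario("s1", "c1")
v1 = ScenarioVersion("sv1", scenario.scenario_id, "narrow definition",
                     datetime.now(timezone.utc))
old_verdict = VerdictVersion("vv1", v1.scenario_version_id, 0.7, 0.6)

# A change to the scenario produces a new version instead of editing v1.
v2 = revise(v1, "sv2", "broader definition")
```

Because the dataclasses are frozen, an attempt to assign to {{code}}v1.definitions{{/code}} raises {{code}}FrozenInstanceError{{/code}}, and {{code}}old_verdict{{/code}} keeps pointing at {{code}}sv1{{/code}} after {{code}}sv2{{/code}} exists.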
55 +
56 +----
57 +
58 += 5.2 Core Data Model ERD (expanded, versioned) =
59 +
60 +The following Mermaid ER diagram shows the main entities and their relationships.
61 +Fields marked {{code}}PK{{/code}} are primary keys,
62 +and fields marked {{code}}FK{{/code}} are foreign keys.
63 +
64 +{{comment}} Core Data Model ERD (Mermaid, from /Specification/Diagrams/Data Model) {{/comment}}
65 +{{include document="FactHarbor.Playground.Core Data Model ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Core Data Model ERD Page (from Specification chat).WebHome"/}}
66 +
67 +**Important points:**
68 +
69 +* Scenarios and Evidence are **linked via their versions**
70 + ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
71 +* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
72 +* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
73 +
74 +All version entities are immutable: once created, they are never changed, only
75 +superseded by newer versions.
76 +
77 +----
78 +
79 += 5.3 Data Use & Review ERD =
80 +
81 +The **Data Use** model captures who does what with which versioned data:
82 +
83 +* Users (including technical users)
84 +* Roles and role assignments
85 +* Review actions on versioned entities
86 +
87 +{{comment}} Data Use ERD (Mermaid, from /Specification/Diagrams/Data Use ERD) {{/comment}}
88 +{{include document="FactHarbor.Playground.Data Use ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Data Use ERD Page (from Specification chat).WebHome"/}}
89 +
90 +
91 +Notes:
92 +
93 +* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
94 + SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
95 + in {{code}}ROLE{{/code}}.
96 +* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
97 + node-to-node federation agents, batch jobs). All other roles can, in principle,
98 + be held by both human and technical users where appropriate.
99 +* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
100 + roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
101 + do.
102 +
103 +----
104 +
105 += 5.4 Versioning and re-evaluation behavior =
106 +
107 +This section ties the data model to the re-evaluation logic
108 +(described in more detail in the Versioning and Automation chapters).
109 +
110 +* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
111 +** All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
112 + that evidence version are candidates for re-assessment.
113 +** Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
114 + are queued for re-evaluation.
115 +
116 +* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
117 +** It may inherit some links from earlier versions of the scenario, or start empty, depending
118 + on the change classification (cosmetic vs. conceptual).
119 +** All verdicts for that scenario are recalculated and stored as new
120 +{{code}}VERDICT_VERSION{{/code}} entries.
121 +
122 +* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
123 + the reviewer. This preserves a faithful audit trail if data later changes.
124 +
125 +* In a federated environment, nodes can choose:
126 +** which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
127 +** which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
128 + only EVIDENCE_VERSIONs above a reliability threshold, etc.)
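The replication choices in the last bullet can be pictured as a simple filter. The sketch below is only one possible reading: {{code}}ReplicationPolicy{{/code}}, {{code}}select_for_replication{{/code}}, and the {{code}}status{{/code}} field on verdict versions are hypothetical names introduced for this example, and the threshold is treated as inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceVersion:
    evidence_version_id: str
    reliability_score: float


@dataclass(frozen=True)
class VerdictVersion:
    verdict_version_id: str
    status: str  # hypothetical field, e.g. "accepted" or "draft"


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical per-node settings for what to pull from peers."""
    min_reliability: float
    accepted_verdicts_only: bool = True


def select_for_replication(policy, evidence_versions, verdict_versions):
    """Return the evidence and verdict versions this node would replicate."""
    evidence = [e for e in evidence_versions
                if e.reliability_score >= policy.min_reliability]
    verdicts = [v for v in verdict_versions
                if not policy.accepted_verdicts_only or v.status == "accepted"]
    return evidence, verdicts


evidence = [EvidenceVersion("ev1", 0.9), EvidenceVersion("ev2", 0.4)]
verdicts = [VerdictVersion("vv1", "accepted"), VerdictVersion("vv2", "draft")]

policy = ReplicationPolicy(min_reliability=0.5)
selected_evidence, selected_verdicts = select_for_replication(policy, evidence, verdicts)
```

With this policy the node would keep {{code}}ev1{{/code}} and {{code}}vv1{{/code}} and skip the low-reliability evidence and the draft verdict.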
129 +
130 +----
131 +
132 += 5.5 Behavioral Notes =
133 +
134 +== 5.5.1 Late-Arriving Evidence ==
135 +
136 +New evidence versions can make existing verdicts **outdated** and may trigger
137 +re-evaluation cascades. This is handled by the global trigger and automation
138 +architecture (see the Versioning & Automation chapters).
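A minimal sketch of how such a cascade could identify its candidates is shown below. It assumes that a new evidence version affects links that reference other versions of the same evidence identity, and that each verdict version references its scenario version directly; both are simplifications for this example. The function name {{code}}reevaluation_candidates{{/code}} and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceVersion:
    evidence_version_id: str
    evidence_id: str  # identity this version belongs to


@dataclass(frozen=True)
class LinkVersion:
    link_version_id: str
    scenario_version_id: str
    evidence_version_id: str


@dataclass(frozen=True)
class VerdictVersion:
    verdict_version_id: str
    scenario_version_id: str  # assumption: direct reference, for brevity


def reevaluation_candidates(new_evidence, evidence_versions, links, verdicts):
    """Return (links to re-assess, verdict versions to mark as outdated)."""
    # All other versions of the same evidence identity (assumed to be earlier ones).
    superseded = {
        ev.evidence_version_id
        for ev in evidence_versions
        if ev.evidence_id == new_evidence.evidence_id
        and ev.evidence_version_id != new_evidence.evidence_version_id
    }
    stale_links = [link for link in links
                   if link.evidence_version_id in superseded]
    affected_scenarios = {link.scenario_version_id for link in stale_links}
    outdated = [v for v in verdicts
                if v.scenario_version_id in affected_scenarios]
    return stale_links, outdated


evidence_versions = [EvidenceVersion("e1v1", "e1"), EvidenceVersion("e2v1", "e2")]
links = [LinkVersion("l1", "sv1", "e1v1"), LinkVersion("l2", "sv2", "e2v1")]
verdicts = [VerdictVersion("vv1", "sv1"), VerdictVersion("vv2", "sv2")]

# A second version of evidence "e1" arrives late.
new_evidence = EvidenceVersion("e1v2", "e1")
stale_links, outdated = reevaluation_candidates(
    new_evidence, evidence_versions, links, verdicts)
```

Nothing is modified in place: the function only reports which existing records should be queued, so the old links and verdicts stay available as history.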
139 +
140 +== 5.5.2 Scenario Evolution ==
141 +
142 +Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
143 +Scenario–Evidence links are re-assessed. Old versions remain available for
144 +historical comparison and reproducibility.
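The link handling for a new scenario version can be sketched as follows. The mapping used here (a cosmetic change carries all links forward, a conceptual change starts empty) is an assumption for illustration; the specification only states that inheritance depends on the change classification. {{code}}ChangeClass{{/code}} and {{code}}links_for_new_version{{/code}} are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeClass(Enum):
    COSMETIC = "cosmetic"
    CONCEPTUAL = "conceptual"


@dataclass(frozen=True)
class LinkVersion:
    scenario_version_id: str
    evidence_version_id: str
    relevance: float
    direction: str


def links_for_new_version(new_scenario_version_id, previous_links, change):
    """Build the initial link set for a new scenario version."""
    if change is ChangeClass.CONCEPTUAL:
        # Assumed rule: the frame changed in substance, so links are re-assessed from scratch.
        return []
    # Assumed rule: cosmetic edits carry every link forward as a *new* link version.
    return [
        LinkVersion(new_scenario_version_id, link.evidence_version_id,
                    link.relevance, link.direction)
        for link in previous_links
    ]


old_links = [LinkVersion("sv1", "e1v1", 0.8, "supports")]
inherited = links_for_new_version("sv2", old_links, ChangeClass.COSMETIC)
fresh = links_for_new_version("sv2", old_links, ChangeClass.CONCEPTUAL)
```

In both cases {{code}}old_links{{/code}} is left unchanged, which keeps the earlier scenario version reproducible.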
145 +
146 +== 5.5.3 Federation ==
147 +
148 +Federated nodes can replicate subsets of the graph, including:
149 +
150 +* Claims and Scenarios of local interest
151 +* Evidence metadata (without full content)
152 +* Verdict lineages used for local decision-making
153 +
154 +Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
155 +replication logs, and trust rules) are described in the Federation &
156 +Decentralization chapter and build on top of the core data model defined here.
157 +
158 +----
159 +
1 1  == 1. Overall analysis & review of the data model ==
2 2  
3 3  === 1.1 Strengths of the current design ===
... ... @@ -165,385 +165,3 @@
165 165  )))
166 166  * That’s fine for now; I’ll just clarify that those belong to a “Processing / AKEL” submodel, not the core logical data model.
167 167  )))
168 -
169 -= 5. Data Model =
170 -
171 -The FactHarbor data model centers on four fully versioned entities, each stored as a series of immutable versions:
172 -
173 -* **Claim**
174 -* **Scenario**
175 -* **Evidence**
176 -* **Verdict**
177 -
178 -These entities form the structured **“truth landscape”** for each claim.
179 -The model is explicitly **versioned**, **traceable**, and **federation-ready**.
180 -
181 -To keep the system auditable and explainable, FactHarbor uses a consistent
182 -**identity vs. version** pattern:
183 -
184 -* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
185 - define *what* something is in a stable sense.
186 -* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
187 - define *how that thing looked at a given point in time*.
188 -
189 -All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
190 -mutable identities.
191 -
192 -----
193 -
194 -= 5.1 Core entities and versioning pattern =
195 -
196 -(% class="wikitable" %)
197 -| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
198 -| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
199 -| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
200 -| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
201 -| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
202 -| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
203 -| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
204 -
205 -Key design decisions:
206 -
207 -* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
208 -* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
209 - (scenarios live at the *claim* level, not per individual phrasing).
210 -* Verdicts and Scenario–Evidence links are always attached to **versions**:
211 -** {{code}}SCENARIO_VERSION{{/code}} +
212 -{{code}}EVIDENCE_VERSION{{/code}} →
213 -{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
214 -** {{code}}SCENARIO_VERSION{{/code}} →
215 -{{code}}VERDICT_VERSION{{/code}}
216 -
217 -This ensures that when a Scenario or Evidence changes, old verdicts and links
218 -remain intact as historical records and can be revisited.
219 -
220 -----
221 -
222 -= 5.2 Core Data Model ERD (expanded, versioned) =
223 -
224 -The following Mermaid ER diagram shows the main entities and their relationships.
225 -Fields marked {{code}}PK{{/code}} are primary keys,
226 -and fields marked {{code}}FK{{/code}} are foreign keys.
227 -
228 -{{mermaid}}
229 -erDiagram
230 - CLAIM_CLUSTER {
231 - string ClusterID PK
232 - string EmbeddingVectorRef
233 - string Theme
234 - }
235 -
236 - CLAIM {
237 - string ClaimID PK
238 - string ClusterID FK
239 - string Status
240 - datetime CreatedAt
241 - }
242 -
243 - CLAIM_VERSION {
244 - string ClaimVersionID PK
245 - string ClaimID FK
246 - string Text
247 - string ClaimType
248 - string Domain
249 - datetime CreatedAt
250 - }
251 -
252 - SCENARIO {
253 - string ScenarioID PK
254 - string ClaimID FK
255 - string Name
256 - datetime CreatedAt
257 - }
258 -
259 - SCENARIO_VERSION {
260 - string ScenarioVersionID PK
261 - string ScenarioID FK
262 - string Definitions
263 - string Assumptions
264 - string Boundaries
265 - datetime CreatedAt
266 - }
267 -
268 - EVIDENCE {
269 - string EvidenceID PK
270 - string SourceType
271 - string URL
272 - float ReliabilityScore
273 - }
274 -
275 - EVIDENCE_VERSION {
276 - string EvidenceVersionID PK
277 - string EvidenceID FK
278 - string Summary
279 - float ReliabilityScore
280 - datetime CreatedAt
281 - }
282 -
283 - SCENARIO_EVIDENCE_LINK {
284 - string LinkID PK
285 - string ScenarioVersionID FK
286 - string EvidenceVersionID FK
287 - float Relevance
288 - string Direction
289 - }
290 -
291 - VERDICT {
292 - string VerdictID PK
293 - string ScenarioID FK
294 - }
295 -
296 - VERDICT_VERSION {
297 - string VerdictVersionID PK
298 - string VerdictID FK
299 - float Verdict
300 - float Confidence
301 - string Reasoning
302 - datetime CreatedAt
303 - }
304 -
305 - CLAIM_CLUSTER ||--o{ CLAIM : contains
306 - CLAIM ||--o{ CLAIM_VERSION : versions
307 -
308 - CLAIM ||--o{ SCENARIO : has
309 - SCENARIO ||--o{ SCENARIO_VERSION : versions
310 -
311 - EVIDENCE ||--o{ EVIDENCE_VERSION : versions
312 -
313 - SCENARIO_VERSION ||--o{ SCENARIO_EVIDENCE_LINK : links
314 - EVIDENCE_VERSION ||--o{ SCENARIO_EVIDENCE_LINK : linked
315 -
316 - SCENARIO ||--o{ VERDICT : assessed
317 - VERDICT ||--o{ VERDICT_VERSION : versions
318 -
319 -{{/mermaid}}
320 -
321 -**Important points:**
322 -
323 -* Scenarios and Evidence are **linked via their versions**
324 - ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
325 -* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
326 -* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
327 -
328 -All version entities are immutable: once created, they are never changed, only
329 -superseded by newer versions.
330 -
331 -----
332 -
333 -= 5.3 Data Use & Review ERD (expanded, versioned) =
334 -
335 -The **Data Use** model captures who does what with which versioned data:
336 -
337 -* Users (including technical users)
338 -* Roles and role assignments
339 -* Review actions on versioned entities
340 -
341 -{{mermaid}}
342 -erDiagram
343 - %% Core clusters shown for context
344 - CLAIM_CLUSTER {
345 - string ClusterID PK
346 - string EmbeddingVectorRef
347 - string Theme
348 - }
349 -
350 - CLAIM {
351 - string ClaimID PK
352 - string ClusterID FK
353 - string Status
354 - datetime CreatedAt
355 - }
356 -
357 - CLAIM_VERSION {
358 - string ClaimVersionID PK
359 - string ClaimID FK
360 - string Text
361 - string ClaimType
362 - string Domain
363 - datetime CreatedAt
364 - }
365 -
366 - SCENARIO {
367 - string ScenarioID PK
368 - string ClaimID FK
369 - string Name
370 - datetime CreatedAt
371 - }
372 -
373 - SCENARIO_VERSION {
374 - string ScenarioVersionID PK
375 - string ScenarioID FK
376 - string Definitions
377 - string Assumptions
378 - string Boundaries
379 - datetime CreatedAt
380 - }
381 -
382 - EVIDENCE {
383 - string EvidenceID PK
384 - string SourceType
385 - string URL
386 - float ReliabilityScore
387 - }
388 -
389 - EVIDENCE_VERSION {
390 - string EvidenceVersionID PK
391 - string EvidenceID FK
392 - string Summary
393 - float ReliabilityScore
394 - datetime CreatedAt
395 - }
396 -
397 - VERDICT {
398 - string VerdictID PK
399 - string ScenarioID FK
400 - }
401 -
402 - VERDICT_VERSION {
403 - string VerdictVersionID PK
404 - string VerdictID FK
405 - float Verdict
406 - float Confidence
407 - string Reasoning
408 - datetime CreatedAt
409 - }
410 -
411 - %% Users and roles
412 - USER {
413 - string UserID PK
414 - string Handle
415 - string Email
416 - }
417 -
418 - TECHNICAL_USER {
419 - string UserID PK
420 - string SystemName
421 - }
422 -
423 - CONTRIBUTING_USER {
424 - string UserID PK
425 - string DisplayName
426 - }
427 -
428 - TRUSTED_CONTRIBUTOR {
429 - string UserID PK
430 - string TrustLevel
431 - }
432 -
433 - REVIEWER {
434 - string UserID PK
435 - string Domain
436 - }
437 -
438 - EXPERT {
439 - string UserID PK
440 - string ExpertiseArea
441 - }
442 -
443 - FEDERATION_NODE {
444 - string NodeID PK
445 - string Region
446 - }
447 -
448 - FEDERATION_ADMIN {
449 - string UserID PK
450 - string Permissions
451 - }
452 -
453 - REVIEW_ACTION {
454 - string ReviewActionID PK
455 - string UserID FK
456 - string TargetEntityType
457 - string TargetEntityVersionID
458 - string ActionType
459 - string Comment
460 - datetime Timestamp
461 - }
462 -
463 - %% Inheritance / specialization (modelled as relationships)
464 - USER ||--o{ TECHNICAL_USER : "is a"
465 - USER ||--o{ CONTRIBUTING_USER : "is a"
466 -
467 - CONTRIBUTING_USER ||--o{ TRUSTED_CONTRIBUTOR : "subset"
468 - CONTRIBUTING_USER ||--o{ REVIEWER : "subset"
469 - CONTRIBUTING_USER ||--o{ EXPERT : "subset"
470 -
471 - TECHNICAL_USER ||--o{ FEDERATION_NODE : "operates"
472 - TECHNICAL_USER ||--o{ FEDERATION_ADMIN : "administers"
473 -
474 - %% Review actions on versioned entities
475 - USER ||--o{ REVIEW_ACTION : performs
476 -
477 - REVIEW_ACTION }o--|| CLAIM_VERSION : reviews
478 - REVIEW_ACTION }o--|| SCENARIO_VERSION : reviews
479 - REVIEW_ACTION }o--|| EVIDENCE_VERSION : reviews
480 - REVIEW_ACTION }o--|| VERDICT_VERSION : reviews
481 -
482 -{{/mermaid}}
483 -
484 -Notes:
485 -
486 -* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
487 - SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
488 - in {{code}}ROLE{{/code}}.
489 -* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
490 - node-to-node federation agents, batch jobs). All other roles can, in principle,
491 - be held by both human and technical users where appropriate.
492 -* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
493 - roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
494 - do.
495 -
496 -----
497 -
498 -= 5.4 Versioning and re-evaluation behavior =
499 -
500 -This section ties the data model to the re-evaluation logic
501 -(described in more detail in the Versioning and Automation chapters).
502 -
503 -* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
504 -** All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
505 - that evidence version are candidates for re-assessment.
506 -** Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
507 - are queued for re-evaluation.
508 -
509 -* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
510 -** It may inherit some links from earlier versions of the scenario, or start empty, depending
511 - on the change classification (cosmetic vs. conceptual).
512 -** All verdicts for that scenario are recalculated and stored as new
513 -{{code}}VERDICT_VERSION{{/code}} entries.
514 -
515 -* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
516 - the reviewer. This preserves a faithful audit trail if data later changes.
517 -
518 -* In a federated environment, nodes can choose:
519 -** which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
520 -** which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
521 - only EVIDENCE_VERSIONs above a reliability threshold, etc.)
522 -
523 -----
524 -
525 -= 5.5 Behavioral Notes =
526 -
527 -== 5.5.1 Late-Arriving Evidence ==
528 -
529 -New evidence versions can make existing verdicts **outdated** and may trigger
530 -re-evaluation cascades. This is handled by the global trigger and automation
531 -architecture (see the Versioning & Automation chapters).
532 -
533 -== 5.5.2 Scenario Evolution ==
534 -
535 -Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
536 -Scenario–Evidence links are re-assessed. Old versions remain available for
537 -historical comparison and reproducibility.
538 -
539 -== 5.5.3 Federation ==
540 -
541 -Federated nodes can replicate subsets of the graph, including:
542 -
543 -* Claims and Scenarios of local interest
544 -* Evidence metadata (without full content)
545 -* Verdict lineages used for local decision-making
546 -
547 -Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
548 -replication logs, and trust rules) are described in the Federation &
549 -Decentralization chapter and build on top of the core data model defined here.