Last modified by Robert Schaub on 2025/12/24 20:35

From version 8.1
edited by Robert Schaub
on 2025/11/27 12:55
Change comment: There is no comment for this version
To version 4.1
edited by Robert Schaub
on 2025/11/27 12:11
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -1,162 +160,3 @@
1 -(((
2 -
3 -)))
4 -
5 -= 5. Data Model =
6 -
7 -The FactHarbor data model centers on four fully versioned, immutable entities:
8 -
9 -* **Claim**
10 -* **Scenario**
11 -* **Evidence**
12 -* **Verdict**
13 -
14 -These entities form the structured **“truth landscape”** for each claim.
15 -The model is explicitly **versioned**, **traceable**, and **federation-ready**.
16 -
17 -To keep the system auditable and explainable, FactHarbor uses a consistent
18 -**identity vs. version** pattern:
19 -
20 -* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
21 - define *what* something is in a stable sense.
22 -* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
23 - define *how that thing looked at a given point in time*.
24 -
25 -All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
26 -mutable identities.
27 -
28 -----
29 -
30 -= 5.1 Core entities and versioning pattern =
31 -
32 -(% class="wikitable" %)
33 -| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
34 -| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
35 -| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
36 -| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
37 -| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
38 -| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
39 -| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
40 -
41 -Key design decisions:
42 -
43 -* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
44 -* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
45 - (scenarios live at the *claim* level, not per individual phrasing).
46 -* Verdicts and Scenario–Evidence links are always attached to **versions**:
47 -* {{code}}SCENARIO_VERSION{{/code}} +
48 -{{code}}EVIDENCE_VERSION{{/code}} →
49 -{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
50 -* {{code}}SCENARIO_VERSION{{/code}} →
51 -{{code}}VERDICT_VERSION{{/code}}
52 -
53 -This ensures that when a Scenario or Evidence changes, old verdicts and links
54 -remain intact as historical records and can be revisited.
55 -
56 -----
57 -
58 -= 5.2 Core Data Model ERD (expanded, versioned) =
59 -
60 -The following Mermaid ER diagram shows the main entities and their relationships.
61 -The convention is that fields ending in {{code}}Id{{/code}} are primary keys,
62 -and fields with {{code}}...IdFk{{/code}} are foreign keys.
63 -
64 -{{comment}} Core Data Model ERD (Mermaid, from /Specification/Diagrams/Data Model) {{/comment}}
65 -{{include document="FactHarbor.Playground.Core Data Model ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Core Data Model ERD Page (from Specification chat).WebHome"/}}
66 -
67 -**Important points:**
68 -
69 -* Scenarios and Evidence are **linked via their versions**
70 - ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
71 -* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
72 -* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
73 -
74 -All version entities are immutable: once created, they are never changed, only
75 -superseded by newer versions.
76 -
77 -----
78 -
79 -= 5.3 Data Use & Review ERD =
80 -
81 -The **Data Use** model captures who does what with which versioned data:
82 -
83 -* Users (including technical users)
84 -* Roles and role assignments
85 -* Review actions on versioned entities
86 -
87 -{{comment}} Data Use ERD (Mermaid, from /Specification/Diagrams/Data Use ERD) {{/comment}}
88 -{{include document="FactHarbor.Playground.Data Use ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Data Use ERD Page (from Specification chat).WebHome"/}}
89 -
90 -
91 -Notes:
92 -
93 -* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
94 - SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
95 - in {{code}}ROLE{{/code}}.
96 -* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
97 - node-to-node federation agents, batch jobs). All other roles can, in principle,
98 - be held by both human and technical users where appropriate.
99 -* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
100 - roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
101 - do.
102 -
103 -----
104 -
105 -= 5.4 Versioning and re-evaluation behavior =
106 -
107 -This section ties the data model to the re-evaluation logic
108 -(described in more detail in the Versioning and Automation chapters).
109 -
110 -* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
111 -* All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
112 - that evidence version are candidates for re-assessment.
113 -* Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
114 - are queued for re-evaluation.
115 -
116 -* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
117 -* It may inherit some links from earlier scenarios, or start empty depending
118 - on the change classification (cosmetic vs. conceptual).
119 -* All verdicts for that scenario are recalculated and stored as new
120 -{{code}}VERDICT_VERSION{{/code}} entries.
121 -
122 -* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
123 - the reviewer. This preserves a faithful audit trail if data later changes.
124 -
125 -* In a federated environment, nodes can choose:
126 -* which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
127 -* which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
128 - only EVIDENCE_VERSIONs above a reliability threshold, etc.)
129 -
130 -----
131 -
132 -= 5.5 Behavioral Notes =
133 -
134 -== 5.5.1 Late-Arriving Evidence ==
135 -
136 -New evidence versions can make existing verdicts **outdated** and may trigger
137 -re-evaluation cascades. This is handled by the global trigger and automation
138 -architecture (see the Versioning & Automation chapters).
139 -
140 -== 5.5.2 Scenario Evolution ==
141 -
142 -Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
143 -Scenario–Evidence links are re-assessed. Old versions remain available for
144 -historical comparison and reproducibility.
145 -
146 -== 5.5.3 Federation ==
147 -
148 -Federated nodes can replicate subsets of the graph, including:
149 -
150 -* Claims and Scenarios of local interest
151 -* Evidence metadata (without full content)
152 -* Verdict lineages used for local decision-making
153 -
154 -Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
155 -replication logs, and trust rules) are described in the Federation &
156 -Decentralization chapter and build on top of the core data model defined here.
157 -
158 -----
159 -
160 160  == 1. Overall analysis & review of the data model ==
161 161  
162 162  === 1.1 Strengths of the current design ===
... ... @@ -324,3 +324,385 @@
324 324  )))
325 325  * That’s fine for now; I’ll just clarify that those belong to a “Processing / AKEL” submodel, not the core logical data model.
326 326  )))
168 +
169 += 5. Data Model =
170 +
171 +The FactHarbor data model centers on four fully versioned, immutable entities:
172 +
173 +* **Claim**
174 +* **Scenario**
175 +* **Evidence**
176 +* **Verdict**
177 +
178 +These entities form the structured **“truth landscape”** for each claim.
179 +The model is explicitly **versioned**, **traceable**, and **federation-ready**.
180 +
181 +To keep the system auditable and explainable, FactHarbor uses a consistent
182 +**identity vs. version** pattern:
183 +
184 +* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
185 + define *what* something is in a stable sense.
186 +* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
187 + define *how that thing looked at a given point in time*.
188 +
189 +All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
190 +mutable identities.
191 +
192 +----
193 +
194 += 5.1 Core entities and versioning pattern =
195 +
196 +(% class="wikitable" %)
197 +| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
198 +| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
199 +| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
200 +| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
201 +| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
202 +| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
203 +| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
204 +
205 +Key design decisions:
206 +
207 +* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
208 +* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
209 + (scenarios live at the *claim* level, not per individual phrasing).
210 +* Verdicts and Scenario–Evidence links are always attached to **versions**:
211 +** {{code}}SCENARIO_VERSION{{/code}} +
212 + {{code}}EVIDENCE_VERSION{{/code}} →
213 + {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
214 +** {{code}}SCENARIO_VERSION{{/code}} →
215 + {{code}}VERDICT_VERSION{{/code}}
216 +
217 +This ensures that when a Scenario or Evidence changes, old verdicts and links
218 +remain intact as historical records and can be revisited.
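The identity vs. version pattern can be illustrated with a minimal Python sketch. It is not FactHarbor's implementation: the class {{code}}ClaimHistory{{/code}}, its methods, and the version-ID format are hypothetical names chosen for illustration. The snake_case fields loosely mirror the ERD attributes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Claim:
    """Identity entity: stays stable across rephrasings."""
    claim_id: str
    cluster_id: str


@dataclass(frozen=True)
class ClaimVersion:
    """Version entity: how the claim looked at one point in time."""
    claim_version_id: str
    claim_id: str
    text: str
    created_at: datetime


class ClaimHistory:
    """Append-only version list for one claim identity (hypothetical helper)."""

    def __init__(self, claim: Claim) -> None:
        self.claim = claim
        self._versions: list[ClaimVersion] = []

    def add_version(self, text: str) -> ClaimVersion:
        # A change never edits an existing version; it appends a new one.
        version = ClaimVersion(
            claim_version_id=f"{self.claim.claim_id}-v{len(self._versions) + 1}",
            claim_id=self.claim.claim_id,
            text=text,
            created_at=datetime.now(timezone.utc),
        )
        self._versions.append(version)
        return version

    def latest(self) -> ClaimVersion:
        return self._versions[-1]

    def all_versions(self) -> tuple[ClaimVersion, ...]:
        return tuple(self._versions)
```

Because the dataclasses are frozen, assigning to a field of an existing version raises an error, so older versions stay available as historical records.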
219 +
220 +----
221 +
222 += 5.2 Core Data Model ERD (expanded, versioned) =
223 +
224 +The following Mermaid ER diagram shows the main entities and their relationships.
225 +In the diagram, fields marked {{code}}PK{{/code}} are primary keys,
226 +and fields marked {{code}}FK{{/code}} are foreign keys.
227 +
228 +{{mermaid}}
229 +erDiagram
230 + CLAIM_CLUSTER {
231 + string ClusterID PK
232 + string EmbeddingVectorRef
233 + string Theme
234 + }
235 +
236 + CLAIM {
237 + string ClaimID PK
238 + string ClusterID FK
239 + string Status
240 + datetime CreatedAt
241 + }
242 +
243 + CLAIM_VERSION {
244 + string ClaimVersionID PK
245 + string ClaimID FK
246 + string Text
247 + string ClaimType
248 + string Domain
249 + datetime CreatedAt
250 + }
251 +
252 + SCENARIO {
253 + string ScenarioID PK
254 + string ClaimID FK
255 + string Name
256 + datetime CreatedAt
257 + }
258 +
259 + SCENARIO_VERSION {
260 + string ScenarioVersionID PK
261 + string ScenarioID FK
262 + string Definitions
263 + string Assumptions
264 + string Boundaries
265 + datetime CreatedAt
266 + }
267 +
268 + EVIDENCE {
269 + string EvidenceID PK
270 + string SourceType
271 + string URL
272 + float ReliabilityScore
273 + }
274 +
275 + EVIDENCE_VERSION {
276 + string EvidenceVersionID PK
277 + string EvidenceID FK
278 + string Summary
279 + float ReliabilityScore
280 + datetime CreatedAt
281 + }
282 +
283 + SCENARIO_EVIDENCE_LINK {
284 + string LinkID PK
285 + string ScenarioVersionID FK
286 + string EvidenceVersionID FK
287 + float Relevance
288 + string Direction
289 + }
290 +
291 + VERDICT {
292 + string VerdictID PK
293 + string ScenarioID FK
294 + }
295 +
296 + VERDICT_VERSION {
297 + string VerdictVersionID PK
298 + string VerdictID FK
299 + float Verdict
300 + float Confidence
301 + string Reasoning
302 + datetime CreatedAt
303 + }
304 +
305 + CLAIM_CLUSTER ||--o{ CLAIM : contains
306 + CLAIM ||--o{ CLAIM_VERSION : versions
307 +
308 + CLAIM ||--o{ SCENARIO : has
309 + SCENARIO ||--o{ SCENARIO_VERSION : versions
310 +
311 + EVIDENCE ||--o{ EVIDENCE_VERSION : versions
312 +
313 + SCENARIO_VERSION ||--o{ SCENARIO_EVIDENCE_LINK : links
314 + EVIDENCE_VERSION ||--o{ SCENARIO_EVIDENCE_LINK : linked
315 +
316 + SCENARIO ||--o{ VERDICT : assessed
317 + VERDICT ||--o{ VERDICT_VERSION : versions
318 +
319 +{{/mermaid}}
320 +
321 +**Important points:**
322 +
323 +* Scenarios and Evidence are **linked via their versions**
324 + ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
325 +* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
326 +* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
327 +
328 +All version entities are immutable: once created, they are never changed, only
329 +superseded by newer versions.
330 +
331 +----
332 +
333 += 5.3 Data Use & Review ERD (expanded, versioned) =
334 +
335 +The **Data Use** model captures who does what with which versioned data:
336 +
337 +* Users (including technical users)
338 +* Roles and role assignments
339 +* Review actions on versioned entities
340 +
341 +{{mermaid}}
342 +erDiagram
343 + %% Core clusters shown for context
344 + CLAIM_CLUSTER {
345 + string ClusterID PK
346 + string EmbeddingVectorRef
347 + string Theme
348 + }
349 +
350 + CLAIM {
351 + string ClaimID PK
352 + string ClusterID FK
353 + string Status
354 + datetime CreatedAt
355 + }
356 +
357 + CLAIM_VERSION {
358 + string ClaimVersionID PK
359 + string ClaimID FK
360 + string Text
361 + string ClaimType
362 + string Domain
363 + datetime CreatedAt
364 + }
365 +
366 + SCENARIO {
367 + string ScenarioID PK
368 + string ClaimID FK
369 + string Name
370 + datetime CreatedAt
371 + }
372 +
373 + SCENARIO_VERSION {
374 + string ScenarioVersionID PK
375 + string ScenarioID FK
376 + string Definitions
377 + string Assumptions
378 + string Boundaries
379 + datetime CreatedAt
380 + }
381 +
382 + EVIDENCE {
383 + string EvidenceID PK
384 + string SourceType
385 + string URL
386 + float ReliabilityScore
387 + }
388 +
389 + EVIDENCE_VERSION {
390 + string EvidenceVersionID PK
391 + string EvidenceID FK
392 + string Summary
393 + float ReliabilityScore
394 + datetime CreatedAt
395 + }
396 +
397 + VERDICT {
398 + string VerdictID PK
399 + string ScenarioID FK
400 + }
401 +
402 + VERDICT_VERSION {
403 + string VerdictVersionID PK
404 + string VerdictID FK
405 + float Verdict
406 + float Confidence
407 + string Reasoning
408 + datetime CreatedAt
409 + }
410 +
411 + %% Users and roles
412 + USER {
413 + string UserID PK
414 + string Handle
415 + string Email
416 + }
417 +
418 + TECHNICAL_USER {
419 + string UserID PK
420 + string SystemName
421 + }
422 +
423 + CONTRIBUTING_USER {
424 + string UserID PK
425 + string DisplayName
426 + }
427 +
428 + TRUSTED_CONTRIBUTOR {
429 + string UserID PK
430 + string TrustLevel
431 + }
432 +
433 + REVIEWER {
434 + string UserID PK
435 + string Domain
436 + }
437 +
438 + EXPERT {
439 + string UserID PK
440 + string ExpertiseArea
441 + }
442 +
443 + FEDERATION_NODE {
444 + string NodeID PK
445 + string Region
446 + }
447 +
448 + FEDERATION_ADMIN {
449 + string UserID PK
450 + string Permissions
451 + }
452 +
453 + REVIEW_ACTION {
454 + string ReviewActionID PK
455 + string UserID FK
456 + string TargetEntityType
457 + string TargetEntityVersionID
458 + string ActionType
459 + string Comment
460 + datetime Timestamp
461 + }
462 +
463 + %% Inheritance / specialization (modelled as relationships)
464 + USER ||--o| TECHNICAL_USER : "is a"
465 + USER ||--o| CONTRIBUTING_USER : "is a"
466 +
467 + CONTRIBUTING_USER ||--o| TRUSTED_CONTRIBUTOR : "subset"
468 + CONTRIBUTING_USER ||--o| REVIEWER : "subset"
469 + CONTRIBUTING_USER ||--o| EXPERT : "subset"
470 +
471 + TECHNICAL_USER ||--o{ FEDERATION_NODE : "operates"
472 + TECHNICAL_USER ||--o{ FEDERATION_ADMIN : "administers"
473 +
474 + %% Review actions on versioned entities
475 + USER ||--o{ REVIEW_ACTION : performs
476 +
477 + REVIEW_ACTION }o--|| CLAIM_VERSION : reviews
478 + REVIEW_ACTION }o--|| SCENARIO_VERSION : reviews
479 + REVIEW_ACTION }o--|| EVIDENCE_VERSION : reviews
480 + REVIEW_ACTION }o--|| VERDICT_VERSION : reviews
481 +
482 +{{/mermaid}}
483 +
484 +Notes:
485 +
486 +* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
487 + SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
488 + in {{code}}ROLE{{/code}}.
489 +* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
490 + node-to-node federation agents, batch jobs). All other roles can, in principle,
491 + be held by both human and technical users where appropriate.
492 +* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
493 + roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
494 + do.
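As a rough sketch of how a review action could be pinned to a versioned entity, the Python snippet below mirrors the {{code}}REVIEW_ACTION{{/code}} attributes from the diagram. The constant {{code}}VERSIONED_TARGETS{{/code}} and the validation in {{code}}__post_init__{{/code}} are assumptions made for illustration. The source does not say how the constraint is enforced.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumption: only these version entities are valid review targets,
# matching the four "reviews" relationships in the diagram.
VERSIONED_TARGETS = frozenset(
    {"CLAIM_VERSION", "SCENARIO_VERSION", "EVIDENCE_VERSION", "VERDICT_VERSION"}
)


@dataclass(frozen=True)
class ReviewAction:
    """Immutable record of one review, bound to the exact version reviewed."""
    review_action_id: str
    user_id: str
    target_entity_type: str
    target_entity_version_id: str
    action_type: str
    comment: str
    timestamp: datetime

    def __post_init__(self) -> None:
        # Reject identity entities such as "CLAIM": reviews attach to versions.
        if self.target_entity_type not in VERSIONED_TARGETS:
            raise ValueError(
                f"review target must be a version entity, got {self.target_entity_type!r}"
            )
```

Rejecting identity-level targets at construction time is one way to keep the audit trail tied to what the reviewer actually saw.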
495 +
496 +----
497 +
498 += 5.4 Versioning and re-evaluation behavior =
499 +
500 +This section ties the data model to the re-evaluation logic
501 +(described in more detail in the Versioning and Automation chapters).
502 +
503 +* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
504 +** All {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
505 + earlier versions of the same evidence are candidates for re-assessment.
506 +** Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
507 + are queued for re-evaluation.
508 +
509 +* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
510 +** It may inherit some links from earlier versions of the scenario, or start empty,
511 + depending on the change classification (cosmetic vs. conceptual).
512 +** All verdicts for that scenario are recalculated and stored as new
513 + {{code}}VERDICT_VERSION{{/code}} entries.
514 +
515 +* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
516 + the reviewer. This preserves a faithful audit trail if data later changes.
517 +
518 +* In a federated environment, nodes can choose:
519 +** which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
520 +** which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
521 + or only EVIDENCE_VERSIONs above a reliability threshold)
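The re-evaluation rule for new evidence could be sketched in Python as follows. This is an illustration under two assumptions: that links to earlier versions of the same evidence are what become candidates, and that each verdict version records the scenario version it was computed for. The function {{code}}verdicts_to_requeue{{/code}} and the field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EvidenceVersion:
    evidence_version_id: str
    evidence_id: str


@dataclass(frozen=True)
class LinkVersion:
    scenario_version_id: str
    evidence_version_id: str


@dataclass(frozen=True)
class VerdictVersion:
    verdict_version_id: str
    scenario_version_id: str  # assumption: verdict versions point at a scenario version


def verdicts_to_requeue(
    new_version: EvidenceVersion,
    known_versions: Iterable[EvidenceVersion],
    links: Iterable[LinkVersion],
    verdicts: Iterable[VerdictVersion],
) -> list[str]:
    """Return IDs of verdict versions that may be outdated by new_version."""
    # Earlier versions of the same evidence identity.
    superseded = {
        ev.evidence_version_id
        for ev in known_versions
        if ev.evidence_id == new_version.evidence_id
        and ev.evidence_version_id != new_version.evidence_version_id
    }
    # Scenario versions whose links relied on a superseded evidence version.
    affected = {
        link.scenario_version_id
        for link in links
        if link.evidence_version_id in superseded
    }
    return sorted(
        v.verdict_version_id for v in verdicts if v.scenario_version_id in affected
    )
```

The function only selects candidates. The existing verdict versions are left untouched, consistent with the rule that versions are superseded rather than changed.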
522 +
523 +----
524 +
525 += 5.5 Behavioral Notes =
526 +
527 +== 5.5.1 Late-Arriving Evidence ==
528 +
529 +New evidence versions can make existing verdicts **outdated** and may trigger
530 +re-evaluation cascades. This is handled by the global trigger and automation
531 +architecture (see the Versioning & Automation chapters).
532 +
533 +== 5.5.2 Scenario Evolution ==
534 +
535 +Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
536 +Scenario–Evidence links are re-assessed. Old versions remain available for
537 +historical comparison and reproducibility.
538 +
539 +== 5.5.3 Federation ==
540 +
541 +Federated nodes can replicate subsets of the graph, including:
542 +
543 +* Claims and Scenarios of local interest
544 +* Evidence metadata (without full content)
545 +* Verdict lineages used for local decision-making
546 +
547 +Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
548 +replication logs, and trust rules) are described in the Federation &
549 +Decentralization chapter and build on top of the core data model defined here.
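A node's choice of what to replicate might be expressed as a small policy object. The Python sketch below is an assumption-laden illustration: {{code}}ReplicationPolicy{{/code}}, {{code}}select_for_replication{{/code}}, and the {{code}}status{{/code}} field on verdict versions are hypothetical and do not appear in the ERD. Only the reliability score is taken from the diagram.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EvidenceVersion:
    evidence_version_id: str
    reliability_score: float


@dataclass(frozen=True)
class VerdictVersion:
    verdict_version_id: str
    status: str  # hypothetical field, e.g. "accepted" or "draft"


@dataclass(frozen=True)
class ReplicationPolicy:
    """Per-node filter for versioned entities (hypothetical)."""
    min_reliability: float = 0.0
    accepted_verdicts_only: bool = False

    def wants_evidence(self, ev: EvidenceVersion) -> bool:
        return ev.reliability_score >= self.min_reliability

    def wants_verdict(self, vv: VerdictVersion) -> bool:
        return not self.accepted_verdicts_only or vv.status == "accepted"


def select_for_replication(
    policy: ReplicationPolicy,
    evidence: Iterable[EvidenceVersion],
    verdicts: Iterable[VerdictVersion],
) -> tuple[list[EvidenceVersion], list[VerdictVersion]]:
    """Return the evidence and verdict versions this node would replicate."""
    return (
        [ev for ev in evidence if policy.wants_evidence(ev)],
        [vv for vv in verdicts if policy.wants_verdict(vv)],
    )
```

A node interested only in vetted material could, for example, use {{code}}ReplicationPolicy(min_reliability=0.7, accepted_verdicts_only=True){{/code}}. The threshold value here is arbitrary.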