Last modified by Robert Schaub on 2025/12/24 20:35

From version 5.1
edited by Robert Schaub
on 2025/11/27 12:28
Change comment: There is no comment for this version
To version 7.1
edited by Robert Schaub
on 2025/11/27 12:41
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -1,3 +1,316 @@
1 +(((
2 +
3 +)))
4 +
5 += 5. Data Model =
6 +
7 +The FactHarbor data model centers on four fully versioned, immutable entities:
8 +
9 +* **Claim**
10 +* **Scenario**
11 +* **Evidence**
12 +* **Verdict**
13 +
14 +These entities form the structured **“truth landscape”** for each claim.
15 +The model is explicitly **versioned**, **traceable**, and **federation-ready**.
16 +
17 +To keep the system auditable and explainable, FactHarbor uses a consistent
18 +**identity vs. version** pattern:
19 +
20 +* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
21 + define *what* something is in a stable sense.
22 +* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
23 + define *how that thing looked at a given point in time*.
24 +
25 +All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
26 +mutable identities.
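
The identity vs. version pattern described above can be sketched as follows. This is a minimal illustration, not the FactHarbor implementation: the Python class names, the snake_case field names, the `latest_version` helper, and the sample claim texts are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Claim:
    """Identity entity: says *what* the claim is; stable across rephrasings."""
    claim_id: str
    cluster_id: str


@dataclass(frozen=True)
class ClaimVersion:
    """Version entity: how the claim looked at one point in time."""
    claim_version_id: str
    claim_id: str  # back-reference to the identity entity
    text: str
    created_at: datetime


def latest_version(versions, claim_id):
    """Return the newest ClaimVersion of a claim, or None if it has none."""
    own = [v for v in versions if v.claim_id == claim_id]
    return max(own, key=lambda v: v.created_at, default=None)


# Made-up sample data: one identity, two phrasings over time.
claim = Claim(claim_id="c1", cluster_id="k1")
history = [
    ClaimVersion("cv1", "c1", "Coffee is healthy.", datetime(2025, 1, 1)),
    ClaimVersion("cv2", "c1", "Moderate coffee intake is healthy.", datetime(2025, 6, 1)),
]
current = latest_version(history, claim.claim_id)
```

The identity row never changes when the wording does; only a new version row is appended, and the earlier phrasing stays in `history`.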
27 +
28 +----
29 +
30 += 5.1 Core entities and versioning pattern =
31 +
32 +(% class="wikitable" %)
33 +| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
34 +| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
35 +| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
36 +| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
37 +| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
38 +| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
39 +| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
40 +
41 +Key design decisions:
42 +
43 +* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
44 +* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
45 + (scenarios live at the *claim* level, not per individual phrasing).
46 +* Verdicts and Scenario–Evidence links are always attached to **versions**:
47 +** {{code}}SCENARIO_VERSION{{/code}} +
48 +{{code}}EVIDENCE_VERSION{{/code}} →
49 +{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
50 +** {{code}}SCENARIO_VERSION{{/code}} →
51 +{{code}}VERDICT_VERSION{{/code}}
52 +
53 +This ensures that when a Scenario or Evidence changes, old verdicts and links
54 +remain intact as historical records and can be revisited.
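
A small sketch of why attaching links and verdicts to versions preserves history. The field names, the `relevance` scale, the `direction` values, and the direct `scenario_version_id` reference on the verdict version are assumptions for illustration; the ERD on this page does not spell them out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkVersion:
    link_version_id: str
    scenario_version_id: str
    evidence_version_id: str
    relevance: float  # assumed scale 0.0-1.0
    direction: str    # assumed values: "supports" / "contradicts"


@dataclass(frozen=True)
class VerdictVersion:
    verdict_version_id: str
    scenario_version_id: str  # assumed direct reference to the scenario version
    verdict: float
    confidence: float


def verdicts_for(verdicts, scenario_version_id):
    """All verdict versions that were issued for one scenario version."""
    return [v for v in verdicts if v.scenario_version_id == scenario_version_id]


links = [LinkVersion("lv1", "sv1", "ev1", 0.9, "supports")]
verdicts = [VerdictVersion("vv1", "sv1", 0.8, 0.6)]

# The scenario is revised: "sv2" gets its own link and verdict,
# while everything recorded against "sv1" is left untouched.
links.append(LinkVersion("lv2", "sv2", "ev1", 0.7, "supports"))
verdicts.append(VerdictVersion("vv2", "sv2", 0.4, 0.7))
```

Because nothing points at the mutable identity, the old verdict can still be looked up by the scenario version it was based on.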
55 +
56 +----
57 +
58 += 5.2 Core Data Model ERD (expanded, versioned) =
59 +
60 +The following Mermaid ER diagram shows the main entities and their relationships.
61 +The convention is that fields ending in {{code}}Id{{/code}} are primary keys,
62 +and fields with {{code}}...IdFk{{/code}} are foreign keys.
63 +
64 +{{comment}} Core Data Model ERD (Mermaid, from /Specification/Diagrams/Data Model) {{/comment}}
65 +{{include document="FactHarbor.Playground.Core Data Model ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Core Data Model ERD Page (from Specification chat).WebHome"/}}
66 +
67 +**Important points:**
68 +
69 +* Scenarios and Evidence are **linked via their versions**
70 + ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
71 +* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
72 +* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
73 +
74 +All version entities are immutable: once created, they are never changed, only
75 +superseded by newer versions.
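
One way to enforce "never changed, only superseded" at the object level is a frozen dataclass, sketched below. The `supersede` helper and the sample values are hypothetical; a real store would also need to enforce immutability at the database layer.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class EvidenceVersion:
    evidence_version_id: str
    evidence_id: str
    summary: str
    reliability_score: float


def supersede(old, new_version_id, **changes):
    """Create a newer version; the old one is left exactly as it was."""
    return replace(old, evidence_version_id=new_version_id, **changes)


ev1 = EvidenceVersion("ev1", "e1", "Initial extraction", 0.5)
ev2 = supersede(ev1, "ev2", reliability_score=0.75)

try:
    ev1.reliability_score = 0.9  # in-place edits are rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```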
76 +
77 +----
78 +
79 += 5.3 Data Use & Review ERD (expanded, versioned) =
80 +
81 +The **Data Use** model captures who does what with which versioned data:
82 +
83 +* Users (including technical users)
84 +* Roles and role assignments
85 +* Review actions on versioned entities
86 +
87 +{{comment}} Data Use ERD (Mermaid, from /Specification/Diagrams/Data Use ERD) {{/comment}}
88 +{{include document="FactHarbor.Playground.Data Use ERD Page (from Specification chat).WebHome" reference="FactHarbor.Playground.data.Data Use ERD Page (from Specification chat).WebHome"/}}
89 +
90 +== Data Use ERD (Roles, Review & Versioned Entities) ==
91 +
92 +This diagram shows how users, roles, and review actions relate to the
93 +versioned core entities.
94 +
95 +{{mermaid}}
96 +erDiagram
97 + %% Core clusters shown for context
98 + CLAIM_CLUSTER {
99 + string ClusterID PK
100 + string EmbeddingVectorRef
101 + string Theme
102 + }
103 +
104 + CLAIM {
105 + string ClaimID PK
106 + string ClusterID FK
107 + string Status
108 + datetime CreatedAt
109 + }
110 +
111 + CLAIM_VERSION {
112 + string ClaimVersionID PK
113 + string ClaimID FK
114 + string Text
115 + string ClaimType
116 + string Domain
117 + datetime CreatedAt
118 + }
119 +
120 + SCENARIO {
121 + string ScenarioID PK
122 + string ClaimID FK
123 + string Name
124 + datetime CreatedAt
125 + }
126 +
127 + SCENARIO_VERSION {
128 + string ScenarioVersionID PK
129 + string ScenarioID FK
130 + string Definitions
131 + string Assumptions
132 + string Boundaries
133 + datetime CreatedAt
134 + }
135 +
136 + EVIDENCE {
137 + string EvidenceID PK
138 + string SourceType
139 + string URL
140 + float ReliabilityScore
141 + }
142 +
143 + EVIDENCE_VERSION {
144 + string EvidenceVersionID PK
145 + string EvidenceID FK
146 + string Summary
147 + float ReliabilityScore
148 + datetime CreatedAt
149 + }
150 +
151 + VERDICT {
152 + string VerdictID PK
153 + string ScenarioID FK
154 + }
155 +
156 + VERDICT_VERSION {
157 + string VerdictVersionID PK
158 + string VerdictID FK
159 + float Verdict
160 + float Confidence
161 + string Reasoning
162 + datetime CreatedAt
163 + }
164 +
165 + %% Users and roles
166 + USER {
167 + string UserID PK
168 + string Handle
169 + string Email
170 + }
171 +
172 + TECHNICAL_USER {
173 + string UserID PK
174 + string SystemName
175 + }
176 +
177 + CONTRIBUTING_USER {
178 + string UserID PK
179 + string DisplayName
180 + }
181 +
182 + TRUSTED_CONTRIBUTOR {
183 + string UserID PK
184 + string TrustLevel
185 + }
186 +
187 + REVIEWER {
188 + string UserID PK
189 + string Domain
190 + }
191 +
192 + EXPERT {
193 + string UserID PK
194 + string ExpertiseArea
195 + }
196 +
197 + FEDERATION_NODE {
198 + string NodeID PK
199 + string Region
200 + }
201 +
202 + FEDERATION_ADMIN {
203 + string UserID PK
204 + string Permissions
205 + }
206 +
207 + REVIEW_ACTION {
208 + string ReviewActionID PK
209 + string UserID FK
210 + string TargetEntityType
211 + string TargetEntityVersionID
212 + string ActionType
213 + string Comment
214 + datetime Timestamp
215 + }
216 +
217 + %% Inheritance / specialization (modelled as relationships)
218 + USER ||--o| TECHNICAL_USER : "is a"
219 + USER ||--o| CONTRIBUTING_USER : "is a"
220 +
221 + CONTRIBUTING_USER ||--o| TRUSTED_CONTRIBUTOR : "subset"
222 + CONTRIBUTING_USER ||--o| REVIEWER : "subset"
223 + CONTRIBUTING_USER ||--o| EXPERT : "subset"
224 +
225 + TECHNICAL_USER ||--o{ FEDERATION_NODE : "operates"
226 + TECHNICAL_USER ||--o{ FEDERATION_ADMIN : "administers"
227 +
228 + %% Review actions on versioned entities
229 + USER ||--o{ REVIEW_ACTION : performs
230 +
231 + REVIEW_ACTION }o--|| CLAIM_VERSION : reviews
232 + REVIEW_ACTION }o--|| SCENARIO_VERSION : reviews
233 + REVIEW_ACTION }o--|| EVIDENCE_VERSION : reviews
234 + REVIEW_ACTION }o--|| VERDICT_VERSION : reviews
235 +{{/mermaid}}
236 +
237 +{{info}}
238 +This diagram focuses on *who* uses and reviews *which* versioned entities.
239 +USER is the base type; TECHNICAL_USER and CONTRIBUTING_USER are specializations.
240 +Other roles (REVIEWER, EXPERT, TRUSTED_CONTRIBUTOR, FEDERATION_ADMIN, FEDERATION_NODE)
241 +are modelled as specializations or technical subtypes.
242 +{{/info}}
243 +
244 +
245 +Notes:
246 +
247 +* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
248 + SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
249 + in {{code}}ROLE{{/code}}.
250 +* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
251 + node-to-node federation agents, batch jobs). All other roles can, in principle,
252 + be held by both human and technical users where appropriate.
253 +* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
254 + roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
255 + do.
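
The notes above can be turned into a small validation sketch for creating a {{code}}REVIEW_ACTION{{/code}}. The set of reviewing roles is taken from the last note (federation roles are left out for brevity); the function name, the role check, and the error types are assumptions, not part of the specification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Only version entities are valid review targets (see the ERD relationships).
REVIEWABLE_TYPES = {
    "CLAIM_VERSION", "SCENARIO_VERSION", "EVIDENCE_VERSION", "VERDICT_VERSION",
}
# Assumed subset of roles that may perform review actions.
REVIEWING_ROLES = {"REVIEWER", "TRUSTED_CONTRIBUTOR", "MODERATOR"}


@dataclass(frozen=True)
class ReviewAction:
    review_action_id: str
    user_id: str
    target_entity_type: str
    target_entity_version_id: str
    action_type: str
    comment: str
    timestamp: datetime


def make_review_action(action_id, user_id, user_roles, target_type,
                       target_version_id, action_type, comment=""):
    """Build a review action bound to one exact entity version."""
    if target_type not in REVIEWABLE_TYPES:
        raise ValueError(f"not a version entity: {target_type}")
    if not REVIEWING_ROLES & set(user_roles):
        raise PermissionError("user holds no reviewing role")
    return ReviewAction(action_id, user_id, target_type, target_version_id,
                        action_type, comment, datetime.now(timezone.utc))


action = make_review_action("ra1", "u1", {"REVIEWER"},
                            "VERDICT_VERSION", "vv1", "approve")
```

Rejecting identity types such as {{code}}CLAIM{{/code}} at creation time keeps every review tied to the version the reviewer actually saw.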
256 +
257 +----
258 +
259 += 5.4 Versioning and re-evaluation behavior =
260 +
261 +This section ties the data model to the re-evaluation logic
262 +(described in more detail in the Versioning and Automation chapters).
263 +
264 +* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
265 +** All {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
266 + earlier versions of the same evidence are candidates for re-assessment.
267 +** Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
268 + are queued for re-evaluation.
269 +
270 +* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
271 +** It may inherit some links from earlier versions of the scenario, or start
272 + empty, depending on the change classification (cosmetic vs. conceptual).
273 +** All verdicts for that scenario are recalculated and stored as new
274 +{{code}}VERDICT_VERSION{{/code}} entries.
275 +
276 +* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
277 + the reviewer. This preserves a faithful audit trail if data later changes.
278 +
279 +* In a federated environment, nodes can choose:
280 +** which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
281 +** which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
282 + only EVIDENCE_VERSIONs above a reliability threshold)
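
The evidence-driven re-evaluation step can be sketched as a lookup over links and verdicts. This assumes that "related" links are those referencing the evidence version being superseded, and that each verdict version can be traced to one scenario version; the function name and the plain tuple/dict shapes are hypothetical.

```python
def verdicts_to_requeue(superseded_evidence_version_id, links, verdict_scenarios):
    """Return the verdict versions that may be outdated.

    links: iterable of (scenario_version_id, evidence_version_id) pairs
    verdict_scenarios: dict mapping verdict_version_id -> scenario_version_id
    """
    # Scenario versions whose links point at the superseded evidence version.
    affected = {sv for sv, ev in links if ev == superseded_evidence_version_id}
    # Verdict versions issued for any of those scenario versions.
    return sorted(vv for vv, sv in verdict_scenarios.items() if sv in affected)


links = [("sv1", "ev1"), ("sv2", "ev1"), ("sv3", "ev9")]
verdict_scenarios = {"vv1": "sv1", "vv2": "sv2", "vv3": "sv3"}

# A newer version of the evidence behind "ev1" arrives.
queue = verdicts_to_requeue("ev1", links, verdict_scenarios)
```

The function only identifies candidates; existing verdict versions stay untouched, and any re-evaluation would be stored as new {{code}}VERDICT_VERSION{{/code}} entries.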
283 +
284 +----
285 +
286 += 5.5 Behavioral Notes =
287 +
288 +== 5.5.1 Late-Arriving Evidence ==
289 +
290 +New evidence versions can make existing verdicts **outdated** and may trigger
291 +re-evaluation cascades. This is handled by the global trigger and automation
292 +architecture (see the Versioning & Automation chapters).
293 +
294 +== 5.5.2 Scenario Evolution ==
295 +
296 +Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
297 +Scenario–Evidence links are re-assessed. Old versions remain available for
298 +historical comparison and reproducibility.
299 +
300 +== 5.5.3 Federation ==
301 +
302 +Federated nodes can replicate subsets of the graph, including:
303 +
304 +* Claims and Scenarios of local interest
305 +* Evidence metadata (without full content)
306 +* Verdict lineages used for local decision-making
307 +
308 +Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
309 +replication logs, and trust rules) are described in the Federation &
310 +Decentralization chapter and build on top of the core data model defined here.
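
A node-side replication filter for evidence might look like the sketch below. It combines the reliability threshold mentioned in section 5.4 with the "metadata without full content" rule. Treating {{code}}Summary{{/code}} as content to be withheld, the dictionary layout, and the function name are assumptions for illustration only.

```python
# Fields treated as replicable metadata (assumed; "Summary" is withheld).
METADATA_FIELDS = ("EvidenceVersionID", "EvidenceID", "ReliabilityScore", "CreatedAt")


def select_for_replication(evidence_versions, min_reliability):
    """Keep evidence versions at or above the threshold, metadata only."""
    selected = []
    for ev in evidence_versions:
        if ev["ReliabilityScore"] >= min_reliability:
            selected.append({key: ev[key] for key in METADATA_FIELDS})
    return selected


# Made-up sample rows using the EVIDENCE_VERSION field names from the ERD.
local_store = [
    {"EvidenceVersionID": "ev1", "EvidenceID": "e1", "Summary": "long text",
     "ReliabilityScore": 0.9, "CreatedAt": "2025-11-01T10:00:00Z"},
    {"EvidenceVersionID": "ev2", "EvidenceID": "e2", "Summary": "long text",
     "ReliabilityScore": 0.3, "CreatedAt": "2025-11-02T10:00:00Z"},
]
outgoing = select_for_replication(local_store, min_reliability=0.5)
```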
311 +
312 +----
313 +
1 1  == 1. Overall analysis & review of the data model ==
2 2  
3 3  === 1.1 Strengths of the current design ===
... ... @@ -165,155 +165,3 @@
165 165  )))
166 166  * That’s fine for now; I’ll just clarify that those belong to a “Processing / AKEL” submodel, not the core logical data model.
167 167  )))
168 -
169 -= 5. Data Model =
170 -
171 -The FactHarbor data model centers on four fully versioned, immutable entities:
172 -
173 -* **Claim**
174 -* **Scenario**
175 -* **Evidence**
176 -* **Verdict**
177 -
178 -These entities form the structured **“truth landscape”** for each claim.
179 -The model is explicitly **versioned**, **traceable**, and **federation-ready**.
180 -
181 -To keep the system auditable and explainable, FactHarbor uses a consistent
182 -**identity vs. version** pattern:
183 -
184 -* Identity entities (e.g. {{code}}CLAIM{{/code}}, {{code}}SCENARIO{{/code}})
185 - define *what* something is in a stable sense.
186 -* Version entities (e.g. {{code}}CLAIM_VERSION{{/code}}, {{code}}SCENARIO_VERSION{{/code}})
187 - define *how that thing looked at a given point in time*.
188 -
189 -All reasoning (e.g. verdicts, review actions) is attached to **versions**, never to
190 -mutable identities.
191 -
192 -----
193 -
194 -= 5.1 Core entities and versioning pattern =
195 -
196 -(% class="wikitable" %)
197 -| **Logical concept** | **Identity entity** | **Version entity** | **Notes**
198 -| Claim (what people argue about) | {{code}}CLAIM{{/code}} | {{code}}CLAIM_VERSION{{/code}} | Claim text, phrasing, and metadata live in {{code}}CLAIM_VERSION{{/code}}. The identity {{code}}CLAIM{{/code}} stays stable across rephrasings.
199 -| Scenario (interpretive frame) | {{code}}SCENARIO{{/code}} | {{code}}SCENARIO_VERSION{{/code}} | A SCENARIO belongs to a CLAIM. Its versions capture evolving definitions, assumptions, and boundaries.
200 -| Evidence (source / datapoint) | {{code}}EVIDENCE{{/code}} | {{code}}EVIDENCE_VERSION{{/code}} | Identity of a source vs. specific extractions / updates over time.
201 -| Verdict (assessment) | {{code}}VERDICT{{/code}} | {{code}}VERDICT_VERSION{{/code}} | A VERDICT is defined per SCENARIO; VERDICT_VERSION captures the history of assessments.
202 -| Scenario–Evidence link | {{code}}SCENARIO_EVIDENCE_LINK{{/code}} | {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} | Links bind scenario versions to evidence versions with relevance & direction.
203 -| Claim cluster (semantic group) | {{code}}CLAIM_CLUSTER{{/code}} | – | Groups semantically related claims; mainly for discovery and navigation.
204 -
205 -Key design decisions:
206 -
207 -* A {{code}}CLAIM{{/code}} belongs to exactly one {{code}}CLAIM_CLUSTER{{/code}}.
208 -* A {{code}}SCENARIO{{/code}} belongs to exactly one {{code}}CLAIM{{/code}}
209 - (scenarios live at the *claim* level, not per individual phrasing).
210 -* Verdicts and Scenario–Evidence links are always attached to **versions**:
211 -* {{code}}SCENARIO_VERSION{{/code}} +
212 -{{code}}EVIDENCE_VERSION{{/code}} →
213 -{{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}}
214 -* {{code}}SCENARIO_VERSION{{/code}} →
215 -{{code}}VERDICT_VERSION{{/code}}
216 -
217 -This ensures that when a Scenario or Evidence changes, old verdicts and links
218 -remain intact as historical records and can be revisited.
219 -
220 -----
221 -
222 -= 5.2 Core Data Model ERD (expanded, versioned) =
223 -
224 -The following Mermaid ER diagram shows the main entities and their relationships.
225 -The convention is that fields ending in {{code}}Id{{/code}} are primary keys,
226 -and fields with {{code}}...IdFk{{/code}} are foreign keys.
227 -
228 -{{comment}} Core Data Model ERD (Mermaid, from /Specification/Diagrams/Data Model) {{/comment}}
229 -{{include document="FactHarbor.Playground.Core Data Model ERD Page (from Specification chat).WebHome"/}}
230 -
231 -**Important points:**
232 -
233 -* Scenarios and Evidence are **linked via their versions**
234 - ({{code}}SCENARIO_VERSION{{/code}} and {{code}}EVIDENCE_VERSION{{/code}}).
235 -* Verdicts are **per ScenarioVersion** and stored in {{code}}VERDICT_VERSION{{/code}}.
236 -* {{code}}CLAIM_CLUSTER{{/code}} is shared across diagrams; it is shown here and in the Data Use / Review model.
237 -
238 -All version entities are immutable: once created, they are never changed, only
239 -superseded by newer versions.
240 -
241 -----
242 -
243 -= 5.3 Data Use & Review ERD (expanded, versioned) =
244 -
245 -The **Data Use** model captures who does what with which versioned data:
246 -
247 -* Users (including technical users)
248 -* Roles and role assignments
249 -* Review actions on versioned entities
250 -
251 -{{comment}} Data Use ERD (Mermaid, from /Specification/Diagrams/Data Use ERD) {{/comment}}
252 -{{include document="FactHarbor.Playground.Data Use ERD Page (from Specification chat).WebHome"/}}
253 -
254 -Notes:
255 -
256 -* Most roles (READER, CONTRIBUTOR, TRUSTED_CONTRIBUTOR, REVIEWER, MODERATOR,
257 - SYSTEM_ADMIN, FEDERATION_OPERATOR, FEDERATION_ADMIN, …) are represented as rows
258 - in {{code}}ROLE{{/code}}.
259 -* {{code}}TECHNICAL_USER{{/code}} captures strictly technical accounts (API keys,
260 - node-to-node federation agents, batch jobs). All other roles can, in principle,
261 - be held by both human and technical users where appropriate.
262 -* A {{code}}READER{{/code}} normally does **not** perform REVIEW_ACTIONs, while
263 - roles like REVIEWER, TRUSTED_CONTRIBUTOR, MODERATOR, and some federation roles
264 - do.
265 -
266 -----
267 -
268 -= 5.4 Versioning and re-evaluation behavior =
269 -
270 -This section ties the data model to the re-evaluation logic
271 -(described in more detail in the Versioning and Automation chapters).
272 -
273 -* When a new {{code}}EVIDENCE_VERSION{{/code}} is created:
274 -* All related {{code}}SCENARIO_EVIDENCE_LINK_VERSION{{/code}} entries referencing
275 - that evidence version are candidates for re-assessment.
276 -* Related {{code}}VERDICT_VERSION{{/code}} entries may become **outdated** and
277 - are queued for re-evaluation.
278 -
279 -* When a new {{code}}SCENARIO_VERSION{{/code}} is created:
280 -* It may inherit some links from earlier scenarios, or start empty depending
281 - on the change classification (cosmetic vs. conceptual).
282 -* All verdicts for that scenario are recalculated and stored as new
283 -{{code}}VERDICT_VERSION{{/code}} entries.
284 -
285 -* REVIEW_ACTIONs are always attached to the **exact version** that was seen by
286 - the reviewer. This preserves a faithful audit trail if data later changes.
287 -
288 -* In a federated environment, nodes can choose:
289 -* which identity entities to replicate (CLAIM, SCENARIO, EVIDENCE, VERDICT)
290 -* which versioned entities to replicate (e.g. only accepted VERDICT_VERSIONs,
291 - only EVIDENCE_VERSIONs above a reliability threshold, etc.)
292 -
293 -----
294 -
295 -= 5.5 Behavioral Notes =
296 -
297 -== 5.5.1 Late-Arriving Evidence ==
298 -
299 -New evidence versions can make existing verdicts **outdated** and may trigger
300 -re-evaluation cascades. This is handled by the global trigger and automation
301 -architecture (see the Versioning & Automation chapters).
302 -
303 -== 5.5.2 Scenario Evolution ==
304 -
305 -Scenario changes create new SCENARIO_VERSIONs; dependent verdicts and
306 -Scenario–Evidence links are re-assessed. Old versions remain available for
307 -historical comparison and reproducibility.
308 -
309 -== 5.5.3 Federation ==
310 -
311 -Federated nodes can replicate subsets of the graph, including:
312 -
313 -* Claims and Scenarios of local interest
314 -* Evidence metadata (without full content)
315 -* Verdict lineages used for local decision-making
316 -
317 -Federation-specific entities (such as {{code}}FEDERATION_NODE{{/code}},
318 -replication logs, and trust rules) are described in the Federation &
319 -Decentralization chapter and build on top of the core data model defined here.