Changes for page Data Model

Last modified by Robert Schaub on 2025/12/24 20:34

From version 4.1
edited by Robert Schaub
on 2025/12/12 08:32
Change comment: Imported from XAR
To version 6.4
edited by Robert Schaub
on 2025/12/16 20:28
Change comment: Renamed back-links.

Details

Page properties
Parent
... ... @@ -1,1 +1,1 @@
1 -FactHarbor.Specification.WebHome
1 +FactHarbor.Archive.FactHarbor V0\.9\.18.Specification.WebHome
Content
... ... @@ -1,41 +1,293 @@
1 1  = Data Model =
2 2  
3 -This page describes the current data model for FactHarbor.
3 +This page describes the current data model for FactHarbor v0.9.1.
4 4  
5 -== Core Data Model Refinements ==
5 +== Versioning Strategy ==
6 6  
7 +Every entity in FactHarbor has a full immutable version history. This ensures:
8 +
9 +* Complete auditability
10 +* Ability to reconstruct historical state
11 +* Federation-compatible lineage tracking
12 +* Transparent evolution of claims, scenarios, and verdicts
13 +
14 +=== Core Versioning Principles ===
15 +
16 +**Immutability**:
17 +
18 +* Each version is stored independently
19 +* Versions cannot be deleted, only superseded
20 +* Historical versions remain accessible
21 +
22 +**Lineage**:
23 +
24 +* Each version links to its parent via `ParentVersionID`
25 +* Forms a directed acyclic graph (DAG) of changes
26 +* Supports branching in federated environments
27 +
28 +**Provenance**:
29 +
30 +* Every version timestamped (`CreatedAt`)
31 +* Author type recorded (`AuthorType`: Human, AI, ExternalNode)
32 +* Justification captured (`JustificationText`)
33 +* Digital signatures for integrity (`SignatureHash` in Release 1.0)
34 +
35 +**Federation Support**:
36 +
37 +* Versions can originate from remote nodes
38 +* Conflict detection via lineage comparison
39 +* Parallel version trees for branching scenarios
40 +* Cross-node version synchronization
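The lineage and conflict-detection rules above can be illustrated with a minimal Python sketch. It assumes each version has at most one `ParentVersionID` (so the history is a tree, the simplest DAG), and the in-memory `PARENTS` map and the function names are hypothetical, not part of the specification.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory store: VersionID -> ParentVersionID (None for the first version).
PARENTS: Dict[str, Optional[str]] = {
    "v1": None,
    "v2": "v1",
    "v3a": "v2",  # assumed to be created on node A
    "v3b": "v2",  # assumed to be created on node B, same parent: a branch
}


def lineage(version_id: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain from version_id back to the first version."""
    chain: List[str] = []
    current: Optional[str] = version_id
    while current is not None:
        chain.append(current)
        current = parents[current]
    return chain


def common_ancestor(a: str, b: str, parents: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the most recent version shared by both lineages, if any."""
    seen = set(lineage(a, parents))
    for version_id in lineage(b, parents):
        if version_id in seen:
            return version_id
    return None


def is_conflict(a: str, b: str, parents: Dict[str, Optional[str]]) -> bool:
    """Two versions conflict when neither one is an ancestor of the other."""
    return common_ancestor(a, b, parents) not in (a, b)
```

Here `is_conflict("v3a", "v3b", PARENTS)` reports a branch that needs reconciliation, while `is_conflict("v2", "v3a", PARENTS)` does not, because `v3a` simply extends `v2`.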
41 +
42 +=== Common Version Fields ===
43 +
44 +All versioned entities include:
45 +
46 +* **VersionID**: Unique identifier for this specific version
47 +* **ParentVersionID**: Link to previous version (null for first version)
48 +* **CreatedAt**: Timestamp (ISO 8601, UTC)
49 +* **AuthorType**: Human | AI | ExternalNode
50 +* **CreatedBy**: Foreign key to User or TechnicalUser
51 +* **JustificationText**: Brief explanation of changes
52 +* **PublicationMode**: Mode1 (draft) | Mode2 (AI-published) | Mode3 (human-reviewed)
53 +* **ReviewStatus**: Workflow state (draft|in_review|approved|rejected)
54 +* **NodeOrigin**: Node ID where version was created (for federation)
55 +* **SignatureHash**: Cryptographic signature (Release 1.0)
56 +
57 +----
58 +
59 +== Core Entity Definitions ==
60 +
61 +=== User Entities ===
62 +
63 +**USER** (base user table):
64 +
65 +* ``UserID`` (PK)
66 +* ``UserType`` (Reader|Contributor|Reviewer|Auditor|Expert|Moderator|Maintainer)
67 +* ``DisplayName``
68 +* ``Email`` (for Contributors and above)
69 +* ``RegisteredAt``
70 +* ``LastActive``
71 +* ``Status`` (active|suspended|banned)
72 +
73 +**TECHNICAL_USER** (system processes):
74 +
75 +* ``SystemID`` (PK)
76 +* ``SystemName``
77 +* ``Purpose`` (AKEL|FederationSync|BackupService|Monitor|Audit)
78 +* ``CreatedBy`` (FK to Maintainer who created this system user)
79 +* ``CreatedAt``
80 +* ``Status`` (active|paused|deprecated)
81 +* ``ApiKey`` (encrypted)
82 +* ``Permissions`` (JSON - authorized operations)
83 +
84 +**Examples of Technical Users**:
85 +
86 +* AKEL instances (AI processing)
87 +* Federation sync bots
88 +* Scheduled audit tasks
89 +* Backup services
90 +* Monitoring systems
91 +* External API integrations
92 +
93 +----
94 +
95 +=== Content Entities ===
96 +
7 7  The system relies on the following versioned core entities:
8 8  
9 -* **CLAIM_CLUSTER**
10 -** ``ClusterID`` (PK), ``EmbeddingVectorRef``, ``Theme``
11 -** Groups related claims into topical clusters.
12 -** One Cluster has many Claims.
13 -** A Claim belongs to exactly one primary cluster.
99 +**CLAIM_CLUSTER**:
14 14  
15 -* **CLAIM / CLAIM_VERSION**
16 -** ``CLAIM`` is the long‑lived anchor for a real‑world claim.
17 -** ``CLAIM_VERSION`` is an immutable snapshot of wording + basic metadata.
18 -** **Note:** Verdicts are **NEVER** attached directly to a Claim. They are attached to Scenarios.
101 +* ``ClusterID`` (PK)
102 +* ``EmbeddingVectorRef``
103 +* ``Theme``
104 +* Groups related claims into topical clusters
105 +* One Cluster has many Claims
106 +* A Claim belongs to exactly one primary cluster
19 19  
20 -* **SCENARIO / SCENARIO_VERSION**
21 -** ``SCENARIO`` represents a stable interpretive context for a claim.
22 -** ``SCENARIO_VERSION`` is an immutable snapshot of that context (definitions, assumptions, boundaries).
23 -** A single Claim may have multiple Scenarios.
108 +**CLAIM / CLAIM_VERSION**:
24 24  
25 -* **EVIDENCE / EVIDENCE_VERSION**
26 -** ``EVIDENCE`` is the logical source (report, article, dataset…).
27 -** ``EVIDENCE_VERSION`` is the extracted/processed snapshot (summary, reliability score, extraction method).
110 +* ``CLAIM`` is the long-lived anchor for a real-world claim
111 +* ``CLAIM_VERSION`` is an immutable snapshot that includes:
112 +** ``VersionID`` (PK)
113 +** ``ClaimID`` (FK to CLAIM)
114 +** ``ParentVersionID`` (FK to prior version, nullable)
115 +** ``Text``
116 +** ``Domain``
117 +** ``ClaimType`` (literal|metaphorical|rhetorical|supernatural)
118 +** ``Evaluability`` (empirical|subjective|non-falsifiable)
119 +** ``RiskTier`` (A|B|C) - replaced SafetyCategory for consistency
120 +** ``PublicationMode`` (Mode1|Mode2|Mode3)
121 +** ``ReviewStatus`` (draft|in_review|approved|rejected)
122 +** ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
123 +** ``NodeOrigin``, ``SignatureHash``
124 +** ``Status`` (active|superseded|merged)
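The anchor/version split can be sketched as follows: revising a claim appends a new `CLAIM_VERSION` that points at its parent, and the previous version is flagged as superseded rather than deleted. The sketch assumes that `Status` is lifecycle metadata tracked alongside the immutable snapshot; the reduced field set and the `revise` helper are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class ClaimVersion:
    version_id: str
    claim_id: str
    parent_version_id: Optional[str]
    text: str
    risk_tier: str          # "A" | "B" | "C"
    status: str = "active"  # active | superseded | merged


def revise(history: List[ClaimVersion], new_version_id: str, new_text: str) -> List[ClaimVersion]:
    """Return a new history in which the latest version is superseded by a successor.

    The input list and its records are left untouched; nothing is deleted.
    """
    current = history[-1]
    superseded = replace(current, status="superseded")
    successor = replace(
        current,
        version_id=new_version_id,
        parent_version_id=current.version_id,
        text=new_text,
        status="active",
    )
    return history[:-1] + [superseded, successor]


history = [ClaimVersion("cv-1", "c-1", None, "Original wording", "B")]
revised = revise(history, "cv-2", "Corrected wording")
```

After the call, `revised` holds both versions, with `cv-2` linked to `cv-1` through `parent_version_id`.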
28 28  
29 -* **VERDICT / VERDICT_VERSION**
30 -** ``VERDICT`` represents the assertion "this claim is assessed **under this specific scenario**."
31 -** ``VERDICT_VERSION`` is an immutable snapshot of the evaluation (likelihood, confidence, reasoning, timestamp).
32 -** **Cardinality:** 1 Scenario has 1 active Verdict (but many Verdict versions over time). Therefore, 1 Claim has N Verdicts.
126 +**SCENARIO / SCENARIO_VERSION**:
33 33  
34 -* **SCENARIO_EVIDENCE_VERSION_LINK**
35 -** Connects ``ScenarioVersion`` ↔ ``EvidenceVersion`` (many‑to‑many).
36 -** Fields: ``LinkID``, ``Relevance``, ``Direction`` (SUPPORTS / CONTRADICTS / NEUTRAL / MIXED).
37 -** **Rule:** The link always targets specific **VERSIONS** of entities, never the base tables, to ensure auditability.
128 +* ``SCENARIO`` is the anchor for a scenario across time
129 +* ``SCENARIO_VERSION`` is an immutable snapshot:
130 +** ``VersionID`` (PK)
131 +** ``ScenarioID`` (FK to SCENARIO)
132 +** ``ParentVersionID``
133 +** ``ClaimID`` (FK to CLAIM)
134 +** ``Definitions`` (JSON)
135 +** ``Boundaries`` (JSON)
136 +** ``Assumptions`` (JSON)
137 +** ``Context`` (text)
138 +** ``EvaluationMethod`` (text)
139 +** ``PublicationMode`` (Mode1|Mode2|Mode3)
140 +** ``ReviewStatus`` (draft|in_review|approved|rejected)
141 +** ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
142 +** ``NodeOrigin``, ``SignatureHash``
143 +** ``Status`` (active|superseded|deprecated)
38 38  
39 -== Core Data Model ERD ==
145 +**Note**: ``SafetyClass`` was removed from Scenario; the risk tier is now set at claim level (``RiskTier``)
40 40  
41 -{{include reference="FactHarbor.Specification.Diagrams.Core Data Model ERD.WebHome"/}}
147 +**EVIDENCE / EVIDENCE_VERSION**:
148 +
149 +* ``EVIDENCE`` is the anchor
150 +* ``EVIDENCE_VERSION`` is the versioned snapshot:
151 +** ``VersionID`` (PK)
152 +** ``EvidenceID`` (FK to EVIDENCE)
153 +** ``ParentVersionID``
154 +** ``Type`` (paper|dataset|report|transcript|expert|media)
155 +** ``Category`` (empirical|historical|rhetorical|dataset|meta-analysis)
156 +** ``Reliability`` (low|medium|high)
157 +** ``Provenance`` (URL, DOI, source metadata)
158 +** ``ExtractionMethod`` (manual|OCR|API|AKEL)
159 +** ``ContentHash`` (SHA256 of evidence content)
160 +** ``PublicationMode`` (Mode1|Mode2|Mode3)
161 +** ``ReviewStatus`` (draft|verified|disputed|retracted)
162 +** ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
163 +** ``NodeOrigin``, ``SignatureHash``
164 +** ``Status`` (active|superseded)
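The `ContentHash` field can be computed and checked with the Python standard library. This is a minimal sketch: the function names are hypothetical, and it assumes the hash is stored as a lowercase hex digest over the raw evidence bytes, which the specification does not state.

```python
import hashlib


def content_hash(content: bytes) -> str:
    """Return the SHA-256 digest of the evidence content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def verify_content(content: bytes, stored_hash: str) -> bool:
    """Check that the content still matches the hash recorded in the version."""
    return content_hash(content) == stored_hash


stored = content_hash(b"Annual report 2024, section 3")
```

A later integrity check then reduces to `verify_content(fetched_bytes, stored)`; any change to the bytes yields a different digest.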
165 +
166 +**VERDICT / VERDICT_VERSION**:
167 +
168 +* ``VERDICT`` is the anchor
169 +* ``VERDICT_VERSION`` is the snapshot:
170 +** ``VersionID`` (PK)
171 +** ``VerdictID`` (FK to VERDICT)
172 +** ``ParentVersionID``
173 +** ``ClaimID`` (FK to CLAIM)
174 +** ``ScenarioVersionID`` (FK to specific SCENARIO_VERSION)
175 +** ``EvidenceVersionSet`` (JSON array of Evidence VersionIDs used)
176 +** ``LikelihoodRange`` (0–1, with uncertainty bounds)
177 +** ``ExplanationChain`` (JSON)
178 +** ``UncertaintyFactors`` (JSON)
179 +** ``PublicationMode`` (Mode1|Mode2|Mode3)
180 +** ``ReviewStatus`` (draft|in_review|approved|retracted)
181 +** ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
182 +** ``NodeOrigin``, ``SignatureHash``
183 +** ``Status`` (current|outdated|superseded|retracted)
184 +
185 +----
186 +
187 +== Many-to-Many Linking Tables ==
188 +
189 +**ScenarioEvidenceLink**:
190 +
191 +* Links scenario versions to evidence versions with relevance scoring
192 +* ``ScenarioID``, ``ScenarioVersionID``
193 +* ``EvidenceID``, ``EvidenceVersionID``
194 +* ``RelevanceScore`` (0–1) - How relevant this evidence is to this scenario
195 +* ``LinkJustification`` - Brief explanation of relevance
196 +
197 +**Purpose**:
198 +
199 +* Evidence can be used by multiple scenarios
200 +* Scenarios can draw from multiple pieces of evidence
201 +* Relevance scoring helps prioritize evidence
202 +* Version-specific linking preserves historical accuracy
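To show how relevance scoring might be used to prioritize evidence, the sketch below selects the links for one scenario version and orders them by `RelevanceScore`. The dataclass layout, the `evidence_for` helper, and the `min_score` threshold are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScenarioEvidenceLink:
    scenario_version_id: str
    evidence_version_id: str
    relevance_score: float  # 0 to 1
    link_justification: str


def evidence_for(
    links: List[ScenarioEvidenceLink],
    scenario_version_id: str,
    min_score: float = 0.0,
) -> List[ScenarioEvidenceLink]:
    """Return links for one scenario version, most relevant first."""
    matches = [
        link
        for link in links
        if link.scenario_version_id == scenario_version_id
        and link.relevance_score >= min_score
    ]
    return sorted(matches, key=lambda link: link.relevance_score, reverse=True)


links = [
    ScenarioEvidenceLink("sv-1", "ev-1", 0.4, "Background statistics"),
    ScenarioEvidenceLink("sv-1", "ev-2", 0.9, "Direct measurement"),
    ScenarioEvidenceLink("sv-2", "ev-2", 0.6, "Partially applicable"),
]
ranked = evidence_for(links, "sv-1")
```

Note that `ev-2` appears under two scenario versions with different scores, which is the many-to-many case the linking table exists for.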
203 +
204 +**ClaimCluster**:
205 +
206 +* Semantic clustering of similar claims
207 +* ``ClusterID`` (PK)
208 +* ``EmbeddingVector`` - Vector representation for semantic search
209 +* ``MemberList`` - List of ClaimIDs in this cluster
210 +* ``Theme`` - Human-readable theme description
211 +
212 +----
213 +
214 +== Key Changes in v0.9.1 ==
215 +
216 +**Updated Field Names**:
217 +
218 +* `SafetyCategory` → `RiskTier` (consistency with risk tier system A/B/C)
219 +* `SafetyClass` removed from Scenario (redundant with claim-level RiskTier)
220 +
221 +**Added Fields to All Version Entities**:
222 +
223 +* `PublicationMode` - Track Mode 1/2/3 status
224 +* `ReviewStatus` - Track workflow state
225 +* `NodeOrigin` - Federation provenance
226 +* `CreatedBy` - FK to User/TechnicalUser (clarified)
227 +
228 +**New Entity**:
229 +
230 +* `TECHNICAL_USER` - Separate system processes from human users
231 +
232 +**Clarifications**:
233 +
234 +* `ScenarioVersionID` in Verdict (not just ScenarioID) - links to specific version
235 +* `ContentHash` in Evidence - SHA256 for integrity checking
236 +
237 +----
238 +
239 +== Data Model Behavior ==
240 +
241 +=== Late-Arriving Evidence ===
242 +
243 +When new evidence versions appear:
244 +
245 +1. Existing verdicts are marked as **outdated**
246 +2. Scenario relevance must be re-evaluated
247 +3. The re-evaluation engine triggers verdict recomputation
248 +4. New verdict versions are created
249 +5. Users are notified of updates
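Steps 1 and 3 of this sequence can be sketched as follows. The sketch assumes that a verdict is affected when its `EvidenceVersionSet` contains the evidence version that has just been superseded; the specification does not state the selection rule, and the names `VerdictVersion` and `mark_outdated` are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class VerdictVersion:
    version_id: str
    verdict_id: str
    scenario_version_id: str
    evidence_version_set: Tuple[str, ...]
    status: str = "current"  # current | outdated | superseded | retracted


def mark_outdated(
    verdicts: List[VerdictVersion],
    superseded_evidence_version_id: str,
) -> Tuple[List[VerdictVersion], List[str]]:
    """Flag current verdicts that relied on a superseded evidence version.

    Returns the updated records and the VerdictIDs queued for recomputation.
    """
    updated: List[VerdictVersion] = []
    to_recompute: List[str] = []
    for verdict in verdicts:
        uses_old_evidence = superseded_evidence_version_id in verdict.evidence_version_set
        if verdict.status == "current" and uses_old_evidence:
            updated.append(replace(verdict, status="outdated"))
            to_recompute.append(verdict.verdict_id)
        else:
            updated.append(verdict)
    return updated, to_recompute


verdicts = [
    VerdictVersion("vv-1", "vd-1", "sv-1", ("ev-1", "ev-2")),
    VerdictVersion("vv-2", "vd-2", "sv-2", ("ev-3",)),
]
updated, to_recompute = mark_outdated(verdicts, "ev-1")
```

The returned `to_recompute` list is what a re-evaluation engine would consume to create new verdict versions; notification is out of scope for this sketch.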
250 +
251 +=== Scenario Evolution ===
252 +
253 +When a scenario's assumptions or definitions change:
254 +
255 +* A new scenario version is created (no in-place update)
256 +* All dependent verdicts must be recalculated
257 +* Previous scenario versions remain accessible
258 +* Version lineage preserved
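A compact sketch of this behavior, using plain dictionaries: the new scenario version is added next to the old one, and the verdicts bound to the old `ScenarioVersionID` are listed for recalculation. The store layout, the helper names, and the sample assumptions are hypothetical.

```python
from typing import Dict, List


def evolve_scenario(
    versions: Dict[str, dict],
    current_id: str,
    new_id: str,
    assumptions: dict,
) -> Dict[str, dict]:
    """Return a new store that contains an additional scenario version.

    The previous version stays in the store unchanged.
    """
    current = versions[current_id]
    successor = {
        **current,
        "VersionID": new_id,
        "ParentVersionID": current_id,
        "Assumptions": assumptions,
    }
    return {**versions, new_id: successor}


def dependent_verdicts(verdict_to_scenario: Dict[str, str], scenario_version_id: str) -> List[str]:
    """List verdict versions that were evaluated against the given scenario version."""
    return sorted(v for v, s in verdict_to_scenario.items() if s == scenario_version_id)


versions = {
    "sv-1": {"VersionID": "sv-1", "ParentVersionID": None, "Assumptions": {"region": "EU"}},
}
evolved = evolve_scenario(versions, "sv-1", "sv-2", {"region": "EU", "year": 2024})
stale = dependent_verdicts({"vv-1": "sv-1", "vv-2": "sv-9"}, "sv-1")
```

Because verdicts reference a specific `ScenarioVersionID`, the lookup in `dependent_verdicts` identifies exactly the verdicts that were computed under the old assumptions.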
259 +
260 +=== Federated Nodes ===
261 +
262 +Each node may share partial data:
263 +
264 +* Claims and scenarios shared if relevant
265 +* Evidence metadata shared, not always full files
266 +* Version synchronization via NodeOrigin tracking
267 +* Branching allowed for divergent interpretations
268 +
269 +----
270 +
271 +== Visual Diagrams ==
272 +
273 +The following diagrams provide visual representations of the data model structure and relationships.
274 +
275 +=== Core Data Model ERD ===
276 +
277 +{{include reference="FactHarbor.Archive.FactHarbor V0\.9\.23 Lost Data.Specification.Diagrams.Core Data Model ERD.WebHome"/}}
278 +
279 +=== User Roles Structure ===
280 +
281 +{{include reference="Test.FactHarborV09.Specification.Diagrams.User Roles ERD.WebHome"/}}
282 +
283 +=== Content Workflow ===
284 +
285 +{{include reference="Test.FactHarborV09.Specification.Diagrams.Content Workflow ERD.WebHome"/}}
286 +
287 +----
288 +
289 +== Related Pages ==
290 +
291 +* [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]]
292 +* [[AKEL (AI Knowledge Extraction Layer)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
293 +* [[Architecture>>FactHarbor.Specification.Architecture.WebHome]]