Last modified by Robert Schaub on 2025/12/24 20:34

= Data Model =

This page describes the current data model for FactHarbor v0.9.1.

== 1. Versioning Strategy ==

Every entity in FactHarbor has a full immutable version history. This ensures:

* Complete auditability
* Ability to reconstruct historical state
* Federation-compatible lineage tracking
* Transparent evolution of claims, scenarios, and verdicts

=== 1.1 Core Versioning Principles ===

**Immutability**:

* Each version is stored independently
* Versions cannot be deleted, only superseded
* Historical versions remain accessible

**Lineage**:

* Each version links to its parent via `ParentVersionID`
* These links form a directed acyclic graph (DAG) of changes
* Supports branching in federated environments

**Provenance**:

* Every version is timestamped (`CreatedAt`)
* Author type is recorded (`AuthorType`: Human, AI, ExternalNode)
* Justification is captured (`JustificationText`)
* Digital signatures for integrity (`SignatureHash` in Release 1.0)

**Federation Support**:

* Versions can originate from remote nodes
* Conflict detection via lineage comparison
* Parallel version trees for branching scenarios
* Cross-node version synchronization
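
The immutability and lineage principles above can be illustrated with a minimal append-only store. This is only a sketch under stated assumptions: the `Version` and `VersionStore` classes, the `payload` field, and the in-memory dictionary are hypothetical and not part of the FactHarbor specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a stored version can never be mutated
class Version:
    version_id: str
    parent_version_id: Optional[str]  # None for the first version
    payload: str  # hypothetical stand-in for the entity's content


class VersionStore:
    """Append-only store: versions are added, never updated or deleted."""

    def __init__(self) -> None:
        self._versions: dict[str, Version] = {}

    def append(self, version: Version) -> None:
        if version.version_id in self._versions:
            raise ValueError(f"version {version.version_id} already exists")
        parent = version.parent_version_id
        # A parent must already be stored, which keeps the graph acyclic.
        if parent is not None and parent not in self._versions:
            raise KeyError(f"unknown parent {parent}")
        self._versions[version.version_id] = version

    def lineage(self, version_id: str) -> list[str]:
        """Return version IDs from the given version back to the root."""
        chain: list[str] = []
        current: Optional[str] = version_id
        while current is not None:
            chain.append(current)
            current = self._versions[current].parent_version_id
        return chain


store = VersionStore()
store.append(Version("v1", None, "original claim text"))
store.append(Version("v2", "v1", "edited on node A"))
store.append(Version("v3", "v1", "edited on node B"))  # a branch from v1
print(store.lineage("v2"))  # ['v2', 'v1']
```

Because `append` rejects duplicate IDs and unknown parents, every stored version points only at an older version, so the parent links cannot form a cycle.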

=== 1.2 Common Version Fields ===

All versioned entities include:

* **VersionID**: Unique identifier for this specific version
* **ParentVersionID**: Link to previous version (null for first version)
* **CreatedAt**: Timestamp (ISO 8601, UTC)
* **AuthorType**: Human | AI | ExternalNode
* **CreatedBy**: Foreign key to User or TechnicalUser
* **JustificationText**: Brief explanation of changes
* **PublicationMode**: Mode1 (draft) | Mode2 (AI-published) | Mode3 (human-reviewed)
* **ReviewStatus**: Workflow state (draft|in_review|approved|rejected)
* **NodeOrigin**: Node ID where version was created (for federation)
* **SignatureHash**: Cryptographic signature (Release 1.0)
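
As an illustration, the common fields listed above could be modelled as a single immutable record. The sketch below is an assumption about one possible shape: the Python names, the string ID types, and the `utc_now_iso` helper are hypothetical, and the example values are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class AuthorType(Enum):
    HUMAN = "Human"
    AI = "AI"
    EXTERNAL_NODE = "ExternalNode"


class PublicationMode(Enum):
    MODE1 = "Mode1"  # draft
    MODE2 = "Mode2"  # AI-published
    MODE3 = "Mode3"  # human-reviewed


class ReviewStatus(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass(frozen=True)
class VersionFields:
    version_id: str
    parent_version_id: Optional[str]  # None for the first version
    created_at: str  # ISO 8601, UTC
    author_type: AuthorType
    created_by: str  # ID of a User or TechnicalUser (assumed to be a string)
    justification_text: str
    publication_mode: PublicationMode
    review_status: ReviewStatus
    node_origin: str
    signature_hash: Optional[str] = None  # listed for Release 1.0


def utc_now_iso() -> str:
    """Hypothetical helper: current time as an ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


first = VersionFields(
    version_id="cv-0001",
    parent_version_id=None,
    created_at=utc_now_iso(),
    author_type=AuthorType.AI,
    created_by="akel-01",
    justification_text="Initial extraction",
    publication_mode=PublicationMode.MODE1,
    review_status=ReviewStatus.DRAFT,
    node_origin="node-a",
)
```

Using enums for the closed value sets means that an unknown state such as `ReviewStatus("pending")` fails immediately instead of being stored.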

== 2. Core Entity Definitions ==

=== 2.1 User Entities ===

**USER** (base user table):

* ``UserID`` (PK)
* ``UserType`` (Reader|Contributor|Reviewer|Auditor|Expert|Moderator|Maintainer)
* ``DisplayName``
* ``Email`` (for Contributors and above)
* ``RegisteredAt``
* ``LastActive``
* ``Status`` (active|suspended|banned)

**TECHNICAL_USER** (system processes):

* ``SystemID`` (PK)
* ``SystemName``
* ``Purpose`` (AKEL|FederationSync|BackupService|Monitor|Audit)
* ``CreatedBy`` (FK to Maintainer who created this system user)
* ``CreatedAt``
* ``Status`` (active|paused|deprecated)
* ``ApiKey`` (encrypted)
* ``Permissions`` (JSON - authorized operations)

**Examples of Technical Users**:

* AKEL instances (AI processing)
* Federation sync bots
* Scheduled audit tasks
* Backup services
* Monitoring systems
* External API integrations

=== 2.2 Content Entities ===

The system relies on the following versioned core entities:

**CLAIM_CLUSTER**:

* ``ClusterID`` (PK)
* ``EmbeddingVectorRef``
* ``Theme``
* Groups related claims into topical clusters
* One Cluster has many Claims
* A Claim belongs to exactly one primary cluster

**CLAIM / CLAIM_VERSION**:

* ``CLAIM`` is the long-lived anchor for a real-world claim
* ``CLAIM_VERSION`` is an immutable snapshot that includes:
* ``VersionID`` (PK)
* ``ClaimID`` (FK to CLAIM)
* ``ParentVersionID`` (FK to prior version, nullable)
* ``Text``
* ``Domain``
* ``ClaimType`` (literal|metaphorical|rhetorical|supernatural)
* ``Evaluability`` (empirical|subjective|non-falsifiable)
* ``RiskTier`` (A|B|C) - replaces SafetyCategory for consistency
* ``PublicationMode`` (Mode1|Mode2|Mode3)
* ``ReviewStatus`` (draft|in_review|approved|rejected)
* ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
* ``NodeOrigin``, ``SignatureHash``
* ``Status`` (active|superseded|merged)

**SCENARIO / SCENARIO_VERSION**:

* ``SCENARIO`` is the anchor for a scenario across time
* ``SCENARIO_VERSION`` is an immutable snapshot:
* ``VersionID`` (PK)
* ``ScenarioID`` (FK to SCENARIO)
* ``ParentVersionID``
* ``ClaimID`` (FK to CLAIM)
* ``Definitions`` (JSON)
* ``Boundaries`` (JSON)
* ``Assumptions`` (JSON)
* ``Context`` (text)
* ``EvaluationMethod`` (text)
* ``PublicationMode`` (Mode1|Mode2|Mode3)
* ``ReviewStatus`` (draft|in_review|approved|rejected)
* ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
* ``NodeOrigin``, ``SignatureHash``
* ``Status`` (active|superseded|deprecated)

**Note**: SafetyClass has been removed from Scenario; the risk tier is set at the claim level.

**EVIDENCE / EVIDENCE_VERSION**:

* ``EVIDENCE`` is the anchor
* ``EVIDENCE_VERSION`` is the versioned snapshot:
* ``VersionID`` (PK)
* ``EvidenceID`` (FK to EVIDENCE)
* ``ParentVersionID``
* ``Type`` (paper|dataset|report|transcript|expert|media)
* ``Category`` (empirical|historical|rhetorical|dataset|meta-analysis)
* ``Reliability`` (low|medium|high)
* ``Provenance`` (URL, DOI, source metadata)
* ``ExtractionMethod`` (manual|OCR|API|AKEL)
* ``ContentHash`` (SHA256 of evidence content)
* ``PublicationMode`` (Mode1|Mode2|Mode3)
* ``ReviewStatus`` (draft|verified|disputed|retracted)
* ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
* ``NodeOrigin``, ``SignatureHash``
* ``Status`` (active|superseded)
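
The ``ContentHash`` field can be computed with Python's standard `hashlib` module. The sketch below assumes that the hash covers the raw evidence bytes and is stored as a lowercase hex string; neither detail is stated on this page, and the function names are hypothetical.

```python
import hashlib


def content_hash(content: bytes) -> str:
    """SHA-256 of the evidence content as a hex string (assumed encoding)."""
    return hashlib.sha256(content).hexdigest()


def is_unmodified(content: bytes, stored_hash: str) -> bool:
    """Check fetched content against the hash stored with the version."""
    return content_hash(content) == stored_hash


original = b"Table 3: measured values, 2024 survey"
stored = content_hash(original)
print(is_unmodified(original, stored))         # True
print(is_unmodified(original + b" ", stored))  # False
```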

**VERDICT / VERDICT_VERSION**:

* ``VERDICT`` is the anchor
* ``VERDICT_VERSION`` is the snapshot:
* ``VersionID`` (PK)
* ``VerdictID`` (FK to VERDICT)
* ``ParentVersionID``
* ``ClaimID`` (FK to CLAIM)
* ``ScenarioVersionID`` (FK to specific SCENARIO_VERSION)
* ``EvidenceVersionSet`` (JSON array of Evidence VersionIDs used)
* ``LikelihoodRange`` (0–1, with uncertainty bounds)
* ``ExplanationChain`` (JSON)
* ``UncertaintyFactors`` (JSON)
* ``PublicationMode`` (Mode1|Mode2|Mode3)
* ``ReviewStatus`` (draft|in_review|approved|retracted)
* ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
* ``NodeOrigin``, ``SignatureHash``
* ``Status`` (current|outdated|superseded|retracted)
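
A verdict version pins the exact scenario version and evidence versions it was computed from. The sketch below shows one way to model this, assuming that ``LikelihoodRange`` is stored as a lower and an upper bound within 0 to 1; that representation and all Python names are assumptions, not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LikelihoodRange:
    lower: float
    upper: float

    def __post_init__(self) -> None:
        # Both bounds must lie in [0, 1] and be correctly ordered.
        if not (0.0 <= self.lower <= self.upper <= 1.0):
            raise ValueError("expected 0 <= lower <= upper <= 1")


@dataclass(frozen=True)
class VerdictVersion:
    version_id: str
    verdict_id: str
    claim_id: str
    scenario_version_id: str  # a specific SCENARIO_VERSION, not the anchor
    evidence_version_set: tuple[str, ...]  # Evidence VersionIDs used
    likelihood: LikelihoodRange
    status: str = "current"


verdict = VerdictVersion(
    version_id="vv-1",
    verdict_id="verdict-1",
    claim_id="claim-1",
    scenario_version_id="sv-3",
    evidence_version_set=("ev-10", "ev-12"),
    likelihood=LikelihoodRange(lower=0.6, upper=0.8),
)
```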

== 3. Many-to-Many Linking Tables ==

**ScenarioEvidenceLink**:

* Links scenario versions to evidence versions with relevance scoring
* ``ScenarioID``, ``ScenarioVersionID``
* ``EvidenceID``, ``EvidenceVersionID``
* ``RelevanceScore`` (0–1) - How relevant this evidence is to this scenario
* ``LinkJustification`` - Brief explanation of relevance

**Purpose**:

* Evidence can be used by multiple scenarios
* Scenarios can draw from multiple pieces of evidence
* Relevance scoring helps prioritize evidence
* Version-specific linking preserves historical accuracy
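
To show how relevance scoring can prioritize evidence, the following sketch selects the links for one scenario version and orders them by ``RelevanceScore``. The `evidence_for` function and its `min_score` parameter are hypothetical, and the sample links are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioEvidenceLink:
    scenario_version_id: str
    evidence_version_id: str
    relevance_score: float  # 0 to 1
    link_justification: str


def evidence_for(
    links: list[ScenarioEvidenceLink],
    scenario_version_id: str,
    min_score: float = 0.0,
) -> list[ScenarioEvidenceLink]:
    """Links for one scenario version, most relevant first."""
    matches = [
        link
        for link in links
        if link.scenario_version_id == scenario_version_id
        and link.relevance_score >= min_score
    ]
    return sorted(matches, key=lambda link: link.relevance_score, reverse=True)


links = [
    ScenarioEvidenceLink("sv-1", "ev-1", 0.4, "background only"),
    ScenarioEvidenceLink("sv-1", "ev-2", 0.9, "direct measurement"),
    ScenarioEvidenceLink("sv-2", "ev-2", 0.7, "same dataset, other scope"),
]
ranked = evidence_for(links, "sv-1")
print([link.evidence_version_id for link in ranked])  # ['ev-2', 'ev-1']
```

Note that `ev-2` appears under both scenario versions with different scores, which is the many-to-many case the table is meant to support.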

**ClaimCluster**:

* Semantic clustering of similar claims
* ``ClusterID`` (PK)
* ``EmbeddingVector`` - Vector representation for semantic search
* ``MemberList`` - List of ClaimIDs in this cluster
* ``Theme`` - Human-readable theme description
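
The page does not specify how a claim is matched to a cluster. As one plausible illustration, the sketch below picks the cluster whose ``EmbeddingVector`` has the highest cosine similarity to the claim's vector. The choice of cosine similarity, the function names, and the two-dimensional sample vectors are all assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_cluster(
    claim_vector: list[float], clusters: dict[str, list[float]]
) -> str:
    """Return the ClusterID whose embedding is most similar to the claim."""
    return max(
        clusters,
        key=lambda cluster_id: cosine_similarity(claim_vector, clusters[cluster_id]),
    )


clusters = {
    "c-energy": [1.0, 0.0],
    "c-health": [0.0, 1.0],
}
print(nearest_cluster([0.9, 0.1], clusters))  # c-energy
```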

== 4. Key Changes in v0.9.1 ==

**Updated Field Names**:

* `SafetyCategory` → `RiskTier` (consistency with risk tier system A/B/C)
* `SafetyClass` removed from Scenario (redundant with claim-level RiskTier)

**Added Fields to All Version Entities**:

* `PublicationMode` - Tracks Mode 1/2/3 status
* `ReviewStatus` - Tracks workflow state
* `NodeOrigin` - Federation provenance
* `CreatedBy` - FK to User/TechnicalUser (clarified)

**New Entity**:

* `TECHNICAL_USER` - Separates system processes from human users

**Clarifications**:

* `ScenarioVersionID` in Verdict (not just ScenarioID) - links to specific version
* `ContentHash` in Evidence - SHA256 for integrity checking

== 5. Data Model Behavior ==

=== 5.1 Late-Arriving Evidence ===

When new evidence versions appear:

1. Existing verdicts are marked as **outdated**
2. Scenario relevance must be re-evaluated
3. The re-evaluation engine triggers verdict recomputation
4. New verdict versions are created
5. Users are notified of updates
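
The first step of the sequence above can be sketched as follows. The sketch assumes that a verdict is affected when its ``EvidenceVersionSet`` contains the evidence version that has just been superseded, and that ``Status`` is mutable bookkeeping kept alongside the immutable snapshot; both are assumptions, and the names are hypothetical. Recomputation and notification are left out.

```python
from dataclasses import dataclass


@dataclass
class VerdictRecord:
    version_id: str
    evidence_version_set: frozenset[str]
    status: str = "current"  # current|outdated|superseded|retracted


def on_new_evidence_version(
    verdicts: list[VerdictRecord], superseded_evidence_version_id: str
) -> list[str]:
    """Mark affected current verdicts as outdated and return their IDs.

    The returned IDs would be handed to the re-evaluation engine.
    """
    to_recompute: list[str] = []
    for verdict in verdicts:
        uses_old_version = (
            superseded_evidence_version_id in verdict.evidence_version_set
        )
        if verdict.status == "current" and uses_old_version:
            verdict.status = "outdated"
            to_recompute.append(verdict.version_id)
    return to_recompute


verdicts = [
    VerdictRecord("vv-1", frozenset({"ev-1", "ev-2"})),
    VerdictRecord("vv-2", frozenset({"ev-3"})),
]
queue = on_new_evidence_version(verdicts, "ev-1")
print(queue)  # ['vv-1']
```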

=== 5.2 Scenario Evolution ===

When a scenario's assumptions or definitions change:

* A new scenario version is created (no in-place update)
* All dependent verdicts must be recalculated
* Previous scenario versions remain accessible
* Version lineage is preserved
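
A minimal sketch of this behavior, assuming assumptions are stored as a tuple of strings and that verdicts are indexed by the scenario version they reference. The `revise_assumptions` and `dependent_verdicts` helpers are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ScenarioVersion:
    version_id: str
    scenario_id: str
    parent_version_id: Optional[str]
    assumptions: tuple[str, ...]


def revise_assumptions(
    current: ScenarioVersion, new_version_id: str, assumptions: list[str]
) -> ScenarioVersion:
    """Create a child version; the current version is left untouched."""
    return replace(
        current,
        version_id=new_version_id,
        parent_version_id=current.version_id,
        assumptions=tuple(assumptions),
    )


def dependent_verdicts(
    verdict_to_scenario_version: dict[str, str], scenario_version_id: str
) -> list[str]:
    """Verdict version IDs that reference the given scenario version."""
    return sorted(
        verdict_id
        for verdict_id, used_version in verdict_to_scenario_version.items()
        if used_version == scenario_version_id
    )


sv1 = ScenarioVersion("sv-1", "scenario-1", None, ("adults only",))
sv2 = revise_assumptions(sv1, "sv-2", ["adults only", "EU data"])
index = {"vv-1": "sv-1", "vv-2": "sv-1", "vv-3": "sv-0"}
print(dependent_verdicts(index, "sv-1"))  # ['vv-1', 'vv-2']
```

Here `sv1` still holds its original assumptions after the revision, and the verdicts returned for `sv-1` are the ones that would need recalculation against `sv-2`.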

=== 5.3 Federated Nodes ===

Each node may share partial data:

* Claims and scenarios are shared if relevant
* Evidence metadata is shared, but not always the full files
* Version synchronization via NodeOrigin tracking
* Branching allowed for divergent interpretations
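
Divergence between two nodes can be detected by comparing lineages. The sketch below assumes each node knows the parent of every version involved and classifies a pair of head versions; the result labels (`in_sync`, `remote_ahead`, `local_ahead`, `diverged`) are hypothetical and not defined by this specification.

```python
from typing import Optional


def ancestors(parents: dict[str, Optional[str]], version_id: str) -> list[str]:
    """The version itself followed by all of its ancestors up to the root."""
    chain: list[str] = []
    current: Optional[str] = version_id
    while current is not None:
        chain.append(current)
        current = parents[current]
    return chain


def compare_heads(
    parents: dict[str, Optional[str]], local_head: str, remote_head: str
) -> str:
    """Classify how the local and remote head versions relate."""
    if local_head == remote_head:
        return "in_sync"
    if local_head in ancestors(parents, remote_head):
        return "remote_ahead"  # remote only added versions on top of ours
    if remote_head in ancestors(parents, local_head):
        return "local_ahead"
    return "diverged"  # both sides changed: keep as parallel branches


# VersionID -> ParentVersionID
parents = {"v1": None, "v2": "v1", "v3": "v1", "v4": "v2"}
print(compare_heads(parents, "v2", "v4"))  # remote_ahead
print(compare_heads(parents, "v2", "v3"))  # diverged
```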

== 6. Visual Diagrams ==

The following diagrams provide visual representations of the data model structure and relationships.

=== 6.1 Core Data Model ERD ===

{{include reference="Archive.FactHarbor V0\.9\.23 Lost Data.Specification.Diagrams.Core Data Model ERD.WebHome"}}

=== 6.2 User Roles Structure ===

{{include reference="Test.FactHarborV09.Specification.Diagrams.User Roles ERD.WebHome"}}

=== 6.3 Content Workflow ===

{{include reference="Test.FactHarborV09.Specification.Diagrams.Content Workflow ERD.WebHome"}}

== 7. Related Pages ==

* [[Federation & Decentralization>>Test.FactHarborV09.Specification.Federation & Decentralization.WebHome]]
* [[AKEL (AI Knowledge Extraction Layer)>>Test.FactHarborV09.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
* [[Architecture>>Test.FactHarborV09.Specification.Architecture.WebHome]]
{{/include}}
{{/include}}
{{/include}}