Changes for page Data Model

Last modified by Robert Schaub on 2025/12/24 20:34

From version 6.4
edited by Robert Schaub
on 2025/12/16 20:28
Change comment: Renamed back-links.
To version 5.1
edited by Robert Schaub
on 2025/12/14 22:27
Change comment: Imported from XAR

Page properties
Parent
... ... @@ -1,1 +1,1 @@
1 -FactHarbor.Archive.FactHarbor V0\.9\.18.Specification.WebHome
1 +FactHarbor.Specification.WebHome
Content
... ... @@ -1,11 +1,10 @@
1 1  = Data Model =
2 2  
3 -This page describes the current data model for FactHarbor v0.9.1.
3 +This page describes the current data model for FactHarbor.
4 4  
5 5  == Versioning Strategy ==
6 6  
7 7  Every entity in FactHarbor has a full immutable version history. This ensures:
8 -
9 9  * Complete auditability
10 10  * Ability to reconstruct historical state
11 11  * Federation-compatible lineage tracking
... ... @@ -14,19 +14,16 @@
14 14  === Core Versioning Principles ===
15 15  
16 16  **Immutability**:
17 -
18 18  * Each version is stored independently
19 19  * Versions cannot be deleted, only superseded
20 20  * Historical versions remain accessible
21 21  
22 22  **Lineage**:
23 -
24 24  * Each version links to its parent via `ParentVersionID`
25 25  * Forms a directed acyclic graph (DAG) of changes
26 26  * Supports branching in federated environments
27 27  
28 28  **Provenance**:
29 -
30 30  * Every version timestamped (`CreatedAt`)
31 31  * Author type recorded (`AuthorType`: Human, AI, ExternalNode)
32 32  * Justification captured (`JustificationText`)
... ... @@ -33,7 +33,6 @@
33 33  * Digital signatures for integrity (`SignatureHash` in Release 1.0)
34 34  
35 35  **Federation Support**:
36 -
37 37  * Versions can originate from remote nodes
38 38  * Conflict detection via lineage comparison
39 39  * Parallel version trees for branching scenarios
... ... @@ -47,193 +47,118 @@
47 47  * **ParentVersionID**: Link to previous version (null for first version)
48 48  * **CreatedAt**: Timestamp (ISO 8601, UTC)
49 49  * **AuthorType**: Human | AI | ExternalNode
50 -* **CreatedBy**: Foreign key to User or TechnicalUser
51 51  * **JustificationText**: Brief explanation of changes
52 -* **PublicationMode**: Mode1 (draft) | Mode2 (AI-published) | Mode3 (human-reviewed)
53 -* **ReviewStatus**: Workflow state (draft|in_review|approved|rejected)
54 -* **NodeOrigin**: Node ID where version was created (for federation)
55 55  * **SignatureHash**: Cryptographic signature (Release 1.0)
56 56  
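The common version fields listed above can be sketched as an immutable record. This is a minimal illustration, not the FactHarbor schema: the class name `VersionRecord`, the helper `new_version`, and the snake_case attribute names are assumptions made for the example; only the field meanings come from the list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class AuthorType(Enum):
    HUMAN = "Human"
    AI = "AI"
    EXTERNAL_NODE = "ExternalNode"


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class VersionRecord:
    version_id: str
    parent_version_id: Optional[str]      # None for the first version
    created_at: str                       # ISO 8601 timestamp in UTC
    author_type: AuthorType
    justification_text: str
    signature_hash: Optional[str] = None  # planned for Release 1.0


def new_version(version_id: str, parent_version_id: Optional[str],
                author_type: AuthorType, justification_text: str) -> VersionRecord:
    """Hypothetical helper: stamp a new version with the current UTC time."""
    created_at = datetime.now(timezone.utc).isoformat()
    return VersionRecord(version_id, parent_version_id, created_at,
                         author_type, justification_text)
```

A frozen dataclass is used here only to mirror the immutability principle; a real store would enforce it at the persistence layer.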
57 57  ----
58 58  
59 -== Core Entity Definitions ==
50 +== Core Data Model Refinements ==
60 60  
61 -=== User Entities ===
62 -
63 -**USER** (base user table):
64 -
65 -* ``UserID`` (PK)
66 -* ``UserType`` (Reader|Contributor|Reviewer|Auditor|Expert|Moderator|Maintainer)
67 -* ``DisplayName``
68 -* ``Email`` (for Contributors and above)
69 -* ``RegisteredAt``
70 -* ``LastActive``
71 -* ``Status`` (active|suspended|banned)
72 -
73 -**TECHNICAL_USER** (system processes):
74 -
75 -* ``SystemID`` (PK)
76 -* ``SystemName``
77 -* ``Purpose`` (AKEL|FederationSync|BackupService|Monitor|Audit)
78 -* ``CreatedBy`` (FK to Maintainer who created this system user)
79 -* ``CreatedAt``
80 -* ``Status`` (active|paused|deprecated)
81 -* ``ApiKey`` (encrypted)
82 -* ``Permissions`` (JSON - authorized operations)
83 -
84 -**Examples of Technical Users**:
85 -
86 -* AKEL instances (AI processing)
87 -* Federation sync bots
88 -* Scheduled audit tasks
89 -* Backup services
90 -* Monitoring systems
91 -* External API integrations
92 -
93 -----
94 -
95 -=== Content Entities ===
96 -
97 97  The system relies on the following versioned core entities:
98 98  
99 -**CLAIM_CLUSTER**:
54 +* **CLAIM_CLUSTER**
55 +** ``ClusterID`` (PK), ``EmbeddingVectorRef``, ``Theme``
56 +** Groups related claims into topical clusters.
57 +** One Cluster has many Claims.
58 +** A Claim belongs to exactly one primary cluster.
100 100  
101 -* ``ClusterID`` (PK)
102 -* ``EmbeddingVectorRef``
103 -* ``Theme``
104 -* Groups related claims into topical clusters
105 -* One Cluster has many Claims
106 -* A Claim belongs to exactly one primary cluster
60 +* **CLAIM / CLAIM_VERSION**
61 +** ``CLAIM`` is the long‑lived anchor for a real‑world claim.
62 +** ``CLAIM_VERSION`` is an immutable snapshot that includes:
63 +*** ``ClaimID`` (FK to CLAIM)
64 +*** ``VersionID`` (PK)
65 +*** ``ParentVersionID`` (FK to prior version, nullable)
66 +*** ``Text``
67 +*** ``Domain``
68 +*** ``ClaimType`` (literal, metaphorical, rhetorical, supernatural...)
69 +*** ``Evaluability`` (empirical, subjective, non-falsifiable)
70 +*** ``SafetyCategory`` (low, medium, high)
71 +*** ``CreatedAt``, ``AuthorType``, ``JustificationText``
72 +*** ``Status`` (active, superseded, merged)
107 107  
108 -**CLAIM / CLAIM_VERSION**:
74 +* **SCENARIO / SCENARIO_VERSION**
75 +** ``SCENARIO`` is the anchor for a scenario across time.
76 +** ``SCENARIO_VERSION`` is an immutable snapshot:
77 +*** ``ScenarioID`` (FK to SCENARIO)
78 +*** ``VersionID`` (PK)
79 +*** ``ParentVersionID``
80 +*** ``ClaimID`` (FK to CLAIM)
81 +*** ``Definitions``
82 +*** ``Boundaries``
83 +*** ``Assumptions``
84 +*** ``Context``
85 +*** ``EvaluationMethod``
86 +*** ``SafetyClass``
87 +*** ``CreatedAt``, ``AuthorType``, ``JustificationText``
88 +*** ``Status`` (active, superseded, deprecated)
109 109  
110 -* ``CLAIM`` is the long-lived anchor for a real-world claim
111 -* ``CLAIM_VERSION`` is an immutable snapshot that includes:
112 -* ``VersionID`` (PK)
113 -* ``ClaimID`` (FK to CLAIM)
114 -* ``ParentVersionID`` (FK to prior version, nullable)
115 -* ``Text``
116 -* ``Domain``
117 -* ``ClaimType`` (literal|metaphorical|rhetorical|supernatural)
118 -* ``Evaluability`` (empirical|subjective|non-falsifiable)
119 -* ``RiskTier`` (A|B|C) - replaced SafetyCategory for consistency
120 -* ``PublicationMode`` (Mode1|Mode2|Mode3)
121 -* ``ReviewStatus`` (draft|in_review|approved|rejected)
122 -* ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
123 -* ``NodeOrigin``, ``SignatureHash``
124 -* ``Status`` (active|superseded|merged)
90 +* **EVIDENCE / EVIDENCE_VERSION**
91 +** ``EVIDENCE`` is the anchor.
92 +** ``EVIDENCE_VERSION`` is the versioned snapshot:
93 +*** ``EvidenceID`` (FK to EVIDENCE)
94 +*** ``VersionID`` (PK)
95 +*** ``ParentVersionID``
96 +*** ``Type`` (paper, dataset, report, transcript, expert...)
97 +*** ``Category`` (empirical, historical, rhetorical, dataset, meta-analysis...)
98 +*** ``Reliability`` (low/med/high)
99 +*** ``Provenance`` (URL, DOI, source metadata)
100 +*** ``ExtractionMethod`` (manual, OCR, API, AKEL)
101 +*** ``CreatedAt``, ``AuthorType``, ``JustificationText``
102 +*** ``Status`` (verified, updated, disputed, retracted, superseded)
125 125  
126 -**SCENARIO / SCENARIO_VERSION**:
104 +* **VERDICT / VERDICT_VERSION**
105 +** ``VERDICT`` is the anchor.
106 +** ``VERDICT_VERSION`` is the snapshot:
107 +*** ``VerdictID`` (FK to VERDICT)
108 +*** ``VersionID`` (PK)
109 +*** ``ParentVersionID``
110 +*** ``ClaimID`` (FK to CLAIM)
111 +*** ``ScenarioID`` (FK to SCENARIO)
112 +*** ``EvidenceVersionSet`` (list of evidence version IDs used)
113 +*** ``LikelihoodRange`` (0–1, with uncertainty bounds)
114 +*** ``ExplanationChain``
115 +*** ``UncertaintyFactors``
116 +*** ``CreatedAt``, ``AuthorType``, ``JustificationText``
117 +*** ``Status`` (current, outdated, superseded, retracted)
127 127  
128 -* ``SCENARIO`` is the anchor for a scenario across time
129 -* ``SCENARIO_VERSION`` is an immutable snapshot:
130 -* ``VersionID`` (PK)
131 -* ``ScenarioID`` (FK to SCENARIO)
132 -* ``ParentVersionID``
133 -* ``ClaimID`` (FK to CLAIM)
134 -* ``Definitions`` (JSON)
135 -* ``Boundaries`` (JSON)
136 -* ``Assumptions`` (JSON)
137 -* ``Context`` (text)
138 -* ``EvaluationMethod`` (text)
139 -* ``PublicationMode`` (Mode1|Mode2|Mode3)
140 -* ``ReviewStatus`` (draft|in_review|approved|rejected)
141 -* ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
142 -* ``NodeOrigin``, ``SignatureHash``
143 -* ``Status`` (active|superseded|deprecated)
144 -
145 -**Note**: SafetyClass removed from Scenario - risk tier is at claim level
146 -
147 -**EVIDENCE / EVIDENCE_VERSION**:
148 -
149 -* ``EVIDENCE`` is the anchor
150 -* ``EVIDENCE_VERSION`` is the versioned snapshot:
151 -* ``VersionID`` (PK)
152 -* ``EvidenceID`` (FK to EVIDENCE)
153 -* ``ParentVersionID``
154 -* ``Type`` (paper|dataset|report|transcript|expert|media)
155 -* ``Category`` (empirical|historical|rhetorical|dataset|meta-analysis)
156 -* ``Reliability`` (low|medium|high)
157 -* ``Provenance`` (URL, DOI, source metadata)
158 -* ``ExtractionMethod`` (manual|OCR|API|AKEL)
159 -* ``ContentHash`` (SHA256 of evidence content)
160 -* ``PublicationMode`` (Mode1|Mode2|Mode3)
161 -* ``ReviewStatus`` (draft|verified|disputed|retracted)
162 -* ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
163 -* ``NodeOrigin``, ``SignatureHash``
164 -* ``Status`` (active|superseded)
165 -
166 -**VERDICT / VERDICT_VERSION**:
167 -
168 -* ``VERDICT`` is the anchor
169 -* ``VERDICT_VERSION`` is the snapshot:
170 -* ``VersionID`` (PK)
171 -* ``VerdictID`` (FK to VERDICT)
172 -* ``ParentVersionID``
173 -* ``ClaimID`` (FK to CLAIM)
174 -* ``ScenarioVersionID`` (FK to specific SCENARIO_VERSION)
175 -* ``EvidenceVersionSet`` (JSON array of Evidence VersionIDs used)
176 -* ``LikelihoodRange`` (0–1, with uncertainty bounds)
177 -* ``ExplanationChain`` (JSON)
178 -* ``UncertaintyFactors`` (JSON)
179 -* ``PublicationMode`` (Mode1|Mode2|Mode3)
180 -* ``ReviewStatus`` (draft|in_review|approved|retracted)
181 -* ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
182 -* ``NodeOrigin``, ``SignatureHash``
183 -* ``Status`` (current|outdated|superseded|retracted)
184 -
185 185  ----
186 186  
187 187  == Many-to-Many Linking Tables ==
188 188  
189 -**ScenarioEvidenceLink**:
123 +=== ScenarioEvidenceLink ===
190 190  
191 -* Links scenario versions to evidence versions with relevance scoring
192 -* ``ScenarioID``, ``ScenarioVersionID``
193 -* ``EvidenceID``, ``EvidenceVersionID``
125 +Links scenario versions to evidence versions with relevance scoring.
126 +
127 +**Fields**:
128 +* ``ScenarioID``
129 +* ``ScenarioVersionID``
130 +* ``EvidenceID``
131 +* ``EvidenceVersionID``
194 194  * ``RelevanceScore`` (0–1) - How relevant this evidence is to this scenario
195 195  * ``LinkJustification`` - Brief explanation of relevance
196 196  
197 197  **Purpose**:
198 -
199 199  * Evidence can be used by multiple scenarios
200 200  * Scenarios can draw from multiple pieces of evidence
201 201  * Relevance scoring helps prioritize evidence
202 202  * Version-specific linking preserves historical accuracy
203 203  
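As a rough sketch of how relevance scoring could prioritize evidence for one scenario version, the link table can be modeled in memory as below. The function name `evidence_for_scenario_version` and the range check are assumptions for illustration; the field names follow the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScenarioEvidenceLink:
    scenario_id: str
    scenario_version_id: str
    evidence_id: str
    evidence_version_id: str
    relevance_score: float   # 0-1, higher means more relevant
    link_justification: str

    def __post_init__(self) -> None:
        # Reject scores outside the documented 0-1 range.
        if not 0.0 <= self.relevance_score <= 1.0:
            raise ValueError("relevance_score must be between 0 and 1")


def evidence_for_scenario_version(links: List[ScenarioEvidenceLink],
                                  scenario_version_id: str) -> List[ScenarioEvidenceLink]:
    """Return the links of one scenario version, most relevant first."""
    matching = [link for link in links
                if link.scenario_version_id == scenario_version_id]
    return sorted(matching, key=lambda link: link.relevance_score, reverse=True)
```

Because each link names both a scenario version and an evidence version, a later evidence update adds new links instead of rewriting old ones, which keeps historical verdicts reproducible.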
204 -**ClaimCluster**:
141 +=== ClaimCluster ===
205 205  
206 -* Semantic clustering of similar claims
143 +Semantic clustering of similar claims.
144 +
145 +**Fields**:
207 207  * ``ClusterID`` (PK)
208 208  * ``EmbeddingVector`` - Vector representation for semantic search
209 209  * ``MemberList`` - List of ClaimIDs in this cluster
210 210  * ``Theme`` - Human-readable theme description
211 211  
212 -----
151 +**Purpose**:
152 +* Groups semantically similar claims
153 +* Enables efficient search and discovery
154 +* Supports cross-node claim alignment
155 +* Reduces duplication
213 213  
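The page does not specify how claims are assigned to clusters. One common approach, shown here purely as an assumption, is to compare a claim's embedding with one representative vector per cluster using cosine similarity. The function names and the 0.8 threshold are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_cluster(claim_vector: Sequence[float],
                    clusters: Dict[str, Sequence[float]],
                    threshold: float = 0.8) -> Optional[str]:
    """Return the ClusterID whose vector is most similar to the claim,
    or None when no cluster reaches the (assumed) similarity threshold."""
    best_id: Optional[str] = None
    best_score = threshold
    for cluster_id, cluster_vector in clusters.items():
        score = cosine_similarity(claim_vector, cluster_vector)
        if score >= best_score:
            best_id, best_score = cluster_id, score
    return best_id
```

Returning `None` leaves room for creating a new cluster when a claim matches nothing well, which is one way to keep a claim in exactly one primary cluster.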
214 -== Key Changes in v0.9.1 ==
215 -
216 -**Updated Field Names**:
217 -
218 -* `SafetyCategory` → `RiskTier` (consistency with risk tier system A/B/C)
219 -* `SafetyClass` removed from Scenario (redundant with claim-level RiskTier)
220 -
221 -**Added Fields to All Version Entities**:
222 -
223 -* `PublicationMode` - Track Mode 1/2/3 status
224 -* `ReviewStatus` - Track workflow state
225 -* `NodeOrigin` - Federation provenance
226 -* `CreatedBy` - FK to User/TechnicalUser (clarified)
227 -
228 -**New Entity**:
229 -
230 -* `TECHNICAL_USER` - Separate system processes from human users
231 -
232 -**Clarifications**:
233 -
234 -* `ScenarioVersionID` in Verdict (not just ScenarioID) - links to specific version
235 -* `ContentHash` in Evidence - SHA256 for integrity checking
236 -
237 237  ----
238 238  
239 239  == Data Model Behavior ==
... ... @@ -248,46 +248,99 @@
248 248  4. New verdict versions created
249 249  5. Users notified of updates
250 250  
171 +**Process**:
172 +* New EvidenceVersion imported
173 +* System scans related ScenarioEvidenceLinks
174 +* Checks if evidence affects existing verdicts
175 +* Queues affected verdicts for re-evaluation
176 +* AKEL or reviewer creates new VerdictVersion
177 +* Old verdicts remain accessible (historical record)
178 +
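The process above can be sketched as a lookup from an evidence item to the verdicts that depend on it. The dictionary keys reuse the field names from this page; the function name and the in-memory lists are assumptions, and a real implementation would query the database instead.

```python
from typing import Dict, List


def verdicts_to_reevaluate(evidence_id: str,
                           links: List[Dict[str, str]],
                           verdicts: List[Dict[str, str]]) -> List[str]:
    """Return VersionIDs of current verdicts whose scenario is linked to
    the evidence item that just received a new version."""
    # Step 1: scan ScenarioEvidenceLinks for scenarios that use this evidence.
    affected_scenarios = {link["ScenarioID"] for link in links
                          if link["EvidenceID"] == evidence_id}
    # Step 2: queue only verdicts that are still current; older verdict
    # versions stay untouched as the historical record.
    return [verdict["VersionID"] for verdict in verdicts
            if verdict["ScenarioID"] in affected_scenarios
            and verdict["Status"] == "current"]
```

The returned IDs form the re-evaluation queue; AKEL or a reviewer then creates a new VerdictVersion for each entry rather than editing the old one.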
251 251  === Scenario Evolution ===
252 252  
253 253  When a scenario's assumptions or definitions change:
254 254  
255 -* Creates new scenario version (not in-place update)
183 +**Creates new scenario version** (not in-place update):
184 +* New ScenarioVersion with updated fields
185 +* ParentVersionID points to previous version
256 256  * All dependent verdicts must be recalculated
257 257  * Previous scenario versions remain accessible
258 -* Version lineage preserved
259 259  
189 +**Triggers**:
190 +* Refined definitions
191 +* Changed assumptions
192 +* Expanded or narrowed boundaries
193 +* Updated evaluation methods
194 +* Safety classification changes
195 +
196 +**Impact**:
197 +* Verdicts based on old scenario version remain valid (historical)
198 +* New verdicts required for new scenario version
199 +* Users can compare old vs new scenarios
200 +* Evidence links may need re-assessment
201 +
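A small sketch of "new version, not in-place update": the previous scenario version is left untouched and a copy with the changed fields is created, with `ParentVersionID` pointing back. The class is trimmed to a few fields, and the `evolve` helper is a hypothetical name.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ScenarioVersion:
    version_id: str
    scenario_id: str
    parent_version_id: Optional[str]
    assumptions: Tuple[str, ...]
    boundaries: str


def evolve(previous: ScenarioVersion, new_version_id: str, **changes) -> ScenarioVersion:
    """Create a successor version; `previous` itself is never modified."""
    return replace(previous,
                   version_id=new_version_id,
                   parent_version_id=previous.version_id,
                   **changes)
```

For example, `evolve(v1, "sv2", assumptions=("A", "B"))` yields a second version whose parent is `v1`, while verdicts that reference `v1` continue to point at unchanged data.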
260 260  === Federated Nodes ===
261 261  
262 262  Each node may share partial data:
263 263  
264 -* Claims and scenarios shared if relevant
265 -* Evidence metadata shared, not always full files
266 -* Version synchronization via NodeOrigin tracking
267 -* Branching allowed for divergent interpretations
206 +**Claims and scenarios**: Shared if relevant to node's domain
268 268  
269 -----
208 +**Evidence metadata**: Shared, but not always full evidence files
270 270  
271 -== Visual Diagrams ==
210 +**Verdict lineage**: Shared only if not locally overridden
272 272  
273 -The following diagrams provide visual representations of the data model structure and relationships.
212 +**Version synchronization**:
213 +* Remote versions imported with provenance metadata
214 +* Conflicts detected via ParentVersionID comparison
215 +* Branching allowed for divergent interpretations
216 +* Local node retains authority over local versions
274 274  
275 -=== Core Data Model ERD ===
218 +**Trust and acceptance**:
219 +* Trusted nodes: auto-import versions
220 +* Neutral nodes: import but flag for review
221 +* Untrusted nodes: manual import only
276 276  
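The three trust levels map directly onto import behavior. The enum and function names below are illustrative assumptions; only the three rules come from the list above.

```python
from enum import Enum


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


class ImportAction(Enum):
    AUTO_IMPORT = "auto-import versions"
    IMPORT_AND_FLAG = "import but flag for review"
    MANUAL_ONLY = "manual import only"


# One rule per trust level, as listed under "Trust and acceptance".
IMPORT_POLICY = {
    TrustLevel.TRUSTED: ImportAction.AUTO_IMPORT,
    TrustLevel.NEUTRAL: ImportAction.IMPORT_AND_FLAG,
    TrustLevel.UNTRUSTED: ImportAction.MANUAL_ONLY,
}


def import_action(trust: TrustLevel) -> ImportAction:
    """Look up how a remote version from a node with this trust level is handled."""
    return IMPORT_POLICY[trust]
```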
277 -{{include reference="FactHarbor.Archive.FactHarbor V0\.9\.23 Lost Data.Specification.Diagrams.Core Data Model ERD.WebHome"/}}
223 +----
278 278  
279 -=== User Roles Structure ===
225 +== Entity-Relationship Overview ==
280 280  
281 -{{include reference="Test.FactHarborV09.Specification.Diagrams.User Roles ERD.WebHome"/}}
227 +**Core relationships**:
282 282  
283 -=== Content Workflow ===
229 +```
230 +CLAIM_CLUSTER (1) ──< (N) CLAIM
231 +CLAIM (1) ──< (N) CLAIM_VERSION
232 +CLAIM (1) ──< (N) SCENARIO
233 +SCENARIO (1) ──< (N) SCENARIO_VERSION
234 +SCENARIO_VERSION (N) ──< (N) EVIDENCE_VERSION [via ScenarioEvidenceLink]
235 +SCENARIO_VERSION (1) ──< (N) VERDICT_VERSION
236 +VERDICT_VERSION references specific EvidenceVersionSet
237 +```
284 284  
285 -{{include reference="Test.FactHarborV09.Specification.Diagrams.Content Workflow ERD.WebHome"/}}
239 +**Version chains**:
286 286  
241 +Each entity has a version DAG:
242 +```
243 +Version 1 (ParentVersionID=null)
244 + ↓
245 +Version 2 (ParentVersionID=1)
246 + ↓
247 +Version 3 (ParentVersionID=2)
248 +```
249 +
250 +In federated environments, branching may occur:
251 +```
252 +Version 1
253 + ↓
254 +Version 2
255 + ↓     ↓
256 +V3a   V3b (parallel branches from different nodes)
257 +```
258 +
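Branching of the kind drawn above can be detected by grouping versions by their `ParentVersionID`: any parent with more than one child is a branch point. The sketch below assumes versions are available as a plain mapping from VersionID to ParentVersionID; both function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def find_branch_points(parents: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Map each version with several children to the sorted list of those children."""
    children = defaultdict(list)
    for version_id, parent_id in parents.items():
        if parent_id is not None:
            children[parent_id].append(version_id)
    return {parent: sorted(kids) for parent, kids in children.items() if len(kids) > 1}


def lineage(version_id: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Walk ParentVersionID links from a version back to the first version."""
    chain = [version_id]
    while parents[chain[-1]] is not None:
        chain.append(parents[chain[-1]])
    return chain
```

Comparing the `lineage` of two versions shows where they diverge, which is one possible way to implement conflict detection via lineage comparison.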
287 287  ----
288 288  
289 -== Related Pages ==
261 +## Related Pages ==
290 290  
291 291  * [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]]
292 292  * [[AKEL (AI Knowledge Extraction Layer)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
293 293  * [[Architecture>>FactHarbor.Specification.Architecture.WebHome]]
266 +