Changes for page Data Model

Last modified by Robert Schaub on 2025/12/24 20:34

From version 3.1
edited by Robert Schaub
on 2025/12/11 21:35
Change comment: Imported from XAR
To version 6.1
edited by Robert Schaub
on 2025/12/15 16:56
Change comment: Imported from XAR

Page properties
Content
... ... @@ -1,38 +1,271 @@
1 1  = Data Model =
2 2  
3 -This page describes the current data model for FactHarbor.
3 +This page describes the current data model for FactHarbor v0.9.1.
4 4  
5 -== Core Data Model Refinements ==
5 +== Versioning Strategy ==
6 6  
7 +Every entity in FactHarbor has a full immutable version history. This ensures:
8 +* Complete auditability
9 +* Ability to reconstruct historical state
10 +* Federation-compatible lineage tracking
11 +* Transparent evolution of claims, scenarios, and verdicts
12 +
13 +=== Core Versioning Principles ===
14 +
15 +**Immutability**:
16 +* Each version is stored independently
17 +* Versions cannot be deleted, only superseded
18 +* Historical versions remain accessible
19 +
20 +**Lineage**:
21 +* Each version links to its parent via `ParentVersionID`
22 +* Forms a directed acyclic graph (DAG) of changes
23 +* Supports branching in federated environments
24 +
25 +**Provenance**:
26 +* Every version is timestamped (`CreatedAt`)
27 +* The author type is recorded (`AuthorType`: Human, AI, ExternalNode)
28 +* A justification is captured (`JustificationText`)
29 +* Digital signatures for integrity (`SignatureHash` in Release 1.0)
30 +
31 +**Federation Support**:
32 +* Versions can originate from remote nodes
33 +* Conflict detection via lineage comparison
34 +* Parallel version trees for branching scenarios
35 +* Cross-node version synchronization
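
The lineage and conflict-detection principles above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory mapping from ``VersionID`` to ``ParentVersionID``; the function names and the sample IDs are illustrative and not part of the FactHarbor specification.

```python
from typing import Dict, List, Optional

# Hypothetical store: VersionID -> ParentVersionID (None for a first version).
Parents = Dict[str, Optional[str]]


def lineage(version_id: str, parents: Parents) -> List[str]:
    """Return the chain from version_id back to the first version."""
    chain: List[str] = []
    current: Optional[str] = version_id
    while current is not None:
        if current in chain:
            # A parent chain must never loop back on itself.
            raise ValueError(f"cycle detected at {current}")
        chain.append(current)
        current = parents[current]
    return chain


def common_ancestor(a: str, b: str, parents: Parents) -> Optional[str]:
    """Return the most recent version shared by both lineages, or None."""
    seen = set(lineage(a, parents))
    for version_id in lineage(b, parents):
        if version_id in seen:
            return version_id
    return None


def has_diverged(a: str, b: str, parents: Parents) -> bool:
    """Treat two versions as a branch when neither is an ancestor of the other."""
    return a not in lineage(b, parents) and b not in lineage(a, parents)


# Two nodes each created a successor of v2, so the tree branches there.
parents: Parents = {"v1": None, "v2": "v1", "v3-local": "v2", "v3-remote": "v2"}
print(common_ancestor("v3-local", "v3-remote", parents))
print(has_diverged("v3-local", "v3-remote", parents))
```

Comparing lineages this way only detects that a branch exists; how a node resolves it is outside the scope of this sketch.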
36 +
37 +=== Common Version Fields ===
38 +
39 +All versioned entities include:
40 +
41 +* **VersionID**: Unique identifier for this specific version
42 +* **ParentVersionID**: Link to previous version (null for first version)
43 +* **CreatedAt**: Timestamp (ISO 8601, UTC)
44 +* **AuthorType**: Human | AI | ExternalNode
45 +* **CreatedBy**: Foreign key to User or TechnicalUser
46 +* **JustificationText**: Brief explanation of changes
47 +* **PublicationMode**: Mode1 (draft) | Mode2 (AI-published) | Mode3 (human-reviewed)
48 +* **ReviewStatus**: Workflow state (draft|in_review|approved|rejected)
49 +* **NodeOrigin**: Node ID where version was created (for federation)
50 +* **SignatureHash**: Cryptographic signature (Release 1.0)
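
As a rough sketch, the common version fields could be modelled as a frozen record so that a stored version cannot be changed in place. The Python types, the snake_case attribute names, and the sample values are assumptions for illustration; the specification only names the fields.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class AuthorType(Enum):
    HUMAN = "Human"
    AI = "AI"
    EXTERNAL_NODE = "ExternalNode"


class PublicationMode(Enum):
    MODE1 = "Mode1"  # draft
    MODE2 = "Mode2"  # AI-published
    MODE3 = "Mode3"  # human-reviewed


@dataclass(frozen=True)  # frozen mirrors the immutability principle
class VersionFields:
    version_id: str
    parent_version_id: Optional[str]  # None for the first version
    created_at: datetime
    author_type: AuthorType
    created_by: str  # key of a User or TechnicalUser
    justification_text: str
    publication_mode: PublicationMode
    review_status: str
    node_origin: str
    signature_hash: Optional[str] = None  # listed for Release 1.0

    def __post_init__(self) -> None:
        # Reject naive or non-UTC timestamps, since CreatedAt is specified as UTC.
        if self.created_at.tzinfo is None or self.created_at.utcoffset() != timedelta(0):
            raise ValueError("created_at must be a UTC timestamp")


first = VersionFields(
    version_id="cv-1",
    parent_version_id=None,
    created_at=datetime(2025, 12, 15, 16, 56, tzinfo=timezone.utc),
    author_type=AuthorType.AI,
    created_by="akel-01",
    justification_text="Initial extraction",
    publication_mode=PublicationMode.MODE1,
    review_status="draft",
    node_origin="node-a",
)
print(first.created_at.isoformat())  # ISO 8601 string with an explicit UTC offset
```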
51 +
52 +----
53 +
54 +== Core Entity Definitions ==
55 +
56 +=== User Entities ===
57 +
58 +**USER** (base user table):
59 +* ``UserID`` (PK)
60 +* ``UserType`` (Reader|Contributor|Reviewer|Auditor|Expert|Moderator|Maintainer)
61 +* ``DisplayName``
62 +* ``Email`` (for Contributors and above)
63 +* ``RegisteredAt``
64 +* ``LastActive``
65 +* ``Status`` (active|suspended|banned)
66 +
67 +**TECHNICAL_USER** (system processes):
68 +* ``SystemID`` (PK)
69 +* ``SystemName``
70 +* ``Purpose`` (AKEL|FederationSync|BackupService|Monitor|Audit)
71 +* ``CreatedBy`` (FK to Maintainer who created this system user)
72 +* ``CreatedAt``
73 +* ``Status`` (active|paused|deprecated)
74 +* ``ApiKey`` (encrypted)
75 +* ``Permissions`` (JSON - authorized operations)
76 +
77 +**Examples of Technical Users**:
78 +* AKEL instances (AI processing)
79 +* Federation sync bots
80 +* Scheduled audit tasks
81 +* Backup services
82 +* Monitoring systems
83 +* External API integrations
84 +
85 +----
86 +
87 +=== Content Entities ===
88 +
7 7  The system relies on the following versioned core entities:
8 8  
9 -* **CLAIM_CLUSTER**
10 -** ``ClusterID`` (PK), ``EmbeddingVectorRef``, ``Theme``
11 -** Groups related claims into topical clusters.
91 +**CLAIM_CLUSTER**:
92 +* ``ClusterID`` (PK)
93 +* ``EmbeddingVectorRef``
94 +* ``Theme``
95 +* Groups related claims into topical clusters
96 +* One Cluster has many Claims
97 +* A Claim belongs to exactly one primary cluster
12 12  
13 -* **CLAIM / CLAIM_VERSION**
14 -** ``CLAIM`` is the long‑lived anchor for a real‑world claim.
15 -** ``CLAIM_VERSION`` is an immutable snapshot of wording + basic metadata.
16 -** Verdicts are **NOT** attached to ClaimVersion but to Scenario.
99 +**CLAIM / CLAIM_VERSION**:
100 +* ``CLAIM`` is the long-lived anchor for a real-world claim
101 +* ``CLAIM_VERSION`` is an immutable snapshot that includes:
102 + * ``VersionID`` (PK)
103 + * ``ClaimID`` (FK to CLAIM)
104 + * ``ParentVersionID`` (FK to prior version, nullable)
105 + * ``Text``
106 + * ``Domain``
107 + * ``ClaimType`` (literal|metaphorical|rhetorical|supernatural)
108 + * ``Evaluability`` (empirical|subjective|non-falsifiable)
109 + * ``RiskTier`` (A|B|C) - replaces ``SafetyCategory`` for consistency
110 + * ``PublicationMode`` (Mode1|Mode2|Mode3)
111 + * ``ReviewStatus`` (draft|in_review|approved|rejected)
112 + * ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
113 + * ``NodeOrigin``, ``SignatureHash``
114 + * ``Status`` (active|superseded|merged)
17 17  
18 -* **SCENARIO / SCENARIO_VERSION**
19 -** ``SCENARIO`` represents a stable interpretive context for a claim.
20 -** ``SCENARIO_VERSION`` is an immutable snapshot of that context (definitions, assumptions, boundaries).
21 -** Verdicts are attached to SCENARIO, with verdict history in VERDICT_VERSION.
116 +**SCENARIO / SCENARIO_VERSION**:
117 +* ``SCENARIO`` is the anchor for a scenario across time
118 +* ``SCENARIO_VERSION`` is an immutable snapshot:
119 + * ``VersionID`` (PK)
120 + * ``ScenarioID`` (FK to SCENARIO)
121 + * ``ParentVersionID``
122 + * ``ClaimID`` (FK to CLAIM)
123 + * ``Definitions`` (JSON)
124 + * ``Boundaries`` (JSON)
125 + * ``Assumptions`` (JSON)
126 + * ``Context`` (text)
127 + * ``EvaluationMethod`` (text)
128 + * ``PublicationMode`` (Mode1|Mode2|Mode3)
129 + * ``ReviewStatus`` (draft|in_review|approved|rejected)
130 + * ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
131 + * ``NodeOrigin``, ``SignatureHash``
132 + * ``Status`` (active|superseded|deprecated)
22 22  
23 -* **EVIDENCE / EVIDENCE_VERSION**
24 -** ``EVIDENCE`` is the logical source (report, article, dataset…).
25 -** ``EVIDENCE_VERSION`` is the extracted/processed snapshot (summary, reliability, etc.).
134 +**Note**: ``SafetyClass`` has been removed from Scenario; the risk tier is set at claim level.
26 26  
27 -* **VERDICT / VERDICT_VERSION**
28 -** ``VERDICT`` represents “this scenario is evaluated for this claim.”
29 -** ``VERDICT_VERSION`` is an immutable snapshot of a concrete evaluation (likelihood, confidence, reasoning, timestamp).
136 +**EVIDENCE / EVIDENCE_VERSION**:
137 +* ``EVIDENCE`` is the anchor
138 +* ``EVIDENCE_VERSION`` is the versioned snapshot:
139 + * ``VersionID`` (PK)
140 + * ``EvidenceID`` (FK to EVIDENCE)
141 + * ``ParentVersionID``
142 + * ``Type`` (paper|dataset|report|transcript|expert|media)
143 + * ``Category`` (empirical|historical|rhetorical|dataset|meta-analysis)
144 + * ``Reliability`` (low|medium|high)
145 + * ``Provenance`` (URL, DOI, source metadata)
146 + * ``ExtractionMethod`` (manual|OCR|API|AKEL)
147 + * ``ContentHash`` (SHA-256 of evidence content)
148 + * ``PublicationMode`` (Mode1|Mode2|Mode3)
149 + * ``ReviewStatus`` (draft|verified|disputed|retracted)
150 + * ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
151 + * ``NodeOrigin``, ``SignatureHash``
152 + * ``Status`` (active|superseded)
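
A minimal sketch of how ``ContentHash`` might be computed and checked follows. It assumes the hash is taken over the raw evidence bytes and stored as a lowercase hex string; the specification does not state the encoding, and the function names are hypothetical.

```python
import hashlib
import hmac


def content_hash(content: bytes) -> str:
    """Return the SHA-256 digest of the evidence content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def verify_content(content: bytes, expected_hash: str) -> bool:
    """Check retrieved content against a stored ContentHash."""
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(content_hash(content), expected_hash.lower())


stored = content_hash(b"Example report body")
print(verify_content(b"Example report body", stored))
print(verify_content(b"Example report body (edited)", stored))
```

A node that receives only evidence metadata could use the same check later, once the full file is fetched.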
30 30  
31 -* **SCENARIO_EVIDENCE_VERSION_LINK**
32 -** Connects ``ScenarioVersion`` ↔ ``EvidenceVersion`` (many‑to‑many).
33 -** Fields: Relevance, Direction (SUPPORTS / CONTRADICTS / NEUTRAL).
34 -** **Rule:** The link always targets VERSIONED entities, never the base tables.
154 +**VERDICT / VERDICT_VERSION**:
155 +* ``VERDICT`` is the anchor
156 +* ``VERDICT_VERSION`` is the snapshot:
157 + * ``VersionID`` (PK)
158 + * ``VerdictID`` (FK to VERDICT)
159 + * ``ParentVersionID``
160 + * ``ClaimID`` (FK to CLAIM)
161 + * ``ScenarioVersionID`` (FK to specific SCENARIO_VERSION)
162 + * ``EvidenceVersionSet`` (JSON array of Evidence VersionIDs used)
163 + * ``LikelihoodRange`` (0–1, with uncertainty bounds)
164 + * ``ExplanationChain`` (JSON)
165 + * ``UncertaintyFactors`` (JSON)
166 + * ``PublicationMode`` (Mode1|Mode2|Mode3)
167 + * ``ReviewStatus`` (draft|in_review|approved|retracted)
168 + * ``CreatedAt``, ``AuthorType``, ``CreatedBy``, ``JustificationText``
169 + * ``NodeOrigin``, ``SignatureHash``
170 + * ``Status`` (current|outdated|superseded|retracted)
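
The specification describes ``LikelihoodRange`` only as a value in 0–1 with uncertainty bounds. One possible representation, shown here purely as an assumption, is a point estimate with a lower and an upper bound that are validated on construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LikelihoodRange:
    lower: float     # assumed lower uncertainty bound
    estimate: float  # assumed point estimate
    upper: float     # assumed upper uncertainty bound

    def __post_init__(self) -> None:
        # All three values must be ordered and lie within [0, 1].
        if not (0.0 <= self.lower <= self.estimate <= self.upper <= 1.0):
            raise ValueError("expected 0 <= lower <= estimate <= upper <= 1")

    @property
    def width(self) -> float:
        """Width of the uncertainty interval; wider means less certain."""
        return self.upper - self.lower


likelihood = LikelihoodRange(lower=0.2, estimate=0.5, upper=0.7)
print(likelihood.width)
```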
35 35  
36 -== Core Data Model ERD ==
172 +----
37 37  
38 -{{include reference="FactHarbor.Specification.Diagrams.Core Data Model ERD.WebHome"/}}
174 +== Many-to-Many Linking Tables ==
175 +
176 +**ScenarioEvidenceLink**:
177 +* Links scenario versions to evidence versions with relevance scoring
178 +* ``ScenarioID``, ``ScenarioVersionID``
179 +* ``EvidenceID``, ``EvidenceVersionID``
180 +* ``RelevanceScore`` (0–1) - How relevant this evidence is to this scenario
181 +* ``LinkJustification`` - Brief explanation of relevance
182 +
183 +**Purpose**:
184 +* Evidence can be used by multiple scenarios
185 +* Scenarios can draw from multiple pieces of evidence
186 +* Relevance scoring helps prioritize evidence
187 +* Version-specific linking preserves historical accuracy
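
To illustrate version-specific linking and relevance-based prioritization, the sketch below models ``ScenarioEvidenceLink`` rows in memory and ranks the evidence attached to one scenario version. The record layout follows the fields listed above; the helper name and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ScenarioEvidenceLink:
    scenario_id: str
    scenario_version_id: str
    evidence_id: str
    evidence_version_id: str
    relevance_score: float  # 0-1
    link_justification: str


def evidence_for(
    scenario_version_id: str, links: Iterable[ScenarioEvidenceLink]
) -> List[ScenarioEvidenceLink]:
    """Return the links of one scenario version, most relevant first."""
    matches = [l for l in links if l.scenario_version_id == scenario_version_id]
    return sorted(matches, key=lambda l: l.relevance_score, reverse=True)


links = [
    ScenarioEvidenceLink("s1", "sv-1", "e1", "ev-1", 0.4, "background data"),
    ScenarioEvidenceLink("s1", "sv-1", "e2", "ev-5", 0.9, "direct measurement"),
    # The same evidence version is reused by a second scenario.
    ScenarioEvidenceLink("s2", "sv-7", "e2", "ev-5", 0.3, "tangential"),
]
for link in evidence_for("sv-1", links):
    print(link.evidence_version_id, link.relevance_score)
```

Because each link names both version IDs, an older scenario version keeps pointing at the evidence versions it was evaluated against.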
188 +
189 +**ClaimCluster**:
190 +* Semantic clustering of similar claims
191 +* ``ClusterID`` (PK)
192 +* ``EmbeddingVector`` - Vector representation for semantic search
193 +* ``MemberList`` - List of ClaimIDs in this cluster
194 +* ``Theme`` - Human-readable theme description
195 +
196 +----
197 +
198 +== Key Changes in v0.9.1 ==
199 +
200 +**Updated Field Names**:
201 +* `SafetyCategory` → `RiskTier` (consistency with risk tier system A/B/C)
202 +* `SafetyClass` removed from Scenario (redundant with claim-level RiskTier)
203 +
204 +**Added Fields to All Version Entities**:
205 +* `PublicationMode` - Track Mode 1/2/3 status
206 +* `ReviewStatus` - Track workflow state
207 +* `NodeOrigin` - Federation provenance
208 +* `CreatedBy` - FK to User/TechnicalUser (clarified)
209 +
210 +**New Entity**:
211 +* `TECHNICAL_USER` - Separate system processes from human users
212 +
213 +**Clarifications**:
214 +* `ScenarioVersionID` in Verdict (not just `ScenarioID`) - links to a specific scenario version
215 +* `ContentHash` in Evidence - SHA-256 for integrity checking
216 +
217 +----
218 +
219 +== Data Model Behavior ==
220 +
221 +=== Late-Arriving Evidence ===
222 +
223 +When new evidence versions appear:
224 +1. Existing verdicts are marked as **outdated**
225 +2. Scenario relevance is re-evaluated
226 +3. The re-evaluation engine triggers verdict recomputation
227 +4. New verdict versions are created
228 +5. Users are notified of the updates
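
The first steps of this flow can be sketched as follows. The sketch assumes that only verdict versions whose ``EvidenceVersionSet`` contains the superseded evidence version are affected, which the specification does not state explicitly; the class and function names are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class VerdictVersion:
    version_id: str
    evidence_version_set: Tuple[str, ...]  # Evidence VersionIDs used
    status: str = "current"


def mark_outdated(
    verdicts: Sequence[VerdictVersion], superseded_evidence_version_id: str
) -> Tuple[List[VerdictVersion], List[str]]:
    """Return the updated verdict list and the IDs queued for recomputation."""
    updated: List[VerdictVersion] = []
    to_recompute: List[str] = []
    for verdict in verdicts:
        affected = (
            verdict.status == "current"
            and superseded_evidence_version_id in verdict.evidence_version_set
        )
        if affected:
            # replace() builds a new record; the input object is left untouched.
            updated.append(replace(verdict, status="outdated"))
            to_recompute.append(verdict.version_id)
        else:
            updated.append(verdict)
    return updated, to_recompute


verdicts = [
    VerdictVersion("vv-1", ("ev-1", "ev-2")),
    VerdictVersion("vv-2", ("ev-3",)),
]
updated, to_recompute = mark_outdated(verdicts, "ev-2")
print(to_recompute)
```

The returned ID list stands in for whatever queue the re-evaluation engine consumes; recomputation and user notification are not modelled here.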
229 +
230 +=== Scenario Evolution ===
231 +
232 +When a scenario's assumptions or definitions change:
233 +* A new scenario version is created (no in-place update)
234 +* All dependent verdicts must be recalculated
235 +* Previous scenario versions remain accessible
236 +* Version lineage preserved
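
A minimal sketch of this copy-on-change behaviour: revising a scenario produces a successor that points back to its parent, and the earlier version is marked as superseded rather than overwritten. The reduced field set, the use of UUIDs for ``VersionID``, and the function name are assumptions for illustration.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class ScenarioVersion:
    version_id: str
    scenario_id: str
    parent_version_id: Optional[str]
    assumptions: Tuple[str, ...]
    status: str = "active"


def revise(
    current: ScenarioVersion, new_assumptions: Sequence[str]
) -> Tuple[ScenarioVersion, ScenarioVersion]:
    """Return (superseded predecessor, new successor) without mutating current."""
    successor = ScenarioVersion(
        version_id=str(uuid.uuid4()),  # assumed ID scheme
        scenario_id=current.scenario_id,
        parent_version_id=current.version_id,  # preserves lineage
        assumptions=tuple(new_assumptions),
    )
    superseded = replace(current, status="superseded")
    return superseded, successor


v1 = ScenarioVersion("sv-1", "s1", None, ("sample is representative",))
old, new = revise(v1, ["sample is representative", "data collected after 2020"])
print(old.status, new.parent_version_id)
```

Dependent verdicts would then be recalculated against the successor's ``VersionID``, while verdicts tied to the predecessor remain readable as history.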
237 +
238 +=== Federated Nodes ===
239 +
240 +Each node may share partial data:
241 +* Claims and scenarios are shared if relevant
242 +* Evidence metadata is shared, but not always the full files
243 +* Version synchronization via NodeOrigin tracking
244 +* Branching allowed for divergent interpretations
245 +
246 +----
247 +
248 +== Visual Diagrams ==
249 +
250 +The following diagrams provide visual representations of the data model structure and relationships.
251 +
252 +=== Core Data Model ERD ===
253 +
254 +{{include reference="Test.FactHarborV09.Specification.Diagrams.Core Data Model ERD.WebHome"/}}
255 +
256 +=== User Roles Structure ===
257 +
258 +{{include reference="Test.FactHarborV09.Specification.Diagrams.User Roles ERD.WebHome"/}}
259 +
260 +=== Content Workflow ===
261 +
262 +{{include reference="Test.FactHarborV09.Specification.Diagrams.Content Workflow ERD.WebHome"/}}
263 +
264 +----
265 +
266 +== Related Pages ==
267 +
268 +* [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]]
269 +* [[AKEL (AI Knowledge Extraction Layer)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
270 +* [[Architecture>>FactHarbor.Specification.Architecture.WebHome]]
271 +