Changes for page Data Model

Last modified by Robert Schaub on 2025/12/24 20:34

From version 2.2
edited by Robert Schaub
on 2025/12/11 21:34
Change comment: Renamed back-links.
To version 5.1
edited by Robert Schaub
on 2025/12/14 22:27
Change comment: Imported from XAR


Page properties
Content
... ... @@ -2,6 +2,51 @@
2 2  
3 3  This page describes the current data model for FactHarbor.
4 4  
5 +== Versioning Strategy ==
6 +
7 +Every entity in FactHarbor has a full immutable version history. This ensures:
8 +* Complete auditability
9 +* Ability to reconstruct historical state
10 +* Federation-compatible lineage tracking
11 +* Transparent evolution of claims, scenarios, and verdicts
12 +
13 +=== Core Versioning Principles ===
14 +
15 +**Immutability**:
16 +* Each version is stored independently
17 +* Versions cannot be deleted, only superseded
18 +* Historical versions remain accessible
19 +
20 +**Lineage**:
21 +* Each version links to its parent via `ParentVersionID`
22 +* Versions form a directed acyclic graph (DAG) of changes
23 +* Supports branching in federated environments
24 +
25 +**Provenance**:
26 +* Every version timestamped (`CreatedAt`)
27 +* Author type recorded (`AuthorType`: Human, AI, ExternalNode)
28 +* Justification captured (`JustificationText`)
29 +* Digital signatures for integrity (`SignatureHash` in Release 1.0)
30 +
31 +**Federation Support**:
32 +* Versions can originate from remote nodes
33 +* Conflict detection via lineage comparison
34 +* Parallel version trees for branching scenarios
35 +* Cross-node version synchronization
36 +
37 +=== Common Version Fields ===
38 +
39 +All versioned entities include:
40 +
41 +* **VersionID**: Unique identifier for this specific version
42 +* **ParentVersionID**: Link to previous version (null for first version)
43 +* **CreatedAt**: Timestamp (ISO 8601, UTC)
44 +* **AuthorType**: Human | AI | ExternalNode
45 +* **JustificationText**: Brief explanation of changes
46 +* **SignatureHash**: Cryptographic signature (Release 1.0)
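
The common fields listed above could be modelled roughly as follows. This is a minimal sketch, not the FactHarbor implementation: the class name `VersionMeta`, the helper `utc_now_iso`, and the snake_case attribute names are assumptions made for illustration; only the field meanings come from this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class AuthorType(Enum):
    HUMAN = "Human"
    AI = "AI"
    EXTERNAL_NODE = "ExternalNode"


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class VersionMeta:
    """Hypothetical container for the common version fields."""
    version_id: str
    parent_version_id: Optional[str]      # None for the first version
    created_at: str                       # ISO 8601, UTC
    author_type: AuthorType
    justification_text: str
    signature_hash: Optional[str] = None  # tied to Release 1.0 on this page


def utc_now_iso() -> str:
    """Return the current time as an ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


first = VersionMeta("v1", None, utc_now_iso(), AuthorType.HUMAN, "Initial wording")
second = VersionMeta("v2", first.version_id, utc_now_iso(), AuthorType.AI, "Clarified scope")
```

Freezing the dataclass is one way to mirror the immutability principle in code: a change produces a new object rather than mutating an existing one.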
47 +
48 +----
49 +
5 5  == Core Data Model Refinements ==
6 6  
7 7  The system relies on the following versioned core entities:
... ... @@ -9,30 +9,213 @@
9 9  * **CLAIM_CLUSTER**
10 10  ** ``ClusterID`` (PK), ``EmbeddingVectorRef``, ``Theme``
11 11  ** Groups related claims into topical clusters.
57 +** One Cluster has many Claims.
58 +** A Claim belongs to exactly one primary cluster.
12 12  
13 13  * **CLAIM / CLAIM_VERSION**
14 14  ** ``CLAIM`` is the long‑lived anchor for a real‑world claim.
15 -** ``CLAIM_VERSION`` is an immutable snapshot of wording + basic metadata.
16 -** Verdicts are **NOT** attached to ClaimVersion but to Scenario.
62 +** ``CLAIM_VERSION`` is an immutable snapshot that includes:
63 +*** ``ClaimID`` (FK to CLAIM)
64 +*** ``VersionID`` (PK)
65 +*** ``ParentVersionID`` (FK to prior version, nullable)
66 +*** ``Text``
67 +*** ``Domain``
68 +*** ``ClaimType`` (literal, metaphorical, rhetorical, supernatural...)
69 +*** ``Evaluability`` (empirical, subjective, non-falsifiable)
70 +*** ``SafetyCategory`` (low, medium, high)
71 +*** ``CreatedAt``, ``AuthorType``, ``JustificationText``
72 +*** ``Status`` (active, superseded, merged)
17 17  
18 18  * **SCENARIO / SCENARIO_VERSION**
19 -** ``SCENARIO`` represents a stable interpretive context for a claim.
20 -** ``SCENARIO_VERSION`` is an immutable snapshot of that context (definitions, assumptions, boundaries).
21 -** Verdicts are attached to SCENARIO, with verdict history in VERDICT_VERSION.
75 +** ``SCENARIO`` is the anchor for a scenario across time.
76 +** ``SCENARIO_VERSION`` is an immutable snapshot:
77 +*** ``ScenarioID`` (FK to SCENARIO)
78 +*** ``VersionID`` (PK)
79 +*** ``ParentVersionID``
80 +*** ``ClaimID`` (FK to CLAIM)
81 +*** ``Definitions``
82 +*** ``Boundaries``
83 +*** ``Assumptions``
84 +*** ``Context``
85 +*** ``EvaluationMethod``
86 +*** ``SafetyClass``
87 +*** ``CreatedAt``, ``AuthorType``, ``JustificationText``
88 +*** ``Status`` (active, superseded, deprecated)
22 22  
23 23  * **EVIDENCE / EVIDENCE_VERSION**
24 -** ``EVIDENCE`` is the logical source (report, article, dataset…).
25 -** ``EVIDENCE_VERSION`` is the extracted/processed snapshot (summary, reliability, etc.).
91 +** ``EVIDENCE`` is the anchor.
92 +** ``EVIDENCE_VERSION`` is the versioned snapshot:
93 +*** ``EvidenceID`` (FK to EVIDENCE)
94 +*** ``VersionID`` (PK)
95 +*** ``ParentVersionID``
96 +*** ``Type`` (paper, dataset, report, transcript, expert...)
97 +*** ``Category`` (empirical, historical, rhetorical, dataset, meta-analysis...)
98 +*** ``Reliability`` (low, medium, high)
99 +*** ``Provenance`` (URL, DOI, source metadata)
100 +*** ``ExtractionMethod`` (manual, OCR, API, AKEL)
101 +*** ``CreatedAt``, ``AuthorType``, ``JustificationText``
102 +*** ``Status`` (verified, updated, disputed, retracted, superseded)
26 26  
27 27  * **VERDICT / VERDICT_VERSION**
28 -** ``VERDICT`` represents “this scenario is evaluated for this claim.”
29 -** ``VERDICT_VERSION`` is an immutable snapshot of a concrete evaluation (likelihood, confidence, reasoning, timestamp).
105 +** ``VERDICT`` is the anchor.
106 +** ``VERDICT_VERSION`` is the snapshot:
107 +*** ``VerdictID`` (FK to VERDICT)
108 +*** ``VersionID`` (PK)
109 +*** ``ParentVersionID``
110 +*** ``ClaimID`` (FK to CLAIM)
111 +*** ``ScenarioID`` (FK to SCENARIO)
112 +*** ``EvidenceVersionSet`` (list of evidence version IDs used)
113 +*** ``LikelihoodRange`` (0–1, with uncertainty bounds)
114 +*** ``ExplanationChain``
115 +*** ``UncertaintyFactors``
116 +*** ``CreatedAt``, ``AuthorType``, ``JustificationText``
117 +*** ``Status`` (current, outdated, superseded, retracted)
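
As an illustration of how one of these snapshots might be validated, the sketch below models a subset of the `VERDICT_VERSION` fields. It assumes that `LikelihoodRange` is stored as a `(lower, upper)` pair within 0–1 and that `Status` is a plain string; neither representation is specified on this page, and the class is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ALLOWED_STATUS = {"current", "outdated", "superseded", "retracted"}


@dataclass(frozen=True)
class VerdictVersion:
    """Hypothetical subset of the VERDICT_VERSION fields."""
    verdict_id: str
    version_id: str
    parent_version_id: Optional[str]
    claim_id: str
    scenario_id: str
    evidence_version_set: Tuple[str, ...]   # evidence version IDs used
    likelihood_range: Tuple[float, float]   # assumed (lower, upper) within 0-1
    status: str = "current"

    def __post_init__(self) -> None:
        low, high = self.likelihood_range
        if not (0.0 <= low <= high <= 1.0):
            raise ValueError("likelihood_range must satisfy 0 <= low <= high <= 1")
        if self.status not in ALLOWED_STATUS:
            raise ValueError(f"unknown status: {self.status}")


verdict = VerdictVersion(
    verdict_id="vd-1", version_id="vdv-1", parent_version_id=None,
    claim_id="c-1", scenario_id="s-1",
    evidence_version_set=("ev-1", "ev-2"),
    likelihood_range=(0.6, 0.8),
)
```

Storing evidence version IDs rather than evidence IDs keeps the verdict pinned to the exact snapshots it was based on.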
30 30  
31 -* **SCENARIO_EVIDENCE_VERSION_LINK**
32 -** Connects ``ScenarioVersion`` ↔ ``EvidenceVersion`` (many‑to‑many).
33 -** Fields: Relevance, Direction (SUPPORTS / CONTRADICTS / NEUTRAL).
34 -** **Rule:** The link always targets VERSIONED entities, never the base tables.
119 +----
35 35  
36 -== Core Data Model ERD ==
121 +== Many-to-Many Linking Tables ==
37 37  
38 -{{include reference="FactHarbor.Archive.Diagrams v0\.8q.Core Data Model ERD.WebHome"/}}
123 +=== ScenarioEvidenceLink ===
124 +
125 +Links scenario versions to evidence versions with relevance scoring.
126 +
127 +**Fields**:
128 +* ``ScenarioID``
129 +* ``ScenarioVersionID``
130 +* ``EvidenceID``
131 +* ``EvidenceVersionID``
132 +* ``RelevanceScore`` (0–1) - How relevant this evidence is to this scenario
133 +* ``LinkJustification`` - Brief explanation of relevance
134 +
135 +**Purpose**:
136 +* Evidence can be used by multiple scenarios
137 +* Scenarios can draw from multiple pieces of evidence
138 +* Relevance scoring helps prioritize evidence
139 +* Version-specific linking preserves historical accuracy
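
A link record and a relevance-ordered lookup might look like the sketch below. The function `evidence_for_scenario_version` and the sample IDs are hypothetical; the fields follow the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScenarioEvidenceLink:
    scenario_id: str
    scenario_version_id: str
    evidence_id: str
    evidence_version_id: str
    relevance_score: float      # 0-1
    link_justification: str


def evidence_for_scenario_version(
    links: List[ScenarioEvidenceLink], scenario_version_id: str
) -> List[ScenarioEvidenceLink]:
    """Return the links of one scenario version, most relevant first."""
    matching = [l for l in links if l.scenario_version_id == scenario_version_id]
    return sorted(matching, key=lambda l: l.relevance_score, reverse=True)


links = [
    ScenarioEvidenceLink("s-1", "sv-2", "e-1", "ev-1", 0.4, "Background data"),
    ScenarioEvidenceLink("s-1", "sv-2", "e-2", "ev-5", 0.9, "Direct measurement"),
    ScenarioEvidenceLink("s-2", "sv-7", "e-2", "ev-5", 0.7, "Same study, other context"),
]
ranked = evidence_for_scenario_version(links, "sv-2")
```

In this sample, evidence version `ev-5` is linked from two different scenario versions, which is the many-to-many case the table exists for.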
140 +
141 +=== ClaimCluster ===
142 +
143 +Semantic clustering of similar claims.
144 +
145 +**Fields**:
146 +* ``ClusterID`` (PK)
147 +* ``EmbeddingVector`` - Vector representation for semantic search
148 +* ``MemberList`` - List of ClaimIDs in this cluster
149 +* ``Theme`` - Human-readable theme description
150 +
151 +**Purpose**:
152 +* Groups semantically similar claims
153 +* Enables efficient search and discovery
154 +* Supports cross-node claim alignment
155 +* Reduces duplication
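
This page does not say how claims are assigned to clusters. One common approach, shown here purely as an assumption, compares a claim's embedding with each cluster's vector using cosine similarity and accepts the best match above a threshold. The function names, the threshold of 0.8, and the two-dimensional sample vectors are all illustrative.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_cluster(
    claim_vector: List[float],
    clusters: Dict[str, List[float]],
    threshold: float = 0.8,  # assumed cut-off, not taken from the specification
) -> Optional[str]:
    """Return the ClusterID of the most similar cluster, or None if none qualifies."""
    best_id, best_score = None, threshold
    for cluster_id, vector in clusters.items():
        score = cosine_similarity(claim_vector, vector)
        if score >= best_score:
            best_id, best_score = cluster_id, score
    return best_id


clusters = {"cl-health": [1.0, 0.0], "cl-climate": [0.0, 1.0]}
close_match = nearest_cluster([0.9, 0.1], clusters)
no_match = nearest_cluster([0.7, 0.7], clusters)
```

A claim with no sufficiently similar cluster returns `None`, at which point a new cluster could be created.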
156 +
157 +----
158 +
159 +== Data Model Behavior ==
160 +
161 +=== Late-Arriving Evidence ===
162 +
163 +When new evidence versions appear:
164 +
165 +1. Existing verdicts are marked as **outdated**
166 +2. Scenario relevance must be re-evaluated
167 +3. The re-evaluation engine triggers verdict recomputation
168 +4. New verdict versions are created
169 +5. Users are notified of updates
170 +
171 +**Process**:
172 +* New EvidenceVersion imported
173 +* System scans related ScenarioEvidenceLinks
174 +* Checks if evidence affects existing verdicts
175 +* Queues affected verdicts for re-evaluation
176 +* AKEL or reviewer creates new VerdictVersion
177 +* Old verdicts remain accessible (historical record)
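
The process above can be sketched as a single function. This is a simplified illustration with in-memory dictionaries; it assumes that `Status` is tracked separately from the immutable snapshot content, and the function and variable names are hypothetical.

```python
from collections import deque


def handle_new_evidence_version(evidence_id, links, verdicts_by_scenario,
                                verdict_status, queue):
    """Mark verdicts affected by a new version of `evidence_id` as outdated
    and queue them for re-evaluation."""
    # Scan the links to find scenarios that use this piece of evidence.
    affected = {l["ScenarioID"] for l in links if l["EvidenceID"] == evidence_id}
    for scenario_id in sorted(affected):
        for verdict_version_id in verdicts_by_scenario.get(scenario_id, []):
            if verdict_status[verdict_version_id] == "current":
                verdict_status[verdict_version_id] = "outdated"
                queue.append(verdict_version_id)
    return queue


links = [
    {"ScenarioID": "s-1", "EvidenceID": "e-1"},
    {"ScenarioID": "s-2", "EvidenceID": "e-2"},
]
verdicts_by_scenario = {"s-1": ["vdv-1"], "s-2": ["vdv-2"]}
verdict_status = {"vdv-1": "current", "vdv-2": "current"}
queue = deque()

handle_new_evidence_version("e-1", links, verdicts_by_scenario, verdict_status, queue)
```

The outdated verdict version is not deleted; a later step (AKEL or a reviewer) would add a new `VerdictVersion` next to it.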
178 +
179 +=== Scenario Evolution ===
180 +
181 +When a scenario's assumptions or definitions change:
182 +
183 +**Creates new scenario version** (not in-place update):
184 +* New ScenarioVersion with updated fields
185 +* ParentVersionID points to previous version
186 +* All dependent verdicts must be recalculated
187 +* Previous scenario versions remain accessible
188 +
189 +**Triggers**:
190 +* Refined definitions
191 +* Changed assumptions
192 +* Expanded or narrowed boundaries
193 +* Updated evaluation methods
194 +* Safety classification changes
195 +
196 +**Impact**:
197 +* Verdicts based on old scenario version remain valid (historical)
198 +* New verdicts required for new scenario version
199 +* Users can compare old vs new scenarios
200 +* Evidence links may need re-assessment
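
A minimal sketch of "new version instead of in-place update", using a frozen dataclass and `dataclasses.replace`. The class covers only a few of the `SCENARIO_VERSION` fields, and `new_scenario_version` is a hypothetical helper.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ScenarioVersion:
    scenario_id: str
    version_id: str
    parent_version_id: Optional[str]
    definitions: str
    assumptions: str


def new_scenario_version(previous: ScenarioVersion, new_version_id: str,
                         **changes) -> ScenarioVersion:
    """Return a new snapshot derived from `previous`; `previous` is left untouched."""
    return replace(previous, version_id=new_version_id,
                   parent_version_id=previous.version_id, **changes)


v1 = ScenarioVersion("s-1", "sv-1", None,
                     "'Effective' means a reduction of at least 50%", "Adults only")
v2 = new_scenario_version(v1, "sv-2", assumptions="Adults and adolescents")
```

Both `v1` and `v2` stay available afterwards, so verdicts that reference `sv-1` keep pointing at the assumptions they were actually evaluated under.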
201 +
202 +=== Federated Nodes ===
203 +
204 +Each node may share partial data:
205 +
206 +**Claims and scenarios**: Shared if relevant to node's domain
207 +
208 +**Evidence metadata**: Shared, but not always full evidence files
209 +
210 +**Verdict lineage**: Shared only if not locally overridden
211 +
212 +**Version synchronization**:
213 +* Remote versions imported with provenance metadata
214 +* Conflicts detected via ParentVersionID comparison
215 +* Branching allowed for divergent interpretations
216 +* Local node retains authority over local versions
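
Conflict detection via `ParentVersionID` comparison can be illustrated as follows: after local and remote histories are combined, any parent with more than one child marks a point where the two nodes diverged. The mapping format (`VersionID` to `ParentVersionID`) and the function name are assumptions for this sketch.

```python
from collections import defaultdict


def find_branch_points(versions):
    """Map each ParentVersionID to its children when it has more than one child."""
    children = defaultdict(list)
    for version_id, parent_id in versions.items():
        if parent_id is not None:
            children[parent_id].append(version_id)
    return {parent: sorted(kids) for parent, kids in children.items() if len(kids) > 1}


# VersionID -> ParentVersionID, as known by each node
local = {"v1": None, "v2": "v1", "v3a": "v2"}
remote = {"v1": None, "v2": "v1", "v3b": "v2"}

merged = {**local, **remote}
conflicts = find_branch_points(merged)
```

Here `conflicts` reports that `v3a` and `v3b` both descend from `v2`, which a node could either keep as parallel branches or route to review.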
217 +
218 +**Trust and acceptance**:
219 +* Trusted nodes: auto-import versions
220 +* Neutral nodes: import but flag for review
221 +* Untrusted nodes: manual import only
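
The three trust rules map naturally onto a lookup table. The enum and function names below are illustrative only; the page defines the policy, not its representation.

```python
from enum import Enum


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


class ImportAction(Enum):
    AUTO_IMPORT = "auto_import"          # import without review
    IMPORT_AND_FLAG = "import_and_flag"  # import, then flag for review
    MANUAL_ONLY = "manual_only"          # wait for a manual import


IMPORT_POLICY = {
    TrustLevel.TRUSTED: ImportAction.AUTO_IMPORT,
    TrustLevel.NEUTRAL: ImportAction.IMPORT_AND_FLAG,
    TrustLevel.UNTRUSTED: ImportAction.MANUAL_ONLY,
}


def import_action(trust: TrustLevel) -> ImportAction:
    """Return how a version from a node with this trust level is handled."""
    return IMPORT_POLICY[trust]
```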
222 +
223 +----
224 +
225 +== Entity-Relationship Overview ==
226 +
227 +**Core relationships**:
228 +
229 +```
230 +CLAIM_CLUSTER (1) ──< (N) CLAIM
231 +CLAIM (1) ──< (N) CLAIM_VERSION
232 +CLAIM (1) ──< (N) SCENARIO
233 +SCENARIO (1) ──< (N) SCENARIO_VERSION
234 +SCENARIO_VERSION (N) >──< (N) EVIDENCE_VERSION [via ScenarioEvidenceLink]
235 +SCENARIO_VERSION (1) ──< (N) VERDICT_VERSION
236 +VERDICT_VERSION references specific EvidenceVersionSet
237 +```
238 +
239 +**Version chains**:
240 +
241 +Each entity has a version DAG:
242 +```
243 +Version 1 (ParentVersionID=null)
244 + ↓
245 +Version 2 (ParentVersionID=1)
246 + ↓
247 +Version 3 (ParentVersionID=2)
248 +```
249 +
250 +In federated environments, branching may occur:
251 +```
252 +Version 1
253 + ↓
254 +Version 2
255 +  ↓      ↓
256 +V3a    V3b (parallel branches from different nodes)
257 +```
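
To reconstruct historical state, a version chain can be walked from any version back to the root by following `ParentVersionID`. The sketch below uses the version labels from the diagrams; the helper names and the dictionary representation are assumptions.

```python
def lineage(version_id, parents):
    """Return the chain from `version_id` back to the root, newest first."""
    chain, seen = [], set()
    current = version_id
    while current is not None:
        if current in seen:
            raise ValueError("cycle detected; version history must be acyclic")
        seen.add(current)
        chain.append(current)
        current = parents[current]
    return chain


def common_ancestor(a, b, parents):
    """Return the most recent version shared by the lineages of `a` and `b`."""
    ancestors_of_a = set(lineage(a, parents))
    for version_id in lineage(b, parents):
        if version_id in ancestors_of_a:
            return version_id
    return None


# VersionID -> ParentVersionID for the branching example
parents = {"1": None, "2": "1", "3a": "2", "3b": "2"}

history = lineage("3a", parents)
fork = common_ancestor("3a", "3b", parents)
```

For the branching example, the lineage of `3a` is `3a`, `2`, `1`, and the two branches share version `2` as their most recent common ancestor.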
258 +
259 +----
260 +
261 +== Related Pages ==
262 +
263 +* [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]]
264 +* [[AKEL (AI Knowledge Extraction Layer)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
265 +* [[Architecture>>FactHarbor.Specification.Architecture.WebHome]]
266 +