Wiki source code of Federation & Decentralization

Last modified by Robert Schaub on 2025/12/24 20:30

1 = Federation & Decentralization =
2
3 FactHarbor is designed to operate as a **federated network of nodes** rather than a single central server.
4
5 Decentralization provides:
6 * **Resilience** against censorship or political pressure
7 * **Autonomy** for local governance and moderation
8 * **Scalability** across many independent communities
9 * **Trust** without centralized control
10 * **Domain specialization** (health-focused nodes, energy-focused nodes, etc.)
11
12 FactHarbor draws inspiration from the Fediverse but adds stricter data structure, explicit versioning, and cryptographic integrity guarantees.
13
14
15 == 1. Federation Architecture Diagram ==
16
17 The following diagram shows the complete federated architecture with node components and communication layers.
18
19 {{include reference="FactHarbor.Specification.Diagrams.Federation Architecture.WebHome"/}}
20
21
22 == 2. Federated FactHarbor Nodes ==
23
24 Each FactHarbor instance ("node") maintains:
25 * Its own database
26 * Its own AKEL instance
27 * Its own reviewers, experts, and contributors
28 * Its own governance rules
29
30 Nodes exchange structured information:
31 * Claims
32 * Scenarios
33 * Evidence metadata (not necessarily full files)
34 * Verdicts (optional)
35 * Hashes and signatures for integrity
36
37 Nodes choose which external nodes they trust.
38
39
40 == 3. Global Identifiers ==
41
42 Every entity receives a globally unique, linkable identifier.
43
44 **Format**:
45 `factharbor://node_url/type/local_id`
46
47 **Example**:
48 `factharbor://factharbor.energy/claim/CLM-55812`
49
50 **Supported types**:
51 * `claim`
52 * `scenario`
53 * `evidence`
54 * `verdict`
55 * `user` (optional)
56 * `cluster`
57
58 **Properties**:
59 * Globally consistent
60 * Human-readable
61 * Hash-derived
62 * Independent of database internals
63 * URL-resolvable (future enhancement)
64
65 This allows cross-node references and prevents identifier collisions in federated environments.
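
The identifier format above can be split with a standard URL parser. The following is a minimal sketch, not a reference implementation: the names `GlobalId` and `parse_global_id` are hypothetical, and the strict two-segment path check is an assumption based on the format string shown above.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Entity types listed under "Supported types" above.
SUPPORTED_TYPES = {"claim", "scenario", "evidence", "verdict", "user", "cluster"}


@dataclass(frozen=True)
class GlobalId:
    """Hypothetical container for the three parts of an identifier."""
    node: str          # host part, e.g. "factharbor.energy"
    entity_type: str   # one of SUPPORTED_TYPES
    local_id: str      # node-local identifier, e.g. "CLM-55812"

    def __str__(self) -> str:
        return f"factharbor://{self.node}/{self.entity_type}/{self.local_id}"


def parse_global_id(text: str) -> GlobalId:
    """Split a factharbor:// identifier into node, type, and local ID."""
    parts = urlparse(text)
    if parts.scheme != "factharbor" or not parts.netloc:
        raise ValueError(f"not a factharbor identifier: {text!r}")
    segments = parts.path.strip("/").split("/")
    # Assumption: exactly /type/local_id, with a known type.
    if len(segments) != 2 or segments[0] not in SUPPORTED_TYPES or not segments[1]:
        raise ValueError(f"expected /type/local_id, got {parts.path!r}")
    return GlobalId(parts.netloc, segments[0], segments[1])
```

Because the node name is part of the identifier, two nodes can both issue a local ID such as `CLM-55812` without colliding.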
66
67
68 == 4. Trust Model ==
69
70 Each node maintains a **trust table** defining relationships with other nodes:
71
72 === 4.1 Trust Levels ===
73
74 **Trusted Nodes**:
75 * Claims auto-imported
76 * Scenarios accepted without re-review
77 * Evidence considered valid
78 * Verdicts displayed to users
79 * High synchronization priority
80
81 **Neutral Nodes**:
82 * Claims imported but flagged for review
83 * Scenarios require local validation
84 * Evidence requires re-assessment
85 * Verdicts shown with "external node" disclaimer
86 * Normal synchronization priority
87
88 **Untrusted Nodes**:
89 * Claims quarantined, manual import only
90 * Scenarios rejected by default
91 * Evidence not accepted
92 * Verdicts not displayed
93 * No automatic synchronization
94
95 === 4.2 What Trust Affects ===
96
97 * **Auto-import**: Whether claims/scenarios are automatically added
98 * **Re-review requirements**: Whether local reviewers must validate
99 * **Verdict display**: Whether external verdicts are shown to users
100 * **Synchronization frequency**: How often data is exchanged
101 * **Reputation signals**: How external reputation is interpreted
102
103 === 4.3 Local Trust Authority ===
104
105 Each node's governance team decides:
106 * Which nodes to trust
107 * Trust level criteria
108 * Trust escalation/degradation rules
109 * Dispute resolution with partner nodes
110
111 Trust is **local and autonomous**: no global trust registry exists.
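
A trust table of this kind could be modeled as a mapping from node name to trust level, with one policy per level. This is a sketch under assumptions: the field names in `TrustPolicy` are hypothetical, the flag values are transcribed from section 4.1, and treating unknown nodes as untrusted is an assumed default that the specification does not state.

```python
from dataclasses import dataclass
from enum import Enum


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


@dataclass(frozen=True)
class TrustPolicy:
    """Hypothetical per-level policy flags derived from section 4.1."""
    auto_import_claims: bool
    needs_local_review: bool
    show_verdicts: bool
    verdict_disclaimer: bool
    auto_sync: bool


POLICIES = {
    TrustLevel.TRUSTED: TrustPolicy(
        auto_import_claims=True, needs_local_review=False,
        show_verdicts=True, verdict_disclaimer=False, auto_sync=True),
    TrustLevel.NEUTRAL: TrustPolicy(
        auto_import_claims=True, needs_local_review=True,
        show_verdicts=True, verdict_disclaimer=True, auto_sync=True),
    TrustLevel.UNTRUSTED: TrustPolicy(
        auto_import_claims=False, needs_local_review=True,
        show_verdicts=False, verdict_disclaimer=False, auto_sync=False),
}


def policy_for(trust_table: dict, node: str) -> TrustPolicy:
    """Look up a node's policy. Unknown nodes fall back to UNTRUSTED (assumption)."""
    return POLICIES[trust_table.get(node, TrustLevel.UNTRUSTED)]
```

Keeping the table as plain local data reflects the point above: each node edits its own table, and nothing is looked up in a shared registry.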
112
113
114 == 5. Data Sharing Model ==
115
116 === 5.1 What Nodes Share ===
117
118 **Always shared** (if federation is enabled):
119 * Claims and claim clusters
120 * Scenario structures
121 * Evidence metadata and content hashes
122 * Integrity signatures
123
124 **Optionally shared**:
125 * Full evidence files (large documents)
126 * Verdicts (nodes may choose to keep verdicts local)
127 * Vector embeddings
128 * Scenario templates
129 * AKEL distilled knowledge
130
131 **Never shared**:
132 * Internal user lists
133 * Reviewer comments and internal discussions
134 * Governance decisions and meeting notes
135 * Access control data
136 * Private or sensitive content marked as local-only
137
138 === 5.2 Large Evidence Files ===
139
140 Evidence files are:
141 * Stored locally by default
142 * Referenced via global content hash
143 * Optionally served through IPFS
144 * Accessible via direct peer-to-peer transfer
145 * Optionally stored in S3-compatible object storage
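
Referencing a file by content hash means a receiving node can check any copy, whatever storage it came from. The sketch below assumes SHA-256 and a `sha256:<hex>` reference format. Neither is specified on this page, and the function names are hypothetical.

```python
import hashlib
import io


def content_hash(stream, chunk_size: int = 1 << 20) -> str:
    """Return a 'sha256:<hex>' reference, reading the stream in chunks
    so that large evidence files are never held in memory at once."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def matches_reference(stream, expected: str) -> bool:
    """Check a fetched copy against the hash advertised in the evidence metadata."""
    return content_hash(stream) == expected


# Usage: the same bytes always yield the same reference.
reference = content_hash(io.BytesIO(b"example evidence file"))
```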
146
147
148 == 6. Synchronization Protocol ==
149
150 Nodes exchange data using multiple synchronization methods:
151
152 === 6.1 Push-Based Synchronization ===
153
154 **Mechanism**: Webhooks
155
156 When local content changes:
157 1. Node builds signed bundle
158 2. Sends webhook notification to subscribed nodes
159 3. Remote nodes fetch bundle
160 4. Remote nodes validate and import
161
162 **Use case**: Real-time updates for trusted partners
163
164 === 6.2 Pull-Based Synchronization ===
165
166 **Mechanism**: Scheduled polling
167
168 Nodes periodically:
169 1. Query partner nodes for updates
170 2. Fetch changed entities since last sync
171 3. Validate and import
172 4. Store sync checkpoint
173
174 **Use case**: Regular batch updates, lower-trust nodes
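
The four polling steps can be sketched as a single cycle that takes the last checkpoint and returns the next one. Everything named here is hypothetical: the `Change` record, the per-partner `sequence` counter used as the checkpoint, and the in-memory `partner_log` that stands in for a remote query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    entity_id: str
    sequence: int  # assumed: monotonically increasing change counter on the partner


def fetch_changes_since(partner_log, checkpoint):
    """Stand-in for steps 1 and 2: ask the partner for changes after the checkpoint."""
    return [c for c in partner_log if c.sequence > checkpoint]


def pull_once(partner_log, checkpoint, validate, import_entity):
    """Run one polling cycle and return the new checkpoint (step 4)."""
    changes = sorted(fetch_changes_since(partner_log, checkpoint),
                     key=lambda c: c.sequence)
    for change in changes:
        if validate(change):          # step 3: validate ...
            import_entity(change)     # ... and import
        # Advance even for rejected changes so they are not re-fetched forever.
        checkpoint = change.sequence
    return checkpoint
```

Whether a rejected change should advance the checkpoint is a design choice. This sketch advances it, on the assumption that rejected items are logged or quarantined elsewhere.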
175
176 === 6.3 Subscription-Based Synchronization ===
177
178 **Mechanism**: WebSub-like protocol
179
180 Nodes subscribe to:
181 * Specific claim clusters
182 * Specific domains (medical, energy, etc.)
183 * Specific scenario types
184 * Verdict updates
185
186 Publisher pushes updates only to subscribers.
187
188 **Use case**: Selective federation, domain specialization
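
Selecting recipients for an update could look like the sketch below. The `Subscription` fields mirror the four subscription targets listed above. The update dictionary keys (`kind`, `cluster`, `domain`, `scenario_type`) and the "match on any filter" rule are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    subscriber: str
    clusters: set = field(default_factory=set)
    domains: set = field(default_factory=set)
    scenario_types: set = field(default_factory=set)
    verdict_updates: bool = False

    def wants(self, update: dict) -> bool:
        """True if the update matches at least one subscribed filter."""
        return (
            update.get("cluster") in self.clusters
            or update.get("domain") in self.domains
            or update.get("scenario_type") in self.scenario_types
            or (self.verdict_updates and update.get("kind") == "verdict")
        )


def recipients(subscriptions, update):
    """Return the subscribers that should receive this update."""
    return [s.subscriber for s in subscriptions if s.wants(update)]
```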
189
190 === 6.4 Large Asset Transfer ===
191
192 Files larger than 10 MB are transferred through one of:
193 * S3-compatible object storage
194 * IPFS (content-addressed)
195 * Direct peer-to-peer transfer
196 * Chunked HTTP transfer with resume support
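
For the chunked HTTP option, resuming amounts to requesting only the byte ranges that have not yet arrived. A minimal sketch follows. The 8 MiB default chunk size and the function names are assumptions, while the `bytes=start-end` form is the standard HTTP `Range` header syntax with inclusive bounds.

```python
def remaining_ranges(total_size: int, bytes_received: int,
                     chunk_size: int = 8 * 1024 * 1024):
    """Yield inclusive (start, end) byte ranges that still need to be fetched."""
    start = bytes_received
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def range_header(start: int, end: int) -> dict:
    """Build the HTTP Range header for one chunk."""
    return {"Range": f"bytes={start}-{end}"}
```

After an interrupted transfer, the receiver passes the number of bytes already stored as `bytes_received` and continues from there.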
197
198
199 == 7. Federation Sync Workflow ==
200
201 The complete synchronization sequence for creating and sharing new content is:
202
203 === 7.1 Step 1: Local Node Creates New Versions ===
204
205 User or AKEL creates:
206 * New claim version
207 * New scenario version
208 * New evidence version
209 * New verdict version
210
211 All changes are tracked with:
212 * VersionID
213 * ParentVersionID
214 * AuthorType
215 * Timestamp
216 * JustificationText
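
The five tracking fields map naturally onto an immutable record in which each version points at its parent. The sketch below is illustrative only: the use of UUIDs for `VersionID`, the ISO 8601 UTC timestamp, and the example `AuthorType` values are assumptions.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class VersionRecord:
    version_id: str
    parent_version_id: Optional[str]  # None for the first version of an entity
    author_type: str                  # assumed values: "human" or "akel"
    timestamp: str                    # assumed format: ISO 8601, UTC
    justification_text: str


def new_version(parent: Optional[VersionRecord], author_type: str,
                justification: str) -> VersionRecord:
    """Create a version that points back at its parent, forming a lineage chain."""
    return VersionRecord(
        version_id=str(uuid.uuid4()),
        parent_version_id=parent.version_id if parent else None,
        author_type=author_type,
        timestamp=datetime.now(timezone.utc).isoformat(),
        justification_text=justification,
    )
```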
217
218 === 7.2 Step 2: Federation Layer Builds Signed Bundle ===
219
220 Federation layer packages:
221 * Entity data (claim, scenario, evidence metadata, verdict)
222 * Version lineage (ParentVersionID chain)
223 * Cryptographic signatures
224 * Node provenance information
225 * Trust metadata
226
227 Bundle format:
228 * JSON-LD for structured data
229 * Content-addressed hashes
230 * Digital signatures for integrity
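
The hash-then-sign pattern behind such a bundle can be sketched with the standard library alone. Two simplifications are used and should not be read as the FactHarbor format: sorted-key JSON stands in for proper JSON-LD canonicalization, and an HMAC over a shared key stands in for an asymmetric digital signature, which a real federation would need so that nodes do not have to share secrets. All function and field names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(payload: dict) -> bytes:
    """Serialize deterministically so equal payloads hash identically."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_bundle(payload: dict, origin: str, key: bytes) -> dict:
    """Package a payload with its content hash and a signature."""
    body = canonical_bytes(payload)
    return {
        "origin": origin,
        "payload": payload,
        "content_hash": "sha256:" + hashlib.sha256(body).hexdigest(),
        # Stand-in for a digital signature (see the assumptions above).
        "signature": hmac.new(key, body, hashlib.sha256).hexdigest(),
    }


def verify_bundle(bundle: dict, key: bytes) -> bool:
    """Recompute hash and signature from the payload and compare."""
    body = canonical_bytes(bundle["payload"])
    expected_hash = "sha256:" + hashlib.sha256(body).hexdigest()
    expected_sig = hmac.new(key, body, hashlib.sha256).hexdigest()
    return (bundle["content_hash"] == expected_hash
            and hmac.compare_digest(bundle["signature"], expected_sig))
```

Deterministic serialization matters here: without it, two nodes could produce different hashes for the same logical content.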
231
232 === 7.3 Step 3: Bundle Includes Required Data ===
233
234 Each bundle contains:
235 * **Claims**: Full claim text, classification, domain
236 * **Scenarios**: Definitions, assumptions, boundaries
237 * **Evidence metadata**: Source URLs, hashes, reliability scores (not always full files)
238 * **Verdicts** (when the node shares them): Likelihood ranges, uncertainty, reasoning chains
239 * **Lineage**: Version history, parent relationships
240 * **Signatures**: Cryptographic proof of origin
241
242 === 7.4 Step 4: Bundle Pushed to Trusted Neighbor Nodes ===
243
244 Based on trust table:
245 * Push to **trusted nodes** immediately
246 * Queue for **neutral nodes** (batched)
247 * Skip **untrusted nodes**
248
249 Push methods:
250 * Webhook notification
251 * Direct API call
252 * Pub/Sub message queue
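
The routing rule based on the trust table can be sketched as a small planning function. The plain-string trust levels and the name `plan_push` are assumptions, and the actual delivery through webhook, API call, or message queue is left out.

```python
def plan_push(trust_table: dict):
    """Split partner nodes into immediate and batched recipients.

    trust_table maps node name -> "trusted" | "neutral" | "untrusted".
    """
    immediate, batched = [], []
    for node, level in trust_table.items():
        if level == "trusted":
            immediate.append(node)   # push right away
        elif level == "neutral":
            batched.append(node)     # queue for the next batch
        # untrusted nodes are skipped entirely
    return immediate, batched
```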
253
254 === 7.5 Step 5: Remote Nodes Validate Lineage and Signatures ===
255
256 Receiving node:
257 1. Verifies cryptographic signatures
258 2. Validates version lineage (ParentVersionID chain)
259 3. Checks for conflicts with local data
260 4. Validates data structure and required fields
261 5. Applies local trust policies
262
263 If any check fails, the bundle is rejected or quarantined.
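
Lineage validation (check 2 above) can be sketched as a resolution loop: every `ParentVersionID` must lead back to a version the node already knows or to a root version. The dictionary shape and the function name are assumptions, and versions may arrive in any order.

```python
def lineage_is_valid(versions: list, known_ids: set) -> bool:
    """Return True if every version's parent chain resolves.

    versions: dicts with "VersionID" and "ParentVersionID" (None for a root).
    known_ids: VersionIDs already stored on the receiving node.
    """
    seen = set(known_ids)
    pending = list(versions)
    progress = True
    while pending and progress:
        progress = False
        for version in list(pending):
            parent = version["ParentVersionID"]
            if parent is None or parent in seen:
                seen.add(version["VersionID"])
                pending.remove(version)
                progress = True
    # Anything left has a missing parent or sits in a cycle.
    return not pending
```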
264
265 === 7.6 Step 6: Accept or Branch Versions ===
266
267 **Accept** (if validation passes):
268 * Import new versions
269 * Maintain provenance metadata
270 * Link to local related entities
271 * Update local indices
272
273 **Branch** (if conflict detected):
274 * Create parallel version tree
275 * Mark as "external branch"
276 * Allow local reviewers to merge or reject
277 * Preserve both version histories
278
279 **Reject** (if validation fails):
280 * Log rejection reason
281 * Notify source node (optional)
282 * Quarantine for manual review (optional)
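
The three outcomes can be expressed as one decision function. This page does not define what counts as a conflict. The sketch assumes a conflict exists when the incoming version's parent is not the newest local version of the same entity, and all names are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    ACCEPT = "accept"
    BRANCH = "branch"
    REJECT = "reject"


def decide(incoming: dict, local_heads: dict, validation_passed: bool) -> Outcome:
    """Choose what to do with one incoming version.

    incoming: dict with "entity_id" and "ParentVersionID".
    local_heads: entity ID -> VersionID of the newest local version.
    """
    if not validation_passed:
        return Outcome.REJECT
    head = local_heads.get(incoming["entity_id"])
    if head is None or incoming["ParentVersionID"] == head:
        # New entity, or a direct successor of the local head.
        return Outcome.ACCEPT
    # Both sides changed since the common ancestor: keep both histories.
    return Outcome.BRANCH
```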
283
284 === 7.7 Step 7: Local Re-evaluation Runs if Required ===
285
286 After import, local node checks:
287 * Does new evidence affect existing verdicts?
288 * Do new scenarios require re-assessment?
289 * Are there contradictions with local content?
290
291 If yes:
292 * Trigger AKEL re-evaluation
293 * Queue for reviewer attention
294 * Update affected verdicts
295 * Notify users following related content
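
The first check, whether new evidence affects existing verdicts, could be approximated by following links from evidence to scenario to verdict. The two lookup tables in this sketch are assumptions about how those links are stored. The actual data model is described on the Data Model page.

```python
def verdicts_to_reevaluate(imported_evidence, evidence_scenario, verdict_scenario):
    """Return IDs of local verdicts whose scenario received new evidence.

    imported_evidence: iterable of evidence IDs from the accepted bundle.
    evidence_scenario: evidence ID -> scenario ID it supports or contradicts.
    verdict_scenario: verdict ID -> scenario ID the verdict was issued for.
    """
    touched = {evidence_scenario[e] for e in imported_evidence
               if e in evidence_scenario}
    return sorted(v for v, scenario in verdict_scenario.items()
                  if scenario in touched)
```

The returned list is what would be handed to AKEL re-evaluation and the reviewer queue.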
296
297
298 == 8. Cross-Node AI Knowledge Exchange ==
299
300 Each node runs its own AKEL instance and may exchange AI-derived knowledge:
301
302 === 8.1 What Can Be Shared ===
303
304 **Vector embeddings**:
305 * For cross-node claim clustering
306 * For semantic search alignment
307 * Never includes training data
308
309 **Canonical claim forms**:
310 * Normalized claim text
311 * Standard phrasing templates
312 * Domain-specific formulations
313
314 **Scenario templates**:
315 * Reusable scenario structures
316 * Common assumption patterns
317 * Evaluation method templates
318
319 **Contradiction alerts**:
320 * Detected conflicts between claims
321 * Evidence conflicts across nodes
322 * Scenario incompatibilities
323
324 **Metadata and insights**:
325 * Aggregate quality metrics
326 * Reliability signal extraction
327 * Bubble detection patterns
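
Cross-node claim clustering with shared embeddings typically reduces to a similarity comparison. The sketch below uses cosine similarity against cluster centroids. It assumes that both nodes produce embeddings in the same vector space, and the 0.8 threshold is an arbitrary illustrative value.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_cluster(embedding, centroids: dict, threshold: float = 0.8):
    """Return the ID of the most similar centroid, or None if none reaches the threshold."""
    best_id, best_score = None, threshold
    for cluster_id, centroid in centroids.items():
        score = cosine_similarity(embedding, centroid)
        if score >= best_score:
            best_id, best_score = cluster_id, score
    return best_id
```

A result of `None` would mean the remote claim starts a new local cluster instead of joining an existing one.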
328
329 === 8.2 What Can NEVER Be Shared ===
330
331 **Model weights**: No sharing of trained model parameters
332
333 **Training data**: No sharing of full training datasets
334
335 **Local governance overrides**: Local decisions that override AKEL suggestions stay on the node
336
337 **User behavior data**: No cross-node tracking
338
339 **Internal review discussions**: Private content stays private
340
341 === 8.3 Benefits of AI Knowledge Exchange ===
342
343 * Reduced duplication across nodes
344 * Improved claim clustering accuracy
345 * Faster contradiction detection
346 * Shared scenario libraries
347 * Cross-node quality improvements
348
349 === 8.4 Local Control Maintained ===
350
351 * Nodes accept or reject shared AI knowledge
352 * Human reviewers can override any AKEL suggestion
353 * Local governance always has final authority
354 * No external AI control over local content
355 * Privacy-preserving knowledge exchange
356
357
358 == 9. Decentralized Processing ==
359
360 Each node independently performs:
361 * AKEL processing
362 * Scenario drafting and validation
363 * Evidence review
364 * Verdict calculation
365 * Truth landscape summarization
366
367 Nodes can specialize:
368 * Health-focused node with medical experts
369 * Energy-focused node with domain knowledge
370 * Small node delegating scenario libraries to partners
371 * Regional node with language/culture specialization
372
373 Optional data sharing includes:
374 * Embeddings for clustering
375 * Claim clusters for alignment
376 * Scenario templates for efficiency
377 * Verdict comparison metadata
378
379
380 == 10. Scaling to Thousands of Users ==
381
382 Nodes scale independently through:
383 * Horizontally scalable API servers
384 * Worker pools for AKEL tasks
385 * Hybrid storage (local + S3/IPFS)
386 * Redis caching for performance
387 * Sharded or partitioned databases
388
389 Federation adds a second scaling axis: total capacity grows by adding new nodes rather than by enlarging a single deployment.
390
391 Communities may form:
392 * Domain-specific nodes (epidemiology, energy, climate)
393 * Language or region-based nodes
394 * NGO or institutional nodes
395 * Private organizational nodes
396 * Academic research nodes
397
398 Nodes cooperate through:
399 * Scenario library sharing
400 * Shared or overlapping claim clusters
401 * Expert delegation between nodes
402 * Distributed AKEL task support
403 * Cross-node quality audits
404
405
406 == 11. Federation and Release 1.0 ==
407
408 **POC**: Single node, optional federation experiments
409
410 **Beta 0**: 2-3 nodes, basic federation protocol
411
412 **Release 1.0**: Full federation support with:
413 * Robust synchronization protocol
414 * Trust model implementation
415 * Cross-node AI knowledge exchange
416 * Federated search and discovery
417 * Distributed audit collaboration
418 * Inter-node expert consultation
419
420
421 == 12. Related Pages ==
422
423 * [[AKEL (AI Knowledge Extraction Layer)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
424 * [[Data Model>>FactHarbor.Specification.Data Model.WebHome]]
425 * [[Architecture>>FactHarbor.Specification.Architecture.WebHome]]
426 * [[Workflows>>FactHarbor.Specification.Workflows.WebHome]]