Wiki source code of Federation & Decentralization

Last modified by Robert Schaub on 2025/12/24 20:35

1 = Federation & Decentralization =
2
3 FactHarbor is designed to operate as a **federated network of nodes** rather than a single central server.
4
5 Decentralization provides:
6
7 * **Resilience** against censorship or political pressure
8 * **Autonomy** for local governance and moderation
9 * **Scalability** across many independent communities
10 * **Trust** without centralized control
11 * **Domain specialization** (health-focused nodes, energy-focused nodes, etc.)
12
13 FactHarbor draws inspiration from the Fediverse but adds stricter data structure, explicit versioning, and integrity guarantees based on hashes and signatures.
14
15
16 == 1. Federated FactHarbor Nodes ==
17
18 Each FactHarbor instance ("node") maintains:
19
20 * Its own database
21 * Its own AKEL instance
22 * Its own reviewers, experts, and contributors
23 * Its own governance rules
24
25 Nodes exchange structured information:
26
27 * Claims
28 * Scenarios
29 * Evidence metadata (not necessarily full files)
30 * Verdicts (optional)
31 * Hashes and signatures for integrity
32
33 Nodes choose which external nodes they trust.
34
35
36 == 2. Global Identifiers ==
37
38 Every entity receives a globally unique, linkable identifier.
39
40 **Format**:
41 `factharbor://node_url/type/local_id`
42
43 **Example**:
44 `factharbor://factharbor.energy/claim/CLM-55812`
45
46 **Supported types**:
47
48 * `claim`
49 * `scenario`
50 * `evidence`
51 * `verdict`
52 * `user` (optional)
53 * `cluster`
54
55 **Properties**:
56
57 * Globally consistent
58 * Human-readable
59 * Hash-derived
60 * Independent of database internals
61 * URL-resolvable (future enhancement)
62
63 This allows cross-node references and prevents identifier collisions in federated environments.
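The identifier format above can be parsed and rebuilt mechanically. The following is a minimal sketch, assuming the `node_url` part is a bare host name without slashes; the names `GlobalId` and `parse_global_id` are hypothetical and not part of the specification.

```python
from dataclasses import dataclass

# Entity types listed under "Supported types".
SUPPORTED_TYPES = {"claim", "scenario", "evidence", "verdict", "user", "cluster"}
SCHEME = "factharbor://"


@dataclass(frozen=True)
class GlobalId:
    """Hypothetical value object for a FactHarbor global identifier."""
    node: str      # e.g. "factharbor.energy"
    type: str      # one of SUPPORTED_TYPES
    local_id: str  # e.g. "CLM-55812"

    def __str__(self) -> str:
        return f"{SCHEME}{self.node}/{self.type}/{self.local_id}"


def parse_global_id(text: str) -> GlobalId:
    """Split factharbor://node_url/type/local_id into its three parts.

    Assumption: node_url contains no "/" (a bare host name).
    """
    if not text.startswith(SCHEME):
        raise ValueError(f"not a factharbor identifier: {text!r}")
    parts = text[len(SCHEME):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected node/type/local_id, got: {text!r}")
    node, entity_type, local_id = parts
    if entity_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    return GlobalId(node, entity_type, local_id)
```

Calling `parse_global_id("factharbor://factharbor.energy/claim/CLM-55812")` yields a `GlobalId` whose `str()` form is the original identifier, so identifiers survive a round trip between nodes unchanged.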
64
65
66 == 3. Trust Model ==
67
68 Each node maintains a **trust table** that defines its relationships with other nodes.
69
70 === 3.1 Trust Levels ===
71
72 **Trusted Nodes**:
73
74 * Claims auto-imported
75 * Scenarios accepted without re-review
76 * Evidence considered valid
77 * Verdicts displayed to users
78 * High synchronization priority
79
80 **Neutral Nodes**:
81
82 * Claims imported but flagged for review
83 * Scenarios require local validation
84 * Evidence requires re-assessment
85 * Verdicts shown with "external node" disclaimer
86 * Normal synchronization priority
87
88 **Untrusted Nodes**:
89
90 * Claims quarantined, manual import only
91 * Scenarios rejected by default
92 * Evidence not accepted
93 * Verdicts not displayed
94 * No automatic synchronization
95
96 === 3.2 What Trust Affects ===
97
98 * **Auto-import**: Whether claims/scenarios are automatically added
99 * **Re-review requirements**: Whether local reviewers must validate
100 * **Verdict display**: Whether external verdicts are shown to users
101 * **Synchronization frequency**: How often data is exchanged
102 * **Reputation signals**: How external reputation is interpreted
103
104 === 3.3 Local Trust Authority ===
105
106 Each node's governance team decides:
107
108 * Which nodes to trust
109 * Trust level criteria
110 * Trust escalation/degradation rules
111 * Dispute resolution with partner nodes
112
113 Trust is **local and autonomous**: no global trust registry exists.
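A trust table of this kind can be modeled as a mapping from node to trust level, with one policy record per level. The sketch below is illustrative only: the field names, the `TrustTable` class, and the choice to treat unknown nodes as untrusted are assumptions, not part of the specification.

```python
from dataclasses import dataclass
from enum import Enum


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


@dataclass(frozen=True)
class TrustPolicy:
    """Hypothetical subset of the behaviors listed in section 3.1."""
    auto_import_claims: bool  # claims are imported without manual action
    claims_need_review: bool  # imported claims are flagged for local review
    show_verdicts: bool       # external verdicts are displayed to users
    verdict_disclaimer: bool  # verdicts carry an "external node" disclaimer
    auto_sync: bool           # node takes part in automatic synchronization


POLICIES = {
    TrustLevel.TRUSTED: TrustPolicy(
        auto_import_claims=True, claims_need_review=False,
        show_verdicts=True, verdict_disclaimer=False, auto_sync=True),
    TrustLevel.NEUTRAL: TrustPolicy(
        auto_import_claims=True, claims_need_review=True,
        show_verdicts=True, verdict_disclaimer=True, auto_sync=True),
    TrustLevel.UNTRUSTED: TrustPolicy(
        auto_import_claims=False, claims_need_review=True,
        show_verdicts=False, verdict_disclaimer=False, auto_sync=False),
}


class TrustTable:
    """Local, per-node trust table; there is no global registry."""

    def __init__(self, default: TrustLevel = TrustLevel.UNTRUSTED):
        # Assumption: nodes that were never rated fall back to `default`.
        self._levels = {}
        self._default = default

    def set_level(self, node: str, level: TrustLevel) -> None:
        self._levels[node] = level

    def level_of(self, node: str) -> TrustLevel:
        return self._levels.get(node, self._default)

    def policy_for(self, node: str) -> TrustPolicy:
        return POLICIES[self.level_of(node)]
```

Keeping the policies in a plain dictionary lets a governance team adjust what each level means without touching the code that consults the table.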
114
115
116 == 4. Data Sharing Model ==
117
118 === 4.1 What Nodes Share ===
119
120 **Always shared** (if federation is enabled):
121
122 * Claims and claim clusters
123 * Scenario structures
124 * Evidence metadata and content hashes
125 * Integrity signatures
126
127 **Optionally shared**:
128
129 * Full evidence files (large documents)
130 * Verdicts (nodes may choose to keep verdicts local)
131 * Vector embeddings
132 * Scenario templates
133 * AKEL distilled knowledge
134
135 **Never shared**:
136
137 * Internal user lists
138 * Reviewer comments and internal discussions
139 * Governance decisions and meeting notes
140 * Access control data
141 * Private or sensitive content marked as local-only
142
143 === 4.2 Large Evidence Files ===
144
145 Evidence files are:
146
147 * Stored locally by default
148 * Referenced via global content hash
149 * Optionally served through IPFS
150 * Accessible via direct peer-to-peer transfer
151 * Optionally stored in S3-compatible object storage
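Referencing a file by content hash means any node can check that the bytes it received are the bytes that were referenced. The specification does not name a hash algorithm; the sketch below assumes SHA-256 and a hypothetical `sha256:` prefix, and reads the file in chunks so large documents do not have to fit in memory.

```python
import hashlib
import io


def content_hash(stream, chunk_size: int = 1024 * 1024) -> str:
    """Return a content hash for a binary stream.

    Assumptions: SHA-256 as the algorithm and "sha256:<hex>" as the
    textual form; neither is fixed by the specification.
    """
    digest = hashlib.sha256()
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def matches_reference(stream, expected: str) -> bool:
    """Check received evidence bytes against the referenced hash."""
    return content_hash(stream) == expected


# Usage with an in-memory stand-in for an evidence file.
reference = content_hash(io.BytesIO(b"example evidence document"))
```

Because the hash depends only on the bytes, the same reference stays valid whether the file is served locally, through IPFS, or from object storage.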
152
153 == 5. Synchronization Protocol ==
154
155 Nodes exchange data using multiple synchronization methods:
156
157 === 5.1 Push-Based Synchronization ===
158
159 **Mechanism**: Webhooks
160
161 When local content changes:
162
163 1. Node builds signed bundle
164 2. Sends webhook notification to subscribed nodes
165 3. Remote nodes fetch bundle
166 4. Remote nodes validate and import
167
168 **Use case**: Real-time updates for trusted partners
169
170 === 5.2 Pull-Based Synchronization ===
171
172 **Mechanism**: Scheduled polling
173
174 Nodes periodically:
175
176 1. Query partner nodes for updates
177 2. Fetch changed entities since last sync
178 3. Validate and import
179 4. Store sync checkpoint
180
181 **Use case**: Regular batch updates, lower trust nodes
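The four polling steps can be sketched with an in-memory stand-in for the partner node. Everything here is hypothetical: `PartnerNode`, `changes_since`, and the integer change counter used as a checkpoint are illustrative choices, since the specification does not define the query interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    global_id: str
    updated_at: int  # hypothetical monotonically increasing change counter


class PartnerNode:
    """In-memory stand-in for a remote node's update endpoint."""

    def __init__(self, entities):
        self.entities = list(entities)

    def changes_since(self, checkpoint: int):
        changed = [e for e in self.entities if e.updated_at > checkpoint]
        return sorted(changed, key=lambda e: e.updated_at)


@dataclass
class PullSync:
    checkpoint: int = 0
    imported: list = field(default_factory=list)

    def run(self, partner: PartnerNode, validate) -> int:
        """Run one polling cycle and return the number of imported entities."""
        count = 0
        # Steps 1-2: query the partner and fetch entities changed since last sync.
        for entity in partner.changes_since(self.checkpoint):
            # Step 3: validate and import.
            if validate(entity):
                self.imported.append(entity)
                count += 1
            # Step 4: store the checkpoint. Design choice in this sketch:
            # rejected entities also advance it, so they are not refetched.
            self.checkpoint = max(self.checkpoint, entity.updated_at)
        return count
```

Running `PullSync.run` twice against an unchanged partner imports nothing the second time, which is the property the stored checkpoint is meant to provide.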
182
183 === 5.3 Subscription-Based Synchronization ===
184
185 **Mechanism**: WebSub-like protocol
186
187 Nodes subscribe to:
188
189 * Specific claim clusters
190 * Specific domains (medical, energy, etc.)
191 * Specific scenario types
192 * Verdict updates
193
194 Publisher pushes updates only to subscribers.
195
196 **Use case**: Selective federation, domain specialization
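On the publisher side, selective federation reduces to matching each update against the stored subscriptions. The sketch below assumes a subscription matches when any one of its criteria matches; the `Update` and `Subscription` shapes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Update:
    cluster: str
    domain: str
    scenario_type: Optional[str] = None
    is_verdict: bool = False


@dataclass
class Subscription:
    subscriber: str  # node that should receive matching updates
    clusters: set = field(default_factory=set)
    domains: set = field(default_factory=set)
    scenario_types: set = field(default_factory=set)
    verdict_updates: bool = False

    def matches(self, update: Update) -> bool:
        # Assumption: criteria are combined with OR, not AND.
        return (
            update.cluster in self.clusters
            or update.domain in self.domains
            or update.scenario_type in self.scenario_types
            or (self.verdict_updates and update.is_verdict)
        )


def recipients(subscriptions, update: Update):
    """Return the subscribers the publisher should push this update to."""
    return [s.subscriber for s in subscriptions if s.matches(update)]
```

A node that subscribes only to the `energy` domain would, under this sketch, never receive medical updates from the same publisher.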
197
198 === 5.4 Large Asset Transfer ===
199
200 For files larger than 10 MB:
201
202 * S3-compatible object storage
203 * IPFS (content-addressed)
204 * Direct peer-to-peer transfer
205 * Chunked HTTP transfer with resume support
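For the chunked HTTP option, resume support usually means requesting only the byte ranges that have not arrived yet. The sketch below generates standard HTTP `Range` header values for the remaining chunks; the function name and the chunk size are illustrative assumptions, and the actual transfer calls are left out.

```python
def range_headers(total_size: int, already_received: int, chunk_size: int):
    """Yield HTTP Range header values for the bytes still missing.

    Byte positions in a Range header are zero-based and inclusive,
    e.g. "bytes=0-3" requests the first four bytes.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = already_received
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield f"bytes={start}-{end}"
        start = end + 1


# A 10-byte file fetched in 4-byte chunks, resumed after 8 bytes arrived.
fresh = list(range_headers(10, 0, 4))
resumed = list(range_headers(10, 8, 4))
```

After an interrupted transfer, the receiver only needs to remember how many bytes it has stored to continue from that offset.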
206
207 == 6. Federation Sync Workflow ==
208
209 Complete synchronization sequence for creating and sharing new content:
210
211 === 6.1 Step 1: Local Node Creates New Versions ===
212
213 User or AKEL creates:
214
215 * New claim version
216 * New scenario version
217 * New evidence version
218 * New verdict version
219
220 All changes tracked with:
221
222 * VersionID
223 * ParentVersionID
224 * AuthorType
225 * Timestamp
226 * JustificationText
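The five tracking fields map naturally onto an immutable record. This is a minimal sketch: the field types, the random hex `VersionID`, and the author type values `"user"` and `"akel"` are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Version:
    version_id: str
    parent_version_id: Optional[str]  # None for the first version
    author_type: str                  # assumed values: "user" or "akel"
    timestamp: datetime
    justification_text: str


def new_version(parent: Optional[Version], author_type: str,
                justification: str) -> Version:
    """Create a version that points back at its parent."""
    return Version(
        version_id=uuid.uuid4().hex,  # assumption: random IDs, not hash-derived
        parent_version_id=parent.version_id if parent else None,
        author_type=author_type,
        timestamp=datetime.now(timezone.utc),
        justification_text=justification,
    )


first = new_version(None, "user", "Initial claim text")
second = new_version(first, "akel", "Normalized wording")
```

Because each record is frozen and only links backwards, earlier versions are never modified when a new one is created.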
227
228 === 6.2 Step 2: Federation Layer Builds Signed Bundle ===
229
230 Federation layer packages:
231
232 * Entity data (claim, scenario, evidence metadata, verdict)
233 * Version lineage (ParentVersionID chain)
234 * Cryptographic signatures
235 * Node provenance information
236 * Trust metadata
237
238 Bundle format:
239
240 * JSON-LD for structured data
241 * Content-addressed hashes
242 * Digital signatures for integrity
243
244 === 6.3 Step 3: Bundle Includes Required Data ===
245
246 Each bundle contains:
247
248 * **Claims**: Full claim text, classification, domain
249 * **Scenarios**: Definitions, assumptions, boundaries
250 * **Evidence metadata**: Source URLs, hashes, reliability scores (not always full files)
251 * **Verdicts**: Likelihood ranges, uncertainty, reasoning chains
252 * **Lineage**: Version history, parent relationships
253 * **Signatures**: Cryptographic proof of origin
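Signing a bundle requires a stable byte representation, a content hash, and a signature over both the payload and the originating node. The specification does not name a signature scheme, so the sketch below uses HMAC-SHA256 from the Python standard library purely as a stand-in; a real federation would more likely use asymmetric signatures so receivers need no shared secret. The field names are hypothetical, and the JSON-LD context is omitted.

```python
import hashlib
import hmac
import json


def canonical_bytes(data: dict) -> bytes:
    """Serialize deterministically so equal data gives equal bytes."""
    return json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")


def _signature(node: str, payload: dict, key: bytes) -> str:
    # The signed material covers the payload and the claimed origin node.
    signed = canonical_bytes({"node": node, "payload": payload})
    return hmac.new(key, signed, hashlib.sha256).hexdigest()


def build_bundle(payload: dict, node: str, key: bytes) -> dict:
    return {
        "node": node,
        "payload": payload,
        "content_hash": hashlib.sha256(canonical_bytes(payload)).hexdigest(),
        "signature": _signature(node, payload, key),  # stand-in: HMAC, shared key
    }


def verify_bundle(bundle: dict, key: bytes) -> bool:
    expected_hash = hashlib.sha256(canonical_bytes(bundle["payload"])).hexdigest()
    expected_sig = _signature(bundle["node"], bundle["payload"], key)
    return (hmac.compare_digest(bundle["content_hash"], expected_hash)
            and hmac.compare_digest(bundle["signature"], expected_sig))


key = b"demo-shared-secret"  # illustrative only
bundle = build_bundle(
    {"claims": [{"id": "CLM-55812", "text": "Example claim"}]},
    node="factharbor.energy",
    key=key,
)
```

Any change to the payload or to the `node` field after signing makes `verify_bundle` return `False`, which is the integrity property the bundle format is meant to provide.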
254
255 === 6.4 Step 4: Bundle Pushed to Trusted Neighbor Nodes ===
256
257 Based on trust table:
258
259 * Push to **trusted nodes** immediately
260 * Queue for **neutral nodes** (batched)
261 * Skip **untrusted nodes**
262
263 Push methods:
264
265 * Webhook notification
266 * Direct API call
267 * Pub/Sub message queue
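The routing rule can be expressed as a small function over the trust table. In this sketch the trust table is a plain dictionary from node name to level string, and any level other than `trusted` or `neutral` is treated as untrusted; both are simplifying assumptions.

```python
def plan_distribution(trust_table: dict) -> dict:
    """Group partner nodes by how a new bundle should reach them."""
    plan = {"push_now": [], "queue": [], "skip": []}
    for node, level in sorted(trust_table.items()):
        if level == "trusted":
            plan["push_now"].append(node)  # pushed immediately
        elif level == "neutral":
            plan["queue"].append(node)     # delivered later in a batch
        else:
            plan["skip"].append(node)      # untrusted or unknown level
    return plan


plan = plan_distribution({
    "factharbor.energy": "trusted",
    "factharbor.health": "neutral",
    "spam.example": "untrusted",
})
```

Separating the plan from the delivery step keeps the choice of push method (webhook, API call, or message queue) independent of the trust decision.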
268
269 === 6.5 Step 5: Remote Nodes Validate Lineage and Signatures ===
270
271 Receiving node:
272
273 1. Verifies cryptographic signatures
274 2. Validates version lineage (ParentVersionID chain)
275 3. Checks for conflicts with local data
276 4. Validates data structure and required fields
277 5. Applies local trust policies
278
279 If validation fails, the receiving node rejects or quarantines the bundle.
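Lineage validation (step 2) amounts to following each `ParentVersionID` link until it reaches a root version or a version the receiving node already stores. The sketch below assumes versions arrive as dictionaries with `VersionID` and `ParentVersionID` keys and reports problems as strings; the function name and the report format are hypothetical.

```python
def validate_lineage(versions: list, known_ids: set) -> list:
    """Return a list of lineage problems; an empty list means the chain is valid.

    versions:  dicts with "VersionID" and "ParentVersionID" (None for a root)
    known_ids: VersionIDs already stored on the receiving node
    """
    problems = []
    by_id = {}
    for version in versions:
        if version["VersionID"] in by_id:
            problems.append(f"duplicate version {version['VersionID']}")
        by_id[version["VersionID"]] = version

    for version in versions:
        seen = set()
        current = version
        while current is not None:
            vid = current["VersionID"]
            if vid in seen:
                problems.append(f"{version['VersionID']}: cycle at {vid}")
                break
            seen.add(vid)
            parent = current["ParentVersionID"]
            if parent is None or parent in known_ids:
                break  # reached a root or a locally known ancestor
            current = by_id.get(parent)
            if current is None:
                problems.append(f"{version['VersionID']}: missing ancestor {parent}")
    return problems
```

A bundle whose versions all trace back to a root or to locally stored history yields an empty list; anything else gives the receiving node a concrete reason to reject or quarantine it.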
280
281 === 6.6 Step 6: Accept or Branch Versions ===
282
283 **Accept** (if validation passes):
284
285 * Import new versions
286 * Maintain provenance metadata
287 * Link to local related entities
288 * Update local indices
289
290 **Branch** (if conflict detected):
291
292 * Create parallel version tree
293 * Mark as "external branch"
294 * Allow local reviewers to merge or reject
295 * Preserve both version histories
296
297 **Reject** (if validation fails):
298
299 * Log rejection reason
300 * Notify source node (optional)
301 * Quarantine for manual review (optional)
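The three outcomes can be summarized as one decision function. The specification does not define what counts as a conflict; this sketch assumes a conflict exists when the incoming version's parent is not the latest version the local node holds for that entity. The names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    ACCEPT = "accept"
    BRANCH = "branch"
    REJECT = "reject"


def decide(validation_ok: bool,
           incoming_parent: Optional[str],
           local_head: Optional[str]) -> Outcome:
    """Choose how to handle one incoming version.

    incoming_parent: ParentVersionID of the incoming version
    local_head:      latest local VersionID for the entity, None if unknown
    """
    if not validation_ok:
        return Outcome.REJECT
    # Assumption: an entity unknown locally, or a version that extends the
    # local head directly, can be imported without conflict.
    if local_head is None or incoming_parent == local_head:
        return Outcome.ACCEPT
    # Both sides changed the same entity: keep both histories.
    return Outcome.BRANCH
```

For example, `decide(True, "v1", "v2")` returns `Outcome.BRANCH`, because the local node has already moved past the version the remote change was based on.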
302
303 === 6.7 Step 7: Local Re-evaluation Runs if Required ===
304
305 After import, local node checks:
306
307 * Does new evidence affect existing verdicts?
308 * Do new scenarios require re-assessment?
309 * Are there contradictions with local content?
310
311 If yes:
312
313 * Trigger AKEL re-evaluation
314 * Queue for reviewer attention
315 * Update affected verdicts
316 * Notify users following related content
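The first check, whether new evidence affects existing verdicts, can be sketched as a lookup through scenarios. This assumes each verdict belongs to one scenario and each imported evidence item names the scenario it bears on; the specification's Data Model page may define these relationships differently.

```python
def verdicts_to_reevaluate(verdict_scenarios: dict, imported_evidence: list) -> list:
    """Return the verdict IDs whose scenario received new evidence.

    verdict_scenarios: verdict ID -> scenario ID (assumed one-to-one link)
    imported_evidence: (evidence ID, scenario ID) pairs from the import
    """
    touched_scenarios = {scenario for _, scenario in imported_evidence}
    return sorted(
        verdict for verdict, scenario in verdict_scenarios.items()
        if scenario in touched_scenarios
    )


queue = verdicts_to_reevaluate(
    {"VER-1": "SCN-1", "VER-2": "SCN-2", "VER-3": "SCN-1"},
    [("EVD-10", "SCN-1")],
)
```

The resulting list is what would be handed to AKEL re-evaluation and to the reviewer queue; verdicts for untouched scenarios are left alone.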
317
318 == 7. Cross-Node AI Knowledge Exchange ==
319
320 Each node runs its own AKEL instance and may exchange AI-derived knowledge:
321
322 === 7.1 What Can Be Shared ===
323
324 **Vector embeddings**:
325
326 * For cross-node claim clustering
327 * For semantic search alignment
328 * Never includes training data
329
330 **Canonical claim forms**:
331
332 * Normalized claim text
333 * Standard phrasing templates
334 * Domain-specific formulations
335
336 **Scenario templates**:
337
338 * Reusable scenario structures
339 * Common assumption patterns
340 * Evaluation method templates
341
342 **Contradiction alerts**:
343
344 * Detected conflicts between claims
345 * Evidence conflicts across nodes
346 * Scenario incompatibilities
347
348 **Metadata and insights**:
349
350 * Aggregate quality metrics
351 * Reliability signal extraction
352 * Bubble detection patterns
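Shared vector embeddings support cross-node claim clustering by comparing vectors directly. The sketch below uses cosine similarity with a hypothetical threshold of 0.9 and assumes both nodes produced their embeddings with the same model, since vectors from different models are not comparable.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_remote_claims(local: dict, remote: dict, threshold: float = 0.9) -> dict:
    """Map each remote claim ID to the most similar local claim ID, or None.

    local, remote: claim ID -> embedding vector
    threshold:     hypothetical minimum similarity for treating two claims
                   as members of the same cluster
    """
    matches = {}
    for remote_id, remote_vec in remote.items():
        best_id, best_score = None, threshold
        for local_id, local_vec in local.items():
            score = cosine(remote_vec, local_vec)
            if score >= best_score:
                best_id, best_score = local_id, score
        matches[remote_id] = best_id
    return matches


matches = match_remote_claims(
    local={"L1": [1.0, 0.0], "L2": [0.0, 1.0]},
    remote={"R1": [0.99, 0.05], "R2": [1.0, 1.0]},
)
```

Here `R1` is matched to `L1`, while `R2` sits between both local claims and stays unmatched, so it would be treated as a new claim rather than merged into an existing cluster.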
353
354 === 7.2 What Can NEVER Be Shared ===
355
356 **Model weights**: No sharing of trained model parameters
357
358 **Training data**: No sharing of full training datasets
359
360 **Local governance overrides**: AKEL suggestions can be overridden locally, and those overrides stay on the local node
361
362 **User behavior data**: No cross-node tracking
363
364 **Internal review discussions**: Private content stays private
365
366 === 7.3 Benefits of AI Knowledge Exchange ===
367
368 * Reduced duplication across nodes
369 * Improved claim clustering accuracy
370 * Faster contradiction detection
371 * Shared scenario libraries
372 * Cross-node quality improvements
373
374 === 7.4 Local Control Maintained ===
375
376 * Nodes accept or reject shared AI knowledge
377 * Human reviewers can override any AKEL suggestion
378 * Local governance always has final authority
379 * No external AI control over local content
380 * Privacy-preserving knowledge exchange
381
382 == 8. Decentralized Processing ==
383
384 Each node independently performs:
385
386 * AKEL processing
387 * Scenario drafting and validation
388 * Evidence review
389 * Verdict calculation
390 * Truth landscape summarization
391
392 Nodes can specialize:
393
394 * Health-focused node with medical experts
395 * Energy-focused node with domain knowledge
396 * Small node delegating scenario libraries to partners
397 * Regional node with language/culture specialization
398
399 Optional data sharing includes:
400
401 * Embeddings for clustering
402 * Claim clusters for alignment
403 * Scenario templates for efficiency
404 * Verdict comparison metadata
405
406 == 9. Scaling to Thousands of Users ==
407
408 Nodes scale independently through:
409
410 * Horizontally scalable API servers
411 * Worker pools for AKEL tasks
412 * Hybrid storage (local + S3/IPFS)
413 * Redis caching for performance
414 * Sharded or partitioned databases
415
416 Federation adds a second dimension of horizontal scaling: total network capacity grows as new nodes join.
417
418 Communities may form:
419
420 * Domain-specific nodes (epidemiology, energy, climate)
421 * Language or region-based nodes
422 * NGO or institutional nodes
423 * Private organizational nodes
424 * Academic research nodes
425
426 Nodes cooperate through:
427
428 * Scenario library sharing
429 * Shared or overlapping claim clusters
430 * Expert delegation between nodes
431 * Distributed AKEL task support
432 * Cross-node quality audits
433
434 == 10. Federation and Release 1.0 ==
435
436 **POC**: Single node, optional federation experiments
437
438 **Beta 0**: 2-3 nodes, basic federation protocol
439
440 **Release 1.0**: Full federation support with:
441
442 * Robust synchronization protocol
443 * Trust model implementation
444 * Cross-node AI knowledge exchange
445 * Federated search and discovery
446 * Distributed audit collaboration
447 * Inter-node expert consultation
448
449 == 11. Related Pages ==
450
451 * [[AKEL (AI Knowledge Extraction Layer)>>Archive.FactHarbor V0\.9\.23 Lost Data.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
452 * [[Data Model>>Archive.FactHarbor V0\.9\.23 Lost Data.Specification.Data Model.WebHome]]
453 * [[Architecture>>Archive.FactHarbor V0\.9\.23 Lost Data.Specification.Architecture.WebHome]]
454 * [[Workflows>>Archive.FactHarbor V0\.9\.23 Lost Data.Specification.Workflows.WebHome]]