Last modified by Robert Schaub on 2025/12/24 20:33

From version 6.1
edited by Robert Schaub
on 2025/12/12 21:27
Change comment: Rollback to version 4.1
To version 7.1
edited by Robert Schaub
on 2025/12/14 22:27
Change comment: Imported from XAR

... ... @@ -1,221 +1,431 @@
1 1  = Federation & Decentralization =
2 2  
3 -FactHarbor is designed as a network of independent nodes rather than a single centralized service.
4 -This provides resilience, local autonomy, and virtually unlimited scalability.
3 +FactHarbor is designed to operate as a **federated network of nodes** rather than a single central server.
5 5  
6 -Each node remains fully functional on its own while participating in a shared global knowledge graph.
5 +Decentralization provides:
6 +* **Resilience** against censorship or political pressure
7 +* **Autonomy** for local governance and moderation
8 +* **Scalability** across many independent communities
9 +* **Trust** without centralized control
10 +* **Domain specialization** (health-focused nodes, energy-focused nodes, etc.)
7 7  
8 -----
12 +FactHarbor draws inspiration from the Fediverse but uses stronger structure, versioning, and integrity guarantees.
9 9  
10 -= Purpose of Federation =
11 -
12 -A federated architecture enables:
13 -
14 -* Resilience against censorship or political pressure
15 -* Local governance and moderation autonomy
16 -* Scalability by adding more nodes, not bigger servers
17 -* Shared knowledge structures without central authority
18 -* Domain specialization (e.g., health-focused node, energy-focused node)
19 -* Cross-node collaboration while preserving independence
20 -
21 -FactHarbor takes inspiration from the Fediverse but uses stronger structure, integrity guarantees, and version lineage.
22 -
23 23  ----
24 24  
25 -= Federated Node Model =
16 +== Federated FactHarbor Nodes ==
26 26  
27 -Each FactHarbor node maintains:
28 -
29 -* Its own PostgreSQL database
30 -* Its own vector database
31 -* Its own object storage
18 +Each FactHarbor instance ("node") maintains:
19 +* Its own database
32 32  * Its own AKEL instance
33 -* Its own reviewers, experts, and moderators
34 -* Its own trust and governance policies
21 +* Its own reviewers, experts, and contributors
22 +* Its own governance rules
35 35  
36 -Nodes exchange structured objects:
37 -
24 +Nodes exchange structured information:
38 38  * Claims
39 39  * Scenarios
40 -* Evidence metadata (not raw files unless elected)
27 +* Evidence metadata (not necessarily full files)
41 41  * Verdicts (optional)
42 -* Integrity hashes and signatures
29 +* Hashes and signatures for integrity
43 43  
44 -Nodes decide individually which external nodes to trust.
31 +Nodes choose which external nodes they trust.
45 45  
46 46  ----
47 47  
48 -= Global Identifiers =
35 +== Global Identifiers ==
49 49  
50 -Each entity receives a globally unique ID.
37 +Every entity receives a globally unique, linkable identifier.
51 51  
52 -Format:
53 -``factharbor://<node>/<type>/<local_id>``
39 +**Format**:
40 +`factharbor://node_url/type/local_id`
54 54  
55 -Example:
56 -``factharbor://factharbor.energy/claim/CLM-55812``
42 +**Example**:
43 +`factharbor://factharbor.energy/claim/CLM-55812`
57 57  
58 -Types include:
59 -* claim
60 -* scenario
61 -* evidence
62 -* verdict
63 -* user (optional)
64 -* cluster
45 +**Supported types**:
46 +* `claim`
47 +* `scenario`
48 +* `evidence`
49 +* `verdict`
50 +* `user` (optional)
51 +* `cluster`
65 65  
66 -Identifiers are:
53 +**Properties**:
67 67  * Globally consistent
68 68  * Human-readable
69 69  * Hash-derived
70 -* Independent from internal database IDs
57 +* Independent of database internals
58 +* URL-resolvable (future enhancement)
71 71  
60 +This allows cross-node references and prevents identifier collisions in federated environments.
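The identifier format above can be checked mechanically. The following is a minimal sketch, assuming the `factharbor://` scheme parses like an ordinary URL; the names `GlobalId` and `parse_global_id` are hypothetical and are not part of the specification.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Entity types listed under "Supported types".
SUPPORTED_TYPES = {"claim", "scenario", "evidence", "verdict", "user", "cluster"}


@dataclass(frozen=True)
class GlobalId:
    """Hypothetical in-memory form of a factharbor:// identifier."""
    node: str
    entity_type: str
    local_id: str

    def __str__(self) -> str:
        return f"factharbor://{self.node}/{self.entity_type}/{self.local_id}"


def parse_global_id(value: str) -> GlobalId:
    """Split an identifier into node, type, and local ID, or raise ValueError."""
    parsed = urlparse(value)
    if parsed.scheme != "factharbor" or not parsed.netloc:
        raise ValueError(f"not a factharbor identifier: {value!r}")
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 2 or not parts[1]:
        raise ValueError(f"expected node/type/local_id: {value!r}")
    if parts[0] not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported entity type: {parts[0]!r}")
    return GlobalId(node=parsed.netloc, entity_type=parts[0], local_id=parts[1])
```

Because the node name is part of the identifier, two nodes can reuse the same local ID without colliding.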
61 +
72 72  ----
73 73  
74 -= Data Sharing Model =
64 +== Trust Model ==
75 75  
76 -Nodes share:
66 +Each node maintains a **trust table** defining relationships with other nodes:
77 77  
78 -* Claims
79 -* Scenario structures
80 -* Evidence metadata + content hashes
81 -* Optional verdicts
82 -* Integrity metadata
68 +=== Trust Levels ===
83 83  
84 -Nodes **do not** share:
70 +**Trusted Nodes**:
71 +* Claims auto-imported
72 +* Scenarios accepted without re-review
73 +* Evidence considered valid
74 +* Verdicts displayed to users
75 +* High synchronization priority
85 85  
86 -* Local users
87 -* Review discussions
88 -* Internal moderation notes
89 -* Private evidence
77 +**Neutral Nodes**:
78 +* Claims imported but flagged for review
79 +* Scenarios require local validation
80 +* Evidence requires re-assessment
81 +* Verdicts shown with "external node" disclaimer
82 +* Normal synchronization priority
90 90  
91 -Large assets may use:
84 +**Untrusted Nodes**:
85 +* Claims quarantined, manual import only
86 +* Scenarios rejected by default
87 +* Evidence not accepted
88 +* Verdicts not displayed
89 +* No automatic synchronization
92 92  
93 -* Local object storage
94 -* S3-compatible buckets
95 -* IPFS for cross-node replication (optional)
91 +=== Trust Affects ===
96 96  
93 +* **Auto-import**: Whether claims/scenarios are automatically added
94 +* **Re-review requirements**: Whether local reviewers must validate
95 +* **Verdict display**: Whether external verdicts are shown to users
96 +* **Synchronization frequency**: How often data is exchanged
97 +* **Reputation signals**: How external reputation is interpreted
98 +
99 +=== Local Trust Authority ===
100 +
101 +Each node's governance team decides:
102 +* Which nodes to trust
103 +* Trust level criteria
104 +* Trust escalation/degradation rules
105 +* Dispute resolution with partner nodes
106 +
107 +Trust is **local and autonomous**: no global trust registry exists.
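One way to picture the trust table is as a lookup from node name to a policy record. The sketch below is illustrative only: the field names, the `policy_for` helper, and the choice to treat nodes missing from the table as untrusted are assumptions, not something the specification states.

```python
from dataclasses import dataclass
from enum import Enum


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


@dataclass(frozen=True)
class TrustPolicy:
    """Hypothetical policy flags derived from the trust-level lists."""
    auto_import_claims: bool    # claims are imported without manual action
    flag_for_review: bool       # imported claims are flagged for local review
    show_verdicts: bool         # external verdicts are displayed to users
    verdict_disclaimer: bool    # verdicts carry an "external node" disclaimer
    automatic_sync: bool        # node takes part in automatic synchronization


POLICIES = {
    TrustLevel.TRUSTED: TrustPolicy(
        auto_import_claims=True, flag_for_review=False,
        show_verdicts=True, verdict_disclaimer=False, automatic_sync=True),
    TrustLevel.NEUTRAL: TrustPolicy(
        auto_import_claims=True, flag_for_review=True,
        show_verdicts=True, verdict_disclaimer=True, automatic_sync=True),
    TrustLevel.UNTRUSTED: TrustPolicy(
        auto_import_claims=False, flag_for_review=True,
        show_verdicts=False, verdict_disclaimer=False, automatic_sync=False),
}


def policy_for(node: str, trust_table: dict) -> TrustPolicy:
    """Look up a node's policy; unknown nodes default to untrusted (assumption)."""
    level = trust_table.get(node, TrustLevel.UNTRUSTED)
    return POLICIES[level]
```

Since each node holds its own table, the same partner may be trusted by one node and neutral for another.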
108 +
97 97  ----
98 98  
99 -= Trust Model =
111 +== Data Sharing Model ==
100 100  
101 -Each node maintains a **trust table**:
113 +=== What Nodes Share ===
102 102  
103 -* Trusted nodes
104 -* Neutral nodes
105 -* Untrusted nodes
115 +**Always shared** (if federation enabled):
116 +* Claims and claim clusters
117 +* Scenario structures
118 +* Evidence metadata and content hashes
119 +* Integrity signatures
106 106  
107 -Trust influences:
121 +**Optionally shared**:
122 +* Full evidence files (large documents)
123 +* Verdicts (nodes may choose to keep verdicts local)
124 +* Vector embeddings
125 +* Scenario templates
126 +* AKEL distilled knowledge
108 108  
109 -* Whether claims are auto-imported
110 -* Whether scenarios are accepted without re-review
111 -* Whether evidence requires new validation
112 -* Whether external verdicts are visible to users
128 +**Never shared**:
129 +* Internal user lists
130 +* Reviewer comments and internal discussions
131 +* Governance decisions and meeting notes
132 +* Access control data
133 +* Private or sensitive content marked as local-only
113 113  
114 -Reputation is local but can be mapped with trust-weighting.
135 +=== Large Evidence Files ===
115 115  
137 +Evidence files are:
138 +* Stored locally by default
139 +* Referenced via global content hash
140 +* Optionally served through IPFS
141 +* Accessible via direct peer-to-peer transfer
142 +* Optionally stored in S3-compatible object storage
143 +
116 116  ----
117 117  
118 -= Decentralized Processing =
146 +== Synchronization Protocol ==
119 119  
120 -Each node independently performs:
148 +Nodes exchange data using multiple synchronization methods:
121 121  
122 -* AKEL processing
123 -* Scenario drafting
124 -* Evidence review
125 -* Verdict calculation
126 -* Summary generation
127 -* Re-evaluation
150 +=== Push-Based Synchronization ===
128 128  
129 -Nodes may specialize:
152 +**Mechanism**: Webhooks
130 130  
131 -* medical node
132 -* psychology node
133 -* climate node
134 -* small node delegating expert review to trusted partners
154 +When local content changes:
155 +1. Node builds signed bundle
156 +2. Sends webhook notification to subscribed nodes
157 +3. Remote nodes fetch bundle
158 +4. Remote nodes validate and import
135 135  
136 -Optional cross-node data sharing includes:
160 +**Use case**: Real-time updates for trusted partners
137 137  
138 -* Embeddings
139 -* Claim clusters
140 -* Scenario templates
141 -* Verdict comparison metadata
142 -* Contradiction alerts
162 +=== Pull-Based Synchronization ===
143 143  
164 +**Mechanism**: Scheduled polling
165 +
166 +Nodes periodically:
167 +1. Query partner nodes for updates
168 +2. Fetch changed entities since last sync
169 +3. Validate and import
170 +4. Store sync checkpoint
171 +
172 +**Use case**: Regular batch updates, lower-trust nodes
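The four polling steps can be sketched as a single function. This is a minimal illustration, assuming each entity carries an integer `timestamp` and that transport, validation, and import are supplied by the caller; `pull_once` and its parameters are hypothetical names.

```python
from typing import Callable, Dict, Iterable

Entity = Dict[str, object]


def pull_once(
    partner: str,
    checkpoints: Dict[str, int],
    fetch_since: Callable[[str, int], Iterable[Entity]],
    validate: Callable[[Entity], bool],
    import_entity: Callable[[Entity], None],
) -> int:
    """Run one polling cycle against a partner node; return the import count."""
    since = checkpoints.get(partner, 0)
    newest = since
    imported = 0
    # Steps 1-2: query the partner and fetch entities changed since the last sync.
    for entity in fetch_since(partner, since):
        # Step 3: validate and import.
        if validate(entity):
            import_entity(entity)
            imported += 1
        newest = max(newest, int(entity["timestamp"]))
    # Step 4: store the sync checkpoint.
    checkpoints[partner] = newest
    return imported
```

In this sketch the checkpoint also advances past entities that fail validation, so they are not fetched again on the next cycle; a real node might instead quarantine them for manual review.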
173 +
174 +=== Subscription-Based Synchronization ===
175 +
176 +**Mechanism**: WebSub-like protocol
177 +
178 +Nodes subscribe to:
179 +* Specific claim clusters
180 +* Specific domains (medical, energy, etc.)
181 +* Specific scenario types
182 +* Verdict updates
183 +
184 +Publisher pushes updates only to subscribers.
185 +
186 +**Use case**: Selective federation, domain specialization
187 +
188 +=== Large Asset Transfer ===
189 +
190 +For files larger than 10 MB:
191 +* S3-compatible object storage
192 +* IPFS (content-addressed)
193 +* Direct peer-to-peer transfer
194 +* Chunked HTTP transfer with resume support
195 +
144 144  ----
145 145  
146 -= Cross-Node AI Knowledge Exchange =
198 +== Federation Sync Workflow ==
147 147  
148 -Nodes may exchange:
200 +Complete synchronization sequence for creating and sharing new content:
149 149  
150 -* Embeddings for clustering
151 -* Canonical claim forms
152 -* Scenario templates
153 -* Reliability hints
154 -* Contradiction alerts
155 -* Lightweight model insights (NOT weights)
202 +=== Step 1: Local Node Creates New Versions ===
156 156  
157 -AKEL **never**:
204 +User or AKEL creates:
205 +* New claim version
206 +* New scenario version
207 +* New evidence version
208 +* New verdict version
158 158  
159 -* Shares model weights
160 -* Automatically replaces local reviewer decisions
161 -* Accepts untrusted automated content
210 +All changes tracked with:
211 +* VersionID
212 +* ParentVersionID
213 +* AuthorType
214 +* Timestamp
215 +* JustificationText
162 162  
217 +=== Step 2: Federation Layer Builds Signed Bundle ===
218 +
219 +Federation layer packages:
220 +* Entity data (claim, scenario, evidence metadata, verdict)
221 +* Version lineage (ParentVersionID chain)
222 +* Cryptographic signatures
223 +* Node provenance information
224 +* Trust metadata
225 +
226 +Bundle format:
227 +* JSON-LD for structured data
228 +* Content-addressed hashes
229 +* Digital signatures for integrity
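A rough sketch of bundle packaging follows. It makes several simplifications that should be read as assumptions: plain JSON stands in for JSON-LD, and an HMAC-SHA256 over a shared key stands in for the digital signature (a real deployment would more likely use an asymmetric signature scheme). The function names and bundle field names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(payload: dict) -> bytes:
    """Serialize deterministically so hashes are stable across nodes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_bundle(entity: dict, node: str, key: bytes) -> dict:
    """Package an entity with a content hash and an integrity tag."""
    body = {"entity": entity, "node": node}
    raw = canonical_bytes(body)
    return {
        **body,
        "content_hash": hashlib.sha256(raw).hexdigest(),
        # HMAC is a stand-in here, not a true digital signature.
        "signature": hmac.new(key, raw, hashlib.sha256).hexdigest(),
    }


def verify_bundle(bundle: dict, key: bytes) -> bool:
    """Recompute the hash and tag and compare them in constant time."""
    raw = canonical_bytes({"entity": bundle["entity"], "node": bundle["node"]})
    expected_hash = hashlib.sha256(raw).hexdigest()
    expected_sig = hmac.new(key, raw, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(bundle["content_hash"], expected_hash)
            and hmac.compare_digest(bundle["signature"], expected_sig))
```

Hashing a canonical serialization rather than the raw message means two nodes that hold the same entity data derive the same content hash regardless of key order.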
230 +
231 +=== Step 3: Bundle Includes Required Data ===
232 +
233 +Each bundle contains:
234 +* **Claims**: Full claim text, classification, domain
235 +* **Scenarios**: Definitions, assumptions, boundaries
236 +* **Evidence metadata**: Source URLs, hashes, reliability scores (not always full files)
237 +* **Verdicts**: Likelihood ranges, uncertainty, reasoning chains
238 +* **Lineage**: Version history, parent relationships
239 +* **Signatures**: Cryptographic proof of origin
240 +
241 +=== Step 4: Bundle Pushed to Trusted Neighbor Nodes ===
242 +
243 +Based on trust table:
244 +* Push to **trusted nodes** immediately
245 +* Queue for **neutral nodes** (batched)
246 +* Skip **untrusted nodes**
247 +
248 +Push methods:
249 +* Webhook notification
250 +* Direct API call
251 +* Pub/Sub message queue
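The routing rule in this step can be expressed as a small partition over the trust table. This is a sketch under the assumption that trust levels are stored as the strings `"trusted"`, `"neutral"`, and `"untrusted"`; `plan_delivery` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def plan_delivery(trust_table: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split partner nodes into immediate-push and batched-queue targets."""
    immediate: List[str] = []
    batched: List[str] = []
    for node, level in sorted(trust_table.items()):
        if level == "trusted":
            immediate.append(node)   # push right away
        elif level == "neutral":
            batched.append(node)     # queue for the next batch
        # Untrusted nodes (and any unrecognized level) are skipped.
    return immediate, batched
```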
252 +
253 +=== Step 5: Remote Nodes Validate Lineage and Signatures ===
254 +
255 +Receiving node:
256 +1. Verifies cryptographic signatures
257 +2. Validates version lineage (ParentVersionID chain)
258 +3. Checks for conflicts with local data
259 +4. Validates data structure and required fields
260 +5. Applies local trust policies
261 +
262 +Validation failures → reject or quarantine bundle
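Lineage validation (item 2 in the list above) might look like the following sketch. It assumes each version is a dictionary with `VersionID` and `ParentVersionID` keys, that a root version has no parent, and that a chain is valid once it reaches either a root or a version the receiving node already holds; `validate_lineage` is a hypothetical name.

```python
from typing import Dict, List, Optional, Set


def validate_lineage(versions: List[Dict[str, Optional[str]]], known_ids: Set[str]) -> bool:
    """Check that every ParentVersionID chain is complete and free of cycles."""
    by_id = {v["VersionID"]: v for v in versions}
    if len(by_id) != len(versions):
        return False  # duplicate VersionID inside the bundle
    for version in versions:
        seen = set()
        current = version
        while True:
            vid = current["VersionID"]
            if vid in seen:
                return False  # cycle in the parent chain
            seen.add(vid)
            parent = current.get("ParentVersionID")
            if parent is None or parent in known_ids:
                break  # reached a root or a locally known version
            if parent not in by_id:
                return False  # parent is missing from bundle and local store
            current = by_id[parent]
    return True
```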
263 +
264 +=== Step 6: Accept or Branch Versions ===
265 +
266 +**Accept** (if validation passes):
267 +* Import new versions
268 +* Maintain provenance metadata
269 +* Link to local related entities
270 +* Update local indices
271 +
272 +**Branch** (if conflict detected):
273 +* Create parallel version tree
274 +* Mark as "external branch"
275 +* Allow local reviewers to merge or reject
276 +* Preserve both version histories
277 +
278 +**Reject** (if validation fails):
279 +* Log rejection reason
280 +* Notify source node (optional)
281 +* Quarantine for manual review (optional)
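The three outcomes can be summarized as a small decision function. The conflict rule used here is an assumption made for illustration: an incoming version conflicts when its parent is not the current local head of the same entity. The specification does not define conflict detection at this level, and `decide_import` is a hypothetical name.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    ACCEPT = "accept"
    BRANCH = "branch"
    REJECT = "reject"


def decide_import(valid: bool, incoming_parent: Optional[str], local_head: Optional[str]) -> Outcome:
    """Choose accept, branch, or reject for a validated or failed bundle."""
    if not valid:
        return Outcome.REJECT
    # No local history yet, or the incoming version extends the local head.
    if local_head is None or incoming_parent == local_head:
        return Outcome.ACCEPT
    # Histories diverged: keep both as parallel version trees.
    return Outcome.BRANCH
```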
282 +
283 +=== Step 7: Local Re-evaluation Runs if Required ===
284 +
285 +After import, local node checks:
286 +* Does new evidence affect existing verdicts?
287 +* Do new scenarios require re-assessment?
288 +* Are there contradictions with local content?
289 +
290 +If yes:
291 +* Trigger AKEL re-evaluation
292 +* Queue for reviewer attention
293 +* Update affected verdicts
294 +* Notify users following related content
295 +
163 163  ----
164 164  
165 -= Synchronization Protocol =
298 +== Cross-Node AI Knowledge Exchange ==
166 166  
167 -Nodes periodically exchange version bundles:
300 +Each node runs its own AKEL instance and may exchange AI-derived knowledge:
168 168  
169 -* Claims
170 -* Scenarios
171 -* Evidence metadata + hashes
172 -* Optional verdicts
173 -* Templates
174 -* Embeddings (optional)
175 -* AKEL distilled knowledge summaries (optional)
302 +=== What Can Be Shared ===
176 176  
177 -Sync methods:
304 +**Vector embeddings**:
305 +* For cross-node claim clustering
306 +* For semantic search alignment
307 +* Never includes training data
178 178  
179 -* Push (webhook)
180 -* Pull (cron)
181 -* Subscription (WebSub-like)
309 +**Canonical claim forms**:
310 +* Normalized claim text
311 +* Standard phrasing templates
312 +* Domain-specific formulations
182 182  
183 -Large assets may be transferred via:
314 +**Scenario templates**:
315 +* Reusable scenario structures
316 +* Common assumption patterns
317 +* Evaluation method templates
184 184  
185 -* S3-compatible file transfer
186 -* IPFS
187 -* Peer-to-peer
319 +**Contradiction alerts**:
320 +* Detected conflicts between claims
321 +* Evidence conflicts across nodes
322 +* Scenario incompatibilities
188 188  
324 +**Metadata and insights**:
325 +* Aggregate quality metrics
326 +* Reliability signal extraction
327 +* Bubble detection patterns
328 +
329 +=== What Can NEVER Be Shared ===
330 +
331 +**Model weights**: No sharing of trained model parameters
332 +
333 +**Training data**: No sharing of full training datasets
334 +
335 +**Local governance overrides**: Local decisions that override AKEL suggestions stay on the node
336 +
337 +**User behavior data**: No cross-node tracking
338 +
339 +**Internal review discussions**: Private content stays private
340 +
341 +=== Benefits of AI Knowledge Exchange ===
342 +
343 +* Reduced duplication across nodes
344 +* Improved claim clustering accuracy
345 +* Faster contradiction detection
346 +* Shared scenario libraries
347 +* Cross-node quality improvements
348 +
349 +=== Local Control Maintained ===
350 +
351 +* Nodes accept or reject shared AI knowledge
352 +* Human reviewers can override any AKEL suggestion
353 +* Local governance always has final authority
354 +* No external AI control over local content
355 +* Privacy-preserving knowledge exchange
356 +
189 189  ----
190 190  
191 -= Scaling to Thousands of Users =
359 +== Decentralized Processing ==
192 192  
193 -Nodes scale independently:
361 +Each node independently performs:
362 +* AKEL processing
363 +* Scenario drafting and validation
364 +* Evidence review
365 +* Verdict calculation
366 +* Truth landscape summarization
194 194  
195 -* horizontally scaled API servers
196 -* background worker pools
197 -* GPU queues for AKEL
198 -* caching (Redis)
199 -* sharded databases
368 +Nodes can specialize:
369 +* Health-focused node with medical experts
370 +* Energy-focused node with domain knowledge
371 +* Small node delegating scenario libraries to partners
372 +* Regional node with language/culture specialization
200 200  
201 -The network scales horizontally by adding more nodes.
374 +Optional data sharing includes:
375 +* Embeddings for clustering
376 +* Claim clusters for alignment
377 +* Scenario templates for efficiency
378 +* Verdict comparison metadata
202 202  
203 -Communities can form:
380 +----
204 204  
205 -* Domain-specific nodes
206 -* Region/language nodes
207 -* NGO/academic nodes
208 -* Private organization nodes
382 +== Scaling to Thousands of Users ==
209 209  
384 +Nodes scale independently through:
385 +* Horizontally scalable API servers
386 +* Worker pools for AKEL tasks
387 +* Hybrid storage (local + S3/IPFS)
388 +* Redis caching for performance
389 +* Sharded or partitioned databases
390 +
391 +Federation allows effectively unlimited horizontal scaling by adding new nodes.
392 +
393 +Communities may form:
394 +* Domain-specific nodes (epidemiology, energy, climate)
395 +* Language or region-based nodes
396 +* NGO or institutional nodes
397 +* Private organizational nodes
398 +* Academic research nodes
399 +
210 210  Nodes cooperate through:
401 +* Scenario library sharing
402 +* Shared or overlapping claim clusters
403 +* Expert delegation between nodes
404 +* Distributed AKEL task support
405 +* Cross-node quality audits
211 211  
212 -* Shared scenario libraries
213 -* Overlapping claim clusters
214 -* Expert delegation
215 -* Distributed AKEL tasks
407 +----
216 216  
409 +== Federation and Release 1.0 ==
410 +
411 +**POC**: Single node, optional federation experiments
412 +
413 +**Beta 0**: 2-3 nodes, basic federation protocol
414 +
415 +**Release 1.0**: Full federation support with:
416 +* Robust synchronization protocol
417 +* Trust model implementation
418 +* Cross-node AI knowledge exchange
419 +* Federated search and discovery
420 +* Distributed audit collaboration
421 +* Inter-node expert consultation
422 +
217 217  ----
218 218  
219 -= Diagram References =
425 +== Related Pages ==
220 220  
221 -{{include reference="FactHarbor.Specification.Diagrams.Federation Architecture.WebHome"/}}
427 +* [[AKEL (AI Knowledge Extraction Layer)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
428 +* [[Data Model>>FactHarbor.Specification.Data Model.WebHome]]
429 +* [[Architecture>>FactHarbor.Specification.Architecture.WebHome]]
430 +* [[Workflows>>FactHarbor.Specification.Workflows.WebHome]]
431 +