Wiki source code of Federation & Decentralization

Last modified by Robert Schaub on 2025/12/24 20:35

version	line-number	content
1.1	1	= Federation & Decentralization =
	2
	3	FactHarbor is designed to operate as a federated network of nodes rather than a single central server.
	4
	5	Decentralization provides:
1.2	6
1.1	7	* Resilience against censorship or political pressure
	8	* Autonomy for local governance and moderation
	9	* Scalability across many independent communities
	10	* Trust without centralized control
	11	* Domain specialization (health-focused nodes, energy-focused nodes, etc.)
	12
	13	FactHarbor draws inspiration from the Fediverse but applies stricter structure, versioning, and integrity guarantees to the data that nodes exchange.
	14
	15
	16	== 1. Federated FactHarbor Nodes ==
	17
	18	Each FactHarbor instance ("node") maintains:
1.2	19
1.1	20	* Its own database
	21	* Its own AKEL instance
	22	* Its own reviewers, experts, and contributors
	23	* Its own governance rules
	24
	25	Nodes exchange structured information:
1.2	26
1.1	27	* Claims
	28	* Scenarios
	29	* Evidence metadata (not necessarily full files)
	30	* Verdicts (optional)
	31	* Hashes and signatures for integrity
	32
	33	Nodes choose which external nodes they trust.
	34
	35
	36	== 2. Global Identifiers ==
	37
	38	Every entity receives a globally unique, linkable identifier.
	39
	40	Format:
	41	`factharbor://node_url/type/local_id`
	42
	43	Example:
	44	`factharbor://factharbor.energy/claim/CLM-55812`
	45
	46	Supported types:
1.2	47
1.1	48	* `claim`
	49	* `scenario`
	50	* `evidence`
	51	* `verdict`
	52	* `user` (optional)
	53	* `cluster`
	54
	55	Properties:
1.2	56
1.1	57	* Globally consistent
	58	* Human-readable
	59	* Hash-derived
	60	* Independent of database internals
	61	* URL-resolvable (future enhancement)
	62
	63	This allows cross-node references and prevents identifier collisions in federated environments.
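
The identifier format above can be parsed and rebuilt mechanically. The following is a minimal sketch, not a reference implementation: the names `GlobalId` and `parse_global_id` are hypothetical, and it assumes that `node_url` is a bare host name without slashes.

```python
from dataclasses import dataclass

# Entity types listed under "Supported types" above.
SUPPORTED_TYPES = {"claim", "scenario", "evidence", "verdict", "user", "cluster"}
SCHEME = "factharbor://"


@dataclass(frozen=True)
class GlobalId:
    node_url: str
    entity_type: str
    local_id: str

    def __str__(self) -> str:
        return f"{SCHEME}{self.node_url}/{self.entity_type}/{self.local_id}"


def parse_global_id(text: str) -> GlobalId:
    """Split factharbor://node_url/type/local_id into its three parts."""
    if not text.startswith(SCHEME):
        raise ValueError(f"not a factharbor identifier: {text!r}")
    parts = text[len(SCHEME):].split("/")
    # Assumption: node_url contains no slashes, so exactly three parts remain.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected node_url/type/local_id: {text!r}")
    node_url, entity_type, local_id = parts
    if entity_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    return GlobalId(node_url, entity_type, local_id)
```

Parsing the example identifier `factharbor://factharbor.energy/claim/CLM-55812` yields the node `factharbor.energy`, the type `claim`, and the local ID `CLM-55812`; converting the result back to a string reproduces the original identifier.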
	64
	65
	66	== 3. Trust Model ==
	67
	68	Each node maintains a trust table that defines its relationships with other nodes.
	69
	70	=== 3.1 Trust Levels ===
	71
	72	Trusted Nodes:
1.2	73
1.1	74	* Claims auto-imported
	75	* Scenarios accepted without re-review
	76	* Evidence considered valid
	77	* Verdicts displayed to users
	78	* High synchronization priority
	79
	80	Neutral Nodes:
1.2	81
1.1	82	* Claims imported but flagged for review
	83	* Scenarios require local validation
	84	* Evidence requires re-assessment
	85	* Verdicts shown with "external node" disclaimer
	86	* Normal synchronization priority
	87
	88	Untrusted Nodes:
1.2	89
1.1	90	* Claims quarantined, manual import only
	91	* Scenarios rejected by default
	92	* Evidence not accepted
	93	* Verdicts not displayed
	94	* No automatic synchronization
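
One way to picture the three trust levels is as a lookup table from level to handling rules. The sketch below is illustrative only: the class names, the string values, and the choice to treat nodes missing from the trust table as untrusted are assumptions, not part of the specification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


@dataclass(frozen=True)
class TrustPolicy:
    claims: str      # how incoming claims are handled
    scenarios: str   # how incoming scenarios are handled
    evidence: str    # how incoming evidence is handled
    verdicts: str    # whether external verdicts are displayed
    sync: str        # synchronization priority


# One row per trust level, mirroring the three lists above.
POLICIES: Dict[TrustLevel, TrustPolicy] = {
    TrustLevel.TRUSTED: TrustPolicy(
        "auto_import", "accept", "valid", "display", "high"),
    TrustLevel.NEUTRAL: TrustPolicy(
        "import_flagged", "validate_locally", "reassess",
        "display_with_disclaimer", "normal"),
    TrustLevel.UNTRUSTED: TrustPolicy(
        "quarantine", "reject", "reject", "hide", "none"),
}


def policy_for(trust_table: Dict[str, TrustLevel], node: str) -> TrustPolicy:
    # Assumption: a node that is missing from the table is treated as untrusted.
    return POLICIES[trust_table.get(node, TrustLevel.UNTRUSTED)]
```

Keeping the rules in a single table makes it straightforward for a governance team to review or change what each level means without touching the import logic.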
	95
	96	=== 3.2 What Trust Affects ===
	97
	98	* Auto-import: Whether claims/scenarios are automatically added
	99	* Re-review requirements: Whether local reviewers must validate
	100	* Verdict display: Whether external verdicts are shown to users
	101	* Synchronization frequency: How often data is exchanged
	102	* Reputation signals: How external reputation is interpreted
	103
	104	=== 3.3 Local Trust Authority ===
	105
	106	Each node's governance team decides:
1.2	107
1.1	108	* Which nodes to trust
	109	* Trust level criteria
	110	* Trust escalation/degradation rules
	111	* Dispute resolution with partner nodes
	112
	113	Trust is local and autonomous; no global trust registry exists.
	114
	115
	116	== 4. Data Sharing Model ==
	117
	118	=== 4.1 What Nodes Share ===
	119
	120	Always shared (if federation is enabled):
1.2	121
1.1	122	* Claims and claim clusters
	123	* Scenario structures
	124	* Evidence metadata and content hashes
	125	* Integrity signatures
	126
	127	Optionally shared:
1.2	128
1.1	129	* Full evidence files (large documents)
	130	* Verdicts (nodes may choose to keep verdicts local)
	131	* Vector embeddings
	132	* Scenario templates
	133	* AKEL distilled knowledge
	134
	135	Never shared:
1.2	136
1.1	137	* Internal user lists
	138	* Reviewer comments and internal discussions
	139	* Governance decisions and meeting notes
	140	* Access control data
	141	* Private or sensitive content marked as local-only
	142
	143	=== 4.2 Large Evidence Files ===
	144
	145	Evidence files are:
1.2	146
1.1	147	* Stored locally by default
	148	* Referenced via global content hash
	149	* Optionally served through IPFS
	150	* Accessible via direct peer-to-peer transfer
	151	* Optionally stored in S3-compatible object storage
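
Referencing a file by content hash means that any node can check a received copy against the hash in the evidence metadata. The sketch below assumes SHA-256 and a `sha256:` prefix; this page does not name a hash algorithm, and the function name `content_hash` is hypothetical.

```python
import hashlib
from typing import BinaryIO


def content_hash(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a file-like object in chunks so large files are never fully in memory."""
    digest = hashlib.sha256()  # assumption: SHA-256 is the agreed algorithm
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return "sha256:" + digest.hexdigest()
```

A receiving node would compute the hash of the transferred file, for example with `content_hash(open(path, "rb"))`, and compare it with the value in the bundle before accepting the file.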
	152
	153	== 5. Synchronization Protocol ==
	154
	155	Nodes exchange data using multiple synchronization methods:
	156
	157	=== 5.1 Push-Based Synchronization ===
	158
	159	Mechanism: Webhooks
	160
	161	When local content changes:
1.2	162
1.1	163	1. Node builds signed bundle
	164	2. Sends webhook notification to subscribed nodes
	165	3. Remote nodes fetch bundle
	166	4. Remote nodes validate and import
	167
	168	Use case: Real-time updates for trusted partners
	169
	170	=== 5.2 Pull-Based Synchronization ===
	171
	172	Mechanism: Scheduled polling
	173
	174	Nodes periodically:
1.2	175
1.1	176	1. Query partner nodes for updates
	177	2. Fetch changed entities since last sync
	178	3. Validate and import
	179	4. Store sync checkpoint
	180
	181	Use case: Regular batch updates, lower-trust nodes
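
The four polling steps can be sketched as a small loop body. Everything here is hypothetical scaffolding: `PullSync`, the integer checkpoint, and the `fetch_changes` callable stand in for whatever transport and cursor format a node actually uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A fetcher returns the entities changed after `since` plus the new checkpoint.
Fetcher = Callable[[str, int], Tuple[List[dict], int]]


@dataclass
class PullSync:
    fetch_changes: Fetcher
    validate: Callable[[dict], bool]
    checkpoints: Dict[str, int] = field(default_factory=dict)
    imported: List[dict] = field(default_factory=list)

    def sync_once(self, partner: str) -> int:
        """Run one poll against a partner and return how many entities were imported."""
        since = self.checkpoints.get(partner, 0)                       # last sync point
        entities, new_checkpoint = self.fetch_changes(partner, since)  # steps 1 and 2
        accepted = [e for e in entities if self.validate(e)]           # step 3: validate
        self.imported.extend(accepted)                                 # step 3: import
        self.checkpoints[partner] = new_checkpoint                     # step 4
        return len(accepted)
```

Storing the checkpoint per partner means a second call to `sync_once` only asks for entities that changed after the previous poll.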
	182
	183	=== 5.3 Subscription-Based Synchronization ===
	184
	185	Mechanism: WebSub-like protocol
	186
	187	Nodes subscribe to:
1.2	188
1.1	189	* Specific claim clusters
	190	* Specific domains (medical, energy, etc.)
	191	* Specific scenario types
	192	* Verdict updates
	193
	194	Publisher pushes updates only to subscribers.
	195
	196	Use case: Selective federation, domain specialization
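
Selective delivery amounts to matching each update against the stored subscriptions. The sketch below assumes a simple model in which a subscription names one topic (`cluster`, `domain`, `scenario_type`, or `verdict`) and an update is a dictionary carrying those keys plus a `kind`; these names are illustrative and not defined by this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subscription:
    subscriber: str
    topic: str       # "cluster", "domain", "scenario_type", or "verdict"
    value: str = ""  # ignored for "verdict" subscriptions


def subscribers_for(update: dict, subscriptions: List[Subscription]) -> List[str]:
    """Return each subscriber that should receive this update, without duplicates."""
    targets: List[str] = []
    for sub in subscriptions:
        if sub.topic == "verdict":
            hit = update.get("kind") == "verdict"
        else:
            hit = update.get(sub.topic) == sub.value
        if hit and sub.subscriber not in targets:
            targets.append(sub.subscriber)
    return targets
```

With this model, a node subscribed to the `energy` domain receives every energy update, while a node subscribed only to verdict updates receives nothing when a claim changes.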
	197
	198	=== 5.4 Large Asset Transfer ===
	199
	200	For files larger than 10 MB:
1.2	201
1.1	202	* S3-compatible object storage
	203	* IPFS (content-addressed)
	204	* Direct peer-to-peer transfer
	205	* Chunked HTTP transfer with resume support
	206
	207	== 6. Federation Sync Workflow ==
	208
	209	Complete synchronization sequence for creating and sharing new content:
	210
	211	=== 6.1 Step 1: Local Node Creates New Versions ===
	212
	213	User or AKEL creates:
1.2	214
1.1	215	* New claim version
	216	* New scenario version
	217	* New evidence version
	218	* New verdict version
	219
	220	All changes tracked with:
1.2	221
1.1	222	* VersionID
	223	* ParentVersionID
	224	* AuthorType
	225	* Timestamp
	226	* JustificationText
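
The five tracking fields map naturally onto an immutable record. This is a sketch under assumptions: the field types, the use of random UUIDs for `VersionID`, and the two `AuthorType` values (`user` and `akel`, taken from "User or AKEL creates" above) are illustrative choices, not the defined data model.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

AUTHOR_TYPES = {"user", "akel"}  # assumption: the two author kinds named above


@dataclass(frozen=True)
class VersionRecord:
    version_id: str
    parent_version_id: Optional[str]  # None for the first version of an entity
    author_type: str
    timestamp: datetime
    justification_text: str


def new_version(parent: Optional[VersionRecord], author_type: str,
                justification: str) -> VersionRecord:
    """Create a version that points back at its parent, forming the lineage chain."""
    if author_type not in AUTHOR_TYPES:
        raise ValueError(f"unknown author type: {author_type!r}")
    return VersionRecord(
        version_id=uuid.uuid4().hex,
        parent_version_id=parent.version_id if parent else None,
        author_type=author_type,
        timestamp=datetime.now(timezone.utc),
        justification_text=justification,
    )
```

Because records are frozen, an edit never mutates an existing version; it always produces a new record whose `parent_version_id` references the previous one.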
	227
	228	=== 6.2 Step 2: Federation Layer Builds Signed Bundle ===
	229
	230	Federation layer packages:
1.2	231
1.1	232	* Entity data (claim, scenario, evidence metadata, verdict)
	233	* Version lineage (ParentVersionID chain)
	234	* Cryptographic signatures
	235	* Node provenance information
	236	* Trust metadata
	237
	238	Bundle format:
1.2	239
1.1	240	* JSON-LD for structured data
	241	* Content-addressed hashes
	242	* Digital signatures for integrity
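
To illustrate how a content hash and a signature travel with a bundle, the sketch below serializes the payload canonically, hashes it, and authenticates it. It simplifies deliberately: plain JSON stands in for JSON-LD, and HMAC-SHA256 with a shared key stands in for a real digital signature, which would use an asymmetric key pair. All function and field names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(data: dict) -> bytes:
    # Sorted keys and fixed separators give the same bytes for the same data.
    return json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_bundle(payload: dict, node: str, key: bytes) -> dict:
    body = canonical_bytes({"node": node, "payload": payload})
    return {
        "node": node,
        "payload": payload,
        "content_hash": hashlib.sha256(body).hexdigest(),
        # Stand-in only: HMAC is symmetric and is not a digital signature.
        "signature": hmac.new(key, body, hashlib.sha256).hexdigest(),
    }


def verify_bundle(bundle: dict, key: bytes) -> bool:
    body = canonical_bytes({"node": bundle["node"], "payload": bundle["payload"]})
    expected_hash = hashlib.sha256(body).hexdigest()
    expected_sig = hmac.new(key, body, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(bundle["content_hash"], expected_hash)
            and hmac.compare_digest(bundle["signature"], expected_sig))
```

Any change to the payload or to the claimed origin node changes the canonical bytes, so both the hash check and the signature check fail for a tampered bundle.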
	243
	244	=== 6.3 Step 3: Bundle Includes Required Data ===
	245
	246	Each bundle contains:
1.2	247
1.1	248	* Claims: Full claim text, classification, domain
	249	* Scenarios: Definitions, assumptions, boundaries
	250	* Evidence metadata: Source URLs, hashes, reliability scores (not always full files)
	251	* Verdicts: Likelihood ranges, uncertainty, reasoning chains
	252	* Lineage: Version history, parent relationships
	253	* Signatures: Cryptographic proof of origin
	254
	255	=== 6.4 Step 4: Bundle Pushed to Trusted Neighbor Nodes ===
	256
	257	Based on trust table:
1.2	258
1.1	259	* Push to trusted nodes immediately
	260	* Queue for neutral nodes (batched)
	261	* Skip untrusted nodes
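
The routing rule above can be sketched as a single pass over the trust table. The function name, the string trust levels, and the use of an in-memory queue for batching are assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, Tuple


def route_bundle(
    bundle_id: str,
    trust_table: Dict[str, str],
    push_now: Callable[[str, str], None],
    batch_queue: Deque[Tuple[str, str]],
) -> None:
    """Push to trusted nodes immediately, queue for neutral nodes, skip the rest."""
    for node, level in trust_table.items():
        if level == "trusted":
            push_now(node, bundle_id)
        elif level == "neutral":
            batch_queue.append((node, bundle_id))
        # Untrusted nodes (and any unrecognized level) receive nothing.
```

A separate scheduled job would later drain `batch_queue` and deliver the queued bundles to neutral nodes in batches.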
	262
	263	Push methods:
1.2	264
1.1	265	* Webhook notification
	266	* Direct API call
	267	* Pub/Sub message queue
	268
	269	=== 6.5 Step 5: Remote Nodes Validate Lineage and Signatures ===
	270
	271	Receiving node:
1.2	272
1.1	273	1. Verifies cryptographic signatures
	274	2. Validates version lineage (ParentVersionID chain)
	275	3. Checks for conflicts with local data
	276	4. Validates data structure and required fields
	277	5. Applies local trust policies
	278
	279	If validation fails, the bundle is rejected or quarantined.
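
Steps 2 and 4 of this validation (lineage and required fields) can be sketched as follows. The sketch assumes that versions arrive oldest-first and that each is a dictionary with the field names shown; signature checking, conflict detection, and trust policies are left out.

```python
from typing import List, Set

REQUIRED_FIELDS = ("version_id", "parent_version_id", "author_type", "timestamp")


def validate_lineage(versions: List[dict], known_ids: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the lineage is acceptable.

    Assumption: versions are ordered oldest-first, so every parent must be
    either already known locally or appear earlier in the same bundle.
    """
    problems: List[str] = []
    seen = set(known_ids)
    for version in versions:
        missing = [f for f in REQUIRED_FIELDS if f not in version]
        if missing:
            problems.append(f"missing fields: {missing}")
            continue
        vid, parent = version["version_id"], version["parent_version_id"]
        if parent is not None and parent not in seen:
            problems.append(f"{vid}: unknown parent {parent}")
        seen.add(vid)
    return problems
```

Requiring each parent to be seen before its child also rules out cycles in the ParentVersionID chain, because a version can never reference one that comes after it.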
	280
	281	=== 6.6 Step 6: Accept or Branch Versions ===
	282
	283	Accept (if validation passes):
1.2	284
1.1	285	* Import new versions
	286	* Maintain provenance metadata
	287	* Link to local related entities
	288	* Update local indices
	289
	290	Branch (if conflict detected):
1.2	291
1.1	292	* Create parallel version tree
	293	* Mark as "external branch"
	294	* Allow local reviewers to merge or reject
	295	* Preserve both version histories
	296
	297	Reject (if validation fails):
1.2	298
1.1	299	* Log rejection reason
	300	* Notify source node (optional)
	301	* Quarantine for manual review (optional)
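
The three outcomes can be sketched as one decision function. What counts as a conflict is not defined on this page; the sketch assumes that a conflict exists when the incoming version's parent is not the entity's current local head version.

```python
from enum import Enum
from typing import Optional


class ImportDecision(Enum):
    ACCEPT = "accept"
    BRANCH = "branch"
    REJECT = "reject"


def decide_import(validation_ok: bool, incoming_parent: Optional[str],
                  local_head: Optional[str]) -> ImportDecision:
    if not validation_ok:
        return ImportDecision.REJECT
    # Assumption: the incoming version extends local history only when its
    # parent is exactly the current local head (both None for a new entity).
    if incoming_parent == local_head:
        return ImportDecision.ACCEPT
    return ImportDecision.BRANCH
```

A `BRANCH` result leaves the local head untouched and hands the parallel version tree to local reviewers, who merge or reject it.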
	302
	303	=== 6.7 Step 7: Local Re-evaluation Runs if Required ===
	304
	305	After import, local node checks:
1.2	306
1.1	307	* Does new evidence affect existing verdicts?
	308	* Do new scenarios require re-assessment?
	309	* Are there contradictions with local content?
	310
	311	If yes:
1.2	312
1.1	313	* Trigger AKEL re-evaluation
	314	* Queue for reviewer attention
	315	* Update affected verdicts
	316	* Notify users following related content
	317
	318	== 7. Cross-Node AI Knowledge Exchange ==
	319
	320	Each node runs its own AKEL instance and may exchange AI-derived knowledge:
	321
	322	=== 7.1 What Can Be Shared ===
	323
	324	Vector embeddings:
1.2	325
1.1	326	* For cross-node claim clustering
	327	* For semantic search alignment
	328	* Never includes training data
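
Cross-node clustering with shared embeddings usually comes down to a similarity measure. The sketch below uses cosine similarity with a hypothetical threshold of 0.9; it assumes both nodes produce embeddings with the same model and dimensionality, which this page does not specify.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def same_cluster(a: Sequence[float], b: Sequence[float],
                 threshold: float = 0.9) -> bool:
    # The threshold is an illustrative value, not a recommended setting.
    return cosine_similarity(a, b) >= threshold
```

Two claims whose embeddings point in nearly the same direction are treated as candidates for the same cluster; the final grouping would still be subject to local review.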
	329
	330	Canonical claim forms:
1.2	331
1.1	332	* Normalized claim text
	333	* Standard phrasing templates
	334	* Domain-specific formulations
	335
	336	Scenario templates:
1.2	337
1.1	338	* Reusable scenario structures
	339	* Common assumption patterns
	340	* Evaluation method templates
	341
	342	Contradiction alerts:
1.2	343
1.1	344	* Detected conflicts between claims
	345	* Evidence conflicts across nodes
	346	* Scenario incompatibilities
	347
	348	Metadata and insights:
1.2	349
1.1	350	* Aggregate quality metrics
	351	* Reliability signal extraction
	352	* Bubble detection patterns
	353
	354	=== 7.2 What Can NEVER Be Shared ===
	355
	356	Model weights: No sharing of trained model parameters
	357
	358	Training data: No sharing of full training datasets
	359
	360	Local governance overrides: AKEL suggestions can be overridden locally, and those overrides stay on the local node
	361
	362	User behavior data: No cross-node tracking
	363
	364	Internal review discussions: Private content stays private
	365
	366	=== 7.3 Benefits of AI Knowledge Exchange ===
	367
	368	* Reduced duplication across nodes
	369	* Improved claim clustering accuracy
	370	* Faster contradiction detection
	371	* Shared scenario libraries
	372	* Cross-node quality improvements
	373
	374	=== 7.4 Local Control Maintained ===
	375
	376	* Nodes accept or reject shared AI knowledge
	377	* Human reviewers can override any AKEL suggestion
	378	* Local governance always has final authority
	379	* No external AI control over local content
	380	* Privacy-preserving knowledge exchange
	381
	382	== 8. Decentralized Processing ==
	383
	384	Each node independently performs:
1.2	385
1.1	386	* AKEL processing
	387	* Scenario drafting and validation
	388	* Evidence review
	389	* Verdict calculation
	390	* Truth landscape summarization
	391
	392	Nodes can specialize:
1.2	393
1.1	394	* Health-focused node with medical experts
	395	* Energy-focused node with domain knowledge
	396	* Small node delegating scenario libraries to partners
	397	* Regional node with language/culture specialization
	398
	399	Optional data sharing includes:
1.2	400
1.1	401	* Embeddings for clustering
	402	* Claim clusters for alignment
	403	* Scenario templates for efficiency
	404	* Verdict comparison metadata
	405
	406	== 9. Scaling to Thousands of Users ==
	407
	408	Nodes scale independently through:
1.2	409
1.1	410	* Horizontally scalable API servers
	411	* Worker pools for AKEL tasks
	412	* Hybrid storage (local + S3/IPFS)
	413	* Redis caching for performance
	414	* Sharded or partitioned databases
	415
	416	Beyond scaling individual nodes, the network as a whole scales horizontally: capacity grows by adding new nodes.
	417
	418	Communities may form:
1.2	419
1.1	420	* Domain-specific nodes (epidemiology, energy, climate)
	421	* Language or region-based nodes
	422	* NGO or institutional nodes
	423	* Private organizational nodes
	424	* Academic research nodes
	425
	426	Nodes cooperate through:
1.2	427
1.1	428	* Scenario library sharing
	429	* Shared or overlapping claim clusters
	430	* Expert delegation between nodes
	431	* Distributed AKEL task support
	432	* Cross-node quality audits
	433
	434	== 10. Federation and Release 1.0 ==
	435
	436	POC: Single node, optional federation experiments
	437
	438	Beta 0: 2-3 nodes, basic federation protocol
	439
	440	Release 1.0: Full federation support with:
1.2	441
1.1	442	* Robust synchronization protocol
	443	* Trust model implementation
	444	* Cross-node AI knowledge exchange
	445	* Federated search and discovery
	446	* Distributed audit collaboration
	447	* Inter-node expert consultation
	448
	449	== 11. Related Pages ==
	450
1.12	451	* [[AKEL (AI Knowledge Extraction Layer)>>Archive.FactHarbor V0\.9\.23 Lost Data.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
1.14	452	* [[Data Model>>Archive.FactHarbor V0\.9\.23 Lost Data.Specification.Data Model.WebHome]]
1.13	453	* [[Architecture>>Archive.FactHarbor V0\.9\.23 Lost Data.Specification.Architecture.WebHome]]
1.16	454	* [[Workflows>>Archive.FactHarbor V0\.9\.23 Lost Data.Specification.Workflows.WebHome]]