Federation & Decentralization
FactHarbor is designed to operate as a federated network of nodes rather than a single central server.
Decentralization provides:
- Resilience against censorship or political pressure
- Autonomy for local governance and moderation
- Scalability across many independent communities
- Trust without centralized control
- Domain specialization (health-focused nodes, energy-focused nodes, etc.)
FactHarbor draws inspiration from the Fediverse but uses stronger structure, versioning, and integrity guarantees.
1. Federation Architecture Diagram
The following diagram shows the complete federated architecture with node components and communication layers.
graph LR
    FH1[FactHarbor<br/>Instance 1]
    FH2[FactHarbor<br/>Instance 2]
    FH3[FactHarbor<br/>Instance 3]
    FH1 -.->|V1.0+:<br/>Sync claims| FH2
    FH2 -.->|V1.0+:<br/>Sync claims| FH3
    FH3 -.->|V1.0+:<br/>Sync claims| FH1
    U1[Users] --> FH1
    U2[Users] --> FH2
    U3[Users] --> FH3
    style FH1 fill:#e1f5ff
    style FH2 fill:#e1f5ff
    style FH3 fill:#e1f5ff
Federation Architecture - Future (V1.0+): Independent FactHarbor instances can sync claims for broader reach while maintaining local control.
2. Federated FactHarbor Nodes
Each FactHarbor instance ("node") maintains:
- Its own database
- Its own AKEL instance
- Its own reviewers, experts, and contributors
- Its own governance rules
Nodes exchange structured information:
- Claims
- Scenarios
- Evidence metadata (not necessarily full files)
- Verdicts (optional)
- Hashes and signatures for integrity
Nodes choose which external nodes they trust.
3. Global Identifiers
Every entity receives a globally unique, linkable identifier.
Format:
`factharbor://node_url/type/local_id`
Example:
`factharbor://factharbor.energy/claim/CLM-55812`
Supported types:
- `claim`
- `scenario`
- `evidence`
- `verdict`
- `user` (optional)
- `cluster`
Properties:
- Globally consistent
- Human-readable
- Hash-derived
- Independent of database internals
- URL-resolvable (future enhancement)
This allows cross-node references and prevents identifier collisions in federated environments.
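As an illustration of the identifier format, the following minimal sketch parses and re-serializes a global identifier. The names `GlobalId` and `parse_global_id` are hypothetical, and the sketch assumes that `node_url` is a bare host name with no path segments of its own.

```python
from dataclasses import dataclass

SCHEME = "factharbor://"
SUPPORTED_TYPES = {"claim", "scenario", "evidence", "verdict", "user", "cluster"}


@dataclass(frozen=True)
class GlobalId:
    node: str         # host name of the node that owns the entity (assumed path-free)
    entity_type: str  # one of SUPPORTED_TYPES
    local_id: str     # identifier assigned by the owning node

    def __str__(self) -> str:
        return f"{SCHEME}{self.node}/{self.entity_type}/{self.local_id}"


def parse_global_id(text: str) -> GlobalId:
    """Split a factharbor:// identifier into node, type, and local id."""
    if not text.startswith(SCHEME):
        raise ValueError(f"not a factharbor identifier: {text!r}")
    parts = text[len(SCHEME):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected node/type/local_id: {text!r}")
    node, entity_type, local_id = parts
    if entity_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    return GlobalId(node, entity_type, local_id)


gid = parse_global_id("factharbor://factharbor.energy/claim/CLM-55812")
print(gid.node, gid.entity_type, gid.local_id)
```

Because the owning node is part of the identifier, two nodes can issue the same `local_id` without colliding.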
4. Trust Model
Each node maintains a trust table defining relationships with other nodes:
4.1 Trust Levels
Trusted Nodes:
- Claims auto-imported
- Scenarios accepted without re-review
- Evidence considered valid
- Verdicts displayed to users
- High synchronization priority
Neutral Nodes:
- Claims imported but flagged for review
- Scenarios require local validation
- Evidence requires re-assessment
- Verdicts shown with "external node" disclaimer
- Normal synchronization priority
Untrusted Nodes:
- Claims quarantined, manual import only
- Scenarios rejected by default
- Evidence not accepted
- Verdicts not displayed
- No automatic synchronization
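The three trust levels above can be read as a lookup table from level to handling policy. The sketch below is one possible encoding, not a prescribed schema: the class names, policy strings, and host names are hypothetical, and treating nodes that are missing from the table as untrusted is an assumption this section does not state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


@dataclass(frozen=True)
class TrustPolicy:
    claims: str                   # how incoming claims are handled
    scenarios: str                # how incoming scenarios are handled
    evidence: str                 # how incoming evidence is handled
    verdicts: str                 # whether external verdicts are displayed
    sync_priority: Optional[str]  # None means no automatic synchronization


POLICIES = {
    TrustLevel.TRUSTED: TrustPolicy(
        "auto_import", "accept", "accept", "display", "high"),
    TrustLevel.NEUTRAL: TrustPolicy(
        "import_flagged", "validate_locally", "reassess",
        "display_with_disclaimer", "normal"),
    TrustLevel.UNTRUSTED: TrustPolicy(
        "quarantine", "reject", "reject", "hide", None),
}

# Hypothetical trust table for one node; the host names are placeholders.
TRUST_TABLE = {
    "factharbor.energy": TrustLevel.TRUSTED,
    "health.example.org": TrustLevel.NEUTRAL,
}


def policy_for(node: str) -> TrustPolicy:
    # Assumption: nodes missing from the table are treated as untrusted.
    return POLICIES[TRUST_TABLE.get(node, TrustLevel.UNTRUSTED)]


print(policy_for("health.example.org").verdicts)
```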
4.2 Trust Affects
- Auto-import: Whether claims/scenarios are automatically added
- Re-review requirements: Whether local reviewers must validate
- Verdict display: Whether external verdicts are shown to users
- Synchronization frequency: How often data is exchanged
- Reputation signals: How external reputation is interpreted
4.3 Local Trust Authority
Each node's governance team decides:
- Which nodes to trust
- Trust level criteria
- Trust escalation/degradation rules
- Dispute resolution with partner nodes
Trust is local and autonomous - no global trust registry exists.
5. Data Sharing Model
5.1 What Nodes Share
Always shared (if federation enabled):
- Claims and claim clusters
- Scenario structures
- Evidence metadata and content hashes
- Integrity signatures
Optionally shared:
- Full evidence files (large documents)
- Verdicts (nodes may choose to keep verdicts local)
- Vector embeddings
- Scenario templates
- AKEL distilled knowledge
Never shared:
- Internal user lists
- Reviewer comments and internal discussions
- Governance decisions and meeting notes
- Access control data
- Private or sensitive content marked as local-only
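The sharing rules can be sketched as a default-deny check: a category is exported only if it is on the always-shared list, or on the optional list and explicitly enabled by the node. The category names and the `may_share` function are hypothetical labels for the items listed above.

```python
ALWAYS_SHARED = frozenset({
    "claim", "claim_cluster", "scenario_structure",
    "evidence_metadata", "content_hash", "integrity_signature",
})
OPTIONALLY_SHARED = frozenset({
    "evidence_file", "verdict", "vector_embedding",
    "scenario_template", "akel_distilled_knowledge",
})


def may_share(category: str,
              federation_enabled: bool,
              opted_in: frozenset = frozenset(),
              local_only: bool = False) -> bool:
    """Decide whether an item of the given category may leave the node."""
    if not federation_enabled or local_only:
        return False
    if category in ALWAYS_SHARED:
        return True
    # Anything not listed (user lists, reviewer comments, governance notes,
    # access control data) falls through to False and is never shared.
    return category in OPTIONALLY_SHARED and category in opted_in


print(may_share("verdict", True, opted_in=frozenset({"verdict"})))
```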
5.2 Large Evidence Files
Evidence files are:
- Stored locally by default
- Referenced via global content hash
- Optionally served through IPFS
- Accessible via direct peer-to-peer transfer
- Optionally stored in S3-compatible object storage
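A content hash lets any node verify a file regardless of where it was fetched from. The sketch below assumes SHA-256 and a `sha256:` prefix; this section does not name a hash algorithm or an encoding, so both are illustrative choices.

```python
import hashlib
import io
from typing import BinaryIO


def content_hash(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a file-like object in chunks so large files are never fully in memory."""
    digest = hashlib.sha256()  # assumption: SHA-256 is the content hash
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def matches_reference(stream: BinaryIO, expected: str) -> bool:
    """Check a downloaded file against the hash carried in the evidence metadata."""
    return content_hash(stream) == expected


reference = content_hash(io.BytesIO(b"example evidence document"))
print(matches_reference(io.BytesIO(b"example evidence document"), reference))
```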
6. Synchronization Protocol
Nodes exchange data using multiple synchronization methods:
6.1 Push-Based Synchronization
Mechanism: Webhooks
When local content changes:
1. Node builds signed bundle
2. Sends webhook notification to subscribed nodes
3. Remote nodes fetch bundle
4. Remote nodes validate and import
Use case: Real-time updates for trusted partners
6.2 Pull-Based Synchronization
Mechanism: Scheduled polling
Nodes periodically:
1. Query partner nodes for updates
2. Fetch changed entities since last sync
3. Validate and import
4. Store sync checkpoint
Use case: Regular batch updates, lower trust nodes
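One polling cycle can be sketched as follows. The callables stand in for the partner API, the local validator, and the local store; the `seq` field used as the sync checkpoint is an assumption (a timestamp would work the same way). Note the design choice that the checkpoint advances past entities that fail validation, so they are not fetched again on every cycle.

```python
from typing import Callable, Iterable, Tuple


def pull_once(
    fetch_changes: Callable[[int], Iterable[dict]],
    validate: Callable[[dict], bool],
    import_entity: Callable[[dict], None],
    checkpoint: int,
) -> Tuple[int, int]:
    """Run one polling cycle; return (new_checkpoint, number_imported)."""
    new_checkpoint = checkpoint
    imported = 0
    for entity in fetch_changes(checkpoint):
        if validate(entity):
            import_entity(entity)
            imported += 1
        new_checkpoint = max(new_checkpoint, entity["seq"])
    return new_checkpoint, imported


# In-memory stand-in for a partner node's change feed (hypothetical data).
PARTNER_FEED = [
    {"seq": 1, "id": "CLM-1", "text": "first claim"},
    {"seq": 2, "id": "CLM-2", "text": ""},  # fails the validator below
    {"seq": 3, "id": "CLM-3", "text": "third claim"},
]

local_store = {}
checkpoint, count = pull_once(
    fetch_changes=lambda since: [e for e in PARTNER_FEED if e["seq"] > since],
    validate=lambda e: bool(e["text"]),
    import_entity=lambda e: local_store.update({e["id"]: e}),
    checkpoint=0,
)
print(checkpoint, count)  # prints "3 2"
```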
6.3 Subscription-Based Synchronization
Mechanism: WebSub-like protocol
Nodes subscribe to:
- Specific claim clusters
- Specific domains (medical, energy, etc.)
- Specific scenario types
- Verdict updates
Publisher pushes updates only to subscribers.
Use case: Selective federation, domain specialization
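Selective delivery reduces to matching each update against the topics a subscriber registered. The sketch below shows only that matching step; it does not implement WebSub itself, and the `Subscription` fields and update keys are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Subscription:
    subscriber: str
    clusters: frozenset = frozenset()
    domains: frozenset = frozenset()
    scenario_types: frozenset = frozenset()
    verdict_updates: bool = False


def matches(sub: Subscription, update: dict) -> bool:
    """True if the update falls under any topic the subscriber asked for."""
    return (
        update.get("cluster") in sub.clusters
        or update.get("domain") in sub.domains
        or update.get("scenario_type") in sub.scenario_types
        or (sub.verdict_updates and update.get("kind") == "verdict")
    )


def recipients(subs: Iterable[Subscription], update: dict) -> List[str]:
    """Subscribers that should receive this update, in registration order."""
    return [s.subscriber for s in subs if matches(s, update)]


subs = [
    Subscription("health.example.org", domains=frozenset({"medical"})),
    Subscription("factharbor.energy", domains=frozenset({"energy"}),
                 verdict_updates=True),
]
print(recipients(subs, {"kind": "claim", "domain": "energy"}))
```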
6.4 Large Asset Transfer
For files >10MB:
- S3-compatible object storage
- IPFS (content-addressed)
- Direct peer-to-peer transfer
- Chunked HTTP transfer with resume support
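For the chunked HTTP option, resuming amounts to requesting only the byte ranges not yet received. The sketch below computes the `Range` header values for the remaining chunks, using the standard inclusive `bytes=start-end` form; the function name and the chunk size are illustrative.

```python
from typing import List


def remaining_ranges(total_size: int, bytes_received: int, chunk_size: int) -> List[str]:
    """Range header values still needed to complete a partial download."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    start = bytes_received
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1  # HTTP ranges are inclusive
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


# A 25-byte file with the first 10 bytes already on disk, fetched 10 bytes at a time.
print(remaining_ranges(25, 10, 10))  # prints "['bytes=10-19', 'bytes=20-24']"
```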
7. Federation Sync Workflow
Complete synchronization sequence for creating and sharing new content:
7.1 Step 1: Local Node Creates New Versions
User or AKEL creates:
- New claim version
- New scenario version
- New evidence version
- New verdict version
All changes tracked with:
- VersionID
- ParentVersionID
- AuthorType
- Timestamp
- JustificationText
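The tracking fields above map naturally onto an immutable record. The sketch below is one possible shape; the Python field names, the UUID version IDs, and the two `author_type` values are assumptions rather than a defined schema.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class VersionRecord:
    version_id: str
    parent_version_id: Optional[str]  # None for the first version of an entity
    author_type: str                  # assumed values: "human" or "akel"
    timestamp: str                    # ISO 8601, UTC
    justification_text: str


def new_version(parent: Optional[VersionRecord],
                author_type: str,
                justification: str) -> VersionRecord:
    """Create a version that points back at its parent, never editing in place."""
    if author_type not in ("human", "akel"):
        raise ValueError(f"unknown author type: {author_type!r}")
    return VersionRecord(
        version_id=str(uuid.uuid4()),
        parent_version_id=parent.version_id if parent else None,
        author_type=author_type,
        timestamp=datetime.now(timezone.utc).isoformat(),
        justification_text=justification,
    )


v1 = new_version(None, "akel", "Initial claim extraction")
v2 = new_version(v1, "human", "Reviewer corrected the claim wording")
print(v2.parent_version_id == v1.version_id)  # prints "True"
```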
7.2 Step 2: Federation Layer Builds Signed Bundle
Federation layer packages:
- Entity data (claim, scenario, evidence metadata, verdict)
- Version lineage (ParentVersionID chain)
- Cryptographic signatures
- Node provenance information
- Trust metadata
Bundle format:
- JSON-LD for structured data
- Content-addressed hashes
- Digital signatures for integrity
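The sketch below shows the general shape of building and verifying a signed bundle using only the standard library. It departs from the format above in two labeled ways: plain JSON with sorted keys stands in for JSON-LD canonicalization, and HMAC-SHA256 with a shared key stands in for a digital signature. A real deployment would presumably use an asymmetric scheme so that receivers hold no signing key. All function and field names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(payload: dict) -> bytes:
    # Stand-in for JSON-LD canonicalization: sorted keys, no extra whitespace.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_bundle(payload: dict, node: str, key: bytes) -> dict:
    body = canonical_bytes(payload)
    return {
        "payload": payload,
        "provenance": {"node": node},
        "content_hash": "sha256:" + hashlib.sha256(body).hexdigest(),
        # Assumption: HMAC replaces a real digital signature in this sketch.
        "signature": hmac.new(key, body, hashlib.sha256).hexdigest(),
    }


def verify_bundle(bundle: dict, key: bytes) -> bool:
    body = canonical_bytes(bundle["payload"])
    expected_hash = "sha256:" + hashlib.sha256(body).hexdigest()
    expected_sig = hmac.new(key, body, hashlib.sha256).hexdigest()
    return (bundle["content_hash"] == expected_hash
            and hmac.compare_digest(expected_sig, bundle["signature"]))


key = b"demo-key-not-for-production"
bundle = build_bundle(
    {"claim": {"id": "CLM-55812", "version_id": "v2", "parent_version_id": "v1"}},
    node="factharbor.energy",
    key=key,
)
print(verify_bundle(bundle, key))  # prints "True"
```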
7.3 Step 3: Bundle Includes Required Data
Each bundle contains:
- Claims: Full claim text, classification, domain
- Scenarios: Definitions, assumptions, boundaries
- Evidence metadata: Source URLs, hashes, reliability scores (not always full files)
- Verdicts: Likelihood ranges, uncertainty, reasoning chains
- Lineage: Version history, parent relationships
- Signatures: Cryptographic proof of origin
7.4 Step 4: Bundle Pushed to Trusted Neighbor Nodes
Based on trust table:
- Push to trusted nodes immediately
- Queue for neutral nodes (batched)
- Skip untrusted nodes
Push methods:
- Webhook notification
- Direct API call
- Pub/Sub message queue
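The routing rule above can be sketched as a small function that sorts partner nodes into three delivery groups. The string trust levels and host names are placeholders.

```python
from typing import Dict, List


def plan_push(trust_table: Dict[str, str]) -> Dict[str, List[str]]:
    """Group partner nodes by how a new bundle should reach them."""
    route = {"trusted": "immediate", "neutral": "batched", "untrusted": "skipped"}
    plan: Dict[str, List[str]] = {"immediate": [], "batched": [], "skipped": []}
    for node, level in sorted(trust_table.items()):
        plan[route[level]].append(node)
    return plan


plan = plan_push({
    "factharbor.energy": "trusted",
    "health.example.org": "neutral",
    "spam.example.net": "untrusted",
})
print(plan["immediate"], plan["batched"], plan["skipped"])
```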
7.5 Step 5: Remote Nodes Validate Lineage and Signatures
Receiving node:
1. Verifies cryptographic signatures
2. Validates version lineage (ParentVersionID chain)
3. Checks for conflicts with local data
4. Validates data structure and required fields
5. Applies local trust policies
Validation failures → reject or quarantine bundle
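The lineage check can be sketched as a walk up each version's parent chain. The rule assumed here is that every chain must end at a root version (no parent) or at a version the receiving node already holds, without cycles or missing links; the dictionary keys are hypothetical.

```python
from typing import Iterable, List


def validate_lineage(versions: List[dict], known_ids: Iterable[str]) -> bool:
    """Check that every incoming version connects to a root or a known version."""
    known = set(known_ids)
    by_id = {v["version_id"]: v for v in versions}
    if len(by_id) != len(versions):
        return False  # duplicate version IDs inside one bundle
    for version in versions:
        seen = set()
        current = version
        while True:
            if current["version_id"] in seen:
                return False  # cycle in the parent chain
            seen.add(current["version_id"])
            parent = current["parent_version_id"]
            if parent is None or parent in known:
                break  # reached a root or a version already held locally
            if parent not in by_id:
                return False  # parent is neither local nor in the bundle
            current = by_id[parent]
    return True


incoming = [
    {"version_id": "v2", "parent_version_id": "v1"},
    {"version_id": "v3", "parent_version_id": "v2"},
]
print(validate_lineage(incoming, known_ids={"v1"}))  # prints "True"
```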
7.6 Step 6: Accept or Branch Versions
Accept (if validation passes):
- Import new versions
- Maintain provenance metadata
- Link to local related entities
- Update local indices
Branch (if conflict detected):
- Create parallel version tree
- Mark as "external branch"
- Allow local reviewers to merge or reject
- Preserve both version histories
Reject (if validation fails):
- Log rejection reason
- Notify source node (optional)
- Quarantine for manual review (optional)
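The three outcomes can be sketched as a single decision function. What counts as a conflict is not defined in this section; the sketch assumes a conflict means the incoming version does not extend the node's current head version for that entity. The function and parameter names are hypothetical.

```python
from typing import Optional


def resolve(incoming: dict,
            local_head_id: Optional[str],
            signature_ok: bool,
            lineage_ok: bool) -> str:
    """Return "accept", "branch", or "reject" for one incoming version."""
    if not (signature_ok and lineage_ok):
        return "reject"
    # Assumption: no conflict if the entity is new locally or the incoming
    # version builds directly on the version this node considers current.
    if local_head_id is None or incoming["parent_version_id"] == local_head_id:
        return "accept"
    # Both histories are preserved; local reviewers later merge or reject.
    return "branch"


incoming = {"version_id": "v3-remote", "parent_version_id": "v2"}
print(resolve(incoming, local_head_id="v3-local", signature_ok=True, lineage_ok=True))
# prints "branch"
```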
7.7 Step 7: Local Re-evaluation Runs if Required
After import, local node checks:
- Does new evidence affect existing verdicts?
- Do new scenarios require re-assessment?
- Are there contradictions with local content?
If yes:
- Trigger AKEL re-evaluation
- Queue for reviewer attention
- Update affected verdicts
- Notify users following related content
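Deciding which verdicts need re-evaluation can be sketched as a lookup against a dependency index. The index shape (verdict ID mapped to the evidence and scenario IDs it relies on) and all IDs are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def verdicts_to_reevaluate(imported_ids: Iterable[str],
                           dependencies: Dict[str, Set[str]]) -> List[str]:
    """Verdicts that rely on at least one entity touched by the import."""
    imported = set(imported_ids)
    return sorted(v for v, deps in dependencies.items() if deps & imported)


dependencies = {
    "VRD-1": {"EVD-10", "SCN-3"},
    "VRD-2": {"EVD-11"},
    "VRD-3": {"EVD-10", "EVD-12"},
}
print(verdicts_to_reevaluate(["EVD-10"], dependencies))  # prints "['VRD-1', 'VRD-3']"
```

The returned verdicts would then be handed to AKEL and queued for reviewers, as the steps above describe.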
8. Cross-Node AI Knowledge Exchange
Each node runs its own AKEL instance and may exchange AI-derived knowledge:
8.1 What Can Be Shared
Vector embeddings:
- For cross-node claim clustering
- For semantic search alignment
- Never includes training data
Canonical claim forms:
- Normalized claim text
- Standard phrasing templates
- Domain-specific formulations
Scenario templates:
- Reusable scenario structures
- Common assumption patterns
- Evaluation method templates
Contradiction alerts:
- Detected conflicts between claims
- Evidence conflicts across nodes
- Scenario incompatibilities
Metadata and insights:
- Aggregate quality metrics
- Reliability signal extraction
- Bubble detection patterns
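To illustrate how shared embeddings could support cross-node claim clustering, the sketch below pairs local and remote claims by cosine similarity. It assumes both nodes produce embeddings with the same model, since vectors from different models are not comparable. The 0.9 threshold, the IDs, and the toy two-dimensional vectors are arbitrary.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def match_remote_claims(local: Dict[str, Sequence[float]],
                        remote: Dict[str, Sequence[float]],
                        threshold: float = 0.9) -> List[Tuple[str, str]]:
    """Pairs (local_id, remote_id) whose embeddings are at least `threshold` similar."""
    return [
        (local_id, remote_id)
        for local_id, local_vec in local.items()
        for remote_id, remote_vec in remote.items()
        if cosine_similarity(local_vec, remote_vec) >= threshold
    ]


local = {"CLM-1": [1.0, 0.0]}
remote = {"CLM-A": [1.0, 0.1], "CLM-B": [0.0, 1.0]}
print(match_remote_claims(local, remote))  # prints "[('CLM-1', 'CLM-A')]"
```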
8.2 What Can NEVER Be Shared
Model weights: No sharing of trained model parameters
Training data: No sharing of full training datasets
Local governance overrides: AKEL suggestions can be overridden locally, and those overrides are not shared
User behavior data: No cross-node tracking
Internal review discussions: Private content stays private
8.3 Benefits of AI Knowledge Exchange
- Reduced duplication across nodes
- Improved claim clustering accuracy
- Faster contradiction detection
- Shared scenario libraries
- Cross-node quality improvements
8.4 Local Control Maintained
- Nodes accept or reject shared AI knowledge
- Human reviewers can override any AKEL suggestion
- Local governance always has final authority
- No external AI control over local content
- Privacy-preserving knowledge exchange
9. Decentralized Processing
Each node independently performs:
- AKEL processing
- Scenario drafting and validation
- Evidence review
- Verdict calculation
- Truth landscape summarization
Nodes can specialize:
- Health-focused node with medical experts
- Energy-focused node with domain knowledge
- Small node delegating scenario libraries to partners
- Regional node with language/culture specialization
Optional data sharing includes:
- Embeddings for clustering
- Claim clusters for alignment
- Scenario templates for efficiency
- Verdict comparison metadata
10. Scaling to Thousands of Users
Nodes scale independently through:
- Horizontally scalable API servers
- Worker pools for AKEL tasks
- Hybrid storage (local + S3/IPFS)
- Redis caching for performance
- Sharded or partitioned databases
Federation adds a further scaling path: because each node scales independently, total capacity grows by adding new nodes.
Communities may form:
- Domain-specific nodes (epidemiology, energy, climate)
- Language or region-based nodes
- NGO or institutional nodes
- Private organizational nodes
- Academic research nodes
Nodes cooperate through:
- Scenario library sharing
- Shared or overlapping claim clusters
- Expert delegation between nodes
- Distributed AKEL task support
- Cross-node quality audits
11. Federation and Release 1.0
POC: Single node, optional federation experiments
Beta 0: 2-3 nodes, basic federation protocol
Release 1.0: Full federation support with:
- Robust synchronization protocol
- Trust model implementation
- Cross-node AI knowledge exchange
- Federated search and discovery
- Distributed audit collaboration
- Inter-node expert consultation