Federation & Decentralization

Version 7.4 by Robert Schaub on 2025/12/16 21:39

FactHarbor is designed to operate as a federated network of nodes rather than a single central server.

Decentralization provides:

  • Resilience against censorship or political pressure
  • Autonomy for local governance and moderation
  • Scalability across many independent communities
  • Trust without centralized control
  • Domain specialization (health-focused nodes, energy-focused nodes, etc.)

FactHarbor draws inspiration from the Fediverse but uses stronger structure, versioning, and integrity guarantees.


Federated FactHarbor Nodes

Each FactHarbor instance ("node") maintains:

  • Its own database
  • Its own AKEL instance
  • Its own reviewers, experts, and contributors
  • Its own governance rules

Nodes exchange structured information:

  • Claims
  • Scenarios
  • Evidence metadata (not necessarily full files)
  • Verdicts (optional)
  • Hashes and signatures for integrity

Nodes choose which external nodes they trust.


Global Identifiers

Every entity receives a globally unique, linkable identifier.

Format:  
`factharbor://node_url/type/local_id`

Example:  
`factharbor://factharbor.energy/claim/CLM-55812`

Supported types:

  • `claim`
  • `scenario`
  • `evidence`
  • `verdict`
  • `user` (optional)
  • `cluster`

Properties:

  • Globally consistent
  • Human-readable
  • Hash-derived
  • Independent of database internals
  • URL-resolvable (future enhancement)

This allows cross-node references and prevents identifier collisions in federated environments.
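
The identifier format above can be illustrated with a small parser. This is only a sketch: the `GlobalId` class and `parse_global_id` function are hypothetical names, and the validation rules (exactly three path segments, type taken from the list of supported types) are assumptions drawn from the format and example shown here, not a published FactHarbor API.

```python
from dataclasses import dataclass

SCHEME = "factharbor://"
# Entity types listed under "Supported types" in this section.
SUPPORTED_TYPES = {"claim", "scenario", "evidence", "verdict", "user", "cluster"}


@dataclass(frozen=True)
class GlobalId:
    """Hypothetical in-memory form of a FactHarbor global identifier."""
    node: str         # host part, e.g. "factharbor.energy"
    entity_type: str  # one of SUPPORTED_TYPES
    local_id: str     # node-local identifier, e.g. "CLM-55812"

    def __str__(self) -> str:
        return f"{SCHEME}{self.node}/{self.entity_type}/{self.local_id}"


def parse_global_id(text: str) -> GlobalId:
    """Split factharbor://node/type/local_id into its three parts."""
    if not text.startswith(SCHEME):
        raise ValueError(f"not a factharbor identifier: {text!r}")
    parts = text[len(SCHEME):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected node/type/local_id: {text!r}")
    node, entity_type, local_id = parts
    if entity_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    return GlobalId(node, entity_type, local_id)


gid = parse_global_id("factharbor://factharbor.energy/claim/CLM-55812")
print(gid.node, gid.entity_type, gid.local_id)
# prints: factharbor.energy claim CLM-55812
```

Because the node name is part of the identifier, two nodes can both issue a local ID such as `CLM-55812` without colliding.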


Trust Model

Each node maintains a trust table defining relationships with other nodes:

Trust Levels

Trusted Nodes:

  • Claims auto-imported
  • Scenarios accepted without re-review
  • Evidence considered valid
  • Verdicts displayed to users
  • High synchronization priority

Neutral Nodes:

  • Claims imported but flagged for review
  • Scenarios require local validation
  • Evidence requires re-assessment
  • Verdicts shown with "external node" disclaimer
  • Normal synchronization priority

Untrusted Nodes:

  • Claims quarantined, manual import only
  • Scenarios rejected by default
  • Evidence not accepted
  • Verdicts not displayed
  • No automatic synchronization

Trust Affects

  • Auto-import: Whether claims/scenarios are automatically added
  • Re-review requirements: Whether local reviewers must validate
  • Verdict display: Whether external verdicts are shown to users
  • Synchronization frequency: How often data is exchanged
  • Reputation signals: How external reputation is interpreted

Local Trust Authority

Each node's governance team decides:

  • Which nodes to trust
  • Trust level criteria
  • Trust escalation/degradation rules
  • Dispute resolution with partner nodes

Trust is local and autonomous; no global trust registry exists.
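
As an illustration, a node's trust table could be modelled as a mapping from node name to trust level, with one policy record per level. Everything in this sketch is hypothetical (the names `TrustLevel`, `TrustPolicy`, `policy_for`, and the example node names); treating unknown nodes as untrusted is an assumption, since the text does not define a default.

```python
from dataclasses import dataclass
from enum import Enum


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


@dataclass(frozen=True)
class TrustPolicy:
    auto_import: bool         # claims are imported without manual action
    needs_local_review: bool  # local reviewers must validate imported content
    show_verdicts: bool       # external verdicts are displayed to users
    verdict_disclaimer: bool  # verdicts carry an "external node" disclaimer
    auto_sync: bool           # automatic synchronization is enabled


# One policy per trust level, following the three lists above.
POLICIES = {
    TrustLevel.TRUSTED: TrustPolicy(True, False, True, False, True),
    TrustLevel.NEUTRAL: TrustPolicy(True, True, True, True, True),
    TrustLevel.UNTRUSTED: TrustPolicy(False, True, False, False, False),
}

# Hypothetical trust table maintained by the local governance team.
trust_table = {
    "factharbor.energy": TrustLevel.TRUSTED,
    "node-b.example": TrustLevel.NEUTRAL,
}


def policy_for(node: str) -> TrustPolicy:
    """Look up the policy for a node; unknown nodes are treated as untrusted (assumption)."""
    level = trust_table.get(node, TrustLevel.UNTRUSTED)
    return POLICIES[level]


print(policy_for("node-b.example").verdict_disclaimer)  # prints: True
```

Keeping the table local to each node matches the absence of a global trust registry: two nodes may rate the same partner differently.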


Data Sharing Model

What Nodes Share

Always shared (if federation enabled):

  • Claims and claim clusters
  • Scenario structures
  • Evidence metadata and content hashes
  • Integrity signatures

Optionally shared:

  • Full evidence files (large documents)
  • Verdicts (nodes may choose to keep verdicts local)
  • Vector embeddings
  • Scenario templates
  • AKEL distilled knowledge

Never shared:

  • Internal user lists
  • Reviewer comments and internal discussions
  • Governance decisions and meeting notes
  • Access control data
  • Private or sensitive content marked as local-only

Large Evidence Files

Evidence files are:

  • Stored locally by default
  • Referenced via global content hash
  • Optionally served through IPFS
  • Accessible via direct peer-to-peer transfer
  • Optionally stored in S3-compatible object storage
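
A content hash lets a remote node refer to an evidence file without holding a copy, and lets it check a copy it fetches later. The sketch below assumes SHA-256 and a `sha256:` prefix; the text does not name a hash algorithm or encoding, so both are illustrative choices, as is the function name `content_hash`.

```python
import hashlib
import io


def content_hash(stream, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
    return "sha256:" + digest.hexdigest()


# Any binary file object works; BytesIO stands in for an evidence file here.
reference = content_hash(io.BytesIO(b"example evidence bytes"))

# A node that later receives the file recomputes the hash and compares.
received_ok = content_hash(io.BytesIO(b"example evidence bytes")) == reference
print(received_ok)  # prints: True
```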

Synchronization Protocol

Nodes exchange data using multiple synchronization methods:

Push-Based Synchronization

Mechanism: Webhooks

When local content changes:

  1. Node builds signed bundle
  2. Sends webhook notification to subscribed nodes
  3. Remote nodes fetch bundle
  4. Remote nodes validate and import

Use case: Real-time updates for trusted partners

Pull-Based Synchronization

Mechanism: Scheduled polling

Nodes periodically:

  1. Query partner nodes for updates
  2. Fetch changed entities since last sync
  3. Validate and import
  4. Store sync checkpoint

Use case: Regular batch updates, lower trust nodes
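
The four pull steps can be sketched as one polling cycle. This example makes several assumptions that the text does not specify: the partner exposes changes ordered by a monotonically increasing sequence number, the checkpoint is the last sequence number processed, and a change that fails validation is skipped rather than retried. `PartnerFeed` is an in-memory stand-in for a remote node, not a real client.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    seq: int        # position in the partner's change feed (assumed to increase monotonically)
    entity_id: str  # global identifier of the changed entity
    payload: dict   # entity data as sent by the partner


@dataclass
class PartnerFeed:
    """In-memory stand-in for a partner node's change feed (hypothetical)."""
    changes: list = field(default_factory=list)

    def changes_since(self, checkpoint: int) -> list:
        return [c for c in self.changes if c.seq > checkpoint]


def pull_once(feed: PartnerFeed, checkpoint: int, validate, store: dict) -> int:
    """Run one polling cycle and return the new checkpoint."""
    # Steps 1 and 2: query the partner and fetch everything after the checkpoint.
    pending = sorted(feed.changes_since(checkpoint), key=lambda c: c.seq)
    for change in pending:
        # Step 3: validate, then import into the local store.
        if validate(change):
            store[change.entity_id] = change.payload
        # Step 4: advance the checkpoint even for skipped changes (assumption).
        checkpoint = change.seq
    return checkpoint
```

A scheduler would call `pull_once` at a fixed interval per partner and persist the returned checkpoint between runs.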

Subscription-Based Synchronization

Mechanism: WebSub-like protocol

Nodes subscribe to:

  • Specific claim clusters
  • Specific domains (medical, energy, etc.)
  • Specific scenario types
  • Verdict updates

Publisher pushes updates only to subscribers.

Use case: Selective federation, domain specialization
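
Selective delivery comes down to matching each update against the filters a subscriber registered. The sketch below is illustrative only: the `Subscription` fields, the update dictionary keys (`domain`, `cluster`, `type`), and the rule that an empty filter means "no restriction" are all assumptions, not part of a defined FactHarbor protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    subscriber: str
    domains: frozenset = frozenset()       # e.g. {"medical", "energy"}
    clusters: frozenset = frozenset()      # claim cluster identifiers
    entity_types: frozenset = frozenset()  # e.g. {"verdict"}

    def matches(self, update: dict) -> bool:
        """True if the update passes every non-empty filter."""
        checks = (
            (self.domains, update.get("domain")),
            (self.clusters, update.get("cluster")),
            (self.entity_types, update.get("type")),
        )
        # An empty filter places no restriction on that field (assumption).
        return all(not wanted or value in wanted for wanted, value in checks)


def recipients(subscriptions, update: dict) -> list:
    """Names of the subscribers that should receive this update."""
    return [s.subscriber for s in subscriptions if s.matches(update)]


subs = [
    Subscription("health-node.example", domains=frozenset({"medical"})),
    Subscription("audit-node.example", entity_types=frozenset({"verdict"})),
]
update = {"domain": "medical", "cluster": "CL-1", "type": "claim"}
print(recipients(subs, update))  # prints: ['health-node.example']
```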

Large Asset Transfer

For files larger than 10 MB:

  • S3-compatible object storage
  • IPFS (content-addressed)
  • Direct peer-to-peer transfer
  • Chunked HTTP transfer with resume support

Federation Sync Workflow

Complete synchronization sequence for creating and sharing new content:

Step 1: Local Node Creates New Versions

User or AKEL creates:

  • New claim version
  • New scenario version
  • New evidence version
  • New verdict version

All changes tracked with:

  • VersionID
  • ParentVersionID
  • AuthorType
  • Timestamp
  • JustificationText
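
The tracked fields map naturally onto an immutable record. The following is a minimal sketch, not FactHarbor's actual schema: the snake_case field names, the use of random UUIDs for `version_id`, ISO 8601 UTC timestamps, and the example author types `"human"` and `"akel"` are all assumptions.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class VersionRecord:
    version_id: str
    parent_version_id: Optional[str]  # None for the first version of an entity
    author_type: str                  # e.g. "human" or "akel" (assumed values)
    timestamp: str                    # ISO 8601, UTC
    justification_text: str


def new_version(parent: Optional[VersionRecord], author_type: str, justification: str) -> VersionRecord:
    """Create a version that points back at its parent, forming the lineage chain."""
    return VersionRecord(
        version_id=uuid.uuid4().hex,
        parent_version_id=parent.version_id if parent else None,
        author_type=author_type,
        timestamp=datetime.now(timezone.utc).isoformat(),
        justification_text=justification,
    )


v1 = new_version(None, "human", "Initial claim text")
v2 = new_version(v1, "akel", "Normalized wording")
print(v2.parent_version_id == v1.version_id)  # prints: True
```

Records are frozen so that an edit always produces a new version instead of mutating history.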

Step 2: Federation Layer Builds Signed Bundle

Federation layer packages:

  • Entity data (claim, scenario, evidence metadata, verdict)
  • Version lineage (ParentVersionID chain)
  • Cryptographic signatures
  • Node provenance information
  • Trust metadata

Bundle format:

  • JSON-LD for structured data
  • Content-addressed hashes
  • Digital signatures for integrity

Step 3: Bundle Includes Required Data

Each bundle contains:

  • Claims: Full claim text, classification, domain
  • Scenarios: Definitions, assumptions, boundaries
  • Evidence metadata: Source URLs, hashes, reliability scores (not always full files)
  • Verdicts: Likelihood ranges, uncertainty, reasoning chains
  • Lineage: Version history, parent relationships
  • Signatures: Cryptographic proof of origin
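
To make the bundle idea concrete, here is a sketch of packaging and verifying a bundle using only the Python standard library. It simplifies in two ways that should be stated loudly: it emits plain JSON, not JSON-LD, and it uses an HMAC with a shared secret as a stand-in for a real digital signature (a production design would more likely use asymmetric keys so that receivers cannot forge bundles). The function names and bundle keys are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(obj) -> bytes:
    """Serialize deterministically so sender and receiver hash the same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_bundle(node: str, entities: list, lineage: list, key: bytes) -> dict:
    body = {"node": node, "entities": entities, "lineage": lineage}
    payload = canonical_bytes(body)
    return {
        **body,
        "content_hash": hashlib.sha256(payload).hexdigest(),
        # HMAC is a stand-in for a digital signature (assumption, see lead-in).
        "signature": hmac.new(key, payload, hashlib.sha256).hexdigest(),
    }


def verify_bundle(bundle: dict, key: bytes) -> bool:
    """Recompute hash and signature over the body and compare with the bundle's."""
    body = {k: bundle[k] for k in ("node", "entities", "lineage")}
    payload = canonical_bytes(body)
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return (
        hashlib.sha256(payload).hexdigest() == bundle.get("content_hash")
        and hmac.compare_digest(expected, bundle.get("signature", ""))
    )


key = b"demo-shared-secret"
bundle = build_bundle(
    "factharbor.energy",
    [{"id": "factharbor://factharbor.energy/claim/CLM-55812", "text": "Example claim"}],
    [{"version_id": "v2", "parent_version_id": "v1"}],
    key,
)
print(verify_bundle(bundle, key))  # prints: True
```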

Step 4: Bundle Pushed to Trusted Neighbor Nodes

Based on trust table:

  • Push to trusted nodes immediately
  • Queue for neutral nodes (batched)
  • Skip untrusted nodes

Push methods:

  • Webhook notification
  • Direct API call
  • Pub/Sub message queue
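
The trust-based routing rule in this step is small enough to show directly. In this sketch the trust levels are plain strings, `push_now` is a caller-supplied callback standing in for whichever push method is used, and the batch queue is a list; all of these are illustrative assumptions.

```python
def route_bundle(bundle_id: str, trust_table: dict, push_now, batch_queue: list) -> None:
    """Dispatch one bundle according to each partner's trust level."""
    for node, level in trust_table.items():
        if level == "trusted":
            push_now(node, bundle_id)              # pushed immediately
        elif level == "neutral":
            batch_queue.append((node, bundle_id))  # delivered with the next batch
        # Untrusted nodes are skipped entirely.


pushed = []
queue = []
route_bundle(
    "bundle-001",
    {"a.example": "trusted", "b.example": "neutral", "c.example": "untrusted"},
    lambda node, bundle_id: pushed.append((node, bundle_id)),
    queue,
)
print(pushed)  # prints: [('a.example', 'bundle-001')]
print(queue)   # prints: [('b.example', 'bundle-001')]
```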

Step 5: Remote Nodes Validate Lineage and Signatures

Receiving node:

  1. Verifies cryptographic signatures
  2. Validates version lineage (ParentVersionID chain)
  3. Checks for conflicts with local data
  4. Validates data structure and required fields
  5. Applies local trust policies

Validation failures → reject or quarantine bundle
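
The lineage check in particular can be sketched as follows. The rule used here is an assumption: a bundle lists versions parents-first, and every version's parent must be absent (a root version), already known locally, or introduced earlier in the same bundle. With that ordering requirement, a cycle cannot pass validation. The function name and the tuple layout are hypothetical.

```python
def validate_lineage(versions, known_ids) -> list:
    """Return a list of problems; an empty list means the lineage is acceptable.

    versions:  iterable of (version_id, parent_version_id) pairs, parents first
    known_ids: version ids that already exist on the receiving node
    """
    seen = set(known_ids)
    errors = []
    for version_id, parent_id in versions:
        if version_id in seen:
            errors.append(f"{version_id}: duplicate version id")
            continue
        if parent_id is not None and parent_id not in seen:
            errors.append(f"{version_id}: unknown parent {parent_id}")
            continue
        seen.add(version_id)
    return errors


print(validate_lineage([("v2", "v1"), ("v3", "v2")], known_ids={"v1"}))
# prints: []
print(validate_lineage([("v3", "v2")], known_ids={"v1"}))
# prints: ['v3: unknown parent v2']
```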

Step 6: Accept or Branch Versions

Accept (if validation passes):

  • Import new versions
  • Maintain provenance metadata
  • Link to local related entities
  • Update local indices

Branch (if conflict detected):

  • Create parallel version tree
  • Mark as "external branch"
  • Allow local reviewers to merge or reject
  • Preserve both version histories

Reject (if validation fails):

  • Log rejection reason
  • Notify source node (optional)
  • Quarantine for manual review (optional)
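
The three outcomes can be expressed as a single decision function. The text does not define what counts as a conflict, so this sketch assumes one common rule: an incoming version applies cleanly when its parent is the node's current head version for that entity, and otherwise it diverges and is kept as a branch. The names `Outcome` and `decide` are hypothetical.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    ACCEPT = "accept"
    BRANCH = "branch"
    REJECT = "reject"


def decide(valid: bool, incoming_parent: Optional[str], local_head: Optional[str]) -> Outcome:
    """Choose what to do with one incoming version of an entity."""
    if not valid:
        return Outcome.REJECT  # signature, lineage, or schema validation failed
    if incoming_parent == local_head:
        return Outcome.ACCEPT  # extends the local history directly
    return Outcome.BRANCH      # local history moved on; keep both version trees


print(decide(True, "v1", "v1").value)   # prints: accept
print(decide(True, "v1", "v2").value)   # prints: branch
print(decide(False, "v1", "v1").value)  # prints: reject
```

An entity that is new to the receiving node has no local head, so a root version (parent `None`) is accepted under the same rule.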

Step 7: Local Re-evaluation Runs if Required

After import, local node checks:

  • Does new evidence affect existing verdicts?
  • Do new scenarios require re-assessment?
  • Are there contradictions with local content?

If yes:

  • Trigger AKEL re-evaluation
  • Queue for reviewer attention
  • Update affected verdicts
  • Notify users following related content
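
Deciding which verdicts need re-evaluation requires knowing which entities each verdict was derived from. The sketch below assumes the node keeps such a dependency map from verdict ID to input entity IDs; that map, the short example IDs, and the function name are illustrative, not something this page specifies.

```python
def affected_verdicts(imported_ids, verdict_inputs: dict) -> list:
    """Return ids of local verdicts that depend on any newly imported entity."""
    imported = set(imported_ids)
    return sorted(
        verdict_id
        for verdict_id, inputs in verdict_inputs.items()
        if imported & set(inputs)
    )


# Hypothetical dependency map: verdict id -> evidence and scenario ids it used.
verdict_inputs = {
    "verdict/V-1": {"evidence/E-1", "scenario/S-1"},
    "verdict/V-2": {"evidence/E-2", "scenario/S-1"},
    "verdict/V-3": {"evidence/E-3", "scenario/S-2"},
}

print(affected_verdicts({"evidence/E-2"}, verdict_inputs))
# prints: ['verdict/V-2']
print(affected_verdicts({"scenario/S-1"}, verdict_inputs))
# prints: ['verdict/V-1', 'verdict/V-2']
```

The returned IDs would then be handed to AKEL and queued for reviewers, as listed above.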

Cross-Node AI Knowledge Exchange

Each node runs its own AKEL instance and may exchange AI-derived knowledge:

What Can Be Shared

Vector embeddings:

  • For cross-node claim clustering
  • For semantic search alignment
  • Never includes training data

Canonical claim forms:

  • Normalized claim text
  • Standard phrasing templates
  • Domain-specific formulations

Scenario templates:

  • Reusable scenario structures
  • Common assumption patterns
  • Evaluation method templates

Contradiction alerts:

  • Detected conflicts between claims
  • Evidence conflicts across nodes
  • Scenario incompatibilities

Metadata and insights:

  • Aggregate quality metrics
  • Reliability signal extraction
  • Bubble detection patterns
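
As an example of how shared vector embeddings could support cross-node claim clustering, the sketch below assigns an incoming claim embedding to the most similar local cluster by cosine similarity. It assumes both nodes produce embeddings with the same model and dimensionality, which the text does not guarantee; the 0.8 threshold and the function names are likewise illustrative.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_to_cluster(embedding, centroids: dict, threshold: float = 0.8):
    """Return the id of the most similar cluster, or None if none reaches the threshold."""
    best_id, best_score = None, threshold
    for cluster_id, centroid in centroids.items():
        score = cosine_similarity(embedding, centroid)
        if score >= best_score:
            best_id, best_score = cluster_id, score
    return best_id


centroids = {"cluster-a": [1.0, 0.1], "cluster-b": [0.0, 1.0]}
print(match_to_cluster([1.0, 0.0], centroids))  # prints: cluster-a
print(match_to_cluster([1.0, 1.0], {"cluster-b": [0.0, 1.0]}))  # prints: None
```

Only the vectors cross the node boundary; the claim text, training data, and model weights stay local.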

What Can NEVER Be Shared

Model weights: No sharing of trained model parameters

Training data: No sharing of full training datasets

Local governance overrides: Local decisions to override AKEL suggestions stay on the node

User behavior data: No cross-node tracking

Internal review discussions: Private content stays private

Benefits of AI Knowledge Exchange

  • Reduced duplication across nodes
  • Improved claim clustering accuracy
  • Faster contradiction detection
  • Shared scenario libraries
  • Cross-node quality improvements

Local Control Maintained

  • Nodes accept or reject shared AI knowledge
  • Human reviewers can override any AKEL suggestion
  • Local governance always has final authority
  • No external AI control over local content
  • Privacy-preserving knowledge exchange

Decentralized Processing

Each node independently performs:

  • AKEL processing
  • Scenario drafting and validation
  • Evidence review
  • Verdict calculation
  • Truth landscape summarization

Nodes can specialize:

  • Health-focused node with medical experts
  • Energy-focused node with domain knowledge
  • Small node delegating scenario libraries to partners
  • Regional node with language/culture specialization

Optional data sharing includes:

  • Embeddings for clustering
  • Claim clusters for alignment
  • Scenario templates for efficiency
  • Verdict comparison metadata

Scaling to Thousands of Users

Nodes scale independently through:

  • Horizontally scalable API servers
  • Worker pools for AKEL tasks
  • Hybrid storage (local + S3/IPFS)
  • Redis caching for performance
  • Sharded or partitioned databases

Federation adds capacity horizontally: each new node brings its own storage and compute, so the network can grow without a central bottleneck.

Communities may form:

  • Domain-specific nodes (epidemiology, energy, climate)
  • Language or region-based nodes
  • NGO or institutional nodes
  • Private organizational nodes
  • Academic research nodes

Nodes cooperate through:

  • Scenario library sharing
  • Shared or overlapping claim clusters
  • Expert delegation between nodes
  • Distributed AKEL task support
  • Cross-node quality audits

Federation and Release 1.0

POC: Single node, optional federation experiments

Beta 0: 2-3 nodes, basic federation protocol

Release 1.0: Full federation support with:

  • Robust synchronization protocol
  • Trust model implementation
  • Cross-node AI knowledge exchange
  • Federated search and discovery
  • Distributed audit collaboration
  • Inter-node expert consultation
