Architecture

Version 2.6 by Robert Schaub on 2025/12/11 21:34

FactHarbor uses a modular-monolith architecture (POC → Beta 0) designed to evolve into a distributed, federated, multi-node system (Release 1.0+).
Modules are strictly separated, versioned, and auditable. All core logic is transparent and deterministic.


High-Level System Architecture

FactHarbor is composed of the following major modules:

  • UI Frontend
  • REST API Layer
  • Core Logic Layer
    – Claim Processing  
    – Scenario Engine  
    – Evidence Repository  
    – Verdict Engine  
    – Re-evaluation Engine  
    – Roles / Identity / Reputation
  • AKEL (AI Knowledge Extraction Layer)
  • Federation Layer
  • Workers & Background Jobs
  • Storage Layer (Postgres + VectorDB + ObjectStore)

Key ideas:

  • Core logic is deterministic, auditable, and versioned 
  • AKEL drafts structured outputs but never publishes directly 
  • Workers run long or asynchronous tasks 
  • Storage is separated for scalability and clarity 
  • Federation Layer provides optional distributed operation 

Storage Architecture

FactHarbor separates structured data, embeddings, and evidence files:

  • PostgreSQL — canonical structured entities, all versioning, lineage, signatures 
  • Vector DB (Qdrant or pgvector) — semantic search, duplicate detection, cluster mapping 
  • Object Storage — PDFs, datasets, raw evidence, transcripts 
  • Optional (Release 1.0): Redis for caching, IPFS for decentralized object storage 
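
A minimal sketch of how this separation could look behind a single storage facade, assuming a Python backend; the interface and class names below are illustrative, not FactHarbor's actual code:

# Illustrative only: the three storage concerns kept behind separate
# interfaces so each can scale or be swapped independently.
from dataclasses import dataclass
from typing import List, Protocol, Sequence


class RelationalStore(Protocol):        # PostgreSQL: entities, versions, lineage, signatures
    def save_entity(self, table: str, row: dict) -> str: ...

class VectorStore(Protocol):            # Qdrant / pgvector: embeddings for search and dedup
    def upsert(self, key: str, embedding: Sequence[float]) -> None: ...
    def nearest(self, embedding: Sequence[float], k: int) -> List[str]: ...

class ObjectStore(Protocol):            # object store: PDFs, datasets, raw evidence, transcripts
    def put(self, key: str, blob: bytes) -> str: ...


@dataclass
class StorageLayer:
    """Single entry point the core modules depend on."""
    db: RelationalStore
    vectors: VectorStore
    objects: ObjectStore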

Core Backend Module Architecture

Each module has a clear responsibility and versioned boundaries to allow future extraction into microservices.

Claim Processing Module

Responsibilities:

  • Ingest text, URLs, documents, transcripts, federated input 
  • Extract claims (AKEL-assisted) 
  • Normalize structure 
  • Classify (type, domain, evaluability, safety) 
  • Deduplicate via embeddings 
  • Assign to claim clusters 

Flow:  
Ingest → Normalize → Classify → Deduplicate → Cluster
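
A minimal sketch of this flow as a stub pipeline, assuming Python; every function name here is hypothetical, and the real classification and deduplication steps are AKEL- and embedding-assisted:

# Sketch of the claim-processing flow. All functions are illustrative stubs.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    text: str
    claim_type: str = "unclassified"
    duplicate_of: Optional[str] = None
    cluster_id: Optional[str] = None


def ingest(raw: str) -> Claim:             # accepts text, URL, document, transcript
    return Claim(text=raw.strip())

def normalize(claim: Claim) -> Claim:      # canonical wording and structure
    claim.text = " ".join(claim.text.split())
    return claim

def classify(claim: Claim) -> Claim:       # type, domain, evaluability, safety
    claim.claim_type = ("quantitative" if any(ch.isdigit() for ch in claim.text)
                        else "general")
    return claim

def deduplicate(claim: Claim) -> Claim:    # embedding similarity against the Vector DB
    return claim                           # stub: a real check queries Qdrant / pgvector

def assign_cluster(claim: Claim) -> Claim: # attach to a claim cluster
    claim.cluster_id = claim.cluster_id or "unclustered"
    return claim


def process_claim(raw: str) -> Claim:
    """Ingest → Normalize → Classify → Deduplicate → Cluster."""
    claim = ingest(raw)
    for step in (normalize, classify, deduplicate, assign_cluster):
        claim = step(claim)
    return claim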


Scenario Engine

Responsibilities:

  • Create and validate scenarios 
  • Enforce required fields (definitions, assumptions, boundaries...) 
  • Perform safety checks (AKEL-assisted) 
  • Manage versioning and lifecycle 
  • Provide contextual evaluation settings to the Verdict Engine 

Flow:  
Create → Validate → Version → Lifecycle → Safety
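
A sketch of the required-field check, with the field list taken from the bullets above; the Scenario shape and status values are assumptions, not the real Scenario Engine:

# Sketch of scenario validation; illustrative only.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scenario:
    claim_id: str
    definitions: dict = field(default_factory=dict)
    assumptions: List[str] = field(default_factory=list)
    boundaries: str = ""
    version: int = 1
    status: str = "draft"              # assumed lifecycle: draft → validated → active → superseded


REQUIRED_FIELDS = ("definitions", "assumptions", "boundaries")


def validate(scenario: Scenario) -> List[str]:
    """Return validation errors; an empty list means the scenario may advance."""
    errors = [f"missing required field: {name}"
              for name in REQUIRED_FIELDS if not getattr(scenario, name)]
    if not errors:
        scenario.status = "validated"
    return errors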


Evidence Repository

Responsibilities:

  • Store metadata + files (object store) 
  • Classify evidence 
  • Compute preliminary reliability 
  • Maintain version history 
  • Detect retractions or disputes 
  • Provide structured metadata to the Verdict Engine 

Flow:  
Store → Classify → Score → Version → Update/Retract
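
A sketch of an evidence record with a first-pass reliability score; the type labels and weights are invented for illustration and would be refined by reviewers and AKEL hints:

# Sketch of an evidence record and a preliminary reliability score.
from dataclasses import dataclass


@dataclass
class Evidence:
    source_url: str
    evidence_type: str          # e.g. "peer_reviewed", "dataset", "news", "transcript"
    object_key: str             # pointer into the object store (PDF, dataset, etc.)
    version: int = 1
    retracted: bool = False


TYPE_WEIGHTS = {"peer_reviewed": 0.9, "dataset": 0.8, "news": 0.5, "transcript": 0.4}


def preliminary_reliability(evidence: Evidence) -> float:
    """Crude first-pass score; retracted evidence scores zero."""
    if evidence.retracted:
        return 0.0
    return TYPE_WEIGHTS.get(evidence.evidence_type, 0.3)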


Verdict Engine

Responsibilities:

  • Aggregate scenario-linked evidence 
  • Compute likelihood ranges 
  • Generate reasoning chain 
  • Track uncertainty factors 
  • Maintain verdict version timelines 

Flow:  
Aggregate → Compute → Explain → Version → Timeline
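
A sketch of how scenario-linked evidence could be aggregated into a likelihood range plus reasoning chain; the aggregation rule is a placeholder, not FactHarbor's actual method:

# Sketch of verdict aggregation; the math here is illustrative only.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EvidenceSignal:
    evidence_id: str
    supports: bool                # True = supports the scenario, False = contradicts
    reliability: float            # 0.0 .. 1.0


@dataclass
class Verdict:
    scenario_id: str
    likelihood_range: Tuple[float, float]
    reasoning: List[str] = field(default_factory=list)
    uncertainty_factors: List[str] = field(default_factory=list)


def compute_verdict(scenario_id: str, signals: List[EvidenceSignal]) -> Verdict:
    support = sum(s.reliability for s in signals if s.supports)
    oppose = sum(s.reliability for s in signals if not s.supports)
    total = support + oppose
    if total == 0:
        return Verdict(scenario_id, (0.0, 1.0),
                       reasoning=["no usable evidence"],
                       uncertainty_factors=["empty evidence set"])
    point = support / total
    spread = 1.0 / (1.0 + total)          # less evidence → wider likelihood range
    low, high = max(0.0, point - spread), min(1.0, point + spread)
    reasoning = [f"{s.evidence_id}: {'supports' if s.supports else 'contradicts'} "
                 f"(reliability {s.reliability:.2f})" for s in signals]
    return Verdict(scenario_id, (low, high), reasoning=reasoning)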


Re-evaluation Engine

Responsibilities:

  • Listen for upstream changes 
  • Trigger partial or full recomputation 
  • Update verdicts + summary views 
  • Maintain consistency across federated nodes 

Triggers include:

  • Evidence updated or retracted 
  • Scenario definition or assumption changes 
  • Claim type or evaluability changes 
  • Contradiction detection 
  • Federation sync updates 

Flow:  
Trigger → Impact Analysis → Recompute → Publish Update
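
A sketch of the trigger dispatch, with event names mirroring the trigger list above; the partial/full scoping rules are assumptions made for illustration:

# Sketch of the re-evaluation trigger path.
from dataclasses import dataclass
from typing import Callable, Dict, List

RECOMPUTE_SCOPE: Dict[str, str] = {
    "evidence_updated": "partial",       # only verdicts citing that evidence
    "evidence_retracted": "partial",
    "scenario_changed": "partial",
    "claim_reclassified": "full",
    "contradiction_detected": "full",
    "federation_sync": "partial",
}


@dataclass
class ChangeEvent:
    kind: str                # one of RECOMPUTE_SCOPE's keys
    entity_id: str


def handle_change(event: ChangeEvent,
                  affected_verdicts: Callable[[str], List[str]],
                  recompute: Callable[[str], None]) -> List[str]:
    """Trigger → Impact Analysis → Recompute → Publish Update."""
    scope = RECOMPUTE_SCOPE.get(event.kind, "partial")
    targets = affected_verdicts(event.entity_id) if scope == "partial" else ["*"]
    for verdict_id in targets:
        recompute(verdict_id)            # a worker would publish the updated verdict
    return targets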


AKEL Integration Summary

AKEL is fully documented in its own chapter; this section gives only the architectural integration summary:

  • Receives raw input for claims 
  • Proposes scenario drafts 
  • Extracts and summarizes evidence 
  • Gives reliability hints 
  • Suggests draft verdicts 
  • Monitors contradictions 
  • Syncs metadata with trusted nodes 

AKEL runs in parallel with human review and never overrides it.
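
A sketch of that draft/publish boundary: AKEL output lands in a review queue and only a human action publishes it. All names here are illustrative:

# Sketch of the AKEL draft/publish separation.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Draft:
    kind: str                 # "claim", "scenario", "evidence_summary", "verdict"
    payload: dict
    author_type: str = "AI"   # AKEL never produces a published entity directly


@dataclass
class ReviewQueue:
    pending: List[Draft] = field(default_factory=list)
    published: List[Draft] = field(default_factory=list)

    def submit(self, draft: Draft) -> None:                 # called by AKEL workers
        self.pending.append(draft)

    def approve(self, index: int, reviewer: str) -> Draft:  # human-only path
        draft = self.pending.pop(index)
        draft.payload["approved_by"] = reviewer
        self.published.append(draft)
        return draft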


Federated Architecture

Each FactHarbor node:

  • Has its own dataset (claims, scenarios, evidence, verdicts) 
  • Runs its own AKEL 
  • Maintains local governance and reviewer rules 
  • May partially mirror global or domain-specific data 
  • Contributes to global knowledge clusters 

Nodes synchronize via:

  • Signed version bundles 
  • Merkle-tree lineage structures 
  • Optionally IPFS for evidence 
  • Trust-weighted acceptance 
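
A sketch of the Merkle-style lineage comparison, using plain SHA-256 pairwise hashing; the exact tree layout and wire format are not specified here:

# Sketch: Merkle root over a node's version lineage, so peers can detect
# divergence cheaply before exchanging full bundles.
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(version_ids: List[str]) -> str:
    """Pairwise-hash version IDs until a single root remains."""
    if not version_ids:
        return _h(b"").hex()
    level = [_h(v.encode()) for v in version_ids]
    while len(level) > 1:
        if len(level) % 2:                      # duplicate last node on odd levels
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


# Two nodes comparing roots can tell instantly whether their lineages match:
# merkle_root(local_ids) == merkle_root(remote_ids)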

Benefits:

  • Community independence 
  • Scalability 
  • Resilience 
  • Domain specialization 

Request → Verdict Flow

Simple end-to-end flow:

User → UI Frontend → REST API → FactHarbor Core
      → (Claim Processing → Scenario Engine → Evidence Repository → Verdict Engine)
      → Summary View → UI Frontend → User


Federation Sync Workflow

Sequence:

Detect Local Change → Build Signed Bundle → Push to Peers → Validate Signature → Merge or Fork → Trigger Re-evaluation
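
A sketch of the build/validate steps, using Ed25519 from the third-party cryptography package as one possible signature scheme; the document does not mandate a specific one:

# Sketch: Build Signed Bundle → Push to Peers → Validate Signature.
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey,
)


def build_bundle(changes: list, private_key: Ed25519PrivateKey) -> dict:
    payload = json.dumps(changes, sort_keys=True).encode()
    return {"payload": payload, "signature": private_key.sign(payload)}


def accept_bundle(bundle: dict, sender_key: Ed25519PublicKey) -> list:
    """Validate before merging; raises InvalidSignature on a bad bundle."""
    sender_key.verify(bundle["signature"], bundle["payload"])
    return json.loads(bundle["payload"])        # merge or fork happens after this


# Example round trip between two nodes:
key = Ed25519PrivateKey.generate()
bundle = build_bundle([{"entity": "verdict", "version": 42}], key)
assert accept_bundle(bundle, key.public_key()) == [{"entity": "verdict", "version": 42}]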


Versioning Architecture

All entities (Claim, Scenario, Evidence, Verdict) use immutable version chains:

  • VersionID 
  • ParentVersionID 
  • Timestamp 
  • AuthorType (Human, AI, ExternalNode) 
  • ChangeReason 
  • Signature (optional in POC, required in 1.0)
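
A sketch of such a chain as append-only records carrying exactly these fields; the frozen-dataclass representation stands in for immutable database rows and is an assumption:

# Sketch of an immutable version chain.
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    version_id: str
    parent_version_id: Optional[str]      # None only for the first version
    timestamp: datetime
    author_type: str                      # "Human", "AI", or "ExternalNode"
    change_reason: str
    signature: Optional[bytes] = None     # optional in POC, required in 1.0


def append_version(chain: List[Version], new: Version) -> List[Version]:
    """Chains are append-only: every new version must point at the current head."""
    if chain and new.parent_version_id != chain[-1].version_id:
        raise ValueError("version chain fork: parent does not match head")
    return chain + [new]


root = Version("v1", None, datetime.now(timezone.utc), "Human", "initial claim")
chain = append_version([], root)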