Architecture

FactHarbor uses a modular-monolith architecture (POC → Beta 0) designed to evolve into a distributed, federated, multi-node system (Release 1.0+).
Modules are strongly separated, versioned, and auditable. All logic is transparent and deterministic.


High-Level System Architecture

FactHarbor is composed of the following major modules:

  • UI Frontend
  • REST API Layer
  • Core Logic Layer
    • Claim Processing  
    • Scenario Engine  
    • Evidence Repository  
    • Verdict Engine  
    • Re-evaluation Engine  
    • Roles / Identity / Reputation
  • AKEL (AI Knowledge Extraction Layer)
  • Federation Layer
  • Workers & Background Jobs
  • Storage Layer (Postgres + VectorDB + ObjectStore)

Key ideas:

  • Core logic is deterministic, auditable, and versioned  
  • AKEL drafts structured outputs but never publishes directly  
  • Workers run long or asynchronous tasks  
  • Storage is separated for scalability and clarity  
  • Federation Layer provides optional distributed operation  

Storage Architecture

FactHarbor separates structured data, embeddings, and evidence files:

  • PostgreSQL — canonical structured entities, all versioning, lineage, signatures  
  • Vector DB (Qdrant or pgvector) — semantic search, duplication detection, cluster mapping  
  • Object Storage — PDFs, datasets, raw evidence, transcripts  
  • Optional (Release 1.0): Redis for caching, IPFS for decentralized object storage  
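
As a rough illustration of this split, the sketch below fans a single evidence submission out across the three stores. The interfaces and names (RelationalStore, VectorStore, ObjectStore, store_evidence) are assumptions made for the example, not the actual FactHarbor API.

# Sketch of the storage split described above; all interfaces are assumptions.
import hashlib
from typing import Protocol, Sequence


class RelationalStore(Protocol):      # PostgreSQL: canonical entities, versions, lineage
    def insert_evidence_row(self, evidence_id: str, metadata: dict) -> None: ...


class VectorStore(Protocol):          # Qdrant or pgvector: embeddings for search and dedup
    def upsert(self, evidence_id: str, embedding: Sequence[float]) -> None: ...


class ObjectStore(Protocol):          # S3-style store: raw files (PDFs, datasets, transcripts)
    def put(self, key: str, data: bytes) -> None: ...


def store_evidence(db: RelationalStore, vectors: VectorStore, blobs: ObjectStore,
                   metadata: dict, file_bytes: bytes, embedding: Sequence[float]) -> str:
    """Fan a single evidence submission out to the three stores."""
    evidence_id = hashlib.sha256(file_bytes).hexdigest()[:16]   # content-addressed id (assumption)
    blobs.put(f"evidence/{evidence_id}", file_bytes)            # raw file -> object storage
    db.insert_evidence_row(evidence_id, metadata)               # structured metadata -> Postgres
    vectors.upsert(evidence_id, embedding)                      # embedding -> vector DB
    return evidence_id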

Core Backend Module Architecture

Each module has a clear responsibility and versioned boundaries, so that individual modules can later be extracted into microservices.
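
One way to keep those boundaries explicit inside a modular monolith is to let the core depend only on module interfaces, so an in-process implementation can later be replaced by a remote service client. The sketch below assumes Python; the names (Claim, ClaimProcessing, VerdictEngine, FactHarborCore) are illustrative, not prescribed.

# Minimal sketch: module boundaries expressed as Protocols so the in-process
# implementation can later be swapped for a microservice client. All names
# below are illustrative assumptions.
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Claim:
    claim_id: str
    text: str
    version_id: str


class ClaimProcessing(Protocol):
    def ingest(self, raw_text: str) -> Claim: ...
    def classify(self, claim: Claim) -> str: ...


class VerdictEngine(Protocol):
    def evaluate(self, claim: Claim, scenario_id: str) -> dict: ...


class FactHarborCore:
    """Wires modules together through their interfaces only."""

    def __init__(self, claims: ClaimProcessing, verdicts: VerdictEngine) -> None:
        self.claims = claims
        self.verdicts = verdicts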

Claim Processing Module

Responsibilities:

  • Ingest text, URLs, documents, transcripts, federated input  
  • Extract claims (AKEL-assisted)  
  • Normalize structure  
  • Classify (type, domain, evaluability, safety)  
  • Deduplicate via embeddings  
  • Assign to claim clusters  

Flow:  
Ingest → Normalize → Classify → Deduplicate → Cluster
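
A minimal sketch of this flow follows, with embedding-based deduplication as the only non-trivial step; the helper callables, the similarity threshold, and the field names are assumptions, and cluster assignment is left as a stub.

# Illustrative pipeline following the flow above; thresholds and helpers are assumptions.
import math
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ClaimDraft:
    text: str
    claim_type: str = "unclassified"
    embedding: list[float] = field(default_factory=list)
    cluster_id: str | None = None


def normalize(text: str) -> str:
    return " ".join(text.split()).strip()


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / denom if denom else 0.0


def process_claim(raw_text: str,
                  embed: Callable[[str], list[float]],
                  classify: Callable[[str], str],
                  existing: list[ClaimDraft],
                  dedup_threshold: float = 0.92) -> ClaimDraft | None:
    """Ingest -> Normalize -> Classify -> Deduplicate -> Cluster (cluster step stubbed)."""
    draft = ClaimDraft(text=normalize(raw_text))          # Ingest + Normalize
    draft.claim_type = classify(draft.text)               # Classify (AKEL-assisted in practice)
    draft.embedding = embed(draft.text)                   # embedding for dedup and clustering
    for other in existing:                                # Deduplicate via embeddings
        if other.embedding and cosine(draft.embedding, other.embedding) >= dedup_threshold:
            return None                                   # near-duplicate: no new claim created
    draft.cluster_id = None                               # cluster assignment happens downstream
    return draft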


Scenario Engine

Responsibilities:

  • Create and validate scenarios  
  • Enforce required fields (definitions, assumptions, boundaries...)  
  • Perform safety checks (AKEL-assisted)  
  • Manage versioning and lifecycle  
  • Provide contextual evaluation settings to the Verdict Engine  

Flow:  
Create → Validate → Version → Lifecycle → Safety
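
The sketch below illustrates required-field enforcement and version advancement under these rules; any field names beyond definitions, assumptions, and boundaries, and the draft/active lifecycle states, are assumptions.

# Sketch of scenario validation before a new version is accepted; lifecycle states are assumed.
from dataclasses import dataclass


@dataclass
class Scenario:
    scenario_id: str
    definitions: str = ""
    assumptions: str = ""
    boundaries: str = ""
    version: int = 1
    status: str = "draft"          # draft -> active -> archived (lifecycle assumption)


REQUIRED_FIELDS = ("definitions", "assumptions", "boundaries")


def validate_scenario(scenario: Scenario) -> list[str]:
    """Return a list of validation errors; an empty list means the scenario may advance."""
    return [f"missing required field: {name}"
            for name in REQUIRED_FIELDS if not getattr(scenario, name).strip()]


def advance_version(scenario: Scenario, safety_ok: bool) -> Scenario:
    """Create the next version only if validation and the safety check both pass."""
    if validate_scenario(scenario) or not safety_ok:
        raise ValueError("scenario failed validation or safety check")
    return Scenario(scenario_id=scenario.scenario_id,
                    definitions=scenario.definitions,
                    assumptions=scenario.assumptions,
                    boundaries=scenario.boundaries,
                    version=scenario.version + 1,
                    status="active")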


Evidence Repository

Responsibilities:

  • Store metadata + files (object store)  
  • Classify evidence  
  • Compute preliminary reliability  
  • Maintain version history  
  • Detect retractions or disputes  
  • Provide structured metadata to the Verdict Engine  

Flow:  
Store → Classify → Score → Version → Update/Retract
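
One possible shape for preliminary reliability scoring with retraction and dispute handling is sketched below; the source-type taxonomy and the weights are illustrative assumptions, not governance policy.

# Sketch of an evidence record with preliminary reliability scoring; weights are assumptions.
from dataclasses import dataclass


@dataclass
class EvidenceRecord:
    evidence_id: str
    source_type: str          # e.g. "peer_reviewed", "news", "blog" (assumed taxonomy)
    retracted: bool = False
    disputed: bool = False
    reliability: float = 0.0


# Assumed base scores per source type; real values would come from governance rules.
BASE_RELIABILITY = {"peer_reviewed": 0.9, "official_statistics": 0.85, "news": 0.6, "blog": 0.3}


def score_reliability(record: EvidenceRecord) -> float:
    score = BASE_RELIABILITY.get(record.source_type, 0.4)
    if record.disputed:
        score *= 0.5              # disputes halve the preliminary score (assumption)
    if record.retracted:
        score = 0.0               # retracted evidence carries no weight
    record.reliability = round(score, 2)
    return record.reliability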


Verdict Engine

Responsibilities:

  • Aggregate scenario-linked evidence  
  • Compute likelihood ranges per scenario
  • Generate reasoning chain  
  • Track uncertainty factors  
  • Maintain verdict version timelines  

Flow:  
Aggregate → Compute → Explain → Version → Timeline
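
The sketch below shows one way scenario-linked evidence could be aggregated into a likelihood range with an attached reasoning chain; the weighting formula and the uncertainty margin are assumptions standing in for the actual algorithm.

# Sketch: reliability-weighted support becomes a likelihood range whose width
# reflects remaining uncertainty. The formula is an illustrative assumption.
from dataclasses import dataclass


@dataclass
class LinkedEvidence:
    evidence_id: str
    supports: bool            # True = supports the claim under this scenario
    reliability: float        # 0.0 .. 1.0


def compute_likelihood_range(evidence: list[LinkedEvidence]) -> tuple[float, float, list[str]]:
    """Return (low, high, reasoning_chain) for one scenario."""
    total = sum(e.reliability for e in evidence)
    if total == 0:
        return (0.0, 1.0, ["no usable evidence: full uncertainty"])
    support = sum(e.reliability for e in evidence if e.supports)
    point = support / total                         # weighted share of supporting evidence
    margin = 0.5 / (1 + total)                      # less evidence -> wider range (assumption)
    low, high = max(0.0, point - margin), min(1.0, point + margin)
    reasoning = [f"{e.evidence_id}: {'supports' if e.supports else 'contradicts'} "
                 f"(reliability {e.reliability:.2f})" for e in evidence]
    return (round(low, 2), round(high, 2), reasoning)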


Re-evaluation Engine

Responsibilities:

  • Listen for upstream changes  
  • Trigger partial or full recomputation  
  • Update verdicts + summary views  
  • Maintain consistency across federated nodes  

Triggers include:

  • Evidence updated or retracted  
  • Scenario definition or assumption changes  
  • Claim type or evaluability changes  
  • Contradiction detection  
  • Federation sync updates  

Flow:  
Trigger → Impact Analysis → Recompute → Publish Update
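
A compact sketch of that loop follows: an incoming change event is checked against the trigger list, impacted verdicts are looked up, recomputed, and republished. The event names and the dependency-lookup callable are assumptions.

# Sketch of the trigger -> impact analysis -> recompute -> publish loop.
from dataclasses import dataclass
from typing import Callable


TRIGGER_EVENTS = {"evidence_updated", "evidence_retracted", "scenario_changed",
                  "claim_reclassified", "contradiction_detected", "federation_sync"}


@dataclass
class ChangeEvent:
    kind: str
    entity_id: str


def handle_change(event: ChangeEvent,
                  affected_verdicts: Callable[[str], list[str]],
                  recompute: Callable[[str], None],
                  publish_update: Callable[[str], None]) -> int:
    """Return the number of verdicts recomputed for this event."""
    if event.kind not in TRIGGER_EVENTS:
        return 0
    verdict_ids = affected_verdicts(event.entity_id)   # impact analysis: dependent verdicts
    for verdict_id in verdict_ids:                     # partial recomputation, one verdict at a time
        recompute(verdict_id)
        publish_update(verdict_id)                     # refresh summary views / notify peers
    return len(verdict_ids)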


AKEL Integration Summary

AKEL is fully documented in its own chapter; this section summarizes only its architectural integration:

  • Receives raw input for claims  
  • Proposes scenario drafts  
  • Extracts and summarizes evidence  
  • Gives reliability hints  
  • Suggests draft verdicts  
  • Monitors contradictions  
  • Syncs metadata with trusted nodes  

AKEL runs in parallel to human review — never overrides it.
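
The "drafts only" boundary can be made structural by wrapping every AKEL output in a proposal object that only a human review step can accept or reject, as in the sketch below; class and field names are assumptions.

# Sketch of the drafts-only boundary: AKEL output stays pending until a human decides.
from dataclasses import dataclass


@dataclass
class AkelProposal:
    target_entity_id: str
    payload: dict                 # e.g. a draft verdict or scenario draft
    status: str = "pending"       # pending -> accepted / rejected, always by a human


def review(proposal: AkelProposal, reviewer_id: str, accept: bool) -> AkelProposal:
    """Only a human review decision changes proposal status; AKEL never calls this."""
    proposal.status = "accepted" if accept else "rejected"
    proposal.payload = {**proposal.payload, "reviewed_by": reviewer_id}
    return proposal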


Federated Architecture

Each FactHarbor node:

  • Has its own dataset (claims, scenarios, evidence, verdicts)  
  • Runs its own AKEL  
  • Maintains local governance and reviewer rules  
  • May partially mirror global or domain-specific data  
  • Contributes to global knowledge clusters  

Nodes synchronize via:

  • Signed version bundles  
  • Merkle-tree lineage structures  
  • Optionally IPFS for evidence  
  • Trust-weighted acceptance  
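
The sketch below combines a signed version bundle with trust-weighted acceptance. For brevity it uses an HMAC over a canonicalized payload as a stand-in for a real node signature (which would be asymmetric, e.g. Ed25519); the trust threshold and field names are assumptions.

# Sketch of signed version bundles plus trust-weighted acceptance (HMAC as a stand-in).
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass
class VersionBundle:
    origin_node: str
    entries: list[dict]           # exported entity versions
    signature: str = ""


def canonical_payload(bundle: VersionBundle) -> bytes:
    return json.dumps({"origin": bundle.origin_node, "entries": bundle.entries},
                      sort_keys=True).encode()


def sign_bundle(bundle: VersionBundle, node_key: bytes) -> VersionBundle:
    bundle.signature = hmac.new(node_key, canonical_payload(bundle), hashlib.sha256).hexdigest()
    return bundle


def accept_bundle(bundle: VersionBundle, node_key: bytes,
                  peer_trust: dict[str, float], trust_threshold: float = 0.5) -> bool:
    """Verify the signature, then apply trust-weighted acceptance."""
    expected = hmac.new(node_key, canonical_payload(bundle), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, bundle.signature):
        return False                                    # invalid signature: reject outright
    return peer_trust.get(bundle.origin_node, 0.0) >= trust_threshold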

Benefits:

  • Community independence  
  • Scalability  
  • Resilience  
  • Domain specialization  

Request → Verdict Flow

Simple end-to-end flow:

User → UI Frontend → REST API → FactHarbor Core
      → (Claim Processing → Scenario Engine → Evidence Repository → Verdict Engine)
      → Summary View → UI Frontend → User


Federation Sync Workflow

Sequence:

Detect Local Change → Build Signed Bundle → Push to Peers → Validate Signature → Merge or Fork → Trigger Re-evaluation
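
On the receiving side, the merge-or-fork decision might look like the sketch below: an incoming version is merged when it extends the local head, otherwise recorded as a fork, and re-evaluation is triggered either way. The head-comparison rule and the helper callables are assumptions.

# Sketch of the merge-or-fork step after signature validation has already passed.
from typing import Callable


def apply_incoming_version(entry: dict,
                           local_head: Callable[[str], str | None],
                           merge: Callable[[dict], None],
                           record_fork: Callable[[dict], None],
                           trigger_reevaluation: Callable[[str], None]) -> str:
    entity_id = entry["entity_id"]
    if entry["parent_version_id"] == local_head(entity_id):
        merge(entry)              # fast-forward: incoming version extends the local chain
        outcome = "merged"
    else:
        record_fork(entry)        # divergent history: keep both branches for later resolution
        outcome = "forked"
    trigger_reevaluation(entity_id)
    return outcome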


Versioning Architecture

All entities (Claim, Scenario, Evidence, Verdict) use immutable version chains:

  • VersionID  
  • ParentVersionID  
  • Timestamp  
  • AuthorType (Human, AI, ExternalNode)  
  • ChangeReason  
  • Signature (optional in the POC, required in 1.0)
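
A minimal sketch of one link in such a chain is given below, including the content hash that a signature would cover; the hashing scheme and field defaults are assumptions.

# Sketch of an immutable version-chain entry; hashing scheme is an assumption.
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class VersionRecord:
    version_id: str
    parent_version_id: str | None
    author_type: str                       # "Human", "AI", or "ExternalNode"
    change_reason: str
    payload: dict
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    signature: str = ""                    # optional in the POC, required in 1.0

    def content_hash(self) -> str:
        """Stable hash over parent link, payload, and change reason (signature target)."""
        body = json.dumps({"parent": self.parent_version_id, "payload": self.payload,
                           "reason": self.change_reason}, sort_keys=True)
        return hashlib.sha256(body.encode()).hexdigest()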