Architecture

FactHarbor uses a modular-monolith architecture (POC → Beta 0) designed to evolve into a distributed, federated, multi-node system (Release 1.0+).
Modules are strictly separated, versioned, and auditable, and all logic is transparent and deterministic.


High-Level System Architecture

FactHarbor is composed of the following major modules:

  • UI Frontend
  • REST API Layer
  • Core Logic Layer
    • Claim Processing  
    • Scenario Engine  
    • Evidence Repository  
    • Verdict Engine  
    • Re-evaluation Engine  
    • Roles / Identity / Reputation
  • AKEL (AI Knowledge Extraction Layer)
  • Federation Layer
  • Workers & Background Jobs
  • Storage Layer (Postgres + VectorDB + ObjectStore)

High-Level Architecture

graph TD
    User((User)) <--> UI[UI Frontend]
    UI <--> API[REST API]
    
    subgraph Backend
        API <--> Core["FactHarbor Core<br/>- Claims/Scenarios<br/>- Evidence/Verdicts<br/>- Identity"]
        
        Core <--> AKEL[AKEL - AI Layer]
        Core <--> Workers[Background Workers]
        Core <--> Fed[Federation Layer]
    end

    subgraph Storage
        Core --> PG[(PostgreSQL)]
        Core --> Vector[(Vector DB)]
        Core --> Obj[(Object Store)]
    end

    Fed -.-> OtherNodes((Other Nodes))
    AKEL -.-> Storage
    Workers -.-> Storage

Key ideas:

  • Core logic is deterministic, auditable, and versioned  
  • AKEL drafts structured outputs but never publishes directly  
  • Workers run long or asynchronous tasks  
  • Storage is separated for scalability and clarity  
  • Federation Layer provides optional distributed operation  

Storage Architecture

FactHarbor separates structured data, embeddings, and evidence files:

  • PostgreSQL — canonical structured entities, all versioning, lineage, signatures  
  • Vector DB (Qdrant or pgvector) — semantic search, duplication detection, cluster mapping  
  • Object Storage — PDFs, datasets, raw evidence, transcripts  
  • Optional (Release 1.0): Redis for caching, IPFS for decentralized object storage  
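
A minimal sketch of how the storage layer interface could route one claim version across the three stores. The class, the method, and the in-memory stand-ins are illustrative only, not the actual FactHarbor interface:

from dataclasses import dataclass, field
from typing import Any

@dataclass
class StorageLayer:
    # Toy in-memory stand-ins for PostgreSQL, the vector DB, and the object store.
    pg_rows: list[dict[str, Any]] = field(default_factory=list)
    vectors: dict[str, list[float]] = field(default_factory=dict)
    objects: dict[str, bytes] = field(default_factory=dict)

    def save_claim_version(self, claim: dict[str, Any],
                           embedding: list[float], raw_file: bytes) -> None:
        vid = claim["version_id"]
        self.pg_rows.append(claim)                       # canonical entity + lineage -> PostgreSQL
        self.vectors[vid] = embedding                    # semantic search / dedup -> vector DB
        self.objects[f"claims/{vid}/source"] = raw_file  # raw evidence file -> object store

store = StorageLayer()
store.save_claim_version({"version_id": "c1-v1", "text": "..."}, [0.1, 0.2], b"%PDF-")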

Storage Architecture

graph TD
    App[Storage Layer Interface] --> PG[("PostgreSQL<br/>Core Entities & Versioning")]
    App --> Vec[("Vector DB<br/>Embeddings & Search")]
    App --> Obj[("Object Storage<br/>Docs & Datasets")]
    
    App -.-> Cache[(Redis - Optional)]
    App -.-> IPFS((IPFS - Optional))

Core Backend Module Architecture

Each module has a clear responsibility and versioned boundaries to allow future extraction into microservices.

Claim Processing Module

Responsibilities:

  • Ingest text, URLs, documents, transcripts, federated input  
  • Extract claims (AKEL-assisted)  
  • Normalize structure  
  • Classify (type, domain, evaluability, safety)  
  • Deduplicate via embeddings  
  • Assign to claim clusters  

Flow:  
Ingest → Normalize → Classify → Deduplicate → Cluster
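
A toy sketch of this flow in code; the classification rule and the hash-based "embedding" are stand-ins for the real AKEL-assisted steps:

import hashlib

SEEN: dict[str, dict] = {}   # embedding key -> previously ingested claim

def embed(text: str) -> str:
    # Stand-in for a real embedding: a content hash is enough to demonstrate dedup.
    return hashlib.sha256(text.lower().encode()).hexdigest()

def process_claim(raw: str) -> dict:
    text = " ".join(raw.split())                 # Ingest -> Normalize
    claim = {"text": text,                       # Classify (toy rule; really AKEL-assisted)
             "type": "quantitative" if any(c.isdigit() for c in text) else "qualitative"}
    key = embed(text)
    if key in SEEN:                              # Deduplicate via embeddings
        return SEEN[key]
    claim["cluster"] = key[:8]                   # Assign to claim cluster (toy: hash prefix)
    SEEN[key] = claim
    return claim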


Scenario Engine

Responsibilities:

  • Create and validate scenarios  
  • Enforce required fields (definitions, assumptions, boundaries...)  
  • Perform safety checks (AKEL-assisted)  
  • Manage versioning and lifecycle  
  • Provide contextual evaluation settings to the Verdict Engine  

Flow:  
Create → Validate → Version → Lifecycle → Safety
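
A minimal sketch of the required-field check, assuming a simplified Scenario shape (the real schema is richer than these three fields):

from dataclasses import dataclass

REQUIRED_FIELDS = ("definitions", "assumptions", "boundaries")

@dataclass
class Scenario:
    definitions: str = ""
    assumptions: str = ""
    boundaries: str = ""
    version: int = 1

def missing_fields(s: Scenario) -> list[str]:
    # A scenario passes validation only when every required field is filled in.
    return [f for f in REQUIRED_FIELDS if not getattr(s, f).strip()]

draft = Scenario(definitions="Unemployment as defined by the ILO")
print(missing_fields(draft))   # ['assumptions', 'boundaries']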


Evidence Repository

Responsibilities:

  • Store metadata + files (object store)  
  • Classify evidence  
  • Compute preliminary reliability  
  • Maintain version history  
  • Detect retractions or disputes  
  • Provide structured metadata to the Verdict Engine  

Flow:  
Store → Classify → Score → Version → Update/Retract
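
An illustrative sketch of preliminary reliability scoring; the source types and base scores are invented for the example, and the real weighting scheme belongs to the repository:

from dataclasses import dataclass

@dataclass
class Evidence:
    source_type: str          # e.g. "peer_reviewed", "news", "blog"
    retracted: bool = False
    version: int = 1

# Invented base scores for illustration.
BASE_RELIABILITY = {"peer_reviewed": 0.9, "news": 0.6, "blog": 0.3}

def preliminary_reliability(e: Evidence) -> float:
    # A detected retraction zeroes the score, which triggers re-evaluation downstream.
    return 0.0 if e.retracted else BASE_RELIABILITY.get(e.source_type, 0.5)

print(preliminary_reliability(Evidence("peer_reviewed")))   # 0.9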


Verdict Engine

Responsibilities:

  • Aggregate scenario-linked evidence  
  • Compute likelihood ranges per scenario
  • Generate reasoning chain  
  • Track uncertainty factors  
  • Maintain verdict version timelines  

Flow:  
Aggregate → Compute → Explain → Version → Timeline
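
A toy sketch of the likelihood-range computation: each piece of evidence carries a support probability and a reliability weight, and the range tightens as total reliability grows. The aggregation formula here is invented for illustration:

def likelihood_range(evidence: list[tuple[float, float]]) -> tuple[float, float]:
    # Each item is (support_probability, reliability). The midpoint is the
    # reliability-weighted mean; the spread shrinks as total reliability grows,
    # which is how tracked uncertainty shows up in the published range.
    if not evidence:
        return (0.0, 1.0)                        # no evidence: maximal uncertainty
    total_w = sum(w for _, w in evidence)
    mean = sum(p * w for p, w in evidence) / total_w
    spread = 0.5 / (1.0 + total_w)
    return (max(0.0, mean - spread), min(1.0, mean + spread))

print(likelihood_range([(0.8, 0.9), (0.7, 0.6)]))   # ≈ (0.56, 0.96)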


Re-evaluation Engine

Responsibilities:

  • Listen for upstream changes  
  • Trigger partial or full recomputation  
  • Update verdicts + summary views  
  • Maintain consistency across federated nodes  

Triggers include:

  • Evidence updated or retracted  
  • Scenario definition or assumption changes  
  • Claim type or evaluability changes  
  • Contradiction detection  
  • Federation sync updates  

Flow:  
Trigger → Impact Analysis → Recompute → Publish Update
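
A sketch of trigger dispatch with a static impact table; the trigger-to-artifact mapping is invented here, whereas the real engine inspects the actual dependency graph:

from enum import Enum, auto

class Trigger(Enum):
    EVIDENCE_CHANGED = auto()
    SCENARIO_CHANGED = auto()
    CLAIM_CHANGED = auto()
    CONTRADICTION = auto()
    FEDERATION_SYNC = auto()

# Invented mapping: which downstream artifacts each trigger invalidates.
IMPACT = {
    Trigger.EVIDENCE_CHANGED: ["verdict"],
    Trigger.SCENARIO_CHANGED: ["verdict", "summary"],
    Trigger.CLAIM_CHANGED:    ["scenarios", "verdict", "summary"],
    Trigger.CONTRADICTION:    ["verdict", "summary"],
    Trigger.FEDERATION_SYNC:  ["verdict", "summary"],
}

def reevaluate(trigger: Trigger) -> list[str]:
    targets = IMPACT[trigger]                # Impact Analysis
    for t in targets:
        print(f"recomputing {t}")            # Recompute (stub)
    return targets                           # a Publish Update step would follow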


AKEL Integration Summary

AKEL is fully documented in its own chapter; this section summarizes only its architectural integration:

  • Receives raw input for claims  
  • Proposes scenario drafts  
  • Extracts and summarizes evidence  
  • Provides reliability hints  
  • Suggests draft verdicts  
  • Monitors contradictions  
  • Syncs metadata with trusted nodes  

AKEL runs in parallel to human review and never overrides it.
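
A guardrail sketch of that rule, assuming a simplified Draft record (all names are hypothetical):

from dataclasses import dataclass
from typing import Optional

@dataclass
class Draft:
    kind: str                          # "claim", "scenario", "evidence_summary", "verdict"
    payload: dict
    author_type: str = "AI"            # AKEL output is always marked AI-authored
    approved_by: Optional[str] = None  # reviewer identity, set during human review

def publish(d: Draft) -> dict:
    # Guardrail: AI-authored drafts never reach the public record without
    # an explicit human approval attached.
    if d.author_type == "AI" and d.approved_by is None:
        raise PermissionError("AKEL drafts require human approval before publishing")
    return {**d.payload, "author_type": d.author_type, "approved_by": d.approved_by}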

AKEL Architecture

graph TD
    Ext[External/Local LLM APIs] --> Orch[AKEL Orchestrator]

    subgraph Pipeline
        Orch --> ClaimProc[Claim Processor]
        Orch --> ScenGen[Scenario Generator]
        Orch --> EvidMiner[Evidence Miner]
        Orch --> Verdict[Draft Verdict Module]
        Orch --> Explain[Explanation Engine]
        
        ClaimProc -->|Normalization| ScenGen
        ScenGen -->|Context| EvidMiner
        EvidMiner -->|Facts| Verdict
        Verdict -->|Likelihood| Explain
    end

    subgraph Storage
        Vectors[(Vector DB / Embeddings)]
        CoreDB[(FactHarbor Core DB)]
    end

    ClaimProc -.-> Vectors
    ScenGen -.-> Vectors
    EvidMiner -.-> Vectors

    Verdict --> CoreDB
    Explain --> CoreDB

Federated Architecture

Each FactHarbor node:

  • Has its own dataset (claims, scenarios, evidence, verdicts)  
  • Runs its own AKEL  
  • Maintains local governance and reviewer rules  
  • May partially mirror global or domain-specific data  
  • Contributes to global knowledge clusters  

Nodes synchronize via:

  • Signed version bundles  
  • Merkle-tree lineage structures  
  • Optionally IPFS for evidence  
  • Trust-weighted acceptance  
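
A sketch of building and verifying such a bundle. An HMAC and a flat content hash stand in for the asymmetric signatures and Merkle-tree lineage a real node would use:

import hashlib, hmac, json

NODE_KEY = b"local-node-secret"   # stand-in; a real node uses an asymmetric keypair

def build_bundle(versions: list[dict]) -> dict:
    body = json.dumps(versions, sort_keys=True).encode()
    return {
        "versions": versions,
        "root": hashlib.sha256(body).hexdigest(),   # flat hash standing in for a Merkle root
        "signature": hmac.new(NODE_KEY, body, hashlib.sha256).hexdigest(),
    }

def verify_bundle(bundle: dict, peer_key: bytes) -> bool:
    body = json.dumps(bundle["versions"], sort_keys=True).encode()
    expected = hmac.new(peer_key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, bundle["signature"])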

Benefits:

  • Community independence  
  • Scalability  
  • Resilience  
  • Domain specialization  

Federation Architecture

This diagram shows the complete federated architecture; each node runs the same stack and connects through shared federation infrastructure.

graph TB
    subgraph Node_A[Node A]
        A_DB[(Database)]
        A_AKEL[AKEL Instance]
        A_Users[Users]
    end
    
    subgraph Node_B[Node B]
        B_DB[(Database)]
        B_AKEL[AKEL Instance]
        B_Users[Users]
    end
    
    subgraph Node_C[Node C]
        C_DB[(Database)]
        C_AKEL[AKEL Instance]
        C_Users[Users]
    end
    
    subgraph Federation_Layer[Federation Infrastructure]
        Sync["Sync Layer<br/>Version Bundles<br/>Signatures & Trust"]
        Storage[("IPFS / S3<br/>Evidence Files")]
    end

    Node_A <-->|Content Sync| Sync
    Node_B <-->|Content Sync| Sync
    Node_C <-->|Content Sync| Sync

    A_AKEL -.->|Knowledge Exchange| B_AKEL
    B_AKEL -.->|Knowledge Exchange| C_AKEL
    A_AKEL -.->|Knowledge Exchange| C_AKEL

    Sync -.->|Large Files| Storage
    Storage -.-> Node_A
    Storage -.-> Node_B
    Storage -.-> Node_C

Request → Verdict Flow

Simple end-to-end flow:

User → UI Frontend → REST API → FactHarbor Core
      → (Claim Processing → Scenario Engine → Evidence Repository → Verdict Engine)
      → Summary View → UI Frontend → User


Federation Sync Workflow

Sequence:

Detect Local Change → Build Signed Bundle → Push to Peers → Validate Signature → Merge or Fork → Trigger Re-evaluation
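
A sketch of the merge-or-fork decision once the signature has been validated (compare verify_bundle in the federation section above); the trust scores and threshold are illustrative:

def apply_remote_bundle(bundle: dict, local_head: str, peer_trust: float,
                        threshold: float = 0.5) -> str:
    if peer_trust < threshold:
        return "reject"                              # trust-weighted acceptance
    remote_parent = bundle["versions"][0]["parent_version_id"]
    if remote_parent == local_head:
        return "merge"                               # extends the local chain cleanly
    return "fork"                                    # divergent history: keep both lineages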


Versioning Architecture

All entities (Claim, Scenario, Evidence, Verdict) use immutable version chains:

  • VersionID  
  • ParentVersionID  
  • Timestamp  
  • AuthorType (Human, AI, ExternalNode)  
  • ChangeReason  
  • Signature (optional POC, required in 1.0)  
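
A minimal sketch of one such immutable version record; the constructor is illustrative:

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
import uuid

@dataclass(frozen=True)               # frozen: a version is immutable once written
class Version:
    version_id: str
    parent_version_id: Optional[str]  # None for the root of a chain
    timestamp: datetime
    author_type: str                  # "Human" | "AI" | "ExternalNode"
    change_reason: str
    signature: Optional[str]          # optional in POC, required in 1.0

def new_version(parent: Optional[Version], author_type: str, reason: str) -> Version:
    return Version(
        version_id=str(uuid.uuid4()),
        parent_version_id=parent.version_id if parent else None,
        timestamp=datetime.now(timezone.utc),
        author_type=author_type,
        change_reason=reason,
        signature=None,               # POC: unsigned; a 1.0 node would sign here
    )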

Versioning Architecture

This diagram shows how each entity type maintains its version history.

graph TD
    subgraph ClaimVersioning[Claim Versioning]
        C1[Claim v1]
        C2[Claim v2]
        C3[Claim v3]
        C1 --> C2
        C2 --> C3
    end
    
    subgraph ScenarioVersioning[Scenario Versioning]
        S1[Scenario v1]
        S2[Scenario v2]
        S1 --> S2
    end
    
    subgraph EvidenceVersioning[Evidence Versioning]
        E1[Evidence v1]
        E2[Evidence v2]
        E3[Evidence v3]
        E1 --> E2
        E2 --> E3
    end
    
    subgraph VerdictVersioning[Verdict Versioning]
        V1[Verdict v1]
        V2[Verdict v2]
        V3[Verdict v3]
        V1 --> V2
        V2 --> V3
    end
    
    ClaimVersioning -.->|has| ScenarioVersioning
    ScenarioVersioning -.->|uses| EvidenceVersioning
    ScenarioVersioning -.->|produces| VerdictVersioning
    
    Note["Each Version Stores:<br/>- VersionID<br/>- ParentVersionID<br/>- AuthorType<br/>- Timestamp<br/>- JustificationText"]
    style Note fill:#f9f9f9,stroke:#333,stroke-width:2px