AI Knowledge Extraction Layer (AKEL)

Version 3.1 by Robert Schaub on 2025/12/12 09:32

AKEL — AI Knowledge Extraction Layer

AKEL is FactHarbor’s automated intelligence subsystem.
Its purpose is to reduce human workload, improve consistency, and enable scalable knowledge processing without ever replacing human judgment.

All AKEL outputs are marked with AuthorType = AI and require human approval before publication.

AKEL operates in two modes:

  • Single-node mode (POC & Beta 0)
  • Federated multi-node mode (Release 1.0+)

Human reviewers, experts, and moderators always retain final authority.

Purpose and Role

AKEL transforms unstructured inputs into structured, review-ready drafts.

Core responsibilities:

  • Claim extraction from arbitrary text
  • Claim classification (domain, type, evaluability, safety)
  • Scenario generation (definitions, boundaries, assumptions, methodology)
  • Evidence summarization and metadata extraction
  • Contradiction detection
  • Re-evaluation proposal generation
  • Cross-node embedding exchange (Release 1.0+)

Components

  • AKEL Orchestrator – central coordinator
  • Claim Extractor
  • Claim Classifier
  • Scenario Generator
  • Evidence Summarizer
  • Contradiction Detector
  • Embedding Handler (Release 1.0+)
  • Federation Sync Adapter (Release 1.0+)

Inputs and Outputs

Inputs

  • User-submitted claims or evidence
  • Uploaded documents
  • URLs or citations
  • External LLM API (optional)
  • Embeddings (from local or federated peers)

Outputs (all require human approval)

  • ClaimVersion (draft)
  • ScenarioVersion (draft)
  • EvidenceVersion (summary + metadata draft)
  • VerdictVersion (draft; internal only)
  • Contradiction alerts
  • Re-evaluation proposals
  • Updated embeddings
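
The versioned draft outputs listed above can be modeled as a single record type in which the AI author marker and the draft status are always set. This is a minimal sketch: the field names (`kind`, `authorType`, `status`, `internalOnly`, `payload`) and the helper `makeAkelDraft` are illustrative assumptions, not FactHarbor's actual schema.

```typescript
// Hypothetical shapes for AKEL draft outputs; all names are illustrative.
type AuthorType = "AI" | "Human";

type DraftKind =
  | "ClaimVersion"
  | "ScenarioVersion"
  | "EvidenceVersion"
  | "VerdictVersion";

interface AkelDraft {
  kind: DraftKind;
  authorType: AuthorType; // always "AI" for AKEL output
  status: "draft";        // AKEL only ever emits drafts
  internalOnly: boolean;  // VerdictVersion drafts are internal only
  payload: unknown;
}

// Build a draft so that the AI marker and draft status cannot be omitted.
function makeAkelDraft(kind: DraftKind, payload: unknown): AkelDraft {
  return {
    kind,
    authorType: "AI",
    status: "draft",
    internalOnly: kind === "VerdictVersion",
    payload,
  };
}

const verdictDraft = makeAkelDraft("VerdictVersion", { note: "example" });
const claimDraft = makeAkelDraft("ClaimVersion", { text: "example claim" });
console.log(verdictDraft.authorType, verdictDraft.internalOnly);
```

Routing every output through one constructor keeps the "AuthorType = AI" rule in a single place instead of relying on each component to set it.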

Architecture Overview

Information

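Current implementation: Triple-Path Pipeline Architecture. Three pipeline variants draw on shared modules for AnalysisContext detection, aggregation, claim processing, evidence filtering, quality gates, verdict corrections, truth-scale mapping, budget tracking, and source reliability. Not every module is used by every variant; the Shared Modules table lists which variant uses which module.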

Updated 2026-02-08 per documentation audit report.

Triple-Path Pipeline Architecture


graph TB
    subgraph Input[User Input]
        URL[URL Input]
        TEXT[Text Input]
    end

    subgraph Shared[Shared Modules]
        CONTEXTS[analysis-contexts.ts Context Detection]
        AGG[aggregation.ts Verdict Aggregation]
        CLAIM_D[claim-decomposition.ts]
        EF[evidence-filter.ts ~330 lines]
        QG[quality-gates.ts ~410 lines]
        SR[source-reliability.ts ~620 lines]
        VC[verdict-corrections.ts ~310 lines]
        TS[truth-scale.ts ~280 lines]
        BU[budgets.ts ~250 lines]
    end

    subgraph Dispatch[Pipeline Dispatch]
        SELECT{Select Pipeline}
    end

    subgraph Pipelines[Pipeline Implementations]
        ORCH[Orchestrated Pipeline]
        CANON[Monolithic Canonical]
        DYN[Monolithic Dynamic]
    end

    subgraph LLM[LLM Layer]
        PROVIDER[AI SDK Provider]
    end

    subgraph Output[Result]
        RESULT[AnalysisResult JSON]
        REPORT[Markdown Report]
    end

    URL --> SELECT
    TEXT --> SELECT
    SELECT -->|orchestrated| ORCH
    SELECT -->|monolithic_canonical| CANON
    SELECT -->|monolithic_dynamic| DYN
    CONTEXTS --> ORCH
    CONTEXTS --> CANON
    AGG --> ORCH
    AGG --> CANON
    CLAIM_D --> ORCH
    CLAIM_D --> CANON
    EF --> ORCH
    QG --> ORCH
    SR --> ORCH
    SR --> CANON
    SR --> DYN
    VC --> ORCH
    TS --> CANON
    TS --> DYN
    BU --> ORCH
    BU --> CANON
    BU --> DYN
    ORCH --> PROVIDER
    CANON --> PROVIDER
    DYN --> PROVIDER
    ORCH --> RESULT
    CANON --> RESULT
    DYN --> RESULT
    RESULT --> REPORT

Pipeline Variants

 Variant               File                     Lines   Approach                                  Output Schema
 Orchestrated          orchestrated.ts          13,300  Multi-step workflow with explicit stages  Canonical (structured)
 Monolithic Canonical  monolithic-canonical.ts  1,500   Single LLM tool-loop call                 Canonical (structured)
 Monolithic Dynamic    monolithic-dynamic.ts    735     Single LLM tool-loop call                 Dynamic (flexible)
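
The dispatch step in the diagram selects one of three variants by key. The following sketch shows one way such a lookup could work. The variant keys and file names come from the diagram and table above; the `PipelineDescriptor` type and the `selectPipeline` function are hypothetical and do not reflect the actual dispatch code.

```typescript
// Variant keys as they appear on the dispatch edges of the diagram.
type PipelineVariant =
  | "orchestrated"
  | "monolithic_canonical"
  | "monolithic_dynamic";

// Hypothetical descriptor; mirrors the columns of the table above.
interface PipelineDescriptor {
  file: string;
  approach: string;
  outputSchema: "canonical" | "dynamic";
}

const PIPELINES: Record<PipelineVariant, PipelineDescriptor> = {
  orchestrated: {
    file: "orchestrated.ts",
    approach: "Multi-step workflow with explicit stages",
    outputSchema: "canonical",
  },
  monolithic_canonical: {
    file: "monolithic-canonical.ts",
    approach: "Single LLM tool-loop call",
    outputSchema: "canonical",
  },
  monolithic_dynamic: {
    file: "monolithic-dynamic.ts",
    approach: "Single LLM tool-loop call",
    outputSchema: "dynamic",
  },
};

// Resolve a variant key, rejecting anything that is not a known variant.
function selectPipeline(variant: string): PipelineDescriptor {
  if (!Object.prototype.hasOwnProperty.call(PIPELINES, variant)) {
    throw new Error(`Unknown pipeline variant: ${variant}`);
  }
  return PIPELINES[variant as PipelineVariant];
}

console.log(selectPipeline("monolithic_dynamic").outputSchema);
```

Using an own-property check rather than the `in` operator avoids accepting inherited keys such as `toString` as variant names.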

Shared Modules

 Module                  Lines (approx.)  Used By           Purpose
 analysis-contexts.ts                     Orch, Canon       Heuristic context pre-detection before LLM
 aggregation.ts                           Orch, Canon       Verdict weighting, contestation validation
 claim-decomposition.ts                   Orch, Canon       Claim text parsing and normalization
 evidence-filter.ts      330              Orch              Probative value filtering, false positive rate calculation
 quality-gates.ts        410              Orch              Gate 1 (claim validation) and Gate 4 (verdict confidence)
 source-reliability.ts   620              Orch, Canon, Dyn  LLM-based source reliability evaluation with cache
 verdict-corrections.ts  310              Orch              Post-hoc verdict direction mismatch corrections
 truth-scale.ts          280              Canon, Dyn        Percentage-to-verdict label mapping
 budgets.ts              250              Orch, Canon, Dyn  Token/cost budget tracking and enforcement
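
To illustrate the kind of token budget tracking and enforcement attributed to budgets.ts, here is a minimal sketch. The class name `TokenBudget`, its methods, and the limit used in the example are assumptions made for illustration. The actual module's API is not shown on this page.

```typescript
// Hypothetical token budget tracker; not the real budgets.ts API.
class TokenBudget {
  private used = 0;

  constructor(private readonly maxTokens: number) {}

  // Record usage only if it fits; return false when the budget would be exceeded.
  tryConsume(tokens: number): boolean {
    if (tokens < 0) {
      throw new RangeError("tokens must be non-negative");
    }
    if (this.used + tokens > this.maxTokens) {
      return false;
    }
    this.used += tokens;
    return true;
  }

  // Tokens still available under the configured limit.
  get remaining(): number {
    return this.maxTokens - this.used;
  }
}

const budget = new TokenBudget(1000); // assumed limit for the example
const firstCall = budget.tryConsume(600);  // fits within the budget
const secondCall = budget.tryConsume(500); // would exceed it, so it is rejected
console.log(firstCall, secondCall, budget.remaining);
```

A rejected request leaves the counter unchanged, so a pipeline can fall back to a cheaper step instead of aborting.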

Orchestrated Pipeline Steps

  1. Understand - Detect input type, extract claims, identify dependencies
  2. Research (iterative) - Generate queries, fetch sources, extract evidence
  3. Verdict Generation - Generate claim and article verdicts
  4. Summary - Build two-panel summary
  5. Report - Generate markdown report
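
The steps above can be sketched as a sequential runner with a bounded research loop. Only the stage names come from the list. The state shape, the sentence-splitting stand-in for claim extraction, and the `MAX_RESEARCH_ROUNDS` cap are hypothetical placeholders for the LLM-backed logic in orchestrated.ts.

```typescript
// Hypothetical pipeline state; real stages call the LLM and fetch sources.
interface PipelineState {
  claims: string[];
  evidence: string[];
  log: string[]; // names of the stages that ran, in order
}

const MAX_RESEARCH_ROUNDS = 3; // assumed cap on research iterations

// Step 1: stand-in claim extraction that splits the input on periods.
function understand(input: string, s: PipelineState): void {
  s.claims = input
    .split(".")
    .map((c) => c.trim())
    .filter((c) => c.length > 0);
  s.log.push("understand");
}

// Step 2: one research round; returns true while claims still lack evidence.
function researchRound(s: PipelineState): boolean {
  const next = s.claims[s.evidence.length];
  if (next === undefined) return false;
  s.evidence.push(`evidence for: ${next}`);
  s.log.push("research");
  return s.evidence.length < s.claims.length;
}

function runOrchestrated(input: string): PipelineState {
  const s: PipelineState = { claims: [], evidence: [], log: [] };
  understand(input, s);
  for (let round = 0; round < MAX_RESEARCH_ROUNDS; round++) {
    if (!researchRound(s)) break;
  }
  // Steps 3 to 5 are recorded as placeholders in this sketch.
  s.log.push("verdict", "summary", "report");
  return s;
}

const run = runOrchestrated("A is true. B is false.");
console.log(run.log.join(" > "));
```

Capping the research loop means it stops either when every claim has evidence or when the round limit is reached.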

Detailed Pipeline Diagrams

For internal implementation details of each pipeline variant, see the detailed pipeline diagrams.

AKEL and Federation

In Release 1.0+, AKEL participates in cross-node knowledge alignment:

  • Shares embeddings
  • Exchanges canonicalized claim forms
  • Exchanges scenario templates
  • Sends and receives contradiction alerts
  • Never shares model weights
  • Never overrides local governance

Each node may assign a trust level to its peers, which determines how AKEL-related data from them is imported:

  • Trusted nodes: auto-merge embeddings and templates
  • Neutral nodes: require reviewer approval
  • Untrusted nodes: fully manual import
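
The trust levels above amount to a small policy function. The sketch below is one possible reading. The type names and the `importActionFor` function are hypothetical, and the rule that data from trusted peers other than embeddings and templates still goes to reviewer approval is an assumption, since the page only states auto-merge for those two kinds.

```typescript
type TrustLevel = "trusted" | "neutral" | "untrusted";

// Kinds of federated data named in the AKEL and Federation list.
type DataKind =
  | "embedding"
  | "scenario_template"
  | "claim_form"
  | "contradiction_alert";

type ImportAction = "auto_merge" | "reviewer_approval" | "manual_import";

// Map a peer's trust level and the data kind to an import action.
function importActionFor(level: TrustLevel, kind: DataKind): ImportAction {
  if (level === "untrusted") {
    return "manual_import";
  }
  if (
    level === "trusted" &&
    (kind === "embedding" || kind === "scenario_template")
  ) {
    return "auto_merge";
  }
  // Assumption: everything else is routed to a reviewer.
  return "reviewer_approval";
}

console.log(importActionFor("trusted", "embedding"));
console.log(importActionFor("neutral", "embedding"));
```

Because the policy is evaluated locally on the receiving node, it is consistent with the rule that federation never overrides local governance.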

Human Approval Workflow

  1. AKEL generates draft outputs (AuthorType = AI)
  2. Reviewers inspect and approve/moderate the drafts
  3. Experts validate high-risk or domain-specific outputs
  4. Moderators finalize publication
  5. Version numbers increment and history is preserved

No AKEL output is ever published automatically.
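
The workflow above behaves like a small state machine in which publication is reachable only through human steps. This sketch makes that explicit. The stage names, the `highRisk` flag, and the choice to increment the version at the moment of publication are assumptions for illustration, not the platform's actual data model.

```typescript
type Stage = "ai_draft" | "reviewed" | "expert_validated" | "published";

interface Draft {
  authorType: "AI";
  highRisk: boolean; // high-risk or domain-specific output
  stage: Stage;
  version: number;
  history: Stage[];  // stages the draft has already passed through
}

// Stages reachable from the current one; high-risk drafts cannot skip experts.
function nextStages(d: Draft): Stage[] {
  switch (d.stage) {
    case "ai_draft":
      return ["reviewed"];
    case "reviewed":
      return d.highRisk ? ["expert_validated"] : ["expert_validated", "published"];
    case "expert_validated":
      return ["published"];
    case "published":
      return [];
  }
}

// Return a new draft in the target stage, or throw on an illegal transition.
function advance(d: Draft, to: Stage): Draft {
  if (nextStages(d).indexOf(to) === -1) {
    throw new Error(`Illegal transition: ${d.stage} -> ${to}`);
  }
  return {
    ...d,
    stage: to,
    version: to === "published" ? d.version + 1 : d.version,
    history: [...d.history, d.stage],
  };
}

const start: Draft = {
  authorType: "AI",
  highRisk: true,
  stage: "ai_draft",
  version: 1,
  history: [],
};
const reviewed = advance(start, "reviewed");
const validated = advance(reviewed, "expert_validated");
const published = advance(validated, "published");
console.log(published.version, published.history.join(" > "));
```

There is no transition from `ai_draft` straight to `published`, which encodes the rule that no AKEL output is published automatically.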