Automation

Version 3.1 by Robert Schaub on 2025/12/18 22:28

How FactHarbor scales through automated claim evaluation.

1. Automation Philosophy

FactHarbor is automation-first: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.
Why automation:

  • Scale: Can process millions of claims
  • Consistency: Same evaluation criteria applied uniformly
  • Transparency: Algorithms are auditable
  • Speed: Results typically in under 20 seconds

See Automation Philosophy for detailed principles.

2. Claim Processing Flow

2.1 User Submits Claim

  • User provides claim text + source URLs
  • System validates format
  • Assigns processing ID
  • Queues for AKEL processing

2.2 AKEL Processing

AKEL automatically:

  1. Parses claim into testable components
  2. Extracts evidence from sources
  3. Scores source credibility
  4. Evaluates claim against evidence
  5. Generates verdict with confidence score
  6. Assigns risk tier (A/B/C)
  7. Publishes result

Processing time: Typically under 20 seconds
No human approval required - publication is automatic
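The seven steps above can be sketched as a single pipeline function. This is an illustrative mock, not FactHarbor's implementation: every function name, the string-splitting parser, and the credibility heuristic are hypothetical stand-ins for the real AKEL components.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Result:
    claim: str
    verdict: str
    confidence: float  # 0.0 - 1.0
    risk_tier: str     # "A" | "B" | "C"

def parse_claim(text: str) -> List[str]:
    # Step 1: split the claim into independently testable components (toy parser)
    return [part.strip() for part in text.split(" and ")]

def score_sources(urls: List[str]) -> float:
    # Steps 2-3 (stubbed): more corroborating sources yield higher confidence
    return min(1.0, 0.4 + 0.2 * len(urls))

def assign_tier(confidence: float) -> str:
    # Step 6 (illustrative): lower confidence maps to a higher-risk tier
    return "A" if confidence < 0.5 else ("B" if confidence < 0.8 else "C")

def process_claim(text: str, urls: List[str]) -> Result:
    components = parse_claim(text)            # 1. parse
    confidence = score_sources(urls)          # 2-5. evidence + scoring (stubbed)
    verdict = "supported" if confidence >= 0.5 else "uncertain"
    return Result(text, verdict, confidence, assign_tier(confidence))  # 6-7. tier + publish

result = process_claim("X and Y", ["https://a.example", "https://b.example"])
```

Note that nothing in the pipeline waits on a human: the `Result` is returned (published) unconditionally, matching the automation-first principle.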

2.3 Publication States

Processing: AKEL working on claim (not visible to public)

Published: AKEL completed evaluation (public)

  • Verdict displayed with confidence score
  • Evidence and sources shown
  • Risk tier indicated
  • Users can report issues

Flagged: AKEL identified an issue requiring moderator attention (still public)

  • Confidence below threshold
  • Detected manipulation attempt
  • Unusual pattern
  • Moderator reviews and may take action
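A minimal sketch of these publication states, assuming a hypothetical confidence threshold of 0.6 (the actual threshold is a policy parameter, not stated here):

```python
from enum import Enum

class PubState(Enum):
    PROCESSING = "processing"  # not visible to public
    PUBLISHED = "published"    # public
    FLAGGED = "flagged"        # still public, queued for moderator attention

def finalize(confidence: float, manipulation_detected: bool,
             threshold: float = 0.6) -> PubState:
    # Flagging never blocks publication: a flagged claim remains public,
    # it is simply routed to the moderator review queue as well.
    if confidence < threshold or manipulation_detected:
        return PubState.FLAGGED
    return PubState.PUBLISHED
```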

2.5 LLM-Based Processing Architecture

FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:

POC: Two-Phase Approach

Phase 1: Claim Extraction

  • Single LLM call to extract all claims from submitted content
  • Light structure, focused on identifying distinct verifiable claims
  • Output: List of claims with context

Phase 2: Claim Analysis (Parallel)

  • Single LLM call per claim (parallelizable)
  • Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
  • Each claim analyzed independently

Advantages:

  • Fast to implement (2-4 weeks to working POC)
  • Minimal API usage: 1 + N LLM calls total (one extraction call, then one per claim)
  • Simple to debug (claim-level isolation)
  • Proves concept viability
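The two-phase POC shape can be sketched as follows. The LLM calls are stubbed out with trivial functions; in the real system each stub would be one API request, which is what makes Phase 2 trivially parallelizable.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

def extract_claims(content: str) -> List[str]:
    # Phase 1: a single LLM call extracting distinct claims (stubbed:
    # here each non-empty line is treated as one claim)
    return [line.strip() for line in content.splitlines() if line.strip()]

def analyze_claim(claim: str) -> Dict[str, str]:
    # Phase 2: one LLM call per claim returning the full structured
    # output - Evidence, Scenarios, Sources, Verdict, Risk (stubbed)
    return {"claim": claim, "verdict": "unverified", "risk": "B"}

def run_poc_pipeline(content: str) -> List[Dict[str, str]]:
    claims = extract_claims(content)            # 1 call
    with ThreadPoolExecutor() as pool:          # + N calls, in parallel
        return list(pool.map(analyze_claim, claims))
```

Because each claim is analyzed in its own independent call, a failure or bad output for one claim stays isolated at the claim level, which is what makes the POC simple to debug.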

Production: Three-Phase Approach

Phase 1: Claim Extraction + Validation

  • Extract distinct verifiable claims
  • Validate claim clarity and uniqueness
  • Remove duplicates and vague claims

Phase 2: Evidence Gathering (Parallel)

For each claim independently:

  • Find supporting and contradicting evidence
  • Identify authoritative sources
  • Generate test scenarios

Validation: Check evidence quality and source validity
Error containment: Issues in one claim don't affect others

Phase 3: Verdict Generation (Parallel)

For each claim:

  • Generate verdict based on validated evidence
  • Assess confidence and risk level
  • Flag low-confidence results for human review

Validation: Check verdict consistency with evidence

Advantages:

  • Error containment between phases
  • Clear quality gates and validation
  • Observable metrics per phase
  • Scalable (parallel processing across claims)
  • Adaptable (can optimize each phase independently)

LLM Task Delegation

All complex cognitive tasks are delegated to LLMs:

  • Claim Extraction: Understanding context, identifying distinct claims
  • Evidence Finding: Analyzing sources, assessing relevance
  • Scenario Generation: Creating testable hypotheses
  • Source Evaluation: Assessing reliability and authority
  • Verdict Generation: Synthesizing evidence into conclusions
  • Risk Assessment: Evaluating potential impact

Error Mitigation

Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:

  • Validation gates between phases
  • Confidence thresholds for quality control
  • Parallel processing to avoid error propagation across claims
  • Human review queue for low-confidence verdicts
  • Independent claim processing - errors in one claim don't cascade to others
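The validation-gate pattern above can be sketched as a per-claim loop. All names and the 0.6 threshold are illustrative; the point is the containment structure: each claim passes or fails the gate on its own, so neither an exception nor a low-quality output cascades to the other claims.

```python
from typing import Callable, Dict, List, Tuple

CONFIDENCE_THRESHOLD = 0.6  # illustrative policy value

def run_phase(claims: List[str],
              phase_fn: Callable[[str], Dict],
              validate: Callable[[Dict], bool]) -> Tuple[List[Dict], List[Dict]]:
    # Validation gate between phases: each claim is processed and checked
    # independently; failures are flagged, never propagated.
    passed, flagged = [], []
    for claim in claims:
        try:
            result = phase_fn(claim)
        except Exception:
            flagged.append({"claim": claim, "reason": "phase error"})
            continue
        if validate(result):
            passed.append(result)
        else:
            flagged.append({"claim": claim, "reason": "failed gate"})
    return passed, flagged

def verdict_gate(result: Dict) -> bool:
    # Quality-control gate: a verdict only advances if its confidence
    # clears the threshold; otherwise it goes to the human review queue.
    return result["confidence"] >= CONFIDENCE_THRESHOLD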

3. Risk Tiers

Risk tiers classify claims by potential impact and guide audit sampling rates.

3.1 Tier A (High Risk)

Domains: Medical, legal, elections, safety, security
Characteristics:

  • High potential for harm if incorrect
  • Complex specialized knowledge required
  • Often subject to regulation

Publication: AKEL publishes automatically with prominent risk warning
Audit rate: Higher sampling recommended

3.2 Tier B (Medium Risk)

Domains: Complex policy, science, causality claims
Characteristics:

  • Moderate potential impact
  • Requires careful evidence evaluation
  • Multiple valid interpretations possible

Publication: AKEL publishes automatically with standard risk label
Audit rate: Moderate sampling recommended

3.3 Tier C (Low Risk)

Domains: Definitions, established facts, historical data
Characteristics:

  • Low potential for harm
  • Well-documented information
  • Typically clear right/wrong answers

Publication: AKEL publishes by default
Audit rate: Lower sampling recommended
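Since tiers guide audit sampling rather than publication, the whole mechanism reduces to a tier-to-rate lookup. The rates below are invented placeholders; the document only specifies their ordering (A sampled most, C least).

```python
import random

# Hypothetical audit sampling rates per risk tier; actual values are
# policy parameters, only the ordering A > B > C comes from the text.
AUDIT_SAMPLING = {"A": 0.30, "B": 0.10, "C": 0.02}

def should_audit(tier: str, rng: random.Random) -> bool:
    # Every tier publishes automatically first; this only decides whether
    # the already-published claim is also sampled for a human audit.
    return rng.random() < AUDIT_SAMPLING[tier]
```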

4. Quality Gates

AKEL applies quality gates before publication. If any gate fails, the claim is flagged (not blocked; it is still published).
Quality gates:

  • Sufficient evidence extracted (≥2 sources)
  • Sources meet minimum credibility threshold
  • Confidence score calculable
  • No detected manipulation patterns
  • Claim parseable into testable form

Failed gates: Claim published with flag for moderator review
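A sketch of the gate check, one condition per bullet above. The 0.5 credibility threshold and the field names are assumptions for illustration; the key behavior is that failures accumulate as flags while `published` stays `True`.

```python
from typing import Optional

def quality_gates(evidence_count: int,
                  min_source_credibility: float,
                  confidence: Optional[float],
                  manipulation_detected: bool,
                  parseable: bool) -> dict:
    failures = []
    if evidence_count < 2:                 # gate: sufficient evidence (>= 2 sources)
        failures.append("insufficient evidence")
    if min_source_credibility < 0.5:       # gate: credibility threshold (value illustrative)
        failures.append("low-credibility source")
    if confidence is None:                 # gate: confidence score calculable
        failures.append("confidence not calculable")
    if manipulation_detected:              # gate: no manipulation patterns
        failures.append("manipulation pattern")
    if not parseable:                      # gate: claim parseable into testable form
        failures.append("claim not testable")
    # Failed gates flag the claim for moderator review but never block it.
    return {"published": True, "flagged": bool(failures), "failures": failures}
```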

5. Automation Levels

Information

Current Status: Level 0 (POC/Demo) - v2.6.33. FactHarbor is currently at POC level with full AKEL automation but limited production features.

Automation Maturity Progression


graph TD
    POC[Level 0 POC Demo CURRENT]
    R05[Level 0.5 Limited Production]
    R10[Level 1.0 Full Production]
    R20[Level 2.0+ Distributed Intelligence]

    POC --> R05
    R05 --> R10
    R10 --> R20

Level Descriptions

| Level | Name | Key Features |
| Level 0 | POC/Demo (CURRENT) | All content auto-analyzed, AKEL generates verdicts, no risk tier filtering, single-user demo mode |
| Level 0.5 | Limited Production | Multi-user support, risk tier classification, basic sampling audit, algorithm improvement focus |
| Level 1.0 | Full Production | All tiers auto-published, clear risk labels, reduced sampling, mature algorithms |
| Level 2.0+ | Distributed | Federated multi-node, cross-node audits, advanced patterns, strategic sampling only |

Current Implementation (v2.6.33)

| Feature | POC Target | Actual Status |
| AKEL auto-analysis | Yes | Implemented |
| Verdict generation | Yes | Implemented (7-point scale) |
| Quality gates | Basic | Gates 1 and 4 implemented |
| Risk tiers | Yes | Not implemented |
| Sampling audits | High sampling | Not implemented |
| User system | Demo only | Anonymous only |

Key Principles

Across All Levels:

  • AKEL makes all publication decisions
  • No human approval gates
  • Humans monitor metrics and improve algorithms
  • Risk tiers guide audit priorities, not publication
  • Sampling audits inform improvements

FactHarbor progresses through automation maturity levels:
Release 0.5 (Limited Production): Risk tiers and sampling audits introduced, all tiers auto-published
Release 1.0 (Full Production): All tiers auto-published with clear risk labels, reduced sampling
Release 2.0+ (Distributed): Federated nodes, cross-node audits, strategic sampling only
See Automation Roadmap for detailed progression.

5.5 Automation Roadmap

Information

Current Status: POC (v2.6.33) - FactHarbor is at Proof of Concept stage. No risk tiers, no sampling audits yet.

Automation Roadmap


graph LR
    subgraph QA[Quality Assurance Evolution]
        QA1[Initial High Sampling]
        QA2[Intermediate Strategic]
        QA3[Mature Anomaly-Triggered]

        QA1 --> QA2
        QA2 --> QA3
    end

    subgraph POC[POC CURRENT]
        POC_F[POC Features]
    end

    subgraph R05[Release 0.5]
        R05_F[Limited Production]
    end

    subgraph R10[Release 1.0]
        R10_F[Full Production]
    end

    subgraph Future[Future]
        Future_F[Distributed Intelligence]
    end

    POC_F --> R05_F
    R05_F --> R10_F
    R10_F --> Future_F

Phase Details

POC (Current v2.6.33)

  • All content analyzed
  • Basic AKEL Processing
  • No risk tiers yet
  • No sampling audits

Release 0.5 (Planned)

  • Tier A/B/C Published
  • All auto-publication
  • Risk Labels Active
  • Contradiction Detection
  • Sampling-Based QA

Release 1.0 (Planned)

  • Comprehensive AI Publication
  • Strategic Audits Only
  • Federated Nodes Beta
  • Cross-Node Data Sharing
  • Mature Algorithm Performance

Future (V2.0+)

  • Advanced Pattern Detection
  • Global Contradiction Network
  • Minimal Human QA
  • Full Federation

Philosophy

Automation Philosophy: At all stages, AKEL publishes automatically. Humans improve algorithms, not review content.

Sampling Rates: Start higher for learning, reduce as confidence grows.
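One way to model "start higher, reduce as confidence grows" is a decay toward a floor rate. The half-life form, the parameter names, and all values here are assumptions for illustration, not a documented FactHarbor formula:

```python
def sampling_rate(initial: float, floor: float,
                  clean_audits: int, half_life: int = 1000) -> float:
    # The rate above the floor halves every `half_life` clean audits,
    # so sampling starts high for learning and asymptotically approaches
    # (but never drops below) the floor as confidence grows.
    return floor + (initial - floor) * 0.5 ** (clean_audits / half_life)
```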

6. Human Role

Humans do NOT review content for approval. Instead:
  • Monitoring: Watch aggregate performance metrics
  • Improvement: Fix algorithms when patterns show issues
  • Exception handling: Review AKEL-flagged items
  • Governance: Set policies AKEL applies

See Contributor Processes for how to improve the system.

6.5 Manual vs Automated Matrix

Information

Design Philosophy - This matrix shows the intended division of responsibilities between AKEL and humans. v2.6.33 implements the automated claim evaluation; human responsibilities require the user system (not yet implemented).

Manual vs Automated Matrix


graph TD
    subgraph Automated[Automated by AKEL]
        A1[Claim Evaluation]
        A2[Quality Assessment]
        A3[Content Management]
    end
    subgraph Human[Human Responsibilities]
        H1[Algorithm Improvement]
        H2[Policy Governance]
        H3[Exception Handling]
        H4[Strategic Decisions]
    end

Automated by AKEL

| Function | Details | Status |
| Claim Evaluation | Evidence extraction, source scoring, verdict generation, risk classification, publication | Implemented |
| Quality Assessment | Contradiction detection, confidence scoring, pattern recognition, anomaly flagging | Partial (Gates 1 and 4) |
| Content Management | KeyFactor generation, evidence linking, source tracking | Implemented |

Human Responsibilities

| Function | Details | Status |
| Algorithm Improvement | Monitor metrics, identify issues, propose fixes, test, deploy | Via code changes |
| Policy Governance | Set criteria, define risk tiers, establish thresholds, update guidelines | Not implemented (env vars only) |
| Exception Handling | Review flagged items, handle abuse, address safety, manage legal | Not implemented |
| Strategic Decisions | Budget, hiring, major policy, partnerships | N/A |

Key Principles

Never Manual:

  • Individual claim approval
  • Routine content review
  • Verdict overrides (fix algorithm instead)
  • Publication gates

Key Principle: AKEL handles all content decisions. Humans improve the system, not the data.

7. Moderation

Moderators handle items AKEL flags:
  • Abuse detection: Spam, manipulation, harassment
  • Safety issues: Content that could cause immediate harm
  • System gaming: Attempts to manipulate scoring

Action: May temporarily hide content, ban users, or propose algorithm improvements
Does NOT: Routinely review claims or override verdicts

See Organisational Model for moderator role details.

8. Continuous Improvement

  • Performance monitoring: Track AKEL accuracy, speed, coverage
  • Issue identification: Find systematic errors from metrics
  • Algorithm updates: Deploy improvements to fix patterns
  • A/B testing: Validate changes before full rollout
  • Retrospectives: Learn from failures systematically

See Continuous Improvement for improvement cycle.

9. Scalability

Automation enables FactHarbor to scale:

  • Millions of claims processable
  • Consistent quality at any volume
  • Cost efficiency through automation
  • Rapid iteration on algorithms

Without automation: Human review doesn't scale, creates bottlenecks, and introduces inconsistency.

10. Transparency

All automation is transparent:

  • Algorithm parameters documented
  • Evaluation criteria public
  • Source scoring rules explicit
  • Confidence calculations explained
  • Performance metrics visible

See System Performance Metrics for what we measure.