Architecture

Last modified by Robert Schaub on 2025/12/24 21:53

FactHarbor's architecture is designed for simplicity, automation, and continuous improvement.

1. Core Principles

  • AI-First: AKEL (the AI layer) is the primary system; humans supplement it
  • Publish by Default: No centralized approval (removed in V0.9.50), publish with confidence scores
  • System Over Data: Fix algorithms, not individual outputs
  • Measure Everything: Quality metrics drive improvements
  • Scale Through Automation: Minimal human intervention
  • Start Simple: Add complexity only when metrics prove necessary

2. High-Level Architecture

High-Level Architecture

graph TB
  subgraph Interface_Layer["🖥️ Interface Layer"]
    UI[Web UI<br/>Browse & Submit]
    API[REST API<br/>Programmatic Access]
    AUTH[Authentication<br/>& Authorization]
  end
  subgraph Processing_Layer["⚙️ Processing Layer"]
    AKEL[AKEL Pipeline<br/>Parallel Processing<br/>10-18 seconds]
    LLM[LLM Abstraction Layer<br/>Multi-Provider Support<br/>Anthropic OpenAI Google]
    BG[Background Jobs<br/>Source Scoring,<br/>Cache, Archival]
    QM[Quality Monitoring<br/>Automated Checks]
  end
  subgraph Data_Layer["💾 Data & Storage Layer"]
    PG[(PostgreSQL<br/>Primary Database<br/>All Core Data)]
    REDIS[(Redis<br/>Cache & LLM Config)]
    S3[(S3<br/>Archives)]
  end
  UI --> AUTH
  API --> AUTH
  AUTH --> AKEL
  AUTH --> QM
  AKEL --> LLM
  LLM --> PG
  LLM --> REDIS
  AKEL --> PG
  AKEL --> REDIS
  BG --> PG
  BG --> S3
  QM --> PG
  REDIS --> PG
  style Interface_Layer fill:#e1f5ff
  style Processing_Layer fill:#fff4e1
  style Data_Layer fill:#f0f0f0
  style AKEL fill:#ffcccc
  style LLM fill:#ccffcc
  style PG fill:#9999ff

Three-Layer Architecture - Clean separation with LLM abstraction: Interface Layer (user interactions), Processing Layer (AKEL + LLM Abstraction + background jobs), Data Layer (PostgreSQL primary + Redis cache/config + S3 archives). LLM Abstraction Layer provides provider-agnostic access to Anthropic, OpenAI, Google, and local models with automatic failover.

2.1 Three-Layer Architecture

FactHarbor uses a clean three-layer architecture:

Interface Layer

Handles all user and system interactions:

  • Web UI: Browse claims, view evidence, submit feedback
  • REST API: Programmatic access for integrations
  • Authentication & Authorization: User identity and permissions
  • Rate Limiting: Protect against abuse

Processing Layer

Core business logic and AI processing:

  • AKEL Pipeline: AI-driven claim analysis (parallel processing)
    • Parse and extract claim components
    • Gather evidence from multiple sources
    • Check source track records
    • Extract scenarios from evidence
    • Synthesize verdicts
    • Calculate risk scores
  • LLM Abstraction Layer: Provider-agnostic AI access
    • Multi-provider support (Anthropic, OpenAI, Google, local models)
    • Automatic failover and rate limit handling
    • Per-stage model configuration
    • Cost optimization through provider selection
    • No vendor lock-in
  • Background Jobs: Automated maintenance tasks
    • Source track record updates (weekly)
    • Cache warming and invalidation
    • Metrics aggregation
    • Data archival
  • Quality Monitoring: Automated quality checks
    • Anomaly detection
    • Contradiction detection
    • Completeness validation
  • Moderation Detection: Automated abuse detection
    • Spam identification
    • Manipulation detection
    • Flag suspicious activity

Data & Storage Layer

Persistent data storage and caching:

  • PostgreSQL: Primary database for all core data
    • Claims, evidence, sources, users
    • Scenarios, edits, audit logs
    • Built-in full-text search
    • Time-series capabilities for metrics
  • Redis: High-speed caching layer
    • Session data
    • Frequently accessed claims
    • API rate limiting
  • S3 Storage: Long-term archival
    • Old edit history (90+ days)
    • AKEL processing logs
    • Backup snapshots

Optional future additions (add only when metrics prove necessary):

  • Elasticsearch: If PostgreSQL full-text search becomes slow
  • TimescaleDB: If metrics queries become a bottleneck

2.2 LLM Abstraction Layer

LLM Abstraction Architecture

graph LR
  subgraph AKEL["AKEL Pipeline"]
    S1[Stage 1<br/>Extract Claims]
    S2[Stage 2<br/>Analyze Claims]
    S3[Stage 3<br/>Holistic Assessment]
  end
  subgraph LLM["LLM Abstraction Layer"]
    INT[Provider Interface]
    CFG[Configuration<br/>Registry]
    FAIL[Failover<br/>Handler]
  end
  subgraph Providers["LLM Providers"]
    ANT[Anthropic<br/>Claude API<br/>PRIMARY]
    OAI[OpenAI<br/>GPT API<br/>SECONDARY]
    GOO[Google<br/>Gemini API<br/>TERTIARY]
    LOC[Local Models<br/>Llama/Mistral<br/>FUTURE]
  end
  S1 --> INT
  S2 --> INT
  S3 --> INT
  INT --> CFG
  INT --> FAIL
  CFG --> ANT
  FAIL --> ANT
  FAIL --> OAI
  FAIL --> GOO
  ANT -.fallback.-> OAI
  OAI -.fallback.-> GOO
  style AKEL fill:#ffcccc
  style LLM fill:#ccffcc
  style Providers fill:#e1f5ff
  style ANT fill:#ff9999
  style OAI fill:#99ccff
  style GOO fill:#99ff99
  style LOC fill:#cccccc

LLM Abstraction Architecture - AKEL stages call through provider interface. Configuration registry selects provider per stage. Failover handler implements automatic fallback chain.

POC1 Implementation:

  • PRIMARY: Anthropic Claude API (fast model for Stage 1 extraction, reasoning model for Stages 2 & 3 analysis)
  • Failover: Basic error handling with cache fallback

Future (POC2/Beta):

  • SECONDARY: OpenAI GPT API (automatic failover)
  • TERTIARY: Google Gemini API (tertiary fallback)
  • FUTURE: Local models (Llama/Mistral for on-premises deployments)

Architecture Benefits:

  • Prevents vendor lock-in
  • Ensures resilience through automatic failover
  • Enables cost optimization per stage
  • Supports regulatory compliance (provider selection for data residency)

Description: Shows how AKEL stages interact with multiple LLM providers through an abstraction layer. POC1 uses Anthropic Claude as primary provider (Haiku 4.5 for extraction, Sonnet 4.5 for analysis). OpenAI, Google, and local models are shown as future expansion options (POC2/Beta).

Purpose: FactHarbor uses a provider-agnostic abstraction layer for all AI interactions, avoiding vendor lock-in and enabling flexible provider selection.

Multi-Provider Support:

  • Primary: Anthropic Claude API (Haiku for extraction, Sonnet for analysis)
  • Secondary: OpenAI GPT API (automatic failover)
  • Tertiary: Google Vertex AI / Gemini
  • Future: Local models (Llama, Mistral) for on-premises deployments

Provider Interface:

  • Abstract `LLMProvider` interface with `complete()`, `stream()`, `getName()`, `getCostPer1kTokens()`, `isAvailable()` methods
  • Per-stage model configuration (Stage 1: Haiku, Stage 2 & 3: Sonnet)
  • Environment variable and database configuration
  • Adapter pattern implementation (AnthropicProvider, OpenAIProvider, GoogleProvider)
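
The interface above can be sketched in Python (the backend language named in section 11). The method names come from the list; everything else — the signatures, the stubbed adapter body, and the illustrative cost table — is an assumption, not the actual implementation:

```python
from abc import ABC, abstractmethod
from typing import Iterator


class LLMProvider(ABC):
    """Provider-agnostic interface; concrete adapters wrap each vendor SDK."""

    @abstractmethod
    def complete(self, prompt: str, model: str) -> str: ...

    @abstractmethod
    def stream(self, prompt: str, model: str) -> Iterator[str]: ...

    @abstractmethod
    def getName(self) -> str: ...

    @abstractmethod
    def getCostPer1kTokens(self, model: str) -> float: ...

    @abstractmethod
    def isAvailable(self) -> bool: ...


class AnthropicProvider(LLMProvider):
    """Adapter for the Anthropic API; the vendor SDK call is stubbed out here."""

    def complete(self, prompt: str, model: str) -> str:
        raise NotImplementedError("wrap the vendor SDK call here")

    def stream(self, prompt: str, model: str) -> Iterator[str]:
        raise NotImplementedError("wrap the vendor streaming call here")

    def getName(self) -> str:
        return "anthropic"

    def getCostPer1kTokens(self, model: str) -> float:
        # Illustrative numbers only; real pricing lives in configuration.
        return {"haiku": 0.001, "sonnet": 0.003}.get(model, 0.003)

    def isAvailable(self) -> bool:
        return True
```

OpenAIProvider and GoogleProvider would follow the same adapter shape, which is what lets the pipeline switch providers without code changes.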

Configuration:

  • Runtime provider switching without code changes
  • Admin API for provider management (`POST /admin/v1/llm/configure`)
  • Per-stage cost optimization (use cheaper models for extraction, quality models for analysis)
  • Support for rate limit handling and cost tracking
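
A minimal sketch of per-stage configuration with environment-variable overrides. The registry shape, variable names, and defaults are hypothetical, chosen to match the cheap-model/quality-model split described above:

```python
import os

# Hypothetical per-stage registry: stage -> (provider, model).
# Defaults follow the text (cheap model for extraction, quality model
# for analysis); environment variables override without code changes.
DEFAULT_STAGE_CONFIG = {
    "stage1_extract": ("anthropic", "haiku"),
    "stage2_analyze": ("anthropic", "sonnet"),
    "stage3_holistic": ("anthropic", "sonnet"),
}


def resolve_stage(stage: str) -> tuple[str, str]:
    """Return (provider, model) for a pipeline stage, env vars taking precedence."""
    provider = os.getenv(f"LLM_{stage.upper()}_PROVIDER")
    model = os.getenv(f"LLM_{stage.upper()}_MODEL")
    default_provider, default_model = DEFAULT_STAGE_CONFIG[stage]
    return (provider or default_provider, model or default_model)
```

The admin API (`POST /admin/v1/llm/configure`) would write the same registry to the database instead of relying on environment variables.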

Failover Strategy:

  • Automatic fallback: Primary → Secondary → Tertiary
  • Circuit breaker pattern for unavailable providers
  • Health checking and provider availability monitoring
  • Graceful degradation when all providers unavailable
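
The fallback chain might look like the following sketch. The provider-dict shape and `ProviderUnavailable` exception are hypothetical, and a production circuit breaker would track failure counts over time rather than a static `available` flag:

```python
class ProviderUnavailable(Exception):
    """Raised when every provider in the chain has failed."""


def complete_with_failover(providers, prompt):
    """Try providers in priority order (primary -> secondary -> tertiary);
    skip any marked unavailable, and surface graceful degradation if all fail."""
    last_error = None
    for provider in providers:
        if not provider.get("available", True):
            continue  # circuit breaker has opened for this provider
        try:
            return provider["call"](prompt)
        except Exception as e:
            last_error = e  # record failure and fall through to the next provider
    raise ProviderUnavailable(f"all providers failed: {last_error}")


def primary(prompt):
    raise RuntimeError("rate limited")  # simulate a provider outage


def secondary(prompt):
    return f"answer to: {prompt}"


providers = [
    {"name": "primary", "call": primary},
    {"name": "secondary", "call": secondary},
]
```

Here the primary raises, so the secondary answers; the caller never sees the transient failure.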

Cost Optimization:

  • Track and compare costs across providers per request
  • Enable A/B testing of different models for quality/cost tradeoffs
  • Per-stage provider selection for optimal cost-efficiency
  • Cost comparison: Anthropic ($0.114), OpenAI ($0.065), Google ($0.072) per article at 0% cache

Architecture Pattern:

AKEL Stages          LLM Abstraction       Providers
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Stage 1 Extract  ──→ Provider Interface ──→ Anthropic (PRIMARY)
Stage 2 Analyze  ──→ Configuration      ──→ OpenAI (SECONDARY)
Stage 3 Holistic ──→ Failover Handler   ──→ Google (TERTIARY)
                                        └→ Local Models (FUTURE)

Benefits:

  • No Vendor Lock-In: Switch providers based on cost, quality, or availability without code changes
  • Resilience: Automatic failover ensures service continuity during provider outages
  • Cost Efficiency: Use optimal provider per task (cheap for extraction, quality for analysis)
  • Quality Assurance: Cross-provider output verification for critical claims
  • Regulatory Compliance: Use specific providers for data residency requirements
  • Future-Proofing: Easy integration of new models as they become available

2.3 Design Philosophy

Start Simple, Evolve Based on Metrics

The architecture deliberately starts simple:

  • Single primary database (PostgreSQL handles most workloads initially)
  • Three clear layers (easy to understand and maintain)
  • Automated operations (minimal human intervention)
  • Measure before optimizing (add complexity only when proven necessary)

See Design Decisions and When to Add Complexity for detailed rationale.

3. AKEL Architecture

AKEL Architecture

graph TB
  User[User Submits Content<br/>Text/URL/Single Claim]
  Extract[Claim Extraction<br/>LLM identifies distinct claims]
  AKEL[AKEL Core Processing<br/>Per Claim]
  Evidence[Evidence Gathering]
  Scenario[Scenario Generation]
  Verdict[Verdict Generation]
  Storage[(Storage Layer<br/>PostgreSQL + S3)]
  Queue[Processing Queue<br/>Parallel Claims]
  User --> Extract
  Extract -->|Multiple Claims| Queue
  Extract -->|Single Claim| AKEL
  Queue -->|Process Each| AKEL
  AKEL --> Evidence
  AKEL --> Scenario
  Evidence --> Verdict
  Scenario --> Verdict
  Verdict --> Storage
  style Extract fill:#e1f5ff
  style Queue fill:#fff4e1
  style AKEL fill:#f0f0f0

See AI Knowledge Extraction Layer (AKEL) for detailed information.

3.5 Claim Processing Architecture

FactHarbor's claim processing architecture is designed to handle both single-claim and multi-claim submissions efficiently.

Multi-Claim Handling

Users often submit:

  • Text with multiple claims: Articles, statements, or paragraphs containing several distinct factual claims
  • Web pages: URLs that are analyzed to extract all verifiable claims
  • Single claims: Simple, direct factual statements

The first processing step is always Claim Extraction: identifying and isolating individual verifiable claims from submitted content.

Processing Phases

POC Implementation (Two-Phase):

Phase 1 - Claim Extraction:

  • LLM analyzes submitted content
  • Extracts all distinct, verifiable claims
  • Returns structured list of claims with context

Phase 2 - Parallel Analysis:

  • Each claim processed independently by LLM
  • Single call per claim generates: Evidence, Scenarios, Sources, Verdict, Risk
  • Parallelized across all claims
  • Results aggregated for presentation
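
The two-phase flow can be sketched with asyncio. Both LLM calls are stubbed, and splitting content on "." is purely illustrative:

```python
import asyncio


async def extract_claims(content: str) -> list[str]:
    """Phase 1: one LLM call returns the distinct claims (stubbed here)."""
    return [c.strip() for c in content.split(".") if c.strip()]


async def analyze_claim(claim: str) -> dict:
    """Phase 2: one LLM call per claim yields evidence, scenarios,
    sources, verdict, and risk (stubbed here)."""
    await asyncio.sleep(0)  # stands in for the network round-trip
    return {"claim": claim, "verdict": "pending", "risk": "low"}


async def process_submission(content: str) -> list[dict]:
    claims = await extract_claims(content)
    # All claims are analyzed concurrently, so total latency tracks the
    # slowest single claim rather than the sum of all claims.
    return await asyncio.gather(*(analyze_claim(c) for c in claims))
```

This is why adding claims to a submission grows cost linearly but latency only marginally.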

Production Implementation (Three-Phase):

Phase 1 - Extraction + Validation:

  • Extract claims from content
  • Validate clarity and uniqueness
  • Filter vague or duplicate claims

Phase 2 - Evidence Gathering (Parallel):

  • Independent evidence gathering per claim
  • Source validation and scenario generation
  • Quality gates prevent poor data from advancing

Phase 3 - Verdict Generation (Parallel):

  • Generate verdict from validated evidence
  • Confidence scoring and risk assessment
  • Low-confidence cases routed to human review

Architectural Benefits

Scalability:

  • Process 100 claims with roughly 3x the latency of a single claim
  • Parallel processing across independent claims
  • Linear cost scaling with claim count

Quality:

  • Validation gates between phases
  • Errors isolated to individual claims
  • Clear observability per processing step

Flexibility:

  • Each phase optimizable independently
  • Can use different model sizes per phase
  • Easy to add human review at decision points

4. Storage Architecture

Storage Architecture

graph TB
  APP[Application<br/>API + AKEL] --> REDIS[Redis Cache<br/>Hot data, sessions,<br/>rate limiting]
  REDIS --> PG[(PostgreSQL<br/>Primary Database<br/>**All core data**<br/>Claims, Evidence,<br/>Sources, Users)]
  APP --> PG
  PG -->|Backups &<br/>Archives| S3[(S3 Storage<br/>Old logs,<br/>Backups)]
  BG[Background<br/>Scheduler] --> PG
  BG --> S3
  subgraph V10["✅ V1.0 Core (3 systems)"]
    PG
    REDIS
    S3
  end
  subgraph Future["🔮 Optional Future (Add if metrics show need)"]
    PG -.->|If search slow<br/>>500ms| ES[(Elasticsearch<br/>Full-text search)]
    PG -.->|If metrics slow<br/>>1s queries| TS[TimescaleDB<br/>Time-series]
  end
  style PG fill:#9999ff
  style REDIS fill:#ff9999
  style S3 fill:#ff99ff
  style ES fill:#cccccc
  style TS fill:#cccccc
  style V10 fill:#e8f5e9
  style Future fill:#fff3e0

Simplified Storage - PostgreSQL as single primary database for all core data (claims, evidence, sources, users, metrics). Redis for caching, S3 for archives. Elasticsearch and TimescaleDB are optional additions only if performance metrics prove necessary. Start with 3 systems, not 5.


See Storage Strategy for detailed information.

4.5 Versioning Architecture

graph LR
 CLAIM[Claim] -->|edited| EDIT[Edit Record]
 EDIT -->|stores| BEFORE[Before State]
 EDIT -->|stores| AFTER[After State]
 EDIT -->|tracks| WHO[Who Changed]
 EDIT -->|tracks| WHEN[When Changed]
 EDIT -->|tracks| WHY[Why Changed]
 EDIT -->|if needed| RESTORE[Manual Restore]
 RESTORE -->|create new| CLAIM
 style EDIT fill:#ffcccc
 style RESTORE fill:#ccffcc

Versioning Architecture - Simple audit trail for V1.0: Track who, what, when, why for each change. Store before/after values in edits table. Manual restore if needed (create new edit with old values). Full versioning system (branching, merging, automatic rollback) deferred to V2.0+ unless users explicitly request it.
V1.0: Simple edit history sufficient for accountability and basic rollback.
V2.0+: Add complex versioning if users request "see version history" or "restore previous version" features.
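
A minimal sketch of the edit-record shape and manual restore, assuming a dict-based claim representation; the field names are illustrative, not the actual schema:

```python
from dataclasses import dataclass


@dataclass
class EditRecord:
    """One row of the edits table: before/after values plus who/when/why."""
    claim_id: int
    before: dict
    after: dict
    who: str
    when: str
    why: str


def restore(claim: dict, edit: EditRecord, who: str, when: str) -> tuple[dict, EditRecord]:
    """Manual restore: create a NEW edit whose 'after' is the old values,
    rather than mutating or deleting history."""
    new_edit = EditRecord(
        claim_id=edit.claim_id,
        before=dict(claim),
        after=dict(edit.before),
        who=who,
        when=when,
        why=f"restore of edit made by {edit.who}",
    )
    return dict(edit.before), new_edit
```

Because a restore is itself just another edit, the audit trail stays append-only.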

5. Automated Systems in Detail

FactHarbor relies heavily on automation to achieve scale and quality. Here's how each automated system works:

5.1 AKEL (AI Knowledge Extraction Layer)

What it does: Primary AI processing engine that analyzes claims automatically

Inputs:

  • User-submitted claim text
  • Existing evidence and sources
  • Source track record database

Processing steps:

  1. Parse & Extract: Identify key components, entities, assertions
  2. Gather Evidence: Search web and database for relevant sources
  3. Check Sources: Evaluate source reliability using track records
  4. Extract Scenarios: Identify different contexts from evidence
  5. Synthesize Verdict: Compile evidence assessment per scenario
  6. Calculate Risk: Assess potential harm and controversy

Outputs:

  • Structured claim record
  • Evidence links with relevance scores
  • Scenarios with context descriptions
  • Verdict summary per scenario
  • Overall confidence score
  • Risk assessment

Timing: 10-18 seconds total (parallel processing)
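
The six steps can be wired together as a plain function pipeline. Every stage body below is a stub standing in for the real LLM and search calls; the record fields and threshold are illustrative:

```python
def parse_and_extract(text: str) -> dict:
    # Step 1 stub: identify entities and assertions.
    return {"assertions": [text]}


def gather_evidence(components: dict) -> list:
    # Step 2 stub: search web and database for relevant sources.
    return [{"source": "example.org", "relevance": 0.9}]


def check_sources(evidence: list) -> list:
    # Step 3 stub: attach track-record scores to each source.
    return [dict(e, reliability=0.8) for e in evidence]


def extract_scenarios(evidence: list) -> list:
    # Step 4 stub: identify distinct contexts in which the claim holds or fails.
    return [{"context": "general", "evidence": evidence}]


def synthesize_verdict(scenarios: list) -> dict:
    # Step 5 stub: compile a per-scenario assessment and an overall confidence.
    return {"per_scenario": ["unverified"] * len(scenarios), "confidence": 0.5}


def calculate_risk(verdict: dict) -> str:
    # Step 6 stub: low-confidence results would be routed to review.
    return "low" if verdict["confidence"] >= 0.5 else "review"


def run_akel(claim_text: str) -> dict:
    components = parse_and_extract(claim_text)
    evidence = check_sources(gather_evidence(components))
    scenarios = extract_scenarios(evidence)
    verdict = synthesize_verdict(scenarios)
    return {"claim": claim_text, "scenarios": scenarios,
            "verdict": verdict, "risk": calculate_risk(verdict)}
```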

5.2 Background Jobs

Source Track Record Updates (Weekly):

  • Analyze claim outcomes from past week
  • Calculate source accuracy and reliability
  • Update source_track_record table
  • Never triggered by individual claims (prevents circular dependencies)

Cache Management (Continuous):

  • Warm cache for popular claims
  • Invalidate cache on claim updates
  • Monitor cache hit rates

Metrics Aggregation (Hourly):

  • Roll up detailed metrics
  • Calculate system health indicators
  • Generate performance reports

Data Archival (Daily):

  • Move old AKEL logs to S3 (90+ days)
  • Archive old edit history
  • Compress and backup data

5.3 Quality Monitoring

Automated checks run continuously:

  • Anomaly Detection: Flag unusual patterns
    • Sudden confidence score changes
    • Unusual evidence distributions
    • Suspicious source patterns
  • Contradiction Detection: Identify conflicts
    • Evidence that contradicts other evidence
    • Claims with internal contradictions
    • Source track record anomalies
  • Completeness Validation: Ensure thoroughness
    • Sufficient evidence gathered
    • Multiple source types represented
    • Key scenarios identified
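
A completeness check of this kind might look like the sketch below; the thresholds (3 pieces of evidence, 2 source types) are illustrative, not the production values:

```python
def completeness_check(claim: dict) -> list[str]:
    """Return the completeness issues found for a claim record, if any."""
    issues = []
    evidence = claim.get("evidence", [])
    if len(evidence) < 3:
        issues.append("insufficient evidence")
    # Require at least two distinct source types (e.g. news + journal).
    source_types = {e.get("type") for e in evidence}
    if len(source_types) < 2:
        issues.append("too few source types")
    if not claim.get("scenarios"):
        issues.append("no scenarios identified")
    return issues
```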

5.4 Moderation Detection

Automated abuse detection:

  • Spam Identification: Pattern matching for spam claims
  • Manipulation Detection: Identify coordinated editing
  • Gaming Detection: Flag attempts to game source scores
  • Suspicious Activity: Log unusual behavior patterns

Human Review: Moderators review flagged items, and the system learns from their decisions

6. Scalability Strategy

6.1 Horizontal Scaling

Components scale independently:

  • AKEL Workers: Add more processing workers as claim volume grows
  • Database Read Replicas: Add replicas for read-heavy workloads
  • Cache Layer: Redis cluster for distributed caching
  • API Servers: Load-balanced API instances

6.2 Vertical Scaling

Individual components can be upgraded:

  • Database Server: Increase CPU/RAM for PostgreSQL
  • Cache Memory: Expand Redis memory
  • Worker Resources: More powerful AKEL worker machines

6.3 Performance Optimization

Built-in optimizations:

  • Denormalized Data: Cache summary data in claim records (70% fewer joins)
  • Parallel Processing: AKEL pipeline processes in parallel (40% faster)
  • Intelligent Caching: Redis caches frequently accessed data
  • Background Processing: Non-urgent tasks run asynchronously

7. Monitoring & Observability

7.1 Key Metrics

System tracks:

  • Performance: AKEL processing time, API response time, cache hit rate
  • Quality: Confidence score distribution, evidence completeness, contradiction rate
  • Usage: Claims per day, active users, API requests
  • Errors: Failed AKEL runs, API errors, database issues

7.2 Alerts

Automated alerts for:

  • Processing time >30 seconds (threshold breach)
  • Error rate >1% (quality issue)
  • Cache hit rate <80% (cache problem)
  • Database connections >80% capacity (scaling needed)
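
These thresholds translate directly into a rule table; the metric key names below are assumptions:

```python
# Alert rules mirror the thresholds above; metric values are sample inputs.
ALERT_RULES = [
    ("processing_time_s", lambda v: v > 30, "processing time >30s"),
    ("error_rate", lambda v: v > 0.01, "error rate >1%"),
    ("cache_hit_rate", lambda v: v < 0.80, "cache hit rate <80%"),
    ("db_conn_utilization", lambda v: v > 0.80, "database connections >80% capacity"),
]


def evaluate_alerts(metrics: dict) -> list[str]:
    """Return the human-readable alerts triggered by the current metrics."""
    return [msg for key, breached, msg in ALERT_RULES
            if key in metrics and breached(metrics[key])]
```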

7.3 Dashboards

Real-time monitoring:

  • System Health: Overall status and key metrics
  • AKEL Performance: Processing time breakdown
  • Quality Metrics: Confidence scores, completeness
  • User Activity: Usage patterns, peak times

8. Security Architecture

8.1 Authentication & Authorization

  • User Authentication: Secure login with password hashing
  • Role-Based Access: Reader, Contributor, Moderator, Admin
  • API Keys: For programmatic access
  • Rate Limiting: Prevent abuse
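
Rate limiting is typically a token bucket. This in-memory sketch shows the mechanics; production (per section 2.1) would keep the counters in Redis so limits hold across load-balanced API servers:

```python
import time


class TokenBucket:
    """In-memory token bucket: `capacity` requests of burst,
    refilled at `rate_per_s` tokens per second."""

    def __init__(self, rate_per_s: float, capacity: int):
        self.rate = rate_per_s
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```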

8.2 Data Security

  • Encryption: TLS for transport, encrypted storage for sensitive data
  • Audit Logging: Track all significant changes
  • Input Validation: Sanitize all user inputs
  • SQL Injection Protection: Parameterized queries
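
Parameterized queries keep user input out of the SQL text itself. A self-contained sqlite3 illustration (the `claims` table here is a stand-in, not the real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, text TEXT)")
conn.execute("INSERT INTO claims (text) VALUES (?)", ("The sky is blue",))


def find_claims(conn, user_input: str):
    # The ? placeholder binds user input as data, never as SQL, so a
    # payload like "' OR '1'='1" cannot alter the query.
    return conn.execute(
        "SELECT id, text FROM claims WHERE text = ?", (user_input,)
    ).fetchall()
```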

8.3 Abuse Prevention

  • Rate Limiting: Prevent flooding and DDoS
  • Automated Detection: Flag suspicious patterns
  • Human Review: Moderators investigate flagged content
  • Ban Mechanisms: Block abusive users/IPs

9. Deployment Architecture

9.1 Production Environment

Components:

  • Load Balancer (HAProxy or cloud LB)
  • Multiple API servers (stateless)
  • AKEL worker pool (auto-scaling)
  • PostgreSQL primary + read replicas
  • Redis cluster
  • S3-compatible storage
Regions: Single region for V1.0, multi-region when needed

9.2 Development & Staging

Development: Local Docker Compose setup
Staging: Scaled-down production replica
CI/CD: Automated testing and deployment

9.3 Disaster Recovery

  • Database Backups: Daily automated backups to S3
  • Point-in-Time Recovery: Transaction log archival
  • Replication: Real-time replication to standby
  • Recovery Time Objective: <4 hours

9.5 Federation Architecture Diagram

Federation Architecture
This diagram shows the complete federated architecture with node components and communication layers.

graph LR
  FH1[FactHarbor<br/>Instance 1]
  FH2[FactHarbor<br/>Instance 2]
  FH3[FactHarbor<br/>Instance 3]
  FH1 -.->|V1.0+:<br/>Sync claims| FH2
  FH2 -.->|V1.0+:<br/>Sync claims| FH3
  FH3 -.->|V1.0+:<br/>Sync claims| FH1
  U1[Users] --> FH1
  U2[Users] --> FH2
  U3[Users] --> FH3
  style FH1 fill:#e1f5ff
  style FH2 fill:#e1f5ff
  style FH3 fill:#e1f5ff

Federation Architecture - Future (V1.0+): Independent FactHarbor instances can sync claims for broader reach while maintaining local control.

10. Future Architecture Evolution

10.1 When to Add Complexity

See When to Add Complexity for specific triggers.
  • Elasticsearch: When PostgreSQL search is consistently >500ms
  • TimescaleDB: When metrics queries are consistently >1s
  • Federation: When 10,000+ users and explicit demand exist
  • Complex Reputation: When there are 100+ active contributors

10.2 Federation (V2.0+)

Deferred until:

  • Core product proven with 10,000+ users
  • User demand for decentralization
  • Single-node limits reached
See Federation & Decentralization for future plans.

11. Technology Stack Summary

Backend:

  • Python (FastAPI or Django)
  • PostgreSQL (primary database)
  • Redis (caching)

Frontend:

  • Modern JavaScript framework (React, Vue, or Svelte)
  • Server-side rendering for SEO

AI/LLM:

  • Multi-provider orchestration (Claude, GPT-4, local models)
  • Fallback and cross-checking support

Infrastructure:

  • Docker containers
  • Kubernetes or cloud platform auto-scaling
  • S3-compatible object storage

Monitoring:

  • Prometheus + Grafana
  • Structured logging (ELK or cloud logging)
  • Error tracking (Sentry)

12. Related Pages