Changes for page Architecture

Last modified by Robert Schaub on 2025/12/24 18:26

From version 3.2

edited by Robert Schaub
on 2025/12/24 18:26

Change comment: Update document after refactoring.

To version 2.1

edited by Robert Schaub
on 2025/12/24 13:58

Change comment: Imported from XAR

Summary

Page properties (2 modified, 0 added, 0 removed)

Details

Page properties

Parent

@@ -1,1 +1,1 @@
--Test.FactHarbor V0\.9\.103.Specification.WebHome
++Test.FactHarbor.Specification.WebHome

Content

@@ -20,114 +20,43 @@
  ==== Processing Layer ====
  Core business logic and AI processing:
  * **AKEL Pipeline**: AI-driven claim analysis (parallel processing)
-- * Parse and extract claim components
-- * Gather evidence from multiple sources
-- * Check source track records
-- * Extract scenarios from evidence
-- * Synthesize verdicts
-- * Calculate risk scores
--
--* **LLM Abstraction Layer**: Provider-agnostic AI access
-- * Multi-provider support (Anthropic, OpenAI, Google, local models)
-- * Automatic failover and rate limit handling
-- * Per-stage model configuration
-- * Cost optimization through provider selection
-- * No vendor lock-in
++  * Parse and extract claim components
++  * Gather evidence from multiple sources
++  * Check source track records
++  * Extract scenarios from evidence
++  * Synthesize verdicts
++  * Calculate risk scores
  * **Background Jobs**: Automated maintenance tasks
-- * Source track record updates (weekly)
-- * Cache warming and invalidation
-- * Metrics aggregation
-- * Data archival
++  * Source track record updates (weekly)
++  * Cache warming and invalidation
++  * Metrics aggregation
++  * Data archival
  * **Quality Monitoring**: Automated quality checks
-- * Anomaly detection
-- * Contradiction detection
-- * Completeness validation
++  * Anomaly detection
++  * Contradiction detection
++  * Completeness validation
  * **Moderation Detection**: Automated abuse detection
-- * Spam identification
-- * Manipulation detection
-- * Flag suspicious activity
++  * Spam identification
++  * Manipulation detection
++  * Flag suspicious activity
  ==== Data & Storage Layer ====
  Persistent data storage and caching:
  * **PostgreSQL**: Primary database for all core data
-- * Claims, evidence, sources, users
-- * Scenarios, edits, audit logs
-- * Built-in full-text search
-- * Time-series capabilities for metrics
++  * Claims, evidence, sources, users
++  * Scenarios, edits, audit logs
++  * Built-in full-text search
++  * Time-series capabilities for metrics
  * **Redis**: High-speed caching layer
-- * Session data
-- * Frequently accessed claims
-- * API rate limiting
++  * Session data
++  * Frequently accessed claims
++  * API rate limiting
  * **S3 Storage**: Long-term archival
-- * Old edit history (90+ days)
-- * AKEL processing logs
-- * Backup snapshots
++  * Old edit history (90+ days)
++  * AKEL processing logs
++  * Backup snapshots
  **Optional future additions** (add only when metrics prove necessary):
  * **Elasticsearch**: If PostgreSQL full-text search becomes slow
  * **TimescaleDB**: If metrics queries become a bottleneck
--
--
--=== 2.2 LLM Abstraction Layer ===
--
--{{include reference="Test.FactHarbor.Specification.Diagrams.LLM Abstraction Architecture.WebHome"/}}
--
--**Purpose:** FactHarbor uses a provider-agnostic abstraction layer for all AI interactions, avoiding vendor lock-in and enabling flexible provider selection.
--
--**Multi-Provider Support:**
--* **Primary:** Anthropic Claude API (Haiku for extraction, Sonnet for analysis)
--* **Secondary:** OpenAI GPT API (automatic failover)
--* **Tertiary:** Google Vertex AI / Gemini
--* **Future:** Local models (Llama, Mistral) for on-premises deployments
--
--**Provider Interface:**
--* Abstract `LLMProvider` interface with `complete()`, `stream()`, `getName()`, `getCostPer1kTokens()`, `isAvailable()` methods
--* Per-stage model configuration (Stage 1: Haiku, Stage 2 & 3: Sonnet)
--* Environment variable and database configuration
--* Adapter pattern implementation (AnthropicProvider, OpenAIProvider, GoogleProvider)
--
--**Configuration:**
--* Runtime provider switching without code changes
--* Admin API for provider management (`POST /admin/v1/llm/configure`)
--* Per-stage cost optimization (use cheaper models for extraction, quality models for analysis)
--* Support for rate limit handling and cost tracking
--
--**Failover Strategy:**
--* Automatic fallback: Primary → Secondary → Tertiary
--* Circuit breaker pattern for unavailable providers
--* Health checking and provider availability monitoring
--* Graceful degradation when all providers unavailable
--
--**Cost Optimization:**
--* Track and compare costs across providers per request
--* Enable A/B testing of different models for quality/cost tradeoffs
--* Per-stage provider selection for optimal cost-efficiency
--* Cost comparison: Anthropic ($0.114), OpenAI ($0.065), Google ($0.072) per article at 0% cache
--
--**Architecture Pattern:**
--
--{{code}}
--AKEL Stages          LLM Abstraction       Providers
--━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
--Stage 1 Extract  ──→ Provider Interface ──→ Anthropic (PRIMARY)
--Stage 2 Analyze  ──→ Configuration      ──→ OpenAI (SECONDARY)
--Stage 3 Holistic ──→ Failover Handler   ──→ Google (TERTIARY)
--                                         └→ Local Models (FUTURE)
--{{/code}}
--
--**Benefits:**
--* **No Vendor Lock-In:** Switch providers based on cost, quality, or availability without code changes
--* **Resilience:** Automatic failover ensures service continuity during provider outages
--* **Cost Efficiency:** Use optimal provider per task (cheap for extraction, quality for analysis)
--* **Quality Assurance:** Cross-provider output verification for critical claims
--* **Regulatory Compliance:** Use specific providers for data residency requirements
--* **Future-Proofing:** Easy integration of new models as they become available
--
--**Cross-References:**
--* [[Requirements>>FactHarbor.Specification.Requirements.WebHome#NFR-14]]: NFR-14 (formal requirement)
--* [[POC Requirements>>FactHarbor.Specification.POC.Requirements#NFR-POC-11]]: NFR-POC-11 (POC1 implementation)
--* [[API Specification>>FactHarbor.Specification.POC.API-and-Schemas.WebHome#Section-6]]: Section 6 (implementation details)
--* [[Design Decisions>>FactHarbor.Specification.Design-Decisions#Section-9]]: Section 9 (design rationale)
--
--
  === 2.2 Design Philosophy ===
  **Start Simple, Evolve Based on Metrics**
  The architecture deliberately starts simple:
@@ -191,7 +191,7 @@
  * Process 100 claims with ~3x latency of single claim
  * Parallel processing across independent claims
  * Linear cost scaling with claim count
--=== 2.3 Design Philosophy ===
++
  **Quality:**
  * Validation gates between phases
  * Errors isolated to individual claims
@@ -202,6 +202,7 @@
  * Can use different model sizes per phase
  * Easy to add human review at decision points
++
  == 4. Storage Architecture ==
  {{include reference="FactHarbor.Specification.Diagrams.Storage Architecture.WebHome"/}}
  See [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]] for detailed information.
@@ -251,17 +251,17 @@
  === 5.3 Quality Monitoring ===
  **Automated checks run continuously**:
  * **Anomaly Detection**: Flag unusual patterns
-- * Sudden confidence score changes
-- * Unusual evidence distributions
-- * Suspicious source patterns
++  * Sudden confidence score changes
++  * Unusual evidence distributions
++  * Suspicious source patterns
  * **Contradiction Detection**: Identify conflicts
-- * Evidence that contradicts other evidence
-- * Claims with internal contradictions
-- * Source track record anomalies
++  * Evidence that contradicts other evidence
++  * Claims with internal contradictions
++  * Source track record anomalies
  * **Completeness Validation**: Ensure thoroughness
-- * Sufficient evidence gathered
-- * Multiple source types represented
-- * Key scenarios identified
++  * Sufficient evidence gathered
++  * Multiple source types represented
++  * Key scenarios identified
  === 5.4 Moderation Detection ===
  **Automated abuse detection**:
  * **Spam Identification**: Pattern matching for spam claims
@@ -350,7 +350,7 @@
  === 10.1 When to Add Complexity ===
  See [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for specific triggers.
  **Elasticsearch**: When PostgreSQL search consistently >500ms
--**TimescaleDB**: When metrics queries consistently >1s
++**TimescaleDB**: When metrics queries consistently >1s
  **Federation**: When 10,000+ users and explicit demand
  **Complex Reputation**: When 100+ active contributors
  === 10.2 Federation (V2.0+) ===
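
As an illustration of the section 10.1 triggers shown in the last hunk, the check could be sketched roughly as below. This is only a sketch under stated assumptions: the `Metrics` class, its field names, and the `complexity_triggers` function are hypothetical and do not appear in the specification, and "consistently" is assumed to mean that each latency value passed in is already a sustained measurement (for example, aggregated over a monitoring window) rather than a single sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Hypothetical snapshot of sustained operational metrics (names are assumptions)."""
    search_latency_ms: float        # sustained PostgreSQL full-text search latency
    metrics_query_latency_s: float  # sustained metrics query latency
    user_count: int                 # total users
    federation_requested: bool      # whether there is explicit demand for federation
    active_contributors: int        # currently active contributors


def complexity_triggers(m: Metrics) -> list[str]:
    """Return the optional additions whose section 10.1 thresholds are met."""
    triggered = []
    if m.search_latency_ms > 500:            # Elasticsearch: search consistently >500ms
        triggered.append("Elasticsearch")
    if m.metrics_query_latency_s > 1.0:      # TimescaleDB: metrics queries consistently >1s
        triggered.append("TimescaleDB")
    if m.user_count >= 10_000 and m.federation_requested:  # Federation: 10,000+ users and demand
        triggered.append("Federation")
    if m.active_contributors >= 100:         # Complex Reputation: 100+ active contributors
        triggered.append("Complex Reputation")
    return triggered


if __name__ == "__main__":
    snapshot = Metrics(
        search_latency_ms=620.0,
        metrics_query_latency_s=0.4,
        user_count=12_000,
        federation_requested=True,
        active_contributors=35,
    )
    print(complexity_triggers(snapshot))
```

Keeping the thresholds in one small function like this would make the "add only when metrics prove necessary" rule easy to audit, though how the sustained values are actually computed is left open here.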
