Wiki source code of Architecture

Last modified by Robert Schaub on 2025/12/24 21:53

1 = Architecture =
2 FactHarbor's architecture is designed for **simplicity, automation, and continuous improvement**.
3 == 1. Core Principles ==
4 * **AI-First**: AKEL (the AI pipeline) is the primary system; humans supplement it
5 * **Publish by Default**: No centralized approval (removed in V0.9.50); content is published with confidence scores
6 * **System Over Data**: Fix algorithms, not individual outputs
7 * **Measure Everything**: Quality metrics drive improvements
8 * **Scale Through Automation**: Minimal human intervention
9 * **Start Simple**: Add complexity only when metrics prove it necessary
10 == 2. High-Level Architecture ==
11 {{include reference="FactHarbor.Specification.Diagrams.High-Level Architecture.WebHome"/}}
12 === 2.1 Three-Layer Architecture ===
13 FactHarbor uses a clean three-layer architecture:
14 ==== Interface Layer ====
15 Handles all user and system interactions:
16 * **Web UI**: Browse claims, view evidence, submit feedback
17 * **REST API**: Programmatic access for integrations
18 * **Authentication & Authorization**: User identity and permissions
19 * **Rate Limiting**: Protect against abuse
20 ==== Processing Layer ====
21 Core business logic and AI processing:
22 * **AKEL Pipeline**: AI-driven claim analysis (parallel processing)
23 ** Parse and extract claim components
24 ** Gather evidence from multiple sources
25 ** Check source track records
26 ** Extract scenarios from evidence
27 ** Synthesize verdicts
28 ** Calculate risk scores
29
30 * **LLM Abstraction Layer**: Provider-agnostic AI access
31 ** Multi-provider support (Anthropic, OpenAI, Google, local models)
32 ** Automatic failover and rate limit handling
33 ** Per-stage model configuration
34 ** Cost optimization through provider selection
35 ** No vendor lock-in
36 * **Background Jobs**: Automated maintenance tasks
37 ** Source track record updates (weekly)
38 ** Cache warming and invalidation
39 ** Metrics aggregation
40 ** Data archival
41 * **Quality Monitoring**: Automated quality checks
42 ** Anomaly detection
43 ** Contradiction detection
44 ** Completeness validation
45 * **Moderation Detection**: Automated abuse detection
46 ** Spam identification
47 ** Manipulation detection
48 ** Flag suspicious activity
49 ==== Data & Storage Layer ====
50 Persistent data storage and caching:
51 * **PostgreSQL**: Primary database for all core data
52 ** Claims, evidence, sources, users
53 ** Scenarios, edits, audit logs
54 ** Built-in full-text search
55 ** Time-series capabilities for metrics
56 * **Redis**: High-speed caching layer
57 ** Session data
58 ** Frequently accessed claims
59 ** API rate limiting
60 * **S3 Storage**: Long-term archival
61 ** Old edit history (90+ days)
62 ** AKEL processing logs
63 ** Backup snapshots
64 **Optional future additions** (add only when metrics prove it necessary):
65 * **Elasticsearch**: If PostgreSQL full-text search becomes slow
66 * **TimescaleDB**: If metrics queries become a bottleneck
67
68
69 === 2.2 LLM Abstraction Layer ===
70
71 {{include reference="FactHarbor.Specification.Diagrams.LLM Abstraction Architecture.WebHome"/}}
72
73 **Purpose:** FactHarbor uses a provider-agnostic abstraction layer for all AI interactions, avoiding vendor lock-in and enabling flexible provider selection.
74
75 **Multi-Provider Support:**
76 * **Primary:** Anthropic Claude API (Haiku for extraction, Sonnet for analysis)
77 * **Secondary:** OpenAI GPT API (automatic failover)
78 * **Tertiary:** Google Vertex AI / Gemini
79 * **Future:** Local models (Llama, Mistral) for on-premises deployments
80
81 **Provider Interface:**
82 * Abstract `LLMProvider` interface with `complete()`, `stream()`, `getName()`, `getCostPer1kTokens()`, `isAvailable()` methods
83 * Per-stage model configuration (Stage 1: Haiku, Stage 2 & 3: Sonnet)
84 * Environment variable and database configuration
85 * Adapter pattern implementation (AnthropicProvider, OpenAIProvider, GoogleProvider)
86
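The provider interface described above might be sketched as follows. This is a minimal illustration, not the actual implementation: the method names mirror the `complete()`, `getName()`, `getCostPer1kTokens()`, and `isAvailable()` list above (rendered in Python naming), and the prices and stub completion are placeholders.

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Provider-agnostic interface; adapters wrap each vendor SDK."""

    @abstractmethod
    def complete(self, prompt: str, model: str) -> str: ...

    @abstractmethod
    def get_name(self) -> str: ...

    @abstractmethod
    def get_cost_per_1k_tokens(self, model: str) -> float: ...

    @abstractmethod
    def is_available(self) -> bool: ...

class AnthropicProvider(LLMProvider):
    """Adapter sketch -- a real version would call the Anthropic SDK."""

    MODELS = {"haiku": 0.00025, "sonnet": 0.003}  # illustrative prices only

    def complete(self, prompt: str, model: str) -> str:
        return f"[{model} completion for: {prompt[:30]}]"  # stub response

    def get_name(self) -> str:
        return "anthropic"

    def get_cost_per_1k_tokens(self, model: str) -> float:
        return self.MODELS[model]

    def is_available(self) -> bool:
        return True

# Per-stage model configuration (Stage 1: Haiku, Stages 2-3: Sonnet)
STAGE_MODELS = {"extract": "haiku", "analyze": "sonnet", "holistic": "sonnet"}
```

OpenAIProvider and GoogleProvider would implement the same interface, which is what lets the pipeline swap providers per stage without code changes.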
87 **Configuration:**
88 * Runtime provider switching without code changes
89 * Admin API for provider management (`POST /admin/v1/llm/configure`)
90 * Per-stage cost optimization (use cheaper models for extraction, quality models for analysis)
91 * Support for rate limit handling and cost tracking
92
93 **Failover Strategy:**
94 * Automatic fallback: Primary → Secondary → Tertiary
95 * Circuit breaker pattern for unavailable providers
96 * Health checking and provider availability monitoring
97 * Graceful degradation when all providers unavailable
98
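As an illustration, the Primary → Secondary → Tertiary fallback might look like the sketch below. The `ProviderUnavailable` exception and the stub providers are hypothetical; a production version would add the circuit breaker and health checks listed above.

```python
class ProviderUnavailable(Exception):
    """Raised when a provider is down or rate-limited."""

def complete_with_failover(providers, prompt):
    """Try each provider in priority order; raise only if all fail."""
    errors = []
    for provider in providers:  # e.g. [anthropic, openai, google]
        try:
            return provider(prompt)
        except ProviderUnavailable as exc:
            errors.append(exc)  # record failure, fall through to next
    raise ProviderUnavailable(f"all providers failed: {errors}")

# Usage sketch: the primary is down, the secondary answers.
def primary(prompt):
    raise ProviderUnavailable("anthropic rate-limited")

def secondary(prompt):
    return f"openai: {prompt}"

result = complete_with_failover([primary, secondary], "verify claim X")
```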
99 **Cost Optimization:**
100 * Track and compare costs across providers per request
101 * Enable A/B testing of different models for quality/cost tradeoffs
102 * Per-stage provider selection for optimal cost-efficiency
103 * Cost comparison: Anthropic ($0.114), OpenAI ($0.065), Google ($0.072) per article at a 0% cache hit rate
104
105 **Architecture Pattern:**
106
107 {{code}}
108 AKEL Stages           LLM Abstraction            Providers
109 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
110 Stage 1 Extract  ──→  Provider Interface  ──→  Anthropic (PRIMARY)
111 Stage 2 Analyze  ──→  Configuration       ──→  OpenAI    (SECONDARY)
112 Stage 3 Holistic ──→  Failover Handler    ──→  Google    (TERTIARY)
113                                            └→  Local Models (FUTURE)
114 {{/code}}
115
116 **Benefits:**
117 * **No Vendor Lock-In:** Switch providers based on cost, quality, or availability without code changes
118 * **Resilience:** Automatic failover ensures service continuity during provider outages
119 * **Cost Efficiency:** Use optimal provider per task (cheap for extraction, quality for analysis)
120 * **Quality Assurance:** Cross-provider output verification for critical claims
121 * **Regulatory Compliance:** Use specific providers for data residency requirements
122 * **Future-Proofing:** Easy integration of new models as they become available
123
124 **Cross-References:**
125 * [[Requirements>>FactHarbor.Specification.Requirements.WebHome#NFR-14]]: NFR-14 (formal requirement)
126 * [[POC Requirements>>FactHarbor.Specification.POC.Requirements#NFR-POC-11]]: NFR-POC-11 (POC1 implementation)
127 * [[API Specification>>FactHarbor.Specification.POC.API-and-Schemas.WebHome#Section-6]]: Section 6 (implementation details)
128 * [[Design Decisions>>FactHarbor.Specification.Design-Decisions#Section-9]]: Section 9 (design rationale)
129
130
131 === 2.3 Design Philosophy ===
132 **Start Simple, Evolve Based on Metrics**
133 The architecture deliberately starts simple:
134 * Single primary database (PostgreSQL handles most workloads initially)
135 * Three clear layers (easy to understand and maintain)
136 * Automated operations (minimal human intervention)
137 * Measure before optimizing (add complexity only when proven necessary)
138 See [[Design Decisions>>FactHarbor.Specification.Design-Decisions]] and [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for detailed rationale.
139 == 3. AKEL Architecture ==
140 {{include reference="FactHarbor.Specification.Diagrams.AKEL Architecture.WebHome"/}}
141 See [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] for detailed information.
142
143 == 3.5 Claim Processing Architecture ==
144
145 FactHarbor's claim processing architecture is designed to handle both single-claim and multi-claim submissions efficiently.
146
147 === Multi-Claim Handling ===
148
149 Users often submit:
150 * **Text with multiple claims**: Articles, statements, or paragraphs containing several distinct factual claims
151 * **Web pages**: URLs that are analyzed to extract all verifiable claims
152 * **Single claims**: Simple, direct factual statements
153
154 The first processing step is always **Claim Extraction**: identifying and isolating individual verifiable claims from submitted content.
155
156 === Processing Phases ===
157
158 **POC Implementation (Two-Phase):**
159
160 Phase 1 - Claim Extraction:
161 * LLM analyzes submitted content
162 * Extracts all distinct, verifiable claims
163 * Returns structured list of claims with context
164
165 Phase 2 - Parallel Analysis:
166 * Each claim processed independently by LLM
167 * Single call per claim generates: Evidence, Scenarios, Sources, Verdict, Risk
168 * Parallelized across all claims
169 * Results aggregated for presentation
170
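The two-phase POC flow above can be sketched with `asyncio`; the extraction and analysis functions here are stand-ins for the actual LLM calls, and the naive sentence split is illustrative only.

```python
import asyncio

async def extract_claims(content: str) -> list[str]:
    # Phase 1 stand-in: a real implementation sends `content` to an LLM.
    return [c.strip() for c in content.split(".") if c.strip()]

async def analyze_claim(claim: str) -> dict:
    # Phase 2 stand-in: one LLM call per claim returning evidence,
    # scenarios, sources, verdict, and risk.
    await asyncio.sleep(0)  # placeholder for network latency
    return {"claim": claim, "verdict": "pending", "confidence": 0.0}

async def process_submission(content: str) -> list[dict]:
    claims = await extract_claims(content)                            # Phase 1
    return await asyncio.gather(*(analyze_claim(c) for c in claims))  # Phase 2, parallel

results = asyncio.run(process_submission("Sky is blue. Water boils at 100 C."))
```

Because Phase 2 calls are independent, total latency is dominated by the slowest single claim rather than the sum of all claims.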
171 **Production Implementation (Three-Phase):**
172
173 Phase 1 - Extraction + Validation:
174 * Extract claims from content
175 * Validate clarity and uniqueness
176 * Filter vague or duplicate claims
177
178 Phase 2 - Evidence Gathering (Parallel):
179 * Independent evidence gathering per claim
180 * Source validation and scenario generation
181 * Quality gates prevent poor data from advancing
182
183 Phase 3 - Verdict Generation (Parallel):
184 * Generate verdict from validated evidence
185 * Confidence scoring and risk assessment
186 * Low-confidence cases routed to human review
187
188 === Architectural Benefits ===
189
190 **Scalability:**
191 * Process 100 claims with ~3× the latency of a single claim
192 * Parallel processing across independent claims
193 * Linear cost scaling with claim count
195 **Quality:**
196 * Validation gates between phases
197 * Errors isolated to individual claims
198 * Clear observability per processing step
199
200 **Flexibility:**
201 * Each phase optimizable independently
202 * Can use different model sizes per phase
203 * Easy to add human review at decision points
204
205 == 4. Storage Architecture ==
206 {{include reference="FactHarbor.Specification.Diagrams.Storage Architecture.WebHome"/}}
207 See [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]] for detailed information.
208 == 4.5 Versioning Architecture ==
209 {{include reference="FactHarbor.Specification.Diagrams.Versioning Architecture.WebHome"/}}
210 == 5. Automated Systems in Detail ==
211 FactHarbor relies heavily on automation to achieve scale and quality. Here's how each automated system works:
212 === 5.1 AKEL (AI Knowledge Extraction Layer) ===
213 **What it does**: Primary AI processing engine that analyzes claims automatically
214 **Inputs**:
215 * User-submitted claim text
216 * Existing evidence and sources
217 * Source track record database
218 **Processing steps**:
219 1. **Parse & Extract**: Identify key components, entities, assertions
220 2. **Gather Evidence**: Search web and database for relevant sources
221 3. **Check Sources**: Evaluate source reliability using track records
222 4. **Extract Scenarios**: Identify different contexts from evidence
223 5. **Synthesize Verdict**: Compile evidence assessment per scenario
224 6. **Calculate Risk**: Assess potential harm and controversy
225 **Outputs**:
226 * Structured claim record
227 * Evidence links with relevance scores
228 * Scenarios with context descriptions
229 * Verdict summary per scenario
230 * Overall confidence score
231 * Risk assessment
232 **Timing**: 10-18 seconds total (parallel processing)
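The six processing steps above can be sketched as a pipeline. Every function body here is a placeholder for the real AI or search call, and the steps are shown sequentially for clarity even though the production pipeline parallelizes independent work.

```python
# Placeholder implementations; in FactHarbor each step is an AI/search call.
def parse_and_extract(text): return {"entities": [], "assertion": text}
def gather_evidence(record): return []
def check_sources(record): return {}
def extract_scenarios(record): return [{"context": "default"}]
def synthesize_verdict(record): return {"default": "insufficient evidence"}
def calculate_risk(record): return {"harm": "low", "controversy": "low"}

def run_akel(claim_text: str) -> dict:
    """Chain the six AKEL steps into a structured claim record."""
    record = {"claim": claim_text}
    record["components"] = parse_and_extract(claim_text)    # 1. Parse & Extract
    record["evidence"] = gather_evidence(record)            # 2. Gather Evidence
    record["source_scores"] = check_sources(record)         # 3. Check Sources
    record["scenarios"] = extract_scenarios(record)         # 4. Extract Scenarios
    record["verdicts"] = synthesize_verdict(record)         # 5. Synthesize Verdict
    record["risk"] = calculate_risk(record)                 # 6. Calculate Risk
    return record
```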
233 === 5.2 Background Jobs ===
234 **Source Track Record Updates** (Weekly):
235 * Analyze claim outcomes from past week
236 * Calculate source accuracy and reliability
237 * Update source_track_record table
238 * Never triggered by individual claims (prevents circular dependencies)
239 **Cache Management** (Continuous):
240 * Warm cache for popular claims
241 * Invalidate cache on claim updates
242 * Monitor cache hit rates
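The warm / invalidate-on-update behavior above can be illustrated with a toy cache; a plain dict stands in for Redis, and the class and method names are hypothetical.

```python
class ClaimCache:
    """Toy cache showing warm, invalidate, and hit-rate monitoring."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, claim_id):
        if claim_id in self._store:
            self.hits += 1
            return self._store[claim_id]
        self.misses += 1
        return None  # caller falls back to PostgreSQL

    def warm(self, claim_id, record):
        self._store[claim_id] = record   # pre-load popular claims

    def invalidate(self, claim_id):
        self._store.pop(claim_id, None)  # drop stale entry on claim update

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```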
243 **Metrics Aggregation** (Hourly):
244 * Roll up detailed metrics
245 * Calculate system health indicators
246 * Generate performance reports
247 **Data Archival** (Daily):
248 * Move old AKEL logs to S3 (90+ days)
249 * Archive old edit history
250 * Compress and backup data
251 === 5.3 Quality Monitoring ===
252 **Automated checks run continuously**:
253 * **Anomaly Detection**: Flag unusual patterns
254 ** Sudden confidence score changes
255 ** Unusual evidence distributions
256 ** Suspicious source patterns
257 * **Contradiction Detection**: Identify conflicts
258 ** Evidence that contradicts other evidence
259 ** Claims with internal contradictions
260 ** Source track record anomalies
261 * **Completeness Validation**: Ensure thoroughness
262 ** Sufficient evidence gathered
263 ** Multiple source types represented
264 ** Key scenarios identified
265 === 5.4 Moderation Detection ===
266 **Automated abuse detection**:
267 * **Spam Identification**: Pattern matching for spam claims
268 * **Manipulation Detection**: Identify coordinated editing
269 * **Gaming Detection**: Flag attempts to game source scores
270 * **Suspicious Activity**: Log unusual behavior patterns
271 **Human Review**: Moderators review flagged items, system learns from decisions
272 == 6. Scalability Strategy ==
273 === 6.1 Horizontal Scaling ===
274 Components scale independently:
275 * **AKEL Workers**: Add more processing workers as claim volume grows
276 * **Database Read Replicas**: Add replicas for read-heavy workloads
277 * **Cache Layer**: Redis cluster for distributed caching
278 * **API Servers**: Load-balanced API instances
279 === 6.2 Vertical Scaling ===
280 Individual components can be upgraded:
281 * **Database Server**: Increase CPU/RAM for PostgreSQL
282 * **Cache Memory**: Expand Redis memory
283 * **Worker Resources**: More powerful AKEL worker machines
284 === 6.3 Performance Optimization ===
285 Built-in optimizations:
286 * **Denormalized Data**: Cache summary data in claim records (70% fewer joins)
287 * **Parallel Processing**: AKEL pipeline processes in parallel (40% faster)
288 * **Intelligent Caching**: Redis caches frequently accessed data
289 * **Background Processing**: Non-urgent tasks run asynchronously
290 == 7. Monitoring & Observability ==
291 === 7.1 Key Metrics ===
292 System tracks:
293 * **Performance**: AKEL processing time, API response time, cache hit rate
294 * **Quality**: Confidence score distribution, evidence completeness, contradiction rate
295 * **Usage**: Claims per day, active users, API requests
296 * **Errors**: Failed AKEL runs, API errors, database issues
297 === 7.2 Alerts ===
298 Automated alerts for:
299 * Processing time >30 seconds (threshold breach)
300 * Error rate >1% (quality issue)
301 * Cache hit rate <80% (cache problem)
302 * Database connections >80% capacity (scaling needed)
303 === 7.3 Dashboards ===
304 Real-time monitoring:
305 * **System Health**: Overall status and key metrics
306 * **AKEL Performance**: Processing time breakdown
307 * **Quality Metrics**: Confidence scores, completeness
308 * **User Activity**: Usage patterns, peak times
309 == 8. Security Architecture ==
310 === 8.1 Authentication & Authorization ===
311 * **User Authentication**: Secure login with password hashing
312 * **Role-Based Access**: Reader, Contributor, Moderator, Admin
313 * **API Keys**: For programmatic access
314 * **Rate Limiting**: Prevent abuse
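Rate limiting is commonly implemented as a token bucket; the sketch below illustrates the idea (the capacity and refill rate are illustrative, not FactHarbor limits).

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilling `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock               # injectable for testing
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill tokens for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller responds with HTTP 429

bucket = TokenBucket(capacity=3, rate=1.0)  # burst of 3, 1 request/s sustained
```

In production this state would live in Redis (as listed under the caching layer) so all API servers share one limit per user or key.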
315 === 8.2 Data Security ===
316 * **Encryption**: TLS for transport, encrypted storage for sensitive data
317 * **Audit Logging**: Track all significant changes
318 * **Input Validation**: Sanitize all user inputs
319 * **SQL Injection Protection**: Parameterized queries
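Parameterized queries in practice: the example below uses `sqlite3` so it is self-contained, but the same placeholder pattern applies to PostgreSQL drivers (with `%s` placeholders). The table and values are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, text TEXT)")
conn.execute("INSERT INTO claims (text) VALUES (?)", ("Water boils at 100 C",))

# User input is passed as a bound parameter, never interpolated into the
# SQL string, so a hostile value is stored as inert text, not executed.
user_input = "x'; DROP TABLE claims; --"
conn.execute("INSERT INTO claims (text) VALUES (?)", (user_input,))

rows = conn.execute(
    "SELECT text FROM claims WHERE text = ?", (user_input,)
).fetchall()
```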
320 === 8.3 Abuse Prevention ===
321 * **Rate Limiting**: Prevent flooding and DDoS
322 * **Automated Detection**: Flag suspicious patterns
323 * **Human Review**: Moderators investigate flagged content
324 * **Ban Mechanisms**: Block abusive users/IPs
325 == 9. Deployment Architecture ==
326 === 9.1 Production Environment ===
327 **Components**:
328 * Load Balancer (HAProxy or cloud LB)
329 * Multiple API servers (stateless)
330 * AKEL worker pool (auto-scaling)
331 * PostgreSQL primary + read replicas
332 * Redis cluster
333 * S3-compatible storage
334 **Regions**: Single region for V1.0, multi-region when needed
335 === 9.2 Development & Staging ===
336 **Development**: Local Docker Compose setup
337 **Staging**: Scaled-down production replica
338 **CI/CD**: Automated testing and deployment
339 === 9.3 Disaster Recovery ===
340 * **Database Backups**: Daily automated backups to S3
341 * **Point-in-Time Recovery**: Transaction log archival
342 * **Replication**: Real-time replication to standby
343 * **Recovery Time Objective**: <4 hours
344
345 === 9.5 Federation Architecture Diagram ===
346
347 {{include reference="FactHarbor.Specification.Diagrams.Federation Architecture.WebHome"/}}
348
349 == 10. Future Architecture Evolution ==
350 === 10.1 When to Add Complexity ===
351 See [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for specific triggers.
352 **Elasticsearch**: When PostgreSQL search consistently >500ms
353 **TimescaleDB**: When metrics queries consistently >1s
354 **Federation**: When 10,000+ users and explicit demand
355 **Complex Reputation**: When 100+ active contributors
356 === 10.2 Federation (V2.0+) ===
357 **Deferred until**:
358 * Core product proven with 10,000+ users
359 * User demand for decentralization
360 * Single-node limits reached
361 See [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]] for future plans.
362 == 11. Technology Stack Summary ==
363 **Backend**:
364 * Python (FastAPI or Django)
365 * PostgreSQL (primary database)
366 * Redis (caching)
367 **Frontend**:
368 * Modern JavaScript framework (React, Vue, or Svelte)
369 * Server-side rendering for SEO
370 **AI/LLM**:
371 * Multi-provider orchestration (Claude, GPT-4, local models)
372 * Fallback and cross-checking support
373 **Infrastructure**:
374 * Docker containers
375 * Kubernetes or cloud platform auto-scaling
376 * S3-compatible object storage
377 **Monitoring**:
378 * Prometheus + Grafana
379 * Structured logging (ELK or cloud logging)
380 * Error tracking (Sentry)
381 == 12. Related Pages ==
382 * [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
383 * [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]]
384 * [[Data Model>>FactHarbor.Specification.Data Model.WebHome]]
385 * [[API Layer>>FactHarbor.Specification.Architecture.WebHome]]
386 * [[Design Decisions>>FactHarbor.Specification.Design-Decisions]]
387 * [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]]