Changes for page Architecture

Last modified by Robert Schaub on 2025/12/24 18:26

From version 3.3
edited by Robert Schaub
on 2025/12/24 18:26
Change comment: Renamed back-links.
To version 1.1
edited by Robert Schaub
on 2025/12/24 11:54
Change comment: Imported from XAR

Summary

Details

Page properties
Parent
... ... @@ -1,1 +1,1 @@
1 -Test.FactHarbor V0\.9\.103.Specification.WebHome
1 +FactHarbor.Specification.WebHome
Content
... ... @@ -1,9 +1,6 @@
1 1  = Architecture =
2 -
3 3  FactHarbor's architecture is designed for **simplicity, automation, and continuous improvement**.
4 -
5 5  == 1. Core Principles ==
6 -
7 7  * **AI-First**: AKEL (AI) is the primary system, humans supplement
8 8  * **Publish by Default**: No centralized approval (removed in V0.9.50), publish with confidence scores
9 9  * **System Over Data**: Fix algorithms, not individual outputs
... ... @@ -10,158 +10,65 @@
10 10  * **Measure Everything**: Quality metrics drive improvements
11 11  * **Scale Through Automation**: Minimal human intervention
12 12  * **Start Simple**: Add complexity only when metrics prove necessary
13 -
14 14  == 2. High-Level Architecture ==
15 -
16 16  {{include reference="FactHarbor.Specification.Diagrams.High-Level Architecture.WebHome"/}}
17 -
18 18  === 2.1 Three-Layer Architecture ===
19 -
20 20  FactHarbor uses a clean three-layer architecture:
21 -
22 22  ==== Interface Layer ====
23 -
24 24  Handles all user and system interactions:
25 -
26 26  * **Web UI**: Browse claims, view evidence, submit feedback
27 27  * **REST API**: Programmatic access for integrations
28 28  * **Authentication & Authorization**: User identity and permissions
29 29  * **Rate Limiting**: Protect against abuse
30 -
31 31  ==== Processing Layer ====
32 -
33 33  Core business logic and AI processing:
34 -
35 35  * **AKEL Pipeline**: AI-driven claim analysis (parallel processing)
36 -* Parse and extract claim components
37 -* Gather evidence from multiple sources
38 -* Check source track records
39 -* Extract scenarios from evidence
40 -* Synthesize verdicts
41 -* Calculate risk scores
42 -
43 -* **LLM Abstraction Layer**: Provider-agnostic AI access
44 -* Multi-provider support (Anthropic, OpenAI, Google, local models)
45 -* Automatic failover and rate limit handling
46 -* Per-stage model configuration
47 -* Cost optimization through provider selection
48 -* No vendor lock-in
23 + * Parse and extract claim components
24 + * Gather evidence from multiple sources
25 + * Check source track records
26 + * Extract scenarios from evidence
27 + * Synthesize verdicts
28 + * Calculate risk scores
49 49  * **Background Jobs**: Automated maintenance tasks
50 -* Source track record updates (weekly)
51 -* Cache warming and invalidation
52 -* Metrics aggregation
53 -* Data archival
30 + * Source track record updates (weekly)
31 + * Cache warming and invalidation
32 + * Metrics aggregation
33 + * Data archival
54 54  * **Quality Monitoring**: Automated quality checks
55 -* Anomaly detection
56 -* Contradiction detection
57 -* Completeness validation
35 + * Anomaly detection
36 + * Contradiction detection
37 + * Completeness validation
58 58  * **Moderation Detection**: Automated abuse detection
59 -* Spam identification
60 -* Manipulation detection
61 -* Flag suspicious activity
62 -
39 + * Spam identification
40 + * Manipulation detection
41 + * Flag suspicious activity
63 63  ==== Data & Storage Layer ====
64 -
65 65  Persistent data storage and caching:
66 -
67 67  * **PostgreSQL**: Primary database for all core data
68 -* Claims, evidence, sources, users
69 -* Scenarios, edits, audit logs
70 -* Built-in full-text search
71 -* Time-series capabilities for metrics
45 + * Claims, evidence, sources, users
46 + * Scenarios, edits, audit logs
47 + * Built-in full-text search
48 + * Time-series capabilities for metrics
72 72  * **Redis**: High-speed caching layer
73 -* Session data
74 -* Frequently accessed claims
75 -* API rate limiting
50 + * Session data
51 + * Frequently accessed claims
52 + * API rate limiting
76 76  * **S3 Storage**: Long-term archival
77 -* Old edit history (90+ days)
78 -* AKEL processing logs
79 -* Backup snapshots
54 + * Old edit history (90+ days)
55 + * AKEL processing logs
56 + * Backup snapshots
80 80  **Optional future additions** (add only when metrics prove necessary):
81 81  * **Elasticsearch**: If PostgreSQL full-text search becomes slow
82 82  * **TimescaleDB**: If metrics queries become a bottleneck
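As a concrete illustration of the Redis bullet above, API rate limiting can follow a fixed-window counter. This is a minimal sketch, assuming Python, with an in-memory dict standing in for Redis `INCR`/`EXPIRE` semantics; it is not the actual implementation:

{{code language="python"}}
import time

class FixedWindowRateLimiter:
    """Allow at most `limit` requests per `window` seconds per key.

    An in-memory dict stands in for Redis INCR/EXPIRE here.
    """

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.counters = {}

    def allow(self, key, now=None):
        if now is None:
            now = time.time()
        bucket = (key, int(now // self.window))  # current time window for this key
        self.counters[bucket] = self.counters.get(bucket, 0) + 1
        return self.counters[bucket] <= self.limit
{{/code}}

A production version would keep the counters in Redis so that all load-balanced API servers share the same window.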
83 -
84 -=== 2.2 LLM Abstraction Layer ===
85 -
86 -{{include reference="Test.FactHarbor V0\.9\.103.Specification.Diagrams.LLM Abstraction Architecture.WebHome"/}}
87 -
88 -**Purpose:** FactHarbor uses a provider-agnostic abstraction layer for all AI interactions, avoiding vendor lock-in and enabling flexible provider selection.
89 -
90 -**Multi-Provider Support:**
91 -
92 -* **Primary:** Anthropic Claude API (Haiku for extraction, Sonnet for analysis)
93 -* **Secondary:** OpenAI GPT API (automatic failover)
94 -* **Tertiary:** Google Vertex AI / Gemini
95 -* **Future:** Local models (Llama, Mistral) for on-premises deployments
96 -
97 -**Provider Interface:**
98 -
99 -* Abstract `LLMProvider` interface with `complete()`, `stream()`, `getName()`, `getCostPer1kTokens()`, `isAvailable()` methods
100 -* Per-stage model configuration (Stage 1: Haiku, Stage 2 & 3: Sonnet)
101 -* Environment variable and database configuration
102 -* Adapter pattern implementation (AnthropicProvider, OpenAIProvider, GoogleProvider)
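A minimal sketch of that interface in Python (the backend language). Method names follow the list above; the stub adapter body is illustrative, not the real Anthropic integration:

{{code language="python"}}
from abc import ABC, abstractmethod
from typing import Iterator

class LLMProvider(ABC):
    """Provider-agnostic interface; method names follow the spec above."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...

    @abstractmethod
    def stream(self, prompt: str) -> Iterator[str]: ...

    @abstractmethod
    def getName(self) -> str: ...

    @abstractmethod
    def getCostPer1kTokens(self) -> float: ...

    @abstractmethod
    def isAvailable(self) -> bool: ...

class AnthropicProvider(LLMProvider):
    """Stub adapter; a real one wraps the vendor SDK."""

    def complete(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"

    def stream(self, prompt: str) -> Iterator[str]:
        yield from self.complete(prompt).split()

    def getName(self) -> str:
        return "anthropic"

    def getCostPer1kTokens(self) -> float:
        return 0.003  # illustrative figure, not a real price

    def isAvailable(self) -> bool:
        return True
{{/code}}

OpenAIProvider and GoogleProvider would implement the same interface, which is what allows per-stage provider selection without touching AKEL code.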
103 -
104 -**Configuration:**
105 -
106 -* Runtime provider switching without code changes
107 -* Admin API for provider management (`POST /admin/v1/llm/configure`)
108 -* Per-stage cost optimization (use cheaper models for extraction, quality models for analysis)
109 -* Support for rate limit handling and cost tracking
110 -
111 -**Failover Strategy:**
112 -
113 -* Automatic fallback: Primary → Secondary → Tertiary
114 -* Circuit breaker pattern for unavailable providers
115 -* Health checking and provider availability monitoring
116 -* Graceful degradation when all providers unavailable
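The fallback chain can be sketched as follows (a simplified illustration; the real implementation would add the circuit breaker and health checks described above):

{{code language="python"}}
class ProviderUnavailable(Exception):
    """Raised when a provider cannot serve the request."""

def complete_with_failover(providers, prompt):
    """Try providers in priority order (primary -> secondary -> tertiary)."""
    last_error = None
    for provider in providers:
        try:
            return provider(prompt)
        except ProviderUnavailable as err:
            last_error = err  # fall through to the next provider
    raise ProviderUnavailable("all providers unavailable") from last_error

def primary(prompt):
    # Simulates an outage or rate limit on the primary provider.
    raise ProviderUnavailable("rate limited")

def secondary(prompt):
    return f"[secondary] {prompt}"
{{/code}}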
117 -
118 -**Cost Optimization:**
119 -
120 -* Track and compare costs across providers per request
121 -* Enable A/B testing of different models for quality/cost tradeoffs
122 -* Per-stage provider selection for optimal cost-efficiency
123 -* Cost comparison per article at a 0% cache-hit rate: Anthropic ($0.114), OpenAI ($0.065), Google ($0.072)
124 -
125 -**Architecture Pattern:**
126 -
127 -{{code}}
128 -AKEL Stages          LLM Abstraction          Providers
129 -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
130 -Stage 1 Extract  ──→ Provider Interface  ──→ Anthropic (PRIMARY)
131 -Stage 2 Analyze  ──→ Configuration       ──→ OpenAI (SECONDARY)
132 -Stage 3 Holistic ──→ Failover Handler    ──→ Google (TERTIARY)
133 -                                          └→ Local Models (FUTURE)
134 -{{/code}}
135 -
136 -**Benefits:**
137 -
138 -* **No Vendor Lock-In:** Switch providers based on cost, quality, or availability without code changes
139 -* **Resilience:** Automatic failover ensures service continuity during provider outages
140 -* **Cost Efficiency:** Use optimal provider per task (cheap for extraction, quality for analysis)
141 -* **Quality Assurance:** Cross-provider output verification for critical claims
142 -* **Regulatory Compliance:** Use specific providers for data residency requirements
143 -* **Future-Proofing:** Easy integration of new models as they become available
144 -
145 -**Cross-References:**
146 -
147 -* [[Requirements>>FactHarbor.Specification.Requirements.WebHome#NFR-14]]: NFR-14 (formal requirement)
148 -* [[POC Requirements>>FactHarbor.Specification.POC.Requirements#NFR-POC-11]]: NFR-POC-11 (POC1 implementation)
149 -* [[API Specification>>FactHarbor.Specification.POC.API-and-Schemas.WebHome#Section-6]]: Section 6 (implementation details)
150 -* [[Design Decisions>>FactHarbor.Specification.Design-Decisions#Section-9]]: Section 9 (design rationale)
151 -
152 152  === 2.2 Design Philosophy ===
153 -
154 154  **Start Simple, Evolve Based on Metrics**
155 155  The architecture deliberately starts simple:
156 -
157 157  * Single primary database (PostgreSQL handles most workloads initially)
158 158  * Three clear layers (easy to understand and maintain)
159 159  * Automated operations (minimal human intervention)
160 160  * Measure before optimizing (add complexity only when proven necessary)
161 161  See [[Design Decisions>>FactHarbor.Specification.Design-Decisions]] and [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for detailed rationale.
162 -
163 163  == 3. AKEL Architecture ==
164 -
165 165  {{include reference="FactHarbor.Specification.Diagrams.AKEL_Architecture.WebHome"/}}
166 166  See [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] for detailed information.
167 167  
... ... @@ -172,7 +172,6 @@
172 172  === Multi-Claim Handling ===
173 173  
174 174  Users often submit:
175 -
176 176  * **Text with multiple claims**: Articles, statements, or paragraphs containing several distinct factual claims
177 177  * **Web pages**: URLs that are analyzed to extract all verifiable claims
178 178  * **Single claims**: Simple, direct factual statements
... ... @@ -184,13 +184,11 @@
184 184  **POC Implementation (Two-Phase):**
185 185  
186 186  Phase 1 - Claim Extraction:
187 -
188 188  * LLM analyzes submitted content
189 189  * Extracts all distinct, verifiable claims
190 190  * Returns structured list of claims with context
191 191  
192 192  Phase 2 - Parallel Analysis:
193 -
194 194  * Each claim processed independently by LLM
195 195  * Single call per claim generates: Evidence, Scenarios, Sources, Verdict, Risk
196 196  * Parallelized across all claims
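The two-phase POC flow above can be sketched in Python; `extract_claims` and `analyze_claim` are hypothetical stand-ins for the two LLM calls:

{{code language="python"}}
from concurrent.futures import ThreadPoolExecutor

def extract_claims(content):
    # Phase 1 (stub): a real implementation asks the LLM to return
    # a structured list of distinct, verifiable claims with context.
    return [line.strip() for line in content.splitlines() if line.strip()]

def analyze_claim(claim):
    # Phase 2 (stub): one LLM call per claim yields evidence,
    # scenarios, sources, verdict, and risk in a single response.
    return {"claim": claim, "verdict": "unverified", "risk": 0.1}

def process_submission(content):
    claims = extract_claims(content)
    with ThreadPoolExecutor() as pool:  # parallel across independent claims
        return list(pool.map(analyze_claim, claims))
{{/code}}

Because the claims are independent, batch latency is dominated by the slowest single claim rather than the claim count.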
... ... @@ -199,19 +199,16 @@
199 199  **Production Implementation (Three-Phase):**
200 200  
201 201  Phase 1 - Extraction + Validation:
202 -
203 203  * Extract claims from content
204 204  * Validate clarity and uniqueness
205 205  * Filter vague or duplicate claims
206 206  
207 207  Phase 2 - Evidence Gathering (Parallel):
208 -
209 209  * Independent evidence gathering per claim
210 210  * Source validation and scenario generation
211 211  * Quality gates prevent poor data from advancing
212 212  
213 213  Phase 3 - Verdict Generation (Parallel):
214 -
215 215  * Generate verdict from validated evidence
216 216  * Confidence scoring and risk assessment
217 217  * Low-confidence cases routed to human review
... ... @@ -219,48 +219,35 @@
219 219  === Architectural Benefits ===
220 220  
221 221  **Scalability:**
222 -
223 -* Process 100 claims with 3x latency of single claim
120 +* Process 100 claims with ~3x latency of single claim
224 224  * Parallel processing across independent claims
225 225  * Linear cost scaling with claim count
226 226  
227 -=== 2.3 Design Philosophy ===
228 -
229 229  **Quality:**
230 -
231 231  * Validation gates between phases
232 232  * Errors isolated to individual claims
233 233  * Clear observability per processing step
234 234  
235 235  **Flexibility:**
236 -
237 237  * Each phase optimizable independently
238 238  * Can use different model sizes per phase
239 239  * Easy to add human review at decision points
240 240  
241 -== 4. Storage Architecture ==
242 242  
135 +== 4. Storage Architecture ==
243 243  {{include reference="FactHarbor.Specification.Diagrams.Storage Architecture.WebHome"/}}
244 244  See [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]] for detailed information.
245 -
246 246  == 4.5 Versioning Architecture ==
247 -
248 248  {{include reference="FactHarbor.Specification.Diagrams.Versioning Architecture.WebHome"/}}
249 -
250 250  == 5. Automated Systems in Detail ==
251 -
252 252  FactHarbor relies heavily on automation to achieve scale and quality. Here's how each automated system works:
253 -
254 254  === 5.1 AKEL (AI Knowledge Extraction Layer) ===
255 -
256 256  **What it does**: Primary AI processing engine that analyzes claims automatically
257 257  **Inputs**:
258 -
259 259  * User-submitted claim text
260 260  * Existing evidence and sources
261 261  * Source track record database
262 262  **Processing steps**:
263 -
264 264  1. **Parse & Extract**: Identify key components, entities, assertions
265 265  2. **Gather Evidence**: Search web and database for relevant sources
266 266  3. **Check Sources**: Evaluate source reliability using track records
... ... @@ -268,7 +268,6 @@
268 268  5. **Synthesize Verdict**: Compile evidence assessment per scenario
269 269  6. **Calculate Risk**: Assess potential harm and controversy
270 270  **Outputs**:
271 -
272 272  * Structured claim record
273 273  * Evidence links with relevance scores
274 274  * Scenarios with context descriptions
... ... @@ -276,11 +276,8 @@
276 276  * Overall confidence score
277 277  * Risk assessment
278 278  **Timing**: 10-18 seconds total (parallel processing)
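Taken together, the outputs amount to one structured record per claim. A sketch with illustrative field names (not the actual schema):

{{code language="python"}}
from dataclasses import dataclass, field

@dataclass
class ClaimRecord:
    """Illustrative shape of an AKEL result; field names are hypothetical."""
    claim: str
    evidence: list = field(default_factory=list)    # links with relevance scores
    scenarios: list = field(default_factory=list)   # context descriptions
    verdicts: dict = field(default_factory=dict)    # evidence assessment per scenario
    confidence: float = 0.0                         # overall confidence score
    risk: float = 0.0                               # harm / controversy assessment

record = ClaimRecord(
    claim="Water boils at 100 C",
    evidence=[{"url": "https://example.org", "relevance": 0.9}],
    scenarios=["at sea-level pressure"],
    verdicts={"at sea-level pressure": "supported"},
    confidence=0.92,
    risk=0.05,
)
{{/code}}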
279 -
280 280  === 5.2 Background Jobs ===
281 -
282 282  **Source Track Record Updates** (Weekly):
283 -
284 284  * Analyze claim outcomes from past week
285 285  * Calculate source accuracy and reliability
286 286  * Update source_track_record table
... ... @@ -297,120 +297,83 @@
297 297  * Move old AKEL logs to S3 (90+ days)
298 298  * Archive old edit history
299 299  * Compress and backup data
300 -
301 301  === 5.3 Quality Monitoring ===
302 -
303 303  **Automated checks run continuously**:
304 -
305 305  * **Anomaly Detection**: Flag unusual patterns
306 -* Sudden confidence score changes
307 -* Unusual evidence distributions
308 -* Suspicious source patterns
184 + * Sudden confidence score changes
185 + * Unusual evidence distributions
186 + * Suspicious source patterns
309 309  * **Contradiction Detection**: Identify conflicts
310 -* Evidence that contradicts other evidence
311 -* Claims with internal contradictions
312 -* Source track record anomalies
188 + * Evidence that contradicts other evidence
189 + * Claims with internal contradictions
190 + * Source track record anomalies
313 313  * **Completeness Validation**: Ensure thoroughness
314 -* Sufficient evidence gathered
315 -* Multiple source types represented
316 -* Key scenarios identified
317 -
192 + * Sufficient evidence gathered
193 + * Multiple source types represented
194 + * Key scenarios identified
318 318  === 5.4 Moderation Detection ===
319 -
320 320  **Automated abuse detection**:
321 -
322 322  * **Spam Identification**: Pattern matching for spam claims
323 323  * **Manipulation Detection**: Identify coordinated editing
324 324  * **Gaming Detection**: Flag attempts to game source scores
325 325  * **Suspicious Activity**: Log unusual behavior patterns
326 326  **Human Review**: Moderators review flagged items; the system learns from their decisions
327 -
328 328  == 6. Scalability Strategy ==
329 -
330 330  === 6.1 Horizontal Scaling ===
331 -
332 332  Components scale independently:
333 -
334 334  * **AKEL Workers**: Add more processing workers as claim volume grows
335 335  * **Database Read Replicas**: Add replicas for read-heavy workloads
336 336  * **Cache Layer**: Redis cluster for distributed caching
337 337  * **API Servers**: Load-balanced API instances
338 -
339 339  === 6.2 Vertical Scaling ===
340 -
341 341  Individual components can be upgraded:
342 -
343 343  * **Database Server**: Increase CPU/RAM for PostgreSQL
344 344  * **Cache Memory**: Expand Redis memory
345 345  * **Worker Resources**: More powerful AKEL worker machines
346 -
347 347  === 6.3 Performance Optimization ===
348 -
349 349  Built-in optimizations:
350 -
351 351  * **Denormalized Data**: Cache summary data in claim records (70% fewer joins)
352 352  * **Parallel Processing**: AKEL pipeline processes in parallel (40% faster)
353 353  * **Intelligent Caching**: Redis caches frequently accessed data
354 354  * **Background Processing**: Non-urgent tasks run asynchronously
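The "Intelligent Caching" bullet follows the cache-aside pattern. A minimal sketch, with a dict standing in for Redis and a stub loader standing in for the PostgreSQL query:

{{code language="python"}}
def get_claim(claim_id, cache, load_from_db):
    """Cache-aside: serve from cache, fall back to the database on a miss."""
    hit = cache.get(claim_id)
    if hit is not None:
        return hit
    value = load_from_db(claim_id)  # e.g. a PostgreSQL query
    cache[claim_id] = value         # populate for subsequent reads
    return value

calls = []

def load_from_db(cid):
    # Stub loader that records how often the database is actually hit.
    calls.append(cid)
    return {"id": cid, "verdict": "unverified"}

cache = {}
{{/code}}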
355 -
356 356  == 7. Monitoring & Observability ==
357 -
358 358  === 7.1 Key Metrics ===
359 -
360 360  System tracks:
361 -
362 362  * **Performance**: AKEL processing time, API response time, cache hit rate
363 363  * **Quality**: Confidence score distribution, evidence completeness, contradiction rate
364 364  * **Usage**: Claims per day, active users, API requests
365 365  * **Errors**: Failed AKEL runs, API errors, database issues
366 -
367 367  === 7.2 Alerts ===
368 -
369 369  Automated alerts for:
370 -
371 371  * Processing time >30 seconds (threshold breach)
372 372  * Error rate >1% (quality issue)
373 373  * Cache hit rate <80% (cache problem)
374 374  * Database connections >80% capacity (scaling needed)
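These thresholds reduce to simple predicate checks over current metric values. A sketch, with hypothetical metric names:

{{code language="python"}}
# Threshold rules mirroring the alert list above; metric names are illustrative.
ALERT_RULES = {
    "akel_processing_seconds": lambda v: v > 30,    # threshold breach
    "error_rate": lambda v: v > 0.01,               # quality issue
    "cache_hit_rate": lambda v: v < 0.80,           # cache problem
    "db_connection_usage": lambda v: v > 0.80,      # scaling needed
}

def fired_alerts(metrics):
    """Return the names of all metrics whose threshold rule is breached."""
    return [name for name, breached in ALERT_RULES.items()
            if name in metrics and breached(metrics[name])]
{{/code}}

In practice these rules would live in Prometheus alerting configuration rather than application code.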
375 -
376 376  === 7.3 Dashboards ===
377 -
378 378  Real-time monitoring:
379 -
380 380  * **System Health**: Overall status and key metrics
381 381  * **AKEL Performance**: Processing time breakdown
382 382  * **Quality Metrics**: Confidence scores, completeness
383 383  * **User Activity**: Usage patterns, peak times
384 -
385 385  == 8. Security Architecture ==
386 -
387 387  === 8.1 Authentication & Authorization ===
388 -
389 389  * **User Authentication**: Secure login with password hashing
390 390  * **Role-Based Access**: Reader, Contributor, Moderator, Admin
391 391  * **API Keys**: For programmatic access
392 392  * **Rate Limiting**: Prevent abuse
393 -
394 394  === 8.2 Data Security ===
395 -
396 396  * **Encryption**: TLS for transport, encrypted storage for sensitive data
397 397  * **Audit Logging**: Track all significant changes
398 398  * **Input Validation**: Sanitize all user inputs
399 399  * **SQL Injection Protection**: Parameterized queries
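To illustrate the parameterized-queries bullet: user input is bound as data rather than spliced into the SQL string. A sketch using SQLite for self-containment (the production database is PostgreSQL, but the pattern is identical):

{{code language="python"}}
import sqlite3

def find_claims(conn, author):
    # The ? placeholder binds `author` as a value; it is never
    # interpolated into the SQL text, so injection attempts fail.
    cur = conn.execute(
        "SELECT text FROM claims WHERE author = ?",
        (author,),
    )
    return [row[0] for row in cur.fetchall()]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (author TEXT, text TEXT)")
conn.execute("INSERT INTO claims VALUES (?, ?)", ("alice", "Water boils at 100 C"))
# A malicious-looking author string such as "alice' OR '1'='1" is
# matched literally and returns no rows.
{{/code}}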
400 -
401 401  === 8.3 Abuse Prevention ===
402 -
403 403  * **Rate Limiting**: Prevent flooding and DDoS
404 404  * **Automated Detection**: Flag suspicious patterns
405 405  * **Human Review**: Moderators investigate flagged content
406 406  * **Ban Mechanisms**: Block abusive users/IPs
407 -
408 408  == 9. Deployment Architecture ==
409 -
410 410  === 9.1 Production Environment ===
411 -
412 412  **Components**:
413 -
414 414  * Load Balancer (HAProxy or cloud LB)
415 415  * Multiple API servers (stateless)
416 416  * AKEL worker pool (auto-scaling)
... ... @@ -418,15 +418,11 @@
418 418  * Redis cluster
419 419  * S3-compatible storage
420 420  **Regions**: Single region for V1.0, multi-region when needed
421 -
422 422  === 9.2 Development & Staging ===
423 -
424 424  **Development**: Local Docker Compose setup
425 425  **Staging**: Scaled-down production replica
426 426  **CI/CD**: Automated testing and deployment
427 -
428 428  === 9.3 Disaster Recovery ===
429 -
430 430  * **Database Backups**: Daily automated backups to S3
431 431  * **Point-in-Time Recovery**: Transaction log archival
432 432  * **Replication**: Real-time replication to standby
... ... @@ -437,28 +437,20 @@
437 437  {{include reference="FactHarbor.Specification.Diagrams.Federation Architecture.WebHome"/}}
438 438  
439 439  == 10. Future Architecture Evolution ==
440 -
441 441  === 10.1 When to Add Complexity ===
442 -
443 443  See [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for specific triggers.
444 444  **Elasticsearch**: When PostgreSQL search consistently >500ms
445 -**TimescaleDB**: When metrics queries consistently >1s
283 +**TimescaleDB**: When metrics queries consistently >1s
446 446  **Federation**: When 10,000+ users and explicit demand
447 447  **Complex Reputation**: When 100+ active contributors
448 -
449 449  === 10.2 Federation (V2.0+) ===
450 -
451 451  **Deferred until**:
452 -
453 453  * Core product proven with 10,000+ users
454 454  * User demand for decentralization
455 455  * Single-node limits reached
456 456  See [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]] for future plans.
457 -
458 458  == 11. Technology Stack Summary ==
459 -
460 460  **Backend**:
461 -
462 462  * Python (FastAPI or Django)
463 463  * PostgreSQL (primary database)
464 464  * Redis (caching)
... ... @@ -476,9 +476,7 @@
476 476  * Prometheus + Grafana
477 477  * Structured logging (ELK or cloud logging)
478 478  * Error tracking (Sentry)
479 -
480 480  == 12. Related Pages ==
481 -
482 482  * [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
483 483  * [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]]
484 484  * [[Data Model>>FactHarbor.Specification.Data Model.WebHome]]