Changes for page Architecture

Last modified by Robert Schaub on 2025/12/24 18:26

From version 3.1
edited by Robert Schaub
on 2025/12/24 17:59
Change comment: Imported from XAR
To version 3.3
edited by Robert Schaub
on 2025/12/24 18:26
Change comment: Renamed back-links.

Summary

Details

Page properties
Parent
... ... @@ -1,1 +1,1 @@
1 -Test.FactHarbor.Specification.WebHome
1 +Test.FactHarbor V0\.9\.103.Specification.WebHome
Content
... ... @@ -1,6 +1,9 @@
1 1  = Architecture =
2 +
2 2  FactHarbor's architecture is designed for **simplicity, automation, and continuous improvement**.
4 +
3 3  == 1. Core Principles ==
6 +
4 4  * **AI-First**: AKEL (AI) is the primary system; humans supplement it
5 5  * **Publish by Default**: No centralized approval (removed in V0.9.50); publish with confidence scores
6 6  * **System Over Data**: Fix algorithms, not individual outputs
... ... @@ -7,72 +7,85 @@
7 7  * **Measure Everything**: Quality metrics drive improvements
8 8  * **Scale Through Automation**: Minimal human intervention
9 9  * **Start Simple**: Add complexity only when metrics prove necessary
13 +
10 10  == 2. High-Level Architecture ==
15 +
11 11  {{include reference="FactHarbor.Specification.Diagrams.High-Level Architecture.WebHome"/}}
17 +
12 12  === 2.1 Three-Layer Architecture ===
19 +
13 13  FactHarbor uses a clean three-layer architecture:
21 +
14 14  ==== Interface Layer ====
23 +
15 15  Handles all user and system interactions:
25 +
16 16  * **Web UI**: Browse claims, view evidence, submit feedback
17 17  * **REST API**: Programmatic access for integrations
18 18  * **Authentication & Authorization**: User identity and permissions
19 19  * **Rate Limiting**: Protect against abuse
30 +
20 20  ==== Processing Layer ====
32 +
21 21  Core business logic and AI processing:
34 +
22 22  * **AKEL Pipeline**: AI-driven claim analysis (parallel processing)
23 - * Parse and extract claim components
24 - * Gather evidence from multiple sources
25 - * Check source track records
26 - * Extract scenarios from evidence
27 - * Synthesize verdicts
28 - * Calculate risk scores
36 +* Parse and extract claim components
37 +* Gather evidence from multiple sources
38 +* Check source track records
39 +* Extract scenarios from evidence
40 +* Synthesize verdicts
41 +* Calculate risk scores
29 29  
30 30  * **LLM Abstraction Layer**: Provider-agnostic AI access
31 - * Multi-provider support (Anthropic, OpenAI, Google, local models)
32 - * Automatic failover and rate limit handling
33 - * Per-stage model configuration
34 - * Cost optimization through provider selection
35 - * No vendor lock-in
44 +* Multi-provider support (Anthropic, OpenAI, Google, local models)
45 +* Automatic failover and rate limit handling
46 +* Per-stage model configuration
47 +* Cost optimization through provider selection
48 +* No vendor lock-in
36 36  * **Background Jobs**: Automated maintenance tasks
37 - * Source track record updates (weekly)
38 - * Cache warming and invalidation
39 - * Metrics aggregation
40 - * Data archival
50 +* Source track record updates (weekly)
51 +* Cache warming and invalidation
52 +* Metrics aggregation
53 +* Data archival
41 41  * **Quality Monitoring**: Automated quality checks
42 - * Anomaly detection
43 - * Contradiction detection
44 - * Completeness validation
55 +* Anomaly detection
56 +* Contradiction detection
57 +* Completeness validation
45 45  * **Moderation Detection**: Automated abuse detection
46 - * Spam identification
47 - * Manipulation detection
48 - * Flag suspicious activity
59 +* Spam identification
60 +* Manipulation detection
61 +* Flag suspicious activity
62 +
49 49  ==== Data & Storage Layer ====
64 +
50 50  Persistent data storage and caching:
66 +
51 51  * **PostgreSQL**: Primary database for all core data
52 - * Claims, evidence, sources, users
53 - * Scenarios, edits, audit logs
54 - * Built-in full-text search
55 - * Time-series capabilities for metrics
68 +* Claims, evidence, sources, users
69 +* Scenarios, edits, audit logs
70 +* Built-in full-text search
71 +* Time-series capabilities for metrics
56 56  * **Redis**: High-speed caching layer
57 - * Session data
58 - * Frequently accessed claims
59 - * API rate limiting
73 +* Session data
74 +* Frequently accessed claims
75 +* API rate limiting
60 60  * **S3 Storage**: Long-term archival
61 - * Old edit history (90+ days)
62 - * AKEL processing logs
63 - * Backup snapshots
77 +* Old edit history (90+ days)
78 +* AKEL processing logs
79 +* Backup snapshots
64 64  **Optional future additions** (add only when metrics prove necessary):
65 65  * **Elasticsearch**: If PostgreSQL full-text search becomes slow
66 66  * **TimescaleDB**: If metrics queries become a bottleneck
67 67  
68 -
69 69  === 2.2 LLM Abstraction Layer ===
70 70  
71 -{{include reference="Test.FactHarbor.Specification.Diagrams.LLM Abstraction Architecture.WebHome"/}}
86 +{{include reference="Test.FactHarbor V0\.9\.103.Specification.Diagrams.LLM Abstraction Architecture.WebHome"/}}
72 72  
73 73  **Purpose:** FactHarbor uses a provider-agnostic abstraction layer for all AI interactions, avoiding vendor lock-in and enabling flexible provider selection.
74 74  
75 75  **Multi-Provider Support:**
91 +
76 76  * **Primary:** Anthropic Claude API (Haiku for extraction, Sonnet for analysis)
77 77  * **Secondary:** OpenAI GPT API (automatic failover)
78 78  * **Tertiary:** Google Vertex AI / Gemini
... ... @@ -79,6 +79,7 @@
79 79  * **Future:** Local models (Llama, Mistral) for on-premises deployments
80 80  
81 81  **Provider Interface:**
98 +
82 82  * Abstract `LLMProvider` interface with `complete()`, `stream()`, `getName()`, `getCostPer1kTokens()`, `isAvailable()` methods
83 83  * Per-stage model configuration (Stage 1: Haiku, Stage 2 & 3: Sonnet)
84 84  * Environment variable and database configuration
... ... @@ -85,6 +85,7 @@
85 85  * Adapter pattern implementation (AnthropicProvider, OpenAIProvider, GoogleProvider)
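Taken together, the bullets above suggest an interface of roughly the following shape; a minimal Python sketch that keeps the method names from the spec, while the exact signatures and async style are assumptions:

{{code language="python"}}
from abc import ABC, abstractmethod
from typing import AsyncIterator

class LLMProvider(ABC):
    """Provider-agnostic LLM access; method names follow the spec above."""

    @abstractmethod
    async def complete(self, prompt: str, model: str) -> str:
        """Return a single completion for the prompt."""

    @abstractmethod
    def stream(self, prompt: str, model: str) -> AsyncIterator[str]:
        """Yield completion chunks as they arrive."""

    @abstractmethod
    def getName(self) -> str:
        """Stable provider identifier, e.g. 'anthropic'."""

    @abstractmethod
    def getCostPer1kTokens(self, model: str) -> float:
        """USD cost per 1,000 tokens for the given model."""

    @abstractmethod
    def isAvailable(self) -> bool:
        """Health-check result consumed by the failover logic."""
{{/code}}

Each adapter (AnthropicProvider, OpenAIProvider, GoogleProvider) would implement these methods around its vendor SDK.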
86 86  
87 87  **Configuration:**
105 +
88 88  * Runtime provider switching without code changes
89 89  * Admin API for provider management (`POST /admin/v1/llm/configure`)
90 90  * Per-stage cost optimization (use cheaper models for extraction, quality models for analysis)
... ... @@ -91,6 +91,7 @@
91 91  * Support for rate limit handling and cost tracking
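For illustration, switching a stage to a different provider through that admin endpoint might look as follows; only the path comes from the spec, while the host, payload fields, and auth scheme are placeholder assumptions:

{{code language="python"}}
import requests

# Hypothetical payload; the real schema is defined in the API specification.
payload = {
    "stage": "extraction",       # per-stage configuration
    "provider": "anthropic",
    "model": "claude-haiku",
}
resp = requests.post(
    "https://factharbor.example/admin/v1/llm/configure",  # placeholder host
    json=payload,
    headers={"Authorization": "Bearer <admin-api-key>"},
    timeout=10,
)
resp.raise_for_status()
{{/code}}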
92 92  
93 93  **Failover Strategy:**
112 +
94 94  * Automatic fallback: Primary → Secondary → Tertiary
95 95  * Circuit breaker pattern for unavailable providers
96 96  * Health checking and provider availability monitoring
... ... @@ -97,6 +97,7 @@
97 97  * Graceful degradation when all providers unavailable
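A minimal sketch of that chain, assuming the `LLMProvider` interface sketched above; a production version would also persist circuit-breaker state and emit health metrics:

{{code language="python"}}
class AllProvidersUnavailable(Exception):
    """Raised when the chain is exhausted; triggers the graceful-degradation path."""

async def complete_with_failover(providers, prompt, model):
    """Try Primary -> Secondary -> Tertiary, skipping providers marked down."""
    for provider in providers:  # ordered: Anthropic, OpenAI, Google
        if not provider.isAvailable():
            continue  # circuit breaker open: skip without spending a request
        try:
            return await provider.complete(prompt, model)
        except Exception:
            continue  # record the failure; repeated errors open the breaker
    raise AllProvidersUnavailable("all LLM providers failed or are marked down")
{{/code}}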
98 98  
99 99  **Cost Optimization:**
119 +
100 100  * Track and compare costs across providers per request
101 101  * Enable A/B testing of different models for quality/cost tradeoffs
102 102  * Per-stage provider selection for optimal cost-efficiency
... ... @@ -114,6 +114,7 @@
114 114  {{/code}}
115 115  
116 116  **Benefits:**
137 +
117 117  * **No Vendor Lock-In:** Switch providers based on cost, quality, or availability without code changes
118 118  * **Resilience:** Automatic failover ensures service continuity during provider outages
119 119  * **Cost Efficiency:** Use optimal provider per task (cheap for extraction, quality for analysis)
... ... @@ -122,21 +122,25 @@
122 122  * **Future-Proofing:** Easy integration of new models as they become available
123 123  
124 124  **Cross-References:**
146 +
125 125  * [[Requirements>>FactHarbor.Specification.Requirements.WebHome#NFR-14]]: NFR-14 (formal requirement)
126 126  * [[POC Requirements>>FactHarbor.Specification.POC.Requirements#NFR-POC-11]]: NFR-POC-11 (POC1 implementation)
127 127  * [[API Specification>>FactHarbor.Specification.POC.API-and-Schemas.WebHome#Section-6]]: Section 6 (implementation details)
128 128  * [[Design Decisions>>FactHarbor.Specification.Design-Decisions#Section-9]]: Section 9 (design rationale)
129 129  
130 -
131 131  === 2.3 Design Philosophy ===
153 +
132 132  **Start Simple, Evolve Based on Metrics**
133 133  The architecture deliberately starts simple:
156 +
134 134  * Single primary database (PostgreSQL handles most workloads initially)
135 135  * Three clear layers (easy to understand and maintain)
136 136  * Automated operations (minimal human intervention)
137 137  * Measure before optimizing (add complexity only when proven necessary)
138 138  See [[Design Decisions>>FactHarbor.Specification.Design-Decisions]] and [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for detailed rationale.
162 +
139 139  == 3. AKEL Architecture ==
164 +
140 140  {{include reference="FactHarbor.Specification.Diagrams.AKEL_Architecture.WebHome"/}}
141 141  See [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] for detailed information.
142 142  
... ... @@ -147,6 +147,7 @@
147 147  === Multi-Claim Handling ===
148 148  
149 149  Users often submit:
175 +
150 150  * **Text with multiple claims**: Articles, statements, or paragraphs containing several distinct factual claims
151 151  * **Web pages**: URLs that are analyzed to extract all verifiable claims
152 152  * **Single claims**: Simple, direct factual statements
... ... @@ -158,11 +158,13 @@
158 158  **POC Implementation (Two-Phase):**
159 159  
160 160  Phase 1 - Claim Extraction:
187 +
161 161  * LLM analyzes submitted content
162 162  * Extracts all distinct, verifiable claims
163 163  * Returns structured list of claims with context
164 164  
165 165  Phase 2 - Parallel Analysis:
193 +
166 166  * Each claim processed independently by LLM
167 167  * Single call per claim generates: Evidence, Scenarios, Sources, Verdict, Risk
168 168  * Parallelized across all claims
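As a sketch, the two phases map naturally onto `asyncio`; the helper functions here are hypothetical stand-ins for the actual LLM calls:

{{code language="python"}}
import asyncio

async def extract_claims(text: str) -> list[str]:
    ...  # hypothetical: one LLM call returning the distinct verifiable claims

async def analyze_claim(claim: str) -> dict:
    ...  # hypothetical: one LLM call yielding evidence, scenarios, sources, verdict, risk

async def process_submission(text: str) -> list[dict]:
    claims = await extract_claims(text)        # Phase 1: claim extraction
    results = await asyncio.gather(            # Phase 2: parallel analysis
        *(analyze_claim(c) for c in claims),
        return_exceptions=True,                # one failing claim must not sink the batch
    )
    return [r for r in results if not isinstance(r, Exception)]
{{/code}}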
... ... @@ -171,16 +171,19 @@
171 171  **Production Implementation (Three-Phase):**
172 172  
173 173  Phase 1 - Extraction + Validation:
202 +
174 174  * Extract claims from content
175 175  * Validate clarity and uniqueness
176 176  * Filter vague or duplicate claims
177 177  
178 178  Phase 2 - Evidence Gathering (Parallel):
208 +
179 179  * Independent evidence gathering per claim
180 180  * Source validation and scenario generation
181 181  * Quality gates prevent poor data from advancing
182 182  
183 183  Phase 3 - Verdict Generation (Parallel):
214 +
184 184  * Generate verdict from validated evidence
185 185  * Confidence scoring and risk assessment
186 186  * Low-confidence cases routed to human review
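The Phase 3 routing rule reduces to a small decision function; a sketch with an assumed threshold (the real cutoff would be tuned from quality metrics):

{{code language="python"}}
REVIEW_THRESHOLD = 0.7  # assumed value, not from the spec

def route_verdict(verdict: dict) -> str:
    """Publish by default; low-confidence verdicts go to human review."""
    if verdict["confidence"] < REVIEW_THRESHOLD:
        return "human_review"
    return "publish"
{{/code}}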
... ... @@ -188,34 +188,48 @@
188 188  === Architectural Benefits ===
189 189  
190 190  **Scalability:**
191 -* Process 100 claims with ~3x latency of single claim
222 +
223 +* Process 100 claims with 3x latency of single claim
192 192  * Parallel processing across independent claims
193 193  * Linear cost scaling with claim count
226 +
195 195  **Quality:**
230 +
196 196  * Validation gates between phases
197 197  * Errors isolated to individual claims
198 198  * Clear observability per processing step
199 199  
200 200  **Flexibility:**
236 +
201 201  * Each phase optimizable independently
202 202  * Can use different model sizes per phase
203 203  * Easy to add human review at decision points
204 204  
205 205  == 4. Storage Architecture ==
242 +
206 206  {{include reference="FactHarbor.Specification.Diagrams.Storage Architecture.WebHome"/}}
207 207  See [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]] for detailed information.
245 +
208 208  == 4.5 Versioning Architecture ==
247 +
209 209  {{include reference="FactHarbor.Specification.Diagrams.Versioning Architecture.WebHome"/}}
249 +
210 210  == 5. Automated Systems in Detail ==
251 +
211 211  FactHarbor relies heavily on automation to achieve scale and quality. Here's how each automated system works:
253 +
212 212  === 5.1 AKEL (AI Knowledge Extraction Layer) ===
255 +
213 213  **What it does**: Primary AI processing engine that analyzes claims automatically
214 214  **Inputs**:
258 +
215 215  * User-submitted claim text
216 216  * Existing evidence and sources
217 217  * Source track record database
218 218  **Processing steps**:
263 +
219 219  1. **Parse & Extract**: Identify key components, entities, assertions
220 220  2. **Gather Evidence**: Search web and database for relevant sources
221 221  3. **Check Sources**: Evaluate source reliability using track records
... ... @@ -223,6 +223,7 @@
223 223  5. **Synthesize Verdict**: Compile evidence assessment per scenario
224 224  6. **Calculate Risk**: Assess potential harm and controversy
225 225  **Outputs**:
271 +
226 226  * Structured claim record
227 227  * Evidence links with relevance scores
228 228  * Scenarios with context descriptions
... ... @@ -230,8 +230,11 @@
230 230  * Overall confidence score
231 231  * Risk assessment
232 232  **Timing**: 10-18 seconds total (parallel processing)
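Read as code, the six steps form a short pipeline; a sketch with hypothetical stage stubs, running the last two steps in parallel to stay inside the 10-18 second budget:

{{code language="python"}}
import asyncio

async def parse_claim(text): ...                               # hypothetical stage stubs
async def gather_evidence(components): ...
async def check_sources(evidence): ...
async def extract_scenarios(evidence): ...
async def synthesize_verdict(scenarios, evidence, sources): ...
async def calculate_risk(components, evidence): ...

async def run_akel(claim_text: str) -> dict:
    components = await parse_claim(claim_text)                 # 1. parse & extract
    evidence = await gather_evidence(components)               # 2. gather evidence
    sources = await check_sources(evidence)                    # 3. track-record check
    scenarios = await extract_scenarios(evidence)              # 4. scenario extraction
    verdict, risk = await asyncio.gather(                      # 5-6 run in parallel
        synthesize_verdict(scenarios, evidence, sources),
        calculate_risk(components, evidence),
    )
    return {"scenarios": scenarios, "verdict": verdict, "risk": risk}
{{/code}}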
279 +
233 233  === 5.2 Background Jobs ===
281 +
234 234  **Source Track Record Updates** (Weekly):
283 +
235 235  * Analyze claim outcomes from past week
236 236  * Calculate source accuracy and reliability
237 237  * Update source_track_record table
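A sketch of that weekly job; `source_track_record` is the table named above, while `claim_outcomes` and the column names are illustrative assumptions:

{{code language="python"}}
UPDATE_SQL = """
UPDATE source_track_record AS s
SET accuracy = sub.accuracy
FROM (
    SELECT source_id,
           AVG(CASE WHEN outcome = 'confirmed' THEN 1.0 ELSE 0.0 END) AS accuracy
    FROM claim_outcomes                -- hypothetical table of last week's outcomes
    WHERE decided_at >= now() - interval '7 days'
    GROUP BY source_id
) AS sub
WHERE s.source_id = sub.source_id;
"""

def update_source_track_records(conn) -> None:
    """Weekly job: recompute per-source accuracy from the past week's outcomes."""
    with conn:                         # commits on success (psycopg2-style connection)
        with conn.cursor() as cur:
            cur.execute(UPDATE_SQL)
{{/code}}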
... ... @@ -248,83 +248,120 @@
248 248  * Move old AKEL logs to S3 (90+ days)
249 249  * Archive old edit history
250 250  * Compress and backup data
300 +
251 251  === 5.3 Quality Monitoring ===
302 +
252 252  **Automated checks run continuously**:
304 +
253 253  * **Anomaly Detection**: Flag unusual patterns
254 - * Sudden confidence score changes
255 - * Unusual evidence distributions
256 - * Suspicious source patterns
306 +* Sudden confidence score changes
307 +* Unusual evidence distributions
308 +* Suspicious source patterns
257 257  * **Contradiction Detection**: Identify conflicts
258 - * Evidence that contradicts other evidence
259 - * Claims with internal contradictions
260 - * Source track record anomalies
310 +* Evidence that contradicts other evidence
311 +* Claims with internal contradictions
312 +* Source track record anomalies
261 261  * **Completeness Validation**: Ensure thoroughness
262 - * Sufficient evidence gathered
263 - * Multiple source types represented
264 - * Key scenarios identified
314 +* Sufficient evidence gathered
315 +* Multiple source types represented
316 +* Key scenarios identified
317 +
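For illustration, the completeness gate could be as small as this sketch; the thresholds and field names are assumptions rather than spec values:

{{code language="python"}}
MIN_EVIDENCE = 3         # assumed thresholds, tuned from metrics in practice
MIN_SOURCE_TYPES = 2

def completeness_issues(claim: dict) -> list[str]:
    """Return the completeness problems found for a processed claim."""
    issues = []
    if len(claim["evidence"]) < MIN_EVIDENCE:
        issues.append("insufficient evidence gathered")
    if len({e["source_type"] for e in claim["evidence"]}) < MIN_SOURCE_TYPES:
        issues.append("too few source types represented")
    if not claim["scenarios"]:
        issues.append("no key scenarios identified")
    return issues
{{/code}}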
265 265  === 5.4 Moderation Detection ===
319 +
266 266  **Automated abuse detection**:
321 +
267 267  * **Spam Identification**: Pattern matching for spam claims
268 268  * **Manipulation Detection**: Identify coordinated editing
269 269  * **Gaming Detection**: Flag attempts to game source scores
270 270  * **Suspicious Activity**: Log unusual behavior patterns
271 271  **Human Review**: Moderators review flagged items; the system learns from their decisions
327 +
272 272  == 6. Scalability Strategy ==
329 +
273 273  === 6.1 Horizontal Scaling ===
331 +
274 274  Components scale independently:
333 +
275 275  * **AKEL Workers**: Add more processing workers as claim volume grows
276 276  * **Database Read Replicas**: Add replicas for read-heavy workloads
277 277  * **Cache Layer**: Redis cluster for distributed caching
278 278  * **API Servers**: Load-balanced API instances
338 +
279 279  === 6.2 Vertical Scaling ===
340 +
280 280  Individual components can be upgraded:
342 +
281 281  * **Database Server**: Increase CPU/RAM for PostgreSQL
282 282  * **Cache Memory**: Expand Redis memory
283 283  * **Worker Resources**: More powerful AKEL worker machines
346 +
284 284  === 6.3 Performance Optimization ===
348 +
285 285  Built-in optimizations:
350 +
286 286  * **Denormalized Data**: Cache summary data in claim records (70% fewer joins)
287 287  * **Parallel Processing**: AKEL pipeline processes in parallel (40% faster)
288 288  * **Intelligent Caching**: Redis caches frequently accessed data
289 289  * **Background Processing**: Non-urgent tasks run asynchronously
355 +
290 290  == 7. Monitoring & Observability ==
357 +
291 291  === 7.1 Key Metrics ===
359 +
292 292  System tracks:
361 +
293 293  * **Performance**: AKEL processing time, API response time, cache hit rate
294 294  * **Quality**: Confidence score distribution, evidence completeness, contradiction rate
295 295  * **Usage**: Claims per day, active users, API requests
296 296  * **Errors**: Failed AKEL runs, API errors, database issues
366 +
297 297  === 7.2 Alerts ===
368 +
298 298  Automated alerts for:
370 +
299 299  * Processing time >30 seconds (threshold breach)
300 300  * Error rate >1% (quality issue)
301 301  * Cache hit rate <80% (cache problem)
302 302  * Database connections >80% capacity (scaling needed)
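These four thresholds translate directly into checkable rules; a Python sketch (in production they would more likely live in Prometheus alerting rules than in application code):

{{code language="python"}}
ALERT_RULES = [
    ("processing_time_s", lambda v: v > 30,   "processing time >30s"),
    ("error_rate",        lambda v: v > 0.01, "error rate >1%"),
    ("cache_hit_rate",    lambda v: v < 0.80, "cache hit rate <80%"),
    ("db_conn_usage",     lambda v: v > 0.80, "DB connections >80% capacity"),
]

def triggered_alerts(metrics: dict) -> list[str]:
    """Return the alerts breached by the current metric snapshot."""
    return [msg for key, breached, msg in ALERT_RULES
            if key in metrics and breached(metrics[key])]
{{/code}}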
375 +
303 303  === 7.3 Dashboards ===
377 +
304 304  Real-time monitoring:
379 +
305 305  * **System Health**: Overall status and key metrics
306 306  * **AKEL Performance**: Processing time breakdown
307 307  * **Quality Metrics**: Confidence scores, completeness
308 308  * **User Activity**: Usage patterns, peak times
384 +
309 309  == 8. Security Architecture ==
386 +
310 310  === 8.1 Authentication & Authorization ===
388 +
311 311  * **User Authentication**: Secure login with password hashing
312 312  * **Role-Based Access**: Reader, Contributor, Moderator, Admin
313 313  * **API Keys**: For programmatic access
314 314  * **Rate Limiting**: Prevent abuse
393 +
315 315  === 8.2 Data Security ===
395 +
316 316  * **Encryption**: TLS for transport, encrypted storage for sensitive data
317 317  * **Audit Logging**: Track all significant changes
318 318  * **Input Validation**: Sanitize all user inputs
319 319  * **SQL Injection Protection**: Parameterized queries
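As a minimal example of the last point; the table and column names are hypothetical, and the pattern is the placeholder that keeps user input out of the SQL text:

{{code language="python"}}
def claims_by_user(conn, user_id: str) -> list[tuple]:
    # The %s placeholder sends user_id separately from the SQL string,
    # so it can never be interpreted as SQL. Never build queries with
    # f-strings or concatenation; that is what invites injection.
    with conn.cursor() as cur:
        cur.execute("SELECT id, text FROM claims WHERE submitted_by = %s", (user_id,))
        return cur.fetchall()
{{/code}}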
400 +
320 320  === 8.3 Abuse Prevention ===
402 +
321 321  * **Rate Limiting**: Prevent flooding and DDoS
322 322  * **Automated Detection**: Flag suspicious patterns
323 323  * **Human Review**: Moderators investigate flagged content
324 324  * **Ban Mechanisms**: Block abusive users/IPs
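The rate limiter can reuse the Redis layer described in Section 2; a fixed-window sketch with assumed limits:

{{code language="python"}}
import redis

r = redis.Redis()        # the existing cache cluster
WINDOW_S = 60            # assumed window size
MAX_REQUESTS = 100       # assumed per-client budget per window

def allow_request(client_id: str) -> bool:
    """Fixed-window counter: INCR a per-client key that expires with the window."""
    key = f"ratelimit:{client_id}"
    count = r.incr(key)
    if count == 1:                 # first hit in this window
        r.expire(key, WINDOW_S)
    return count <= MAX_REQUESTS
{{/code}}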
407 +
325 325  == 9. Deployment Architecture ==
409 +
326 326  === 9.1 Production Environment ===
411 +
327 327  **Components**:
413 +
328 328  * Load Balancer (HAProxy or cloud LB)
329 329  * Multiple API servers (stateless)
330 330  * AKEL worker pool (auto-scaling)
... ... @@ -332,11 +332,15 @@
332 332  * Redis cluster
333 333  * S3-compatible storage
334 334  **Regions**: Single region for V1.0, multi-region when needed
421 +
335 335  === 9.2 Development & Staging ===
423 +
336 336  **Development**: Local Docker Compose setup
337 337  **Staging**: Scaled-down production replica
338 338  **CI/CD**: Automated testing and deployment
427 +
339 339  === 9.3 Disaster Recovery ===
429 +
340 340  * **Database Backups**: Daily automated backups to S3
341 341  * **Point-in-Time Recovery**: Transaction log archival
342 342  * **Replication**: Real-time replication to standby
... ... @@ -347,20 +347,28 @@
347 347  {{include reference="FactHarbor.Specification.Diagrams.Federation Architecture.WebHome"/}}
348 348  
349 349  == 10. Future Architecture Evolution ==
440 +
350 350  === 10.1 When to Add Complexity ===
442 +
351 351  See [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for specific triggers.
352 352  **Elasticsearch**: When PostgreSQL search consistently >500ms
353 353  **TimescaleDB**: When metrics queries consistently >1s
354 354  **Federation**: When 10,000+ users and explicit demand
355 355  **Complex Reputation**: When 100+ active contributors
448 +
356 356  === 10.2 Federation (V2.0+) ===
450 +
357 357  **Deferred until**:
452 +
358 358  * Core product proven with 10,000+ users
359 359  * User demand for decentralization
360 360  * Single-node limits reached
361 361  See [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]] for future plans.
457 +
362 362  == 11. Technology Stack Summary ==
459 +
363 363  **Backend**:
461 +
364 364  * Python (FastAPI or Django)
365 365  * PostgreSQL (primary database)
366 366  * Redis (caching)
... ... @@ -378,7 +378,9 @@
378 378  * Prometheus + Grafana
379 379  * Structured logging (ELK or cloud logging)
380 380  * Error tracking (Sentry)
479 +
381 381  == 12. Related Pages ==
481 +
382 382  * [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
383 383  * [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]]
384 384  * [[Data Model>>FactHarbor.Specification.Data Model.WebHome]]