Wiki source code of Architecture
Last modified by Robert Schaub on 2025/12/24 20:16
= Architecture =

FactHarbor's architecture is designed for **simplicity, automation, and continuous improvement**.

== 1. Core Principles ==

* **AI-First**: AKEL (AI) is the primary system; humans supplement it
* **Publish by Default**: No centralized approval (removed in V0.9.50); publish with confidence scores
* **System Over Data**: Fix algorithms, not individual outputs
* **Measure Everything**: Quality metrics drive improvements
* **Scale Through Automation**: Minimal human intervention
* **Start Simple**: Add complexity only when metrics prove necessary

== 2. High-Level Architecture ==

{{include reference="FactHarbor.Specification.Diagrams.High-Level Architecture.WebHome"/}}

=== 2.1 Three-Layer Architecture ===

FactHarbor uses a clean three-layer architecture:

==== Interface Layer ====

Handles all user and system interactions:

* **Web UI**: Browse claims, view evidence, submit feedback
* **REST API**: Programmatic access for integrations
* **Authentication & Authorization**: User identity and permissions
* **Rate Limiting**: Protect against abuse

==== Processing Layer ====

Core business logic and AI processing:

* **AKEL Pipeline**: AI-driven claim analysis (parallel processing)
** Parse and extract claim components
** Gather evidence from multiple sources
** Check source track records
** Extract scenarios from evidence
** Synthesize verdicts
** Calculate risk scores
* **LLM Abstraction Layer**: Provider-agnostic AI access
** Multi-provider support (Anthropic, OpenAI, Google, local models)
** Automatic failover and rate limit handling
** Per-stage model configuration
** Cost optimization through provider selection
** No vendor lock-in
* **Background Jobs**: Automated maintenance tasks
** Source track record updates (weekly)
** Cache warming and invalidation
** Metrics aggregation
** Data archival
* **Quality Monitoring**: Automated quality checks
** Anomaly detection
** Contradiction detection
** Completeness validation
* **Moderation Detection**: Automated abuse detection
** Spam identification
** Manipulation detection
** Flag suspicious activity

==== Data & Storage Layer ====

Persistent data storage and caching:

* **PostgreSQL**: Primary database for all core data
** Claims, evidence, sources, users
** Scenarios, edits, audit logs
** Built-in full-text search
** Time-series capabilities for metrics
* **Redis**: High-speed caching layer
** Session data
** Frequently accessed claims
** API rate limiting
* **S3 Storage**: Long-term archival
** Old edit history (90+ days)
** AKEL processing logs
** Backup snapshots
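The Redis layer described above typically follows the cache-aside pattern: check the cache, fall back to the database on a miss, populate the cache, and invalidate on claim updates. A minimal sketch, with all names hypothetical and a dict standing in for Redis:

```python
class ClaimCache:
    """Cache-aside sketch: try the cache, fall back to the DB, then populate."""

    def __init__(self, db_lookup):
        self.store: dict[str, dict] = {}   # stands in for Redis
        self.db_lookup = db_lookup         # e.g. a PostgreSQL query function
        self.hits = 0
        self.misses = 0

    def get_claim(self, claim_id: str) -> dict:
        if claim_id in self.store:
            self.hits += 1
            return self.store[claim_id]
        self.misses += 1
        claim = self.db_lookup(claim_id)
        self.store[claim_id] = claim       # warm the cache for next time
        return claim

    def invalidate(self, claim_id: str) -> None:
        self.store.pop(claim_id, None)     # drop stale entry on claim update

fake_db = lambda cid: {"id": cid, "verdict": "supported"}
cache = ClaimCache(fake_db)
cache.get_claim("c1")
cache.get_claim("c1")
print(cache.hits, cache.misses)  # 1 1
```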

**Optional future additions** (add only when metrics prove necessary):

* **Elasticsearch**: If PostgreSQL full-text search becomes slow
* **TimescaleDB**: If metrics queries become a bottleneck

=== 2.2 LLM Abstraction Layer ===

{{include reference="Test.FactHarbor V0\.9\.105.Specification.Diagrams.LLM Abstraction Architecture.WebHome"/}}

**Purpose:** FactHarbor uses a provider-agnostic abstraction layer for all AI interactions, avoiding vendor lock-in and enabling flexible provider selection.

**Multi-Provider Support:**

* **Primary:** Anthropic Claude API (Haiku for extraction, Sonnet for analysis)
* **Secondary:** OpenAI GPT API (automatic failover)
* **Tertiary:** Google Vertex AI / Gemini
* **Future:** Local models (Llama, Mistral) for on-premises deployments

**Provider Interface:**

* Abstract `LLMProvider` interface with `complete()`, `stream()`, `getName()`, `getCostPer1kTokens()`, `isAvailable()` methods
* Per-stage model configuration (Stage 1: Haiku, Stages 2 & 3: Sonnet)
* Environment variable and database configuration
* Adapter pattern implementation (AnthropicProvider, OpenAIProvider, GoogleProvider)
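The interface above could be sketched as a Python abstract base class. The method names come from the list above; everything else (return types, the stub adapter body, the illustrative cost figure) is an assumption:

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Provider-agnostic interface; concrete adapters wrap each vendor SDK."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...

    @abstractmethod
    def stream(self, prompt: str): ...

    @abstractmethod
    def getName(self) -> str: ...

    @abstractmethod
    def getCostPer1kTokens(self) -> float: ...

    @abstractmethod
    def isAvailable(self) -> bool: ...

class AnthropicProvider(LLMProvider):
    """Adapter sketch; a real implementation would call the Anthropic SDK."""

    def complete(self, prompt: str) -> str:
        return f"[anthropic completion for: {prompt}]"

    def stream(self, prompt: str):
        yield self.complete(prompt)  # real code would stream token chunks

    def getName(self) -> str:
        return "anthropic"

    def getCostPer1kTokens(self) -> float:
        return 0.003  # illustrative figure, not a quoted price

    def isAvailable(self) -> bool:
        return True

provider: LLMProvider = AnthropicProvider()
print(provider.getName(), provider.isAvailable())
```

Because callers only see `LLMProvider`, swapping `AnthropicProvider` for an `OpenAIProvider` or `GoogleProvider` adapter needs no changes in pipeline code.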

**Configuration:**

* Runtime provider switching without code changes
* Admin API for provider management (`POST /admin/v1/llm/configure`)
* Per-stage cost optimization (use cheaper models for extraction, quality models for analysis)
* Support for rate limit handling and cost tracking

**Failover Strategy:**

* Automatic fallback: Primary → Secondary → Tertiary
* Circuit breaker pattern for unavailable providers
* Health checking and provider availability monitoring
* Graceful degradation when all providers are unavailable
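The fallback chain and circuit breaker above could look roughly like this. This is a sketch under stated assumptions, not FactHarbor's implementation; provider calls are stubbed and all names are hypothetical:

```python
class CircuitBreaker:
    """Opens after `threshold` consecutive failures, skipping the provider."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

    def record(self, success: bool) -> None:
        self.failures = 0 if success else self.failures + 1

def complete_with_failover(prompt, providers, breakers):
    """Try providers in priority order: Primary -> Secondary -> Tertiary."""
    for name, call in providers:
        breaker = breakers.setdefault(name, CircuitBreaker())
        if breaker.open:
            continue  # provider recently failing; skip without calling it
        try:
            result = call(prompt)
            breaker.record(True)
            return name, result
        except Exception:
            breaker.record(False)  # fall through to the next provider
    raise RuntimeError("all providers unavailable")  # graceful-degradation point

def failing(prompt):  # stands in for an SDK call during an outage
    raise TimeoutError

providers = [("anthropic", failing), ("openai", lambda p: f"ok: {p}")]
breakers = {}
print(complete_with_failover("check claim", providers, breakers))
```

Once the primary's breaker opens, subsequent requests go straight to the secondary without paying the timeout cost.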

**Cost Optimization:**

* Track and compare costs across providers per request
* Enable A/B testing of different models for quality/cost tradeoffs
* Per-stage provider selection for optimal cost-efficiency
* Cost comparison: Anthropic ($0.114), OpenAI ($0.065), Google ($0.072) per article at 0% cache
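With the per-article figures quoted above, cost-driven selection reduces to picking the cheapest currently healthy provider. A sketch (the function name is hypothetical):

```python
# Per-article costs at 0% cache, from the comparison above.
COST_PER_ARTICLE = {"anthropic": 0.114, "openai": 0.065, "google": 0.072}

def cheapest_available(available: set[str]) -> str:
    """Pick the lowest-cost provider among those currently healthy."""
    candidates = {p: c for p, c in COST_PER_ARTICLE.items() if p in available}
    if not candidates:
        raise RuntimeError("no providers available")
    return min(candidates, key=candidates.get)

print(cheapest_available({"anthropic", "openai", "google"}))  # openai
print(cheapest_available({"anthropic", "google"}))            # google
```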

**Architecture Pattern:**

{{code}}
AKEL Stages           LLM Abstraction          Providers
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Stage 1 Extract  ──→  Provider Interface  ──→  Anthropic (PRIMARY)
Stage 2 Analyze  ──→  Configuration       ──→  OpenAI (SECONDARY)
Stage 3 Holistic ──→  Failover Handler    ──→  Google (TERTIARY)
                                           └→  Local Models (FUTURE)
{{/code}}

**Benefits:**

* **No Vendor Lock-In:** Switch providers based on cost, quality, or availability without code changes
* **Resilience:** Automatic failover ensures service continuity during provider outages
* **Cost Efficiency:** Use the optimal provider per task (cheap for extraction, quality for analysis)
* **Quality Assurance:** Cross-provider output verification for critical claims
* **Regulatory Compliance:** Use specific providers for data residency requirements
* **Future-Proofing:** Easy integration of new models as they become available

**Cross-References:**

* [[Requirements>>FactHarbor.Specification.Requirements.WebHome#NFR-14]]: NFR-14 (formal requirement)
* [[POC Requirements>>FactHarbor.Specification.POC.Requirements#NFR-POC-11]]: NFR-POC-11 (POC1 implementation)
* [[API Specification>>FactHarbor.Specification.POC.API-and-Schemas.WebHome#Section-6]]: Section 6 (implementation details)
* [[Design Decisions>>FactHarbor.Specification.Design-Decisions#Section-9]]: Section 9 (design rationale)

=== 2.3 Design Philosophy ===

**Start Simple, Evolve Based on Metrics**

The architecture deliberately starts simple:

* Single primary database (PostgreSQL handles most workloads initially)
* Three clear layers (easy to understand and maintain)
* Automated operations (minimal human intervention)
* Measure before optimizing (add complexity only when proven necessary)

See [[Design Decisions>>FactHarbor.Specification.Design-Decisions]] and [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for detailed rationale.

== 3. AKEL Architecture ==

{{include reference="FactHarbor.Specification.Diagrams.AKEL_Architecture.WebHome"/}}

See [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] for detailed information.

== 3.5 Claim Processing Architecture ==

FactHarbor's claim processing architecture is designed to handle both single-claim and multi-claim submissions efficiently.

=== Multi-Claim Handling ===

Users often submit:

* **Text with multiple claims**: Articles, statements, or paragraphs containing several distinct factual claims
* **Web pages**: URLs that are analyzed to extract all verifiable claims
* **Single claims**: Simple, direct factual statements

The first processing step is always **Claim Extraction**: identifying and isolating individual verifiable claims from submitted content.

=== Processing Phases ===

**POC Implementation (Two-Phase):**

Phase 1 - Claim Extraction:

* LLM analyzes submitted content
* Extracts all distinct, verifiable claims
* Returns structured list of claims with context

Phase 2 - Parallel Analysis:

* Each claim processed independently by LLM
* Single call per claim generates: Evidence, Scenarios, Sources, Verdict, Risk
* Parallelized across all claims
* Results aggregated for presentation
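The two-phase flow above can be sketched with `asyncio`, which gives the parallel fan-out across claims. Both phase functions are stand-ins for real LLM calls, and all names are hypothetical:

```python
import asyncio

async def extract_claims(text: str) -> list[str]:
    """Phase 1 stand-in: a real system would ask the LLM to split out claims."""
    return [c.strip() for c in text.split(".") if c.strip()]

async def analyze_claim(claim: str) -> dict:
    """Phase 2 stand-in: one LLM call per claim returning verdict and risk."""
    await asyncio.sleep(0)  # placeholder for network latency
    return {"claim": claim, "verdict": "needs-review", "risk": "low"}

async def process_submission(text: str) -> list[dict]:
    claims = await extract_claims(text)                       # Phase 1
    return await asyncio.gather(*map(analyze_claim, claims))  # Phase 2, parallel

results = asyncio.run(process_submission("The sky is blue. Water boils at 100C."))
print([r["claim"] for r in results])
```

Because each `analyze_claim` call is independent, wall-clock time is bounded by the slowest single claim rather than the sum of all claims.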

**Production Implementation (Three-Phase):**

Phase 1 - Extraction + Validation:

* Extract claims from content
* Validate clarity and uniqueness
* Filter vague or duplicate claims

Phase 2 - Evidence Gathering (Parallel):

* Independent evidence gathering per claim
* Source validation and scenario generation
* Quality gates prevent poor data from advancing

Phase 3 - Verdict Generation (Parallel):

* Generate verdict from validated evidence
* Confidence scoring and risk assessment
* Low-confidence cases routed to human review

=== Architectural Benefits ===

**Scalability:**

* Process 100 claims with roughly 3x the latency of a single claim
* Parallel processing across independent claims
* Linear cost scaling with claim count

**Quality:**

* Validation gates between phases
* Errors isolated to individual claims
* Clear observability per processing step

**Flexibility:**

* Each phase optimizable independently
* Can use different model sizes per phase
* Easy to add human review at decision points

== 4. Storage Architecture ==

{{include reference="FactHarbor.Specification.Diagrams.Storage Architecture.WebHome"/}}

See [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]] for detailed information.

== 4.5 Versioning Architecture ==

{{include reference="FactHarbor.Specification.Diagrams.Versioning Architecture.WebHome"/}}

== 5. Automated Systems in Detail ==

FactHarbor relies heavily on automation to achieve scale and quality. Here's how each automated system works:

=== 5.1 AKEL (AI Knowledge Extraction Layer) ===

**What it does**: Primary AI processing engine that analyzes claims automatically

**Inputs**:

* User-submitted claim text
* Existing evidence and sources
* Source track record database

**Processing steps**:

1. **Parse & Extract**: Identify key components, entities, assertions
2. **Gather Evidence**: Search web and database for relevant sources
3. **Check Sources**: Evaluate source reliability using track records
4. **Extract Scenarios**: Identify different contexts from evidence
5. **Synthesize Verdict**: Compile evidence assessment per scenario
6. **Calculate Risk**: Assess potential harm and controversy
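The six steps above can be sketched as a single pipeline function. Only the step order comes from the list; every step body here is a stand-in stub, and all field names are hypothetical:

```python
def akel_pipeline(claim_text: str) -> dict:
    """Sequential sketch of the six AKEL steps with stand-in stubs."""
    # 1. Parse & Extract: identify components, entities, assertions
    components = {"entities": [claim_text.split()[0]], "assertion": claim_text}
    # 2. Gather Evidence: search web and database for relevant sources
    evidence = [{"source": "example.org", "relevance": 0.8}]
    # 3. Check Sources: evaluate reliability using track records
    checked = [e for e in evidence if e["relevance"] >= 0.5]
    # 4. Extract Scenarios: identify different contexts from evidence
    scenarios = [{"context": "general", "evidence": checked}]
    # 5. Synthesize Verdict: compile evidence assessment per scenario
    verdicts = [{"scenario": s["context"], "assessment": "supported"}
                for s in scenarios]
    # 6. Calculate Risk: assess potential harm and controversy
    risk = "low" if len(checked) == len(evidence) else "elevated"
    return {"components": components, "scenarios": scenarios,
            "verdicts": verdicts, "risk": risk}

record = akel_pipeline("Water boils at 100C at sea level")
print(record["verdicts"][0]["assessment"], record["risk"])
```

In the real pipeline several of these steps run in parallel (see the timing note below the outputs list); the sketch keeps them sequential for readability.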

**Outputs**:

* Structured claim record
* Evidence links with relevance scores
* Scenarios with context descriptions
* Verdict summary per scenario
* Overall confidence score
* Risk assessment

**Timing**: 10-18 seconds total (parallel processing)

=== 5.2 Background Jobs ===

**Source Track Record Updates** (Weekly):

* Analyze claim outcomes from the past week
* Calculate source accuracy and reliability
* Update the source_track_record table
* Never triggered by individual claims (prevents circular dependencies)

**Cache Management** (Continuous):

* Warm cache for popular claims
* Invalidate cache on claim updates
* Monitor cache hit rates

**Metrics Aggregation** (Hourly):

* Roll up detailed metrics
* Calculate system health indicators
* Generate performance reports

**Data Archival** (Daily):

* Move old AKEL logs to S3 (90+ days)
* Archive old edit history
* Compress and back up data

=== 5.3 Quality Monitoring ===

**Automated checks run continuously**:

* **Anomaly Detection**: Flag unusual patterns
** Sudden confidence score changes
** Unusual evidence distributions
** Suspicious source patterns
* **Contradiction Detection**: Identify conflicts
** Evidence that contradicts other evidence
** Claims with internal contradictions
** Source track record anomalies
* **Completeness Validation**: Ensure thoroughness
** Sufficient evidence gathered
** Multiple source types represented
** Key scenarios identified

=== 5.4 Moderation Detection ===

**Automated abuse detection**:

* **Spam Identification**: Pattern matching for spam claims
* **Manipulation Detection**: Identify coordinated editing
* **Gaming Detection**: Flag attempts to game source scores
* **Suspicious Activity**: Log unusual behavior patterns

**Human Review**: Moderators review flagged items; the system learns from their decisions

== 6. Scalability Strategy ==

=== 6.1 Horizontal Scaling ===

Components scale independently:

* **AKEL Workers**: Add more processing workers as claim volume grows
* **Database Read Replicas**: Add replicas for read-heavy workloads
* **Cache Layer**: Redis cluster for distributed caching
* **API Servers**: Load-balanced API instances

=== 6.2 Vertical Scaling ===

Individual components can be upgraded:

* **Database Server**: Increase CPU/RAM for PostgreSQL
* **Cache Memory**: Expand Redis memory
* **Worker Resources**: More powerful AKEL worker machines

=== 6.3 Performance Optimization ===

Built-in optimizations:

* **Denormalized Data**: Cache summary data in claim records (70% fewer joins)
* **Parallel Processing**: AKEL pipeline processes in parallel (40% faster)
* **Intelligent Caching**: Redis caches frequently accessed data
* **Background Processing**: Non-urgent tasks run asynchronously

== 7. Monitoring & Observability ==

=== 7.1 Key Metrics ===

The system tracks:

* **Performance**: AKEL processing time, API response time, cache hit rate
* **Quality**: Confidence score distribution, evidence completeness, contradiction rate
* **Usage**: Claims per day, active users, API requests
* **Errors**: Failed AKEL runs, API errors, database issues

=== 7.2 Alerts ===

Automated alerts fire for:

* Processing time >30 seconds (threshold breach)
* Error rate >1% (quality issue)
* Cache hit rate <80% (cache problem)
* Database connections >80% of capacity (scaling needed)
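The thresholds above reduce to a simple evaluation function; a sketch with hypothetical metric names:

```python
def check_alerts(metrics: dict) -> list[str]:
    """Evaluate current metrics against the alert thresholds listed above."""
    alerts = []
    if metrics["processing_seconds"] > 30:
        alerts.append("processing time threshold breach")
    if metrics["error_rate"] > 0.01:
        alerts.append("error rate quality issue")
    if metrics["cache_hit_rate"] < 0.80:
        alerts.append("cache problem")
    if metrics["db_connection_usage"] > 0.80:
        alerts.append("database scaling needed")
    return alerts

healthy = {"processing_seconds": 12, "error_rate": 0.002,
           "cache_hit_rate": 0.93, "db_connection_usage": 0.40}
degraded = dict(healthy, processing_seconds=45, cache_hit_rate=0.70)
print(check_alerts(healthy))   # []
print(check_alerts(degraded))
```

In practice this logic would live in Prometheus alerting rules rather than application code; the function just makes the thresholds concrete.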

=== 7.3 Dashboards ===

Real-time monitoring:

* **System Health**: Overall status and key metrics
* **AKEL Performance**: Processing time breakdown
* **Quality Metrics**: Confidence scores, completeness
* **User Activity**: Usage patterns, peak times

== 8. Security Architecture ==

=== 8.1 Authentication & Authorization ===

* **User Authentication**: Secure login with password hashing
* **Role-Based Access**: Reader, Contributor, Moderator, Admin
* **API Keys**: For programmatic access
* **Rate Limiting**: Prevent abuse

=== 8.2 Data Security ===

* **Encryption**: TLS for transport, encrypted storage for sensitive data
* **Audit Logging**: Track all significant changes
* **Input Validation**: Sanitize all user inputs
* **SQL Injection Protection**: Parameterized queries
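Parameterized queries keep hostile input as inert data rather than executable SQL. A self-contained illustration using `sqlite3` as a stand-in for PostgreSQL (whose drivers use `%s` or `$1` placeholders instead of `?`, but the principle is identical):

```python
import sqlite3

# In-memory SQLite stands in for the real PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, text TEXT)")
conn.execute("INSERT INTO claims (text) VALUES (?)", ("The sky is blue",))

user_input = "blue'; DROP TABLE claims; --"  # hostile input stays inert data
rows = conn.execute(
    "SELECT id, text FROM claims WHERE text LIKE ?",  # parameterized, never f-strings
    (f"%{user_input}%",),
).fetchall()
print(rows)  # [] — the injection attempt matched nothing and executed nothing

remaining = conn.execute("SELECT COUNT(*) FROM claims").fetchone()[0]
print(remaining)  # table still intact
```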

=== 8.3 Abuse Prevention ===

* **Rate Limiting**: Prevent flooding and DDoS
* **Automated Detection**: Flag suspicious patterns
* **Human Review**: Moderators investigate flagged content
* **Ban Mechanisms**: Block abusive users/IPs

== 9. Deployment Architecture ==

=== 9.1 Production Environment ===

**Components**:

* Load Balancer (HAProxy or cloud LB)
* Multiple API servers (stateless)
* AKEL worker pool (auto-scaling)
* PostgreSQL primary + read replicas
* Redis cluster
* S3-compatible storage

**Regions**: Single region for V1.0, multi-region when needed

=== 9.2 Development & Staging ===

**Development**: Local Docker Compose setup
**Staging**: Scaled-down production replica
**CI/CD**: Automated testing and deployment

=== 9.3 Disaster Recovery ===

* **Database Backups**: Daily automated backups to S3
* **Point-in-Time Recovery**: Transaction log archival
* **Replication**: Real-time replication to standby
* **Recovery Time Objective**: <4 hours

=== 9.5 Federation Architecture Diagram ===

{{include reference="FactHarbor.Specification.Diagrams.Federation Architecture.WebHome"/}}

== 10. Future Architecture Evolution ==

=== 10.1 When to Add Complexity ===

See [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for specific triggers.

* **Elasticsearch**: When PostgreSQL search consistently >500ms
* **TimescaleDB**: When metrics queries consistently >1s
* **Federation**: When 10,000+ users and explicit demand
* **Complex Reputation**: When 100+ active contributors

=== 10.2 Federation (V2.0+) ===

**Deferred until**:

* Core product proven with 10,000+ users
* User demand for decentralization
* Single-node limits reached

See [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]] for future plans.

== 11. Technology Stack Summary ==

**Backend**:

* Python (FastAPI or Django)
* PostgreSQL (primary database)
* Redis (caching)

**Frontend**:

* Modern JavaScript framework (React, Vue, or Svelte)
* Server-side rendering for SEO

**AI/LLM**:

* Multi-provider orchestration (Claude, GPT-4, local models)
* Fallback and cross-checking support

**Infrastructure**:

* Docker containers
* Kubernetes or cloud platform auto-scaling
* S3-compatible object storage

**Monitoring**:

* Prometheus + Grafana
* Structured logging (ELK or cloud logging)
* Error tracking (Sentry)

== 12. Related Pages ==

* [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
* [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]]
* [[Data Model>>FactHarbor.Specification.Data Model.WebHome]]
* [[API Layer>>FactHarbor.Specification.Architecture.WebHome]]
* [[Design Decisions>>FactHarbor.Specification.Design-Decisions]]
* [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]]