Changes for page POC Requirements
Last modified by Robert Schaub on 2026/02/08 08:25
To version 1.2
edited by Robert Schaub
on 2025/12/22 13:50
Change comment:
Update document after refactoring.
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -5,7 +5,7 @@
 **Goal:** Prove that AI can extract claims and determine verdicts automatically without human intervention

 {{info}}
-**Core Philosophy:** POC validates the [[Main Requirements>> Archive.FactHarbor.Specification.Requirements.WebHome]] through simplified implementation. All POC features map to formal FR/NFR requirements.
+**Core Philosophy:** POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through simplified implementation. All POC features map to formal FR/NFR requirements.
 {{/info}}


@@ -14,11 +14,9 @@
 === 1.1 What POC Tests ===

 **Core Question:**
-
 > Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?

 **What we're proving:**
-
 * AI can identify factual claims from text
 * AI can evaluate those claims with structured evidence
 * Quality gates can filter unreliable outputs
@@ -25,7 +25,6 @@
 * The core workflow is technically feasible

 **What we're NOT proving:**
-
 * Production-ready reliability (that's POC2)
 * User-facing features (that's Beta 0)
 * Full IFCN compliance (that's V1.0)
@@ -32,22 +32,22 @@

 === 1.2 Requirements Mapping ===

-POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>> Archive.FactHarbor.Specification.Requirements.WebHome]].
+POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].

 **Scope Summary:**
-
 * **In Scope:** 8 requirements (7 FRs + 1 NFR)
 * **Partial:** 3 NFRs (simplified versions)
 * **Out of Scope:** 19 requirements (deferred to later phases)

+
 == 2. Requirements Scope Matrix ==

 {{success}}
-**Requirements Traceability:** This matrix shows which [[Main Requirements>> Archive.FactHarbor.Specification.Requirements.WebHome]] are implemented in POC1, providing full traceability between POC and system requirements.
+**Requirements Traceability:** This matrix shows which [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] are implemented in POC1, providing full traceability between POC and system requirements.
 {{/success}}

 |=Requirement|=POC1 Status|=Implementation Level|=Notes
-|**CORE WORKFLOW**|||| \\
+|**CORE WORKFLOW**||||
 |FR1: Claim Extraction|✅ **In Scope**|Full|AKEL extracts claims from text
 |FR2: Claim Context|✅ **In Scope**|Basic|Context preserved with claim
 |FR3: Multiple Scenarios|✅ **In Scope**|Full|AKEL generates interpretation scenarios
@@ -55,12 +55,12 @@
 |FR5: Evidence Collection|✅ **In Scope**|Full|AKEL searches for evidence
 |FR6: Evidence Evaluation|✅ **In Scope**|Full|AKEL evaluates source reliability
 |FR7: Automated Verdicts|✅ **In Scope**|Full|AKEL computes verdicts with uncertainty
-|**QUALITY & RELIABILITY**|||| \\
+|**QUALITY & RELIABILITY**||||
 |NFR11: Quality Assurance|✅ **In Scope**|**Lite**|**2 gates only** (Gate 1 & 4)
 |NFR1: Performance|⚠️ **Partial**|Basic|Response time monitored, not optimized
 |NFR2: Scalability|⚠️ **Partial**|Single-thread|No concurrent processing
 |NFR3: Reliability|⚠️ **Partial**|Basic|Error handling, no retry logic
-|**DEFERRED TO LATER**|||| \\
+|**DEFERRED TO LATER**||||
 |FR8-FR13|❌ Out of Scope|N/A|User accounts, corrections, publishing
 |FR44-FR53|❌ Out of Scope|N/A|Advanced features (V1.0+)
 |NFR4: Security|❌ Out of Scope|N/A|POC2
@@ -68,6 +68,7 @@
 |NFR12: Security Controls|❌ Out of Scope|N/A|Beta 0
 |NFR13: Monitoring|❌ Out of Scope|N/A|POC2

+
 == 3. POC Simplifications ==

 === 3.1 FR1: Claim Extraction (Full Implementation) ===
@@ -75,7 +75,6 @@
 **Main Requirement:** AI extracts factual claims from input text

 **POC Implementation:**
-
 * ✅ AKEL extracts claims using LLM
 * ✅ Each claim includes original text reference
 * ✅ Claims are identified as factual/non-factual
@@ -82,17 +82,16 @@
 * ❌ No advanced claim parsing (added in POC2)

 **Acceptance Criteria:**
-
 * Extracts 3-5 claims from typical article
 * Identifies factual vs non-factual claims
 * Quality Gate 1 validates extraction

+
 === 3.2 FR3: Multiple Scenarios (Full Implementation) ===

 **Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims

 **POC Implementation:**
-
 * ✅ AKEL generates 2-3 scenarios per claim
 * ✅ Scenarios capture different interpretations
 * ✅ Each scenario is evaluated separately
@@ -99,17 +99,16 @@
 * ✅ Verdict considers all scenarios

 **Acceptance Criteria:**
-
 * Generates 2+ scenarios for ambiguous claims
 * Scenarios are meaningfully different
 * All scenarios are evaluated

+
 === 3.3 FR4: Analysis Summary (Basic Implementation) ===

 **Main Requirement:** Provide user-friendly summary of analysis

 **POC Implementation:**
-
 * ✅ Simple text summary generated
 * ❌ No rich formatting (added in Beta 0)
 * ❌ No visual elements (added in Beta 0)
@@ -127,12 +127,10 @@
 === 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===

 **Main Requirements:**
-
 * FR5: Collect supporting and opposing evidence
 * FR6: Evaluate evidence source reliability

 **POC Implementation:**
-
 * ✅ AKEL searches for evidence (web/knowledge base)
 * ✅ **Mandatory contradiction search** (finds opposing evidence)
 * ✅ Source reliability scoring
@@ -140,17 +140,16 @@
 * ❌ No advanced source verification (added in POC2)

 **Acceptance Criteria:**
-
 * Finds 2+ supporting evidence items
 * Finds 1+ opposing evidence (if exists)
 * Sources scored for reliability

+
 === 3.5 FR7: Automated Verdicts (Full Implementation) ===

 **Main Requirement:** AI computes verdicts with uncertainty quantification

 **POC Implementation:**
-
 * ✅ Probabilistic verdicts (0-100% confidence)
 * ✅ Uncertainty explicitly stated
 * ✅ Reasoning chain provided
@@ -165,11 +165,11 @@
 ```

 **Acceptance Criteria:**
-
 * Verdicts include probability (0-100%)
 * Uncertainty explicitly quantified
 * Reasoning chain explains verdict

+
 === 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===

 **Main Requirement:** Complete quality assurance with 7 quality gates
@@ -177,13 +177,11 @@
 **POC Implementation:** **2 gates only**

 **Quality Gate 1: Claim Validation**
-
 * ✅ Validates claim is factual and verifiable
 * ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
 * ✅ Provides clear rejection reason

 **Quality Gate 4: Verdict Confidence Assessment**
-
 * ✅ Validates ≥2 sources found
 * ✅ Validates quality score ≥0.6
 * ✅ Blocks low-confidence verdicts
@@ -190,7 +190,6 @@
 * ✅ Provides clear rejection reason

 **Out of Scope (POC2+):**
-
 * ❌ Gate 2: Evidence Relevance
 * ❌ Gate 3: Scenario Coherence
 * ❌ Gate 5: Source Diversity
@@ -203,13 +203,11 @@
 === 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===

 **Main Requirements:**
-
 * NFR1: Response time < 30 seconds
 * NFR2: Handle 1000+ concurrent users
 * NFR3: 99.9% uptime

 **POC Implementation:**
-
 * ⚠️ **Response time monitored** (not optimized)
 * ⚠️ **Single-threaded processing** (no concurrency)
 * ⚠️ **Basic error handling** (no advanced retry logic)
@@ -217,11 +217,11 @@
 **Rationale:** POC proves functionality. Performance optimization happens in POC2.

 **POC Acceptance:**
-
 * Analysis completes (no timeout requirement)
 * Errors don't crash system
 * Basic logging in place

+
 == 4. What's NOT in POC Scope ==

 === 4.1 User-Facing Features (Beta 0+) ===
@@ -231,7 +231,6 @@
 {{/warning}}

 **Out of Scope:**
-
 * ❌ User accounts and authentication (FR8)
 * ❌ User corrections system (FR9, FR45-46)
 * ❌ Public publishing interface (FR10)
@@ -245,7 +245,6 @@
 === 4.2 Advanced Features (V1.0+) ===

 **Out of Scope:**
-
 * ❌ IFCN compliance (FR47)
 * ❌ ClaimReview schema (FR48)
 * ❌ Archive.org integration (FR49)
@@ -260,7 +260,6 @@
 === 4.3 Production Requirements (POC2, Beta 0) ===

 **Out of Scope:**
-
 * ❌ Security controls (NFR4, NFR12)
 * ❌ Code maintainability (NFR5)
 * ❌ System monitoring (NFR13)
@@ -277,26 +277,21 @@

 For each analyzed claim, POC must produce:

-* \\
-** \\
-**1. Claim
+**1. Claim**
 * Original text
 * Classification (factual/non-factual/ambiguous)
 * If non-factual: Clear reason why

 **2. Scenarios** (if factual)
-
 * 2-3 interpretation scenarios
 * Each scenario clearly described

 **3. Evidence** (if factual)
-
 * Supporting evidence (2+ items)
 * Opposing evidence (if exists)
 * Source URLs and reliability scores

 **4. Verdict** (if factual)
-
 * Probability (0-100%)
 * Uncertainty quantification
 * Confidence level (LOW/MEDIUM/HIGH)
@@ -303,10 +303,10 @@
 * Reasoning chain

 **5. Quality Status**
-
 * Which gates passed/failed
 * If failed: Clear explanation why

+
 === 5.2 Example POC Output ===

 {{code language="json"}}
@@ -358,7 +358,6 @@
 POC is successful if:

 ✅ **FR1-FR7 Requirements Met:**
-
 1. Extracts 3-5 factual claims from test articles
 2. Generates 2-3 scenarios per ambiguous claim
 3. Finds supporting AND opposing evidence
@@ -366,21 +366,19 @@
 5. Provides clear reasoning chains

 ✅ **Quality Gates Work:**
-
 1. Gate 1 blocks non-factual claims (100% block rate)
 2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
 3. Clear rejection reasons provided

 ✅ **NFR11 Met:**
-
 1. Quality gates reduce hallucination rate
 2. Blocked outputs have clear explanations
 3. Quality metrics are logged

+
 === 6.2 Quality Thresholds ===

 **Minimum Acceptable:**
-
 * ≥70% of test claims correctly classified (factual/non-factual)
 * ≥60% of verdicts are reasonable (human evaluation)
 * Gate 1 blocks 100% of non-factual claims
@@ -387,17 +387,16 @@
 * Gate 4 blocks verdicts with <2 sources

 **Target:**
-
 * ≥80% claims correctly classified
 * ≥75% verdicts are reasonable
 * <10% false positives (blocking good claims)

+
 === 6.3 POC Decision Gate ===

 **After POC1, we decide:**

 **✅ PROCEED to POC2** if:
-
 * Success criteria met
 * Quality gates demonstrably improve output
 * Core workflow is technically sound
@@ -404,72 +404,65 @@
 * Clear path to production quality

 **⚠️ ITERATE POC1** if:
-
 * Success criteria partially met
 * Gates work but need tuning
 * Core issues identified but fixable

 **❌ PIVOT APPROACH** if:
-
 * Success criteria not met
 * Fundamental AI limitations discovered
 * Quality gates insufficient
 * Alternative approach needed

+
 == 7. Test Cases ==

 === 7.1 Happy Path ===

 **Test 1: Simple Factual Claim**
-
 * Input: "Paris is the capital of France"
-* Expected: Factual, 1 scenario, verdict 95% true
+* Expected: Factual, 1 scenario, verdict ~95% true

 **Test 2: Ambiguous Claim**
-
 * Input: "Switzerland has the highest income in Europe"
 * Expected: Factual, 2-3 scenarios, verdict with uncertainty

 **Test 3: Statistical Claim**
-
 * Input: "10% of people have condition X"
 * Expected: Factual, evidence with numbers, probabilistic verdict

+
 === 7.2 Edge Cases ===

 **Test 4: Opinion**
-
 * Input: "Paris is the best city"
 * Expected: Non-factual (opinion), blocked by Gate 1

 **Test 5: Prediction**
-
 * Input: "Bitcoin will reach $100,000 next year"
 * Expected: Non-factual (prediction), blocked by Gate 1

 **Test 6: Insufficient Evidence**
-
 * Input: Obscure factual claim with no sources
 * Expected: Blocked by Gate 4 (<2 sources)

+
 === 7.3 Quality Gate Tests ===

 **Test 7: Gate 1 Effectiveness**
-
 * Input: Mix of 10 factual + 10 non-factual claims
 * Expected: Gate 1 blocks all 10 non-factual (100% precision)

 **Test 8: Gate 4 Effectiveness**
-
 * Input: Claims with varying evidence availability
 * Expected: Gate 4 blocks low-confidence verdicts

+
 == 8. Technical Architecture (POC) ==

 === 8.1 Simplified Architecture ===

 **POC Tech Stack:**
-
 * **Frontend:** Simple web interface (Next.js + TypeScript)
 * **Backend:** Single API endpoint
 * **AI:** Claude API (Sonnet 4.5)
@@ -482,7 +482,6 @@
 === 8.2 AKEL Implementation ===

 **POC AKEL:**
-
 * Single-threaded processing
 * Synchronous API calls
 * No caching
@@ -490,7 +490,6 @@
 * Console logging

 **Full AKEL (POC2+):**
-
 * Multi-threaded processing
 * Async API calls
 * Evidence caching
@@ -497,6 +497,7 @@
 * Advanced error handling with retry
 * Structured logging + monitoring

+
 == 9. POC Philosophy ==

 {{info}}
@@ -505,67 +505,60 @@

 === 9.1 Core Principles ===

-* \\
-** \\
-**1. Prove Concept, Not Production
+**1. Prove Concept, Not Production**
 * POC validates AI can do the job
 * Production quality comes in POC2 and Beta 0
 * Focus on "does it work?" not "is it perfect?"

 **2. Implement Subset of Requirements**
-
 * POC covers FR1-7, NFR11 (lite)
 * All other requirements deferred
-* Clear mapping to [[Main Requirements>> Archive.FactHarbor.Specification.Requirements.WebHome]]
+* Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]

 **3. Quality Gates Validate Approach**
-
 * 2 gates prove the concept
 * Remaining 5 gates added in POC2
 * Gates must demonstrably improve quality

 **4. Iterate Based on Results**
-
 * POC results determine next steps
 * Decision gate after POC1
 * Flexibility to pivot if needed

-=== 9.2 Success ===

- Clear Path Forward ===
+=== 9.2 Success = Clear Path Forward ===

 POC succeeds if we can confidently answer:

 ✅ **Technical Feasibility:**
-
 * Can AI extract claims reliably?
 * Can AI find balanced evidence?
 * Can AI compute reasonable verdicts?

 ✅ **Quality Approach:**
-
 * Do quality gates improve output?
 * Can we measure and track quality?
 * Is the gate approach scalable?

 ✅ **Production Path:**
-
 * Is the core architecture sound?
 * What needs improvement for production?
 * Is POC2 the right next step?

+
 == 10. Related Pages ==

-* **[[Main Requirements>> Archive.FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
+* **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
 * **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
 * **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
-* **[[Implementation Roadmap>> Archive.FactHarbor2026\.01\.20.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
-* **[[User Needs>> Archive.FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)
+* **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
+* **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)

+
 **Document Owner:** Technical Team
 **Review Frequency:** After each POC iteration
 **Version History:**
-
 * v1.0 - Initial POC requirements
 * v2.0 - Updated after specification cross-check
 * v3.0 - Aligned with Main Requirements (FR/NFR IDs added)
+
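
For readers of this change, the Quality Gate 4 rule quoted in section 3.6 of the page (at least 2 sources, quality score of at least 0.6, with a clear rejection reason) could be sketched roughly as below. This is only an illustration in TypeScript, the language the page names for the POC stack. The type names, field names, and the `checkGate4` function are hypothetical and do not come from the page, and the 0..1 scale for the quality score is an assumption.

```typescript
// Illustrative sketch only. All names here are hypothetical; the page states
// just the two thresholds and that a rejection reason must be provided.
interface VerdictCandidate {
  sourceCount: number;  // number of evidence sources found for the claim
  qualityScore: number; // assumed to be on a 0..1 scale
}

interface GateResult {
  passed: boolean;
  reason?: string; // set only when the gate blocks the verdict
}

const MIN_SOURCES = 2;         // "Validates ≥2 sources found"
const MIN_QUALITY_SCORE = 0.6; // "Validates quality score ≥0.6"

function checkGate4(candidate: VerdictCandidate): GateResult {
  // Block when too few sources were found.
  if (candidate.sourceCount < MIN_SOURCES) {
    return {
      passed: false,
      reason: `Found ${candidate.sourceCount} source(s); at least ${MIN_SOURCES} are required`,
    };
  }
  // Block when the quality score is below the threshold.
  if (candidate.qualityScore < MIN_QUALITY_SCORE) {
    return {
      passed: false,
      reason: `Quality score ${candidate.qualityScore} is below ${MIN_QUALITY_SCORE}`,
    };
  }
  return { passed: true };
}
```

Under these assumptions, the input described in Test 6 (an obscure claim with no sources) would be blocked by the source-count check, whatever its quality score.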