Changes for page POC Requirements
Last modified by Robert Schaub on 2026/02/08 08:25
To version 1.2
edited by Robert Schaub
on 2025/12/22 14:38
Change comment:
Update document after refactoring.
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -1,9 +1,7 @@
-= POC Requirements =
-
-**Status:** ✅ Approved for Development **Version:** 3.0 (Aligned with Main Requirements) **Goal:** Prove that AI can extract claims and determine verdicts automatically without human intervention {{info}}**Core Philosophy:** POC validates the [[Main Requirements>>Archive.FactHarbor.Specification.Requirements.WebHome]] through simplified implementation. All POC features map to formal FR/NFR requirements.{{/info}} == 1. POC Overview == === 1.1 What POC Tests === **Core Question:**
-
+= POC Requirements = **Status:** ✅ Approved for Development **Version:** 3.0 (Aligned with Main Requirements) **Goal:** Prove that AI can extract claims and determine verdicts automatically without human intervention {{info}}
+**Core Philosophy:** POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through simplified implementation. All POC features map to formal FR/NFR requirements.
+{{/info}} == 1. POC Overview == === 1.1 What POC Tests === **Core Question:**
 > Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts? **What we're proving:**
-
 * AI can identify factual claims from text
 * AI can evaluate those claims with structured evidence
 * Quality gates can filter unreliable outputs
@@ -10,12 +10,13 @@
 * The core workflow is technically feasible **What we're NOT proving:**
 * Production-ready reliability (that's POC2)
 * User-facing features (that's Beta 0)
-* Full IFCN compliance (that's V1.0) === 1.2 Requirements Mapping === POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>Archive.FactHarbor.Specification.Requirements.WebHome]]. **Scope Summary:**
+* Full IFCN compliance (that's V1.0) === 1.2 Requirements Mapping === POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]. **Scope Summary:**
 * **In Scope:** 8 requirements (7 FRs + 1 NFR)
 * **Partial:** 3 NFRs (simplified versions)
-* **Out of Scope:** 19 requirements (deferred to later phases) == 2. Requirements Scope Matrix == {{success}}**Requirements Traceability:** This matrix shows which [[Main Requirements>>Archive.FactHarbor.Specification.Requirements.WebHome]] are implemented in POC1, providing full traceability between POC and system requirements.{{/success}} |=Requirement|=POC1 Status|=Implementation Level|=Notes
-
-|**CORE WORKFLOW**||||\\
+* **Out of Scope:** 19 requirements (deferred to later phases) == 2. Requirements Scope Matrix == {{success}}
+**Requirements Traceability:** This matrix shows which [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] are implemented in POC1, providing full traceability between POC and system requirements.
+{{/success}} |=Requirement|=POC1 Status|=Implementation Level|=Notes
+|**CORE WORKFLOW**||||
 |FR1: Claim Extraction|✅ **In Scope**|Full|AKEL extracts claims from text
 |FR2: Claim Context|✅ **In Scope**|Basic|Context preserved with claim
 |FR3: Multiple Scenarios|✅ **In Scope**|Full|AKEL generates interpretation scenarios
@@ -23,12 +23,12 @@
 |FR5: Evidence Collection|✅ **In Scope**|Full|AKEL searches for evidence
 |FR6: Evidence Evaluation|✅ **In Scope**|Full|AKEL evaluates source reliability
 |FR7: Automated Verdicts|✅ **In Scope**|Full|AKEL computes verdicts with uncertainty
-|**QUALITY & RELIABILITY**||||\\
+|**QUALITY & RELIABILITY**||||
 |NFR11: Quality Assurance|✅ **In Scope**|**Lite**|**2 gates only** (Gate 1 & 4)
 |NFR1: Performance|⚠️ **Partial**|Basic|Response time monitored, not optimized
 |NFR2: Scalability|⚠️ **Partial**|Single-thread|No concurrent processing
 |NFR3: Reliability|⚠️ **Partial**|Basic|Error handling, no retry logic
-|**DEFERRED TO LATER**||||\\
+|**DEFERRED TO LATER**||||
 |FR8-FR13|❌ Out of Scope|N/A|User accounts, corrections, publishing
 |FR44-FR53|❌ Out of Scope|N/A|Advanced features (V1.0+)
 |NFR4: Security|❌ Out of Scope|N/A|POC2
@@ -35,7 +35,6 @@
 |NFR5: Maintainability|❌ Out of Scope|N/A|POC2
 |NFR12: Security Controls|❌ Out of Scope|N/A|Beta 0
 |NFR13: Monitoring|❌ Out of Scope|N/A|POC2 == 3. POC Simplifications == === 3.1 FR1: Claim Extraction (Full Implementation) === **Main Requirement:** AI extracts factual claims from input text **POC Implementation:**
-
 * ✅ AKEL extracts claims using LLM
 * ✅ Each claim includes original text reference
 * ✅ Claims are identified as factual/non-factual
@@ -103,7 +103,9 @@
 * ⚠️ **Basic error handling** (no advanced retry logic) **Rationale:** POC proves functionality. Performance optimization happens in POC2. **POC Acceptance:**
 * Analysis completes (no timeout requirement)
 * Errors don't crash system
-* Basic logging in place == 4. What's NOT in POC Scope == === 4.1 User-Facing Features (Beta 0+) === {{warning}}**Deferred to Beta 0:**{{/warning}} **Out of Scope:**
+* Basic logging in place == 4. What's NOT in POC Scope == === 4.1 User-Facing Features (Beta 0+) === {{warning}}
+**Deferred to Beta 0:**
+{{/warning}} **Out of Scope:**
 * ❌ User accounts and authentication (FR8)
 * ❌ User corrections system (FR9, FR45-46)
 * ❌ Public publishing interface (FR10)
@@ -136,9 +136,12 @@
 * Confidence level (LOW/MEDIUM/HIGH)
 * Reasoning chain **5. Quality Status**
 * Which gates passed/failed
-* If failed: Clear explanation why === 5.2 Example POC Output === {{code language="json"}}{ "claim": { "text": "Switzerland has the highest life expectancy in Europe", "type": "factual", "gate1_status": "PASS" }, "scenarios": [ "Switzerland's overall life expectancy is highest", "Switzerland ranks highest for specific age groups" ], "evidence": { "supporting": [ { "source": "WHO Report 2023", "reliability": 0.95, "excerpt": "Switzerland: 83.4 years average..." } ], "opposing": [ { "source": "Eurostat 2024", "reliability": 0.90, "excerpt": "Spain leads at 83.5 years..." } ] }, "verdict": { "probability": 0.65, "uncertainty": 0.15, "confidence": "MEDIUM", "reasoning": "WHO and Eurostat show similar but conflicting data...", "gate4_status": "PASS" }
-}{{/code}} == 6. Success Criteria == {{success}}**POC Success Definition:** POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts with quality gates improving output quality.{{/success}} === 6.1 Functional Success === POC is successful if: ✅ **FR1-FR7 Requirements Met:**
-
+* If failed: Clear explanation why === 5.2 Example POC Output === {{code language="json"}}
+{ "claim": { "text": "Switzerland has the highest life expectancy in Europe", "type": "factual", "gate1_status": "PASS" }, "scenarios": [ "Switzerland's overall life expectancy is highest", "Switzerland ranks highest for specific age groups" ], "evidence": { "supporting": [ { "source": "WHO Report 2023", "reliability": 0.95, "excerpt": "Switzerland: 83.4 years average..." } ], "opposing": [ { "source": "Eurostat 2024", "reliability": 0.90, "excerpt": "Spain leads at 83.5 years..." } ] }, "verdict": { "probability": 0.65, "uncertainty": 0.15, "confidence": "MEDIUM", "reasoning": "WHO and Eurostat show similar but conflicting data...", "gate4_status": "PASS" }
+}
+{{/code}} == 6. Success Criteria == {{success}}
+**POC Success Definition:** POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts with quality gates improving output quality.
+{{/success}} === 6.1 Functional Success === POC is successful if: ✅ **FR1-FR7 Requirements Met:**
 1. Extracts 3-5 factual claims from test articles
 2. Generates 2-3 scenarios per ambiguous claim
 3. Finds supporting AND opposing evidence
@@ -150,7 +150,6 @@
 1. Quality gates reduce hallucination rate
 2. Blocked outputs have clear explanations
 3. Quality metrics are logged === 6.2 Quality Thresholds === **Minimum Acceptable:**
-
 * ≥70% of test claims correctly classified (factual/non-factual)
 * ≥60% of verdicts are reasonable (human evaluation)
 * Gate 1 blocks 100% of non-factual claims
@@ -170,7 +170,7 @@
 * Quality gates insufficient
 * Alternative approach needed == 7. Test Cases == === 7.1 Happy Path === **Test 1: Simple Factual Claim**
 * Input: "Paris is the capital of France"
-* Expected: Factual, 1 scenario, verdict 95% true **Test 2: Ambiguous Claim**
+* Expected: Factual, 1 scenario, verdict ~95% true **Test 2: Ambiguous Claim**
 * Input: "Switzerland has the highest income in Europe"
 * Expected: Factual, 2-3 scenarios, verdict with uncertainty **Test 3: Statistical Claim**
 * Input: "10% of people have condition X"
@@ -199,13 +199,15 @@
 * Async API calls
 * Evidence caching
 * Advanced error handling with retry
-* Structured logging + monitoring == 9. POC Philosophy == {{info}}**Important:** POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases.{{/info}} === 9.1 Core Principles === **1. Prove Concept, Not Production**
+* Structured logging + monitoring == 9. POC Philosophy == {{info}}
+**Important:** POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases.
+{{/info}} === 9.1 Core Principles === **1. Prove Concept, Not Production**
 * POC validates AI can do the job
 * Production quality comes in POC2 and Beta 0
 * Focus on "does it work?" not "is it perfect?" **2. Implement Subset of Requirements**
 * POC covers FR1-7, NFR11 (lite)
 * All other requirements deferred
-* Clear mapping to [[Main Requirements>>Archive.FactHarbor.Specification.Requirements.WebHome]] **3. Quality Gates Validate Approach**
+* Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] **3. Quality Gates Validate Approach**
 * 2 gates prove the concept
 * Remaining 5 gates added in POC2
 * Gates must demonstrably improve quality **4. Iterate Based on Results**
@@ -220,11 +220,12 @@
 * Is the gate approach scalable? ✅ **Production Path:**
 * Is the core architecture sound?
 * What needs improvement for production?
-* Is POC2 the right next step? == 10. Related Pages == * **[[Main Requirements>>Archive.FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
+* Is POC2 the right next step? == 10. Related Pages == * **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
 * **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
 * **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
-* **[[Implementation Roadmap>>Archive.FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
-* **[[User Needs>>Archive.FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements) **Document Owner:** Technical Team **Review Frequency:** After each POC iteration **Version History:**
+* **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
+* **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements) **Document Owner:** Technical Team **Review Frequency:** After each POC iteration **Version History:**
 * v1.0 - Initial POC requirements
 * v2.0 - Updated after specification cross-check
 * v3.0 - Aligned with Main Requirements (FR/NFR IDs added)
+
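The "Minimum Acceptable" thresholds in section 6.2 of the page content (≥70% of claims correctly classified, ≥60% of verdicts reasonable, Gate 1 blocking 100% of non-factual claims) could be checked mechanically after a test run. The following is a minimal sketch, not part of the POC itself: the `PocRunStats` class, its field names, and the `meets_minimum_thresholds` function are hypothetical, and only the three thresholds visible in this diff are covered.

```python
from dataclasses import dataclass
from typing import Dict

# Thresholds quoted from section 6.2 ("Minimum Acceptable").
MIN_CLASSIFICATION_ACCURACY = 0.70
MIN_REASONABLE_VERDICTS = 0.60
REQUIRED_GATE1_BLOCK_RATE = 1.00


@dataclass
class PocRunStats:
    """Hypothetical tallies from one POC test run (field names are illustrative)."""
    claims_total: int
    claims_correctly_classified: int
    verdicts_total: int
    verdicts_reasonable: int          # as judged by human evaluation
    non_factual_total: int
    non_factual_blocked_by_gate1: int


def _ratio(numerator: int, denominator: int) -> float:
    # Assumption: an empty category cannot fail a threshold, so it counts as met.
    return 1.0 if denominator == 0 else numerator / denominator


def meets_minimum_thresholds(stats: PocRunStats) -> Dict[str, bool]:
    """Return one pass/fail flag per threshold listed in section 6.2."""
    return {
        "classification": _ratio(stats.claims_correctly_classified, stats.claims_total)
        >= MIN_CLASSIFICATION_ACCURACY,
        "verdicts": _ratio(stats.verdicts_reasonable, stats.verdicts_total)
        >= MIN_REASONABLE_VERDICTS,
        "gate1": _ratio(stats.non_factual_blocked_by_gate1, stats.non_factual_total)
        >= REQUIRED_GATE1_BLOCK_RATE,
    }


if __name__ == "__main__":
    # Illustrative numbers only, not measured results.
    run = PocRunStats(10, 7, 10, 6, 4, 4)
    print(meets_minimum_thresholds(run))
```

Returning one flag per threshold, rather than a single boolean, keeps a failed run explainable, in line with the requirement that blocked outputs carry clear explanations.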