Wiki source code of POC Requirements

Last modified by Robert Schaub on 2026/02/08 08:25

= POC Requirements =

**Status:** ✅ Approved for Development
**Version:** 3.0 (Aligned with Main Requirements)
**Goal:** Prove that AI can extract claims and determine verdicts automatically, without human intervention

{{info}}**Core Philosophy:** The POC validates the [[Main Requirements>>Archive.FactHarbor 2026\.01\.20.Specification.Requirements.WebHome]] through a simplified implementation. All POC features map to formal FR/NFR requirements.{{/info}}

== 1. POC Overview ==

=== 1.1 What POC Tests ===

**Core Question:**

> Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?

**What we're proving:**

* AI can identify factual claims in text
* AI can evaluate those claims against structured evidence
* Quality gates can filter out unreliable outputs
* The core workflow is technically feasible

**What we're NOT proving:**

* Production-ready reliability (that's POC2)
* User-facing features (that's Beta 0)
* Full IFCN compliance (that's V1.0)

=== 1.2 Requirements Mapping ===

POC1 implements a **subset** of the full system requirements defined in the [[Main Requirements>>Archive.FactHarbor 2026\.01\.20.Specification.Requirements.WebHome]].

**Scope Summary:**
* **In Scope:** 8 requirements (7 FRs + 1 NFR)
* **Partial:** 3 NFRs (simplified versions)
* **Out of Scope:** 19 requirements (deferred to later phases)

== 2. Requirements Scope Matrix ==

{{success}}**Requirements Traceability:** This matrix shows which [[Main Requirements>>Archive.FactHarbor 2026\.01\.20.Specification.Requirements.WebHome]] are implemented in POC1, providing full traceability between POC and system requirements.{{/success}}

|=Requirement|=POC1 Status|=Implementation Level|=Notes
|**CORE WORKFLOW**|||\\
|FR1: Claim Extraction|✅ **In Scope**|Full|AKEL extracts claims from text
|FR2: Claim Context|✅ **In Scope**|Basic|Context preserved with the claim
|FR3: Multiple Scenarios|✅ **In Scope**|Full|AKEL generates interpretation scenarios
|FR4: Analysis Summary|✅ **In Scope**|Basic|Simple summary format
|FR5: Evidence Collection|✅ **In Scope**|Full|AKEL searches for evidence
|FR6: Evidence Evaluation|✅ **In Scope**|Full|AKEL evaluates source reliability
|FR7: Automated Verdicts|✅ **In Scope**|Full|AKEL computes verdicts with uncertainty
|**QUALITY & RELIABILITY**|||\\
|NFR11: Quality Assurance|✅ **In Scope**|**Lite**|**2 gates only** (Gates 1 & 4)
|NFR1: Performance|⚠️ **Partial**|Basic|Response time monitored, not optimized
|NFR2: Scalability|⚠️ **Partial**|Single-thread|No concurrent processing
|NFR3: Reliability|⚠️ **Partial**|Basic|Error handling, no retry logic
|**DEFERRED TO LATER**|||\\
|FR8-FR13|❌ Out of Scope|N/A|User accounts, corrections, publishing
|FR44-FR53|❌ Out of Scope|N/A|Advanced features (V1.0+)
|NFR4: Security|❌ Out of Scope|N/A|POC2
|NFR5: Maintainability|❌ Out of Scope|N/A|POC2
|NFR12: Security Controls|❌ Out of Scope|N/A|Beta 0
|NFR13: Monitoring|❌ Out of Scope|N/A|POC2

== 3. POC Simplifications ==

=== 3.1 FR1: Claim Extraction (Full Implementation) ===

**Main Requirement:** AI extracts factual claims from input text.

**POC Implementation:**

* ✅ AKEL extracts claims using an LLM
* ✅ Each claim includes a reference to the original text
* ✅ Claims are classified as factual or non-factual
* ❌ No advanced claim parsing (added in POC2)

**Acceptance Criteria:**
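As an illustration of how this acceptance check could be automated, a minimal TypeScript sketch (the `ExtractedClaim` shape and function names are hypothetical, not the actual POC code; the LLM extraction itself is out of frame here):

```typescript
// Hypothetical shape of AKEL's extraction output (illustrative only).
interface ExtractedClaim {
  text: string;      // claim text, referencing the source article
  factual: boolean;  // factual vs non-factual classification
}

// Mechanical acceptance check from this section:
// 3-5 non-empty claims, each carrying a factual/non-factual flag.
function extractionLooksValid(claims: ExtractedClaim[]): boolean {
  return (
    claims.length >= 3 &&
    claims.length <= 5 &&
    claims.every((c) => c.text.trim().length > 0)
  );
}
```

Such a check could run immediately after each extraction, before claims are passed on to scenario generation.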
* Extracts 3-5 claims from a typical article
* Distinguishes factual from non-factual claims
* Quality Gate 1 validates the extraction

=== 3.2 FR3: Multiple Scenarios (Full Implementation) ===

**Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims.

**POC Implementation:**
* ✅ AKEL generates 2-3 scenarios per claim
* ✅ Scenarios capture different interpretations
* ✅ Each scenario is evaluated separately
* ✅ The verdict considers all scenarios

**Acceptance Criteria:**
* Generates 2+ scenarios for ambiguous claims
* Scenarios are meaningfully different
* All scenarios are evaluated

=== 3.3 FR4: Analysis Summary (Basic Implementation) ===

**Main Requirement:** Provide a user-friendly summary of the analysis.

**POC Implementation:**
* ✅ Simple text summary generated
* ❌ No rich formatting (added in Beta 0)
* ❌ No visual elements (added in Beta 0)
* ❌ No interactive features (added in Beta 0)

**POC Format:**

```
Claim: [extracted claim]
Scenarios: [list of scenarios]
Evidence: [supporting/opposing evidence]
Verdict: [probability with uncertainty]
```

=== 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===

**Main Requirements:**
* FR5: Collect supporting and opposing evidence
* FR6: Evaluate evidence source reliability

**POC Implementation:**
* ✅ AKEL searches for evidence (web/knowledge base)
* ✅ **Mandatory contradiction search** (actively looks for opposing evidence)
* ✅ Source reliability scoring
* ❌ No evidence deduplication (added in POC2)
* ❌ No advanced source verification (added in POC2)

**Acceptance Criteria:**
* Finds 2+ supporting evidence items
* Finds 1+ opposing evidence item (where any exists)
* Sources are scored for reliability

=== 3.5 FR7: Automated Verdicts (Full Implementation) ===

**Main Requirement:** AI computes verdicts with uncertainty quantification.

**POC Implementation:**
* ✅ Probabilistic verdicts (0-100% confidence)
* ✅ Uncertainty explicitly stated
* ✅ Reasoning chain provided
* ✅ Quality Gate 4 validates verdict confidence

**POC Output:**

```
Verdict: 70% likely true
Uncertainty: ±15% (moderate confidence)
Reasoning: Based on 3 high-quality sources...
Confidence Level: MEDIUM
```

**Acceptance Criteria:**
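One plausible way to derive the LOW/MEDIUM/HIGH label shown in the output above is to bucket the stated uncertainty; a sketch (the thresholds are illustrative assumptions, not values fixed by the requirements):

```typescript
type Confidence = "LOW" | "MEDIUM" | "HIGH";

// Map a verdict's uncertainty (0-1) to a coarse confidence label.
// Thresholds are assumptions the POC would tune empirically.
function confidenceLabel(uncertainty: number): Confidence {
  if (uncertainty <= 0.1) return "HIGH";
  if (uncertainty <= 0.2) return "MEDIUM";
  return "LOW";
}
```

Under these assumed thresholds, the ±15% uncertainty in the sample output above would map to MEDIUM, consistent with the example.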
* Verdicts include a probability (0-100%)
* Uncertainty is explicitly quantified
* A reasoning chain explains the verdict

=== 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===

**Main Requirement:** Complete quality assurance with 7 quality gates.

**POC Implementation:** **2 gates only**

**Quality Gate 1: Claim Validation**
* ✅ Validates that the claim is factual and verifiable
* ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
* ✅ Provides a clear rejection reason

**Quality Gate 4: Verdict Confidence Assessment**
* ✅ Validates that ≥2 sources were found
* ✅ Validates that the quality score is ≥0.6
* ✅ Blocks low-confidence verdicts
* ✅ Provides a clear rejection reason

**Out of Scope (POC2+):**
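Before listing the deferred gates, here is a minimal sketch of the two in-scope gates as predicate checks (the claim-type signal comes from the LLM and is simply passed in; the 2-source and 0.6 thresholds are the ones stated above, and the names are hypothetical):

```typescript
interface GateResult {
  pass: boolean;
  reason?: string; // clear rejection reason when blocked
}

// Gate 1: only claims classified as factual may proceed.
function gate1(
  claimType: "factual" | "opinion" | "prediction" | "ambiguous"
): GateResult {
  return claimType === "factual"
    ? { pass: true }
    : { pass: false, reason: `Blocked: claim is ${claimType}, not verifiable` };
}

// Gate 4: a verdict needs >= 2 sources and a quality score >= 0.6.
function gate4(sourceCount: number, qualityScore: number): GateResult {
  if (sourceCount < 2) return { pass: false, reason: "Blocked: fewer than 2 sources" };
  if (qualityScore < 0.6) return { pass: false, reason: "Blocked: quality score below 0.6" };
  return { pass: true };
}
```

The point of the sketch is that both gates are cheap, deterministic checks wrapped around the expensive AI steps, which is what makes them easy to validate in POC1.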
* ❌ Gate 2: Evidence Relevance
* ❌ Gate 3: Scenario Coherence
* ❌ Gate 5: Source Diversity
* ❌ Gate 6: Reasoning Validity
* ❌ Gate 7: Output Completeness

**Rationale:** Prove that the gate concept works, then add the remaining gates in POC2 once the approach is validated.

=== 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===

**Main Requirements:**
* NFR1: Response time < 30 seconds
* NFR2: Handle 1000+ concurrent users
* NFR3: 99.9% uptime

**POC Implementation:**

* ⚠️ **Response time monitored** (not optimized)
* ⚠️ **Single-threaded processing** (no concurrency)
* ⚠️ **Basic error handling** (no advanced retry logic)

**Rationale:** The POC proves functionality; performance optimization happens in POC2.

**POC Acceptance:**
* Analysis completes (no timeout requirement)
* Errors don't crash the system
* Basic logging is in place

== 4. What's NOT in POC Scope ==

=== 4.1 User-Facing Features (Beta 0+) ===

{{warning}}**Deferred to Beta 0:**{{/warning}}

**Out of Scope:**
* ❌ User accounts and authentication (FR8)
* ❌ User corrections system (FR9, FR45-46)
* ❌ Public publishing interface (FR10)
* ❌ Social sharing (FR11)
* ❌ Email notifications (FR12)
* ❌ API access (FR13)

**Rationale:** The POC validates AI capabilities; user features are added in Beta 0.

=== 4.2 Advanced Features (V1.0+) ===

**Out of Scope:**
* ❌ IFCN compliance (FR47)
* ❌ ClaimReview schema (FR48)
* ❌ Archive.org integration (FR49)
* ❌ OSINT toolkit (FR50)
* ❌ Video verification (FR51)
* ❌ Deepfake detection (FR52)
* ❌ Cross-org sharing (FR53)

**Rationale:** Advanced features require a proven platform; they are added after V1.0.

=== 4.3 Production Requirements (POC2, Beta 0) ===

**Out of Scope:**
* ❌ Security controls (NFR4, NFR12)
* ❌ Code maintainability (NFR5)
* ❌ System monitoring (NFR13)
* ❌ Evidence deduplication
* ❌ Advanced source verification
* ❌ Full 7-gate quality framework

**Rationale:** The POC proves the concept; production hardening happens in POC2 and Beta 0.

== 5. POC Output Specification ==

=== 5.1 Required Output Elements ===

For each analyzed claim, the POC must produce:

**1. Claim**
* Original text
* Classification (factual/non-factual/ambiguous)
* If non-factual: a clear reason why

**2. Scenarios** (if factual)
* 2-3 interpretation scenarios
* Each scenario clearly described

**3. Evidence** (if factual)
* Supporting evidence (2+ items)
* Opposing evidence (where any exists)
* Source URLs and reliability scores

**4. Verdict** (if factual)
* Probability (0-100%)
* Uncertainty quantification
* Confidence level (LOW/MEDIUM/HIGH)
* Reasoning chain

**5. Quality Status**
* Which gates passed or failed
* If failed: a clear explanation why

=== 5.2 Example POC Output ===

{{code language="json"}}
{
  "claim": {
    "text": "Switzerland has the highest life expectancy in Europe",
    "type": "factual",
    "gate1_status": "PASS"
  },
  "scenarios": [
    "Switzerland's overall life expectancy is highest",
    "Switzerland ranks highest for specific age groups"
  ],
  "evidence": {
    "supporting": [
      {
        "source": "WHO Report 2023",
        "reliability": 0.95,
        "excerpt": "Switzerland: 83.4 years average..."
      }
    ],
    "opposing": [
      {
        "source": "Eurostat 2024",
        "reliability": 0.90,
        "excerpt": "Spain leads at 83.5 years..."
      }
    ]
  },
  "verdict": {
    "probability": 0.65,
    "uncertainty": 0.15,
    "confidence": "MEDIUM",
    "reasoning": "WHO and Eurostat show similar but conflicting data...",
    "gate4_status": "PASS"
  }
}
{{/code}}

== 6. Success Criteria ==

{{success}}**POC Success Definition:** The POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts, with quality gates improving output quality.{{/success}}

=== 6.1 Functional Success ===

The POC is successful if:

✅ **FR1-FR7 Requirements Met:**
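Several of these functional criteria can be checked mechanically against the output shape shown in §5.2; a sketch (field names follow that example; the `PocOutput` type and checks are illustrative assumptions about how the criteria would be measured):

```typescript
// Illustrative subset of the §5.2 output shape.
interface PocOutput {
  scenarios: string[];
  evidence: { supporting: unknown[]; opposing: unknown[] };
  verdict: { probability: number; uncertainty: number; reasoning: string };
}

// Returns a list of failed checks; an empty list means all checks passed.
function functionalChecks(out: PocOutput): string[] {
  const failures: string[] = [];
  if (out.scenarios.length < 2) failures.push("needs 2+ scenarios");
  if (out.evidence.supporting.length === 0) failures.push("no supporting evidence");
  if (out.evidence.opposing.length === 0) failures.push("no opposing evidence found");
  if (out.verdict.probability < 0 || out.verdict.probability > 1)
    failures.push("probability out of range");
  if (out.verdict.reasoning.trim() === "") failures.push("missing reasoning chain");
  return failures;
}
```

The human-judgment criteria (e.g. whether a verdict is "reasonable") cannot be scripted this way and stay with the evaluation team.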

1. Extracts 3-5 factual claims from test articles
2. Generates 2-3 scenarios per ambiguous claim
3. Finds supporting AND opposing evidence
4. Computes probabilistic verdicts with uncertainty
5. Provides clear reasoning chains

✅ **Quality Gates Work:**

1. Gate 1 blocks non-factual claims (100% block rate)
2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
3. Clear rejection reasons are provided

✅ **NFR11 Met:**

1. Quality gates reduce the hallucination rate
2. Blocked outputs have clear explanations
3. Quality metrics are logged

=== 6.2 Quality Thresholds ===

**Minimum Acceptable:**

* ≥70% of test claims correctly classified (factual/non-factual)
* ≥60% of verdicts are reasonable (human evaluation)
* Gate 1 blocks 100% of non-factual claims
* Gate 4 blocks verdicts with <2 sources

**Target:**

* ≥80% of claims correctly classified
* ≥75% of verdicts are reasonable
* <10% false positives (good claims wrongly blocked)

=== 6.3 POC Decision Gate ===

**After POC1, we decide:**

**✅ PROCEED to POC2** if:
* Success criteria are met
* Quality gates demonstrably improve output
* The core workflow is technically sound
* There is a clear path to production quality

**⚠️ ITERATE POC1** if:

* Success criteria are partially met
* Gates work but need tuning
* Core issues are identified but fixable

**❌ PIVOT APPROACH** if:

* Success criteria are not met
* Fundamental AI limitations are discovered
* Quality gates are insufficient
* An alternative approach is needed

== 7. Test Cases ==

=== 7.1 Happy Path ===

**Test 1: Simple Factual Claim**
* Input: "Paris is the capital of France"
* Expected: Factual, 1 scenario, verdict 95% true

**Test 2: Ambiguous Claim**

* Input: "Switzerland has the highest income in Europe"
* Expected: Factual, 2-3 scenarios, verdict with uncertainty

**Test 3: Statistical Claim**

* Input: "10% of people have condition X"
* Expected: Factual, evidence with numbers, probabilistic verdict

=== 7.2 Edge Cases ===

**Test 4: Opinion**
* Input: "Paris is the best city"
* Expected: Non-factual (opinion), blocked by Gate 1

**Test 5: Prediction**

* Input: "Bitcoin will reach $100,000 next year"
* Expected: Non-factual (prediction), blocked by Gate 1

**Test 6: Insufficient Evidence**

* Input: An obscure factual claim with no sources
* Expected: Blocked by Gate 4 (<2 sources)

=== 7.3 Quality Gate Tests ===

**Test 7: Gate 1 Effectiveness**
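Test 7 lends itself to a small table-driven harness; a sketch (the regex classifier is a deliberately naive stand-in for the LLM-based Gate 1, and all names are hypothetical):

```typescript
type Label = "factual" | "non-factual";

// Naive stand-in classifier (illustration only; the real Gate 1 is LLM-based):
// flags opinion markers and future-tense predictions as non-factual.
function toyClassify(claim: string): Label {
  const lower = claim.toLowerCase();
  if (/\b(best|worst|greatest)\b/.test(lower)) return "non-factual"; // opinion
  if (/\bwill\b/.test(lower)) return "non-factual"; // prediction
  return "factual";
}

// Gate 1 effectiveness over a labelled test set: block rate on non-factual
// claims, and false positives on factual ones.
function gate1Metrics(cases: { text: string; label: Label }[]) {
  const nonFactual = cases.filter((c) => c.label === "non-factual");
  const factual = cases.filter((c) => c.label === "factual");
  const blocked = nonFactual.filter((c) => toyClassify(c.text) === "non-factual");
  const falsePositives = factual.filter((c) => toyClassify(c.text) === "non-factual");
  return {
    blockRate: nonFactual.length === 0 ? 1 : blocked.length / nonFactual.length,
    falsePositives: falsePositives.length,
  };
}
```

Test 7's pass condition over the 20-claim mix would then be a block rate of 1 with zero false positives.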
* Input: A mix of 10 factual + 10 non-factual claims
* Expected: Gate 1 blocks all 10 non-factual claims (100% block rate)

**Test 8: Gate 4 Effectiveness**

* Input: Claims with varying evidence availability
* Expected: Gate 4 blocks low-confidence verdicts

== 8. Technical Architecture (POC) ==

=== 8.1 Simplified Architecture ===

**POC Tech Stack:**
* **Frontend:** Simple web interface (Next.js + TypeScript)
* **Backend:** Single API endpoint
* **AI:** Claude API (Sonnet 4.5)
* **Storage:** Local JSON files (no database)
* **Deployment:** Single server

**Architecture Diagram:** See the [[POC1 Specification>>FactHarbor.Specification.POC.Specification]].

=== 8.2 AKEL Implementation ===

**POC AKEL:**
* Single-threaded processing
* Synchronous API calls
* No caching
* Basic error handling
* Console logging

**Full AKEL (POC2+):**
* Multi-threaded processing
* Async API calls
* Evidence caching
* Advanced error handling with retries
* Structured logging + monitoring

== 9. POC Philosophy ==

{{info}}**Important:** The POC validates the concept, not production readiness. The focus is on proving that AI can do the job, with production quality coming in later phases.{{/info}}

=== 9.1 Core Principles ===

**1. Prove Concept, Not Production**
* The POC validates that AI can do the job
* Production quality comes in POC2 and Beta 0
* Focus on "does it work?", not "is it perfect?"

**2. Implement a Subset of Requirements**

* The POC covers FR1-7 and NFR11 (lite)
* All other requirements are deferred
* Clear mapping to the [[Main Requirements>>Archive.FactHarbor 2026\.01\.20.Specification.Requirements.WebHome]]

**3. Quality Gates Validate the Approach**

* 2 gates prove the concept
* The remaining 5 gates are added in POC2
* Gates must demonstrably improve quality

**4. Iterate Based on Results**

* POC results determine next steps
* Decision gate after POC1
* Flexibility to pivot if needed

=== 9.2 Success = Clear Path Forward ===

The POC succeeds if we can confidently answer:

✅ **Technical Feasibility:**
* Can AI extract claims reliably?
* Can AI find balanced evidence?
* Can AI compute reasonable verdicts?

✅ **Quality Approach:**

* Do quality gates improve output?
* Can we measure and track quality?
* Is the gate approach scalable?

✅ **Production Path:**

* Is the core architecture sound?
* What needs improvement for production?
* Is POC2 the right next step?

== 10. Related Pages ==

* **[[Main Requirements>>Archive.FactHarbor 2026\.01\.20.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
* **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
* **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
* **[[Implementation Roadmap>>Archive.FactHarbor 2026\.01\.20.Roadmap.WebHome]]** - POC1, POC2, Beta 0, and V1.0 phases
* **[[User Needs>>Archive.FactHarbor 2026\.01\.20.Specification.Requirements.User Needs.WebHome]]** - What users need (drives the requirements)

**Document Owner:** Technical Team
**Review Frequency:** After each POC iteration

**Version History:**

* v1.0 - Initial POC requirements
* v2.0 - Updated after specification cross-check
* v3.0 - Aligned with Main Requirements (FR/NFR IDs added)