Last modified by Robert Schaub on 2025/12/22 13:50

= POC Requirements =

**Status:** ✅ Approved for Development
**Version:** 3.0 (Aligned with Main Requirements)
**Goal:** Prove that AI can extract claims and determine verdicts without human intervention

{{info}}
**Core Philosophy:** POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through a simplified implementation. All POC features map to formal FR/NFR requirements.
{{/info}}


== 1. POC Overview ==

=== 1.1 What POC Tests ===

**Core Question:**
> Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?

**What we're proving:**
* AI can identify factual claims from text
* AI can evaluate those claims with structured evidence
* Quality gates can filter unreliable outputs
* The core workflow is technically feasible

**What we're NOT proving:**
* Production-ready reliability (that's POC2)
* User-facing features (that's Beta 0)
* Full IFCN compliance (that's V1.0)

=== 1.2 Requirements Mapping ===

POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].

**Scope Summary:**
* **In Scope:** 8 requirements (7 FRs + 1 NFR)
* **Partial:** 3 NFRs (simplified versions)
* **Out of Scope:** 19 requirements (deferred to later phases)


== 2. Requirements Scope Matrix ==

{{success}}
**Requirements Traceability:** This matrix shows which [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] are implemented in POC1, providing full traceability between POC and system requirements.
{{/success}}

|=Requirement|=POC1 Status|=Implementation Level|=Notes
|**CORE WORKFLOW**||||
|FR1: Claim Extraction|✅ **In Scope**|Full|AKEL extracts claims from text
|FR2: Claim Context|✅ **In Scope**|Basic|Context preserved with claim
|FR3: Multiple Scenarios|✅ **In Scope**|Full|AKEL generates interpretation scenarios
|FR4: Analysis Summary|✅ **In Scope**|Basic|Simple summary format
|FR5: Evidence Collection|✅ **In Scope**|Full|AKEL searches for evidence
|FR6: Evidence Evaluation|✅ **In Scope**|Full|AKEL evaluates source reliability
|FR7: Automated Verdicts|✅ **In Scope**|Full|AKEL computes verdicts with uncertainty
|**QUALITY & RELIABILITY**||||
|NFR11: Quality Assurance|✅ **In Scope**|**Lite**|**2 gates only** (Gate 1 & 4)
|NFR1: Performance|⚠️ **Partial**|Basic|Response time monitored, not optimized
|NFR2: Scalability|⚠️ **Partial**|Single-thread|No concurrent processing
|NFR3: Reliability|⚠️ **Partial**|Basic|Error handling, no retry logic
|**DEFERRED TO LATER**||||
|FR8-FR13|❌ Out of Scope|N/A|User accounts, corrections, publishing
|FR44-FR53|❌ Out of Scope|N/A|Advanced features (V1.0+)
|NFR4: Security|❌ Out of Scope|N/A|POC2
|NFR5: Maintainability|❌ Out of Scope|N/A|POC2
|NFR12: Security Controls|❌ Out of Scope|N/A|Beta 0
|NFR13: Monitoring|❌ Out of Scope|N/A|POC2


== 3. POC Simplifications ==

=== 3.1 FR1: Claim Extraction (Full Implementation) ===

**Main Requirement:** AI extracts factual claims from input text

**POC Implementation:**
* ✅ AKEL extracts claims using an LLM
* ✅ Each claim includes a reference to the original text
* ✅ Claims are identified as factual/non-factual
* ❌ No advanced claim parsing (added in POC2)

**Acceptance Criteria:**
* Extracts 3-5 claims from a typical article
* Identifies factual vs non-factual claims
* Quality Gate 1 validates extraction


=== 3.2 FR3: Multiple Scenarios (Full Implementation) ===

**Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims

**POC Implementation:**
* ✅ AKEL generates 2-3 scenarios per claim
* ✅ Scenarios capture different interpretations
* ✅ Each scenario is evaluated separately
* ✅ Verdict considers all scenarios

**Acceptance Criteria:**
* Generates 2+ scenarios for ambiguous claims
* Scenarios are meaningfully different
* All scenarios are evaluated


=== 3.3 FR4: Analysis Summary (Basic Implementation) ===

**Main Requirement:** Provide a user-friendly summary of the analysis

**POC Implementation:**
* ✅ Simple text summary generated
* ❌ No rich formatting (added in Beta 0)
* ❌ No visual elements (added in Beta 0)
* ❌ No interactive features (added in Beta 0)

**POC Format:**
```
Claim: [extracted claim]
Scenarios: [list of scenarios]
Evidence: [supporting/opposing evidence]
Verdict: [probability with uncertainty]
```

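The four-line format above could be produced by a small formatter. The following is a minimal sketch only: the `ClaimSummary` shape and the `formatSummary` helper are hypothetical names introduced for illustration, and the `+`/`-` prefixes for supporting and opposing evidence are an assumption, not part of the specification.

```typescript
// Hypothetical input shape; field names are assumptions for illustration.
interface ClaimSummary {
  claim: string;
  scenarios: string[];
  supporting: string[]; // short descriptions of supporting evidence
  opposing: string[];   // short descriptions of opposing evidence
  probability: number;  // 0..1
  uncertainty: number;  // 0..1, rendered as a +/- percentage
}

// Renders the plain-text POC summary: one line each for claim,
// scenarios, evidence, and verdict.
function formatSummary(s: ClaimSummary): string {
  const pct = (x: number): string => `${Math.round(x * 100)}%`;
  const evidence = [
    ...s.supporting.map((e) => `+ ${e}`),
    ...s.opposing.map((e) => `- ${e}`),
  ].join("; ");
  return [
    `Claim: ${s.claim}`,
    `Scenarios: ${s.scenarios.join("; ")}`,
    `Evidence: ${evidence}`,
    `Verdict: ${pct(s.probability)} likely true (±${pct(s.uncertainty)})`,
  ].join("\n");
}
```

Keeping the formatter a pure function of the analysis result means the Beta 0 rich formatting could replace it without touching the analysis code.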

=== 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===

**Main Requirements:**
* FR5: Collect supporting and opposing evidence
* FR6: Evaluate evidence source reliability

**POC Implementation:**
* ✅ AKEL searches for evidence (web/knowledge base)
* ✅ **Mandatory contradiction search** (finds opposing evidence)
* ✅ Source reliability scoring
* ❌ No evidence deduplication (added in POC2)
* ❌ No advanced source verification (added in POC2)

**Acceptance Criteria:**
* Finds 2+ supporting evidence items
* Finds 1+ opposing evidence items (if any exist)
* Sources scored for reliability


=== 3.5 FR7: Automated Verdicts (Full Implementation) ===

**Main Requirement:** AI computes verdicts with uncertainty quantification

**POC Implementation:**
* ✅ Probabilistic verdicts (0-100% probability)
* ✅ Uncertainty explicitly stated
* ✅ Reasoning chain provided
* ✅ Quality Gate 4 validates verdict confidence

**POC Output:**
```
Verdict: 70% likely true
Uncertainty: ±15% (moderate confidence)
Reasoning: Based on 3 high-quality sources...
Confidence Level: MEDIUM
```

**Acceptance Criteria:**
* Verdicts include probability (0-100%)
* Uncertainty explicitly quantified
* Reasoning chain explains verdict


=== 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===

**Main Requirement:** Complete quality assurance with 7 quality gates

**POC Implementation:** **2 gates only**

**Quality Gate 1: Claim Validation**
* ✅ Validates claim is factual and verifiable
* ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
* ✅ Provides clear rejection reason

**Quality Gate 4: Verdict Confidence Assessment**
* ✅ Validates ≥2 sources found
* ✅ Validates quality score ≥0.6
* ✅ Blocks low-confidence verdicts
* ✅ Provides clear rejection reason

**Out of Scope (POC2+):**
* ❌ Gate 2: Evidence Relevance
* ❌ Gate 3: Scenario Coherence
* ❌ Gate 5: Source Diversity
* ❌ Gate 6: Reasoning Validity
* ❌ Gate 7: Output Completeness

**Rationale:** Prove that the gate concept works. Add the remaining gates in POC2 after validating the approach.

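The two gates above reduce to simple pass/fail checks with a rejection reason. The sketch below is illustrative only: the type and function names (`ClaimType`, `GateResult`, `gate1`, `gate4`) are hypothetical, and it assumes the claim classification and the quality score are computed upstream, since this page does not define how either is derived.

```typescript
// Hypothetical types; names are assumptions, not the project's actual API.
type ClaimType = "factual" | "opinion" | "prediction" | "ambiguous";

interface GateResult {
  status: "PASS" | "FAIL";
  reason?: string; // populated only on FAIL
}

const MIN_SOURCES = 2;   // Gate 4: at least 2 sources
const MIN_QUALITY = 0.6; // Gate 4: quality score of at least 0.6

// Gate 1: only factual, verifiable claims continue through the workflow.
function gate1(claimType: ClaimType): GateResult {
  if (claimType === "factual") return { status: "PASS" };
  return { status: "FAIL", reason: `Claim is not factual (${claimType})` };
}

// Gate 4: block verdicts backed by too few sources or too low a quality score.
function gate4(sourceCount: number, qualityScore: number): GateResult {
  if (sourceCount < MIN_SOURCES) {
    return {
      status: "FAIL",
      reason: `Only ${sourceCount} source(s) found; ${MIN_SOURCES} required`,
    };
  }
  if (qualityScore < MIN_QUALITY) {
    return {
      status: "FAIL",
      reason: `Quality score ${qualityScore} is below ${MIN_QUALITY}`,
    };
  }
  return { status: "PASS" };
}
```

Returning the reason alongside the status keeps the "clear rejection reason" requirement in the same place as the check that triggers it.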

=== 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===

**Main Requirements:**
* NFR1: Response time < 30 seconds
* NFR2: Handle 1000+ concurrent users
* NFR3: 99.9% uptime

**POC Implementation:**
* ⚠️ **Response time monitored** (not optimized)
* ⚠️ **Single-threaded processing** (no concurrency)
* ⚠️ **Basic error handling** (no advanced retry logic)

**Rationale:** POC proves functionality. Performance optimization happens in POC2.

**POC Acceptance:**
* Analysis completes (no timeout requirement)
* Errors don't crash the system
* Basic logging in place

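One way to meet the POC acceptance points above (time is recorded, errors do not crash the system, basic logging exists) is a thin wrapper around each analysis step. This is a sketch under assumptions: `runMonitored` and `RunResult` are hypothetical names, and the logger is passed in so the caller decides where messages go.

```typescript
// Hypothetical result shape; names are assumptions for illustration.
interface RunResult<T> {
  ok: boolean;
  value?: T;
  error?: string;
  elapsedMs: number;
}

// Runs one analysis step, records how long it took, and converts any thrown
// error into a result value so a single failure does not crash the process.
// There is deliberately no retry and no timeout, matching the POC scope.
function runMonitored<T>(
  label: string,
  step: () => T,
  log: (msg: string) => void,
): RunResult<T> {
  const start = Date.now();
  try {
    const value = step();
    const elapsedMs = Date.now() - start;
    log(`${label}: ok in ${elapsedMs} ms`);
    return { ok: true, value, elapsedMs };
  } catch (err) {
    const elapsedMs = Date.now() - start;
    const error = err instanceof Error ? err.message : String(err);
    log(`${label}: failed after ${elapsedMs} ms (${error})`);
    return { ok: false, error, elapsedMs };
  }
}
```

The recorded `elapsedMs` values could later be compared against the 30-second NFR1 target without changing the calling code.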

== 4. What's NOT in POC Scope ==

=== 4.1 User-Facing Features (Beta 0+) ===

{{warning}}
**Deferred to Beta 0:** The user-facing features listed below are not part of the POC.
{{/warning}}

**Out of Scope:**
* ❌ User accounts and authentication (FR8)
* ❌ User corrections system (FR9, FR45-46)
* ❌ Public publishing interface (FR10)
* ❌ Social sharing (FR11)
* ❌ Email notifications (FR12)
* ❌ API access (FR13)

**Rationale:** POC validates AI capabilities. User features are added in Beta 0.


=== 4.2 Advanced Features (V1.0+) ===

**Out of Scope:**
* ❌ IFCN compliance (FR47)
* ❌ ClaimReview schema (FR48)
* ❌ Archive.org integration (FR49)
* ❌ OSINT toolkit (FR50)
* ❌ Video verification (FR51)
* ❌ Deepfake detection (FR52)
* ❌ Cross-org sharing (FR53)

**Rationale:** Advanced features require a proven platform. Added in V1.0 or later.


=== 4.3 Production Requirements (POC2, Beta 0) ===

**Out of Scope:**
* ❌ Security controls (NFR4, NFR12)
* ❌ Code maintainability (NFR5)
* ❌ System monitoring (NFR13)
* ❌ Evidence deduplication
* ❌ Advanced source verification
* ❌ Full 7-gate quality framework

**Rationale:** POC proves concept. Production hardening happens in POC2 and Beta 0.


== 5. POC Output Specification ==

=== 5.1 Required Output Elements ===

For each analyzed claim, POC must produce:

**1. Claim**
* Original text
* Classification (factual/non-factual/ambiguous)
* If non-factual: Clear reason why

**2. Scenarios** (if factual)
* 2-3 interpretation scenarios
* Each scenario clearly described

**3. Evidence** (if factual)
* Supporting evidence (2+ items)
* Opposing evidence (if any exists)
* Source URLs and reliability scores

**4. Verdict** (if factual)
* Probability (0-100%)
* Uncertainty quantification
* Confidence level (LOW/MEDIUM/HIGH)
* Reasoning chain

**5. Quality Status**
* Which gates passed/failed
* If failed: Clear explanation why

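The required elements above can be captured as a typed structure so that missing parts are caught early. The sketch below is an assumption-laden illustration: the field names follow the example output in section 5.2, while `url`, `rejection_reason`, and the `missingElements` helper are hypothetical additions that do not appear in that example.

```typescript
type GateStatus = "PASS" | "FAIL";

interface EvidenceItem {
  source: string;
  url?: string;        // assumption: 5.1 asks for source URLs
  reliability: number; // 0..1
  excerpt: string;
}

interface PocOutput {
  claim: {
    text: string;
    type: "factual" | "non-factual" | "ambiguous";
    gate1_status: GateStatus;
    rejection_reason?: string; // assumption: holds the "clear reason why"
  };
  // The remaining elements are required only for factual claims.
  scenarios?: string[];
  evidence?: { supporting: EvidenceItem[]; opposing: EvidenceItem[] };
  verdict?: {
    probability: number; // 0..1
    uncertainty: number; // 0..1
    confidence: "LOW" | "MEDIUM" | "HIGH";
    reasoning: string;
    gate4_status: GateStatus;
  };
}

// Lists required elements that are absent. This checks presence only;
// it does not enforce the "2+ supporting items" count from 5.1.
function missingElements(out: PocOutput): string[] {
  if (out.claim.type !== "factual") {
    return out.claim.rejection_reason ? [] : ["claim.rejection_reason"];
  }
  const missing: string[] = [];
  if (!out.scenarios || out.scenarios.length === 0) missing.push("scenarios");
  if (!out.evidence || out.evidence.supporting.length === 0) {
    missing.push("evidence.supporting");
  }
  if (!out.verdict) missing.push("verdict");
  return missing;
}
```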

=== 5.2 Example POC Output ===

{{code language="json"}}
{
  "claim": {
    "text": "Switzerland has the highest life expectancy in Europe",
    "type": "factual",
    "gate1_status": "PASS"
  },
  "scenarios": [
    "Switzerland's overall life expectancy is highest",
    "Switzerland ranks highest for specific age groups"
  ],
  "evidence": {
    "supporting": [
      {
        "source": "WHO Report 2023",
        "reliability": 0.95,
        "excerpt": "Switzerland: 83.4 years average..."
      }
    ],
    "opposing": [
      {
        "source": "Eurostat 2024",
        "reliability": 0.90,
        "excerpt": "Spain leads at 83.5 years..."
      }
    ]
  },
  "verdict": {
    "probability": 0.65,
    "uncertainty": 0.15,
    "confidence": "MEDIUM",
    "reasoning": "WHO and Eurostat show similar but conflicting data...",
    "gate4_status": "PASS"
  }
}
{{/code}}


== 6. Success Criteria ==

{{success}}
**POC Success Definition:** POC succeeds if it shows that AI can extract claims, find balanced evidence, and compute reasonable verdicts, and that quality gates improve output quality.
{{/success}}

=== 6.1 Functional Success ===

POC is successful if:

✅ **FR1-FR7 Requirements Met:**
1. Extracts 3-5 factual claims from test articles
2. Generates 2-3 scenarios per ambiguous claim
3. Finds supporting AND opposing evidence
4. Computes probabilistic verdicts with uncertainty
5. Provides clear reasoning chains

✅ **Quality Gates Work:**
1. Gate 1 blocks non-factual claims (100% block rate)
2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
3. Clear rejection reasons provided

✅ **NFR11 Met:**
1. Quality gates reduce hallucination rate
2. Blocked outputs have clear explanations
3. Quality metrics are logged


=== 6.2 Quality Thresholds ===

**Minimum Acceptable:**
* ≥70% of test claims correctly classified (factual/non-factual)
* ≥60% of verdicts are reasonable (human evaluation)
* Gate 1 blocks 100% of non-factual claims
* Gate 4 blocks verdicts with <2 sources

**Target:**
* ≥80% claims correctly classified
* ≥75% verdicts are reasonable
* <10% false positives (blocking good claims)

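The thresholds above can be checked mechanically once the rates have been measured. The following sketch assumes a hypothetical `PocMetrics` record and `assessThresholds` function; the Gate 4 criterion is a behavioral check rather than a rate, so it is left out here and would need its own test.

```typescript
// Hypothetical metrics record; all rates are fractions in the range 0..1.
interface PocMetrics {
  classificationAccuracy: number; // test claims correctly classified
  reasonableVerdictRate: number;  // verdicts judged reasonable by human evaluation
  gate1BlockRate: number;         // non-factual claims blocked by Gate 1
  falsePositiveRate: number;      // good claims wrongly blocked
}

type ThresholdLevel = "TARGET" | "MINIMUM" | "BELOW_MINIMUM";

// Compares measured rates with the minimum and target thresholds.
// Target is only reported when the minimum criteria also hold.
function assessThresholds(m: PocMetrics): ThresholdLevel {
  const meetsMinimum =
    m.classificationAccuracy >= 0.7 &&
    m.reasonableVerdictRate >= 0.6 &&
    m.gate1BlockRate === 1;
  if (!meetsMinimum) return "BELOW_MINIMUM";

  const meetsTarget =
    m.classificationAccuracy >= 0.8 &&
    m.reasonableVerdictRate >= 0.75 &&
    m.falsePositiveRate < 0.1;
  return meetsTarget ? "TARGET" : "MINIMUM";
}
```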

=== 6.3 POC Decision Gate ===

**After POC1, we decide:**

**✅ PROCEED to POC2** if:
* Success criteria met
* Quality gates demonstrably improve output
* Core workflow is technically sound
* Clear path to production quality

**⚠️ ITERATE POC1** if:
* Success criteria partially met
* Gates work but need tuning
* Core issues identified but fixable

**❌ PIVOT APPROACH** if:
* Success criteria not met
* Fundamental AI limitations discovered
* Quality gates insufficient
* Alternative approach needed


== 7. Test Cases ==

=== 7.1 Happy Path ===

**Test 1: Simple Factual Claim**
* Input: "Paris is the capital of France"
* Expected: Factual, 1 scenario, verdict ~95% true

**Test 2: Ambiguous Claim**
* Input: "Switzerland has the highest income in Europe"
* Expected: Factual, 2-3 scenarios, verdict with uncertainty

**Test 3: Statistical Claim**
* Input: "10% of people have condition X"
* Expected: Factual, evidence with numbers, probabilistic verdict


=== 7.2 Edge Cases ===

**Test 4: Opinion**
* Input: "Paris is the best city"
* Expected: Non-factual (opinion), blocked by Gate 1

**Test 5: Prediction**
* Input: "Bitcoin will reach $100,000 next year"
* Expected: Non-factual (prediction), blocked by Gate 1

**Test 6: Insufficient Evidence**
* Input: Obscure factual claim with no sources
* Expected: Blocked by Gate 4 (<2 sources)


=== 7.3 Quality Gate Tests ===

**Test 7: Gate 1 Effectiveness**
* Input: Mix of 10 factual + 10 non-factual claims
* Expected: Gate 1 blocks all 10 non-factual claims (100% recall)

**Test 8: Gate 4 Effectiveness**
* Input: Claims with varying evidence availability
* Expected: Gate 4 blocks low-confidence verdicts

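For Test 7, the result of a run over a labeled claim set can be summarized with two rates: how many non-factual claims Gate 1 blocked, and how many factual claims it blocked by mistake. The sketch below assumes hypothetical `LabeledResult` and `gate1Stats` names and human-assigned ground-truth labels.

```typescript
// Hypothetical record of one test claim after it has passed through Gate 1.
interface LabeledResult {
  isFactual: boolean; // ground-truth label assigned by a human reviewer
  blocked: boolean;   // whether Gate 1 blocked the claim
}

interface GateStats {
  blockRate: number;         // blocked non-factual claims / all non-factual claims
  falsePositiveRate: number; // blocked factual claims / all factual claims
}

// Computes both rates; a rate is reported as 0 when its group is empty,
// so callers should make sure the test set contains both kinds of claim.
function gate1Stats(results: LabeledResult[]): GateStats {
  const nonFactual = results.filter((r) => !r.isFactual);
  const factual = results.filter((r) => r.isFactual);
  const ratio = (hits: number, total: number): number =>
    total === 0 ? 0 : hits / total;
  return {
    blockRate: ratio(nonFactual.filter((r) => r.blocked).length, nonFactual.length),
    falsePositiveRate: ratio(factual.filter((r) => r.blocked).length, factual.length),
  };
}
```

Test 7 passes when `blockRate` equals 1 for the 10 non-factual claims; `falsePositiveRate` over the 10 factual claims feeds the "<10% false positives" target in section 6.2.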

== 8. Technical Architecture (POC) ==

=== 8.1 Simplified Architecture ===

**POC Tech Stack:**
* **Frontend:** Simple web interface (Next.js + TypeScript)
* **Backend:** Single API endpoint
* **AI:** Claude API (Sonnet 4.5)
* **Database:** Local JSON files (no database)
* **Deployment:** Single server

**Architecture Diagram:** See [[POC1 Specification>>FactHarbor.Specification.POC.Specification]]


=== 8.2 AKEL Implementation ===

**POC AKEL:**
* Single-threaded processing
* Synchronous API calls
* No caching
* Basic error handling
* Console logging

**Full AKEL (POC2+):**
* Multi-threaded processing
* Async API calls
* Evidence caching
* Advanced error handling with retry
* Structured logging + monitoring

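The POC AKEL described above amounts to a sequential loop: one claim at a time, each stage called in order, with no caching or concurrency. The sketch below illustrates that control flow only. The `Stages` interface and `analyzeArticle` function are hypothetical, the stages are injected as plain functions standing in for the real LLM and search calls, and scenario generation and the Gate 4 quality-score check are omitted for brevity.

```typescript
// Hypothetical stage signatures; the real AKEL interfaces are not defined on this page.
interface Stages {
  extractClaims: (article: string) => string[];
  isFactual: (claim: string) => boolean; // stands in for Gate 1
  findEvidence: (claim: string) => string[];
  computeVerdict: (claim: string, evidence: string[]) => number; // probability 0..1
}

interface ClaimOutcome {
  claim: string;
  status: "BLOCKED_GATE1" | "BLOCKED_GATE4" | "OK";
  probability?: number; // set only when status is "OK"
}

// Processes claims strictly one after another, with no caching or retries.
function analyzeArticle(article: string, stages: Stages): ClaimOutcome[] {
  const outcomes: ClaimOutcome[] = [];
  for (const claim of stages.extractClaims(article)) {
    if (!stages.isFactual(claim)) {
      outcomes.push({ claim, status: "BLOCKED_GATE1" });
      continue;
    }
    const evidence = stages.findEvidence(claim);
    if (evidence.length < 2) {
      // Source-count part of Gate 4 only; the quality score is not modeled here.
      outcomes.push({ claim, status: "BLOCKED_GATE4" });
      continue;
    }
    outcomes.push({
      claim,
      status: "OK",
      probability: stages.computeVerdict(claim, evidence),
    });
  }
  return outcomes;
}
```

Injecting the stages keeps the loop testable with stubs and leaves room for the POC2 version to swap in asynchronous, cached implementations.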

== 9. POC Philosophy ==

{{info}}
**Important:** POC validates the concept, not production readiness. The focus is on proving that AI can do the job; production quality comes in later phases.
{{/info}}

=== 9.1 Core Principles ===

**1. Prove Concept, Not Production**
* POC validates AI can do the job
* Production quality comes in POC2 and Beta 0
* Focus on "does it work?" not "is it perfect?"

**2. Implement Subset of Requirements**
* POC covers FR1-7 and NFR11 (lite), with NFR1-3 partially addressed
* All other requirements deferred
* Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]

**3. Quality Gates Validate Approach**
* 2 gates prove the concept
* Remaining 5 gates added in POC2
* Gates must demonstrably improve quality

**4. Iterate Based on Results**
* POC results determine next steps
* Decision gate after POC1
* Flexibility to pivot if needed


=== 9.2 Success = Clear Path Forward ===

POC succeeds if we can confidently answer:

✅ **Technical Feasibility:**
* Can AI extract claims reliably?
* Can AI find balanced evidence?
* Can AI compute reasonable verdicts?

✅ **Quality Approach:**
* Do quality gates improve output?
* Can we measure and track quality?
* Is the gate approach scalable?

✅ **Production Path:**
* Is the core architecture sound?
* What needs improvement for production?
* Is POC2 the right next step?


== 10. Related Pages ==

* **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
* **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
* **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
* **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
* **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)


**Document Owner:** Technical Team
**Review Frequency:** After each POC iteration
**Version History:**
* v1.0 - Initial POC requirements
* v2.0 - Updated after specification cross-check
* v3.0 - Aligned with Main Requirements (FR/NFR IDs added)