Version 1.1 by Robert Schaub on 2025/12/23 11:20

= POC Requirements =

**Status:** ✅ Approved for Development
**Version:** 3.0 (Aligned with Main Requirements)
**Goal:** Prove that AI can extract claims and determine verdicts without human intervention

{{info}}
**Core Philosophy:** The POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through a simplified implementation. All POC features map to formal FR/NFR requirements.
{{/info}}


== 1. POC Overview ==

=== 1.1 What POC Tests ===

**Core Question:**
> Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?

**What we're proving:**
* AI can identify factual claims from text
* AI can evaluate those claims with structured evidence
* Quality gates can filter unreliable outputs
* The core workflow is technically feasible

**What we're NOT proving:**
* Production-ready reliability (that's POC2)
* User-facing features (that's Beta 0)
* Full IFCN compliance (that's V1.0)

=== 1.2 Requirements Mapping ===

POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].

**Scope Summary:**
* **In Scope:** 8 requirements (7 FRs + 1 NFR)
* **Partial:** 3 NFRs (simplified versions)
* **Out of Scope:** 19 requirements (deferred to later phases)


== 2. Requirements Scope Matrix ==

{{success}}
**Authoritative Source:** See [[Requirements Roadmap Matrix>>Test.FactHarbor.Specification.Requirements-Roadmap-Matrix.WebHome]] for complete phase-to-requirement mapping across all phases.
{{/success}}

**POC1 Scope Summary:**

POC1 implements the following requirements from the [[Main Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]:

**Full Implementation (8 requirements):**
* FR1: Claim Extraction
* FR2: Claim Context
* FR3: Multiple Scenarios
* FR4: Analysis Summary (Basic)
* FR5: Evidence Collection
* FR6: Source Quality Assessment
* FR7: Automated Verdicts (with quality gates)
* NFR11: AKEL Quality Assurance Framework (Basic - 2 quality gates: Gate 1 and Gate 4)

**Partial Implementation (3 requirements):**
* NFR1: Explainability (Basic explanations only)
* NFR2: Performance (Functional but not optimized)
* NFR3: Transparency (Basic transparency)

**Deferred to Later Phases:**
* All other requirements (see Roadmap Matrix for phase assignments)

**Detailed POC1 specifications continue below...**


== 3. POC Simplifications ==

=== 3.1 FR1: Claim Extraction (Full Implementation) ===

**Main Requirement:** AI extracts factual claims from input text

**POC Implementation:**
* ✅ AKEL extracts claims using LLM
* ✅ Each claim includes original text reference
* ✅ Claims are identified as factual/non-factual
* ❌ No advanced claim parsing (added in POC2)

**Acceptance Criteria:**
* Extracts 3-5 claims from a typical article
* Identifies factual vs non-factual claims
* Quality Gate 1 validates extraction


=== 3.2 FR3: Multiple Scenarios (Full Implementation) ===

**Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims

**POC Implementation:**
* ✅ AKEL generates 2-3 scenarios per ambiguous claim
* ✅ Scenarios capture different interpretations
* ✅ Each scenario is evaluated separately
* ✅ Verdict considers all scenarios

**Acceptance Criteria:**
* Generates 2+ scenarios for ambiguous claims
* Scenarios are meaningfully different
* All scenarios are evaluated


=== 3.3 FR4: Analysis Summary (Basic Implementation) ===

**Main Requirement:** Provide user-friendly summary of analysis

**POC Implementation:**
* ✅ Simple text summary generated
* ❌ No rich formatting (added in Beta 0)
* ❌ No visual elements (added in Beta 0)
* ❌ No interactive features (added in Beta 0)

**POC Format:**
```
Claim: [extracted claim]
Scenarios: [list of scenarios]
Evidence: [supporting/opposing evidence]
Verdict: [probability with uncertainty]
```
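
The format above could be rendered with a small helper. The following is a minimal sketch in TypeScript (the POC's stated frontend language), not the actual implementation: the `AnalysisSummary` shape, the `formatSummary` name, and the 0..1 scale for probability and uncertainty are all assumptions made for illustration.

```typescript
// Hypothetical shape for one analyzed claim; all names are illustrative.
interface AnalysisSummary {
  claim: string;
  scenarios: string[];
  supporting: string[]; // short evidence descriptions
  opposing: string[];
  probability: number; // assumed scale 0..1
  uncertainty: number; // assumed scale 0..1, shown as ± percentage points
}

// Renders the four-line plain-text summary in the POC format.
function formatSummary(s: AnalysisSummary): string {
  const pct = (x: number): number => Math.round(x * 100);
  const evidence = s.supporting
    .map((e) => `supporting: ${e}`)
    .concat(s.opposing.map((e) => `opposing: ${e}`));
  return [
    `Claim: ${s.claim}`,
    `Scenarios: ${s.scenarios.join("; ")}`,
    `Evidence: ${evidence.join("; ")}`,
    `Verdict: ${pct(s.probability)}% likely true (±${pct(s.uncertainty)}%)`,
  ].join("\n");
}
```

Plain string output keeps the sketch in line with the "simple text summary" scope; rich formatting is deferred to Beta 0.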


=== 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===

**Main Requirements:**
* FR5: Collect supporting and opposing evidence
* FR6: Evaluate evidence source reliability

**POC Implementation:**
* ✅ AKEL searches for evidence (web/knowledge base)
* ✅ **Mandatory contradiction search** (finds opposing evidence)
* ✅ Source reliability scoring
* ❌ No evidence deduplication (added in POC2)
* ❌ No advanced source verification (added in POC2)

**Acceptance Criteria:**
* Finds 2+ supporting evidence items
* Finds 1+ opposing evidence (if exists)
* Sources scored for reliability
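
The acceptance criteria above can be checked mechanically once evidence has been collected. The sketch below assumes a hypothetical `EvidenceItem` shape with a `stance` field and an optional `reliability` score; neither name comes from the specification. Because opposing evidence is only required "if exists", `hasOpposing` is reported as information rather than treated as a failure.

```typescript
// Hypothetical evidence record; field names are assumptions for illustration.
interface EvidenceItem {
  source: string;
  stance: "supporting" | "opposing";
  reliability?: number; // assumed 0..1; undefined means not yet scored
}

interface EvidenceCheck {
  enoughSupporting: boolean; // 2+ supporting items
  hasOpposing: boolean; // informational: opposing evidence may not exist
  allScored: boolean; // every source carries a reliability score
}

function checkEvidence(items: EvidenceItem[]): EvidenceCheck {
  const supporting = items.filter((e) => e.stance === "supporting");
  const opposing = items.filter((e) => e.stance === "opposing");
  return {
    enoughSupporting: supporting.length >= 2,
    hasOpposing: opposing.length >= 1,
    allScored: items.every((e) => typeof e.reliability === "number"),
  };
}
```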


=== 3.5 FR7: Automated Verdicts (Full Implementation) ===

**Main Requirement:** AI computes verdicts with uncertainty quantification

**POC Implementation:**
* ✅ Probabilistic verdicts (0-100% probability)
* ✅ Uncertainty explicitly stated
* ✅ Reasoning chain provided
* ✅ Quality Gate 4 validates verdict confidence

**POC Output:**
```
Verdict: 70% likely true
Uncertainty: ±15% (moderate confidence)
Reasoning: Based on 3 high-quality sources...
Confidence Level: MEDIUM
```

**Acceptance Criteria:**
* Verdicts include probability (0-100%)
* Uncertainty explicitly quantified
* Reasoning chain explains verdict
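
One way to make these acceptance criteria testable is a small validation function. This is a sketch under assumptions: the `Verdict` shape and the `verdictProblems` name are hypothetical, and probability is expressed in percent to match the criteria above.

```typescript
// Hypothetical verdict record; names and units are assumptions.
interface Verdict {
  probability: number; // percent, 0..100
  uncertainty: number; // ± percentage points
  confidence: "LOW" | "MEDIUM" | "HIGH";
  reasoning: string;
}

// Returns a list of violated acceptance criteria (empty when the verdict is acceptable).
function verdictProblems(v: Verdict): string[] {
  const problems: string[] = [];
  // The negated comparisons also catch NaN.
  if (!(v.probability >= 0 && v.probability <= 100)) {
    problems.push("probability must be within 0-100%");
  }
  if (!(v.uncertainty >= 0)) {
    problems.push("uncertainty must be quantified as a non-negative number");
  }
  if (v.reasoning.trim().length === 0) {
    problems.push("reasoning chain is missing");
  }
  return problems;
}
```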


=== 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===

**Main Requirement:** Complete quality assurance with 7 quality gates

**POC Implementation:** **2 gates only**

**Quality Gate 1: Claim Validation**
* ✅ Validates claim is factual and verifiable
* ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
* ✅ Provides clear rejection reason

**Quality Gate 4: Verdict Confidence Assessment**
* ✅ Validates ≥2 sources found
* ✅ Validates quality score ≥0.6
* ✅ Blocks low-confidence verdicts
* ✅ Provides clear rejection reason

**Out of Scope (POC2+):**
* ❌ Gate 2: Evidence Relevance
* ❌ Gate 3: Scenario Coherence
* ❌ Gate 5: Source Diversity
* ❌ Gate 6: Reasoning Validity
* ❌ Gate 7: Output Completeness

**Rationale:** Prove gate concept works. Add remaining gates in POC2 after validating approach.
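
The two POC gates reduce to simple threshold checks once classification and scoring are available. The sketch below is illustrative only: `ClaimType`, `GateResult`, `gate1`, and `gate4` are hypothetical names, and the classification and quality score are taken as inputs because this page does not specify how AKEL computes them.

```typescript
// Hypothetical classification labels, taken from the Gate 1 bullet list above.
type ClaimType = "factual" | "opinion" | "prediction" | "ambiguous";

interface GateResult {
  gate: 1 | 4;
  status: "PASS" | "FAIL";
  reason?: string; // present when the gate blocks
}

// Gate 1: only factual, verifiable claims pass.
function gate1(claimType: ClaimType): GateResult {
  if (claimType === "factual") return { gate: 1, status: "PASS" };
  return {
    gate: 1,
    status: "FAIL",
    reason: `Claim is not verifiable: classified as ${claimType}`,
  };
}

// Gate 4: requires at least 2 sources and a quality score of at least 0.6.
function gate4(sourceCount: number, qualityScore: number): GateResult {
  if (sourceCount < 2) {
    return {
      gate: 4,
      status: "FAIL",
      reason: `Only ${sourceCount} source(s) found; at least 2 required`,
    };
  }
  if (qualityScore < 0.6) {
    return {
      gate: 4,
      status: "FAIL",
      reason: `Quality score ${qualityScore} is below 0.6`,
    };
  }
  return { gate: 4, status: "PASS" };
}
```

Returning a reason with every failure keeps the "clear rejection reason" requirement visible in the type itself.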


=== 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===

**Main Requirements:**
* NFR1: Response time < 30 seconds
* NFR2: Handle 1000+ concurrent users
* NFR3: 99.9% uptime

**POC Implementation:**
* ⚠️ **Response time monitored** (not optimized)
* ⚠️ **Single-threaded processing** (no concurrency)
* ⚠️ **Basic error handling** (no advanced retry logic)

**Rationale:** POC proves functionality. Performance optimization happens in POC2.

**POC Acceptance:**
* Analysis completes (no timeout requirement)
* Errors don't crash system
* Basic logging in place
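
A wrapper along the following lines would cover the three POC acceptance points: it measures elapsed time without enforcing a timeout, converts exceptions into a result value so the process keeps running, and logs to the console. `runMonitored` and `RunResult` are hypothetical names, and the synchronous signature is an assumption based on the single-threaded scope.

```typescript
// Hypothetical result wrapper; not part of the specification.
interface RunResult<T> {
  ok: boolean;
  value?: T;
  error?: string;
  elapsedMs: number;
}

// Runs one synchronous step, logs its duration, and never rethrows.
function runMonitored<T>(label: string, step: () => T): RunResult<T> {
  const start = Date.now();
  try {
    const value = step();
    const elapsedMs = Date.now() - start;
    console.log(`[${label}] completed in ${elapsedMs} ms`);
    return { ok: true, value, elapsedMs };
  } catch (err) {
    const elapsedMs = Date.now() - start;
    const message = err instanceof Error ? err.message : String(err);
    console.error(`[${label}] failed after ${elapsedMs} ms: ${message}`);
    return { ok: false, error: message, elapsedMs };
  }
}
```

There is deliberately no retry logic here, matching the "basic error handling" scope.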


== 4. What's NOT in POC Scope ==

=== 4.1 User-Facing Features (Beta 0+) ===

{{warning}}
**Deferred to Beta 0:** User-facing features are not part of the POC.
{{/warning}}

**Out of Scope:**
* ❌ User accounts and authentication (FR8)
* ❌ User corrections system (FR9, FR45-46)
* ❌ Public publishing interface (FR10)
* ❌ Social sharing (FR11)
* ❌ Email notifications (FR12)
* ❌ API access (FR13)

**Rationale:** POC validates AI capabilities. User features added in Beta 0.


=== 4.2 Advanced Features (V1.0+) ===

**Out of Scope:**
* ❌ IFCN compliance (FR47)
* ❌ ClaimReview schema (FR48)
* ❌ Archive.org integration (FR49)
* ❌ OSINT toolkit (FR50)
* ❌ Video verification (FR51)
* ❌ Deepfake detection (FR52)
* ❌ Cross-org sharing (FR53)

**Rationale:** Advanced features require a proven platform. Added in V1.0 and later.


=== 4.3 Production Requirements (POC2, Beta 0) ===

**Out of Scope:**
* ❌ Security controls (NFR4, NFR12)
* ❌ Code maintainability (NFR5)
* ❌ System monitoring (NFR13)
* ❌ Evidence deduplication
* ❌ Advanced source verification
* ❌ Full 7-gate quality framework

**Rationale:** POC proves concept. Production hardening happens in POC2 and Beta 0.


== 5. POC Output Specification ==

=== 5.1 Required Output Elements ===

For each analyzed claim, POC must produce:

**1. Claim**
* Original text
* Classification (factual/non-factual/ambiguous)
* If non-factual: Clear reason why

**2. Scenarios** (if factual)
* 2-3 interpretation scenarios
* Each scenario clearly described

**3. Evidence** (if factual)
* Supporting evidence (2+ items)
* Opposing evidence (if exists)
* Source URLs and reliability scores

**4. Verdict** (if factual)
* Probability (0-100%)
* Uncertainty quantification
* Confidence level (LOW/MEDIUM/HIGH)
* Reasoning chain

**5. Quality Status**
* Which gates passed/failed
* If failed: Clear explanation why


=== 5.2 Example POC Output ===

{{code language="json"}}
{
  "claim": {
    "text": "Switzerland has the highest life expectancy in Europe",
    "type": "factual",
    "gate1_status": "PASS"
  },
  "scenarios": [
    "Switzerland's overall life expectancy is highest",
    "Switzerland ranks highest for specific age groups"
  ],
  "evidence": {
    "supporting": [
      {
        "source": "WHO Report 2023",
        "reliability": 0.95,
        "excerpt": "Switzerland: 83.4 years average..."
      }
    ],
    "opposing": [
      {
        "source": "Eurostat 2024",
        "reliability": 0.90,
        "excerpt": "Spain leads at 83.5 years..."
      }
    ]
  },
  "verdict": {
    "probability": 0.65,
    "uncertainty": 0.15,
    "confidence": "MEDIUM",
    "reasoning": "WHO and Eurostat show similar but conflicting data...",
    "gate4_status": "PASS"
  }
}
{{/code}}
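
The JSON above maps directly onto TypeScript types. The interfaces below mirror the field names in the example; the type names (`PocOutput`, `EvidenceEntry`, `GateStatus`) and the `qualityStatus` helper are assumptions added for illustration. Note that casting the result of `JSON.parse` performs no runtime validation, so a real implementation would likely want a schema check as well.

```typescript
type GateStatus = "PASS" | "FAIL";

interface EvidenceEntry {
  source: string;
  reliability: number;
  excerpt: string;
}

// Mirrors the example output; type names are hypothetical.
interface PocOutput {
  claim: { text: string; type: string; gate1_status: GateStatus };
  scenarios: string[];
  evidence: { supporting: EvidenceEntry[]; opposing: EvidenceEntry[] };
  verdict: {
    probability: number;
    uncertainty: number;
    confidence: "LOW" | "MEDIUM" | "HIGH";
    reasoning: string;
    gate4_status: GateStatus;
  };
}

// Summarizes which gates passed or failed (the "Quality Status" element).
function qualityStatus(o: PocOutput): { passed: string[]; failed: string[] } {
  const gates: [string, GateStatus][] = [
    ["gate1", o.claim.gate1_status],
    ["gate4", o.verdict.gate4_status],
  ];
  return {
    passed: gates.filter(([, s]) => s === "PASS").map(([g]) => g),
    failed: gates.filter(([, s]) => s === "FAIL").map(([g]) => g),
  };
}
```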


== 6. Success Criteria ==

{{success}}
**POC Success Definition:** POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts with quality gates improving output quality.
{{/success}}

=== 6.1 Functional Success ===

POC is successful if:

✅ **FR1-FR7 Requirements Met:**
1. Extracts 3-5 factual claims from test articles
2. Generates 2-3 scenarios per ambiguous claim
3. Finds supporting AND opposing evidence
4. Computes probabilistic verdicts with uncertainty
5. Provides clear reasoning chains

✅ **Quality Gates Work:**
1. Gate 1 blocks non-factual claims (100% block rate)
2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
3. Clear rejection reasons provided

✅ **NFR11 Met:**
1. Quality gates reduce hallucination rate
2. Blocked outputs have clear explanations
3. Quality metrics are logged


=== 6.2 Quality Thresholds ===

**Minimum Acceptable:**
* ≥70% of test claims correctly classified (factual/non-factual)
* ≥60% of verdicts are reasonable (human evaluation)
* Gate 1 blocks 100% of non-factual claims
* Gate 4 blocks verdicts with <2 sources

**Target:**
* ≥80% claims correctly classified
* ≥75% verdicts are reasonable
* <10% false positives (blocking good claims)
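
These thresholds can be computed from a labeled test run. The sketch below rests on several assumptions: the `TestOutcome` record is hypothetical, "false positives" is interpreted as the share of factual claims that were wrongly blocked, and the Gate 4 source-count criterion is left out because it needs per-claim source data. A rate whose denominator is zero is reported as 0, so a test set without non-factual claims cannot satisfy the minimum.

```typescript
// Hypothetical record of one test claim after human review.
interface TestOutcome {
  expectedFactual: boolean; // human label
  classifiedFactual: boolean; // system classification (false = blocked by Gate 1)
  verdictReasonable?: boolean; // human evaluation; undefined if no verdict was produced
}

interface QualityMetrics {
  classificationAccuracy: number;
  reasonableVerdictRate: number;
  nonFactualBlockRate: number;
  falsePositiveRate: number; // assumed: blocked factual claims / factual claims
}

const ratio = (n: number, d: number): number => (d === 0 ? 0 : n / d);

function computeMetrics(outcomes: TestOutcome[]): QualityMetrics {
  const factual = outcomes.filter((o) => o.expectedFactual);
  const nonFactual = outcomes.filter((o) => !o.expectedFactual);
  const judged = outcomes.filter((o) => o.verdictReasonable !== undefined);
  const correct = outcomes.filter((o) => o.expectedFactual === o.classifiedFactual);
  return {
    classificationAccuracy: ratio(correct.length, outcomes.length),
    reasonableVerdictRate: ratio(judged.filter((o) => o.verdictReasonable === true).length, judged.length),
    nonFactualBlockRate: ratio(nonFactual.filter((o) => !o.classifiedFactual).length, nonFactual.length),
    falsePositiveRate: ratio(factual.filter((o) => !o.classifiedFactual).length, factual.length),
  };
}

// "Minimum Acceptable" thresholds (without the Gate 4 source-count check).
function meetsMinimum(m: QualityMetrics): boolean {
  return m.classificationAccuracy >= 0.7 && m.reasonableVerdictRate >= 0.6 && m.nonFactualBlockRate === 1;
}

// "Target" thresholds.
function meetsTarget(m: QualityMetrics): boolean {
  return m.classificationAccuracy >= 0.8 && m.reasonableVerdictRate >= 0.75 && m.falsePositiveRate < 0.1;
}
```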


=== 6.3 POC Decision Gate ===

**After POC1, we decide:**

**✅ PROCEED to POC2** if:
* Success criteria met
* Quality gates demonstrably improve output
* Core workflow is technically sound
* Clear path to production quality

**⚠️ ITERATE POC1** if:
* Success criteria partially met
* Gates work but need tuning
* Core issues identified but fixable

**❌ PIVOT APPROACH** if:
* Success criteria not met
* Fundamental AI limitations discovered
* Quality gates insufficient
* Alternative approach needed


== 7. Test Cases ==

=== 7.1 Happy Path ===

**Test 1: Simple Factual Claim**
* Input: "Paris is the capital of France"
* Expected: Factual, 1 scenario, verdict ~95% true

**Test 2: Ambiguous Claim**
* Input: "Switzerland has the highest income in Europe"
* Expected: Factual, 2-3 scenarios, verdict with uncertainty

**Test 3: Statistical Claim**
* Input: "10% of people have condition X"
* Expected: Factual, evidence with numbers, probabilistic verdict


=== 7.2 Edge Cases ===

**Test 4: Opinion**
* Input: "Paris is the best city"
* Expected: Non-factual (opinion), blocked by Gate 1

**Test 5: Prediction**
* Input: "Bitcoin will reach $100,000 next year"
* Expected: Non-factual (prediction), blocked by Gate 1

**Test 6: Insufficient Evidence**
* Input: Obscure factual claim with no sources
* Expected: Blocked by Gate 4 (<2 sources)


=== 7.3 Quality Gate Tests ===

**Test 7: Gate 1 Effectiveness**
* Input: Mix of 10 factual + 10 non-factual claims
* Expected: Gate 1 blocks all 10 non-factual (100% block rate)

**Test 8: Gate 4 Effectiveness**
* Input: Claims with varying evidence availability
* Expected: Gate 4 blocks low-confidence verdicts
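
The Gate 1 expectations from Tests 1, 4, and 5 lend themselves to a table-driven check. In the sketch below, `GateTestCase` and `runGateCases` are hypothetical names, and the `isBlockedByGate1` callback stands in for the real AKEL call, which this page does not define.

```typescript
// Hypothetical test-case record; inputs are copied from the tests above.
interface GateTestCase {
  name: string;
  input: string;
  expectBlockedByGate1: boolean;
}

const gateCases: GateTestCase[] = [
  { name: "Test 1: Simple Factual Claim", input: "Paris is the capital of France", expectBlockedByGate1: false },
  { name: "Test 4: Opinion", input: "Paris is the best city", expectBlockedByGate1: true },
  { name: "Test 5: Prediction", input: "Bitcoin will reach $100,000 next year", expectBlockedByGate1: true },
];

// Returns the names of the cases whose Gate 1 outcome differs from the expectation.
function runGateCases(
  cases: GateTestCase[],
  isBlockedByGate1: (input: string) => boolean
): string[] {
  return cases
    .filter((c) => isBlockedByGate1(c.input) !== c.expectBlockedByGate1)
    .map((c) => c.name);
}
```

An empty result means every listed case behaved as expected; any returned name points at a case to inspect by hand.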


== 8. Technical Architecture (POC) ==

=== 8.1 Simplified Architecture ===

**POC Tech Stack:**
* **Frontend:** Simple web interface (Next.js + TypeScript)
* **Backend:** Single API endpoint
* **AI:** Claude API (Sonnet 4.5)
* **Database:** Local JSON files (no database)
* **Deployment:** Single server

**Architecture Diagram:** See [[POC1 Specification>>FactHarbor.Specification.POC.Specification]]


=== 8.2 AKEL Implementation ===

**POC AKEL:**
* Single-threaded processing
* Synchronous API calls
* No caching
* Basic error handling
* Console logging

**Full AKEL (POC2+):**
* Multi-threaded processing
* Async API calls
* Evidence caching
* Advanced error handling with retry
* Structured logging + monitoring
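
The POC AKEL flow can be pictured as one sequential loop over extracted claims. The following sketch is an assumption-laden outline rather than the actual design: the `Stages` interface, `ClaimReport`, and `analyzeArticle` are hypothetical, each stage is injected so nothing is assumed about the Claude API client, and Gate 4 is reduced to the source-count check (the quality-score check is omitted for brevity).

```typescript
// Injected stages; in the POC these would wrap LLM and search calls.
interface Stages {
  extractClaims: (article: string) => string[];
  isFactual: (claim: string) => boolean; // Gate 1 decision
  collectEvidence: (claim: string) => string[]; // one entry per source
  computeVerdict: (claim: string, evidence: string[]) => number; // assumed scale 0..1
}

interface ClaimReport {
  claim: string;
  status: "OK" | "BLOCKED_GATE1" | "BLOCKED_GATE4";
  probability?: number;
}

// Processes claims one at a time: no concurrency, no caching.
function analyzeArticle(article: string, s: Stages): ClaimReport[] {
  const reports: ClaimReport[] = [];
  for (const claim of s.extractClaims(article)) {
    if (!s.isFactual(claim)) {
      reports.push({ claim, status: "BLOCKED_GATE1" });
      continue;
    }
    const evidence = s.collectEvidence(claim);
    if (evidence.length < 2) {
      reports.push({ claim, status: "BLOCKED_GATE4" });
      continue;
    }
    reports.push({ claim, status: "OK", probability: s.computeVerdict(claim, evidence) });
  }
  return reports;
}
```

Keeping the stages behind an interface also gives POC2 a natural seam for swapping in async calls and caching.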


== 9. POC Philosophy ==

{{info}}
**Important:** POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases.
{{/info}}

=== 9.1 Core Principles ===

**1. Prove Concept, Not Production**
* POC validates AI can do the job
* Production quality comes in POC2 and Beta 0
* Focus on "does it work?" not "is it perfect?"

**2. Implement Subset of Requirements**
* POC covers FR1-7, NFR11 (lite)
* All other requirements deferred
* Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]

**3. Quality Gates Validate Approach**
* 2 gates prove the concept
* Remaining 5 gates added in POC2
* Gates must demonstrably improve quality

**4. Iterate Based on Results**
* POC results determine next steps
* Decision gate after POC1
* Flexibility to pivot if needed


=== 9.2 Success = Clear Path Forward ===

POC succeeds if we can confidently answer:

✅ **Technical Feasibility:**
* Can AI extract claims reliably?
* Can AI find balanced evidence?
* Can AI compute reasonable verdicts?

✅ **Quality Approach:**
* Do quality gates improve output?
* Can we measure and track quality?
* Is the gate approach scalable?

✅ **Production Path:**
* Is the core architecture sound?
* What needs improvement for production?
* Is POC2 the right next step?


== 10. Related Pages ==

* **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
* **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
* **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
* **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
* **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)


**Document Owner:** Technical Team
**Review Frequency:** After each POC iteration
**Version History:**
* v1.0 - Initial POC requirements
* v2.0 - Updated after specification cross-check
* v3.0 - Aligned with Main Requirements (FR/NFR IDs added)