Wiki source code of POC Requirements

Last modified by Robert Schaub on 2025/12/23 11:35

1 = POC Requirements =
2
3 **Status:** ✅ Approved for Development
4 **Version:** 3.0 (Aligned with Main Requirements)
5 **Goal:** Prove that AI can extract claims and determine verdicts without human intervention
6
7 {{info}}
8 **Core Philosophy:** POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through simplified implementation. All POC features map to formal FR/NFR requirements.
9 {{/info}}
10
11
12 == 1. POC Overview ==
13
14 === 1.1 What POC Tests ===
15
16 **Core Question:**
17
18 > Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?
19
20 **What we're proving:**
21
22 * AI can identify factual claims from text
23 * AI can evaluate those claims with structured evidence
24 * Quality gates can filter unreliable outputs
25 * The core workflow is technically feasible
26
27 **What we're NOT proving:**
28
29 * Production-ready reliability (that's POC2)
30 * User-facing features (that's Beta 0)
31 * Full IFCN compliance (that's V1.0)
32
33 === 1.2 Requirements Mapping ===
34
35 POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].
36
37 **Scope Summary:**
38
39 * **In Scope:** 8 requirements (7 FRs + 1 NFR)
40 * **Partial:** 3 NFRs (simplified versions)
41 * **Out of Scope:** 19 requirements (deferred to later phases)
42
43 == 2. Requirements Scope Matrix ==
44
45 {{success}}
46 **Authoritative Source:** See [[Requirements Roadmap Matrix>>Test.FactHarbor V0\.9\.78.Specification.Requirements-Roadmap-Matrix.WebHome]] for complete phase-to-requirement mapping across all phases.
47 {{/success}}
48
49 **POC1 Scope Summary:**
50
51 POC1 implements the following requirements from the [[Main Requirements>>Test.FactHarbor V0\.9\.78.Specification.Requirements.WebHome]]:
52
53 **Full Implementation (8 requirements):**
54
55 * FR1: Claim Extraction
56 * FR2: Claim Context
57 * FR3: Multiple Scenarios
58 * FR4: Analysis Summary (Basic)
59 * FR5: Evidence Collection
60 * FR6: Source Quality Assessment
61 * FR7: Automated Verdicts (with quality gates)
62 * NFR11: AKEL Quality Assurance Framework (Basic: Gates 1 and 4 of the 7 quality gates)
63
64 **Partial Implementation (3 requirements):**
65
66 * NFR1: Explainability (Basic explanations only)
67 * NFR2: Performance (Functional but not optimized)
68 * NFR3: Transparency (Basic transparency)
69
70 **Deferred to Later Phases:**
71
72 * All other requirements (see Roadmap Matrix for phase assignments)
73
74 **Detailed POC1 specifications continue below...**
75
76
77 == 3. POC Simplifications ==
78
79 === 3.1 FR1: Claim Extraction (Full Implementation) ===
80
81 **Main Requirement:** AI extracts factual claims from input text
82
83 **POC Implementation:**
84
85 * ✅ AKEL extracts claims using LLM
86 * ✅ Each claim includes original text reference
87 * ✅ Claims are identified as factual/non-factual
88 * ❌ No advanced claim parsing (added in POC2)
89
90 **Acceptance Criteria:**
91
92 * Extracts 3-5 claims from typical article
93 * Identifies factual vs non-factual claims
94 * Quality Gate 1 validates extraction
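
To illustrate the "original text reference" requirement above, an extracted claim could carry character offsets into the source article. This is a minimal sketch, not the AKEL implementation: the `ExtractedClaim` type, its field names, and the offset-based reference are assumptions made for this example, and Python is used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExtractedClaim:
    """Hypothetical record for one extracted claim."""
    text: str         # claim as quoted from the article
    start: int        # offset of the first character in the article
    end: int          # offset one past the last character
    is_factual: bool  # classification assigned by the LLM step


def reference_is_valid(article: str, claim: ExtractedClaim) -> bool:
    """Check that the stored offsets really point at the quoted text."""
    return article[claim.start:claim.end] == claim.text


article = "Paris is the capital of France. Paris is the best city."
claim = ExtractedClaim(
    text="Paris is the capital of France", start=0, end=30, is_factual=True
)
```

A check like `reference_is_valid(article, claim)` could catch a claim whose quoted text does not actually appear in the article, which is one simple signal of a hallucinated extraction.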
95
96 === 3.2 FR3: Multiple Scenarios (Full Implementation) ===
97
98 **Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims
99
100 **POC Implementation:**
101
102 * ✅ AKEL generates 2-3 scenarios per ambiguous claim
103 * ✅ Scenarios capture different interpretations
104 * ✅ Each scenario is evaluated separately
105 * ✅ Verdict considers all scenarios
106
107 **Acceptance Criteria:**
108
109 * Generates 2+ scenarios for ambiguous claims
110 * Scenarios are meaningfully different
111 * All scenarios are evaluated
112
113 === 3.3 FR4: Analysis Summary (Basic Implementation) ===
114
115 **Main Requirement:** Provide user-friendly summary of analysis
116
117 **POC Implementation:**
118
119 * ✅ Simple text summary generated
120 * ❌ No rich formatting (added in Beta 0)
121 * ❌ No visual elements (added in Beta 0)
122 * ❌ No interactive features (added in Beta 0)
123
124 **POC Format:**
125 ```
126 Claim: [extracted claim]
127 Scenarios: [list of scenarios]
128 Evidence: [supporting/opposing evidence]
129 Verdict: [probability with uncertainty]
130 ```
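
The plain-text format above could be produced by a small formatting helper. The sketch below is only an illustration: the function name `render_summary`, its parameters, and the exact wording of the evidence and verdict lines are assumptions, not part of the specification.

```python
from typing import List


def render_summary(claim: str, scenarios: List[str], supporting: int,
                   opposing: int, probability_pct: int, uncertainty_pct: int) -> str:
    """Build the four-line plain-text summary in the POC format."""
    lines = [
        f"Claim: {claim}",
        f"Scenarios: {'; '.join(scenarios)}",
        f"Evidence: {supporting} supporting, {opposing} opposing",
        f"Verdict: {probability_pct}% likely true (±{uncertainty_pct}%)",
    ]
    return "\n".join(lines)


summary = render_summary(
    claim="Switzerland has the highest life expectancy in Europe",
    scenarios=["Overall life expectancy", "Specific age groups"],
    supporting=1, opposing=1, probability_pct=65, uncertainty_pct=15,
)
```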
131
132
133 === 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===
134
135 **Main Requirements:**
136
137 * FR5: Collect supporting and opposing evidence
138 * FR6: Evaluate evidence source reliability
139
140 **POC Implementation:**
141
142 * ✅ AKEL searches for evidence (web/knowledge base)
143 * ✅ **Mandatory contradiction search** (finds opposing evidence)
144 * ✅ Source reliability scoring
145 * ❌ No evidence deduplication (added in POC2)
146 * ❌ No advanced source verification (added in POC2)
147
148 **Acceptance Criteria:**
149
150 * Finds 2+ supporting evidence items
151 * Finds 1+ opposing evidence item (if any exists)
152 * Sources scored for reliability
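
The acceptance criteria above can be checked mechanically once evidence has been collected. The following is a minimal sketch under stated assumptions: `EvidenceItem`, its fields, and `unmet_evidence_criteria` are hypothetical names. The opposing-evidence criterion is not checked, because "if exists" cannot be decided from the collected items alone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EvidenceItem:
    source: str
    stance: str                   # "supporting" or "opposing"
    reliability: Optional[float]  # 0.0-1.0, or None if not yet scored


def unmet_evidence_criteria(items: List[EvidenceItem]) -> List[str]:
    """Return the acceptance criteria that `items` fails; empty means all met."""
    problems = []
    supporting = [i for i in items if i.stance == "supporting"]
    if len(supporting) < 2:
        problems.append(f"only {len(supporting)} supporting item(s), need 2+")
    unscored = [i.source for i in items if i.reliability is None]
    if unscored:
        problems.append("unscored sources: " + ", ".join(unscored))
    return problems
```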
153
154 === 3.5 FR7: Automated Verdicts (Full Implementation) ===
155
156 **Main Requirement:** AI computes verdicts with uncertainty quantification
157
158 **POC Implementation:**
159
160 * ✅ Probabilistic verdicts (probability 0-100%)
161 * ✅ Uncertainty explicitly stated
162 * ✅ Reasoning chain provided
163 * ✅ Quality Gate 4 validates verdict confidence
164
165 **POC Output:**
166 ```
167 Verdict: 70% likely true
168 Uncertainty: ±15% (moderate confidence)
169 Reasoning: Based on 3 high-quality sources...
170 Confidence Level: MEDIUM
171 ```
172
173 **Acceptance Criteria:**
174
175 * Verdicts include probability (0-100%)
176 * Uncertainty explicitly quantified
177 * Reasoning chain explains verdict
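
As a worked example of the verdict arithmetic, "70% likely true" with "±15%" can be read as the range 55% to 85%. The sketch below assumes that the uncertainty is a symmetric margin around the probability and that the range is clamped to 0-100%. Both are assumptions for illustration, since the specification does not define how the margin is interpreted.

```python
from typing import Tuple


def verdict_interval(probability_pct: float, uncertainty_pct: float) -> Tuple[float, float]:
    """Return the (low, high) probability range in percent, clamped to 0-100."""
    if not 0 <= probability_pct <= 100:
        raise ValueError("probability must be within 0-100%")
    if uncertainty_pct < 0:
        raise ValueError("uncertainty must be non-negative")
    low = max(0.0, probability_pct - uncertainty_pct)
    high = min(100.0, probability_pct + uncertainty_pct)
    return low, high


low, high = verdict_interval(70, 15)  # the example verdict above
```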
178
179 === 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===
180
181 **Main Requirement:** Complete quality assurance with 7 quality gates
182
183 **POC Implementation:** **2 gates only**
184
185 **Quality Gate 1: Claim Validation**
186
187 * ✅ Validates claim is factual and verifiable
188 * ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
189 * ✅ Provides clear rejection reason
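
A minimal sketch of how Gate 1 could be expressed in code, assuming the claim type has already been assigned by the LLM classification step. `GateResult`, `gate1_claim_validation`, and the label strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Claim types that Gate 1 blocks, taken from the list above.
BLOCKED_TYPES = {"opinion", "prediction", "ambiguous"}


@dataclass
class GateResult:
    status: str                   # "PASS" or "FAIL"
    reason: Optional[str] = None  # rejection reason, set only on FAIL


def gate1_claim_validation(claim_type: str) -> GateResult:
    """Pass factual claims; block everything else with a clear reason."""
    if claim_type == "factual":
        return GateResult("PASS")
    if claim_type in BLOCKED_TYPES:
        return GateResult("FAIL", f"Claim is not verifiable: classified as {claim_type}")
    # Unknown labels are blocked too, so a malformed LLM answer cannot slip through.
    return GateResult("FAIL", f"Unknown claim type: {claim_type!r}")
```

Blocking unknown labels by default is a design choice of this sketch: it favors a false block over letting an unclassified claim reach the verdict stage.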
190
191 **Quality Gate 4: Verdict Confidence Assessment**
192
193 * ✅ Validates ≥2 sources found
194 * ✅ Validates quality score ≥0.6
195 * ✅ Blocks low-confidence verdicts
196 * ✅ Provides clear rejection reason
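
The two Gate 4 thresholds above can be sketched as a single check. How the "quality score" is computed is not defined here, so this example assumes it is the mean of the per-source reliability scores. That assumption, and the names `GateResult` and `gate4_verdict_confidence`, are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_SOURCES = 2          # from the gate definition: at least 2 sources
MIN_QUALITY_SCORE = 0.6  # from the gate definition: quality score >= 0.6


@dataclass
class GateResult:
    status: str                   # "PASS" or "FAIL"
    reason: Optional[str] = None  # rejection reason, set only on FAIL


def gate4_verdict_confidence(source_scores: List[float]) -> GateResult:
    """Block verdicts backed by too few sources or too low a quality score.

    Assumption: quality score = mean of the source reliability scores.
    """
    if len(source_scores) < MIN_SOURCES:
        return GateResult(
            "FAIL", f"Only {len(source_scores)} source(s) found; {MIN_SOURCES} required"
        )
    quality = sum(source_scores) / len(source_scores)
    if quality < MIN_QUALITY_SCORE:
        return GateResult(
            "FAIL", f"Quality score {quality:.2f} is below {MIN_QUALITY_SCORE}"
        )
    return GateResult("PASS")
```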
197
198 **Out of Scope (POC2+):**
199
200 * ❌ Gate 2: Evidence Relevance
201 * ❌ Gate 3: Scenario Coherence
202 * ❌ Gate 5: Source Diversity
203 * ❌ Gate 6: Reasoning Validity
204 * ❌ Gate 7: Output Completeness
205
206 **Rationale:** Prove gate concept works. Add remaining gates in POC2 after validating approach.
207
208
209 === 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===
210
211 **Main Requirements:**
212
213 * NFR1: Response time < 30 seconds
214 * NFR2: Handle 1000+ concurrent users
215 * NFR3: 99.9% uptime
216
217 **POC Implementation:**
218
219 * ⚠️ **Response time monitored** (not optimized)
220 * ⚠️ **Single-threaded processing** (no concurrency)
221 * ⚠️ **Basic error handling** (no advanced retry logic)
222
223 **Rationale:** POC proves functionality. Performance optimization happens in POC2.
224
225 **POC Acceptance:**
226
227 * Analysis completes (no timeout requirement)
228 * Errors don't crash system
229 * Basic logging in place
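
The three POC acceptance points (monitored response time, errors that do not crash the system, basic logging) can be combined in one wrapper. This is a sketch under assumptions: `run_analysis_safely` and the `analyze` callable are hypothetical, and the standard `logging` module stands in for whatever logging the POC ends up using.

```python
import logging
import time
from typing import Callable, Optional

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("poc")


def run_analysis_safely(analyze: Callable[[str], dict], article_text: str) -> Optional[dict]:
    """Run one analysis, log its duration, and return None instead of crashing."""
    start = time.monotonic()
    try:
        return analyze(article_text)
    except Exception:
        # Basic error handling only: log the traceback, no retry.
        logger.exception("Analysis failed")
        return None
    finally:
        # Response time is monitored, not enforced (no timeout in the POC).
        logger.info("Analysis took %.2f s", time.monotonic() - start)
```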
230
231 == 4. What's NOT in POC Scope ==
232
233 === 4.1 User-Facing Features (Beta 0+) ===
234
235 {{warning}}
236 **Deferred to Beta 0:** All user-facing features listed below.
237 {{/warning}}
238
239 **Out of Scope:**
240
241 * ❌ User accounts and authentication (FR8)
242 * ❌ User corrections system (FR9, FR45-46)
243 * ❌ Public publishing interface (FR10)
244 * ❌ Social sharing (FR11)
245 * ❌ Email notifications (FR12)
246 * ❌ API access (FR13)
247
248 **Rationale:** POC validates AI capabilities. User features added in Beta 0.
249
250
251 === 4.2 Advanced Features (V1.0+) ===
252
253 **Out of Scope:**
254
255 * ❌ IFCN compliance (FR47)
256 * ❌ ClaimReview schema (FR48)
257 * ❌ Archive.org integration (FR49)
258 * ❌ OSINT toolkit (FR50)
259 * ❌ Video verification (FR51)
260 * ❌ Deepfake detection (FR52)
261 * ❌ Cross-org sharing (FR53)
262
263 **Rationale:** Advanced features require a proven platform. Added in V1.0 and later.
264
265
266 === 4.3 Production Requirements (POC2, Beta 0) ===
267
268 **Out of Scope:**
269
270 * ❌ Security controls (NFR4, NFR12)
271 * ❌ Code maintainability (NFR5)
272 * ❌ System monitoring (NFR13)
273 * ❌ Evidence deduplication
274 * ❌ Advanced source verification
275 * ❌ Full 7-gate quality framework
276
277 **Rationale:** POC proves concept. Production hardening happens in POC2 and Beta 0.
278
279
280 == 5. POC Output Specification ==
281
282 === 5.1 Required Output Elements ===
283
284 For each analyzed claim, POC must produce:
285
286 **1. Claim**
287
289 * Original text
290 * Classification (factual/non-factual/ambiguous)
291 * If non-factual: Clear reason why
292
293 **2. Scenarios** (if factual)
294
295 * 2-3 interpretation scenarios (1 if the claim is unambiguous)
296 * Each scenario clearly described
297
298 **3. Evidence** (if factual)
299
300 * Supporting evidence (2+ items)
301 * Opposing evidence (if exists)
302 * Source URLs and reliability scores
303
304 **4. Verdict** (if factual)
305
306 * Probability (0-100%)
307 * Uncertainty quantification
308 * Confidence level (LOW/MEDIUM/HIGH)
309 * Reasoning chain
310
311 **5. Quality Status**
312
313 * Which gates passed/failed
314 * If failed: Clear explanation why
315
316 === 5.2 Example POC Output ===
317
318 {{code language="json"}}
319 {
320   "claim": {
321     "text": "Switzerland has the highest life expectancy in Europe",
322     "type": "factual",
323     "gate1_status": "PASS"
324   },
325   "scenarios": [
326     "Switzerland's overall life expectancy is highest",
327     "Switzerland ranks highest for specific age groups"
328   ],
329   "evidence": {
330     "supporting": [
331       {
332         "source": "WHO Report 2023",
333         "reliability": 0.95,
334         "excerpt": "Switzerland: 83.4 years average..."
335       }
336     ],
337     "opposing": [
338       {
339         "source": "Eurostat 2024",
340         "reliability": 0.90,
341         "excerpt": "Spain leads at 83.5 years..."
342       }
343     ]
344   },
345   "verdict": {
346     "probability": 0.65,
347     "uncertainty": 0.15,
348     "confidence": "MEDIUM",
349     "reasoning": "WHO and Eurostat show similar but conflicting data...",
350     "gate4_status": "PASS"
351   }
352 }
353 {{/code}}
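
A record shaped like the example above could be sanity-checked before it is stored or displayed. The validator below is a hedged sketch for factual claims only. The function name `validate_output` and the specific checks are assumptions derived from the example record, which expresses probability and reliability as fractions between 0 and 1.

```python
import json
from typing import List

CONFIDENCE_LEVELS = ("LOW", "MEDIUM", "HIGH")


def _in_unit_range(value) -> bool:
    """True for numbers between 0 and 1 inclusive."""
    return isinstance(value, (int, float)) and 0.0 <= value <= 1.0


def validate_output(raw_json: str) -> List[str]:
    """Return a list of problems found in one output record (empty if none)."""
    data = json.loads(raw_json)
    errors = [f"missing '{key}'"
              for key in ("claim", "scenarios", "evidence", "verdict")
              if key not in data]
    if errors:
        return errors
    verdict = data["verdict"]
    if not _in_unit_range(verdict.get("probability")):
        errors.append("verdict.probability must be between 0 and 1")
    if verdict.get("confidence") not in CONFIDENCE_LEVELS:
        errors.append("verdict.confidence must be LOW, MEDIUM, or HIGH")
    for side in ("supporting", "opposing"):
        for item in data["evidence"].get(side, []):
            if not _in_unit_range(item.get("reliability")):
                errors.append(f"{side} source {item.get('source')!r} has an invalid reliability")
    return errors
```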
354
355
356 == 6. Success Criteria ==
357
358 {{success}}
359 **POC Success Definition:** The POC succeeds if AI can extract claims, find balanced evidence, and compute reasonable verdicts, and if the quality gates demonstrably improve output quality.
360 {{/success}}
361
362 === 6.1 Functional Success ===
363
364 POC is successful if:
365
366 ✅ **FR1-FR7 Requirements Met:**
367
368 1. Extracts 3-5 factual claims from test articles
369 2. Generates 2-3 scenarios per ambiguous claim
370 3. Finds supporting AND opposing evidence
371 4. Computes probabilistic verdicts with uncertainty
372 5. Provides clear reasoning chains
373
374 ✅ **Quality Gates Work:**
375
376 1. Gate 1 blocks non-factual claims (100% block rate)
377 2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
378 3. Clear rejection reasons provided
379
380 ✅ **NFR11 Met:**
381
382 1. Quality gates reduce hallucination rate
383 2. Blocked outputs have clear explanations
384 3. Quality metrics are logged
385
386 === 6.2 Quality Thresholds ===
387
388 **Minimum Acceptable:**
389
390 * ≥70% of test claims correctly classified (factual/non-factual)
391 * ≥60% of verdicts are reasonable (human evaluation)
392 * Gate 1 blocks 100% of non-factual claims
393 * Gate 4 blocks verdicts with <2 sources or quality score <0.6
394
395 **Target:**
396
397 * ≥80% claims correctly classified
398 * ≥75% verdicts are reasonable
399 * <10% false positives (blocking good claims)
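
The percentage thresholds above can be computed from a human-labelled test set. The sketch below covers only those percentages, not the two gate-blocking criteria. The function names, the label strings, and the reading of "false positive" as a factual claim wrongly labelled non-factual are assumptions for illustration. Both lists are assumed to be non-empty and to contain at least one factual claim.

```python
from typing import List


def classification_accuracy(predicted: List[str], expected: List[str]) -> float:
    """Share of claims whose predicted label matches the human label."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def false_positive_rate(predicted: List[str], expected: List[str]) -> float:
    """Share of truly factual claims wrongly labelled non-factual (good claims blocked)."""
    factual = [p for p, e in zip(predicted, expected) if e == "factual"]
    return sum(p != "factual" for p in factual) / len(factual)


def threshold_level(accuracy: float, reasonable_share: float, fp_rate: float) -> str:
    """Map measured rates onto the Minimum Acceptable / Target levels."""
    if accuracy >= 0.80 and reasonable_share >= 0.75 and fp_rate < 0.10:
        return "TARGET"
    if accuracy >= 0.70 and reasonable_share >= 0.60:
        return "MINIMUM"
    return "BELOW_MINIMUM"


expected = ["factual"] * 10 + ["non-factual"] * 10
predicted = ["factual"] * 9 + ["non-factual"] * 11  # one factual claim wrongly blocked
accuracy = classification_accuracy(predicted, expected)
fp_rate = false_positive_rate(predicted, expected)
```

In this example 19 of 20 claims are classified correctly, but 1 of 10 factual claims is blocked, so the false-positive target of under 10% is narrowly missed.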
400
401 === 6.3 POC Decision Gate ===
402
403 **After POC1, we decide:**
404
405 **✅ PROCEED to POC2** if:
406
407 * Success criteria met
408 * Quality gates demonstrably improve output
409 * Core workflow is technically sound
410 * Clear path to production quality
411
412 **⚠️ ITERATE POC1** if:
413
414 * Success criteria partially met
415 * Gates work but need tuning
416 * Core issues identified but fixable
417
418 **❌ PIVOT APPROACH** if:
419
420 * Success criteria not met
421 * Fundamental AI limitations discovered
422 * Quality gates insufficient
423 * Alternative approach needed
424
425 == 7. Test Cases ==
426
427 === 7.1 Happy Path ===
428
429 **Test 1: Simple Factual Claim**
430
431 * Input: "Paris is the capital of France"
432 * Expected: Factual, 1 scenario, verdict 95% true
433
434 **Test 2: Ambiguous Claim**
435
436 * Input: "Switzerland has the highest income in Europe"
437 * Expected: Factual, 2-3 scenarios, verdict with uncertainty
438
439 **Test 3: Statistical Claim**
440
441 * Input: "10% of people have condition X"
442 * Expected: Factual, evidence with numbers, probabilistic verdict
443
444 === 7.2 Edge Cases ===
445
446 **Test 4: Opinion**
447
448 * Input: "Paris is the best city"
449 * Expected: Non-factual (opinion), blocked by Gate 1
450
451 **Test 5: Prediction**
452
453 * Input: "Bitcoin will reach $100,000 next year"
454 * Expected: Non-factual (prediction), blocked by Gate 1
455
456 **Test 6: Insufficient Evidence**
457
458 * Input: Obscure factual claim with no sources
459 * Expected: Blocked by Gate 4 (<2 sources)
460
461 === 7.3 Quality Gate Tests ===
462
463 **Test 7: Gate 1 Effectiveness**
464
465 * Input: Mix of 10 factual + 10 non-factual claims
466 * Expected: Gate 1 blocks all 10 non-factual (100% block rate)
467
468 **Test 8: Gate 4 Effectiveness**
469
470 * Input: Claims with varying evidence availability
471 * Expected: Gate 4 blocks low-confidence verdicts
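
Test 7 could be automated with a small harness that measures the block rate on a labelled claim set. This is a sketch only: `gate1_block_rate` is a hypothetical helper, and `keyword_gate` is a deliberately naive stand-in so the example runs without an LLM. It is not how Gate 1 is meant to classify claims.

```python
from typing import Callable, List, Tuple


def gate1_block_rate(gate: Callable[[str], bool],
                     labelled: List[Tuple[str, bool]]) -> Tuple[float, int]:
    """Run `gate` over labelled claims.

    `gate(text)` returns True when the claim is blocked.
    `labelled` holds (claim text, is_factual) pairs from a human-labelled set.
    Returns (share of non-factual claims blocked, factual claims wrongly blocked).
    """
    non_factual = [text for text, is_factual in labelled if not is_factual]
    factual = [text for text, is_factual in labelled if is_factual]
    blocked = sum(gate(text) for text in non_factual)
    wrongly_blocked = sum(gate(text) for text in factual)
    return blocked / len(non_factual), wrongly_blocked


def keyword_gate(text: str) -> bool:
    """Stand-in gate for demonstration only."""
    return "best" in text or "will" in text


test_set = [
    ("Paris is the capital of France", True),
    ("Paris is the best city", False),
    ("Bitcoin will reach $100,000 next year", False),
]
rate, wrongly_blocked = gate1_block_rate(keyword_gate, test_set)
```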
472
473 == 8. Technical Architecture (POC) ==
474
475 === 8.1 Simplified Architecture ===
476
477 **POC Tech Stack:**
478
479 * **Frontend:** Simple web interface (Next.js + TypeScript)
480 * **Backend:** Single API endpoint
481 * **AI:** Claude API (Sonnet 4.5)
482 * **Storage:** Local JSON files (no database)
483 * **Deployment:** Single server
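
Storing results as local JSON files needs very little code. The sketch below assumes one file per analysis, named after a caller-supplied ID. The helper names, the directory layout, and the backend language are assumptions, since the stack above only fixes the frontend technology.

```python
import json
from pathlib import Path


def save_result(result: dict, directory: Path, analysis_id: str) -> Path:
    """Write one analysis result to <directory>/<analysis_id>.json."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{analysis_id}.json"
    path.write_text(json.dumps(result, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


def load_result(path: Path) -> dict:
    """Read a previously saved analysis result."""
    return json.loads(path.read_text(encoding="utf-8"))
```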
484
485 **Architecture Diagram:** See [[POC1 Specification>>FactHarbor.Specification.POC.Specification]]
486
487
488 === 8.2 AKEL Implementation ===
489
490 **POC AKEL:**
491
492 * Single-threaded processing
493 * Synchronous API calls
494 * No caching
495 * Basic error handling
496 * Console logging
497
498 **Full AKEL (POC2+):**
499
500 * Multi-threaded processing
501 * Async API calls
502 * Evidence caching
503 * Advanced error handling with retry
504 * Structured logging + monitoring
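
The POC AKEL characteristics listed above (single-threaded, synchronous, no caching) amount to a plain sequential loop. The following sketch shows one possible shape, with every processing step injected as a callable. All names and signatures are hypothetical, and the real steps would call the LLM and search APIs.

```python
from typing import Callable, Dict, List, Tuple

GateCheck = Tuple[bool, str]  # (passed, rejection reason or "")


def run_poc_pipeline(
    article: str,
    extract_claims: Callable[[str], List[str]],
    gate1: Callable[[str], GateCheck],
    generate_scenarios: Callable[[str], List[str]],
    collect_evidence: Callable[[str], List[dict]],
    compute_verdict: Callable[[str, List[dict]], dict],
    gate4: Callable[[List[dict], dict], GateCheck],
) -> List[Dict]:
    """Process claims one at a time, synchronously, with no caching."""
    results = []
    for claim in extract_claims(article):
        record: Dict = {"claim": claim}
        passed, reason = gate1(claim)
        record["gate1_status"] = "PASS" if passed else "FAIL"
        if not passed:
            # Blocked claims skip the later (and more expensive) steps.
            record["rejection_reason"] = reason
            results.append(record)
            continue
        record["scenarios"] = generate_scenarios(claim)
        evidence = collect_evidence(claim)
        verdict = compute_verdict(claim, evidence)
        passed, reason = gate4(evidence, verdict)
        record["gate4_status"] = "PASS" if passed else "FAIL"
        if passed:
            record["evidence"], record["verdict"] = evidence, verdict
        else:
            record["rejection_reason"] = reason
        results.append(record)
    return results
```

Injecting the steps keeps the loop testable with stubs, and it leaves room to swap in asynchronous or cached versions later without changing the control flow.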
505
506 == 9. POC Philosophy ==
507
508 {{info}}
509 **Important:** POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases.
510 {{/info}}
511
512 === 9.1 Core Principles ===
513
514 **1. Prove Concept, Not Production**
515
517 * POC validates AI can do the job
518 * Production quality comes in POC2 and Beta 0
519 * Focus on "does it work?" not "is it perfect?"
520
521 **2. Implement Subset of Requirements**
522
523 * POC covers FR1-7, NFR11 (lite)
524 * All other requirements deferred
525 * Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]
526
527 **3. Quality Gates Validate Approach**
528
529 * 2 gates prove the concept
530 * Remaining 5 gates added in POC2
531 * Gates must demonstrably improve quality
532
533 **4. Iterate Based on Results**
534
535 * POC results determine next steps
536 * Decision gate after POC1
537 * Flexibility to pivot if needed
538
539 === 9.2 Success: Clear Path Forward ===
542
543 POC succeeds if we can confidently answer:
544
545 ✅ **Technical Feasibility:**
546
547 * Can AI extract claims reliably?
548 * Can AI find balanced evidence?
549 * Can AI compute reasonable verdicts?
550
551 ✅ **Quality Approach:**
552
553 * Do quality gates improve output?
554 * Can we measure and track quality?
555 * Is the gate approach scalable?
556
557 ✅ **Production Path:**
558
559 * Is the core architecture sound?
560 * What needs improvement for production?
561 * Is POC2 the right next step?
562
563 == 10. Related Pages ==
564
565 * **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
566 * **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
567 * **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
568 * **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
569 * **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)
570
571 **Document Owner:** Technical Team
572 **Review Frequency:** After each POC iteration
573 **Version History:**
574
575 * v1.0 - Initial POC requirements
576 * v2.0 - Updated after specification cross-check
577 * v3.0 - Aligned with Main Requirements (FR/NFR IDs added)