Wiki source code of POC Requirements (POC1 & POC2)

Version 2.1 by Robert Schaub on 2025/12/23 17:44

1 = POC Requirements =
2
3 **Status:** ✅ Approved for Development
4 **Version:** 3.0 (Aligned with Main Requirements)
5 **Goal:** Prove that AI can extract claims and determine verdicts without human intervention
6
7 {{info}}
8 **Core Philosophy:** POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through simplified implementation. All POC features map to formal FR/NFR requirements.
9 {{/info}}
10
11
12 == 1. POC Overview ==
13
14 === 1.1 What POC Tests ===
15
16 **Core Question:**
17 > Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?
18
19 **What we're proving:**
20 * AI can identify factual claims from text
21 * AI can evaluate those claims with structured evidence
22 * Quality gates can filter unreliable outputs
23 * The core workflow is technically feasible
24
25 **What we're NOT proving:**
26 * Production-ready reliability (that's POC2)
27 * User-facing features (that's Beta 0)
28 * Full IFCN compliance (that's V1.0)
29
30 === 1.2 Requirements Mapping ===
31
32 POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].
33
34 **Scope Summary:**
35 * **In Scope:** 8 requirements (7 FRs + 1 NFR)
36 * **Partial:** 3 NFRs (simplified versions)
37 * **Out of Scope:** 19 requirements (deferred to later phases)
38
39
40 == 2. POC1 Scope ==
41
42 {{success}}
43 **Authoritative Source for Phase Mapping:** [[Requirements Roadmap Matrix>>Test.FactHarbor.Roadmap.Requirements-Roadmap-Matrix.WebHome]]
44
45 The Roadmap Matrix is the single source of truth for which requirements are implemented in which phases. This page provides POC1-specific implementation details only.
46 {{/success}}
47
48 **POC1 implements these formal requirements:**
49
50 |= Formal Req |= Implementation in POC1 |= Notes
51 | **FR4** | Analysis Summary | Basic format; quality metadata deferred to POC2
52 | **FR7** | Automated Verdicts | Full implementation with quality gates (NFR11)
53 | **NFR11** | Quality Assurance Framework | 2 of 7 quality gates implemented (Gates 1 and 4)
54
55 **POC1 also implements these workflow components** (detailed as FR1-FR6 in the implementation sections below), except where marked as deferred:
56 * Claim extraction (FR1)
57 * Claim context (FR2)
58 * Multiple scenarios (FR3)
59 * Evidence collection (FR5)
60 * Source quality assessment (FR6)
61 * Time evolution tracking (FR8) - deferred to POC2
62 * Audit trail (FR11) - deferred to Beta 0
63 * In-article highlighting (FR13) - deferred to Beta 0
64
65 {{info}}
66 **Note:** FR11 (Audit Trail) and FR13 (In-Article Claim Highlighting) are deferred to Beta 0 for production readiness and user experience enhancement.
67 {{/info}}
68
69 **Partial implementations:**
70 * NFR1 (Explainability) - Basic only
71 * NFR2 (Performance) - Functional but not optimized
72 * NFR3 (Transparency) - Basic only
73
74 **Detailed POC1 implementation specifications continue below...**
75
76
77
78 == 3. POC Simplifications ==
79
80 === 3.1 FR1: Claim Extraction (Full Implementation) ===
81
82 **Main Requirement:** AI extracts factual claims from input text
83
84 **POC Implementation:**
85 * ✅ AKEL extracts claims using LLM
86 * ✅ Each claim includes original text reference
87 * ✅ Claims are identified as factual/non-factual
88 * ❌ No advanced claim parsing (added in POC2)
89
90 **Acceptance Criteria:**
91 * Extracts 3-5 claims from typical article
92 * Identifies factual vs non-factual claims
93 * Quality Gate 1 validates extraction
94
95
96 === 3.2 FR3: Multiple Scenarios (Full Implementation) ===
97
98 **Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims
99
100 **POC Implementation:**
101 * ✅ AKEL generates 2-3 scenarios per claim
102 * ✅ Scenarios capture different interpretations
103 * ✅ Each scenario is evaluated separately
104 * ✅ Verdict considers all scenarios
105
106 **Acceptance Criteria:**
107 * Generates 2+ scenarios for ambiguous claims
108 * Scenarios are meaningfully different
109 * All scenarios are evaluated
110
111
112 === 3.3 FR4: Analysis Summary (Basic Implementation) ===
113
114 **Main Requirement:** Provide user-friendly summary of analysis
115
116 **POC Implementation:**
117 * ✅ Simple text summary generated
118 * ❌ No rich formatting (added in Beta 0)
119 * ❌ No visual elements (added in Beta 0)
120 * ❌ No interactive features (added in Beta 0)
121
122 **POC Format:**
123 ```
124 Claim: [extracted claim]
125 Scenarios: [list of scenarios]
126 Evidence: [supporting/opposing evidence]
127 Verdict: [probability with uncertainty]
128 ```
129
130
131 === 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===
132
133 **Main Requirements:**
134 * FR5: Collect supporting and opposing evidence
135 * FR6: Evaluate evidence source reliability
136
137 **POC Implementation:**
138 * ✅ AKEL searches for evidence (web/knowledge base)
139 * ✅ **Mandatory contradiction search** (finds opposing evidence)
140 * ✅ Source reliability scoring
141 * ❌ No evidence deduplication (added in POC2)
142 * ❌ No advanced source verification (added in POC2)
143
144 **Acceptance Criteria:**
145 * Finds 2+ supporting evidence items
146 * Finds 1+ opposing evidence (if exists)
147 * Sources scored for reliability
148
149
150 === 3.5 FR7: Automated Verdicts (Full Implementation) ===
151
152 **Main Requirement:** AI computes verdicts with uncertainty quantification
153
154 **POC Implementation:**
155 * ✅ Probabilistic verdicts (0-100% probability)
156 * ✅ Uncertainty explicitly stated
157 * ✅ Reasoning chain provided
158 * ✅ Quality Gate 4 validates verdict confidence
159
160 **POC Output:**
161 ```
162 Verdict: 70% likely true
163 Uncertainty: ±15% (moderate confidence)
164 Reasoning: Based on 3 high-quality sources...
165 Confidence Level: MEDIUM
166 ```
167
168 **Acceptance Criteria:**
169 * Verdicts include probability (0-100%)
170 * Uncertainty explicitly quantified
171 * Reasoning chain explains verdict
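
The POC output format above could be rendered along these lines. This is a minimal sketch, not the project's implementation: the function name `format_verdict`, the use of fractions (0.70 for 70%), and the mapping from confidence level to the wording in parentheses are all assumptions made for illustration.

```python
# Assumed wording for each confidence level; the spec only shows "moderate" for MEDIUM.
CONFIDENCE_WORDS = {"LOW": "low", "MEDIUM": "moderate", "HIGH": "high"}


def format_verdict(probability: float, uncertainty: float,
                   confidence: str, reasoning: str) -> str:
    """Render a verdict in the four-line POC text format."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if confidence not in CONFIDENCE_WORDS:
        raise ValueError("confidence must be LOW, MEDIUM, or HIGH")
    return "\n".join([
        f"Verdict: {probability:.0%} likely true",
        f"Uncertainty: ±{uncertainty:.0%} ({CONFIDENCE_WORDS[confidence]} confidence)",
        f"Reasoning: {reasoning}",
        f"Confidence Level: {confidence}",
    ])


print(format_verdict(0.70, 0.15, "MEDIUM", "Based on 3 high-quality sources..."))
```

Rejecting out-of-range values early keeps malformed verdicts from reaching the summary step.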
172
173
174 === 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===
175
176 **Main Requirement:** Complete quality assurance with 7 quality gates
177
178 **POC Implementation:** **2 gates only**
179
180 **Quality Gate 1: Claim Validation**
181 * ✅ Validates claim is factual and verifiable
182 * ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
183 * ✅ Provides clear rejection reason
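
As an illustration of Gate 1, the check below passes factual claims and blocks the rest with a reason. It is a sketch under assumptions: it presumes an upstream step has already labelled the claim, and the names `GateResult`, `gate1_claim_validation`, and the rejection texts are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Blocked categories named in the spec; the rejection texts are assumed wording.
BLOCKED_TYPES = {
    "opinion": "Claim expresses an opinion and cannot be verified.",
    "prediction": "Claim predicts a future event and cannot be verified yet.",
    "ambiguous": "Claim is too ambiguous to verify.",
}


@dataclass
class GateResult:
    status: str                   # "PASS" or "FAIL"
    reason: Optional[str] = None  # set only when the gate blocks the claim


def gate1_claim_validation(claim_type: str) -> GateResult:
    """Pass factual claims; block everything else with a rejection reason."""
    if claim_type == "factual":
        return GateResult("PASS")
    reason = BLOCKED_TYPES.get(claim_type, f"Unknown claim type: {claim_type!r}")
    return GateResult("FAIL", reason)


print(gate1_claim_validation("prediction"))
```

Unknown labels are blocked rather than passed, so a classifier error cannot slip an unverifiable claim through the gate.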
184
185 **Quality Gate 4: Verdict Confidence Assessment**
186 * ✅ Validates ≥2 sources found
187 * ✅ Validates quality score ≥0.6
188 * ✅ Blocks low-confidence verdicts
189 * ✅ Provides clear rejection reason
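
The two Gate 4 thresholds can be sketched as a single check. The spec does not say how the quality score is computed, so this example assumes it is the mean source reliability; the function name `gate4_verdict_confidence` is likewise hypothetical.

```python
from typing import List, Optional, Tuple

MIN_SOURCES = 2    # from the spec: at least 2 sources
MIN_QUALITY = 0.6  # from the spec: quality score of at least 0.6


def gate4_verdict_confidence(reliabilities: List[float]) -> Tuple[str, Optional[str]]:
    """Return ("PASS", None) or ("FAIL", reason).

    Assumption: the quality score is the mean reliability of the sources.
    """
    if len(reliabilities) < MIN_SOURCES:
        return "FAIL", f"Only {len(reliabilities)} source(s) found; need at least {MIN_SOURCES}."
    quality = sum(reliabilities) / len(reliabilities)
    if quality < MIN_QUALITY:
        return "FAIL", f"Quality score {quality:.2f} is below {MIN_QUALITY}."
    return "PASS", None


print(gate4_verdict_confidence([0.95, 0.90]))
print(gate4_verdict_confidence([0.95]))
```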
190
191 **Out of Scope (POC2+):**
192 * ❌ Gate 2: Evidence Relevance
193 * ❌ Gate 3: Scenario Coherence
194 * ❌ Gate 5: Source Diversity
195 * ❌ Gate 6: Reasoning Validity
196 * ❌ Gate 7: Output Completeness
197
198 **Rationale:** Prove gate concept works. Add remaining gates in POC2 after validating approach.
199
200
201 === 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===
202
203 **Main Requirements:**
204 * NFR1: Response time < 30 seconds
205 * NFR2: Handle 1000+ concurrent users
206 * NFR3: 99.9% uptime
207
208 **POC Implementation:**
209 * ⚠️ **Response time monitored** (not optimized)
210 * ⚠️ **Single-threaded processing** (no concurrency)
211 * ⚠️ **Basic error handling** (no advanced retry logic)
212
213 **Rationale:** POC proves functionality. Performance optimization happens in POC2.
214
215 **POC Acceptance:**
216 * Analysis completes (no timeout requirement)
217 * Errors don't crash system
218 * Basic logging in place
219
220
221 == 4. What's NOT in POC Scope ==
222
223 === 4.1 User-Facing Features (Beta 0+) ===
224
225 {{warning}}
226 **Deferred to Beta 0:** The user-facing features listed below are not part of POC.
227 {{/warning}}
228
229 **Out of Scope:**
230 * ❌ User accounts and authentication (FR8)
231 * ❌ User corrections system (FR9, FR45-46)
232 * ❌ Public publishing interface (FR10)
233 * ❌ Social sharing (FR11)
234 * ❌ Email notifications (FR12)
235 * ❌ API access (FR13)
236
237 **Rationale:** POC validates AI capabilities. User features added in Beta 0.
238
239
240 === 4.2 Advanced Features (V1.0+) ===
241
242 **Out of Scope:**
243 * ❌ IFCN compliance (FR47)
244 * ❌ ClaimReview schema (FR48)
245 * ❌ Archive.org integration (FR49)
246 * ❌ OSINT toolkit (FR50)
247 * ❌ Video verification (FR51)
248 * ❌ Deepfake detection (FR52)
249 * ❌ Cross-org sharing (FR53)
250
251 **Rationale:** Advanced features require a proven platform. Added in V1.0 or later.
252
253
254 === 4.3 Production Requirements (POC2, Beta 0) ===
255
256 **Out of Scope:**
257 * ❌ Security controls (NFR4, NFR12)
258 * ❌ Code maintainability (NFR5)
259 * ❌ System monitoring (NFR13)
260 * ❌ Evidence deduplication
261 * ❌ Advanced source verification
262 * ❌ Full 7-gate quality framework
263
264 **Rationale:** POC proves concept. Production hardening happens in POC2 and Beta 0.
265
266
267 == 5. POC Output Specification ==
268
269 === 5.1 Required Output Elements ===
270
271 For each analyzed claim, POC must produce:
272
273 **1. Claim**
274 * Original text
275 * Classification (factual/non-factual/ambiguous)
276 * If non-factual: Clear reason why
277
278 **2. Scenarios** (if factual)
279 * 2-3 interpretation scenarios
280 * Each scenario clearly described
281
282 **3. Evidence** (if factual)
283 * Supporting evidence (2+ items)
284 * Opposing evidence (if exists)
285 * Source URLs and reliability scores
286
287 **4. Verdict** (if factual)
288 * Probability (0-100%)
289 * Uncertainty quantification
290 * Confidence level (LOW/MEDIUM/HIGH)
291 * Reasoning chain
292
293 **5. Quality Status**
294 * Which gates passed/failed
295 * If failed: Clear explanation why
296
297
298 === 5.2 Example POC Output ===
299
300 {{code language="json"}}
301 {
302   "claim": {
303     "text": "Switzerland has the highest life expectancy in Europe",
304     "type": "factual",
305     "gate1_status": "PASS"
306   },
307   "scenarios": [
308     "Switzerland's overall life expectancy is highest",
309     "Switzerland ranks highest for specific age groups"
310   ],
311   "evidence": {
312     "supporting": [
313       {
314         "source": "WHO Report 2023",
315         "reliability": 0.95,
316         "excerpt": "Switzerland: 83.4 years average..."
317       }
318     ],
319     "opposing": [
320       {
321         "source": "Eurostat 2024",
322         "reliability": 0.90,
323         "excerpt": "Spain leads at 83.5 years..."
324       }
325     ]
326   },
327   "verdict": {
328     "probability": 0.65,
329     "uncertainty": 0.15,
330     "confidence": "MEDIUM",
331     "reasoning": "WHO and Eurostat show similar but conflicting data...",
332     "gate4_status": "PASS"
333   }
334 }
335 {{/code}}
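
A structural check for output shaped like the example above might look as follows. This is a hedged sketch: `validate_output` is a hypothetical helper, the field names are taken from the example, and the choice of which rules to enforce (value ranges and required keys only) is an assumption.

```python
import json
from typing import Any, Dict, List


def validate_output(doc: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the structure looks valid."""
    problems: List[str] = []
    claim = doc.get("claim", {})
    for key in ("text", "type", "gate1_status"):
        if key not in claim:
            problems.append(f"claim.{key} is missing")
    if claim.get("type") != "factual":
        return problems  # scenarios, evidence, and verdict apply to factual claims only
    if not doc.get("scenarios"):
        problems.append("scenarios is missing or empty")
    evidence = doc.get("evidence", {})
    for side in ("supporting", "opposing"):
        for i, item in enumerate(evidence.get(side, [])):
            if not 0.0 <= item.get("reliability", -1.0) <= 1.0:
                problems.append(f"evidence.{side}[{i}].reliability must be in [0, 1]")
    verdict = doc.get("verdict", {})
    if not 0.0 <= verdict.get("probability", -1.0) <= 1.0:
        problems.append("verdict.probability must be in [0, 1]")
    if verdict.get("confidence") not in ("LOW", "MEDIUM", "HIGH"):
        problems.append("verdict.confidence must be LOW, MEDIUM, or HIGH")
    if not verdict.get("reasoning"):
        problems.append("verdict.reasoning is missing")
    return problems


# Trimmed copy of the example output, used only to exercise the check.
sample = json.loads("""
{
  "claim": {"text": "Switzerland has the highest life expectancy in Europe",
            "type": "factual", "gate1_status": "PASS"},
  "scenarios": ["Switzerland's overall life expectancy is highest"],
  "evidence": {
    "supporting": [{"source": "WHO Report 2023", "reliability": 0.95}],
    "opposing": [{"source": "Eurostat 2024", "reliability": 0.90}]
  },
  "verdict": {"probability": 0.65, "uncertainty": 0.15, "confidence": "MEDIUM",
              "reasoning": "WHO and Eurostat show similar but conflicting data...",
              "gate4_status": "PASS"}
}
""")
print(validate_output(sample))
```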
336
337
338 == 6. Success Criteria ==
339
340 {{success}}
341 **POC Success Definition:** POC succeeds if AI can extract claims, find balanced evidence, and compute reasonable verdicts, and if the quality gates measurably improve output quality.
342 {{/success}}
343
344 === 6.1 Functional Success ===
345
346 POC is successful if:
347
348 ✅ **FR1-FR7 Requirements Met:**
349 1. Extracts 3-5 factual claims from test articles
350 2. Generates 2-3 scenarios per ambiguous claim
351 3. Finds supporting AND opposing evidence
352 4. Computes probabilistic verdicts with uncertainty
353 5. Provides clear reasoning chains
354
355 ✅ **Quality Gates Work:**
356 1. Gate 1 blocks non-factual claims (100% block rate)
357 2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
358 3. Clear rejection reasons provided
359
360 ✅ **NFR11 Met:**
361 1. Quality gates reduce hallucination rate
362 2. Blocked outputs have clear explanations
363 3. Quality metrics are logged
364
365
366 === 6.2 Quality Thresholds ===
367
368 **Minimum Acceptable:**
369 * ≥70% of test claims correctly classified (factual/non-factual)
370 * ≥60% of verdicts are reasonable (human evaluation)
371 * Gate 1 blocks 100% of non-factual claims
372 * Gate 4 blocks verdicts with <2 sources
373
374 **Target:**
375 * ≥80% claims correctly classified
376 * ≥75% verdicts are reasonable
377 * <10% false positives (blocking good claims)
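
The thresholds above can be turned into a simple rating of measured results. This sketch makes two assumptions that the spec does not state: that the target level also requires everything in the minimum level, and that rates are expressed as fractions. The names `PocMetrics` and `rate_poc` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PocMetrics:
    classification_accuracy: float  # share of test claims correctly classified
    reasonable_verdict_rate: float  # share of verdicts judged reasonable by humans
    false_positive_rate: float      # share of good claims that were blocked
    gate1_block_rate: float         # share of non-factual claims blocked by Gate 1


def rate_poc(m: PocMetrics) -> str:
    """Return "TARGET", "MINIMUM", or "BELOW_MINIMUM"."""
    meets_minimum = (
        m.classification_accuracy >= 0.70
        and m.reasonable_verdict_rate >= 0.60
        and m.gate1_block_rate == 1.0
    )
    if not meets_minimum:
        return "BELOW_MINIMUM"
    meets_target = (
        m.classification_accuracy >= 0.80
        and m.reasonable_verdict_rate >= 0.75
        and m.false_positive_rate < 0.10
    )
    return "TARGET" if meets_target else "MINIMUM"


print(rate_poc(PocMetrics(0.85, 0.80, 0.05, 1.0)))
```

The Gate 4 rule (block verdicts with fewer than 2 sources) is a per-verdict behaviour rather than a rate, so it is left out of this rating.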
378
379
380 === 6.3 POC Decision Gate ===
381
382 **After POC1, we decide:**
383
384 **✅ PROCEED to POC2** if:
385 * Success criteria met
386 * Quality gates demonstrably improve output
387 * Core workflow is technically sound
388 * Clear path to production quality
389
390 **⚠️ ITERATE POC1** if:
391 * Success criteria partially met
392 * Gates work but need tuning
393 * Core issues identified but fixable
394
395 **❌ PIVOT APPROACH** if:
396 * Success criteria not met
397 * Fundamental AI limitations discovered
398 * Quality gates insufficient
399 * Alternative approach needed
400
401
402 == 7. Test Cases ==
403
404 === 7.1 Happy Path ===
405
406 **Test 1: Simple Factual Claim**
407 * Input: "Paris is the capital of France"
408 * Expected: Factual, 1 scenario, verdict ~95% true
409
410 **Test 2: Ambiguous Claim**
411 * Input: "Switzerland has the highest income in Europe"
412 * Expected: Factual, 2-3 scenarios, verdict with uncertainty
413
414 **Test 3: Statistical Claim**
415 * Input: "10% of people have condition X"
416 * Expected: Factual, evidence with numbers, probabilistic verdict
417
418
419 === 7.2 Edge Cases ===
420
421 **Test 4: Opinion**
422 * Input: "Paris is the best city"
423 * Expected: Non-factual (opinion), blocked by Gate 1
424
425 **Test 5: Prediction**
426 * Input: "Bitcoin will reach $100,000 next year"
427 * Expected: Non-factual (prediction), blocked by Gate 1
428
429 **Test 6: Insufficient Evidence**
430 * Input: Obscure factual claim with no sources
431 * Expected: Blocked by Gate 4 (<2 sources)
432
433
434 === 7.3 Quality Gate Tests ===
435
436 **Test 7: Gate 1 Effectiveness**
437 * Input: Mix of 10 factual + 10 non-factual claims
438 * Expected: Gate 1 blocks all 10 non-factual (100% block rate)
439
440 **Test 8: Gate 4 Effectiveness**
441 * Input: Claims with varying evidence availability
442 * Expected: Gate 4 blocks low-confidence verdicts
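
For a mixed batch such as the one in Test 7, the block rate and the false-positive rate could be computed as sketched below. The function name `gate1_rates` and the input shape (pairs of "is factual" and "was blocked") are assumptions, and the sample data is made up purely to exercise the function.

```python
from typing import List, Tuple


def gate1_rates(results: List[Tuple[bool, bool]]) -> Tuple[float, float]:
    """results holds (is_factual, was_blocked) pairs.

    Returns (block rate on non-factual claims, false-positive rate on factual claims).
    """
    non_factual = [blocked for is_factual, blocked in results if not is_factual]
    factual = [blocked for is_factual, blocked in results if is_factual]
    if not non_factual or not factual:
        raise ValueError("need at least one factual and one non-factual claim")
    return sum(non_factual) / len(non_factual), sum(factual) / len(factual)


# Made-up batch in the Test 7 shape: 10 factual + 10 non-factual claims,
# with one factual claim wrongly blocked.
results = [(True, False)] * 9 + [(True, True)] + [(False, True)] * 10
block_rate, false_positive_rate = gate1_rates(results)
print(block_rate, false_positive_rate)
```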
443
444
445 == 8. Technical Architecture (POC) ==
446
447 === 8.1 Simplified Architecture ===
448
449 **POC Tech Stack:**
450 * **Frontend:** Simple web interface (Next.js + TypeScript)
451 * **Backend:** Single API endpoint
452 * **AI:** Claude API (Sonnet 4.5)
453 * **Database:** Local JSON files (no database)
454 * **Deployment:** Single server
455
456 **Architecture Diagram:** See [[POC1 Specification>>FactHarbor.Specification.POC.Specification]]
457
458
459 === 8.2 AKEL Implementation ===
460
461 **POC AKEL:**
462 * Single-threaded processing
463 * Synchronous API calls
464 * No caching
465 * Basic error handling
466 * Console logging
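
The POC AKEL characteristics listed above (single-threaded, synchronous, no caching, basic error handling, console logging) can be sketched as a plain sequential pipeline. Everything named here is hypothetical: `run_pipeline`, the step functions, and the shared `state` dictionary are stand-ins, and the stub steps do not call any LLM.

```python
import logging
from typing import Any, Callable, Dict, List, Optional

logging.basicConfig(level=logging.INFO)  # console logging only
log = logging.getLogger("akel_poc")

Step = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(article: str, steps: List[Step]) -> Optional[Dict[str, Any]]:
    """Run each step in order, synchronously; log and stop on the first error."""
    state: Dict[str, Any] = {"article": article}
    for step in steps:
        try:
            state = step(state)
            log.info("step %s finished", step.__name__)
        except Exception:
            # Basic error handling: no retry, but the caller does not crash.
            log.exception("step %s failed", step.__name__)
            return None
    return state


# Hypothetical stand-ins for the real LLM-backed steps.
def extract_claims(state: Dict[str, Any]) -> Dict[str, Any]:
    return {**state, "claims": [state["article"]]}


def compute_verdicts(state: Dict[str, Any]) -> Dict[str, Any]:
    verdicts = [{"claim": c, "probability": 0.5} for c in state["claims"]]
    return {**state, "verdicts": verdicts}


result = run_pipeline("Paris is the capital of France", [extract_claims, compute_verdicts])
print(result)
```

Passing the steps in as a list keeps the order explicit and makes it easy to swap a stub for a real implementation later.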
467
468 **Full AKEL (POC2+):**
469 * Multi-threaded processing
470 * Async API calls
471 * Evidence caching
472 * Advanced error handling with retry
473 * Structured logging + monitoring
474
475
476 == 9. POC Philosophy ==
477
478 {{info}}
479 **Important:** POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases.
480 {{/info}}
481
482 === 9.1 Core Principles ===
483
484 **1. Prove Concept, Not Production**
485 * POC validates AI can do the job
486 * Production quality comes in POC2 and Beta 0
487 * Focus on "does it work?" not "is it perfect?"
488
489 **2. Implement Subset of Requirements**
490 * POC covers FR1-7, NFR11 (lite)
491 * All other requirements deferred
492 * Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]
493
494 **3. Quality Gates Validate Approach**
495 * 2 gates prove the concept
496 * Remaining 5 gates added in POC2
497 * Gates must demonstrably improve quality
498
499 **4. Iterate Based on Results**
500 * POC results determine next steps
501 * Decision gate after POC1
502 * Flexibility to pivot if needed
503
504
505 === 9.2 Success = Clear Path Forward ===
506
507 POC succeeds if we can confidently answer:
508
509 ✅ **Technical Feasibility:**
510 * Can AI extract claims reliably?
511 * Can AI find balanced evidence?
512 * Can AI compute reasonable verdicts?
513
514 ✅ **Quality Approach:**
515 * Do quality gates improve output?
516 * Can we measure and track quality?
517 * Is the gate approach scalable?
518
519 ✅ **Production Path:**
520 * Is the core architecture sound?
521 * What needs improvement for production?
522 * Is POC2 the right next step?
523
524
525 == 10. Related Pages ==
526
527 * **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
528 * **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
529 * **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
530 * **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
531 * **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)
532
533
534 **Document Owner:** Technical Team
535 **Review Frequency:** After each POC iteration
536 **Version History:**
537 * v1.0 - Initial POC requirements
538 * v2.0 - Updated after specification cross-check
539 * v3.0 - Aligned with Main Requirements (FR/NFR IDs added)