Wiki source code of POC Requirements (POC1 & POC2)

Last modified by Robert Schaub on 2025/12/23 15:53

1 = POC Requirements =
2
3 **Status:** ✅ Approved for Development
4 **Version:** 3.0 (Aligned with Main Requirements)
5 **Goal:** Prove that AI can extract claims and determine verdicts automatically without human intervention
6
7 {{info}}
8 **Core Philosophy:** POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through simplified implementation. All POC features map to formal FR/NFR requirements.
9 {{/info}}
10
11
12 == 1. POC Overview ==
13
14 === 1.1 What POC Tests ===
15
16 **Core Question:**
17
18 > Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?
19
20 **What we're proving:**
21
22 * AI can identify factual claims from text
23 * AI can evaluate those claims with structured evidence
24 * Quality gates can filter unreliable outputs
25 * The core workflow is technically feasible
26
27 **What we're NOT proving:**
28
29 * Production-ready reliability (that's POC2)
30 * User-facing features (that's Beta 0)
31 * Full IFCN compliance (that's V1.0)
32
33 === 1.2 Requirements Mapping ===
34
35 POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].
36
37 **Scope Summary:**
38
39 * **In Scope:** 8 requirements (7 FRs + 1 NFR)
40 * **Partial:** 3 NFRs (simplified versions)
41 * **Out of Scope:** 19 requirements (deferred to later phases)
42
43 == 2. POC1 Scope ==
44
45 {{success}}
46 **Authoritative Source for Phase Mapping:** [[Requirements Roadmap Matrix>>Test.FactHarbor V0\.9\.82.Roadmap.Requirements-Roadmap-Matrix.WebHome]]
47
48 The Roadmap Matrix is the single source of truth for which requirements are implemented in which phases. This page provides POC1-specific implementation details only.
49 {{/success}}
50
51 **POC1 implements these formal requirements:**
52
53 |= Formal Req |= Implementation in POC1 |= Notes
54 | **FR4** | Analysis Summary | Basic format; quality metadata deferred to POC2
55 | **FR7** | Automated Verdicts | Full implementation with quality gates (NFR11)
56 | **NFR11** | Quality Assurance Framework | 2 quality gates implemented (Gates 1 and 4)
57
58 **POC1 also covers these workflow components** (detailed as FR1-FR6, FR8, FR11, FR13 in the implementation sections below; items marked as deferred are not implemented in POC1):
59
60 * Claim extraction (FR1)
61 * Claim context (FR2)
62 * Multiple scenarios (FR3)
63 * Evidence collection (FR5)
64 * Source quality assessment (FR6)
65 * Time evolution tracking (FR8) - deferred to POC2
66 * Audit trail (FR11) - deferred to Beta 0
67 * In-article highlighting (FR13) - deferred to Beta 0
68
69 **Partial implementations:**
70
71 * NFR1 (Explainability) - Basic only
72 * NFR2 (Performance) - Functional but not optimized
73 * NFR3 (Transparency) - Basic only
74
75 **Detailed POC1 implementation specifications continue below...**
76
77
78
79 == 3. POC Simplifications ==
80
81 === 3.1 FR1: Claim Extraction (Full Implementation) ===
82
83 **Main Requirement:** AI extracts factual claims from input text
84
85 **POC Implementation:**
86
87 * ✅ AKEL extracts claims using LLM
88 * ✅ Each claim includes original text reference
89 * ✅ Claims are identified as factual/non-factual
90 * ❌ No advanced claim parsing (added in POC2)
91
92 **Acceptance Criteria:**
93
94 * Extracts 3-5 claims from typical article
95 * Identifies factual vs non-factual claims
96 * Quality Gate 1 validates extraction
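
The extraction result described above can be pictured as a small record per claim. The following is a minimal sketch, not the AKEL data model: `ExtractedClaim`, its character-offset reference, and `meets_extraction_criteria` are hypothetical names chosen for illustration. Only the 3-5 range comes from the acceptance criteria.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExtractedClaim:
    """One claim as returned by the extraction step (hypothetical shape)."""
    text: str           # the claim as extracted
    source_start: int   # assumed reference format: character offsets into the input
    source_end: int
    is_factual: bool    # factual vs non-factual classification


def meets_extraction_criteria(claims: List[ExtractedClaim],
                              lo: int = 3, hi: int = 5) -> bool:
    """True if the number of factual claims is within the expected range."""
    factual = [c for c in claims if c.is_factual]
    return lo <= len(factual) <= hi


article = "Paris is the capital of France. Paris is the best city."
claims = [
    ExtractedClaim("Paris is the capital of France", 0, 30, True),
    ExtractedClaim("Paris is the best city", 32, 54, False),
]
```

Keeping the offsets next to the text lets a reviewer check that every claim really occurs in the input, which is one simple way to satisfy the "original text reference" item.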
97
98 === 3.2 FR3: Multiple Scenarios (Full Implementation) ===
99
100 **Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims
101
102 **POC Implementation:**
103
104 * ✅ AKEL generates 2-3 scenarios per ambiguous claim
105 * ✅ Scenarios capture different interpretations
106 * ✅ Each scenario is evaluated separately
107 * ✅ Verdict considers all scenarios
108
109 **Acceptance Criteria:**
110
111 * Generates 2+ scenarios for ambiguous claims
112 * Scenarios are meaningfully different
113 * All scenarios are evaluated
114
115 === 3.3 FR4: Analysis Summary (Basic Implementation) ===
116
117 **Main Requirement:** Provide user-friendly summary of analysis
118
119 **POC Implementation:**
120
121 * ✅ Simple text summary generated
122 * ❌ No rich formatting (added in Beta 0)
123 * ❌ No visual elements (added in Beta 0)
124 * ❌ No interactive features (added in Beta 0)
125
126 **POC Format:**
```
Claim: [extracted claim]
Scenarios: [list of scenarios]
Evidence: [supporting/opposing evidence]
Verdict: [probability with uncertainty]
```
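
As an illustration of the plain-text format above, the sketch below renders one analyzed claim into the four lines. The function name `render_summary`, its parameters, and the exact wording of the evidence and verdict lines are assumptions; the source only fixes the four labels.

```python
from typing import List


def render_summary(claim: str, scenarios: List[str], supporting: List[str],
                   opposing: List[str], probability: float,
                   uncertainty: float) -> str:
    """Render one analyzed claim as the four-line POC text summary."""
    evidence = f"{len(supporting)} supporting, {len(opposing)} opposing"
    return "\n".join([
        f"Claim: {claim}",
        f"Scenarios: {'; '.join(scenarios)}",
        f"Evidence: {evidence}",
        # probability and uncertainty are fractions in [0, 1]
        f"Verdict: {probability:.0%} likely true (±{uncertainty:.0%})",
    ])


summary = render_summary(
    claim="Switzerland has the highest life expectancy in Europe",
    scenarios=["overall life expectancy", "specific age groups"],
    supporting=["WHO Report 2023"],
    opposing=["Eurostat 2024"],
    probability=0.65,
    uncertainty=0.15,
)
```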
133
134
135 === 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===
136
137 **Main Requirements:**
138
139 * FR5: Collect supporting and opposing evidence
140 * FR6: Evaluate evidence source reliability
141
142 **POC Implementation:**
143
144 * ✅ AKEL searches for evidence (web/knowledge base)
145 * ✅ **Mandatory contradiction search** (finds opposing evidence)
146 * ✅ Source reliability scoring
147 * ❌ No evidence deduplication (added in POC2)
148 * ❌ No advanced source verification (added in POC2)
149
150 **Acceptance Criteria:**
151
152 * Finds 2+ supporting evidence items
153 * Finds 1+ opposing evidence (if exists)
154 * Sources scored for reliability
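
The acceptance criteria above can be checked mechanically once evidence items carry a stance and a score. This is a sketch under stated assumptions: `EvidenceItem`, the `stance` strings, and `evidence_problems` are hypothetical. Because opposing evidence may legitimately not exist, the check only verifies that the contradiction search was run, not that it found anything.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EvidenceItem:
    source: str
    stance: str                    # "supporting" or "opposing" (assumed labels)
    reliability: Optional[float]   # 0.0-1.0, None if the source is not yet scored


def evidence_problems(items: List[EvidenceItem],
                      contradiction_search_ran: bool) -> List[str]:
    """Return the acceptance criteria that the collected evidence violates."""
    problems = []
    supporting = [e for e in items if e.stance == "supporting"]
    if len(supporting) < 2:
        problems.append("fewer than 2 supporting items")
    if not contradiction_search_ran:
        problems.append("contradiction search not run")
    if any(e.reliability is None for e in items):
        problems.append("unscored source")
    return problems


items = [
    EvidenceItem("WHO Report 2023", "supporting", 0.95),
    EvidenceItem("Eurostat 2024", "opposing", 0.90),
]
```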
155
156 === 3.5 FR7: Automated Verdicts (Full Implementation) ===
157
158 **Main Requirement:** AI computes verdicts with uncertainty quantification
159
160 **POC Implementation:**
161
162 * ✅ Probabilistic verdicts (0-100% probability)
163 * ✅ Uncertainty explicitly stated
164 * ✅ Reasoning chain provided
165 * ✅ Quality Gate 4 validates verdict confidence
166
167 **POC Output:**
```
Verdict: 70% likely true
Uncertainty: ±15% (moderate confidence)
Reasoning: Based on 3 high-quality sources...
Confidence Level: MEDIUM
```
174
175 **Acceptance Criteria:**
176
177 * Verdicts include probability (0-100%)
178 * Uncertainty explicitly quantified
179 * Reasoning chain explains verdict
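
One way to enforce these criteria is to validate a verdict at construction time. The sketch below is illustrative: the `Verdict` class and its field names are hypothetical, and the cut-offs in `confidence_level` are assumptions, since the source does not define how uncertainty maps to LOW/MEDIUM/HIGH. They are chosen only so that ±15% yields MEDIUM, matching the sample output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    probability: float   # 0.0-1.0 that the claim is true
    uncertainty: float   # half-width of the interval, 0.0-1.0
    reasoning: str       # reasoning chain explaining the verdict

    def __post_init__(self) -> None:
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must be in [0, 1]")
        if not 0.0 <= self.uncertainty <= 1.0:
            raise ValueError("uncertainty must be in [0, 1]")
        if not self.reasoning.strip():
            raise ValueError("reasoning chain must not be empty")

    def confidence_level(self) -> str:
        # Assumed cut-offs for illustration only; not defined in the requirements.
        if self.uncertainty <= 0.10:
            return "HIGH"
        if self.uncertainty <= 0.20:
            return "MEDIUM"
        return "LOW"


verdict = Verdict(0.70, 0.15, "Based on 3 high-quality sources...")
```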
180
181 === 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===
182
183 **Main Requirement:** Complete quality assurance with 7 quality gates
184
185 **POC Implementation:** **2 gates only**
186
187 **Quality Gate 1: Claim Validation**
188
189 * ✅ Validates claim is factual and verifiable
190 * ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
191 * ✅ Provides clear rejection reason
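
Gate 1 can be sketched as a pure function over the claim type assigned upstream. This is an illustration rather than the specified design: `GateResult`, `gate1_claim_validation`, and the type labels are hypothetical, and the sketch assumes the classification itself is produced earlier by the LLM.

```python
from dataclasses import dataclass
from typing import Optional

# Non-factual categories named in the gate description.
BLOCKED_TYPES = {"opinion", "prediction", "ambiguous"}


@dataclass(frozen=True)
class GateResult:
    status: str                    # "PASS" or "FAIL"
    reason: Optional[str] = None   # filled in only when the gate blocks


def gate1_claim_validation(claim_type: str) -> GateResult:
    """Pass factual claims; block everything else with a rejection reason."""
    if claim_type == "factual":
        return GateResult("PASS")
    if claim_type in BLOCKED_TYPES:
        return GateResult(
            "FAIL", f"claim classified as {claim_type}, not a verifiable fact")
    # Unknown labels are blocked too, so a new label cannot slip through.
    return GateResult("FAIL", f"unknown claim type: {claim_type!r}")
```

Blocking unknown labels by default is a design choice made here so that the gate fails closed.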
192
193 **Quality Gate 4: Verdict Confidence Assessment**
194
195 * ✅ Validates ≥2 sources found
196 * ✅ Validates quality score ≥0.6
197 * ✅ Blocks low-confidence verdicts
198 * ✅ Provides clear rejection reason
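
The two Gate 4 thresholds translate directly into a check. In this sketch the function name and return shape are hypothetical, and the "quality score" is assumed to be the mean source reliability, because the source does not say how the score is computed.

```python
from typing import List, Optional, Tuple

MIN_SOURCES = 2     # from the gate definition
MIN_QUALITY = 0.6   # from the gate definition


def gate4_verdict_confidence(reliabilities: List[float]) -> Tuple[bool, Optional[str]]:
    """Return (passed, rejection_reason) for one verdict's evidence sources."""
    if len(reliabilities) < MIN_SOURCES:
        return False, (f"only {len(reliabilities)} source(s) found, "
                       f"need at least {MIN_SOURCES}")
    # Assumption: quality score = mean reliability of the sources.
    quality = sum(reliabilities) / len(reliabilities)
    if quality < MIN_QUALITY:
        return False, f"quality score {quality:.2f} below {MIN_QUALITY}"
    return True, None
```

For example, two sources scored 0.95 and 0.90 pass, while a single source is blocked regardless of its score.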
199
200 **Out of Scope (POC2+):**
201
202 * ❌ Gate 2: Evidence Relevance
203 * ❌ Gate 3: Scenario Coherence
204 * ❌ Gate 5: Source Diversity
205 * ❌ Gate 6: Reasoning Validity
206 * ❌ Gate 7: Output Completeness
207
208 **Rationale:** Prove gate concept works. Add remaining gates in POC2 after validating approach.
209
210
211 === 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===
212
213 **Main Requirements:**
214
215 * NFR1: Response time < 30 seconds
216 * NFR2: Handle 1000+ concurrent users
217 * NFR3: 99.9% uptime
218
219 **POC Implementation:**
220
221 * ⚠️ **Response time monitored** (not optimized)
222 * ⚠️ **Single-threaded processing** (no concurrency)
223 * ⚠️ **Basic error handling** (no advanced retry logic)
224
225 **Rationale:** POC proves functionality. Performance optimization happens in POC2.
226
227 **POC Acceptance:**
228
229 * Analysis completes (no timeout requirement)
230 * Errors don't crash system
231 * Basic logging in place
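
The POC acceptance items above (monitored but unenforced response time, errors that do not crash the system, basic logging) fit in a small wrapper. The sketch below is illustrative only: `run_monitored` and the logger name are hypothetical, and `analyze` stands for whatever callable runs the analysis.

```python
import logging
import time
from typing import Any, Callable, Optional, Tuple

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("poc")


def run_monitored(analyze: Callable[[str], Any],
                  text: str) -> Tuple[Optional[Any], float]:
    """Run one analysis, log its duration, and never let an error propagate."""
    start = time.perf_counter()
    try:
        result = analyze(text)
    except Exception:
        # Basic error handling: log the traceback and carry on, no retry.
        log.exception("analysis failed")
        result = None
    elapsed = time.perf_counter() - start
    # Response time is recorded but not compared against any limit.
    log.info("analysis took %.2fs", elapsed)
    return result, elapsed
```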
232
233 == 4. What's NOT in POC Scope ==
234
235 === 4.1 User-Facing Features (Beta 0+) ===
236
237 {{warning}}
238 **Deferred to Beta 0:**
239 {{/warning}}
240
241 **Out of Scope:**
242
243 * ❌ User accounts and authentication (FR8)
244 * ❌ User corrections system (FR9, FR45-46)
245 * ❌ Public publishing interface (FR10)
246 * ❌ Social sharing (FR11)
247 * ❌ Email notifications (FR12)
248 * ❌ API access (FR13)
249
250 **Rationale:** POC validates AI capabilities. User features added in Beta 0.
251
252
253 === 4.2 Advanced Features (V1.0+) ===
254
255 **Out of Scope:**
256
257 * ❌ IFCN compliance (FR47)
258 * ❌ ClaimReview schema (FR48)
259 * ❌ Archive.org integration (FR49)
260 * ❌ OSINT toolkit (FR50)
261 * ❌ Video verification (FR51)
262 * ❌ Deepfake detection (FR52)
263 * ❌ Cross-org sharing (FR53)
264
265 **Rationale:** Advanced features require a proven platform. Added from V1.0 onward.
266
267
268 === 4.3 Production Requirements (POC2, Beta 0) ===
269
270 **Out of Scope:**
271
272 * ❌ Security controls (NFR4, NFR12)
273 * ❌ Code maintainability (NFR5)
274 * ❌ System monitoring (NFR13)
275 * ❌ Evidence deduplication
276 * ❌ Advanced source verification
277 * ❌ Full 7-gate quality framework
278
279 **Rationale:** POC proves concept. Production hardening happens in POC2 and Beta 0.
280
281
282 == 5. POC Output Specification ==
283
284 === 5.1 Required Output Elements ===
285
286 For each analyzed claim, POC must produce:
287
288
289 **1. Claim**
290
291 * Original text
292 * Classification (factual/non-factual/ambiguous)
293 * If non-factual: Clear reason why
294
295 **2. Scenarios** (if factual)
296
297 * 2-3 interpretation scenarios
298 * Each scenario clearly described
299
300 **3. Evidence** (if factual)
301
302 * Supporting evidence (2+ items)
303 * Opposing evidence (if exists)
304 * Source URLs and reliability scores
305
306 **4. Verdict** (if factual)
307
308 * Probability (0-100%)
309 * Uncertainty quantification
310 * Confidence level (LOW/MEDIUM/HIGH)
311 * Reasoning chain
312
313 **5. Quality Status**
314
315 * Which gates passed/failed
316 * If failed: Clear explanation why
317
318 === 5.2 Example POC Output ===
319
{{code language="json"}}
{
  "claim": {
    "text": "Switzerland has the highest life expectancy in Europe",
    "type": "factual",
    "gate1_status": "PASS"
  },
  "scenarios": [
    "Switzerland's overall life expectancy is highest",
    "Switzerland ranks highest for specific age groups"
  ],
  "evidence": {
    "supporting": [
      {
        "source": "WHO Report 2023",
        "reliability": 0.95,
        "excerpt": "Switzerland: 83.4 years average..."
      }
    ],
    "opposing": [
      {
        "source": "Eurostat 2024",
        "reliability": 0.90,
        "excerpt": "Spain leads at 83.5 years..."
      }
    ]
  },
  "verdict": {
    "probability": 0.65,
    "uncertainty": 0.15,
    "confidence": "MEDIUM",
    "reasoning": "WHO and Eurostat show similar but conflicting data...",
    "gate4_status": "PASS"
  }
}
{{/code}}
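
A structural check against the required output elements could look like the sketch below. The key names are taken from the example output, but treating them as mandatory is an assumption, and `missing_fields` is a hypothetical helper. Scenarios, evidence, and verdict are only required when the claim is factual, following section 5.1.

```python
import json
from typing import Any, Dict, List

CLAIM_KEYS = ["text", "type", "gate1_status"]
EVIDENCE_KEYS = ["supporting", "opposing"]
VERDICT_KEYS = ["probability", "uncertainty", "confidence",
                "reasoning", "gate4_status"]


def missing_fields(output: Dict[str, Any]) -> List[str]:
    """List dotted paths of required fields absent from one POC output."""
    claim = output.get("claim")
    if not isinstance(claim, dict):
        return ["claim"]
    missing = [f"claim.{k}" for k in CLAIM_KEYS if k not in claim]
    if claim.get("type") != "factual":
        return missing  # remaining sections apply to factual claims only
    if "scenarios" not in output:
        missing.append("scenarios")
    for section, keys in (("evidence", EVIDENCE_KEYS), ("verdict", VERDICT_KEYS)):
        body = output.get(section)
        if not isinstance(body, dict):
            missing.append(section)
        else:
            missing += [f"{section}.{k}" for k in keys if k not in body]
    return missing


blocked = json.loads(
    '{"claim": {"text": "Paris is the best city", '
    '"type": "opinion", "gate1_status": "FAIL"}}'
)
incomplete = {
    "claim": {"text": "t", "type": "factual", "gate1_status": "PASS"},
    "scenarios": [],
    "evidence": {"supporting": []},
}
```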
356
357
358 == 6. Success Criteria ==
359
360 {{success}}
361 **POC Success Definition:** POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts with quality gates improving output quality.
362 {{/success}}
363
364 === 6.1 Functional Success ===
365
366 POC is successful if:
367
368 ✅ **FR1-FR7 Requirements Met:**
369
370 1. Extracts 3-5 factual claims from test articles
371 2. Generates 2-3 scenarios per ambiguous claim
372 3. Finds supporting AND opposing evidence
373 4. Computes probabilistic verdicts with uncertainty
374 5. Provides clear reasoning chains
375
376 ✅ **Quality Gates Work:**
377
378 1. Gate 1 blocks non-factual claims (100% block rate)
379 2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
380 3. Clear rejection reasons provided
381
382 ✅ **NFR11 Met:**
383
384 1. Quality gates reduce hallucination rate
385 2. Blocked outputs have clear explanations
386 3. Quality metrics are logged
387
388 === 6.2 Quality Thresholds ===
389
390 **Minimum Acceptable:**
391
392 * ≥70% of test claims correctly classified (factual/non-factual)
393 * ≥60% of verdicts are reasonable (human evaluation)
394 * Gate 1 blocks 100% of non-factual claims
395 * Gate 4 blocks verdicts with <2 sources
396
397 **Target:**
398
399 * ≥80% claims correctly classified
400 * ≥75% verdicts are reasonable
401 * <10% false positives (blocking good claims)
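
The thresholds above can be evaluated from four measured rates. This is a minimal sketch with hypothetical names (`PocMetrics`, `threshold_level`) and result labels. The Gate 4 rule about fewer than 2 sources is a behavioral check rather than a rate, so it is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PocMetrics:
    classification_accuracy: float  # share of test claims classified correctly
    reasonable_verdicts: float      # share judged reasonable by human reviewers
    gate1_block_rate: float         # share of non-factual claims blocked by Gate 1
    false_positive_rate: float      # share of good claims that were blocked


def threshold_level(m: PocMetrics) -> str:
    """Classify a test run as BELOW_MINIMUM, MINIMUM, or TARGET."""
    minimum = (m.classification_accuracy >= 0.70
               and m.reasonable_verdicts >= 0.60
               and m.gate1_block_rate == 1.0)
    if not minimum:
        return "BELOW_MINIMUM"
    target = (m.classification_accuracy >= 0.80
              and m.reasonable_verdicts >= 0.75
              and m.false_positive_rate < 0.10)
    return "TARGET" if target else "MINIMUM"
```

A run with 82% accuracy, 76% reasonable verdicts, a full Gate 1 block rate, and 5% false positives reaches the target; dropping the block rate below 100% fails even the minimum.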
402
403 === 6.3 POC Decision Gate ===
404
405 **After POC1, we decide:**
406
407 **✅ PROCEED to POC2** if:
408
409 * Success criteria met
410 * Quality gates demonstrably improve output
411 * Core workflow is technically sound
412 * Clear path to production quality
413
414 **⚠️ ITERATE POC1** if:
415
416 * Success criteria partially met
417 * Gates work but need tuning
418 * Core issues identified but fixable
419
420 **❌ PIVOT APPROACH** if:
421
422 * Success criteria not met
423 * Fundamental AI limitations discovered
424 * Quality gates insufficient
425 * Alternative approach needed
426
427 == 7. Test Cases ==
428
429 === 7.1 Happy Path ===
430
431 **Test 1: Simple Factual Claim**
432
433 * Input: "Paris is the capital of France"
434 * Expected: Factual, 1 scenario, verdict 95% true
435
436 **Test 2: Ambiguous Claim**
437
438 * Input: "Switzerland has the highest income in Europe"
439 * Expected: Factual, 2-3 scenarios, verdict with uncertainty
440
441 **Test 3: Statistical Claim**
442
443 * Input: "10% of people have condition X"
444 * Expected: Factual, evidence with numbers, probabilistic verdict
445
446 === 7.2 Edge Cases ===
447
448 **Test 4: Opinion**
449
450 * Input: "Paris is the best city"
451 * Expected: Non-factual (opinion), blocked by Gate 1
452
453 **Test 5: Prediction**
454
455 * Input: "Bitcoin will reach $100,000 next year"
456 * Expected: Non-factual (prediction), blocked by Gate 1
457
458 **Test 6: Insufficient Evidence**
459
460 * Input: Obscure factual claim with no sources
461 * Expected: Blocked by Gate 4 (<2 sources)
462
463 === 7.3 Quality Gate Tests ===
464
465 **Test 7: Gate 1 Effectiveness**
466
467 * Input: Mix of 10 factual + 10 non-factual claims
468 * Expected: Gate 1 blocks all 10 non-factual claims (100% block rate) and passes the factual ones
469
470 **Test 8: Gate 4 Effectiveness**
471
472 * Input: Claims with varying evidence availability
473 * Expected: Gate 4 blocks low-confidence verdicts
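
The Gate 1 effectiveness test can be scored with a small harness over a labelled claim set. The sketch is illustrative: `gate1_effectiveness` is a hypothetical helper, and `toy_gate` is a keyword stand-in used only to make the example runnable, not a proposal for how Gate 1 should work.

```python
from typing import Callable, List, Tuple


def gate1_effectiveness(gate_blocks: Callable[[str], bool],
                        labelled: List[Tuple[str, bool]]) -> Tuple[float, float]:
    """labelled holds (claim, is_factual) pairs.

    Returns (block rate on non-factual claims, false-positive rate on factual claims).
    """
    non_factual = [c for c, is_factual in labelled if not is_factual]
    factual = [c for c, is_factual in labelled if is_factual]
    if not non_factual or not factual:
        raise ValueError("need at least one factual and one non-factual claim")
    blocked_non_factual = sum(gate_blocks(c) for c in non_factual)
    blocked_factual = sum(gate_blocks(c) for c in factual)
    return (blocked_non_factual / len(non_factual),
            blocked_factual / len(factual))


def toy_gate(claim: str) -> bool:
    # Stand-in for the real gate, for demonstration only.
    words = claim.lower().split()
    return "best" in words or "will" in words


labelled = [
    ("Paris is the capital of France", True),
    ("Paris is the best city", False),
    ("Bitcoin will reach $100,000 next year", False),
]
block_rate, false_positive_rate = gate1_effectiveness(toy_gate, labelled)
```

Reporting both rates matters because a gate that blocks everything would also reach a 100% block rate.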
474
475 == 8. Technical Architecture (POC) ==
476
477 === 8.1 Simplified Architecture ===
478
479 **POC Tech Stack:**
480
481 * **Frontend:** Simple web interface (Next.js + TypeScript)
482 * **Backend:** Single API endpoint
483 * **AI:** Claude API (Sonnet 4.5)
484 * **Database:** Local JSON files (no database)
485 * **Deployment:** Single server
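
Storing results as local JSON files needs very little code. The sketch below shows one possible layout, one file per analysis. The function names, the `analysis_id` naming scheme, and the directory are assumptions, and Python is used here for illustration even though the backend language is not stated.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_result(directory: Path, analysis_id: str,
                result: Dict[str, Any]) -> Path:
    """Write one analysis result to <directory>/<analysis_id>.json."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{analysis_id}.json"
    path.write_text(json.dumps(result, indent=2, ensure_ascii=False),
                    encoding="utf-8")
    return path


def load_result(directory: Path, analysis_id: str) -> Dict[str, Any]:
    """Read a previously saved analysis result."""
    path = directory / f"{analysis_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))


# Round trip in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_result(Path(tmp), "demo",
                        {"claim": {"text": "Paris is the capital of France"}})
    roundtrip = load_result(Path(tmp), "demo")
```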
486
487 **Architecture Diagram:** See [[POC1 Specification>>FactHarbor.Specification.POC.Specification]]
488
489
490 === 8.2 AKEL Implementation ===
491
492 **POC AKEL:**
493
494 * Single-threaded processing
495 * Synchronous API calls
496 * No caching
497 * Basic error handling
498 * Console logging
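
The POC characteristics listed above (single-threaded, synchronous, no caching, basic error handling, console logging) can be sketched as a plain sequential loop. `run_pipeline`, the `Step` type, and the two demo steps are hypothetical; real steps would call the LLM and search services.

```python
import logging
from typing import Any, Callable, Dict, List

logging.basicConfig(level=logging.INFO)  # console logging only
log = logging.getLogger("akel.poc")

Step = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(text: str, steps: List[Step]) -> Dict[str, Any]:
    """Run the steps one after another; stop at the first failure."""
    state: Dict[str, Any] = {"input": text, "status": "OK"}
    for step in steps:
        try:
            state = step(state)  # blocking call, no concurrency, no cache
        except Exception as exc:
            # Basic error handling: record the failure and stop, no retry.
            log.error("step %s failed: %s", step.__name__, exc)
            state["status"] = "ERROR"
            state["error"] = f"{step.__name__}: {exc}"
            break
        log.info("step %s done", step.__name__)
    return state


def extract_claims(state: Dict[str, Any]) -> Dict[str, Any]:
    # Demo step: treats the whole input as a single claim.
    return {**state, "claims": [state["input"]]}


def failing_step(state: Dict[str, Any]) -> Dict[str, Any]:
    # Demo step that simulates an unavailable dependency.
    raise RuntimeError("search unavailable")
```

Passing the state as a dictionary between steps keeps each stage replaceable, which would make the later move to asynchronous calls and caching less invasive.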
499
500 **Full AKEL (POC2+):**
501
502 * Multi-threaded processing
503 * Async API calls
504 * Evidence caching
505 * Advanced error handling with retry
506 * Structured logging + monitoring
507
508 == 9. POC Philosophy ==
509
510 {{info}}
511 **Important:** POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases.
512 {{/info}}
513
514 === 9.1 Core Principles ===
515
516
517 **1. Prove Concept, Not Production**
518
519 * POC validates AI can do the job
520 * Production quality comes in POC2 and Beta 0
521 * Focus on "does it work?" not "is it perfect?"
522
523 **2. Implement Subset of Requirements**
524
525 * POC covers FR1-7, NFR11 (lite)
526 * All other requirements deferred
527 * Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]
528
529 **3. Quality Gates Validate Approach**
530
531 * 2 gates prove the concept
532 * Remaining 5 gates added in POC2
533 * Gates must demonstrably improve quality
534
535 **4. Iterate Based on Results**
536
537 * POC results determine next steps
538 * Decision gate after POC1
539 * Flexibility to pivot if needed
540
541 === 9.2 Success: Clear Path Forward ===
542
543
544
545 POC succeeds if we can confidently answer:
546
547 ✅ **Technical Feasibility:**
548
549 * Can AI extract claims reliably?
550 * Can AI find balanced evidence?
551 * Can AI compute reasonable verdicts?
552
553 ✅ **Quality Approach:**
554
555 * Do quality gates improve output?
556 * Can we measure and track quality?
557 * Is the gate approach scalable?
558
559 ✅ **Production Path:**
560
561 * Is the core architecture sound?
562 * What needs improvement for production?
563 * Is POC2 the right next step?
564
565 == 10. Related Pages ==
566
567 * **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
568 * **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
569 * **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
570 * **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
571 * **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)
572
573 **Document Owner:** Technical Team
574 **Review Frequency:** After each POC iteration
575 **Version History:**
576
577 * v1.0 - Initial POC requirements
578 * v2.0 - Updated after specification cross-check
579 * v3.0 - Aligned with Main Requirements (FR/NFR IDs added)