Changes for page POC Requirements

Last modified by Robert Schaub on 2026/02/08 08:25

From version 1.2
edited by Robert Schaub
on 2025/12/22 13:50
Change comment: Update document after refactoring.
To version 1.4
edited by Robert Schaub
on 2026/01/20 20:29
Change comment: Renamed back-links.

Page properties
Content
... ... @@ -14,9 +14,11 @@
14 14  === 1.1 What POC Tests ===
15 15  
16 16  **Core Question:**
17 +
17 17  > Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?
18 18  
19 19  **What we're proving:**
21 +
20 20  * AI can identify factual claims from text
21 21  * AI can evaluate those claims with structured evidence
22 22  * Quality gates can filter unreliable outputs
... ... @@ -23,6 +23,7 @@
23 23  * The core workflow is technically feasible
24 24  
25 25  **What we're NOT proving:**
28 +
26 26  * Production-ready reliability (that's POC2)
27 27  * User-facing features (that's Beta 0)
28 28  * Full IFCN compliance (that's V1.0)
... ... @@ -32,11 +32,11 @@
32 32  POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].
33 33  
34 34  **Scope Summary:**
38 +
35 35  * **In Scope:** 8 requirements (7 FRs + 1 NFR)
36 36  * **Partial:** 3 NFRs (simplified versions)
37 37  * **Out of Scope:** 19 requirements (deferred to later phases)
38 38  
39 -
40 40  == 2. Requirements Scope Matrix ==
41 41  
42 42  {{success}}
... ... @@ -44,7 +44,7 @@
44 44  {{/success}}
45 45  
46 46  |=Requirement|=POC1 Status|=Implementation Level|=Notes
47 -|**CORE WORKFLOW**||||
50 +|**CORE WORKFLOW**||||\\
48 48  |FR1: Claim Extraction|✅ **In Scope**|Full|AKEL extracts claims from text
49 49  |FR2: Claim Context|✅ **In Scope**|Basic|Context preserved with claim
50 50  |FR3: Multiple Scenarios|✅ **In Scope**|Full|AKEL generates interpretation scenarios
... ... @@ -52,12 +52,12 @@
52 52  |FR5: Evidence Collection|✅ **In Scope**|Full|AKEL searches for evidence
53 53  |FR6: Evidence Evaluation|✅ **In Scope**|Full|AKEL evaluates source reliability
54 54  |FR7: Automated Verdicts|✅ **In Scope**|Full|AKEL computes verdicts with uncertainty
55 -|**QUALITY & RELIABILITY**||||
58 +|**QUALITY & RELIABILITY**||||\\
56 56  |NFR11: Quality Assurance|✅ **In Scope**|**Lite**|**2 gates only** (Gate 1 & 4)
57 57  |NFR1: Performance|⚠️ **Partial**|Basic|Response time monitored, not optimized
58 58  |NFR2: Scalability|⚠️ **Partial**|Single-thread|No concurrent processing
59 59  |NFR3: Reliability|⚠️ **Partial**|Basic|Error handling, no retry logic
60 -|**DEFERRED TO LATER**||||
63 +|**DEFERRED TO LATER**||||\\
61 61  |FR8-FR13|❌ Out of Scope|N/A|User accounts, corrections, publishing
62 62  |FR44-FR53|❌ Out of Scope|N/A|Advanced features (V1.0+)
63 63  |NFR4: Security|❌ Out of Scope|N/A|POC2
... ... @@ -65,7 +65,6 @@
65 65  |NFR12: Security Controls|❌ Out of Scope|N/A|Beta 0
66 66  |NFR13: Monitoring|❌ Out of Scope|N/A|POC2
67 67  
68 -
69 69  == 3. POC Simplifications ==
70 70  
71 71  === 3.1 FR1: Claim Extraction (Full Implementation) ===
... ... @@ -73,6 +73,7 @@
73 73  **Main Requirement:** AI extracts factual claims from input text
74 74  
75 75  **POC Implementation:**
78 +
76 76  * ✅ AKEL extracts claims using LLM
77 77  * ✅ Each claim includes original text reference
78 78  * ✅ Claims are identified as factual/non-factual
... ... @@ -79,16 +79,17 @@
79 79  * ❌ No advanced claim parsing (added in POC2)
80 80  
81 81  **Acceptance Criteria:**
85 +
82 82  * Extracts 3-5 claims from typical article
83 83  * Identifies factual vs non-factual claims
84 84  * Quality Gate 1 validates extraction
85 85  
86 -
87 87  === 3.2 FR3: Multiple Scenarios (Full Implementation) ===
88 88  
89 89  **Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims
90 90  
91 91  **POC Implementation:**
95 +
92 92  * ✅ AKEL generates 2-3 scenarios per claim
93 93  * ✅ Scenarios capture different interpretations
94 94  * ✅ Each scenario is evaluated separately
... ... @@ -95,16 +95,17 @@
95 95  * ✅ Verdict considers all scenarios
96 96  
97 97  **Acceptance Criteria:**
102 +
98 98  * Generates 2+ scenarios for ambiguous claims
99 99  * Scenarios are meaningfully different
100 100  * All scenarios are evaluated
101 101  
102 -
103 103  === 3.3 FR4: Analysis Summary (Basic Implementation) ===
104 104  
105 105  **Main Requirement:** Provide user-friendly summary of analysis
106 106  
107 107  **POC Implementation:**
112 +
108 108  * ✅ Simple text summary generated
109 109  * ❌ No rich formatting (added in Beta 0)
110 110  * ❌ No visual elements (added in Beta 0)
... ... @@ -122,10 +122,12 @@
122 122  === 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===
123 123  
124 124  **Main Requirements:**
130 +
125 125  * FR5: Collect supporting and opposing evidence
126 126  * FR6: Evaluate evidence source reliability
127 127  
128 128  **POC Implementation:**
135 +
129 129  * ✅ AKEL searches for evidence (web/knowledge base)
130 130  * ✅ **Mandatory contradiction search** (finds opposing evidence)
131 131  * ✅ Source reliability scoring
... ... @@ -133,16 +133,17 @@
133 133  * ❌ No advanced source verification (added in POC2)
134 134  
135 135  **Acceptance Criteria:**
143 +
136 136  * Finds 2+ supporting evidence items
137 137  * Finds 1+ opposing evidence (if exists)
138 138  * Sources scored for reliability
139 139  
140 -
141 141  === 3.5 FR7: Automated Verdicts (Full Implementation) ===
142 142  
143 143  **Main Requirement:** AI computes verdicts with uncertainty quantification
144 144  
145 145  **POC Implementation:**
153 +
146 146  * ✅ Probabilistic verdicts (0-100% confidence)
147 147  * ✅ Uncertainty explicitly stated
148 148  * ✅ Reasoning chain provided
... ... @@ -157,11 +157,11 @@
157 157  ```
158 158  
159 159  **Acceptance Criteria:**
168 +
160 160  * Verdicts include probability (0-100%)
161 161  * Uncertainty explicitly quantified
162 162  * Reasoning chain explains verdict
163 163  
164 -
165 165  === 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===
166 166  
167 167  **Main Requirement:** Complete quality assurance with 7 quality gates
... ... @@ -169,11 +169,13 @@
169 169  **POC Implementation:** **2 gates only**
170 170  
171 171  **Quality Gate 1: Claim Validation**
180 +
172 172  * ✅ Validates claim is factual and verifiable
173 173  * ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
174 174  * ✅ Provides clear rejection reason
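
The Gate 1 behavior listed above can be sketched as a small function. This is a minimal illustration, not the AKEL implementation: the type names, the function name, and the assumption that the LLM has already assigned a classification upstream are all hypothetical.

```typescript
// Hypothetical classification labels, mirroring "opinion/prediction/ambiguous" above.
type Gate1Class = "factual" | "opinion" | "prediction" | "ambiguous";

interface Gate1Result {
  passed: boolean;
  reason?: string; // set only when the gate blocks the claim
}

// Assumes the classification was produced earlier (e.g. by the LLM extraction step).
function gate1ClaimValidation(classification: Gate1Class): Gate1Result {
  if (classification === "factual") {
    return { passed: true };
  }
  // Every non-factual class is blocked with an explicit rejection reason.
  return {
    passed: false,
    reason: `Blocked by Gate 1: claim classified as ${classification}, not a verifiable factual claim.`,
  };
}
```

For example, `gate1ClaimValidation("opinion")` returns a failed result whose reason names the `opinion` class.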
175 175  
176 176  **Quality Gate 4: Verdict Confidence Assessment**
186 +
177 177  * ✅ Validates ≥2 sources found
178 178  * ✅ Validates quality score ≥0.6
179 179  * ✅ Blocks low-confidence verdicts
... ... @@ -180,6 +180,7 @@
180 180  * ✅ Provides clear rejection reason
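
The two Gate 4 thresholds (at least 2 sources, quality score at least 0.6) can be expressed as follows. This is a sketch under assumptions: the names are hypothetical, and the quality score is assumed to be a single number in the range 0 to 1, since this page does not say how it is computed.

```typescript
// Hypothetical input shape; how qualityScore is derived is not specified here.
interface Gate4Input {
  sourceCount: number;
  qualityScore: number; // assumed to lie in [0, 1]
}

interface Gate4Result {
  passed: boolean;
  reason?: string; // set only when the gate blocks the verdict
}

const GATE4_MIN_SOURCES = 2;
const GATE4_MIN_QUALITY = 0.6;

function gate4VerdictConfidence(input: Gate4Input): Gate4Result {
  const reasons: string[] = [];
  if (input.sourceCount < GATE4_MIN_SOURCES) {
    reasons.push(`only ${input.sourceCount} source(s) found, need at least ${GATE4_MIN_SOURCES}`);
  }
  if (input.qualityScore < GATE4_MIN_QUALITY) {
    reasons.push(`quality score ${input.qualityScore} is below ${GATE4_MIN_QUALITY}`);
  }
  // Report every failed check so the rejection reason is complete.
  return reasons.length === 0
    ? { passed: true }
    : { passed: false, reason: `Blocked by Gate 4: ${reasons.join("; ")}` };
}
```

Both thresholds are inclusive, so a verdict with exactly 2 sources and a score of exactly 0.6 passes.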
181 181  
182 182  **Out of Scope (POC2+):**
193 +
183 183  * ❌ Gate 2: Evidence Relevance
184 184  * ❌ Gate 3: Scenario Coherence
185 185  * ❌ Gate 5: Source Diversity
... ... @@ -192,11 +192,13 @@
192 192  === 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===
193 193  
194 194  **Main Requirements:**
206 +
195 195  * NFR1: Response time < 30 seconds
196 196  * NFR2: Handle 1000+ concurrent users
197 197  * NFR3: 99.9% uptime
198 198  
199 199  **POC Implementation:**
212 +
200 200  * ⚠️ **Response time monitored** (not optimized)
201 201  * ⚠️ **Single-threaded processing** (no concurrency)
202 202  * ⚠️ **Basic error handling** (no advanced retry logic)
... ... @@ -204,11 +204,11 @@
204 204  **Rationale:** POC proves functionality. Performance optimization happens in POC2.
205 205  
206 206  **POC Acceptance:**
220 +
207 207  * Analysis completes (no timeout requirement)
208 208  * Errors don't crash system
209 209  * Basic logging in place
210 210  
211 -
212 212  == 4. What's NOT in POC Scope ==
213 213  
214 214  === 4.1 User-Facing Features (Beta 0+) ===
... ... @@ -218,6 +218,7 @@
218 218  {{/warning}}
219 219  
220 220  **Out of Scope:**
234 +
221 221  * ❌ User accounts and authentication (FR8)
222 222  * ❌ User corrections system (FR9, FR45-46)
223 223  * ❌ Public publishing interface (FR10)
... ... @@ -231,6 +231,7 @@
231 231  === 4.2 Advanced Features (V1.0+) ===
232 232  
233 233  **Out of Scope:**
248 +
234 234  * ❌ IFCN compliance (FR47)
235 235  * ❌ ClaimReview schema (FR48)
236 236  * ❌ Archive.org integration (FR49)
... ... @@ -245,6 +245,7 @@
245 245  === 4.3 Production Requirements (POC2, Beta 0) ===
246 246  
247 247  **Out of Scope:**
263 +
248 248  * ❌ Security controls (NFR4, NFR12)
249 249  * ❌ Code maintainability (NFR5)
250 250  * ❌ System monitoring (NFR13)
... ... @@ -261,21 +261,26 @@
261 261  
262 262  For each analyzed claim, POC must produce:
263 263  
264 -**1. Claim**
280 +* \\
281 +** \\
282 +**1. Claim
265 265  * Original text
266 266  * Classification (factual/non-factual/ambiguous)
267 267  * If non-factual: Clear reason why
268 268  
269 269  **2. Scenarios** (if factual)
288 +
270 270  * 2-3 interpretation scenarios
271 271  * Each scenario clearly described
272 272  
273 273  **3. Evidence** (if factual)
293 +
274 274  * Supporting evidence (2+ items)
275 275  * Opposing evidence (if exists)
276 276  * Source URLs and reliability scores
277 277  
278 278  **4. Verdict** (if factual)
299 +
279 279  * Probability (0-100%)
280 280  * Uncertainty quantification
281 281  * Confidence level (LOW/MEDIUM/HIGH)
... ... @@ -282,10 +282,10 @@
282 282  * Reasoning chain
283 283  
284 284  **5. Quality Status**
306 +
285 285  * Which gates passed/failed
286 286  * If failed: Clear explanation why
287 287  
288 -
289 289  === 5.2 Example POC Output ===
290 290  
291 291  {{code language="json"}}
... ... @@ -337,6 +337,7 @@
337 337  POC is successful if:
338 338  
339 339  ✅ **FR1-FR7 Requirements Met:**
361 +
340 340  1. Extracts 3-5 factual claims from test articles
341 341  2. Generates 2-3 scenarios per ambiguous claim
342 342  3. Finds supporting AND opposing evidence
... ... @@ -344,19 +344,21 @@
344 344  5. Provides clear reasoning chains
345 345  
346 346  ✅ **Quality Gates Work:**
369 +
347 347  1. Gate 1 blocks non-factual claims (100% block rate)
348 348  2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
349 349  3. Clear rejection reasons provided
350 350  
351 351  ✅ **NFR11 Met:**
375 +
352 352  1. Quality gates reduce hallucination rate
353 353  2. Blocked outputs have clear explanations
354 354  3. Quality metrics are logged
355 355  
356 -
357 357  === 6.2 Quality Thresholds ===
358 358  
359 359  **Minimum Acceptable:**
383 +
360 360  * ≥70% of test claims correctly classified (factual/non-factual)
361 361  * ≥60% of verdicts are reasonable (human evaluation)
362 362  * Gate 1 blocks 100% of non-factual claims
... ... @@ -363,16 +363,17 @@
363 363  * Gate 4 blocks verdicts with <2 sources
364 364  
365 365  **Target:**
390 +
366 366  * ≥80% claims correctly classified
367 367  * ≥75% verdicts are reasonable
368 368  * <10% false positives (blocking good claims)
369 369  
370 -
371 371  === 6.3 POC Decision Gate ===
372 372  
373 373  **After POC1, we decide:**
374 374  
375 375  **✅ PROCEED to POC2** if:
400 +
376 376  * Success criteria met
377 377  * Quality gates demonstrably improve output
378 378  * Core workflow is technically sound
... ... @@ -379,65 +379,72 @@
379 379  * Clear path to production quality
380 380  
381 381  **⚠️ ITERATE POC1** if:
407 +
382 382  * Success criteria partially met
383 383  * Gates work but need tuning
384 384  * Core issues identified but fixable
385 385  
386 386  **❌ PIVOT APPROACH** if:
413 +
387 387  * Success criteria not met
388 388  * Fundamental AI limitations discovered
389 389  * Quality gates insufficient
390 390  * Alternative approach needed
391 391  
392 -
393 393  == 7. Test Cases ==
394 394  
395 395  === 7.1 Happy Path ===
396 396  
397 397  **Test 1: Simple Factual Claim**
424 +
398 398  * Input: "Paris is the capital of France"
399 -* Expected: Factual, 1 scenario, verdict ~95% true
426 +* Expected: Factual, 1 scenario, verdict 95% true
400 400  
401 401  **Test 2: Ambiguous Claim**
429 +
402 402  * Input: "Switzerland has the highest income in Europe"
403 403  * Expected: Factual, 2-3 scenarios, verdict with uncertainty
404 404  
405 405  **Test 3: Statistical Claim**
434 +
406 406  * Input: "10% of people have condition X"
407 407  * Expected: Factual, evidence with numbers, probabilistic verdict
408 408  
409 -
410 410  === 7.2 Edge Cases ===
411 411  
412 412  **Test 4: Opinion**
441 +
413 413  * Input: "Paris is the best city"
414 414  * Expected: Non-factual (opinion), blocked by Gate 1
415 415  
416 416  **Test 5: Prediction**
446 +
417 417  * Input: "Bitcoin will reach $100,000 next year"
418 418  * Expected: Non-factual (prediction), blocked by Gate 1
419 419  
420 420  **Test 6: Insufficient Evidence**
451 +
421 421  * Input: Obscure factual claim with no sources
422 422  * Expected: Blocked by Gate 4 (<2 sources)
423 423  
424 -
425 425  === 7.3 Quality Gate Tests ===
426 426  
427 427  **Test 7: Gate 1 Effectiveness**
458 +
428 428  * Input: Mix of 10 factual + 10 non-factual claims
429 429  * Expected: Gate 1 blocks all 10 non-factual claims (100% block rate)
430 430  
431 431  **Test 8: Gate 4 Effectiveness**
463 +
432 432  * Input: Claims with varying evidence availability
433 433  * Expected: Gate 4 blocks low-confidence verdicts
434 434  
435 -
436 436  == 8. Technical Architecture (POC) ==
437 437  
438 438  === 8.1 Simplified Architecture ===
439 439  
440 440  **POC Tech Stack:**
472 +
441 441  * **Frontend:** Simple web interface (Next.js + TypeScript)
442 442  * **Backend:** Single API endpoint
443 443  * **AI:** Claude API (Sonnet 4.5)
... ... @@ -450,6 +450,7 @@
450 450  === 8.2 AKEL Implementation ===
451 451  
452 452  **POC AKEL:**
485 +
453 453  * Single-threaded processing
454 454  * Synchronous API calls
455 455  * No caching
... ... @@ -457,6 +457,7 @@
457 457  * Console logging
458 458  
459 459  **Full AKEL (POC2+):**
493 +
460 460  * Multi-threaded processing
461 461  * Async API calls
462 462  * Evidence caching
... ... @@ -463,7 +463,6 @@
463 463  * Advanced error handling with retry
464 464  * Structured logging + monitoring
465 465  
466 -
467 467  == 9. POC Philosophy ==
468 468  
469 469  {{info}}
... ... @@ -472,60 +472,67 @@
472 472  
473 473  === 9.1 Core Principles ===
474 474  
475 -**1. Prove Concept, Not Production**
508 +* \\
509 +** \\
510 +**1. Prove Concept, Not Production
476 476  * POC validates AI can do the job
477 477  * Production quality comes in POC2 and Beta 0
478 478  * Focus on "does it work?" not "is it perfect?"
479 479  
480 480  **2. Implement Subset of Requirements**
516 +
481 481  * POC covers FR1-7, NFR11 (lite)
482 482  * All other requirements deferred
483 483  * Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]
484 484  
485 485  **3. Quality Gates Validate Approach**
522 +
486 486  * 2 gates prove the concept
487 487  * Remaining 5 gates added in POC2
488 488  * Gates must demonstrably improve quality
489 489  
490 490  **4. Iterate Based on Results**
528 +
491 491  * POC results determine next steps
492 492  * Decision gate after POC1
493 493  * Flexibility to pivot if needed
494 494  
533 +=== 9.2 Success ===
495 495  
496 -=== 9.2 Success = Clear Path Forward ===
535 + Clear Path Forward ===
497 497  
498 498  POC succeeds if we can confidently answer:
499 499  
500 500  ✅ **Technical Feasibility:**
540 +
501 501  * Can AI extract claims reliably?
502 502  * Can AI find balanced evidence?
503 503  * Can AI compute reasonable verdicts?
504 504  
505 505  ✅ **Quality Approach:**
546 +
506 506  * Do quality gates improve output?
507 507  * Can we measure and track quality?
508 508  * Is the gate approach scalable?
509 509  
510 510  ✅ **Production Path:**
552 +
511 511  * Is the core architecture sound?
512 512  * What needs improvement for production?
513 513  * Is POC2 the right next step?
514 514  
515 -
516 516  == 10. Related Pages ==
517 517  
518 518  * **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
519 519  * **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
520 520  * **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
521 -* **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
522 -* **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)
562 +* **[[Implementation Roadmap>>Archive.FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
563 +* **[[User Needs>>Archive.FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)
523 523  
524 -
525 525  **Document Owner:** Technical Team
526 526  **Review Frequency:** After each POC iteration
527 527  **Version History:**
568 +
528 528  * v1.0 - Initial POC requirements
529 529  * v2.0 - Updated after specification cross-check
530 530  * v3.0 - Aligned with Main Requirements (FR/NFR IDs added)
531 -