Changes for page POC Requirements

Last modified by Robert Schaub on 2025/12/23 11:35

From version 1.3
edited by Robert Schaub
on 2025/12/23 11:35
Change comment: Renamed back-links.
To version 1.1
edited by Robert Schaub
on 2025/12/23 11:20
Change comment: Imported from XAR

Details

Page properties
Content
... ... @@ -14,11 +14,9 @@
14 14  === 1.1 What POC Tests ===
15 15  
16 16  **Core Question:**
17 -
18 18  > Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?
19 19  
20 20  **What we're proving:**
21 -
22 22  * AI can identify factual claims from text
23 23  * AI can evaluate those claims with structured evidence
24 24  * Quality gates can filter unreliable outputs
... ... @@ -25,7 +25,6 @@
25 25  * The core workflow is technically feasible
26 26  
27 27  **What we're NOT proving:**
28 -
29 29  * Production-ready reliability (that's POC2)
30 30  * User-facing features (that's Beta 0)
31 31  * Full IFCN compliance (that's V1.0)
... ... @@ -35,23 +35,22 @@
35 35  POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].
36 36  
37 37  **Scope Summary:**
38 -
39 39  * **In Scope:** 8 requirements (7 FRs + 1 NFR)
40 40  * **Partial:** 3 NFRs (simplified versions)
41 41  * **Out of Scope:** 19 requirements (deferred to later phases)
42 42  
39 +
43 43  == 2. Requirements Scope Matrix ==
44 44  
45 45  {{success}}
46 -**Authoritative Source:** See [[Requirements Roadmap Matrix>>Test.FactHarbor V0\.9\.78.Specification.Requirements-Roadmap-Matrix.WebHome]] for complete phase-to-requirement mapping across all phases.
43 +**Authoritative Source:** See [[Requirements Roadmap Matrix>>Test.FactHarbor.Specification.Requirements-Roadmap-Matrix.WebHome]] for complete phase-to-requirement mapping across all phases.
47 47  {{/success}}
48 48  
49 49  **POC1 Scope Summary:**
50 50  
51 -POC1 implements the following requirements from the [[Main Requirements>>Test.FactHarbor V0\.9\.78.Specification.Requirements.WebHome]]:
48 +POC1 implements the following requirements from the [[Main Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]:
52 52  
53 53  **Full Implementation (8 requirements):**
54 -
55 55  * FR1: Claim Extraction
56 56  * FR2: Claim Context
57 57  * FR3: Multiple Scenarios
... ... @@ -62,13 +62,11 @@
62 62  * NFR11: AKEL Quality Assurance Framework (Basic - 2 quality gates: Gate 1 and Gate 4)
63 63  
64 64  **Partial Implementation (3 requirements):**
65 -
66 66  * NFR1: Explainability (Basic explanations only)
67 67  * NFR2: Performance (Functional but not optimized)
68 68  * NFR3: Transparency (Basic transparency)
69 69  
70 70  **Deferred to Later Phases:**
71 -
72 72  * All other requirements (see Roadmap Matrix for phase assignments)
73 73  
74 74  **Detailed POC1 specifications continue below...**
... ... @@ -81,7 +81,6 @@
81 81  **Main Requirement:** AI extracts factual claims from input text
82 82  
83 83  **POC Implementation:**
84 -
85 85  * ✅ AKEL extracts claims using LLM
86 86  * ✅ Each claim includes original text reference
87 87  * ✅ Claims are identified as factual/non-factual
... ... @@ -88,17 +88,16 @@
88 88  * ❌ No advanced claim parsing (added in POC2)
89 89  
90 90  **Acceptance Criteria:**
91 -
92 92  * Extracts 3-5 claims from typical article
93 93  * Identifies factual vs non-factual claims
94 94  * Quality Gate 1 validates extraction
95 95  
88 +
96 96  === 3.2 FR3: Multiple Scenarios (Full Implementation) ===
97 97  
98 98  **Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims
99 99  
100 100  **POC Implementation:**
101 -
102 102  * ✅ AKEL generates 2-3 scenarios per claim
103 103  * ✅ Scenarios capture different interpretations
104 104  * ✅ Each scenario is evaluated separately
... ... @@ -105,17 +105,16 @@
105 105  * ✅ Verdict considers all scenarios
106 106  
107 107  **Acceptance Criteria:**
108 -
109 109  * Generates 2+ scenarios for ambiguous claims
110 110  * Scenarios are meaningfully different
111 111  * All scenarios are evaluated
112 112  
104 +
113 113  === 3.3 FR4: Analysis Summary (Basic Implementation) ===
114 114  
115 115  **Main Requirement:** Provide user-friendly summary of analysis
116 116  
117 117  **POC Implementation:**
118 -
119 119  * ✅ Simple text summary generated
120 120  * ❌ No rich formatting (added in Beta 0)
121 121  * ❌ No visual elements (added in Beta 0)
... ... @@ -133,12 +133,10 @@
133 133  === 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===
134 134  
135 135  **Main Requirements:**
136 -
137 137  * FR5: Collect supporting and opposing evidence
138 138  * FR6: Evaluate evidence source reliability
139 139  
140 140  **POC Implementation:**
141 -
142 142  * ✅ AKEL searches for evidence (web/knowledge base)
143 143  * ✅ **Mandatory contradiction search** (finds opposing evidence)
144 144  * ✅ Source reliability scoring
... ... @@ -146,17 +146,16 @@
146 146  * ❌ No advanced source verification (added in POC2)
147 147  
148 148  **Acceptance Criteria:**
149 -
150 150  * Finds 2+ supporting evidence items
151 151  * Finds 1+ opposing evidence (if exists)
152 152  * Sources scored for reliability
153 153  
142 +
154 154  === 3.5 FR7: Automated Verdicts (Full Implementation) ===
155 155  
156 156  **Main Requirement:** AI computes verdicts with uncertainty quantification
157 157  
158 158  **POC Implementation:**
159 -
160 160  * ✅ Probabilistic verdicts (0-100% confidence)
161 161  * ✅ Uncertainty explicitly stated
162 162  * ✅ Reasoning chain provided
... ... @@ -171,11 +171,11 @@
171 171  ```
172 172  
173 173  **Acceptance Criteria:**
174 -
175 175  * Verdicts include probability (0-100%)
176 176  * Uncertainty explicitly quantified
177 177  * Reasoning chain explains verdict
178 178  
166 +
179 179  === 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===
180 180  
181 181  **Main Requirement:** Complete quality assurance with 7 quality gates
... ... @@ -183,13 +183,11 @@
183 183  **POC Implementation:** **2 gates only**
184 184  
185 185  **Quality Gate 1: Claim Validation**
186 -
187 187  * ✅ Validates claim is factual and verifiable
188 188  * ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
189 189  * ✅ Provides clear rejection reason
190 190  
191 191  **Quality Gate 4: Verdict Confidence Assessment**
192 -
193 193  * ✅ Validates ≥2 sources found
194 194  * ✅ Validates quality score ≥0.6
195 195  * ✅ Blocks low-confidence verdicts
... ... @@ -196,7 +196,6 @@
196 196  * ✅ Provides clear rejection reason
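
The two POC gates described above could be sketched roughly as follows. This is a minimal illustration, not the AKEL implementation: the names `ClaimClass`, `GateResult`, `gate1`, and `gate4` are hypothetical, and only the thresholds (at least 2 sources, quality score of at least 0.6) come from this page.

```typescript
// Illustrative sketch only: all type and function names are hypothetical.
type ClaimClass = "factual" | "opinion" | "prediction" | "ambiguous";

interface GateResult {
  gate: 1 | 4;
  passed: boolean;
  reason?: string; // present only when the gate blocks
}

// Gate 1: only factual, verifiable claims continue through the pipeline.
function gate1(classification: ClaimClass): GateResult {
  if (classification === "factual") {
    return { gate: 1, passed: true };
  }
  return { gate: 1, passed: false, reason: `Claim is non-factual (${classification})` };
}

// Thresholds taken from the Gate 4 bullets on this page.
const MIN_SOURCES = 2;
const MIN_QUALITY = 0.6;

// Gate 4: block verdicts that rest on too few sources or too low a quality score.
function gate4(sourceCount: number, qualityScore: number): GateResult {
  const problems: string[] = [];
  if (sourceCount < MIN_SOURCES) {
    problems.push(`only ${sourceCount} source(s), need ${MIN_SOURCES}`);
  }
  if (qualityScore < MIN_QUALITY) {
    problems.push(`quality ${qualityScore} below ${MIN_QUALITY}`);
  }
  return problems.length === 0
    ? { gate: 4, passed: true }
    : { gate: 4, passed: false, reason: problems.join("; ") };
}
```

Returning a `reason` string on every block keeps the "clear rejection reason" requirement visible in the return type rather than leaving it to logging.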
197 197  
198 198  **Out of Scope (POC2+):**
199 -
200 200  * ❌ Gate 2: Evidence Relevance
201 201  * ❌ Gate 3: Scenario Coherence
202 202  * ❌ Gate 5: Source Diversity
... ... @@ -209,13 +209,11 @@
209 209  === 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===
210 210  
211 211  **Main Requirements:**
212 -
213 213  * NFR1: Response time < 30 seconds
214 214  * NFR2: Handle 1000+ concurrent users
215 215  * NFR3: 99.9% uptime
216 216  
217 217  **POC Implementation:**
218 -
219 219  * ⚠️ **Response time monitored** (not optimized)
220 220  * ⚠️ **Single-threaded processing** (no concurrency)
221 221  * ⚠️ **Basic error handling** (no advanced retry logic)
... ... @@ -223,11 +223,11 @@
223 223  **Rationale:** POC proves functionality. Performance optimization happens in POC2.
224 224  
225 225  **POC Acceptance:**
226 -
227 227  * Analysis completes (no timeout requirement)
228 228  * Errors don't crash system
229 229  * Basic logging in place
230 230  
213 +
231 231  == 4. What's NOT in POC Scope ==
232 232  
233 233  === 4.1 User-Facing Features (Beta 0+) ===
... ... @@ -237,7 +237,6 @@
237 237  {{/warning}}
238 238  
239 239  **Out of Scope:**
240 -
241 241  * ❌ User accounts and authentication (FR8)
242 242  * ❌ User corrections system (FR9, FR45-46)
243 243  * ❌ Public publishing interface (FR10)
... ... @@ -251,7 +251,6 @@
251 251  === 4.2 Advanced Features (V1.0+) ===
252 252  
253 253  **Out of Scope:**
254 -
255 255  * ❌ IFCN compliance (FR47)
256 256  * ❌ ClaimReview schema (FR48)
257 257  * ❌ Archive.org integration (FR49)
... ... @@ -266,7 +266,6 @@
266 266  === 4.3 Production Requirements (POC2, Beta 0) ===
267 267  
268 268  **Out of Scope:**
269 -
270 270  * ❌ Security controls (NFR4, NFR12)
271 271  * ❌ Code maintainability (NFR5)
272 272  * ❌ System monitoring (NFR13)
... ... @@ -283,26 +283,21 @@
283 283  
284 284  For each analyzed claim, POC must produce:
285 285  
286 -* \\
287 -** \\
288 -**1. Claim
266 +**1. Claim**
289 289  * Original text
290 290  * Classification (factual/non-factual/ambiguous)
291 291  * If non-factual: Clear reason why
292 292  
293 293  **2. Scenarios** (if factual)
294 -
295 295  * 2-3 interpretation scenarios
296 296  * Each scenario clearly described
297 297  
298 298  **3. Evidence** (if factual)
299 -
300 300  * Supporting evidence (2+ items)
301 301  * Opposing evidence (if exists)
302 302  * Source URLs and reliability scores
303 303  
304 304  **4. Verdict** (if factual)
305 -
306 306  * Probability (0-100%)
307 307  * Uncertainty quantification
308 308  * Confidence level (LOW/MEDIUM/HIGH)
... ... @@ -309,10 +309,10 @@
309 309  * Reasoning chain
310 310  
311 311  **5. Quality Status**
312 -
313 313  * Which gates passed/failed
314 314  * If failed: Clear explanation why
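
One possible way to express the five required parts as a type, with a small completeness check. This is a sketch under assumptions: every field name, the numeric `uncertainty` and `reliability` representations, and the `missingParts` helper are hypothetical and may not match the example output in section 5.2. How an "ambiguous" classification should be handled is left open here because the page does not say.

```typescript
// Hypothetical field names; the page specifies the content, not the schema.
type Classification = "factual" | "non-factual" | "ambiguous";
type ConfidenceLevel = "LOW" | "MEDIUM" | "HIGH";

interface EvidenceItem {
  stance: "supporting" | "opposing";
  sourceUrl: string;
  reliability: number; // scale is an assumption
}

interface ClaimAnalysis {
  claim: { originalText: string; classification: Classification; nonFactualReason?: string };
  scenarios?: string[];
  evidence?: EvidenceItem[];
  verdict?: { probability: number; uncertainty: number; confidence: ConfidenceLevel; reasoning: string[] };
  quality: { gatesPassed: number[]; gatesFailed: number[]; failureReason?: string };
}

// Returns the names of missing parts; an empty array means nothing required is absent.
function missingParts(a: ClaimAnalysis): string[] {
  const missing: string[] = [];
  if (a.claim.classification === "non-factual" && !a.claim.nonFactualReason) {
    missing.push("claim.nonFactualReason");
  }
  // Scenarios, evidence, and verdict are only expected for factual claims that passed the gates.
  if (a.claim.classification === "factual" && a.quality.gatesFailed.length === 0) {
    if (!a.scenarios || a.scenarios.length === 0) missing.push("scenarios");
    const supporting = (a.evidence ?? []).filter((e) => e.stance === "supporting");
    if (supporting.length < 2) missing.push("evidence (2+ supporting items)");
    if (!a.verdict) {
      missing.push("verdict");
    } else if (a.verdict.probability < 0 || a.verdict.probability > 100) {
      missing.push("verdict.probability in 0-100");
    }
  }
  if (a.quality.gatesFailed.length > 0 && !a.quality.failureReason) {
    missing.push("quality.failureReason");
  }
  return missing;
}
```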
315 315  
290 +
316 316  === 5.2 Example POC Output ===
317 317  
318 318  {{code language="json"}}
... ... @@ -364,7 +364,6 @@
364 364  POC is successful if:
365 365  
366 366  ✅ **FR1-FR7 Requirements Met:**
367 -
368 368  1. Extracts 3-5 factual claims from test articles
369 369  2. Generates 2-3 scenarios per ambiguous claim
370 370  3. Finds supporting AND opposing evidence
... ... @@ -372,21 +372,19 @@
372 372  5. Provides clear reasoning chains
373 373  
374 374  ✅ **Quality Gates Work:**
375 -
376 376  1. Gate 1 blocks non-factual claims (100% block rate)
377 377  2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
378 378  3. Clear rejection reasons provided
379 379  
380 380  ✅ **NFR11 Met:**
381 -
382 382  1. Quality gates reduce hallucination rate
383 383  2. Blocked outputs have clear explanations
384 384  3. Quality metrics are logged
385 385  
358 +
386 386  === 6.2 Quality Thresholds ===
387 387  
388 388  **Minimum Acceptable:**
389 -
390 390  * ≥70% of test claims correctly classified (factual/non-factual)
391 391  * ≥60% of verdicts are reasonable (human evaluation)
392 392  * Gate 1 blocks 100% of non-factual claims
... ... @@ -393,17 +393,16 @@
393 393  * Gate 4 blocks verdicts with <2 sources
394 394  
395 395  **Target:**
396 -
397 397  * ≥80% claims correctly classified
398 398  * ≥75% verdicts are reasonable
399 399  * <10% false positives (blocking good claims)
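
The thresholds above could be checked mechanically once a test run has been scored. The sketch below is only an illustration: the `PocMetrics` shape, the `Outcome` labels, and the `assess` function are hypothetical, and rates are assumed to be fractions between 0 and 1. The percentages themselves are the ones listed in this section.

```typescript
// Hypothetical metric names; rates are fractions in the range 0..1.
interface PocMetrics {
  classificationAccuracy: number;   // share of test claims correctly classified
  reasonableVerdictRate: number;    // share of verdicts judged reasonable by human reviewers
  gate1BlockRate: number;           // share of non-factual claims blocked by Gate 1
  gate4BlocksUnderSourced: boolean; // true if every verdict with <2 sources was blocked
  falsePositiveRate: number;        // share of good claims wrongly blocked
}

type Outcome = "below-minimum" | "minimum" | "target";

function assess(m: PocMetrics): Outcome {
  // Minimum acceptable: >=70% classified, >=60% reasonable, Gate 1 blocks 100%, Gate 4 blocks <2 sources.
  const meetsMinimum =
    m.classificationAccuracy >= 0.7 &&
    m.reasonableVerdictRate >= 0.6 &&
    m.gate1BlockRate === 1 &&
    m.gate4BlocksUnderSourced;
  if (!meetsMinimum) return "below-minimum";

  // Target: >=80% classified, >=75% reasonable, <10% false positives.
  const meetsTarget =
    m.classificationAccuracy >= 0.8 &&
    m.reasonableVerdictRate >= 0.75 &&
    m.falsePositiveRate < 0.1;
  return meetsTarget ? "target" : "minimum";
}
```

For example, a run with 85% classification accuracy, 80% reasonable verdicts, a full Gate 1 block rate, and 5% false positives would be assessed as `"target"`.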
400 400  
372 +
401 401  === 6.3 POC Decision Gate ===
402 402  
403 403  **After POC1, we decide:**
404 404  
405 405  **✅ PROCEED to POC2** if:
406 -
407 407  * Success criteria met
408 408  * Quality gates demonstrably improve output
409 409  * Core workflow is technically sound
... ... @@ -410,72 +410,65 @@
410 410  * Clear path to production quality
411 411  
412 412  **⚠️ ITERATE POC1** if:
413 -
414 414  * Success criteria partially met
415 415  * Gates work but need tuning
416 416  * Core issues identified but fixable
417 417  
418 418  **❌ PIVOT APPROACH** if:
419 -
420 420  * Success criteria not met
421 421  * Fundamental AI limitations discovered
422 422  * Quality gates insufficient
423 423  * Alternative approach needed
424 424  
394 +
425 425  == 7. Test Cases ==
426 426  
427 427  === 7.1 Happy Path ===
428 428  
429 429  **Test 1: Simple Factual Claim**
430 -
431 431  * Input: "Paris is the capital of France"
432 -* Expected: Factual, 1 scenario, verdict 95% true
401 +* Expected: Factual, 1 scenario, verdict ~95% true
433 433  
434 434  **Test 2: Ambiguous Claim**
435 -
436 436  * Input: "Switzerland has the highest income in Europe"
437 437  * Expected: Factual, 2-3 scenarios, verdict with uncertainty
438 438  
439 439  **Test 3: Statistical Claim**
440 -
441 441  * Input: "10% of people have condition X"
442 442  * Expected: Factual, evidence with numbers, probabilistic verdict
443 443  
411 +
444 444  === 7.2 Edge Cases ===
445 445  
446 446  **Test 4: Opinion**
447 -
448 448  * Input: "Paris is the best city"
449 449  * Expected: Non-factual (opinion), blocked by Gate 1
450 450  
451 451  **Test 5: Prediction**
452 -
453 453  * Input: "Bitcoin will reach $100,000 next year"
454 454  * Expected: Non-factual (prediction), blocked by Gate 1
455 455  
456 456  **Test 6: Insufficient Evidence**
457 -
458 458  * Input: Obscure factual claim with no sources
459 459  * Expected: Blocked by Gate 4 (<2 sources)
460 460  
426 +
461 461  === 7.3 Quality Gate Tests ===
462 462  
463 463  **Test 7: Gate 1 Effectiveness**
464 -
465 465  * Input: Mix of 10 factual + 10 non-factual claims
466 466  * Expected: Gate 1 blocks all 10 non-factual claims (100% block rate)
467 467  
468 468  **Test 8: Gate 4 Effectiveness**
469 -
470 470  * Input: Claims with varying evidence availability
471 471  * Expected: Gate 4 blocks low-confidence verdicts
472 472  
437 +
473 473  == 8. Technical Architecture (POC) ==
474 474  
475 475  === 8.1 Simplified Architecture ===
476 476  
477 477  **POC Tech Stack:**
478 -
479 479  * **Frontend:** Simple web interface (Next.js + TypeScript)
480 480  * **Backend:** Single API endpoint
481 481  * **AI:** Claude API (Sonnet 4.5)
... ... @@ -488,7 +488,6 @@
488 488  === 8.2 AKEL Implementation ===
489 489  
490 490  **POC AKEL:**
491 -
492 492  * Single-threaded processing
493 493  * Synchronous API calls
494 494  * No caching
... ... @@ -496,7 +496,6 @@
496 496  * Console logging
497 497  
498 498  **Full AKEL (POC2+):**
499 -
500 500  * Multi-threaded processing
501 501  * Async API calls
502 502  * Evidence caching
... ... @@ -503,6 +503,7 @@
503 503  * Advanced error handling with retry
504 504  * Structured logging + monitoring
505 505  
468 +
506 506  == 9. POC Philosophy ==
507 507  
508 508  {{info}}
... ... @@ -511,55 +511,47 @@
511 511  
512 512  === 9.1 Core Principles ===
513 513  
514 -* \\
515 -** \\
516 -**1. Prove Concept, Not Production
477 +**1. Prove Concept, Not Production**
517 517  * POC validates AI can do the job
518 518  * Production quality comes in POC2 and Beta 0
519 519  * Focus on "does it work?" not "is it perfect?"
520 520  
521 521  **2. Implement Subset of Requirements**
522 -
523 523  * POC covers FR1-7, NFR11 (lite)
524 524  * All other requirements deferred
525 525  * Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]
526 526  
527 527  **3. Quality Gates Validate Approach**
528 -
529 529  * 2 gates prove the concept
530 530  * Remaining 5 gates added in POC2
531 531  * Gates must demonstrably improve quality
532 532  
533 533  **4. Iterate Based on Results**
534 -
535 535  * POC results determine next steps
536 536  * Decision gate after POC1
537 537  * Flexibility to pivot if needed
538 538  
539 -=== 9.2 Success ===
540 540  
541 - Clear Path Forward ===
498 +=== 9.2 Success = Clear Path Forward ===
542 542  
543 543  POC succeeds if we can confidently answer:
544 544  
545 545  ✅ **Technical Feasibility:**
546 -
547 547  * Can AI extract claims reliably?
548 548  * Can AI find balanced evidence?
549 549  * Can AI compute reasonable verdicts?
550 550  
551 551  ✅ **Quality Approach:**
552 -
553 553  * Do quality gates improve output?
554 554  * Can we measure and track quality?
555 555  * Is the gate approach scalable?
556 556  
557 557  ✅ **Production Path:**
558 -
559 559  * Is the core architecture sound?
560 560  * What needs improvement for production?
561 561  * Is POC2 the right next step?
562 562  
517 +
563 563  == 10. Related Pages ==
564 564  
565 565  * **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
... ... @@ -568,10 +568,11 @@
568 568  * **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
569 569  * **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)
570 570  
526 +
571 571  **Document Owner:** Technical Team
572 572  **Review Frequency:** After each POC iteration
573 573  **Version History:**
574 -
575 575  * v1.0 - Initial POC requirements
576 576  * v2.0 - Updated after specification cross-check
577 577  * v3.0 - Aligned with Main Requirements (FR/NFR IDs added)
533 +