Wiki source code of POC Requirements (POC1 & POC2)
Last modified by Robert Schaub on 2025/12/23 18:00
| author | version | line-number | content |
|---|---|---|---|
| |
1.1 | 1 | = POC Requirements = |
| 2 | |||
| 3 | **Status:** ✅ Approved for Development | ||
| |
2.1 | 4 | **Version:** 3.0 (Aligned with Main Requirements) |
| |
1.1 | 5 | **Goal:** Prove that AI can extract claims and determine verdicts without human intervention |
| 6 | |||
| |
2.1 | 7 | {{info}} |
| 8 | **Core Philosophy:** POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through a simplified implementation. All POC features map to formal FR/NFR requirements. | |
| 9 | {{/info}} | ||
| |
1.1 | 10 | |
| |
2.1 | 11 | |
| |
1.1 | 12 | == 1. POC Overview == |
| 13 | |||
| 14 | === 1.1 What POC Tests === | ||
| 15 | |||
| 16 | **Core Question:** | ||
| |
2.2 | 17 | |
| |
1.1 | 18 | > Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts? |
| 19 | |||
| 20 | **What we're proving:** | ||
| |
2.2 | 21 | |
| |
1.1 | 22 | * AI can identify factual claims from text |
| |
2.1 | 23 | * AI can evaluate those claims with structured evidence |
| 24 | * Quality gates can filter unreliable outputs | ||
| 25 | * The core workflow is technically feasible | ||
| |
1.1 | 26 | |
| |
2.1 | 27 | **What we're NOT proving:** |
| |
2.2 | 28 | |
| |
2.1 | 29 | * Production-ready reliability (that's POC2) |
| 30 | * User-facing features (that's Beta 0) | ||
| 31 | * Full IFCN compliance (that's V1.0) | ||
| |
1.1 | 32 | |
| |
2.1 | 33 | === 1.2 Requirements Mapping === |
| |
1.1 | 34 | |
| |
2.1 | 35 | POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]. |
| |
1.1 | 36 | |
| |
2.1 | 37 | **Scope Summary:** |
| |
2.2 | 38 | |
| |
2.1 | 39 | * **In Scope:** 8 requirements (7 FRs + 1 NFR) |
| 40 | * **Partial:** 3 NFRs (simplified versions) | ||
| 41 | * **Out of Scope:** 19 requirements (deferred to later phases) | ||
| |
1.1 | 42 | |
| |
2.1 | 43 | == 2. POC1 Scope == |
| |
1.1 | 44 | |
| |
2.1 | 45 | {{success}} |
| |
2.2 | 46 | **Authoritative Source for Phase Mapping:** [[Requirements Roadmap Matrix>>Test.FactHarbor V0\.9\.88 ex 2 new Org Pages.Roadmap.Requirements-Roadmap-Matrix.WebHome]] |
| |
1.1 | 47 | |
| |
2.1 | 48 | The Roadmap Matrix is the single source of truth for which requirements are implemented in which phases. This page provides POC1-specific implementation details only. |
| 49 | {{/success}} | ||
| |
1.1 | 50 | |
| |
2.1 | 51 | **POC1 implements these formal requirements:** |
| |
1.1 | 52 | |
| |
2.1 | 53 | |= Formal Req |= Implementation in POC1 |= Notes |
| 54 | | **FR4** | Analysis Summary | Basic format; quality metadata deferred to POC2 | ||
| 55 | | **FR7** | Automated Verdicts | Full implementation with quality gates (NFR11) | ||
| 56 | **NFR11** | Quality Assurance Framework | 2 of 7 quality gates implemented (Gates 1 and 4) | |
| |
1.1 | 57 | |
| |
2.1 | 58 | **POC1 also implements these workflow components** (detailed as FR1-FR6 in the implementation sections below; deferred items are marked): |
| |
1.1 | 59 | |
| |
2.2 | 60 | {{info}}**Note:** FR11 (Audit Trail) and FR13 (In-Article Claim Highlighting) are deferred to Beta 0 for production readiness and user experience enhancement.{{/info}} |
| 61 | |||
| |
2.1 | 62 | * Claim extraction (FR1) |
| 63 | * Claim context (FR2) | ||
| 64 | * Multiple scenarios (FR3) | ||
| 65 | * Evidence collection (FR5) | ||
| 66 | * Source quality assessment (FR6) | ||
| 67 | * Time evolution tracking (FR8) - deferred to POC2 | ||
| 68 | * Audit trail (FR11) - deferred to Beta 0 | ||
| 69 | * In-article highlighting (FR13) - deferred to Beta 0 | ||
| |
1.1 | 70 | |
| |
2.1 | 71 | **Partial implementations:** |
| |
2.2 | 72 | |
| |
2.1 | 73 | * NFR1 (Explainability) - Basic only |
| 74 | * NFR2 (Performance) - Functional but not optimized | ||
| 75 | * NFR3 (Transparency) - Basic only | ||
| |
1.1 | 76 | |
| |
2.1 | 77 | **Detailed POC1 implementation specifications continue below...** |
| |
1.1 | 78 | |
| 79 | |||
| 80 | |||
| |
2.1 | 81 | == 3. POC Simplifications == |
| |
1.1 | 82 | |
| |
2.1 | 83 | === 3.1 FR1: Claim Extraction (Full Implementation) === |
| |
1.1 | 84 | |
| |
2.1 | 85 | **Main Requirement:** AI extracts factual claims from input text |
| |
1.1 | 86 | |
| 87 | **POC Implementation:** | ||
| |
2.2 | 88 | |
| |
2.1 | 89 | * ✅ AKEL extracts claims using LLM |
| 90 | * ✅ Each claim includes original text reference | ||
| 91 | * ✅ Claims are identified as factual/non-factual | ||
| 92 | * ❌ No advanced claim parsing (added in POC2) | ||
| |
1.1 | 93 | |
| |
2.1 | 94 | **Acceptance Criteria:** |
| |
2.2 | 95 | |
| |
2.1 | 96 | * Extracts 3-5 claims from a typical article |
| 97 | * Identifies factual vs non-factual claims | ||
| 98 | * Quality Gate 1 validates extraction | ||
| |
1.1 | 99 | |
| |
2.1 | 100 | === 3.2 FR3: Multiple Scenarios (Full Implementation) === |
| |
1.1 | 101 | |
| |
2.1 | 102 | **Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims |
| |
1.1 | 103 | |
| 104 | **POC Implementation:** | ||
| |
2.2 | 105 | |
| |
2.1 | 106 | * ✅ AKEL generates 2-3 scenarios per claim |
| 107 | * ✅ Scenarios capture different interpretations | ||
| 108 | * ✅ Each scenario is evaluated separately | ||
| 109 | * ✅ Verdict considers all scenarios | ||
| |
1.1 | 110 | |
| |
2.1 | 111 | **Acceptance Criteria:** |
| |
2.2 | 112 | |
| |
2.1 | 113 | * Generates 2+ scenarios for ambiguous claims |
| 114 | * Scenarios are meaningfully different | ||
| 115 | * All scenarios are evaluated | ||
| |
1.1 | 116 | |
| |
2.1 | 117 | === 3.3 FR4: Analysis Summary (Basic Implementation) === |
| |
1.1 | 118 | |
| |
2.1 | 119 | **Main Requirement:** Provide user-friendly summary of analysis |
| |
1.1 | 120 | |
| 121 | **POC Implementation:** | ||
| |
2.2 | 122 | |
| |
2.1 | 123 | * ✅ Simple text summary generated |
| 124 | * ❌ No rich formatting (added in Beta 0) | ||
| 125 | * ❌ No visual elements (added in Beta 0) | ||
| 126 | * ❌ No interactive features (added in Beta 0) | ||
| |
1.1 | 127 | |
| |
2.1 | 128 | **POC Format:** |
| 129 | ``` | ||
| 130 | Claim: [extracted claim] | ||
| 131 | Scenarios: [list of scenarios] | ||
| 132 | Evidence: [supporting/opposing evidence] | ||
| 133 | Verdict: [probability with uncertainty] | ||
| 134 | ``` | ||
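The plain-text format above could be produced by a small rendering function. This is a minimal sketch, not the POC's actual code: the `ClaimSummary` shape and its field names are assumptions, and the evidence line is condensed to counts for brevity.

```typescript
// Hypothetical shape for one analyzed claim; field names are assumptions.
interface ClaimSummary {
  claim: string;
  scenarios: string[];
  supporting: string[]; // short descriptions of supporting evidence
  opposing: string[];   // short descriptions of opposing evidence
  probability: number;  // fraction in [0, 1]
  uncertainty: number;  // fraction, e.g. 0.15 for ±15%
}

// Render the four-line plain-text summary (Claim, Scenarios, Evidence, Verdict).
function renderSummary(s: ClaimSummary): string {
  const pct = (x: number): string => `${Math.round(x * 100)}%`;
  return [
    `Claim: ${s.claim}`,
    `Scenarios: ${s.scenarios.join("; ")}`,
    `Evidence: ${s.supporting.length} supporting, ${s.opposing.length} opposing`,
    `Verdict: ${pct(s.probability)} likely true (±${pct(s.uncertainty)})`,
  ].join("\n");
}
```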
| |
1.1 | 135 | |
| 136 | |||
| |
2.1 | 137 | === 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) === |
| |
1.1 | 138 | |
| |
2.1 | 139 | **Main Requirements:** |
| |
2.2 | 140 | |
| |
2.1 | 141 | * FR5: Collect supporting and opposing evidence |
| 142 | * FR6: Evaluate evidence source reliability | ||
| |
1.1 | 143 | |
| 144 | **POC Implementation:** | ||
| |
2.2 | 145 | |
| |
2.1 | 146 | * ✅ AKEL searches for evidence (web/knowledge base) |
| 147 | * ✅ **Mandatory contradiction search** (finds opposing evidence) | ||
| 148 | * ✅ Source reliability scoring | ||
| 149 | * ❌ No evidence deduplication (added in POC2) | ||
| 150 | * ❌ No advanced source verification (added in POC2) | ||
| |
1.1 | 151 | |
| 152 | **Acceptance Criteria:** | ||
| |
2.2 | 153 | |
| |
2.1 | 154 | * Finds 2+ supporting evidence items |
| 155 | * Finds 1+ opposing evidence items (if any exist) | |
| 156 | * Sources scored for reliability | ||
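The acceptance criteria above can be checked mechanically. The following is a sketch under assumptions: the `EvidenceItem` fields mirror the example output in section 5.2, reliability is assumed to be a score between 0 and 1, and `checkEvidence` is a hypothetical helper name. Opposing evidence is only required when it exists, so its absence is not treated as a failure here.

```typescript
// Field names follow the example output in section 5.2.
interface EvidenceItem {
  source: string;
  reliability: number; // assumed scale: 0 (unreliable) to 1 (highly reliable)
  excerpt: string;
}

interface Evidence {
  supporting: EvidenceItem[];
  opposing: EvidenceItem[];
}

// Returns a list of unmet acceptance criteria; an empty list means all checks passed.
function checkEvidence(e: Evidence): string[] {
  const problems: string[] = [];
  if (e.supporting.length < 2) {
    problems.push("fewer than 2 supporting evidence items");
  }
  for (const item of [...e.supporting, ...e.opposing]) {
    // The negated range test also rejects NaN.
    if (!(item.reliability >= 0 && item.reliability <= 1)) {
      problems.push(`source "${item.source}" has no valid reliability score`);
    }
  }
  return problems;
}
```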
| |
1.1 | 157 | |
| |
2.1 | 158 | === 3.5 FR7: Automated Verdicts (Full Implementation) === |
| |
1.1 | 159 | |
| |
2.1 | 160 | **Main Requirement:** AI computes verdicts with uncertainty quantification |
| |
1.1 | 161 | |
| |
2.1 | 162 | **POC Implementation:** |
| |
2.2 | 163 | |
| |
2.1 | 164 | * ✅ Probabilistic verdicts (0-100% probability) |
| 165 | * ✅ Uncertainty explicitly stated | ||
| 166 | * ✅ Reasoning chain provided | ||
| 167 | * ✅ Quality Gate 4 validates verdict confidence | ||
| |
1.1 | 168 | |
| |
2.1 | 169 | **POC Output:** |
| 170 | ``` | ||
| 171 | Verdict: 70% likely true | ||
| 172 | Uncertainty: ±15% (moderate confidence) | ||
| 173 | Reasoning: Based on 3 high-quality sources... | ||
| 174 | Confidence Level: MEDIUM | ||
| 175 | ``` | ||
| |
1.1 | 176 | |
| 177 | **Acceptance Criteria:** | ||
| |
2.2 | 178 | |
| |
2.1 | 179 | * Verdicts include probability (0-100%) |
| 180 | * Uncertainty explicitly quantified | ||
| 181 | * Reasoning chain explains verdict | ||
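A verdict can be checked against these three criteria before it is shown. This is an illustrative sketch only: it assumes probability and uncertainty are stored as fractions between 0 and 1 (as in the example output in section 5.2), and `checkVerdict` is a hypothetical name.

```typescript
// Minimal verdict shape; a subset of the fields used in section 5.2.
interface Verdict {
  probability: number; // fraction in [0, 1], shown to users as 0-100%
  uncertainty: number; // fraction in [0, 1], e.g. 0.15 for ±15%
  reasoning: string;
}

// Returns a list of unmet acceptance criteria; an empty list means the verdict is well-formed.
function checkVerdict(v: Verdict): string[] {
  const problems: string[] = [];
  if (!(v.probability >= 0 && v.probability <= 1)) {
    problems.push("probability must be between 0 and 1");
  }
  if (!(v.uncertainty >= 0 && v.uncertainty <= 1)) {
    problems.push("uncertainty must be between 0 and 1");
  }
  if (v.reasoning.trim().length === 0) {
    problems.push("reasoning chain is missing");
  }
  return problems;
}
```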
| |
1.1 | 182 | |
| |
2.1 | 183 | === 3.6 NFR11: Quality Assurance Framework (LITE VERSION) === |
| |
1.1 | 184 | |
| |
2.1 | 185 | **Main Requirement:** Complete quality assurance with 7 quality gates |
| |
1.1 | 186 | |
| |
2.1 | 187 | **POC Implementation:** **2 gates only** |
| |
1.1 | 188 | |
| |
2.1 | 189 | **Quality Gate 1: Claim Validation** |
| |
2.2 | 190 | |
| |
2.1 | 191 | * ✅ Validates claim is factual and verifiable |
| 192 | * ✅ Blocks non-factual claims (opinion/prediction/ambiguous) | ||
| 193 | * ✅ Provides clear rejection reason | ||
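Gate 1 can be sketched as a pure function over the claim classification. The sketch assumes the classification label is produced upstream (for example by the LLM); the `ClaimType` values and the `GateResult` shape are illustrative assumptions, not the POC's actual types.

```typescript
// Assumed classification labels, taken from the categories named above.
type ClaimType = "factual" | "opinion" | "prediction" | "ambiguous";

interface GateResult {
  status: "PASS" | "FAIL";
  reason?: string; // present only when the gate blocks the claim
}

// Gate 1: let factual claims through, block everything else with a reason.
function gate1(claimType: ClaimType): GateResult {
  if (claimType === "factual") {
    return { status: "PASS" };
  }
  return {
    status: "FAIL",
    reason: `Claim classified as ${claimType}; only factual, verifiable claims are evaluated`,
  };
}
```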
| |
1.1 | 194 | |
| |
2.1 | 195 | **Quality Gate 4: Verdict Confidence Assessment** |
| |
2.2 | 196 | |
| |
2.1 | 197 | * ✅ Validates ≥2 sources found |
| 198 | * ✅ Validates quality score ≥0.6 | ||
| 199 | * ✅ Blocks low-confidence verdicts | ||
| 200 | * ✅ Provides clear rejection reason | ||
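The two Gate 4 thresholds translate directly into code. One assumption is labeled loudly here: the document does not define how the quality score is computed, so this sketch uses the mean source reliability as a stand-in. `Source`, `GateResult`, and `gate4` are hypothetical names.

```typescript
interface Source {
  reliability: number; // assumed scale: 0 to 1
}

interface GateResult {
  status: "PASS" | "FAIL";
  reason?: string; // present only when the gate blocks the verdict
}

const MIN_SOURCES = 2;   // from the spec: at least 2 sources
const MIN_QUALITY = 0.6; // from the spec: quality score of at least 0.6

// Gate 4: block verdicts that rest on too few or too unreliable sources.
// ASSUMPTION: quality score = mean reliability of all sources.
function gate4(sources: Source[]): GateResult {
  if (sources.length < MIN_SOURCES) {
    return {
      status: "FAIL",
      reason: `Only ${sources.length} source(s) found; at least ${MIN_SOURCES} required`,
    };
  }
  const total = sources.reduce((sum, s) => sum + s.reliability, 0);
  const quality = total / sources.length;
  if (quality < MIN_QUALITY) {
    return {
      status: "FAIL",
      reason: `Quality score ${quality.toFixed(2)} is below ${MIN_QUALITY}`,
    };
  }
  return { status: "PASS" };
}
```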
| |
1.1 | 201 | |
| |
2.1 | 202 | **Out of Scope (POC2+):** |
| |
2.2 | 203 | |
| |
2.1 | 204 | * ❌ Gate 2: Evidence Relevance |
| 205 | * ❌ Gate 3: Scenario Coherence | ||
| 206 | * ❌ Gate 5: Source Diversity | ||
| 207 | * ❌ Gate 6: Reasoning Validity | ||
| 208 | * ❌ Gate 7: Output Completeness | ||
| |
1.1 | 209 | |
| |
2.1 | 210 | **Rationale:** Prove gate concept works. Add remaining gates in POC2 after validating approach. |
| |
1.1 | 211 | |
| 212 | |||
| |
2.1 | 213 | === 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) === |
| |
1.1 | 214 | |
| |
2.1 | 215 | **Main Requirements:** |
| |
2.2 | 216 | |
| |
2.1 | 217 | * NFR1: Response time < 30 seconds |
| 218 | * NFR2: Handle 1000+ concurrent users | ||
| 219 | * NFR3: 99.9% uptime | ||
| |
1.1 | 220 | |
| |
2.1 | 221 | **POC Implementation:** |
| |
2.2 | 222 | |
| |
2.1 | 223 | * ⚠️ **Response time monitored** (not optimized) |
| 224 | * ⚠️ **Single-threaded processing** (no concurrency) | ||
| 225 | * ⚠️ **Basic error handling** (no advanced retry logic) | ||
| |
1.1 | 226 | |
| |
2.1 | 227 | **Rationale:** POC proves functionality. Performance optimization happens in POC2. |
| |
1.1 | 228 | |
| |
2.1 | 229 | **POC Acceptance:** |
| |
2.2 | 230 | |
| |
2.1 | 231 | * Analysis completes (no timeout requirement) |
| 232 | * Errors don't crash system | ||
| 233 | * Basic logging in place | ||
| |
1.1 | 234 | |
| |
2.1 | 235 | == 4. What's NOT in POC Scope == |
| |
1.1 | 236 | |
| |
2.1 | 237 | === 4.1 User-Facing Features (Beta 0+) === |
| |
1.1 | 238 | |
| |
2.1 | 239 | {{warning}} |
| 240 | **Deferred to Beta 0:** | ||
| 241 | {{/warning}} | ||
| |
1.1 | 242 | |
| |
2.1 | 243 | **Out of Scope:** |
| |
2.2 | 244 | |
| |
2.1 | 245 | * ❌ User accounts and authentication (FR8) |
| 246 | * ❌ User corrections system (FR9, FR45-46) | ||
| 247 | * ❌ Public publishing interface (FR10) | ||
| 248 | * ❌ Social sharing (FR11) | ||
| 249 | * ❌ Email notifications (FR12) | ||
| 250 | * ❌ API access (FR13) | ||
| |
1.1 | 251 | |
| |
2.1 | 252 | **Rationale:** POC validates AI capabilities. User features added in Beta 0. |
| |
1.1 | 253 | |
| 254 | |||
| |
2.1 | 255 | === 4.2 Advanced Features (V1.0+) === |
| |
1.1 | 256 | |
| |
2.1 | 257 | **Out of Scope:** |
| |
2.2 | 258 | |
| |
2.1 | 259 | * ❌ IFCN compliance (FR47) |
| 260 | * ❌ ClaimReview schema (FR48) | ||
| 261 | * ❌ Archive.org integration (FR49) | ||
| 262 | * ❌ OSINT toolkit (FR50) | ||
| 263 | * ❌ Video verification (FR51) | ||
| 264 | * ❌ Deepfake detection (FR52) | ||
| 265 | * ❌ Cross-org sharing (FR53) | ||
| |
1.1 | 266 | |
| |
2.1 | 267 | **Rationale:** Advanced features require a proven platform. Added in V1.0 and later. |
| |
1.1 | 268 | |
| 269 | |||
| |
2.1 | 270 | === 4.3 Production Requirements (POC2, Beta 0) === |
| |
1.1 | 271 | |
| |
2.1 | 272 | **Out of Scope:** |
| |
2.2 | 273 | |
| |
2.1 | 274 | * ❌ Security controls (NFR4, NFR12) |
| 275 | * ❌ Code maintainability (NFR5) | ||
| 276 | * ❌ System monitoring (NFR13) | ||
| 277 | * ❌ Evidence deduplication | ||
| 278 | * ❌ Advanced source verification | ||
| 279 | * ❌ Full 7-gate quality framework | ||
| |
1.1 | 280 | |
| |
2.1 | 281 | **Rationale:** POC proves concept. Production hardening happens in POC2 and Beta 0. |
| |
1.1 | 282 | |
| 283 | |||
| |
2.1 | 284 | == 5. POC Output Specification == |
| |
1.1 | 285 | |
| |
2.1 | 286 | === 5.1 Required Output Elements === |
| |
1.1 | 287 | |
| |
2.1 | 288 | For each analyzed claim, POC must produce: |
| |
1.1 | 289 | |
| |
2.2 | 290 | **1. Claim** |
| |
2.1 | 293 | * Original text |
| 294 | * Classification (factual/non-factual/ambiguous) | ||
| 295 | * If non-factual: Clear reason why | ||
| |
1.1 | 296 | |
| |
2.1 | 297 | **2. Scenarios** (if factual) |
| |
2.2 | 298 | |
| |
2.1 | 299 | * 2-3 interpretation scenarios |
| 300 | * Each scenario clearly described | ||
| |
1.1 | 301 | |
| |
2.1 | 302 | **3. Evidence** (if factual) |
| |
2.2 | 303 | |
| |
2.1 | 304 | * Supporting evidence (2+ items) |
| 305 | * Opposing evidence (if any exists) | |
| 306 | * Source URLs and reliability scores | ||
| |
1.1 | 307 | |
| |
2.1 | 308 | **4. Verdict** (if factual) |
| |
2.2 | 309 | |
| |
2.1 | 310 | * Probability (0-100%) |
| 311 | * Uncertainty quantification | ||
| 312 | * Confidence level (LOW/MEDIUM/HIGH) | ||
| 313 | * Reasoning chain | ||
| |
1.1 | 314 | |
| |
2.1 | 315 | **5. Quality Status** |
| |
2.2 | 316 | |
| |
2.1 | 317 | * Which gates passed/failed |
| 318 | * If failed: Clear explanation why | ||
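The five output elements could be captured in a typed structure. This is a sketch, with field names borrowed from the example output in section 5.2; making scenarios, evidence, and verdict optional (because they only apply to factual claims) is an assumption, as is the `failedGates` helper.

```typescript
type GateStatus = "PASS" | "FAIL";

interface EvidenceItem {
  source: string;
  reliability: number;
  excerpt: string;
}

// Scenarios, evidence, and verdict are only present for factual claims.
interface PocOutput {
  claim: { text: string; type: string; gate1_status: GateStatus };
  scenarios?: string[];
  evidence?: { supporting: EvidenceItem[]; opposing: EvidenceItem[] };
  verdict?: {
    probability: number;
    uncertainty: number;
    confidence: "LOW" | "MEDIUM" | "HIGH";
    reasoning: string;
    gate4_status: GateStatus;
  };
}

// Quality status: list the gates that failed for this output.
function failedGates(o: PocOutput): string[] {
  const failed: string[] = [];
  if (o.claim.gate1_status === "FAIL") failed.push("gate1");
  if (o.verdict !== undefined && o.verdict.gate4_status === "FAIL") failed.push("gate4");
  return failed;
}
```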
| |
1.1 | 319 | |
| |
2.1 | 320 | === 5.2 Example POC Output === |
| |
1.1 | 321 | |
| |
2.1 | 322 | {{code language="json"}} |
| 323 | { | ||
| 324 | "claim": { | ||
| 325 | "text": "Switzerland has the highest life expectancy in Europe", | ||
| 326 | "type": "factual", | ||
| 327 | "gate1_status": "PASS" | ||
| 328 | }, | ||
| 329 | "scenarios": [ | ||
| 330 | "Switzerland's overall life expectancy is highest", | ||
| 331 | "Switzerland ranks highest for specific age groups" | ||
| 332 | ], | ||
| 333 | "evidence": { | ||
| 334 | "supporting": [ | ||
| 335 | { | ||
| 336 | "source": "WHO Report 2023", | ||
| 337 | "reliability": 0.95, | ||
| 338 | "excerpt": "Switzerland: 83.4 years average..." | ||
| 339 | } | ||
| 340 | ], | ||
| 341 | "opposing": [ | ||
| 342 | { | ||
| 343 | "source": "Eurostat 2024", | ||
| 344 | "reliability": 0.90, | ||
| 345 | "excerpt": "Spain leads at 83.5 years..." | ||
| 346 | } | ||
| 347 | ] | ||
| 348 | }, | ||
| 349 | "verdict": { | ||
| 350 | "probability": 0.65, | ||
| 351 | "uncertainty": 0.15, | ||
| 352 | "confidence": "MEDIUM", | ||
| 353 | "reasoning": "WHO and Eurostat show similar but conflicting data...", | ||
| 354 | "gate4_status": "PASS" | ||
| 355 | } | ||
| 356 | } | ||
| |
1.1 | 357 | {{/code}} |
| 358 | |||
| 359 | |||
| |
2.1 | 360 | == 6. Success Criteria == |
| |
1.1 | 361 | |
| |
2.1 | 362 | {{success}} |
| 363 | **POC Success Definition:** POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts with quality gates improving output quality. | ||
| 364 | {{/success}} | ||
| |
1.1 | 365 | |
| |
2.1 | 366 | === 6.1 Functional Success === |
| |
1.1 | 367 | |
| |
2.1 | 368 | POC is successful if: |
| |
1.1 | 369 | |
| |
2.1 | 370 | ✅ **FR1-FR7 Requirements Met:** |
| |
2.2 | 371 | |
| |
2.1 | 372 | 1. Extracts 3-5 factual claims from test articles |
| 373 | 2. Generates 2-3 scenarios per ambiguous claim | ||
| 374 | 3. Finds supporting AND opposing evidence | ||
| 375 | 4. Computes probabilistic verdicts with uncertainty | ||
| 376 | 5. Provides clear reasoning chains | ||
| |
1.1 | 377 | |
| |
2.1 | 378 | ✅ **Quality Gates Work:** |
| |
2.2 | 379 | |
| |
2.1 | 380 | 1. Gate 1 blocks non-factual claims (100% block rate) |
| 381 | 2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6) | ||
| 382 | 3. Clear rejection reasons provided | ||
| |
1.1 | 383 | |
| |
2.1 | 384 | ✅ **NFR11 Met:** |
| |
2.2 | 385 | |
| |
2.1 | 386 | 1. Quality gates reduce hallucination rate |
| 387 | 2. Blocked outputs have clear explanations | ||
| 388 | 3. Quality metrics are logged | ||
| |
1.1 | 389 | |
| |
2.1 | 390 | === 6.2 Quality Thresholds === |
| |
1.1 | 391 | |
| |
2.1 | 392 | **Minimum Acceptable:** |
| |
2.2 | 393 | |
| |
2.1 | 394 | * ≥70% of test claims correctly classified (factual/non-factual) |
| 395 | * ≥60% of verdicts are reasonable (human evaluation) | ||
| 396 | * Gate 1 blocks 100% of non-factual claims | ||
| 397 | * Gate 4 blocks verdicts with <2 sources | ||
| |
1.1 | 398 | |
| |
2.1 | 399 | **Target:** |
| |
2.2 | 400 | |
| |
2.1 | 401 | * ≥80% claims correctly classified |
| 402 | * ≥75% verdicts are reasonable | ||
| 403 | * <10% false positives (blocking good claims) | ||
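The classification and verdict thresholds above can be evaluated from simple counts. This sketch covers only those two criteria (the gate-specific criteria are left out), and the `EvalCounts` shape and `assess` function are hypothetical names.

```typescript
// Counts gathered from a human-reviewed test run; names are assumptions.
interface EvalCounts {
  claimsTotal: number;
  claimsCorrect: number;      // correctly classified as factual/non-factual
  verdictsTotal: number;
  verdictsReasonable: number; // judged reasonable by a human evaluator
}

type Level = "TARGET" | "MINIMUM" | "BELOW_MINIMUM";

// Guard against division by zero when a test run is empty.
function rate(part: number, total: number): number {
  return total === 0 ? 0 : part / total;
}

// Compare the measured rates with the minimum (70% / 60%) and target (80% / 75%) thresholds.
function assess(c: EvalCounts): Level {
  const classification = rate(c.claimsCorrect, c.claimsTotal);
  const verdicts = rate(c.verdictsReasonable, c.verdictsTotal);
  if (classification >= 0.8 && verdicts >= 0.75) return "TARGET";
  if (classification >= 0.7 && verdicts >= 0.6) return "MINIMUM";
  return "BELOW_MINIMUM";
}
```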
| |
1.1 | 404 | |
| |
2.1 | 405 | === 6.3 POC Decision Gate === |
| |
1.1 | 406 | |
| |
2.1 | 407 | **After POC1, we decide:** |
| |
1.1 | 408 | |
| |
2.1 | 409 | **✅ PROCEED to POC2** if: |
| |
2.2 | 410 | |
| |
2.1 | 411 | * Success criteria met |
| 412 | * Quality gates demonstrably improve output | ||
| 413 | * Core workflow is technically sound | ||
| 414 | * Clear path to production quality | ||
| |
1.1 | 415 | |
| |
2.1 | 416 | **⚠️ ITERATE POC1** if: |
| |
2.2 | 417 | |
| |
2.1 | 418 | * Success criteria partially met |
| 419 | * Gates work but need tuning | ||
| 420 | * Core issues identified but fixable | ||
| |
1.1 | 421 | |
| |
2.1 | 422 | **❌ PIVOT APPROACH** if: |
| |
2.2 | 423 | |
| |
2.1 | 424 | * Success criteria not met |
| 425 | * Fundamental AI limitations discovered | ||
| 426 | * Quality gates insufficient | ||
| 427 | * Alternative approach needed | ||
| |
1.1 | 428 | |
| |
2.1 | 429 | == 7. Test Cases == |
| |
1.1 | 430 | |
| |
2.1 | 431 | === 7.1 Happy Path === |
| |
1.1 | 432 | |
| |
2.1 | 433 | **Test 1: Simple Factual Claim** |
| |
2.2 | 434 | |
| |
2.1 | 435 | * Input: "Paris is the capital of France" |
| |
2.2 | 436 | * Expected: Factual, 1 scenario, verdict 95% true |
| |
1.1 | 437 | |
| |
2.1 | 438 | **Test 2: Ambiguous Claim** |
| |
2.2 | 439 | |
| |
2.1 | 440 | * Input: "Switzerland has the highest income in Europe" |
| 441 | * Expected: Factual, 2-3 scenarios, verdict with uncertainty | ||
| |
1.1 | 442 | |
| |
2.1 | 443 | **Test 3: Statistical Claim** |
| |
2.2 | 444 | |
| |
2.1 | 445 | * Input: "10% of people have condition X" |
| 446 | * Expected: Factual, evidence with numbers, probabilistic verdict | ||
| |
1.1 | 447 | |
| |
2.1 | 448 | === 7.2 Edge Cases === |
| |
1.1 | 449 | |
| |
2.1 | 450 | **Test 4: Opinion** |
| |
2.2 | 451 | |
| |
2.1 | 452 | * Input: "Paris is the best city" |
| 453 | * Expected: Non-factual (opinion), blocked by Gate 1 | ||
| |
1.1 | 454 | |
| |
2.1 | 455 | **Test 5: Prediction** |
| |
2.2 | 456 | |
| |
2.1 | 457 | * Input: "Bitcoin will reach $100,000 next year" |
| 458 | * Expected: Non-factual (prediction), blocked by Gate 1 | ||
| |
1.1 | 459 | |
| |
2.1 | 460 | **Test 6: Insufficient Evidence** |
| |
2.2 | 461 | |
| |
2.1 | 462 | * Input: Obscure factual claim with no sources |
| 463 | * Expected: Blocked by Gate 4 (<2 sources) | ||
| |
1.1 | 464 | |
| |
2.1 | 465 | === 7.3 Quality Gate Tests === |
| |
1.1 | 466 | |
| |
2.1 | 467 | **Test 7: Gate 1 Effectiveness** |
| |
2.2 | 468 | |
| |
2.1 | 469 | * Input: Mix of 10 factual + 10 non-factual claims |
| 470 | * Expected: Gate 1 blocks all 10 non-factual claims (100% block rate) | |
| |
1.1 | 471 | |
| |
2.1 | 472 | **Test 8: Gate 4 Effectiveness** |
| |
2.2 | 473 | |
| |
2.1 | 474 | * Input: Claims with varying evidence availability |
| 475 | * Expected: Gate 4 blocks low-confidence verdicts | ||
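For a labeled mix such as the one in Test 7, two numbers summarize gate behavior: the share of non-factual claims that were blocked, and the share of factual claims that were blocked by mistake. The sketch below assumes each test result records a human label and the gate decision; `LabeledResult` and `gateMetrics` are hypothetical names.

```typescript
// One test claim: the human label plus what the gate did with it.
interface LabeledResult {
  isFactual: boolean; // ground-truth label from the test set
  blocked: boolean;   // true if the gate rejected the claim
}

interface GateMetrics {
  blockRate: number;         // blocked non-factual / all non-factual
  falsePositiveRate: number; // blocked factual / all factual
}

function gateMetrics(results: LabeledResult[]): GateMetrics {
  const ratio = (n: number, d: number): number => (d === 0 ? 0 : n / d);
  const nonFactual = results.filter((r) => !r.isFactual);
  const factual = results.filter((r) => r.isFactual);
  return {
    blockRate: ratio(nonFactual.filter((r) => r.blocked).length, nonFactual.length),
    falsePositiveRate: ratio(factual.filter((r) => r.blocked).length, factual.length),
  };
}
```

A run meets the Gate 1 expectation when `blockRate` is 1, and stays within the target in section 6.2 when `falsePositiveRate` is below 0.1.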
| |
1.1 | 476 | |
| |
2.1 | 477 | == 8. Technical Architecture (POC) == |
| |
1.1 | 478 | |
| |
2.1 | 479 | === 8.1 Simplified Architecture === |
| |
1.1 | 480 | |
| |
2.1 | 481 | **POC Tech Stack:** |
| |
2.2 | 482 | |
| |
2.1 | 483 | * **Frontend:** Simple web interface (Next.js + TypeScript) |
| 484 | * **Backend:** Single API endpoint | ||
| 485 | * **AI:** Claude API (Sonnet 4.5) | ||
| 486 | * **Storage:** Local JSON files (no database) | |
| 487 | * **Deployment:** Single server | ||
| |
1.1 | 488 | |
| |
2.1 | 489 | **Architecture Diagram:** See [[POC1 Specification>>FactHarbor.Specification.POC.Specification]] |
| |
1.1 | 490 | |
| 491 | |||
| |
2.1 | 492 | === 8.2 AKEL Implementation === |
| |
1.1 | 493 | |
| |
2.1 | 494 | **POC AKEL:** |
| |
2.2 | 495 | |
| |
2.1 | 496 | * Single-threaded processing |
| 497 | * Synchronous API calls | ||
| 498 | * No caching | ||
| 499 | * Basic error handling | ||
| 500 | * Console logging | ||
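A sequential pipeline with basic error handling might look like the following. This is a simplified sketch, not the AKEL implementation: each step is modeled as a plain synchronous string-to-string function standing in for a real stage (extraction, evidence search, verdict), and `runPipeline` and `PipelineResult` are hypothetical names.

```typescript
interface PipelineResult {
  ok: boolean;
  output?: string; // final result when every step succeeded
  error?: string;  // message of the first failure otherwise
}

// Runs the steps one after another, with no concurrency, caching, or retries.
// A failing step is caught and reported instead of crashing the caller.
function runPipeline(
  article: string,
  steps: Array<(input: string) => string>,
  log: (message: string) => void = () => {},
): PipelineResult {
  let current = article;
  try {
    steps.forEach((step, i) => {
      log(`step ${i + 1} of ${steps.length} started`);
      current = step(current);
    });
    return { ok: true, output: current };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    log(`analysis failed: ${message}`);
    return { ok: false, error: message };
  }
}
```

Passing `console.log` as the third argument gives the console logging listed above; the default is a no-op logger so the function stays free of side effects in tests.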
| |
1.1 | 501 | |
| |
2.1 | 502 | **Full AKEL (POC2+):** |
| |
2.2 | 503 | |
| |
2.1 | 504 | * Multi-threaded processing |
| 505 | * Async API calls | ||
| 506 | * Evidence caching | ||
| 507 | * Advanced error handling with retry | ||
| 508 | * Structured logging + monitoring | ||
| |
1.1 | 509 | |
| |
2.1 | 510 | == 9. POC Philosophy == |
| |
1.1 | 511 | |
| |
2.1 | 512 | {{info}} |
| 513 | **Important:** POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases. | ||
| 514 | {{/info}} | ||
| |
1.1 | 515 | |
| |
2.1 | 516 | === 9.1 Core Principles === |
| |
1.1 | 517 | |
| |
2.2 | 518 | **1. Prove Concept, Not Production** |
| |
2.1 | 521 | * POC validates AI can do the job |
| 522 | * Production quality comes in POC2 and Beta 0 | ||
| 523 | * Focus on "does it work?" not "is it perfect?" | ||
| |
1.1 | 524 | |
| |
2.1 | 525 | **2. Implement Subset of Requirements** |
| |
2.2 | 526 | |
| |
2.1 | 527 | * POC covers FR1-7, NFR11 (lite) |
| 528 | * All other requirements deferred | ||
| 529 | * Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] | ||
| |
1.1 | 530 | |
| |
2.1 | 531 | **3. Quality Gates Validate Approach** |
| |
2.2 | 532 | |
| |
2.1 | 533 | * 2 gates prove the concept |
| 534 | * Remaining 5 gates added in POC2 | ||
| 535 | * Gates must demonstrably improve quality | ||
| |
1.1 | 536 | |
| |
2.1 | 537 | **4. Iterate Based on Results** |
| |
2.2 | 538 | |
| |
2.1 | 539 | * POC results determine next steps |
| 540 | * Decision gate after POC1 | ||
| 541 | * Flexibility to pivot if needed | ||
| |
1.1 | 542 | |
| |
2.2 | 543 | === 9.2 Success: Clear Path Forward === |
| |
1.1 | 546 | |
| |
2.1 | 547 | POC succeeds if we can confidently answer: |
| |
1.1 | 548 | |
| |
2.1 | 549 | ✅ **Technical Feasibility:** |
| |
2.2 | 550 | |
| |
2.1 | 551 | * Can AI extract claims reliably? |
| 552 | * Can AI find balanced evidence? | ||
| 553 | * Can AI compute reasonable verdicts? | ||
| |
1.1 | 554 | |
| |
2.1 | 555 | ✅ **Quality Approach:** |
| |
2.2 | 556 | |
| |
2.1 | 557 | * Do quality gates improve output? |
| 558 | * Can we measure and track quality? | ||
| 559 | * Is the gate approach scalable? | ||
| |
1.1 | 560 | |
| |
2.1 | 561 | ✅ **Production Path:** |
| |
2.2 | 562 | |
| |
2.1 | 563 | * Is the core architecture sound? |
| 564 | * What needs improvement for production? | ||
| 565 | * Is POC2 the right next step? | ||
| |
1.1 | 566 | |
| |
2.1 | 567 | == 10. Related Pages == |
| |
1.1 | 568 | |
| |
2.1 | 569 | * **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset) |
| 570 | * **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs | ||
| 571 | * **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview | ||
| 572 | * **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases | ||
| 573 | * **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements) | ||
| |
1.1 | 574 | |
| |
2.1 | 575 | **Document Owner:** Technical Team |
| 576 | **Review Frequency:** After each POC iteration | ||
| 577 | **Version History:** | ||
| |
2.2 | 578 | |
| |
2.1 | 579 | * v1.0 - Initial POC requirements |
| 580 | * v2.0 - Updated after specification cross-check | ||
| 581 | * v3.0 - Aligned with Main Requirements (FR/NFR IDs added) |