Changes for page POC1: Core Workflow with Quality Gates
Last modified by Robert Schaub on 2025/12/22 13:50
Summary
-
Page properties (2 modified, 0 added, 0 removed)
Details
- Page properties
-
- Parent
-
@@ -1,1 +1,1 @@
-Test.FactHarbor pre10 V0\.9\.70.Roadmap.WebHome
+Test.FactHarbor.Roadmap.WebHome

- Content
-
@@ -12,12 +12,12 @@
 **Key Innovation:** Quality validation BEFORE publication, not after

 **What We're Proving:**
-
 * AKEL can reliably extract factual claims from articles
 * AKEL can generate credible verdicts with proper evidence
 * Quality gates prevent hallucinations and low-confidence outputs
 * Fully automated approach is viable

+
 == 2. Scope ==

 === In Scope ===
@@ -40,6 +40,7 @@
 * A/B testing
 * Gates 2 & 3 (Evidence relevance, Scenario coherence)

+
 == 3. Requirements ==

 === 3.1 NFR11: Quality Assurance Framework (POC1 Lite Version) ===
@@ -56,7 +56,6 @@
 **Purpose:** Ensure extracted claims are factual assertions, not opinions or predictions

 **Validation Checks:**
-
 1. **Factual Statement Test:** Can this be verified with evidence?
 2. **Opinion Detection:** Contains hedging language? ("I think", "probably", "best", "worst")
 3. **Specificity Score:** Contains concrete details? (names, numbers, dates, locations)
@@ -63,13 +63,14 @@
 4. **Future Prediction Test:** Makes claims about future events?

 **Pass Criteria:**
-{{code}}- isFactual: true
+{{code}}
+- isFactual: true
 - opinionScore: ≤ 0.3
 - specificityScore: ≥ 0.3
-- claimType: FACTUAL{{/code}}
+- claimType: FACTUAL
+{{/code}}

 **Action if Failed:**
-
 * Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
 * Do NOT generate scenarios or verdicts
 * Display explanation to user
@@ -82,7 +82,6 @@
 **Purpose:** Only publish verdicts with sufficient evidence and confidence

 **Validation Checks:**
-
 1. **Evidence Count:** Minimum 2 independent sources
 2. **Source Quality:** Average reliability ≥ 0.6 (on 0-1 scale)
 3. **Evidence Agreement:** % supporting vs. contradicting ≥ 0.6
@@ -89,7 +89,8 @@
 4. **Uncertainty Factors:** Count of hedging statements in reasoning

 **Confidence Tiers:**
-{{code}}HIGH (80-100%):
+{{code}}
+HIGH (80-100%):
  - ≥3 sources
  - ≥0.7 average quality
  - ≥80% agreement
@@ -103,10 +103,10 @@
  - ≥2 sources BUT low quality/agreement

 INSUFFICIENT:
- - <2 sources → DO NOT PUBLISH{{/code}}
+ - <2 sources → DO NOT PUBLISH
+{{/code}}

 **POC1 Publication Rule:**
-
 * Minimum **MEDIUM** confidence required
 * Blocked verdicts show "Insufficient Evidence" message

@@ -139,7 +139,6 @@
 {{/code}}

 **Updated Verdict States:**
-
 * PUBLISHED - Passed all gates
 * INSUFFICIENT_EVIDENCE - Failed Gate 4
 * NON_FACTUAL_CLAIM - Failed Gate 1
@@ -146,6 +146,7 @@
 * PROCESSING - In progress
 * ERROR - System failure

+
 === 3.3 Modified FR4: Analysis Summary (Enhanced) ===

 **Enhancement for POC1:**
@@ -176,7 +176,6 @@
 POC1 is considered **SUCCESSFUL** if:

 **✅ Functional:**
-
 * Processes diverse test articles without crashes
 * Generates verdicts for all factual claims
 * Blocks all non-factual claims (0% pass through)
@@ -183,7 +183,6 @@
 * Blocks all insufficient-evidence verdicts (0% with <2 sources)

 **✅ Quality:**
-
 * Hallucination rate <10% (manual verification)
 * 0 verdicts with <2 sources published
 * 0 opinion statements published as facts
@@ -190,18 +190,17 @@
 * Average quality score ≥7.0/10

 **✅ Performance:**
-
 * Processing time reasonable for POC demonstration
 * Quality gates execute efficiently
 * UI displays results clearly

 **✅ Learnings:**
-
 * Identified prompt engineering improvements
 * Documented AKEL strengths/weaknesses
 * Validated threshold values
 * Clear path to POC2 defined

+
 == 5. Decision Gates ==

 **POC1 → POC2 Decision:**
@@ -232,7 +232,6 @@
 {{/code}}

 **POC1 Acceptable Simplifications:**
-
 * Single AKEL call (not multi-component pipeline)
 * No scenarios (implicit in verdicts)
 * Basic evidence linking
@@ -239,16 +239,18 @@
 * 2 gates instead of 4
 * No review queue

-**See:** [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] for details
+**See:** [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] for details


 == Related Pages ==

-* [[Roadmap Overview>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.WebHome]] - All phases
-* [[POC2 Requirements>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.POC2.WebHome]] - Next phase
-* [[Requirements>>Test.FactHarbor pre10 V0\.9\.70.Specification.Requirements.WebHome]] - Full system requirements
-* [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] - System architecture
+* [[Roadmap Overview>>Test.FactHarbor.Roadmap.WebHome]] - All phases
+* [[POC2 Requirements>>Test.FactHarbor.Roadmap.POC2.WebHome]] - Next phase
+* [[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]] - Full system requirements
+* [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] - System architecture
 * [[NFR11 Full Specification>>Test.FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework

+
 **Document Status:** ✅ POC1 Specification Complete - Ready for Implementation
 **Version:** V0.9.70
+
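
**Illustrative sketch (not part of the page content):** The Gate 1 pass criteria and the Gate 4 validation checks quoted in the hunks above could be combined roughly as follows. This is a minimal sketch under stated assumptions, not the AKEL implementation. The class and function names (`ClaimCheck`, `EvidenceSummary`, `passes_gate1`, `passes_gate4`, `verdict_state`) are hypothetical. The MEDIUM tier definition is not visible in this diff, so the sketch uses the three visible Gate 4 validation thresholds as the publication bar; that choice is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class VerdictState(Enum):
    # Subset of the verdict states listed on the page.
    PUBLISHED = "PUBLISHED"
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"
    NON_FACTUAL_CLAIM = "NON_FACTUAL_CLAIM"


@dataclass
class ClaimCheck:
    """Hypothetical container for the Gate 1 inputs."""
    is_factual: bool
    opinion_score: float      # 0-1, lower means less opinionated
    specificity_score: float  # 0-1, higher means more concrete detail
    claim_type: str


@dataclass
class EvidenceSummary:
    """Hypothetical container for the Gate 4 inputs."""
    source_count: int          # independent sources
    avg_source_quality: float  # 0-1 scale
    agreement: float           # share of supporting evidence, 0-1


def passes_gate1(claim: ClaimCheck) -> bool:
    # Pass criteria as quoted: isFactual, opinionScore <= 0.3,
    # specificityScore >= 0.3, claimType FACTUAL.
    return (
        claim.is_factual
        and claim.opinion_score <= 0.3
        and claim.specificity_score >= 0.3
        and claim.claim_type == "FACTUAL"
    )


def passes_gate4(evidence: EvidenceSummary) -> bool:
    # Assumption: the visible validation thresholds stand in for the
    # "minimum MEDIUM confidence" rule, whose tier definition is elided.
    return (
        evidence.source_count >= 2
        and evidence.avg_source_quality >= 0.6
        and evidence.agreement >= 0.6
    )


def verdict_state(claim: ClaimCheck, evidence: EvidenceSummary) -> VerdictState:
    # Gate 1 runs first: non-factual claims never reach verdict generation.
    if not passes_gate1(claim):
        return VerdictState.NON_FACTUAL_CLAIM
    if not passes_gate4(evidence):
        return VerdictState.INSUFFICIENT_EVIDENCE
    return VerdictState.PUBLISHED
```

Evaluating Gate 1 before Gate 4 mirrors the "Do NOT generate scenarios or verdicts" action for failed claims: a claim that fails Gate 1 is reported as NON_FACTUAL_CLAIM regardless of how much evidence exists.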