Changes for page POC1: Core Workflow with Quality Gates
Last modified by Robert Schaub on 2025/12/22 13:50
Summary
Page properties (2 modified, 0 added, 0 removed)
Details
Page properties
Parent
@@ -1,1 +1,1 @@
-Test.FactHarbor.Roadmap.WebHome
+Test.FactHarbor pre10 V0\.9\.70.Roadmap.WebHome
Content
@@ -12,12 +12,12 @@
 **Key Innovation:** Quality validation BEFORE publication, not after
 
 **What We're Proving:**
+
 * AKEL can reliably extract factual claims from articles
 * AKEL can generate credible verdicts with proper evidence
 * Quality gates prevent hallucinations and low-confidence outputs
 * Fully automated approach is viable
 
-
 == 2. Scope ==
 
 === In Scope ===
@@ -40,7 +40,6 @@
 * A/B testing
 * Gates 2 & 3 (Evidence relevance, Scenario coherence)
 
-
 == 3. Requirements ==
 
 === 3.1 NFR11: Quality Assurance Framework (POC1 Lite Version) ===
@@ -57,6 +57,7 @@
 **Purpose:** Ensure extracted claims are factual assertions, not opinions or predictions
 
 **Validation Checks:**
+
 1. **Factual Statement Test:** Can this be verified with evidence?
 2. **Opinion Detection:** Contains hedging language? ("I think", "probably", "best", "worst")
 3. **Specificity Score:** Contains concrete details? (names, numbers, dates, locations)
@@ -63,14 +63,13 @@
 4. **Future Prediction Test:** Makes claims about future events?
 
 **Pass Criteria:**
-{{code}}
-- isFactual: true
+{{code}}- isFactual: true
 - opinionScore: ≤ 0.3
 - specificityScore: ≥ 0.3
-- claimType: FACTUAL
-{{/code}}
+- claimType: FACTUAL{{/code}}
 
 **Action if Failed:**
+
 * Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
 * Do NOT generate scenarios or verdicts
 * Display explanation to user
@@ -83,6 +83,7 @@
 **Purpose:** Only publish verdicts with sufficient evidence and confidence
 
 **Validation Checks:**
+
 1. **Evidence Count:** Minimum 2 independent sources
 2. **Source Quality:** Average reliability ≥ 0.6 (on 0-1 scale)
 3. **Evidence Agreement:** % supporting vs. contradicting ≥ 0.6
@@ -89,8 +89,7 @@
 4. **Uncertainty Factors:** Count of hedging statements in reasoning
 
 **Confidence Tiers:**
-{{code}}
-HIGH (80-100%):
+{{code}}HIGH (80-100%):
   - ≥3 sources
   - ≥0.7 average quality
   - ≥80% agreement
@@ -104,10 +104,10 @@
   - ≥2 sources BUT low quality/agreement
 
 INSUFFICIENT:
-  - <2 sources → DO NOT PUBLISH
-{{/code}}
+  - <2 sources → DO NOT PUBLISH{{/code}}
 
 **POC1 Publication Rule:**
+
 * Minimum **MEDIUM** confidence required
 * Blocked verdicts show "Insufficient Evidence" message
 
@@ -140,6 +140,7 @@
 {{/code}}
 
 **Updated Verdict States:**
+
 * PUBLISHED - Passed all gates
 * INSUFFICIENT_EVIDENCE - Failed Gate 4
 * NON_FACTUAL_CLAIM - Failed Gate 1
@@ -146,7 +146,6 @@
 * PROCESSING - In progress
 * ERROR - System failure
 
-
 === 3.3 Modified FR4: Analysis Summary (Enhanced) ===
 
 **Enhancement for POC1:**
@@ -177,6 +177,7 @@
 POC1 is considered **SUCCESSFUL** if:
 
 **✅ Functional:**
+
 * Processes diverse test articles without crashes
 * Generates verdicts for all factual claims
 * Blocks all non-factual claims (0% pass through)
@@ -183,6 +183,7 @@
 * Blocks all insufficient-evidence verdicts (0% with <2 sources)
 
 **✅ Quality:**
+
 * Hallucination rate <10% (manual verification)
 * 0 verdicts with <2 sources published
 * 0 opinion statements published as facts
@@ -189,17 +189,18 @@
 * Average quality score ≥7.0/10
 
 **✅ Performance:**
+
 * Processing time reasonable for POC demonstration
 * Quality gates execute efficiently
 * UI displays results clearly
 
 **✅ Learnings:**
+
 * Identified prompt engineering improvements
 * Documented AKEL strengths/weaknesses
 * Validated threshold values
 * Clear path to POC2 defined
 
-
 == 5. Decision Gates ==
 
 **POC1 → POC2 Decision:**
@@ -230,6 +230,7 @@
 {{/code}}
 
 **POC1 Acceptable Simplifications:**
+
 * Single AKEL call (not multi-component pipeline)
 * No scenarios (implicit in verdicts)
 * Basic evidence linking
@@ -236,18 +236,16 @@
 * 2 gates instead of 4
 * No review queue
 
-**See:** [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] for details
+**See:** [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] for details
 
 
 == Related Pages ==
 
-* [[Roadmap Overview>>Test.FactHarbor.Roadmap.WebHome]] - All phases
-* [[POC2 Requirements>>Test.FactHarbor.Roadmap.POC2.WebHome]] - Next phase
-* [[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]] - Full system requirements
-* [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] - System architecture
+* [[Roadmap Overview>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.WebHome]] - All phases
+* [[POC2 Requirements>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.POC2.WebHome]] - Next phase
+* [[Requirements>>Test.FactHarbor pre10 V0\.9\.70.Specification.Requirements.WebHome]] - Full system requirements
+* [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] - System architecture
 * [[NFR11 Full Specification>>Test.FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework
 
-
 **Document Status:** ✅ POC1 Specification Complete - Ready for Implementation
 **Version:** V0.9.70
-
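
The Gate 1 pass criteria and the Gate 4 two-source minimum quoted in this diff can be sketched in Python as follows. This is only an illustration under stated assumptions: the `ClaimAssessment` class, the function names, and the constants are hypothetical and not taken from the FactHarbor codebase. Only the thresholds and the two verdict state names come from the page text. The remaining Gate 4 checks and confidence tiers are only partly visible in the diff, so they are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds as quoted in the page text.
MAX_OPINION_SCORE = 0.3        # Gate 1: opinionScore <= 0.3
MIN_SPECIFICITY_SCORE = 0.3    # Gate 1: specificityScore >= 0.3
MIN_INDEPENDENT_SOURCES = 2    # Gate 4: minimum 2 independent sources


@dataclass
class ClaimAssessment:
    """Hypothetical container; fields mirror the Gate 1 pass-criteria keys."""
    is_factual: bool
    opinion_score: float
    specificity_score: float
    claim_type: str


def passes_gate1(assessment: ClaimAssessment) -> bool:
    """Return True only if all four Gate 1 pass criteria hold."""
    return (
        assessment.is_factual
        and assessment.opinion_score <= MAX_OPINION_SCORE
        and assessment.specificity_score >= MIN_SPECIFICITY_SCORE
        and assessment.claim_type == "FACTUAL"
    )


def early_block_state(
    assessment: ClaimAssessment, independent_sources: int
) -> Optional[str]:
    """Return a blocking verdict state, or None if neither early block applies.

    None does NOT mean the verdict is publishable: the remaining Gate 4
    checks (source quality, agreement, confidence tier) are not modelled here.
    """
    if not passes_gate1(assessment):
        return "NON_FACTUAL_CLAIM"
    if independent_sources < MIN_INDEPENDENT_SOURCES:
        return "INSUFFICIENT_EVIDENCE"
    return None
```

Gate 1 is evaluated first because the page states that scenarios and verdicts must not be generated for claims that fail it, so the evidence count never matters for a non-factual claim.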