Changes for page POC1: Core Workflow with Quality Gates
Last modified by Robert Schaub on 2025/12/24 20:35
Summary
-
Page properties (2 modified, 0 added, 0 removed)
Details
- Page properties
- Parent
@@ -1,1 +1,1 @@
-Test.FactHarbor.Roadmap.WebHome
+FactHarbor.Archive.FactHarbor delta for V0\.9\.70.Roadmap.WebHome
- Content
@@ -4,7 +4,7 @@
 
 **Success Metric:** <10% hallucination rate, quality gates prevent low-confidence publications
 
----
+----
 
 == 1. Overview ==
 
@@ -13,12 +13,13 @@
 **Key Innovation:** Quality validation BEFORE publication, not after
 
 **What We're Proving:**
+
 * AKEL can reliably extract factual claims from articles
 * AKEL can generate credible verdicts with proper evidence
 * Quality gates prevent hallucinations and low-confidence outputs
 * Fully automated approach is viable
 
----
+----
 
 == 2. Scope ==
 
@@ -42,7 +42,7 @@
 * A/B testing
 * Gates 2 & 3 (Evidence relevance, Scenario coherence)
 
----
+----
 
 == 3. Requirements ==
 
@@ -60,6 +60,7 @@
 **Purpose:** Ensure extracted claims are factual assertions, not opinions or predictions
 
 **Validation Checks:**
+
 1. **Factual Statement Test:** Can this be verified with evidence?
 2. **Opinion Detection:** Contains hedging language? ("I think", "probably", "best", "worst")
 3. **Specificity Score:** Contains concrete details? (names, numbers, dates, locations)
@@ -66,14 +66,13 @@
 4. **Future Prediction Test:** Makes claims about future events?
 
 **Pass Criteria:**
-{{code}}
-- isFactual: true
+{{code}}- isFactual: true
 - opinionScore: ≤ 0.3
 - specificityScore: ≥ 0.3
-- claimType: FACTUAL
-{{/code}}
+- claimType: FACTUAL{{/code}}
 
 **Action if Failed:**
+
 * Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
 * Do NOT generate scenarios or verdicts
 * Display explanation to user
@@ -80,7 +80,7 @@
 
 **Target:** 0% opinion statements processed as facts
 
----
+----
 
 ==== Gate 4: Verdict Confidence Assessment ====
 
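The Gate 1 pass criteria shown above reduce to a single boolean check. The following is a minimal Python sketch, not the project's implementation: the `ClaimAssessment` record and the `passes_gate1` function are hypothetical names, the 0-1 score ranges are assumed, and the page does not say how AKEL computes the scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimAssessment:
    """Hypothetical record mirroring the four fields under Pass Criteria."""
    is_factual: bool
    opinion_score: float      # assumed 0-1 scale; lower means less opinion-like
    specificity_score: float  # assumed 0-1 scale; higher means more concrete detail
    claim_type: str           # "FACTUAL" is the only value named on the page


def passes_gate1(assessment: ClaimAssessment) -> bool:
    """Return True only if all four Gate 1 pass criteria hold."""
    return (
        assessment.is_factual
        and assessment.opinion_score <= 0.3
        and assessment.specificity_score >= 0.3
        and assessment.claim_type == "FACTUAL"
    )
```

A claim that fails this check would be flagged as non-verifiable and skipped before any verdict is generated, as described under "Action if Failed".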
@@ -87,6 +87,7 @@
 **Purpose:** Only publish verdicts with sufficient evidence and confidence
 
 **Validation Checks:**
+
 1. **Evidence Count:** Minimum 2 independent sources
 2. **Source Quality:** Average reliability ≥ 0.6 (on 0-1 scale)
 3. **Evidence Agreement:** % supporting vs. contradicting ≥ 0.6
@@ -93,8 +93,7 @@
 4. **Uncertainty Factors:** Count of hedging statements in reasoning
 
 **Confidence Tiers:**
-{{code}}
-HIGH (80-100%):
+{{code}}HIGH (80-100%):
  - ≥3 sources
  - ≥0.7 average quality
  - ≥80% agreement
@@ -108,16 +108,16 @@
  - ≥2 sources BUT low quality/agreement
 
 INSUFFICIENT:
- - <2 sources → DO NOT PUBLISH
-{{/code}}
+ - <2 sources → DO NOT PUBLISH{{/code}}
 
 **POC1 Publication Rule:**
+
 * Minimum **MEDIUM** confidence required
 * Blocked verdicts show "Insufficient Evidence" message
 
 **Target:** 0% verdicts published with <2 sources
 
----
+----
 
 === 3.2 Modified FR7: Automated Verdicts (Enhanced) ===
 
@@ -145,6 +145,7 @@
 {{/code}}
 
 **Updated Verdict States:**
+
 * PUBLISHED - Passed all gates
 * INSUFFICIENT_EVIDENCE - Failed Gate 4
 * NON_FACTUAL_CLAIM - Failed Gate 1
@@ -151,7 +151,7 @@
 * PROCESSING - In progress
 * ERROR - System failure
 
----
+----
 
 === 3.3 Modified FR4: Analysis Summary (Enhanced) ===
 
@@ -177,7 +177,7 @@
 Quality Score: 8.5/10
 {{/code}}
 
----
+----
 
 == 4. Success Criteria ==
 
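The POC1 publication rule and the updated verdict states combine into a small decision function. The sketch below is illustrative Python under stated assumptions: `resolve_state` and `TIER_RANK` are hypothetical names, and the tier label "LOW" is assumed because the tier between MEDIUM and INSUFFICIENT is not named in the visible hunks. The confidence tier is taken as an input, since the MEDIUM thresholds are elided from this diff.

```python
from enum import Enum


class VerdictState(Enum):
    """The five verdict states listed under Updated Verdict States."""
    PUBLISHED = "PUBLISHED"
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"
    NON_FACTUAL_CLAIM = "NON_FACTUAL_CLAIM"
    PROCESSING = "PROCESSING"
    ERROR = "ERROR"


# Ordering of confidence tiers. "LOW" is an assumed label, not confirmed
# by the visible part of the page.
TIER_RANK = {"INSUFFICIENT": 0, "LOW": 1, "MEDIUM": 2, "HIGH": 3}

# POC1 Publication Rule: minimum MEDIUM confidence required.
MINIMUM_PUBLISHABLE_TIER = "MEDIUM"


def resolve_state(gate1_passed: bool, tier: str) -> VerdictState:
    """Map gate outcomes to a terminal verdict state.

    PROCESSING and ERROR are lifecycle states set elsewhere, so this
    function never returns them.
    """
    if not gate1_passed:
        return VerdictState.NON_FACTUAL_CLAIM
    if TIER_RANK[tier] < TIER_RANK[MINIMUM_PUBLISHABLE_TIER]:
        return VerdictState.INSUFFICIENT_EVIDENCE
    return VerdictState.PUBLISHED
```

Gate 1 is checked first in this sketch because a failed Gate 1 means no verdict is generated at all, so Gate 4 has nothing to assess.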
@@ -184,6 +184,7 @@
 POC1 is considered **SUCCESSFUL** if:
 
 **✅ Functional:**
+
 * Processes diverse test articles without crashes
 * Generates verdicts for all factual claims
 * Blocks all non-factual claims (0% pass through)
@@ -190,6 +190,7 @@
 * Blocks all insufficient-evidence verdicts (0% with <2 sources)
 
 **✅ Quality:**
+
 * Hallucination rate <10% (manual verification)
 * 0 verdicts with <2 sources published
 * 0 opinion statements published as facts
@@ -196,17 +196,19 @@
 * Average quality score ≥7.0/10
 
 **✅ Performance:**
+
 * Processing time reasonable for POC demonstration
 * Quality gates execute efficiently
 * UI displays results clearly
 
 **✅ Learnings:**
+
 * Identified prompt engineering improvements
 * Documented AKEL strengths/weaknesses
 * Validated threshold values
 * Clear path to POC2 defined
 
----
+----
 
 == 5. Decision Gates ==
 
@@ -219,7 +219,7 @@
 
 **Only proceed to POC2 if all success criteria met**
 
----
+----
 
 == 6. Architecture Notes ==
 
@@ -239,6 +239,7 @@
 {{/code}}
 
 **POC1 Acceptable Simplifications:**
+
 * Single AKEL call (not multi-component pipeline)
 * No scenarios (implicit in verdicts)
 * Basic evidence linking
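The four Quality criteria in section 4 (Success Criteria) are all numeric, so they can be checked mechanically once a test run has been tallied. The Python sketch below is a hedged illustration: `Poc1QualityReport` and `meets_quality_criteria` are hypothetical names, and the counts are assumed to come from the manual verification the page mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Poc1QualityReport:
    """Hypothetical tally of one POC1 evaluation run."""
    hallucination_rate: float              # fraction in [0, 1], from manual verification
    published_with_under_two_sources: int  # verdicts published with <2 sources
    opinions_published_as_facts: int       # opinion statements published as facts
    average_quality_score: float           # on the page's 0-10 scale


def meets_quality_criteria(report: Poc1QualityReport) -> bool:
    """Return True only if every Quality criterion from section 4 holds."""
    return (
        report.hallucination_rate < 0.10
        and report.published_with_under_two_sources == 0
        and report.opinions_published_as_facts == 0
        and report.average_quality_score >= 7.0
    )
```

This covers only the Quality group. The Functional, Performance, and Learnings criteria are qualitative and would still need manual review.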
@@ -247,18 +247,17 @@
 
 **See:** [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] for details
 
----
+----
 
 == Related Pages ==
 
-* [[Roadmap Overview>>Test.FactHarbor.Roadmap.WebHome]] - All phases
-* [[POC2 Requirements>>Test.FactHarbor.Roadmap.POC2.WebHome]] - Next phase
+* [[Roadmap Overview>>FactHarbor.Archive.FactHarbor delta for V0\.9\.70.Roadmap.WebHome]] - All phases
+* [[POC2 Requirements>>FactHarbor.Archive.FactHarbor delta for V0\.9\.70.Roadmap.POC2.WebHome]] - Next phase
 * [[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]] - Full system requirements
 * [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] - System architecture
 * [[NFR11 Full Specification>>Test.FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework
 
----
+----
 
 **Document Status:** ✅ POC1 Specification Complete - Ready for Implementation
 **Version:** V0.9.70
-