Changes for page POC1: Core Workflow with Quality Gates
Last modified by Robert Schaub on 2025/12/24 20:35
Summary
Page properties (2 modified, 0 added, 0 removed)
Details
- Page properties
- Parent
...
@@ -1,1 +1,1 @@
-FactHarbor.Archive.FactHarbordelta for V0\.9\.70.Roadmap.WebHome
+Test.FactHarbor.Roadmap.WebHome
- Content
...
@@ -4,7 +4,7 @@

 **Success Metric:** <10% hallucination rate, quality gates prevent low-confidence publications

-----
+---

 == 1. Overview ==

...
@@ -13,13 +13,12 @@
 **Key Innovation:** Quality validation BEFORE publication, not after

 **What We're Proving:**
-
 * AKEL can reliably extract factual claims from articles
 * AKEL can generate credible verdicts with proper evidence
 * Quality gates prevent hallucinations and low-confidence outputs
 * Fully automated approach is viable

-----
+---

 == 2. Scope ==

...
@@ -43,7 +43,7 @@
 * A/B testing
 * Gates 2 & 3 (Evidence relevance, Scenario coherence)

-----
+---

 == 3. Requirements ==

...
@@ -61,7 +61,6 @@
 **Purpose:** Ensure extracted claims are factual assertions, not opinions or predictions

 **Validation Checks:**
-
 1. **Factual Statement Test:** Can this be verified with evidence?
 2. **Opinion Detection:** Contains hedging language? ("I think", "probably", "best", "worst")
 3. **Specificity Score:** Contains concrete details? (names, numbers, dates, locations)
...
@@ -68,13 +68,14 @@
 4. **Future Prediction Test:** Makes claims about future events?

 **Pass Criteria:**
-{{code}}- isFactual: true
+{{code}}
+- isFactual: true
 - opinionScore: ≤ 0.3
 - specificityScore: ≥ 0.3
-- claimType: FACTUAL{{/code}}
+- claimType: FACTUAL
+{{/code}}

 **Action if Failed:**
-
 * Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
 * Do NOT generate scenarios or verdicts
 * Display explanation to user
...
@@ -81,7 +81,7 @@

 **Target:** 0% opinion statements processed as facts

-----
+---

 ==== Gate 4: Verdict Confidence Assessment ====

...
@@ -88,7 +88,6 @@
 **Purpose:** Only publish verdicts with sufficient evidence and confidence

 **Validation Checks:**
-
 1. **Evidence Count:** Minimum 2 independent sources
 2. **Source Quality:** Average reliability ≥ 0.6 (on 0-1 scale)
 3. **Evidence Agreement:** % supporting vs. contradicting ≥ 0.6
...
@@ -95,7 +95,8 @@
 4. **Uncertainty Factors:** Count of hedging statements in reasoning

 **Confidence Tiers:**
-{{code}}HIGH (80-100%):
+{{code}}
+HIGH (80-100%):
 - ≥3 sources
 - ≥0.7 average quality
 - ≥80% agreement
...
@@ -109,16 +109,16 @@
 - ≥2 sources BUT low quality/agreement

 INSUFFICIENT:
- - <2 sources → DO NOT PUBLISH{{/code}}
+ - <2 sources → DO NOT PUBLISH
+{{/code}}

 **POC1 Publication Rule:**
-
 * Minimum **MEDIUM** confidence required
 * Blocked verdicts show "Insufficient Evidence" message

 **Target:** 0% verdicts published with <2 sources

-----
+---

 === 3.2 Modified FR7: Automated Verdicts (Enhanced) ===

...
@@ -146,7 +146,6 @@
 {{/code}}

 **Updated Verdict States:**
-
 * PUBLISHED - Passed all gates
 * INSUFFICIENT_EVIDENCE - Failed Gate 4
 * NON_FACTUAL_CLAIM - Failed Gate 1
...
@@ -153,7 +153,7 @@
 * PROCESSING - In progress
 * ERROR - System failure

-----
+---

 === 3.3 Modified FR4: Analysis Summary (Enhanced) ===

...
@@ -179,7 +179,7 @@
 Quality Score: 8.5/10
 {{/code}}

-----
+---

 == 4. Success Criteria ==

...
@@ -186,7 +186,6 @@
 POC1 is considered **SUCCESSFUL** if:

 **✅ Functional:**
-
 * Processes diverse test articles without crashes
 * Generates verdicts for all factual claims
 * Blocks all non-factual claims (0% pass through)
...
@@ -193,7 +193,6 @@
 * Blocks all insufficient-evidence verdicts (0% with <2 sources)

 **✅ Quality:**
-
 * Hallucination rate <10% (manual verification)
 * 0 verdicts with <2 sources published
 * 0 opinion statements published as facts
...
@@ -200,19 +200,17 @@
 * Average quality score ≥7.0/10

 **✅ Performance:**
-
 * Processing time reasonable for POC demonstration
 * Quality gates execute efficiently
 * UI displays results clearly

 **✅ Learnings:**
-
 * Identified prompt engineering improvements
 * Documented AKEL strengths/weaknesses
 * Validated threshold values
 * Clear path to POC2 defined

-----
+---

 == 5. Decision Gates ==

...
@@ -225,7 +225,7 @@

 **Only proceed to POC2 if all success criteria met**

-----
+---

 == 6. Architecture Notes ==

...
@@ -245,7 +245,6 @@
 {{/code}}

 **POC1 Acceptable Simplifications:**
-
 * Single AKEL call (not multi-component pipeline)
 * No scenarios (implicit in verdicts)
 * Basic evidence linking
...
@@ -252,19 +252,20 @@
 * 2 gates instead of 4
 * No review queue

-**See:** [[Architecture>>FactHarbor.Archive.FactHarbordelta for V0\.9\.70.Specification.Architecture.WebHome]] for details
+**See:** [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] for details

-----
+---

 == Related Pages ==

-* [[Roadmap Overview>>FactHarbor.Archive.FactHarbordelta for V0\.9\.70.Roadmap.WebHome]] - All phases
-* [[POC2 Requirements>>FactHarbor.Archive.FactHarbordelta for V0\.9\.70.Roadmap.POC2.WebHome]] - Next phase
-* [[Requirements>>FactHarbor.Archive.FactHarbordelta for V0\.9\.70.Specification.Requirements.WebHome]] - Full system requirements
-* [[Architecture>>FactHarbor.Archive.FactHarbordelta for V0\.9\.70.Specification.Architecture.WebHome]] - System architecture
+* [[Roadmap Overview>>Test.FactHarbor.Roadmap.WebHome]] - All phases
+* [[POC2 Requirements>>Test.FactHarbor.Roadmap.POC2.WebHome]] - Next phase
+* [[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]] - Full system requirements
+* [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] - System architecture
 * [[NFR11 Full Specification>>Test.FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework

-----
+---

 **Document Status:** ✅ POC1 Specification Complete - Ready for Implementation
 **Version:** V0.9.70
+
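
The Gate 1 pass criteria and the Gate 4 validation checks quoted in the hunks above can be read as two boolean checks that map onto the listed verdict states. The following is a minimal sketch under stated assumptions, not the AKEL implementation: the class names, field names, and function names are hypothetical, and the Gate 4 check uses only the three numeric validation checks (at least 2 sources, average reliability ≥ 0.6, agreement ≥ 0.6) as a floor. It does not reproduce the confidence tiers, because the MEDIUM tier is not visible in this diff.

```python
from dataclasses import dataclass
from enum import Enum


class VerdictState(Enum):
    # State names come from the "Updated Verdict States" list in the page.
    PUBLISHED = "PUBLISHED"
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"
    NON_FACTUAL_CLAIM = "NON_FACTUAL_CLAIM"


@dataclass
class ClaimAssessment:
    """Hypothetical container for the Gate 1 inputs."""
    is_factual: bool
    opinion_score: float
    specificity_score: float
    claim_type: str


@dataclass
class EvidenceSummary:
    """Hypothetical container for the Gate 4 inputs."""
    source_count: int
    average_reliability: float  # 0-1 scale
    agreement: float            # share of evidence supporting the verdict, 0-1


def passes_gate1(claim: ClaimAssessment) -> bool:
    # Pass criteria as listed: isFactual, opinionScore <= 0.3,
    # specificityScore >= 0.3, claimType FACTUAL.
    return (
        claim.is_factual
        and claim.opinion_score <= 0.3
        and claim.specificity_score >= 0.3
        and claim.claim_type == "FACTUAL"
    )


def passes_gate4_minimum(evidence: EvidenceSummary) -> bool:
    # Assumption: the three numeric validation checks act as a minimum bar.
    # The tier boundaries themselves are not modelled here.
    return (
        evidence.source_count >= 2
        and evidence.average_reliability >= 0.6
        and evidence.agreement >= 0.6
    )


def verdict_state(claim: ClaimAssessment, evidence: EvidenceSummary) -> VerdictState:
    # Gate 1 runs first: non-factual claims never reach verdict generation.
    if not passes_gate1(claim):
        return VerdictState.NON_FACTUAL_CLAIM
    if not passes_gate4_minimum(evidence):
        return VerdictState.INSUFFICIENT_EVIDENCE
    return VerdictState.PUBLISHED


if __name__ == "__main__":
    claim = ClaimAssessment(True, 0.1, 0.8, "FACTUAL")
    # A factual claim backed by a single source is blocked by Gate 4.
    print(verdict_state(claim, EvidenceSummary(1, 0.9, 1.0)).value)  # INSUFFICIENT_EVIDENCE
```

Checking Gate 1 before Gate 4 mirrors the page's rule that scenarios and verdicts are not generated for non-verifiable claims. The PROCESSING and ERROR states are left out because they describe pipeline status rather than gate outcomes.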