Last modified by Robert Schaub on 2025/12/23 16:13

= POC1: Core Workflow with Quality Gates =

**Phase Goal:** Prove AKEL can produce credible, quality outputs without manual intervention

**Success Metric:** <10% hallucination rate, quality gates prevent low-confidence publications


== 1. Overview ==

POC1 validates that the core AKEL workflow (Article → Claims → Verdicts) can produce trustworthy fact-checking analyses automatically. This phase implements **2 critical quality gates** to prevent low-quality outputs from being published.

**Key Innovation:** Quality validation BEFORE publication, not after

**What We're Proving:**

* AKEL can reliably extract factual claims from articles
* AKEL can generate credible verdicts with proper evidence
* Quality gates prevent hallucinations and low-confidence outputs
* Fully automated approach is viable

== 2. Scope ==

=== In Scope ===

* Core AKEL workflow (claim extraction, verdict generation)
* **Gate 1:** Claim Validation (factual vs. opinion/prediction)
* **Gate 4:** Verdict Confidence Assessment (minimum 2 sources, quality thresholds)
* Basic UI to display results
* Manual quality tracking

=== Out of Scope (Deferred to POC2+) ===

* User accounts / authentication
* Corrections system
* Search engine optimization (ClaimReview schema)
* Image verification
* API endpoints
* Archive.org integration
* Security hardening
* A/B testing
* Gates 2 & 3 (Evidence relevance, Scenario coherence)

== 3. Requirements ==

=== 3.1 NFR11: Quality Assurance Framework (POC1 Lite Version) ===

**Importance:** CRITICAL - Core POC1 Requirement
**Fulfills:** AI safety, credibility, prevention of embarrassing failures

**Specification:**

AKEL must validate outputs before displaying them to users. POC1 implements a **2-gate subset** of the full NFR11 framework.

==== Gate 1: Claim Validation ====

**Purpose:** Ensure extracted claims are factual assertions, not opinions or predictions

**Validation Checks:**

1. **Factual Statement Test:** Can this be verified with evidence?
2. **Opinion Detection:** Does it contain hedging or evaluative language? ("I think", "probably", "best", "worst")
3. **Specificity Score:** Does it contain concrete details? (names, numbers, dates, locations)
4. **Future Prediction Test:** Does it make claims about future events?

**Pass Criteria:**
{{code}}- isFactual: true
- opinionScore: ≤ 0.3
- specificityScore: ≥ 0.3
- claimType: FACTUAL{{/code}}

**Action if Failed:**

* Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
* Do NOT generate scenarios or verdicts
* Display explanation to user

**Target:** 0% opinion statements processed as facts
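
The Gate 1 pass criteria can be read as a single boolean check over four fields. The following is a minimal sketch in Python, not the AKEL implementation: the `ClaimAssessment` container, the `ClaimType` members other than `FACTUAL`, and the rule for choosing the failure label are assumptions made for illustration. Only the thresholds and the "Non-verifiable: ..." wording come from this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ClaimType(Enum):
    FACTUAL = "FACTUAL"
    # The members below are assumed; the spec only names FACTUAL explicitly.
    OPINION = "OPINION"
    PREDICTION = "PREDICTION"
    AMBIGUOUS = "AMBIGUOUS"


@dataclass(frozen=True)
class ClaimAssessment:
    """Hypothetical container for the four Gate 1 fields of one claim."""
    is_factual: bool          # isFactual
    opinion_score: float      # opinionScore, assumed to lie in 0-1
    specificity_score: float  # specificityScore, assumed to lie in 0-1
    claim_type: ClaimType     # claimType


MAX_OPINION_SCORE = 0.3      # pass if opinionScore <= 0.3
MIN_SPECIFICITY_SCORE = 0.3  # pass if specificityScore >= 0.3


def passes_gate1(a: ClaimAssessment) -> bool:
    """Return True only if all four pass criteria hold."""
    return (
        a.is_factual
        and a.opinion_score <= MAX_OPINION_SCORE
        and a.specificity_score >= MIN_SPECIFICITY_SCORE
        and a.claim_type is ClaimType.FACTUAL
    )


def gate1_failure_label(a: ClaimAssessment) -> Optional[str]:
    """Return the user-facing flag for a failed claim, or None if it passed.

    Assumed mapping: the claim type picks the label, and anything that
    fails without being typed as opinion or prediction is "Ambiguous".
    """
    if passes_gate1(a):
        return None
    if a.claim_type is ClaimType.OPINION:
        return "Non-verifiable: Opinion"
    if a.claim_type is ClaimType.PREDICTION:
        return "Non-verifiable: Prediction"
    return "Non-verifiable: Ambiguous"
```

Keeping the thresholds as named constants makes them easy to adjust if the POC1 evaluation shows the gate is too strict or too loose.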


==== Gate 4: Verdict Confidence Assessment ====

**Purpose:** Only publish verdicts with sufficient evidence and confidence

**Validation Checks:**

1. **Evidence Count:** Minimum 2 independent sources
2. **Source Quality:** Average reliability ≥ 0.6 (on a 0-1 scale)
3. **Evidence Agreement:** Share of supporting (vs. contradicting) evidence ≥ 0.6 (60%)
4. **Uncertainty Factors:** Count of hedging statements in reasoning

**Confidence Tiers:**
{{code}}HIGH (80-100%):
- ≥3 sources
- ≥0.7 average quality
- ≥80% agreement

MEDIUM (50-79%):
- ≥2 sources
- ≥0.6 average quality
- ≥60% agreement

LOW (0-49%):
- ≥2 sources BUT low quality/agreement

INSUFFICIENT:
- <2 sources → DO NOT PUBLISH{{/code}}

**POC1 Publication Rule:**

* Minimum **MEDIUM** confidence required
* Blocked verdicts show "Insufficient Evidence" message

**Target:** 0% verdicts published with <2 sources
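
The tier table and the publication rule can be expressed as a small classifier. This is a sketch under stated assumptions, not the AKEL implementation: the function and parameter names are hypothetical, agreement is passed as a fraction (0.8 for 80%), and the "Uncertainty Factors" count is left out because this page gives no threshold for it. The sketch assigns a tier only; it does not compute the percentage ranges shown next to each tier.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"
    INSUFFICIENT = "INSUFFICIENT"


def confidence_tier(source_count: int, avg_quality: float, agreement: float) -> Confidence:
    """Map evidence statistics to a confidence tier.

    avg_quality is the average source reliability on a 0-1 scale;
    agreement is the share of supporting evidence as a fraction (0-1).
    """
    if source_count < 2:
        return Confidence.INSUFFICIENT
    if source_count >= 3 and avg_quality >= 0.7 and agreement >= 0.8:
        return Confidence.HIGH
    if avg_quality >= 0.6 and agreement >= 0.6:
        return Confidence.MEDIUM
    # At least 2 sources, but quality or agreement is below the MEDIUM bar.
    return Confidence.LOW


def may_publish(tier: Confidence) -> bool:
    """POC1 publication rule: MEDIUM confidence or better."""
    return tier in (Confidence.HIGH, Confidence.MEDIUM)
```

Checking the tiers from strictest to loosest means a verdict with two excellent sources still lands in MEDIUM, because HIGH requires at least three sources.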


=== 3.2 Modified FR7: Automated Verdicts (Enhanced) ===

**Enhancement for POC1:**

After AKEL generates a verdict, it must pass through the quality validation pipeline:

{{code}}
AKEL Workflow (POC1):

1. Extract claims from article

2. [GATE 1] Validate each claim is fact-checkable
   ↓ (pass claims only)
3. Generate verdicts for each claim

4. [GATE 4] Validate verdict has sufficient evidence
   ↓ (pass verdicts only)
5. Display to user

Failed claims/verdicts:
- Store in database with failure reason
- Display explanatory message to user
- Log for quality metrics tracking
{{/code}}

**Updated Verdict States:**

* PUBLISHED - Passed all gates
* INSUFFICIENT_EVIDENCE - Failed Gate 4
* NON_FACTUAL_CLAIM - Failed Gate 1
* PROCESSING - In progress
* ERROR - System failure
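
The workflow and verdict states above can be sketched as one function that returns the final state of a claim. This is illustrative only: `process_claim` and its three callable parameters are hypothetical stand-ins for the real AKEL steps, and a verdict is modelled as a plain dictionary. Storing failures and logging metrics are outside the sketch.

```python
from enum import Enum
from typing import Callable


class VerdictState(Enum):
    PUBLISHED = "PUBLISHED"                          # passed all gates
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"  # failed Gate 4
    NON_FACTUAL_CLAIM = "NON_FACTUAL_CLAIM"          # failed Gate 1
    PROCESSING = "PROCESSING"                        # in progress
    ERROR = "ERROR"                                  # system failure


def process_claim(
    claim: str,
    gate1: Callable[[str], bool],
    generate_verdict: Callable[[str], dict],
    gate4: Callable[[dict], bool],
) -> VerdictState:
    """Run one claim through Gate 1, verdict generation, and Gate 4.

    PROCESSING is the state a caller would record before invoking this
    function; the function itself only returns a final state.
    """
    try:
        if not gate1(claim):
            # No verdict is generated for non-factual claims.
            return VerdictState.NON_FACTUAL_CLAIM
        verdict = generate_verdict(claim)
        if not gate4(verdict):
            return VerdictState.INSUFFICIENT_EVIDENCE
        return VerdictState.PUBLISHED
    except Exception:
        # Any failure in a gate or in verdict generation is a system failure.
        return VerdictState.ERROR
```

Passing the gates in as callables keeps the control flow testable without a live model: each gate can be replaced by a stub that returns a fixed result.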

=== 3.3 Modified FR4: Analysis Summary (Enhanced) ===

**Enhancement for POC1:**

Analysis Summary must now display quality metadata:

{{code}}
Analysis Summary:
Total Claims Found: 5
Verifiable Claims: 3
Non-verifiable (Opinion): 1
Non-verifiable (Prediction): 1

Verdicts Generated: 3
High Confidence: 1
Medium Confidence: 2
Insufficient Evidence: 0

Evidence Sources: 12 total
Average Source Quality: 0.73

Quality Score: 8.5/10
{{/code}}
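
The counts in the summary above can be derived from per-claim results. The sketch below is one possible aggregation, assuming a hypothetical `ClaimResult` record with a category, an optional confidence tier, and the quality scores of its sources. The overall Quality Score is omitted because this page does not define how it is calculated.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ClaimResult:
    """Hypothetical per-claim record used only for this illustration."""
    category: str                      # "verifiable", "opinion", or "prediction"
    confidence: Optional[str] = None   # "HIGH", "MEDIUM", "INSUFFICIENT", or None if no verdict
    source_qualities: Tuple[float, ...] = ()  # reliability of each source, 0-1


def summarize(results: List[ClaimResult]) -> Dict[str, object]:
    """Aggregate claim results into the Analysis Summary fields."""
    categories = Counter(r.category for r in results)
    tiers = Counter(r.confidence for r in results if r.confidence is not None)
    qualities = [q for r in results for q in r.source_qualities]
    average = round(sum(qualities) / len(qualities), 2) if qualities else None
    return {
        "total_claims": len(results),
        "verifiable": categories["verifiable"],
        "non_verifiable_opinion": categories["opinion"],
        "non_verifiable_prediction": categories["prediction"],
        "verdicts_generated": sum(tiers.values()),
        "high_confidence": tiers["HIGH"],
        "medium_confidence": tiers["MEDIUM"],
        "insufficient_evidence": tiers["INSUFFICIENT"],
        "evidence_sources": len(qualities),
        "average_source_quality": average,  # None when there are no sources
    }
```

Returning `None` for the average when there are no sources avoids a division by zero for articles in which every claim is blocked at Gate 1.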


== 4. Success Criteria ==

POC1 is considered **SUCCESSFUL** if:

**✅ Functional:**

* Processes diverse test articles without crashes
* Generates verdicts for all factual claims
* Blocks all non-factual claims (0% pass through)
* Blocks all insufficient-evidence verdicts (0% with <2 sources)

**✅ Quality:**

* Hallucination rate <10% (manual verification)
* 0 verdicts with <2 sources published
* 0 opinion statements published as facts
* Average quality score ≥7.0/10

**✅ Performance:**

* Processing time reasonable for POC demonstration
* Quality gates execute efficiently
* UI displays results clearly

**✅ Learnings:**

* Identified prompt engineering improvements
* Documented AKEL strengths/weaknesses
* Validated threshold values
* Clear path to POC2 defined

== 5. Decision Gates ==

**POC1 → POC2 Decision:**

* **IF** hallucination rate ≥10% → Pause, improve prompts before POC2
* **IF** majority of claims non-processable → Rethink claim extraction approach
* **IF** quality gates too strict (excessive blocking) → Adjust thresholds
* **IF** quality gates too loose (hallucinations pass) → Tighten criteria

**Only proceed to POC2 if all success criteria met**
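
The two quantifiable decision rules can be sketched as follows. The function names are hypothetical. The sketch treats a hallucination rate of exactly 10% as blocking, because the success metric requires a rate below 10%, and reads "majority" as a share above 0.5. The "too strict" and "too loose" rules are not modelled, since this page gives no numeric threshold for them.

```python
from typing import List


def blocking_actions(hallucination_rate: float, non_processable_share: float) -> List[str]:
    """Return the corrective actions triggered by the measured rates.

    Both arguments are fractions in 0-1, e.g. 0.08 for 8%.
    """
    actions = []
    if hallucination_rate >= 0.10:
        actions.append("Pause, improve prompts before POC2")
    if non_processable_share > 0.5:
        actions.append("Rethink claim extraction approach")
    return actions


def may_proceed_to_poc2(
    all_success_criteria_met: bool,
    hallucination_rate: float,
    non_processable_share: float,
) -> bool:
    """Proceed only if every success criterion is met and nothing blocks."""
    return all_success_criteria_met and not blocking_actions(
        hallucination_rate, non_processable_share
    )
```

For example, `may_proceed_to_poc2(True, 0.08, 0.2)` returns `True`, while a hallucination rate of 0.12 returns `False` regardless of the other inputs.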


== 6. Architecture Notes ==

**POC1 Simplified Architecture:**

{{code}}
User Input → AKEL Processing → Quality Gates → Display
             (claim extraction (Gates 1 & 4)
              + verdicts)
{{/code}}

**vs. Full System (Future):**

{{code}}
Input → Claim Extractor → Scenario Generator → Evidence Linker
→ Verdict Generator → All 4 Gates → Review Queue → Publication
{{/code}}

**POC1 Acceptable Simplifications:**

* Single AKEL call (not multi-component pipeline)
* No scenarios (implicit in verdicts)
* Basic evidence linking
* 2 gates instead of 4
* No review queue

**See:** [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] for details


== Related Pages ==

* [[Roadmap Overview>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.WebHome]] - All phases
* [[POC2 Requirements>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.POC2.WebHome]] - Next phase
* [[Requirements>>Test.FactHarbor pre10 V0\.9\.70.Specification.Requirements.WebHome]] - Full system requirements
* [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] - System architecture
* [[NFR11 Full Specification>>Test.FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework

**Document Status:** ✅ POC1 Specification Complete - Ready for Implementation
**Version:** V0.9.70