Version 1.2 by Robert Schaub on 2025/12/22 13:49

= POC1: Core Workflow with Quality Gates =

**Phase Goal:** Prove AKEL can produce credible, quality outputs without manual intervention

**Success Metric:** <10% hallucination rate, quality gates prevent low-confidence publications


== 1. Overview ==

POC1 validates that the core AKEL workflow (Article → Claims → Verdicts) can produce trustworthy fact-checking analyses automatically. This phase implements **2 critical quality gates** to prevent low-quality outputs from being published.

**Key Innovation:** Quality validation BEFORE publication, not after

**What We're Proving:**
* AKEL can reliably extract factual claims from articles
* AKEL can generate credible verdicts with proper evidence
* Quality gates prevent hallucinations and low-confidence outputs
* Fully automated approach is viable


== 2. Scope ==

=== In Scope ===

* Core AKEL workflow (claim extraction, verdict generation)
* **Gate 1:** Claim Validation (factual vs. opinion/prediction)
* **Gate 4:** Verdict Confidence Assessment (minimum 2 sources, quality thresholds)
* Basic UI to display results
* Manual quality tracking

=== Out of Scope (Deferred to POC2+) ===

* User accounts / authentication
* Corrections system
* Search engine optimization (ClaimReview schema)
* Image verification
* API endpoints
* Archive.org integration
* Security hardening
* A/B testing
* Gates 2 & 3 (Evidence relevance, Scenario coherence)


== 3. Requirements ==

=== 3.1 NFR11: Quality Assurance Framework (POC1 Lite Version) ===

**Priority:** CRITICAL - Core POC1 Requirement
**Fulfills:** AI safety, credibility, prevents embarrassing failures

**Specification:**

AKEL must validate outputs before displaying them to users. POC1 implements a **2-gate subset** of the full NFR11 framework.

==== Gate 1: Claim Validation ====

**Purpose:** Ensure extracted claims are factual assertions, not opinions or predictions

**Validation Checks:**
1. **Factual Statement Test:** Can this be verified with evidence?
2. **Opinion Detection:** Contains hedging or evaluative language? ("I think", "probably", "best", "worst")
3. **Specificity Score:** Contains concrete details? (names, numbers, dates, locations)
4. **Future Prediction Test:** Makes claims about future events?

**Pass Criteria:**
{{code}}
- isFactual: true
- opinionScore: ≤ 0.3
- specificityScore: ≥ 0.3
- claimType: FACTUAL
{{/code}}

**Action if Failed:**
* Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
* Do NOT generate scenarios or verdicts
* Display explanation to user

**Target:** 0% opinion statements processed as facts

80
81 ==== Gate 4: Verdict Confidence Assessment ====
82
83 **Purpose:** Only publish verdicts with sufficient evidence and confidence
84
85 **Validation Checks:**
86 1. **Evidence Count:** Minimum 2 independent sources
87 2. **Source Quality:** Average reliability ≥ 0.6 (on 0-1 scale)
88 3. **Evidence Agreement:** % supporting vs. contradicting ≥ 0.6
89 4. **Uncertainty Factors:** Count of hedging statements in reasoning
90
91 **Confidence Tiers:**
92 {{code}}
93 HIGH (80-100%):
94 - ≥3 sources
95 - ≥0.7 average quality
96 - ≥80% agreement
97
98 MEDIUM (50-79%):
99 - ≥2 sources
100 - ≥0.6 average quality
101 - ≥60% agreement
102
103 LOW (0-49%):
104 - ≥2 sources BUT low quality/agreement
105
106 INSUFFICIENT:
107 - <2 sources → DO NOT PUBLISH
108 {{/code}}
109
110 **POC1 Publication Rule:**
111 * Minimum **MEDIUM** confidence required
112 * Blocked verdicts show "Insufficient Evidence" message
113
114 **Target:** 0% verdicts published with <2 sources
115
116
117 === 3.2 Modified FR7: Automated Verdicts (Enhanced) ===
118
119 **Enhancement for POC1:**
120
121 After AKEL generates a verdict, it must pass through the quality validation pipeline:
122
123 {{code}}
124 AKEL Workflow (POC1):
125
126 1. Extract claims from article
127
128 2. [GATE 1] Validate each claim is fact-checkable
129 ↓ (pass claims only)
130 3. Generate verdicts for each claim
131
132 4. [GATE 4] Validate verdict has sufficient evidence
133 ↓ (pass verdicts only)
134 5. Display to user
135
136 Failed claims/verdicts:
137 - Store in database with failure reason
138 - Display explanatory message to user
139 - Log for quality metrics tracking
140 {{/code}}
141
142 **Updated Verdict States:**
143 * PUBLISHED - Passed all gates
144 * INSUFFICIENT_EVIDENCE - Failed Gate 4
145 * NON_FACTUAL_CLAIM - Failed Gate 1
146 * PROCESSING - In progress
147 * ERROR - System failure
148
149
150 === 3.3 Modified FR4: Analysis Summary (Enhanced) ===
151
152 **Enhancement for POC1:**
153
154 Analysis Summary must now display quality metadata:
155
156 {{code}}
157 Analysis Summary:
158 Total Claims Found: 5
159 Verifiable Claims: 3
160 Non-verifiable (Opinion): 1
161 Non-verifiable (Prediction): 1
162
163 Verdicts Generated: 3
164 High Confidence: 1
165 Medium Confidence: 2
166 Insufficient Evidence: 0
167
168 Evidence Sources: 12 total
169 Average Source Quality: 0.73
170
171 Quality Score: 8.5/10
172 {{/code}}
173
174
175 == 4. Success Criteria ==
176
177 POC1 is considered **SUCCESSFUL** if:
178
179 **✅ Functional:**
180 * Processes diverse test articles without crashes
181 * Generates verdicts for all factual claims
182 * Blocks all non-factual claims (0% pass through)
183 * Blocks all insufficient-evidence verdicts (0% with <2 sources)
184
185 **✅ Quality:**
186 * Hallucination rate <10% (manual verification)
187 * 0 verdicts with <2 sources published
188 * 0 opinion statements published as facts
189 * Average quality score ≥7.0/10
190
191 **✅ Performance:**
192 * Processing time reasonable for POC demonstration
193 * Quality gates execute efficiently
194 * UI displays results clearly
195
196 **✅ Learnings:**
197 * Identified prompt engineering improvements
198 * Documented AKEL strengths/weaknesses
199 * Validated threshold values
200 * Clear path to POC2 defined
201
202
203 == 5. Decision Gates ==
204
205 **POC1 → POC2 Decision:**
206
207 * **IF** hallucination rate >10% → Pause, improve prompts before POC2
208 * **IF** majority of claims non-processable → Rethink claim extraction approach
209 * **IF** quality gates too strict (excessive blocking) → Adjust thresholds
210 * **IF** quality gates too loose (hallucinations pass) → Tighten criteria
211
212 **Only proceed to POC2 if all success criteria met**
213
214
215 == 6. Architecture Notes ==
216
217 **POC1 Simplified Architecture:**
218
219 {{code}}
220 User Input → AKEL Processing → Quality Gates → Display
221 (claim extraction (Gates 1 & 4)
222 + verdicts)
223 {{/code}}
224
225 **vs. Full System (Future):**
226
227 {{code}}
228 Input → Claim Extractor → Scenario Generator → Evidence Linker
229 → Verdict Generator → All 4 Gates → Review Queue → Publication
230 {{/code}}
231
232 **POC1 Acceptable Simplifications:**
233 * Single AKEL call (not multi-component pipeline)
234 * No scenarios (implicit in verdicts)
235 * Basic evidence linking
236 * 2 gates instead of 4
237 * No review queue
238
239 **See:** [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] for details
240
241
242 == Related Pages ==
243
244 * [[Roadmap Overview>>Test.FactHarbor.Roadmap.WebHome]] - All phases
245 * [[POC2 Requirements>>Test.FactHarbor.Roadmap.POC2.WebHome]] - Next phase
246 * [[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]] - Full system requirements
247 * [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] - System architecture
248 * [[NFR11 Full Specification>>Test.FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework
249
250
251 **Document Status:** ✅ POC1 Specification Complete - Ready for Implementation
252 **Version:** V0.9.70