= POC1: Core Workflow with Quality Gates =

**Phase Goal:** Prove AKEL can produce credible, quality outputs without manual intervention

**Success Metric:** <10% hallucination rate, quality gates prevent low-confidence publications

== 1. Overview ==

POC1 validates that the core AKEL workflow (Article → Claims → Verdicts) can produce trustworthy fact-checking analyses automatically. This phase implements **2 critical quality gates** to prevent low-quality outputs from being published.

**Key Innovation:** Quality validation BEFORE publication, not after

**What We're Proving:**

* AKEL can reliably extract factual claims from articles
* AKEL can generate credible verdicts with proper evidence
* **AKEL can assess article credibility beyond simple claim averaging** (context-aware analysis)
* Quality gates prevent hallucinations and low-confidence outputs
* Fully automated approach is viable

== 2. Scope ==

=== In Scope ===

* Core AKEL workflow (claim extraction, verdict generation)
* **Gate 1:** Claim Validation (factual vs. opinion/prediction)
* **Gate 4:** Verdict Confidence Assessment (minimum 2 sources, quality thresholds)
* Basic UI to display results
* Manual quality tracking

=== Out of Scope (Deferred to POC2+) ===

* User accounts / authentication
* Corrections system
* Search engine optimization (ClaimReview schema)
* Image verification
* API endpoints
* Archive.org integration
* Security hardening
* A/B testing
* Gates 2 & 3 (Evidence relevance, Scenario coherence)

=== Experimental Features (POC1) ===

**Context-Aware Analysis** (Approach 1: Single-Pass Holistic)

**Goal:** Test whether AI can detect when an article's overall credibility differs from the average of its claim verdicts (e.g., accurate facts but a misleading conclusion).

**Implementation:**

* Enhanced AI prompt to evaluate logical structure
* AI identifies article's main argument
* AI assesses if conclusion follows from evidence
* Article verdict may differ from claim average

**Testing:**

* 30-article test set (10 straightforward, 10 misleading, 10 complex)
* Success criteria: ≥70% accuracy on misleading articles
* Marked as experimental - doesn't block POC1 success

**See:** [[Article Verdict Problem>>FactHarbor.Specification.POC.Article-Verdict-Problem]] for complete analysis

**Decision:**

* If ≥70% accuracy → ship in POC2
* If 50% to <70% → try weighted aggregation approach
* If <50% → defer to POC2 with different approach
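The decision rule above can be sketched as a small function. This is only an illustration: the function name and the returned strings are hypothetical, accuracy is assumed to be measured on the 10 misleading articles of the test set, and a result of exactly 70% is assumed to fall into the "ship" branch.

```python
def context_analysis_decision(accuracy: float) -> str:
    """Map accuracy on the misleading-article subset (0.0 to 1.0) to a next step."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0.0 and 1.0")
    if accuracy >= 0.70:
        return "ship in POC2"
    if accuracy >= 0.50:
        return "try weighted aggregation approach"
    return "defer to POC2 with different approach"


# Example: 6 of the 10 misleading articles were judged correctly.
decision = context_analysis_decision(6 / 10)
print(decision)
```

Because the feature is marked experimental, none of the three outcomes blocks POC1 success.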

== 3. Requirements ==

=== 3.1 NFR11: Quality Assurance Framework (POC1 Lite Version) ===

**Importance:** CRITICAL - Core POC1 Requirement

**Fulfills:** AI safety, credibility, prevents embarrassing failures

**Specification:**

AKEL must validate outputs before displaying to users. POC1 implements a **2-gate subset** of the full NFR11 framework.

==== Gate 1: Claim Validation ====

**Purpose:** Ensure extracted claims are factual assertions, not opinions or predictions

**Validation Checks:**

1. **Factual Statement Test:** Can this be verified with evidence?
2. **Opinion Detection:** Contains hedging or evaluative language? ("I think", "probably", "best", "worst")
3. **Specificity Score:** Contains concrete details? (names, numbers, dates, locations)
4. **Future Prediction Test:** Makes claims about future events?

**Pass Criteria:**

{{code}}
- isFactual: true
- opinionScore: ≤ 0.3
- specificityScore: ≥ 0.3
- claimType: FACTUAL
{{/code}}

**Action if Failed:**

* Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
* Do NOT generate scenarios or verdicts
* Display explanation to user

**Target:** 0% opinion statements processed as facts
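A minimal sketch of the Gate 1 check, assuming the four pass criteria are already available as fields on an assessment record. The class, function, and constant names are hypothetical, and the claim type values other than FACTUAL (OPINION, PREDICTION) are assumptions used only to pick the failure label.

```python
from dataclasses import dataclass
from typing import Optional

OPINION_MAX = 0.3       # opinionScore must be at or below this value
SPECIFICITY_MIN = 0.3   # specificityScore must be at or above this value


@dataclass(frozen=True)
class ClaimAssessment:
    is_factual: bool
    opinion_score: float
    specificity_score: float
    claim_type: str  # "FACTUAL" passes; other values are assumed labels


def passes_gate1(a: ClaimAssessment) -> bool:
    """Return True only if all four pass criteria hold."""
    return (
        a.is_factual
        and a.opinion_score <= OPINION_MAX
        and a.specificity_score >= SPECIFICITY_MIN
        and a.claim_type == "FACTUAL"
    )


def gate1_failure_label(a: ClaimAssessment) -> Optional[str]:
    """Return the flag shown to the user, or None if the claim passes."""
    if passes_gate1(a):
        return None
    if a.claim_type == "OPINION":
        return "Non-verifiable: Opinion"
    if a.claim_type == "PREDICTION":
        return "Non-verifiable: Prediction"
    return "Non-verifiable: Ambiguous"


claim = ClaimAssessment(is_factual=True, opinion_score=0.1,
                        specificity_score=0.8, claim_type="FACTUAL")
print(passes_gate1(claim))
```

A claim that fails this check would be flagged and excluded from scenario and verdict generation, as described in the failure actions.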

==== Gate 4: Verdict Confidence Assessment ====

**Purpose:** Only publish verdicts with sufficient evidence and confidence

**Validation Checks:**

1. **Evidence Count:** Minimum 2 independent sources
2. **Source Quality:** Average reliability ≥ 0.6 (on 0-1 scale)
3. **Evidence Agreement:** Share of supporting vs. contradicting evidence ≥ 0.6 (60%)
4. **Uncertainty Factors:** Count of hedging statements in reasoning

**Confidence Tiers:**

{{code}}
HIGH (80-100%):
- ≥3 sources
- ≥0.7 average quality
- ≥80% agreement

MEDIUM (50-79%):
- ≥2 sources
- ≥0.6 average quality
- ≥60% agreement

LOW (0-49%):
- ≥2 sources BUT low quality/agreement

INSUFFICIENT:
- <2 sources → DO NOT PUBLISH
{{/code}}

**POC1 Publication Rule:**

* Minimum **MEDIUM** confidence required
* Blocked verdicts show "Insufficient Evidence" message

**Target:** 0% verdicts published with <2 sources
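The tiers and the publication rule can be sketched as follows. This is an illustration under assumptions: the names are hypothetical, agreement is assumed to be the supporting share of all supporting plus contradicting evidence, and the uncertainty-factor count from check 4 is left out because the specification gives no threshold for it.

```python
from enum import Enum


class Tier(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"
    INSUFFICIENT = "INSUFFICIENT"


def agreement_share(supporting: int, contradicting: int) -> float:
    """Assumed definition: supporting / (supporting + contradicting)."""
    total = supporting + contradicting
    return supporting / total if total else 0.0


def confidence_tier(source_count: int, avg_quality: float, agreement: float) -> Tier:
    """Classify a verdict using the thresholds from the tier table."""
    if source_count < 2:
        return Tier.INSUFFICIENT
    if source_count >= 3 and avg_quality >= 0.7 and agreement >= 0.8:
        return Tier.HIGH
    if avg_quality >= 0.6 and agreement >= 0.6:
        return Tier.MEDIUM
    return Tier.LOW  # at least 2 sources, but low quality or agreement


def may_publish(tier: Tier) -> bool:
    """POC1 publication rule: MEDIUM or better."""
    return tier in (Tier.HIGH, Tier.MEDIUM)


tier = confidence_tier(source_count=3, avg_quality=0.75,
                       agreement=agreement_share(supporting=3, contradicting=1))
print(tier.value, may_publish(tier))
```

In the example, three sources with 75% agreement miss the 80% bar for HIGH, so the verdict lands in MEDIUM and is still publishable.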

=== 3.2 Modified FR7: Automated Verdicts (Enhanced) ===

**Enhancement for POC1:**

After AKEL generates a verdict, it must pass through the quality validation pipeline:

{{code}}
AKEL Workflow (POC1):

1. Extract claims from article
   ↓
2. [GATE 1] Validate each claim is fact-checkable
   ↓ (pass claims only)
3. Generate verdicts for each claim
   ↓
4. [GATE 4] Validate verdict has sufficient evidence
   ↓ (pass verdicts only)
5. Display to user

Failed claims/verdicts:
- Store in database with failure reason
- Display explanatory message to user
- Log for quality metrics tracking
{{/code}}

**Updated Verdict States:**

* PUBLISHED - Passed all gates
* INSUFFICIENT_EVIDENCE - Failed Gate 4
* NON_FACTUAL_CLAIM - Failed Gate 1
* PROCESSING - In progress
* ERROR - System failure
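One way to picture how the workflow maps onto these states is the sketch below. It is not the AKEL implementation: the function name is hypothetical, and the two gates and the verdict generator are passed in as plain callables so the control flow can be shown without any AI call. Storing failure reasons and logging are omitted.

```python
from enum import Enum
from typing import Any, Callable, Dict, List


class VerdictState(Enum):
    PUBLISHED = "PUBLISHED"
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"
    NON_FACTUAL_CLAIM = "NON_FACTUAL_CLAIM"
    PROCESSING = "PROCESSING"
    ERROR = "ERROR"


def process_claims(
    claims: List[str],
    gate1: Callable[[str], bool],
    generate_verdict: Callable[[str], Any],
    gate4: Callable[[Any], bool],
) -> Dict[str, VerdictState]:
    """Run each claim through Gate 1, verdict generation, and Gate 4."""
    # Every claim starts as PROCESSING (claims are keyed by their text here).
    states = {claim: VerdictState.PROCESSING for claim in claims}
    for claim in claims:
        try:
            if not gate1(claim):
                states[claim] = VerdictState.NON_FACTUAL_CLAIM
                continue  # no verdict is generated for failed claims
            verdict = generate_verdict(claim)
            if gate4(verdict):
                states[claim] = VerdictState.PUBLISHED
            else:
                states[claim] = VerdictState.INSUFFICIENT_EVIDENCE
        except Exception:
            states[claim] = VerdictState.ERROR
    return states


# Stub gates for demonstration only.
demo = process_claims(
    ["claim A", "opinion B", "claim C"],
    gate1=lambda c: not c.startswith("opinion"),
    generate_verdict=lambda c: {"sources": 3 if c == "claim A" else 1},
    gate4=lambda v: v["sources"] >= 2,
)
for text, state in demo.items():
    print(text, state.value)
```

Keeping the gates as injected callables means the thresholds can be tuned or swapped during POC1 without touching the pipeline itself.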

=== 3.3 Modified FR4: Analysis Summary (Enhanced) ===

**Enhancement for POC1:**

Analysis Summary must now display quality metadata:

{{code}}
Analysis Summary:
Total Claims Found: 5
Verifiable Claims: 3
Non-verifiable (Opinion): 1
Non-verifiable (Prediction): 1

Verdicts Generated: 3
High Confidence: 1
Medium Confidence: 2
Insufficient Evidence: 0

Evidence Sources: 12 total
Average Source Quality: 0.73

Quality Score: 8.5/10
{{/code}}
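The counts in such a summary could be derived from per-claim and per-verdict results roughly as follows. The function name, the input shapes, and the label strings used for claim types and tiers are assumptions. The Quality Score line is not computed, because the specification does not define its formula here.

```python
from collections import Counter
from typing import Dict, List, Optional, Union

SummaryValue = Union[int, float, None]


def build_summary(
    claim_types: List[str],        # one entry per extracted claim
    verdict_tiers: List[str],      # one entry per generated verdict
    source_qualities: List[float]  # one reliability score per evidence source
) -> Dict[str, SummaryValue]:
    """Aggregate gate results into the fields of the Analysis Summary."""
    types = Counter(claim_types)
    tiers = Counter(verdict_tiers)
    avg_quality: Optional[float] = None
    if source_qualities:
        avg_quality = round(sum(source_qualities) / len(source_qualities), 2)
    return {
        "Total Claims Found": len(claim_types),
        "Verifiable Claims": types["FACTUAL"],
        "Non-verifiable (Opinion)": types["OPINION"],
        "Non-verifiable (Prediction)": types["PREDICTION"],
        "Verdicts Generated": len(verdict_tiers),
        "High Confidence": tiers["HIGH"],
        "Medium Confidence": tiers["MEDIUM"],
        "Insufficient Evidence": tiers["INSUFFICIENT"],
        "Evidence Sources": len(source_qualities),
        "Average Source Quality": avg_quality,
    }


summary = build_summary(
    claim_types=["FACTUAL", "FACTUAL", "FACTUAL", "OPINION", "PREDICTION"],
    verdict_tiers=["HIGH", "MEDIUM", "MEDIUM"],
    source_qualities=[0.9, 0.7, 0.6, 0.8],
)
for label, value in summary.items():
    print(f"{label}: {value}")
```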

== 4. Success Criteria ==

POC1 is considered **SUCCESSFUL** if:

**✅ Functional:**

* Processes diverse test articles without crashes
* Generates verdicts for all factual claims
* Blocks all non-factual claims (0% pass through)
* Blocks all insufficient-evidence verdicts (0% with <2 sources)

**✅ Quality:**

* Hallucination rate <10% (manual verification)
* 0 verdicts with <2 sources published
* 0 opinion statements published as facts
* Average quality score ≥7.0/10
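Since quality is tracked manually in POC1, the four quality criteria could be checked against the tallied numbers with a helper like this sketch. The record and function names are hypothetical, and the hallucination rate is assumed to be hallucinated verdicts divided by manually reviewed verdicts.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class QualityMetrics:
    verdicts_reviewed: int
    hallucinated_verdicts: int
    published_with_under_two_sources: int
    opinions_published_as_facts: int
    average_quality_score: float  # on the 0-10 scale


def quality_criteria(m: QualityMetrics) -> Dict[str, bool]:
    """Evaluate each quality criterion separately so failures are visible."""
    if m.verdicts_reviewed <= 0:
        raise ValueError("at least one verdict must be reviewed")
    hallucination_rate = m.hallucinated_verdicts / m.verdicts_reviewed
    return {
        "hallucination rate <10%": hallucination_rate < 0.10,
        "0 verdicts with <2 sources": m.published_with_under_two_sources == 0,
        "0 opinions published as facts": m.opinions_published_as_facts == 0,
        "average quality score >=7.0": m.average_quality_score >= 7.0,
    }


metrics = QualityMetrics(
    verdicts_reviewed=50,
    hallucinated_verdicts=4,
    published_with_under_two_sources=0,
    opinions_published_as_facts=0,
    average_quality_score=7.4,
)
results = quality_criteria(metrics)
print(all(results.values()))
```

Returning one flag per criterion, rather than a single pass/fail value, makes it easier to see which threshold needs attention at the POC1 → POC2 decision.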

**✅ Performance:**

* Processing time reasonable for POC demonstration
* Quality gates execute efficiently
* UI displays results clearly

**✅ Learnings:**

* Identified prompt engineering improvements
* Documented AKEL strengths/weaknesses
* Validated threshold values
* Clear path to POC2 defined

== 5. Decision Gates ==

**POC1 → POC2 Decision:**

* **IF** hallucination rate ≥10% → Pause, improve prompts before POC2
* **IF** majority of claims non-processable → Rethink claim extraction approach
* **IF** quality gates too strict (excessive blocking) → Adjust thresholds
* **IF** quality gates too loose (hallucinations pass) → Tighten criteria

**Proceed to POC2 only if all success criteria are met.**

== 6. Architecture Notes ==

**POC1 Simplified Architecture:**

{{code}}
User Input → AKEL Processing    → Quality Gates → Display
             (claim extraction    (Gates 1 & 4)
              + verdicts)
{{/code}}

**vs. Full System (Future):**

{{code}}
Input → Claim Extractor → Scenario Generator → Evidence Linker
      → Verdict Generator → All 4 Gates → Review Queue → Publication
{{/code}}

**POC1 Acceptable Simplifications:**

* Single AKEL call (not multi-component pipeline)
* No scenarios (implicit in verdicts)
* Basic evidence linking
* 2 gates instead of 4
* No review queue

**See:** [[Architecture>>FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] for details

== Related Pages ==

* [[Roadmap Overview>>FactHarbor pre10 V0\.9\.70.Roadmap.WebHome]] - All phases
* [[POC2 Requirements>>FactHarbor pre10 V0\.9\.70.Roadmap.POC2.WebHome]] - Next phase
* [[Requirements>>FactHarbor pre10 V0\.9\.70.Specification.Requirements.WebHome]] - Full system requirements
* [[Architecture>>FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] - System architecture
* [[NFR11 Full Specification>>FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework

**Document Status:** ✅ POC1 Specification Complete - Ready for Implementation

**Version:** V0.9.70
