= POC1: Core Workflow with Quality Gates =

**Phase Goal:** Prove AKEL can produce credible, quality outputs without manual intervention

**Success Metric:** <10% hallucination rate, quality gates prevent low-confidence publications

== 1. Overview ==

POC1 validates that the core AKEL workflow (Article → Claims → Verdicts) can produce trustworthy fact-checking analyses automatically. This phase implements **2 critical quality gates** to prevent low-quality outputs from being published.

**Key Innovation:** Quality validation BEFORE publication, not after

**What We're Proving:**

* AKEL can reliably extract factual claims from articles
* AKEL can generate credible verdicts with proper evidence
* **AKEL can assess article credibility beyond simple claim averaging** (context-aware analysis)
* Quality gates prevent hallucinations and low-confidence outputs
* Fully automated approach is viable

== 2. Scope ==

=== In Scope ===

* Core AKEL workflow (claim extraction, verdict generation)
* **Gate 1:** Claim Validation (factual vs. opinion/prediction)
* **Gate 4:** Verdict Confidence Assessment (minimum 2 sources, quality thresholds)
* Basic UI to display results
* Manual quality tracking

=== Out of Scope (Deferred to POC2+) ===

* User accounts / authentication
* Corrections system
* Search engine optimization (ClaimReview schema)
* Image verification
* API endpoints
* Archive.org integration
* Security hardening
* A/B testing
* Gates 2 & 3 (Evidence relevance, Scenario coherence)

=== Experimental Features (POC1) ===

**Context-Aware Analysis** (Approach 1: Single-Pass Holistic)

**Goal:** Test if AI can detect when an article's overall credibility differs from the average of its claim verdicts (e.g., accurate facts but misleading conclusion).

**Implementation:**

* Enhanced AI prompt to evaluate logical structure
* AI identifies article's main argument
* AI assesses if conclusion follows from evidence
* Article verdict may differ from claim average

**Testing:**

* 30-article test set (10 straightforward, 10 misleading, 10 complex)
* Success criteria: ≥70% accuracy on misleading articles
* Marked as experimental - doesn't block POC1 success

**See:** [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]] for complete analysis

**Decision:**

* If ≥70% accuracy → ship in POC2
* If ≥50% and <70% → try weighted aggregation approach
* If <50% → defer to POC2 with a different approach
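
The decision thresholds above can be written as a small function. This is a minimal sketch, not part of the specification: the name `context_analysis_decision` is hypothetical, accuracy is assumed to be a fraction between 0 and 1 measured on the misleading-article subset, and a result of exactly 70% is assumed to meet the ≥70% bar.

```python
def context_analysis_decision(accuracy: float) -> str:
    """Map measured accuracy to the next step for context-aware analysis.

    `accuracy` is assumed to be a fraction in [0, 1].
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    if accuracy >= 0.70:
        return "ship in POC2"
    if accuracy >= 0.50:
        return "try weighted aggregation approach"
    return "defer to POC2 with different approach"
```

Checking the upper threshold first keeps the three bands non-overlapping, so every accuracy value maps to exactly one outcome.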

== 3. Requirements ==

=== 3.1 NFR11: Quality Assurance Framework (POC1 Lite Version) ===

**Importance:** CRITICAL - Core POC1 Requirement

**Fulfills:** AI safety, credibility, prevents embarrassing failures

**Specification:**

AKEL must validate outputs before displaying to users. POC1 implements a **2-gate subset** of the full NFR11 framework.

==== Gate 1: Claim Validation ====

**Purpose:** Ensure extracted claims are factual assertions, not opinions or predictions

**Validation Checks:**

1. **Factual Statement Test:** Can this be verified with evidence?
2. **Opinion Detection:** Contains hedging or evaluative language? ("I think", "probably", "best", "worst")
3. **Specificity Score:** Contains concrete details? (names, numbers, dates, locations)
4. **Future Prediction Test:** Makes claims about future events?

**Pass Criteria:**

{{code}}- isFactual: true
- opinionScore: ≤ 0.3
- specificityScore: ≥ 0.3
- claimType: FACTUAL{{/code}}

**Action if Failed:**

* Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
* Do NOT generate scenarios or verdicts
* Display explanation to user

**Target:** 0% opinion statements processed as facts
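
As an illustration, the Gate 1 pass criteria could be checked as follows. This is a minimal sketch under stated assumptions: the `ClaimAssessment` container and the `passes_gate_1` function are hypothetical names, both scores are assumed to lie on a 0-1 scale, and the claim type labels other than `FACTUAL` are placeholders. How AKEL actually produces the scores is not specified here.

```python
from dataclasses import dataclass

# Thresholds taken from the Gate 1 pass criteria.
MAX_OPINION_SCORE = 0.3
MIN_SPECIFICITY_SCORE = 0.3


@dataclass
class ClaimAssessment:
    """Hypothetical container for the scores assigned to one extracted claim."""
    is_factual: bool
    opinion_score: float      # assumed 0-1, higher means more opinion-like
    specificity_score: float  # assumed 0-1, higher means more concrete detail
    claim_type: str           # "FACTUAL", or an assumed label such as "OPINION"


def passes_gate_1(claim: ClaimAssessment) -> bool:
    """Return True only if every Gate 1 pass criterion holds."""
    return (
        claim.is_factual
        and claim.opinion_score <= MAX_OPINION_SCORE
        and claim.specificity_score >= MIN_SPECIFICITY_SCORE
        and claim.claim_type == "FACTUAL"
    )
```

A claim that fails any single criterion is rejected, which matches the rule that no scenarios or verdicts are generated for non-verifiable claims.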

==== Gate 4: Verdict Confidence Assessment ====

**Purpose:** Only publish verdicts with sufficient evidence and confidence

**Validation Checks:**

1. **Evidence Count:** Minimum 2 independent sources
2. **Source Quality:** Average reliability ≥ 0.6 (on 0-1 scale)
3. **Evidence Agreement:** Share of evidence supporting (rather than contradicting) the verdict ≥ 0.6
4. **Uncertainty Factors:** Count of hedging statements in reasoning

**Confidence Tiers:**

{{code}}HIGH (80-100%):
- ≥3 sources
- ≥0.7 average quality
- ≥80% agreement

MEDIUM (50-79%):
- ≥2 sources
- ≥0.6 average quality
- ≥60% agreement

LOW (0-49%):
- ≥2 sources BUT low quality/agreement

INSUFFICIENT:
- <2 sources → DO NOT PUBLISH{{/code}}

**POC1 Publication Rule:**

* Minimum **MEDIUM** confidence required
* Blocked verdicts show "Insufficient Evidence" message

**Target:** 0% verdicts published with <2 sources
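
The tier rules and the publication rule can be combined into a short classifier. This is a sketch under assumptions, not the definitive Gate 4 logic: the function names are hypothetical, agreement is assumed to be a fraction between 0 and 1, and the uncertainty-factor count is left out because no threshold is given for it.

```python
def confidence_tier(source_count: int, avg_quality: float, agreement: float) -> str:
    """Classify a verdict into a confidence tier.

    avg_quality and agreement are assumed to be fractions in [0, 1].
    """
    if source_count < 2:
        return "INSUFFICIENT"
    if source_count >= 3 and avg_quality >= 0.7 and agreement >= 0.8:
        return "HIGH"
    if avg_quality >= 0.6 and agreement >= 0.6:
        return "MEDIUM"
    # At least 2 sources, but quality or agreement is below the MEDIUM bar.
    return "LOW"


def may_publish(tier: str) -> bool:
    """POC1 publication rule: MEDIUM confidence or better is required."""
    return tier in ("MEDIUM", "HIGH")
```

For example, a verdict backed by two sources with excellent quality and agreement lands in MEDIUM rather than HIGH, because the HIGH tier requires at least three sources.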

=== 3.2 Modified FR7: Automated Verdicts (Enhanced) ===

**Enhancement for POC1:**

After AKEL generates a verdict, it must pass through the quality validation pipeline:

AKEL Workflow (POC1):

1. Extract claims from article
↓
2. [GATE 1] Validate each claim is fact-checkable
↓ (pass claims only)
3. Generate verdicts for each claim
↓
4. [GATE 4] Validate verdict has sufficient evidence
↓ (pass verdicts only)
5. Display to user

Failed claims/verdicts:
- Store in database with failure reason
- Display explanatory message to user
- Log for quality metrics tracking

**Updated Verdict States:**

* PUBLISHED - Passed all gates
* INSUFFICIENT_EVIDENCE - Failed Gate 4
* NON_FACTUAL_CLAIM - Failed Gate 1
* PROCESSING - In progress
* ERROR - System failure
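
One way to model these states is an enumeration plus a function that maps the two gate outcomes to a final state. This is a minimal sketch: the `VerdictState` enum mirrors the list above, while `final_state` and its parameters are hypothetical. `PROCESSING` and `ERROR` are assumed to be set elsewhere in the pipeline, so the function never returns them.

```python
from enum import Enum


class VerdictState(Enum):
    PUBLISHED = "PUBLISHED"                          # passed all gates
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"  # failed Gate 4
    NON_FACTUAL_CLAIM = "NON_FACTUAL_CLAIM"          # failed Gate 1
    PROCESSING = "PROCESSING"                        # in progress
    ERROR = "ERROR"                                  # system failure


def final_state(passed_gate_1: bool, passed_gate_4: bool) -> VerdictState:
    """Resolve the final state once both gates have run.

    Gate 1 is checked first because no verdict is generated for a claim
    that fails it, so the Gate 4 result is irrelevant in that case.
    """
    if not passed_gate_1:
        return VerdictState.NON_FACTUAL_CLAIM
    if not passed_gate_4:
        return VerdictState.INSUFFICIENT_EVIDENCE
    return VerdictState.PUBLISHED
```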

=== 3.3 Modified FR4: Analysis Summary (Enhanced) ===

**Enhancement for POC1:**

Analysis Summary must now display quality metadata:

Analysis Summary:
Total Claims Found: 5
Verifiable Claims: 3
Non-verifiable (Opinion): 1
Non-verifiable (Prediction): 1

Verdicts Generated: 3
High Confidence: 1
Medium Confidence: 2
Insufficient Evidence: 0

Evidence Sources: 12 total
Average Source Quality: 0.73

Quality Score: 8.5/10
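
The count fields of such a summary could be derived from per-claim and per-verdict labels, as in the sketch below. It rests on several assumptions: the function `build_summary` and the label strings are hypothetical, "Verdicts Generated" is assumed to count every verdict regardless of tier, and the overall Quality Score is omitted because its formula is not defined in this document.

```python
from collections import Counter


def build_summary(claim_types: list[str], verdict_tiers: list[str]) -> dict[str, int]:
    """Count claims and verdicts for the Analysis Summary.

    claim_types holds one assumed label per extracted claim
    ("FACTUAL", "OPINION", "PREDICTION"); verdict_tiers holds one
    confidence tier per generated verdict.
    """
    types = Counter(claim_types)
    tiers = Counter(verdict_tiers)
    return {
        "Total Claims Found": len(claim_types),
        "Verifiable Claims": types["FACTUAL"],
        "Non-verifiable (Opinion)": types["OPINION"],
        "Non-verifiable (Prediction)": types["PREDICTION"],
        "Verdicts Generated": len(verdict_tiers),
        "High Confidence": tiers["HIGH"],
        "Medium Confidence": tiers["MEDIUM"],
        "Insufficient Evidence": tiers["INSUFFICIENT"],
    }


# Reproduces the counts from the sample summary.
summary = build_summary(
    ["FACTUAL", "FACTUAL", "FACTUAL", "OPINION", "PREDICTION"],
    ["HIGH", "MEDIUM", "MEDIUM"],
)
```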

== 4. Success Criteria ==

POC1 is considered **SUCCESSFUL** if:

**✅ Functional:**

* Processes diverse test articles without crashes
* Generates verdicts for all factual claims
* Blocks all non-factual claims (0% pass through)
* Blocks all insufficient-evidence verdicts (0% with <2 sources)

**✅ Quality:**

* Hallucination rate <10% (manual verification)
* 0 verdicts with <2 sources published
* 0 opinion statements published as facts
* Average quality score ≥7.0/10
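
Since quality tracking is manual in POC1, the four quality criteria could be checked against hand-collected numbers with a helper like the one below. This is a sketch only: the function and parameter names are hypothetical, and the hallucination rate is assumed to be a fraction between 0 and 1.

```python
def quality_criteria_met(
    hallucination_rate: float,
    under_sourced_published: int,
    opinions_published_as_fact: int,
    average_quality_score: float,
) -> bool:
    """Return True only if all four quality criteria hold.

    hallucination_rate is assumed to be a fraction in [0, 1];
    average_quality_score is on the 0-10 scale used in the summary.
    """
    return (
        hallucination_rate < 0.10
        and under_sourced_published == 0
        and opinions_published_as_fact == 0
        and average_quality_score >= 7.0
    )
```

Note that a hallucination rate of exactly 10% does not satisfy the strict `<10%` criterion.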

**✅ Performance:**

* Processing time reasonable for POC demonstration
* Quality gates execute efficiently
* UI displays results clearly

**✅ Learnings:**

* Identified prompt engineering improvements
* Documented AKEL strengths/weaknesses
* Validated threshold values
* Clear path to POC2 defined

== 5. Decision Gates ==

**POC1 → POC2 Decision:**

* **IF** hallucination rate ≥10% → Pause, improve prompts before POC2
* **IF** majority of claims non-processable → Rethink claim extraction approach
* **IF** quality gates too strict (excessive blocking) → Adjust thresholds
* **IF** quality gates too loose (hallucinations pass) → Tighten criteria

**Only proceed to POC2 if all success criteria are met**

== 6. Architecture Notes ==

**POC1 Simplified Architecture:**

User Input → AKEL Processing (claim extraction + verdicts) → Quality Gates (Gates 1 & 4) → Display

**vs. Full System (Future):**

Input → Claim Extractor → Scenario Generator → Evidence Linker → Verdict Generator → All 4 Gates → Review Queue → Publication

**POC1 Acceptable Simplifications:**

* Single AKEL call (not multi-component pipeline)
* No scenarios (implicit in verdicts)
* Basic evidence linking
* 2 gates instead of 4
* No review queue

**See:** [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] for details

== Related Pages ==

* [[Roadmap Overview>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.WebHome]] - All phases
* [[POC2 Requirements>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.POC2.WebHome]] - Next phase
* [[Requirements>>Test.FactHarbor pre10 V0\.9\.70.Specification.Requirements.WebHome]] - Full system requirements
* [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] - System architecture
* [[NFR11 Full Specification>>Test.FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework

**Document Status:** ✅ POC1 Specification Complete - Ready for Implementation

**Version:** V0.9.70
