Wiki source code of POC1: Core Workflow with Quality Gates

Version 1.2 by Robert Schaub on 2025/12/21 13:38

author	version	line-number	content
		1	= POC1: Core Workflow with Quality Gates =
		2
		3	Phase Goal: Prove AKEL can produce credible, high-quality outputs without manual intervention
		4
		5	Success Metric: <10% hallucination rate, quality gates prevent low-confidence publications
		6
		7	---
		8
		9	== 1. Overview ==
		10
		11	POC1 validates that the core AKEL workflow (Article → Claims → Verdicts) can produce trustworthy fact-checking analyses automatically. This phase implements two critical quality gates (Gate 1 and Gate 4) to prevent low-quality outputs from being published.
		12
		13	Key Innovation: Quality validation BEFORE publication, not after
		14
		15	What We're Proving:
		16	* AKEL can reliably extract factual claims from articles
		17	* AKEL can generate credible verdicts with proper evidence
		18	* Quality gates prevent hallucinations and low-confidence outputs
		19	* Fully automated approach is viable
		20
		21	---
		22
		23	== 2. Scope ==
		24
		25	=== In Scope ===
		26
		27	* Core AKEL workflow (claim extraction, verdict generation)
		28	* Gate 1: Claim Validation (factual vs. opinion/prediction)
		29	* Gate 4: Verdict Confidence Assessment (minimum 2 sources, quality thresholds)
		30	* Basic UI to display results
		31	* Manual quality tracking
		32
		33	=== Out of Scope (Deferred to POC2+) ===
		34
		35	* User accounts / authentication
		36	* Corrections system
		37	* Search engine optimization (ClaimReview schema)
		38	* Image verification
		39	* API endpoints
		40	* Archive.org integration
		41	* Security hardening
		42	* A/B testing
		43	* Gates 2 & 3 (Evidence relevance, Scenario coherence)
		44
		45	---
		46
		47	== 3. Requirements ==
		48
		49	=== 3.1 NFR11-POC1: Quality Assurance Framework (Lite) ===
		50
		51	Priority: CRITICAL - Core POC1 Requirement
		52	Fulfills: AI safety, credibility, prevention of embarrassing failures
		53
		54	Specification:
		55
		56	AKEL must validate outputs before displaying them to users. POC1 implements a 2-gate subset of the full NFR11 framework.
		57
		58	==== Gate 1: Claim Validation ====
		59
		60	Purpose: Ensure extracted claims are factual assertions, not opinions or predictions
		61
		62	Validation Checks:
		63	1. Factual Statement Test: Can this be verified with evidence?
		64	2. Opinion Detection: Contains hedging or evaluative language? (hedging: "I think", "probably"; evaluative: "best", "worst")
		65	3. Specificity Score: Contains concrete details? (names, numbers, dates, locations)
		66	4. Future Prediction Test: Makes claims about future events?
		67
		68	Pass Criteria:
		69	{{code}}
		70	- isFactual: true
		71	- opinionScore: ≤ 0.3
		72	- specificityScore: ≥ 0.3
		73	- claimType: FACTUAL
		74	{{/code}}
		75
		76	Action if Failed:
		77	* Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
		78	* Do NOT generate scenarios or verdicts
		79	* Display explanation to user
		80
		81	Target: 0% opinion statements processed as facts
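
The pass criteria above can be sketched as a single predicate. This is a minimal illustration, not the AKEL implementation: the ##ClaimAssessment## class, its field names, the 0-1 score scale, and the ##OPINION##/##PREDICTION## type labels are assumptions introduced here; only the thresholds and the failure flag wording come from this specification.

{{code language="python"}}
from dataclasses import dataclass
from typing import Optional

# Thresholds from the Gate 1 pass criteria.
MAX_OPINION_SCORE = 0.3
MIN_SPECIFICITY_SCORE = 0.3


@dataclass
class ClaimAssessment:
    """Hypothetical container for the Gate 1 check results of one claim."""
    text: str
    is_factual: bool
    opinion_score: float      # assumed scale: 0.0 (none) to 1.0 (pure opinion)
    specificity_score: float  # assumed scale: 0.0 (vague) to 1.0 (concrete)
    claim_type: str           # assumed labels: "FACTUAL", "OPINION", "PREDICTION"


def passes_gate1(claim: ClaimAssessment) -> bool:
    """Return True only if all four pass criteria hold."""
    return (
        claim.is_factual
        and claim.opinion_score <= MAX_OPINION_SCORE
        and claim.specificity_score >= MIN_SPECIFICITY_SCORE
        and claim.claim_type == "FACTUAL"
    )


def gate1_failure_label(claim: ClaimAssessment) -> Optional[str]:
    """Return the flag shown to the user, or None if the claim passes."""
    if passes_gate1(claim):
        return None
    if claim.claim_type == "PREDICTION":
        return "Non-verifiable: Prediction"
    if claim.claim_type == "OPINION" or claim.opinion_score > MAX_OPINION_SCORE:
        return "Non-verifiable: Opinion"
    # Anything else that fails (e.g. low specificity) is treated as ambiguous.
    return "Non-verifiable: Ambiguous"
{{/code}}

A claim that returns a label here would skip verdict generation entirely, matching the "Action if Failed" list.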
		82
		83	---
		84
		85	==== Gate 4: Verdict Confidence Assessment ====
		86
		87	Purpose: Only publish verdicts with sufficient evidence and confidence
		88
		89	Validation Checks:
		90	1. Evidence Count: Minimum 2 independent sources
		91	2. Source Quality: Average reliability ≥ 0.6 (on 0-1 scale)
		92	3. Evidence Agreement: Share of sources supporting (vs. contradicting) the verdict ≥ 0.6 (i.e. 60%)
		93	4. Uncertainty Factors: Count of hedging statements in reasoning
		94
		95	Confidence Tiers:
		96	{{code}}
		97	HIGH (80-100%):
		98	- ≥3 sources
		99	- ≥0.7 average quality
		100	- ≥80% agreement
		101
		102	MEDIUM (50-79%):
		103	- ≥2 sources
		104	- ≥0.6 average quality
		105	- ≥60% agreement
		106
		107	LOW (0-49%):
		108	- ≥2 sources BUT below the MEDIUM thresholds (<0.6 average quality OR <60% agreement)
		109
		110	INSUFFICIENT:
		111	- <2 sources → DO NOT PUBLISH
		112	{{/code}}
		113
		114	POC1 Publication Rule:
		115	* Minimum MEDIUM confidence required
		116	* Blocked verdicts show "Insufficient Evidence" message
		117
		118	Target: 0% verdicts published with <2 sources
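
The tier table and publication rule can be expressed as a small classifier. The sketch below assumes agreement is passed as a fraction (0.8 for 80%) and uses hypothetical names (##Tier##, ##confidence_tier##, ##may_publish##). The uncertainty-factor count is left out because this specification does not give a threshold for it.

{{code language="python"}}
from enum import Enum


class Tier(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"
    INSUFFICIENT = "INSUFFICIENT"


def confidence_tier(source_count: int, avg_quality: float, agreement: float) -> Tier:
    """Map evidence statistics to a confidence tier.

    avg_quality is the mean source reliability on a 0-1 scale;
    agreement is the fraction of sources supporting the verdict.
    """
    if source_count < 2:
        return Tier.INSUFFICIENT
    if source_count >= 3 and avg_quality >= 0.7 and agreement >= 0.8:
        return Tier.HIGH
    if avg_quality >= 0.6 and agreement >= 0.6:
        return Tier.MEDIUM
    # At least 2 sources, but quality or agreement is below the MEDIUM bar.
    return Tier.LOW


def may_publish(tier: Tier) -> bool:
    """POC1 publication rule: MEDIUM confidence or better."""
    return tier in (Tier.HIGH, Tier.MEDIUM)
{{/code}}

Note that two excellent sources still yield MEDIUM, since HIGH requires at least three sources.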
		119
		120	---
		121
		122	=== 3.2 Modified FR7: Automated Verdicts (Enhanced) ===
		123
		124	Enhancement for POC1:
		125
		126	After AKEL generates a verdict, it must pass through the quality validation pipeline:
		127
		128	{{code}}
		129	AKEL Workflow (POC1):
		130
		131	1. Extract claims from article
		132	↓
		133	2. [GATE 1] Validate each claim is fact-checkable
		134	↓ (pass claims only)
		135	3. Generate verdicts for each claim
		136	↓
		137	4. [GATE 4] Validate verdict has sufficient evidence
		138	↓ (pass verdicts only)
		139	5. Display to user
		140
		141	Failed claims/verdicts:
		142	- Store in database with failure reason
		143	- Display explanatory message to user
		144	- Log for quality metrics tracking
		145	{{/code}}
		146
		147	Updated Verdict States:
		148	* PUBLISHED - Passed all gates
		149	* INSUFFICIENT_EVIDENCE - Failed Gate 4
		150	* NON_FACTUAL_CLAIM - Failed Gate 1
		151	* PROCESSING - In progress
		152	* ERROR - System failure
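
As a rough illustration of how the workflow steps map onto these states, the sketch below wires the two gates around verdict generation. The extraction, gate, and verdict functions are passed in as parameters because their real interfaces are not defined here; ##run_workflow## and the record layout are assumptions made for this example.

{{code language="python"}}
from enum import Enum
from typing import Any, Callable, Dict, List


class VerdictState(Enum):
    PUBLISHED = "PUBLISHED"
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"
    NON_FACTUAL_CLAIM = "NON_FACTUAL_CLAIM"
    PROCESSING = "PROCESSING"
    ERROR = "ERROR"


def run_workflow(
    article: str,
    extract_claims: Callable[[str], List[Any]],
    gate1: Callable[[Any], bool],
    generate_verdict: Callable[[Any], Any],
    gate4: Callable[[Any], bool],
) -> List[Dict[str, Any]]:
    """Run each extracted claim through Gate 1, verdict generation, and Gate 4."""
    records = []
    for claim in extract_claims(article):
        record = {"claim": claim, "verdict": None, "state": VerdictState.PROCESSING}
        try:
            if not gate1(claim):
                # Failed Gate 1: no verdict is generated for this claim.
                record["state"] = VerdictState.NON_FACTUAL_CLAIM
            else:
                verdict = generate_verdict(claim)
                if gate4(verdict):
                    record["verdict"] = verdict
                    record["state"] = VerdictState.PUBLISHED
                else:
                    record["state"] = VerdictState.INSUFFICIENT_EVIDENCE
        except Exception:
            # Any failure in the injected functions is recorded as a system error.
            record["state"] = VerdictState.ERROR
        records.append(record)
    return records
{{/code}}

Every claim ends up with a record regardless of outcome, so failed claims and verdicts remain available for storage and quality metrics tracking.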
		153
		154	---
		155
		156	=== 3.3 Modified FR4: Analysis Summary (Enhanced) ===
		157
		158	Enhancement for POC1:
		159
		160	Analysis Summary must now display quality metadata:
		161
		162	{{code}}
		163	Analysis Summary:
		164	Total Claims Found: 5
		165	Verifiable Claims: 3
		166	Non-verifiable (Opinion): 1
		167	Non-verifiable (Prediction): 1
		168
		169	Verdicts Generated: 3
		170	High Confidence: 1
		171	Medium Confidence: 2
		172	Insufficient Evidence: 0
		173
		174	Evidence Sources: 12 total
		175	Average Source Quality: 0.73
		176
		177	Quality Score: 8.5/10
		178	{{/code}}
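
Most of these figures are simple counts over the per-claim results. The sketch below shows one way to derive them, assuming each claim is a dict with a ##claim_type## key and, for claims that reached verdict generation, a ##tier## and a ##source_quality## list (all hypothetical field names). The overall Quality Score is omitted because its formula is not defined in this specification.

{{code language="python"}}
from collections import Counter
from statistics import mean
from typing import Any, Dict, List


def summarize(claims: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate per-claim results into the Analysis Summary fields."""
    types = Counter(c["claim_type"] for c in claims)
    # Only claims that passed Gate 1 carry a confidence tier.
    verdicts = [c for c in claims if c.get("tier") is not None]
    tiers = Counter(c["tier"] for c in verdicts)
    qualities = [q for c in verdicts for q in c.get("source_quality", [])]
    return {
        "total_claims": len(claims),
        "verifiable_claims": types["FACTUAL"],
        "non_verifiable_opinion": types["OPINION"],
        "non_verifiable_prediction": types["PREDICTION"],
        "verdicts_generated": len(verdicts),
        "high_confidence": tiers["HIGH"],
        "medium_confidence": tiers["MEDIUM"],
        "insufficient_evidence": tiers["INSUFFICIENT"],
        "evidence_sources": len(qualities),
        "average_source_quality": round(mean(qualities), 2) if qualities else None,
    }
{{/code}}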
		179
		180	---
		181
		182	== 4. Success Criteria ==
		183
		184	POC1 is considered SUCCESSFUL if:
		185
		186	✅ Functional:
		187	* Processes diverse test articles without crashes
		188	* Generates verdicts for all factual claims
		189	* Blocks all non-factual claims (0% pass through)
		190	* Blocks all insufficient-evidence verdicts (0% with <2 sources)
		191
		192	✅ Quality:
		193	* Hallucination rate <10% (manual verification)
		194	* 0 verdicts with <2 sources published
		195	* 0 opinion statements published as facts
		196	* Average quality score ≥7.0/10
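
The four quality criteria can be checked mechanically once the manual review data is collected. The sketch below assumes the hallucination rate is the number of verdicts judged hallucinated divided by the number of verdicts manually reviewed; that definition and the function and parameter names are illustrative, not part of the specification.

{{code language="python"}}
from statistics import mean
from typing import List


def meets_quality_criteria(
    hallucinated: int,
    reviewed: int,
    published_under_two_sources: int,
    opinions_published_as_facts: int,
    quality_scores: List[float],
) -> bool:
    """Return True if all four POC1 quality criteria are satisfied."""
    if reviewed <= 0 or not quality_scores:
        raise ValueError("need at least one reviewed verdict and one quality score")
    hallucination_rate = hallucinated / reviewed  # assumed definition
    return (
        hallucination_rate < 0.10
        and published_under_two_sources == 0
        and opinions_published_as_facts == 0
        and mean(quality_scores) >= 7.0  # scores on the 0-10 scale
    )
{{/code}}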
		197
		198	✅ Performance:
		199	* Processing time reasonable for POC demonstration
		200	* Quality gates execute efficiently
		201	* UI displays results clearly
		202
		203	✅ Learnings:
		204	* Identified prompt engineering improvements
		205	* Documented AKEL strengths/weaknesses
		206	* Validated threshold values
		207	* Clear path to POC2 defined
		208
		209	---
		210
		211	== 5. Decision Gates ==
		212
		213	POC1 → POC2 Decision:
		214
		215	* IF hallucination rate ≥10% → Pause, improve prompts before POC2
		216	* IF majority of claims non-processable → Rethink claim extraction approach
		217	* IF quality gates too strict (excessive blocking) → Adjust thresholds
		218	* IF quality gates too loose (hallucinations pass) → Tighten criteria
		219
		220	Only proceed to POC2 if all success criteria are met.
		221
		222	---
		223
		224	== 6. Architecture Notes ==
		225
		226	POC1 Simplified Architecture:
		227
		228	{{code}}
		229	User Input → AKEL Processing → Quality Gates → Display
		230	             (claim extraction (Gates 1 & 4)
		231	              + verdicts)
		232	{{/code}}
		233
		234	vs. Full System (Future):
		235
		236	{{code}}
		237	Input → Claim Extractor → Scenario Generator → Evidence Linker
		238	→ Verdict Generator → All 4 Gates → Review Queue → Publication
		239	{{/code}}
		240
		241	POC1 Acceptable Simplifications:
		242	* Single AKEL call (not multi-component pipeline)
		243	* No scenarios (implicit in verdicts)
		244	* Basic evidence linking
		245	* 2 gates instead of 4
		246	* No review queue
		247
		248	See: [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] for details
		249
		250	---
		251
		252	== Related Pages ==
		253
		254	* [[Roadmap Overview>>Test.FactHarbor.Roadmap.WebHome]] - All phases
		255	* [[POC2 Requirements>>Test.FactHarbor.Roadmap.POC2.WebHome]] - Next phase
		256	* [[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]] - Full system requirements
		257	* [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] - System architecture
		258	* [[NFR11 Full Specification>>Test.FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework
		259
		260	---
		261
		262	Document Status: ✅ POC1 Specification Complete - Ready for Implementation
		263	Version: V0.9.70