Wiki source code of POC1: Core Workflow with Quality Gates

Last modified by Robert Schaub on 2025/12/23 16:13

		1	= POC1: Core Workflow with Quality Gates =
		2
		3	Phase Goal: Prove that AKEL can produce credible, high-quality outputs without manual intervention
		4
		5	Success Metric: <10% hallucination rate, quality gates prevent low-confidence publications
		6
		7
		8	== 1. Overview ==
		9
		10	POC1 validates that the core AKEL workflow (Article → Claims → Verdicts) can produce trustworthy fact-checking analyses automatically. This phase implements two critical quality gates (Gate 1 and Gate 4) to prevent low-quality outputs from being published.
		11
		12	Key Innovation: Quality validation BEFORE publication, not after
		13
		14	What We're Proving:
		15
		16	* AKEL can reliably extract factual claims from articles
		17	* AKEL can generate credible verdicts with proper evidence
		18	* Quality gates prevent hallucinations and low-confidence outputs
		19	* Fully automated approach is viable
		20
		21	== 2. Scope ==
		22
		23	=== In Scope ===
		24
		25	* Core AKEL workflow (claim extraction, verdict generation)
		26	* Gate 1: Claim Validation (factual vs. opinion/prediction)
		27	* Gate 4: Verdict Confidence Assessment (minimum 2 sources, quality thresholds)
		28	* Basic UI to display results
		29	* Manual quality tracking
		30
		31	=== Out of Scope (Deferred to POC2+) ===
		32
		33	* User accounts / authentication
		34	* Corrections system
		35	* Search engine optimization (ClaimReview schema)
		36	* Image verification
		37	* API endpoints
		38	* Archive.org integration
		39	* Security hardening
		40	* A/B testing
		41	* Gates 2 & 3 (Evidence relevance, Scenario coherence)
		42
		43	== 3. Requirements ==
		44
		45	=== 3.1 NFR11: Quality Assurance Framework (POC1 Lite Version) ===
		46
		47	Importance: CRITICAL - Core POC1 Requirement
		48	Fulfills: AI safety, credibility, prevention of embarrassing failures
		49
		50	Specification:
		51
		52	AKEL must validate outputs before displaying them to users. POC1 implements a 2-gate subset of the full NFR11 framework.
		53
		54	==== Gate 1: Claim Validation ====
		55
		56	Purpose: Ensure extracted claims are factual assertions, not opinions or predictions
		57
		58	Validation Checks:
		59
		60	1. Factual Statement Test: Can this be verified with evidence?
		61	2. Opinion Detection: Contains hedging or evaluative language? ("I think", "probably", "best", "worst")
		62	3. Specificity Score: Contains concrete details? (names, numbers, dates, locations)
		63	4. Future Prediction Test: Makes claims about future events?
		64
		65	Pass Criteria:
		66	{{code}}- isFactual: true
		67	- opinionScore: ≤ 0.3
		68	- specificityScore: ≥ 0.3
		69	- claimType: FACTUAL{{/code}}
		70
		71	Action if Failed:
		72
		73	* Flag as "Non-verifiable: Opinion/Prediction/Ambiguous"
		74	* Do NOT generate scenarios or verdicts
		75	* Display explanation to user
		76
		77	Target: 0% opinion statements processed as facts
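The Gate 1 pass criteria can be sketched as a single predicate. This is a minimal illustration, not the AKEL implementation: the `ClaimAssessment` container and the `passes_gate1` function are hypothetical names, the snake_case fields mirror `isFactual`, `opinionScore`, `specificityScore` and `claimType`, and both scores are assumed to lie in the range 0 to 1. Only the `FACTUAL` claim type is named in this specification; any other value used below is a placeholder.

```python
from dataclasses import dataclass

# Thresholds copied from the Gate 1 pass criteria.
MAX_OPINION_SCORE = 0.3
MIN_SPECIFICITY_SCORE = 0.3


@dataclass
class ClaimAssessment:
    """Hypothetical container for the four Gate 1 fields."""
    is_factual: bool          # isFactual
    opinion_score: float      # opinionScore, assumed to lie in [0, 1]
    specificity_score: float  # specificityScore, assumed to lie in [0, 1]
    claim_type: str           # claimType; only "FACTUAL" passes


def passes_gate1(assessment: ClaimAssessment) -> bool:
    """Return True only if all four pass criteria hold."""
    return (
        assessment.is_factual
        and assessment.opinion_score <= MAX_OPINION_SCORE
        and assessment.specificity_score >= MIN_SPECIFICITY_SCORE
        and assessment.claim_type == "FACTUAL"
    )
```

A claim that fails this check would be flagged as non-verifiable and excluded from verdict generation, as described under "Action if Failed".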
		78
		79
		80	==== Gate 4: Verdict Confidence Assessment ====
		81
		82	Purpose: Only publish verdicts with sufficient evidence and confidence
		83
		84	Validation Checks:
		85
		86	1. Evidence Count: Minimum 2 independent sources
		87	2. Source Quality: Average reliability ≥ 0.6 (on 0-1 scale)
		88	3. Evidence Agreement: Share of evidence supporting (rather than contradicting) the verdict ≥ 0.6 (60%)
		89	4. Uncertainty Factors: Count of hedging statements in reasoning
		90
		91	Confidence Tiers:
		92	{{code}}HIGH (80-100%):
		93	- ≥3 sources
		94	- ≥0.7 average quality
		95	- ≥80% agreement
		96
		97	MEDIUM (50-79%):
		98	- ≥2 sources
		99	- ≥0.6 average quality
		100	- ≥60% agreement
		101
		102	LOW (0-49%):
		103	- ≥2 sources, but average quality <0.6 or agreement <60%
		104
		105	INSUFFICIENT:
		106	- <2 sources → DO NOT PUBLISH{{/code}}
		107
		108	POC1 Publication Rule:
		109
		110	* Minimum MEDIUM confidence required
		111	* Blocked verdicts show "Insufficient Evidence" message
		112
		113	Target: 0% verdicts published with <2 sources
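One possible reading of the confidence tiers and the POC1 publication rule is sketched below. The names `Confidence`, `confidence_tier` and `may_publish` are hypothetical, and `agreement` is assumed to be a fraction between 0 and 1 (so 80% is 0.8). The "Uncertainty Factors" check is left out because this specification gives no threshold for it.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"
    INSUFFICIENT = "INSUFFICIENT"


def confidence_tier(source_count: int, avg_quality: float, agreement: float) -> Confidence:
    """Map evidence statistics to a confidence tier."""
    if source_count < 2:
        # Fewer than 2 sources: never publish.
        return Confidence.INSUFFICIENT
    if source_count >= 3 and avg_quality >= 0.7 and agreement >= 0.8:
        return Confidence.HIGH
    if avg_quality >= 0.6 and agreement >= 0.6:
        return Confidence.MEDIUM
    # At least 2 sources, but quality or agreement is below the MEDIUM thresholds.
    return Confidence.LOW


def may_publish(tier: Confidence) -> bool:
    """POC1 publication rule: minimum MEDIUM confidence."""
    return tier in (Confidence.HIGH, Confidence.MEDIUM)
```

Under this reading, a verdict backed by two excellent sources is still capped at MEDIUM, because HIGH requires at least three sources.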
		114
		115
		116	=== 3.2 Modified FR7: Automated Verdicts (Enhanced) ===
		117
		118	Enhancement for POC1:
		119
		120	After AKEL generates a verdict, it must pass through the quality validation pipeline:
		121
		122	{{code}}
		123	AKEL Workflow (POC1):
		124
		125	1. Extract claims from article
		126	↓
		127	2. [GATE 1] Validate each claim is fact-checkable
		128	↓ (pass claims only)
		129	3. Generate verdicts for each claim
		130	↓
		131	4. [GATE 4] Validate verdict has sufficient evidence
		132	↓ (pass verdicts only)
		133	5. Display to user
		134
		135	Failed claims/verdicts:
		136	- Store in database with failure reason
		137	- Display explanatory message to user
		138	- Log for quality metrics tracking
		139	{{/code}}
		140
		141	Updated Verdict States:
		142
		143	* PUBLISHED - Passed all gates
		144	* INSUFFICIENT_EVIDENCE - Failed Gate 4
		145	* NON_FACTUAL_CLAIM - Failed Gate 1
		146	* PROCESSING - In progress
		147	* ERROR - System failure
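The workflow and verdict states above could be wired together roughly as follows. This is a sketch under stated assumptions: `run_workflow` and its three callables (`gate1`, `generate_verdict`, `gate4`) are hypothetical stand-ins for the AKEL components, claims are plain strings, and a verdict is an opaque dictionary. Persistence, user messages and metrics logging are omitted. `PROCESSING` is declared but unused because the sketch runs synchronously.

```python
from enum import Enum
from typing import Callable, List, Tuple


class VerdictState(Enum):
    PUBLISHED = "PUBLISHED"
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"
    NON_FACTUAL_CLAIM = "NON_FACTUAL_CLAIM"
    PROCESSING = "PROCESSING"
    ERROR = "ERROR"


def run_workflow(
    claims: List[str],
    gate1: Callable[[str], bool],
    generate_verdict: Callable[[str], dict],
    gate4: Callable[[dict], bool],
) -> List[Tuple[str, VerdictState]]:
    """Route each extracted claim through Gate 1, verdict generation and Gate 4."""
    results = []
    for claim in claims:
        try:
            if not gate1(claim):
                # Failed Gate 1: no verdict is generated for this claim.
                results.append((claim, VerdictState.NON_FACTUAL_CLAIM))
                continue
            verdict = generate_verdict(claim)
            if gate4(verdict):
                results.append((claim, VerdictState.PUBLISHED))
            else:
                results.append((claim, VerdictState.INSUFFICIENT_EVIDENCE))
        except Exception:
            # Any failure inside a component is recorded as a system error.
            results.append((claim, VerdictState.ERROR))
    return results
```

Passing the gates in as callables keeps the routing logic testable with simple stubs before the real components exist.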
		148
		149	=== 3.3 Modified FR4: Analysis Summary (Enhanced) ===
		150
		151	Enhancement for POC1:
		152
		153	Analysis Summary must now display quality metadata:
		154
		155	{{code}}
		156	Analysis Summary:
		157	Total Claims Found: 5
		158	Verifiable Claims: 3
		159	Non-verifiable (Opinion): 1
		160	Non-verifiable (Prediction): 1
		161
		162	Verdicts Generated: 3
		163	High Confidence: 1
		164	Medium Confidence: 2
		165	Insufficient Evidence: 0
		166
		167	Evidence Sources: 12 total
		168	Average Source Quality: 0.73
		169
		170	Quality Score: 8.5/10
		171	{{/code}}
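The counts in the summary can be derived from per-claim and per-verdict results, for example as sketched here. The function name `summarize`, the dictionary keys, and the `OPINION` and `PREDICTION` labels are assumptions for illustration. The overall Quality Score is not computed because this specification does not define its formula.

```python
from collections import Counter
from typing import Dict, List


def summarize(
    claim_types: List[str],
    verdict_tiers: List[str],
    source_qualities: List[float],
) -> Dict[str, object]:
    """Aggregate quality metadata for the Analysis Summary."""
    types = Counter(claim_types)
    tiers = Counter(verdict_tiers)
    # Average is undefined when no evidence sources were collected.
    average = (
        round(sum(source_qualities) / len(source_qualities), 2)
        if source_qualities
        else None
    )
    return {
        "total_claims": len(claim_types),
        "verifiable_claims": types["FACTUAL"],
        "non_verifiable_opinion": types["OPINION"],
        "non_verifiable_prediction": types["PREDICTION"],
        "verdicts_generated": len(verdict_tiers),
        "high_confidence": tiers["HIGH"],
        "medium_confidence": tiers["MEDIUM"],
        "insufficient_evidence": tiers["INSUFFICIENT"],
        "evidence_sources": len(source_qualities),
        "average_source_quality": average,
    }
```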
		172
		173
		174	== 4. Success Criteria ==
		175
		176	POC1 is considered SUCCESSFUL if:
		177
		178	✅ Functional:
		179
		180	* Processes diverse test articles without crashes
		181	* Generates verdicts for all factual claims
		182	* Blocks all non-factual claims (0% pass through)
		183	* Blocks all insufficient-evidence verdicts (0% with <2 sources)
		184
		185	✅ Quality:
		186
		187	* Hallucination rate <10% (manual verification)
		188	* 0 verdicts with <2 sources published
		189	* 0 opinion statements published as facts
		190	* Average quality score ≥7.0/10
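The four quality criteria are numeric, so they can be checked mechanically once the manual tracking data is available. The sketch below assumes a hypothetical `QualityMetrics` record with the hallucination rate expressed as a fraction (10% is 0.10); it returns the names of the criteria that are not met.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QualityMetrics:
    """Hypothetical record of manually tracked POC1 results."""
    hallucination_rate: float            # fraction, from manual verification
    verdicts_published_under_2_sources: int
    opinions_published_as_facts: int
    average_quality_score: float         # on the 0-10 scale


def unmet_quality_criteria(m: QualityMetrics) -> List[str]:
    """Return the quality criteria that fail; an empty list means all pass."""
    unmet = []
    if not m.hallucination_rate < 0.10:
        unmet.append("hallucination_rate")
    if m.verdicts_published_under_2_sources != 0:
        unmet.append("verdicts_published_under_2_sources")
    if m.opinions_published_as_facts != 0:
        unmet.append("opinions_published_as_facts")
    if not m.average_quality_score >= 7.0:
        unmet.append("average_quality_score")
    return unmet
```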
		191
		192	✅ Performance:
		193
		194	* Processing time reasonable for POC demonstration
		195	* Quality gates execute efficiently
		196	* UI displays results clearly
		197
		198	✅ Learnings:
		199
		200	* Identified prompt engineering improvements
		201	* Documented AKEL strengths/weaknesses
		202	* Validated threshold values
		203	* Clear path to POC2 defined
		204
		205	== 5. Decision Gates ==
		206
		207	POC1 → POC2 Decision:
		208
		209	* IF hallucination rate >10% → Pause, improve prompts before POC2
		210	* IF majority of claims non-processable → Rethink claim extraction approach
		211	* IF quality gates too strict (excessive blocking) → Adjust thresholds
		212	* IF quality gates too loose (hallucinations pass) → Tighten criteria
		213
		214	Only proceed to POC2 if all success criteria are met.
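The decision rules can be expressed as a small lookup that returns the follow-up actions triggered by the POC1 results. This is an illustrative sketch: `poc2_actions` is a hypothetical name, "majority" is read as a share above 0.5, and whether the gates are too strict or too loose is passed in as a human judgment because this specification sets no numeric threshold for either.

```python
from typing import List


def poc2_actions(
    hallucination_rate: float,
    non_processable_share: float,
    gates_too_strict: bool,
    gates_too_loose: bool,
) -> List[str]:
    """Return the follow-up actions triggered by POC1 results."""
    actions = []
    if hallucination_rate > 0.10:
        actions.append("Pause, improve prompts before POC2")
    if non_processable_share > 0.5:
        actions.append("Rethink claim extraction approach")
    if gates_too_strict:
        actions.append("Adjust thresholds")
    if gates_too_loose:
        actions.append("Tighten criteria")
    return actions
```

An empty result only means that none of these rules fired; proceeding to POC2 still depends on all success criteria being met.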
		215
		216
		217	== 6. Architecture Notes ==
		218
		219	POC1 Simplified Architecture:
		220
		221	{{code}}
		222	User Input → AKEL Processing (claim extraction + verdicts) → Quality Gates (Gates 1 & 4) → Display
		225	{{/code}}
		226
		227	vs. Full System (Future):
		228
		229	{{code}}
		230	Input → Claim Extractor → Scenario Generator → Evidence Linker
		231	→ Verdict Generator → All 4 Gates → Review Queue → Publication
		232	{{/code}}
		233
		234	POC1 Acceptable Simplifications:
		235
		236	* Single AKEL call (not multi-component pipeline)
		237	* No scenarios (implicit in verdicts)
		238	* Basic evidence linking
		239	* 2 gates instead of 4
		240	* No review queue
		241
		242	See: [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] for details
		243
		244
		245	== Related Pages ==
		246
		247	* [[Roadmap Overview>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.WebHome]] - All phases
		248	* [[POC2 Requirements>>Test.FactHarbor pre10 V0\.9\.70.Roadmap.POC2.WebHome]] - Next phase
		249	* [[Requirements>>Test.FactHarbor pre10 V0\.9\.70.Specification.Requirements.WebHome]] - Full system requirements
		250	* [[Architecture>>Test.FactHarbor pre10 V0\.9\.70.Specification.Architecture.WebHome]] - System architecture
		251	* [[NFR11 Full Specification>>Test.FactHarbor.Specification.Requirements.WebHome#NFR11]] - Complete quality framework
		252
		253	Document Status: ✅ POC1 Specification Complete - Ready for Implementation
		254	Version: V0.9.70