Wiki source code of Automation

Last modified by Robert Schaub on 2025/12/22 13:50

= Automation =

**How FactHarbor scales through automated claim evaluation.**

== 1. Automation Philosophy ==

FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve the algorithms.

**Why automation:**

* **Scale**: Can process millions of claims
* **Consistency**: The same evaluation criteria are applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in <20 seconds

See [[Automation Philosophy>>Test.FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.

== 2. Claim Processing Flow ==

=== 2.1 User Submits Claim ===

* User provides claim text + source URLs
* System validates format
* Assigns processing ID
* Queues for AKEL processing

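The submission steps above can be sketched as follows. This is a minimal illustration; the class, function names, and validation rules are assumptions for the example, not FactHarbor's actual implementation.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Submission:
    claim_text: str
    source_urls: list
    processing_id: str = field(default_factory=lambda: uuid.uuid4().hex)

def validate_format(sub):
    """Minimal format checks; the real validation rules are not specified here."""
    if not sub.claim_text.strip():
        return False
    if not sub.source_urls:
        return False
    return all(u.startswith(("http://", "https://")) for u in sub.source_urls)

queue = []  # stand-in for the AKEL processing queue

def submit(claim_text, source_urls):
    sub = Submission(claim_text, list(source_urls))
    if not validate_format(sub):
        raise ValueError("invalid submission format")
    queue.append(sub)           # queued for AKEL processing
    return sub.processing_id    # processing ID returned to the user
```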
=== 2.2 AKEL Processing ===

**AKEL automatically:**

1. Parses the claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates the claim against the evidence
5. Generates a verdict with a confidence score
6. Assigns a risk tier (A/B/C)
7. Publishes the result

**Processing time**: Typically <20 seconds
**No human approval required**: Publication is automatic

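The seven stages can be sketched as a simple pipeline. Every function below is a simplified stand-in (placeholder scores, a fixed tier), not the real AKEL implementation:

```python
# Illustrative sketch of the seven AKEL pipeline stages.

def parse_claim(text):
    return [text]                      # 1. testable components (one, trivially)

def extract_evidence(sources):
    return [{"source": s, "supports": True} for s in sources]   # 2.

def score_credibility(evidence):
    for e in evidence:
        e["credibility"] = 0.8         # 3. placeholder credibility score
    return evidence

def evaluate(components, evidence):
    if not evidence:
        return 0.0
    support = sum(e["credibility"] for e in evidence if e["supports"])
    return support / len(evidence)     # 4. net evidential support in [0, 1]

def akel_process(claim_text, sources):
    components = parse_claim(claim_text)
    evidence = score_credibility(extract_evidence(sources))
    confidence = evaluate(components, evidence)
    verdict = "supported" if confidence > 0.5 else "unsupported"  # 5.
    tier = "C"                         # 6. risk tier; fixed here for brevity
    return {"verdict": verdict, "confidence": confidence, "tier": tier}  # 7.
```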
=== 2.3 Publication States ===

**Processing**: AKEL is working on the claim (not visible to the public)
**Published**: AKEL has completed the evaluation (public)

* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated
* Users can report issues

**Flagged**: AKEL identified an issue requiring moderator attention (still public)

* Confidence below threshold
* Detected manipulation attempt
* Unusual pattern
* Moderator reviews and may take action

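The three states form a small state model. A sketch, with the key property that flagged claims stay public:

```python
from enum import Enum

class ClaimState(Enum):
    PROCESSING = "processing"   # AKEL working on the claim; not public
    PUBLISHED = "published"     # evaluation complete; public
    FLAGGED = "flagged"         # moderator attention needed; still public

def is_public(state):
    # A flagged claim remains publicly visible while a moderator reviews it.
    return state in (ClaimState.PUBLISHED, ClaimState.FLAGGED)
```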
== 2.5 LLM-Based Processing Architecture ==

FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:

=== POC: Two-Phase Approach ===

**Phase 1: Claim Extraction**

* Single LLM call to extract all claims from the submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: list of claims with context

**Phase 2: Claim Analysis (Parallel)**

* Single LLM call per claim (parallelizable)
* Full structured output: evidence, scenarios, sources, verdict, risk
* Each claim analyzed independently

**Advantages:**

* Fast to implement (2-4 weeks to a working POC)
* Only 1 + N API calls total (one extraction call plus one analysis call per claim)
* Simple to debug (claim-level isolation)
* Proves concept viability

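The two-phase POC flow can be sketched in a few lines. The LLM calls are stubbed out with trivial functions (the real calls would hit an LLM API); only the call structure, one extraction call followed by N parallel analysis calls, is the point:

```python
from concurrent.futures import ThreadPoolExecutor

def llm_extract_claims(content):
    """Phase 1: one LLM call over the whole submission (stubbed here)."""
    return [s.strip() for s in content.split(".") if s.strip()]

def llm_analyze_claim(claim):
    """Phase 2: one structured LLM call per claim (stubbed here)."""
    return {"claim": claim, "verdict": "unverified", "risk": "C"}

def poc_pipeline(content):
    claims = llm_extract_claims(content)        # 1 extraction call
    with ThreadPoolExecutor() as pool:          # N parallel analysis calls
        return list(pool.map(llm_analyze_claim, claims))
```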
=== Production: Three-Phase Approach ===

**Phase 1: Claim Extraction + Validation**

* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims

**Phase 2: Evidence Gathering (Parallel)**

* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: check evidence quality and source validity
* Error containment: issues in one claim don't affect others

**Phase 3: Verdict Generation (Parallel)**

* For each claim:
** Generate verdict based on validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: check verdict consistency with evidence

**Advantages:**

* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase
* Scalable (parallel processing across claims)
* Adaptable (each phase can be optimized independently)

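A sketch of the production flow with a validation gate between Phases 2 and 3. All function bodies and thresholds are illustrative placeholders; only the gate-then-verdict structure and per-claim independence are taken from the text:

```python
def gather_evidence(claim):
    # Phase 2 stub: evidence, sources and test scenarios per claim.
    return {"claim": claim,
            "evidence": [{"credibility": 0.9, "supports": True}],
            "scenarios": ["baseline scenario"]}

def evidence_gate(bundle, min_sources=1, min_credibility=0.5):
    # Validation gate between Phase 2 and Phase 3 (thresholds illustrative).
    usable = [e for e in bundle["evidence"] if e["credibility"] >= min_credibility]
    return len(usable) >= min_sources

def generate_verdict(bundle, review_threshold=0.7):
    # Phase 3 stub: verdict + confidence; low confidence goes to human review.
    confidence = max(e["credibility"] for e in bundle["evidence"])
    return {"claim": bundle["claim"], "verdict": "supported",
            "confidence": confidence,
            "needs_review": confidence < review_threshold}

def production_pipeline(claims):
    results = []
    for claim in claims:               # claims are independent: no cascade
        bundle = gather_evidence(claim)
        if not evidence_gate(bundle):
            results.append({"claim": claim, "flagged": "insufficient evidence"})
            continue
        results.append(generate_verdict(bundle))
    return results
```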
=== LLM Task Delegation ===

All complex cognitive tasks are delegated to LLMs:

* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses
* **Source Evaluation**: Assessing reliability and authority
* **Verdict Generation**: Synthesizing evidence into conclusions
* **Risk Assessment**: Evaluating potential impact

=== Error Mitigation ===

Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:

* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing**: errors in one claim don't cascade to others

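The compound-error risk is easy to quantify: if each of k sequential LLM calls succeeds with probability p, the whole chain succeeds with probability p^k. The 95% figure below is an illustrative number, not a measured one:

```python
def chain_success(p_per_step, steps):
    # k sequential calls that each succeed with probability p yield an
    # overall success rate of p**k, so errors compound with chain length.
    return p_per_step ** steps

# Example: three 95%-reliable phases in sequence succeed only ~85.7% of the
# time, which is why validation gates and per-claim isolation matter.
```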
== 3. Risk Tiers ==

Risk tiers classify claims by potential impact and guide audit sampling rates.

=== 3.1 Tier A (High Risk) ===

**Domains**: Medical, legal, elections, safety, security
**Characteristics**:

* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation

**Publication**: AKEL publishes automatically with a prominent risk warning
**Audit rate**: Higher sampling recommended

=== 3.2 Tier B (Medium Risk) ===

**Domains**: Complex policy, science, causality claims
**Characteristics**:

* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible

**Publication**: AKEL publishes automatically with a standard risk label
**Audit rate**: Moderate sampling recommended

=== 3.3 Tier C (Low Risk) ===

**Domains**: Definitions, established facts, historical data
**Characteristics**:

* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers

**Publication**: AKEL publishes by default
**Audit rate**: Lower sampling recommended

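The tier policy can be represented as a lookup table driving audit sampling. The audit rates below are placeholder values chosen only to show the higher/moderate/lower ordering; the specification does not fix concrete numbers:

```python
import random

# Illustrative policy table; audit_rate values are placeholders, not
# figures from the specification.
TIER_POLICY = {
    "A": {"label": "prominent risk warning", "audit_rate": 0.20},
    "B": {"label": "standard risk label",    "audit_rate": 0.05},
    "C": {"label": "default publication",    "audit_rate": 0.01},
}

def should_audit(tier, rng=random.random):
    """Sample a published claim for audit at its tier's rate."""
    return rng() < TIER_POLICY[tier]["audit_rate"]
```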
== 4. Quality Gates ==

AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked; it is still published).

**Quality gates**:

* Sufficient evidence extracted (≥2 sources)
* Sources meet the minimum credibility threshold
* Confidence score is calculable
* No detected manipulation patterns
* Claim is parseable into testable form

**Failed gates**: Claim is published with a flag for moderator review

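The five gates above can be sketched as a single check whose failures flag but never block. The result shape and thresholds are illustrative assumptions:

```python
def run_quality_gates(result, min_sources=2, min_credibility=0.5):
    """Evaluate the five gates; failures flag the claim but never block it."""
    gates = {
        "sufficient_evidence": len(result["sources"]) >= min_sources,
        "credible_sources": all(s["credibility"] >= min_credibility
                                for s in result["sources"]),
        "confidence_calculable": result.get("confidence") is not None,
        "no_manipulation": not result.get("manipulation_detected", False),
        "claim_parseable": bool(result.get("components")),
    }
    failed = [name for name, ok in gates.items() if not ok]
    # Published either way; failed gates attach a moderator-review flag.
    return {"published": True,
            "flagged_for_review": bool(failed),
            "failed_gates": failed}
```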
== 5. Automation Levels ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Level.WebHome"/}}

FactHarbor progresses through automation maturity levels:

**Release 0.5** (Proof-of-Concept): Tier C only, human review required
**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review
**Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits

See [[Automation Roadmap>>Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome]] for the detailed progression.

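The maturity levels can be restated as a policy lookup. The Release 0.5 entries for Tiers A and B ("not processed") are an inference from "Tier C only" and are not stated explicitly in the roadmap:

```python
# Policy table derived from the release descriptions above; the 0.5 Tier A/B
# values are inferred, not specified.
AUTOMATION_LEVELS = {
    "0.5": {"A": "not processed", "B": "not processed",
            "C": "human review required"},
    "1.0": {"A": "flagged for review", "B": "auto-published",
            "C": "auto-published"},
    "2.0": {"A": "auto-published with risk label",
            "B": "auto-published with risk label",
            "C": "auto-published with risk label"},
}

def publication_policy(release, tier):
    return AUTOMATION_LEVELS[release][tier]
```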
== 5.5 Automation Roadmap ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome"/}}

== 6. Human Role ==

Humans do NOT review content for approval. Instead:

**Monitoring**: Watch aggregate performance metrics
**Improvement**: Fix algorithms when patterns show issues
**Exception handling**: Review AKEL-flagged items
**Governance**: Set the policies AKEL applies

See [[Contributor Processes>>Test.FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.

== 6.5 Manual vs Automated Matrix ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}

== 7. Moderation ==

Moderators handle items AKEL flags:

**Abuse detection**: Spam, manipulation, harassment
**Safety issues**: Content that could cause immediate harm
**System gaming**: Attempts to manipulate scoring

**Action**: May temporarily hide content, ban users, or propose algorithm improvements
**Does NOT**: Routinely review claims or override verdicts

See [[Organisational Model>>Test.FactHarbor.Organisation.Organisational-Model]] for moderator role details.

== 8. Continuous Improvement ==

**Performance monitoring**: Track AKEL accuracy, speed, coverage
**Issue identification**: Find systematic errors from metrics
**Algorithm updates**: Deploy improvements to fix patterns
**A/B testing**: Validate changes before full rollout
**Retrospectives**: Learn from failures systematically

See [[Continuous Improvement>>Test.FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.

== 9. Scalability ==

Automation enables FactHarbor to scale:

* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms

Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.

== 10. Transparency ==

All automation is transparent:

* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
* **Confidence calculations** explained
* **Performance metrics** visible

See [[System Performance Metrics>>Test.FactHarbor.Specification.System-Performance-Metrics]] for what we measure.