Wiki source code of Automation

Last modified by Robert Schaub on 2025/12/22 14:32

= Automation =

**How FactHarbor scales through automated claim evaluation.**

== 1. Automation Philosophy ==

FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.

**Why automation:**

* **Scale**: Can process millions of claims
* **Consistency**: Same evaluation criteria applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in <20 seconds

See [[Automation Philosophy>>Test.FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.

== 2. Claim Processing Flow ==

=== 2.1 User Submits Claim ===

* User provides claim text + source URLs
* System validates format
* Assigns processing ID
* Queues for AKEL processing

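The intake steps above can be sketched as follows. This is a minimal illustration only; the function and field names are assumptions, not FactHarbor's actual API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Submission:
    claim_text: str
    source_urls: list
    # Processing ID assigned at intake time.
    processing_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def submit_claim(claim_text: str, source_urls: list, queue: list) -> Submission:
    """Validate the submission format, assign an ID, and queue it for AKEL."""
    if not claim_text.strip():
        raise ValueError("Claim text must not be empty")
    if not source_urls or not all(
        u.startswith(("http://", "https://")) for u in source_urls
    ):
        raise ValueError("At least one valid source URL is required")
    submission = Submission(claim_text, source_urls)
    queue.append(submission)  # stands in for the real AKEL processing queue
    return submission
```

A rejected submission never reaches the queue, so AKEL only sees well-formed input.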
=== 2.2 AKEL Processing ===

**AKEL automatically:**

1. Parses claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates claim against evidence
5. Generates verdict with confidence score
6. Assigns risk tier (A/B/C)
7. Publishes result

**Processing time**: Typically <20 seconds

**No human approval required**: publication is automatic
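The seven steps above can be outlined as a single per-claim pipeline. The helper functions here are assumed stand-ins for AKEL's real components, shown only to make the data flow concrete.

```python
from dataclasses import dataclass

# Stand-in helpers; AKEL's real components are far more involved.
def parse_claim(text):           return [text]                      # testable components
def extract_evidence(sources):   return [{"source": s} for s in sources]
def score_credibility(evidence): return 0.8                         # assumed fixed score
def evaluate(components, evidence): return "supported"
def assign_risk_tier(text):      return "C"                         # default: low risk


@dataclass
class Verdict:
    claim: str
    verdict: str
    confidence: float
    risk_tier: str


def process_claim(text: str, sources: list) -> Verdict:
    """Run the 2.2 pipeline: parse, gather evidence, score, evaluate, publish."""
    components = parse_claim(text)
    evidence = extract_evidence(sources)
    credibility = score_credibility(evidence)
    verdict = evaluate(components, evidence)
    # Confidence here is just the source-credibility score; the real
    # calculation combines several signals.
    return Verdict(text, verdict, credibility, assign_risk_tier(text))
```

The result object is what gets published automatically; no human sits between `process_claim` and publication.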

=== 2.3 Publication States ===

**Processing**: AKEL working on claim (not visible to public)

**Published**: AKEL completed evaluation (public)

* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated
* Users can report issues

**Flagged**: AKEL identified an issue requiring moderator attention (still public)

* Confidence below threshold
* Detected manipulation attempt
* Unusual pattern
* Moderator reviews and may take action

== 2.5 LLM-Based Processing Architecture ==

FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:

=== POC: Two-Phase Approach ===

**Phase 1: Claim Extraction**

* Single LLM call to extract all claims from submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: List of claims with context

**Phase 2: Claim Analysis (Parallel)**

* Single LLM call per claim (parallelizable)
* Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
* Each claim analyzed independently

**Advantages:**

* Fast to implement (2-4 weeks to working POC)
* Few API calls in total: 1 + N (one extraction call plus one per claim)
* Simple to debug (claim-level isolation)
* Proves concept viability

=== Production: Three-Phase Approach ===

**Phase 1: Claim Extraction + Validation**

* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims

**Phase 2: Evidence Gathering (Parallel)**

* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: Check evidence quality and source validity
* Error containment: Issues in one claim don't affect others

**Phase 3: Verdict Generation (Parallel)**

* For each claim:
** Generate verdict based on validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: Check verdict consistency with evidence

**Advantages:**

* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase
* Scalable (parallel processing across claims)
* Adaptable (can optimize each phase independently)

=== LLM Task Delegation ===

All complex cognitive tasks are delegated to LLMs:

* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses
* **Source Evaluation**: Assessing reliability and authority
* **Verdict Generation**: Synthesizing evidence into conclusions
* **Risk Assessment**: Evaluating potential impact

=== Error Mitigation ===

Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:

* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing**: errors in one claim don't cascade to others

== 3. Risk Tiers ==

Risk tiers classify claims by potential impact and guide audit sampling rates.

=== 3.1 Tier A (High Risk) ===

**Domains**: Medical, legal, elections, safety, security

**Characteristics**:

* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation

**Publication**: AKEL publishes automatically with prominent risk warning

**Audit rate**: Higher sampling recommended

=== 3.2 Tier B (Medium Risk) ===

**Domains**: Complex policy, science, causality claims

**Characteristics**:

* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible

**Publication**: AKEL publishes automatically with standard risk label

**Audit rate**: Moderate sampling recommended

=== 3.3 Tier C (Low Risk) ===

**Domains**: Definitions, established facts, historical data

**Characteristics**:

* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers

**Publication**: AKEL publishes by default

**Audit rate**: Lower sampling recommended

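As a rough sketch, the tier definitions above could drive audit sampling like this. The domain lists are distilled from sections 3.1-3.3, but the exact sampling rates are illustrative assumptions, since the document only specifies higher/moderate/lower.

```python
import random

# Illustrative domain-to-tier map distilled from the tier definitions above.
TIER_BY_DOMAIN = {
    "medical": "A", "legal": "A", "elections": "A", "safety": "A", "security": "A",
    "policy": "B", "science": "B", "causality": "B",
    "definition": "C", "established-fact": "C", "historical": "C",
}

# Assumed audit sampling rates (higher / moderate / lower).
AUDIT_RATE = {"A": 0.30, "B": 0.10, "C": 0.02}


def risk_tier(domain: str) -> str:
    """Map a claim's domain to its risk tier, defaulting to medium risk."""
    return TIER_BY_DOMAIN.get(domain, "B")


def selected_for_audit(domain: str, rng: random.Random) -> bool:
    """Sample the published claim for human audit at its tier's rate."""
    return rng.random() < AUDIT_RATE[risk_tier(domain)]
```

Defaulting unknown domains to Tier B errs toward more scrutiny than Tier C would give, without triggering Tier A warnings; that default is a design assumption of this sketch.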
== 4. Quality Gates ==

AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked: it is still published).

**Quality gates**:

* Sufficient evidence extracted (≥2 sources)
* Sources meet minimum credibility threshold
* Confidence score calculable
* No detected manipulation patterns
* Claim parseable into testable form

**Failed gates**: Claim published with flag for moderator review

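A minimal sketch of how such gates might be checked. The gate names mirror the list above, but the input structure and threshold values are assumptions for illustration.

```python
# Each gate inspects a claim evaluation dict and returns True when it passes.
# Threshold values here are illustrative assumptions.
GATES = {
    "sufficient_evidence": lambda ev: len(ev["sources"]) >= 2,
    "credible_sources": lambda ev: min(ev["source_scores"], default=0) >= 0.5,
    "confidence_calculable": lambda ev: ev.get("confidence") is not None,
    "no_manipulation": lambda ev: not ev.get("manipulation_detected", False),
    "parseable_claim": lambda ev: bool(ev.get("components")),
}


def apply_quality_gates(evaluation: dict) -> list:
    """Return the names of failed gates; any failure flags (not blocks) the claim."""
    return [name for name, check in GATES.items() if not check(evaluation)]


def publish(evaluation: dict) -> dict:
    failed = apply_quality_gates(evaluation)
    # Publication is never blocked: failures only attach a moderator flag.
    return {"published": True, "flagged": bool(failed), "failed_gates": failed}
```

Note that `published` is unconditionally true, matching the flag-don't-block policy above.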
== 5. Automation Levels ==

{{include reference="Test.FactHarbor pre12 V0\.9\.70.Specification.Diagrams.Automation Level.WebHome"/}}

FactHarbor progresses through automation maturity levels:

**Release 0.5** (Proof-of-Concept): Tier C only, human review required

**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review

**Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits

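The maturity gating above can be captured in a small policy table. This is a sketch distilled from the release list; the keys and action names are assumptions, not a specified interface.

```python
# Per-release publication policy by risk tier, distilled from the list above.
# Actions: "auto" (publish), "review" (human review first), "flag" (publish + flag).
# None means the tier is not yet processed at that maturity level.
POLICY = {
    "0.5": {"A": None, "B": None, "C": "review"},    # Tier C only, human review
    "1.0": {"A": "flag", "B": "auto", "C": "auto"},
    "2.0": {"A": "auto", "B": "auto", "C": "auto"},  # risk labels + sampling audits
}


def publication_action(release: str, tier: str):
    """Return how a claim of the given tier is handled in the given release."""
    return POLICY[release][tier]
```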
See [[Automation Roadmap>>Test.FactHarbor pre12 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome]] for detailed progression.

== 5.5 Automation Roadmap ==

{{include reference="Test.FactHarbor pre12 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome"/}}

== 6. Human Role ==

Humans do NOT review content for approval. Instead:

**Monitoring**: Watch aggregate performance metrics

**Improvement**: Fix algorithms when patterns show issues

**Exception handling**: Review AKEL-flagged items

**Governance**: Set policies AKEL applies

See [[Contributor Processes>>Test.FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.

== 6.5 Manual vs Automated Matrix ==

{{include reference="Test.FactHarbor pre12 V0\.9\.70.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}

== 7. Moderation ==

Moderators handle items AKEL flags:

**Abuse detection**: Spam, manipulation, harassment

**Safety issues**: Content that could cause immediate harm

**System gaming**: Attempts to manipulate scoring

**Action**: May temporarily hide content, ban users, or propose algorithm improvements

**Does NOT**: Routinely review claims or override verdicts

See [[Organisational Model>>Test.FactHarbor.Organisation.Organisational-Model]] for moderator role details.

== 8. Continuous Improvement ==

**Performance monitoring**: Track AKEL accuracy, speed, coverage

**Issue identification**: Find systematic errors from metrics

**Algorithm updates**: Deploy improvements to fix patterns

**A/B testing**: Validate changes before full rollout

**Retrospectives**: Learn from failures systematically

See [[Continuous Improvement>>Test.FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.
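The A/B validation step might look like this in miniature. This is purely illustrative: the minimum-lift threshold is an assumption, and a real gate would also test statistical significance rather than compare raw accuracy.

```python
def ab_validate(control_correct: int, control_total: int,
                treatment_correct: int, treatment_total: int,
                min_lift: float = 0.01) -> bool:
    """Approve a new algorithm version only if accuracy improves by min_lift.

    Compares raw verdict accuracy between the current algorithm (control)
    and the candidate (treatment) on audited claims.
    """
    control_acc = control_correct / control_total
    treatment_acc = treatment_correct / treatment_total
    return treatment_acc - control_acc >= min_lift
```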

== 9. Scalability ==

Automation enables FactHarbor to scale:

* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms

Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.

== 10. Transparency ==

All automation is transparent:

* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
* **Confidence calculations** explained
* **Performance metrics** visible

See [[System Performance Metrics>>Test.FactHarbor.Specification.System-Performance-Metrics]] for what we measure.