Wiki source code of Automation
Last modified by Robert Schaub on 2025/12/22 13:50
= Automation =

**How FactHarbor scales through automated claim evaluation.**

== 1. Automation Philosophy ==

FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.

**Why automation:**

* **Scale**: Can process millions of claims
* **Consistency**: Same evaluation criteria applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in <20 seconds

See [[Automation Philosophy>>Test.FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.

== 2. Claim Processing Flow ==

=== 2.1 User Submits Claim ===

* User provides claim text + source URLs
* System validates format
* Assigns processing ID
* Queues for AKEL processing

=== 2.2 AKEL Processing ===

**AKEL automatically:**

1. Parses claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates claim against evidence
5. Generates verdict with confidence score
6. Assigns risk tier (A/B/C)
7. Publishes result

**Processing time**: Typically <20 seconds

**No human approval required** - publication is automatic

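The seven steps above can be sketched as a single end-to-end function. This is a minimal illustration, not AKEL's actual implementation: the `Verdict` structure, the `process_claim` name, and every stubbed rule (keyword-based risk tier, flat credibility score) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Verdict:
    claim: str
    evidence: list = field(default_factory=list)
    source_scores: dict = field(default_factory=dict)
    confidence: float = 0.0
    risk_tier: str = "C"
    published: bool = False

def process_claim(claim: str, sources: list) -> Verdict:
    """Run the seven AKEL steps in order; publication is automatic."""
    v = Verdict(claim=claim)
    # 1. Parse claim into testable components (stub: split on ";").
    components = [c.strip() for c in claim.split(";") if c.strip()]
    # 2. Extract evidence from each source (stub).
    v.evidence = [f"evidence from {s}" for s in sources]
    # 3. Score source credibility (stub: flat score).
    v.source_scores = {s: 0.8 for s in sources}
    # 4.-5. Evaluate claim against evidence, derive a confidence score (stub).
    v.confidence = min(1.0, 0.3 + 0.2 * len(v.evidence)) if components else 0.0
    # 6. Assign risk tier (stub keyword rule).
    v.risk_tier = "A" if "medical" in claim.lower() else "C"
    # 7. Publish automatically -- no human approval step.
    v.published = True
    return v
```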
=== 2.3 Publication States ===

**Processing**: AKEL working on claim (not visible to public)

**Published**: AKEL completed evaluation (public)

* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated
* Users can report issues

**Flagged**: AKEL identified an issue requiring moderator attention (still public)

* Confidence below threshold
* Detected manipulation attempt
* Unusual pattern
* Moderator reviews and may take action

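The three states can be modelled as a small enum; `PublicationState` and `is_public` are hypothetical names, not FactHarbor identifiers.

```python
from enum import Enum

class PublicationState(Enum):
    PROCESSING = "processing"  # AKEL working on the claim; not visible to the public
    PUBLISHED = "published"    # evaluation complete; public
    FLAGGED = "flagged"        # issue detected; still public, queued for a moderator

def is_public(state: PublicationState) -> bool:
    # Flagged items remain public while awaiting moderator review.
    return state in (PublicationState.PUBLISHED, PublicationState.FLAGGED)
```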
== 2.5 LLM-Based Processing Architecture ==

FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:

=== POC: Two-Phase Approach ===

**Phase 1: Claim Extraction**

* Single LLM call to extract all claims from submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: List of claims with context

**Phase 2: Claim Analysis (Parallel)**

* Single LLM call per claim (parallelizable)
* Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
* Each claim analyzed independently

**Advantages:**

* Fast to implement (2-4 weeks to working POC)
* Only 1 + N API calls total (one extraction call, then one per claim)
* Simple to debug (claim-level isolation)
* Proves concept viability

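The two-phase POC shape can be sketched as follows, with a `call_llm` placeholder standing in for a real LLM API. All names and the toy prompt format are invented for illustration; the point is the call structure: one extraction call, then one parallel call per claim.

```python
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (toy behaviour for illustration)."""
    if prompt.startswith("EXTRACT:"):
        # Pretend the model found one claim per sentence.
        text = prompt[len("EXTRACT:"):]
        return "\n".join(s.strip() for s in text.split(".") if s.strip())
    return f"ANALYSIS({prompt})"

def run_poc_pipeline(content: str) -> dict:
    # Phase 1: a single call extracts all claims.
    claims = call_llm("EXTRACT:" + content).splitlines()
    # Phase 2: one call per claim, run in parallel (1 + N calls total).
    with ThreadPoolExecutor() as pool:
        analyses = list(pool.map(call_llm, claims))
    return dict(zip(claims, analyses))
```

Because each claim is analyzed in its own call, a failure in one claim's analysis can be retried or discarded without touching the others.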
=== Production: Three-Phase Approach ===

**Phase 1: Claim Extraction + Validation**

* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims

**Phase 2: Evidence Gathering (Parallel)**

* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: Check evidence quality and source validity
* Error containment: Issues in one claim don't affect others

**Phase 3: Verdict Generation (Parallel)**

* For each claim:
** Generate verdict based on validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: Check verdict consistency with evidence

**Advantages:**

* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase
* Scalable (parallel processing across claims)
* Adaptable (can optimize each phase independently)

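A sketch of the three-phase structure with a validation gate between phases. All function names and the toy extraction/validation rules are invented for illustration; what matters is that each claim flows through the phases independently and unusable bundles are filtered out before verdict generation.

```python
def extract_claims(content: str) -> list:
    # Phase 1 (stub): split into sentences, drop duplicates and "vague"
    # claims (toy rule: fewer than three words).
    seen, claims = set(), []
    for c in (s.strip() for s in content.split(".")):
        if len(c.split()) >= 3 and c not in seen:
            seen.add(c)
            claims.append(c)
    return claims

def gather_evidence(claim: str) -> dict:
    # Phase 2 (stub): each claim is handled independently.
    return {"claim": claim, "evidence": [f"source discussing: {claim}"]}

def generate_verdict(bundle: dict) -> dict:
    # Phase 3 (stub): verdict from validated evidence only.
    confident = len(bundle["evidence"]) >= 1
    return {"claim": bundle["claim"],
            "verdict": "supported" if confident else "inconclusive",
            "needs_review": not confident}

def run_production_pipeline(content: str) -> list:
    claims = extract_claims(content)
    bundles = [gather_evidence(c) for c in claims]
    # Validation gate: drop bundles with no usable evidence before Phase 3.
    bundles = [b for b in bundles if b["evidence"]]
    return [generate_verdict(b) for b in bundles]
```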
=== LLM Task Delegation ===

All complex cognitive tasks are delegated to LLMs:

* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses
* **Source Evaluation**: Assessing reliability and authority
* **Verdict Generation**: Synthesizing evidence into conclusions
* **Risk Assessment**: Evaluating potential impact

=== Error Mitigation ===

Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:

* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing** - errors in one claim don't cascade to others

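The compound-error risk can be made concrete with a small calculation. The per-step reliability figure below is purely illustrative, not a measured FactHarbor number: a chain of k sequential steps that each succeed independently with probability p succeeds with probability p^k, which is why long sequential chains are avoided and per-claim chains are kept independent.

```python
def chain_success(p: float, k: int) -> float:
    """Probability that all k sequential steps succeed (independence assumed)."""
    return p ** k

# With 95%-reliable steps, reliability decays quickly along a chain:
three_step = chain_success(0.95, 3)   # ~0.857
ten_step = chain_success(0.95, 10)    # ~0.599
# Parallel, independent per-claim chains keep each claim at the short-chain
# figure instead of compounding errors across claims.
```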
== 3. Risk Tiers ==

Risk tiers classify claims by potential impact and guide audit sampling rates.

=== 3.1 Tier A (High Risk) ===

**Domains**: Medical, legal, elections, safety, security

**Characteristics**:

* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation

**Publication**: AKEL publishes automatically with prominent risk warning

**Audit rate**: Higher sampling recommended

=== 3.2 Tier B (Medium Risk) ===

**Domains**: Complex policy, science, causality claims

**Characteristics**:

* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible

**Publication**: AKEL publishes automatically with standard risk label

**Audit rate**: Moderate sampling recommended

=== 3.3 Tier C (Low Risk) ===

**Domains**: Definitions, established facts, historical data

**Characteristics**:

* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers

**Publication**: AKEL publishes by default

**Audit rate**: Lower sampling recommended

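One way to represent the tier policy is a small lookup table. The numeric audit rates below are placeholders (the text only specifies higher/moderate/lower sampling), and `AUDIT_POLICY` and `should_audit` are hypothetical names.

```python
# Hypothetical audit-sampling policy; the rates are illustrative placeholders.
AUDIT_POLICY = {
    "A": {"domains": ["medical", "legal", "elections", "safety", "security"],
          "label": "prominent risk warning", "audit_rate": 0.30},
    "B": {"domains": ["complex policy", "science", "causality"],
          "label": "standard risk label", "audit_rate": 0.10},
    "C": {"domains": ["definitions", "established facts", "historical data"],
          "label": "default", "audit_rate": 0.02},
}

def should_audit(tier: str, random_draw: float) -> bool:
    """Sample a published claim for audit at its tier's rate (draw in [0, 1))."""
    return random_draw < AUDIT_POLICY[tier]["audit_rate"]
```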
== 4. Quality Gates ==

AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked - it is still published).

**Quality gates**:

* Sufficient evidence extracted (≥2 sources)
* Sources meet minimum credibility threshold
* Confidence score calculable
* No detected manipulation patterns
* Claim parseable into testable form

**Failed gates**: Claim published with flag for moderator review

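The gate check can be sketched as a pure function that flags but never blocks. The gate list follows the bullets above; the threshold defaults and all names are illustrative assumptions.

```python
def check_quality_gates(evidence, source_scores, confidence,
                        manipulation_detected, parseable,
                        min_sources=2, min_credibility=0.5):
    """Return publication status; any failed gate flags (never blocks) the claim."""
    failed = []
    if len(evidence) < min_sources:
        failed.append("insufficient evidence")
    if any(score < min_credibility for score in source_scores):
        failed.append("source below credibility threshold")
    if confidence is None:
        failed.append("confidence not calculable")
    if manipulation_detected:
        failed.append("manipulation pattern detected")
    if not parseable:
        failed.append("claim not parseable into testable form")
    # Publication is always True: failed gates flag for moderator review only.
    return {"published": True, "flagged": bool(failed), "failed_gates": failed}
```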
== 5. Automation Levels ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Level.WebHome"/}}

FactHarbor progresses through automation maturity levels:

**Release 0.5** (Proof-of-Concept): Tier C only, human review required

**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review

**Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits

See [[Automation Roadmap>>Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome]] for detailed progression.

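The release progression can be captured as a per-tier publication policy table; a sketch with hypothetical names and labels derived from the three release descriptions above.

```python
# Per-release publication policy for each risk tier (sketch; names hypothetical).
RELEASE_POLICY = {
    "0.5": {"A": "not_processed", "B": "not_processed", "C": "human_review"},
    "1.0": {"A": "flagged_for_review", "B": "auto_publish", "C": "auto_publish"},
    "2.0": {"A": "auto_publish_with_label", "B": "auto_publish_with_label",
            "C": "auto_publish_with_label"},
}

def publication_action(release: str, tier: str) -> str:
    """Look up how a claim of the given risk tier is handled at a release level."""
    return RELEASE_POLICY[release][tier]
```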
== 5.5 Automation Roadmap ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome"/}}

== 6. Human Role ==

Humans do NOT review content for approval. Instead:

**Monitoring**: Watch aggregate performance metrics

**Improvement**: Fix algorithms when patterns show issues

**Exception handling**: Review AKEL-flagged items

**Governance**: Set policies AKEL applies

See [[Contributor Processes>>Test.FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.

== 6.5 Manual vs Automated Matrix ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}

== 7. Moderation ==

Moderators handle items AKEL flags:

**Abuse detection**: Spam, manipulation, harassment

**Safety issues**: Content that could cause immediate harm

**System gaming**: Attempts to manipulate scoring

**Action**: May temporarily hide content, ban users, or propose algorithm improvements

**Does NOT**: Routinely review claims or override verdicts

See [[Organisational Model>>Test.FactHarbor.Organisation.Organisational-Model]] for moderator role details.

== 8. Continuous Improvement ==

**Performance monitoring**: Track AKEL accuracy, speed, coverage

**Issue identification**: Find systematic errors from metrics

**Algorithm updates**: Deploy improvements to fix patterns

**A/B testing**: Validate changes before full rollout

**Retrospectives**: Learn from failures systematically

See [[Continuous Improvement>>Test.FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.

== 9. Scalability ==

Automation enables FactHarbor to scale:

* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms

Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.

== 10. Transparency ==

All automation is transparent:

* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
* **Confidence calculations** explained
* **Performance metrics** visible

See [[System Performance Metrics>>Test.FactHarbor.Specification.System-Performance-Metrics]] for what we measure.