= Automation =
**How FactHarbor scales through automated claim evaluation.**
== 1. Automation Philosophy ==
FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.
**Why automation:**
* **Scale**: Can process millions of claims
* **Consistency**: Same evaluation criteria applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in <20 seconds
See [[Automation Philosophy>>FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.
== 2. Claim Processing Flow ==
=== 2.1 User Submits Claim ===
* User provides claim text + source URLs
* System validates format
* Assigns processing ID
* Queues for AKEL processing
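A minimal sketch of this intake step. All names, the queue backend, and the validation rules here are illustrative assumptions, not the actual FactHarbor API:

{{code language="python"}}
import uuid
from dataclasses import dataclass, field
from queue import Queue
from urllib.parse import urlparse

@dataclass
class ClaimSubmission:
    text: str
    source_urls: list[str]
    # Processing ID assigned at intake; the format is an assumption.
    processing_id: str = field(default_factory=lambda: uuid.uuid4().hex)

akel_queue: Queue = Queue()  # stand-in for the real AKEL work queue

def submit_claim(text: str, source_urls: list[str]) -> str:
    """Validate format, assign a processing ID, and queue for AKEL."""
    if not text.strip():
        raise ValueError("claim text must not be empty")
    for url in source_urls:
        if urlparse(url).scheme not in ("http", "https"):
            raise ValueError(f"invalid source URL: {url}")
    submission = ClaimSubmission(text=text, source_urls=source_urls)
    akel_queue.put(submission)
    return submission.processing_id
{{/code}}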
=== 2.2 AKEL Processing ===
**AKEL automatically:**
1. Parses claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates claim against evidence
5. Generates verdict with confidence score
6. Assigns risk tier (A/B/C)
7. Publishes result
**Processing time**: Typically <20 seconds
**No human approval required**: publication is automatic
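Condensed as code, the seven steps could look like the sketch below; every helper is a hypothetical placeholder for AKEL internals that this page does not specify:

{{code language="python"}}
# Placeholder implementations; real AKEL logic is not specified here.

def parse_into_testable_components(text: str) -> list[str]:
    return [text]  # placeholder: whole claim as a single component

def extract_evidence(urls: list[str]) -> list[dict]:
    return [{"url": u, "snippet": ""} for u in urls]  # placeholder

def score_source_credibility(urls: list[str]) -> dict:
    return {u: 0.5 for u in urls}  # placeholder neutral score

def evaluate_and_generate_verdict(components, evidence, credibility):
    return "unverified", 0.5  # placeholder (verdict, confidence)

def assign_risk_tier(components) -> str:
    return "C"  # placeholder: lowest-risk tier

def process_claim(text: str, source_urls: list[str]) -> dict:
    components = parse_into_testable_components(text)     # 1. parse
    evidence = extract_evidence(source_urls)              # 2. extract evidence
    credibility = score_source_credibility(source_urls)   # 3. score sources
    verdict, confidence = evaluate_and_generate_verdict(  # 4.-5. evaluate, verdict
        components, evidence, credibility)
    tier = assign_risk_tier(components)                   # 6. assign risk tier
    # 7. publish automatically: the returned record goes live as-is.
    return {"verdict": verdict, "confidence": confidence,
            "risk_tier": tier, "evidence": evidence}
{{/code}}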
=== 2.3 Publication States ===
**Processing**: AKEL is working on the claim (not visible to the public)
**Published**: AKEL completed the evaluation (public)
* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated
* Users can report issues
**Flagged**: AKEL identified an issue requiring moderator attention (still public)
* Confidence below threshold
* Detected manipulation attempt
* Unusual pattern
* Moderator reviews and may take action

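The three states fit a small state model. A sketch, assuming a hypothetical confidence threshold of 0.6 (the real threshold is a policy parameter):

{{code language="python"}}
from enum import Enum

class PublicationState(Enum):
    PROCESSING = "processing"  # AKEL working; not visible to the public
    PUBLISHED = "published"    # evaluation complete; public
    FLAGGED = "flagged"        # public, but queued for moderator attention

CONFIDENCE_THRESHOLD = 0.6  # assumed value; set by policy in practice

def state_after_evaluation(confidence: float, manipulation_detected: bool,
                           unusual_pattern: bool) -> PublicationState:
    """Flagging never blocks publication; flagged items stay public."""
    if confidence < CONFIDENCE_THRESHOLD or manipulation_detected or unusual_pattern:
        return PublicationState.FLAGGED
    return PublicationState.PUBLISHED
{{/code}}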
== 2.5 LLM-Based Processing Architecture ==

FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:

=== POC: Two-Phase Approach ===

**Phase 1: Claim Extraction**
* Single LLM call to extract all claims from submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: List of claims with context

**Phase 2: Claim Analysis (Parallel)**
* Single LLM call per claim (parallelizable)
* Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
* Each claim analyzed independently

**Advantages:**
* Fast to implement (2-4 weeks to working POC)
* Only 1 + N API calls total (one extraction call plus one per claim)
* Simple to debug (claim-level isolation)
* Proves concept viability

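A sketch of the two-phase flow; the `call_llm` stub, the prompts, and the claim-splitting logic are illustrative assumptions, not the production implementation:

{{code language="python"}}
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call; no specific provider is implied."""
    return "placeholder response"

def extract_claims(content: str) -> list[str]:
    # Phase 1: a single call listing distinct verifiable claims.
    response = call_llm(f"List the distinct verifiable claims in:\n{content}")
    return [line.strip("- ").strip() for line in response.splitlines() if line.strip()]

def analyze_claim(claim: str) -> str:
    # Phase 2: one structured-analysis call per claim.
    return call_llm("Analyze this claim; return evidence, scenarios, "
                    f"sources, verdict, and risk:\n{claim}")

def run_poc_pipeline(content: str) -> list[str]:
    claims = extract_claims(content)        # 1 call
    with ThreadPoolExecutor() as pool:      # N calls, run in parallel
        return list(pool.map(analyze_claim, claims))
{{/code}}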
=== Production: Three-Phase Approach ===

**Phase 1: Claim Extraction + Validation**
* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims

**Phase 2: Evidence Gathering (Parallel)**
* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: Check evidence quality and source validity
* Error containment: Issues in one claim don't affect others

**Phase 3: Verdict Generation (Parallel)**
* For each claim:
** Generate verdict based on validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: Check verdict consistency with evidence

**Advantages:**
* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase
* Scalable (parallel processing across claims)
* Adaptable (can optimize each phase independently)

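A sketch of the per-claim portion (Phases 2-3) with explicit validation gates; the gate rules, thresholds, and data shapes here are assumptions for illustration:

{{code language="python"}}
from concurrent.futures import ThreadPoolExecutor

class GateFailure(Exception):
    """Raised when a validation gate rejects a phase's output."""

def gather_evidence(claim: str) -> dict:
    # Phase 2 (placeholder): in production an LLM call collects evidence,
    # sources, and test scenarios for this one claim.
    evidence = {"claim": claim, "sources": ["placeholder-source"], "scenarios": []}
    if not evidence["sources"]:  # validation gate: evidence quality
        raise GateFailure("no valid sources found")
    return evidence

def generate_verdict(evidence: dict) -> dict:
    # Phase 3 (placeholder): verdict generated from validated evidence only.
    verdict = {"claim": evidence["claim"], "verdict": "unverified",
               "confidence": 0.5, "flagged_for_review": False}
    if verdict["confidence"] < 0.6:  # assumed review threshold
        verdict["flagged_for_review"] = True
    return verdict

def process_one_claim(claim: str) -> dict:
    # Error containment: a failure here never touches other claims.
    try:
        return generate_verdict(gather_evidence(claim))
    except GateFailure as failure:
        return {"claim": claim, "flagged_for_review": True,
                "reason": str(failure)}

def run_claim_phases(claims: list[str]) -> list[dict]:
    with ThreadPoolExecutor() as pool:  # claims processed independently
        return list(pool.map(process_one_claim, claims))
{{/code}}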
=== LLM Task Delegation ===

All complex cognitive tasks are delegated to LLMs:
* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses
* **Source Evaluation**: Assessing reliability and authority
* **Verdict Generation**: Synthesizing evidence into conclusions
* **Risk Assessment**: Evaluating potential impact

=== Error Mitigation ===

Research shows that sequential LLM calls face compounding error risk. FactHarbor mitigates this through:
* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing**: errors in one claim don't cascade to others

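A back-of-the-envelope illustration of why chained calls compound (the 95% per-phase success rate is an assumed number, not a measurement):

{{code language="python"}}
# If each phase independently succeeds with probability p, a chain of k
# dependent phases succeeds with probability p**k.
p = 0.95
for k in (1, 2, 3, 5):
    print(f"{k} chained phase(s): {p ** k:.3f} success probability")
# Prints roughly 0.95, 0.90, 0.86, and 0.77: keeping per-claim chains
# short and validating between phases limits how far errors compound.
{{/code}}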
== 3. Risk Tiers ==
Risk tiers classify claims by potential impact and guide audit sampling rates.
=== 3.1 Tier A (High Risk) ===
**Domains**: Medical, legal, elections, safety, security
**Characteristics**:
* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation
**Publication**: AKEL publishes automatically with a prominent risk warning
**Audit rate**: Higher sampling recommended
=== 3.2 Tier B (Medium Risk) ===
**Domains**: Complex policy, science, causality claims
**Characteristics**:
* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible
**Publication**: AKEL publishes automatically with a standard risk label
**Audit rate**: Moderate sampling recommended
=== 3.3 Tier C (Low Risk) ===
**Domains**: Definitions, established facts, historical data
**Characteristics**:
* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers
**Publication**: AKEL publishes by default
**Audit rate**: Lower sampling recommended
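The tier definitions above reduce to a small lookup. A sketch, with audit sampling rates that are invented placeholders (this page only says higher/moderate/lower):

{{code language="python"}}
from enum import Enum

class RiskTier(Enum):
    A = "high"    # medical, legal, elections, safety, security
    B = "medium"  # complex policy, science, causality claims
    C = "low"     # definitions, established facts, historical data

# Placeholder sampling rates; actual values are policy decisions.
AUDIT_SAMPLING_RATE = {RiskTier.A: 0.20, RiskTier.B: 0.05, RiskTier.C: 0.01}

def select_for_audit(tier: RiskTier, uniform_draw: float) -> bool:
    """uniform_draw is a random sample from [0, 1)."""
    return uniform_draw < AUDIT_SAMPLING_RATE[tier]
{{/code}}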
== 4. Quality Gates ==
AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked: it is still published).
**Quality gates**:
* Sufficient evidence extracted (≥2 sources)
* Sources meet minimum credibility threshold
* Confidence score calculable
* No detected manipulation patterns
* Claim parseable into testable form
**Failed gates**: Claim published with a flag for moderator review
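The five gates map naturally onto predicate checks. A sketch, assuming a hypothetical result record and an invented credibility threshold:

{{code language="python"}}
MIN_SOURCES = 2        # from the gate list above
MIN_CREDIBILITY = 0.5  # assumed threshold; the real value is policy

def failed_gates(result: dict) -> list[str]:
    """Return the names of failed gates; non-empty means flag, not block."""
    failures = []
    sources = result.get("sources", [])
    if len(sources) < MIN_SOURCES:
        failures.append("insufficient evidence")
    if any(s.get("credibility", 0.0) < MIN_CREDIBILITY for s in sources):
        failures.append("source below credibility threshold")
    if result.get("confidence") is None:
        failures.append("confidence not calculable")
    if result.get("manipulation_detected"):
        failures.append("manipulation pattern detected")
    if not result.get("testable_form"):
        failures.append("claim not parseable into testable form")
    return failures  # the claim is published either way; failures add a flag
{{/code}}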
== 5. Automation Levels ==
{{include reference="FactHarbor.Specification.Diagrams.Automation Level.WebHome"/}}
FactHarbor progresses through automation maturity levels:
**Release 0.5** (Proof-of-Concept): Tier C only, human review required
**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review
**Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits
See [[Automation Roadmap>>FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome]] for detailed progression.

== 5.5 Automation Roadmap ==

{{include reference="FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome"/}}

== 6. Human Role ==
Humans do NOT review content for approval. Instead:
**Monitoring**: Watch aggregate performance metrics
**Improvement**: Fix algorithms when patterns show issues
**Exception handling**: Review AKEL-flagged items
**Governance**: Set policies AKEL applies
See [[Contributor Processes>>FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.

== 6.5 Manual vs Automated Matrix ==

{{include reference="FactHarbor.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}

== 7. Moderation ==
Moderators handle items AKEL flags:
**Abuse detection**: Spam, manipulation, harassment
**Safety issues**: Content that could cause immediate harm
**System gaming**: Attempts to manipulate scoring
**Action**: May temporarily hide content, ban users, or propose algorithm improvements
**Does NOT**: Routinely review claims or override verdicts
See [[Organisational Model>>FactHarbor.Organisation.Organisational-Model]] for moderator role details.
== 8. Continuous Improvement ==
**Performance monitoring**: Track AKEL accuracy, speed, coverage
**Issue identification**: Find systematic errors from metrics
**Algorithm updates**: Deploy improvements to fix patterns
**A/B testing**: Validate changes before full rollout
**Retrospectives**: Learn from failures systematically
See [[Continuous Improvement>>FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.
== 9. Scalability ==
Automation enables FactHarbor to scale:
* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms
Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.
== 10. Transparency ==
All automation is transparent:
* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
* **Confidence calculations** explained
* **Performance metrics** visible
See [[System Performance Metrics>>FactHarbor.Specification.System-Performance-Metrics]] for what we measure.