Wiki source code of Automation

Last modified by Robert Schaub on 2025/12/22 13:50

= Automation =

**How FactHarbor scales through automated claim evaluation.**

== 1. Automation Philosophy ==

FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve the algorithms.

**Why automation:**

* **Scale**: Can process millions of claims
* **Consistency**: The same evaluation criteria are applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in <20 seconds

See [[Automation Philosophy>>Test.FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.

== 2. Claim Processing Flow ==

=== 2.1 User Submits Claim ===

* User provides claim text + source URLs
* System validates format
* Assigns processing ID
* Queues for AKEL processing

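The submission steps above can be sketched as follows. This is a minimal illustration; the class, function names, and validation rules are assumptions for the example, not FactHarbor's actual implementation.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Submission:
    claim_text: str
    source_urls: list
    processing_id: str = field(default_factory=lambda: uuid.uuid4().hex)

def validate_format(sub):
    """Minimal format checks; the real validation rules are not specified here."""
    if not sub.claim_text.strip():
        return False
    if not sub.source_urls:
        return False
    return all(u.startswith(("http://", "https://")) for u in sub.source_urls)

queue = []  # stand-in for the AKEL processing queue

def submit(claim_text, source_urls):
    sub = Submission(claim_text, list(source_urls))
    if not validate_format(sub):
        raise ValueError("invalid submission format")
    queue.append(sub)           # queued for AKEL processing
    return sub.processing_id    # processing ID returned to the user
```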
=== 2.2 AKEL Processing ===

**AKEL automatically:**

1. Parses the claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates the claim against the evidence
5. Generates a verdict with a confidence score
6. Assigns a risk tier (A/B/C)
7. Publishes the result

**Processing time**: Typically <20 seconds
**No human approval required**: Publication is automatic

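The seven stages can be sketched as a simple pipeline. Every function below is a simplified stand-in (placeholder scores, a fixed tier), not the real AKEL implementation:

```python
# Illustrative sketch of the seven AKEL pipeline stages.

def parse_claim(text):
    return [text]                      # 1. testable components (one, trivially)

def extract_evidence(sources):
    return [{"source": s, "supports": True} for s in sources]   # 2.

def score_credibility(evidence):
    for e in evidence:
        e["credibility"] = 0.8         # 3. placeholder credibility score
    return evidence

def evaluate(components, evidence):
    if not evidence:
        return 0.0
    support = sum(e["credibility"] for e in evidence if e["supports"])
    return support / len(evidence)     # 4. net evidential support in [0, 1]

def akel_process(claim_text, sources):
    components = parse_claim(claim_text)
    evidence = score_credibility(extract_evidence(sources))
    confidence = evaluate(components, evidence)
    verdict = "supported" if confidence > 0.5 else "unsupported"  # 5.
    tier = "C"                         # 6. risk tier; fixed here for brevity
    return {"verdict": verdict, "confidence": confidence, "tier": tier}  # 7.
```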
=== 2.3 Publication States ===

**Processing**: AKEL is working on the claim (not visible to the public)
**Published**: AKEL has completed the evaluation (public)

* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated
* Users can report issues

**Flagged**: AKEL identified an issue requiring moderator attention (still public)

* Confidence below threshold
* Detected manipulation attempt
* Unusual pattern
* Moderator reviews and may take action

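The three states form a small state model. A sketch, with the key property that flagged claims stay public:

```python
from enum import Enum

class ClaimState(Enum):
    PROCESSING = "processing"   # AKEL working on the claim; not public
    PUBLISHED = "published"     # evaluation complete; public
    FLAGGED = "flagged"         # moderator attention needed; still public

def is_public(state):
    # A flagged claim remains publicly visible while a moderator reviews it.
    return state in (ClaimState.PUBLISHED, ClaimState.FLAGGED)
```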
== 2.5 LLM-Based Processing Architecture ==

FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:

=== POC: Two-Phase Approach ===

**Phase 1: Claim Extraction**

* Single LLM call to extract all claims from the submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: list of claims with context

**Phase 2: Claim Analysis (Parallel)**

* Single LLM call per claim (parallelizable)
* Full structured output: evidence, scenarios, sources, verdict, risk
* Each claim analyzed independently

**Advantages:**

* Fast to implement (2-4 weeks to a working POC)
* Only 1 + N API calls total (one extraction call plus one analysis call per claim)
* Simple to debug (claim-level isolation)
* Proves concept viability

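The two-phase POC flow can be sketched in a few lines. The LLM calls are stubbed out with trivial functions (the real calls would hit an LLM API); only the call structure, one extraction call followed by N parallel analysis calls, is the point:

```python
from concurrent.futures import ThreadPoolExecutor

def llm_extract_claims(content):
    """Phase 1: one LLM call over the whole submission (stubbed here)."""
    return [s.strip() for s in content.split(".") if s.strip()]

def llm_analyze_claim(claim):
    """Phase 2: one structured LLM call per claim (stubbed here)."""
    return {"claim": claim, "verdict": "unverified", "risk": "C"}

def poc_pipeline(content):
    claims = llm_extract_claims(content)        # 1 extraction call
    with ThreadPoolExecutor() as pool:          # N parallel analysis calls
        return list(pool.map(llm_analyze_claim, claims))
```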
=== Production: Three-Phase Approach ===

**Phase 1: Claim Extraction + Validation**

* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims

**Phase 2: Evidence Gathering (Parallel)**

* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: check evidence quality and source validity
* Error containment: issues in one claim don't affect others

**Phase 3: Verdict Generation (Parallel)**

* For each claim:
** Generate verdict based on validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: check verdict consistency with evidence

**Advantages:**

* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase
* Scalable (parallel processing across claims)
* Adaptable (each phase can be optimized independently)

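A sketch of the production flow with a validation gate between Phases 2 and 3. All function bodies and thresholds are illustrative placeholders; only the gate-then-verdict structure and per-claim independence are taken from the text:

```python
def gather_evidence(claim):
    # Phase 2 stub: evidence, sources and test scenarios per claim.
    return {"claim": claim,
            "evidence": [{"credibility": 0.9, "supports": True}],
            "scenarios": ["baseline scenario"]}

def evidence_gate(bundle, min_sources=1, min_credibility=0.5):
    # Validation gate between Phase 2 and Phase 3 (thresholds illustrative).
    usable = [e for e in bundle["evidence"] if e["credibility"] >= min_credibility]
    return len(usable) >= min_sources

def generate_verdict(bundle, review_threshold=0.7):
    # Phase 3 stub: verdict + confidence; low confidence goes to human review.
    confidence = max(e["credibility"] for e in bundle["evidence"])
    return {"claim": bundle["claim"], "verdict": "supported",
            "confidence": confidence,
            "needs_review": confidence < review_threshold}

def production_pipeline(claims):
    results = []
    for claim in claims:               # claims are independent: no cascade
        bundle = gather_evidence(claim)
        if not evidence_gate(bundle):
            results.append({"claim": claim, "flagged": "insufficient evidence"})
            continue
        results.append(generate_verdict(bundle))
    return results
```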
=== LLM Task Delegation ===

All complex cognitive tasks are delegated to LLMs:

* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses
* **Source Evaluation**: Assessing reliability and authority
* **Verdict Generation**: Synthesizing evidence into conclusions
* **Risk Assessment**: Evaluating potential impact

=== Error Mitigation ===

Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:

* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing**: errors in one claim don't cascade to others

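The compound-error risk is easy to quantify: if each of k sequential LLM calls succeeds with probability p, the whole chain succeeds with probability p^k. The 95% figure below is an illustrative number, not a measured one:

```python
def chain_success(p_per_step, steps):
    # k sequential calls that each succeed with probability p yield an
    # overall success rate of p**k, so errors compound with chain length.
    return p_per_step ** steps

# Example: three 95%-reliable phases in sequence succeed only ~85.7% of the
# time, which is why validation gates and per-claim isolation matter.
```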
== 3. Risk Tiers ==

Risk tiers classify claims by potential impact and guide audit sampling rates.

=== 3.1 Tier A (High Risk) ===

**Domains**: Medical, legal, elections, safety, security
**Characteristics**:

* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation

**Publication**: AKEL publishes automatically with a prominent risk warning
**Audit rate**: Higher sampling recommended

=== 3.2 Tier B (Medium Risk) ===

**Domains**: Complex policy, science, causality claims
**Characteristics**:

* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible

**Publication**: AKEL publishes automatically with a standard risk label
**Audit rate**: Moderate sampling recommended

=== 3.3 Tier C (Low Risk) ===

**Domains**: Definitions, established facts, historical data
**Characteristics**:

* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers

**Publication**: AKEL publishes by default
**Audit rate**: Lower sampling recommended

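The tier policy can be represented as a lookup table driving audit sampling. The audit rates below are placeholder values chosen only to show the higher/moderate/lower ordering; the specification does not fix concrete numbers:

```python
import random

# Illustrative policy table; audit_rate values are placeholders, not
# figures from the specification.
TIER_POLICY = {
    "A": {"label": "prominent risk warning", "audit_rate": 0.20},
    "B": {"label": "standard risk label",    "audit_rate": 0.05},
    "C": {"label": "default publication",    "audit_rate": 0.01},
}

def should_audit(tier, rng=random.random):
    """Sample a published claim for audit at its tier's rate."""
    return rng() < TIER_POLICY[tier]["audit_rate"]
```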
== 4. Quality Gates ==

AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked; it is still published).

**Quality gates**:

* Sufficient evidence extracted (≥2 sources)
* Sources meet the minimum credibility threshold
* Confidence score is calculable
* No detected manipulation patterns
* Claim is parseable into testable form

**Failed gates**: Claim is published with a flag for moderator review

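The five gates above can be sketched as a single check whose failures flag but never block. The result shape and thresholds are illustrative assumptions:

```python
def run_quality_gates(result, min_sources=2, min_credibility=0.5):
    """Evaluate the five gates; failures flag the claim but never block it."""
    gates = {
        "sufficient_evidence": len(result["sources"]) >= min_sources,
        "credible_sources": all(s["credibility"] >= min_credibility
                                for s in result["sources"]),
        "confidence_calculable": result.get("confidence") is not None,
        "no_manipulation": not result.get("manipulation_detected", False),
        "claim_parseable": bool(result.get("components")),
    }
    failed = [name for name, ok in gates.items() if not ok]
    # Published either way; failed gates attach a moderator-review flag.
    return {"published": True,
            "flagged_for_review": bool(failed),
            "failed_gates": failed}
```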
== 5. Automation Levels ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Level.WebHome"/}}

FactHarbor progresses through automation maturity levels:

**Release 0.5** (Proof-of-Concept): Tier C only, human review required
**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review
**Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits

See [[Automation Roadmap>>Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome]] for the detailed progression.

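The maturity levels can be restated as a policy lookup. The Release 0.5 entries for Tiers A and B ("not processed") are an inference from "Tier C only" and are not stated explicitly in the roadmap:

```python
# Policy table derived from the release descriptions above; the 0.5 Tier A/B
# values are inferred, not specified.
AUTOMATION_LEVELS = {
    "0.5": {"A": "not processed", "B": "not processed",
            "C": "human review required"},
    "1.0": {"A": "flagged for review", "B": "auto-published",
            "C": "auto-published"},
    "2.0": {"A": "auto-published with risk label",
            "B": "auto-published with risk label",
            "C": "auto-published with risk label"},
}

def publication_policy(release, tier):
    return AUTOMATION_LEVELS[release][tier]
```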
== 5.5 Automation Roadmap ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome"/}}

== 6. Human Role ==

Humans do NOT review content for approval. Instead:

**Monitoring**: Watch aggregate performance metrics
**Improvement**: Fix algorithms when patterns show issues
**Exception handling**: Review AKEL-flagged items
**Governance**: Set the policies AKEL applies

See [[Contributor Processes>>Test.FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.

== 6.5 Manual vs Automated Matrix ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}

== 7. Moderation ==

Moderators handle items AKEL flags:

**Abuse detection**: Spam, manipulation, harassment
**Safety issues**: Content that could cause immediate harm
**System gaming**: Attempts to manipulate scoring

**Action**: May temporarily hide content, ban users, or propose algorithm improvements
**Does NOT**: Routinely review claims or override verdicts

See [[Organisational Model>>Test.FactHarbor.Organisation.Organisational-Model]] for moderator role details.

== 8. Continuous Improvement ==

**Performance monitoring**: Track AKEL accuracy, speed, coverage
**Issue identification**: Find systematic errors from metrics
**Algorithm updates**: Deploy improvements to fix patterns
**A/B testing**: Validate changes before full rollout
**Retrospectives**: Learn from failures systematically

See [[Continuous Improvement>>Test.FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.

== 9. Scalability ==

Automation enables FactHarbor to scale:

* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms

Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.

== 10. Transparency ==

All automation is transparent:

* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
* **Confidence calculations** explained
* **Performance metrics** visible

See [[System Performance Metrics>>Test.FactHarbor.Specification.System-Performance-Metrics]] for what we measure.