1 = Automation =
2 **How FactHarbor scales through automated claim evaluation.**
3 == 1. Automation Philosophy ==
4 FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.
5 **Why automation:**
6 * **Scale**: Can process millions of claims
7 * **Consistency**: Same evaluation criteria applied uniformly
8 * **Transparency**: Algorithms are auditable
9 * **Speed**: Results in <20 seconds typically
10 See [[Automation Philosophy>>FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.
11 == 2. Claim Processing Flow ==
12 === 2.1 User Submits Claim ===
13 * User provides claim text + source URLs
14 * System validates format
15 * Assigns processing ID
16 * Queues for AKEL processing
17 === 2.2 AKEL Processing ===
18 **AKEL automatically:**
19 1. Parses claim into testable components
20 2. Extracts evidence from sources
21 3. Scores source credibility
22 4. Evaluates claim against evidence
23 5. Generates verdict with confidence score
24 6. Assigns risk tier (A/B/C)
25 7. Publishes result
26 **Processing time**: Typically <20 seconds
27 **No human approval required** - publication is automatic
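The placeholder sketch below maps the seven steps onto a single function. Every step body is dummy logic standing in for the real AKEL components; only the shape of the flow is meant to be informative.

{{code language="python"}}
# Hypothetical outline of the AKEL pipeline; all step logic is a placeholder, not the real system.
from dataclasses import dataclass

@dataclass
class Result:
    verdict: str
    confidence: float
    risk_tier: str

def process_claim(claim_text: str, source_urls: list[str]) -> Result:
    components = [claim_text]                                        # 1. parse into testable components
    evidence = [{"url": u, "snippet": ""} for u in source_urls]      # 2. extract evidence from sources
    credibility = {u: 0.8 for u in source_urls}                      # 3. score source credibility
    support = sum(credibility.values()) / max(len(credibility), 1)   # 4. evaluate claim against evidence
    verdict = "supported" if support >= 0.5 else "unsupported"       # 5. generate verdict...
    confidence = round(support, 2)                                   #    ...with a confidence score
    risk_tier = "C"                                                  # 6. assign risk tier (A/B/C)
    return Result(verdict, confidence, risk_tier)                    # 7. result is published automatically

print(process_claim("Water boils at 100 °C at sea level", ["https://example.org"]))
{{/code}}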
28 === 2.3 Publication States ===
29 **Processing**: AKEL working on claim (not visible to public)
30 **Published**: AKEL completed evaluation (public)
31 * Verdict displayed with confidence score
32 * Evidence and sources shown
33 * Risk tier indicated
34 * Users can report issues
35 **Flagged**: AKEL identified issue requiring moderator attention (still public)
36 * Confidence below threshold
37 * Detected manipulation attempt
38 * Unusual pattern
39 * Moderator reviews and may take action
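One way to read these states is as a small state machine. The transition rules below are an illustrative interpretation of this section (flagging never unpublishes a claim; a moderator resolves the flag), not a specification.

{{code language="python"}}
# Hypothetical model of the three publication states and their transitions.
from enum import Enum

class ClaimState(Enum):
    PROCESSING = "processing"   # AKEL working; not visible to the public
    PUBLISHED = "published"     # evaluation complete; public
    FLAGGED = "flagged"         # still public, but queued for moderator attention

ALLOWED_TRANSITIONS = {
    ClaimState.PROCESSING: {ClaimState.PUBLISHED},
    ClaimState.PUBLISHED: {ClaimState.FLAGGED},
    ClaimState.FLAGGED: {ClaimState.PUBLISHED},   # moderator reviews and resolves the flag
}

def transition(current: ClaimState, target: ClaimState) -> ClaimState:
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition {current.value} -> {target.value}")
    return target
{{/code}}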
40
41 == 2.5 LLM-Based Processing Architecture ==
42
43 FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:
44
45 === POC: Two-Phase Approach ===
46
47 **Phase 1: Claim Extraction**
48 * Single LLM call to extract all claims from submitted content
49 * Light structure, focused on identifying distinct verifiable claims
50 * Output: List of claims with context
51
52 **Phase 2: Claim Analysis (Parallel)**
53 * Single LLM call per claim (parallelizable)
54 * Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
55 * Each claim analyzed independently
56
57 **Advantages:**
58 * Fast to implement (quick path to a working POC)
59 * Only 1 + N API calls in total (one extraction call plus one analysis call per claim)
60 * Simple to debug (claim-level isolation)
61 * Proves concept viability
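A minimal sketch of the two-phase flow is shown below, assuming a generic llm_call client and a thread pool for the parallel per-claim analysis; the prompt wording and the output parsing are placeholders.

{{code language="python"}}
# Two-phase POC sketch: one extraction call, then one analysis call per claim, run in parallel.
from concurrent.futures import ThreadPoolExecutor

def llm_call(prompt: str) -> str:
    """Stand-in for whatever LLM client is actually used."""
    return f"[model output for: {prompt[:40]}...]"

def extract_claims(content: str) -> list[str]:
    # Phase 1: a single call that lists the distinct verifiable claims.
    _ = llm_call(f"List the distinct verifiable claims in:\n{content}")
    return ["claim 1", "claim 2"]   # parsing of the model output is omitted here

def analyze_claim(claim: str) -> dict:
    # Phase 2: one structured-analysis call per claim (evidence, scenarios, sources, verdict, risk).
    raw = llm_call(f"Analyze evidence, sources, verdict, and risk for: {claim}")
    return {"claim": claim, "analysis": raw}

def run_poc_pipeline(content: str) -> list[dict]:
    claims = extract_claims(content)
    with ThreadPoolExecutor() as pool:   # claims are independent, so they can be analyzed in parallel
        return list(pool.map(analyze_claim, claims))
{{/code}}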
62
63 === Production: Three-Phase Approach ===
64
65 **Phase 1: Claim Extraction + Validation**
66 * Extract distinct verifiable claims
67 * Validate claim clarity and uniqueness
68 * Remove duplicates and vague claims
69
70 **Phase 2: Evidence Gathering (Parallel)**
71 * For each claim independently:
72 ** Find supporting and contradicting evidence
73 ** Identify authoritative sources
74 ** Generate test scenarios
75 * Validation: Check evidence quality and source validity
76 * Error containment: Issues in one claim don't affect others
77
78 **Phase 3: Verdict Generation (Parallel)**
79 * For each claim:
80 ** Generate verdict based on validated evidence
81 ** Assess confidence and risk level
82 ** Flag low-confidence results for human review
83 * Validation: Check verdict consistency with evidence
84
85 **Advantages:**
86 * Error containment between phases
87 * Clear quality gates and validation
88 * Observable metrics per phase
89 * Scalable (parallel processing across claims)
90 * Adaptable (can optimize each phase independently)
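The sketch below shows phases 2 and 3 for a single claim, with a validation gate after each phase; the threshold, field names, and placeholder phase bodies are assumptions made for illustration.

{{code language="python"}}
# Hypothetical three-phase flow for one claim, with a validation gate after each phase.
def validate(item: dict, required_keys: list[str]) -> dict:
    """Quality gate: fail fast if a phase produced incomplete output."""
    missing = [k for k in required_keys if not item.get(k)]
    if missing:
        raise ValueError(f"Validation gate failed, missing: {missing}")
    return item

def gather_evidence(claim: str) -> dict:
    # Phase 2 (per claim): evidence, sources, and test scenarios would come from LLM calls.
    return {"claim": claim, "evidence": ["..."], "sources": ["..."], "scenarios": ["..."]}

def generate_verdict(bundle: dict) -> dict:
    # Phase 3 (per claim): verdict, confidence, and risk based only on validated evidence.
    return {**bundle, "verdict": "supported", "confidence": 0.74, "risk": "B"}

def process_one_claim(claim: str) -> dict:
    evidence_bundle = validate(gather_evidence(claim), ["evidence", "sources", "scenarios"])
    result = validate(generate_verdict(evidence_bundle), ["verdict", "confidence", "risk"])
    if result["confidence"] < 0.6:   # illustrative threshold for the human review queue
        result["needs_human_review"] = True
    return result
{{/code}}

Because each claim runs through this flow independently, a failed gate stops only that claim; the others continue unaffected.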
91
92 === LLM Task Delegation ===
93
94 All complex cognitive tasks are delegated to LLMs:
95 * **Claim Extraction**: Understanding context, identifying distinct claims
96 * **Evidence Finding**: Analyzing sources, assessing relevance
97 * **Scenario Generation**: Creating testable hypotheses
98 * **Source Evaluation**: Assessing reliability and authority
99 * **Verdict Generation**: Synthesizing evidence into conclusions
100 * **Risk Assessment**: Evaluating potential impact
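One possible shape for this delegation is a task-to-prompt registry. The task names follow the list above, but the prompt wording and the registry structure are invented for the example.

{{code language="python"}}
# Illustrative registry of delegated LLM tasks; the prompt texts are placeholders.
LLM_TASKS = {
    "claim_extraction":    "List each distinct, verifiable claim in the text below.",
    "evidence_finding":    "For the claim '{claim}', list supporting and contradicting evidence.",
    "scenario_generation": "Propose testable scenarios that would confirm or refute: {claim}",
    "source_evaluation":   "Assess the reliability and authority of the source at {url}.",
    "verdict_generation":  "Given the gathered evidence, state a verdict for: {claim}",
    "risk_assessment":     "Rate the potential impact if this claim were evaluated incorrectly: {claim}",
}

def build_prompt(task: str, **fields: str) -> str:
    return LLM_TASKS[task].format(**fields)

print(build_prompt("verdict_generation", claim="Water boils at 100 °C at sea level"))
{{/code}}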
101
102 === Error Mitigation ===
103
104 Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:
105 * **Validation gates** between phases
106 * **Confidence thresholds** for quality control
107 * **Parallel processing** to avoid error propagation across claims
108 * **Human review queue** for low-confidence verdicts
109 * **Independent claim processing** - errors in one claim don't cascade to others
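As a back-of-the-envelope illustration of why chain length matters (using assumed, not measured, per-call error rates): with a per-call error rate e, a chain of k dependent calls completes without error with probability (1 - e)^k, which is why keeping per-claim chains short and independent limits cascading failures.

{{code language="python"}}
# Assumed error rates for illustration only; not measured FactHarbor figures.
def chain_success(per_call_error: float, calls_in_chain: int) -> float:
    """Probability that a chain of dependent LLM calls completes without any error."""
    return (1 - per_call_error) ** calls_in_chain

print(round(chain_success(0.05, 3), 3))    # ~0.857 for a short, per-claim chain
print(round(chain_success(0.05, 10), 3))   # ~0.599 if errors could cascade across a long chain
{{/code}}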
110
111 == 3. Risk Tiers ==
112 Risk tiers classify claims by potential impact and guide audit sampling rates.
113 === 3.1 Tier A (High Risk) ===
114 **Domains**: Medical, legal, elections, safety, security
115 **Characteristics**:
116 * High potential for harm if incorrect
117 * Complex specialized knowledge required
118 * Often subject to regulation
119 **Publication**: AKEL publishes automatically with prominent risk warning
120 **Audit rate**: Higher sampling recommended
121 === 3.2 Tier B (Medium Risk) ===
122 **Domains**: Complex policy, science, causality claims
123 **Characteristics**:
124 * Moderate potential impact
125 * Requires careful evidence evaluation
126 * Multiple valid interpretations possible
127 **Publication**: AKEL publishes automatically with standard risk label
128 **Audit rate**: Moderate sampling recommended
129 === 3.3 Tier C (Low Risk) ===
130 **Domains**: Definitions, established facts, historical data
131 **Characteristics**:
132 * Low potential for harm
133 * Well-documented information
134 * Typically clear right/wrong answers
135 **Publication**: AKEL publishes by default
136 **Audit rate**: Lower sampling recommended
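The tier descriptions above can be reduced to a simple lookup plus an audit-sampling decision. The domain lists and sampling rates below are example values only; this section deliberately leaves the actual audit rates as recommendations.

{{code language="python"}}
# Illustrative tier assignment and audit sampling; domain lists and rates are examples.
import random

TIER_BY_DOMAIN = {
    "medical": "A", "legal": "A", "elections": "A", "safety": "A", "security": "A",
    "policy": "B", "science": "B", "causality": "B",
    "definition": "C", "established_fact": "C", "historical_data": "C",
}

AUDIT_SAMPLING_RATE = {"A": 0.20, "B": 0.10, "C": 0.02}   # assumed rates for the sketch

def assign_tier(domain: str) -> str:
    return TIER_BY_DOMAIN.get(domain, "B")   # unknown domains default to medium risk here

def select_for_audit(tier: str) -> bool:
    return random.random() < AUDIT_SAMPLING_RATE[tier]
{{/code}}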
137 == 4. Quality Gates ==
138 AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked - it is still published).
139 **Quality gates**:
140 * Sufficient evidence extracted (≥2 sources)
141 * Sources meet minimum credibility threshold
142 * Confidence score calculable
143 * No detected manipulation patterns
144 * Claim parseable into testable form
145 **Failed gates**: The claim is published with a flag for moderator review
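A sketch of these gates as a flag-only (never blocking) check is shown below; the thresholds and record fields are assumptions made for the example.

{{code language="python"}}
# Hypothetical pre-publication gates; a non-empty result flags the claim but never blocks it.
MIN_SOURCES = 2        # assumed threshold matching the ">=2 sources" gate
MIN_CREDIBILITY = 0.5  # assumed minimum credibility score

def quality_flags(claim: dict) -> list[str]:
    flags = []
    if len(claim.get("sources", [])) < MIN_SOURCES:
        flags.append("insufficient_evidence")
    if any(s.get("credibility", 0) < MIN_CREDIBILITY for s in claim.get("sources", [])):
        flags.append("low_credibility_source")
    if claim.get("confidence") is None:
        flags.append("confidence_not_calculable")
    if claim.get("manipulation_detected"):
        flags.append("manipulation_pattern")
    if not claim.get("testable_form"):
        flags.append("not_parseable")
    return flags

claim = {"sources": [{"credibility": 0.9}], "confidence": 0.8, "testable_form": True}
print(quality_flags(claim))   # ['insufficient_evidence'] -> published, with a moderator flag
{{/code}}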
146 == 5. Automation Levels ==
147 {{include reference="FactHarbor.Specification.Diagrams.Automation Level.WebHome"/}}
148 FactHarbor progresses through automation maturity levels:
149 **Release 0.5** (Proof-of-Concept): Tier C only, human review required
150 **Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review
151 **Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits
152 See [[Automation Roadmap>>FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome]] for detailed progression.
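The maturity levels can also be captured as a small configuration table; the keys and values below are illustrative only, not configured FactHarbor settings.

{{code language="python"}}
# Illustrative encoding of the automation maturity levels described above.
AUTOMATION_LEVELS = {
    "0.5": {"auto_publish": set(),           "human_review": {"C"}},   # PoC: Tier C only, reviewed
    "1.0": {"auto_publish": {"B", "C"},      "human_review": {"A"}},   # Tier A flagged for review
    "2.0": {"auto_publish": {"A", "B", "C"}, "human_review": set()},   # sampling audits instead
}

def publishes_automatically(release: str, tier: str) -> bool:
    return tier in AUTOMATION_LEVELS[release]["auto_publish"]
{{/code}}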
153
154 == 5.5 Automation Roadmap ==
155
156 {{include reference="FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome"/}}
157
158 == 6. Human Role ==
159 Humans do NOT review content for approval. Instead:
160 **Monitoring**: Watch aggregate performance metrics
161 **Improvement**: Fix algorithms when patterns show issues
162 **Exception handling**: Review AKEL-flagged items
163 **Governance**: Set policies AKEL applies
164 See [[Contributor Processes>>FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.
165
166 == 6.5 Manual vs Automated Matrix ==
167
168 {{include reference="FactHarbor.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}
169
170 == 7. Moderation ==
171 Moderators handle items AKEL flags:
172 **Abuse detection**: Spam, manipulation, harassment
173 **Safety issues**: Content that could cause immediate harm
174 **System gaming**: Attempts to manipulate scoring
175 **Action**: May temporarily hide content, ban users, or propose algorithm improvements
176 **Does NOT**: Routinely review claims or override verdicts
177 See [[Organisational Model>>FactHarbor.Organisation.Organisational-Model]] for moderator role details.
178 == 8. Continuous Improvement ==
179 **Performance monitoring**: Track AKEL accuracy, speed, coverage
180 **Issue identification**: Find systematic errors from metrics
181 **Algorithm updates**: Deploy improvements to fix patterns
182 **A/B testing**: Validate changes before full rollout
183 **Retrospectives**: Learn from failures systematically
184 See [[Continuous Improvement>>FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for improvement cycle.
185 == 9. Scalability ==
186 Automation enables FactHarbor to scale:
187 * **Millions of claims** processable
188 * **Consistent quality** at any volume
189 * **Cost efficiency** through automation
190 * **Rapid iteration** on algorithms
191 Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.
192 == 10. Transparency ==
193 All automation is transparent:
194 * **Algorithm parameters** documented
195 * **Evaluation criteria** public
196 * **Source scoring rules** explicit
197 * **Confidence calculations** explained
198 * **Performance metrics** visible
199 See [[System Performance Metrics>>FactHarbor.Specification.System-Performance-Metrics]] for what we measure.