= Automation =

**How FactHarbor scales through automated claim evaluation.**

== 1. Automation Philosophy ==

FactHarbor is **automation-first**: AKEL (the AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve the algorithms.

**Why automation:**

* **Scale**: Can process millions of claims
* **Consistency**: Same evaluation criteria applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in <20 seconds

See [[Automation Philosophy>>Test.FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.

== 2. Claim Processing Flow ==

=== 2.1 User Submits Claim ===

* User provides claim text + source URLs
* System validates the format
* System assigns a processing ID
* Claim is queued for AKEL processing
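
To make the intake concrete, here is a minimal sketch of this step. It is illustrative only: the ClaimSubmission record, the validation rules, and the in-memory queue are assumptions made for the example, not details specified by FactHarbor.

{{code language="python"}}
import uuid
from dataclasses import dataclass, field

@dataclass
class ClaimSubmission:
    """Hypothetical intake record: the claim text plus its source URLs."""
    claim_text: str
    source_urls: list[str]
    processing_id: str = field(default_factory=lambda: uuid.uuid4().hex)

def submit_claim(claim_text: str, source_urls: list[str], queue: list) -> str:
    """Validate the format, assign a processing ID, and queue the claim for AKEL."""
    if not claim_text.strip():
        raise ValueError("claim text must not be empty")
    if not source_urls or not all(u.startswith(("http://", "https://")) for u in source_urls):
        raise ValueError("at least one http(s) source URL is required")
    submission = ClaimSubmission(claim_text, source_urls)
    queue.append(submission)           # stand-in for the real processing queue
    return submission.processing_id    # returned to the submitter for tracking
{{/code}}

In a real deployment the queue would be a durable job queue rather than a Python list, and the processing ID is what later identifies the published result.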

=== 2.2 AKEL Processing ===

**AKEL automatically:**

1. Parses claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates claim against evidence
5. Generates verdict with confidence score
6. Assigns risk tier (A/B/C)
7. Publishes result

**Processing time**: Typically <20 seconds
**No human approval required**: publication is automatic
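
Read as code, the steps above form a short pipeline. The sketch below is illustrative only: each step is a trivial placeholder for an AKEL component, and the verdict, confidence, and tier logic is invented purely for the example.

{{code language="python"}}
from dataclasses import dataclass

@dataclass
class Result:
    """Shape of a published AKEL result (illustrative)."""
    verdict: str
    confidence: float
    risk_tier: str
    state: str

def parse_claim(text: str) -> list[str]:
    # 1. Split the claim into testable components (placeholder logic).
    return [part.strip() for part in text.split(";") if part.strip()]

def extract_evidence(urls: list[str]) -> list[str]:
    # 2. Pull evidence from each source (placeholder: just echoes the URL).
    return [f"evidence from {url}" for url in urls]

def score_credibility(url: str) -> float:
    # 3. Score source credibility (placeholder constant).
    return 0.8

def akel_process(claim_text: str, urls: list[str]) -> Result:
    components = parse_claim(claim_text)
    evidence = extract_evidence(urls)
    scores = [score_credibility(url) for url in urls]
    # 4./5. Evaluate the claim against the evidence and derive a confidence score.
    confidence = round(sum(scores) / len(scores), 2) if scores else 0.0
    verdict = "supported" if components and evidence else "unverifiable"
    # 6. Assign a risk tier (placeholder: everything is treated as low risk).
    risk_tier = "C"
    # 7. Publish automatically; there is no human approval step.
    return Result(verdict, confidence, risk_tier, state="Published")

print(akel_process("Water boils at 100 °C at sea level", ["https://example.org"]))
{{/code}}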

=== 2.3 Publication States ===

**Processing**: AKEL working on claim (not visible to public)
**Published**: AKEL completed evaluation (public)

* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated
* Users can report issues

**Flagged**: AKEL identified an issue requiring moderator attention (still public)

* Confidence score below threshold
* Detected manipulation attempt
* Unusual pattern
* Moderator reviews and may take action

== 2.5 LLM-Based Processing Architecture ==

FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:

=== POC: Two-Phase Approach ===

**Phase 1: Claim Extraction**

* Single LLM call to extract all claims from the submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: List of claims with context

**Phase 2: Claim Analysis (Parallel)**

* Single LLM call per claim (parallelizable)
* Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
* Each claim analyzed independently

**Advantages:**

* Fast to implement (2-4 weeks to a working POC)
* Only 1 + N API calls total (one extraction call, then one analysis call per claim)
* Simple to debug (claim-level isolation)
* Proves concept viability
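
In code terms, the POC shape is one extraction call followed by per-claim analysis calls issued concurrently. The sketch below assumes a hypothetical llm client exposing a complete() coroutine, with a stub backend so the example runs; the 1 + N call pattern is the point, not the API.

{{code language="python"}}
import asyncio
import json

class FakeLLM:
    """Stand-in client so the sketch runs without a real LLM backend."""
    async def complete(self, prompt: str) -> str:
        if prompt.startswith("List each"):
            return "- The Eiffel Tower is in Paris"
        return json.dumps({"evidence": [], "scenarios": [], "sources": [],
                           "verdict": "supported", "risk": "C"})

async def extract_claims(llm, content: str) -> list[str]:
    # Phase 1: a single LLM call that lists the distinct verifiable claims.
    reply = await llm.complete(f"List each distinct verifiable claim in:\n{content}")
    return [line.lstrip("- ").strip() for line in reply.splitlines() if line.strip()]

async def analyse_claim(llm, claim: str) -> dict:
    # Phase 2: one LLM call per claim, asking for a fully structured result.
    reply = await llm.complete(
        "Return JSON with keys evidence, scenarios, sources, verdict, risk for:\n" + claim
    )
    return json.loads(reply)

async def poc_pipeline(llm, content: str) -> list[dict]:
    claims = await extract_claims(llm, content)       # 1 extraction call
    # N analysis calls, one per claim, issued in parallel; claims stay independent.
    return await asyncio.gather(*(analyse_claim(llm, c) for c in claims))

print(asyncio.run(poc_pipeline(FakeLLM(), "The Eiffel Tower is in Paris.")))
{{/code}}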

=== Production: Three-Phase Approach ===

**Phase 1: Claim Extraction + Validation**

* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims

**Phase 2: Evidence Gathering (Parallel)**

* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: Check evidence quality and source validity
* Error containment: Issues in one claim don't affect others

**Phase 3: Verdict Generation (Parallel)**

* For each claim:
** Generate a verdict based on the validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: Check verdict consistency with evidence

**Advantages:**

* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase
* Scalable (parallel processing across claims)
* Adaptable (can optimize each phase independently)
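
The control flow of the production approach, with validation gates between phases, could look roughly like the sketch below. Every function in it is a placeholder invented for illustration; only the gate-and-containment structure mirrors the description above.

{{code language="python"}}
# Schematic of the three-phase production flow with validation gates.
# All functions are simple placeholders, not real AKEL components.

def extract_claims(content):
    # Phase 1: extract claims, then keep only clear, unique ones.
    claims = [c.strip() for c in content.split(".") if c.strip()]
    return list(dict.fromkeys(claims))                 # de-duplicate, keep order

def gather_evidence(claim):
    # Phase 2 (per claim): evidence, sources, test scenarios.
    return {"claim": claim, "evidence": [f"source discussing: {claim}"]}

def evidence_is_valid(bundle):
    # Validation gate between Phase 2 and Phase 3.
    return len(bundle["evidence"]) > 0

def generate_verdict(bundle):
    # Phase 3 (per claim): verdict, confidence, risk.
    return {**bundle, "verdict": "supported", "confidence": 0.7, "risk": "B"}

def production_pipeline(content):
    results = []
    for claim in extract_claims(content):              # Phase 1 output
        bundle = gather_evidence(claim)                 # Phase 2, independent per claim
        if not evidence_is_valid(bundle):               # gate: stop bad data early
            results.append({"claim": claim, "flag": "insufficient evidence"})
            continue                                    # error contained to this claim
        results.append(generate_verdict(bundle))        # Phase 3
    return results

print(production_pipeline("Water boils at 100 °C at sea level. The Eiffel Tower is in Paris."))
{{/code}}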

=== LLM Task Delegation ===

All complex cognitive tasks are delegated to LLMs:

* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses
* **Source Evaluation**: Assessing reliability and authority
* **Verdict Generation**: Synthesizing evidence into conclusions
* **Risk Assessment**: Evaluating potential impact

=== Error Mitigation ===

Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:

* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing**: errors in one claim don't cascade to others

== 3. Risk Tiers ==

Risk tiers classify claims by potential impact and guide audit sampling rates.

=== 3.1 Tier A (High Risk) ===

**Domains**: Medical, legal, elections, safety, security
**Characteristics**:

* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation

**Publication**: AKEL publishes automatically with prominent risk warning
**Audit rate**: Higher sampling recommended

=== 3.2 Tier B (Medium Risk) ===

**Domains**: Complex policy, science, causality claims
**Characteristics**:

* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible

**Publication**: AKEL publishes automatically with standard risk label
**Audit rate**: Moderate sampling recommended

=== 3.3 Tier C (Low Risk) ===

**Domains**: Definitions, established facts, historical data
**Characteristics**:

* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers

**Publication**: AKEL publishes by default
**Audit rate**: Lower sampling recommended
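
One way the tiering could be wired up is sketched below. The domain keywords come from the tier descriptions above, while the audit sampling percentages are placeholder assumptions: this page only says higher, moderate, or lower sampling.

{{code language="python"}}
# Illustrative tier lookup. Sampling rates are placeholders, not policy.
TIER_BY_DOMAIN = {
    "medical": "A", "legal": "A", "elections": "A", "safety": "A", "security": "A",
    "policy": "B", "science": "B", "causality": "B",
    "definition": "C", "established fact": "C", "historical": "C",
}

AUDIT_SAMPLING = {"A": 0.20, "B": 0.10, "C": 0.02}     # assumed example rates

def classify(domain: str) -> tuple[str, float]:
    """Return (risk tier, audit sampling rate) for a claim's domain."""
    tier = TIER_BY_DOMAIN.get(domain.lower(), "B")      # unknown domains: assume middle tier
    return tier, AUDIT_SAMPLING[tier]

print(classify("medical"))   # ('A', 0.2) -> published automatically with a prominent risk warning
{{/code}}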

== 4. Quality Gates ==

AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** rather than blocked; it is still published.

**Quality gates**:

* Sufficient evidence extracted (≥2 sources)
* Sources meet minimum credibility threshold
* Confidence score calculable
* No detected manipulation patterns
* Claim parseable into testable form

**Failed gates**: Claim published with a flag for moderator review
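
A minimal sketch of these gates as code, assuming a hypothetical Evaluation record and an invented credibility threshold; note that a failed gate only adds a flag for moderators and never blocks publication, matching the rule above.

{{code language="python"}}
from dataclasses import dataclass
from typing import Optional

@dataclass
class Evaluation:
    """Hypothetical AKEL output checked by the quality gates."""
    sources: list[str]
    source_credibility: list[float]
    confidence: Optional[float]
    manipulation_detected: bool
    parsed_components: list[str]

MIN_CREDIBILITY = 0.5        # placeholder threshold, not a documented value

def quality_flags(ev: Evaluation) -> list[str]:
    """Return the failed gates; an empty list means publish without flags."""
    flags = []
    if len(ev.sources) < 2:
        flags.append("insufficient evidence (<2 sources)")
    if ev.source_credibility and min(ev.source_credibility) < MIN_CREDIBILITY:
        flags.append("source below credibility threshold")
    if ev.confidence is None:
        flags.append("confidence score not calculable")
    if ev.manipulation_detected:
        flags.append("manipulation pattern detected")
    if not ev.parsed_components:
        flags.append("claim not parseable into testable form")
    return flags             # the claim is published either way; flags go to moderators
{{/code}}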

== 5. Automation Levels ==

{{include reference="Test.FactHarbor pre11 V0\.9\.70.Specification.Diagrams.Automation Level.WebHome"/}}

FactHarbor progresses through automation maturity levels:

**Release 0.5** (Proof-of-Concept): Tier C only, human review required
**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review
**Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits

See [[Automation Roadmap>>Test.FactHarbor pre11 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome]] for detailed progression.

== 5.5 Automation Roadmap ==

{{include reference="Test.FactHarbor pre11 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome"/}}

== 6. Human Role ==

Humans do NOT review content for approval. Instead:

**Monitoring**: Watch aggregate performance metrics
**Improvement**: Fix algorithms when patterns show issues
**Exception handling**: Review AKEL-flagged items
**Governance**: Set policies AKEL applies

See [[Contributor Processes>>Test.FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.

== 6.5 Manual vs Automated Matrix ==

{{include reference="Test.FactHarbor pre11 V0\.9\.70.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}

== 7. Moderation ==

Moderators handle items AKEL flags:

**Abuse detection**: Spam, manipulation, harassment
**Safety issues**: Content that could cause immediate harm
**System gaming**: Attempts to manipulate scoring

**Action**: May temporarily hide content, ban users, or propose algorithm improvements
**Does NOT**: Routinely review claims or override verdicts

See [[Organisational Model>>Test.FactHarbor.Organisation.Organisational-Model]] for moderator role details.

== 8. Continuous Improvement ==

**Performance monitoring**: Track AKEL accuracy, speed, and coverage
**Issue identification**: Find systematic errors from metrics
**Algorithm updates**: Deploy improvements that fix recurring error patterns
**A/B testing**: Validate changes before full rollout
**Retrospectives**: Learn from failures systematically

See [[Continuous Improvement>>Test.FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.

== 9. Scalability ==

Automation enables FactHarbor to scale:

* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms

Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.

== 10. Transparency ==

All automation is transparent:

* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
* **Confidence calculations** explained
* **Performance metrics** visible

See [[System Performance Metrics>>Test.FactHarbor.Specification.System-Performance-Metrics]] for what we measure.