= Automation =
**How FactHarbor scales through automated claim evaluation.**
== 1. Automation Philosophy ==
FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.
**Why automation:**
* **Scale**: Can process millions of claims
* **Consistency**: Same evaluation criteria applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in <20 seconds
See [[Automation Philosophy>>Test.FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.
== 2. Claim Processing Flow ==
=== 2.1 User Submits Claim ===
* User provides claim text + source URLs
* System validates format
* Assigns processing ID
* Queues for AKEL processing (see the sketch below)
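The submission step can be illustrated with a short sketch. The names (ClaimSubmission, validate_submission) and the UUID-based processing ID are assumptions for illustration, not FactHarbor's actual API:

{{code language="python"}}
import uuid
from dataclasses import dataclass, field

@dataclass
class ClaimSubmission:
    claim_text: str
    source_urls: list[str]
    # Processing ID assigned at submission time (illustrative scheme).
    processing_id: str = field(default_factory=lambda: str(uuid.uuid4()))

def validate_submission(sub: ClaimSubmission) -> list[str]:
    """Return format problems; an empty list means the claim can be queued for AKEL."""
    problems = []
    if not sub.claim_text.strip():
        problems.append("claim text is empty")
    if not sub.source_urls:
        problems.append("at least one source URL is required")
    elif any(not u.startswith(("http://", "https://")) for u in sub.source_urls):
        problems.append("source URLs must be absolute http(s) URLs")
    return problems

sub = ClaimSubmission("The Rhine is about 1,230 km long.", ["https://example.org/rhine"])
print(sub.processing_id, validate_submission(sub))
{{/code}}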
=== 2.2 AKEL Processing ===
**AKEL automatically:**
1. Parses claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates claim against evidence
5. Generates verdict with confidence score
6. Assigns risk tier (A/B/C)
7. Publishes result
**Processing time**: Typically <20 seconds
**No human approval required** - publication is automatic (sketched below)
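A minimal sketch of this seven-step flow. The heuristics, thresholds, and Result fields below are illustrative stand-ins for AKEL's real parsing, evidence extraction, and scoring logic, not the actual implementation:

{{code language="python"}}
from dataclasses import dataclass

@dataclass
class Result:
    verdict: str
    confidence: float
    risk_tier: str
    published: bool

def process_claim(claim_text: str, source_urls: list[str]) -> Result:
    # 1. Parse the claim into testable components (naive stand-in: sentence split).
    components = [p.strip() for p in claim_text.split(".") if p.strip()]
    # 2.-3. Evidence extraction and source-credibility scoring (stand-in heuristic).
    source_score = min(1.0, len(source_urls) / 3)
    # 4.-5. Evaluate against evidence and generate a verdict with a confidence score.
    confidence = round(0.5 * source_score + (0.5 if components else 0.0), 2)
    verdict = "supported" if confidence >= 0.7 else "uncertain"
    # 6. Risk tier would come from domain classification; defaulted here.
    risk_tier = "C"
    # 7. Publication is automatic; there is no human approval step.
    return Result(verdict, confidence, risk_tier, published=True)

print(process_claim("The Rhine is about 1,230 km long.", ["https://example.org/rhine"]))
{{/code}}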
=== 2.3 Publication States ===
**Processing**: AKEL working on claim (not visible to public)
**Published**: AKEL completed evaluation (public)
* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated
* Users can report issues
**Flagged**: AKEL identified issue requiring moderator attention (still public)
* Confidence below threshold
* Detected manipulation attempt
* Unusual pattern
* Moderator reviews and may take action
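A minimal sketch of the three publication states, mainly to make explicit that flagged claims stay public; the enum and helper are hypothetical, not part of an actual codebase:

{{code language="python"}}
from enum import Enum

class PublicationState(Enum):
    PROCESSING = "processing"  # AKEL still working; not visible to the public
    PUBLISHED = "published"    # evaluation complete; publicly visible
    FLAGGED = "flagged"        # published, but queued for moderator attention

def is_public(state: PublicationState) -> bool:
    """Flagged claims remain public; only in-progress claims are hidden."""
    return state is not PublicationState.PROCESSING

assert is_public(PublicationState.FLAGGED)
{{/code}}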
== 2.5 LLM-Based Processing Architecture ==
FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:
=== POC: Two-Phase Approach ===
**Phase 1: Claim Extraction**
* Single LLM call to extract all claims from submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: List of claims with context
**Phase 2: Claim Analysis (Parallel)**
* Single LLM call per claim (parallelizable)
* Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
* Each claim analyzed independently
**Advantages:**
* Fast to implement (2-4 weeks to working POC)
* Only 1 + N API calls in total (one extraction call plus one per claim)
* Simple to debug (claim-level isolation)
* Proves concept viability (see the sketch below)
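The two-phase POC structure can be sketched as one extraction call followed by N parallel analysis calls. The call_llm stub and the claim-splitting logic are placeholders, not a real LLM client:

{{code language="python"}}
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    """Stand-in for a single LLM API call; swap in a real client here."""
    return "[model output for: " + prompt.splitlines()[0][:60] + "]"

def extract_claims(content: str) -> list[str]:
    """Phase 1: one call to identify distinct verifiable claims (parsing is a stand-in)."""
    _ = call_llm("Extract the distinct verifiable claims:\n" + content)
    return [line.strip() for line in content.splitlines() if line.strip()]

def analyze_claim(claim: str) -> str:
    """Phase 2: one structured call per claim (evidence, scenarios, sources, verdict, risk)."""
    return call_llm("Analyze and return evidence, scenarios, sources, verdict, risk:\n" + claim)

def run_poc(content: str) -> list[str]:
    claims = extract_claims(content)            # 1 extraction call
    with ThreadPoolExecutor() as pool:          # N analysis calls, one per claim, in parallel
        return list(pool.map(analyze_claim, claims))

print(run_poc("The Rhine is about 1,230 km long.\nCoffee consumption causes insomnia."))
{{/code}}

Because each claim is analyzed in its own call, a problem with one claim stays isolated to that call, which is what makes claim-level debugging simple.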
=== Production: Three-Phase Approach ===
**Phase 1: Claim Extraction + Validation**
* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims
**Phase 2: Evidence Gathering (Parallel)**
* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: Check evidence quality and source validity
* Error containment: Issues in one claim don't affect others
**Phase 3: Verdict Generation (Parallel)**
* For each claim:
** Generate verdict based on validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: Check verdict consistency with evidence
**Advantages:**
* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase
* Scalable (parallel processing across claims)
* Adaptable (can optimize each phase independently)
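As one example of a quality gate between phases, here is a minimal sketch of the Phase 1 validation step (duplicate removal plus a crude vagueness filter); the word-count heuristic is an illustrative assumption, not FactHarbor's actual rule:

{{code language="python"}}
def validate_extracted_claims(claims: list[str]) -> list[str]:
    """Phase 1 gate: keep only distinct, reasonably specific claims."""
    seen: set[str] = set()
    kept: list[str] = []
    for claim in claims:
        normalized = " ".join(claim.lower().split())
        if not normalized or normalized in seen:
            continue                      # drop duplicates and empty strings
        if len(normalized.split()) < 4:
            continue                      # drop claims too vague to test (assumed heuristic)
        seen.add(normalized)
        kept.append(claim.strip())
    return kept

print(validate_extracted_claims([
    "The Rhine is about 1,230 km long.",
    "The Rhine is about 1,230 km long.",   # duplicate -> removed
    "Bad things.",                          # too vague -> removed
]))
{{/code}}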
=== LLM Task Delegation ===
All complex cognitive tasks are delegated to LLMs:
* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses
* **Source Evaluation**: Assessing reliability and authority
* **Verdict Generation**: Synthesizing evidence into conclusions
* **Risk Assessment**: Evaluating potential impact
=== Error Mitigation ===
Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:
* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing** - errors in one claim don't cascade to others (see the sketch below)
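A minimal sketch of the containment idea: each claim is evaluated in its own try/except so one failure is flagged instead of cascading, and low-confidence results are routed toward the human review queue. The 0.7 threshold and the function names are assumptions for illustration:

{{code language="python"}}
from concurrent.futures import ThreadPoolExecutor

def evaluate_claim(claim: str) -> dict:
    """Stand-in for the full per-claim pipeline; may raise on bad input."""
    if not claim.strip():
        raise ValueError("empty claim")
    return {"claim": claim, "verdict": "supported", "confidence": 0.85}

def evaluate_all(claims: list[str]) -> list[dict]:
    """Claims are processed independently; a failure or low confidence flags only that claim."""
    def contained(claim: str) -> dict:
        try:
            result = evaluate_claim(claim)
        except Exception as exc:
            return {"claim": claim, "verdict": None, "flagged": True, "reason": str(exc)}
        result["flagged"] = result["confidence"] < 0.7  # below threshold -> human review queue
        return result
    with ThreadPoolExecutor() as pool:
        return list(pool.map(contained, claims))

print(evaluate_all(["The Rhine is about 1,230 km long.", ""]))
{{/code}}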
== 3. Risk Tiers ==
Risk tiers classify claims by potential impact and guide audit sampling rates.
=== 3.1 Tier A (High Risk) ===
**Domains**: Medical, legal, elections, safety, security
**Characteristics**:
* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation
**Publication**: AKEL publishes automatically with prominent risk warning
**Audit rate**: Higher sampling recommended
=== 3.2 Tier B (Medium Risk) ===
**Domains**: Complex policy, science, causality claims
**Characteristics**:
* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible
**Publication**: AKEL publishes automatically with standard risk label
**Audit rate**: Moderate sampling recommended
=== 3.3 Tier C (Low Risk) ===
**Domains**: Definitions, established facts, historical data
**Characteristics**:
* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers
**Publication**: AKEL publishes by default
**Audit rate**: Lower sampling recommended
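A minimal sketch of tier assignment as a domain-to-tier lookup; the keyword sets and the "highest applicable tier wins" rule are assumptions for illustration, not the actual classifier:

{{code language="python"}}
# Hypothetical domain keyword sets; the real classifier would be more sophisticated.
TIER_A_DOMAINS = {"medical", "legal", "election", "safety", "security"}
TIER_B_DOMAINS = {"policy", "science", "causality"}

def assign_risk_tier(domains: set[str]) -> str:
    """Map a claim's detected domains to tier A, B, or C (highest applicable tier wins)."""
    if domains & TIER_A_DOMAINS:
        return "A"
    if domains & TIER_B_DOMAINS:
        return "B"
    return "C"

print(assign_risk_tier({"medical"}))   # A
print(assign_risk_tier({"science"}))   # B
print(assign_risk_tier({"history"}))   # C
{{/code}}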
== 4. Quality Gates ==
AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked - it is still published).
**Quality gates**:
* Sufficient evidence extracted (≥2 sources)
* Sources meet minimum credibility threshold
* Confidence score calculable
* No detected manipulation patterns
* Claim parseable into testable form
**Failed gates**: Claim published with flag for moderator review
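A minimal sketch of the gate check, emphasizing that failed gates flag the claim rather than block publication; the field names and the 0.5 credibility threshold are illustrative assumptions:

{{code language="python"}}
def run_quality_gates(claim: dict) -> dict:
    """Evaluate the gates listed above; failures flag the claim, they never block publication."""
    gates = {
        "enough_evidence": len(claim.get("sources", [])) >= 2,
        "credible_sources": claim.get("min_source_credibility", 0.0) >= 0.5,  # assumed threshold
        "confidence_available": claim.get("confidence") is not None,
        "no_manipulation": not claim.get("manipulation_detected", False),
        "parseable": bool(claim.get("testable_components")),
    }
    claim["published"] = True                    # publication is always automatic
    claim["flagged"] = not all(gates.values())   # failed gates only add a moderator flag
    claim["failed_gates"] = [name for name, ok in gates.items() if not ok]
    return claim

print(run_quality_gates({"sources": ["https://example.org/a"], "confidence": 0.8,
                         "testable_components": ["x"]}))
{{/code}}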
== 5. Automation Levels ==
{{include reference="Test.FactHarbor.Specification.Diagrams.Automation Level.WebHome"/}}
FactHarbor progresses through automation maturity levels:
**Release 0.5** (Proof-of-Concept): Tier C only, human review required
**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review
**Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits
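The maturity levels can be read as a per-tier publication policy. A minimal sketch, assuming an illustrative encoding of the behavior described above (the labels are not official configuration values):

{{code language="python"}}
# Assumed encoding of the maturity levels above; per-tier behavior is illustrative only.
AUTOMATION_LEVELS = {
    "0.5": {"A": "not_processed", "B": "not_processed", "C": "human_review_required"},
    "1.0": {"A": "flagged_for_review", "B": "auto_published", "C": "auto_published"},
    "2.0": {"A": "auto_published_with_label", "B": "auto_published_with_label",
            "C": "auto_published_with_label"},
}

def publication_mode(release: str, tier: str) -> str:
    """Look up how a claim of the given risk tier is handled at a given release."""
    return AUTOMATION_LEVELS[release][tier]

print(publication_mode("1.0", "A"))  # flagged_for_review
{{/code}}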
See [[Automation Roadmap>>Test.FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome]] for detailed progression.
== 5.5 Automation Roadmap ==
{{include reference="Test.FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome"/}}
== 6. Human Role ==
Humans do NOT review content for approval. Instead:
**Monitoring**: Watch aggregate performance metrics
**Improvement**: Fix algorithms when patterns show issues
**Exception handling**: Review AKEL-flagged items
**Governance**: Set policies AKEL applies
See [[Contributor Processes>>Test.FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.
== 6.5 Manual vs Automated Matrix ==
{{include reference="Test.FactHarbor.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}
== 7. Moderation ==
Moderators handle items AKEL flags:
**Abuse detection**: Spam, manipulation, harassment
**Safety issues**: Content that could cause immediate harm
**System gaming**: Attempts to manipulate scoring
**Action**: May temporarily hide content, ban users, or propose algorithm improvements
**Does NOT**: Routinely review claims or override verdicts
See [[Organisational Model>>Test.FactHarbor.Organisation.Organisational-Model]] for moderator role details.
== 8. Continuous Improvement ==
**Performance monitoring**: Track AKEL accuracy, speed, coverage
**Issue identification**: Find systematic errors from metrics
**Algorithm updates**: Deploy improvements to fix patterns
**A/B testing**: Validate changes before full rollout
**Retrospectives**: Learn from failures systematically
See [[Continuous Improvement>>Test.FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.
== 9. Scalability ==
Automation enables FactHarbor to scale:
* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms
Without automation, human review does not scale: it creates bottlenecks and introduces inconsistency.
== 10. Transparency ==
All automation is transparent:
* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
* **Confidence calculations** explained
* **Performance metrics** visible
See [[System Performance Metrics>>Test.FactHarbor.Specification.System-Performance-Metrics]] for what we measure.