Wiki source code of Automation
Last modified by Robert Schaub on 2025/12/24 21:53
= Automation =
**How FactHarbor scales through automated claim evaluation.**
== 1. Automation Philosophy ==
FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.
**Why automation:**
* **Scale**: Can process millions of claims
* **Consistency**: Same evaluation criteria applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in under 20 seconds
See [[Automation Philosophy>>FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.
== 2. Claim Processing Flow ==
=== 2.1 User Submits Claim ===
* User provides claim text + source URLs
* System validates format
* Assigns processing ID
* Queues for AKEL processing
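The submission steps above can be sketched as follows. This is an illustrative sketch only: the function and queue names (`submit_claim`, `akel_queue`) are assumptions, not FactHarbor's actual API.

```python
import uuid
from queue import Queue

# Illustrative stand-in for the real AKEL work queue.
akel_queue = Queue()

def submit_claim(text: str, source_urls: list[str]) -> str:
    """Validate a submission, assign a processing ID, and queue it for AKEL."""
    # Validate format: non-empty claim text plus at least one plausible URL
    if not text.strip():
        raise ValueError("claim text must not be empty")
    if not source_urls or not all(u.startswith(("http://", "https://")) for u in source_urls):
        raise ValueError("at least one valid source URL is required")
    processing_id = str(uuid.uuid4())  # assign processing ID
    akel_queue.put({"id": processing_id, "text": text, "sources": source_urls})
    return processing_id
```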
=== 2.2 AKEL Processing ===
**AKEL automatically:**
1. Parses claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates claim against evidence
5. Generates verdict with confidence score
6. Assigns risk tier (A/B/C)
7. Publishes result
**Processing time**: Typically under 20 seconds
**No human approval required**: publication is automatic
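The seven steps can be sketched as a single pipeline function. All helper functions here are trivial placeholders standing in for AKEL's real components; their names and signatures are assumptions for illustration only.

```python
# Placeholder stubs for AKEL's real components (illustrative only).
def parse_claim(text): return [text]                                   # 1. testable components
def extract_evidence(sources): return [{"src": s} for s in sources]    # 2. evidence from sources
def score_credibility(sources): return {s: 0.8 for s in sources}       # 3. source credibility
def evaluate(components, evidence): return "supported"                 # 4. claim vs. evidence
def confidence_of(evidence, cred):                                     # 5. confidence score
    return min(0.95, 0.5 + 0.1 * len(evidence))
def risk_tier(text): return "C"                                        # 6. risk tier (A/B/C)

def process_claim(claim: dict) -> dict:
    """Run the seven AKEL steps in order; publication is automatic (step 7)."""
    components = parse_claim(claim["text"])
    evidence = extract_evidence(claim["sources"])
    cred = score_credibility(claim["sources"])
    verdict = evaluate(components, evidence)
    confidence = confidence_of(evidence, cred)
    tier = risk_tier(claim["text"])
    return {"verdict": verdict, "confidence": confidence,
            "tier": tier, "state": "Published"}  # 7. no human approval step
```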
=== 2.3 Publication States ===
**Processing**: AKEL working on claim (not visible to public)
**Published**: AKEL completed evaluation (public)
* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated
* Users can report issues
**Flagged**: AKEL identified an issue requiring moderator attention (still public)
* Confidence below threshold
* Detected manipulation attempt
* Unusual pattern
* Moderator reviews and may take action
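The three publication states can be modeled as a small enum. This is an illustrative sketch, not FactHarbor's actual code; the key point it encodes is that flagged items remain public.

```python
from enum import Enum

class PublicationState(Enum):
    PROCESSING = "processing"  # AKEL working on claim; not visible to public
    PUBLISHED = "published"    # evaluation complete; public
    FLAGGED = "flagged"        # needs moderator attention; still public

def is_public(state: PublicationState) -> bool:
    # Only PROCESSING is hidden; FLAGGED claims stay visible while under review.
    return state is not PublicationState.PROCESSING
```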

== 2.5 LLM-Based Processing Architecture ==

FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:

=== POC: Two-Phase Approach ===

**Phase 1: Claim Extraction**
* Single LLM call to extract all claims from submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: List of claims with context

**Phase 2: Claim Analysis (Parallel)**
* Single LLM call per claim (parallelizable)
* Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
* Each claim analyzed independently

**Advantages:**
* Fast to implement (short path to a working POC)
* Only 1 + N API calls total (one extraction call, plus one per claim)
* Simple to debug (claim-level isolation)
* Proves concept viability
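The two-phase POC shape can be sketched as one extraction call followed by parallel per-claim calls. `call_llm` is a placeholder for a real LLM API, and the sentence-splitting "parser" is a stand-in; both are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    return f"response to: {prompt[:30]}"

def extract_claims(content: str) -> list[str]:
    # Phase 1: a single extraction call over the whole submission.
    call_llm(f"List the distinct verifiable claims in: {content}")
    # Stand-in parsing; a real system would parse the LLM's structured output.
    return [c.strip() for c in content.split(".") if c.strip()]

def analyze_claim(claim: str) -> dict:
    # Phase 2: one structured-analysis call per claim.
    return {"claim": claim, "analysis": call_llm(f"Analyze: {claim}")}

def run_poc_pipeline(content: str) -> list[dict]:
    claims = extract_claims(content)            # 1 call
    with ThreadPoolExecutor() as pool:          # + N calls, in parallel
        return list(pool.map(analyze_claim, claims))
```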

=== Production: Three-Phase Approach ===

**Phase 1: Claim Extraction + Validation**
* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims

**Phase 2: Evidence Gathering (Parallel)**
* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: Check evidence quality and source validity
* Error containment: Issues in one claim don't affect others

**Phase 3: Verdict Generation (Parallel)**
* For each claim:
** Generate verdict based on validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: Check verdict consistency with evidence

**Advantages:**
* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase
* Scalable (parallel processing across claims)
* Adaptable (can optimize each phase independently)
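As one concrete example of a phase boundary, the Phase-1 validation gate (clarity and uniqueness) might look like the sketch below. The minimum-length heuristic and function name are illustrative assumptions, not FactHarbor's actual rules.

```python
def validate_claims(claims: list[str], min_length: int = 10) -> list[str]:
    """Phase-1 gate: drop vague (too short) and duplicate claims."""
    seen: set[str] = set()
    valid: list[str] = []
    for claim in claims:
        key = claim.lower().strip()
        if len(key) < min_length:   # vague-claim heuristic (illustrative)
            continue
        if key in seen:             # case-insensitive duplicate removal
            continue
        seen.add(key)
        valid.append(claim)
    return valid
```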

=== LLM Task Delegation ===

All complex cognitive tasks are delegated to LLMs:
* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses
* **Source Evaluation**: Assessing reliability and authority
* **Verdict Generation**: Synthesizing evidence into conclusions
* **Risk Assessment**: Evaluating potential impact

=== Error Mitigation ===

Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:
* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing**: errors in one claim don't cascade to others
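Two of these mitigations, independent claim processing and a confidence threshold routing low-confidence results to human review, can be sketched together. The threshold value and function names are illustrative assumptions.

```python
CONFIDENCE_THRESHOLD = 0.7  # illustrative value, not FactHarbor's actual setting

def process_independently(claims, analyze):
    """Process each claim in isolation so one failure cannot cascade to others."""
    published, review_queue, failed = [], [], []
    for claim in claims:
        try:
            result = analyze(claim)
        except Exception as exc:
            # Error contained to this claim; the rest keep processing.
            failed.append((claim, str(exc)))
            continue
        if result["confidence"] >= CONFIDENCE_THRESHOLD:
            published.append(result)
        else:
            review_queue.append(result)  # low confidence -> human review queue
    return published, review_queue, failed
```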

== 3. Risk Tiers ==
Risk tiers classify claims by potential impact and guide audit sampling rates.
=== 3.1 Tier A (High Risk) ===
**Domains**: Medical, legal, elections, safety, security
**Characteristics**:
* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation
**Publication**: AKEL publishes automatically with prominent risk warning
**Audit rate**: Higher sampling recommended
=== 3.2 Tier B (Medium Risk) ===
**Domains**: Complex policy, science, causality claims
**Characteristics**:
* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible
**Publication**: AKEL publishes automatically with standard risk label
**Audit rate**: Moderate sampling recommended
=== 3.3 Tier C (Low Risk) ===
**Domains**: Definitions, established facts, historical data
**Characteristics**:
* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers
**Publication**: AKEL publishes by default
**Audit rate**: Lower sampling recommended
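The tier assignment described above can be sketched as a domain lookup. The keyword sets are illustrative; AKEL's real classification would be richer (and likely model-based), so treat every name here as an assumption.

```python
# Illustrative domain-to-tier mapping drawn from the tier descriptions above.
TIER_A_DOMAINS = {"medical", "legal", "elections", "safety", "security"}
TIER_B_DOMAINS = {"policy", "science", "causality"}

def assign_risk_tier(domains: set[str]) -> str:
    """Return the highest-risk tier matched by any of the claim's domains."""
    if domains & TIER_A_DOMAINS:
        return "A"  # high risk: prominent warning, higher audit sampling
    if domains & TIER_B_DOMAINS:
        return "B"  # medium risk: standard label, moderate sampling
    return "C"      # low risk: default publication, lower sampling
```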
== 4. Quality Gates ==
AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked: it is still published).
**Quality gates**:
* Sufficient evidence extracted (≥2 sources)
* Sources meet minimum credibility threshold
* Confidence score calculable
* No detected manipulation patterns
* Claim parseable into testable form
**Failed gates**: Claim published with flag for moderator review
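The gate checks can be sketched as a function that collects failures; any failure flags the claim but never blocks publication. Field names and the credibility threshold are illustrative assumptions.

```python
MIN_SOURCES = 2
MIN_CREDIBILITY = 0.5  # illustrative threshold

def check_quality_gates(claim: dict) -> list[str]:
    """Return the list of failed gates (empty list means all gates passed)."""
    failures = []
    if len(claim.get("evidence", [])) < MIN_SOURCES:
        failures.append("insufficient evidence")
    if any(s["credibility"] < MIN_CREDIBILITY for s in claim.get("sources", [])):
        failures.append("low-credibility source")
    if claim.get("confidence") is None:
        failures.append("confidence not calculable")
    if claim.get("manipulation_detected"):
        failures.append("manipulation pattern detected")
    if not claim.get("testable_form"):
        failures.append("not parseable into testable form")
    return failures

def publication_state(claim: dict) -> str:
    # Failed gates flag the claim for moderator review; it is still published.
    return "Flagged" if check_quality_gates(claim) else "Published"
```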
== 5. Automation Levels ==
{{include reference="FactHarbor.Specification.Diagrams.Automation Level.WebHome"/}}
FactHarbor progresses through automation maturity levels:
**Release 0.5** (Proof-of-Concept): Tier C only, human review required
**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review
**Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits
See [[Automation Roadmap>>FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome]] for detailed progression.

== 5.5 Automation Roadmap ==

{{include reference="FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome"/}}

== 6. Human Role ==
Humans do NOT review content for approval. Instead:
**Monitoring**: Watch aggregate performance metrics
**Improvement**: Fix algorithms when patterns show issues
**Exception handling**: Review AKEL-flagged items
**Governance**: Set policies AKEL applies
See [[Contributor Processes>>FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.

== 6.5 Manual vs Automated Matrix ==

{{include reference="FactHarbor.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}

== 7. Moderation ==
Moderators handle items AKEL flags:
**Abuse detection**: Spam, manipulation, harassment
**Safety issues**: Content that could cause immediate harm
**System gaming**: Attempts to manipulate scoring
**Action**: May temporarily hide content, ban users, or propose algorithm improvements
**Does NOT**: Routinely review claims or override verdicts
See [[Organisational Model>>FactHarbor.Organisation.Organisational-Model]] for moderator role details.
== 8. Continuous Improvement ==
**Performance monitoring**: Track AKEL accuracy, speed, coverage
**Issue identification**: Find systematic errors from metrics
**Algorithm updates**: Deploy improvements to fix patterns
**A/B testing**: Validate changes before full rollout
**Retrospectives**: Learn from failures systematically
See [[Continuous Improvement>>FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.
== 9. Scalability ==
Automation enables FactHarbor to scale:
* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms
Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.
== 10. Transparency ==
All automation is transparent:
* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
* **Confidence calculations** explained
* **Performance metrics** visible
See [[System Performance Metrics>>FactHarbor.Specification.System-Performance-Metrics]] for what we measure.