Wiki source code of Automation
Last modified by Robert Schaub on 2025/12/22 13:50
= Automation =

**How FactHarbor scales through automated claim evaluation.**

== 1. Automation Philosophy ==

FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.

**Why automation:**

* **Scale**: Can process millions of claims
* **Consistency**: Same evaluation criteria applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in <20 seconds

See [[Automation Philosophy>>Test.FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.

== 2. Claim Processing Flow ==

=== 2.1 User Submits Claim ===

* User provides claim text + source URLs
* System validates format
* Assigns processing ID
* Queues for AKEL processing

=== 2.2 AKEL Processing ===

**AKEL automatically:**

1. Parses claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates claim against evidence
5. Generates verdict with confidence score
6. Assigns risk tier (A/B/C)
7. Publishes result

**Processing time**: Typically <20 seconds

**No human approval required** - publication is automatic

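The seven steps above can be sketched as a single end-to-end function. This is a minimal illustration, not AKEL's actual implementation: the `Verdict` structure, the `process_claim` name, and every stubbed rule (keyword-based risk tier, flat credibility score) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Verdict:
    claim: str
    evidence: list = field(default_factory=list)
    source_scores: dict = field(default_factory=dict)
    confidence: float = 0.0
    risk_tier: str = "C"
    published: bool = False

def process_claim(claim: str, sources: list) -> Verdict:
    """Run the seven AKEL steps in order; publication is automatic."""
    v = Verdict(claim=claim)
    # 1. Parse claim into testable components (stub: split on ";").
    components = [c.strip() for c in claim.split(";") if c.strip()]
    # 2. Extract evidence from each source (stub).
    v.evidence = [f"evidence from {s}" for s in sources]
    # 3. Score source credibility (stub: flat score).
    v.source_scores = {s: 0.8 for s in sources}
    # 4.-5. Evaluate claim against evidence, derive a confidence score (stub).
    v.confidence = min(1.0, 0.3 + 0.2 * len(v.evidence)) if components else 0.0
    # 6. Assign risk tier (stub keyword rule).
    v.risk_tier = "A" if "medical" in claim.lower() else "C"
    # 7. Publish automatically -- no human approval step.
    v.published = True
    return v
```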
=== 2.3 Publication States ===

**Processing**: AKEL working on claim (not visible to public)

**Published**: AKEL completed evaluation (public)

* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated
* Users can report issues

**Flagged**: AKEL identified an issue requiring moderator attention (still public)

* Confidence below threshold
* Detected manipulation attempt
* Unusual pattern
* Moderator reviews and may take action

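The three states can be modelled as a small enum; `PublicationState` and `is_public` are hypothetical names, not FactHarbor identifiers.

```python
from enum import Enum

class PublicationState(Enum):
    PROCESSING = "processing"  # AKEL working on the claim; not visible to the public
    PUBLISHED = "published"    # evaluation complete; public
    FLAGGED = "flagged"        # issue detected; still public, queued for a moderator

def is_public(state: PublicationState) -> bool:
    # Flagged items remain public while awaiting moderator review.
    return state in (PublicationState.PUBLISHED, PublicationState.FLAGGED)
```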
== 2.5 LLM-Based Processing Architecture ==

FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:

=== POC: Two-Phase Approach ===

**Phase 1: Claim Extraction**

* Single LLM call to extract all claims from submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: List of claims with context

**Phase 2: Claim Analysis (Parallel)**

* Single LLM call per claim (parallelizable)
* Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
* Each claim analyzed independently

**Advantages:**

* Fast to implement (2-4 weeks to working POC)
* Only 1 + N API calls total (one extraction call, then one per claim)
* Simple to debug (claim-level isolation)
* Proves concept viability

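The two-phase POC shape can be sketched as follows, with a `call_llm` placeholder standing in for a real LLM API. All names and the toy prompt format are invented for illustration; the point is the call structure: one extraction call, then one parallel call per claim.

```python
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (toy behaviour for illustration)."""
    if prompt.startswith("EXTRACT:"):
        # Pretend the model found one claim per sentence.
        text = prompt[len("EXTRACT:"):]
        return "\n".join(s.strip() for s in text.split(".") if s.strip())
    return f"ANALYSIS({prompt})"

def run_poc_pipeline(content: str) -> dict:
    # Phase 1: a single call extracts all claims.
    claims = call_llm("EXTRACT:" + content).splitlines()
    # Phase 2: one call per claim, run in parallel (1 + N calls total).
    with ThreadPoolExecutor() as pool:
        analyses = list(pool.map(call_llm, claims))
    return dict(zip(claims, analyses))
```

Because each claim is analyzed in its own call, a failure in one claim's analysis can be retried or discarded without touching the others.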
=== Production: Three-Phase Approach ===

**Phase 1: Claim Extraction + Validation**

* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims

**Phase 2: Evidence Gathering (Parallel)**

* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: Check evidence quality and source validity
* Error containment: Issues in one claim don't affect others

**Phase 3: Verdict Generation (Parallel)**

* For each claim:
** Generate verdict based on validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: Check verdict consistency with evidence

**Advantages:**

* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase
* Scalable (parallel processing across claims)
* Adaptable (can optimize each phase independently)

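A sketch of the three-phase structure with a validation gate between phases. All function names and the toy extraction/validation rules are invented for illustration; what matters is that each claim flows through the phases independently and unusable bundles are filtered out before verdict generation.

```python
def extract_claims(content: str) -> list:
    # Phase 1 (stub): split into sentences, drop duplicates and "vague"
    # claims (toy rule: fewer than three words).
    seen, claims = set(), []
    for c in (s.strip() for s in content.split(".")):
        if len(c.split()) >= 3 and c not in seen:
            seen.add(c)
            claims.append(c)
    return claims

def gather_evidence(claim: str) -> dict:
    # Phase 2 (stub): each claim is handled independently.
    return {"claim": claim, "evidence": [f"source discussing: {claim}"]}

def generate_verdict(bundle: dict) -> dict:
    # Phase 3 (stub): verdict from validated evidence only.
    confident = len(bundle["evidence"]) >= 1
    return {"claim": bundle["claim"],
            "verdict": "supported" if confident else "inconclusive",
            "needs_review": not confident}

def run_production_pipeline(content: str) -> list:
    claims = extract_claims(content)
    bundles = [gather_evidence(c) for c in claims]
    # Validation gate: drop bundles with no usable evidence before Phase 3.
    bundles = [b for b in bundles if b["evidence"]]
    return [generate_verdict(b) for b in bundles]
```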
=== LLM Task Delegation ===

All complex cognitive tasks are delegated to LLMs:

* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses
* **Source Evaluation**: Assessing reliability and authority
* **Verdict Generation**: Synthesizing evidence into conclusions
* **Risk Assessment**: Evaluating potential impact

=== Error Mitigation ===

Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:

* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing** - errors in one claim don't cascade to others

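The compound-error risk can be made concrete with a small calculation. The per-step reliability figure below is purely illustrative, not a measured FactHarbor number: a chain of k sequential steps that each succeed independently with probability p succeeds with probability p^k, which is why long sequential chains are avoided and per-claim chains are kept independent.

```python
def chain_success(p: float, k: int) -> float:
    """Probability that all k sequential steps succeed (independence assumed)."""
    return p ** k

# With 95%-reliable steps, reliability decays quickly along a chain:
three_step = chain_success(0.95, 3)   # ~0.857
ten_step = chain_success(0.95, 10)    # ~0.599
# Parallel, independent per-claim chains keep each claim at the short-chain
# figure instead of compounding errors across claims.
```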
== 3. Risk Tiers ==

Risk tiers classify claims by potential impact and guide audit sampling rates.

=== 3.1 Tier A (High Risk) ===

**Domains**: Medical, legal, elections, safety, security

**Characteristics**:

* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation

**Publication**: AKEL publishes automatically with prominent risk warning

**Audit rate**: Higher sampling recommended

=== 3.2 Tier B (Medium Risk) ===

**Domains**: Complex policy, science, causality claims

**Characteristics**:

* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible

**Publication**: AKEL publishes automatically with standard risk label

**Audit rate**: Moderate sampling recommended

=== 3.3 Tier C (Low Risk) ===

**Domains**: Definitions, established facts, historical data

**Characteristics**:

* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers

**Publication**: AKEL publishes by default

**Audit rate**: Lower sampling recommended

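One way to represent the tier policy is a small lookup table. The numeric audit rates below are placeholders (the text only specifies higher/moderate/lower sampling), and `AUDIT_POLICY` and `should_audit` are hypothetical names.

```python
# Hypothetical audit-sampling policy; the rates are illustrative placeholders.
AUDIT_POLICY = {
    "A": {"domains": ["medical", "legal", "elections", "safety", "security"],
          "label": "prominent risk warning", "audit_rate": 0.30},
    "B": {"domains": ["complex policy", "science", "causality"],
          "label": "standard risk label", "audit_rate": 0.10},
    "C": {"domains": ["definitions", "established facts", "historical data"],
          "label": "default", "audit_rate": 0.02},
}

def should_audit(tier: str, random_draw: float) -> bool:
    """Sample a published claim for audit at its tier's rate (draw in [0, 1))."""
    return random_draw < AUDIT_POLICY[tier]["audit_rate"]
```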
== 4. Quality Gates ==

AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked - it is still published).

**Quality gates**:

* Sufficient evidence extracted (≥2 sources)
* Sources meet minimum credibility threshold
* Confidence score calculable
* No detected manipulation patterns
* Claim parseable into testable form

**Failed gates**: Claim published with flag for moderator review

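The gate check can be sketched as a pure function that flags but never blocks. The gate list follows the bullets above; the threshold defaults and all names are illustrative assumptions.

```python
def check_quality_gates(evidence, source_scores, confidence,
                        manipulation_detected, parseable,
                        min_sources=2, min_credibility=0.5):
    """Return publication status; any failed gate flags (never blocks) the claim."""
    failed = []
    if len(evidence) < min_sources:
        failed.append("insufficient evidence")
    if any(score < min_credibility for score in source_scores):
        failed.append("source below credibility threshold")
    if confidence is None:
        failed.append("confidence not calculable")
    if manipulation_detected:
        failed.append("manipulation pattern detected")
    if not parseable:
        failed.append("claim not parseable into testable form")
    # Publication is always True: failed gates flag for moderator review only.
    return {"published": True, "flagged": bool(failed), "failed_gates": failed}
```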
== 5. Automation Levels ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Level.WebHome"/}}

FactHarbor progresses through automation maturity levels:

**Release 0.5** (Proof-of-Concept): Tier C only, human review required

**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review

**Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits

See [[Automation Roadmap>>Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome]] for detailed progression.

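The release progression can be captured as a per-tier publication policy table; a sketch with hypothetical names and labels derived from the three release descriptions above.

```python
# Per-release publication policy for each risk tier (sketch; names hypothetical).
RELEASE_POLICY = {
    "0.5": {"A": "not_processed", "B": "not_processed", "C": "human_review"},
    "1.0": {"A": "flagged_for_review", "B": "auto_publish", "C": "auto_publish"},
    "2.0": {"A": "auto_publish_with_label", "B": "auto_publish_with_label",
            "C": "auto_publish_with_label"},
}

def publication_action(release: str, tier: str) -> str:
    """Look up how a claim of the given risk tier is handled at a release level."""
    return RELEASE_POLICY[release][tier]
```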
== 5.5 Automation Roadmap ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Roadmap.WebHome"/}}

== 6. Human Role ==

Humans do NOT review content for approval. Instead:

**Monitoring**: Watch aggregate performance metrics

**Improvement**: Fix algorithms when patterns show issues

**Exception handling**: Review AKEL-flagged items

**Governance**: Set policies AKEL applies

See [[Contributor Processes>>Test.FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.

== 6.5 Manual vs Automated Matrix ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}

== 7. Moderation ==

Moderators handle items AKEL flags:

**Abuse detection**: Spam, manipulation, harassment

**Safety issues**: Content that could cause immediate harm

**System gaming**: Attempts to manipulate scoring

**Action**: May temporarily hide content, ban users, or propose algorithm improvements

**Does NOT**: Routinely review claims or override verdicts

See [[Organisational Model>>Test.FactHarbor.Organisation.Organisational-Model]] for moderator role details.

== 8. Continuous Improvement ==

**Performance monitoring**: Track AKEL accuracy, speed, coverage

**Issue identification**: Find systematic errors from metrics

**Algorithm updates**: Deploy improvements to fix patterns

**A/B testing**: Validate changes before full rollout

**Retrospectives**: Learn from failures systematically

See [[Continuous Improvement>>Test.FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.

== 9. Scalability ==

Automation enables FactHarbor to scale:

* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms

Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.

== 10. Transparency ==

All automation is transparent:

* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
* **Confidence calculations** explained
* **Performance metrics** visible

See [[System Performance Metrics>>Test.FactHarbor.Specification.System-Performance-Metrics]] for what we measure.