Changes for page Automation
Last modified by Robert Schaub on 2025/12/22 13:50
From version 1.2, edited by Robert Schaub on 2025/12/22 13:49
Change comment: Update document after refactoring.
= Automation =

**How FactHarbor scales through automated claim evaluation.**

== 1. Automation Philosophy ==

FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.

**Why automation:**

* **Scale**: Can process millions of claims
* **Consistency**: Same evaluation criteria applied uniformly
* **Transparency**: Algorithms are auditable
* **Speed**: Results typically in <20 seconds

See [[Automation Philosophy>>Test.FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.

== 2. Claim Processing Flow ==

=== 2.1 User Submits Claim ===

* User provides claim text + source URLs
* System validates format
* Assigns processing ID
* Queues for AKEL processing

=== 2.2 AKEL Processing ===

**AKEL automatically:**

1. Parses claim into testable components
2. Extracts evidence from sources
3. Scores source credibility

...

7. Publishes result

**Processing time**: Typically <20 seconds
**No human approval required** - publication is automatic

=== 2.3 Publication States ===

**Processing**: AKEL working on claim (not visible to public)
**Published**: AKEL completed evaluation (public)

* Verdict displayed with confidence score
* Evidence and sources shown
* Risk tier indicated

...

=== POC: Two-Phase Approach ===

**Phase 1: Claim Extraction**

* Single LLM call to extract all claims from submitted content
* Light structure, focused on identifying distinct verifiable claims
* Output: List of claims with context

**Phase 2: Claim Analysis (Parallel)**

* Single LLM call per claim (parallelizable)
* Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
* Each claim analyzed independently

**Advantages:**

* Fast to implement (2-4 weeks to a working POC)
* Only 1 + N API calls in total (one extraction call plus one per claim)
* Simple to debug (claim-level isolation)
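The sketch below illustrates the two-phase flow. It is a minimal illustration, not the AKEL implementation: the ##call_llm## helper, the prompts, and the JSON shapes are assumptions made for the example.

{{code language="python"}}
import json
from concurrent.futures import ThreadPoolExecutor


def call_llm(prompt: str) -> dict:
    """Hypothetical helper: send a prompt to an LLM and return parsed JSON."""
    raise NotImplementedError("wire up an actual LLM provider here")


def extract_claims(content: str) -> list:
    """Phase 1: a single LLM call that lists distinct verifiable claims."""
    prompt = "Extract all distinct verifiable claims as JSON:\n" + content
    return call_llm(prompt)["claims"]


def analyze_claim(claim: dict) -> dict:
    """Phase 2: one LLM call per claim, with full structured output."""
    prompt = ("Analyze this claim. Return JSON with evidence, scenarios, "
              "sources, verdict, and risk tier:\n" + json.dumps(claim))
    return call_llm(prompt)


def process_submission(content: str) -> list:
    claims = extract_claims(content)    # 1 extraction call
    with ThreadPoolExecutor() as pool:  # N per-claim calls, in parallel
        return list(pool.map(analyze_claim, claims))
{{/code}}

Claim-level isolation falls out of the structure: each ##analyze_claim## call is independent, so a failure in one claim does not affect the others.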
...

=== Production: Three-Phase Approach ===

**Phase 1: Claim Extraction + Validation**

* Extract distinct verifiable claims
* Validate claim clarity and uniqueness
* Remove duplicates and vague claims

**Phase 2: Evidence Gathering (Parallel)**

* For each claim independently:
** Find supporting and contradicting evidence
** Identify authoritative sources
** Generate test scenarios
* Validation: Check evidence quality and source validity
* Error containment: Issues in one claim don't affect others

**Phase 3: Verdict Generation (Parallel)**

* For each claim:
** Generate verdict based on validated evidence
** Assess confidence and risk level
** Flag low-confidence results for human review
* Validation: Check verdict consistency with evidence

**Advantages:**

* Error containment between phases
* Clear quality gates and validation
* Observable metrics per phase

...

=== LLM Task Delegation ===

All complex cognitive tasks are delegated to LLMs:

* **Claim Extraction**: Understanding context, identifying distinct claims
* **Evidence Finding**: Analyzing sources, assessing relevance
* **Scenario Generation**: Creating testable hypotheses

...

=== Error Mitigation ===

Research shows that sequential LLM calls face compounding error risk. FactHarbor mitigates this through:

* **Validation gates** between phases
* **Confidence thresholds** for quality control
* **Parallel processing** to avoid error propagation across claims
* **Human review queue** for low-confidence verdicts
* **Independent claim processing** - errors in one claim don't cascade to others

== 3. Risk Tiers ==

Risk tiers classify claims by potential impact and guide audit sampling rates.

=== 3.1 Tier A (High Risk) ===

**Domains**: Medical, legal, elections, safety, security

**Characteristics**:

* High potential for harm if incorrect
* Complex specialized knowledge required
* Often subject to regulation

**Publication**: AKEL publishes automatically with prominent risk warning
**Audit rate**: Higher sampling recommended

=== 3.2 Tier B (Medium Risk) ===

**Domains**: Complex policy, science, causality claims

**Characteristics**:

* Moderate potential impact
* Requires careful evidence evaluation
* Multiple valid interpretations possible

**Publication**: AKEL publishes automatically with standard risk label
**Audit rate**: Moderate sampling recommended

=== 3.3 Tier C (Low Risk) ===

**Domains**: Definitions, established facts, historical data

**Characteristics**:

* Low potential for harm
* Well-documented information
* Typically clear right/wrong answers

**Publication**: AKEL publishes by default
**Audit rate**: Lower sampling recommended
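As a rough illustration of how risk tiers could drive audit sampling, consider the sketch below. The tier-to-rate mapping is an assumption made for the example; the section above specifies only higher, moderate, and lower sampling.

{{code language="python"}}
import random
from enum import Enum


class RiskTier(Enum):
    A = "high"    # medical, legal, elections, safety, security
    B = "medium"  # complex policy, science, causality claims
    C = "low"     # definitions, established facts, historical data


# Placeholder rates: the page says only "higher/moderate/lower" sampling.
AUDIT_RATE = {RiskTier.A: 0.30, RiskTier.B: 0.10, RiskTier.C: 0.02}


def should_audit(tier: RiskTier) -> bool:
    """Randomly sample published claims for human audit, by tier."""
    return random.random() < AUDIT_RATE[tier]
{{/code}}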
== 4. Quality Gates ==

AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked - it is still published).

**Quality gates**:

* Sufficient evidence extracted (≥2 sources)
* Sources meet minimum credibility threshold
* Confidence score calculable
* No detected manipulation patterns
* Claim parseable into testable form

**Failed gates**: Claim published with a flag for moderator review

== 5. Automation Levels ==

{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Level.WebHome"/}}

FactHarbor progresses through automation maturity levels:

**Release 0.5** (Proof-of-Concept): Tier C only, human review required
**Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review

...

{{include reference="Test.FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome"/}}

== 6. Human Role ==

Humans do NOT review content for approval. Instead:

**Monitoring**: Watch aggregate performance metrics
**Improvement**: Fix algorithms when patterns show issues

...

{{include reference="Test.FactHarbor.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}

== 7. Moderation ==

Moderators handle items AKEL flags:

**Abuse detection**: Spam, manipulation, harassment
**Safety issues**: Content that could cause immediate harm

...

**Action**: May temporarily hide content, ban users, or propose algorithm improvements
**Does NOT**: Routinely review claims or override verdicts

See [[Organisational Model>>Test.FactHarbor.Organisation.Organisational-Model]] for moderator role details.

== 8. Continuous Improvement ==

**Performance monitoring**: Track AKEL accuracy, speed, coverage
**Issue identification**: Find systematic errors from metrics
**Algorithm updates**: Deploy improvements to fix patterns
**A/B testing**: Validate changes before full rollout
**Retrospectives**: Learn from failures systematically

See [[Continuous Improvement>>Test.FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for the improvement cycle.

== 9. Scalability ==

Automation enables FactHarbor to scale:

* **Millions of claims** processable
* **Consistent quality** at any volume
* **Cost efficiency** through automation
* **Rapid iteration** on algorithms

Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.

== 10. Transparency ==

All automation is transparent:

* **Algorithm parameters** documented
* **Evaluation criteria** public
* **Source scoring rules** explicit
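To make the quality gates of section 4 concrete, here is one way they could be expressed in code. It is a sketch under assumptions: the ##Claim## fields and the credibility threshold are invented for the example; per section 4, failed gates flag the claim rather than block publication.

{{code language="python"}}
from dataclasses import dataclass, field

# Assumed value: the page says only "minimum credibility threshold".
MIN_SOURCE_CREDIBILITY = 0.5


@dataclass
class Claim:
    text: str
    sources: list = field(default_factory=list)  # (url, credibility) pairs
    confidence: float = None                     # None if not calculable
    manipulation_detected: bool = False
    parseable: bool = False


def failed_gates(claim: Claim) -> list:
    """Return the names of the quality gates this claim fails."""
    gates = {
        "sufficient_evidence": len(claim.sources) >= 2,
        "credible_sources": all(c >= MIN_SOURCE_CREDIBILITY
                                for _, c in claim.sources),
        "confidence_calculable": claim.confidence is not None,
        "no_manipulation": not claim.manipulation_detected,
        "parseable": claim.parseable,
    }
    return [name for name, passed in gates.items() if not passed]


def publish(claim: Claim) -> dict:
    # Failures flag the claim for moderator review; publication proceeds.
    return {"published": True, "flags": failed_gates(claim)}
{{/code}}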