Changes for page Automation

Last modified by Robert Schaub on 2025/12/22 13:50

From version 1.1
edited by Robert Schaub
on 2025/12/22 13:26
Change comment: Imported from XAR
To version 1.3
edited by Robert Schaub
on 2025/12/22 13:49
Change comment: Renamed back-links.

Summary

Details

Page properties
Parent
... ... @@ -1,1 +1,1 @@
1 -Test.FactHarbor.Specification.WebHome
1 +Test.FactHarbor pre10 V0\.9\.70.Specification.WebHome
Content
... ... @@ -1,21 +1,31 @@
1 1  = Automation =
2 +
2 2  **How FactHarbor scales through automated claim evaluation.**
4 +
3 3  == 1. Automation Philosophy ==
6 +
4 4  FactHarbor is **automation-first**: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.
5 5  **Why automation:**
9 +
6 6  * **Scale**: Can process millions of claims
7 7  * **Consistency**: Same evaluation criteria applied uniformly
8 8  * **Transparency**: Algorithms are auditable
9 9  * **Speed**: Results typically in <20 seconds
10 10  See [[Automation Philosophy>>Test.FactHarbor.Organisation.Automation-Philosophy]] for detailed principles.
15 +
11 11  == 2. Claim Processing Flow ==
17 +
12 12  === 2.1 User Submits Claim ===
19 +
13 13  * User provides claim text + source URLs
14 14  * System validates format
15 15  * Assigns processing ID
16 16  * Queues for AKEL processing
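A minimal sketch of this intake step (illustrative only; ##Submission##, ##submit_claim##, and the in-memory queue are hypothetical names, not the actual API):

{{code language="python"}}
import uuid
from dataclasses import dataclass, field
from queue import Queue

@dataclass
class Submission:
    claim_text: str
    source_urls: list[str]
    processing_id: str = field(default_factory=lambda: uuid.uuid4().hex)

akel_queue: Queue = Queue()  # stand-in for the real AKEL work queue

def submit_claim(claim_text: str, source_urls: list[str]) -> str:
    """Validate format, assign a processing ID, and queue for AKEL processing."""
    if not claim_text.strip():
        raise ValueError("claim text must not be empty")
    if not source_urls:
        raise ValueError("at least one source URL is required")
    submission = Submission(claim_text, source_urls)
    akel_queue.put(submission)
    return submission.processing_id
{{/code}}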
24 +
17 17  === 2.2 AKEL Processing ===
26 +
18 18  **AKEL automatically:**
28 +
19 19  1. Parses claim into testable components
20 20  2. Extracts evidence from sources
21 21  3. Scores source credibility
... ... @@ -25,9 +25,12 @@
25 25  7. Publishes result
26 26  **Processing time**: Typically <20 seconds
27 27  **No human approval required**: publication is automatic
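A minimal sketch of this pipeline shape (illustrative only): the helper functions are hypothetical stand-ins for the real AKEL components, and steps 4-6, which are not shown in this diff hunk, are left as a placeholder comment.

{{code language="python"}}
# Hypothetical stubs standing in for the real AKEL steps.
def parse_claim(text: str) -> list[str]:
    return [text]  # placeholder: treat the whole claim as one testable component

def extract_evidence(urls: list[str]) -> list[str]:
    return [f"evidence from {u}" for u in urls]

def score_sources(urls: list[str]) -> dict[str, float]:
    return {u: 0.5 for u in urls}  # placeholder credibility score

def publish(result: dict) -> None:
    print("published:", result)  # publication is automatic, no approval step

def process_claim(claim_text: str, source_urls: list[str]) -> dict:
    """Run the AKEL steps in order and publish the result automatically."""
    components = parse_claim(claim_text)        # step 1: testable components
    evidence = extract_evidence(source_urls)    # step 2: evidence from sources
    credibility = score_sources(source_urls)    # step 3: source credibility
    # ... steps 4-6 are elided in this diff hunk ...
    result = {"components": components, "evidence": evidence,
              "credibility": credibility}
    publish(result)                             # step 7
    return result
{{/code}}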
38 +
28 28  === 2.3 Publication States ===
40 +
29 29  **Processing**: AKEL working on claim (not visible to public)
30 30  **Published**: AKEL completed evaluation (public)
43 +
31 31  * Verdict displayed with confidence score
32 32  * Evidence and sources shown
33 33  * Risk tier indicated
... ... @@ -45,16 +45,19 @@
45 45  === POC: Two-Phase Approach ===
46 46  
47 47  **Phase 1: Claim Extraction**
61 +
48 48  * Single LLM call to extract all claims from submitted content
49 49  * Light structure, focused on identifying distinct verifiable claims
50 50  * Output: List of claims with context
51 51  
52 52  **Phase 2: Claim Analysis (Parallel)**
67 +
53 53  * Single LLM call per claim (parallelizable)
54 54  * Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
55 55  * Each claim analyzed independently
56 56  
57 57  **Advantages:**
73 +
58 58  * Fast to implement (2-4 weeks to working POC)
59 59  * Few LLM API calls: 1 + N total (one extraction call plus one per claim)
60 60  * Simple to debug (claim-level isolation)
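A minimal sketch of the two-phase shape under these assumptions: ##llm_extract_claims## and ##llm_analyze_claim## are hypothetical stand-ins for the two LLM calls, and the sentence-splitting extraction is a placeholder, not the real prompt logic.

{{code language="python"}}
from concurrent.futures import ThreadPoolExecutor

def llm_extract_claims(content: str) -> list[str]:
    """Phase 1 (hypothetical): one LLM call extracting distinct verifiable claims."""
    return [c.strip() for c in content.split(".") if c.strip()]

def llm_analyze_claim(claim: str) -> dict:
    """Phase 2 (hypothetical): one LLM call per claim, full structured output."""
    return {"claim": claim, "evidence": [], "scenarios": [],
            "sources": [], "verdict": "unverified", "risk": "C"}

def run_poc_pipeline(content: str) -> list[dict]:
    claims = llm_extract_claims(content)     # 1 extraction call
    with ThreadPoolExecutor() as pool:       # + N per-claim calls, in parallel
        return list(pool.map(llm_analyze_claim, claims))
{{/code}}

Because each claim is analyzed in its own call, a failure or bad output for one claim never touches the others, which is what makes claim-level debugging simple.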
... ... @@ -63,26 +63,30 @@
63 63  === Production: Three-Phase Approach ===
64 64  
65 65  **Phase 1: Claim Extraction + Validation**
82 +
66 66  * Extract distinct verifiable claims
67 67  * Validate claim clarity and uniqueness
68 68  * Remove duplicates and vague claims
69 69  
70 70  **Phase 2: Evidence Gathering (Parallel)**
88 +
71 71  * For each claim independently:
72 - * Find supporting and contradicting evidence
73 - * Identify authoritative sources
74 - * Generate test scenarios
90 +* Find supporting and contradicting evidence
91 +* Identify authoritative sources
92 +* Generate test scenarios
75 75  * Validation: Check evidence quality and source validity
76 76  * Error containment: Issues in one claim don't affect others
77 77  
78 78  **Phase 3: Verdict Generation (Parallel)**
97 +
79 79  * For each claim:
80 - * Generate verdict based on validated evidence
81 - * Assess confidence and risk level
82 - * Flag low-confidence results for human review
99 +* Generate verdict based on validated evidence
100 +* Assess confidence and risk level
101 +* Flag low-confidence results for human review
83 83  * Validation: Check verdict consistency with evidence
84 84  
85 85  **Advantages:**
105 +
86 86  * Error containment between phases
87 87  * Clear quality gates and validation
88 88  * Observable metrics per phase
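A minimal sketch of the per-claim portion of this flow (hypothetical names throughout; the gate checks are placeholders for the real validation logic):

{{code language="python"}}
# Hypothetical stand-ins for the Phase 2/3 LLM calls.
def gather_evidence(claim: str) -> list[str]:
    return [f"evidence for: {claim}"]

def generate_verdict(claim: str, evidence: list[str]) -> dict:
    return {"claim": claim, "verdict": "supported", "confidence": 0.8}

def check(condition: bool, gate: str) -> None:
    """Validation gate between phases: raise if the gate fails."""
    if not condition:
        raise ValueError(f"failed gate: {gate}")

def process_one_claim(claim: str) -> dict:
    """Phases 2-3 for one claim; an exception here stays contained to this claim."""
    try:
        evidence = gather_evidence(claim)              # Phase 2
        check(len(evidence) >= 1, "evidence quality")  # gate after Phase 2
        verdict = generate_verdict(claim, evidence)    # Phase 3
        check("verdict" in verdict, "verdict consistency")  # gate after Phase 3
        return verdict
    except ValueError as exc:
        return {"claim": claim, "verdict": "needs-review", "detail": str(exc)}
{{/code}}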
... ... @@ -92,6 +92,7 @@
92 92  === LLM Task Delegation ===
93 93  
94 94  All complex cognitive tasks are delegated to LLMs:
115 +
95 95  * **Claim Extraction**: Understanding context, identifying distinct claims
96 96  * **Evidence Finding**: Analyzing sources, assessing relevance
97 97  * **Scenario Generation**: Creating testable hypotheses
... ... @@ -102,6 +102,7 @@
102 102  === Error Mitigation ===
103 103  
104 104  Research shows sequential LLM calls face compound error risks. FactHarbor mitigates this through:
126 +
105 105  * **Validation gates** between phases
106 106  * **Confidence thresholds** for quality control
107 107  * **Parallel processing** to avoid error propagation across claims
... ... @@ -108,36 +108,48 @@
108 108  * **Human review queue** for low-confidence verdicts
109 109  * **Independent claim processing**: errors in one claim don't cascade to others
110 110  
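A minimal sketch of the confidence-threshold and review-queue mitigations; the 0.6 threshold and the in-memory queue are assumptions, not values from this specification.

{{code language="python"}}
REVIEW_THRESHOLD = 0.6  # assumed value; the actual threshold is not specified here

human_review_queue: list[dict] = []

def route_verdict(verdict: dict) -> dict:
    """Publish every verdict, but queue low-confidence ones for human review."""
    if verdict.get("confidence", 0.0) < REVIEW_THRESHOLD:
        verdict["flagged"] = True
        human_review_queue.append(verdict)  # reviewed after, not before, publication
    return verdict
{{/code}}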
111 -
112 112  == 3. Risk Tiers ==
134 +
113 113  Risk tiers classify claims by potential impact and guide audit sampling rates.
136 +
114 114  === 3.1 Tier A (High Risk) ===
138 +
115 115  **Domains**: Medical, legal, elections, safety, security
116 116  **Characteristics**:
141 +
117 117  * High potential for harm if incorrect
118 118  * Complex specialized knowledge required
119 119  * Often subject to regulation
120 120  **Publication**: AKEL publishes automatically with prominent risk warning
121 121  **Audit rate**: Higher sampling recommended
147 +
122 122  === 3.2 Tier B (Medium Risk) ===
149 +
123 123  **Domains**: Complex policy, science, causality claims
124 124  **Characteristics**:
152 +
125 125  * Moderate potential impact
126 126  * Requires careful evidence evaluation
127 127  * Multiple valid interpretations possible
128 128  **Publication**: AKEL publishes automatically with standard risk label
129 129  **Audit rate**: Moderate sampling recommended
158 +
130 130  === 3.3 Tier C (Low Risk) ===
160 +
131 131  **Domains**: Definitions, established facts, historical data
132 132  **Characteristics**:
163 +
133 133  * Low potential for harm
134 134  * Well-documented information
135 135  * Typically clear right/wrong answers
136 136  **Publication**: AKEL publishes by default
137 137  **Audit rate**: Lower sampling recommended
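A minimal sketch of tier-driven audit sampling; the specification only says higher/moderate/lower, so the numeric rates below are illustrative assumptions.

{{code language="python"}}
import random

# Illustrative sampling rates per tier (A = high risk, C = low risk).
AUDIT_SAMPLE_RATE = {"A": 0.20, "B": 0.05, "C": 0.01}

def select_for_audit(tier: str) -> bool:
    """Randomly sample published claims for audit at the tier's rate."""
    return random.random() < AUDIT_SAMPLE_RATE[tier]
{{/code}}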
169 +
138 138  == 4. Quality Gates ==
171 +
139 139  AKEL applies quality gates before publication. If any gate fails, the claim is **flagged** (not blocked: it is still published).
140 140  **Quality gates**:
174 +
141 141  * Sufficient evidence extracted (≥2 sources)
142 142  * Sources meet minimum credibility threshold
143 143  * Confidence score calculable
... ... @@ -144,8 +144,10 @@
144 144  * No detected manipulation patterns
145 145  * Claim parseable into testable form
146 146  **Failed gates**: Claim published with flag for moderator review
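A minimal sketch of a subset of these gates as flag-producing checks; the 0.3 credibility threshold and the claim structure are assumptions, and failed gates flag rather than block publication.

{{code language="python"}}
def run_quality_gates(claim: dict) -> list[str]:
    """Return the names of failed gates; the claim is published either way."""
    failures = []
    sources = claim.get("sources", [])
    if len(sources) < 2:
        failures.append("sufficient evidence (>=2 sources)")
    credibilities = [s.get("credibility", 0.0) for s in sources]
    if min(credibilities, default=0.0) < 0.3:  # 0.3 is an assumed threshold
        failures.append("source credibility threshold")
    if claim.get("confidence") is None:
        failures.append("confidence score calculable")
    return failures  # non-empty list => publish with a moderator-review flag
{{/code}}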
181 +
147 147  == 5. Automation Levels ==
148 -{{include reference="Test.FactHarbor.Specification.Diagrams.Automation Level.WebHome"/}}
183 +
184 +{{include reference="Test.FactHarbor pre10 V0\.9\.70.Specification.Diagrams.Automation Level.WebHome"/}}
149 149  FactHarbor progresses through automation maturity levels:
150 150  **Release 0.5** (Proof-of-Concept): Tier C only, human review required
151 151  **Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review
... ... @@ -157,6 +157,7 @@
157 157  {{include reference="Test.FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome"/}}
158 158  
159 159  == 6. Human Role ==
196 +
160 160  Humans do NOT review content for approval. Instead:
161 161  **Monitoring**: Watch aggregate performance metrics
162 162  **Improvement**: Fix algorithms when patterns show issues
... ... @@ -169,6 +169,7 @@
169 169  {{include reference="Test.FactHarbor.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}
170 170  
171 171  == 7. Moderation ==
209 +
172 172  Moderators handle items AKEL flags:
173 173  **Abuse detection**: Spam, manipulation, harassment
174 174  **Safety issues**: Content that could cause immediate harm
... ... @@ -176,7 +176,9 @@
176 176  **Action**: May temporarily hide content, ban users, or propose algorithm improvements
177 177  **Does NOT**: Routinely review claims or override verdicts
178 178  See [[Organisational Model>>Test.FactHarbor.Organisation.Organisational-Model]] for moderator role details.
217 +
179 179  == 8. Continuous Improvement ==
219 +
180 180  **Performance monitoring**: Track AKEL accuracy, speed, coverage
181 181  **Issue identification**: Find systematic errors from metrics
182 182  **Algorithm updates**: Deploy improvements to fix patterns
... ... @@ -183,15 +183,21 @@
183 183  **A/B testing**: Validate changes before full rollout
184 184  **Retrospectives**: Learn from failures systematically
185 185  See [[Continuous Improvement>>Test.FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] for improvement cycle.
226 +
186 186  == 9. Scalability ==
228 +
187 187  Automation enables FactHarbor to scale:
230 +
188 188  * **Millions of claims** processable
189 189  * **Consistent quality** at any volume
190 190  * **Cost efficiency** through automation
191 191  * **Rapid iteration** on algorithms
192 192  Without automation, human review doesn't scale: it creates bottlenecks and introduces inconsistency.
236 +
193 193  == 10. Transparency ==
238 +
194 194  All automation is transparent:
240 +
195 195  * **Algorithm parameters** documented
196 196  * **Evaluation criteria** public
197 197  * **Source scoring rules** explicit