Last modified by Robert Schaub on 2025/12/24 20:30

= AKEL — AI Knowledge Extraction Layer =

AKEL is FactHarbor's automated intelligence subsystem.
Its purpose is to reduce human workload, enhance consistency, and enable scalable knowledge processing — **without ever replacing human judgment**.

AKEL outputs are marked with **AuthorType = AI** and published according to risk-based review policies (see Publication Modes below).

AKEL operates in two modes:

* **Single-node mode** (POC & Beta 0)
* **Federated multi-node mode** (Release 1.0+)

Human reviewers, experts, and moderators always retain final authority over content marked as "Human-Reviewed."


== 1. Purpose and Role ==

AKEL transforms unstructured inputs into structured, publication-ready content.

Core responsibilities:

* Claim extraction from arbitrary text
* Claim classification (domain, type, evaluability, safety, **risk tier**)
* Scenario generation (definitions, boundaries, assumptions, methodology)
* Evidence summarization and metadata extraction
* **Contradiction detection and counter-evidence search**
* **Reservation and limitation identification**
* **Bubble detection** (echo chambers, conspiracy theories, isolated sources)
* Re-evaluation proposal generation
* Cross-node embedding exchange (Release 1.0+)


== 2. Components ==

* **AKEL Orchestrator** – central coordinator
* **Claim Extractor**
* **Claim Classifier** (with risk tier assignment)
* **Scenario Generator**
* **Evidence Summarizer**
* **Contradiction Detector** (enhanced with counter-evidence search)
* **Quality Gate Validator**
* **Audit Sampling Scheduler**
* **Embedding Handler** (Release 1.0+)
* **Federation Sync Adapter** (Release 1.0+)
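The components above can be read as an ordered pipeline run by the orchestrator over a shared processing context. A minimal sketch follows; the dict-based context, the function names, and the placeholder stage logic are illustrative assumptions, not the actual AKEL API:

```python
def extract_claims(ctx):
    # Claim Extractor: a naive sentence split stands in for the real model.
    ctx["claims"] = [s.strip() for s in ctx["text"].split(".") if s.strip()]
    return ctx

def classify_claims(ctx):
    # Claim Classifier: placeholder rule; the real component also assigns
    # domain, type, evaluability, and safety labels.
    ctx["risk_tier"] = "C" if len(ctx["claims"]) <= 2 else "B"
    return ctx

# Orchestrator: runs each component in order over the shared context.
PIPELINE = [extract_claims, classify_claims]

def orchestrate(text):
    ctx = {"text": text}
    for stage in PIPELINE:
        ctx = stage(ctx)
    return ctx
```

Later components (Scenario Generator, Contradiction Detector, Quality Gate Validator, and so on) would slot into the same chain.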


== 3. Inputs and Outputs ==

=== 3.1 Inputs ===
* User-submitted claims or evidence
* Uploaded documents
* URLs or citations
* External LLM API (optional)
* Embeddings (from local or federated peers)

=== 3.2 Outputs (publication mode varies by risk tier) ===
* ClaimVersion (draft or AI-generated)
* ScenarioVersion (draft or AI-generated)
* EvidenceVersion (summary + metadata, draft or AI-generated)
* VerdictVersion (draft, AI-generated, or human-reviewed)
* Contradiction alerts
* Reservation and limitation notices
* Re-evaluation proposals
* Updated embeddings


== 4. Publication Modes ==

AKEL content is published according to three modes:

=== 4.1 Mode 1: Draft-Only (Never Public) ===

**Used for:**
* Failed quality gate checks
* Sensitive topics flagged for expert review
* Unclear scope or missing critical sources
* High reputational risk content

**Visibility:** Internal review queue only

=== 4.2 Mode 2: Published as AI-Generated (No Prior Human Review) ===

**Requirements:**
* All automated quality gates passed (see below)
* Risk tier permits AI-draft publication (Tier B or C)
* Contradiction search completed successfully
* Clear labeling as "AI-Generated, Awaiting Human Review"

**Label shown to users:**
```
[AI-Generated] This content was produced by AI and has not yet been human-reviewed.
Source: AI | Review Status: Pending | Risk Tier: [B/C]
Contradiction Search: Completed | Last Updated: [timestamp]
```

**User actions:**
* Browse and read content
* Request human review (escalates to review queue)
* Flag for expert attention

=== 4.3 Mode 3: Published as Human-Reviewed ===

**Requirements:**
* Human reviewer or domain expert validated the content
* All quality gates passed
* Visible "Human-Reviewed" mark with reviewer role and timestamp

**Label shown to users:**
```
[Human-Reviewed] This content has been validated by human reviewers.
Source: AI+Human | Review Status: Approved | Reviewed by: [Role] on [timestamp]
Risk Tier: [A/B/C] | Contradiction Search: Completed
```
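Read together, the three modes reduce to a small decision rule. The sketch below assumes the Mode 2 requirement that only Tiers B and C may publish without prior review; Tier A AI-drafts, where policy allows them, would need the separate flagged path described under Risk Tiers:

```python
def publication_mode(gates_passed: bool, risk_tier: str,
                     human_reviewed: bool) -> str:
    """Return the publication mode for a piece of AKEL content."""
    if human_reviewed and gates_passed:
        return "Mode 3: Human-Reviewed"
    if gates_passed and risk_tier in ("B", "C"):
        return "Mode 2: AI-Generated"
    # Failed gates, unreviewed Tier A content, etc. stay internal.
    return "Mode 1: Draft-Only"
```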


== 5. Risk Tiers ==

AKEL assigns a risk tier to all content to determine the appropriate review requirements:

=== 5.1 Tier A — High Risk / High Impact ===

**Domains:** Medical, legal, elections, safety/security, major reputational harm

**Publication policy:**
* Human review REQUIRED before "Human-Reviewed" status
* AI-generated content MAY be published, but:
** Clearly flagged as AI-draft with prominent disclaimer
** May have limited visibility
** Auto-escalated to expert review queue
** User warnings displayed

**Audit rate (recommended):** 30-50% of published AI-drafts sampled in the first 6 months

=== 5.2 Tier B — Medium Risk ===

**Domains:** Contested public policy, complex science, causality claims, significant financial impact

**Publication policy:**
* AI-drafts CAN be published immediately with clear labeling
* Sampling audits conducted (see Audit System below)
* High-engagement items auto-escalated to expert review
* Users can request human review

**Audit rate (recommended):** 10-20% of published AI-drafts sampled

=== 5.3 Tier C — Low Risk ===

**Domains:** Definitions, simple factual lookups with strong primary sources, historical facts, established scientific consensus

**Publication policy:**
* AI-draft is the default publication mode
* Sampling audits sufficient
* Community flagging available
* Human review on request

**Audit rate (recommended):** 5-10% of published AI-drafts sampled
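The three tiers can be summarized as a policy table. The flag names and the midpoint audit rates below are illustrative renderings of the recommendations on this page, not normative values:

```python
# Tier -> policy knobs; audit_rate is the midpoint of the recommended range.
TIER_POLICY = {
    "A": {"ai_draft_default": False, "auto_escalate": True,  "audit_rate": 0.40},
    "B": {"ai_draft_default": True,  "auto_escalate": False, "audit_rate": 0.15},
    "C": {"ai_draft_default": True,  "auto_escalate": False, "audit_rate": 0.075},
}

def expert_queue_required(tier: str) -> bool:
    # Tier A AI-drafts are auto-escalated to the expert review queue.
    return TIER_POLICY[tier]["auto_escalate"]
```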


== 6. Quality Gates (Mandatory Before AI-Draft Publication) ==

All AI-generated content must pass these automated checks before Mode 2 publication:

=== 6.1 Gate 1: Source Quality ===
* Primary sources identified and accessible
* Source reliability scored against a whitelist
* Citation completeness verified
* Publication dates checked
* Author credentials validated (where applicable)

=== 6.2 Gate 2: Contradiction Search (MANDATORY) ===

**The system MUST actively search for:**

* **Counter-evidence** – Rebuttals, conflicting results, contradictory studies
* **Reservations** – Caveats, limitations, boundary conditions, applicability constraints
* **Alternative interpretations** – Different framings, definitions, contextual variations
* **Bubble indicators** – Conspiracy theories, echo chambers, ideologically isolated sources

**Search coverage requirements:**
* Academic literature (BOTH supporting AND opposing views)
* Reputable media across diverse political/ideological perspectives
* Official contradictions (retractions, corrections, updates, amendments)
* Domain-specific skeptics, critics, and alternative expert opinions
* Cross-cultural and international perspectives

**The search must actively avoid algorithmic bubbles:**
* Deliberately seek opposing viewpoints
* Check for echo chamber patterns in source clusters
* Identify tribal or ideological source clustering
* Flag when the search space appears artificially constrained
* Verify diversity of perspectives represented

**Outcomes:**
* **Strong counter-evidence found** → Auto-escalate to Tier B or draft-only mode
* **Significant uncertainty detected** → Require uncertainty disclosure in verdict
* **Bubble indicators present** → Flag for expert review and human validation
* **Limited perspective diversity** → Expand search or flag for human review
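The outcome rules above amount to mapping Gate 2 findings to escalation actions. A sketch, in which the flag and action names are assumptions:

```python
def contradiction_outcomes(findings: set) -> list:
    """Map Gate 2 findings to the escalation actions listed above."""
    rules = [
        ("strong_counter_evidence", "auto_escalate_tier_or_draft_only"),
        ("significant_uncertainty", "require_uncertainty_disclosure"),
        ("bubble_indicators", "flag_for_expert_review"),
        ("limited_perspective_diversity", "expand_search_or_human_review"),
    ]
    actions = [action for flag, action in rules if flag in findings]
    return actions or ["pass"]
```

Multiple findings can trigger multiple actions; only a clean search passes the gate outright.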

=== 6.3 Gate 3: Uncertainty Quantification ===
* Confidence scores calculated for all claims and verdicts
* Limitations explicitly stated
* Data gaps identified and disclosed
* Strength of evidence assessed
* Alternative scenarios considered

=== 6.4 Gate 4: Structural Integrity ===
* No hallucinations detected (fact-checked against sources)
* Logic chain valid and traceable
* References accessible and verifiable
* No circular reasoning
* Premises clearly stated

**If any gate fails:**
* Content remains in draft-only mode
* Failure reason logged
* Human review required before publication
* Failure patterns analyzed for system improvement
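The failure handling above suggests a simple gate runner: evaluate every gate, and if any fails, keep the content in draft-only mode with the reasons recorded for pattern analysis. A sketch in which the gate interface, a callable returning `(ok, reason)`, is an assumption:

```python
def run_quality_gates(content, gates):
    """gates: list of (name, check) where check(content) -> (ok, reason)."""
    failures = []
    for name, check in gates:
        ok, reason = check(content)
        if not ok:
            failures.append((name, reason))  # logged for later analysis
    if failures:
        return {"mode": "draft-only", "failures": failures}
    return {"mode": "eligible-for-ai-draft", "failures": []}
```

All gates run even after a failure so that every failure reason is captured, not just the first.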


== 7. Audit System (Sampling-Based Quality Assurance) ==

Instead of reviewing ALL AI output, FactHarbor implements stratified sampling audits:

=== 7.1 Sampling Strategy ===

Audits prioritize:
* **Risk tier** (higher tiers get more frequent audits)
* **AI confidence score** (low confidence → higher sampling rate)
* **Traffic and engagement** (high-visibility content audited more)
* **Novelty** (new claim types, new domains, emerging topics)
* **Disagreement signals** (user flags, contradiction alerts, community reports)
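One way to combine the five prioritization signals is an additive sampling score on top of the tier's base rate. The weights below are purely illustrative assumptions:

```python
def audit_probability(risk_tier: str, ai_confidence: float,
                      engagement: float, is_novel: bool, flags: int) -> float:
    """Probability that a published AI-draft is selected for audit.

    ai_confidence and engagement are assumed normalized to [0, 1];
    flags counts user flags, contradiction alerts, and community reports.
    """
    base = {"A": 0.40, "B": 0.15, "C": 0.075}[risk_tier]  # midpoint rates
    p = base
    p += 0.20 * (1.0 - ai_confidence)   # low confidence -> sample more
    p += 0.10 * engagement              # high-visibility content
    p += 0.10 if is_novel else 0.0      # new claim types / domains
    p += 0.05 * min(flags, 4)           # disagreement signals, capped
    return min(p, 1.0)
```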

=== 7.2 Audit Process ===

1. System selects content for audit based on the sampling strategy
2. Human auditor reviews AI-generated content against quality standards
3. Auditor validates or corrects:
* Claim extraction accuracy
* Scenario appropriateness
* Evidence relevance and interpretation
* Verdict reasoning
* Contradiction search completeness
4. Audit outcome recorded (pass/fail + detailed feedback)
5. Failed audits trigger immediate content review
6. Audit results feed back into system improvement

=== 7.3 Feedback Loop (Continuous Improvement) ===

Audit outcomes systematically improve:
* **Query templates** – Refined based on missed evidence patterns
* **Retrieval source weights** – Adjusted for accuracy and reliability
* **Contradiction detection heuristics** – Enhanced to catch missed counter-evidence
* **Model prompts and extraction rules** – Tuned for better claim extraction
* **Risk tier assignments** – Recalibrated based on error patterns
* **Bubble detection algorithms** – Improved to identify echo chambers

=== 7.4 Audit Transparency ===

* Audit statistics published regularly
* Accuracy rates by risk tier tracked and reported
* System improvements documented
* Community can view aggregate audit performance


== 8. Architecture Overview ==

{{include reference="FactHarbor.Specification.Diagrams.AKEL Architecture.WebHome"/}}


== 9. AKEL and Federation ==

In Release 1.0+, AKEL participates in cross-node knowledge alignment:

* Shares embeddings
* Exchanges canonicalized claim forms
* Exchanges scenario templates
* Sends and receives contradiction alerts
* Shares audit findings (with privacy controls)
* Never shares model weights
* Never overrides local governance

Nodes may choose trust levels for AKEL-related data:

* Trusted nodes: auto-merge embeddings + templates
* Neutral nodes: require reviewer approval
* Untrusted nodes: fully manual import
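The trust levels above can be expressed as an import policy. A sketch in which the payload kind names and the conservative default are assumptions; the rule that model weights are never shared comes from this page:

```python
def import_action(trust_level: str, payload_kind: str) -> str:
    """Decide how a node handles incoming federated AKEL data."""
    if payload_kind == "model_weights":
        return "reject"  # model weights are never shared, at any trust level
    if trust_level == "trusted" and payload_kind in ("embeddings", "scenario_templates"):
        return "auto_merge"
    if trust_level == "untrusted":
        return "manual_import"
    # Neutral nodes, and any payload not covered above: reviewer approval.
    return "reviewer_approval"
```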


== 10. Human Review Workflow (Mode 3 Publication) ==

For content requiring human validation before "Human-Reviewed" status:

1. AKEL generates content and publishes it as an AI-draft (Mode 2) or keeps it as a draft (Mode 1)
2. Reviewers inspect the content in the review queue
3. Reviewers validate that the quality gates were correctly applied
4. Experts validate high-risk (Tier A) or domain-specific outputs
5. Moderators finalize "Human-Reviewed" publication
6. Version numbers increment; full history is preserved

**Note:** Most AI-generated content (Tiers B and C) can remain in Mode 2 (AI-Generated) indefinitely. Human review is optional for these tiers unless users or audits flag issues.


== 11. POC v1 Behavior ==

The POC explicitly demonstrates publication of AI-generated content:

* Produces public AI-generated output (Mode 2)
* No human data sources required
* No human approval gate
* Clear "AI-Generated - POC/Demo" labeling
* All quality gates active (including contradiction search)
* Users understand this demonstrates AI reasoning capabilities
* Risk tier classification shown (for demo purposes)


== 12. Related Pages ==

* [[Automation>>FactHarbor.Specification.Automation.WebHome]]
* [[Requirements (Roles)>>FactHarbor.Specification.Requirements.WebHome]]
* [[Workflows>>FactHarbor.Specification.Workflows.WebHome]]
* [[Governance>>FactHarbor.Organisation.Governance]]