Last modified by Robert Schaub on 2026/02/08 21:26

= AKEL – AI Knowledge Extraction Layer =

**Version:** 0.9.70
**Last Updated:** December 21, 2025
**Status:** CORRECTED - Automation Philosophy Consistent

AKEL is FactHarbor's automated intelligence subsystem.
Its purpose is to reduce human workload, enhance consistency, and enable scalable knowledge processing.

AKEL outputs are marked with **AuthorType = AI** and published according to risk-based policies (see Publication Modes below).

AKEL operates in two modes:

* **Single-node mode** (POC & Beta 0)
* **Federated multi-node mode** (Release 1.0+)

== 1. Core Philosophy: Automation First ==

**V0.9.50+ Philosophy Shift:**

FactHarbor uses an **"Improve the system, not the data"** approach:

* ✅ **Automated Publication:** AI-generated content publishes immediately after passing quality gates
* ✅ **Quality Gates:** Automated checks (not human approval)
* ✅ **Sampling Audits:** Humans analyze patterns for system improvement (not individual approval)
* ❌ **NO approval workflows:** No review queues, no moderator gatekeeping for content quality
* ❌ **NO manual fixes:** If output is wrong, improve the algorithm/prompts

**Why This Matters:**

Traditional approach: a human reviews every output → bottleneck, inconsistent
FactHarbor approach: automated quality gates + pattern-based improvement → scalable, consistent


== 2. Publication Modes ==

**V0.9.70 CLARIFICATION:** FactHarbor uses **TWO publication modes** (not three):

=== Mode 1: Draft-Only ===

**Status:** Not visible to the public

**When Used:**

* Quality gates failed
* Confidence below threshold
* Structural integrity issues
* Insufficient evidence

**What Happens:**

* Content remains private
* System logs failure reasons
* Prompts/algorithms improved based on patterns
* Content may be re-processed after improvements

**NOT "pending human approval"** - the content is blocked because it does not meet automated quality standards.

=== Mode 2: AI-Generated (Public) ===

**Status:** Published and visible to all users

**When Used:**

* Quality gates passed
* Confidence ≥ threshold
* Meets structural requirements
* Sufficient evidence found

**Includes:**

* Confidence score displayed (0-100%)
* Risk tier badge (A/B/C)
* Quality indicators
* Clear "AI-Generated" labeling
* Sampling audit status

**Labels by Risk Tier:**

* **Tier A (High Risk):** "⚠️ AI-Generated - High Impact Topic - Seek Professional Advice"
* **Tier B (Medium Risk):** "🤖 AI-Generated - May Contain Errors"
* **Tier C (Low Risk):** "🤖 AI-Generated"
=== REMOVED: "Mode 3: Human-Reviewed" ===

**V0.9.50 Decision:** No centralized approval workflow.

**Rationale:**

* Defeats the purpose of automation
* Creates a bottleneck
* Produces inconsistent quality
* Does not scale

**What Replaced It:**

* Better quality gates
* Sampling audits for system improvement
* Transparent confidence scoring
* Risk-based warnings
== 3. Risk Tiers (A/B/C) ==

Risk classification determines WARNING LABELS and AUDIT FREQUENCY, NOT approval requirements.

=== Tier A: High-Stakes Claims ===

**Examples:** Medical advice, legal interpretations, financial recommendations, safety information

**Impact:**

* ✅ Publish immediately (if gates pass)
* ✅ Prominent warning labels
* ✅ Higher sampling audit frequency (50% audited)
* ✅ Explicit disclaimers ("Seek professional advice")
* ❌ NOT held for moderator approval

**Philosophy:** Publish with strong warnings, monitor closely


=== Tier B: Moderate-Stakes Claims ===

**Examples:** Political claims, controversial topics, scientific debates

**Impact:**

* ✅ Publish immediately (if gates pass)
* ✅ Standard warning labels
* ✅ Medium sampling audit frequency (20% audited)
* ❌ NOT held for moderator approval

=== Tier C: Low-Stakes Claims ===

**Examples:** Entertainment facts, sports statistics, general knowledge

**Impact:**

* ✅ Publish immediately (if gates pass)
* ✅ Minimal warning labels
* ✅ Low sampling audit frequency (5% audited)
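
The three tiers reduce to a small policy table. The sketch below is a hypothetical representation (the dict structure and key names are assumptions); the audit rates and warning strengths are the ones stated above.

```python
# Illustrative tier policy table. Tiers affect warning labels and audit
# frequency only; there is no "requires_approval" field by design.
TIER_POLICY = {
    "A": {"audit_rate": 0.50, "warning": "prominent", "disclaimer": True},
    "B": {"audit_rate": 0.20, "warning": "standard",  "disclaimer": False},
    "C": {"audit_rate": 0.05, "warning": "minimal",   "disclaimer": False},
}

def audit_rate(tier: str) -> float:
    """Fraction of published items in this tier selected for sampling audits."""
    return TIER_POLICY[tier]["audit_rate"]
```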

== 4. Quality Gates (Automated, Not Human) ==

All AI-generated content must pass these **AUTOMATED checks** before publication:

=== Gate 1: Source Quality ===

**Automated Checks:**

* Primary sources identified and accessible
* Source reliability scored against database
* Citation completeness verified
* Publication dates checked
* Author credentials validated (where applicable)

**If Failed:** Block publication, log pattern, improve source detection


=== Gate 2: Contradiction Search (MANDATORY) ===

**The system MUST actively search for:**

* **Counter-evidence** – Rebuttals, conflicting results, contradictory studies
* **Reservations** – Caveats, limitations, boundary conditions
* **Alternative interpretations** – Different framings, definitions
* **Bubble detection** – Echo chambers, ideologically isolated sources

**Search Coverage Requirements:**

* Academic literature (BOTH supporting AND opposing views)
* Diverse media across political/ideological perspectives
* Official contradictions (retractions, corrections, amendments)
* Cross-cultural and international perspectives

**Search Must Avoid Algorithmic Bubbles:**

* Deliberately seek opposing viewpoints
* Check for echo chamber patterns
* Identify tribal source clustering
* Flag artificially constrained search space
* Verify diversity of perspectives

**Outcomes:**

* Strong counter-evidence → Auto-escalate to Tier B or draft-only
* Significant uncertainty → Require uncertainty disclosure in verdict
* Bubble indicators → Flag for sampling audit
* Limited perspective diversity → Expand search or flag

**If Failed:** Block publication, improve search algorithms
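
The Gate 2 outcome rules can be read as a mapping from findings to actions. The sketch below is one possible interpretation (in particular, "escalate to Tier B or draft-only" is read here as: Tier C escalates to Tier B, while Tier A/B content drops to draft-only); all names are hypothetical.

```python
# Hypothetical mapping of Gate 2 findings to the escalations listed above.
def apply_contradiction_outcomes(findings: dict, tier: str) -> dict:
    actions = {
        "tier": tier,
        "draft_only": False,
        "disclose_uncertainty": False,
        "flag_for_audit": False,
        "expand_search": False,
    }
    if findings.get("strong_counter_evidence"):
        if tier == "C":
            actions["tier"] = "B"        # escalate to a higher-warning tier
        else:
            actions["draft_only"] = True  # block until the system improves
    if findings.get("significant_uncertainty"):
        actions["disclose_uncertainty"] = True  # verdict must state uncertainty
    if findings.get("bubble_indicators"):
        actions["flag_for_audit"] = True        # flag for sampling audit
    if findings.get("limited_diversity"):
        actions["expand_search"] = True         # expand search or flag
    return actions
```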


=== Gate 3: Uncertainty Quantification ===

**Automated Checks:**

* Confidence scores calculated for all claims and verdicts
* Limitations explicitly stated
* Data gaps identified and disclosed
* Strength of evidence assessed
* Alternative scenarios considered

**If Failed:** Block publication, improve confidence scoring


=== Gate 4: Structural Integrity ===

**Automated Checks:**

* No hallucinations detected (fact-checking against sources)
* Logic chain valid and traceable
* References accessible and verifiable
* No circular reasoning
* Premises clearly stated

**If Failed:** Block publication, improve hallucination detection


**CRITICAL:** If any gate fails:

* ✅ Content remains in draft-only mode
* ✅ Failure reason logged
* ✅ Failure patterns analyzed for system improvement
* ❌ **NOT "sent for human review"**
* ❌ **NOT "manually overridden"**

**Philosophy:** Fix the system that generated the bad output, don't manually fix individual outputs.
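
The control flow of the four gates is simple enough to sketch. The gate implementations below are placeholders; only the behavior is taken from the spec: any failure blocks publication and logs the reason for pattern analysis, never routing content to a review queue.

```python
# Minimal sketch of the gate-running loop. Gate names and the logging
# structure are hypothetical; the fail-means-block behavior is from the spec.
from typing import Callable

failure_log: list[tuple[str, str]] = []  # (content_id, failed gate name)

def run_quality_gates(content_id: str,
                      gates: dict[str, Callable[[str], bool]]) -> bool:
    """Return True only if every automated gate passes."""
    for name, check in gates.items():
        if not check(content_id):
            failure_log.append((content_id, name))  # logged for pattern analysis
            return False  # content stays in draft-only mode (Mode 1)
    return True
```

A caller might pass `{"source_quality": …, "contradiction_search": …, "uncertainty": …, "structural_integrity": …}` as the `gates` dict, one entry per gate above.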


== 5. Sampling Audit System ==

**Purpose:** Improve the system through pattern analysis (NOT approve individual outputs)

=== 5.1 How Sampling Works ===

**Stratified Sampling Strategy:**

Audits prioritize:

* **Risk tier** (Tier A: 50%, Tier B: 20%, Tier C: 5%)
* **AI confidence score** (low confidence → higher sampling rate)
* **Traffic and engagement** (high-visibility content audited more)
* **Novelty** (new claim types, new domains, emerging topics)
* **Disagreement signals** (user flags, contradiction alerts, community reports)

**NOT:** A review queue for approval
**IS:** Statistical sampling for quality monitoring
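
A stratified sampling decision along these lines could look like the sketch below. The per-tier base rates are the ones stated above; the low-confidence and disagreement-signal boosts are illustrative assumptions showing how the listed signals could raise the rate.

```python
import random

# Illustrative stratified audit sampling. Base rates per tier are from the
# strategy above; the boost values are assumptions, not from the spec.
BASE_RATE = {"A": 0.50, "B": 0.20, "C": 0.05}

def should_audit(tier: str, confidence: float, flagged: bool,
                 rng=None) -> bool:
    """Decide whether a published item is selected for a sampling audit."""
    rng = rng if rng is not None else random.Random()
    rate = BASE_RATE[tier]
    if confidence < 0.7:           # low AI confidence → higher sampling rate
        rate = min(1.0, rate * 2)
    if flagged:                    # user flags / contradiction alerts
        rate = min(1.0, rate + 0.25)
    return rng.random() < rate
```

Selection is probabilistic per item; nothing waits in a queue for approval.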


=== 5.2 Audit Process ===

1. **System selects** content for audit based on the sampling strategy
2. **Human auditor** reviews AI-generated content against quality standards
3. **Auditor validates or identifies issues:**

* Claim extraction accuracy
* Scenario appropriateness
* Evidence relevance and interpretation
* Verdict reasoning
* Contradiction search completeness

4. **Audit outcome recorded** (pass/fail + detailed feedback)
5. **Failed audits trigger:**
* Analysis of the failure pattern
* System improvement tasks
* Algorithm/prompt adjustments
6. **Audit results feed back** into system improvement

**CRITICAL:** Auditors analyze PATTERNS; they do not fix individual outputs.


=== 5.3 Feedback Loop (Continuous Improvement) ===

Audit outcomes systematically improve:

* **Query templates** – Refined based on missed evidence patterns
* **Retrieval source weights** – Adjusted for accuracy and reliability
* **Contradiction detection heuristics** – Enhanced to catch missed counter-evidence
* **Model prompts and extraction rules** – Tuned for better claim extraction
* **Risk tier assignments** – Recalibrated based on error patterns
* **Bubble detection algorithms** – Improved to identify echo chambers

**Philosophy:** "Improve the system, not the data"


=== 5.4 Audit Transparency ===

**Publicly Published:**

* Audit statistics (monthly)
* Accuracy rates by risk tier
* System improvements made
* Aggregate audit performance

**Enables:**

* Public accountability
* System trust
* Continuous improvement visibility

== 6. Human Intervention Criteria ==

**From Organisation.Decision-Processes:**

**LEGITIMATE reasons to intervene:**

* ✅ AKEL explicitly flags an item for sampling audit
* ✅ System metrics show performance degradation
* ✅ A legal/safety issue requires immediate action
* ✅ User reports reveal a systematic bias pattern

**ILLEGITIMATE reasons** (system improvement needed instead):

* ❌ "I disagree with this verdict" → Improve the algorithm
* ❌ "This source should rank higher" → Improve the scoring rules
* ❌ "Manual quality gate before publication" → Defeats the purpose of automation
* ❌ "I know better than the algorithm" → Then improve the algorithm

**Philosophy:** If you disagree with an output, improve the system that generated it.


== 7. Architecture Overview ==

=== POC Architecture (POC1, POC2) ===

**Simple, Single-Call Approach:**

```
User submits article/claim

Single AI API call

Returns complete analysis

Quality gates validate

PASS → Publish (Mode 2)
FAIL → Block (Mode 1)
```

**Components in the Single Call:**

1. Extract 3-5 factual claims
2. For each claim: verdict + confidence + risk tier + reasoning
3. Generate analysis summary
4. Generate article summary
5. Run basic quality checks

**Processing Time:** 10-18 seconds

**Advantages:** Simple, fast POC development, proves AI capability
**Limitations:** No component reusability, all-or-nothing processing
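
The single-call flow above can be sketched as follows. `call_ai_api` stands in for whatever model API the POC uses, and the prompt and response shape are assumptions, not the POC's actual interface.

```python
# Hypothetical sketch of the POC single-call pipeline: one API call returns
# the complete analysis, then automated gates decide Mode 1 vs Mode 2.
def analyze_article_poc(text: str, call_ai_api, quality_gates) -> dict:
    analysis = call_ai_api(
        "Extract 3-5 factual claims; for each give verdict, confidence, "
        "risk tier, and reasoning; then summarize the analysis and article.",
        text,
    )
    if quality_gates(analysis):
        return {"mode": 2, "published": True, "analysis": analysis}
    # Gate failure: draft-only, all-or-nothing (no per-component retry in POC).
    return {"mode": 1, "published": False, "analysis": analysis}
```

The all-or-nothing limitation noted above is visible here: a single gate failure blocks the entire analysis, since there are no separable components to re-run.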


=== Full System Architecture (Beta 0, Release 1.0) ===

**Multi-Component Pipeline:**

```
AKEL Orchestrator
├── Claim Extractor
├── Claim Classifier (with risk tier assignment)
├── Scenario Generator
├── Evidence Summarizer
├── Contradiction Detector
├── Quality Gate Validator
├── Audit Sampling Scheduler
└── Federation Sync Adapter (Release 1.0+)
```

**Processing:**

* Parallel processing where possible
* Separate component calls
* Quality gates between phases
* Audit sampling selection
* Cross-node coordination (federated mode)

**Processing Time:** 10-30 seconds (full pipeline)


=== Evolution Path ===

**POC1:** Single prompt → Prove the concept
**POC2:** Add scenario component → Test the full pipeline
**Beta 0:** Multi-component AKEL → Production architecture
**Release 1.0:** Full AKEL + Federation → Scale


== 8. AKEL and Federation ==

In Release 1.0+, AKEL participates in cross-node knowledge alignment:

* Shares embeddings
* Exchanges canonicalized claim forms
* Exchanges scenario templates
* Sends and receives contradiction alerts
* Shares audit findings (with privacy controls)
* Never shares model weights
* Never overrides local governance

Nodes may choose trust levels for AKEL-related data:

* Trusted nodes: auto-merge embeddings + templates
* Neutral nodes: require additional verification
* Untrusted nodes: fully manual import
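
The three trust levels above amount to a per-node import policy. The sketch below is illustrative only; the level names are from the spec, while the policy strings and function name are assumptions.

```python
# Illustrative per-node trust policy for importing AKEL-related data.
def import_policy(trust: str) -> str:
    return {
        "trusted": "auto-merge",         # embeddings + templates merged directly
        "neutral": "verify-then-merge",  # additional verification required
        "untrusted": "manual",           # fully manual import
    }[trust]
```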

== 9. POC Behavior ==

The POC explicitly demonstrates AI-generated content publication:

* ✅ Produces public AI-generated output (Mode 2)
* ✅ No human data sources required
* ✅ No human approval gate
* ✅ Clear "AI-Generated - POC/Demo" labeling
* ✅ All quality gates active (including contradiction search)
* ✅ Users understand this demonstrates AI reasoning capabilities
* ✅ Risk tier classification shown (for demo purposes)

**Philosophy Validation:** The POC proves the automation-first approach works.


== 10. Related Pages ==

* [[Automation>>Archive.FactHarbor 2026\.01\.20.Specification.Automation.WebHome]]
* [[Requirements (Roles)>>Archive.FactHarbor 2026\.01\.20.Specification.Requirements.WebHome]]
* [[Workflows>>Archive.FactHarbor 2026\.01\.20.Specification.Workflows.WebHome]]
* [[Governance>>Archive.FactHarbor.Organisation.Governance.WebHome]]
* [[Decision Processes>>FactHarbor.Organisation.Decision-Processes.WebHome]]

**V0.9.70 CHANGES:**

* ❌ REMOVED: Section "Human Review Workflow (Mode 3 Publication)"
* ❌ REMOVED: All references to "Mode 3"
* ❌ REMOVED: "Human review required before publication"
* ✅ CLARIFIED: Two modes only (AI-Generated / Draft-Only)
* ✅ CLARIFIED: Quality gate failures → Block + improve system
* ✅ CLARIFIED: Sampling audits for improvement, NOT approval
* ✅ CLARIFIED: Risk tiers affect warnings/audits, NOT approval gates
* ✅ ENHANCED: Gate 2 (Contradiction Search) specification
* ✅ ADDED: Clear human intervention criteria
* ✅ ADDED: Detailed audit system explanation