Last modified by Robert Schaub on 2026/02/08 21:23

= AKEL — AI Knowledge Extraction Layer =

AKEL is FactHarbor's automated intelligence subsystem.
Its purpose is to reduce human workload, enhance consistency, and enable scalable knowledge processing — **without ever replacing human judgment**.
AKEL outputs are marked with **AuthorType = AI** and published according to risk-based review policies (see Publication Modes below).
AKEL operates in two modes:

* **Single-node mode** (POC & Beta 0)
* **Federated multi-node mode** (Release 1.0+)

== 1. Purpose and Role ==

AKEL transforms unstructured inputs into structured, publication-ready content.
Core responsibilities:

* Claim extraction from arbitrary text
* Claim classification (domain, type, evaluability, safety, **risk tier**)
* Scenario generation (definitions, boundaries, assumptions, methodology)
* Evidence summarization and metadata extraction
* **Contradiction detection and counter-evidence search**
* **Reservation and limitation identification**
* **Bubble detection** (echo chambers, conspiracy theories, isolated sources)
* Re-evaluation proposal generation
* Cross-node embedding exchange (Release 1.0+)

== 2. Components ==

* **AKEL Orchestrator** – central coordinator
* **Claim Extractor**
* **Claim Classifier** (with risk tier assignment)
* **Scenario Generator**
* **Evidence Summarizer**
* **Contradiction Detector** (enhanced with counter-evidence search)
* **Quality Gate Validator**
* **Audit Sampling Scheduler**
* **Embedding Handler** (Release 1.0+)
* **Federation Sync Adapter** (Release 1.0+)

== 3. Inputs and Outputs ==

=== 3.1 Inputs ===

* User-submitted claims or evidence
* Uploaded documents
* URLs or citations
* External LLM API (optional)
* Embeddings (from local or federated peers)

=== 3.2 Outputs (publication mode varies by risk tier) ===

* ClaimVersion (draft or AI-generated)
* ScenarioVersion (draft or AI-generated)
* EvidenceVersion (summary + metadata, draft or AI-generated)
* VerdictVersion (draft, AI-generated, or human-reviewed)
* Contradiction alerts
* Reservation and limitation notices
* Re-evaluation proposals
* Updated embeddings

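The output types above share common publication metadata (author type, risk tier, review status, version). A minimal sketch of such a record in Python; all class and field names here are illustrative assumptions, not part of the FactHarbor schema:

```python
from dataclasses import dataclass
from enum import Enum

class AuthorType(Enum):
    AI = "AI"
    HUMAN = "HUMAN"

class ReviewStatus(Enum):
    DRAFT_ONLY = "draft-only"          # Mode 1: never public
    AI_GENERATED = "ai-generated"      # Mode 2: public, pending review
    HUMAN_REVIEWED = "human-reviewed"  # Mode 3: validated by humans

@dataclass
class AkelOutput:
    """One AKEL artifact (claim, scenario, evidence, or verdict version)."""
    kind: str                 # e.g. "ClaimVersion", "VerdictVersion"
    body: str
    risk_tier: str            # "A", "B", or "C"
    author_type: AuthorType = AuthorType.AI
    review_status: ReviewStatus = ReviewStatus.DRAFT_ONLY
    version: int = 1

# Every AKEL output starts life as an AI-authored draft.
draft = AkelOutput(kind="ClaimVersion", body="...", risk_tier="B")
```
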
== 4. Publication Modes ==

AKEL content is published according to three modes. Modes 1 and 2 are described below; Mode 3 (human-reviewed publication) is covered in the Human Review Workflow section.

=== 4.1 Mode 1: Draft-Only (Never Public) ===

**Used for:**

* Failed quality gate checks
* Sensitive topics flagged for expert review
* Unclear scope or missing critical sources
* High reputational risk content
**Visibility:** Internal review queue only

=== 4.2 Mode 2: Published as AI-Generated (No Prior Human Review) ===

**Requirements:**

* All automated quality gates passed (see below)
* Risk tier permits AI-draft publication (Tier B or C)
* Contradiction search completed successfully
* Clear labeling as "AI-Generated" (produced by AKEL)
**Label shown to users:**
```
[AI-Generated] This content was produced by AI and has not yet been human-reviewed.
Source: AI | Review Status: Pending | Risk Tier: [B/C]
Contradiction Search: Completed | Last Updated: [timestamp]
```
**User actions:**
* Browse and read content
* Request human review (escalates to review queue)
* Flag for expert attention

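The label above can be rendered mechanically from an item's metadata. A hypothetical formatter sketch (the function name and the tier check are assumptions; the label text follows the template above):

```python
from datetime import datetime, timezone

def format_ai_label(risk_tier: str, last_updated: datetime) -> str:
    """Render the user-facing Mode 2 label for an AI-generated item."""
    if risk_tier not in ("B", "C"):
        # Tier A may not be published as a plain AI draft (see Risk Tiers).
        raise ValueError("Mode 2 publication requires risk tier B or C")
    return (
        "[AI-Generated] This content was produced by AI and has not yet been human-reviewed.\n"
        f"Source: AI | Review Status: Pending | Risk Tier: {risk_tier}\n"
        f"Contradiction Search: Completed | Last Updated: {last_updated.isoformat()}"
    )

label = format_ai_label("B", datetime(2026, 2, 8, tzinfo=timezone.utc))
```
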
== 5. Risk Tiers ==

AKEL assigns risk tiers to all content to determine appropriate review requirements:

=== 5.1 Tier A — High Risk / High Impact ===

**Domains:** Medical, legal, elections, safety/security, major reputational harm
**Publication policy:**

* Human review REQUIRED before "AKEL-Generated" status
* AI-generated content MAY be published, but:
** Clearly flagged as AI-draft with prominent disclaimer
** May have limited visibility
** Auto-escalated to expert review queue
** User warnings displayed
**Audit rate (recommended):** 30-50% of published AI-drafts sampled in the first 6 months

=== 5.2 Tier B — Medium Risk ===

**Domains:** Contested public policy, complex science, causality claims, significant financial impact
**Publication policy:**

* AI-drafts CAN be published immediately with clear labeling
* Sampling audits conducted (see Audit System below)
* High-engagement items auto-escalated to expert review
* Users can report issues for moderator review
**Audit rate (recommended):** 10-20% of published AI-drafts sampled

=== 5.3 Tier C — Low Risk ===

**Domains:** Definitions, simple factual lookups with strong primary sources, historical facts, established scientific consensus
**Publication policy:**

* AI-draft is the default publication mode
* Sampling audits sufficient
* Community flagging available
* Human review on request
**Audit rate (recommended):** 5-10% of published AI-drafts sampled

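The tier policies above reduce to a small lookup table. A sketch under assumed names; the audit rates are the recommended ranges from this section:

```python
# Recommended publication policy per risk tier, as described above.
TIER_POLICY = {
    "A": {"human_review_required": True,  "auto_escalate": True,
          "audit_rate": (0.30, 0.50)},
    "B": {"human_review_required": False, "auto_escalate": False,
          "audit_rate": (0.10, 0.20)},
    "C": {"human_review_required": False, "auto_escalate": False,
          "audit_rate": (0.05, 0.10)},
}

def may_publish_as_ai_draft(tier: str) -> bool:
    """Mode 2 publication (no prior human review) requires Tier B or C."""
    return not TIER_POLICY[tier]["human_review_required"]
```
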
== 6. Quality Gates (Mandatory Before AI-Draft Publication) ==

All AI-generated content must pass these automated checks before Mode 2 publication:

=== 6.1 Gate 1: Source Quality ===

* Primary sources identified and accessible
* Source reliability scored against whitelist
* Citation completeness verified
* Publication dates checked
* Author credentials validated (where applicable)

=== 6.2 Gate 2: Contradiction Search (MANDATORY) ===

**The system MUST actively search for:**

* **Counter-evidence** – Rebuttals, conflicting results, contradictory studies
* **Reservations** – Caveats, limitations, boundary conditions, applicability constraints
* **Alternative interpretations** – Different framings, definitions, contextual variations
* **Bubble detection** – Conspiracy theories, echo chambers, ideologically isolated sources
**Search coverage requirements:**
* Academic literature (BOTH supporting AND opposing views)
* Reputable media across diverse political/ideological perspectives
* Official contradictions (retractions, corrections, updates, amendments)
* Domain-specific skeptics, critics, and alternative expert opinions
* Cross-cultural and international perspectives
**Search must actively avoid algorithmic bubbles:**
* Deliberately seek opposing viewpoints
* Check for echo chamber patterns in source clusters
* Identify tribal or ideological source clustering
* Flag when search space appears artificially constrained
* Verify diversity of perspectives represented
**Outcomes:**
* **Strong counter-evidence found** → Auto-escalate to Tier B or draft-only mode
* **Significant uncertainty detected** → Require uncertainty disclosure in verdict
* **Bubble indicators present** → Flag for expert review and human validation
* **Limited perspective diversity** → Expand search or flag for human review

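The outcome rules above are a direct mapping from Gate 2 findings to escalation actions. A sketch, with all names and the diversity threshold assumed for illustration:

```python
def contradiction_outcome(strong_counter_evidence: bool,
                          significant_uncertainty: bool,
                          bubble_indicators: bool,
                          perspective_diversity: float) -> list[str]:
    """Map Gate 2 findings to the escalation actions listed above."""
    actions = []
    if strong_counter_evidence:
        actions.append("escalate-tier-or-draft-only")
    if significant_uncertainty:
        actions.append("require-uncertainty-disclosure")
    if bubble_indicators:
        actions.append("flag-for-expert-review")
    if perspective_diversity < 0.5:  # assumed threshold for "limited diversity"
        actions.append("expand-search-or-human-review")
    return actions
```
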
=== 6.3 Gate 3: Uncertainty Quantification ===

* Confidence scores calculated for all claims and verdicts
* Limitations explicitly stated
* Data gaps identified and disclosed
* Strength of evidence assessed
* Alternative scenarios considered

=== 6.4 Gate 4: Structural Integrity ===

* No hallucinations detected (fact-checking against sources)
* Logic chain valid and traceable
* References accessible and verifiable
* No circular reasoning
* Premises clearly stated
**If any gate fails:**
* Content remains in draft-only mode
* Failure reason logged
* Human review required before publication
* Failure patterns analyzed for system improvement

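Taken together, the four gates behave as a pipeline in which any single failure keeps content in draft-only mode and is logged. A minimal sketch, assuming each gate can be reduced to a predicate over the content's check results (field names invented for illustration):

```python
from typing import Callable

# Each gate is a (failure-log label, predicate over the content) pair.
Gate = tuple[str, Callable[[dict], bool]]

GATES: list[Gate] = [
    ("source-quality",       lambda c: c.get("sources_ok", False)),
    ("contradiction-search", lambda c: c.get("contradiction_search_done", False)),
    ("uncertainty",          lambda c: c.get("confidence") is not None),
    ("structural-integrity", lambda c: not c.get("hallucinations", True)),
]

def run_quality_gates(content: dict) -> tuple[bool, list[str]]:
    """Return (passed, failed gate names); any failure blocks Mode 2 publication."""
    failures = [name for name, check in GATES if not check(content)]
    return (not failures, failures)
```
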
== 7. Audit System (Sampling-Based Quality Assurance) ==

Instead of reviewing ALL AI output, FactHarbor implements stratified sampling audits:

=== 7.1 Sampling Strategy ===

Audits prioritize:

* **Risk tier** (higher tiers get more frequent audits)
* **AI confidence score** (low confidence → higher sampling rate)
* **Traffic and engagement** (high-visibility content audited more)
* **Novelty** (new claim types, new domains, emerging topics)
* **Disagreement signals** (user flags, contradiction alerts, community reports)

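The prioritization signals above can be combined into a single sampling score. A hypothetical weighted-sum sketch; the weights, caps, and field names are illustrative assumptions, not specified values:

```python
TIER_WEIGHT = {"A": 1.0, "B": 0.5, "C": 0.2}

def audit_priority(item: dict) -> float:
    """Higher score = more likely to be selected for a sampling audit."""
    score = TIER_WEIGHT[item["risk_tier"]]
    score += 1.0 - item["ai_confidence"]        # low confidence -> audit more
    score += min(item["views"] / 10_000, 1.0)   # high-visibility content
    score += 0.5 if item["novel_domain"] else 0.0
    score += 0.25 * item["user_flags"]          # disagreement signals
    return score

items = [
    {"risk_tier": "C", "ai_confidence": 0.95, "views": 100,
     "novel_domain": False, "user_flags": 0},
    {"risk_tier": "A", "ai_confidence": 0.60, "views": 20_000,
     "novel_domain": True, "user_flags": 2},
]
# Audit queue: highest-priority items first.
queue = sorted(items, key=audit_priority, reverse=True)
```
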
=== 7.2 Audit Process ===

1. System selects content for audit based on sampling strategy
2. Human auditor reviews AI-generated content against quality standards
3. Moderator validates or corrects:

* Claim extraction accuracy
* Scenario appropriateness
* Evidence relevance and interpretation
* Verdict reasoning
* Contradiction search completeness
4. Audit outcome recorded (pass/fail + detailed feedback)
5. Failed audits trigger immediate content review
6. Audit results feed back into system improvement

=== 7.3 Feedback Loop (Continuous Improvement) ===

Audit outcomes systematically improve:

* **Query templates** – Refined based on missed evidence patterns
* **Retrieval source weights** – Adjusted for accuracy and reliability
* **Contradiction detection heuristics** – Enhanced to catch missed counter-evidence
* **Model prompts and extraction rules** – Tuned for better claim extraction
* **Risk tier assignments** – Recalibrated based on error patterns
* **Bubble detection algorithms** – Improved to identify echo chambers

=== 7.4 Audit Transparency ===

* Audit statistics published regularly
* Accuracy rates by risk tier tracked and reported
* System improvements documented
* Community can view aggregate audit performance

== 8. Architecture Overview ==

{{include reference="Archive.FactHarbor 2026\.01\.20.Specification.Diagrams.AKEL Architecture.WebHome"/}}

== 9. AKEL and Federation ==

In Release 1.0+, AKEL participates in cross-node knowledge alignment:

* Shares embeddings
* Exchanges canonicalized claim forms
* Exchanges scenario templates
* Sends and receives contradiction alerts
* Shares audit findings (with privacy controls)
* Never shares model weights
* Never overrides local governance

Nodes may choose trust levels for AKEL-related data:

* Trusted nodes: auto-merge embeddings + templates
* Neutral nodes: require additional verification
* Untrusted nodes: fully manual import

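The trust levels above, together with the rule that model weights are never shared, suggest a simple dispatch on sender trust. A sketch with assumed names:

```python
def handle_incoming(payload_kind: str, sender_trust: str) -> str:
    """Decide how to treat AKEL data (embeddings, templates) from a peer node."""
    if payload_kind == "model-weights":
        return "reject"               # model weights are never shared
    if sender_trust == "trusted":
        return "auto-merge"           # embeddings + templates merged directly
    if sender_trust == "neutral":
        return "verify-then-merge"    # additional verification required
    return "manual-import"            # untrusted nodes: fully manual
```
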
== 10. Human Review Workflow (Mode 3 Publication) ==

For content requiring human validation before "AKEL-Generated" status:

1. AKEL generates content and publishes it as an AI-draft (Mode 2) or keeps it as a draft (Mode 1)
2. Contributors inspect content in the review queue
3. Contributors validate that quality gates were correctly applied
4. Trusted Contributors validate high-risk (Tier A) or domain-specific outputs
5. Moderators finalize "AKEL-Generated" publication
6. Version numbers increment; full history preserved
**Note:** Most AI-generated content (Tier B and C) can remain in Mode 2 (AI-Generated) indefinitely. Human review is optional for these tiers unless users or audits flag issues.

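Steps 5 and 6 above amount to a status transition plus a version increment that leaves prior versions intact. A sketch with assumed field names:

```python
def finalize_review(item: dict) -> dict:
    """Moderator finalizes 'AKEL-Generated' status; prior version is preserved."""
    new = dict(item)  # copy: the earlier version stays in the full history
    new["review_status"] = "human-reviewed"
    new["version"] = item["version"] + 1
    return new

v1 = {"review_status": "ai-generated", "version": 1}
v2 = finalize_review(v1)
```
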
== 11. POC v1 Behavior ==

The POC explicitly demonstrates AI-generated content publication:

* Produces public AI-generated output (Mode 2)
* No human data sources required
* No human approval gate
* Clear "AI-Generated - POC/Demo" labeling
* All quality gates active (including contradiction search)
* Users understand this demonstrates AI reasoning capabilities
* Risk tier classification shown (for demo purposes)

== 12. Related Pages ==

* [[Automation>>Archive.FactHarbor 2026\.01\.20.Specification.Automation.WebHome]]
* [[Requirements (Roles)>>Archive.FactHarbor 2026\.01\.20.Specification.Requirements.WebHome]]
* [[Workflows>>Archive.FactHarbor 2026\.01\.20.Specification.Workflows.WebHome]]
* [[Governance>>Archive.FactHarbor.Organisation.Governance.WebHome]]