Changes for page Automation

Last modified by Robert Schaub on 2025/12/24 20:34

From version 5.1

edited by Robert Schaub
on 2025/12/12 15:41

Change comment: Imported from XAR

To version 6.1

edited by Robert Schaub
on 2025/12/14 18:59

Change comment: Imported from XAR

Summary

Page properties (1 modified, 0 added, 0 removed)

Details

Page properties

Content

@@ -1,17 +1,18 @@
  = Automation =
--Automation in FactHarbor amplifies human capability but never replaces human oversight.
--All automated outputs require human review before publication.
++Automation in FactHarbor amplifies human capability while implementing risk-based oversight.
  This chapter defines:
++* Risk-based publication model
++* Quality gates for AI-generated content
  * What must remain human-only
--* What AI (AKEL) can draft
++* What AI (AKEL) can draft and publish
  * What can be fully automated
  * How automation evolves through POC → Beta 0 → Release 1.0
--== POC v1 (Fully Automated "Text to Truth Landscape") ==
++== POC v1 (AI-Generated Publication Demonstration) ==
--The goal of POC v1 is to validate the automated reasoning capabilities of the data model without human intervention.
++The goal of POC v1 is to validate the automated reasoning capabilities and demonstrate AI-generated content publication.
  === Workflow ===
@@ -19,156 +19,252 @@
 . **Deep Analysis (Background)**: The system autonomously performs the full pipeline **before** displaying the text:
  * Extraction & Normalisation
  * Scenario & Sub-query generation
--* Evidence retrieval & Verdict computation
++* Evidence retrieval with **contradiction search**
++* Quality gate validation
++* Verdict computation
 . **Visualisation (Extraction & Marking)**: The system displays the text with claims extracted and marked.
  * **Verdict-Based Coloring**: The extraction highlights (e.g. Orange/Green) are chosen **according to the computed verdict** for each claim.
++* **AI-Generated Label**: Clear indication that content is AI-produced
 . **Inspection**: User clicks a highlighted claim to see the **Reasoning Trail**, showing exactly which evidence and sub-queries led to that verdict.
  === Technical Scope ===
--* **Fully Automated**: No human-in-the-loop for this phase.
--* **Structured Sub-Queries**: Logic is generated by decomposing claims into the FactHarbor data model.
--* **Latency**: Focus on accuracy of reasoning over real-time speed for v1.
++* **AI-Generated Publication**: Content published as Mode 2 (AI-Generated, no prior human review)
++* **Quality Gates Active**: All automated quality checks enforced
++* **Contradiction Search Demonstrated**: Shows counter-evidence and reservation detection
++* **Risk Tier Classification**: The POC shows tier assignment for demonstration purposes
++* **No Human Approval Gate**: Demonstrates scalable AI publication
++* **Structured Sub-Queries**: Logic generated by decomposing claims into the FactHarbor data model
  ----
--= Manual vs Automated Responsibilities =
++== Publication Model ==
++FactHarbor implements a risk-based publication model with three modes:
++
++=== Mode 1: Draft-Only ===
++* Failed quality gates
++* High-risk content pending expert review
++* Internal review queue only
++
++=== Mode 2: AI-Generated (Public) ===
++* Passed all quality gates
++* Risk tier B or C
++* Clear AI-generated labeling
++* Users can request human review
++
++=== Mode 3: Human-Reviewed ===
++* Validated by human reviewers/experts
++* "Human-Reviewed" status badge
++* Required for Tier A content publication
++
++See [[AKEL page>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] for detailed publication mode descriptions.
++
++----
++
++== Risk Tiers and Automation Levels ==
++
++=== Tier A (High Risk) ===
++* **Domains**: Medical, legal, elections, safety, security
++* **Automation**: AI can draft; human review is required for "Human-Reviewed" status
++* **AI publication**: Allowed with prominent disclaimers and warnings
++* **Audit rate**: 30-50% recommended
++
++=== Tier B (Medium Risk) ===
++* **Domains**: Complex policy, science, causality claims
++* **Automation**: AI can draft and publish (Mode 2)
++* **Human review**: Optional, audit-based
++* **Audit rate**: 10-20% recommended
++
++=== Tier C (Low Risk) ===
++* **Domains**: Definitions, established facts, historical data
++* **Automation**: AI publication default
++* **Human review**: On request or via sampling
++* **Audit rate**: 5-10% recommended
++
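The three publication modes and three risk tiers above can be read as a small decision rule. The following is a minimal sketch of that rule, not FactHarbor's implementation: the names `RiskTier`, `PublicationMode`, and `select_publication_mode` are hypothetical, and it assumes that a completed human review takes precedence over gate results. It follows the mode definitions (Mode 2 for Tier B or C only), so it does not model the Tier A option of AI publication with disclaimers.

```python
from enum import Enum


class RiskTier(Enum):
    A = "A"  # high risk: medical, legal, elections, safety, security
    B = "B"  # medium risk: complex policy, science, causality claims
    C = "C"  # low risk: definitions, established facts, historical data


class PublicationMode(Enum):
    DRAFT_ONLY = 1      # Mode 1: internal review queue only
    AI_GENERATED = 2    # Mode 2: public, labeled as AI-generated
    HUMAN_REVIEWED = 3  # Mode 3: validated by human reviewers/experts


def select_publication_mode(tier, gates_passed, human_reviewed=False):
    """Map a claim's tier and review state to a publication mode.

    Assumption (not stated in the page): human validation wins over
    the automated gate result.
    """
    if human_reviewed:
        return PublicationMode.HUMAN_REVIEWED
    if not gates_passed:
        # Failed quality gates keep the content in draft.
        return PublicationMode.DRAFT_ONLY
    if tier is RiskTier.A:
        # High-risk content waits for expert review in this sketch.
        return PublicationMode.DRAFT_ONLY
    return PublicationMode.AI_GENERATED


print(select_publication_mode(RiskTier.C, gates_passed=True))
```

Keeping the rule in one function makes it easy to audit how a tier policy change would shift content between modes.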
++----
++
  == Human-Only Tasks ==
--These require human judgment, ethics, or contextual interpretation:
++These require human judgment and cannot be automated:
--* Definition of key terms in claims
--* Approval or rejection of scenarios
--* Interpretation of evidence in context
--* Final verdict approval
--* Governance decisions and dispute resolution
--* High-risk domain oversight
--* Ethical boundary decisions (especially medical, political, psychological)
++* **Ethical boundary decisions** (especially medical, political, psychological harm assessment)
++* **Dispute resolution** between conflicting expert opinions
++* **Governance policy** setting and enforcement
++* **Final authority** on Tier A "Human-Reviewed" status
++* **Audit system oversight** and quality standard definition
++* **Risk tier policy** adjustments based on societal context
--== Semi-Automated (AI Draft → Human Review) ==
++----
--AKEL can draft these, but humans must refine/approve:
++== AI-Draft with Audit (Semi-Automated) ==
--* Scenario structures (definitions, assumptions, context)
--* Evaluation methods
--* Evidence relevance suggestions
--* Reliability hints
--* Verdict reasoning chains
--* Uncertainty and limitations
--* Scenario comparison explanations
--* Suggestions for merging or splitting scenarios
--* Draft public summaries
++AKEL drafts these; humans validate via sampling audits:
++* **Scenario structures** (definitions, assumptions, context)
++* **Evaluation methods** and reasoning chains
++* **Evidence relevance** assessment and ranking
++* **Reliability scoring** and source evaluation
++* **Verdict reasoning** with uncertainty quantification
++* **Contradiction and reservation** identification
++* **Scenario comparison** explanations
++* **Public summaries** and accessibility text
++
++Most Tier B and C content remains in AI-draft status unless:
++* Users request human review
++* Audits identify errors
++* High engagement triggers review
++* Community flags issues
++
++----
++
  == Fully Automated Structural Tasks ==
  These require no human interpretation:
--* Claim normalization
--* Duplicate & cluster detection (vector embeddings)
--* Evidence metadata extraction
--* Basic reliability heuristics
--* Contradiction detection
--* Re-evaluation triggers
--* Batch layout generation (diagrams, summaries)
--* Federation integrity checks
++* **Claim normalization** (canonical form generation)
++* **Duplicate detection** (vector embeddings, clustering)
++* **Evidence metadata extraction** (dates, authors, publication info)
++* **Basic reliability heuristics** (source reputation scoring)
++* **Contradiction detection** (conflicting statements across sources)
++* **Re-evaluation triggers** (new evidence, source updates)
++* **Layout generation** (diagrams, summaries, UI presentation)
++* **Federation integrity checks** (cross-node data validation)
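As an illustration of the duplicate detection task listed above, the sketch below compares claim embeddings by cosine similarity and reports pairs above a threshold. It is an assumption-laden example: the page only says "vector embeddings, clustering", so the similarity measure, the `0.95` threshold, and the names `cosine_similarity` and `find_duplicates` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicates(embeddings, threshold=0.95):
    """Return pairs of claim IDs whose embeddings are near-identical.

    `embeddings` maps a claim ID to its vector. The threshold is an
    illustrative value, not one taken from the specification.
    """
    ids = list(embeddings)
    pairs = []
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            sim = cosine_similarity(embeddings[ids[i]], embeddings[ids[j]])
            if sim >= threshold:
                pairs.append((ids[i], ids[j]))
    return pairs


claims = {"c1": [1.0, 0.0], "c2": [0.99, 0.05], "c3": [0.0, 1.0]}
print(find_duplicates(claims))
```

The pairwise loop is quadratic in the number of claims; a production system would more likely use an approximate nearest-neighbour index, which this sketch does not attempt.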
  ----
--= Automation Roadmap =
++== Quality Gates (Automated) ==
--Automation increases with maturity.
++Before AI-draft publication (Mode 2), content must pass:
--== POC (Low Automation) ==
++1. **Source Quality Gate**
++   * Primary sources verified
++   * Citations complete and accessible
++   * Source reliability scored
--=== Automated ===
--* Claim normalization
--* Light scenario templates
--* Evidence metadata extraction
--* Simple verdict drafts (internal only)
++2. **Contradiction Search Gate** (MANDATORY)
++   * Counter-evidence actively sought
++   * Reservations and limitations identified
++   * Bubble detection (echo chambers, conspiracy theories)
++   * Diverse perspective verification
--=== Human ===
--* All scenario definitions
--* Evidence interpretation
--* Verdict creation
--* Governance
++3. **Uncertainty Quantification Gate**
++   * Confidence scores calculated
++   * Limitations stated
++   * Data gaps disclosed
--== Beta 0 (Medium Automation) ==
++4. **Structural Integrity Gate**
++   * No hallucinations detected
++   * Logic chain valid
++   * References verifiable
--=== Automated ===
--* Detailed scenario drafts
--* Evidence reliability scoring
--* Cross-scenario comparisons
--* Contradiction detection (local + remote nodes)
--* Internal Truth Landscape drafts
++See [[AKEL page>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] for detailed quality gate specifications.
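The four gates can be thought of as a list of named checks that must all pass before Mode 2 publication. The sketch below shows one possible shape for that pipeline. The `Draft` fields and the one-line checks are hypothetical stand-ins; the real gate criteria are defined on the AKEL page and are considerably richer than a single boolean per gate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    # Hypothetical summary flags; each would be computed by AKEL.
    sources_verified: bool
    counter_evidence_searched: bool
    confidence: Optional[float]  # None means no score was calculated
    logic_chain_valid: bool


# Gate name plus a predicate over the draft, in the order listed above.
GATES = [
    ("source_quality", lambda d: d.sources_verified),
    ("contradiction_search", lambda d: d.counter_evidence_searched),
    ("uncertainty_quantification", lambda d: d.confidence is not None),
    ("structural_integrity", lambda d: d.logic_chain_valid),
]


def failed_gates(draft):
    """Return the names of all gates the draft does not pass."""
    return [name for name, check in GATES if not check(draft)]


def may_publish_as_mode_2(draft):
    """A draft qualifies only if every gate passes."""
    return not failed_gates(draft)


draft = Draft(True, False, 0.8, True)
print(failed_gates(draft))
```

Returning the list of failed gates, rather than a single boolean, lets the review queue show why a draft stayed in Mode 1.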
--=== Human ===
--* Scenario approval
--* Final verdict validation
++----
--== Release 1.0 (High Automation) ==
++== Audit System ==
--=== Automated ===
--* Full scenario generation (definitions, assumptions, boundaries)
--* Evidence relevance scoring and ranking
--* Bayesian verdict scoring across scenario sets
--* Multi-scenario summary generation
--* Anomaly detection across nodes
--* AKEL-assisted federated synchronization
++Instead of reviewing all AI output, FactHarbor relies on systematic sampling audits to ensure quality:
--=== Human ===
--* Final approval of all scenarios and verdicts
--* Ethical decisions
--* Oversight and conflict resolution
++=== Stratified Sampling ===
++* Risk tier (A > B > C sampling rates)
++* Confidence scores (low confidence → more audits)
++* Traffic/engagement (popular content audited more)
++* Novelty (new topics/claim types prioritized)
++* User flags and disagreement signals
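One way to combine these sampling factors is to move the audit probability within each tier's recommended range. The sketch below does this for two of the factors, confidence and user flags. The interpolation formula and the names `AUDIT_RANGES`, `audit_rate`, and `should_audit` are assumptions made for illustration; only the per-tier ranges come from the Risk Tiers section.

```python
import random

# Recommended audit-rate ranges per tier, from the Risk Tiers section.
AUDIT_RANGES = {"A": (0.30, 0.50), "B": (0.10, 0.20), "C": (0.05, 0.10)}


def audit_rate(tier, confidence, flagged=False):
    """Pick an audit probability inside the tier's recommended range.

    Hypothetical rule: low confidence or a user flag pushes the rate
    toward the upper bound; high confidence keeps it at the lower bound.
    """
    low, high = AUDIT_RANGES[tier]
    pressure = 1.0 if flagged else max(0.0, min(1.0, 1.0 - confidence))
    return low + (high - low) * pressure


def should_audit(tier, confidence, flagged=False, rng=random):
    """Randomly decide whether to audit one item, given its audit rate."""
    return rng.random() < audit_rate(tier, confidence, flagged)


print(audit_rate("B", confidence=0.5))
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible, which helps when audit statistics need to be explained later.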
++=== Continuous Improvement Loop ===
++Audit findings improve:
++* Query templates
++* Source reliability weights
++* Contradiction detection algorithms
++* Risk tier assignment rules
++* Bubble detection heuristics
++
++=== Transparency ===
++* Audit statistics published
++* Accuracy rates by tier reported
++* System improvements documented
++
  ----
--= Automation Levels =
++== Automation Roadmap ==
--== Level 0 — Human-Centric (POC) ==
--AI is purely advisory, nothing auto-published.
++Automation capabilities increase with system maturity while maintaining quality oversight.
--== Level 1 — Assisted (Beta 0) ==
--AI drafts structures; humans approve each part.
++=== POC (Current Focus) ===
--== Level 2 — Structured (Release 1.0) ==
--AI produces near-complete drafts; humans refine.
++**Automated:**
++* Claim normalization
++* Scenario template generation
++* Evidence metadata extraction
++* Simple verdict drafts
++* **AI-generated publication** (Mode 2, with quality gates)
++* **Contradiction search**
++* **Risk tier assignment**
--== Level 3 — Distributed Intelligence (Future) ==
--Nodes exchange embeddings, contradiction alerts, and scenario templates.
--Humans still approve everything.
++**Human:**
++* High-risk content validation (Tier A)
++* Sampling audits across all tiers
++* Quality standard refinement
++* Governance decisions
------
++=== Beta 0 (Enhanced Automation) ===
--= Automation Matrix =
++**Automated:**
++* Detailed scenario generation
++* Advanced evidence reliability scoring
++* Cross-scenario comparisons
++* Multi-source contradiction detection
++* Internal Truth Landscape generation
++* **Increased AI-draft coverage** (more Tier B content)
--== Always Human ==
--* Final verdict approval
--* Scenario validity
--* Ethical decisions
--* Dispute resolution
++**Human:**
++* Tier A final approval
++* Audit sampling (continued)
++* Expert validation of complex domains
++* Quality improvement oversight
--== Mostly AI (Human Validation Needed) ==
--* Claim normalization
--* Clustering
--* Evidence metadata
--* Reliability heuristics
--* Scenario drafts
--* Contradiction detection
++=== Release 1.0 (High Automation) ===
--== Mixed ==
--* Definitions of ambiguous terms
--* Boundary choices
--* Assumption evaluation
--* Evidence selection
--* Verdict reasoning
++**Automated:**
++* Full scenario generation (comprehensive)
++* Bayesian verdict scoring across scenarios
++* Multi-scenario summary generation
++* Anomaly detection across federated nodes
++* AKEL-assisted cross-node synchronization
++* **Most Tier B and all Tier C** auto-published
++**Human:**
++* Tier A oversight (still required)
++* Strategic audits (lower sampling rates, higher value)
++* Ethical decisions and policy
++* Conflict resolution
++
  ----
--= Diagram References =
++== Automation Levels Diagram ==
++{{include reference="FactHarbor.Specification.Diagrams.Automation Level.WebHome"/}}
++
++----
++
++== Automation Roadmap Diagram ==
++
  {{include reference="FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome"/}}
--{{include reference="FactHarbor.Specification.Diagrams.Automation Level.WebHome"/}}
++----
++== Manual vs Automated Matrix ==
++
  {{include reference="FactHarbor.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}
++
++----
++
++== Related Pages ==
++
++* [[AKEL (AI Knowledge Extraction Layer)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
++* [[Requirements (Roles)>>FactHarbor.Specification.Requirements.WebHome]]
++* [[Workflows>>FactHarbor.Specification.Workflows.WebHome]]
++* [[Governance>>FactHarbor.Organisation.Governance]]
++
