Automation
How FactHarbor scales through automated claim evaluation.
1. Automation Philosophy
FactHarbor is automation-first: AKEL (AI Knowledge Extraction Layer) makes all content decisions. Humans monitor system performance and improve algorithms.
Why automation:
- Scale: Can process millions of claims
- Consistency: Same evaluation criteria applied uniformly
- Transparency: Algorithms are auditable
- Speed: Results in <20 seconds typically
See Automation Philosophy for detailed principles.
2. Claim Processing Flow
2.1 User Submits Claim
- User provides claim text + source URLs
- System validates format
- Assigns processing ID
- Queues for AKEL processing
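The submission steps above can be sketched as follows. This is a minimal illustration, not FactHarbor's actual intake code: the names `Submission`, `submit_claim`, and `akel_queue`, the counter-based processing ID, and the specific format checks are all assumptions made for the example.

```python
import itertools
import queue
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Submission:
    processing_id: int
    claim_text: str
    source_urls: tuple[str, ...]


# Hypothetical ID scheme: a simple in-process counter.
_ids = itertools.count(1)
# Hypothetical in-memory queue standing in for the AKEL work queue.
akel_queue: "queue.Queue[Submission]" = queue.Queue()


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs (an assumed format rule)."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def submit_claim(claim_text: str, source_urls: list[str]) -> Submission:
    """Validate format, assign a processing ID, and queue for AKEL."""
    text = claim_text.strip()
    if not text:
        raise ValueError("claim text must not be empty")
    bad = [u for u in source_urls if not is_valid_url(u)]
    if bad:
        raise ValueError(f"invalid source URLs: {bad}")
    submission = Submission(next(_ids), text, tuple(source_urls))
    akel_queue.put(submission)
    return submission
```

Rejecting malformed input before an ID is assigned keeps the queue free of items AKEL could not process anyway.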
2.2 AKEL Processing
AKEL automatically:
1. Parses claim into testable components
2. Extracts evidence from sources
3. Scores source credibility
4. Evaluates claim against evidence
5. Generates verdict with confidence score
6. Assigns risk tier (A/B/C)
7. Publishes result
Processing time: Typically <20 seconds
No human approval required - publication is automatic
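To make the ordering concrete, the seven steps can be sketched as a list of functions applied in sequence. Every step body here is a placeholder that returns dummy values so the sketch runs; the function names, the dictionary-based state, and the dummy scores are assumptions, not AKEL's real logic.

```python
from typing import Callable

State = dict
Step = Callable[[State], State]


# Placeholder steps: each records a dummy result only.
def parse_claim(s: State) -> State:
    return {**s, "components": [s["text"]]}


def extract_evidence(s: State) -> State:
    return {**s, "evidence": [f"excerpt from {u}" for u in s["sources"]]}


def score_sources(s: State) -> State:
    return {**s, "source_scores": {u: 0.5 for u in s["sources"]}}


def evaluate_claim(s: State) -> State:
    return {**s, "support": 0.5}


def generate_verdict(s: State) -> State:
    return {**s, "verdict": "unverified", "confidence": s["support"]}


def assign_risk_tier(s: State) -> State:
    return {**s, "risk_tier": "C"}


def publish(s: State) -> State:
    # Publication is the last automated step; no approval step precedes it.
    return {**s, "state": "published"}


AKEL_STEPS: list[Step] = [
    parse_claim, extract_evidence, score_sources, evaluate_claim,
    generate_verdict, assign_risk_tier, publish,
]


def run_akel(text: str, sources: list[str]) -> State:
    """Run all steps in order on a fresh processing state."""
    state: State = {"text": text, "sources": sources, "state": "processing"}
    for step in AKEL_STEPS:
        state = step(state)
    return state
```

Keeping the steps in a plain list makes the absence of a human approval gate visible in the structure itself: publishing is simply the final entry.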
2.3 Publication States
Processing: AKEL working on claim (not visible to public)
Published: AKEL completed evaluation (public)
- Verdict displayed with confidence score
- Evidence and sources shown
- Risk tier indicated
- Users can report issues
Flagged: AKEL identified issue requiring moderator attention (still public)
- Low confidence below threshold
- Detected manipulation attempt
- Unusual pattern
- Moderator reviews and may take action
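The three states and their visibility can be captured in a small enum. This is a sketch under assumptions: the names `PublicationState` and `state_after_evaluation` are hypothetical, and the threshold value of 0.5 is illustrative only, since the actual confidence threshold is not specified here.

```python
from enum import Enum


class PublicationState(Enum):
    PROCESSING = "processing"
    PUBLISHED = "published"
    FLAGGED = "flagged"

    @property
    def is_public(self) -> bool:
        # Only claims still being processed are hidden; flagged claims stay public.
        return self is not PublicationState.PROCESSING


# Illustrative value; the real threshold is a policy setting.
CONFIDENCE_THRESHOLD = 0.5


def state_after_evaluation(
    confidence: float,
    manipulation_detected: bool = False,
    unusual_pattern: bool = False,
) -> PublicationState:
    """Return FLAGGED if any of the listed flag conditions holds."""
    if confidence < CONFIDENCE_THRESHOLD or manipulation_detected or unusual_pattern:
        return PublicationState.FLAGGED
    return PublicationState.PUBLISHED
```

For example, `state_after_evaluation(0.3)` yields `PublicationState.FLAGGED`, whose `is_public` property is still `True`.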
2.5 LLM-Based Processing Architecture
FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:
POC: Two-Phase Approach
Phase 1: Claim Extraction
- Single LLM call to extract all claims from submitted content
- Light structure, focused on identifying distinct verifiable claims
- Output: List of claims with context
Phase 2: Claim Analysis (Parallel)
- Single LLM call per claim (parallelizable)
- Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
- Each claim analyzed independently
Advantages:
- Fast to implement (2-4 weeks to working POC)
- Few API calls: 1 extraction call plus 1 analysis call per claim (1 + N total)
- Simple to debug (claim-level isolation)
- Proves concept viability
Production: Three-Phase Approach
Phase 1: Claim Extraction + Validation
- Extract distinct verifiable claims
- Validate claim clarity and uniqueness
- Remove duplicates and vague claims
Phase 2: Evidence Gathering (Parallel)
- For each claim independently:
- Find supporting and contradicting evidence
- Identify authoritative sources
- Generate test scenarios
- Validation: Check evidence quality and source validity
- Error containment: Issues in one claim don't affect others
Phase 3: Verdict Generation (Parallel)
- For each claim:
- Generate verdict based on validated evidence
- Assess confidence and risk level
- Flag low-confidence results for human review
- Validation: Check verdict consistency with evidence
Advantages:
- Error containment between phases
- Clear quality gates and validation
- Observable metrics per phase
- Scalable (parallel processing across claims)
- Adaptable (can optimize each phase independently)
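As one example of a validation gate, the Phase 1 step that removes duplicates and vague claims might look like the sketch below. The heuristics are assumptions chosen for illustration: duplicates are detected by case- and whitespace-insensitive comparison, and "vague" is approximated as having fewer than `min_words` words. A production gate would likely rely on an LLM or a semantic similarity measure instead.

```python
def normalize(claim: str) -> str:
    """Lower-case, collapse whitespace, and drop a trailing period."""
    return " ".join(claim.lower().split()).rstrip(".")


def validate_claims(claims: list[str], min_words: int = 3) -> list[str]:
    """Keep the first occurrence of each claim that is long enough."""
    seen: set[str] = set()
    kept: list[str] = []
    for claim in claims:
        key = normalize(claim)
        if len(key.split()) < min_words:
            continue  # treated as too vague to verify (assumed heuristic)
        if key in seen:
            continue  # duplicate of an earlier claim
        seen.add(key)
        kept.append(claim.strip())
    return kept
```

Because the gate runs before evidence gathering, a dropped claim never consumes Phase 2 or Phase 3 calls.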
LLM Task Delegation
All complex cognitive tasks are delegated to LLMs:
- Claim Extraction: Understanding context, identifying distinct claims
- Evidence Finding: Analyzing sources, assessing relevance
- Scenario Generation: Creating testable hypotheses
- Source Evaluation: Assessing reliability and authority
- Verdict Generation: Synthesizing evidence into conclusions
- Risk Assessment: Evaluating potential impact
Error Mitigation
Sequential LLM calls compound errors, because each call builds on the output of the one before it. FactHarbor mitigates this through:
- Validation gates between phases
- Confidence thresholds for quality control
- Parallel processing to avoid error propagation across claims
- Human review queue for low-confidence verdicts
- Independent claim processing - errors in one claim don't cascade to others
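A simple worked example shows why chained calls need validation gates. Assume, purely for illustration, that each step is correct with probability $p$ independently of the others; the figure $p = 0.95$ is not a measured FactHarbor value. The probability that a chain of $n$ steps is correct end to end is then:

```latex
P(\text{all } n \text{ steps correct}) = p^{n}, \qquad
0.95^{3} \approx 0.857, \qquad
0.95^{7} \approx 0.698
```

Under this assumption a three-phase chain is fully correct about 86% of the time and a seven-step chain about 70% of the time, which is why each phase validates its output before the next phase consumes it.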
3. Risk Tiers
Risk tiers classify claims by potential impact and guide audit sampling rates.
3.1 Tier A (High Risk)
Domains: Medical, legal, elections, safety, security
Characteristics:
- High potential for harm if incorrect
- Complex specialized knowledge required
- Often subject to regulation
Publication: AKEL publishes automatically with prominent risk warning
Audit rate: Higher sampling recommended
3.2 Tier B (Medium Risk)
Domains: Complex policy, science, causality claims
Characteristics:
- Moderate potential impact
- Requires careful evidence evaluation
- Multiple valid interpretations possible
Publication: AKEL publishes automatically with standard risk label
Audit rate: Moderate sampling recommended
3.3 Tier C (Low Risk)
Domains: Definitions, established facts, historical data
Characteristics:
- Low potential for harm
- Well-documented information
- Clear right/wrong answers typically
Publication: AKEL publishes by default
Audit rate: Lower sampling recommended
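Tier assignment from the domain lists above could be sketched as a lookup. The domain keywords come from the tier descriptions; the rule that the highest-risk matching tier wins, and the default of Tier B for unrecognized domains, are assumptions of this example rather than documented behavior.

```python
from enum import Enum


class RiskTier(Enum):
    A = "high"
    B = "medium"
    C = "low"


# Domain keywords taken from the tier descriptions.
TIER_DOMAINS: dict[RiskTier, set[str]] = {
    RiskTier.A: {"medical", "legal", "elections", "safety", "security"},
    RiskTier.B: {"policy", "science", "causality"},
    RiskTier.C: {"definitions", "established facts", "historical data"},
}


def assign_tier(domains: set[str]) -> RiskTier:
    """Return the highest-risk tier that any of the claim's domains maps to."""
    for tier in (RiskTier.A, RiskTier.B, RiskTier.C):
        if domains & TIER_DOMAINS[tier]:
            return tier
    # Assumption: unrecognized domains default to medium risk.
    return RiskTier.B
```

A claim tagged with both `"medical"` and `"science"` is therefore assigned Tier A.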
4. Quality Gates
AKEL applies quality gates before publication. If any gate fails, the claim is flagged for moderator review but still published; gates never block publication.
Quality gates:
- Sufficient evidence extracted (≥2 sources)
- Sources meet minimum credibility threshold
- Confidence score calculable
- No detected manipulation patterns
- Claim parseable into testable form
Failed gates: Claim published with flag for moderator review
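The gate checks can be sketched as a function that returns the names of failed gates. The `Evaluation` fields, the gate names, and the credibility threshold of 0.5 are hypothetical; only the minimum of two sources comes from the list above. The sketch also assumes that every source must meet the credibility threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evaluation:
    source_scores: list[float]       # one credibility score per source
    confidence: Optional[float]      # None if no score could be calculated
    manipulation_detected: bool
    parseable: bool


MIN_SOURCES = 2        # from the gate list
MIN_CREDIBILITY = 0.5  # illustrative value, not a documented threshold


def failed_gates(ev: Evaluation) -> list[str]:
    """Return the names of all gates the evaluation fails."""
    failures = []
    if len(ev.source_scores) < MIN_SOURCES:
        failures.append("insufficient_evidence")
    if any(score < MIN_CREDIBILITY for score in ev.source_scores):
        failures.append("low_source_credibility")
    if ev.confidence is None:
        failures.append("confidence_not_calculable")
    if ev.manipulation_detected:
        failures.append("manipulation_detected")
    if not ev.parseable:
        failures.append("claim_not_parseable")
    return failures


def publication_state(ev: Evaluation) -> str:
    # A failed gate flags the claim; it is published either way.
    return "flagged" if failed_gates(ev) else "published"
```

Returning the full list of failures, rather than stopping at the first, gives moderators the complete reason a claim was flagged.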
5. Automation Levels
Automation Maturity Progression
graph TD
POC[Level 0 POC Demo CURRENT]
R05[Level 0.5 Limited Production]
R10[Level 1.0 Full Production]
R20[Level 2.0+ Distributed Intelligence]
POC --> R05
R05 --> R10
R10 --> R20
Level Descriptions
| Level | Name | Key Features |
|---|---|---|
| Level 0 | POC/Demo (CURRENT) | All content auto-analyzed, AKEL generates verdicts, no risk tier filtering, single-user demo mode |
| Level 0.5 | Limited Production | Multi-user support, risk tier classification, basic sampling audit, algorithm improvement focus |
| Level 1.0 | Full Production | All tiers auto-published, clear risk labels, reduced sampling, mature algorithms |
| Level 2.0+ | Distributed | Federated multi-node, cross-node audits, advanced patterns, strategic sampling only |
Current Implementation (v2.6.33)
| Feature | POC Target | Actual Status |
|---|---|---|
| AKEL auto-analysis | Yes | Implemented |
| Verdict generation | Yes | Implemented (7-point scale) |
| Quality Gates | Basic | Gates 1 and 4 implemented |
| Risk tiers | Yes | Not implemented |
| Sampling audits | High sampling | Not implemented |
| User system | Demo only | Anonymous only |
Key Principles
Across All Levels:
- AKEL makes all publication decisions
- No human approval gates
- Humans monitor metrics and improve algorithms
- Risk tiers guide audit priorities, not publication
- Sampling audits inform improvements
FactHarbor progresses through automation maturity levels:
Release 0.5 (Proof-of-Concept): Tier C only, human review required
Release 1.0 (Initial): Tier B/C auto-published, Tier A flagged for review
Release 2.0 (Mature): All tiers auto-published with risk labels, sampling audits
See Automation Roadmap for detailed progression.
5.5 Automation Roadmap
graph LR
subgraph QA[Quality Assurance Evolution]
QA1[Initial High Sampling]
QA2[Intermediate Strategic]
QA3[Mature Anomaly-Triggered]
QA1 --> QA2
QA2 --> QA3
end
subgraph POC[POC CURRENT]
POC_F[POC Features]
end
subgraph R05[Release 0.5]
R05_F[Limited Production]
end
subgraph R10[Release 1.0]
R10_F[Full Production]
end
subgraph Future[Future]
Future_F[Distributed Intelligence]
end
POC_F --> R05_F
R05_F --> R10_F
R10_F --> Future_F
Phase Details
POC (Current v2.6.33)
- All content analyzed
- Basic AKEL Processing
- No risk tiers yet
- No sampling audits
Release 0.5 (Planned)
- Tier A/B/C Published
- All auto-publication
- Risk Labels Active
- Contradiction Detection
- Sampling-Based QA
Release 1.0 (Planned)
- Comprehensive AI Publication
- Strategic Audits Only
- Federated Nodes Beta
- Cross-Node Data Sharing
- Mature Algorithm Performance
Future (V2.0+)
- Advanced Pattern Detection
- Global Contradiction Network
- Minimal Human QA
- Full Federation
Philosophy
Automation Philosophy: At all stages, AKEL publishes automatically. Humans improve algorithms, not review content.
Sampling Rates: Start higher for learning, reduce as confidence grows.
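A sampling audit with tier-dependent rates might be sketched as below. The rates are placeholders: no numeric sampling rates are specified here, only that higher-risk tiers are sampled more often and that rates fall as confidence grows. Hashing the claim ID, rather than drawing a random number, is a design choice of this sketch that makes the selection reproducible.

```python
import hashlib

# Placeholder rates: higher-risk tiers are sampled more often.
SAMPLING_RATES = {"A": 0.30, "B": 0.10, "C": 0.02}


def selected_for_audit(claim_id: str, tier: str, rates: dict = SAMPLING_RATES) -> bool:
    """Deterministically select roughly rates[tier] of claims for audit."""
    digest = hashlib.sha256(claim_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a value in [0, 1].
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rates[tier]
```

Lowering the rates as algorithms mature then only requires changing the `rates` mapping; the same claim ID always gives the same answer for a given rate.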
6. Human Role
Humans do NOT review content for approval. Instead:
Monitoring: Watch aggregate performance metrics
Improvement: Fix algorithms when patterns show issues
Exception handling: Review AKEL-flagged items
Governance: Set policies AKEL applies
See Contributor Processes for how to improve the system.
6.5 Manual vs Automated Matrix
graph TD
subgraph Automated[Automated by AKEL]
A1[Claim Evaluation]
A2[Quality Assessment]
A3[Content Management]
end
subgraph Human[Human Responsibilities]
H1[Algorithm Improvement]
H2[Policy Governance]
H3[Exception Handling]
H4[Strategic Decisions]
end
Automated by AKEL
| Function | Details | Status |
|---|---|---|
| Claim Evaluation | Evidence extraction, source scoring, verdict generation, risk classification, publication | Implemented |
| Quality Assessment | Contradiction detection, confidence scoring, pattern recognition, anomaly flagging | Partial (Gates 1 and 4) |
| Content Management | KeyFactor generation, evidence linking, source tracking | Implemented |
Human Responsibilities
| Function | Details | Status |
|---|---|---|
| Algorithm Improvement | Monitor metrics, identify issues, propose fixes, test, deploy | Via code changes |
| Policy Governance | Set criteria, define risk tiers, establish thresholds, update guidelines | Not implemented (env vars only) |
| Exception Handling | Review flagged items, handle abuse, address safety, manage legal | Not implemented |
| Strategic Decisions | Budget, hiring, major policy, partnerships | N/A |
Key Principles
Never Manual:
- Individual claim approval
- Routine content review
- Verdict overrides (fix algorithm instead)
- Publication gates
Key Principle: AKEL handles all content decisions. Humans improve the system, not the data.
7. Moderation
Moderators handle items AKEL flags:
Abuse detection: Spam, manipulation, harassment
Safety issues: Content that could cause immediate harm
System gaming: Attempts to manipulate scoring
Action: May temporarily hide content, ban users, or propose algorithm improvements
Does NOT: Routinely review claims or override verdicts
See Organisational Model for moderator role details.
8. Continuous Improvement
Performance monitoring: Track AKEL accuracy, speed, coverage
Issue identification: Find systematic errors from metrics
Algorithm updates: Deploy improvements to fix patterns
A/B testing: Validate changes before full rollout
Retrospectives: Learn from failures systematically
See Continuous Improvement for improvement cycle.
9. Scalability
Automation enables FactHarbor to scale:
- Millions of claims processable
- Consistent quality at any volume
- Cost efficiency through automation
- Rapid iteration on algorithms
Without automation: Human review doesn't scale, creates bottlenecks, and introduces inconsistency.
10. Transparency
All automation is transparent:
- Algorithm parameters documented
- Evaluation criteria public
- Source scoring rules explicit
- Confidence calculations explained
- Performance metrics visible
See System Performance Metrics for what we measure.