POC Requirements (POC1 & POC2)
Status: ✅ Approved for Development
Version: 3.0 (Aligned with Main Requirements)
Goal: Prove that AI can extract claims and determine verdicts automatically without human intervention
1. POC Overview
1.1 What POC Tests
Core Question:
Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?
What we're proving:
- AI can identify factual claims from text
- AI can evaluate those claims with structured evidence
- Quality gates can filter unreliable outputs
- The core workflow is technically feasible
What we're NOT proving:
- Production-ready reliability (that's POC2)
- User-facing features (that's Beta 0)
- Full IFCN compliance (that's V1.0)
1.2 Requirements Mapping
POC1 implements a subset of the full system requirements defined in Main Requirements.
Scope Summary:
- In Scope: 8 requirements (7 FRs + 1 NFR)
- Partial: 3 NFRs (simplified versions)
- Out of Scope: 19 requirements (deferred to later phases)
2. POC1 Scope
POC1 implements these formal requirements:
| Formal Req | Implementation in POC1 | Notes |
|---|---|---|
| FR4 | Analysis Summary | Basic format; quality metadata deferred to POC2 |
| FR7 | Automated Verdicts | Full implementation with quality gates (NFR11) |
| NFR11 | Quality Assurance Framework | 2 quality gates implemented (Gates 1 and 4) |
POC1 also covers these workflow components; FR1-FR6 are detailed in the implementation sections below, while items marked deferred are out of POC1 scope:
- Claim extraction (FR1)
- Claim context (FR2)
- Multiple scenarios (FR3)
- Evidence collection (FR5)
- Source quality assessment (FR6)
- Time evolution tracking (FR8) - deferred to POC2
- Audit trail (FR11) - deferred to Beta 0
- In-article highlighting (FR13) - deferred to Beta 0
Partial implementations:
- NFR1 (Explainability) - Basic only
- NFR2 (Performance) - Functional but not optimized
- NFR3 (Transparency) - Basic only
Detailed POC1 implementation specifications continue below...
3. POC Simplifications
3.1 FR1: Claim Extraction (Full Implementation)
Main Requirement: AI extracts factual claims from input text
POC Implementation:
- ✅ AKEL extracts claims using LLM
- ✅ Each claim includes original text reference
- ✅ Claims are identified as factual/non-factual
- ❌ No advanced claim parsing (added in POC2)
Acceptance Criteria:
- Extracts 3-5 claims from typical article
- Identifies factual vs non-factual claims
- Quality Gate 1 validates extraction
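A minimal TypeScript sketch of the extraction step described above, assuming the `@anthropic-ai/sdk` client; the `ExtractedClaim` shape, the prompt, and the model identifier are illustrative assumptions, not the actual AKEL interface.
```
import Anthropic from "@anthropic-ai/sdk";

// Illustrative shape for an extracted claim (not the actual AKEL schema).
interface ExtractedClaim {
  text: string;          // the claim as extracted
  originalText: string;  // reference back to the source passage
  type: "factual" | "non-factual";
}

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

async function extractClaims(article: string): Promise<ExtractedClaim[]> {
  const response = await client.messages.create({
    model: "claude-sonnet-4-5", // assumed model identifier for "Sonnet 4.5"
    max_tokens: 1024,
    messages: [{
      role: "user",
      content:
        "Extract 3-5 claims from the article below. Return only a JSON array of " +
        `{"text", "originalText", "type"} objects, where type is "factual" or "non-factual".\n\n` +
        article,
    }],
  });
  const block = response.content[0];
  return block.type === "text" ? (JSON.parse(block.text) as ExtractedClaim[]) : [];
}
```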
3.2 FR3: Multiple Scenarios (Full Implementation)
Main Requirement: Generate multiple interpretation scenarios for ambiguous claims
POC Implementation:
- ✅ AKEL generates 2-3 scenarios per claim
- ✅ Scenarios capture different interpretations
- ✅ Each scenario is evaluated separately
- ✅ Verdict considers all scenarios
Acceptance Criteria:
- Generates 2+ scenarios for ambiguous claims
- Scenarios are meaningfully different
- All scenarios are evaluated
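A small sketch of how per-scenario evaluation could be structured; the `Scenario` type and `evaluateAllScenarios` helper are assumed names, and the sequential loop mirrors POC1's single-threaded processing.
```
// Illustrative types; the field names are assumptions, not the AKEL schema.
interface Scenario {
  description: string; // one interpretation of the ambiguous claim
}

// Each scenario is evaluated separately and the verdict considers all of them.
// The loop is sequential on purpose: POC1 processing is single-threaded.
async function evaluateAllScenarios(
  scenarios: Scenario[],
  evaluateOne: (s: Scenario) => Promise<number>, // returns a probability in [0, 1]
): Promise<number[]> {
  const probabilities: number[] = [];
  for (const scenario of scenarios) {
    probabilities.push(await evaluateOne(scenario));
  }
  return probabilities;
}
```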
3.3 FR4: Analysis Summary (Basic Implementation)
Main Requirement: Provide user-friendly summary of analysis
POC Implementation:
- ✅ Simple text summary generated
- ❌ No rich formatting (added in Beta 0)
- ❌ No visual elements (added in Beta 0)
- ❌ No interactive features (added in Beta 0)
POC Format:
```
Claim: [extracted claim]
Scenarios: [list of scenarios]
Evidence: [supporting/opposing evidence]
Verdict: [probability with uncertainty]
```
3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation)
Main Requirements:
- FR5: Collect supporting and opposing evidence
- FR6: Evaluate evidence source reliability
POC Implementation:
- ✅ AKEL searches for evidence (web/knowledge base)
- ✅ Mandatory contradiction search (finds opposing evidence)
- ✅ Source reliability scoring
- ❌ No evidence deduplication (added in POC2)
- ❌ No advanced source verification (added in POC2)
Acceptance Criteria:
- Finds 2+ supporting evidence items
- Finds 1+ opposing evidence (if exists)
- Sources scored for reliability
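A possible shape for collected evidence, loosely mirroring the example output in section 5.2; the field names and the sufficiency check are assumptions drawn from the acceptance criteria above.
```
// Illustrative evidence shapes, loosely matching the example output in section 5.2.
interface EvidenceItem {
  source: string;      // e.g. "WHO Report 2023"
  url?: string;
  reliability: number; // 0-1 source reliability score
  excerpt: string;
}

interface EvidenceSet {
  supporting: EvidenceItem[];
  opposing: EvidenceItem[]; // contradiction search is mandatory, even if this stays empty
}

// Assumed sufficiency check based on the acceptance criteria above.
function evidenceIsSufficient(evidence: EvidenceSet): boolean {
  return evidence.supporting.length >= 2; // opposing evidence is required only if it exists
}
```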
3.5 FR7: Automated Verdicts (Full Implementation)
Main Requirement: AI computes verdicts with uncertainty quantification
POC Implementation:
- ✅ Probabilistic verdicts (0-100% probability)
- ✅ Uncertainty explicitly stated
- ✅ Reasoning chain provided
- ✅ Quality Gate 4 validates verdict confidence
POC Output:
```
Verdict: 70% likely true
Uncertainty: ±15% (moderate confidence)
Reasoning: Based on 3 high-quality sources...
Confidence Level: MEDIUM
```
Acceptance Criteria:
- Verdicts include probability (0-100%)
- Uncertainty explicitly quantified
- Reasoning chain explains verdict
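An illustrative verdict structure matching the POC output above; the thresholds used to map uncertainty to a confidence label are assumptions, not values taken from the Main Requirements.
```
// Illustrative verdict shape matching the POC output above.
interface Verdict {
  probability: number;  // 0-1, e.g. 0.70 for "70% likely true"
  uncertainty: number;  // e.g. 0.15 for "±15%"
  confidence: "LOW" | "MEDIUM" | "HIGH";
  reasoning: string;    // the reasoning chain shown to the reviewer
}

// One possible mapping from uncertainty to a confidence label (thresholds are assumptions).
function confidenceLabel(uncertainty: number): Verdict["confidence"] {
  if (uncertainty <= 0.1) return "HIGH";
  if (uncertainty <= 0.2) return "MEDIUM";
  return "LOW";
}
```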
3.6 NFR11: Quality Assurance Framework (Lite Version)
Main Requirement: Complete quality assurance with 7 quality gates
POC Implementation: 2 gates only
Quality Gate 1: Claim Validation
- ✅ Validates claim is factual and verifiable
- ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
- ✅ Provides clear rejection reason
Quality Gate 4: Verdict Confidence Assessment
- ✅ Validates ≥2 sources found
- ✅ Validates quality score ≥0.6
- ✅ Blocks low-confidence verdicts
- ✅ Provides clear rejection reason
Out of Scope (POC2+):
- ❌ Gate 2: Evidence Relevance
- ❌ Gate 3: Scenario Coherence
- ❌ Gate 5: Source Diversity
- ❌ Gate 6: Reasoning Validity
- ❌ Gate 7: Output Completeness
Rationale: Prove gate concept works. Add remaining gates in POC2 after validating approach.
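A sketch of how the two POC1 gates could be expressed in code; the `GateResult` type is an assumption, while the Gate 4 thresholds (≥2 sources, quality ≥0.6) come from the criteria above.
```
// Sketch of the two POC1 gates; the GateResult type is an assumption.
type GateResult = { status: "PASS" } | { status: "FAIL"; reason: string };

// Gate 1: block non-factual claims before any evidence is collected.
function gate1(claimType: "factual" | "opinion" | "prediction" | "ambiguous"): GateResult {
  if (claimType === "factual") return { status: "PASS" };
  return { status: "FAIL", reason: `Claim is ${claimType} and cannot be verified as stated` };
}

// Gate 4: block low-confidence verdicts (fewer than 2 sources, or quality score below 0.6).
function gate4(sourceCount: number, qualityScore: number): GateResult {
  if (sourceCount < 2) return { status: "FAIL", reason: "Fewer than 2 sources found" };
  if (qualityScore < 0.6) return { status: "FAIL", reason: "Quality score below 0.6" };
  return { status: "PASS" };
}
```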
3.7 NFR1-3: Performance, Scalability, Reliability (Basic)
Main Requirements:
- NFR1: Response time < 30 seconds
- NFR2: Handle 1000+ concurrent users
- NFR3: 99.9% uptime
POC Implementation:
- ⚠️ Response time monitored (not optimized)
- ⚠️ Single-threaded processing (no concurrency)
- ⚠️ Basic error handling (no advanced retry logic)
Rationale: POC proves functionality. Performance optimization happens in POC2.
POC Acceptance:
- Analysis completes (no timeout requirement)
- Errors don't crash system
- Basic logging in place
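A minimal sketch of the intended failure behaviour: errors are logged and contained rather than crashing the run, and step durations are recorded without optimization or retry logic. The `runSafely` wrapper is a hypothetical helper, not part of the POC codebase.
```
// Hypothetical helper showing the intended POC1 behaviour: log, contain, continue.
async function runSafely<T>(step: string, fn: () => Promise<T>): Promise<T | null> {
  const start = Date.now();
  try {
    const result = await fn();
    console.log(`[poc] ${step} completed in ${Date.now() - start} ms`); // response time monitored, not optimized
    return result;
  } catch (error) {
    console.error(`[poc] ${step} failed:`, error); // no retry logic in POC1
    return null; // the caller decides how to proceed; the run does not crash
  }
}
```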
4. What's NOT in POC Scope
4.1 User-Facing Features (Beta 0+)
Out of Scope:
- ❌ User accounts and authentication (FR8)
- ❌ User corrections system (FR9, FR45-46)
- ❌ Public publishing interface (FR10)
- ❌ Social sharing (FR11)
- ❌ Email notifications (FR12)
- ❌ API access (FR13)
Rationale: POC validates AI capabilities. User features added in Beta 0.
4.2 Advanced Features (V1.0+)
Out of Scope:
- ❌ IFCN compliance (FR47)
- ❌ ClaimReview schema (FR48)
- ❌ Archive.org integration (FR49)
- ❌ OSINT toolkit (FR50)
- ❌ Video verification (FR51)
- ❌ Deepfake detection (FR52)
- ❌ Cross-org sharing (FR53)
Rationale: Advanced features require proven platform. Added post-V1.0.
4.3 Production Requirements (POC2, Beta 0)
Out of Scope:
- ❌ Security controls (NFR4, NFR12)
- ❌ Code maintainability (NFR5)
- ❌ System monitoring (NFR13)
- ❌ Evidence deduplication
- ❌ Advanced source verification
- ❌ Full 7-gate quality framework
Rationale: POC proves concept. Production hardening happens in POC2 and Beta 0.
5. POC Output Specification
5.1 Required Output Elements
For each analyzed claim, POC must produce:
1. Claim
- Original text
- Classification (factual/non-factual/ambiguous)
- If non-factual: Clear reason why
2. Scenarios (if factual)
- 2-3 interpretation scenarios
- Each scenario clearly described
3. Evidence (if factual)
- Supporting evidence (2+ items)
- Opposing evidence (if exists)
- Source URLs and reliability scores
4. Verdict (if factual)
- Probability (0-100%)
- Uncertainty quantification
- Confidence level (LOW/MEDIUM/HIGH)
- Reasoning chain
5. Quality Status
- Which gates passed/failed
- If failed: Clear explanation why
5.2 Example POC Output
"claim": {
"text": "Switzerland has the highest life expectancy in Europe",
"type": "factual",
"gate1_status": "PASS"
},
"scenarios": [
"Switzerland's overall life expectancy is highest",
"Switzerland ranks highest for specific age groups"
],
"evidence": {
"supporting": [
{
"source": "WHO Report 2023",
"reliability": 0.95,
"excerpt": "Switzerland: 83.4 years average..."
}
],
"opposing": [
{
"source": "Eurostat 2024",
"reliability": 0.90,
"excerpt": "Spain leads at 83.5 years..."
}
]
},
"verdict": {
"probability": 0.65,
"uncertainty": 0.15,
"confidence": "MEDIUM",
"reasoning": "WHO and Eurostat show similar but conflicting data...",
"gate4_status": "PASS"
}
}
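For reference, a TypeScript interface mirroring the example above; the field names are taken from the JSON, while the optionality of scenarios, evidence, and verdict (absent when a gate blocks the claim) is an assumption.
```
// TypeScript mirror of the example output above; optional fields are an assumption
// (they are absent when Gate 1 blocks the claim as non-factual).
interface PocAnalysisOutput {
  claim: {
    text: string;
    type: "factual" | "non-factual" | "ambiguous";
    gate1_status: "PASS" | "FAIL";
  };
  scenarios?: string[];
  evidence?: {
    supporting: { source: string; reliability: number; excerpt: string }[];
    opposing: { source: string; reliability: number; excerpt: string }[];
  };
  verdict?: {
    probability: number;   // 0-1
    uncertainty: number;   // 0-1
    confidence: "LOW" | "MEDIUM" | "HIGH";
    reasoning: string;
    gate4_status: "PASS" | "FAIL";
  };
}
```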
6. Success Criteria
6.1 Functional Success
POC is successful if:
✅ FR1-FR7 Requirements Met:
1. Extracts 3-5 factual claims from test articles
2. Generates 2-3 scenarios per ambiguous claim
3. Finds supporting AND opposing evidence
4. Computes probabilistic verdicts with uncertainty
5. Provides clear reasoning chains
✅ Quality Gates Work:
1. Gate 1 blocks non-factual claims (100% block rate)
2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
3. Clear rejection reasons provided
✅ NFR11 Met:
1. Quality gates reduce hallucination rate
2. Blocked outputs have clear explanations
3. Quality metrics are logged
6.2 Quality Thresholds
Minimum Acceptable:
- ≥70% of test claims correctly classified (factual/non-factual)
- ≥60% of verdicts are reasonable (human evaluation)
- Gate 1 blocks 100% of non-factual claims
- Gate 4 blocks verdicts with <2 sources
Target:
- ≥80% claims correctly classified
- ≥75% verdicts are reasonable
- <10% false positives (blocking good claims)
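A sketch of how these thresholds could be checked against a labelled test set; `TestOutcome` and both metric functions are illustrative names, not part of the POC codebase.
```
// Illustrative evaluation helpers for a labelled test set.
interface TestOutcome {
  expectedFactual: boolean;   // human label
  classifiedFactual: boolean; // AKEL classification
  blockedByGate1: boolean;
}

// Minimum acceptable: >= 0.70, target: >= 0.80.
function classificationAccuracy(outcomes: TestOutcome[]): number {
  if (outcomes.length === 0) return 0;
  const correct = outcomes.filter(o => o.expectedFactual === o.classifiedFactual).length;
  return correct / outcomes.length;
}

// Target: < 0.10 (good claims wrongly blocked).
function falsePositiveRate(outcomes: TestOutcome[]): number {
  const goodClaims = outcomes.filter(o => o.expectedFactual);
  if (goodClaims.length === 0) return 0;
  return goodClaims.filter(o => o.blockedByGate1).length / goodClaims.length;
}
```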
6.3 POC Decision Gate
After POC1, we decide:
✅ PROCEED to POC2 if:
- Success criteria met
- Quality gates demonstrably improve output
- Core workflow is technically sound
- Clear path to production quality
⚠️ ITERATE POC1 if:
- Success criteria partially met
- Gates work but need tuning
- Core issues identified but fixable
❌ PIVOT APPROACH if:
- Success criteria not met
- Fundamental AI limitations discovered
- Quality gates insufficient
- Alternative approach needed
7. Test Cases
7.1 Happy Path
Test 1: Simple Factual Claim
- Input: "Paris is the capital of France"
- Expected: Factual, 1 scenario, verdict 95% true
Test 2: Ambiguous Claim
- Input: "Switzerland has the highest income in Europe"
- Expected: Factual, 2-3 scenarios, verdict with uncertainty
Test 3: Statistical Claim
- Input: "10% of people have condition X"
- Expected: Factual, evidence with numbers, probabilistic verdict
7.2 Edge Cases
Test 4: Opinion
- Input: "Paris is the best city"
- Expected: Non-factual (opinion), blocked by Gate 1
Test 5: Prediction
- Input: "Bitcoin will reach $100,000 next year"
- Expected: Non-factual (prediction), blocked by Gate 1
Test 6: Insufficient Evidence
- Input: Obscure factual claim with no sources
- Expected: Blocked by Gate 4 (<2 sources)
7.3 Quality Gate Tests
Test 7: Gate 1 Effectiveness
- Input: Mix of 10 factual + 10 non-factual claims
- Expected: Gate 1 blocks all 10 non-factual claims (100% block rate)
Test 8: Gate 4 Effectiveness
- Input: Claims with varying evidence availability
- Expected: Gate 4 blocks low-confidence verdicts
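The test cases above can be driven from a small data table; this sketch lists a few of them, with field names chosen purely for illustration.
```
// A few of the test cases above as data; the runner and field names are hypothetical.
interface PocTestCase {
  name: string;
  input: string;
  expectFactual: boolean;
  expectBlockedBy?: "gate1" | "gate4";
}

const pocTestCases: PocTestCase[] = [
  { name: "Simple factual claim", input: "Paris is the capital of France", expectFactual: true },
  { name: "Opinion", input: "Paris is the best city", expectFactual: false, expectBlockedBy: "gate1" },
  { name: "Prediction", input: "Bitcoin will reach $100,000 next year", expectFactual: false, expectBlockedBy: "gate1" },
];
```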
8. Technical Architecture (POC)
8.1 Simplified Architecture
POC Tech Stack:
- Frontend: Simple web interface (Next.js + TypeScript)
- Backend: Single API endpoint
- AI: Claude API (Sonnet 4.5)
- Storage: Local JSON files (no database)
- Deployment: Single server
Architecture Diagram: See POC1 Specification
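A minimal sketch of the single backend endpoint, assuming the Next.js App Router convention (a `route.ts` file exporting `POST`); the route path, request body shape, and `runAnalysis` placeholder are assumptions.
```
// app/api/analyze/route.ts -- hypothetical path; assumes the Next.js App Router convention.
export async function POST(request: Request): Promise<Response> {
  const body = await request.json();             // assumed request body: { articleText: string }
  const articleText = body?.articleText;
  if (typeof articleText !== "string" || articleText.length === 0) {
    return Response.json({ error: "articleText is required" }, { status: 400 });
  }
  const result = await runAnalysis(articleText); // the AKEL pipeline sketched in section 8.2
  return Response.json(result);
}

// Placeholder so the sketch is self-contained; the real pipeline replaces this.
async function runAnalysis(articleText: string): Promise<unknown> {
  return { receivedCharacters: articleText.length };
}
```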
8.2 AKEL Implementation
POC AKEL:
- Single-threaded processing
- Synchronous API calls
- No caching
- Basic error handling
- Console logging
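A sketch of the POC AKEL control flow implied by the list above: one synchronous pass per claim, no caching, console logging only. Every name in the `PipelineSteps` interface is an illustrative placeholder, not the actual AKEL module API.
```
// Every name below is an illustrative placeholder, not the actual AKEL module API.
interface Claim {
  text: string;
  type: "factual" | "non-factual" | "ambiguous";
}

interface PipelineSteps {
  extractClaims(article: string): Promise<Claim[]>;                // FR1, then Gate 1
  generateScenarios(claim: Claim): Promise<string[]>;              // FR3
  collectEvidence(claim: Claim): Promise<unknown>;                 // FR5-FR6
  computeVerdict(claim: Claim, scenarios: string[], evidence: unknown): Promise<unknown>; // FR7, then Gate 4
}

// Single pass, synchronous calls, console logging, local JSON output -- no concurrency, caching, or retries.
async function analyzeArticle(article: string, steps: PipelineSteps): Promise<void> {
  const claims = await steps.extractClaims(article);
  for (const claim of claims) {
    if (claim.type !== "factual") {
      console.log(`Blocked by Gate 1: ${claim.text}`);
      continue;
    }
    const scenarios = await steps.generateScenarios(claim);
    const evidence = await steps.collectEvidence(claim);
    const verdict = await steps.computeVerdict(claim, scenarios, evidence);
    console.log(JSON.stringify(verdict, null, 2));
  }
}
```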
Full AKEL (POC2+):
- Multi-threaded processing
- Async API calls
- Evidence caching
- Advanced error handling with retry
- Structured logging + monitoring
9. POC Philosophy
9.1 Core Principles
1. Prove Concept, Not Production
- POC validates AI can do the job
- Production quality comes in POC2 and Beta 0
- Focus on "does it work?" not "is it perfect?"
2. Implement Subset of Requirements
- POC covers FR1-7, NFR11 (lite)
- All other requirements deferred
- Clear mapping to Main Requirements
3. Quality Gates Validate Approach
- 2 gates prove the concept
- Remaining 5 gates added in POC2
- Gates must demonstrably improve quality
4. Iterate Based on Results
- POC results determine next steps
- Decision gate after POC1
- Flexibility to pivot if needed
9.2 Success = Clear Path Forward
POC succeeds if we can confidently answer:
✅ Technical Feasibility:
- Can AI extract claims reliably?
- Can AI find balanced evidence?
- Can AI compute reasonable verdicts?
✅ Quality Approach:
- Do quality gates improve output?
- Can we measure and track quality?
- Is the gate approach scalable?
✅ Production Path:
- Is the core architecture sound?
- What needs improvement for production?
- Is POC2 the right next step?
10. Related Pages
- Main Requirements - Full system requirements (this POC implements a subset)
- POC1 Specification (Detailed) - Detailed POC1 technical specs
- POC Summary - High-level POC overview
- Implementation Roadmap - POC1, POC2, Beta 0, V1.0 phases
- User Needs - What users need (drives requirements)
Document Owner: Technical Team
Review Frequency: After each POC iteration
Version History:
- v1.0 - Initial POC requirements
- v2.0 - Updated after specification cross-check
- v3.0 - Aligned with Main Requirements (FR/NFR IDs added)