POC Requirements (POC1 & POC2)

Version 1.1 by Robert Schaub on 2025/12/23 16:52


Status: ✅ Approved for Development  
Version: 3.0 (Aligned with Main Requirements)  
Goal: Prove that AI can extract claims and determine verdicts automatically without human intervention

Information

Core Philosophy: POC validates the Main Requirements through simplified implementation. All POC features map to formal FR/NFR requirements.

1. POC Overview

1.1 What POC Tests

Core Question:

 Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?

What we're proving:

  • AI can identify factual claims from text
  • AI can evaluate those claims with structured evidence
  • Quality gates can filter unreliable outputs
  • The core workflow is technically feasible

What we're NOT proving:

  • Production-ready reliability (that's POC2)
  • User-facing features (that's Beta 0)
  • Full IFCN compliance (that's V1.0)

1.2 Requirements Mapping

POC1 implements a subset of the full system requirements defined in Main Requirements.

Scope Summary:

  • In Scope: 8 requirements (7 FRs + 1 NFR)
  • Partial: 3 NFRs (simplified versions)
  • Out of Scope: 19 requirements (deferred to later phases)

2. POC1 Scope

Success

Authoritative Source for Phase Mapping: Requirements Roadmap Matrix

The Roadmap Matrix is the single source of truth for which requirements are implemented in which phases. This page provides POC1-specific implementation details only.

POC1 implements these formal requirements:

 | Formal Req | Implementation in POC1 | Notes |
 |------------|------------------------|-------|
 | FR4 | Analysis Summary | Basic format; quality metadata deferred to POC2 |
 | FR7 | Automated Verdicts | Full implementation with quality gates (NFR11) |
 | NFR11 | Quality Assurance Framework | 2 quality gates implemented (Gates 1 and 4) |

Note: FR11 (Audit Trail) and FR13 (In-Article Claim Highlighting) are deferred to Beta 0 for production readiness and user experience enhancement.

POC1 also implements these workflow components (detailed as FR1-FR6 in the implementation sections below):

  • Claim extraction (FR1)
  • Claim context (FR2)
  • Multiple scenarios (FR3)
  • Evidence collection (FR5)
  • Source quality assessment (FR6)
  • Time evolution tracking (FR8) - deferred to POC2
  • Audit trail (FR11) - deferred to Beta 0
  • In-article highlighting (FR13) - deferred to Beta 0

Partial implementations:

  • NFR1 (Explainability) - Basic only
  • NFR2 (Performance) - Functional but not optimized
  • NFR3 (Transparency) - Basic only

Detailed POC1 implementation specifications continue below...

3. POC Simplifications

3.1 FR1: Claim Extraction (Full Implementation)

Main Requirement: AI extracts factual claims from input text

POC Implementation:

  • ✅ AKEL extracts claims using LLM
  • ✅ Each claim includes original text reference
  • ✅ Claims are identified as factual/non-factual
  • ❌ No advanced claim parsing (added in POC2)

Acceptance Criteria:

  • Extracts 3-5 claims from a typical article
  • Identifies factual vs non-factual claims
  • Quality Gate 1 validates extraction
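To make the extraction contract concrete, here is a minimal sketch of Gate 1 validation in Python. The claim record shape and field names are assumptions for illustration; the actual AKEL output schema is not specified here.

```python
from dataclasses import dataclass
from typing import Optional

NON_FACTUAL_TYPES = {"opinion", "prediction", "ambiguous"}

@dataclass
class Claim:
    text: str        # reference to the original text
    claim_type: str  # "factual", "opinion", "prediction", "ambiguous"

def gate1_validate(claim: Claim) -> tuple[bool, Optional[str]]:
    """Return (passed, rejection_reason). Non-factual claims are
    blocked with a clear reason, as Gate 1 requires."""
    if not claim.text.strip():
        return False, "empty claim text"
    if claim.claim_type in NON_FACTUAL_TYPES:
        return False, f"non-factual claim ({claim.claim_type})"
    if claim.claim_type != "factual":
        return False, f"unknown claim type ({claim.claim_type})"
    return True, None
```

For example, `gate1_validate(Claim("Paris is the best city", "opinion"))` fails with the reason "non-factual claim (opinion)", satisfying the clear-rejection-reason criterion.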

3.2 FR3: Multiple Scenarios (Full Implementation)

Main Requirement: Generate multiple interpretation scenarios for ambiguous claims

POC Implementation:

  • ✅ AKEL generates 2-3 scenarios per claim
  • ✅ Scenarios capture different interpretations
  • ✅ Each scenario is evaluated separately
  • ✅ Verdict considers all scenarios

Acceptance Criteria:

  • Generates 2+ scenarios for ambiguous claims
  • Scenarios are meaningfully different
  • All scenarios are evaluated
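One way to check the "scenarios are meaningfully different" criterion automatically is a simple token-overlap test; the Jaccard heuristic and the 0.8 threshold below are illustrative assumptions, not part of the requirement.

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity of two scenario strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def scenarios_distinct(scenarios: list[str], max_sim: float = 0.8) -> bool:
    """True if no pair of scenarios is near-duplicate."""
    return all(
        jaccard(scenarios[i], scenarios[j]) < max_sim
        for i in range(len(scenarios))
        for j in range(i + 1, len(scenarios))
    )
```

A semantic-similarity model would be stricter, but even this cheap check catches the common failure mode of the LLM restating one interpretation twice.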

3.3 FR4: Analysis Summary (Basic Implementation)

Main Requirement: Provide user-friendly summary of analysis

POC Implementation:

  • ✅ Simple text summary generated
  • ❌ No rich formatting (added in Beta 0)
  • ❌ No visual elements (added in Beta 0)
  • ❌ No interactive features (added in Beta 0)

POC Format:
```
Claim: [extracted claim]
Scenarios: [list of scenarios]
Evidence: [supporting/opposing evidence]
Verdict: [probability with uncertainty]
```
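The plain-text format can be produced by a trivial renderer; the field labels come from the template, while the function name and argument shapes are assumptions.

```python
def render_summary(claim: str, scenarios: list[str],
                   evidence: list[str], verdict: str) -> str:
    """Render the POC analysis summary in the plain-text format above."""
    return "\n".join([
        f"Claim: {claim}",
        f"Scenarios: {'; '.join(scenarios)}",
        f"Evidence: {'; '.join(evidence)}",
        f"Verdict: {verdict}",
    ])
```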

3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation)

Main Requirements:

  • FR5: Collect supporting and opposing evidence
  • FR6: Evaluate evidence source reliability

POC Implementation:

  • ✅ AKEL searches for evidence (web/knowledge base)
  • ✅ Mandatory contradiction search (finds opposing evidence)
  • ✅ Source reliability scoring
  • ❌ No evidence deduplication (added in POC2)
  • ❌ No advanced source verification (added in POC2)

Acceptance Criteria:

  • Finds 2+ supporting evidence items
  • Finds 1+ opposing evidence (if exists)
  • Sources scored for reliability
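Source reliability scoring could be as simple as a curated domain table with a conservative default; the scores and domains below are entirely hypothetical (POC1 could equally ask the LLM for a score).

```python
from urllib.parse import urlparse

# Hypothetical reliability table for illustration only.
DOMAIN_SCORES = {
    "who.int": 0.95,
    "ec.europa.eu": 0.90,
}
DEFAULT_SCORE = 0.50  # unknown sources get a neutral score

def score_source(url: str) -> float:
    """Look up a source's reliability score by its domain."""
    domain = urlparse(url).netloc.removeprefix("www.")
    return DOMAIN_SCORES.get(domain, DEFAULT_SCORE)
```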

3.5 FR7: Automated Verdicts (Full Implementation)

Main Requirement: AI computes verdicts with uncertainty quantification

POC Implementation:

  • ✅ Probabilistic verdicts (0-100% confidence)
  • ✅ Uncertainty explicitly stated
  • ✅ Reasoning chain provided
  • ✅ Quality Gate 4 validates verdict confidence

POC Output:
```
Verdict: 70% likely true
Uncertainty: ±15% (moderate confidence)
Reasoning: Based on 3 high-quality sources...
Confidence Level: MEDIUM
```

Acceptance Criteria:

  • Verdicts include probability (0-100%)
  • Uncertainty explicitly quantified
  • Reasoning chain explains verdict
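To illustrate the output contract (probability, uncertainty, confidence level), here is a toy aggregation over stance-tagged evidence. This is a sketch under assumed inputs, not the POC's actual verdict logic, which presumably delegates to the LLM.

```python
def compute_verdict(evidence: list[tuple[int, float]]) -> dict:
    """evidence: list of (stance, reliability), stance +1 supporting
    or -1 opposing. Returns the verdict fields the POC requires."""
    support = sum(r for s, r in evidence if s > 0)
    oppose = sum(r for s, r in evidence if s < 0)
    total = support + oppose
    probability = support / total if total else 0.5
    # Fewer/weaker sources -> wider uncertainty band (pure heuristic).
    uncertainty = max(0.05, 0.5 / (1 + total))
    if uncertainty > 0.2:
        level = "LOW"
    elif uncertainty > 0.1:
        level = "MEDIUM"
    else:
        level = "HIGH"
    return {"probability": round(probability, 2),
            "uncertainty": round(uncertainty, 2),
            "confidence": level}
```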

3.6 NFR11: Quality Assurance Framework (Lite Version)

Main Requirement: Complete quality assurance with 7 quality gates

POC Implementation: 2 gates only

Quality Gate 1: Claim Validation

  • ✅ Validates claim is factual and verifiable
  • ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
  • ✅ Provides clear rejection reason

Quality Gate 4: Verdict Confidence Assessment

  • ✅ Validates ≥2 sources found
  • ✅ Validates quality score ≥0.6
  • ✅ Blocks low-confidence verdicts
  • ✅ Provides clear rejection reason

Out of Scope (POC2+):

  • ❌ Gate 2: Evidence Relevance
  • ❌ Gate 3: Scenario Coherence
  • ❌ Gate 5: Source Diversity
  • ❌ Gate 6: Reasoning Validity
  • ❌ Gate 7: Output Completeness

Rationale: Prove gate concept works. Add remaining gates in POC2 after validating approach.

3.7 NFR1-3: Performance, Scalability, Reliability (Basic)

Main Requirements:

  • NFR1: Response time < 30 seconds
  • NFR2: Handle 1000+ concurrent users
  • NFR3: 99.9% uptime

POC Implementation:

  • ⚠️ Response time monitored (not optimized)
  • ⚠️ Single-threaded processing (no concurrency)
  • ⚠️ Basic error handling (no advanced retry logic)

Rationale: POC proves functionality. Performance optimization happens in POC2.

POC Acceptance:

  • Analysis completes (no timeout requirement)
  • Errors don't crash system
  • Basic logging in place
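The "errors don't crash system" and "basic logging" acceptance points amount to a single guard wrapper; the function names here are illustrative.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("poc")

def analyze_safely(analyze, article: str):
    """POC acceptance: failures are logged, never raised."""
    try:
        return analyze(article)
    except Exception as exc:  # deliberately broad at POC stage
        log.error("analysis failed: %s", exc)
        return None
```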

4. What's NOT in POC Scope

4.1 User-Facing Features (Beta 0+)

Warning

Deferred to Beta 0:

Out of Scope:

  • ❌ User accounts and authentication (FR8)
  • ❌ User corrections system (FR9, FR45-46)
  • ❌ Public publishing interface (FR10)
  • ❌ Social sharing (FR11)
  • ❌ Email notifications (FR12)
  • ❌ API access (FR13)

Rationale: POC validates AI capabilities. User features added in Beta 0.

4.2 Advanced Features (V1.0+)

Out of Scope:

  • ❌ IFCN compliance (FR47)
  • ❌ ClaimReview schema (FR48)
  • ❌ Archive.org integration (FR49)
  • ❌ OSINT toolkit (FR50)
  • ❌ Video verification (FR51)
  • ❌ Deepfake detection (FR52)
  • ❌ Cross-org sharing (FR53)

Rationale: Advanced features require proven platform. Added post-V1.0.

4.3 Production Requirements (POC2, Beta 0)

Out of Scope:

  • ❌ Security controls (NFR4, NFR12)
  • ❌ Code maintainability (NFR5)
  • ❌ System monitoring (NFR13)
  • ❌ Evidence deduplication
  • ❌ Advanced source verification
  • ❌ Full 7-gate quality framework

Rationale: POC proves concept. Production hardening happens in POC2 and Beta 0.

5. POC Output Specification

5.1 Required Output Elements

For each analyzed claim, POC must produce:

1. Claim

  • Original text
  • Classification (factual/non-factual/ambiguous)
  • If non-factual: Clear reason why

2. Scenarios (if factual)

  • 2-3 interpretation scenarios
  • Each scenario clearly described

3. Evidence (if factual)

  • Supporting evidence (2+ items)
  • Opposing evidence (if exists)
  • Source URLs and reliability scores

4. Verdict (if factual)

  • Probability (0-100%)
  • Uncertainty quantification
  • Confidence level (LOW/MEDIUM/HIGH)
  • Reasoning chain

5. Quality Status

  • Which gates passed/failed
  • If failed: Clear explanation why

5.2 Example POC Output

```
{
  "claim": {
    "text": "Switzerland has the highest life expectancy in Europe",
    "type": "factual",
    "gate1_status": "PASS"
  },
  "scenarios": [
    "Switzerland's overall life expectancy is highest",
    "Switzerland ranks highest for specific age groups"
  ],
  "evidence": {
    "supporting": [
      {
        "source": "WHO Report 2023",
        "reliability": 0.95,
        "excerpt": "Switzerland: 83.4 years average..."
      }
    ],
    "opposing": [
      {
        "source": "Eurostat 2024",
        "reliability": 0.90,
        "excerpt": "Spain leads at 83.5 years..."
      }
    ]
  },
  "verdict": {
    "probability": 0.65,
    "uncertainty": 0.15,
    "confidence": "MEDIUM",
    "reasoning": "WHO and Eurostat show similar but conflicting data...",
    "gate4_status": "PASS"
  }
}
```

6. Success Criteria

Success

POC Success Definition: POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts with quality gates improving output quality.

6.1 Functional Success

POC is successful if:

FR1-FR7 Requirements Met:

  1. Extracts 3-5 factual claims from test articles
  2. Generates 2-3 scenarios per ambiguous claim
  3. Finds supporting AND opposing evidence
  4. Computes probabilistic verdicts with uncertainty
  5. Provides clear reasoning chains

Quality Gates Work:

  1. Gate 1 blocks non-factual claims (100% block rate)
  2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
  3. Clear rejection reasons provided

NFR11 Met:

  1. Quality gates reduce hallucination rate
  2. Blocked outputs have clear explanations
  3. Quality metrics are logged

6.2 Quality Thresholds

Minimum Acceptable:

  • ≥70% of test claims correctly classified (factual/non-factual)
  • ≥60% of verdicts are reasonable (human evaluation)
  • Gate 1 blocks 100% of non-factual claims
  • Gate 4 blocks verdicts with <2 sources

Target:

  • ≥80% claims correctly classified
  • ≥75% verdicts are reasonable
  • <10% false positives (blocking good claims)
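The thresholds above can be computed mechanically from labeled test runs. The record shape (true label, predicted label, blocked-by-gate flag) is an assumption for illustration.

```python
def quality_metrics(results: list[tuple[str, str, bool]]) -> dict:
    """results: list of (true_label, predicted_label, blocked),
    labels being 'factual' or 'non-factual'."""
    accuracy = sum(t == p for t, p, _ in results) / len(results)
    factual_blocked = [blocked for t, _, blocked in results if t == "factual"]
    # False positive = a good (factual) claim that a gate blocked.
    fp_rate = (sum(factual_blocked) / len(factual_blocked)
               if factual_blocked else 0.0)
    return {"accuracy": accuracy, "false_positive_rate": fp_rate}
```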

6.3 POC Decision Gate

After POC1, we decide:

✅ PROCEED to POC2 if:

  • Success criteria met
  • Quality gates demonstrably improve output
  • Core workflow is technically sound
  • Clear path to production quality

⚠️ ITERATE POC1 if:

  • Success criteria partially met
  • Gates work but need tuning
  • Core issues identified but fixable

❌ PIVOT APPROACH if:

  • Success criteria not met
  • Fundamental AI limitations discovered
  • Quality gates insufficient
  • Alternative approach needed

7. Test Cases

7.1 Happy Path

Test 1: Simple Factual Claim

  • Input: "Paris is the capital of France"
  • Expected: Factual, 1 scenario, verdict 95% true

Test 2: Ambiguous Claim

  • Input: "Switzerland has the highest income in Europe"
  • Expected: Factual, 2-3 scenarios, verdict with uncertainty

Test 3: Statistical Claim

  • Input: "10% of people have condition X"
  • Expected: Factual, evidence with numbers, probabilistic verdict

7.2 Edge Cases

Test 4: Opinion

  • Input: "Paris is the best city"
  • Expected: Non-factual (opinion), blocked by Gate 1

Test 5: Prediction

  • Input: "Bitcoin will reach $100,000 next year"
  • Expected: Non-factual (prediction), blocked by Gate 1

Test 6: Insufficient Evidence

  • Input: Obscure factual claim with no sources
  • Expected: Blocked by Gate 4 (<2 sources)

7.3 Quality Gate Tests

Test 7: Gate 1 Effectiveness

  • Input: Mix of 10 factual + 10 non-factual claims
  • Expected: Gate 1 blocks all 10 non-factual (100% precision)

Test 8: Gate 4 Effectiveness

  • Input: Claims with varying evidence availability
  • Expected: Gate 4 blocks low-confidence verdicts
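Tests 7 and 8 share one measurement: of the claims a gate should block, what fraction did it block? A minimal harness, assuming a gate is any callable returning (passed, reason), might look like this; `demo_gate` is a stand-in, not the real Gate 1.

```python
def gate_block_rate(cases: list[tuple[str, bool]], gate) -> float:
    """cases: list of (claim, should_block). Returns the fraction of
    should-block claims the gate actually blocked."""
    should_block = [claim for claim, flag in cases if flag]
    blocked = [claim for claim in should_block if not gate(claim)[0]]
    return len(blocked) / len(should_block) if should_block else 1.0

def demo_gate(claim: str):
    """Stand-in gate: blocks claims tagged as opinions."""
    if claim.startswith("OPINION:"):
        return False, "opinion"
    return True, None
```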

8. Technical Architecture (POC)

8.1 Simplified Architecture

POC Tech Stack:

  • Frontend: Simple web interface (Next.js + TypeScript)
  • Backend: Single API endpoint
  • AI: Claude API (Sonnet 4.5)
  • Database: Local JSON files (no database)
  • Deployment: Single server

Architecture Diagram: See POC1 Specification

8.2 AKEL Implementation

POC AKEL:

  • Single-threaded processing
  • Synchronous API calls
  • No caching
  • Basic error handling
  • Console logging
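The POC AKEL loop under these simplifications fits in a few lines; `extract` and `evaluate` below stand in for the real synchronous LLM calls.

```python
def run_akel(article: str, extract, evaluate) -> list:
    """Single-threaded, synchronous, no caching, console logging."""
    results = []
    for claim in extract(article):            # synchronous call
        print(f"[AKEL] evaluating: {claim}")  # console logging only
        results.append(evaluate(claim))       # one claim at a time
    return results
```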

Full AKEL (POC2+):

  • Multi-threaded processing
  • Async API calls
  • Evidence caching
  • Advanced error handling with retry
  • Structured logging + monitoring

9. POC Philosophy

Information

Important: POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases.

9.1 Core Principles

1. Prove Concept, Not Production

  • POC validates AI can do the job
  • Production quality comes in POC2 and Beta 0
  • Focus on "does it work?" not "is it perfect?"

2. Implement Subset of Requirements

  • POC covers FR1-7, NFR11 (lite)
  • All other requirements deferred
  • Clear mapping to Main Requirements

3. Quality Gates Validate Approach

  • 2 gates prove the concept
  • Remaining 5 gates added in POC2
  • Gates must demonstrably improve quality

4. Iterate Based on Results

  • POC results determine next steps
  • Decision gate after POC1
  • Flexibility to pivot if needed

9.2 Success

Clear Path Forward

POC succeeds if we can confidently answer:

Technical Feasibility:

  • Can AI extract claims reliably?
  • Can AI find balanced evidence?
  • Can AI compute reasonable verdicts?

Quality Approach:

  • Do quality gates improve output?
  • Can we measure and track quality?
  • Is the gate approach scalable?

Production Path:

  • Is the core architecture sound?
  • What needs improvement for production?
  • Is POC2 the right next step?

10. Related Pages

Document Owner: Technical Team  
Review Frequency: After each POC iteration  
Version History:

  • v1.0 - Initial POC requirements
  • v2.0 - Updated after specification cross-check
  • v3.0 - Aligned with Main Requirements (FR/NFR IDs added)