POC Requirements (POC1 & POC2)

Version 1.1 by Robert Schaub on 2025/12/23 16:52


Status: ✅ Approved for Development  
Version: 3.0 (Aligned with Main Requirements)  
Goal: Prove that AI can extract claims and determine verdicts automatically without human intervention

Information

Core Philosophy: POC validates the Main Requirements through simplified implementation. All POC features map to formal FR/NFR requirements.

1. POC Overview

1.1 What POC Tests

Core Question:

 Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?

What we're proving:

  • AI can identify factual claims from text
  • AI can evaluate those claims with structured evidence
  • Quality gates can filter unreliable outputs
  • The core workflow is technically feasible

What we're NOT proving:

  • Production-ready reliability (that's POC2)
  • User-facing features (that's Beta 0)
  • Full IFCN compliance (that's V1.0)

1.2 Requirements Mapping

POC1 implements a subset of the full system requirements defined in Main Requirements.

Scope Summary:

  • In Scope: 8 requirements (7 FRs + 1 NFR)
  • Partial: 3 NFRs (simplified versions)
  • Out of Scope: 19 requirements (deferred to later phases)

2. POC1 Scope

Success

Authoritative Source for Phase Mapping: Requirements Roadmap Matrix

The Roadmap Matrix is the single source of truth for which requirements are implemented in which phases. This page provides POC1-specific implementation details only.

POC1 implements these formal requirements:

 | Formal Req | Implementation in POC1 | Notes |
 |------------|------------------------|-------|
 | FR4 | Analysis Summary | Basic format; quality metadata deferred to POC2 |
 | FR7 | Automated Verdicts | Full implementation with quality gates (NFR11) |
 | NFR11 | Quality Assurance Framework | 2 quality gates implemented (Gates 1 and 4) |

Note: FR11 (Audit Trail) and FR13 (In-Article Claim Highlighting) are deferred to Beta 0 for production readiness and user experience enhancement.

POC1 also implements these workflow components (detailed as FR1-FR6 in the implementation sections below):

  • Claim extraction (FR1)
  • Claim context (FR2)
  • Multiple scenarios (FR3)
  • Evidence collection (FR5)
  • Source quality assessment (FR6)
  • Time evolution tracking (FR8) - deferred to POC2
  • Audit trail (FR11) - deferred to Beta 0
  • In-article highlighting (FR13) - deferred to Beta 0

Partial implementations:

  • NFR1 (Explainability) - Basic only
  • NFR2 (Performance) - Functional but not optimized
  • NFR3 (Transparency) - Basic only

Detailed POC1 implementation specifications continue below...

3. POC Simplifications

3.1 FR1: Claim Extraction (Full Implementation)

Main Requirement: AI extracts factual claims from input text

POC Implementation:

  • ✅ AKEL extracts claims using LLM
  • ✅ Each claim includes original text reference
  • ✅ Claims are identified as factual/non-factual
  • ❌ No advanced claim parsing (added in POC2)

Acceptance Criteria:

  • Extracts 3-5 claims from a typical article
  • Identifies factual vs non-factual claims
  • Quality Gate 1 validates extraction
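To make the extraction contract concrete, here is a minimal sketch of Gate 1 validation in Python. The claim record shape and field names are assumptions for illustration; the actual AKEL output schema is not specified here.

```python
from dataclasses import dataclass
from typing import Optional

NON_FACTUAL_TYPES = {"opinion", "prediction", "ambiguous"}

@dataclass
class Claim:
    text: str        # reference to the original text
    claim_type: str  # "factual", "opinion", "prediction", "ambiguous"

def gate1_validate(claim: Claim) -> tuple[bool, Optional[str]]:
    """Return (passed, rejection_reason). Non-factual claims are
    blocked with a clear reason, as Gate 1 requires."""
    if not claim.text.strip():
        return False, "empty claim text"
    if claim.claim_type in NON_FACTUAL_TYPES:
        return False, f"non-factual claim ({claim.claim_type})"
    if claim.claim_type != "factual":
        return False, f"unknown claim type ({claim.claim_type})"
    return True, None
```

For example, `gate1_validate(Claim("Paris is the best city", "opinion"))` fails with the reason "non-factual claim (opinion)", satisfying the clear-rejection-reason criterion.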

3.2 FR3: Multiple Scenarios (Full Implementation)

Main Requirement: Generate multiple interpretation scenarios for ambiguous claims

POC Implementation:

  • ✅ AKEL generates 2-3 scenarios per claim
  • ✅ Scenarios capture different interpretations
  • ✅ Each scenario is evaluated separately
  • ✅ Verdict considers all scenarios

Acceptance Criteria:

  • Generates 2+ scenarios for ambiguous claims
  • Scenarios are meaningfully different
  • All scenarios are evaluated
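One way to check the "scenarios are meaningfully different" criterion automatically is a simple token-overlap test; the Jaccard heuristic and the 0.8 threshold below are illustrative assumptions, not part of the requirement.

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity of two scenario strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def scenarios_distinct(scenarios: list[str], max_sim: float = 0.8) -> bool:
    """True if no pair of scenarios is near-duplicate."""
    return all(
        jaccard(scenarios[i], scenarios[j]) < max_sim
        for i in range(len(scenarios))
        for j in range(i + 1, len(scenarios))
    )
```

A semantic-similarity model would be stricter, but even this cheap check catches the common failure mode of the LLM restating one interpretation twice.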

3.3 FR4: Analysis Summary (Basic Implementation)

Main Requirement: Provide user-friendly summary of analysis

POC Implementation:

  • ✅ Simple text summary generated
  • ❌ No rich formatting (added in Beta 0)
  • ❌ No visual elements (added in Beta 0)
  • ❌ No interactive features (added in Beta 0)

POC Format:
```
Claim: [extracted claim]
Scenarios: [list of scenarios]
Evidence: [supporting/opposing evidence]
Verdict: [probability with uncertainty]
```
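The plain-text format can be produced by a trivial renderer; the field labels come from the template, while the function name and argument shapes are assumptions.

```python
def render_summary(claim: str, scenarios: list[str],
                   evidence: list[str], verdict: str) -> str:
    """Render the POC analysis summary in the plain-text format above."""
    return "\n".join([
        f"Claim: {claim}",
        f"Scenarios: {'; '.join(scenarios)}",
        f"Evidence: {'; '.join(evidence)}",
        f"Verdict: {verdict}",
    ])
```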

3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation)

Main Requirements:

  • FR5: Collect supporting and opposing evidence
  • FR6: Evaluate evidence source reliability

POC Implementation:

  • ✅ AKEL searches for evidence (web/knowledge base)
  • ✅ Mandatory contradiction search (finds opposing evidence)
  • ✅ Source reliability scoring
  • ❌ No evidence deduplication (added in POC2)
  • ❌ No advanced source verification (added in POC2)

Acceptance Criteria:

  • Finds 2+ supporting evidence items
  • Finds 1+ opposing evidence (if exists)
  • Sources scored for reliability
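Source reliability scoring could be as simple as a curated domain table with a conservative default; the scores and domains below are entirely hypothetical (POC1 could equally ask the LLM for a score).

```python
from urllib.parse import urlparse

# Hypothetical reliability table for illustration only.
DOMAIN_SCORES = {
    "who.int": 0.95,
    "ec.europa.eu": 0.90,
}
DEFAULT_SCORE = 0.50  # unknown sources get a neutral score

def score_source(url: str) -> float:
    """Look up a source's reliability score by its domain."""
    domain = urlparse(url).netloc.removeprefix("www.")
    return DOMAIN_SCORES.get(domain, DEFAULT_SCORE)
```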

3.5 FR7: Automated Verdicts (Full Implementation)

Main Requirement: AI computes verdicts with uncertainty quantification

POC Implementation:

  • ✅ Probabilistic verdicts (0-100% confidence)
  • ✅ Uncertainty explicitly stated
  • ✅ Reasoning chain provided
  • ✅ Quality Gate 4 validates verdict confidence

POC Output:
```
Verdict: 70% likely true
Uncertainty: ±15% (moderate confidence)
Reasoning: Based on 3 high-quality sources...
Confidence Level: MEDIUM
```

Acceptance Criteria:

  • Verdicts include probability (0-100%)
  • Uncertainty explicitly quantified
  • Reasoning chain explains verdict
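To illustrate the output contract (probability, uncertainty, confidence level), here is a toy aggregation over stance-tagged evidence. This is a sketch under assumed inputs, not the POC's actual verdict logic, which presumably delegates to the LLM.

```python
def compute_verdict(evidence: list[tuple[int, float]]) -> dict:
    """evidence: list of (stance, reliability), stance +1 supporting
    or -1 opposing. Returns the verdict fields the POC requires."""
    support = sum(r for s, r in evidence if s > 0)
    oppose = sum(r for s, r in evidence if s < 0)
    total = support + oppose
    probability = support / total if total else 0.5
    # Fewer/weaker sources -> wider uncertainty band (pure heuristic).
    uncertainty = max(0.05, 0.5 / (1 + total))
    if uncertainty > 0.2:
        level = "LOW"
    elif uncertainty > 0.1:
        level = "MEDIUM"
    else:
        level = "HIGH"
    return {"probability": round(probability, 2),
            "uncertainty": round(uncertainty, 2),
            "confidence": level}
```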

3.6 NFR11: Quality Assurance Framework (Lite Version)

Main Requirement: Complete quality assurance with 7 quality gates

POC Implementation: 2 gates only

Quality Gate 1: Claim Validation

  • ✅ Validates claim is factual and verifiable
  • ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
  • ✅ Provides clear rejection reason

Quality Gate 4: Verdict Confidence Assessment

  • ✅ Validates ≥2 sources found
  • ✅ Validates quality score ≥0.6
  • ✅ Blocks low-confidence verdicts
  • ✅ Provides clear rejection reason

Out of Scope (POC2+):

  • ❌ Gate 2: Evidence Relevance
  • ❌ Gate 3: Scenario Coherence
  • ❌ Gate 5: Source Diversity
  • ❌ Gate 6: Reasoning Validity
  • ❌ Gate 7: Output Completeness

Rationale: Prove gate concept works. Add remaining gates in POC2 after validating approach.

3.7 NFR1-3: Performance, Scalability, Reliability (Basic)

Main Requirements:

  • NFR1: Response time < 30 seconds
  • NFR2: Handle 1000+ concurrent users
  • NFR3: 99.9% uptime

POC Implementation:

  • ⚠️ Response time monitored (not optimized)
  • ⚠️ Single-threaded processing (no concurrency)
  • ⚠️ Basic error handling (no advanced retry logic)

Rationale: POC proves functionality. Performance optimization happens in POC2.

POC Acceptance:

  • Analysis completes (no timeout requirement)
  • Errors don't crash system
  • Basic logging in place
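The "errors don't crash system" and "basic logging" acceptance points amount to a single guard wrapper; the function names here are illustrative.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("poc")

def analyze_safely(analyze, article: str):
    """POC acceptance: failures are logged, never raised."""
    try:
        return analyze(article)
    except Exception as exc:  # deliberately broad at POC stage
        log.error("analysis failed: %s", exc)
        return None
```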

4. What's NOT in POC Scope

4.1 User-Facing Features (Beta 0+)

Warning

Deferred to Beta 0:

Out of Scope:

  • ❌ User accounts and authentication (FR8)
  • ❌ User corrections system (FR9, FR45-46)
  • ❌ Public publishing interface (FR10)
  • ❌ Social sharing (FR11)
  • ❌ Email notifications (FR12)
  • ❌ API access (FR13)

Rationale: POC validates AI capabilities. User features added in Beta 0.

4.2 Advanced Features (V1.0+)

Out of Scope:

  • ❌ IFCN compliance (FR47)
  • ❌ ClaimReview schema (FR48)
  • ❌ Archive.org integration (FR49)
  • ❌ OSINT toolkit (FR50)
  • ❌ Video verification (FR51)
  • ❌ Deepfake detection (FR52)
  • ❌ Cross-org sharing (FR53)

Rationale: Advanced features require proven platform. Added post-V1.0.

4.3 Production Requirements (POC2, Beta 0)

Out of Scope:

  • ❌ Security controls (NFR4, NFR12)
  • ❌ Code maintainability (NFR5)
  • ❌ System monitoring (NFR13)
  • ❌ Evidence deduplication
  • ❌ Advanced source verification
  • ❌ Full 7-gate quality framework

Rationale: POC proves concept. Production hardening happens in POC2 and Beta 0.

5. POC Output Specification

5.1 Required Output Elements

For each analyzed claim, POC must produce:

1. Claim

  • Original text
  • Classification (factual/non-factual/ambiguous)
  • If non-factual: Clear reason why

2. Scenarios (if factual)

  • 2-3 interpretation scenarios
  • Each scenario clearly described

3. Evidence (if factual)

  • Supporting evidence (2+ items)
  • Opposing evidence (if exists)
  • Source URLs and reliability scores

4. Verdict (if factual)

  • Probability (0-100%)
  • Uncertainty quantification
  • Confidence level (LOW/MEDIUM/HIGH)
  • Reasoning chain

5. Quality Status

  • Which gates passed/failed
  • If failed: Clear explanation why

5.2 Example POC Output

```
{
  "claim": {
    "text": "Switzerland has the highest life expectancy in Europe",
    "type": "factual",
    "gate1_status": "PASS"
  },
  "scenarios": [
    "Switzerland's overall life expectancy is highest",
    "Switzerland ranks highest for specific age groups"
  ],
  "evidence": {
    "supporting": [
      {
        "source": "WHO Report 2023",
        "reliability": 0.95,
        "excerpt": "Switzerland: 83.4 years average..."
      }
    ],
    "opposing": [
      {
        "source": "Eurostat 2024",
        "reliability": 0.90,
        "excerpt": "Spain leads at 83.5 years..."
      }
    ]
  },
  "verdict": {
    "probability": 0.65,
    "uncertainty": 0.15,
    "confidence": "MEDIUM",
    "reasoning": "WHO and Eurostat show similar but conflicting data...",
    "gate4_status": "PASS"
  }
}
```

6. Success Criteria

Success

POC Success Definition: POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts with quality gates improving output quality.

6.1 Functional Success

POC is successful if:

FR1-FR7 Requirements Met:

  1. Extracts 3-5 factual claims from test articles
  2. Generates 2-3 scenarios per ambiguous claim
  3. Finds supporting AND opposing evidence
  4. Computes probabilistic verdicts with uncertainty
  5. Provides clear reasoning chains

Quality Gates Work:

  1. Gate 1 blocks non-factual claims (100% block rate)
  2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
  3. Clear rejection reasons provided

NFR11 Met:

  1. Quality gates reduce hallucination rate
  2. Blocked outputs have clear explanations
  3. Quality metrics are logged

6.2 Quality Thresholds

Minimum Acceptable:

  • ≥70% of test claims correctly classified (factual/non-factual)
  • ≥60% of verdicts are reasonable (human evaluation)
  • Gate 1 blocks 100% of non-factual claims
  • Gate 4 blocks verdicts with <2 sources

Target:

  • ≥80% claims correctly classified
  • ≥75% verdicts are reasonable
  • <10% false positives (blocking good claims)
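The thresholds above can be computed mechanically from labeled test runs. The record shape (true label, predicted label, blocked-by-gate flag) is an assumption for illustration.

```python
def quality_metrics(results: list[tuple[str, str, bool]]) -> dict:
    """results: list of (true_label, predicted_label, blocked),
    labels being 'factual' or 'non-factual'."""
    accuracy = sum(t == p for t, p, _ in results) / len(results)
    factual_blocked = [blocked for t, _, blocked in results if t == "factual"]
    # False positive = a good (factual) claim that a gate blocked.
    fp_rate = (sum(factual_blocked) / len(factual_blocked)
               if factual_blocked else 0.0)
    return {"accuracy": accuracy, "false_positive_rate": fp_rate}
```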

6.3 POC Decision Gate

After POC1, we decide:

✅ PROCEED to POC2 if:

  • Success criteria met
  • Quality gates demonstrably improve output
  • Core workflow is technically sound
  • Clear path to production quality

⚠️ ITERATE POC1 if:

  • Success criteria partially met
  • Gates work but need tuning
  • Core issues identified but fixable

❌ PIVOT APPROACH if:

  • Success criteria not met
  • Fundamental AI limitations discovered
  • Quality gates insufficient
  • Alternative approach needed

7. Test Cases

7.1 Happy Path

Test 1: Simple Factual Claim

  • Input: "Paris is the capital of France"
  • Expected: Factual, 1 scenario, verdict 95% true

Test 2: Ambiguous Claim

  • Input: "Switzerland has the highest income in Europe"
  • Expected: Factual, 2-3 scenarios, verdict with uncertainty

Test 3: Statistical Claim

  • Input: "10% of people have condition X"
  • Expected: Factual, evidence with numbers, probabilistic verdict

7.2 Edge Cases

Test 4: Opinion

  • Input: "Paris is the best city"
  • Expected: Non-factual (opinion), blocked by Gate 1

Test 5: Prediction

  • Input: "Bitcoin will reach $100,000 next year"
  • Expected: Non-factual (prediction), blocked by Gate 1

Test 6: Insufficient Evidence

  • Input: Obscure factual claim with no sources
  • Expected: Blocked by Gate 4 (<2 sources)

7.3 Quality Gate Tests

Test 7: Gate 1 Effectiveness

  • Input: Mix of 10 factual + 10 non-factual claims
  • Expected: Gate 1 blocks all 10 non-factual (100% precision)

Test 8: Gate 4 Effectiveness

  • Input: Claims with varying evidence availability
  • Expected: Gate 4 blocks low-confidence verdicts
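Tests 7 and 8 share one measurement: of the claims a gate should block, what fraction did it block? A minimal harness, assuming a gate is any callable returning (passed, reason), might look like this; `demo_gate` is a stand-in, not the real Gate 1.

```python
def gate_block_rate(cases: list[tuple[str, bool]], gate) -> float:
    """cases: list of (claim, should_block). Returns the fraction of
    should-block claims the gate actually blocked."""
    should_block = [claim for claim, flag in cases if flag]
    blocked = [claim for claim in should_block if not gate(claim)[0]]
    return len(blocked) / len(should_block) if should_block else 1.0

def demo_gate(claim: str):
    """Stand-in gate: blocks claims tagged as opinions."""
    if claim.startswith("OPINION:"):
        return False, "opinion"
    return True, None
```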

8. Technical Architecture (POC)

8.1 Simplified Architecture

POC Tech Stack:

  • Frontend: Simple web interface (Next.js + TypeScript)
  • Backend: Single API endpoint
  • AI: Claude API (Sonnet 4.5)
  • Database: Local JSON files (no database)
  • Deployment: Single server

Architecture Diagram: See POC1 Specification

8.2 AKEL Implementation

POC AKEL:

  • Single-threaded processing
  • Synchronous API calls
  • No caching
  • Basic error handling
  • Console logging
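The POC AKEL loop under these simplifications fits in a few lines; `extract` and `evaluate` below stand in for the real synchronous LLM calls.

```python
def run_akel(article: str, extract, evaluate) -> list:
    """Single-threaded, synchronous, no caching, console logging."""
    results = []
    for claim in extract(article):            # synchronous call
        print(f"[AKEL] evaluating: {claim}")  # console logging only
        results.append(evaluate(claim))       # one claim at a time
    return results
```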

Full AKEL (POC2+):

  • Multi-threaded processing
  • Async API calls
  • Evidence caching
  • Advanced error handling with retry
  • Structured logging + monitoring

9. POC Philosophy

Information

Important: POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases.

9.1 Core Principles

1. Prove Concept, Not Production

  • POC validates AI can do the job
  • Production quality comes in POC2 and Beta 0
  • Focus on "does it work?" not "is it perfect?"

2. Implement Subset of Requirements

  • POC covers FR1-7, NFR11 (lite)
  • All other requirements deferred
  • Clear mapping to Main Requirements

3. Quality Gates Validate Approach

  • 2 gates prove the concept
  • Remaining 5 gates added in POC2
  • Gates must demonstrably improve quality

4. Iterate Based on Results

  • POC results determine next steps
  • Decision gate after POC1
  • Flexibility to pivot if needed

9.2 Success

Clear Path Forward

POC succeeds if we can confidently answer:

Technical Feasibility:

  • Can AI extract claims reliably?
  • Can AI find balanced evidence?
  • Can AI compute reasonable verdicts?

Quality Approach:

  • Do quality gates improve output?
  • Can we measure and track quality?
  • Is the gate approach scalable?

Production Path:

  • Is the core architecture sound?
  • What needs improvement for production?
  • Is POC2 the right next step?

10. Related Pages

Document Owner: Technical Team  
Review Frequency: After each POC iteration  
Version History:

  • v1.0 - Initial POC requirements
  • v2.0 - Updated after specification cross-check
  • v3.0 - Aligned with Main Requirements (FR/NFR IDs added)