Changes for page POC Requirements

Last modified by Robert Schaub on 2026/02/08 08:25

From 1.2 to 1.3

From version 1.1

edited by Robert Schaub
on 2025/12/22 19:12

Change comment: Imported from XAR

To version 1.2

edited by Robert Schaub
on 2026/01/20 20:23

Change comment: Renamed back-links.

Summary

Page properties (1 modified, 0 added, 0 removed)

Details

Page properties

Content

@@ -14,9 +14,11 @@
  === 1.1 What POC Tests ===
  **Core Question:**
++
  > Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?
  **What we're proving:**
++
  * AI can identify factual claims from text
  * AI can evaluate those claims with structured evidence
  * Quality gates can filter unreliable outputs
@@ -23,6 +23,7 @@
  * The core workflow is technically feasible
  **What we're NOT proving:**
++
  * Production-ready reliability (that's POC2)
  * User-facing features (that's Beta 0)
  * Full IFCN compliance (that's V1.0)
@@ -32,11 +32,11 @@
  POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].
  **Scope Summary:**
++
  * **In Scope:** 8 requirements (7 FRs + 1 NFR)
  * **Partial:** 3 NFRs (simplified versions)
  * **Out of Scope:** 19 requirements (deferred to later phases)
--
  == 2. Requirements Scope Matrix ==
  {{success}}
@@ -44,7 +44,7 @@
  {{/success}}
  |=Requirement|=POC1 Status|=Implementation Level|=Notes
--|**CORE WORKFLOW**||||
++|**CORE WORKFLOW**||||\\
  |FR1: Claim Extraction|✅ **In Scope**|Full|AKEL extracts claims from text
  |FR2: Claim Context|✅ **In Scope**|Basic|Context preserved with claim
  |FR3: Multiple Scenarios|✅ **In Scope**|Full|AKEL generates interpretation scenarios
@@ -52,12 +52,12 @@
  |FR5: Evidence Collection|✅ **In Scope**|Full|AKEL searches for evidence
  |FR6: Evidence Evaluation|✅ **In Scope**|Full|AKEL evaluates source reliability
  |FR7: Automated Verdicts|✅ **In Scope**|Full|AKEL computes verdicts with uncertainty
--|**QUALITY & RELIABILITY**||||
++|**QUALITY & RELIABILITY**||||\\
  |NFR11: Quality Assurance|✅ **In Scope**|**Lite**|**2 gates only** (Gate 1 & 4)
  |NFR1: Performance|⚠️ **Partial**|Basic|Response time monitored, not optimized
  |NFR2: Scalability|⚠️ **Partial**|Single-thread|No concurrent processing
  |NFR3: Reliability|⚠️ **Partial**|Basic|Error handling, no retry logic
--|**DEFERRED TO LATER**||||
++|**DEFERRED TO LATER**||||\\
  |FR8-FR13|❌ Out of Scope|N/A|User accounts, corrections, publishing
  |FR44-FR53|❌ Out of Scope|N/A|Advanced features (V1.0+)
  |NFR4: Security|❌ Out of Scope|N/A|POC2
@@ -65,7 +65,6 @@
  |NFR12: Security Controls|❌ Out of Scope|N/A|Beta 0
  |NFR13: Monitoring|❌ Out of Scope|N/A|POC2
--
  == 3. POC Simplifications ==
  === 3.1 FR1: Claim Extraction (Full Implementation) ===
@@ -73,6 +73,7 @@
  **Main Requirement:** AI extracts factual claims from input text
  **POC Implementation:**
++
  * ✅ AKEL extracts claims using LLM
  * ✅ Each claim includes original text reference
  * ✅ Claims are identified as factual/non-factual
@@ -79,16 +79,17 @@
  * ❌ No advanced claim parsing (added in POC2)
  **Acceptance Criteria:**
++
  * Extracts 3-5 claims from a typical article
  * Identifies factual vs non-factual claims
  * Quality Gate 1 validates extraction
--
  === 3.2 FR3: Multiple Scenarios (Full Implementation) ===
  **Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims
  **POC Implementation:**
++
  * ✅ AKEL generates 2-3 scenarios per claim
  * ✅ Scenarios capture different interpretations
  * ✅ Each scenario is evaluated separately
@@ -95,16 +95,17 @@
  * ✅ Verdict considers all scenarios
  **Acceptance Criteria:**
++
  * Generates 2+ scenarios for ambiguous claims
  * Scenarios are meaningfully different
  * All scenarios are evaluated
--
  === 3.3 FR4: Analysis Summary (Basic Implementation) ===
  **Main Requirement:** Provide user-friendly summary of analysis
  **POC Implementation:**
++
  * ✅ Simple text summary generated
  * ❌ No rich formatting (added in Beta 0)
  * ❌ No visual elements (added in Beta 0)
@@ -122,10 +122,12 @@
  === 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===
  **Main Requirements:**
++
  * FR5: Collect supporting and opposing evidence
  * FR6: Evaluate evidence source reliability
  **POC Implementation:**
++
  * ✅ AKEL searches for evidence (web/knowledge base)
  * ✅ **Mandatory contradiction search** (finds opposing evidence)
  * ✅ Source reliability scoring
@@ -133,16 +133,17 @@
  * ❌ No advanced source verification (added in POC2)
  **Acceptance Criteria:**
++
  * Finds 2+ supporting evidence items
  * Finds 1+ opposing evidence items (if any exist)
  * Sources scored for reliability
--
  === 3.5 FR7: Automated Verdicts (Full Implementation) ===
  **Main Requirement:** AI computes verdicts with uncertainty quantification
  **POC Implementation:**
++
  * ✅ Probabilistic verdicts (0-100% confidence)
  * ✅ Uncertainty explicitly stated
  * ✅ Reasoning chain provided
@@ -157,11 +157,11 @@
  ```
  **Acceptance Criteria:**
++
  * Verdicts include probability (0-100%)
  * Uncertainty explicitly quantified
  * Reasoning chain explains verdict
--
  === 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===
  **Main Requirement:** Complete quality assurance with 7 quality gates
@@ -169,11 +169,13 @@
  **POC Implementation:** **2 gates only**
  **Quality Gate 1: Claim Validation**
++
  * ✅ Validates claim is factual and verifiable
  * ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
  * ✅ Provides clear rejection reason
  **Quality Gate 4: Verdict Confidence Assessment**
++
  * ✅ Validates ≥2 sources found
  * ✅ Validates quality score ≥0.6
  * ✅ Blocks low-confidence verdicts
@@ -180,6 +180,7 @@
  * ✅ Provides clear rejection reason
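
The two gates listed above could be expressed as small predicate functions. The following is a minimal TypeScript sketch, not the AKEL implementation: the names `ClaimClass`, `GateResult`, `gate1ClaimValidation` and `gate4VerdictConfidence` are hypothetical, as is the idea of passing a source count and a quality score as plain numbers. Only the blocked claim classes and the two thresholds (at least 2 sources, quality score of at least 0.6) come from this page.

```typescript
// Hypothetical types: names and shapes are illustrative, not taken from AKEL.
type ClaimClass = "factual" | "opinion" | "prediction" | "ambiguous";

interface GateResult {
  passed: boolean;
  reason?: string; // set only when the gate blocks
}

// Gate 1: let factual claims through, block everything else with a reason.
function gate1ClaimValidation(classification: ClaimClass): GateResult {
  if (classification === "factual") {
    return { passed: true };
  }
  return {
    passed: false,
    reason: `Claim classified as ${classification}, not factual`,
  };
}

// Thresholds stated on this page for Gate 4.
const MIN_SOURCES = 2;
const MIN_QUALITY_SCORE = 0.6;

// Gate 4: block verdicts backed by too few sources or too low a quality score.
function gate4VerdictConfidence(sourceCount: number, qualityScore: number): GateResult {
  if (sourceCount < MIN_SOURCES) {
    return {
      passed: false,
      reason: `Only ${sourceCount} source(s) found, at least ${MIN_SOURCES} required`,
    };
  }
  if (qualityScore < MIN_QUALITY_SCORE) {
    return {
      passed: false,
      reason: `Quality score ${qualityScore} is below ${MIN_QUALITY_SCORE}`,
    };
  }
  return { passed: true };
}
```

Returning a result object rather than a bare boolean keeps the rejection reason next to the decision, which is what both gates are required to provide.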
  **Out of Scope (POC2+):**
++
  * ❌ Gate 2: Evidence Relevance
  * ❌ Gate 3: Scenario Coherence
  * ❌ Gate 5: Source Diversity
@@ -192,11 +192,13 @@
  === 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===
  **Main Requirements:**
++
  * NFR1: Response time < 30 seconds
  * NFR2: Handle 1000+ concurrent users
  * NFR3: 99.9% uptime
  **POC Implementation:**
++
  * ⚠️ **Response time monitored** (not optimized)
  * ⚠️ **Single-threaded processing** (no concurrency)
  * ⚠️ **Basic error handling** (no advanced retry logic)
@@ -204,11 +204,11 @@
  **Rationale:** POC proves functionality. Performance optimization happens in POC2.
  **POC Acceptance:**
++
  * Analysis completes (no timeout requirement)
  * Errors don't crash system
  * Basic logging in place
--
  == 4. What's NOT in POC Scope ==
  === 4.1 User-Facing Features (Beta 0+) ===
@@ -218,6 +218,7 @@
  {{/warning}}
  **Out of Scope:**
++
  * ❌ User accounts and authentication (FR8)
  * ❌ User corrections system (FR9, FR45-46)
  * ❌ Public publishing interface (FR10)
@@ -231,6 +231,7 @@
  === 4.2 Advanced Features (V1.0+) ===
  **Out of Scope:**
++
  * ❌ IFCN compliance (FR47)
  * ❌ ClaimReview schema (FR48)
  * ❌ Archive.org integration (FR49)
@@ -245,6 +245,7 @@
  === 4.3 Production Requirements (POC2, Beta 0) ===
  **Out of Scope:**
++
  * ❌ Security controls (NFR4, NFR12)
  * ❌ Code maintainability (NFR5)
  * ❌ System monitoring (NFR13)
@@ -261,21 +261,26 @@
  For each analyzed claim, POC must produce:
--**1. Claim**
++*
++**
++**1. Claim
  * Original text
  * Classification (factual/non-factual/ambiguous)
  * If non-factual: Clear reason why
  **2. Scenarios** (if factual)
++
  * 2-3 interpretation scenarios
  * Each scenario clearly described
  **3. Evidence** (if factual)
++
  * Supporting evidence (2+ items)
  * Opposing evidence (if any exists)
  * Source URLs and reliability scores
  **4. Verdict** (if factual)
++
  * Probability (0-100%)
  * Uncertainty quantification
  * Confidence level (LOW/MEDIUM/HIGH)
@@ -282,10 +282,10 @@
  * Reasoning chain
  **5. Quality Status**
++
  * Which gates passed/failed
  * If failed: Clear explanation why
--
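
The five output parts listed above map naturally onto a typed record. The sketch below is one possible TypeScript shape with a small completeness check. All type and field names are assumptions made for illustration and may differ from the JSON example in 5.2. The scale of `reliability` and the unit of `uncertainty` are not specified on this page and are left open.

```typescript
// Hypothetical output shape for one analyzed claim; field names are assumptions.
type Classification = "factual" | "non-factual" | "ambiguous";
type ConfidenceLevel = "LOW" | "MEDIUM" | "HIGH";

interface EvidenceItem {
  url: string;
  reliability: number; // scale not specified on this page
  stance: "supporting" | "opposing";
}

interface Verdict {
  probability: number; // 0-100 (%)
  uncertainty: number; // unit not specified on this page
  confidence: ConfidenceLevel;
  reasoning: string[];
}

interface GateStatus {
  gate: string;
  passed: boolean;
  explanation?: string; // expected when passed is false
}

interface ClaimAnalysis {
  originalText: string;
  classification: Classification;
  nonFactualReason?: string;
  scenarios?: string[];
  evidence?: EvidenceItem[];
  verdict?: Verdict;
  gates: GateStatus[];
}

// Returns human-readable problems; an empty array means the record is complete.
function findOutputProblems(a: ClaimAnalysis): string[] {
  const problems: string[] = [];
  if (a.classification === "non-factual" && !a.nonFactualReason) {
    problems.push("Non-factual claim has no reason");
  }
  if (a.classification === "factual") {
    // The page asks for 2-3 scenarios; this check only requires at least one.
    if (!a.scenarios || a.scenarios.length === 0) {
      problems.push("No scenarios");
    }
    const supporting = (a.evidence || []).filter((e) => e.stance === "supporting");
    if (supporting.length < 2) {
      problems.push("Fewer than 2 supporting evidence items");
    }
    if (!a.verdict) {
      problems.push("No verdict");
    } else if (a.verdict.probability < 0 || a.verdict.probability > 100) {
      problems.push("Verdict probability outside 0-100");
    }
  }
  for (const g of a.gates) {
    if (!g.passed && !g.explanation) {
      problems.push(`Gate "${g.gate}" failed without an explanation`);
    }
  }
  return problems;
}
```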
  === 5.2 Example POC Output ===
  {{code language="json"}}
@@ -337,6 +337,7 @@
  POC is successful if:
  ✅ **FR1-FR7 Requirements Met:**
++
  1. Extracts 3-5 factual claims from test articles
  1. Generates 2-3 scenarios per ambiguous claim
  1. Finds supporting AND opposing evidence
@@ -344,19 +344,21 @@
  1. Provides clear reasoning chains
  ✅ **Quality Gates Work:**
++
  1. Gate 1 blocks non-factual claims (100% block rate)
  1. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
  1. Clear rejection reasons provided
  ✅ **NFR11 Met:**
++
  1. Quality gates reduce hallucination rate
  1. Blocked outputs have clear explanations
  1. Quality metrics are logged
--
  === 6.2 Quality Thresholds ===
  **Minimum Acceptable:**
++
  * ≥70% of test claims correctly classified (factual/non-factual)
  * ≥60% of verdicts are reasonable (human evaluation)
  * Gate 1 blocks 100% of non-factual claims
@@ -363,16 +363,17 @@
  * Gate 4 blocks verdicts with <2 sources
  **Target:**
++
  * ≥80% claims correctly classified
  * ≥75% verdicts are reasonable
  * <10% false positives (blocking good claims)
--
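
The thresholds in 6.2 can be checked mechanically once the test results are summarized as rates. The sketch below is a hedged illustration in TypeScript: `PocMetrics`, its field names and `assessQuality` are hypothetical, rates are assumed to be fractions between 0 and 1, and it assumes that reaching the target level also requires meeting every minimum criterion.

```typescript
// Hypothetical summary of a POC test run; all rates are fractions in [0, 1].
interface PocMetrics {
  classificationAccuracy: number; // claims correctly classified as factual/non-factual
  reasonableVerdictRate: number; // verdicts judged reasonable by human evaluation
  gate1BlockRate: number; // share of non-factual claims blocked by Gate 1
  gate4BlocksUnderSourced: boolean; // Gate 4 blocked every verdict with <2 sources
  falsePositiveRate: number; // share of good claims wrongly blocked
}

type QualityLevel = "target" | "minimum" | "below-minimum";

function assessQuality(m: PocMetrics): QualityLevel {
  // Minimum acceptable thresholds from 6.2.
  const meetsMinimum =
    m.classificationAccuracy >= 0.7 &&
    m.reasonableVerdictRate >= 0.6 &&
    m.gate1BlockRate === 1 &&
    m.gate4BlocksUnderSourced;
  if (!meetsMinimum) {
    return "below-minimum";
  }
  // Target thresholds from 6.2, checked on top of the minimum (an assumption).
  const meetsTarget =
    m.classificationAccuracy >= 0.8 &&
    m.reasonableVerdictRate >= 0.75 &&
    m.falsePositiveRate < 0.1;
  return meetsTarget ? "target" : "minimum";
}
```

For instance, a run with 72% correct classification, 65% reasonable verdicts and both gates blocking as required would be rated `"minimum"` regardless of its false positive rate, because the page only states a false positive limit for the target level.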
  === 6.3 POC Decision Gate ===
  **After POC1, we decide:**
  **✅ PROCEED to POC2** if:
++
  * Success criteria met
  * Quality gates demonstrably improve output
  * Core workflow is technically sound
@@ -379,65 +379,72 @@
  * Clear path to production quality
  **⚠️ ITERATE POC1** if:
++
  * Success criteria partially met
  * Gates work but need tuning
  * Core issues identified but fixable
  **❌ PIVOT APPROACH** if:
++
  * Success criteria not met
  * Fundamental AI limitations discovered
  * Quality gates insufficient
  * Alternative approach needed
--
  == 7. Test Cases ==
  === 7.1 Happy Path ===
  **Test 1: Simple Factual Claim**
++
  * Input: "Paris is the capital of France"
--* Expected: Factual, 1 scenario, verdict ~95% true
++* Expected: Factual, 1 scenario, verdict 95% true
  **Test 2: Ambiguous Claim**
++
  * Input: "Switzerland has the highest income in Europe"
  * Expected: Factual, 2-3 scenarios, verdict with uncertainty
  **Test 3: Statistical Claim**
++
  * Input: "10% of people have condition X"
  * Expected: Factual, evidence with numbers, probabilistic verdict
--
  === 7.2 Edge Cases ===
  **Test 4: Opinion**
++
  * Input: "Paris is the best city"
  * Expected: Non-factual (opinion), blocked by Gate 1
  **Test 5: Prediction**
++
  * Input: "Bitcoin will reach $100,000 next year"
  * Expected: Non-factual (prediction), blocked by Gate 1
  **Test 6: Insufficient Evidence**
++
  * Input: Obscure factual claim with no sources
  * Expected: Blocked by Gate 4 (<2 sources)
--
  === 7.3 Quality Gate Tests ===
  **Test 7: Gate 1 Effectiveness**
++
  * Input: Mix of 10 factual + 10 non-factual claims
  * Expected: Gate 1 blocks all 10 non-factual claims (100% block rate)
  **Test 8: Gate 4 Effectiveness**
++
  * Input: Claims with varying evidence availability
  * Expected: Gate 4 blocks low-confidence verdicts
--
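
For the gate tests in 7.3, two numbers are of interest: the share of non-factual claims that Gate 1 blocks, and the share of factual claims it blocks by mistake. A minimal TypeScript sketch of that calculation follows. `GateTestCase`, `ratio` and `gate1Metrics` are hypothetical names, and returning 0 for an empty group is a convention chosen here, not something this page prescribes.

```typescript
// Hypothetical record of one labeled test claim and what Gate 1 did with it.
interface GateTestCase {
  isFactual: boolean; // human label
  blocked: boolean; // true if Gate 1 blocked the claim
}

// Safe division: returns 0 when the group is empty (a convention of this sketch).
function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

function gate1Metrics(cases: GateTestCase[]): { blockRate: number; falsePositiveRate: number } {
  const nonFactual = cases.filter((c) => !c.isFactual);
  const factual = cases.filter((c) => c.isFactual);
  return {
    // Share of non-factual claims that were blocked (should be 1 for Test 7).
    blockRate: ratio(nonFactual.filter((c) => c.blocked).length, nonFactual.length),
    // Share of factual claims that were blocked although they should pass.
    falsePositiveRate: ratio(factual.filter((c) => c.blocked).length, factual.length),
  };
}
```

With the Test 7 mix of 10 factual and 10 non-factual claims, a run in which all 10 non-factual claims and 1 factual claim are blocked gives a block rate of 1 and a false positive rate of 0.1.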
  == 8. Technical Architecture (POC) ==
  === 8.1 Simplified Architecture ===
  **POC Tech Stack:**
++
  * **Frontend:** Simple web interface (Next.js + TypeScript)
  * **Backend:** Single API endpoint
  * **AI:** Claude API (Sonnet 4.5)
@@ -450,6 +450,7 @@
  === 8.2 AKEL Implementation ===
  **POC AKEL:**
++
  * Single-threaded processing
  * Synchronous API calls
  * No caching
@@ -457,6 +457,7 @@
  * Console logging
  **Full AKEL (POC2+):**
++
  * Multi-threaded processing
  * Async API calls
  * Evidence caching
@@ -463,7 +463,6 @@
  * Advanced error handling with retry
  * Structured logging + monitoring
--
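
The POC AKEL characteristics listed in 8.2 (single-threaded, synchronous, no caching, basic error handling) amount to a plain sequential loop. The TypeScript sketch below illustrates that control flow only. The `Stages` interface and every function in it are hypothetical stand-ins for the real LLM and search calls, the scenario step is omitted for brevity, and Gate 4 is reduced to its source-count check.

```typescript
// Hypothetical stage signatures; real stages would call an LLM or a search API.
interface Stages {
  extractClaims: (text: string) => string[];
  isFactual: (claim: string) => boolean; // stands in for Gate 1
  collectEvidence: (claim: string) => string[]; // returns source URLs
  computeVerdict: (claim: string, sources: string[]) => number; // probability 0-100
  log: (message: string) => void;
}

interface ClaimOutcome {
  claim: string;
  status: "verdict" | "blocked" | "error";
  detail: string;
}

function runPocPipeline(text: string, s: Stages): ClaimOutcome[] {
  const outcomes: ClaimOutcome[] = [];
  // Claims are processed one at a time: no concurrency, no caching, no retry.
  for (const claim of s.extractClaims(text)) {
    try {
      if (!s.isFactual(claim)) {
        outcomes.push({ claim, status: "blocked", detail: "Gate 1: not a factual claim" });
        continue;
      }
      const sources = s.collectEvidence(claim);
      if (sources.length < 2) {
        outcomes.push({ claim, status: "blocked", detail: "Gate 4: fewer than 2 sources" });
        continue;
      }
      const probability = s.computeVerdict(claim, sources);
      outcomes.push({ claim, status: "verdict", detail: `${probability}% likely true` });
    } catch (err) {
      // Basic error handling: record the failure and continue with the next claim.
      const message = err instanceof Error ? err.message : String(err);
      s.log(`Error while analyzing "${claim}": ${message}`);
      outcomes.push({ claim, status: "error", detail: message });
    }
  }
  return outcomes;
}
```

Passing the stages in as functions keeps the loop testable with stubs. An error thrown by `extractClaims` itself is not caught in this sketch and would propagate to the caller.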
  == 9. POC Philosophy ==
  {{info}}
@@ -472,60 +472,67 @@
  === 9.1 Core Principles ===
--**1. Prove Concept, Not Production**
++*
++**
++**1. Prove Concept, Not Production
  * POC validates AI can do the job
  * Production quality comes in POC2 and Beta 0
  * Focus on "does it work?" not "is it perfect?"
  **2. Implement Subset of Requirements**
++
  * POC covers FR1-7, NFR11 (lite)
  * All other requirements deferred
  * Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]
  **3. Quality Gates Validate Approach**
++
  * 2 gates prove the concept
  * Remaining 5 gates added in POC2
  * Gates must demonstrably improve quality
  **4. Iterate Based on Results**
++
  * POC results determine next steps
  * Decision gate after POC1
  * Flexibility to pivot if needed
++=== 9.2 Success ===
--=== 9.2 Success = Clear Path Forward ===
++ Clear Path Forward ===
  POC succeeds if we can confidently answer:
  ✅ **Technical Feasibility:**
++
  * Can AI extract claims reliably?
  * Can AI find balanced evidence?
  * Can AI compute reasonable verdicts?
  ✅ **Quality Approach:**
++
  * Do quality gates improve output?
  * Can we measure and track quality?
  * Is the gate approach scalable?
  ✅ **Production Path:**
++
  * Is the core architecture sound?
  * What needs improvement for production?
  * Is POC2 the right next step?
--
  == 10. Related Pages ==
  * **[[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]** - Full system requirements (this POC implements a subset)
  * **[[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]]** - Detailed POC1 technical specs
  * **[[POC Summary>>FactHarbor.Specification.POC.Summary]]** - High-level POC overview
--* **[[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
++* **[[Implementation Roadmap>>Archive.FactHarbor.Roadmap.WebHome]]** - POC1, POC2, Beta 0, V1.0 phases
  * **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives requirements)
--
  **Document Owner:** Technical Team
  **Review Frequency:** After each POC iteration
  **Version History:**
++
  * v1.0 - Initial POC requirements
  * v2.0 - Updated after specification cross-check
  * v3.0 - Aligned with Main Requirements (FR/NFR IDs added)
--
