Changes for page POC1 API & Schemas Specification

Last modified by Robert Schaub on 2025/12/24 18:26

From 2.2 to 2.1

From version 2.1

edited by Robert Schaub
on 2025/12/24 13:58

Change comment: Imported from XAR

To version 1.1

edited by Robert Schaub
on 2025/12/24 11:54

Change comment: Imported from XAR

Raw
Rendered

Summary

Page properties (2 modified, 0 added, 0 removed)

Details

Page properties

Title

@@ -1,1 +1,1 @@
--POC1 API & Schemas Specification v0.4.1
++POC1 API & Schemas Specification

Content

@@ -1,6 +1,6 @@
  # FactHarbor POC1 — API & Schemas Specification
--**Version:** 0.4.1 (POC1 - 3-Stage Caching Architecture)
++**Version:** 0.3 (POC1 - Production Ready)
  **Namespace:** FactHarbor.*
  **Syntax:** xWiki 2.1
  **Last Updated:** 2025-12-24
@@ -10,31 +10,15 @@
  == Version History ==
  |=Version|=Date|=Changes
--|0.4.1|2025-12-24|Applied 9 critical fixes: file format notice, verdict taxonomy, canonicalization algorithm, Stage 1 cost policy, BullMQ fix, language in cache key, historical claims TTL, idempotency, copyright policy
--|0.4|2025-12-24|**BREAKING:** 3-stage pipeline with claim-level caching, user tier system, cache-only mode for free users, Redis cache architecture
--|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints, chain-of-thought, evidence citation, Jina safety, gate numbering
  |0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details, quality gate logging, temporal separation note, cross-references
  |0.2|2025-12-24|Initial rebased version with holistic assessment
  |0.1|2025-12-24|Original specification
  ---
-----
--== File Format Notice ==
--
--**⚠️ Important:** This file is stored as {{code}}.md{{/code}} for transport/versioning, but the content is **xWiki 2.1 syntax** (not Markdown).
--
--**When importing to xWiki:**
--* Use "Import as XWiki content" (not "Import as Markdown")
--* The xWiki parser will correctly interpret {{code}}==}} headers, {{{{code}}}}}} blocks, etc.
--
--**Alternate naming:** If your workflow supports it, rename to {{code}}.xwiki.txt{{/code}} to avoid ambiguity.
--
-----
--
  == 1. Core Objective (POC1) ==
--The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability:
++The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)**:
  The system must prove that AI can identify an article's **Main Thesis** and determine if the supporting claims (even if individually accurate) logically support that thesis without committing fallacies (e.g., correlation vs. causation, cherry-picking, hasty generalization).
@@ -41,226 +41,69 @@
  **Success Criteria:**
  * Test with 30 diverse articles
  * Target: ≥70% accuracy detecting misleading articles
--* Cost: <$0.25 per NEW analysis (uncached)
--* Cost: $0.00 for cached claim reuse
--* Cache hit rate: ≥50% after 1,000 articles
++* Cost: <$0.35 per analysis
  * Processing time: <2 minutes (standard depth)
--**Economic Model:**
--* Free tier: $10 credit per month (~40-140 articles depending on cache hits)
--* After limit: Cache-only mode (instant, free access to cached claims)
--* Paid tier: Unlimited new analyses
--
  **See:** [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]] for complete investigation of 7 approaches.
  ---
--== 2. Architecture Overview ==
++== 2. Runtime Model & Job States ==
--=== 2.1 3-Stage Pipeline with Caching ===
++=== 2.1 Pipeline Steps ===
--FactHarbor POC1 uses a **3-stage architecture** designed for claim-level caching and cost efficiency:
++For progress reporting via API, the pipeline follows these stages:
--{{code language="mermaid"}}
--graph TD
--    A[Article Input] --> B[Stage 1: Extract Claims]
--    B --> C{For Each Claim}
--    C --> D[Check Cache]
--    D -->|Cache HIT| E[Return Cached Verdict]
--    D -->|Cache MISS| F[Stage 2: Analyze Claim]
--    F --> G[Store in Cache]
--    G --> E
--    E --> H[Stage 3: Holistic Assessment]
--    H --> I[Final Report]
--{{/code}}
++# **INGEST**: URL scraping (Jina Reader / Trafilatura) or text normalization.
++# **EXTRACT_CLAIMS**: Identifying 3-5 verifiable factual claims + marking central vs. supporting.
++# **SCENARIOS**: Generating context interpretations for each claim.
++# **RETRIEVAL**: Evidence gathering (Search API + mandatory contradiction search).
++# **VERDICTS**: Assigning likelihoods, confidence, and uncertainty per scenario.
++# **HOLISTIC_ASSESSMENT**: Evaluating article-level credibility (Thesis vs. Claims logic).
++# **REPORT**: Generating final Markdown and JSON outputs.
--**Stage 1: Claim Extraction** (Haiku, no cache)
--* Input: Article text
--* Output: 5 canonical claims (normalized, deduplicated)
--* Model: Claude Haiku 4
--* Cost: $0.003 per article
--* Cache strategy: No caching (article-specific)
++=== 2.1.1 URL Extraction Strategy ===
--**Stage 2: Claim Analysis** (Sonnet, CACHED)
--* Input: Single canonical claim
--* Output: Scenarios + Evidence + Verdicts
--* Model: Claude Sonnet 3.5
--* Cost: $0.081 per NEW claim
--* Cache strategy: **Redis, 90-day TTL**
--* Cache key: {{code}}claim:v1norm1:{language}:{sha256(canonical_claim)}{{/code}}
++**Primary:** Jina AI Reader ({{code}}https://r.jina.ai/{url}{{/code}})
++* **Rationale:** Clean markdown, handles JS rendering, free tier sufficient
++* **Fallback:** Trafilatura (Python library) for simple static HTML
--**Stage 3: Holistic Assessment** (Sonnet, no cache)
--* Input: Article + Claim verdicts (from cache or Stage 2)
--* Output: Article verdict + Fallacies + Logic quality
--* Model: Claude Sonnet 3.5
--* Cost: $0.030 per article
--* Cache strategy: No caching (article-specific)
++**Error Handling:**
--**Total Cost Formula:**
--{{code}}
--Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)
++|=Error Code|=Trigger|=Action
++|{{code}}URL_BLOCKED{{/code}}|403/401/Paywall detected|Return error, suggest text paste
++|{{code}}URL_UNREACHABLE{{/code}}|Network/DNS failure|Retry once, then fail
++|{{code}}URL_NOT_FOUND{{/code}}|404 Not Found|Return error immediately
++|{{code}}EXTRACTION_FAILED{{/code}}|Content <50 words or unreadable|Return error with reason
--Examples:
--- 0 new claims (100% cache hit): $0.033
--- 1 new claim (80% cache hit): $0.114
--- 3 new claims (40% cache hit): $0.276
--- 5 new claims (0% cache hit): $0.438
--{{/code}}
++**Supported URL Patterns:**
++* ✅ News articles, blog posts, Wikipedia
++* ✅ Academic preprints (arXiv)
++* ❌ Social media posts (Twitter, Facebook) - not in POC1
++* ❌ Video platforms (YouTube, TikTok) - not in POC1
++* ❌ PDF files - deferred to Beta 0
--=== 2.2 User Tier System ===
++=== 2.2 Job Status Enumeration ===
--|=Tier|=Monthly Credit|=After Limit|=Cache Access|=Analytics
--|**Free**|$10|Cache-only mode|✅ Full|Basic
--|**Pro** (future)|$50|Continues|✅ Full|Advanced
--|**Enterprise** (future)|Custom|Continues|✅ Full + Priority|Full
++(((
++* **QUEUED** - Job accepted, waiting in queue
++* **RUNNING** - Processing in progress
++* **SUCCEEDED** - Analysis complete, results available
++* **FAILED** - Error occurred, see error details
++* **CANCELLED** - User cancelled via DELETE endpoint
++)))
--**Free Tier Economics:**
--* $10 credit = 40-140 articles analyzed (depending on cache hit rate)
--* Average 70 articles/month at 70% cache hit rate
--* After limit: Cache-only mode (see Section 2.3)
--
--=== 2.3 Cache-Only Mode (Free Tier Feature) ===
--
--When free users reach their $10 monthly limit, they enter **Cache-Only Mode**:
--
--**What Cache-Only Mode Provides:**
--
--✅ **Claim Extraction (Platform-Funded):**
--* Stage 1 extraction runs at $0.003 per article
--* **Cost: Absorbed by platform** (not charged to user credit)
--* Rationale: Extraction is necessary to check cache, and cost is negligible
--* Rate limit: Max 50 extractions/day in cache-only mode (prevents abuse)
--
--✅ **Instant Access to Cached Claims:**
--* Any claim that exists in cache → Full verdict returned
--* Cost: $0 (no LLM calls)
--* Response time: <100ms
--
--✅ **Partial Article Analysis:**
--* Check each claim against cache
--* Return verdicts for ALL cached claims
--* For uncached claims: Return {{code}}"status": "cache_miss"{{/code}}
--
--✅ **Cache Coverage Report:**
--* "3 of 5 claims available in cache (60% coverage)"
--* Links to cached analyses
--* Estimated cost to complete: $0.162 (2 new claims)
--
--❌ **Not Available in Cache-Only Mode:**
--* New claim analysis (Stage 2 LLM calls blocked)
--* Full holistic assessment (Stage 3 blocked if any claims missing)
--
--**User Experience:**
--{{code language="json"}}
--{
--  "status": "cache_only_mode",
--  "message": "Monthly credit limit reached. Showing cached results only.",
--  "cache_coverage": {
--    "claims_total": 5,
--    "claims_cached": 3,
--    "claims_missing": 2,
--    "coverage_percent": 60
--  },
--  "cached_claims": [
--    {"claim_id": "C1", "verdict": "Likely", "confidence": 0.82},
--    {"claim_id": "C2", "verdict": "Highly Likely", "confidence": 0.91},
--    {"claim_id": "C4", "verdict": "Unclear", "confidence": 0.55}
--  ],
--  "missing_claims": [
--    {"claim_id": "C3", "claim_text": "...", "estimated_cost": "$0.081"},
--    {"claim_id": "C5", "claim_text": "...", "estimated_cost": "$0.081"}
--  ],
--  "upgrade_options": {
--    "top_up": "$5 for 20-70 more articles",
--    "pro_tier": "$50/month unlimited"
--  }
--}
--{{/code}}
--
--**Design Rationale:**
--* Free users still get value (cached claims often answer their question)
--* Demonstrates FactHarbor's value (partial results encourage upgrade)
--* Sustainable for platform (no additional cost)
--* Fair to all users (everyone contributes to cache)
--
  ---
  == 3. REST API Contract ==
--=== 3.1 User Credit Tracking ===
++=== 3.1 Create Analysis Job ===
--**Endpoint:** {{code}}GET /v1/user/credit{{/code}}
--
--**Response:** {{code}}200 OK{{/code}}
--
--{{code language="json"}}
--{
--  "user_id": "user_abc123",
--  "tier": "free",
--  "credit_limit": 10.00,
--  "credit_used": 7.42,
--  "credit_remaining": 2.58,
--  "reset_date": "2025-02-01T00:00:00Z",
--  "cache_only_mode": false,
--  "usage_stats": {
--    "articles_analyzed": 67,
--    "claims_from_cache": 189,
--    "claims_newly_analyzed": 113,
--    "cache_hit_rate": 0.626
--  }
--}
--{{/code}}
--
-----
--
--=== 3.2 Create Analysis Job (3-Stage) ===
--
  **Endpoint:** {{code}}POST /v1/analyze{{/code}}
--**Request Body:**
--
--
--**Idempotency Support:**
--
--To prevent duplicate job creation on network retries, clients SHOULD include:
--
--{{code language="http"}}
--POST /v1/analyze
--Idempotency-Key: {client-generated-uuid}
--{{/code}}
--
--OR use the {{code}}client.request_id{{/code}} field:
--
++**Request Body Example:**
  {{code language="json"}}
  {
--  "input_url": "...",
--  "client": {
--    "request_id": "client-uuid-12345",
--    "source_label": "optional"
--  }
--}
--{{/code}}
--
--**Server Behavior:**
--* If {{code}}Idempotency-Key{{/code}} or {{code}}request_id{{/code}} seen before (within 24 hours):
--  - Return existing job ({{code}}200 OK{{/code}}, not {{code}}202 Accepted{{/code}})
--  - Do NOT create duplicate job or charge twice
--* Idempotency keys expire after 24 hours (matches job retention)
--
--**Example Response (Idempotent):**
--{{code language="json"}}
--{
--  "job_id": "01J...ULID",
--  "status": "RUNNING",
--  "idempotent": true,
--  "original_request_at": "2025-12-24T10:31:00Z",
--  "message": "Returning existing job (idempotency key matched)"
--}
--{{/code}}
--
--
--{{code language="json"}}
--{
    "input_type": "url",
    "input_url": "https://example.com/medical-report-01",
    "input_text": null,
@@ -268,8 +268,7 @@
      "browsing": "on",
      "depth": "standard",
      "max_claims": 5,
--    "context_aware_analysis": true,
--    "cache_preference": "prefer_cache"
++    "context_aware_analysis": true
    },
    "client": {
      "request_id": "optional-client-tracking-id",
@@ -279,10 +279,10 @@
  {{/code}}
  **Options:**
--* {{code}}cache_preference{{/code}}: {{code}}prefer_cache{{/code}} | {{code}}require_fresh{{/code}} | {{code}}allow_partial{{/code}}
--  - {{code}}prefer_cache{{/code}}: Use cache when available, analyze new claims (default)
--  - {{code}}require_fresh{{/code}}: Force re-analysis of all claims (ignores cache, costs more)
--  - {{code}}allow_partial{{/code}}: Return partial results if some claims uncached (for free tier cache-only mode)
++* {{code}}browsing{{/code}}: {{code}}on{{/code}} | {{code}}off{{/code}} (retrieve web sources or just output queries)
++* {{code}}depth{{/code}}: {{code}}standard{{/code}} | {{code}}deep{{/code}} (evidence thoroughness)
++* {{code}}max_claims{{/code}}: 1-50 (default: 10)
++* {{code}}context_aware_analysis{{/code}}: {{code}}true{{/code}} | {{code}}false{{/code}} (experimental)
  **Response:** {{code}}202 Accepted{{/code}}
@@ -291,18 +291,6 @@
    "job_id": "01J...ULID",
    "status": "QUEUED",
    "created_at": "2025-12-24T10:31:00Z",
--  "estimated_cost": 0.114,
--  "cost_breakdown": {
--    "stage1_extraction": 0.003,
--    "stage2_new_claims": 0.081,
--    "stage2_cached_claims": 0.000,
--    "stage3_holistic": 0.030
--  },
--  "cache_info": {
--    "claims_to_extract": 5,
--    "estimated_cache_hits": 4,
--    "estimated_new_claims": 1
--  },
    "links": {
      "self": "/v1/jobs/01J...ULID",
      "result": "/v1/jobs/01J...ULID/result",
@@ -312,23 +312,9 @@
  }
  {{/code}}
--**Error Responses:**
--
--{{code}}402 Payment Required{{/code}} - Free tier limit reached, cache-only mode
--{{code language="json"}}
--{
--  "error": "credit_limit_reached",
--  "message": "Monthly credit limit reached. Entering cache-only mode.",
--  "cache_only_mode": true,
--  "credit_remaining": 0.00,
--  "reset_date": "2025-02-01T00:00:00Z",
--  "action": "Resubmit with cache_preference=allow_partial for cached results"
--}
--{{/code}}
--
  ---
--=== 3.3 Get Job Status ===
++=== 3.2 Get Job Status ===
  **Endpoint:** {{code}}GET /v1/jobs/{job_id}{{/code}}
@@ -341,20 +341,12 @@
    "created_at": "2025-12-24T10:31:00Z",
    "updated_at": "2025-12-24T10:31:22Z",
    "progress": {
--    "stage": "stage2_claim_analysis",
--    "percent": 65,
--    "message": "Analyzing claim 3 of 5 (2 from cache)",
--    "current_claim_id": "C3",
--    "cache_hits": 2,
--    "cache_misses": 1
++    "step": "RETRIEVAL",
++    "percent": 60,
++    "message": "Gathering evidence for C2-S1",
++    "current_claim_id": "C2",
++    "current_scenario_id": "C2-S1"
    },
--  "actual_cost": 0.084,
--  "cost_breakdown": {
--    "stage1_extraction": 0.003,
--    "stage2_new_claims": 0.081,
--    "stage2_cached_claims": 0.000,
--    "stage3_holistic": null
--  },
    "input_echo": {
      "input_type": "url",
      "input_url": "https://example.com/medical-report-01"
@@ -370,61 +370,12 @@
  ---
--=== 3.4 Get Analysis Result ===
++=== 3.3 Get JSON Result ===
  **Endpoint:** {{code}}GET /v1/jobs/{job_id}/result{{/code}}
--**Response:** {{code}}200 OK{{/code}}
++**Response:** {{code}}200 OK{{/code}} (Returns the **AnalysisResult** schema - see Section 4)
--Returns complete **AnalysisResult** schema (see Section 4).
--
--**Cache-Only Mode Response:** {{code}}206 Partial Content{{/code}}
--
--{{code language="json"}}
--{
--  "cache_only_mode": true,
--  "cache_coverage": {
--    "claims_total": 5,
--    "claims_cached": 3,
--    "claims_missing": 2,
--    "coverage_percent": 60
--  },
--  "partial_result": {
--    "metadata": {
--      "job_id": "01J...ULID",
--      "timestamp_utc": "2025-12-24T10:31:30Z",
--      "engine_version": "POC1-v0.4",
--      "cache_only": true
--    },
--    "claims": [
--      {
--        "claim_id": "C1",
--        "claim_text": "...",
--        "canonical_claim": "...",
--        "source": "cache",
--        "cached_at": "2025-12-20T15:30:00Z",
--        "cache_hit_count": 47,
--        "scenarios": [...]
--      },
--      {
--        "claim_id": "C3",
--        "claim_text": "...",
--        "canonical_claim": "...",
--        "source": "not_analyzed",
--        "status": "cache_miss",
--        "estimated_cost": 0.081
--      }
--    ],
--    "article_holistic_assessment": null,
--    "upgrade_prompt": {
--      "message": "Upgrade to Pro for full analysis of all claims",
--      "missing_claims": 2,
--      "cost_to_complete": 0.192
--    }
--  }
--}
--{{/code}}
--
  **Other Responses:**
  * {{code}}409 Conflict{{/code}} - Job not finished yet
  * {{code}}404 Not Found{{/code}} - Job ID unknown
@@ -431,29 +431,8 @@
  ---
--=== 3.5 Stage-Specific Endpoints (Optional, Advanced) ===
++=== 3.4 Download Markdown Report ===
--For direct stage access (useful for cache debugging, custom workflows):
--
--**Extract Claims Only:**
--{{code}}POST /v1/analyze/extract-claims{{/code}}
--
--**Analyze Single Claim:**
--{{code}}POST /v1/analyze/claim{{/code}}
--
--**Assess Article (with claim verdicts):**
--{{code}}POST /v1/analyze/assess-article{{/code}}
--
--**Check Claim Cache:**
--{{code}}GET /v1/cache/claim/{claim_hash}{{/code}}
--
--**Cache Statistics:**
--{{code}}GET /v1/cache/stats{{/code}}
--
-----
--
--=== 3.6 Download Markdown Report ===
--
  **Endpoint:** {{code}}GET /v1/jobs/{job_id}/report{{/code}}
  **Response:** {{code}}200 OK{{/code}} with {{code}}text/markdown; charset=utf-8{{/code}} content
@@ -461,11 +461,13 @@
  **Headers:**
  * {{code}}Content-Disposition: attachment; filename="factharbor_poc1_{job_id}.md"{{/code}}
--**Cache-Only Mode:** Report includes "Partial Analysis" watermark and upgrade prompt.
++**Other Responses:**
++* {{code}}409 Conflict{{/code}} - Job not finished
++* {{code}}404 Not Found{{/code}} - Job unknown
  ---
--=== 3.7 Stream Job Events (Backend Progress) ===
++=== 3.5 Stream Job Events (Optional, Recommended) ===
  **Endpoint:** {{code}}GET /v1/jobs/{job_id}/events{{/code}}
@@ -472,1044 +472,478 @@
  **Response:** Server-Sent Events (SSE) stream
  **Event Types:**
--* {{code}}progress{{/code}} - Backend progress (e.g., "Stage 1: Extracting claims")
--* {{code}}cache_hit{{/code}} - Claim found in cache
--* {{code}}cache_miss{{/code}} - Claim requires new analysis
--* {{code}}stage_complete{{/code}} - Stage 1/2/3 finished
++* {{code}}progress{{/code}} - Progress update
++* {{code}}claim_extracted{{/code}} - Claim identified
++* {{code}}verdict_computed{{/code}} - Scenario verdict complete
  * {{code}}complete{{/code}} - Job finished
  * {{code}}error{{/code}} - Error occurred
--* {{code}}credit_warning{{/code}} - User approaching limit
  ---
--=== 3.8 Cancel Job ===
++=== 3.6 Cancel Job ===
  **Endpoint:** {{code}}DELETE /v1/jobs/{job_id}{{/code}}
--**Note:** If job is mid-stage (e.g., analyzing claim 3 of 5), user is charged for completed work only.
++Attempts to cancel a queued or running job.
++**Response:** {{code}}200 OK{{/code}} with updated Job object (status: CANCELLED)
++
++**Note:** Already-completed jobs cannot be cancelled.
++
  ---
--=== 3.9 Health Check ===
++=== 3.7 Health Check ===
  **Endpoint:** {{code}}GET /v1/health{{/code}}
++**Response:** {{code}}200 OK{{/code}}
++
  {{code language="json"}}
  {
    "status": "ok",
--  "version": "POC1-v0.4",
--  "model_stage1": "claude-haiku-4",
--  "model_stage2": "claude-3-5-sonnet-20241022",
--  "model_stage3": "claude-3-5-sonnet-20241022",
--  "cache": {
--    "status": "connected",
--    "total_claims": 12847,
--    "avg_hit_rate_24h": 0.73
--  }
++  "version": "POC1-v0.3",
++  "model": "claude-3-5-sonnet-20241022"
  }
  {{/code}}
  ---
--== 4. Data Schemas ==
++== 4. AnalysisResult Schema (Context-Aware) ==
--=== 4.1 Stage 1 Output: ClaimExtraction ===
++This schema implements the **Context-Aware Analysis** required by the POC1 specification.
  {{code language="json"}}
  {
--  "job_id": "01J...ULID",
--  "stage": "stage1_extraction",
--  "article_metadata": {
--    "title": "Article title",
--    "source_url": "https://example.com/article",
--    "extracted_text_length": 5234,
--    "language": "en"
--  },
--  "claims": [
--    {
--      "claim_id": "C1",
--      "claim_text": "Original claim text from article",
--      "canonical_claim": "Normalized, deduplicated phrasing",
--      "claim_hash": "sha256:abc123...",
--      "is_central_to_thesis": true,
--      "claim_type": "causal",
--      "evaluability": "evaluable",
--      "risk_tier": "B",
--      "domain": "public_health"
--    }
--  ],
--  "article_thesis": "Main argument detected",
--  "cost": 0.003
--}
--{{/code}}
--
--=== 4.2 Stage 2 Output: ClaimAnalysis (CACHED) ===
--
--This is the CACHEABLE unit. Stored in Redis with 90-day TTL.
--
--{{code language="json"}}
--{
--  "claim_hash": "sha256:abc123...",
--  "canonical_claim": "COVID vaccines are 95% effective",
--  "language": "en",
--  "domain": "public_health",
--  "analysis_version": "v1.0",
--  "scenarios": [
--    {
--      "scenario_id": "S1",
--      "scenario_title": "mRNA vaccines (Pfizer/Moderna) in clinical trials",
--      "definitions": {"95% effective": "95% reduction in symptomatic infection"},
--      "assumptions": ["Based on phase 3 trial data", "Against original strain"],
--      "boundaries": {
--        "time": "2020-2021 trials",
--        "geography": "Multi-country trials",
--        "population": "Adult population (16+)",
--        "conditions": "Before widespread variants"
--      },
--      "verdict": {
--        "label": "Highly Likely",
--        "probability_range": [0.88, 0.97],
--        "confidence": 0.92,
--        "reasoning_chain": [
--          "Pfizer/BioNTech trial: 95% efficacy (n=43,548)",
--          "Moderna trial: 94.1% efficacy (n=30,420)",
--          "Peer-reviewed publications in NEJM",
--          "FDA independent analysis confirmed"
--        ],
--        "key_supporting_evidence_ids": ["E1", "E2"],
--        "key_counter_evidence_ids": ["E3"],
--        "uncertainty_factors": [
--          "Limited data on long-term effectiveness",
--          "Variant-specific performance not yet measured"
--        ]
--      },
--      "evidence": [
--        {
--          "evidence_id": "E1",
--          "stance": "supports",
--          "relevance_to_scenario": 0.98,
--          "evidence_summary": [
--            "Pfizer trial showed 170 cases in placebo vs 8 in vaccine group",
--            "Follow-up period median 2 months post-dose 2",
--            "Efficacy consistent across age, sex, race, ethnicity"
--          ],
--          "citation": {
--            "title": "Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine",
--            "author_or_org": "Polack et al.",
--            "publication_date": "2020-12-31",
--            "url": "https://nejm.org/doi/full/10.1056/NEJMoa2034577",
--            "publisher": "New England Journal of Medicine",
--            "retrieved_at_utc": "2025-12-20T15:30:00Z"
--          },
--          "excerpt": ["The vaccine was 95% effective in preventing Covid-19"],
--          "excerpt_word_count": 9,
--          "source_reliability_score": 0.95,
--          "reliability_justification": "Peer-reviewed, high-impact journal, large RCT",
--          "limitations_and_reservations": [
--            "Short follow-up period (2 months)",
--            "Primarily measures symptomatic infection, not transmission"
--          ],
--          "retraction_or_dispute_signal": "none"
--        }
--      ]
--    }
--  ],
--  "cache_metadata": {
--    "first_analyzed": "2025-12-01T10:00:00Z",
--    "last_updated": "2025-12-20T15:30:00Z",
--    "hit_count": 47,
--    "version": "v1.0",
--    "ttl_expires": "2026-03-20T15:30:00Z"
--  },
--  "cost": 0.081
--}
--{{/code}}
--
--**Cache Key Structure:**
--{{code}}
--Redis Key: claim:v1norm1:{language}:{sha256(canonical_claim)}
--TTL: 90 days (7,776,000 seconds)
--Size: ~15KB JSON (compressed: ~5KB)
--{{/code}}
--
--=== 4.3 Stage 3 Output: HolisticAssessment ===
--
--{{code language="json"}}
--{
--  "job_id": "01J...ULID",
--  "stage": "stage3_holistic",
--  "article_metadata": {
--    "title": "...",
--    "main_thesis": "...",
--    "source_url": "..."
--  },
--  "article_holistic_assessment": {
--    "overall_verdict": "MISLEADING",
--    "logic_quality_score": 0.42,
--    "fallacies_detected": [
--      "correlation-causation",
--      "cherry-picking"
--    ],
--    "verdict_reasoning": [
--      "Central claim C1 is REFUTED by multiple systematic reviews",
--      "Supporting claims C2-C4 are TRUE but do not support the thesis",
--      "Article commits correlation-causation fallacy",
--      "Selective citation of evidence (cherry-picking detected)"
--    ],
--    "experimental_feature": true
--  },
--  "claims_summary": [
--    {
--      "claim_id": "C1",
--      "is_central_to_thesis": true,
--      "verdict": "Refuted",
--      "confidence": 0.89,
--      "source": "cache",
--      "cache_hit": true
--    },
--    {
--      "claim_id": "C2",
--      "is_central_to_thesis": false,
--      "verdict": "Highly Likely",
--      "confidence": 0.91,
--      "source": "new_analysis",
--      "cache_hit": false
--    }
--  ],
--  "quality_gates": {
--    "gate1_claim_validation": "pass",
--    "gate4_verdict_confidence": "pass",
--    "passed_all": true
--  },
--  "cost": 0.030,
--  "total_job_cost": 0.114
--}
--{{/code}}
--
--=== 4.4 Complete AnalysisResult (All 3 Stages Combined) ===
--
--{{code language="json"}}
--{
    "metadata": {
--    "job_id": "01J...ULID",
--    "timestamp_utc": "2025-12-24T10:31:30Z",
--    "engine_version": "POC1-v0.4",
--    "llm_stage1": "claude-haiku-4",
--    "llm_stage2": "claude-3-5-sonnet-20241022",
--    "llm_stage3": "claude-3-5-sonnet-20241022",
++    "job_id": "string (ULID)",
++    "timestamp_utc": "ISO8601",
++    "engine_version": "POC1-v0.3",
++    "llm_provider": "anthropic",
++    "llm_model": "claude-3-5-sonnet-20241022",
      "usage_stats": {
--      "stage1_tokens": {"input": 10000, "output": 500},
--      "stage2_tokens": {"input": 2000, "output": 5000},
--      "stage3_tokens": {"input": 5000, "output": 1000},
--      "total_input_tokens": 17000,
--      "total_output_tokens": 6500,
--      "estimated_cost_usd": 0.114,
--      "response_time_sec": 45.2
--    },
--    "cache_stats": {
--      "claims_total": 5,
--      "claims_from_cache": 4,
--      "claims_new_analysis": 1,
--      "cache_hit_rate": 0.80,
--      "cache_savings_usd": 0.324
++      "input_tokens": "integer",
++      "output_tokens": "integer",
++      "estimated_cost_usd": "float",
++      "response_time_sec": "float"
      }
    },
    "article_holistic_assessment": {
--    "main_thesis": "...",
--    "overall_verdict": "MISLEADING",
--    "logic_quality_score": 0.42,
--    "fallacies_detected": ["correlation-causation", "cherry-picking"],
--    "verdict_reasoning": ["...", "...", "..."],
++    "main_thesis": "string (The core argument detected)",
++    "overall_verdict": "WELL-SUPPORTED | MISLEADING | REFUTED | UNCERTAIN",
++    "logic_quality_score": "float (0-1)",
++    "fallacies_detected": ["correlation-causation", "cherry-picking", "hasty-generalization"],
++    "verdict_reasoning": "string (Explanation of why article credibility differs from claim average)",
      "experimental_feature": true
    },
    "claims": [
      {
        "claim_id": "C1",
--      "is_central_to_thesis": true,
--      "claim_text": "...",
--      "canonical_claim": "...",
--      "claim_hash": "sha256:abc123...",
--      "claim_type": "causal",
--      "evaluability": "evaluable",
--      "risk_tier": "B",
--      "source": "cache",
--      "cached_at": "2025-12-20T15:30:00Z",
--      "cache_hit_count": 47,
--      "scenarios": [...]
--    },
--    {
--      "claim_id": "C2",
--      "source": "new_analysis",
--      "analyzed_at": "2025-12-24T10:31:15Z",
--      "scenarios": [...]
++      "is_central_to_thesis": "boolean",
++      "claim_text": "string",
++      "canonical_form": "string",
++      "claim_type": "descriptive | causal | predictive | normative | definitional",
++      "evaluability": "evaluable | partly_evaluable | not_evaluable",
++      "risk_tier": "A | B | C",
++      "risk_tier_justification": "string",
++      "domain": "string (e.g., 'public health', 'economics')",
++      "key_terms": ["term1", "term2"],
++      "entities": ["Person X", "Org Y"],
++      "time_scope_detected": "2020-2024",
++      "geography_scope_detected": "Brazil",
++      "scenarios": [
++        {
++          "scenario_id": "C1-S1",
++          "context_title": "string",
++          "definitions": {"key_term": "definition"},
++          "assumptions": ["Assumption 1", "Assumption 2"],
++          "boundaries": {
++            "time": "as of 2025-01",
++            "geography": "Brazil",
++            "population": "adult population",
++            "conditions": "excludes X; includes Y"
++          },
++          "scope_of_evidence": "What counts as evidence for this scenario",
++          "scenario_questions": ["Question that decides the verdict"],
++          "verdict": {
++            "label": "Highly Likely | Likely | Unclear | Unlikely | Refuted | Unsubstantiated",
++            "probability_range": [0.0, 1.0],
++            "confidence": "float (0-1)",
++            "reasoning": "string",
++            "key_supporting_evidence_ids": ["E1", "E3"],
++            "key_counter_evidence_ids": ["E2"],
++            "uncertainty_factors": ["Data gap", "Method disagreement"],
++            "what_would_change_my_mind": ["Specific new study", "Updated dataset"]
++          },
++          "evidence": [
++            {
++              "evidence_id": "E1",
++              "stance": "supports | undermines | mixed | context_dependent",
++              "relevance_to_scenario": "float (0-1)",
++              "evidence_summary": ["Bullet fact 1", "Bullet fact 2"],
++              "citation": {
++                "title": "Source title",
++                "author_or_org": "Org/Author",
++                "publication_date": "2024-05-01",
++                "url": "https://source.example",
++                "publisher": "Publisher/Domain"
++              },
++              "excerpt": ["Short quote ≤25 words (optional)"],
++              "source_reliability_score": "float (0-1) - READ-ONLY SNAPSHOT",
++              "reliability_justification": "Why high/medium/low",
++              "limitations_and_reservations": ["Limitation 1", "Limitation 2"],
++              "retraction_or_dispute_signal": "none | correction | retraction | disputed",
++              "retrieval_status": "OK | NEEDS_RETRIEVAL | FAILED"
++            }
++          ]
++        }
++      ]
      }
    ],
    "quality_gates": {
--    "gate1_claim_validation": "pass",
--    "gate4_verdict_confidence": "pass",
--    "passed_all": true
++    "gate1_claim_validation": "pass | fail",
++    "gate4_verdict_confidence": "pass | fail",
++    "passed_all": "boolean",
++    "gate_fail_reasons": [
++      {
++        "gate": "gate1_claim_validation",
++        "claim_id": "C1",
++        "reason_code": "OPINION_DETECTED | COMPOUND_CLAIM | SUBJECTIVE | TOO_VAGUE",
++        "explanation": "Human-readable explanation"
++      }
++    ]
++  },
++  "global_notes": {
++    "limitations": ["System limitation 1", "Limitation 2"],
++    "safety_or_policy_notes": ["Note 1"]
    }
  }
  {{/code}}
++=== 4.1 Risk Tier Definitions ===
++|=Tier|=Impact|=Examples|=Actions
++|**A (High)**|High real-world impact if wrong|Health claims, safety information, financial advice, medical procedures|Human review recommended (Mode3_Human_Reviewed_Required)
++|**B (Medium)**|Moderate impact, contested topics|Political claims, social issues, scientific debates, economic predictions|Enhanced contradiction search, AI-generated publication OK (Mode2_AI_Generated)
++|**C (Low)**|Low impact, easily verifiable|Historical facts, basic statistics, biographical data, geographic information|Standard processing, AI-generated publication OK (Mode2_AI_Generated)
--=== 4.5 Verdict Label Taxonomy ===
++=== 4.2 Source Reliability (Read-Only Snapshots) ===
--FactHarbor uses **three distinct verdict taxonomies** depending on analysis level:
++**IMPORTANT:** The {{code}}source_reliability_score{{/code}} in each evidence item is a **historical snapshot** from the weekly background scoring job.
--==== 4.5.1 Scenario Verdict Labels (Stage 2) ====
++* POC1 treats these scores as **read-only** (no modification during analysis)
++* **Prevents circular dependency:** scoring → affects retrieval → affects scoring
++* Full Source Track Record System is a **separate service** (not part of POC1)
++* **Temporal separation:** Scoring runs weekly; analysis uses snapshots
--Used for individual scenario verdicts within a claim.
++**See:** [[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]] Section 1.3 (Source Track Record System) for scoring algorithm.
--**Enum Values:**
--* {{code}}Highly Likely{{/code}} - Probability 0.85-1.0, high confidence
--* {{code}}Likely{{/code}} - Probability 0.65-0.84, moderate-high confidence
--* {{code}}Unclear{{/code}} - Probability 0.35-0.64, or low confidence
--* {{code}}Unlikely{{/code}} - Probability 0.16-0.34, moderate-high confidence
--* {{code}}Highly Unlikely{{/code}} - Probability 0.0-0.15, high confidence
--* {{code}}Unsubstantiated{{/code}} - Insufficient evidence to determine probability
++=== 4.3 Quality Gate Reason Codes ===
--==== 4.5.2 Claim Verdict Labels (Rollup) ====
++**Gate 1 (Claim Validation):**
++* {{code}}OPINION_DETECTED{{/code}} - Subjective judgment without factual anchor
++* {{code}}COMPOUND_CLAIM{{/code}} - Multiple claims in one statement
++* {{code}}SUBJECTIVE{{/code}} - Value judgment, not verifiable fact
++* {{code}}TOO_VAGUE{{/code}} - Lacks specificity for evaluation
--Used when summarizing a claim across all scenarios.
++**Gate 4 (Verdict Confidence):**
++* {{code}}LOW_CONFIDENCE{{/code}} - Confidence below threshold (<0.5)
++* {{code}}INSUFFICIENT_EVIDENCE{{/code}} - Too few sources to reach verdict
++* {{code}}CONTRADICTORY_EVIDENCE{{/code}} - Evidence conflicts without resolution
++* {{code}}NO_COUNTER_EVIDENCE{{/code}} - Contradiction search failed
--**Enum Values:**
--* {{code}}Supported{{/code}} - Majority of scenarios are Likely or Highly Likely
--* {{code}}Refuted{{/code}} - Majority of scenarios are Unlikely or Highly Unlikely
--* {{code}}Inconclusive{{/code}} - Mixed scenarios or majority Unclear/Unsubstantiated
++**Purpose:** Enable system improvement workflow (Observe → Analyze → Improve)
--**Mapping Logic:**
--* If ≥60% scenarios are (Highly Likely | Likely) → Supported
--* If ≥60% scenarios are (Highly Unlikely | Unlikely) → Refuted
--* Otherwise → Inconclusive
++---
--==== 4.5.3 Article Verdict Labels (Stage 3) ====
++== 5. Validation Rules (POC1 Enforcement) ==
--Used for holistic article-level assessment.
++|=Rule|=Requirement
++|**Mandatory Contradiction**|For every claim, the engine MUST search for "undermines" evidence. If none found, reasoning must explicitly state: "No counter-evidence found despite targeted search." Evidence must include at least 1 item with {{code}}stance ∈ {undermines, mixed, context_dependent}{{/code}} OR explicit note in {{code}}uncertainty_factors{{/code}}.
++|**Context-Aware Logic**|The {{code}}overall_verdict{{/code}} must prioritize central claims. If a {{code}}is_central_to_thesis=true{{/code}} claim is REFUTED, the overall article cannot be WELL-SUPPORTED. Central claims override verdict averaging.
++|**Author Identification**|All automated outputs MUST include {{code}}author_type: "AI/AKEL"{{/code}} or equivalent marker to distinguish AI-generated from human-reviewed content.
++|**Claim-to-Scenario Lifecycle**|In stateless POC1, Scenarios are **strictly children** of a specific Claim version. If a Claim's text changes, child Scenarios are part of that version's "snapshot." No scenario migration across versions.
--**Enum Values:**
--* {{code}}WELL-SUPPORTED{{/code}} - Article thesis logically follows from supported claims
--* {{code}}MISLEADING{{/code}} - Claims may be true but article commits logical fallacies
--* {{code}}REFUTED{{/code}} - Central claims are refuted, invalidating thesis
--* {{code}}UNCERTAIN{{/code}} - Insufficient evidence or highly mixed claim verdicts
++---
--**Note:** Article verdict considers **claim centrality** (central claims override supporting claims).
++== 6. Deterministic Markdown Template ==
--==== 4.5.4 API Field Mapping ====
++The system renders {{code}}report.md{{/code}} using a **fixed template** based on the JSON result (NOT generated by LLM).
--|=Level|=API Field|=Enum Name
--|Scenario|{{code}}scenarios[].verdict.label{{/code}}|scenario_verdict_label
--|Claim|{{code}}claims[].rollup_verdict{{/code}} (optional)|claim_verdict_label
--|Article|{{code}}article_holistic_assessment.overall_verdict{{/code}}|article_verdict_label
++{{code language="markdown"}}
++# FactHarbor Analysis Report: {overall_verdict}
++**Job ID:** {job_id} | **Generated:** {timestamp_utc}
++**Model:** {llm_model} | **Cost:** ${estimated_cost_usd} | **Time:** {response_time_sec}s
  ---
--== 5. Cache Architecture ==
++## 1. Holistic Assessment (Experimental)
--=== 5.1 Redis Cache Design ===
++**Main Thesis:** {main_thesis}
--**Technology:** Redis 7.0+ (in-memory key-value store)
++**Overall Verdict:** {overall_verdict}
--**Cache Key Schema:**
--{{code}}
--claim:v1norm1:{language}:{sha256(canonical_claim)}
--{{/code}}
++**Logic Quality Score:** {logic_quality_score}/1.0
--**Example:**
--{{code}}
--Claim (English): "COVID vaccines are 95% effective"
--Canonical: "covid vaccines are 95 percent effective"
--Language: "en"
--SHA256: abc123...def456
--Key: claim:v1norm1:en:abc123...def456
--{{/code}}
++**Fallacies Detected:** {fallacies_detected}
--**Rationale:** Prevents cross-language collisions and enables per-language cache analytics.
++**Reasoning:** {verdict_reasoning}
--**Data Structure:**
--{{code language="redis"}}
--SET claim:v1:abc123...def456 '{...ClaimAnalysis JSON...}'
--EXPIRE claim:v1:abc123...def456 7776000  # 90 days
--{{/code}}
--
--**Additional Keys:**
--{{code}}
--
--==== 5.1.1 Canonical Claim Normalization (v1) ====
--
--The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly.
--
--**Algorithm: Canonical Claim Normalization v1**
--
--{{code language="python"}}
--def normalize_claim_v1(claim_text: str, language: str) -> str:
--    """
--    Normalizes claim to canonical form for cache key generation.
--    Version: v1norm1 (POC1)
--    """
--    import re
--    import unicodedata
--
--    # Step 1: Unicode normalization (NFC)
--    text = unicodedata.normalize('NFC', claim_text)
--
--    # Step 2: Lowercase
--    text = text.lower()
--
--    # Step 3: Remove punctuation (except hyphens in words)
--    text = re.sub(r'[^\w\s-]', '', text)
--
--    # Step 4: Normalize whitespace (collapse multiple spaces)
--    text = re.sub(r'\s+', ' ', text).strip()
--
--    # Step 5: Numeric normalization
--    text = text.replace('%', ' percent')
--    # Spell out single-digit numbers
--    num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
--                   '4':'four', '5':'five', '6':'six', '7':'seven',
--                   '8':'eight', '9':'nine'}
--    for num, word in num_to_word.items():
--        text = re.sub(rf'\b{num}\b', word, text)
--
--    # Step 6: Common abbreviations (English only in v1)
--    if language == 'en':
--        text = text.replace('covid-19', 'covid')
--        text = text.replace('u.s.', 'us')
--        text = text.replace('u.k.', 'uk')
--
--    # Step 7: NO entity normalization in v1
--    # (Trump vs Donald Trump vs President Trump remain distinct)
--
--    return text
--
--# Version identifier (include in cache namespace)
--CANONICALIZER_VERSION = "v1norm1"
--{{/code}}
--
--**Cache Key Formula (Updated):**
--
--{{code}}
--language = "en"
--canonical = normalize_claim_v1(claim_text, language)
--cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"
--
--Example:
--  claim: "COVID-19 vaccines are 95% effective"
--  canonical: "covid vaccines are 95 percent effective"
--  sha256: abc123...def456
--  key: "claim:v1norm1:en:abc123...def456"
--{{/code}}
--
--**Cache Metadata MUST Include:**
--
--{{code language="json"}}
--{
--  "canonical_claim": "covid vaccines are 95 percent effective",
--  "canonicalizer_version": "v1norm1",
--  "language": "en",
--  "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
--}
--{{/code}}
--
--**Version Upgrade Path:**
--* v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
--* v1normN → v2norm1: Major version bump, invalidate all v1 caches
--
--
--claim:stats:hit_count:{claim_hash}  # Counter
--claim:index:domain:{domain}  # Set of claim hashes by domain
--claim:index:language:{lang}  # Set of claim hashes by language
--{{/code}}
--
--
--=== 5.1.1 Canonical Claim Normalization (v1) ===
--
--The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly.
--
--**Algorithm: Canonical Claim Normalization v1**
--
--{{code language="python"}}
--def normalize_claim_v1(claim_text: str, language: str) -> str:
--    """
--    Normalizes claim to canonical form for cache key generation.
--    Version: v1norm1 (POC1)
--    """
--    import re
--    import unicodedata
--
--    # Step 1: Unicode normalization (NFC)
--    text = unicodedata.normalize('NFC', claim_text)
--
--    # Step 2: Lowercase
--    text = text.lower()
--
--    # Step 3: Remove punctuation (except hyphens in words)
--    text = re.sub(r'[^\w\s-]', '', text)
--
--    # Step 4: Normalize whitespace (collapse multiple spaces)
--    text = re.sub(r'\s+', ' ', text).strip()
--
--    # Step 5: Numeric normalization
--    text = text.replace('%', ' percent')
--    # Spell out single-digit numbers
--    num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
--                   '4':'four', '5':'five', '6':'six', '7':'seven',
--                   '8':'eight', '9':'nine'}
--    for num, word in num_to_word.items():
--        text = re.sub(rf'\b{num}\b', word, text)
--
--    # Step 6: Common abbreviations (English only in v1)
--    if language == 'en':
--        text = text.replace('covid-19', 'covid')
--        text = text.replace('u.s.', 'us')
--        text = text.replace('u.k.', 'uk')
--
--    # Step 7: NO entity normalization in v1
--    # (Trump vs Donald Trump vs President Trump remain distinct)
--
--    return text
--
--# Version identifier (include in cache namespace)
--CANONICALIZER_VERSION = "v1norm1"
--{{/code}}
--
--**Cache Key Formula (Updated):**
--
--{{code}}
--language = "en"
--canonical = normalize_claim_v1(claim_text, language)
--cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"
--
--Example:
--  claim: "COVID-19 vaccines are 95% effective"
--  canonical: "covid vaccines are 95 percent effective"
--  sha256: abc123...def456
--  key: "claim:v1norm1:en:abc123...def456"
--{{/code}}
--
--**Cache Metadata MUST Include:**
--
--{{code language="json"}}
--{
--  "canonical_claim": "covid vaccines are 95 percent effective",
--  "canonicalizer_version": "v1norm1",
--  "language": "en",
--  "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
--}
--{{/code}}
--
--**Version Upgrade Path:**
--* v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
--* v1normN → v2norm1: Major version bump, invalidate all v1 caches
--
--
--
--=== 5.1.2 Copyright & Data Retention Policy ===
--
--**Evidence Excerpt Storage:**
--
--To comply with copyright law and fair use principles:
--
--**What We Store:**
--* **Metadata only:** Title, author, publisher, URL, publication date
--* **Short excerpts:** Max 25 words per quote, max 3 quotes per evidence item
--* **Summaries:** AI-generated bullet points (not verbatim text)
--* **No full articles:** Never store complete article text beyond job processing
--
--**Total per Cached Claim:**
--* Scenarios: 2 per claim
--* Evidence items: 6 per scenario (12 total)
--* Quotes: 3 per evidence × 25 words = 75 words per item
--* **Maximum stored verbatim text:** ~900 words per claim (12 × 75)
--
--**Retention:**
--* Cache TTL: 90 days
--* Job outputs: 24 hours (then archived or deleted)
--* No persistent full-text article storage
--
--**Rationale:**
--* Short excerpts for citation = fair use
--* Summaries are transformative (not copyrightable)
--* Limited retention (90 days max)
--* No commercial republication of excerpts
--
--**DMCA Compliance:**
--* Cache invalidation endpoint available for rights holders
--* Contact: dmca@factharbor.org
--
--
--=== 5.2 Cache Invalidation Strategy ===
--
--**Time-Based (Primary):**
--* TTL: 90 days for most claims
--* Reasoning: Evidence freshness, news cycles
--
--**Event-Based (Manual):**
--* Admin can flag claims for invalidation
--* Example: "Major study retracts findings"
--* Tool: {{code}}DELETE /v1/cache/claim/{claim_hash}?reason=retraction{{/code}}
--
--**Version-Based (Automatic):**
--* AKEL v2.0 release → Invalidate all v1.0 caches
--* Cache keys include version: {{code}}claim:v1:*{{/code}} vs {{code}}claim:v2:*{{/code}}
--
--**Long-Lived Historical Claims:**
--* Historical claims about completed events generally have stable verdicts
--* Example: "2024 US presidential election results"
--* **Policy:** Extended TTL (365-3,650 days) instead of "never invalidate"
--* **Reason:** Even historical data gets revisions (updated counts, corrections)
--* **Mechanism:** Admin can still manually invalidate if major correction issued
--* **Flag:** {{code}}is_historical=true{{/code}} in cache metadata → longer TTL
--
--=== 5.3 Cache Warming Strategy ===
--
--**Proactive Cache Building (Future):**
--
--**Trending Topics:**
--* Monitor news APIs for trending topics
--* Pre-analyze top 20 common claims
--* Example: New health study published → Pre-cache related claims
--
--**Predictable Events:**
--* Elections, sporting events, earnings reports
--* Pre-cache expected claims before event
--* Reduces load during traffic spikes
--
--**User Patterns:**
--* Analyze query logs
--* Identify frequently requested claims
--* Prioritize cache warming for these
--
  ---
--== 6. Quality Gates & Validation Rules ==
++## 2. Key Claims Analysis
--=== 6.1 Quality Gate Overview ===
++### [C1] {claim_text}
++* **Role:** {is_central_to_thesis ? "Central to thesis" : "Supporting claim"}
++* **Risk Tier:** {risk_tier} ({risk_tier_justification})
++* **Evaluability:** {evaluability}
--|=Gate|=Name|=POC1 Status|=Applies To|=Notes
--|**Gate 1**|Claim Validation|✅ Hard gate|Stage 1: Extraction|Filters opinions, compound claims
--|**Gate 2**|Contradiction Search|✅ Mandatory rule|Stage 2: Analysis|Enforced per cached claim
--|**Gate 3**|Uncertainty Disclosure|⚠️ Soft guidance|Stage 2: Analysis|Best practice
--|**Gate 4**|Verdict Confidence|✅ Hard gate|Stage 2: Analysis|Confidence ≥ 0.5 required
++**Scenarios Explored:** {scenarios.length}
--**Hard Gate Failures:**
--* Gate 1 fail → Claim excluded from analysis
--* Gate 4 fail → Claim marked "Unsubstantiated" but included
++#### Scenario: {scenario.context_title}
++* **Verdict:** {verdict.label} (Confidence: {verdict.confidence})
++* **Probability Range:** {verdict.probability_range[0]} - {verdict.probability_range[1]}
++* **Reasoning:** {verdict.reasoning}
--=== 6.2 Validation Rules ===
++**Evidence:**
++* Supporting: {evidence.filter(e => e.stance == "supports").length} sources
++* Undermining: {evidence.filter(e => e.stance == "undermines").length} sources
++* Mixed: {evidence.filter(e => e.stance == "mixed").length} sources
--|=Rule|=Requirement
--|**Mandatory Contradiction**|Stage 2 MUST search for "undermines" evidence. If none found, reasoning must state: "No counter-evidence found despite targeted search."
--|**Context-Aware Logic**|Stage 3 must prioritize central claims. If {{code}}is_central_to_thesis=true{{/code}} claim is REFUTED, article cannot be WELL-SUPPORTED.
--|**Cache Consistency**|Cached claims must match current AKEL version. Version mismatch → cache miss.
--|**Author Identification**|All outputs MUST include {{code}}author_type: "AI/AKEL"{{/code}}.
++**Key Evidence:**
++* [{evidence[0].citation.title}]({evidence[0].citation.url}) - {evidence[0].stance}
  ---
--== 7. Deterministic Markdown Template ==
++## 3. Quality Assessment
--Report generation uses **fixed template** (not LLM-generated).
++**Quality Gates:**
++* Gate 1 (Claim Validation): {gate1_claim_validation}
++* Gate 4 (Verdict Confidence): {gate4_verdict_confidence}
++* Overall: {passed_all ? "PASS" : "FAIL"}
--**Cache-Only Mode Template:**
--{{code language="markdown"}}
--# FactHarbor Analysis Report: PARTIAL ANALYSIS
++{if gate_fail_reasons.length > 0}
++**Failed Gates:**
++{gate_fail_reasons.map(r => `* ${r.gate}: ${r.explanation}`)}
++{/if}
--**Job ID:** {job_id} | **Generated:** {timestamp_utc}
--**Mode:** Cache-Only (Free Tier)
--
  ---
--## ⚠️ Partial Analysis Notice
++## 4. Limitations & Disclaimers
--This is a **cache-only analysis** based on previously analyzed claims.
--{cache_coverage_percent}% of claims were available in cache.
++**System Limitations:**
++{limitations.map(l => `* ${l}`)}
--**What's Included:**
--* {claims_cached} of {claims_total} claims analyzed
--* Evidence and verdicts from cache (last updated: {oldest_cache_date})
++**Important Notes:**
++* This analysis is AI-generated and experimental (POC1)
++* Context-aware article verdict is being tested for accuracy
++* Human review recommended for high-risk claims (Tier A)
++* Cost: ${estimated_cost_usd} | Tokens: {input_tokens + output_tokens}
--**What's Missing:**
--* {claims_missing} claims require new analysis
--* Full article holistic assessment unavailable
--* Estimated cost to complete: ${cost_to_complete}
++**Methodology:** FactHarbor uses Claude 3.5 Sonnet to extract claims, generate scenarios, gather evidence (with mandatory contradiction search), and assess logical coherence between claims and article thesis.
--**[Upgrade to Pro]** for complete analysis
--
  ---
--## Cached Claims
++*Generated by FactHarbor POC1-v0.3 | [About FactHarbor](https://factharbor.org)*
++{{/code}}
--### [C1] {claim_text} ✅ From Cache
--* **Cached:** {cached_at} ({cache_age} ago)
--* **Times Used:** {hit_count} articles
--* **Verdict:** {verdict} (Confidence: {confidence})
--* **Evidence:** {evidence_count} sources
++**Target Report Size:** 220-350 words (optimized for 2-minute read)
--[Full claim details...]
--
--### [C3] {claim_text} ⚠️ Not In Cache
--* **Status:** Requires new analysis
--* **Cost:** $0.081
--* **Upgrade to analyze this claim**
--
  ---
--**Powered by FactHarbor POC1-v0.4** | [Upgrade](https://factharbor.org/upgrade)
--{{/code}}
++== 7. LLM Configuration (POC1) ==
-----
++|=Parameter|=Value|=Notes
++|**Provider**|Anthropic|Primary provider for POC1
++|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|Current production model
++|**Future Model**|{{code}}claude-sonnet-4-20250514{{/code}}|When available (architecture supports)
++|**Token Budget**|50K-80K per analysis|Input + output combined (varies by article length)
++|**Estimated Cost**|$0.10-0.30 per article|Based on Sonnet 3.5 pricing ($3/M input, $15/M output)
++|**Prompt Strategy**|Single-pass per stage|Not multi-turn; structured JSON output with schema validation
++|**Chain-of-Thought**|Yes|For verdict reasoning and holistic assessment
++|**Few-Shot Examples**|Yes|For claim extraction and scenario generation
--== 8. LLM Configuration (3-Stage) ==
++=== 7.1 Token Budgets by Stage ===
--=== 8.1 Stage 1: Claim Extraction (Haiku) ===
++|=Stage|=Approximate Output Tokens
++|Claim Extraction|~4,000 (10 claims × ~400 tokens)
++|Scenario Generation|~3,000 per claim (3 scenarios × ~1,000 tokens)
++|Evidence Synthesis|~2,000 per scenario
++|Verdict Generation|~1,000 per scenario
++|Holistic Assessment|~500 (context-aware summary)
--|=Parameter|=Value|=Notes
--|**Model**|{{code}}claude-haiku-4-20250108{{/code}}|Fast, cheap, sufficient for extraction
--|**Input Tokens**|~10K|Article text after URL extraction
--|**Output Tokens**|~500|5 claims @ ~100 tokens each
--|**Cost**|$0.003 per article|($0.25/M input + $1.25/M output)
--|**Temperature**|0.0|Deterministic
--|**Max Tokens**|1000|Generous buffer
++**Total:** 50K-80K tokens per article (input + output)
--**Prompt Strategy:**
--* Extract 5 verifiable factual claims
--* Mark central vs. supporting claims
--* Canonicalize (normalize phrasing)
--* Deduplicate similar claims
--* Output structured JSON only
++=== 7.2 API Integration ===
--=== 8.2 Stage 2: Claim Analysis (Sonnet, CACHED) ===
++**Anthropic Messages API:**
++* Endpoint: {{code}}https://api.anthropic.com/v1/messages{{/code}}
++* Authentication: API key via {{code}}x-api-key{{/code}} header
++* Model parameter: {{code}}"model": "claude-3-5-sonnet-20241022"{{/code}}
++* Max tokens: {{code}}"max_tokens": 4096{{/code}} (per stage)
--|=Parameter|=Value|=Notes
--|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|High quality for verdicts
--|**Input Tokens**|~2K|Single claim + prompt + context
--|**Output Tokens**|~5K|2 scenarios × ~2.5K tokens
--|**Cost**|$0.081 per NEW claim|($3/M input + $15/M output)
--|**Temperature**|0.0|Deterministic (cache consistency)
--|**Max Tokens**|8000|Sufficient for 2 scenarios
--|**Cache Strategy**|Redis, 90-day TTL|Key: {{code}}claim:v1norm1:{language}:{sha256(canonical_claim)}{{/code}}
++**No LangChain/LangGraph needed** for POC1 simplicity - direct SDK calls suffice.
--**Prompt Strategy:**
--* Generate 2 scenario interpretations
--* Search for supporting AND undermining evidence (mandatory)
--* 6 evidence items per scenario maximum
--* Compute verdict with reasoning chain (3-4 bullets)
--* Output structured JSON only
++---
--**Output Constraints (Cost Control):**
--* Scenarios: Max 2 per claim
--* Evidence: Max 6 per scenario
--* Evidence summary: Max 3 bullets
--* Reasoning chain: Max 4 bullets
++== 8. Cross-References (xWiki) ==
--=== 8.3 Stage 3: Holistic Assessment (Sonnet) ===
++This API specification implements requirements from:
--|=Parameter|=Value|=Notes
--|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|Context-aware analysis
--|**Input Tokens**|~5K|Article + claim verdicts
--|**Output Tokens**|~1K|Article verdict + fallacies
--|**Cost**|$0.030 per article|($3/M input + $15/M output)
--|**Temperature**|0.0|Deterministic
--|**Max Tokens**|2000|Sufficient for assessment
++* **[[POC Requirements>>Test.FactHarbor.Specification.POC.Requirements]]**
++** FR-POC-1 through FR-POC-6 (POC1-specific functional requirements)
++** NFR-POC-1 through NFR-POC-3 (quality gates lite: Gates 1 & 4 only)
++** Section 2.1: Analysis Summary (Context-Aware) component specification
++** Section 10.3: Prompt structure for claim extraction and verdict synthesis
--**Prompt Strategy:**
--* Detect main thesis
--* Evaluate logical coherence (claim verdicts → thesis)
--* Identify fallacies (correlation-causation, cherry-picking, etc.)
--* Compute logic_quality_score
--* Explain article verdict reasoning (3-4 bullets)
--* Output structured JSON only
++* **[[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]**
++** Complete investigation of 7 approaches to article-level verdicts
++** Approach 1 (Single-Pass Holistic Analysis) chosen for POC1
++** Experimental feature testing plan (30 articles, ≥70% accuracy target)
++** Decision framework for POC2 implementation
--=== 8.4 Cost Projections by Cache Hit Rate ===
++* **[[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]**
++** FR4 (Analysis Summary) - enhanced with context-aware capability
++** FR7 (Verdict Calculation) - probability ranges + confidence scores
++** NFR11 (Quality Gates) - POC1 implements Gates 1 & 4; Gates 2 & 3 in POC2
--|=Cache Hit Rate|=Cost per Article|=10K Articles Cost|=100K Articles Cost
--|0% (cold start)|$0.438|$4,380|$43,800
--|20%|$0.357|$3,570|$35,700
--|40%|$0.276|$2,760|$27,600
--|**60%**|**$0.195**|**$1,950**|**$19,500**
--|**70%** (target)|**$0.155**|**$1,550**|**$15,500**
--|**80%**|**$0.114**|**$1,140**|**$11,400**
--|**90%**|**$0.073**|**$730**|**$7,300**
--|95%|$0.053|$530|$5,300
++* **[[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]]**
++** POC1 simplified architecture (stateless, single AKEL orchestration call)
++** Data persistence minimized (job outputs only, no database required)
++** Deferred complexity (no Elasticsearch, TimescaleDB, Federation until metrics justify)
--**Break-Even Analysis:**
--* Monolithic (v0.3.1): $0.15 per article constant
--* 3-stage breaks even at **70% cache hit rate**
--* Expected after ~1,500 articles in same domain
++* **[[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]]**
++** Evidence structure (source, stance, reliability rating)
++** Scenario boundaries (time, geography, population, conditions)
++** Claim types and evaluability taxonomy
++** Source Track Record System (Section 1.3) - temporal separation
++* **[[Requirements Roadmap Matrix>>Test.FactHarbor.Roadmap.Requirements-Roadmap-Matrix.WebHome]]**
++** POC1 requirement mappings and phase assignments
++** Context-aware analysis as POC1 experimental feature
++** POC2 enhancement path (Gates 2 & 3, evidence deduplication)
++
  ---
--== 9. Implementation Notes ==
++== 9. Implementation Notes (POC1) ==
  === 9.1 Recommended Tech Stack ===
--* **Framework:** Next.js 14+ with App Router (TypeScript)
--* **Cache:** Redis 7.0+ (managed: AWS ElastiCache, Redis Cloud, Upstash)
--* **Storage:** Filesystem JSON for jobs + S3/R2 for archival
--* **Queue:** BullMQ with Redis (for 3-stage pipeline orchestration)
--* **LLM Client:** Anthropic Python SDK or TypeScript SDK
--* **Cost Tracking:** PostgreSQL for user credit ledger
--* **Deployment:** Vercel (frontend + API) + Redis Cloud
++* **Framework:** Next.js 14+ with App Router (TypeScript) - Full-stack in one codebase
++* **Rationale:** API routes + React UI unified, Vercel deployment-ready, similar to C# in structure
++* **Storage:** Filesystem JSON files (no database needed for POC1)
++* **Queue:** In-memory queue or Redis (optional for concurrency)
++* **URL Extraction:** Jina AI Reader API (primary), trafilatura (fallback)
++* **Deployment:** Vercel, AWS Lambda, or similar serverless
--=== 9.2 3-Stage Pipeline Implementation ===
++=== 9.2 POC1 Simplifications ===
--**Job Queue Flow (Conceptual):**
++* **No database required:** Job metadata + outputs stored as JSON files ({{code}}jobs/{job_id}.json{{/code}}, {{code}}results/{job_id}.json{{/code}})
++* **No user authentication:** Optional API key validation only (env var: {{code}}FACTHARBOR_API_KEY{{/code}})
++* **Single-instance deployment:** No distributed processing, no worker pools
++* **Synchronous LLM calls:** No streaming in POC1 (entire response before returning)
++* **Job retention:** 24 hours default (configurable: {{code}}JOB_RETENTION_HOURS{{/code}})
++* **Rate limiting:** Simple IP-based (optional) - no complex billing
--{{code language="typescript"}}
--// Stage 1: Extract Claims
--const stage1Job = await queue.add('stage1-extract-claims', {
--  jobId: 'job123',
--  articleUrl: 'https://example.com/article'
--});
++=== 9.3 Estimated Costs (Per Analysis) ===
--// On Stage 1 completion → enqueue Stage 2 jobs
--stage1Job.on('completed', async (result) => {
--  const { claims } = result;
--
--  // Stage 2: Analyze each claim (with cache check)
--  const stage2Jobs = await Promise.all(
--    claims.map(claim =>
--      queue.add('stage2-analyze-claim', {
--        jobId: 'job123',
--        claimId: claim.claim_id,
--        canonicalClaim: claim.canonical_claim,
--        checkCache: true
--      })
--    )
--  );
--
--  // On all Stage 2 completions → enqueue Stage 3
--  await Promise.all(stage2Jobs.map(j => j.waitUntilFinished()));
--
--  const claimVerdicts = await gatherStage2Results('job123');
--
--  await queue.add('stage3-holistic', {
--    jobId: 'job123',
--    articleUrl: 'https://example.com/article',
--    claimVerdicts: claimVerdicts
--  });
--});
--{{/code}}
++**LLM API costs (Claude 3.5 Sonnet):**
++* Input: $3.00 per million tokens
++* Output: $15.00 per million tokens
++* **Per article:** $0.10-0.30 (varies by length, 5-10 claims typical)
--**Note:** This is a conceptual sketch. Actual implementation may use BullMQ Flow API or custom orchestration.
++**Web search costs (optional):**
++* Using external search API (Tavily, Brave): $0.01-0.05 per analysis
++* POC1 can use free search APIs initially
--**Cache Check Logic:**
--{{code language="typescript"}}
--async function analyzeClaimWithCache(claim: string): Promise<ClaimAnalysis> {
--  const canonicalClaim = normalizeClaim(claim);
--  const claimHash = sha256(canonicalClaim);
--  const cacheKey = `claim:v1:${claimHash}`;
--
--  // Check cache
--  const cached = await redis.get(cacheKey);
--  if (cached) {
--    await redis.incr(`claim:stats:hit_count:${claimHash}`);
--    return JSON.parse(cached);
--  }
--
--  // Cache miss - analyze with LLM
--  const analysis = await analyzeClaim_Stage2(canonicalClaim);
--
--  // Store in cache
--  await redis.set(cacheKey, JSON.stringify(analysis), 'EX', 7776000); // 90 days
--
--  return analysis;
--}
--{{/code}}
++**Infrastructure costs:**
++* Vercel hobby tier: Free for POC
++* AWS Lambda: ~$0.001 per request
++* **Total infra:** <$0.01 per analysis
--=== 9.3 User Credit Management ===
++**Total estimated cost:** ~$0.15-0.35 per analysis ✅ Meets <$0.35 target
--**PostgreSQL Schema:**
--{{code language="sql"}}
--CREATE TABLE user_credits (
--  user_id UUID PRIMARY KEY,
--  tier VARCHAR(20) DEFAULT 'free',
--  credit_limit DECIMAL(10,2) DEFAULT 10.00,
--  credit_used DECIMAL(10,2) DEFAULT 0.00,
--  reset_date TIMESTAMP,
--  cache_only_mode BOOLEAN DEFAULT false,
--  created_at TIMESTAMP DEFAULT NOW()
--);
++=== 9.4 Estimated Timeline (AI-Assisted) ===
--CREATE TABLE usage_log (
--  id SERIAL PRIMARY KEY,
--  user_id UUID REFERENCES user_credits(user_id),
--  job_id VARCHAR(50),
--  stage VARCHAR(20),
--  cost DECIMAL(10,4),
--  cache_hit BOOLEAN,
--  created_at TIMESTAMP DEFAULT NOW()
--);
--{{/code}}
++**With Cursor IDE + Claude API:**
++* Day 1-2: API scaffolding + job queue
++* Day 3-4: LLM integration + prompt engineering
++* Day 5-6: Evidence retrieval + contradiction search
++* Day 7: Report templates + testing with 30 articles
++* **Total:** 5-7 days for working POC1
--**Credit Deduction Logic:**
--{{code language="typescript"}}
--async function deductCredit(userId: string, cost: number): Promise<boolean> {
--  const user = await db.query('SELECT * FROM user_credits WHERE user_id = $1', [userId]);
--
--  const newUsed = user.credit_used + cost;
--
--  if (newUsed > user.credit_limit && user.tier === 'free') {
--    // Trigger cache-only mode
--    await db.query(
--      'UPDATE user_credits SET cache_only_mode = true WHERE user_id = $1',
--      [userId]
--    );
--    throw new Error('CREDIT_LIMIT_REACHED');
--  }
--
--  await db.query(
--    'UPDATE user_credits SET credit_used = $1 WHERE user_id = $2',
--    [newUsed, userId]
--  );
--
--  return true;
--}
--{{/code}}
++**Manual coding (no AI assistance):**
++* Estimate: 15-20 days
--=== 9.4 Cache-Only Mode Implementation ===
++=== 9.5 First Prompt for AI Code Generation ===
--**Middleware:**
--{{code language="typescript"}}
--async function checkCacheOnlyMode(req, res, next) {
--  const user = await getUserCredit(req.userId);
--
--  if (user.cache_only_mode) {
--    // Allow only cache reads
--    if (req.body.options?.cache_preference !== 'allow_partial') {
--      return res.status(402).json({
--        error: 'credit_limit_reached',
--        message: 'Resubmit with cache_preference=allow_partial',
--        cache_only_mode: true
--      });
--    }
--
--    // Modify request to skip Stage 2 for uncached claims
--    req.cacheOnlyMode = true;
--  }
--
--  next();
--}
--{{/code}}
++{{code}}
++Based on the FactHarbor POC1 API & Schemas Specification (v0.3), generate a Next.js 14 TypeScript application with:
--=== 9.5 Estimated Timeline ===
++1. API routes implementing the 7 endpoints specified in Section 3
++2. AnalyzeRequest/AnalysisResult types matching schemas in Sections 4-5
++3. Anthropic Claude 3.5 Sonnet integration for:
++   - Claim extraction (with central/supporting marking)
++   - Scenario generation
++   - Evidence synthesis (with mandatory contradiction search)
++   - Verdict generation
++   - Holistic assessment (article-level credibility)
++4. Job-based async execution with progress tracking (7 pipeline stages)
++5. Quality Gates 1 & 4 from NFR11 implementation
++6. Mandatory contradiction search enforcement (Section 5)
++7. Context-aware analysis (experimental) as specified
++8. Filesystem-based job storage (no database)
++9. Markdown report generation from JSON templates (Section 6)
--**POC1 with 3-Stage Architecture:**
--* Week 1: Stage 1 (Haiku extraction) + Redis setup
--* Week 2: Stage 2 (Sonnet analysis + caching)
--* Week 3: Stage 3 (Holistic assessment) + pipeline orchestration
--* Week 4: User credit system + cache-only mode
--* Week 5: Testing with 100 articles (measure cache hit rate)
--* Week 6: Optimization + bug fixes
--* **Total: 6-8 weeks**
++Use the validation rules from Section 5 and error codes from Section 2.1.1.
++Target: <$0.35 per analysis, <2 minutes processing time.
++{{/code}}
--**Manual coding:** 12-16 weeks
--
  ---
--== 10. Testing Strategy ==
++== 10. Testing Strategy (POC1) ==
--=== 10.1 Cache Performance Testing ===
++=== 10.1 Test Dataset (30 Articles) ===
--**Test Scenarios:**
++**Category 1: Straightforward Factual (10 articles)**
++* Purpose: Baseline accuracy
++* Example: "WHO report on global vaccination rates"
++* Expected: High claim accuracy, straightforward verdict
--**Scenario 1: Cold Start (0 cache)**
--* Analyze 100 diverse articles
--* Measure: Cost per article, cache growth rate
--* Expected: $0.35-0.40 avg, ~400 unique claims cached
++**Category 2: Accurate Claims, Questionable Conclusions (10 articles)** ⭐ **Context-Aware Test**
++* Purpose: Test holistic assessment capability
++* Example: "Coffee cures cancer" (true premises, false conclusion)
++* Expected: Individual claims TRUE, article verdict MISLEADING
--**Scenario 2: Warm Cache (Overlapping Domain)**
--* Analyze 100 articles on SAME topic (e.g., "2024 election")
--* Measure: Cache hit rate growth
--* Expected: Hit rate 20% → 60% by article 100
++**Category 3: Mixed Accuracy (5 articles)**
++* Purpose: Test nuance handling
++* Example: Articles with some true, some false claims
++* Expected: Scenario-level differentiation
--**Scenario 3: Mature Cache (1,000 articles)**
--* Analyze next 100 articles (diverse topics)
--* Measure: Steady-state cache hit rate
--* Expected: 60-70% hit rate, $0.15-0.18 avg cost
++**Category 4: Low-Quality Claims (5 articles)**
++* Purpose: Test quality gates
++* Example: Opinion pieces, compound claims
++* Expected: Gate 1 failures, rejection or draft-only mode
--**Scenario 4: Cache-Only Mode**
--* Free user reaches $10 limit (67 articles at 70% hit rate)
--* Submit 10 more articles with {{code}}cache_preference=allow_partial{{/code}}
--* Measure: Coverage %, user satisfaction
--* Expected: 60-70% coverage, instant results
--
  === 10.2 Success Metrics ===
--**Cache Performance:**
--* Week 1: 5-10% hit rate
--* Week 2: 15-25% hit rate
--* Week 3: 30-40% hit rate
--* Week 4: 45-55% hit rate
--* Target: ≥50% by 1,000 articles
--
--**Cost Targets:**
--* Articles 1-100: $0.35-0.40 avg ⚠️ (expected)
--* Articles 100-500: $0.25-0.30 avg
--* Articles 500-1,000: $0.18-0.22 avg
--* Articles 1,000+: $0.12-0.15 avg ✅
--
--**Quality Metrics (same as v0.3.1):**
--* Hallucination rate: <5%
--* Context-aware accuracy: ≥70%
++**Quality Metrics:**
++* Hallucination rate: <5% (target: <3%)
++* Context-aware accuracy: ≥70% (experimental - key POC1 goal)
  * False positive rate: <15%
  * Mandatory contradiction search: 100% compliance
--=== 10.3 Free Tier Economics Validation ===
++**Performance Metrics:**
++* Processing time: <2 minutes per article (standard depth)
++* Cost per analysis: <$0.35
++* API uptime: >99%
++* LLM API error rate: <1%
--**Test with simulated 1,000 users:**
--* Each user: $10 credit
--* 70% cache hit rate
--* Avg 70 articles/user/month
++**See:** [[POC1 Roadmap>>Test.FactHarbor.Roadmap.POC1.WebHome]] Section 11 for complete success criteria and testing methodology.
--**Projected Costs:**
--* Total credits: 1,000 × $10 = $10,000
--* Actual LLM costs: ~$9,000 (cache savings)
--* Margin: 10%
--
--**Sustainability Check:**
--* If margin <5% → Reduce free tier limit
--* If margin >20% → Consider increasing free tier
--
  ---
--== 11. Cross-References ==
++**End of Specification - FactHarbor POC1 API v0.3**
--This API specification implements requirements from:
++**Ready for xWiki import and AI-assisted implementation!** 🚀
--* **[[POC Requirements>>Test.FactHarbor.Specification.POC.Requirements]]**
--** FR-POC-1 through FR-POC-6 (3-stage architecture)
--** NFR-POC-1 through NFR-POC-3 (quality gates, caching)
--** NEW: FR-POC-7 (Claim-level caching)
--** NEW: FR-POC-8 (User credit system)
--** NEW: FR-POC-9 (Cache-only mode)
--
--* **[[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]**
--** Approach 1 implemented in Stage 3
--** Context-aware holistic assessment
--
--* **[[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]**
--** FR4 (Analysis Summary) - enhanced with caching
--** FR7 (Verdict Calculation) - cached per claim
--** NFR11 (Quality Gates) - enforced across stages
--** NEW: NFR19 (Cost Efficiency via Caching)
--** NEW: NFR20 (Free Tier Sustainability)
--
--* **[[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]]**
--** POC1 3-stage pipeline architecture
--** Redis cache layer
--** User credit system
--
--* **[[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]]**
--** Claim structure (cacheable unit)
--** Evidence structure
--** Scenario boundaries
--
-----
--
--**End of Specification - FactHarbor POC1 API v0.4**
--
--**3-stage caching architecture with free tier cache-only mode. Ready for sustainable, scalable implementation!** 🚀
--

Changes for page POC1 API & Schemas Specification

Summary

Details

Applications

Navigation

Need help?