Changes for page POC1 API & Schemas Specification

Last modified by Robert Schaub on 2025/12/24 18:26

From 5.1 to 5.2

From version 1.1

edited by Robert Schaub
on 2025/12/24 11:54

Change comment: Imported from XAR

To version 5.1

edited by Robert Schaub
on 2025/12/24 17:59

Change comment: Imported from XAR

Summary

Page properties (1 modified, 0 added, 0 removed)

Details

Page properties

Content

@@ -1,673 +1,904 @@
--# FactHarbor POC1 — API & Schemas Specification
++= POC1 API & Schemas Specification =
--**Version:** 0.3 (POC1 - Production Ready)
--**Namespace:** FactHarbor.*
--**Syntax:** xWiki 2.1
--**Last Updated:** 2025-12-24
++----
-----
--
  == Version History ==
  |=Version|=Date|=Changes
--|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details, quality gate logging, temporal separation note, cross-references
--|0.2|2025-12-24|Initial rebased version with holistic assessment
--|0.1|2025-12-24|Original specification
++|0.4.1|2025-12-24|Applied 9 critical fixes: file format notice, verdict taxonomy, canonicalization algorithm, Stage 1 cost policy, BullMQ fix, language in cache key, historical claims TTL, idempotency, copyright policy
++|0.4|2025-12-24|**BREAKING:** 3-stage pipeline with claim-level caching, user tier system, cache-only mode for free users, Redis cache architecture
++|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints
++|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details
-----
++----
  == 1. Core Objective (POC1) ==
--The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)**:
++The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability.
--The system must prove that AI can identify an article's **Main Thesis** and determine if the supporting claims (even if individually accurate) logically support that thesis without committing fallacies (e.g., correlation vs. causation, cherry-picking, hasty generalization).
++The system must prove that AI can identify an article's **Main Thesis** and determine if supporting claims logically support that thesis without committing fallacies.
--**Success Criteria:**
++=== Success Criteria: ===
++
  * Test with 30 diverse articles
  * Target: ≥70% accuracy detecting misleading articles
--* Cost: <$0.35 per analysis
++* Cost: <$0.25 per NEW analysis (uncached)
++* Cost: $0.00 for cached claim reuse
++* Cache hit rate: ≥50% after 1,000 articles
  * Processing time: <2 minutes (standard depth)
--**See:** [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]] for complete investigation of 7 approaches.
++=== Economic Model: ===
-----
++* **Free tier:** $10 credit per month (~~40-140 articles depending on cache hits)
++* **After limit:** Cache-only mode (instant, free access to cached claims)
++* **Paid tier:** Unlimited new analyses
--== 2. Runtime Model & Job States ==
++----
--=== 2.1 Pipeline Steps ===
++== 2. Architecture Overview ==
--For progress reporting via API, the pipeline follows these stages:
++=== 2.1 3-Stage Pipeline with Caching ===
--# **INGEST**: URL scraping (Jina Reader / Trafilatura) or text normalization.
--# **EXTRACT_CLAIMS**: Identifying 3-5 verifiable factual claims + marking central vs. supporting.
--# **SCENARIOS**: Generating context interpretations for each claim.
--# **RETRIEVAL**: Evidence gathering (Search API + mandatory contradiction search).
--# **VERDICTS**: Assigning likelihoods, confidence, and uncertainty per scenario.
--# **HOLISTIC_ASSESSMENT**: Evaluating article-level credibility (Thesis vs. Claims logic).
--# **REPORT**: Generating final Markdown and JSON outputs.
++FactHarbor POC1 uses a **3-stage architecture** designed for claim-level caching and cost efficiency:
--=== 2.1.1 URL Extraction Strategy ===
++{{mermaid}}
++graph TD
++ A[Article Input] --> B[Stage 1: Extract Claims]
++ B --> C{For Each Claim}
++ C --> D[Check Cache]
++ D -->|Cache HIT| E[Return Cached Verdict]
++ D -->|Cache MISS| F[Stage 2: Analyze Claim]
++ F --> G[Store in Cache]
++ G --> E
++ E --> H[Stage 3: Holistic Assessment]
++ H --> I[Final Report]
++{{/mermaid}}
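The flow in the diagram can be sketched as follows. This is a minimal sketch under stated assumptions, not the FactHarbor implementation: `runPipeline`, `Verdict`, `Stages`, and the three injected callbacks are hypothetical names, an in-memory `Map` stands in for the Redis cache, and the stages are kept synchronous for brevity even though real LLM calls would be asynchronous.

```typescript
// Hypothetical types and names for illustration; not part of the specification.
interface Verdict { claim: string; label: string }

interface Stages {
  extractClaims: (article: string) => string[];                          // Stage 1 (not cached)
  analyzeClaim: (claim: string) => Verdict;                              // Stage 2 (cached)
  assessHolistically: (article: string, verdicts: Verdict[]) => string;  // Stage 3 (not cached)
}

interface PipelineResult { report: string; cacheHits: number; newClaims: number }

function runPipeline(article: string, stages: Stages, cache: Map<string, Verdict>): PipelineResult {
  let cacheHits = 0;
  let newClaims = 0;
  const verdicts = stages.extractClaims(article).map((claim) => {
    const cached = cache.get(claim);
    if (cached !== undefined) {
      cacheHits += 1; // Cache HIT: reuse the stored verdict
      return cached;
    }
    const fresh = stages.analyzeClaim(claim); // Cache MISS: run Stage 2
    cache.set(claim, fresh);                  // Store for later articles
    newClaims += 1;
    return fresh;
  });
  // Stage 3 always runs on the full set of verdicts, cached or new.
  return { report: stages.assessHolistically(article, verdicts), cacheHits, newClaims };
}
```

Only `analyzeClaim` is skipped on a cache hit; extraction and the holistic assessment run once per article, which matches the "no cache" labels on Stages 1 and 3.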
--**Primary:** Jina AI Reader ({{code}}https://r.jina.ai/{url}{{/code}})
--* **Rationale:** Clean markdown, handles JS rendering, free tier sufficient
--* **Fallback:** Trafilatura (Python library) for simple static HTML
++==== Stage 1: Claim Extraction (Haiku, no cache) ====
--**Error Handling:**
++* **Input:** Article text
++* **Output:** 5 canonical claims (normalized, deduplicated)
++* **Model:** Claude Haiku 4 (default, configurable via LLM abstraction layer)
++* **Cost:** $0.003 per article
++* **Cache strategy:** No caching (article-specific)
--|=Error Code|=Trigger|=Action
--|{{code}}URL_BLOCKED{{/code}}|403/401/Paywall detected|Return error, suggest text paste
--|{{code}}URL_UNREACHABLE{{/code}}|Network/DNS failure|Retry once, then fail
--|{{code}}URL_NOT_FOUND{{/code}}|404 Not Found|Return error immediately
--|{{code}}EXTRACTION_FAILED{{/code}}|Content <50 words or unreadable|Return error with reason
++==== Stage 2: Claim Analysis (Sonnet, CACHED) ====
--**Supported URL Patterns:**
--* ✅ News articles, blog posts, Wikipedia
--* ✅ Academic preprints (arXiv)
--* ❌ Social media posts (Twitter, Facebook) - not in POC1
--* ❌ Video platforms (YouTube, TikTok) - not in POC1
--* ❌ PDF files - deferred to Beta 0
++* **Input:** Single canonical claim
++* **Output:** Scenarios + Evidence + Verdicts
++* **Model:** Claude Sonnet 3.5 (default, configurable via LLM abstraction layer)
++* **Cost:** $0.081 per NEW claim
++* **Cache strategy:** Redis, 90-day TTL
++* **Cache key:** claim:v1norm1:{language}:{sha256(canonical_claim)}
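A cache key of this shape could be built as in the sketch below. It assumes a Node.js runtime and a claim that has already been canonicalized; `buildClaimCacheKey` and the constant names are hypothetical, and the canonicalization step itself is not shown.

```typescript
import { createHash } from "node:crypto";

// Key prefix and TTL are taken from the Stage 2 bullet list.
const CACHE_KEY_PREFIX = "claim:v1norm1";
const CACHE_TTL_SECONDS = 90 * 24 * 60 * 60; // 90 days

// Assumes `canonicalClaim` is already normalized; this function only hashes it.
function buildClaimCacheKey(language: string, canonicalClaim: string): string {
  const digest = createHash("sha256").update(canonicalClaim, "utf8").digest("hex");
  return `${CACHE_KEY_PREFIX}:${language}:${digest}`;
}
```

Because the language code is part of the key, the same canonical text in two languages maps to two separate cache entries.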
--=== 2.2 Job Status Enumeration ===
++==== Stage 3: Holistic Assessment (Sonnet, no cache) ====
--(((
--* **QUEUED** - Job accepted, waiting in queue
--* **RUNNING** - Processing in progress
--* **SUCCEEDED** - Analysis complete, results available
--* **FAILED** - Error occurred, see error details
--* **CANCELLED** - User cancelled via DELETE endpoint
--)))
++* **Input:** Article + Claim verdicts (from cache or Stage 2)
++* **Output:** Article verdict + Fallacies + Logic quality
++* **Model:** Claude Sonnet 3.5 (default, configurable via LLM abstraction layer)
++* **Cost:** $0.030 per article
++* **Cache strategy:** No caching (article-specific)
-----
--== 3. REST API Contract ==
--=== 3.1 Create Analysis Job ===
++**Note:** Stage 3 implements **Approach 1 (Single-Pass Holistic Analysis)** from the [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]. While claim analysis (Stage 2) is cached for efficiency, the holistic assessment maintains the integrated evaluation philosophy of Approach 1.
--**Endpoint:** {{code}}POST /v1/analyze{{/code}}
++=== Total Cost Formula: ===
--**Request Body Example:**
--{{code language="json"}}
--{
--  "input_type": "url",
--  "input_url": "https://example.com/medical-report-01",
--  "input_text": null,
--  "options": {
--    "browsing": "on",
--    "depth": "standard",
--    "max_claims": 5,
--    "context_aware_analysis": true
--  },
--  "client": {
--    "request_id": "optional-client-tracking-id",
--    "source_label": "optional"
--  }
--}
--{{/code}}
++{{{Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)
--**Options:**
--* {{code}}browsing{{/code}}: {{code}}on{{/code}} | {{code}}off{{/code}} (retrieve web sources or just output queries)
--* {{code}}depth{{/code}}: {{code}}standard{{/code}} | {{code}}deep{{/code}} (evidence thoroughness)
--* {{code}}max_claims{{/code}}: 1-50 (default: 10)
--* {{code}}context_aware_analysis{{/code}}: {{code}}true{{/code}} | {{code}}false{{/code}} (experimental)
++Examples:
++- 0 new claims (100% cache hit): $0.033
++- 1 new claim (80% cache hit): $0.114
++- 3 new claims (40% cache hit): $0.276
++- 5 new claims (0% cache hit): $0.438
++}}}
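The formula and its examples can be checked with a small helper. This is an illustrative sketch; `estimateArticleCostUsd` and `cacheHitRate` are hypothetical names, and costs are summed in tenths of a cent to avoid floating-point drift.

```typescript
// Per-stage costs from the formula, in tenths of a cent.
const EXTRACTION_MILLS = 3; // $0.003 per article (Stage 1)
const NEW_CLAIM_MILLS = 81; // $0.081 per new claim (Stage 2)
const HOLISTIC_MILLS = 30;  // $0.030 per article (Stage 3)

function estimateArticleCostUsd(newClaims: number): number {
  if (!Number.isInteger(newClaims) || newClaims < 0) {
    throw new RangeError("newClaims must be a non-negative integer");
  }
  return (EXTRACTION_MILLS + newClaims * NEW_CLAIM_MILLS + HOLISTIC_MILLS) / 1000;
}

// Share of an article's claims that were served from cache.
function cacheHitRate(totalClaims: number, newClaims: number): number {
  return totalClaims === 0 ? 0 : (totalClaims - newClaims) / totalClaims;
}
```

For a five-claim article, `estimateArticleCostUsd(3)` gives 0.276 and `cacheHitRate(5, 3)` gives 0.4, matching the third example.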
--**Response:** {{code}}202 Accepted{{/code}}
++----
--{{code language="json"}}
--{
--  "job_id": "01J...ULID",
--  "status": "QUEUED",
--  "created_at": "2025-12-24T10:31:00Z",
--  "links": {
--    "self": "/v1/jobs/01J...ULID",
--    "result": "/v1/jobs/01J...ULID/result",
--    "report": "/v1/jobs/01J...ULID/report",
--    "events": "/v1/jobs/01J...ULID/events"
--  }
--}
--{{/code}}
++=== 2.2 User Tier System ===
-----
++|=Tier|=Monthly Credit|=After Limit|=Cache Access|=Analytics
++|**Free**|$10|Cache-only mode|✅ Full|Basic
++|**Pro** (future)|$50|Continues|✅ Full|Advanced
++|**Enterprise** (future)|Custom|Continues|✅ Full + Priority|Full
--=== 3.2 Get Job Status ===
++**Free Tier Economics:**
--**Endpoint:** {{code}}GET /v1/jobs/{job_id}{{/code}}
++* $10 credit = 40-140 articles analyzed (depending on cache hit rate)
++* Average 70 articles/month at 70% cache hit rate
++* After limit: Cache-only mode
--**Response:** {{code}}200 OK{{/code}}
++----
--{{code language="json"}}
--{
--  "job_id": "01J...ULID",
--  "status": "RUNNING",
--  "created_at": "2025-12-24T10:31:00Z",
--  "updated_at": "2025-12-24T10:31:22Z",
--  "progress": {
--    "step": "RETRIEVAL",
--    "percent": 60,
--    "message": "Gathering evidence for C2-S1",
--    "current_claim_id": "C2",
--    "current_scenario_id": "C2-S1"
--  },
--  "input_echo": {
--    "input_type": "url",
--    "input_url": "https://example.com/medical-report-01"
--  },
--  "links": {
--    "self": "/v1/jobs/01J...ULID",
--    "result": "/v1/jobs/01J...ULID/result",
--    "report": "/v1/jobs/01J...ULID/report"
--  },
--  "error": null
++=== 2.3 Cache-Only Mode (Free Tier Feature) ===
++
++When free users reach their $10 monthly limit, they enter **Cache-Only Mode**:
++
++==== What Cache-Only Mode Provides: ====
++
++✅ **Claim Extraction (Platform-Funded):**
++
++* Stage 1 extraction runs at $0.003 per article
++* **Cost: Absorbed by platform** (not charged to user credit)
++* Rationale: Extraction is necessary to check cache, and cost is negligible
++* Rate limit: Max 50 extractions/day in cache-only mode (prevents abuse)
++
++✅ **Instant Access to Cached Claims:**
++
++* Any claim that exists in cache → Full verdict returned
++* Cost: $0 (no LLM calls)
++* Response time: <100ms
++
++✅ **Partial Article Analysis:**
++
++* Check each claim against cache
++* Return verdicts for ALL cached claims
++* For uncached claims: Return "status": "cache_miss"
++
++✅ **Cache Coverage Report:**
++
++* "3 of 5 claims available in cache (60% coverage)"
++* Links to cached analyses
++* Estimated cost to complete: $0.162 (2 new claims)
++
++❌ **Not Available in Cache-Only Mode:**
++
++* New claim analysis (Stage 2 LLM calls blocked)
++* Full holistic assessment (Stage 3 blocked if any claims missing)
++
++==== User Experience Example: ====
++
++{{{{
++ "status": "cache_only_mode",
++ "message": "Monthly credit limit reached. Showing cached results only.",
++ "cache_coverage": {
++ "claims_total": 5,
++ "claims_cached": 3,
++ "claims_missing": 2,
++ "coverage_percent": 60
++ },
++ "cached_claims": [
++ {"claim_id": "C1", "verdict": "Likely", "confidence": 0.82},
++ {"claim_id": "C2", "verdict": "Highly Likely", "confidence": 0.91},
++ {"claim_id": "C4", "verdict": "Unclear", "confidence": 0.55}
++ ],
++ "missing_claims": [
++ {"claim_id": "C3", "claim_text": "...", "estimated_cost": "$0.081"},
++ {"claim_id": "C5", "claim_text": "...", "estimated_cost": "$0.081"}
++ ],
++ "upgrade_options": {
++ "top_up": "$5 for 20-70 more articles",
++ "pro_tier": "$50/month unlimited"
++ }
  }
--{{/code}}
++}}}
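The coverage figures in this response could be derived as in the sketch below. It is an assumption-laden illustration: `ClaimLookup` and `buildCacheCoverage` are hypothetical names, and the output is reduced to claim IDs rather than the full claim objects shown in the example.

```typescript
// Hypothetical input shape: one entry per extracted claim, with the cached
// verdict attached when the cache lookup succeeded.
interface ClaimLookup {
  claimId: string;
  cachedVerdict?: { verdict: string; confidence: number };
}

const NEW_CLAIM_COST_USD = 0.081; // Stage 2 cost per uncached claim

function buildCacheCoverage(lookups: ClaimLookup[]) {
  const cached = lookups.filter((l) => l.cachedVerdict !== undefined);
  const missing = lookups.filter((l) => l.cachedVerdict === undefined);
  const total = lookups.length;
  return {
    cache_coverage: {
      claims_total: total,
      claims_cached: cached.length,
      claims_missing: missing.length,
      coverage_percent: total === 0 ? 0 : Math.round((cached.length / total) * 100),
    },
    cached_claim_ids: cached.map((l) => l.claimId),
    missing_claim_ids: missing.map((l) => l.claimId),
    // Rounded to tenths of a cent, e.g. 2 missing claims -> 0.162.
    estimated_cost_to_complete_usd: Math.round(missing.length * NEW_CLAIM_COST_USD * 1000) / 1000,
  };
}
```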
-----
++**Design Rationale:**
--=== 3.3 Get JSON Result ===
++* Free users still get value (cached claims often answer their question)
++* Demonstrates FactHarbor's value (partial results encourage upgrade)
++* Sustainable for the platform (the only cost is the platform-funded $0.003 Stage 1 extraction, capped at 50 extractions per day)
++* Fair to all users (everyone contributes to cache)
--**Endpoint:** {{code}}GET /v1/jobs/{job_id}/result{{/code}}
++----
--**Response:** {{code}}200 OK{{/code}} (Returns the **AnalysisResult** schema - see Section 4)
--**Other Responses:**
--* {{code}}409 Conflict{{/code}} - Job not finished yet
--* {{code}}404 Not Found{{/code}} - Job ID unknown
-----
++== 6. LLM Abstraction Layer ==
--=== 3.4 Download Markdown Report ===
++=== 6.1 Design Principle ===
--**Endpoint:** {{code}}GET /v1/jobs/{job_id}/report{{/code}}
++**FactHarbor uses a provider-agnostic LLM abstraction** to avoid vendor lock-in and to enable:
--**Response:** {{code}}200 OK{{/code}} with {{code}}text/markdown; charset=utf-8{{/code}} content
++* **Provider switching:** Change LLM providers without code changes
++* **Cost optimization:** Use different providers for different stages
++* **Resilience:** Automatic fallback if primary provider fails
++* **Cross-checking:** Compare outputs from multiple providers
++* **A/B testing:** Test new models without deployment changes
--**Headers:**
--* {{code}}Content-Disposition: attachment; filename="factharbor_poc1_{job_id}.md"{{/code}}
++**Implementation:** All LLM calls go through an abstraction layer that routes to configured providers.
--**Other Responses:**
--* {{code}}409 Conflict{{/code}} - Job not finished
--* {{code}}404 Not Found{{/code}} - Job unknown
++----
-----
++=== 6.2 LLM Provider Interface ===
--=== 3.5 Stream Job Events (Optional, Recommended) ===
++**Abstract Interface:**
--**Endpoint:** {{code}}GET /v1/jobs/{job_id}/events{{/code}}
++{{{
++interface LLMProvider {
++  // Core methods
++  complete(prompt: string, options: CompletionOptions): Promise<CompletionResponse>
++  stream(prompt: string, options: CompletionOptions): AsyncIterator<StreamChunk>
++
++  // Provider metadata
++  getName(): string
++  getMaxTokens(): number
++  getCostPer1kTokens(): { input: number, output: number }
++
++  // Health check
++  isAvailable(): Promise<boolean>
++}
--**Response:** Server-Sent Events (SSE) stream
++interface CompletionOptions {
++  model?: string
++  maxTokens?: number
++  temperature?: number
++  stopSequences?: string[]
++  systemPrompt?: string
++}
++}}}
--**Event Types:**
--* {{code}}progress{{/code}} - Progress update
--* {{code}}claim_extracted{{/code}} - Claim identified
--* {{code}}verdict_computed{{/code}} - Scenario verdict complete
--* {{code}}complete{{/code}} - Job finished
--* {{code}}error{{/code}} - Error occurred
++----
-----
++=== 6.3 Supported Providers (POC1) ===
--=== 3.6 Cancel Job ===
++**Primary Provider (Default):**
--**Endpoint:** {{code}}DELETE /v1/jobs/{job_id}{{/code}}
++* **Anthropic Claude API**
++  * Models: Claude Haiku 4, Claude Sonnet 3.5, Claude Opus 4
++  * Used by default in POC1
++  * Best quality for holistic analysis
--Attempts to cancel a queued or running job.
++**Secondary Providers (Future):**
--**Response:** {{code}}200 OK{{/code}} with updated Job object (status: CANCELLED)
++* **OpenAI API**
++  * Models: GPT-4o, GPT-4o-mini
++  * For cost comparison
++
++* **Google Vertex AI**
++  * Models: Gemini 1.5 Pro, Gemini 1.5 Flash
++  * For diversity in evidence gathering
--**Note:** Already-completed jobs cannot be cancelled.
++* **Local Models** (Post-POC)
++  * Models: Llama 3.1, Mistral
++  * For privacy-sensitive deployments
-----
++----
--=== 3.7 Health Check ===
++=== 6.4 Provider Configuration ===
--**Endpoint:** {{code}}GET /v1/health{{/code}}
++**Environment Variables:**
--**Response:** {{code}}200 OK{{/code}}
++{{{
++# Primary provider
++LLM_PRIMARY_PROVIDER=anthropic
++ANTHROPIC_API_KEY=sk-ant-...
--{{code language="json"}}
++# Fallback provider
++LLM_FALLBACK_PROVIDER=openai
++OPENAI_API_KEY=sk-...
++
++# Provider selection per stage
++LLM_STAGE1_PROVIDER=anthropic
++LLM_STAGE1_MODEL=claude-haiku-4
++LLM_STAGE2_PROVIDER=anthropic
++LLM_STAGE2_MODEL=claude-sonnet-3-5
++LLM_STAGE3_PROVIDER=anthropic
++LLM_STAGE3_MODEL=claude-sonnet-3-5
++
++# Cost limits
++LLM_MAX_COST_PER_REQUEST=1.00
++}}}
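These variables could be read into typed configuration as sketched below. The function names are hypothetical, and falling back to `LLM_PRIMARY_PROVIDER` when a stage has no provider of its own is an assumption, not something the specification states. The environment is passed in as a plain object so the logic can be exercised without touching the real process environment.

```typescript
type Env = Record<string, string | undefined>;

interface StageModelConfig { provider: string; model: string }

// Reads LLM_STAGE<n>_PROVIDER and LLM_STAGE<n>_MODEL. Falling back to
// LLM_PRIMARY_PROVIDER is an assumed rule for illustration.
function loadStageConfig(env: Env, stage: 1 | 2 | 3): StageModelConfig {
  const provider = env[`LLM_STAGE${stage}_PROVIDER`] ?? env["LLM_PRIMARY_PROVIDER"];
  const model = env[`LLM_STAGE${stage}_MODEL`];
  if (!provider) throw new Error(`No provider configured for stage ${stage}`);
  if (!model) throw new Error(`No model configured for stage ${stage}`);
  return { provider, model };
}

// Parses LLM_MAX_COST_PER_REQUEST and rejects missing or non-positive values.
function loadMaxCostPerRequest(env: Env): number {
  const raw = env["LLM_MAX_COST_PER_REQUEST"];
  const value = Number(raw);
  if (raw === undefined || !Number.isFinite(value) || value <= 0) {
    throw new Error("LLM_MAX_COST_PER_REQUEST must be a positive number");
  }
  return value;
}
```

In a Node.js service this would typically be called as `loadStageConfig(process.env, 2)` at startup, so a missing variable fails fast rather than at the first LLM call.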
++
++**Database Configuration (Alternative):**
++
++{{{{
  {
--  "status": "ok",
--  "version": "POC1-v0.3",
--  "model": "claude-3-5-sonnet-20241022"
++  "providers": [
++    {
++      "name": "anthropic",
++      "api_key_ref": "vault://anthropic-api-key",
++      "enabled": true,
++      "priority": 1
++    },
++    {
++      "name": "openai",
++      "api_key_ref": "vault://openai-api-key",
++      "enabled": true,
++      "priority": 2
++    }
++  ],
++  "stage_config": {
++    "stage1": {
++      "provider": "anthropic",
++      "model": "claude-haiku-4",
++      "max_tokens": 4096,
++      "temperature": 0.0
++    },
++    "stage2": {
++      "provider": "anthropic",
++      "model": "claude-sonnet-3-5",
++      "max_tokens": 16384,
++      "temperature": 0.3
++    },
++    "stage3": {
++      "provider": "anthropic",
++      "model": "claude-sonnet-3-5",
++      "max_tokens": 8192,
++      "temperature": 0.2
++    }
++  }
  }
--{{/code}}
++}}}
-----
++----
--== 4. AnalysisResult Schema (Context-Aware) ==
++=== 6.5 Stage-Specific Models (POC1 Defaults) ===
--This schema implements the **Context-Aware Analysis** required by the POC1 specification.
++**Stage 1: Claim Extraction**
--{{code language="json"}}
--{
--  "metadata": {
--    "job_id": "string (ULID)",
--    "timestamp_utc": "ISO8601",
--    "engine_version": "POC1-v0.3",
--    "llm_provider": "anthropic",
--    "llm_model": "claude-3-5-sonnet-20241022",
--    "usage_stats": {
--      "input_tokens": "integer",
--      "output_tokens": "integer",
--      "estimated_cost_usd": "float",
--      "response_time_sec": "float"
++* **Default:** Anthropic Claude Haiku 4
++* **Alternative:** OpenAI GPT-4o-mini, Google Gemini 1.5 Flash
++* **Rationale:** Fast, cheap, simple task
++* **Cost:** ~$0.003 per article
++
++**Stage 2: Claim Analysis** (CACHEABLE)
++
++* **Default:** Anthropic Claude Sonnet 3.5
++* **Alternative:** OpenAI GPT-4o, Google Gemini 1.5 Pro
++* **Rationale:** High-quality analysis, cached 90 days
++* **Cost:** ~$0.081 per NEW claim
++
++**Stage 3: Holistic Assessment**
++
++* **Default:** Anthropic Claude Sonnet 3.5
++* **Alternative:** OpenAI GPT-4o, Claude Opus 4 (for high-stakes)
++* **Rationale:** Complex reasoning, logical fallacy detection
++* **Cost:** ~$0.030 per article
++
++**Cost Comparison (Example):**
++
++|=Stage|=Anthropic (Default)|=OpenAI Alternative|=Google Alternative
++|Stage 1|Claude Haiku 4 ($0.003)|GPT-4o-mini ($0.002)|Gemini Flash ($0.002)
++|Stage 2|Claude Sonnet 3.5 ($0.081)|GPT-4o ($0.045)|Gemini Pro ($0.050)
++|Stage 3|Claude Sonnet 3.5 ($0.030)|GPT-4o ($0.018)|Gemini Pro ($0.020)
++|**Total (1 new claim)**|**$0.114**|**$0.065**|**$0.072**
++
++**Note:** POC1 uses Anthropic exclusively for consistency. Multi-provider support is planned for POC2.
++
++----
++
++=== 6.6 Failover Strategy ===
++
++**Automatic Failover:**
++
++{{{
++// Tries the stage's primary provider first; on a rate-limit or availability
++// error, retries once with the configured fallback provider.
++async function completeLLM(stage: string, prompt: string, options: CompletionOptions = {}): Promise<string> {
++  const primaryProvider = getProviderForStage(stage)
++  const fallbackProvider = getFallbackProvider()
++
++  try {
++    return await primaryProvider.complete(prompt, options)
++  } catch (error) {
++    if (error.type === 'rate_limit' || error.type === 'service_unavailable') {
++      logger.warn(`Primary provider failed for ${stage}, using fallback`)
++      return await fallbackProvider.complete(prompt, options)
      }
++    throw error
++  }
++}
++}}}
++
++**Fallback Priority:**
++
++1. **Primary:** Configured provider for stage
++2. **Secondary:** Fallback provider (if configured)
++3. **Cache:** Return cached result (if available for Stage 2)
++4. **Error:** Return 503 Service Unavailable
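The four-step priority list can be expressed as an ordered chain of attempts. The sketch below is illustrative only: `Attempt`, `firstSuccessful`, and `ServiceUnavailableError` are hypothetical names, the attempts are synchronous for brevity, and the chain falls through on any thrown error rather than only on `rate_limit` or `service_unavailable`.

```typescript
// Each attempt throws on failure; the first one that returns wins.
interface Attempt<T> { source: string; run: () => T }

// Raised when every source in the chain has failed (maps to HTTP 503).
class ServiceUnavailableError extends Error {
  readonly status = 503;
  constructor(readonly failures: string[]) {
    super(`All sources failed: ${failures.join(", ")}`);
  }
}

function firstSuccessful<T>(attempts: Attempt<T>[]): { source: string; value: T } {
  const failures: string[] = [];
  for (const attempt of attempts) {
    try {
      return { source: attempt.source, value: attempt.run() };
    } catch {
      failures.push(attempt.source); // Remember which source failed, then try the next
    }
  }
  throw new ServiceUnavailableError(failures);
}
```

A Stage 2 call would pass three attempts in order (primary provider, fallback provider, cache lookup), while Stages 1 and 3 would pass only the first two because they have no cache.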
++
++----
++
++=== 6.7 Provider Selection API ===
++
++**Admin Endpoint:** POST /admin/v1/llm/configure
++
++**Update provider for specific stage:**
++
++{{{{
++{
++  "stage": "stage2",
++  "provider": "openai",
++  "model": "gpt-4o",
++  "max_tokens": 16384,
++  "temperature": 0.3
++}
++}}}
++
++**Response:** 200 OK
++
++{{{{
++{
++  "message": "LLM configuration updated",
++  "stage": "stage2",
++  "previous": {
++    "provider": "anthropic",
++    "model": "claude-sonnet-3-5"
    },
--  "article_holistic_assessment": {
--    "main_thesis": "string (The core argument detected)",
--    "overall_verdict": "WELL-SUPPORTED | MISLEADING | REFUTED | UNCERTAIN",
--    "logic_quality_score": "float (0-1)",
--    "fallacies_detected": ["correlation-causation", "cherry-picking", "hasty-generalization"],
--    "verdict_reasoning": "string (Explanation of why article credibility differs from claim average)",
--    "experimental_feature": true
++  "current": {
++    "provider": "openai",
++    "model": "gpt-4o"
    },
--  "claims": [
--    {
--      "claim_id": "C1",
--      "is_central_to_thesis": "boolean",
--      "claim_text": "string",
--      "canonical_form": "string",
--      "claim_type": "descriptive | causal | predictive | normative | definitional",
--      "evaluability": "evaluable | partly_evaluable | not_evaluable",
--      "risk_tier": "A | B | C",
--      "risk_tier_justification": "string",
--      "domain": "string (e.g., 'public health', 'economics')",
--      "key_terms": ["term1", "term2"],
--      "entities": ["Person X", "Org Y"],
--      "time_scope_detected": "2020-2024",
--      "geography_scope_detected": "Brazil",
--      "scenarios": [
--        {
--          "scenario_id": "C1-S1",
--          "context_title": "string",
--          "definitions": {"key_term": "definition"},
--          "assumptions": ["Assumption 1", "Assumption 2"],
--          "boundaries": {
--            "time": "as of 2025-01",
--            "geography": "Brazil",
--            "population": "adult population",
--            "conditions": "excludes X; includes Y"
--          },
--          "scope_of_evidence": "What counts as evidence for this scenario",
--          "scenario_questions": ["Question that decides the verdict"],
--          "verdict": {
--            "label": "Highly Likely | Likely | Unclear | Unlikely | Refuted | Unsubstantiated",
--            "probability_range": [0.0, 1.0],
--            "confidence": "float (0-1)",
--            "reasoning": "string",
--            "key_supporting_evidence_ids": ["E1", "E3"],
--            "key_counter_evidence_ids": ["E2"],
--            "uncertainty_factors": ["Data gap", "Method disagreement"],
--            "what_would_change_my_mind": ["Specific new study", "Updated dataset"]
--          },
--          "evidence": [
--            {
--              "evidence_id": "E1",
--              "stance": "supports | undermines | mixed | context_dependent",
--              "relevance_to_scenario": "float (0-1)",
--              "evidence_summary": ["Bullet fact 1", "Bullet fact 2"],
--              "citation": {
--                "title": "Source title",
--                "author_or_org": "Org/Author",
--                "publication_date": "2024-05-01",
--                "url": "https://source.example",
--                "publisher": "Publisher/Domain"
--              },
--              "excerpt": ["Short quote ≤25 words (optional)"],
--              "source_reliability_score": "float (0-1) - READ-ONLY SNAPSHOT",
--              "reliability_justification": "Why high/medium/low",
--              "limitations_and_reservations": ["Limitation 1", "Limitation 2"],
--              "retraction_or_dispute_signal": "none | correction | retraction | disputed",
--              "retrieval_status": "OK | NEEDS_RETRIEVAL | FAILED"
--            }
--          ]
--        }
--      ]
++  "cost_impact": {
++    "previous_cost_per_claim": 0.081,
++    "new_cost_per_claim": 0.045,
++    "savings_percent": 44
++  }
++}
++}}}
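The `savings_percent` value follows directly from the two per-claim costs. A minimal sketch, with `savingsPercent` as a hypothetical helper name and rounding to whole percent assumed from the example:

```typescript
// Percentage saved when the per-claim cost drops from `previous` to `next`,
// rounded to the nearest whole percent.
function savingsPercent(previous: number, next: number): number {
  if (previous <= 0) throw new RangeError("previous cost must be positive");
  return Math.round(((previous - next) / previous) * 100);
}
```

With the values from the response, `savingsPercent(0.081, 0.045)` evaluates to 44.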
++
++**Get current configuration:**
++
++GET /admin/v1/llm/config
++
++{{{{
++{
++  "providers": ["anthropic", "openai"],
++  "primary": "anthropic",
++  "fallback": "openai",
++  "stages": {
++    "stage1": {
++      "provider": "anthropic",
++      "model": "claude-haiku-4",
++      "cost_per_request": 0.003
++    },
++    "stage2": {
++      "provider": "anthropic",
++      "model": "claude-sonnet-3-5",
++      "cost_per_new_claim": 0.081
++    },
++    "stage3": {
++      "provider": "anthropic",
++      "model": "claude-sonnet-3-5",
++      "cost_per_request": 0.030
      }
--  ],
--  "quality_gates": {
--    "gate1_claim_validation": "pass | fail",
--    "gate4_verdict_confidence": "pass | fail",
--    "passed_all": "boolean",
--    "gate_fail_reasons": [
--      {
--        "gate": "gate1_claim_validation",
--        "claim_id": "C1",
--        "reason_code": "OPINION_DETECTED | COMPOUND_CLAIM | SUBJECTIVE | TOO_VAGUE",
--        "explanation": "Human-readable explanation"
--      }
--    ]
--  },
--  "global_notes": {
--    "limitations": ["System limitation 1", "Limitation 2"],
--    "safety_or_policy_notes": ["Note 1"]
    }
  }
--{{/code}}
++}}}
--=== 4.1 Risk Tier Definitions ===
++----
--|=Tier|=Impact|=Examples|=Actions
--|**A (High)**|High real-world impact if wrong|Health claims, safety information, financial advice, medical procedures|Human review recommended (Mode3_Human_Reviewed_Required)
--|**B (Medium)**|Moderate impact, contested topics|Political claims, social issues, scientific debates, economic predictions|Enhanced contradiction search, AI-generated publication OK (Mode2_AI_Generated)
--|**C (Low)**|Low impact, easily verifiable|Historical facts, basic statistics, biographical data, geographic information|Standard processing, AI-generated publication OK (Mode2_AI_Generated)
++=== 6.8 Implementation Notes ===
--=== 4.2 Source Reliability (Read-Only Snapshots) ===
++**Provider Adapter Pattern:**
--**IMPORTANT:** The {{code}}source_reliability_score{{/code}} in each evidence item is a **historical snapshot** from the weekly background scoring job.
++{{{
++class AnthropicProvider implements LLMProvider {
++  async complete(prompt: string, options: CompletionOptions) {
++    const response = await anthropic.messages.create({
++      model: options.model || 'claude-sonnet-3-5',
++      max_tokens: options.maxTokens || 4096,
++      messages: [{ role: 'user', content: prompt }],
++      system: options.systemPrompt
++    })
++    return response.content[0].text
++  }
++}
--* POC1 treats these scores as **read-only** (no modification during analysis)
--* **Prevents circular dependency:** scoring → affects retrieval → affects scoring
--* Full Source Track Record System is a **separate service** (not part of POC1)
--* **Temporal separation:** Scoring runs weekly; analysis uses snapshots
++class OpenAIProvider implements LLMProvider {
++  async complete(prompt: string, options: CompletionOptions) {
++    const response = await openai.chat.completions.create({
++      model: options.model || 'gpt-4o',
++      max_tokens: options.maxTokens || 4096,
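++      messages: [
++        // Only send a system message when a system prompt was provided
++        ...(options.systemPrompt ? [{ role: 'system' as const, content: options.systemPrompt }] : []),
++        { role: 'user' as const, content: prompt }
++      ]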
++    })
++    return response.choices[0].message.content
++  }
++}
++}}}
--**See:** [[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]] Section 1.3 (Source Track Record System) for scoring algorithm.
++**Provider Registry:**
--=== 4.3 Quality Gate Reason Codes ===
++{{{
++const providers = new Map<string, LLMProvider>()
++providers.set('anthropic', new AnthropicProvider())
++providers.set('openai', new OpenAIProvider())
++providers.set('google', new GoogleProvider())
--**Gate 1 (Claim Validation):**
--* {{code}}OPINION_DETECTED{{/code}} - Subjective judgment without factual anchor
--* {{code}}COMPOUND_CLAIM{{/code}} - Multiple claims in one statement
--* {{code}}SUBJECTIVE{{/code}} - Value judgment, not verifiable fact
--* {{code}}TOO_VAGUE{{/code}} - Lacks specificity for evaluation
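++function getProvider(name: string): LLMProvider {
++  // Fall back to the configured primary provider when the name is unknown
++  const provider = providers.get(name) ?? providers.get(config.primaryProvider)
++  if (!provider) throw new Error(`No LLM provider registered for "${name}"`)
++  return provider
++}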
++}}}
--**Gate 4 (Verdict Confidence):**
--* {{code}}LOW_CONFIDENCE{{/code}} - Confidence below threshold (<0.5)
--* {{code}}INSUFFICIENT_EVIDENCE{{/code}} - Too few sources to reach verdict
--* {{code}}CONTRADICTORY_EVIDENCE{{/code}} - Evidence conflicts without resolution
--* {{code}}NO_COUNTER_EVIDENCE{{/code}} - Contradiction search failed
++----
--**Purpose:** Enable system improvement workflow (Observe → Analyze → Improve)
++== 3. REST API Contract ==
-----
++=== 3.1 User Credit Tracking ===
--== 5. Validation Rules (POC1 Enforcement) ==
++**Endpoint:** GET /v1/user/credit
--|=Rule|=Requirement
--|**Mandatory Contradiction**|For every claim, the engine MUST search for "undermines" evidence. If none found, reasoning must explicitly state: "No counter-evidence found despite targeted search." Evidence must include at least 1 item with {{code}}stance ∈ {undermines, mixed, context_dependent}{{/code}} OR explicit note in {{code}}uncertainty_factors{{/code}}.
--|**Context-Aware Logic**|The {{code}}overall_verdict{{/code}} must prioritize central claims. If a {{code}}is_central_to_thesis=true{{/code}} claim is REFUTED, the overall article cannot be WELL-SUPPORTED. Central claims override verdict averaging.
--|**Author Identification**|All automated outputs MUST include {{code}}author_type: "AI/AKEL"{{/code}} or equivalent marker to distinguish AI-generated from human-reviewed content.
--|**Claim-to-Scenario Lifecycle**|In stateless POC1, Scenarios are **strictly children** of a specific Claim version. If a Claim's text changes, child Scenarios are part of that version's "snapshot." No scenario migration across versions.
++**Response:** 200 OK
-----
++{{{{
++ "user_id": "user_abc123",
++ "tier": "free",
++ "credit_limit": 10.00,
++ "credit_used": 7.42,
++ "credit_remaining": 2.58,
++ "reset_date": "2025-02-01T00:00:00Z",
++ "cache_only_mode": false,
++ "usage_stats": {
++ "articles_analyzed": 67,
++ "claims_from_cache": 189,
++ "claims_newly_analyzed": 113,
++ "cache_hit_rate": 0.626
++ }
++}
++}}}
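The derived fields in this response can be computed from the raw counters. In the sketch below, `CreditUsage`, `roundTo`, and `deriveCreditStatus` are hypothetical names, and treating exhausted credit as the trigger for `cache_only_mode` is an assumption based on the cache-only mode description.

```typescript
interface CreditUsage {
  creditLimitUsd: number;
  creditUsedUsd: number;
  claimsFromCache: number;
  claimsNewlyAnalyzed: number;
}

// Rounds to a fixed number of decimals to hide floating-point noise.
const roundTo = (value: number, decimals: number): number => {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
};

function deriveCreditStatus(u: CreditUsage) {
  const totalClaims = u.claimsFromCache + u.claimsNewlyAnalyzed;
  const remaining = Math.max(0, roundTo(u.creditLimitUsd - u.creditUsedUsd, 2));
  return {
    credit_remaining: remaining,
    cache_only_mode: remaining <= 0, // Assumption: the mode starts once credit is exhausted
    cache_hit_rate: totalClaims === 0 ? 0 : roundTo(u.claimsFromCache / totalClaims, 3),
  };
}
```

Feeding in the example's counters (189 cached, 113 new, $7.42 of $10.00 used) reproduces `credit_remaining` 2.58 and `cache_hit_rate` 0.626.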
--== 6. Deterministic Markdown Template ==
++----
--The system renders {{code}}report.md{{/code}} using a **fixed template** based on the JSON result (NOT generated by LLM).
++=== 3.2 Create Analysis Job (3-Stage) ===
--{{code language="markdown"}}
--# FactHarbor Analysis Report: {overall_verdict}
++**Endpoint:** POST /v1/analyze
--**Job ID:** {job_id} | **Generated:** {timestamp_utc}
--**Model:** {llm_model} | **Cost:** ${estimated_cost_usd} | **Time:** {response_time_sec}s
++==== Idempotency Support: ====
-----
++To prevent duplicate job creation on network retries, clients SHOULD include:
--## 1. Holistic Assessment (Experimental)
++{{{POST /v1/analyze
++Idempotency-Key: {client-generated-uuid}
++}}}
--**Main Thesis:** {main_thesis}
++OR use the client.request_id field:
--**Overall Verdict:** {overall_verdict}
++{{{{
++ "input_url": "...",
++ "client": {
++ "request_id": "client-uuid-12345",
++ "source_label": "optional"
++ }
++}
++}}}
--**Logic Quality Score:** {logic_quality_score}/1.0
++**Server Behavior:**
--**Fallacies Detected:** {fallacies_detected}
++* If the Idempotency-Key or request_id has been seen before (within 24 hours):
++** Return existing job (200 OK, not 202 Accepted)
++** Do NOT create duplicate job or charge twice
++* Idempotency keys expire after 24 hours (matches job retention)
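The server behavior described above can be sketched with a small keyed store. This is an illustration under assumptions: `IdempotencyStore` and its `submit` method are hypothetical, the store is in-memory (a shared store such as Redis would be needed across server instances), and the current time is passed in explicitly so expiry can be tested.

```typescript
const IDEMPOTENCY_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

interface StoredJob { jobId: string; createdAtMs: number }

class IdempotencyStore {
  private readonly jobs = new Map<string, StoredJob>();

  // Returns 200 with the existing job for a repeated key inside the window,
  // or 202 with a newly created job otherwise.
  submit(
    key: string,
    createJob: () => string,
    nowMs: number,
  ): { status: 200 | 202; jobId: string; idempotent: boolean } {
    const existing = this.jobs.get(key);
    if (existing !== undefined && nowMs - existing.createdAtMs < IDEMPOTENCY_TTL_MS) {
      return { status: 200, jobId: existing.jobId, idempotent: true };
    }
    const jobId = createJob(); // Only called when no live entry exists, so no double charge
    this.jobs.set(key, { jobId, createdAtMs: nowMs });
    return { status: 202, jobId, idempotent: false };
  }
}
```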
--**Reasoning:** {verdict_reasoning}
++**Example Response (Idempotent):**
-----
++{{{{
++ "job_id": "01J...ULID",
++ "status": "RUNNING",
++ "idempotent": true,
++ "original_request_at": "2025-12-24T10:31:00Z",
++ "message": "Returning existing job (idempotency key matched)"
++}
++}}}
--## 2. Key Claims Analysis
++==== Request Body: ====
--### [C1] {claim_text}
--* **Role:** {is_central_to_thesis ? "Central to thesis" : "Supporting claim"}
--* **Risk Tier:** {risk_tier} ({risk_tier_justification})
--* **Evaluability:** {evaluability}
++{{{{
++ "input_type": "url",
++ "input_url": "https://example.com/medical-report-01",
++ "input_text": null,
++ "options": {
++ "browsing": "on",
++ "depth": "standard",
++ "max_claims": 5,
++ "scenarios_per_claim": 2,
++ "max_evidence_per_scenario": 6,
++ "context_aware_analysis": true
++ },
++ "client": {
++ "request_id": "optional-client-tracking-id",
++ "source_label": "optional"
++ }
++}
++}}}
--**Scenarios Explored:** {scenarios.length}
++**Options:**
--#### Scenario: {scenario.context_title}
--* **Verdict:** {verdict.label} (Confidence: {verdict.confidence})
--* **Probability Range:** {verdict.probability_range[0]} - {verdict.probability_range[1]}
--* **Reasoning:** {verdict.reasoning}
++* browsing: on | off (retrieve web sources or just output queries)
++* depth: standard | deep (evidence thoroughness)
++* max_claims: 1-10 (default: **5** for cost control)
++* scenarios_per_claim: 1-5 (default: **2** for cost control)
++* max_evidence_per_scenario: 3-10 (default: **6**)
++* context_aware_analysis: true | false (experimental)
--**Evidence:**
--* Supporting: {evidence.filter(e => e.stance == "supports").length} sources
--* Undermining: {evidence.filter(e => e.stance == "undermines").length} sources
--* Mixed: {evidence.filter(e => e.stance == "mixed").length} sources
++**Response:** 202 Accepted
--**Key Evidence:**
--* [{evidence[0].citation.title}]({evidence[0].citation.url}) - {evidence[0].stance}
++{{{{
++ "job_id": "01J...ULID",
++ "status": "QUEUED",
++ "created_at": "2025-12-24T10:31:00Z",
++ "estimated_cost": 0.114,
++ "cost_breakdown": {
++ "stage1_extraction": 0.003,
++ "stage2_new_claims": 0.081,
++ "stage2_cached_claims": 0.000,
++ "stage3_holistic": 0.030
++ },
++ "cache_info": {
++ "claims_to_extract": 5,
++ "estimated_cache_hits": 4,
++ "estimated_new_claims": 1
++ },
++ "links": {
++ "self": "/v1/jobs/01J...ULID",
++ "result": "/v1/jobs/01J...ULID/result",
++ "report": "/v1/jobs/01J...ULID/report",
++ "events": "/v1/jobs/01J...ULID/events"
++ }
++}
++}}}
-----
++**Error Responses:**
--## 3. Quality Assessment
++402 Payment Required - Free tier limit reached, cache-only mode
--**Quality Gates:**
--* Gate 1 (Claim Validation): {gate1_claim_validation}
--* Gate 4 (Verdict Confidence): {gate4_verdict_confidence}
--* Overall: {passed_all ? "PASS" : "FAIL"}
++{{{{
++ "error": "credit_limit_reached",
++ "message": "Monthly credit limit reached. Entering cache-only mode.",
++ "cache_only_mode": true,
++ "credit_remaining": 0.00,
++ "reset_date": "2026-01-01T00:00:00Z",
++ "action": "Resubmit with cache_preference=allow_partial for cached results"
++}
++}}}
--{if gate_fail_reasons.length > 0}
--**Failed Gates:**
--{gate_fail_reasons.map(r => `* ${r.gate}: ${r.explanation}`)}
--{/if}
++----
-----
++== 4. Data Schemas ==
--## 4. Limitations & Disclaimers
++=== 4.1 Stage 1 Output: ClaimExtraction ===
--**System Limitations:**
--{limitations.map(l => `* ${l}`)}
++{{{{
++ "job_id": "01J...ULID",
++ "stage": "stage1_extraction",
++ "article_metadata": {
++ "title": "Article title",
++ "source_url": "https://example.com/article",
++ "extracted_text_length": 5234,
++ "language": "en"
++ },
++ "claims": [
++ {
++ "claim_id": "C1",
++ "claim_text": "Original claim text from article",
++ "canonical_claim": "Normalized, deduplicated phrasing",
++ "claim_hash": "sha256:abc123...",
++ "is_central_to_thesis": true,
++ "claim_type": "causal",
++ "evaluability": "evaluable",
++ "risk_tier": "B",
++ "domain": "public_health"
++ }
++ ],
++ "article_thesis": "Main argument detected",
++ "cost": 0.003
++}
++}}}
--**Important Notes:**
--* This analysis is AI-generated and experimental (POC1)
--* Context-aware article verdict is being tested for accuracy
--* Human review recommended for high-risk claims (Tier A)
--* Cost: ${estimated_cost_usd} | Tokens: {input_tokens + output_tokens}
++----
--**Methodology:** FactHarbor uses Claude 3.5 Sonnet to extract claims, generate scenarios, gather evidence (with mandatory contradiction search), and assess logical coherence between claims and article thesis.
++=== 4.5 Verdict Label Taxonomy ===
-----
++FactHarbor uses **three distinct verdict taxonomies** depending on analysis level:
--*Generated by FactHarbor POC1-v0.3 | [About FactHarbor](https://factharbor.org)*
--{{/code}}
++==== 4.5.1 Scenario Verdict Labels (Stage 2) ====
--**Target Report Size:** 220-350 words (optimized for 2-minute read)
++Used for individual scenario verdicts within a claim.
-----
++**Enum Values:**
--== 7. LLM Configuration (POC1) ==
++* Highly Likely - Probability 0.85-1.0, high confidence
++* Likely - Probability 0.65-0.84, moderate-high confidence
++* Unclear - Probability 0.35-0.64, or low confidence
++* Unlikely - Probability 0.16-0.34, moderate-high confidence
++* Highly Unlikely - Probability 0.0-0.15, high confidence
++* Unsubstantiated - Insufficient evidence to determine probability
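As an illustration of the bands above, the following sketch maps a probability to a scenario label. It rests on assumptions the list does not state: each band's lower bound is treated as inclusive, so values falling between two listed ranges (for example 0.845) land in the lower band; a missing probability (`None`) is read as Unsubstantiated; and the confidence dimension is ignored. The function name `scenario_verdict_label` is hypothetical.

```python
from typing import Optional

# (inclusive lower bound, label) pairs from the enum list, checked top-down
_BANDS = [
    (0.85, "Highly Likely"),
    (0.65, "Likely"),
    (0.35, "Unclear"),
    (0.16, "Unlikely"),
]


def scenario_verdict_label(probability: Optional[float]) -> str:
    """Map a scenario probability to its verdict label (sketch only)."""
    if probability is None:
        # Assumption: no probability could be determined from the evidence
        return "Unsubstantiated"
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be within [0.0, 1.0]")
    for lower_bound, label in _BANDS:
        if probability >= lower_bound:
            return label
    return "Highly Unlikely"
```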
--|=Parameter|=Value|=Notes
--|**Provider**|Anthropic|Primary provider for POC1
--|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|Current production model
--|**Future Model**|{{code}}claude-sonnet-4-20250514{{/code}}|When available (architecture supports)
--|**Token Budget**|50K-80K per analysis|Input + output combined (varies by article length)
--|**Estimated Cost**|$0.10-0.30 per article|Based on Sonnet 3.5 pricing ($3/M input, $15/M output)
--|**Prompt Strategy**|Single-pass per stage|Not multi-turn; structured JSON output with schema validation
--|**Chain-of-Thought**|Yes|For verdict reasoning and holistic assessment
--|**Few-Shot Examples**|Yes|For claim extraction and scenario generation
++==== 4.5.2 Claim Verdict Labels (Rollup) ====
--=== 7.1 Token Budgets by Stage ===
++Used when summarizing a claim across all scenarios.
--|=Stage|=Approximate Output Tokens
--|Claim Extraction|~4,000 (10 claims × ~400 tokens)
--|Scenario Generation|~3,000 per claim (3 scenarios × ~1,000 tokens)
--|Evidence Synthesis|~2,000 per scenario
--|Verdict Generation|~1,000 per scenario
--|Holistic Assessment|~500 (context-aware summary)
++**Enum Values:**
--**Total:** 50K-80K tokens per article (input + output)
++* Supported - At least 60% of scenarios are Likely or Highly Likely
++* Refuted - At least 60% of scenarios are Unlikely or Highly Unlikely
++* Inconclusive - Neither threshold is met (mixed scenarios, or mostly Unclear/Unsubstantiated)
--=== 7.2 API Integration ===
++**Mapping Logic:**
--**Anthropic Messages API:**
--* Endpoint: {{code}}https://api.anthropic.com/v1/messages{{/code}}
--* Authentication: API key via {{code}}x-api-key{{/code}} header
--* Model parameter: {{code}}"model": "claude-3-5-sonnet-20241022"{{/code}}
--* Max tokens: {{code}}"max_tokens": 4096{{/code}} (per stage)
++* If ≥60% scenarios are (Highly Likely | Likely) → Supported
++* If ≥60% scenarios are (Highly Unlikely | Unlikely) → Refuted
++* Otherwise → Inconclusive
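The mapping logic above could be sketched as below. The function name `rollup_claim_verdict` is hypothetical, and treating an empty scenario list as Inconclusive is an assumption, since the spec does not cover that case. Integer arithmetic is used for the 60% comparison to avoid floating-point edge cases.

```python
from typing import List

POSITIVE_LABELS = {"Highly Likely", "Likely"}
NEGATIVE_LABELS = {"Highly Unlikely", "Unlikely"}
THRESHOLD_PERCENT = 60  # the >=60% rule from the mapping logic


def rollup_claim_verdict(scenario_labels: List[str]) -> str:
    """Summarize scenario verdict labels into a claim-level verdict."""
    total = len(scenario_labels)
    if total == 0:
        return "Inconclusive"  # assumption: nothing to roll up
    positive = sum(1 for label in scenario_labels if label in POSITIVE_LABELS)
    negative = sum(1 for label in scenario_labels if label in NEGATIVE_LABELS)
    # Compare count/total >= 60% without dividing
    if positive * 100 >= THRESHOLD_PERCENT * total:
        return "Supported"
    if negative * 100 >= THRESHOLD_PERCENT * total:
        return "Refuted"
    return "Inconclusive"
```

Note that with the default of 2 scenarios per claim, one Likely and one Unclear scenario is only 50% and therefore rolls up to Inconclusive.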
--**No LangChain/LangGraph needed** for POC1 simplicity - direct SDK calls suffice.
++==== 4.5.3 Article Verdict Labels (Stage 3) ====
-----
++Used for holistic article-level assessment.
--== 8. Cross-References (xWiki) ==
++**Enum Values:**
--This API specification implements requirements from:
++* WELL-SUPPORTED - Article thesis logically follows from supported claims
++* MISLEADING - Claims may be true but article commits logical fallacies
++* REFUTED - Central claims are refuted, invalidating thesis
++* UNCERTAIN - Insufficient evidence or highly mixed claim verdicts
--* **[[POC Requirements>>Test.FactHarbor.Specification.POC.Requirements]]**
--** FR-POC-1 through FR-POC-6 (POC1-specific functional requirements)
--** NFR-POC-1 through NFR-POC-3 (quality gates lite: Gates 1 & 4 only)
--** Section 2.1: Analysis Summary (Context-Aware) component specification
--** Section 10.3: Prompt structure for claim extraction and verdict synthesis
++**Note:** Article verdict considers **claim centrality** (central claims override supporting claims).
--* **[[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]**
--** Complete investigation of 7 approaches to article-level verdicts
--** Approach 1 (Single-Pass Holistic Analysis) chosen for POC1
--** Experimental feature testing plan (30 articles, ≥70% accuracy target)
--** Decision framework for POC2 implementation
++==== 4.5.4 API Field Mapping ====
--* **[[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]**
--** FR4 (Analysis Summary) - enhanced with context-aware capability
--** FR7 (Verdict Calculation) - probability ranges + confidence scores
--** NFR11 (Quality Gates) - POC1 implements Gates 1 & 4; Gates 2 & 3 in POC2
++|=Level|=API Field|=Enum Name
++|Scenario|scenarios[].verdict.label|scenario_verdict_label
++|Claim|claims[].rollup_verdict (optional)|claim_verdict_label
++|Article|article_holistic_assessment.overall_verdict|article_verdict_label
--* **[[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]]**
--** POC1 simplified architecture (stateless, single AKEL orchestration call)
--** Data persistence minimized (job outputs only, no database required)
--** Deferred complexity (no Elasticsearch, TimescaleDB, Federation until metrics justify)
++----
--* **[[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]]**
--** Evidence structure (source, stance, reliability rating)
--** Scenario boundaries (time, geography, population, conditions)
--** Claim types and evaluability taxonomy
--** Source Track Record System (Section 1.3) - temporal separation
++== 5. Cache Architecture ==
--* **[[Requirements Roadmap Matrix>>Test.FactHarbor.Roadmap.Requirements-Roadmap-Matrix.WebHome]]**
--** POC1 requirement mappings and phase assignments
--** Context-aware analysis as POC1 experimental feature
--** POC2 enhancement path (Gates 2 & 3, evidence deduplication)
++=== 5.1 Redis Cache Design ===
-----
++**Technology:** Redis 7.0+ (in-memory key-value store)
--== 9. Implementation Notes (POC1) ==
++**Cache Key Schema:**
--=== 9.1 Recommended Tech Stack ===
++{{{claim:v1norm1:{language}:{sha256(canonical_claim)}
++}}}
--* **Framework:** Next.js 14+ with App Router (TypeScript) - Full-stack in one codebase
--* **Rationale:** API routes + React UI unified, Vercel deployment-ready, similar to C# in structure
--* **Storage:** Filesystem JSON files (no database needed for POC1)
--* **Queue:** In-memory queue or Redis (optional for concurrency)
--* **URL Extraction:** Jina AI Reader API (primary), trafilatura (fallback)
--* **Deployment:** Vercel, AWS Lambda, or similar serverless
++**Example:**
--=== 9.2 POC1 Simplifications ===
++{{{Claim (English): "COVID vaccines are 95% effective"
++Canonical: "covid vaccines are 95 percent effective"
++Language: "en"
++SHA256: abc123...def456
++Key: claim:v1norm1:en:abc123...def456
++}}}
--* **No database required:** Job metadata + outputs stored as JSON files ({{code}}jobs/{job_id}.json{{/code}}, {{code}}results/{job_id}.json{{/code}})
--* **No user authentication:** Optional API key validation only (env var: {{code}}FACTHARBOR_API_KEY{{/code}})
--* **Single-instance deployment:** No distributed processing, no worker pools
--* **Synchronous LLM calls:** No streaming in POC1 (entire response before returning)
--* **Job retention:** 24 hours default (configurable: {{code}}JOB_RETENTION_HOURS{{/code}})
--* **Rate limiting:** Simple IP-based (optional) - no complex billing
++**Rationale:** Prevents cross-language collisions and enables per-language cache analytics.
--=== 9.3 Estimated Costs (Per Analysis) ===
++**Data Structure:**
--**LLM API costs (Claude 3.5 Sonnet):**
--* Input: $3.00 per million tokens
--* Output: $15.00 per million tokens
--* **Per article:** $0.10-0.30 (varies by length, 5-10 claims typical)
++{{{SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}'
++EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days
++}}}
--**Web search costs (optional):**
--* Using external search API (Tavily, Brave): $0.01-0.05 per analysis
--* POC1 can use free search APIs initially
++----
--**Infrastructure costs:**
--* Vercel hobby tier: Free for POC
--* AWS Lambda: ~$0.001 per request
--* **Total infra:** <$0.01 per analysis
++==== 5.1.1 Canonical Claim Normalization (v1) ====
--**Total estimated cost:** ~$0.15-0.35 per analysis ✅ Meets <$0.35 target
++The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly.
--=== 9.4 Estimated Timeline (AI-Assisted) ===
++**Algorithm: Canonical Claim Normalization v1**
--**With Cursor IDE + Claude API:**
--* Day 1-2: API scaffolding + job queue
--* Day 3-4: LLM integration + prompt engineering
--* Day 5-6: Evidence retrieval + contradiction search
--* Day 7: Report templates + testing with 30 articles
--* **Total:** 5-7 days for working POC1
++{{{def normalize_claim_v1(claim_text: str, language: str) -> str:
++    """
++    Normalizes claim to canonical form for cache key generation.
++    Version: v1norm1 (POC1)
++    """
++    import re
++    import unicodedata
++
++    # Step 1: Unicode normalization (NFC)
++    text = unicodedata.normalize('NFC', claim_text)
++
++    # Step 2: Lowercase
++    text = text.lower()
++
++    # Step 3: Expand '%' and common abbreviations BEFORE stripping
++    # punctuation; otherwise Step 4 deletes '%' and '.' first and
++    # these replacements can never match
++    text = text.replace('%', ' percent')
++    if language == 'en':  # abbreviations are English only in v1
++        text = text.replace('covid-19', 'covid')
++        text = text.replace('u.s.', 'us')
++        text = text.replace('u.k.', 'uk')
++
++    # Step 4: Remove punctuation (except hyphens in words)
++    text = re.sub(r'[^\w\s-]', '', text)
++
++    # Step 5: Normalize whitespace (collapse multiple spaces)
++    text = re.sub(r'\s+', ' ', text).strip()
++
++    # Step 6: Spell out standalone single-digit numbers
++    num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
++                   '4':'four', '5':'five', '6':'six', '7':'seven',
++                   '8':'eight', '9':'nine'}
++    for num, word in num_to_word.items():
++        text = re.sub(rf'\b{num}\b', word, text)
++
++    # Step 7: NO entity normalization in v1
++    # (Trump vs Donald Trump vs President Trump remain distinct)
++
++    return text
--**Manual coding (no AI assistance):**
--* Estimate: 15-20 days
++# Version identifier (include in cache namespace)
++CANONICALIZER_VERSION = "v1norm1"
++}}}
--=== 9.5 First Prompt for AI Code Generation ===
++**Cache Key Formula (Updated):**
--{{code}}
--Based on the FactHarbor POC1 API & Schemas Specification (v0.3), generate a Next.js 14 TypeScript application with:
++{{{language = "en"
++canonical = normalize_claim_v1(claim_text, language)
++cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"
--1. API routes implementing the 7 endpoints specified in Section 3
--2. AnalyzeRequest/AnalysisResult types matching schemas in Sections 4-5
--3. Anthropic Claude 3.5 Sonnet integration for:
--   - Claim extraction (with central/supporting marking)
--   - Scenario generation
--   - Evidence synthesis (with mandatory contradiction search)
--   - Verdict generation
--   - Holistic assessment (article-level credibility)
--4. Job-based async execution with progress tracking (7 pipeline stages)
--5. Quality Gates 1 & 4 from NFR11 implementation
--6. Mandatory contradiction search enforcement (Section 5)
--7. Context-aware analysis (experimental) as specified
--8. Filesystem-based job storage (no database)
--9. Markdown report generation from JSON templates (Section 6)
++Example:
++ claim: "COVID-19 vaccines are 95% effective"
++ canonical: "covid vaccines are 95 percent effective"
++ sha256: abc123...def456
++ key: "claim:v1norm1:en:abc123...def456"
++}}}
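The formula above writes `sha256(canonical)` as shorthand. A runnable sketch using Python's standard `hashlib` might look like the following. UTF-8 encoding and a lowercase hex digest are assumptions, since the spec does not fix either, and `build_cache_key` is a hypothetical helper name. It takes the already-normalized claim as input.

```python
import hashlib

CANONICALIZER_VERSION = "v1norm1"


def build_cache_key(canonical_claim: str, language: str) -> str:
    """Build the Redis cache key for an already-normalized claim."""
    # Assumption: hash the UTF-8 bytes and use the lowercase hex digest
    digest = hashlib.sha256(canonical_claim.encode("utf-8")).hexdigest()
    return f"claim:{CANONICALIZER_VERSION}:{language}:{digest}"


key = build_cache_key("covid vaccines are 95 percent effective", "en")
```

Because the language code is part of the key, the same canonical string submitted under two languages yields two distinct cache entries.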
--Use the validation rules from Section 5 and error codes from Section 2.1.1.
--Target: <$0.35 per analysis, <2 minutes processing time.
--{{/code}}
++**Cache Metadata MUST Include:**
-----
++{{{{
++ "canonical_claim": "covid vaccines are 95 percent effective",
++ "canonicalizer_version": "v1norm1",
++ "language": "en",
++ "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
++}
++}}}
--== 10. Testing Strategy (POC1) ==
++**Version Upgrade Path:**
--=== 10.1 Test Dataset (30 Articles) ===
++* v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
++* v1normN → v2norm1: Major version bump, invalidate all v1 caches
--**Category 1: Straightforward Factual (10 articles)**
--* Purpose: Baseline accuracy
--* Example: "WHO report on global vaccination rates"
--* Expected: High claim accuracy, straightforward verdict
++----
--**Category 2: Accurate Claims, Questionable Conclusions (10 articles)** ⭐ **Context-Aware Test**
--* Purpose: Test holistic assessment capability
--* Example: "Coffee cures cancer" (true premises, false conclusion)
--* Expected: Individual claims TRUE, article verdict MISLEADING
++==== 5.1.2 Copyright & Data Retention Policy ====
--**Category 3: Mixed Accuracy (5 articles)**
--* Purpose: Test nuance handling
--* Example: Articles with some true, some false claims
--* Expected: Scenario-level differentiation
++**Evidence Excerpt Storage:**
--**Category 4: Low-Quality Claims (5 articles)**
--* Purpose: Test quality gates
--* Example: Opinion pieces, compound claims
--* Expected: Gate 1 failures, rejection or draft-only mode
++To comply with copyright law and fair use principles:
--=== 10.2 Success Metrics ===
++**What We Store:**
--**Quality Metrics:**
--* Hallucination rate: <5% (target: <3%)
--* Context-aware accuracy: ≥70% (experimental - key POC1 goal)
--* False positive rate: <15%
--* Mandatory contradiction search: 100% compliance
++* **Source metadata:** Title, author, publisher, URL, publication date
++* **Short excerpts:** Max 25 words per quote, max 3 quotes per evidence item
++* **Summaries:** AI-generated bullet points (not verbatim text)
++* **No full articles:** Never store complete article text beyond job processing
--**Performance Metrics:**
--* Processing time: <2 minutes per article (standard depth)
--* Cost per analysis: <$0.35
--* API uptime: >99%
--* LLM API error rate: <1%
++**Total per Cached Claim:**
--**See:** [[POC1 Roadmap>>Test.FactHarbor.Roadmap.POC1.WebHome]] Section 11 for complete success criteria and testing methodology.
++* Scenarios: 2 per claim
++* Evidence items: 6 per scenario (12 total)
++* Quotes: 3 per evidence × 25 words = 75 words per item
++* **Maximum stored verbatim text:** ~~900 words per claim (12 × 75)
-----
++**Retention:**
--**End of Specification - FactHarbor POC1 API v0.3**
++* Cache TTL: 90 days
++* Job outputs: 24 hours (then archived or deleted)
++* No persistent full-text article storage
--**Ready for xWiki import and AI-assisted implementation!** 🚀
++**Rationale:**
++* Short excerpts for citation = fair use
++* Summaries are transformative (AI-generated, not verbatim copies)
++* Limited retention (90 days max)
++* No commercial republication of excerpts
++
++**DMCA Compliance:**
++
++* Cache invalidation endpoint available for rights holders
++* Contact: dmca@factharbor.org
++
++----
++
++== Summary ==
++
++This WYSIWYG preview shows the **structure and key sections** of the 1,515-line API specification.
++
++**Full specification includes:**
++
++* Complete API endpoints (7 total)
++* All data schemas (ClaimExtraction, ClaimAnalysis, HolisticAssessment, Complete)
++* Quality gates & validation rules
++* LLM configuration for all 3 stages
++* Implementation notes with code samples
++* Testing strategy
++* Cross-references to other pages
++
++**The complete specification is available in:**
++
++* FactHarbor_POC1_API_and_Schemas_Spec_v0_4_1_PATCHED.md (45 KB standalone)
++* Export files (TEST/PRODUCTION) for xWiki import
