Last modified by Robert Schaub on 2025/12/24 18:26

From version 4.1
edited by Robert Schaub
on 2025/12/24 16:55
Change comment: Imported from XAR
To version 5.1
edited by Robert Schaub
on 2025/12/24 17:59
Change comment: Imported from XAR

Summary

Details

Page properties
Title
... ... @@ -1,1 +1,1 @@
1 -POC1 API & Schemas Specification v0.4.1
1 +POC1 API & Schemas Specification
Content
... ... @@ -43,15 +43,15 @@
43 43  
44 44  {{mermaid}}
45 45  graph TD
46 - A[Article Input] --> B[Stage 1: Extract Claims]
47 - B --> C{For Each Claim}
48 - C --> D[Check Cache]
49 - D -->|Cache HIT| E[Return Cached Verdict]
50 - D -->|Cache MISS| F[Stage 2: Analyze Claim]
51 - F --> G[Store in Cache]
52 - G --> E
53 - E --> H[Stage 3: Holistic Assessment]
54 - H --> I[Final Report]
46 + A[Article Input] --> B[Stage 1: Extract Claims]
47 + B --> C{For Each Claim}
48 + C --> D[Check Cache]
49 + D -->|Cache HIT| E[Return Cached Verdict]
50 + D -->|Cache MISS| F[Stage 2: Analyze Claim]
51 + F --> G[Store in Cache]
52 + G --> E
53 + E --> H[Stage 3: Holistic Assessment]
54 + H --> I[Final Report]
55 55  {{/mermaid}}
56 56  
57 57  ==== Stage 1: Claim Extraction (Haiku, no cache) ====
... ... @@ -58,7 +58,7 @@
58 58  
59 59  * **Input:** Article text
60 60  * **Output:** 5 canonical claims (normalized, deduplicated)
61 -* **Model:** Claude Haiku 4
61 +* **Model:** Claude Haiku 4 (default, configurable via LLM abstraction layer)
62 62  * **Cost:** $0.003 per article
63 63  * **Cache strategy:** No caching (article-specific)
64 64  
... ... @@ -66,7 +66,7 @@
66 66  
67 67  * **Input:** Single canonical claim
68 68  * **Output:** Scenarios + Evidence + Verdicts
69 -* **Model:** Claude Sonnet 3.5
69 +* **Model:** Claude Sonnet 3.5 (default, configurable via LLM abstraction layer)
70 70  * **Cost:** $0.081 per NEW claim
71 71  * **Cache strategy:** Redis, 90-day TTL
72 72  * **Cache key:** claim:v1norm1:{language}:{sha256(canonical_claim)}
... ... @@ -75,10 +75,14 @@
75 75  
76 76  * **Input:** Article + Claim verdicts (from cache or Stage 2)
77 77  * **Output:** Article verdict + Fallacies + Logic quality
78 -* **Model:** Claude Sonnet 3.5
78 +* **Model:** Claude Sonnet 3.5 (default, configurable via LLM abstraction layer)
79 79  * **Cost:** $0.030 per article
80 80  * **Cache strategy:** No caching (article-specific)
81 81  
82 +
83 +
84 +**Note:** Stage 3 implements **Approach 1 (Single-Pass Holistic Analysis)** from the [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]. While claim analysis (Stage 2) is cached for efficiency, the holistic assessment maintains the integrated evaluation philosophy of Approach 1.
85 +
82 82  === Total Cost Formula: ===
83 83  
84 84  {{{Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)
... ... @@ -146,27 +146,27 @@
146 146  ==== User Experience Example: ====
147 147  
148 148  {{{{
149 - "status": "cache_only_mode",
150 - "message": "Monthly credit limit reached. Showing cached results only.",
151 - "cache_coverage": {
152 - "claims_total": 5,
153 - "claims_cached": 3,
154 - "claims_missing": 2,
155 - "coverage_percent": 60
156 - },
157 - "cached_claims": [
158 - {"claim_id": "C1", "verdict": "Likely", "confidence": 0.82},
159 - {"claim_id": "C2", "verdict": "Highly Likely", "confidence": 0.91},
160 - {"claim_id": "C4", "verdict": "Unclear", "confidence": 0.55}
161 - ],
162 - "missing_claims": [
163 - {"claim_id": "C3", "claim_text": "...", "estimated_cost": "$0.081"},
164 - {"claim_id": "C5", "claim_text": "...", "estimated_cost": "$0.081"}
165 - ],
166 - "upgrade_options": {
167 - "top_up": "$5 for 20-70 more articles",
168 - "pro_tier": "$50/month unlimited"
169 - }
153 + "status": "cache_only_mode",
154 + "message": "Monthly credit limit reached. Showing cached results only.",
155 + "cache_coverage": {
156 + "claims_total": 5,
157 + "claims_cached": 3,
158 + "claims_missing": 2,
159 + "coverage_percent": 60
160 + },
161 + "cached_claims": [
162 + {"claim_id": "C1", "verdict": "Likely", "confidence": 0.82},
163 + {"claim_id": "C2", "verdict": "Highly Likely", "confidence": 0.91},
164 + {"claim_id": "C4", "verdict": "Unclear", "confidence": 0.55}
165 + ],
166 + "missing_claims": [
167 + {"claim_id": "C3", "claim_text": "...", "estimated_cost": "$0.081"},
168 + {"claim_id": "C5", "claim_text": "...", "estimated_cost": "$0.081"}
169 + ],
170 + "upgrade_options": {
171 + "top_up": "$5 for 20-70 more articles",
172 + "pro_tier": "$50/month unlimited"
173 + }
170 170  }
171 171  }}}
172 172  
... ... @@ -179,6 +179,328 @@
179 179  
180 180  ----
181 181  
186 +
187 +
188 +== 6. LLM Abstraction Layer ==
189 +
190 +=== 6.1 Design Principle ===
191 +
192 +**FactHarbor uses provider-agnostic LLM abstraction** to avoid vendor lock-in and enable:
193 +
194 +* **Provider switching:** Change LLM providers without code changes
195 +* **Cost optimization:** Use different providers for different stages
196 +* **Resilience:** Automatic fallback if primary provider fails
197 +* **Cross-checking:** Compare outputs from multiple providers
198 +* **A/B testing:** Test new models without deployment changes
199 +
200 +**Implementation:** All LLM calls go through an abstraction layer that routes to configured providers.
201 +
202 +----
203 +
204 +=== 6.2 LLM Provider Interface ===
205 +
206 +**Abstract Interface:**
207 +
208 +{{{
209 +interface LLMProvider {
210 + // Core methods
211 + complete(prompt: string, options: CompletionOptions): Promise<CompletionResponse>
212 + stream(prompt: string, options: CompletionOptions): AsyncIterator<StreamChunk>
213 +
214 + // Provider metadata
215 + getName(): string
216 + getMaxTokens(): number
217 + getCostPer1kTokens(): { input: number, output: number }
218 +
219 + // Health check
220 + isAvailable(): Promise<boolean>
221 +}
222 +
223 +interface CompletionOptions {
224 + model?: string
225 + maxTokens?: number
226 + temperature?: number
227 + stopSequences?: string[]
228 + systemPrompt?: string
229 +}
230 +}}}
231 +
232 +----
233 +
234 +=== 6.3 Supported Providers (POC1) ===
235 +
236 +**Primary Provider (Default):**
237 +
238 +* **Anthropic Claude API**
239 + * Models: Claude Haiku 4, Claude Sonnet 3.5, Claude Opus 4
240 + * Used by default in POC1
241 + * Best quality for holistic analysis
242 +
243 +**Secondary Providers (Future):**
244 +
245 +* **OpenAI API**
246 + * Models: GPT-4o, GPT-4o-mini
247 + * For cost comparison
248 +
249 +* **Google Vertex AI**
250 + * Models: Gemini 1.5 Pro, Gemini 1.5 Flash
251 + * For diversity in evidence gathering
252 +
253 +* **Local Models** (Post-POC)
254 + * Models: Llama 3.1, Mistral
255 + * For privacy-sensitive deployments
256 +
257 +----
258 +
259 +=== 6.4 Provider Configuration ===
260 +
261 +**Environment Variables:**
262 +
263 +{{{
264 +# Primary provider
265 +LLM_PRIMARY_PROVIDER=anthropic
266 +ANTHROPIC_API_KEY=sk-ant-...
267 +
268 +# Fallback provider
269 +LLM_FALLBACK_PROVIDER=openai
270 +OPENAI_API_KEY=sk-...
271 +
272 +# Provider selection per stage
273 +LLM_STAGE1_PROVIDER=anthropic
274 +LLM_STAGE1_MODEL=claude-haiku-4
275 +LLM_STAGE2_PROVIDER=anthropic
276 +LLM_STAGE2_MODEL=claude-sonnet-3-5
277 +LLM_STAGE3_PROVIDER=anthropic
278 +LLM_STAGE3_MODEL=claude-sonnet-3-5
279 +
280 +# Cost limits
281 +LLM_MAX_COST_PER_REQUEST=1.00
282 +}}}
283 +
284 +**Database Configuration (Alternative):**
285 +
286 +{{{{
287 +{
288 + "providers": [
289 + {
290 + "name": "anthropic",
291 + "api_key_ref": "vault://anthropic-api-key",
292 + "enabled": true,
293 + "priority": 1
294 + },
295 + {
296 + "name": "openai",
297 + "api_key_ref": "vault://openai-api-key",
298 + "enabled": true,
299 + "priority": 2
300 + }
301 + ],
302 + "stage_config": {
303 + "stage1": {
304 + "provider": "anthropic",
305 + "model": "claude-haiku-4",
306 + "max_tokens": 4096,
307 + "temperature": 0.0
308 + },
309 + "stage2": {
310 + "provider": "anthropic",
311 + "model": "claude-sonnet-3-5",
312 + "max_tokens": 16384,
313 + "temperature": 0.3
314 + },
315 + "stage3": {
316 + "provider": "anthropic",
317 + "model": "claude-sonnet-3-5",
318 + "max_tokens": 8192,
319 + "temperature": 0.2
320 + }
321 + }
322 +}
323 +}}}
324 +
325 +----
326 +
327 +=== 6.5 Stage-Specific Models (POC1 Defaults) ===
328 +
329 +**Stage 1: Claim Extraction**
330 +
331 +* **Default:** Anthropic Claude Haiku 4
332 +* **Alternative:** OpenAI GPT-4o-mini, Google Gemini 1.5 Flash
333 +* **Rationale:** Fast, cheap, simple task
334 +* **Cost:** ~$0.003 per article
335 +
336 +**Stage 2: Claim Analysis** (CACHEABLE)
337 +
338 +* **Default:** Anthropic Claude Sonnet 3.5
339 +* **Alternative:** OpenAI GPT-4o, Google Gemini 1.5 Pro
340 +* **Rationale:** High-quality analysis, cached 90 days
341 +* **Cost:** ~$0.081 per NEW claim
342 +
343 +**Stage 3: Holistic Assessment**
344 +
345 +* **Default:** Anthropic Claude Sonnet 3.5
346 +* **Alternative:** OpenAI GPT-4o, Claude Opus 4 (for high-stakes)
347 +* **Rationale:** Complex reasoning, logical fallacy detection
348 +* **Cost:** ~$0.030 per article
349 +
350 +**Cost Comparison (Example):**
351 +
352 +|=Stage|=Anthropic (Default)|=OpenAI Alternative|=Google Alternative
353 +|Stage 1|Claude Haiku 4 ($0.003)|GPT-4o-mini ($0.002)|Gemini Flash ($0.002)
354 +|Stage 2|Claude Sonnet 3.5 ($0.081)|GPT-4o ($0.045)|Gemini Pro ($0.050)
355 +|Stage 3|Claude Sonnet 3.5 ($0.030)|GPT-4o ($0.018)|Gemini Pro ($0.020)
356 +|**Total (0% cache)**|**$0.114**|**$0.065**|**$0.072**
357 +
358 +**Note:** POC1 uses Anthropic exclusively for consistency. Multi-provider support planned for POC2.
359 +
360 +----
361 +
362 +=== 6.6 Failover Strategy ===
363 +
364 +**Automatic Failover:**
365 +
366 +{{{
367 +async function completeLLM(stage: string, prompt: string): Promise<string> {
368 + const primaryProvider = getProviderForStage(stage)
369 + const fallbackProvider = getFallbackProvider()
370 +
371 + try {
372 + return await primaryProvider.complete(prompt)
373 + } catch (error) {
374 + if (error.type === 'rate_limit' || error.type === 'service_unavailable') {
375 + logger.warn(`Primary provider failed, using fallback`)
376 + return await fallbackProvider.complete(prompt)
377 + }
378 + throw error
379 + }
380 +}
381 +}}}
382 +
383 +**Fallback Priority:**
384 +
385 +1. **Primary:** Configured provider for stage
386 +2. **Secondary:** Fallback provider (if configured)
387 +3. **Cache:** Return cached result (if available for Stage 2)
388 +4. **Error:** Return 503 Service Unavailable
389 +
390 +----
391 +
392 +=== 6.7 Provider Selection API ===
393 +
394 +**Admin Endpoint:** POST /admin/v1/llm/configure
395 +
396 +**Update provider for specific stage:**
397 +
398 +{{{{
399 +{
400 + "stage": "stage2",
401 + "provider": "openai",
402 + "model": "gpt-4o",
403 + "max_tokens": 16384,
404 + "temperature": 0.3
405 +}
406 +}}}
407 +
408 +**Response:** 200 OK
409 +
410 +{{{{
411 +{
412 + "message": "LLM configuration updated",
413 + "stage": "stage2",
414 + "previous": {
415 + "provider": "anthropic",
416 + "model": "claude-sonnet-3-5"
417 + },
418 + "current": {
419 + "provider": "openai",
420 + "model": "gpt-4o"
421 + },
422 + "cost_impact": {
423 + "previous_cost_per_claim": 0.081,
424 + "new_cost_per_claim": 0.045,
425 + "savings_percent": 44
426 + }
427 +}
428 +}}}
429 +
430 +**Get current configuration:**
431 +
432 +GET /admin/v1/llm/config
433 +
434 +{{{{
435 +{
436 + "providers": ["anthropic", "openai"],
437 + "primary": "anthropic",
438 + "fallback": "openai",
439 + "stages": {
440 + "stage1": {
441 + "provider": "anthropic",
442 + "model": "claude-haiku-4",
443 + "cost_per_request": 0.003
444 + },
445 + "stage2": {
446 + "provider": "anthropic",
447 + "model": "claude-sonnet-3-5",
448 + "cost_per_new_claim": 0.081
449 + },
450 + "stage3": {
451 + "provider": "anthropic",
452 + "model": "claude-sonnet-3-5",
453 + "cost_per_request": 0.030
454 + }
455 + }
456 +}
457 +}}}
458 +
459 +----
460 +
461 +=== 6.8 Implementation Notes ===
462 +
463 +**Provider Adapter Pattern:**
464 +
465 +{{{
466 +class AnthropicProvider implements LLMProvider {
467 + async complete(prompt: string, options: CompletionOptions) {
468 + const response = await anthropic.messages.create({
469 + model: options.model || 'claude-sonnet-3-5',
470 + max_tokens: options.maxTokens || 4096,
471 + messages: [{ role: 'user', content: prompt }],
472 + system: options.systemPrompt
473 + })
474 + return response.content[0].text
475 + }
476 +}
477 +
478 +class OpenAIProvider implements LLMProvider {
479 + async complete(prompt: string, options: CompletionOptions) {
480 + const response = await openai.chat.completions.create({
481 + model: options.model || 'gpt-4o',
482 + max_tokens: options.maxTokens || 4096,
483 + messages: [
484 + { role: 'system', content: options.systemPrompt },
485 + { role: 'user', content: prompt }
486 + ]
487 + })
488 + return response.choices[0].message.content
489 + }
490 +}
491 +}}}
492 +
493 +**Provider Registry:**
494 +
495 +{{{
496 +const providers = new Map<string, LLMProvider>()
497 +providers.set('anthropic', new AnthropicProvider())
498 +providers.set('openai', new OpenAIProvider())
499 +providers.set('google', new GoogleProvider())
500 +
501 +function getProvider(name: string): LLMProvider {
 502 + return providers.get(name) ?? providers.get(config.primaryProvider)!
503 +}
504 +}}}
505 +
506 +----
507 +
182 182  == 3. REST API Contract ==
183 183  
184 184  === 3.1 User Credit Tracking ===
... ... @@ -188,19 +188,19 @@
188 188  **Response:** 200 OK
189 189  
190 190  {{{{
191 - "user_id": "user_abc123",
192 - "tier": "free",
193 - "credit_limit": 10.00,
194 - "credit_used": 7.42,
195 - "credit_remaining": 2.58,
196 - "reset_date": "2025-02-01T00:00:00Z",
197 - "cache_only_mode": false,
198 - "usage_stats": {
199 - "articles_analyzed": 67,
200 - "claims_from_cache": 189,
201 - "claims_newly_analyzed": 113,
202 - "cache_hit_rate": 0.626
203 - }
517 + "user_id": "user_abc123",
518 + "tier": "free",
519 + "credit_limit": 10.00,
520 + "credit_used": 7.42,
521 + "credit_remaining": 2.58,
522 + "reset_date": "2025-02-01T00:00:00Z",
523 + "cache_only_mode": false,
524 + "usage_stats": {
525 + "articles_analyzed": 67,
526 + "claims_from_cache": 189,
527 + "claims_newly_analyzed": 113,
528 + "cache_hit_rate": 0.626
529 + }
204 204  }
205 205  }}}
206 206  
... ... @@ -221,11 +221,11 @@
221 221  OR use the client.request_id field:
222 222  
223 223  {{{{
224 - "input_url": "...",
225 - "client": {
226 - "request_id": "client-uuid-12345",
227 - "source_label": "optional"
228 - }
550 + "input_url": "...",
551 + "client": {
552 + "request_id": "client-uuid-12345",
553 + "source_label": "optional"
554 + }
229 229  }
230 230  }}}
231 231  
... ... @@ -239,11 +239,11 @@
239 239  **Example Response (Idempotent):**
240 240  
241 241  {{{{
242 - "job_id": "01J...ULID",
243 - "status": "RUNNING",
244 - "idempotent": true,
245 - "original_request_at": "2025-12-24T10:31:00Z",
246 - "message": "Returning existing job (idempotency key matched)"
568 + "job_id": "01J...ULID",
569 + "status": "RUNNING",
570 + "idempotent": true,
571 + "original_request_at": "2025-12-24T10:31:00Z",
572 + "message": "Returning existing job (idempotency key matched)"
247 247  }
248 248  }}}
249 249  
... ... @@ -250,21 +250,21 @@
250 250  ==== Request Body: ====
251 251  
252 252  {{{{
253 - "input_type": "url",
254 - "input_url": "https://example.com/medical-report-01",
255 - "input_text": null,
256 - "options": {
257 - "browsing": "on",
258 - "depth": "standard",
259 - "max_claims": 5,
260 - "scenarios_per_claim": 2,
261 - "max_evidence_per_scenario": 6,
262 - "context_aware_analysis": true
263 - },
264 - "client": {
265 - "request_id": "optional-client-tracking-id",
266 - "source_label": "optional"
267 - }
579 + "input_type": "url",
580 + "input_url": "https://example.com/medical-report-01",
581 + "input_text": null,
582 + "options": {
583 + "browsing": "on",
584 + "depth": "standard",
585 + "max_claims": 5,
586 + "scenarios_per_claim": 2,
587 + "max_evidence_per_scenario": 6,
588 + "context_aware_analysis": true
589 + },
590 + "client": {
591 + "request_id": "optional-client-tracking-id",
592 + "source_label": "optional"
593 + }
268 268  }
269 269  }}}
270 270  
... ... @@ -280,27 +280,27 @@
280 280  **Response:** 202 Accepted
281 281  
282 282  {{{{
283 - "job_id": "01J...ULID",
284 - "status": "QUEUED",
285 - "created_at": "2025-12-24T10:31:00Z",
286 - "estimated_cost": 0.114,
287 - "cost_breakdown": {
288 - "stage1_extraction": 0.003,
289 - "stage2_new_claims": 0.081,
290 - "stage2_cached_claims": 0.000,
291 - "stage3_holistic": 0.030
292 - },
293 - "cache_info": {
294 - "claims_to_extract": 5,
295 - "estimated_cache_hits": 4,
296 - "estimated_new_claims": 1
297 - },
298 - "links": {
299 - "self": "/v1/jobs/01J...ULID",
300 - "result": "/v1/jobs/01J...ULID/result",
301 - "report": "/v1/jobs/01J...ULID/report",
302 - "events": "/v1/jobs/01J...ULID/events"
303 - }
609 + "job_id": "01J...ULID",
610 + "status": "QUEUED",
611 + "created_at": "2025-12-24T10:31:00Z",
612 + "estimated_cost": 0.114,
613 + "cost_breakdown": {
614 + "stage1_extraction": 0.003,
615 + "stage2_new_claims": 0.081,
616 + "stage2_cached_claims": 0.000,
617 + "stage3_holistic": 0.030
618 + },
619 + "cache_info": {
620 + "claims_to_extract": 5,
621 + "estimated_cache_hits": 4,
622 + "estimated_new_claims": 1
623 + },
624 + "links": {
625 + "self": "/v1/jobs/01J...ULID",
626 + "result": "/v1/jobs/01J...ULID/result",
627 + "report": "/v1/jobs/01J...ULID/report",
628 + "events": "/v1/jobs/01J...ULID/events"
629 + }
304 304  }
305 305  }}}
306 306  
... ... @@ -309,12 +309,12 @@
309 309  402 Payment Required - Free tier limit reached, cache-only mode
310 310  
311 311  {{{{
312 - "error": "credit_limit_reached",
313 - "message": "Monthly credit limit reached. Entering cache-only mode.",
314 - "cache_only_mode": true,
315 - "credit_remaining": 0.00,
316 - "reset_date": "2025-02-01T00:00:00Z",
317 - "action": "Resubmit with cache_preference=allow_partial for cached results"
638 + "error": "credit_limit_reached",
639 + "message": "Monthly credit limit reached. Entering cache-only mode.",
640 + "cache_only_mode": true,
641 + "credit_remaining": 0.00,
642 + "reset_date": "2025-02-01T00:00:00Z",
643 + "action": "Resubmit with cache_preference=allow_partial for cached results"
318 318  }
319 319  }}}
320 320  
... ... @@ -325,29 +325,29 @@
325 325  === 4.1 Stage 1 Output: ClaimExtraction ===
326 326  
327 327  {{{{
328 - "job_id": "01J...ULID",
329 - "stage": "stage1_extraction",
330 - "article_metadata": {
331 - "title": "Article title",
332 - "source_url": "https://example.com/article",
333 - "extracted_text_length": 5234,
334 - "language": "en"
335 - },
336 - "claims": [
337 - {
338 - "claim_id": "C1",
339 - "claim_text": "Original claim text from article",
340 - "canonical_claim": "Normalized, deduplicated phrasing",
341 - "claim_hash": "sha256:abc123...",
342 - "is_central_to_thesis": true,
343 - "claim_type": "causal",
344 - "evaluability": "evaluable",
345 - "risk_tier": "B",
346 - "domain": "public_health"
347 - }
348 - ],
349 - "article_thesis": "Main argument detected",
350 - "cost": 0.003
654 + "job_id": "01J...ULID",
655 + "stage": "stage1_extraction",
656 + "article_metadata": {
657 + "title": "Article title",
658 + "source_url": "https://example.com/article",
659 + "extracted_text_length": 5234,
660 + "language": "en"
661 + },
662 + "claims": [
663 + {
664 + "claim_id": "C1",
665 + "claim_text": "Original claim text from article",
666 + "canonical_claim": "Normalized, deduplicated phrasing",
667 + "claim_hash": "sha256:abc123...",
668 + "is_central_to_thesis": true,
669 + "claim_type": "causal",
670 + "evaluability": "evaluable",
671 + "risk_tier": "B",
672 + "domain": "public_health"
673 + }
674 + ],
675 + "article_thesis": "Main argument detected",
676 + "cost": 0.003
351 351  }
352 352  }}}
353 353  
... ... @@ -433,7 +433,7 @@
433 433  **Data Structure:**
434 434  
435 435  {{{SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}'
436 -EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days
762 +EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days
437 437  }}}
438 438  
439 439  ----
... ... @@ -445,44 +445,44 @@
445 445  **Algorithm: Canonical Claim Normalization v1**
446 446  
447 447  {{{def normalize_claim_v1(claim_text: str, language: str) -> str:
448 - """
449 - Normalizes claim to canonical form for cache key generation.
450 - Version: v1norm1 (POC1)
451 - """
452 - import re
453 - import unicodedata
454 -
455 - # Step 1: Unicode normalization (NFC)
456 - text = unicodedata.normalize('NFC', claim_text)
457 -
458 - # Step 2: Lowercase
459 - text = text.lower()
460 -
461 - # Step 3: Remove punctuation (except hyphens in words)
462 - text = re.sub(r'[^\w\s-]', '', text)
463 -
464 - # Step 4: Normalize whitespace (collapse multiple spaces)
465 - text = re.sub(r'\s+', ' ', text).strip()
466 -
467 - # Step 5: Numeric normalization
468 - text = text.replace('%', ' percent')
469 - # Spell out single-digit numbers
470 - num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
471 - '4':'four', '5':'five', '6':'six', '7':'seven',
472 - '8':'eight', '9':'nine'}
473 - for num, word in num_to_word.items():
474 - text = re.sub(rf'\b{num}\b', word, text)
475 -
476 - # Step 6: Common abbreviations (English only in v1)
477 - if language == 'en':
478 - text = text.replace('covid-19', 'covid')
479 - text = text.replace('u.s.', 'us')
480 - text = text.replace('u.k.', 'uk')
481 -
482 - # Step 7: NO entity normalization in v1
483 - # (Trump vs Donald Trump vs President Trump remain distinct)
484 -
485 - return text
774 + """
775 + Normalizes claim to canonical form for cache key generation.
776 + Version: v1norm1 (POC1)
777 + """
778 + import re
779 + import unicodedata
780 +
781 + # Step 1: Unicode normalization (NFC)
782 + text = unicodedata.normalize('NFC', claim_text)
783 +
784 + # Step 2: Lowercase
785 + text = text.lower()
786 +
 787 + # Step 3: Remove punctuation (except hyphens in words; keep '%' for Step 5)
 788 + text = re.sub(r'[^\w\s%-]', '', text)
789 +
790 + # Step 4: Normalize whitespace (collapse multiple spaces)
791 + text = re.sub(r'\s+', ' ', text).strip()
792 +
793 + # Step 5: Numeric normalization
794 + text = text.replace('%', ' percent')
795 + # Spell out single-digit numbers
796 + num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
797 + '4':'four', '5':'five', '6':'six', '7':'seven',
798 + '8':'eight', '9':'nine'}
799 + for num, word in num_to_word.items():
800 + text = re.sub(rf'\b{num}\b', word, text)
801 +
802 + # Step 6: Common abbreviations (English only in v1)
803 + if language == 'en':
804 + text = text.replace('covid-19', 'covid')
805 + text = text.replace('u.s.', 'us')
806 + text = text.replace('u.k.', 'uk')
807 +
808 + # Step 7: NO entity normalization in v1
809 + # (Trump vs Donald Trump vs President Trump remain distinct)
810 +
811 + return text
486 486  
487 487  # Version identifier (include in cache namespace)
488 488  CANONICALIZER_VERSION = "v1norm1"
... ... @@ -495,19 +495,19 @@
495 495  cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"
496 496  
497 497  Example:
498 - claim: "COVID-19 vaccines are 95% effective"
499 - canonical: "covid vaccines are 95 percent effective"
500 - sha256: abc123...def456
501 - key: "claim:v1norm1:en:abc123...def456"
824 + claim: "COVID-19 vaccines are 95% effective"
825 + canonical: "covid vaccines are 95 percent effective"
826 + sha256: abc123...def456
827 + key: "claim:v1norm1:en:abc123...def456"
502 502  }}}
503 503  
504 504  **Cache Metadata MUST Include:**
505 505  
506 506  {{{{
507 - "canonical_claim": "covid vaccines are 95 percent effective",
508 - "canonicalizer_version": "v1norm1",
509 - "language": "en",
510 - "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
833 + "canonical_claim": "covid vaccines are 95 percent effective",
834 + "canonicalizer_version": "v1norm1",
835 + "language": "en",
836 + "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
511 511  }
512 512  }}}
513 513