Changes for page POC1 API & Schemas Specification
Last modified by Robert Schaub on 2025/12/24 18:26
Page properties (1 modified, 0 added, 0 removed)
@@ -1,673 +1,904 @@
-# FactHarbor POC1 - API & Schemas Specification
+= POC1 API & Schemas Specification =

-**Version:** 0.3 (POC1 - Production Ready)
-**Namespace:** FactHarbor.*
-**Syntax:** xWiki 2.1
-**Last Updated:** 2025-12-24
+----

----
-
 == Version History ==

 |=Version|=Date|=Changes
-|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details, quality gate logging, temporal separation note, cross-references
-|0.2|2025-12-24|Initial rebased version with holistic assessment
-|0.1|2025-12-24|Original specification
+|0.4.1|2025-12-24|Applied 9 critical fixes: file format notice, verdict taxonomy, canonicalization algorithm, Stage 1 cost policy, BullMQ fix, language in cache key, historical claims TTL, idempotency, copyright policy
+|0.4|2025-12-24|**BREAKING:** 3-stage pipeline with claim-level caching, user tier system, cache-only mode for free users, Redis cache architecture
+|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints
+|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details

----
+----

 == 1. Core Objective (POC1) ==

-The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)**:
+The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability.

-The system must prove that AI can identify an article's **Main Thesis** and determine if the supporting claims (even if individually accurate) logically support that thesis without committing fallacies (e.g., correlation vs. causation, cherry-picking, hasty generalization).
+The system must prove that AI can identify an article's **Main Thesis** and determine if supporting claims logically support that thesis without committing fallacies.

-**Success Criteria:**
+=== Success Criteria: ===
+
 * Test with 30 diverse articles
 * Target: ≥70% accuracy detecting misleading articles
-* Cost: <$0.35 per analysis
+* Cost: <$0.25 per NEW analysis (uncached)
+* Cost: $0.00 for cached claim reuse
+* Cache hit rate: ≥50% after 1,000 articles
 * Processing time: <2 minutes (standard depth)

-**See:** [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]] for complete investigation of 7 approaches.
+=== Economic Model: ===

----
+* **Free tier:** $10 credit per month (~~40-140 articles depending on cache hits)
+* **After limit:** Cache-only mode (instant, free access to cached claims)
+* **Paid tier:** Unlimited new analyses

-== 2. Runtime Model & Job States ==
+----

-=== 2.1 Pipeline Steps ===
+== 2. Architecture Overview ==

-For progress reporting via API, the pipeline follows these stages:
+=== 2.1 3-Stage Pipeline with Caching ===

-# **INGEST**: URL scraping (Jina Reader / Trafilatura) or text normalization.
-# **EXTRACT_CLAIMS**: Identifying 3-5 verifiable factual claims + marking central vs. supporting.
-# **SCENARIOS**: Generating context interpretations for each claim.
-# **RETRIEVAL**: Evidence gathering (Search API + mandatory contradiction search).
-# **VERDICTS**: Assigning likelihoods, confidence, and uncertainty per scenario.
-# **HOLISTIC_ASSESSMENT**: Evaluating article-level credibility (Thesis vs. Claims logic).
-# **REPORT**: Generating final Markdown and JSON outputs.
+FactHarbor POC1 uses a **3-stage architecture** designed for claim-level caching and cost efficiency:

-=== 2.1.1 URL Extraction Strategy ===
+{{mermaid}}
+graph TD
+    A[Article Input] --> B[Stage 1: Extract Claims]
+    B --> C{For Each Claim}
+    C --> D[Check Cache]
+    D -->|Cache HIT| E[Return Cached Verdict]
+    D -->|Cache MISS| F[Stage 2: Analyze Claim]
+    F --> G[Store in Cache]
+    G --> E
+    E --> H[Stage 3: Holistic Assessment]
+    H --> I[Final Report]
+{{/mermaid}}

-**Primary:** Jina AI Reader ({{code}}https://r.jina.ai/{url}{{/code}})
-* **Rationale:** Clean markdown, handles JS rendering, free tier sufficient
-* **Fallback:** Trafilatura (Python library) for simple static HTML
+==== Stage 1: Claim Extraction (Haiku, no cache) ====

-**Error Handling:**
+* **Input:** Article text
+* **Output:** 5 canonical claims (normalized, deduplicated)
+* **Model:** Claude Haiku 4 (default, configurable via LLM abstraction layer)
+* **Cost:** $0.003 per article
+* **Cache strategy:** No caching (article-specific)

-|=Error Code|=Trigger|=Action
-|{{code}}URL_BLOCKED{{/code}}|403/401/Paywall detected|Return error, suggest text paste
-|{{code}}URL_UNREACHABLE{{/code}}|Network/DNS failure|Retry once, then fail
-|{{code}}URL_NOT_FOUND{{/code}}|404 Not Found|Return error immediately
-|{{code}}EXTRACTION_FAILED{{/code}}|Content <50 words or unreadable|Return error with reason
+==== Stage 2: Claim Analysis (Sonnet, CACHED) ====

-**Supported URL Patterns:**
-* ✅ News articles, blog posts, Wikipedia
-* ✅ Academic preprints (arXiv)
-* ❌ Social media posts (Twitter, Facebook) - not in POC1
-* ❌ Video platforms (YouTube, TikTok) - not in POC1
-* ❌ PDF files - deferred to Beta 0
+* **Input:** Single canonical claim
+* **Output:** Scenarios + Evidence + Verdicts
+* **Model:** Claude Sonnet 3.5 (default, configurable via LLM abstraction layer)
+* **Cost:** $0.081 per NEW claim
+* **Cache strategy:** Redis, 90-day TTL
+* **Cache key:** claim:v1norm1:{language}:{sha256(canonical_claim)}

-=== 2.2 Job Status Enumeration ===
+==== Stage 3: Holistic Assessment (Sonnet, no cache) ====

-(((
-* **QUEUED** - Job accepted, waiting in queue
-* **RUNNING** - Processing in progress
-* **SUCCEEDED** - Analysis complete, results available
-* **FAILED** - Error occurred, see error details
-* **CANCELLED** - User cancelled via DELETE endpoint
-)))
+* **Input:** Article + Claim verdicts (from cache or Stage 2)
+* **Output:** Article verdict + Fallacies + Logic quality
+* **Model:** Claude Sonnet 3.5 (default, configurable via LLM abstraction layer)
+* **Cost:** $0.030 per article
+* **Cache strategy:** No caching (article-specific)

----

-== 3. REST API Contract ==

-=== 3.1 Create Analysis Job ===
+**Note:** Stage 3 implements **Approach 1 (Single-Pass Holistic Analysis)** from the [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]. While claim analysis (Stage 2) is cached for efficiency, the holistic assessment maintains the integrated evaluation philosophy of Approach 1.
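The Stage 2 cache key and the per-stage costs listed above can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the FactHarbor implementation: the function names (`buildClaimCacheKey`, `partitionByCache`, `estimateArticleCostUsd`) are hypothetical, an in-memory `Map` stands in for the Redis cache, and hashing the claim as UTF-8 with a lowercase hex digest is an assumption, since the spec only states `sha256(canonical_claim)`. The claim text is assumed to have been canonicalized already in Stage 1.

```typescript
import { createHash } from "node:crypto";

// Per-stage costs in USD, taken from the stage descriptions above.
const STAGE1_EXTRACTION_USD = 0.003;
const STAGE2_PER_NEW_CLAIM_USD = 0.081;
const STAGE3_HOLISTIC_USD = 0.03;

// Builds the Stage 2 cache key: claim:v1norm1:{language}:{sha256(canonical_claim)}.
// Assumption: the claim is hashed as UTF-8 and the digest is lowercase hex.
function buildClaimCacheKey(language: string, canonicalClaim: string): string {
  const digest = createHash("sha256").update(canonicalClaim, "utf8").digest("hex");
  return `claim:v1norm1:${language}:${digest}`;
}

// Splits canonical claims into cache hits and cache misses.
// Hypothetical helper: a Map stands in for the Redis cache.
function partitionByCache(
  language: string,
  canonicalClaims: string[],
  cache: Map<string, string>,
): { hits: string[]; misses: string[] } {
  const hits: string[] = [];
  const misses: string[] = [];
  for (const claim of canonicalClaims) {
    const key = buildClaimCacheKey(language, claim);
    (cache.has(key) ? hits : misses).push(claim);
  }
  return { hits, misses };
}

// Estimated cost of one article: only cache misses incur the Stage 2 charge.
function estimateArticleCostUsd(newClaims: number): number {
  const total =
    STAGE1_EXTRACTION_USD + newClaims * STAGE2_PER_NEW_CLAIM_USD + STAGE3_HOLISTIC_USD;
  return Math.round(total * 1000) / 1000; // round to a tenth of a cent
}
```

With one of five claims missing from the cache, `estimateArticleCostUsd(1)` evaluates to 0.114, that is $0.003 + $0.081 + $0.030. Because the language is part of the key, the same claim text in two languages occupies two separate cache entries.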

-**Endpoint:** {{code}}POST /v1/analyze{{/code}}
+=== Total Cost Formula: ===

-**Request Body Example:**
-{{code language="json"}}
-{
-  "input_type": "url",
-  "input_url": "https://example.com/medical-report-01",
-  "input_text": null,
-  "options": {
-    "browsing": "on",
-    "depth": "standard",
-    "max_claims": 5,
-    "context_aware_analysis": true
-  },
-  "client": {
-    "request_id": "optional-client-tracking-id",
-    "source_label": "optional"
-  }
-}
-{{/code}}
+{{{Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)

-**Options:**
-* {{code}}browsing{{/code}}: {{code}}on{{/code}} | {{code}}off{{/code}} (retrieve web sources or just output queries)
-* {{code}}depth{{/code}}: {{code}}standard{{/code}} | {{code}}deep{{/code}} (evidence thoroughness)
-* {{code}}max_claims{{/code}}: 1-50 (default: 10)
-* {{code}}context_aware_analysis{{/code}}: {{code}}true{{/code}} | {{code}}false{{/code}} (experimental)
+Examples:
+- 0 new claims (100% cache hit): $0.033
+- 1 new claim (80% cache hit): $0.114
+- 3 new claims (40% cache hit): $0.276
+- 5 new claims (0% cache hit): $0.438
+}}}

-**Response:** {{code}}202 Accepted{{/code}}
+----

-{{code language="json"}}
-{
-  "job_id": "01J...ULID",
-  "status": "QUEUED",
-  "created_at": "2025-12-24T10:31:00Z",
-  "links": {
-    "self": "/v1/jobs/01J...ULID",
-    "result": "/v1/jobs/01J...ULID/result",
-    "report": "/v1/jobs/01J...ULID/report",
-    "events": "/v1/jobs/01J...ULID/events"
-  }
-}
-{{/code}}
+=== 2.2 User Tier System ===

----
+|=Tier|=Monthly Credit|=After Limit|=Cache Access|=Analytics
+|**Free**|$10|Cache-only mode|✅ Full|Basic
+|**Pro** (future)|$50|Continues|✅ Full|Advanced
+|**Enterprise** (future)|Custom|Continues|✅ Full + Priority|Full

-=== 3.2 Get Job Status ===
+**Free Tier Economics:**

-**Endpoint:** {{code}}GET /v1/jobs/{job_id}{{/code}}
+* $10 credit = 40-140 articles analyzed (depending on cache hit rate)
+* Average 70 articles/month at 70% cache hit rate
+* After limit: Cache-only mode

-**Response:** {{code}}200 OK{{/code}}
+----

-{{code language="json"}}
-{
-  "job_id": "01J...ULID",
-  "status": "RUNNING",
-  "created_at": "2025-12-24T10:31:00Z",
-  "updated_at": "2025-12-24T10:31:22Z",
-  "progress": {
-    "step": "RETRIEVAL",
-    "percent": 60,
-    "message": "Gathering evidence for C2-S1",
-    "current_claim_id": "C2",
-    "current_scenario_id": "C2-S1"
-  },
-  "input_echo": {
-    "input_type": "url",
-    "input_url": "https://example.com/medical-report-01"
-  },
-  "links": {
-    "self": "/v1/jobs/01J...ULID",
-    "result": "/v1/jobs/01J...ULID/result",
-    "report": "/v1/jobs/01J...ULID/report"
-  },
-  "error": null
+=== 2.3 Cache-Only Mode (Free Tier Feature) ===
+
+When free users reach their $10 monthly limit, they enter **Cache-Only Mode**:
+
+==== What Cache-Only Mode Provides: ====
+
+✅ **Claim Extraction (Platform-Funded):**
+
+* Stage 1 extraction runs at $0.003 per article
+* **Cost: Absorbed by platform** (not charged to user credit)
+* Rationale: Extraction is necessary to check cache, and cost is negligible
+* Rate limit: Max 50 extractions/day in cache-only mode (prevents abuse)
+
+✅ **Instant Access to Cached Claims:**
+
+* Any claim that exists in cache → Full verdict returned
+* Cost: $0 (no LLM calls)
+* Response time: <100ms
+
+✅ **Partial Article Analysis:**
+
+* Check each claim against cache
+* Return verdicts for ALL cached claims
+* For uncached claims: Return "status": "cache_miss"
+
+✅ **Cache Coverage Report:**
+
+* "3 of 5 claims available in cache (60% coverage)"
+* Links to cached analyses
+* Estimated cost to complete: $0.162 (2 new claims)
+
+❌ **Not Available in Cache-Only Mode:**
+
+* New claim analysis (Stage 2 LLM calls blocked)
+* Full holistic assessment (Stage 3 blocked if any claims missing)
+
+==== User Experience Example: ====
+
+{{{{
+  "status": "cache_only_mode",
+  "message": "Monthly credit limit reached. Showing cached results only.",
+  "cache_coverage": {
+    "claims_total": 5,
+    "claims_cached": 3,
+    "claims_missing": 2,
+    "coverage_percent": 60
+  },
+  "cached_claims": [
+    {"claim_id": "C1", "verdict": "Likely", "confidence": 0.82},
+    {"claim_id": "C2", "verdict": "Highly Likely", "confidence": 0.91},
+    {"claim_id": "C4", "verdict": "Unclear", "confidence": 0.55}
+  ],
+  "missing_claims": [
+    {"claim_id": "C3", "claim_text": "...", "estimated_cost": "$0.081"},
+    {"claim_id": "C5", "claim_text": "...", "estimated_cost": "$0.081"}
+  ],
+  "upgrade_options": {
+    "top_up": "$5 for 20-70 more articles",
+    "pro_tier": "$50/month unlimited"
+  }
 }
-{{/code}}
+}}}

----
+**Design Rationale:**

-=== 3.3 Get JSON Result ===
+* Free users still get value (cached claims often answer their question)
+* Demonstrates FactHarbor's value (partial results encourage upgrade)
+* Sustainable for platform (no additional cost)
+* Fair to all users (everyone contributes to cache)

-**Endpoint:** {{code}}GET /v1/jobs/{job_id}/result{{/code}}
+----

-**Response:** {{code}}200 OK{{/code}} (Returns the **AnalysisResult** schema - see Section 4)

-**Other Responses:**
-* {{code}}409 Conflict{{/code}} - Job not finished yet
-* {{code}}404 Not Found{{/code}} - Job ID unknown

----
+== 6. LLM Abstraction Layer ==

-=== 3.4 Download Markdown Report ===
+=== 6.1 Design Principle ===

-**Endpoint:** {{code}}GET /v1/jobs/{job_id}/report{{/code}}
+**FactHarbor uses provider-agnostic LLM abstraction** to avoid vendor lock-in and enable:

-**Response:** {{code}}200 OK{{/code}} with {{code}}text/markdown; charset=utf-8{{/code}} content
+* **Provider switching:** Change LLM providers without code changes
+* **Cost optimization:** Use different providers for different stages
+* **Resilience:** Automatic fallback if primary provider fails
+* **Cross-checking:** Compare outputs from multiple providers
+* **A/B testing:** Test new models without deployment changes

-**Headers:**
-* {{code}}Content-Disposition: attachment; filename="factharbor_poc1_{job_id}.md"{{/code}}
+**Implementation:** All LLM calls go through an abstraction layer that routes to configured providers.

-**Other Responses:**
-* {{code}}409 Conflict{{/code}} - Job not finished
-* {{code}}404 Not Found{{/code}} - Job unknown
+----

----
+=== 6.2 LLM Provider Interface ===

-=== 3.5 Stream Job Events (Optional, Recommended) ===
+**Abstract Interface:**

-**Endpoint:** {{code}}GET /v1/jobs/{job_id}/events{{/code}}
+{{{
+interface LLMProvider {
+  // Core methods
+  complete(prompt: string, options: CompletionOptions): Promise<CompletionResponse>
+  stream(prompt: string, options: CompletionOptions): AsyncIterator<StreamChunk>
+
+  // Provider metadata
+  getName(): string
+  getMaxTokens(): number
+  getCostPer1kTokens(): { input: number, output: number }
+
+  // Health check
+  isAvailable(): Promise<boolean>
+}

-**Response:** Server-Sent Events (SSE) stream
+interface CompletionOptions {
+  model?: string
+  maxTokens?: number
+  temperature?: number
+  stopSequences?: string[]
+  systemPrompt?: string
+}
+}}}

-**Event Types:**
-* {{code}}progress{{/code}} - Progress update
-* {{code}}claim_extracted{{/code}} - Claim identified
-* {{code}}verdict_computed{{/code}} - Scenario verdict complete
-* {{code}}complete{{/code}} - Job finished
-* {{code}}error{{/code}} - Error occurred
+----

----
+=== 6.3 Supported Providers (POC1) ===

-=== 3.6 Cancel Job ===
+**Primary Provider (Default):**

-**Endpoint:** {{code}}DELETE /v1/jobs/{job_id}{{/code}}
+* **Anthropic Claude API**
+  * Models: Claude Haiku 4, Claude Sonnet 3.5, Claude Opus 4
+  * Used by default in POC1
+  * Best quality for holistic analysis

-Attempts to cancel a queued or running job.
+**Secondary Providers (Future):**

-**Response:** {{code}}200 OK{{/code}} with updated Job object (status: CANCELLED)
+* **OpenAI API**
+  * Models: GPT-4o, GPT-4o-mini
+  * For cost comparison
+
+* **Google Vertex AI**
+  * Models: Gemini 1.5 Pro, Gemini 1.5 Flash
+  * For diversity in evidence gathering

-**Note:** Already-completed jobs cannot be cancelled.
+* **Local Models** (Post-POC)
+  * Models: Llama 3.1, Mistral
+  * For privacy-sensitive deployments

----
+----

-=== 3.7 Health Check ===
+=== 6.4 Provider Configuration ===

-**Endpoint:** {{code}}GET /v1/health{{/code}}
+**Environment Variables:**

-**Response:** {{code}}200 OK{{/code}}
+{{{
+# Primary provider
+LLM_PRIMARY_PROVIDER=anthropic
+ANTHROPIC_API_KEY=sk-ant-...

-{{code language="json"}}
+# Fallback provider
+LLM_FALLBACK_PROVIDER=openai
+OPENAI_API_KEY=sk-...
+
+# Provider selection per stage
+LLM_STAGE1_PROVIDER=anthropic
+LLM_STAGE1_MODEL=claude-haiku-4
+LLM_STAGE2_PROVIDER=anthropic
+LLM_STAGE2_MODEL=claude-sonnet-3-5
+LLM_STAGE3_PROVIDER=anthropic
+LLM_STAGE3_MODEL=claude-sonnet-3-5
+
+# Cost limits
+LLM_MAX_COST_PER_REQUEST=1.00
+}}}
+
+**Database Configuration (Alternative):**
+
+{{{
 {
-  "status": "ok",
-  "version": "POC1-v0.3",
-  "model": "claude-3-5-sonnet-20241022"
+  "providers": [
+    {
+      "name": "anthropic",
+      "api_key_ref": "vault://anthropic-api-key",
+      "enabled": true,
+      "priority": 1
+    },
+    {
+      "name": "openai",
+      "api_key_ref": "vault://openai-api-key",
+      "enabled": true,
+      "priority": 2
+    }
+  ],
+  "stage_config": {
+    "stage1": {
+      "provider": "anthropic",
+      "model": "claude-haiku-4",
+      "max_tokens": 4096,
+      "temperature": 0.0
+    },
+    "stage2": {
+      "provider": "anthropic",
+      "model": "claude-sonnet-3-5",
+      "max_tokens": 16384,
+      "temperature": 0.3
+    },
+    "stage3": {
+      "provider": "anthropic",
+      "model": "claude-sonnet-3-5",
+      "max_tokens": 8192,
+      "temperature": 0.2
+    }
+  }
 }
-{{/code}}
+}}}

----
+----

-== 4. AnalysisResult Schema (Context-Aware) ==
+=== 6.5 Stage-Specific Models (POC1 Defaults) ===

-This schema implements the **Context-Aware Analysis** required by the POC1 specification.
+**Stage 1: Claim Extraction**

-{{code language="json"}}
-{
-  "metadata": {
-    "job_id": "string (ULID)",
-    "timestamp_utc": "ISO8601",
-    "engine_version": "POC1-v0.3",
-    "llm_provider": "anthropic",
-    "llm_model": "claude-3-5-sonnet-20241022",
-    "usage_stats": {
-      "input_tokens": "integer",
-      "output_tokens": "integer",
-      "estimated_cost_usd": "float",
-      "response_time_sec": "float"
+* **Default:** Anthropic Claude Haiku 4
+* **Alternative:** OpenAI GPT-4o-mini, Google Gemini 1.5 Flash
+* **Rationale:** Fast, cheap, simple task
+* **Cost:** ~$0.003 per article
+
+**Stage 2: Claim Analysis** (CACHEABLE)
+
+* **Default:** Anthropic Claude Sonnet 3.5
+* **Alternative:** OpenAI GPT-4o, Google Gemini 1.5 Pro
+* **Rationale:** High-quality analysis, cached 90 days
+* **Cost:** ~$0.081 per NEW claim
+
+**Stage 3: Holistic Assessment**
+
+* **Default:** Anthropic Claude Sonnet 3.5
+* **Alternative:** OpenAI GPT-4o, Claude Opus 4 (for high-stakes)
+* **Rationale:** Complex reasoning, logical fallacy detection
+* **Cost:** ~$0.030 per article
+
+**Cost Comparison (Example):**
+
+|=Stage|=Anthropic (Default)|=OpenAI Alternative|=Google Alternative
+|Stage 1|Claude Haiku 4 ($0.003)|GPT-4o-mini ($0.002)|Gemini Flash ($0.002)
+|Stage 2|Claude Sonnet 3.5 ($0.081)|GPT-4o ($0.045)|Gemini Pro ($0.050)
+|Stage 3|Claude Sonnet 3.5 ($0.030)|GPT-4o ($0.018)|Gemini Pro ($0.020)
+|**Total (1 new claim)**|**$0.114**|**$0.065**|**$0.072**
+
+**Note:** POC1 uses Anthropic exclusively for consistency. Multi-provider support planned for POC2.
+
+----
+
+=== 6.6 Failover Strategy ===
+
+**Automatic Failover:**
+
+{{{
+async function completeLLM(stage: string, prompt: string, options: CompletionOptions = {}): Promise<CompletionResponse> {
+  const primaryProvider = getProviderForStage(stage)
+  const fallbackProvider = getFallbackProvider()
+
+  try {
+    return await primaryProvider.complete(prompt, options)
+  } catch (error: any) {
+    // Fall back only on transient provider errors; rethrow everything else
+    if (error.type === 'rate_limit' || error.type === 'service_unavailable') {
+      logger.warn('Primary provider failed, using fallback')
+      return await fallbackProvider.complete(prompt, options)
     }
+    throw error
+  }
+}
+}}}
+
+**Fallback Priority:**
+
+1. **Primary:** Configured provider for stage
+2. **Secondary:** Fallback provider (if configured)
+3. **Cache:** Return cached result (if available for Stage 2)
+4. **Error:** Return 503 Service Unavailable
+
+----
+
+=== 6.7 Provider Selection API ===
+
+**Admin Endpoint:** POST /admin/v1/llm/configure
+
+**Update provider for specific stage:**
+
+{{{
+{
+  "stage": "stage2",
+  "provider": "openai",
+  "model": "gpt-4o",
+  "max_tokens": 16384,
+  "temperature": 0.3
+}
+}}}
+
+**Response:** 200 OK
+
+{{{
+{
+  "message": "LLM configuration updated",
+  "stage": "stage2",
+  "previous": {
+    "provider": "anthropic",
+    "model": "claude-sonnet-3-5"
   },
-  "article_holistic_assessment": {
-    "main_thesis": "string (The core argument detected)",
-    "overall_verdict": "WELL-SUPPORTED | MISLEADING | REFUTED | UNCERTAIN",
-    "logic_quality_score": "float (0-1)",
-    "fallacies_detected": ["correlation-causation", "cherry-picking", "hasty-generalization"],
-    "verdict_reasoning": "string (Explanation of why article credibility differs from claim average)",
-    "experimental_feature": true
+  "current": {
+    "provider": "openai",
+    "model": "gpt-4o"
   },
-  "claims": [
-    {
-      "claim_id": "C1",
-      "is_central_to_thesis": "boolean",
-      "claim_text": "string",
-      "canonical_form": "string",
-      "claim_type": "descriptive | causal | predictive | normative | definitional",
-      "evaluability": "evaluable | partly_evaluable | not_evaluable",
-      "risk_tier": "A | B | C",
-      "risk_tier_justification": "string",
-      "domain": "string (e.g., 'public health', 'economics')",
-      "key_terms": ["term1", "term2"],
-      "entities": ["Person X", "Org Y"],
-      "time_scope_detected": "2020-2024",
-      "geography_scope_detected": "Brazil",
-      "scenarios": [
-        {
-          "scenario_id": "C1-S1",
-          "context_title": "string",
-          "definitions": {"key_term": "definition"},
-          "assumptions": ["Assumption 1", "Assumption 2"],
-          "boundaries": {
-            "time": "as of 2025-01",
-            "geography": "Brazil",
-            "population": "adult population",
-            "conditions": "excludes X; includes Y"
-          },
-          "scope_of_evidence": "What counts as evidence for this scenario",
-          "scenario_questions": ["Question that decides the verdict"],
-          "verdict": {
-            "label": "Highly Likely | Likely | Unclear | Unlikely | Refuted | Unsubstantiated",
-            "probability_range": [0.0, 1.0],
-            "confidence": "float (0-1)",
-            "reasoning": "string",
-            "key_supporting_evidence_ids": ["E1", "E3"],
-            "key_counter_evidence_ids": ["E2"],
-            "uncertainty_factors": ["Data gap", "Method disagreement"],
-            "what_would_change_my_mind": ["Specific new study", "Updated dataset"]
-          },
-          "evidence": [
-            {
-              "evidence_id": "E1",
-              "stance": "supports | undermines | mixed | context_dependent",
-              "relevance_to_scenario": "float (0-1)",
-              "evidence_summary": ["Bullet fact 1", "Bullet fact 2"],
-              "citation": {
-                "title": "Source title",
-                "author_or_org": "Org/Author",
-                "publication_date": "2024-05-01",
-                "url": "https://source.example",
-                "publisher": "Publisher/Domain"
-              },
-              "excerpt": ["Short quote ≤25 words (optional)"],
-              "source_reliability_score": "float (0-1) - READ-ONLY SNAPSHOT",
-              "reliability_justification": "Why high/medium/low",
-              "limitations_and_reservations": ["Limitation 1", "Limitation 2"],
-              "retraction_or_dispute_signal": "none | correction | retraction | disputed",
-              "retrieval_status": "OK | NEEDS_RETRIEVAL | FAILED"
-            }
-          ]
-        }
-      ]
+  "cost_impact": {
+    "previous_cost_per_claim": 0.081,
+    "new_cost_per_claim": 0.045,
+    "savings_percent": 44
+  }
+}
+}}}
+
+**Get current configuration:**
+
+GET /admin/v1/llm/config
+
+{{{
+{
+  "providers": ["anthropic", "openai"],
+  "primary": "anthropic",
+  "fallback": "openai",
+  "stages": {
+    "stage1": {
+      "provider": "anthropic",
+      "model": "claude-haiku-4",
+      "cost_per_request": 0.003
+    },
+    "stage2": {
+      "provider": "anthropic",
+      "model": "claude-sonnet-3-5",
+      "cost_per_new_claim": 0.081
+    },
+    "stage3": {
+      "provider": "anthropic",
+      "model": "claude-sonnet-3-5",
+      "cost_per_request": 0.030
     }
-  ],
-  "quality_gates": {
-    "gate1_claim_validation": "pass | fail",
-    "gate4_verdict_confidence": "pass | fail",
-    "passed_all": "boolean",
-    "gate_fail_reasons": [
-      {
-        "gate": "gate1_claim_validation",
-        "claim_id": "C1",
-        "reason_code": "OPINION_DETECTED | COMPOUND_CLAIM | SUBJECTIVE | TOO_VAGUE",
-        "explanation": "Human-readable explanation"
-      }
-    ]
-  },
-  "global_notes": {
-    "limitations": ["System limitation 1", "Limitation 2"],
-    "safety_or_policy_notes": ["Note 1"]
   }
 }
-{{/code}}
+}}}

-=== 4.1 Risk Tier Definitions ===
+----

-|=Tier|=Impact|=Examples|=Actions
-|**A (High)**|High real-world impact if wrong|Health claims, safety information, financial advice, medical procedures|Human review recommended (Mode3_Human_Reviewed_Required)
-|**B (Medium)**|Moderate impact, contested topics|Political claims, social issues, scientific debates, economic predictions|Enhanced contradiction search, AI-generated publication OK (Mode2_AI_Generated)
-|**C (Low)**|Low impact, easily verifiable|Historical facts, basic statistics, biographical data, geographic information|Standard processing, AI-generated publication OK (Mode2_AI_Generated)
+=== 6.8 Implementation Notes ===

-=== 4.2 Source Reliability (Read-Only Snapshots) ===
+**Provider Adapter Pattern:**

-**IMPORTANT:** The {{code}}source_reliability_score{{/code}} in each evidence item is a **historical snapshot** from the weekly background scoring job.
+{{{
+class AnthropicProvider implements LLMProvider {
+  async complete(prompt: string, options: CompletionOptions) {
+    const response = await anthropic.messages.create({
+      model: options.model || 'claude-sonnet-3-5',
+      max_tokens: options.maxTokens || 4096,
+      messages: [{ role: 'user', content: prompt }],
+      system: options.systemPrompt
+    })
+    return response.content[0].text
+  }
+}

-* POC1 treats these scores as **read-only** (no modification during analysis)
-* **Prevents circular dependency:** scoring → affects retrieval → affects scoring
-* Full Source Track Record System is a **separate service** (not part of POC1)
-* **Temporal separation:** Scoring runs weekly; analysis uses snapshots
+class OpenAIProvider implements LLMProvider {
+  async complete(prompt: string, options: CompletionOptions) {
+    const response = await openai.chat.completions.create({
+      model: options.model || 'gpt-4o',
+      max_tokens: options.maxTokens || 4096,
+      messages: [
+        { role: 'system', content: options.systemPrompt },
+        { role: 'user', content: prompt }
+      ]
+    })
+    return response.choices[0].message.content
+  }
+}
+}}}

-**See:** [[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]] Section 1.3 (Source Track Record System) for scoring algorithm.
+**Provider Registry:**

-=== 4.3 Quality Gate Reason Codes ===
+{{{
+const providers = new Map<string, LLMProvider>()
+providers.set('anthropic', new AnthropicProvider())
+providers.set('openai', new OpenAIProvider())
+providers.set('google', new GoogleProvider())

-**Gate 1 (Claim Validation):**
-* {{code}}OPINION_DETECTED{{/code}} - Subjective judgment without factual anchor
-* {{code}}COMPOUND_CLAIM{{/code}} - Multiple claims in one statement
-* {{code}}SUBJECTIVE{{/code}} - Value judgment, not verifiable fact
-* {{code}}TOO_VAGUE{{/code}} - Lacks specificity for evaluation
+function getProvider(name: string): LLMProvider {
+  return providers.get(name) || providers.get(config.primaryProvider)
+}
+}}}

-**Gate 4 (Verdict Confidence):**
-* {{code}}LOW_CONFIDENCE{{/code}} - Confidence below threshold (<0.5)
-* {{code}}INSUFFICIENT_EVIDENCE{{/code}} - Too few sources to reach verdict
-* {{code}}CONTRADICTORY_EVIDENCE{{/code}} - Evidence conflicts without resolution
-* {{code}}NO_COUNTER_EVIDENCE{{/code}} - Contradiction search failed
+----

-**Purpose:** Enable system improvement workflow (Observe → Analyze → Improve)
+== 3. REST API Contract ==

----
+=== 3.1 User Credit Tracking ===

-== 5. Validation Rules (POC1 Enforcement) ==
+**Endpoint:** GET /v1/user/credit

-|=Rule|=Requirement
-|**Mandatory Contradiction**|For every claim, the engine MUST search for "undermines" evidence. If none found, reasoning must explicitly state: "No counter-evidence found despite targeted search." Evidence must include at least 1 item with {{code}}stance ∈ {undermines, mixed, context_dependent}{{/code}} OR explicit note in {{code}}uncertainty_factors{{/code}}.
-|**Context-Aware Logic**|The {{code}}overall_verdict{{/code}} must prioritize central claims. If a {{code}}is_central_to_thesis=true{{/code}} claim is REFUTED, the overall article cannot be WELL-SUPPORTED. Central claims override verdict averaging.
-|**Author Identification**|All automated outputs MUST include {{code}}author_type: "AI/AKEL"{{/code}} or equivalent marker to distinguish AI-generated from human-reviewed content.
-|**Claim-to-Scenario Lifecycle**|In stateless POC1, Scenarios are **strictly children** of a specific Claim version. If a Claim's text changes, child Scenarios are part of that version's "snapshot." No scenario migration across versions.
+**Response:** 200 OK

----
+{{{{
+  "user_id": "user_abc123",
+  "tier": "free",
+  "credit_limit": 10.00,
+  "credit_used": 7.42,
+  "credit_remaining": 2.58,
+  "reset_date": "2025-02-01T00:00:00Z",
+  "cache_only_mode": false,
+  "usage_stats": {
+    "articles_analyzed": 67,
+    "claims_from_cache": 189,
+    "claims_newly_analyzed": 113,
+    "cache_hit_rate": 0.626
+  }
+}
+}}}

-== 6. Deterministic Markdown Template ==
+----

-The system renders {{code}}report.md{{/code}} using a **fixed template** based on the JSON result (NOT generated by LLM).
+=== 3.2 Create Analysis Job (3-Stage) ===

-{{code language="markdown"}}
-# FactHarbor Analysis Report: {overall_verdict}
+**Endpoint:** POST /v1/analyze

-**Job ID:** {job_id} | **Generated:** {timestamp_utc}
-**Model:** {llm_model} | **Cost:** ${estimated_cost_usd} | **Time:** {response_time_sec}s
+==== Idempotency Support: ====

----
+To prevent duplicate job creation on network retries, clients SHOULD include:

-## 1. Holistic Assessment (Experimental)
+{{{POST /v1/analyze
+Idempotency-Key: {client-generated-uuid}
+}}}

-**Main Thesis:** {main_thesis}
+OR use the client.request_id field:

-**Overall Verdict:** {overall_verdict}
+{{{{
+  "input_url": "...",
+  "client": {
+    "request_id": "client-uuid-12345",
+    "source_label": "optional"
+  }
+}
+}}}

-**Logic Quality Score:** {logic_quality_score}/1.0
+**Server Behavior:**

-**Fallacies Detected:** {fallacies_detected}
+* If Idempotency-Key or request_id seen before (within 24 hours):
+** Return existing job (200 OK, not 202 Accepted)
+** Do NOT create duplicate job or charge twice
+* Idempotency keys expire after 24 hours (matches job retention)

-**Reasoning:** {verdict_reasoning}
+**Example Response (Idempotent):**

----
+{{{{
+  "job_id": "01J...ULID",
+  "status": "RUNNING",
+  "idempotent": true,
+  "original_request_at": "2025-12-24T10:31:00Z",
+  "message": "Returning existing job (idempotency key matched)"
+}
+}}}

-## 2. Key Claims Analysis
+==== Request Body: ====

-### [C1] {claim_text}
-* **Role:** {is_central_to_thesis ? "Central to thesis" : "Supporting claim"}
-* **Risk Tier:** {risk_tier} ({risk_tier_justification})
-* **Evaluability:** {evaluability}
+{{{{
+  "input_type": "url",
+  "input_url": "https://example.com/medical-report-01",
+  "input_text": null,
+  "options": {
+    "browsing": "on",
+    "depth": "standard",
+    "max_claims": 5,
+    "scenarios_per_claim": 2,
+    "max_evidence_per_scenario": 6,
+    "context_aware_analysis": true
+  },
+  "client": {
+    "request_id": "optional-client-tracking-id",
+    "source_label": "optional"
+  }
+}
+}}}

-**Scenarios Explored:** {scenarios.length}
+**Options:**

-#### Scenario: {scenario.context_title}
-* **Verdict:** {verdict.label} (Confidence: {verdict.confidence})
-* **Probability Range:** {verdict.probability_range[0]} - {verdict.probability_range[1]}
-* **Reasoning:** {verdict.reasoning}
+* browsing: on | off (retrieve web sources or just output queries)
+* depth: standard | deep (evidence thoroughness)
+* max_claims: 1-10 (default: **5** for cost control)
+* scenarios_per_claim: 1-5 (default: **2** for cost control)
+* max_evidence_per_scenario: 3-10 (default: **6**)
+* context_aware_analysis: true | false (experimental)

-**Evidence:**
-* Supporting: {evidence.filter(e => e.stance == "supports").length} sources
-* Undermining: {evidence.filter(e => e.stance == "undermines").length} sources
-* Mixed: {evidence.filter(e => e.stance == "mixed").length} sources
+**Response:** 202 Accepted

-**Key Evidence:**
-* [{evidence[0].citation.title}]({evidence[0].citation.url}) - {evidence[0].stance}
+{{{{
+  "job_id": "01J...ULID",
+  "status": "QUEUED",
+  "created_at": "2025-12-24T10:31:00Z",
+  "estimated_cost": 0.114,
+  "cost_breakdown": {
+    "stage1_extraction": 0.003,
+    "stage2_new_claims": 0.081,
+    "stage2_cached_claims": 0.000,
+    "stage3_holistic": 0.030
+  },
+  "cache_info": {
+    "claims_to_extract": 5,
+    "estimated_cache_hits": 4,
+    "estimated_new_claims": 1
+  },
+  "links": {
+    "self": "/v1/jobs/01J...ULID",
+    "result": "/v1/jobs/01J...ULID/result",
+    "report": "/v1/jobs/01J...ULID/report",
+    "events": "/v1/jobs/01J...ULID/events"
+  }
+}
+}}}

----
+**Error Responses:**

-## 3. Quality Assessment
+402 Payment Required - Free tier limit reached, cache-only mode

-**Quality Gates:**
-* Gate 1 (Claim Validation): {gate1_claim_validation}
-* Gate 4 (Verdict Confidence): {gate4_verdict_confidence}
-* Overall: {passed_all ? "PASS" : "FAIL"}
+{{{{
+  "error": "credit_limit_reached",
+  "message": "Monthly credit limit reached. Entering cache-only mode.",
+  "cache_only_mode": true,
+  "credit_remaining": 0.00,
+  "reset_date": "2025-02-01T00:00:00Z",
+  "action": "Resubmit with cache_preference=allow_partial for cached results"
+}
+}}}

-{if gate_fail_reasons.length > 0}
-**Failed Gates:**
-{gate_fail_reasons.map(r => `* ${r.gate}: ${r.explanation}`)}
-{/if}
+----

----
+== 4.
Data Schemas == 455 455 456 - ##4.Limitations&Disclaimers651 +=== 4.1 Stage 1 Output: ClaimExtraction === 457 457 458 -**System Limitations:** 459 -{limitations.map(l => `* ${l}`)} 653 +{{{{ 654 + "job_id": "01J...ULID", 655 + "stage": "stage1_extraction", 656 + "article_metadata": { 657 + "title": "Article title", 658 + "source_url": "https://example.com/article", 659 + "extracted_text_length": 5234, 660 + "language": "en" 661 + }, 662 + "claims": [ 663 + { 664 + "claim_id": "C1", 665 + "claim_text": "Original claim text from article", 666 + "canonical_claim": "Normalized, deduplicated phrasing", 667 + "claim_hash": "sha256:abc123...", 668 + "is_central_to_thesis": true, 669 + "claim_type": "causal", 670 + "evaluability": "evaluable", 671 + "risk_tier": "B", 672 + "domain": "public_health" 673 + } 674 + ], 675 + "article_thesis": "Main argument detected", 676 + "cost": 0.003 677 +} 678 +}}} 460 460 461 -**Important Notes:** 462 -* This analysis is AI-generated and experimental (POC1) 463 -* Context-aware article verdict is being tested for accuracy 464 -* Human review recommended for high-risk claims (Tier A) 465 -* Cost: ${estimated_cost_usd} | Tokens: {input_tokens + output_tokens} 680 +---- 466 466 467 - **Methodology:**FactHarbor uses Claude 3.5Sonnet to extract claims, generate scenarios, gather evidence (with mandatorycontradictionsearch), and assesslogicalcoherence between claimsand article thesis.682 +=== 4.5 Verdict Label Taxonomy === 468 468 469 - ---684 +FactHarbor uses **three distinct verdict taxonomies** depending on analysis level: 470 470 471 -*Generated by FactHarbor POC1-v0.3 | [About FactHarbor](https://factharbor.org)* 472 -{{/code}} 686 +==== 4.5.1 Scenario Verdict Labels (Stage 2) ==== 473 473 474 - **TargetReportSize:** 220-350 words (optimizedfor2-minuteread)688 +Used for individual scenario verdicts within a claim. 475 475 476 - ---690 +**Enum Values:** 477 477 478 -== 7. 
LLM Configuration (POC1) == 692 +* Highly Likely - Probability 0.85-1.0, high confidence 693 +* Likely - Probability 0.65-0.84, moderate-high confidence 694 +* Unclear - Probability 0.35-0.64, or low confidence 695 +* Unlikely - Probability 0.16-0.34, moderate-high confidence 696 +* Highly Unlikely - Probability 0.0-0.15, high confidence 697 +* Unsubstantiated - Insufficient evidence to determine probability 479 479 480 -|=Parameter|=Value|=Notes 481 -|**Provider**|Anthropic|Primary provider for POC1 482 -|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|Current production model 483 -|**Future Model**|{{code}}claude-sonnet-4-20250514{{/code}}|When available (architecture supports) 484 -|**Token Budget**|50K-80K per analysis|Input + output combined (varies by article length) 485 -|**Estimated Cost**|$0.10-0.30 per article|Based on Sonnet 3.5 pricing ($3/M input, $15/M output) 486 -|**Prompt Strategy**|Single-pass per stage|Not multi-turn; structured JSON output with schema validation 487 -|**Chain-of-Thought**|Yes|For verdict reasoning and holistic assessment 488 -|**Few-Shot Examples**|Yes|For claim extraction and scenario generation 699 +==== 4.5.2 Claim Verdict Labels (Rollup) ==== 489 489 490 - ===7.1 TokenBudgetsbyStage===701 +Used when summarizing a claim across all scenarios. 

-|=Stage|=Approximate Output Tokens
-|Claim Extraction|~4,000 (10 claims × ~400 tokens)
-|Scenario Generation|~3,000 per claim (3 scenarios × ~1,000 tokens)
-|Evidence Synthesis|~2,000 per scenario
-|Verdict Generation|~1,000 per scenario
-|Holistic Assessment|~500 (context-aware summary)
+**Enum Values:**

-**Total:** 50K-80K tokens per article (input + output)
+* Supported - Majority of scenarios are Likely or Highly Likely
+* Refuted - Majority of scenarios are Unlikely or Highly Unlikely
+* Inconclusive - Mixed scenarios or majority Unclear/Unsubstantiated

-=== 7.2 API Integration ===
+**Mapping Logic:**

-**Anthropic Messages API:**
-* Endpoint: {{code}}https://api.anthropic.com/v1/messages{{/code}}
-* Authentication: API key via {{code}}x-api-key{{/code}} header
-* Model parameter: {{code}}"model": "claude-3-5-sonnet-20241022"{{/code}}
-* Max tokens: {{code}}"max_tokens": 4096{{/code}} (per stage)
+* If ≥60% scenarios are (Highly Likely | Likely) → Supported
+* If ≥60% scenarios are (Highly Unlikely | Unlikely) → Refuted
+* Otherwise → Inconclusive

-**No LangChain/LangGraph needed** for POC1 simplicity - direct SDK calls suffice.
+==== 4.5.3 Article Verdict Labels (Stage 3) ====

----
+Used for holistic article-level assessment.

-== 8. Cross-References (xWiki) ==
+**Enum Values:**

-This API specification implements requirements from:
+* WELL-SUPPORTED - Article thesis logically follows from supported claims
+* MISLEADING - Claims may be true but article commits logical fallacies
+* REFUTED - Central claims are refuted, invalidating thesis
+* UNCERTAIN - Insufficient evidence or highly mixed claim verdicts

-* **[[POC Requirements>>Test.FactHarbor.Specification.POC.Requirements]]**
-** FR-POC-1 through FR-POC-6 (POC1-specific functional requirements)
-** NFR-POC-1 through NFR-POC-3 (quality gates lite: Gates 1 & 4 only)
-** Section 2.1: Analysis Summary (Context-Aware) component specification
-** Section 10.3: Prompt structure for claim extraction and verdict synthesis
+**Note:** Article verdict considers **claim centrality** (central claims override supporting claims).

-* **[[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]**
-** Complete investigation of 7 approaches to article-level verdicts
-** Approach 1 (Single-Pass Holistic Analysis) chosen for POC1
-** Experimental feature testing plan (30 articles, ≥70% accuracy target)
-** Decision framework for POC2 implementation
+==== 4.5.4 API Field Mapping ====

-* **[[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]**
-** FR4 (Analysis Summary) - enhanced with context-aware capability
-** FR7 (Verdict Calculation) - probability ranges + confidence scores
-** NFR11 (Quality Gates) - POC1 implements Gates 1 & 4; Gates 2 & 3 in POC2
+|=Level|=API Field|=Enum Name
+|Scenario|scenarios[].verdict.label|scenario_verdict_label
+|Claim|claims[].rollup_verdict (optional)|claim_verdict_label
+|Article|article_holistic_assessment.overall_verdict|article_verdict_label

-* **[[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]]**
-** POC1 simplified architecture (stateless, single AKEL orchestration call)
-** Data persistence minimized (job outputs only, no database required)
-** Deferred complexity (no Elasticsearch, TimescaleDB, Federation until metrics justify)
+----

-* **[[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]]**
-** Evidence structure (source, stance, reliability rating)
-** Scenario boundaries (time, geography, population, conditions)
-** Claim types and evaluability taxonomy
-** Source Track Record System (Section 1.3) - temporal separation
+== 5. Cache Architecture ==

-* **[[Requirements Roadmap Matrix>>Test.FactHarbor.Roadmap.Requirements-Roadmap-Matrix.WebHome]]**
-** POC1 requirement mappings and phase assignments
-** Context-aware analysis as POC1 experimental feature
-** POC2 enhancement path (Gates 2 & 3, evidence deduplication)
+=== 5.1 Redis Cache Design ===

----
+**Technology:** Redis 7.0+ (in-memory key-value store)

-== 9. Implementation Notes (POC1) ==
+**Cache Key Schema:**

-=== 9.1 Recommended Tech Stack ===
+{{{claim:v1norm1:{language}:{sha256(canonical_claim)}
+}}}

-* **Framework:** Next.js 14+ with App Router (TypeScript) - Full-stack in one codebase
-* **Rationale:** API routes + React UI unified, Vercel deployment-ready, similar to C# in structure
-* **Storage:** Filesystem JSON files (no database needed for POC1)
-* **Queue:** In-memory queue or Redis (optional for concurrency)
-* **URL Extraction:** Jina AI Reader API (primary), trafilatura (fallback)
-* **Deployment:** Vercel, AWS Lambda, or similar serverless
+**Example:**

-=== 9.2 POC1 Simplifications ===
+{{{Claim (English): "COVID vaccines are 95% effective"
+Canonical: "covid vaccines are 95 percent effective"
+Language: "en"
+SHA256: abc123...def456
+Key: claim:v1norm1:en:abc123...def456
+}}}

-* **No database required:** Job metadata + outputs stored as JSON files ({{code}}jobs/{job_id}.json{{/code}}, {{code}}results/{job_id}.json{{/code}})
-* **No user authentication:** Optional API key validation only (env var: {{code}}FACTHARBOR_API_KEY{{/code}})
-* **Single-instance deployment:** No distributed processing, no worker pools
-* **Synchronous LLM calls:** No streaming in POC1 (entire response before returning)
-* **Job retention:** 24 hours default (configurable: {{code}}JOB_RETENTION_HOURS{{/code}})
-* **Rate limiting:** Simple IP-based (optional) - no complex billing
+**Rationale:** Prevents cross-language collisions and enables per-language cache analytics.

-=== 9.3 Estimated Costs (Per Analysis) ===
+**Data Structure:**

-**LLM API costs (Claude 3.5 Sonnet):**
-* Input: $3.00 per million tokens
-* Output: $15.00 per million tokens
-* **Per article:** $0.10-0.30 (varies by length, 5-10 claims typical)
+{{{SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}'
+EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days
+}}}

-**Web search costs (optional):**
-* Using external search API (Tavily, Brave): $0.01-0.05 per analysis
-* POC1 can use free search APIs initially
+----

-**Infrastructure costs:**
-* Vercel hobby tier: Free for POC
-* AWS Lambda: ~$0.001 per request
-* **Total infra:** <$0.01 per analysis
+=== 5.1.1 Canonical Claim Normalization (v1) ===

-**Total estimated cost:** ~$0.15-0.35 per analysis ✅ Meets <$0.35 target
+The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly.
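
As a rough illustration of the key schema from Section 5.1, the sketch below assembles a cache key from a claim that has already been normalized. The helper name `build_cache_key`, the UTF-8 encoding, and the lowercase hex digest are assumptions made for this example; the specification only states `sha256(canonical_claim)`.

```python
import hashlib

# Namespace segment taken from the key schema in Section 5.1.
CANONICALIZER_VERSION = "v1norm1"


def build_cache_key(canonical_claim: str, language: str) -> str:
    """Hypothetical helper: build a claim cache key from a canonical claim.

    Assumes UTF-8 encoding and a lowercase hex digest; the specification
    does not state either choice explicitly.
    """
    digest = hashlib.sha256(canonical_claim.encode("utf-8")).hexdigest()
    return f"claim:{CANONICALIZER_VERSION}:{language}:{digest}"


if __name__ == "__main__":
    print(build_cache_key("covid vaccines are 95 percent effective", "en"))
```

Because the language code is part of the key, the same canonical text stored under "en" and "de" yields two distinct entries, which is the cross-language separation the rationale in Section 5.1 describes.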

-=== 9.4 Estimated Timeline (AI-Assisted) ===
+**Algorithm: Canonical Claim Normalization v1**

-**With Cursor IDE + Claude API:**
-* Day 1-2: API scaffolding + job queue
-* Day 3-4: LLM integration + prompt engineering
-* Day 5-6: Evidence retrieval + contradiction search
-* Day 7: Report templates + testing with 30 articles
-* **Total:** 5-7 days for working POC1
+{{{def normalize_claim_v1(claim_text: str, language: str) -> str:
+    """
+    Normalizes claim to canonical form for cache key generation.
+    Version: v1norm1 (POC1)
+    """
+    import re
+    import unicodedata
+
+    # Step 1: Unicode normalization (NFC)
+    text = unicodedata.normalize('NFC', claim_text)
+
+    # Step 2: Lowercase
+    text = text.lower()
+
+    # Step 3: Expand percent signs
+    # (must run before punctuation removal, which would otherwise strip '%')
+    text = text.replace('%', ' percent')
+
+    # Step 4: Remove punctuation (except hyphens in words)
+    text = re.sub(r'[^\w\s-]', '', text)
+
+    # Step 5: Normalize whitespace (collapse multiple spaces)
+    text = re.sub(r'\s+', ' ', text).strip()
+
+    # Step 6: Spell out single-digit numbers
+    num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
+                   '4':'four', '5':'five', '6':'six', '7':'seven',
+                   '8':'eight', '9':'nine'}
+    for num, word in num_to_word.items():
+        text = re.sub(rf'\b{num}\b', word, text)
+
+    # Step 7: Common abbreviations (English only in v1)
+    if language == 'en':
+        text = text.replace('covid-19', 'covid')
+        text = text.replace('u.s.', 'us')
+        text = text.replace('u.k.', 'uk')
+
+    # Step 8: NO entity normalization in v1
+    # (Trump vs Donald Trump vs President Trump remain distinct)
+
+    return text

-**Manual coding (no AI assistance):**
-* Estimate: 15-20 days
+# Version identifier (include in cache namespace)
+CANONICALIZER_VERSION = "v1norm1"
+}}}

-=== 9.5 First Prompt for AI Code Generation ===
+**Cache Key Formula (Updated):**

-{{code}}
-Based on the FactHarbor POC1 API & Schemas Specification (v0.3), generate a Next.js 14 TypeScript application with:
+{{{language = "en"
+canonical = normalize_claim_v1(claim_text, language)
+cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"

-1. API routes implementing the 7 endpoints specified in Section 3
-2. AnalyzeRequest/AnalysisResult types matching schemas in Sections 4-5
-3. Anthropic Claude 3.5 Sonnet integration for:
-   - Claim extraction (with central/supporting marking)
-   - Scenario generation
-   - Evidence synthesis (with mandatory contradiction search)
-   - Verdict generation
-   - Holistic assessment (article-level credibility)
-4. Job-based async execution with progress tracking (7 pipeline stages)
-5. Quality Gates 1 & 4 from NFR11 implementation
-6. Mandatory contradiction search enforcement (Section 5)
-7. Context-aware analysis (experimental) as specified
-8. Filesystem-based job storage (no database)
-9. Markdown report generation from JSON templates (Section 6)
+Example:
+  claim: "COVID-19 vaccines are 95% effective"
+  canonical: "covid vaccines are 95 percent effective"
+  sha256: abc123...def456
+  key: "claim:v1norm1:en:abc123...def456"
+}}}

-Use the validation rules from Section 5 and error codes from Section 2.1.1.
-Target: <$0.35 per analysis, <2 minutes processing time.
-{{/code}}
+**Cache Metadata MUST Include:**

----
+{{{{
+  "canonical_claim": "covid vaccines are 95 percent effective",
+  "canonicalizer_version": "v1norm1",
+  "language": "en",
+  "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
+}
+}}}

-== 10. Testing Strategy (POC1) ==
+**Version Upgrade Path:**

-=== 10.1 Test Dataset (30 Articles) ===
+* v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
+* v1normN → v2norm1: Major version bump, invalidate all v1 caches

-**Category 1: Straightforward Factual (10 articles)**
-* Purpose: Baseline accuracy
-* Example: "WHO report on global vaccination rates"
-* Expected: High claim accuracy, straightforward verdict
+----

-**Category 2: Accurate Claims, Questionable Conclusions (10 articles)** ⭐ **Context-Aware Test**
-* Purpose: Test holistic assessment capability
-* Example: "Coffee cures cancer" (true premises, false conclusion)
-* Expected: Individual claims TRUE, article verdict MISLEADING
+=== 5.1.2 Copyright & Data Retention Policy ===

-**Category 3: Mixed Accuracy (5 articles)**
-* Purpose: Test nuance handling
-* Example: Articles with some true, some false claims
-* Expected: Scenario-level differentiation
+**Evidence Excerpt Storage:**

-**Category 4: Low-Quality Claims (5 articles)**
-* Purpose: Test quality gates
-* Example: Opinion pieces, compound claims
-* Expected: Gate 1 failures, rejection or draft-only mode
+To comply with copyright law and fair use principles:

-=== 10.2 Success Metrics ===
+**What We Store:**

-**Quality Metrics:**
-* Hallucination rate: <5% (target: <3%)
-* Context-aware accuracy: ≥70% (experimental - key POC1 goal)
-* False positive rate: <15%
-* Mandatory contradiction search: 100% compliance
+* **Metadata only:** Title, author, publisher, URL, publication date
+* **Short excerpts:** Max 25 words per quote, max 3 quotes per evidence item
+* **Summaries:** AI-generated bullet points (not verbatim text)
+* **No full articles:** Never store complete article text beyond job processing

-**Performance Metrics:**
-* Processing time: <2 minutes per article (standard depth)
-* Cost per analysis: <$0.35
-* API uptime: >99%
-* LLM API error rate: <1%
+**Total per Cached Claim:**

-**See:** [[POC1 Roadmap>>Test.FactHarbor.Roadmap.POC1.WebHome]] Section 11 for complete success criteria and testing methodology.
+* Scenarios: 2 per claim
+* Evidence items: 6 per scenario (12 total)
+* Quotes: 3 per evidence × 25 words = 75 words per item
+* **Maximum stored verbatim text:** ~~900 words per claim (12 × 75)

----
+**Retention:**

-**End of Specification - FactHarbor POC1 API v0.3**
+* Cache TTL: 90 days
+* Job outputs: 24 hours (then archived or deleted)
+* No persistent full-text article storage

-**Ready for xWiki import and AI-assisted implementation!** 🚀
+**Rationale:**

+* Short excerpts for citation = fair use
+* Summaries are transformative (not copyrightable)
+* Limited retention (90 days max)
+* No commercial republication of excerpts
+
+**DMCA Compliance:**
+
+* Cache invalidation endpoint available for rights holders
+* Contact: dmca@factharbor.org
+
+----
+
+== Summary ==
+
+This WYSIWYG preview shows the **structure and key sections** of the 1,515-line API specification.
+
+**Full specification includes:**
+
+* Complete API endpoints (7 total)
+* All data schemas (ClaimExtraction, ClaimAnalysis, HolisticAssessment, Complete)
+* Quality gates & validation rules
+* LLM configuration for all 3 stages
+* Implementation notes with code samples
+* Testing strategy
+* Cross-references to other pages
+
+**The complete specification is available in:**
+
+* FactHarbor_POC1_API_and_Schemas_Spec_v0_4_1_PATCHED.md (45 KB standalone)
+* Export files (TEST/PRODUCTION) for xWiki import
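
The claim-level mapping logic in Section 4.5.2 could be implemented along the following lines. This is a sketch rather than part of the specification: the function name `rollup_claim_verdict`, the treatment of an empty scenario list, and the use of integer arithmetic for the 60% threshold are all assumptions made for illustration.

```python
from typing import Iterable

# Scenario labels from Section 4.5.1, grouped as in the Section 4.5.2 mapping.
POSITIVE_LABELS = {"Highly Likely", "Likely"}
NEGATIVE_LABELS = {"Highly Unlikely", "Unlikely"}


def rollup_claim_verdict(scenario_labels: Iterable[str]) -> str:
    """Hypothetical helper: roll scenario verdict labels up to a claim verdict."""
    labels = list(scenario_labels)
    total = len(labels)
    if total == 0:
        # Assumption: a claim with no scenarios is treated as Inconclusive.
        return "Inconclusive"

    positive = sum(label in POSITIVE_LABELS for label in labels)
    negative = sum(label in NEGATIVE_LABELS for label in labels)

    # ">= 60%" written as integer math (count / total >= 3 / 5) to avoid
    # floating-point edge cases exactly at the boundary.
    if positive * 5 >= total * 3:
        return "Supported"
    if negative * 5 >= total * 3:
        return "Refuted"
    return "Inconclusive"
```

With the default of 2 scenarios per claim, one agreeing scenario is only 50%, so under this reading both scenarios must point the same way before a claim rolls up to Supported or Refuted.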