Changes for page POC1 API & Schemas Specification
Last modified by Robert Schaub on 2025/12/24 18:26
From version 2.2
edited by Robert Schaub
on 2025/12/24 16:28
Change comment:
There is no comment for this version
Summary
Page properties (2 modified, 0 added, 0 removed)
Details
Page properties
Title
@@ -1,1 +1,1 @@
-POC1 API & Schemas Specification
+POC1 API & Schemas Specification v0.4.1
Content
@@ -1,25 +1,44 @@
-=POC1 API & Schemas Specification=
+# FactHarbor POC1 — API & Schemas Specification

-----
+**Version:** 0.4.1 (POC1 - 3-Stage Caching Architecture)
+**Namespace:** FactHarbor.*
+**Syntax:** xWiki 2.1
+**Last Updated:** 2025-12-24

+---
+
 == Version History ==

 |=Version|=Date|=Changes
 |0.4.1|2025-12-24|Applied 9 critical fixes: file format notice, verdict taxonomy, canonicalization algorithm, Stage 1 cost policy, BullMQ fix, language in cache key, historical claims TTL, idempotency, copyright policy
 |0.4|2025-12-24|**BREAKING:** 3-stage pipeline with claim-level caching, user tier system, cache-only mode for free users, Redis cache architecture
-|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints
-|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details
+|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints, chain-of-thought, evidence citation, Jina safety, gate numbering
+|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details, quality gate logging, temporal separation note, cross-references
+|0.2|2025-12-24|Initial rebased version with holistic assessment
+|0.1|2025-12-24|Original specification

-----
+---
+---

-== 1. Core Objective (POC1) ==
+== File Format Notice ==

-The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability.
+**⚠️ Important:** This file is stored as {{code}}.md{{/code}} for transport/versioning, but the content is **xWiki 2.1 syntax** (not Markdown).

-The system must prove that AI can identify an article's **Main Thesis** and determine if supporting claims logically support that thesis without committing fallacies.
+**When importing to xWiki:**
+* Use "Import as XWiki content" (not "Import as Markdown")
+* The xWiki parser will correctly interpret {{code}}=={{/code}} headers, code macro blocks, etc.

-=== Success Criteria: ===
+**Alternate naming:** If your workflow supports it, rename to {{code}}.xwiki.txt{{/code}} to avoid ambiguity.

+---
+
+== 1. Core Objective (POC1) ==
+
+The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability.
+
+The system must prove that AI can identify an article's **Main Thesis** and determine if the supporting claims (even if individually accurate) logically support that thesis without committing fallacies (e.g., correlation vs. causation, cherry-picking, hasty generalization).
+
+**Success Criteria:**
 * Test with 30 diverse articles
 * Target: ≥70% accuracy detecting misleading articles
 * Cost: <$0.25 per NEW analysis (uncached)
@@ -27,13 +27,14 @@
 * Cache hit rate: ≥50% after 1,000 articles
 * Processing time: <2 minutes (standard depth)

-=== Economic Model: ===
+**Economic Model:**
+* Free tier: $10 credit per month (~40-140 articles depending on cache hits)
+* After limit: Cache-only mode (instant, free access to cached claims)
+* Paid tier: Unlimited new analyses

-* **Free tier:** $10 credit per month (~~40-140 articles depending on cache hits)
-* **After limit:** Cache-only mode (instant, free access to cached claims)
-* **Paid tier:** Unlimited new analyses
+**See:** [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]] for complete investigation of 7 approaches.

-----
+---

 == 2. Architecture Overview ==

@@ -41,7 +41,8 @@

 FactHarbor POC1 uses a **3-stage architecture** designed for claim-level caching and cost efficiency:

-{{{graph TD
+{{code language="mermaid"}}
+graph TD
 A[Article Input] --> B[Stage 1: Extract Claims]
 B --> C{For Each Claim}
 C --> D[Check Cache]
@@ -51,46 +51,41 @@
 G --> E
 E --> H[Stage 3: Holistic Assessment]
 H --> I[Final Report]
-}}}
+{{/code}}

-==== Stage 1: Claim Extraction (Haiku, no cache) ====
+**Stage 1: Claim Extraction** (Haiku, no cache)
+* Input: Article text
+* Output: 5 canonical claims (normalized, deduplicated)
+* Model: Claude Haiku 4
+* Cost: $0.003 per article
+* Cache strategy: No caching (article-specific)

-* **Input:** Article text
-* **Output:** 5 canonical claims (normalized, deduplicated)
-* **Model:** Claude Haiku 4
-* **Cost:** $0.003 per article
-* **Cache strategy:** No caching (article-specific)
+**Stage 2: Claim Analysis** (Sonnet, CACHED)
+* Input: Single canonical claim
+* Output: Scenarios + Evidence + Verdicts
+* Model: Claude Sonnet 3.5
+* Cost: $0.081 per NEW claim
+* Cache strategy: **Redis, 90-day TTL**
+* Cache key: {{code}}claim:v1norm1:{language}:{sha256(canonical_claim)}{{/code}}

-==== Stage 2: Claim Analysis (Sonnet, CACHED) ====
+**Stage 3: Holistic Assessment** (Sonnet, no cache)
+* Input: Article + Claim verdicts (from cache or Stage 2)
+* Output: Article verdict + Fallacies + Logic quality
+* Model: Claude Sonnet 3.5
+* Cost: $0.030 per article
+* Cache strategy: No caching (article-specific)

-* **Input:** Single canonical claim
-* **Output:** Scenarios + Evidence + Verdicts
-* **Model:** Claude Sonnet 3.5
-* **Cost:** $0.081 per NEW claim
-* **Cache strategy:** Redis, 90-day TTL
-* **Cache key:** claim:v1norm1:{language}:{sha256(canonical_claim)}
+**Total Cost Formula:**
+{{code}}
+Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)

-==== Stage 3: Holistic Assessment (Sonnet, no cache) ====
-
-* **Input:** Article + Claim verdicts (from cache or Stage 2)
-* **Output:** Article verdict + Fallacies + Logic quality
-* **Model:** Claude Sonnet 3.5
-* **Cost:** $0.030 per article
-* **Cache strategy:** No caching (article-specific)
-
-=== Total Cost Formula: ===
-
-{{{Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)
-
 Examples:
 - 0 new claims (100% cache hit): $0.033
 - 1 new claim (80% cache hit): $0.114
 - 3 new claims (40% cache hit): $0.276
 - 5 new claims (0% cache hit): $0.438
-}}}
+{{/code}}

-----
-
 === 2.2 User Tier System ===

 |=Tier|=Monthly Credit|=After Limit|=Cache Access|=Analytics
@@ -99,21 +99,17 @@
 |**Enterprise** (future)|Custom|Continues|✅ Full + Priority|Full

 **Free Tier Economics:**
-
 * $10 credit = 40-140 articles analyzed (depending on cache hit rate)
 * Average 70 articles/month at 70% cache hit rate
-* After limit: Cache-only mode
+* After limit: Cache-only mode (see Section 2.3)

-----
-
 === 2.3 Cache-Only Mode (Free Tier Feature) ===

 When free users reach their $10 monthly limit, they enter **Cache-Only Mode**:

-==== What Cache-Only Mode Provides: ====
+**What Cache-Only Mode Provides:**

 ✅ **Claim Extraction (Platform-Funded):**
-
 * Stage 1 extraction runs at $0.003 per article
 * **Cost: Absorbed by platform** (not charged to user credit)
 * Rationale: Extraction is necessary to check cache, and cost is negligible
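Reviewer note: the Total Cost Formula in the hunk above can be checked mechanically. The following is a minimal sketch and not part of the specification; the function name `estimate_job_cost` and the constant names are hypothetical, while the three prices are the per-stage costs quoted in the stage descriptions.

```python
# Per-stage prices quoted in the spec (USD). The constant names are illustrative only.
STAGE1_EXTRACTION_USD = 0.003      # Stage 1, charged once per article
STAGE2_PER_NEW_CLAIM_USD = 0.081   # Stage 2, charged once per uncached claim
STAGE3_HOLISTIC_USD = 0.030        # Stage 3, charged once per article


def estimate_job_cost(total_claims: int, cached_claims: int) -> float:
    """Estimate the cost of one analysis job from its cache hits.

    Implements: Cost = extraction + (N_new_claims x per-claim price) + holistic.
    """
    if not 0 <= cached_claims <= total_claims:
        raise ValueError("cached_claims must be between 0 and total_claims")
    new_claims = total_claims - cached_claims
    cost = (STAGE1_EXTRACTION_USD
            + new_claims * STAGE2_PER_NEW_CLAIM_USD
            + STAGE3_HOLISTIC_USD)
    # Round to tenths of a cent to hide floating-point noise.
    return round(cost, 3)
```

With five extracted claims, the sketch reproduces the four documented examples: 5, 4, 2, and 0 cached claims give $0.033, $0.114, $0.276, and $0.438 respectively.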
@@ -120,31 +120,27 @@
 * Rate limit: Max 50 extractions/day in cache-only mode (prevents abuse)

 ✅ **Instant Access to Cached Claims:**
-
 * Any claim that exists in cache → Full verdict returned
 * Cost: $0 (no LLM calls)
 * Response time: <100ms

 ✅ **Partial Article Analysis:**
-
 * Check each claim against cache
 * Return verdicts for ALL cached claims
-* For uncached claims: Return "status": "cache_miss"
+* For uncached claims: Return {{code}}"status": "cache_miss"{{/code}}

 ✅ **Cache Coverage Report:**
-
 * "3 of 5 claims available in cache (60% coverage)"
 * Links to cached analyses
 * Estimated cost to complete: $0.162 (2 new claims)

 ❌ **Not Available in Cache-Only Mode:**
-
 * New claim analysis (Stage 2 LLM calls blocked)
 * Full holistic assessment (Stage 3 blocked if any claims missing)

-==== User Experience Example: ====
-
-{{{{
+**User Experience:**
+{{code language="json"}}
+{
   "status": "cache_only_mode",
   "message": "Monthly credit limit reached. Showing cached results only.",
   "cache_coverage": {
@@ -167,26 +167,26 @@
     "pro_tier": "$50/month unlimited"
   }
 }
-}}}
+{{/code}}

 **Design Rationale:**
-
 * Free users still get value (cached claims often answer their question)
 * Demonstrates FactHarbor's value (partial results encourage upgrade)
 * Sustainable for platform (no additional cost)
 * Fair to all users (everyone contributes to cache)

-----
+---

 == 3. REST API Contract ==

 === 3.1 User Credit Tracking ===

-**Endpoint:** GET /v1/user/credit
+**Endpoint:** {{code}}GET /v1/user/credit{{/code}}

-**Response:** 200 OK
+**Response:** {{code}}200 OK{{/code}}

-{{{{
+{{code language="json"}}
+{
   "user_id": "user_abc123",
   "tier": "free",
   "credit_limit": 10.00,
@@ -201,25 +201,30 @@
     "cache_hit_rate": 0.626
   }
 }
-}}}
+{{/code}}

-----
+---

 === 3.2 Create Analysis Job (3-Stage) ===

-**Endpoint:** POST /v1/analyze
+**Endpoint:** {{code}}POST /v1/analyze{{/code}}

-==== Idempotency Support: ====
+**Request Body:**

+
+**Idempotency Support:**
+
 To prevent duplicate job creation on network retries, clients SHOULD include:

-{{{POST /v1/analyze
+{{code language="http"}}
+POST /v1/analyze
 Idempotency-Key: {client-generated-uuid}
-}}}
+{{/code}}

-OR use the client.request_id field:
+OR use the {{code}}client.request_id{{/code}} field:

-{{{{
+{{code language="json"}}
+{
   "input_url": "...",
   "client": {
     "request_id": "client-uuid-12345",
@@ -226,18 +226,17 @@
     "source_label": "optional"
   }
 }
-}}}
+{{/code}}

 **Server Behavior:**
-
-* If Idempotency-Key or request_id seen before (within 24 hours):
-** Return existing job (200 OK, not 202 Accepted)
-** Do NOT create duplicate job or charge twice
+* If {{code}}Idempotency-Key{{/code}} or {{code}}request_id{{/code}} seen before (within 24 hours):
+  - Return existing job ({{code}}200 OK{{/code}}, not {{code}}202 Accepted{{/code}})
+  - Do NOT create duplicate job or charge twice
 * Idempotency keys expire after 24 hours (matches job retention)

 **Example Response (Idempotent):**
-
-{{{{
+{{code language="json"}}
+{
   "job_id": "01J...ULID",
   "status": "RUNNING",
   "idempotent": true,
@@ -244,11 +244,11 @@
   "original_request_at": "2025-12-24T10:31:00Z",
   "message": "Returning existing job (idempotency key matched)"
 }
-}}}
+{{/code}}

-==== Request Body: ====

-{{{{
+{{code language="json"}}
+{
   "input_type": "url",
   "input_url": "https://example.com/medical-report-01",
   "input_text": null,
@@ -256,9 +256,8 @@
     "browsing": "on",
     "depth": "standard",
     "max_claims": 5,
-    "scenarios_per_claim": 2,
-    "max_evidence_per_scenario": 6,
-    "context_aware_analysis": true
+    "context_aware_analysis": true,
+    "cache_preference": "prefer_cache"
   },
   "client": {
     "request_id": "optional-client-tracking-id",
@@ -265,20 +265,18 @@
     "source_label": "optional"
   }
 }
-}}}
+{{/code}}

 **Options:**
+* {{code}}cache_preference{{/code}}: {{code}}prefer_cache{{/code}} | {{code}}require_fresh{{/code}} | {{code}}allow_partial{{/code}}
+  - {{code}}prefer_cache{{/code}}: Use cache when available, analyze new claims (default)
+  - {{code}}require_fresh{{/code}}: Force re-analysis of all claims (ignores cache, costs more)
+  - {{code}}allow_partial{{/code}}: Return partial results if some claims uncached (for free tier cache-only mode)

-* browsing: on | off (retrieve web sources or just output queries)
-* depth: standard | deep (evidence thoroughness)
-* max_claims: 1-10 (default: **5** for cost control)
-* scenarios_per_claim: 1-5 (default: **2** for cost control)
-* max_evidence_per_scenario: 3-10 (default: **6**)
-* context_aware_analysis: true | false (experimental)
+**Response:** {{code}}202 Accepted{{/code}}

-**Response:** 202 Accepted
-
-{{{{
+{{code language="json"}}
+{
   "job_id": "01J...ULID",
   "status": "QUEUED",
   "created_at": "2025-12-24T10:31:00Z",
@@ -301,13 +301,13 @@
     "events": "/v1/jobs/01J...ULID/events"
   }
 }
-}}}
+{{/code}}

 **Error Responses:**

-402 Payment Required - Free tier limit reached, cache-only mode
-
-{{{{
+{{code}}402 Payment Required{{/code}} - Free tier limit reached, cache-only mode
+{{code language="json"}}
+{
   "error": "credit_limit_reached",
   "message": "Monthly credit limit reached. Entering cache-only mode.",
   "cache_only_mode": true,
@@ -315,15 +315,199 @@
   "reset_date": "2025-02-01T00:00:00Z",
   "action": "Resubmit with cache_preference=allow_partial for cached results"
 }
-}}}
+{{/code}}

-----
+---

+=== 3.3 Get Job Status ===
+
+**Endpoint:** {{code}}GET /v1/jobs/{job_id}{{/code}}
+
+**Response:** {{code}}200 OK{{/code}}
+
+{{code language="json"}}
+{
+  "job_id": "01J...ULID",
+  "status": "RUNNING",
+  "created_at": "2025-12-24T10:31:00Z",
+  "updated_at": "2025-12-24T10:31:22Z",
+  "progress": {
+    "stage": "stage2_claim_analysis",
+    "percent": 65,
+    "message": "Analyzing claim 3 of 5 (2 from cache)",
+    "current_claim_id": "C3",
+    "cache_hits": 2,
+    "cache_misses": 1
+  },
+  "actual_cost": 0.084,
+  "cost_breakdown": {
+    "stage1_extraction": 0.003,
+    "stage2_new_claims": 0.081,
+    "stage2_cached_claims": 0.000,
+    "stage3_holistic": null
+  },
+  "input_echo": {
+    "input_type": "url",
+    "input_url": "https://example.com/medical-report-01"
+  },
+  "links": {
+    "self": "/v1/jobs/01J...ULID",
+    "result": "/v1/jobs/01J...ULID/result",
+    "report": "/v1/jobs/01J...ULID/report"
+  },
+  "error": null
+}
+{{/code}}
+
+---
+
+=== 3.4 Get Analysis Result ===
+
+**Endpoint:** {{code}}GET /v1/jobs/{job_id}/result{{/code}}
+
+**Response:** {{code}}200 OK{{/code}}
+
+Returns complete **AnalysisResult** schema (see Section 4).
+
+**Cache-Only Mode Response:** {{code}}206 Partial Content{{/code}}
+
+{{code language="json"}}
+{
+  "cache_only_mode": true,
+  "cache_coverage": {
+    "claims_total": 5,
+    "claims_cached": 3,
+    "claims_missing": 2,
+    "coverage_percent": 60
+  },
+  "partial_result": {
+    "metadata": {
+      "job_id": "01J...ULID",
+      "timestamp_utc": "2025-12-24T10:31:30Z",
+      "engine_version": "POC1-v0.4",
+      "cache_only": true
+    },
+    "claims": [
+      {
+        "claim_id": "C1",
+        "claim_text": "...",
+        "canonical_claim": "...",
+        "source": "cache",
+        "cached_at": "2025-12-20T15:30:00Z",
+        "cache_hit_count": 47,
+        "scenarios": [...]
+      },
+      {
+        "claim_id": "C3",
+        "claim_text": "...",
+        "canonical_claim": "...",
+        "source": "not_analyzed",
+        "status": "cache_miss",
+        "estimated_cost": 0.081
+      }
+    ],
+    "article_holistic_assessment": null,
+    "upgrade_prompt": {
+      "message": "Upgrade to Pro for full analysis of all claims",
+      "missing_claims": 2,
+      "cost_to_complete": 0.162
+    }
+  }
+}
+{{/code}}
+
+**Other Responses:**
+* {{code}}409 Conflict{{/code}} - Job not finished yet
+* {{code}}404 Not Found{{/code}} - Job ID unknown
+
+---
+
+=== 3.5 Stage-Specific Endpoints (Optional, Advanced) ===
+
+For direct stage access (useful for cache debugging, custom workflows):
+
+**Extract Claims Only:**
+{{code}}POST /v1/analyze/extract-claims{{/code}}
+
+**Analyze Single Claim:**
+{{code}}POST /v1/analyze/claim{{/code}}
+
+**Assess Article (with claim verdicts):**
+{{code}}POST /v1/analyze/assess-article{{/code}}
+
+**Check Claim Cache:**
+{{code}}GET /v1/cache/claim/{claim_hash}{{/code}}
+
+**Cache Statistics:**
+{{code}}GET /v1/cache/stats{{/code}}
+
+---
+
+=== 3.6 Download Markdown Report ===
+
+**Endpoint:** {{code}}GET /v1/jobs/{job_id}/report{{/code}}
+
+**Response:** {{code}}200 OK{{/code}} with {{code}}text/markdown; charset=utf-8{{/code}} content
+
+**Headers:**
+* {{code}}Content-Disposition: attachment; filename="factharbor_poc1_{job_id}.md"{{/code}}
+
+**Cache-Only Mode:** Report includes "Partial Analysis" watermark and upgrade prompt.
+
+---
+
+=== 3.7 Stream Job Events (Backend Progress) ===
+
+**Endpoint:** {{code}}GET /v1/jobs/{job_id}/events{{/code}}
+
+**Response:** Server-Sent Events (SSE) stream
+
+**Event Types:**
+* {{code}}progress{{/code}} - Backend progress (e.g., "Stage 1: Extracting claims")
+* {{code}}cache_hit{{/code}} - Claim found in cache
+* {{code}}cache_miss{{/code}} - Claim requires new analysis
+* {{code}}stage_complete{{/code}} - Stage 1/2/3 finished
+* {{code}}complete{{/code}} - Job finished
+* {{code}}error{{/code}} - Error occurred
+* {{code}}credit_warning{{/code}} - User approaching limit
+
+---
+
+=== 3.8 Cancel Job ===
+
+**Endpoint:** {{code}}DELETE /v1/jobs/{job_id}{{/code}}
+
+**Note:** If job is mid-stage (e.g., analyzing claim 3 of 5), user is charged for completed work only.
+
+---
+
+=== 3.9 Health Check ===
+
+**Endpoint:** {{code}}GET /v1/health{{/code}}
+
+{{code language="json"}}
+{
+  "status": "ok",
+  "version": "POC1-v0.4",
+  "model_stage1": "claude-haiku-4",
+  "model_stage2": "claude-3-5-sonnet-20241022",
+  "model_stage3": "claude-3-5-sonnet-20241022",
+  "cache": {
+    "status": "connected",
+    "total_claims": 12847,
+    "avg_hit_rate_24h": 0.73
+  }
+}
+{{/code}}
+
+---
+
 == 4. Data Schemas ==

 === 4.1 Stage 1 Output: ClaimExtraction ===

-{{{{
+{{code language="json"}}
+{
   "job_id": "01J...ULID",
   "stage": "stage1_extraction",
   "article_metadata": {
@@ -348,10 +348,219 @@
   "article_thesis": "Main argument detected",
   "cost": 0.003
 }
-}}}
+{{/code}}

-----
+=== 4.2 Stage 2 Output: ClaimAnalysis (CACHED) ===

+This is the CACHEABLE unit. Stored in Redis with 90-day TTL.
+
+{{code language="json"}}
+{
+  "claim_hash": "sha256:abc123...",
+  "canonical_claim": "COVID vaccines are 95% effective",
+  "language": "en",
+  "domain": "public_health",
+  "analysis_version": "v1.0",
+  "scenarios": [
+    {
+      "scenario_id": "S1",
+      "scenario_title": "mRNA vaccines (Pfizer/Moderna) in clinical trials",
+      "definitions": {"95% effective": "95% reduction in symptomatic infection"},
+      "assumptions": ["Based on phase 3 trial data", "Against original strain"],
+      "boundaries": {
+        "time": "2020-2021 trials",
+        "geography": "Multi-country trials",
+        "population": "Adult population (16+)",
+        "conditions": "Before widespread variants"
+      },
+      "verdict": {
+        "label": "Highly Likely",
+        "probability_range": [0.88, 0.97],
+        "confidence": 0.92,
+        "reasoning_chain": [
+          "Pfizer/BioNTech trial: 95% efficacy (n=43,548)",
+          "Moderna trial: 94.1% efficacy (n=30,420)",
+          "Peer-reviewed publications in NEJM",
+          "FDA independent analysis confirmed"
+        ],
+        "key_supporting_evidence_ids": ["E1", "E2"],
+        "key_counter_evidence_ids": ["E3"],
+        "uncertainty_factors": [
+          "Limited data on long-term effectiveness",
+          "Variant-specific performance not yet measured"
+        ]
+      },
+      "evidence": [
+        {
+          "evidence_id": "E1",
+          "stance": "supports",
+          "relevance_to_scenario": 0.98,
+          "evidence_summary": [
+            "Pfizer trial showed 162 cases in placebo vs 8 in vaccine group",
+            "Follow-up period median 2 months post-dose 2",
+            "Efficacy consistent across age, sex, race, ethnicity"
+          ],
+          "citation": {
+            "title": "Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine",
+            "author_or_org": "Polack et al.",
+            "publication_date": "2020-12-31",
+            "url": "https://nejm.org/doi/full/10.1056/NEJMoa2034577",
+            "publisher": "New England Journal of Medicine",
+            "retrieved_at_utc": "2025-12-20T15:30:00Z"
+          },
+          "excerpt": ["The vaccine was 95% effective in preventing Covid-19"],
+          "excerpt_word_count": 9,
+          "source_reliability_score": 0.95,
+          "reliability_justification": "Peer-reviewed, high-impact journal, large RCT",
+          "limitations_and_reservations": [
+            "Short follow-up period (2 months)",
+            "Primarily measures symptomatic infection, not transmission"
+          ],
+          "retraction_or_dispute_signal": "none"
+        }
+      ]
+    }
+  ],
+  "cache_metadata": {
+    "first_analyzed": "2025-12-01T10:00:00Z",
+    "last_updated": "2025-12-20T15:30:00Z",
+    "hit_count": 47,
+    "version": "v1.0",
+    "ttl_expires": "2026-03-20T15:30:00Z"
+  },
+  "cost": 0.081
+}
+{{/code}}
+
+**Cache Key Structure:**
+{{code}}
+Redis Key: claim:v1norm1:{language}:{sha256(canonical_claim)}
+TTL: 90 days (7,776,000 seconds)
+Size: ~15KB JSON (compressed: ~5KB)
+{{/code}}
+
+=== 4.3 Stage 3 Output: HolisticAssessment ===
+
+{{code language="json"}}
+{
+  "job_id": "01J...ULID",
+  "stage": "stage3_holistic",
+  "article_metadata": {
+    "title": "...",
+    "main_thesis": "...",
+    "source_url": "..."
+  },
+  "article_holistic_assessment": {
+    "overall_verdict": "MISLEADING",
+    "logic_quality_score": 0.42,
+    "fallacies_detected": [
+      "correlation-causation",
+      "cherry-picking"
+    ],
+    "verdict_reasoning": [
+      "Central claim C1 is REFUTED by multiple systematic reviews",
+      "Supporting claims C2-C4 are TRUE but do not support the thesis",
+      "Article commits correlation-causation fallacy",
+      "Selective citation of evidence (cherry-picking detected)"
+    ],
+    "experimental_feature": true
+  },
+  "claims_summary": [
+    {
+      "claim_id": "C1",
+      "is_central_to_thesis": true,
+      "verdict": "Refuted",
+      "confidence": 0.89,
+      "source": "cache",
+      "cache_hit": true
+    },
+    {
+      "claim_id": "C2",
+      "is_central_to_thesis": false,
+      "verdict": "Highly Likely",
+      "confidence": 0.91,
+      "source": "new_analysis",
+      "cache_hit": false
+    }
+  ],
+  "quality_gates": {
+    "gate1_claim_validation": "pass",
+    "gate4_verdict_confidence": "pass",
+    "passed_all": true
+  },
+  "cost": 0.030,
+  "total_job_cost": 0.114
+}
+{{/code}}
+
+=== 4.4 Complete AnalysisResult (All 3 Stages Combined) ===
+
+{{code language="json"}}
+{
+  "metadata": {
+    "job_id": "01J...ULID",
+    "timestamp_utc": "2025-12-24T10:31:30Z",
+    "engine_version": "POC1-v0.4",
+    "llm_stage1": "claude-haiku-4",
+    "llm_stage2": "claude-3-5-sonnet-20241022",
+    "llm_stage3": "claude-3-5-sonnet-20241022",
+    "usage_stats": {
+      "stage1_tokens": {"input": 10000, "output": 500},
+      "stage2_tokens": {"input": 2000, "output": 5000},
+      "stage3_tokens": {"input": 5000, "output": 1000},
+      "total_input_tokens": 17000,
+      "total_output_tokens": 6500,
+      "estimated_cost_usd": 0.114,
+      "response_time_sec": 45.2
+    },
+    "cache_stats": {
+      "claims_total": 5,
+      "claims_from_cache": 4,
+      "claims_new_analysis": 1,
+      "cache_hit_rate": 0.80,
+      "cache_savings_usd": 0.324
+    }
+  },
+  "article_holistic_assessment": {
+    "main_thesis": "...",
+    "overall_verdict": "MISLEADING",
+    "logic_quality_score": 0.42,
+    "fallacies_detected": ["correlation-causation", "cherry-picking"],
+    "verdict_reasoning": ["...", "...", "..."],
+    "experimental_feature": true
+  },
+  "claims": [
+    {
+      "claim_id": "C1",
+      "is_central_to_thesis": true,
+      "claim_text": "...",
+      "canonical_claim": "...",
+      "claim_hash": "sha256:abc123...",
+      "claim_type": "causal",
+      "evaluability": "evaluable",
+      "risk_tier": "B",
+      "source": "cache",
+      "cached_at": "2025-12-20T15:30:00Z",
+      "cache_hit_count": 47,
+      "scenarios": [...]
+    },
+    {
+      "claim_id": "C2",
+      "source": "new_analysis",
+      "analyzed_at": "2025-12-24T10:31:15Z",
+      "scenarios": [...]
+    }
+  ],
+  "quality_gates": {
+    "gate1_claim_validation": "pass",
+    "gate4_verdict_confidence": "pass",
+    "passed_all": true
+  }
+}
+{{/code}}
+
+
+
 === 4.5 Verdict Label Taxonomy ===

 FactHarbor uses **three distinct verdict taxonomies** depending on analysis level:
@@ -361,26 +361,23 @@
 Used for individual scenario verdicts within a claim.

 **Enum Values:**
+* {{code}}Highly Likely{{/code}} - Probability 0.85-1.0, high confidence
+* {{code}}Likely{{/code}} - Probability 0.65-0.84, moderate-high confidence
+* {{code}}Unclear{{/code}} - Probability 0.35-0.64, or low confidence
+* {{code}}Unlikely{{/code}} - Probability 0.16-0.34, moderate-high confidence
+* {{code}}Highly Unlikely{{/code}} - Probability 0.0-0.15, high confidence
+* {{code}}Unsubstantiated{{/code}} - Insufficient evidence to determine probability

-* Highly Likely - Probability 0.85-1.0, high confidence
-* Likely - Probability 0.65-0.84, moderate-high confidence
-* Unclear - Probability 0.35-0.64, or low confidence
-* Unlikely - Probability 0.16-0.34, moderate-high confidence
-* Highly Unlikely - Probability 0.0-0.15, high confidence
-* Unsubstantiated - Insufficient evidence to determine probability
-
 ==== 4.5.2 Claim Verdict Labels (Rollup) ====

 Used when summarizing a claim across all scenarios.

 **Enum Values:**
+* {{code}}Supported{{/code}} - Majority of scenarios are Likely or Highly Likely
+* {{code}}Refuted{{/code}} - Majority of scenarios are Unlikely or Highly Unlikely
+* {{code}}Inconclusive{{/code}} - Mixed scenarios or majority Unclear/Unsubstantiated

-* Supported - Majority of scenarios are Likely or Highly Likely
-* Refuted - Majority of scenarios are Unlikely or Highly Unlikely
-* Inconclusive - Mixed scenarios or majority Unclear/Unsubstantiated
-
 **Mapping Logic:**
-
 * If ≥60% scenarios are (Highly Likely | Likely) → Supported
 * If ≥60% scenarios are (Highly Unlikely | Unlikely) → Refuted
 * Otherwise → Inconclusive
@@ -390,23 +390,23 @@
 Used for holistic article-level assessment.

 **Enum Values:**
+* {{code}}WELL-SUPPORTED{{/code}} - Article thesis logically follows from supported claims
+* {{code}}MISLEADING{{/code}} - Claims may be true but article commits logical fallacies
+* {{code}}REFUTED{{/code}} - Central claims are refuted, invalidating thesis
+* {{code}}UNCERTAIN{{/code}} - Insufficient evidence or highly mixed claim verdicts

-* WELL-SUPPORTED - Article thesis logically follows from supported claims
-* MISLEADING - Claims may be true but article commits logical fallacies
-* REFUTED - Central claims are refuted, invalidating thesis
-* UNCERTAIN - Insufficient evidence or highly mixed claim verdicts
-
 **Note:** Article verdict considers **claim centrality** (central claims override supporting claims).

 ==== 4.5.4 API Field Mapping ====

 |=Level|=API Field|=Enum Name
-|Scenario|scenarios[].verdict.label|scenario_verdict_label
-|Claim|claims[].rollup_verdict (optional)|claim_verdict_label
-|Article|article_holistic_assessment.overall_verdict|article_verdict_label
+|Scenario|{{code}}scenarios[].verdict.label{{/code}}|scenario_verdict_label
+|Claim|{{code}}claims[].rollup_verdict{{/code}} (optional)|claim_verdict_label
+|Article|{{code}}article_holistic_assessment.overall_verdict{{/code}}|article_verdict_label

-----

+---
+
 == 5. Cache Architecture ==

 === 5.1 Redis Cache Design ===
@@ -414,29 +414,117 @@
 **Technology:** Redis 7.0+ (in-memory key-value store)

 **Cache Key Schema:**
+{{code}}
+claim:v1norm1:{language}:{sha256(canonical_claim)}
+{{/code}}

-{{{claim:v1norm1:{language}:{sha256(canonical_claim)}
-}}}
-
 **Example:**
-
-{{{Claim (English): "COVID vaccines are 95% effective"
+{{code}}
+Claim (English): "COVID vaccines are 95% effective"
 Canonical: "covid vaccines are 95 percent effective"
 Language: "en"
 SHA256: abc123...def456
 Key: claim:v1norm1:en:abc123...def456
-}}}
+{{/code}}

 **Rationale:** Prevents cross-language collisions and enables per-language cache analytics.

 **Data Structure:**
+{{code language="redis"}}
+SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}'
+EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days
+{{/code}}

-{{{SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}'
-EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days
-}}}
+**Additional Keys:**
+{{code}}

-----
+==== 5.1.1 Canonical Claim Normalization (v1) ====

+The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly.
+
+**Algorithm: Canonical Claim Normalization v1**
+
+{{code language="python"}}
+def normalize_claim_v1(claim_text: str, language: str) -> str:
+    """
+    Normalizes claim to canonical form for cache key generation.
+    Version: v1norm1 (POC1)
+    """
+    import re
+    import unicodedata
+
+    # Step 1: Unicode normalization (NFC)
+    text = unicodedata.normalize('NFC', claim_text)
+
+    # Step 2: Lowercase
+    text = text.lower()
+
+    # Step 3: Remove punctuation (except hyphens in words; '%' is kept for Step 5)
+    text = re.sub(r'[^\w\s%-]', '', text)
+
+    # Step 4: Normalize whitespace (collapse multiple spaces)
+    text = re.sub(r'\s+', ' ', text).strip()
+
+    # Step 5: Numeric normalization ("95%" and "95 %" both become "95 percent")
+    text = re.sub(r'\s*%', ' percent', text)
+    # Spell out single-digit numbers
+    num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
+                   '4':'four', '5':'five', '6':'six', '7':'seven',
+                   '8':'eight', '9':'nine'}
+    for num, word in num_to_word.items():
+        text = re.sub(rf'\b{num}\b', word, text)
+
+    # Step 6: Common abbreviations (English only in v1)
+    if language == 'en':
+        text = text.replace('covid-19', 'covid')
+        text = text.replace('u.s.', 'us')
+        text = text.replace('u.k.', 'uk')
+
+    # Step 7: NO entity normalization in v1
+    # (Trump vs Donald Trump vs President Trump remain distinct)
+
+    return text
+
+# Version identifier (include in cache namespace)
+CANONICALIZER_VERSION = "v1norm1"
+{{/code}}
+
+**Cache Key Formula (Updated):**
+
+{{code}}
+language = "en"
+canonical = normalize_claim_v1(claim_text, language)
+cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"
+
+Example:
+  claim: "COVID-19 vaccines are 95% effective"
+  canonical: "covid vaccines are 95 percent effective"
+  sha256: abc123...def456
+  key: "claim:v1norm1:en:abc123...def456"
+{{/code}}
+
+**Cache Metadata MUST Include:**
+
+{{code language="json"}}
+{
+  "canonical_claim": "covid vaccines are 95 percent effective",
+  "canonicalizer_version": "v1norm1",
+  "language": "en",
+  "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
+}
+{{/code}}
+
+**Version Upgrade Path:**
+* v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
+* v1normN → v2norm1: Major version bump, invalidate all v1 caches
+
+
+claim:stats:hit_count:{claim_hash} # Counter
+claim:index:domain:{domain} # Set of claim hashes by domain
+claim:index:language:{lang} # Set of claim hashes by language
+{{/code}}
+
+
 === 5.1.1 Canonical Claim Normalization (v1) ===

 The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly.
@@ -443,7 +443,8 @@

 **Algorithm: Canonical Claim Normalization v1**

-{{{def normalize_claim_v1(claim_text: str, language: str) -> str:
+{{code language="python"}}
+def normalize_claim_v1(claim_text: str, language: str) -> str:
     """
     Normalizes claim to canonical form for cache key generation.
     Version: v1norm1 (POC1)
@@ -485,11 +485,12 @@

 # Version identifier (include in cache namespace)
 CANONICALIZER_VERSION = "v1norm1"
-}}}
+{{/code}}

 **Cache Key Formula (Updated):**

-{{{language = "en"
+{{code}}
+language = "en"
 canonical = normalize_claim_v1(claim_text, language)
 cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"

@@ -498,25 +498,25 @@
   canonical: "covid vaccines are 95 percent effective"
   sha256: abc123...def456
   key: "claim:v1norm1:en:abc123...def456"
-}}}
+{{/code}}

 **Cache Metadata MUST Include:**

-{{{{
+{{code language="json"}}
+{
   "canonical_claim": "covid vaccines are 95 percent effective",
   "canonicalizer_version": "v1norm1",
   "language": "en",
   "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
 }
-}}}
+{{/code}}

 **Version Upgrade Path:**
-
 * v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
 * v1normN → v2norm1: Major version bump, invalidate all v1 caches

-----

+
 === 5.1.2 Copyright & Data Retention Policy ===

 **Evidence Excerpt Storage:**
@@ -524,7 +524,6 @@
 To comply with copyright law and fair use principles:

 **What We Store:**
-
 * **Metadata only:** Title, author, publisher, URL, publication date
 * **Short excerpts:** Max 25 words per quote, max 3 quotes per evidence item
 * **Summaries:** AI-generated bullet points (not verbatim text)
@@ -531,20 +531,17 @@
 * **No full articles:** Never store complete article text beyond job processing

 **Total per Cached Claim:**
-
 * Scenarios: 2 per claim
 * Evidence items: 6 per scenario (12 total)
 * Quotes: 3 per evidence × 25 words = 75 words per item
-* **Maximum stored verbatim text:** ~~900 words per claim (12 × 75)
+* **Maximum stored verbatim text:** ~900 words per claim (12 × 75)

 **Retention:**
-
 * Cache TTL: 90 days
 * Job outputs: 24 hours (then archived or deleted)
 * No persistent full-text article storage

 **Rationale:**
-
 * Short excerpts for citation = fair use
 * Summaries are transformative (not copyrightable)
 * Limited retention (90 days max)
@@ -551,27 +551,480 @@
 * No commercial republication of excerpts

 **DMCA Compliance:**
-
 * Cache invalidation endpoint available for rights holders
 * Contact: dmca@factharbor.org

-----

-== Summary ==
+=== 5.2 Cache Invalidation Strategy ===

-This WYSIWYG preview shows the **structure and key sections** of the 1,515-line API specification.
+**Time-Based (Primary):**
+* TTL: 90 days for most claims
+* Reasoning: Evidence freshness, news cycles

-**Full specification includes:**
+**Event-Based (Manual):**
+* Admin can flag claims for invalidation
+* Example: "Major study retracts findings"
+* Tool: {{code}}DELETE /v1/cache/claim/{claim_hash}?reason=retraction{{/code}}

-* Complete API endpoints (7 total)
-* All data schemas (ClaimExtraction, ClaimAnalysis, HolisticAssessment, Complete)
-* Quality gates & validation rules
-* LLM configuration for all 3 stages
-* Implementation notes with code samples
-* Testing strategy
-* Cross-references to other pages
+**Version-Based (Automatic):**
+* AKEL v2.0 release → Invalidate all v1.0 caches
+* Cache keys include version: {{code}}claim:v1:*{{/code}} vs {{code}}claim:v2:*{{/code}}

-**The complete specification is available in:**
+**Long-Lived Historical Claims:**
+* Historical claims about completed events generally have stable verdicts
+* Example: "2024 US presidential election results"
+* **Policy:** Extended TTL (365-3,650 days) instead of "never invalidate"
+* **Reason:** Even historical data gets revisions (updated counts, corrections)
+* **Mechanism:** Admin can still manually invalidate if major correction issued
+* **Flag:** {{code}}is_historical=true{{/code}} in cache metadata → longer TTL

-* FactHarbor_POC1_API_and_Schemas_Spec_v0_4_1_PATCHED.md (45 KB standalone)
-* Export files (TEST/PRODUCTION) for xWiki import
+=== 5.3 Cache Warming Strategy ===
+
+**Proactive Cache Building (Future):**
+
+**Trending Topics:**
+* Monitor news APIs for trending topics
+* Pre-analyze top 20 common claims
+* Example: New health study published → Pre-cache related claims
+
+**Predictable Events:**
+* Elections, sporting events, earnings reports
+* Pre-cache expected claims before event
+* Reduces load during traffic spikes
+
+**User Patterns:**
+* Analyze query logs
+* Identify frequently requested claims
+* Prioritize cache warming for these
+
+---
+
+== 6. Quality Gates & Validation Rules ==
+
+=== 6.1 Quality Gate Overview ===
+
+|=Gate|=Name|=POC1 Status|=Applies To|=Notes
+|**Gate 1**|Claim Validation|✅ Hard gate|Stage 1: Extraction|Filters opinions, compound claims
+|**Gate 2**|Contradiction Search|✅ Mandatory rule|Stage 2: Analysis|Enforced per cached claim
+|**Gate 3**|Uncertainty Disclosure|⚠️ Soft guidance|Stage 2: Analysis|Best practice
+|**Gate 4**|Verdict Confidence|✅ Hard gate|Stage 2: Analysis|Confidence ≥ 0.5 required
+
+**Hard Gate Failures:**
+* Gate 1 fail → Claim excluded from analysis
+* Gate 4 fail → Claim marked "Unsubstantiated" but included
+
+=== 6.2 Validation Rules ===
+
+|=Rule|=Requirement
+|**Mandatory Contradiction**|Stage 2 MUST search for "undermines" evidence. If none found, reasoning must state: "No counter-evidence found despite targeted search."
+|**Context-Aware Logic**|Stage 3 must prioritize central claims. If {{code}}is_central_to_thesis=true{{/code}} claim is REFUTED, article cannot be WELL-SUPPORTED.
+|**Cache Consistency**|Cached claims must match current AKEL version. Version mismatch → cache miss.
+|**Author Identification**|All outputs MUST include {{code}}author_type: "AI/AKEL"{{/code}}.
+
+---
+
+== 7. Deterministic Markdown Template ==
+
+Report generation uses a **fixed template** (not LLM-generated).
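Because the report comes from a fixed template rather than from the LLM, rendering reduces to plain placeholder substitution. The following is a minimal Python sketch under stated assumptions: `TEMPLATE` is a shortened stand-in that reuses a few `{placeholder}` names from the cache-only template in this section, and `render_report` with its strict missing-field check is a hypothetical helper, not part of the specification.

```python
import string

# Shortened stand-in for the cache-only template; placeholder names follow the spec.
TEMPLATE = (
    "**Job ID:** {job_id} | **Generated:** {timestamp_utc}\n"
    "{cache_coverage_percent}% of claims were available in cache.\n"
    "* {claims_cached} of {claims_total} claims analyzed\n"
)


def render_report(template: str, fields: dict) -> str:
    """Fill a fixed template; fail loudly if any placeholder has no value."""
    # Collect every placeholder name that appears in the template.
    needed = {name for _, name, _, _ in string.Formatter().parse(template) if name}
    missing = sorted(needed - set(fields))
    if missing:
        raise KeyError(f"missing template fields: {missing}")
    return template.format(**fields)
```

Rejecting incomplete input keeps the output deterministic: the same field values always produce the same report, and a missing value cannot silently yield a half-filled one.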
+
+**Cache-Only Mode Template:**
+{{code language="markdown"}}
+# FactHarbor Analysis Report: PARTIAL ANALYSIS
+
+**Job ID:** {job_id} | **Generated:** {timestamp_utc}
+**Mode:** Cache-Only (Free Tier)
+
+---
+
+## ⚠️ Partial Analysis Notice
+
+This is a **cache-only analysis** based on previously analyzed claims.
+{cache_coverage_percent}% of claims were available in cache.
+
+**What's Included:**
+* {claims_cached} of {claims_total} claims analyzed
+* Evidence and verdicts from cache (last updated: {oldest_cache_date})
+
+**What's Missing:**
+* {claims_missing} claims require new analysis
+* Full article holistic assessment unavailable
+* Estimated cost to complete: ${cost_to_complete}
+
+**[Upgrade to Pro]** for complete analysis
+
+---
+
+## Cached Claims
+
+### [C1] {claim_text} ✅ From Cache
+* **Cached:** {cached_at} ({cache_age} ago)
+* **Times Used:** {hit_count} articles
+* **Verdict:** {verdict} (Confidence: {confidence})
+* **Evidence:** {evidence_count} sources
+
+[Full claim details...]
+
+### [C3] {claim_text} ⚠️ Not In Cache
+* **Status:** Requires new analysis
+* **Cost:** $0.081
+* **Upgrade to analyze this claim**
+
+---
+
+**Powered by FactHarbor POC1-v0.4** | [Upgrade](https://factharbor.org/upgrade)
+{{/code}}
+
+---
+
+== 8. LLM Configuration (3-Stage) ==
+
+=== 8.1 Stage 1: Claim Extraction (Haiku) ===
+
+|=Parameter|=Value|=Notes
+|**Model**|{{code}}claude-haiku-4-20250108{{/code}}|Fast, cheap, sufficient for extraction
+|**Input Tokens**|~10K|Article text after URL extraction
+|**Output Tokens**|~500|5 claims @ ~100 tokens each
+|**Cost**|$0.003 per article|($0.25/M input + $1.25/M output)
+|**Temperature**|0.0|Deterministic
+|**Max Tokens**|1000|Generous buffer
+
+**Prompt Strategy:**
+* Extract 5 verifiable factual claims
+* Mark central vs. supporting claims
+* Canonicalize (normalize phrasing)
+* Deduplicate similar claims
+* Output structured JSON only
+
+=== 8.2 Stage 2: Claim Analysis (Sonnet, CACHED) ===
+
+|=Parameter|=Value|=Notes
+|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|High quality for verdicts
+|**Input Tokens**|~2K|Single claim + prompt + context
+|**Output Tokens**|~5K|2 scenarios × ~2.5K tokens
+|**Cost**|$0.081 per NEW claim|($3/M input + $15/M output)
+|**Temperature**|0.0|Deterministic (cache consistency)
+|**Max Tokens**|8000|Sufficient for 2 scenarios
+|**Cache Strategy**|Redis, 90-day TTL|Key: {{code}}claim:v1norm1:{language}:{sha256(canonical_claim)}{{/code}}
+
+**Prompt Strategy:**
+* Generate 2 scenario interpretations
+* Search for supporting AND undermining evidence (mandatory)
+* 6 evidence items per scenario maximum
+* Compute verdict with reasoning chain (3-4 bullets)
+* Output structured JSON only
+
+**Output Constraints (Cost Control):**
+* Scenarios: Max 2 per claim
+* Evidence: Max 6 per scenario
+* Evidence summary: Max 3 bullets
+* Reasoning chain: Max 4 bullets
+
+=== 8.3 Stage 3: Holistic Assessment (Sonnet) ===
+
+|=Parameter|=Value|=Notes
+|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|Context-aware analysis
+|**Input Tokens**|~5K|Article + claim verdicts
+|**Output Tokens**|~1K|Article verdict + fallacies
+|**Cost**|$0.030 per article|($3/M input + $15/M output)
+|**Temperature**|0.0|Deterministic
+|**Max Tokens**|2000|Sufficient for assessment
+
+**Prompt Strategy:**
+* Detect main thesis
+* Evaluate logical coherence (claim verdicts → thesis)
+* Identify fallacies (correlation-causation, cherry-picking, etc.)
+* Compute logic_quality_score
+* Explain article verdict reasoning (3-4 bullets)
+* Output structured JSON only
+
+=== 8.4 Cost Projections by Cache Hit Rate ===
+
+|=Cache Hit Rate|=Cost per Article|=10K Articles Cost|=100K Articles Cost
+|0% (cold start)|$0.438|$4,380|$43,800
+|20%|$0.357|$3,570|$35,700
+|40%|$0.276|$2,760|$27,600
+|**60%**|**$0.195**|**$1,950**|**$19,500**
+|**70%** (target)|**$0.155**|**$1,550**|**$15,500**
+|**80%**|**$0.114**|**$1,140**|**$11,400**
+|**90%**|**$0.073**|**$730**|**$7,300**
+|95%|$0.053|$530|$5,300
+
+**Break-Even Analysis:**
+* Monolithic (v0.3.1): $0.15 per article constant
+* 3-stage breaks even at roughly **70% cache hit rate** ($0.155 vs. $0.15)
+* Expected after ~1,500 articles in same domain
+
+---
+
+== 9. Implementation Notes ==
+
+=== 9.1 Recommended Tech Stack ===
+
+* **Framework:** Next.js 14+ with App Router (TypeScript)
+* **Cache:** Redis 7.0+ (managed: AWS ElastiCache, Redis Cloud, Upstash)
+* **Storage:** Filesystem JSON for jobs + S3/R2 for archival
+* **Queue:** BullMQ with Redis (for 3-stage pipeline orchestration)
+* **LLM Client:** Anthropic Python SDK or TypeScript SDK
+* **Cost Tracking:** PostgreSQL for user credit ledger
+* **Deployment:** Vercel (frontend + API) + Redis Cloud
+
+=== 9.2 3-Stage Pipeline Implementation ===
+
+**Job Queue Flow (Conceptual):**
+
+{{code language="typescript"}}
+// Assumes `queue` (Queue) and `queueEvents` (QueueEvents) were created elsewhere.
+// BullMQ Job objects are not event emitters, so completion is awaited
+// with job.waitUntilFinished(queueEvents).
+
+// Stage 1: Extract Claims
+const stage1Job = await queue.add('stage1-extract-claims', {
+  jobId: 'job123',
+  articleUrl: 'https://example.com/article'
+});
+
+// On Stage 1 completion → enqueue Stage 2 jobs
+const { claims } = await stage1Job.waitUntilFinished(queueEvents);
+
+// Stage 2: Analyze each claim (with cache check)
+const stage2Jobs = await Promise.all(
+  claims.map(claim =>
+    queue.add('stage2-analyze-claim', {
+      jobId: 'job123',
+      claimId: claim.claim_id,
+      canonicalClaim: claim.canonical_claim,
+      checkCache: true
+    })
+  )
+);
+
+// On all Stage 2 completions → enqueue Stage 3
+await Promise.all(stage2Jobs.map(j => j.waitUntilFinished(queueEvents)));
+
+const claimVerdicts = await gatherStage2Results('job123');
+
+await queue.add('stage3-holistic', {
+  jobId: 'job123',
+  articleUrl: 'https://example.com/article',
+  claimVerdicts: claimVerdicts
+});
+{{/code}}
+
+**Note:** This is a conceptual sketch. Actual implementation may use BullMQ Flow API or custom orchestration.
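The fan-out/fan-in shape of this flow does not depend on a particular queue library. As a language-neutral illustration, here is a minimal Python sketch of "custom orchestration" with a per-claim cache check. Everything in it is an assumption made for illustration: the stage functions are stubs, `_cache` is an in-memory dict standing in for Redis, and the `"stub"` verdict is a placeholder rather than a value from the verdict taxonomy. Only the key layout `claim:v1norm1:{language}:{sha256(canonical_claim)}` is taken from the specification.

```python
import asyncio
import hashlib

CANONICALIZER_VERSION = "v1norm1"
_cache: dict[str, dict] = {}  # hypothetical in-memory stand-in for Redis


def cache_key(canonical_claim: str, language: str) -> str:
    """Build claim:{version}:{language}:{sha256(canonical_claim)}."""
    digest = hashlib.sha256(canonical_claim.encode("utf-8")).hexdigest()
    return f"claim:{CANONICALIZER_VERSION}:{language}:{digest}"


async def stage1_extract(article_text: str) -> list[str]:
    # Stub: a real Stage 1 would call the extraction model.
    return [s.strip().lower() for s in article_text.split(".") if s.strip()]


async def stage2_analyze(canonical_claim: str, language: str) -> dict:
    key = cache_key(canonical_claim, language)
    if key in _cache:  # cache hit: skip the expensive analysis
        return {**_cache[key], "cache_hit": True}
    analysis = {"claim": canonical_claim, "verdict": "stub"}  # stub result
    _cache[key] = analysis
    return {**analysis, "cache_hit": False}


async def stage3_holistic(verdicts: list[dict]) -> dict:
    # Stub: summarize instead of running a holistic assessment.
    hits = sum(v["cache_hit"] for v in verdicts)
    return {"claims": len(verdicts), "cache_hits": hits}


async def run_pipeline(article_text: str, language: str = "en") -> dict:
    claims = await stage1_extract(article_text)
    # Fan out one Stage 2 task per claim, then fan in before Stage 3.
    verdicts = await asyncio.gather(*(stage2_analyze(c, language) for c in claims))
    return await stage3_holistic(list(verdicts))
```

Running the pipeline on two articles that share a claim shows the second run reporting one cache hit, which is the effect the cost projections in Section 8.4 rely on.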
+
+**Cache Check Logic:**
+{{code language="typescript"}}
+async function analyzeClaimWithCache(claim: string, language: string): Promise<ClaimAnalysis> {
+  const canonicalClaim = normalizeClaim(claim, language);
+  const claimHash = sha256(canonicalClaim);
+  // Key format: claim:v1norm1:{language}:{sha256(canonical_claim)}
+  const cacheKey = `claim:v1norm1:${language}:${claimHash}`;
+
+  // Check cache
+  const cached = await redis.get(cacheKey);
+  if (cached) {
+    await redis.incr(`claim:stats:hit_count:${claimHash}`);
+    return JSON.parse(cached);
+  }
+
+  // Cache miss - analyze with LLM
+  const analysis = await analyzeClaim_Stage2(canonicalClaim);
+
+  // Store in cache
+  await redis.set(cacheKey, JSON.stringify(analysis), 'EX', 7776000); // 90 days
+
+  return analysis;
+}
+{{/code}}
+
+=== 9.3 User Credit Management ===
+
+**PostgreSQL Schema:**
+{{code language="sql"}}
+CREATE TABLE user_credits (
+  user_id UUID PRIMARY KEY,
+  tier VARCHAR(20) DEFAULT 'free',
+  credit_limit DECIMAL(10,2) DEFAULT 10.00,
+  credit_used DECIMAL(10,2) DEFAULT 0.00,
+  reset_date TIMESTAMP,
+  cache_only_mode BOOLEAN DEFAULT false,
+  created_at TIMESTAMP DEFAULT NOW()
+);
+
+CREATE TABLE usage_log (
+  id SERIAL PRIMARY KEY,
+  user_id UUID REFERENCES user_credits(user_id),
+  job_id VARCHAR(50),
+  stage VARCHAR(20),
+  cost DECIMAL(10,4),
+  cache_hit BOOLEAN,
+  created_at TIMESTAMP DEFAULT NOW()
+);
+{{/code}}
+
+**Credit Deduction Logic:**
+{{code language="typescript"}}
+async function deductCredit(userId: string, cost: number): Promise<boolean> {
+  // Assumes a node-postgres style client: query() resolves to { rows: [...] }
+  const { rows } = await db.query('SELECT * FROM user_credits WHERE user_id = $1', [userId]);
+  const user = rows[0];
+
+  // node-postgres returns DECIMAL columns as strings, so convert before adding
+  const newUsed = Number(user.credit_used) + cost;
+
+  if (newUsed > Number(user.credit_limit) && user.tier === 'free') {
+    // Trigger cache-only mode
+    await db.query(
+      'UPDATE user_credits SET cache_only_mode = true WHERE user_id = $1',
+      [userId]
+    );
+    throw new Error('CREDIT_LIMIT_REACHED');
+  }
+
+  await db.query(
+    'UPDATE user_credits SET credit_used = $1 WHERE user_id = $2',
+    [newUsed, userId]
+  );
+
+  return true;
+}
+{{/code}}
+
+=== 9.4 Cache-Only Mode Implementation ===
+
+**Middleware:**
+{{code language="typescript"}}
+async function checkCacheOnlyMode(req, res, next) {
+  const user = await getUserCredit(req.userId);
+
+  if (user.cache_only_mode) {
+    // Allow only cache reads
+    if (req.body.options?.cache_preference !== 'allow_partial') {
+      return res.status(402).json({
+        error: 'credit_limit_reached',
+        message: 'Resubmit with cache_preference=allow_partial',
+        cache_only_mode: true
+      });
+    }
+
+    // Modify request to skip Stage 2 for uncached claims
+    req.cacheOnlyMode = true;
+  }
+
+  next();
+}
+{{/code}}
+
+=== 9.5 Estimated Timeline ===
+
+**POC1 with 3-Stage Architecture:**
+* Week 1: Stage 1 (Haiku extraction) + Redis setup
+* Week 2: Stage 2 (Sonnet analysis + caching)
+* Week 3: Stage 3 (Holistic assessment) + pipeline orchestration
+* Week 4: User credit system + cache-only mode
+* Week 5: Testing with 100 articles (measure cache hit rate)
+* Week 6: Optimization + bug fixes
+* **Total: 6-8 weeks**
+
+**Manual coding:** 12-16 weeks
+
+---
+
+== 10. Testing Strategy ==
+
+=== 10.1 Cache Performance Testing ===
+
+**Test Scenarios:**
+
+**Scenario 1: Cold Start (0 cache)**
+* Analyze 100 diverse articles
+* Measure: Cost per article, cache growth rate
+* Expected: $0.35-0.40 avg, ~400 unique claims cached
+
+**Scenario 2: Warm Cache (Overlapping Domain)**
+* Analyze 100 articles on SAME topic (e.g., "2024 election")
+* Measure: Cache hit rate growth
+* Expected: Hit rate 20% → 60% by article 100
+
+**Scenario 3: Mature Cache (1,000 articles)**
+* Analyze next 100 articles (diverse topics)
+* Measure: Steady-state cache hit rate
+* Expected: 60-70% hit rate, $0.15-0.18 avg cost
+
+**Scenario 4: Cache-Only Mode**
+* Free user reaches $10 limit (about 64 articles at 70% hit rate, $0.155 each)
+* Submit 10 more articles with {{code}}cache_preference=allow_partial{{/code}}
+* Measure: Coverage %, user satisfaction
+* Expected: 60-70% coverage, instant results
+
+=== 10.2 Success Metrics ===
+
+**Cache Performance:**
+* Week 1: 5-10% hit rate
+* Week 2: 15-25% hit rate
+* Week 3: 30-40% hit rate
+* Week 4: 45-55% hit rate
+* Target: ≥50% by 1,000 articles
+
+**Cost Targets:**
+* Articles 1-100: $0.35-0.40 avg ⚠️ (expected)
+* Articles 100-500: $0.25-0.30 avg
+* Articles 500-1,000: $0.18-0.22 avg
+* Articles 1,000+: $0.12-0.15 avg ✅
+
+**Quality Metrics (same as v0.3.1):**
+* Hallucination rate: <5%
+* Context-aware accuracy: ≥70%
+* False positive rate: <15%
+* Mandatory contradiction search: 100% compliance
+
+=== 10.3 Free Tier Economics Validation ===
+
+**Test with simulated 1,000 users:**
+* Each user: $10 credit
+* 70% cache hit rate
+* Avg 70 articles/user/month
+
+**Projected Costs:**
+* Total credits: 1,000 × $10 = $10,000
+* Actual LLM costs: ~$9,000 (cache savings)
+* Margin: 10%
+
+**Sustainability Check:**
+* If margin <5% → Reduce free tier limit
+* If margin >20% → Consider increasing free tier
+
+---
+
+== 11. Cross-References ==
+
+This API specification implements requirements from:
+
+* **[[POC Requirements>>Test.FactHarbor.Specification.POC.Requirements]]**
+** FR-POC-1 through FR-POC-6 (3-stage architecture)
+** NFR-POC-1 through NFR-POC-3 (quality gates, caching)
+** NEW: FR-POC-7 (Claim-level caching)
+** NEW: FR-POC-8 (User credit system)
+** NEW: FR-POC-9 (Cache-only mode)
+
+* **[[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]**
+** Approach 1 implemented in Stage 3
+** Context-aware holistic assessment
+
+* **[[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]**
+** FR4 (Analysis Summary) - enhanced with caching
+** FR7 (Verdict Calculation) - cached per claim
+** NFR11 (Quality Gates) - enforced across stages
+** NEW: NFR19 (Cost Efficiency via Caching)
+** NEW: NFR20 (Free Tier Sustainability)
+
+* **[[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]]**
+** POC1 3-stage pipeline architecture
+** Redis cache layer
+** User credit system
+
+* **[[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]]**
+** Claim structure (cacheable unit)
+** Evidence structure
+** Scenario boundaries
+
+---
+
+**End of Specification - FactHarbor POC1 API v0.4.1**
+
+**3-stage caching architecture with free tier cache-only mode. Ready for sustainable, scalable implementation!** 🚀
+