Changes for page POC1 API & Schemas Specification
Last modified by Robert Schaub on 2025/12/24 18:26
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
... ...
@@ -1,44 +1,25 @@
1 -# FactHarbor POC1 — API & Schemas Specification
1 += POC1 API & Schemas Specification =
2 2
3 -**Version:** 0.4.1 (POC1 - 3-Stage Caching Architecture)
4 -**Namespace:** FactHarbor.*
5 -**Syntax:** xWiki 2.1
6 -**Last Updated:** 2025-12-24
3 +----
7 7
8 ----
9 -
10 10 == Version History ==
11 11
12 12 |=Version|=Date|=Changes
13 13 |0.4.1|2025-12-24|Applied 9 critical fixes: file format notice, verdict taxonomy, canonicalization algorithm, Stage 1 cost policy, BullMQ fix, language in cache key, historical claims TTL, idempotency, copyright policy
14 14 |0.4|2025-12-24|**BREAKING:** 3-stage pipeline with claim-level caching, user tier system, cache-only mode for free users, Redis cache architecture
15 -|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints, chain-of-thought, evidence citation, Jina safety, gate numbering
16 -|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details, quality gate logging, temporal separation note, cross-references
17 -|0.2|2025-12-24|Initial rebased version with holistic assessment
18 -|0.1|2025-12-24|Original specification
10 +|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints
11 +|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details
19 19
20 ----
21 ----
13 +----
22 22
23 -== File Format Notice ==
24 -
25 -**⚠️ Important:** This file is stored as {{code}}.md{{/code}} for transport/versioning, but the content is **xWiki 2.1 syntax** (not Markdown).
26 -
27 -**When importing to xWiki:**
28 -* Use "Import as XWiki content" (not "Import as Markdown")
29 -* The xWiki parser will correctly interpret {{code}}==}} headers, {{{{code}}}}}} blocks, etc.
30 -
31 -**Alternate naming:** If your workflow supports it, rename to {{code}}.xwiki.txt{{/code}} to avoid ambiguity.
32 -
33 ----
34 -
35 35 == 1. Core Objective (POC1) ==
36 36
37 -The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability:
17 +The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability.
38 38
39 -The system must prove that AI can identify an article's **Main Thesis** and determine if the supporting claims (even if individually accurate) logically support that thesis without committing fallacies (e.g., correlation vs. causation, cherry-picking, hasty generalization).
19 +The system must prove that AI can identify an article's **Main Thesis** and determine if supporting claims logically support that thesis without committing fallacies.
40 40
41 -**Success Criteria:**
21 +=== Success Criteria: ===
22 +
42 42 * Test with 30 diverse articles
43 43 * Target: ≥70% accuracy detecting misleading articles
44 44 * Cost: <$0.25 per NEW analysis (uncached)
... ...
@@ -46,14 +46,13 @@
46 46 * Cache hit rate: ≥50% after 1,000 articles
47 47 * Processing time: <2 minutes (standard depth)
48 48
49 -**Economic Model:**
50 -* Free tier: $10 credit per month (~40-140 articles depending on cache hits)
51 -* After limit: Cache-only mode (instant, free access to cached claims)
52 -* Paid tier: Unlimited new analyses
30 +=== Economic Model: ===
53 53
54 -**See:** [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]] for complete investigation of 7 approaches.
32 +* **Free tier:** $10 credit per month (~~40-140 articles depending on cache hits)
33 +* **After limit:** Cache-only mode (instant, free access to cached claims)
34 +* **Paid tier:** Unlimited new analyses
55 55
56 ----
36 +----
57 57
58 58 == 2. Architecture Overview ==
59 59
... ...
@@ -61,7 +61,7 @@
61 61
62 62 FactHarbor POC1 uses a **3-stage architecture** designed for claim-level caching and cost efficiency:
63 63
64 -{{code language="mermaid"}}
44 +{{mermaid}}
65 65 graph TD
66 66 A[Article Input] --> B[Stage 1: Extract Claims]
67 67 B --> C{For Each Claim}
... ...
@@ -72,41 +72,46 @@
72 72 G --> E
73 73 E --> H[Stage 3: Holistic Assessment]
74 74 H --> I[Final Report]
75 -{{/code}}
55 +{{/mermaid}}
76 76
77 -**Stage 1: Claim Extraction** (Haiku, no cache)
78 -* Input: Article text
79 -* Output: 5 canonical claims (normalized, deduplicated)
80 -* Model: Claude Haiku 4
81 -* Cost: $0.003 per article
82 -* Cache strategy: No caching (article-specific)
57 +==== Stage 1: Claim Extraction (Haiku, no cache) ====
83 83
84 -**Stage 2: Claim Analysis** (Sonnet, CACHED)
85 -* Input: Single canonical claim
86 -* Output: Scenarios + Evidence + Verdicts
87 -* Model: Claude Sonnet 3.5
88 -* Cost: $0.081 per NEW claim
89 -* Cache strategy: **Redis, 90-day TTL**
90 -* Cache key: {{code}}claim:v1norm1:{language}:{sha256(canonical_claim)}{{/code}}
59 +* **Input:** Article text
60 +* **Output:** 5 canonical claims (normalized, deduplicated)
61 +* **Model:** Claude Haiku 4
62 +* **Cost:** $0.003 per article
63 +* **Cache strategy:** No caching (article-specific)
91 91
92 -**Stage 3: Holistic Assessment** (Sonnet, no cache)
93 -* Input: Article + Claim verdicts (from cache or Stage 2)
94 -* Output: Article verdict + Fallacies + Logic quality
95 -* Model: Claude Sonnet 3.5
96 -* Cost: $0.030 per article
97 -* Cache strategy: No caching (article-specific)
65 +==== Stage 2: Claim Analysis (Sonnet, CACHED) ====
98 98
99 -**Total Cost Formula:**
100 -{{code}}
101 -Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)
67 +* **Input:** Single canonical claim
68 +* **Output:** Scenarios + Evidence + Verdicts
69 +* **Model:** Claude Sonnet 3.5
70 +* **Cost:** $0.081 per NEW claim
71 +* **Cache strategy:** Redis, 90-day TTL
72 +* **Cache key:** claim:v1norm1:{language}:{sha256(canonical_claim)}
102 102
74 +==== Stage 3: Holistic Assessment (Sonnet, no cache) ====
75 +
76 +* **Input:** Article + Claim verdicts (from cache or Stage 2)
77 +* **Output:** Article verdict + Fallacies + Logic quality
78 +* **Model:** Claude Sonnet 3.5
79 +* **Cost:** $0.030 per article
80 +* **Cache strategy:** No caching (article-specific)
81 +
82 +=== Total Cost Formula: ===
83 +
84 +{{{Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)
85 +
103 103 Examples:
104 104 - 0 new claims (100% cache hit): $0.033
105 105 - 1 new claim (80% cache hit): $0.114
106 106 - 3 new claims (40% cache hit): $0.276
107 107 - 5 new claims (0% cache hit): $0.438
108 -{{/code}}
91 +}}}
109 109
93 +----
94 +
110 110 === 2.2 User Tier System ===
111 111
112 112 |=Tier|=Monthly Credit|=After Limit|=Cache Access|=Analytics
... ...
@@ -115,17 +115,21 @@
115 115 |**Enterprise** (future)|Custom|Continues|✅ Full + Priority|Full
116 116
117 117 **Free Tier Economics:**
103 +
118 118 * $10 credit = 40-140 articles analyzed (depending on cache hit rate)
119 119 * Average 70 articles/month at 70% cache hit rate
120 -* After limit: Cache-only mode (see Section 2.3)
106 +* After limit: Cache-only mode
121 121
108 +----
109 +
122 122 === 2.3 Cache-Only Mode (Free Tier Feature) ===
123 123
124 124 When free users reach their $10 monthly limit, they enter **Cache-Only Mode**:
125 125
126 -**What Cache-Only Mode Provides:**
114 +==== What Cache-Only Mode Provides: ====
127 127
128 128 ✅ **Claim Extraction (Platform-Funded):**
117 +
129 129 * Stage 1 extraction runs at $0.003 per article
130 130 * **Cost: Absorbed by platform** (not charged to user credit)
131 131 * Rationale: Extraction is necessary to check cache, and cost is negligible
... ...
@@ -132,27 +132,31 @@
132 132 * Rate limit: Max 50 extractions/day in cache-only mode (prevents abuse)
133 133
134 134 ✅ **Instant Access to Cached Claims:**
124 +
135 135 * Any claim that exists in cache → Full verdict returned
136 136 * Cost: $0 (no LLM calls)
137 137 * Response time: <100ms
138 138
139 139 ✅ **Partial Article Analysis:**
130 +
140 140 * Check each claim against cache
141 141 * Return verdicts for ALL cached claims
142 -* For uncached claims: Return {{code}}"status": "cache_miss"{{/code}}
133 +* For uncached claims: Return "status": "cache_miss"
143 143
144 144 ✅ **Cache Coverage Report:**
136 +
145 145 * "3 of 5 claims available in cache (60% coverage)"
146 146 * Links to cached analyses
147 147 * Estimated cost to complete: $0.162 (2 new claims)
148 148
149 149 ❌ **Not Available in Cache-Only Mode:**
142 +
150 150 * New claim analysis (Stage 2 LLM calls blocked)
151 151 * Full holistic assessment (Stage 3 blocked if any claims missing)
152 152
153 -**User Experience:**
154 -{{code language="json"}}
155 -{
146 +==== User Experience Example: ====
147 +
148 +{{{{
156 156 "status": "cache_only_mode",
157 157 "message": "Monthly credit limit reached. Showing cached results only.",
158 158 "cache_coverage": {
... ...
@@ -175,26 +175,26 @@
175 175 "pro_tier": "$50/month unlimited"
176 176 }
177 177 }
178 -{{/code}}
171 +}}}
179 179
180 180 **Design Rationale:**
174 +
181 181 * Free users still get value (cached claims often answer their question)
182 182 * Demonstrates FactHarbor's value (partial results encourage upgrade)
183 183 * Sustainable for platform (no additional cost)
184 184 * Fair to all users (everyone contributes to cache)
185 185
186 ----
180 +----
187 187
188 188 == 3. REST API Contract ==
189 189
190 190 === 3.1 User Credit Tracking ===
191 191
192 -**Endpoint:** {{code}}GET /v1/user/credit{{/code}}
186 +**Endpoint:** GET /v1/user/credit
193 193
194 -**Response:** {{code}}200 OK{{/code}}
188 +**Response:** 200 OK
195 195
196 -{{code language="json"}}
197 -{
190 +{{{{
198 198 "user_id": "user_abc123",
199 199 "tier": "free",
200 200 "credit_limit": 10.00,
... ...
@@ -209,30 +209,25 @@
209 209 "cache_hit_rate": 0.626
210 210 }
211 211 }
212 -{{/code}}
205 +}}}
213 213
214 ----
207 +----
215 215
216 216 === 3.2 Create Analysis Job (3-Stage) ===
217 217
218 -**Endpoint:** {{code}}POST /v1/analyze{{/code}}
211 +**Endpoint:** POST /v1/analyze
219 219
220 -**Request Body:**
213 +==== Idempotency Support: ====
221 221
222 -
223 -**Idempotency Support:**
224 -
225 225 To prevent duplicate job creation on network retries, clients SHOULD include:
226 226
227 -{{code language="http"}}
228 -POST /v1/analyze
217 +{{{POST /v1/analyze
229 229 Idempotency-Key: {client-generated-uuid}
230 -{{/code}}
219 +}}}
231 231
232 -OR use the {{code}}client.request_id{{/code}} field:
221 +OR use the client.request_id field:
233 233
234 -{{code language="json"}}
235 -{
223 +{{{{
236 236 "input_url": "...",
237 237 "client": {
238 238 "request_id": "client-uuid-12345",
... ...
@@ -239,17 +239,18 @@
239 239 "source_label": "optional"
240 240 }
241 241 }
242 -{{/code}}
230 +}}}
243 243
244 244 **Server Behavior:**
245 -* If {{code}}Idempotency-Key{{/code}} or {{code}}request_id{{/code}} seen before (within 24 hours):
246 - - Return existing job ({{code}}200 OK{{/code}}, not {{code}}202 Accepted{{/code}})
247 - - Do NOT create duplicate job or charge twice
233 +
234 +* If Idempotency-Key or request_id seen before (within 24 hours):
235 +** Return existing job (200 OK, not 202 Accepted)
236 +** Do NOT create duplicate job or charge twice
248 248 * Idempotency keys expire after 24 hours (matches job retention)
249 249
250 250 **Example Response (Idempotent):**
251 -{{code language="json"}}
252 -{
240 +
241 +{{{{
253 253 "job_id": "01J...ULID",
254 254 "status": "RUNNING",
255 255 "idempotent": true,
... ...
@@ -256,11 +256,11 @@
256 256 "original_request_at": "2025-12-24T10:31:00Z",
257 257 "message": "Returning existing job (idempotency key matched)"
258 258 }
259 -{{/code}}
248 +}}}
260 260
250 +==== Request Body: ====
261 261
262 -{{code language="json"}}
263 -{
252 +{{{{
264 264 "input_type": "url",
265 265 "input_url": "https://example.com/medical-report-01",
266 266 "input_text": null,
... ...
@@ -268,8 +268,9 @@
268 268 "browsing": "on",
269 269 "depth": "standard",
270 270 "max_claims": 5,
271 - "context_aware_analysis": true,
272 - "cache_preference": "prefer_cache"
260 + "scenarios_per_claim": 2,
261 + "max_evidence_per_scenario": 6,
262 + "context_aware_analysis": true
273 273 },
274 274 "client": {
275 275 "request_id": "optional-client-tracking-id",
... ...
@@ -276,18 +276,20 @@
276 276 "source_label": "optional"
277 277 }
278 278 }
279 -{{/code}}
269 +}}}
280 280
281 281 **Options:**
282 -* {{code}}cache_preference{{/code}}: {{code}}prefer_cache{{/code}} | {{code}}require_fresh{{/code}} | {{code}}allow_partial{{/code}}
283 - - {{code}}prefer_cache{{/code}}: Use cache when available, analyze new claims (default)
284 - - {{code}}require_fresh{{/code}}: Force re-analysis of all claims (ignores cache, costs more)
285 - - {{code}}allow_partial{{/code}}: Return partial results if some claims uncached (for free tier cache-only mode)
286 286
287 -**Response:** {{code}}202 Accepted{{/code}}
273 +* browsing: on | off (retrieve web sources or just output queries)
274 +* depth: standard | deep (evidence thoroughness)
275 +* max_claims: 1-10 (default: **5** for cost control)
276 +* scenarios_per_claim: 1-5 (default: **2** for cost control)
277 +* max_evidence_per_scenario: 3-10 (default: **6**)
278 +* context_aware_analysis: true | false (experimental)
288 288
289 -{{code language="json"}}
290 -{
280 +**Response:** 202 Accepted
281 +
282 +{{{{
291 291 "job_id": "01J...ULID",
292 292 "status": "QUEUED",
293 293 "created_at": "2025-12-24T10:31:00Z",
... ...
@@ -310,13 +310,13 @@
310 310 "events": "/v1/jobs/01J...ULID/events"
311 311 }
312 312 }
313 -{{/code}}
305 +}}}
314 314
315 315 **Error Responses:**
316 316
317 -{{code}}402 Payment Required{{/code}} - Free tier limit reached, cache-only mode
318 -{{code language="json"}}
319 -{
309 +402 Payment Required - Free tier limit reached, cache-only mode
310 +
311 +{{{{
320 320 "error": "credit_limit_reached",
321 321 "message": "Monthly credit limit reached. Entering cache-only mode.",
322 322 "cache_only_mode": true,
... ...
@@ -324,199 +324,15 @@ 324 324 "reset_date": "2025-02-01T00:00:00Z", 325 325 "action": "Resubmit with cache_preference=allow_partial for cached results" 326 326 } 327 - {{/code}}319 +}}} 328 328 329 ---- 321 +---- 330 330 331 -=== 3.3 Get Job Status === 332 - 333 -**Endpoint:** {{code}}GET /v1/jobs/{job_id}{{/code}} 334 - 335 -**Response:** {{code}}200 OK{{/code}} 336 - 337 -{{code language="json"}} 338 -{ 339 - "job_id": "01J...ULID", 340 - "status": "RUNNING", 341 - "created_at": "2025-12-24T10:31:00Z", 342 - "updated_at": "2025-12-24T10:31:22Z", 343 - "progress": { 344 - "stage": "stage2_claim_analysis", 345 - "percent": 65, 346 - "message": "Analyzing claim 3 of 5 (2 from cache)", 347 - "current_claim_id": "C3", 348 - "cache_hits": 2, 349 - "cache_misses": 1 350 - }, 351 - "actual_cost": 0.084, 352 - "cost_breakdown": { 353 - "stage1_extraction": 0.003, 354 - "stage2_new_claims": 0.081, 355 - "stage2_cached_claims": 0.000, 356 - "stage3_holistic": null 357 - }, 358 - "input_echo": { 359 - "input_type": "url", 360 - "input_url": "https://example.com/medical-report-01" 361 - }, 362 - "links": { 363 - "self": "/v1/jobs/01J...ULID", 364 - "result": "/v1/jobs/01J...ULID/result", 365 - "report": "/v1/jobs/01J...ULID/report" 366 - }, 367 - "error": null 368 -} 369 -{{/code}} 370 - 371 ---- 372 - 373 -=== 3.4 Get Analysis Result === 374 - 375 -**Endpoint:** {{code}}GET /v1/jobs/{job_id}/result{{/code}} 376 - 377 -**Response:** {{code}}200 OK{{/code}} 378 - 379 -Returns complete **AnalysisResult** schema (see Section 4). 
380 - 381 -**Cache-Only Mode Response:** {{code}}206 Partial Content{{/code}} 382 - 383 -{{code language="json"}} 384 -{ 385 - "cache_only_mode": true, 386 - "cache_coverage": { 387 - "claims_total": 5, 388 - "claims_cached": 3, 389 - "claims_missing": 2, 390 - "coverage_percent": 60 391 - }, 392 - "partial_result": { 393 - "metadata": { 394 - "job_id": "01J...ULID", 395 - "timestamp_utc": "2025-12-24T10:31:30Z", 396 - "engine_version": "POC1-v0.4", 397 - "cache_only": true 398 - }, 399 - "claims": [ 400 - { 401 - "claim_id": "C1", 402 - "claim_text": "...", 403 - "canonical_claim": "...", 404 - "source": "cache", 405 - "cached_at": "2025-12-20T15:30:00Z", 406 - "cache_hit_count": 47, 407 - "scenarios": [...] 408 - }, 409 - { 410 - "claim_id": "C3", 411 - "claim_text": "...", 412 - "canonical_claim": "...", 413 - "source": "not_analyzed", 414 - "status": "cache_miss", 415 - "estimated_cost": 0.081 416 - } 417 - ], 418 - "article_holistic_assessment": null, 419 - "upgrade_prompt": { 420 - "message": "Upgrade to Pro for full analysis of all claims", 421 - "missing_claims": 2, 422 - "cost_to_complete": 0.192 423 - } 424 - } 425 -} 426 -{{/code}} 427 - 428 -**Other Responses:** 429 -* {{code}}409 Conflict{{/code}} - Job not finished yet 430 -* {{code}}404 Not Found{{/code}} - Job ID unknown 431 - 432 ---- 433 - 434 -=== 3.5 Stage-Specific Endpoints (Optional, Advanced) === 435 - 436 -For direct stage access (useful for cache debugging, custom workflows): 437 - 438 -**Extract Claims Only:** 439 -{{code}}POST /v1/analyze/extract-claims{{/code}} 440 - 441 -**Analyze Single Claim:** 442 -{{code}}POST /v1/analyze/claim{{/code}} 443 - 444 -**Assess Article (with claim verdicts):** 445 -{{code}}POST /v1/analyze/assess-article{{/code}} 446 - 447 -**Check Claim Cache:** 448 -{{code}}GET /v1/cache/claim/{claim_hash}{{/code}} 449 - 450 -**Cache Statistics:** 451 -{{code}}GET /v1/cache/stats{{/code}} 452 - 453 ---- 454 - 455 -=== 3.6 Download Markdown Report === 456 - 457 
-**Endpoint:** {{code}}GET /v1/jobs/{job_id}/report{{/code}} 458 - 459 -**Response:** {{code}}200 OK{{/code}} with {{code}}text/markdown; charset=utf-8{{/code}} content 460 - 461 -**Headers:** 462 -* {{code}}Content-Disposition: attachment; filename="factharbor_poc1_{job_id}.md"{{/code}} 463 - 464 -**Cache-Only Mode:** Report includes "Partial Analysis" watermark and upgrade prompt. 465 - 466 ---- 467 - 468 -=== 3.7 Stream Job Events (Backend Progress) === 469 - 470 -**Endpoint:** {{code}}GET /v1/jobs/{job_id}/events{{/code}} 471 - 472 -**Response:** Server-Sent Events (SSE) stream 473 - 474 -**Event Types:** 475 -* {{code}}progress{{/code}} - Backend progress (e.g., "Stage 1: Extracting claims") 476 -* {{code}}cache_hit{{/code}} - Claim found in cache 477 -* {{code}}cache_miss{{/code}} - Claim requires new analysis 478 -* {{code}}stage_complete{{/code}} - Stage 1/2/3 finished 479 -* {{code}}complete{{/code}} - Job finished 480 -* {{code}}error{{/code}} - Error occurred 481 -* {{code}}credit_warning{{/code}} - User approaching limit 482 - 483 ---- 484 - 485 -=== 3.8 Cancel Job === 486 - 487 -**Endpoint:** {{code}}DELETE /v1/jobs/{job_id}{{/code}} 488 - 489 -**Note:** If job is mid-stage (e.g., analyzing claim 3 of 5), user is charged for completed work only. 490 - 491 ---- 492 - 493 -=== 3.9 Health Check === 494 - 495 -**Endpoint:** {{code}}GET /v1/health{{/code}} 496 - 497 -{{code language="json"}} 498 -{ 499 - "status": "ok", 500 - "version": "POC1-v0.4", 501 - "model_stage1": "claude-haiku-4", 502 - "model_stage2": "claude-3-5-sonnet-20241022", 503 - "model_stage3": "claude-3-5-sonnet-20241022", 504 - "cache": { 505 - "status": "connected", 506 - "total_claims": 12847, 507 - "avg_hit_rate_24h": 0.73 508 - } 509 -} 510 -{{/code}} 511 - 512 ---- 513 - 514 514 == 4. 
Data Schemas == 515 515 516 516 === 4.1 Stage 1 Output: ClaimExtraction === 517 517 518 -{{code language="json"}} 519 -{ 327 +{{{{ 520 520 "job_id": "01J...ULID", 521 521 "stage": "stage1_extraction", 522 522 "article_metadata": { ... ... @@ -541,219 +541,10 @@ 541 541 "article_thesis": "Main argument detected", 542 542 "cost": 0.003 543 543 } 544 - {{/code}}352 +}}} 545 545 546 - === 4.2 Stage 2 Output: ClaimAnalysis (CACHED) ===354 +---- 547 547 548 -This is the CACHEABLE unit. Stored in Redis with 90-day TTL. 549 - 550 -{{code language="json"}} 551 -{ 552 - "claim_hash": "sha256:abc123...", 553 - "canonical_claim": "COVID vaccines are 95% effective", 554 - "language": "en", 555 - "domain": "public_health", 556 - "analysis_version": "v1.0", 557 - "scenarios": [ 558 - { 559 - "scenario_id": "S1", 560 - "scenario_title": "mRNA vaccines (Pfizer/Moderna) in clinical trials", 561 - "definitions": {"95% effective": "95% reduction in symptomatic infection"}, 562 - "assumptions": ["Based on phase 3 trial data", "Against original strain"], 563 - "boundaries": { 564 - "time": "2020-2021 trials", 565 - "geography": "Multi-country trials", 566 - "population": "Adult population (16+)", 567 - "conditions": "Before widespread variants" 568 - }, 569 - "verdict": { 570 - "label": "Highly Likely", 571 - "probability_range": [0.88, 0.97], 572 - "confidence": 0.92, 573 - "reasoning_chain": [ 574 - "Pfizer/BioNTech trial: 95% efficacy (n=43,548)", 575 - "Moderna trial: 94.1% efficacy (n=30,420)", 576 - "Peer-reviewed publications in NEJM", 577 - "FDA independent analysis confirmed" 578 - ], 579 - "key_supporting_evidence_ids": ["E1", "E2"], 580 - "key_counter_evidence_ids": ["E3"], 581 - "uncertainty_factors": [ 582 - "Limited data on long-term effectiveness", 583 - "Variant-specific performance not yet measured" 584 - ] 585 - }, 586 - "evidence": [ 587 - { 588 - "evidence_id": "E1", 589 - "stance": "supports", 590 - "relevance_to_scenario": 0.98, 591 - "evidence_summary": [ 592 - 
"Pfizer trial showed 170 cases in placebo vs 8 in vaccine group", 593 - "Follow-up period median 2 months post-dose 2", 594 - "Efficacy consistent across age, sex, race, ethnicity" 595 - ], 596 - "citation": { 597 - "title": "Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine", 598 - "author_or_org": "Polack et al.", 599 - "publication_date": "2020-12-31", 600 - "url": "https://nejm.org/doi/full/10.1056/NEJMoa2034577", 601 - "publisher": "New England Journal of Medicine", 602 - "retrieved_at_utc": "2025-12-20T15:30:00Z" 603 - }, 604 - "excerpt": ["The vaccine was 95% effective in preventing Covid-19"], 605 - "excerpt_word_count": 9, 606 - "source_reliability_score": 0.95, 607 - "reliability_justification": "Peer-reviewed, high-impact journal, large RCT", 608 - "limitations_and_reservations": [ 609 - "Short follow-up period (2 months)", 610 - "Primarily measures symptomatic infection, not transmission" 611 - ], 612 - "retraction_or_dispute_signal": "none" 613 - } 614 - ] 615 - } 616 - ], 617 - "cache_metadata": { 618 - "first_analyzed": "2025-12-01T10:00:00Z", 619 - "last_updated": "2025-12-20T15:30:00Z", 620 - "hit_count": 47, 621 - "version": "v1.0", 622 - "ttl_expires": "2026-03-20T15:30:00Z" 623 - }, 624 - "cost": 0.081 625 -} 626 -{{/code}} 627 - 628 -**Cache Key Structure:** 629 -{{code}} 630 -Redis Key: claim:v1norm1:{language}:{sha256(canonical_claim)} 631 -TTL: 90 days (7,776,000 seconds) 632 -Size: ~15KB JSON (compressed: ~5KB) 633 -{{/code}} 634 - 635 -=== 4.3 Stage 3 Output: HolisticAssessment === 636 - 637 -{{code language="json"}} 638 -{ 639 - "job_id": "01J...ULID", 640 - "stage": "stage3_holistic", 641 - "article_metadata": { 642 - "title": "...", 643 - "main_thesis": "...", 644 - "source_url": "..." 
645 - }, 646 - "article_holistic_assessment": { 647 - "overall_verdict": "MISLEADING", 648 - "logic_quality_score": 0.42, 649 - "fallacies_detected": [ 650 - "correlation-causation", 651 - "cherry-picking" 652 - ], 653 - "verdict_reasoning": [ 654 - "Central claim C1 is REFUTED by multiple systematic reviews", 655 - "Supporting claims C2-C4 are TRUE but do not support the thesis", 656 - "Article commits correlation-causation fallacy", 657 - "Selective citation of evidence (cherry-picking detected)" 658 - ], 659 - "experimental_feature": true 660 - }, 661 - "claims_summary": [ 662 - { 663 - "claim_id": "C1", 664 - "is_central_to_thesis": true, 665 - "verdict": "Refuted", 666 - "confidence": 0.89, 667 - "source": "cache", 668 - "cache_hit": true 669 - }, 670 - { 671 - "claim_id": "C2", 672 - "is_central_to_thesis": false, 673 - "verdict": "Highly Likely", 674 - "confidence": 0.91, 675 - "source": "new_analysis", 676 - "cache_hit": false 677 - } 678 - ], 679 - "quality_gates": { 680 - "gate1_claim_validation": "pass", 681 - "gate4_verdict_confidence": "pass", 682 - "passed_all": true 683 - }, 684 - "cost": 0.030, 685 - "total_job_cost": 0.114 686 -} 687 -{{/code}} 688 - 689 -=== 4.4 Complete AnalysisResult (All 3 Stages Combined) === 690 - 691 -{{code language="json"}} 692 -{ 693 - "metadata": { 694 - "job_id": "01J...ULID", 695 - "timestamp_utc": "2025-12-24T10:31:30Z", 696 - "engine_version": "POC1-v0.4", 697 - "llm_stage1": "claude-haiku-4", 698 - "llm_stage2": "claude-3-5-sonnet-20241022", 699 - "llm_stage3": "claude-3-5-sonnet-20241022", 700 - "usage_stats": { 701 - "stage1_tokens": {"input": 10000, "output": 500}, 702 - "stage2_tokens": {"input": 2000, "output": 5000}, 703 - "stage3_tokens": {"input": 5000, "output": 1000}, 704 - "total_input_tokens": 17000, 705 - "total_output_tokens": 6500, 706 - "estimated_cost_usd": 0.114, 707 - "response_time_sec": 45.2 708 - }, 709 - "cache_stats": { 710 - "claims_total": 5, 711 - "claims_from_cache": 4, 712 - 
"claims_new_analysis": 1, 713 - "cache_hit_rate": 0.80, 714 - "cache_savings_usd": 0.324 715 - } 716 - }, 717 - "article_holistic_assessment": { 718 - "main_thesis": "...", 719 - "overall_verdict": "MISLEADING", 720 - "logic_quality_score": 0.42, 721 - "fallacies_detected": ["correlation-causation", "cherry-picking"], 722 - "verdict_reasoning": ["...", "...", "..."], 723 - "experimental_feature": true 724 - }, 725 - "claims": [ 726 - { 727 - "claim_id": "C1", 728 - "is_central_to_thesis": true, 729 - "claim_text": "...", 730 - "canonical_claim": "...", 731 - "claim_hash": "sha256:abc123...", 732 - "claim_type": "causal", 733 - "evaluability": "evaluable", 734 - "risk_tier": "B", 735 - "source": "cache", 736 - "cached_at": "2025-12-20T15:30:00Z", 737 - "cache_hit_count": 47, 738 - "scenarios": [...] 739 - }, 740 - { 741 - "claim_id": "C2", 742 - "source": "new_analysis", 743 - "analyzed_at": "2025-12-24T10:31:15Z", 744 - "scenarios": [...] 745 - } 746 - ], 747 - "quality_gates": { 748 - "gate1_claim_validation": "pass", 749 - "gate4_verdict_confidence": "pass", 750 - "passed_all": true 751 - } 752 -} 753 -{{/code}} 754 - 755 - 756 - 757 757 === 4.5 Verdict Label Taxonomy === 758 758 759 759 FactHarbor uses **three distinct verdict taxonomies** depending on analysis level: ... ... @@ -763,23 +763,26 @@ 763 763 Used for individual scenario verdicts within a claim. 
764 764 765 765 **Enum Values:** 766 -* {{code}}Highly Likely{{/code}} - Probability 0.85-1.0, high confidence 767 -* {{code}}Likely{{/code}} - Probability 0.65-0.84, moderate-high confidence 768 -* {{code}}Unclear{{/code}} - Probability 0.35-0.64, or low confidence 769 -* {{code}}Unlikely{{/code}} - Probability 0.16-0.34, moderate-high confidence 770 -* {{code}}Highly Unlikely{{/code}} - Probability 0.0-0.15, high confidence 771 -* {{code}}Unsubstantiated{{/code}} - Insufficient evidence to determine probability 772 772 366 +* Highly Likely - Probability 0.85-1.0, high confidence 367 +* Likely - Probability 0.65-0.84, moderate-high confidence 368 +* Unclear - Probability 0.35-0.64, or low confidence 369 +* Unlikely - Probability 0.16-0.34, moderate-high confidence 370 +* Highly Unlikely - Probability 0.0-0.15, high confidence 371 +* Unsubstantiated - Insufficient evidence to determine probability 372 + 773 773 ==== 4.5.2 Claim Verdict Labels (Rollup) ==== 774 774 775 775 Used when summarizing a claim across all scenarios. 776 776 777 777 **Enum Values:** 778 -* {{code}}Supported{{/code}} - Majority of scenarios are Likely or Highly Likely 779 -* {{code}}Refuted{{/code}} - Majority of scenarios are Unlikely or Highly Unlikely 780 -* {{code}}Inconclusive{{/code}} - Mixed scenarios or majority Unclear/Unsubstantiated 781 781 379 +* Supported - Majority of scenarios are Likely or Highly Likely 380 +* Refuted - Majority of scenarios are Unlikely or Highly Unlikely 381 +* Inconclusive - Mixed scenarios or majority Unclear/Unsubstantiated 382 + 782 782 **Mapping Logic:** 384 + 783 783 * If ≥60% scenarios are (Highly Likely | Likely) → Supported 784 784 * If ≥60% scenarios are (Highly Unlikely | Unlikely) → Refuted 785 785 * Otherwise → Inconclusive ... ... @@ -789,23 +789,23 @@ 789 789 Used for holistic article-level assessment. 
790 790 791 791 **Enum Values:** 792 -* {{code}}WELL-SUPPORTED{{/code}} - Article thesis logically follows from supported claims 793 -* {{code}}MISLEADING{{/code}} - Claims may be true but article commits logical fallacies 794 -* {{code}}REFUTED{{/code}} - Central claims are refuted, invalidating thesis 795 -* {{code}}UNCERTAIN{{/code}} - Insufficient evidence or highly mixed claim verdicts 796 796 395 +* WELL-SUPPORTED - Article thesis logically follows from supported claims 396 +* MISLEADING - Claims may be true but article commits logical fallacies 397 +* REFUTED - Central claims are refuted, invalidating thesis 398 +* UNCERTAIN - Insufficient evidence or highly mixed claim verdicts 399 + 797 797 **Note:** Article verdict considers **claim centrality** (central claims override supporting claims). 798 798 799 799 ==== 4.5.4 API Field Mapping ==== 800 800 801 801 |=Level|=API Field|=Enum Name 802 -|Scenario| {{code}}scenarios[].verdict.label{{/code}}|scenario_verdict_label803 -|Claim| {{code}}claims[].rollup_verdict{{/code}}(optional)|claim_verdict_label804 -|Article| {{code}}article_holistic_assessment.overall_verdict{{/code}}|article_verdict_label405 +|Scenario|scenarios[].verdict.label|scenario_verdict_label 406 +|Claim|claims[].rollup_verdict (optional)|claim_verdict_label 407 +|Article|article_holistic_assessment.overall_verdict|article_verdict_label 805 805 409 +---- 806 806 807 ---- 808 - 809 809 == 5. Cache Architecture == 810 810 811 811 === 5.1 Redis Cache Design === ... ... 
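The Stage 2 cache key scheme used throughout this specification, claim:v1norm1:{language}:{sha256(canonical_claim)}, can be sketched in Python roughly as follows. This is a minimal illustration, not the project's implementation: the function name `build_claim_cache_key` is hypothetical, and the choices of UTF-8 encoding and a lowercase hex digest are assumptions that the specification does not state. The input is assumed to be an already-normalized canonical claim.

```python
import hashlib

# Namespace tag taken from the key schema in this specification.
CANONICALIZER_VERSION = "v1norm1"


def build_claim_cache_key(canonical_claim: str, language: str) -> str:
    """Build a key of the form claim:v1norm1:{language}:{sha256 hex digest}.

    Hypothetical helper. Assumes canonical_claim is already normalized;
    UTF-8 encoding and hex output are assumptions, not spec requirements.
    """
    digest = hashlib.sha256(canonical_claim.encode("utf-8")).hexdigest()
    return f"claim:{CANONICALIZER_VERSION}:{language}:{digest}"


# The same canonical text under two languages yields two distinct keys,
# which is the cross-language separation the key schema is meant to give.
key_en = build_claim_cache_key("covid vaccines are 95 percent effective", "en")
key_de = build_claim_cache_key("covid vaccines are 95 percent effective", "de")
```

Placing the language before the hash keeps keys for one language under a common prefix, which may make per-language inspection of the cache simpler.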
@@ -813,117 +813,29 @@ 813 813 **Technology:** Redis 7.0+ (in-memory key-value store) 814 814 815 815 **Cache Key Schema:** 816 -{{code}} 817 -claim:v1norm1:{language}:{sha256(canonical_claim)} 818 -{{/code}} 819 819 419 +{{{claim:v1norm1:{language}:{sha256(canonical_claim)} 420 +}}} 421 + 820 820 **Example:** 821 - {{code}}822 -Claim (English): "COVID vaccines are 95% effective" 423 + 424 +{{{Claim (English): "COVID vaccines are 95% effective" 823 823 Canonical: "covid vaccines are 95 percent effective" 824 824 Language: "en" 825 825 SHA256: abc123...def456 826 826 Key: claim:v1norm1:en:abc123...def456 827 - {{/code}}429 +}}} 828 828 829 829 **Rationale:** Prevents cross-language collisions and enables per-language cache analytics. 830 830 831 831 **Data Structure:** 832 -{{code language="redis"}} 833 -SET claim:v1:abc123...def456 '{...ClaimAnalysis JSON...}' 834 -EXPIRE claim:v1:abc123...def456 7776000 # 90 days 835 -{{/code}} 836 836 837 -**Additional Keys:** 838 -{{code}} 435 +{{{SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}' 436 +EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days 437 +}}} 839 839 840 - ==== 5.1.1 Canonical Claim Normalization (v1) ====439 +---- 841 841 842 -The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly. 843 - 844 -**Algorithm: Canonical Claim Normalization v1** 845 - 846 -{{code language="python"}} 847 -def normalize_claim_v1(claim_text: str, language: str) -> str: 848 - """ 849 - Normalizes claim to canonical form for cache key generation. 
-    Version: v1norm1 (POC1)
-    """
-    import re
-    import unicodedata
-
-    # Step 1: Unicode normalization (NFC)
-    text = unicodedata.normalize('NFC', claim_text)
-
-    # Step 2: Lowercase
-    text = text.lower()
-
-    # Step 3: Remove punctuation (except hyphens in words)
-    text = re.sub(r'[^\w\s-]', '', text)
-
-    # Step 4: Normalize whitespace (collapse multiple spaces)
-    text = re.sub(r'\s+', ' ', text).strip()
-
-    # Step 5: Numeric normalization
-    text = text.replace('%', ' percent')
-    # Spell out single-digit numbers
-    num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
-                   '4':'four', '5':'five', '6':'six', '7':'seven',
-                   '8':'eight', '9':'nine'}
-    for num, word in num_to_word.items():
-        text = re.sub(rf'\b{num}\b', word, text)
-
-    # Step 6: Common abbreviations (English only in v1)
-    if language == 'en':
-        text = text.replace('covid-19', 'covid')
-        text = text.replace('u.s.', 'us')
-        text = text.replace('u.k.', 'uk')
-
-    # Step 7: NO entity normalization in v1
-    # (Trump vs Donald Trump vs President Trump remain distinct)
-
-    return text
-
-# Version identifier (include in cache namespace)
-CANONICALIZER_VERSION = "v1norm1"
-{{/code}}
-
-**Cache Key Formula (Updated):**
-
-{{code}}
-language = "en"
-canonical = normalize_claim_v1(claim_text, language)
-cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"
-
-Example:
-  claim: "COVID-19 vaccines are 95% effective"
-  canonical: "covid vaccines are 95 percent effective"
-  sha256: abc123...def456
-  key: "claim:v1norm1:en:abc123...def456"
-{{/code}}
-
-**Cache Metadata MUST Include:**
-
-{{code language="json"}}
-{
-  "canonical_claim": "covid vaccines are 95 percent effective",
-  "canonicalizer_version": "v1norm1",
-  "language": "en",
-  "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
-}
-{{/code}}
-
-**Version Upgrade Path:**
-* v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
-* v1normN → v2norm1: Major version bump, invalidate all v1 caches
-
-
-claim:stats:hit_count:{claim_hash} # Counter
-claim:index:domain:{domain} # Set of claim hashes by domain
-claim:index:language:{lang} # Set of claim hashes by language
-{{/code}}
-
-
 === 5.1.1 Canonical Claim Normalization (v1) ===

 The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly.
@@ -930,8 +930,7 @@

 **Algorithm: Canonical Claim Normalization v1**

-{{code language="python"}}
-def normalize_claim_v1(claim_text: str, language: str) -> str:
+{{{def normalize_claim_v1(claim_text: str, language: str) -> str:
     """
     Normalizes claim to canonical form for cache key generation.
     Version: v1norm1 (POC1)
@@ -973,12 +973,11 @@

 # Version identifier (include in cache namespace)
 CANONICALIZER_VERSION = "v1norm1"
-{{/code}}
+}}}

 **Cache Key Formula (Updated):**

-{{code}}
-language = "en"
+{{{language = "en"
 canonical = normalize_claim_v1(claim_text, language)
 cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"

@@ -987,25 +987,25 @@
   canonical: "covid vaccines are 95 percent effective"
   sha256: abc123...def456
   key: "claim:v1norm1:en:abc123...def456"
-{{/code}}
+}}}

 **Cache Metadata MUST Include:**

-{{code language="json"}}
-{
+{{{{
   "canonical_claim": "covid vaccines are 95 percent effective",
   "canonicalizer_version": "v1norm1",
   "language": "en",
   "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
 }
-{{/code}}
+}}}

 **Version Upgrade Path:**
+
 * v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
 * v1normN → v2norm1: Major version bump, invalidate all v1 caches

+----

-
 === 5.1.2 Copyright & Data Retention Policy ===

 **Evidence Excerpt Storage:**
@@ -1013,6 +1013,7 @@
 To comply with copyright law and fair use principles:

 **What We Store:**
+
 * **Metadata only:** Title, author, publisher, URL, publication date
 * **Short excerpts:** Max 25 words per quote, max 3 quotes per evidence item
 * **Summaries:** AI-generated bullet points (not verbatim text)
@@ -1019,17 +1019,20 @@
 * **No full articles:** Never store complete article text beyond job processing

 **Total per Cached Claim:**
+
 * Scenarios: 2 per claim
 * Evidence items: 6 per scenario (12 total)
 * Quotes: 3 per evidence × 25 words = 75 words per item
-* **Maximum stored verbatim text:** ~900 words per claim (12 × 75)
+* **Maximum stored verbatim text:** ~~900 words per claim (12 × 75)

 **Retention:**
+
 * Cache TTL: 90 days
 * Job outputs: 24 hours (then archived or deleted)
 * No persistent full-text article storage

 **Rationale:**
+
 * Short excerpts for citation = fair use
 * Summaries are transformative (not copyrightable)
 * Limited retention (90 days max)
@@ -1036,480 +1036,27 @@
 * No commercial republication of excerpts

 **DMCA Compliance:**
+
 * Cache invalidation endpoint available for rights holders
 * Contact: dmca@factharbor.org

+----

-=== 5.2 Cache Invalidation Strategy ===
+== Summary ==

-**Time-Based (Primary):**
-* TTL: 90 days for most claims
-* Reasoning: Evidence freshness, news cycles
+This WYSIWYG preview shows the **structure and key sections** of the 1,515-line API specification.

-**Event-Based (Manual):**
-* Admin can flag claims for invalidation
-* Example: "Major study retracts findings"
-* Tool: {{code}}DELETE /v1/cache/claim/{claim_hash}?reason=retraction{{/code}}
+**Full specification includes:**

-**Version-Based (Automatic):**
-* AKEL v2.0 release → Invalidate all v1.0 caches
-* Cache keys include version: {{code}}claim:v1:*{{/code}} vs {{code}}claim:v2:*{{/code}}
+* Complete API endpoints (7 total)
+* All data schemas (ClaimExtraction, ClaimAnalysis, HolisticAssessment, Complete)
+* Quality gates & validation rules
+* LLM configuration for all 3 stages
+* Implementation notes with code samples
+* Testing strategy
+* Cross-references to other pages

-**Long-Lived Historical Claims:**
-* Historical claims about completed events generally have stable verdicts
-* Example: "2024 US presidential election results"
-* **Policy:** Extended TTL (365-3,650 days) instead of "never invalidate"
-* **Reason:** Even historical data gets revisions (updated counts, corrections)
-* **Mechanism:** Admin can still manually invalidate if major correction issued
-* **Flag:** {{code}}is_historical=true{{/code}} in cache metadata → longer TTL
+**The complete specification is available in:**

-=== 5.3 Cache Warming Strategy ===
-
-**Proactive Cache Building (Future):**
-
-**Trending Topics:**
-* Monitor news APIs for trending topics
-* Pre-analyze top 20 common claims
-* Example: New health study published → Pre-cache related claims
-
-**Predictable Events:**
-* Elections, sporting events, earnings reports
-* Pre-cache expected claims before event
-* Reduces load during traffic spikes
-
-**User Patterns:**
-* Analyze query logs
-* Identify frequently requested claims
-* Prioritize cache warming for these
-
----- 
-
-== 6. Quality Gates & Validation Rules ==
-
-=== 6.1 Quality Gate Overview ===
-
-|=Gate|=Name|=POC1 Status|=Applies To|=Notes
-|**Gate 1**|Claim Validation|✅ Hard gate|Stage 1: Extraction|Filters opinions, compound claims
-|**Gate 2**|Contradiction Search|✅ Mandatory rule|Stage 2: Analysis|Enforced per cached claim
-|**Gate 3**|Uncertainty Disclosure|⚠️ Soft guidance|Stage 2: Analysis|Best practice
-|**Gate 4**|Verdict Confidence|✅ Hard gate|Stage 2: Analysis|Confidence ≥ 0.5 required
-
-**Hard Gate Failures:**
-* Gate 1 fail → Claim excluded from analysis
-* Gate 4 fail → Claim marked "Unsubstantiated" but included
-
-=== 6.2 Validation Rules ===
-
-|=Rule|=Requirement
-|**Mandatory Contradiction**|Stage 2 MUST search for "undermines" evidence. If none found, reasoning must state: "No counter-evidence found despite targeted search."
-|**Context-Aware Logic**|Stage 3 must prioritize central claims. If {{code}}is_central_to_thesis=true{{/code}} claim is REFUTED, article cannot be WELL-SUPPORTED.
-|**Cache Consistency**|Cached claims must match current AKEL version. Version mismatch → cache miss.
-|**Author Identification**|All outputs MUST include {{code}}author_type: "AI/AKEL"{{/code}}.
-
----- 
-
-== 7. Deterministic Markdown Template ==
-
-Report generation uses **fixed template** (not LLM-generated).
-
-**Cache-Only Mode Template:**
-{{code language="markdown"}}
-# FactHarbor Analysis Report: PARTIAL ANALYSIS
-
-**Job ID:** {job_id} | **Generated:** {timestamp_utc}
-**Mode:** Cache-Only (Free Tier)
-
----- 
-
-## ⚠️ Partial Analysis Notice
-
-This is a **cache-only analysis** based on previously analyzed claims.
-{cache_coverage_percent}% of claims were available in cache.
-
-**What's Included:**
-* {claims_cached} of {claims_total} claims analyzed
-* Evidence and verdicts from cache (last updated: {oldest_cache_date})
-
-**What's Missing:**
-* {claims_missing} claims require new analysis
-* Full article holistic assessment unavailable
-* Estimated cost to complete: ${cost_to_complete}
-
-**[Upgrade to Pro]** for complete analysis
-
----- 
-
-## Cached Claims
-
-### [C1] {claim_text} ✅ From Cache
-* **Cached:** {cached_at} ({cache_age} ago)
-* **Times Used:** {hit_count} articles
-* **Verdict:** {verdict} (Confidence: {confidence})
-* **Evidence:** {evidence_count} sources
-
-[Full claim details...]
-
-### [C3] {claim_text} ⚠️ Not In Cache
-* **Status:** Requires new analysis
-* **Cost:** $0.081
-* **Upgrade to analyze this claim**
-
----- 
-
-**Powered by FactHarbor POC1-v0.4** | [Upgrade](https://factharbor.org/upgrade)
-{{/code}}
-
----- 
-
-== 8. LLM Configuration (3-Stage) ==
-
-=== 8.1 Stage 1: Claim Extraction (Haiku) ===
-
-|=Parameter|=Value|=Notes
-|**Model**|{{code}}claude-haiku-4-20250108{{/code}}|Fast, cheap, sufficient for extraction
-|**Input Tokens**|~10K|Article text after URL extraction
-|**Output Tokens**|~500|5 claims @ ~100 tokens each
-|**Cost**|$0.003 per article|($0.25/M input + $1.25/M output)
-|**Temperature**|0.0|Deterministic
-|**Max Tokens**|1000|Generous buffer
-
-**Prompt Strategy:**
-* Extract 5 verifiable factual claims
-* Mark central vs. supporting claims
-* Canonicalize (normalize phrasing)
-* Deduplicate similar claims
-* Output structured JSON only
-
-=== 8.2 Stage 2: Claim Analysis (Sonnet, CACHED) ===
-
-|=Parameter|=Value|=Notes
-|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|High quality for verdicts
-|**Input Tokens**|~2K|Single claim + prompt + context
-|**Output Tokens**|~5K|2 scenarios × ~2.5K tokens
-|**Cost**|$0.081 per NEW claim|($3/M input + $15/M output)
-|**Temperature**|0.0|Deterministic (cache consistency)
-|**Max Tokens**|8000|Sufficient for 2 scenarios
-|**Cache Strategy**|Redis, 90-day TTL|Key: {{code}}claim:v1norm1:{language}:{sha256(canonical_claim)}{{/code}}
-
-**Prompt Strategy:**
-* Generate 2 scenario interpretations
-* Search for supporting AND undermining evidence (mandatory)
-* 6 evidence items per scenario maximum
-* Compute verdict with reasoning chain (3-4 bullets)
-* Output structured JSON only
-
-**Output Constraints (Cost Control):**
-* Scenarios: Max 2 per claim
-* Evidence: Max 6 per scenario
-* Evidence summary: Max 3 bullets
-* Reasoning chain: Max 4 bullets
-
-=== 8.3 Stage 3: Holistic Assessment (Sonnet) ===
-
-|=Parameter|=Value|=Notes
-|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|Context-aware analysis
-|**Input Tokens**|~5K|Article + claim verdicts
-|**Output Tokens**|~1K|Article verdict + fallacies
-|**Cost**|$0.030 per article|($3/M input + $15/M output)
-|**Temperature**|0.0|Deterministic
-|**Max Tokens**|2000|Sufficient for assessment
-
-**Prompt Strategy:**
-* Detect main thesis
-* Evaluate logical coherence (claim verdicts → thesis)
-* Identify fallacies (correlation-causation, cherry-picking, etc.)
-* Compute logic_quality_score
-* Explain article verdict reasoning (3-4 bullets)
-* Output structured JSON only
-
-=== 8.4 Cost Projections by Cache Hit Rate ===
-
-|=Cache Hit Rate|=Cost per Article|=10K Articles Cost|=100K Articles Cost
-|0% (cold start)|$0.438|$4,380|$43,800
-|20%|$0.357|$3,570|$35,700
-|40%|$0.276|$2,760|$27,600
-|**60%**|**$0.195**|**$1,950**|**$19,500**
-|**70%** (target)|**$0.155**|**$1,550**|**$15,500**
-|**80%**|**$0.114**|**$1,140**|**$11,400**
-|**90%**|**$0.073**|**$730**|**$7,300**
-|95%|$0.053|$530|$5,300
-
-**Break-Even Analysis:**
-* Monolithic (v0.3.1): $0.15 per article constant
-* 3-stage breaks even at **70% cache hit rate**
-* Expected after ~1,500 articles in same domain
-
----- 
-
-== 9. Implementation Notes ==
-
-=== 9.1 Recommended Tech Stack ===
-
-* **Framework:** Next.js 14+ with App Router (TypeScript)
-* **Cache:** Redis 7.0+ (managed: AWS ElastiCache, Redis Cloud, Upstash)
-* **Storage:** Filesystem JSON for jobs + S3/R2 for archival
-* **Queue:** BullMQ with Redis (for 3-stage pipeline orchestration)
-* **LLM Client:** Anthropic Python SDK or TypeScript SDK
-* **Cost Tracking:** PostgreSQL for user credit ledger
-* **Deployment:** Vercel (frontend + API) + Redis Cloud
-
-=== 9.2 3-Stage Pipeline Implementation ===
-
-**Job Queue Flow (Conceptual):**
-
-{{code language="typescript"}}
-// Stage 1: Extract Claims
-const stage1Job = await queue.add('stage1-extract-claims', {
-  jobId: 'job123',
-  articleUrl: 'https://example.com/article'
-});
-
-// On Stage 1 completion → enqueue Stage 2 jobs
-stage1Job.on('completed', async (result) => {
-  const { claims } = result;
-
-  // Stage 2: Analyze each claim (with cache check)
-  const stage2Jobs = await Promise.all(
-    claims.map(claim =>
-      queue.add('stage2-analyze-claim', {
-        jobId: 'job123',
-        claimId: claim.claim_id,
-        canonicalClaim: claim.canonical_claim,
-        checkCache: true
-      })
-    )
-  );
-
-  // On all Stage 2 completions → enqueue Stage 3
-  await Promise.all(stage2Jobs.map(j => j.waitUntilFinished()));
-
-  const claimVerdicts = await gatherStage2Results('job123');
-
-  await queue.add('stage3-holistic', {
-    jobId: 'job123',
-    articleUrl: 'https://example.com/article',
-    claimVerdicts: claimVerdicts
-  });
-});
-{{/code}}
-
-**Note:** This is a conceptual sketch. Actual implementation may use BullMQ Flow API or custom orchestration.
-
-**Cache Check Logic:**
-{{code language="typescript"}}
-async function analyzeClaimWithCache(claim: string): Promise<ClaimAnalysis> {
-  const canonicalClaim = normalizeClaim(claim);
-  const claimHash = sha256(canonicalClaim);
-  const cacheKey = `claim:v1:${claimHash}`;
-
-  // Check cache
-  const cached = await redis.get(cacheKey);
-  if (cached) {
-    await redis.incr(`claim:stats:hit_count:${claimHash}`);
-    return JSON.parse(cached);
-  }
-
-  // Cache miss - analyze with LLM
-  const analysis = await analyzeClaim_Stage2(canonicalClaim);
-
-  // Store in cache
-  await redis.set(cacheKey, JSON.stringify(analysis), 'EX', 7776000); // 90 days
-
-  return analysis;
-}
-{{/code}}
-
-=== 9.3 User Credit Management ===
-
-**PostgreSQL Schema:**
-{{code language="sql"}}
-CREATE TABLE user_credits (
-  user_id UUID PRIMARY KEY,
-  tier VARCHAR(20) DEFAULT 'free',
-  credit_limit DECIMAL(10,2) DEFAULT 10.00,
-  credit_used DECIMAL(10,2) DEFAULT 0.00,
-  reset_date TIMESTAMP,
-  cache_only_mode BOOLEAN DEFAULT false,
-  created_at TIMESTAMP DEFAULT NOW()
-);
-
-CREATE TABLE usage_log (
-  id SERIAL PRIMARY KEY,
-  user_id UUID REFERENCES user_credits(user_id),
-  job_id VARCHAR(50),
-  stage VARCHAR(20),
-  cost DECIMAL(10,4),
-  cache_hit BOOLEAN,
-  created_at TIMESTAMP DEFAULT NOW()
-);
-{{/code}}
-
-**Credit Deduction Logic:**
-{{code language="typescript"}}
-async function deductCredit(userId: string, cost: number): Promise<boolean> {
-  const user = await db.query('SELECT * FROM user_credits WHERE user_id = $1', [userId]);
-
-  const newUsed = user.credit_used + cost;
-
-  if (newUsed > user.credit_limit && user.tier === 'free') {
-    // Trigger cache-only mode
-    await db.query(
-      'UPDATE user_credits SET cache_only_mode = true WHERE user_id = $1',
-      [userId]
-    );
-    throw new Error('CREDIT_LIMIT_REACHED');
-  }
-
-  await db.query(
-    'UPDATE user_credits SET credit_used = $1 WHERE user_id = $2',
-    [newUsed, userId]
-  );
-
-  return true;
-}
-{{/code}}
-
-=== 9.4 Cache-Only Mode Implementation ===
-
-**Middleware:**
-{{code language="typescript"}}
-async function checkCacheOnlyMode(req, res, next) {
-  const user = await getUserCredit(req.userId);
-
-  if (user.cache_only_mode) {
-    // Allow only cache reads
-    if (req.body.options?.cache_preference !== 'allow_partial') {
-      return res.status(402).json({
-        error: 'credit_limit_reached',
-        message: 'Resubmit with cache_preference=allow_partial',
-        cache_only_mode: true
-      });
-    }
-
-    // Modify request to skip Stage 2 for uncached claims
-    req.cacheOnlyMode = true;
-  }
-
-  next();
-}
-{{/code}}
-
-=== 9.5 Estimated Timeline ===
-
-**POC1 with 3-Stage Architecture:**
-* Week 1: Stage 1 (Haiku extraction) + Redis setup
-* Week 2: Stage 2 (Sonnet analysis + caching)
-* Week 3: Stage 3 (Holistic assessment) + pipeline orchestration
-* Week 4: User credit system + cache-only mode
-* Week 5: Testing with 100 articles (measure cache hit rate)
-* Week 6: Optimization + bug fixes
-* **Total: 6-8 weeks**
-
-**Manual coding:** 12-16 weeks
-
----- 
-
-== 10. Testing Strategy ==
-
-=== 10.1 Cache Performance Testing ===
-
-**Test Scenarios:**
-
-**Scenario 1: Cold Start (0 cache)**
-* Analyze 100 diverse articles
-* Measure: Cost per article, cache growth rate
-* Expected: $0.35-0.40 avg, ~400 unique claims cached
-
-**Scenario 2: Warm Cache (Overlapping Domain)**
-* Analyze 100 articles on SAME topic (e.g., "2024 election")
-* Measure: Cache hit rate growth
-* Expected: Hit rate 20% → 60% by article 100
-
-**Scenario 3: Mature Cache (1,000 articles)**
-* Analyze next 100 articles (diverse topics)
-* Measure: Steady-state cache hit rate
-* Expected: 60-70% hit rate, $0.15-0.18 avg cost
-
-**Scenario 4: Cache-Only Mode**
-* Free user reaches $10 limit (67 articles at 70% hit rate)
-* Submit 10 more articles with {{code}}cache_preference=allow_partial{{/code}}
-* Measure: Coverage %, user satisfaction
-* Expected: 60-70% coverage, instant results
-
-=== 10.2 Success Metrics ===
-
-**Cache Performance:**
-* Week 1: 5-10% hit rate
-* Week 2: 15-25% hit rate
-* Week 3: 30-40% hit rate
-* Week 4: 45-55% hit rate
-* Target: ≥50% by 1,000 articles
-
-**Cost Targets:**
-* Articles 1-100: $0.35-0.40 avg ⚠️ (expected)
-* Articles 100-500: $0.25-0.30 avg
-* Articles 500-1,000: $0.18-0.22 avg
-* Articles 1,000+: $0.12-0.15 avg ✅
-
-**Quality Metrics (same as v0.3.1):**
-* Hallucination rate: <5%
-* Context-aware accuracy: ≥70%
-* False positive rate: <15%
-* Mandatory contradiction search: 100% compliance
-
-=== 10.3 Free Tier Economics Validation ===
-
-**Test with simulated 1,000 users:**
-* Each user: $10 credit
-* 70% cache hit rate
-* Avg 70 articles/user/month
-
-**Projected Costs:**
-* Total credits: 1,000 × $10 = $10,000
-* Actual LLM costs: ~$9,000 (cache savings)
-* Margin: 10%
-
-**Sustainability Check:**
-* If margin <5% → Reduce free tier limit
-* If margin >20% → Consider increasing free tier
-
----- 
-
-== 11. Cross-References ==
-
-This API specification implements requirements from:
-
-* **[[POC Requirements>>Test.FactHarbor.Specification.POC.Requirements]]**
-** FR-POC-1 through FR-POC-6 (3-stage architecture)
-** NFR-POC-1 through NFR-POC-3 (quality gates, caching)
-** NEW: FR-POC-7 (Claim-level caching)
-** NEW: FR-POC-8 (User credit system)
-** NEW: FR-POC-9 (Cache-only mode)
-
-* **[[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]**
-** Approach 1 implemented in Stage 3
-** Context-aware holistic assessment
-
-* **[[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]**
-** FR4 (Analysis Summary) - enhanced with caching
-** FR7 (Verdict Calculation) - cached per claim
-** NFR11 (Quality Gates) - enforced across stages
-** NEW: NFR19 (Cost Efficiency via Caching)
-** NEW: NFR20 (Free Tier Sustainability)
-
-* **[[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]]**
-** POC1 3-stage pipeline architecture
-** Redis cache layer
-** User credit system
-
-* **[[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]]**
-** Claim structure (cacheable unit)
-** Evidence structure
-** Scenario boundaries
-
----- 
-
-**End of Specification - FactHarbor POC1 API v0.4**
-
-**3-stage caching architecture with free tier cache-only mode. Ready for sustainable, scalable implementation!** 🚀
-
+* FactHarbor_POC1_API_and_Schemas_Spec_v0_4_1_PATCHED.md (45 KB standalone)
+* Export files (TEST/PRODUCTION) for xWiki import
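
The cache key formula that this revision retains in section 5.1.1 (normalize the claim, hash the canonical form with SHA-256, then prefix with the canonicalizer version and language) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the specification's implementation: `normalize_claim_sketch` and `claim_cache_key` are hypothetical names, the normalizer covers only a subset of the v1 steps, and it expands `%` before stripping punctuation, an ordering assumed here so that the documented example canonical form is reproduced.

```python
import hashlib
import re
import unicodedata

CANONICALIZER_VERSION = "v1norm1"  # namespace segment taken from the key schema


def normalize_claim_sketch(claim_text: str, language: str) -> str:
    """Hypothetical, simplified normalizer; NOT the spec's normalize_claim_v1."""
    # NFC normalization and lowercasing, as in steps 1 and 2 of the spec.
    text = unicodedata.normalize("NFC", claim_text).lower()
    # Assumption: expand '%' and abbreviations BEFORE punctuation is removed,
    # otherwise the '%' would already be gone when the replacement runs.
    text = text.replace("%", " percent")
    if language == "en":
        text = text.replace("covid-19", "covid")
    # Drop punctuation except hyphens, then collapse runs of whitespace.
    text = re.sub(r"[^\w\s-]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def claim_cache_key(claim_text: str, language: str) -> str:
    """Build a key of the form claim:{version}:{language}:{sha256 hex digest}."""
    canonical = normalize_claim_sketch(claim_text, language)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"claim:{CANONICALIZER_VERSION}:{language}:{digest}"


if __name__ == "__main__":
    print(claim_cache_key("COVID-19 vaccines are 95% effective", "en"))
```

Because the language code is part of the key, the same canonical text cached under `en` and under another language never collides, which matches the rationale given for the key schema.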