Wiki source code of POC1 API & Schemas Specification
Last modified by Robert Schaub on 2025/12/24 21:53
| author | version | line-number | content |
|---|---|---|---|
| |
1.1 | 1 | = POC1 API & Schemas Specification = |
| 2 | |||
| 3 | ---- | ||
| 4 | |||
| 5 | == Version History == | ||
| 6 | |||
| 7 | |=Version|=Date|=Changes | ||
| 8 | |0.4.1|2025-12-24|Applied 9 critical fixes: file format notice, verdict taxonomy, canonicalization algorithm, Stage 1 cost policy, BullMQ fix, language in cache key, historical claims TTL, idempotency, copyright policy | ||
| 9 | |0.4|2025-12-24|**BREAKING:** 3-stage pipeline with claim-level caching, user tier system, cache-only mode for free users, Redis cache architecture | ||
| 10 | |0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints | ||
| 11 | |0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details | ||
| 12 | |||
| 13 | ---- | ||
| 14 | |||
| 15 | == POC1 Codegen Contract (Canonical) == | ||
| 16 | |||
| 17 | {{info}} | ||
| 18 | This section is the **authoritative, code-generation-ready contract** for POC1. | ||
| 19 | If any other page conflicts with this section, **this section wins**. | ||
| 20 | {{/info}} | ||
| 21 | |||
| 22 | === Canonical outputs === | ||
| 23 | * **result.json**: schema-validated, machine-readable output | ||
| 24 | * **report.md**: deterministic template rendering from ``result.json`` (LLM must not free-write the final report) | ||
| 25 | |||
| 26 | === Locked enums === | ||
| 27 | **Scenario verdict** (``ScenarioVerdict.verdict_label``): | ||
| 28 | * ``Highly likely`` | ``Likely`` | ``Unclear`` | ``Unlikely`` | ``Highly unlikely`` | ``Unsubstantiated`` | ||
| 29 | |||
| 30 | **Claim verdict** (``ClaimVerdict.verdict_label``): | ||
| 31 | * ``Supported`` | ``Refuted`` | ``Inconclusive`` | ||
| 32 | |||
| 33 | **Mapping rule (summary):** | ||
| 34 | * Primary-interpretation scenario: | ||
| 35 | ** ``Highly likely`` / ``Likely`` ⇒ ``Supported`` | ||
| 36 | ** ``Highly unlikely`` / ``Unlikely`` ⇒ ``Refuted`` | ||
| 37 | ** ``Unclear`` / ``Unsubstantiated`` ⇒ ``Inconclusive`` | ||
| 38 | * If scenarios materially disagree (assumption-dependent outcomes) ⇒ ``Inconclusive`` (explain why) | ||
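The mapping rule above can be sketched as a small lookup. This is a minimal illustration, not the normative implementation: the function name `map_claim_verdict` and the `scenarios_disagree` flag are assumptions, and how "material disagreement" between scenarios is detected is left to the caller.

```python
# Locked scenario labels mapped to locked claim labels (taken from the lists above).
SCENARIO_TO_CLAIM = {
    "Highly likely": "Supported",
    "Likely": "Supported",
    "Highly unlikely": "Refuted",
    "Unlikely": "Refuted",
    "Unclear": "Inconclusive",
    "Unsubstantiated": "Inconclusive",
}


def map_claim_verdict(primary_label: str, scenarios_disagree: bool = False) -> str:
    """Derive ClaimVerdict.verdict_label from the primary-interpretation scenario.

    scenarios_disagree is a hypothetical caller-supplied flag: the specification
    does not define how material disagreement is detected.
    """
    if primary_label not in SCENARIO_TO_CLAIM:
        raise ValueError(f"Unknown scenario verdict label: {primary_label!r}")
    if scenarios_disagree:
        # Assumption-dependent outcomes always collapse to Inconclusive.
        return "Inconclusive"
    return SCENARIO_TO_CLAIM[primary_label]
```

Rejecting unknown labels early keeps values outside the locked enums from leaking into ``result.json``.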
| 39 | |||
| 40 | === Deterministic claim normalization (cache key) === | ||
| 41 | * Normalization version: ``v1norm1`` | ||
| 42 | * Cache namespace: ``claim:v1norm1:{language}:{sha256(canonical_claim_text)}`` | ||
| 43 | * Normative reference implementation is defined in section **5.1.1** (no ellipses; must match exactly). | ||
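For orientation only, the key layout above could be assembled as follows. The sketch assumes the claim text has already been canonicalized by the section 5.1.1 algorithm (not reproduced here), that the text is hashed as UTF-8, and that the digest is written as lowercase hex; none of these three points is stated in this section, and `claim_cache_key` is a hypothetical name.

```python
import hashlib

NORMALIZATION_VERSION = "v1norm1"


def claim_cache_key(canonical_claim_text: str, language: str) -> str:
    """Build the cache key for an already-canonicalized claim.

    Assumptions: UTF-8 encoding and a lowercase hex digest.
    """
    digest = hashlib.sha256(canonical_claim_text.encode("utf-8")).hexdigest()
    return f"claim:{NORMALIZATION_VERSION}:{language}:{digest}"
```

Because the language is part of the key, the same canonical text in two languages yields two distinct cache entries.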
| 44 | |||
| 45 | === Idempotency === | ||
| 46 | Clients SHOULD send: | ||
| 47 | * Header: ``Idempotency-Key: <client-generated-uuid>`` (preferred) | ||
| 48 | or | ||
| 49 | * Body: ``client.request_id`` | ||
| 50 | |||
| 51 | Server rules: | ||
| 52 | * Same key + same request body ⇒ return existing job (``200``) and include ``idempotent=true``. | ||
| 53 | * Same key + different request body ⇒ ``409`` ``VALIDATION_ERROR``. | ||
| 54 | |||
| 55 | Idempotency TTL: 24 hours. | ||
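A minimal in-memory sketch of the server rules above, under stated assumptions: "same request body" is decided by hashing a key-sorted JSON serialization, a first-time request is answered with ``202`` as in the OpenAPI contract, and the names `IdempotencyStore` and `resolve` are hypothetical. A real deployment would presumably keep these records in a shared store rather than process memory.

```python
import hashlib
import json
import time
from dataclasses import dataclass

IDEMPOTENCY_TTL_SECONDS = 24 * 60 * 60  # 24 hours, as specified above


@dataclass
class IdempotencyRecord:
    body_hash: str
    job_id: str
    expires_at: float


class IdempotencyStore:
    def __init__(self):
        self._records = {}

    def resolve(self, key, body, new_job_id, now=None):
        """Return (http_status, payload) for a create-job request."""
        now = time.time() if now is None else now
        # Assumption: bodies are compared via a canonical JSON hash.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        body_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()

        record = self._records.get(key)
        if record is not None and record.expires_at <= now:
            record = None  # expired keys behave like unseen keys

        if record is None:
            self._records[key] = IdempotencyRecord(
                body_hash, new_job_id, now + IDEMPOTENCY_TTL_SECONDS
            )
            return 202, {"job_id": new_job_id}
        if record.body_hash == body_hash:
            return 200, {"job_id": record.job_id, "idempotent": True}
        return 409, {"error": {"code": "VALIDATION_ERROR",
                               "message": "Idempotency key reused with a different body"}}
```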
| 56 | |||
| 57 | === Minimal OpenAPI 3.1 (authoritative for codegen) === | ||
{{code language="yaml"}}
openapi: 3.1.0
info:
  title: FactHarbor POC1 API
  version: 0.9.106
servers:
  - url: /
paths:
  /v1/analyze:
    post:
      summary: Create analysis job
      parameters:
        - in: header
          name: Authorization
          required: true
          schema: { type: string }
        - in: header
          name: Idempotency-Key
          required: false
          schema: { type: string }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/AnalyzeRequest'
      responses:
        '202':
          description: Accepted
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/JobCreated'
        '4XX':
          description: Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorEnvelope'
  /v1/jobs/{job_id}:
    get:
      summary: Get job status
      parameters:
        - in: path
          name: job_id
          required: true
          schema: { type: string }
        - in: header
          name: Authorization
          required: true
          schema: { type: string }
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Job'
        '404':
          description: Not Found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorEnvelope'
    delete:
      summary: Cancel job (best-effort) and delete artifacts
      parameters:
        - in: path
          name: job_id
          required: true
          schema: { type: string }
        - in: header
          name: Authorization
          required: true
          schema: { type: string }
      responses:
        '204': { description: No Content }
        '404':
          description: Not Found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorEnvelope'
  /v1/jobs/{job_id}/events:
    get:
      summary: Job progress via SSE (no token streaming)
      parameters:
        - in: path
          name: job_id
          required: true
          schema: { type: string }
        - in: header
          name: Authorization
          required: true
          schema: { type: string }
      responses:
        '200':
          description: text/event-stream
  /v1/jobs/{job_id}/result:
    get:
      summary: Get final JSON result
      parameters:
        - in: path
          name: job_id
          required: true
          schema: { type: string }
        - in: header
          name: Authorization
          required: true
          schema: { type: string }
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/AnalysisResult'
        '409':
          description: Not ready
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorEnvelope'
  /v1/jobs/{job_id}/report:
    get:
      summary: Download report (markdown)
      parameters:
        - in: path
          name: job_id
          required: true
          schema: { type: string }
        - in: header
          name: Authorization
          required: true
          schema: { type: string }
      responses:
        '200':
          description: text/markdown
        '409':
          description: Not ready
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorEnvelope'
  /v1/health:
    get:
      summary: Health check
      responses:
        '200':
          description: OK
components:
  schemas:
    AnalyzeRequest:
      type: object
      properties:
        input_url: { type: ['string', 'null'] }
        input_text: { type: ['string', 'null'] }
        options:
          type: object
          properties:
            max_claims: { type: integer, minimum: 1, maximum: 50, default: 5 }
            cache_preference:
              type: string
              enum: [prefer_cache, allow_partial, cache_only, skip_cache]
              default: prefer_cache
            browsing:
              type: string
              enum: ['on', 'off']
              default: 'on'
            output_report: { type: boolean, default: true }
        client:
          type: object
          properties:
            request_id: { type: string }
    JobCreated:
      type: object
      required: [job_id, status, created_at, links]
      properties:
        job_id: { type: string }
        status: { type: string }
        created_at: { type: string }
        links:
          type: object
          properties:
            self: { type: string }
            events: { type: string }
            result: { type: string }
            report: { type: string }
    Job:
      type: object
      required: [job_id, status, created_at, updated_at]
      properties:
        job_id: { type: string }
        status:
          type: string
          enum: [QUEUED, RUNNING, SUCCEEDED, FAILED, CANCELED]
        created_at: { type: string }
        updated_at: { type: string }
    AnalysisResult:
      type: object
      properties:
        job_id: { type: string }
    ErrorEnvelope:
      type: object
      properties:
        error:
          type: object
          properties:
            code: { type: string }
            message: { type: string }
            details: { type: object }
{{/code}}
| 270 | |||
| 271 | ---- | ||
| 272 | |||
| 273 | == 1. Core Objective (POC1) == | ||
| 274 | |||
| 275 | The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability. | ||
| 276 | |||
| 277 | The system must prove that AI can identify an article's **Main Thesis** and determine whether the article's claims logically support that thesis without committing fallacies. | ||
| 278 | |||
| 279 | === Success Criteria: === | ||
| 280 | |||
| 281 | * Test with 30 diverse articles | ||
| 282 | * Target: ≥70% accuracy detecting misleading articles | ||
| 283 | * Cost: <$0.25 per NEW analysis (uncached) | ||
| 284 | * Cost: $0.00 for cached claim reuse | ||
| 285 | * Cache hit rate: ≥50% after 1,000 articles | ||
| 286 | * Processing time: <2 minutes (standard depth) | ||
| 287 | |||
| 288 | === Economic Model: === | ||
| 289 | |||
| 290 | * **Free tier:** $10 credit per month (~~40-140 articles depending on cache hits) | ||
| 291 | * **After limit:** Cache-only mode (instant, free access to cached claims) | ||
| 292 | * **Paid tier:** Unlimited new analyses | ||
| 293 | |||
| 294 | ---- | ||
| 295 | |||
| 296 | == 2. Architecture Overview == | ||
| 297 | |||
| 298 | === 2.1 3-Stage Pipeline with Caching === | ||
| 299 | |||
| 300 | FactHarbor POC1 uses a **3-stage architecture** designed for claim-level caching and cost efficiency: | ||
| 301 | |||
| 302 | {{mermaid}} | ||
| 303 | graph TD | ||
| 304 | A[Article Input] --> B[Stage 1: Extract Claims] | ||
| 305 | B --> C{For Each Claim} | ||
| 306 | C --> D[Check Cache] | ||
| 307 | D -->|Cache HIT| E[Return Cached Verdict] | ||
| 308 | D -->|Cache MISS| F[Stage 2: Analyze Claim] | ||
| 309 | F --> G[Store in Cache] | ||
| 310 | G --> E | ||
| 311 | E --> H[Stage 3: Holistic Assessment] | ||
| 312 | H --> I[Final Report] | ||
| 313 | {{/mermaid}} | ||
| 314 | |||
| 315 | ==== Stage 1: Claim Extraction (FAST model, no cache) ==== | ||
| 316 | |||
| 317 | * **Input:** Article text | ||
| 318 | * **Output:** 5 canonical claims (normalized, deduplicated) | ||
| 319 | * **Model:** Provider-default FAST model (configurable via the LLM abstraction layer) | ||
| 320 | * **Cost:** $0.003 per article | ||
| 321 | * **Cache strategy:** No caching (article-specific) | ||
| 322 | |||
| 323 | ==== Stage 2: Claim Analysis (REASONING model, CACHED) ==== | ||
| 324 | |||
| 325 | * **Input:** Single canonical claim | ||
| 326 | * **Output:** Scenarios + Evidence + Verdicts | ||
| 327 | * **Model:** Provider-default REASONING model (configurable via the LLM abstraction layer) | ||
| 328 | * **Cost:** $0.081 per NEW claim | ||
| 329 | * **Cache strategy:** Redis, 90-day TTL | ||
| 330 | * **Cache key:** ``claim:v1norm1:{language}:{sha256(canonical_claim_text)}`` | ||
| 331 | |||
| 332 | ==== Stage 3: Holistic Assessment (REASONING model, no cache) ==== | ||
| 333 | |||
| 334 | * **Input:** Article + Claim verdicts (from cache or Stage 2) | ||
| 335 | * **Output:** Article verdict + Fallacies + Logic quality | ||
| 336 | * **Model:** Provider-default REASONING model (configurable via the LLM abstraction layer) | ||
| 337 | * **Cost:** $0.030 per article | ||
| 338 | * **Cache strategy:** No caching (article-specific) | ||
| 339 | |||
| 340 | |||
| 341 | |||
| 342 | **Note:** Stage 3 implements **Approach 1 (Single-Pass Holistic Analysis)** from the [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]. While claim analysis (Stage 2) is cached for efficiency, the holistic assessment maintains the integrated evaluation philosophy of Approach 1. | ||
| 343 | |||
| 344 | === Total Cost Formula: === | ||
| 345 | |||
| 346 | {{{Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic) | ||
| 347 | |||
| 348 | Examples: | ||
| 349 | - 0 new claims (100% cache hit): $0.033 | ||
| 350 | - 1 new claim (80% cache hit): $0.114 | ||
| 351 | - 3 new claims (40% cache hit): $0.276 | ||
| 352 | - 5 new claims (0% cache hit): $0.438 | ||
| 353 | }}} | ||
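The formula above is simple enough to check mechanically. The following sketch uses the per-stage prices quoted in this section; the function name `article_cost` and the rounding to three decimals are illustrative choices, not part of the specification.

```python
EXTRACTION_COST = 0.003      # Stage 1, per article
CLAIM_ANALYSIS_COST = 0.081  # Stage 2, per NEW (uncached) claim
HOLISTIC_COST = 0.030        # Stage 3, per article


def article_cost(n_new_claims: int) -> float:
    """Total LLM cost in USD for one article with n_new_claims cache misses."""
    if n_new_claims < 0:
        raise ValueError("n_new_claims must be non-negative")
    total = EXTRACTION_COST + n_new_claims * CLAIM_ANALYSIS_COST + HOLISTIC_COST
    # Round to tenths of a cent to hide floating-point noise.
    return round(total, 3)


if __name__ == "__main__":
    for n in (0, 1, 3, 5):
        print(n, article_cost(n))
```

Only the Stage 2 term varies with the cache hit rate; the $0.033 floor is paid for every article.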
| 354 | |||
| 355 | ---- | ||
| 356 | |||
| 357 | === 2.2 User Tier System === | ||
| 358 | |||
| 359 | |=Tier|=Monthly Credit|=After Limit|=Cache Access|=Analytics | ||
| 360 | |**Free**|$10|Cache-only mode|✅ Full|Basic | ||
| 361 | |**Pro** (future)|$50|Continues|✅ Full|Advanced | ||
| 362 | |**Enterprise** (future)|Custom|Continues|✅ Full + Priority|Full | ||
| 363 | |||
| 364 | **Free Tier Economics:** | ||
| 365 | |||
| 366 | * $10 credit = 40-140 articles analyzed (depending on cache hit rate) | ||
| 367 | * Average 70 articles/month at 70% cache hit rate | ||
| 368 | * After limit: Cache-only mode | ||
| 369 | |||
| 370 | ---- | ||
| 371 | |||
| 372 | === 2.3 Cache-Only Mode (Free Tier Feature) === | ||
| 373 | |||
| 374 | When free users reach their $10 monthly limit, they enter **Cache-Only Mode**, which is described under "What Cache-Only Mode Provides" below. | ||
| 375 | |||
| 376 | |||
| 377 | |||
| 378 | ==== Stage 3: Holistic Assessment - Complete Specification ==== | ||
| 379 | |||
| 380 | ===== 3.3.1 Overview ===== | ||
| 381 | |||
| 382 | **Purpose:** Synthesize individual claim analyses into an overall article assessment, identifying logical fallacies, reasoning quality, and publication readiness. | ||
| 383 | |||
| 384 | **Approach:** **Single-Pass Holistic Analysis** (Approach 1 from Comparison Matrix) | ||
| 385 | |||
| 386 | **Why This Approach for POC1:** | ||
| 387 | * ✅ **1 API call** (vs 2 for Two-Pass or Judge) | ||
| 388 | * ✅ **Low cost** ($0.030 per article) | ||
| 389 | * ✅ **Fast** (4-6 seconds) | ||
| 390 | * ✅ **Low complexity** (simple implementation) | ||
| 391 | * ⚠️ **Medium reliability** (acceptable for POC1, will improve in POC2/Production) | ||
| 392 | |||
| 393 | **Alternative Approaches Considered:** | ||
| 394 | |||
| 395 | |= Approach |= API Calls |= Cost |= Speed |= Complexity |= Reliability |= Best For | ||
| 396 | | **1. Single-Pass** ⭐ | 1 | 💰 Low | ⚡ Fast | 🟢 Low | ⚠️ Medium | **POC1** | ||
| 397 | | 2. Two-Pass | 2 | 💰💰 Med | 🐢 Slow | 🟡 Med | ✅ High | POC2/Prod | ||
| 398 | | 3. Structured | 1 | 💰 Low | ⚡ Fast | 🟡 Med | ✅ High | POC1 (alternative) | ||
| 399 | | 4. Weighted | 1 | 💰 Low | ⚡ Fast | 🟢 Low | ⚠️ Medium | POC1 (alternative) | ||
| 400 | | 5. Heuristics | 1 | 💰 Lowest | ⚡⚡ Fastest | 🟡 Med | ⚠️ Medium | Any | ||
| 401 | | 6. Hybrid | 1 | 💰 Low | ⚡ Fast | 🔴 Med-High | ✅ High | POC2 | ||
| 402 | | 7. Judge | 2 | 💰💰 Med | 🐢 Slow | 🟡 Med | ✅ High | Production | ||
| 403 | |||
| 404 | **POC1 Choice:** Approach 1 (Single-Pass) for speed and simplicity. Will upgrade to Approach 2 (Two-Pass) or 6 (Hybrid) in POC2 for higher reliability. | ||
| 405 | |||
| 406 | ===== 3.3.2 What Stage 3 Evaluates ===== | ||
| 407 | |||
| 408 | Stage 3 performs **integrated holistic analysis** considering: | ||
| 409 | |||
| 410 | **1. Claim-Level Aggregation:** | ||
| 411 | * Verdict distribution (how many TRUE vs FALSE vs DISPUTED) | ||
| 412 | * Average confidence across all claims | ||
| 413 | * Claim interdependencies (do claims support/contradict each other?) | ||
| 414 | * Critical claim identification (which claims are most important?) | ||
| 415 | |||
| 416 | **2. Contextual Factors:** | ||
| 417 | * **Source credibility**: Is the article from a reputable publisher? | ||
| 418 | * **Author expertise**: Does the author have relevant credentials? | ||
| 419 | * **Publication date**: Is information current or outdated? | ||
| 420 | * **Claim coherence**: Do claims form a logical narrative? | ||
| 421 | * **Missing context**: Are important caveats or qualifications missing? | ||
| 422 | |||
| 423 | **3. Logical Fallacies:** | ||
| 424 | * **Cherry-picking**: Selective evidence presentation | ||
| 425 | * **False equivalence**: Treating unequal things as equal | ||
| 426 | * **Straw man**: Misrepresenting opposing arguments | ||
| 427 | * **Ad hominem**: Attacking person instead of argument | ||
| 428 | * **Slippery slope**: Assuming extreme consequences without justification | ||
| 429 | * **Circular reasoning**: Conclusion assumes premise | ||
| 430 | * **False dichotomy**: Presenting only two options when more exist | ||
| 431 | |||
| 432 | **4. Reasoning Quality:** | ||
| 433 | * **Evidence strength**: Quality and quantity of supporting evidence | ||
| 434 | * **Logical coherence**: Arguments follow logically | ||
| 435 | * **Transparency**: Assumptions and limitations acknowledged | ||
| 436 | * **Nuance**: Complexity and uncertainty appropriately addressed | ||
| 437 | |||
| 438 | **5. Publication Readiness:** | ||
| 439 | * **Risk tier assignment**: A (high risk), B (medium), or C (low risk) | ||
| 440 | * **Publication mode**: DRAFT_ONLY, AI_GENERATED, or HUMAN_REVIEWED | ||
| 441 | * **Required disclaimers**: What warnings should accompany this content? | ||
| 442 | |||
| 443 | ===== 3.3.3 Implementation: Single-Pass Approach ===== | ||
| 444 | |||
| 445 | **Input:** | ||
| 446 | * Original article text (full content) | ||
| 447 | * Stage 2 claim analyses (array of ClaimAnalysis objects) | ||
| 448 | * Article metadata (URL, title, author, date, source) | ||
| 449 | |||
| 450 | **Processing:** | ||
| 451 | |||
{{code language="python"}}
# Pseudo-code for Stage 3 (Single-Pass).
# llm_client, format_claim_analyses and validate_article_assessment_schema
# are assumed to be provided elsewhere in the codebase.
import json


def stage3_holistic_assessment(article, claim_analyses, metadata):
    """
    Single-pass holistic assessment using Provider-default REASONING model.

    Approach 1: One comprehensive prompt that asks the LLM to:
    1. Review all claim verdicts
    2. Identify patterns and dependencies
    3. Detect logical fallacies
    4. Assess reasoning quality
    5. Determine credibility score and risk tier
    6. Generate publication recommendations
    """

    # Construct comprehensive prompt
    prompt = f"""
You are analyzing an article for factual accuracy and logical reasoning.

ARTICLE METADATA:
- Title: {metadata['title']}
- Source: {metadata['source']}
- Date: {metadata['date']}
- Author: {metadata['author']}

ARTICLE TEXT:
{article}

INDIVIDUAL CLAIM ANALYSES:
{format_claim_analyses(claim_analyses)}

YOUR TASK:
Perform a holistic assessment considering:

1. CLAIM AGGREGATION:
   - Review the verdict for each claim
   - Identify any interdependencies between claims
   - Determine which claims are most critical to the article's thesis

2. CONTEXTUAL EVALUATION:
   - Assess source credibility
   - Evaluate author expertise
   - Consider publication timeliness
   - Identify missing context or important caveats

3. LOGICAL FALLACIES:
   - Identify any logical fallacies present
   - For each fallacy, provide:
     * Type of fallacy
     * Where it occurs in the article
     * Why it's problematic
     * Severity (minor/moderate/severe)

4. REASONING QUALITY:
   - Evaluate evidence strength
   - Assess logical coherence
   - Check for transparency in assumptions
   - Evaluate handling of nuance and uncertainty

5. CREDIBILITY SCORING:
   - Calculate overall credibility score (0.0-1.0)
   - Assign risk tier:
     * A (high risk): ≤0.5 credibility OR severe fallacies
     * B (medium risk): >0.5 and ≤0.8 credibility OR moderate issues
     * C (low risk): >0.8 credibility AND no significant issues

6. PUBLICATION RECOMMENDATIONS:
   - Determine publication mode:
     * DRAFT_ONLY: Tier A, multiple severe issues
     * AI_GENERATED: Tier B/C, acceptable quality with disclaimers
     * HUMAN_REVIEWED: Complex or borderline cases
   - List required disclaimers
   - Explain decision rationale

OUTPUT FORMAT:
Return a JSON object matching the ArticleAssessment schema.
"""

    # Call LLM (in practice the model is resolved via the LLM abstraction layer)
    response = llm_client.complete(
        model="claude-sonnet-4-5-20250929",
        prompt=prompt,
        max_tokens=4000,
        response_format="json"
    )

    # Parse and validate response
    assessment = json.loads(response.content)
    validate_article_assessment_schema(assessment)

    return assessment
{{/code}}
| 545 | |||
| 546 | **Prompt Engineering Notes:** | ||
| 547 | |||
| 548 | 1. **Structured Instructions**: Break down task into 6 clear sections | ||
| 549 | 2. **Context-Rich**: Provide article + all claim analyses + metadata | ||
| 550 | 3. **Explicit Criteria**: Define credibility scoring and risk tiers precisely | ||
| 551 | 4. **JSON Schema**: Request structured output matching ArticleAssessment schema | ||
| 552 | 5. **Examples** (in production): Include 2-3 example assessments for consistency | ||
| 553 | |||
| 554 | ===== 3.3.4 Credibility Scoring Algorithm ===== | ||
| 555 | |||
| 556 | **Base Score Calculation:** | ||
| 557 | |||
{{code language="python"}}
def calculate_credibility_score(claim_analyses, fallacies, contextual_factors):
    """
    Calculate overall credibility score (0.0-1.0).

    This is a GUIDELINE for the LLM, not strict code.
    The LLM has flexibility to adjust based on context.
    """
    # Guard against division by zero when no claims were extracted
    if not claim_analyses:
        raise ValueError("claim_analyses must contain at least one claim")

    # 1. Claim Verdict Score (60% weight)
    verdict_weights = {
        "TRUE": 1.0,
        "PARTIALLY_TRUE": 0.7,
        "DISPUTED": 0.5,
        "UNSUPPORTED": 0.3,
        "FALSE": 0.0,
        "UNVERIFIABLE": 0.4
    }

    claim_scores = [
        verdict_weights[c.verdict.label] * c.verdict.confidence
        for c in claim_analyses
    ]
    avg_claim_score = sum(claim_scores) / len(claim_scores)
    claim_component = avg_claim_score * 0.6

    # 2. Fallacy Penalty (20% weight)
    fallacy_penalties = {
        "minor": -0.05,
        "moderate": -0.15,
        "severe": -0.30
    }

    fallacy_score = 1.0
    for fallacy in fallacies:
        fallacy_score += fallacy_penalties[fallacy.severity]

    fallacy_score = max(0.0, min(1.0, fallacy_score))
    fallacy_component = fallacy_score * 0.2

    # 3. Contextual Factors (20% weight)
    context_adjustments = {
        "source_credibility": {"positive": +0.1, "neutral": 0, "negative": -0.1},
        "author_expertise": {"positive": +0.1, "neutral": 0, "negative": -0.1},
        "timeliness": {"positive": +0.05, "neutral": 0, "negative": -0.05},
        "transparency": {"positive": +0.05, "neutral": 0, "negative": -0.05}
    }

    context_score = 1.0
    for factor in contextual_factors:
        # Unknown factors or impacts contribute no adjustment
        adjustment = context_adjustments.get(factor.factor, {}).get(factor.impact, 0)
        context_score += adjustment

    context_score = max(0.0, min(1.0, context_score))
    context_component = context_score * 0.2

    # 4. Combine components
    final_score = claim_component + fallacy_component + context_component

    # 5. Apply confidence modifier
    avg_confidence = sum(c.verdict.confidence for c in claim_analyses) / len(claim_analyses)
    final_score = final_score * (0.8 + 0.2 * avg_confidence)

    return max(0.0, min(1.0, final_score))
{{/code}}
| 623 | |||
| 624 | **Note:** This algorithm is a **guideline** provided to the LLM in the system prompt. The LLM has flexibility to adjust based on specific article context, but should generally follow this structure for consistency. | ||
| 625 | |||
| 626 | ===== 3.3.5 Risk Tier Assignment ===== | ||
| 627 | |||
| 628 | **Automatic Risk Tier Rules:** | ||
| 629 | |||
| 630 | {{code}} | ||
| 631 | Risk Tier A (High Risk - Requires Review): | ||
| 632 | - Credibility score ≤ 0.5, OR | ||
| 633 | - Any severe fallacies detected, OR | ||
| 634 | - Multiple (3+) moderate fallacies, OR | ||
| 635 | - 50%+ of claims are FALSE or UNSUPPORTED | ||
| 636 | |||
| 637 | Risk Tier B (Medium Risk - May Publish with Disclaimers): | ||
| 638 | - Credibility score > 0.5 and ≤ 0.8, OR | ||
| 639 | - 1-2 moderate fallacies, OR | ||
| 640 | - 20-49% of claims are DISPUTED or PARTIALLY_TRUE | ||
| 641 | |||
| 642 | Risk Tier C (Low Risk - Safe to Publish): | ||
| 643 | - Credibility score > 0.8, AND | ||
| 644 | - No severe or moderate fallacies, AND | ||
| 645 | - <20% disputed/problematic claims, AND | ||
| 646 | - No critical missing context | ||
| 647 | {{/code}} | ||
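One way to read the rules above as code is sketched here. Several points are assumptions rather than specification: Tier A is evaluated first, Tier C requires all of its conditions, and anything else falls through to Tier B. The parameter names are hypothetical, and `disputed_share` is taken to mean the fraction of claims labelled DISPUTED or PARTIALLY_TRUE.

```python
def assign_risk_tier(
    credibility_score: float,
    severe_fallacies: int,
    moderate_fallacies: int,
    false_or_unsupported_share: float,
    disputed_share: float,
    critical_missing_context: bool = False,
) -> str:
    """Return "A", "B" or "C" following the rule table above.

    Shares are fractions in [0, 1]. Evaluation order A -> C -> B is an
    assumption; the rule table does not state a precedence.
    """
    # Tier A: any single trigger is sufficient.
    if (
        credibility_score <= 0.5
        or severe_fallacies > 0
        or moderate_fallacies >= 3
        or false_or_unsupported_share >= 0.5
    ):
        return "A"
    # Tier C: every condition must hold.
    if (
        credibility_score > 0.8
        and moderate_fallacies == 0
        and disputed_share < 0.2
        and not critical_missing_context
    ):
        return "C"
    # Everything in between is treated as Tier B.
    return "B"
```

Treating Tier B as the fall-through case avoids gaps, for example a high-scoring article that is missing critical context.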
| 648 | |||
| 649 | ===== 3.3.6 Output: ArticleAssessment Schema ===== | ||
| 650 | |||
| 651 | (See Stage 3 Output Schema section above for complete JSON schema) | ||
| 652 | |||
| 653 | ===== 3.3.7 Performance Metrics ===== | ||
| 654 | |||
| 655 | **POC1 Targets:** | ||
| 656 | * **Processing time**: 4-6 seconds per article | ||
| 657 | * **Cost**: $0.030 per article (Sonnet 4.5 tokens) | ||
| 658 | * **Quality**: 70-80% agreement with human reviewers (acceptable for POC) | ||
| 659 | * **API calls**: 1 per article | ||
| 660 | |||
| 661 | **Future Improvements (POC2/Production):** | ||
| 662 | * Upgrade to Two-Pass (Approach 2): +15% accuracy, +$0.020 cost | ||
| 663 | * Add human review sampling: 10% of Tier B articles | ||
| 664 | * Implement Judge approach (Approach 7) for Tier A: Highest quality | ||
| 665 | |||
| 666 | ===== 3.3.8 Example Stage 3 Execution ===== | ||
| 667 | |||
| 668 | **Input:** | ||
| 669 | * Article: "Biden won the 2020 election" | ||
| 670 | * Claim analyses: [{claim: "Biden won", verdict: "TRUE", confidence: 0.95}] | ||
| 671 | |||
| 672 | **Stage 3 Processing:** | ||
| 673 | 1. Analyzes single claim with high confidence | ||
| 674 | 2. Checks for contextual factors (source credibility) | ||
| 675 | 3. Searches for logical fallacies (none found) | ||
| 676 | 4. Calculates credibility: (0.6 * 0.95 + 0.2 * 1.0 + 0.2 * 1.0) * (0.8 + 0.2 * 0.95) = 0.97 * 0.99 ≈ 0.96 | ||
| 677 | 5. Assigns risk tier: C (low risk) | ||
| 678 | 6. Recommends: AI_GENERATED publication mode | ||
| 679 | |||
| 680 | **Output:** | ||
| 681 | ```json | ||
| 682 | { | ||
| 683 | "article_id": "a1", | ||
| 684 | "overall_assessment": { | ||
| 685 | "credibility_score": 0.96, | ||
| 686 | "risk_tier": "C", | ||
| 687 | "summary": "Article makes single verifiable claim with strong evidence support", | ||
| 688 | "confidence": 0.95 | ||
| 689 | }, | ||
| 690 | "claim_aggregation": { | ||
| 691 | "total_claims": 1, | ||
| 692 | "verdict_distribution": {"TRUE": 1}, | ||
| 693 | "avg_confidence": 0.95 | ||
| 694 | }, | ||
| 695 | "contextual_factors": [ | ||
| 696 | {"factor": "source_credibility", "impact": "positive", "description": "Reputable news source"} | ||
| 697 | ], | ||
| 698 | "recommendations": { | ||
| 699 | "publication_mode": "AI_GENERATED", | ||
| 700 | "requires_review": false, | ||
| 701 | "suggested_disclaimers": [] | ||
| 702 | } | ||
| 703 | } | ||
| 704 | ``` | ||
| 705 | |||
| 706 | ==== What Cache-Only Mode Provides: ==== | ||
| 707 | |||
| 708 | ✅ **Claim Extraction (Platform-Funded):** | ||
| 709 | |||
| 710 | * Stage 1 extraction runs at $0.003 per article | ||
| 711 | * **Cost: Absorbed by platform** (not charged to user credit) | ||
| 712 | * Rationale: Extraction is necessary to check cache, and cost is negligible | ||
| 713 | * Rate limit: Max 50 extractions/day in cache-only mode (prevents abuse) | ||
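The daily cap above could be enforced with a per-user counter. This is a minimal in-memory sketch under stated assumptions: the class name `DailyExtractionLimiter` is hypothetical, the caller decides which calendar day applies (the time zone of the daily reset is not specified here), and a production system would presumably keep the counters in a shared store.

```python
from collections import defaultdict
from datetime import date

MAX_EXTRACTIONS_PER_DAY = 50  # cache-only mode limit stated above


class DailyExtractionLimiter:
    def __init__(self, limit: int = MAX_EXTRACTIONS_PER_DAY):
        self._limit = limit
        self._counts = defaultdict(int)  # (user_id, day) -> extractions used

    def try_acquire(self, user_id: str, today: date) -> bool:
        """Consume one extraction slot; return False once the cap is reached."""
        key = (user_id, today)
        if self._counts[key] >= self._limit:
            return False
        self._counts[key] += 1
        return True
```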
| 714 | |||
| 715 | ✅ **Instant Access to Cached Claims:** | ||
| 716 | |||
| 717 | * Any claim that exists in cache → Full verdict returned | ||
| 718 | * Cost: $0 (no LLM calls) | ||
| 719 | * Response time: <100ms | ||
| 720 | |||
| 721 | ✅ **Partial Article Analysis:** | ||
| 722 | |||
| 723 | * Check each claim against cache | ||
| 724 | * Return verdicts for ALL cached claims | ||
| 725 | * For uncached claims: Return "status": "cache_miss" | ||
| 726 | |||
| 727 | ✅ **Cache Coverage Report:** | ||
| 728 | |||
| 729 | * "3 of 5 claims available in cache (60% coverage)" | ||
| 730 | * Links to cached analyses | ||
| 731 | * Estimated cost to complete: $0.162 (2 new claims) | ||
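The coverage report can be derived from the extracted claim IDs and the set of cache hits. In this sketch the `claims_*` and `coverage_percent` field names follow the response example in this section, while `estimated_cost_to_complete` and the function name are hypothetical; the per-claim price is the Stage 2 figure quoted earlier.

```python
CLAIM_ANALYSIS_COST = 0.081  # Stage 2 price per uncached claim


def cache_coverage_report(claim_ids, cached_ids):
    """Summarize how many extracted claims are already available in the cache."""
    cached = [c for c in claim_ids if c in cached_ids]
    missing = [c for c in claim_ids if c not in cached_ids]
    total = len(claim_ids)
    return {
        "claims_total": total,
        "claims_cached": len(cached),
        "claims_missing": len(missing),
        "coverage_percent": round(100 * len(cached) / total) if total else 0,
        # Hypothetical field: cost of analyzing the missing claims.
        "estimated_cost_to_complete": round(len(missing) * CLAIM_ANALYSIS_COST, 3),
    }
```

For the "3 of 5 claims" case above, this yields 60% coverage and an estimated $0.162 to analyze the two missing claims.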
| 732 | |||
| 733 | ❌ **Not Available in Cache-Only Mode:** | ||
| 734 | |||
| 735 | * New claim analysis (Stage 2 LLM calls blocked) | ||
| 736 | * Full holistic assessment (Stage 3 blocked if any claims missing) | ||
| 737 | |||
| 738 | ==== User Experience Example: ==== | ||
| 739 | |||
| 740 | {{{{ | ||
| 741 | "status": "cache_only_mode", | ||
| 742 | "message": "Monthly credit limit reached. Showing cached results only.", | ||
| 743 | "cache_coverage": { | ||
| 744 | "claims_total": 5, | ||
| 745 | "claims_cached": 3, | ||
| 746 | "claims_missing": 2, | ||
| 747 | "coverage_percent": 60 | ||
| 748 | }, | ||
| 749 | "cached_claims": [ | ||
| 750 | {"claim_id": "C1", "verdict": "Likely", "confidence": 0.82}, | ||
| 751 | {"claim_id": "C2", "verdict": "Highly likely", "confidence": 0.91}, | ||
| 752 | {"claim_id": "C4", "verdict": "Unclear", "confidence": 0.55} | ||
| 753 | ], | ||
| 754 | "missing_claims": [ | ||
| 755 | {"claim_id": "C3", "claim_text": "...", "estimated_cost": "$0.081"}, | ||
| 756 | {"claim_id": "C5", "claim_text": "...", "estimated_cost": "$0.081"} | ||
| 757 | ], | ||
| 758 | "upgrade_options": { | ||
| 759 | "top_up": "$5 for 20-70 more articles", | ||
| 760 | "pro_tier": "$50/month unlimited" | ||
| 761 | } | ||
| 762 | } | ||
| 763 | }}} | ||
| 764 | |||
| 765 | **Design Rationale:** | ||
| 766 | |||
| 767 | * Free users still get value (cached claims often answer their question) | ||
| 768 | * Demonstrates FactHarbor's value (partial results encourage upgrade) | ||
| 769 | * Sustainable for platform (no additional cost) | ||
| 770 | * Fair to all users (everyone contributes to cache) | ||
| 771 | |||
| 772 | ---- | ||
| 773 | |||
| 774 | |||
| 775 | |||
| 776 | == 6. LLM Abstraction Layer == | ||
| 777 | |||
| 778 | === 6.1 Design Principle === | ||
| 779 | |||
| 780 | **FactHarbor uses a provider-agnostic LLM abstraction** to avoid vendor lock-in and to enable: | ||
| 781 | |||
| 782 | * **Provider switching:** Change LLM providers without code changes | ||
| 783 | * **Cost optimization:** Use different providers for different stages | ||
| 784 | * **Resilience:** Automatic fallback if primary provider fails | ||
| 785 | * **Cross-checking:** Compare outputs from multiple providers | ||
| 786 | * **A/B testing:** Test new models without deployment changes | ||
| 787 | |||
| 788 | **Implementation:** All LLM calls go through an abstraction layer that routes to configured providers. | ||
| 789 | |||
| 790 | ---- | ||
| 791 | |||
| 792 | === 6.2 LLM Provider Interface === | ||
| 793 | |||
| 794 | **Abstract Interface:** | ||
| 795 | |||
| 796 | {{{ | ||
| 797 | interface LLMProvider { | ||
| 798 | // Core methods | ||
| 799 | complete(prompt: string, options: CompletionOptions): Promise<CompletionResponse> | ||
| 800 | stream(prompt: string, options: CompletionOptions): AsyncIterator<StreamChunk> | ||
| 801 | |||
| 802 | // Provider metadata | ||
| 803 | getName(): string | ||
| 804 | getMaxTokens(): number | ||
| 805 | getCostPer1kTokens(): { input: number, output: number } | ||
| 806 | |||
| 807 | // Health check | ||
| 808 | isAvailable(): Promise<boolean> | ||
| 809 | } | ||
| 810 | |||
| 811 | interface CompletionOptions { | ||
| 812 | model?: string | ||
| 813 | maxTokens?: number | ||
| 814 | temperature?: number | ||
| 815 | stopSequences?: string[] | ||
| 816 | systemPrompt?: string | ||
| 817 | } | ||
| 818 | }}} | ||
| 819 | |||
| 820 | ---- | ||
| 821 | |||
| 822 | === 6.3 Supported Providers (POC1) === | ||
| 823 | |||
| 824 | **Primary Provider (Default):** | ||
| 825 | |||
| 826 | * **Anthropic Claude API** | ||
| 827 | ** Models (examples; not normative): Provider-default FAST model, Provider-default REASONING model, Provider-default HEAVY model (optional) | ||
| 828 | ** Used by default in POC1 | ||
| 829 | ** Best quality for holistic analysis | ||
| 830 | |||
| 831 | **Secondary Providers (Future):** | ||
| 832 | |||
| 833 | * **OpenAI API** | ||
| 834 | ** Models: GPT-4o, GPT-4o-mini | ||
| 835 | ** For cost comparison | ||
| 836 | |||
| 837 | * **Google Vertex AI** | ||
| 838 | ** Models: Gemini 1.5 Pro, Gemini 1.5 Flash | ||
| 839 | ** For diversity in evidence gathering | ||
| 840 | |||
| 841 | * **Local Models** (Post-POC) | ||
| 842 | ** Models: Llama 3.1, Mistral | ||
| 843 | ** For privacy-sensitive deployments | ||
| 844 | |||
| 845 | ---- | ||
| 846 | |||
| 847 | === 6.4 Provider Configuration === | ||
| 848 | |||
| 849 | **Environment Variables:** | ||
| 850 | |||
| 851 | {{{ | ||
| 852 | # Primary provider | ||
| 853 | LLM_PRIMARY_PROVIDER=anthropic | ||
| 854 | ANTHROPIC_API_KEY=sk-ant-... | ||
| 855 | |||
| 856 | # Fallback provider | ||
| 857 | LLM_FALLBACK_PROVIDER=openai | ||
| 858 | OPENAI_API_KEY=sk-... | ||
| 859 | |||
| 860 | # Provider selection per stage | ||
| 861 | LLM_STAGE1_PROVIDER=anthropic | ||
| 862 | LLM_STAGE1_MODEL=claude-haiku-4-5-20251001 | ||
| 863 | LLM_STAGE2_PROVIDER=anthropic | ||
| 864 | LLM_STAGE2_MODEL=claude-sonnet-4-5-20250929 | ||
| 865 | LLM_STAGE3_PROVIDER=anthropic | ||
| 866 | LLM_STAGE3_MODEL=claude-sonnet-4-5-20250929 | ||
| 867 | |||
| 868 | # Cost limits | ||
| 869 | LLM_MAX_COST_PER_REQUEST=1.00 | ||
| 870 | }}} | ||
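
One way to read these variables into a per-stage configuration is sketched below. This is an illustration under stated assumptions, not the project's loader: the function ``load_stage_config`` and the ``StageModel`` type are hypothetical, and the fallback behavior (a stage without its own provider variable uses ``LLM_PRIMARY_PROVIDER``; a stage without a model variable uses the provider's default) is assumed rather than specified here. The defaults ``anthropic`` and ``1.00`` simply mirror the example values above.

```python
import os
from typing import Mapping, NamedTuple, Optional


class StageModel(NamedTuple):
    provider: str
    model: Optional[str]  # None means "use the provider's default model" (assumption)


def load_stage_config(env: Mapping[str, str] = os.environ) -> dict:
    """Build a stage -> (provider, model) mapping from LLM_* variables."""
    primary = env.get("LLM_PRIMARY_PROVIDER", "anthropic")
    stages = {}
    for stage in ("stage1", "stage2", "stage3"):
        prefix = f"LLM_{stage.upper()}"  # e.g. LLM_STAGE1
        stages[stage] = StageModel(
            # Assumed rule: stages without an explicit provider use the primary one.
            provider=env.get(f"{prefix}_PROVIDER", primary),
            model=env.get(f"{prefix}_MODEL"),
        )
    return {
        "primary": primary,
        "fallback": env.get("LLM_FALLBACK_PROVIDER"),  # None when not configured
        "max_cost_per_request": float(env.get("LLM_MAX_COST_PER_REQUEST", "1.00")),
        "stages": stages,
    }
```

Passing the environment as a plain mapping keeps the loader testable without touching the process environment.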
| 871 | |||
| 872 | **Database Configuration (Alternative):** | ||
| 873 | |||
| 874 | {{{ | ||
| 875 | { | ||
| 876 | "providers": [ | ||
| 877 | { | ||
| 878 | "name": "anthropic", | ||
| 879 | "api_key_ref": "vault://anthropic-api-key", | ||
| 880 | "enabled": true, | ||
| 881 | "priority": 1 | ||
| 882 | }, | ||
| 883 | { | ||
| 884 | "name": "openai", | ||
| 885 | "api_key_ref": "vault://openai-api-key", | ||
| 886 | "enabled": true, | ||
| 887 | "priority": 2 | ||
| 888 | } | ||
| 889 | ], | ||
| 890 | "stage_config": { | ||
| 891 | "stage1": { | ||
| 892 | "provider": "anthropic", | ||
| 893 | "model": "claude-haiku-4-5-20251001", | ||
| 894 | "max_tokens": 4096, | ||
| 895 | "temperature": 0.0 | ||
| 896 | }, | ||
| 897 | "stage2": { | ||
| 898 | "provider": "anthropic", | ||
| 899 | "model": "claude-sonnet-4-5-20250929", | ||
| 900 | "max_tokens": 16384, | ||
| 901 | "temperature": 0.3 | ||
| 902 | }, | ||
| 903 | "stage3": { | ||
| 904 | "provider": "anthropic", | ||
| 905 | "model": "claude-sonnet-4-5-20250929", | ||
| 906 | "max_tokens": 8192, | ||
| 907 | "temperature": 0.2 | ||
| 908 | } | ||
| 909 | } | ||
| 910 | } | ||
| 911 | }}} | ||
| 912 | |||
| 913 | ---- | ||
| 914 | |||
| 915 | === 6.5 Stage-Specific Models (POC1 Defaults) === | ||
| 916 | |||
| 917 | **Stage 1: Claim Extraction** | ||
| 918 | |||
| 919 | * **Default:** Anthropic Provider-default FAST model | ||
| 920 | * **Alternative:** OpenAI GPT-4o-mini, Google Gemini 1.5 Flash | ||
| 921 | * **Rationale:** Fast, cheap, simple task | ||
| 922 | * **Cost:** ~$0.003 per article | ||
| 923 | |||
| 924 | **Stage 2: Claim Analysis** (CACHEABLE) | ||
| 925 | |||
| 926 | * **Default:** Anthropic Provider-default REASONING model | ||
| 927 | * **Alternative:** OpenAI GPT-4o, Google Gemini 1.5 Pro | ||
| 928 | * **Rationale:** High-quality analysis, cached 90 days | ||
| 929 | * **Cost:** ~$0.081 per NEW claim | ||
| 930 | |||
| 931 | **Stage 3: Holistic Assessment** | ||
| 932 | |||
| 933 | * **Default:** Anthropic Provider-default REASONING model | ||
| 934 | * **Alternative:** OpenAI GPT-4o, or the provider-default HEAVY model (optional, for high-stakes cases) | ||
| 935 | * **Rationale:** Complex reasoning, logical fallacy detection | ||
| 936 | * **Cost:** ~$0.030 per article | ||
| 937 | |||
| 938 | **Cost Comparison (Example):** | ||
| 939 | |||
| 940 | |=Stage|=Anthropic (Default)|=OpenAI Alternative|=Google Alternative | ||
| 941 | |Stage 1|Provider-default FAST model ($0.003)|GPT-4o-mini ($0.002)|Gemini Flash ($0.002) | ||
| 942 | |Stage 2 (per new claim)|Provider-default REASONING model ($0.081)|GPT-4o ($0.045)|Gemini Pro ($0.050) | ||
| 943 | |Stage 3|Provider-default REASONING model ($0.030)|GPT-4o ($0.018)|Gemini Pro ($0.020) | ||
| 944 | |**Total (1 new claim, 0% cache)**|**$0.114**|**$0.065**|**$0.072** | ||
| 945 | |||
| 946 | **Note:** POC1 uses Anthropic exclusively for consistency. Multi-provider support is planned for POC2. | ||
| 947 | |||
| 948 | ---- | ||
| 949 | |||
| 950 | === 6.6 Failover Strategy === | ||
| 951 | |||
| 952 | **Automatic Failover:** | ||
| 953 | |||
| 954 | {{{ | ||
| 955 | async function completeLLM(stage: string, prompt: string, options: CompletionOptions = {}): Promise<CompletionResponse> { | ||
| 956 | const primaryProvider = getProviderForStage(stage) | ||
| 957 | const fallbackProvider = getFallbackProvider() // may be undefined when no fallback is configured | ||
| 958 | try { | ||
| 959 | return await primaryProvider.complete(prompt, options) | ||
| 960 | } catch (error) { | ||
| 961 | const type = (error as { type?: string }).type // only transient errors trigger failover | ||
| 962 | if (fallbackProvider && (type === 'rate_limit' || type === 'service_unavailable')) { | ||
| 963 | logger.warn(`Primary provider ${primaryProvider.getName()} failed (${type}), using fallback`) | ||
| 964 | return await fallbackProvider.complete(prompt, options) | ||
| 965 | } | ||
| 966 | throw error | ||
| 967 | } | ||
| 968 | } | ||
| 969 | }}} | ||
| 970 | |||
| 971 | **Fallback Priority:** | ||
| 972 | |||
| 973 | 1. **Primary:** Configured provider for stage | ||
| 974 | 2. **Secondary:** Fallback provider (if configured) | ||
| 975 | 3. **Cache:** Return cached result (if available for Stage 2) | ||
| 976 | 4. **Error:** Return 503 Service Unavailable | ||
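
The four-step priority above can be sketched end to end as follows. This is a simplified illustration, not the service code: providers are modeled as plain callables, ``ProviderError`` and ``ServiceUnavailable`` are hypothetical exception types, and the caller is assumed to pass in any cached Stage 2 result it already holds. An API layer would translate ``ServiceUnavailable`` into the 503 response.

```python
from typing import Callable, Optional

# Error kinds treated as transient, mirroring the failover condition above.
TRANSIENT_KINDS = {"rate_limit", "service_unavailable"}


class ProviderError(Exception):
    """Hypothetical provider error; `kind` plays the role of error.type."""

    def __init__(self, kind: str) -> None:
        super().__init__(kind)
        self.kind = kind


class ServiceUnavailable(Exception):
    """Raised when every option is exhausted (maps to HTTP 503)."""


def complete_with_fallback(
    prompt: str,
    primary: Callable[[str], str],
    fallback: Optional[Callable[[str], str]] = None,
    cached_result: Optional[str] = None,
) -> str:
    # Steps 1 and 2: try the primary provider, then the fallback if configured.
    providers = [primary] + ([fallback] if fallback else [])
    for call in providers:
        try:
            return call(prompt)
        except ProviderError as err:
            if err.kind not in TRANSIENT_KINDS:
                raise  # non-transient errors are never masked by failover
    # Step 3: fall back to a cached result when one exists.
    if cached_result is not None:
        return cached_result
    # Step 4: nothing left to try.
    raise ServiceUnavailable("all providers failed and no cached result exists")
```

Keeping non-transient errors out of the failover path means a malformed request surfaces immediately instead of being retried against a second provider.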
| 977 | |||
| 978 | ---- | ||
| 979 | |||
| 980 | === 6.7 Provider Selection API === | ||
| 981 | |||
| 982 | **Admin Endpoint:** POST /admin/v1/llm/configure | ||
| 983 | |||
| 984 | **Update provider for specific stage:** | ||
| 985 | |||
| 986 | {{{ | ||
| 987 | { | ||
| 988 | "stage": "stage2", | ||
| 989 | "provider": "openai", | ||
| 990 | "model": "gpt-4o", | ||
| 991 | "max_tokens": 16384, | ||
| 992 | "temperature": 0.3 | ||
| 993 | } | ||
| 994 | }}} | ||
| 995 | |||
| 996 | **Response:** 200 OK | ||
| 997 | |||
| 998 | {{{ | ||
| 999 | { | ||
| 1000 | "message": "LLM configuration updated", | ||
| 1001 | "stage": "stage2", | ||
| 1002 | "previous": { | ||
| 1003 | "provider": "anthropic", | ||
| 1004 | "model": "claude-sonnet-4-5-20250929" | ||
| 1005 | }, | ||
| 1006 | "current": { | ||
| 1007 | "provider": "openai", | ||
| 1008 | "model": "gpt-4o" | ||
| 1009 | }, | ||
| 1010 | "cost_impact": { | ||
| 1011 | "previous_cost_per_claim": 0.081, | ||
| 1012 | "new_cost_per_claim": 0.045, | ||
| 1013 | "savings_percent": 44 | ||
| 1014 | } | ||
| 1015 | } | ||
| 1016 | }}} | ||
| 1017 | |||
| 1018 | **Get current configuration:** | ||
| 1019 | |||
| 1020 | GET /admin/v1/llm/config | ||
| 1021 | |||
| 1022 | {{{ | ||
| 1023 | { | ||
| 1024 | "providers": ["anthropic", "openai"], | ||
| 1025 | "primary": "anthropic", | ||
| 1026 | "fallback": "openai", | ||
| 1027 | "stages": { | ||
| 1028 | "stage1": { | ||
| 1029 | "provider": "anthropic", | ||
| 1030 | "model": "claude-haiku-4-5-20251001", | ||
| 1031 | "cost_per_request": 0.003 | ||
| 1032 | }, | ||
| 1033 | "stage2": { | ||
| 1034 | "provider": "anthropic", | ||
| 1035 | "model": "claude-sonnet-4-5-20250929", | ||
| 1036 | "cost_per_new_claim": 0.081 | ||
| 1037 | }, | ||
| 1038 | "stage3": { | ||
| 1039 | "provider": "anthropic", | ||
| 1040 | "model": "claude-sonnet-4-5-20250929", | ||
| 1041 | "cost_per_request": 0.030 | ||
| 1042 | } | ||
| 1043 | } | ||
| 1044 | } | ||
| 1045 | }}} | ||
| 1046 | |||
| 1047 | ---- | ||
| 1048 | |||
| 1049 | === 6.8 Implementation Notes === | ||
| 1050 | |||
| 1051 | **Provider Adapter Pattern:** | ||
| 1052 | |||
| 1053 | {{{ | ||
| 1054 | class AnthropicProvider implements LLMProvider { | ||
| 1055 | async complete(prompt: string, options: CompletionOptions) { | ||
| 1056 | const response = await anthropic.messages.create({ | ||
| 1057 | model: options.model || 'claude-sonnet-4-5-20250929', | ||
| 1058 | max_tokens: options.maxTokens || 4096, | ||
| 1059 | messages: [{ role: 'user', content: prompt }], | ||
| 1060 | system: options.systemPrompt | ||
| 1061 | }) | ||
| 1062 | return response.content[0].text | ||
| 1063 | } | ||
| 1064 | } | ||
| 1065 | |||
| 1066 | class OpenAIProvider implements LLMProvider { | ||
| 1067 | async complete(prompt: string, options: CompletionOptions) { | ||
| 1068 | const response = await openai.chat.completions.create({ | ||
| 1069 | model: options.model || 'gpt-4o', | ||
| 1070 | max_tokens: options.maxTokens || 4096, | ||
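| 1071 | messages: [ | ||
| 1072 | ...(options.systemPrompt ? [{ role: 'system' as const, content: options.systemPrompt }] : []), // omit the system message when no prompt is set | ||
| 1073 | { role: 'user' as const, content: prompt } | ||
| 1074 | ] | ||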
| 1075 | }) | ||
| 1076 | return response.choices[0].message.content | ||
| 1077 | } | ||
| 1078 | } | ||
| 1079 | }}} | ||
| 1080 | |||
| 1081 | **Provider Registry:** | ||
| 1082 | |||
| 1083 | {{{ | ||
| 1084 | const providers = new Map<string, LLMProvider>() | ||
| 1085 | providers.set('anthropic', new AnthropicProvider()) | ||
| 1086 | providers.set('openai', new OpenAIProvider()) | ||
| 1087 | providers.set('google', new GoogleProvider()) | ||
| 1088 | |||
| 1089 | function getProvider(name: string): LLMProvider { | ||
| 1090 | return providers.get(name) || providers.get(config.primaryProvider) | ||
| 1091 | } | ||
| 1092 | }}} | ||
| 1093 | |||
| 1094 | ---- | ||
| 1095 | |||
| 1096 | == 3. REST API Contract == | ||
| 1097 | |||
| 1098 | === 3.1 User Credit Tracking === | ||
| 1099 | |||
| 1100 | **Endpoint:** GET /v1/user/credit | ||
| 1101 | |||
| 1102 | **Response:** 200 OK | ||
| 1103 | |||
| 1104 | {{{{ | ||
| 1105 | "user_id": "user_abc123", | ||
| 1106 | "tier": "free", | ||
| 1107 | "credit_limit": 10.00, | ||
| 1108 | "credit_used": 7.42, | ||
| 1109 | "credit_remaining": 2.58, | ||
| 1110 | "reset_date": "2025-02-01T00:00:00Z", | ||
| 1111 | "cache_only_mode": false, | ||
| 1112 | "usage_stats": { | ||
| 1113 | "articles_analyzed": 67, | ||
| 1114 | "claims_from_cache": 189, | ||
| 1115 | "claims_newly_analyzed": 113, | ||
| 1116 | "cache_hit_rate": 0.626 | ||
| 1117 | } | ||
| 1118 | } | ||
| 1119 | }}} | ||
| 1120 | |||
| 1121 | ---- | ||
| 1122 | |||
| 1123 | |||
| 1124 | |||
| 1125 | ==== Stage 2 Output Schema: ClaimAnalysis ==== | ||
| 1126 | |||
| 1127 | **Complete schema for each claim's analysis result:** | ||
| 1128 | |||
| 1129 | {{code language="json"}} | ||
| 1130 | { | ||
| 1131 | "claim_id": "claim_abc123", | ||
| 1132 | "claim_text": "Biden won the 2020 election", | ||
| 1133 | "scenarios": [ | ||
| 1134 | { | ||
| 1135 | "scenario_id": "scenario_1", | ||
| 1136 | "description": "Interpreting 'won' as Electoral College victory", | ||
| 1137 | "verdict": { | ||
| 1138 | "label": "Highly likely", | ||
| 1139 | "confidence": 0.95, | ||
| 1140 | "explanation": "Joe Biden won 306 electoral votes vs Trump's 232" | ||
| 1141 | }, | ||
| 1142 | "evidence": { | ||
| 1143 | "supporting": [ | ||
| 1144 | { | ||
| 1145 | "text": "Biden certified with 306 electoral votes", | ||
| 1146 | "source_url": "https://www.archives.gov/electoral-college/2020", | ||
| 1147 | "source_title": "2020 Electoral College Results", | ||
| 1148 | "credibility_score": 0.98 | ||
| 1149 | } | ||
| 1150 | ], | ||
| 1151 | "opposing": [] | ||
| 1152 | } | ||
| 1153 | } | ||
| 1154 | ], | ||
| 1155 | "recommended_scenario": "scenario_1", | ||
| 1156 | "metadata": { | ||
| 1157 | "analysis_timestamp": "2024-12-24T18:00:00Z", | ||
| 1158 | "model_used": "claude-sonnet-4-5-20250929", | ||
| 1159 | "processing_time_seconds": 8.5 | ||
| 1160 | } | ||
| 1161 | } | ||
| 1162 | {{/code}} | ||
| 1163 | |||
| 1164 | **Required Fields:** | ||
| 1165 | * **claim_id**: Unique identifier matching Stage 1 output | ||
| 1166 | * **claim_text**: The exact claim being analyzed | ||
| 1167 | * **scenarios**: Array of interpretation scenarios (minimum 1) | ||
| 1168 | ** **scenario_id**: Unique ID for this scenario | ||
| 1169 | ** **description**: Clear interpretation of the claim | ||
| 1170 | ** **verdict**: Verdict object with label, confidence, explanation | ||
| 1171 | ** **evidence**: Supporting and opposing evidence arrays | ||
| 1172 | * **recommended_scenario**: ID of the primary/recommended scenario | ||
| 1173 | * **metadata**: Processing metadata (timestamp, model, timing) | ||
| 1174 | |||
| 1175 | **Optional Fields:** | ||
| 1176 | * Additional context, warnings, or quality scores | ||
| 1177 | |||
| 1178 | **Minimum Viable Example:** | ||
| 1179 | |||
| 1180 | {{code language="json"}} | ||
| 1181 | { | ||
| 1182 | "claim_id": "c1", | ||
| 1183 | "claim_text": "The sky is blue", | ||
| 1184 | "scenarios": [{ | ||
| 1185 | "scenario_id": "s1", | ||
| 1186 | "description": "Under clear daytime conditions", | ||
| 1187 | "verdict": {"label": "Highly likely", "confidence": 0.99, "explanation": "Rayleigh scattering"}, | ||
| 1188 | "evidence": {"supporting": [], "opposing": []} | ||
| 1189 | }], | ||
| 1190 | "recommended_scenario": "s1", | ||
| 1191 | "metadata": {"analysis_timestamp": "2024-12-24T18:00:00Z"} | ||
| 1192 | } | ||
| 1193 | {{/code}} | ||
| 1194 | |||
| 1195 | |||
| 1196 | |||
| 1197 | ==== Stage 3 Output Schema: ArticleAssessment ==== | ||
| 1198 | |||
| 1199 | **Complete schema for holistic article-level assessment:** | ||
| 1200 | |||
| 1201 | {{code language="json"}} | ||
| 1202 | { | ||
| 1203 | "article_id": "article_xyz789", | ||
| 1204 | "overall_assessment": { | ||
| 1205 | "credibility_score": 0.72, | ||
| 1206 | "risk_tier": "B", | ||
| 1207 | "summary": "Article contains mostly accurate claims with one disputed claim requiring expert review", | ||
| 1208 | "confidence": 0.85 | ||
| 1209 | }, | ||
| 1210 | "claim_aggregation": { | ||
| 1211 | "total_claims": 5, | ||
| 1212 | "verdict_distribution": { | ||
| 1213 | "Supported": 3, | ||
| 1214 | "Inconclusive": 2, | ||
| 1215 | "Refuted": 0 | ||
| 1216 | }, | ||
| 1220 | "avg_confidence": 0.82 | ||
| 1221 | }, | ||
| 1222 | "contextual_factors": [ | ||
| 1223 | { | ||
| 1224 | "factor": "Source credibility", | ||
| 1225 | "impact": "positive", | ||
| 1226 | "description": "Published by reputable news organization" | ||
| 1227 | }, | ||
| 1228 | { | ||
| 1229 | "factor": "Claim interdependence", | ||
| 1230 | "impact": "neutral", | ||
| 1231 | "description": "Claims are independent; no logical chains" | ||
| 1232 | } | ||
| 1233 | ], | ||
| 1234 | "recommendations": { | ||
| 1235 | "publication_mode": "AI_GENERATED", | ||
| 1236 | "requires_review": false, | ||
| 1237 | "review_reason": null, | ||
| 1238 | "suggested_disclaimers": [ | ||
| 1239 | "One claim (Claim 4) has conflicting expert opinions" | ||
| 1240 | ] | ||
| 1241 | }, | ||
| 1242 | "metadata": { | ||
| 1243 | "holistic_timestamp": "2024-12-24T18:00:10Z", | ||
| 1244 | "model_used": "claude-sonnet-4-5-20250929", | ||
| 1245 | "processing_time_seconds": 4.2, | ||
| 1246 | "cache_used": false | ||
| 1247 | } | ||
| 1248 | } | ||
| 1249 | {{/code}} | ||
| 1250 | |||
| 1251 | **Required Fields:** | ||
| 1252 | * **article_id**: Unique identifier for this article | ||
| 1253 | * **overall_assessment**: Top-level assessment | ||
| 1254 | ** **credibility_score**: 0.0-1.0 composite score | ||
| 1255 | ** **risk_tier**: A, B, or C (per AKEL quality gates) | ||
| 1256 | ** **summary**: Human-readable assessment | ||
| 1257 | ** **confidence**: How confident the holistic assessment is | ||
| 1258 | * **claim_aggregation**: Statistics across all claims | ||
| 1259 | ** **total_claims**: Count of claims analyzed | ||
| 1260 | ** **verdict_distribution**: Count per verdict label | ||
| 1261 | ** **avg_confidence**: Average confidence across verdicts | ||
| 1262 | * **contextual_factors**: Array of contextual considerations | ||
| 1263 | * **recommendations**: Publication decision support | ||
| 1264 | ** **publication_mode**: DRAFT_ONLY, AI_GENERATED, or HUMAN_REVIEWED | ||
| 1265 | ** **requires_review**: Boolean flag | ||
| 1266 | ** **suggested_disclaimers**: Array of disclaimer texts | ||
| 1267 | * **metadata**: Processing metadata | ||
| 1268 | |||
| 1269 | **Minimum Viable Example:** | ||
| 1270 | |||
| 1271 | {{code language="json"}} | ||
| 1272 | { | ||
| 1273 | "article_id": "a1", | ||
| 1274 | "overall_assessment": { | ||
| 1275 | "credibility_score": 0.95, | ||
| 1276 | "risk_tier": "C", | ||
| 1277 | "summary": "All claims verified as true", | ||
| 1278 | "confidence": 0.98 | ||
| 1279 | }, | ||
| 1280 | "claim_aggregation": { | ||
| 1281 | "total_claims": 1, | ||
| 1282 | "verdict_distribution": {"Supported": 1}, | ||
| 1283 | "avg_confidence": 0.99 | ||
| 1284 | }, | ||
| 1285 | "contextual_factors": [], | ||
| 1286 | "recommendations": { | ||
| 1287 | "publication_mode": "AI_GENERATED", | ||
| 1288 | "requires_review": false, | ||
| 1289 | "suggested_disclaimers": [] | ||
| 1290 | }, | ||
| 1291 | "metadata": {"holistic_timestamp": "2024-12-24T18:00:00Z"} | ||
| 1292 | } | ||
| 1293 | {{/code}} | ||
| 1294 | |||
| 1295 | === 3.2 Create Analysis Job (3-Stage) === | ||
| 1296 | |||
| 1297 | **Endpoint:** POST /v1/analyze | ||
| 1298 | |||
| 1299 | ==== Idempotency Support: ==== | ||
| 1300 | |||
| 1301 | To prevent duplicate job creation on network retries, clients SHOULD include **either**: | ||
| 1302 | |||
| 1303 | * Header: ``Idempotency-Key: <client-generated-uuid>`` (preferred) | ||
| 1304 | * OR body: ``client.request_id`` | ||
| 1305 | |||
| 1306 | **Example request (header):** | ||
| 1307 | {{code language="text"}} | ||
| 1308 | POST /v1/analyze | ||
| 1309 | Authorization: Bearer <API_KEY> | ||
| 1310 | Idempotency-Key: 0f3c6c0e-2d2b-4b4a-9d6f-1a1f6b0c9f7e | ||
| 1311 | Content-Type: application/json | ||
| 1312 | {{/code}} | ||
| 1313 | |||
| 1314 | **Example request (body):** | ||
| 1315 | {{code language="json"}} | ||
| 1316 | { | ||
| 1317 | "input_url": "https://example.org/article", | ||
| 1318 | "options": { "max_claims": 5, "cache_preference": "prefer_cache" }, | ||
| 1319 | "client": { "request_id": "0f3c6c0e-2d2b-4b4a-9d6f-1a1f6b0c9f7e" } | ||
| 1320 | } | ||
| 1321 | {{/code}} | ||
| 1322 | |||
| 1323 | **Server behavior:** | ||
| 1324 | * Same idempotency key + same request body ⇒ return existing job (``200``) and include: | ||
| 1325 | ``idempotent=true`` and ``original_request_at``. | ||
| 1326 | * Same key + different body ⇒ ``409`` with ``VALIDATION_ERROR`` describing the mismatch. | ||
| 1327 | |||
| 1328 | **Idempotency TTL:** 24 hours (minimum). | ||
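
The server behavior described above can be sketched with a small in-memory store. This is a minimal illustration under assumptions, not the production design: the class ``IdempotencyStore`` and its ``submit`` method are hypothetical, request bodies are compared by a SHA-256 hash of their sorted-key JSON form (an assumed definition of "same body"), and ``original_request_at`` is kept as an epoch timestamp for brevity. A real deployment would likely keep these entries in a shared store so that all API instances see them.

```python
import hashlib
import json
import time
from typing import Dict, Optional, Tuple

IDEMPOTENCY_TTL_SECONDS = 24 * 60 * 60  # 24 hours, the stated minimum


class IdempotencyStore:
    """In-memory stand-in for an idempotency-key registry."""

    def __init__(self) -> None:
        # key -> (body_hash, job_id, created_at)
        self._entries: Dict[str, Tuple[str, str, float]] = {}

    @staticmethod
    def _body_hash(body: dict) -> str:
        # Sorted keys and fixed separators make the hash independent of key order.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def submit(self, key: str, body: dict, new_job_id: str,
               now: Optional[float] = None) -> Tuple[int, dict]:
        """Return (http_status, payload) for a POST /v1/analyze attempt."""
        now = time.time() if now is None else now
        body_hash = self._body_hash(body)
        entry = self._entries.get(key)
        if entry and now - entry[2] < IDEMPOTENCY_TTL_SECONDS:
            stored_hash, job_id, created_at = entry
            if stored_hash == body_hash:
                # Same key, same body: return the existing job.
                return 200, {"job_id": job_id, "idempotent": True,
                             "original_request_at": created_at}
            # Same key, different body: reject the request.
            return 409, {"error": "VALIDATION_ERROR",
                         "message": "Idempotency key reused with a different request body"}
        # Unknown or expired key: register it and create a new job.
        self._entries[key] = (body_hash, new_job_id, now)
        return 202, {"job_id": new_job_id, "status": "QUEUED"}
```

The optional ``now`` parameter exists only to make expiry testable without waiting; callers would normally omit it.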
| 1329 | |||
| 1330 | ==== Request Body: ==== | ||
| 1331 | |||
| 1332 | {{{{ | ||
| 1333 | "input_type": "url", | ||
| 1334 | "input_url": "https://example.com/medical-report-01", | ||
| 1335 | "input_text": null, | ||
| 1336 | "options": { | ||
| 1337 | "browsing": "on", | ||
| 1338 | "depth": "standard", | ||
| 1339 | "max_claims": 5, | ||
| 1340 | "cache_preference": "prefer_cache", | ||
| 1341 | "scenarios_per_claim": 2, | ||
| 1342 | "max_evidence_per_scenario": 6, | ||
| 1343 | "context_aware_analysis": true | ||
| 1344 | }, | ||
| 1345 | "client": { | ||
| 1346 | "request_id": "optional-client-tracking-id", | ||
| 1347 | "source_label": "optional" | ||
| 1348 | } | ||
| 1349 | } | ||
| 1350 | }}} | ||
| 1351 | |||
| 1352 | **Options:** | ||
| 1353 | |||
| 1354 | * browsing: on | off (retrieve web sources or just output queries) | ||
| 1355 | * depth: standard | deep (evidence thoroughness) | ||
| 1356 | * max_claims: 1-10 (default: **5** for cost control) | ||
| 1357 | * cache_preference (optional, string): prefer_cache | allow_partial | skip_cache (default: **prefer_cache**) | ||
| 1358 | ** {{code}}"prefer_cache"{{/code}}: Use full cache if available, otherwise run all stages | ||
| 1359 | ** {{code}}"allow_partial"{{/code}}: Use cached Stage 2 results if available, rerun only Stage 3 | ||
| 1360 | ** {{code}}"skip_cache"{{/code}}: Always rerun all stages (ignore cache) | ||
| 1361 | ** When set to {{code}}"allow_partial"{{/code}} and cached Stage 2 results exist: | ||
| 1362 | *** Stage 1 & 2 are skipped | ||
| 1363 | *** Stage 3 (holistic assessment) runs fresh with cached claim analyses | ||
| 1364 | *** Response includes {{code}}"cache_used": true{{/code}} and {{code}}"stages_cached": ["stage1", "stage2"]{{/code}} | ||
| 1365 | * scenarios_per_claim: 1-5 (default: **2** for cost control) | ||
| 1366 | * max_evidence_per_scenario: 3-10 (default: **6**) | ||
| 1367 | * context_aware_analysis: true | false (experimental) | ||
| 1373 | |||
| 1374 | **Response:** 202 Accepted | ||
| 1375 | |||
| 1376 | {{{{ | ||
| 1377 | "job_id": "01J...ULID", | ||
| 1378 | "status": "QUEUED", | ||
| 1379 | "created_at": "2025-12-24T10:31:00Z", | ||
| 1380 | "estimated_cost": 0.114, | ||
| 1381 | "cost_breakdown": { | ||
| 1382 | "stage1_extraction": 0.003, | ||
| 1383 | "stage2_new_claims": 0.081, | ||
| 1384 | "stage2_cached_claims": 0.000, | ||
| 1385 | "stage3_holistic": 0.030 | ||
| 1386 | }, | ||
| 1387 | "cache_info": { | ||
| 1388 | "claims_to_extract": 5, | ||
| 1389 | "estimated_cache_hits": 4, | ||
| 1390 | "estimated_new_claims": 1 | ||
| 1391 | }, | ||
| 1392 | "links": { | ||
| 1393 | "self": "/v1/jobs/01J...ULID", | ||
| 1394 | "result": "/v1/jobs/01J...ULID/result", | ||
| 1395 | "report": "/v1/jobs/01J...ULID/report", | ||
| 1396 | "events": "/v1/jobs/01J...ULID/events" | ||
| 1397 | } | ||
| 1398 | } | ||
| 1399 | }}} | ||
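
The ``estimated_cost`` and ``cost_breakdown`` values in this response follow from the per-stage costs quoted in section 6.5. The sketch below reproduces that arithmetic; the function name ``estimate_job_cost`` is hypothetical, and it assumes that cached claims cost nothing and that Stage 1 and Stage 3 are charged once per article.

```python
STAGE1_COST_USD = 0.003  # claim extraction, once per article
STAGE2_COST_USD = 0.081  # claim analysis, per NEW (uncached) claim
STAGE3_COST_USD = 0.030  # holistic assessment, once per article


def estimate_job_cost(total_claims: int, cache_hits: int) -> dict:
    """Estimate job cost from the number of claims and expected cache hits."""
    if not 0 <= cache_hits <= total_claims:
        raise ValueError("cache_hits must be between 0 and total_claims")
    new_claims = total_claims - cache_hits
    breakdown = {
        "stage1_extraction": STAGE1_COST_USD,
        "stage2_new_claims": round(new_claims * STAGE2_COST_USD, 3),
        "stage2_cached_claims": 0.0,  # cached claims incur no LLM cost
        "stage3_holistic": STAGE3_COST_USD,
    }
    return {
        # Rounded to three decimals to avoid floating-point noise in the total.
        "estimated_cost": round(sum(breakdown.values()), 3),
        "cost_breakdown": breakdown,
        "cache_info": {
            "claims_to_extract": total_claims,
            "estimated_cache_hits": cache_hits,
            "estimated_new_claims": new_claims,
        },
    }
```

For five claims with four expected cache hits this gives 0.003 + 0.081 + 0.030 = 0.114, the value shown in the response; with no cache hits the same article would be estimated at 0.438.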
| 1400 | |||
| 1401 | **Error Responses:** | ||
| 1402 | |||
| 1403 | 402 Payment Required - Free tier limit reached, cache-only mode | ||
| 1404 | |||
| 1405 | {{{{ | ||
| 1406 | "error": "credit_limit_reached", | ||
| 1407 | "message": "Monthly credit limit reached. Entering cache-only mode.", | ||
| 1408 | "cache_only_mode": true, | ||
| 1409 | "credit_remaining": 0.00, | ||
| 1410 | "reset_date": "2025-02-01T00:00:00Z", | ||
| 1411 | "action": "Resubmit with cache_preference=allow_partial for cached results" | ||
| 1412 | } | ||
| 1413 | }}} | ||
| 1414 | |||
| 1415 | ---- | ||
| 1416 | |||
| 1417 | == 4. Data Schemas == | ||
| 1418 | |||
| 1419 | === 4.1 Stage 1 Output: ClaimExtraction === | ||
| 1420 | |||
| 1421 | {{{{ | ||
| 1422 | "job_id": "01J...ULID", | ||
| 1423 | "stage": "stage1_extraction", | ||
| 1424 | "article_metadata": { | ||
| 1425 | "title": "Article title", | ||
| 1426 | "source_url": "https://example.com/article", | ||
| 1427 | "extracted_text_length": 5234, | ||
| 1428 | "language": "en" | ||
| 1429 | }, | ||
| 1430 | "claims": [ | ||
| 1431 | { | ||
| 1432 | "claim_id": "C1", | ||
| 1433 | "claim_text": "Original claim text from article", | ||
| 1434 | "canonical_claim": "Normalized, deduplicated phrasing", | ||
| 1435 | "claim_hash": "sha256:abc123...", | ||
| 1436 | "is_central_to_thesis": true, | ||
| 1437 | "claim_type": "causal", | ||
| 1438 | "evaluability": "evaluable", | ||
| 1439 | "risk_tier": "B", | ||
| 1440 | "domain": "public_health" | ||
| 1441 | } | ||
| 1442 | ], | ||
| 1443 | "article_thesis": "Main argument detected", | ||
| 1444 | "cost": 0.003 | ||
| 1445 | } | ||
| 1446 | }}} | ||
| 1447 | |||
| 1448 | ---- | ||
| 1449 | |||
| 1450 | === 4.5 Verdict Label Taxonomy === | ||
| 1451 | |||
| 1452 | FactHarbor uses **three distinct verdict taxonomies** depending on analysis level: | ||
| 1453 | |||
| 1454 | ==== 4.5.1 Scenario Verdict Labels (Stage 2) ==== | ||
| 1455 | |||
| 1456 | Used for individual scenario verdicts within a claim. | ||
| 1457 | |||
| 1458 | **Enum Values:** | ||
| 1459 | |||
| 1460 | * Highly likely - Probability 0.85-1.0, high confidence | ||
| 1461 | * Likely - Probability 0.65-0.84, moderate-high confidence | ||
| 1462 | * Unclear - Probability 0.35-0.64, or low confidence | ||
| 1463 | * Unlikely - Probability 0.16-0.34, moderate-high confidence | ||
| 1464 | * Highly unlikely - Probability 0.0-0.15, high confidence | ||
| 1465 | * Unsubstantiated - Insufficient evidence to determine probability | ||
| 1466 | |||
| 1467 | ==== 4.5.2 Claim Verdict Labels (Rollup) ==== | ||
| 1468 | |||
| 1469 | Used when summarizing a claim across all scenarios. | ||
| 1470 | |||
| 1471 | **Enum Values:** | ||
| 1472 | |||
| 1473 | * Supported - At least 60% of scenarios are Likely or Highly likely | ||
| 1474 | * Refuted - At least 60% of scenarios are Unlikely or Highly unlikely | ||
| 1475 | * Inconclusive - Neither threshold is met (mixed scenarios, or mostly Unclear/Unsubstantiated) | ||
| 1476 | |||
| 1477 | **Mapping Logic:** | ||
| 1478 | |||
| 1479 | * If ≥60% of scenarios are (Highly likely | Likely) → Supported | ||
| 1480 | * If ≥60% of scenarios are (Highly unlikely | Unlikely) → Refuted | ||
| 1481 | * Otherwise → Inconclusive | ||
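
The percentage rule in this subsection can be sketched as a small function. It is an illustration of the rule as written here only; where the Codegen Contract's mapping rule differs, that section takes precedence. The function name ``rollup_claim_verdict`` is hypothetical, labels are compared case-insensitively, and returning ``Inconclusive`` for an empty scenario list is an assumption.

```python
from typing import List

POSITIVE_LABELS = {"highly likely", "likely"}
NEGATIVE_LABELS = {"highly unlikely", "unlikely"}
THRESHOLD_PERCENT = 60


def rollup_claim_verdict(scenario_labels: List[str]) -> str:
    """Map scenario verdict labels to a claim verdict using the 60% rule."""
    if not scenario_labels:
        return "Inconclusive"  # assumption: nothing to support or refute
    normalized = [label.strip().lower() for label in scenario_labels]
    total = len(normalized)
    positive = sum(label in POSITIVE_LABELS for label in normalized)
    negative = sum(label in NEGATIVE_LABELS for label in normalized)
    # Integer arithmetic avoids floating-point edge cases at exactly 60%.
    if positive * 100 >= THRESHOLD_PERCENT * total:
        return "Supported"
    if negative * 100 >= THRESHOLD_PERCENT * total:
        return "Refuted"
    return "Inconclusive"
```

Because the two label sets are disjoint and the threshold exceeds 50%, at most one of the two conditions can hold for any input.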
| 1482 | |||
| 1483 | ==== 4.5.3 Article Verdict Labels (Stage 3) ==== | ||
| 1484 | |||
| 1485 | Used for holistic article-level assessment. | ||
| 1486 | |||
| 1487 | **Enum Values:** | ||
| 1488 | |||
| 1489 | * WELL-SUPPORTED - Article thesis logically follows from supported claims | ||
| 1490 | * MISLEADING - Claims may be true but article commits logical fallacies | ||
| 1491 | * REFUTED - Central claims are refuted, invalidating thesis | ||
| 1492 | * UNCERTAIN - Insufficient evidence or highly mixed claim verdicts | ||
| 1493 | |||
| 1494 | **Note:** Article verdict considers **claim centrality** (central claims override supporting claims). | ||
| 1495 | |||
| 1496 | ==== 4.5.4 API Field Mapping ==== | ||
| 1497 | |||
| 1498 | |=Level|=API Field|=Enum Name | ||
| 1499 | |Scenario|scenarios[].verdict.label|scenario_verdict_label | ||
| 1500 | |Claim|claims[].rollup_verdict (optional)|claim_verdict_label | ||
| 1501 | |Article|article_holistic_assessment.overall_verdict|article_verdict_label | ||
| 1502 | |||
| 1503 | ---- | ||
| 1504 | |||
| 1505 | == 5. Cache Architecture == | ||
| 1506 | |||
| 1507 | === 5.1 Redis Cache Design === | ||
| 1508 | |||
| 1509 | **Technology:** Redis 7.0+ (in-memory key-value store) | ||
| 1510 | |||
| 1511 | **Cache Key Schema** (``claim_hash`` is the SHA-256 digest defined normatively in section 5.1.1): | ||
| 1512 | |||
| 1513 | {{{claim:v1norm1:{language}:{claim_hash} | ||
| 1514 | }}} | ||
| 1515 | |||
| 1516 | **Example:** | ||
| 1517 | |||
| 1518 | {{{Claim (English): "COVID vaccines are 95% effective" | ||
| 1519 | Canonical: "covid vaccines are 95 percent effective" | ||
| 1520 | Language: "en" | ||
| 1521 | SHA256: abc123...def456 | ||
| 1522 | Key: claim:v1norm1:en:abc123...def456 | ||
| 1523 | }}} | ||
| 1524 | |||
| 1525 | **Rationale:** Prevents cross-language collisions and enables per-language cache analytics. | ||
| 1526 | |||
| 1527 | **Data Structure:** | ||
| 1528 | |||
| 1529 | {{{SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}' | ||
| 1530 | EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days | ||
| 1531 | }}} | ||
| 1532 | |||
| 1533 | ---- | ||
| 1534 | |||
| 1535 | === 5.1.1 Canonical Claim Normalization (v1norm1) === | ||
| 1536 | |||
| 1537 | The cache key depends on deterministic claim normalization. **All implementations MUST follow this algorithm exactly.** | ||
| 1538 | |||
| 1539 | **Normalization version:** ``v1norm1`` | ||
| 1540 | |||
| 1541 | **Algorithm (v1norm1):** | ||
| 1542 | 1. Unicode normalize: NFD | ||
| 1543 | 2. Lowercase | ||
| 1544 | 3. Strip diacritics | ||
| 1545 | 4. Normalize apostrophes: ``’`` and ``‘`` → ``'`` | ||
| 1546 | 5. Replace percent sign: ``%`` → `` percent`` | ||
| 1547 | 6. Collapse whitespace | ||
| 1548 | 7. Remove punctuation **except apostrophes** | ||
| 1549 | 8. Expand contractions (fixed list below) | ||
| 1550 | 9. Remove remaining apostrophes | ||
| 1551 | 10. Collapse whitespace again | ||
| 1552 | |||
| 1553 | {{code language="python"}} | ||
| 1554 | import re | ||
| 1555 | import unicodedata | ||
| 1556 | |||
| 1557 | # Canonical claim normalization for deduplication. | ||
| 1558 | # Version: v1norm1 | ||
| 1559 | # | ||
| 1560 | # IMPORTANT: | ||
| 1561 | # - Any change to these rules REQUIRES a new normalization version. | ||
| 1562 | # - Cache keys MUST include the normalization version to avoid collisions. | ||
| 1563 | |||
| 1564 | CONTRACTIONS_V1NORM1 = { | ||
| 1565 | "don't": "do not", | ||
| 1566 | "doesn't": "does not", | ||
| 1567 | "didn't": "did not", | ||
| 1568 | "can't": "cannot", | ||
| 1569 | "won't": "will not", | ||
| 1570 | "shouldn't": "should not", | ||
| 1571 | "wouldn't": "would not", | ||
| 1572 | "isn't": "is not", | ||
| 1573 | "aren't": "are not", | ||
| 1574 | "wasn't": "was not", | ||
| 1575 | "weren't": "were not", | ||
| 1576 | "haven't": "have not", | ||
| 1577 | "hasn't": "has not", | ||
| 1578 | "hadn't": "had not", | ||
| 1579 | "it's": "it is", | ||
| 1580 | "that's": "that is", | ||
| 1581 | "there's": "there is", | ||
| 1582 | "i'm": "i am", | ||
| 1583 | "we're": "we are", | ||
| 1584 | "they're": "they are", | ||
| 1585 | "you're": "you are", | ||
| 1586 | "i've": "i have", | ||
| 1587 | "we've": "we have", | ||
| 1588 | "they've": "they have", | ||
| 1589 | "you've": "you have", | ||
| 1590 | "i'll": "i will", | ||
| 1591 | "we'll": "we will", | ||
| 1592 | "they'll": "they will", | ||
| 1593 | "you'll": "you will", | ||
| 1594 | } | ||
| 1595 | |||
| 1596 | def normalize_claim(text: str) -> str: | ||
| 1597 |     if text is None: | ||
| 1598 |         return "" | ||
| 1599 | |||
| 1600 |     # 1) Unicode normalization (NFD) | ||
| 1601 |     text = unicodedata.normalize("NFD", text) | ||
| 1602 | |||
| 1603 |     # 2) Lowercase | ||
| 1604 |     text = text.lower() | ||
| 1605 | |||
| 1606 |     # 3) Strip diacritics | ||
| 1607 |     text = "".join(c for c in text if unicodedata.category(c) != "Mn") | ||
| 1608 | |||
| 1609 |     # 4) Normalize apostrophes | ||
| 1610 |     text = text.replace("’", "'").replace("‘", "'") | ||
| 1611 | |||
| 1612 |     # 5) Normalize percent sign | ||
| 1613 |     text = text.replace("%", " percent") | ||
| 1614 | |||
| 1615 |     # 6) Collapse whitespace | ||
| 1616 |     text = re.sub(r"\s+", " ", text).strip() | ||
| 1617 | |||
| 1618 |     # 7) Remove punctuation except apostrophes | ||
| 1619 |     text = re.sub(r"[^\w\s']", "", text) | ||
| 1620 | |||
| 1621 |     # 8) Expand contractions | ||
| 1622 |     for k, v in CONTRACTIONS_V1NORM1.items(): | ||
| 1623 |         text = re.sub(rf"\b{re.escape(k)}\b", v, text) | ||
| 1624 | |||
| 1625 |     # 9) Remove remaining apostrophes (after contraction expansion) | ||
| 1626 |     text = text.replace("'", "") | ||
| 1627 | |||
| 1628 |     # 10) Final whitespace normalization | ||
| 1629 |     text = re.sub(r"\s+", " ", text).strip() | ||
| 1630 | |||
| 1631 |     return text | ||
| 1632 | {{/code}} | ||
| 1633 | |||
| 1634 | **Canonical claim hash input (normative):** | ||
| 1635 | * ``claim_hash = sha256_hex_lower( "v1norm1|<language>|" + canonical_claim_text )`` | ||
| 1636 | * Cache key: ``claim:v1norm1:<language>:<claim_hash>`` | ||
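The hash input and cache key defined above can be sketched as follows. This is a minimal illustration, not a normative implementation. It assumes the claim text has already been passed through ``normalize_claim``, that ``<language>`` is a short code such as ``en`` (the spec names only the placeholder), and that the hash input is encoded as UTF-8. The helper names ``claim_hash`` and ``claim_cache_key`` are hypothetical.

```python
import hashlib

# Normalization version from the spec; it is part of both the hash input and the key.
NORMALIZATION_VERSION = "v1norm1"


def claim_hash(canonical_claim_text: str, language: str) -> str:
    """Return the lowercase hex SHA-256 of "v1norm1|<language>|<canonical text>"."""
    hash_input = f"{NORMALIZATION_VERSION}|{language}|{canonical_claim_text}"
    # Assumption: UTF-8 encoding; hexdigest() already returns lowercase hex.
    return hashlib.sha256(hash_input.encode("utf-8")).hexdigest()


def claim_cache_key(canonical_claim_text: str, language: str) -> str:
    """Build the cache key "claim:v1norm1:<language>:<claim_hash>"."""
    digest = claim_hash(canonical_claim_text, language)
    return f"claim:{NORMALIZATION_VERSION}:{language}:{digest}"
```

Because the language is part of the hash input, the same canonical text in two languages yields two distinct hashes and therefore two distinct cache entries.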
| 1637 | |||
| 1638 | **Normalization Examples:** | ||
| 1639 | |||
| 1640 | |= Input |= Normalized Output | ||
| 1641 | | "Biden won the 2020 election" | {{code}}biden won the 2020 election{{/code}} | ||
| 1642 | | "Biden won the 2020 election!" | {{code}}biden won the 2020 election{{/code}} | ||
| 1643 | | "Biden   won  the 2020 election" (extra whitespace) | {{code}}biden won the 2020 election{{/code}} | ||
| 1644 | | "Biden didn't win the 2020 election" | {{code}}biden did not win the 2020 election{{/code}} | ||
| 1645 | | "BIDEN WON THE 2020 ELECTION" | {{code}}biden won the 2020 election{{/code}} | ||
| 1646 | |||
| 1647 | **Versioning:** Algorithm version is {{code}}v1norm1{{/code}}. Changes to the algorithm require a new version identifier. | ||
| 1648 | |||
| 1649 | === 5.1.2 Copyright & Data Retention Policy === | ||
| 1650 | |||
| 1651 | **Evidence Excerpt Storage:** | ||
| 1652 | |||
| 1653 | To comply with copyright law and fair use principles: | ||
| 1654 | |||
| 1655 | **What We Store:** | ||
| 1656 | |||
| 1657 | * **Metadata:** Title, author, publisher, URL, publication date | ||
| 1658 | * **Short excerpts:** Max 25 words per quote, max 3 quotes per evidence item | ||
| 1659 | * **Summaries:** AI-generated bullet points (not verbatim text) | ||
| 1660 | * **No full articles:** Never store complete article text beyond job processing | ||
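The excerpt limits above could be enforced with a small check before an evidence item is cached. The sketch below is illustrative only: the function name ``check_excerpts`` is hypothetical, and counting words by splitting on whitespace is an assumption, since the spec does not define how words are counted.

```python
from typing import List

# Limits taken from the policy above.
MAX_WORDS_PER_QUOTE = 25
MAX_QUOTES_PER_EVIDENCE = 3


def check_excerpts(quotes: List[str]) -> List[str]:
    """Return human-readable violations; an empty list means the item is within limits."""
    problems = []
    if len(quotes) > MAX_QUOTES_PER_EVIDENCE:
        problems.append(
            f"{len(quotes)} quotes exceeds the limit of {MAX_QUOTES_PER_EVIDENCE}"
        )
    for index, quote in enumerate(quotes):
        # Assumption: a "word" is any whitespace-separated token.
        word_count = len(quote.split())
        if word_count > MAX_WORDS_PER_QUOTE:
            problems.append(
                f"quote {index} has {word_count} words (limit {MAX_WORDS_PER_QUOTE})"
            )
    return problems
```

Returning a list of violations rather than raising on the first one lets a caller log every problem for an evidence item in a single pass.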
| 1661 | |||
| 1662 | **Total per Cached Claim:** | ||
| 1663 | |||
| 1664 | * Scenarios: 2 per claim | ||
| 1665 | * Evidence items: 6 per scenario (12 total) | ||
| 1666 | * Quotes: 3 per evidence × 25 words = 75 words per item | ||
| 1667 | * **Maximum stored verbatim text:** ~~900 words per claim (12 × 75) | ||
| 1668 | |||
| 1669 | **Retention:** | ||
| 1670 | |||
| 1671 | * Cache TTL: 90 days | ||
| 1672 | * Job outputs: 24 hours (then archived or deleted) | ||
| 1673 | * No persistent full-text article storage | ||
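The retention periods above can be expressed as constants and converted to seconds, for example for use as a cache expiry value. This is a sketch under assumptions: the constant and function names are hypothetical, and the spec does not say how expiry is actually applied.

```python
from datetime import datetime, timedelta, timezone

# Retention periods from the policy above.
CACHE_TTL = timedelta(days=90)
JOB_OUTPUT_RETENTION = timedelta(hours=24)


def ttl_seconds(ttl: timedelta) -> int:
    """Convert a retention period to whole seconds."""
    return int(ttl.total_seconds())


def is_expired(created_at: datetime, retention: timedelta, now: datetime) -> bool:
    """True once the retention period has fully elapsed since creation."""
    return now - created_at >= retention


if __name__ == "__main__":
    created = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(ttl_seconds(CACHE_TTL))  # 7776000 (90 days * 86400 seconds)
    print(is_expired(created, JOB_OUTPUT_RETENTION, created + timedelta(hours=25)))  # True
```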
| 1674 | |||
| 1675 | **Rationale:** | ||
| 1676 | |||
| 1677 | * Short excerpts quoted for citation are intended to fall within fair use | ||
| 1678 | * Summaries are transformative: they paraphrase the source rather than reproduce its text | ||
| 1679 | * Limited retention (90 days max) | ||
| 1680 | * No commercial republication of excerpts | ||
| 1681 | |||
| 1682 | **DMCA Compliance:** | ||
| 1683 | |||
| 1684 | * Cache invalidation endpoint available for rights holders | ||
| 1685 | * Contact: dmca@factharbor.org | ||
| 1686 | |||
| 1687 | ---- | ||
| 1688 | |||
| 1689 | == Summary == | ||
| 1690 | |||
| 1691 | This WYSIWYG preview shows the **structure and key sections** of the 1,515-line API specification. | ||
| 1692 | |||
| 1693 | **Full specification includes:** | ||
| 1694 | |||
| 1695 | * Complete API endpoints (7 total) | ||
| 1696 | * All data schemas (ClaimExtraction, ClaimAnalysis, HolisticAssessment, Complete) | ||
| 1697 | * Quality gates & validation rules | ||
| 1698 | * LLM configuration for all 3 stages | ||
| 1699 | * Implementation notes with code samples | ||
| 1700 | * Testing strategy | ||
| 1701 | * Cross-references to other pages | ||
| 1702 | |||
| 1703 | **The complete specification is available in:** | ||
| 1704 | |||
| 1705 | * This page (authoritative canonical contract; 45 KB standalone) | ||
| 1706 | * Export files (TEST/PRODUCTION) for xWiki import |