Last modified by Robert Schaub on 2025/12/24 18:26

From version 5.1
edited by Robert Schaub
on 2025/12/24 17:59
Change comment: Imported from XAR
To version 2.1
edited by Robert Schaub
on 2025/12/24 13:58
Change comment: Imported from XAR

Page properties
Title
... ... @@ -1,1 +1,1 @@
1 -POC1 API & Schemas Specification
1 +POC1 API & Schemas Specification v0.4.1
Content
... ... @@ -1,25 +1,44 @@
1 -= POC1 API & Schemas Specification =
1 +# FactHarbor POC1 API & Schemas Specification
2 2  
3 -----
3 +**Version:** 0.4.1 (POC1 - 3-Stage Caching Architecture)
4 +**Namespace:** FactHarbor.*
5 +**Syntax:** xWiki 2.1
6 +**Last Updated:** 2025-12-24
4 4  
8 +---
9 +
5 5  == Version History ==
6 6  
7 7  |=Version|=Date|=Changes
8 8  |0.4.1|2025-12-24|Applied 9 critical fixes: file format notice, verdict taxonomy, canonicalization algorithm, Stage 1 cost policy, BullMQ fix, language in cache key, historical claims TTL, idempotency, copyright policy
9 9  |0.4|2025-12-24|**BREAKING:** 3-stage pipeline with claim-level caching, user tier system, cache-only mode for free users, Redis cache architecture
10 -|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints
11 -|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details
15 +|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints, chain-of-thought, evidence citation, Jina safety, gate numbering
16 +|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details, quality gate logging, temporal separation note, cross-references
17 +|0.2|2025-12-24|Initial rebased version with holistic assessment
18 +|0.1|2025-12-24|Original specification
12 12  
13 -----
20 +---
21 +---
14 14  
15 -== 1. Core Objective (POC1) ==
23 +== File Format Notice ==
16 16  
17 -The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability.
25 +**⚠️ Important:** This file is stored as {{code}}.md{{/code}} for transport/versioning, but the content is **xWiki 2.1 syntax** (not Markdown).
18 18  
19 -The system must prove that AI can identify an article's **Main Thesis** and determine if supporting claims logically support that thesis without committing fallacies.
27 +**When importing to xWiki:**
28 +* Use "Import as XWiki content" (not "Import as Markdown")
29 +* The xWiki parser will correctly interpret {{code}}=={{/code}} headers, {{{{{code}}}}} blocks, etc.
20 20  
21 -=== Success Criteria: ===
31 +**Alternate naming:** If your workflow supports it, rename to {{code}}.xwiki.txt{{/code}} to avoid ambiguity.
22 22  
33 +---
34 +
35 +== 1. Core Objective (POC1) ==
36 +
37 +The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability.
38 +
39 +The system must prove that AI can identify an article's **Main Thesis** and determine if the supporting claims (even if individually accurate) logically support that thesis without committing fallacies (e.g., correlation vs. causation, cherry-picking, hasty generalization).
40 +
41 +**Success Criteria:**
23 23  * Test with 30 diverse articles
24 24  * Target: ≥70% accuracy detecting misleading articles
25 25  * Cost: <$0.25 per NEW analysis (uncached)
... ... @@ -27,13 +27,14 @@
27 27  * Cache hit rate: ≥50% after 1,000 articles
28 28  * Processing time: <2 minutes (standard depth)
29 29  
30 -=== Economic Model: ===
49 +**Economic Model:**
50 +* Free tier: $10 credit per month (~40-140 articles depending on cache hits)
51 +* After limit: Cache-only mode (instant, free access to cached claims)
52 +* Paid tier: Unlimited new analyses
31 31  
32 -* **Free tier:** $10 credit per month (~~40-140 articles depending on cache hits)
33 -* **After limit:** Cache-only mode (instant, free access to cached claims)
34 -* **Paid tier:** Unlimited new analyses
54 +**See:** [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]] for complete investigation of 7 approaches.
35 35  
36 -----
56 +---
37 37  
38 38  == 2. Architecture Overview ==
39 39  
... ... @@ -41,61 +41,52 @@
41 41  
42 42  FactHarbor POC1 uses a **3-stage architecture** designed for claim-level caching and cost efficiency:
43 43  
44 -{{mermaid}}
64 +{{code language="mermaid"}}
45 45  graph TD
46 - A[Article Input] --> B[Stage 1: Extract Claims]
47 - B --> C{For Each Claim}
48 - C --> D[Check Cache]
49 - D -->|Cache HIT| E[Return Cached Verdict]
50 - D -->|Cache MISS| F[Stage 2: Analyze Claim]
51 - F --> G[Store in Cache]
52 - G --> E
53 - E --> H[Stage 3: Holistic Assessment]
54 - H --> I[Final Report]
55 -{{/mermaid}}
66 + A[Article Input] --> B[Stage 1: Extract Claims]
67 + B --> C{For Each Claim}
68 + C --> D[Check Cache]
69 + D -->|Cache HIT| E[Return Cached Verdict]
70 + D -->|Cache MISS| F[Stage 2: Analyze Claim]
71 + F --> G[Store in Cache]
72 + G --> E
73 + E --> H[Stage 3: Holistic Assessment]
74 + H --> I[Final Report]
75 +{{/code}}
56 56  
57 -==== Stage 1: Claim Extraction (Haiku, no cache) ====
77 +**Stage 1: Claim Extraction** (Haiku, no cache)
78 +* Input: Article text
79 +* Output: 5 canonical claims (normalized, deduplicated)
80 +* Model: Claude Haiku 4
81 +* Cost: $0.003 per article
82 +* Cache strategy: No caching (article-specific)
58 58  
59 -* **Input:** Article text
60 -* **Output:** 5 canonical claims (normalized, deduplicated)
61 -* **Model:** Claude Haiku 4 (default, configurable via LLM abstraction layer)
62 -* **Cost:** $0.003 per article
63 -* **Cache strategy:** No caching (article-specific)
84 +**Stage 2: Claim Analysis** (Sonnet, CACHED)
85 +* Input: Single canonical claim
86 +* Output: Scenarios + Evidence + Verdicts
87 +* Model: Claude Sonnet 3.5
88 +* Cost: $0.081 per NEW claim
89 +* Cache strategy: **Redis, 90-day TTL**
90 +* Cache key: {{code}}claim:v1norm1:{language}:{sha256(canonical_claim)}{{/code}}
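The cache key layout above can be sketched as a small helper. This is only an illustration: the function name `claimCacheKey`, the lowercase hex encoding of the digest, and the UTF-8 input encoding are assumptions, and the claim is assumed to have been canonicalized by Stage 1 already.

```typescript
import { createHash } from "node:crypto";

// Key layout taken from the spec: claim:v1norm1:{language}:{sha256(canonical_claim)}
const KEY_PREFIX = "claim:v1norm1";

// Hypothetical helper. It only hashes the claim text; it does not canonicalize it.
// Hex output and UTF-8 input encoding are assumptions, not stated in the spec.
function claimCacheKey(language: string, canonicalClaim: string): string {
  const digest = createHash("sha256").update(canonicalClaim, "utf8").digest("hex");
  return `${KEY_PREFIX}:${language}:${digest}`;
}
```

Because the language is part of the key, the same canonical text in two languages maps to two separate cache entries.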
64 64  
65 -==== Stage 2: Claim Analysis (Sonnet, CACHED) ====
92 +**Stage 3: Holistic Assessment** (Sonnet, no cache)
93 +* Input: Article + Claim verdicts (from cache or Stage 2)
94 +* Output: Article verdict + Fallacies + Logic quality
95 +* Model: Claude Sonnet 3.5
96 +* Cost: $0.030 per article
97 +* Cache strategy: No caching (article-specific)
66 66  
67 -* **Input:** Single canonical claim
68 -* **Output:** Scenarios + Evidence + Verdicts
69 -* **Model:** Claude Sonnet 3.5 (default, configurable via LLM abstraction layer)
70 -* **Cost:** $0.081 per NEW claim
71 -* **Cache strategy:** Redis, 90-day TTL
72 -* **Cache key:** claim:v1norm1:{language}:{sha256(canonical_claim)}
99 +**Total Cost Formula:**
100 +{{code}}
101 +Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)
73 73  
74 -==== Stage 3: Holistic Assessment (Sonnet, no cache) ====
75 -
76 -* **Input:** Article + Claim verdicts (from cache or Stage 2)
77 -* **Output:** Article verdict + Fallacies + Logic quality
78 -* **Model:** Claude Sonnet 3.5 (default, configurable via LLM abstraction layer)
79 -* **Cost:** $0.030 per article
80 -* **Cache strategy:** No caching (article-specific)
81 -
82 -
83 -
84 -**Note:** Stage 3 implements **Approach 1 (Single-Pass Holistic Analysis)** from the [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]. While claim analysis (Stage 2) is cached for efficiency, the holistic assessment maintains the integrated evaluation philosophy of Approach 1.
85 -
86 -=== Total Cost Formula: ===
87 -
88 -{{{Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)
89 -
90 90  Examples:
91 91  - 0 new claims (100% cache hit): $0.033
92 92  - 1 new claim (80% cache hit): $0.114
93 93  - 3 new claims (40% cache hit): $0.276
94 94  - 5 new claims (0% cache hit): $0.438
95 -}}}
108 +{{/code}}
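The cost formula and its four examples can be checked with a minimal sketch. The function name `estimateArticleCost` is hypothetical; the per-stage prices are the ones quoted in the formula. Prices are held as integer thousandths of a dollar so the sums stay exact.

```typescript
// Per-stage prices from the formula, in thousandths of a dollar.
const EXTRACTION_MILLIS = 3; // $0.003 per article (Stage 1)
const NEW_CLAIM_MILLIS = 81; // $0.081 per NEW claim (Stage 2)
const HOLISTIC_MILLIS = 30;  // $0.030 per article (Stage 3)

// Hypothetical helper: returns the estimated cost in dollars for one article.
function estimateArticleCost(newClaims: number): number {
  if (!Number.isInteger(newClaims) || newClaims < 0) {
    throw new RangeError("newClaims must be a non-negative integer");
  }
  const millis = EXTRACTION_MILLIS + newClaims * NEW_CLAIM_MILLIS + HOLISTIC_MILLIS;
  return millis / 1000;
}
```

Cached claims add nothing to the total, so only the number of new claims is needed as input.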
96 96  
97 -----
98 -
99 99  === 2.2 User Tier System ===
100 100  
101 101  |=Tier|=Monthly Credit|=After Limit|=Cache Access|=Analytics
... ... @@ -104,21 +104,17 @@
104 104  |**Enterprise** (future)|Custom|Continues|✅ Full + Priority|Full
105 105  
106 106  **Free Tier Economics:**
107 -
108 108  * $10 credit = 40-140 articles analyzed (depending on cache hit rate)
109 109  * Average 70 articles/month at 70% cache hit rate
110 -* After limit: Cache-only mode
120 +* After limit: Cache-only mode (see Section 2.3)
111 111  
112 -----
113 -
114 114  === 2.3 Cache-Only Mode (Free Tier Feature) ===
115 115  
116 116  When free users reach their $10 monthly limit, they enter **Cache-Only Mode**:
117 117  
118 -==== What Cache-Only Mode Provides: ====
126 +**What Cache-Only Mode Provides:**
119 119  
120 120  ✅ **Claim Extraction (Platform-Funded):**
121 -
122 122  * Stage 1 extraction runs at $0.003 per article
123 123  * **Cost: Absorbed by platform** (not charged to user credit)
124 124  * Rationale: Extraction is necessary to check cache, and cost is negligible
... ... @@ -125,560 +125,628 @@
125 125  * Rate limit: Max 50 extractions/day in cache-only mode (prevents abuse)
126 126  
127 127  ✅ **Instant Access to Cached Claims:**
128 -
129 129  * Any claim that exists in cache → Full verdict returned
130 130  * Cost: $0 (no LLM calls)
131 131  * Response time: <100ms
132 132  
133 133  ✅ **Partial Article Analysis:**
134 -
135 135  * Check each claim against cache
136 136  * Return verdicts for ALL cached claims
137 -* For uncached claims: Return "status": "cache_miss"
142 +* For uncached claims: Return {{code}}"status": "cache_miss"{{/code}}
138 138  
139 139  ✅ **Cache Coverage Report:**
140 -
141 141  * "3 of 5 claims available in cache (60% coverage)"
142 142  * Links to cached analyses
143 143  * Estimated cost to complete: $0.162 (2 new claims)
144 144  
145 145  ❌ **Not Available in Cache-Only Mode:**
146 -
147 147  * New claim analysis (Stage 2 LLM calls blocked)
148 148  * Full holistic assessment (Stage 3 blocked if any claims missing)
149 149  
150 -==== User Experience Example: ====
151 -
152 -{{{{
153 - "status": "cache_only_mode",
154 - "message": "Monthly credit limit reached. Showing cached results only.",
155 - "cache_coverage": {
156 - "claims_total": 5,
157 - "claims_cached": 3,
158 - "claims_missing": 2,
159 - "coverage_percent": 60
160 - },
161 - "cached_claims": [
162 - {"claim_id": "C1", "verdict": "Likely", "confidence": 0.82},
163 - {"claim_id": "C2", "verdict": "Highly Likely", "confidence": 0.91},
164 - {"claim_id": "C4", "verdict": "Unclear", "confidence": 0.55}
165 - ],
166 - "missing_claims": [
167 - {"claim_id": "C3", "claim_text": "...", "estimated_cost": "$0.081"},
168 - {"claim_id": "C5", "claim_text": "...", "estimated_cost": "$0.081"}
169 - ],
170 - "upgrade_options": {
171 - "top_up": "$5 for 20-70 more articles",
172 - "pro_tier": "$50/month unlimited"
173 - }
153 +**User Experience:**
154 +{{code language="json"}}
155 +{
156 + "status": "cache_only_mode",
157 + "message": "Monthly credit limit reached. Showing cached results only.",
158 + "cache_coverage": {
159 + "claims_total": 5,
160 + "claims_cached": 3,
161 + "claims_missing": 2,
162 + "coverage_percent": 60
163 + },
164 + "cached_claims": [
165 + {"claim_id": "C1", "verdict": "Likely", "confidence": 0.82},
166 + {"claim_id": "C2", "verdict": "Highly Likely", "confidence": 0.91},
167 + {"claim_id": "C4", "verdict": "Unclear", "confidence": 0.55}
168 + ],
169 + "missing_claims": [
170 + {"claim_id": "C3", "claim_text": "...", "estimated_cost": "$0.081"},
171 + {"claim_id": "C5", "claim_text": "...", "estimated_cost": "$0.081"}
172 + ],
173 + "upgrade_options": {
174 + "top_up": "$5 for 20-70 more articles",
175 + "pro_tier": "$50/month unlimited"
176 + }
174 174  }
175 -}}}
178 +{{/code}}
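The `cache_coverage` object in the example above could be derived from per-claim cache lookups roughly as follows. This is a sketch under assumptions: `summarizeCoverage` and the `estimated_cost_to_complete` field are hypothetical names, and rounding the percentage to a whole number is a guess based on the example value of 60.

```typescript
interface CacheCoverage {
  claims_total: number;
  claims_cached: number;
  claims_missing: number;
  coverage_percent: number;
  estimated_cost_to_complete: number; // hypothetical field, not part of the spec
}

const NEW_CLAIM_COST_MILLIS = 81; // $0.081 per new claim, in thousandths of a dollar

// Hypothetical helper: each array entry says whether that claim was found in cache.
function summarizeCoverage(cacheHits: boolean[]): CacheCoverage {
  const total = cacheHits.length;
  const cached = cacheHits.filter((hit) => hit).length;
  const missing = total - cached;
  return {
    claims_total: total,
    claims_cached: cached,
    claims_missing: missing,
    coverage_percent: total === 0 ? 0 : Math.round((cached / total) * 100),
    estimated_cost_to_complete: (missing * NEW_CLAIM_COST_MILLIS) / 1000,
  };
}
```

For three hits out of five claims this reproduces the 60% coverage and the $0.162 completion estimate quoted in the coverage report.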
176 176  
177 177  **Design Rationale:**
178 -
179 179  * Free users still get value (cached claims often answer their question)
180 180  * Demonstrates FactHarbor's value (partial results encourage upgrade)
181 181  * Sustainable for platform (no additional cost)
182 182  * Fair to all users (everyone contributes to cache)
183 183  
184 -----
186 +---
185 185  
188 +== 3. REST API Contract ==
186 186  
190 +=== 3.1 User Credit Tracking ===
187 187  
188 -== 6. LLM Abstraction Layer ==
192 +**Endpoint:** {{code}}GET /v1/user/credit{{/code}}
189 189  
190 -=== 6.1 Design Principle ===
194 +**Response:** {{code}}200 OK{{/code}}
191 191  
192 -**FactHarbor uses provider-agnostic LLM abstraction** to avoid vendor lock-in and enable:
193 -
194 -* **Provider switching:** Change LLM providers without code changes
195 -* **Cost optimization:** Use different providers for different stages
196 -* **Resilience:** Automatic fallback if primary provider fails
197 -* **Cross-checking:** Compare outputs from multiple providers
198 -* **A/B testing:** Test new models without deployment changes
199 -
200 -**Implementation:** All LLM calls go through an abstraction layer that routes to configured providers.
201 -
202 -----
203 -
204 -=== 6.2 LLM Provider Interface ===
205 -
206 -**Abstract Interface:**
207 -
208 -{{{
209 -interface LLMProvider {
210 - // Core methods
211 - complete(prompt: string, options: CompletionOptions): Promise<CompletionResponse>
212 - stream(prompt: string, options: CompletionOptions): AsyncIterator<StreamChunk>
213 -
214 - // Provider metadata
215 - getName(): string
216 - getMaxTokens(): number
217 - getCostPer1kTokens(): { input: number, output: number }
218 -
219 - // Health check
220 - isAvailable(): Promise<boolean>
196 +{{code language="json"}}
197 +{
198 + "user_id": "user_abc123",
199 + "tier": "free",
200 + "credit_limit": 10.00,
201 + "credit_used": 7.42,
202 + "credit_remaining": 2.58,
203 + "reset_date": "2025-02-01T00:00:00Z",
204 + "cache_only_mode": false,
205 + "usage_stats": {
206 + "articles_analyzed": 67,
207 + "claims_from_cache": 189,
208 + "claims_newly_analyzed": 113,
209 + "cache_hit_rate": 0.626
210 + }
221 221  }
212 +{{/code}}
222 222  
223 -interface CompletionOptions {
224 - model?: string
225 - maxTokens?: number
226 - temperature?: number
227 - stopSequences?: string[]
228 - systemPrompt?: string
229 -}
230 -}}}
214 +---
231 231  
232 -----
216 +=== 3.2 Create Analysis Job (3-Stage) ===
233 233  
234 -=== 6.3 Supported Providers (POC1) ===
218 +**Endpoint:** {{code}}POST /v1/analyze{{/code}}
235 235  
236 -**Primary Provider (Default):**
220 +**Request Body:** See the complete JSON example that follows the Idempotency Support subsection.
237 237  
238 -* **Anthropic Claude API**
239 - * Models: Claude Haiku 4, Claude Sonnet 3.5, Claude Opus 4
240 - * Used by default in POC1
241 - * Best quality for holistic analysis
242 242  
243 -**Secondary Providers (Future):**
223 +**Idempotency Support:**
244 244  
245 -* **OpenAI API**
246 - * Models: GPT-4o, GPT-4o-mini
247 - * For cost comparison
248 -
249 -* **Google Vertex AI**
250 - * Models: Gemini 1.5 Pro, Gemini 1.5 Flash
251 - * For diversity in evidence gathering
225 +To prevent duplicate job creation on network retries, clients SHOULD include:
252 252  
253 -* **Local Models** (Post-POC)
254 - * Models: Llama 3.1, Mistral
255 - * For privacy-sensitive deployments
227 +{{code language="http"}}
228 +POST /v1/analyze
229 +Idempotency-Key: {client-generated-uuid}
230 +{{/code}}
256 256  
257 -----
232 +OR use the {{code}}client.request_id{{/code}} field:
258 258  
259 -=== 6.4 Provider Configuration ===
234 +{{code language="json"}}
235 +{
236 + "input_url": "...",
237 + "client": {
238 + "request_id": "client-uuid-12345",
239 + "source_label": "optional"
240 + }
241 +}
242 +{{/code}}
260 260  
261 -**Environment Variables:**
244 +**Server Behavior:**
245 +* If {{code}}Idempotency-Key{{/code}} or {{code}}request_id{{/code}} seen before (within 24 hours):
246 + - Return existing job ({{code}}200 OK{{/code}}, not {{code}}202 Accepted{{/code}})
247 + - Do NOT create duplicate job or charge twice
248 +* Idempotency keys expire after 24 hours (matches job retention)
262 262  
263 -{{{
264 -# Primary provider
265 -LLM_PRIMARY_PROVIDER=anthropic
266 -ANTHROPIC_API_KEY=sk-ant-...
250 +**Example Response (Idempotent):**
251 +{{code language="json"}}
252 +{
253 + "job_id": "01J...ULID",
254 + "status": "RUNNING",
255 + "idempotent": true,
256 + "original_request_at": "2025-12-24T10:31:00Z",
257 + "message": "Returning existing job (idempotency key matched)"
258 +}
259 +{{/code}}
267 267  
268 -# Fallback provider
269 -LLM_FALLBACK_PROVIDER=openai
270 -OPENAI_API_KEY=sk-...
271 271  
272 -# Provider selection per stage
273 -LLM_STAGE1_PROVIDER=anthropic
274 -LLM_STAGE1_MODEL=claude-haiku-4
275 -LLM_STAGE2_PROVIDER=anthropic
276 -LLM_STAGE2_MODEL=claude-sonnet-3-5
277 -LLM_STAGE3_PROVIDER=anthropic
278 -LLM_STAGE3_MODEL=claude-sonnet-3-5
279 -
280 -# Cost limits
281 -LLM_MAX_COST_PER_REQUEST=1.00
282 -}}}
283 -
284 -**Database Configuration (Alternative):**
285 -
286 -{{{{
262 +{{code language="json"}}
287 287  {
288 - "providers": [
289 - {
290 - "name": "anthropic",
291 - "api_key_ref": "vault://anthropic-api-key",
292 - "enabled": true,
293 - "priority": 1
294 - },
295 - {
296 - "name": "openai",
297 - "api_key_ref": "vault://openai-api-key",
298 - "enabled": true,
299 - "priority": 2
300 - }
301 - ],
302 - "stage_config": {
303 - "stage1": {
304 - "provider": "anthropic",
305 - "model": "claude-haiku-4",
306 - "max_tokens": 4096,
307 - "temperature": 0.0
308 - },
309 - "stage2": {
310 - "provider": "anthropic",
311 - "model": "claude-sonnet-3-5",
312 - "max_tokens": 16384,
313 - "temperature": 0.3
314 - },
315 - "stage3": {
316 - "provider": "anthropic",
317 - "model": "claude-sonnet-3-5",
318 - "max_tokens": 8192,
319 - "temperature": 0.2
320 - }
264 + "input_type": "url",
265 + "input_url": "https://example.com/medical-report-01",
266 + "input_text": null,
267 + "options": {
268 + "browsing": "on",
269 + "depth": "standard",
270 + "max_claims": 5,
271 + "context_aware_analysis": true,
272 + "cache_preference": "prefer_cache"
273 + },
274 + "client": {
275 + "request_id": "optional-client-tracking-id",
276 + "source_label": "optional"
321 321   }
322 322  }
323 -}}}
279 +{{/code}}
324 324  
325 -----
281 +**Options:**
282 +* {{code}}cache_preference{{/code}}: {{code}}prefer_cache{{/code}} | {{code}}require_fresh{{/code}} | {{code}}allow_partial{{/code}}
283 + - {{code}}prefer_cache{{/code}}: Use cache when available, analyze new claims (default)
284 + - {{code}}require_fresh{{/code}}: Force re-analysis of all claims (ignores cache, costs more)
285 + - {{code}}allow_partial{{/code}}: Return partial results if some claims uncached (for free tier cache-only mode)
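The three `cache_preference` values reduce to a per-claim decision. The sketch below assumes a hypothetical `planClaim` function and hypothetical action names; only the preference values themselves come from the spec.

```typescript
type CachePreference = "prefer_cache" | "require_fresh" | "allow_partial";
type ClaimAction = "use_cache" | "analyze" | "report_cache_miss";

// Hypothetical helper: decides what to do with one claim given the request option.
function planClaim(preference: CachePreference, isCached: boolean): ClaimAction {
  if (preference === "require_fresh") {
    return "analyze"; // ignore the cache entirely
  }
  if (isCached) {
    return "use_cache"; // both remaining modes reuse cached verdicts
  }
  // Cache miss: allow_partial reports the gap, prefer_cache runs Stage 2.
  return preference === "allow_partial" ? "report_cache_miss" : "analyze";
}
```

Only `allow_partial` ever leaves a claim unanalyzed, which is why it is the mode suggested for cache-only free-tier requests.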
326 326  
327 -=== 6.5 Stage-Specific Models (POC1 Defaults) ===
287 +**Response:** {{code}}202 Accepted{{/code}}
328 328  
329 -**Stage 1: Claim Extraction**
330 -
331 -* **Default:** Anthropic Claude Haiku 4
332 -* **Alternative:** OpenAI GPT-4o-mini, Google Gemini 1.5 Flash
333 -* **Rationale:** Fast, cheap, simple task
334 -* **Cost:** ~$0.003 per article
335 -
336 -**Stage 2: Claim Analysis** (CACHEABLE)
337 -
338 -* **Default:** Anthropic Claude Sonnet 3.5
339 -* **Alternative:** OpenAI GPT-4o, Google Gemini 1.5 Pro
340 -* **Rationale:** High-quality analysis, cached 90 days
341 -* **Cost:** ~$0.081 per NEW claim
342 -
343 -**Stage 3: Holistic Assessment**
344 -
345 -* **Default:** Anthropic Claude Sonnet 3.5
346 -* **Alternative:** OpenAI GPT-4o, Claude Opus 4 (for high-stakes)
347 -* **Rationale:** Complex reasoning, logical fallacy detection
348 -* **Cost:** ~$0.030 per article
349 -
350 -**Cost Comparison (Example):**
351 -
352 -|=Stage|=Anthropic (Default)|=OpenAI Alternative|=Google Alternative
353 -|Stage 1|Claude Haiku 4 ($0.003)|GPT-4o-mini ($0.002)|Gemini Flash ($0.002)
354 -|Stage 2|Claude Sonnet 3.5 ($0.081)|GPT-4o ($0.045)|Gemini Pro ($0.050)
355 -|Stage 3|Claude Sonnet 3.5 ($0.030)|GPT-4o ($0.018)|Gemini Pro ($0.020)
356 -|**Total (1 new claim)**|**$0.114**|**$0.065**|**$0.072**
357 -
358 -**Note:** POC1 uses Anthropic exclusively for consistency. Multi-provider support planned for POC2.
359 -
360 -----
361 -
362 -=== 6.6 Failover Strategy ===
363 -
364 -**Automatic Failover:**
365 -
366 -{{{
367 -async function completeLLM(stage: string, prompt: string): Promise<string> {
368 - const primaryProvider = getProviderForStage(stage)
369 - const fallbackProvider = getFallbackProvider()
370 -
371 - try {
372 - return await primaryProvider.complete(prompt)
373 - } catch (error) {
374 - if (error.type === 'rate_limit' || error.type === 'service_unavailable') {
375 - logger.warn(`Primary provider failed, using fallback`)
376 - return await fallbackProvider.complete(prompt)
377 - }
378 - throw error
289 +{{code language="json"}}
290 +{
291 + "job_id": "01J...ULID",
292 + "status": "QUEUED",
293 + "created_at": "2025-12-24T10:31:00Z",
294 + "estimated_cost": 0.114,
295 + "cost_breakdown": {
296 + "stage1_extraction": 0.003,
297 + "stage2_new_claims": 0.081,
298 + "stage2_cached_claims": 0.000,
299 + "stage3_holistic": 0.030
300 + },
301 + "cache_info": {
302 + "claims_to_extract": 5,
303 + "estimated_cache_hits": 4,
304 + "estimated_new_claims": 1
305 + },
306 + "links": {
307 + "self": "/v1/jobs/01J...ULID",
308 + "result": "/v1/jobs/01J...ULID/result",
309 + "report": "/v1/jobs/01J...ULID/report",
310 + "events": "/v1/jobs/01J...ULID/events"
379 379   }
380 380  }
381 -}}}
313 +{{/code}}
382 382  
383 -**Fallback Priority:**
315 +**Error Responses:**
384 384  
385 -1. **Primary:** Configured provider for stage
386 -2. **Secondary:** Fallback provider (if configured)
387 -3. **Cache:** Return cached result (if available for Stage 2)
388 -4. **Error:** Return 503 Service Unavailable
317 +{{code}}402 Payment Required{{/code}} - Free tier limit reached, cache-only mode
318 +{{code language="json"}}
319 +{
320 + "error": "credit_limit_reached",
321 + "message": "Monthly credit limit reached. Entering cache-only mode.",
322 + "cache_only_mode": true,
323 + "credit_remaining": 0.00,
324 + "reset_date": "2025-02-01T00:00:00Z",
325 + "action": "Resubmit with cache_preference=allow_partial for cached results"
326 +}
327 +{{/code}}
389 389  
390 -----
329 +---
391 391  
392 -=== 6.7 Provider Selection API ===
331 +=== 3.3 Get Job Status ===
393 393  
394 -**Admin Endpoint:** POST /admin/v1/llm/configure
333 +**Endpoint:** {{code}}GET /v1/jobs/{job_id}{{/code}}
395 395  
396 -**Update provider for specific stage:**
335 +**Response:** {{code}}200 OK{{/code}}
397 397  
398 -{{{{
337 +{{code language="json"}}
399 399  {
400 - "stage": "stage2",
401 - "provider": "openai",
402 - "model": "gpt-4o",
403 - "max_tokens": 16384,
404 - "temperature": 0.3
405 -}
406 -}}}
407 -
408 -**Response:** 200 OK
409 -
410 -{{{{
411 -{
412 - "message": "LLM configuration updated",
413 - "stage": "stage2",
414 - "previous": {
415 - "provider": "anthropic",
416 - "model": "claude-sonnet-3-5"
339 + "job_id": "01J...ULID",
340 + "status": "RUNNING",
341 + "created_at": "2025-12-24T10:31:00Z",
342 + "updated_at": "2025-12-24T10:31:22Z",
343 + "progress": {
344 + "stage": "stage2_claim_analysis",
345 + "percent": 65,
346 + "message": "Analyzing claim 3 of 5 (2 from cache)",
347 + "current_claim_id": "C3",
348 + "cache_hits": 2,
349 + "cache_misses": 1
417 417   },
418 - "current": {
419 - "provider": "openai",
420 - "model": "gpt-4o"
351 + "actual_cost": 0.084,
352 + "cost_breakdown": {
353 + "stage1_extraction": 0.003,
354 + "stage2_new_claims": 0.081,
355 + "stage2_cached_claims": 0.000,
356 + "stage3_holistic": null
421 421   },
422 - "cost_impact": {
423 - "previous_cost_per_claim": 0.081,
424 - "new_cost_per_claim": 0.045,
425 - "savings_percent": 44
426 - }
358 + "input_echo": {
359 + "input_type": "url",
360 + "input_url": "https://example.com/medical-report-01"
361 + },
362 + "links": {
363 + "self": "/v1/jobs/01J...ULID",
364 + "result": "/v1/jobs/01J...ULID/result",
365 + "report": "/v1/jobs/01J...ULID/report"
366 + },
367 + "error": null
427 427  }
428 -}}}
369 +{{/code}}
429 429  
430 -**Get current configuration:**
371 +---
431 431  
432 -GET /admin/v1/llm/config
373 +=== 3.4 Get Analysis Result ===
433 433  
434 -{{{{
375 +**Endpoint:** {{code}}GET /v1/jobs/{job_id}/result{{/code}}
376 +
377 +**Response:** {{code}}200 OK{{/code}}
378 +
379 +Returns complete **AnalysisResult** schema (see Section 4).
380 +
381 +**Cache-Only Mode Response:** {{code}}206 Partial Content{{/code}}
382 +
383 +{{code language="json"}}
435 435  {
436 - "providers": ["anthropic", "openai"],
437 - "primary": "anthropic",
438 - "fallback": "openai",
439 - "stages": {
440 - "stage1": {
441 - "provider": "anthropic",
442 - "model": "claude-haiku-4",
443 - "cost_per_request": 0.003
385 + "cache_only_mode": true,
386 + "cache_coverage": {
387 + "claims_total": 5,
388 + "claims_cached": 3,
389 + "claims_missing": 2,
390 + "coverage_percent": 60
391 + },
392 + "partial_result": {
393 + "metadata": {
394 + "job_id": "01J...ULID",
395 + "timestamp_utc": "2025-12-24T10:31:30Z",
396 + "engine_version": "POC1-v0.4",
397 + "cache_only": true
444 444   },
445 - "stage2": {
446 - "provider": "anthropic",
447 - "model": "claude-sonnet-3-5",
448 - "cost_per_new_claim": 0.081
449 - },
450 - "stage3": {
451 - "provider": "anthropic",
452 - "model": "claude-sonnet-3-5",
453 - "cost_per_request": 0.030
399 + "claims": [
400 + {
401 + "claim_id": "C1",
402 + "claim_text": "...",
403 + "canonical_claim": "...",
404 + "source": "cache",
405 + "cached_at": "2025-12-20T15:30:00Z",
406 + "cache_hit_count": 47,
407 + "scenarios": [...]
408 + },
409 + {
410 + "claim_id": "C3",
411 + "claim_text": "...",
412 + "canonical_claim": "...",
413 + "source": "not_analyzed",
414 + "status": "cache_miss",
415 + "estimated_cost": 0.081
416 + }
417 + ],
418 + "article_holistic_assessment": null,
419 + "upgrade_prompt": {
420 + "message": "Upgrade to Pro for full analysis of all claims",
421 + "missing_claims": 2,
422 + "cost_to_complete": 0.192
454 454   }
455 455   }
456 456  }
457 -}}}
426 +{{/code}}
458 458  
459 -----
428 +**Other Responses:**
429 +* {{code}}409 Conflict{{/code}} - Job not finished yet
430 +* {{code}}404 Not Found{{/code}} - Job ID unknown
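A client can branch on the four documented status codes of the result endpoint. The sketch below is an assumption about client-side handling, not part of the contract; `classifyResultStatus` and the outcome labels are hypothetical names.

```typescript
type ResultOutcome = "complete" | "partial_cache_only" | "not_finished" | "unknown_job";

// Hypothetical client helper for GET /v1/jobs/{job_id}/result.
function classifyResultStatus(httpStatus: number): ResultOutcome {
  switch (httpStatus) {
    case 200:
      return "complete"; // full AnalysisResult
    case 206:
      return "partial_cache_only"; // cache-only mode, some claims missing
    case 409:
      return "not_finished"; // job still queued or running
    case 404:
      return "unknown_job";
    default:
      throw new Error(`Status ${httpStatus} is not documented for this endpoint`);
  }
}
```

On `not_finished` a client would typically keep polling the job status or listen on the events stream before requesting the result again.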
460 460  
461 -=== 6.8 Implementation Notes ===
432 +---
462 462  
463 -**Provider Adapter Pattern:**
434 +=== 3.5 Stage-Specific Endpoints (Optional, Advanced) ===
464 464  
465 -{{{
466 -class AnthropicProvider implements LLMProvider {
467 - async complete(prompt: string, options: CompletionOptions) {
468 - const response = await anthropic.messages.create({
469 - model: options.model || 'claude-sonnet-3-5',
470 - max_tokens: options.maxTokens || 4096,
471 - messages: [{ role: 'user', content: prompt }],
472 - system: options.systemPrompt
473 - })
474 - return response.content[0].text
475 - }
476 -}
436 +For direct stage access (useful for cache debugging, custom workflows):
477 477  
478 -class OpenAIProvider implements LLMProvider {
479 - async complete(prompt: string, options: CompletionOptions) {
480 - const response = await openai.chat.completions.create({
481 - model: options.model || 'gpt-4o',
482 - max_tokens: options.maxTokens || 4096,
483 - messages: [
484 - { role: 'system', content: options.systemPrompt },
485 - { role: 'user', content: prompt }
486 - ]
487 - })
488 - return response.choices[0].message.content
489 - }
490 -}
491 -}}}
438 +**Extract Claims Only:**
439 +{{code}}POST /v1/analyze/extract-claims{{/code}}
492 492  
493 -**Provider Registry:**
441 +**Analyze Single Claim:**
442 +{{code}}POST /v1/analyze/claim{{/code}}
494 494  
495 -{{{
496 -const providers = new Map<string, LLMProvider>()
497 -providers.set('anthropic', new AnthropicProvider())
498 -providers.set('openai', new OpenAIProvider())
499 -providers.set('google', new GoogleProvider())
444 +**Assess Article (with claim verdicts):**
445 +{{code}}POST /v1/analyze/assess-article{{/code}}
500 500  
501 -function getProvider(name: string): LLMProvider {
502 - return providers.get(name) || providers.get(config.primaryProvider)
503 -}
504 -}}}
447 +**Check Claim Cache:**
448 +{{code}}GET /v1/cache/claim/{claim_hash}{{/code}}
505 505  
506 -----
450 +**Cache Statistics:**
451 +{{code}}GET /v1/cache/stats{{/code}}
507 507  
508 -== 3. REST API Contract ==
453 +---
509 509  
510 -=== 3.1 User Credit Tracking ===
455 +=== 3.6 Download Markdown Report ===
511 511  
512 -**Endpoint:** GET /v1/user/credit
457 +**Endpoint:** {{code}}GET /v1/jobs/{job_id}/report{{/code}}
513 513  
514 -**Response:** 200 OK
459 +**Response:** {{code}}200 OK{{/code}} with {{code}}text/markdown; charset=utf-8{{/code}} content
515 515  
516 -{{{{
517 - "user_id": "user_abc123",
518 - "tier": "free",
519 - "credit_limit": 10.00,
520 - "credit_used": 7.42,
521 - "credit_remaining": 2.58,
522 - "reset_date": "2025-02-01T00:00:00Z",
523 - "cache_only_mode": false,
524 - "usage_stats": {
525 - "articles_analyzed": 67,
526 - "claims_from_cache": 189,
527 - "claims_newly_analyzed": 113,
528 - "cache_hit_rate": 0.626
529 - }
530 -}
531 -}}}
461 +**Headers:**
462 +* {{code}}Content-Disposition: attachment; filename="factharbor_poc1_{job_id}.md"{{/code}}
532 532  
533 -----
464 +**Cache-Only Mode:** Report includes "Partial Analysis" watermark and upgrade prompt.
534 534  
535 -=== 3.2 Create Analysis Job (3-Stage) ===
466 +---
536 536  
537 -**Endpoint:** POST /v1/analyze
468 +=== 3.7 Stream Job Events (Backend Progress) ===
538 538  
539 -==== Idempotency Support: ====
470 +**Endpoint:** {{code}}GET /v1/jobs/{job_id}/events{{/code}}
540 540  
541 -To prevent duplicate job creation on network retries, clients SHOULD include:
472 +**Response:** Server-Sent Events (SSE) stream
542 542  
543 -{{{POST /v1/analyze
544 -Idempotency-Key: {client-generated-uuid}
545 -}}}
474 +**Event Types:**
475 +* {{code}}progress{{/code}} - Backend progress (e.g., "Stage 1: Extracting claims")
476 +* {{code}}cache_hit{{/code}} - Claim found in cache
477 +* {{code}}cache_miss{{/code}} - Claim requires new analysis
478 +* {{code}}stage_complete{{/code}} - Stage 1/2/3 finished
479 +* {{code}}complete{{/code}} - Job finished
480 +* {{code}}error{{/code}} - Error occurred
481 +* {{code}}credit_warning{{/code}} - User approaching limit
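On the wire, each event arrives as an SSE frame with `event:` and `data:` fields. A minimal frame parser restricted to the event types listed above might look like this; `parseSseFrame` is a hypothetical name, and the spec does not say what the `data` payload contains, so it is returned as a raw string.

```typescript
type JobEventType =
  | "progress"
  | "cache_hit"
  | "cache_miss"
  | "stage_complete"
  | "complete"
  | "error"
  | "credit_warning";

interface JobEvent {
  type: JobEventType;
  data: string; // raw payload; its format is not defined in this section
}

const JOB_EVENT_TYPES: readonly string[] = [
  "progress", "cache_hit", "cache_miss", "stage_complete",
  "complete", "error", "credit_warning",
];

// Parses one SSE frame (the text between two blank lines).
// Returns null for event names that are not in the list above.
function parseSseFrame(frame: string): JobEvent | null {
  let eventName = "message"; // SSE default when no "event:" field is present
  const dataLines: string[] = [];
  for (const line of frame.split(/\r?\n/)) {
    if (line === "" || line.startsWith(":")) continue; // blank line or comment
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
    if (field === "event") eventName = value;
    else if (field === "data") dataLines.push(value);
  }
  if (!JOB_EVENT_TYPES.includes(eventName)) return null;
  return { type: eventName as JobEventType, data: dataLines.join("\n") };
}
```

In a browser the built-in `EventSource` does this parsing itself; a hand-written parser like this is mainly useful for server-side consumers or tests.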
546 546  
547 -OR use the client.request_id field:
483 +---
548 548  
549 -{{{{
550 - "input_url": "...",
551 - "client": {
552 - "request_id": "client-uuid-12345",
553 - "source_label": "optional"
554 - }
555 -}
556 -}}}
485 +=== 3.8 Cancel Job ===
557 557  
558 -**Server Behavior:**
487 +**Endpoint:** {{code}}DELETE /v1/jobs/{job_id}{{/code}}
559 559  
560 -* If Idempotency-Key or request_id seen before (within 24 hours):
561 -** Return existing job (200 OK, not 202 Accepted)
562 -** Do NOT create duplicate job or charge twice
563 -* Idempotency keys expire after 24 hours (matches job retention)
489 +**Note:** If job is mid-stage (e.g., analyzing claim 3 of 5), user is charged for completed work only.
564 564  
565 -**Example Response (Idempotent):**
491 +---
566 566  
567 -{{{{
568 - "job_id": "01J...ULID",
569 - "status": "RUNNING",
570 - "idempotent": true,
571 - "original_request_at": "2025-12-24T10:31:00Z",
572 - "message": "Returning existing job (idempotency key matched)"
573 -}
574 -}}}
493 +=== 3.9 Health Check ===
575 575  
576 -==== Request Body: ====
495 +**Endpoint:** {{code}}GET /v1/health{{/code}}
577 577  
578 -{{{{
579 - "input_type": "url",
580 - "input_url": "https://example.com/medical-report-01",
581 - "input_text": null,
582 - "options": {
583 - "browsing": "on",
584 - "depth": "standard",
585 - "max_claims": 5,
586 - "scenarios_per_claim": 2,
587 - "max_evidence_per_scenario": 6,
588 - "context_aware_analysis": true
589 - },
590 - "client": {
591 - "request_id": "optional-client-tracking-id",
592 - "source_label": "optional"
593 - }
497 +{{code language="json"}}
498 +{
499 + "status": "ok",
500 + "version": "POC1-v0.4",
501 + "model_stage1": "claude-haiku-4",
502 + "model_stage2": "claude-3-5-sonnet-20241022",
503 + "model_stage3": "claude-3-5-sonnet-20241022",
504 + "cache": {
505 + "status": "connected",
506 + "total_claims": 12847,
507 + "avg_hit_rate_24h": 0.73
508 + }
594 594  }
595 -}}}
510 +{{/code}}
596 596  
597 -**Options:**
512 +---
598 598  
599 -* browsing: on | off (retrieve web sources or just output queries)
600 -* depth: standard | deep (evidence thoroughness)
601 -* max_claims: 1-10 (default: **5** for cost control)
602 -* scenarios_per_claim: 1-5 (default: **2** for cost control)
603 -* max_evidence_per_scenario: 3-10 (default: **6**)
604 -* context_aware_analysis: true | false (experimental)
514 +== 4. Data Schemas ==
605 605  
606 -**Response:** 202 Accepted
516 +=== 4.1 Stage 1 Output: ClaimExtraction ===
607 607  
608 -{{{{
609 - "job_id": "01J...ULID",
610 - "status": "QUEUED",
611 - "created_at": "2025-12-24T10:31:00Z",
612 - "estimated_cost": 0.114,
613 - "cost_breakdown": {
614 - "stage1_extraction": 0.003,
615 - "stage2_new_claims": 0.081,
616 - "stage2_cached_claims": 0.000,
617 - "stage3_holistic": 0.030
618 - },
619 - "cache_info": {
620 - "claims_to_extract": 5,
621 - "estimated_cache_hits": 4,
622 - "estimated_new_claims": 1
623 - },
624 - "links": {
625 - "self": "/v1/jobs/01J...ULID",
626 - "result": "/v1/jobs/01J...ULID/result",
627 - "report": "/v1/jobs/01J...ULID/report",
628 - "events": "/v1/jobs/01J...ULID/events"
629 - }
518 +{{code language="json"}}
519 +{
520 + "job_id": "01J...ULID",
521 + "stage": "stage1_extraction",
522 + "article_metadata": {
523 + "title": "Article title",
524 + "source_url": "https://example.com/article",
525 + "extracted_text_length": 5234,
526 + "language": "en"
527 + },
528 + "claims": [
529 + {
530 + "claim_id": "C1",
531 + "claim_text": "Original claim text from article",
532 + "canonical_claim": "Normalized, deduplicated phrasing",
533 + "claim_hash": "sha256:abc123...",
534 + "is_central_to_thesis": true,
535 + "claim_type": "causal",
536 + "evaluability": "evaluable",
537 + "risk_tier": "B",
538 + "domain": "public_health"
539 + }
540 + ],
541 + "article_thesis": "Main argument detected",
542 + "cost": 0.003
630 630  }
631 -}}}
544 +{{/code}}
632 632  
633 -**Error Responses:**
546 +=== 4.2 Stage 2 Output: ClaimAnalysis (CACHED) ===
634 634  
635 -402 Payment Required - Free tier limit reached, cache-only mode
548 +This is the cacheable unit: each ClaimAnalysis is stored in Redis with a 90-day TTL.
636 636  
637 -{{{{
638 - "error": "credit_limit_reached",
639 - "message": "Monthly credit limit reached. Entering cache-only mode.",
640 - "cache_only_mode": true,
641 - "credit_remaining": 0.00,
642 - "reset_date": "2025-02-01T00:00:00Z",
643 - "action": "Resubmit with cache_preference=allow_partial for cached results"
550 +{{code language="json"}}
551 +{
552 + "claim_hash": "sha256:abc123...",
553 + "canonical_claim": "COVID vaccines are 95% effective",
554 + "language": "en",
555 + "domain": "public_health",
556 + "analysis_version": "v1.0",
557 + "scenarios": [
558 + {
559 + "scenario_id": "S1",
560 + "scenario_title": "mRNA vaccines (Pfizer/Moderna) in clinical trials",
561 + "definitions": {"95% effective": "95% reduction in symptomatic infection"},
562 + "assumptions": ["Based on phase 3 trial data", "Against original strain"],
563 + "boundaries": {
564 + "time": "2020-2021 trials",
565 + "geography": "Multi-country trials",
566 + "population": "Adult population (16+)",
567 + "conditions": "Before widespread variants"
568 + },
569 + "verdict": {
570 + "label": "Highly Likely",
571 + "probability_range": [0.88, 0.97],
572 + "confidence": 0.92,
573 + "reasoning_chain": [
574 + "Pfizer/BioNTech trial: 95% efficacy (n=43,548)",
575 + "Moderna trial: 94.1% efficacy (n=30,420)",
576 + "Peer-reviewed publications in NEJM",
577 + "FDA independent analysis confirmed"
578 + ],
579 + "key_supporting_evidence_ids": ["E1", "E2"],
580 + "key_counter_evidence_ids": ["E3"],
581 + "uncertainty_factors": [
582 + "Limited data on long-term effectiveness",
583 + "Variant-specific performance not yet measured"
584 + ]
585 + },
586 + "evidence": [
587 + {
588 + "evidence_id": "E1",
589 + "stance": "supports",
590 + "relevance_to_scenario": 0.98,
591 + "evidence_summary": [
592 + "Pfizer trial showed 162 cases in placebo vs 8 in vaccine group",
593 + "Follow-up period median 2 months post-dose 2",
594 + "Efficacy consistent across age, sex, race, ethnicity"
595 + ],
596 + "citation": {
597 + "title": "Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine",
598 + "author_or_org": "Polack et al.",
599 + "publication_date": "2020-12-31",
600 + "url": "https://nejm.org/doi/full/10.1056/NEJMoa2034577",
601 + "publisher": "New England Journal of Medicine",
602 + "retrieved_at_utc": "2025-12-20T15:30:00Z"
603 + },
604 + "excerpt": ["The vaccine was 95% effective in preventing Covid-19"],
605 + "excerpt_word_count": 8,
606 + "source_reliability_score": 0.95,
607 + "reliability_justification": "Peer-reviewed, high-impact journal, large RCT",
608 + "limitations_and_reservations": [
609 + "Short follow-up period (2 months)",
610 + "Primarily measures symptomatic infection, not transmission"
611 + ],
612 + "retraction_or_dispute_signal": "none"
613 + }
614 + ]
615 + }
616 + ],
617 + "cache_metadata": {
618 + "first_analyzed": "2025-12-01T10:00:00Z",
619 + "last_updated": "2025-12-20T15:30:00Z",
620 + "hit_count": 47,
621 + "version": "v1.0",
622 + "ttl_expires": "2026-03-20T15:30:00Z"
623 + },
624 + "cost": 0.081
644 644  }
645 -}}}
626 +{{/code}}
646 646  
647 -----
628 +**Cache Key Structure:**
629 +{{code}}
630 +Redis Key: claim:v1norm1:{language}:{sha256(canonical_claim)}
631 +TTL: 90 days (7,776,000 seconds)
632 +Size: ~15KB JSON (compressed: ~5KB)
633 +{{/code}}
648 648  
649 -== 4. Data Schemas ==
635 +=== 4.3 Stage 3 Output: HolisticAssessment ===
650 650  
651 -=== 4.1 Stage 1 Output: ClaimExtraction ===
637 +{{code language="json"}}
638 +{
639 + "job_id": "01J...ULID",
640 + "stage": "stage3_holistic",
641 + "article_metadata": {
642 + "title": "...",
643 + "main_thesis": "...",
644 + "source_url": "..."
645 + },
646 + "article_holistic_assessment": {
647 + "overall_verdict": "MISLEADING",
648 + "logic_quality_score": 0.42,
649 + "fallacies_detected": [
650 + "correlation-causation",
651 + "cherry-picking"
652 + ],
653 + "verdict_reasoning": [
654 + "Central claim C1 is REFUTED by multiple systematic reviews",
655 + "Supporting claims C2-C4 are TRUE but do not support the thesis",
656 + "Article commits correlation-causation fallacy",
657 + "Selective citation of evidence (cherry-picking detected)"
658 + ],
659 + "experimental_feature": true
660 + },
661 + "claims_summary": [
662 + {
663 + "claim_id": "C1",
664 + "is_central_to_thesis": true,
665 + "verdict": "Refuted",
666 + "confidence": 0.89,
667 + "source": "cache",
668 + "cache_hit": true
669 + },
670 + {
671 + "claim_id": "C2",
672 + "is_central_to_thesis": false,
673 + "verdict": "Supported",
674 + "confidence": 0.91,
675 + "source": "new_analysis",
676 + "cache_hit": false
677 + }
678 + ],
679 + "quality_gates": {
680 + "gate1_claim_validation": "pass",
681 + "gate4_verdict_confidence": "pass",
682 + "passed_all": true
683 + },
684 + "cost": 0.030,
685 + "total_job_cost": 0.114
686 +}
687 +{{/code}}
652 652  
653 -{{{{
654 - "job_id": "01J...ULID",
655 - "stage": "stage1_extraction",
656 - "article_metadata": {
657 - "title": "Article title",
658 - "source_url": "https://example.com/article",
659 - "extracted_text_length": 5234,
660 - "language": "en"
661 - },
662 - "claims": [
663 - {
664 - "claim_id": "C1",
665 - "claim_text": "Original claim text from article",
666 - "canonical_claim": "Normalized, deduplicated phrasing",
667 - "claim_hash": "sha256:abc123...",
668 - "is_central_to_thesis": true,
669 - "claim_type": "causal",
670 - "evaluability": "evaluable",
671 - "risk_tier": "B",
672 - "domain": "public_health"
673 - }
674 - ],
675 - "article_thesis": "Main argument detected",
676 - "cost": 0.003
689 +=== 4.4 Complete AnalysisResult (All 3 Stages Combined) ===
690 +
691 +{{code language="json"}}
692 +{
693 + "metadata": {
694 + "job_id": "01J...ULID",
695 + "timestamp_utc": "2025-12-24T10:31:30Z",
696 + "engine_version": "POC1-v0.4",
697 + "llm_stage1": "claude-haiku-4",
698 + "llm_stage2": "claude-3-5-sonnet-20241022",
699 + "llm_stage3": "claude-3-5-sonnet-20241022",
700 + "usage_stats": {
701 + "stage1_tokens": {"input": 10000, "output": 500},
702 + "stage2_tokens": {"input": 2000, "output": 5000},
703 + "stage3_tokens": {"input": 5000, "output": 1000},
704 + "total_input_tokens": 17000,
705 + "total_output_tokens": 6500,
706 + "estimated_cost_usd": 0.114,
707 + "response_time_sec": 45.2
708 + },
709 + "cache_stats": {
710 + "claims_total": 5,
711 + "claims_from_cache": 4,
712 + "claims_new_analysis": 1,
713 + "cache_hit_rate": 0.80,
714 + "cache_savings_usd": 0.324
715 + }
716 + },
717 + "article_holistic_assessment": {
718 + "main_thesis": "...",
719 + "overall_verdict": "MISLEADING",
720 + "logic_quality_score": 0.42,
721 + "fallacies_detected": ["correlation-causation", "cherry-picking"],
722 + "verdict_reasoning": ["...", "...", "..."],
723 + "experimental_feature": true
724 + },
725 + "claims": [
726 + {
727 + "claim_id": "C1",
728 + "is_central_to_thesis": true,
729 + "claim_text": "...",
730 + "canonical_claim": "...",
731 + "claim_hash": "sha256:abc123...",
732 + "claim_type": "causal",
733 + "evaluability": "evaluable",
734 + "risk_tier": "B",
735 + "source": "cache",
736 + "cached_at": "2025-12-20T15:30:00Z",
737 + "cache_hit_count": 47,
738 + "scenarios": [...]
739 + },
740 + {
741 + "claim_id": "C2",
742 + "source": "new_analysis",
743 + "analyzed_at": "2025-12-24T10:31:15Z",
744 + "scenarios": [...]
745 + }
746 + ],
747 + "quality_gates": {
748 + "gate1_claim_validation": "pass",
749 + "gate4_verdict_confidence": "pass",
750 + "passed_all": true
751 + }
677 677  }
678 -}}}
753 +{{/code}}
679 679  
680 -----
681 681  
756 +
682 682  === 4.5 Verdict Label Taxonomy ===
683 683  
684 684  FactHarbor uses **three distinct verdict taxonomies** depending on analysis level:
... ... @@ -688,26 +688,23 @@
688 688  Used for individual scenario verdicts within a claim.
689 689  
690 690  **Enum Values:**
766 +* {{code}}Highly Likely{{/code}} - Probability 0.85-1.0, high confidence
767 +* {{code}}Likely{{/code}} - Probability 0.65-0.84, moderate-high confidence
768 +* {{code}}Unclear{{/code}} - Probability 0.35-0.64, or low confidence
769 +* {{code}}Unlikely{{/code}} - Probability 0.16-0.34, moderate-high confidence
770 +* {{code}}Highly Unlikely{{/code}} - Probability 0.0-0.15, high confidence
771 +* {{code}}Unsubstantiated{{/code}} - Insufficient evidence to determine probability
691 691  
692 -* Highly Likely - Probability 0.85-1.0, high confidence
693 -* Likely - Probability 0.65-0.84, moderate-high confidence
694 -* Unclear - Probability 0.35-0.64, or low confidence
695 -* Unlikely - Probability 0.16-0.34, moderate-high confidence
696 -* Highly Unlikely - Probability 0.0-0.15, high confidence
697 -* Unsubstantiated - Insufficient evidence to determine probability
698 -
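The probability bands above can be sketched as a small lookup. This is a minimal illustration, not the specified implementation: the helper name `scenario_label` is hypothetical, the bands are assumed to be contiguous half-open intervals (so a value such as 0.845 falls into `Likely`), and the "or low confidence" condition for `Unclear` is not modeled.

```python
from typing import Optional

# Lower bound of each band, checked from highest to lowest.
# Assumption: bands are contiguous, so each lower bound closes the gap
# left by the two-decimal ranges listed in the enum description.
_BANDS = [
    (0.85, "Highly Likely"),
    (0.65, "Likely"),
    (0.35, "Unclear"),
    (0.16, "Unlikely"),
    (0.0, "Highly Unlikely"),
]


def scenario_label(probability: Optional[float]) -> str:
    """Map a point probability to a scenario verdict label (hypothetical helper)."""
    if probability is None:
        # No probability could be determined: insufficient evidence.
        return "Unsubstantiated"
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be within [0.0, 1.0]")
    for lower_bound, label in _BANDS:
        if probability >= lower_bound:
            return label
    return "Highly Unlikely"  # not reached; the last band starts at 0.0
```

How a `probability_range` such as `[0.88, 0.97]` is reduced to a single value (midpoint, lower bound, or otherwise) is left open here because the specification does not say.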
699 699  ==== 4.5.2 Claim Verdict Labels (Rollup) ====
700 700  
701 701  Used when summarizing a claim across all scenarios.
702 702  
703 703  **Enum Values:**
778 +* {{code}}Supported{{/code}} - Majority of scenarios are Likely or Highly Likely
779 +* {{code}}Refuted{{/code}} - Majority of scenarios are Unlikely or Highly Unlikely
780 +* {{code}}Inconclusive{{/code}} - Mixed scenarios or majority Unclear/Unsubstantiated
704 704  
705 -* Supported - Majority of scenarios are Likely or Highly Likely
706 -* Refuted - Majority of scenarios are Unlikely or Highly Unlikely
707 -* Inconclusive - Mixed scenarios or majority Unclear/Unsubstantiated
708 -
709 709  **Mapping Logic:**
710 -
711 711  * If ≥60% scenarios are (Highly Likely | Likely) → Supported
712 712  * If ≥60% scenarios are (Highly Unlikely | Unlikely) → Refuted
713 713  * Otherwise → Inconclusive
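The mapping logic above can be sketched as follows. The function name `rollup_verdict` and the handling of an empty scenario list are assumptions made for illustration; the 60% threshold and the label sets come from the rules above.

```python
from typing import Sequence

POSITIVE_LABELS = {"Highly Likely", "Likely"}
NEGATIVE_LABELS = {"Highly Unlikely", "Unlikely"}


def rollup_verdict(scenario_labels: Sequence[str], threshold: float = 0.60) -> str:
    """Roll scenario verdict labels up into a claim verdict label."""
    if not scenario_labels:
        # Assumption: with no scenarios there is nothing to roll up.
        return "Inconclusive"
    total = len(scenario_labels)
    positive = sum(label in POSITIVE_LABELS for label in scenario_labels)
    negative = sum(label in NEGATIVE_LABELS for label in scenario_labels)
    if positive / total >= threshold:
        return "Supported"
    if negative / total >= threshold:
        return "Refuted"
    # Mixed scenarios, or a majority of Unclear/Unsubstantiated.
    return "Inconclusive"
```

With the default of two scenarios per claim, both scenarios must point the same way to reach 60%, so a one-to-one split always rolls up to `Inconclusive`.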
... ... @@ -717,23 +717,23 @@
717 717  Used for holistic article-level assessment.
718 718  
719 719  **Enum Values:**
792 +* {{code}}WELL-SUPPORTED{{/code}} - Article thesis logically follows from supported claims
793 +* {{code}}MISLEADING{{/code}} - Claims may be true but article commits logical fallacies
794 +* {{code}}REFUTED{{/code}} - Central claims are refuted, invalidating thesis
795 +* {{code}}UNCERTAIN{{/code}} - Insufficient evidence or highly mixed claim verdicts
720 720  
721 -* WELL-SUPPORTED - Article thesis logically follows from supported claims
722 -* MISLEADING - Claims may be true but article commits logical fallacies
723 -* REFUTED - Central claims are refuted, invalidating thesis
724 -* UNCERTAIN - Insufficient evidence or highly mixed claim verdicts
725 -
726 726  **Note:** Article verdict considers **claim centrality** (central claims override supporting claims).
727 727  
728 728  ==== 4.5.4 API Field Mapping ====
729 729  
730 730  |=Level|=API Field|=Enum Name
731 -|Scenario|scenarios[].verdict.label|scenario_verdict_label
732 -|Claim|claims[].rollup_verdict (optional)|claim_verdict_label
733 -|Article|article_holistic_assessment.overall_verdict|article_verdict_label
802 +|Scenario|{{code}}scenarios[].verdict.label{{/code}}|scenario_verdict_label
803 +|Claim|{{code}}claims[].rollup_verdict{{/code}} (optional)|claim_verdict_label
804 +|Article|{{code}}article_holistic_assessment.overall_verdict{{/code}}|article_verdict_label
734 734  
735 -----
736 736  
807 +---
808 +
737 737  == 5. Cache Architecture ==
738 738  
739 739  === 5.1 Redis Cache Design ===
... ... @@ -741,29 +741,117 @@
741 741  **Technology:** Redis 7.0+ (in-memory key-value store)
742 742  
743 743  **Cache Key Schema:**
816 +{{code}}
817 +claim:v1norm1:{language}:{sha256(canonical_claim)}
818 +{{/code}}
744 744  
745 -{{{claim:v1norm1:{language}:{sha256(canonical_claim)}
746 -}}}
747 -
748 748  **Example:**
749 -
750 -{{{Claim (English): "COVID vaccines are 95% effective"
821 +{{code}}
822 +Claim (English): "COVID vaccines are 95% effective"
751 751  Canonical: "covid vaccines are 95 percent effective"
752 752  Language: "en"
753 753  SHA256: abc123...def456
754 754  Key: claim:v1norm1:en:abc123...def456
755 -}}}
827 +{{/code}}
756 756  
757 757  **Rationale:** Prevents cross-language collisions and enables per-language cache analytics.
758 758  
759 759  **Data Structure:**
832 +{{code language="redis"}}
833 +SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}'
834 +EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days
835 +{{/code}}
760 760  
761 -{{{SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}'
762 -EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days
763 -}}}
837 +**Additional Keys:**
838 +{{code}}
764 764  
765 -----
921 +claim:stats:hit_count:{claim_hash} # Counter
922 +claim:index:domain:{domain} # Set of claim hashes by domain
923 +claim:index:language:{lang} # Set of claim hashes by language
924 +{{/code}}
925 +
926 +
767 767  === 5.1.1 Canonical Claim Normalization (v1) ===
768 768  
769 769  The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly.
... ... @@ -770,80 +770,82 @@
770 770  
771 771  **Algorithm: Canonical Claim Normalization v1**
772 772  
773 -{{{def normalize_claim_v1(claim_text: str, language: str) -> str:
774 - """
775 - Normalizes claim to canonical form for cache key generation.
776 - Version: v1norm1 (POC1)
777 - """
778 - import re
779 - import unicodedata
780 -
781 - # Step 1: Unicode normalization (NFC)
782 - text = unicodedata.normalize('NFC', claim_text)
783 -
784 - # Step 2: Lowercase
785 - text = text.lower()
786 -
787 - # Step 3: Remove punctuation (except hyphens in words)
788 - text = re.sub(r'[^\w\s-]', '', text)
789 -
790 - # Step 4: Normalize whitespace (collapse multiple spaces)
791 - text = re.sub(r'\s+', ' ', text).strip()
792 -
793 - # Step 5: Numeric normalization
794 - text = text.replace('%', ' percent')
795 - # Spell out single-digit numbers
796 - num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
797 - '4':'four', '5':'five', '6':'six', '7':'seven',
798 - '8':'eight', '9':'nine'}
799 - for num, word in num_to_word.items():
800 - text = re.sub(rf'\b{num}\b', word, text)
801 -
802 - # Step 6: Common abbreviations (English only in v1)
803 - if language == 'en':
804 - text = text.replace('covid-19', 'covid')
805 - text = text.replace('u.s.', 'us')
806 - text = text.replace('u.k.', 'uk')
807 -
808 - # Step 7: NO entity normalization in v1
809 - # (Trump vs Donald Trump vs President Trump remain distinct)
810 -
811 - return text
933 +{{code language="python"}}
934 +def normalize_claim_v1(claim_text: str, language: str) -> str:
935 +    """
936 +    Normalizes claim to canonical form for cache key generation.
937 +    Version: v1norm1 (POC1)
938 +    """
939 +    import re
940 +    import unicodedata
941 +
942 +    # Step 1: Unicode normalization (NFC)
943 +    text = unicodedata.normalize('NFC', claim_text)
944 +
945 +    # Step 2: Lowercase
946 +    text = text.lower()
947 +
948 +    # Step 3: Common abbreviations (English only in v1); must run
949 +    # before punctuation removal, which would strip the periods
950 +    if language == 'en':
951 +        text = text.replace('covid-19', 'covid')
952 +        text = text.replace('u.s.', 'us')
953 +        text = text.replace('u.k.', 'uk')
954 +
955 +    # Step 4: Percent signs (also before punctuation removal strips '%')
956 +    text = text.replace('%', ' percent')
957 +
958 +    # Step 5: Remove punctuation (hyphens are kept)
959 +    text = re.sub(r'[^\w\s-]', '', text)
960 +
961 +    # Step 6: Spell out single-digit numbers
962 +    num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three',
963 +                   '4':'four', '5':'five', '6':'six', '7':'seven',
964 +                   '8':'eight', '9':'nine'}
965 +    for num, word in num_to_word.items():
966 +        text = re.sub(rf'\b{num}\b', word, text)
967 +
968 +    # Step 7: Normalize whitespace (collapse runs, trim ends)
969 +    text = re.sub(r'\s+', ' ', text).strip()
970 +
971 +    # Step 8: NO entity normalization in v1 (Trump vs Donald Trump vs President Trump remain distinct)
972 +    return text
812 812  
813 813  # Version identifier (include in cache namespace)
814 814  CANONICALIZER_VERSION = "v1norm1"
815 -}}}
976 +{{/code}}
816 816  
817 817  **Cache Key Formula (Updated):**
818 818  
819 -{{{language = "en"
980 +{{code}}
981 +language = "en"
820 820  canonical = normalize_claim_v1(claim_text, language)
821 821  cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"
822 822  
823 823  Example:
824 - claim: "COVID-19 vaccines are 95% effective"
825 - canonical: "covid vaccines are 95 percent effective"
826 - sha256: abc123...def456
827 - key: "claim:v1norm1:en:abc123...def456"
828 -}}}
986 + claim: "COVID-19 vaccines are 95% effective"
987 + canonical: "covid vaccines are 95 percent effective"
988 + sha256: abc123...def456
989 + key: "claim:v1norm1:en:abc123...def456"
990 +{{/code}}
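The formula above writes `sha256(canonical)` as pseudocode. A minimal runnable sketch with Python's standard `hashlib` might look like this; the helper name `build_cache_key`, the UTF-8 encoding, and the lowercase hex digest are assumptions, since the specification does not state how the string is encoded or how the digest is rendered.

```python
import hashlib

# Version identifier taken from the specification's cache namespace.
CANONICALIZER_VERSION = "v1norm1"


def build_cache_key(canonical_claim: str, language: str) -> str:
    """Build the Redis key for an already-normalized claim (hypothetical helper)."""
    # Assumption: the canonical claim is hashed as UTF-8 bytes and the
    # digest is written as 64 lowercase hex characters.
    digest = hashlib.sha256(canonical_claim.encode("utf-8")).hexdigest()
    return f"claim:{CANONICALIZER_VERSION}:{language}:{digest}"


key = build_cache_key("covid vaccines are 95 percent effective", "en")
```

Because the language is part of the key, the same canonical string submitted under two language codes yields two separate cache entries, which is what the cross-language collision rationale calls for.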
829 829  
830 830  **Cache Metadata MUST Include:**
831 831  
832 -{{{{
833 - "canonical_claim": "covid vaccines are 95 percent effective",
834 - "canonicalizer_version": "v1norm1",
835 - "language": "en",
836 - "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
994 +{{code language="json"}}
995 +{
996 + "canonical_claim": "covid vaccines are 95 percent effective",
997 + "canonicalizer_version": "v1norm1",
998 + "language": "en",
999 + "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
837 837  }
838 -}}}
1001 +{{/code}}
839 839  
840 840  **Version Upgrade Path:**
841 -
842 842  * v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
843 843  * v1normN → v2norm1: Major version bump, invalidate all v1 caches
844 844  
845 -----
846 846  
1008 +
847 847  === 5.1.2 Copyright & Data Retention Policy ===
848 848  
849 849  **Evidence Excerpt Storage:**
... ... @@ -851,7 +851,6 @@
851 851  To comply with copyright law and fair use principles:
852 852  
853 853  **What We Store:**
854 -
855 855  * **Metadata only:** Title, author, publisher, URL, publication date
856 856  * **Short excerpts:** Max 25 words per quote, max 3 quotes per evidence item
857 857  * **Summaries:** AI-generated bullet points (not verbatim text)
... ... @@ -858,20 +858,17 @@
858 858  * **No full articles:** Never store complete article text beyond job processing
859 859  
860 860  **Total per Cached Claim:**
861 -
862 862  * Scenarios: 2 per claim
863 863  * Evidence items: 6 per scenario (12 total)
864 864  * Quotes: 3 per evidence × 25 words = 75 words per item
865 -* **Maximum stored verbatim text:** ~~900 words per claim (12 × 75)
1025 +* **Maximum stored verbatim text:** ~900 words per claim (12 × 75)
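The storage limits above can be checked with a short sketch. The helper names are hypothetical, and truncating over-long quotes (rather than rejecting the evidence item) is an assumption; the specification only states the limits.

```python
from typing import List

MAX_WORDS_PER_QUOTE = 25
MAX_QUOTES_PER_EVIDENCE = 3


def enforce_excerpt_limits(quotes: List[str]) -> List[str]:
    """Keep at most 3 quotes, each cut to at most 25 whitespace-separated words."""
    kept = quotes[:MAX_QUOTES_PER_EVIDENCE]
    return [" ".join(quote.split()[:MAX_WORDS_PER_QUOTE]) for quote in kept]


def max_verbatim_words(scenarios: int = 2, evidence_per_scenario: int = 6) -> int:
    """Upper bound on verbatim words stored for one cached claim."""
    evidence_items = scenarios * evidence_per_scenario
    return evidence_items * MAX_QUOTES_PER_EVIDENCE * MAX_WORDS_PER_QUOTE
```

With the defaults of 2 scenarios and 6 evidence items per scenario, `max_verbatim_words()` reproduces the 900-word ceiling stated above.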
866 866  
867 867  **Retention:**
868 -
869 869  * Cache TTL: 90 days
870 870  * Job outputs: 24 hours (then archived or deleted)
871 871  * No persistent full-text article storage
872 872  
873 873  **Rationale:**
874 -
875 875  * Short excerpts for citation = fair use
876 876  * Summaries are transformative (not copyrightable)
877 877  * Limited retention (90 days max)
... ... @@ -878,27 +878,480 @@
878 878  * No commercial republication of excerpts
879 879  
880 880  **DMCA Compliance:**
881 -
882 882  * Cache invalidation endpoint available for rights holders
883 883  * Contact: dmca@factharbor.org
884 884  
885 -----
886 886  
887 -== Summary ==
1043 +=== 5.2 Cache Invalidation Strategy ===
888 888  
889 -This WYSIWYG preview shows the **structure and key sections** of the 1,515-line API specification.
1045 +**Time-Based (Primary):**
1046 +* TTL: 90 days for most claims
1047 +* Reasoning: Evidence freshness, news cycles
890 890  
891 -**Full specification includes:**
1049 +**Event-Based (Manual):**
1050 +* Admin can flag claims for invalidation
1051 +* Example: "Major study retracts findings"
1052 +* Tool: {{code}}DELETE /v1/cache/claim/{claim_hash}?reason=retraction{{/code}}
892 892  
893 -* Complete API endpoints (7 total)
894 -* All data schemas (ClaimExtraction, ClaimAnalysis, HolisticAssessment, Complete)
895 -* Quality gates & validation rules
896 -* LLM configuration for all 3 stages
897 -* Implementation notes with code samples
898 -* Testing strategy
899 -* Cross-references to other pages
1054 +**Version-Based (Automatic):**
1055 +* AKEL v2.0 release → Invalidate all v1.0 caches
1056 +* Cache keys include the version prefix: {{code}}claim:v1norm1:*{{/code}} vs {{code}}claim:v2norm1:*{{/code}}
900 900  
901 -**The complete specification is available in:**
1058 +**Long-Lived Historical Claims:**
1059 +* Historical claims about completed events generally have stable verdicts
1060 +* Example: "2024 US presidential election results"
1061 +* **Policy:** Extended TTL (365-3,650 days) instead of "never invalidate"
1062 +* **Reason:** Even historical data gets revisions (updated counts, corrections)
1063 +* **Mechanism:** Admin can still manually invalidate if major correction issued
1064 +* **Flag:** {{code}}is_historical=true{{/code}} in cache metadata → longer TTL
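The TTL policy above can be sketched as a single selection function. The name `cache_ttl_seconds`, the `historical_days` parameter, and the choice to clamp out-of-range values are assumptions; the 90-day default and the 365-3,650 day range come from the policy.

```python
SECONDS_PER_DAY = 86_400
DEFAULT_TTL_DAYS = 90
HISTORICAL_TTL_DAYS_MIN = 365
HISTORICAL_TTL_DAYS_MAX = 3_650


def cache_ttl_seconds(is_historical: bool,
                      historical_days: int = HISTORICAL_TTL_DAYS_MIN) -> int:
    """Return the TTL in seconds for a cached claim (hypothetical helper)."""
    if not is_historical:
        return DEFAULT_TTL_DAYS * SECONDS_PER_DAY
    # Assumption: requested values outside the documented range are clamped.
    clamped_days = max(HISTORICAL_TTL_DAYS_MIN,
                       min(historical_days, HISTORICAL_TTL_DAYS_MAX))
    return clamped_days * SECONDS_PER_DAY
```

The default branch returns 7,776,000 seconds, matching the value used with `EXPIRE` in section 5.1.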
902 902  
903 -* FactHarbor_POC1_API_and_Schemas_Spec_v0_4_1_PATCHED.md (45 KB standalone)
904 -* Export files (TEST/PRODUCTION) for xWiki import
1066 +=== 5.3 Cache Warming Strategy ===
1067 +
1068 +**Proactive Cache Building (Future):**
1069 +
1070 +**Trending Topics:**
1071 +* Monitor news APIs for trending topics
1072 +* Pre-analyze top 20 common claims
1073 +* Example: New health study published → Pre-cache related claims
1074 +
1075 +**Predictable Events:**
1076 +* Elections, sporting events, earnings reports
1077 +* Pre-cache expected claims before event
1078 +* Reduces load during traffic spikes
1079 +
1080 +**User Patterns:**
1081 +* Analyze query logs
1082 +* Identify frequently requested claims
1083 +* Prioritize cache warming for these
1084 +
1085 +---
1086 +
1087 +== 6. Quality Gates & Validation Rules ==
1088 +
1089 +=== 6.1 Quality Gate Overview ===
1090 +
1091 +|=Gate|=Name|=POC1 Status|=Applies To|=Notes
1092 +|**Gate 1**|Claim Validation|✅ Hard gate|Stage 1: Extraction|Filters opinions, compound claims
1093 +|**Gate 2**|Contradiction Search|✅ Mandatory rule|Stage 2: Analysis|Enforced per cached claim
1094 +|**Gate 3**|Uncertainty Disclosure|⚠️ Soft guidance|Stage 2: Analysis|Best practice
1095 +|**Gate 4**|Verdict Confidence|✅ Hard gate|Stage 2: Analysis|Confidence ≥ 0.5 required
1096 +
1097 +**Hard Gate Failures:**
1098 +* Gate 1 fail → Claim excluded from analysis
1099 +* Gate 4 fail → Claim marked "Unsubstantiated" but included
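The two hard-gate outcomes above can be sketched as a filter over claims. The flat dictionary shape and the field names `gate1_passed`, `confidence`, and `verdict` are hypothetical and chosen only for this illustration; the 0.5 threshold is the Gate 4 requirement from the overview table.

```python
from typing import Dict, List

MIN_VERDICT_CONFIDENCE = 0.5  # Gate 4 threshold from the overview table


def apply_hard_gates(claims: List[Dict]) -> List[Dict]:
    """Drop Gate 1 failures and relabel Gate 4 failures (hypothetical helper)."""
    results = []
    for claim in claims:
        if not claim.get("gate1_passed", False):
            continue  # Gate 1 fail: claim is excluded from analysis
        checked = dict(claim)  # copy so the input list is left untouched
        if checked.get("confidence", 0.0) < MIN_VERDICT_CONFIDENCE:
            # Gate 4 fail: claim stays in the output but is downgraded
            checked["verdict"] = "Unsubstantiated"
        results.append(checked)
    return results


sample = [
    {"claim_id": "C1", "gate1_passed": True, "confidence": 0.89, "verdict": "Refuted"},
    {"claim_id": "C2", "gate1_passed": False, "confidence": 0.95, "verdict": "Supported"},
    {"claim_id": "C3", "gate1_passed": True, "confidence": 0.41, "verdict": "Supported"},
]
gated = apply_hard_gates(sample)
```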
1100 +
1101 +=== 6.2 Validation Rules ===
1102 +
1103 +|=Rule|=Requirement
1104 +|**Mandatory Contradiction**|Stage 2 MUST search for "undermines" evidence. If none found, reasoning must state: "No counter-evidence found despite targeted search."
1105 +|**Context-Aware Logic**|Stage 3 must prioritize central claims. If a claim with {{code}}is_central_to_thesis=true{{/code}} is Refuted, the article cannot be WELL-SUPPORTED.
1106 +|**Cache Consistency**|Cached claims must match current AKEL version. Version mismatch → cache miss.
1107 +|**Author Identification**|All outputs MUST include {{code}}author_type: "AI/AKEL"{{/code}}.
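The Context-Aware Logic rule lends itself to a post-hoc consistency check on Stage 3 output. The sketch below assumes the `claims_summary` entries from section 4.3 (`is_central_to_thesis`, `verdict`); the function name is hypothetical.

```python
from typing import Dict, Iterable


def violates_context_rule(article_verdict: str,
                          claims_summary: Iterable[Dict]) -> bool:
    """True if the article is WELL-SUPPORTED despite a refuted central claim."""
    if article_verdict != "WELL-SUPPORTED":
        return False
    return any(
        claim.get("is_central_to_thesis") and claim.get("verdict") == "Refuted"
        for claim in claims_summary
    )


summary = [
    {"claim_id": "C1", "is_central_to_thesis": True, "verdict": "Refuted"},
    {"claim_id": "C2", "is_central_to_thesis": False, "verdict": "Supported"},
]
```

A check like this could run after Stage 3 and before report generation, so an inconsistent verdict is caught rather than published; where exactly it belongs in the pipeline is not specified.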
1108 +
1109 +---
1110 +
1111 +== 7. Deterministic Markdown Template ==
1112 +
1113 +Report generation uses a **fixed template** (not LLM-generated).
1114 +
1115 +**Cache-Only Mode Template:**
1116 +{{code language="markdown"}}
1117 +# FactHarbor Analysis Report: PARTIAL ANALYSIS
1118 +
1119 +**Job ID:** {job_id} | **Generated:** {timestamp_utc}
1120 +**Mode:** Cache-Only (Free Tier)
1121 +
1122 +---
1123 +
1124 +## ⚠️ Partial Analysis Notice
1125 +
1126 +This is a **cache-only analysis** based on previously analyzed claims.
1127 +{cache_coverage_percent}% of claims were available in cache.
1128 +
1129 +**What's Included:**
1130 +* {claims_cached} of {claims_total} claims analyzed
1131 +* Evidence and verdicts from cache (last updated: {oldest_cache_date})
1132 +
1133 +**What's Missing:**
1134 +* {claims_missing} claims require new analysis
1135 +* Full article holistic assessment unavailable
1136 +* Estimated cost to complete: ${cost_to_complete}
1137 +
1138 +**[Upgrade to Pro]** for complete analysis
1139 +
1140 +---
1141 +
1142 +## Cached Claims
1143 +
1144 +### [C1] {claim_text} ✅ From Cache
1145 +* **Cached:** {cached_at} ({cache_age} ago)
1146 +* **Times Used:** {hit_count} articles
1147 +* **Verdict:** {verdict} (Confidence: {confidence})
1148 +* **Evidence:** {evidence_count} sources
1149 +
1150 +[Full claim details...]
1151 +
1152 +### [C3] {claim_text} ⚠️ Not In Cache
1153 +* **Status:** Requires new analysis
1154 +* **Cost:** $0.081
1155 +* **Upgrade to analyze this claim**
1156 +
1157 +---
1158 +
1159 +**Powered by FactHarbor POC1-v0.4** | [Upgrade](https://factharbor.org/upgrade)
1160 +{{/code}}
1161 +
1162 +---
1163 +
1164 +== 8. LLM Configuration (3-Stage) ==
1165 +
1166 +=== 8.1 Stage 1: Claim Extraction (Haiku) ===
1167 +
1168 +|=Parameter|=Value|=Notes
1169 +|**Model**|{{code}}claude-haiku-4-20250108{{/code}}|Fast, cheap, sufficient for extraction
1170 +|**Input Tokens**|~10K|Article text after URL extraction
1171 +|**Output Tokens**|~500|5 claims @ ~100 tokens each
1172 +|**Cost**|$0.003 per article|($0.25/M input + $1.25/M output)
1173 +|**Temperature**|0.0|Deterministic
1174 +|**Max Tokens**|1000|Generous buffer
1175 +
1176 +**Prompt Strategy:**
1177 +* Extract 5 verifiable factual claims
1178 +* Mark central vs. supporting claims
1179 +* Canonicalize (normalize phrasing)
1180 +* Deduplicate similar claims
1181 +* Output structured JSON only
1182 +
1183 +=== 8.2 Stage 2: Claim Analysis (Sonnet, CACHED) ===
1184 +
1185 +|=Parameter|=Value|=Notes
1186 +|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|High quality for verdicts
1187 +|**Input Tokens**|~2K|Single claim + prompt + context
1188 +|**Output Tokens**|~5K|2 scenarios × ~2.5K tokens
1189 +|**Cost**|$0.081 per NEW claim|($3/M input + $15/M output)
1190 +|**Temperature**|0.0|Deterministic (cache consistency)
1191 +|**Max Tokens**|8000|Sufficient for 2 scenarios
1192 +|**Cache Strategy**|Redis, 90-day TTL|Key: {{code}}claim:v1norm1:{language}:{sha256(canonical_claim)}{{/code}}
1193 +
1194 +**Prompt Strategy:**
1195 +* Generate 2 scenario interpretations
1196 +* Search for supporting AND undermining evidence (mandatory)
1197 +* 6 evidence items per scenario maximum
1198 +* Compute verdict with reasoning chain (3-4 bullets)
1199 +* Output structured JSON only
1200 +
1201 +**Output Constraints (Cost Control):**
1202 +* Scenarios: Max 2 per claim
1203 +* Evidence: Max 6 per scenario
1204 +* Evidence summary: Max 3 bullets
1205 +* Reasoning chain: Max 4 bullets
1206 +
1207 +=== 8.3 Stage 3: Holistic Assessment (Sonnet) ===
1208 +
1209 +|=Parameter|=Value|=Notes
1210 +|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|Context-aware analysis
1211 +|**Input Tokens**|~5K|Article + claim verdicts
1212 +|**Output Tokens**|~1K|Article verdict + fallacies
1213 +|**Cost**|$0.030 per article|($3/M input + $15/M output)
1214 +|**Temperature**|0.0|Deterministic
1215 +|**Max Tokens**|2000|Sufficient for assessment
1216 +
1217 +**Prompt Strategy:**
1218 +* Detect main thesis
1219 +* Evaluate logical coherence (claim verdicts → thesis)
1220 +* Identify fallacies (correlation-causation, cherry-picking, etc.)
1221 +* Compute logic_quality_score
1222 +* Explain article verdict reasoning (3-4 bullets)
1223 +* Output structured JSON only
1224 +
1225 +=== 8.4 Cost Projections by Cache Hit Rate ===
1226 +
1227 +|=Cache Hit Rate|=Cost per Article|=10K Articles Cost|=100K Articles Cost
1228 +|0% (cold start)|$0.438|$4,380|$43,800
1229 +|20%|$0.357|$3,570|$35,700
1230 +|40%|$0.276|$2,760|$27,600
1231 +|**60%**|**$0.195**|**$1,950**|**$19,500**
1232 +|**70%** (target)|**$0.155**|**$1,550**|**$15,500**
1233 +|**80%**|**$0.114**|**$1,140**|**$11,400**
1234 +|**90%**|**$0.073**|**$730**|**$7,300**
1235 +|95%|$0.053|$530|$5,300
1236 +
1237 +**Break-Even Analysis:**
1238 +* Monolithic (v0.3.1): $0.15 per article constant
1239 +* 3-stage breaks even at roughly **70% cache hit rate** ($0.155 vs. $0.15; exact parity falls just above 70%)
1240 +* Expected after ~1,500 articles in same domain
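
The projections above can be reproduced with a small cost model. This is a sketch under two assumptions that are inferred from the table rather than stated in this section: 5 claims per article and a Stage 1 cost of $0.003 per article. The token counts and Sonnet prices come from sections 8.2 and 8.3.

```typescript
// USD per million tokens (Sonnet rates from sections 8.2 and 8.3).
const PRICE_PER_M = { input: 3, output: 15 };

function callCost(inputTokens: number, outputTokens: number): number {
  return (inputTokens * PRICE_PER_M.input + outputTokens * PRICE_PER_M.output) / 1_000_000;
}

const STAGE2_PER_NEW_CLAIM = callCost(2_000, 5_000); // about $0.081
const STAGE3_PER_ARTICLE = callCost(5_000, 1_000);   // about $0.030

// Assumptions inferred from the projection table, not given in this section.
const CLAIMS_PER_ARTICLE = 5;
const STAGE1_PER_ARTICLE = 0.003;

// Only cache misses pay the Stage 2 price; Stages 1 and 3 run for every article.
function costPerArticle(hitRate: number): number {
  const newClaims = CLAIMS_PER_ARTICLE * (1 - hitRate);
  return STAGE1_PER_ARTICLE + newClaims * STAGE2_PER_NEW_CLAIM + STAGE3_PER_ARTICLE;
}

// Hit rate at which the 3-stage pipeline costs the same as a fixed per-article price.
function breakEvenHitRate(monolithicCost: number): number {
  const fixed = STAGE1_PER_ARTICLE + STAGE3_PER_ARTICLE;
  return 1 - (monolithicCost - fixed) / (CLAIMS_PER_ARTICLE * STAGE2_PER_NEW_CLAIM);
}
```

Under these assumptions {{code}}breakEvenHitRate(0.15){{/code}} comes out at about 71%, which matches the "roughly 70%" break-even point.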
1241 +
1242 +---
1243 +
1244 +== 9. Implementation Notes ==
1245 +
1246 +=== 9.1 Recommended Tech Stack ===
1247 +
1248 +* **Framework:** Next.js 14+ with App Router (TypeScript)
1249 +* **Cache:** Redis 7.0+ (managed: AWS ElastiCache, Redis Cloud, Upstash)
1250 +* **Storage:** Filesystem JSON for jobs + S3/R2 for archival
1251 +* **Queue:** BullMQ with Redis (for 3-stage pipeline orchestration)
1252 +* **LLM Client:** Anthropic Python SDK or TypeScript SDK
1253 +* **Cost Tracking:** PostgreSQL for user credit ledger
1254 +* **Deployment:** Vercel (frontend + API) + Redis Cloud
1255 +
1256 +=== 9.2 3-Stage Pipeline Implementation ===
1257 +
1258 +**Job Queue Flow (Conceptual):**
1259 +
1260 +{{code language="typescript"}}
1261 +// Stage 1: Extract Claims
1262 +const stage1Job = await queue.add('stage1-extract-claims', {
1263 + jobId: 'job123',
1264 + articleUrl: 'https://example.com/article'
1265 +});
1266 +
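1267 +// On Stage 1 completion → enqueue Stage 2 jobs (BullMQ jobs emit no events; `queueEvents` is a QueueEvents instance for this queue)
1268 +stage1Job.waitUntilFinished(queueEvents).then(async (result) => {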
1269 + const { claims } = result;
1270 +
1271 + // Stage 2: Analyze each claim (with cache check)
1272 + const stage2Jobs = await Promise.all(
1273 + claims.map(claim =>
1274 + queue.add('stage2-analyze-claim', {
1275 + jobId: 'job123',
1276 + claimId: claim.claim_id,
1277 + canonicalClaim: claim.canonical_claim,
1278 + checkCache: true
1279 + })
1280 + )
1281 + );
1282 +
1283 + // On all Stage 2 completions → enqueue Stage 3
1284 + await Promise.all(stage2Jobs.map(j => j.waitUntilFinished(queueEvents))); // BullMQ requires a QueueEvents instance for the queue
1285 +
1286 + const claimVerdicts = await gatherStage2Results('job123');
1287 +
1288 + await queue.add('stage3-holistic', {
1289 + jobId: 'job123',
1290 + articleUrl: 'https://example.com/article',
1291 + claimVerdicts: claimVerdicts
1292 + });
1293 +});
1294 +{{/code}}
1295 +
1296 +**Note:** This is a conceptual sketch. Actual implementation may use BullMQ Flow API or custom orchestration.
1297 +
1298 +**Cache Check Logic:**
1299 +{{code language="typescript"}}
1300 +async function analyzeClaimWithCache(claim: string, language: string): Promise<ClaimAnalysis> {
1301 + const canonicalClaim = normalizeClaim(claim);
1302 + const claimHash = sha256(canonicalClaim);
1303 + const cacheKey = `claim:v1norm1:${language}:${claimHash}`; // same key format as section 8.2
1304 +
1305 + // Check cache
1306 + const cached = await redis.get(cacheKey);
1307 + if (cached) {
1308 + await redis.incr(`claim:stats:hit_count:${claimHash}`);
1309 + return JSON.parse(cached);
1310 + }
1311 +
1312 + // Cache miss - analyze with LLM
1313 + const analysis = await analyzeClaim_Stage2(canonicalClaim);
1314 +
1315 + // Store in cache
1316 + await redis.set(cacheKey, JSON.stringify(analysis), 'EX', 7776000); // 90 days
1317 +
1318 + return analysis;
1319 +}
1320 +{{/code}}
1321 +
1322 +=== 9.3 User Credit Management ===
1323 +
1324 +**PostgreSQL Schema:**
1325 +{{code language="sql"}}
1326 +CREATE TABLE user_credits (
1327 + user_id UUID PRIMARY KEY,
1328 + tier VARCHAR(20) DEFAULT 'free',
1329 + credit_limit DECIMAL(10,2) DEFAULT 10.00,
1330 + credit_used DECIMAL(10,4) DEFAULT 0.0000, -- 4 decimals: per-claim costs such as $0.081 need sub-cent precision
1331 + reset_date TIMESTAMP,
1332 + cache_only_mode BOOLEAN DEFAULT false,
1333 + created_at TIMESTAMP DEFAULT NOW()
1334 +);
1335 +
1336 +CREATE TABLE usage_log (
1337 + id SERIAL PRIMARY KEY,
1338 + user_id UUID REFERENCES user_credits(user_id),
1339 + job_id VARCHAR(50),
1340 + stage VARCHAR(20),
1341 + cost DECIMAL(10,4),
1342 + cache_hit BOOLEAN,
1343 + created_at TIMESTAMP DEFAULT NOW()
1344 +);
1345 +{{/code}}
1346 +
1347 +**Credit Deduction Logic:**
1348 +{{code language="typescript"}}
1349 +async function deductCredit(userId: string, cost: number): Promise<boolean> {
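1350 + const { rows: [user] } = await db.query('SELECT * FROM user_credits WHERE user_id = $1', [userId]);
1351 + // Assumes node-postgres: results arrive in `rows`, and DECIMAL columns arrive as strings
1352 + const newUsed = Number(user.credit_used) + cost;
1353 +
1354 + if (newUsed > Number(user.credit_limit) && user.tier === 'free') {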
1355 + // Trigger cache-only mode
1356 + await db.query(
1357 + 'UPDATE user_credits SET cache_only_mode = true WHERE user_id = $1',
1358 + [userId]
1359 + );
1360 + throw new Error('CREDIT_LIMIT_REACHED');
1361 + }
1362 +
1363 + await db.query(
1364 + 'UPDATE user_credits SET credit_used = credit_used + $1 WHERE user_id = $2', // atomic increment avoids lost updates
1365 + [cost, userId]
1366 + );
1367 +
1368 + return true;
1369 +}
1370 +{{/code}}
1371 +
1372 +=== 9.4 Cache-Only Mode Implementation ===
1373 +
1374 +**Middleware:**
1375 +{{code language="typescript"}}
1376 +async function checkCacheOnlyMode(req, res, next) {
1377 + const user = await getUserCredit(req.userId);
1378 +
1379 + if (user.cache_only_mode) {
1380 + // Allow only cache reads
1381 + if (req.body.options?.cache_preference !== 'allow_partial') {
1382 + return res.status(402).json({
1383 + error: 'credit_limit_reached',
1384 + message: 'Resubmit with cache_preference=allow_partial',
1385 + cache_only_mode: true
1386 + });
1387 + }
1388 +
1389 + // Modify request to skip Stage 2 for uncached claims
1390 + req.cacheOnlyMode = true;
1391 + }
1392 +
1393 + next();
1394 +}
1395 +{{/code}}
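
When {{code}}req.cacheOnlyMode{{/code}} is set, the pipeline has to serve cached claim analyses and skip Stage 2 for everything else. A minimal sketch of that split follows. The {{code}}CacheReader{{/code}} interface, {{code}}resolveFromCacheOnly{{/code}}, and the result shape are hypothetical names introduced here; any client with an async {{code}}get{{/code}} (such as a Redis client) would fit.

```typescript
// Stand-in for the Redis client: only an async get() is needed here.
interface CacheReader {
  get(key: string): Promise<string | null>;
}

interface PartialResult<T> {
  cached: T[];           // analyses served from cache
  skippedKeys: string[]; // claims that would have required Stage 2
  coverage: number;      // fraction of claims answered, 0..1
}

// Hypothetical helper: never calls the LLM, only reads the cache.
async function resolveFromCacheOnly<T>(
  cache: CacheReader,
  cacheKeys: string[],
): Promise<PartialResult<T>> {
  const cached: T[] = [];
  const skippedKeys: string[] = [];
  for (const key of cacheKeys) {
    const hit = await cache.get(key);
    if (hit !== null) {
      cached.push(JSON.parse(hit) as T);
    } else {
      skippedKeys.push(key);
    }
  }
  // An article with no claims is treated as fully covered.
  const coverage = cacheKeys.length === 0 ? 1 : cached.length / cacheKeys.length;
  return { cached, skippedKeys, coverage };
}
```

The {{code}}coverage{{/code}} value corresponds to the "Coverage %" measured in the cache-only test scenario.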
1396 +
1397 +=== 9.5 Estimated Timeline ===
1398 +
1399 +**POC1 with 3-Stage Architecture:**
1400 +* Week 1: Stage 1 (Haiku extraction) + Redis setup
1401 +* Week 2: Stage 2 (Sonnet analysis + caching)
1402 +* Week 3: Stage 3 (Holistic assessment) + pipeline orchestration
1403 +* Week 4: User credit system + cache-only mode
1404 +* Week 5: Testing with 100 articles (measure cache hit rate)
1405 +* Week 6: Optimization + bug fixes
1406 +* **Total: 6-8 weeks**
1407 +
1408 +**Manual coding:** 12-16 weeks
1409 +
1410 +---
1411 +
1412 +== 10. Testing Strategy ==
1413 +
1414 +=== 10.1 Cache Performance Testing ===
1415 +
1416 +**Test Scenarios:**
1417 +
1418 +**Scenario 1: Cold Start (0 cache)**
1419 +* Analyze 100 diverse articles
1420 +* Measure: Cost per article, cache growth rate
1421 +* Expected: $0.35-0.40 avg, ~400 unique claims cached
1422 +
1423 +**Scenario 2: Warm Cache (Overlapping Domain)**
1424 +* Analyze 100 articles on SAME topic (e.g., "2024 election")
1425 +* Measure: Cache hit rate growth
1426 +* Expected: Hit rate 20% → 60% by article 100
1427 +
1428 +**Scenario 3: Mature Cache (1,000 articles)**
1429 +* Analyze next 100 articles (diverse topics)
1430 +* Measure: Steady-state cache hit rate
1431 +* Expected: 60-70% hit rate, $0.155-0.195 avg cost (per the table in 8.4)
1432 +
1433 +**Scenario 4: Cache-Only Mode**
1434 +* Free user reaches $10 limit (about 65 articles at 70% hit rate, $0.155 each)
1435 +* Submit 10 more articles with {{code}}cache_preference=allow_partial{{/code}}
1436 +* Measure: Coverage %, user satisfaction
1437 +* Expected: 60-70% coverage, instant results
1438 +
1439 +=== 10.2 Success Metrics ===
1440 +
1441 +**Cache Performance:**
1442 +* Week 1: 5-10% hit rate
1443 +* Week 2: 15-25% hit rate
1444 +* Week 3: 30-40% hit rate
1445 +* Week 4: 45-55% hit rate
1446 +* Target: ≥50% by 1,000 articles
1447 +
1448 +**Cost Targets:**
1449 +* Articles 1-100: $0.35-0.40 avg ⚠️ (expected)
1450 +* Articles 100-500: $0.25-0.30 avg
1451 +* Articles 500-1,000: $0.18-0.22 avg
1452 +* Articles 1,000+: $0.12-0.15 avg ✅
1453 +
1454 +**Quality Metrics (same as v0.3.1):**
1455 +* Hallucination rate: <5%
1456 +* Context-aware accuracy: ≥70%
1457 +* False positive rate: <15%
1458 +* Mandatory contradiction search: 100% compliance
1459 +
1460 +=== 10.3 Free Tier Economics Validation ===
1461 +
1462 +**Test with simulated 1,000 users:**
1463 +* Each user: $10 credit
1464 +* 70% cache hit rate
1465 +* Avg 70 articles/user/month
1466 +
1467 +**Projected Costs:**
1468 +* Total credits: 1,000 × $10 = $10,000
1469 +* Actual LLM costs: ~$9,000 (cache savings)
1470 +* Margin: 10%
1471 +
1472 +**Sustainability Check:**
1473 +* If margin <5% → Reduce free tier limit
1474 +* If margin >20% → Consider increasing free tier
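
The margin arithmetic and the two thresholds above can be written out as a short sketch. The function and type names are hypothetical; the 5% and 20% thresholds are the ones listed under Sustainability Check.

```typescript
type Recommendation = 'reduce_free_tier' | 'keep' | 'consider_increase';

// Margin as a fraction of the total credit budget.
function margin(totalCredits: number, actualLlmCost: number): number {
  return (totalCredits - actualLlmCost) / totalCredits;
}

// Applies the thresholds from the Sustainability Check list.
function sustainabilityCheck(m: number): Recommendation {
  if (m < 0.05) return 'reduce_free_tier';
  if (m > 0.20) return 'consider_increase';
  return 'keep';
}
```

With the projected figures ($10,000 in credits, about $9,000 in LLM costs) the margin is 10%, which falls between the two thresholds and leaves the free tier unchanged.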
1475 +
1476 +---
1477 +
1478 +== 11. Cross-References ==
1479 +
1480 +This API specification implements requirements from:
1481 +
1482 +* **[[POC Requirements>>Test.FactHarbor.Specification.POC.Requirements]]**
1483 +** FR-POC-1 through FR-POC-6 (3-stage architecture)
1484 +** NFR-POC-1 through NFR-POC-3 (quality gates, caching)
1485 +** NEW: FR-POC-7 (Claim-level caching)
1486 +** NEW: FR-POC-8 (User credit system)
1487 +** NEW: FR-POC-9 (Cache-only mode)
1488 +
1489 +* **[[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]**
1490 +** Approach 1 implemented in Stage 3
1491 +** Context-aware holistic assessment
1492 +
1493 +* **[[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]**
1494 +** FR4 (Analysis Summary) - enhanced with caching
1495 +** FR7 (Verdict Calculation) - cached per claim
1496 +** NFR11 (Quality Gates) - enforced across stages
1497 +** NEW: NFR19 (Cost Efficiency via Caching)
1498 +** NEW: NFR20 (Free Tier Sustainability)
1499 +
1500 +* **[[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]]**
1501 +** POC1 3-stage pipeline architecture
1502 +** Redis cache layer
1503 +** User credit system
1504 +
1505 +* **[[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]]**
1506 +** Claim structure (cacheable unit)
1507 +** Evidence structure
1508 +** Scenario boundaries
1509 +
1510 +---
1511 +
1512 +**End of Specification - FactHarbor POC1 API v0.4.1**
1513 +
1514 +**3-stage caching architecture with free tier cache-only mode. Ready for sustainable, scalable implementation!** 🚀
1515 +