Last modified by Robert Schaub on 2025/12/24 18:26

From version 4.1
edited by Robert Schaub
on 2025/12/24 16:55
Change comment: Imported from XAR
To version 1.1
edited by Robert Schaub
on 2025/12/24 11:54
Change comment: Imported from XAR

Summary

Details

Page properties
Title
... ... @@ -1,1 +1,1 @@
1 -POC1 API & Schemas Specification v0.4.1
1 +POC1 API & Schemas Specification
Content
... ... @@ -1,578 +1,673 @@
1 -= POC1 API & Schemas Specification =
1 +# FactHarbor POC1 API & Schemas Specification
2 2  
3 -----
3 +**Version:** 0.3 (POC1 - Production Ready)
4 +**Namespace:** FactHarbor.*
5 +**Syntax:** xWiki 2.1
6 +**Last Updated:** 2025-12-24
4 4  
8 +---
9 +
5 5  == Version History ==
6 6  
7 7  |=Version|=Date|=Changes
8 -|0.4.1|2025-12-24|Applied 9 critical fixes: file format notice, verdict taxonomy, canonicalization algorithm, Stage 1 cost policy, BullMQ fix, language in cache key, historical claims TTL, idempotency, copyright policy
9 -|0.4|2025-12-24|**BREAKING:** 3-stage pipeline with claim-level caching, user tier system, cache-only mode for free users, Redis cache architecture
10 -|0.3.1|2025-12-24|Fixed single-prompt strategy, SSE clarification, schema canonicalization, cost constraints
11 -|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details
13 +|0.3|2025-12-24|Added complete API endpoints, LLM config, risk tiers, scraping details, quality gate logging, temporal separation note, cross-references
14 +|0.2|2025-12-24|Initial rebased version with holistic assessment
15 +|0.1|2025-12-24|Original specification
12 12  
13 -----
17 +---
14 14  
15 15  == 1. Core Objective (POC1) ==
16 16  
17 -The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)** while implementing **claim-level caching** to achieve cost sustainability.
21 +The primary technical goal of POC1 is to validate **Approach 1 (Single-Pass Holistic Analysis)**:
18 18  
19 -The system must prove that AI can identify an article's **Main Thesis** and determine if supporting claims logically support that thesis without committing fallacies.
23 +The system must prove that AI can identify an article's **Main Thesis** and determine if the supporting claims (even if individually accurate) logically support that thesis without committing fallacies (e.g., correlation vs. causation, cherry-picking, hasty generalization).
20 20  
21 -=== Success Criteria: ===
22 -
25 +**Success Criteria:**
23 23  * Test with 30 diverse articles
24 24  * Target: ≥70% accuracy detecting misleading articles
25 -* Cost: <$0.25 per NEW analysis (uncached)
26 -* Cost: $0.00 for cached claim reuse
27 -* Cache hit rate: ≥50% after 1,000 articles
28 +* Cost: <$0.35 per analysis
28 28  * Processing time: <2 minutes (standard depth)
29 29  
30 -=== Economic Model: ===
31 +**See:** [[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]] for complete investigation of 7 approaches.
31 31  
32 -* **Free tier:** $10 credit per month (~~40-140 articles depending on cache hits)
33 -* **After limit:** Cache-only mode (instant, free access to cached claims)
34 -* **Paid tier:** Unlimited new analyses
33 +---
35 35  
36 -----
35 +== 2. Runtime Model & Job States ==
37 37  
38 -== 2. Architecture Overview ==
37 +=== 2.1 Pipeline Steps ===
39 39  
40 -=== 2.1 3-Stage Pipeline with Caching ===
39 +For progress reporting via API, the pipeline follows these stages:
41 41  
42 -FactHarbor POC1 uses a **3-stage architecture** designed for claim-level caching and cost efficiency:
41 +# **INGEST**: URL scraping (Jina Reader / Trafilatura) or text normalization.
42 +# **EXTRACT_CLAIMS**: Identifying 3-5 verifiable factual claims + marking central vs. supporting.
43 +# **SCENARIOS**: Generating context interpretations for each claim.
44 +# **RETRIEVAL**: Evidence gathering (Search API + mandatory contradiction search).
45 +# **VERDICTS**: Assigning likelihoods, confidence, and uncertainty per scenario.
46 +# **HOLISTIC_ASSESSMENT**: Evaluating article-level credibility (Thesis vs. Claims logic).
47 +# **REPORT**: Generating final Markdown and JSON outputs.
43 43  
44 -{{mermaid}}
45 -graph TD
46 - A[Article Input] --> B[Stage 1: Extract Claims]
47 - B --> C{For Each Claim}
48 - C --> D[Check Cache]
49 - D -->|Cache HIT| E[Return Cached Verdict]
50 - D -->|Cache MISS| F[Stage 2: Analyze Claim]
51 - F --> G[Store in Cache]
52 - G --> E
53 - E --> H[Stage 3: Holistic Assessment]
54 - H --> I[Final Report]
55 -{{/mermaid}}
49 +=== 2.1.1 URL Extraction Strategy ===
56 56  
57 -==== Stage 1: Claim Extraction (Haiku, no cache) ====
51 +**Primary:** Jina AI Reader ({{code}}https://r.jina.ai/{url}{{/code}})
52 +* **Rationale:** Clean markdown, handles JS rendering, free tier sufficient
53 +* **Fallback:** Trafilatura (Python library) for simple static HTML
58 58  
59 -* **Input:** Article text
60 -* **Output:** 5 canonical claims (normalized, deduplicated)
61 -* **Model:** Claude Haiku 4
62 -* **Cost:** $0.003 per article
63 -* **Cache strategy:** No caching (article-specific)
55 +**Error Handling:**
64 64  
65 -==== Stage 2: Claim Analysis (Sonnet, CACHED) ====
57 +|=Error Code|=Trigger|=Action
58 +|{{code}}URL_BLOCKED{{/code}}|403/401/Paywall detected|Return error, suggest text paste
59 +|{{code}}URL_UNREACHABLE{{/code}}|Network/DNS failure|Retry once, then fail
60 +|{{code}}URL_NOT_FOUND{{/code}}|404 Not Found|Return error immediately
61 +|{{code}}EXTRACTION_FAILED{{/code}}|Content <50 words or unreadable|Return error with reason
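The error-handling table above can be sketched as a small dispatcher. This is illustrative only: the function name, parameter names, and action strings are assumptions for the sketch, not part of the API contract.

```python
MIN_WORDS = 50  # "Content <50 words" threshold from the table above

def classify_extraction(status_code=None, network_ok=True, word_count=None):
    """Map an extraction outcome to (error_code, action) per the POC1 table."""
    if not network_ok:
        return ("URL_UNREACHABLE", "retry_once_then_fail")
    if status_code in (401, 403):          # includes paywall-style blocks
        return ("URL_BLOCKED", "suggest_text_paste")
    if status_code == 404:
        return ("URL_NOT_FOUND", "fail_immediately")
    if word_count is not None and word_count < MIN_WORDS:
        return ("EXTRACTION_FAILED", "fail_with_reason")
    return (None, "proceed")
```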
66 66  
67 -* **Input:** Single canonical claim
68 -* **Output:** Scenarios + Evidence + Verdicts
69 -* **Model:** Claude Sonnet 3.5
70 -* **Cost:** $0.081 per NEW claim
71 -* **Cache strategy:** Redis, 90-day TTL
72 -* **Cache key:** claim:v1norm1:{language}:{sha256(canonical_claim)}
63 +**Supported URL Patterns:**
64 +* News articles, blog posts, Wikipedia
65 +* Academic preprints (arXiv)
66 +* ❌ Social media posts (Twitter, Facebook) - not in POC1
67 +* ❌ Video platforms (YouTube, TikTok) - not in POC1
68 +* PDF files - deferred to Beta 0
73 73  
74 -==== Stage 3: Holistic Assessment (Sonnet, no cache) ====
70 +=== 2.2 Job Status Enumeration ===
75 75  
76 -* **Input:** Article + Claim verdicts (from cache or Stage 2)
77 -* **Output:** Article verdict + Fallacies + Logic quality
78 -* **Model:** Claude Sonnet 3.5
79 -* **Cost:** $0.030 per article
80 -* **Cache strategy:** No caching (article-specific)
72 +(((
73 +* **QUEUED** - Job accepted, waiting in queue
74 +* **RUNNING** - Processing in progress
75 +* **SUCCEEDED** - Analysis complete, results available
76 +* **FAILED** - Error occurred, see error details
77 +* **CANCELLED** - User cancelled via DELETE endpoint
78 +)))
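A minimal sketch of the status enum and a cancellation guard. The guard rule is inferred from the cancel endpoint's note that completed jobs cannot be cancelled; it is not a normative part of the enum itself.

```python
from enum import Enum

class JobStatus(Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"

# Terminal states: the job will not change state again.
TERMINAL = {JobStatus.SUCCEEDED, JobStatus.FAILED, JobStatus.CANCELLED}

def can_cancel(status: JobStatus) -> bool:
    """Only queued or running jobs may be cancelled via DELETE."""
    return status not in TERMINAL
```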
81 81  
82 -=== Total Cost Formula: ===
80 +---
83 83  
84 -{{{Cost = $0.003 (extraction) + (N_new_claims × $0.081) + $0.030 (holistic)
82 +== 3. REST API Contract ==
85 85  
86 -Examples:
87 -- 0 new claims (100% cache hit): $0.033
88 -- 1 new claim (80% cache hit): $0.114
89 -- 3 new claims (40% cache hit): $0.276
90 -- 5 new claims (0% cache hit): $0.438
91 -}}}
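The formula above can be transcribed directly; the constants are the per-stage costs listed in Section 2.1 of this version.

```python
EXTRACTION_COST = 0.003   # Stage 1 (Haiku), never cached
PER_NEW_CLAIM = 0.081     # Stage 2 (Sonnet), cache misses only
HOLISTIC_COST = 0.030     # Stage 3 (Sonnet), never cached

def analysis_cost(new_claims: int) -> float:
    """Total cost for one article given the number of uncached claims."""
    return round(EXTRACTION_COST + new_claims * PER_NEW_CLAIM + HOLISTIC_COST, 3)
```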
84 +=== 3.1 Create Analysis Job ===
92 92  
93 -----
86 +**Endpoint:** {{code}}POST /v1/analyze{{/code}}
94 94  
95 -=== 2.2 User Tier System ===
88 +**Request Body Example:**
89 +{{code language="json"}}
90 +{
91 + "input_type": "url",
92 + "input_url": "https://example.com/medical-report-01",
93 + "input_text": null,
94 + "options": {
95 + "browsing": "on",
96 + "depth": "standard",
97 + "max_claims": 5,
98 + "context_aware_analysis": true
99 + },
100 + "client": {
101 + "request_id": "optional-client-tracking-id",
102 + "source_label": "optional"
103 + }
104 +}
105 +{{/code}}
96 96  
97 -|=Tier|=Monthly Credit|=After Limit|=Cache Access|=Analytics
98 -|**Free**|$10|Cache-only mode|✅ Full|Basic
99 -|**Pro** (future)|$50|Continues|✅ Full|Advanced
100 -|**Enterprise** (future)|Custom|Continues|✅ Full + Priority|Full
107 +**Options:**
108 +* {{code}}browsing{{/code}}: {{code}}on{{/code}} | {{code}}off{{/code}} (retrieve web sources or just output queries)
109 +* {{code}}depth{{/code}}: {{code}}standard{{/code}} | {{code}}deep{{/code}} (evidence thoroughness)
110 +* {{code}}max_claims{{/code}}: 1-50 (default: 10)
111 +* {{code}}context_aware_analysis{{/code}}: {{code}}true{{/code}} | {{code}}false{{/code}} (experimental)
101 101  
102 -**Free Tier Economics:**
113 +**Response:** {{code}}202 Accepted{{/code}}
103 103  
104 -* $10 credit = 40-140 articles analyzed (depending on cache hit rate)
105 -* Average 70 articles/month at 70% cache hit rate
106 -* After limit: Cache-only mode
115 +{{code language="json"}}
116 +{
117 + "job_id": "01J...ULID",
118 + "status": "QUEUED",
119 + "created_at": "2025-12-24T10:31:00Z",
120 + "links": {
121 + "self": "/v1/jobs/01J...ULID",
122 + "result": "/v1/jobs/01J...ULID/result",
123 + "report": "/v1/jobs/01J...ULID/report",
124 + "events": "/v1/jobs/01J...ULID/events"
125 + }
126 +}
127 +{{/code}}
107 107  
108 -----
129 +---
109 109  
110 -=== 2.3 Cache-Only Mode (Free Tier Feature) ===
131 +=== 3.2 Get Job Status ===
111 111  
112 -When free users reach their $10 monthly limit, they enter **Cache-Only Mode**:
133 +**Endpoint:** {{code}}GET /v1/jobs/{job_id}{{/code}}
113 113  
114 -==== What Cache-Only Mode Provides: ====
135 +**Response:** {{code}}200 OK{{/code}}
115 115  
116 -✅ **Claim Extraction (Platform-Funded):**
137 +{{code language="json"}}
138 +{
139 + "job_id": "01J...ULID",
140 + "status": "RUNNING",
141 + "created_at": "2025-12-24T10:31:00Z",
142 + "updated_at": "2025-12-24T10:31:22Z",
143 + "progress": {
144 + "step": "RETRIEVAL",
145 + "percent": 60,
146 + "message": "Gathering evidence for C2-S1",
147 + "current_claim_id": "C2",
148 + "current_scenario_id": "C2-S1"
149 + },
150 + "input_echo": {
151 + "input_type": "url",
152 + "input_url": "https://example.com/medical-report-01"
153 + },
154 + "links": {
155 + "self": "/v1/jobs/01J...ULID",
156 + "result": "/v1/jobs/01J...ULID/result",
157 + "report": "/v1/jobs/01J...ULID/report"
158 + },
159 + "error": null
160 +}
161 +{{/code}}
117 117  
118 -* Stage 1 extraction runs at $0.003 per article
119 -* **Cost: Absorbed by platform** (not charged to user credit)
120 -* Rationale: Extraction is necessary to check cache, and cost is negligible
121 -* Rate limit: Max 50 extractions/day in cache-only mode (prevents abuse)
163 +---
122 122  
123 -✅ **Instant Access to Cached Claims:**
165 +=== 3.3 Get JSON Result ===
124 124  
125 -* Any claim that exists in cache → Full verdict returned
126 -* Cost: $0 (no LLM calls)
127 -* Response time: <100ms
167 +**Endpoint:** {{code}}GET /v1/jobs/{job_id}/result{{/code}}
128 128  
129 -✅ **Partial Article Analysis:**
169 +**Response:** {{code}}200 OK{{/code}} (Returns the **AnalysisResult** schema - see Section 4)
130 130  
131 -* Check each claim against cache
132 -* Return verdicts for ALL cached claims
133 -* For uncached claims: Return "status": "cache_miss"
171 +**Other Responses:**
172 +* {{code}}409 Conflict{{/code}} - Job not finished yet
173 +* {{code}}404 Not Found{{/code}} - Job ID unknown
134 134  
135 -✅ **Cache Coverage Report:**
175 +---
136 136  
137 -* "3 of 5 claims available in cache (60% coverage)"
138 -* Links to cached analyses
139 -* Estimated cost to complete: $0.162 (2 new claims)
177 +=== 3.4 Download Markdown Report ===
140 140  
141 -❌ **Not Available in Cache-Only Mode:**
179 +**Endpoint:** {{code}}GET /v1/jobs/{job_id}/report{{/code}}
142 142  
143 -* New claim analysis (Stage 2 LLM calls blocked)
144 -* Full holistic assessment (Stage 3 blocked if any claims missing)
181 +**Response:** {{code}}200 OK{{/code}} with {{code}}text/markdown; charset=utf-8{{/code}} content
145 145  
146 -==== User Experience Example: ====
183 +**Headers:**
184 +* {{code}}Content-Disposition: attachment; filename="factharbor_poc1_{job_id}.md"{{/code}}
147 147  
148 -{{{{
149 - "status": "cache_only_mode",
150 - "message": "Monthly credit limit reached. Showing cached results only.",
151 - "cache_coverage": {
152 - "claims_total": 5,
153 - "claims_cached": 3,
154 - "claims_missing": 2,
155 - "coverage_percent": 60
156 - },
157 - "cached_claims": [
158 - {"claim_id": "C1", "verdict": "Likely", "confidence": 0.82},
159 - {"claim_id": "C2", "verdict": "Highly Likely", "confidence": 0.91},
160 - {"claim_id": "C4", "verdict": "Unclear", "confidence": 0.55}
161 - ],
162 - "missing_claims": [
163 - {"claim_id": "C3", "claim_text": "...", "estimated_cost": "$0.081"},
164 - {"claim_id": "C5", "claim_text": "...", "estimated_cost": "$0.081"}
165 - ],
166 - "upgrade_options": {
167 - "top_up": "$5 for 20-70 more articles",
168 - "pro_tier": "$50/month unlimited"
169 - }
170 -}
171 -}}}
186 +**Other Responses:**
187 +* {{code}}409 Conflict{{/code}} - Job not finished
188 +* {{code}}404 Not Found{{/code}} - Job unknown
172 172  
173 -**Design Rationale:**
190 +---
174 174  
175 -* Free users still get value (cached claims often answer their question)
176 -* Demonstrates FactHarbor's value (partial results encourage upgrade)
177 -* Sustainable for platform (no additional cost)
178 -* Fair to all users (everyone contributes to cache)
192 +=== 3.5 Stream Job Events (Optional, Recommended) ===
179 179  
180 -----
194 +**Endpoint:** {{code}}GET /v1/jobs/{job_id}/events{{/code}}
181 181  
182 -== 3. REST API Contract ==
196 +**Response:** Server-Sent Events (SSE) stream
183 183  
184 -=== 3.1 User Credit Tracking ===
198 +**Event Types:**
199 +* {{code}}progress{{/code}} - Progress update
200 +* {{code}}claim_extracted{{/code}} - Claim identified
201 +* {{code}}verdict_computed{{/code}} - Scenario verdict complete
202 +* {{code}}complete{{/code}} - Job finished
203 +* {{code}}error{{/code}} - Error occurred
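A minimal consumer sketch for this stream, parsing generic SSE wire-format lines (`event:` / `data:` fields, blank line terminating a frame). Payload shapes and any fields beyond these two are simplified assumptions.

```python
def parse_sse(lines):
    """Yield (event_type, data) tuples from an iterable of SSE lines."""
    event, data = "message", []   # "message" is the SSE default event type
    for line in lines:
        line = line.rstrip("\n")
        if line == "":                         # blank line ends a frame
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
```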
185 185  
186 -**Endpoint:** GET /v1/user/credit
205 +---
187 187  
188 -**Response:** 200 OK
207 +=== 3.6 Cancel Job ===
189 189  
190 -{{{{
191 - "user_id": "user_abc123",
192 - "tier": "free",
193 - "credit_limit": 10.00,
194 - "credit_used": 7.42,
195 - "credit_remaining": 2.58,
196 - "reset_date": "2025-02-01T00:00:00Z",
197 - "cache_only_mode": false,
198 - "usage_stats": {
199 - "articles_analyzed": 67,
200 - "claims_from_cache": 189,
201 - "claims_newly_analyzed": 113,
202 - "cache_hit_rate": 0.626
203 - }
204 -}
205 -}}}
209 +**Endpoint:** {{code}}DELETE /v1/jobs/{job_id}{{/code}}
206 206  
207 -----
211 +Attempts to cancel a queued or running job.
208 208  
209 -=== 3.2 Create Analysis Job (3-Stage) ===
213 +**Response:** {{code}}200 OK{{/code}} with updated Job object (status: CANCELLED)
210 210  
211 -**Endpoint:** POST /v1/analyze
215 +**Note:** Already-completed jobs cannot be cancelled.
212 212  
213 -==== Idempotency Support: ====
217 +---
214 214  
215 -To prevent duplicate job creation on network retries, clients SHOULD include:
219 +=== 3.7 Health Check ===
216 216  
217 -{{{POST /v1/analyze
218 -Idempotency-Key: {client-generated-uuid}
219 -}}}
221 +**Endpoint:** {{code}}GET /v1/health{{/code}}
220 220  
221 -OR use the client.request_id field:
223 +**Response:** {{code}}200 OK{{/code}}
222 222  
223 -{{{{
224 - "input_url": "...",
225 - "client": {
226 - "request_id": "client-uuid-12345",
227 - "source_label": "optional"
228 - }
225 +{{code language="json"}}
226 +{
227 + "status": "ok",
228 + "version": "POC1-v0.3",
229 + "model": "claude-3-5-sonnet-20241022"
229 229  }
230 -}}}
231 +{{/code}}
231 231  
232 -**Server Behavior:**
233 +---
233 233  
234 -* If Idempotency-Key or request_id seen before (within 24 hours):
235 -** Return existing job (200 OK, not 202 Accepted)
236 -** Do NOT create duplicate job or charge twice
237 -* Idempotency keys expire after 24 hours (matches job retention)
235 +== 4. AnalysisResult Schema (Context-Aware) ==
238 238  
239 -**Example Response (Idempotent):**
237 +This schema implements the **Context-Aware Analysis** required by the POC1 specification.
240 240  
241 -{{{{
242 - "job_id": "01J...ULID",
243 - "status": "RUNNING",
244 - "idempotent": true,
245 - "original_request_at": "2025-12-24T10:31:00Z",
246 - "message": "Returning existing job (idempotency key matched)"
247 -}
248 -}}}
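The retry-safe behavior above can be sketched as follows. The in-memory dict stands in for the real key store (production would presumably use Redis with a 24-hour EXPIRE), and all names here are illustrative.

```python
import time

IDEMPOTENCY_TTL = 24 * 3600   # keys expire after 24 hours (matches job retention)
_seen = {}                    # idempotency_key -> (job_id, created_at)

def get_or_create_job(key, create_job, now=None):
    """Return (job_id, created). created=False means an idempotent hit (200 OK)."""
    now = time.time() if now is None else now
    hit = _seen.get(key)
    if hit and now - hit[1] < IDEMPOTENCY_TTL:
        return hit[0], False            # existing job, do not charge twice
    job_id = create_job()               # new job -> 202 Accepted
    _seen[key] = (job_id, now)
    return job_id, True
```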
249 -
250 -==== Request Body: ====
251 -
252 -{{{{
253 - "input_type": "url",
254 - "input_url": "https://example.com/medical-report-01",
255 - "input_text": null,
256 - "options": {
257 - "browsing": "on",
258 - "depth": "standard",
259 - "max_claims": 5,
260 - "scenarios_per_claim": 2,
261 - "max_evidence_per_scenario": 6,
262 - "context_aware_analysis": true
239 +{{code language="json"}}
240 +{
241 + "metadata": {
242 + "job_id": "string (ULID)",
243 + "timestamp_utc": "ISO8601",
244 + "engine_version": "POC1-v0.3",
245 + "llm_provider": "anthropic",
246 + "llm_model": "claude-3-5-sonnet-20241022",
247 + "usage_stats": {
248 + "input_tokens": "integer",
249 + "output_tokens": "integer",
250 + "estimated_cost_usd": "float",
251 + "response_time_sec": "float"
252 + }
263 263   },
264 - "client": {
265 - "request_id": "optional-client-tracking-id",
266 - "source_label": "optional"
254 + "article_holistic_assessment": {
255 + "main_thesis": "string (The core argument detected)",
256 + "overall_verdict": "WELL-SUPPORTED | MISLEADING | REFUTED | UNCERTAIN",
257 + "logic_quality_score": "float (0-1)",
258 + "fallacies_detected": ["correlation-causation", "cherry-picking", "hasty-generalization"],
259 + "verdict_reasoning": "string (Explanation of why article credibility differs from claim average)",
260 + "experimental_feature": true
261 + },
262 + "claims": [
263 + {
264 + "claim_id": "C1",
265 + "is_central_to_thesis": "boolean",
266 + "claim_text": "string",
267 + "canonical_form": "string",
268 + "claim_type": "descriptive | causal | predictive | normative | definitional",
269 + "evaluability": "evaluable | partly_evaluable | not_evaluable",
270 + "risk_tier": "A | B | C",
271 + "risk_tier_justification": "string",
272 + "domain": "string (e.g., 'public health', 'economics')",
273 + "key_terms": ["term1", "term2"],
274 + "entities": ["Person X", "Org Y"],
275 + "time_scope_detected": "2020-2024",
276 + "geography_scope_detected": "Brazil",
277 + "scenarios": [
278 + {
279 + "scenario_id": "C1-S1",
280 + "context_title": "string",
281 + "definitions": {"key_term": "definition"},
282 + "assumptions": ["Assumption 1", "Assumption 2"],
283 + "boundaries": {
284 + "time": "as of 2025-01",
285 + "geography": "Brazil",
286 + "population": "adult population",
287 + "conditions": "excludes X; includes Y"
288 + },
289 + "scope_of_evidence": "What counts as evidence for this scenario",
290 + "scenario_questions": ["Question that decides the verdict"],
291 + "verdict": {
292 + "label": "Highly Likely | Likely | Unclear | Unlikely | Refuted | Unsubstantiated",
293 + "probability_range": [0.0, 1.0],
294 + "confidence": "float (0-1)",
295 + "reasoning": "string",
296 + "key_supporting_evidence_ids": ["E1", "E3"],
297 + "key_counter_evidence_ids": ["E2"],
298 + "uncertainty_factors": ["Data gap", "Method disagreement"],
299 + "what_would_change_my_mind": ["Specific new study", "Updated dataset"]
300 + },
301 + "evidence": [
302 + {
303 + "evidence_id": "E1",
304 + "stance": "supports | undermines | mixed | context_dependent",
305 + "relevance_to_scenario": "float (0-1)",
306 + "evidence_summary": ["Bullet fact 1", "Bullet fact 2"],
307 + "citation": {
308 + "title": "Source title",
309 + "author_or_org": "Org/Author",
310 + "publication_date": "2024-05-01",
311 + "url": "https://source.example",
312 + "publisher": "Publisher/Domain"
313 + },
314 + "excerpt": ["Short quote ≤25 words (optional)"],
315 + "source_reliability_score": "float (0-1) - READ-ONLY SNAPSHOT",
316 + "reliability_justification": "Why high/medium/low",
317 + "limitations_and_reservations": ["Limitation 1", "Limitation 2"],
318 + "retraction_or_dispute_signal": "none | correction | retraction | disputed",
319 + "retrieval_status": "OK | NEEDS_RETRIEVAL | FAILED"
320 + }
321 + ]
322 + }
323 + ]
324 + }
325 + ],
326 + "quality_gates": {
327 + "gate1_claim_validation": "pass | fail",
328 + "gate4_verdict_confidence": "pass | fail",
329 + "passed_all": "boolean",
330 + "gate_fail_reasons": [
331 + {
332 + "gate": "gate1_claim_validation",
333 + "claim_id": "C1",
334 + "reason_code": "OPINION_DETECTED | COMPOUND_CLAIM | SUBJECTIVE | TOO_VAGUE",
335 + "explanation": "Human-readable explanation"
336 + }
337 + ]
338 + },
339 + "global_notes": {
340 + "limitations": ["System limitation 1", "Limitation 2"],
341 + "safety_or_policy_notes": ["Note 1"]
267 267   }
268 268  }
269 -}}}
344 +{{/code}}
270 270  
271 -**Options:**
346 +=== 4.1 Risk Tier Definitions ===
272 272  
273 -* browsing: on | off (retrieve web sources or just output queries)
274 -* depth: standard | deep (evidence thoroughness)
275 -* max_claims: 1-10 (default: **5** for cost control)
276 -* scenarios_per_claim: 1-5 (default: **2** for cost control)
277 -* max_evidence_per_scenario: 3-10 (default: **6**)
278 -* context_aware_analysis: true | false (experimental)
348 +|=Tier|=Impact|=Examples|=Actions
349 +|**A (High)**|High real-world impact if wrong|Health claims, safety information, financial advice, medical procedures|Human review recommended (Mode3_Human_Reviewed_Required)
350 +|**B (Medium)**|Moderate impact, contested topics|Political claims, social issues, scientific debates, economic predictions|Enhanced contradiction search, AI-generated publication OK (Mode2_AI_Generated)
351 +|**C (Low)**|Low impact, easily verifiable|Historical facts, basic statistics, biographical data, geographic information|Standard processing, AI-generated publication OK (Mode2_AI_Generated)
279 279  
280 -**Response:** 202 Accepted
353 +=== 4.2 Source Reliability (Read-Only Snapshots) ===
281 281  
282 -{{{{
283 - "job_id": "01J...ULID",
284 - "status": "QUEUED",
285 - "created_at": "2025-12-24T10:31:00Z",
286 - "estimated_cost": 0.114,
287 - "cost_breakdown": {
288 - "stage1_extraction": 0.003,
289 - "stage2_new_claims": 0.081,
290 - "stage2_cached_claims": 0.000,
291 - "stage3_holistic": 0.030
292 - },
293 - "cache_info": {
294 - "claims_to_extract": 5,
295 - "estimated_cache_hits": 4,
296 - "estimated_new_claims": 1
297 - },
298 - "links": {
299 - "self": "/v1/jobs/01J...ULID",
300 - "result": "/v1/jobs/01J...ULID/result",
301 - "report": "/v1/jobs/01J...ULID/report",
302 - "events": "/v1/jobs/01J...ULID/events"
303 - }
304 -}
305 -}}}
355 +**IMPORTANT:** The {{code}}source_reliability_score{{/code}} in each evidence item is a **historical snapshot** from the weekly background scoring job.
306 306  
307 -**Error Responses:**
357 +* POC1 treats these scores as **read-only** (no modification during analysis)
358 +* **Prevents circular dependency:** scoring → affects retrieval → affects scoring
359 +* Full Source Track Record System is a **separate service** (not part of POC1)
360 +* **Temporal separation:** Scoring runs weekly; analysis uses snapshots
308 308  
309 -402 Payment Required - Free tier limit reached, cache-only mode
362 +**See:** [[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]] Section 1.3 (Source Track Record System) for scoring algorithm.
310 310  
311 -{{{{
312 - "error": "credit_limit_reached",
313 - "message": "Monthly credit limit reached. Entering cache-only mode.",
314 - "cache_only_mode": true,
315 - "credit_remaining": 0.00,
316 - "reset_date": "2025-02-01T00:00:00Z",
317 - "action": "Resubmit with cache_preference=allow_partial for cached results"
318 -}
319 -}}}
364 +=== 4.3 Quality Gate Reason Codes ===
320 320  
321 -----
366 +**Gate 1 (Claim Validation):**
367 +* {{code}}OPINION_DETECTED{{/code}} - Subjective judgment without factual anchor
368 +* {{code}}COMPOUND_CLAIM{{/code}} - Multiple claims in one statement
369 +* {{code}}SUBJECTIVE{{/code}} - Value judgment, not verifiable fact
370 +* {{code}}TOO_VAGUE{{/code}} - Lacks specificity for evaluation
322 322  
323 -== 4. Data Schemas ==
372 +**Gate 4 (Verdict Confidence):**
373 +* {{code}}LOW_CONFIDENCE{{/code}} - Confidence below threshold (<0.5)
374 +* {{code}}INSUFFICIENT_EVIDENCE{{/code}} - Too few sources to reach verdict
375 +* {{code}}CONTRADICTORY_EVIDENCE{{/code}} - Evidence conflicts without resolution
376 +* {{code}}NO_COUNTER_EVIDENCE{{/code}} - Contradiction search failed
324 324  
325 -=== 4.1 Stage 1 Output: ClaimExtraction ===
378 +**Purpose:** Enable system improvement workflow (Observe → Analyze → Improve)
326 326  
327 -{{{{
328 - "job_id": "01J...ULID",
329 - "stage": "stage1_extraction",
330 - "article_metadata": {
331 - "title": "Article title",
332 - "source_url": "https://example.com/article",
333 - "extracted_text_length": 5234,
334 - "language": "en"
335 - },
336 - "claims": [
337 - {
338 - "claim_id": "C1",
339 - "claim_text": "Original claim text from article",
340 - "canonical_claim": "Normalized, deduplicated phrasing",
341 - "claim_hash": "sha256:abc123...",
342 - "is_central_to_thesis": true,
343 - "claim_type": "causal",
344 - "evaluability": "evaluable",
345 - "risk_tier": "B",
346 - "domain": "public_health"
347 - }
348 - ],
349 - "article_thesis": "Main argument detected",
350 - "cost": 0.003
351 -}
352 -}}}
380 +---
353 353  
354 -----
382 +== 5. Validation Rules (POC1 Enforcement) ==
355 355  
356 -=== 4.5 Verdict Label Taxonomy ===
384 +|=Rule|=Requirement
385 +|**Mandatory Contradiction**|For every claim, the engine MUST search for "undermines" evidence. If none found, reasoning must explicitly state: "No counter-evidence found despite targeted search." Evidence must include at least 1 item with {{code}}stance ∈ {undermines, mixed, context_dependent}{{/code}} OR explicit note in {{code}}uncertainty_factors{{/code}}.
386 +|**Context-Aware Logic**|The {{code}}overall_verdict{{/code}} must prioritize central claims. If a {{code}}is_central_to_thesis=true{{/code}} claim is REFUTED, the overall article cannot be WELL-SUPPORTED. Central claims override verdict averaging.
387 +|**Author Identification**|All automated outputs MUST include {{code}}author_type: "AI/AKEL"{{/code}} or equivalent marker to distinguish AI-generated from human-reviewed content.
388 +|**Claim-to-Scenario Lifecycle**|In stateless POC1, Scenarios are **strictly children** of a specific Claim version. If a Claim's text changes, child Scenarios are part of that version's "snapshot." No scenario migration across versions.
357 357  
358 -FactHarbor uses **three distinct verdict taxonomies** depending on analysis level:
390 +---
359 359  
360 -==== 4.5.1 Scenario Verdict Labels (Stage 2) ====
392 +== 6. Deterministic Markdown Template ==
361 361  
362 -Used for individual scenario verdicts within a claim.
394 -The system renders {{code}}report.md{{/code}} using a **fixed template** populated from the JSON result (NOT generated by the LLM).
363 363  
364 -**Enum Values:**
396 +{{code language="markdown"}}
397 +# FactHarbor Analysis Report: {overall_verdict}
365 365  
366 -* Highly Likely - Probability 0.85-1.0, high confidence
367 -* Likely - Probability 0.65-0.84, moderate-high confidence
368 -* Unclear - Probability 0.35-0.64, or low confidence
369 -* Unlikely - Probability 0.16-0.34, moderate-high confidence
370 -* Highly Unlikely - Probability 0.0-0.15, high confidence
371 -* Unsubstantiated - Insufficient evidence to determine probability
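The banding above can be sketched as a pure function. The band edges come from the list; the 0.5 confidence cutoff is an assumption, since the spec only states that low confidence maps to Unclear.

```python
def scenario_label(probability, confidence, has_evidence=True):
    """Map a scenario's probability and confidence to a verdict label."""
    if not has_evidence:
        return "Unsubstantiated"
    if confidence < 0.5 or 0.35 <= probability <= 0.64:
        return "Unclear"                  # low confidence always maps here
    if probability >= 0.85:
        return "Highly Likely"
    if probability >= 0.65:
        return "Likely"
    if probability <= 0.15:
        return "Highly Unlikely"
    return "Unlikely"                     # 0.16-0.34 band
```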
399 +**Job ID:** {job_id} | **Generated:** {timestamp_utc}
400 +**Model:** {llm_model} | **Cost:** ${estimated_cost_usd} | **Time:** {response_time_sec}s
372 372  
373 -==== 4.5.2 Claim Verdict Labels (Rollup) ====
402 +---
374 374  
375 -Used when summarizing a claim across all scenarios.
404 +## 1. Holistic Assessment (Experimental)
376 376  
377 -**Enum Values:**
406 +**Main Thesis:** {main_thesis}
378 378  
379 -* Supported - Majority of scenarios are Likely or Highly Likely
380 -* Refuted - Majority of scenarios are Unlikely or Highly Unlikely
381 -* Inconclusive - Mixed scenarios or majority Unclear/Unsubstantiated
408 +**Overall Verdict:** {overall_verdict}
382 382  
383 -**Mapping Logic:**
410 +**Logic Quality Score:** {logic_quality_score}/1.0
384 384  
385 -* If ≥60% scenarios are (Highly Likely | Likely) → Supported
386 -* If ≥60% scenarios are (Highly Unlikely | Unlikely) → Refuted
387 -* Otherwise → Inconclusive
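A direct sketch of this ≥60% rollup rule:

```python
def rollup_verdict(labels):
    """Roll scenario verdict labels up to a claim-level verdict."""
    n = len(labels)
    supported = sum(l in ("Highly Likely", "Likely") for l in labels)
    refuted = sum(l in ("Highly Unlikely", "Unlikely") for l in labels)
    if n and supported / n >= 0.6:
        return "Supported"
    if n and refuted / n >= 0.6:
        return "Refuted"
    return "Inconclusive"
```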
412 +**Fallacies Detected:** {fallacies_detected}
388 388  
389 -==== 4.5.3 Article Verdict Labels (Stage 3) ====
414 +**Reasoning:** {verdict_reasoning}
390 390  
391 -Used for holistic article-level assessment.
416 +---
392 392  
393 -**Enum Values:**
418 +## 2. Key Claims Analysis
394 394  
395 -* WELL-SUPPORTED - Article thesis logically follows from supported claims
396 -* MISLEADING - Claims may be true but article commits logical fallacies
397 -* REFUTED - Central claims are refuted, invalidating thesis
398 -* UNCERTAIN - Insufficient evidence or highly mixed claim verdicts
420 +### [C1] {claim_text}
421 +* **Role:** {is_central_to_thesis ? "Central to thesis" : "Supporting claim"}
422 +* **Risk Tier:** {risk_tier} ({risk_tier_justification})
423 +* **Evaluability:** {evaluability}
399 399  
400 -**Note:** Article verdict considers **claim centrality** (central claims override supporting claims).
425 +**Scenarios Explored:** {scenarios.length}
401 401  
402 -==== 4.5.4 API Field Mapping ====
427 +#### Scenario: {scenario.context_title}
428 +* **Verdict:** {verdict.label} (Confidence: {verdict.confidence})
429 +* **Probability Range:** {verdict.probability_range[0]} - {verdict.probability_range[1]}
430 +* **Reasoning:** {verdict.reasoning}
403 403  
404 -|=Level|=API Field|=Enum Name
405 -|Scenario|scenarios[].verdict.label|scenario_verdict_label
406 -|Claim|claims[].rollup_verdict (optional)|claim_verdict_label
407 -|Article|article_holistic_assessment.overall_verdict|article_verdict_label
432 +**Evidence:**
433 +* Supporting: {evidence.filter(e => e.stance == "supports").length} sources
434 +* Undermining: {evidence.filter(e => e.stance == "undermines").length} sources
435 +* Mixed: {evidence.filter(e => e.stance == "mixed").length} sources
408 408  
409 -----
437 +**Key Evidence:**
438 +* [{evidence[0].citation.title}]({evidence[0].citation.url}) - {evidence[0].stance}
410 410  
411 -== 5. Cache Architecture ==
440 +---
412 412  
413 -=== 5.1 Redis Cache Design ===
442 +## 3. Quality Assessment
414 414  
415 -**Technology:** Redis 7.0+ (in-memory key-value store)
444 +**Quality Gates:**
445 +* Gate 1 (Claim Validation): {gate1_claim_validation}
446 +* Gate 4 (Verdict Confidence): {gate4_verdict_confidence}
447 +* Overall: {passed_all ? "PASS" : "FAIL"}
416 416  
417 -**Cache Key Schema:**
449 +{if gate_fail_reasons.length > 0}
450 +**Failed Gates:**
451 +{gate_fail_reasons.map(r => `* ${r.gate}: ${r.explanation}`)}
452 +{/if}
418 418  
419 -{{{claim:v1norm1:{language}:{sha256(canonical_claim)}
420 -}}}
454 +---
421 421  
422 -**Example:**
456 +## 4. Limitations & Disclaimers
423 423  
424 -{{{Claim (English): "COVID vaccines are 95% effective"
425 -Canonical: "covid vaccines are 95 percent effective"
426 -Language: "en"
427 -SHA256: abc123...def456
428 -Key: claim:v1norm1:en:abc123...def456
429 -}}}
458 +**System Limitations:**
459 +{limitations.map(l => `* ${l}`)}
430 430  
431 -**Rationale:** Prevents cross-language collisions and enables per-language cache analytics.
461 +**Important Notes:**
462 +* This analysis is AI-generated and experimental (POC1)
463 +* Context-aware article verdict is being tested for accuracy
464 +* Human review recommended for high-risk claims (Tier A)
465 +* Cost: ${estimated_cost_usd} | Tokens: {input_tokens + output_tokens}
432 432  
433 -**Data Structure:**
467 +**Methodology:** FactHarbor uses Claude 3.5 Sonnet to extract claims, generate scenarios, gather evidence (with mandatory contradiction search), and assess logical coherence between claims and article thesis.
434 434  
435 -{{{SET claim:v1norm1:en:abc123...def456 '{...ClaimAnalysis JSON...}'
436 -EXPIRE claim:v1norm1:en:abc123...def456 7776000 # 90 days
437 -}}}
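Key construction per the schema above, assuming the input is already the canonical claim produced by the normalization algorithm (Section 5.1.1); only hashing and formatting are shown here.

```python
import hashlib

def cache_key(canonical_claim: str, language: str) -> str:
    """Build the v1norm1 Redis cache key for a canonical claim."""
    digest = hashlib.sha256(canonical_claim.encode("utf-8")).hexdigest()
    return f"claim:v1norm1:{language}:{digest}"
```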
469 +---
438 438  
439 -----
471 +*Generated by FactHarbor POC1-v0.3 | [About FactHarbor](https://factharbor.org)*
472 +{{/code}}
440 440  
441 -=== 5.1.1 Canonical Claim Normalization (v1) ===
474 +**Target Report Size:** 220-350 words (optimized for 2-minute read)
442 442  
443 -The cache key depends on deterministic claim normalization. All implementations MUST follow this algorithm exactly.
476 +---
444 444  
445 -**Algorithm: Canonical Claim Normalization v1**
478 +== 7. LLM Configuration (POC1) ==
446 446  
447 -{{{def normalize_claim_v1(claim_text: str, language: str) -> str:
448 -    """
449 -    Normalizes claim to canonical form for cache key generation.
450 -    Version: v1norm1 (POC1)
451 -    """
452 -    import re
453 -    import unicodedata
454 -
455 -    # Step 1: Unicode normalization (NFC)
456 -    text = unicodedata.normalize('NFC', claim_text)
457 -
458 -    # Step 2: Lowercase
459 -    text = text.lower()
460 -
461 -    # Step 3: Numeric normalization (must run before punctuation
462 -    #         removal, otherwise '%' is stripped and never expanded)
463 -    text = text.replace('%', ' percent')
464 -    num_to_word = {'0':'zero', '1':'one', '2':'two', '3':'three', '4':'four',
465 -                   '5':'five', '6':'six', '7':'seven', '8':'eight', '9':'nine'}
466 -    # Spell out single-digit numbers
467 -    for num, word in num_to_word.items():
468 -        text = re.sub(rf'\b{num}\b', word, text)
469 -
470 -    # Step 4: Common abbreviations (English only in v1; must run before
471 -    #         punctuation removal so 'u.s.'/'u.k.' still contain dots)
472 -    if language == 'en':
473 -        text = text.replace('covid-19', 'covid')
474 -        text = text.replace('u.s.', 'us')
475 -        text = text.replace('u.k.', 'uk')
476 -
477 -    # Step 5: Remove punctuation (except hyphens in words)
478 -    text = re.sub(r'[^\w\s-]', '', text)
479 -
480 -    # Step 6: Normalize whitespace (collapse multiple spaces)
481 -    text = re.sub(r'\s+', ' ', text).strip()
482 -
483 -    # Step 7: NO entity normalization in v1
484 -    #         (Trump vs Donald Trump vs President Trump remain distinct)
485 -    return text
480 +|=Parameter|=Value|=Notes
481 +|**Provider**|Anthropic|Primary provider for POC1
482 +|**Model**|{{code}}claude-3-5-sonnet-20241022{{/code}}|Current production model
483 +|**Future Model**|{{code}}claude-sonnet-4-20250514{{/code}}|When available (architecture supports)
484 +|**Token Budget**|50K-80K per analysis|Input + output combined (varies by article length)
485 +|**Estimated Cost**|$0.10-0.30 per article|Based on Sonnet 3.5 pricing ($3/M input, $15/M output)
486 +|**Prompt Strategy**|Single-pass per stage|Not multi-turn; structured JSON output with schema validation
487 +|**Chain-of-Thought**|Yes|For verdict reasoning and holistic assessment
488 +|**Few-Shot Examples**|Yes|For claim extraction and scenario generation
486 486  
487 -# Version identifier (include in cache namespace)
488 -CANONICALIZER_VERSION = "v1norm1"
489 -}}}
490 +=== 7.1 Token Budgets by Stage ===
490 490  
491 -**Cache Key Formula (Updated):**
492 +|=Stage|=Approximate Output Tokens
493 +|Claim Extraction|~4,000 (10 claims × ~400 tokens)
494 +|Scenario Generation|~3,000 per claim (3 scenarios × ~1,000 tokens)
495 +|Evidence Synthesis|~2,000 per scenario
496 +|Verdict Generation|~1,000 per scenario
497 +|Holistic Assessment|~500 (context-aware summary)
492 492  
493 -{{{language = "en"
494 -canonical = normalize_claim_v1(claim_text, language)
495 -cache_key = f"claim:{CANONICALIZER_VERSION}:{language}:{sha256(canonical)}"
499 +**Total:** 50K-80K tokens per article (input + output)
496 496  
497 -Example:
498 - claim: "COVID-19 vaccines are 95% effective"
499 - canonical: "covid vaccines are 95 percent effective"
500 - sha256: abc123...def456
501 - key: "claim:v1norm1:en:abc123...def456"
502 -}}}
501 +=== 7.2 API Integration ===
503 503  
504 -**Cache Metadata MUST Include:**
503 +**Anthropic Messages API:**
504 +* Endpoint: {{code}}https://api.anthropic.com/v1/messages{{/code}}
505 +* Authentication: API key via {{code}}x-api-key{{/code}} header
506 +* Model parameter: {{code}}"model": "claude-3-5-sonnet-20241022"{{/code}}
507 +* Max tokens: {{code}}"max_tokens": 4096{{/code}} (per stage)
505 505  
506 -{{{{
507 - "canonical_claim": "covid vaccines are 95 percent effective",
508 - "canonicalizer_version": "v1norm1",
509 - "language": "en",
510 - "original_claim_samples": ["COVID-19 vaccines are 95% effective"]
511 -}
512 -}}}
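
**Key derivation sketch.** The formula above can be exercised end to end; this is a minimal sketch that assumes the claim has already been canonicalized by {{{normalize_claim_v1}}}:

```python
import hashlib

CANONICALIZER_VERSION = "v1norm1"

def cache_key(canonical_claim: str, language: str) -> str:
    """Cache key formula from Section 5.1.1 (claim assumed already canonical)."""
    digest = hashlib.sha256(canonical_claim.encode("utf-8")).hexdigest()
    return f"claim:{CANONICALIZER_VERSION}:{language}:{digest}"

# Key has the form claim:v1norm1:en:<64-char sha256 hex digest>
key = cache_key("covid vaccines are 95 percent effective", "en")
```

Because the canonicalizer version sits inside the key, bumping {{{CANONICALIZER_VERSION}}} automatically opens a fresh namespace while old keys age out via TTL.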
509 +**No LangChain/LangGraph needed:** direct SDK calls keep POC1 simple.
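
As a sketch of the integration above, the request body and headers can be assembled as plain JSON (no request is sent here; {{{ANTHROPIC_API_KEY}}} is a placeholder for the real key read from the environment):

```python
import json

# Messages API request as configured in Section 7.2 (illustrative prompt text)
payload = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 4096,
    "messages": [
        {"role": "user", "content": "Extract factual claims from: <article text>"}
    ],
}
headers = {
    "x-api-key": "ANTHROPIC_API_KEY",   # placeholder; load from env in practice
    "anthropic-version": "2023-06-01",  # required API version header
    "content-type": "application/json",
}
body = json.dumps(payload)
```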
513 513  
514 -**Version Upgrade Path:**
511 +---
515 515  
516 -* v1norm1 → v1norm2: Cache namespace changes, old keys remain valid until TTL
517 -* v1normN → v2norm1: Major version bump, invalidate all v1 caches
513 +== 8. Cross-References (xWiki) ==
518 518  
519 -----
515 +This API specification implements requirements from:
520 520  
521 -=== 5.1.2 Copyright & Data Retention Policy ===
517 +* **[[POC Requirements>>Test.FactHarbor.Specification.POC.Requirements]]**
518 +** FR-POC-1 through FR-POC-6 (POC1-specific functional requirements)
519 +** NFR-POC-1 through NFR-POC-3 (quality gates lite: Gates 1 & 4 only)
520 +** Section 2.1: Analysis Summary (Context-Aware) component specification
521 +** Section 10.3: Prompt structure for claim extraction and verdict synthesis
522 522  
523 -**Evidence Excerpt Storage:**
523 +* **[[Article Verdict Problem>>Test.FactHarbor.Specification.POC.Article-Verdict-Problem]]**
524 +** Complete investigation of 7 approaches to article-level verdicts
525 +** Approach 1 (Single-Pass Holistic Analysis) chosen for POC1
526 +** Experimental feature testing plan (30 articles, ≥70% accuracy target)
527 +** Decision framework for POC2 implementation
524 524  
525 -To comply with copyright law and fair use principles:
529 +* **[[Requirements>>Test.FactHarbor.Specification.Requirements.WebHome]]**
530 +** FR4 (Analysis Summary) - enhanced with context-aware capability
531 +** FR7 (Verdict Calculation) - probability ranges + confidence scores
532 +** NFR11 (Quality Gates) - POC1 implements Gates 1 & 4; Gates 2 & 3 in POC2
526 526  
527 -**What We Store:**
534 +* **[[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]]**
535 +** POC1 simplified architecture (stateless, single AKEL orchestration call)
536 +** Data persistence minimized (job outputs only, no database required)
537 +** Deferred complexity (no Elasticsearch, TimescaleDB, Federation until metrics justify)
528 528  
529 -* **Metadata only:** Title, author, publisher, URL, publication date
530 -* **Short excerpts:** Max 25 words per quote, max 3 quotes per evidence item
531 -* **Summaries:** AI-generated bullet points (not verbatim text)
532 -* **No full articles:** Never store complete article text beyond job processing
539 +* **[[Data Model>>Test.FactHarbor.Specification.Data Model.WebHome]]**
540 +** Evidence structure (source, stance, reliability rating)
541 +** Scenario boundaries (time, geography, population, conditions)
542 +** Claim types and evaluability taxonomy
543 +** Source Track Record System (Section 1.3) - temporal separation
533 533  
534 -**Total per Cached Claim:**
545 +* **[[Requirements Roadmap Matrix>>Test.FactHarbor.Roadmap.Requirements-Roadmap-Matrix.WebHome]]**
546 +** POC1 requirement mappings and phase assignments
547 +** Context-aware analysis as POC1 experimental feature
548 +** POC2 enhancement path (Gates 2 & 3, evidence deduplication)
535 535  
536 -* Scenarios: 2 per claim
537 -* Evidence items: 6 per scenario (12 total)
538 -* Quotes: 3 per evidence × 25 words = 75 words per item
539 -* **Maximum stored verbatim text:** ~~900 words per claim (12 × 75)
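
The excerpt limits above are simple to enforce at ingestion time; a minimal sketch (function names are illustrative, not part of the spec):

```python
def truncate_quote(quote: str, max_words: int = 25) -> str:
    # Enforce the 25-word excerpt limit from Section 5.1.2
    words = quote.split()
    if len(words) <= max_words:
        return quote
    return " ".join(words[:max_words]) + " ..."

def limit_quotes(quotes: list[str], max_quotes: int = 3) -> list[str]:
    # At most 3 quotes per evidence item, each capped at 25 words
    return [truncate_quote(q) for q in quotes[:max_quotes]]
```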
550 +---
540 540  
541 -**Retention:**
552 +== 9. Implementation Notes (POC1) ==
542 542  
543 -* Cache TTL: 90 days
544 -* Job outputs: 24 hours (then archived or deleted)
545 -* No persistent full-text article storage
554 +=== 9.1 Recommended Tech Stack ===
546 546  
547 -**Rationale:**
556 +* **Framework:** Next.js 14+ with App Router (TypeScript) - Full-stack in one codebase
557 +* **Rationale:** API routes and React UI in one codebase, Vercel deployment-ready, and TypeScript's type system is familiar to C# developers
558 +* **Storage:** Filesystem JSON files (no database needed for POC1)
559 +* **Queue:** In-memory queue or Redis (optional for concurrency)
560 +* **URL Extraction:** Jina AI Reader API (primary), trafilatura (fallback)
561 +* **Deployment:** Vercel, AWS Lambda, or similar serverless
548 548  
549 -* Short excerpts for citation = fair use
550 -* Summaries are transformative (not copyrightable)
551 -* Limited retention (90 days max)
552 -* No commercial republication of excerpts
563 +=== 9.2 POC1 Simplifications ===
553 553  
554 -**DMCA Compliance:**
565 +* **No database required:** Job metadata + outputs stored as JSON files ({{code}}jobs/{job_id}.json{{/code}}, {{code}}results/{job_id}.json{{/code}})
566 +* **No user authentication:** Optional API key validation only (env var: {{code}}FACTHARBOR_API_KEY{{/code}})
567 +* **Single-instance deployment:** No distributed processing, no worker pools
568 +* **Synchronous LLM calls:** No streaming in POC1 (entire response before returning)
569 +* **Job retention:** 24 hours default (configurable: {{code}}JOB_RETENTION_HOURS{{/code}})
570 +* **Rate limiting:** Simple IP-based (optional) - no complex billing
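
A minimal sketch of the filesystem job store described above (field names are illustrative; a temp directory stands in for the {{{jobs/}}} directory from the spec):

```python
import json
import tempfile
import time
import uuid
from pathlib import Path

# Stand-in for jobs/; the spec stores each job as jobs/{job_id}.json
JOBS_DIR = Path(tempfile.mkdtemp())

def create_job(request: dict) -> str:
    """Persist a queued job as a JSON file and return its id."""
    job_id = uuid.uuid4().hex
    job = {"job_id": job_id, "status": "queued",
           "created_at": time.time(), "request": request}
    (JOBS_DIR / f"{job_id}.json").write_text(json.dumps(job))
    return job_id

def is_expired(job: dict, retention_hours: int = 24) -> bool:
    """Retention check mirroring JOB_RETENTION_HOURS (default 24h)."""
    return time.time() - job["created_at"] > retention_hours * 3600

job_id = create_job({"url": "https://example.com/article"})
```

A periodic sweep can delete any file whose parsed job satisfies {{{is_expired}}}, which is all the cleanup POC1 needs without a database.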
555 555  
556 -* Cache invalidation endpoint available for rights holders
557 -* Contact: dmca@factharbor.org
572 +=== 9.3 Estimated Costs (Per Analysis) ===
558 558  
559 -----
574 +**LLM API costs (Claude 3.5 Sonnet):**
575 +* Input: $3.00 per million tokens
576 +* Output: $15.00 per million tokens
577 +* **Per article:** $0.10-0.30 (varies by length, 5-10 claims typical)
560 560  
561 -== Summary ==
579 +**Web search costs (optional):**
580 +* Using external search API (Tavily, Brave): $0.01-0.05 per analysis
581 +* POC1 can use free search APIs initially
562 562  
563 -This WYSIWYG preview shows the **structure and key sections** of the 1,515-line API specification.
583 +**Infrastructure costs:**
584 +* Vercel hobby tier: Free for POC
585 +* AWS Lambda: ~$0.001 per request
586 +* **Total infra:** <$0.01 per analysis
564 564  
565 -**Full specification includes:**
588 +**Total estimated cost:** ~$0.15-0.35 per analysis ✅ Meets <$0.35 target
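
The per-analysis arithmetic above can be checked with a small calculator (token counts below are illustrative; prices are the Sonnet 3.5 rates from this section):

```python
def llm_cost_usd(input_tokens: int, output_tokens: int,
                 input_price_per_m: float = 3.00,
                 output_price_per_m: float = 15.00) -> float:
    # Claude 3.5 Sonnet pricing from Section 9.3 ($ per million tokens)
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# A mid-sized analysis: ~50K input + ~10K output tokens (illustrative)
cost = llm_cost_usd(50_000, 10_000)  # 0.15 + 0.15 = $0.30
```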
566 566  
567 -* Complete API endpoints (7 total)
568 -* All data schemas (ClaimExtraction, ClaimAnalysis, HolisticAssessment, Complete)
569 -* Quality gates & validation rules
570 -* LLM configuration for all 3 stages
571 -* Implementation notes with code samples
572 -* Testing strategy
573 -* Cross-references to other pages
590 +=== 9.4 Estimated Timeline (AI-Assisted) ===
574 574  
575 -**The complete specification is available in:**
592 +**With Cursor IDE + Claude API:**
593 +* Day 1-2: API scaffolding + job queue
594 +* Day 3-4: LLM integration + prompt engineering
595 +* Day 5-6: Evidence retrieval + contradiction search
596 +* Day 7: Report templates + testing with 30 articles
597 +* **Total:** 5-7 days for working POC1
576 576  
577 -* FactHarbor_POC1_API_and_Schemas_Spec_v0_4_1_PATCHED.md (45 KB standalone)
578 -* Export files (TEST/PRODUCTION) for xWiki import
599 +**Manual coding (no AI assistance):**
600 +* Estimate: 15-20 days
601 +
602 +=== 9.5 First Prompt for AI Code Generation ===
603 +
604 +{{code}}
605 +Based on the FactHarbor POC1 API & Schemas Specification (v0.3), generate a Next.js 14 TypeScript application with:
606 +
607 +1. API routes implementing the 7 endpoints specified in Section 3
608 +2. AnalyzeRequest/AnalysisResult types matching schemas in Sections 4-5
609 +3. Anthropic Claude 3.5 Sonnet integration for:
610 + - Claim extraction (with central/supporting marking)
611 + - Scenario generation
612 + - Evidence synthesis (with mandatory contradiction search)
613 + - Verdict generation
614 + - Holistic assessment (article-level credibility)
615 +4. Job-based async execution with progress tracking (7 pipeline stages)
616 +5. Quality Gates 1 & 4 from NFR11 implementation
617 +6. Mandatory contradiction search enforcement (Section 5)
618 +7. Context-aware analysis (experimental) as specified
619 +8. Filesystem-based job storage (no database)
620 +9. Markdown report generation from JSON templates (Section 6)
621 +
622 +Use the validation rules from Section 5 and error codes from Section 2.1.1.
623 +Target: <$0.35 per analysis, <2 minutes processing time.
624 +{{/code}}
625 +
626 +---
627 +
628 +== 10. Testing Strategy (POC1) ==
629 +
630 +=== 10.1 Test Dataset (30 Articles) ===
631 +
632 +**Category 1: Straightforward Factual (10 articles)**
633 +* Purpose: Baseline accuracy
634 +* Example: "WHO report on global vaccination rates"
635 +* Expected: High claim accuracy, straightforward verdict
636 +
637 +**Category 2: Accurate Claims, Questionable Conclusions (10 articles)** ⭐ **Context-Aware Test**
638 +* Purpose: Test holistic assessment capability
639 +* Example: "Coffee cures cancer" (true premises, false conclusion)
640 +* Expected: Individual claims TRUE, article verdict MISLEADING
641 +
642 +**Category 3: Mixed Accuracy (5 articles)**
643 +* Purpose: Test nuance handling
644 +* Example: Articles with some true, some false claims
645 +* Expected: Scenario-level differentiation
646 +
647 +**Category 4: Low-Quality Claims (5 articles)**
648 +* Purpose: Test quality gates
649 +* Example: Opinion pieces, compound claims
650 +* Expected: Gate 1 failures, rejection or draft-only mode
651 +
652 +=== 10.2 Success Metrics ===
653 +
654 +**Quality Metrics:**
655 +* Hallucination rate: <5% (target: <3%)
656 +* Context-aware accuracy: ≥70% (experimental - key POC1 goal)
657 +* False positive rate: <15%
658 +* Mandatory contradiction search: 100% compliance
659 +
660 +**Performance Metrics:**
661 +* Processing time: <2 minutes per article (standard depth)
662 +* Cost per analysis: <$0.35
663 +* API uptime: >99%
664 +* LLM API error rate: <1%
665 +
666 +**See:** [[POC1 Roadmap>>Test.FactHarbor.Roadmap.POC1.WebHome]] Section 11 for complete success criteria and testing methodology.
667 +
668 +---
669 +
670 +**End of Specification - FactHarbor POC1 API v0.3**
671 +
672 +**Ready for xWiki import and AI-assisted implementation!** 🚀
673 +