Wiki source code of POC1 Architecture Analysis - 1.Jan.26

Version 4.1 by Robert Schaub on 2026/01/02 10:03

version	line-number	content
2.1	1	= FactHarbor POC1 Architecture Analysis =
1.1	2
	3
	4	Version: 2.6.17
	5	Analysis Date: January 2026
	6	Document Purpose: Technical diagrams, gap analysis, and optimization recommendations
	7
2.1	8	----
1.1	9
2.1	10	== 1. AKEL Flow Diagram (with LLM and WebSearch Interactions) ==
1.1	11
2.1	12
1.1	13	{{mermaid}}
	14	flowchart TB
	15	subgraph Input["📥 Input Layer"]
	16	URL[URL Input]
	17	TEXT[Text Input]
	18	end
	19
	20	subgraph Retrieval["🔍 Content Retrieval"]
	21	FETCH[extractTextFromUrl]
	22	PDF[PDF Parser<br/>pdf-parse v1]
	23	HTML[HTML Parser<br/>cheerio]
	24	end
	25
	26	subgraph AKEL["🧠 AKEL Pipeline"]
	27	direction TB
	28
	29	subgraph Step1["Step 1: Understand"]
	30	UNDERSTAND[understandClaim<br/>━━━━━━━━━━━━━<br/>• Detect input type<br/>• Extract claims<br/>• Identify dependencies<br/>• Assign risk tiers]
	31	LLM1[("🤖 LLM Call #1<br/>Claude/GPT/Gemini")]
	32	end
	33
	34	subgraph Step2["Step 2: Research (Iterative)"]
	35	DECIDE[decideNextResearch<br/>━━━━━━━━━━━━━<br/>• Generate queries<br/>• Focus areas]
	36
	37	SEARCH[("🌐 Web Search<br/>Google CSE / SerpAPI")]
	38
	39	FETCHSRC[fetchSourceContent<br/>━━━━━━━━━━━━━<br/>• Parallel fetching<br/>• Timeout handling]
	40
	41	EXTRACT[extractFacts<br/>━━━━━━━━━━━━━<br/>• Parse sources<br/>• Extract facts]
	42	LLM2[("🤖 LLM Call #2-N<br/>Per source")]
	43	end
	44
	45	subgraph Step3["Step 3: Verdict Generation"]
	46	VERDICT[generateVerdicts<br/>━━━━━━━━━━━━━<br/>• Claim verdicts<br/>• Article verdict<br/>• Dependency propagation]
	47	LLM3[("🤖 LLM Call #N+1<br/>Final synthesis")]
	48	end
	49
	50	subgraph Step4["Step 4: Report"]
	51	REPORT[buildTwoPanelSummary<br/>━━━━━━━━━━━━━<br/>• Format results<br/>• Generate markdown]
	52	end
	53	end
	54
	55	subgraph Output["📤 Output"]
	56	RESULT[AnalysisResult JSON]
	57	MARKDOWN[Report Markdown]
	58	end
	59
	60	%% Flow connections
	61	URL --> FETCH
	62	TEXT --> UNDERSTAND
	63	FETCH --> PDF
	64	FETCH --> HTML
	65	PDF --> UNDERSTAND
	66	HTML --> UNDERSTAND
	67
	68	UNDERSTAND --> LLM1
	69	LLM1 --> DECIDE
	70
	71	DECIDE --> SEARCH
	72	SEARCH --> FETCHSRC
	73	FETCHSRC --> EXTRACT
	74	EXTRACT --> LLM2
	75	LLM2 --> DECIDE
	76
	77	DECIDE -->\|"Research Complete"\| VERDICT
	78	VERDICT --> LLM3
	79	LLM3 --> REPORT
	80
	81	REPORT --> RESULT
	82	REPORT --> MARKDOWN
	83
	84	%% Styling
	85	classDef llm fill:#e1f5fe,stroke:#01579b,stroke-width:2px
	86	classDef search fill:#fff3e0,stroke:#e65100,stroke-width:2px
	87	classDef step fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
	88
	89	class LLM1,LLM2,LLM3 llm
	90	class SEARCH search
	91	class UNDERSTAND,DECIDE,FETCHSRC,EXTRACT,VERDICT,REPORT step
	92	{{/mermaid}}
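
The iterative part of the diagram (Step 2) can be sketched as a bounded loop. This is a minimal sketch under stated assumptions, not the code in `analyzer.ts`: the `ResearchDeps` shape, the synchronous signatures, and the `maxIterations` parameter are hypothetical, and only the per-source extraction calls are counted.

```typescript
// Hypothetical dependency shapes; the real analyzer.ts signatures are not shown in this document.
interface ResearchDeps {
  /** Returns the next search queries, or an empty array when research is complete. */
  decideNextResearch: (factsSoFar: readonly string[]) => string[];
  /** Returns the source URLs found for one query. */
  search: (query: string) => string[];
  /** Extracts facts from one fetched source (one LLM call per source in the diagram). */
  extractFacts: (sourceUrl: string) => string[];
}

interface ResearchResult {
  facts: string[];
  iterations: number;
  extractionCalls: number;
}

function runResearchLoop(deps: ResearchDeps, maxIterations: number): ResearchResult {
  const facts: string[] = [];
  let iterations = 0;
  let extractionCalls = 0;

  while (iterations < maxIterations) {
    const queries = deps.decideNextResearch(facts);
    if (queries.length === 0) break; // the "Research Complete" edge in the diagram
    iterations += 1;

    for (const query of queries) {
      for (const url of deps.search(query)) {
        facts.push(...deps.extractFacts(url));
        extractionCalls += 1;
      }
    }
  }
  return { facts, iterations, extractionCalls };
}
```

In a real pipeline these calls would be asynchronous; they are kept synchronous here only to keep the control flow easy to read.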
	93
3.1	94	----
1.1	95
	96
2.1	97	== 2. ERD Data Model (Current POC1 Implementation) ==
1.1	98
	99
	100	{{mermaid}}
	101	erDiagram
	102	JOB \|\|--o{ JOB_EVENT : "has"
	103	JOB \|\|--\|\| ANALYSIS_RESULT : "produces"
	104	ANALYSIS_RESULT \|\|--o{ CLAIM_VERDICT : "contains"
	105	ANALYSIS_RESULT \|\|--o{ FETCHED_SOURCE : "references"
	106	ANALYSIS_RESULT \|\|--o{ EXTRACTED_FACT : "contains"
	107	CLAIM_VERDICT }o--o{ EXTRACTED_FACT : "supported by"
	108	FETCHED_SOURCE \|\|--o{ EXTRACTED_FACT : "provides"
	109	CLAIM_VERDICT \|\|--o{ CLAIM_VERDICT : "depends on"
	110
	111	JOB {
	112	string JobId PK "GUID"
	113	string Status "QUEUED\|RUNNING\|COMPLETE\|FAILED"
	114	int Progress "0-100"
	115	datetime CreatedUtc
	116	datetime UpdatedUtc
	117	string InputType "text\|url"
	118	string InputValue "URL or text content"
	119	string InputPreview "First 100 chars"
	120	json ResultJson "Full analysis result"
	121	string ReportMarkdown "Formatted report"
	122	}
	123
	124	JOB_EVENT {
	125	long Id PK
	126	string JobId FK
	127	datetime TsUtc
	128	string Level "info\|warn\|error"
	129	string Message
	130	}
	131
	132	ANALYSIS_RESULT {
	133	string schemaVersion "2.6.17"
	134	string inputType "question\|claim\|article"
	135	boolean isQuestion
	136	string articleThesis
	137	int articleTruthPercentage "0-100"
	138	string articleVerdict "7-point scale"
	139	json claimPattern "total/supported/uncertain/refuted"
	140	boolean isPseudoscience
	141	int llmCalls "Total LLM invocations"
	142	json searchQueries "All search queries"
	143	}
	144
	145	CLAIM_VERDICT {
	146	string claimId PK "SC1, SC2, etc."
	147	string claimText
	148	boolean isCentral
	149	string claimRole "attribution\|source\|timing\|core"
	150	string_array dependsOn "Prerequisite claim IDs"
	151	boolean dependencyFailed
	152	string llmVerdict "WELL-SUPPORTED\|PARTIALLY-SUPPORTED\|UNCERTAIN\|REFUTED"
	153	string verdict "7-point: True to False"
	154	int confidence "0-100"
	155	int truthPercentage "0-100"
	156	string riskTier "A\|B\|C"
	157	string reasoning
	158	string_array supportingFactIds
	159	string highlightColor "green to dark-red"
	160	}
	161
	162	FETCHED_SOURCE {
	163	string id PK "S1, S2, etc."
	164	string url
	165	string title
	166	int trackRecordScore "0-100 or null"
	167	string fullText "Extracted content"
	168	datetime fetchedAt
	169	string category "legal\|news\|academic"
	170	boolean fetchSuccess
	171	string searchQuery "Which query found this"
	172	}
	173
	174	EXTRACTED_FACT {
	175	string id PK "S1-F1, S1-F2, etc."
	176	string fact "The factual statement"
	177	string category "legal_provision\|evidence\|expert_quote\|statistic\|event\|criticism"
	178	string specificity "high\|medium"
	179	string sourceId FK
	180	string sourceUrl
	181	string sourceTitle
	182	string sourceExcerpt
	183	string relatedProceedingId
	184	boolean isContestedClaim
	185	string claimSource
	186	}
	187	{{/mermaid}}
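
The self-referencing "depends on" relationship and the `dependencyFailed` flag suggest a propagation step over claim verdicts. The sketch below is one possible reading, not the POC1 rule: it assumes a claim's dependency has failed when any direct or transitive prerequisite has the `llmVerdict` value `REFUTED`. The `ClaimVerdictLite` type is a hypothetical subset of `CLAIM_VERDICT`.

```typescript
type LlmVerdict = "WELL-SUPPORTED" | "PARTIALLY-SUPPORTED" | "UNCERTAIN" | "REFUTED";

// Hypothetical subset of the CLAIM_VERDICT entity, limited to the fields used here.
interface ClaimVerdictLite {
  claimId: string;
  dependsOn: string[];
  llmVerdict: LlmVerdict;
  dependencyFailed: boolean;
}

function propagateDependencyFailures(claims: ClaimVerdictLite[]): ClaimVerdictLite[] {
  const byId = new Map<string, ClaimVerdictLite>();
  for (const claim of claims) byId.set(claim.claimId, claim);

  // Assumed rule: a prerequisite fails if it is REFUTED or if one of its own prerequisites fails.
  const hasFailedPrerequisite = (id: string, visiting: Set<string>): boolean => {
    const claim = byId.get(id);
    // Unknown IDs and cycles are treated as "not failed" so the walk always terminates.
    if (claim === undefined || visiting.has(id)) return false;
    visiting.add(id);
    const failed = claim.dependsOn.some((depId) => {
      const dep = byId.get(depId);
      if (dep === undefined) return false;
      return dep.llmVerdict === "REFUTED" || hasFailedPrerequisite(depId, visiting);
    });
    visiting.delete(id);
    return failed;
  };

  return claims.map((claim) => ({
    ...claim,
    dependencyFailed: hasFailedPrerequisite(claim.claimId, new Set<string>()),
  }));
}
```

The walk is not memoized, which is acceptable for the handful of claims in a single analysis but would need revisiting for large dependency graphs.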
	188
3.1	189	----
1.1	190
	191
2.1	192	== 3. Overall Architecture with Interactions ==
1.1	193
	194
	195	{{mermaid}}
	196	flowchart TB
	197	subgraph Client["🖥️ Client Layer"]
	198	BROWSER[Web Browser]
	199	ANALYZE_PAGE["/analyze page<br/>React + TailwindCSS"]
	200	JOBS_PAGE["/jobs page<br/>Job history & status"]
	201	end
	202
	203	subgraph NextJS["⚡ Next.js Web App (apps/web)"]
	204	direction TB
	205
	206	subgraph API_Routes["API Routes"]
	207	ANALYZE_API["/api/fh/analyze<br/>━━━━━━━━━━━━━<br/>POST: Create job"]
	208	JOBS_API["/api/fh/jobs<br/>━━━━━━━━━━━━━<br/>GET: List jobs<br/>POST: Create job"]
	209	JOB_API["/api/fh/jobs/[id]<br/>━━━━━━━━━━━━━<br/>GET: Job status"]
	210	EVENTS_API["/api/fh/jobs/[id]/events<br/>━━━━━━━━━━━━━<br/>GET: Job events (SSE)"]
	211	RUN_JOB["/api/internal/run-job<br/>━━━━━━━━━━━━━<br/>POST: Execute analysis"]
	212	end
	213
	214	subgraph Lib["Core Libraries"]
	215	ANALYZER["analyzer.ts<br/>━━━━━━━━━━━━━<br/>AKEL Pipeline<br/>2918 lines"]
	216	RETRIEVAL["retrieval.ts<br/>━━━━━━━━━━━━━<br/>URL content extraction"]
	217	WEBSEARCH["web-search.ts<br/>━━━━━━━━━━━━━<br/>Search abstraction"]
	218	MBFC["mbfc-loader.ts<br/>━━━━━━━━━━━━━<br/>Source reliability"]
	219	end
	220	end
	221
	222	subgraph DotNet["🔧 .NET API (apps/api)"]
	223	DOTNET_API["FactHarbor.Api<br/>ASP.NET Core"]
	224
	225	subgraph Controllers["Controllers"]
	226	ANALYZE_CTRL["AnalyzeController"]
	227	JOBS_CTRL["JobsController"]
	228	INTERNAL_CTRL["InternalJobsController"]
	229	end
	230
	231	subgraph Services["Services"]
	232	JOB_SVC["JobService<br/>━━━━━━━━━━━━━<br/>Job CRUD operations"]
	233	RUNNER_CLIENT["RunnerClient<br/>━━━━━━━━━━━━━<br/>Calls Next.js runner"]
	234	end
	235
	236	DB[(SQLite Database<br/>━━━━━━━━━━━━━<br/>JobEntity<br/>JobEventEntity)]
	237	end
	238
	239	subgraph External["🌐 External Services"]
	240	LLM_PROVIDERS["LLM Providers<br/>━━━━━━━━━━━━━<br/>• Anthropic Claude<br/>• OpenAI GPT<br/>• Google Gemini<br/>• Mistral"]
	241	SEARCH_PROVIDERS["Search Providers<br/>━━━━━━━━━━━━━<br/>• Google CSE<br/>• SerpAPI<br/>• Brave<br/>• Tavily"]
	242	WEB["Web Content<br/>━━━━━━━━━━━━━<br/>• News sites<br/>• PDFs<br/>• Academic sources"]
	243	end
	244
	245	%% Client interactions
	246	BROWSER --> ANALYZE_PAGE
	247	BROWSER --> JOBS_PAGE
	248	ANALYZE_PAGE --> ANALYZE_API
	249	JOBS_PAGE --> JOBS_API
	250
	251	%% Next.js internal
	252	ANALYZE_API --> JOBS_API
	253	JOBS_API -->\|"Proxy"\| DOTNET_API
	254	JOB_API -->\|"Proxy"\| DOTNET_API
	255	EVENTS_API -->\|"Proxy"\| DOTNET_API
	256
	257	%% .NET flow
	258	DOTNET_API --> ANALYZE_CTRL
	259	DOTNET_API --> JOBS_CTRL
	260	DOTNET_API --> INTERNAL_CTRL
	261	ANALYZE_CTRL --> JOB_SVC
	262	JOBS_CTRL --> JOB_SVC
	263	JOB_SVC --> DB
	264	JOB_SVC --> RUNNER_CLIENT
	265	RUNNER_CLIENT -->\|"HTTP POST"\| RUN_JOB
	266
	267	%% Analysis execution
	268	RUN_JOB --> ANALYZER
	269	ANALYZER --> RETRIEVAL
	270	ANALYZER --> WEBSEARCH
	271	ANALYZER --> MBFC
	272
	273	%% External calls
	274	ANALYZER -->\|"AI SDK"\| LLM_PROVIDERS
	275	WEBSEARCH --> SEARCH_PROVIDERS
	276	RETRIEVAL --> WEB
	277
	278	%% Styling
	279	classDef external fill:#fff3e0,stroke:#e65100,stroke-width:2px
	280	classDef core fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
	281	classDef api fill:#e3f2fd,stroke:#1565c0,stroke-width:2px
	282
	283	class LLM_PROVIDERS,SEARCH_PROVIDERS,WEB external
	284	class ANALYZER,RETRIEVAL,WEBSEARCH,MBFC core
	285	class ANALYZE_API,JOBS_API,JOB_API,EVENTS_API,RUN_JOB api
	286	{{/mermaid}}
	287
3.1	288	----
1.1	289
	290
2.1	291	== 4. Specification vs Implementation Gap Analysis ==
1.1	292
	293
	294
2.1	295	=== 4.1 Data Model Gaps ===
1.1	296
	297
2.1	298	\| Specification Entity \| POC1 Status \| Gap Description \|
	299	\|-\|-\|-\|
	300	\| Claim \| ⚠️ Partial \| No persistent storage; claims exist only in JSON result. Missing: `status`, `confidence_score`, `risk_score`, `completeness_score`, `version`, `views`, `edit_count` \|
	301	\| Evidence \| ⚠️ Partial \| Implemented as `ExtractedFact` but lacks: `supports` enum, proper `relevance_score` \|
	302	\| Source \| ⚠️ Partial \| `FetchedSource` exists but missing: `type` enum, `accuracy_history`, `correction_frequency`, weekly update scheduler \|
	303	\| Scenario \| ❌ Missing \| Not implemented. Claims are evaluated directly without scenario contexts \|
	304	\| Verdict \| ⚠️ Partial \| `ClaimVerdict` exists but missing: `likelihood_range`, `uncertainty_factors` array, proper `explanation_summary` \|
	305	\| User \| ❌ Missing \| No user authentication or role system \|
	306	\| Edit \| ❌ Missing \| No audit trail for changes \|
1.1	307
	308	=== 4.2 AKEL Component Gaps ===
	309
2.1	310	\| Spec Component \| POC1 Status \| Gap Description \|
	311	\|-\|-\|-\|
	312	\| AKEL Orchestrator \| ✅ Implemented \| `runAnalysis()` function serves this role \|
	313	\| Claim Extractor \| ✅ Implemented \| `understandClaim()` with claim role/dependency tracking \|
	314	\| Claim Classifier \| ⚠️ Partial \| Risk tier (A/B/C) assigned, but no domain classification \|
	315	\| Scenario Generator \| ❌ Missing \| Claims evaluated without scenario extraction \|
	316	\| Evidence Summarizer \| ✅ Implemented \| `extractFacts()` function \|
	317	\| Contradiction Detector \| ⚠️ Partial \| `isContestedClaim` flag exists but no active contradiction search \|
	318	\| Quality Gate Validator \| ❌ Missing \| No source quality gates, no mandatory checks \|
	319	\| Audit Sampling Scheduler \| ❌ Missing \| No audit system \|
	320	\| Embedding Handler \| ❌ Missing \| Not needed for POC \|
	321	\| Federation Sync \| ❌ Missing \| Not needed for POC \|
1.1	322
2.1	323	=== 4.3 Architecture Gaps ===
1.1	324
	325
2.1	326	\| Spec Requirement \| POC1 Status \| Gap Description \|
3.1	327	\|-\|-\|-\|
2.1	328	\| Three-Layer Architecture \| ✅ Implemented \| Interface (Next.js) → Processing (AKEL) → Data (SQLite) \|
	329	\| LLM Abstraction Layer \| ✅ Implemented \| AI SDK supports multiple providers with failover \|
	330	\| PostgreSQL Primary DB \| ⚠️ Different \| Using SQLite for simplicity (acceptable for POC) \|
	331	\| Redis Caching \| ❌ Missing \| No caching layer \|
	332	\| S3 Archival \| ❌ Missing \| No long-term storage \|
	333	\| Background Jobs \| ❌ Missing \| No scheduler for source updates, cache warming \|
	334	\| Quality Monitoring \| ⚠️ Partial \| LLM call counting exists, but no anomaly detection \|
1.1	335
2.1	336	=== 4.4 Publication & Review Gaps ===
1.1	337
	338
2.1	339	\| Spec Feature \| POC1 Status \| Gap Description \|
3.1	340	\|-\|-\|-\|
2.1	341	\| Risk Tier Publication Rules \| ❌ Missing \| All results published immediately regardless of tier \|
	342	\| Human Review Queue \| ❌ Missing \| No review workflow \|
	343	\| AI-Generated Labeling \| ⚠️ Partial \| Results show "AI analysis" but no formal labeling system \|
	344	\| Audit Rate Sampling \| ❌ Missing \| No sampling audits \|
1.1	345
3.1	346	----
1.1	347
	348
2.1	349	== 5. Optimization Recommendations ==
1.1	350
	351
	352
2.1	353	=== 5.1 Cost Optimizations ===
1.1	354
	355
	356	{{mermaid}}
	357	pie title Current LLM Cost Distribution (Estimated per Analysis)
	358	"Step 1: Understand" : 15
	359	"Step 2: Research (per source)" : 60
	360	"Step 3: Verdicts" : 25
	361	{{/mermaid}}
	362
2.1	363	\| Optimization \| Estimated Savings \| Implementation Effort \|
3.1	364	\|-\|-\|-\|
2.1	365	\| Cache claim understanding \| 30-50% on repeated claims \| Medium \|
	366	\| Use Haiku for fact extraction \| 40% on Step 2 costs \| Low (config change) \|
	367	\| Batch fact extraction \| 20% fewer API calls \| Medium \|
	368	\| Skip search for known claims \| 50%+ for cached claims \| High (needs claim DB) \|
	369	\| Reduce max iterations \| Linear reduction \| Low (config change) \|
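
Per-step savings in the table translate into overall savings only in proportion to each step's share of the cost. The following sketch works through that arithmetic using the estimated shares from the pie chart; the function name and the `Step` type are illustrative, not part of POC1.

```typescript
type Step = "understand" | "research" | "verdicts";

// Estimated cost shares (percent) taken from the pie chart above.
const costSharePct: Record<Step, number> = { understand: 15, research: 60, verdicts: 25 };

/** Overall savings in percent, given savings (percent) on individual steps. */
function overallSavingsPct(stepSavingsPct: Partial<Record<Step, number>>): number {
  let total = 0;
  for (const step of Object.keys(costSharePct) as Step[]) {
    total += (costSharePct[step] * (stepSavingsPct[step] ?? 0)) / 100;
  }
  return total;
}

// A 40% saving on Step 2 (60% of cost) is a 24% saving overall.
const haikuOnlySavings = overallSavingsPct({ research: 40 });
```

Under these estimates, a 40% saving on Step 2 alone lowers the total cost per analysis by about 24%, because Step 2 accounts for roughly 60% of the spend.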
1.1	370
2.1	371	=== 5.2 Timing Optimizations ===
1.1	372
	373
	374	{{mermaid}}
	375	gantt
	376	title Current Analysis Timeline (Typical)
	377	dateFormat ss
	378	axisFormat %S sec
	379
	380	section Current Flow
	381	URL Fetch :a1, 00, 2s
	382	Step 1 Understand :a2, after a1, 15s
	383	Search Iteration 1 :a3, after a2, 8s
	384	Fetch Sources 1 :a4, after a3, 10s
	385	Extract Facts 1 :a5, after a4, 12s
	386	Search Iteration 2 :a6, after a5, 8s
	387	Fetch Sources 2 :a7, after a6, 10s
	388	Extract Facts 2 :a8, after a7, 12s
	389	Generate Verdicts :a9, after a8, 15s
	390
	391	section Optimized Flow
	392	URL Fetch :b1, 00, 2s
	393	Step 1 Understand :b2, after b1, 10s
	394	Search + Fetch (parallel) :b3, after b2, 12s
	395	Extract Facts (batched) :b4, after b3, 8s
	396	Generate Verdicts :b5, after b4, 10s
	397	{{/mermaid}}
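
Because every task in the chart starts after the previous one, the totals are simple sums of the listed durations. The short sketch below adds them up; the variable names are illustrative, and the durations are the chart's own estimates.

```typescript
// Durations in seconds, copied from the Gantt chart above.
const currentFlowSeconds = [2, 15, 8, 10, 12, 8, 10, 12, 15];
const optimizedFlowSeconds = [2, 10, 12, 8, 10];

const sumSeconds = (xs: number[]): number => xs.reduce((acc, x) => acc + x, 0);

const currentTotal = sumSeconds(currentFlowSeconds); // 92
const optimizedTotal = sumSeconds(optimizedFlowSeconds); // 42
const reductionPct = Math.round(((currentTotal - optimizedTotal) / currentTotal) * 100); // 54
```

On these estimates, a typical analysis would drop from about 92 seconds to about 42 seconds, a reduction of roughly 54%.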
	398
2.1	399	\| Optimization \| Time Savings \| Notes \|
3.1	400	\|-\|-\|-\|
2.1	401	\| Parallel source fetching \| Already implemented \| Currently fetches 3 sources in parallel \|
	402	\| Streaming LLM responses \| 20-30% perceived \| User sees progress faster \|
	403	\| Search query batching \| 10-15% \| Send multiple queries to search API \|
	404	\| Reduce prompt size \| 5-10% per call \| Optimize system prompts \|
	405	\| Use faster models for extraction \| 30-40% on Step 2 \| Claude Haiku vs Sonnet \|
1.1	406
2.1	407	=== 5.3 Priority Recommendations ===
1.1	408
	409
	410	1. HIGH PRIORITY - Implement Claim Caching
	411	- Cache claim verdicts by content hash
	412	- Reduces costs for repeated/similar claims
	413	- Enables the separated verdict architecture (see Section 6)
	414
	415	2. MEDIUM PRIORITY - Use Tiered Models
	416	- Step 1 (Understand): Sonnet (needs reasoning)
	417	- Step 2 (Extract): Haiku (simple extraction)
	418	- Step 3 (Verdicts): Sonnet (needs synthesis)
	419
	420	3. LOW PRIORITY - Add Redis Cache
	421	- Cache source content (24h TTL)
	422	- Cache search results (1h TTL)
	423	- Reduces external API calls
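
The tiered-model and cache TTL recommendations above could be captured in a single configuration object. This is a sketch only: the `TieringConfig` shape and all field names are hypothetical, and the tier labels stand in for whatever model identifiers the provider configuration actually uses.

```typescript
type PipelineStep = "understand" | "extract" | "verdicts";
type ModelTier = "sonnet" | "haiku";

// Hypothetical configuration shape; not taken from the POC1 code base.
interface TieringConfig {
  modelTierByStep: Record<PipelineStep, ModelTier>;
  cacheTtlSeconds: { sourceContent: number; searchResults: number };
}

const tieringConfig: TieringConfig = {
  modelTierByStep: {
    understand: "sonnet", // needs reasoning
    extract: "haiku", // simple extraction
    verdicts: "sonnet", // needs synthesis
  },
  cacheTtlSeconds: {
    sourceContent: 24 * 60 * 60, // 24h TTL
    searchResults: 60 * 60, // 1h TTL
  },
};

function modelTierFor(step: PipelineStep): ModelTier {
  return tieringConfig.modelTierByStep[step];
}
```

Keeping the mapping in one place means that switching a step to a different tier stays a configuration change rather than a code change.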
	424
3.1	425	----
1.1	426
	427
2.1	428	== 6. Separated Verdict Architecture Proposal ==
1.1	429
	430
	431
2.1	432	=== 6.1 Current Architecture ===
1.1	433
	434
	435	{{mermaid}}
	436	flowchart LR
	437	subgraph Current["Current: Monolithic Analysis"]
	438	INPUT[Article Input] --> ANALYZE[Full Analysis Pipeline]
	439	ANALYZE --> CLAIMS[Claim Verdicts]
	440	ANALYZE --> ARTICLE[Article Verdict]
	441	CLAIMS -.->\|"Aggregated"\| ARTICLE
	442	end
	443	{{/mermaid}}
	444
	445	Issues:
	446	- Every analysis re-processes all claims
	447	- No caching of individual claim verdicts
	448	- Article verdict tightly coupled to claim extraction
	449
	450
2.1	451	=== 6.2 Proposed Separated Architecture ===
1.1	452
	453
	454	{{mermaid}}
	455	flowchart TB
	456	subgraph Input["Input Processing"]
	457	ARTICLE[Article/Text Input]
	458	EXTRACT[Claim Extraction]
	459	end
	460
	461	subgraph ClaimLayer["Claim Verdict Layer (Cacheable)"]
	462	CACHE[(Claim Cache<br/>━━━━━━━━━━━━━<br/>Key: claim_hash<br/>TTL: 7 days)]
	463
	464	CLAIM1["Claim 1 Analysis"]
	465	CLAIM2["Claim 2 Analysis"]
	466	CLAIM3["Claim N Analysis"]
	467
	468	VERDICT1[Claim 1 Verdict]
	469	VERDICT2[Claim 2 Verdict]
	470	VERDICT3[Claim N Verdict]
	471	end
	472
	473	subgraph ArticleLayer["Article Verdict Layer (Dynamic)"]
	474	AGGREGATE[Aggregate Claim Verdicts]
	475	CONTEXT[Apply Article Context<br/>━━━━━━━━━━━━━<br/>• Claim relationships<br/>• Logical structure<br/>• Author intent]
	476	ARTICLE_VERDICT[Article Verdict]
	477	end
	478
	479	%% Flow
	480	ARTICLE --> EXTRACT
	481	EXTRACT --> CLAIM1
	482	EXTRACT --> CLAIM2
	483	EXTRACT --> CLAIM3
	484
	485	CLAIM1 -->\|"Cache Miss"\| VERDICT1
	486	CLAIM2 -->\|"Cache Hit"\| VERDICT2
	487	CLAIM3 -->\|"Cache Miss"\| VERDICT3
	488
	489	CLAIM1 <-.-> CACHE
	490	CLAIM2 <-.-> CACHE
	491	CLAIM3 <-.-> CACHE
	492
	493	VERDICT1 --> AGGREGATE
	494	VERDICT2 --> AGGREGATE
	495	VERDICT3 --> AGGREGATE
	496
	497	AGGREGATE --> CONTEXT
	498	CONTEXT --> ARTICLE_VERDICT
	499
	500	classDef cache fill:#fff9c4,stroke:#f57f17,stroke-width:2px
	501	classDef dynamic fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
	502	class CACHE cache
	503	class CONTEXT,ARTICLE_VERDICT dynamic
	504	{{/mermaid}}
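
The cache hit and miss paths in the diagram follow a cache-aside pattern: look up each claim by hash, analyze only on a miss, and store the fresh verdict. A minimal sketch follows, assuming an in-memory `Map` as the cache and a hypothetical `analyzeClaim` callback that stands in for the research and verdict steps.

```typescript
// Hypothetical minimal verdict record; the real ClaimVerdict has many more fields.
interface ClaimVerdictEntry {
  claimHash: string;
  truthPercentage: number;
}

interface ResolveResult {
  verdicts: ClaimVerdictEntry[];
  cacheHits: number;
  cacheMisses: number;
}

function resolveClaimVerdicts(
  claimHashes: string[],
  cache: Map<string, ClaimVerdictEntry>,
  analyzeClaim: (claimHash: string) => ClaimVerdictEntry,
): ResolveResult {
  const verdicts: ClaimVerdictEntry[] = [];
  let cacheHits = 0;
  let cacheMisses = 0;

  for (const hash of claimHashes) {
    const cached = cache.get(hash);
    if (cached !== undefined) {
      cacheHits += 1; // "Cache Hit": skip research and LLM calls
      verdicts.push(cached);
      continue;
    }
    const fresh = analyzeClaim(hash); // "Cache Miss": run the full claim analysis
    cache.set(hash, fresh);
    cacheMisses += 1;
    verdicts.push(fresh);
  }
  return { verdicts, cacheHits, cacheMisses };
}
```

The returned verdicts would then feed the article verdict layer, which is recomputed on every request and never cached.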
	505
	506
2.1	507	=== 6.3 Benefits Analysis ===
1.1	508
	509
2.1	510	\| Benefit \| Impact \| Rationale \|
3.1	511	\|-\|-\|-\|
2.1	512	\| Cost Reduction \| 40-70% for repeated claims \| Many articles share common claims (e.g., "COVID vaccines are safe") \|
	513	\| Faster Analysis \| 50%+ for cached claims \| Skip research + LLM calls for known claims \|
	514	\| Consistency \| High \| Same claim always gets same verdict (until cache expires) \|
	515	\| Freshness Control \| Configurable TTL \| Balance consistency vs. new evidence \|
	516	\| Scalability \| Improves with usage \| More users = higher cache hit rate \|
1.1	517
	518	=== 6.4 Implementation Considerations ===
	519
	520	Claim Hashing Strategy:
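2.1	521	{{code language="typescript"}}import { createHash } from "node:crypto";
		// Assumed helper (not part of the original snippet): lowercases, strips punctuation, collapses whitespace.
		function normalize(claim: string): string {
		return claim.toLowerCase().replace(/[^\p{L}\p{N}\s]/gu, "").replace(/\s+/g, " ").trim();
		}
		function getClaimHash(claim: string): string {
1.1	522	// Normalize: lowercase, remove punctuation (word stemming is not included in this sketch)
	523	const normalized = normalize(claim);
	524	// Cache key: first 16 hex characters of the SHA-256 digest
	525	return createHash('sha256').update(normalized).digest('hex').slice(0, 16);
2.1	526	}{{/code}}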
1.1	527
	528	Cache Invalidation Triggers:
	529	- TTL expiration (default 7 days)
	530	- Major news event related to claim topic
	531	- Significant change in a source's track record
	532	- Manual invalidation by moderator
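
TTL expiration and manual invalidation from the list above can be illustrated with a small in-memory cache. This is a sketch under stated assumptions rather than a proposed implementation: the `TtlCache` class is hypothetical, the clock is injected so that expiry can be checked deterministically, and event-driven triggers (news events, track record changes) would simply call `invalidate` from elsewhere.

```typescript
// Hypothetical in-memory cache; a production setup would more likely use an external store.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  /** Returns the cached value, or undefined if it is missing or expired. */
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // TTL expiration
      return undefined;
    }
    return entry.value;
  }

  /** Manual invalidation (e.g. by a moderator); returns true if an entry was removed. */
  invalidate(key: string): boolean {
    return this.entries.delete(key);
  }
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000; // default TTL from the list above
```

Expired entries are removed lazily on read, so no background sweeper is needed for this sketch.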
	533
	534	Article Verdict Considerations:
	535	- Article verdict should ALWAYS be dynamic (never cached)
	536	- Same claims in different article contexts may yield different article verdicts
	537	- Example: "Vaccines are safe" + "Vaccines cause autism" → article may be misleading even if first claim is true
	538
2.1	539	=== 6.5 Recommendation ===
1.1	540
	541	Yes. Separating claim verdicts from the article verdict is beneficial, with the following caveats:
	542
	543	1. Claim verdicts should be cached with semantic similarity matching (not just exact match)
	544	2. Article verdicts should always be dynamic to account for:
	545	- Claim relationships and logical structure
	546	- Author's argumentative strategy
	547	- Context and framing
	548	- Selective use of true claims to support false conclusions
	549
	550	3. Implementation phases:
	551	- Phase 1: Exact-match claim caching (simple hash)
	552	- Phase 2: Semantic similarity caching (embedding-based)
	553	- Phase 3: Federated claim sharing across instances
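
Phase 2 replaces exact hash lookup with a nearest-neighbour lookup over claim embeddings. The sketch below shows the matching step only, using cosine similarity with a threshold. Where the embeddings come from is left open, and the `CachedClaim` shape, the function names, and the threshold value are assumptions for illustration.

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical cache record: a claim hash plus the embedding of the claim text.
interface CachedClaim {
  claimHash: string;
  embedding: number[];
}

/** Returns the most similar cached claim at or above the threshold, if any. */
function findSimilarClaim(
  queryEmbedding: number[],
  cached: CachedClaim[],
  threshold: number,
): CachedClaim | undefined {
  let best: CachedClaim | undefined;
  let bestScore = threshold;
  for (const candidate of cached) {
    const score = cosineSimilarity(queryEmbedding, candidate.embedding);
    if (score >= bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```

The threshold controls the trade-off between cache hit rate and the risk of reusing a verdict for a claim that only looks similar, so it would need tuning against real claim pairs.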
	554
3.1	555	----
1.1	556
	557
2.1	558	== 7. Summary ==
1.1	559
	560
	561
2.1	562	=== Current State ===
1.1	563
	564	- POC1 successfully implements the core AKEL pipeline
	565	- Claim dependency tracking is implemented
	566	- Multiple LLM providers supported
	567	- No persistent claim storage or caching
	568
	569
2.1	570	=== Key Gaps from Specification ===
1.1	571
	572	- No scenario extraction
	573	- No user/role system
	574	- No audit trail
	575	- No source track record updates
	576	- No review queue
	577
	578
2.1	579	=== Recommended Next Steps ===
1.1	580
	581	1. Implement claim caching layer
	582	2. Separate claim vs article verdict generation
	583	3. Add Redis for source/search caching
	584	4. Implement tiered model selection
	585	5. Add basic audit logging
