Wiki source code of POC Requirements (POC1 & POC2)

Last modified by Robert Schaub on 2025/12/23 18:00

version	line-number	content
1.1	1	= POC Requirements =
	2
	3	Status: ✅ Approved for Development
2.1	4	Version: 3.0 (Aligned with Main Requirements)
1.1	5	Goal: Prove that AI can extract claims and determine verdicts without human intervention
	6
2.1	7	{{info}}
	8	Core Philosophy: POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through a simplified implementation. All POC features map to formal FR/NFR requirements.
	9	{{/info}}
1.1	10
2.1	11
1.1	12	== 1. POC Overview ==
	13
	14	=== 1.1 What POC Tests ===
	15
	16	Core Question:
2.2	17
1.1	18	> Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?
	19
	20	What we're proving:
2.2	21
1.1	22	* AI can identify factual claims from text
2.1	23	* AI can evaluate those claims with structured evidence
	24	* Quality gates can filter unreliable outputs
	25	* The core workflow is technically feasible
1.1	26
2.1	27	What we're NOT proving:
2.2	28
2.1	29	* Production-ready reliability (that's POC2)
	30	* User-facing features (that's Beta 0)
	31	* Full IFCN compliance (that's V1.0)
1.1	32
2.1	33	=== 1.2 Requirements Mapping ===
1.1	34
2.1	35	POC1 implements a subset of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].
1.1	36
2.1	37	Scope Summary:
2.2	38
2.1	39	* In Scope: 8 requirements (7 FRs + 1 NFR)
	40	* Partial: 3 NFRs (simplified versions)
	41	* Out of Scope: 19 requirements (deferred to later phases)
1.1	42
2.1	43	== 2. POC1 Scope ==
1.1	44
2.1	45	{{success}}
2.2	46	Authoritative Source for Phase Mapping: [[Requirements Roadmap Matrix>>Test.FactHarbor V0\.9\.88 ex 2 new Org Pages.Roadmap.Requirements-Roadmap-Matrix.WebHome]]
1.1	47
2.1	48	The Roadmap Matrix is the single source of truth for which requirements are implemented in which phases. This page provides POC1-specific implementation details only.
	49	{{/success}}
1.1	50
2.1	51	POC1 implements these formal requirements:
1.1	52
2.1	53	|= Formal Req |= Implementation in POC1 |= Notes
	54	| FR4 | Analysis Summary | Basic format; quality metadata deferred to POC2
	55	| FR7 | Automated Verdicts | Full implementation with quality gates (NFR11)
	56	| NFR11 | Quality Assurance Framework | 2 of 7 quality gates implemented (Gates 1 and 4)
1.1	57
2.1	58	POC1 also covers the workflow components listed below. FR1-FR6 are detailed in the implementation sections that follow; deferred items are marked as such:
1.1	59
2.2	60	{{info}}Note: FR11 (Audit Trail) and FR13 (In-Article Claim Highlighting) are deferred to Beta 0 for production readiness and user experience enhancement.{{/info}}
	61
2.1	62	* Claim extraction (FR1)
	63	* Claim context (FR2)
	64	* Multiple scenarios (FR3)
	65	* Evidence collection (FR5)
	66	* Source quality assessment (FR6)
	67	* Time evolution tracking (FR8) - deferred to POC2
	68	* Audit trail (FR11) - deferred to Beta 0
	69	* In-article highlighting (FR13) - deferred to Beta 0
1.1	70
2.1	71	Partial implementations:
2.2	72
2.1	73	* NFR1 (Explainability) - Basic only
	74	* NFR2 (Performance) - Functional but not optimized
	75	* NFR3 (Transparency) - Basic only
1.1	76
2.1	77	Detailed POC1 implementation specifications continue below...
1.1	78
	79
	80
2.1	81	== 3. POC Simplifications ==
1.1	82
2.1	83	=== 3.1 FR1: Claim Extraction (Full Implementation) ===
1.1	84
2.1	85	Main Requirement: AI extracts factual claims from input text
1.1	86
	87	POC Implementation:
2.2	88
2.1	89	* ✅ AKEL extracts claims using LLM
	90	* ✅ Each claim includes original text reference
	91	* ✅ Claims are identified as factual/non-factual
	92	* ❌ No advanced claim parsing (added in POC2)
1.1	93
2.1	94	Acceptance Criteria:
2.2	95
2.1	96	* Extracts 3-5 claims from a typical article
	97	* Identifies factual vs non-factual claims
	98	* Quality Gate 1 validates extraction
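
The implementation bullets above state that each claim carries a reference to the original text. A minimal sketch of one way to represent that, assuming character offsets into the article. The `ExtractedClaim` type, its field names, and the `reference_is_valid` helper are illustrative and not part of the specification:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedClaim:
    text: str        # claim wording as it appears in the article
    start: int       # offset of the first character in the article (assumed scheme)
    end: int         # offset one past the last character
    claim_type: str  # "factual" or "non-factual"


def reference_is_valid(article: str, claim: ExtractedClaim) -> bool:
    """Check that the stored offsets really point at the claim text."""
    return article[claim.start:claim.end] == claim.text


article = "Paris is the capital of France. Paris is the best city."
claim = ExtractedClaim(
    text="Paris is the capital of France",
    start=0,
    end=30,
    claim_type="factual",
)
print(reference_is_valid(article, claim))
```

A check like this could run as part of Quality Gate 1, so that a claim whose reference does not match the source text is caught before evidence collection starts.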
1.1	99
2.1	100	=== 3.2 FR3: Multiple Scenarios (Full Implementation) ===
1.1	101
2.1	102	Main Requirement: Generate multiple interpretation scenarios for ambiguous claims
1.1	103
	104	POC Implementation:
2.2	105
2.1	106	* ✅ AKEL generates 2-3 scenarios per ambiguous claim
	107	* ✅ Scenarios capture different interpretations
	108	* ✅ Each scenario is evaluated separately
	109	* ✅ Verdict considers all scenarios
1.1	110
2.1	111	Acceptance Criteria:
2.2	112
2.1	113	* Generates 2+ scenarios for ambiguous claims
	114	* Scenarios are meaningfully different
	115	* All scenarios are evaluated
1.1	116
2.1	117	=== 3.3 FR4: Analysis Summary (Basic Implementation) ===
1.1	118
2.1	119	Main Requirement: Provide user-friendly summary of analysis
1.1	120
	121	POC Implementation:
2.2	122
2.1	123	* ✅ Simple text summary generated
	124	* ❌ No rich formatting (added in Beta 0)
	125	* ❌ No visual elements (added in Beta 0)
	126	* ❌ No interactive features (added in Beta 0)
1.1	127
2.1	128	POC Format:
	129	```
	130	Claim: [extracted claim]
	131	Scenarios: [list of scenarios]
	132	Evidence: [supporting/opposing evidence]
	133	Verdict: [probability with uncertainty]
	134	```
1.1	135
	136
2.1	137	=== 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===
1.1	138
2.1	139	Main Requirements:
2.2	140
2.1	141	* FR5: Collect supporting and opposing evidence
	142	* FR6: Evaluate evidence source reliability
1.1	143
	144	POC Implementation:
2.2	145
2.1	146	* ✅ AKEL searches for evidence (web/knowledge base)
	147	* ✅ Mandatory contradiction search (finds opposing evidence)
	148	* ✅ Source reliability scoring
	149	* ❌ No evidence deduplication (added in POC2)
	150	* ❌ No advanced source verification (added in POC2)
1.1	151
	152	Acceptance Criteria:
2.2	153
2.1	154	* Finds 2+ supporting evidence items
	155	* Finds 1+ opposing evidence item (if any exists)
	156	* Sources scored for reliability
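
The acceptance criteria above could be checked mechanically along these lines. This is a sketch only: the `EvidenceItem` type and the `check_evidence` helper are hypothetical, and the 0.0-1.0 reliability scale is an assumption. Opposing evidence is required only if it exists, so its absence is not treated as a failure here:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EvidenceItem:
    source: str
    reliability: float  # assumed scale: 0.0 (unreliable) to 1.0 (reliable)


def check_evidence(supporting: List[EvidenceItem],
                   opposing: List[EvidenceItem]) -> List[str]:
    """Return a list of acceptance problems; an empty list means the criteria are met."""
    problems = []
    if len(supporting) < 2:
        problems.append(f"only {len(supporting)} supporting item(s), need 2+")
    for item in supporting + opposing:
        if not 0.0 <= item.reliability <= 1.0:
            problems.append(f"{item.source}: reliability score out of range")
    return problems


supporting = [EvidenceItem("Source A", 0.9), EvidenceItem("Source B", 0.7)]
opposing = [EvidenceItem("Source C", 0.8)]
print(check_evidence(supporting, opposing))
```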
1.1	157
2.1	158	=== 3.5 FR7: Automated Verdicts (Full Implementation) ===
1.1	159
2.1	160	Main Requirement: AI computes verdicts with uncertainty quantification
1.1	161
2.1	162	POC Implementation:
2.2	163
2.1	164	* ✅ Probabilistic verdicts (0-100% probability that the claim is true)
	165	* ✅ Uncertainty explicitly stated
	166	* ✅ Reasoning chain provided
	167	* ✅ Quality Gate 4 validates verdict confidence
1.1	168
2.1	169	POC Output:
	170	```
	171	Verdict: 70% likely true
	172	Uncertainty: ±15% (moderate confidence)
	173	Reasoning: Based on 3 high-quality sources...
	174	Confidence Level: MEDIUM
	175	```
1.1	176
	177	Acceptance Criteria:
2.2	178
2.1	179	* Verdicts include probability (0-100%)
	180	* Uncertainty explicitly quantified
	181	* Reasoning chain explains verdict
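
A minimal sketch of a verdict record that enforces the criteria above and renders the POC output format. The `Verdict` class and its field names are assumptions for illustration, as is the choice to store probability and uncertainty as fractions between 0 and 1:

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    probability: float  # 0.0-1.0, rendered as a percentage
    uncertainty: float  # half-width of the ± band, 0.0-1.0
    confidence: str     # "LOW", "MEDIUM" or "HIGH"
    reasoning: str

    def __post_init__(self):
        # Reject values outside the ranges the acceptance criteria imply.
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        if not 0.0 <= self.uncertainty <= 1.0:
            raise ValueError("uncertainty must be between 0 and 1")
        if self.confidence not in {"LOW", "MEDIUM", "HIGH"}:
            raise ValueError("confidence must be LOW, MEDIUM or HIGH")
        if not self.reasoning.strip():
            raise ValueError("a reasoning chain is required")

    def render(self) -> str:
        return "\n".join([
            f"Verdict: {self.probability:.0%} likely true",
            f"Uncertainty: ±{self.uncertainty:.0%}",
            f"Reasoning: {self.reasoning}",
            f"Confidence Level: {self.confidence}",
        ])


verdict = Verdict(0.70, 0.15, "MEDIUM", "Based on 3 high-quality sources...")
print(verdict.render())
```

How the confidence level is derived from the uncertainty is not defined on this page, so the sketch takes it as an input rather than computing it.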
1.1	182
2.1	183	=== 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===
1.1	184
2.1	185	Main Requirement: Complete quality assurance with 7 quality gates
1.1	186
2.1	187	POC Implementation: 2 gates only
1.1	188
2.1	189	Quality Gate 1: Claim Validation
2.2	190
2.1	191	* ✅ Validates claim is factual and verifiable
	192	* ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
	193	* ✅ Provides clear rejection reason
1.1	194
2.1	195	Quality Gate 4: Verdict Confidence Assessment
2.2	196
2.1	197	* ✅ Validates ≥2 sources found
	198	* ✅ Validates quality score ≥0.6
	199	* ✅ Blocks low-confidence verdicts
	200	* ✅ Provides clear rejection reason
1.1	201
2.1	202	Out of Scope (POC2+):
2.2	203
2.1	204	* ❌ Gate 2: Evidence Relevance
	205	* ❌ Gate 3: Scenario Coherence
	206	* ❌ Gate 5: Source Diversity
	207	* ❌ Gate 6: Reasoning Validity
	208	* ❌ Gate 7: Output Completeness
1.1	209
2.1	210	Rationale: Prove gate concept works. Add remaining gates in POC2 after validating approach.
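
The two POC gates can be sketched as simple predicate functions. The thresholds (2 sources, quality score 0.6) come from the bullets above. Everything else is assumed for illustration: the `GateResult` type, the function names, and the idea that claim classification and the quality score are computed upstream and passed in:

```python
from dataclasses import dataclass
from typing import Optional

MIN_SOURCES = 2          # Gate 4: at least two sources
MIN_QUALITY_SCORE = 0.6  # Gate 4: minimum quality score
BLOCKED_CLAIM_TYPES = {"opinion", "prediction", "ambiguous"}


@dataclass
class GateResult:
    passed: bool
    reason: Optional[str] = None  # set only when the gate blocks


def gate1_claim_validation(claim_type: str) -> GateResult:
    """Block claims whose (already classified) type is non-factual."""
    if claim_type in BLOCKED_CLAIM_TYPES:
        return GateResult(False, f"Claim rejected: classified as {claim_type}")
    return GateResult(True)


def gate4_verdict_confidence(source_count: int, quality_score: float) -> GateResult:
    """Block verdicts that rest on too few sources or too low a quality score."""
    if source_count < MIN_SOURCES:
        return GateResult(False, f"Only {source_count} source(s) found, {MIN_SOURCES} required")
    if quality_score < MIN_QUALITY_SCORE:
        return GateResult(False, f"Quality score {quality_score:.2f} is below {MIN_QUALITY_SCORE}")
    return GateResult(True)


print(gate1_claim_validation("opinion"))
print(gate4_verdict_confidence(source_count=1, quality_score=0.9))
print(gate4_verdict_confidence(source_count=3, quality_score=0.8))
```

Returning a reason alongside the pass/fail flag keeps the "clear rejection reason" requirement in the gate itself rather than in the caller.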
1.1	211
	212
2.1	213	=== 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===
1.1	214
2.1	215	Main Requirements:
2.2	216
2.1	217	* NFR1: Response time < 30 seconds
	218	* NFR2: Handle 1000+ concurrent users
	219	* NFR3: 99.9% uptime
1.1	220
2.1	221	POC Implementation:
2.2	222
2.1	223	* ⚠️ Response time monitored (not optimized)
	224	* ⚠️ Single-threaded processing (no concurrency)
	225	* ⚠️ Basic error handling (no advanced retry logic)
1.1	226
2.1	227	Rationale: POC proves functionality. Performance optimization happens in POC2.
1.1	228
2.1	229	POC Acceptance:
2.2	230
2.1	231	* Analysis completes (no timeout requirement)
	232	* Errors don't crash system
	233	* Basic logging in place
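
The POC acceptance points above (analysis completes, errors are contained, basic logging) could be covered by a small wrapper like the following. This is a sketch: `run_monitored` and the shape of its return value are hypothetical, and `analyze` stands in for whatever function performs the analysis:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("poc")


def run_monitored(analyze, article):
    """Run one analysis, log its duration, and contain any exception."""
    started = time.perf_counter()
    try:
        result = analyze(article)
        status = "ok"
    except Exception as exc:  # POC: catch broadly so one bad input cannot crash the run
        logger.exception("Analysis failed")
        result, status = None, f"error: {exc}"
    elapsed = time.perf_counter() - started
    # Response time is recorded only; the POC enforces no timeout.
    logger.info("Analysis finished in %.2f s (status=%s)", elapsed, status)
    return {"result": result, "status": status, "elapsed_seconds": elapsed}


print(run_monitored(lambda text: text.upper(), "sample article")["status"])
```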
1.1	234
2.1	235	== 4. What's NOT in POC Scope ==
1.1	236
2.1	237	=== 4.1 User-Facing Features (Beta 0+) ===
1.1	238
2.1	239	{{warning}}
	240	Deferred to Beta 0:
	241	{{/warning}}
1.1	242
2.1	243	Out of Scope:
2.2	244
2.1	245	* ❌ User accounts and authentication (FR8)
	246	* ❌ User corrections system (FR9, FR45-46)
	247	* ❌ Public publishing interface (FR10)
	248	* ❌ Social sharing (FR11)
	249	* ❌ Email notifications (FR12)
	250	* ❌ API access (FR13)
1.1	251
2.1	252	Rationale: POC validates AI capabilities. User features added in Beta 0.
1.1	253
	254
2.1	255	=== 4.2 Advanced Features (V1.0+) ===
1.1	256
2.1	257	Out of Scope:
2.2	258
2.1	259	* ❌ IFCN compliance (FR47)
	260	* ❌ ClaimReview schema (FR48)
	261	* ❌ Archive.org integration (FR49)
	262	* ❌ OSINT toolkit (FR50)
	263	* ❌ Video verification (FR51)
	264	* ❌ Deepfake detection (FR52)
	265	* ❌ Cross-org sharing (FR53)
1.1	266
2.1	267	Rationale: Advanced features require a proven platform. Added in V1.0 and later.
1.1	268
	269
2.1	270	=== 4.3 Production Requirements (POC2, Beta 0) ===
1.1	271
2.1	272	Out of Scope:
2.2	273
2.1	274	* ❌ Security controls (NFR4, NFR12)
	275	* ❌ Code maintainability (NFR5)
	276	* ❌ System monitoring (NFR13)
	277	* ❌ Evidence deduplication
	278	* ❌ Advanced source verification
	279	* ❌ Full 7-gate quality framework
1.1	280
2.1	281	Rationale: POC proves concept. Production hardening happens in POC2 and Beta 0.
1.1	282
	283
2.1	284	== 5. POC Output Specification ==
1.1	285
2.1	286	=== 5.1 Required Output Elements ===
1.1	287
2.1	288	For each analyzed claim, POC must produce:
1.1	289
2.2	290	1. Claim
	291
2.1	293	* Original text
	294	* Classification (factual/non-factual/ambiguous)
	295	* If non-factual: Clear reason why
1.1	296
2.1	297	2. Scenarios (if factual)
2.2	298
2.1	299	* 2-3 interpretation scenarios
	300	* Each scenario clearly described
1.1	301
2.1	302	3. Evidence (if factual)
2.2	303
2.1	304	* Supporting evidence (2+ items)
	305	* Opposing evidence (if exists)
	306	* Source URLs and reliability scores
1.1	307
2.1	308	4. Verdict (if factual)
2.2	309
2.1	310	* Probability (0-100%)
	311	* Uncertainty quantification
	312	* Confidence level (LOW/MEDIUM/HIGH)
	313	* Reasoning chain
1.1	314
2.1	315	5. Quality Status
2.2	316
2.1	317	* Which gates passed/failed
	318	* If failed: Clear explanation why
1.1	319
2.1	320	=== 5.2 Example POC Output ===
1.1	321
2.1	322	{{code language="json"}}
	323	{
	324	  "claim": {
	325	    "text": "Switzerland has the highest life expectancy in Europe",
	326	    "type": "factual",
	327	    "gate1_status": "PASS"
	328	  },
	329	  "scenarios": [
	330	    "Switzerland's overall life expectancy is highest",
	331	    "Switzerland ranks highest for specific age groups"
	332	  ],
	333	  "evidence": {
	334	    "supporting": [
	335	      {
	336	        "source": "WHO Report 2023",
	337	        "reliability": 0.95,
	338	        "excerpt": "Switzerland: 83.4 years average..."
	339	      }
	340	    ],
	341	    "opposing": [
	342	      {
	343	        "source": "Eurostat 2024",
	344	        "reliability": 0.90,
	345	        "excerpt": "Spain leads at 83.5 years..."
	346	      }
	347	    ]
	348	  },
	349	  "verdict": {
	350	    "probability": 0.65,
	351	    "uncertainty": 0.15,
	352	    "confidence": "MEDIUM",
	353	    "reasoning": "WHO and Eurostat show similar but conflicting data...",
	354	    "gate4_status": "PASS"
	355	  }
	356	}
1.1	357	{{/code}}
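
A sketch of how an output document could be checked against the structure shown above. The field names are taken from the example; the `find_missing_fields` helper and the decision to treat every example field as required are assumptions:

```python
import json

REQUIRED_FIELDS = {
    "claim": {"text", "type", "gate1_status"},
    "verdict": {"probability", "uncertainty", "confidence", "reasoning", "gate4_status"},
}


def find_missing_fields(output: dict) -> list:
    """List the fields from the example structure that are absent from `output`."""
    missing = []
    for section, fields in REQUIRED_FIELDS.items():
        present = output.get(section)
        if not isinstance(present, dict):
            present = {}
        missing += [f"{section}.{name}" for name in sorted(fields - set(present))]
    if not isinstance(output.get("scenarios"), list):
        missing.append("scenarios")
    evidence = output.get("evidence")
    if not isinstance(evidence, dict):
        evidence = {}
    for side in ("supporting", "opposing"):
        if not isinstance(evidence.get(side), list):
            missing.append(f"evidence.{side}")
    return missing


# Deliberately incomplete sample: the verdict lacks reasoning and gate4_status.
raw = """
{
  "claim": {"text": "Example claim", "type": "factual", "gate1_status": "PASS"},
  "scenarios": ["Interpretation A", "Interpretation B"],
  "evidence": {"supporting": [], "opposing": []},
  "verdict": {"probability": 0.65, "uncertainty": 0.15, "confidence": "MEDIUM"}
}
"""
print(find_missing_fields(json.loads(raw)))
```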
	358
	359
2.1	360	== 6. Success Criteria ==
1.1	361
2.1	362	{{success}}
	363	POC Success Definition: POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts with quality gates improving output quality.
	364	{{/success}}
1.1	365
2.1	366	=== 6.1 Functional Success ===
1.1	367
2.1	368	POC is successful if:
1.1	369
2.1	370	✅ FR1-FR7 Requirements Met:
2.2	371
2.1	372	1. Extracts 3-5 factual claims from test articles
	373	2. Generates 2-3 scenarios per ambiguous claim
	374	3. Finds supporting AND opposing evidence
	375	4. Computes probabilistic verdicts with uncertainty
	376	5. Provides clear reasoning chains
1.1	377
2.1	378	✅ Quality Gates Work:
2.2	379
2.1	380	1. Gate 1 blocks non-factual claims (100% block rate)
	381	2. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
	382	3. Clear rejection reasons provided
1.1	383
2.1	384	✅ NFR11 Met:
2.2	385
2.1	386	1. Quality gates reduce hallucination rate
	387	2. Blocked outputs have clear explanations
	388	3. Quality metrics are logged
1.1	389
2.1	390	=== 6.2 Quality Thresholds ===
1.1	391
2.1	392	Minimum Acceptable:
2.2	393
2.1	394	* ≥70% of test claims correctly classified (factual/non-factual)
	395	* ≥60% of verdicts are reasonable (human evaluation)
	396	* Gate 1 blocks 100% of non-factual claims
	397	* Gate 4 blocks verdicts with <2 sources
1.1	398
2.1	399	Target:
2.2	400
2.1	401	* ≥80% claims correctly classified
	402	* ≥75% verdicts are reasonable
	403	* <10% false positives (blocking good claims)
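
The thresholds above can be applied to measured rates as follows. This is a sketch under two assumptions: all rates are fractions between 0 and 1, and the target tier also requires the minimum tier to be met. The Gate 4 criterion (blocking verdicts with fewer than 2 sources) is a behavioral check and is left out here. `quality_tier` is a hypothetical helper:

```python
MINIMUM = {"classification": 0.70, "reasonable_verdicts": 0.60}
TARGET = {"classification": 0.80, "reasonable_verdicts": 0.75, "false_positives": 0.10}


def quality_tier(classification: float, reasonable_verdicts: float,
                 gate1_block_rate: float, false_positives: float) -> str:
    """Classify measured rates as 'target', 'minimum' or 'below'."""
    meets_minimum = (
        classification >= MINIMUM["classification"]
        and reasonable_verdicts >= MINIMUM["reasonable_verdicts"]
        and gate1_block_rate == 1.0  # Gate 1 must block every non-factual claim
    )
    if not meets_minimum:
        return "below"
    meets_target = (
        classification >= TARGET["classification"]
        and reasonable_verdicts >= TARGET["reasonable_verdicts"]
        and false_positives < TARGET["false_positives"]
    )
    return "target" if meets_target else "minimum"


print(quality_tier(0.85, 0.78, 1.0, 0.05))
print(quality_tier(0.72, 0.65, 1.0, 0.20))
print(quality_tier(0.90, 0.90, 0.9, 0.00))
```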
1.1	404
2.1	405	=== 6.3 POC Decision Gate ===
1.1	406
2.1	407	After POC1, we decide:
1.1	408
2.1	409	✅ PROCEED to POC2 if:
2.2	410
2.1	411	* Success criteria met
	412	* Quality gates demonstrably improve output
	413	* Core workflow is technically sound
	414	* Clear path to production quality
1.1	415
2.1	416	⚠️ ITERATE POC1 if:
2.2	417
2.1	418	* Success criteria partially met
	419	* Gates work but need tuning
	420	* Core issues identified but fixable
1.1	421
2.1	422	❌ PIVOT APPROACH if:
2.2	423
2.1	424	* Success criteria not met
	425	* Fundamental AI limitations discovered
	426	* Quality gates insufficient
	427	* Alternative approach needed
1.1	428
2.1	429	== 7. Test Cases ==
1.1	430
2.1	431	=== 7.1 Happy Path ===
1.1	432
2.1	433	Test 1: Simple Factual Claim
2.2	434
2.1	435	* Input: "Paris is the capital of France"
2.2	436	* Expected: Factual, 1 scenario, verdict 95% true
1.1	437
2.1	438	Test 2: Ambiguous Claim
2.2	439
2.1	440	* Input: "Switzerland has the highest income in Europe"
	441	* Expected: Factual, 2-3 scenarios, verdict with uncertainty
1.1	442
2.1	443	Test 3: Statistical Claim
2.2	444
2.1	445	* Input: "10% of people have condition X"
	446	* Expected: Factual, evidence with numbers, probabilistic verdict
1.1	447
2.1	448	=== 7.2 Edge Cases ===
1.1	449
2.1	450	Test 4: Opinion
2.2	451
2.1	452	* Input: "Paris is the best city"
	453	* Expected: Non-factual (opinion), blocked by Gate 1
1.1	454
2.1	455	Test 5: Prediction
2.2	456
2.1	457	* Input: "Bitcoin will reach $100,000 next year"
	458	* Expected: Non-factual (prediction), blocked by Gate 1
1.1	459
2.1	460	Test 6: Insufficient Evidence
2.2	461
2.1	462	* Input: Obscure factual claim with no sources
	463	* Expected: Blocked by Gate 4 (<2 sources)
1.1	464
2.1	465	=== 7.3 Quality Gate Tests ===
1.1	466
2.1	467	Test 7: Gate 1 Effectiveness
2.2	468
2.1	469	* Input: Mix of 10 factual + 10 non-factual claims
	470	* Expected: Gate 1 blocks all 10 non-factual claims (100% recall on non-factual claims)
1.1	471
2.1	472	Test 8: Gate 4 Effectiveness
2.2	473
2.1	474	* Input: Claims with varying evidence availability
	475	* Expected: Gate 4 blocks low-confidence verdicts
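
A sketch of how the Gate 1 test could be scored on a labelled set. It reports two separate numbers: the share of non-factual claims that were blocked, and the share of factual claims that were wrongly blocked. The `score_gate1` helper and the sample decisions are hypothetical:

```python
def score_gate1(is_non_factual, was_blocked):
    """Score Gate 1 on a labelled test set.

    Both arguments are equal-length lists of booleans, one entry per claim.
    """
    pairs = list(zip(is_non_factual, was_blocked))
    true_blocks = sum(1 for nf, blocked in pairs if nf and blocked)
    false_blocks = sum(1 for nf, blocked in pairs if not nf and blocked)
    non_factual_total = sum(1 for nf, _ in pairs if nf)
    factual_total = len(pairs) - non_factual_total
    return {
        # share of non-factual claims that were blocked
        "block_rate": true_blocks / non_factual_total if non_factual_total else 1.0,
        # share of factual claims that were wrongly blocked
        "false_positive_rate": false_blocks / factual_total if factual_total else 0.0,
    }


# Test 7 layout: 10 factual claims followed by 10 non-factual claims.
labels = [False] * 10 + [True] * 10
# Hypothetical gate decisions: one factual claim wrongly blocked, all non-factual blocked.
decisions = [True] + [False] * 9 + [True] * 10
print(score_gate1(labels, decisions))
```

Keeping the two rates separate matters because a gate that blocks everything would reach a 100% block rate while failing the false-positive target.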
1.1	476
2.1	477	== 8. Technical Architecture (POC) ==
1.1	478
2.1	479	=== 8.1 Simplified Architecture ===
1.1	480
2.1	481	POC Tech Stack:
2.2	482
2.1	483	* Frontend: Simple web interface (Next.js + TypeScript)
	484	* Backend: Single API endpoint
	485	* AI: Claude API (Sonnet 4.5)
	486	* Storage: Local JSON files (no database)
	487	* Deployment: Single server
1.1	488
2.1	489	Architecture Diagram: See [[POC1 Specification>>FactHarbor.Specification.POC.Specification]]
1.1	490
	491
2.1	492	=== 8.2 AKEL Implementation ===
1.1	493
2.1	494	POC AKEL:
2.2	495
2.1	496	* Single-threaded processing
	497	* Synchronous API calls
	498	* No caching
	499	* Basic error handling
	500	* Console logging
1.1	501
2.1	502	Full AKEL (POC2+):
2.2	503
2.1	504	* Multi-threaded processing
	505	* Async API calls
	506	* Evidence caching
	507	* Advanced error handling with retry
	508	* Structured logging + monitoring
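
The POC AKEL characteristics listed above (single-threaded, synchronous calls, no caching, basic error handling, console logging) can be sketched as a plain sequential loop. The function names and the stage signatures are assumptions, and the three `fake_*` stages are stand-ins for the real LLM and search calls:

```python
def run_akel(article, extract_claims, find_evidence, compute_verdict):
    """Process one article strictly in sequence; each stage is a plain blocking call."""
    results = []
    for claim in extract_claims(article):
        try:
            evidence = find_evidence(claim)
            verdict = compute_verdict(claim, evidence)
            results.append({"claim": claim, "evidence": evidence, "verdict": verdict})
        except Exception as exc:  # basic error handling: record the failure and continue
            print(f"[akel] failed on claim {claim!r}: {exc}")  # console logging
            results.append({"claim": claim, "error": str(exc)})
    return results


# Stand-in stages for illustration only.
def fake_extract(article):
    return [part.strip() for part in article.split(".") if part.strip()]


def fake_evidence(claim):
    if "best" in claim:
        raise ValueError("no evidence search possible")
    return ["source-1", "source-2"]


def fake_verdict(claim, evidence):
    return {"probability": 0.5, "sources": len(evidence)}


rows = run_akel("Paris is the capital of France. Paris is the best city.",
                fake_extract, fake_evidence, fake_verdict)
for row in rows:
    print(row)
```

Passing the stages in as functions keeps the loop unchanged if the stages are later made asynchronous or cached in POC2.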
1.1	509
2.1	510	== 9. POC Philosophy ==
1.1	511
2.1	512	{{info}}
	513	Important: POC validates concept, not production readiness. Focus is on proving AI can do the job, with production quality coming in later phases.
	514	{{/info}}
1.1	515
2.1	516	=== 9.1 Core Principles ===
1.1	517
2.2	518	1. Prove Concept, Not Production
	519
2.1	521	* POC validates AI can do the job
	522	* Production quality comes in POC2 and Beta 0
	523	* Focus on "does it work?" not "is it perfect?"
1.1	524
2.1	525	2. Implement Subset of Requirements
2.2	526
2.1	527	* POC covers FR1-7, NFR11 (lite)
	528	* All other requirements deferred
	529	* Clear mapping to [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]]
1.1	530
2.1	531	3. Quality Gates Validate Approach
2.2	532
2.1	533	* 2 gates prove the concept
	534	* Remaining 5 gates added in POC2
	535	* Gates must demonstrably improve quality
1.1	536
2.1	537	4. Iterate Based on Results
2.2	538
2.1	539	* POC results determine next steps
	540	* Decision gate after POC1
	541	* Flexibility to pivot if needed
1.1	542
2.2	543	=== 9.2 Success: Clear Path Forward ===
1.1	546
2.1	547	POC succeeds if we can confidently answer:
1.1	548
2.1	549	✅ Technical Feasibility:
2.2	550
2.1	551	* Can AI extract claims reliably?
	552	* Can AI find balanced evidence?
	553	* Can AI compute reasonable verdicts?
1.1	554
2.1	555	✅ Quality Approach:
2.2	556
2.1	557	* Do quality gates improve output?
	558	* Can we measure and track quality?
	559	* Is the gate approach scalable?
1.1	560
2.1	561	✅ Production Path:
2.2	562
2.1	563	* Is the core architecture sound?
	564	* What needs improvement for production?
	565	* Is POC2 the right next step?
1.1	566
2.1	567	== 10. Related Pages ==
1.1	568
2.1	569	* [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] - Full system requirements (this POC implements a subset)
	570	* [[POC1 Specification (Detailed)>>FactHarbor.Specification.POC.Specification]] - Detailed POC1 technical specs
	571	* [[POC Summary>>FactHarbor.Specification.POC.Summary]] - High-level POC overview
	572	* [[Implementation Roadmap>>FactHarbor.Roadmap.WebHome]] - POC1, POC2, Beta 0, V1.0 phases
	573	* [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] - What users need (drives requirements)
1.1	574
2.1	575	Document Owner: Technical Team
	576	Review Frequency: After each POC iteration
	577	Version History:
2.2	578
2.1	579	* v1.0 - Initial POC requirements
	580	* v2.0 - Updated after specification cross-check
	581	* v3.0 - Aligned with Main Requirements (FR/NFR IDs added)