POC Requirements

= POC Requirements =

**Status:** ✅ Approved for Development

**Version:** 3.0 (Aligned with Main Requirements)

**Goal:** Prove that AI can extract claims and determine verdicts automatically without human intervention

**Core Philosophy:** POC validates the [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] through simplified implementation. All POC features map to formal FR/NFR requirements.

== 1. POC Overview ==

=== 1.1 What POC Tests ===

**Core Question:**

> Can AI automatically extract factual claims from articles and evaluate them with reasonable verdicts?

**What we're proving:**
* AI can identify factual claims from text
* AI can evaluate those claims with structured evidence
* Quality gates can filter unreliable outputs
* The core workflow is technically feasible

**What we're NOT proving:**
* Production-ready reliability (that's POC2)
* User-facing features (that's Beta 0)
* Full IFCN compliance (that's V1.0)

=== 1.2 Requirements Mapping ===

POC1 implements a **subset** of the full system requirements defined in [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]].

**Scope Summary:**
* **In Scope:** 8 requirements (7 FRs + 1 NFR)
* **Partial:** 3 NFRs (simplified versions)
* **Out of Scope:** 19 requirements (deferred to later phases)

== 2. Requirements Scope Matrix ==

**Requirements Traceability:** This matrix shows which [[Main Requirements>>FactHarbor.Specification.Requirements.WebHome]] are implemented in POC1, providing full traceability between POC and system requirements.

|=Requirement|=POC1 Status|=Implementation Level|=Notes
|**CORE WORKFLOW**||||
|FR1: Claim Extraction|✅ **In Scope**|Full|AKEL extracts claims from text
|FR2: Claim Context|✅ **In Scope**|Basic|Context preserved with claim
|FR3: Multiple Scenarios|✅ **In Scope**|Full|AKEL generates interpretation scenarios
|FR4: Analysis Summary|✅ **In Scope**|Basic|Simple summary format
|FR5: Evidence Collection|✅ **In Scope**|Full|AKEL searches for evidence
|FR6: Evidence Evaluation|✅ **In Scope**|Full|AKEL evaluates source reliability
|FR7: Automated Verdicts|✅ **In Scope**|Full|AKEL computes verdicts with uncertainty
|**QUALITY & RELIABILITY**||||
|NFR11: Quality Assurance|✅ **In Scope**|**Lite**|**2 gates only** (Gate 1 & 4)
|NFR1: Performance|⚠️ **Partial**|Basic|Response time monitored, not optimized
|NFR2: Scalability|⚠️ **Partial**|Single-thread|No concurrent processing
|NFR3: Reliability|⚠️ **Partial**|Basic|Error handling, no retry logic
|**DEFERRED TO LATER**||||
|FR8-FR13|❌ Out of Scope|N/A|User accounts, corrections, publishing
|FR44-FR53|❌ Out of Scope|N/A|Advanced features (V1.0+)
|NFR4: Security|❌ Out of Scope|N/A|POC2
|NFR5: Maintainability|❌ Out of Scope|N/A|POC2
|NFR12: Security Controls|❌ Out of Scope|N/A|Beta 0
|NFR13: Monitoring|❌ Out of Scope|N/A|POC2

== 3. POC Simplifications ==

=== 3.1 FR1: Claim Extraction (Full Implementation) ===

**Main Requirement:** AI extracts factual claims from input text

**POC Implementation:**
* ✅ AKEL extracts claims using an LLM
* ✅ Each claim includes a reference to the original text
* ✅ Claims are identified as factual/non-factual
* ❌ No advanced claim parsing (added in POC2)

**Acceptance Criteria:**
* Extracts 3-5 claims from a typical article
* Identifies factual vs non-factual claims
* Quality Gate 1 validates extraction

=== 3.2 FR3: Multiple Scenarios (Full Implementation) ===

**Main Requirement:** Generate multiple interpretation scenarios for ambiguous claims

**POC Implementation:**
* ✅ AKEL generates 2-3 scenarios per claim
* ✅ Scenarios capture different interpretations
* ✅ Each scenario is evaluated separately
* ✅ Verdict considers all scenarios

**Acceptance Criteria:**
* Generates 2+ scenarios for ambiguous claims
* Scenarios are meaningfully different
* All scenarios are evaluated

=== 3.3 FR4: Analysis Summary (Basic Implementation) ===

**Main Requirement:** Provide user-friendly summary of analysis

**POC Implementation:**
* ✅ Simple text summary generated
* ❌ No rich formatting (added in Beta 0)
* ❌ No visual elements (added in Beta 0)
* ❌ No interactive features (added in Beta 0)

**POC Format:**

```
Claim: [extracted claim]
Scenarios: [list of scenarios]
Evidence: [supporting/opposing evidence]
Verdict: [probability with uncertainty]
```
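
The four-line format above could be produced by a small formatter. The following is a minimal sketch only: the `AnalysisSummary` shape, its field names, and the exact wording of the evidence and verdict lines are assumptions for illustration, not part of the specification.

```typescript
// Hypothetical input shape; field names are assumptions, not defined by the spec.
interface AnalysisSummary {
  claim: string;
  scenarios: string[];
  supporting: string[]; // short descriptions of supporting evidence
  opposing: string[];   // short descriptions of opposing evidence
  probability: number;  // 0..1
  uncertainty: number;  // 0..1, rendered as ± percentage points
}

// Render the plain-text summary: one line each for claim, scenarios, evidence, verdict.
function renderSummary(s: AnalysisSummary): string {
  const pct = (x: number): number => Math.round(x * 100);
  const evidence = `${s.supporting.length} supporting, ${s.opposing.length} opposing`;
  return [
    `Claim: ${s.claim}`,
    `Scenarios: ${s.scenarios.join("; ")}`,
    `Evidence: ${evidence}`,
    `Verdict: ${pct(s.probability)}% likely true (±${pct(s.uncertainty)}%)`,
  ].join("\n");
}
```

Keeping the formatter free of any markup matches the "simple text summary" scope; rich formatting is deferred to Beta 0.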

=== 3.4 FR5-FR6: Evidence Collection & Evaluation (Full Implementation) ===

**Main Requirements:**
* FR5: Collect supporting and opposing evidence
* FR6: Evaluate evidence source reliability

**POC Implementation:**
* ✅ AKEL searches for evidence (web/knowledge base)
* ✅ **Mandatory contradiction search** (finds opposing evidence)
* ✅ Source reliability scoring
* ❌ No evidence deduplication (added in POC2)
* ❌ No advanced source verification (added in POC2)

**Acceptance Criteria:**
* Finds 2+ supporting evidence items
* Finds 1+ opposing evidence (if exists)
* Sources scored for reliability
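
The acceptance criteria above can be expressed as a simple check over the collected evidence. This is a sketch under assumptions: `EvidenceSet`, `contradictionSearchRan`, and `checkEvidence` are hypothetical names, and because opposing evidence is only required "if exists", the check verifies that the contradiction search ran rather than that it found something.

```typescript
// Hypothetical shapes for illustration; not defined by the specification.
interface EvidenceItem {
  source: string;
  reliability?: number; // 0..1; undefined means the source was never scored
}

interface EvidenceSet {
  supporting: EvidenceItem[];
  opposing: EvidenceItem[];
  contradictionSearchRan: boolean; // whether the mandatory search for opposing evidence was executed
}

// Returns the list of unmet acceptance criteria (empty when all are met).
function checkEvidence(e: EvidenceSet): string[] {
  const problems: string[] = [];
  if (e.supporting.length < 2) {
    problems.push("fewer than 2 supporting evidence items");
  }
  if (!e.contradictionSearchRan) {
    problems.push("contradiction search was not run");
  }
  const unscored = e.supporting
    .concat(e.opposing)
    .filter((item) => item.reliability === undefined);
  if (unscored.length > 0) {
    problems.push(`${unscored.length} source(s) without a reliability score`);
  }
  return problems;
}
```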

=== 3.5 FR7: Automated Verdicts (Full Implementation) ===

**Main Requirement:** AI computes verdicts with uncertainty quantification

**POC Implementation:**
* ✅ Probabilistic verdicts (0-100% probability)
* ✅ Uncertainty explicitly stated
* ✅ Reasoning chain provided
* ✅ Quality Gate 4 validates verdict confidence

**POC Output:**

```
Verdict: 70% likely true
Uncertainty: ±15% (moderate confidence)
Reasoning: Based on 3 high-quality sources...
Confidence Level: MEDIUM
```

**Acceptance Criteria:**
* Verdicts include probability (0-100%)
* Uncertainty explicitly quantified
* Reasoning chain explains verdict

=== 3.6 NFR11: Quality Assurance Framework (LITE VERSION) ===

**Main Requirement:** Complete quality assurance with 7 quality gates

**POC Implementation:** **2 gates only**

**Quality Gate 1: Claim Validation**
* ✅ Validates claim is factual and verifiable
* ✅ Blocks non-factual claims (opinion/prediction/ambiguous)
* ✅ Provides clear rejection reason
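
Gate 1 as described here reduces to a decision over the claim's classification. The sketch below assumes the classification has already been produced upstream (for example by the LLM); the `ClaimType` values follow the categories named above, while the `BLOCKED` status label, the `gate1` function name, and the reason texts are assumptions for illustration.

```typescript
type ClaimType = "factual" | "opinion" | "prediction" | "ambiguous";

interface GateResult {
  gate: string;
  status: "PASS" | "BLOCKED"; // "BLOCKED" is an assumed label
  reason?: string;             // always set when status is BLOCKED
}

// Gate 1: only factual claims pass; every block carries a rejection reason.
function gate1(claimType: ClaimType): GateResult {
  if (claimType === "factual") {
    return { gate: "gate1", status: "PASS" };
  }
  // Illustrative reason texts; the spec only requires that a clear reason is given.
  const reasons: Record<Exclude<ClaimType, "factual">, string> = {
    opinion: "Claim expresses an opinion and cannot be verified",
    prediction: "Claim is a prediction about the future and cannot be verified yet",
    ambiguous: "Claim is too ambiguous to verify",
  };
  return { gate: "gate1", status: "BLOCKED", reason: reasons[claimType] };
}
```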

**Quality Gate 4: Verdict Confidence Assessment**
* ✅ Validates ≥2 sources found
* ✅ Validates quality score ≥0.6
* ✅ Blocks low-confidence verdicts
* ✅ Provides clear rejection reason
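
The two Gate 4 thresholds (at least 2 sources, quality score of at least 0.6) can be sketched as follows. The specification does not define how the quality score is computed, so this example assumes it is the mean source reliability; that formula, the `BLOCKED` label, and all names below are assumptions.

```typescript
interface ScoredSource {
  source: string;
  reliability: number; // 0..1
}

interface Gate4Result {
  status: "PASS" | "BLOCKED"; // "BLOCKED" is an assumed label
  reason?: string;
}

const MIN_SOURCES = 2;   // from the spec: at least 2 sources
const MIN_QUALITY = 0.6; // from the spec: quality score of at least 0.6

// Gate 4: block verdicts backed by too few sources or too low a quality score.
function gate4(sources: ScoredSource[]): Gate4Result {
  if (sources.length < MIN_SOURCES) {
    return {
      status: "BLOCKED",
      reason: `only ${sources.length} source(s) found, need at least ${MIN_SOURCES}`,
    };
  }
  // ASSUMPTION: quality score = mean reliability of all sources.
  const quality = sources.reduce((sum, s) => sum + s.reliability, 0) / sources.length;
  if (quality < MIN_QUALITY) {
    return {
      status: "BLOCKED",
      reason: `quality score ${quality.toFixed(2)} is below ${MIN_QUALITY}`,
    };
  }
  return { status: "PASS" };
}
```

Checking the source count first means the rejection reason names the most basic failure, which keeps blocked outputs easy to interpret.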

**Out of Scope (POC2+):**
* ❌ Gate 2: Evidence Relevance
* ❌ Gate 3: Scenario Coherence
* ❌ Gate 5: Source Diversity
* ❌ Gate 6: Reasoning Validity
* ❌ Gate 7: Output Completeness

**Rationale:** Prove that the gate concept works. The remaining gates are added in POC2 once the approach is validated.

=== 3.7 NFR1-3: Performance, Scalability, Reliability (Basic) ===

**Main Requirements:**
* NFR1: Response time < 30 seconds
* NFR2: Handle 1000+ concurrent users
* NFR3: 99.9% uptime

**POC Implementation:**
* ⚠️ **Response time monitored** (not optimized)
* ⚠️ **Single-threaded processing** (no concurrency)
* ⚠️ **Basic error handling** (no advanced retry logic)

**Rationale:** POC proves functionality. Performance optimization happens in POC2.

**POC Acceptance:**
* Analysis completes (no timeout requirement)
* Errors don't crash the system
* Basic logging is in place

== 4. What's NOT in POC Scope ==

=== 4.1 User-Facing Features (Beta 0+) ===

**Out of Scope (deferred to Beta 0):**
* ❌ User accounts and authentication (FR8)
* ❌ User corrections system (FR9, FR45-46)
* ❌ Public publishing interface (FR10)
* ❌ Social sharing (FR11)
* ❌ Email notifications (FR12)
* ❌ API access (FR13)

**Rationale:** POC validates AI capabilities. User features are added in Beta 0.

=== 4.2 Advanced Features (V1.0+) ===

**Out of Scope:**
* ❌ IFCN compliance (FR47)
* ❌ ClaimReview schema (FR48)
* ❌ Archive.org integration (FR49)
* ❌ OSINT toolkit (FR50)
* ❌ Video verification (FR51)
* ❌ Deepfake detection (FR52)
* ❌ Cross-org sharing (FR53)

**Rationale:** Advanced features require a proven platform. They are added in V1.0 and later.

=== 4.3 Production Requirements (POC2, Beta 0) ===

**Out of Scope:**
* ❌ Security controls (NFR4, NFR12)
* ❌ Code maintainability (NFR5)
* ❌ System monitoring (NFR13)
* ❌ Evidence deduplication
* ❌ Advanced source verification
* ❌ Full 7-gate quality framework

**Rationale:** POC proves concept. Production hardening happens in POC2 and Beta 0.

== 5. POC Output Specification ==

=== 5.1 Required Output Elements ===

For each analyzed claim, POC must produce:

**1. Claim**
* Original text
* Classification (factual/non-factual/ambiguous)
* If non-factual: Clear reason why

**2. Scenarios** (if factual)
* 2-3 interpretation scenarios
* Each scenario clearly described

**3. Evidence** (if factual)
* Supporting evidence (2+ items)
* Opposing evidence (if exists)
* Source URLs and reliability scores

**4. Verdict** (if factual)
* Probability (0-100%)
* Uncertainty quantification
* Confidence level (LOW/MEDIUM/HIGH)
* Reasoning chain

**5. Quality Status**
* Which gates passed/failed
* If failed: Clear explanation why
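
The required elements can be captured as TypeScript types with a small completeness check. This is a sketch, not a defined schema: the field names follow the JSON example in section 5.2, while the `FAIL` status value, the optional `reason` field, and the `missingElements` helper are assumptions. Treating every classification other than "factual" as needing a reason is also an assumption.

```typescript
type GateStatus = "PASS" | "FAIL"; // only "PASS" appears in the spec's example; "FAIL" is assumed

interface Evidence {
  source: string;
  reliability: number; // 0..1
  excerpt: string;
}

interface PocOutput {
  claim: {
    text: string;
    type: "factual" | "non-factual" | "ambiguous";
    gate1_status: GateStatus;
    reason?: string; // assumed field: explanation when the claim is not factual
  };
  scenarios?: string[];
  evidence?: { supporting: Evidence[]; opposing: Evidence[] };
  verdict?: {
    probability: number; // 0..1
    uncertainty: number; // 0..1
    confidence: "LOW" | "MEDIUM" | "HIGH";
    reasoning: string;
    gate4_status: GateStatus;
  };
}

// Lists required elements that are absent from an output (empty when complete).
function missingElements(o: PocOutput): string[] {
  const missing: string[] = [];
  if (o.claim.type !== "factual") {
    // Claims that are not factual only need a clear reason.
    if (!o.claim.reason) missing.push("claim.reason");
    return missing;
  }
  if (!o.scenarios || o.scenarios.length === 0) missing.push("scenarios");
  if (!o.evidence || o.evidence.supporting.length < 2) {
    missing.push("evidence.supporting (2+ items)");
  }
  if (!o.verdict) missing.push("verdict");
  return missing;
}
```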

=== 5.2 Example POC Output ===

```
{
  "claim": {
    "text": "Switzerland has the highest life expectancy in Europe",
    "type": "factual",
    "gate1_status": "PASS"
  },
  "scenarios": [
    "Switzerland's overall life expectancy is highest",
    "Switzerland ranks highest for specific age groups"
  ],
  "evidence": {
    "supporting": [
      {
        "source": "WHO Report 2023",
        "reliability": 0.95,
        "excerpt": "Switzerland: 83.4 years average..."
      }
    ],
    "opposing": [
      {
        "source": "Eurostat 2024",
        "reliability": 0.90,
        "excerpt": "Spain leads at 83.5 years..."
      }
    ]
  },
  "verdict": {
    "probability": 0.65,
    "uncertainty": 0.15,
    "confidence": "MEDIUM",
    "reasoning": "WHO and Eurostat show similar but conflicting data...",
    "gate4_status": "PASS"
  }
}
```

== 6. Success Criteria ==

**POC Success Definition:** POC validates that AI can extract claims, find balanced evidence, and compute reasonable verdicts, with quality gates improving output quality.

=== 6.1 Functional Success ===

POC is successful if:

✅ **FR1-FR7 Requirements Met:**
1. Extracts 3-5 factual claims from test articles
1. Generates 2-3 scenarios per ambiguous claim
1. Finds supporting AND opposing evidence
1. Computes probabilistic verdicts with uncertainty
1. Provides clear reasoning chains

✅ **Quality Gates Work:**
1. Gate 1 blocks non-factual claims (100% block rate)
1. Gate 4 blocks low-quality verdicts (blocks if <2 sources or quality <0.6)
1. Clear rejection reasons provided

✅ **NFR11 Met:**
1. Quality gates reduce hallucination rate
1. Blocked outputs have clear explanations
1. Quality metrics are logged

=== 6.2 Quality Thresholds ===

**Minimum Acceptable:**
* ≥70% of test claims correctly classified (factual/non-factual)
* ≥60% of verdicts are reasonable (human evaluation)
* Gate 1 blocks 100% of non-factual claims
* Gate 4 blocks verdicts with <2 sources or quality score <0.6

**Target:**
* ≥80% of test claims correctly classified
* ≥75% of verdicts are reasonable
* <10% false positives (blocking good claims)
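
Once the test run has produced its metrics, the thresholds above can be evaluated mechanically. The sketch below is illustrative: `PocMetrics`, `Outcome`, and `assessQuality` are hypothetical names, all rates are assumed to be fractions between 0 and 1, and the Gate 4 criterion is left out because it is a per-verdict check rather than a rate.

```typescript
// Hypothetical metrics record; all values are fractions in the range 0..1.
interface PocMetrics {
  classificationAccuracy: number; // share of test claims classified correctly
  reasonableVerdicts: number;     // share of verdicts judged reasonable by human reviewers
  gate1BlockRate: number;         // share of non-factual claims blocked by Gate 1
  falsePositiveRate: number;      // share of good claims wrongly blocked
}

type Outcome = "TARGET" | "MINIMUM" | "BELOW_MINIMUM";

// Compare measured metrics against the minimum and target thresholds.
function assessQuality(m: PocMetrics): Outcome {
  const meetsMinimum =
    m.classificationAccuracy >= 0.7 &&
    m.reasonableVerdicts >= 0.6 &&
    m.gate1BlockRate === 1;
  if (!meetsMinimum) return "BELOW_MINIMUM";

  const meetsTarget =
    m.classificationAccuracy >= 0.8 &&
    m.reasonableVerdicts >= 0.75 &&
    m.falsePositiveRate < 0.1;
  return meetsTarget ? "TARGET" : "MINIMUM";
}
```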

=== 6.3 POC Decision Gate ===

**After POC1, we decide:**

**✅ PROCEED to POC2** if:
* Success criteria met
* Quality gates demonstrably improve output
* Core workflow is technically sound
* Clear path to production quality

**⚠️ ITERATE POC1** if:
* Success criteria partially met
* Gates work but need tuning
* Core issues identified but fixable

**❌ PIVOT APPROACH** if:
* Success criteria not met
* Fundamental AI limitations discovered
* Quality gates insufficient
* Alternative approach needed

== 7. Test Cases ==

=== 7.1 Happy Path ===

**Test 1: Simple Factual Claim**
* Input: "Paris is the capital of France"
* Expected: Factual, 1 scenario, verdict ~95% true

**Test 2: Ambiguous Claim**
* Input: "Switzerland has the highest income in Europe"
* Expected: Factual, 2-3 scenarios, verdict with uncertainty

**Test 3: Statistical Claim**
* Input: "10% of people have condition X"
* Expected: Factual, evidence with numbers, probabilistic verdict

=== 7.2 Edge Cases ===

**Test 4: Opinion**
* Input: "Paris is the best city"
* Expected: Non-factual (opinion), blocked by Gate 1

**Test 5: Prediction**
* Input: "Bitcoin will reach $100,000 next year"
* Expected: Non-factual (prediction), blocked by Gate 1

**Test 6: Insufficient Evidence**
* Input: Obscure factual claim with no sources
* Expected: Blocked by Gate 4 (<2 sources)

=== 7.3 Quality Gate Tests ===

**Test 7: Gate 1 Effectiveness**
* Input: Mix of 10 factual + 10 non-factual claims
* Expected: Gate 1 blocks all 10 non-factual claims (100% block rate)

**Test 8: Gate 4 Effectiveness**
* Input: Claims with varying evidence availability
* Expected: Gate 4 blocks low-confidence verdicts
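
For Test 7, the result of a labeled run can be summarized as a block rate (share of non-factual claims that Gate 1 blocked) and a false-positive rate (share of factual claims wrongly blocked). The following sketch assumes each test claim carries a human label; `LabeledResult` and `gate1Effectiveness` are hypothetical names.

```typescript
// One test claim after the run: the human label and what Gate 1 did with it.
interface LabeledResult {
  expectedFactual: boolean; // human label
  blockedByGate1: boolean;  // observed gate decision
}

interface Gate1Effectiveness {
  blockRate: number;         // blocked non-factual claims / all non-factual claims
  falsePositiveRate: number; // blocked factual claims / all factual claims
}

function gate1Effectiveness(results: LabeledResult[]): Gate1Effectiveness {
  const nonFactual = results.filter((r) => !r.expectedFactual);
  const factual = results.filter((r) => r.expectedFactual);
  // Avoid division by zero when a category is absent from the test set.
  const share = (part: number, whole: number): number => (whole === 0 ? 0 : part / whole);
  return {
    blockRate: share(nonFactual.filter((r) => r.blockedByGate1).length, nonFactual.length),
    falsePositiveRate: share(factual.filter((r) => r.blockedByGate1).length, factual.length),
  };
}
```

Reporting both numbers matters because a gate that blocks everything would reach a 100% block rate while also rejecting every good claim.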

== 8. Technical Architecture (POC) ==

=== 8.1 Simplified Architecture ===

**POC Tech Stack:**
* **Frontend:** Simple web interface (Next.js + TypeScript)
* **Backend:** Single API endpoint
* **AI:** Claude API (Sonnet 4.5)
* **Storage:** Local JSON files (no database)
* **Deployment:** Single server

**Architecture Diagram:** See [[POC1 Specification>>FactHarbor.Specification.POC.Specification]]

=== 8.2 AKEL Implementation ===

**POC AKEL:**
* Single-threaded processing
* Synchronous API calls
* No caching
* Basic error handling
* Console logging
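
The POC characteristics listed above (sequential stages, no caching, errors caught instead of crashing, console logging) could look roughly like the runner below. This is a minimal sketch, not the actual AKEL design: `Stage`, `PipelineResult`, and `runPipeline` are hypothetical names, and stages are modeled as plain synchronous functions to keep the example self-contained.

```typescript
// A pipeline stage, e.g. claim extraction or evidence search (names are up to the caller).
interface Stage {
  name: string;
  run: (input: unknown) => unknown;
}

interface PipelineResult {
  ok: boolean;
  output?: unknown;     // result of the last stage when ok is true
  failedStage?: string; // name of the stage that threw, when ok is false
  error?: string;
}

// Runs stages one after another; a failing stage stops the run but never throws.
function runPipeline(stages: Stage[], article: string): PipelineResult {
  let current: unknown = article;
  for (const stage of stages) {
    const started = Date.now();
    try {
      current = stage.run(current);
      console.log(`[akel] ${stage.name} finished in ${Date.now() - started} ms`);
    } catch (err) {
      // Basic error handling: log, report the failure, no retry.
      const message = err instanceof Error ? err.message : String(err);
      console.error(`[akel] ${stage.name} failed: ${message}`);
      return { ok: false, failedStage: stage.name, error: message };
    }
  }
  return { ok: true, output: current };
}
```

Logging the elapsed time per stage is one simple way to cover the "response time monitored, not optimized" scope without adding any monitoring infrastructure.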

**Full AKEL (POC2+):**
* Multi-threaded processing
* Async API calls
* Evidence caching
* Advanced error handling with retry
* Structured logging + monitoring

== 9. POC Philosophy ==

**Important:** POC validates the concept, not production readiness. The focus is on proving that AI can do the job; production quality comes in later phases.

=== 9.1 Core Principles ===

**1. Prove Concept, Not Production**
* POC validates AI can do the job
* Production quality comes in POC2 and Beta 0