= Quality Gates Reference =


**Version**: 1.0

**Status**: Consolidated Reference

**Date**: February 3, 2026

----

== 1. Overview ==

Quality Gates are checkpoints in the FactHarbor analysis pipeline that enforce minimum standards for claim evaluation and verdict confidence. They ensure that:

* Only verifiable claims are analyzed (Gate 1)
* Verdicts have sufficient supporting evidence (Gate 4)
* Results meet minimum quality thresholds before publication

**Target Audience**: Developers, prompt engineers, and quality assurance reviewers.


**Implemented Gates**: Gate 1 (Claim Validation) and Gate 4 (Verdict Confidence Assessment)

**Replaces scattered documentation in**:

* [[Architecture Overview>>FactHarbor.Specification.Implementation.Architecture Overview.WebHome]] (Quality Gates section)
* Calculations.md (Gate 4 sections, still in Docs/ARCHITECTURE/)
* [[TriplePath Architecture>>FactHarbor.Specification.Implementation.Pipeline Architecture.TriplePath Architecture.WebHome]] (quality gates mentions)

----

== 2. Gate Architecture ==


=== 2.1 Pipeline Integration ===

flowchart TB
    subgraph UNDERSTAND["Phase 1: UNDERSTAND"]
        Input[User Input] --> ClaimExtraction[Claim Extraction]
        ClaimExtraction --> GATE1["Gate 1: Claim Validation ━━━━━━━━━━━━━ Filter opinions, predictions, low-specificity"]
        GATE1 -->|Pass| ValidClaims[Valid Claims]
        GATE1 -.->|Fail| ExcludedClaims[Excluded Claims with reasons]
    end

    subgraph RESEARCH["Phase 2: RESEARCH"]
        ValidClaims --> Search[Web Search]
        Search --> Sources[Source Documents]
        Sources --> EvidenceExtraction[Evidence Extraction]
    end

    subgraph VERDICT["Phase 3: VERDICT GENERATION"]
        EvidenceExtraction --> VerdictGeneration[Verdict Generation]
        VerdictGeneration --> GATE4["Gate 4: Confidence Assessment ━━━━━━━━━━━━━ Check source count, fact count, reasoning quality"]
        GATE4 -->|Pass| PublishableVerdicts[Publishable Verdicts]
        GATE4 -->|Warn| LowConfidenceVerdicts[Low Confidence Verdicts]
    end

    style GATE1 fill:#fff9c4
    style GATE4 fill:#fff9c4
    style ExcludedClaims fill:#ffcdd2
    style ValidClaims fill:#c8e6c9
    style PublishableVerdicts fill:#c8e6c9
    style LowConfidenceVerdicts fill:#ffecb3

=== 2.2 Gate States ===

|= State |= Description |= Action
| **Pass** | Meets all criteria | Proceed normally
| **Warn** | Below recommended threshold but above minimum | Proceed with warning flag
| **Fail** | Does not meet minimum criteria | Exclude from analysis or mark as insufficient

=== 2.3 Result Metadata ===

Every analysis result includes gate statistics:

interface QualityGates {
  gate1Stats: {
    totalClaims: number;
    validClaims: number;
    excludedClaims: number;
    exclusionReasons: { claimId: string; reason: string }[];
  };
  gate4Stats: {
    totalVerdicts: number;
    highConfidence: number;
    mediumConfidence: number;
    lowConfidence: number;
    insufficient: number;
  };
}

----

== 3. Gate 1: Claim Validation ==

=== 3.1 Purpose ===

Filter out non-verifiable claims before research begins, preventing wasted resources on opinions, predictions, or vague statements.

=== 3.2 Criteria ===

**Claims are EXCLUDED if**:

1. **Opinion/Editorial**: Subjective judgment without factual basis
1*. Example: "Policy X is the best approach"
1*. Action: Exclude unless claim is central to thesis
1. **Prediction/Speculation**: Future-oriented claims that cannot be verified
1*. Example: "Technology Y will dominate the market by 2030"
1*. Action: Exclude unless claim is central to thesis
1. **Low Specificity**: Vague statements without concrete assertions
1*. Example: "Some experts believe..."
1*. Action: Exclude unless claim is central to thesis

**Claims are KEPT if**:

* **Factual assertion**: Verifiable statement about past/present
* **Central claim**: Core thesis claim (kept regardless of specificity)
* **Attribution claim**: Claims about what someone said/did

=== 3.3 Implementation ===

**File**: ##apps/web/src/lib/analyzer/orchestrated.ts##
**Function**: Applied during the ##understandClaim()## phase
**Phase**: UNDERSTAND (Phase 1)

**Exclusion Process**:

1. The LLM extracts claims and marks each with a role and type
1. A deterministic filter applies the Gate 1 criteria
1. Excluded claims are logged with reasons
1. Valid claims proceed to the research phase
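
The deterministic filter step can be pictured roughly as follows. This is only a sketch, not the code in ##orchestrated.ts##: the ##ExtractedClaim## shape, the ##isCentral## flag, the ##ClaimType## values, and the function name ##applyGate1## are all hypothetical, and the reason strings are borrowed from the sample result JSON in this document.

```typescript
// Hypothetical claim shape: the source only states that each claim
// is marked with a role and a type by the LLM.
type ClaimType = "factual" | "attribution" | "opinion" | "prediction" | "vague";

interface ExtractedClaim {
  id: string;
  text: string;
  type: ClaimType;
  isCentral: boolean; // assumed flag meaning "central to thesis"
}

interface Gate1Config {
  gate1Enabled: boolean;
  gate1KeepCentralClaims: boolean;
}

interface Gate1Result {
  validClaims: ExtractedClaim[];
  exclusionReasons: { claimId: string; reason: string }[];
}

// Claim types that Gate 1 excludes, with the reason that gets logged.
const EXCLUSION_REASONS: Partial<Record<ClaimType, string>> = {
  opinion: "Opinion without factual basis",
  prediction: "Prediction about future events",
  vague: "Low specificity, non-central",
};

function applyGate1(claims: ExtractedClaim[], config: Gate1Config): Gate1Result {
  const result: Gate1Result = { validClaims: [], exclusionReasons: [] };
  for (const claim of claims) {
    const reason = EXCLUSION_REASONS[claim.type];
    const keptAsCentral = config.gate1KeepCentralClaims && claim.isCentral;
    if (!config.gate1Enabled || reason === undefined || keptAsCentral) {
      result.validClaims.push(claim); // proceeds to the research phase
    } else {
      result.exclusionReasons.push({ claimId: claim.id, reason });
    }
  }
  return result;
}
```

Keeping the filter a pure function of the claim list and the config makes each exclusion reproducible from the logged reasons.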


=== 3.4 Configuration ===

**UCM Pipeline Config**:

{
  "gate1Enabled": true,           // Enable/disable Gate 1
  "gate1KeepCentralClaims": true  // Keep central claims regardless of specificity
}

=== 3.5 Examples ===

**Example 1: Opinion Excluded**

Claim: "The Supreme Court's decision was unjust."
Role: evaluative
Result: EXCLUDED (opinion - no factual basis)
Reason: "Evaluative opinion without factual assertion"

**Example 2: Central Claim Kept**

Claim: "The policy will significantly improve outcomes."
Role: core
Result: KEPT (central to thesis, despite low specificity)
Reason: "Central claim kept for analysis"

**Example 3: Factual Assertion Kept**

Claim: "The court ruled in favor of Party A on January 15, 2025."
Role: core
Type: factual
Result: KEPT (verifiable factual assertion)

----

== 4. Gate 4: Verdict Confidence Assessment ==

=== 4.1 Purpose ===

Ensure verdicts have sufficient supporting evidence before publication, preventing low-confidence judgments from misleading users.


=== 4.2 Confidence Tiers ===

|= Tier |= Criteria |= Interpretation
| **HIGH** | 3+ sources AND 5+ facts AND reasoning >100 chars | Strong evidence base, high reliability
| **MEDIUM** | 2+ sources AND 3+ facts AND reasoning >50 chars | Adequate evidence, moderate reliability
| **LOW** | 1+ sources AND 1+ facts | Minimal evidence, low reliability
| **INSUFFICIENT** | <1 source OR <1 fact | Insufficient evidence for verdict
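
Read top to bottom, the tier table amounts to a short cascade of checks. The sketch below hard-codes the thresholds from the table for illustration. The ##VerdictEvidence## shape and the name ##assignConfidenceTier## are hypothetical, and the real ##validateVerdictGate4()## may read the MEDIUM thresholds from the UCM config instead.

```typescript
type ConfidenceTier = "HIGH" | "MEDIUM" | "LOW" | "INSUFFICIENT";

// Hypothetical input shape holding the three measurements the table uses.
interface VerdictEvidence {
  sourceCount: number;
  factCount: number;
  reasoningLength: number; // characters of reasoning text
}

function assignConfidenceTier(e: VerdictEvidence): ConfidenceTier {
  // INSUFFICIENT: fewer than 1 source OR fewer than 1 fact
  if (e.sourceCount < 1 || e.factCount < 1) return "INSUFFICIENT";
  // HIGH: 3+ sources AND 5+ facts AND reasoning longer than 100 chars
  if (e.sourceCount >= 3 && e.factCount >= 5 && e.reasoningLength > 100) return "HIGH";
  // MEDIUM: 2+ sources AND 3+ facts AND reasoning longer than 50 chars
  if (e.sourceCount >= 2 && e.factCount >= 3 && e.reasoningLength > 50) return "MEDIUM";
  // LOW: at least 1 source and 1 fact, but below the MEDIUM thresholds
  return "LOW";
}
```

Under this reading, a verdict with many sources but very short reasoning falls through to LOW, because the HIGH and MEDIUM rows each require all three conditions.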


=== 4.3 Implementation ===

**File**: ##apps/web/src/lib/analyzer/orchestrated.ts##
**Function**: ##validateVerdictGate4()##
**Phase**: VERDICT GENERATION (Phase 3)

**Validation Process**:

1. Count the sources supporting the verdict
1. Count the facts extracted from those sources
1. Measure the reasoning length
1. Assign a confidence tier
1. Apply context scoping for counter-evidence
1. Flag verdicts below the threshold


=== 4.4 Context Scoping ===

Counter-evidence is scoped to relevant analysis contexts:

// Only count criticism facts that are:
// 1. In the same context as the verdict, OR
// 2. Not scoped to any specific context (general criticism)
const contradictingFactCount = facts.filter(f =>
  !verdict.supportingEvidenceIds.includes(f.id) &&
  f.category === "criticism" &&
  (!f.contextId || f.contextId === verdict.contextId)
).length;

This prevents criticism of one analysis context from penalizing claims about a different analysis context.


=== 4.5 Central Claim Exception ===

**Central claims remain publishable even if confidence is low**, because they are core to the thesis and users need to see the verdict regardless of evidence sufficiency.

**Rationale**: Users submitted the input to understand the truth of central claims. Hiding low-confidence verdicts would be misleading.
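
One way to express the exception is as a small decision function over the tier and a centrality flag. This is a sketch under assumptions: the ##PublishAction## values and the name ##decidePublication## are hypothetical, and treating an INSUFFICIENT central claim as "publish with warning" is an interpretation of "regardless of evidence sufficiency", not confirmed behavior.

```typescript
type ConfidenceTier = "HIGH" | "MEDIUM" | "LOW" | "INSUFFICIENT";

// Hypothetical action labels, chosen to mirror the Pass / Warn / Fail gate states.
type PublishAction = "publish" | "publish-with-warning" | "exclude";

function decidePublication(tier: ConfidenceTier, isCentral: boolean): PublishAction {
  // HIGH and MEDIUM verdicts pass the gate outright.
  if (tier === "HIGH" || tier === "MEDIUM") return "publish";
  // LOW verdicts proceed with a warning flag; central claims are never hidden
  // (assumption: this also covers INSUFFICIENT central claims).
  if (tier === "LOW" || isCentral) return "publish-with-warning";
  // Non-central verdicts without evidence are excluded or marked as such.
  return "exclude";
}
```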


=== 4.6 Configuration ===

**UCM Pipeline Config**:

{
  "gate4Enabled": true,           // Enable/disable Gate 4
  "gate4MinSources": 2,           // Minimum sources for MEDIUM confidence
  "gate4MinFacts": 3,             // Minimum facts for MEDIUM confidence
  "gate4MinReasoningLength": 50   // Minimum reasoning length for MEDIUM
}

=== 4.7 Examples ===

**Example 1: HIGH Confidence**

Verdict: "MOSTLY-TRUE" (85%)
Sources: 4 (Reuters, AP, BBC, Government site)
Facts: 12
Reasoning: 150 chars
Result: HIGH confidence tier
Action: Publish with full confidence

**Example 2: LOW Confidence (Central Claim)**

Verdict: "UNVERIFIED" (50%)
Sources: 1 (Blog post)
Facts: 2
Reasoning: 80 chars
Claim: Central
Result: LOW confidence tier
Action: Publish with warning (central claim exception)

**Example 3: INSUFFICIENT (Non-Central)**

Verdict: "UNVERIFIED" (50%)
Sources: 0
Facts: 0
Reasoning: 30 chars
Claim: Non-central
Result: INSUFFICIENT
Action: Exclude from report or mark as "No evidence found"

----

== 5. Confidence Impact on Verdict Calculation ==


=== 5.1 Truth Percentage Modulation ===

Confidence modulates the final truth percentage within each verdict band:

**Via ##truthFromBand()## function**:

function truthFromBand(band: "strong" | "partial" | "uncertain" | "refuted", confidence: number): number {
  const conf = normalizePercentage(confidence) / 100;
  switch (band) {
    case "strong":    return Math.round(72 + 28 * conf);   // 72-100%
    case "partial":   return Math.round(50 + 35 * conf);   // 50-85%
    case "uncertain": return Math.round(35 + 30 * conf);   // 35-65%
    case "refuted":   return Math.round(28 * (1 - conf));  // 0-28%
  }
}

=== 5.2 Example Impact ===

**"strong" band with varying confidence**:

* **High confidence** (90%): 72 + 28 × 0.9 = 97.2, rounded to 97% -> **TRUE**
* **Medium confidence** (60%): 72 + 28 × 0.6 = 88.8, rounded to 89% -> **TRUE**
* **Low confidence** (30%): 72 + 28 × 0.3 = 80.4, rounded to 80% -> **MOSTLY-TRUE**

The evidence band is the same in all three cases, but lower confidence pulls the verdict down within the band.
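
The arithmetic can be reproduced with a self-contained copy of the function. ##normalizePercentage()## is not defined in this document, so the version here, which clamps its input to the 0-100 range, is an assumption.

```typescript
type Band = "strong" | "partial" | "uncertain" | "refuted";

// Assumption: normalizePercentage clamps its input to the 0-100 range.
function normalizePercentage(value: number): number {
  return Math.min(100, Math.max(0, value));
}

function truthFromBand(band: Band, confidence: number): number {
  const conf = normalizePercentage(confidence) / 100;
  switch (band) {
    case "strong":    return Math.round(72 + 28 * conf);
    case "partial":   return Math.round(50 + 35 * conf);
    case "uncertain": return Math.round(35 + 30 * conf);
    case "refuted":   return Math.round(28 * (1 - conf));
  }
}

// Reproduce the three "strong" band cases listed above.
for (const confidence of [90, 60, 30]) {
  console.log(`strong @ ${confidence}% -> ${truthFromBand("strong", confidence)}%`);
}
// strong @ 90% -> 97%
// strong @ 60% -> 89%
// strong @ 30% -> 80%
```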


=== 5.3 MIXED vs UNVERIFIED Distinction ===

Confidence determines whether a truth percentage in the 43-57% range is labeled **MIXED** or **UNVERIFIED**:

// Confidence threshold to distinguish MIXED from UNVERIFIED
const MIXED_CONFIDENCE_THRESHOLD = 60;

if (truthPercentage >= 43 && truthPercentage <= 57) {
  return confidence >= MIXED_CONFIDENCE_THRESHOLD ? "MIXED" : "UNVERIFIED";
}

* **MIXED** (confidence >= 60%): Evidence on both sides, high confidence in mixed state
* **UNVERIFIED** (confidence < 60%): Insufficient evidence, low confidence

----

== 6. Gate Statistics and Reporting ==


=== 6.1 Gate Stats in Result JSON ===

Every analysis result includes gate statistics:

{
  "qualityGates": {
    "gate1Stats": {
      "totalClaims": 15,
      "validClaims": 12,
      "excludedClaims": 3,
      "exclusionReasons": [
        { "claimId": "C3", "reason": "Opinion without factual basis" },
        { "claimId": "C7", "reason": "Prediction about future events" },
        { "claimId": "C11", "reason": "Low specificity, non-central" }
      ]
    },
    "gate4Stats": {
      "totalVerdicts": 12,
      "highConfidence": 8,
      "mediumConfidence": 3,
      "lowConfidence": 1,
      "insufficient": 0
    }
  }
}
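
When inspecting a result, it can help to check that the counters agree with each other. The sketch below assumes three invariants that hold in the sample above but are not stated as guarantees anywhere in this document: valid plus excluded claims equals the total, there is one exclusion reason per excluded claim, and the four tier counts sum to the verdict total. The function name ##checkGateStats## is hypothetical.

```typescript
// Same shape as the QualityGates interface in section 2.3.
interface QualityGates {
  gate1Stats: {
    totalClaims: number;
    validClaims: number;
    excludedClaims: number;
    exclusionReasons: { claimId: string; reason: string }[];
  };
  gate4Stats: {
    totalVerdicts: number;
    highConfidence: number;
    mediumConfidence: number;
    lowConfidence: number;
    insufficient: number;
  };
}

// Returns a list of human-readable problems; an empty list means the stats agree.
function checkGateStats(gates: QualityGates): string[] {
  const problems: string[] = [];
  const g1 = gates.gate1Stats;
  if (g1.validClaims + g1.excludedClaims !== g1.totalClaims) {
    problems.push("gate1Stats: validClaims + excludedClaims != totalClaims");
  }
  if (g1.exclusionReasons.length !== g1.excludedClaims) {
    problems.push("gate1Stats: exclusionReasons length != excludedClaims");
  }
  const g4 = gates.gate4Stats;
  const tierSum = g4.highConfidence + g4.mediumConfidence + g4.lowConfidence + g4.insufficient;
  if (tierSum !== g4.totalVerdicts) {
    problems.push("gate4Stats: tier counts do not sum to totalVerdicts");
  }
  return problems;
}
```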

=== 6.2 UI Display (Current Status) ===

**Current**: Gate stats are included in the result JSON, but the UI does not yet display them with per-item reasons.

**Planned**: UI enhancements to show:

* Excluded claims with reasons in the report
* Confidence tier badges on verdicts
* Warning indicators for low-confidence verdicts

----

== 7. Proposed Gates (Not Yet Implemented) ==


=== 7.1 Gate 2: Source Quality (Proposed) ===

**Purpose**: Filter low-quality sources before evidence extraction

**Criteria**:

* Source reliability score > threshold
* Domain not in blocklist
* Content length > minimum

**Status**: Proposed but not implemented (the Source Reliability system exists but is not integrated as a gate)

=== 7.2 Gate 3: Evidence Relevance (Proposed) ===

**Purpose**: Filter tangential or low-probative-value evidence

**Criteria**:

* Thesis relevance score > threshold
* Recency appropriate for claim
* Geographic/jurisdictional match

**Status**: Proposed but not implemented (evidence filtering exists but is not formalized as a gate)

----

== 8. Debugging and Diagnostics ==


=== 8.1 Checking Gate Stats ===

**In Result JSON**:

const result = await analyzeJob(jobId);
console.log('Gate 1 excluded:', result.qualityGates.gate1Stats.excludedClaims);
console.log('Gate 4 confidence:', result.qualityGates.gate4Stats);

**In Report Markdown**:

Search for the "Quality Gates" section (planned feature)

=== 8.2 Common Issues ===

**Issue 1: Too many claims excluded by Gate 1**

* **Symptom**: Most claims marked as excluded
* **Cause**: Input is primarily opinion/editorial
* **Solution**: Confirm that Gate 1 is working as intended; the input may not be fact-checkable

**Issue 2: All verdicts marked LOW confidence**

* **Symptom**: ##gate4Stats## shows all verdicts in the LOW tier
* **Cause**: Search returns few sources, or the sources have little relevant content
* **Solution**: Check search provider credentials, improve search queries, or adjust Gate 4 thresholds

**Issue 3: Central claims excluded by Gate 1**

* **Symptom**: Core thesis claims not appearing in results
* **Cause**: ##gate1KeepCentralClaims## is set to ##false## in the config
* **Solution**: Set ##gate1KeepCentralClaims## to ##true## in the UCM Pipeline config

----

== 9. Configuration Reference ==


=== 9.1 UCM Pipeline Config ===

{
  // Gate 1: Claim Validation
  "gate1Enabled": true,
  "gate1KeepCentralClaims": true,

  // Gate 4: Verdict Confidence Assessment
  "gate4Enabled": true,
  "gate4MinSources": 2,
  "gate4MinFacts": 3,
  "gate4MinReasoningLength": 50,

  // Confidence thresholds
  "mixedConfidenceThreshold": 60 // MIXED vs UNVERIFIED distinction
}

=== 9.2 Default Values ===

|= Setting |= Default |= Rationale
| ##gate1Enabled## | ##true## | Quality control essential
| ##gate1KeepCentralClaims## | ##true## | Users need to see core thesis verdicts
| ##gate4Enabled## | ##true## | Prevent low-quality verdicts
| ##gate4MinSources## | ##2## | Balance between quality and coverage
| ##gate4MinFacts## | ##3## | Minimum for reasonable confidence
| ##gate4MinReasoningLength## | ##50## | Ensure non-trivial reasoning
| ##mixedConfidenceThreshold## | ##60## | Clear distinction between mixed/unverified
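
A typed view of these defaults makes it easier to see what a partial override changes. The sketch below is illustrative only: the ##QualityGateConfig## interface and ##resolveQualityGateConfig()## are hypothetical names, and how UCM actually merges overrides is not described in this document.

```typescript
// Hypothetical typed view of the gate-related settings listed in the table.
interface QualityGateConfig {
  gate1Enabled: boolean;
  gate1KeepCentralClaims: boolean;
  gate4Enabled: boolean;
  gate4MinSources: number;
  gate4MinFacts: number;
  gate4MinReasoningLength: number;
  mixedConfidenceThreshold: number;
}

// Default values copied from the table above.
const DEFAULT_QUALITY_GATE_CONFIG: QualityGateConfig = {
  gate1Enabled: true,
  gate1KeepCentralClaims: true,
  gate4Enabled: true,
  gate4MinSources: 2,
  gate4MinFacts: 3,
  gate4MinReasoningLength: 50,
  mixedConfidenceThreshold: 60,
};

// Overrides replace individual defaults; unspecified settings keep their default.
function resolveQualityGateConfig(
  overrides: Partial<QualityGateConfig> = {},
): QualityGateConfig {
  return { ...DEFAULT_QUALITY_GATE_CONFIG, ...overrides };
}

// Example: require three sources for MEDIUM and leave everything else unchanged.
const stricter = resolveQualityGateConfig({ gate4MinSources: 3 });
console.log(stricter.gate4MinSources, stricter.gate4MinFacts);
```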

----

== 10. Testing Quality Gates ==


=== 10.1 Unit Tests ===

**File**: ##apps/web/src/lib/analyzer/__tests__/quality-gates.test.ts##

**Coverage**:

* Gate 1 exclusion scenarios (opinion, prediction, low-specificity)
* Gate 1 central claim exception
* Gate 4 confidence tier assignment
* Gate 4 context scoping
* Gate 4 central claim exception

=== 10.2 Integration Tests ===

**Test Analysis Inputs**:

1. **High-quality factual article** -> Expect: Most claims pass Gate 1, high Gate 4 confidence
1. **Opinion editorial** -> Expect: Most claims excluded by Gate 1
1. **Low-source analysis** -> Expect: Low Gate 4 confidence tiers

=== 10.3 Manual Testing ===

**Steps**:

1. Run an analysis on a test input
1. Check the result JSON for the ##qualityGates## object
1. Verify that exclusion reasons match the expected criteria
1. Verify that confidence tiers match the source/fact counts

----

== 11. Related Documentation ==

* Calculations.md (in local docs) - Verdict calculation methodology, confidence modulation
* [[Architecture Overview>>FactHarbor.Specification.Implementation.Architecture Overview.WebHome]] - Architecture overview, pipeline flow
* [[TriplePath Architecture>>FactHarbor.Specification.Implementation.Pipeline Architecture.TriplePath Architecture.WebHome]] - Pipeline variants and quality gate enforcement
* Evidence_Quality_Filtering.md (in local docs) - Evidence filtering (related to proposed Gate 3)

----

== 12. Conclusion ==

Quality Gates ensure that FactHarbor maintains high standards for claim evaluation and verdict confidence. The current implementation (Gate 1 and Gate 4) provides:

1. **Input quality control** (Gate 1) - Only verifiable claims analyzed
1. **Output quality control** (Gate 4) - Verdicts backed by sufficient evidence
1. **Transparency** - Gate stats included in every result
1. **Configurability** - Thresholds adjustable via UCM

**Key Takeaways**: