Last modified by Robert Schaub on 2025/12/24 20:35

From version 1.1
edited by Robert Schaub
on 2025/12/21 11:25
Change comment: Imported from XAR
To version 1.7
edited by Robert Schaub
on 2025/12/24 20:35
Change comment: Renamed back-links.

Page properties
Parent
... ... @@ -1,1 +1,1 @@
1 -Test.FactHarbor.Roadmap.WebHome
1 +FactHarbor.Archive.FactHarbor delta for V0\.9\.70.Roadmap.WebHome
Content
... ... @@ -4,7 +4,7 @@
4 4  
5 5  **Success Metric:** <5% hallucination rate, all 4 quality gates operational
6 6  
7 ----
7 +----
8 8  
9 9  == 1. Overview ==
10 10  
... ... @@ -13,12 +13,13 @@
13 13  **Key Innovation:** Complete quality validation pipeline catches all categories of errors
14 14  
15 15  **What We're Proving:**
16 +
16 16  * All 4 quality gates work together effectively
17 17  * Evidence deduplication prevents artificial inflation
18 18  * System maintains quality at larger scale
19 19  * Quality metrics dashboard provides actionable insights
20 20  
21 ----
22 +----
22 22  
23 23  == 2. New Requirements ==
24 24  
... ... @@ -31,11 +31,13 @@
31 31  **Purpose:** Ensure AI-linked evidence actually relates to the claim
32 32  
33 33  **Validation Checks:**
35 +
34 34  1. **Semantic Similarity:** Cosine similarity between claim and evidence embeddings ≥ 0.6
35 35  2. **Entity Overlap:** At least 1 shared named entity between claim and evidence
36 36  3. **Topic Relevance:** Evidence discusses the claim's subject matter (score ≥ 0.5)
37 37  
38 38  **Action if Failed:**
41 +
39 39  * Discard irrelevant evidence (don't count it)
40 40  * If <2 relevant evidence items remain → "Insufficient Evidence" verdict
41 41  * Log discarded evidence for quality review
... ... @@ -42,7 +42,7 @@
42 42  
43 43  **Target:** 0% of evidence cited is off-topic
44 44  
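The Gate 2 checks and failure actions above can be sketched as follows. This is a minimal illustration, not the AKEL implementation: the `Evidence` class, the `gate2` function, and the rule that all three checks must pass together are assumptions, and the embeddings, entity sets, and topic scores are taken as precomputed inputs. Only the thresholds (0.6, 0.5, at least 1 shared entity, fewer than 2 items remaining) come from the specification.

```python
import math
from dataclasses import dataclass

SIMILARITY_MIN = 0.6   # cosine similarity threshold from the spec
TOPIC_MIN = 0.5        # topic relevance threshold from the spec
MIN_RELEVANT = 2       # fewer relevant items than this -> "Insufficient Evidence"


@dataclass
class Evidence:  # hypothetical container, not part of the spec
    text: str
    embedding: list[float]
    entities: set[str]
    topic_score: float


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def gate2(claim_embedding, claim_entities, evidence):
    """Split evidence into kept and discarded; return a verdict if too little remains."""
    kept, discarded = [], []
    for ev in evidence:
        relevant = (
            cosine(claim_embedding, ev.embedding) >= SIMILARITY_MIN
            and len(claim_entities & ev.entities) >= 1
            and ev.topic_score >= TOPIC_MIN
        )  # assumption: all three checks must pass
        (kept if relevant else discarded).append(ev)
    verdict = "Insufficient Evidence" if len(kept) < MIN_RELEVANT else None
    return kept, discarded, verdict  # caller would log `discarded` for quality review
```

Returning the discarded items instead of dropping them silently keeps the "log discarded evidence for quality review" step in the caller's hands.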
45 ----
48 +----
46 46  
47 47  ==== Gate 3: Scenario Coherence Check ====
48 48  
... ... @@ -49,6 +49,7 @@
49 49  **Purpose:** Validate scenarios are logical, complete, and meaningfully different
50 50  
51 51  **Validation Checks:**
55 +
52 52  1. **Completeness:** All required fields populated (assumptions, scope, evidence context)
53 53  2. **Internal Consistency:** Assumptions don't contradict each other (contradiction score <0.3)
54 54  3. **Distinctiveness:** Scenarios are meaningfully different (similarity <0.8)
... ... @@ -55,6 +55,7 @@
55 55  4. **Minimum Detail:** At least 1 specific assumption per scenario
56 56  
57 57  **Action if Failed:**
62 +
58 58  * Merge duplicate scenarios
59 59  * Flag contradictory assumptions for review
60 60  * Reduce confidence score by 20%
... ... @@ -62,7 +62,7 @@
62 62  
63 63  **Target:** 0% duplicate scenarios, all scenarios internally consistent
64 64  
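A rough sketch of the Gate 3 checks, under stated assumptions. The `Scenario` class, the Jaccard similarity over assumption sets, and the reading of "reduce confidence score by 20%" as a multiplicative penalty are all illustrative choices that the specification does not fix. The contradiction scorer is passed in as a hypothetical callable because the spec does not name a model for it. "Merging" is simplified here to keeping the first scenario of each duplicate group.

```python
from dataclasses import dataclass
from typing import Callable

CONTRADICTION_MAX = 0.3    # spec: contradiction score must stay below this
SIMILARITY_MAX = 0.8       # spec: scenario similarity must stay below this
CONFIDENCE_PENALTY = 0.20  # spec: reduce confidence by 20% on failure


@dataclass
class Scenario:  # hypothetical shape; field names follow the spec's required fields
    assumptions: list[str]
    scope: str
    evidence_context: str


def is_complete(s: Scenario) -> bool:
    """Completeness and minimum detail: every field set, at least 1 assumption."""
    return bool(s.assumptions and s.scope and s.evidence_context)


def is_consistent(s: Scenario, contradiction_score: Callable[[list[str]], float]) -> bool:
    """Internal consistency, using a caller-supplied (hypothetical) scorer."""
    return contradiction_score(s.assumptions) < CONTRADICTION_MAX


def jaccard(a: Scenario, b: Scenario) -> float:
    """Stand-in similarity: overlap of assumption sets (an assumption, not the spec)."""
    x, y = set(a.assumptions), set(b.assumptions)
    return len(x & y) / len(x | y) if x | y else 1.0


def merge_duplicates(scenarios: list[Scenario], similarity=jaccard) -> list[Scenario]:
    """Keep a scenario only if it is distinct from every scenario already kept."""
    unique: list[Scenario] = []
    for s in scenarios:
        if all(similarity(s, u) < SIMILARITY_MAX for u in unique):
            unique.append(s)
    return unique


def penalized_confidence(confidence: float, gate_failed: bool) -> float:
    """Apply the 20% reduction multiplicatively (one possible reading of the spec)."""
    return confidence * (1 - CONFIDENCE_PENALTY) if gate_failed else confidence
```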
65 ----
70 +----
66 66  
67 67  === 2.2 FR54: Evidence Deduplication (NEW) ===
68 68  
... ... @@ -72,6 +72,7 @@
72 72  **Purpose:** Prevent counting the same evidence multiple times when cited by different sources
73 73  
74 74  **Problem:**
80 +
75 75  * Wire services (AP, Reuters) redistribute same content
76 76  * Different sites cite the same original study
77 77  * Aggregators copy primary sources
... ... @@ -78,6 +78,7 @@
78 78  * AKEL might count this as "5 sources" when it's really 1
79 79  
80 80  **Solution: Content Fingerprinting**
87 +
81 81  * Generate SHA-256 hash of normalized text
82 82  * Detect near-duplicates (≥85% similarity) using fuzzy matching
83 83  * Track which sources cited each unique piece of evidence
... ... @@ -85,7 +85,7 @@
85 85  
86 86  **Target:** Duplicate detection >95% accurate, evidence counts reflect reality
87 87  
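The content-fingerprinting approach above could look roughly like this. It is a sketch under assumptions: the normalization rules (lowercasing, stripping punctuation, collapsing whitespace) and the use of `difflib.SequenceMatcher` as the fuzzy matcher are illustrative choices, and `deduplicate` is a hypothetical helper name. The SHA-256 hash and the 85% threshold come from the specification.

```python
import hashlib
import re
from difflib import SequenceMatcher

NEAR_DUPLICATE_MIN = 0.85  # spec: >=85% similarity counts as a near-duplicate


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace (assumed normalization)."""
    no_punct = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", no_punct).strip()


def fingerprint(text: str) -> str:
    """SHA-256 hex digest of the normalized text."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def deduplicate(items: list[tuple[str, str]]) -> list[dict]:
    """Group (source, text) pairs into unique evidence entries with provenance."""
    unique: list[dict] = []
    for source, text in items:
        norm, fp = normalize(text), fingerprint(text)
        for entry in unique:
            exact = entry["fingerprint"] == fp
            near = SequenceMatcher(None, entry["normalized"], norm).ratio() >= NEAR_DUPLICATE_MIN
            if exact or near:
                entry["sources"].append(source)  # same evidence, another citing source
                break
        else:
            unique.append({"fingerprint": fp, "normalized": norm, "sources": [source]})
    return unique
```

With this shape, the evidence count is `len(unique)` rather than the number of citing sources, and each entry's `sources` list preserves provenance. Pairwise fuzzy comparison is quadratic in the number of items, so a larger corpus would likely need an index such as MinHash, which is outside this sketch.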
88 ----
95 +----
89 89  
90 90  === 2.3 NFR13: Quality Metrics Dashboard (Internal) ===
91 91  
... ... @@ -93,6 +93,7 @@
93 93  **Fulfills:** Real-time quality monitoring during development
94 94  
95 95  **Dashboard Metrics:**
103 +
96 96  * Claim processing statistics
97 97  * Gate performance (pass/fail rates for each gate)
98 98  * Evidence quality metrics
... ... @@ -101,11 +101,12 @@
101 101  
102 102  **Target:** Dashboard functional, all metrics tracked, exportable
103 103  
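As one possible shape for the "gate performance" metric and the "exportable" target, the sketch below counts pass/fail results per gate and exports them as JSON. The `GateMetrics` class, its method names, and the JSON layout are hypothetical; the specification only states that pass/fail rates are tracked per gate.

```python
import json
from collections import defaultdict


class GateMetrics:  # hypothetical collector, not part of the spec
    def __init__(self) -> None:
        self._counts = defaultdict(lambda: {"pass": 0, "fail": 0})

    def record(self, gate: str, passed: bool) -> None:
        """Count one gate evaluation."""
        self._counts[gate]["pass" if passed else "fail"] += 1

    def pass_rates(self) -> dict[str, float]:
        """Fraction of evaluations that passed, per gate."""
        return {
            gate: c["pass"] / (c["pass"] + c["fail"])
            for gate, c in self._counts.items()
        }

    def export_json(self) -> str:
        """Serialize raw counts and pass rates for export."""
        payload = {"counts": dict(self._counts), "pass_rates": self.pass_rates()}
        return json.dumps(payload, sort_keys=True)
```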
104 ----
112 +----
105 105  
106 106  == 3. Success Criteria ==
107 107  
108 108  **✅ Quality:**
117 +
109 109  * Hallucination rate <5% (target: <3%)
110 110  * Average quality rating ≥8.0/10
111 111  * 0 critical failures (publishable falsehoods)
... ... @@ -112,6 +112,7 @@
112 112  * Gates correctly identify >95% of low-quality outputs
113 113  
114 114  **✅ All 4 Gates Operational:**
124 +
115 115  * Gate 1: Claim validation working
116 116  * Gate 2: Evidence relevance filtering working
117 117  * Gate 3: Scenario coherence checking working
... ... @@ -118,16 +118,18 @@
118 118  * Gate 4: Verdict confidence assessment working
119 119  
120 120  **✅ Evidence Deduplication:**
131 +
121 121  * Duplicate detection >95% accurate
122 122  * Evidence counts reflect reality
123 123  * Provenance tracked correctly
124 124  
125 125  **✅ Metrics Dashboard:**
137 +
126 126  * All metrics implemented and tracking
127 127  * Dashboard functional and useful
128 128  * Alerts trigger appropriately
129 129  
130 ----
142 +----
131 131  
132 132  == 4. Architecture Notes ==
133 133  
... ... @@ -142,6 +142,7 @@
142 142  {{/code}}
143 143  
144 144  **Key Additions from POC1:**
157 +
145 145  * Scenario generation component
146 146  * Evidence deduplication system
147 147  * Gates 2 & 3 implementation
... ... @@ -148,23 +148,23 @@
148 148  * Quality metrics collection
149 149  
150 150  **Still Simplified vs. Full System:**
164 +
151 151  * Single AKEL orchestration (not multi-component pipeline)
152 152  * No review queue
153 153  * No federation architecture
154 154  
155 -**See:** [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]] for details
169 +**See:** [[Architecture>>FactHarbor.Archive.FactHarbor delta for V0\.9\.70.Specification.Architecture.WebHome]] for details
156 156  
157 ----
171 +----
158 158  
159 159  == Related Pages ==
160 160  
161 -* [[POC1>>Test.FactHarbor.Roadmap.POC1.WebHome]] - Previous phase
162 -* [[Beta 0>>Test.FactHarbor.Roadmap.Beta0.WebHome]] - Next phase
163 -* [[Roadmap Overview>>Test.FactHarbor.Roadmap.WebHome]]
164 -* [[Architecture>>Test.FactHarbor.Specification.Architecture.WebHome]]
175 +* [[POC1>>FactHarbor.Archive.FactHarbor delta for V0\.9\.70.Roadmap.POC1.WebHome]] - Previous phase
176 +* [[Beta 0>>Archive.FactHarbor delta for V0\.9\.70.Roadmap.Beta0.WebHome]] - Next phase
177 +* [[Roadmap Overview>>FactHarbor.Archive.FactHarbor delta for V0\.9\.70.Roadmap.WebHome]]
178 +* [[Architecture>>FactHarbor.Archive.FactHarbor delta for V0\.9\.70.Specification.Architecture.WebHome]]
165 165  
166 ----
180 +----
167 167  
168 168  **Document Status:** ✅ POC2 Specification Complete - Waiting for POC1 Completion
169 169  **Version:** V0.9.70
170 -