= Requirements =

**This page defines Roles, Content States, Rules, and System Requirements for FactHarbor.**

**Core Philosophy:** Invest in system improvement, not manual data correction. When AI makes errors, improve the algorithm and re-process automatically.

== Navigation ==

* **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need from FactHarbor (drives these requirements)
* **This page** - How we fulfill those needs through system design

(% class="box infomessage" %)
(((
**How to read this page:**

1. **User Needs drive Requirements**: See [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] for what users need
2. **Requirements define implementation**: This page shows how we fulfill those needs
3. **Functional Requirements (FR)**: Specific features and capabilities
4. **Non-Functional Requirements (NFR)**: Quality attributes (performance, security, etc.)

Each requirement references which User Needs it fulfills.
)))

== 1. Roles ==

**Fulfills**: UN-12 (Submit claims), UN-13 (Cite verdicts), UN-14 (API access)

FactHarbor uses three simple roles plus a reputation system.

=== 1.1 Reader ===

**Who**: Anyone (no login required)

**Can**:
* Browse and search claims
* View scenarios, evidence, verdicts, and confidence scores
* Flag issues or errors
* Use filters, search, and visualization tools
* Submit claims (non-duplicate claims are added automatically)

**Cannot**:

* Modify content
* Access edit history details

**User Needs served**: UN-1 (Trust assessment), UN-2 (Claim verification), UN-3 (Article summary with FactHarbor analysis summary), UN-4 (Social media fact-checking), UN-5 (Source tracing), UN-7 (Evidence transparency), UN-8 (Understanding disagreement), UN-12 (Submit claims), UN-17 (In-article highlighting)

=== 1.2 Contributor ===

**Who**: Registered users (earn reputation through contributions)

**Can**:
* Everything a Reader can do
* Edit claims, evidence, and scenarios
* Add sources and citations
* Suggest improvements to AI-generated content
* Participate in discussions
* Earn reputation points for quality contributions

**Reputation System**:

* New contributors: Limited edit privileges
* Established contributors (established reputation): Full edit access
* Trusted contributors (substantial reputation): Can approve certain changes
* Reputation earned through: Accepted edits, helpful flags, quality contributions
* Reputation lost through: Reverted edits, invalid flags, abuse

**Cannot**:

* Delete or hide content (moderators only)
* Override moderation decisions

**User Needs served**: UN-13 (Cite and contribute)

=== 1.3 Moderator ===

**Who**: Trusted community members with a proven track record, appointed by the governance board

**Can**:
* Review flagged content
* Hide harmful or abusive content
* Resolve disputes between contributors
* Issue warnings or temporary bans
* Make final decisions on content disputes
* Access full audit logs

**Cannot**:

* Change governance rules
* Permanently ban users without board approval
* Override technical quality gates

**Note**: Small team (3-5 moderators initially), supported by automated moderation tools.

=== 1.4 Domain Trusted Contributors (Optional, Task-Specific) ===

**Who**: Subject matter specialists invited for specific high-stakes disputes

**Not a permanent role**: Contacted externally when needed for contested claims in their domain

**When used**:
* Medical claims with life/safety implications
* Legal interpretations with significant impact
* Scientific claims with high controversy
* Technical claims requiring specialized knowledge

**Process**:

* Moderator identifies the need for expert input
* Expert is contacted externally (they are not required to be registered users)
* Domain trusted contributor provides a written opinion with sources
* Opinion is added to the claim record
* Domain trusted contributor is acknowledged in the claim

**User Needs served**: UN-16 (Expert validation status)

== 2. Content States ==

**Fulfills**: UN-1 (Trust indicators), UN-16 (Review status transparency)

FactHarbor uses two content states. The focus is on transparency and confidence scoring, not gatekeeping.

=== 2.1 Published ===

**Status**: Visible to all users

**Includes**:
* AI-generated analyses (default state)
* User-contributed content
* Edited/improved content

**Quality Indicators** (displayed with content):

* **Confidence Score**: 0-100% (AI's confidence in the analysis)
* **Source Quality Score**: 0-100% (based on source track record)
* **Controversy Flag**: Set when dispute/edit activity is high
* **Completeness Score**: % of expected fields filled
* **Last Updated**: Date of most recent change
* **Edit Count**: Number of revisions
* **Review Status**: AI-generated / Human-reviewed / Expert-validated

**Automatic Warnings**:

* Confidence < 60%: "Low confidence - use caution"
* Source quality < 40%: "Sources may be unreliable"
* High controversy: "Disputed - multiple interpretations exist"
* Medical/Legal/Safety domain: "Seek professional advice"

**User Needs served**: UN-1 (Trust score), UN-9 (Methodology transparency), UN-15 (Evolution timeline), UN-16 (Review status)

=== 2.2 Hidden ===

**Status**: Not visible to regular users (only to moderators)

**Reasons**:
* Spam or advertising
* Personal attacks or harassment
* Illegal content
* Privacy violations
* Deliberate misinformation (verified)
* Abuse or harmful content

**Process**:

* Automated detection flags content for moderator review
* Moderator confirms and hides it
* Original author is notified with the reason
* Author can appeal to the board if they dispute the moderator's decision

**Note**: Content is hidden, not deleted (preserving the audit trail)

== 3. Contribution Rules ==

=== 3.1 All Contributors Must ===

* Provide sources for factual claims
* Use clear, neutral language in FactHarbor's own summaries
* Respect others and maintain civil discourse
* Accept community feedback constructively
* Focus on improving quality, not protecting ego

=== 3.2 AKEL (AI System) ===

**AKEL is the primary system**. Human contributions supplement and train AKEL.

**AKEL Must**:

* Mark all outputs as AI-generated
* Display confidence scores prominently
* Provide source citations
* Flag uncertainty clearly
* Identify contradictions in evidence
* Learn from human corrections

**When AKEL Makes Errors**:

1. Capture the error pattern (what, why, how common)
2. Improve the system (better prompt, model, validation)
3. Re-process affected claims automatically
4. Measure improvement (did quality increase?)

**Human Role**: Train AKEL through corrections, not replace it

=== 3.3 Contributors Should ===

* Improve clarity and structure
* Add missing sources
* Flag errors for system improvement
* Suggest better ways to present information
* Participate in quality discussions

=== 3.4 Moderators Must ===

* Be impartial
* Document moderation decisions
* Respond to appeals promptly
* Use automated tools to scale efforts
* Focus on abuse/harm, not routine quality control

== 4. Quality Standards ==

**Fulfills**: UN-5 (Source reliability), UN-6 (Publisher track records), UN-7 (Evidence transparency), UN-9 (Methodology transparency)

=== 4.1 Source Requirements ===

**Track Record Over Credentials**:
* Sources are evaluated by their historical accuracy
* Correction policy matters
* Independence from conflicts of interest
* Methodology transparency

**Source Quality Database**:

* Automated tracking of source accuracy
* Correction frequency
* Reliability score (updated continuously)
* Users can see each source's track record

**No automatic trust** for government, academia, or media - all are evaluated by track record.

**User Needs served**: UN-5 (Source provenance), UN-6 (Publisher reliability)

=== 4.2 Claim Requirements ===

* Clear subject and assertion
* Verifiable with available information
* Sourced (or explicitly marked as needing sources)
* Neutral language in FactHarbor summaries
* Appropriate context provided

**User Needs served**: UN-2 (Claim extraction and verification)

=== 4.3 Evidence Requirements ===

* Publicly accessible (or explain why not)
* Properly cited with attribution
* Relevant to the claim being evaluated
* Original source preferred over secondary

**User Needs served**: UN-7 (Evidence transparency)

=== 4.4 Confidence Scoring ===

**Automated confidence calculation based on**:
* Source quality scores
* Evidence consistency
* Contradiction detection
* Completeness of analysis
* Historical accuracy of similar claims

**Thresholds**:

* < 40%: Too low to publish (needs improvement)
* 40-60%: Published with "Low confidence" warning
* 60-80%: Published as standard
* 80-100%: Published as "High confidence"

**User Needs served**: UN-1 (Trust assessment), UN-9 (Methodology transparency)
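The sketch below shows one way the thresholds above could be applied when deciding how to publish a verdict. It is a minimal illustration; the function and label names are assumptions, not part of the specification.

{{code language="python"}}
def publication_status(confidence: float) -> str:
    """Map an automated confidence score (0-100) to a publication decision.

    Illustrative only; the thresholds mirror Section 4.4, the labels are placeholders.
    """
    if confidence < 40:
        return "withheld"                   # too low to publish, queued for improvement
    if confidence < 60:
        return "published-low-confidence"   # shown with a "Low confidence" warning
    if confidence < 80:
        return "published"                  # standard publication
    return "published-high-confidence"      # labeled "High confidence"
{{/code}}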
== 5. Automated Risk Scoring ==

**Fulfills**: UN-10 (Manipulation detection), UN-16 (Appropriate review level)

**Replace manual risk tiers with continuous automated scoring**.

=== 5.1 Risk Score Calculation ===

**Factors** (weighted algorithm):

* **Domain sensitivity**: Medical, legal, and safety claims are automatically flagged higher
* **Potential impact**: Views, citations, spread
* **Controversy level**: Flags, disputes, edit wars
* **Uncertainty**: Low confidence, contradictory evidence
* **Source reliability**: Track record of the sources used

**Score**: 0-100 (higher = more risk)

=== 5.2 Automated Actions ===

* **Score > 80**: Flag for moderator review before publication
* **Score 60-80**: Publish with prominent warnings
* **Score 40-60**: Publish with standard warnings
* **Score < 40**: Publish normally

**Continuous monitoring**: Risk score is recalculated as new information emerges

**User Needs served**: UN-10 (Detect manipulation tactics), UN-16 (Review status)
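A minimal sketch of how such a weighted, continuous risk score could feed the automated actions above. The weights and the 0-100 factor normalization are illustrative assumptions, not specified values.

{{code language="python"}}
# Factor names follow Section 5.1; the weights are placeholders.
RISK_WEIGHTS = {
    "domain_sensitivity": 0.30,
    "potential_impact": 0.20,
    "controversy_level": 0.20,
    "uncertainty": 0.15,
    "source_unreliability": 0.15,
}

def risk_score(factors: dict[str, float]) -> float:
    """Combine factor values (each normalized to 0-100) into a weighted 0-100 risk score."""
    return sum(RISK_WEIGHTS[name] * factors.get(name, 0.0) for name in RISK_WEIGHTS)

def automated_action(score: float) -> str:
    """Map a risk score to the actions listed in Section 5.2."""
    if score > 80:
        return "flag-for-moderator-review"
    if score > 60:
        return "publish-with-prominent-warnings"
    if score > 40:
        return "publish-with-standard-warnings"
    return "publish-normally"
{{/code}}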
== 6. System Improvement Process ==

**Core principle**: Fix the system, not just the data.

=== 6.1 Error Capture ===

**When users flag errors or make corrections**:

1. What was wrong? (categorize)
2. What should it have been?
3. Why did the system fail? (root cause)
4. How common is this pattern?
5. Store it in the ErrorPattern table (improvement queue)
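A minimal sketch of what an entry in the ErrorPattern improvement queue could look like. The field names are assumptions for illustration, not a schema definition.

{{code language="python"}}
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ErrorPattern:
    """One captured error pattern in the improvement queue (illustrative fields)."""
    category: str             # what was wrong (e.g. "irrelevant evidence linked")
    expected: str             # what the output should have been
    root_cause: str           # why the system failed (prompt, model, validation, data)
    occurrence_count: int     # how common the pattern is
    example_claim_ids: list[str] = field(default_factory=list)
    first_seen: datetime = field(default_factory=datetime.utcnow)
{{/code}}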
=== 6.2 Weekly Improvement Cycle ===

1. **Review**: Analyze top error patterns
2. **Develop**: Create a fix (prompt, model, validation)
3. **Test**: Validate the fix on sample claims
4. **Deploy**: Roll out if quality improves
5. **Re-process**: Automatically update affected claims
6. **Monitor**: Track quality metrics

=== 6.3 Quality Metrics Dashboard ===

**Track continuously**:

* Error rate by category
* Source quality distribution
* Confidence score trends
* User flag rate (issues found)
* Correction acceptance rate
* Re-work rate
* Claims processed per hour

**Goal**: 10% monthly improvement in error rate

== 7. Automated Quality Monitoring ==

**Replace manual audit sampling with automated monitoring**.

=== 7.1 Continuous Metrics ===

* **Source quality**: Track record database
* **Consistency**: Contradiction detection
* **Clarity**: Readability scores
* **Completeness**: Field validation
* **Accuracy**: User corrections tracked

=== 7.2 Anomaly Detection ===

**Automated alerts for**:

* Sudden quality drops
* Unusual patterns
* Contradiction clusters
* Source reliability changes
* User behavior anomalies

=== 7.3 Targeted Review ===

* Review only flagged items
* Random sampling for calibration (not quotas)
* Learn from corrections to improve automation

== 8. Functional Requirements ==

This section defines specific features that fulfill user needs.

=== 8.1 Claim Intake & Normalization ===

==== FR1 — Claim Intake ====

**Fulfills**: UN-2 (Claim extraction), UN-4 (Quick fact-checking), UN-12 (Submit claims)

* Users submit claims via a simple form or the API
* Claims can be text, a URL, or an image
* Duplicate detection (semantic similarity, see the sketch below)
* Auto-categorization by domain
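A minimal sketch of embedding-based duplicate detection for submitted claims, assuming sentence embeddings are produced elsewhere in the pipeline. The 0.9 similarity threshold is an illustrative assumption.

{{code language="python"}}
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_duplicate(new_claim_vec: np.ndarray,
                 existing_vecs: list[np.ndarray],
                 threshold: float = 0.9) -> bool:
    """Treat a submitted claim as a duplicate if its embedding is close to an existing claim's.

    The vectors are assumed to come from whatever embedding model the pipeline uses;
    the threshold is a placeholder, not a specified value.
    """
    return any(cosine_similarity(new_claim_vec, v) >= threshold for v in existing_vecs)
{{/code}}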
==== FR2 — Claim Normalization ====

**Fulfills**: UN-2 (Claim verification)

* Standardize to a clear assertion format
* Extract key entities (who, what, when, where)
* Identify claim type (factual, predictive, evaluative)
* Link to existing similar claims

==== FR3 — Claim Classification ====

**Fulfills**: UN-11 (Filtered research)

* Domain: Politics, Science, Health, etc.
* Type: Historical fact, current statistic, prediction, etc.
* Risk score: Automated calculation
* Complexity: Simple, moderate, complex

=== 8.2 Scenario System ===

==== FR4 — Scenario Generation ====

**Fulfills**: UN-2 (Context-dependent verification), UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)

**Automated scenario creation**:

* AKEL analyzes the claim and generates likely scenarios (use cases and contexts)
* Each scenario includes: assumptions, definitions, boundaries, evidence context
* Users can flag incorrect scenarios
* The system learns from corrections

**Key Concept**: Scenarios represent different interpretations or contexts (e.g., "Clinical trials with healthy adults" vs. "Real-world data with diverse populations")

==== FR5 — Evidence Linking ====

**Fulfills**: UN-5 (Source tracing), UN-7 (Evidence transparency)

* Automated evidence discovery from sources
* Relevance scoring
* Contradiction detection
* Source quality assessment

==== FR6 — Scenario Comparison ====

**Fulfills**: UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)

* Side-by-side comparison interface
* Highlight key differences between scenarios
* Show evidence supporting each scenario
* Display confidence scores per scenario

=== 8.3 Verdicts & Analysis ===

==== FR7 — Automated Verdicts ====

**Fulfills**: UN-1 (Trust score), UN-2 (Verification verdicts), UN-3 (Article summary with FactHarbor analysis summary), UN-13 (Cite verdicts)

* AKEL generates a verdict based on the evidence within each scenario
* **Likelihood range** displayed (e.g., "0.70-0.85 (likely true)") - NOT binary true/false
* **Uncertainty factors** explicitly listed (e.g., "Small sample sizes", "Long-term effects unknown")
* Confidence score displayed prominently
* Source quality indicators shown
* Contradictions noted
* Uncertainty acknowledged

**Key Innovation**: Detailed probabilistic verdicts with explicit uncertainty, not binary judgments (see the sketch below)
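A minimal sketch of how a per-scenario probabilistic verdict could be represented. The structure and field names are illustrative assumptions, not a normative data model.

{{code language="python"}}
from dataclasses import dataclass

@dataclass
class ScenarioVerdict:
    """Illustrative shape of a probabilistic verdict for one scenario."""
    scenario_id: str
    likelihood_low: float            # e.g. 0.70
    likelihood_high: float           # e.g. 0.85
    label: str                       # e.g. "likely true"
    confidence: float                # 0-100, displayed prominently
    uncertainty_factors: list[str]   # e.g. ["Small sample sizes", "Long-term effects unknown"]
    contradictions: list[str]        # contradictions noted in the evidence

    def display_range(self) -> str:
        return f"{self.likelihood_low:.2f}-{self.likelihood_high:.2f} ({self.label})"
{{/code}}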
==== FR8 — Time Evolution ====

**Fulfills**: UN-15 (Verdict evolution timeline)

* Claims and verdicts update as new evidence emerges
* Version history maintained for all verdicts
* Changes highlighted
* Confidence score trends visible
* Users can see "as of date X, what did we know?"

=== 8.4 User Interface & Presentation ===

==== FR12 — Two-Panel Summary View (Article Summary with FactHarbor Analysis Summary) ====

**Fulfills**: UN-3 (Article Summary with FactHarbor Analysis Summary)

**Purpose**: Provide a side-by-side comparison of what a document claims vs. FactHarbor's complete analysis of its credibility

**Left Panel: Article Summary**:

* Document title, source, and claimed credibility
* "The Big Picture" - main thesis or position change
* "Key Findings" - structured summary of the document's main claims
* "Reasoning" - the document's explanation for its positions
* "Conclusion" - the document's bottom line

**Right Panel: FactHarbor Analysis Summary**:

* FactHarbor's independent source credibility assessment
* Claim-by-claim verdicts with confidence scores
* Methodology assessment (strengths, limitations)
* Overall verdict on document quality
* Analysis ID for reference

**Design Principles**:

* No scrolling required - both panels visible simultaneously
* Visual distinction between "what they say" and "FactHarbor's analysis"
* Color coding for verdicts (supported, uncertain, refuted)
* Confidence percentages clearly visible
* Mobile responsive (panels stack vertically on small screens)

**Implementation Notes**:

* Generated automatically by AKEL for every analyzed document
* Updates when the verdict evolves (maintains version history)
* Exportable as a standalone summary report
* Shareable via permanent URL

==== FR13 — In-Article Claim Highlighting ====

**Fulfills**: UN-17 (In-article claim highlighting)

**Purpose**: Enable readers to quickly assess claim credibility while reading by visually highlighting factual claims with color-coded indicators

==== Visual Example: Article with Highlighted Claims ====

(% class="box" %)
(((
**Article: "New Study Shows Benefits of Mediterranean Diet"**

A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet.

(% class="box successmessage" style="margin:10px 0;" %)
(((
🟢 **Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups**

(% style="font-size:0.9em; color:#666;" %)
↑ WELL SUPPORTED • 87% confidence
[[Click for evidence details →]]
(%%)
)))

The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers.

(% class="box warningmessage" style="margin:10px 0;" %)
(((
🟡 **Some experts believe this diet can completely prevent heart attacks**

(% style="font-size:0.9em; color:#666;" %)
↑ UNCERTAIN • 45% confidence
Overstated - evidence shows risk reduction, not prevention
[[Click for details →]]
(%%)
)))

Dr. Maria Rodriguez, lead researcher, recommends incorporating more olive oil, fish, and vegetables into daily meals.

(% class="box errormessage" style="margin:10px 0;" %)
(((
🔴 **The study proves that saturated fats cause heart disease**

(% style="font-size:0.9em; color:#666;" %)
↑ REFUTED • 15% confidence
Claim not supported by study design; correlation ≠ causation
[[Click for counter-evidence →]]
(%%)
)))

Participants also reported feeling more energetic and experiencing better sleep quality, though these were secondary measures.
)))

**Legend:**

* 🟢 = Well-supported claim (confidence ≥75%)
* 🟡 = Uncertain claim (confidence 40-74%)
* 🔴 = Refuted/unsupported claim (confidence <40%)
* Plain text = Non-factual content (context, opinions, recommendations)

==== Tooltip on Hover/Click ====

(% class="box infomessage" %)
(((
**FactHarbor Analysis**

**Claim:**
"Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease"

**Verdict:** WELL SUPPORTED
**Confidence:** 87%

**Evidence Summary:**

* Meta-analysis of 12 RCTs confirms 23-28% risk reduction
* Consistent findings across multiple populations
* Published in peer-reviewed journal (high credibility)

**Uncertainty Factors:**

* Exact percentage varies by study (20-30% range)

[[View Full Analysis →]]
)))

**Color-Coding System**:

* **Green**: Well-supported claims (confidence ≥75%, strong evidence)
* **Yellow/Orange**: Uncertain claims (confidence 40-74%, conflicting or limited evidence)
* **Red**: Refuted or unsupported claims (confidence <40%, contradicted by evidence)
* **Gray/Neutral**: Non-factual content (opinions, questions, procedural text)

==== Interactive Highlighting Example (Detailed View) ====

(% style="width:100%; border-collapse:collapse;" %)
|=**Article Text**|=**Status**|=**Analysis**
|(((A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Context - no highlighting
|(((//Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups//)))|(% style="background-color:#D4EDDA; text-align:center; padding:8px;" %)🟢 **WELL SUPPORTED**|(((
**87% confidence**
Meta-analysis of 12 RCTs confirms 23-28% risk reduction
[[View Full Analysis]]
)))
|(((The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Methodology - no highlighting
|(((//Some experts believe this diet can completely prevent heart attacks//)))|(% style="background-color:#FFF3CD; text-align:center; padding:8px;" %)🟡 **UNCERTAIN**|(((
**45% confidence**
Overstated - evidence shows risk reduction, not prevention
[[View Details]]
)))
|(((Dr. Rodriguez recommends incorporating more olive oil, fish, and vegetables into daily meals.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Recommendation - no highlighting
|(((//The study proves that saturated fats cause heart disease//)))|(% style="background-color:#F8D7DA; text-align:center; padding:8px;" %)🔴 **REFUTED**|(((
**15% confidence**
Claim not supported by study; correlation ≠ causation
[[View Counter-Evidence]]
)))

**Design Notes:**
* Highlighted claims use italics to distinguish them from plain text
* Color backgrounds match XWiki message box colors (success/warning/error)
* Status column shows the verdict prominently
* Analysis column provides a quick summary with a link to details

**User Actions**:

* **Hover** over highlighted claim → Tooltip appears
* **Click** highlighted claim → Detailed analysis modal/panel
* **Toggle** button to turn highlighting on/off
* **Keyboard**: Tab through highlighted claims

**Interaction Design**:

* Hover/click on a highlighted claim → Show tooltip with:
** Claim text
** Verdict (e.g., "WELL SUPPORTED")
** Confidence score (e.g., "85%")
** Brief evidence summary
** Link to detailed analysis
* Toggle highlighting on/off (user preference)
* Adjustable color intensity for accessibility

**Technical Requirements**:

* Real-time highlighting as the page loads (non-blocking)
* Claim boundary detection (start/end of assertion)
* Handle nested or overlapping claims
* Preserve original article formatting
* Work with various content formats (HTML, plain text, PDFs)

**Performance Requirements**:

* Highlighting renders within 500ms of page load
* No perceptible delay in the reading experience
* Efficient DOM manipulation (avoid reflows)

**Accessibility**:

* Color-blind friendly palette (use patterns/icons in addition to color)
* Screen reader compatible (ARIA labels for claim credibility)
* Keyboard navigation to highlighted claims

**Implementation Notes** (see the sketch below):

* Claims extracted and analyzed by AKEL during initial processing
* Highlighting data stored as annotations with byte offsets
* Client-side rendering of highlights based on verdict data
* Mobile responsive (tap instead of hover)
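A minimal sketch of one possible annotation record behind the highlighting described above. The field and CSS class names are illustrative assumptions, not a defined interface.

{{code language="python"}}
from dataclasses import dataclass

@dataclass
class ClaimHighlight:
    """Illustrative annotation for one highlighted claim in an article."""
    claim_id: str
    start_offset: int     # byte offset where the claim starts in the source document
    end_offset: int       # byte offset where the claim ends
    verdict: str          # e.g. "WELL SUPPORTED" | "UNCERTAIN" | "REFUTED"
    confidence: float     # 0-100

    def css_class(self) -> str:
        """Map confidence to the color bands above (green ≥75, yellow 40-74, red <40)."""
        if self.confidence >= 75:
            return "fh-highlight-supported"
        if self.confidence >= 40:
            return "fh-highlight-uncertain"
        return "fh-highlight-refuted"
{{/code}}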
=== 8.5 Workflow & Moderation ===

==== FR9 — Publication Workflow ====

**Fulfills**: UN-1 (Fast access to verified content), UN-16 (Clear review status)

**Simple flow**:

1. Claim submitted
2. AKEL processes it (automated)
3. If confidence > threshold: Publish (labeled as AI-generated)
4. If confidence < threshold: Flag for improvement
5. If risk score > threshold: Flag for moderator

**No multi-stage approval process**

==== FR10 — Moderation ====

**Focus on abuse, not routine quality**:

* Automated abuse detection
* Moderators handle flags
* Quick response to harmful content
* Minimal involvement in routine content

==== FR11 — Audit Trail ====

**Fulfills**: UN-14 (API access to histories), UN-15 (Evolution tracking)

* All edits logged
* Version history public
* Moderation decisions documented
* System improvements tracked

== 9. Non-Functional Requirements ==

=== 9.1 NFR1 — Performance ===

**Fulfills**: UN-4 (Fast fact-checking), UN-11 (Responsive filtering)

* Claim processing: < 30 seconds
* Search response: < 2 seconds
* Page load: < 3 seconds
* 99% uptime

=== 9.2 NFR2 — Scalability ===

**Fulfills**: UN-14 (API access at scale)

* Handle 10,000 claims initially
* Scale to 1M+ claims
* Support 100K+ concurrent users
* Automated processing scales linearly

=== 9.3 NFR3 — Transparency ===

**Fulfills**: UN-7 (Evidence transparency), UN-9 (Methodology transparency), UN-13 (Citable verdicts), UN-15 (Evolution visibility)

* All algorithms open source
* All data exportable
* All decisions documented
* Quality metrics public

=== 9.4 NFR4 — Security & Privacy ===

* Follow the [[Privacy Policy>>FactHarbor.Organisation.How-We-Work-Together.Privacy-Policy]]
* Secure authentication
* Data encryption
* Regular security audits

=== 9.5 NFR5 — Maintainability ===

* Modular architecture
* Automated testing
* Continuous integration
* Comprehensive documentation

=== NFR11: AKEL Quality Assurance Framework ===

**Fulfills:** AI safety, IFCN methodology transparency

**Specification:** Multi-layer AI quality gates to detect hallucinations, low-confidence results, and logical inconsistencies.

==== Quality Gate 1: Claim Extraction Validation ====

**Purpose:** Ensure extracted claims are factual assertions (not opinions/predictions)

**Checks:**
1. **Factual Statement Test:** Is this verifiable? (Yes/No)
2. **Opinion Detection:** Contains hedging language? ("I think", "probably", "best")
3. **Future Prediction Test:** Makes claims about future events?
4. **Specificity Score:** Contains specific entities, numbers, dates?

**Thresholds:**

* Factual: Must be "Yes"
* Opinion markers: <2 hedging phrases
* Specificity: ≥3 specific elements

**Action if Failed:** Flag as "Non-verifiable", do NOT generate verdict
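A minimal sketch of how Gate 1 could apply these thresholds, assuming upstream checks have already annotated the claim; the input structure and phrase list are illustrative assumptions.

{{code language="python"}}
HEDGING_PHRASES = ("i think", "probably", "best")   # opinion markers from the checks above

def passes_gate_1(claim: dict) -> bool:
    """Apply the Gate 1 thresholds to a pre-analyzed claim.

    `claim` is assumed to look like:
    {"is_verifiable": True, "is_future_prediction": False, "text": "...", "specific_elements": 4}
    """
    if not claim.get("is_verifiable", False):         # factual statement test must be "Yes"
        return False
    if claim.get("is_future_prediction", False):      # future predictions are not fact-checked
        return False
    hedges = sum(phrase in claim["text"].lower() for phrase in HEDGING_PHRASES)
    if hedges >= 2:                                   # threshold: fewer than 2 hedging phrases
        return False
    return claim.get("specific_elements", 0) >= 3     # threshold: at least 3 specific elements
{{/code}}

A claim that fails the gate is flagged as "Non-verifiable" and never reaches verdict generation.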
==== Quality Gate 2: Evidence Relevance Validation ====

**Purpose:** Ensure AI-linked evidence actually relates to the claim

**Checks:**

1. **Semantic Similarity Score:** Evidence vs. claim (embeddings)
2. **Entity Overlap:** Shared people/places/things?
3. **Topic Relevance:** Discusses the claim subject?

**Thresholds:**

* Similarity: ≥0.6 (cosine similarity)
* Entity overlap: ≥1 shared entity
* Topic relevance: ≥0.5

**Action if Failed:** Discard irrelevant evidence

==== Quality Gate 3: Scenario Coherence Check ====

**Purpose:** Validate that scenario assumptions are logical and complete

**Checks:**

1. **Completeness:** All required fields populated
2. **Internal Consistency:** Assumptions don't contradict
3. **Distinguishability:** Scenarios meaningfully different

**Thresholds:**

* Required fields: 100%
* Contradiction score: <0.3
* Scenario similarity: <0.8

**Action if Failed:** Merge duplicates, reduce confidence by 20%

==== Quality Gate 4: Verdict Confidence Assessment ====

**Purpose:** Only publish high-confidence verdicts

**Checks:**
1. **Evidence Count:** Minimum 2 sources
2. **Source Quality:** Average reliability ≥0.6
3. **Evidence Agreement:** Supporting vs. contradicting ≥0.6
4. **Uncertainty Factors:** Hedging in reasoning

**Confidence Tiers:**

* **HIGH (80-100%):** ≥3 sources, ≥0.7 quality, ≥80% agreement
* **MEDIUM (50-79%):** ≥2 sources, ≥0.6 quality, ≥60% agreement
* **LOW (0-49%):** <2 sources OR low quality/agreement
* **INSUFFICIENT:** <2 sources → DO NOT PUBLISH

**Implementation Phases:**

* **POC1:** Gates 1 & 4 only (basic validation)
* **POC2:** All 4 gates (complete framework)
* **V1.0:** Hardened with <5% hallucination rate

**Acceptance Criteria:**

* ✅ All gates operational
* ✅ Hallucination rate <5%
* ✅ Quality metrics public

=== NFR12: Security Controls ===

**Fulfills:** Production readiness, legal compliance

**Requirements:**
1. **Input Validation:** SQL injection, XSS, and CSRF prevention
2. **Rate Limiting:** 5 analyses per minute per IP
3. **Authentication:** Secure sessions, API key rotation
4. **Data Protection:** HTTPS, encryption, backups
5. **Security Audit:** Penetration testing, GDPR compliance

**Milestone:** Beta 0 (essential), V1.0 (complete) **BLOCKER**
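A minimal in-memory sketch of the rate limit in requirement 2 above. In practice this would more likely be enforced at an API gateway or in a shared store; the class and parameter names are illustrative.

{{code language="python"}}
import time
from collections import defaultdict, deque

class AnalysisRateLimiter:
    """Sliding-window limiter for the "5 analyses per minute per IP" rule (sketch only)."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        now = time.monotonic()
        hits = self._hits[ip]
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()                    # drop requests that left the 60-second window
        if len(hits) >= self.max_requests:
            return False                      # over the limit: reject this analysis request
        hits.append(now)
        return True
{{/code}}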
=== NFR13: Quality Metrics Transparency ===

**Fulfills:** IFCN transparency, user trust

**Public Metrics:**

* Quality gates performance
* Evidence quality stats
* Hallucination rate
* User feedback

**Milestone:** POC2 (internal), Beta 0 (public), V1.0 (real-time)

== 10. Requirements Priority Matrix ==
This table shows all functional and non-functional requirements ordered by urgency and priority.

**Note:** Implementation phases (POC1, POC2, Beta 0, V1.0) are defined in [[POC Requirements>>FactHarbor.Specification.POC.Requirements]] and [[Implementation Roadmap>>FactHarbor.Implementation-Roadmap.WebHome]], not in this priority matrix.

**Priority Levels:**

* **CRITICAL** - System doesn't work without it, or major safety/legal risk
* **HIGH** - Core functionality, essential for success
* **MEDIUM** - Important but not blocking
* **LOW** - Nice to have, can be deferred

**Urgency Levels:**

* **HIGH** - Immediate need (critical for proof of concept)
* **MEDIUM** - Important but not immediate
* **LOW** - Future enhancement
|= ID |= Title |= Priority |= Urgency |= Reason (for HIGH priority/urgency)
| **HIGH URGENCY** ||||
| **FR1** | Claim Intake | CRITICAL | HIGH | System entry point - cannot process claims without it
| **FR5** | Evidence Collection | CRITICAL | HIGH | Core fact-checking functionality - no evidence = no verdict
| **FR7** | Verdict Computation | CRITICAL | HIGH | The output users see - core value proposition
| **NFR11** | Quality Assurance Framework | CRITICAL | HIGH | Prevents AI hallucinations - FactHarbor's key differentiator
| **FR2** | Claim Normalization | HIGH | HIGH | Standardizes AI input for reliable processing
| **FR3** | Claim Classification | HIGH | HIGH | Identifies factual vs non-factual claims - essential quality gate
| **FR4** | Scenario Generation | HIGH | HIGH | Handles ambiguous claims - key methodology innovation
| **FR6** | Evidence Evaluation | HIGH | HIGH | Source quality directly impacts verdict credibility
| **MEDIUM URGENCY** ||||
| **NFR12** | Security Controls | CRITICAL | MEDIUM | —
| **FR9** | Corrections | HIGH | MEDIUM | IFCN requirement - mandatory for credibility
| **FR44** | ClaimReview Schema | HIGH | MEDIUM | Search engine visibility - MUST for V1.0 discovery
| **FR45** | Corrections Notification | HIGH | MEDIUM | IFCN compliance - required for corrections transparency
| **FR48** | Safety Framework | HIGH | MEDIUM | Prevents harm to contributors - legal and ethical requirement
| **NFR3** | Transparency | HIGH | MEDIUM | Core principle - essential for trust and credibility
| **NFR13** | Quality Metrics | HIGH | MEDIUM | Monitoring and transparency - IFCN compliance
| **FR8** | User Contribution | MEDIUM | MEDIUM | —
| **FR10** | Publishing | MEDIUM | MEDIUM | —
| **FR13** | API | MEDIUM | MEDIUM | —
| **FR46** | Image Verification | MEDIUM | MEDIUM | —
| **FR47** | Archive.org Integration | MEDIUM | MEDIUM | —
| **FR54** | Evidence Deduplication | MEDIUM | MEDIUM | —
| **NFR1** | Performance | MEDIUM | MEDIUM | —
| **NFR2** | Scalability | MEDIUM | MEDIUM | —
| **NFR4** | Security & Privacy | MEDIUM | MEDIUM | —
| **NFR5** | Maintainability | MEDIUM | MEDIUM | —
| **LOW URGENCY** ||||
| **FR11** | Social Sharing | LOW | LOW | —
| **FR12** | Notifications | LOW | LOW | —
| **FR49** | A/B Testing | LOW | LOW | —
| **FR50** | OSINT Toolkit Integration | LOW | LOW | —
| **FR51** | Video Verification System | LOW | LOW | —
| **FR52** | Interactive Detection Training | LOW | LOW | —
| **FR53** | Cross-Organizational Sharing | LOW | LOW | —
**Total:** 32 requirements (24 Functional, 8 Non-Functional)

**Notes:**

* Reason column: Only populated for HIGH priority or HIGH urgency items
* MEDIUM and LOW priority items use "—" (no specific reason needed)

**See also:**

* [[POC Requirements>>FactHarbor.Specification.POC.Requirements]] - POC1 scope and simplifications
* [[Implementation Roadmap>>FactHarbor.Implementation-Roadmap.WebHome]] - Phase-by-phase implementation plan
* [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] - Foundation that drives these requirements
=== 10.1 User Needs Priority ===

User Needs (UN) are the foundation that drives the functional and non-functional requirements. They are not independently prioritized; instead, their priority is inherited from the FR/NFR requirements they drive.

|= ID |= Title |= Drives Requirements
| **UN-1** | Trust Assessment at a Glance | Multiple FR/NFR
| **UN-2** | Claim Extraction and Verification | FR1-7
| **UN-3** | Article Summary with FactHarbor Analysis Summary | FR4
| **UN-4** | Social Media Fact-Checking | FR1, FR4
| **UN-5** | Source Provenance and Track Records | FR6
| **UN-6** | Publisher Reliability History | FR6
| **UN-7** | Evidence Transparency | NFR3
| **UN-8** | Understanding Disagreement and Consensus | FR4
| **UN-9** | Methodology Transparency | NFR3, NFR11
| **UN-10** | Manipulation Tactics Detection | FR48
| **UN-11** | Filtered Research | FR3
| **UN-12** | Submit Unchecked Claims | FR8
| **UN-13** | Cite FactHarbor Verdicts | FR10
| **UN-14** | API Access for Integration | FR13
| **UN-15** | Verdict Evolution Timeline | FR7
| **UN-16** | AI vs. Human Review Status | FR9
| **UN-17** | In-Article Claim Highlighting | FR1
| **UN-26** | Search Engine Visibility | FR44
| **UN-27** | Visual Claim Verification | FR46
| **UN-28** | Safe Contribution Environment | FR48

**Total:** 20 User Needs

**Note:** Each User Need inherits its priority from the requirements it drives. For example, UN-2 (Claim Extraction and Verification) drives FR1-7, which are CRITICAL/HIGH priority; therefore UN-2 is also critical to the project.
== 11. MVP Scope ==

**Phase 1: Read-Only MVP**

Build:

* Automated claim analysis
* Confidence scoring
* Source evaluation
* Browse/search interface
* User flagging system

**Goal**: Prove AI quality before adding user editing

**User Needs fulfilled in Phase 1**: UN-1, UN-2, UN-3, UN-4, UN-5, UN-6, UN-7, UN-8, UN-9, UN-12

**Phase 2: User Contributions**

Add only if needed:

* Simple editing (Wikipedia-style)
* Reputation system
* Basic moderation
* In-article claim highlighting (FR13)

**Additional User Needs fulfilled**: UN-13, UN-17

**Phase 3: Refinement**

* Continuous quality improvement
* Feature additions based on real usage
* Scale infrastructure

**Additional User Needs fulfilled**: UN-14 (API access), UN-15 (Full evolution tracking)

**Deferred**:

* Federation (until multiple successful instances exist)
* Complex contribution workflows (focus on automation)
* Extensive role hierarchy (keep it simple)

== 12. Success Metrics ==

**System Quality** (track weekly):
* Error rate by category (target: -10%/month)
* Average confidence score (target: increase)
* Source quality distribution (target: more high-quality)
* Contradiction detection rate (target: increase)

**Efficiency** (track monthly):

* Claims processed per hour (target: increase)
* Human hours per claim (target: decrease)
* Automation coverage (target: >90%)
* Re-work rate (target: <5%)

**User Satisfaction** (track quarterly):

* User flag rate (issues found)
* Correction acceptance rate (flags valid)
* Return user rate
* Trust indicators (surveys)

**User Needs Metrics** (track quarterly):

* UN-1: % users who understand trust scores
* UN-4: Time to verify a social media claim (target: <30s)
* UN-7: % users who access evidence details
* UN-8: % users who view multiple scenarios
* UN-15: % users who check the evolution timeline
* UN-17: % users who enable in-article highlighting; avg. time spent on highlighted vs. non-highlighted articles

== 13. Requirements Traceability ==

For the full traceability matrix showing which requirements fulfill which user needs, see:

* [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] - Section 8 includes comprehensive mapping tables

== 14. Related Pages ==

**Non-Functional Requirements (see Section 9):**
* [[NFR11 — AKEL Quality Assurance Framework>>#NFR11]]
* [[NFR12 — Security Controls>>#NFR12]]
* [[NFR13 — Quality Metrics Transparency>>#NFR13]]

**Other Requirements:**

* [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] - What users need (drives these requirements)
* [[Gap Analysis>>FactHarbor.Specification.Requirements.GapAnalysis]]
* [[Architecture>>FactHarbor.Specification.Architecture.WebHome]] - How requirements are implemented
* [[Data Model>>FactHarbor.Specification.Data Model.WebHome]] - Data structures supporting requirements
* [[Workflows>>FactHarbor.Specification.Workflows.WebHome]] - User interaction workflows
* [[AKEL>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] - AI system fulfilling automation requirements
* [[Global Rules>>FactHarbor.Organisation.How-We-Work-Together.GlobalRules.WebHome]]
* [[Privacy Policy>>FactHarbor.Organisation.How-We-Work-Together.Privacy-Policy]]

= V0.9.70 Additional Requirements =

== Functional Requirements (Additional) ==

=== FR44: ClaimReview Schema Implementation ===

Generate valid ClaimReview structured data for Google/Bing visibility.

**Schema.org Mapping:**

* 80-100% likelihood → 5 (Highly Supported)
* 60-79% → 4 (Supported)
* 40-59% → 3 (Mixed)
* 20-39% → 2 (Questionable)
* 0-19% → 1 (Refuted)

**Milestone:** V1.0
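A minimal sketch of how the mapping above could be turned into a schema.org ClaimReview object. The author name, URL parameter, and function names are placeholders for illustration.

{{code language="python"}}
def rating_from_likelihood(likelihood_pct: float) -> tuple[int, str]:
    """Apply the mapping above: likelihood percentage -> (ratingValue, alternateName)."""
    if likelihood_pct >= 80: return 5, "Highly Supported"
    if likelihood_pct >= 60: return 4, "Supported"
    if likelihood_pct >= 40: return 3, "Mixed"
    if likelihood_pct >= 20: return 2, "Questionable"
    return 1, "Refuted"

def claim_review_jsonld(claim_text: str, likelihood_pct: float, review_url: str, date: str) -> dict:
    """Build a minimal ClaimReview structure for embedding as JSON-LD."""
    value, name = rating_from_likelihood(likelihood_pct)
    return {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "url": review_url,
        "claimReviewed": claim_text,
        "datePublished": date,
        "author": {"@type": "Organization", "name": "FactHarbor"},
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": value,
            "bestRating": 5,
            "worstRating": 1,
            "alternateName": name,
        },
    }
{{/code}}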
=== FR45: User Corrections Notification System ===

Notify users when analyses are corrected.

**Mechanisms:**

1. In-page banner (30 days)
2. Public correction log
3. Email notifications (opt-in)
4. RSS/API feed

**Milestone:** Beta 0 (basic), V1.0 (complete) **BLOCKER**

=== FR46: Image Verification System ===

**Methods:**

1. Reverse image search
2. EXIF metadata analysis
3. Manipulation detection (basic)
4. Context verification

**Milestone:** Beta 0 (basic), V1.0 (extended)

=== FR47: Archive.org Integration ===

Auto-save evidence sources to the Wayback Machine.

**Milestone:** Beta 0

=== FR48: Safety Framework for Contributors ===

Protect contributors from harassment and legal threats.

**Milestone:** V1.1

=== FR49: A/B Testing Framework ===

Test AKEL approaches and UI designs systematically.

**Milestone:** V1.0

=== FR50-FR53: Future Enhancements (V2.0+) ===

* **FR50:** OSINT Toolkit Integration
* **FR51:** Video Verification System
* **FR52:** Interactive Detection Training
* **FR53:** Cross-Organizational Sharing

**Milestone:** V2.0+ (12-18 months post-launch)
=== FR54: Evidence Deduplication ===

**Fulfills:** Accurate evidence counting, quality metrics

**Purpose:** Avoid counting the same source multiple times when it appears in different forms.

**Specification:**

**Deduplication Logic:**

1. **URL Normalization:**
* Remove tracking parameters (?utm_source=...)
* Normalize http/https
* Normalize www/non-www
* Handle redirects

2. **Content Similarity:**
* If two sources have >90% text similarity → Same source
* If one is a subset of the other → Same source
* Use fuzzy matching for minor differences

3. **Cross-Domain Syndication:**
* Detect wire service content (AP, Reuters)
* Mark as a single source if syndicated
* Count the original publication only
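A minimal sketch of the URL normalization step, assuming redirect resolution happens elsewhere. The list of tracking-parameter prefixes is illustrative, not exhaustive.

{{code language="python"}}
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

TRACKING_PREFIXES = ("utm_", "fbclid", "gclid")   # illustrative, not an exhaustive list

def normalize_url(url: str) -> str:
    """Normalize a source URL for deduplication: strip tracking parameters,
    force https, and drop a leading "www."."""
    parts = urlparse(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if not k.lower().startswith(TRACKING_PREFIXES)]
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return urlunparse(("https", host, parts.path.rstrip("/"), parts.params,
                       urlencode(query), ""))
{{/code}}

With this normalization, ##https://www.example.org/story?utm_source=x## and ##http://example.org/story## map to the same key and are counted once.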
**Display:**

{{code}}
Evidence Sources (3 unique, 5 total):
1. Original Article (NYTimes)
   - Also appeared in: WashPost, Guardian (syndicated)
2. Research Paper (Nature)
3. Official Statement (WHO)
{{/code}}

**Acceptance Criteria:**

* ✅ Duplicate URLs recognized
* ✅ Syndicated content detected
* ✅ Evidence count shows "unique" vs "total"

**Milestone:** POC2, Beta 0
== Enhanced Existing Requirements ==

=== FR7: Automated Verdicts (Enhanced with Quality Gates) ===

**POC1+ Enhancement:** After AKEL generates a verdict, it passes through the quality gates:

{{code}}
Workflow:
1. Extract claims
   ↓
2. [GATE 1] Validate fact-checkable
   ↓
3. Generate scenarios
   ↓
4. Generate verdicts
   ↓
5. [GATE 4] Validate confidence
   ↓
6. Display to user
{{/code}}

**Updated Verdict States:**

* PUBLISHED
* INSUFFICIENT_EVIDENCE
* NON_FACTUAL_CLAIM
* PROCESSING
* ERROR

=== FR4: Analysis Summary (Enhanced with Quality Metadata) ===

**POC1+ Enhancement:** Display quality indicators:

{{code}}
Analysis Summary:
  Verifiable Claims: 3/5
  High Confidence Verdicts: 1
  Medium Confidence: 2
  Evidence Sources: 12
  Avg Source Quality: 0.73
  Quality Score: 8.5/10
{{/code}}
563