Wiki source code of Requirements

Version 4.1 by Robert Schaub on 2025/12/22 20:32

1 = Requirements =
2
3 **This page defines Roles, Content States, Rules, and System Requirements for FactHarbor.**
4
5 **Core Philosophy:** Invest in system improvement, not manual data correction. When AI makes errors, improve the algorithm and re-process automatically.
6
7 == Navigation ==
8
9 * **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need from FactHarbor (drives these requirements)
10 * **This page** - How we fulfill those needs through system design
11
12 (% class="box infomessage" %)
13 (((
14 **How to read this page:**
15
16 1. **User Needs drive Requirements**: See [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] for what users need
17 2. **Requirements define implementation**: This page shows how we fulfill those needs
18 3. **Functional Requirements (FR)**: Specific features and capabilities
19 4. **Non-Functional Requirements (NFR)**: Quality attributes (performance, security, etc.)
20
21 Each requirement references which User Needs it fulfills.
22 )))
23
24 == 1. Roles ==
25
26 **Fulfills**: UN-12 (Submit claims), UN-13 (Cite verdicts), UN-14 (API access)
27
28 FactHarbor uses three simple roles plus a reputation system; domain trusted contributors (see 1.4) are consulted ad hoc for specific disputes.
29
30 === 1.1 Reader ===
31
32 **Who**: Anyone (no login required)
33
34 **Can**:
35 * Browse and search claims
36 * View scenarios, evidence, verdicts, and confidence scores
37 * Flag issues or errors
38 * Use filters, search, and visualization tools
39 * Submit claims (new claims are added automatically if not duplicates)
40
41 **Cannot**:
42 * Modify content
43 * Access edit history details
44
45 **User Needs served**: UN-1 (Trust assessment), UN-2 (Claim verification), UN-3 (Article summary with FactHarbor analysis summary), UN-4 (Social media fact-checking), UN-5 (Source tracing), UN-7 (Evidence transparency), UN-8 (Understanding disagreement), UN-12 (Submit claims), UN-17 (In-article highlighting)
46
47 === 1.2 Contributor ===
48
49 **Who**: Registered users (earns reputation through contributions)
50
51 **Can**:
52 * Everything a Reader can do
53 * Edit claims, evidence, and scenarios
54 * Add sources and citations
55 * Suggest improvements to AI-generated content
56 * Participate in discussions
57 * Earn reputation points for quality contributions
58
59 **Reputation System**:
60 * New contributors: Limited edit privileges
61 * Established contributors: Full edit access
62 * Trusted contributors (substantial reputation): Can approve certain changes
63 * Reputation earned through: Accepted edits, helpful flags, quality contributions
64 * Reputation lost through: Reverted edits, invalid flags, abuse
65
66 **Cannot**:
67 * Delete or hide content (only moderators)
68 * Override moderation decisions
69
70 **User Needs served**: UN-13 (Cite and contribute)
71
72 === 1.3 Moderator ===
73
74 **Who**: Trusted community members with proven track record, appointed by governance board
75
76 **Can**:
77 * Review flagged content
78 * Hide harmful or abusive content
79 * Resolve disputes between contributors
80 * Issue warnings or temporary bans
81 * Make final decisions on content disputes
82 * Access full audit logs
83
84 **Cannot**:
85 * Change governance rules
86 * Permanently ban users without board approval
87 * Override technical quality gates
88
89 **Note**: Small team (3-5 initially), supported by automated moderation tools.
90
91 === 1.4 Domain Trusted Contributors (Optional, Task-Specific) ===
92
93 **Who**: Subject matter specialists invited for specific high-stakes disputes
94
95 **Not a permanent role**: Contacted externally when needed for contested claims in their domain
96
97 **When used**:
98 * Medical claims with life/safety implications
99 * Legal interpretations with significant impact
100 * Scientific claims with high controversy
101 * Technical claims requiring specialized knowledge
102
103 **Process**:
104 * Moderator identifies need for expert input
105 * Contact expert externally (don't require them to be users)
106 * Trusted Contributor provides written opinion with sources
107 * Opinion added to claim record
108 * Trusted Contributor acknowledged in claim
109
110 **User Needs served**: UN-16 (Expert validation status)
111
112 == 2. Content States ==
113
114 **Fulfills**: UN-1 (Trust indicators), UN-16 (Review status transparency)
115
116 FactHarbor uses two content states. Focus is on transparency and confidence scoring, not gatekeeping.
117
118 === 2.1 Published ===
119
120 **Status**: Visible to all users
121
122 **Includes**:
123 * AI-generated analyses (default state)
124 * User-contributed content
125 * Edited/improved content
126
127 **Quality Indicators** (displayed with content):
128 * **Confidence Score**: 0-100% (AI's confidence in analysis)
129 * **Source Quality Score**: 0-100% (based on source track record)
130 * **Controversy Flag**: If high dispute/edit activity
131 * **Completeness Score**: % of expected fields filled
132 * **Last Updated**: Date of most recent change
133 * **Edit Count**: Number of revisions
134 * **Review Status**: AI-generated / Human-reviewed / Expert-validated
135
136 **Automatic Warnings**:
137 * Confidence < 60%: "Low confidence - use caution"
138 * Source quality < 40%: "Sources may be unreliable"
139 * High controversy: "Disputed - multiple interpretations exist"
140 * Medical/Legal/Safety domain: "Seek professional advice"
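
The sketch below (Python) shows how these warnings could be derived from the displayed quality indicators; the function name and exact warning strings are illustrative, only the thresholds are taken from the list above.

{{code language="python"}}
def automatic_warnings(confidence: float, source_quality: float,
                       high_controversy: bool, domain: str) -> list[str]:
    """Illustrative mapping from quality indicators (0-100%) to warning banners."""
    warnings = []
    if confidence < 60:
        warnings.append("Low confidence - use caution")
    if source_quality < 40:
        warnings.append("Sources may be unreliable")
    if high_controversy:
        warnings.append("Disputed - multiple interpretations exist")
    if domain in {"medical", "legal", "safety"}:
        warnings.append("Seek professional advice")
    return warnings

# Example: a low-confidence medical claim gets two warnings.
print(automatic_warnings(55, 70, False, "medical"))
{{/code}}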
141
142 **User Needs served**: UN-1 (Trust score), UN-9 (Methodology transparency), UN-15 (Evolution timeline), UN-16 (Review status)
143
144 === 2.2 Hidden ===
145
146 **Status**: Not visible to regular users (only to moderators)
147
148 **Reasons**:
149 * Spam or advertising
150 * Personal attacks or harassment
151 * Illegal content
152 * Privacy violations
153 * Deliberate misinformation (verified)
154 * Abuse or harmful content
155
156 **Process**:
157 * Automated detection flags for moderator review
158 * Moderator confirms and hides
159 * Original author notified with reason
160 * Author can appeal to the board if they dispute the moderator's decision
161
162 **Note**: Content is hidden, not deleted (for audit trail)
163
164 == 3. Contribution Rules ==
165
166 === 3.1 All Contributors Must ===
167
168 * Provide sources for factual claims
169 * Use clear, neutral language in FactHarbor's own summaries
170 * Respect others and maintain civil discourse
171 * Accept community feedback constructively
172 * Focus on improving quality, not protecting ego
173
174 === 3.2 AKEL (AI System) ===
175
176 **AKEL is the primary system**. Human contributions supplement and train AKEL.
177
178 **AKEL Must**:
179 * Mark all outputs as AI-generated
180 * Display confidence scores prominently
181 * Provide source citations
182 * Flag uncertainty clearly
183 * Identify contradictions in evidence
184 * Learn from human corrections
185
186 **When AKEL Makes Errors**:
187 1. Capture the error pattern (what, why, how common)
188 2. Improve the system (better prompt, model, validation)
189 3. Re-process affected claims automatically
190 4. Measure improvement (did quality increase?)
191
192 **Human Role**: Train AKEL through corrections, not replace AKEL
193
194 === 3.3 Contributors Should ===
195
196 * Improve clarity and structure
197 * Add missing sources
198 * Flag errors for system improvement
199 * Suggest better ways to present information
200 * Participate in quality discussions
201
202 === 3.4 Moderators Must ===
203
204 * Be impartial
205 * Document moderation decisions
206 * Respond to appeals promptly
207 * Use automated tools to scale efforts
208 * Focus on abuse/harm, not routine quality control
209
210 == 4. Quality Standards ==
211
212 **Fulfills**: UN-5 (Source reliability), UN-6 (Publisher track records), UN-7 (Evidence transparency), UN-9 (Methodology transparency)
213
214 === 4.1 Source Requirements ===
215
216 **Track Record Over Credentials**:
217 * Sources evaluated by historical accuracy
218 * Correction policy matters
219 * Independence from conflicts of interest
220 * Methodology transparency
221
222 **Source Quality Database**:
223 * Automated tracking of source accuracy
224 * Correction frequency
225 * Reliability score (updated continuously)
226 * Users can see source track record
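
A minimal sketch of a per-source track record; the running-accuracy update rule and field names are illustrative assumptions, not the final scoring model.

{{code language="python"}}
from dataclasses import dataclass

@dataclass
class SourceRecord:
    name: str
    checked: int = 0      # claims from this source that were evaluated
    accurate: int = 0     # of those, how many held up
    corrections: int = 0  # published corrections observed

    def record(self, was_accurate: bool, issued_correction: bool = False) -> None:
        """Update the track record whenever a claim from this source is evaluated."""
        self.checked += 1
        self.accurate += int(was_accurate)
        self.corrections += int(issued_correction)

    @property
    def reliability(self) -> float:
        """0-1 reliability score shown to users, updated continuously."""
        return self.accurate / self.checked if self.checked else 0.5

src = SourceRecord("example-news.org")
src.record(True)
src.record(False, issued_correction=True)
print(round(src.reliability, 2))  # 0.5
{{/code}}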
227
228 **No automatic trust** for government, academia, or media - all evaluated by track record.
229
230 **User Needs served**: UN-5 (Source provenance), UN-6 (Publisher reliability)
231
232 === 4.2 Claim Requirements ===
233
234 * Clear subject and assertion
235 * Verifiable with available information
236 * Sourced (or explicitly marked as needing sources)
237 * Neutral language in FactHarbor summaries
238 * Appropriate context provided
239
240 **User Needs served**: UN-2 (Claim extraction and verification)
241
242 === 4.3 Evidence Requirements ===
243
244 * Publicly accessible (or explain why not)
245 * Properly cited with attribution
246 * Relevant to claim being evaluated
247 * Original source preferred over secondary
248
249 **User Needs served**: UN-7 (Evidence transparency)
250
251 === 4.4 Confidence Scoring ===
252
253 **Automated confidence calculation based on**:
254 * Source quality scores
255 * Evidence consistency
256 * Contradiction detection
257 * Completeness of analysis
258 * Historical accuracy of similar claims
259
260 **Thresholds**:
261 * < 40%: Too low to publish (needs improvement)
262 * 40-60%: Published with "Low confidence" warning
263 * 60-80%: Published as standard
264 * 80-100%: Published as "High confidence"
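
A minimal sketch of the confidence calculation and thresholds above; the factor weights are illustrative assumptions, only the factor names and publication thresholds come from this page.

{{code language="python"}}
FACTOR_WEIGHTS = {
    "source_quality": 0.30,
    "evidence_consistency": 0.25,
    "contradiction_free": 0.20,
    "completeness": 0.15,
    "historical_accuracy": 0.10,
}

def confidence_score(factors: dict[str, float]) -> float:
    """Weighted average of factor scores (each 0.0-1.0), returned as 0-100%."""
    total = sum(FACTOR_WEIGHTS[name] * factors.get(name, 0.0)
                for name in FACTOR_WEIGHTS)
    return round(100 * total, 1)

def publication_outcome(confidence: float) -> str:
    """Map a 0-100% confidence score to the thresholds above."""
    if confidence < 40:
        return "too low to publish (needs improvement)"
    if confidence < 60:
        return "published with 'Low confidence' warning"
    if confidence < 80:
        return "published as standard"
    return "published as 'High confidence'"

score = confidence_score({"source_quality": 0.8, "evidence_consistency": 0.7,
                          "contradiction_free": 0.9, "completeness": 0.6,
                          "historical_accuracy": 0.7})
print(score, "->", publication_outcome(score))
{{/code}}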
265
266 **User Needs served**: UN-1 (Trust assessment), UN-9 (Methodology transparency)
267
268 == 5. Automated Risk Scoring ==
269
270 **Fulfills**: UN-10 (Manipulation detection), UN-16 (Appropriate review level)
271
272 **Replace manual risk tiers with continuous automated scoring**.
273
274 === 5.1 Risk Score Calculation ===
275
276 **Factors** (weighted algorithm):
277 * **Domain sensitivity**: Medical, legal, safety auto-flagged higher
278 * **Potential impact**: Views, citations, spread
279 * **Controversy level**: Flags, disputes, edit wars
280 * **Uncertainty**: Low confidence, contradictory evidence
281 * **Source reliability**: Track record of sources used
282
283 **Score**: 0-100 (higher = more risk)
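
A minimal sketch of the weighted risk calculation; the weights and example values are illustrative assumptions, only the factor list and the 0-100 scale come from this section.

{{code language="python"}}
RISK_WEIGHTS = {
    "domain_sensitivity": 0.30,
    "potential_impact": 0.25,
    "controversy_level": 0.20,
    "uncertainty": 0.15,
    "source_unreliability": 0.10,
}

def risk_score(factors: dict[str, float]) -> int:
    """Weighted combination of normalized factors (0.0-1.0) -> 0-100 risk score."""
    total = sum(RISK_WEIGHTS[name] * factors.get(name, 0.0)
                for name in RISK_WEIGHTS)
    return round(100 * total)

# A contested medical claim with shaky sources scores high.
print(risk_score({"domain_sensitivity": 1.0, "potential_impact": 0.8,
                  "controversy_level": 0.7, "uncertainty": 0.6,
                  "source_unreliability": 0.9}))
{{/code}}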
284
285 === 5.2 Automated Actions ===
286
287 * **Score > 80**: Flag for moderator review before publication
288 * **Score 60-80**: Publish with prominent warnings
289 * **Score 40-60**: Publish with standard warnings
290 * **Score < 40**: Publish normally
291
292 **Continuous monitoring**: Risk score recalculated as new information emerges
293
294 **User Needs served**: UN-10 (Detect manipulation tactics), UN-16 (Review status)
295
296 == 6. System Improvement Process ==
297
298 **Core principle**: Fix the system, not just the data.
299
300 === 6.1 Error Capture ===
301
302 **When users flag errors or make corrections**:
303 1. What was wrong? (categorize)
304 2. What should it have been?
305 3. Why did the system fail? (root cause)
306 4. How common is this pattern?
307 5. Store in ErrorPattern table (improvement queue)
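
A sketch of an ErrorPattern record as implied by the steps above; field names and category values are illustrative.

{{code language="python"}}
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ErrorPattern:
    category: str              # what was wrong (e.g. "missed-contradiction")
    observed_output: str       # what the system produced
    expected_output: str       # what it should have been
    root_cause: str            # why the system failed
    occurrence_count: int = 1  # how common the pattern is
    first_seen: date = field(default_factory=date.today)

    def record_occurrence(self) -> None:
        """Bump the counter when the same pattern is flagged again."""
        self.occurrence_count += 1

pattern = ErrorPattern(
    category="missed-contradiction",
    observed_output="Verdict ignored a retraction notice",
    expected_output="Retraction lowers support for the claim",
    root_cause="Retraction pages not crawled",
)
pattern.record_occurrence()
print(pattern.occurrence_count)  # 2
{{/code}}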
308
309 === 6.2 Continuous Improvement Cycle ===
310
311 1. **Review**: Analyze top error patterns
312 2. **Develop**: Create fix (prompt, model, validation)
313 3. **Test**: Validate fix on sample claims
314 4. **Deploy**: Roll out if quality improves
315 5. **Re-process**: Automatically update affected claims
316 6. **Monitor**: Track quality metrics
317
318 === 6.3 Quality Metrics Dashboard ===
319
320 **Track continuously**:
321 * Error rate by category
322 * Source quality distribution
323 * Confidence score trends
324 * User flag rate (issues found)
325 * Correction acceptance rate
326 * Re-work rate
327 * Claims processed per hour
328
329 **Goal**: continuous reduction in error rate
330
331 == 7. Automated Quality Monitoring ==
332
333 **Replace manual audit sampling with automated monitoring**.
334
335 === 7.1 Continuous Metrics ===
336
337 * **Source quality**: Track record database
338 * **Consistency**: Contradiction detection
339 * **Clarity**: Readability scores
340 * **Completeness**: Field validation
341 * **Accuracy**: User corrections tracked
342
343 === 7.2 Anomaly Detection ===
344
345 **Automated alerts for**:
346 * Sudden quality drops
347 * Unusual patterns
348 * Contradiction clusters
349 * Source reliability changes
350 * User behavior anomalies
351
352 === 7.3 Targeted Review ===
353
354 * Review only flagged items
355 * Random sampling for calibration (not quotas)
356 * Learn from corrections to improve automation
357
358 == 8. Functional Requirements ==
359
360 This section defines specific features that fulfill user needs.
361
362 === 8.1 Claim Intake & Normalization ===
363
364 ==== FR1 — Claim Intake ====
365
366 **Fulfills**: UN-2 (Claim extraction), UN-4 (Quick fact-checking), UN-12 (Submit claims)
367
368 * Users submit claims via simple form or API
369 * Claims can be text, URL, or image
370 * Duplicate detection (semantic similarity)
371 * Auto-categorization by domain
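
A sketch of semantic duplicate detection at intake; ##embed()## stands in for whatever sentence-embedding model is used, and the 0.9 similarity threshold is an assumption.

{{code language="python"}}
import math

def embed(text: str) -> list[float]:
    """Stand-in embedding: a real system would call a sentence-embedding model."""
    vec = [0.0] * 64
    for token in text.lower().split():
        vec[hash(token) % 64] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def find_duplicate(new_claim: str, existing: list[str],
                   threshold: float = 0.9) -> str | None:
    """Return the existing claim the submission duplicates, if any."""
    new_vec = embed(new_claim)
    for claim in existing:
        if cosine(new_vec, embed(claim)) >= threshold:
            return claim
    return None
{{/code}}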
372
373 ==== FR2 — Claim Normalization ====
374
375 **Fulfills**: UN-2 (Claim verification)
376
377 * Standardize to clear assertion format
378 * Extract key entities (who, what, when, where)
379 * Identify claim type (factual, predictive, evaluative)
380 * Link to existing similar claims
381
382 ==== FR3 — Claim Classification ====
383
384 **Fulfills**: UN-11 (Filtered research)
385
386 * Domain: Politics, Science, Health, etc.
387 * Type: Historical fact, current stat, prediction, etc.
388 * Risk score: Automated calculation
389 * Complexity: Simple, moderate, complex
390
391 === 8.2 Scenario System ===
392
393 ==== FR4 — Scenario Generation ====
394
395 **Fulfills**: UN-2 (Context-dependent verification), UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)
396
397 **Automated scenario creation**:
398 * AKEL analyzes claim and generates likely scenarios (use-cases and contexts)
399 * Each scenario includes: assumptions, definitions, boundaries, evidence context
400 * Users can flag incorrect scenarios
401 * System learns from corrections
402
403 **Key Concept**: Scenarios represent different interpretations or contexts (e.g., "Clinical trials with healthy adults" vs. "Real-world data with diverse populations")
404
405 ==== FR5 — Evidence Linking ====
406
407 **Fulfills**: UN-5 (Source tracing), UN-7 (Evidence transparency)
408
409 * Automated evidence discovery from sources
410 * Relevance scoring
411 * Contradiction detection
412 * Source quality assessment
413
414 ==== FR6 — Scenario Comparison ====
415
416 **Fulfills**: UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)
417
418 * Side-by-side comparison interface
419 * Highlight key differences between scenarios
420 * Show evidence supporting each scenario
421 * Display confidence scores per scenario
422
423 === 8.3 Verdicts & Analysis ===
424
425 ==== FR7 — Automated Verdicts ====
426
427 **Fulfills**: UN-1 (Trust score), UN-2 (Verification verdicts), UN-3 (Article summary with FactHarbor analysis summary), UN-13 (Cite verdicts)
428
429 * AKEL generates verdict based on evidence within each scenario
430 * **Likelihood range** displayed (e.g., "0.70-0.85 (likely true)") - NOT binary true/false
431 * **Uncertainty factors** explicitly listed (e.g., "Small sample sizes", "Long-term effects unknown")
432 * Confidence score displayed prominently
433 * Source quality indicators shown
434 * Contradictions noted
435 * Uncertainty acknowledged
436
437 **Key Innovation**: Detailed probabilistic verdicts with explicit uncertainty, not binary judgments
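
A sketch of the verdict structure implied above; field names are illustrative, the example values mirror the text.

{{code language="python"}}
from dataclasses import dataclass

@dataclass
class Verdict:
    scenario: str
    likelihood_low: float          # lower bound of the likelihood range
    likelihood_high: float         # upper bound
    label: str                     # e.g. "likely true"
    confidence: int                # 0-100%
    uncertainty_factors: list[str] # explicitly listed caveats
    contradictions: list[str]      # contradicting evidence, if any

    def summary(self) -> str:
        return (f"{self.likelihood_low:.2f}-{self.likelihood_high:.2f} "
                f"({self.label}), confidence {self.confidence}%")

v = Verdict(
    scenario="Clinical trials with healthy adults",
    likelihood_low=0.70, likelihood_high=0.85, label="likely true",
    confidence=82,
    uncertainty_factors=["Small sample sizes", "Long-term effects unknown"],
    contradictions=[],
)
print(v.summary())  # 0.70-0.85 (likely true), confidence 82%
{{/code}}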
438
439 ==== FR8 — Time Evolution ====
440
441 **Fulfills**: UN-15 (Verdict evolution timeline)
442
443 * Claims and verdicts update as new evidence emerges
444 * Version history maintained for all verdicts
445 * Changes highlighted
446 * Confidence score trends visible
447 * Users can see "as of date X, what did we know?"
448
449 === 8.4 User Interface & Presentation ===
450
451 ==== FR12 — Two-Panel Summary View (Article Summary with FactHarbor Analysis Summary) ====
452
453 **Fulfills**: UN-3 (Article Summary with FactHarbor Analysis Summary)
454
455 **Purpose**: Provide side-by-side comparison of what a document claims vs. FactHarbor's complete analysis of its credibility
456
457 **Left Panel: Article Summary**:
458 * Document title, source, and claimed credibility
459 * "The Big Picture" - main thesis or position change
460 * "Key Findings" - structured summary of document's main claims
461 * "Reasoning" - document's explanation for positions
462 * "Conclusion" - document's bottom line
463
464 **Right Panel: FactHarbor Analysis Summary**:
465 * FactHarbor's independent source credibility assessment
466 * Claim-by-claim verdicts with confidence scores
467 * Methodology assessment (strengths, limitations)
468 * Overall verdict on document quality
469 * Analysis ID for reference
470
471 **Design Principles**:
472 * No scrolling required - both panels visible simultaneously
473 * Visual distinction between "what they say" and "FactHarbor's analysis"
474 * Color coding for verdicts (supported, uncertain, refuted)
475 * Confidence percentages clearly visible
476 * Mobile responsive (panels stack vertically on small screens)
477
478 **Implementation Notes**:
479 * Generated automatically by AKEL for every analyzed document
480 * Updates when verdict evolves (maintains version history)
481 * Exportable as standalone summary report
482 * Shareable via permanent URL
483
484 ==== FR13 — In-Article Claim Highlighting ====
485
486 **Fulfills**: UN-17 (In-article claim highlighting)
487
488 **Purpose**: Enable readers to quickly assess claim credibility while reading by visually highlighting factual claims with color-coded indicators
489
490 ==== Visual Example: Article with Highlighted Claims ====
491
492 (% class="box" %)
493 (((
494 **Article: "New Study Shows Benefits of Mediterranean Diet"**
495
496 A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet.
497
498 (% class="box successmessage" style="margin:10px 0;" %)
499 (((
500 🟢 **Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups**
501
502 (% style="font-size:0.9em; color:#666;" %)
503 ↑ WELL SUPPORTED • 87% confidence
504 [[Click for evidence details →]]
505 (%%)
506 )))
507
508 The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers.
509
510 (% class="box warningmessage" style="margin:10px 0;" %)
511 (((
512 🟡 **Some experts believe this diet can completely prevent heart attacks**
513
514 (% style="font-size:0.9em; color:#666;" %)
515 ↑ UNCERTAIN • 45% confidence
516 Overstated - evidence shows risk reduction, not prevention
517 [[Click for details →]]
518 (%%)
519 )))
520
521 Dr. Maria Rodriguez, lead researcher, recommends incorporating more olive oil, fish, and vegetables into daily meals.
522
523 (% class="box errormessage" style="margin:10px 0;" %)
524 (((
525 🔴 **The study proves that saturated fats cause heart disease**
526
527 (% style="font-size:0.9em; color:#666;" %)
528 ↑ REFUTED • 15% confidence
529 Claim not supported by study design; correlation ≠ causation
530 [[Click for counter-evidence →]]
531 (%%)
532 )))
533
534 Participants also reported feeling more energetic and experiencing better sleep quality, though these were secondary measures.
535 )))
536
537 **Legend:**
538 * 🟢 = Well-supported claim (confidence ≥75%)
539 * 🟡 = Uncertain claim (confidence 40-74%)
540 * 🔴 = Refuted/unsupported claim (confidence <40%)
541 * Plain text = Non-factual content (context, opinions, recommendations)
542
543 ==== Tooltip on Hover/Click ====
544
545 (% class="box infomessage" %)
546 (((
547 **FactHarbor Analysis**
548
549 **Claim:**
550 "Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease"
551
552 **Verdict:** WELL SUPPORTED
553 **Confidence:** 87%
554
555 **Evidence Summary:**
556 * Meta-analysis of 12 RCTs confirms 23-28% risk reduction
557 * Consistent findings across multiple populations
558 * Published in peer-reviewed journal (high credibility)
559
560 **Uncertainty Factors:**
561 * Exact percentage varies by study (20-30% range)
562
563 [[View Full Analysis →]]
564 )))
565
566 **Color-Coding System**:
567 * **Green**: Well-supported claims (confidence ≥75%, strong evidence)
568 * **Yellow/Orange**: Uncertain claims (confidence 40-74%, conflicting or limited evidence)
569 * **Red**: Refuted or unsupported claims (confidence <40%, contradicted by evidence)
570 * **Gray/Neutral**: Non-factual content (opinions, questions, procedural text)
571
572 ==== Interactive Highlighting Example (Detailed View) ====
573
574 (% style="width:100%; border-collapse:collapse;" %)
575 |=**Article Text**|=**Status**|=**Analysis**
576 |(((A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Context - no highlighting
577 |(((//Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups//)))|(% style="background-color:#D4EDDA; text-align:center; padding:8px;" %)🟢 **WELL SUPPORTED**|(((
578 **87% confidence**
579
580 Meta-analysis of 12 RCTs confirms 23-28% risk reduction
581
582 [[View Full Analysis]]
583 )))
584 |(((The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Methodology - no highlighting
585 |(((//Some experts believe this diet can completely prevent heart attacks//)))|(% style="background-color:#FFF3CD; text-align:center; padding:8px;" %)🟡 **UNCERTAIN**|(((
586 **45% confidence**
587
588 Overstated - evidence shows risk reduction, not prevention
589
590 [[View Details]]
591 )))
592 |(((Dr. Rodriguez recommends incorporating more olive oil, fish, and vegetables into daily meals.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Recommendation - no highlighting
593 |(((//The study proves that saturated fats cause heart disease//)))|(% style="background-color:#F8D7DA; text-align:center; padding:8px;" %)🔴 **REFUTED**|(((
594 **15% confidence**
595
596 Claim not supported by study; correlation ≠ causation
597
598 [[View Counter-Evidence]]
599 )))
600
601 **Design Notes:**
602 * Highlighted claims use italics to distinguish from plain text
603 * Color backgrounds match XWiki message box colors (success/warning/error)
604 * Status column shows verdict prominently
605 * Analysis column provides quick summary with link to details
606
607 **User Actions**:
608 * **Hover** over highlighted claim → Tooltip appears
609 * **Click** highlighted claim → Detailed analysis modal/panel
610 * **Toggle** button to turn highlighting on/off
611 * **Keyboard**: Tab through highlighted claims
612
613 **Interaction Design**:
614 * Hover/click on highlighted claim → Show tooltip with:
615 ** Claim text
616 ** Verdict (e.g., "WELL SUPPORTED")
617 ** Confidence score (e.g., "85%")
618 ** Brief evidence summary
619 ** Link to detailed analysis
620 * Toggle highlighting on/off (user preference)
621 * Adjustable color intensity for accessibility
622
623 **Technical Requirements**:
624 * Real-time highlighting as page loads (non-blocking)
625 * Claim boundary detection (start/end of assertion)
626 * Handle nested or overlapping claims
627 * Preserve original article formatting
628 * Work with various content formats (HTML, plain text, PDFs)
629
630 **Performance Requirements**:
631 * Highlighting renders within 500ms of page load
632 * No perceptible delay in reading experience
633 * Efficient DOM manipulation (avoid reflows)
634
635 **Accessibility**:
636 * Color-blind friendly palette (use patterns/icons in addition to color)
637 * Screen reader compatible (ARIA labels for claim credibility)
638 * Keyboard navigation to highlighted claims
639
640 **Implementation Notes**:
641 * Claims extracted and analyzed by AKEL during initial processing
642 * Highlighting data stored as annotations with byte offsets
643 * Client-side rendering of highlights based on verdict data
644 * Mobile responsive (tap instead of hover)
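
A sketch of the highlight annotations and color mapping described above; class and field names are illustrative, the color thresholds follow the legend.

{{code language="python"}}
from dataclasses import dataclass

@dataclass
class ClaimHighlight:
    start: int         # character offset where the claim begins
    end: int           # offset where it ends
    verdict: str       # e.g. "WELL SUPPORTED"
    confidence: int    # 0-100%

    @property
    def color(self) -> str:
        if self.confidence >= 75:
            return "green"    # well-supported
        if self.confidence >= 40:
            return "yellow"   # uncertain
        return "red"          # refuted/unsupported

def render(text: str, highlights: list[ClaimHighlight]) -> str:
    """Wrap each highlighted span in a tag; a real client would build DOM nodes."""
    out, cursor = [], 0
    for h in sorted(highlights, key=lambda x: x.start):
        out.append(text[cursor:h.start])
        out.append(f'<span class="fh-{h.color}" title="{h.verdict} - {h.confidence}%">'
                   f'{text[h.start:h.end]}</span>')
        cursor = h.end
    out.append(text[cursor:])
    return "".join(out)
{{/code}}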
645
646 === 8.5 Workflow & Moderation ===
647
648 ==== FR9 — Publication Workflow ====
649
650 **Fulfills**: UN-1 (Fast access to verified content), UN-16 (Clear review status)
651
652 **Simple flow**:
653 1. Claim submitted
654 2. AKEL processes (automated)
655 3. If confidence > threshold: Publish (labeled as AI-generated)
656 4. If confidence < threshold: Flag for improvement
657 5. If risk score > threshold: Flag for moderator
658
659 **No multi-stage approval process**
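
A sketch of this routing logic, reusing the confidence thresholds from section 4.4 and the risk threshold from section 5.2; the function name and return strings are illustrative.

{{code language="python"}}
def route_claim(confidence: float, risk: float) -> str:
    """Decide what happens to a claim after AKEL processing."""
    if confidence < 40:
        return "flag for improvement"          # too low to publish
    if risk > 80:
        return "flag for moderator review"     # before publication
    return "publish (labeled as AI-generated)"

print(route_claim(confidence=72, risk=35))  # publish (labeled as AI-generated)
print(route_claim(confidence=72, risk=90))  # flag for moderator review
{{/code}}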
660
661 ==== FR10 — Moderation ====
662
663 **Focus on abuse, not routine quality**:
664 * Automated abuse detection
665 * Moderators handle flags
666 * Quick response to harmful content
667 * Minimal involvement in routine content
668
669 ==== FR11 — Audit Trail ====
670
671 **Fulfills**: UN-14 (API access to histories), UN-15 (Evolution tracking)
672
673 * All edits logged
674 * Version history public
675 * Moderation decisions documented
676 * System improvements tracked
677
678 == 9. Non-Functional Requirements ==
679
680 === 9.1 NFR1 — Performance ===
681
682 **Fulfills**: UN-4 (Fast fact-checking), UN-11 (Responsive filtering)
683
684 * Claim processing: < 30 seconds
685 * Search response: < 2 seconds
686 * Page load: < 3 seconds
687 * 99% uptime
688
689 === 9.2 NFR2 — Scalability ===
690
691 **Fulfills**: UN-14 (API access at scale)
692
693 * Handle 10,000 claims initially
694 * Scale to 1M+ claims
695 * Support 100K+ concurrent users
696 * Automated processing scales linearly
697
698 === 9.3 NFR3 — Transparency ===
699
700 **Fulfills**: UN-7 (Evidence transparency), UN-9 (Methodology transparency), UN-13 (Citable verdicts), UN-15 (Evolution visibility)
701
702 * All algorithms open source
703 * All data exportable
704 * All decisions documented
705 * Quality metrics public
706
707 === 9.4 NFR4 — Security & Privacy ===
708
709 * Follow [[Privacy Policy>>FactHarbor.Organisation.How-We-Work-Together.Privacy-Policy]]
710 * Secure authentication
711 * Data encryption
712 * Regular security audits
713
714 === 9.5 NFR5 — Maintainability ===
715
716 * Modular architecture
717 * Automated testing
718 * Continuous integration
719 * Comprehensive documentation
720
721 === NFR11: AKEL Quality Assurance Framework ===
722
723 **Fulfills:** AI safety, IFCN methodology transparency
724
725 **Specification:**
726
727 Multi-layer AI quality gates to detect hallucinations, low-confidence results, and logical inconsistencies.
728
729 ==== Quality Gate 1: Claim Extraction Validation ====
730
731 **Purpose:** Ensure extracted claims are factual assertions (not opinions/predictions)
732
733 **Checks:**
734 1. **Factual Statement Test:** Is this verifiable? (Yes/No)
735 2. **Opinion Detection:** Contains hedging language? ("I think", "probably", "best")
736 3. **Future Prediction Test:** Makes claims about future events?
737 4. **Specificity Score:** Contains specific entities, numbers, dates?
738
739 **Thresholds:**
740 * Factual: Must be "Yes"
741 * Opinion markers: <2 hedging phrases
742 * Specificity: ≥3 specific elements
743
744 **Action if Failed:** Flag as "Non-verifiable", do NOT generate verdict
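
A simplified sketch of Gate 1; a production implementation would use NLP models, here hedging and specificity are approximated with keyword and pattern counts, and the phrase list is illustrative.

{{code language="python"}}
import re

HEDGES = ("i think", "probably", "best", "i believe", "in my opinion")

def gate1_passes(claim: str, is_verifiable: bool) -> bool:
    """Apply the Gate 1 thresholds: verifiable, <2 hedges, >=3 specific elements."""
    hedge_count = sum(claim.lower().count(h) for h in HEDGES)
    # Approximate specific elements: numbers/dates plus capitalized entities.
    specifics = len(re.findall(r"\b\d[\d.,%]*\b", claim))
    specifics += len(re.findall(r"\b[A-Z][a-z]+\b", claim))
    return is_verifiable and hedge_count < 2 and specifics >= 3

claim = "The WHO reported 3 million cases in Europe in 2021."
print(gate1_passes(claim, is_verifiable=True))   # True -> verdict generation proceeds
print(gate1_passes("I think this is probably the best diet.", is_verifiable=False))  # False
{{/code}}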
745
746 ==== Quality Gate 2: Evidence Relevance Validation ====
747
748 **Purpose:** Ensure AI-linked evidence actually relates to claim
749
750 **Checks:**
751 1. **Semantic Similarity Score:** Evidence vs. claim (embeddings)
752 2. **Entity Overlap:** Shared people/places/things?
753 3. **Topic Relevance:** Discusses claim subject?
754
755 **Thresholds:**
756 * Similarity: ≥0.6 (cosine similarity)
757 * Entity overlap: ≥1 shared entity
758 * Topic relevance: ≥0.5
759
760 **Action if Failed:** Discard irrelevant evidence
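
A sketch of Gate 2 as a filter over candidate evidence; the similarity and relevance scores are assumed to come from the embedding machinery used elsewhere, only the thresholds are from this page.

{{code language="python"}}
def gate2_keep(similarity: float, shared_entities: int,
               topic_relevance: float) -> bool:
    """Keep evidence only if all three relevance checks pass."""
    return similarity >= 0.6 and shared_entities >= 1 and topic_relevance >= 0.5

evidence = [
    {"id": "E1", "similarity": 0.74, "shared_entities": 2, "topic_relevance": 0.8},
    {"id": "E2", "similarity": 0.41, "shared_entities": 0, "topic_relevance": 0.3},
]
kept = [e["id"] for e in evidence
        if gate2_keep(e["similarity"], e["shared_entities"], e["topic_relevance"])]
print(kept)  # ['E1'] - E2 is discarded as irrelevant
{{/code}}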
761
762 ==== Quality Gate 3: Scenario Coherence Check ====
763
764 **Purpose:** Validate scenario assumptions are logical and complete
765
766 **Checks:**
767 1. **Completeness:** All required fields populated
768 2. **Internal Consistency:** Assumptions don't contradict
769 3. **Distinguishability:** Scenarios meaningfully different
770
771 **Thresholds:**
772 * Required fields: 100%
773 * Contradiction score: <0.3
774 * Scenario similarity: <0.8
775
776 **Action if Failed:** Merge duplicate scenarios and reduce confidence by 20%
777
778 ==== Quality Gate 4: Verdict Confidence Assessment ====
779
780 **Purpose:** Only publish high-confidence verdicts
781
782 **Checks:**
783 1. **Evidence Count:** Minimum 2 sources
784 2. **Source Quality:** Average reliability ≥0.6
785 3. **Evidence Agreement:** Supporting vs. contradicting ≥0.6
786 4. **Uncertainty Factors:** Hedging in reasoning
787
788 **Confidence Tiers:**
789 * **HIGH (80-100%):** ≥3 sources, ≥0.7 quality, ≥80% agreement
790 * **MEDIUM (50-79%):** ≥2 sources, ≥0.6 quality, ≥60% agreement
791 * **LOW (0-49%):** <2 sources OR low quality/agreement
792 * **INSUFFICIENT:** <2 sources → DO NOT PUBLISH
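
A sketch of the tier assignment using the thresholds above; input names are illustrative.

{{code language="python"}}
def confidence_tier(source_count: int, avg_source_quality: float,
                    agreement_ratio: float) -> str:
    """Map evidence statistics to the confidence tiers defined above."""
    if source_count < 2:
        return "INSUFFICIENT"   # do not publish
    if source_count >= 3 and avg_source_quality >= 0.7 and agreement_ratio >= 0.8:
        return "HIGH"
    if avg_source_quality >= 0.6 and agreement_ratio >= 0.6:
        return "MEDIUM"
    return "LOW"

print(confidence_tier(4, 0.75, 0.9))  # HIGH
print(confidence_tier(1, 0.90, 1.0))  # INSUFFICIENT -> not published
{{/code}}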
793
794 **Implementation Phases:**
795 * **POC1:** Gates 1 & 4 only (basic validation)
796 * **POC2:** All 4 gates (complete framework)
797 * **V1.0:** Hardened with <5% hallucination rate
798
799 **Acceptance Criteria:**
800 * ✅ All gates operational
801 * ✅ Hallucination rate <5%
802 * ✅ Quality metrics public
803
804 === NFR12: Security Controls ===
805
806 **Fulfills:** Production readiness, legal compliance
807
808 **Requirements:**
809 1. **Input Validation:** SQL injection, XSS, CSRF prevention
810 2. **Rate Limiting:** 5 analyses per minute per IP
811 3. **Authentication:** Secure sessions, API key rotation
812 4. **Data Protection:** HTTPS, encryption, backups
813 5. **Security Audit:** Penetration testing, GDPR compliance
814
815 **Milestone:** Beta 0 (essential), V1.0 (complete) **BLOCKER**
816
817 === NFR13: Quality Metrics Transparency ===
818
819 **Fulfills:** IFCN transparency, user trust
820
821 **Public Metrics:**
822 * Quality gates performance
823 * Evidence quality stats
824 * Hallucination rate
825 * User feedback
826
827 **Milestone:** POC2 (internal), Beta 0 (public), V1.0 (real-time)
828
829
830
831 == 10. Requirements Traceability ==
832
833 For full traceability matrix showing which requirements fulfill which user needs, see:
834
835 * [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] - Section 8 includes comprehensive mapping tables
836
837 == 11. Related Pages ==
838
839 **Non-Functional Requirements (see Section 9):**
840 * [[NFR11 — AKEL Quality Assurance Framework>>#NFR11]]
841 * [[NFR12 — Security Controls>>#NFR12]]
842 * [[NFR13 — Quality Metrics Transparency>>#NFR13]]
843
844 **Other Requirements:**
845 * [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]
846 * [[V1.0 Requirements>>FactHarbor.Specification.Requirements.V10.]]
847 * [[Gap Analysis>>FactHarbor.Specification.Requirements.GapAnalysis]]
848
850 * [[Architecture>>FactHarbor.Specification.Architecture.WebHome]] - How requirements are implemented
851 * [[Data Model>>FactHarbor.Specification.Data Model.WebHome]] - Data structures supporting requirements
852 * [[Workflows>>FactHarbor.Specification.Workflows.WebHome]] - User interaction workflows
853 * [[AKEL>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] - AI system fulfilling automation requirements
854 * [[Global Rules>>FactHarbor.Organisation.How-We-Work-Together.GlobalRules.WebHome]]
855 * [[Privacy Policy>>FactHarbor.Organisation.How-We-Work-Together.Privacy-Policy]]
856
857 = V0.9.70 Additional Requirements =
858
859 == Functional Requirements (Additional) ==
860
861 === FR44: ClaimReview Schema Implementation ===
862
863 Generate valid ClaimReview structured data for Google/Bing visibility.
864
865 **Schema.org Mapping:**
866 * 80-100% likelihood → 5 (Highly Supported)
867 * 60-79% → 4 (Supported)
868 * 40-59% → 3 (Mixed)
869 * 20-39% → 2 (Questionable)
870 * 0-19% → 1 (Refuted)
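
A sketch of emitting ClaimReview JSON-LD from a likelihood score; the mapping table follows the list above, while the URL and claim text are placeholders.

{{code language="python"}}
import json

RATINGS = [  # (minimum likelihood %, ratingValue, alternateName)
    (80, 5, "Highly Supported"),
    (60, 4, "Supported"),
    (40, 3, "Mixed"),
    (20, 2, "Questionable"),
    (0, 1, "Refuted"),
]

def claim_review(claim_text: str, likelihood: int, analysis_url: str) -> dict:
    """Build a minimal schema.org ClaimReview object for the given likelihood."""
    value, name = next((v, n) for minimum, v, n in RATINGS if likelihood >= minimum)
    return {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "url": analysis_url,
        "claimReviewed": claim_text,
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": value,
            "bestRating": 5,
            "worstRating": 1,
            "alternateName": name,
        },
    }

print(json.dumps(claim_review("Example claim", 72, "https://example.org/analysis/123"),
                 indent=2))
{{/code}}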
871
872 **Milestone:** V1.0
873
874 === FR45: User Corrections Notification System ===
875
876 Notify users when analyses are corrected.
877
878 **Mechanisms:**
879 1. In-page banner (30 days)
880 2. Public correction log
881 3. Email notifications (opt-in)
882 4. RSS/API feed
883
884 **Milestone:** Beta 0 (basic), V1.0 (complete) **BLOCKER**
885
886 === FR46: Image Verification System ===
887
888 **Methods:**
889 1. Reverse image search
890 2. EXIF metadata analysis
891 3. Manipulation detection (basic)
892 4. Context verification
893
894 **Milestone:** Beta 0 (basic), V1.0 (extended)
895
896 === FR47: Archive.org Integration ===
897
898 Auto-save evidence sources to Wayback Machine.
899
900 **Milestone:** Beta 0
901
902 === FR48: Safety Framework for Contributors ===
903
904 Protect contributors from harassment and legal threats.
905
906 **Milestone:** V1.1
907
908 === FR49: A/B Testing Framework ===
909
910 Test AKEL approaches and UI designs systematically.
911
912 **Milestone:** V1.0
913
914 === FR50: OSINT Toolkit Integration ===
915
916
917
918 **Priority:** HIGH (V1.1)
919 **Fulfills:** Advanced media verification
920 **Phase:** V1.1
921
922 **Purpose:** Integrate open-source intelligence tools for advanced verification.
923
924 **Tools to Integrate:**
925 * InVID/WeVerify (video verification)
926 * Bellingcat toolkit
927 * Additional TBD based on V1.0 learnings
928
929 === FR51: Video Verification System ===
930
931
932
933 **Priority:** HIGH (V1.1)
934 **Fulfills:** UN-27 (Visual claims), advanced media verification
935 **Phase:** V1.1
936
937 **Purpose:** Verify video-based claims.
938
939 **Specification:**
940 * Keyframe extraction
941 * Reverse video search
942 * Deepfake detection (AI-powered)
943 * Metadata analysis
944 * Acoustic signature analysis
945
946 === FR52: Interactive Detection Training ===
947
948
949
950 **Priority:** MEDIUM (V1.5)
951 **Fulfills:** Media literacy education
952 **Phase:** V1.5
953
954 **Purpose:** Teach users to identify misinformation.
955
956 **Specification:**
957 * Interactive tutorials
958 * Practice exercises
959 * Detection quizzes
960 * Gamification elements
961
962 === FR53: Cross-Organizational Sharing ===
963
964
965
966 **Priority:** MEDIUM (V1.5)
967 **Fulfills:** Collaboration with other fact-checkers
968 **Phase:** V1.5
969
970 **Purpose:** Share findings with IFCN/EFCSN members.
971
972 **Specification:**
973 * API for fact-checking organizations
974 * Structured data exchange
975 * Privacy controls
976 * Attribution requirements
977
978
979 == Summary ==
980
981 **V1.0 Critical Requirements (Must Have):**
982
983 * FR44: ClaimReview Schema ✅
984 * FR45: Corrections Notification ✅
985 * FR46: Image Verification ✅
986 * FR47: Archive.org Integration ✅
987 * FR48: Contributor Safety ✅
988 * FR49: A/B Testing ✅
989 * FR54: Evidence Deduplication ✅
990 * NFR11: Quality Assurance Framework ✅
991 * NFR12: Security Controls ✅
992 * NFR13: Quality Metrics Dashboard ✅
993
994 **V1.1+ (Future):**
995
996 * FR50: OSINT Integration
997 * FR51: Video Verification
998 * FR52: Detection Training
999 * FR53: Cross-Org Sharing
1000
1001
1002 **Total:** 10 critical requirements for V1.0
1003
1004 === FR54: Evidence Deduplication ===
1005
1006
1007
1008 **Priority:** CRITICAL (POC2/Beta)
1009 **Fulfills:** Accurate evidence counting, quality metrics
1010 **Phase:** POC2, Beta 0, V1.0
1011
1012 **Purpose:** Avoid counting the same source multiple times when it appears in different forms.
1013
1014 **Specification:**
1015
1016 **Deduplication Logic:**
1017
1018 1. **URL Normalization:**
1019 * Remove tracking parameters (?utm_source=...)
1020 * Normalize http/https
1021 * Normalize www/non-www
1022 * Handle redirects
1023
1024 2. **Content Similarity:**
1025 * If two sources have >90% text similarity → Same source
1026 * If one is subset of other → Same source
1027 * Use fuzzy matching for minor differences
1028
1029 3. **Cross-Domain Syndication:**
1030 * Detect wire service content (AP, Reuters)
1031 * Mark as single source if syndicated
1032 * Count original publication only
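
A sketch of the URL-normalization step in the logic above; the tracking-parameter list is illustrative and redirect handling is omitted.

{{code language="python"}}
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

TRACKING_PREFIXES = ("utm_", "fbclid", "gclid")

def normalize_url(url: str) -> str:
    """Normalize scheme, host, and query so duplicate sources compare equal."""
    parts = urlparse(url)
    host = parts.netloc.lower().removeprefix("www.")
    query = [(k, v) for k, v in parse_qsl(parts.query)
             if not k.startswith(TRACKING_PREFIXES)]
    return urlunparse(("https", host, parts.path.rstrip("/"),
                       "", urlencode(query), ""))

a = "http://www.example.com/story/?utm_source=feed"
b = "https://example.com/story"
print(normalize_url(a) == normalize_url(b))  # True -> counted as one source
{{/code}}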
1033
1034 **Display:**
1035
1036 {{code}}
1037 Evidence Sources (3 unique, 5 total):
1038
1039 1. Original Article (NYTimes)
1040 - Also appeared in: WashPost, Guardian (syndicated)
1041
1042 2. Research Paper (Nature)
1043
1044 3. Official Statement (WHO)
1045 {{/code}}
1046
1047 **Acceptance Criteria:**
1048
1049 * ✅ URL normalization works
1050 * ✅ Content similarity detected
1051 * ✅ Syndicated content identified
1052 * ✅ Unique vs. total counts accurate
1053 * ✅ Improves evidence quality metrics
1054
1055
1056 == Additional Requirements (Lower Priority) ==

=== FR7: Automated Verdicts (Enhanced with Quality Gates) ===
1057
1058 **POC1+ Enhancement:**
1059
1060 After AKEL generates verdict, it passes through quality gates:
1061
1062 {{code}}
1063 Workflow:
1064 1. Extract claims
1065
1066 2. [GATE 1] Validate fact-checkable
1067
1068 3. Generate scenarios
1069
1070 4. Generate verdicts
1071
1072 5. [GATE 4] Validate confidence
1073
1074 6. Display to user
1075 {{/code}}
1076
1077 **Updated Verdict States:**
1078 * PUBLISHED
1079 * INSUFFICIENT_EVIDENCE
1080 * NON_FACTUAL_CLAIM
1081 * PROCESSING
1082 * ERROR
1083
1084 === FR4: Analysis Summary (Enhanced with Quality Metadata) ===
1085
1086 **POC1+ Enhancement:**
1087
1088 Display quality indicators:
1089
1090 {{code}}
1091 Analysis Summary:
1092 Verifiable Claims: 3/5
1093 High Confidence Verdicts: 1
1094 Medium Confidence: 2
1095 Evidence Sources: 12
1096 Avg Source Quality: 0.73
1097 Quality Score: 8.5/10
1098 {{/code}}