Requirements

Version 1.1 by Robert Schaub on 2025/12/22 14:34

This page defines Roles, Content States, Rules, and System Requirements for FactHarbor.

Core Philosophy: Invest in system improvement, not manual data correction. When AI makes errors, improve the algorithm and re-process automatically.

== Navigation ==

  • User Needs - What users need from FactHarbor (drives these requirements)

  • This page - How we fulfill those needs through system design

How to read this page:

  1. User Needs drive Requirements: See User Needs for what users need
  2. Requirements define implementation: This page shows how we fulfill those needs
  3. Functional Requirements (FR): Specific features and capabilities
  4. Non-Functional Requirements (NFR): Quality attributes (performance, security, etc.)

Each requirement references which User Needs it fulfills.

== 1. Roles ==

Fulfills: UN-12 (Submit claims), UN-13 (Cite verdicts), UN-14 (API access)

FactHarbor uses three simple roles plus a reputation system.

=== 1.1 Reader ===

Who: Anyone (no login required)

Can:
  • Browse and search claims
  • View scenarios, evidence, verdicts, and confidence scores
  • Flag issues or errors
  • Use filters, search, and visualization tools
  • Submit claims automatically (new claims added if not duplicates)

Cannot:
  • Modify content
  • Access edit history details

User Needs served: UN-1 (Trust assessment), UN-2 (Claim verification), UN-3 (Article summary with FactHarbor analysis summary), UN-4 (Social media fact-checking), UN-5 (Source tracing), UN-7 (Evidence transparency), UN-8 (Understanding disagreement), UN-12 (Submit claims), UN-17 (In-article highlighting)

=== 1.2 Contributor ===

Who: Registered users (earn reputation through contributions)

Can:
  • Everything a Reader can do
  • Edit claims, evidence, and scenarios
  • Add sources and citations
  • Suggest improvements to AI-generated content
  • Participate in discussions
  • Earn reputation points for quality contributions

Reputation System:
  • New contributors: Limited edit privileges
  • Established contributors (established reputation): Full edit access
  • Trusted contributors (substantial reputation): Can approve certain changes
  • Reputation earned through: Accepted edits, helpful flags, quality contributions
  • Reputation lost through: Reverted edits, invalid flags, abuse

Cannot:
  • Delete or hide content (only moderators)
  • Override moderation decisions

User Needs served: UN-13 (Cite and contribute)

=== 1.3 Moderator ===

Who: Trusted community members with proven track record, appointed by governance board

Can:
  • Review flagged content
  • Hide harmful or abusive content
  • Resolve disputes between contributors
  • Issue warnings or temporary bans
  • Make final decisions on content disputes
  • Access full audit logs

Cannot:
  • Change governance rules
  • Permanently ban users without board approval
  • Override technical quality gates

Note: Small team (3-5 initially), supported by automated moderation tools.

=== 1.4 Domain Trusted Contributors (Optional, Task-Specific) ===

Who: Subject matter specialists invited for specific high-stakes disputes

Not a permanent role: Contacted externally when needed for contested claims in their domain

When used:
  • Medical claims with life/safety implications
  • Legal interpretations with significant impact
  • Scientific claims with high controversy
  • Technical claims requiring specialized knowledge

Process:
  • Moderator identifies need for expert input
  • Contact expert externally (don't require them to be users)
  • Trusted Contributor provides written opinion with sources
  • Opinion added to claim record
  • Trusted Contributor acknowledged in claim

User Needs served: UN-16 (Expert validation status)

== 2. Content States ==

Fulfills: UN-1 (Trust indicators), UN-16 (Review status transparency)

FactHarbor uses two content states. Focus is on transparency and confidence scoring, not gatekeeping.

=== 2.1 Published ===

Status: Visible to all users

Includes:
  • AI-generated analyses (default state)
  • User-contributed content
  • Edited/improved content

Quality Indicators (displayed with content):
  • Confidence Score: 0-100% (AI's confidence in analysis)
  • Source Quality Score: 0-100% (based on source track record)
  • Controversy Flag: If high dispute/edit activity
  • Completeness Score: % of expected fields filled
  • Last Updated: Date of most recent change
  • Edit Count: Number of revisions
  • Review Status: AI-generated / Human-reviewed / Expert-validated

Automatic Warnings:
  • Confidence < 60%: "Low confidence - use caution"
  • Source quality < 40%: "Sources may be unreliable"
  • High controversy: "Disputed - multiple interpretations exist"
  • Medical/Legal/Safety domain: "Seek professional advice"

User Needs served: UN-1 (Trust score), UN-9 (Methodology transparency), UN-15 (Evolution timeline), UN-16 (Review status)
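A minimal sketch of how these warning rules could be wired together; the function name and inputs are illustrative, not part of the specification:

{{code language="python"}}
def automatic_warnings(confidence: float, source_quality: float,
                       high_controversy: bool, domain: str) -> list[str]:
    """Return the warning banners to display alongside published content.
    Scores are 0-100, matching the quality indicators above."""
    warnings = []
    if confidence < 60:
        warnings.append("Low confidence - use caution")
    if source_quality < 40:
        warnings.append("Sources may be unreliable")
    if high_controversy:
        warnings.append("Disputed - multiple interpretations exist")
    if domain in {"medical", "legal", "safety"}:
        warnings.append("Seek professional advice")
    return warnings
{{/code}}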
=== 2.2 Hidden ===

Status: Not visible to regular users (only to moderators)

Reasons:

  • Spam or advertising
  • Personal attacks or harassment
  • Illegal content
  • Privacy violations
  • Deliberate misinformation (verified)
  • Abuse or harmful content

Process:
  • Automated detection flags for moderator review
  • Moderator confirms and hides
  • Original author notified with reason
  • Can appeal to the board if they dispute the moderator's decision

Note: Content is hidden, not deleted (preserving the audit trail)

== 3. Contribution Rules ==

=== 3.1 All Contributors Must ===

  • Provide sources for factual claims
  • Use clear, neutral language in FactHarbor's own summaries
  • Respect others and maintain civil discourse
  • Accept community feedback constructively
  • Focus on improving quality, not protecting ego

=== 3.2 AKEL (AI System) ===

AKEL is the primary system. Human contributions supplement and train AKEL.

AKEL Must:
  • Mark all outputs as AI-generated
  • Display confidence scores prominently
  • Provide source citations
  • Flag uncertainty clearly
  • Identify contradictions in evidence
  • Learn from human corrections

When AKEL Makes Errors:
  1. Capture the error pattern (what, why, how common)
  2. Improve the system (better prompt, model, validation)
  3. Re-process affected claims automatically
  4. Measure improvement (did quality increase?)

Human Role: Train AKEL through corrections, not replace AKEL

=== 3.3 Contributors Should ===

  • Improve clarity and structure
  • Add missing sources
  • Flag errors for system improvement
  • Suggest better ways to present information
  • Participate in quality discussions

=== 3.4 Moderators Must ===

  • Be impartial
  • Document moderation decisions
  • Respond to appeals promptly
  • Use automated tools to scale efforts
  • Focus on abuse/harm, not routine quality control

== 4. Quality Standards ==

Fulfills: UN-5 (Source reliability), UN-6 (Publisher track records), UN-7 (Evidence transparency), UN-9 (Methodology transparency)

=== 4.1 Source Requirements ===

Track Record Over Credentials:
  • Sources evaluated by historical accuracy
  • Correction policy matters
  • Independence from conflicts of interest
  • Methodology transparency

Source Quality Database:
  • Automated tracking of source accuracy
  • Correction frequency
  • Reliability score (updated continuously)
  • Users can see source track record

No automatic trust for government, academia, or media - all evaluated by track record.

User Needs served: UN-5 (Source provenance), UN-6 (Publisher reliability)

=== 4.2 Claim Requirements ===

  • Clear subject and assertion
  • Verifiable with available information
  • Sourced (or explicitly marked as needing sources)
  • Neutral language in FactHarbor summaries
  • Appropriate context provided

User Needs served: UN-2 (Claim extraction and verification)

=== 4.3 Evidence Requirements ===

  • Publicly accessible (or explain why not)
  • Properly cited with attribution
  • Relevant to claim being evaluated
  • Original source preferred over secondary

User Needs served: UN-7 (Evidence transparency)

=== 4.4 Confidence Scoring ===

Automated confidence calculation based on:
  • Source quality scores
  • Evidence consistency
  • Contradiction detection
  • Completeness of analysis
  • Historical accuracy of similar claims

Thresholds:
  • < 40%: Too low to publish (needs improvement)
  • 40-60%: Published with "Low confidence" warning
  • 60-80%: Published as standard
  • 80-100%: Published as "High confidence"

User Needs served: UN-1 (Trust assessment), UN-9 (Methodology transparency)
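A sketch of what the automated calculation could look like. The requirement names the factors and thresholds but not the weights, so the weights below are invented for illustration:

{{code language="python"}}
# Illustrative weights - the specification does not fix them.
WEIGHTS = {
    "source_quality": 0.30,
    "evidence_consistency": 0.25,
    "contradiction_freedom": 0.20,
    "completeness": 0.15,
    "historical_accuracy": 0.10,
}

def confidence_score(factors: dict[str, float]) -> float:
    """Weighted sum of factor scores (each in 0.0-1.0), scaled to 0-100%."""
    return 100 * sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)

def publication_label(score: float) -> str:
    """Map a confidence score to the publication thresholds above."""
    if score < 40:
        return "DO_NOT_PUBLISH"          # too low - needs improvement
    if score < 60:
        return "PUBLISH_LOW_CONFIDENCE"  # with "Low confidence" warning
    if score < 80:
        return "PUBLISH_STANDARD"
    return "PUBLISH_HIGH_CONFIDENCE"
{{/code}}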
== 5. Automated Risk Scoring ==

Fulfills: UN-10 (Manipulation detection), UN-16 (Appropriate review level)

Replace manual risk tiers with continuous automated scoring.

=== 5.1 Risk Score Calculation ===

Factors (weighted algorithm):

  • Domain sensitivity: Medical, legal, safety auto-flagged higher
  • Potential impact: Views, citations, spread
  • Controversy level: Flags, disputes, edit wars
  • Uncertainty: Low confidence, contradictory evidence
  • Source reliability: Track record of sources used

Score: 0-100 (higher = more risk)

=== 5.2 Automated Actions ===

  • Score > 80: Flag for moderator review before publication
  • Score 60-80: Publish with prominent warnings
  • Score 40-60: Publish with standard warnings
  • Score < 40: Publish normally

Continuous monitoring: Risk score recalculated as new information emerges

User Needs served: UN-10 (Detect manipulation tactics), UN-16 (Review status)
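A sketch under the same caveat as Section 4.4: only the 0-100 scale and the action thresholds come from the requirement; the factor weights are placeholders:

{{code language="python"}}
# Illustrative weights; the actual weighting is an implementation decision.
RISK_WEIGHTS = {
    "domain_sensitivity": 0.30,    # medical/legal/safety flagged higher
    "potential_impact": 0.25,      # views, citations, spread
    "controversy": 0.20,           # flags, disputes, edit wars
    "uncertainty": 0.15,           # low confidence, contradictions
    "source_unreliability": 0.10,  # poor source track record
}

def risk_score(factors: dict[str, float]) -> float:
    """Weighted 0-100 risk score from factor values in 0.0-1.0."""
    return 100 * sum(RISK_WEIGHTS[k] * factors[k] for k in RISK_WEIGHTS)

def risk_action(score: float) -> str:
    """Map a risk score to the automated actions above."""
    if score > 80:
        return "MODERATOR_REVIEW_BEFORE_PUBLICATION"
    if score > 60:
        return "PUBLISH_WITH_PROMINENT_WARNINGS"
    if score > 40:
        return "PUBLISH_WITH_STANDARD_WARNINGS"
    return "PUBLISH_NORMALLY"
{{/code}}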
== 6. System Improvement Process ==

Core principle: Fix the system, not just the data.

=== 6.1 Error Capture ===

When users flag errors or make corrections:

  1. What was wrong? (categorize)
  2. What should it have been?
  3. Why did the system fail? (root cause)
  4. How common is this pattern?
  5. Store in ErrorPattern table (improvement queue)
=== 6.2 Weekly Improvement Cycle ===

  1. Review: Analyze top error patterns
  2. Develop: Create fix (prompt, model, validation)
  3. Test: Validate fix on sample claims
  4. Deploy: Roll out if quality improves
  5. Re-process: Automatically update affected claims
  6. Monitor: Track quality metrics

=== 6.3 Quality Metrics Dashboard ===

Track continuously:
  • Error rate by category
  • Source quality distribution
  • Confidence score trends
  • User flag rate (issues found)
  • Correction acceptance rate
  • Re-work rate
  • Claims processed per hour

Goal: 10% monthly improvement in error rate

== 7. Automated Quality Monitoring ==

Replace manual audit sampling with automated monitoring.

=== 7.1 Continuous Metrics ===

  • Source quality: Track record database
  • Consistency: Contradiction detection
  • Clarity: Readability scores
  • Completeness: Field validation
  • Accuracy: User corrections tracked

=== 7.2 Anomaly Detection ===

Automated alerts for:
  • Sudden quality drops
  • Unusual patterns
  • Contradiction clusters
  • Source reliability changes
  • User behavior anomalies

=== 7.3 Targeted Review ===

  • Review only flagged items
  • Random sampling for calibration (not quotas)
  • Learn from corrections to improve automation

== 8. Functional Requirements ==

This section defines specific features that fulfill user needs.

=== 8.1 Claim Intake & Normalization ===

==== FR1 — Claim Intake ====

Fulfills: UN-2 (Claim extraction), UN-4 (Quick fact-checking), UN-12 (Submit claims)

  • Users submit claims via simple form or API
  • Claims can be text, URL, or image
  • Duplicate detection (semantic similarity)
  • Auto-categorization by domain
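FR1's duplicate detection could look roughly like this; using embedding cosine similarity and a 0.9 threshold are assumptions, since the requirement only says "semantic similarity":

{{code language="python"}}
import numpy as np

def is_duplicate(new_embedding: np.ndarray,
                 existing_embeddings: list[np.ndarray],
                 threshold: float = 0.9) -> bool:
    """Treat a submitted claim as a duplicate if its embedding is very
    close to an existing claim's embedding."""
    for emb in existing_embeddings:
        cosine = float(np.dot(new_embedding, emb)
                       / (np.linalg.norm(new_embedding) * np.linalg.norm(emb)))
        if cosine >= threshold:
            return True
    return False
{{/code}}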
==== FR2 — Claim Normalization ====

Fulfills: UN-2 (Claim verification)

  • Standardize to clear assertion format
  • Extract key entities (who, what, when, where)
  • Identify claim type (factual, predictive, evaluative)
  • Link to existing similar claims

==== FR3 — Claim Classification ====

Fulfills: UN-11 (Filtered research)

  • Domain: Politics, Science, Health, etc.
  • Type: Historical fact, current stat, prediction, etc.
  • Risk score: Automated calculation
  • Complexity: Simple, moderate, complex

=== 8.2 Scenario System ===

==== FR4 — Scenario Generation ====

Fulfills: UN-2 (Context-dependent verification), UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)

Automated scenario creation:
  • AKEL analyzes claim and generates likely scenarios (use-cases and contexts)
  • Each scenario includes: assumptions, definitions, boundaries, evidence context
  • Users can flag incorrect scenarios
  • System learns from corrections

Key Concept: Scenarios represent different interpretations or contexts (e.g., "Clinical trials with healthy adults" vs. "Real-world data with diverse populations")

==== FR5 — Evidence Linking ====

Fulfills: UN-5 (Source tracing), UN-7 (Evidence transparency)

  • Automated evidence discovery from sources
  • Relevance scoring
  • Contradiction detection
  • Source quality assessment

==== FR6 — Scenario Comparison ====

Fulfills: UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)

  • Side-by-side comparison interface
  • Highlight key differences between scenarios
  • Show evidence supporting each scenario
  • Display confidence scores per scenario

=== 8.3 Verdicts & Analysis ===

==== FR7 — Automated Verdicts ====

Fulfills: UN-1 (Trust score), UN-2 (Verification verdicts), UN-3 (Article summary with FactHarbor analysis summary), UN-13 (Cite verdicts)

  • AKEL generates verdict based on evidence within each scenario
  • Likelihood range displayed (e.g., "0.70-0.85 (likely true)") - NOT binary true/false
  • Uncertainty factors explicitly listed (e.g., "Small sample sizes", "Long-term effects unknown")
  • Confidence score displayed prominently
  • Source quality indicators shown
  • Contradictions noted
  • Uncertainty acknowledged

Key Innovation: Detailed probabilistic verdicts with explicit uncertainty, not binary judgments

==== FR8 — Time Evolution ====

Fulfills: UN-15 (Verdict evolution timeline)

  • Claims and verdicts update as new evidence emerges
  • Version history maintained for all verdicts
  • Changes highlighted
  • Confidence score trends visible
  • Users can see "as of date X, what did we know?"

=== 8.4 User Interface & Presentation ===

==== FR12 — Two-Panel Summary View (Article Summary with FactHarbor Analysis Summary) ====

Fulfills: UN-3 (Article Summary with FactHarbor Analysis Summary)

Purpose: Provide side-by-side comparison of what a document claims vs. FactHarbor's complete analysis of its credibility

Left Panel: Article Summary:
  • Document title, source, and claimed credibility
  • "The Big Picture" - main thesis or position change
  • "Key Findings" - structured summary of document's main claims
  • "Reasoning" - document's explanation for positions
  • "Conclusion" - document's bottom line Right Panel: FactHarbor Analysis Summary:
  • FactHarbor's independent source credibility assessment
  • Claim-by-claim verdicts with confidence scores
  • Methodology assessment (strengths, limitations)
  • Overall verdict on document quality
  • Analysis ID for reference

Design Principles:
  • No scrolling required - both panels visible simultaneously
  • Visual distinction between "what they say" and "FactHarbor's analysis"
  • Color coding for verdicts (supported, uncertain, refuted)
  • Confidence percentages clearly visible
  • Mobile responsive (panels stack vertically on small screens)

Implementation Notes:
  • Generated automatically by AKEL for every analyzed document
  • Updates when verdict evolves (maintains version history)
  • Exportable as standalone summary report
  • Shareable via permanent URL

==== FR13 — In-Article Claim Highlighting ====

Fulfills: UN-17 (In-article claim highlighting)

Purpose: Enable readers to quickly assess claim credibility while reading by visually highlighting factual claims with color-coded indicators

==== Visual Example: Article with Highlighted Claims ====

Article: "New Study Shows Benefits of Mediterranean Diet"

A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet.

    🟢 Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups
    ↑ WELL SUPPORTED • 87% confidence
    Click for evidence details →

    The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers. 

    🟡 Some experts believe this diet can completely prevent heart attacks
    ↑ UNCERTAIN • 45% confidence
    Overstated - evidence shows risk reduction, not prevention
    Click for details →

    Dr. Maria Rodriguez, lead researcher, recommends incorporating more olive oil, fish, and vegetables into daily meals. 

    🔴 The study proves that saturated fats cause heart disease
    ↑ REFUTED • 15% confidence
    Claim not supported by study design; correlation ≠ causation
    Click for counter-evidence →

    Participants also reported feeling more energetic and experiencing better sleep quality, though these were secondary measures.

    Legend:
  • 🟢 = Well-supported claim (confidence ≥75%)
  • 🟡 = Uncertain claim (confidence 40-74%)
  • 🔴 = Refuted/unsupported claim (confidence <40%)
  • Plain text = Non-factual content (context, opinions, recommendations)

==== Tooltip on Hover/Click ====

    FactHarbor Analysis

    Claim: "Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease"

    Verdict: WELL SUPPORTED
    Confidence: 87%

    Evidence Summary:
    • Meta-analysis of 12 RCTs confirms 23-28% risk reduction
    • Consistent findings across multiple populations
    • Published in peer-reviewed journal (high credibility)

    Uncertainty Factors:
    • Exact percentage varies by study (20-30% range)

    View Full Analysis →

Color-Coding System:
  • Green: Well-supported claims (confidence ≥75%, strong evidence)
  • Yellow/Orange: Uncertain claims (confidence 40-74%, conflicting or limited evidence)
  • Red: Refuted or unsupported claims (confidence <40%, contradicted by evidence)
  • Gray/Neutral: Non-factual content (opinions, questions, procedural text)

==== Interactive Highlighting Example (Detailed View) ====
|=Article Text|=Status|=Analysis
|A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet.|Plain text|Context - no highlighting
|Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups|🟢 WELL SUPPORTED (87% confidence)|Meta-analysis of 12 RCTs confirms 23-28% risk reduction. View Full Analysis
|The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers.|Plain text|Methodology - no highlighting
|Some experts believe this diet can completely prevent heart attacks|🟡 UNCERTAIN (45% confidence)|Overstated - evidence shows risk reduction, not prevention. View Details
|Dr. Rodriguez recommends incorporating more olive oil, fish, and vegetables into daily meals.|Plain text|Recommendation - no highlighting
|The study proves that saturated fats cause heart disease|🔴 REFUTED (15% confidence)|Claim not supported by study; correlation ≠ causation. View Counter-Evidence

Design Notes:
  • Highlighted claims use italics to distinguish from plain text
  • Color backgrounds match XWiki message box colors (success/warning/error)
  • Status column shows verdict prominently
  • Analysis column provides quick summary with link to details

User Actions:
  • Hover over highlighted claim → Tooltip appears
  • Click highlighted claim → Detailed analysis modal/panel
  • Toggle button to turn highlighting on/off
  • Keyboard: Tab through highlighted claims

Interaction Design:

  • Hover/click on highlighted claim → Show tooltip with:
    • Claim text
    • Verdict (e.g., "WELL SUPPORTED")
    • Confidence score (e.g., "85%")
    • Brief evidence summary
    • Link to detailed analysis
  • Toggle highlighting on/off (user preference)
  • Adjustable color intensity for accessibility

Technical Requirements:
  • Real-time highlighting as page loads (non-blocking)
  • Claim boundary detection (start/end of assertion)
  • Handle nested or overlapping claims
  • Preserve original article formatting
  • Work with various content formats (HTML, plain text, PDFs)

Performance Requirements:
  • Highlighting renders within 500ms of page load
  • No perceptible delay in reading experience
  • Efficient DOM manipulation (avoid reflows)

Accessibility:
  • Color-blind friendly palette (use patterns/icons in addition to color)
  • Screen reader compatible (ARIA labels for claim credibility)
  • Keyboard navigation to highlighted claims

Implementation Notes:
  • Claims extracted and analyzed by AKEL during initial processing
  • Highlighting data stored as annotations with byte offsets
  • Client-side rendering of highlights based on verdict data
  • Mobile responsive (tap instead of hover)
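A possible annotation record matching the implementation notes above (byte offsets plus verdict data for client-side rendering); the field names are illustrative:

{{code language="python"}}
from dataclasses import dataclass

@dataclass
class ClaimHighlight:
    """One stored annotation, rendered client-side as a highlight."""
    claim_id: str
    start_offset: int   # byte offset of claim start in the original article
    end_offset: int     # byte offset of claim end
    verdict: str        # "WELL_SUPPORTED" | "UNCERTAIN" | "REFUTED"
    confidence: float   # 0.0-1.0, drives green/yellow/red color coding
    summary: str        # short evidence summary shown in the tooltip
    analysis_url: str   # link to the full analysis
{{/code}}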
=== 8.5 Workflow & Moderation ===

==== FR9 — Publication Workflow ====

Fulfills: UN-1 (Fast access to verified content), UN-16 (Clear review status)

Simple flow:

  1. Claim submitted
  2. AKEL processes (automated)
  3. If confidence > threshold: Publish (labeled as AI-generated)
  4. If confidence < threshold: Flag for improvement
  5. If risk score > threshold: Flag for moderator

No multi-stage approval process.
==== FR10 — Moderation ====

Focus on abuse, not routine quality:

  • Automated abuse detection
  • Moderators handle flags
  • Quick response to harmful content
  • Minimal involvement in routine content

==== FR11 — Audit Trail ====

Fulfills: UN-14 (API access to histories), UN-15 (Evolution tracking)

  • All edits logged
  • Version history public
  • Moderation decisions documented
  • System improvements tracked

== 9. Non-Functional Requirements ==

=== 9.1 NFR1 — Performance ===

Fulfills: UN-4 (Fast fact-checking), UN-11 (Responsive filtering)

  • Claim processing: < 30 seconds
  • Search response: < 2 seconds
  • Page load: < 3 seconds
  • 99% uptime

=== 9.2 NFR2 — Scalability ===

Fulfills: UN-14 (API access at scale)

  • Handle 10,000 claims initially
  • Scale to 1M+ claims
  • Support 100K+ concurrent users
  • Automated processing scales linearly

=== 9.3 NFR3 — Transparency ===

Fulfills: UN-7 (Evidence transparency), UN-9 (Methodology transparency), UN-13 (Citable verdicts), UN-15 (Evolution visibility)

  • All algorithms open source
  • All data exportable
  • All decisions documented
  • Quality metrics public

=== 9.4 NFR4 — Security & Privacy ===

  • Follow Privacy Policy
  • Secure authentication
  • Data encryption
  • Regular security audits

=== 9.5 NFR5 — Maintainability ===

  • Modular architecture
  • Automated testing
  • Continuous integration
  • Comprehensive documentation

=== NFR11: AKEL Quality Assurance Framework ===

Fulfills: AI safety, IFCN methodology transparency

Specification: Multi-layer AI quality gates to detect hallucinations, low-confidence results, and logical inconsistencies.

==== Quality Gate 1: Claim Extraction Validation ====

Purpose: Ensure extracted claims are factual assertions (not opinions/predictions)

Checks:
  1. Factual Statement Test: Is this verifiable? (Yes/No)
  2. Opinion Detection: Contains hedging language? ("I think", "probably", "best")
  3. Future Prediction Test: Makes claims about future events?
  4. Specificity Score: Contains specific entities, numbers, dates?

Thresholds:

  • Factual: Must be "Yes"
  • Opinion markers: <2 hedging phrases
  • Specificity: ≥3 specific elements

Action if Failed: Flag as "Non-verifiable", do NOT generate verdict
==== Quality Gate 2: Evidence Relevance Validation ====

Purpose: Ensure AI-linked evidence actually relates to the claim

Checks:

  1. Semantic Similarity Score: Evidence vs. claim (embeddings)
  2. Entity Overlap: Shared people/places/things?
  3. Topic Relevance: Discusses claim subject?

Thresholds:

  • Similarity: ≥0.6 (cosine similarity)
  • Entity overlap: ≥1 shared entity
  • Topic relevance: ≥0.5

Action if Failed: Discard irrelevant evidence
==== Quality Gate 3: Scenario Coherence Check ====

Purpose: Validate that scenario assumptions are logical and complete

Checks:

  1. Completeness: All required fields populated
  2. Internal Consistency: Assumptions don't contradict
  3. Distinguishability: Scenarios meaningfully different

Thresholds:

  • Required fields: 100%
  • Contradiction score: <0.3
  • Scenario similarity: <0.8

Action if Failed: Merge duplicates, reduce confidence by 20%
==== Quality Gate 4: Verdict Confidence Assessment ====

Purpose: Only publish high-confidence verdicts

Checks:

  1. Evidence Count: Minimum 2 sources
  2. Source Quality: Average reliability ≥0.6
  3. Evidence Agreement: Supporting vs. contradicting ≥0.6
  4. Uncertainty Factors: Hedging in reasoning

Confidence Tiers:

  • HIGH (80-100%): ≥3 sources, ≥0.7 quality, ≥80% agreement
  • MEDIUM (50-79%): ≥2 sources, ≥0.6 quality, ≥60% agreement
  • LOW (0-49%): <2 sources OR low quality/agreement
  • INSUFFICIENT: <2 sources → DO NOT PUBLISH
  • POC1: Gates 1 & 4 only (basic validation)
  • POC2: All 4 gates (complete framework)
  • V1.0: Hardened with <5% hallucination rate

Acceptance Criteria:
  • ✅ All gates operational
  • ✅ Hallucination rate <5%
  • ✅ Quality metrics public

=== NFR12: Security Controls ===

Fulfills: Production readiness, legal compliance

Requirements:
  1. Input Validation: SQL injection, XSS, CSRF prevention
  2. Rate Limiting: 5 analyses per minute per IP
  3. Authentication: Secure sessions, API key rotation
  4. Data Protection: HTTPS, encryption, backups
  5. Security Audit: Penetration testing, GDPR compliance

Milestone: Beta 0 (essential), V1.0 (complete) BLOCKER
=== NFR13: Quality Metrics Transparency ===

Fulfills: IFCN transparency, user trust

Public Metrics:

  • Quality gates performance
  • Evidence quality stats
  • Hallucination rate
  • User feedback

Milestone: POC2 (internal), Beta 0 (public), V1.0 (real-time)

== 10. Requirements Priority Matrix ==

This table shows all functional and non-functional requirements ordered by urgency and priority.

Note: Implementation phases (POC1, POC2, Beta 0, V1.0) are defined in POC Requirements and Implementation Roadmap, not in this priority matrix.

Priority Levels:

  • CRITICAL - System doesn't work without it, or major safety/legal risk
  • HIGH - Core functionality, essential for success
  • MEDIUM - Important but not blocking
  • LOW - Nice to have, can be deferred

Urgency Levels:

  • HIGH - Immediate need (critical for proof of concept)
  • MEDIUM - Important but not immediate
  • LOW - Future enhancement
|=ID|=Title|=Priority|=Urgency|=Reason (for HIGH priority/urgency)
|=HIGH URGENCY| | | | 
|FR1|Claim Intake|CRITICAL|HIGH|System entry point - cannot process claims without it
|FR5|Evidence Collection|CRITICAL|HIGH|Core fact-checking functionality - no evidence = no verdict
|FR7|Verdict Computation|CRITICAL|HIGH|The output users see - core value proposition
|NFR11|Quality Assurance Framework|CRITICAL|HIGH|Prevents AI hallucinations - FactHarbor's key differentiator
|FR2|Claim Normalization|HIGH|HIGH|Standardizes AI input for reliable processing
|FR3|Claim Classification|HIGH|HIGH|Identifies factual vs. non-factual claims - essential quality gate
|FR4|Scenario Generation|HIGH|HIGH|Handles ambiguous claims - key methodology innovation
|FR6|Evidence Evaluation|HIGH|HIGH|Source quality directly impacts verdict credibility
|=MEDIUM URGENCY| | | | 
|NFR12|Security Controls|CRITICAL|MEDIUM|—
|FR9|Corrections|HIGH|MEDIUM|IFCN requirement - mandatory for credibility
|FR44|ClaimReview Schema|HIGH|MEDIUM|Search engine visibility - MUST for V1.0 discovery
|FR45|Corrections Notification|HIGH|MEDIUM|IFCN compliance - required for corrections transparency
|FR48|Safety Framework|HIGH|MEDIUM|Prevents harm to contributors - legal and ethical requirement
|NFR3|Transparency|HIGH|MEDIUM|Core principle - essential for trust and credibility
|NFR13|Quality Metrics|HIGH|MEDIUM|Monitoring and transparency - IFCN compliance
|FR8|User Contribution|MEDIUM|MEDIUM|—
|FR10|Publishing|MEDIUM|MEDIUM|—
|FR13|API|MEDIUM|MEDIUM|—
|FR46|Image Verification|MEDIUM|MEDIUM|—
|FR47|Archive.org Integration|MEDIUM|MEDIUM|—
|FR54|Evidence Deduplication|MEDIUM|MEDIUM|—
|NFR1|Performance|MEDIUM|MEDIUM|—
|NFR2|Scalability|MEDIUM|MEDIUM|—
|NFR4|Security & Privacy|MEDIUM|MEDIUM|—
|NFR5|Maintainability|MEDIUM|MEDIUM|—
|=LOW URGENCY| | | | 
|FR11|Social Sharing|LOW|LOW|—
|FR12|Notifications|LOW|LOW|—
|FR49|A/B Testing|LOW|LOW|—
|FR50|OSINT Toolkit Integration|LOW|LOW|—
|FR51|Video Verification System|LOW|LOW|—
|FR52|Interactive Detection Training|LOW|LOW|—
|FR53|Cross-Organizational Sharing|LOW|LOW|—

Total: 32 requirements (24 Functional, 8 Non-Functional)

Notes:

  • Reason column: Only populated for HIGH priority or HIGH urgency items
  • MEDIUM and LOW priority items use "—" (no specific reason needed)

=== 10.1 User Needs Priority ===

User Needs (UN) are the foundation that drives functional and non-functional requirements. They are not independently prioritized; instead, their priority is inherited from the FR/NFR requirements they drive.

|=ID|=Title|=Drives Requirements
|UN-1|Trust Assessment at a Glance|Multiple FR/NFR
|UN-2|Claim Extraction and Verification|FR1-7
|UN-3|Article Summary with FactHarbor Analysis Summary|FR4
|UN-4|Social Media Fact-Checking|FR1, FR4
|UN-5|Source Provenance and Track Records|FR6
|UN-6|Publisher Reliability History|FR6
|UN-7|Evidence Transparency|NFR3
|UN-8|Understanding Disagreement and Consensus|FR4
|UN-9|Methodology Transparency|NFR3, NFR11
|UN-10|Manipulation Tactics Detection|FR48
|UN-11|Filtered Research|FR3
|UN-12|Submit Unchecked Claims|FR8
|UN-13|Cite FactHarbor Verdicts|FR10
|UN-14|API Access for Integration|FR13
|UN-15|Verdict Evolution Timeline|FR7
|UN-16|AI vs. Human Review Status|FR9
|UN-17|In-Article Claim Highlighting|FR1
|UN-26|Search Engine Visibility|FR44
|UN-27|Visual Claim Verification|FR46
|UN-28|Safe Contribution Environment|FR48

Total: 20 User Needs

Note: Each User Need inherits priority from the requirements it drives. For example, UN-2 (Claim Extraction and Verification) drives FR1-7, which are CRITICAL/HIGH priority, therefore UN-2 is also critical to the project.

== 11. MVP Scope ==

=== Phase 1: Read-Only MVP ===

Build:

  • Automated claim analysis
  • Confidence scoring
  • Source evaluation
  • Browse/search interface
  • User flagging system

Goal: Prove AI quality before adding user editing

User Needs fulfilled in Phase 1: UN-1, UN-2, UN-3, UN-4, UN-5, UN-6, UN-7, UN-8, UN-9, UN-12

=== Phase 2: User Contributions ===

Add only if needed:

  • Simple editing (Wikipedia-style)
  • Simple editing (Wikipedia-style)
  • Reputation system
  • Basic moderation
  • In-article claim highlighting (FR13)

Additional User Needs fulfilled: UN-13, UN-17

=== Phase 3: Refinement ===

  • Continuous quality improvement
  • Feature additions based on real usage
  • Scale infrastructure

Additional User Needs fulfilled: UN-14 (API access), UN-15 (Full evolution tracking)

Deferred:
  • Federation (until multiple successful instances exist)
  • Complex contribution workflows (focus on automation)
  • Extensive role hierarchy (keep simple)

== 12. Success Metrics ==

System Quality (track weekly):
  • Error rate by category (target: -10%/month)
  • Average confidence score (target: increase)
  • Source quality distribution (target: more high-quality)
  • Contradiction detection rate (target: increase)

Efficiency (track monthly):
  • Claims processed per hour (target: increase)
  • Human hours per claim (target: decrease)
  • Automation coverage (target: >90%)
  • Re-work rate (target: <5%)

User Satisfaction (track quarterly):
  • User flag rate (issues found)
  • Correction acceptance rate (flags valid)
  • Return user rate
  • Trust indicators (surveys)

User Needs Metrics (track quarterly):
  • UN-1: % users who understand trust scores
  • UN-4: Time to verify social media claim (target: <30s)
  • UN-7: % users who access evidence details
  • UN-8: % users who view multiple scenarios
  • UN-15: % users who check evolution timeline
  • UN-17: % users who enable in-article highlighting; avg. time spent on highlighted vs. non-highlighted articles

== 13. Requirements Traceability ==

For the full traceability matrix showing which requirements fulfill which user needs, see:

  • User Needs - Section 8 includes comprehensive mapping tables

== 14. Related Pages ==

Non-Functional Requirements (see Section 9):
  • NFR11 — AKEL Quality Assurance Framework
  • NFR12 — Security Controls
  • NFR13 — Quality Metrics Transparency

Other Requirements:
  • User Needs - What users need (drives these requirements)
  • Gap Analysis
  • Architecture - How requirements are implemented
  • Data Model - Data structures supporting requirements
  • Workflows - User interaction workflows
  • AKEL - AI system fulfilling automation requirements
  • Global Rules
  • Privacy Policy

= V0.9.70 Additional Requirements =

== Functional Requirements (Additional) ==

=== FR44: ClaimReview Schema Implementation ===

Generate valid ClaimReview structured data for Google/Bing visibility.

Schema.org Mapping:
  • 80-100% likelihood → 5 (Highly Supported)
  • 60-79% → 4 (Supported)
  • 40-59% → 3 (Mixed)
  • 20-39% → 2 (Questionable)
  • 0-19% → 1 (Refuted)

Milestone: V1.0
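A sketch of the mapping plus minimal ClaimReview JSON-LD; the publisher name and field selection are illustrative (ClaimReview supports more properties than shown):

{{code language="python"}}
import json

def schema_rating(likelihood: float) -> int:
    """Map FactHarbor's 0-100% likelihood to the 1-5 Schema.org scale."""
    if likelihood >= 80: return 5   # Highly Supported
    if likelihood >= 60: return 4   # Supported
    if likelihood >= 40: return 3   # Mixed
    if likelihood >= 20: return 2   # Questionable
    return 1                        # Refuted

def claim_review_jsonld(claim_text: str, likelihood: float,
                        verdict_url: str) -> str:
    """Emit minimal ClaimReview structured data for a verdict page."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "claimReviewed": claim_text,
        "url": verdict_url,
        "author": {"@type": "Organization", "name": "FactHarbor"},
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": schema_rating(likelihood),
            "bestRating": 5,
            "worstRating": 1,
        },
    }, indent=2)
{{/code}}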
=== FR45: User Corrections Notification System ===

Notify users when analyses are corrected.

Mechanisms:

  1. In-page banner (30 days)
  2. Public correction log
  3. Email notifications (opt-in)
  4. RSS/API feed

Milestone: Beta 0 (basic), V1.0 (complete) BLOCKER
=== FR46: Image Verification System ===

Methods:

  1. Reverse image search
  2. EXIF metadata analysis
  3. Manipulation detection (basic)
  4. Context verification

Milestone: Beta 0 (basic), V1.0 (extended)

=== FR47: Archive.org Integration ===

Auto-save evidence sources to the Wayback Machine.

Milestone: Beta 0

=== FR48: Safety Framework for Contributors ===

Protect contributors from harassment and legal threats.

Milestone: V1.1

=== FR49: A/B Testing Framework ===

Test AKEL approaches and UI designs systematically.

Milestone: V1.0

=== FR50-FR53: Future Enhancements (V2.0+) ===

  • FR50: OSINT Toolkit Integration
  • FR51: Video Verification System
  • FR52: Interactive Detection Training
  • FR53: Cross-Organizational Sharing

Milestone: V2.0+ (12-18 months post-launch)

=== FR54: Evidence Deduplication ===

Fulfills: Accurate evidence counting, quality metrics

Purpose: Avoid counting the same source multiple times when it appears in different forms.

Specification:

Deduplication Logic:

  1. URL Normalization (see the sketch after this list):
    • Remove tracking parameters (?utm_source=...)
    • Normalize http/https
    • Normalize www/non-www
    • Handle redirects

  2. Content Similarity:
    • If two sources have >90% text similarity → Same source
    • If one is a subset of the other → Same source
    • Use fuzzy matching for minor differences

  3. Cross-Domain Syndication:
    • Detect wire service content (AP, Reuters)
    • Mark as single source if syndicated
    • Count original publication only

Display:

Evidence Sources (3 unique, 5 total):
1. Original Article (NYTimes)
  - Also appeared in: WashPost, Guardian (syndicated)
2. Research Paper (Nature)
3. Official Statement (WHO)

Acceptance Criteria:

  • ✅ Duplicate URLs recognized
  • ✅ Syndicated content detected
  • ✅ Evidence count shows "unique" vs "total"

Milestone: POC2, Beta 0

== Enhanced Existing Requirements ==

=== FR7: Automated Verdicts (Enhanced with Quality Gates) ===

POC1+ Enhancement: After AKEL generates a verdict, it passes through quality gates.

Workflow:
1. Extract claims
2. [GATE 1] Validate fact-checkable
3. Generate scenarios
4. Generate verdicts
5. [GATE 4] Validate confidence
6. Display to user
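The gated workflow as a sketch; the akel interface is hypothetical, standing in for components the requirement does not specify at this level (the PROCESSING and ERROR states would be handled by the surrounding job runner):

{{code language="python"}}
def process_claim(claim, akel) -> str:
    """Run a claim through the Gate 1 / Gate 4 verdict pipeline."""
    if not akel.gate1_fact_checkable(claim):
        return "NON_FACTUAL_CLAIM"
    scenarios = akel.generate_scenarios(claim)
    verdicts = [akel.generate_verdict(claim, s) for s in scenarios]
    if not akel.gate4_confidence_ok(verdicts):
        return "INSUFFICIENT_EVIDENCE"
    akel.display(verdicts)
    return "PUBLISHED"
{{/code}}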
Updated Verdict States:

  • PUBLISHED
  • INSUFFICIENT_EVIDENCE
  • NON_FACTUAL_CLAIM
  • PROCESSING
  • ERROR

=== FR4: Analysis Summary (Enhanced with Quality Metadata) ===

POC1+ Enhancement: Display quality indicators:

    Analysis Summary:
    Verifiable Claims: 3/5
    High Confidence Verdicts: 1
    Medium Confidence: 2
    Evidence Sources: 12
    Avg Source Quality: 0.73
    Quality Score: 8.5/10