Requirements
This page defines Roles, Content States, Rules, and System Requirements for FactHarbor.

Core Philosophy: Invest in system improvement, not manual data correction. When AI makes errors, improve the algorithm and re-process automatically.

== Navigation ==

- User Needs - What users need from FactHarbor (drives these requirements)
- This page - How we fulfill those needs through system design

== 1. Roles ==

Fulfills: UN-12 (Submit claims), UN-13 (Cite verdicts), UN-14 (API access)

FactHarbor uses three simple roles plus a reputation system.

=== 1.1 Reader ===

Who: Anyone (no login required)

Can:
- Browse and search claims
- View scenarios, evidence, verdicts, and confidence scores
- Flag issues or errors
- Use filters, search, and visualization tools
- Submit claims (new claims are added automatically if not duplicates)

Cannot:
- Modify content
- Access edit history details

User Needs served: UN-1 (Trust assessment), UN-2 (Claim verification), UN-3 (Article summary with FactHarbor analysis summary), UN-4 (Social media fact-checking), UN-5 (Source tracing), UN-7 (Evidence transparency), UN-8 (Understanding disagreement), UN-12 (Submit claims), UN-17 (In-article highlighting)

=== 1.2 Contributor ===

Who: Registered users (earn reputation through contributions)

Can:
- Everything a Reader can do
- Edit claims, evidence, and scenarios
- Add sources and citations
- Suggest improvements to AI-generated content
- Participate in discussions
- Earn reputation points for quality contributions

Reputation System:
- New contributors: Limited edit privileges
- Established contributors: Full edit access
- Trusted contributors (substantial reputation): Can approve certain changes
- Reputation earned through: Accepted edits, helpful flags, quality contributions
- Reputation lost through: Reverted edits, invalid flags, abuse

Cannot:
- Delete or hide content (only moderators)
- Override moderation decisions

User Needs served: UN-13 (Cite and contribute)

=== 1.3 Moderator ===

Who: Trusted community members with a proven track record, appointed by the governance board

Can:
- Review flagged content
- Hide harmful or abusive content
- Resolve disputes between contributors
- Issue warnings or temporary bans
- Make final decisions on content disputes
- Access full audit logs Cannot:
- Change governance rules
- Permanently ban users without board approval
- Override technical quality gates

Note: Small team (3-5 initially), supported by automated moderation tools.

=== 1.4 Domain Trusted Contributors (Optional, Task-Specific) ===

Who: Subject matter specialists invited for specific high-stakes disputes

Not a permanent role: Contacted externally when needed for contested claims in their domain

When used:
- Medical claims with life/safety implications
- Legal interpretations with significant impact
- Scientific claims with high controversy
- Technical claims requiring specialized knowledge

Process:
- Moderator identifies need for expert input
- Contact expert externally (don't require them to be users)
- Trusted Contributor provides written opinion with sources
- Opinion added to claim record
- Trusted Contributor acknowledged in claim

User Needs served: UN-16 (Expert validation status)

== 2. Content States ==

Fulfills: UN-1 (Trust indicators), UN-16 (Review status transparency)

FactHarbor uses two content states. Focus is on transparency and confidence scoring, not gatekeeping.

=== 2.1 Published ===

Status: Visible to all users

Includes:
- AI-generated analyses (default state)
- User-contributed content
- Edited/improved content

Quality Indicators (displayed with content):
- Confidence Score: 0-100% (AI's confidence in analysis)
- Source Quality Score: 0-100% (based on source track record)
- Controversy Flag: If high dispute/edit activity
- Completeness Score: % of expected fields filled
- Last Updated: Date of most recent change
- Edit Count: Number of revisions
- Review Status: AI-generated / Human-reviewed / Expert-validated

Automatic Warnings:
- Confidence < 60%: "Low confidence - use caution"
- Source quality < 40%: "Sources may be unreliable"
- High controversy: "Disputed - multiple interpretations exist"
- Medical/Legal/Safety domain: "Seek professional advice"

User Needs served: UN-1 (Trust score), UN-9 (Methodology transparency), UN-15 (Evolution timeline), UN-16 (Review status)
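A minimal sketch of how these warning rules could be wired up; the function and field names are illustrative, not part of this specification:

```python
# Minimal sketch of the automatic warning rules above.
# Field and function names are illustrative, not part of the spec.

def warnings_for(confidence: float, source_quality: float,
                 controversy: bool, domain: str) -> list[str]:
    """Return the warnings to display alongside published content."""
    warnings = []
    if confidence < 60:
        warnings.append("Low confidence - use caution")
    if source_quality < 40:
        warnings.append("Sources may be unreliable")
    if controversy:
        warnings.append("Disputed - multiple interpretations exist")
    if domain in {"medical", "legal", "safety"}:
        warnings.append("Seek professional advice")
    return warnings
```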
=== 2.2 Hidden ===

Status: Not visible to regular users (only to moderators)

Reasons:

- Spam or advertising
- Personal attacks or harassment
- Illegal content
- Privacy violations
- Deliberate misinformation (verified)
- Abuse or harmful content

Process:
- Automated detection flags for moderator review
- Moderator confirms and hides
- Original author notified with reason
- Author can appeal to the board if they dispute the moderator's decision

Note: Content is hidden, not deleted (for audit trail)

== 3. Contribution Rules ==

=== 3.1 All Contributors Must ===

- Provide sources for factual claims
- Use clear, neutral language in FactHarbor's own summaries
- Respect others and maintain civil discourse
- Accept community feedback constructively
- Focus on improving quality, not protecting ego

=== 3.2 AKEL (AI System) ===

AKEL is the primary system. Human contributions supplement and train AKEL.

AKEL Must:
- Mark all outputs as AI-generated
- Display confidence scores prominently
- Provide source citations
- Flag uncertainty clearly
- Identify contradictions in evidence
- Learn from human corrections

When AKEL Makes Errors:
1. Capture the error pattern (what, why, how common)
2. Improve the system (better prompt, model, validation)
3. Re-process affected claims automatically
4. Measure improvement (did quality increase?)

Human Role: Train AKEL through corrections, not replace AKEL

=== 3.3 Contributors Should ===

- Improve clarity and structure
- Add missing sources
- Flag errors for system improvement
- Suggest better ways to present information
- Participate in quality discussions

=== 3.4 Moderators Must ===

- Be impartial
- Document moderation decisions
- Respond to appeals promptly
- Use automated tools to scale efforts
- Focus on abuse/harm, not routine quality control

== 4. Quality Standards ==

Fulfills: UN-5 (Source reliability), UN-6 (Publisher track records), UN-7 (Evidence transparency), UN-9 (Methodology transparency)

=== 4.1 Source Requirements ===

Track Record Over Credentials:
- Sources evaluated by historical accuracy
- Correction policy matters
- Independence from conflicts of interest
- Methodology transparency

Source Quality Database:
- Automated tracking of source accuracy
- Correction frequency
- Reliability score (updated continuously)
- Users can see source track record

No automatic trust for government, academia, or media - all evaluated by track record.

User Needs served: UN-5 (Source provenance), UN-6 (Publisher reliability)

=== 4.2 Claim Requirements ===

- Clear subject and assertion
- Verifiable with available information
- Sourced (or explicitly marked as needing sources)
- Neutral language in FactHarbor summaries
- Appropriate context provided

User Needs served: UN-2 (Claim extraction and verification)

=== 4.3 Evidence Requirements ===

- Publicly accessible (or explain why not)
- Properly cited with attribution
- Relevant to claim being evaluated
- Original source preferred over secondary

User Needs served: UN-7 (Evidence transparency)

=== 4.4 Confidence Scoring ===

Automated confidence calculation based on:
- Source quality scores
- Evidence consistency
- Contradiction detection
- Completeness of analysis
- Historical accuracy of similar claims

Thresholds:
- < 40%: Too low to publish (needs improvement)
- 40-60%: Published with "Low confidence" warning
- 60-80%: Published as standard
- 80-100%: Published as "High confidence"

User Needs served: UN-1 (Trust assessment), UN-9 (Methodology transparency)

== 5. Automated Risk Scoring ==

Fulfills: UN-10 (Manipulation detection), UN-16 (Appropriate review level)

Replace manual risk tiers with continuous automated scoring.

=== 5.1 Risk Score Calculation ===

Factors (weighted algorithm):
- Domain sensitivity: Medical, legal, safety auto-flagged higher
- Potential impact: Views, citations, spread
- Controversy level: Flags, disputes, edit wars
- Uncertainty: Low confidence, contradictory evidence
- Source reliability: Track record of sources used

Score: 0-100 (higher = more risk)

=== 5.2 Automated Actions ===

- Score > 80: Flag for moderator review before publication
- Score 60-80: Publish with prominent warnings
- Score 40-60: Publish with standard warnings
- Score < 40: Publish normally

Continuous monitoring: Risk score recalculated as new information emerges

User Needs served: UN-10 (Detect manipulation tactics), UN-16 (Review status)
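A minimal sketch of the weighted scoring and the automated actions above; the factor weights are illustrative assumptions, not specified values:

```python
# Illustrative sketch of the weighted risk score (0-100).
# Factor names and weights are assumptions, not specified values.

RISK_WEIGHTS = {
    "domain_sensitivity":   0.30,  # medical/legal/safety flagged higher
    "potential_impact":     0.25,  # views, citations, spread
    "controversy":          0.20,  # flags, disputes, edit wars
    "uncertainty":          0.15,  # low confidence, contradictions
    "source_unreliability": 0.10,  # poor source track record
}

def risk_score(factors: dict[str, float]) -> float:
    """Combine normalized factor scores (each 0-100) into one 0-100 score."""
    return sum(RISK_WEIGHTS[name] * factors.get(name, 0.0)
               for name in RISK_WEIGHTS)

def action_for(score: float) -> str:
    """Map a risk score to the automated actions listed above."""
    if score > 80:
        return "moderator_review"  # flag before publication
    if score > 60:
        return "publish_with_prominent_warnings"
    if score > 40:
        return "publish_with_standard_warnings"
    return "publish_normally"
```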
== 6. System Improvement Process ==

Core principle: Fix the system, not just the data.

=== 6.1 Error Capture ===

When users flag errors or make corrections:

1. What was wrong? (categorize)
2. What should it have been?
3. Why did the system fail? (root cause)
4. How common is this pattern?
5. Store in ErrorPattern table (improvement queue)
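A minimal sketch of an ErrorPattern record as described above; the exact fields are an illustrative assumption, not a fixed schema:

```python
# Minimal sketch of an ErrorPattern record feeding the improvement
# queue; field names are assumptions for illustration.

from dataclasses import dataclass
from datetime import date

@dataclass
class ErrorPattern:
    category: str               # what was wrong (e.g., "missed_hedging")
    expected: str               # what the output should have been
    root_cause: str             # why the system failed
    occurrences: int            # how common the pattern is
    first_seen: date
    affected_claims: list[str]  # claim IDs to re-process after a fix
```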
=== 6.2 Weekly Improvement Cycle ===

1. Review: Analyze top error patterns
2. Develop: Create fix (prompt, model, validation)
3. Test: Validate fix on sample claims
4. Deploy: Roll out if quality improves
5. Re-process: Automatically update affected claims
6. Monitor: Track quality metrics

=== 6.3 Quality Metrics Dashboard ===

Track continuously:
- Error rate by category
- Source quality distribution
- Confidence score trends
- User flag rate (issues found)
- Correction acceptance rate
- Re-work rate
- Claims processed per hour

Goal: 10% monthly improvement in error rate

== 7. Automated Quality Monitoring ==

Replace manual audit sampling with automated monitoring.

=== 7.1 Continuous Metrics ===

- Source quality: Track record database
- Consistency: Contradiction detection
- Clarity: Readability scores
- Completeness: Field validation
- Accuracy: User corrections tracked

=== 7.2 Anomaly Detection ===

Automated alerts for:
- Sudden quality drops
- Unusual patterns
- Contradiction clusters
- Source reliability changes
- User behavior anomalies

=== 7.3 Targeted Review ===

- Review only flagged items
- Random sampling for calibration (not quotas)
- Learn from corrections to improve automation

== 8. Functional Requirements ==

This section defines specific features that fulfill user needs.

=== 8.1 Claim Intake & Normalization ===

==== FR1 — Claim Intake ====

Fulfills: UN-2 (Claim extraction), UN-4 (Quick fact-checking), UN-12 (Submit claims)

- Users submit claims via simple form or API
- Claims can be text, URL, or image
- Duplicate detection (semantic similarity)
- Auto-categorization by domain

==== FR2 — Claim Normalization ====

Fulfills: UN-2 (Claim verification)

- Standardize to clear assertion format
- Extract key entities (who, what, when, where)
- Identify claim type (factual, predictive, evaluative)
- Link to existing similar claims

==== FR3 — Claim Classification ====

Fulfills: UN-11 (Filtered research)

- Domain: Politics, Science, Health, etc.
- Type: Historical fact, current stat, prediction, etc.
- Risk score: Automated calculation
- Complexity: Simple, moderate, complex

=== 8.2 Scenario System ===

==== FR4 — Scenario Generation ====

Fulfills: UN-2 (Context-dependent verification), UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)

Automated scenario creation:
- AKEL analyzes claim and generates likely scenarios (use-cases and contexts)
- Each scenario includes: assumptions, definitions, boundaries, evidence context
- Users can flag incorrect scenarios
- System learns from corrections

Key Concept: Scenarios represent different interpretations or contexts (e.g., "Clinical trials with healthy adults" vs. "Real-world data with diverse populations")

==== FR5 — Evidence Linking ====

Fulfills: UN-5 (Source tracing), UN-7 (Evidence transparency)

- Automated evidence discovery from sources
- Relevance scoring
- Contradiction detection
- Source quality assessment

==== FR6 — Scenario Comparison ====

Fulfills: UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)

- Side-by-side comparison interface
- Highlight key differences between scenarios
- Show evidence supporting each scenario
- Display confidence scores per scenario

=== 8.3 Verdicts & Analysis ===

==== FR7 — Automated Verdicts ====

Fulfills: UN-1 (Trust score), UN-2 (Verification verdicts), UN-3 (Article summary with FactHarbor analysis summary), UN-13 (Cite verdicts)

- AKEL generates verdict based on evidence within each scenario
- Likelihood range displayed (e.g., "0.70-0.85 (likely true)") - NOT binary true/false
- Uncertainty factors explicitly listed (e.g., "Small sample sizes", "Long-term effects unknown")
- Confidence score displayed prominently
- Source quality indicators shown
- Contradictions noted
- Uncertainty acknowledged

Key Innovation: Detailed probabilistic verdicts with explicit uncertainty, not binary judgments

==== FR8 — Time Evolution ====

Fulfills: UN-15 (Verdict evolution timeline)

- Claims and verdicts update as new evidence emerges
- Version history maintained for all verdicts
- Changes highlighted
- Confidence score trends visible
- Users can see "as of date X, what did we know?"

=== 8.4 User Interface & Presentation ===

==== FR12 — Two-Panel Summary View (Article Summary with FactHarbor Analysis Summary) ====

Fulfills: UN-3 (Article Summary with FactHarbor Analysis Summary)

Purpose: Provide side-by-side comparison of what a document claims vs. FactHarbor's complete analysis of its credibility

Left Panel: Article Summary:
- Document title, source, and claimed credibility
- "The Big Picture" - main thesis or position change
- "Key Findings" - structured summary of document's main claims
- "Reasoning" - document's explanation for positions
- "Conclusion" - document's bottom line Right Panel: FactHarbor Analysis Summary:
- FactHarbor's independent source credibility assessment
- Claim-by-claim verdicts with confidence scores
- Methodology assessment (strengths, limitations)
- Overall verdict on document quality
- Analysis ID for reference

Design Principles:
- No scrolling required - both panels visible simultaneously
- Visual distinction between "what they say" and "FactHarbor's analysis"
- Color coding for verdicts (supported, uncertain, refuted)
- Confidence percentages clearly visible
- Mobile responsive (panels stack vertically on small screens)

Implementation Notes:
- Generated automatically by AKEL for every analyzed document
- Updates when verdict evolves (maintains version history)
- Exportable as standalone summary report
- Shareable via permanent URL

==== FR13 — In-Article Claim Highlighting ====

Fulfills: UN-17 (In-article claim highlighting)

Purpose: Enable readers to quickly assess claim credibility while reading by visually highlighting factual claims with color-coded indicators

==== Visual Example: Article with Highlighted Claims ====

Article: "New Study Shows Benefits of Mediterranean Diet"

A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet. The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers. Dr. Maria Rodriguez, lead researcher, recommends incorporating more olive oil, fish, and vegetables into daily meals. Participants also reported feeling more energetic and experiencing better sleep quality, though these were secondary measures.

Legend:

- 🟢 = Well-supported claim (confidence ≥75%)
- 🟡 = Uncertain claim (confidence 40-74%)
- 🔴 = Refuted/unsupported claim (confidence <40%)
- Plain text = Non-factual content (context, opinions, recommendations)

==== Tooltip on Hover/Click ====

Color-Coding System:
- Green: Well-supported claims (confidence ≥75%, strong evidence)
- Yellow/Orange: Uncertain claims (confidence 40-74%, conflicting or limited evidence)
- Red: Refuted or unsupported claims (confidence <40%, contradicted by evidence)
- Gray/Neutral: Non-factual content (opinions, questions, procedural text)

==== Interactive Highlighting Example (Detailed View) ====
| Article Text | Status | Analysis |
|---|---|---|
| A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet. | Plain text | Context - no highlighting |
| Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups | 🟢 WELL SUPPORTED | 87% confidence. Meta-analysis of 12 RCTs confirms 23-28% risk reduction. View Full Analysis |
| The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers. | Plain text | Methodology - no highlighting |
| Some experts believe this diet can completely prevent heart attacks | 🟡 UNCERTAIN | 45% confidence. Overstated - evidence shows risk reduction, not prevention. View Details |
| Dr. Rodriguez recommends incorporating more olive oil, fish, and vegetables into daily meals. | Plain text | Recommendation - no highlighting |
| The study proves that saturated fats cause heart disease | 🔴 REFUTED | 15% confidence. Claim not supported by study; correlation ≠ causation. View Counter-Evidence |
Presentation notes:

- Highlighted claims use italics to distinguish from plain text
- Color backgrounds match XWiki message box colors (success/warning/error)
- Status column shows verdict prominently
- Analysis column provides quick summary with link to details

User Actions:
- Hover over highlighted claim → Tooltip appears
- Click highlighted claim → Detailed analysis modal/panel
- Toggle button to turn highlighting on/off
- Keyboard: Tab through highlighted claims

Interaction Design:
- Hover/click on highlighted claim → Show tooltip with:
  - Claim text
  - Verdict (e.g., "WELL SUPPORTED")
  - Confidence score (e.g., "85%")
  - Brief evidence summary
  - Link to detailed analysis
- Toggle highlighting on/off (user preference)
- Adjustable color intensity for accessibility

Technical Requirements:
- Real-time highlighting as page loads (non-blocking)
- Claim boundary detection (start/end of assertion)
- Handle nested or overlapping claims
- Preserve original article formatting
- Work with various content formats (HTML, plain text, PDFs)

Performance Requirements:
- Highlighting renders within 500ms of page load
- No perceptible delay in reading experience
- Efficient DOM manipulation (avoid reflows)

Accessibility:
- Color-blind friendly palette (use patterns/icons in addition to color)
- Screen reader compatible (ARIA labels for claim credibility)
- Keyboard navigation to highlighted claims

Implementation Notes:
- Claims extracted and analyzed by AKEL during initial processing
- Highlighting data stored as annotations with byte offsets
- Client-side rendering of highlights based on verdict data
- Mobile responsive (tap instead of hover)
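A minimal sketch of the annotation format described in the implementation notes above (byte offsets plus verdict data, rendered client-side); the field names are illustrative, not a fixed schema:

```python
# Minimal sketch of a highlight annotation: claims stored with byte
# offsets and rendered client-side. Field names are illustrative.

from dataclasses import dataclass

@dataclass
class ClaimAnnotation:
    claim_id: str
    start_offset: int   # byte offset where the claim starts
    end_offset: int     # byte offset where the claim ends
    verdict: str        # "supported" | "uncertain" | "refuted"
    confidence: float   # 0-100, drives the green/yellow/red color
    summary: str        # short evidence summary for the tooltip
    detail_url: str     # link to the full analysis

def css_class(a: ClaimAnnotation) -> str:
    """Map confidence to the FR13 color-coding thresholds."""
    if a.confidence >= 75:
        return "fh-supported"   # green
    if a.confidence >= 40:
        return "fh-uncertain"   # yellow/orange
    return "fh-refuted"         # red
```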
=== 8.5 Workflow & Moderation ===

==== FR9 — Publication Workflow ====

Fulfills: UN-1 (Fast access to verified content), UN-16 (Clear review status)

Simple flow:

1. Claim submitted
2. AKEL processes (automated)
3. If confidence > threshold: Publish (labeled as AI-generated)
4. If confidence < threshold: Flag for improvement
5. If risk score > threshold: Flag for moderator

No multi-stage approval process.

==== FR10 — Moderation ====

Focus on abuse, not routine quality:
- Automated abuse detection
- Moderators handle flags
- Quick response to harmful content
- Minimal involvement in routine content

==== FR11 — Audit Trail ====

Fulfills: UN-14 (API access to histories), UN-15 (Evolution tracking)

- All edits logged
- Version history public
- Moderation decisions documented
- System improvements tracked

== 9. Non-Functional Requirements ==

=== 9.1 NFR1 — Performance ===

Fulfills: UN-4 (Fast fact-checking), UN-11 (Responsive filtering)

- Claim processing: < 30 seconds
- Search response: < 2 seconds
- Page load: < 3 seconds
- 99% uptime

=== 9.2 NFR2 — Scalability ===

Fulfills: UN-14 (API access at scale)

- Handle 10,000 claims initially
- Scale to 1M+ claims
- Support 100K+ concurrent users
- Automated processing scales linearly

=== 9.3 NFR3 — Transparency ===

Fulfills: UN-7 (Evidence transparency), UN-9 (Methodology transparency), UN-13 (Citable verdicts), UN-15 (Evolution visibility)

- All algorithms open source
- All data exportable
- All decisions documented
- Quality metrics public

=== 9.4 NFR4 — Security & Privacy ===

- Follow Privacy Policy
- Secure authentication
- Data encryption
- Regular security audits

=== 9.5 NFR5 — Maintainability ===

- Modular architecture
- Automated testing
- Continuous integration
- Comprehensive documentation

=== NFR11: AKEL Quality Assurance Framework ===

Fulfills: AI safety, IFCN methodology transparency

Specification: Multi-layer AI quality gates to detect hallucinations, low-confidence results, and logical inconsistencies.

==== Quality Gate 1: Claim Extraction Validation ====

Purpose: Ensure extracted claims are factual assertions (not opinions/predictions)

Checks:
1. Factual Statement Test: Is this verifiable? (Yes/No)
2. Opinion Detection: Contains hedging language? ("I think", "probably", "best")
3. Future Prediction Test: Makes claims about future events?
4. Specificity Score: Contains specific entities, numbers, dates?

Thresholds:
- Factual: Must be "Yes"
- Opinion markers: <2 hedging phrases
- Specificity: ≥3 specific elements

Action if Failed: Flag as "Non-verifiable", do NOT generate verdict
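A minimal sketch of Gate 1 against the thresholds above; the hedging-phrase list and the specificity heuristic are rough illustrative assumptions:

```python
# Illustrative Gate 1 sketch: count hedging phrases and specific
# elements against the thresholds above. The phrase list and the
# crude specificity heuristic are assumptions for illustration.

import re

HEDGES = ("i think", "probably", "best", "might", "maybe")

def passes_gate1(claim: str, is_verifiable: bool) -> bool:
    text = claim.lower()
    hedge_count = sum(text.count(h) for h in HEDGES)
    # Crude specificity proxy: numbers/percentages and capitalized
    # words standing in for named entities and dates.
    specifics = len(re.findall(r"\d[\d,.%]*|\b[A-Z][a-z]+\b", claim))
    return is_verifiable and hedge_count < 2 and specifics >= 3
```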
==== Quality Gate 2: Evidence Relevance Validation ====

Purpose: Ensure AI-linked evidence actually relates to claim

Checks:

1. Semantic Similarity Score: Evidence vs. claim (embeddings)
2. Entity Overlap: Shared people/places/things?
3. Topic Relevance: Discusses claim subject?

Thresholds:
- Similarity: ≥0.6 (cosine similarity)
- Entity overlap: ≥1 shared entity
- Topic relevance: ≥0.5

Action if Failed: Discard irrelevant evidence
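A minimal sketch of the Gate 2 thresholds, assuming claim and evidence embeddings have already been computed:

```python
# Minimal sketch of the Gate 2 relevance check; embedding vectors
# are assumed to be available from an upstream model.

import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def passes_gate2(claim_vec: list[float], evidence_vec: list[float],
                 shared_entities: int, topic_relevance: float) -> bool:
    """Apply the three Gate 2 thresholds listed above."""
    return (cosine(claim_vec, evidence_vec) >= 0.6
            and shared_entities >= 1
            and topic_relevance >= 0.5)
```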
==== Quality Gate 3: Scenario Coherence Check ====

Purpose: Validate scenario assumptions are logical and complete

Checks:

1. Completeness: All required fields populated
2. Internal Consistency: Assumptions don't contradict
3. Distinguishability: Scenarios meaningfully different

Thresholds:
- Required fields: 100%
- Contradiction score: <0.3
- Scenario similarity: <0.8

Action if Failed: Merge duplicates, reduce confidence by 20%

==== Quality Gate 4: Verdict Confidence Assessment ====

Purpose: Only publish high-confidence verdicts

Checks:
1. Evidence Count: Minimum 2 sources
2. Source Quality: Average reliability ≥0.6
3. Evidence Agreement: Supporting vs. contradicting ≥0.6
4. Uncertainty Factors: Hedging in reasoning

Confidence Tiers:
- HIGH (80-100%): ≥3 sources, ≥0.7 quality, ≥80% agreement
- MEDIUM (50-79%): ≥2 sources, ≥0.6 quality, ≥60% agreement
- LOW (0-49%): <2 sources OR low quality/agreement
- INSUFFICIENT: <2 sources → DO NOT PUBLISH
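A minimal sketch of the tier assignment above; quality and agreement inputs are assumed normalized to 0-1:

```python
# Sketch of the Gate 4 tier assignment from the thresholds above.
# Inputs assumed normalized: quality in 0-1, agreement in 0-1.

def confidence_tier(sources: int, quality: float, agreement: float) -> str:
    if sources < 2:
        return "INSUFFICIENT"   # do not publish
    if sources >= 3 and quality >= 0.7 and agreement >= 0.8:
        return "HIGH"
    if quality >= 0.6 and agreement >= 0.6:
        return "MEDIUM"
    return "LOW"
```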
Implementation Phases:

- POC1: Gates 1 & 4 only (basic validation)
- POC2: All 4 gates (complete framework)
- V1.0: Hardened with <5% hallucination rate

Acceptance Criteria:
- ✅ All gates operational
- ✅ Hallucination rate <5%
- ✅ Quality metrics public

=== NFR12: Security Controls ===

Fulfills: Production readiness, legal compliance

Requirements:
1. Input Validation: SQL injection, XSS, CSRF prevention
2. Rate Limiting: 5 analyses per minute per IP
3. Authentication: Secure sessions, API key rotation
4. Data Protection: HTTPS, encryption, backups
5. Security Audit: Penetration testing, GDPR compliance

Milestone: Beta 0 (essential), V1.0 (complete). BLOCKER
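A minimal sliding-window sketch of the rate limit in requirement 2 above; the in-memory store stands in for whatever shared backend (e.g., Redis) a real deployment would use:

```python
# Minimal sliding-window sketch of the 5-analyses-per-minute-per-IP
# limit. An in-memory store stands in for a shared backend.

import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS = 5

_requests: dict[str, deque] = defaultdict(deque)

def allow_analysis(ip: str) -> bool:
    """Return True if this IP may start another analysis now."""
    now = time.monotonic()
    window = _requests[ip]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()          # drop requests outside the window
    if len(window) >= MAX_REQUESTS:
        return False              # over the limit: reject
    window.append(now)
    return True
```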
=== NFR13: Quality Metrics Transparency ===

Fulfills: IFCN transparency, user trust

Public Metrics:

- Quality gates performance
- Evidence quality stats
- Hallucination rate
- User feedback

Milestone: POC2 (internal), Beta 0 (public), V1.0 (real-time)

== 10. Requirements Priority Matrix ==
This table shows all functional and non-functional requirements ordered by urgency and priority.
Note: Implementation phases (POC1, POC2, Beta 0, V1.0) are defined in POC Requirements and Implementation Roadmap, not in this priority matrix.
Priority Levels:
- CRITICAL - System doesn't work without it, or major safety/legal risk
- HIGH - Core functionality, essential for success
- MEDIUM - Important but not blocking
- LOW - Nice to have, can be deferred
Urgency Levels:
- HIGH - Immediate need (critical for proof of concept)
- MEDIUM - Important but not immediate
- LOW - Future enhancement
| ID | Title | Priority | Urgency | Reason (for HIGH priority/urgency) |
|---|---|---|---|---|
| HIGH URGENCY | | | | |
| FR1 | Claim Intake | CRITICAL | HIGH | System entry point - cannot process claims without it |
| FR5 | Evidence Collection | CRITICAL | HIGH | Core fact-checking functionality - no evidence = no verdict |
| FR7 | Verdict Computation | CRITICAL | HIGH | The output users see - core value proposition |
| NFR11 | Quality Assurance Framework | CRITICAL | HIGH | Prevents AI hallucinations - FactHarbor's key differentiator |
| FR2 | Claim Normalization | HIGH | HIGH | Standardizes AI input for reliable processing |
| FR3 | Claim Classification | HIGH | HIGH | Identifies factual vs non-factual claims - essential quality gate |
| FR4 | Scenario Generation | HIGH | HIGH | Handles ambiguous claims - key methodology innovation |
| FR6 | Evidence Evaluation | HIGH | HIGH | Source quality directly impacts verdict credibility |
| MEDIUM URGENCY | | | | |
| NFR12 | Security Controls | CRITICAL | MEDIUM | — |
| FR9 | Corrections | HIGH | MEDIUM | IFCN requirement - mandatory for credibility |
| FR44 | ClaimReview Schema | HIGH | MEDIUM | Search engine visibility - MUST for V1.0 discovery |
| FR45 | Corrections Notification | HIGH | MEDIUM | IFCN compliance - required for corrections transparency |
| FR48 | Safety Framework | HIGH | MEDIUM | Prevents harm to contributors - legal and ethical requirement |
| NFR3 | Transparency | HIGH | MEDIUM | Core principle - essential for trust and credibility |
| NFR13 | Quality Metrics | HIGH | MEDIUM | Monitoring and transparency - IFCN compliance |
| FR8 | User Contribution | MEDIUM | MEDIUM | — |
| FR10 | Publishing | MEDIUM | MEDIUM | — |
| FR13 | API | MEDIUM | MEDIUM | — |
| FR46 | Image Verification | MEDIUM | MEDIUM | — |
| FR47 | Archive.org Integration | MEDIUM | MEDIUM | — |
| FR54 | Evidence Deduplication | MEDIUM | MEDIUM | — |
| NFR1 | Performance | MEDIUM | MEDIUM | — |
| NFR2 | Scalability | MEDIUM | MEDIUM | — |
| NFR4 | Security & Privacy | MEDIUM | MEDIUM | — |
| NFR5 | Maintainability | MEDIUM | MEDIUM | — |
| LOW URGENCY | | | | |
| FR11 | Social Sharing | LOW | LOW | — |
| FR12 | Notifications | LOW | LOW | — |
| FR49 | A/B Testing | LOW | LOW | — |
| FR50 | OSINT Toolkit Integration | LOW | LOW | — |
| FR51 | Video Verification System | LOW | LOW | — |
| FR52 | Interactive Detection Training | LOW | LOW | — |
| FR53 | Cross-Organizational Sharing | LOW | LOW | — |
Total: 32 requirements (24 Functional, 8 Non-Functional)
Notes:
- Reason column: Only populated for HIGH priority or HIGH urgency items
- MEDIUM and LOW priority items use "—" (no specific reason needed)
See also:
- POC Requirements - POC1 scope and simplifications
- Implementation Roadmap - Phase-by-phase implementation plan
- User Needs - Foundation that drives these requirements
=== 10.1 User Needs Priority ===
User Needs (UN) are the foundation that drives functional and non-functional requirements. They are not independently prioritized; instead, their priority is inherited from the FR/NFR requirements they drive.
| ID | Title | Drives Requirements |
|---|---|---|
| UN-1 | Trust Assessment at a Glance | Multiple FR/NFR |
| UN-2 | Claim Extraction and Verification | FR1-7 |
| UN-3 | Article Summary with FactHarbor Analysis Summary | FR4 |
| UN-4 | Social Media Fact-Checking | FR1, FR4 |
| UN-5 | Source Provenance and Track Records | FR6 |
| UN-6 | Publisher Reliability History | FR6 |
| UN-7 | Evidence Transparency | NFR3 |
| UN-8 | Understanding Disagreement and Consensus | FR4 |
| UN-9 | Methodology Transparency | NFR3, NFR11 |
| UN-10 | Manipulation Tactics Detection | FR48 |
| UN-11 | Filtered Research | FR3 |
| UN-12 | Submit Unchecked Claims | FR8 |
| UN-13 | Cite FactHarbor Verdicts | FR10 |
| UN-14 | API Access for Integration | FR13 |
| UN-15 | Verdict Evolution Timeline | FR7 |
| UN-16 | AI vs. Human Review Status | FR9 |
| UN-17 | In-Article Claim Highlighting | FR1 |
| UN-26 | Search Engine Visibility | FR44 |
| UN-27 | Visual Claim Verification | FR46 |
| UN-28 | Safe Contribution Environment | FR48 |
Total: 20 User Needs
Note: Each User Need inherits priority from the requirements it drives. For example, UN-2 (Claim Extraction and Verification) drives FR1-7, which are CRITICAL/HIGH priority, therefore UN-2 is also critical to the project.
== 11. MVP Scope ==
Phase 1: Read-Only MVP

Build:
- Automated claim analysis
- Confidence scoring
- Source evaluation
- Browse/search interface
- User flagging system

Goal: Prove AI quality before adding user editing

User Needs fulfilled in Phase 1: UN-1, UN-2, UN-3, UN-4, UN-5, UN-6, UN-7, UN-8, UN-9, UN-12

Phase 2: User Contributions

Add only if needed:
- Simple editing (Wikipedia-style)
- Reputation system
- Basic moderation
- In-article claim highlighting (FR13)

Additional User Needs fulfilled: UN-13, UN-17

Phase 3: Refinement

- Continuous quality improvement
- Feature additions based on real usage
- Scale infrastructure

Additional User Needs fulfilled: UN-14 (API access), UN-15 (Full evolution tracking)

Deferred:
- Federation (until multiple successful instances exist)
- Complex contribution workflows (focus on automation)
- Extensive role hierarchy (keep simple)

== 12. Success Metrics ==

System Quality (track weekly):
- Error rate by category (target: -10%/month)
- Average confidence score (target: increase)
- Source quality distribution (target: more high-quality)
- Contradiction detection rate (target: increase)

Efficiency (track monthly):
- Claims processed per hour (target: increase)
- Human hours per claim (target: decrease)
- Automation coverage (target: >90%)
- Re-work rate (target: <5%)

User Satisfaction (track quarterly):
- User flag rate (issues found)
- Correction acceptance rate (flags valid)
- Return user rate
- Trust indicators (surveys)

User Needs Metrics (track quarterly):
- UN-1: % users who understand trust scores
- UN-4: Time to verify social media claim (target: <30s)
- UN-7: % users who access evidence details
- UN-8: % users who view multiple scenarios
- UN-15: % users who check evolution timeline
- UN-17: % users who enable in-article highlighting; avg. time spent on highlighted vs. non-highlighted articles

== 13. Requirements Traceability ==

For full traceability matrix showing which requirements fulfill which user needs, see:

- User Needs - Section 8 includes comprehensive mapping tables

== 14. Related Pages ==

Non-Functional Requirements (see Section 9):
- NFR11 — AKEL Quality Assurance Framework
- NFR12 — Security Controls
- NFR13 — Quality Metrics Transparency

Other Requirements:
- User Needs - What users need (drives these requirements)
- Gap Analysis
- Architecture - How requirements are implemented
- Data Model - Data structures supporting requirements
- Workflows - User interaction workflows
- AKEL - AI system fulfilling automation requirements
- Global Rules
- Privacy Policy

= V0.9.70 Additional Requirements =

== Functional Requirements (Additional) ==

=== FR44: ClaimReview Schema Implementation ===

Generate valid ClaimReview structured data for Google/Bing visibility.

Schema.org Mapping:
- 80-100% likelihood → 5 (Highly Supported)
- 60-79% → 4 (Supported)
- 40-59% → 3 (Mixed)
- 20-39% → 2 (Questionable)
- 0-19% → 1 (Refuted)

Milestone: V1.0
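A minimal sketch of emitting ClaimReview JSON-LD from this mapping; the organization name and URL handling are illustrative placeholders:

```python
# Sketch of generating ClaimReview JSON-LD from a likelihood score
# using the mapping above. Organization name and URL are placeholders.

import json

def rating_value(likelihood: float) -> int:
    """Map a 0-100% likelihood to the 1-5 Schema.org rating above."""
    for floor, value in ((80, 5), (60, 4), (40, 3), (20, 2)):
        if likelihood >= floor:
            return value
    return 1

RATING_NAMES = {5: "Highly Supported", 4: "Supported", 3: "Mixed",
                2: "Questionable", 1: "Refuted"}

def claim_review_jsonld(claim: str, likelihood: float, url: str) -> str:
    value = rating_value(likelihood)
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "url": url,
        "claimReviewed": claim,
        "author": {"@type": "Organization", "name": "FactHarbor"},
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": value,
            "bestRating": 5,
            "worstRating": 1,
            "alternateName": RATING_NAMES[value],
        },
    }, indent=2)
```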
=== FR45: User Corrections Notification System ===

Notify users when analyses are corrected.

Mechanisms:

1. In-page banner (30 days)
2. Public correction log
3. Email notifications (opt-in)
4. RSS/API feed

Milestone: Beta 0 (basic), V1.0 (complete). BLOCKER

=== FR46: Image Verification System ===

Methods:

1. Reverse image search
2. EXIF metadata analysis
3. Manipulation detection (basic)
4. Context verification

Milestone: Beta 0 (basic), V1.0 (extended)

=== FR47: Archive.org Integration ===

Auto-save evidence sources to the Wayback Machine.

Milestone: Beta 0

=== FR48: Safety Framework for Contributors ===

Protect contributors from harassment and legal threats.

Milestone: V1.1

=== FR49: A/B Testing Framework ===

Test AKEL approaches and UI designs systematically.

Milestone: V1.0

=== FR50-FR53: Future Enhancements (V2.0+) ===

- FR50: OSINT Toolkit Integration
- FR51: Video Verification System
- FR52: Interactive Detection Training
- FR53: Cross-Organizational Sharing

Milestone: V2.0+ (12-18 months post-launch)
=== FR54: Evidence Deduplication ===
Fulfills: Accurate evidence counting, quality metrics
Purpose: Avoid counting the same source multiple times when it appears in different forms.
Specification:
Deduplication Logic:
1. URL Normalization:
  - Remove tracking parameters (?utm_source=...)
  - Normalize http/https
  - Normalize www/non-www
  - Handle redirects
2. Content Similarity:
  - If two sources have >90% text similarity → Same source
  - If one is a subset of the other → Same source
  - Use fuzzy matching for minor differences
3. Cross-Domain Syndication:
  - Detect wire service content (AP, Reuters)
  - Mark as single source if syndicated
  - Count original publication only
Display:
1. Original Article (NYTimes)
  - Also appeared in: WashPost, Guardian (syndicated)
2. Research Paper (Nature)
3. Official Statement (WHO)
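A minimal sketch of the URL normalization step above; the tracking-parameter list is illustrative, and redirect resolution (which needs an HTTP fetch) is omitted:

```python
# Minimal sketch of the URL normalization step in the deduplication
# logic above. Tracking-parameter prefixes are illustrative; redirect
# handling would require an HTTP fetch and is omitted here.

from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

TRACKING_PREFIXES = ("utm_", "fbclid", "gclid")

def normalize_url(url: str) -> str:
    p = urlparse(url.strip())
    scheme = "https"                              # normalize http/https
    host = p.netloc.lower().removeprefix("www.")  # normalize www/non-www
    query = [(k, v) for k, v in parse_qsl(p.query)
             if not k.startswith(TRACKING_PREFIXES)]
    return urlunparse((scheme, host, p.path, "", urlencode(query), ""))
```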
Acceptance Criteria:
- ✅ Duplicate URLs recognized
- ✅ Syndicated content detected
- ✅ Evidence count shows "unique" vs "total"
Milestone: POC2, Beta 0
== Enhanced Existing Requirements ==
=== FR7: Automated Verdicts (Enhanced with Quality Gates) ===

POC1+ Enhancement: After AKEL generates a verdict, it passes through quality gates.

Workflow:
1. Extract claims
2. [GATE 1] Validate fact-checkable
3. Generate scenarios
4. Generate verdicts
5. [GATE 4] Validate confidence
6. Display to user
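A minimal sketch of this gated workflow, reusing the gate sketches from NFR11; generate_scenarios and generate_verdict are hypothetical placeholders for AKEL calls, and the claim/verdict attributes are assumed for illustration:

```python
# Sketch of the gated verdict workflow above. passes_gate1 and
# confidence_tier come from the NFR11 sketches; generate_scenarios
# and generate_verdict are hypothetical AKEL placeholders.

def process_claim(claim) -> str:
    if not passes_gate1(claim.text, claim.is_verifiable):
        return "NON_FACTUAL_CLAIM"          # Gate 1 failed
    scenarios = generate_scenarios(claim)   # hypothetical AKEL call
    verdict = generate_verdict(claim, scenarios)
    tier = confidence_tier(verdict.sources, verdict.quality,
                           verdict.agreement)
    if tier == "INSUFFICIENT":
        return "INSUFFICIENT_EVIDENCE"      # Gate 4 failed
    return "PUBLISHED"
```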
Updated Verdict States:

- PUBLISHED
- INSUFFICIENT_EVIDENCE
- NON_FACTUAL_CLAIM
- PROCESSING
- ERROR

=== FR4: Analysis Summary (Enhanced with Quality Metadata) ===

POC1+ Enhancement: Display quality indicators in the Analysis Summary:

- Verifiable Claims: 3/5
- High Confidence Verdicts: 1
- Medium Confidence: 2
- Evidence Sources: 12
- Avg Source Quality: 0.73
- Quality Score: 8.5/10