Requirements
This page defines Roles, Content States, Rules, and System Requirements for FactHarbor.
Core Philosophy: Invest in system improvement, not manual data correction. When AI makes errors, improve the algorithm and re-process automatically.
Navigation
- User Needs - What users need from FactHarbor (drives these requirements)
- This page - How we fulfill those needs through system design
1. Roles
Fulfills: UN-12 (Submit claims), UN-13 (Cite verdicts), UN-14 (API access)
FactHarbor uses three core roles (Reader, Contributor, Moderator) plus a reputation system; domain experts are consulted ad hoc for specific disputes (see 1.4).
1.1 Reader
Who: Anyone (no login required)
Can:
- Browse and search claims
- View scenarios, evidence, verdicts, and confidence scores
- Flag issues or errors
- Use filters, search, and visualization tools
- Submit claims (new claims are added automatically if they are not duplicates)
Cannot:
- Modify content
- Access edit history details
User Needs served: UN-1 (Trust assessment), UN-2 (Claim verification), UN-3 (Article summary with FactHarbor analysis summary), UN-4 (Social media fact-checking), UN-5 (Source tracing), UN-7 (Evidence transparency), UN-8 (Understanding disagreement), UN-12 (Submit claims), UN-17 (In-article highlighting)
1.2 Contributor
Who: Registered users (earns reputation through contributions)
Can:
- Everything a Reader can do
- Edit claims, evidence, and scenarios
- Add sources and citations
- Suggest improvements to AI-generated content
- Participate in discussions
- Earn reputation points for quality contributions
Reputation System:
- New contributors: Limited edit privileges
- Established contributors: Full edit access
- Trusted contributors (substantial reputation): Can approve certain changes
- Reputation earned through: Accepted edits, helpful flags, quality contributions
- Reputation lost through: Reverted edits, invalid flags, abuse
Cannot:
- Delete or hide content (only moderators)
- Override moderation decisions
User Needs served: UN-13 (Cite and contribute)
1.3 Moderator
Who: Trusted community members with proven track record, appointed by governance board
Can:
- Review flagged content
- Hide harmful or abusive content
- Resolve disputes between contributors
- Issue warnings or temporary bans
- Make final decisions on content disputes
- Access full audit logs
Cannot:
- Change governance rules
- Permanently ban users without board approval
- Override technical quality gates
Note: Small team (3-5 initially), supported by automated moderation tools.
1.4 Domain Trusted Contributors (Optional, Task-Specific)
Who: Subject matter specialists invited for specific high-stakes disputes
Not a permanent role: Contacted externally when needed for contested claims in their domain
When used:
- Medical claims with life/safety implications
- Legal interpretations with significant impact
- Scientific claims with high controversy
- Technical claims requiring specialized knowledge
Process:
- Moderator identifies need for expert input
- Contact expert externally (don't require them to be users)
- Domain Trusted Contributor provides a written opinion with sources
- Opinion added to claim record
- Domain Trusted Contributor acknowledged in the claim
User Needs served: UN-16 (Expert validation status)
2. Content States
Fulfills: UN-1 (Trust indicators), UN-16 (Review status transparency)
FactHarbor uses two content states. Focus is on transparency and confidence scoring, not gatekeeping.
2.1 Published
Status: Visible to all users
Includes:
- AI-generated analyses (default state)
- User-contributed content
- Edited/improved content
Quality Indicators (displayed with content):
- Confidence Score: 0-100% (AI's confidence in analysis)
- Source Quality Score: 0-100% (based on source track record)
- Controversy Flag: If high dispute/edit activity
- Completeness Score: % of expected fields filled
- Last Updated: Date of most recent change
- Edit Count: Number of revisions
- Review Status: AI-generated / Human-reviewed / Expert-validated
Automatic Warnings:
- Confidence < 60%: "Low confidence - use caution"
- Source quality < 40%: "Sources may be unreliable"
- High controversy: "Disputed - multiple interpretations exist"
- Medical/Legal/Safety domain: "Seek professional advice"
User Needs served: UN-1 (Trust score), UN-9 (Methodology transparency), UN-15 (Evolution timeline), UN-16 (Review status)
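A minimal sketch of how the automatic warning rules above might be encoded. The thresholds and messages come from this page; the function signature and field names are illustrative assumptions:

```python
# Minimal sketch of the automatic warning rules in section 2.1. Thresholds
# and messages are from this page; the signature is an assumption.

SENSITIVE_DOMAINS = {"medical", "legal", "safety"}

def warnings_for(confidence: float, source_quality: float,
                 high_controversy: bool, domain: str) -> list[str]:
    """Return the warning banners to display alongside a published analysis."""
    warnings = []
    if confidence < 60:
        warnings.append("Low confidence - use caution")
    if source_quality < 40:
        warnings.append("Sources may be unreliable")
    if high_controversy:
        warnings.append("Disputed - multiple interpretations exist")
    if domain in SENSITIVE_DOMAINS:
        warnings.append("Seek professional advice")
    return warnings
```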
2.2 Hidden
Status: Not visible to regular users (only to moderators)
Reasons:
- Spam or advertising
- Personal attacks or harassment
- Illegal content
- Privacy violations
- Deliberate misinformation (verified)
- Abuse or harmful content
Process:
- Automated detection flags for moderator review
- Moderator confirms and hides
- Original author notified with reason
- Author can appeal to the board if they dispute the moderator's decision
Note: Content is hidden, not deleted (for audit trail)
3. Contribution Rules
3.1 All Contributors Must
- Provide sources for factual claims
- Use clear, neutral language in FactHarbor's own summaries
- Respect others and maintain civil discourse
- Accept community feedback constructively
- Focus on improving quality, not protecting ego
3.2 AKEL (AI System)
AKEL is the primary system. Human contributions supplement and train AKEL.
AKEL Must:
- Mark all outputs as AI-generated
- Display confidence scores prominently
- Provide source citations
- Flag uncertainty clearly
- Identify contradictions in evidence
- Learn from human corrections
When AKEL Makes Errors:
1. Capture the error pattern (what, why, how common)
2. Improve the system (better prompt, model, validation)
3. Re-process affected claims automatically
4. Measure improvement (did quality increase?)
Human Role: Train AKEL through corrections, not replace AKEL
3.3 Contributors Should
- Improve clarity and structure
- Add missing sources
- Flag errors for system improvement
- Suggest better ways to present information
- Participate in quality discussions
3.4 Moderators Must
- Be impartial
- Document moderation decisions
- Respond to appeals promptly
- Use automated tools to scale efforts
- Focus on abuse/harm, not routine quality control
4. Quality Standards
Fulfills: UN-5 (Source reliability), UN-6 (Publisher track records), UN-7 (Evidence transparency), UN-9 (Methodology transparency)
4.1 Source Requirements
Track Record Over Credentials:
- Sources evaluated by historical accuracy
- Correction policy matters
- Independence from conflicts of interest
- Methodology transparency
Source Quality Database:
- Automated tracking of source accuracy
- Correction frequency
- Reliability score (updated continuously)
- Users can see source track record
No automatic trust for government, academia, or media - all evaluated by track record.
User Needs served: UN-5 (Source provenance), UN-6 (Publisher reliability)
4.2 Claim Requirements
- Clear subject and assertion
- Verifiable with available information
- Sourced (or explicitly marked as needing sources)
- Neutral language in FactHarbor summaries
- Appropriate context provided
User Needs served: UN-2 (Claim extraction and verification)
4.3 Evidence Requirements
- Publicly accessible (or explain why not)
- Properly cited with attribution
- Relevant to claim being evaluated
- Original source preferred over secondary
User Needs served: UN-7 (Evidence transparency)
4.4 Confidence Scoring
Automated confidence calculation based on:
- Source quality scores
- Evidence consistency
- Contradiction detection
- Completeness of analysis
- Historical accuracy of similar claims
Thresholds:
- < 40%: Too low to publish (needs improvement)
- 40-60%: Published with "Low confidence" warning
- 60-80%: Published as standard
- 80-100%: Published as "High confidence"
User Needs served: UN-1 (Trust assessment), UN-9 (Methodology transparency)
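A minimal Python sketch of this calculation. The factor names and publication thresholds come from the lists above; the weights are placeholder assumptions, since this page does not specify them:

```python
# Illustrative only: factor names come from section 4.4, but the weights are
# placeholder assumptions. All inputs are 0-100; "contradiction_absence" is
# 100 when no contradictions are detected.

FACTOR_WEIGHTS = {
    "source_quality": 0.30,
    "evidence_consistency": 0.25,
    "contradiction_absence": 0.15,
    "completeness": 0.15,
    "historical_accuracy": 0.15,
}

def confidence_score(factors: dict[str, float]) -> float:
    """Weighted combination of 0-100 factor scores into a 0-100 confidence."""
    return sum(FACTOR_WEIGHTS[name] * factors[name] for name in FACTOR_WEIGHTS)

def publication_band(score: float) -> str:
    """Map a confidence score to the thresholds listed above."""
    if score < 40:
        return "too low to publish (needs improvement)"
    if score < 60:
        return "published with low-confidence warning"
    if score < 80:
        return "published as standard"
    return "published as high confidence"
```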
5. Automated Risk Scoring
Fulfills: UN-10 (Manipulation detection), UN-16 (Appropriate review level)
Replace manual risk tiers with continuous automated scoring.
5.1 Risk Score Calculation
Factors (weighted algorithm):
- Domain sensitivity: Medical, legal, safety auto-flagged higher
- Potential impact: Views, citations, spread
- Controversy level: Flags, disputes, edit wars
- Uncertainty: Low confidence, contradictory evidence
- Source reliability: Track record of sources used
Score: 0-100 (higher = more risk)
5.2 Automated Actions
- Score > 80: Flag for moderator review before publication
- Score 60-80: Publish with prominent warnings
- Score 40-60: Publish with standard warnings
- Score < 40: Publish normally
Continuous monitoring: Risk score recalculated as new information emerges
User Needs served: UN-10 (Detect manipulation tactics), UN-16 (Review status)
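A sketch of the scoring and routing described in 5.1-5.2. The action thresholds come from this page; the factor weights are assumptions:

```python
# Sketch of sections 5.1-5.2. Action thresholds are from this page; factor
# weights are assumptions. "source_unreliability" is 100 minus source
# reliability, so higher = riskier.

RISK_WEIGHTS = {
    "domain_sensitivity": 0.30,
    "potential_impact": 0.25,
    "controversy_level": 0.20,
    "uncertainty": 0.15,
    "source_unreliability": 0.10,
}

def risk_score(factors: dict[str, float]) -> float:
    """Combine 0-100 factor scores into a single 0-100 risk score."""
    return sum(RISK_WEIGHTS[name] * factors[name] for name in RISK_WEIGHTS)

def action_for(score: float) -> str:
    """Map a risk score to the automated actions listed in 5.2."""
    if score > 80:
        return "flag for moderator review before publication"
    if score > 60:
        return "publish with prominent warnings"
    if score > 40:
        return "publish with standard warnings"
    return "publish normally"
```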
6. System Improvement Process
Core principle: Fix the system, not just the data.
6.1 Error Capture
When users flag errors or make corrections:
1. What was wrong? (categorize)
2. What should it have been?
3. Why did the system fail? (root cause)
4. How common is this pattern?
5. Store in ErrorPattern table (improvement queue)
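This page names an ErrorPattern table but not its schema; the dataclass below is a hypothetical shape covering the five capture steps above, with every field an assumption:

```python
# Hypothetical shape for an ErrorPattern record; all fields are assumptions.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ErrorPattern:
    category: str                 # what was wrong (step 1)
    expected: str                 # what it should have been (step 2)
    root_cause: str               # why the system failed (step 3)
    occurrence_count: int         # how common the pattern is (step 4)
    example_claim_ids: list[str] = field(default_factory=list)
    first_seen: datetime = field(default_factory=datetime.utcnow)
    status: str = "queued"        # queued -> fixed -> verified
```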
6.2 Weekly Improvement Cycle
1. Review: Analyze top error patterns
2. Develop: Create fix (prompt, model, validation)
3. Test: Validate fix on sample claims
4. Deploy: Roll out if quality improves
5. Re-process: Automatically update affected claims
6. Monitor: Track quality metrics
6.3 Quality Metrics Dashboard
Track continuously:
- Error rate by category
- Source quality distribution
- Confidence score trends
- User flag rate (issues found)
- Correction acceptance rate
- Re-work rate
- Claims processed per hour
Goal: 10% monthly reduction in error rate
7. Automated Quality Monitoring
Replace manual audit sampling with automated monitoring.
7.1 Continuous Metrics
- Source quality: Track record database
- Consistency: Contradiction detection
- Clarity: Readability scores
- Completeness: Field validation
- Accuracy: User corrections tracked
7.2 Anomaly Detection
Automated alerts for:
- Sudden quality drops
- Unusual patterns
- Contradiction clusters
- Source reliability changes
- User behavior anomalies
7.3 Targeted Review
- Review only flagged items
- Random sampling for calibration (not quotas)
- Learn from corrections to improve automation
8. Functional Requirements
This section defines specific features that fulfill user needs.
8.1 Claim Intake & Normalization
FR1 — Claim Intake
Fulfills: UN-2 (Claim extraction), UN-4 (Quick fact-checking), UN-12 (Submit claims)
- Users submit claims via simple form or API
- Claims can be text, URL, or image
- Duplicate detection (semantic similarity; see the sketch below)
- Auto-categorization by domain
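A minimal sketch of the semantic-similarity duplicate check, assuming claims are compared via embedding vectors; the embedding source and the 0.9 threshold are assumptions, not a specified design:

```python
# Minimal sketch of semantic duplicate detection, assuming each claim is
# represented by an embedding vector. The 0.9 threshold is an assumption.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def find_duplicate(new_embedding: np.ndarray,
                   existing: dict[str, np.ndarray],
                   threshold: float = 0.9) -> str | None:
    """Return the id of the most similar existing claim, if above threshold."""
    best_id, best_score = None, threshold
    for claim_id, embedding in existing.items():
        score = cosine_similarity(new_embedding, embedding)
        if score >= best_score:
            best_id, best_score = claim_id, score
    return best_id
```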
FR2 — Claim Normalization
Fulfills: UN-2 (Claim verification)
- Standardize to clear assertion format
- Extract key entities (who, what, when, where)
- Identify claim type (factual, predictive, evaluative)
- Link to existing similar claims
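A hypothetical structure for the output of FR2; all field names are illustrative, chosen to mirror the four normalization steps above:

```python
# Hypothetical normalized-claim structure; field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class NormalizedClaim:
    assertion: str                     # standardized, clear assertion text
    entities: dict[str, str] = field(default_factory=dict)  # who/what/when/where
    claim_type: str = "factual"        # factual | predictive | evaluative
    related_claim_ids: list[str] = field(default_factory=list)
```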
FR3 — Claim Classification
Fulfills: UN-11 (Filtered research)
- Domain: Politics, Science, Health, etc.
- Type: Historical fact, current stat, prediction, etc.
- Risk score: Automated calculation
- Complexity: Simple, moderate, complex
8.2 Scenario System
FR4 — Scenario Generation
Fulfills: UN-2 (Context-dependent verification), UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)
Automated scenario creation:
- AKEL analyzes claim and generates likely scenarios (use-cases and contexts)
- Each scenario includes: assumptions, definitions, boundaries, evidence context
- Users can flag incorrect scenarios
- System learns from corrections
Key Concept: Scenarios represent different interpretations or contexts (e.g., "Clinical trials with healthy adults" vs. "Real-world data with diverse populations")
FR5 — Evidence Linking
Fulfills: UN-5 (Source tracing), UN-7 (Evidence transparency)
- Automated evidence discovery from sources
- Relevance scoring
- Contradiction detection
- Source quality assessment
FR6 — Scenario Comparison
Fulfills: UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement)
- Side-by-side comparison interface
- Highlight key differences between scenarios
- Show evidence supporting each scenario
- Display confidence scores per scenario
8.3 Verdicts & Analysis
FR7 — Automated Verdicts
Fulfills: UN-1 (Trust score), UN-2 (Verification verdicts), UN-3 (Article summary with FactHarbor analysis summary), UN-13 (Cite verdicts)
- AKEL generates verdict based on evidence within each scenario
- Likelihood range displayed (e.g., "0.70-0.85 (likely true)") - NOT binary true/false
- Uncertainty factors explicitly listed (e.g., "Small sample sizes", "Long-term effects unknown")
- Confidence score displayed prominently
- Source quality indicators shown
- Contradictions noted
- Uncertainty acknowledged
Key Innovation: Detailed probabilistic verdicts with explicit uncertainty, not binary judgments
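A sketch of what such a probabilistic verdict record might look like. The likelihood-range format mirrors the "0.70-0.85 (likely true)" example above; the band cutoffs and field names are assumptions:

```python
# Sketch of a probabilistic verdict record per FR7; band cutoffs and field
# names are assumptions of this sketch.
from dataclasses import dataclass, field

@dataclass
class Verdict:
    scenario_id: str
    likelihood_low: float              # e.g. 0.70
    likelihood_high: float             # e.g. 0.85
    confidence: float                  # 0-100, displayed prominently
    uncertainty_factors: list[str] = field(default_factory=list)
    contradictions: list[str] = field(default_factory=list)

    def label(self) -> str:
        """Render e.g. '0.70-0.85 (likely true)'; cutoffs are placeholders."""
        mid = (self.likelihood_low + self.likelihood_high) / 2
        band = ("likely true" if mid >= 0.7
                else "uncertain" if mid >= 0.4
                else "likely false")
        return f"{self.likelihood_low:.2f}-{self.likelihood_high:.2f} ({band})"
```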
FR8 — Time Evolution
Fulfills: UN-15 (Verdict evolution timeline)
- Claims and verdicts update as new evidence emerges
- Version history maintained for all verdicts
- Changes highlighted
- Confidence score trends visible
- Users can see "as of date X, what did we know?"
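A minimal sketch of that "as of date X" lookup, assuming each stored verdict version carries an effective timestamp (an assumption of this sketch, not a specified schema):

```python
# Minimal sketch of the FR8 point-in-time lookup; the 'effective_at' key is
# an assumption of this sketch.
from datetime import datetime

def verdict_as_of(versions: list[dict], as_of: datetime) -> dict | None:
    """Return the latest verdict version in effect at the given time."""
    eligible = [v for v in versions if v["effective_at"] <= as_of]
    return max(eligible, key=lambda v: v["effective_at"], default=None)
```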
8.4 User Interface & Presentation
FR12 — Two-Panel Summary View (Article Summary with FactHarbor Analysis Summary)
Fulfills: UN-3 (Article Summary with FactHarbor Analysis Summary)
Purpose: Provide side-by-side comparison of what a document claims vs. FactHarbor's complete analysis of its credibility
Left Panel (Article Summary):
- Document title, source, and claimed credibility
- "The Big Picture" - main thesis or position change
- "Key Findings" - structured summary of document's main claims
- "Reasoning" - document's explanation for positions
- "Conclusion" - document's bottom line
Right Panel (FactHarbor Analysis Summary):
- FactHarbor's independent source credibility assessment
- Claim-by-claim verdicts with confidence scores
- Methodology assessment (strengths, limitations)
- Overall verdict on document quality
- Analysis ID for reference
Design Principles:
- No scrolling required - both panels visible simultaneously
- Visual distinction between "what they say" and "FactHarbor's analysis"
- Color coding for verdicts (supported, uncertain, refuted)
- Confidence percentages clearly visible
- Mobile responsive (panels stack vertically on small screens)
Implementation Notes:
- Generated automatically by AKEL for every analyzed document
- Updates when verdict evolves (maintains version history)
- Exportable as standalone summary report
- Shareable via permanent URL
FR13 — In-Article Claim Highlighting
Fulfills: UN-17 (In-article claim highlighting)
Purpose: Enable readers to quickly assess claim credibility while reading by visually highlighting factual claims with color-coded indicators
Visual Example: Article with Highlighted Claims
Article: "New Study Shows Benefits of Mediterranean Diet"
A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet.
The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers.
Dr. Maria Rodriguez, lead researcher, recommends incorporating more olive oil, fish, and vegetables into daily meals.
Participants also reported feeling more energetic and experiencing better sleep quality, though these were secondary measures.
Legend:
- 🟢 = Well-supported claim (confidence ≥75%)
- 🟡 = Uncertain claim (confidence 40-74%)
- 🔴 = Refuted/unsupported claim (confidence <40%)
- Plain text = Non-factual content (context, opinions, recommendations)
Color-Coding System:
- Green: Well-supported claims (confidence ≥75%, strong evidence)
- Yellow/Orange: Uncertain claims (confidence 40-74%, conflicting or limited evidence)
- Red: Refuted or unsupported claims (confidence <40%, contradicted by evidence)
- Gray/Neutral: Non-factual content (opinions, questions, procedural text)
Interactive Highlighting Example (Detailed View)
| Article Text | Status | Analysis |
|---|---|---|
| A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet. | Plain text | Context - no highlighting |
| Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups | 🟢 WELL SUPPORTED | |
| The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers. | Plain text | Methodology - no highlighting |
| Some experts believe this diet can completely prevent heart attacks | 🟡 UNCERTAIN | |
| Dr. Rodriguez recommends incorporating more olive oil, fish, and vegetables into daily meals. | Plain text | Recommendation - no highlighting |
| The study proves that saturated fats cause heart disease | 🔴 REFUTED | |
Design Notes:
- Highlighted claims use italics to distinguish from plain text
- Color backgrounds match XWiki message box colors (success/warning/error)
- Status column shows verdict prominently
- Analysis column provides quick summary with link to details
User Actions:
- Hover over highlighted claim → Tooltip appears
- Click highlighted claim → Detailed analysis modal/panel
- Toggle button to turn highlighting on/off
- Keyboard: Tab through highlighted claims
Interaction Design:
- Hover/click on highlighted claim → Show tooltip with:
- Claim text
- Verdict (e.g., "WELL SUPPORTED")
- Confidence score (e.g., "85%")
- Brief evidence summary
- Link to detailed analysis
- Toggle highlighting on/off (user preference)
- Adjustable color intensity for accessibility
Technical Requirements:
- Real-time highlighting as page loads (non-blocking)
- Claim boundary detection (start/end of assertion)
- Handle nested or overlapping claims
- Preserve original article formatting
- Work with various content formats (HTML, plain text, PDFs)
Performance Requirements:
- Highlighting renders within 500ms of page load
- No perceptible delay in reading experience
- Efficient DOM manipulation (avoid reflows)
Accessibility:
- Color-blind friendly palette (use patterns/icons in addition to color)
- Screen reader compatible (ARIA labels for claim credibility)
- Keyboard navigation to highlighted claims
Implementation Notes:
- Claims extracted and analyzed by AKEL during initial processing
- Highlighting data stored as annotations with byte offsets (see the sketch after this list)
- Client-side rendering of highlights based on verdict data
- Mobile responsive (tap instead of hover)
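A hypothetical annotation shape matching these notes (byte offsets plus verdict data for client-side rendering). Only the byte-offset idea and the confidence color bands come from this page; the field names are assumptions:

```python
# Hypothetical highlight annotation; field names are assumptions. Color
# bands match the legend in FR13 (>=75 green, 40-74 yellow, <40 red).
from dataclasses import dataclass

@dataclass
class HighlightAnnotation:
    claim_id: str
    start_offset: int      # byte offset where the claim begins
    end_offset: int        # byte offset where the claim ends (exclusive)
    verdict: str           # "supported" | "uncertain" | "refuted"
    confidence: float      # 0-100, shown in the tooltip
    summary: str           # brief evidence summary for the tooltip

def color_for(annotation: HighlightAnnotation) -> str:
    """Map confidence to the legend's color bands."""
    if annotation.confidence >= 75:
        return "green"     # well-supported
    if annotation.confidence >= 40:
        return "yellow"    # uncertain
    return "red"           # refuted/unsupported
```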
8.5 Workflow & Moderation
FR9 — Publication Workflow
Fulfills: UN-1 (Fast access to verified content), UN-16 (Clear review status)
Simple flow:
1. Claim submitted
2. AKEL processes (automated)
3. If confidence > threshold: Publish (labeled as AI-generated)
4. If confidence < threshold: Flag for improvement
5. If risk score > threshold: Flag for moderator
No multi-stage approval process
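A minimal sketch of this routing decision. The exact threshold values are not specified on this page, so the defaults below are placeholders; the risk check is applied first so that high-risk claims are held even when confidence is high:

```python
# Minimal sketch of FR9 routing; threshold defaults are placeholders.
def route_claim(confidence: float, risk: float,
                confidence_threshold: float = 60,
                risk_threshold: float = 80) -> str:
    if risk > risk_threshold:
        return "flag for moderator"
    if confidence >= confidence_threshold:
        return "publish (labeled as AI-generated)"
    return "flag for improvement"
```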
FR10 — Moderation
Focus on abuse, not routine quality:
- Automated abuse detection
- Moderators handle flags
- Quick response to harmful content
- Minimal involvement in routine content
FR11 — Audit Trail
Fulfills: UN-14 (API access to histories), UN-15 (Evolution tracking)
- All edits logged
- Version history public
- Moderation decisions documented
- System improvements tracked
9. Non-Functional Requirements
9.1 NFR1 — Performance
Fulfills: UN-4 (Fast fact-checking), UN-11 (Responsive filtering)
- Claim processing: < 30 seconds
- Search response: < 2 seconds
- Page load: < 3 seconds
- 99% uptime
9.2 NFR2 — Scalability
Fulfills: UN-14 (API access at scale)
- Handle 10,000 claims initially
- Scale to 1M+ claims
- Support 100K+ concurrent users
- Automated processing scales linearly
9.3 NFR3 — Transparency
Fulfills: UN-7 (Evidence transparency), UN-9 (Methodology transparency), UN-13 (Citable verdicts), UN-15 (Evolution visibility)
- All algorithms open source
- All data exportable
- All decisions documented
- Quality metrics public
9.4 NFR4 — Security & Privacy
- Follow Privacy Policy
- Secure authentication
- Data encryption
- Regular security audits
9.5 NFR5 — Maintainability
- Modular architecture
- Automated testing
- Continuous integration
- Comprehensive documentation
10. MVP Scope
Phase 1 (Months 1-3): Read-Only MVP
Build:
- Automated claim analysis
- Confidence scoring
- Source evaluation
- Browse/search interface
- User flagging system
Goal: Prove AI quality before adding user editing
User Needs fulfilled in Phase 1: UN-1, UN-2, UN-3, UN-4, UN-5, UN-6, UN-7, UN-8, UN-9, UN-12
Phase 2 (Months 4-6): User Contributions
Add only if needed:
- Simple editing (Wikipedia-style)
- Reputation system
- Basic moderation
- In-article claim highlighting (FR13)
Additional User Needs fulfilled: UN-13, UN-17
Phase 3 (Months 7-12): Refinement
- Continuous quality improvement
- Feature additions based on real usage
- Scale infrastructure
Additional User Needs fulfilled: UN-14 (API access), UN-15 (Full evolution tracking)
Deferred:
- Federation (until multiple successful instances exist)
- Complex contribution workflows (focus on automation)
- Extensive role hierarchy (keep simple)
11. Success Metrics
System Quality (track weekly):
- Error rate by category (target: -10%/month)
- Average confidence score (target: increase)
- Source quality distribution (target: more high-quality)
- Contradiction detection rate (target: increase)
Efficiency (track monthly):
- Claims processed per hour (target: increase)
- Human hours per claim (target: decrease)
- Automation coverage (target: >90%)
- Re-work rate (target: <5%)
User Satisfaction (track quarterly):
- User flag rate (issues found)
- Correction acceptance rate (flags valid)
- Return user rate
- Trust indicators (surveys)
User Needs Metrics (track quarterly):
- UN-1: % users who understand trust scores
- UN-4: Time to verify social media claim (target: <30s)
- UN-7: % users who access evidence details
- UN-8: % users who view multiple scenarios
- UN-15: % users who check evolution timeline
- UN-17: % users who enable in-article highlighting; avg. time spent on highlighted vs. non-highlighted articles
12. Requirements Traceability
For full traceability matrix showing which requirements fulfill which user needs, see:
- User Needs - Section 8 includes comprehensive mapping tables
13. Related Pages
- User Needs - What users need (drives these requirements)
- Architecture - How requirements are implemented
- Data Model - Data structures supporting requirements
- Workflows - User interaction workflows
- AKEL - AI system fulfilling automation requirements
- Global Rules
- Privacy Policy