Wiki source code of Requirements
Version 1.1 by Robert Schaub on 2025/12/18 12:03
| author | version | line-number | content |
|---|---|---|---|
| |
1.1 | 1 | = Requirements = |
| 2 | This page defines **Roles**, **Content States**, **Rules**, and **System Principles** for FactHarbor. | ||
| 3 | **Core Philosophy:** Invest in system improvement, not manual data correction. When AI makes errors, improve the algorithm and re-process automatically. | ||
| 4 | == 1. Roles == | ||
| 5 | FactHarbor uses three simple roles plus a reputation system. | ||
| 6 | === 1.1 Reader === | ||
| 7 | **Who**: Anyone (no login required) | ||
| 8 | **Can**: | ||
| 9 | * Browse and search claims | ||
| 10 | * View scenarios, evidence, verdicts, and confidence scores | ||
| 11 | * Flag issues or errors | ||
| 12 | * Use filters, search, and visualization tools | ||
| 13 | * Submit claims (a new claim is added automatically unless it duplicates an existing one) | ||
| 14 | **Cannot**: | ||
| 15 | * Modify content | ||
| 16 | * Access edit history details | ||
| 17 | === 1.2 Contributor === | ||
| 18 | **Who**: Registered users (earn reputation through contributions) | ||
| 19 | **Can**: | ||
| 20 | * Everything a Reader can do | ||
| 21 | * Edit claims, evidence, and scenarios | ||
| 22 | * Add sources and citations | ||
| 23 | * Suggest improvements to AI-generated content | ||
| 24 | * Participate in discussions | ||
| 25 | * Earn reputation points for quality contributions | ||
| 26 | **Reputation System**: | ||
| 27 | * New contributors: Limited edit privileges | ||
| 28 | * Established contributors (established reputation): Full edit access | ||
| 29 | * Trusted contributors (substantial reputation): Can approve certain changes | ||
| 30 | * Reputation earned through: Accepted edits, helpful flags, quality contributions | ||
| 31 | * Reputation lost through: Reverted edits, invalid flags, abuse | ||
| 32 | **Cannot**: | ||
| 33 | * Delete or hide content (only moderators) | ||
| 34 | * Override moderation decisions | ||
| 35 | === 1.3 Moderator === | ||
| 36 | **Who**: Trusted community members with a proven track record, appointed by the governance board | ||
| 37 | **Can**: | ||
| 38 | * Review flagged content | ||
| 39 | * Hide harmful or abusive content | ||
| 40 | * Resolve disputes between contributors | ||
| 41 | * Issue warnings or temporary bans | ||
| 42 | * Make final decisions on content disputes | ||
| 43 | * Access full audit logs | ||
| 44 | **Cannot**: | ||
| 45 | * Change governance rules | ||
| 46 | * Permanently ban users without board approval | ||
| 47 | * Override technical quality gates | ||
| 48 | **Note**: Small team (3-5 initially), supported by automated moderation tools. | ||
| 49 | === 1.4 Domain Trusted Contributors (Optional, Task-Specific) === | ||
| 50 | **Who**: Subject matter specialists invited for specific high-stakes disputes | ||
| 51 | **Not a permanent role**: Contacted externally when needed for contested claims in their domain | ||
| 52 | **When used**: | ||
| 53 | * Medical claims with life/safety implications | ||
| 54 | * Legal interpretations with significant impact | ||
| 55 | * Scientific claims with high controversy | ||
| 56 | * Technical claims requiring specialized knowledge | ||
| 57 | **Process**: | ||
| 58 | * Moderator identifies need for expert input | ||
| 59 | * Contact expert externally (don't require them to be users) | ||
| 60 | * Trusted Contributor provides written opinion with sources | ||
| 61 | * Opinion added to claim record | ||
| 62 | * Trusted Contributor acknowledged in claim | ||
| 63 | == 2. Content States == | ||
| 64 | FactHarbor uses two content states. Focus is on transparency and confidence scoring, not gatekeeping. | ||
| 65 | === 2.1 Published === | ||
| 66 | **Status**: Visible to all users | ||
| 67 | **Includes**: | ||
| 68 | * AI-generated analyses (default state) | ||
| 69 | * User-contributed content | ||
| 70 | * Edited/improved content | ||
| 71 | **Quality Indicators** (displayed with content): | ||
| 72 | * **Confidence Score**: 0-100% (AI's confidence in analysis) | ||
| 73 | * **Source Quality Score**: 0-100% (based on source track record) | ||
| 74 | * **Controversy Flag**: If high dispute/edit activity | ||
| 75 | * **Completeness Score**: % of expected fields filled | ||
| 76 | * **Last Updated**: Date of most recent change | ||
| 77 | * **Edit Count**: Number of revisions | ||
| 78 | **Automatic Warnings**: | ||
| 79 | * Confidence < 60%: "Low confidence - use caution" | ||
| 80 | * Source quality < 40%: "Sources may be unreliable" | ||
| 81 | * High controversy: "Disputed - multiple interpretations exist" | ||
| 82 | * Medical/Legal/Safety domain: "Seek professional advice" | ||
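The warning rules above can be sketched as a small function. This is a minimal illustration, not the FactHarbor implementation: the function name, the parameter names, and the `high_controversy` boolean (the page does not define how "high controversy" is measured) are assumptions.

```python
from typing import List

# Domains the page lists as requiring a professional-advice notice.
SENSITIVE_DOMAINS = {"medical", "legal", "safety"}


def automatic_warnings(confidence: float, source_quality: float,
                       high_controversy: bool, domain: str) -> List[str]:
    """Return the warnings that apply to a published item.

    confidence and source_quality are percentages in the range 0-100.
    high_controversy is a hypothetical precomputed flag.
    """
    warnings = []
    if confidence < 60:
        warnings.append("Low confidence - use caution")
    if source_quality < 40:
        warnings.append("Sources may be unreliable")
    if high_controversy:
        warnings.append("Disputed - multiple interpretations exist")
    if domain.lower() in SENSITIVE_DOMAINS:
        warnings.append("Seek professional advice")
    return warnings
```

Several warnings can apply at once, so the sketch returns a list rather than a single label.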
| 83 | === 2.2 Hidden === | ||
| 84 | **Status**: Not visible to regular users (only to moderators) | ||
| 85 | **Reasons**: | ||
| 86 | * Spam or advertising | ||
| 87 | * Personal attacks or harassment | ||
| 88 | * Illegal content | ||
| 89 | * Privacy violations | ||
| 90 | * Deliberate misinformation (verified) | ||
| 91 | * Abuse or harmful content | ||
| 92 | **Process**: | ||
| 93 | * Automated detection flags content for moderator review | ||
| 94 | * Moderator confirms and hides the content | ||
| 95 | * Original author is notified with the reason | ||
| 96 | * Author can appeal to the board if they dispute the moderator's decision | ||
| 97 | **Note**: Content is hidden, not deleted (for audit trail) | ||
| 98 | == 3. Contribution Rules == | ||
| 99 | === 3.1 All Contributors Must === | ||
| 100 | * Provide sources for factual claims | ||
| 101 | * Use clear, neutral language in FactHarbor's own summaries | ||
| 102 | * Respect others and maintain civil discourse | ||
| 103 | * Accept community feedback constructively | ||
| 104 | * Focus on improving quality, not protecting ego | ||
| 105 | === 3.2 AKEL (AI System) === | ||
| 106 | **AKEL is the primary system**. Human contributions supplement and train AKEL. | ||
| 107 | **AKEL Must**: | ||
| 108 | * Mark all outputs as AI-generated | ||
| 109 | * Display confidence scores prominently | ||
| 110 | * Provide source citations | ||
| 111 | * Flag uncertainty clearly | ||
| 112 | * Identify contradictions in evidence | ||
| 113 | * Learn from human corrections | ||
| 114 | **When AKEL Makes Errors**: | ||
| 115 | 1. Capture the error pattern (what, why, how common) | ||
| 116 | 2. Improve the system (better prompt, model, validation) | ||
| 117 | 3. Re-process affected claims automatically | ||
| 118 | 4. Measure improvement (did quality increase?) | ||
| 119 | **Human Role**: Train AKEL through corrections rather than replace it | ||
| 120 | === 3.3 Contributors Should === | ||
| 121 | * Improve clarity and structure | ||
| 122 | * Add missing sources | ||
| 123 | * Flag errors for system improvement | ||
| 124 | * Suggest better ways to present information | ||
| 125 | * Participate in quality discussions | ||
| 126 | === 3.4 Moderators Must === | ||
| 127 | * Be impartial | ||
| 128 | * Document moderation decisions | ||
| 129 | * Respond to appeals promptly | ||
| 130 | * Use automated tools to scale efforts | ||
| 131 | * Focus on abuse/harm, not routine quality control | ||
| 132 | == 4. Quality Standards == | ||
| 133 | === 4.1 Source Requirements === | ||
| 134 | **Track Record Over Credentials**: | ||
| 135 | * Sources evaluated by historical accuracy | ||
| 136 | * Correction policy matters | ||
| 137 | * Independence from conflicts of interest | ||
| 138 | * Methodology transparency | ||
| 139 | **Source Quality Database**: | ||
| 140 | * Automated tracking of source accuracy | ||
| 141 | * Correction frequency | ||
| 142 | * Reliability score (updated continuously) | ||
| 143 | * Users can see source track record | ||
| 144 | **No automatic trust** for government, academia, or media - all evaluated by track record. | ||
| 145 | === 4.2 Claim Requirements === | ||
| 146 | * Clear subject and assertion | ||
| 147 | * Verifiable with available information | ||
| 148 | * Sourced (or explicitly marked as needing sources) | ||
| 149 | * Neutral language in FactHarbor summaries | ||
| 150 | * Appropriate context provided | ||
| 151 | === 4.3 Evidence Requirements === | ||
| 152 | * Publicly accessible (or explain why not) | ||
| 153 | * Properly cited with attribution | ||
| 154 | * Relevant to claim being evaluated | ||
| 155 | * Original source preferred over secondary | ||
| 156 | === 4.4 Confidence Scoring === | ||
| 157 | **Automated confidence calculation based on**: | ||
| 158 | * Source quality scores | ||
| 159 | * Evidence consistency | ||
| 160 | * Contradiction detection | ||
| 161 | * Completeness of analysis | ||
| 162 | * Historical accuracy of similar claims | ||
| 163 | **Thresholds**: | ||
| 164 | * < 40%: Too low to publish (needs improvement) | ||
| 165 | * 40-60%: Published with "Low confidence" warning | ||
| 166 | * 60-80%: Published as standard | ||
| 167 | * 80-100%: Published as "High confidence" | ||
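As a sketch, the thresholds above map a confidence value to a publication band. The listed ranges share their endpoints (60 and 80), so this example assumes each boundary value belongs to the higher band, which matches the "Confidence < 60%" warning rule in section 2.1. The function name and the band labels are hypothetical.

```python
def confidence_band(confidence: float) -> str:
    """Map a confidence percentage (0-100) to a publication band.

    Assumption: boundary values (40, 60, 80) fall into the higher band.
    """
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence < 40:
        return "needs_improvement"          # too low to publish
    if confidence < 60:
        return "published_low_confidence"   # shown with a warning
    if confidence < 80:
        return "published_standard"
    return "published_high_confidence"
```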
| 168 | == 5. Automated Risk Scoring == | ||
| 169 | **Replace manual risk tiers with continuous automated scoring**. | ||
| 170 | === 5.1 Risk Score Calculation === | ||
| 171 | **Factors** (weighted algorithm): | ||
| 172 | * **Domain sensitivity**: Medical, legal, safety auto-flagged higher | ||
| 173 | * **Potential impact**: Views, citations, spread | ||
| 174 | * **Controversy level**: Flags, disputes, edit wars | ||
| 175 | * **Uncertainty**: Low confidence, contradictory evidence | ||
| 176 | * **Source reliability**: Track record of sources used | ||
| 177 | **Score**: 0-100 (higher = more risk) | ||
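One way to read "weighted algorithm" is a weighted sum of normalized factors. The sketch below assumes exactly that. The weights are placeholders chosen only so that they sum to 1, since the page names the factors but not their weights. Each factor is assumed to be pre-normalized to 0.0-1.0, and source reliability is entered as unreliability so that every factor increases risk.

```python
# Hypothetical weights: the page lists the factors but not their weights.
WEIGHTS = {
    "domain_sensitivity": 0.30,
    "potential_impact": 0.20,
    "controversy": 0.20,
    "uncertainty": 0.15,
    "source_unreliability": 0.15,  # 1.0 means the least reliable sources
}


def risk_score(factors: dict) -> float:
    """Weighted sum of factor values (each 0.0-1.0), scaled to 0-100."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = factors[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
        total += weight * value
    return round(100 * total, 1)
```

With these placeholder weights, a claim that is maximally sensitive by domain but neutral on every other factor scores 30.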
| 178 | === 5.2 Automated Actions === | ||
| 179 | * **Score > 80**: Flag for moderator review before publication | ||
| 180 | * **Score 60-80**: Publish with prominent warnings | ||
| 181 | * **Score 40-60**: Publish with standard warnings | ||
| 182 | * **Score < 40**: Publish normally | ||
| 183 | **Continuous monitoring**: Risk score recalculated as new information emerges | ||
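The action table above can be expressed as a lookup function. In this sketch the action labels are hypothetical, and a score of exactly 60, which appears in two ranges, is assumed to take the stricter action. Scores of exactly 40 and 80 follow from the strict "< 40" and "> 80" rules.

```python
def risk_action(score: float) -> str:
    """Map a risk score (0-100) to the automated action.

    Assumption: a score of exactly 60 gets the stricter (prominent) warning.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score > 80:
        return "moderator_review"            # reviewed before publication
    if score >= 60:
        return "publish_prominent_warnings"
    if score >= 40:
        return "publish_standard_warnings"
    return "publish"
```

Because the risk score is recalculated as new information emerges, a function like this would be re-run on each recalculation rather than only at submission.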
| 184 | == 6. System Improvement Process == | ||
| 185 | **Core principle**: Fix the system, not just the data. | ||
| 186 | === 6.1 Error Capture === | ||
| 187 | **When users flag errors or make corrections**: | ||
| 188 | 1. What was wrong? (categorize) | ||
| 189 | 2. What should it have been? | ||
| 190 | 3. Why did the system fail? (root cause) | ||
| 191 | 4. How common is this pattern? | ||
| 192 | 5. Store in ErrorPattern table (improvement queue) | ||
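A minimal sketch of what a row in the ErrorPattern table might hold, with a helper that answers "how common is this pattern?". Only the table name comes from this page. The field names and the `top_patterns` helper are assumptions for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorPattern:
    """One captured error (hypothetical fields mirroring steps 1-3)."""
    category: str     # 1. what was wrong
    expected: str     # 2. what it should have been
    root_cause: str   # 3. why the system failed
    claim_id: str     # which claim was affected


def top_patterns(records, n=3):
    """4. Count how often each (category, root_cause) pair occurs."""
    counts = Counter((r.category, r.root_cause) for r in records)
    return counts.most_common(n)
```

Grouping by category and root cause rather than by claim keeps the improvement queue focused on system fixes, in line with the core principle of this section.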
| 193 | === 6.2 Weekly Improvement Cycle === | ||
| 194 | 1. **Review**: Analyze top error patterns | ||
| 195 | 2. **Develop**: Create fix (prompt, model, validation) | ||
| 196 | 3. **Test**: Validate fix on sample claims | ||
| 197 | 4. **Deploy**: Roll out if quality improves | ||
| 198 | 5. **Re-process**: Automatically update affected claims | ||
| 199 | 6. **Monitor**: Track quality metrics | ||
| 200 | === 6.3 Quality Metrics Dashboard === | ||
| 201 | **Track continuously**: | ||
| 202 | * Error rate by category | ||
| 203 | * Source quality distribution | ||
| 204 | * Confidence score trends | ||
| 205 | * User flag rate (issues found) | ||
| 206 | * Correction acceptance rate | ||
| 207 | * Re-work rate | ||
| 208 | * Claims processed per hour | ||
| 209 | **Goal**: 10% monthly improvement in error rate | ||
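To make the goal concrete, the sketch below assumes "10% monthly improvement" means a 10% relative reduction of the error rate each month, compounding. That interpretation, the function name, and the parameters are assumptions.

```python
def projected_error_rate(initial_rate: float, months: int,
                         monthly_improvement: float = 0.10) -> float:
    """Error rate after `months` of compounding relative reductions."""
    if months < 0:
        raise ValueError("months must be non-negative")
    return initial_rate * (1 - monthly_improvement) ** months
```

Under this reading, an initial error rate of 20% falls to 18% after one month and to roughly 5.6% after twelve months (0.20 × 0.9^12).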
| 210 | == 7. Automated Quality Monitoring == | ||
| 211 | **Replace manual audit sampling with automated monitoring**. | ||
| 212 | === 7.1 Continuous Metrics === | ||
| 213 | * **Source quality**: Track record database | ||
| 214 | * **Consistency**: Contradiction detection | ||
| 215 | * **Clarity**: Readability scores | ||
| 216 | * **Completeness**: Field validation | ||
| 217 | * **Accuracy**: User corrections tracked | ||
| 218 | === 7.2 Anomaly Detection === | ||
| 219 | **Automated alerts for**: | ||
| 220 | * Sudden quality drops | ||
| 221 | * Unusual patterns | ||
| 222 | * Contradiction clusters | ||
| 223 | * Source reliability changes | ||
| 224 | * User behavior anomalies | ||
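As an illustration of the first alert type, a sudden quality drop could be detected by comparing the latest metric value against its recent history. The z-score test below is one simple technique chosen for this sketch, not a method this page prescribes. The function name and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev


def is_sudden_drop(history, latest, z_threshold=3.0):
    """Return True if `latest` is far below the recent average.

    history: recent values of a quality metric (higher is better).
    """
    if len(history) < 2:
        return False  # not enough data to estimate variation
    sigma = stdev(history)
    if sigma == 0:
        return latest < history[0]  # any drop from a flat history counts
    return (mean(history) - latest) / sigma > z_threshold
```

For example, with a history of `[0.90, 0.91, 0.89, 0.90, 0.92]`, a new value of 0.70 triggers the alert while 0.89 does not.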
| 225 | === 7.3 Targeted Review === | ||
| 226 | * Review only flagged items | ||
| 227 | * Random sampling for calibration (not quotas) | ||
| 228 | * Learn from corrections to improve automation | ||
| 229 | == 8. Claim Intake & Normalization == | ||
| 230 | === 8.1 FR1 – Claim Intake === | ||
| 231 | * Users submit claims via simple form or API | ||
| 232 | * Claims can be text, URL, or image | ||
| 233 | * Duplicate detection (semantic similarity) | ||
| 234 | * Auto-categorization by domain | ||
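The duplicate check can be sketched as "compare the new claim with existing ones and reject it above a similarity threshold". Note that the example below substitutes token-overlap (Jaccard) similarity for the semantic similarity this page calls for, purely to stay self-contained. A real system would more likely compare text embeddings. The function names and the 0.8 threshold are assumptions.

```python
import re


def _tokens(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a stand-in for semantic similarity."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def find_duplicate(new_claim, existing_claims, threshold=0.8):
    """Return the most similar existing claim at or above threshold, else None."""
    best, best_score = None, threshold
    for claim in existing_claims:
        score = jaccard(new_claim, claim)
        if score >= best_score:
            best, best_score = claim, score
    return best
```

Returning the matched claim instead of a boolean lets the intake step link the submission to the existing record.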
| 235 | === 8.2 FR2 – Claim Normalization === | ||
| 236 | * Standardize to clear assertion format | ||
| 237 | * Extract key entities (who, what, when, where) | ||
| 238 | * Identify claim type (factual, predictive, evaluative) | ||
| 239 | * Link to existing similar claims | ||
| 240 | === 8.3 FR3 – Claim Classification === | ||
| 241 | * Domain: Politics, Science, Health, etc. | ||
| 242 | * Type: Historical fact, current statistic, prediction, etc. | ||
| 243 | * Risk score: Automated calculation | ||
| 244 | * Complexity: Simple, moderate, complex | ||
| 245 | == 9. Scenario System == | ||
| 246 | === 9.1 FR4 – Scenario Generation === | ||
| 247 | **Automated scenario creation**: | ||
| 248 | * AKEL analyzes claim and generates likely scenarios | ||
| 249 | * Each scenario includes: assumptions, evidence, conclusion | ||
| 250 | * Users can flag incorrect scenarios | ||
| 251 | * System learns from corrections | ||
| 252 | === 9.2 FR5 – Evidence Linking === | ||
| 253 | * Automated evidence discovery from sources | ||
| 254 | * Relevance scoring | ||
| 255 | * Contradiction detection | ||
| 256 | * Source quality assessment | ||
| 257 | === 9.3 FR6 – Scenario Comparison === | ||
| 258 | * Side-by-side comparison interface | ||
| 259 | * Highlight key differences | ||
| 260 | * Show evidence supporting each | ||
| 261 | * Display confidence scores | ||
| 262 | == 10. Verdicts & Analysis == | ||
| 263 | === 10.1 FR7 – Automated Verdicts === | ||
| 264 | * AKEL generates verdict based on evidence | ||
| 265 | * Confidence score displayed prominently | ||
| 266 | * Source quality indicators | ||
| 267 | * Contradictions noted | ||
| 268 | * Uncertainty acknowledged | ||
| 269 | === 10.2 FR8 – Time Evolution === | ||
| 270 | * Claims update as new evidence emerges | ||
| 271 | * Version history maintained | ||
| 272 | * Changes highlighted | ||
| 273 | * Confidence score trends visible | ||
| 274 | == 11. Workflow & Moderation == | ||
| 275 | === 11.1 FR9 – Publication Workflow === | ||
| 276 | **Simple flow**: | ||
| 277 | 1. Claim submitted | ||
| 278 | 2. AKEL processes (automated) | ||
| 279 | 3. If confidence > threshold: Publish | ||
| 280 | 4. If confidence < threshold: Flag for improvement | ||
| 281 | 5. If risk score > threshold: Flag for moderator | ||
| 282 | **No multi-stage approval process** | ||
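The flow above reduces to a single routing decision per processed claim. This sketch makes three assumptions: the risk check runs first, because section 5.2 calls for moderator review before publication; a confidence equal to the threshold is published; and the default thresholds (40 and 80) are taken from sections 4.4 and 5.2. The function name and return labels are hypothetical.

```python
def route_claim(confidence: float, risk: float,
                confidence_threshold: float = 40.0,
                risk_threshold: float = 80.0) -> str:
    """Decide what happens to a claim after AKEL has processed it."""
    if risk > risk_threshold:
        return "flag_for_moderator"      # step 5, checked first (assumption)
    if confidence < confidence_threshold:
        return "flag_for_improvement"    # step 4
    return "publish"                     # step 3
```

Each claim receives exactly one outcome from one function call, which is what keeps the workflow free of multi-stage approval.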
| 283 | === 11.2 FR10 – Moderation === | ||
| 284 | **Focus on abuse, not routine quality**: | ||
| 285 | * Automated abuse detection | ||
| 286 | * Moderators handle flags | ||
| 287 | * Quick response to harmful content | ||
| 288 | * Minimal involvement in routine content | ||
| 289 | === 11.3 FR11 – Audit Trail === | ||
| 290 | * All edits logged | ||
| 291 | * Version history public | ||
| 292 | * Moderation decisions documented | ||
| 293 | * System improvements tracked | ||
| 294 | == 12. Technical Requirements == | ||
| 295 | === 12.1 NFR1 – Performance === | ||
| 296 | * Claim processing: < 30 seconds | ||
| 297 | * Search response: < 2 seconds | ||
| 298 | * Page load: < 3 seconds | ||
| 299 | * 99% uptime | ||
| 300 | === 12.2 NFR2 – Scalability === | ||
| 301 | * Handle 10,000 claims initially | ||
| 302 | * Scale to 1M+ claims | ||
| 303 | * Support 100K+ concurrent users | ||
| 304 | * Automated processing scales linearly | ||
| 305 | === 12.3 NFR3 – Transparency === | ||
| 306 | * All algorithms open source | ||
| 307 | * All data exportable | ||
| 308 | * All decisions documented | ||
| 309 | * Quality metrics public | ||
| 310 | === 12.4 NFR4 – Security & Privacy === | ||
| 311 | * Follow [[Privacy Policy>>FactHarbor.Organisation.How-We-Work-Together.Privacy-Policy]] | ||
| 312 | * Secure authentication | ||
| 313 | * Data encryption | ||
| 314 | * Regular security audits | ||
| 315 | === 12.5 NFR5 – Maintainability === | ||
| 316 | * Modular architecture | ||
| 317 | * Automated testing | ||
| 318 | * Continuous integration | ||
| 319 | * Comprehensive documentation | ||
| 320 | == 13. MVP Scope == | ||
| 321 | **Phase 1 (Months 1-3): Read-Only MVP** | ||
| 322 | Build: | ||
| 323 | * Automated claim analysis | ||
| 324 | * Confidence scoring | ||
| 325 | * Source evaluation | ||
| 326 | * Browse/search interface | ||
| 327 | * User flagging system | ||
| 328 | **Goal**: Prove AI quality before adding user editing | ||
| 329 | **Phase 2 (Months 4-6): User Contributions** | ||
| 330 | Add only if needed: | ||
| 331 | * Simple editing (Wikipedia-style) | ||
| 332 | * Reputation system | ||
| 333 | * Basic moderation | ||
| 334 | **Phase 3 (Months 7-12): Refinement** | ||
| 335 | * Continuous quality improvement | ||
| 336 | * Feature additions based on real usage | ||
| 337 | * Scale infrastructure | ||
| 338 | **Deferred**: | ||
| 339 | * Federation (until multiple successful instances exist) | ||
| 340 | * Complex contribution workflows (focus on automation) | ||
| 341 | * Extensive role hierarchy (keep simple) | ||
| 342 | == 14. Success Metrics == | ||
| 343 | **System Quality** (track weekly): | ||
| 344 | * Error rate by category (target: -10%/month) | ||
| 345 | * Average confidence score (target: increase) | ||
| 346 | * Source quality distribution (target: more high-quality) | ||
| 347 | * Contradiction detection rate (target: increase) | ||
| 348 | **Efficiency** (track monthly): | ||
| 349 | * Claims processed per hour (target: increase) | ||
| 350 | * Human hours per claim (target: decrease) | ||
| 351 | * Automation coverage (target: >90%) | ||
| 352 | * Re-work rate (target: <5%) | ||
| 353 | **User Satisfaction** (track quarterly): | ||
| 354 | * User flag rate (issues found) | ||
| 355 | * Correction acceptance rate (flags valid) | ||
| 356 | * Return user rate | ||
| 357 | * Trust indicators (surveys) | ||
| 358 | == 15. Related Pages == | ||
| 359 | * [[Architecture>>FactHarbor.Specification.Architecture.WebHome]] | ||
| 360 | * [[Data Model>>FactHarbor.Specification.Data Model.WebHome]] | ||
| 361 | * [[Workflows>>FactHarbor.Specification.Workflows.WebHome]] | ||
| 362 | * [[Global Rules>>FactHarbor.Organisation.How-We-Work-Together.GlobalRules.WebHome]] | ||
| 363 | * [[Privacy Policy>>FactHarbor.Organisation.How-We-Work-Together.Privacy-Policy]] |