Changes for page Requirements
Last modified by Robert Schaub on 2026/02/08 21:32
Page properties (1 modified, 0 added, 0 removed)
... ... @@ -1,10 +1,40 @@ 1 1 = Requirements = 2 -This page defines **Roles**, **Content States**, **Rules**, and **System Principles** for FactHarbor. 2 + 3 +{{info}} 4 +**Phase Assignments:** See [[Requirements Roadmap Matrix>>FactHarbor.Roadmap.Requirements-Roadmap-Matrix.WebHome]] for which requirements are implemented in which phases. 5 +{{/info}} 6 + 7 +**This page defines Roles, Content States, Rules, and System Requirements for FactHarbor.** 8 + 3 3 **Core Philosophy:** Invest in system improvement, not manual data correction. When AI makes errors, improve the algorithm and re-process automatically. 10 + 11 +== Navigation == 12 + 13 +* **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need from FactHarbor (drives these requirements) 14 +* **This page** - How we fulfill those needs through system design 15 + 16 +(% class="box infomessage" %) 17 +((( 18 +**How to read this page:** 19 + 20 +1. **User Needs drive Requirements**: See [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] for what users need 21 +2. **Requirements define implementation**: This page shows how we fulfill those needs 22 +3. **Functional Requirements (FR)**: Specific features and capabilities 23 +4. **Non-Functional Requirements (NFR)**: Quality attributes (performance, security, etc.) 24 + 25 +Each requirement references which User Needs it fulfills. 26 +))) 27 + 4 4 == 1. Roles == 29 + 30 +**Fulfills**: UN-12 (Submit claims), UN-13 (Cite verdicts), UN-14 (API access) 31 + 5 5 FactHarbor uses three simple roles plus a reputation system. 33 + 6 6 === 1.1 Reader === 35 + 7 7 **Who**: Anyone (no login required) 37 + 8 8 **Can**: 9 9 * Browse and search claims 10 10 * View scenarios, evidence, verdicts, and confidence scores ... ... 
@@ -11,11 +11,17 @@ 11 11 * Flag issues or errors 12 12 * Use filters, search, and visualization tools 13 13 * Submit claims automatically (new claims added if not duplicates) 44 + 14 14 **Cannot**: 15 15 * Modify content 16 16 * Access edit history details 48 + 49 +**User Needs served**: UN-1 (Trust assessment), UN-2 (Claim verification), UN-3 (Article summary with FactHarbor analysis summary), UN-4 (Social media fact-checking), UN-5 (Source tracing), UN-7 (Evidence transparency), UN-8 (Understanding disagreement), UN-12 (Submit claims), UN-17 (In-article highlighting) 50 + 17 17 === 1.2 Contributor === 52 + 18 18 **Who**: Registered users (earns reputation through contributions) 54 + 19 19 **Can**: 20 20 * Everything a Reader can do 21 21 * Edit claims, evidence, and scenarios ... ... @@ -23,6 +23,7 @@ 23 23 * Suggest improvements to AI-generated content 24 24 * Participate in discussions 25 25 * Earn reputation points for quality contributions 62 + 26 26 **Reputation System**: 27 27 * New contributors: Limited edit privileges 28 28 * Established contributors (established reputation): Full edit access ... ... @@ -29,11 +29,17 @@ 29 29 * Trusted contributors (substantial reputation): Can approve certain changes 30 30 * Reputation earned through: Accepted edits, helpful flags, quality contributions 31 31 * Reputation lost through: Reverted edits, invalid flags, abuse 69 + 32 32 **Cannot**: 33 33 * Delete or hide content (only moderators) 34 34 * Override moderation decisions 73 + 74 +**User Needs served**: UN-13 (Cite and contribute) 75 + 35 35 === 1.3 Moderator === 77 + 36 36 **Who**: Trusted community members with proven track record, appointed by governance board 79 + 37 37 **Can**: 38 38 * Review flagged content 39 39 * Hide harmful or abusive content ... ... 
@@ -41,19 +41,26 @@ 41 41 * Issue warnings or temporary bans 42 42 * Make final decisions on content disputes 43 43 * Access full audit logs 87 + 44 44 **Cannot**: 45 45 * Change governance rules 46 46 * Permanently ban users without board approval 47 47 * Override technical quality gates 92 + 48 48 **Note**: Small team (3-5 initially), supported by automated moderation tools. 94 + 49 49 === 1.4 Domain Trusted Contributors (Optional, Task-Specific) === 96 + 50 50 **Who**: Subject matter specialists invited for specific high-stakes disputes 98 + 51 51 **Not a permanent role**: Contacted externally when needed for contested claims in their domain 100 + 52 52 **When used**: 53 53 * Medical claims with life/safety implications 54 54 * Legal interpretations with significant impact 55 55 * Scientific claims with high controversy 56 56 * Technical claims requiring specialized knowledge 106 + 57 57 **Process**: 58 58 * Moderator identifies need for expert input 59 59 * Contact expert externally (don't require them to be users) ... ... @@ -60,14 +60,24 @@ 60 60 * Trusted Contributor provides written opinion with sources 61 61 * Opinion added to claim record 62 62 * Trusted Contributor acknowledged in claim 113 + 114 +**User Needs served**: UN-16 (Expert validation status) 115 + 63 63 == 2. Content States == 117 + 118 +**Fulfills**: UN-1 (Trust indicators), UN-16 (Review status transparency) 119 + 64 64 FactHarbor uses two content states. Focus is on transparency and confidence scoring, not gatekeeping. 121 + 65 65 === 2.1 Published === 123 + 66 66 **Status**: Visible to all users 125 + 67 67 **Includes**: 68 68 * AI-generated analyses (default state) 69 69 * User-contributed content 70 70 * Edited/improved content 130 + 71 71 **Quality Indicators** (displayed with content): 72 72 * **Confidence Score**: 0-100% (AI's confidence in analysis) 73 73 * **Source Quality Score**: 0-100% (based on source track record) ... ... 
@@ -75,13 +75,20 @@ 75 75 * **Completeness Score**: % of expected fields filled 76 76 * **Last Updated**: Date of most recent change 77 77 * **Edit Count**: Number of revisions 138 +* **Review Status**: AI-generated / Human-reviewed / Expert-validated 139 + 78 78 **Automatic Warnings**: 79 79 * Confidence < 60%: "Low confidence - use caution" 80 80 * Source quality < 40%: "Sources may be unreliable" 81 81 * High controversy: "Disputed - multiple interpretations exist" 82 82 * Medical/Legal/Safety domain: "Seek professional advice" 145 + 146 +**User Needs served**: UN-1 (Trust score), UN-9 (Methodology transparency), ~~UN-15 (Evolution timeline - Deferred)~~, UN-16 (Review status) 147 + 83 83 === 2.2 Hidden === 149 + 84 84 **Status**: Not visible to regular users (only to moderators) 151 + 85 85 **Reasons**: 86 86 * Spam or advertising 87 87 * Personal attacks or harassment ... ... @@ -89,21 +89,29 @@ 89 89 * Privacy violations 90 90 * Deliberate misinformation (verified) 91 91 * Abuse or harmful content 159 + 92 92 **Process**: 93 93 * Automated detection flags for moderator review 94 94 * Moderator confirms and hides 95 95 * Original author notified with reason 96 96 * Can appeal to board if disputes moderator decision 165 + 97 97 **Note**: Content is hidden, not deleted (for audit trail) 167 + 98 98 == 3. Contribution Rules == 169 + 99 99 === 3.1 All Contributors Must === 171 + 100 100 * Provide sources for factual claims 101 101 * Use clear, neutral language in FactHarbor's own summaries 102 102 * Respect others and maintain civil discourse 103 103 * Accept community feedback constructively 104 104 * Focus on improving quality, not protecting ego 177 + 105 105 === 3.2 AKEL (AI System) === 179 + 106 106 **AKEL is the primary system**. Human contributions supplement and train AKEL. 181 + 107 107 **AKEL Must**: 108 108 * Mark all outputs as AI-generated 109 109 * Display confidence scores prominently ... ... 
@@ -111,49 +111,74 @@ 111 111 * Flag uncertainty clearly 112 112 * Identify contradictions in evidence 113 113 * Learn from human corrections 189 + 114 114 **When AKEL Makes Errors**: 115 115 1. Capture the error pattern (what, why, how common) 116 116 2. Improve the system (better prompt, model, validation) 117 117 3. Re-process affected claims automatically 118 118 4. Measure improvement (did quality increase?) 195 + 119 119 **Human Role**: Train AKEL through corrections, not replace AKEL 197 + 120 120 === 3.3 Contributors Should === 199 + 121 121 * Improve clarity and structure 122 122 * Add missing sources 123 123 * Flag errors for system improvement 124 124 * Suggest better ways to present information 125 125 * Participate in quality discussions 205 + 126 126 === 3.4 Moderators Must === 207 + 127 127 * Be impartial 128 128 * Document moderation decisions 129 129 * Respond to appeals promptly 130 130 * Use automated tools to scale efforts 131 131 * Focus on abuse/harm, not routine quality control 213 + 132 132 == 4. Quality Standards == 215 + 216 +**Fulfills**: UN-5 (Source reliability), UN-6 (Publisher track records), UN-7 (Evidence transparency), UN-9 (Methodology transparency) 217 + 133 133 === 4.1 Source Requirements === 219 + 134 134 **Track Record Over Credentials**: 135 135 * Sources evaluated by historical accuracy 136 136 * Correction policy matters 137 137 * Independence from conflicts of interest 138 138 * Methodology transparency 225 + 139 139 **Source Quality Database**: 140 140 * Automated tracking of source accuracy 141 141 * Correction frequency 142 142 * Reliability score (updated continuously) 143 143 * Users can see source track record 231 + 144 144 **No automatic trust** for government, academia, or media - all evaluated by track record. 
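A minimal sketch of how the Source Quality Database described in 4.1 might keep a continuously updated reliability score. The class name, its fields, and the smoothing formula are assumptions made for illustration; the page does not specify how the score is calculated.

```python
from dataclasses import dataclass


@dataclass
class SourceRecord:
    """Running track record for one source (hypothetical fields)."""
    name: str
    accurate: int = 0      # claims from this source later confirmed
    inaccurate: int = 0    # claims from this source later refuted
    corrections: int = 0   # corrections the source published (tracked, not scored here)

    def record(self, was_accurate: bool) -> None:
        """Update the track record after one claim has been checked."""
        if was_accurate:
            self.accurate += 1
        else:
            self.inaccurate += 1

    def reliability(self) -> float:
        """Smoothed accuracy as a percentage (0-100).

        Assumption: add-one smoothing, so a source with no history
        starts at a neutral 50.0 whatever kind of source it is.
        """
        total = self.accurate + self.inaccurate
        return 100.0 * (self.accurate + 1) / (total + 2)


source = SourceRecord("example-publisher")
for outcome in (True, True, True, False):
    source.record(outcome)
```

In this sketch a government, academic, or media source starts at the same neutral score as any other and moves only as its track record accumulates, which is one way to read "no automatic trust".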
233 + 234 +**User Needs served**: UN-5 (Source provenance), UN-6 (Publisher reliability) 235 + 145 145 === 4.2 Claim Requirements === 237 + 146 146 * Clear subject and assertion 147 147 * Verifiable with available information 148 148 * Sourced (or explicitly marked as needing sources) 149 149 * Neutral language in FactHarbor summaries 150 150 * Appropriate context provided 243 + 244 +**User Needs served**: UN-2 (Claim extraction and verification) 245 + 151 151 === 4.3 Evidence Requirements === 247 + 152 152 * Publicly accessible (or explain why not) 153 153 * Properly cited with attribution 154 154 * Relevant to claim being evaluated 155 155 * Original source preferred over secondary 252 + 253 +**User Needs served**: UN-7 (Evidence transparency) 254 + 156 156 === 4.4 Confidence Scoring === 256 + 157 157 **Automated confidence calculation based on**: 158 158 * Source quality scores 159 159 * Evidence consistency ... ... @@ -160,14 +160,23 @@ 160 160 * Contradiction detection 161 161 * Completeness of analysis 162 162 * Historical accuracy of similar claims 263 + 163 163 **Thresholds**: 164 164 * < 40%: Too low to publish (needs improvement) 165 165 * 40-60%: Published with "Low confidence" warning 166 166 * 60-80%: Published as standard 167 167 * 80-100%: Published as "High confidence" 269 + 270 +**User Needs served**: UN-1 (Trust assessment), UN-9 (Methodology transparency) 271 + 168 168 == 5. Automated Risk Scoring == 273 + 274 +**Fulfills**: UN-10 (Manipulation detection), UN-16 (Appropriate review level) 275 + 169 169 **Replace manual risk tiers with continuous automated scoring**. 277 + 170 170 === 5.1 Risk Score Calculation === 279 + 171 171 **Factors** (weighted algorithm): 172 172 * **Domain sensitivity**: Medical, legal, safety auto-flagged higher 173 173 * **Potential impact**: Views, citations, spread ... ... 
@@ -174,16 +174,26 @@ 174 174 * **Controversy level**: Flags, disputes, edit wars 175 175 * **Uncertainty**: Low confidence, contradictory evidence 176 176 * **Source reliability**: Track record of sources used 286 + 177 177 **Score**: 0-100 (higher = more risk) 288 + 178 178 === 5.2 Automated Actions === 290 + 179 179 * **Score > 80**: Flag for moderator review before publication 180 180 * **Score 60-80**: Publish with prominent warnings 181 181 * **Score 40-60**: Publish with standard warnings 182 182 * **Score < 40**: Publish normally 295 + 183 183 **Continuous monitoring**: Risk score recalculated as new information emerges 297 + 298 +**User Needs served**: UN-10 (Detect manipulation tactics), UN-16 (Review status) 299 + 184 184 == 6. System Improvement Process == 301 + 185 185 **Core principle**: Fix the system, not just the data. 303 + 186 186 === 6.1 Error Capture === 305 + 187 187 **When users flag errors or make corrections**: 188 188 1. What was wrong? (categorize) 189 189 2. What should it have been? ... ... @@ -190,7 +190,9 @@ 190 190 3. Why did the system fail? (root cause) 191 191 4. How common is this pattern? 192 192 5. Store in ErrorPattern table (improvement queue) 193 -=== 6.2 Weekly Improvement Cycle === 312 + 313 +=== 6.2 Continuous Improvement Cycle === 314 + 194 194 1. **Review**: Analyze top error patterns 195 195 2. **Develop**: Create fix (prompt, model, validation) 196 196 3. **Test**: Validate fix on sample claims ... ... @@ -197,7 +197,9 @@ 197 197 4. **Deploy**: Roll out if quality improves 198 198 5. **Re-process**: Automatically update affected claims 199 199 6. **Monitor**: Track quality metrics 321 + 200 200 === 6.3 Quality Metrics Dashboard === 323 + 201 201 **Track continuously**: 202 202 * Error rate by category 203 203 * Source quality distribution ... ... 
@@ -206,16 +206,23 @@ 206 206 * Correction acceptance rate 207 207 * Re-work rate 208 208 * Claims processed per hour 209 -**Goal**: 10% monthly improvement in error rate 332 + 333 +**Goal**: continuous improvement in error rate 334 + 210 210 == 7. Automated Quality Monitoring == 336 + 211 211 **Replace manual audit sampling with automated monitoring**. 338 + 212 212 === 7.1 Continuous Metrics === 340 + 213 213 * **Source quality**: Track record database 214 214 * **Consistency**: Contradiction detection 215 215 * **Clarity**: Readability scores 216 216 * **Completeness**: Field validation 217 217 * **Accuracy**: User corrections tracked 346 + 218 218 === 7.2 Anomaly Detection === 348 + 219 219 **Automated alerts for**: 220 220 * Sudden quality drops 221 221 * Unusual patterns ... ... @@ -222,142 +222,1622 @@ 222 222 * Contradiction clusters 223 223 * Source reliability changes 224 224 * User behavior anomalies 355 + 225 225 === 7.3 Targeted Review === 357 + 226 226 * Review only flagged items 227 227 * Random sampling for calibration (not quotas) 228 228 * Learn from corrections to improve automation 229 -== 8. Claim Intake & Normalization == 230 -=== 8.1 FR1 – Claim Intake === 361 + 362 +== 8. Functional Requirements == 363 + 364 +This section defines specific features that fulfill user needs. 
365 + 366 +=== 8.1 Claim Intake & Normalization === 367 + 368 +==== FR1 — Claim Intake ==== 369 + 370 +**Fulfills**: UN-2 (Claim extraction), UN-4 (Quick fact-checking), UN-12 (Submit claims) 371 + 231 231 * Users submit claims via simple form or API 232 232 * Claims can be text, URL, or image 233 233 * Duplicate detection (semantic similarity) 234 234 * Auto-categorization by domain 235 -=== 8.2 FR2 – Claim Normalization === 376 + 377 +==== FR2 — Claim Normalization ==== 378 + 379 +**Fulfills**: UN-2 (Claim verification) 380 + 236 236 * Standardize to clear assertion format 237 237 * Extract key entities (who, what, when, where) 238 238 * Identify claim type (factual, predictive, evaluative) 239 239 * Link to existing similar claims 240 -=== 8.3 FR3 – Claim Classification === 385 + 386 +==== FR3 — Claim Classification ==== 387 + 388 +**Fulfills**: UN-11 (Filtered research) 389 + 241 241 * Domain: Politics, Science, Health, etc. 242 242 * Type: Historical fact, current stat, prediction, etc. 243 243 * Risk score: Automated calculation 244 244 * Complexity: Simple, moderate, complex 245 -== 9. 
Scenario System == 246 -=== 9.1 FR4 – Scenario Generation === 394 + 395 +=== 8.2 Scenario System === 396 + 397 +==== FR4 — Scenario Generation ==== 398 + 399 +**Fulfills**: UN-2 (Context-dependent verification), UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement) 400 + 247 247 **Automated scenario creation**: 248 -* AKEL analyzes claim and generates likely scenarios 249 -* Each scenario includes: assumptions, evidence, conclusion 402 +* AKEL analyzes claim and generates likely scenarios (use-cases and contexts) 403 +* Each scenario includes: assumptions, definitions, boundaries, evidence context 250 250 * Users can flag incorrect scenarios 251 251 * System learns from corrections 252 -=== 9.2 FR5 – Evidence Linking === 406 + 407 +**Key Concept**: Scenarios represent different interpretations or contexts (e.g., "Clinical trials with healthy adults" vs. "Real-world data with diverse populations") 408 + 409 +==== FR5 — Evidence Linking ==== 410 + 411 +**Fulfills**: UN-5 (Source tracing), UN-7 (Evidence transparency) 412 + 253 253 * Automated evidence discovery from sources 254 254 * Relevance scoring 255 255 * Contradiction detection 256 256 * Source quality assessment 257 -=== 9.3 FR6 – Scenario Comparison === 417 + 418 +==== FR6 — Scenario Comparison ==== 419 + 420 +**Fulfills**: UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement) 421 + 258 258 * Side-by-side comparison interface 259 -* Highlight key differences 260 -* Show evidence supporting each 261 -* Display confidence scores 262 -== 10.
Verdicts & Analysis == 263 -=== 10.1 FR7 – Automated Verdicts === 264 -* AKEL generates verdict based on evidence 423 +* Highlight key differences between scenarios 424 +* Show evidence supporting each scenario 425 +* Display confidence scores per scenario 426 + 427 +=== 8.3 Verdicts & Analysis === 428 + 429 +==== FR7 — Automated Verdicts ==== 430 + 431 +**Fulfills**: UN-1 (Trust score), UN-2 (Verification verdicts), UN-3 (Article summary with FactHarbor analysis summary), UN-13 (Cite verdicts) 432 + 433 +* AKEL generates verdict based on evidence within each scenario 434 +* **Likelihood range** displayed (e.g., "0.70-0.85 (likely true)") - NOT binary true/false 435 +* **Uncertainty factors** explicitly listed (e.g., "Small sample sizes", "Long-term effects unknown") 265 265 * Confidence score displayed prominently 266 -* Source quality indicators 437 +* Source quality indicators shown 267 267 * Contradictions noted 268 268 * Uncertainty acknowledged 269 -=== 10.2 FR8 – Time Evolution === 270 -* Claims update as new evidence emerges 271 -* Version history maintained 440 + 441 +**Key Innovation**: Detailed probabilistic verdicts with explicit uncertainty, not binary judgments 442 + 443 +==== FR8 — Time Evolution ==== 444 + 445 +{{warning}} 446 +**Status:** Deferred (Not in V1.0) 447 + 448 +This requirement has been **dropped from the current architecture and design**. Versioned entities have been replaced with simple edit history tracking only. Full evolution timeline functionality is deferred to future releases beyond V1.0. 449 +{{/warning}} 450 + 451 +**Fulfills**: UN-15 (Verdict evolution timeline) 452 + 453 +* Claims and verdicts update as new evidence emerges 454 +* Version history maintained for all verdicts 272 272 * Changes highlighted 273 273 * Confidence score trends visible 274 -== 11. Workflow & Moderation == 275 -=== 11.1 FR9 – Publication Workflow === 457 +* Users can see "as of date X, what did we know?" 
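The "as of date X, what did we know?" lookup in FR8 can be sketched over a plain edit history, which is what the status note says replaces versioned entities. This is an illustrative sketch only: `VerdictEdit`, its fields, and `verdict_as_of` are hypothetical names, and the sample history is invented.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class VerdictEdit:
    """One entry in a claim's edit history (hypothetical shape)."""
    changed_on: date
    likelihood_low: float
    likelihood_high: float
    confidence: float


def verdict_as_of(history: list[VerdictEdit], when: date) -> Optional[VerdictEdit]:
    """Return the latest edit made on or before `when`, or None if there is none.

    Assumes `history` is sorted by `changed_on` in ascending order.
    """
    dates = [edit.changed_on for edit in history]
    idx = bisect_right(dates, when)  # number of edits dated <= when
    return history[idx - 1] if idx > 0 else None


# Invented sample data for illustration only.
history = [
    VerdictEdit(date(2025, 1, 10), 0.40, 0.60, 0.55),
    VerdictEdit(date(2025, 6, 2), 0.60, 0.80, 0.70),
    VerdictEdit(date(2026, 1, 15), 0.70, 0.85, 0.82),
]
snapshot = verdict_as_of(history, date(2025, 12, 31))
```

Because the lookup needs only a date-ordered list of edits, it would work with simple edit history tracking and would not require fully versioned entities.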
458 + 459 +=== 8.4 User Interface & Presentation === 460 + 461 +==== FR12 — Two-Panel Summary View (Article Summary with FactHarbor Analysis Summary) ==== 462 + 463 +**Fulfills**: UN-3 (Article Summary with FactHarbor Analysis Summary) 464 + 465 +**Purpose**: Provide side-by-side comparison of what a document claims vs. FactHarbor's complete analysis of its credibility 466 + 467 +**Left Panel: Article Summary**: 468 +* Document title, source, and claimed credibility 469 +* "The Big Picture" - main thesis or position change 470 +* "Key Findings" - structured summary of document's main claims 471 +* "Reasoning" - document's explanation for positions 472 +* "Conclusion" - document's bottom line 473 + 474 +**Right Panel: FactHarbor Analysis Summary**: 475 +* FactHarbor's independent source credibility assessment 476 +* Claim-by-claim verdicts with confidence scores 477 +* Methodology assessment (strengths, limitations) 478 +* Overall verdict on document quality 479 +* Analysis ID for reference 480 + 481 +**Design Principles**: 482 +* No scrolling required - both panels visible simultaneously 483 +* Visual distinction between "what they say" and "FactHarbor's analysis" 484 +* Color coding for verdicts (supported, uncertain, refuted) 485 +* Confidence percentages clearly visible 486 +* Mobile responsive (panels stack vertically on small screens) 487 + 488 +**Implementation Notes**: 489 +* Generated automatically by AKEL for every analyzed document 490 +* Updates when verdict evolves (maintains version history) 491 +* Exportable as standalone summary report 492 +* Shareable via permanent URL 493 + 494 +==== FR13 — In-Article Claim Highlighting ==== 495 + 496 +**Fulfills**: UN-17 (In-article claim highlighting) 497 + 498 +**Purpose**: Enable readers to quickly assess claim credibility while reading by visually highlighting factual claims with color-coded indicators 499 + 500 +==== Visual Example: Article with Highlighted Claims ==== 501 + 502 +(% class="box" %) 503 +((( 
504 +**Article: "New Study Shows Benefits of Mediterranean Diet"** 505 + 506 +A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet. 507 + 508 +(% class="box successmessage" style="margin:10px 0;" %) 509 +((( 510 +🟢 **Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups** 511 + 512 +(% style="font-size:0.9em; color:#666;" %) 513 +↑ WELL SUPPORTED • 87% confidence 514 +[[Click for evidence details →]] 515 +(%%) 516 +))) 517 + 518 +The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers. 519 + 520 +(% class="box warningmessage" style="margin:10px 0;" %) 521 +((( 522 +🟡 **Some experts believe this diet can completely prevent heart attacks** 523 + 524 +(% style="font-size:0.9em; color:#666;" %) 525 +↑ UNCERTAIN • 45% confidence 526 +Overstated - evidence shows risk reduction, not prevention 527 +[[Click for details →]] 528 +(%%) 529 +))) 530 + 531 +Dr. Maria Rodriguez, lead researcher, recommends incorporating more olive oil, fish, and vegetables into daily meals. 532 + 533 +(% class="box errormessage" style="margin:10px 0;" %) 534 +((( 535 +🔴 **The study proves that saturated fats cause heart disease** 536 + 537 +(% style="font-size:0.9em; color:#666;" %) 538 +↑ REFUTED • 15% confidence 539 +Claim not supported by study design; correlation ≠ causation 540 +[[Click for counter-evidence →]] 541 +(%%) 542 +))) 543 + 544 +Participants also reported feeling more energetic and experiencing better sleep quality, though these were secondary measures. 
545 +))) 546 + 547 +**Legend:** 548 +* 🟢 = Well-supported claim (confidence ≥75%) 549 +* 🟡 = Uncertain claim (confidence 40-74%) 550 +* 🔴 = Refuted/unsupported claim (confidence <40%) 551 +* Plain text = Non-factual content (context, opinions, recommendations) 552 + 553 +==== Tooltip on Hover/Click ==== 554 + 555 +(% class="box infomessage" %) 556 +((( 557 +**FactHarbor Analysis** 558 + 559 +**Claim:** 560 +"Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease" 561 + 562 +**Verdict:** WELL SUPPORTED 563 +**Confidence:** 87% 564 + 565 +**Evidence Summary:** 566 +* Meta-analysis of 12 RCTs confirms 23-28% risk reduction 567 +* Consistent findings across multiple populations 568 +* Published in peer-reviewed journal (high credibility) 569 + 570 +**Uncertainty Factors:** 571 +* Exact percentage varies by study (20-30% range) 572 + 573 +[[View Full Analysis →]] 574 +))) 575 + 576 +**Color-Coding System**: 577 +* **Green**: Well-supported claims (confidence ≥75%, strong evidence) 578 +* **Yellow/Orange**: Uncertain claims (confidence 40-74%, conflicting or limited evidence) 579 +* **Red**: Refuted or unsupported claims (confidence <40%, contradicted by evidence) 580 +* **Gray/Neutral**: Non-factual content (opinions, questions, procedural text) 581 + 582 +==== Interactive Highlighting Example (Detailed View) ==== 583 + 584 +(% style="width:100%; border-collapse:collapse;" %) 585 +|=**Article Text**|=**Status**|=**Analysis** 586 +|(((A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Context - no highlighting 587 +|(((//Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups//)))|(% style="background-color:#D4EDDA; text-align:center; padding:8px;" %)🟢 **WELL SUPPORTED**|((( 588 +**87% confidence** 589 + 590 +Meta-analysis of 12 
RCTs confirms 23-28% risk reduction 591 + 592 +[[View Full Analysis]] 593 +))) 594 +|(((The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Methodology - no highlighting 595 +|(((//Some experts believe this diet can completely prevent heart attacks//)))|(% style="background-color:#FFF3CD; text-align:center; padding:8px;" %)🟡 **UNCERTAIN**|((( 596 +**45% confidence** 597 + 598 +Overstated - evidence shows risk reduction, not prevention 599 + 600 +[[View Details]] 601 +))) 602 +|(((Dr. Rodriguez recommends incorporating more olive oil, fish, and vegetables into daily meals.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Recommendation - no highlighting 603 +|(((//The study proves that saturated fats cause heart disease//)))|(% style="background-color:#F8D7DA; text-align:center; padding:8px;" %)🔴 **REFUTED**|((( 604 +**15% confidence** 605 + 606 +Claim not supported by study; correlation ≠ causation 607 + 608 +[[View Counter-Evidence]] 609 +))) 610 + 611 +**Design Notes:** 612 +* Highlighted claims use italics to distinguish from plain text 613 +* Color backgrounds match XWiki message box colors (success/warning/error) 614 +* Status column shows verdict prominently 615 +* Analysis column provides quick summary with link to details 616 + 617 +**User Actions**: 618 +* **Hover** over highlighted claim → Tooltip appears 619 +* **Click** highlighted claim → Detailed analysis modal/panel 620 +* **Toggle** button to turn highlighting on/off 621 +* **Keyboard**: Tab through highlighted claims 622 + 623 +**Interaction Design**: 624 +* Hover/click on highlighted claim → Show tooltip with: 625 + * Claim text 626 + * Verdict (e.g., "WELL SUPPORTED") 627 + * Confidence score (e.g., "85%") 628 + * Brief evidence summary 629 + * Link to detailed analysis 630 +* Toggle 
highlighting on/off (user preference) 631 +* Adjustable color intensity for accessibility 632 + 633 +**Technical Requirements**: 634 +* Real-time highlighting as page loads (non-blocking) 635 +* Claim boundary detection (start/end of assertion) 636 +* Handle nested or overlapping claims 637 +* Preserve original article formatting 638 +* Work with various content formats (HTML, plain text, PDFs) 639 + 640 +**Performance Requirements**: 641 +* Highlighting renders within 500ms of page load 642 +* No perceptible delay in reading experience 643 +* Efficient DOM manipulation (avoid reflows) 644 + 645 +**Accessibility**: 646 +* Color-blind friendly palette (use patterns/icons in addition to color) 647 +* Screen reader compatible (ARIA labels for claim credibility) 648 +* Keyboard navigation to highlighted claims 649 + 650 +**Implementation Notes**: 651 +* Claims extracted and analyzed by AKEL during initial processing 652 +* Highlighting data stored as annotations with byte offsets 653 +* Client-side rendering of highlights based on verdict data 654 +* Mobile responsive (tap instead of hover) 655 + 656 +=== 8.5 Workflow & Moderation === 657 + 658 +==== FR9 — Publication Workflow ==== 659 + 660 +**Fulfills**: UN-1 (Fast access to verified content), UN-16 (Clear review status) 661 + 276 276 **Simple flow**: 277 277 1. Claim submitted 278 278 2. AKEL processes (automated) 279 -3. If confidence > threshold: Publish 665 +3. If confidence > threshold: Publish (labeled as AI-generated) 280 280 4. If confidence < threshold: Flag for improvement 281 281 5. 
If risk score > threshold: Flag for moderator 668 + 282 282 **No multi-stage approval process** 283 -=== 11.2 FR10 – Moderation === 670 + 671 +==== FR10 — Moderation ==== 672 + 284 284 **Focus on abuse, not routine quality**: 285 285 * Automated abuse detection 286 286 * Moderators handle flags 287 287 * Quick response to harmful content 288 288 * Minimal involvement in routine content 289 -=== 11.3 FR11 – Audit Trail === 678 + 679 +==== FR11 — Audit Trail ==== 680 + 681 +**Fulfills**: UN-14 (API access to histories), UN-15 (Evolution tracking) 682 + 290 290 * All edits logged 291 291 * Version history public 292 292 * Moderation decisions documented 293 293 * System improvements tracked 294 -== 12. Technical Requirements == 295 -=== 12.1 NFR1 – Performance === 687 + 688 +== 9. Non-Functional Requirements == 689 + 690 +=== 9.1 NFR1 — Performance === 691 + 692 +**Fulfills**: UN-4 (Fast fact-checking), UN-11 (Responsive filtering) 693 + 296 296 * Claim processing: < 30 seconds 297 297 * Search response: < 2 seconds 298 298 * Page load: < 3 seconds 299 299 * 99% uptime 300 -=== 12.2 NFR2 – Scalability === 698 + 699 +=== 9.2 NFR2 — Scalability === 700 + 701 +**Fulfills**: UN-14 (API access at scale) 702 + 301 301 * Handle 10,000 claims initially 302 302 * Scale to 1M+ claims 303 303 * Support 100K+ concurrent users 304 304 * Automated processing scales linearly 305 -=== 12.3 NFR3 – Transparency === 707 + 708 +=== 9.3 NFR3 — Transparency === 709 + 710 +**Fulfills**: UN-7 (Evidence transparency), UN-9 (Methodology transparency), UN-13 (Citable verdicts), UN-15 (Evolution visibility) 711 + 306 306 * All algorithms open source 307 307 * All data exportable 308 308 * All decisions documented 309 309 * Quality metrics public 310 -=== 12.4 NFR4 – Security & Privacy === 716 + 717 +=== 9.4 NFR4 — Security & Privacy === 718 + 311 311 * Follow [[Privacy Policy>>FactHarbor.Organisation.How-We-Work-Together.Privacy-Policy]] 312 312 * Secure authentication 313 313 * Data encryption 
314 314 * Regular security audits 315 -=== 12.5 NFR5 – Maintainability === 723 + 724 +=== 9.5 NFR5 — Maintainability === 725 + 316 316 * Modular architecture 317 317 * Automated testing 318 318 * Continuous integration 319 319 * Comprehensive documentation 320 -== 13. MVP Scope == 321 -**Phase 1 (Months 1-3): Read-Only MVP** 322 -Build: 323 -* Automated claim analysis 324 -* Confidence scoring 325 -* Source evaluation 326 -* Browse/search interface 327 -* User flagging system 328 -**Goal**: Prove AI quality before adding user editing 329 -**Phase 2 (Months 4-6): User Contributions** 330 -Add only if needed: 331 -* Simple editing (Wikipedia-style) 332 -* Reputation system 333 -* Basic moderation 334 -**Phase 3 (Months 7-12): Refinement** 335 -* Continuous quality improvement 336 -* Feature additions based on real usage 337 -* Scale infrastructure 338 -**Deferred**: 339 -* Federation (until multiple successful instances exist) 340 -* Complex contribution workflows (focus on automation) 341 -* Extensive role hierarchy (keep simple) 342 -== 14. Success Metrics == 343 -**System Quality** (track weekly): 344 -* Error rate by category (target: -10%/month) 345 -* Average confidence score (target: increase) 346 -* Source quality distribution (target: more high-quality) 347 -* Contradiction detection rate (target: increase) 348 -**Efficiency** (track monthly): 349 -* Claims processed per hour (target: increase) 350 -* Human hours per claim (target: decrease) 351 -* Automation coverage (target: >90%) 352 -* Re-work rate (target: <5%) 353 -**User Satisfaction** (track quarterly): 354 -* User flag rate (issues found) 355 -* Correction acceptance rate (flags valid) 356 -* Return user rate 357 -* Trust indicators (surveys) 358 -== 15. 
Related Pages == 359 -* [[Architecture>>FactHarbor.Specification.Architecture.WebHome]] 360 -* [[Data Model>>FactHarbor.Specification.Data Model.WebHome]] 361 -* [[Workflows>>FactHarbor.Specification.Workflows.WebHome]] 730 + 731 +=== NFR11: AKEL Quality Assurance Framework === 732 + 733 +**Fulfills:** AI safety, IFCN methodology transparency 734 + 735 +**Specification:** 736 + 737 +Multi-layer AI quality gates to detect hallucinations, low-confidence results, and logical inconsistencies. 738 + 739 +==== Quality Gate 1: Claim Extraction Validation ==== 740 + 741 +**Purpose:** Ensure extracted claims are factual assertions (not opinions/predictions) 742 + 743 +**Checks:** 744 +1. **Factual Statement Test:** Is this verifiable? (Yes/No) 745 +2. **Opinion Detection:** Contains hedging language? ("I think", "probably", "best") 746 +3. **Future Prediction Test:** Makes claims about future events? 747 +4. **Specificity Score:** Contains specific entities, numbers, dates? 748 + 749 +**Thresholds:** 750 +* Factual: Must be "Yes" 751 +* Opinion markers: <2 hedging phrases 752 +* Specificity: ≥3 specific elements 753 + 754 +**Action if Failed:** Flag as "Non-verifiable", do NOT generate verdict 755 + 756 +==== Quality Gate 2: Evidence Relevance Validation ==== 757 + 758 +**Purpose:** Ensure AI-linked evidence actually relates to claim 759 + 760 +**Checks:** 761 +1. **Semantic Similarity Score:** Evidence vs. claim (embeddings) 762 +2. **Entity Overlap:** Shared people/places/things? 763 +3. **Topic Relevance:** Discusses claim subject? 764 + 765 +**Thresholds:** 766 +* Similarity: ≥0.6 (cosine similarity) 767 +* Entity overlap: ≥1 shared entity 768 +* Topic relevance: ≥0.5 769 + 770 +**Action if Failed:** Discard irrelevant evidence 771 + 772 +==== Quality Gate 3: Scenario Coherence Check ==== 773 + 774 +**Purpose:** Validate scenario assumptions are logical and complete 775 + 776 +**Checks:** 777 +1. **Completeness:** All required fields populated 778 +2. 
**Internal Consistency:** Assumptions don't contradict 779 +3. **Distinguishability:** Scenarios meaningfully different 780 + 781 +**Thresholds:** 782 +* Required fields: 100% 783 +* Contradiction score: <0.3 784 +* Scenario similarity: <0.8 785 + 786 +**Action if Failed:** Merge duplicates, reduce confidence -20% 787 + 788 +==== Quality Gate 4: Verdict Confidence Assessment ==== 789 + 790 +**Purpose:** Only publish high-confidence verdicts 791 + 792 +**Checks:** 793 +1. **Evidence Count:** Minimum 2 sources 794 +2. **Source Quality:** Average reliability ≥0.6 795 +3. **Evidence Agreement:** Supporting vs. contradicting ≥0.6 796 +4. **Uncertainty Factors:** Hedging in reasoning 797 + 798 +**Confidence Tiers:** 799 +* **HIGH (80-100%):** ≥3 sources, ≥0.7 quality, ≥80% agreement 800 +* **MEDIUM (50-79%):** ≥2 sources, ≥0.6 quality, ≥60% agreement 801 +* **LOW (0-49%):** <2 sources OR low quality/agreement 802 +* **INSUFFICIENT:** <2 sources → DO NOT PUBLISH 803 + 804 +**Implementation Phases:** 805 +* **POC1:** Gates 1 & 4 only (basic validation) 806 +* **POC2:** All 4 gates (complete framework) 807 +* **V1.0:** Hardened with <5% hallucination rate 808 + 809 +**Acceptance Criteria:** 810 +* ✅ All gates operational 811 +* ✅ Hallucination rate <5% 812 +* ✅ Quality metrics public 813 + 814 +=== NFR12: Security Controls === 815 + 816 +**Fulfills:** Data protection, system integrity, user privacy, production readiness 817 + 818 +**Purpose:** Protect FactHarbor systems, user data, and operations from security threats, ensuring production-grade security posture. 
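As an illustration of the Quality Gate 4 confidence tiers defined under NFR11 above, the following is a minimal Python sketch, not a definitive implementation. The names `EvidenceSummary`, `confidence_tier`, and `may_publish` are hypothetical. It assumes that the INSUFFICIENT rule (<2 sources) takes precedence over LOW, and that "agreement" is the share of supporting evidence on a 0-1 scale.

```python
from dataclasses import dataclass


@dataclass
class EvidenceSummary:
    """Hypothetical container for the Gate 4 inputs."""
    source_count: int    # number of distinct sources
    avg_quality: float   # average source reliability, 0-1
    agreement: float     # share of supporting evidence, 0-1 (assumed meaning)


def confidence_tier(e: EvidenceSummary) -> str:
    """Map an evidence summary to a Gate 4 tier name."""
    # Assumption: INSUFFICIENT is checked first, so it wins over LOW.
    if e.source_count < 2:
        return "INSUFFICIENT"
    if e.source_count >= 3 and e.avg_quality >= 0.7 and e.agreement >= 0.8:
        return "HIGH"
    # At this point source_count >= 2 already holds.
    if e.avg_quality >= 0.6 and e.agreement >= 0.6:
        return "MEDIUM"
    return "LOW"


def may_publish(tier: str) -> bool:
    """Only the explicit DO NOT PUBLISH rule is encoded here."""
    return tier != "INSUFFICIENT"
```

Whether LOW-tier verdicts are published is not stated in the tier list, so `may_publish` blocks only the INSUFFICIENT case; a stricter policy would change that one function.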
819 + 820 +**Specification:** 821 + 822 +==== API Security ==== 823 + 824 +**Rate Limiting:** 825 +* **Analysis endpoints:** 100 requests/hour per IP 826 +* **Read endpoints:** 1,000 requests/hour per IP 827 +* **Search:** 500 requests/hour per IP 828 +* **Authenticated users:** 5x higher limits 829 +* **Burst protection:** Max 10 requests/second 830 + 831 +**Authentication & Authorization:** 832 +* **API Keys:** Required for programmatic access 833 +* **JWT tokens:** For user sessions (1-hour expiry) 834 +* **OAuth2:** For third-party integrations 835 +* **Role-Based Access Control (RBAC):** 836 + * Public: Read-only access to published claims 837 + * Contributor: Submit claims, provide evidence 838 + * Moderator: Review contributions, manage quality 839 + * Admin: System configuration, user management 840 + 841 +**CORS Policies:** 842 +* Whitelist approved domains only 843 +* No wildcard origins in production 844 +* Credentials required for sensitive endpoints 845 + 846 +**Input Sanitization:** 847 +* Validate all user input against schemas 848 +* Sanitize HTML/JavaScript in text submissions 849 +* Prevent SQL injection (use parameterized queries) 850 +* Prevent command injection (no shell execution of user input) 851 +* Max request size: 10MB 852 +* File upload restrictions: Whitelist file types, scan for malware 853 + 854 +--- 855 + 856 +==== Data Security ==== 857 + 858 +**Encryption at Rest:** 859 +* Database encryption using AES-256 860 +* Encrypted backups 861 +* Key management via cloud provider KMS (AWS KMS, Google Cloud KMS) 862 +* Regular key rotation (90-day cycle) 863 + 864 +**Encryption in Transit:** 865 +* HTTPS/TLS 1.3 only (no TLS 1.0/1.1) 866 +* Strong cipher suites only 867 +* HSTS (HTTP Strict Transport Security) enabled 868 +* Certificate pinning for mobile apps 869 + 870 +**Secure Credential Storage:** 871 +* Passwords hashed with bcrypt (cost factor 12+) 872 +* API keys encrypted in database 873 +* Secrets stored in environment variables 
(never in code) 874 +* Use secrets manager (AWS Secrets Manager, HashiCorp Vault) 875 + 876 +**Data Privacy:** 877 +* Minimal data collection (privacy by design) 878 +* User data deletion on request (GDPR compliance) 879 +* PII encryption in database 880 +* Anonymize logs (no PII in log files) 881 + 882 +--- 883 + 884 +==== Application Security ==== 885 + 886 +**OWASP Top 10 Compliance:** 887 + 888 +1. **Broken Access Control:** RBAC implementation, path traversal prevention 889 +2. **Cryptographic Failures:** Strong encryption, secure key management 890 +3. **Injection:** Parameterized queries, input validation 891 +4. **Insecure Design:** Security review of all features 892 +5. **Security Misconfiguration:** Hardened defaults, security headers 893 +6. **Vulnerable Components:** Dependency scanning (see below) 894 +7. **Authentication Failures:** Strong password policy, MFA support 895 +8. **Data Integrity Failures:** Signature verification, checksums 896 +9. **Security Logging Failures:** Comprehensive audit logs 897 +10. 
**Server-Side Request Forgery:** URL validation, whitelist domains 898 + 899 +**Security Headers:** 900 +* `Content-Security-Policy`: Strict CSP to prevent XSS 901 +* `X-Frame-Options`: DENY (prevent clickjacking) 902 +* `X-Content-Type-Options`: nosniff 903 +* `Referrer-Policy`: strict-origin-when-cross-origin 904 +* `Permissions-Policy`: Restrict browser features 905 + 906 +**Dependency Vulnerability Scanning:** 907 +* **Tools:** Snyk, Dependabot, npm audit, pip-audit 908 +* **Frequency:** Daily automated scans 909 +* **Action:** Patch critical vulnerabilities within 24 hours 910 +* **Policy:** No known high/critical CVEs in production 911 + 912 +**Security Audits:** 913 +* **Internal:** Quarterly security reviews 914 +* **External:** Annual penetration testing by certified firm 915 +* **Bug Bounty:** Public bug bounty program (V1.1+) 916 +* **Compliance:** SOC 2 Type II certification target (V1.5) 917 + 918 +--- 919 + 920 +==== Operational Security ==== 921 + 922 +**DDoS Protection:** 923 +* CloudFlare or AWS Shield 924 +* Rate limiting at CDN layer 925 +* Automatic IP blocking for abuse patterns 926 + 927 +**Monitoring & Alerting:** 928 +* Real-time security event monitoring 929 +* Alerts for: 930 + * Failed login attempts (>5 in 10 minutes) 931 + * API abuse patterns 932 + * Unusual data access patterns 933 + * Security scan detections 934 +* Integration with SIEM (Security Information and Event Management) 935 + 936 +**Incident Response:** 937 +* Documented incident response plan 938 +* Security incident classification (P1-P4) 939 +* On-call rotation for security issues 940 +* Post-mortem for all security incidents 941 +* Public disclosure policy (coordinated disclosure) 942 + 943 +**Backup & Recovery:** 944 +* Daily encrypted backups 945 +* 30-day retention period 946 +* Tested recovery procedures (quarterly) 947 +* Disaster recovery plan (RTO: 4 hours, RPO: 1 hour) 948 + 949 +--- 950 + 951 +==== Compliance & Standards ==== 952 + 953 +**GDPR Compliance:** 
954 +* User consent management 955 +* Right to access data 956 +* Right to deletion 957 +* Data portability 958 +* Privacy policy published 959 + 960 +**Accessibility:** 961 +* WCAG 2.1 AA compliance 962 +* Screen reader compatibility 963 +* Keyboard navigation 964 +* Alt text for images 965 + 966 +**Browser Support:** 967 +* Modern browsers only (Chrome/Edge/Firefox/Safari latest 2 versions) 968 +* No IE11 support 969 + 970 +**Acceptance Criteria:** 971 + 972 +* ✅ Passes OWASP ZAP security scan (no high/critical findings) 973 +* ✅ All dependencies with known vulnerabilities patched 974 +* ✅ Penetration test completed with no critical findings 975 +* ✅ Rate limiting blocks abuse attempts 976 +* ✅ Encryption at rest and in transit verified 977 +* ✅ Security headers scored A+ on securityheaders.com 978 +* ✅ Incident response plan documented and tested 979 +* ✅ 95% uptime over 30-day period 980 + 981 +=== NFR13: Quality Metrics Transparency === 982 + 983 +**Fulfills:** User trust, transparency, continuous improvement, IFCN methodology transparency 984 + 985 +**Purpose:** Provide transparent, measurable quality metrics that demonstrate AKEL's performance and build user trust in automated fact-checking. 986 + 987 +**Specification:** 988 + 989 +==== Component: Public Quality Dashboard ==== 990 + 991 +**Core Metrics to Display:** 992 + 993 +**1. 
Verdict Quality Metrics** 994 + 995 +**TIGERScore (Fact-Checking Quality):** 996 +* **Definition:** Measures how well generated verdicts match expert fact-checker judgments 997 +* **Scale:** 0-100 (higher is better) 998 +* **Calculation:** Using the TIGERScore framework (a trained, instruction-guided metric that produces an explainable error analysis with penalty scores) 999 +* **Target:** Average ≥80 for production release 1000 +* **Display:** 1001 +{{code}} 1002 +Verdict Quality (TIGERScore): 1003 +Overall: 84.2 ▲ (+2.1 from last month) 1004 + 1005 +Distribution: 1006 + Excellent (>80): 67% 1007 + Good (60-80): 28% 1008 + Needs Improvement (<60): 5% 1009 + 1010 +Trend: [Graph showing improvement over time] 1011 +{{/code}} 1012 + 1013 +**2. Hallucination & Faithfulness Metrics** 1014 + 1015 +**AlignScore (Faithfulness to Evidence):** 1016 +* **Definition:** Measures how well verdicts align with actual evidence content 1017 +* **Scale:** 0-1 (higher is better) 1018 +* **Purpose:** Detect AI hallucinations (making claims not supported by evidence) 1019 +* **Target:** Average ≥0.85, hallucination rate <5% 1020 +* **Display:** 1021 +{{code}} 1022 +Evidence Faithfulness (AlignScore): 1023 +Average: 0.87 ▼ (-0.02 from last month) 1024 + 1025 +Hallucination Rate: 4.2% 1026 + - Claims without evidence support: 3.1% 1027 + - Misrepresented evidence: 1.1% 1028 + 1029 +Action: Prompt engineering review scheduled 1030 +{{/code}} 1031 + 1032 +**3. 
Evidence Quality Metrics** 1033 + 1034 +**Source Reliability:** 1035 +* Average source quality score (0-1 scale) 1036 +* Distribution of high/medium/low quality sources 1037 +* Publisher track record trends 1038 + 1039 +**Evidence Coverage:** 1040 +* Average number of sources per claim 1041 +* Percentage of claims with ≥2 sources (EFCSN minimum) 1042 +* Geographic diversity of sources 1043 + 1044 +**Display:** 1045 +{{code}} 1046 +Evidence Quality: 1047 + 1048 +Average Sources per Claim: 4.2 1049 +Claims with ≥2 sources: 94% (EFCSN compliant) 1050 + 1051 +Source Quality Distribution: 1052 + High quality (>0.8): 48% 1053 + Medium quality (0.5-0.8): 43% 1054 + Low quality (<0.5): 9% 1055 + 1056 +Geographic Diversity: 23 countries represented 1057 +{{/code}} 1058 + 1059 +**4. Contributor Consensus Metrics** (when human reviewers involved) 1060 + 1061 +**Inter-Rater Reliability (IRR):** 1062 +* **Calculation:** Cohen's Kappa or Fleiss' Kappa for multiple raters 1063 +* **Scale:** 0-1 (higher is better) 1064 +* **Interpretation:** 1065 + * >0.8: Almost perfect agreement 1066 + * 0.6-0.8: Substantial agreement 1067 + * 0.4-0.6: Moderate agreement 1068 + * <0.4: Poor agreement 1069 +* **Target:** Maintain ≥0.7 (substantial agreement) 1070 + 1071 +**Display:** 1072 +{{code}} 1073 +Contributor Consensus: 1074 + 1075 +Inter-Rater Reliability (IRR): 0.73 (Substantial agreement) 1076 + - Verdict agreement: 78% 1077 + - Evidence quality agreement: 71% 1078 + - Scenario structure agreement: 69% 1079 + 1080 +Cases requiring moderator review: 12 1081 +Moderator override rate: 8% 1082 +{{/code}} 1083 + 1084 +--- 1085 + 1086 +==== Quality Dashboard Implementation ==== 1087 + 1088 +**Dashboard Location:** `/quality-metrics` 1089 + 1090 +**Update Frequency:** 1091 +* **POC2:** Weekly manual updates 1092 +* **Beta 0:** Daily automated updates 1093 +* **V1.0:** Real-time metrics (updated hourly) 1094 + 1095 +**Dashboard Sections:** 1096 + 1097 +1. 
**Overview:** Key metrics at a glance 1098 +2. **Verdict Quality:** TIGERScore trends and distributions 1099 +3. **Evidence Analysis:** Source quality and coverage 1100 +4. **AI Performance:** Hallucination rates, AlignScore 1101 +5. **Human Oversight:** Contributor consensus, review rates 1102 +6. **System Health:** Processing times, error rates, uptime 1103 + 1104 +**Example Dashboard Layout:** 1105 + 1106 +{{code}} 1107 +┌─────────────────────────────────────────────────────────────┐ 1108 +│ FactHarbor Quality Metrics Last updated: │ 1109 +│ Public Dashboard 2 hours ago │ 1110 +└─────────────────────────────────────────────────────────────┘ 1111 + 1112 +📊 KEY METRICS 1113 +───────────────────────────────────────────────────────────── 1114 +TIGERScore (Verdict Quality): 84.2 ▲ (+2.1) 1115 +AlignScore (Faithfulness): 0.87 ▼ (-0.02) 1116 +Hallucination Rate: 4.2% ✓ (Target: <5%) 1117 +Average Sources per Claim: 4.2 ▲ (+0.3) 1118 + 1119 +📈 TRENDS (30 days) 1120 +───────────────────────────────────────────────────────────── 1121 +[Graph: TIGERScore trending upward] 1122 +[Graph: Hallucination rate declining] 1123 +[Graph: Evidence quality stable] 1124 + 1125 +⚠️ IMPROVEMENT TARGETS 1126 +───────────────────────────────────────────────────────────── 1127 +1. Reduce hallucination rate to <3% (Current: 4.2%) 1128 +2. Increase TIGERScore average to >85 (Current: 84.2) 1129 +3. Maintain IRR >0.75 (Current: 0.73) 1130 + 1131 +📄 DETAILED REPORTS 1132 +───────────────────────────────────────────────────────────── 1133 +• Monthly Quality Report (PDF) 1134 +• Methodology Documentation 1135 +• AKEL Performance Analysis 1136 +• Contributor Agreement Analysis 1137 + 1138 +{{/code}} 1139 + 1140 +--- 1141 + 1142 +==== Continuous Improvement Feedback Loop ==== 1143 + 1144 +**How Metrics Inform AKEL Improvements:** 1145 + 1146 +1. 
**Identify Weak Areas:** 1147 + * Low TIGERScore → Review prompt engineering 1148 + * High hallucination → Strengthen evidence grounding 1149 + * Low IRR → Clarify evaluation criteria 1150 + 1151 +2. **A/B Testing Integration:** 1152 + * Test prompt variations 1153 + * Measure impact on quality metrics 1154 + * Deploy winners automatically 1155 + 1156 +3. **Alert Thresholds:** 1157 + * TIGERScore drops below 75 → Alert team 1158 + * Hallucination rate exceeds 7% → Pause auto-publishing 1159 + * IRR below 0.6 → Moderator training needed 1160 + 1161 +4. **Monthly Quality Reviews:** 1162 + * Analyze trends 1163 + * Identify systematic issues 1164 + * Plan prompt improvements 1165 + * Update AKEL models 1166 + 1167 +--- 1168 + 1169 +==== Metric Calculation Details ==== 1170 + 1171 +**TIGERScore Implementation:** 1172 +* Reference: https://github.com/TIGER-AI-Lab/TIGERScore 1173 +* Input: Generated verdict + reference verdict (from expert) 1174 +* Output: 0-100 score (aggregated from the metric's per-error penalty scores) 1175 +* Requires: Test set of expert-reviewed claims (minimum 100) 1176 + 1177 +**AlignScore Implementation:** 1178 +* Reference: https://github.com/yuh-zha/AlignScore 1179 +* Input: Generated verdict + source evidence text 1180 +* Output: 0-1 faithfulness score 1181 +* Calculation: Semantic alignment between claim and evidence 1182 + 1183 +**Source Quality Scoring:** 1184 +* Use existing source reliability database (e.g., NewsGuard, MBFC) 1185 +* Factor in: Publication history, corrections record, transparency 1186 +* Scale: 0-1 (weighted average across sources) 1187 + 1188 +--- 1189 + 1190 +==== Integration Points ==== 1191 + 1192 +* **NFR11: AKEL Quality Assurance** - Metrics validate quality gate effectiveness 1193 +* **FR49: A/B Testing** - Metrics measure test success 1194 +* **FR11: Audit Trail** - Source of quality data 1195 +* **NFR3: Transparency** - Public metrics build trust 1196 + 1197 +**Acceptance Criteria:** 1198 + 1199 +* ✅ All core metrics implemented and calculating 
correctly 1200 +* ✅ Dashboard updates daily (Beta 0) or hourly (V1.0) 1201 +* ✅ Alerts trigger when metrics degrade beyond thresholds 1202 +* ✅ Monthly quality report auto-generates 1203 +* ✅ Dashboard is publicly accessible (no login required) 1204 +* ✅ Mobile-responsive dashboard design 1205 +* ✅ Metrics inform quarterly AKEL improvement planning 1206 + 1207 +== 13. Requirements Traceability == 1208 + 1209 +For full traceability matrix showing which requirements fulfill which user needs, see: 1210 + 1211 +* [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] - Section 8 includes comprehensive mapping tables 1212 + 1213 +== 14. Related Pages == 1214 + 1215 +**Non-Functional Requirements (see Section 9):** 1216 +* [[NFR11 — AKEL Quality Assurance Framework>>#NFR11]] 1217 +* [[NFR12 — Security Controls>>#NFR12]] 1218 +* [[NFR13 — Quality Metrics Transparency>>#NFR13]] 1219 + 1220 +**Other Requirements:** 1221 +* [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] 1222 +* [[V1.0 Requirements>>FactHarbor.Specification.Requirements.V10.]] 1223 +* [[Gap Analysis>>FactHarbor.Specification.Requirements.GapAnalysis]] 1224 + 1225 +* **[[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]]** - What users need (drives these requirements) 1226 +* [[Architecture>>FactHarbor.Specification.Architecture.WebHome]] - How requirements are implemented 1227 +* [[Data Model>>FactHarbor.Specification.Data Model.WebHome]] - Data structures supporting requirements 1228 +* [[Workflows>>FactHarbor.Specification.Workflows.WebHome]] - User interaction workflows 1229 +* [[AKEL>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] - AI system fulfilling automation requirements 362 362 * [[Global Rules>>FactHarbor.Organisation.How-We-Work-Together.GlobalRules.WebHome]] 363 363 * [[Privacy Policy>>FactHarbor.Organisation.How-We-Work-Together.Privacy-Policy]] 1232 + 1233 += V0.9.70 Additional Requirements = 1234 + 1235 
+== Functional Requirements (Additional) == 1236 + 1237 +=== FR44: ClaimReview Schema Implementation === 1238 + 1239 +**Fulfills:** UN-13 (Cite FactHarbor Verdicts), UN-14 (API Access for Integration), UN-26 (Search Engine Visibility) 1240 + 1241 +**Purpose:** Generate valid ClaimReview structured data for every published analysis to enable Google/Bing search visibility and fact-check discovery. 1242 + 1243 +**Specification:** 1244 + 1245 +==== Component: Schema.org Markup Generator ==== 1246 + 1247 +FactHarbor must generate valid ClaimReview structured data following Schema.org specifications for every published claim analysis. 1248 + 1249 +**Required JSON-LD Schema:** 1250 + 1251 +{{code language="json"}} 1252 +{ 1253 + "@context": "https://schema.org", 1254 + "@type": "ClaimReview", 1255 + "datePublished": "YYYY-MM-DD", 1256 + "url": "https://factharbor.org/claims/{claim_id}", 1257 + "claimReviewed": "The exact claim text", 1258 + "author": { 1259 + "@type": "Organization", 1260 + "name": "FactHarbor", 1261 + "url": "https://factharbor.org" 1262 + }, 1263 + "reviewRating": { 1264 + "@type": "Rating", 1265 + "ratingValue": "1-5", 1266 + "bestRating": "5", 1267 + "worstRating": "1", 1268 + "alternateName": "FactHarbor likelihood score" 1269 + }, 1270 + "itemReviewed": { 1271 + "@type": "Claim", 1272 + "author": { 1273 + "@type": "Person", 1274 + "name": "Claim author if known" 1275 + }, 1276 + "datePublished": "YYYY-MM-DD if known", 1277 + "appearance": { 1278 + "@type": "CreativeWork", 1279 + "url": "Original claim URL if from article" 1280 + } 1281 + } 1282 +} 1283 +{{/code}} 1284 + 1285 +**FactHarbor-Specific Mapping:** 1286 + 1287 +**Likelihood Score to Rating Scale:** 1288 +* 80-100% likelihood → 5 (Highly Supported) 1289 +* 60-79% likelihood → 4 (Supported) 1290 +* 40-59% likelihood → 3 (Mixed/Uncertain) 1291 +* 20-39% likelihood → 2 (Questionable) 1292 +* 0-19% likelihood → 1 (Refuted) 1293 + 1294 +**Multiple Scenarios Handling:** 1295 +* If claim has 
multiple scenarios with different verdicts, generate **separate ClaimReview** for each scenario 1296 +* Add `disambiguatingDescription` field explaining scenario context 1297 +* Example: "Scenario: If interpreted as referring to 2023 data..." 1298 + 1299 +==== Implementation Requirements ==== 1300 + 1301 +1. **Auto-generate** on claim publication 1302 +2. **Embed** in HTML `<head>` section as JSON-LD script 1303 +3. **Validate** against Schema.org validator before publishing 1304 +4. **Submit** to Google Search Console for indexing 1305 +5. **Update** automatically when verdict changes (integrate with FR8: Time Evolution) 1306 + 1307 +==== Integration Points ==== 1308 + 1309 +* **FR7: Automated Verdicts** - Source of rating data and claim text 1310 +* **FR8: Time Evolution** - Triggers schema updates when verdicts change 1311 +* **FR11: Audit Trail** - Logs all schema generation and update events 1312 + 1313 +==== Resources ==== 1314 + 1315 +* ClaimReview Project: https://www.claimreviewproject.com 1316 +* Schema.org ClaimReview: https://schema.org/ClaimReview 1317 +* Google Fact Check Guidelines: https://developers.google.com/search/docs/appearance/fact-check 1318 + 1319 +**Acceptance Criteria:** 1320 + 1321 +* ✅ Passes Google Structured Data Testing Tool 1322 +* ✅ Appears in Google Fact Check Explorer within 48 hours of publication 1323 +* ✅ Valid JSON-LD syntax (no errors) 1324 +* ✅ All required fields populated with correct data types 1325 +* ✅ Handles multi-scenario claims correctly (separate ClaimReview per scenario) 1326 + 1327 +=== FR45: User Corrections Notification System === 1328 + 1329 +**Fulfills:** IFCN Principle 5 (Open & Honest Corrections), EFCSN compliance 1330 + 1331 +**Purpose:** When any claim analysis is corrected, notify users who previously viewed the claim to maintain transparency and build trust. 
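The FR44 likelihood-to-rating mapping and JSON-LD embedding described above could be sketched roughly as follows. This is a hedged illustration, not the project's generator: the function names are hypothetical, the optional `itemReviewed` block is omitted for brevity, fractional likelihoods are assumed to fall into the band whose lower bound they reach, and using the band label as `alternateName` is an assumption (the schema above shows a fixed placeholder string there).

```python
import json

# (lower bound in percent, ratingValue, label), highest band first,
# taken from the FR44 likelihood-to-rating table.
RATING_BANDS = [
    (80, 5, "Highly Supported"),
    (60, 4, "Supported"),
    (40, 3, "Mixed/Uncertain"),
    (20, 2, "Questionable"),
    (0, 1, "Refuted"),
]


def likelihood_to_rating(likelihood_pct: float) -> tuple[int, str]:
    """Return (ratingValue, label) for a likelihood given in percent."""
    if not 0 <= likelihood_pct <= 100:
        raise ValueError("likelihood must be between 0 and 100")
    for lower_bound, value, label in RATING_BANDS:
        if likelihood_pct >= lower_bound:
            return value, label
    raise AssertionError("unreachable: the last band starts at 0")


def build_claim_review(claim_id: str, claim_text: str,
                       likelihood_pct: float, date_published: str) -> dict:
    """Build a ClaimReview dict (hypothetical helper, required fields only)."""
    value, label = likelihood_to_rating(likelihood_pct)
    return {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "datePublished": date_published,
        "url": f"https://factharbor.org/claims/{claim_id}",
        "claimReviewed": claim_text,
        "author": {
            "@type": "Organization",
            "name": "FactHarbor",
            "url": "https://factharbor.org",
        },
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": str(value),
            "bestRating": "5",
            "worstRating": "1",
            "alternateName": label,  # assumption: band label as textual rating
        },
    }


def to_json_ld_script(review: dict) -> str:
    """Serialize for embedding in the HTML <head> section."""
    # Escape "</" so claim text cannot close the script element early.
    payload = json.dumps(review).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'
```

For multi-scenario claims, the same helper would be called once per scenario, with a `disambiguatingDescription` added to each resulting dict.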
1332 + 1333 +**Specification:** 1334 + 1335 +==== Component: Corrections Visibility Framework ==== 1336 + 1337 +**Correction Types:** 1338 + 1339 +1. **Major Correction:** Verdict changes category (e.g., "Supported" → "Refuted") 1340 +2. **Significant Correction:** Likelihood score changes >20% 1341 +3. **Minor Correction:** Evidence additions, source quality updates 1342 +4. **Scenario Addition:** New scenario added to existing claim 1343 + 1344 +==== Notification Mechanisms ==== 1345 + 1346 +**1. In-Page Banner:** 1347 + 1348 +Display prominent banner on claim page: 1349 + 1350 +{{code}} 1351 +[!] CORRECTION NOTICE 1352 +This analysis was updated on [DATE]. [View what changed] [Dismiss] 1353 + 1354 +Major changes: 1355 +• Verdict changed from "Likely True (75%)" to "Uncertain (45%)" 1356 +• New contradicting evidence added from [Source] 1357 +• Scenario 2 updated with additional context 1358 + 1359 +[See full correction log] 1360 +{{/code}} 1361 + 1362 +**2. Correction Log Page:** 1363 + 1364 +* Public changelog at `/claims/{id}/corrections` 1365 +* Displays for each correction: 1366 + * Date/time of correction 1367 + * What changed (before/after comparison) 1368 + * Why changed (reason if provided) 1369 + * Who made change (AKEL auto-update vs. contributor override) 1370 + 1371 +**3. Email Notifications (opt-in):** 1372 + 1373 +* Send to users who bookmarked or shared the claim 1374 +* Subject: "FactHarbor Correction: [Claim title]" 1375 +* Include summary of changes 1376 +* Link to updated analysis 1377 + 1378 +**4. 
RSS/API Feed:** 1379 + 1380 +* Corrections feed at `/corrections.rss` 1381 +* API endpoint: `GET /api/corrections?since={timestamp}` 1382 +* Enables external monitoring by journalists and researchers 1383 + 1384 +==== Display Rules ==== 1385 + 1386 +* Show banner on **ALL pages** displaying the claim (search results, related claims, embeddings) 1387 +* Banner persists for **30 days** after correction 1388 +* **"Corrections" count badge** on claim card 1389 +* **Timestamp** on every verdict: "Last updated: [datetime]" 1390 + 1391 +==== IFCN Compliance Requirements ==== 1392 + 1393 +* Corrections policy published at `/corrections-policy` 1394 +* User can report suspected errors via `/report-error/{claim_id}` 1395 +* Link to IFCN complaint process (if FactHarbor becomes signatory) 1396 +* **Scrupulous transparency:** Never silently edit analyses 1397 + 1398 +==== Integration Points ==== 1399 + 1400 +* **FR8: Time Evolution** - Triggers corrections when verdicts change 1401 +* **FR11: Audit Trail** - Source of correction data and change history 1402 +* **NFR3: Transparency** - Public correction log demonstrates commitment 1403 + 1404 +**Acceptance Criteria:** 1405 + 1406 +* ✅ Banner appears within 60 seconds of correction 1407 +* ✅ Correction log is permanent and publicly accessible 1408 +* ✅ Email notifications deliver within 5 minutes 1409 +* ✅ RSS feed updates in real-time 1410 +* ✅ Mobile-responsive banner design 1411 +* ✅ Accessible (screen reader compatible) 1412 + 1413 +=== FR46: Image Verification System === 1414 + 1415 +**Fulfills:** UN-27 (Visual Claim Verification) 1416 + 1417 +**Purpose:** Verify authenticity and context of images shared with claims to detect manipulation, misattribution, and out-of-context usage. 
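The FR45 correction types and the 30-day banner rule above could be sketched as follows. This is a minimal illustration under stated assumptions: `VerdictSnapshot`, `classify_correction`, and `banner_visible` are hypothetical names, the ">20%" threshold is read as percentage points, and the precedence order (major, then significant, then scenario addition, then minor) is an assumption the specification does not spell out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

BANNER_DAYS = 30  # banner persists for 30 days after a correction (FR45)


@dataclass
class VerdictSnapshot:
    """Hypothetical before/after state of a claim analysis."""
    category: str          # e.g. "Supported", "Refuted"
    likelihood_pct: float  # 0-100
    scenario_count: int


def classify_correction(before: VerdictSnapshot, after: VerdictSnapshot) -> str:
    """Classify a change using the four FR45 correction types."""
    if before.category != after.category:
        return "major"
    # Assumption: ">20%" means more than 20 percentage points.
    if abs(after.likelihood_pct - before.likelihood_pct) > 20:
        return "significant"
    if after.scenario_count > before.scenario_count:
        return "scenario_addition"
    return "minor"


def banner_visible(corrected_at: datetime, now: datetime) -> bool:
    """True while the correction banner should still be shown."""
    return corrected_at <= now <= corrected_at + timedelta(days=BANNER_DAYS)
```

With these assumptions, the banner example from FR45 ("Likely True (75%)" changed to "Uncertain (45%)") is classified as a major correction, because the verdict category changes.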
1418 + 1419 +**Specification:** 1420 + 1421 +==== Component: Multi-Method Image Verification ==== 1422 + 1423 +**Method 1: Reverse Image Search** 1424 + 1425 +**Purpose:** Find earlier uses of the image to verify context 1426 + 1427 +**Implementation:** 1428 +* Integrate APIs: 1429 + * **Google Vision AI** (reverse search) 1430 + * **TinEye** (oldest known uses) 1431 + * **Bing Visual Search** (broad coverage) 1432 + 1433 +**Process:** 1434 +1. Extract image from claim or user upload 1435 +2. Query multiple reverse search services 1436 +3. Analyze results for: 1437 + * Earliest known publication 1438 + * Original context (what was it really showing?) 1439 + * Publication timeline 1440 + * Geographic spread 1441 + 1442 +**Output:** 1443 +{{code}} 1444 +Reverse Image Search Results: 1445 + 1446 +Earliest known use: 2019-03-15 (5 years before claim) 1447 +Original context: "Photo from 2019 flooding in Mumbai" 1448 +This claim uses it for: "2024 hurricane damage in Florida" 1449 + 1450 +⚠️ Image is OUT OF CONTEXT 1451 + 1452 +Found in 47 other articles: 1453 +• 2019-03-15: Mumbai floods (original) 1454 +• 2020-07-22: Bangladesh monsoon 1455 +• 2024-10-15: Current claim (misattributed) 1456 + 1457 +[View full timeline] 1458 +{{/code}} 1459 + 1460 +--- 1461 + 1462 +**Method 2: AI Manipulation Detection** 1463 + 1464 +**Purpose:** Detect deepfakes, face swaps, and digital alterations 1465 + 1466 +**Implementation:** 1467 +* Integrate detection services: 1468 + * **Sensity AI** (deepfake detection) 1469 + * **Reality Defender** (multimodal analysis) 1470 + * **AWS Rekognition** (face detection inconsistencies) 1471 + 1472 +**Detection Categories:** 1473 +1. **Face Manipulation:** 1474 + * Deepfake face swaps 1475 + * Expression manipulation 1476 + * Identity replacement 1477 + 1478 +2. **Image Manipulation:** 1479 + * Copy-paste artifacts 1480 + * Clone stamp detection 1481 + * Content-aware fill detection 1482 + * JPEG compression inconsistencies 1483 + 1484 +3. 
**AI Generation:** 1485 + * Detect fully AI-generated images 1486 + * Identify generation artifacts 1487 + * Check for model signatures 1488 + 1489 +**Confidence Scoring:** 1490 +* **HIGH (80-100%):** Strong evidence of manipulation 1491 +* **MEDIUM (50-79%):** Suspicious artifacts detected 1492 +* **LOW (0-49%):** Minor inconsistencies or inconclusive 1493 + 1494 +**Output:** 1495 +{{code}} 1496 +Manipulation Analysis: 1497 + 1498 +Face Manipulation: LOW RISK (12%) 1499 +Image Editing: MEDIUM RISK (64%) 1500 + • Clone stamp artifacts detected in sky region 1501 + • JPEG compression inconsistent between objects 1502 + 1503 +AI Generation: LOW RISK (8%) 1504 + 1505 +⚠️ Possible manipulation detected. Manual review recommended. 1506 +{{/code}} 1507 + 1508 +--- 1509 + 1510 +**Method 3: Metadata Analysis (EXIF)** 1511 + 1512 +**Purpose:** Extract technical details that may reveal manipulation or misattribution 1513 + 1514 +**Extracted Data:** 1515 +* **Camera/Device:** Make, model, software 1516 +* **Timestamps:** Original date, modification dates 1517 +* **Location:** GPS coordinates (if present) 1518 +* **Editing History:** Software used, edit count 1519 +* **File Properties:** Resolution, compression, format conversions 1520 + 1521 +**Red Flags:** 1522 +* Metadata completely stripped (suspicious) 1523 +* Timestamp conflicts with claimed date 1524 +* GPS location conflicts with claimed location 1525 +* Multiple edit rounds (hiding something?) 
1526 +* Creation date after modification date (impossible) 1527 + 1528 +**Output:** 1529 +{{code}} 1530 +Image Metadata: 1531 + 1532 +Camera: iPhone 14 Pro 1533 +Original date: 2023-08-12 14:32:15 1534 +Location: 40.7128°N, 74.0060°W (New York City) 1535 +Modified: 2024-10-15 08:45:22 1536 +Software: Adobe Photoshop 2024 1537 + 1538 +⚠️ Location conflicts with claim 1539 +Claim says: "Taken in Los Angeles" 1540 +EXIF says: New York City 1541 + 1542 +⚠️ Edited 14 months after capture 1543 +{{/code}} 1544 + 1545 +--- 1546 + 1547 +==== Verification Workflow ==== 1548 + 1549 +**Automatic Triggers:** 1550 +1. User submits claim with image 1551 +2. Article being analyzed contains images 1552 +3. Social media post includes photos 1553 + 1554 +**Process:** 1555 +1. Extract images from content 1556 +2. Run all 3 verification methods in parallel 1557 +3. Aggregate results into confidence score 1558 +4. Generate human-readable summary 1559 +5. Display prominently in analysis 1560 + 1561 +**Display Integration:** 1562 + 1563 +Show image verification panel in claim analysis: 1564 + 1565 +{{code}} 1566 +📷 IMAGE VERIFICATION 1567 + 1568 +[Image thumbnail] 1569 + 1570 +✅ Reverse Search: Original context verified 1571 +⚠️ Manipulation: Possible editing detected (64% confidence) 1572 +✅ Metadata: Consistent with claim details 1573 + 1574 +Overall Assessment: CAUTION ADVISED 1575 +This image may have been edited. Original context appears accurate. 

[View detailed analysis]
{{/code}}

==== Integration Points ====

* **FR7: Automated Verdicts** - Image verification affects claim credibility
* **FR4: Analysis Summary** - Image findings included in summary
* **UN-27: Visual Claim Verification** - Direct fulfillment

==== Cost Considerations ====

**API Costs (estimated per image):**
* Google Vision AI: $0.001-0.003
* TinEye: $0.02 (commercial API)
* Sensity AI: $0.05-0.10
* AWS Rekognition: $0.001-0.002

**Total per image:** ~$0.07-0.15

**Mitigation Strategies:**
* Cache results for duplicate images
* Use free tier quotas where available
* Prioritize higher-value claims for deep analysis
* Offer premium verification as paid tier

**Acceptance Criteria:**

* ✅ Reverse image search finds original sources
* ✅ Manipulation detection accuracy >80% on test dataset
* ✅ EXIF extraction works for major image formats (JPEG, PNG, HEIC)
* ✅ Results display within 10 seconds
* ✅ Mobile-friendly image comparison interface
* ✅ False positive rate <15%

=== FR47: Archive.org Integration ===

**Importance:** CRITICAL
**Fulfills:** Evidence persistence, FR5 (Evidence linking)

**Purpose:** Ensure evidence remains accessible even if original sources are deleted.

**Specification:**

**Automatic Archiving:**

When AKEL links evidence:
1. Check if URL already archived (Wayback Machine API)
2. If not, submit for archiving (Save Page Now API)
3. Store both original URL and archive URL
4. Display both to users

**Archive Display:**

{{code}}
Evidence Source: [Original URL]
Archived: [Archive.org URL] (Captured: [date])

[View Original] [View Archive]
{{/code}}

**Fallback Logic:**

* If original URL unavailable → Auto-redirect to archive
* If archive unavailable → Display warning
* If both unavailable → Flag for manual review

**API Integration:**

* Use Wayback Machine Availability API
* Use Save Page Now API (SPNv2)
* Rate limiting: 15 requests/minute (Wayback limit)

**Acceptance Criteria:**

* ✅ All evidence URLs auto-archived
* ✅ Archive links displayed to users
* ✅ Fallback to archive if original unavailable
* ✅ API rate limits respected
* ✅ Archive status visible in evidence display

== Category 4: Community Safety ==

=== FR48: Contributor Safety Framework ===

**Importance:** CRITICAL
**Fulfills:** UN-28 (Safe contribution environment)

**Purpose:** Protect contributors from harassment, doxxing, and coordinated attacks.

**Specification:**

**1. Privacy Protection:**

* **Optional Pseudonymity:** Contributors can use pseudonyms
* **Email Privacy:** Emails never displayed publicly
* **Profile Privacy:** Contributors control what's public
* **IP Logging:** Only for abuse prevention, not public

**2. Harassment Prevention:**

* **Automated Toxicity Detection:** Flag abusive comments
* **Personal Information Detection:** Auto-block doxxing attempts
* **Coordinated Attack Detection:** Identify brigading patterns
* **Rapid Response:** Moderator alerts for harassment

**3. Safety Features:**

* **Block Users:** Contributors can block harassers
* **Private Contributions:** Option to contribute anonymously
* **Report Harassment:** One-click harassment reporting
* **Safety Resources:** Links to support resources

**4. Moderator Tools:**

* **Quick Ban:** Immediately block abusers
* **Pattern Detection:** Identify coordinated attacks
* **Appeal Process:** Fair review of moderation actions
* **Escalation:** Serious threats escalated to authorities

**5. Trusted Contributor Protection:**

* **Enhanced Privacy:** Additional protection for high-profile contributors
* **Verification:** Optional identity verification (not public)
* **Legal Support:** Resources for contributors facing legal threats

**Acceptance Criteria:**

* ✅ Pseudonyms supported
* ✅ Toxicity detection active
* ✅ Doxxing auto-blocked
* ✅ Harassment reporting functional
* ✅ Moderator tools implemented
* ✅ Safety policy published

== Category 5: Continuous Improvement ==

=== FR49: A/B Testing Framework ===

**Importance:** CRITICAL
**Fulfills:** Continuous system improvement

**Purpose:** Test and measure improvements to AKEL prompts, algorithms, and workflows.

**Specification:**

**Test Capabilities:**

1. **Prompt Variations:**
   * Test different claim extraction prompts
   * Test different verdict generation prompts
   * Measure: Accuracy, clarity, completeness

2. **Algorithm Variations:**
   * Test different source scoring algorithms
   * Test different confidence calculations
   * Measure: Audit accuracy, user satisfaction

3. **Workflow Variations:**
   * Test different quality gate thresholds
   * Test different risk tier assignments
   * Measure: Publication rate, quality scores

**Implementation:**

* **Traffic Split:** 50/50 or 90/10 splits
* **Randomization:** Consistent per claim (not per user)
* **Metrics Collection:** Automatic for all variants
* **Statistical Significance:** Minimum sample size calculation
* **Rollout:** Winner promoted to 100% traffic

**A/B Test Workflow:**

{{code}}
1. Hypothesis: "New prompt improves claim extraction"
2. Design test: Control vs. Variant
3. Define metrics: Extraction accuracy, completeness
4. Run test: 7-14 days, minimum 100 claims each
5. Analyze results: Statistical significance?
6. Decision: Deploy winner or iterate
{{/code}}

**Acceptance Criteria:**

* ✅ A/B testing framework implemented
* ✅ Can test prompt variations
* ✅ Can test algorithm variations
* ✅ Metrics automatically collected
* ✅ Statistical significance calculated
* ✅ Results inform system improvements

=== FR54: Evidence Deduplication ===

**Importance:** CRITICAL (POC2/Beta)
**Fulfills:** Accurate evidence counting, quality metrics

**Purpose:** Avoid counting the same source multiple times when it appears in different forms.

**Specification:**

**Deduplication Logic:**

1. **URL Normalization:**
   * Remove tracking parameters (?utm_source=...)
   * Normalize http/https
   * Normalize www/non-www
   * Handle redirects

2. **Content Similarity:**
   * If two sources have >90% text similarity → Same source
   * If one is subset of other → Same source
   * Use fuzzy matching for minor differences

3. **Cross-Domain Syndication:**
   * Detect wire service content (AP, Reuters)
   * Mark as single source if syndicated
   * Count original publication only

**Display:**

{{code}}
Evidence Sources (3 unique, 5 total):

1. Original Article (NYTimes)
   - Also appeared in: WashPost, Guardian (syndicated)

2. Research Paper (Nature)

3. Official Statement (WHO)
{{/code}}

**Acceptance Criteria:**

* ✅ URL normalization works
* ✅ Content similarity detected
* ✅ Syndicated content identified
* ✅ Unique vs. total counts accurate
* ✅ Improves evidence quality metrics

== Additional Requirements (Lower Importance) ==

=== FR50: OSINT Toolkit Integration ===

**Fulfills:** Advanced media verification

**Purpose:** Integrate open-source intelligence tools for advanced verification.

**Tools to Integrate:**
* InVID/WeVerify (video verification)
* Bellingcat toolkit
* Additional TBD based on V1.0 learnings

=== FR51: Video Verification System ===

**Fulfills:** UN-27 (Visual claims), advanced media verification

**Purpose:** Verify video-based claims.

**Specification:**
* Keyframe extraction
* Reverse video search
* Deepfake detection (AI-powered)
* Metadata analysis
* Acoustic signature analysis

=== FR52: Interactive Detection Training ===

**Fulfills:** Media literacy education

**Purpose:** Teach users to identify misinformation.

**Specification:**
* Interactive tutorials
* Practice exercises
* Detection quizzes
* Gamification elements

=== FR53: Cross-Organizational Sharing ===

**Fulfills:** Collaboration with other fact-checkers

**Purpose:** Share findings with IFCN/EFCSN members.

**Specification:**
* API for fact-checking organizations
* Structured data exchange
* Privacy controls
* Attribution requirements

== Summary ==

**V1.0 Critical Requirements (Must Have):**

* FR44: ClaimReview Schema ✅
* FR45: Corrections Notification ✅
* FR46: Image Verification ✅
* FR47: Archive.org Integration ✅
* FR48: Contributor Safety ✅
* FR49: A/B Testing ✅
* FR54: Evidence Deduplication ✅
* NFR11: Quality Assurance Framework ✅
* NFR12: Security Controls ✅
* NFR13: Quality Metrics Dashboard ✅

**V1.1+ (Future):**

* FR50: OSINT Integration
* FR51: Video Verification
* FR52: Detection Training
* FR53: Cross-Org Sharing

**Total:** 10 critical requirements for V1.0

== Additional Requirements (Lower Importance) ==

=== FR7: Automated Verdicts (Enhanced with Quality Gates) ===

**POC1+ Enhancement:**

Each claim passes through quality gates before and after AKEL generates its verdict:

{{code}}
Workflow:
1. Extract claims
   ↓
2. [GATE 1] Validate fact-checkable
   ↓
3. Generate scenarios
   ↓
4. Generate verdicts
   ↓
5. [GATE 4] Validate confidence
   ↓
6. Display to user
{{/code}}

**Updated Verdict States:**
* PUBLISHED
* INSUFFICIENT_EVIDENCE
* NON_FACTUAL_CLAIM
* PROCESSING
* ERROR

=== FR4: Analysis Summary (Enhanced with Quality Metadata) ===

**POC1+ Enhancement:**

Display quality indicators:

{{code}}
Analysis Summary:
  Verifiable Claims: 3/5
  High Confidence Verdicts: 1
  Medium Confidence: 2
  Evidence Sources: 12
  Avg Source Quality: 0.73
  Quality Score: 8.5/10
{{/code}}
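As an illustration only, the FR7 quality gates and the first three FR4 summary counts could be wired together roughly as follows. This is a minimal sketch, not the AKEL implementation. The mapping from gates to verdict states (Gate 1 failure gives NON_FACTUAL_CLAIM, Gate 4 failure gives INSUFFICIENT_EVIDENCE), the two thresholds, and the names `Claim`, `apply_gates`, and `summarize` are all assumptions made for the example; the specification does not define them.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class VerdictState(Enum):
    """Verdict states listed under FR7."""
    PUBLISHED = "PUBLISHED"
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"
    NON_FACTUAL_CLAIM = "NON_FACTUAL_CLAIM"
    PROCESSING = "PROCESSING"
    ERROR = "ERROR"


# Hypothetical thresholds, chosen only for illustration.
MIN_CONFIDENCE = 0.5   # assumed Gate 4 cut-off
HIGH_CONFIDENCE = 0.8  # assumed boundary between medium and high confidence


@dataclass
class Claim:
    text: str
    fact_checkable: bool  # assumed outcome of Gate 1
    confidence: float     # verdict confidence in [0, 1]


def apply_gates(claim: Claim) -> VerdictState:
    """Map a claim to a verdict state (assumed gate-to-state mapping)."""
    if not claim.fact_checkable:           # Gate 1 failed
        return VerdictState.NON_FACTUAL_CLAIM
    if claim.confidence < MIN_CONFIDENCE:  # Gate 4 failed
        return VerdictState.INSUFFICIENT_EVIDENCE
    return VerdictState.PUBLISHED


def summarize(claims: list[Claim]) -> dict:
    """Compute the first three FR4 summary lines from gated claims."""
    verifiable = [c for c in claims if c.fact_checkable]
    published = [c for c in verifiable
                 if apply_gates(c) is VerdictState.PUBLISHED]
    high = sum(1 for c in published if c.confidence >= HIGH_CONFIDENCE)
    return {
        "verifiable_claims": f"{len(verifiable)}/{len(claims)}",
        "high_confidence": high,
        "medium_confidence": len(published) - high,
    }
```

Under these assumed thresholds, five claims of which two fail Gate 1 and three pass with confidences 0.9, 0.7, and 0.6 yield "3/5" verifiable claims, one high-confidence verdict, and two medium-confidence verdicts, matching the sample summary above.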