Wiki source code of V1.0 Requirements
Last modified by Robert Schaub on 2025/12/22 13:50
| author | version | line-number | content |
|---|---|---|---|
| 1 | = V1.0 Requirements = | ||
| 2 | |||
| 3 | **Version:** 0.9.70 | ||
| 4 | **Phase:** Version 1.0 (Production Launch) | ||
| 5 | **Priority:** CRITICAL | ||
| 6 | **Status:** Ready for Implementation | ||
| 7 | |||
| 8 | This page specifies the requirements that MUST be implemented for FactHarbor V1.0 production launch, based on comprehensive fact-checking industry research (December 2025). | ||
| 9 | |||
| 10 | |||
| 11 | == Overview == | ||
| 12 | |||
| 13 | V1.0 adds critical requirements for: | ||
| 14 | * **Platform Integration** (ClaimReview schema, corrections system) | ||
| 15 | * **Quality Assurance** (Enhanced AKEL gates, security) | ||
| 16 | * **Media Verification** (Image verification, evidence archiving) | ||
| 17 | * **Community Safety** (Contributor protection) | ||
| 18 | * **Continuous Improvement** (A/B testing, quality metrics) | ||
| 19 | |||
| 20 | **Total New Requirements:** 14 (FR44-FR54, NFR11-NFR13), of which 10 are critical for V1.0 | ||
| 21 | **New User Needs:** 3 (UN-26, UN-27, UN-28) | ||
| 22 | |||
| 23 | |||
| 24 | == Category 1: Platform Integration & Standards Compliance == | ||
| 25 | |||
| 26 | === FR44: ClaimReview Schema Implementation === | ||
| 27 | |||
| 28 | **Priority:** CRITICAL | ||
| 29 | **Fulfills:** UN-13 (Cite verdicts), UN-14 (API access), UN-26 (Search engine visibility) | ||
| 30 | **Phase:** V1.0 | ||
| 31 | |||
| 32 | **Purpose:** Make FactHarbor analyses discoverable via Google Fact Check Explorer and other search engines. | ||
| 33 | |||
| 34 | **Specification:** | ||
| 35 | |||
| 36 | FactHarbor must generate valid ClaimReview structured data for every published analysis following Schema.org specifications. | ||
| 37 | |||
| 38 | **Required Fields:** | ||
| 39 | |||
| 40 | {{code language="json"}} | ||
| 41 | { | ||
| 42 | "@context": "https://schema.org", | ||
| 43 | "@type": "ClaimReview", | ||
| 44 | "datePublished": "YYYY-MM-DD", | ||
| 45 | "url": "https://factharbor.org/claims/{claim_id}", | ||
| 46 | "claimReviewed": "The exact claim text", | ||
| 47 | "author": { | ||
| 48 | "@type": "Organization", | ||
| 49 | "name": "FactHarbor", | ||
| 50 | "url": "https://factharbor.org" | ||
| 51 | }, | ||
| 52 | "reviewRating": { | ||
| 53 | "@type": "Rating", | ||
| 54 | "ratingValue": "1-5", | ||
| 55 | "bestRating": "5", | ||
| 56 | "worstRating": "1", | ||
| 57 | "alternateName": "FactHarbor likelihood score" | ||
| 58 | }, | ||
| 59 | "itemReviewed": { | ||
| 60 | "@type": "Claim", | ||
| 61 | "author": { | ||
| 62 | "@type": "Person or Organization", | ||
| 63 | "name": "Claim author if known" | ||
| 64 | }, | ||
| 65 | "datePublished": "YYYY-MM-DD if known", | ||
| 66 | "appearance": { | ||
| 67 | "@type": "CreativeWork", | ||
| 68 | "url": "Original claim URL if from article" | ||
| 69 | } | ||
| 70 | } | ||
| 71 | } | ||
| 72 | {{/code}} | ||
| 73 | |||
| 74 | **FactHarbor-Specific Mapping:** | ||
| 75 | |||
| 76 | Rating Value Conversion: | ||
| 77 | * 80-100% likelihood → 5 (Highly Supported) | ||
| 78 | * 60-79% likelihood → 4 (Supported) | ||
| 79 | * 40-59% likelihood → 3 (Mixed/Uncertain) | ||
| 80 | * 20-39% likelihood → 2 (Questionable) | ||
| 81 | * 0-19% likelihood → 1 (Refuted) | ||
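The conversion above can be sketched as a small lookup. This is only an illustration: the function name `likelihood_to_rating` is hypothetical, and treating each band's lower bound as inclusive for fractional values (for example 79.9 → 4) is an assumption, since the list only names whole-number ranges.

```python
def likelihood_to_rating(likelihood_pct: float) -> tuple[int, str]:
    """Map a 0-100 likelihood percentage to (ratingValue, label)."""
    if not 0 <= likelihood_pct <= 100:
        raise ValueError("likelihood must be between 0 and 100")
    # (inclusive lower bound, ratingValue, label), checked from highest band down
    bands = [
        (80, 5, "Highly Supported"),
        (60, 4, "Supported"),
        (40, 3, "Mixed/Uncertain"),
        (20, 2, "Questionable"),
        (0, 1, "Refuted"),
    ]
    for lower, rating, label in bands:
        if likelihood_pct >= lower:
            return rating, label
    raise AssertionError("unreachable: the 0 band matches every valid input")
```

Keeping the bands in one table makes the thresholds easy to audit against the list above.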
| 82 | |||
| 83 | **Multiple Scenarios Handling:** | ||
| 84 | |||
| 85 | If a claim has multiple scenarios with different verdicts: | ||
| 86 | * Generate a **separate ClaimReview** for each scenario | ||
| 87 | * Add a `disambiguatingDescription` field explaining the scenario context | ||
| 88 | * Example: "In the context of [scenario assumptions]..." | ||
| 89 | |||
| 90 | **Implementation Requirements:** | ||
| 91 | |||
| 92 | 1. Auto-generate ClaimReview JSON-LD on claim publication | ||
| 93 | 2. Embed in HTML `<head>` section of claim page | ||
| 94 | 3. Validate against Schema.org validator before deployment | ||
| 95 | 4. Submit sitemap to Google Search Console | ||
| 96 | 5. Update ClaimReview when verdict changes (FR8: Time Evolution) | ||
| 97 | 6. Handle corrections (update dateModified field) | ||
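Requirements 1, 2, and 6 might look roughly like the following. It is a minimal sketch, not the FactHarbor implementation: `build_claim_review` and `to_script_tag` are hypothetical helper names, the optional `itemReviewed` details are reduced to a bare `Claim`, and escaping `</` inside the payload is an added precaution so claim text cannot close the script element early.

```python
import json


def build_claim_review(claim_id, claim_text, rating_value, date_published,
                       date_modified=None):
    """Assemble the required ClaimReview fields as a plain dict."""
    review = {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "datePublished": date_published,
        "url": f"https://factharbor.org/claims/{claim_id}",
        "claimReviewed": claim_text,
        "author": {
            "@type": "Organization",
            "name": "FactHarbor",
            "url": "https://factharbor.org",
        },
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": str(rating_value),
            "bestRating": "5",
            "worstRating": "1",
            "alternateName": "FactHarbor likelihood score",
        },
        "itemReviewed": {"@type": "Claim"},
    }
    if date_modified:
        # Set on corrections and verdict changes (requirement 6).
        review["dateModified"] = date_modified
    return review


def to_script_tag(review):
    """Serialize the dict as JSON-LD for embedding in the page <head>."""
    payload = json.dumps(review, ensure_ascii=False, indent=2)
    # "\/" is a valid JSON escape, so the parsed value is unchanged.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Regenerating the tag from stored claim data on every verdict change keeps the markup and the displayed verdict in step.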
| 98 | |||
| 99 | **Acceptance Criteria:** | ||
| 100 | |||
| 101 | * ✅ Passes the Schema Markup Validator (validator.schema.org), the successor to Google's retired Structured Data Testing Tool | ||
| 102 | * ✅ Appears in Google Fact Check Explorer within 48 hours of publication | ||
| 103 | * ✅ Valid JSON-LD syntax (no errors) | ||
| 104 | * ✅ All required fields populated | ||
| 105 | * ✅ Handles multi-scenario claims correctly | ||
| 106 | * ✅ Updates automatically on verdict changes | ||
| 107 | |||
| 108 | **Integration Points:** | ||
| 109 | |||
| 110 | * FR7: Automated Verdicts (source of rating data) | ||
| 111 | * FR8: Time Evolution (triggers schema updates) | ||
| 112 | * FR11: Audit Trail (log schema generation/updates) | ||
| 113 | * FR45: Corrections (updates schema on corrections) | ||
| 114 | |||
| 115 | **Resources:** | ||
| 116 | |||
| 117 | * ClaimReview Project: https://www.claimreviewproject.com | ||
| 118 | * Schema.org ClaimReview: https://schema.org/ClaimReview | ||
| 119 | * Google Fact Check Guidelines: https://developers.google.com/search/docs/appearance/fact-check | ||
| 120 | |||
| 121 | |||
| 122 | === FR45: User Corrections Notification System === | ||
| 123 | |||
| 124 | **Priority:** CRITICAL | ||
| 125 | **Fulfills:** IFCN Principle 5 (Open & Honest Corrections), EFCSN compliance | ||
| 126 | **Phase:** V1.0 | ||
| 127 | |||
| 128 | **Purpose:** When claim analyses are corrected, notify users who previously viewed the claim. | ||
| 129 | |||
| 130 | **Specification:** | ||
| 131 | |||
| 132 | **Correction Types:** | ||
| 133 | |||
| 134 | 1. **Major Correction:** Verdict changes category (e.g., "Supported" → "Refuted") | ||
| 135 | 2. **Significant Correction:** Likelihood score changes by more than 20 percentage points | ||
| 136 | 3. **Minor Correction:** Evidence additions, source quality updates | ||
| 137 | 4. **Scenario Addition:** New scenario added to existing claim | ||
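One possible way to classify a change into these four types is sketched below. The function name, the argument shapes, and the precedence order (major, then significant, then scenario addition, then minor) are assumptions; the sketch also reads ">20%" as more than 20 percentage points.

```python
def classify_correction(old_category, new_category, old_score, new_score,
                        scenario_added=False):
    """Return the correction type for a change to a published analysis.

    Scores are likelihood percentages (0-100); categories are verdict labels.
    """
    if old_category != new_category:
        return "major"
    if abs(new_score - old_score) > 20:
        return "significant"
    if scenario_added:
        return "scenario_addition"
    # Evidence additions and source quality updates fall through to here.
    return "minor"
```

For example, `classify_correction("Supported", "Refuted", 75, 15)` is classified as major because the verdict category changed.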
| 138 | |||
| 139 | **Notification Mechanisms:** | ||
| 140 | |||
| 141 | **1. In-Page Banner (Required):** | ||
| 142 | |||
| 143 | {{code}} | ||
| 144 | [!] CORRECTION NOTICE | ||
| 145 | This analysis was updated on [DATE]. [View what changed] [Dismiss] | ||
| 146 | |||
| 147 | Major changes: | ||
| 148 | • Verdict changed from "Likely True (75%)" to "Uncertain (45%)" | ||
| 149 | • New contradicting evidence added from [Source] | ||
| 150 | • Scenario 2 updated with additional context | ||
| 151 | |||
| 152 | [See full correction log] | ||
| 153 | {{/code}} | ||
| 154 | |||
| 155 | **Display Rules:** | ||
| 156 | * Show banner on ALL pages displaying the claim | ||
| 157 | * Banner persists for 30 days after correction | ||
| 158 | * "Corrections" count badge on claim card | ||
| 159 | * Timestamp on every verdict: "Last updated: [datetime]" | ||
| 160 | |||
| 161 | **2. Correction Log Page (Required):** | ||
| 162 | |||
| 163 | * Public changelog at `/claims/{id}/corrections` | ||
| 164 | * Displays: | ||
| 165 | * Date/time of correction | ||
| 166 | * What changed (before/after comparison) | ||
| 167 | * Why it changed (reason if provided) | ||
| 168 | * Who made the change (AKEL auto-update vs. contributor override) | ||
| 169 | * Diff view of changes | ||
| 170 | |||
| 171 | **3. Email Notifications (Optional for users):** | ||
| 172 | |||
| 173 | * Send to users who bookmarked/shared claim | ||
| 174 | * Subject: "FactHarbor Correction: [Claim title]" | ||
| 175 | * Include summary of changes | ||
| 176 | * Link to updated analysis | ||
| 177 | * Unsubscribe option | ||
| 178 | |||
| 179 | **4. RSS/API Feed (Required):** | ||
| 180 | |||
| 181 | * Corrections feed at `/corrections.rss` | ||
| 182 | * API endpoint: `GET /api/corrections?since={timestamp}` | ||
| 183 | * Enables external monitoring | ||
| 184 | * Machine-readable format | ||
| 185 | |||
| 186 | **IFCN Compliance Requirements:** | ||
| 187 | |||
| 188 | * Corrections policy published at `/corrections-policy` | ||
| 189 | * User can report suspected errors via `/report-error/{claim_id}` | ||
| 190 | * Link to IFCN complaint process (if FactHarbor becomes signatory) | ||
| 191 | * Scrupulous transparency: **never** silently edit | ||
| 192 | * All corrections permanent and public | ||
| 193 | |||
| 194 | **Acceptance Criteria:** | ||
| 195 | |||
| 196 | * ✅ Banner appears within 60 seconds of correction | ||
| 197 | * ✅ Correction log is permanent and public | ||
| 198 | * ✅ Email notifications delivered within 5 minutes | ||
| 199 | * ✅ RSS feed updates in real-time | ||
| 200 | * ✅ Mobile-responsive banner design | ||
| 201 | * ✅ Accessible (screen reader compatible) | ||
| 202 | * ✅ Banner cannot be permanently dismissed within the 30-day window (it reappears after dismissal) | ||
| 203 | |||
| 204 | **Integration Points:** | ||
| 205 | |||
| 206 | * FR8: Time Evolution (triggers corrections) | ||
| 207 | * FR11: Audit Trail (source of correction data) | ||
| 208 | * NFR3: Transparency (public correction log) | ||
| 209 | * FR44: ClaimReview (updates schema) | ||
| 210 | |||
| 211 | |||
| 212 | == Category 2: Quality Assurance == | ||
| 213 | |||
| 214 | === NFR11: AKEL Quality Assurance Framework === | ||
| 215 | |||
| 216 | **Priority:** CRITICAL | ||
| 217 | **Fulfills:** AI safety, IFCN methodology transparency | ||
| 218 | **Phase:** V1.0 | ||
| 219 | |||
| 220 | **Purpose:** Prevent AI hallucinations and low-quality outputs through multi-layer automated checks. | ||
| 221 | |||
| 222 | **Specification:** | ||
| 223 | |||
| 224 | This enhances the existing 4 quality gates with more detailed specifications and confidence thresholds. | ||
| 225 | |||
| 226 | **Gate 1: Claim Extraction Validation** | ||
| 227 | |||
| 228 | **Purpose:** Ensure extracted claims are factual assertions (not opinions/predictions) | ||
| 229 | |||
| 230 | **Automated Checks:** | ||
| 231 | |||
| 232 | 1. **Factual Statement Test:** Is this verifiable? (Yes/No) | ||
| 233 | 2. **Opinion Detection:** Contains hedging language? ("I think", "probably", "might") | ||
| 234 | 3. **Future Prediction Test:** Makes claim about future events? | ||
| 235 | 4. **Specificity Score:** Contains specific entities, numbers, dates? | ||
| 236 | |||
| 237 | **Thresholds:** | ||
| 238 | |||
| 239 | * Factual: Must be "Yes" | ||
| 240 | * Opinion markers: <2 hedging phrases | ||
| 241 | * Specificity: ≥3 specific elements | ||
| 242 | |||
| 243 | **Action if Failed:** | ||
| 244 | |||
| 245 | * Flag claim as "Non-verifiable" or "Opinion" | ||
| 246 | * Do NOT generate verdict | ||
| 247 | * Display to user: "This appears to be an opinion rather than a factual claim" | ||
| 248 | * Log pattern for system improvement | ||
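Gate 1 could be wired together roughly as follows. This is a sketch under stated assumptions: the verifiability and future-prediction flags and the specificity count are taken as inputs from upstream classifiers that are not shown, the hedging list contains only the three phrases named above, and failing the gate on a future prediction is inferred from the gate's purpose rather than from the threshold list.

```python
import re

# Only the phrases named in the check above; a real list would be longer.
HEDGING_PATTERN = re.compile(r"\b(?:i think|probably|might)\b")


def count_hedges(text: str) -> int:
    """Count hedging phrases as whole words, case-insensitively."""
    return len(HEDGING_PATTERN.findall(text.lower()))


def gate1_failures(text, is_verifiable, is_future_prediction, specific_elements):
    """Return the list of failed checks; an empty list means the gate passes."""
    failures = []
    if not is_verifiable:
        failures.append("not verifiable")
    if count_hedges(text) >= 2:          # threshold: <2 hedging phrases
        failures.append("opinion markers")
    if is_future_prediction:             # assumption: predictions fail the gate
        failures.append("future prediction")
    if specific_elements < 3:            # threshold: >=3 specific elements
        failures.append("low specificity")
    return failures
```

Returning the failed checks rather than a bare boolean gives the logging step something concrete to record.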
| 249 | |||
| 250 | |||
| 251 | **Gate 2: Evidence Relevance Validation** | ||
| 252 | |||
| 253 | **Purpose:** Ensure AI-linked evidence actually relates to the claim | ||
| 254 | |||
| 255 | **Automated Checks:** | ||
| 256 | |||
| 257 | 1. **Semantic Similarity Score:** Evidence text vs. claim (using embeddings) | ||
| 258 | 2. **Entity Overlap:** Do evidence and claim mention same people/places/things? | ||
| 259 | 3. **Topic Relevance:** Does evidence discuss the claim topic? | ||
| 260 | |||
| 261 | **Thresholds:** | ||
| 262 | |||
| 263 | * Similarity: ≥0.6 (cosine similarity) | ||
| 264 | * Entity overlap: ≥1 shared entity | ||
| 265 | * Topic relevance: ≥0.5 | ||
| 266 | |||
| 267 | **Action if Failed:** | ||
| 268 | |||
| 269 | * Discard irrelevant evidence | ||
| 270 | * If <2 relevant evidence items remain, verdict = "Insufficient Evidence" | ||
| 271 | * Block publication if below threshold | ||
| 272 | * Log pattern for search improvement | ||
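The Gate 2 thresholds can be applied with a few lines of filtering. The sketch below assumes embeddings, entity sets, and a topic relevance score are already computed elsewhere; the dict keys `vec`, `entities`, and `topic` and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_relevant(claim_vec, claim_entities, item):
    """Apply the three Gate 2 thresholds to one evidence item."""
    return (
        cosine_similarity(claim_vec, item["vec"]) >= 0.6
        and len(set(claim_entities) & set(item["entities"])) >= 1
        and item["topic"] >= 0.5
    )


def gate2(claim_vec, claim_entities, evidence_items):
    """Discard irrelevant evidence; flag the claim if fewer than 2 items remain."""
    kept = [e for e in evidence_items if is_relevant(claim_vec, claim_entities, e)]
    verdict_override = "Insufficient Evidence" if len(kept) < 2 else None
    return kept, verdict_override
```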
| 273 | |||
| 274 | |||
| 275 | **Gate 3: Scenario Coherence Check** | ||
| 276 | |||
| 277 | **Purpose:** Validate scenario assumptions are logical and complete | ||
| 278 | |||
| 279 | **Automated Checks:** | ||
| 280 | |||
| 281 | 1. **Completeness:** Scenario has all required fields | ||
| 282 | 2. **Internal Consistency:** Assumptions don't contradict each other | ||
| 283 | 3. **Distinguishability:** Scenarios are meaningfully different (not duplicates) | ||
| 284 | |||
| 285 | **Thresholds:** | ||
| 286 | |||
| 287 | * Required fields: 100% populated | ||
| 288 | * Contradiction score: <0.3 (self-contradiction embedding) | ||
| 289 | * Scenario similarity: <0.8 (between scenarios) | ||
| 290 | |||
| 291 | **Action if Failed:** | ||
| 292 | |||
| 293 | * Merge duplicate scenarios | ||
| 294 | * Flag inconsistent assumptions for sampling audit | ||
| 295 | * Reduce confidence score by 20% | ||
| 296 | |||
| 297 | |||
| 298 | **Gate 4: Verdict Confidence Assessment** | ||
| 299 | |||
| 300 | **Purpose:** Only publish high-confidence verdicts | ||
| 301 | |||
| 302 | **Automated Checks:** | ||
| 303 | |||
| 304 | 1. **Evidence Count:** Minimum 2 sources (EFCSN standard) | ||
| 305 | 2. **Source Quality:** Average source reliability ≥0.6 | ||
| 306 | 3. **Evidence Agreement:** Supporting vs. contradicting ratio | ||
| 307 | 4. **Uncertainty Factors:** Number of explicit uncertainties | ||
| 308 | |||
| 309 | **Confidence Tiers:** | ||
| 310 | |||
| 311 | {{code}} | ||
| 312 | HIGH (80-100%): | ||
| 313 | - ≥3 high-quality sources | ||
| 314 | - >80% agreement | ||
| 315 | - <2 uncertainty factors | ||
| 316 | - Publish immediately (all risk tiers) | ||
| 317 | |||
| 318 | MEDIUM (50-79%): | ||
| 319 | - 2-3 sources | ||
| 320 | - 60-80% agreement | ||
| 321 | - 2-4 uncertainty factors | ||
| 322 | - Publish with standard labels | ||
| 323 | |||
| 324 | LOW (0-49%): | ||
| 325 | - <2 sources OR | ||
| 326 | - <60% agreement OR | ||
| 327 | - >4 uncertainty factors | ||
| 328 | - BLOCK publication | ||
| 329 | {{/code}} | ||
| 330 | |||
| 331 | **Publication Rules:** | ||
| 332 | |||
| 333 | * HIGH confidence: Publish immediately | ||
| 334 | * MEDIUM confidence: Publish with "May contain uncertainties" label | ||
| 335 | * LOW confidence: Block, improve system | ||
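The tier rules and publication rules can be expressed as a short decision function. This is a sketch only: the function and parameter names are hypothetical, the average source reliability check (≥0.6) is assumed to have run beforehand, and anything that is neither LOW nor HIGH is treated as MEDIUM, which is one way to resolve inputs the tier table does not cover exactly.

```python
PUBLICATION_ACTION = {
    "HIGH": "publish",
    "MEDIUM": "publish with 'May contain uncertainties' label",
    "LOW": "block",
}


def confidence_tier(high_quality_sources, total_sources, agreement_pct,
                    uncertainty_factors):
    """Classify a verdict into HIGH, MEDIUM, or LOW confidence."""
    # LOW: any single blocking condition is enough.
    if total_sources < 2 or agreement_pct < 60 or uncertainty_factors > 4:
        return "LOW"
    # HIGH: all three conditions must hold.
    if high_quality_sources >= 3 and agreement_pct > 80 and uncertainty_factors < 2:
        return "HIGH"
    return "MEDIUM"
```

Checking the LOW conditions first ensures a verdict with strong agreement but only one source is still blocked.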
| 336 | |||
| 337 | **Acceptance Criteria:** | ||
| 338 | |||
| 339 | * ✅ All 4 gates implemented | ||
| 340 | * ✅ Thresholds configurable (for A/B testing) | ||
| 341 | * ✅ Gate failures logged with details | ||
| 342 | * ✅ Confidence scores accurate (validated through sampling audits) | ||
| 343 | * ✅ <5% hallucination rate (measured via audits) | ||
| 344 | |||
| 345 | **Integration Points:** | ||
| 346 | |||
| 347 | * FR7: Automated Verdicts (applies gates) | ||
| 348 | * AKEL: Quality gates enforcement | ||
| 349 | * NFR13: Quality metrics (reports gate performance) | ||
| 350 | |||
| 351 | |||
| 352 | === NFR12: Advanced Security Controls === | ||
| 353 | |||
| 354 | **Priority:** CRITICAL | ||
| 355 | **Fulfills:** Production security requirements | ||
| 356 | **Phase:** V1.0 | ||
| 357 | |||
| 358 | **Specification:** | ||
| 359 | |||
| 360 | **Essential Security (V1.0 Launch):** | ||
| 361 | |||
| 362 | 1. **DDoS Protection:** | ||
| 363 | * Rate limiting: 100 requests/hour per IP (content submission) | ||
| 364 | * Cloudflare or equivalent | ||
| 365 | * Automatic IP blocking on abuse | ||
| 366 | |||
| 367 | 2. **API Rate Limiting:** | ||
| 368 | * 1000 requests/hour per API key | ||
| 369 | * Burst allowance: 50 requests/minute | ||
| 370 | * 429 responses with retry-after headers | ||
| 371 | |||
| 372 | 3. **Audit Logging:** | ||
| 373 | * All moderation actions logged | ||
| 374 | * All system changes logged | ||
| 375 | * Logs retained 2 years | ||
| 376 | * Tamper-proof logging | ||
| 377 | |||
| 378 | 4. **Input Validation:** | ||
| 379 | * Sanitize all user inputs | ||
| 380 | * Prevent SQL injection | ||
| 381 | * Prevent XSS attacks | ||
| 382 | * Max input sizes enforced | ||
| 383 | |||
| 384 | 5. **Authentication:** | ||
| 385 | * OAuth 2.0 for API access | ||
| 386 | * Secure session management | ||
| 387 | * Password requirements (if applicable) | ||
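As an illustration of item 2, the sketch below enforces both API limits with an in-memory sliding window and returns a 429 with a `Retry-After` header when either is exceeded. It assumes the "burst allowance" means a cap of 50 requests in any 60-second window; the class name is hypothetical, and a production deployment would typically keep counters in a shared store rather than process memory.

```python
import math
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-key limiter over several (window_seconds, max_requests) limits."""

    def __init__(self, limits=((3600, 1000), (60, 50))):
        self.limits = limits
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def check(self, key, now):
        """Return (status_code, headers) for a request arriving at time `now`."""
        q = self.hits[key]
        longest = max(window for window, _ in self.limits)
        while q and q[0] <= now - longest:
            q.popleft()  # drop hits older than the longest window
        for window, limit in self.limits:
            recent = [t for t in q if t > now - window]
            if len(recent) >= limit:
                # The oldest hit in the window expires at recent[0] + window.
                retry_after = math.ceil(recent[0] + window - now)
                return 429, {"Retry-After": str(retry_after)}
        q.append(now)
        return 200, {}
```

Rejected requests are not recorded, so a client that keeps retrying does not extend its own lockout.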
| 388 | |||
| 389 | **Full Security (V1.1+):** | ||
| 390 | |||
| 391 | 6. **Penetration Testing:** Annual third-party tests | ||
| 392 | 7. **Vulnerability Scanning:** Automated weekly scans | ||
| 393 | 8. **Security Incident Response:** Documented procedures | ||
| 394 | 9. **Data Encryption:** At rest and in transit | ||
| 395 | 10. **Access Control:** Role-based permissions | ||
| 396 | |||
| 397 | **Acceptance Criteria:** | ||
| 398 | |||
| 399 | * ✅ Rate limits enforced | ||
| 400 | * ✅ DDoS protection active | ||
| 401 | * ✅ All logs captured | ||
| 402 | * ✅ Input validation prevents common attacks | ||
| 403 | * ✅ API authentication required | ||
| 404 | |||
| 405 | |||
| 406 | === NFR13: Reference-Free Quality Metrics === | ||
| 407 | |||
| 408 | **Priority:** CRITICAL (POC2 onwards) | ||
| 409 | **Fulfills:** NFR3 (Transparency), continuous improvement monitoring | ||
| 410 | **Phase:** POC2, Beta 0, V1.0 | ||
| 411 | |||
| 412 | **Purpose:** Measure AKEL quality without requiring human-labeled ground truth. | ||
| 413 | |||
| 414 | **Specification:** | ||
| 415 | |||
| 416 | **Metrics Dashboard (Public):** | ||
| 417 | |||
| 418 | **1. Consistency Metrics:** | ||
| 419 | |||
| 420 | * **Cross-Source Consistency:** Do verdicts align with evidence? | ||
| 421 | * **Temporal Consistency:** Do verdicts remain stable over time (when evidence unchanged)? | ||
| 422 | * **Scenario Consistency:** Do related scenarios have coherent verdicts? | ||
| 423 | |||
| 424 | **2. Completeness Metrics:** | ||
| 425 | |||
| 426 | * **Evidence Retrieval Rate:** % of claims with ≥2 sources | ||
| 427 | * **Contradiction Search Coverage:** % of claims with counter-evidence searched | ||
| 428 | * **Source Diversity:** Number of distinct sources per claim | ||
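The three completeness metrics need no labeled ground truth and can be computed directly from stored claim records. The sketch below assumes each claim is a dict with a `sources` list and a `counter_evidence_searched` flag; those field names, and reporting source diversity as a mean across claims, are illustrative choices.

```python
def completeness_metrics(claims):
    """Compute completeness metrics over a list of claim records."""
    total = len(claims)
    if total == 0:
        return {
            "evidence_retrieval_rate": 0.0,
            "contradiction_search_coverage": 0.0,
            "mean_source_diversity": 0.0,
        }
    distinct_counts = [len(set(c["sources"])) for c in claims]
    with_two_sources = sum(1 for n in distinct_counts if n >= 2)
    searched = sum(1 for c in claims if c["counter_evidence_searched"])
    return {
        # % of claims with at least 2 distinct sources
        "evidence_retrieval_rate": 100.0 * with_two_sources / total,
        # % of claims where counter-evidence was searched
        "contradiction_search_coverage": 100.0 * searched / total,
        # average number of distinct sources per claim
        "mean_source_diversity": sum(distinct_counts) / total,
    }
```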
| 429 | |||
| 430 | **3. Confidence Calibration:** | ||
| 431 | |||
| 432 | * **Confidence vs. Evidence Strength:** Does high confidence correlate with strong evidence? | ||
| 433 | * **Confidence Distribution:** Are verdicts appropriately uncertain? | ||
| 434 | |||
| 435 | **4. Quality Gate Performance:** | ||
| 436 | |||
| 437 | * **Gate Pass Rates:** % passing each gate | ||
| 438 | * **Gate Failure Reasons:** What causes most failures? | ||
| 439 | * **Gate Effectiveness:** Sampling audit validation of gates | ||
| 440 | |||
| 441 | **5. User Engagement Metrics:** | ||
| 442 | |||
| 443 | * **Correction Rate:** How often do users flag issues? | ||
| 444 | * **Appeal Rate:** How often are corrections requested? | ||
| 445 | * **User Satisfaction:** Survey results | ||
| 446 | |||
| 447 | **Dashboard Features:** | ||
| 448 | |||
| 449 | * Public access at `/quality-metrics` | ||
| 450 | * Updated daily | ||
| 451 | * Historical trends (30/90/365 days) | ||
| 452 | * Breakdown by risk tier | ||
| 453 | * Download raw data (CSV) | ||
| 454 | |||
| 455 | **Acceptance Criteria:** | ||
| 456 | |||
| 457 | * ✅ Dashboard publicly accessible | ||
| 458 | * ✅ Updates daily | ||
| 459 | * ✅ All metrics implemented | ||
| 460 | * ✅ Historical data retained | ||
| 461 | * ✅ Transparent methodology explained | ||
| 462 | |||
| 463 | |||
| 464 | == Category 3: Media Verification == | ||
| 465 | |||
| 466 | === FR46: Image Verification System === | ||
| 467 | |||
| 468 | **Priority:** CRITICAL | ||
| 469 | **Fulfills:** UN-27 (Visual claim verification) | ||
| 470 | **Phase:** V1.0 (Basic), V1.1 (Extended) | ||
| 471 | |||
| 472 | **Purpose:** Enable users to verify image-based claims. | ||
| 473 | |||
| 474 | **V1.0 Specification (Basic):** | ||
| 475 | |||
| 476 | **1. Reverse Image Search:** | ||
| 477 | |||
| 478 | * Integration with Google Image Search API | ||
| 479 | * Integration with TinEye API | ||
| 480 | * Display earliest known appearance | ||
| 481 | * Show similar/modified versions | ||
| 482 | |||
| 483 | **2. Metadata Analysis:** | ||
| 484 | |||
| 485 | * EXIF data extraction | ||
| 486 | * Creation date/time | ||
| 487 | * Camera/device information | ||
| 488 | * GPS location (if available) | ||
| 489 | * Edit history (if available) | ||
| 490 | |||
| 491 | **3. Basic Manipulation Detection:** | ||
| 492 | |||
| 493 | * Error Level Analysis (ELA) | ||
| 494 | * Flag obviously manipulated images | ||
| 495 | * AI-powered detection is out of scope for V1.0 (deferred to V1.1+) | ||
| 496 | |||
| 497 | **UI Workflow:** | ||
| 498 | |||
| 499 | {{code}} | ||
| 500 | User uploads image | ||
| 501 | ↓ | ||
| 502 | System runs reverse search | ||
| 503 | ↓ | ||
| 504 | System extracts metadata | ||
| 505 | ↓ | ||
| 506 | System performs ELA | ||
| 507 | ↓ | ||
| 508 | Results displayed: | ||
| 509 | - Earliest known appearance | ||
| 510 | - Similar images found | ||
| 511 | - Metadata (camera, date, location) | ||
| 512 | - Manipulation indicators (if any) | ||
| 513 | ↓ | ||
| 514 | User can create claim based on findings | ||
| 515 | {{/code}} | ||
| 516 | |||
| 517 | **V1.1 Specification (Extended - Future):** | ||
| 518 | |||
| 519 | * AI-powered deepfake detection | ||
| 520 | * Acoustic signature analysis (for videos) | ||
| 521 | * Advanced forensic tools (noise patterns, compression artifacts) | ||
| 522 | |||
| 523 | **Acceptance Criteria (V1.0):** | ||
| 524 | |||
| 525 | * ✅ Reverse search functional (Google + TinEye) | ||
| 526 | * ✅ Metadata extracted correctly | ||
| 527 | * ✅ ELA results displayed | ||
| 528 | * ✅ User-friendly interface | ||
| 529 | * ✅ Results help users make informed decisions | ||
| 530 | |||
| 531 | |||
| 532 | === FR47: Archive.org Integration === | ||
| 533 | |||
| 534 | **Priority:** CRITICAL | ||
| 535 | **Fulfills:** Evidence persistence, FR5 (Evidence linking) | ||
| 536 | **Phase:** V1.0 | ||
| 537 | |||
| 538 | **Purpose:** Ensure evidence remains accessible even if original sources are deleted. | ||
| 539 | |||
| 540 | **Specification:** | ||
| 541 | |||
| 542 | **Automatic Archiving:** | ||
| 543 | |||
| 544 | When AKEL links evidence: | ||
| 545 | 1. Check if URL already archived (Wayback Machine API) | ||
| 546 | 2. If not, submit for archiving (Save Page Now API) | ||
| 547 | 3. Store both original URL and archive URL | ||
| 548 | 4. Display both to users | ||
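Step 1 can be sketched against the Wayback Machine Availability API, which answers a query for a URL with a JSON document containing an `archived_snapshots.closest` entry when a capture exists. The helper names below are hypothetical, the HTTP request itself is left out so the sketch stays free of network access, and the Save Page Now submission in step 2 is not shown.

```python
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url: str) -> str:
    """Build the Availability API request URL for an evidence URL."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': url})}"


def closest_snapshot(response: dict):
    """Extract (archive_url, timestamp) from a parsed API response.

    Returns None when no capture is reported, which signals that the
    evidence URL still needs to be submitted for archiving.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"], closest["timestamp"]
    return None
```

Storing the returned timestamp alongside the archive URL supports the "Captured: [date]" display.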
| 549 | |||
| 550 | **Archive Display:** | ||
| 551 | |||
| 552 | {{code}} | ||
| 553 | Evidence Source: [Original URL] | ||
| 554 | Archived: [Archive.org URL] (Captured: [date]) | ||
| 555 | |||
| 556 | [View Original] [View Archive] | ||
| 557 | {{/code}} | ||
| 558 | |||
| 559 | **Fallback Logic:** | ||
| 560 | |||
| 561 | * If original URL unavailable → Auto-redirect to archive | ||
| 562 | * If archive unavailable → Display warning | ||
| 563 | * If both unavailable → Flag for manual review | ||
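The fallback rules reduce to a three-way decision. In this sketch the function name, the result keys, and the warning texts are hypothetical, and the two availability flags are assumed to come from separate health checks.

```python
def resolve_evidence_link(original_url, archive_url, original_ok, archive_ok):
    """Decide which link to serve and whether to warn or escalate."""
    if original_ok:
        warning = None if archive_ok else "Archived copy unavailable"
        return {"target": original_url, "warning": warning, "needs_review": False}
    if archive_ok:
        # Original is gone: redirect the reader to the archived capture.
        return {"target": archive_url, "warning": None, "needs_review": False}
    return {
        "target": None,
        "warning": "Source and archive both unavailable",
        "needs_review": True,  # flag for manual review
    }
```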
| 564 | |||
| 565 | **API Integration:** | ||
| 566 | |||
| 567 | * Use Wayback Machine Availability API | ||
| 568 | * Use Save Page Now API (SPNv2) | ||
| 569 | * Rate limiting: 15 requests/minute (Wayback limit) | ||
| 570 | |||
| 571 | **Acceptance Criteria:** | ||
| 572 | |||
| 573 | * ✅ All evidence URLs auto-archived | ||
| 574 | * ✅ Archive links displayed to users | ||
| 575 | * ✅ Fallback to archive if original unavailable | ||
| 576 | * ✅ API rate limits respected | ||
| 577 | * ✅ Archive status visible in evidence display | ||
| 578 | |||
| 579 | |||
| 580 | == Category 4: Community Safety == | ||
| 581 | |||
| 582 | === FR48: Contributor Safety Framework === | ||
| 583 | |||
| 584 | **Priority:** CRITICAL | ||
| 585 | **Fulfills:** UN-28 (Safe contribution environment) | ||
| 586 | **Phase:** V1.0 | ||
| 587 | |||
| 588 | **Purpose:** Protect contributors from harassment, doxxing, and coordinated attacks. | ||
| 589 | |||
| 590 | **Specification:** | ||
| 591 | |||
| 592 | **1. Privacy Protection:** | ||
| 593 | |||
| 594 | * **Optional Pseudonymity:** Contributors can use pseudonyms | ||
| 595 | * **Email Privacy:** Emails never displayed publicly | ||
| 596 | * **Profile Privacy:** Contributors control what's public | ||
| 597 | * **IP Logging:** Only for abuse prevention, not public | ||
| 598 | |||
| 599 | **2. Harassment Prevention:** | ||
| 600 | |||
| 601 | * **Automated Toxicity Detection:** Flag abusive comments | ||
| 602 | * **Personal Information Detection:** Auto-block doxxing attempts | ||
| 603 | * **Coordinated Attack Detection:** Identify brigading patterns | ||
| 604 | * **Rapid Response:** Moderator alerts for harassment | ||
| 605 | |||
| 606 | **3. Safety Features:** | ||
| 607 | |||
| 608 | * **Block Users:** Contributors can block harassers | ||
| 609 | * **Private Contributions:** Option to contribute anonymously | ||
| 610 | * **Report Harassment:** One-click harassment reporting | ||
| 611 | * **Safety Resources:** Links to support resources | ||
| 612 | |||
| 613 | **4. Moderator Tools:** | ||
| 614 | |||
| 615 | * **Quick Ban:** Immediately block abusers | ||
| 616 | * **Pattern Detection:** Identify coordinated attacks | ||
| 617 | * **Appeal Process:** Fair review of moderation actions | ||
| 618 | * **Escalation:** Serious threats escalated to authorities | ||
| 619 | |||
| 620 | **5. Trusted Contributor Protection:** | ||
| 621 | |||
| 622 | * **Enhanced Privacy:** Additional protection for high-profile contributors | ||
| 623 | * **Verification:** Optional identity verification (not public) | ||
| 624 | * **Legal Support:** Resources for contributors facing legal threats | ||
| 625 | |||
| 626 | **Acceptance Criteria:** | ||
| 627 | |||
| 628 | * ✅ Pseudonyms supported | ||
| 629 | * ✅ Toxicity detection active | ||
| 630 | * ✅ Doxxing auto-blocked | ||
| 631 | * ✅ Harassment reporting functional | ||
| 632 | * ✅ Moderator tools implemented | ||
| 633 | * ✅ Safety policy published | ||
| 634 | |||
| 635 | |||
| 636 | == Category 5: Continuous Improvement == | ||
| 637 | |||
| 638 | === FR49: A/B Testing Framework === | ||
| 639 | |||
| 640 | **Priority:** CRITICAL | ||
| 641 | **Fulfills:** Continuous system improvement | ||
| 642 | **Phase:** V1.0 | ||
| 643 | |||
| 644 | **Purpose:** Test and measure improvements to AKEL prompts, algorithms, and workflows. | ||
| 645 | |||
| 646 | **Specification:** | ||
| 647 | |||
| 648 | **Test Capabilities:** | ||
| 649 | |||
| 650 | 1. **Prompt Variations:** | ||
| 651 | * Test different claim extraction prompts | ||
| 652 | * Test different verdict generation prompts | ||
| 653 | * Measure: Accuracy, clarity, completeness | ||
| 654 | |||
| 655 | 2. **Algorithm Variations:** | ||
| 656 | * Test different source scoring algorithms | ||
| 657 | * Test different confidence calculations | ||
| 658 | * Measure: Audit accuracy, user satisfaction | ||
| 659 | |||
| 660 | 3. **Workflow Variations:** | ||
| 661 | * Test different quality gate thresholds | ||
| 662 | * Test different risk tier assignments | ||
| 663 | * Measure: Publication rate, quality scores | ||
| 664 | |||
| 665 | **Implementation:** | ||
| 666 | |||
| 667 | * **Traffic Split:** 50/50 or 90/10 splits | ||
| 668 | * **Randomization:** Consistent per claim (not per user) | ||
| 669 | * **Metrics Collection:** Automatic for all variants | ||
| 670 | * **Statistical Significance:** Minimum sample size calculation | ||
| 671 | * **Rollout:** Winner promoted to 100% traffic | ||
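Per-claim consistency is commonly achieved by hashing the claim identifier instead of drawing a random number on each request. The sketch below shows one such approach; the function name and the use of SHA-256 are assumptions, and `variant_share` covers both the 50/50 and 90/10 splits.

```python
import hashlib


def assign_variant(experiment: str, claim_id: str, variant_share: float = 0.5) -> str:
    """Deterministically assign a claim to 'control' or 'variant'.

    The same (experiment, claim_id) pair always yields the same arm, and
    including the experiment name keeps assignments independent across tests.
    """
    digest = hashlib.sha256(f"{experiment}:{claim_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "variant" if bucket < variant_share else "control"
```

For a 90/10 split, `assign_variant("new-prompt", claim_id, variant_share=0.1)` sends roughly one claim in ten to the variant.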
| 672 | |||
| 673 | **A/B Test Workflow:** | ||
| 674 | |||
| 675 | {{code}} | ||
| 676 | 1. Hypothesis: "New prompt improves claim extraction" | ||
| 677 | 2. Design test: Control vs. Variant | ||
| 678 | 3. Define metrics: Extraction accuracy, completeness | ||
| 679 | 4. Run test: 7-14 days, minimum 100 claims each | ||
| 680 | 5. Analyze results: Statistical significance? | ||
| 681 | 6. Decision: Deploy winner or iterate | ||
| 682 | {{/code}} | ||
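For step 5, one common choice when the metric is a pass/fail rate (for example, the share of claims whose extraction an auditor judged correct) is a two-proportion z-test. The sketch below is an assumption about the test to use; the specification does not name one, and other metrics would need a different test.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value), where a positive z means arm B has the higher rate.
    """
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # both arms all-success or all-failure
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With the minimum of 100 claims per arm, only fairly large differences reach significance, which is worth weighing when choosing the test duration.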
| 683 | |||
| 684 | **Acceptance Criteria:** | ||
| 685 | |||
| 686 | * ✅ A/B testing framework implemented | ||
| 687 | * ✅ Can test prompt variations | ||
| 688 | * ✅ Can test algorithm variations | ||
| 689 | * ✅ Metrics automatically collected | ||
| 690 | * ✅ Statistical significance calculated | ||
| 691 | * ✅ Results inform system improvements | ||
| 692 | |||
| 693 | |||
| 694 | === FR54: Evidence Deduplication === | ||
| 695 | |||
| 696 | **Priority:** CRITICAL (POC2/Beta) | ||
| 697 | **Fulfills:** Accurate evidence counting, quality metrics | ||
| 698 | **Phase:** POC2, Beta 0, V1.0 | ||
| 699 | |||
| 700 | **Purpose:** Avoid counting the same source multiple times when it appears in different forms. | ||
| 701 | |||
| 702 | **Specification:** | ||
| 703 | |||
| 704 | **Deduplication Logic:** | ||
| 705 | |||
| 706 | 1. **URL Normalization:** | ||
| 707 | * Remove tracking parameters (?utm_source=...) | ||
| 708 | * Normalize http/https | ||
| 709 | * Normalize www/non-www | ||
| 710 | * Handle redirects | ||
| 711 | |||
| 712 | 2. **Content Similarity:** | ||
| 713 | * If two sources have >90% text similarity → Same source | ||
| 714 | * If one is subset of other → Same source | ||
| 715 | * Use fuzzy matching for minor differences | ||
| 716 | |||
| 717 | 3. **Cross-Domain Syndication:** | ||
| 718 | * Detect wire service content (AP, Reuters) | ||
| 719 | * Mark as single source if syndicated | ||
| 720 | * Count original publication only | ||
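The first two deduplication steps can be sketched with the standard library. This is an illustration under assumptions: the tracking-parameter list (`utm_*`, `fbclid`, `gclid`) is a small sample, ports and fragments are dropped, redirect resolution and syndication detection are not covered, and `difflib` stands in for whatever fuzzy matcher is eventually chosen.

```python
from difflib import SequenceMatcher
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}


def normalize_url(url: str) -> str:
    """Canonicalize scheme, host, path, and query for duplicate detection."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()  # hostname drops port and credentials
    if host.startswith("www."):
        host = host[4:]
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.lower().startswith(TRACKING_PREFIXES)
        and k.lower() not in TRACKING_KEYS
    ]
    path = parts.path.rstrip("/") or "/"
    # Force https, sort remaining parameters, and drop the fragment.
    return urlunsplit(("https", host, path, urlencode(sorted(query)), ""))


def same_content(a: str, b: str, threshold: float = 0.9) -> bool:
    """True if one text contains the other or similarity exceeds the threshold."""
    if not a or not b:
        return False
    return a in b or b in a or SequenceMatcher(None, a, b).ratio() > threshold
```

Two evidence items would then count as one source when their normalized URLs match or `same_content` returns true for their extracted text.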
| 721 | |||
| 722 | **Display:** | ||
| 723 | |||
| 724 | {{code}} | ||
| 725 | Evidence Sources (3 unique, 5 total): | ||
| 726 | |||
| 727 | 1. Original Article (NYTimes) | ||
| 728 | - Also appeared in: WashPost, Guardian (syndicated) | ||
| 729 | |||
| 730 | 2. Research Paper (Nature) | ||
| 731 | |||
| 732 | 3. Official Statement (WHO) | ||
| 733 | {{/code}} | ||
| 734 | |||
| 735 | **Acceptance Criteria:** | ||
| 736 | |||
| 737 | * ✅ URL normalization works | ||
| 738 | * ✅ Content similarity detected | ||
| 739 | * ✅ Syndicated content identified | ||
| 740 | * ✅ Unique vs. total counts accurate | ||
| 741 | * ✅ Improves evidence quality metrics | ||
| 742 | |||
| 743 | |||
| 744 | == Additional Requirements (Lower Priority) == | ||
| 745 | |||
| 746 | === FR50: OSINT Toolkit Integration === | ||
| 747 | |||
| 748 | **Priority:** HIGH (V1.1) | ||
| 749 | **Fulfills:** Advanced media verification | ||
| 750 | **Phase:** V1.1 | ||
| 751 | |||
| 752 | **Purpose:** Integrate open-source intelligence tools for advanced verification. | ||
| 753 | |||
| 754 | **Tools to Integrate:** | ||
| 755 | * InVID/WeVerify (video verification) | ||
| 756 | * Bellingcat toolkit | ||
| 757 | * Additional TBD based on V1.0 learnings | ||
| 758 | |||
| 759 | |||
| 760 | === FR51: Video Verification System === | ||
| 761 | |||
| 762 | **Priority:** HIGH (V1.1) | ||
| 763 | **Fulfills:** UN-27 (Visual claims), advanced media verification | ||
| 764 | **Phase:** V1.1 | ||
| 765 | |||
| 766 | **Purpose:** Verify video-based claims. | ||
| 767 | |||
| 768 | **Specification:** | ||
| 769 | * Keyframe extraction | ||
| 770 | * Reverse video search | ||
| 771 | * Deepfake detection (AI-powered) | ||
| 772 | * Metadata analysis | ||
| 773 | * Acoustic signature analysis | ||
| 774 | |||
| 775 | |||
| 776 | === FR52: Interactive Detection Training === | ||
| 777 | |||
| 778 | **Priority:** MEDIUM (V1.5) | ||
| 779 | **Fulfills:** Media literacy education | ||
| 780 | **Phase:** V1.5 | ||
| 781 | |||
| 782 | **Purpose:** Teach users to identify misinformation. | ||
| 783 | |||
| 784 | **Specification:** | ||
| 785 | * Interactive tutorials | ||
| 786 | * Practice exercises | ||
| 787 | * Detection quizzes | ||
| 788 | * Gamification elements | ||
| 789 | |||
| 790 | |||
| 791 | === FR53: Cross-Organizational Sharing === | ||
| 792 | |||
| 793 | **Priority:** MEDIUM (V1.5) | ||
| 794 | **Fulfills:** Collaboration with other fact-checkers | ||
| 795 | **Phase:** V1.5 | ||
| 796 | |||
| 797 | **Purpose:** Share findings with IFCN/EFCSN members. | ||
| 798 | |||
| 799 | **Specification:** | ||
| 800 | * API for fact-checking organizations | ||
| 801 | * Structured data exchange | ||
| 802 | * Privacy controls | ||
| 803 | * Attribution requirements | ||
| 804 | |||
| 805 | |||
| 806 | == Summary == | ||
| 807 | |||
| 808 | **V1.0 Critical Requirements (Must Have):** | ||
| 809 | |||
| 810 | * FR44: ClaimReview Schema ✅ | ||
| 811 | * FR45: Corrections Notification ✅ | ||
| 812 | * FR46: Image Verification ✅ | ||
| 813 | * FR47: Archive.org Integration ✅ | ||
| 814 | * FR48: Contributor Safety ✅ | ||
| 815 | * FR49: A/B Testing ✅ | ||
| 816 | * FR54: Evidence Deduplication ✅ | ||
| 817 | * NFR11: Quality Assurance Framework ✅ | ||
| 818 | * NFR12: Security Controls ✅ | ||
| 819 | * NFR13: Quality Metrics Dashboard ✅ | ||
| 820 | |||
| 821 | **V1.1+ (Future):** | ||
| 822 | |||
| 823 | * FR50: OSINT Integration | ||
| 824 | * FR51: Video Verification | ||
| 825 | * FR52: Detection Training | ||
| 826 | * FR53: Cross-Org Sharing | ||
| 827 | |||
| 828 | |||
| 829 | **Total:** 10 critical requirements for V1.0 | ||