Changes for page Requirements
Last modified by Robert Schaub on 2026/02/08 21:32
Summary
-
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
... ... @@ -1,7 +1,7 @@ 1 1 = Requirements = 2 2 3 3 {{info}} 4 -**Phase Assignments:** See [[Requirements Roadmap Matrix>> Archive.FactHarbor.Roadmap.Requirements-Roadmap-Matrix.WebHome]] for which requirements are implemented in which phases.4 +**Phase Assignments:** See [[Requirements Roadmap Matrix>>FactHarbor.Roadmap.Requirements-Roadmap-Matrix.WebHome]] for which requirements are implemented in which phases. 5 5 {{/info}} 6 6 7 7 **This page defines Roles, Content States, Rules, and System Requirements for FactHarbor.** ... ... @@ -36,7 +36,6 @@ 36 36 **Who**: Anyone (no login required) 37 37 38 38 **Can**: 39 - 40 40 * Browse and search claims 41 41 * View scenarios, evidence, verdicts, and confidence scores 42 42 * Flag issues or errors ... ... @@ -44,7 +44,6 @@ 44 44 * Submit claims automatically (new claims added if not duplicates) 45 45 46 46 **Cannot**: 47 - 48 48 * Modify content 49 49 * Access edit history details 50 50 ... ... @@ -55,7 +55,6 @@ 55 55 **Who**: Registered users (earns reputation through contributions) 56 56 57 57 **Can**: 58 - 59 59 * Everything a Reader can do 60 60 * Edit claims, evidence, and scenarios 61 61 * Add sources and citations ... ... @@ -64,7 +64,6 @@ 64 64 * Earn reputation points for quality contributions 65 65 66 66 **Reputation System**: 67 - 68 68 * New contributors: Limited edit privileges 69 69 * Established contributors (established reputation): Full edit access 70 70 * Trusted contributors (substantial reputation): Can approve certain changes ... ... @@ -72,7 +72,6 @@ 72 72 * Reputation lost through: Reverted edits, invalid flags, abuse 73 73 74 74 **Cannot**: 75 - 76 76 * Delete or hide content (only moderators) 77 77 * Override moderation decisions 78 78 ... ... 
@@ -83,7 +83,6 @@ 83 83 **Who**: Trusted community members with proven track record, appointed by governance board 84 84 85 85 **Can**: 86 - 87 87 * Review flagged content 88 88 * Hide harmful or abusive content 89 89 * Resolve disputes between contributors ... ... @@ -92,7 +92,6 @@ 92 92 * Access full audit logs 93 93 94 94 **Cannot**: 95 - 96 96 * Change governance rules 97 97 * Permanently ban users without board approval 98 98 * Override technical quality gates ... ... @@ -106,7 +106,6 @@ 106 106 **Not a permanent role**: Contacted externally when needed for contested claims in their domain 107 107 108 108 **When used**: 109 - 110 110 * Medical claims with life/safety implications 111 111 * Legal interpretations with significant impact 112 112 * Scientific claims with high controversy ... ... @@ -113,7 +113,6 @@ 113 113 * Technical claims requiring specialized knowledge 114 114 115 115 **Process**: 116 - 117 117 * Moderator identifies need for expert input 118 118 * Contact expert externally (don't require them to be users) 119 119 * Trusted Contributor provides written opinion with sources ... ... @@ -133,13 +133,11 @@ 133 133 **Status**: Visible to all users 134 134 135 135 **Includes**: 136 - 137 137 * AI-generated analyses (default state) 138 138 * User-contributed content 139 139 * Edited/improved content 140 140 141 141 **Quality Indicators** (displayed with content): 142 - 143 143 * **Confidence Score**: 0-100% (AI's confidence in analysis) 144 144 * **Source Quality Score**: 0-100% (based on source track record) 145 145 * **Controversy Flag**: If high dispute/edit activity ... ... @@ -149,7 +149,6 @@ 149 149 * **Review Status**: AI-generated / Human-reviewed / Expert-validated 150 150 151 151 **Automatic Warnings**: 152 - 153 153 * Confidence < 60%: "Low confidence - use caution" 154 154 * Source quality < 40%: "Sources may be unreliable" 155 155 * High controversy: "Disputed - multiple interpretations exist" ... ... 
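As an illustration only, the "Automatic Warnings" rules in the hunk above (confidence < 60%, source quality < 40%, high controversy) could be sketched as a small function. This is a minimal sketch under stated assumptions, not FactHarbor's implementation: the names `QualityIndicators` and `automatic_warnings` are hypothetical, and the page does not say how "high controversy" is determined, so it is passed in as a plain boolean.

```python
from __future__ import annotations

from dataclasses import dataclass

# Thresholds copied from the "Automatic Warnings" list; constant names are hypothetical.
LOW_CONFIDENCE_THRESHOLD = 60      # percent
LOW_SOURCE_QUALITY_THRESHOLD = 40  # percent


@dataclass
class QualityIndicators:
    """Hypothetical container for the indicators displayed with published content."""
    confidence: float       # Confidence Score, 0-100
    source_quality: float   # Source Quality Score, 0-100
    high_controversy: bool  # Assumption: computed elsewhere from dispute/edit activity


def automatic_warnings(q: QualityIndicators) -> list[str]:
    """Return the warning texts that apply, in the order the page lists them."""
    warnings: list[str] = []
    if q.confidence < LOW_CONFIDENCE_THRESHOLD:
        warnings.append("Low confidence - use caution")
    if q.source_quality < LOW_SOURCE_QUALITY_THRESHOLD:
        warnings.append("Sources may be unreliable")
    if q.high_controversy:
        warnings.append("Disputed - multiple interpretations exist")
    return warnings
```

Both comparisons are strict, so a confidence of exactly 60% or a source quality of exactly 40% raises no warning, matching the "<" in the listed rules.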
@@ -162,7 +162,6 @@ 162 162 **Status**: Not visible to regular users (only to moderators) 163 163 164 164 **Reasons**: 165 - 166 166 * Spam or advertising 167 167 * Personal attacks or harassment 168 168 * Illegal content ... ... @@ -171,7 +171,6 @@ 171 171 * Abuse or harmful content 172 172 173 173 **Process**: 174 - 175 175 * Automated detection flags for moderator review 176 176 * Moderator confirms and hides 177 177 * Original author notified with reason ... ... @@ -194,7 +194,6 @@ 194 194 **AKEL is the primary system**. Human contributions supplement and train AKEL. 195 195 196 196 **AKEL Must**: 197 - 198 198 * Mark all outputs as AI-generated 199 199 * Display confidence scores prominently 200 200 * Provide source citations ... ... @@ -203,7 +203,6 @@ 203 203 * Learn from human corrections 204 204 205 205 **When AKEL Makes Errors**: 206 - 207 207 1. Capture the error pattern (what, why, how common) 208 208 2. Improve the system (better prompt, model, validation) 209 209 3. Re-process affected claims automatically ... ... @@ -234,7 +234,6 @@ 234 234 === 4.1 Source Requirements === 235 235 236 236 **Track Record Over Credentials**: 237 - 238 238 * Sources evaluated by historical accuracy 239 239 * Correction policy matters 240 240 * Independence from conflicts of interest ... ... @@ -241,7 +241,6 @@ 241 241 * Methodology transparency 242 242 243 243 **Source Quality Database**: 244 - 245 245 * Automated tracking of source accuracy 246 246 * Correction frequency 247 247 * Reliability score (updated continuously) ... ... @@ -273,7 +273,6 @@ 273 273 === 4.4 Confidence Scoring === 274 274 275 275 **Automated confidence calculation based on**: 276 - 277 277 * Source quality scores 278 278 * Evidence consistency 279 279 * Contradiction detection ... ... 
@@ -281,7 +281,6 @@ 281 281 * Historical accuracy of similar claims 282 282 283 283 **Thresholds**: 284 - 285 285 * < 40%: Too low to publish (needs improvement) 286 286 * 40-60%: Published with "Low confidence" warning 287 287 * 60-80%: Published as standard ... ... @@ -298,7 +298,6 @@ 298 298 === 5.1 Risk Score Calculation === 299 299 300 300 **Factors** (weighted algorithm): 301 - 302 302 * **Domain sensitivity**: Medical, legal, safety auto-flagged higher 303 303 * **Potential impact**: Views, citations, spread 304 304 * **Controversy level**: Flags, disputes, edit wars ... ... @@ -325,7 +325,6 @@ 325 325 === 6.1 Error Capture === 326 326 327 327 **When users flag errors or make corrections**: 328 - 329 329 1. What was wrong? (categorize) 330 330 2. What should it have been? 331 331 3. Why did the system fail? (root cause) ... ... @@ -344,7 +344,6 @@ 344 344 === 6.3 Quality Metrics Dashboard === 345 345 346 346 **Track continuously**: 347 - 348 348 * Error rate by category 349 349 * Source quality distribution 350 350 * Confidence score trends ... ... @@ -370,7 +370,6 @@ 370 370 === 7.2 Anomaly Detection === 371 371 372 372 **Automated alerts for**: 373 - 374 374 * Sudden quality drops 375 375 * Unusual patterns 376 376 * Contradiction clusters ... ... @@ -423,7 +423,6 @@ 423 423 **Fulfills**: UN-2 (Context-dependent verification), UN-3 (Article summary with FactHarbor analysis summary), UN-8 (Understanding disagreement) 424 424 425 425 **Automated scenario creation**: 426 - 427 427 * AKEL analyzes claim and generates likely scenarios (use-cases and contexts) 428 428 * Each scenario includes: assumptions, definitions, boundaries, evidence context 429 429 * Users can flag incorrect scenarios ... ... @@ -490,7 +490,6 @@ 490 490 **Purpose**: Provide side-by-side comparison of what a document claims vs. 
FactHarbor's complete analysis of its credibility 491 491 492 492 **Left Panel: Article Summary**: 493 - 494 494 * Document title, source, and claimed credibility 495 495 * "The Big Picture" - main thesis or position change 496 496 * "Key Findings" - structured summary of document's main claims ... ... @@ -498,7 +498,6 @@ 498 498 * "Conclusion" - document's bottom line 499 499 500 500 **Right Panel: FactHarbor Analysis Summary**: 501 - 502 502 * FactHarbor's independent source credibility assessment 503 503 * Claim-by-claim verdicts with confidence scores 504 504 * Methodology assessment (strengths, limitations) ... ... @@ -506,7 +506,6 @@ 506 506 * Analysis ID for reference 507 507 508 508 **Design Principles**: 509 - 510 510 * No scrolling required - both panels visible simultaneously 511 511 * Visual distinction between "what they say" and "FactHarbor's analysis" 512 512 * Color coding for verdicts (supported, uncertain, refuted) ... ... @@ -514,7 +514,6 @@ 514 514 * Mobile responsive (panels stack vertically on small screens) 515 515 516 516 **Implementation Notes**: 517 - 518 518 * Generated automatically by AKEL for every analyzed document 519 519 * Updates when verdict evolves (maintains version history) 520 520 * Exportable as standalone summary report ... ... @@ -541,8 +541,7 @@ 541 541 (% style="font-size:0.9em; color:#666;" %) 542 542 ↑ WELL SUPPORTED • 87% confidence 543 543 [[Click for evidence details →]] 544 - 545 - 515 +(%%) 546 546 ))) 547 547 548 548 The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers. ... ... @@ -555,8 +555,7 @@ 555 555 ↑ UNCERTAIN • 45% confidence 556 556 Overstated - evidence shows risk reduction, not prevention 557 557 [[Click for details →]] 558 - 559 - 528 +(%%) 560 560 ))) 561 561 562 562 Dr. Maria Rodriguez, lead researcher, recommends incorporating more olive oil, fish, and vegetables into daily meals. ... ... 
@@ -569,8 +569,7 @@ 569 569 ↑ REFUTED • 15% confidence 570 570 Claim not supported by study design; correlation ≠ causation 571 571 [[Click for counter-evidence →]] 572 - 573 - 541 +(%%) 574 574 ))) 575 575 576 576 Participants also reported feeling more energetic and experiencing better sleep quality, though these were secondary measures. ... ... @@ -577,7 +577,6 @@ 577 577 ))) 578 578 579 579 **Legend:** 580 - 581 581 * 🟢 = Well-supported claim (confidence ≥75%) 582 582 * 🟡 = Uncertain claim (confidence 40-74%) 583 583 * 🔴 = Refuted/unsupported claim (confidence <40%) ... ... @@ -596,13 +596,11 @@ 596 596 **Confidence:** 87% 597 597 598 598 **Evidence Summary:** 599 - 600 600 * Meta-analysis of 12 RCTs confirms 23-28% risk reduction 601 601 * Consistent findings across multiple populations 602 602 * Published in peer-reviewed journal (high credibility) 603 603 604 604 **Uncertainty Factors:** 605 - 606 606 * Exact percentage varies by study (20-30% range) 607 607 608 608 [[View Full Analysis →]] ... ... @@ -609,7 +609,6 @@ 609 609 ))) 610 610 611 611 **Color-Coding System**: 612 - 613 613 * **Green**: Well-supported claims (confidence ≥75%, strong evidence) 614 614 * **Yellow/Orange**: Uncertain claims (confidence 40-74%, conflicting or limited evidence) 615 615 * **Red**: Refuted or unsupported claims (confidence <40%, contradicted by evidence) ... ... @@ -619,12 +619,8 @@ 619 619 620 620 (% style="width:100%; border-collapse:collapse;" %) 621 621 |=**Article Text**|=**Status**|=**Analysis** 622 -|((( 623 -A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet. 
624 -)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Context - no highlighting 625 -|((( 626 -//Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups// 627 -)))|(% style="background-color:#D4EDDA; text-align:center; padding:8px;" %)🟢 **WELL SUPPORTED**|((( 586 +|(((A recent study published in the Journal of Nutrition has revealed new findings about the Mediterranean diet.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Context - no highlighting 587 +|(((//Researchers found that Mediterranean diet followers had a 25% lower risk of heart disease compared to control groups//)))|(% style="background-color:#D4EDDA; text-align:center; padding:8px;" %)🟢 **WELL SUPPORTED**|((( 628 628 **87% confidence** 629 629 630 630 Meta-analysis of 12 RCTs confirms 23-28% risk reduction ... ... @@ -631,12 +631,8 @@ 631 631 632 632 [[View Full Analysis]] 633 633 ))) 634 -|((( 635 -The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers. 
636 -)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Methodology - no highlighting 637 -|((( 638 -//Some experts believe this diet can completely prevent heart attacks// 639 -)))|(% style="background-color:#FFF3CD; text-align:center; padding:8px;" %)🟡 **UNCERTAIN**|((( 594 +|(((The study, which followed 10,000 participants over five years, showed significant improvements in cardiovascular health markers.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Methodology - no highlighting 595 +|(((//Some experts believe this diet can completely prevent heart attacks//)))|(% style="background-color:#FFF3CD; text-align:center; padding:8px;" %)🟡 **UNCERTAIN**|((( 640 640 **45% confidence** 641 641 642 642 Overstated - evidence shows risk reduction, not prevention ... ... @@ -643,12 +643,8 @@ 643 643 644 644 [[View Details]] 645 645 ))) 646 -|((( 647 -Dr. Rodriguez recommends incorporating more olive oil, fish, and vegetables into daily meals. 648 -)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Recommendation - no highlighting 649 -|((( 650 -//The study proves that saturated fats cause heart disease// 651 -)))|(% style="background-color:#F8D7DA; text-align:center; padding:8px;" %)🔴 **REFUTED**|((( 602 +|(((Dr. Rodriguez recommends incorporating more olive oil, fish, and vegetables into daily meals.)))|(% style="text-align:center;" %)Plain text|(% style="font-style:italic; color:#888;" %)Recommendation - no highlighting 603 +|(((//The study proves that saturated fats cause heart disease//)))|(% style="background-color:#F8D7DA; text-align:center; padding:8px;" %)🔴 **REFUTED**|((( 652 652 **15% confidence** 653 653 654 654 Claim not supported by study; correlation ≠ causation ... ... 
@@ -657,7 +657,6 @@ 657 657 ))) 658 658 659 659 **Design Notes:** 660 - 661 661 * Highlighted claims use italics to distinguish from plain text 662 662 * Color backgrounds match XWiki message box colors (success/warning/error) 663 663 * Status column shows verdict prominently ... ... @@ -664,7 +664,6 @@ 664 664 * Analysis column provides quick summary with link to details 665 665 666 666 **User Actions**: 667 - 668 668 * **Hover** over highlighted claim → Tooltip appears 669 669 * **Click** highlighted claim → Detailed analysis modal/panel 670 670 * **Toggle** button to turn highlighting on/off ... ... @@ -671,18 +671,16 @@ 671 671 * **Keyboard**: Tab through highlighted claims 672 672 673 673 **Interaction Design**: 674 - 675 675 * Hover/click on highlighted claim → Show tooltip with: 676 -* Claim text 677 -* Verdict (e.g., "WELL SUPPORTED") 678 -* Confidence score (e.g., "85%") 679 -* Brief evidence summary 680 -* Link to detailed analysis 625 + * Claim text 626 + * Verdict (e.g., "WELL SUPPORTED") 627 + * Confidence score (e.g., "85%") 628 + * Brief evidence summary 629 + * Link to detailed analysis 681 681 * Toggle highlighting on/off (user preference) 682 682 * Adjustable color intensity for accessibility 683 683 684 684 **Technical Requirements**: 685 - 686 686 * Real-time highlighting as page loads (non-blocking) 687 687 * Claim boundary detection (start/end of assertion) 688 688 * Handle nested or overlapping claims ... ... 
@@ -690,19 +690,16 @@ 690 690 * Work with various content formats (HTML, plain text, PDFs) 691 691 692 692 **Performance Requirements**: 693 - 694 694 * Highlighting renders within 500ms of page load 695 695 * No perceptible delay in reading experience 696 696 * Efficient DOM manipulation (avoid reflows) 697 697 698 698 **Accessibility**: 699 - 700 700 * Color-blind friendly palette (use patterns/icons in addition to color) 701 701 * Screen reader compatible (ARIA labels for claim credibility) 702 702 * Keyboard navigation to highlighted claims 703 703 704 704 **Implementation Notes**: 705 - 706 706 * Claims extracted and analyzed by AKEL during initial processing 707 707 * Highlighting data stored as annotations with byte offsets 708 708 * Client-side rendering of highlights based on verdict data ... ... @@ -715,7 +715,6 @@ 715 715 **Fulfills**: UN-1 (Fast access to verified content), UN-16 (Clear review status) 716 716 717 717 **Simple flow**: 718 - 719 719 1. Claim submitted 720 720 2. AKEL processes (automated) 721 721 3. If confidence > threshold: Publish (labeled as AI-generated) ... ... @@ -727,7 +727,6 @@ 727 727 ==== FR10 — Moderation ==== 728 728 729 729 **Focus on abuse, not routine quality**: 730 - 731 731 * Automated abuse detection 732 732 * Moderators handle flags 733 733 * Quick response to harmful content ... ... @@ -798,7 +798,6 @@ 798 798 **Purpose:** Ensure extracted claims are factual assertions (not opinions/predictions) 799 799 800 800 **Checks:** 801 - 802 802 1. **Factual Statement Test:** Is this verifiable? (Yes/No) 803 803 2. **Opinion Detection:** Contains hedging language? ("I think", "probably", "best") 804 804 3. **Future Prediction Test:** Makes claims about future events? ... ... @@ -805,7 +805,6 @@ 805 805 4. **Specificity Score:** Contains specific entities, numbers, dates? 
806 806 807 807 **Thresholds:** 808 - 809 809 * Factual: Must be "Yes" 810 810 * Opinion markers: <2 hedging phrases 811 811 * Specificity: ≥3 specific elements ... ... @@ -817,13 +817,11 @@ 817 817 **Purpose:** Ensure AI-linked evidence actually relates to claim 818 818 819 819 **Checks:** 820 - 821 821 1. **Semantic Similarity Score:** Evidence vs. claim (embeddings) 822 822 2. **Entity Overlap:** Shared people/places/things? 823 823 3. **Topic Relevance:** Discusses claim subject? 824 824 825 825 **Thresholds:** 826 - 827 827 * Similarity: ≥0.6 (cosine similarity) 828 828 * Entity overlap: ≥1 shared entity 829 829 * Topic relevance: ≥0.5 ... ... @@ -835,13 +835,11 @@ 835 835 **Purpose:** Validate scenario assumptions are logical and complete 836 836 837 837 **Checks:** 838 - 839 839 1. **Completeness:** All required fields populated 840 840 2. **Internal Consistency:** Assumptions don't contradict 841 841 3. **Distinguishability:** Scenarios meaningfully different 842 842 843 843 **Thresholds:** 844 - 845 845 * Required fields: 100% 846 846 * Contradiction score: <0.3 847 847 * Scenario similarity: <0.8 ... ... @@ -853,7 +853,6 @@ 853 853 **Purpose:** Only publish high-confidence verdicts 854 854 855 855 **Checks:** 856 - 857 857 1. **Evidence Count:** Minimum 2 sources 858 858 2. **Source Quality:** Average reliability ≥0.6 859 859 3. **Evidence Agreement:** Supporting vs. contradicting ≥0.6 ... ... @@ -860,7 +860,6 @@ 860 860 4. **Uncertainty Factors:** Hedging in reasoning 861 861 862 862 **Confidence Tiers:** 863 - 864 864 * **HIGH (80-100%):** ≥3 sources, ≥0.7 quality, ≥80% agreement 865 865 * **MEDIUM (50-79%):** ≥2 sources, ≥0.6 quality, ≥60% agreement 866 866 * **LOW (0-49%):** <2 sources OR low quality/agreement ... ... 
@@ -867,13 +867,11 @@ 867 867 * **INSUFFICIENT:** <2 sources → DO NOT PUBLISH 868 868 869 869 **Implementation Phases:** 870 - 871 871 * **POC1:** Gates 1 & 4 only (basic validation) 872 872 * **POC2:** All 4 gates (complete framework) 873 873 * **V1.0:** Hardened with <5% hallucination rate 874 874 875 875 **Acceptance Criteria:** 876 - 877 877 * ✅ All gates operational 878 878 * ✅ Hallucination rate <5% 879 879 * ✅ Quality metrics public ... ... @@ -889,7 +889,6 @@ 889 889 ==== API Security ==== 890 890 891 891 **Rate Limiting:** 892 - 893 893 * **Analysis endpoints:** 100 requests/hour per IP 894 894 * **Read endpoints:** 1,000 requests/hour per IP 895 895 * **Search:** 500 requests/hour per IP ... ... @@ -897,24 +897,21 @@ 897 897 * **Burst protection:** Max 10 requests/second 898 898 899 899 **Authentication & Authorization:** 900 - 901 901 * **API Keys:** Required for programmatic access 902 902 * **JWT tokens:** For user sessions (1-hour expiry) 903 903 * **OAuth2:** For third-party integrations 904 904 * **Role-Based Access Control (RBAC):** 905 -* Public: Read-only access to published claims 906 -* Contributor: Submit claims, provide evidence 907 -* Moderator: Review contributions, manage quality 908 -* Admin: System configuration, user management 836 + * Public: Read-only access to published claims 837 + * Contributor: Submit claims, provide evidence 838 + * Moderator: Review contributions, manage quality 839 + * Admin: System configuration, user management 909 909 910 910 **CORS Policies:** 911 - 912 912 * Whitelist approved domains only 913 913 * No wildcard origins in production 914 914 * Credentials required for sensitive endpoints 915 915 916 916 **Input Sanitization:** 917 - 918 918 * Validate all user input against schemas 919 919 * Sanitize HTML/JavaScript in text submissions 920 920 * Prevent SQL injection (use parameterized queries) ... ... 
@@ -922,12 +922,11 @@ 922 922 * Max request size: 10MB 923 923 * File upload restrictions: Whitelist file types, scan for malware 924 924 925 ---- -854 +--- 926 926 927 927 ==== Data Security ==== 928 928 929 929 **Encryption at Rest:** 930 - 931 931 * Database encryption using AES-256 932 932 * Encrypted backups 933 933 * Key management via cloud provider KMS (AWS KMS, Google Cloud KMS) ... ... @@ -934,7 +934,6 @@ 934 934 * Regular key rotation (90-day cycle) 935 935 936 936 **Encryption in Transit:** 937 - 938 938 * HTTPS/TLS 1.3 only (no TLS 1.0/1.1) 939 939 * Strong cipher suites only 940 940 * HSTS (HTTP Strict Transport Security) enabled ... ... @@ -941,7 +941,6 @@ 941 941 * Certificate pinning for mobile apps 942 942 943 943 **Secure Credential Storage:** 944 - 945 945 * Passwords hashed with bcrypt (cost factor 12+) 946 946 * API keys encrypted in database 947 947 * Secrets stored in environment variables (never in code) ... ... @@ -948,13 +948,12 @@ 948 948 * Use secrets manager (AWS Secrets Manager, HashiCorp Vault) 949 949 950 950 **Data Privacy:** 951 - 952 952 * Minimal data collection (privacy by design) 953 953 * User data deletion on request (GDPR compliance) 954 954 * PII encryption in database 955 955 * Anonymize logs (no PII in log files) 956 956 957 ---- -882 +--- 958 958 959 959 ==== Application Security ==== 960 960 ... ... @@ -972,7 +972,6 @@ 972 972 10. **Server-Side Request Forgery:** URL validation, whitelist domains 973 973 974 974 **Security Headers:** 975 - 976 976 * `Content-Security-Policy`: Strict CSP to prevent XSS 977 977 * `X-Frame-Options`: DENY (prevent clickjacking) 978 978 * `X-Content-Type-Options`: nosniff ... ... 
@@ -980,7 +980,6 @@ 980 980 * `Permissions-Policy`: Restrict browser features 981 981 982 982 **Dependency Vulnerability Scanning:** 983 - 984 984 * **Tools:** Snyk, Dependabot, npm audit, pip-audit 985 985 * **Frequency:** Daily automated scans 986 986 * **Action:** Patch critical vulnerabilities within 24 hours ... ... @@ -987,34 +987,30 @@ 987 987 * **Policy:** No known high/critical CVEs in production 988 988 989 989 **Security Audits:** 990 - 991 991 * **Internal:** Quarterly security reviews 992 992 * **External:** Annual penetration testing by certified firm 993 993 * **Bug Bounty:** Public bug bounty program (V1.1+) 994 994 * **Compliance:** SOC 2 Type II certification target (V1.5) 995 995 996 ---- -918 +--- 997 997 998 998 ==== Operational Security ==== 999 999 1000 1000 **DDoS Protection:** 1001 - 1002 1002 * CloudFlare or AWS Shield 1003 1003 * Rate limiting at CDN layer 1004 1004 * Automatic IP blocking for abuse patterns 1005 1005 1006 1006 **Monitoring & Alerting:** 1007 - 1008 1008 * Real-time security event monitoring 1009 1009 * Alerts for: 1010 -* Failed login attempts (>5 in 10 minutes) 1011 -* API abuse patterns 1012 -* Unusual data access patterns 1013 -* Security scan detections 930 + * Failed login attempts (>5 in 10 minutes) 931 + * API abuse patterns 932 + * Unusual data access patterns 933 + * Security scan detections 1014 1014 * Integration with SIEM (Security Information and Event Management) 1015 1015 1016 1016 **Incident Response:** 1017 - 1018 1018 * Documented incident response plan 1019 1019 * Security incident classification (P1-P4) 1020 1020 * On-call rotation for security issues ... ... 
@@ -1022,18 +1022,16 @@ 1022 1022 * Public disclosure policy (coordinated disclosure) 1023 1023 1024 1024 **Backup & Recovery:** 1025 - 1026 1026 * Daily encrypted backups 1027 1027 * 30-day retention period 1028 1028 * Tested recovery procedures (quarterly) 1029 1029 * Disaster recovery plan (RTO: 4 hours, RPO: 1 hour) 1030 1030 1031 ---- -949 +--- 1032 1032 1033 1033 ==== Compliance & Standards ==== 1034 1034 1035 1035 **GDPR Compliance:** 1036 - 1037 1037 * User consent management 1038 1038 * Right to access data 1039 1039 * Right to deletion ... ... @@ -1041,7 +1041,6 @@ 1041 1041 * Privacy policy published 1042 1042 1043 1043 **Accessibility:** 1044 - 1045 1045 * WCAG 2.1 AA compliance 1046 1046 * Screen reader compatibility 1047 1047 * Keyboard navigation ... ... @@ -1048,7 +1048,6 @@ 1048 1048 * Alt text for images 1049 1049 1050 1050 **Browser Support:** 1051 - 1052 1052 * Modern browsers only (Chrome/Edge/Firefox/Safari latest 2 versions) 1053 1053 * No IE11 support 1054 1054 ... ... @@ -1075,18 +1075,16 @@ 1075 1075 1076 1076 **Core Metrics to Display:** 1077 1077 1078 -* \\ 1079 -** \\ 1080 -**1. Verdict Quality Metrics 993 +**1. Verdict Quality Metrics** 1081 1081 1082 1082 **TIGERScore (Fact-Checking Quality):** 1083 - 1084 1084 * **Definition:** Measures how well generated verdicts match expert fact-checker judgments 1085 1085 * **Scale:** 0-100 (higher is better) 1086 1086 * **Calculation:** Using TIGERScore framework (Truth-conditional accuracy, Informativeness, Generality, Evaluativeness, Relevance) 1087 1087 * **Target:** Average ≥80 for production release 1088 1088 * **Display:** 1089 -{{code}}Verdict Quality (TIGERScore): 1001 +{{code}} 1002 +Verdict Quality (TIGERScore): 1090 1090 Overall: 84.2 ▲ (+2.1 from last month) 1091 1091 1092 1092 Distribution: ... ... 
@@ -1094,18 +1094,19 @@ 1094 1094 Good (60-80): 28% 1095 1095 Needs Improvement (<60): 5% 1096 1096 1097 -Trend: [Graph showing improvement over time]{{/code}} 1010 +Trend: [Graph showing improvement over time] 1011 +{{/code}} 1098 1098 1099 1099 **2. Hallucination & Faithfulness Metrics** 1100 1100 1101 1101 **AlignScore (Faithfulness to Evidence):** 1102 - 1103 1103 * **Definition:** Measures how well verdicts align with actual evidence content 1104 1104 * **Scale:** 0-1 (higher is better) 1105 1105 * **Purpose:** Detect AI hallucinations (making claims not supported by evidence) 1106 1106 * **Target:** Average ≥0.85, hallucination rate <5% 1107 1107 * **Display:** 1108 -{{code}}Evidence Faithfulness (AlignScore): 1021 +{{code}} 1022 +Evidence Faithfulness (AlignScore): 1109 1109 Average: 0.87 ▼ (-0.02 from last month) 1110 1110 1111 1111 Hallucination Rate: 4.2% ... ... @@ -1112,24 +1112,24 @@ 1112 1112 - Claims without evidence support: 3.1% 1113 1113 - Misrepresented evidence: 1.1% 1114 1114 1115 -Action: Prompt engineering review scheduled{{/code}} 1029 +Action: Prompt engineering review scheduled 1030 +{{/code}} 1116 1116 1117 1117 **3. Evidence Quality Metrics** 1118 1118 1119 1119 **Source Reliability:** 1120 - 1121 1121 * Average source quality score (0-1 scale) 1122 1122 * Distribution of high/medium/low quality sources 1123 1123 * Publisher track record trends 1124 1124 1125 1125 **Evidence Coverage:** 1126 - 1127 1127 * Average number of sources per claim 1128 1128 * Percentage of claims with ≥2 sources (EFCSN minimum) 1129 1129 * Geographic diversity of sources 1130 1130 1131 1131 **Display:** 1132 -{{code}}Evidence Quality: 1045 +{{code}} 1046 +Evidence Quality: 1133 1133 1134 1134 Average Sources per Claim: 4.2 1135 1135 Claims with ≥2 sources: 94% (EFCSN compliant) ... ... 
@@ -1139,23 +1139,24 @@ 1139 1139 Medium quality (0.5-0.8): 43% 1140 1140 Low quality (<0.5): 9% 1141 1141 1142 -Geographic Diversity: 23 countries represented{{/code}} 1056 +Geographic Diversity: 23 countries represented 1057 +{{/code}} 1143 1143 1144 1144 **4. Contributor Consensus Metrics** (when human reviewers involved) 1145 1145 1146 1146 **Inter-Rater Reliability (IRR):** 1147 - 1148 1148 * **Calculation:** Cohen's Kappa or Fleiss' Kappa for multiple raters 1149 1149 * **Scale:** 0-1 (higher is better) 1150 1150 * **Interpretation:** 1151 -* >0.8: Almost perfect agreement 1152 -* 0.6-0.8: Substantial agreement 1153 -* 0.4-0.6: Moderate agreement 1154 -* <0.4: Poor agreement 1065 + * >0.8: Almost perfect agreement 1066 + * 0.6-0.8: Substantial agreement 1067 + * 0.4-0.6: Moderate agreement 1068 + * <0.4: Poor agreement 1155 1155 * **Target:** Maintain ≥0.7 (substantial agreement) 1156 1156 1157 1157 **Display:** 1158 -{{code}}Contributor Consensus: 1072 +{{code}} 1073 +Contributor Consensus: 1159 1159 1160 1160 Inter-Rater Reliability (IRR): 0.73 (Substantial agreement) 1161 1161 - Verdict agreement: 78% ... ... @@ -1163,9 +1163,10 @@ 1163 1163 - Scenario structure agreement: 69% 1164 1164 1165 1165 Cases requiring moderator review: 12 1166 -Moderator override rate: 8%{{/code}} 1081 +Moderator override rate: 8% 1082 +{{/code}} 1167 1167 1168 ---- -1084 +--- 1169 1169 1170 1170 ==== Quality Dashboard Implementation ==== 1171 1171 ... ... @@ -1172,7 +1172,6 @@ 1172 1172 **Dashboard Location:** `/quality-metrics` 1173 1173 1174 1174 **Update Frequency:** 1175 - 1176 1176 * **POC2:** Weekly manual updates 1177 1177 * **Beta 0:** Daily automated updates 1178 1178 * **V1.0:** Real-time metrics (updated hourly) ... ... @@ -1222,7 +1222,7 @@ 1222 1222 1223 1223 {{/code}} 1224 1224 1225 ---- -1140 +--- 1226 1226 1227 1227 ==== Continuous Improvement Feedback Loop ==== 1228 1228 ... ... 
@@ -1229,36 +1229,31 @@ 1229 1229 **How Metrics Inform AKEL Improvements:** 1230 1230 1231 1231 1. **Identify Weak Areas:** 1147 + * Low TIGERScore → Review prompt engineering 1148 + * High hallucination → Strengthen evidence grounding 1149 + * Low IRR → Clarify evaluation criteria 1232 1232 1233 -* Low TIGERScore → Review prompt engineering 1234 -* High hallucination → Strengthen evidence grounding 1235 -* Low IRR → Clarify evaluation criteria 1236 - 1237 1237 2. **A/B Testing Integration:** 1152 + * Test prompt variations 1153 + * Measure impact on quality metrics 1154 + * Deploy winners automatically 1238 1238 1239 -* Test prompt variations 1240 -* Measure impact on quality metrics 1241 -* Deploy winners automatically 1242 - 1243 1243 3. **Alert Thresholds:** 1157 + * TIGERScore drops below 75 → Alert team 1158 + * Hallucination rate exceeds 7% → Pause auto-publishing 1159 + * IRR below 0.6 → Moderator training needed 1244 1244 1245 -* TIGERScore drops below 75 → Alert team 1246 -* Hallucination rate exceeds 7% → Pause auto-publishing 1247 -* IRR below 0.6 → Moderator training needed 1248 - 1249 1249 4. **Monthly Quality Reviews:** 1162 + * Analyze trends 1163 + * Identify systematic issues 1164 + * Plan prompt improvements 1165 + * Update AKEL models 1250 1250 1251 -* Analyze trends 1252 -* Identify systematic issues 1253 -* Plan prompt improvements 1254 -* Update AKEL models 1167 +--- 1255 1255 1256 ----- 1257 - 1258 1258 ==== Metric Calculation Details ==== 1259 1259 1260 1260 **TIGERScore Implementation:** 1261 - 1262 1262 * Reference: https://github.com/TIGER-AI-Lab/TIGERScore 1263 1263 * Input: Generated verdict + reference verdict (from expert) 1264 1264 * Output: 0-100 score across 5 dimensions ... ... 
@@ -1265,7 +1265,6 @@ 1265 1265 * Requires: Test set of expert-reviewed claims (minimum 100) 1266 1266 1267 1267 **AlignScore Implementation:** 1268 - 1269 1269 * Reference: https://github.com/yuh-zha/AlignScore 1270 1270 * Input: Generated verdict + source evidence text 1271 1271 * Output: 0-1 faithfulness score ... ... @@ -1272,12 +1272,11 @@ 1272 1272 * Calculation: Semantic alignment between claim and evidence 1273 1273 1274 1274 **Source Quality Scoring:** 1275 - 1276 1276 * Use existing source reliability database (e.g., NewsGuard, MBFC) 1277 1277 * Factor in: Publication history, corrections record, transparency 1278 1278 * Scale: 0-1 (weighted average across sources) 1279 1279 1280 ---- -1188 +--- 1281 1281 1282 1282 ==== Integration Points ==== 1283 1283 ... ... @@ -1305,13 +1305,11 @@ 1305 1305 == 14. Related Pages == 1306 1306 1307 1307 **Non-Functional Requirements (see Section 9):** 1308 - 1309 1309 * [[NFR11 — AKEL Quality Assurance Framework>>#NFR11]] 1310 1310 * [[NFR12 — Security Controls>>#NFR12]] 1311 1311 * [[NFR13 — Quality Metrics Transparency>>#NFR13]] 1312 1312 1313 1313 **Other Requirements:** 1314 - 1315 1315 * [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] 1316 1316 * [[V1.0 Requirements>>FactHarbor.Specification.Requirements.V10.]] 1317 1317 * [[Gap Analysis>>FactHarbor.Specification.Requirements.GapAnalysis]] ... ... 
@@ -1321,7 +1321,7 @@
1321 1321 * [[Data Model>>FactHarbor.Specification.Data Model.WebHome]] - Data structures supporting requirements
1322 1322 * [[Workflows>>FactHarbor.Specification.Workflows.WebHome]] - User interaction workflows
1323 1323 * [[AKEL>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] - AI system fulfilling automation requirements
1324 -* [[Global Rules>> Archive.FactHarbor.Organisation.How-We-Work-Together.GlobalRules.WebHome]]
1230 +* [[Global Rules>>FactHarbor.Organisation.How-We-Work-Together.GlobalRules.WebHome]]
1325 1325 * [[Privacy Policy>>FactHarbor.Organisation.How-We-Work-Together.Privacy-Policy]]
1326 1326
1327 1327 = V0.9.70 Additional Requirements =
... ...
@@ -1379,7 +1379,6 @@
1379 1379 **FactHarbor-Specific Mapping:**
1380 1380
1381 1381 **Likelihood Score to Rating Scale:**
1382 -
1383 1383 * 80-100% likelihood → 5 (Highly Supported)
1384 1384 * 60-79% likelihood → 4 (Supported)
1385 1385 * 40-59% likelihood → 3 (Mixed/Uncertain)
... ...
@@ -1387,7 +1387,6 @@
1387 1387 * 0-19% likelihood → 1 (Refuted)
1388 1388
1389 1389 **Multiple Scenarios Handling:**
1390 -
1391 1391 * If claim has multiple scenarios with different verdicts, generate **separate ClaimReview** for each scenario
1392 1392 * Add `disambiguatingDescription` field explaining scenario context
1393 1393 * Example: "Scenario: If interpreted as referring to 2023 data..."
... ...
@@ -1439,9 +1439,7 @@
1439 1439
1440 1440 ==== Notification Mechanisms ====
1441 1441
1442 -* \\
1443 -** \\
1444 -**1. In-Page Banner:
1346 +**1. In-Page Banner:**
1445 1445
1446 1446 Display prominent banner on claim page:
1447 1447
... ...
@@ -1461,10 +1461,10 @@
1461 1461
1462 1462 * Public changelog at `/claims/{id}/corrections`
1463 1463 * Displays for each correction:
1464 -* Date/time of correction
1465 -* What changed (before/after comparison)
1466 -* Why changed (reason if provided)
1467 -* Who made change (AKEL auto-update vs. contributor override)
1366 + * Date/time of correction
1367 + * What changed (before/after comparison)
1368 + * Why changed (reason if provided)
1369 + * Who made change (AKEL auto-update vs. contributor override)
1468 1468
1469 1469 **3. Email Notifications (opt-in):**
1470 1470
... ...
@@ -1523,25 +1523,23 @@
1523 1523 **Purpose:** Find earlier uses of the image to verify context
1524 1524
1525 1525 **Implementation:**
1526 -
1527 1527 * Integrate APIs:
1528 -* **Google Vision AI** (reverse search)
1529 -* **TinEye** (oldest known uses)
1530 -* **Bing Visual Search** (broad coverage)
1429 + * **Google Vision AI** (reverse search)
1430 + * **TinEye** (oldest known uses)
1431 + * **Bing Visual Search** (broad coverage)
1531 1531
1532 1532 **Process:**
1533 -
1534 1534 1. Extract image from claim or user upload
1535 1535 2. Query multiple reverse search services
1536 1536 3. Analyze results for:
1437 + * Earliest known publication
1438 + * Original context (what was it really showing?)
1439 + * Publication timeline
1440 + * Geographic spread
1537 1537
1538 -* Earliest known publication
1539 -* Original context (what was it really showing?)
1540 -* Publication timeline
1541 -* Geographic spread
1542 -
1543 1543 **Output:**
1544 -{{code}}Reverse Image Search Results:
1443 +{{code}}
1444 +Reverse Image Search Results:
1545 1545
1546 1546 Earliest known use: 2019-03-15 (5 years before claim)
1547 1547 Original context: "Photo from 2019 flooding in Mumbai"
... ...
@@ -1554,9 +1554,10 @@
1554 1554 • 2020-07-22: Bangladesh monsoon
1555 1555 • 2024-10-15: Current claim (misattributed)
1556 1556
1557 -[View full timeline]{{/code}}
1457 +[View full timeline]
1458 +{{/code}}
1558 1558
1559 ---- -
1460 +---
1560 1560
1561 1561 **Method 2: AI Manipulation Detection**
1562 1562
... ...
@@ -1563,41 +1563,36 @@
1563 1563 **Purpose:** Detect deepfakes, face swaps, and digital alterations
1564 1564
1565 1565 **Implementation:**
1566 -
1567 1567 * Integrate detection services:
1568 -* **Sensity AI** (deepfake detection)
1569 -* **Reality Defender** (multimodal analysis)
1570 -* **AWS Rekognition** (face detection inconsistencies)
1468 + * **Sensity AI** (deepfake detection)
1469 + * **Reality Defender** (multimodal analysis)
1470 + * **AWS Rekognition** (face detection inconsistencies)
1571 1571
1572 1572 **Detection Categories:**
1573 -
1574 1574 1. **Face Manipulation:**
1474 + * Deepfake face swaps
1475 + * Expression manipulation
1476 + * Identity replacement
1575 1575
1576 -* Deepfake face swaps
1577 -* Expression manipulation
1578 -* Identity replacement
1579 -
1580 1580 2. **Image Manipulation:**
1479 + * Copy-paste artifacts
1480 + * Clone stamp detection
1481 + * Content-aware fill detection
1482 + * JPEG compression inconsistencies
1581 1581
1582 -* Copy-paste artifacts
1583 -* Clone stamp detection
1584 -* Content-aware fill detection
1585 -* JPEG compression inconsistencies
1586 -
1587 1587 3. **AI Generation:**
1485 + * Detect fully AI-generated images
1486 + * Identify generation artifacts
1487 + * Check for model signatures
1588 1588
1589 -* Detect fully AI-generated images
1590 -* Identify generation artifacts
1591 -* Check for model signatures
1592 -
1593 1593 **Confidence Scoring:**
1594 -
1595 1595 * **HIGH (80-100%):** Strong evidence of manipulation
1596 1596 * **MEDIUM (50-79%):** Suspicious artifacts detected
1597 1597 * **LOW (0-49%):** Minor inconsistencies or inconclusive
1598 1598
1599 1599 **Output:**
1600 -{{code}}Manipulation Analysis:
1495 +{{code}}
1496 +Manipulation Analysis:
1601 1601
1602 1602 Face Manipulation: LOW RISK (12%)
1603 1603 Image Editing: MEDIUM RISK (64%)
... ...
@@ -1606,9 +1606,10 @@
1606 1606
1607 1607 AI Generation: LOW RISK (8%)
1608 1608
1609 -⚠️ Possible manipulation detected. Manual review recommended.{{/code}}
1505 +⚠️ Possible manipulation detected. Manual review recommended.
1506 +{{/code}}
1610 1610
1611 ---- -
1508 +---
1612 1612
1613 1613 **Method 3: Metadata Analysis (EXIF)**
1614 1614
... ...
@@ -1615,7 +1615,6 @@
1615 1615 **Purpose:** Extract technical details that may reveal manipulation or misattribution
1616 1616
1617 1617 **Extracted Data:**
1618 -
1619 1619 * **Camera/Device:** Make, model, software
1620 1620 * **Timestamps:** Original date, modification dates
1621 1621 * **Location:** GPS coordinates (if present)
... ...
@@ -1623,7 +1623,6 @@
1623 1623 * **File Properties:** Resolution, compression, format conversions
1624 1624
1625 1625 **Red Flags:**
1626 -
1627 1627 * Metadata completely stripped (suspicious)
1628 1628 * Timestamp conflicts with claimed date
1629 1629 * GPS location conflicts with claimed location
... ...
@@ -1631,7 +1631,8 @@
1631 1631 * Creation date after modification date (impossible)
1632 1632
1633 1633 **Output:**
1634 -{{code}}Image Metadata:
1529 +{{code}}
1530 +Image Metadata:
1635 1635
1636 1636 Camera: iPhone 14 Pro
1637 1637 Original date: 2023-08-12 14:32:15
... ...
@@ -1643,20 +1643,19 @@
1643 1643 Claim says: "Taken in Los Angeles"
1644 1644 EXIF says: New York City
1645 1645
1646 -⚠️ Edited 14 months after capture{{/code}}
1542 +⚠️ Edited 14 months after capture
1543 +{{/code}}
1647 1647
1648 ---- -
1545 +---
1649 1649
1650 1650 ==== Verification Workflow ====
1651 1651
1652 1652 **Automatic Triggers:**
1653 -
1654 1654 1. User submits claim with image
1655 1655 2. Article being analyzed contains images
1656 1656 3. Social media post includes photos
1657 1657
1658 1658 **Process:**
1659 -
1660 1660 1. Extract images from content
1661 1661 2. Run all 3 verification methods in parallel
1662 1662 3. Aggregate results into confidence score
... ...
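Editorial sketch (not part of the recorded page change): the "run all 3 verification methods in parallel, then aggregate" steps in the Verification Workflow hunk above might look roughly like the following. The three method functions and their fixed scores are hypothetical placeholders, not real service calls. The page does not say how results are aggregated, so taking the highest single score is an assumption. Reusing the HIGH/MEDIUM/LOW bands from the Confidence Scoring list for the aggregate score is also an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical stand-ins for the three verification methods. Real versions
# would call the reverse-search, manipulation-detection, and EXIF services.
# Each returns a risk score from 0 to 100 (fixed placeholder values here).
def reverse_image_search(image: bytes) -> float:
    return 35.0

def manipulation_detection(image: bytes) -> float:
    return 64.0

def metadata_analysis(image: bytes) -> float:
    return 20.0

METHODS: Dict[str, Callable[[bytes], float]] = {
    "reverse_search": reverse_image_search,
    "manipulation": manipulation_detection,
    "metadata": metadata_analysis,
}

def risk_band(score: float) -> str:
    """Map a 0-100 score to the HIGH / MEDIUM / LOW bands."""
    if score >= 80:
        return "HIGH"
    if score >= 50:
        return "MEDIUM"
    return "LOW"

def verify_image(image: bytes) -> dict:
    """Run every method in parallel, then aggregate the scores."""
    with ThreadPoolExecutor(max_workers=len(METHODS)) as pool:
        futures = {name: pool.submit(fn, image) for name, fn in METHODS.items()}
        scores = {name: future.result() for name, future in futures.items()}
    # Assumption: the overall score is the highest single-method score.
    overall = max(scores.values())
    return {"scores": scores, "overall": overall, "band": risk_band(overall)}

if __name__ == "__main__":
    print(verify_image(b"example image bytes"))
```

A worst-case (maximum) aggregate is only one option. A weighted average would hide a single strong manipulation signal behind two clean results, so the choice depends on how cautious the published score should be.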
@@ -1691,16 +1691,14 @@
1691 1691 ==== Cost Considerations ====
1692 1692
1693 1693 **API Costs (estimated per image):**
1694 -
1695 1695 * Google Vision AI: $0.001-0.003
1696 1696 * TinEye: $0.02 (commercial API)
1697 1697 * Sensity AI: $0.05-0.10
1698 1698 * AWS Rekognition: $0.001-0.002
1699 1699
1700 -**Total per image:** $0.07-0.15 **
1594 +**Total per image:** ~$0.07-0.15
1701 1701
1702 1702 **Mitigation Strategies:**
1703 -
1704 1704 * Cache results for duplicate images
1705 1705 * Use free tier quotas where available
1706 1706 * Prioritize higher-value claims for deep analysis
... ...
@@ -1727,7 +1727,6 @@
1727 1727 **Automatic Archiving:**
1728 1728
1729 1729 When AKEL links evidence:
1730 -
1731 1731 1. Check if URL already archived (Wayback Machine API)
1732 1732 2. If not, submit for archiving (Save Page Now API)
1733 1733 3. Store both original URL and archive URL
... ...
@@ -1762,10 +1762,8 @@
1762 1762 * ✅ API rate limits respected
1763 1763 * ✅ Archive status visible in evidence display
1764 1764
1765 -== Category 4: Community Safety ==
1657 +== Category 4: Community Safety ===== FR48: Contributor Safety Framework ===
1766 1766
1767 - FR48: Contributor Safety Framework ===
1768 -
1769 1769 **Importance:** CRITICAL
1770 1770 **Fulfills:** UN-28 (Safe contribution environment)
1771 1771
... ...
@@ -1773,9 +1773,7 @@
1773 1773
1774 1774 **Specification:**
1775 1775
1776 -* \\
1777 -** \\
1778 -**1. Privacy Protection:
1666 +**1. Privacy Protection:**
1779 1779
1780 1780 * **Optional Pseudonymity:** Contributors can use pseudonyms
1781 1781 * **Email Privacy:** Emails never displayed publicly
... ...
@@ -1818,10 +1818,8 @@
1818 1818 * ✅ Moderator tools implemented
1819 1819 * ✅ Safety policy published
1820 1820
1821 -== Category 5: Continuous Improvement ==
1709 +== Category 5: Continuous Improvement ===== FR49: A/B Testing Framework ===
1822 1822
1823 - FR49: A/B Testing Framework ===
1824 -
1825 1825 **Importance:** CRITICAL
1826 1826 **Fulfills:** Continuous system improvement
1827 1827
... ...
@@ -1832,23 +1832,20 @@
1832 1832 **Test Capabilities:**
1833 1833
1834 1834 1. **Prompt Variations:**
1721 + * Test different claim extraction prompts
1722 + * Test different verdict generation prompts
1723 + * Measure: Accuracy, clarity, completeness
1835 1835
1836 -* Test different claim extraction prompts
1837 -* Test different verdict generation prompts
1838 -* Measure: Accuracy, clarity, completeness
1839 -
1840 1840 2. **Algorithm Variations:**
1726 + * Test different source scoring algorithms
1727 + * Test different confidence calculations
1728 + * Measure: Audit accuracy, user satisfaction
1841 1841
1842 -* Test different source scoring algorithms
1843 -* Test different confidence calculations
1844 -* Measure: Audit accuracy, user satisfaction
1845 -
1846 1846 3. **Workflow Variations:**
1731 + * Test different quality gate thresholds
1732 + * Test different risk tier assignments
1733 + * Measure: Publication rate, quality scores
1847 1847
1848 -* Test different quality gate thresholds
1849 -* Test different risk tier assignments
1850 -* Measure: Publication rate, quality scores
1851 -
1852 1852 **Implementation:**
1853 1853
1854 1854 * **Traffic Split:** 50/50 or 90/10 splits
... ...
@@ -1889,24 +1889,21 @@
1889 1889 **Deduplication Logic:**
1890 1890
1891 1891 1. **URL Normalization:**
1775 + * Remove tracking parameters (?utm_source=...)
1776 + * Normalize http/https
1777 + * Normalize www/non-www
1778 + * Handle redirects
1892 1892
1893 -* Remove tracking parameters (?utm_source=...)
1894 -* Normalize http/https
1895 -* Normalize www/non-www
1896 -* Handle redirects
1897 -
1898 1898 2. **Content Similarity:**
1781 + * If two sources have >90% text similarity → Same source
1782 + * If one is subset of other → Same source
1783 + * Use fuzzy matching for minor differences
1899 1899
1900 -* If two sources have >90% text similarity → Same source
1901 -* If one is subset of other → Same source
1902 -* Use fuzzy matching for minor differences
1903 -
1904 1904 3. **Cross-Domain Syndication:**
1786 + * Detect wire service content (AP, Reuters)
1787 + * Mark as single source if syndicated
1788 + * Count original publication only
1905 1905
1906 -* Detect wire service content (AP, Reuters)
1907 -* Mark as single source if syndicated
1908 -* Count original publication only
1909 -
1910 1910 **Display:**
1911 1911
1912 1912 {{code}}
... ...
@@ -1928,16 +1928,13 @@
1928 1928 * ✅ Unique vs. total counts accurate
1929 1929 * ✅ Improves evidence quality metrics
1930 1930
1931 -== Additional Requirements (Lower Importance) ==
1811 +== Additional Requirements (Lower Importance) ===== FR50: OSINT Toolkit Integration ===
1932 1932
1933 - FR50: OSINT Toolkit Integration ===
1934 -
1935 1935 **Fulfills:** Advanced media verification
1936 1936
1937 1937 **Purpose:** Integrate open-source intelligence tools for advanced verification.
1938 1938
1939 1939 **Tools to Integrate:**
1940 -
1941 1941 * InVID/WeVerify (video verification)
1942 1942 * Bellingcat toolkit
1943 1943 * Additional TBD based on V1.0 learnings
... ...
@@ -1949,7 +1949,6 @@
1949 1949 **Purpose:** Verify video-based claims.
1950 1950
1951 1951 **Specification:**
1952 -
1953 1953 * Keyframe extraction
1954 1954 * Reverse video search
1955 1955 * Deepfake detection (AI-powered)
... ...
@@ -1963,7 +1963,6 @@
1963 1963 **Purpose:** Teach users to identify misinformation.
1964 1964
1965 1965 **Specification:**
1966 -
1967 1967 * Interactive tutorials
1968 1968 * Practice exercises
1969 1969 * Detection quizzes
... ...
@@ -1976,7 +1976,6 @@
1976 1976 **Purpose:** Share findings with IFCN/EFCSN members.
1977 1977
1978 1978 **Specification:**
1979 -
1980 1980 * API for fact-checking organizations
1981 1981 * Structured data exchange
1982 1982 * Privacy controls
... ...
@@ -2017,24 +2017,21 @@
2017 2017 **Deduplication Logic:**
2018 2018
2019 2019 1. **URL Normalization:**
1894 + * Remove tracking parameters (?utm_source=...)
1895 + * Normalize http/https
1896 + * Normalize www/non-www
1897 + * Handle redirects
2020 2020
2021 -* Remove tracking parameters (?utm_source=...)
2022 -* Normalize http/https
2023 -* Normalize www/non-www
2024 -* Handle redirects
2025 -
2026 2026 2. **Content Similarity:**
1900 + * If two sources have >90% text similarity → Same source
1901 + * If one is subset of other → Same source
1902 + * Use fuzzy matching for minor differences
2027 2027
2028 -* If two sources have >90% text similarity → Same source
2029 -* If one is subset of other → Same source
2030 -* Use fuzzy matching for minor differences
2031 -
2032 2032 3. **Cross-Domain Syndication:**
1905 + * Detect wire service content (AP, Reuters)
1906 + * Mark as single source if syndicated
1907 + * Count original publication only
2033 2033
2034 -* Detect wire service content (AP, Reuters)
2035 -* Mark as single source if syndicated
2036 -* Count original publication only
2037 -
2038 2038 **Display:**
2039 2039
2040 2040 {{code}}
... ...
@@ -2056,10 +2056,8 @@
2056 2056 * ✅ Unique vs. total counts accurate
2057 2057 * ✅ Improves evidence quality metrics
2058 2058
2059 -== Additional Requirements (Lower Importance) ==
1930 +== Additional Requirements (Lower Importance) ===== FR7: Automated Verdicts (Enhanced with Quality Gates) ===
2060 2060
2061 - FR7: Automated Verdicts (Enhanced with Quality Gates) ===
2062 -
2063 2063 **POC1+ Enhancement:**
2064 2064
2065 2065 After AKEL generates verdict, it passes through quality gates:
... ...
@@ -2080,7 +2080,6 @@
2080 2080 {{/code}}
2081 2081
2082 2082 **Updated Verdict States:**
2083 -
2084 2084 * PUBLISHED
2085 2085 * INSUFFICIENT_EVIDENCE
2086 2086 * NON_FACTUAL_CLAIM
... ...
@@ -2102,3 +2102,4 @@
2102 2102 Avg Source Quality: 0.73
2103 2103 Quality Score: 8.5/10
2104 2104 {{/code}}
1973 +