Changes for page Requirements
Last modified by Robert Schaub on 2025/12/23 11:03
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -306,7 +306,7 @@
 4. How common is this pattern?
 5. Store in ErrorPattern table (improvement queue)

-=== 6.2 Weekly Improvement Cycle ===
+=== 6.2 Continuous Improvement Cycle ===

 1. **Review**: Analyze top error patterns
 2. **Develop**: Create fix (prompt, model, validation)
@@ -326,7 +326,7 @@
 * Re-work rate
 * Claims processed per hour

-**Goal**: 10% monthly improvement in error rate
+**Goal**: continuous improvement in error rate

 == 7. Automated Quality Monitoring ==

@@ -803,185 +803,405 @@

 === NFR12: Security Controls ===

-**Fulfills:** Production readiness, legal compliance
+**Fulfills:** Data protection, system integrity, user privacy, production readiness

-**Requirements:**
-1. **Input Validation:** SQL injection, XSS, CSRF prevention
-2. **Rate Limiting:** 5 analyses per minute per IP
-3. **Authentication:** Secure sessions, API key rotation
-4. **Data Protection:** HTTPS, encryption, backups
-5. **Security Audit:** Penetration testing, GDPR compliance
+**Phase:** Beta 0 (essential), V1.0 (complete) **BLOCKER**

-**Milestone:** Beta 0 (essential), V1.0 (complete) **BLOCKER**
+**Purpose:** Protect FactHarbor systems, user data, and operations from security threats, ensuring production-grade security posture.

+**Specification:**
+
+==== API Security ====
+
+**Rate Limiting:**
+* **Analysis endpoints:** 100 requests/hour per IP
+* **Read endpoints:** 1,000 requests/hour per IP
+* **Search:** 500 requests/hour per IP
+* **Authenticated users:** 5x higher limits
+* **Burst protection:** Max 10 requests/second
+
+**Authentication & Authorization:**
+* **API Keys:** Required for programmatic access
+* **JWT tokens:** For user sessions (1-hour expiry)
+* **OAuth2:** For third-party integrations
+* **Role-Based Access Control (RBAC):**
+  * Public: Read-only access to published claims
+  * Contributor: Submit claims, provide evidence
+  * Moderator: Review contributions, manage quality
+  * Admin: System configuration, user management
+
+**CORS Policies:**
+* Whitelist approved domains only
+* No wildcard origins in production
+* Credentials required for sensitive endpoints
+
+**Input Sanitization:**
+* Validate all user input against schemas
+* Sanitize HTML/JavaScript in text submissions
+* Prevent SQL injection (use parameterized queries)
+* Prevent command injection (no shell execution of user input)
+* Max request size: 10MB
+* File upload restrictions: Whitelist file types, scan for malware
+
+---
+
+==== Data Security ====
+
+**Encryption at Rest:**
+* Database encryption using AES-256
+* Encrypted backups
+* Key management via cloud provider KMS (AWS KMS, Google Cloud KMS)
+* Regular key rotation (90-day cycle)
+
+**Encryption in Transit:**
+* HTTPS/TLS 1.3 only (no TLS 1.0/1.1)
+* Strong cipher suites only
+* HSTS (HTTP Strict Transport Security) enabled
+* Certificate pinning for mobile apps
+
+**Secure Credential Storage:**
+* Passwords hashed with bcrypt (cost factor 12+)
+* API keys encrypted in database
+* Secrets stored in environment variables (never in code)
+* Use secrets manager (AWS Secrets Manager, HashiCorp Vault)
+
+**Data Privacy:**
+* Minimal data collection (privacy by design)
+* User data deletion on request (GDPR compliance)
+* PII encryption in database
+* Anonymize logs (no PII in log files)
+
+---
+
+==== Application Security ====
+
+**OWASP Top 10 Compliance:**
+
+1. **Broken Access Control:** RBAC implementation, path traversal prevention
+2. **Cryptographic Failures:** Strong encryption, secure key management
+3. **Injection:** Parameterized queries, input validation
+4. **Insecure Design:** Security review of all features
+5. **Security Misconfiguration:** Hardened defaults, security headers
+6. **Vulnerable Components:** Dependency scanning (see below)
+7. **Authentication Failures:** Strong password policy, MFA support
+8. **Data Integrity Failures:** Signature verification, checksums
+9. **Security Logging Failures:** Comprehensive audit logs
+10. **Server-Side Request Forgery:** URL validation, whitelist domains
+
+**Security Headers:**
+* `Content-Security-Policy`: Strict CSP to prevent XSS
+* `X-Frame-Options`: DENY (prevent clickjacking)
+* `X-Content-Type-Options`: nosniff
+* `Referrer-Policy`: strict-origin-when-cross-origin
+* `Permissions-Policy`: Restrict browser features
+
+**Dependency Vulnerability Scanning:**
+* **Tools:** Snyk, Dependabot, npm audit, pip-audit
+* **Frequency:** Daily automated scans
+* **Action:** Patch critical vulnerabilities within 24 hours
+* **Policy:** No known high/critical CVEs in production
+
+**Security Audits:**
+* **Internal:** Quarterly security reviews
+* **External:** Annual penetration testing by certified firm
+* **Bug Bounty:** Public bug bounty program (V1.1+)
+* **Compliance:** SOC 2 Type II certification target (V1.5)
+
+---
+
+==== Operational Security ====
+
+**DDoS Protection:**
+* CloudFlare or AWS Shield
+* Rate limiting at CDN layer
+* Automatic IP blocking for abuse patterns
+
+**Monitoring & Alerting:**
+* Real-time security event monitoring
+* Alerts for:
+  * Failed login attempts (>5 in 10 minutes)
+  * API abuse patterns
+  * Unusual data access patterns
+  * Security scan detections
+* Integration with SIEM (Security Information and Event Management)
+
+**Incident Response:**
+* Documented incident response plan
+* Security incident classification (P1-P4)
+* On-call rotation for security issues
+* Post-mortem for all security incidents
+* Public disclosure policy (coordinated disclosure)
+
+**Backup & Recovery:**
+* Daily encrypted backups
+* 30-day retention period
+* Tested recovery procedures (quarterly)
+* Disaster recovery plan (RTO: 4 hours, RPO: 1 hour)
+
+---
+
+==== Compliance & Standards ====
+
+**GDPR Compliance:**
+* User consent management
+* Right to access data
+* Right to deletion
+* Data portability
+* Privacy policy published
+
+**Accessibility:**
+* WCAG 2.1 AA compliance
+* Screen reader compatibility
+* Keyboard navigation
+* Alt text for images
+
+**Browser Support:**
+* Modern browsers only (Chrome/Edge/Firefox/Safari latest 2 versions)
+* No IE11 support
+
+**Acceptance Criteria:**
+
+* ✅ Passes OWASP ZAP security scan (no high/critical findings)
+* ✅ All dependencies with known vulnerabilities patched
+* ✅ Penetration test completed with no critical findings
+* ✅ Rate limiting blocks abuse attempts
+* ✅ Encryption at rest and in transit verified
+* ✅ Security headers scored A+ on securityheaders.com
+* ✅ Incident response plan documented and tested
+* ✅ 95% uptime over 30-day period
+
+
 === NFR13: Quality Metrics Transparency ===

-**Fulfills:** IFCN transparency, user trust
+**Fulfills:** User trust, transparency, continuous improvement, IFCN methodology transparency

-**Public Metrics:**
-* Quality gates performance
-* Evidence quality stats
-* Hallucination rate
-* User feedback
+**Phase:** POC2 (internal), Beta 0 (public), V1.0 (real-time)

-**Milestone:** POC2 (internal), Beta 0 (public), V1.0 (real-time)
+**Purpose:** Provide transparent, measurable quality metrics that demonstrate AKEL's performance and build user trust in automated fact-checking.

-== 10. Requirements Priority Matrix ==
+**Specification:**

-This table shows all functional and non-functional requirements ordered by urgency and priority.
+==== Component: Public Quality Dashboard ====

-**Note:** Implementation phases (POC1, POC2, Beta 0, V1.0) are defined in [[POC Requirements>>FactHarbor.Specification.POC.Requirements]] and [[Implementation Roadmap>>FactHarbor.Implementation-Roadmap.WebHome]], not in this priority matrix.
+**Core Metrics to Display:**

-**Priority Levels:**
-* **CRITICAL** - System doesn't work without it, or major safety/legal risk
-* **HIGH** - Core functionality, essential for success
-* **MEDIUM** - Important but not blocking
-* **LOW** - Nice to have, can be deferred
+**1. Verdict Quality Metrics**

-**Urgency Levels:**
-* **HIGH** - Immediate need (critical for proof of concept)
-* **MEDIUM** - Important but not immediate
-* **LOW** - Future enhancement
+**TIGERScore (Fact-Checking Quality):**
+* **Definition:** Measures how well generated verdicts match expert fact-checker judgments
+* **Scale:** 0-100 (higher is better)
+* **Calculation:** Using TIGERScore framework (Truth-conditional accuracy, Informativeness, Generality, Evaluativeness, Relevance)
+* **Target:** Average ≥80 for production release
+* **Display:**
+{{code}}
+Verdict Quality (TIGERScore):
+Overall: 84.2 ▲ (+2.1 from last month)

-|= ID |= Title |= Priority |= Urgency
-| **HIGH URGENCY** |||
-| **FR1** | Claim Intake | CRITICAL | HIGH
-| **FR5** | Evidence Collection | CRITICAL | HIGH
-| **FR7** | Verdict Computation | CRITICAL | HIGH
-| **NFR11** | Quality Assurance Framework | CRITICAL | HIGH
-| **FR2** | Claim Normalization | HIGH | HIGH
-| **FR3** | Claim Classification | HIGH | HIGH
-| **FR4** | Scenario Generation | HIGH | HIGH
-| **FR6** | Evidence Evaluation | HIGH | HIGH
-| **MEDIUM URGENCY** |||
-| **NFR12** | Security Controls | CRITICAL | MEDIUM
-| **FR9** | Corrections | HIGH | MEDIUM
-| **FR44** | ClaimReview Schema | HIGH | MEDIUM
-| **FR45** | Corrections Notification | HIGH | MEDIUM
-| **FR48** | Safety Framework | HIGH | MEDIUM
-| **NFR3** | Transparency | HIGH | MEDIUM
-| **NFR13** | Quality Metrics | HIGH | MEDIUM
-| **FR8** | User Contribution | MEDIUM | MEDIUM
-| **FR10** | Publishing | MEDIUM | MEDIUM
-| **FR13** | API | MEDIUM | MEDIUM
-| **FR46** | Image Verification | MEDIUM | MEDIUM
-| **FR47** | Archive.org Integration | MEDIUM | MEDIUM
-| **NFR1** | Performance | MEDIUM | MEDIUM
-| **NFR2** | Scalability | MEDIUM | MEDIUM
-| **NFR4** | Security & Privacy | MEDIUM | MEDIUM
-| **NFR5** | Maintainability | MEDIUM | MEDIUM
-| **LOW URGENCY** |||
-| **FR11** | Social Sharing | LOW | LOW
-| **FR12** | Notifications | LOW | LOW
-| **FR49** | A/B Testing | LOW | LOW
-| **FR50** | OSINT Toolkit Integration | LOW | LOW
-| **FR51** | Video Verification System | LOW | LOW
-| **FR52** | Interactive Detection Training | LOW | LOW
-| **FR53** | Cross-Organizational Sharing | LOW | LOW
+Distribution:
+  Excellent (>80): 67%
+  Good (60-80): 28%
+  Needs Improvement (<60): 5%

-**Total:** 31 requirements (23 Functional, 8 Non-Functional)
+Trend: [Graph showing improvement over time]
+{{/code}}

-**See also:**
-* [[POC Requirements>>FactHarbor.Specification.POC.Requirements]] - POC1 scope and simplifications
-* [[Implementation Roadmap>>FactHarbor.Implementation-Roadmap.WebHome]] - Phase-by-phase implementation plan
-* [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] - Foundation that drives these requirements
+**2. Hallucination & Faithfulness Metrics**

-=== 10.1 User Needs Priority ===
+**AlignScore (Faithfulness to Evidence):**
+* **Definition:** Measures how well verdicts align with actual evidence content
+* **Scale:** 0-1 (higher is better)
+* **Purpose:** Detect AI hallucinations (making claims not supported by evidence)
+* **Target:** Average ≥0.85, hallucination rate <5%
+* **Display:**
+{{code}}
+Evidence Faithfulness (AlignScore):
+Average: 0.87 ▼ (-0.02 from last month)

-User Needs (UN) are the foundation that drives functional and non-functional requirements. They are not independently prioritized; instead, their priority is inherited from the FR/NFR requirements they drive.
+Hallucination Rate: 4.2%
+  - Claims without evidence support: 3.1%
+  - Misrepresented evidence: 1.1%

-|= ID |= Title |= Drives Requirements
-| **UN-1** | Trust Assessment at a Glance | Multiple FR/NFR
-| **UN-2** | Claim Extraction and Verification | FR1-7
-| **UN-3** | Article Summary with FactHarbor Analysis Summary | FR4
-| **UN-4** | Social Media Fact-Checking | FR1, FR4
-| **UN-5** | Source Provenance and Track Records | FR6
-| **UN-6** | Publisher Reliability History | FR6
-| **UN-7** | Evidence Transparency | NFR3
-| **UN-8** | Understanding Disagreement and Consensus | FR4
-| **UN-9** | Methodology Transparency | NFR3, NFR11
-| **UN-10** | Manipulation Tactics Detection | FR48
-| **UN-11** | Filtered Research | FR3
-| **UN-12** | Submit Unchecked Claims | FR8
-| **UN-13** | Cite FactHarbor Verdicts | FR10
-| **UN-14** | API Access for Integration | FR13
-| **UN-15** | Verdict Evolution Timeline | FR7
-| **UN-16** | AI vs. Human Review Status | FR9
-| **UN-17** | In-Article Claim Highlighting | FR1
-| **UN-26** | Search Engine Visibility | FR44
-| **UN-27** | Visual Claim Verification | FR46
-| **UN-28** | Safe Contribution Environment | FR48
+Action: Prompt engineering review scheduled
+{{/code}}

-**Total:** 20 User Needs
+**3. Evidence Quality Metrics**

-**Note:** Each User Need inherits priority from the requirements it drives. For example, UN-2 (Claim Extraction and Verification) drives FR1-7, which are CRITICAL/HIGH priority, therefore UN-2 is also critical to the project.
+**Source Reliability:**
+* Average source quality score (0-1 scale)
+* Distribution of high/medium/low quality sources
+* Publisher track record trends

-== 11. MVP Scope ==
+**Evidence Coverage:**
+* Average number of sources per claim
+* Percentage of claims with ≥2 sources (EFCSN minimum)
+* Geographic diversity of sources

-**Phase 1 (Months 1-3): Read-Only MVP**
+**Display:**
+{{code}}
+Evidence Quality:

-Build:
-* Automated claim analysis
-* Confidence scoring
-* Source evaluation
-* Browse/search interface
-* User flagging system
+Average Sources per Claim: 4.2
+Claims with ≥2 sources: 94% (EFCSN compliant)

-**Goal**: Prove AI quality before adding user editing
+Source Quality Distribution:
+  High quality (>0.8): 48%
+  Medium quality (0.5-0.8): 43%
+  Low quality (<0.5): 9%

-**User Needs fulfilled in Phase 1**: UN-1, UN-2, UN-3, UN-4, UN-5, UN-6, UN-7, UN-8, UN-9, UN-12
+Geographic Diversity: 23 countries represented
+{{/code}}

-**Phase 2 (Months 4-6): User Contributions**
+**4. Contributor Consensus Metrics** (when human reviewers involved)

-Add only if needed:
-* Simple editing (Wikipedia-style)
-* Reputation system
-* Basic moderation
-* In-article claim highlighting (FR13)
+**Inter-Rater Reliability (IRR):**
+* **Calculation:** Cohen's Kappa or Fleiss' Kappa for multiple raters
+* **Scale:** 0-1 (higher is better)
+* **Interpretation:**
+  * >0.8: Almost perfect agreement
+  * 0.6-0.8: Substantial agreement
+  * 0.4-0.6: Moderate agreement
+  * <0.4: Poor agreement
+* **Target:** Maintain ≥0.7 (substantial agreement)

-**Additional User Needs fulfilled**: UN-13, UN-17
+**Display:**
+{{code}}
+Contributor Consensus:

-**Phase 3 (Months 7-12): Refinement**
+Inter-Rater Reliability (IRR): 0.73 (Substantial agreement)
+  - Verdict agreement: 78%
+  - Evidence quality agreement: 71%
+  - Scenario structure agreement: 69%

-* Continuous quality improvement
-* Feature additions based on real usage
-* Scale infrastructure
+Cases requiring moderator review: 12
+Moderator override rate: 8%
+{{/code}}

-**Additional User Needs fulfilled**: UN-14 (API access), UN-15 (Full evolution tracking)
+---

-**Deferred**:
-* Federation (until multiple successful instances exist)
-* Complex contribution workflows (focus on automation)
-* Extensive role hierarchy (keep simple)
+==== Quality Dashboard Implementation ====

-== 12. Success Metrics ==
+**Dashboard Location:** `/quality-metrics`

-**System Quality** (track weekly):
-* Error rate by category (target: -10%/month)
-* Average confidence score (target: increase)
-* Source quality distribution (target: more high-quality)
-* Contradiction detection rate (target: increase)
+**Update Frequency:**
+* **POC2:** Weekly manual updates
+* **Beta 0:** Daily automated updates
+* **V1.0:** Real-time metrics (updated hourly)

-**Efficiency** (track monthly):
-* Claims processed per hour (target: increase)
-* Human hours per claim (target: decrease)
-* Automation coverage (target: >90%)
-* Re-work rate (target: <5%)
+**Dashboard Sections:**

-**User Satisfaction** (track quarterly):
-* User flag rate (issues found)
-* Correction acceptance rate (flags valid)
-* Return user rate
-* Trust indicators (surveys)
+1. **Overview:** Key metrics at a glance
+2. **Verdict Quality:** TIGERScore trends and distributions
+3. **Evidence Analysis:** Source quality and coverage
+4. **AI Performance:** Hallucination rates, AlignScore
+5. **Human Oversight:** Contributor consensus, review rates
+6. **System Health:** Processing times, error rates, uptime

-**User Needs Metrics** (track quarterly):
-* UN-1: % users who understand trust scores
-* UN-4: Time to verify social media claim (target: <30s)
-* UN-7: % users who access evidence details
-* UN-8: % users who view multiple scenarios
-* UN-15: % users who check evolution timeline
-* UN-17: % users who enable in-article highlighting; avg. time spent on highlighted vs. non-highlighted articles
+**Example Dashboard Layout:**

+{{code}}
+┌─────────────────────────────────────────────────────────────┐
+│ FactHarbor Quality Metrics                    Last updated: │
+│ Public Dashboard                              2 hours ago   │
+└─────────────────────────────────────────────────────────────┘
+
+📊 KEY METRICS
+─────────────────────────────────────────────────────────────
+TIGERScore (Verdict Quality):  84.2  ▲ (+2.1)
+AlignScore (Faithfulness):     0.87  ▼ (-0.02)
+Hallucination Rate:            4.2%  ✓ (Target: <5%)
+Average Sources per Claim:     4.2   ▲ (+0.3)
+
+📈 TRENDS (30 days)
+─────────────────────────────────────────────────────────────
+[Graph: TIGERScore trending upward]
+[Graph: Hallucination rate declining]
+[Graph: Evidence quality stable]
+
+⚠️ IMPROVEMENT TARGETS
+─────────────────────────────────────────────────────────────
+1. Reduce hallucination rate to <3% (Current: 4.2%)
+2. Increase TIGERScore average to >85 (Current: 84.2)
+3. Maintain IRR >0.75 (Current: 0.73)
+
+📄 DETAILED REPORTS
+─────────────────────────────────────────────────────────────
+• Monthly Quality Report (PDF)
+• Methodology Documentation
+• AKEL Performance Analysis
+• Contributor Agreement Analysis
+
+{{/code}}
+
+---
+
+==== Continuous Improvement Feedback Loop ====
+
+**How Metrics Inform AKEL Improvements:**
+
+1. **Identify Weak Areas:**
+  * Low TIGERScore → Review prompt engineering
+  * High hallucination → Strengthen evidence grounding
+  * Low IRR → Clarify evaluation criteria
+
+2. **A/B Testing Integration:**
+  * Test prompt variations
+  * Measure impact on quality metrics
+  * Deploy winners automatically
+
+3. **Alert Thresholds:**
+  * TIGERScore drops below 75 → Alert team
+  * Hallucination rate exceeds 7% → Pause auto-publishing
+  * IRR below 0.6 → Moderator training needed
+
+4. **Monthly Quality Reviews:**
+  * Analyze trends
+  * Identify systematic issues
+  * Plan prompt improvements
+  * Update AKEL models
+
+---
+
+==== Metric Calculation Details ====
+
+**TIGERScore Implementation:**
+* Reference: https://github.com/TIGER-AI-Lab/TIGERScore
+* Input: Generated verdict + reference verdict (from expert)
+* Output: 0-100 score across 5 dimensions
+* Requires: Test set of expert-reviewed claims (minimum 100)
+
+**AlignScore Implementation:**
+* Reference: https://github.com/yuh-zha/AlignScore
+* Input: Generated verdict + source evidence text
+* Output: 0-1 faithfulness score
+* Calculation: Semantic alignment between claim and evidence
+
+**Source Quality Scoring:**
+* Use existing source reliability database (e.g., NewsGuard, MBFC)
+* Factor in: Publication history, corrections record, transparency
+* Scale: 0-1 (weighted average across sources)
+
+---
+
+==== Integration Points ====
+
+* **NFR11: AKEL Quality Assurance** - Metrics validate quality gate effectiveness
+* **FR49: A/B Testing** - Metrics measure test success
+* **FR11: Audit Trail** - Source of quality data
+* **NFR3: Transparency** - Public metrics build trust
+
+**Acceptance Criteria:**
+
+* ✅ All core metrics implemented and calculating correctly
+* ✅ Dashboard updates daily (Beta 0) or hourly (V1.0)
+* ✅ Alerts trigger when metrics degrade beyond thresholds
+* ✅ Monthly quality report auto-generates
+* ✅ Dashboard is publicly accessible (no login required)
+* ✅ Mobile-responsive dashboard design
+* ✅ Metrics inform quarterly AKEL improvement planning
+
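Editorial sketch (not part of the recorded page change): the NFR13 alert thresholds added in this revision (TIGERScore below 75, hallucination rate above 7%, IRR below 0.6) could be checked roughly as follows. This is a minimal Python illustration under stated assumptions. The names `QualitySnapshot` and `evaluate_alerts` are hypothetical and do not appear in the FactHarbor specification; only the threshold values and the action texts are taken from it.

```python
from dataclasses import dataclass
from typing import List

# Threshold values copied from the NFR13 "Alert Thresholds" list.
TIGERSCORE_ALERT_BELOW = 75.0     # TIGERScore on the 0-100 scale
HALLUCINATION_PAUSE_ABOVE = 0.07  # 7%, expressed as a fraction
IRR_TRAINING_BELOW = 0.6          # inter-rater reliability on the 0-1 scale


@dataclass(frozen=True)
class QualitySnapshot:
    """Hypothetical container for one reading of the dashboard metrics."""
    tigerscore: float
    hallucination_rate: float
    irr: float


def evaluate_alerts(snapshot: QualitySnapshot) -> List[str]:
    """Return the actions the specification attaches to each breached threshold."""
    actions: List[str] = []
    if snapshot.tigerscore < TIGERSCORE_ALERT_BELOW:
        actions.append("Alert team")
    if snapshot.hallucination_rate > HALLUCINATION_PAUSE_ABOVE:
        actions.append("Pause auto-publishing")
    if snapshot.irr < IRR_TRAINING_BELOW:
        actions.append("Moderator training needed")
    return actions
```

With the example dashboard values from this revision (TIGERScore 84.2, hallucination rate 4.2%, IRR 0.73), `evaluate_alerts` returns an empty list. The comparisons are strict because the specification says "drops below" and "exceeds", so a reading exactly on a threshold triggers nothing in this sketch.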
+
+
+
 == 13. Requirements Traceability ==

 For full traceability matrix showing which requirements fulfill which user needs, see:
@@ -1014,62 +1014,599 @@

 === FR44: ClaimReview Schema Implementation ===

-Generate valid ClaimReview structured data for Google/Bing visibility.
+**Fulfills:** UN-13 (Cite FactHarbor Verdicts), UN-14 (API Access for Integration), UN-26 (Search Engine Visibility)

-**Schema.org Mapping:**
+**Phase:** V1.0
+
+**Purpose:** Generate valid ClaimReview structured data for every published analysis to enable Google/Bing search visibility and fact-check discovery.
+
+**Specification:**
+
+==== Component: Schema.org Markup Generator ====
+
+FactHarbor must generate valid ClaimReview structured data following Schema.org specifications for every published claim analysis.
+
+**Required JSON-LD Schema:**
+
+{{code language="json"}}
+{
+  "@context": "https://schema.org",
+  "@type": "ClaimReview",
+  "datePublished": "YYYY-MM-DD",
+  "url": "https://factharbor.org/claims/{claim_id}",
+  "claimReviewed": "The exact claim text",
+  "author": {
+    "@type": "Organization",
+    "name": "FactHarbor",
+    "url": "https://factharbor.org"
+  },
+  "reviewRating": {
+    "@type": "Rating",
+    "ratingValue": "1-5",
+    "bestRating": "5",
+    "worstRating": "1",
+    "alternateName": "FactHarbor likelihood score"
+  },
+  "itemReviewed": {
+    "@type": "Claim",
+    "author": {
+      "@type": "Person",
+      "name": "Claim author if known"
+    },
+    "datePublished": "YYYY-MM-DD if known",
+    "appearance": {
+      "@type": "CreativeWork",
+      "url": "Original claim URL if from article"
+    }
+  }
+}
+{{/code}}
+
+**FactHarbor-Specific Mapping:**
+
+**Likelihood Score to Rating Scale:**
 * 80-100% likelihood → 5 (Highly Supported)
-* 60-79% → 4 (Supported)
-* 40-59% → 3 (Mixed)
-* 20-39% → 2 (Questionable)
-* 0-19% → 1 (Refuted)
+* 60-79% likelihood → 4 (Supported)
+* 40-59% likelihood → 3 (Mixed/Uncertain)
+* 20-39% likelihood → 2 (Questionable)
+* 0-19% likelihood → 1 (Refuted)

-**Milestone:** V1.0
+**Multiple Scenarios Handling:**
+* If claim has multiple scenarios with different verdicts, generate **separate ClaimReview** for each scenario
+* Add `disambiguatingDescription` field explaining scenario context
+* Example: "Scenario: If interpreted as referring to 2023 data..."

+==== Implementation Requirements ====
+
+1. **Auto-generate** on claim publication
+2. **Embed** in HTML `<head>` section as JSON-LD script
+3. **Validate** against Schema.org validator before publishing
+4. **Submit** to Google Search Console for indexing
+5. **Update** automatically when verdict changes (integrate with FR8: Time Evolution)
+
+==== Integration Points ====
+
+* **FR7: Automated Verdicts** - Source of rating data and claim text
+* **FR8: Time Evolution** - Triggers schema updates when verdicts change
+* **FR11: Audit Trail** - Logs all schema generation and update events
+
+==== Resources ====
+
+* ClaimReview Project: https://www.claimreviewproject.com
+* Schema.org ClaimReview: https://schema.org/ClaimReview
+* Google Fact Check Guidelines: https://developers.google.com/search/docs/appearance/fact-check
+
+**Acceptance Criteria:**
+
+* ✅ Passes Google Structured Data Testing Tool
+* ✅ Appears in Google Fact Check Explorer within 48 hours of publication
+* ✅ Valid JSON-LD syntax (no errors)
+* ✅ All required fields populated with correct data types
+* ✅ Handles multi-scenario claims correctly (separate ClaimReview per scenario)
+
+
 === FR45: User Corrections Notification System ===

-Notify users when analyses are corrected.
+**Fulfills:** IFCN Principle 5 (Open & Honest Corrections), EFCSN compliance

-**Mechanisms:**
-1. In-page banner (30 days)
-2. Public correction log
-3. Email notifications (opt-in)
-4. RSS/API feed
+**Phase:** Beta 0 (basic), V1.0 (complete) **BLOCKER**

-**Milestone:** Beta 0 (basic), V1.0 (complete) **BLOCKER**
+**Purpose:** When any claim analysis is corrected, notify users who previously viewed the claim to maintain transparency and build trust.

+**Specification:**
+
+==== Component: Corrections Visibility Framework ====
+
+**Correction Types:**
+
+1. **Major Correction:** Verdict changes category (e.g., "Supported" → "Refuted")
+2. **Significant Correction:** Likelihood score changes >20%
+3. **Minor Correction:** Evidence additions, source quality updates
+4. **Scenario Addition:** New scenario added to existing claim
+
+==== Notification Mechanisms ====
+
+**1. In-Page Banner:**
+
+Display prominent banner on claim page:
+
+{{code}}
+[!] CORRECTION NOTICE
+This analysis was updated on [DATE]. [View what changed] [Dismiss]
+
+Major changes:
+• Verdict changed from "Likely True (75%)" to "Uncertain (45%)"
+• New contradicting evidence added from [Source]
+• Scenario 2 updated with additional context
+
+[See full correction log]
+{{/code}}
+
+**2. Correction Log Page:**
+
+* Public changelog at `/claims/{id}/corrections`
+* Displays for each correction:
+  * Date/time of correction
+  * What changed (before/after comparison)
+  * Why changed (reason if provided)
+  * Who made change (AKEL auto-update vs. contributor override)
+
+**3. Email Notifications (opt-in):**
+
+* Send to users who bookmarked or shared the claim
+* Subject: "FactHarbor Correction: [Claim title]"
+* Include summary of changes
+* Link to updated analysis
+
+**4. RSS/API Feed:**
+
+* Corrections feed at `/corrections.rss`
+* API endpoint: `GET /api/corrections?since={timestamp}`
+* Enables external monitoring by journalists and researchers
+
+==== Display Rules ====
+
+* Show banner on **ALL pages** displaying the claim (search results, related claims, embeddings)
+* Banner persists for **30 days** after correction
+* **"Corrections" count badge** on claim card
+* **Timestamp** on every verdict: "Last updated: [datetime]"
+
+==== IFCN Compliance Requirements ====
+
+* Corrections policy published at `/corrections-policy`
+* User can report suspected errors via `/report-error/{claim_id}`
+* Link to IFCN complaint process (if FactHarbor becomes signatory)
+* **Scrupulous transparency:** Never silently edit analyses
+
+==== Integration Points ====
+
+* **FR8: Time Evolution** - Triggers corrections when verdicts change
+* **FR11: Audit Trail** - Source of correction data and change history
+* **NFR3: Transparency** - Public correction log demonstrates commitment
+
+**Acceptance Criteria:**
+
+* ✅ Banner appears within 60 seconds of correction
+* ✅ Correction log is permanent and publicly accessible
+* ✅ Email notifications deliver within 5 minutes
+* ✅ RSS feed updates in real-time
+* ✅ Mobile-responsive banner design
+* ✅ Accessible (screen reader compatible)
+
+
 === FR46: Image Verification System ===

-**Methods:**
-1. Reverse image search
-2. EXIF metadata analysis
-3. Manipulation detection (basic)
-4. Context verification
+**Fulfills:** UN-27 (Visual Claim Verification)

-**Milestone:** Beta 0 (basic), V1.0 (extended)
+**Phase:** Beta 0 (basic), V1.0 (extended)

+**Purpose:** Verify authenticity and context of images shared with claims to detect manipulation, misattribution, and out-of-context usage.
+
+**Specification:**
+
+==== Component: Multi-Method Image Verification ====
+
+**Method 1: Reverse Image Search**
+
+**Purpose:** Find earlier uses of the image to verify context
+
+**Implementation:**
+* Integrate APIs:
+  * **Google Vision AI** (reverse search)
+  * **TinEye** (oldest known uses)
+  * **Bing Visual Search** (broad coverage)
+
+**Process:**
+1. Extract image from claim or user upload
+2. Query multiple reverse search services
+3. Analyze results for:
+  * Earliest known publication
+  * Original context (what was it really showing?)
+  * Publication timeline
+  * Geographic spread
+
+**Output:**
+{{code}}
+Reverse Image Search Results:
+
+Earliest known use: 2019-03-15 (5 years before claim)
+Original context: "Photo from 2019 flooding in Mumbai"
+This claim uses it for: "2024 hurricane damage in Florida"
+
+⚠️ Image is OUT OF CONTEXT
+
+Found in 47 other articles:
+• 2019-03-15: Mumbai floods (original)
+• 2020-07-22: Bangladesh monsoon
+• 2024-10-15: Current claim (misattributed)
+
+[View full timeline]
+{{/code}}
+
+---
+
+**Method 2: AI Manipulation Detection**
+
+**Purpose:** Detect deepfakes, face swaps, and digital alterations
+
+**Implementation:**
+* Integrate detection services:
+  * **Sensity AI** (deepfake detection)
+  * **Reality Defender** (multimodal analysis)
+  * **AWS Rekognition** (face detection inconsistencies)
+
+**Detection Categories:**
+1. **Face Manipulation:**
+  * Deepfake face swaps
+  * Expression manipulation
+  * Identity replacement
+
+2. **Image Manipulation:**
+  * Copy-paste artifacts
+  * Clone stamp detection
+  * Content-aware fill detection
+  * JPEG compression inconsistencies
+
+3. **AI Generation:**
+  * Detect fully AI-generated images
+  * Identify generation artifacts
+  * Check for model signatures
+
+**Confidence Scoring:**
+* **HIGH (80-100%):** Strong evidence of manipulation
+* **MEDIUM (50-79%):** Suspicious artifacts detected
+* **LOW (0-49%):** Minor inconsistencies or inconclusive
+
+**Output:**
+{{code}}
+Manipulation Analysis:
+
+Face Manipulation: LOW RISK (12%)
+Image Editing: MEDIUM RISK (64%)
+  • Clone stamp artifacts detected in sky region
+  • JPEG compression inconsistent between objects
+
+AI Generation: LOW RISK (8%)
+
+⚠️ Possible manipulation detected. Manual review recommended.
+{{/code}}
+
+---
+
+**Method 3: Metadata Analysis (EXIF)**
+
+**Purpose:** Extract technical details that may reveal manipulation or misattribution
+
+**Extracted Data:**
+* **Camera/Device:** Make, model, software
+* **Timestamps:** Original date, modification dates
+* **Location:** GPS coordinates (if present)
+* **Editing History:** Software used, edit count
+* **File Properties:** Resolution, compression, format conversions
+
+**Red Flags:**
+* Metadata completely stripped (suspicious)
+* Timestamp conflicts with claimed date
+* GPS location conflicts with claimed location
+* Multiple edit rounds (hiding something?)
1532 +* Creation date after modification date (impossible) 1533 + 1534 +**Output:** 1535 +{{code}} 1536 +Image Metadata: 1537 + 1538 +Camera: iPhone 14 Pro 1539 +Original date: 2023-08-12 14:32:15 1540 +Location: 40.7128°N, 74.0060°W (New York City) 1541 +Modified: 2024-10-15 08:45:22 1542 +Software: Adobe Photoshop 2024 1543 + 1544 +⚠️ Location conflicts with claim 1545 +Claim says: "Taken in Los Angeles" 1546 +EXIF says: New York City 1547 + 1548 +⚠️ Edited 14 months after capture 1549 +{{/code}} 1550 + 1551 +--- 1552 + 1553 +==== Verification Workflow ==== 1554 + 1555 +**Automatic Triggers:** 1556 +1. User submits claim with image 1557 +2. Article being analyzed contains images 1558 +3. Social media post includes photos 1559 + 1560 +**Process:** 1561 +1. Extract images from content 1562 +2. Run all 3 verification methods in parallel 1563 +3. Aggregate results into confidence score 1564 +4. Generate human-readable summary 1565 +5. Display prominently in analysis 1566 + 1567 +**Display Integration:** 1568 + 1569 +Show image verification panel in claim analysis: 1570 + 1571 +{{code}} 1572 +📷 IMAGE VERIFICATION 1573 + 1574 +[Image thumbnail] 1575 + 1576 +✅ Reverse Search: Original context verified 1577 +⚠️ Manipulation: Possible editing detected (64% confidence) 1578 +✅ Metadata: Consistent with claim details 1579 + 1580 +Overall Assessment: CAUTION ADVISED 1581 +This image may have been edited. Original context appears accurate. 
1582 + 1583 +[View detailed analysis] 1584 +{{/code}} 1585 + 1586 +==== Integration Points ==== 1587 + 1588 +* **FR7: Automated Verdicts** - Image verification affects claim credibility 1589 +* **FR4: Analysis Summary** - Image findings included in summary 1590 +* **UN-27: Visual Claim Verification** - Direct fulfillment 1591 + 1592 +==== Cost Considerations ==== 1593 + 1594 +**API Costs (estimated per image):** 1595 +* Google Vision AI: $0.001-0.003 1596 +* TinEye: $0.02 (commercial API) 1597 +* Sensity AI: $0.05-0.10 1598 +* AWS Rekognition: $0.001-0.002 1599 + 1600 +**Total per image:** ~$0.07-0.15 1601 + 1602 +**Mitigation Strategies:** 1603 +* Cache results for duplicate images 1604 +* Use free tier quotas where available 1605 +* Prioritize higher-value claims for deep analysis 1606 +* Offer premium verification as paid tier 1607 + 1608 +**Acceptance Criteria:** 1609 + 1610 +* ✅ Reverse image search finds original sources 1611 +* ✅ Manipulation detection accuracy >80% on test dataset 1612 +* ✅ EXIF extraction works for major image formats (JPEG, PNG, HEIC) 1613 +* ✅ Results display within 10 seconds 1614 +* ✅ Mobile-friendly image comparison interface 1615 +* ✅ False positive rate <15% 1616 + 1617 + 1050 1050 === FR47: Archive.org Integration === 1051 1051 1052 -Auto-save evidence sources to Wayback Machine. 1620 +**Priority:** CRITICAL 1621 +**Fulfills:** Evidence persistence, FR5 (Evidence linking) 1622 +**Phase:** V1.0 1053 1053 1054 -** Milestone:**Beta01624 +**Purpose:** Ensure evidence remains accessible even if original sources are deleted. 1055 1055 1056 - === FR48:Safety Frameworkfor Contributors ===1626 +**Specification:** 1057 1057 1058 - Protect contributorsfromharassment and legal threats.1628 +**Automatic Archiving:** 1059 1059 1060 -**Milestone:** V1.1 1630 +When AKEL links evidence: 1631 +1. Check if URL already archived (Wayback Machine API) 1632 +2. If not, submit for archiving (Save Page Now API) 1633 +3. 
Store both original URL and archive URL 1634 +4. Display both to users 1061 1061 1062 - === FR49:A/B Testing Framework===1636 +**Archive Display:** 1063 1063 1064 -Test AKEL approaches and UI designs systematically. 1638 +{{code}} 1639 +Evidence Source: [Original URL] 1640 +Archived: [Archive.org URL] (Captured: [date]) 1065 1065 1066 -**Milestone:** V1.0 1642 +[View Original] [View Archive] 1643 +{{/code}} 1067 1067 1068 - ===FR50: OSINT ToolkitIntegration ===1645 +**Fallback Logic:** 1069 1069 1647 +* If original URL unavailable → Auto-redirect to archive 1648 +* If archive unavailable → Display warning 1649 +* If both unavailable → Flag for manual review 1070 1070 1651 +**API Integration:** 1071 1071 1072 -**Priority:** HIGH (V1.1) 1653 +* Use Wayback Machine Availability API 1654 +* Use Save Page Now API (SPNv2) 1655 +* Rate limiting: 15 requests/minute (Wayback limit) 1656 + 1657 +**Acceptance Criteria:** 1658 + 1659 +* ✅ All evidence URLs auto-archived 1660 +* ✅ Archive links displayed to users 1661 +* ✅ Fallback to archive if original unavailable 1662 +* ✅ API rate limits respected 1663 +* ✅ Archive status visible in evidence display 1664 + 1665 + 1666 +== Category 4: Community Safety ===== FR48: Contributor Safety Framework === 1667 + 1668 +**Priority:** CRITICAL 1669 +**Fulfills:** UN-28 (Safe contribution environment) 1670 +**Phase:** V1.0 1671 + 1672 +**Purpose:** Protect contributors from harassment, doxxing, and coordinated attacks. 1673 + 1674 +**Specification:** 1675 + 1676 +**1. Privacy Protection:** 1677 + 1678 +* **Optional Pseudonymity:** Contributors can use pseudonyms 1679 +* **Email Privacy:** Emails never displayed publicly 1680 +* **Profile Privacy:** Contributors control what's public 1681 +* **IP Logging:** Only for abuse prevention, not public 1682 + 1683 +**2. 
Harassment Prevention:** 1684 + 1685 +* **Automated Toxicity Detection:** Flag abusive comments 1686 +* **Personal Information Detection:** Auto-block doxxing attempts 1687 +* **Coordinated Attack Detection:** Identify brigading patterns 1688 +* **Rapid Response:** Moderator alerts for harassment 1689 + 1690 +**3. Safety Features:** 1691 + 1692 +* **Block Users:** Contributors can block harassers 1693 +* **Private Contributions:** Option to contribute anonymously 1694 +* **Report Harassment:** One-click harassment reporting 1695 +* **Safety Resources:** Links to support resources 1696 + 1697 +**4. Moderator Tools:** 1698 + 1699 +* **Quick Ban:** Immediately block abusers 1700 +* **Pattern Detection:** Identify coordinated attacks 1701 +* **Appeal Process:** Fair review of moderation actions 1702 +* **Escalation:** Serious threats escalated to authorities 1703 + 1704 +**5. Trusted Contributor Protection:** 1705 + 1706 +* **Enhanced Privacy:** Additional protection for high-profile contributors 1707 +* **Verification:** Optional identity verification (not public) 1708 +* **Legal Support:** Resources for contributors facing legal threats 1709 + 1710 +**Acceptance Criteria:** 1711 + 1712 +* ✅ Pseudonyms supported 1713 +* ✅ Toxicity detection active 1714 +* ✅ Doxxing auto-blocked 1715 +* ✅ Harassment reporting functional 1716 +* ✅ Moderator tools implemented 1717 +* ✅ Safety policy published 1718 + 1719 + 1720 +== Category 5: Continuous Improvement ===== FR49: A/B Testing Framework === 1721 + 1722 +**Priority:** CRITICAL 1723 +**Fulfills:** Continuous system improvement 1724 +**Phase:** V1.0 1725 + 1726 +**Purpose:** Test and measure improvements to AKEL prompts, algorithms, and workflows. 1727 + 1728 +**Specification:** 1729 + 1730 +**Test Capabilities:** 1731 + 1732 +1. **Prompt Variations:** 1733 + * Test different claim extraction prompts 1734 + * Test different verdict generation prompts 1735 + * Measure: Accuracy, clarity, completeness 1736 + 1737 +2. 
**Algorithm Variations:** 1738 + * Test different source scoring algorithms 1739 + * Test different confidence calculations 1740 + * Measure: Audit accuracy, user satisfaction 1741 + 1742 +3. **Workflow Variations:** 1743 + * Test different quality gate thresholds 1744 + * Test different risk tier assignments 1745 + * Measure: Publication rate, quality scores 1746 + 1747 +**Implementation:** 1748 + 1749 +* **Traffic Split:** 50/50 or 90/10 splits 1750 +* **Randomization:** Consistent per claim (not per user) 1751 +* **Metrics Collection:** Automatic for all variants 1752 +* **Statistical Significance:** Minimum sample size calculation 1753 +* **Rollout:** Winner promoted to 100% traffic 1754 + 1755 +**A/B Test Workflow:** 1756 + 1757 +{{code}} 1758 +1. Hypothesis: "New prompt improves claim extraction" 1759 +2. Design test: Control vs. Variant 1760 +3. Define metrics: Extraction accuracy, completeness 1761 +4. Run test: 7-14 days, minimum 100 claims each 1762 +5. Analyze results: Statistical significance? 1763 +6. Decision: Deploy winner or iterate 1764 +{{/code}} 1765 + 1766 +**Acceptance Criteria:** 1767 + 1768 +* ✅ A/B testing framework implemented 1769 +* ✅ Can test prompt variations 1770 +* ✅ Can test algorithm variations 1771 +* ✅ Metrics automatically collected 1772 +* ✅ Statistical significance calculated 1773 +* ✅ Results inform system improvements 1774 + 1775 + 1776 +=== FR54: Evidence Deduplication === 1777 + 1778 +**Priority:** CRITICAL (POC2/Beta) 1779 +**Fulfills:** Accurate evidence counting, quality metrics 1780 +**Phase:** POC2, Beta 0, V1.0 1781 + 1782 +**Purpose:** Avoid counting the same source multiple times when it appears in different forms. 1783 + 1784 +**Specification:** 1785 + 1786 +**Deduplication Logic:** 1787 + 1788 +1. **URL Normalization:** 1789 + * Remove tracking parameters (?utm_source=...) 1790 + * Normalize http/https 1791 + * Normalize www/non-www 1792 + * Handle redirects 1793 + 1794 +2. 
**Content Similarity:** 1795 + * If two sources have >90% text similarity → Same source 1796 + * If one is subset of other → Same source 1797 + * Use fuzzy matching for minor differences 1798 + 1799 +3. **Cross-Domain Syndication:** 1800 + * Detect wire service content (AP, Reuters) 1801 + * Mark as single source if syndicated 1802 + * Count original publication only 1803 + 1804 +**Display:** 1805 + 1806 +{{code}} 1807 +Evidence Sources (3 unique, 5 total): 1808 + 1809 +1. Original Article (NYTimes) 1810 + - Also appeared in: WashPost, Guardian (syndicated) 1811 + 1812 +2. Research Paper (Nature) 1813 + 1814 +3. Official Statement (WHO) 1815 +{{/code}} 1816 + 1817 +**Acceptance Criteria:** 1818 + 1819 +* ✅ URL normalization works 1820 +* ✅ Content similarity detected 1821 +* ✅ Syndicated content identified 1822 +* ✅ Unique vs. total counts accurate 1823 +* ✅ Improves evidence quality metrics 1824 + 1825 + 1826 +== Additional Requirements (Lower Priority) ===== FR50: OSINT Toolkit Integration === 1827 + 1828 + 1829 + 1073 1073 **Fulfills:** Advanced media verification 1074 1074 **Phase:** V1.1 1075 1075 ... ... @@ -1084,7 +1084,6 @@ 1084 1084 1085 1085 1086 1086 1087 -**Priority:** HIGH (V1.1) 1088 1088 **Fulfills:** UN-27 (Visual claims), advanced media verification 1089 1089 **Phase:** V1.1 1090 1090 ... ... @@ -1101,7 +1101,6 @@ 1101 1101 1102 1102 1103 1103 1104 -**Priority:** MEDIUM (V1.5) 1105 1105 **Fulfills:** Media literacy education 1106 1106 **Phase:** V1.5 1107 1107 ... ... @@ -1117,7 +1117,6 @@ 1117 1117 1118 1118 1119 1119 1120 -**Priority:** MEDIUM (V1.5) 1121 1121 **Fulfills:** Collaboration with other fact-checkers 1122 1122 **Phase:** V1.5 1123 1123 ... ... @@ -1159,7 +1159,6 @@ 1159 1159 1160 1160 1161 1161 1162 -**Priority:** CRITICAL (POC2/Beta) 1163 1163 **Fulfills:** Accurate evidence counting, quality metrics 1164 1164 **Phase:** POC2, Beta 0, V1.0 1165 1165