Changes for page Requirements

Last modified by Robert Schaub on 2025/12/23 11:03

From version 1.1
edited by Robert Schaub
on 2025/12/22 19:12
Change comment: Imported from XAR
To version 6.1
edited by Robert Schaub
on 2025/12/23 08:03
Change comment: Imported from XAR

Summary

Details

Page properties
Content
... ... @@ -306,7 +306,7 @@
306 306  4. How common is this pattern?
307 307  5. Store in ErrorPattern table (improvement queue)
308 308  
309 -=== 6.2 Weekly Improvement Cycle ===
309 +=== 6.2 Continuous Improvement Cycle ===
310 310  
311 311  1. **Review**: Analyze top error patterns
312 312  2. **Develop**: Create fix (prompt, model, validation)
... ... @@ -326,7 +326,7 @@
326 326  * Re-work rate
327 327  * Claims processed per hour
328 328  
329 -**Goal**: 10% monthly improvement in error rate
329 +**Goal**: continuous improvement in error rate
330 330  
331 331  == 7. Automated Quality Monitoring ==
332 332  
... ... @@ -803,185 +803,405 @@
803 803  
804 804  === NFR12: Security Controls ===
805 805  
806 -**Fulfills:** Production readiness, legal compliance
806 +**Fulfills:** Data protection, system integrity, user privacy, production readiness
807 807  
808 -**Requirements:**
809 -1. **Input Validation:** SQL injection, XSS, CSRF prevention
810 -2. **Rate Limiting:** 5 analyses per minute per IP
811 -3. **Authentication:** Secure sessions, API key rotation
812 -4. **Data Protection:** HTTPS, encryption, backups
813 -5. **Security Audit:** Penetration testing, GDPR compliance
808 +**Phase:** Beta 0 (essential), V1.0 (complete) **BLOCKER**
814 814  
815 -**Milestone:** Beta 0 (essential), V1.0 (complete) **BLOCKER**
810 +**Purpose:** Protect FactHarbor systems, user data, and operations from security threats, ensuring production-grade security posture.
816 816  
812 +**Specification:**
813 +
814 +==== API Security ====
815 +
816 +**Rate Limiting:**
817 +* **Analysis endpoints:** 100 requests/hour per IP
818 +* **Read endpoints:** 1,000 requests/hour per IP
819 +* **Search:** 500 requests/hour per IP
820 +* **Authenticated users:** 5x higher limits
821 +* **Burst protection:** Max 10 requests/second
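
The hourly per-IP limits above could be enforced with a sliding-window counter. The following is a minimal sketch, not a prescribed design: the class name, the endpoint keys, and the in-memory storage are assumptions for illustration, and burst protection (10 requests/second) is left out for brevity.

```python
import time
from collections import defaultdict, deque

# Hourly per-IP limits taken from the list above; the keys are hypothetical names.
HOURLY_LIMITS = {"analysis": 100, "read": 1000, "search": 500}
AUTH_MULTIPLIER = 5       # "Authenticated users: 5x higher limits"
WINDOW_SECONDS = 3600


class SlidingWindowLimiter:
    """In-memory sliding-window limiter (illustrative; not shared across processes)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # (ip, endpoint) -> timestamps of accepted requests, oldest first
        self._hits = defaultdict(deque)

    def allow(self, ip, endpoint, authenticated=False):
        limit = HOURLY_LIMITS[endpoint] * (AUTH_MULTIPLIER if authenticated else 1)
        now = self._clock()
        hits = self._hits[(ip, endpoint)]
        # Drop timestamps that have left the one-hour window.
        while hits and now - hits[0] >= WINDOW_SECONDS:
            hits.popleft()
        if len(hits) >= limit:
            return False
        hits.append(now)
        return True
```

A production deployment would more likely keep these counters in a shared store so that all application instances see the same totals; that choice is outside this specification.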
822 +
823 +**Authentication & Authorization:**
824 +* **API Keys:** Required for programmatic access
825 +* **JWT tokens:** For user sessions (1-hour expiry)
826 +* **OAuth2:** For third-party integrations
827 +* **Role-Based Access Control (RBAC):**
828 + * Public: Read-only access to published claims
829 + * Contributor: Submit claims, provide evidence
830 + * Moderator: Review contributions, manage quality
831 + * Admin: System configuration, user management
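
A minimal sketch of the four roles above as a deny-by-default permission check. It assumes the roles are hierarchical (each role inherits the permissions of the roles below it), which the specification does not state; the action names are hypothetical.

```python
from enum import IntEnum


class Role(IntEnum):
    PUBLIC = 0
    CONTRIBUTOR = 1
    MODERATOR = 2
    ADMIN = 3


# Minimum role required per action. Action names are hypothetical;
# the role-to-capability mapping follows the list above.
REQUIRED_ROLE = {
    "read_published_claim": Role.PUBLIC,
    "submit_claim": Role.CONTRIBUTOR,
    "provide_evidence": Role.CONTRIBUTOR,
    "review_contribution": Role.MODERATOR,
    "configure_system": Role.ADMIN,
    "manage_users": Role.ADMIN,
}


def is_allowed(role, action):
    """Deny by default: actions missing from the table are refused for every role."""
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required
```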
832 +
833 +**CORS Policies:**
834 +* Whitelist approved domains only
835 +* No wildcard origins in production
836 +* Credentials required for sensitive endpoints
837 +
838 +**Input Sanitization:**
839 +* Validate all user input against schemas
840 +* Sanitize HTML/JavaScript in text submissions
841 +* Prevent SQL injection (use parameterized queries)
842 +* Prevent command injection (no shell execution of user input)
843 +* Max request size: 10MB
844 +* File upload restrictions: Whitelist file types, scan for malware
845 +
846 +---
847 +
848 +==== Data Security ====
849 +
850 +**Encryption at Rest:**
851 +* Database encryption using AES-256
852 +* Encrypted backups
853 +* Key management via cloud provider KMS (AWS KMS, Google Cloud KMS)
854 +* Regular key rotation (90-day cycle)
855 +
856 +**Encryption in Transit:**
857 +* HTTPS with TLS 1.3 only (TLS 1.0, 1.1, and 1.2 disabled)
858 +* Strong cipher suites only
859 +* HSTS (HTTP Strict Transport Security) enabled
860 +* Certificate pinning for mobile apps
861 +
862 +**Secure Credential Storage:**
863 +* Passwords hashed with bcrypt (cost factor 12+)
864 +* API keys encrypted in database
865 +* Secrets stored in environment variables (never in code)
866 +* Use secrets manager (AWS Secrets Manager, HashiCorp Vault)
867 +
868 +**Data Privacy:**
869 +* Minimal data collection (privacy by design)
870 +* User data deletion on request (GDPR compliance)
871 +* PII encryption in database
872 +* Anonymize logs (no PII in log files)
873 +
874 +---
875 +
876 +==== Application Security ====
877 +
878 +**OWASP Top 10 Compliance:**
879 +
880 +1. **Broken Access Control:** RBAC implementation, path traversal prevention
881 +2. **Cryptographic Failures:** Strong encryption, secure key management
882 +3. **Injection:** Parameterized queries, input validation
883 +4. **Insecure Design:** Security review of all features
884 +5. **Security Misconfiguration:** Hardened defaults, security headers
885 +6. **Vulnerable Components:** Dependency scanning (see below)
886 +7. **Authentication Failures:** Strong password policy, MFA support
887 +8. **Data Integrity Failures:** Signature verification, checksums
888 +9. **Security Logging Failures:** Comprehensive audit logs
889 +10. **Server-Side Request Forgery:** URL validation, whitelist domains
890 +
891 +**Security Headers:**
892 +* `Content-Security-Policy`: Strict CSP to prevent XSS
893 +* `X-Frame-Options`: DENY (prevent clickjacking)
894 +* `X-Content-Type-Options`: nosniff
895 +* `Referrer-Policy`: strict-origin-when-cross-origin
896 +* `Permissions-Policy`: Restrict browser features
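
The header list above can be captured as one table that is merged into every response. This is a sketch only: the `Content-Security-Policy` and `Permissions-Policy` values are assumed examples, since the specification names the headers but not their exact values, and the helper name is hypothetical.

```python
SECURITY_HEADERS = {
    # Assumed example policy; the real CSP depends on the assets the site loads.
    "Content-Security-Policy": "default-src 'self'",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    # Assumed example: disable features the site is not expected to need.
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
}


def with_security_headers(headers):
    """Return a copy of the response headers with the security headers enforced.

    Security headers take precedence over any value already set by a handler.
    """
    merged = dict(headers)
    merged.update(SECURITY_HEADERS)
    return merged
```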
897 +
898 +**Dependency Vulnerability Scanning:**
899 +* **Tools:** Snyk, Dependabot, npm audit, pip-audit
900 +* **Frequency:** Daily automated scans
901 +* **Action:** Patch critical vulnerabilities within 24 hours
902 +* **Policy:** No known high/critical CVEs in production
903 +
904 +**Security Audits:**
905 +* **Internal:** Quarterly security reviews
906 +* **External:** Annual penetration testing by certified firm
907 +* **Bug Bounty:** Public bug bounty program (V1.1+)
908 +* **Compliance:** SOC 2 Type II certification target (V1.5)
909 +
910 +---
911 +
912 +==== Operational Security ====
913 +
914 +**DDoS Protection:**
915 +* CloudFlare or AWS Shield
916 +* Rate limiting at CDN layer
917 +* Automatic IP blocking for abuse patterns
918 +
919 +**Monitoring & Alerting:**
920 +* Real-time security event monitoring
921 +* Alerts for:
922 + * Failed login attempts (>5 in 10 minutes)
923 + * API abuse patterns
924 + * Unusual data access patterns
925 + * Security scan detections
926 +* Integration with SIEM (Security Information and Event Management)
927 +
928 +**Incident Response:**
929 +* Documented incident response plan
930 +* Security incident classification (P1-P4)
931 +* On-call rotation for security issues
932 +* Post-mortem for all security incidents
933 +* Public disclosure policy (coordinated disclosure)
934 +
935 +**Backup & Recovery:**
936 +* Daily encrypted backups
937 +* 30-day retention period
938 +* Tested recovery procedures (quarterly)
939 +* Disaster recovery plan (RTO: 4 hours, RPO: 1 hour)
940 +
941 +---
942 +
943 +==== Compliance & Standards ====
944 +
945 +**GDPR Compliance:**
946 +* User consent management
947 +* Right to access data
948 +* Right to deletion
949 +* Data portability
950 +* Privacy policy published
951 +
952 +**Accessibility:**
953 +* WCAG 2.1 AA compliance
954 +* Screen reader compatibility
955 +* Keyboard navigation
956 +* Alt text for images
957 +
958 +**Browser Support:**
959 +* Modern browsers only (Chrome/Edge/Firefox/Safari latest 2 versions)
960 +* No IE11 support
961 +
962 +**Acceptance Criteria:**
963 +
964 +* ✅ Passes OWASP ZAP security scan (no high/critical findings)
965 +* ✅ All dependencies with known vulnerabilities patched
966 +* ✅ Penetration test completed with no critical findings
967 +* ✅ Rate limiting blocks abuse attempts
968 +* ✅ Encryption at rest and in transit verified
969 +* ✅ Security headers scored A+ on securityheaders.com
970 +* ✅ Incident response plan documented and tested
971 +* ✅ 95% uptime over 30-day period
972 +
973 +
817 817  === NFR13: Quality Metrics Transparency ===
818 818  
819 -**Fulfills:** IFCN transparency, user trust
976 +**Fulfills:** User trust, transparency, continuous improvement, IFCN methodology transparency
820 820  
821 -**Public Metrics:**
822 -* Quality gates performance
823 -* Evidence quality stats
824 -* Hallucination rate
825 -* User feedback
978 +**Phase:** POC2 (internal), Beta 0 (public), V1.0 (real-time)
826 826  
827 -**Milestone:** POC2 (internal), Beta 0 (public), V1.0 (real-time)
980 +**Purpose:** Provide transparent, measurable quality metrics that demonstrate AKEL's performance and build user trust in automated fact-checking.
828 828  
829 -== 10. Requirements Priority Matrix ==
982 +**Specification:**
830 830  
831 -This table shows all functional and non-functional requirements ordered by urgency and priority.
984 +==== Component: Public Quality Dashboard ====
832 832  
833 -**Note:** Implementation phases (POC1, POC2, Beta 0, V1.0) are defined in [[POC Requirements>>FactHarbor.Specification.POC.Requirements]] and [[Implementation Roadmap>>FactHarbor.Implementation-Roadmap.WebHome]], not in this priority matrix.
986 +**Core Metrics to Display:**
834 834  
835 -**Priority Levels:**
836 -* **CRITICAL** - System doesn't work without it, or major safety/legal risk
837 -* **HIGH** - Core functionality, essential for success
838 -* **MEDIUM** - Important but not blocking
839 -* **LOW** - Nice to have, can be deferred
988 +**1. Verdict Quality Metrics**
840 840  
841 -**Urgency Levels:**
842 -* **HIGH** - Immediate need (critical for proof of concept)
843 -* **MEDIUM** - Important but not immediate
844 -* **LOW** - Future enhancement
990 +**TIGERScore (Fact-Checking Quality):**
991 +* **Definition:** Measures how well generated verdicts match expert fact-checker judgments
992 +* **Scale:** 0-100 (higher is better)
993 +* **Calculation:** Based on the TIGERScore framework (Trained metric that follows Instruction Guidance to perform Explainable and Reference-free evaluation), with its output mapped to the 0-100 scale above
994 +* **Target:** Average ≥80 for production release
995 +* **Display:**
996 +{{code}}
997 +Verdict Quality (TIGERScore):
998 +Overall: 84.2 ▲ (+2.1 from last month)
845 845  
846 -|= ID |= Title |= Priority |= Urgency
847 -| **HIGH URGENCY** |||
848 -| **FR1** | Claim Intake | CRITICAL | HIGH
849 -| **FR5** | Evidence Collection | CRITICAL | HIGH
850 -| **FR7** | Verdict Computation | CRITICAL | HIGH
851 -| **NFR11** | Quality Assurance Framework | CRITICAL | HIGH
852 -| **FR2** | Claim Normalization | HIGH | HIGH
853 -| **FR3** | Claim Classification | HIGH | HIGH
854 -| **FR4** | Scenario Generation | HIGH | HIGH
855 -| **FR6** | Evidence Evaluation | HIGH | HIGH
856 -| **MEDIUM URGENCY** |||
857 -| **NFR12** | Security Controls | CRITICAL | MEDIUM
858 -| **FR9** | Corrections | HIGH | MEDIUM
859 -| **FR44** | ClaimReview Schema | HIGH | MEDIUM
860 -| **FR45** | Corrections Notification | HIGH | MEDIUM
861 -| **FR48** | Safety Framework | HIGH | MEDIUM
862 -| **NFR3** | Transparency | HIGH | MEDIUM
863 -| **NFR13** | Quality Metrics | HIGH | MEDIUM
864 -| **FR8** | User Contribution | MEDIUM | MEDIUM
865 -| **FR10** | Publishing | MEDIUM | MEDIUM
866 -| **FR13** | API | MEDIUM | MEDIUM
867 -| **FR46** | Image Verification | MEDIUM | MEDIUM
868 -| **FR47** | Archive.org Integration | MEDIUM | MEDIUM
869 -| **NFR1** | Performance | MEDIUM | MEDIUM
870 -| **NFR2** | Scalability | MEDIUM | MEDIUM
871 -| **NFR4** | Security & Privacy | MEDIUM | MEDIUM
872 -| **NFR5** | Maintainability | MEDIUM | MEDIUM
873 -| **LOW URGENCY** |||
874 -| **FR11** | Social Sharing | LOW | LOW
875 -| **FR12** | Notifications | LOW | LOW
876 -| **FR49** | A/B Testing | LOW | LOW
877 -| **FR50** | OSINT Toolkit Integration | LOW | LOW
878 -| **FR51** | Video Verification System | LOW | LOW
879 -| **FR52** | Interactive Detection Training | LOW | LOW
880 -| **FR53** | Cross-Organizational Sharing | LOW | LOW
1000 +Distribution:
1001 + Excellent (>80): 67%
1002 + Good (60-80): 28%
1003 + Needs Improvement (<60): 5%
881 881  
882 -**Total:** 31 requirements (23 Functional, 8 Non-Functional)
1005 +Trend: [Graph showing improvement over time]
1006 +{{/code}}
883 883  
884 -**See also:**
885 -* [[POC Requirements>>FactHarbor.Specification.POC.Requirements]] - POC1 scope and simplifications
886 -* [[Implementation Roadmap>>FactHarbor.Implementation-Roadmap.WebHome]] - Phase-by-phase implementation plan
887 -* [[User Needs>>FactHarbor.Specification.Requirements.User Needs.WebHome]] - Foundation that drives these requirements
1008 +**2. Hallucination & Faithfulness Metrics**
888 888  
889 -=== 10.1 User Needs Priority ===
1010 +**AlignScore (Faithfulness to Evidence):**
1011 +* **Definition:** Measures how well verdicts align with actual evidence content
1012 +* **Scale:** 0-1 (higher is better)
1013 +* **Purpose:** Detect AI hallucinations (making claims not supported by evidence)
1014 +* **Target:** Average ≥0.85, hallucination rate <5%
1015 +* **Display:**
1016 +{{code}}
1017 +Evidence Faithfulness (AlignScore):
1018 +Average: 0.87 ▼ (-0.02 from last month)
890 890  
891 -User Needs (UN) are the foundation that drives functional and non-functional requirements. They are not independently prioritized; instead, their priority is inherited from the FR/NFR requirements they drive.
1020 +Hallucination Rate: 4.2%
1021 + - Claims without evidence support: 3.1%
1022 + - Misrepresented evidence: 1.1%
892 892  
893 -|= ID |= Title |= Drives Requirements
894 -| **UN-1** | Trust Assessment at a Glance | Multiple FR/NFR
895 -| **UN-2** | Claim Extraction and Verification | FR1-7
896 -| **UN-3** | Article Summary with FactHarbor Analysis Summary | FR4
897 -| **UN-4** | Social Media Fact-Checking | FR1, FR4
898 -| **UN-5** | Source Provenance and Track Records | FR6
899 -| **UN-6** | Publisher Reliability History | FR6
900 -| **UN-7** | Evidence Transparency | NFR3
901 -| **UN-8** | Understanding Disagreement and Consensus | FR4
902 -| **UN-9** | Methodology Transparency | NFR3, NFR11
903 -| **UN-10** | Manipulation Tactics Detection | FR48
904 -| **UN-11** | Filtered Research | FR3
905 -| **UN-12** | Submit Unchecked Claims | FR8
906 -| **UN-13** | Cite FactHarbor Verdicts | FR10
907 -| **UN-14** | API Access for Integration | FR13
908 -| **UN-15** | Verdict Evolution Timeline | FR7
909 -| **UN-16** | AI vs. Human Review Status | FR9
910 -| **UN-17** | In-Article Claim Highlighting | FR1
911 -| **UN-26** | Search Engine Visibility | FR44
912 -| **UN-27** | Visual Claim Verification | FR46
913 -| **UN-28** | Safe Contribution Environment | FR48
1024 +Action: Prompt engineering review scheduled
1025 +{{/code}}
914 914  
915 -**Total:** 20 User Needs
1027 +**3. Evidence Quality Metrics**
916 916  
917 -**Note:** Each User Need inherits priority from the requirements it drives. For example, UN-2 (Claim Extraction and Verification) drives FR1-7, which are CRITICAL/HIGH priority, therefore UN-2 is also critical to the project.
1029 +**Source Reliability:**
1030 +* Average source quality score (0-1 scale)
1031 +* Distribution of high/medium/low quality sources
1032 +* Publisher track record trends
918 918  
919 -== 11. MVP Scope ==
1034 +**Evidence Coverage:**
1035 +* Average number of sources per claim
1036 +* Percentage of claims with ≥2 sources (EFCSN minimum)
1037 +* Geographic diversity of sources
920 920  
921 -**Phase 1 (Months 1-3): Read-Only MVP**
1039 +**Display:**
1040 +{{code}}
1041 +Evidence Quality:
922 922  
923 -Build:
924 -* Automated claim analysis
925 -* Confidence scoring
926 -* Source evaluation
927 -* Browse/search interface
928 -* User flagging system
1043 +Average Sources per Claim: 4.2
1044 +Claims with ≥2 sources: 94% (EFCSN compliant)
929 929  
930 -**Goal**: Prove AI quality before adding user editing
1046 +Source Quality Distribution:
1047 + High quality (>0.8): 48%
1048 + Medium quality (0.5-0.8): 43%
1049 + Low quality (<0.5): 9%
931 931  
932 -**User Needs fulfilled in Phase 1**: UN-1, UN-2, UN-3, UN-4, UN-5, UN-6, UN-7, UN-8, UN-9, UN-12
1051 +Geographic Diversity: 23 countries represented
1052 +{{/code}}
933 933  
934 -**Phase 2 (Months 4-6): User Contributions**
1054 +**4. Contributor Consensus Metrics** (when human reviewers involved)
935 935  
936 -Add only if needed:
937 -* Simple editing (Wikipedia-style)
938 -* Reputation system
939 -* Basic moderation
940 -* In-article claim highlighting (FR13)
1056 +**Inter-Rater Reliability (IRR):**
1057 +* **Calculation:** Cohen's Kappa (two raters) or Fleiss' Kappa (three or more raters)
1058 +* **Scale:** -1 to 1 (higher is better; 0 means agreement no better than chance)
1059 +* **Interpretation:**
1060 + * >0.8: Almost perfect agreement
1061 + * 0.6-0.8: Substantial agreement
1062 + * 0.4-0.6: Moderate agreement
1063 + * <0.4: Poor agreement
1064 +* **Target:** Maintain ≥0.7 (substantial agreement)
941 941  
942 -**Additional User Needs fulfilled**: UN-13, UN-17
1066 +**Display:**
1067 +{{code}}
1068 +Contributor Consensus:
943 943  
944 -**Phase 3 (Months 7-12): Refinement**
1070 +Inter-Rater Reliability (IRR): 0.73 (Substantial agreement)
1071 + - Verdict agreement: 78%
1072 + - Evidence quality agreement: 71%
1073 + - Scenario structure agreement: 69%
945 945  
946 -* Continuous quality improvement
947 -* Feature additions based on real usage
948 -* Scale infrastructure
1075 +Cases requiring moderator review: 12
1076 +Moderator override rate: 8%
1077 +{{/code}}
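
For the two-rater case, the inter-rater reliability figure above can be computed as Cohen's Kappa: observed agreement corrected for the agreement expected by chance. The sketch below uses only the standard library; the function name and the example labels are illustrative, and Fleiss' Kappa for three or more raters is not covered.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters who labelled the same items in the same order."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Example: agreement on 3 of 4 verdicts, chance agreement 0.5, so Kappa is 0.5.
kappa = cohens_kappa(
    ["Supported", "Supported", "Refuted", "Refuted"],
    ["Supported", "Supported", "Refuted", "Supported"],
)
```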
949 949  
950 -**Additional User Needs fulfilled**: UN-14 (API access), UN-15 (Full evolution tracking)
1079 +---
951 951  
952 -**Deferred**:
953 -* Federation (until multiple successful instances exist)
954 -* Complex contribution workflows (focus on automation)
955 -* Extensive role hierarchy (keep simple)
1081 +==== Quality Dashboard Implementation ====
956 956  
957 -== 12. Success Metrics ==
1083 +**Dashboard Location:** `/quality-metrics`
958 958  
959 -**System Quality** (track weekly):
960 -* Error rate by category (target: -10%/month)
961 -* Average confidence score (target: increase)
962 -* Source quality distribution (target: more high-quality)
963 -* Contradiction detection rate (target: increase)
1085 +**Update Frequency:**
1086 +* **POC2:** Weekly manual updates
1087 +* **Beta 0:** Daily automated updates
1088 +* **V1.0:** Near-real-time metrics (updated hourly)
964 964  
965 -**Efficiency** (track monthly):
966 -* Claims processed per hour (target: increase)
967 -* Human hours per claim (target: decrease)
968 -* Automation coverage (target: >90%)
969 -* Re-work rate (target: <5%)
1090 +**Dashboard Sections:**
970 970  
971 -**User Satisfaction** (track quarterly):
972 -* User flag rate (issues found)
973 -* Correction acceptance rate (flags valid)
974 -* Return user rate
975 -* Trust indicators (surveys)
1092 +1. **Overview:** Key metrics at a glance
1093 +2. **Verdict Quality:** TIGERScore trends and distributions
1094 +3. **Evidence Analysis:** Source quality and coverage
1095 +4. **AI Performance:** Hallucination rates, AlignScore
1096 +5. **Human Oversight:** Contributor consensus, review rates
1097 +6. **System Health:** Processing times, error rates, uptime
976 976  
977 -**User Needs Metrics** (track quarterly):
978 -* UN-1: % users who understand trust scores
979 -* UN-4: Time to verify social media claim (target: <30s)
980 -* UN-7: % users who access evidence details
981 -* UN-8: % users who view multiple scenarios
982 -* UN-15: % users who check evolution timeline
983 -* UN-17: % users who enable in-article highlighting; avg. time spent on highlighted vs. non-highlighted articles
1099 +**Example Dashboard Layout:**
984 984  
1101 +{{code}}
1102 +┌─────────────────────────────────────────────────────────────┐
1103 +│ FactHarbor Quality Metrics                    Last updated: │
1104 +│ Public Dashboard                              2 hours ago   │
1105 +└─────────────────────────────────────────────────────────────┘
1106 +
1107 +📊 KEY METRICS
1108 +─────────────────────────────────────────────────────────────
1109 +TIGERScore (Verdict Quality): 84.2 ▲ (+2.1)
1110 +AlignScore (Faithfulness): 0.87 ▼ (-0.02)
1111 +Hallucination Rate: 4.2% ✓ (Target: <5%)
1112 +Average Sources per Claim: 4.2 ▲ (+0.3)
1113 +
1114 +📈 TRENDS (30 days)
1115 +─────────────────────────────────────────────────────────────
1116 +[Graph: TIGERScore trending upward]
1117 +[Graph: Hallucination rate declining]
1118 +[Graph: Evidence quality stable]
1119 +
1120 +⚠️ IMPROVEMENT TARGETS
1121 +─────────────────────────────────────────────────────────────
1122 +1. Reduce hallucination rate to <3% (Current: 4.2%)
1123 +2. Increase TIGERScore average to >85 (Current: 84.2)
1124 +3. Raise IRR to >0.75 (Current: 0.73)
1125 +
1126 +📄 DETAILED REPORTS
1127 +─────────────────────────────────────────────────────────────
1128 +• Monthly Quality Report (PDF)
1129 +• Methodology Documentation
1130 +• AKEL Performance Analysis
1131 +• Contributor Agreement Analysis
1132 +
1133 +{{/code}}
1134 +
1135 +---
1136 +
1137 +==== Continuous Improvement Feedback Loop ====
1138 +
1139 +**How Metrics Inform AKEL Improvements:**
1140 +
1141 +1. **Identify Weak Areas:**
1142 + * Low TIGERScore → Review prompt engineering
1143 + * High hallucination → Strengthen evidence grounding
1144 + * Low IRR → Clarify evaluation criteria
1145 +
1146 +2. **A/B Testing Integration:**
1147 + * Test prompt variations
1148 + * Measure impact on quality metrics
1149 + * Deploy winners automatically
1150 +
1151 +3. **Alert Thresholds:**
1152 + * TIGERScore drops below 75 → Alert team
1153 + * Hallucination rate exceeds 7% → Pause auto-publishing
1154 + * IRR below 0.6 → Moderator training needed
1155 +
1156 +4. **Monthly Quality Reviews:**
1157 + * Analyze trends
1158 + * Identify systematic issues
1159 + * Plan prompt improvements
1160 + * Update AKEL models
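
The alert thresholds in item 3 can be expressed as a single check that runs on each metrics update. This is a minimal sketch: the function name and the action identifiers are hypothetical, and the hallucination rate is assumed to be passed as a fraction (0.07 for 7%).

```python
def quality_alerts(tiger_score, hallucination_rate, irr):
    """Return the actions triggered by the alert thresholds listed above.

    tiger_score: 0-100; hallucination_rate: fraction (0.07 == 7%); irr: Kappa value.
    """
    actions = []
    if tiger_score < 75:
        actions.append("alert_team")
    if hallucination_rate > 0.07:
        actions.append("pause_auto_publishing")
    if irr < 0.6:
        actions.append("schedule_moderator_training")
    return actions
```

With the example dashboard values (84.2, 4.2%, 0.73), no threshold is crossed and the list is empty.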
1161 +
1162 +---
1163 +
1164 +==== Metric Calculation Details ====
1165 +
1166 +**TIGERScore Implementation:**
1167 +* Reference: https://github.com/TIGER-AI-Lab/TIGERScore
1168 +* Input: Generated verdict + reference verdict (from expert)
1169 +* Output: Error analysis with per-error penalty scores, mapped to the 0-100 scale used in this specification
1170 +* Requires: Test set of expert-reviewed claims (minimum 100)
1171 +
1172 +**AlignScore Implementation:**
1173 +* Reference: https://github.com/yuh-zha/AlignScore
1174 +* Input: Generated verdict + source evidence text
1175 +* Output: 0-1 faithfulness score
1176 +* Calculation: Semantic alignment between the generated verdict and the evidence text
1177 +
1178 +**Source Quality Scoring:**
1179 +* Use existing source reliability database (e.g., NewsGuard, MBFC)
1180 +* Factor in: Publication history, corrections record, transparency
1181 +* Scale: 0-1 (weighted average across sources)
1182 +
1183 +---
1184 +
1185 +==== Integration Points ====
1186 +
1187 +* **NFR11: AKEL Quality Assurance** - Metrics validate quality gate effectiveness
1188 +* **FR49: A/B Testing** - Metrics measure test success
1189 +* **FR11: Audit Trail** - Source of quality data
1190 +* **NFR3: Transparency** - Public metrics build trust
1191 +
1192 +**Acceptance Criteria:**
1193 +
1194 +* ✅ All core metrics implemented and calculating correctly
1195 +* ✅ Dashboard updates daily (Beta 0) or hourly (V1.0)
1196 +* ✅ Alerts trigger when metrics degrade beyond thresholds
1197 +* ✅ Monthly quality report auto-generates
1198 +* ✅ Dashboard is publicly accessible (no login required)
1199 +* ✅ Mobile-responsive dashboard design
1200 +* ✅ Metrics inform quarterly AKEL improvement planning
1201 +
1202 +
1203 +
1204 +
985 985  == 13. Requirements Traceability ==
986 986  
987 987  For full traceability matrix showing which requirements fulfill which user needs, see:
... ... @@ -1014,39 +1014,387 @@
1014 1014  
1015 1015  === FR44: ClaimReview Schema Implementation ===
1016 1016  
1017 -Generate valid ClaimReview structured data for Google/Bing visibility.
1237 +**Fulfills:** UN-13 (Cite FactHarbor Verdicts), UN-14 (API Access for Integration), UN-26 (Search Engine Visibility)
1018 1018  
1019 -**Schema.org Mapping:**
1239 +**Phase:** V1.0
1240 +
1241 +**Purpose:** Generate valid ClaimReview structured data for every published analysis to enable Google/Bing search visibility and fact-check discovery.
1242 +
1243 +**Specification:**
1244 +
1245 +==== Component: Schema.org Markup Generator ====
1246 +
1247 +FactHarbor must generate valid ClaimReview structured data following Schema.org specifications for every published claim analysis.
1248 +
1249 +**Required JSON-LD Schema:**
1250 +
1251 +{{code language="json"}}
1252 +{
1253 + "@context": "https://schema.org",
1254 + "@type": "ClaimReview",
1255 + "datePublished": "YYYY-MM-DD",
1256 + "url": "https://factharbor.org/claims/{claim_id}",
1257 + "claimReviewed": "The exact claim text",
1258 + "author": {
1259 + "@type": "Organization",
1260 + "name": "FactHarbor",
1261 + "url": "https://factharbor.org"
1262 + },
1263 + "reviewRating": {
1264 + "@type": "Rating",
1265 + "ratingValue": "1-5",
1266 + "bestRating": "5",
1267 + "worstRating": "1",
1268 + "alternateName": "FactHarbor likelihood score"
1269 + },
1270 + "itemReviewed": {
1271 + "@type": "Claim",
1272 + "author": {
1273 + "@type": "Person",
1274 + "name": "Claim author if known"
1275 + },
1276 + "datePublished": "YYYY-MM-DD if known",
1277 + "appearance": {
1278 + "@type": "CreativeWork",
1279 + "url": "Original claim URL if from article"
1280 + }
1281 + }
1282 +}
1283 +{{/code}}
1284 +
1285 +**FactHarbor-Specific Mapping:**
1286 +
1287 +**Likelihood Score to Rating Scale:**
1020 1020  * 80-100% likelihood → 5 (Highly Supported)
1021 -* 60-79% → 4 (Supported)
1022 -* 40-59% → 3 (Mixed)
1023 -* 20-39% → 2 (Questionable)
1024 -* 0-19% → 1 (Refuted)
1289 +* 60-79% likelihood → 4 (Supported)
1290 +* 40-59% likelihood → 3 (Mixed/Uncertain)
1291 +* 20-39% likelihood → 2 (Questionable)
1292 +* 0-19% likelihood → 1 (Refuted)
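
The likelihood bands above map to the 1-5 rating as sketched below. The function name is hypothetical, and treating each band's lower bound as inclusive for non-integer scores (for example, 79.5% falls in band 4) is an assumption, since the specification lists whole percentages only.

```python
# (lower bound in percent, ratingValue, label), highest band first.
RATING_BANDS = [
    (80, 5, "Highly Supported"),
    (60, 4, "Supported"),
    (40, 3, "Mixed/Uncertain"),
    (20, 2, "Questionable"),
    (0, 1, "Refuted"),
]


def likelihood_to_rating(likelihood_percent):
    """Map a 0-100 likelihood score to the (ratingValue, label) pair listed above."""
    if not 0 <= likelihood_percent <= 100:
        raise ValueError("likelihood must be between 0 and 100")
    for lower_bound, value, label in RATING_BANDS:
        if likelihood_percent >= lower_bound:
            return value, label
```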
1025 1025  
1026 -**Milestone:** V1.0
1294 +**Multiple Scenarios Handling:**
1295 +* If claim has multiple scenarios with different verdicts, generate **separate ClaimReview** for each scenario
1296 +* Add `disambiguatingDescription` field explaining scenario context
1297 +* Example: "Scenario: If interpreted as referring to 2023 data..."
1027 1027  
1299 +==== Implementation Requirements ====
1300 +
1301 +1. **Auto-generate** on claim publication
1302 +2. **Embed** in HTML `<head>` section as JSON-LD script
1303 +3. **Validate** against Schema.org validator before publishing
1304 +4. **Submit** to Google Search Console for indexing
1305 +5. **Update** automatically when verdict changes (integrate with FR8: Time Evolution)
1306 +
1307 +==== Integration Points ====
1308 +
1309 +* **FR7: Automated Verdicts** - Source of rating data and claim text
1310 +* **FR8: Time Evolution** - Triggers schema updates when verdicts change
1311 +* **FR11: Audit Trail** - Logs all schema generation and update events
1312 +
1313 +==== Resources ====
1314 +
1315 +* ClaimReview Project: https://www.claimreviewproject.com
1316 +* Schema.org ClaimReview: https://schema.org/ClaimReview
1317 +* Google Fact Check Guidelines: https://developers.google.com/search/docs/appearance/fact-check
1318 +
1319 +**Acceptance Criteria:**
1320 +
1321 +* ✅ Passes the Schema Markup Validator at validator.schema.org (successor to Google's retired Structured Data Testing Tool)
1322 +* ✅ Appears in Google Fact Check Explorer within 48 hours of publication
1323 +* ✅ Valid JSON-LD syntax (no errors)
1324 +* ✅ All required fields populated with correct data types
1325 +* ✅ Handles multi-scenario claims correctly (separate ClaimReview per scenario)
1326 +
1327 +
1028 1028  === FR45: User Corrections Notification System ===
1029 1029  
1030 -Notify users when analyses are corrected.
1330 +**Fulfills:** IFCN Principle 5 (Open & Honest Corrections), EFCSN compliance
1031 1031  
1032 -**Mechanisms:**
1033 -1. In-page banner (30 days)
1034 -2. Public correction log
1035 -3. Email notifications (opt-in)
1036 -4. RSS/API feed
1332 +**Phase:** Beta 0 (basic), V1.0 (complete) **BLOCKER**
1037 1037  
1038 -**Milestone:** Beta 0 (basic), V1.0 (complete) **BLOCKER**
1334 +**Purpose:** When any claim analysis is corrected, notify users who previously viewed the claim to maintain transparency and build trust.
1039 1039  
1336 +**Specification:**
1337 +
1338 +==== Component: Corrections Visibility Framework ====
1339 +
1340 +**Correction Types:**
1341 +
1342 +1. **Major Correction:** Verdict changes category (e.g., "Supported" → "Refuted")
1343 +2. **Significant Correction:** Likelihood score changes by more than 20 percentage points
1344 +3. **Minor Correction:** Evidence additions, source quality updates
1345 +4. **Scenario Addition:** New scenario added to existing claim
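
One way to apply the four correction types is to compare the verdict before and after the change. The sketch below makes three assumptions that the specification leaves open: the types are checked in the order major, significant, scenario addition; the likelihood threshold is read as percentage points; and every other change counts as minor. The `VerdictSnapshot` type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerdictSnapshot:
    category: str        # verdict category, e.g. "Supported"
    likelihood: float    # likelihood score in percent (0-100)
    scenario_count: int  # number of scenarios attached to the claim


def classify_correction(before, after):
    """Classify a correction by comparing two snapshots of the same claim."""
    if before.category != after.category:
        return "major"
    if abs(after.likelihood - before.likelihood) > 20:
        return "significant"
    if after.scenario_count > before.scenario_count:
        return "scenario_addition"
    # Evidence additions, source quality updates, and anything else.
    return "minor"
```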
1346 +
1347 +==== Notification Mechanisms ====
1348 +
1349 +**1. In-Page Banner:**
1350 +
1351 +Display prominent banner on claim page:
1352 +
1353 +{{code}}
1354 +[!] CORRECTION NOTICE
1355 +This analysis was updated on [DATE]. [View what changed] [Dismiss]
1356 +
1357 +Major changes:
1358 +• Verdict changed from "Likely True (75%)" to "Uncertain (45%)"
1359 +• New contradicting evidence added from [Source]
1360 +• Scenario 2 updated with additional context
1361 +
1362 +[See full correction log]
1363 +{{/code}}
1364 +
1365 +**2. Correction Log Page:**
1366 +
1367 +* Public changelog at `/claims/{id}/corrections`
1368 +* Displays for each correction:
1369 + * Date/time of correction
1370 + * What changed (before/after comparison)
1371 + * Why changed (reason if provided)
1372 + * Who made change (AKEL auto-update vs. contributor override)
1373 +
1374 +**3. Email Notifications (opt-in):**
1375 +
1376 +* Send to users who bookmarked or shared the claim
1377 +* Subject: "FactHarbor Correction: [Claim title]"
1378 +* Include summary of changes
1379 +* Link to updated analysis
1380 +
1381 +**4. RSS/API Feed:**
1382 +
1383 +* Corrections feed at `/corrections.rss`
1384 +* API endpoint: `GET /api/corrections?since={timestamp}`
1385 +* Enables external monitoring by journalists and researchers
1386 +
1387 +==== Display Rules ====
1388 +
1389 +* Show banner on **ALL pages** displaying the claim (search results, related claims, embeds)
1390 +* Banner persists for **30 days** after correction
1391 +* **"Corrections" count badge** on claim card
1392 +* **Timestamp** on every verdict: "Last updated: [datetime]"
1393 +
1394 +==== IFCN Compliance Requirements ====
1395 +
1396 +* Corrections policy published at `/corrections-policy`
1397 +* User can report suspected errors via `/report-error/{claim_id}`
1398 +* Link to IFCN complaint process (if FactHarbor becomes signatory)
1399 +* **Scrupulous transparency:** Never silently edit analyses
1400 +
1401 +==== Integration Points ====
1402 +
1403 +* **FR8: Time Evolution** - Triggers corrections when verdicts change
1404 +* **FR11: Audit Trail** - Source of correction data and change history
1405 +* **NFR3: Transparency** - Public correction log demonstrates commitment
1406 +
1407 +**Acceptance Criteria:**
1408 +
1409 +* ✅ Banner appears within 60 seconds of correction
1410 +* ✅ Correction log is permanent and publicly accessible
1411 +* ✅ Email notifications deliver within 5 minutes
1412 +* ✅ RSS feed updates in real-time
1413 +* ✅ Mobile-responsive banner design
1414 +* ✅ Accessible (screen reader compatible)
1415 +
1416 +
1040 1040  === FR46: Image Verification System ===
1041 1041  
1042 -**Methods:**
1043 -1. Reverse image search
1044 -2. EXIF metadata analysis
1045 -3. Manipulation detection (basic)
1046 -4. Context verification
1419 +**Fulfills:** UN-27 (Visual Claim Verification)
1047 1047  
1048 -**Milestone:** Beta 0 (basic), V1.0 (extended)
1421 +**Phase:** Beta 0 (basic), V1.0 (extended)
1049 1049  
1423 +**Purpose:** Verify authenticity and context of images shared with claims to detect manipulation, misattribution, and out-of-context usage.
1424 +
1425 +**Specification:**
1426 +
1427 +==== Component: Multi-Method Image Verification ====
1428 +
1429 +**Method 1: Reverse Image Search**
1430 +
1431 +**Purpose:** Find earlier uses of the image to verify context
1432 +
1433 +**Implementation:**
1434 +* Integrate APIs:
1435 + * **Google Vision AI** (reverse search)
1436 + * **TinEye** (oldest known uses)
1437 + * **Bing Visual Search** (broad coverage)
1438 +
1439 +**Process:**
1440 +1. Extract image from claim or user upload
1441 +2. Query multiple reverse search services
1442 +3. Analyze results for:
1443 + * Earliest known publication
1444 + * Original context (what was it really showing?)
1445 + * Publication timeline
1446 + * Geographic spread
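
The result-analysis step above can be sketched as follows. This is a minimal illustration, not the FactHarbor implementation: the `SearchHit` type and both function names are hypothetical, and the hits are assumed to have already been collected from the reverse search services. It only covers the date comparison; judging whether the original context matches the claim still needs semantic analysis or human review.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SearchHit:
    """One reverse-search result (hypothetical structure)."""
    published: date
    context: str


def earliest_use(hits: list[SearchHit]) -> SearchHit:
    """Return the hit with the earliest known publication date."""
    if not hits:
        raise ValueError("no reverse-search hits to analyze")
    return min(hits, key=lambda hit: hit.published)


def predates_event(hits: list[SearchHit], claimed_event: date) -> bool:
    """True if the image was already published before the claimed event."""
    return earliest_use(hits).published < claimed_event


hits = [
    SearchHit(date(2020, 7, 22), "Bangladesh monsoon"),
    SearchHit(date(2019, 3, 15), "Mumbai floods"),
]
first = earliest_use(hits)
print(first.published.isoformat(), first.context)
print(predates_event(hits, date(2024, 10, 15)))
```

An image that predates the claimed event is a strong out-of-context signal, which is why the earliest publication is extracted first.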
1447 +
1448 +**Output:**
1449 +{{code}}
1450 +Reverse Image Search Results:
1451 +
1452 +Earliest known use: 2019-03-15 (5 years before claim)
1453 +Original context: "Photo from 2019 flooding in Mumbai"
1454 +This claim uses it for: "2024 hurricane damage in Florida"
1455 +
1456 +⚠️ Image is OUT OF CONTEXT
1457 +
1458 +Found in 47 other articles:
1459 +• 2019-03-15: Mumbai floods (original)
1460 +• 2020-07-22: Bangladesh monsoon
1461 +• 2024-10-15: Current claim (misattributed)
1462 +
1463 +[View full timeline]
1464 +{{/code}}
1465 +
1466 +---
1467 +
1468 +**Method 2: AI Manipulation Detection**
1469 +
1470 +**Purpose:** Detect deepfakes, face swaps, and digital alterations
1471 +
1472 +**Implementation:**
1473 +* Integrate detection services:
1474 + * **Sensity AI** (deepfake detection)
1475 + * **Reality Defender** (multimodal analysis)
1476 + * **AWS Rekognition** (face detection inconsistencies)
1477 +
1478 +**Detection Categories:**
1479 +1. **Face Manipulation:**
1480 + * Deepfake face swaps
1481 + * Expression manipulation
1482 + * Identity replacement
1483 +
1484 +2. **Image Manipulation:**
1485 + * Copy-paste artifacts
1486 + * Clone stamp detection
1487 + * Content-aware fill detection
1488 + * JPEG compression inconsistencies
1489 +
1490 +3. **AI Generation:**
1491 + * Detect fully AI-generated images
1492 + * Identify generation artifacts
1493 + * Check for model signatures
1494 +
1495 +**Confidence Scoring:**
1496 +* **HIGH (80-100%):** Strong evidence of manipulation
1497 +* **MEDIUM (50-79%):** Suspicious artifacts detected
1498 +* **LOW (0-49%):** Minor inconsistencies or inconclusive
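
A minimal sketch of the banding above, assuming the detection services return a score between 0 and 100. The function name is hypothetical. The specification lists integer ranges, so treating fractional scores such as 79.5 as MEDIUM is an assumption made here.

```python
def manipulation_band(score_percent: float) -> str:
    """Map a 0-100 manipulation score to the HIGH/MEDIUM/LOW bands."""
    if not 0 <= score_percent <= 100:
        raise ValueError("score must be between 0 and 100")
    if score_percent >= 80:
        return "HIGH"    # strong evidence of manipulation
    if score_percent >= 50:
        return "MEDIUM"  # suspicious artifacts detected
    return "LOW"         # minor inconsistencies or inconclusive


for score in (12, 64, 8):
    print(score, manipulation_band(score))
```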
1499 +
1500 +**Output:**
1501 +{{code}}
1502 +Manipulation Analysis:
1503 +
1504 +Face Manipulation: LOW RISK (12%)
1505 +Image Editing: MEDIUM RISK (64%)
1506 + • Clone stamp artifacts detected in sky region
1507 + • JPEG compression inconsistent between objects
1508 +
1509 +AI Generation: LOW RISK (8%)
1510 +
1511 +⚠️ Possible manipulation detected. Manual review recommended.
1512 +{{/code}}
1513 +
1514 +---
1515 +
1516 +**Method 3: Metadata Analysis (EXIF)**
1517 +
1518 +**Purpose:** Extract technical details that may reveal manipulation or misattribution
1519 +
1520 +**Extracted Data:**
1521 +* **Camera/Device:** Make, model, software
1522 +* **Timestamps:** Original date, modification dates
1523 +* **Location:** GPS coordinates (if present)
1524 +* **Editing History:** Software used, edit count
1525 +* **File Properties:** Resolution, compression, format conversions
1526 +
1527 +**Red Flags:**
1528 +* Metadata completely stripped (suspicious)
1529 +* Timestamp conflicts with claimed date
1530 +* GPS location conflicts with claimed location
1531 +* Multiple edit rounds (hiding something?)
1532 +* Creation date after modification date (impossible)
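
The red-flag checks above could look roughly like this. It is a sketch under stated assumptions: the metadata is already parsed into a dictionary, the keys `created`, `modified`, `gps`, and `edit_count` are hypothetical, and the 50 km location tolerance is an illustrative value that the specification does not define.

```python
import math
from datetime import date, datetime
from typing import Optional


def _distance_km(a: tuple, b: tuple) -> float:
    """Great-circle (haversine) distance between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def exif_red_flags(meta: dict,
                   claimed_date: Optional[date] = None,
                   claimed_coords: Optional[tuple] = None,
                   max_km: float = 50.0) -> list:
    """Return the red flags raised by already-extracted metadata."""
    if not meta:
        return ["Metadata completely stripped"]
    flags = []
    created, modified = meta.get("created"), meta.get("modified")
    if claimed_date and created and created.date() != claimed_date:
        flags.append("Timestamp conflicts with claimed date")
    gps = meta.get("gps")
    if claimed_coords and gps and _distance_km(gps, claimed_coords) > max_km:
        flags.append("GPS location conflicts with claimed location")
    if meta.get("edit_count", 0) > 1:
        flags.append("Multiple edit rounds")
    if created and modified and created > modified:
        flags.append("Creation date after modification date")
    return flags


meta = {
    "created": datetime(2023, 8, 12, 14, 32, 15),
    "modified": datetime(2024, 10, 15, 8, 45, 22),
    "gps": (40.7128, -74.0060),  # New York City
}
los_angeles = (34.0522, -118.2437)
print(exif_red_flags(meta, claimed_coords=los_angeles))
```

Stripped metadata is reported on its own because none of the other checks can run without data.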
1533 +
1534 +**Output:**
1535 +{{code}}
1536 +Image Metadata:
1537 +
1538 +Camera: iPhone 14 Pro
1539 +Original date: 2023-08-12 14:32:15
1540 +Location: 40.7128°N, 74.0060°W (New York City)
1541 +Modified: 2024-10-15 08:45:22
1542 +Software: Adobe Photoshop 2024
1543 +
1544 +⚠️ Location conflicts with claim
1545 +Claim says: "Taken in Los Angeles"
1546 +EXIF says: New York City
1547 +
1548 +⚠️ Edited 14 months after capture
1549 +{{/code}}
1550 +
1551 +---
1552 +
1553 +==== Verification Workflow ====
1554 +
1555 +**Automatic Triggers:**
1556 +1. User submits claim with image
1557 +2. Article being analyzed contains images
1558 +3. Social media post includes photos
1559 +
1560 +**Process:**
1561 +1. Extract images from content
1562 +2. Run all 3 verification methods in parallel
1563 +3. Aggregate results into confidence score
1564 +4. Generate human-readable summary
1565 +5. Display prominently in analysis
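
Steps 2 and 3 of the process above can be sketched with a thread pool. The three checker functions are hypothetical stubs that return fixed values in place of real API calls, and taking the highest individual risk as the aggregate is an assumption; the specification does not define the aggregation rule.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stubs standing in for the three verification methods.
def reverse_search(image: bytes) -> dict:
    return {"method": "reverse_search", "risk": 0.10}


def manipulation_check(image: bytes) -> dict:
    return {"method": "manipulation", "risk": 0.64}


def metadata_check(image: bytes) -> dict:
    return {"method": "metadata", "risk": 0.05}


METHODS = (reverse_search, manipulation_check, metadata_check)


def verify_image(image: bytes) -> dict:
    """Run all methods in parallel and aggregate their results."""
    with ThreadPoolExecutor(max_workers=len(METHODS)) as pool:
        futures = [pool.submit(method, image) for method in METHODS]
        # Collect in submission order so the report layout is stable.
        results = [future.result() for future in futures]
    # Assumed aggregation rule: the worst individual risk wins.
    overall = max(result["risk"] for result in results)
    return {"results": results, "overall_risk": overall}


report = verify_image(b"example image bytes")
print(report["overall_risk"])
```

Threads suit this step because the methods are expected to spend most of their time waiting on external services.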
1566 +
1567 +**Display Integration:**
1568 +
1569 +Show image verification panel in claim analysis:
1570 +
1571 +{{code}}
1572 +📷 IMAGE VERIFICATION
1573 +
1574 +[Image thumbnail]
1575 +
1576 +✅ Reverse Search: Original context verified
1577 +⚠️ Manipulation: Possible editing detected (64% confidence)
1578 +✅ Metadata: Consistent with claim details
1579 +
1580 +Overall Assessment: CAUTION ADVISED
1581 +This image may have been edited. Original context appears accurate.
1582 +
1583 +[View detailed analysis]
1584 +{{/code}}
1585 +
1586 +==== Integration Points ====
1587 +
1588 +* **FR7: Automated Verdicts** - Image verification affects claim credibility
1589 +* **FR4: Analysis Summary** - Image findings included in summary
1590 +* **UN-27: Visual Claim Verification** - Direct fulfillment
1591 +
1592 +==== Cost Considerations ====
1593 +
1594 +**API Costs (estimated per image):**
1595 +* Google Vision AI: $0.001-0.003
1596 +* TinEye: $0.02 (commercial API)
1597 +* Sensity AI: $0.05-0.10
1598 +* AWS Rekognition: $0.001-0.002
1599 +
1600 +**Total per image:** ~$0.07-0.13 (sum of the four services priced above)
1601 +
1602 +**Mitigation Strategies:**
1603 +* Cache results for duplicate images
1604 +* Use free tier quotas where available
1605 +* Prioritize higher-value claims for deep analysis
1606 +* Offer premium verification as paid tier
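
The first mitigation, caching results for duplicate images, might be sketched like this. The class name and the in-memory dictionary are illustrative assumptions; a production cache would more likely live in a shared store. A SHA-256 key only matches byte-identical files, so resized or re-encoded copies would need a perceptual hash, which this sketch does not attempt.

```python
import hashlib
from typing import Callable


class VerificationCache:
    """Cache verification results keyed by the image's SHA-256 digest."""

    def __init__(self, verify: Callable[[bytes], dict]):
        self._verify = verify   # the expensive verification call
        self._results = {}      # digest -> cached result
        self.api_calls = 0      # how often the paid call actually ran

    def get(self, image_bytes: bytes) -> dict:
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._results:
            self.api_calls += 1
            self._results[key] = self._verify(image_bytes)
        return self._results[key]


cache = VerificationCache(lambda data: {"size": len(data)})
cache.get(b"image-a")
cache.get(b"image-a")  # served from cache, no second API call
cache.get(b"image-b")
print(cache.api_calls)
```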
1607 +
1608 +**Acceptance Criteria:**
1609 +
1610 +* ✅ Reverse image search finds original sources
1611 +* ✅ Manipulation detection accuracy >80% on test dataset
1612 +* ✅ EXIF extraction works for major image formats (JPEG, PNG, HEIC)
1613 +* ✅ Results display within 10 seconds
1614 +* ✅ Mobile-friendly image comparison interface
1615 +* ✅ False positive rate <15%
1616 +
1617 +
1050 1050  === FR47: Archive.org Integration ===
1051 1051  
1052 1052  Auto-save evidence sources to Wayback Machine.
... ... @@ -1065,19 +1065,145 @@
1065 1065  
1066 1066  **Milestone:** V1.0
1067 1067  
1068 -=== FR50-FR53: Future Enhancements (V2.0+) ===
1636 +=== FR50: OSINT Toolkit Integration ===
1069 1069  
1070 -* **FR50:** OSINT Toolkit Integration
1071 -* **FR51:** Video Verification System
1072 -* **FR52:** Interactive Detection Training
1073 -* **FR53:** Cross-Organizational Sharing
1074 1074  
1075 -**Milestone:** V2.0+ (12-18 months post-launch)
1076 1076  
1077 -== Enhanced Existing Requirements ==
1640 +**Fulfills:** Advanced media verification
1641 +**Phase:** V1.1
1078 1078  
1079 -=== FR7: Automated Verdicts (Enhanced with Quality Gates) ===
1643 +**Purpose:** Integrate open-source intelligence tools for advanced verification.
1080 1080  
1645 +**Tools to Integrate:**
1646 +* InVID/WeVerify (video verification)
1647 +* Bellingcat toolkit
1648 +* Additional TBD based on V1.0 learnings
1649 +
1650 +=== FR51: Video Verification System ===
1651 +
1652 +
1653 +
1654 +**Fulfills:** UN-27 (Visual Claim Verification), advanced media verification
1655 +**Phase:** V1.1
1656 +
1657 +**Purpose:** Verify video-based claims.
1658 +
1659 +**Specification:**
1660 +* Keyframe extraction
1661 +* Reverse video search
1662 +* Deepfake detection (AI-powered)
1663 +* Metadata analysis
1664 +* Acoustic signature analysis
1665 +
1666 +=== FR52: Interactive Detection Training ===
1667 +
1668 +
1669 +
1670 +**Fulfills:** Media literacy education
1671 +**Phase:** V1.5
1672 +
1673 +**Purpose:** Teach users to identify misinformation.
1674 +
1675 +**Specification:**
1676 +* Interactive tutorials
1677 +* Practice exercises
1678 +* Detection quizzes
1679 +* Gamification elements
1680 +
1681 +=== FR53: Cross-Organizational Sharing ===
1682 +
1683 +
1684 +
1685 +**Fulfills:** Collaboration with other fact-checkers
1686 +**Phase:** V1.5
1687 +
1688 +**Purpose:** Share findings with IFCN/EFCSN members.
1689 +
1690 +**Specification:**
1691 +* API for fact-checking organizations
1692 +* Structured data exchange
1693 +* Privacy controls
1694 +* Attribution requirements
1695 +
1696 +
1697 +== Summary ==
1698 +
1699 +**V1.0 Critical Requirements (Must Have):**
1700 +
1701 +* FR44: ClaimReview Schema ✅
1702 +* FR45: Corrections Notification ✅
1703 +* FR46: Image Verification ✅
1704 +* FR47: Archive.org Integration ✅
1705 +* FR48: Contributor Safety ✅
1706 +* FR49: A/B Testing ✅
1707 +* FR54: Evidence Deduplication ✅
1708 +* NFR11: Quality Assurance Framework ✅
1709 +* NFR12: Security Controls ✅
1710 +* NFR13: Quality Metrics Dashboard ✅
1711 +
1712 +**V1.1+ (Future):**
1713 +
1714 +* FR50: OSINT Integration
1715 +* FR51: Video Verification
1716 +* FR52: Detection Training
1717 +* FR53: Cross-Org Sharing
1718 +
1719 +
1720 +**Total:** 10 critical requirements for V1.0
1721 +
1722 +=== FR54: Evidence Deduplication ===
1723 +
1724 +
1725 +
1726 +**Fulfills:** Accurate evidence counting, quality metrics
1727 +**Phase:** POC2, Beta 0, V1.0
1728 +
1729 +**Purpose:** Avoid counting the same source multiple times when it appears in different forms.
1730 +
1731 +**Specification:**
1732 +
1733 +**Deduplication Logic:**
1734 +
1735 +1. **URL Normalization:**
1736 + * Remove tracking parameters (?utm_source=...)
1737 + * Normalize http/https
1738 + * Normalize www/non-www
1739 + * Handle redirects
1740 +
1741 +2. **Content Similarity:**
1742 + * If two sources have >90% text similarity → Same source
1743 + * If one is subset of other → Same source
1744 + * Use fuzzy matching for minor differences
1745 +
1746 +3. **Cross-Domain Syndication:**
1747 + * Detect wire service content (AP, Reuters)
1748 + * Mark as single source if syndicated
1749 + * Count original publication only
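
The first two rules above can be sketched with the Python standard library. Function names and the `utm_` tracking prefix list are assumptions, and `difflib.SequenceMatcher` stands in for whatever fuzzy matcher is eventually chosen. Redirect handling and syndication detection are left out because they need network access and publisher data.

```python
from difflib import SequenceMatcher
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)  # assumed list; extend as needed


def normalize_url(url: str) -> str:
    """Drop tracking parameters and unify scheme, www, and trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if not key.lower().startswith(TRACKING_PREFIXES)]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(("https", host, path, urlencode(query), ""))


def same_source(text_a: str, text_b: str, threshold: float = 0.90) -> bool:
    """Apply the subset rule, then the >90% similarity rule."""
    if not text_a or not text_b:
        return False
    if text_a in text_b or text_b in text_a:
        return True
    return SequenceMatcher(None, text_a, text_b).ratio() > threshold


print(normalize_url("http://www.example.com/story/?utm_source=x&id=7"))
print(same_source("The quick brown fox jumps over the lazy dog",
                  "The quick brown fox jumped over the lazy dog"))
```

Normalizing URLs first is cheap and removes most duplicates before the more expensive text comparison runs.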
1750 +
1751 +**Display:**
1752 +
1753 +{{code}}
1754 +Evidence Sources (3 unique, 5 total):
1755 +
1756 +1. Original Article (NYTimes)
1757 + - Also appeared in: WashPost, Guardian (syndicated)
1758 +
1759 +2. Research Paper (Nature)
1760 +
1761 +3. Official Statement (WHO)
1762 +{{/code}}
1763 +
1764 +**Acceptance Criteria:**
1765 +
1766 +* ✅ URL normalization works
1767 +* ✅ Content similarity detected
1768 +* ✅ Syndicated content identified
1769 +* ✅ Unique vs. total counts accurate
1770 +* ✅ Improves evidence quality metrics
1771 +
1772 +
1773 +== Additional Requirements (Lower Priority) ==
 +
 +=== FR7: Automated Verdicts (Enhanced with Quality Gates) ===
1774 +
1081 1081  **POC1+ Enhancement:**
1082 1082  
1083 1083  After AKEL generates verdict, it passes through quality gates: