= Architecture =
FactHarbor's architecture is designed for **simplicity, automation, and continuous improvement**.
== 1. Core Principles ==
* **AI-First**: AKEL (the AI Knowledge Extraction Layer) is the primary system; humans supplement it
* **Publish by Default**: No centralized approval (removed in V0.9.50); content is published with confidence scores
* **System Over Data**: Fix algorithms, not individual outputs
* **Measure Everything**: Quality metrics drive improvements
* **Scale Through Automation**: Minimal human intervention
* **Start Simple**: Add complexity only when metrics prove it necessary
== 2. High-Level Architecture ==
{{include reference="FactHarbor.Specification.Diagrams.High-Level Architecture.WebHome"/}}
=== 2.1 Three-Layer Architecture ===
FactHarbor uses a clean three-layer architecture:
==== Interface Layer ====
Handles all user and system interactions:
* **Web UI**: Browse claims, view evidence, submit feedback
* **REST API**: Programmatic access for integrations
* **Authentication & Authorization**: User identity and permissions
* **Rate Limiting**: Protect against abuse
==== Processing Layer ====
Core business logic and AI processing:
* **AKEL Pipeline**: AI-driven claim analysis (parallel processing)
** Parse and extract claim components
** Gather evidence from multiple sources
** Check source track records
** Extract scenarios from evidence
** Synthesize verdicts
** Calculate risk scores
* **Background Jobs**: Automated maintenance tasks
** Source track record updates (weekly)
** Cache warming and invalidation
** Metrics aggregation
** Data archival
* **Quality Monitoring**: Automated quality checks
** Anomaly detection
** Contradiction detection
** Completeness validation
* **Moderation Detection**: Automated abuse detection
** Spam identification
** Manipulation detection
** Flag suspicious activity
==== Data & Storage Layer ====
Persistent data storage and caching:
* **PostgreSQL**: Primary database for all core data
** Claims, evidence, sources, users
** Scenarios, edits, audit logs
** Built-in full-text search
** Time-series capabilities for metrics
* **Redis**: High-speed caching layer
** Session data
** Frequently accessed claims
** API rate limiting
* **S3 Storage**: Long-term archival
** Old edit history (90+ days)
** AKEL processing logs
** Backup snapshots
**Optional future additions** (add only when metrics prove them necessary):
* **Elasticsearch**: If PostgreSQL full-text search becomes slow
* **TimescaleDB**: If metrics queries become a bottleneck
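To make the start-simple point concrete, here is a minimal sketch of claim search built only on PostgreSQL's built-in full-text search, the capability Elasticsearch would later replace if it ever proves too slow. The ##claims## table, its ##search_vector## column, and the psycopg2 driver are illustrative assumptions, not the actual schema.
{{code language="python"}}
# Minimal sketch: claim search on PostgreSQL full-text search via psycopg2.
# Assumes a hypothetical "claims" table with an indexed tsvector column, e.g.:
#   ALTER TABLE claims ADD COLUMN search_vector tsvector
#     GENERATED ALWAYS AS (to_tsvector('english', claim_text)) STORED;
#   CREATE INDEX claims_search_idx ON claims USING GIN (search_vector);
import psycopg2

def search_claims(conn, query: str, limit: int = 20):
    """Return the best-matching claims for a free-text query."""
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, claim_text,
                   ts_rank(search_vector, plainto_tsquery('english', %s)) AS rank
            FROM claims
            WHERE search_vector @@ plainto_tsquery('english', %s)
            ORDER BY rank DESC
            LIMIT %s
            """,
            (query, query, limit),
        )
        return cur.fetchall()
{{/code}}
Only if queries like this consistently exceed the ~500ms trigger listed in section 10.1 would Elasticsearch be added.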
=== 2.2 Design Philosophy ===
**Start Simple, Evolve Based on Metrics**
The architecture deliberately starts simple:
* Single primary database (PostgreSQL handles most workloads initially)
* Three clear layers (easy to understand and maintain)
* Automated operations (minimal human intervention)
* Measure before optimizing (add complexity only when proven necessary)
See [[Design Decisions>>FactHarbor.Specification.Design-Decisions]] and [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for detailed rationale.
== 3. AKEL Architecture ==
{{include reference="FactHarbor.Specification.Diagrams.AKEL_Architecture.WebHome"/}}
See [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]] for detailed information.
== 4. Storage Architecture ==
{{include reference="FactHarbor.Specification.Diagrams.Storage Architecture.WebHome"/}}
See [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]] for detailed information.
== 4.5 Versioning Architecture ==
{{include reference="FactHarbor.Specification.Diagrams.Versioning Architecture.WebHome"/}}
== 5. Automated Systems in Detail ==
FactHarbor relies heavily on automation to achieve scale and quality. Here's how each automated system works:
=== 5.1 AKEL (AI Knowledge Extraction Layer) ===
**What it does**: Primary AI processing engine that analyzes claims automatically
**Inputs**:
* User-submitted claim text
* Existing evidence and sources
* Source track record database
**Processing steps**:
1. **Parse & Extract**: Identify key components, entities, and assertions
2. **Gather Evidence**: Search the web and database for relevant sources
3. **Check Sources**: Evaluate source reliability using track records
4. **Extract Scenarios**: Identify different contexts from evidence
5. **Synthesize Verdict**: Compile evidence assessment per scenario
6. **Calculate Risk**: Assess potential harm and controversy
**Outputs**:
* Structured claim record
* Evidence links with relevance scores
* Scenarios with context descriptions
* Verdict summary per scenario
* Overall confidence score
* Risk assessment
**Timing**: 10-18 seconds total (parallel processing)
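A minimal asyncio sketch of that fan-out is shown below. Every function name is a placeholder for a real pipeline stage, and which stages can truly overlap depends on their data dependencies; this only illustrates the shape of the parallelism.
{{code language="python"}}
import asyncio

# Hypothetical stage functions; each would wrap an LLM call or a search query.
async def parse_claim(text): ...
async def gather_evidence(parsed): ...
async def check_sources(parsed): ...
async def extract_scenarios(evidence): ...
async def synthesize_verdict(scenarios, evidence, source_scores): ...
async def calculate_risk(parsed, verdict): ...

async def run_akel_pipeline(claim_text: str) -> dict:
    """Run the six AKEL stages, overlapping the independent ones."""
    parsed = await parse_claim(claim_text)
    # Evidence gathering and source checks do not depend on each other,
    # so they run concurrently -- part of the parallelism behind the
    # 10-18 second total.
    evidence, source_scores = await asyncio.gather(
        gather_evidence(parsed),
        check_sources(parsed),
    )
    scenarios = await extract_scenarios(evidence)
    verdict = await synthesize_verdict(scenarios, evidence, source_scores)
    risk = await calculate_risk(parsed, verdict)
    return {"claim": parsed, "evidence": evidence, "scenarios": scenarios,
            "verdict": verdict, "risk": risk}

# Usage: asyncio.run(run_akel_pipeline("some claim text"))
{{/code}}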
=== 5.2 Background Jobs ===
**Source Track Record Updates** (Weekly):
* Analyze claim outcomes from the past week
* Calculate source accuracy and reliability
* Update the source_track_record table
* Never triggered by individual claims (prevents circular dependencies)
**Cache Management** (Continuous):
* Warm the cache for popular claims
* Invalidate the cache on claim updates
* Monitor cache hit rates
**Metrics Aggregation** (Hourly):
* Roll up detailed metrics
* Calculate system health indicators
* Generate performance reports
**Data Archival** (Daily):
* Move old AKEL logs to S3 (90+ days)
* Archive old edit history
* Compress and back up data
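One possible wiring of these cadences, sketched here with APScheduler as an assumed scheduler (any job queue with cron-style triggers would do); the job bodies and exact times are placeholders.
{{code language="python"}}
# Possible wiring of the job cadences above with APScheduler.
from apscheduler.schedulers.background import BackgroundScheduler

def update_source_track_records(): ...  # weekly batch; never run per claim
def manage_cache(): ...                 # warming, invalidation, hit-rate checks
def aggregate_metrics(): ...            # hourly roll-ups
def archive_old_data(): ...             # daily move of 90+ day data to S3

scheduler = BackgroundScheduler()
scheduler.add_job(update_source_track_records, "cron", day_of_week="mon", hour=3)
scheduler.add_job(manage_cache, "interval", minutes=1)   # "continuous" approximated
scheduler.add_job(aggregate_metrics, "cron", minute=0)   # top of every hour
scheduler.add_job(archive_old_data, "cron", hour=4)      # daily at 04:00
scheduler.start()
{{/code}}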
=== 5.3 Quality Monitoring ===
**Automated checks run continuously**:
* **Anomaly Detection**: Flag unusual patterns
** Sudden confidence score changes
** Unusual evidence distributions
** Suspicious source patterns
* **Contradiction Detection**: Identify conflicts
** Evidence that contradicts other evidence
** Claims with internal contradictions
** Source track record anomalies
* **Completeness Validation**: Ensure thoroughness
** Sufficient evidence gathered
** Multiple source types represented
** Key scenarios identified
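As an illustration of the first check, here is a toy heuristic that flags a sudden confidence score change as a statistical outlier against the claim's own history. The window size and z-score threshold are invented values, not specified behavior.
{{code language="python"}}
import statistics

# Toy heuristic for "sudden confidence score changes": flag a claim whose
# newest score is a statistical outlier versus its own recent history.
def confidence_anomaly(history: list[float], latest: float,
                       z_threshold: float = 3.0) -> bool:
    if len(history) < 5:      # too little history to judge
        return False
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > z_threshold
{{/code}}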
=== 5.4 Moderation Detection ===
**Automated abuse detection**:
* **Spam Identification**: Pattern matching for spam claims
* **Manipulation Detection**: Identify coordinated editing
* **Gaming Detection**: Flag attempts to game source scores
* **Suspicious Activity**: Log unusual behavior patterns
**Human Review**: Moderators review flagged items; the system learns from their decisions
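A toy sketch of the spam-identification stage follows; real detection would combine learned models (retrained on moderator decisions) with cheap pattern checks like these, and the patterns and limits here are invented.
{{code language="python"}}
import re

# Invented patterns purely for illustration of a cheap first-pass screen.
LINK_RE = re.compile(r"https?://\S+", re.IGNORECASE)
REPEAT_RE = re.compile(r"(.)\1{9,}")   # 10+ identical characters in a row

def looks_like_spam(claim_text: str, max_links: int = 3) -> bool:
    """Cheap spam screen run before deeper checks and human review."""
    if len(LINK_RE.findall(claim_text)) > max_links:
        return True
    return REPEAT_RE.search(claim_text) is not None
{{/code}}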
== 6. Scalability Strategy ==
=== 6.1 Horizontal Scaling ===
Components scale independently:
* **AKEL Workers**: Add more processing workers as claim volume grows
* **Database Read Replicas**: Add replicas for read-heavy workloads
* **Cache Layer**: Redis cluster for distributed caching
* **API Servers**: Load-balanced API instances
=== 6.2 Vertical Scaling ===
Individual components can be upgraded:
* **Database Server**: Increase CPU/RAM for PostgreSQL
* **Cache Memory**: Expand Redis memory
* **Worker Resources**: More powerful AKEL worker machines
=== 6.3 Performance Optimization ===
Built-in optimizations:
* **Denormalized Data**: Cache summary data in claim records (70% fewer joins)
* **Parallel Processing**: AKEL pipeline processes in parallel (40% faster)
* **Intelligent Caching**: Redis caches frequently accessed data
* **Background Processing**: Non-urgent tasks run asynchronously
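A minimal read-through cache sketch for frequently accessed claims, using redis-py; the key scheme and TTL are assumptions.
{{code language="python"}}
import json
import redis

r = redis.Redis()              # connection details assumed
CLAIM_TTL_SECONDS = 3600       # illustrative TTL

def get_claim(claim_id: int, load_from_db) -> dict:
    """Read-through cache for claim records (key scheme is assumed)."""
    key = f"claim:{claim_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    claim = load_from_db(claim_id)            # fall back to PostgreSQL
    r.setex(key, CLAIM_TTL_SECONDS, json.dumps(claim))
    return claim

def invalidate_claim(claim_id: int) -> None:
    """Called on claim updates, per the cache-invalidation job in 5.2."""
    r.delete(f"claim:{claim_id}")
{{/code}}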
== 7. Monitoring & Observability ==
=== 7.1 Key Metrics ===
The system tracks:
* **Performance**: AKEL processing time, API response time, cache hit rate
* **Quality**: Confidence score distribution, evidence completeness, contradiction rate
* **Usage**: Claims per day, active users, API requests
* **Errors**: Failed AKEL runs, API errors, database issues
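A sketch of instrumenting two of these metrics with the prometheus_client library (Prometheus is named in the stack summary below); the metric names are illustrative, not an agreed naming scheme.
{{code language="python"}}
from prometheus_client import Counter, Histogram, start_http_server

AKEL_SECONDS = Histogram("akel_processing_seconds",
                         "Wall-clock time of a full AKEL run")
AKEL_FAILURES = Counter("akel_failed_runs_total",
                        "AKEL runs that raised an error")

@AKEL_SECONDS.time()           # observes the duration of each call
def process_claim(claim_text: str):
    try:
        ...                    # the real pipeline would run here
    except Exception:
        AKEL_FAILURES.inc()
        raise

start_http_server(9100)        # expose /metrics for Prometheus to scrape
{{/code}}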
=== 7.2 Alerts ===
Automated alerts for:
* Processing time >30 seconds (threshold breach)
* Error rate >1% (quality issue)
* Cache hit rate <80% (cache problem)
* Database connections >80% capacity (scaling needed)
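In a Prometheus + Grafana setup these conditions would live in alerting rules; the plain-Python restatement below simply makes the four thresholds explicit. The snapshot keys are assumed names.
{{code language="python"}}
# Plain-Python restatement of the four alert conditions above.
def evaluate_alerts(snapshot: dict) -> list[str]:
    """`snapshot` is an assumed dict of current gauge values."""
    alerts = []
    if snapshot["akel_processing_seconds"] > 30:
        alerts.append("AKEL processing time above 30 s")
    if snapshot["error_rate"] > 0.01:
        alerts.append("Error rate above 1%")
    if snapshot["cache_hit_rate"] < 0.80:
        alerts.append("Cache hit rate below 80%")
    if snapshot["db_connection_usage"] > 0.80:
        alerts.append("Database connections above 80% of capacity")
    return alerts
{{/code}}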
=== 7.3 Dashboards ===
Real-time monitoring:
* **System Health**: Overall status and key metrics
* **AKEL Performance**: Processing time breakdown
* **Quality Metrics**: Confidence scores, completeness
* **User Activity**: Usage patterns, peak times
== 8. Security Architecture ==
=== 8.1 Authentication & Authorization ===
* **User Authentication**: Secure login with password hashing
* **Role-Based Access**: Reader, Contributor, Moderator, Admin
* **API Keys**: For programmatic access
* **Rate Limiting**: Prevent abuse
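A standard-library sketch of the first two items: PBKDF2 password hashing and an ordered role check. The iteration count is an assumption, and the role ordering follows the list above.
{{code language="python"}}
import hashlib
import hmac
import os

ROLES = ("reader", "contributor", "moderator", "admin")   # ordered low to high

def hash_password(password: str) -> tuple[bytes, bytes]:
    """PBKDF2 hashing; the salt is stored alongside the digest."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return hmac.compare_digest(digest, expected)   # constant-time comparison

def has_role(user_role: str, minimum: str) -> bool:
    """Role-based access check over the ordered role list."""
    return ROLES.index(user_role) >= ROLES.index(minimum)
{{/code}}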
=== 8.2 Data Security ===
* **Encryption**: TLS for transport, encrypted storage for sensitive data
* **Audit Logging**: Track all significant changes
* **Input Validation**: Sanitize all user inputs
* **SQL Injection Protection**: Parameterized queries
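The SQL injection rule in practice: user input is always passed as a bound parameter, never interpolated into the SQL string. The table and column names below are assumptions.
{{code language="python"}}
import psycopg2

def find_claims_by_author(conn, author_name: str):
    """Safe lookup: %s placeholders are bound by the driver."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT id, claim_text FROM claims WHERE author = %s",
            (author_name,),
        )
        return cur.fetchall()

# Never do this -- string interpolation reintroduces SQL injection:
#   cur.execute(f"SELECT ... WHERE author = '{author_name}'")
{{/code}}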
=== 8.3 Abuse Prevention ===
* **Rate Limiting**: Prevent flooding and DDoS
* **Automated Detection**: Flag suspicious patterns
* **Human Review**: Moderators investigate flagged content
* **Ban Mechanisms**: Block abusive users/IPs
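A minimal fixed-window rate limiter sketch backed by Redis, matching the rate-limiting items above; the limits and key scheme are assumptions. Keeping the counters in Redis means all load-balanced API servers share one view of each client.
{{code language="python"}}
import time
import redis

r = redis.Redis()   # shared state so every API server sees one count

def allow_request(client_id: str, limit: int = 60, window_s: int = 60) -> bool:
    """Allow at most `limit` requests per `window_s` seconds per client."""
    window = int(time.time() // window_s)
    key = f"ratelimit:{client_id}:{window}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_s)   # first hit in the window sets the TTL
    return count <= limit
{{/code}}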
== 9. Deployment Architecture ==
=== 9.1 Production Environment ===
**Components**:
* Load balancer (HAProxy or cloud LB)
* Multiple API servers (stateless)
* AKEL worker pool (auto-scaling)
* PostgreSQL primary + read replicas
* Redis cluster
* S3-compatible storage
**Regions**: Single region for V1.0, multi-region when needed
=== 9.2 Development & Staging ===
**Development**: Local Docker Compose setup
**Staging**: Scaled-down production replica
**CI/CD**: Automated testing and deployment
=== 9.3 Disaster Recovery ===
* **Database Backups**: Daily automated backups to S3
* **Point-in-Time Recovery**: Transaction log archival
* **Replication**: Real-time replication to a standby
* **Recovery Time Objective**: <4 hours

=== 9.5 Federation Architecture Diagram ===

{{include reference="FactHarbor.Specification.Diagrams.Federation Architecture.WebHome"/}}

== 10. Future Architecture Evolution ==
=== 10.1 When to Add Complexity ===
See [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for specific triggers.
**Elasticsearch**: When PostgreSQL search is consistently >500ms
**TimescaleDB**: When metrics queries are consistently >1s
**Federation**: When there are 10,000+ users and explicit demand
**Complex Reputation**: When there are 100+ active contributors
=== 10.2 Federation (V2.0+) ===
**Deferred until**:
* The core product is proven with 10,000+ users
* There is user demand for decentralization
* Single-node limits are reached
See [[Federation & Decentralization>>FactHarbor.Specification.Federation & Decentralization.WebHome]] for future plans.
== 11. Technology Stack Summary ==
**Backend**:
* Python (FastAPI or Django)
* PostgreSQL (primary database)
* Redis (caching)
**Frontend**:
* Modern JavaScript framework (React, Vue, or Svelte)
* Server-side rendering for SEO
**AI/LLM**:
* Multi-provider orchestration (Claude, GPT-4, local models)
* Fallback and cross-checking support
**Infrastructure**:
* Docker containers
* Kubernetes or cloud platform auto-scaling
* S3-compatible object storage
**Monitoring**:
* Prometheus + Grafana
* Structured logging (ELK or cloud logging)
* Error tracking (Sentry)
== 12. Related Pages ==
* [[AI Knowledge Extraction Layer (AKEL)>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
* [[Storage Strategy>>FactHarbor.Specification.Architecture.WebHome]]
* [[Data Model>>FactHarbor.Specification.Data Model.WebHome]]
* [[API Layer>>FactHarbor.Specification.Architecture.WebHome]]
* [[Design Decisions>>FactHarbor.Specification.Design-Decisions]]
* [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]]