Changes for page Design Decisions

Last modified by Robert Schaub on 2026/02/08 08:31

From version 1.3
edited by Robert Schaub
on 2026/02/08 08:30
Change comment: Renamed back-links.
To version 1.1
edited by Robert Schaub
on 2026/01/20 21:40
Change comment: Imported from XAR

Page properties
Content
... ... @@ -1,13 +1,9 @@
1 1  = Design Decisions =
2 -
3 3  This page explains key architectural choices in FactHarbor and why simpler alternatives were chosen over complex solutions.
4 4  **Philosophy**: Start simple; add complexity only when metrics prove it necessary.
5 -
6 6  == 1. Single Primary Database (PostgreSQL) ==
7 -
8 8  **Decision**: Use PostgreSQL for all data initially, not multiple specialized databases
9 9  **Alternatives considered**:
10 -
11 11  * ❌ PostgreSQL + TimescaleDB + Elasticsearch from day one
12 12  * ❌ Multiple specialized databases (graph, document, time-series)
13 13  * ❌ Microservices with separate databases
... ... @@ -23,12 +23,9 @@
23 23  * TimescaleDB: When metrics queries consistently >1s
24 24  * Graph DB: If relationship queries become complex
25 25  **Evidence**: Research shows single-DB architectures work well until 10,000+ users (Vertabelo, AWS patterns)
26 -
27 27  == 2. Three-Layer Architecture ==
28 -
29 29  **Decision**: Organize system into 3 layers (Interface, Processing, Data)
30 30  **Alternatives considered**:
31 -
32 32  * ❌ 7 layers (Ingestion, AKEL, Quality, Publication, Improvement, UI, Moderation)
33 33  * ❌ Pure microservices (20+ services)
34 34  * ❌ Monolithic single-layer
... ... @@ -39,12 +39,9 @@
39 39  * Can scale each layer independently
40 40  * Reduces cognitive load
41 41  **Research**: Modern architecture best practices recommend 3-4 layers maximum for maintainability
42 -
43 43  == 3. Deferred Federation ==
44 -
45 45  **Decision**: Single-node architecture for V1.0, federation only in V2.0+
46 46  **Alternatives considered**:
47 -
48 48  * ❌ Federated from day one
49 49  * ❌ P2P architecture
50 50  * ❌ Blockchain-based
... ... @@ -60,12 +60,9 @@
60 60  * Geographic distribution becomes necessary
61 61  * Censorship becomes a real problem
62 62  **Evidence**: Research shows premature federation increases failure risk (InfoQ MVP architecture)
63 -
64 64  == 4. Parallel AKEL Processing ==
65 -
66 66  **Decision**: Process evidence/sources/scenarios in parallel, not sequentially
67 67  **Alternatives considered**:
68 -
69 69  * ❌ Pure sequential pipeline (15-30 seconds)
70 70  * ❌ Fully async/event-driven (complex orchestration)
71 71  * ❌ Microservices per stage
... ... @@ -76,12 +76,9 @@
76 76  * Improves user experience
77 77  **Implementation**: Simple parallelization within single AKEL worker
78 78  **Evidence**: LLM orchestration research (2024-2025) strongly recommends pipeline parallelization
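As a minimal sketch of "simple parallelization within single AKEL worker", the three stages could be submitted to a thread pool and collected once all have finished. The stage functions `extract_evidence`, `find_sources`, and `build_scenarios` are hypothetical placeholders, not FactHarbor's actual API, and the sketch assumes the stages do not depend on each other's output.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stage functions; real stages would call LLMs or external services.
def extract_evidence(claim: str) -> list[str]:
    return [f"evidence for {claim}"]

def find_sources(claim: str) -> list[str]:
    return [f"source for {claim}"]

def build_scenarios(claim: str) -> list[str]:
    return [f"scenario for {claim}"]

def analyze_claim(claim: str) -> dict[str, list[str]]:
    """Run the three independent stages in parallel inside one worker."""
    stages = {
        "evidence": extract_evidence,
        "sources": find_sources,
        "scenarios": build_scenarios,
    }
    with ThreadPoolExecutor(max_workers=len(stages)) as pool:
        futures = {name: pool.submit(fn, claim) for name, fn in stages.items()}
        # result() blocks until the stage finishes and re-raises its exception, if any.
        return {name: future.result() for name, future in futures.items()}
```

Threads suit this case because the stages are presumably I/O-bound (waiting on LLM or network calls); total latency then approaches that of the slowest stage rather than the sum of all three.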
79 -
80 80  == 5. Simple Manual Roles ==
81 -
82 82  **Decision**: Manual role assignment for V1.0 (Reader, Contributor, Moderator, Admin)
83 83  **Alternatives considered**:
84 -
85 85  * ❌ Complex reputation point system from day one
86 86  * ❌ Automated privilege escalation
87 87  * ❌ Reputation decay algorithms
... ... @@ -96,12 +96,9 @@
96 96  * Manual role management becomes a bottleneck
97 97  * Clear abuse patterns emerge requiring automation
98 98  **Evidence**: Successful communities (Wikipedia, Stack Overflow) started simple and added complexity gradually
99 -
100 100  == 6. One-to-Many Scenarios ==
101 -
102 102  **Decision**: Each scenario belongs to a single claim (one claim, many scenarios) for V1.0
103 103  **Alternatives considered**:
104 -
105 105  * ❌ Many-to-many with junction table
106 106  * ❌ Scenarios as separate first-class entities
107 107  * ❌ Hierarchical scenario taxonomy
... ... @@ -115,12 +115,9 @@
115 115  * Clear use cases for scenario reuse emerge
116 116  * Performance doesn't degrade
117 117  **Trade-off**: Slight duplication of scenarios vs. simpler mental model
118 -
119 119  == 7. Two-Tier Edit History ==
120 -
121 121  **Decision**: Hot audit trail (PostgreSQL) + Cold debug logs (S3 archive)
122 122  **Alternatives considered**:
123 -
124 124  * ❌ Everything in PostgreSQL forever
125 125  * ❌ Everything archived immediately
126 126  * ❌ Complex versioning system from day one
... ... @@ -133,12 +133,9 @@
133 133  * Hot: Human edits, moderation actions, major AKEL updates
134 134  * Cold: All AKEL processing logs (archived after 90 days)
135 135  **Evidence**: Standard pattern for high-volume audit systems
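The hot/cold split above can be sketched as two small rules: one that picks the tier for an event, and one that decides when a log entry is old enough to archive. The event kind names below are assumptions chosen to mirror the bullet list; only the 90-day threshold comes from the page.

```python
from datetime import datetime, timedelta, timezone

# Assumed event kind names, mirroring the "Hot" bullet above.
HOT_EVENT_KINDS = {"human_edit", "moderation_action", "major_akel_update"}
ARCHIVE_AFTER = timedelta(days=90)  # threshold stated on the page

def storage_tier(event_kind: str) -> str:
    """Return 'hot' for audit-trail events, 'cold' for everything else."""
    return "hot" if event_kind in HOT_EVENT_KINDS else "cold"

def due_for_archive(logged_at: datetime, now: datetime) -> bool:
    """True once a log entry has reached the archive age."""
    return now - logged_at >= ARCHIVE_AFTER
```

A periodic job could call `due_for_archive` on cold entries and move the matching ones to the archive store.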
136 -
137 137  == 8. Denormalized Cache Fields ==
138 -
139 139  **Decision**: Store summary data in claim records (evidence_summary, source_names, scenario_count)
140 140  **Alternatives considered**:
141 -
142 142  * ❌ Fully normalized (join every time)
143 143  * ❌ Fully denormalized (duplicate everything)
144 144  * ❌ External cache only (Redis)
... ... @@ -145,7 +145,7 @@
145 145  **Why selective denormalization**:
146 146  * 70% fewer joins on common queries
147 147  * Much faster claim list/search pages
148 -* Trade-off: Small storage increase (10%)
123 +* Trade-off: Small storage increase (~10%)
149 149  * Read-heavy system (95% reads) benefits greatly
150 150  **Update strategy**:
151 151  * Immediate: On user-visible edits
... ... @@ -152,12 +152,9 @@
152 152  * Deferred: Background job every hour
153 153  * Invalidation: On source data changes
154 154  **Evidence**: Content management best practices recommend denormalization for read-heavy systems
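A minimal sketch of the selective denormalization described above, assuming an in-memory `Claim` record. The three cache fields are the ones named in the decision; the `cache_stale` flag, the 200-character summary limit, and the function names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    claim_id: int
    text: str
    # Denormalized cache fields named in the decision.
    evidence_summary: str = ""
    source_names: list[str] = field(default_factory=list)
    scenario_count: int = 0
    cache_stale: bool = True  # hypothetical flag, set on invalidation

def refresh_cache(claim: Claim, evidence: list[str],
                  sources: list[str], scenarios: list[str]) -> None:
    """Recompute the cached summary fields from the normalized data."""
    claim.evidence_summary = "; ".join(evidence)[:200]  # assumed length limit
    claim.source_names = sorted(set(sources))
    claim.scenario_count = len(scenarios)
    claim.cache_stale = False

def invalidate(claim: Claim) -> None:
    """Mark the cache stale when source data changes."""
    claim.cache_stale = True
```

Under this sketch, a user-visible edit would call `refresh_cache` immediately, while the hourly background job would refresh only claims whose `cache_stale` flag is set.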
155 -
156 156  == 9. Multi-Provider LLM Orchestration ==
157 -
158 158  **Decision**: Abstract LLM calls behind interface, support multiple providers
159 159  **Alternatives considered**:
160 -
161 161  * ❌ Hard-coded to single LLM provider
162 162  * ❌ Switch providers manually
163 163  * ❌ Complex multi-agent system
... ... @@ -168,12 +168,9 @@
168 168  * Resilience (automatic fallback)
169 169  **Implementation**: Simple routing layer, task-based provider selection
170 170  **Evidence**: Modern LLM app architecture (2024-2025) strongly recommends orchestration
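One way to picture the "simple routing layer" with task-based selection and automatic fallback is a mapping from task name to an ordered list of providers. Everything below is a hypothetical sketch: `LLMRouter`, the task name `"summarize"`, and the two stand-in providers are illustrative and do not correspond to any real provider SDK.

```python
from typing import Callable

# A provider is modeled as any callable that takes a prompt and returns text.
Provider = Callable[[str], str]

class LLMRouter:
    """Routes each task to an ordered list of providers, falling back on failure."""

    def __init__(self, routes: dict[str, list[Provider]]):
        self.routes = routes

    def complete(self, task: str, prompt: str) -> str:
        errors: list[Exception] = []
        for provider in self.routes[task]:
            try:
                return provider(prompt)
            except Exception as exc:  # any provider failure triggers fallback
                errors.append(exc)
        raise RuntimeError(f"all providers failed for task {task!r}: {errors}")

# Stand-in providers for illustration only.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("provider unavailable")

def backup_provider(prompt: str) -> str:
    return f"backup: {prompt}"

router = LLMRouter({"summarize": [flaky_provider, backup_provider]})
```

Because callers only see `complete(task, prompt)`, swapping or reordering providers is a configuration change rather than a code change.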
171 -
172 172  == 10. Source Scoring Separation ==
173 -
174 174  **Decision**: Separate source scoring (weekly batch) from claim analysis (real-time)
175 175  **Alternatives considered**:
176 -
177 177  * ❌ Update source scores during claim analysis
178 178  * ❌ Real-time score calculation
179 179  * ❌ Complex feedback loops
... ... @@ -188,12 +188,9 @@
188 188  * Monday-Saturday: Claims use those scores
189 189  * Never update scores during analysis
190 190  **Evidence**: Standard pattern to prevent feedback loops in ML systems
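The separation can be sketched by having the batch job publish a read-only snapshot that claim analysis can read but not modify. The averaging rule, the default score of 0.5, and the function names are assumptions made for illustration; the page does not specify how scores are computed.

```python
from types import MappingProxyType

def run_weekly_batch(raw_signals: dict[str, list[float]]) -> MappingProxyType:
    """Batch job: compute one score per source and publish a read-only snapshot."""
    scores = {
        source: sum(values) / len(values)  # assumed scoring rule: plain average
        for source, values in raw_signals.items()
        if values
    }
    return MappingProxyType(scores)

def score_claim_sources(source_ids: list[str], snapshot: MappingProxyType,
                        default: float = 0.5) -> dict[str, float]:
    """Claim analysis: look scores up in the snapshot, never write to it."""
    return {source_id: snapshot.get(source_id, default) for source_id in source_ids}
```

`MappingProxyType` rejects item assignment with a `TypeError`, so an accidental score update during analysis fails loudly instead of creating a feedback loop.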
191 -
192 192  == 11. Simple Versioning ==
193 -
194 194  **Decision**: Basic audit trail only for V1.0 (before/after values, who/when/why)
195 195  **Alternatives considered**:
196 -
197 197  * ❌ Full Git-like versioning from day one
198 198  * ❌ Branching and merging
199 199  * ❌ Time-travel queries
... ... @@ -208,11 +208,8 @@
208 208  * Users request "restore previous version"
209 209  * Need for branching emerges
210 210  **Evidence**: "You Aren't Gonna Need It" (YAGNI) principle from Extreme Programming
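A basic audit trail of this kind needs little more than an append-only list of immutable entries. The sketch below assumes an in-memory list; the field and function names are hypothetical, chosen to match the "before/after values, who/when/why" wording of the decision.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any

@dataclass(frozen=True)
class AuditEntry:
    entity_id: str
    field_name: str
    before: Any          # value prior to the edit
    after: Any           # value after the edit
    edited_by: str       # who
    edited_at: datetime  # when
    reason: str          # why

def record_edit(trail: list[AuditEntry], entity_id: str, field_name: str,
                before: Any, after: Any, edited_by: str, reason: str) -> AuditEntry:
    """Append one immutable entry to the audit trail and return it."""
    entry = AuditEntry(entity_id, field_name, before, after,
                       edited_by, datetime.now(timezone.utc), reason)
    trail.append(entry)
    return entry
```

Nothing here supports branching, merging, or time-travel queries, which is the point: those would be added only if the triggers listed above occur.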
211 -
212 212  == Design Philosophy ==
213 -
214 214  **Guiding Principles**:
215 -
216 216  1. **Start Simple**: Build minimum viable features
217 217  2. **Measure First**: Add complexity only when metrics prove necessity
218 218  3. **User-Driven**: Let user requests guide feature additions
... ... @@ -219,15 +219,12 @@
219 219  4. **Iterate**: Evolve based on real-world usage
220 220  5. **Fail Fast**: Simple systems fail in simple ways
221 221  **Inspiration**:
222 -
223 223  * "Premature optimization is the root of all evil" - Donald Knuth
224 224  * "You Aren't Gonna Need It" - Extreme Programming
225 225  * "Make it work, make it right, make it fast" - Kent Beck
226 226  **Result**: FactHarbor V1.0 is 35% simpler than the original design while maintaining all core functionality and becoming more scalable.
227 -
228 228  == Related Pages ==
229 -
230 -* [[Architecture>>Archive.FactHarbor 2026\.02\.08.Specification.Architecture.WebHome]]
190 +* [[Architecture>>FactHarbor.Specification.Architecture.WebHome]]
231 231  * [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]]
232 232  * [[Data Model>>FactHarbor.Specification.Data Model.WebHome]]
233 -* [[AKEL>>Archive.FactHarbor 2026\.02\.08.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
193 +* [[AKEL>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]