Changes for page Design Decisions

Last modified by Robert Schaub on 2026/02/08 08:31

From version 1.1
edited by Robert Schaub
on 2026/01/20 21:40
Change comment: Imported from XAR
To version 1.2
edited by Robert Schaub
on 2026/02/08 08:30
Change comment: Renamed back-links.

... ... @@ -1,9 +1,13 @@
1 1  = Design Decisions =
2 +
2 2  This page explains key architectural choices in FactHarbor and why simpler alternatives were chosen over complex solutions.
3 3  **Philosophy**: Start simple, add complexity only when metrics prove necessary.
5 +
4 4  == 1. Single Primary Database (PostgreSQL) ==
7 +
5 5  **Decision**: Use PostgreSQL for all data initially, not multiple specialized databases
6 6  **Alternatives considered**:
10 +
7 7  * ❌ PostgreSQL + TimescaleDB + Elasticsearch from day one
8 8  * ❌ Multiple specialized databases (graph, document, time-series)
9 9  * ❌ Microservices with separate databases
... ... @@ -19,9 +19,12 @@
19 19  * TimescaleDB: When metrics queries consistently >1s
20 20  * Graph DB: If relationship queries become complex
21 21  **Evidence**: Research shows single-DB architectures work well until 10,000+ users (Vertabelo, AWS patterns)
26 +
22 22  == 2. Three-Layer Architecture ==
28 +
23 23  **Decision**: Organize system into 3 layers (Interface, Processing, Data)
24 24  **Alternatives considered**:
31 +
25 25  * ❌ 7 layers (Ingestion, AKEL, Quality, Publication, Improvement, UI, Moderation)
26 26  * ❌ Pure microservices (20+ services)
27 27  * ❌ Monolithic single-layer
... ... @@ -32,9 +32,12 @@
32 32  * Can scale each layer independently
33 33  * Reduces cognitive load
34 34  **Research**: Modern architecture best practices recommend 3-4 layers maximum for maintainability
42 +
35 35  == 3. Deferred Federation ==
44 +
36 36  **Decision**: Single-node architecture for V1.0, federation only in V2.0+
37 37  **Alternatives considered**:
47 +
38 38  * ❌ Federated from day one
39 39  * ❌ P2P architecture
40 40  * ❌ Blockchain-based
... ... @@ -50,9 +50,12 @@
50 50  * Geographic distribution becomes necessary
51 51  * Censorship becomes a real problem
52 52  **Evidence**: Research shows premature federation increases failure risk (InfoQ MVP architecture)
63 +
53 53  == 4. Parallel AKEL Processing ==
65 +
54 54  **Decision**: Process evidence/sources/scenarios in parallel, not sequentially
55 55  **Alternatives considered**:
68 +
56 56  * ❌ Pure sequential pipeline (15-30 seconds)
57 57  * ❌ Fully async/event-driven (complex orchestration)
58 58  * ❌ Microservices per stage
... ... @@ -63,9 +63,12 @@
63 63  * Improves user experience
64 64  **Implementation**: Simple parallelization within single AKEL worker
65 65  **Evidence**: LLM orchestration research (2024-2025) strongly recommends pipeline parallelization
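
The "simple parallelization within single AKEL worker" idea can be sketched roughly as follows. This is a minimal illustration, not FactHarbor's implementation: the three stage functions and `analyze_claim` are hypothetical names, and the stages are assumed to be independent of each other and I/O-bound (for example, waiting on LLM calls), which is what makes a thread pool a reasonable fit.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stage functions; in a real worker each would make LLM or I/O calls.
def extract_evidence(claim: str) -> list[str]:
    return [f"evidence for: {claim}"]

def find_sources(claim: str) -> list[str]:
    return [f"source for: {claim}"]

def build_scenarios(claim: str) -> list[str]:
    return [f"scenario for: {claim}"]

def analyze_claim(claim: str) -> dict[str, list[str]]:
    """Run the three independent stages concurrently inside one worker."""
    stages = {
        "evidence": extract_evidence,
        "sources": find_sources,
        "scenarios": build_scenarios,
    }
    with ThreadPoolExecutor(max_workers=len(stages)) as pool:
        futures = {name: pool.submit(fn, claim) for name, fn in stages.items()}
        # result() blocks until that stage finishes and re-raises its exception, if any.
        return {name: future.result() for name, future in futures.items()}
```

With this shape the total latency approaches that of the slowest stage rather than the sum of all three, without introducing an event bus or separate services.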
79 +
66 66  == 5. Simple Manual Roles ==
81 +
67 67  **Decision**: Manual role assignment for V1.0 (Reader, Contributor, Moderator, Admin)
68 68  **Alternatives considered**:
84 +
69 69  * ❌ Complex reputation point system from day one
70 70  * ❌ Automated privilege escalation
71 71  * ❌ Reputation decay algorithms
... ... @@ -80,9 +80,12 @@
80 80  * Manual role management becomes a bottleneck
81 81  * Clear abuse patterns emerge requiring automation
82 82  **Evidence**: Successful communities (Wikipedia, Stack Overflow) started simple and added complexity gradually
99 +
83 83  == 6. One-to-Many Scenarios ==
101 +
84 84  **Decision**: Scenarios belong to single claims (one-to-many) for V1.0
85 85  **Alternatives considered**:
104 +
86 86  * ❌ Many-to-many with junction table
87 87  * ❌ Scenarios as separate first-class entities
88 88  * ❌ Hierarchical scenario taxonomy
... ... @@ -96,9 +96,12 @@
96 96  * Clear use cases for scenario reuse emerge
97 97  * Performance doesn't degrade
98 98  **Trade-off**: Slight duplication of scenarios vs. simpler mental model
118 +
99 99  == 7. Two-Tier Edit History ==
120 +
100 100  **Decision**: Hot audit trail (PostgreSQL) + Cold debug logs (S3 archive)
101 101  **Alternatives considered**:
123 +
102 102  * ❌ Everything in PostgreSQL forever
103 103  * ❌ Everything archived immediately
104 104  * ❌ Complex versioning system from day one
... ... @@ -111,9 +111,12 @@
111 111  * Hot: Human edits, moderation actions, major AKEL updates
112 112  * Cold: All AKEL processing logs (archived after 90 days)
113 113  **Evidence**: Standard pattern for high-volume audit systems
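
The hot/cold split could be expressed as a small routing rule like the one below. It is a sketch under assumptions: the event kind names, `LogEntry`, and `storage_tier` are invented for illustration, and the intermediate "recent" state (processing logs younger than 90 days that have not been archived yet) is one possible reading of "archived after 90 days".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Event kinds kept in the hot PostgreSQL audit trail (names are illustrative).
HOT_KINDS = {"human_edit", "moderation_action", "major_akel_update"}
ARCHIVE_AFTER = timedelta(days=90)

@dataclass(frozen=True)
class LogEntry:
    kind: str
    created_at: datetime

def storage_tier(entry: LogEntry, now: datetime) -> str:
    """Return 'hot', 'cold', or 'recent' for a log entry."""
    if entry.kind in HOT_KINDS:
        return "hot"      # stays in the audit trail regardless of age
    if now - entry.created_at >= ARCHIVE_AFTER:
        return "cold"     # old processing log: move to the archive
    return "recent"       # assumed state: processing log not yet archived
```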
136 +
114 114  == 8. Denormalized Cache Fields ==
138 +
115 115  **Decision**: Store summary data in claim records (evidence_summary, source_names, scenario_count)
116 116  **Alternatives considered**:
141 +
117 117  * ❌ Fully normalized (join every time)
118 118  * ❌ Fully denormalized (duplicate everything)
119 119  * ❌ External cache only (Redis)
... ... @@ -120,7 +120,7 @@
120 120  **Why selective denormalization**:
121 121  * 70% fewer joins on common queries
122 122  * Much faster claim list/search pages
123 -* Trade-off: Small storage increase (~10%)
148 +* Trade-off: Small storage increase (10%)
124 124  * Read-heavy system (95% reads) benefits greatly
125 125  **Update strategy**:
126 126  * Immediate: On user-visible edits
... ... @@ -127,9 +127,12 @@
127 127  * Deferred: Background job every hour
128 128  * Invalidation: On source data changes
129 129  **Evidence**: Content management best practices recommend denormalization for read-heavy systems
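
As a rough illustration of selective denormalization, the sketch below keeps the three cached fields named in the decision (`evidence_summary`, `source_names`, `scenario_count`) next to the data they summarize. Everything else is an assumption made for the example: the in-memory `Claim` class stands in for a database row, `cache_stale` is a hypothetical invalidation flag, and the summary format (first three evidence items) is arbitrary.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    evidence: list[str] = field(default_factory=list)
    sources: list[str] = field(default_factory=list)
    scenarios: list[str] = field(default_factory=list)
    # Denormalized cache fields named in the decision above.
    evidence_summary: str = ""
    source_names: str = ""
    scenario_count: int = 0
    cache_stale: bool = True  # hypothetical invalidation flag

def refresh_cache(claim: Claim) -> None:
    """Recompute the cached summary fields from the underlying data."""
    claim.evidence_summary = "; ".join(claim.evidence[:3])
    claim.source_names = ", ".join(sorted(set(claim.sources)))
    claim.scenario_count = len(claim.scenarios)
    claim.cache_stale = False

def add_source(claim: Claim, name: str) -> None:
    """Changing source data invalidates the cache instead of recomputing inline."""
    claim.sources.append(name)
    claim.cache_stale = True
```

In this reading, a user-visible edit would call `refresh_cache` immediately, while the hourly background job would refresh any claim still marked stale.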
155 +
130 130  == 9. Multi-Provider LLM Orchestration ==
157 +
131 131  **Decision**: Abstract LLM calls behind interface, support multiple providers
132 132  **Alternatives considered**:
160 +
133 133  * ❌ Hard-coded to single LLM provider
134 134  * ❌ Switch providers manually
135 135  * ❌ Complex multi-agent system
... ... @@ -140,9 +140,12 @@
140 140  * Resilience (automatic fallback)
141 141  **Implementation**: Simple routing layer, task-based provider selection
142 142  **Evidence**: Modern LLM app architecture (2024-2025) strongly recommends orchestration
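
A "simple routing layer" with task-based selection and automatic fallback might look something like this. The sketch assumes each provider can be wrapped as a plain callable that takes a prompt and returns text; `LLMRouter`, `AllProvidersFailed`, and the task names are hypothetical, and no real provider SDK is used.

```python
from typing import Callable

# A provider is any callable that maps a prompt to a completion (assumption).
Provider = Callable[[str], str]

class AllProvidersFailed(RuntimeError):
    """Raised when every provider configured for a task has failed."""

class LLMRouter:
    def __init__(self, routes: dict[str, list[Provider]]):
        # routes maps a task name to providers in order of preference.
        self._routes = routes

    def complete(self, task: str, prompt: str) -> str:
        errors: list[Exception] = []
        for provider in self._routes[task]:
            try:
                return provider(prompt)
            except Exception as exc:  # fall back to the next provider
                errors.append(exc)
        raise AllProvidersFailed(
            f"{len(errors)} provider(s) failed for task {task!r}"
        )
```

Because callers only see `complete(task, prompt)`, swapping or reordering providers is a configuration change rather than a code change.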
171 +
143 143  == 10. Source Scoring Separation ==
173 +
144 144  **Decision**: Separate source scoring (weekly batch) from claim analysis (real-time)
145 145  **Alternatives considered**:
176 +
146 146  * ❌ Update source scores during claim analysis
147 147  * ❌ Real-time score calculation
148 148  * ❌ Complex feedback loops
... ... @@ -157,9 +157,12 @@
157 157  * Monday-Saturday: Claims use those scores
158 158  * Never update scores during analysis
159 159  **Evidence**: Standard pattern to prevent feedback loops in ML systems
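
One way to enforce "never update scores during analysis" is to have claim analysis read from a read-only snapshot that only the weekly batch replaces. The sketch below assumes that design; `SourceScores`, the default score of 0.5, and the averaging rule in the batch are placeholders, not the actual scoring method.

```python
from collections import defaultdict
from types import MappingProxyType

class SourceScores:
    DEFAULT_SCORE = 0.5  # assumed neutral score for unknown sources

    def __init__(self, initial: dict[str, float]):
        # Read-only view: analysis code cannot mutate scores through it.
        self._snapshot = MappingProxyType(dict(initial))
        self._pending: list[tuple[str, float]] = []

    def score(self, source: str) -> float:
        """Used by claim analysis; always reads the current weekly snapshot."""
        return self._snapshot.get(source, self.DEFAULT_SCORE)

    def record_observation(self, source: str, value: float) -> None:
        """Queue an observation for the next batch; scores do not change yet."""
        self._pending.append((source, value))

    def run_weekly_batch(self) -> None:
        """Fold queued observations into a new snapshot (placeholder: mean)."""
        grouped: dict[str, list[float]] = defaultdict(list)
        for source, value in self._pending:
            grouped[source].append(value)
        updated = dict(self._snapshot)
        for source, values in grouped.items():
            updated[source] = sum(values) / len(values)
        self._snapshot = MappingProxyType(updated)
        self._pending = []
```

Because observations only accumulate in a queue between batches, a claim analyzed on any day of the week sees the same scores as every other claim that week, which is what breaks the feedback loop.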
191 +
160 160  == 11. Simple Versioning ==
193 +
161 161  **Decision**: Basic audit trail only for V1.0 (before/after values, who/when/why)
162 162  **Alternatives considered**:
196 +
163 163  * ❌ Full Git-like versioning from day one
164 164  * ❌ Branching and merging
165 165  * ❌ Time-travel queries
... ... @@ -174,8 +174,11 @@
174 174  * Users request "restore previous version"
175 175  * Need for branching emerges
176 176  **Evidence**: "You Aren't Gonna Need It" (YAGNI) principle from Extreme Programming
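
A basic audit trail of the kind described (before/after values plus who, when, and why) needs little more than an append-only list of entries. The following is a sketch under assumptions: `AuditEntry` and `apply_edit` are hypothetical names, and a plain dict and list stand in for the database row and audit table.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any

@dataclass(frozen=True)
class AuditEntry:
    field_name: str
    before: Any
    after: Any
    edited_by: str        # who
    edited_at: datetime   # when
    reason: str           # why

def apply_edit(record: dict[str, Any], trail: list[AuditEntry],
               field_name: str, new_value: Any, user: str, reason: str) -> None:
    """Apply one field change and append the matching audit entry."""
    entry = AuditEntry(
        field_name=field_name,
        before=record.get(field_name),
        after=new_value,
        edited_by=user,
        edited_at=datetime.now(timezone.utc),
        reason=reason,
    )
    record[field_name] = new_value
    trail.append(entry)
```

Since every entry stores the previous value, a later "restore previous version" feature could be built on the same data without migrating to Git-like versioning first.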
211 +
177 177  == Design Philosophy ==
213 +
178 178  **Guiding Principles**:
215 +
179 179  1. **Start Simple**: Build minimum viable features
180 180  2. **Measure First**: Add complexity only when metrics prove necessity
181 181  3. **User-Driven**: Let user requests guide feature additions
... ... @@ -182,12 +182,15 @@
182 182  4. **Iterate**: Evolve based on real-world usage
183 183  5. **Fail Fast**: Simple systems fail in simple ways
184 184  **Inspiration**:
222 +
185 185  * "Premature optimization is the root of all evil" - Donald Knuth
186 186  * "You Aren't Gonna Need It" - Extreme Programming
187 187  * "Make it work, make it right, make it fast" - Kent Beck
188 188  **Result**: FactHarbor V1.0 is 35% simpler than the original design while maintaining all core functionality and becoming more scalable.
227 +
189 189  == Related Pages ==
229 +
190 190  * [[Architecture>>FactHarbor.Specification.Architecture.WebHome]]
191 191  * [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]]
192 192  * [[Data Model>>FactHarbor.Specification.Data Model.WebHome]]
193 -* [[AKEL>>FactHarbor.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]
233 +* [[AKEL>>Archive.FactHarbor 2026\.02\.08.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]