Wiki source code of When to Add Complexity

Last modified by Robert Schaub on 2026/02/08 08:32

version	line-number	content
1.1	1	= When to Add Complexity =
1.2	2
1.1	3	FactHarbor starts simple and adds complexity only when metrics prove it's necessary. This page defines clear triggers for adding deferred features.
	4	Philosophy: Let data and user feedback drive complexity, not assumptions about future needs.
1.2	5
1.1	6	== 1. Add Elasticsearch ==
1.2	7
1.1	8	Current: PostgreSQL full-text search
	9	Add Elasticsearch when:
1.2	10
1.1	11	* ✅ PostgreSQL search queries consistently >500ms
	12	* ✅ Search accounts for >20% of total database load
	13	* ✅ Users complain about search speed
	14	* ✅ Search index size >50GB
	15	Metrics to monitor:
	16	* Search query response time (P95, P99)
	17	* Database CPU usage during search
	18	* User search abandonment rate
	19	* Search result relevance scores
	20	Before adding:
	21	* Try PostgreSQL search optimization (GIN indexes on tsvector columns, pg_trgm)
	22	* Profile slow queries
	23	* Consider query result caching
	24	* Estimate Elasticsearch costs
1.2	25	Implementation effort:
	26
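The latency trigger above can be checked mechanically. The following is a minimal sketch, not FactHarbor's actual monitoring code. It assumes latency samples are already grouped into time windows (for example one list per day) and reads "consistently" as "P95 above the threshold in every window", which is an interpretation the page does not spell out. The names `percentile` and `search_is_consistently_slow` are hypothetical.

```python
import math

SEARCH_P95_THRESHOLD_MS = 500  # threshold taken from the trigger list above


def percentile(samples, pct):
    """Nearest-rank percentile (pct is an integer 1..100) of a non-empty list."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based nearest rank
    return ordered[max(rank, 1) - 1]


def search_is_consistently_slow(windows, threshold_ms=SEARCH_P95_THRESHOLD_MS):
    """True if the P95 latency exceeds the threshold in every window.

    `windows` is a list of latency-sample lists in milliseconds.
    Treating "consistently" as "every window" is an assumption.
    """
    return bool(windows) and all(
        percentile(window, 95) > threshold_ms for window in windows
    )
```

A single slow window does not fire the trigger under this reading, which keeps one-off spikes from driving an architecture change.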
1.1	27	== 2. Add TimescaleDB ==
1.2	28
1.1	29	Current: PostgreSQL with time-series data in regular tables
	30	Add TimescaleDB when:
1.2	31
1.1	32	* ✅ Metrics queries consistently >1 second
	33	* ✅ Metrics tables >100GB
	34	* ✅ Need for time-series specific features (continuous aggregates, data retention policies)
	35	* ✅ Dashboard loading noticeably slow
	36	Metrics to monitor:
	37	* Metrics query response time
	38	* Metrics table size growth rate
	39	* Dashboard load time
	40	* Time-series query patterns
	41	Before adding:
	42	* Try PostgreSQL optimization (partitioning, materialized views)
	43	* Implement query result caching
	44	* Consider data aggregation strategies
	45	* Profile slow metrics queries
1.2	46	Implementation effort:
	47
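Tracking the table size growth rate makes it possible to estimate how far away the 100GB trigger is. The sketch below assumes linear growth, which is a simplification, and the function name `days_until_threshold` is hypothetical.

```python
import math

METRICS_TABLE_THRESHOLD_GB = 100  # threshold taken from the trigger list above


def days_until_threshold(current_gb, growth_gb_per_day,
                         threshold_gb=METRICS_TABLE_THRESHOLD_GB):
    """Estimate the days until the table reaches the threshold.

    Assumes linear growth. Returns 0 if the table is already at or past
    the threshold, and None if it is not growing at all.
    """
    if current_gb >= threshold_gb:
        return 0
    if growth_gb_per_day <= 0:
        return None
    return math.ceil((threshold_gb - current_gb) / growth_gb_per_day)
```

For example, a 40GB table growing by 0.5GB per day would reach the threshold in roughly 120 days under this assumption.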
1.1	48	== 3. Add Federation ==
1.2	49
1.1	50	Current: Single-node deployment with read replicas
	51	Add Federation when:
1.2	52
1.1	53	* ✅ 10,000+ users on single node
	54	* ✅ Users explicitly request ability to run own instances
	55	* ✅ Geographic latency becomes a significant problem (>200ms)
	56	* ✅ Censorship/control concerns emerge
	57	* ✅ Community demands decentralization
	58	Metrics to monitor:
	59	* Total active users
	60	* Geographic distribution of users
	61	* Single-node performance limits
	62	* User feature requests
	63	* Community sentiment
	64	Before adding:
	65	* Exhaust vertical scaling options
	66	* Add read replicas in multiple regions
	67	* Implement CDN for static content
	68	* Survey users about federation interest
1.2	69	Implementation effort: (major undertaking)
	70
1.1	71	== 4. Add Complex Reputation System ==
1.2	72
1.1	73	Current: Simple manual roles (Reader, Contributor, Moderator, Admin)
	74	Add Complex Reputation when:
1.2	75
1.1	76	* ✅ 100+ active contributors
	77	* ✅ Manual role management becomes a bottleneck (>5 hours/week)
	78	* ✅ Clear patterns of abuse require automated detection
	79	* ✅ Community requests reputation visibility
	80	Metrics to monitor:
	81	* Number of active contributors
	82	* Time spent on manual role management
	83	* Abuse incident rate
	84	* Contribution quality distribution
	85	* Community feedback on roles
	86	Before adding:
	87	* Document current manual process thoroughly
	88	* Identify most time-consuming tasks
	89	* Prototype automated reputation algorithm
	90	* Get community feedback on proposal
1.2	91	Implementation effort:
	92
1.1	93	== 5. Add Many-to-Many Scenarios ==
1.2	94
1.1	95	Current: Scenarios belong to single claims (one-to-many)
	96	Add Many-to-Many Scenarios when:
1.2	97
1.1	98	* ✅ Users request "apply this scenario to other claims"
	99	* ✅ Clear use cases for scenario reuse emerge
	100	* ✅ Scenario duplication becomes a significant storage issue
	101	* ✅ Cross-claim scenario analysis requested
	102	Metrics to monitor:
	103	* Scenario duplication rate
	104	* User feature requests
	105	* Storage costs of scenarios
	106	* Query patterns involving scenarios
	107	Before adding:
	108	* Analyze scenario duplication patterns
	109	* Design junction table schema
	110	* Plan data migration strategy
	111	* Consider query performance impact
1.2	112	Implementation effort:
	113
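To make the "design junction table schema" step concrete, here is a minimal sketch of a many-to-many link between claims and scenarios. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table and column names (`claims`, `scenarios`, `claim_scenarios`) are assumptions for illustration, not FactHarbor's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
conn.executescript("""
    CREATE TABLE claims (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
    CREATE TABLE scenarios (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Junction table: one row per (claim, scenario) link.
    CREATE TABLE claim_scenarios (
        claim_id    INTEGER NOT NULL REFERENCES claims(id),
        scenario_id INTEGER NOT NULL REFERENCES scenarios(id),
        PRIMARY KEY (claim_id, scenario_id)
    );
""")
conn.executemany("INSERT INTO claims VALUES (?, ?)",
                 [(1, "Claim A"), (2, "Claim B")])
conn.execute("INSERT INTO scenarios VALUES (1, 'Shared scenario')")
# The same scenario is linked to both claims without being duplicated.
conn.executemany("INSERT INTO claim_scenarios VALUES (?, ?)",
                 [(1, 1), (2, 1)])


def claims_for_scenario(scenario_id):
    """Return the ids of all claims linked to a scenario, in ascending order."""
    rows = conn.execute(
        "SELECT claim_id FROM claim_scenarios "
        "WHERE scenario_id = ? ORDER BY claim_id",
        (scenario_id,),
    )
    return [claim_id for (claim_id,) in rows]
```

The composite primary key prevents the same link from being stored twice, and every scenario lookup now goes through one extra join, which is the query performance impact the checklist asks about.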
1.1	114	== 6. Add Full Versioning System ==
1.2	115
1.1	116	Current: Simple audit trail (before/after values, who/when/why)
	117	Add Full Versioning when:
1.2	118
1.1	119	* ✅ Users request "see complete version history"
	120	* ✅ Users request "restore to specific previous version"
	121	* ✅ Need for branching and merging emerges
	122	* ✅ Collaborative editing requires conflict resolution
	123	Metrics to monitor:
	124	* User feature requests for versioning
	125	* Manual rollback frequency
	126	* Edit conflict rate
	127	* Storage costs of full history
	128	Before adding:
	129	* Design branching/merging strategy
	130	* Plan storage optimization (delta compression)
	131	* Consider UI/UX for version history
	132	* Estimate storage and performance impact
1.2	133	Implementation effort:
	134
1.1	135	== 7. Add Graph Database ==
1.2	136
1.1	137	Current: Relational data model in PostgreSQL
	138	Add Graph Database when:
1.2	139
1.1	140	* ✅ Complex relationship queries become common
	141	* ✅ Need for multi-hop traversals (friend-of-friend, citation chains)
	142	* ✅ PostgreSQL recursive queries too slow
	143	* ✅ Graph algorithms needed (PageRank, community detection)
	144	Metrics to monitor:
	145	* Relationship query patterns
	146	* Recursive query performance
	147	* Use cases requiring graph traversals
	148	* Query complexity growth
	149	Before adding:
	150	* Try PostgreSQL recursive CTEs
	151	* Consider graph extensions for PostgreSQL
	152	* Profile slow relationship queries
	153	* Evaluate Neo4j vs alternatives
1.2	154	Implementation effort:
	155
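Before reaching for a graph database, a multi-hop traversal such as a citation chain can be prototyped with a recursive CTE. The sketch below runs against in-memory SQLite as a stand-in; `WITH RECURSIVE` is standard SQL that PostgreSQL also supports. The `citations` table, its columns, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
conn.execute("CREATE TABLE citations (source_id INTEGER, cited_id INTEGER)")
# Hypothetical data: 1 cites 2, 2 cites 3, 3 cites 4, and 5 cites 1.
conn.executemany("INSERT INTO citations VALUES (?, ?)",
                 [(1, 2), (2, 3), (3, 4), (5, 1)])

CITATION_CHAIN_SQL = """
WITH RECURSIVE chain(id, depth) AS (
    SELECT cited_id, 1 FROM citations WHERE source_id = :start
    UNION
    SELECT c.cited_id, chain.depth + 1
    FROM citations AS c
    JOIN chain ON c.source_id = chain.id
    WHERE chain.depth < :max_depth
)
SELECT id, MIN(depth) FROM chain GROUP BY id ORDER BY MIN(depth), id
"""


def citation_chain(start, max_depth=5):
    """Return (id, hops) for everything reachable from `start` via citations.

    The depth bound keeps the recursion finite even if the data has cycles.
    """
    params = {"start": start, "max_depth": max_depth}
    return conn.execute(CITATION_CHAIN_SQL, params).fetchall()
```

Timing a query of this shape on production-sized data is one way to gather evidence for the "PostgreSQL recursive queries too slow" trigger.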
1.1	156	== 8. Add Real-Time Collaboration ==
1.2	157
1.1	158	Current: Asynchronous edits with eventual consistency
	159	Add Real-Time Collaboration when:
1.2	160
1.1	161	* ✅ Users request simultaneous editing
	162	* ✅ Conflict resolution becomes a frequent issue
	163	* ✅ Need for live updates during editing sessions
	164	* ✅ Collaborative workflows are common
	165	Metrics to monitor:
	166	* Edit conflict frequency
	167	* User feature requests
	168	* Collaborative editing patterns
	169	* Average edit session duration
	170	Before adding:
	171	* Design conflict resolution strategy (Operational Transform or CRDT)
	172	* Consider WebSocket infrastructure
	173	* Plan UI/UX for real-time editing
	174	* Estimate server resource requirements
1.2	175	Implementation effort:
	176
1.1	177	== 9. Add Machine Learning Pipeline ==
1.2	178
1.1	179	Current: Rule-based quality scoring and LLM-based analysis
	180	Add ML Pipeline when:
1.2	181
1.1	182	* ✅ Need for custom models beyond LLM APIs
	183	* ✅ Opportunity for specialized fine-tuning
	184	* ✅ Cost savings from specialized models
	185	* ✅ Real-time learning from user feedback
	186	Metrics to monitor:
	187	* LLM API costs
	188	* Need for domain-specific models
	189	* Quality improvement opportunities
	190	* User feedback patterns
	191	Before adding:
	192	* Collect training data (user feedback, corrections)
	193	* Experiment with fine-tuning approaches
	194	* Estimate cost savings vs infrastructure costs
	195	* Consider model hosting options
1.2	196	Implementation effort:
	197
1.1	198	== 10. Add Blockchain/Web3 Integration ==
1.2	199
1.1	200	Current: Traditional database with audit logs
	201	Add Blockchain when:
1.2	202
1.1	203	* ✅ Need for immutable public audit trail
	204	* ✅ Decentralized verification demanded
	205	* ✅ Token economics would add value
	206	* ✅ Community governance requires voting
	207	* ✅ Cross-organization trust is critical
	208	Metrics to monitor:
	209	* User requests for blockchain features
	210	* Need for external verification
	211	* Governance participation rate
	212	* Trust/verification requirements
	213	Before adding:
	214	* Evaluate real vs perceived benefits
	215	* Consider costs (gas fees, infrastructure)
	216	* Design token economics carefully
	217	* Study successful Web3 content platforms
1.2	218	Implementation effort:
	219
1.1	220	== Decision Framework ==
1.2	221
1.1	222	For any complexity addition, ask:
1.2	223
1.1	224	==== Do we have data? ====
1.2	225
1.1	226	* Metrics showing current system inadequate?
	227	* User requests documenting need?
	228	* Performance problems proven?
1.2	229
1.1	230	==== Have we exhausted simpler options? ====
1.2	231
1.1	232	* Optimization of current system?
	233	* Configuration tuning?
	234	* Simple workarounds?
1.2	235
1.1	236	==== Do we understand the cost? ====
1.2	237
1.1	238	* Implementation time realistic?
	239	* Ongoing maintenance burden?
	240	* Infrastructure costs?
	241	* Technical debt implications?
1.2	242
1.1	243	==== Is the timing right? ====
1.2	244
1.1	245	* Core product stable?
	246	* Team capacity available?
	247	* User demand strong enough?
	248	If all four answers are YES: Proceed with complexity addition
	249	If any answer is NO: Defer and revisit later
1.2	250
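The four-question rule above reduces to a simple conjunction. As a minimal sketch, with hypothetical names (`DecisionChecklist`, `decide`) that are not part of any FactHarbor code:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionChecklist:
    """One yes/no answer per question in the decision framework."""
    has_data: bool
    simpler_options_exhausted: bool
    cost_understood: bool
    timing_right: bool


def decide(checklist):
    """Return "proceed" only if all four answers are yes, otherwise "defer"."""
    answers = (
        checklist.has_data,
        checklist.simpler_options_exhausted,
        checklist.cost_understood,
        checklist.timing_right,
    )
    return "proceed" if all(answers) else "defer"
```

Recording the four answers explicitly, rather than only the outcome, leaves a trail showing which question blocked a deferred feature when it is revisited later.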
1.1	251	== Monitoring Dashboard ==
1.2	252
1.1	253	Recommended metrics to track:
	254	Performance:
1.2	255
1.1	256	* P95/P99 response times for all major operations
	257	* Database query performance
	258	* AKEL processing time
	259	* Search performance
	260	Usage:
	261	* Active users (daily, weekly, monthly)
	262	* Claims processed per day
	263	* Search queries per day
	264	* Contribution rate
	265	Costs:
	266	* Infrastructure costs per user
	267	* LLM API costs per claim
	268	* Storage costs per GB
	269	* Total operational costs
	270	Quality:
	271	* Confidence score distribution
	272	* Evidence completeness
	273	* Source reliability trends
	274	* User satisfaction (surveys)
	275	Community:
	276	* Active contributors
	277	* Moderation workload
	278	* Feature requests by category
	279	* Abuse incident rate
1.2	280
1.1	281	== Quarterly Review Process ==
1.2	282
1.1	283	Every quarter, review:
1.2	284
1.1	285	1. Metrics dashboard: Are any triggers close to thresholds?
	286	2. User feedback: What features are most requested?
	287	3. Performance: What's slowing down?
	288	4. Costs: What's most expensive?
	289	5. Team capacity: Can we handle new complexity?
	290	Decision: Prioritize complexity additions based on:
1.2	291
1.1	292	* Urgency (current pain vs future optimization)
	293	* Impact (user benefit vs internal efficiency)
	294	* Effort (quick wins vs major projects)
	295	* Dependencies (prerequisites needed)
1.2	296
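Step 1 of the review asks whether any triggers are close to their thresholds. The sketch below flags metrics that have reached a given fraction of their threshold. The 80% margin is an assumption, since this page does not define "close", and the metric names in the usage note are hypothetical.

```python
def triggers_near_threshold(readings, margin=0.8):
    """Return the sorted names of metrics at or above `margin` of their threshold.

    `readings` maps a metric name to a (current_value, threshold) pair.
    Only "higher is worse" metrics are handled in this sketch.
    """
    if not 0 < margin <= 1:
        raise ValueError("margin must be in the range (0, 1]")
    return sorted(
        name
        for name, (current, threshold) in readings.items()
        if current >= margin * threshold
    )
```

Called with `{"search_p95_ms": (430, 500), "metrics_table_gb": (35, 100)}`, this would flag only `search_p95_ms`, because 430 is above 80% of 500 while 35 is well below 80% of 100.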
1.1	297	== Related Pages ==
1.2	298
1.1	299	* [[Design Decisions>>FactHarbor.Specification.Design-Decisions]]
1.2	300	* [[Architecture>>Archive.FactHarbor 2026\.02\.08.Specification.Architecture.WebHome]]
1.3	301	* [[Data Model>>Archive.FactHarbor 2026\.02\.08.Specification.Data Model.WebHome]]
1.1	302	## Remember
	303	Build what you need now. Measure everything. Add complexity only when data proves it's necessary.
1.2	304	The best architecture is the simplest one that works for current needs. 🎯##
