= System Performance Metrics =
**What we monitor to ensure AKEL performs well.**
== 1. Purpose ==
These metrics tell us:
* ✅ Is AKEL performing within acceptable ranges?
* ✅ Where should we focus improvement efforts?
* ✅ When do humans need to intervene?
* ✅ Are our changes improving things?
**Principle**: Measure to improve, not to judge.
== 2. Metric Categories ==
=== 2.1 AKEL Performance ===
**Processing speed and reliability**
=== 2.2 Content Quality ===
**Output quality and user satisfaction**
=== 2.3 System Health ===
**Infrastructure and operational metrics**
=== 2.4 User Experience ===
**How users interact with the system**
== 3. AKEL Performance Metrics ==
=== 3.1 Processing Time ===
**Metric**: Time from claim submission to verdict publication
**Measurements** (see the sketch at the end of this subsection):
* P50 (median): time within which 50% of claims are processed
* P95: time within which 95% of claims are processed
* P99: time within which 99% of claims are processed
**Targets**:
* P50: ≤ 12 seconds
* P95: ≤ 18 seconds
* P99: ≤ 25 seconds
**Alert thresholds**:
* P95 > 20 seconds: Monitor closely
* P95 > 25 seconds: Investigate immediately
* P95 > 30 seconds: Emergency - intervention required
**Why it matters**: Slow processing = poor user experience
**Improvement ideas**:
* Optimize evidence extraction
* Better caching
* Parallel processing
* Database query optimization
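A minimal sketch (illustrative only) of how P50/P95/P99 could be computed from recent processing durations and checked against the thresholds above. The nearest-rank percentile helper and the function names are our own assumptions, not an existing AKEL API.

{{code language="python"}}
# Minimal sketch: compute P50/P95/P99 of claim processing times (seconds)
# and map P95 against the alert thresholds listed above. Names are illustrative.
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile; samples must be non-empty."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def processing_time_report(durations_s: list[float]) -> dict:
    p50, p95, p99 = (percentile(durations_s, p) for p in (50, 95, 99))
    if p95 > 30:
        alert = "emergency"      # intervention required
    elif p95 > 25:
        alert = "investigate"    # investigate immediately
    elif p95 > 20:
        alert = "monitor"        # monitor closely
    else:
        alert = "ok"
    return {"p50": p50, "p95": p95, "p99": p99, "alert": alert}

# Example: durations from the last hour of processed claims
print(processing_time_report([8.2, 11.5, 9.9, 14.0, 17.3, 21.8, 12.1]))
{{/code}}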
=== 3.2 Success Rate ===
**Metric**: % of claims successfully processed without errors
**Target**: ≥ 99%
**Alert thresholds**:
* 98-99%: Monitor
* 95-98%: Investigate
* <95%: Emergency
**Common failure causes**:
* Timeout (evidence extraction took too long)
* Parse error (claim text unparsable)
* External API failure (source unavailable)
* Resource exhaustion (memory/CPU)
**Why it matters**: Errors frustrate users and reduce trust
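A minimal sketch of how the success rate and a failure-cause tally could be derived from processing outcomes. The record shape ({"status": ..., "failure_cause": ...}) is an assumption for illustration.

{{code language="python"}}
# Minimal sketch: success rate and failure-cause tally from processing outcomes.
from collections import Counter

def success_report(outcomes: list[dict]) -> dict:
    total = len(outcomes)
    failures = [o for o in outcomes if o["status"] != "ok"]
    causes = Counter(o.get("failure_cause", "unknown") for o in failures)
    rate = 100.0 * (total - len(failures)) / total if total else 100.0
    return {"success_rate_pct": round(rate, 2), "failure_causes": dict(causes)}

print(success_report([
    {"status": "ok"},
    {"status": "error", "failure_cause": "timeout"},
    {"status": "ok"},
]))
{{/code}}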
=== 3.3 Evidence Completeness ===
**Metric**: % of claims where AKEL found sufficient evidence
**Measurement**: Claims with ≥3 pieces of evidence from ≥2 distinct sources (see the sketch below)
**Target**: ≥ 80%
**Alert thresholds**:
* 75-80%: Monitor
* 70-75%: Investigate
* <70%: Intervention needed
**Why it matters**: Incomplete evidence = low confidence verdicts
**Improvement ideas**:
* Better search algorithms
* More source integrations
* Improved relevance scoring
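A minimal sketch of the completeness rule above (≥3 pieces of evidence from ≥2 distinct sources). The evidence record shape and field names are illustrative assumptions.

{{code language="python"}}
# Minimal sketch: share of claims meeting the completeness rule
# (>=3 pieces of evidence from >=2 distinct sources). Field names are illustrative.
def is_complete(evidence: list[dict]) -> bool:
    distinct_sources = {e["source_id"] for e in evidence}
    return len(evidence) >= 3 and len(distinct_sources) >= 2

def completeness_pct(claims: list[list[dict]]) -> float:
    if not claims:
        return 0.0
    complete = sum(1 for evidence in claims if is_complete(evidence))
    return 100.0 * complete / len(claims)

# Example: two claims, one complete and one not
claims = [
    [{"source_id": "a"}, {"source_id": "b"}, {"source_id": "a"}],  # complete
    [{"source_id": "a"}, {"source_id": "a"}],                      # incomplete
]
print(completeness_pct(claims))  # 50.0
{{/code}}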
=== 3.4 Source Diversity ===
**Metric**: Average number of distinct sources per claim
**Target**: ≥ 3.0 sources per claim
**Alert thresholds**:
* 2.5-3.0: Monitor
* 2.0-2.5: Investigate
* <2.0: Intervention needed
**Why it matters**: Multiple sources increase confidence and reduce bias
=== 3.5 Scenario Coverage ===
**Metric**: % of claims with at least one scenario extracted
**Target**: ≥ 75%
**Why it matters**: Scenarios provide context for verdicts
== 4. Content Quality Metrics ==
=== 4.1 Confidence Distribution ===
**Metric**: Distribution of confidence scores across claims
**Target**: Roughly bell-shaped distribution across confidence bands:
* ~10% very low confidence (0.0-0.3)
* ~20% low confidence (0.3-0.5)
* ~40% medium confidence (0.5-0.7)
* ~20% high confidence (0.7-0.9)
* ~10% very high confidence (0.9-1.0)
**Alert thresholds**:
* >30% very low confidence: Evidence extraction issues
* >30% very high confidence: Too aggressive/overconfident
* Heavily skewed distribution: Systematic bias
**Why it matters**: Confidence should reflect actual uncertainty
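A minimal sketch that buckets confidence scores into the five bands above and checks the two percentage alerts. Band edges follow the list above; the function itself is illustrative, not an existing AKEL component.

{{code language="python"}}
# Minimal sketch: bucket confidence scores into the five bands above and
# flag the two percentage-based alert conditions.
def confidence_distribution(scores: list[float]) -> dict:
    bands = {"very_low": 0, "low": 0, "medium": 0, "high": 0, "very_high": 0}
    for s in scores:
        if s < 0.3:
            bands["very_low"] += 1
        elif s < 0.5:
            bands["low"] += 1
        elif s < 0.7:
            bands["medium"] += 1
        elif s < 0.9:
            bands["high"] += 1
        else:
            bands["very_high"] += 1
    total = len(scores) or 1
    shares = {k: 100.0 * v / total for k, v in bands.items()}
    alerts = []
    if shares["very_low"] > 30:
        alerts.append("evidence extraction issues")
    if shares["very_high"] > 30:
        alerts.append("possibly overconfident")
    return {"shares_pct": shares, "alerts": alerts}

print(confidence_distribution([0.2, 0.55, 0.62, 0.71, 0.95, 0.45]))
{{/code}}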
=== 4.2 Contradiction Rate ===
**Metric**: % of claims with internal contradictions detected
**Target**: ≤ 5%
**Alert thresholds**:
* 5-10%: Monitor
* 10-15%: Investigate
* >15%: Intervention needed
**Why it matters**: High contradiction rate suggests poor evidence quality or logic errors
=== 4.3 User Feedback Ratio ===
**Metric**: Helpful vs unhelpful user ratings
**Target**: ≥ 70% helpful
**Alert thresholds**:
* 60-70%: Monitor
* 50-60%: Investigate
* <50%: Emergency
**Why it matters**: Direct measure of user satisfaction
=== 4.4 False Positive/Negative Rate ===
**Metric**: When humans review flagged items, how often was AKEL right?
**Measurement**:
* False positive: AKEL flagged an item for review, but it was actually fine
* False negative: AKEL missed something that should have been flagged
**Target**:
* False positive rate: ≤ 20%
* False negative rate: ≤ 5%
**Why it matters**: Balance between catching problems and not crying wolf
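A minimal sketch of how the two rates could be computed from human review outcomes, assuming each reviewed item records whether AKEL flagged it and whether the reviewer judged a flag necessary. Field names are illustrative.

{{code language="python"}}
# Minimal sketch: false positive/negative rates from human review outcomes.
# Denominators are reviewed items that were / were not flagged by AKEL.
def review_rates(reviews: list[dict]) -> dict:
    flagged = [r for r in reviews if r["akel_flagged"]]
    unflagged = [r for r in reviews if not r["akel_flagged"]]
    fp = sum(1 for r in flagged if not r["needs_flag"])
    fn = sum(1 for r in unflagged if r["needs_flag"])
    return {
        "false_positive_rate_pct": 100.0 * fp / len(flagged) if flagged else 0.0,
        "false_negative_rate_pct": 100.0 * fn / len(unflagged) if unflagged else 0.0,
    }

print(review_rates([
    {"akel_flagged": True, "needs_flag": True},
    {"akel_flagged": True, "needs_flag": False},   # false positive
    {"akel_flagged": False, "needs_flag": False},
    {"akel_flagged": False, "needs_flag": True},   # false negative
]))
{{/code}}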
== 5. System Health Metrics ==
=== 5.1 Uptime ===
**Metric**: % of time system is available and functional
**Target**: ≥ 99.9% (about 43 minutes of allowed downtime in a 30-day month)
**Alert**: Immediate notification on any downtime
**Why it matters**: Users expect 24/7 availability
=== 5.2 Error Rate ===
**Metric**: Errors per 1000 requests
**Target**: ≤ 1 error per 1000 requests (0.1%)
**Alert thresholds**:
* 1-5 per 1000: Monitor
* 5-10 per 1000: Investigate
* >10 per 1000: Emergency
**Why it matters**: Errors disrupt user experience
=== 5.3 Database Performance ===
**Metrics**:
* Query response time (P95)
* Connection pool utilization
* Slow query frequency
**Targets**:
* P95 query time: ≤ 50ms
* Connection pool: ≤ 80% utilized
* Slow queries (>1s): ≤ 10 per hour
**Why it matters**: Database bottlenecks slow the entire system
=== 5.4 Cache Hit Rate ===
**Metric**: % of requests served from cache vs. database
**Target**: ≥ 80%
**Why it matters**: Higher cache hit rate = faster responses, less DB load
=== 5.5 Resource Utilization ===
**Metrics** (see the sketch below):
* CPU utilization
* Memory utilization
* Disk I/O
* Network bandwidth
**Targets**:
* Average CPU: ≤ 60%
* Peak CPU: ≤ 85%
* Memory: ≤ 80%
* Disk I/O: ≤ 70%
**Alert**: Any metric consistently >85%
**Why it matters**: Headroom for traffic spikes, prevents resource exhaustion
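A minimal sketch of the "consistently >85%" alert, interpreting "consistently" (as an assumption) as a run of consecutive samples above the threshold.

{{code language="python"}}
# Minimal sketch: flag a resource metric that stays above 85%,
# interpreted here as N consecutive samples over the threshold.
def consistently_high(samples_pct: list[float], threshold: float = 85.0, run: int = 5) -> bool:
    streak = 0
    for value in samples_pct:
        streak = streak + 1 if value > threshold else 0
        if streak >= run:
            return True
    return False

cpu = [72, 88, 91, 87, 90, 89, 86]   # percent utilization samples
print(consistently_high(cpu))        # True: six consecutive samples above 85%
{{/code}}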
== 6. User Experience Metrics ==
=== 6.1 Time to First Verdict ===
**Metric**: Time from claim submission to the user seeing an initial verdict
**Target**: ≤ 15 seconds
**Why it matters**: User perception of speed
=== 6.2 Claim Submission Rate ===
**Metric**: Claims submitted per day/hour
**Monitoring**: Track trends, detect anomalies
**Why it matters**: Understand usage patterns, capacity planning
=== 6.3 User Retention ===
**Metric**: % of users who return after first visit
**Target**: ≥ 30% (1-week retention)
**Why it matters**: Indicates system usefulness
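A minimal sketch of 1-week retention, read here as the share of new users who return within 7 days of their first visit. The data shapes and names are illustrative assumptions.

{{code language="python"}}
# Minimal sketch: 1-week retention = share of new users who come back
# within 7 days of their first visit.
from datetime import date

def one_week_retention(first_visits: dict[str, date], visits: list[tuple[str, date]]) -> float:
    returned = set()
    for user, day in visits:
        first = first_visits.get(user)
        if first and 0 < (day - first).days <= 7:
            returned.add(user)
    return 100.0 * len(returned) / len(first_visits) if first_visits else 0.0

first = {"u1": date(2025, 6, 1), "u2": date(2025, 6, 1)}
later = [("u1", date(2025, 6, 4)), ("u2", date(2025, 6, 20))]
print(one_week_retention(first, later))  # 50.0
{{/code}}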
=== 6.4 Feature Usage ===
**Metrics**:
* % of users who explore evidence
* % who check scenarios
* % who view source track records
**Why it matters**: Understand how users interact with the system
== 7. Metric Dashboard ==
=== 7.1 Real-Time Dashboard ===
**Always visible**:
* Current processing time (P95)
* Success rate (last hour)
* Error rate (last hour)
* System health status
**Update frequency**: Every 30 seconds
=== 7.2 Daily Dashboard ===
**Reviewed daily**:
* All AKEL performance metrics
* Content quality metrics
* System health trends
* User feedback summary
=== 7.3 Weekly Reports ===
**Reviewed weekly**:
* Trends over time
* Week-over-week comparisons
* Improvement priorities
* Outstanding issues
=== 7.4 Monthly/Quarterly Reports ===
**Comprehensive analysis**:
* Long-term trends
* Seasonal patterns
* Strategic metrics
* Goal progress
== 8. Alert System ==
=== 8.1 Alert Levels ===
**Info**: Metric outside target, but within acceptable range
* Action: Note in daily review
* Example: P95 processing time 19s (target 18s, acceptable <20s)
**Warning**: Metric outside acceptable range
* Action: Investigate within 24 hours
* Example: Success rate 97% (acceptable >98%)
**Critical**: Metric severely degraded
* Action: Investigate immediately
* Example: Error rate 0.8% (acceptable <0.5%)
**Emergency**: System failure or severe degradation
* Action: Page on-call, all hands
* Example: Uptime <95%, P95 >30s
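A minimal sketch of mapping a metric reading onto these levels for a "higher is worse" metric. The threshold dictionary is an illustrative assumption, shown here with the P95 processing-time bands from section 3.1.

{{code language="python"}}
# Minimal sketch: classify a metric reading into the alert levels above.
def alert_level(value: float, bands: dict[str, float]) -> str:
    """bands maps level name -> lower bound that triggers it (higher is worse)."""
    for level in ("emergency", "critical", "warning", "info"):
        if level in bands and value > bands[level]:
            return level
    return "ok"

# P95 processing-time bands from section 3.1 (seconds)
p95_bands = {"info": 18, "warning": 20, "critical": 25, "emergency": 30}
print(alert_level(19.0, p95_bands))   # info
print(alert_level(27.5, p95_bands))   # critical
{{/code}}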
=== 8.2 Alert Channels ===
**Slack/Discord**: All alerts
**Email**: Warning and above
**SMS**: Critical and emergency only
**PagerDuty**: Emergency only
=== 8.3 On-Call Rotation ===
**Technical Coordinator**: Primary on-call
**Backup**: Designated team member
**Responsibilities**:
* Respond to alerts within SLA
* Investigate and diagnose issues
* Implement fixes or escalate
* Document incidents
== 9. Metric-Driven Improvement ==
=== 9.1 Prioritization ===
**Focus improvements on**:
* Metrics furthest from target
* Metrics with biggest user impact
* Metrics easiest to improve
* Strategic priorities
=== 9.2 Success Criteria ===
**Every improvement project should**:
* Target specific metrics
* Set concrete improvement goals
* Measure before and after
* Document learnings
**Example**: "Reduce P95 processing time from 20s to 16s by optimizing evidence extraction"
=== 9.3 A/B Testing ===
**When feasible**:
* Run two versions
* Measure metric differences
* Choose based on data
* Roll out winner
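When the metric is a rate (e.g. % helpful ratings), a two-proportion z-test is one common way to check whether the difference between variants is more than noise. This is a generic statistical sketch, not a prescribed FactHarbor procedure; the counts are made up.

{{code language="python"}}
# Minimal sketch: compare a rate metric between variants A and B
# with a two-proportion z-test (|z| > 1.96 ~ significant at the 5% level).
import math

def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

z = two_proportion_z(success_a=700, n_a=1000, success_b=745, n_b=1000)
print(round(z, 2), "significant" if abs(z) > 1.96 else "not significant")
{{/code}}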
== 10. Bias and Fairness Metrics ==
=== 10.1 Domain Balance ===
**Metric**: Confidence distribution by domain
**Target**: Similar distributions across domains
**Alert**: One domain consistently much lower/higher confidence
**Why it matters**: Ensure no systematic domain bias
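A minimal sketch of one possible check: flag domains whose mean confidence deviates from the overall mean by more than a margin. The 0.1 margin and the data shape are illustrative assumptions.

{{code language="python"}}
# Minimal sketch: flag domains whose mean confidence deviates from the
# overall mean by more than a chosen margin.
from collections import defaultdict

def domain_balance(records: list[tuple[str, float]], margin: float = 0.1) -> list[str]:
    if not records:
        return []
    by_domain = defaultdict(list)
    for domain, confidence in records:
        by_domain[domain].append(confidence)
    overall = sum(c for _, c in records) / len(records)
    return [
        d for d, scores in by_domain.items()
        if abs(sum(scores) / len(scores) - overall) > margin
    ]

data = [("health", 0.72), ("health", 0.68), ("politics", 0.41), ("politics", 0.45)]
print(domain_balance(data))  # ['health', 'politics'] - both far from the 0.565 overall mean
{{/code}}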
=== 10.2 Source Type Balance ===
**Metric**: Evidence distribution by source type
**Target**: Diverse source types represented
**Alert**: Over-reliance on one source type
**Why it matters**: Prevent source type bias
=== 10.3 Geographic Balance ===
**Metric**: Source geographic distribution
**Target**: Multiple regions represented
**Alert**: Over-concentration in one region
**Why it matters**: Reduce geographic/cultural bias
== 11. Experimental Metrics ==
**New metrics to test**:
* User engagement time
* Evidence exploration depth
* Cross-reference usage
* Mobile vs desktop usage
**Process**:
1. Define metric hypothesis
2. Implement tracking
3. Collect data for 1 month
4. Evaluate usefulness
5. Add to standard set or discard
== 12. Anti-Patterns ==
**Don't**:
* ❌ Measure too many things (focus on what matters)
* ❌ Set unrealistic targets (demotivating)
* ❌ Ignore metrics when inconvenient
* ❌ Game metrics (destroys their value)
* ❌ Blame individuals for metric failures
* ❌ Let metrics become the goal (they're tools)
**Do**:
* ✅ Focus on actionable metrics
* ✅ Set ambitious but achievable targets
* ✅ Respond to metric signals
* ✅ Continuously validate metrics still matter
* ✅ Use metrics for system improvement, not people evaluation
* ✅ Remember: metrics serve users, not the other way around
== 13. Related Pages ==
* [[Automation Philosophy>>FactHarbor.Organisation.Automation-Philosophy]] - Why we monitor systems, not outputs
* [[Continuous Improvement>>FactHarbor.Organisation.How-We-Work-Together.Continuous-Improvement]] - How we use metrics to improve
* [[Governance>>FactHarbor.Organisation.Governance.WebHome]] - Quarterly performance reviews
----
**Remember**: We measure the SYSTEM, not individual outputs. Metrics drive IMPROVEMENT, not judgment.