Changes for page Data Model

Last modified by Robert Schaub on 2026/02/08 08:27

From version 4.8
edited by Robert Schaub
on 2025/12/24 20:35
Change comment: Renamed back-links.
To version 5.2
edited by Robert Schaub
on 2026/01/20 20:25
Change comment: Renamed back-links.

Page properties
Content
... ... @@ -79,48 +79,48 @@
79 79  
80 80  Runs independently of claim analysis:
81 81  {{code language="python"}}def update_source_scores_weekly():
82 -     """
83 -     Background job: Calculate source reliability
84 -     Never triggered by individual claim analysis
85 -     """
86 -     # Analyze all claims from past week
87 -     claims = get_claims_from_past_week()
88 -     for source in get_all_sources():
89 -         # Calculate accuracy metrics
90 -         correct_verdicts = count_correct_verdicts_citing(source, claims)
91 -         total_citations = count_total_citations(source, claims)
92 -         accuracy = correct_verdicts / total_citations if total_citations > 0 else 0.5
93 -         # Weight by claim importance
94 -         weighted_score = calculate_weighted_score(source, claims)
95 -         # Update source record
96 -         source.track_record_score = weighted_score
97 -         source.total_citations = total_citations
98 -         source.last_updated = now()
99 -         source.save()
100 - # Job runs: Sunday 2 AM UTC
101 - # Never during claim processing{{/code}}
82 +     """
83 +     Background job: Calculate source reliability
84 +     Never triggered by individual claim analysis
85 +     """
86 +     # Analyze all claims from past week
87 +     claims = get_claims_from_past_week()
88 +     for source in get_all_sources():
89 +         # Calculate accuracy metrics
90 +         correct_verdicts = count_correct_verdicts_citing(source, claims)
91 +         total_citations = count_total_citations(source, claims)
92 +         accuracy = correct_verdicts / total_citations if total_citations > 0 else 0.5
93 +         # Weight by claim importance
94 +         weighted_score = calculate_weighted_score(source, claims)
95 +         # Update source record
96 +         source.track_record_score = weighted_score
97 +         source.total_citations = total_citations
98 +         source.last_updated = now()
99 +         source.save()
100 + # Job runs: Sunday 2 AM UTC
101 + # Never during claim processing{{/code}}
102 102  
103 103  ==== Real-Time Claim Analysis (AKEL) ====
104 104  
105 105  Uses source scores but never updates them:
106 106  {{code language="python"}}def analyze_claim(claim_text):
107 -     """
108 -     Real-time: Analyze claim using current source scores
109 -     READ source scores, never UPDATE them
110 -     """
111 -     # Gather evidence
112 -     evidence_list = gather_evidence(claim_text)
113 -     for evidence in evidence_list:
114 -         # READ source score (snapshot from last weekly update)
115 -         source = get_source(evidence.source_id)
116 -         source_score = source.track_record_score
117 -         # Use score to weight evidence
118 -         evidence.weighted_relevance = evidence.relevance * source_score
119 -     # Generate verdict using weighted evidence
120 -     verdict = synthesize_verdict(evidence_list)
121 -     # NEVER update source scores here
122 -     # That happens in weekly background job
123 -     return verdict{{/code}}
107 +     """
108 +     Real-time: Analyze claim using current source scores
109 +     READ source scores, never UPDATE them
110 +     """
111 +     # Gather evidence
112 +     evidence_list = gather_evidence(claim_text)
113 +     for evidence in evidence_list:
114 +         # READ source score (snapshot from last weekly update)
115 +         source = get_source(evidence.source_id)
116 +         source_score = source.track_record_score
117 +         # Use score to weight evidence
118 +         evidence.weighted_relevance = evidence.relevance * source_score
119 +     # Generate verdict using weighted evidence
120 +     verdict = synthesize_verdict(evidence_list)
121 +     # NEVER update source scores here
122 +     # That happens in weekly background job
123 +     return verdict{{/code}}
124 124  
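The read/update separation in the two functions above can be exercised with in-memory stand-ins. This is a minimal sketch, not the page's implementation: `Source`, `Evidence`, `accuracy_or_neutral`, and `weight_evidence` are hypothetical names, and the 0.5 fallback simply mirrors the `else 0.5` branch of the weekly job.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical stand-in for a stored source record."""
    name: str
    track_record_score: float


@dataclass
class Evidence:
    """Hypothetical stand-in for one piece of gathered evidence."""
    source: Source
    relevance: float


def accuracy_or_neutral(correct_verdicts: int, total_citations: int) -> float:
    """Weekly-job side: accuracy, or a neutral 0.5 when nothing was cited."""
    return correct_verdicts / total_citations if total_citations > 0 else 0.5


def weight_evidence(evidence_list: list[Evidence]) -> list[float]:
    """Real-time side: read scores to weight evidence, never write them."""
    return [e.relevance * e.source.track_record_score for e in evidence_list]


nyt = Source("NYT", 0.87)
blog = Source("Blog X", 0.52)
weights = weight_evidence([Evidence(nyt, 1.0), Evidence(blog, 0.5)])
```

Because `weight_evidence` only reads `track_record_score`, every claim analyzed between two weekly runs sees the same snapshot of scores.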
125 125  ==== Monthly Audit (Quality Assurance) ====
126 126  
... ... @@ -156,14 +156,14 @@
156 156  **Example Timeline**:
157 157  ```
158 158  Sunday 2 AM: Calculate source scores for past week
159 - → NYT score: 0.87 (up from 0.85)
160 - → Blog X score: 0.52 (down from 0.61)
159 + → NYT score: 0.87 (up from 0.85)
160 + → Blog X score: 0.52 (down from 0.61)
161 161  Monday-Saturday: Claims processed using these scores
162 - → All claims this week use NYT=0.87
163 - → All claims this week use Blog X=0.52
162 + → All claims this week use NYT=0.87
163 + → All claims this week use Blog X=0.52
164 164  Next Sunday 2 AM: Recalculate scores including this week's claims
165 - → NYT score: 0.89 (trending up)
166 - → Blog X score: 0.48 (trending down)
165 + → NYT score: 0.89 (trending up)
166 + → Blog X score: 0.48 (trending down)
167 167  ```
168 168  
169 169  === 1.4 Scenario ===
... ... @@ -288,7 +288,7 @@
288 288  * Threshold-based promotions
289 289  * Reputation decay for inactive users
290 290  * Track record scoring for contributors
291 -See [[When to Add Complexity>>Test.FactHarbor.Specification.When-to-Add-Complexity]] for triggers.
291 +See [[When to Add Complexity>>FactHarbor.Specification.When-to-Add-Complexity]] for triggers.
292 292  
293 293  === 1.7 Edit ===
294 294  
... ... @@ -361,11 +361,11 @@
361 361  
362 362  == 1.4 Core Data Model ERD ==
363 363  
364 -{{include reference="Test.FactHarbor pre V0\.9\.70.Specification.Diagrams.Core Data Model ERD.WebHome"/}}
364 +{{include reference="FactHarbor.Specification.Diagrams.Core Data Model ERD.WebHome"/}}
365 365  
366 366  == 1.5 User Class Diagram ==
367 367  
368 -{{include reference="Test.FactHarbor pre V0\.9\.70.Specification.Diagrams.User Class Diagram.WebHome"/}}
368 +{{include reference="FactHarbor.Specification.Diagrams.User Class Diagram.WebHome"/}}
369 369  
370 370  == 2. Versioning Strategy ==
371 371  
... ... @@ -387,9 +387,9 @@
387 387  **Example**:
388 388  ```
389 389  Claim V1: "The sky is blue"
390 - → User edits →
390 + → User edits →
391 391  Claim V2: "The sky is blue during daytime"
392 - → EDIT table stores: {before: "The sky is blue", after: "The sky is blue during daytime"}
392 + → EDIT table stores: {before: "The sky is blue", after: "The sky is blue during daytime"}
393 393  ```
394 394  
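The before/after record in the example above could be modeled roughly as follows. `EditRecord` and `apply_edit` are hypothetical names chosen for illustration; the actual EDIT table schema is defined elsewhere in the specification and may carry more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditRecord:
    """Hypothetical row of the EDIT table: text before and after one edit."""
    before: str
    after: str


def apply_edit(current_text: str, new_text: str, history: list[EditRecord]) -> str:
    """Append a before/after record to the history and return the new text."""
    history.append(EditRecord(before=current_text, after=new_text))
    return new_text


history: list[EditRecord] = []
claim_v1 = "The sky is blue"
claim_v2 = apply_edit(claim_v1, "The sky is blue during daytime", history)
```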
395 395  == 2.5. Storage vs Computation Strategy ==
... ... @@ -506,8 +506,8 @@
506 506  * **Compute cost**: $0.005-0.01 per request (LLM API call)
507 507  * **Frequency**: Viewed in detail by 20% of users
508 508  * **Trade-off analysis**:
509 - - IF STORED: 1M claims × 3 KB = 3 GB storage, $0.05/month, fast access
510 - - IF COMPUTED: 1M claims × 20% views × $0.01 = $2,000/month in LLM costs
509 + - IF STORED: 1M claims × 3 KB = 3 GB storage, $0.05/month, fast access
510 + - IF COMPUTED: 1M claims × 20% views × $0.01 = $2,000/month in LLM costs
511 511  * **Reproducibility**: Scenarios may improve as AI improves (good to recompute)
512 512  * **Speed**: Computed = 5-8 seconds delay, Stored = instant
513 513  * **Decision**: ✅ STORE (hybrid approach below)
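The trade-off figures above follow from simple arithmetic, reproduced here as a sketch. The function names are hypothetical, decimal units are assumed (1 GB = 1,000,000 KB), and the inputs are the ones quoted in the bullets.

```python
def stored_size_gb(claims: int, kb_per_claim: float) -> float:
    """Total storage in GB, assuming decimal units (1 GB = 1,000,000 KB)."""
    return claims * kb_per_claim / 1_000_000


def recompute_cost_usd(claims: int, view_rate: float, cost_per_call: float) -> float:
    """LLM cost if the data is regenerated for every detailed view."""
    return claims * view_rate * cost_per_call


# Inputs taken from the bullets: 1M claims, 3 KB each, 20% viewed, $0.01 per call.
size_gb = stored_size_gb(1_000_000, 3)
llm_cost = recompute_cost_usd(1_000_000, 0.20, 0.01)
```

With these inputs the stored variant needs about 3 GB, while recomputation costs about $2,000, which is the gap behind the STORE decision.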
... ... @@ -551,8 +551,8 @@
551 551  * **Current design**: Stored in User table
552 552  * **Alternative**: Compute from Edit table
553 553  * **Trade-off**:
554 - - Stored: Fast, simple
555 - - Computed: Always accurate, no denormalization
554 + - Stored: Fast, simple
555 + - Computed: Always accurate, no denormalization
556 556  * **Frequency**: Read on every user action
557 557  * **Compute cost**: Simple COUNT query, milliseconds
558 558  * **Decision**: ✅ STORE - Performance critical, read-heavy
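The stored and computed alternatives above can be compared with a small SQLite sketch. The schema is an assumption made for illustration: the table names, the `contribution_count` column, and the idea that the value is a plain count of edits are all hypothetical and not taken from the FactHarbor data model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user (
        id INTEGER PRIMARY KEY,
        contribution_count INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE edit (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES user(id)
    );
    """
)
conn.execute("INSERT INTO user (id) VALUES (1)")


def record_edit(user_id: int) -> None:
    """Insert an edit and bump the stored counter in one transaction."""
    with conn:
        conn.execute("INSERT INTO edit (user_id) VALUES (?)", (user_id,))
        conn.execute(
            "UPDATE user SET contribution_count = contribution_count + 1 WHERE id = ?",
            (user_id,),
        )


def stored_count(user_id: int) -> int:
    """Stored variant: read the denormalized value from the user row."""
    row = conn.execute(
        "SELECT contribution_count FROM user WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]


def computed_count(user_id: int) -> int:
    """Computed variant: derive the value with a COUNT over the edit table."""
    row = conn.execute(
        "SELECT COUNT(*) FROM edit WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]


for _ in range(3):
    record_edit(1)
```

Updating the counter in the same transaction as the insert keeps the two values in agreement, which is the main risk the denormalized design has to manage.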
... ... @@ -560,7 +560,7 @@
560 560  === Summary Table ===
561 561  
562 562  | Data Type | Storage | Compute | Size per Claim | Decision | Rationale |\\
563 -|-|-|-|||-|\\
563 +|-----|-|-||----|-----|\\
564 564  | Claim core | ✅ | - | 1 KB | STORE | Essential |\\
565 565  | Evidence | ✅ | - | 2 KB × 5 = 10 KB | STORE | Reproducibility |\\
566 566  | Sources | ✅ | - | 500 B (shared) | STORE | Track record |\\
... ... @@ -585,7 +585,7 @@
585 585  * **Total**: $75/month infrastructure
586 586  **LLM cost savings by caching**:
587 587  * Analysis summary stored: Save $0.03 per claim = $30K per 1M claims
588 -* Scenarios stored: Save $0.01 per claim × 20% views = $2K per 1M claims
588 +* Scenarios stored: Save $0.01 per claim × 20% views = $2K per 1M claims
589 589  * Verdict stored: Save $0.003 per claim = $3K per 1M claims
590 590  * **Total savings**: $35K per 1M claims vs recomputing every time
591 591  
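The savings total above can be checked with a few lines of arithmetic. This is a sketch using only the figures quoted in the list; the variable names are hypothetical.

```python
CLAIMS = 1_000_000

# Each entry: (saving per claim in USD, fraction of claims where it applies)
savings_per_claim = {
    "analysis_summary": (0.03, 1.0),
    "scenarios": (0.01, 0.20),
    "verdict": (0.003, 1.0),
}

savings_usd = {
    name: CLAIMS * per_claim * fraction
    for name, (per_claim, fraction) in savings_per_claim.items()
}
total_savings_usd = sum(savings_usd.values())
```

The three items come to roughly $30K, $2K, and $3K, which sum to the $35K per 1M claims stated above.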
... ... @@ -610,11 +610,11 @@
610 610  
611 611  * Storage: 180 MB
612 612  * Cost: $10/month
613 -**Year 3**: 100K claims
613 +**Year 3**: 100K claims
614 614  * Storage: 1.8 GB
615 615  * Cost: $30/month
616 616  **Year 5**: 1M claims
617 -* Storage: 18 GB
617 +* Storage: 18 GB
618 618  * Cost: $75/month
619 619  **Year 10**: 10M claims
620 620  * Storage: 180 GB
... ... @@ -700,6 +700,6 @@
700 700  
701 701  == 4. Related Pages ==
702 702  
703 -* [[Architecture>>Archive.FactHarbor delta for V0\.9\.70.Specification.Architecture.WebHome]]
704 -* [[Requirements>>Archive.FactHarbor delta for V0\.9\.70.Specification.Requirements.WebHome]]
705 -* [[Workflows>>Test.FactHarbor pre V0\.9\.70.Specification.Workflows.WebHome]]
703 +* [[Architecture>>Archive.FactHarbor.Specification.Architecture.WebHome]]
704 +* [[Requirements>>FactHarbor.Specification.Requirements.WebHome]]
705 +* [[Workflows>>FactHarbor.Specification.Workflows.WebHome]]