Changes for page Automation
Last modified by Robert Schaub on 2025/12/24 21:46
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -37,6 +37,78 @@
 * Detected manipulation attempt
 * Unusual pattern
 * Moderator reviews and may take action
+
+== 2.5 LLM-Based Processing Architecture ==
+
+FactHarbor delegates complex reasoning and analysis tasks to Large Language Models (LLMs). The architecture evolves from POC to production:
+
+=== POC: Two-Phase Approach ===
+
+**Phase 1: Claim Extraction**
+* Single LLM call to extract all claims from submitted content
+* Light structure, focused on identifying distinct verifiable claims
+* Output: List of claims with context
+
+**Phase 2: Claim Analysis (Parallel)**
+* Single LLM call per claim (parallelizable)
+* Full structured output: Evidence, Scenarios, Sources, Verdict, Risk
+* Each claim is analyzed independently
+
+**Advantages:**
+* Fast to implement (2-4 weeks to working POC)
+* Few API calls: 1 + N in total (one extraction call plus one call per claim)
+* Simple to debug (claim-level isolation)
+* Proves concept viability
+
+=== Production: Three-Phase Approach ===
+
+**Phase 1: Claim Extraction + Validation**
+* Extract distinct verifiable claims
+* Validate claim clarity and uniqueness
+* Remove duplicates and vague claims
+
+**Phase 2: Evidence Gathering (Parallel)**
+* For each claim independently:
+  * Find supporting and contradicting evidence
+  * Identify authoritative sources
+  * Generate test scenarios
+* Validation: Check evidence quality and source validity
+* Error containment: Issues in one claim don't affect others
+
+**Phase 3: Verdict Generation (Parallel)**
+* For each claim:
+  * Generate verdict based on validated evidence
+  * Assess confidence and risk level
+  * Flag low-confidence results for human review
+* Validation: Check verdict consistency with evidence
+
+**Advantages:**
+* Error containment between phases
+* Clear quality gates and validation
+* Observable metrics per phase
+* Scalable (parallel processing across claims)
+* Adaptable (can optimize each phase independently)
+
+=== LLM Task Delegation ===
+
+All complex cognitive tasks are delegated to LLMs:
+* **Claim Extraction**: Understanding context, identifying distinct claims
+* **Evidence Finding**: Analyzing sources, assessing relevance
+* **Scenario Generation**: Creating testable hypotheses
+* **Source Evaluation**: Assessing reliability and authority
+* **Verdict Generation**: Synthesizing evidence into conclusions
+* **Risk Assessment**: Evaluating potential impact
+
+=== Error Mitigation ===
+
+Research shows that sequential LLM calls face compounding error risk, because each call builds on the output of the previous one. FactHarbor mitigates this through:
+* **Validation gates** between phases
+* **Confidence thresholds** for quality control
+* **Parallel processing** to avoid error propagation across claims
+* **Human review queue** for low-confidence verdicts
+* **Independent claim processing**: errors in one claim do not cascade to others
+
+
 == 3. Risk Tiers ==
 Risk tiers classify claims by potential impact and guide audit sampling rates.
 === 3.1 Tier A (High Risk) ===
@@ -79,6 +79,11 @@
 **Release 1.0** (Initial): Tier B/C auto-published, Tier A flagged for review
 **Release 2.0** (Mature): All tiers auto-published with risk labels, sampling audits
 See [[Automation Roadmap>>FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome]] for detailed progression.
+
+== 5.5 Automation Roadmap ==
+
+{{include reference="FactHarbor.Specification.Diagrams.Automation Roadmap.WebHome"/}}
+
 == 6. Human Role ==
 Humans do NOT review content for approval. Instead:
 **Monitoring**: Watch aggregate performance metrics
@@ -86,6 +86,11 @@
 **Exception handling**: Review AKEL-flagged items
 **Governance**: Set policies AKEL applies
 See [[Contributor Processes>>FactHarbor.Organisation.Contributor-Processes]] for how to improve the system.
+
+== 6.5 Manual vs Automated Matrix ==
+
+{{include reference="FactHarbor.Specification.Diagrams.Manual vs Automated matrix.WebHome"/}}
+
 == 7. Moderation ==
 Moderators handle items AKEL flags:
 **Abuse detection**: Spam, manipulation, harassment
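
Illustrative note on the section 2.5 addition in this diff: the POC two-phase flow (one extraction call, then one analysis call per claim, run in parallel) might be sketched roughly as below. This is a minimal sketch under stated assumptions, not FactHarbor's implementation. The names `run_pipeline`, `Analysis`, `fake_extract`, `fake_analyze` and the `REVIEW_THRESHOLD` value are all hypothetical; the page specifies neither an API nor a threshold. The LLM calls are replaced by local stubs so the example runs without any external service.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical value: the page mentions confidence thresholds but gives no number.
REVIEW_THRESHOLD = 0.7


@dataclass
class Analysis:
    claim: str
    verdict: str
    confidence: float
    needs_review: bool = False


def run_pipeline(
    content: str,
    extract: Callable[[str], List[str]],
    analyze: Callable[[str], Analysis],
    max_workers: int = 4,
) -> List[Analysis]:
    """Phase 1: a single extraction call. Phase 2: one analysis call per claim."""
    claims = extract(content)  # 1 call
    results: List[Analysis] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(analyze, claim) for claim in claims]  # N calls
        for claim, future in zip(claims, futures):
            try:
                result = future.result()
            except Exception:
                # Error containment: a failing claim does not abort the others;
                # it gets zero confidence and therefore lands in the review queue.
                result = Analysis(claim, "unknown", 0.0)
            result.needs_review = result.confidence < REVIEW_THRESHOLD
            results.append(result)
    return results


# Stub stand-ins for real LLM calls (assumptions for demonstration only).
def fake_extract(content: str) -> List[str]:
    return [part.strip() for part in content.split(".") if part.strip()]


def fake_analyze(claim: str) -> Analysis:
    if "fail" in claim:
        raise RuntimeError("simulated LLM error")
    confidence = 0.9 if "water" in claim.lower() else 0.5
    return Analysis(claim, "supported", confidence)


results = run_pipeline(
    "Water boils at 100 C. The moon is cheese. This will fail",
    fake_extract,
    fake_analyze,
)
for r in results:
    print(r.claim, r.verdict, r.confidence, r.needs_review)
```

Passing `extract` and `analyze` as callables keeps the sketch independent of any particular LLM client. Results come back in claim order because futures are collected in submission order.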