Workflows

Last modified by Robert Schaub on 2026/02/08 08:32


FactHarbor workflows are simple, automated, and focused on continuous improvement.

1. Core Principles

  • Automated by default: AI processes everything
  • Publish immediately: No centralized approval (removed in V0.9.50)
  • Quality through monitoring: Not gatekeeping
  • Fix systems, not data: Errors trigger improvements
  • Human-in-loop: Only for edge cases and abuse

2. Claim Submission Workflow

2.1 Claim Extraction

When users submit content (text, articles, web pages), FactHarbor first extracts individual verifiable claims:

Input Types:

  • Single claim: "The Earth is flat"
  • Text with multiple claims: "Climate change is accelerating. Sea levels rose 3mm in 2023. Arctic ice decreased 13% annually."
  • URLs: Web pages analyzed for factual claims

Extraction Process:

  • LLM analyzes submitted content
  • Identifies distinct, verifiable factual claims
  • Separates claims from opinions, questions, or commentary
  • Each claim becomes independent for processing

Output:

  • List of claims with context
  • Each claim assigned unique ID
  • Original context preserved for reference

This extraction ensures:

  • Each claim receives focused analysis
  • Multiple claims in one submission are all processed
  • Claims are properly isolated for independent verification
  • Context is preserved for accurate interpretation
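As a sketch, the extraction output described above could be modeled as follows. The field and type names are illustrative, not the actual FactHarbor schema; only the properties (unique ID per claim, preserved context, independent claims) come from this section:

```typescript
// Hypothetical shape of the claim-extraction output: each claim is
// independent, carries a unique ID, and keeps its original context.
interface ExtractedClaim {
  id: string;      // unique ID assigned per claim
  text: string;    // the isolated, verifiable claim
  context: string; // surrounding text preserved for interpretation
}

interface ExtractionResult {
  sourceType: "claim" | "text" | "url";
  claims: ExtractedClaim[];
}

// Example: a two-claim submission yields two independent claims
// that share the same preserved context.
const result: ExtractionResult = {
  sourceType: "text",
  claims: [
    { id: "c-001", text: "Sea levels rose 3mm in 2023", context: "Climate change is accelerating." },
    { id: "c-002", text: "Arctic ice decreased 13% annually", context: "Climate change is accelerating." },
  ],
};
```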

```
User submits → Duplicate detection → Categorization → Processing queue → User receives ID
```
Timeline: Seconds
No approval needed

2.5 Claim and Scenario Workflow

Information

Current Implementation (v2.6.33): Scenarios have been replaced by KeyFactors, optional decomposition questions discovered during the understanding phase.

Claim Analysis Workflow


```
graph TB
    Start[User Submission]

    subgraph Step1[Step 1 Understand]
        Extract{understandClaim LLM Analysis}
        Gate1{Gate 1 Claim Validation}
        DetectType[Detect Input Type]
        DetectScopes[Detect Scopes]
        KeyFactors[Discover KeyFactors]
    end

    subgraph Step2[Step 2 Research]
        Decide[decideNextResearch]
        Search[Web Search]
        Fetch[Fetch Sources]
        Facts[extractFacts]
    end

    subgraph Step3[Step 3 Verdict]
        Verdict[generateVerdicts]
        Gate4{Gate 4 Confidence Check}
    end

    subgraph Output[Output]
        Publish[Publish Result]
        LowConf[Low Confidence Flag]
    end

    Start --> Extract
    Extract --> Gate1
    Gate1 -->|Pass Factual| DetectType
    Gate1 -->|Fail Opinion| Exclude[Exclude from analysis]
    DetectType --> DetectScopes
    DetectScopes --> KeyFactors
    KeyFactors --> Decide
    Decide --> Search
    Search --> Fetch
    Fetch --> Facts
    Facts -->|More research needed| Decide
    Facts -->|Sufficient evidence| Verdict
    Verdict --> Gate4
    Gate4 -->|High or Medium confidence| Publish
    Gate4 -->|Low or Insufficient| LowConf
```

Quality Gates (Implemented)

| Gate | Name | Purpose | Pass Criteria |
| --- | --- | --- | --- |
| Gate 1 | Claim Validation | Filter non-factual claims | Factual; opinion score ≤ 0.3; specificity ≥ 0.3 |
| Gate 4 | Verdict Confidence | Ensure sufficient evidence | 2+ sources; avg quality ≥ 0.6; agreement ≥ 60% |

Gates 2 (Contradiction Search) and 3 (Uncertainty Quantification) are not yet implemented.
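The two implemented gates reduce to simple threshold checks. A minimal sketch, assuming the gates receive pre-computed scores; the function and parameter names are illustrative, only the thresholds come from the table above:

```typescript
// Gate 1: a claim passes if it is factual, not too opinionated,
// and specific enough.
function passesGate1(isFactual: boolean, opinionScore: number, specificityScore: number): boolean {
  return isFactual && opinionScore <= 0.3 && specificityScore >= 0.3;
}

// Gate 4: a verdict passes with at least 2 sources, average source
// quality of 0.6 or more, and 60%+ evidence agreement.
function passesGate4(sourceCount: number, avgQuality: number, agreementPct: number): boolean {
  return sourceCount >= 2 && avgQuality >= 0.6 && agreementPct >= 60;
}
```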

KeyFactors (Replaces Scenarios)

KeyFactors are optional decomposition questions discovered during the understanding phase:

  • Not stored as separate entities
  • Help break down complex claims into checkable sub-questions
  • See Docs/ARCHITECTURE/KeyFactors_Design.md for design rationale

7-Point Verdict Scale

  • TRUE (86-100%) - Claim is well-supported by evidence
  • MOSTLY-TRUE (72-85%) - Largely accurate with minor caveats
  • LEANING-TRUE (58-71%) - More evidence supports than contradicts
  • MIXED (43-57%, high confidence) - Roughly equal evidence both ways
  • UNVERIFIED (43-57%, low confidence) - Insufficient evidence to determine
  • LEANING-FALSE (29-42%) - More evidence contradicts than supports
  • MOSTLY-FALSE (15-28%) - Largely inaccurate
  • FALSE (0-14%) - Claim is refuted by evidence
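The scale above can be transcribed as a mapping from truth percentage to verdict. Note that MIXED and UNVERIFIED share the 43-57% band and are distinguished only by confidence; the function below is a direct transcription of the listed ranges, with the confidence flag as an assumed input:

```typescript
type Verdict =
  | "TRUE" | "MOSTLY-TRUE" | "LEANING-TRUE"
  | "MIXED" | "UNVERIFIED"
  | "LEANING-FALSE" | "MOSTLY-FALSE" | "FALSE";

// Maps a truth percentage (0-100) onto the 7-point scale; the
// 43-57% band splits into MIXED (high confidence) or UNVERIFIED
// (low confidence).
function toVerdict(truthPercentage: number, highConfidence: boolean): Verdict {
  if (truthPercentage >= 86) return "TRUE";
  if (truthPercentage >= 72) return "MOSTLY-TRUE";
  if (truthPercentage >= 58) return "LEANING-TRUE";
  if (truthPercentage >= 43) return highConfidence ? "MIXED" : "UNVERIFIED";
  if (truthPercentage >= 29) return "LEANING-FALSE";
  if (truthPercentage >= 15) return "MOSTLY-FALSE";
  return "FALSE";
}
```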

3. Automated Analysis Workflow

```
Claim from queue

Evidence gathering (AKEL)

Source evaluation (track record check)

KeyFactor discovery (formerly scenario generation)

Verdict synthesis

Risk assessment

Quality gates (confidence > 40%? risk < 80%?)

Publish OR Flag for improvement
```
Timeline: 10-30 seconds
90%+ published automatically

3.5 Evidence and Verdict Workflow

Information

Current Implementation (v2.6.33): Simplified model without versioning; uses the 7-point symmetric verdict scale.

Evidence and Verdict Data Model


```
erDiagram
    CLAIM ||--|| CLAIM_VERDICT : has
    CLAIM_VERDICT }o--o{ FACT : supported_by
    FACT }o--|| SOURCE : from

    CLAIM {
        string id_PK
        string text
        string type
        string claimRole
        boolean isCentral
        string_array dependsOn
    }

    CLAIM_VERDICT {
        string id_PK
        string claimId_FK
        string verdict
        int truthPercentage
        int confidence
        string explanation
        string_array supportingFactIds
        string_array opposingFactIds
        string contestationStatus
        float harmPotential
    }

    FACT {
        string id_PK
        string sourceId_FK
        string text
        string quote
        string relevance
        boolean supports
        string extractedContext
    }

    SOURCE {
        string id_PK
        string name
        string domain
        string url
        float reliabilityScore
        string bias
        string factualReporting
    }
```
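The entity-relationship model above translates directly into TypeScript. The interfaces below mirror the diagram field for field (foreign keys become plain `string` IDs); the sample object is illustrative:

```typescript
interface Claim {
  id: string;
  text: string;
  type: string;
  claimRole: string;
  isCentral: boolean;
  dependsOn: string[];
}

interface ClaimVerdict {
  id: string;
  claimId: string; // FK -> Claim.id (one verdict per claim)
  verdict: string;
  truthPercentage: number;
  confidence: number;
  explanation: string;
  supportingFactIds: string[];
  opposingFactIds: string[];
  contestationStatus: string;
  harmPotential: number;
}

interface Fact {
  id: string;
  sourceId: string; // FK -> Source.id
  text: string;
  quote: string;
  relevance: string;
  supports: boolean;
  extractedContext: string;
}

interface Source {
  id: string;
  name: string;
  domain: string;
  url: string;
  reliabilityScore: number;
  bias: string;
  factualReporting: string;
}

// Illustrative instance, not real data.
const exampleSource: Source = {
  id: "s-1",
  name: "Example Outlet",
  domain: "example.org",
  url: "https://example.org/article",
  reliabilityScore: 0.8,
  bias: "center",
  factualReporting: "high",
};
```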

Verdict Generation Flow


```
flowchart TB
    subgraph Research[Research Phase]
        FACTS[Collected Facts]
        SOURCES[Source Metadata]
    end

    subgraph Analysis[Analysis]
        WEIGHT[Weight Evidence by source reliability]
        CONTEST[Check Contestation doubted vs contested]
        HARM[Assess Harm Potential]
    end

    subgraph Verdict[Verdict Generation]
        CALC[Calculate Truth Percentage]
        MAP[Map to 7-point Scale]
        CONF[Assign Confidence]
    end

    subgraph Output[Result]
        CLAIM_V[Claim Verdict]
        ARTICLE_V[Article Verdict]
    end

    FACTS --> WEIGHT
    SOURCES --> WEIGHT
    WEIGHT --> CONTEST
    CONTEST --> HARM
    HARM --> CALC
    CALC --> MAP
    MAP --> CONF
    CONF --> CLAIM_V
    CLAIM_V --> ARTICLE_V
```
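The page does not specify the weighting formula, but one plausible reading of "weight evidence by source reliability" is a reliability-weighted share of supporting facts. A sketch under that assumption (the `WeightedFact` shape and neutral-default behavior are illustrative):

```typescript
interface WeightedFact {
  supports: boolean;   // does the fact support the claim?
  reliability: number; // reliability of the fact's source (0-1)
}

// Truth percentage as the reliability-weighted fraction of
// supporting evidence, on a 0-100 scale.
function truthPercentage(facts: WeightedFact[]): number {
  const total = facts.reduce((s, f) => s + f.reliability, 0);
  if (total === 0) return 50; // no usable evidence: stay neutral
  const supporting = facts
    .filter(f => f.supports)
    .reduce((s, f) => s + f.reliability, 0);
  return Math.round((supporting / total) * 100);
}
```

A highly reliable supporting source thus moves the percentage further than a weak opposing one.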

7-Point Verdict Scale

| Verdict | Truth % Range | Description |
| --- | --- | --- |
| TRUE | 86-100% | Claim is well-supported by evidence |
| MOSTLY-TRUE | 72-85% | Largely accurate with minor caveats |
| LEANING-TRUE | 58-71% | More evidence supports than contradicts |
| MIXED | 43-57% (high conf) | Roughly equal evidence both ways |
| UNVERIFIED | 43-57% (low conf) | Insufficient evidence to determine |
| LEANING-FALSE | 29-42% | More evidence contradicts than supports |
| MOSTLY-FALSE | 15-28% | Largely inaccurate |
| FALSE | 0-14% | Claim is refuted by evidence |

Contestation Status

  • Doubted: Evidence is weak, uncertain, or ambiguous
  • Contested: Strong evidence exists on both sides

Source Reliability

Source reliability scores come from external MBFC (Media Bias/Fact Check) bundle:

  • Pre-loaded reliability scores for known sources
  • Configurable via source-reliability.ts

4. Publication Workflow

Standard (90%+): Pass quality gates → Publish immediately with confidence scores
High Risk (<10%): Risk > 80% → Moderator review
Low Quality: Confidence < 40% → Improvement queue → Re-process
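The three publication paths above reduce to a small routing decision. A sketch using the section's thresholds (risk > 80% and confidence < 40%); the function and route names are illustrative:

```typescript
type Route = "publish" | "moderator-review" | "improvement-queue";

// Routes a verdict along the three publication paths: high risk
// goes to moderators, low quality to the improvement queue,
// everything else publishes immediately.
function routeVerdict(confidencePct: number, riskPct: number): Route {
  if (riskPct > 80) return "moderator-review";
  if (confidencePct < 40) return "improvement-queue";
  return "publish";
}
```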

5. User Contribution Workflow

```
Contributor edits → System validates → Applied immediately → Logged → Reputation earned
```
No approval required (Wikipedia model)
New contributors (<50 reputation): Limited to minor edits

5.5 Quality and Audit Workflow

Information

Current Implementation (v2.6.33): Only Gate 1 (Claim Validation) and Gate 4 (Verdict Confidence) are implemented; Gates 2-3 are planned for a future release.

Quality Gates Flow


```
flowchart TB
    subgraph Input[Input]
        CLAIM[Extracted Claim]
    end

    subgraph Gate1[Gate 1 Claim Validation]
        G1_CHECK{Is claim factual}
        G1_OPINION[Opinion Detection]
        G1_SPECIFIC[Specificity Check]
        G1_FUTURE[Future Prediction]
    end

    subgraph Research[Research]
        EVIDENCE[Gather Evidence]
    end

    subgraph Gate4[Gate 4 Verdict Confidence]
        G4_COUNT{Evidence Count}
        G4_QUALITY{Source Quality}
        G4_AGREE{Evidence Agreement}
        G4_TIER[Assign Confidence Tier]
    end

    subgraph Output[Output]
        PUBLISH[Publish Verdict]
        EXCLUDE[Exclude]
        LOWCONF[Flag for Review]
    end

    CLAIM --> G1_CHECK
    G1_CHECK --> G1_OPINION
    G1_OPINION --> G1_SPECIFIC
    G1_SPECIFIC --> G1_FUTURE
    G1_FUTURE -->|Pass| EVIDENCE
    G1_FUTURE -->|Fail| EXCLUDE
    EVIDENCE --> G4_COUNT
    G4_COUNT -->|2 or more| G4_QUALITY
    G4_COUNT -->|less than 2| LOWCONF
    G4_QUALITY -->|0.6 or more| G4_AGREE
    G4_QUALITY -->|less than 0.6| LOWCONF
    G4_AGREE -->|60 percent or more| G4_TIER
    G4_AGREE -->|less than 60 percent| LOWCONF
    G4_TIER -->|HIGH or MEDIUM| PUBLISH
    G4_TIER -->|LOW| LOWCONF

Gate Details

Gate 1: Claim Validation

Purpose: Ensure extracted claims are factual assertions that can be verified.

| Check | Purpose | Pass Criteria |
| --- | --- | --- |
| Factuality Test | Can this claim be proven true/false? | Must be verifiable |
| Opinion Detection | Contains subjective language? | Opinion score ≤ 0.3 |
| Specificity Check | Contains concrete details? | Specificity score ≥ 0.3 |
| Future Prediction | About future events? | Must be about past/present |

Gate 4: Verdict Confidence Assessment

Purpose: Only display verdicts with sufficient evidence and confidence.

| Tier | Evidence | Avg Quality | Agreement | Publishable? |
| --- | --- | --- | --- | --- |
| HIGH | 3+ sources | ≥ 0.7 | ≥ 80% | Yes |
| MEDIUM | 2+ sources | ≥ 0.6 | ≥ 60% | Yes |
| LOW | 2+ sources | ≥ 0.5 | ≥ 40% | Needs review |
| INSUFFICIENT | < 2 sources | Any | Any | More research needed |
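Tier assignment is a cascade over these thresholds. A direct transcription of the table (the fallback to INSUFFICIENT for evidence below the LOW thresholds is an assumption, since the table only defines INSUFFICIENT as fewer than 2 sources):

```typescript
type Tier = "HIGH" | "MEDIUM" | "LOW" | "INSUFFICIENT";

// Assigns a Gate 4 confidence tier from source count, average
// source quality (0-1), and evidence agreement (percent).
function confidenceTier(sources: number, avgQuality: number, agreementPct: number): Tier {
  if (sources < 2) return "INSUFFICIENT";
  if (sources >= 3 && avgQuality >= 0.7 && agreementPct >= 80) return "HIGH";
  if (avgQuality >= 0.6 && agreementPct >= 60) return "MEDIUM";
  if (avgQuality >= 0.5 && agreementPct >= 40) return "LOW";
  return "INSUFFICIENT"; // assumed fallback: more research needed
}
```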

Not Yet Implemented

Gate 2: Contradiction Search (planned) - Counter-evidence actively searched

Gate 3: Uncertainty Quantification (planned) - Data gaps identified and disclosed

6. Flagging Workflow

```
User flags issue → Categorize (abuse/quality) → Automated or manual resolution
```
Quality issues: Add to improvement queue → System fix → Auto re-process
Abuse: Moderator review → Action taken

7. Moderation Workflow

Automated pre-moderation: 95% published automatically
Moderator queue: Only high-risk or flagged content
Appeal process: Different moderator → Governing Team if needed

8. System Improvement Workflow

Weekly cycle:
```
Monday: Review error patterns
Tuesday-Wednesday: Develop fixes
Thursday: Test improvements
Friday: Deploy & re-process
Weekend: Monitor metrics
```
Error capture:
```
Error detected → Categorize → Root cause → Improvement queue → Pattern analysis
```
A/B Testing:
```
New algorithm → Split traffic (90% control, 10% test) → Run 1 week → Compare metrics → Deploy if better
```
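A deterministic 90/10 split keeps every claim in the same arm for the whole test week. A sketch of one common approach, hashing a stable ID; the hash function and names are illustrative, not FactHarbor's implementation:

```typescript
// Deterministic traffic split: hash the claim ID into a 0-99 bucket
// so the same claim always lands in the same arm.
function abArm(claimId: string, testFraction = 0.1): "control" | "test" {
  let h = 0;
  for (const ch of claimId) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // simple rolling hash
  }
  return h % 100 < testFraction * 100 ? "test" : "control";
}
```

Because the assignment is a pure function of the ID, metrics for the two arms can be compared without storing per-claim assignments.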

9. Quality Monitoring Workflow

Continuous: Every hour calculate metrics, detect anomalies
Daily: Update source track records, aggregate error patterns
Weekly: System improvement cycle, performance review

10. Source Track Record Workflow

Initial score: New source starts at 50 (neutral)
Daily updates: Calculate accuracy, correction frequency, update score
Continuous: All claims using source recalculated when score changes
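The daily update could be sketched as a slow-moving blend of the current score toward observed performance. Only the neutral starting score of 50 comes from this section; the blending weights and the correction penalty are assumptions:

```typescript
const NEW_SOURCE_SCORE = 50; // every new source starts neutral

// Daily track-record update: nudge the current score (0-100) toward
// observed accuracy (0-1), discounted by the correction rate (0-1).
function updateSourceScore(current: number, accuracy: number, correctionRate: number): number {
  const observed = accuracy * 100 * (1 - correctionRate);
  const next = 0.9 * current + 0.1 * observed; // slow-moving average
  return Math.min(100, Math.max(0, Math.round(next)));
}
```

When a score changes, all claims citing that source are recalculated, as described above.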

11. Re-Processing Workflow

Triggers: System improvement deployed, source score updated, new evidence, error fixed
Process: Identify affected claims → Re-run AKEL → Compare → Update if better → Log change

12. Related Pages