Requirements

Version 1.1 by Robert Schaub on 2025/12/16 21:42

This page defines Roles, Responsibilities, and Rules for contributors and users of FactHarbor.

1. Roles

1.1 Reader

Who: Anyone (no login required).

Can:

  • Browse and search claims
  • View scenarios, evidence, verdicts, and timelines
  • Compare scenarios and explore assumptions
  • Flag issues, errors, contradictions, or suspicious patterns
  • Use filters, search, and visualization tools
  • Create personal views (saved searches and bookmarks, stored in the local browser)
  • Submit claims by providing text to analyze; new claims are added automatically unless an identical claim already exists in the system

Cannot:

  • Modify existing content
  • Access draft content
  • Participate in governance decisions

Note: Readers can request human review of AI-generated content by flagging it.

1.2 Contributor

Who: Registered and logged-in users (extends Reader capabilities).

Can:

  • Everything a Reader can do
  • Submit claims
  • Submit evidence
  • Provide feedback
  • Suggest scenarios
  • Flag content for review
  • Request human review of AI-generated content

Cannot:

  • Publish or mark content as "reviewed" or "approved"
  • Override expert or maintainer decisions
  • Directly modify AKEL or quality gate configurations

1.3 Reviewer

Who: Trusted community members, appointed by maintainers.

Can:

  • Review contributions from Contributors and AKEL drafts
  • Validate AI-generated content (Mode 2 → Mode 3 transition)
  • Edit claims, scenarios, and evidence
  • Add clarifications or warnings
  • Change content status: `draft` → `in review` → `published` / `rejected`
  • Approve or reject Tier B and C content for "Human-Reviewed" status
  • Flag content for expert review
  • Participate in audit sampling

Cannot:

  • Approve Tier A content for "Human-Reviewed" status (requires Expert)
  • Change governance rules
  • Unilaterally change expert conclusions without process
  • Bypass quality gates

Note on AI-Drafted Content:

  • Reviewers can validate AI-generated content (Mode 2) to promote it to "Human-Reviewed" (Mode 3)
  • For Tier B and C, Reviewers have approval authority
  • For Tier A, only Experts can grant "Human-Reviewed" status

1.4 Expert (Domain-Specific)

Who: Subject-matter specialists in specific domains (medicine, law, science, etc.).

Can:

  • Everything a Reviewer can do
  • Final authority on Tier A content "Human-Reviewed" status
  • Validate complex or controversial claims in their domain
  • Define domain-specific quality standards
  • Set reliability thresholds for domain sources
  • Participate in risk tier assignment review
  • Override AKEL suggestions in their domain (with documentation)

Cannot:

  • Change platform governance policies
  • Approve content outside their expertise domain
  • Bypass technical quality gates (but can flag for adjustment)

Specialization:

  • Experts are domain-specific (e.g., "Medical Expert", "Legal Expert", "Climate Science Expert")
  • Cross-domain claims may require multiple expert reviews

1.5 Auditor

Who: Reviewers or Experts assigned to sampling audit duties.

Can:

  • Review sampled AI-generated content against quality standards
  • Validate quality gate enforcement
  • Identify patterns in AI errors or hallucinations
  • Provide feedback for system improvement
  • Flag content for immediate review if errors found
  • Contribute to audit statistics and transparency reports

Cannot:

  • Change audit sampling algorithms (maintainer responsibility)
  • Bypass normal review workflows
  • Audit content they personally created

Selection:

  • Auditors selected based on domain expertise and review quality
  • Rotation to prevent audit fatigue
  • Stratified assignment (Tier A auditors need higher expertise)

Audit Focus:

  • Tier A: recommended sampling rate 30-50%, expert auditors
  • Tier B: recommended sampling rate 10-20%, reviewer or expert auditors
  • Tier C: recommended sampling rate 5-10%, reviewer auditors

1.6 Moderator

Who: Maintainers or trusted long-term contributors.

Can:

  • All Reviewer and Expert capabilities (cross-domain)
  • Manage user accounts and permissions
  • Handle disputes and conflicts
  • Enforce community guidelines
  • Suspend or ban abusive users
  • Finalize publication status for sensitive content
  • Review and adjust risk tier assignments
  • Oversee audit system performance

Cannot:

  • Change core data model or architecture
  • Override technical system constraints
  • Make unilateral governance decisions without consensus

1.7 Maintainer

Who: Core team members responsible for the platform.

Can:

  • All Moderator capabilities
  • Change data model, architecture, and technical systems
  • Configure quality gates and AKEL parameters
  • Adjust audit sampling algorithms
  • Set and modify risk tier policies
  • Make platform-wide governance decisions
  • Access and modify backend systems
  • Deploy updates and fixes
  • Grant and revoke roles

Governance:

  • Maintainers operate under organizational governance rules
  • Major policy changes require Governing Team approval
  • Technical decisions made collaboratively
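
The cumulative role model in Sections 1.1-1.7 can be sketched as an ordered privilege lattice. The following is a minimal illustration, not the actual access-control implementation; the function names and the flat ordering are assumptions (domain scoping for Experts is omitted):

```python
# Sketch of the cumulative role model from Section 1.
# Each role inherits the capabilities of the roles before it.
ROLE_ORDER = ["reader", "contributor", "reviewer", "expert", "moderator", "maintainer"]

def at_least(role: str, required: str) -> bool:
    """True if `role` has at least the privileges of `required`."""
    return ROLE_ORDER.index(role) >= ROLE_ORDER.index(required)

def can_approve_human_reviewed(role: str, tier: str) -> bool:
    """Tier A needs an Expert or above; Tier B/C needs a Reviewer or above."""
    if tier == "A":
        return at_least(role, "expert")
    return at_least(role, "reviewer")
```

For example, `can_approve_human_reviewed("reviewer", "A")` is False, matching the rule that only Experts grant "Human-Reviewed" status for Tier A.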

2. Content Publication States

2.1 Mode 1: Draft

  • Not visible to public
  • Visible to contributor and reviewers
  • Can be edited by contributor or reviewers
  • Default state for failed quality gates

2.2 Mode 2: AI-Generated (Published)

  • Public and visible to all users
  • Clearly labeled as "AI-Generated, Awaiting Human Review"
  • Passed all automated quality gates
  • Risk tier displayed (A/B/C)
  • Users can:
    • Read and use content
    • Request human review
    • Flag for expert attention
  • Subject to sampling audits
  • Can be promoted to Mode 3 by reviewer/expert validation

2.3 Mode 3: Human-Reviewed (Published)

  • Public and visible to all users
  • Labeled as "Human-Reviewed" with reviewer/expert attribution
  • Passed quality gates + human validation
  • Highest trust level
  • For Tier A, requires Expert approval
  • For Tier B/C, Reviewer approval sufficient

2.4 Rejected

  • Not visible to public
  • Visible to contributor with rejection reason
  • Can be resubmitted after addressing issues
  • Rejection logged for transparency
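
The publication states above form a small state machine. The sketch below is illustrative; the transition table is an assumption drawn from Sections 2.1-2.4, not a specified API:

```python
from enum import Enum

# Sketch of the Mode 1/2/3 publication states from Section 2.
class PubState(Enum):
    DRAFT = "Mode 1: Draft"
    AI_GENERATED = "Mode 2: AI-Generated (Published)"
    HUMAN_REVIEWED = "Mode 3: Human-Reviewed (Published)"
    REJECTED = "Rejected"

ALLOWED = {
    PubState.DRAFT: {PubState.AI_GENERATED, PubState.REJECTED},       # must pass quality gates
    PubState.AI_GENERATED: {PubState.HUMAN_REVIEWED, PubState.DRAFT}, # validation or demotion
    PubState.REJECTED: {PubState.DRAFT},                              # resubmission after fixes
    PubState.HUMAN_REVIEWED: set(),
}

def transition(current: PubState, target: PubState) -> PubState:
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```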

3. Contribution Rules

3.1 All Contributors Must:

  • Provide sources for claims
  • Use clear, neutral language
  • Avoid personal attacks or insults
  • Respect intellectual property (cite sources)
  • Accept community feedback gracefully

3.2 AKEL (AI) Must:

  • Mark all outputs with `AuthorType = AI`
  • Pass quality gates before Mode 2 publication
  • Perform mandatory contradiction search
  • Disclose confidence levels and uncertainty
  • Provide traceable reasoning chains
  • Flag potential bubbles or echo chambers
  • Submit to audit sampling

3.3 Reviewers Must:

  • Be impartial and evidence-based
  • Document reasoning for decisions
  • Escalate to experts when appropriate
  • Participate in audits when assigned
  • Provide constructive feedback

3.4 Experts Must:

  • Stay within domain expertise
  • Disclose conflicts of interest
  • Document specialized terminology
  • Provide reasoning for domain-specific decisions
  • Participate in Tier A audits

4. Quality Standards

4.1 Source Requirements

  • Primary sources preferred over secondary
  • Publication date and author must be identifiable
  • Sources must be accessible (not paywalled when possible)
  • Contradictory sources must be acknowledged
  • Echo chamber sources must be flagged

4.2 Claim Requirements

  • Falsifiable or evaluable
  • Clear definitions of key terms
  • Boundaries and scope stated
  • Assumptions made explicit
  • Uncertainty acknowledged

4.3 Evidence Requirements

  • Relevant to the claim and scenario
  • Reliability assessment provided
  • Methodology described (for studies)
  • Limitations noted
  • Conflicting evidence acknowledged

5. Risk Tier Assignment

  • Automated (AKEL): initial tier suggested based on domain, keywords, and impact
  • Human validation: Moderators or Experts can override AKEL suggestions
  • Review: risk tiers periodically reviewed based on audit outcomes

Tier A Indicators:

  • Medical diagnosis or treatment advice
  • Legal interpretation or advice
  • Election or voting information
  • Safety or security sensitive
  • Major financial decisions
  • Potential for significant harm

Tier B Indicators:

  • Complex scientific causality
  • Contested policy domains
  • Historical interpretation with political implications
  • Significant economic impact claims

Tier C Indicators:

  • Established historical facts
  • Simple definitions
  • Well-documented scientific consensus
  • Basic reference information
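
The indicator lists above lend themselves to a simple first-pass classifier. The sketch below only demonstrates the thresholding idea; the keyword lists are illustrative assumptions, not the actual AKEL logic, which would use richer domain and impact signals:

```python
# Illustrative keyword-based tier suggestion following Section 5.
# Keyword lists are assumptions for the sketch, not AKEL's real signals.
TIER_A_KEYWORDS = {"diagnosis", "treatment", "legal advice", "election", "vaccine"}
TIER_B_KEYWORDS = {"causality", "policy", "economic impact"}

def suggest_tier(claim_text: str) -> str:
    text = claim_text.lower()
    if any(k in text for k in TIER_A_KEYWORDS):
        return "A"
    if any(k in text for k in TIER_B_KEYWORDS):
        return "B"
    return "C"  # default: low-risk reference information
```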

7. Role Hierarchy Diagrams

7.1 User Class Diagram

The following class diagram visualizes the complete user role hierarchy:

User Class Diagram


classDiagram
    class BaseUser {
        +view_results()
        +browse()
        +search()
    }
    class Reader {
        <<anonymous>>
        +browse()
        +search()
        +view_results()
    }
    class RegisteredUser {
        +UUID id
        +String username
        +Role role
        +Timestamp created_at
        +submit_url()
        +flag_issue()
        +view_submission_history()
    }
    class UCMAdministrator {
        +manage_config()
        +view_audit_trail()
        +activate_config_version()
        +trigger_reanalysis()
        +view_system_metrics()
    }
    class Moderator {
        +review_flags()
        +hide_content()
        +ban_user()
    }
    BaseUser <|-- Reader : anonymous
    BaseUser <|-- RegisteredUser : logged in
    RegisteredUser <|-- UCMAdministrator : appointed
    RegisteredUser <|-- Moderator : appointed

Role Permissions

 Role              | Capabilities                                                                   | Requirements
 ------------------|--------------------------------------------------------------------------------|----------------------------
 Reader (Guest)    | Browse, search, view results                                                   | No login required
 User (Registered) | Everything Reader can + submit URLs/text (rate-limited), flag content          | Free account required
 UCM Administrator | Everything User can + manage UCM config, view audit trail, trigger re-analysis | Appointed by Governing Team
 Moderator         | Everything User can + review flags, hide content, ban users                    | Appointed by Governing Team

Current Implementation

  • All users are anonymous Readers (no authentication system yet)
  • UCM config management via CLI/direct DB access
  • No moderator tooling
  • No rate limiting (single-user development mode)

Design Principles

  • No data editing roles — analysis outputs are immutable
  • UCM Administrator improves the system through configuration, not by editing individual outputs
  • Submission requires login — LLM inference and web search are not free; rate limits control costs
  • Four roles: Reader (guest), User (registered), UCM Administrator (appointed), Moderator (appointed)

7.2 Human User Roles

This diagram shows the two-track progression for human users:

User Role Structure


graph TD
    READER[Reader - Guest/Anonymous] --> |Can| R1[Browse Published Analyses]
    READER --> |Can| R2[Search Content]
    READER --> |Can| R3[View Analysis Results]
    USER[User - Registered] --> |Can| RU1[Everything Reader Can]
    USER --> |Can| RU2[Submit URLs/Text - Rate-Limited]
    USER --> |Can| RU3[Flag Issues]
    UCM_ADMIN[UCM Administrator - Appointed] --> |Can| U1[Manage UCM Config]
    UCM_ADMIN --> |Can| U2[View Config Audit Trail]
    UCM_ADMIN --> |Can| U3[Trigger Re-Analysis]
    UCM_ADMIN --> |Can| U4[View System Metrics]
    MODERATOR[Moderator - Appointed] --> |Can| M1[Review Flags]
    MODERATOR --> |Can| M2[Hide Harmful Content]
    MODERATOR --> |Can| M3[Ban Abusive Users]

Role Descriptions

 Role              | Purpose                                      | Current Status
 ------------------|----------------------------------------------|--------------------------------------
 Reader (Guest)    | Anonymous browsing, searching, and viewing   | Implemented (all users)
 User (Registered) | Submit URLs/text for analysis (rate-limited) | Not yet implemented (no auth)
 UCM Administrator | Manage UCM configuration, view audit trail   | Partially implemented (CLI/direct DB)
 Moderator         | Handle abuse, enforce community guidelines   | Not yet implemented

Current Implementation

All users are anonymous Readers:

  • Can view analysis results
  • Can browse and search published analyses
  • No persistent accounts (no authentication system yet)
  • No submission rate limiting (single-user development mode)

Design Principles

  • No data editing — analysis outputs are immutable
  • Improve the system, not the data — UCM Administrators tune configuration to improve quality
  • Moderators handle abuse only — not content quality (that is automated)
  • Low barrier to entry — anyone can browse and search without registration; submission requires a free account
  • Rate-limited submissions — LLM inference and web search are not free; registered users have configurable quotas

7.3 Technical and System Users

This diagram shows system processes and their management:

Warning

Partially Implemented (v2.6.33): only the AKEL system service is implemented. The user system, moderators, background scheduler, and search indexer are not yet implemented.

Target Technical Model


erDiagram
    USER {
        string UserID_PK
        string role
        int reputation
    }
    MODERATOR {
        string ModeratorID_PK
        string UserID_FK
        string permissions
    }
    SYSTEM_SERVICE {
        string ServiceID_PK
        string ServiceName
        string Purpose
        string Status
    }
    AKEL {
        string InstanceID_PK
        string ServiceID_FK
        string Version
    }
    BACKGROUND_SCHEDULER {
        string SchedulerID_PK
        string ServiceID_FK
        string ScheduledTasks
    }
    SEARCH_INDEXER {
        string IndexerID_PK
        string ServiceID_FK
        string LastSyncTime
    }
    USER ||--o| MODERATOR : appointed_as
    MODERATOR ||--o{ SYSTEM_SERVICE : monitors
    SYSTEM_SERVICE ||--|| AKEL : AI_processing
    SYSTEM_SERVICE ||--|| BACKGROUND_SCHEDULER : periodic_tasks
    SYSTEM_SERVICE ||--|| SEARCH_INDEXER : search_sync

Implementation Status

 Component            | Target Purpose                   | Current Status
 ---------------------|----------------------------------|------------------------------------
 USER                 | User accounts with reputation    | Not implemented (anonymous only)
 MODERATOR            | Appointed users with permissions | Not implemented
 AKEL                 | AI processing engine             | Implemented (Triple-Path pipeline)
 BACKGROUND_SCHEDULER | Periodic tasks                   | Not implemented
 SEARCH_INDEXER       | Elasticsearch sync               | Not implemented (no Elasticsearch)

Current Implementation

v2.6.33 has only:

  • AKEL pipeline for analysis
  • .NET API for job persistence
  • No background services
  • No search indexing (uses web search only)

Key Design Principles:

  • Two tracks from Contributor: Content Track (Reviewer) and Technical Track (Maintainer)
  • Technical Users: System processes (AKEL, bots) managed by Maintainers
  • Separation of concerns: Editorial authority independent from technical authority

Functional Requirements

This page defines what the FactHarbor system must do to fulfill its mission.

Requirements are structured as FR (Functional Requirement) items and organized by capability area.

8. Claim Intake & Normalization

8.1 FR1 – Claim Intake

The system must support Claim creation from:

  • Free-text input (from any Reader)
  • URLs (web pages, articles, posts)
  • Uploaded documents and transcripts
  • Structured feeds (optional, e.g. from partner platforms)
  • Automated ingestion (federation input)
  • AKEL extraction from multi-claim texts

Automatic submission: Any Reader can submit text, and new claims are added automatically unless identical claims already exist.

8.2 FR2 – Claim Normalization

  • Convert diverse inputs into short, structured, declarative claims
  • Preserve original phrasing for reference
  • Avoid hidden reinterpretation; differences between original and normalized phrasing must be visible
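
FR2's requirement to keep the original phrasing visible next to the normalized claim can be sketched as a small record type. Field and class names here are illustrative assumptions:

```python
from dataclasses import dataclass

# Sketch of FR2: keep original phrasing next to the normalized claim so
# any difference stays visible to users.
@dataclass(frozen=True)
class NormalizedClaim:
    original_text: str    # preserved verbatim for reference
    normalized_text: str  # short, structured, declarative form

    def differs(self) -> bool:
        """True when normalization changed the wording (must be shown to users)."""
        return self.original_text.strip() != self.normalized_text.strip()
```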

8.3 FR3 – Claim Classification

  • Classify claims by topic, domain, and type (e.g., quantitative, causal, normative)
  • Assign risk tier (A/B/C) based on domain and potential impact
  • Suggest which nodes and experts are relevant

8.4 FR4 – Claim Clustering

  • Group similar claims into Claim Clusters
  • Allow manual correction of cluster membership
  • Provide explanation why two claims are considered "same cluster"
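
The clustering decision in FR4 reduces to a similarity threshold. Production clustering would use vector embeddings (see FR25); the token-count cosine below is only a self-contained stand-in to show the thresholding, and the threshold value is an assumption:

```python
import math

# Illustrative similarity check for FR4 using cosine similarity over
# token counts; real clustering would use embeddings.
def token_counts(text: str) -> dict:
    counts = {}
    for tok in text.lower().split():
        counts[tok] = counts.get(tok, 0) + 1
    return counts

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def same_cluster(claim1: str, claim2: str, threshold: float = 0.6) -> bool:
    return cosine(token_counts(claim1), token_counts(claim2)) >= threshold
```

The similarity score itself can serve as the required explanation of why two claims landed in the same cluster.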

9. Scenario System

9.1 FR5 – Scenario Creation

  • Contributors, Reviewers, and Experts can create scenarios
  • AKEL can propose draft scenarios
  • Each scenario is tied to exactly one Claim Cluster

9.2 FR6 – Required Scenario Fields

Each scenario includes:

  • Definitions (key terms)
  • Assumptions (explicit, testable where possible)
  • Boundaries (time, geography, population, conditions)
  • Scope of evidence considered
  • Intended decision / context (optional)

9.3 FR7 – Scenario Versioning

  • Every change to a scenario creates a new version
  • Previous versions remain accessible with timestamps and rationale
  • ParentVersionID links versions

9.4 FR8 – Scenario Comparison

  • Users can compare scenarios side by side
  • Show differences in assumptions, definitions, and evidence sets

10. Evidence Management

10.1 FR9 – Evidence Ingestion

  • Attach external sources (articles, studies, datasets, reports, transcripts) to Scenarios
  • Allow multiple pieces of evidence per Scenario
  • Support large file uploads (with size limits)

10.2 FR10 – Evidence Assessment

For each piece of evidence:

  • Assign reliability / quality ratings
  • Capture who rated it and why
  • Indicate known limitations, biases, or conflicts of interest
  • Track evidence version history

10.3 FR11 – Evidence Linking

  • Link one piece of evidence to multiple scenarios if relevant
  • Make dependencies explicit (e.g., "Scenario A uses subset of evidence used in Scenario B")
  • Use ScenarioEvidenceLink table with RelevanceScore
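
The ScenarioEvidenceLink join named in FR11 can be sketched as a record; the table and score names come from the document, while the 0-1 score range is an assumption:

```python
from dataclasses import dataclass

# Sketch of the ScenarioEvidenceLink join from FR11: one evidence item
# can attach to many scenarios, each link carrying a RelevanceScore.
@dataclass(frozen=True)
class ScenarioEvidenceLink:
    scenario_id: str
    evidence_id: str
    relevance_score: float  # RelevanceScore, assumed to lie in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.relevance_score <= 1.0:
            raise ValueError("relevance_score must be within [0, 1]")
```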

11. Verdicts & Truth Landscape

11.1 FR12 – Scenario Verdicts

For each Scenario:

  • Provide a probability- or likelihood-based verdict
  • Capture uncertainty and reasoning
  • Distinguish between AKEL draft and human-approved verdict
  • Support Mode 1 (draft), Mode 2 (AI-generated), Mode 3 (human-reviewed)

11.2 FR13 – Truth Landscape

  • Aggregate all scenario-specific verdicts into a "truth landscape" for a claim
  • Make disagreements visible rather than collapsing them into a single binary result
  • Show parallel scenarios and their respective verdicts

11.3 FR14 – Time Evolution

  • Show how verdicts and evidence evolve over time
  • Allow users to see "as of date X, what did we know?"
  • Maintain complete version history for auditing

12. Workflow, Moderation & Audit

12.1 FR15 – Workflow States

  • Draft → In Review → Published / Rejected
  • Separate states for Claims, Scenarios, Evidence, and Verdicts
  • Support Mode 1/2/3 publication model

12.2 FR16 – Moderation & Abuse Handling

  • Allow Moderators to hide content or lock edits for abuse or legal reasons
  • Keep internal audit trail even if public view is restricted
  • Support user reporting and flagging

12.3 FR17 – Audit Trail

  • Every significant action (create, edit, publish, delete/hide) is logged with:
    • Who did it
    • When (timestamp)
    • What changed (diffs)
    • Why (justification text)
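
The four required fields of FR17 (who, when, what, why) map directly to a log record; the structure below is a sketch, with only the four fields fixed by the document:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Sketch of the FR17 audit record: who, when, what changed, and why.
@dataclass(frozen=True)
class AuditEntry:
    actor: str          # who did it
    action: str         # create / edit / publish / delete / hide
    diff: str           # what changed
    justification: str  # why
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```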

13. Quality Gates & AI Review

13.1 FR18 – Quality Gate Validation

Before AI-generated content (Mode 2) publication, enforce:

  • Gate 1: Source Quality
  • Gate 2: Contradiction Search (MANDATORY)
  • Gate 3: Uncertainty Quantification
  • Gate 4: Structural Integrity
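
All four gates must pass before Mode 2 publication; any failure keeps content in Draft. The gate names below come from FR18, but the per-gate predicates are hypothetical stand-ins:

```python
# Sketch of FR18: run all four gates; a failed gate means the content
# stays in Draft (Mode 1). Gate predicates are illustrative only.
def run_quality_gates(content: dict):
    gates = [
        ("Gate 1: Source Quality", lambda c: bool(c.get("sources"))),
        ("Gate 2: Contradiction Search", lambda c: c.get("contradiction_search_done", False)),
        ("Gate 3: Uncertainty Quantification", lambda c: "confidence" in c),
        ("Gate 4: Structural Integrity", lambda c: bool(c.get("claim_text"))),
    ]
    failures = [name for name, check in gates if not check(content)]
    return (not failures, failures)
```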

13.2 FR19 – Audit Sampling

  • Implement stratified sampling by risk tier
  • Recommended rates: 30-50% for Tier A, 10-20% for Tier B, 5-10% for Tier C
  • Support audit workflow and feedback loop
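
Stratified sampling by tier can be sketched as below; the rates are midpoints of the document's recommended ranges, chosen here as an assumption:

```python
import random

# Sketch of FR19 stratified audit sampling. Rates are assumed midpoints
# of the recommended ranges (30-50% A, 10-20% B, 5-10% C).
SAMPLING_RATE = {"A": 0.40, "B": 0.15, "C": 0.075}

def sample_for_audit(items, rng=None):
    """items are (item_id, tier) pairs; returns the audit sample."""
    rng = rng or random.Random()
    return [item for item in items if rng.random() < SAMPLING_RATE[item[1]]]
```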

13.3 FR20 – Risk Tier Assignment

  • AKEL suggests tier based on domain, keywords, impact
  • Moderators and Experts can override
  • Risk tier affects publication workflow

14. Federation Requirements

14.1 FR21 – Node Autonomy

  • Each node can run independently (local policies, local users, local moderation)
  • Nodes decide which other nodes to federate with
  • Trust levels: Trusted / Neutral / Untrusted

14.2 FR22 – Data Sharing Modes

Nodes must be able to:

  • Share claims and summaries only
  • Share selected claims, scenarios, and verdicts
  • Share full underlying evidence metadata where allowed
  • Opt-out of sharing sensitive or restricted content

14.3 FR23 – Synchronization & Conflict Handling

  • Changes from remote nodes must be mergeable or explicitly conflict-marked
  • Conflicting verdicts are allowed and visible; not forced into consensus
  • Support push/pull/subscription synchronization

14.4 FR24 – Federation Discovery

  • Discover other nodes and their capabilities (public endpoints, policies)
  • Allow whitelisting / blacklisting of nodes
  • Global identifier format: `factharbor://node_url/type/local_id`
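
The `factharbor://node_url/type/local_id` format stated above can be parsed with a few string operations; this sketch only illustrates the three-part structure:

```python
# Sketch of parsing the FR24 global identifier format
# `factharbor://node_url/type/local_id`.
def parse_global_id(gid: str) -> dict:
    scheme, _, rest = gid.partition("://")
    if scheme != "factharbor":
        raise ValueError(f"unexpected scheme: {scheme!r}")
    node_url, type_, local_id = rest.split("/", 2)
    return {"node": node_url, "type": type_, "local_id": local_id}
```

For example, `factharbor://node.example.org/claim/1234` yields node `node.example.org`, type `claim`, local id `1234` (the example values are hypothetical).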

14.5 FR25 – Cross-Node AI Knowledge Exchange

  • Share vector embeddings for clustering
  • Share canonical claim forms
  • Share scenario templates
  • Share contradiction alerts
  • NEVER share model weights
  • NEVER override local governance

15. Non-Functional Requirements

15.1 NFR1 – Transparency

  • All assumptions, evidence, and reasoning behind verdicts must be visible
  • AKEL involvement must be clearly labeled
  • Users must be able to inspect the chain of reasoning and versions

15.2 NFR2 – Security

  • Role-based access control
  • Transport-level security (HTTPS)
  • Secure storage of secrets (API keys, credentials)
  • Audit trails for sensitive actions

15.3 NFR3 – Privacy & Compliance

  • Configurable data retention policies
  • Ability to redact or pseudonymize personal data when required
  • Compliance hooks for jurisdiction-specific rules (e.g. GDPR-like deletion requests)

15.4 NFR4 – Performance

  • POC: typical interactions < 2 s
  • Release 1.0: < 300 ms for common read operations after caching
  • Degradation strategies under load

15.5 NFR5 – Scalability

  • POC: 50 internal testers on one node
  • Beta 0: 100 external testers on one node
  • Release 1.0: 2000+ concurrent users on a reasonably provisioned node

Technical targets for Release 1.0:

  • Scalable monolith or early microservice architecture
  • Sharded vector database (for semantic search)
  • Optional IPFS or other decentralized storage for large artifacts
  • Horizontal scalability for read capacity

15.6 NFR6 – Interoperability

  • Open, documented API
  • Modular AKEL that can be swapped or extended
  • Federation protocols that follow open standards where possible
  • Standard model for external integrations

15.7 NFR7 – Observability & Operations

  • Metrics for performance, errors, and queue backlogs
  • Logs for key flows (claim intake, scenario changes, verdict updates, federation sync)
  • Health endpoints for monitoring

15.8 NFR8 – Maintainability

  • Clear module boundaries (API, core services, AKEL, storage, federation)
  • Backward-compatible schema migration strategy where feasible
  • Configuration via files / environment variables, not hard-coded
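
A minimal example of the environment-variable approach; the variable names and defaults are hypothetical, not part of the specification:

```python
import os

# Sketch of NFR8: settings read from environment variables with safe
# defaults instead of being hard-coded. Names are illustrative.
def load_config() -> dict:
    return {
        "db_url": os.environ.get("FACTHARBOR_DB_URL", "postgresql://localhost/factharbor"),
        "tier_a_sampling": float(os.environ.get("FACTHARBOR_TIER_A_SAMPLING", "0.4")),
        "federation_enabled": os.environ.get("FACTHARBOR_FEDERATION", "false").lower() == "true",
    }
```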

15.9 NFR9 – Usability

  • UI optimized for exploring complexity, not hiding it
  • Support for saved views, filters, and user-level preferences
  • Progressive disclosure: casual users see summaries, advanced users can dive deep

16. Release Levels

16.1 Proof of Concept (POC)

  • Single node
  • Limited user set (50 internal testers)
  • Basic claim → scenario → evidence → verdict flow
  • Minimal federation (optional)
  • AI-generated publication (Mode 2) demonstration
  • Quality gates active

16.2 Beta 0

  • One or few nodes
  • External testers (100)
  • Expanded workflows and basic moderation
  • Initial federation experiments
  • Audit sampling implemented

16.3 Release 1.0

  • 2000+ concurrent users
  • Scalable architecture
  • Sharded vector DB
  • IPFS optional
  • High automation (AKEL assistance)
  • Multi-node federation with full sync protocol
  • Mature audit system

17. Related Pages