Open Source Model and Licensing

Last modified by Robert Schaub on 2025/12/18 12:03

1. Purpose and Relation to Other Documents

This page explains how FactHarbor is run from a licensing and enforcement perspective: as an open, trustworthy, non-profit-oriented, and professionally maintained project.
It covers in particular:

the licensing choices for code, documentation, data, and core specifications,
how contributors grant the project the right to use and enforce those licenses,
how AI-related components (such as AKEL) fit into the licensing picture,
how licence choices support manipulation-resistance and long-term openness,
organisational transparency commitments,
privacy and data protection standards.
Together with the other Organisation pages, it defines how FactHarbor is run:
Governance – who decides what, and under which principles
Finance & Compliance – how funding, transparency, and internal controls work
Legal Framework – legal forms, contracts, and regulatory aspects
The Specification (Mission, Requirements, Architecture, Data Model, Workflows, etc.) describes what FactHarbor does.
This Open Source Model and Licensing page (together with Governance and Finance & Compliance) describes how FactHarbor is run and protected.
For historical context, earlier drafts used a purely AGPLv3-centric model for the core software.
The current licence mix is defined in the sections below and takes precedence over any older drafts.

2. Overview

FactHarbor is, and will remain, an open source project that:

publishes its work openly whenever legally and ethically possible
makes its reasoning and evidence inspectable
invites contributions under clear, transparent rules
avoids situations where a "FactHarbor-branded" system becomes a black box
maintains exceptional organisational transparency to build trust
This page defines:
the licensing choices currently used,
the goals and principles behind these choices,
how contributors are governed from a licensing/enforcement perspective,
how AI models and third-party components are handled,
the standards that repositories must follow,
organisational transparency and privacy commitments.
Normative licensing decisions on this page override any older variants or drafts.

3. Licensing (Current Decisions)

3.1 Documentation

All general documentation (organisational and technical) is licensed under:

Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)
This allows:
reuse, adaptation, and translation of documentation,
including commercial reuse,
as long as:
clear attribution to FactHarbor is preserved, and
derivative works are shared under the same license (CC BY-SA 4.0).
Exception handling:
In rare cases, security-sensitive or abuse-enabling documentation may be:
published only in partial form, or
made available under more restrictive terms, or
kept internal.
Any such exceptions must be explicitly documented where they apply.

3.2 Core Protocol & Data Model

The core protocol, core data model (including key ERDs), and other "defining specifications" are licensed under:

Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)
Intent:
enable collaborative evolution of the protocol and data model,
allow broad reuse, referencing, and implementation,
ensure derivative specifications remain open (share-alike requirement),
maintain canonical status through trademark control rather than license restrictions.
Implications:
You may use, implement, and modify the protocol/data model in your own systems.
You may publish derivative or modified specifications under CC BY-SA 4.0.
Derivative specifications must:
be clearly attributed to FactHarbor,
use different branding/names (trademark protection),
state they are "derived from FactHarbor protocol",
remain under CC BY-SA 4.0 (share-alike).
Changes to the canonical FactHarbor specification are governed through FactHarbor's internal review and release processes.
Trademark Protection:
The "FactHarbor" name and associated marks are protected separately from the license. Derivative protocols may not use "FactHarbor" branding without explicit permission, ensuring users can distinguish official from derivative implementations.
This approach (license for sharing + trademark for brand protection) follows successful models like Mozilla Firefox and the W3C.

3.3 Code

Default License: Unless explicitly stated otherwise, code produced under the FactHarbor project is licensed under:

MIT License
This allows:
broad reuse, including in commercial software,
proprietary integrations and extensions,
as long as:
the MIT license text and copyright notice are included in all copies or substantial portions of the software, and
attribution to the FactHarbor project is thereby preserved.
Hybrid Licensing for Core Components:
For the core reasoning engine and AKEL components, we recommend using AGPL-3.0 to prevent black-box deployments and ensure transparency of modifications.
The recommended hybrid approach:
AGPL-3.0 for: Core verdict engine, AKEL reasoning logic, scenario evaluation engine
MIT for: Integrations, utilities, frontend clients, libraries, tools
This hybrid model balances broad adoption of the peripheral components with protection of the transparency mission in the core.
Rationale:
AGPL-3.0 is a network copyleft license: anyone who runs a modified version as a network service must offer the corresponding source code to the users of that service.
This prevents "FactHarbor-as-a-service" black boxes that contradict the transparency mission.
MIT for peripheral components maximizes ecosystem growth.
Strong protection of openness of reasoning is handled via:
open protocol and data model (CC BY-SA),
open documentation (CC BY-SA),
AGPL for core reasoning components,
and explicit transparency rules.
The decision to implement this hybrid model should be made explicitly before the first public release.
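
A minimal sketch of how the recommended split could be recorded in machine-readable form, for example to cross-check repository `LICENSE` files. The component category names and the exact SPDX identifier used for AGPL (`AGPL-3.0-only` rather than `AGPL-3.0-or-later`) are assumptions for illustration; this page only names "AGPL-3.0" and "MIT", and the hybrid model itself is still a recommendation.

```python
# Hypothetical mapping of component categories to SPDX license identifiers.
# Category names are illustrative; this page does not define them as identifiers.
AGPL = "AGPL-3.0-only"  # assumption: "-only" vs "-or-later" is not decided on this page
MIT = "MIT"

# Core reasoning components listed in section 3.3 (names are made up for this sketch).
CORE_COMPONENTS = {"verdict-engine", "akel-reasoning", "scenario-evaluation"}


def license_for(component: str) -> str:
    """Return the recommended license: AGPL for core reasoning, MIT otherwise."""
    return AGPL if component in CORE_COMPONENTS else MIT


if __name__ == "__main__":
    for name in ("verdict-engine", "frontend-client"):
        print(f"{name}: {license_for(name)}")
```

Keeping MIT as the fallback mirrors the "Default License" rule above: a component is only treated as AGPL when it is explicitly listed as core.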

3.4 Structured Data & Curation Artefacts

Structured data, curated knowledge artefacts and derived datasets are licensed under:

Open Database License (ODbL)
Note on ODbL: The Open Database License includes a share-alike requirement, ensuring derivative databases remain open. This aligns with FactHarbor's commitment to openness and prevents proprietary capture of community-curated data.
Principles:
data used for public reasoning should be:
reusable and remixable,
properly attributed,
versioned and traceable,
kept open through share-alike.
privacy, safety, and legal constraints may require:
partial publication or anonymity,
stronger access control around certain datasets.
Concrete exceptions and more restrictive handling must be documented at dataset level.

3.5 Attribution Guidelines (Non-Mandatory but Recommended)

Beyond what the base licenses already require, FactHarbor encourages (but does not require) that:

user interfaces show a short line such as:
`Powered by FactHarbor (open documentation, open protocol, open data)`
Intent:
strengthen brand recognition and trust,
keep attribution lightweight and compatible with open licenses,
avoid creating extra legal conditions beyond the existing licenses.

4. Licensing Goals and Principles

Earlier "Open Source Model & Licensing" drafts contained valuable reasoning about why strong open-source protections might be needed. The core goals remain relevant, even though the exact license mix has evolved.
FactHarbor's licensing aims to:

Protect openness of reasoning
Users must be able to understand how conclusions were reached.
Code and documentation that materially affect user-visible behaviour should be inspectable or clearly described.
Discourage hostile or misleading forks
Avoid "closed clones" that keep the FactHarbor name or appearance while hiding important changes.
Forks that significantly diverge should use their own branding and not pretend to be official FactHarbor instances.
Make modifications traceable
Substantial changes to code, specs, or governance documents should be documented and versioned.
Users interacting with a service based on FactHarbor should be able to see which version or fork they are using.
Support long-term sustainability and legal clarity
Licenses and governance must be enforceable in practice.
The organisation should have clear standing to protect the project if needed.

5. Contributors, Governance & CLA

5.1 Contributor Journey (from licensing perspective)

The contributor journey (Visitor → New Contributor → Contributor → Trusted Contributor → Moderator) is defined in more detail in the Contributor Processes and Organisation pages.
From a *licensing* perspective, the key points are:

All contributions must be compatible with the chosen licenses (CC, MIT, AGPL, ODbL, etc.).
Contributors confirm that they have the right to contribute the material under these licenses.
Higher-trust roles (Trusted Contributors, Contributors, Moderators) help enforce licensing and attribution rules when reviewing changes.
For full role definitions, see the Organisation / Contributor Processes documentation.

5.2 Contributor License Agreement (CLA)

To keep the legal situation clear and enforceable, FactHarbor uses a Contributor License Agreement (CLA).
See Contributor License Agreement.

5.2.1 Dual Contributor Model

FactHarbor distinguishes between two contributor types with different copyright arrangements:
Unpaid Contributors (Volunteers):

Retain copyright of their contributions
Grant FactHarbor a perpetual, royalty-free license to use and distribute
Enable the project to enforce licenses on their behalf
Maintain attribution in version control and documentation
Paid Contributors (Employees, Contractors):
Transfer copyright to FactHarbor Organisation
Ensures clear ownership for sponsored work
Simplifies long-term governance
Still receive attribution for their contributions
This dual model:
Respects volunteer contributions while preserving their rights
Provides clarity for commercially sponsored work
Ensures FactHarbor can effectively maintain and defend the project
Maintains transparency about contribution sources

5.2.2 Core Intent (All Contributors)

Regardless of contributor type, the CLA ensures:

Contributors grant the FactHarbor organisation:
a perpetual, worldwide, irrevocable license to use, modify, and redistribute their contributions under the project's chosen licenses (CC BY-SA, MIT, AGPL, ODbL, etc.), and
the express right to enforce those licenses and pursue legal action against infringers on their behalf.
This ensures that:
the organisation has clear standing to defend the project legally,
individual contributors do not have to act alone against infringements,
licensing remains enforceable even if contributors become inactive.

5.2.3 Determining Contributor Type

Default: Contributors are considered unpaid volunteers unless they have a written agreement specifying paid status.
Paid Status Indicators: Employment contract, written contracting agreement, or grant/sponsorship agreement.
Transparency: Contributor type should be disclosed where applicable.
See Contributor License Agreement for complete terms.

6. AI Models and Licensing (AKEL)

AKEL (AI Knowledge Extraction Layer) may rely on different types of models. Licensing and transparency rules are crucial here.

6.1 Open vs Proprietary Models

AKEL may use:

Open-source models (preferred):
weights and code are openly available under compatible licenses,
prompts, evaluation logic and integration code are made public where licenses permit.
Proprietary / hosted models (allowed but constrained):
used only when necessary for quality or feasibility,
must be clearly disclosed to the user at point of use,
AKEL must label which parts of its output derive from proprietary tools,
surrounding integration logic remains open (MIT/AGPL or compatible) and is documented.
Rules:
No deployment may suggest "fully open" AI if proprietary models are used without disclosure.
For high-impact reasoning (e.g. health, politics, safety-critical topics), open, auditable models are preferred wherever feasible.
Where proprietary models are unavoidable, additional care is taken to:
document limitations,
avoid overstating certainty,
and keep reasoning layers as transparent as possible.
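
The labelling and disclosure rules above could be supported by a data structure along these lines. This is a sketch only: the class names, field names, and the wording of the notice are hypothetical and do not describe AKEL's actual output format.

```python
from dataclasses import dataclass, field


@dataclass
class OutputSegment:
    text: str
    model: str         # hypothetical model identifier
    proprietary: bool  # True if produced by a proprietary / hosted model


@dataclass
class AkelOutput:
    segments: list[OutputSegment] = field(default_factory=list)

    def uses_proprietary(self) -> bool:
        """True if any part of the output derives from a proprietary model."""
        return any(s.proprietary for s in self.segments)

    def disclosure(self) -> str:
        """Short user-facing notice intended for display at the point of use."""
        names = sorted({s.model for s in self.segments if s.proprietary})
        if not names:
            return "All parts of this output were produced with open models."
        return ("Parts of this output were produced with proprietary models: "
                + ", ".join(names))


if __name__ == "__main__":
    out = AkelOutput([
        OutputSegment("Claim summary", model="open-model-a", proprietary=False),
        OutputSegment("Evidence ranking", model="hosted-model-x", proprietary=True),
    ])
    print(out.disclosure())
```

Recording the flag per segment, rather than per response, is what makes it possible to say which parts of an output derive from proprietary tools.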

6.2 Prompts, Pipelines and Integration Code

Orchestration code, pipelines and evaluation logic around AKEL are treated as part of the open FactHarbor codebase (MIT or AGPL).
Where prompts or model configurations are licensed in a way that restricts publication, this must be documented clearly, and safe abstractions should be used in public documentation.

6.3 AI Prompts and Orchestration

Prompts, system instructions, and orchestration code are considered Code and licensed under MIT or AGPL (depending on component).
They must be visible in the repository to ensure the system is not a 'black box'.
If a proprietary model requires a prompt that cannot be shared (e.g. contractual restriction), that component cannot be part of the open core.

7. Third-Party Libraries and Components

FactHarbor depends on third-party libraries under:

permissive licenses (MIT, Apache-2.0, BSD), and/or
other compatible open-source licenses.
Requirements:
All dependencies must be license-compatible with:
the MIT/AGPL-licensed code,
and the overall FactHarbor licensing strategy.
License information is documented in:
`/LICENSE` and, where applicable, `/NOTICE`,
and a dedicated "Third-Party Licenses" section in project documentation.
FactHarbor actively avoids dependencies that:
restrict redistribution in ways incompatible with open-source norms,
prevent network users from accessing the relevant source,
or conflict with the project's transparency and licensing goals.
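
A dependency check in this spirit might look like the following sketch. The allowlist is an assumption: this page names MIT, Apache-2.0, and BSD as examples, while the specific BSD variants and the dependency names below are illustrative. Licenses outside the list are flagged for manual review rather than rejected, since "other compatible open-source licenses" are also permitted.

```python
# Hypothetical allowlist of SPDX identifiers for permissive licenses.
# The BSD variants listed here are an assumption; the page only says "BSD".
ALLOWED = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause"}


def needs_review(dependencies: dict[str, str]) -> dict[str, str]:
    """Return dependencies whose declared license is not on the allowlist."""
    return {name: lic for name, lic in dependencies.items() if lic not in ALLOWED}


if __name__ == "__main__":
    declared = {
        "example-http-lib": "Apache-2.0",  # made-up dependency names
        "example-db-driver": "SSPL-1.0",
    }
    for name, lic in needs_review(declared).items():
        print(f"review required: {name} ({lic})")
```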

8. Repository Standards

Each official FactHarbor repository must follow a minimum standard.

8.1 Required Files

Each repository should contain at least:

`README` – purpose, scope, status, and how to use it.
`LICENSE` – the applicable license(s) for the repository.
`CONTRIBUTING` – how to propose changes; coding/writing guidelines.
`CODEOWNERS` – who is responsible for which parts.
`CHANGELOG` – human-readable log of important changes.
`SECURITY` (or `SECURITY.md`) – how to report vulnerabilities and how they are handled.
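
The list above lends itself to an automated check. The sketch below assumes that all required files live in the repository root and that any file extension is acceptable (for example `README.md`); neither assumption is stated on this page, and the function name is made up.

```python
from pathlib import Path

REQUIRED = ("README", "LICENSE", "CONTRIBUTING", "CODEOWNERS", "CHANGELOG", "SECURITY")


def missing_files(repo_root: str) -> list[str]:
    """Return the required file names that are absent from the repository root.

    A file counts as present with or without an extension (e.g. README.md).
    Only the root directory is inspected; subdirectories are ignored.
    """
    present = {p.stem.upper() for p in Path(repo_root).iterdir() if p.is_file()}
    return [name for name in REQUIRED if name not in present]


if __name__ == "__main__":
    missing = missing_files(".")
    print("all required files present" if not missing else f"missing: {missing}")
```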

8.2 Prohibited Content

FactHarbor repositories must not contain:

purely ideological advocacy texts unrelated to the project's purpose,
opaque binaries or artefacts that cannot reasonably be inspected or reproduced,
embedded secrets (API keys, passwords, private tokens),
content that materially contradicts the stated licenses or governance rules.
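
For the rule on embedded secrets, a very rough illustration of a pre-commit style scan is shown below. The two patterns (a PEM private key header and the common shape of an AWS access key ID) are examples chosen for this sketch, not a list defined by FactHarbor, and a pattern scan of this kind will miss many secrets; it is no substitute for a dedicated scanning tool.

```python
import re

# Illustrative patterns only; real secret scanners use far larger rule sets.
PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) pairs for lines that look like secrets."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


if __name__ == "__main__":
    sample = "name = 'demo'\n-----BEGIN RSA PRIVATE KEY-----\n"
    for lineno, label in find_secrets(sample):
        print(f"line {lineno}: possible {label}")
```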

9. Historical Licensing Option: AGPLv3 for Core Engine (Non-Normative Background)

Earlier versions of this page explored a strong copyleft option for the core software based on GNU Affero General Public License v3 (AGPLv3).
Those drafts argued that:

AGPLv3, as a network-copyleft license, would:
require modified network services to publish their source to users,
prevent closed forks of the core reasoning engine,
ensure that any public "FactHarbor-like" service stays inspectable.
They also defined:
the scope of AGPLv3 coverage (backend services, AKEL logic, frontend),
expectations for forks (must remain AGPLv3, must declare they are forks),
and the same CLA principles now adapted to the current license mix.
These AGPLv3 considerations have been partially adopted in the hybrid licensing model (section 3.3), where AGPL-3.0 is recommended for core reasoning components.
They are preserved here as design background and may be revisited for specific components or future arrangements.

10. Organisational Transparency

FactHarbor is committed to exceptional transparency in all aspects of its operations, governance, and finances. This commitment is essential to build trust in a system claiming to support well-grounded judgments.

10.1 Financial Transparency

We commit to publishing annually:

Complete financial statements (audited where possible)
Swiss tax filings (annual statements per Swiss law)
Income sources in aggregate (grants, donations, sponsorships)
Expense breakdown by category
Compensation ranges for staff roles (not individual salaries)
Major funding relationships and partnerships

10.2 Governance Transparency

We commit to publishing:

All governance documents (bylaws, policies, procedures)
Governing Team composition and meeting schedules
Governing Team meeting minutes (with narrow exceptions for privacy, security, or legal matters)
Policy changes with rationale and effective dates
Decision-making process documentation
Conflict of interest policies and disclosures

10.3 Operational Transparency

We commit to publishing:

Transparency reports (published twice yearly)
Content moderation statistics and practices
AKEL performance metrics and audit results
Risk tier assignment statistics
Partnership agreements and funding relationships
Incident reports (security, moderation, governance)
System uptime and performance data

10.4 Privacy Protection

While maintaining organisational transparency, we protect:

Individual user privacy and personal data
Security vulnerabilities (until patched, typically 30-90 days)
Personnel matters and personal information
Ongoing legal matters (until resolved)
Whistleblower and abuse reports
Authentication credentials and sensitive operational details

10.5 Review and Oversight

Annual review of all information marked "private"
Public reporting on transparency compliance
Community input opportunities on transparency policies
Appeals process for information requests
Independent transparency audits (when feasible)
See Transparency Policy for complete details.

11. Privacy and Data Protection

FactHarbor is committed to protecting user privacy while maintaining transparency in operations and governance.

11.1 Data Collection Principles

Data minimization: Collect only what is necessary for functionality
Purpose limitation: Use data only for stated purposes
Short retention: Delete data when no longer needed
User control: Provide access, correction, and deletion rights

11.2 User Rights

Users have the right to:

Access their personal data
Correct inaccurate information
Delete their accounts and associated data
Export their data (portability)
Object to certain processing
Lodge complaints with supervisory authorities

11.3 What We Collect

For specific details on data collection practices, retention periods, and processing purposes, see Privacy Policy.
In general:

Public contributions: Permanently public and attributed (essential for transparency)
Account information: Email, username (minimal required data)
Technical data: IP addresses, user agents (short retention; logged-out users)
Usage data: Aggregated, anonymized analytics

11.4 What We Never Do

Sell or rent user data
Share personal data with third parties for marketing
Track users across unrelated sites
Use personal data for purposes beyond stated scope
Keep personal data longer than necessary

11.5 Security Measures

Encryption in transit (TLS/HTTPS)
Encryption at rest for sensitive data
Access controls and authentication
Regular security audits
Incident response procedures
Vulnerability disclosure program
Data Protection Impact Assessments (DPIA) for high-risk processing (required by FADP Article 22)
See Privacy Policy for complete details.

11.6 Data Protection Officer (DPO)

If we serve users in the European Union, we will appoint a Data Protection Officer (DPO) as required by EU GDPR Article 37.
The DPO will:

Advise on data protection compliance
Monitor FADP and GDPR compliance
Serve as contact point for Swiss FDPIC and EU data protection authorities
Conduct privacy audits and DPIAs
Handle data subject requests
Contact (if appointed): [DPO contact to be established if needed]
Note: Swiss law (FADP) does not require a DPO for organisations of our size. However, EU GDPR Article 37 requires a DPO for, among other cases:
Large-scale systematic monitoring of data subjects
Large-scale processing of sensitive personal data (including political opinions, health information)
Given that FactHarbor processes claims containing political opinions and uses AI for systematic evaluation, we commit to appointing a DPO if we process personal data of EU residents.

12. Exceptions and Appeals

12.1 Requesting Information

If you believe FactHarbor should disclose specific organisational information:

1. Submit a written request to [Transparency contact to be established]
2. Specify the information requested and rationale
3. Expect initial response promptly

12.2 Appeals Process

If a transparency request is denied:

1. Appeal to the Transparency Committee (if established)
2. Provide additional context or rationale
3. Expect appeal decision promptly
4. Final appeals may be escalated to the Governing Team

12.3 Exception Criteria

Information may be withheld only if disclosure would:

Violate individual privacy rights
Compromise security (vulnerability, credential)
Violate legal obligations (court order, attorney-client privilege)
Enable abuse or harm (expose victim, enable attack)
Breach fiduciary duty (ongoing confidential negotiations)
All exceptions are time-limited and reviewed annually.
