= Federation & Decentralization =

FactHarbor is designed to operate as a **federated network of nodes** rather than a single central server.

Decentralization provides:

* **Resilience** against censorship or political pressure

* **Autonomy** for local governance and moderation

* **Scalability** across many independent communities

* **Trust** without centralized control

* **Domain specialization** (health-focused nodes, energy-focused nodes, etc.)

FactHarbor draws inspiration from the Fediverse but uses stronger structure, versioning, and integrity guarantees.

----

== Federated FactHarbor Nodes ==

Each FactHarbor instance ("node") maintains:

* Its own database

* Its own AKEL instance

* Its own reviewers, experts, and contributors

* Its own governance rules

Nodes exchange structured information:

* Claims

* Scenarios

* Evidence metadata (not necessarily full files)

* Verdicts (optional)

* Hashes and signatures for integrity

Nodes choose which external nodes they trust.

----

== Global Identifiers ==

Every entity receives a globally unique, linkable identifier.

**Format**:

`factharbor://node_url/type/local_id`

**Example**:

`factharbor://factharbor.energy/claim/CLM-55812`

**Supported types**:

* `claim`

* `scenario`

* `evidence`

* `verdict`

* `user` (optional)

* `cluster`

**Properties**:

* Globally consistent

* Human-readable

* Hash-derived

* Independent of database internals

* URL-resolvable (future enhancement)

This allows cross-node references and prevents identifier collisions in federated environments.
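
As an illustration, an identifier in this format can be split into its parts with Python's standard library. This is only a sketch: the `GlobalId` class and `parse_global_id` function are hypothetical names, not part of the FactHarbor specification, and the validation rules are assumptions drawn from the format and type list above.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Entity types listed under "Supported types" above.
SUPPORTED_TYPES = {"claim", "scenario", "evidence", "verdict", "user", "cluster"}


@dataclass(frozen=True)
class GlobalId:
    """Hypothetical container for a parsed identifier."""
    node: str
    entity_type: str
    local_id: str

    def __str__(self) -> str:
        return f"factharbor://{self.node}/{self.entity_type}/{self.local_id}"


def parse_global_id(value: str) -> GlobalId:
    """Split factharbor://node_url/type/local_id into its three parts."""
    parts = urlsplit(value)
    if parts.scheme != "factharbor" or not parts.netloc:
        raise ValueError(f"not a FactHarbor identifier: {value!r}")
    segments = parts.path.strip("/").split("/")
    if len(segments) != 2 or not all(segments):
        raise ValueError(f"expected /type/local_id in {value!r}")
    entity_type, local_id = segments
    if entity_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    return GlobalId(parts.netloc, entity_type, local_id)
```

Because the node address is part of the identifier, two nodes can reuse the same local ID without colliding.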

----

== Trust Model ==

Each node maintains a **trust table** defining relationships with other nodes:

=== Trust Levels ===

**Trusted Nodes**:

* Claims auto-imported

* Scenarios accepted without re-review

* Evidence considered valid

* Verdicts displayed to users

* High synchronization priority

**Neutral Nodes**:

* Claims imported but flagged for review

* Scenarios require local validation

* Evidence requires re-assessment

* Verdicts shown with "external node" disclaimer

* Normal synchronization priority

**Untrusted Nodes**:

* Claims quarantined, manual import only

* Scenarios rejected by default

* Evidence not accepted

* Verdicts not displayed

* No automatic synchronization
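
The three levels above could be encoded as a lookup table that the import pipeline consults per source node. The sketch below is illustrative only: the action labels, the `POLICY` table, and `policy_for` are hypothetical, and treating a node that is missing from the trust table as untrusted is an assumption the specification does not state.

```python
from enum import Enum


class TrustLevel(Enum):
    TRUSTED = "trusted"
    NEUTRAL = "neutral"
    UNTRUSTED = "untrusted"


# Per-level handling, transcribed from the three lists above.
# The string labels are hypothetical names for those behaviours.
POLICY = {
    TrustLevel.TRUSTED: {
        "claims": "auto_import", "scenarios": "accept", "evidence": "accept",
        "verdicts": "show", "sync": "high",
    },
    TrustLevel.NEUTRAL: {
        "claims": "import_flagged", "scenarios": "validate_locally",
        "evidence": "reassess", "verdicts": "show_with_disclaimer", "sync": "normal",
    },
    TrustLevel.UNTRUSTED: {
        "claims": "quarantine", "scenarios": "reject", "evidence": "reject",
        "verdicts": "hide", "sync": "none",
    },
}


def policy_for(trust_table: dict, node_url: str) -> dict:
    """Look up the handling rules for a node (assumption: unknown means untrusted)."""
    level = trust_table.get(node_url, TrustLevel.UNTRUSTED)
    return POLICY[level]
```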

=== Trust Affects ===

* **Auto-import**: Whether claims/scenarios are automatically added

* **Re-review requirements**: Whether local reviewers must validate

* **Verdict display**: Whether external verdicts are shown to users

* **Synchronization frequency**: How often data is exchanged

* **Reputation signals**: How external reputation is interpreted

=== Local Trust Authority ===

Each node's governance team decides:

* Which nodes to trust

* Trust level criteria

* Trust escalation/degradation rules

* Dispute resolution with partner nodes

Trust is **local and autonomous**: no global trust registry exists.

----

== Data Sharing Model ==

=== What Nodes Share ===

**Always shared** (if federation enabled):

* Claims and claim clusters

* Scenario structures

* Evidence metadata and content hashes

* Integrity signatures

**Optionally shared**:

* Full evidence files (large documents)

* Verdicts (nodes may choose to keep verdicts local)

* Vector embeddings

* Scenario templates

* AKEL distilled knowledge

**Never shared**:

* Internal user lists

* Reviewer comments and internal discussions

* Governance decisions and meeting notes

* Access control data

* Private or sensitive content marked as local-only
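
One way to enforce these three categories is an allow-list applied before anything leaves the node. The sketch below assumes hypothetical kind names such as `evidence_metadata`; the specification does not define them. Anything not explicitly listed is treated as never shared.

```python
# Hypothetical kind names for the categories listed above.
ALWAYS_SHARED = {
    "claim", "claim_cluster", "scenario",
    "evidence_metadata", "content_hash", "signature",
}
OPTIONALLY_SHARED = {
    "evidence_file", "verdict", "embedding",
    "scenario_template", "akel_knowledge",
}


def outgoing_kinds(federation_enabled: bool, opted_in: set) -> set:
    """Return the kinds of data this node may send to partners.

    Kinds outside both sets (user lists, reviewer comments, access
    control data, local-only content) can never appear in the result,
    even if they are named in opted_in.
    """
    if not federation_enabled:
        return set()
    return ALWAYS_SHARED | (OPTIONALLY_SHARED & set(opted_in))
```

Using an allow-list rather than a block-list means a newly introduced internal data kind stays private until someone deliberately adds it.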

=== Large Evidence Files ===

Evidence files are:

* Stored locally by default

* Referenced via global content hash

* Optionally served through IPFS

* Accessible via direct peer-to-peer transfer

* Optionally stored in S3-compatible object storage

----

== Synchronization Protocol ==

Nodes exchange data using multiple synchronization methods:

=== Push-Based Synchronization ===

**Mechanism**: Webhooks

When local content changes:

1. Node builds signed bundle

2. Sends webhook notification to subscribed nodes

3. Remote nodes fetch bundle

4. Remote nodes validate and import

**Use case**: Real-time updates for trusted partners
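
The first and last steps of the push flow (build a signed bundle, validate it on receipt) might look like the sketch below. The specification does not name a signature scheme. Python's standard library has no public-key signatures, so HMAC-SHA256 with a shared key stands in here; a real deployment would more likely use an asymmetric scheme so that receivers cannot forge bundles. The function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_bundle(bundle: dict, key: bytes) -> dict:
    """Serialize the bundle deterministically and attach an HMAC-SHA256 tag."""
    payload = json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode("utf-8"), "signature": tag}


def verify_bundle(signed: dict, key: bytes) -> dict:
    """Recompute the tag and return the bundle only if it matches."""
    expected = hmac.new(key, signed["payload"].encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(expected, signed["signature"]):
        raise ValueError("signature mismatch")
    return json.loads(signed["payload"])
```

Sorting keys and fixing the separators makes the serialized form deterministic, so sender and receiver compute the tag over identical bytes.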

=== Pull-Based Synchronization ===

**Mechanism**: Scheduled polling

Nodes periodically:

1. Query partner nodes for updates

2. Fetch changed entities since last sync

3. Validate and import

4. Store sync checkpoint

**Use case**: Regular batch updates, lower trust nodes
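
The polling loop above can be sketched with an in-memory stand-in for the partner node. Everything here is hypothetical: `PartnerStub`, `PullSync`, and the use of a monotonically increasing sequence number as the checkpoint are assumptions, since the specification does not say how checkpoints are represented.

```python
from dataclasses import dataclass, field


@dataclass
class PartnerStub:
    """Stand-in for a remote node; a real one would be queried over the network."""
    changes: list  # (sequence_number, entity) pairs in ascending order

    def changes_since(self, checkpoint: int) -> list:
        return [c for c in self.changes if c[0] > checkpoint]


@dataclass
class PullSync:
    checkpoint: int = 0
    imported: list = field(default_factory=list)

    def run_once(self, partner, is_valid) -> int:
        """Query, fetch, validate and import, then store the new checkpoint."""
        count = 0
        for seq, entity in partner.changes_since(self.checkpoint):
            if is_valid(entity):
                self.imported.append(entity)
                count += 1
            # Advance past rejected entities too, so they are not refetched.
            self.checkpoint = seq
        return count
```

Calling `run_once` on a schedule gives the batch behaviour described above; a second call with no new changes imports nothing.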

=== Subscription-Based Synchronization ===

**Mechanism**: WebSub-like protocol

Nodes subscribe to:

* Specific claim clusters

* Specific domains (medical, energy, etc.)

* Specific scenario types

* Verdict updates

Publisher pushes updates only to subscribers.

**Use case**: Selective federation, domain specialization
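
On the publisher side, subscription matching reduces to filtering subscribers by the topics they registered. The sketch below uses hypothetical `Update` and `Subscription` types and assumes that an update is delivered when it matches any one of a subscriber's filters; the specification does not define the matching rule.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Update:
    cluster: str
    domain: str
    scenario_type: Optional[str] = None
    is_verdict: bool = False


@dataclass
class Subscription:
    subscriber: str
    clusters: set = field(default_factory=set)
    domains: set = field(default_factory=set)
    scenario_types: set = field(default_factory=set)
    verdict_updates: bool = False

    def matches(self, update: Update) -> bool:
        """True if the update falls under any filter of this subscription."""
        return (
            update.cluster in self.clusters
            or update.domain in self.domains
            or update.scenario_type in self.scenario_types
            or (update.is_verdict and self.verdict_updates)
        )


def recipients(update: Update, subscriptions: list) -> list:
    """Subscribers that should receive this update, in registration order."""
    return [s.subscriber for s in subscriptions if s.matches(update)]
```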

=== Large Asset Transfer ===

For files larger than 10 MB:

* S3-compatible object storage

* IPFS (content-addressed)

* Direct peer-to-peer transfer

* Chunked HTTP transfer with resume support
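
For the chunked option, resuming comes down to computing which byte ranges are still missing. The helper below is a sketch with a hypothetical name and an assumed default chunk size of 1 MiB. It returns inclusive `(start, end)` pairs, which is the form the HTTP `Range: bytes=start-end` header expects.

```python
def remaining_ranges(total_size: int, bytes_received: int,
                     chunk_size: int = 1024 * 1024) -> list:
    """Return the inclusive (start, end) byte ranges still to be fetched."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= bytes_received <= total_size:
        raise ValueError("bytes_received must lie between 0 and total_size")
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(bytes_received, total_size, chunk_size)
    ]
```

After an interrupted transfer, the receiver passes the number of bytes already on disk and requests only the ranges returned.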

----

== Federation Sync Workflow ==

Complete synchronization sequence for creating and sharing new content:

=== Step 1: Local Node Creates New Versions ===

User or AKEL creates:

* New claim version

* New scenario version

* New evidence version

* New verdict version

All changes tracked with:

* VersionID

* ParentVersionID

* AuthorType

* Timestamp

* JustificationText
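
The five tracking fields map naturally onto an immutable record. In this sketch the Python field names mirror the list above, while the `new_version` helper, the random hex version IDs, and the example author types are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class VersionRecord:
    version_id: str
    parent_version_id: Optional[str]  # None for the first version of an entity
    author_type: str                  # e.g. "human" or "akel" (assumed values)
    timestamp: datetime
    justification_text: str


def new_version(parent: Optional[VersionRecord], author_type: str,
                justification: str) -> VersionRecord:
    """Create a version that points back at its parent, if there is one."""
    return VersionRecord(
        version_id=uuid.uuid4().hex,
        parent_version_id=parent.version_id if parent else None,
        author_type=author_type,
        timestamp=datetime.now(timezone.utc),
        justification_text=justification,
    )
```

Following `parent_version_id` from any record back to `None` yields the lineage that the later steps package and validate.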

=== Step 2: Federation Layer Builds Signed Bundle ===

Federation layer packages:

* Entity data (claim, scenario, evidence metadata, verdict)

* Version lineage (ParentVersionID chain)

* Cryptographic signatures

* Node provenance information

* Trust metadata

Bundle format:

* JSON-LD for structured data

* Content-addressed hashes

* Digital signatures for integrity

=== Step 3: Bundle Includes Required Data ===

Each bundle contains:

* **Claims**: Full claim text, classification, domain

* **Scenarios**: Definitions, assumptions, boundaries

* **Evidence metadata**: Source URLs, hashes, reliability scores (not always full files)

* **Verdicts**: Likelihood ranges, uncertainty, reasoning chains

* **Lineage**: Version history, parent relationships

* **Signatures**: Cryptographic proof of origin
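
A bundle with these parts could be assembled as a JSON-LD-style document whose hash is computed over a canonical serialization. The sketch below is an assumption-heavy illustration: the key names, the placeholder `@context` URL, and the `sha256:` prefix are all hypothetical, and the signature step is left out to keep the focus on content addressing.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Deterministic JSON encoding, so equal content yields equal hashes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_bundle(node_url: str, entities: list, lineage: list) -> dict:
    """Package entities and lineage, then stamp the bundle with its hash."""
    body = {
        "@context": "https://example.org/factharbor-context",  # placeholder URL
        "origin": node_url,
        "entities": entities,
        "lineage": lineage,
    }
    # The hash covers everything except the contentHash field itself.
    body["contentHash"] = "sha256:" + hashlib.sha256(canonical_bytes(body)).hexdigest()
    return body


def hash_matches(bundle: dict) -> bool:
    """Recompute the hash on the receiving side and compare."""
    body = {k: v for k, v in bundle.items() if k != "contentHash"}
    expected = "sha256:" + hashlib.sha256(canonical_bytes(body)).hexdigest()
    return bundle.get("contentHash") == expected
```

A matching hash shows only that the bundle was not altered after hashing; proof of origin still depends on the signature.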

=== Step 4: Bundle Pushed to Trusted Neighbor Nodes ===

Based on trust table:

* Push to **trusted nodes** immediately

* Queue for **neutral nodes** (batched)

* Skip **untrusted nodes**

Push methods:

* Webhook notification

* Direct API call

* Pub/Sub message queue
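
The routing rule in this step is a three-way split of the trust table. The sketch below assumes the table maps node addresses to the level names used above; `route_bundle` and the plan keys are hypothetical.

```python
def route_bundle(trust_table: dict) -> dict:
    """Split partner nodes into immediate pushes, a batch queue, and skips."""
    actions = {"trusted": "push_now", "neutral": "queue", "untrusted": "skip"}
    plan = {"push_now": [], "queue": [], "skip": []}
    # Sorting keeps the plan stable across runs.
    for node, level in sorted(trust_table.items()):
        plan[actions[level]].append(node)
    return plan
```

The actual delivery (webhook, API call, or message queue) would then be applied to the `push_now` list at once and to the `queue` list at the next batch.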

=== Step 5: Remote Nodes Validate Lineage and Signatures ===

Receiving node:

1. Verifies cryptographic signatures

2. Validates version lineage (ParentVersionID chain)

3. Checks for conflicts with local data

4. Validates data structure and required fields

5. Applies local trust policies

If any check fails, the bundle is rejected or quarantined.

=== Step 6: Accept or Branch Versions ===

**Accept** (if validation passes):

* Import new versions

* Maintain provenance metadata

* Link to local related entities

* Update local indices

**Branch** (if conflict detected):

* Create parallel version tree

* Mark as "external branch"

* Allow local reviewers to merge or reject

* Preserve both version histories

**Reject** (if validation fails):

* Log rejection reason

* Notify source node (optional)

* Quarantine for manual review (optional)
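
The accept, branch, or reject decision can be sketched as a function of the lineage check and the local state. The specification does not define what counts as a conflict. This example assumes one arises when the incoming chain builds on a parent version that is not a current local head, meaning the local history has diverged. All names are hypothetical.

```python
def lineage_is_valid(versions: list) -> bool:
    """Each version after the first must name its predecessor as parent."""
    return all(
        cur["parent"] == prev["id"]
        for prev, cur in zip(versions, versions[1:])
    )


def decide(incoming: list, local_heads: set, signature_ok: bool) -> str:
    """Return "accept", "branch" or "reject" for an incoming version chain."""
    if not signature_ok or not incoming or not lineage_is_valid(incoming):
        return "reject"
    parent = incoming[0]["parent"]
    # A brand-new entity, or one that extends a local head, imports cleanly.
    if parent is None or parent in local_heads:
        return "accept"
    # Otherwise keep both histories and let local reviewers merge or reject.
    return "branch"
```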

=== Step 7: Local Re-evaluation Runs if Required ===

After import, local node checks:

* Does new evidence affect existing verdicts?

* Do new scenarios require re-assessment?

* Are there contradictions with local content?

If yes:

* Trigger AKEL re-evaluation

* Queue for reviewer attention

* Update affected verdicts

* Notify users following related content
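
The first check, whether new evidence affects existing verdicts, could be implemented as a lookup over shared scenario references. This sketch assumes that both evidence and verdicts are linked to scenarios by ID, which is a simplification of the data model. The function name and dictionary shapes are hypothetical.

```python
def verdicts_to_reevaluate(new_evidence: list, verdicts: dict) -> list:
    """Find verdicts whose scenario received new evidence.

    new_evidence: dicts carrying a "scenario" key (assumed shape).
    verdicts: mapping of verdict ID to scenario ID (assumed shape).
    """
    touched = {item["scenario"] for item in new_evidence}
    return sorted(v for v, scenario in verdicts.items() if scenario in touched)
```

The returned IDs are the candidates to hand to AKEL re-evaluation and the reviewer queue.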

----

== Cross-Node AI Knowledge Exchange ==

Each node runs its own AKEL instance and may exchange AI-derived knowledge:

=== What Can Be Shared ===

**Vector embeddings**:

* For cross-node claim clustering

* For semantic search alignment

* Never includes training data
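
To show how shared embeddings support cross-node clustering, the sketch below pairs remote claims with their nearest local claim by cosine similarity. It assumes both nodes produce embeddings with the same model, since vectors from different models are not comparable. The 0.9 threshold and the function names are illustrative choices, not values from the specification.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_remote_claims(local: dict, remote: dict, threshold: float = 0.9) -> dict:
    """Map each remote claim ID to its most similar local claim, if close enough."""
    matches = {}
    for remote_id, remote_vec in remote.items():
        best_id, best_score = None, threshold
        for local_id, local_vec in local.items():
            score = cosine_similarity(local_vec, remote_vec)
            if score >= best_score:
                best_id, best_score = local_id, score
        if best_id is not None:
            matches[remote_id] = best_id
    return matches
```

Remote claims with no local neighbour above the threshold are left unmatched and can seed new clusters.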

**Canonical claim forms**:

* Normalized claim text

* Standard phrasing templates

* Domain-specific formulations

**Scenario templates**:

* Reusable scenario structures

* Common assumption patterns

* Evaluation method templates

**Contradiction alerts**:

* Detected conflicts between claims

* Evidence conflicts across nodes

* Scenario incompatibilities

**Metadata and insights**:

* Aggregate quality metrics

* Reliability signal extraction

* Bubble detection patterns

=== What Can NEVER Be Shared ===

**Model weights**: No sharing of trained model parameters

**Training data**: No sharing of full training datasets

**Local governance overrides**: AKEL suggestions can be overridden locally

**User behavior data**: No cross-node tracking

**Internal review discussions**: Private content stays private

=== Benefits of AI Knowledge Exchange ===

* Reduced duplication across nodes

* Improved claim clustering accuracy

* Faster contradiction detection

* Shared scenario libraries

* Cross-node quality improvements

=== Local Control Maintained ===

* Nodes accept or reject shared AI knowledge

* Human reviewers can override any AKEL suggestion

* Local governance always has final authority

* No external AI control over local content

* Privacy-preserving knowledge exchange

----

== Decentralized Processing ==

Each node independently performs:

* AKEL processing

* Scenario drafting and validation

* Evidence review

* Verdict calculation

* Truth landscape summarization

Nodes can specialize:

* Health-focused node with medical experts

* Energy-focused node with domain knowledge

* Small node delegating scenario libraries to partners

* Regional node with language/culture specialization

Optional data sharing includes:

* Embeddings for clustering

* Claim clusters for alignment

* Scenario templates for efficiency

* Verdict comparison metadata

----

== Scaling to Thousands of Users ==

Nodes scale independently through:

* Horizontally scalable API servers

* Worker pools for AKEL tasks

* Hybrid storage (local + S3/IPFS)

* Redis caching for performance

* Sharded or partitioned databases

Federation adds a further scaling dimension: total capacity grows as new nodes join, so no single node has to serve every community.

Communities may form:

* Domain-specific nodes (epidemiology, energy, climate)

* Language or region-based nodes

* NGO or institutional nodes

* Private organizational nodes

* Academic research nodes

Nodes cooperate through:

* Scenario library sharing

* Shared or overlapping claim clusters

* Expert delegation between nodes

* Distributed AKEL task support

* Cross-node quality audits

----

== Federation and Release 1.0 ==

**POC**: Single node, optional federation experiments

**Beta 0**: 2-3 nodes, basic federation protocol

**Release 1.0**: Full federation support with:

* Robust synchronization protocol

* Trust model implementation

* Cross-node AI knowledge exchange

* Federated search and discovery

* Distributed audit collaboration

* Inter-node expert consultation

----

== Related Pages ==

* [[AKEL (AI Knowledge Extraction Layer)>>Archive.FactHarbor V0\.9\.18 copy.Specification.AI Knowledge Extraction Layer (AKEL).WebHome]]

* [[Data Model>>FactHarbor.Archive.FactHarbor V0\.9\.18 copy.Specification.Data Model.WebHome]]

* [[Architecture>>Archive.FactHarbor V0\.9\.18 copy.Specification.Architecture.WebHome]]

* [[Workflows>>FactHarbor.Archive.FactHarbor V0\.9\.18 copy.Specification.Workflows.WebHome]]
