Azure Files Backup — Requirements Spec

Captured: 2025-01-28 | Domain: Personal | Priority: HIGH

Purpose

POC to prove a point: the right architecture can back up billions of files with minimal database overhead.

This is NOT a Kaseya project — it's Johan demonstrating his design philosophy.

Target

  • Azure Files API specifically
  • NOT Azure Blob Storage
  • NOT OneDrive/SharePoint

Scale Requirements

  • Billions of files
  • 64-bit node IDs required
  • DB must fit in RAM for fast queries (~50GB target)

Database Design (~50 bytes/file)

Field      Type     Size       Purpose
node_id    int64    8 bytes    Unique identifier (billions need 64-bit)
parent_id  int64    8 bytes    Tree structure link
name       varchar  ~20 bytes  Filename only, NOT full path
size       int64    8 bytes    File size in bytes
mtime      int64    8 bytes    Unix timestamp
hash       int64    8 bytes    xorhash (MSFT standard)

Total: ~50–60 bytes/file (40 bytes fixed + name) → ~50–60GB for 1 billion files → fits in RAM

Key Constraints

  • Node tree only — NO full path strings stored
  • Paths reconstructed by walking parent_id to root
  • Rename directory = update 1 row, not millions
  • DB is index + analytics only

Object Storage Design

Everything that doesn't fit in 50 bytes goes here:

  • Full metadata (ACLs, extended attributes, permissions)
  • File content (chunked, deduplicated)
  • Version history
  • FlatBuffer serialized

Bundling

  • TAR format (proven, standard)
  • Only when it saves ops (not for just 2 files)
  • Threshold TBD (likely <64KB or <1MB)

Hash Strategy

  • xorhash — MSFT standard, 64-bit, fast
  • NOT sha256 (overkill for change detection)
  • Used for: change detection, not cryptographic verification

Architecture

~/dev/azure-backup/
├── core/    — library (tree, hash, storage interface, flatbuffer)
├── worker/  — K8s-scalable backup worker (100s of workers)
├── api/     — REST API for GUI
└── web/     — Go templates + htmx

Worker Design

  • Stateless K8s pods
  • Horizontal scaling (add pods, auto-claim work)
  • Job types: scan, backup, restore, verify
  • Queue: Postgres SKIP LOCKED (works up to ~1000 workers)

Multi-Tenant

  • Isolated by tenant_id + share_id
  • Each tenant+share gets separate node tree
  • Object paths: {tenant_id}/{share_id}/{node_id}

GUI Requirements

  • Web UI: Go + htmx/templ
  • Multi-tenant view (not single-tenant)

Meta

  • Language: Go throughout (core library, workers, API, web)
  • Repo: ~/dev/azure-backup
  • License: Proprietary
  • Type: Personal POC (prove a point)

Open Questions (resolved)

  • 64-bit node IDs (billions of files)
  • xorhash not sha256
  • TAR bundling
  • Multi-tenant GUI
  • Proprietary license

Status

  • ✅ Requirements captured
  • ✅ Repo scaffolded
  • ✅ ARCHITECTURE.md written
  • ✅ FlatBuffer schema + Go code generated
  • ✅ Azure SDK integration (real client implementation)
  • ✅ Web UI (Go + htmx + Tailwind)
  • ✅ 4,400+ lines of Go code
  • 🔲 Azure free trial account (needs Johan)
  • 🔲 Database integration (Postgres)
  • 🔲 End-to-end test with real Azure Files