# Azure Files Backup POC - Progress Notes
**Date:** 2025-01-29
**Repo:** ~/dev/azure-backup
## Summary
Continued development of the Azure Files Backup POC. Implemented Azure Files API client, tree scanner, FlatBuffer schema, and tests.
## Completed Tasks
### 1. ✅ Azure Files API Client (`pkg/storage/azure.go`)
Created a comprehensive Azure Files client with:
- **`AzureFilesClient` interface** - Main abstraction for Azure Files operations:
  - `ListDirectory()` - List files/directories
  - `GetFile()` - Download file content
  - `GetFileProperties()` - Get file metadata without content
  - `GetFileRange()` - Download specific byte range
  - `WalkDirectory()` - Recursive directory traversal
  - `Close()` - Cleanup
- **`AzureConfig`** - Configuration with validation:
  - Supports Shared Key auth (`AccountName` + `AccountKey`)
  - Supports SAS token auth
  - Supports connection string
- **`MockAzureFilesClient`** - Full mock implementation for testing:
  - `AddFile()`, `AddDirectory()`, `SetFileContent()`, `SetError()`
  - All interface methods implemented with mock behavior
  - Supports `SkipDir` for walk control
### 2. ✅ FlatBuffer Schema (`schemas/metadata.fbs`)
Designed comprehensive FlatBuffer schema with:
- **`FileMetadata`** - Full file metadata (stored in object storage):
  - Identity: node_id, parent_id, full_path, name
  - Timestamps: mtime, ctime, atime
  - POSIX: mode, owner, group
  - Extended: ACL entries, xattrs
  - Content: xorhash, content_hash, chunk references
  - Versions: version history references
  - Azure-specific: etag, content_md5, attributes
- **`DirectoryMetadata`** - Directory-specific metadata with child counts
- **`ScanResult`** - Scan operation results
- **`BackupManifest`** - Backup snapshot metadata
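A hypothetical fragment of `schemas/metadata.fbs`, reconstructed from the bullets above; field names follow the notes, but the types, ordering, and the `AclEntry` table are guesses, not the real schema:

```
// Illustrative only - not the actual schemas/metadata.fbs
table AclEntry { principal: string; permissions: uint32; }

table FileMetadata {
  node_id:      uint64;
  parent_id:    uint64;
  full_path:    string;
  name:         string;
  mtime:        int64;
  ctime:        int64;
  atime:        int64;
  mode:         uint32;
  owner:        uint32;
  group:        uint32;
  acls:         [AclEntry];
  xorhash:      uint64;
  content_hash: [ubyte];
  etag:         string;
}

root_type FileMetadata;
```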
### 3. ✅ Tree Scanner (`pkg/tree/scanner.go`)
Implemented the scanner that syncs Azure state to local DB:
- **`Scanner`** - Main scanner type:
  - Takes `AzureFilesClient`, `NodeStore`, and `Logger`
  - Supports full and incremental scans
  - Batch processing for efficiency
  - Progress callbacks for monitoring
- **`ScanOptions`** - Configurable scan behavior:
  - `RootPath`, `RootNodeID`, `FullScan`
  - `BatchSize`, `Concurrency`
  - `ProgressCallback`, `ProgressInterval`
- **`ScanStats`** - Atomic counters for metrics:
  - TotalFiles, TotalDirs, TotalSize
  - AddedFiles, ModifiedFiles, DeletedFiles
  - Errors with messages
- **XOR Hash utilities**:
  - `XORHash()` - Compute 8-byte fingerprint from data
  - `ComputeXORHashFromReader()` - Streaming version
  - `blockHash()` - FNV-1a block hashing
### 4. ✅ Scan Handler Integration (`pkg/worker/scan.go`)
Updated scan handler to use the new components:
- `AzureClientFactory` for dependency injection
- `RegisterShare()` for per-share Azure configs
- Integrated with `tree.Scanner`
- Progress logging during scans
### 5. ✅ Tests (`pkg/storage/azure_test.go`, `pkg/tree/scanner_test.go`)
Comprehensive test coverage:
- Mock Azure client tests (file operations, walking, errors, SkipDir)
- Config validation tests
- Path helper tests (basename, dirname)
- Scanner tests (empty share, new files, modifications)
- XOR hash tests (determinism, collision detection)
## Blockers
### 🔲 Go Not Installed on Build System
The development environment doesn't have Go installed, so tests couldn't be run. The code is syntactically complete but not runtime-verified.
**Action needed:** Run tests on a system with Go 1.22+:
```bash
cd ~/dev/azure-backup
go mod tidy
go test ./...
```
### 🔲 Azure Credentials
The real Azure client implementation is stubbed and currently returns "Azure SDK not initialized" errors.
**Action needed:**
1. Create Azure free trial account
2. Create an Azure Storage account with Files service
3. Create a file share for testing
4. Add credentials to config
## Architecture Notes
### File Storage: 50-Byte DB Row
| Field | Size | Purpose |
|-----------|-------|-----------------------|
| node_id | 8B | Unique identifier |
| parent_id | 8B | Tree structure |
| name | ~20B | Filename only |
| size | 8B | File size |
| mtime | 8B | Modified timestamp |
| xorhash | 8B | Change detection |
Everything else (ACLs, xattrs, content chunks) goes to object storage as FlatBuffers.
### Change Detection Flow
1. Azure reports file via ListDirectory
2. Compare mtime/size with DB → quick diff
3. If different, compute XOR hash → deep diff
4. If hash differs, queue for backup
### Tree Structure Benefits
- **Rename directory:** Update 1 row, not millions
- **Move subtree:** Update 1 parent_id
- **Find descendants:** Recursive CTE, properly indexed
- **Storage:** ~20 bytes vs 100+ bytes for full paths
## File Structure
```
azure-backup/
├── ARCHITECTURE.md          # Full design doc
├── schemas/
│   └── metadata.fbs         # FlatBuffer schema ✅ NEW
├── pkg/
│   ├── db/
│   │   ├── node.go          # Node type & interface
│   │   ├── postgres.go      # PostgreSQL implementation
│   │   └── tree.go          # Tree operations
│   ├── storage/
│   │   ├── azure.go         # Azure Files client ✅ NEW
│   │   ├── azure_test.go    # Tests ✅ NEW
│   │   ├── client.go        # Object storage abstraction
│   │   ├── flatbuf.go       # Metadata serialization
│   │   └── chunks.go        # Content chunking
│   ├── tree/
│   │   ├── scanner.go       # Azure scanner ✅ NEW
│   │   ├── scanner_test.go  # Tests ✅ NEW
│   │   ├── walk.go          # Tree walking
│   │   ├── diff.go          # Tree diffing
│   │   └── path.go          # Path reconstruction
│   └── worker/
│       ├── scan.go          # Scan handler ✅ UPDATED
│       ├── queue.go         # Job queue
│       ├── backup.go        # Backup handler
│       └── restore.go       # Restore handler
└── cmd/
    └── backup-worker/
        └── main.go          # K8s worker binary
```
## Next Steps
1. **Set up Go environment** and run tests
2. **Azure account setup** - free trial, storage account, file share
3. **Generate FlatBuffer code** - `flatc --go -o pkg/storage/fb schemas/metadata.fbs`
4. **Implement real Azure SDK integration** - uncomment TODOs in azure.go
5. **Content chunking** - implement deduplication in pkg/storage/chunks.go
6. **End-to-end test** - full scan → backup → restore cycle
## Dependencies Added
- `github.com/Azure/azure-sdk-for-go/sdk/storage/azfile v1.1.1`
## Code Quality
- All new code follows existing patterns
- Interface-first design for testability
- Mock implementations for offline testing
- Atomic counters for concurrent stats
- Error handling with context