dealspace/WATERMARK-SPEC.md

Dealspace — Watermark & File Protection Pipeline

Version: 0.1 — 2026-02-28
Status: Design specification for section 6.2 of SPEC.md
Author: James (subagent)


1. Design Principles

  1. Original files are sacred. Storage always contains the clean, unmodified original. Watermarks are applied at serve time, never persisted.

  2. Watermarks are forensic, not decorative. If a document leaks, we must trace it to a specific user/org/timestamp. Watermarks are evidence, not theater.

  3. FIPS 140-3 throughout. All crypto operations use FIPS-approved algorithms. No exceptions.

  4. Performance over perfection. A 50ms watermark that traces 99% of leaks beats a 5s watermark that's "perfect." Users won't wait.

  5. Graceful degradation. If watermarking fails (corrupted file, unsupported variant), serve with audit log + fallback watermark strategy, never block access entirely.


2. Watermark Content

Standard watermark string (configurable per project):

{user_name} · {org_name} · {iso_timestamp} · CONFIDENTIAL

Example:

John Smith · Acme Capital · 2026-02-28T14:32:17Z · CONFIDENTIAL

2.1 Watermark Variants

| Variant | Use Case | Content |
|---|---|---|
| standard | Normal access | Full string as above |
| screen_deterrent | Tiled background for PDF preview | Repeated diagonal pattern |
| minimal | Fallback when processing fails | {user_id}:{timestamp} (short, traceable) |

2.2 Watermark Styling (Project Config)

type WatermarkConfig struct {
    Text          string  // Template: "{user_name} · {org_name} · {iso_timestamp} · CONFIDENTIAL"
    FontFamily    string  // Default: "Helvetica" (PDF), "Calibri" (Office)
    FontSize      int     // Default: 10 (text), 48 (tiled background)
    Color         string  // RGBA hex: "#FF0000AA" (semi-transparent red)
    Position      string  // "footer" | "header" | "diagonal" | "tiled"
    Opacity       float64 // 0.0-1.0, default 0.15 for diagonal/tiled
}

3. File Type Implementations

3.1 PDF Watermarking

Library: github.com/pdfcpu/pdfcpu (pure Go, FIPS-compatible, actively maintained)

Approach:

  1. Parse PDF into memory
  2. Add watermark as text annotation or stamped content on each page
  3. Serialize modified PDF to output stream

Watermark Placement:

  • Footer watermark: Bottom center of each page, 10pt gray text
  • Diagonal tiled (screen deterrent): 45° repeated pattern across entire page, 0.15 opacity

Algorithm:

func WatermarkPDF(input io.Reader, output io.Writer, wm WatermarkParams) error {
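    // NOTE: the pdfcpu identifiers in this function are design-level pseudocode.
    // pdfcpu exposes watermarking through its pkg/api package; pin the exact
    // calls against the version in go.mod during implementation.
    // 1. Read PDF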
    ctx, err := pdfcpu.ReadContext(input, nil)
    if err != nil {
        return fmt.Errorf("pdf parse: %w", err)
    }
    
    // 2. Build watermark spec
    wmSpec := pdfcpu.TextWatermark{
        Text:     wm.Text,
        FontName: "Helvetica",
        FontSize: 10,
        Color:    pdfcpu.Gray,
        Pos:      pdfcpu.BottomCenter,
    }
    
    // 3. Apply to all pages
    if err := pdfcpu.AddWatermarks(ctx, nil, wmSpec); err != nil {
        return fmt.Errorf("pdf watermark: %w", err)
    }
    
    // 4. Optionally add diagonal tiled pattern for screen deterrent
    if wm.ScreenDeterrent {
        tiledSpec := pdfcpu.TextWatermark{
            Text:     wm.Text,
            FontSize: 48,
            Color:    pdfcpu.LightGray,
            Opacity:  0.15,
            Rotation: 45,
            Diagonal: true,
        }
        if err := pdfcpu.AddWatermarks(ctx, nil, tiledSpec); err != nil {
            return fmt.Errorf("pdf tiled watermark: %w", err)
        }
    }
    
    // 5. Write output
    return pdfcpu.WriteContext(ctx, output)
}

Performance:

  • Small PDF (1-10 pages): ~20-50ms
  • Large PDF (100+ pages): ~200-500ms
  • Memory: ~2x file size during processing

Caching: Never cache watermarked PDFs. Each serve includes user-specific timestamp. Caching would serve stale timestamps or wrong user identities. The whole point is forensic traceability.

Edge Cases:

| Case | Handling |
|---|---|
| Password-protected PDF | Reject with error: "Cannot watermark encrypted PDF. Contact administrator." Log to audit. |
| Corrupted PDF | Attempt parse; if fails, serve original with minimal watermark in filename + audit log |
| PDF/A strict | pdfcpu preserves PDF/A compliance; no special handling needed |
| Scanned PDF (images) | Watermark overlays images; no text extraction needed |
| 1000+ page PDF | Stream processing; set timeout at 30s, fallback to minimal if exceeded |

3.2 Word Document (.docx) Watermarking

Library: github.com/unidoc/unioffice (pure Go, Office Open XML manipulation)

Approach:

  1. Unzip DOCX (it's a ZIP of XML files)
  2. Modify word/document.xml to add footer content
  3. Create/modify word/footer1.xml with watermark text
  4. Update [Content_Types].xml and relationships
  5. Rezip and serve

Watermark Placement:

  • Footer: Centered text in document footer, appears on every page
  • Header alternative: For "CONFIDENTIAL" prominence, add to header

Algorithm:

func WatermarkDOCX(input io.Reader, output io.Writer, wm WatermarkParams) error {
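    // 1. Open DOCX. document.Read needs an io.ReaderAt plus the total size,
    //    so buffer the stream first.
    data, err := io.ReadAll(input)
    if err != nil {
        return fmt.Errorf("docx read: %w", err)
    }
    doc, err := document.Read(bytes.NewReader(data), int64(len(data)))
    if err != nil {
        return fmt.Errorf("docx parse: %w", err)
    }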
    
    // 2. Get or create footer
    footer := doc.AddFooter()
    footer.SetParagraphProperties(document.ParagraphStyleFooter)
    
    // 3. Add watermark paragraph
    para := footer.AddParagraph()
    para.SetAlignment(document.AlignmentCenter)
    run := para.AddRun()
    run.AddText(wm.Text)
    run.Properties().SetColor(color.Gray)
    run.Properties().SetSize(10)
    
    // 4. Apply footer to all sections
    for _, section := range doc.Sections() {
        section.SetFooter(footer, document.FooterTypeDefault)
    }
    
    // 5. Save
    return doc.Save(output)
}

Performance:

  • Typical DOCX: ~30-80ms
  • Large DOCX with images: ~100-300ms
  • Memory: ~3x file size (uncompressed XML is verbose)

Caching: Never. Same reasoning as PDF.

Edge Cases:

| Case | Handling |
|---|---|
| Password-protected DOCX | Reject with error. Office encryption prevents modification. |
| Corrupted DOCX | Attempt parse; fallback to encrypted-download-only mode |
| DOCX with existing footer | Append watermark to existing footer, don't replace |
| DOCM (macro-enabled) | Same process; macros preserved. Consider security warning. |
| DOC (legacy binary) | Convert via LibreOffice CLI first, or reject. See 3.2.1. |

3.2.1 Legacy DOC Handling

Binary .doc files cannot be watermarked with pure Go. Options:

  1. Convert to PDF on upload (recommended for M&A — preserves formatting, prevents editing)
  2. LibreOffice CLI conversion at serve time: libreoffice --headless --convert-to docx
  3. Reject with message: "Legacy format. Please upload .docx"

Recommendation: Option 1 for new uploads; Option 3 for existing files in MVP.
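
To apply Option 3 reliably, the upload path needs to recognize a legacy binary file even when it carries a .docx extension. A hedged sketch based on container magic bytes follows; `containerKind` is a hypothetical helper. Password-protected Office Open XML files are also stored in an OLE2 container, so an "ole2" result covers both cases this spec rejects.

```go
package main

import (
	"bytes"
	"fmt"
)

var (
	// OLE2 / Compound File Binary header used by legacy .doc and .xls.
	ole2Magic = []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1}
	// Local file header that starts a ZIP archive (.docx, .xlsx).
	zipMagic = []byte{'P', 'K', 0x03, 0x04}
)

// containerKind classifies a file by its leading bytes.
func containerKind(head []byte) string {
	switch {
	case bytes.HasPrefix(head, ole2Magic):
		return "ole2"
	case bytes.HasPrefix(head, zipMagic):
		return "zip"
	default:
		return "unknown"
	}
}

func main() {
	legacy := []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1, 0x00, 0x00}
	fmt.Println(containerKind(legacy))
	fmt.Println(containerKind([]byte("PK\x03\x04...")))
}
```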


3.3 Excel (.xlsx) Watermarking

Library: github.com/xuri/excelize/v2 (pure Go, actively maintained, 15k+ stars)

Approach:

  1. Open XLSX
  2. For each sheet: insert header row with watermark text
  3. Optionally: add sheet-level "protection" (cosmetic, not security — easily bypassed)
  4. Save to output stream

Watermark Placement:

  • Header row (Row 1): Merged cells spanning data width, light gray background, watermark text
  • Sheet header/footer: Print-only watermark (visible when printed)

Algorithm:

func WatermarkXLSX(input io.Reader, output io.Writer, wm WatermarkParams) error {
    // 1. Open workbook
    f, err := excelize.OpenReader(input)
    if err != nil {
        return fmt.Errorf("xlsx parse: %w", err)
    }
    defer f.Close()
    
    // 2. Watermark each sheet
    for _, sheet := range f.GetSheetList() {
        // Get data dimensions
        dim, _ := f.GetSheetDimension(sheet)
        cols := parseColumnCount(dim) // e.g., "A1:J50" → 10 columns
        
        // Insert row at top
        if err := f.InsertRows(sheet, 1, 1); err != nil {
            continue
        }
        
        // Merge cells for watermark banner
        endCol := columnLetter(cols)
        f.MergeCell(sheet, "A1", endCol+"1")
        
        // Set watermark text
        f.SetCellValue(sheet, "A1", wm.Text)
        
        // Style: light gray background, centered, small font
        styleID, _ := f.NewStyle(&excelize.Style{
            Fill: excelize.Fill{Type: "pattern", Color: []string{"#EEEEEE"}, Pattern: 1},
            Font: &excelize.Font{Size: 9, Color: "#888888"},
            Alignment: &excelize.Alignment{Horizontal: "center"},
        })
        f.SetCellStyle(sheet, "A1", endCol+"1", styleID)
        
        // Add print header/footer
        f.SetHeaderFooter(sheet, &excelize.HeaderFooterOptions{
            OddFooter: "&C" + wm.Text,
        })
    }
    
    // 3. Optional: add sheet protection (cosmetic only)
    if wm.AddProtection {
        for _, sheet := range f.GetSheetList() {
            // excelize v2.8 names this type SheetProtectionOptions
            f.ProtectSheet(sheet, &excelize.SheetProtectionOptions{
                Password:          "", // No password; just prevents casual editing
                SelectLockedCells: true,
            })
        }
    }
    
    // 4. Write output
    return f.Write(output)
}
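
The algorithm above calls `parseColumnCount` and `columnLetter` without defining them. One possible stdlib-only sketch is shown here; the behavior for malformed input (falling back to a single column) is an assumption, and excelize's own coordinate helpers could be used in their place.

```go
package main

import (
	"fmt"
	"strings"
)

// columnNumber converts a column name such as "J" or "AA" to its 1-based
// index. It returns 0 for anything that is not purely A-Z.
func columnNumber(letters string) int {
	n := 0
	for _, c := range strings.ToUpper(letters) {
		if c < 'A' || c > 'Z' {
			return 0
		}
		n = n*26 + int(c-'A') + 1
	}
	return n
}

// parseColumnCount returns the index of the last used column in a sheet
// dimension such as "A1:J50" (10) or "A1" (1).
func parseColumnCount(dim string) int {
	last := dim
	if i := strings.LastIndex(dim, ":"); i >= 0 {
		last = dim[i+1:]
	}
	letters := strings.TrimRight(strings.ReplaceAll(last, "$", ""), "0123456789")
	if n := columnNumber(letters); n > 0 {
		return n
	}
	return 1 // assumption: treat unparseable dimensions as one column
}

// columnLetter converts a 1-based column index back to its name.
func columnLetter(n int) string {
	if n < 1 {
		return "A"
	}
	var b []byte
	for n > 0 {
		n--
		b = append([]byte{byte('A' + n%26)}, b...)
		n /= 26
	}
	return string(b)
}

func main() {
	cols := parseColumnCount("A1:J50")
	fmt.Println(cols, columnLetter(cols))
}
```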

Performance:

  • Small XLSX: ~20-50ms
  • Large XLSX (10k+ rows): ~100-400ms
  • Memory: ~2-4x file size

Caching: Never.

Edge Cases:

| Case | Handling |
|---|---|
| Password-protected XLSX | Reject. Cannot modify encrypted workbook. |
| Workbook with VBA macros (.xlsm) | Process same as .xlsx; macros preserved |
| Very wide sheets (1000+ columns) | Skip merge, add watermark to A1 only |
| Charts/pivot tables | Unaffected; watermark is in data area |
| XLS (legacy binary) | Reject or convert via LibreOffice. Same as DOC. |

3.4 Image Watermarking (JPG, PNG, WebP)

Library: Standard library image + golang.org/x/image + github.com/fogleman/gg (2D graphics)

Approach:

  1. Decode image
  2. Draw semi-transparent text overlay
  3. Encode to output format

Watermark Placement:

  • Bottom-right corner: Primary watermark, semi-transparent white text with drop shadow
  • Tiled diagonal (optional): For high-value images, repeated pattern across entire image

Algorithm:

func WatermarkImage(input io.Reader, output io.Writer, format string, wm WatermarkParams) error {
    // 1. Decode image
    img, _, err := image.Decode(input)
    if err != nil {
        return fmt.Errorf("image decode: %w", err)
    }
    
    bounds := img.Bounds()
    width, height := bounds.Dx(), bounds.Dy()
    
    // 2. Create drawing context
    dc := gg.NewContextForImage(img)
    
    // 3. Calculate font size based on image dimensions
    fontSize := float64(width) / 50 // ~2% of width
    if fontSize < 12 {
        fontSize = 12
    }
    if fontSize > 48 {
        fontSize = 48
    }
    
    if err := dc.LoadFontFace("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", fontSize); err != nil {
        return fmt.Errorf("load font: %w", err)
    }
    
    // 4. Position: bottom-right with padding (text height is not needed)
    textWidth, _ := dc.MeasureString(wm.Text)
    x := float64(width) - textWidth - 20
    y := float64(height) - 20
    
    // 5. Draw drop shadow
    dc.SetRGBA(0, 0, 0, 0.5)
    dc.DrawString(wm.Text, x+2, y+2)
    
    // 6. Draw watermark text
    dc.SetRGBA(1, 1, 1, 0.7)
    dc.DrawString(wm.Text, x, y)
    
    // 7. Optional: diagonal tiled pattern
    if wm.ScreenDeterrent {
        dc.SetRGBA(0.5, 0.5, 0.5, 0.15)
        dc.LoadFontFace("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", fontSize*2)
        for row := -height; row < height*2; row += int(fontSize * 4) {
            for col := -width; col < width*2; col += int(textWidth * 1.5) {
                dc.Push()
                dc.RotateAbout(gg.Radians(45), float64(col), float64(row))
                dc.DrawString(wm.Text, float64(col), float64(row))
                dc.Pop()
            }
        }
    }
    
    // 8. Encode output
    switch format {
    case "jpeg", "jpg":
        return jpeg.Encode(output, dc.Image(), &jpeg.Options{Quality: 90})
    case "png":
        return png.Encode(output, dc.Image())
    case "webp":
        return webp.Encode(output, dc.Image(), &webp.Options{Quality: 90})
    default:
        return png.Encode(output, dc.Image())
    }
}

Performance:

  • Small image (<1MB): ~10-30ms
  • Large image (10MB+): ~100-300ms
  • Memory: ~4x pixel dimensions (RGBA in memory)

Caching: Never.

Edge Cases:

| Case | Handling |
|---|---|
| Animated GIF | Extract first frame, watermark, serve as static. Or reject. |
| Very small image (<200px) | Reduce font size; may become illegible; accept this |
| HEIC/HEIF | Convert to JPEG first (Apple format, limited Go support) |
| TIFF | Decode with golang.org/x/image/tiff; serve as PNG |
| RAW formats | Reject. Convert on upload. |
| SVG | Skip pixel watermarking; add text element to XML |

3.5 Video Watermarking (MP4, MOV)

Tool: FFmpeg (external binary) — no pure Go solution exists for video processing

Approach:

  1. Pipe original video to FFmpeg stdin
  2. FFmpeg overlays text watermark
  3. Stream FFmpeg stdout to HTTP response

Watermark Placement:

  • Bottom-right corner: Semi-transparent text overlay, visible but not distracting
  • Optional burn-in: More prominent for high-sensitivity content

Algorithm:

func WatermarkVideo(ctx context.Context, objectID string, w http.ResponseWriter, wm WatermarkParams) error {
    // 1. Build FFmpeg command
    // Text escape: replace special chars
    escapedText := strings.ReplaceAll(wm.Text, ":", "\\:")
    escapedText = strings.ReplaceAll(escapedText, "'", "\\'")
    
    // drawtext filter
    filter := fmt.Sprintf(
        "drawtext=text='%s':fontsize=24:fontcolor=white@0.7:x=w-tw-20:y=h-th-20:shadowcolor=black@0.5:shadowx=2:shadowy=2",
        escapedText,
    )
    
    cmd := exec.CommandContext(ctx, "ffmpeg",
        "-i", "pipe:0",      // Read from stdin
        "-vf", filter,       // Apply text filter
        "-c:v", "libx264",   // Re-encode video
        "-preset", "fast",   // Speed over compression
        "-crf", "23",        // Quality (lower = better)
        "-c:a", "copy",      // Copy audio unchanged
        "-movflags", "+frag_keyframe+empty_moov",  // Fragmented MP4 for streaming; faststart needs seekable output, which a pipe is not
        "-f", "mp4",         // Output format
        "pipe:1",            // Write to stdout
    )
    
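    // 2. Load the source before starting FFmpeg so a storage failure is
    //    reported instead of producing an empty stream
    obj, err := store.Read(objectID)
    if err != nil {
        return fmt.Errorf("object read: %w", err)
    }
    
    // 3. Set up pipes
    stdin, err := cmd.StdinPipe()
    if err != nil {
        return fmt.Errorf("ffmpeg stdin: %w", err)
    }
    cmd.Stdout = w
    cmd.Stderr = os.Stderr // Log errors
    
    // 4. Start FFmpeg
    if err := cmd.Start(); err != nil {
        return fmt.Errorf("ffmpeg start: %w", err)
    }
    
    // 5. Stream input file to FFmpeg. Piped MP4 input only works when the
    //    moov atom precedes the media data; otherwise FFmpeg needs a seekable file.
    go func() {
        defer stdin.Close()
        io.Copy(stdin, bytes.NewReader(obj)) // a failed write surfaces via cmd.Wait
    }()
    
    // 6. Wait for completion
    return cmd.Wait()
}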

Performance:

  • 1-minute video: ~5-15 seconds (re-encoding required)
  • 10-minute video: ~30-90 seconds
  • Recommendation: For videos >5 minutes, use async processing + notification when ready

Caching: ⚠️ Consider selective caching for large videos.

  • Risk: Cached version has wrong timestamp for subsequent views
  • Mitigation: Cache key includes {object_id}:{user_id}:{date} — same user same day gets cache
  • Invalidate cache at midnight or on project config change

Edge Cases:

| Case | Handling |
|---|---|
| Very long video (>1hr) | Async processing; return 202 with job ID; poll for completion |
| Corrupted video | FFmpeg will error; return 500 with audit log |
| Unsupported codec | FFmpeg handles most; truly exotic formats: reject |
| Audio-only file | No video stream to watermark; add metadata comment instead |
| MKV, AVI, WMV | Convert to MP4 on serve (FFmpeg handles this) |

3.6 Other File Types

Strategy: Encrypted download only. No preview, no watermarking.

Affected Types:

  • ZIP, TAR, 7Z (archives)
  • CAD files (DWG, DXF)
  • Database exports (SQL, CSV with sensitive data)
  • Executables (rare but possible)
  • Unknown/binary files

Watermark Alternative:

  1. Filename includes minimal watermark: report_{user_id}_{timestamp}.zip
  2. Audit log captures full context
  3. "CONFIDENTIAL" wrapper: Serve inside a new ZIP containing the file + a NOTICE.txt
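
func ServeEncryptedDownload(w http.ResponseWriter, objectID string, wm WatermarkParams) error {
    // Read the object first so a storage failure returns an error, not a bad ZIP
    obj, err := store.Read(objectID)
    if err != nil {
        return fmt.Errorf("object read: %w", err)
    }
    
    // Create wrapper ZIP with notice
    buf := new(bytes.Buffer)
    zw := zip.NewWriter(buf)
    
    // Add notice file
    notice, err := zw.Create("NOTICE.txt")
    if err != nil {
        return fmt.Errorf("zip notice: %w", err)
    }
    fmt.Fprintf(notice, "CONFIDENTIAL\n\nDownloaded by: %s\nOrganization: %s\nTimestamp: %s\n\nUnauthorized distribution is prohibited.",
        wm.UserName, wm.OrgName, wm.Timestamp.Format(time.RFC3339))
    
    // Add original file
    original, err := zw.Create(wm.OriginalFilename)
    if err != nil {
        return fmt.Errorf("zip entry: %w", err)
    }
    if _, err := original.Write(obj); err != nil {
        return fmt.Errorf("zip write: %w", err)
    }
    if err := zw.Close(); err != nil {
        return fmt.Errorf("zip close: %w", err)
    }
    
    // Set download filename with watermark info
    shortID := wm.UserID
    if len(shortID) > 8 {
        shortID = shortID[:8] // slicing a shorter ID would panic
    }
    filename := fmt.Sprintf("%s_%s_%s.zip",
        strings.TrimSuffix(wm.OriginalFilename, filepath.Ext(wm.OriginalFilename)),
        shortID,
        wm.Timestamp.Format("20060102")) // same timestamp as the notice and audit entry
    
    w.Header().Set("Content-Disposition", fmt.Sprintf(`attachment; filename="%s"`, filename))
    w.Header().Set("Content-Type", "application/zip")
    _, err = w.Write(buf.Bytes())
    return err
}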

4. Screen Capture Protection

Reality Check: True screen capture protection is impossible. Any DRM can be defeated by pointing a camera at a screen. Our goal is deterrence and traceability, not prevention.

4.1 Visual Deterrent Strategy

For PDFs served in-browser:

  1. Apply diagonal tiled watermark pattern (45°, repeated every 200px)
  2. Use user-specific text in the pattern
  3. Opacity 0.15 — visible in screenshots but doesn't obstruct reading

For images:

  1. Same diagonal tiled pattern
  2. Consider more aggressive opacity (0.25) for high-sensitivity images

For video:

  1. Persistent corner watermark (already implemented)
  2. Optional: periodic full-screen flash of watermark text (every 60s, 2s duration, 0.3 opacity)

4.2 Additional Deterrents

| Technique | Effectiveness | Implementation |
|---|---|---|
| Diagonal tiled watermark | High | Built into watermark functions |
| Random position micro-watermarks | Medium | Add 5-10 tiny (8px) watermarks at random positions |
| Invisible watermarks (steganography) | Low (easily stripped) | Not recommended: complexity outweighs value |
| JavaScript screenshot detection | Low (easily bypassed) | Not recommended |
| CSS -webkit-user-select: none | Cosmetic only | Add to viewer CSS |

type ScreenDeterrentLevel int

const (
    DeterrentNone     ScreenDeterrentLevel = 0
    DeterrentStandard ScreenDeterrentLevel = 1  // Footer + diagonal tiled
    DeterrentHigh     ScreenDeterrentLevel = 2  // + random micro-watermarks
)

Default to DeterrentStandard for all data room documents. Project admins can escalate to DeterrentHigh for specific folders.
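
For DeterrentHigh, the micro-watermark positions could be chosen as sketched below. Seeding the generator from the watermark ID is an assumption made here so that the same positions can be recomputed during a leak investigation; the spec only asks for random positions. `microPositions` and `point` are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

type point struct{ X, Y int }

// microPositions returns between 5 and 10 positions inside a width x height
// canvas. The sequence is derived from the watermark ID, so the same ID and
// canvas size always yield the same positions.
func microPositions(watermarkID string, width, height int) []point {
	if width <= 0 || height <= 0 {
		return nil
	}
	h := fnv.New64a()
	h.Write([]byte(watermarkID))
	r := rand.New(rand.NewSource(int64(h.Sum64())))

	pts := make([]point, 5+r.Intn(6)) // 5..10 inclusive
	for i := range pts {
		pts[i] = point{X: r.Intn(width), Y: r.Intn(height)}
	}
	return pts
}

func main() {
	for _, p := range microPositions("JBSWY3DPEHPK3", 800, 600) {
		fmt.Println(p.X, p.Y)
	}
}
```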


5. Audit Trail

Every file serve MUST be logged. No exceptions.

5.1 Audit Entry Structure

type FileServeAudit struct {
    ID          string    `json:"id"`           // UUID
    ProjectID   string    `json:"project_id"`
    ObjectID    string    `json:"object_id"`    // File being served
    EntryID     string    `json:"entry_id"`     // Parent answer entry
    ActorID     string    `json:"actor_id"`     // User requesting
    ActorOrg    string    `json:"actor_org"`    // Organization
    Action      string    `json:"action"`       // "view" | "download" | "print"
    IP          string    `json:"ip"`           // Client IP (X-Forwarded-For aware)
    UserAgent   string    `json:"user_agent"`
    Timestamp   time.Time `json:"timestamp"`
    WatermarkID string    `json:"watermark_id"` // Unique ID embedded in watermark
    FileType    string    `json:"file_type"`    // "pdf", "docx", etc.
    FileSize    int64     `json:"file_size"`
    Success     bool      `json:"success"`
    ErrorMsg    string    `json:"error_msg,omitempty"`
}

5.2 Watermark ID

Each watermark includes a unique, traceable ID:

func GenerateWatermarkID(actorID, objectID string, timestamp time.Time) string {
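    // Short and human-readable. This is a 64-bit truncated HMAC, so
    // collisions are improbable rather than impossible.
    h := hmac.New(sha256.New, watermarkSecret)
    // NUL separators keep ("ab","c") and ("a","bc") from hashing identically
    h.Write([]byte(actorID + "\x00" + objectID + "\x00" + timestamp.Format(time.RFC3339)))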
    sum := h.Sum(nil)
    return base32.StdEncoding.EncodeToString(sum[:8])[:13] // e.g., "JBSWY3DPEHPK3"
}

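This ID appears in the watermark text and in the audit log, so the project template needs a placeholder for it (the standard string in section 2 does not yet include one). If a document leaks, look the ID up in the audit log for instant attribution.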

5.3 Audit Table

CREATE TABLE file_serves (
    id           TEXT PRIMARY KEY,
    project_id   TEXT NOT NULL,
    object_id    TEXT NOT NULL,
    entry_id     TEXT,
    actor_id     TEXT NOT NULL,
    actor_org    TEXT,
    action       TEXT NOT NULL,
    ip           TEXT NOT NULL,
    user_agent   TEXT,
    ts           INTEGER NOT NULL,
    watermark_id TEXT NOT NULL,
    file_type    TEXT NOT NULL,
    file_size    INTEGER,
    success      INTEGER NOT NULL,
    error_msg    TEXT
);

CREATE INDEX idx_serves_project ON file_serves(project_id);
CREATE INDEX idx_serves_actor ON file_serves(actor_id);
CREATE INDEX idx_serves_object ON file_serves(object_id);
CREATE INDEX idx_serves_watermark ON file_serves(watermark_id);
CREATE INDEX idx_serves_ts ON file_serves(ts);

5.4 Audit Logging Function

func LogFileServe(ctx context.Context, audit FileServeAudit) error {
    audit.ID = uuid.NewString()
    if audit.Timestamp.IsZero() {
        audit.Timestamp = time.Now()
    }
    
    // Pack sensitive fields (action, user_agent, error_msg)
    packed, err := Pack(audit)
    if err != nil {
        return err
    }
    
    _, err = db.ExecContext(ctx, `
        INSERT INTO file_serves 
        (id, project_id, object_id, entry_id, actor_id, actor_org, action, ip, user_agent, ts, watermark_id, file_type, file_size, success, error_msg)
        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
        audit.ID, audit.ProjectID, audit.ObjectID, audit.EntryID,
        audit.ActorID, audit.ActorOrg, packed.Action, audit.IP, packed.UserAgent,
        audit.Timestamp.UnixMilli(), audit.WatermarkID, audit.FileType, audit.FileSize,
        boolToInt(audit.Success), packed.ErrorMsg,
    )
    return err
}

6. Burn After Reading Mode

Optional per-file setting: file can only be downloaded N times total, or N times per user.

6.1 Configuration

type BurnConfig struct {
    Enabled        bool   `json:"enabled"`
    MaxDownloads   int    `json:"max_downloads"`      // Total across all users (0 = unlimited)
    MaxPerUser     int    `json:"max_per_user"`       // Per individual user (0 = unlimited)
    ExpiresAt      *int64 `json:"expires_at"`         // Unix ms timestamp (optional)
    NotifyOnBurn   bool   `json:"notify_on_burn"`     // Alert admins when limit reached
}

6.2 Tracking Table

CREATE TABLE burn_tracking (
    object_id    TEXT NOT NULL,
    actor_id     TEXT NOT NULL,
    download_count INTEGER NOT NULL DEFAULT 0,
    last_download INTEGER,
    PRIMARY KEY (object_id, actor_id)
);

CREATE TABLE burn_totals (
    object_id       TEXT PRIMARY KEY,
    total_downloads INTEGER NOT NULL DEFAULT 0,
    burned_at       INTEGER  -- When limit was hit
);

6.3 Burn Check Function

func CheckBurnLimit(ctx context.Context, objectID, actorID string) (allowed bool, remaining int, err error) {
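    // 1. Load burn config from entry Data. Fail closed: a lookup error
    //    must not turn into an unlimited download.
    config, err := GetBurnConfig(ctx, objectID)
    if err != nil {
        return false, 0, fmt.Errorf("burn config: %w", err)
    }
    if !config.Enabled {
        return true, -1, nil // No burn limit
    }
    
    // 2. Check expiration
    if config.ExpiresAt != nil && time.Now().UnixMilli() > *config.ExpiresAt {
        return false, 0, nil
    }
    
    // 3. Check total downloads (no row yet means zero)
    var total int
    err = db.QueryRowContext(ctx, "SELECT total_downloads FROM burn_totals WHERE object_id = ?", objectID).Scan(&total)
    if err != nil && !errors.Is(err, sql.ErrNoRows) {
        return false, 0, fmt.Errorf("burn totals: %w", err)
    }
    if config.MaxDownloads > 0 && total >= config.MaxDownloads {
        return false, 0, nil
    }
    
    // 4. Check per-user downloads (no row yet means zero)
    var userCount int
    err = db.QueryRowContext(ctx, 
        "SELECT download_count FROM burn_tracking WHERE object_id = ? AND actor_id = ?",
        objectID, actorID).Scan(&userCount)
    if err != nil && !errors.Is(err, sql.ErrNoRows) {
        return false, 0, fmt.Errorf("burn tracking: %w", err)
    }
    if config.MaxPerUser > 0 && userCount >= config.MaxPerUser {
        return false, 0, nil
    }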
    
    // 5. Calculate remaining
    remaining = -1 // Unlimited
    if config.MaxDownloads > 0 {
        remaining = config.MaxDownloads - total
    }
    if config.MaxPerUser > 0 {
        userRemaining := config.MaxPerUser - userCount
        if remaining < 0 || userRemaining < remaining {
            remaining = userRemaining
        }
    }
    
    return true, remaining, nil
}

func IncrementBurnCount(ctx context.Context, objectID, actorID string) error {
    tx, err := db.BeginTx(ctx, nil)
    if err != nil {
        return fmt.Errorf("burn tx: %w", err)
    }
    defer tx.Rollback() // no-op after a successful Commit
    
    now := time.Now().UnixMilli()
    
    // Upsert user tracking
    if _, err := tx.ExecContext(ctx, `
        INSERT INTO burn_tracking (object_id, actor_id, download_count, last_download)
        VALUES (?, ?, 1, ?)
        ON CONFLICT (object_id, actor_id) DO UPDATE SET
            download_count = download_count + 1,
            last_download = ?`,
        objectID, actorID, now, now); err != nil {
        return fmt.Errorf("burn tracking: %w", err)
    }
    
    // Upsert total
    if _, err := tx.ExecContext(ctx, `
        INSERT INTO burn_totals (object_id, total_downloads)
        VALUES (?, 1)
        ON CONFLICT (object_id) DO UPDATE SET
            total_downloads = total_downloads + 1`,
        objectID); err != nil {
        return fmt.Errorf("burn totals: %w", err)
    }
    
    return tx.Commit()
}
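
The quota arithmetic in CheckBurnLimit can be isolated from the database access, which makes it easy to test. A minimal sketch, assuming the same conventions as above (0 means unlimited, -1 is reported for unlimited remaining); `remainingDownloads` is a hypothetical name.

```go
package main

import "fmt"

// remainingDownloads applies both limits and returns the tighter one.
// A limit of 0 is treated as unlimited; remaining is -1 when neither applies.
func remainingDownloads(maxTotal, total, maxPerUser, userCount int) (allowed bool, remaining int) {
	remaining = -1
	if maxTotal > 0 {
		if total >= maxTotal {
			return false, 0
		}
		remaining = maxTotal - total
	}
	if maxPerUser > 0 {
		if userCount >= maxPerUser {
			return false, 0
		}
		if left := maxPerUser - userCount; remaining < 0 || left < remaining {
			remaining = left
		}
	}
	return true, remaining
}

func main() {
	// 10 total with 3 used, 2 per user with 1 used: the per-user limit is tighter.
	fmt.Println(remainingDownloads(10, 3, 2, 1))
}
```

Keeping the check and the increment in separate calls still leaves a window in which two concurrent requests can both pass the check; closing it would require doing both inside one transaction.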

6.4 Burn Notification

When a file hits its limit:

func NotifyBurn(ctx context.Context, objectID string, config BurnConfig) {
    // Record burned_at even when notifications are disabled
    if _, err := db.ExecContext(ctx, "UPDATE burn_totals SET burned_at = ? WHERE object_id = ?",
        time.Now().UnixMilli(), objectID); err != nil {
        log.Printf("burn: update burned_at for %s: %v", objectID, err)
    }
    
    if !config.NotifyOnBurn {
        return
    }
    
    // Notify project admins (via existing notification system)
    entry, err := GetEntryByObjectID(ctx, objectID)
    if err != nil {
        log.Printf("burn: load entry for %s: %v", objectID, err)
        return
    }
    NotifyProjectAdmins(ctx, entry.ProjectID, NotificationBurnLimitReached, map[string]any{
        "object_id": objectID,
        "filename":  entry.Data["filename"],
    })
}

7. Go Implementation Design

7.1 lib/watermark.go — Function Signatures

package lib

import (
    "context"
    "io"
    "time"
)

// WatermarkParams contains all info needed to generate a watermark
type WatermarkParams struct {
    UserID      string
    UserName    string
    OrgID       string
    OrgName     string
    Timestamp   time.Time
    WatermarkID string // Unique traceable ID
    
    // Styling (from project config)
    Config      WatermarkConfig
    
    // Original file info
    OriginalFilename string
    FileType         string // "pdf", "docx", "xlsx", "jpg", etc.
    
    // Options
    ScreenDeterrent  bool // Add aggressive visual deterrent
}

// WatermarkConfig is project-level styling configuration
type WatermarkConfig struct {
    Text          string  // Template with placeholders
    FontFamily    string
    FontSize      int
    Color         string  // RGBA hex
    Position      string  // "footer", "header", "diagonal", "tiled"
    Opacity       float64
    DeterrentLevel ScreenDeterrentLevel
}

// ScreenDeterrentLevel controls anti-screenshot measures
type ScreenDeterrentLevel int

const (
    DeterrentNone     ScreenDeterrentLevel = 0
    DeterrentStandard ScreenDeterrentLevel = 1
    DeterrentHigh     ScreenDeterrentLevel = 2
)

// BuildWatermarkText generates the actual watermark string from params
func BuildWatermarkText(params WatermarkParams) string

// GenerateWatermarkID creates a unique, traceable watermark identifier
func GenerateWatermarkID(actorID, objectID string, timestamp time.Time) string

// ---- Per-Type Watermarking Functions ----

// WatermarkPDF adds watermark to PDF, writing result to output
func WatermarkPDF(input io.Reader, output io.Writer, params WatermarkParams) error

// WatermarkDOCX adds watermark to Word document
func WatermarkDOCX(input io.Reader, output io.Writer, params WatermarkParams) error

// WatermarkXLSX adds watermark to Excel spreadsheet
func WatermarkXLSX(input io.Reader, output io.Writer, params WatermarkParams) error

// WatermarkImage adds watermark to image (JPG, PNG, WebP)
// Returns the output format used (may differ from input for unsupported formats)
func WatermarkImage(input io.Reader, output io.Writer, inputFormat string, params WatermarkParams) (outputFormat string, err error)

// WatermarkVideo streams video with watermark overlay
// This is special: output is written to w as FFmpeg produces it, so the handler can pass its http.ResponseWriter directly
func WatermarkVideo(ctx context.Context, objectReader io.Reader, w io.Writer, params WatermarkParams) error

// ServeProtectedFile is the unified entry point for the protection pipeline
// It detects file type and applies appropriate watermarking
func ServeProtectedFile(ctx context.Context, objectID string, w io.Writer, params WatermarkParams) error

// ServeEncryptedDownload wraps non-watermarkable files with NOTICE.txt
func ServeEncryptedDownload(ctx context.Context, objectID string, w io.Writer, params WatermarkParams) error

// ---- Audit Functions ----

// LogFileServe records every file access to the audit table
func LogFileServe(ctx context.Context, audit FileServeAudit) error

// FileServeAudit contains all info about a file serve event
type FileServeAudit struct {
    ID          string
    ProjectID   string
    ObjectID    string
    EntryID     string
    ActorID     string
    ActorOrg    string
    Action      string // "view", "download", "print"
    IP          string
    UserAgent   string
    Timestamp   time.Time
    WatermarkID string
    FileType    string
    FileSize    int64
    Success     bool
    ErrorMsg    string
}

// ---- Burn After Reading ----

// BurnConfig defines download limits for a file
type BurnConfig struct {
    Enabled      bool
    MaxDownloads int   // Total across all users
    MaxPerUser   int   // Per individual user
    ExpiresAt    *int64
    NotifyOnBurn bool
}

// CheckBurnLimit verifies if download is allowed and returns remaining count
func CheckBurnLimit(ctx context.Context, objectID, actorID string) (allowed bool, remaining int, err error)

// IncrementBurnCount records a successful download
func IncrementBurnCount(ctx context.Context, objectID, actorID string) error

// GetBurnConfig retrieves burn settings for an object
func GetBurnConfig(ctx context.Context, objectID string) (BurnConfig, error)

// SetBurnConfig updates burn settings for an object
func SetBurnConfig(ctx context.Context, objectID string, config BurnConfig) error

7.2 Dependencies (go.mod additions)

// PDF processing
require github.com/pdfcpu/pdfcpu v0.8.0

// Office documents
require github.com/unidoc/unioffice v1.33.0
require github.com/xuri/excelize/v2 v2.8.1

// Image processing
require github.com/fogleman/gg v1.3.0
require golang.org/x/image v0.18.0

// WebP support (for image encoding)
require github.com/chai2010/webp v1.1.1

// Video (FFmpeg is external binary, no Go dep needed)
// Ensure ffmpeg is installed: apt install ffmpeg

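// Crypto (FIPS 140-3 compliance)
// Use standard library crypto/aes, crypto/hmac, crypto/sha256
// Go 1.24+: build with GOFIPS140 set to select the native Go Cryptographic Module
// Older toolchains: GOEXPERIMENT=boringcrypto (links BoringCrypto via cgo, so not pure Go)

FIPS 140-3 Note: On Go 1.24 or later, build with GOFIPS140 (for example GOFIPS140=latest) so the standard library crypto runs in FIPS 140-3 mode without cgo. GOEXPERIMENT=boringcrypto remains an option on older toolchains, but it depends on cgo and the BoringCrypto module.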

7.3 Serve Pipeline

Complete request flow from HTTP request to watermarked response:

┌─────────────────────────────────────────────────────────────────────────────────┐
│                              FILE SERVE PIPELINE                                │
└─────────────────────────────────────────────────────────────────────────────────┘

  HTTP Request: GET /api/files/{object_id}
         │
         ▼
  ┌──────────────────────┐
  │ 1. Auth Middleware   │ Extract JWT, validate session
  └──────────────────────┘
         │
         ▼
  ┌──────────────────────┐
  │ 2. CheckAccess()     │ Verify actor can access this object
  │    (lib/rbac.go)     │ Walk to parent entry → workstream → project
  └──────────────────────┘ Return 403 if denied
         │
         ▼
  ┌──────────────────────┐
  │ 3. CheckBurnLimit()  │ Verify download quota not exceeded
  │    (lib/watermark.go)│ Return 410 Gone if burned
  └──────────────────────┘
         │
         ▼
  ┌──────────────────────┐
  │ 4. ObjectRead()      │ Fetch encrypted object from store
  │    (lib/store.go)    │ Decrypt with project key
  └──────────────────────┘
         │
         ▼
  ┌──────────────────────┐
  │ 5. Build Params      │ Construct WatermarkParams from:
  │                      │ - Actor (user_id, name, org)
  │                      │ - Project config (styling)
  │                      │ - Timestamp + generated watermark ID
  │                      │ - File metadata (type, name)
  └──────────────────────┘
         │
         ▼
  ┌──────────────────────┐
  │ 6. Route by Type     │ Detect file type from extension/magic bytes
  │                      │
  │   PDF ──────────►    │ WatermarkPDF()
  │   DOCX ─────────►    │ WatermarkDOCX()
  │   XLSX ─────────►    │ WatermarkXLSX()
  │   Image ────────►    │ WatermarkImage()
  │   Video ────────►    │ WatermarkVideo()
  │   Other ────────►    │ ServeEncryptedDownload()
  └──────────────────────┘
         │
         ▼
  ┌──────────────────────┐
  │ 7. Stream to Client  │ Set Content-Type, Content-Disposition
  │                      │ Write watermarked content to ResponseWriter
  └──────────────────────┘
         │
         ▼
  ┌──────────────────────┐
  │ 8. LogFileServe()    │ Record to audit table (async, don't block)
  │    (lib/watermark.go)│
  └──────────────────────┘
         │
         ▼
  ┌──────────────────────┐
  │ 9. IncrementBurn()   │ Update download counters if burn enabled
  │    (lib/watermark.go)│ Check if limit now hit → notify admins
  └──────────────────────┘
         │
         ▼
      HTTP Response: Watermarked file stream
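
Step 6 of the pipeline can be sketched as below. This is a minimal illustration, not the spec's implementation: `FileKind` and `DetectKind` are hypothetical names, and the sketch assumes that magic bytes are trusted over the extension, with the extension consulted only to tell ZIP-based Office formats apart.

```go
package main

import (
	"net/http"
	"path/filepath"
	"strings"
)

// FileKind is a hypothetical enum for the routing step; the spec does not define one.
type FileKind string

const (
	KindPDF   FileKind = "pdf"
	KindDOCX  FileKind = "docx"
	KindXLSX  FileKind = "xlsx"
	KindImage FileKind = "image"
	KindVideo FileKind = "video"
	KindOther FileKind = "other"
)

// DetectKind sniffs the leading bytes of the file and uses the extension
// only for ZIP containers, which magic bytes alone cannot distinguish.
func DetectKind(filename string, head []byte) FileKind {
	ct := http.DetectContentType(head) // inspects at most the first 512 bytes
	ext := strings.ToLower(filepath.Ext(filename))
	switch {
	case ct == "application/pdf":
		return KindPDF
	case strings.HasPrefix(ct, "image/"):
		return KindImage
	case strings.HasPrefix(ct, "video/"):
		return KindVideo
	case ct == "application/zip" && ext == ".docx":
		return KindDOCX
	case ct == "application/zip" && ext == ".xlsx":
		return KindXLSX
	default:
		// Anything unrecognized, or a mismatched extension, takes the "Other" route.
		return KindOther
	}
}
```

With this ordering, a ZIP archive renamed to `fake.pdf` is routed as "Other" rather than handed to the PDF watermarker.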

7.4 Handler Implementation

// api/handlers.go

func (h *Handler) ServeFile(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    objectID := chi.URLParam(r, "object_id") // Must match the {object_id} placeholder in the route
    actor := auth.ActorFromContext(ctx)
    
    // 1. Access check
    entry, err := lib.GetEntryByObjectID(ctx, h.db, objectID)
    if err != nil {
        http.Error(w, "not found", 404)
        return
    }
    
    if !lib.CheckAccess(ctx, h.db, actor.ID, entry.ProjectID, entry.ID, "read") {
        http.Error(w, "forbidden", 403)
        return
    }
    
    // 2. Burn limit check
    allowed, remaining, err := lib.CheckBurnLimit(ctx, h.db, objectID, actor.ID)
    if err != nil {
        http.Error(w, "internal error", 500)
        return
    }
    if !allowed {
        http.Error(w, "download limit exceeded", 410) // Gone
        return
    }
    
    // 3. Read object
    data, err := lib.ObjectRead(ctx, h.store, h.projectKey, objectID)
    if err != nil {
        http.Error(w, "not found", 404)
        return
    }
    
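    // 4. Build watermark params
    // Checked assertions: malformed entry data yields a 500 instead of a panic.
    files, _ := entry.Data["files"].([]any)
    if len(files) == 0 {
        http.Error(w, "internal error", 500)
        return
    }
    fileInfo, _ := files[0].(map[string]any) // Simplified: first file only
    fileName, _ := fileInfo["name"].(string)
    fileType, _ := fileInfo["type"].(string)
    projectConfig, err := lib.GetProjectWatermarkConfig(ctx, h.db, entry.ProjectID)
    if err != nil {
        http.Error(w, "internal error", 500)
        return
    }
    now := time.Now().UTC() // One instant for both the visible timestamp and the watermark ID

    params := lib.WatermarkParams{
        UserID:           actor.ID,
        UserName:         actor.Name,
        OrgID:            actor.OrgID,
        OrgName:          actor.OrgName,
        Timestamp:        now,
        WatermarkID:      lib.GenerateWatermarkID(actor.ID, objectID, now),
        Config:           projectConfig,
        OriginalFilename: fileName,
        FileType:         fileType,
        ScreenDeterrent:  projectConfig.DeterrentLevel >= lib.DeterrentStandard,
    }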
    
    // 5. Prepare audit entry (log after response)
    audit := lib.FileServeAudit{
        ProjectID:   entry.ProjectID,
        ObjectID:    objectID,
        EntryID:     entry.ID,
        ActorID:     actor.ID,
        ActorOrg:    actor.OrgName,
        Action:      "download",
        IP:          realIP(r),
        UserAgent:   r.UserAgent(),
        WatermarkID: params.WatermarkID,
        FileType:    params.FileType,
        FileSize:    int64(len(data)),
        Success:     true,
    }
    
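    // 6. Set response headers
    contentType := mimeTypeFromExt(params.FileType)
    w.Header().Set("Content-Type", contentType)
    // mime.FormatMediaType (standard library "mime") quotes and escapes the
    // filename, so quotes or non-ASCII characters cannot break the header.
    w.Header().Set("Content-Disposition",
        mime.FormatMediaType("attachment", map[string]string{"filename": params.OriginalFilename}))
    if remaining > 0 {
        w.Header().Set("X-Downloads-Remaining", fmt.Sprintf("%d", remaining))
    }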
    
    // 7. Apply watermark and stream
    err = lib.ServeProtectedFile(ctx, bytes.NewReader(data), w, params)
    if err != nil {
        audit.Success = false
        audit.ErrorMsg = err.Error()
        // Still log the attempt
    }
    
    // 8. Log and update burn count (async)
    go func() {
        lib.LogFileServe(context.Background(), h.db, audit)
        if audit.Success {
            lib.IncrementBurnCount(context.Background(), h.db, objectID, actor.ID)
        }
    }()
}

8. Performance Considerations

8.1 Processing Time Budgets

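| File Type     | Target | Acceptable | Timeout |
|---------------|--------|------------|---------|
| PDF (≤10pg)   | <100ms | <500ms     | 10s     |
| PDF (>100pg)  | <1s    | <5s        | 30s     |
| DOCX          | <100ms | <500ms     | 10s     |
| XLSX          | <100ms | <500ms     | 10s     |
| Image         | <50ms  | <200ms     | 5s      |
| Video (1min)  | <10s   | <30s       | 120s    |
| Video (10min) | async  | async      | 300s    |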

8.2 Memory Management

  • PDF: Stream pages when possible; full memory load for complex watermarks
  • DOCX/XLSX: Full memory load required (ZIP structure)
  • Image: Decode → process → encode; roughly 4 bytes per pixel (RGBA) in RAM, i.e. width × height × 4
  • Video: Stream through FFmpeg; no full file in memory

Memory limits per request:

  • PDF: 256MB max
  • Office: 128MB max
  • Image: 512MB max (a 4K image is ~33MB at 4 bytes/pixel; the cap admits images up to roughly 128 megapixels)
  • Video: Streaming, no limit
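
The image limit can be checked before decoding by reading only the header. This is a rough sketch under the 4 bytes/pixel assumption stated above; `RGBABytes`, `DecodedSize`, and `FitsBudget` are hypothetical helpers, and only PNG and JPEG decoders are registered here.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	_ "image/jpeg" // register the JPEG header decoder
	_ "image/png"  // register the PNG header decoder
)

const maxImageBytes = 512 << 20 // the 512MB per-request cap from the list above

// RGBABytes estimates the decoded footprint at 4 bytes per pixel.
func RGBABytes(width, height int) int64 {
	return int64(width) * int64(height) * 4
}

// DecodedSize reads only the image header, so it is cheap even for huge files.
func DecodedSize(data []byte) (int64, error) {
	cfg, _, err := image.DecodeConfig(bytes.NewReader(data))
	if err != nil {
		return 0, fmt.Errorf("read image header: %w", err)
	}
	return RGBABytes(cfg.Width, cfg.Height), nil
}

// FitsBudget reports whether a full decode would stay under the cap.
func FitsBudget(data []byte) (bool, error) {
	n, err := DecodedSize(data)
	if err != nil {
		return false, err
	}
	return n <= maxImageBytes, nil
}
```

For a 3840 × 2160 image, `RGBABytes` returns 33,177,600 bytes, which is where the ~33MB figure comes from.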

8.3 Caching Strategy

| What                     | Cache?         | Rationale                           |
|--------------------------|----------------|-------------------------------------|
| Watermarked files        | Never          | Timestamp and user-specific content |
| Project watermark config | 5min TTL       | Rarely changes                      |
| User/org lookups         | 5min TTL       | Rarely changes                      |
| Object decryption        | ⚠️ Per-request | Could cache decrypted bytes briefly |
| Burn counts              | Never          | Must be accurate                    |

8.4 Async Processing for Large Files

Videos >5 minutes and PDFs >500 pages should use async processing:

func (h *Handler) ServeFileLarge(w http.ResponseWriter, r *http.Request) {
    // ... access checks ...
    
    if isLargeFile(params.FileType, fileSize) {
        // Queue for background processing
        jobID := lib.QueueWatermarkJob(ctx, h.queue, objectID, params)
        
        w.Header().Set("Content-Type", "application/json")
        w.WriteHeader(202) // Accepted
        json.NewEncoder(w).Encode(map[string]string{
            "status": "processing",
            "job_id": jobID,
            "poll_url": "/api/files/jobs/" + jobID,
        })
        return
    }
    
    // Normal sync processing...
}

9. Error Handling & Fallbacks

9.1 Graceful Degradation

| Error                | Fallback                                       | User Message             |
|----------------------|------------------------------------------------|--------------------------|
| PDF parse failure    | Serve original with audit + filename watermark | "Processing unavailable" |
| Office parse failure | Encrypt + download only                        | "Preview unavailable"    |
| Image decode failure | Serve original with audit                      | "Processing unavailable" |
| Video FFmpeg failure | Encrypt + download only                        | "Streaming unavailable"  |
| Timeout exceeded     | Serve original with audit                      | "Processing timeout"     |
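
The table above can be carried in code as data, so the handler and the audit log agree on what was done. This is an illustrative sketch: `FailureKind`, `Fallback`, the action strings, and the default for unknown failures are all assumptions, not part of the spec.

```go
package main

// FailureKind enumerates the rows of the degradation table (hypothetical type).
type FailureKind int

const (
	PDFParseFailure FailureKind = iota
	OfficeParseFailure
	ImageDecodeFailure
	VideoFFmpegFailure
	TimeoutExceeded
)

// Fallback records what is served instead and what the user is told.
type Fallback struct {
	Action      string // hypothetical identifier, written to the audit log
	UserMessage string
}

var fallbacks = map[FailureKind]Fallback{
	PDFParseFailure:    {"serve_original_filename_watermark", "Processing unavailable"},
	OfficeParseFailure: {"encrypt_download_only", "Preview unavailable"},
	ImageDecodeFailure: {"serve_original_audited", "Processing unavailable"},
	VideoFFmpegFailure: {"encrypt_download_only", "Streaming unavailable"},
	TimeoutExceeded:    {"serve_original_audited", "Processing timeout"},
}

// FallbackFor looks up the table row for a failure.
func FallbackFor(k FailureKind) Fallback {
	if f, ok := fallbacks[k]; ok {
		return f
	}
	// Assumption: unknown failures take the most restrictive row of the table.
	return Fallback{"encrypt_download_only", "Processing unavailable"}
}
```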

9.2 Error Logging

All watermark failures are logged with full context:

type WatermarkError struct {
    ObjectID  string
    FileType  string
    Err       string // Named Err rather than Error so the type can still implement the error interface
    Stack     string
    Fallback  string // What we did instead
    Timestamp time.Time
}

10. Security Considerations

10.1 FIPS 140-3 Compliance

All crypto operations use FIPS-approved algorithms:

  • AES-256-GCM for encryption
  • SHA-256 for hashing
  • HMAC-SHA256 for watermark ID generation
  • Build with GOEXPERIMENT=boringcrypto

10.2 Watermark Tampering Resistance

Watermarks are applied at serve time, so tampering with stored files doesn't help. However:

  • Digital signatures on PDFs are invalidated by watermarking (expected)
  • Office documents could have watermark row deleted (accepted risk; audit trail remains)
  • Image watermarks can be cropped (tiled pattern mitigates)
  • Video watermarks can be cropped (corner + periodic full-screen mitigates)

10.3 Audit Integrity

Audit logs should be:

  • Append-only (no UPDATE or DELETE path; rows can be read but never changed)
  • Integrity-checked (HMAC of each row, chain hash)
  • Replicated off-server for SOX/compliance
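
The "HMAC of each row, chain hash" idea can be sketched as follows. `AuditRow`, `Append`, and `Verify` are hypothetical names, and the payload is assumed to be an already-serialized form of the audit fields; the real schema is defined elsewhere in the spec.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
)

// AuditRow carries its own MAC plus the MAC of the row before it.
type AuditRow struct {
	Payload string // serialized audit fields (assumed format)
	PrevMAC string // RowMAC of the previous row, "" for the first row
	RowMAC  string
}

// sealRow computes HMAC-SHA256 over the previous MAC and the payload.
func sealRow(key []byte, prevMAC, payload string) AuditRow {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(prevMAC))
	mac.Write([]byte{0})
	mac.Write([]byte(payload))
	return AuditRow{Payload: payload, PrevMAC: prevMAC, RowMAC: hex.EncodeToString(mac.Sum(nil))}
}

// Append links a new row to the current tail of the chain.
func Append(key []byte, chain []AuditRow, payload string) []AuditRow {
	prev := ""
	if n := len(chain); n > 0 {
		prev = chain[n-1].RowMAC
	}
	return append(chain, sealRow(key, prev, payload))
}

// Verify returns the index of the first inconsistent row, or -1 if the chain is intact.
func Verify(key []byte, chain []AuditRow) int {
	prev := ""
	for i, row := range chain {
		want := sealRow(key, prev, row.Payload)
		if row.PrevMAC != prev || !hmac.Equal([]byte(row.RowMAC), []byte(want.RowMAC)) {
			return i
		}
		prev = row.RowMAC
	}
	return -1
}
```

Editing or removing a row in the middle breaks every later link. Truncating the tail is not detected by the chain alone, which is one reason the off-server replica matters.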

11. Testing Strategy

11.1 Unit Tests

func TestWatermarkPDF(t *testing.T)
func TestWatermarkPDFPassword(t *testing.T) // Should fail gracefully
func TestWatermarkPDFCorrupt(t *testing.T)   // Should fallback
func TestWatermarkDOCX(t *testing.T)
func TestWatermarkXLSX(t *testing.T)
func TestWatermarkImage(t *testing.T)
func TestBurnLimitEnforced(t *testing.T)
func TestBurnLimitPerUser(t *testing.T)
func TestAuditLogCreated(t *testing.T)
func TestWatermarkIDUnique(t *testing.T)

11.2 Integration Tests

  • Upload file → download → verify watermark present
  • Exceed burn limit → verify 410 response
  • Concurrent downloads → verify accurate burn counting
  • Large video → verify async handling
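
The concurrent-download test depends on the check and the increment being one atomic step; the handler in 7.4 performs them as separate calls, so parallel requests could each pass the check before any of them increments. The sketch below uses an in-memory stand-in to show the property such a test would assert. `BurnCounter`, `TryConsume`, and `CountAllowed` are hypothetical; a database-backed version would need an equivalent atomic operation.

```go
package main

import "sync"

// BurnCounter is an in-memory stand-in for the download counter table.
type BurnCounter struct {
	mu    sync.Mutex
	limit int
	used  map[string]int // keyed by objectID + "/" + userID
}

func NewBurnCounter(limit int) *BurnCounter {
	return &BurnCounter{limit: limit, used: make(map[string]int)}
}

// TryConsume checks the quota and increments it inside one critical section,
// so two concurrent callers can never both take the last download.
func (b *BurnCounter) TryConsume(objectID, userID string) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	k := objectID + "/" + userID
	if b.used[k] >= b.limit {
		return false
	}
	b.used[k]++
	return true
}

// CountAllowed fires n concurrent attempts and reports how many succeeded.
func CountAllowed(b *BurnCounter, objectID, userID string, n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	allowed := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if b.TryConsume(objectID, userID) {
				mu.Lock()
				allowed++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return allowed
}
```

With a limit of 3 and 50 parallel attempts, exactly 3 succeed regardless of scheduling.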

11.3 Sample Files

Maintain a test corpus of:

  • Normal files (all types)
  • Password-protected files
  • Corrupted files
  • Maximum-size files (stress test)
  • Edge cases per type (see sections 3.x)

12. Open Questions / Future Work

  1. PDF/A compliance: Verify pdfcpu maintains PDF/A compliance after watermarking. May need explicit flag.

  2. Office 365 online preview: When files are previewed in Office Online, watermarks must persist. May need server-side rendering instead.

  3. Mobile app considerations: Native mobile viewers may strip/ignore some watermarks. Test thoroughly.

  4. Print watermarks: Physical prints should show watermark. PDF print header/footer may be more robust than visual overlay.

  5. Invisible forensic watermarks: Steganographic watermarks that survive screenshots/prints. Complex, may add later.

  6. Video DRM: HLS with encryption + Widevine. Overkill for MVP, but worth considering for future.


This specification is complete and ready for implementation. Questions → Johan.