dealspace/EMBED-SPEC.md

Dealspace — AI Matching & Embedding Specification

Version: 0.1 — 2026-02-28
Status: Pre-implementation. Addresses SPEC-REVIEW.md section 3 (race conditions) and section 7.4 (magic threshold).


1. Embedding Model Selection

1.1 Candidates

| Model | Provider | Dimensions | Context | Latency | Cost | Retention |
|---|---|---|---|---|---|---|
| nomic-embed-text-v1.5 | Fireworks | 768 | 8192 | ~50ms | $0.008/1M | Zero |
| voyage-finance-2 | Voyage AI | 1024 | 16000 | ~80ms | $0.12/1M | 30 days |

1.2 Domain Analysis

M&A requests contain:

  • Financial terminology: EBITDA, working capital adjustments, earnout provisions, rep & warranty
  • Legal terminology: indemnification, IP assignments, material adverse change, disclosure schedules
  • Industry-specific terms: varies by deal (tech: ARR, churn; healthcare: HIPAA, 340B; manufacturing: capex, inventory turns)

Voyage-finance-2 was trained specifically on financial documents (10-Ks, credit agreements, M&A filings). It shows ~8% improvement on financial similarity benchmarks vs. general-purpose models.

Nomic-embed-text-v1.5 is general-purpose but performs well on semantic matching. Zero retention is critical for M&A confidentiality.

1.3 Recommendation

Use Fireworks nomic-embed-text-v1.5 for MVP:

  1. Zero retention — Voyage's 30-day retention is unacceptable for M&A data
  2. 15x cheaper — allows generous matching without cost concerns
  3. Proven stack — same infrastructure as inou (known good)
  4. Good enough — general semantic similarity works for request matching; we're matching "audited financials" to "FY2024 audit report," not parsing covenant calculations

Revisit voyage-finance-2 when:

  • Voyage offers a zero-retention API option
  • Match quality metrics show <85% human confirmation rate
  • We expand to quantitative matching (finding answers by numerical similarity)

1.4 Fireworks Configuration

const (
    EmbedModel     = "nomic-embed-text-v1.5"
    EmbedEndpoint  = "https://api.fireworks.ai/inference/v1/embeddings"
    EmbedDimension = 768
    EmbedMaxTokens = 8192  // model context limit
)

API key stored in environment: FIREWORKS_API_KEY
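For reference, the request body follows the OpenAI-compatible embeddings shape that Fireworks exposes. A minimal sketch of building it; the helper names (`embedPayload`, `buildEmbedRequestBody`) are illustrative, and the exact fields should be verified against the Fireworks API docs:

```go
package main

import (
    "encoding/json"
    "fmt"
)

// embedPayload mirrors the OpenAI-compatible embeddings request body
// (assumed shape; verify against Fireworks documentation).
type embedPayload struct {
    Model string   `json:"model"`
    Input []string `json:"input"`
}

// buildEmbedRequestBody serializes the texts to send. The API key goes
// in an "Authorization: Bearer $FIREWORKS_API_KEY" header, not the body.
func buildEmbedRequestBody(model string, texts []string) ([]byte, error) {
    return json.Marshal(embedPayload{Model: model, Input: texts})
}

func main() {
    body, err := buildEmbedRequestBody("nomic-embed-text-v1.5",
        []string{"Request: FY2024 audited financials"})
    if err != nil {
        panic(err)
    }
    fmt.Println(string(body))
}
```

Batching multiple texts into one `input` array is what makes `EmbedBatch` (section 11.1) a single API call.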


2. What to Embed

2.1 Request Embedding

Embed the semantic intent, not raw fields.

func BuildRequestEmbedText(r *Request, ws *Workstream) string {
    var b strings.Builder
    
    // Workstream context (aids cross-workstream relevance)
    b.WriteString("Workstream: ")
    b.WriteString(ws.Name)
    b.WriteString("\n\n")
    
    // Title is high-signal
    b.WriteString("Request: ")
    b.WriteString(r.Title)
    b.WriteString("\n\n")
    
    // Body provides detail
    if r.Body != "" {
        b.WriteString("Details: ")
        b.WriteString(r.Body)
    }
    
    return strings.TrimSpace(b.String())
}

Do NOT embed:

  • Request ID, ref numbers (non-semantic)
  • Due dates, priority (operational, not semantic)
  • Assignee names (PII, not relevant to matching)
  • Status (changes frequently, embedding is point-in-time)

2.2 Answer Embedding

Answers may be long (multi-document explanations). Use chunking with overlap.

const (
    AnswerChunkSize    = 1500  // tokens (~6000 chars)
    AnswerChunkOverlap = 200   // tokens overlap for context continuity
)

func BuildAnswerEmbedTexts(a *Answer, ws *Workstream) []string {
    var chunks []string
    
    // Prefix every chunk with context
    prefix := fmt.Sprintf("Workstream: %s\nAnswer: %s\n\n", ws.Name, a.Title)
    
    // Token limits are approximated as bytes (1 token ≈ 4 chars)
    chunkBytes := AnswerChunkSize * 4
    overlapBytes := AnswerChunkOverlap * 4
    
    body := a.Body
    if len(body) <= chunkBytes { // fits in a single chunk
        return []string{prefix + body}
    }
    
    // Chunk long answers
    for start := 0; start < len(body); {
        end := start + chunkBytes
        if end > len(body) {
            end = len(body)
        }
        
        // Prefer a sentence boundary near the end of the chunk
        if end < len(body) {
            if idx := strings.LastIndex(body[start:end], ". "); idx > 0 {
                end = start + idx + 1
            }
        }
        
        chunks = append(chunks, prefix+body[start:end])
        if end == len(body) {
            break
        }
        
        // Step back for overlap, but always advance past the previous
        // start so a short chunk cannot cause an infinite loop
        next := end - overlapBytes
        if next <= start {
            next = end
        }
        start = next
    }
    
    return chunks
}

Do NOT embed:

  • File contents (privacy: see section 10)
  • File names (may contain sensitive identifiers)
  • Rejection reasons (operational, not semantic)
  • Internal comments (IB-only context)

2.3 Embedding Storage

CREATE TABLE embeddings (
    id         TEXT PRIMARY KEY,           -- UUID
    entry_id   TEXT NOT NULL,              -- request or answer entry_id
    chunk_idx  INTEGER NOT NULL DEFAULT 0, -- 0 for single-chunk, 0..N for multi-chunk
    vector     BLOB NOT NULL,              -- 768 float32 = 3072 bytes
    text_hash  TEXT NOT NULL,              -- SHA-256 of embedded text (dedup check)
    model      TEXT NOT NULL,              -- "nomic-embed-text-v1.5"
    created_at INTEGER NOT NULL,
    UNIQUE(entry_id, chunk_idx)
);

CREATE INDEX idx_embeddings_entry ON embeddings(entry_id);

Note: Vector stored as raw float32 bytes (little-endian). Cosine similarity computed in Go, not SQL.
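The BLOB encoding amounts to four little-endian bytes per float32. A sketch of the serialization helpers (names illustrative):

```go
package main

import (
    "encoding/binary"
    "fmt"
    "math"
)

// encodeVector packs a float32 slice into little-endian bytes for the
// embeddings.vector BLOB column (768 floats -> 3072 bytes).
func encodeVector(v []float32) []byte {
    buf := make([]byte, 4*len(v))
    for i, f := range v {
        binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(f))
    }
    return buf
}

// decodeVector is the inverse, used when loading candidates for search.
func decodeVector(b []byte) []float32 {
    v := make([]float32, len(b)/4)
    for i := range v {
        v[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[4*i:]))
    }
    return v
}

func main() {
    orig := []float32{0.25, -1.5, 3.0}
    round := decodeVector(encodeVector(orig))
    fmt.Println(len(encodeVector(orig)), round[1]) // 12 -1.5
}
```

The round trip is exact because Float32bits/Float32frombits preserve the bit pattern; no precision is lost in storage.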


3. Retroactive Matching (Bidirectional)

3.1 The Problem (from SPEC-REVIEW §3.1)

The original spec only describes matching when a buyer submits a request. But:

  • New request should search existing published answers
  • New answer (when published) should search open requests

Both directions are required for complete coverage.

3.2 Matching Directions

Direction 1: Request → Answers
  Trigger: Request created OR request text updated
  Search:  All published answers in accessible workstreams
  Output:  Suggested answer_links with ai_score

Direction 2: Answer → Requests  
  Trigger: Answer published (stage = dataroom)
  Search:  All open requests in accessible workstreams
  Output:  Suggested answer_links with ai_score

3.3 Implementation

// Called when request is created or body/title changes
func MatchRequestToAnswers(ctx context.Context, requestID string) ([]AnswerMatch, error) {
    // 1. Get request embedding (create if missing)
    // 2. Load all published answer embeddings in same workstream
    // 3. Cosine similarity against each
    // 4. Return matches above threshold
}

// Called when answer is published
func MatchAnswerToRequests(ctx context.Context, answerID string) ([]RequestMatch, error) {
    // 1. Get answer embedding(s) (multi-chunk: use max score across chunks)
    // 2. Load all open request embeddings in same workstream
    // 3. Cosine similarity against each
    // 4. Return matches above threshold
}
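Steps 3 and 4 of both directions reduce to the same scoring helper. A sketch, with a hypothetical `scored` type standing in for the loaded candidates:

```go
package main

import (
    "fmt"
    "sort"
)

// scored pairs a candidate entry with its cosine similarity.
type scored struct {
    EntryID string
    Score   float32
}

// topMatches keeps candidates at or above threshold, best first.
// For multi-chunk answers, callers should first reduce chunk scores
// to a per-entry max before calling this.
func topMatches(candidates []scored, threshold float32) []scored {
    var out []scored
    for _, c := range candidates {
        if c.Score >= threshold {
            out = append(out, c)
        }
    }
    sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
    return out
}

func main() {
    got := topMatches([]scored{{"a1", 0.71}, {"a2", 0.93}, {"a3", 0.80}}, 0.72)
    fmt.Println(got) // highest score first; a1 filtered out
}
```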

3.4 Matching on Update

If a request body is edited:

  1. Recompute embedding (check text_hash — skip if unchanged)
  2. Re-run matching
  3. New suggestions appear; existing confirmed links preserved
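The text_hash check in step 1 is a plain SHA-256 over the embed text (the same value stored in embeddings.text_hash and embed_audit):

```go
package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// textHash computes the SHA-256 stored in embeddings.text_hash,
// used to skip re-embedding when the text is unchanged.
func textHash(text string) string {
    sum := sha256.Sum256([]byte(text))
    return hex.EncodeToString(sum[:])
}

func main() {
    before := textHash("Request: FY2024 audited financials")
    after := textHash("Request: FY2024 audited financials")
    fmt.Println(before == after) // true — skip the API call
}
```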

If an answer body is edited before publish:

  • No action (draft state)

If an answer is re-published (correction):

  • Re-run matching
  • Flag for human review if new requests match

4. Broadcast Idempotency

4.1 The Problem (from SPEC-REVIEW §3.2)

Multiple requests can link to the same answer. Without idempotency:

  • Confirming R1↔A1 sends broadcast to Buyer A
  • Confirming R2↔A1 sends another broadcast to Buyer B
  • If Buyer A also asked R2... they get two notifications

4.2 Broadcasts Table

CREATE TABLE broadcasts (
    id           TEXT PRIMARY KEY,
    answer_id    TEXT NOT NULL REFERENCES entries(entry_id),
    request_id   TEXT NOT NULL REFERENCES entries(entry_id),
    recipient_id TEXT NOT NULL REFERENCES users(id),
    channel      TEXT NOT NULL,       -- "web" | "email" | "slack" | "teams"
    sent_at      INTEGER NOT NULL,
    UNIQUE(answer_id, request_id, recipient_id, channel)
);

CREATE INDEX idx_broadcasts_answer ON broadcasts(answer_id);
CREATE INDEX idx_broadcasts_recipient ON broadcasts(recipient_id);

4.3 Broadcast Logic

func BroadcastAnswer(ctx context.Context, tx *sql.Tx, answerID string) error {
    // 1. Get all confirmed answer_links for this answer
    links, err := getConfirmedLinks(tx, answerID)
    if err != nil {
        return err
    }
    
    // 2. For each link, get the request's origin_id (ultimate requester)
    recipients := make(map[string][]string) // user_id -> []request_id
    for _, link := range links {
        req, err := getRequest(tx, link.RequestID)
        if err != nil {
            return err
        }
        recipients[req.OriginID] = append(recipients[req.OriginID], link.RequestID)
    }
    
    // 3. For each recipient, check idempotency and send
    for userID, requestIDs := range recipients {
        for _, reqID := range requestIDs {
            // Check if already sent
            exists, err := broadcastExists(tx, answerID, reqID, userID, "web")
            if err != nil {
                return err
            }
            if exists {
                continue // idempotent skip
            }
            
            // Record broadcast
            err = insertBroadcast(tx, answerID, reqID, userID, "web", time.Now().UnixMilli())
            if err != nil {
                return err
            }
            
            // Queue notification; the queue must deliver only after the
            // transaction commits, or a rollback leaves a phantom notification
            NotifyQueue.Push(Notification{
                UserID:    userID,
                AnswerID:  answerID,
                RequestID: reqID,
            })
        }
    }
    
    return nil
}

4.4 Notification Deduplication

Even with idempotency per (answer, request, recipient), a user might get multiple notifications if they asked 5 equivalent questions.

User-facing behavior: Collapse into single notification:

"Answer published: [Title] — resolves 5 of your requests"

This is a presentation concern, not a data model change. The broadcasts table tracks each link; the notification renderer collapses them.
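The renderer-side collapse can be as simple as the sketch below. The plural message mirrors the copy above; the singular wording is an assumption:

```go
package main

import "fmt"

// renderNotification collapses N resolved requests into one message.
func renderNotification(answerTitle string, requestCount int) string {
    if requestCount == 1 {
        return fmt.Sprintf("Answer published: %s — resolves your request", answerTitle)
    }
    return fmt.Sprintf("Answer published: %s — resolves %d of your requests", answerTitle, requestCount)
}

func main() {
    fmt.Println(renderNotification("FY2024 Audited Financial Statements", 5))
}
```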


5. Configurable Similarity Threshold

5.1 The Problem (from SPEC-REVIEW §7.4)

Hardcoded 0.72 is a magic number that:

  • May be too strict for some workstreams (legal requests are verbose)
  • May be too loose for others (financial requests are terse)
  • Cannot be tuned without code changes

5.2 Per-Workstream Configuration

Add to workstream entry's Data:

{
  "name": "Finance",
  "match_config": {
    "threshold": 0.72,
    "auto_confirm_threshold": 0.95,
    "cross_workstream": ["Legal"]
  }
}

| Field | Type | Default | Description |
|---|---|---|---|
| threshold | float | 0.72 | Minimum score to suggest match |
| auto_confirm_threshold | float | null | If set, scores above this auto-confirm (no human review) |
| cross_workstream | []string | [] | Workstream slugs to include in matching (see section 6) |
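Parsing match_config with the documented defaults can be sketched as follows; the Go struct and helper names are assumptions mirroring the JSON above:

```go
package main

import (
    "encoding/json"
    "fmt"
)

// MatchConfig mirrors the match_config block.
type MatchConfig struct {
    Threshold            float64  `json:"threshold"`
    AutoConfirmThreshold *float64 `json:"auto_confirm_threshold"` // nil = disabled
    CrossWorkstream      []string `json:"cross_workstream"`
}

// parseMatchConfig applies the documented default threshold of 0.72
// when the workstream omits it (or has no config at all).
func parseMatchConfig(raw []byte) (MatchConfig, error) {
    cfg := MatchConfig{Threshold: 0.72}
    if len(raw) == 0 {
        return cfg, nil
    }
    err := json.Unmarshal(raw, &cfg)
    return cfg, err
}

func main() {
    cfg, _ := parseMatchConfig([]byte(`{"cross_workstream":["Legal"]}`))
    fmt.Println(cfg.Threshold, cfg.CrossWorkstream, cfg.AutoConfirmThreshold == nil)
    // 0.72 [Legal] true
}
```

Using a pointer for auto_confirm_threshold distinguishes "not set" (null) from an explicit value, matching the table's null default.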

5.3 Threshold Tuning Guidance

| Workstream Type | Recommended Threshold | Rationale |
|---|---|---|
| Finance | 0.72 | Standard M&A requests, well-defined terminology |
| Legal | 0.68 | Verbose requests with boilerplate; semantic core is smaller |
| IT | 0.75 | Technical specificity matters; false positives costly |
| HR | 0.70 | Mix of standard and org-specific terms |
| Operations | 0.72 | General business terminology |

5.4 Calibration Process

After initial deal data:

  1. Export all (request, answer, human_confirmed) tuples
  2. Compute score distribution for confirmed vs. rejected matches
  3. Adjust threshold to maximize F1 score per workstream
  4. Log threshold changes to audit for compliance

6. Cross-Workstream Matching

6.1 Use Case

An IT request ("Describe cybersecurity insurance coverage") may be answered by a Legal answer (cyber liability policy document).

Without cross-workstream matching, the IT buyer never sees the Legal answer.

6.2 Opt-In Per Workstream Pair

Configured in each workstream's match_config.cross_workstream:

// IT workstream
{
  "match_config": {
    "cross_workstream": ["Legal"]  // IT requests search Legal answers
  }
}

// Legal workstream  
{
  "match_config": {
    "cross_workstream": ["IT", "Finance"]  // Legal requests search IT and Finance
  }
}

Relationship is directional: IT searching Legal doesn't imply Legal searches IT.

6.3 RBAC Interaction

Cross-workstream matching only returns answers the requester can access:

func GetMatchableAnswers(ctx context.Context, actorID, requestWorkstreamID string) ([]Answer, error) {
    // 1. Get workstream config
    ws, _ := getWorkstream(requestWorkstreamID)
    
    // 2. Build workstream list (self + cross)
    workstreams := []string{ws.ID}
    workstreams = append(workstreams, ws.MatchConfig.CrossWorkstream...)
    
    // 3. Filter by access (RBAC)
    var accessible []string
    for _, wsID := range workstreams {
        if hasAccess(actorID, wsID, "read") {
            accessible = append(accessible, wsID)
        }
    }
    
    // 4. Get published answers from accessible workstreams
    return getPublishedAnswers(accessible)
}

7. Request Deduplication (Auto-Suggest Existing Answers)

7.1 The Problem

Buyer B asks the same question Buyer A already got answered. Without dedup:

  • Seller does duplicate work
  • IB reviews duplicate request
  • Buyer B waits when answer already exists

7.2 Dedup Flow

Buyer B submits request
  → Embed request
  → Search published answers (same logic as section 3)
  → If match score ≥ threshold:
      → Show Buyer B: "Similar answer already exists — view it?"
      → If Buyer B accepts: link request to existing answer, mark resolved
      → If Buyer B declines: proceed with normal request flow

7.3 UX Considerations

Don't block submission. Show suggestion after submit, not as a gate:

┌─────────────────────────────────────────────────────────────┐
│ Your request has been submitted.                            │
│                                                              │
│ 💡 We found a similar published answer that may help:        │
│                                                              │
│    "FY2024 Audited Financial Statements"                    │
│    Published: 2026-02-15 | Similarity: 89%                  │
│                                                              │
│    [View Answer]  [This doesn't answer my question]         │
└─────────────────────────────────────────────────────────────┘

7.4 Data Model

When buyer accepts the suggestion:

INSERT INTO answer_links (answer_id, request_id, linked_by, linked_at, status, ai_score)
VALUES (?, ?, ?, ?, 'self_confirmed', ?);
-- status = 'self_confirmed' means buyer accepted the AI suggestion
-- no IB review required

When buyer declines:

INSERT INTO answer_links (answer_id, request_id, linked_by, linked_at, status, ai_score)
VALUES (?, ?, ?, ?, 'rejected_by_requester', ?);
-- Prevents suggesting this answer again for this request
-- Request proceeds to normal IB/Seller workflow

8. Race Condition Fixes (DB Transactions)

8.1 The Problem (from SPEC-REVIEW §3)

Without transactions:

  1. IB confirms match R1↔A1
  2. Concurrently, IB publishes A1
  3. The publish path fires a broadcast while the confirm is in flight
  4. The confirm completes and broadcasts again
  5. Result: duplicate notifications, or worse, inconsistent state

8.2 Transaction Boundaries

Atomic operation 1: Confirm Match

func ConfirmMatch(ctx context.Context, answerID, requestID, actorID string) error {
    return db.Transaction(func(tx *sql.Tx) error {
        // 1. Verify answer exists and is published
        answer, err := getAnswer(tx, answerID)
        if err != nil {
            return err // don't mask DB errors as "not published"
        }
        if answer.Status != "published" {
            return ErrAnswerNotPublished
        }
        
        // 2. Update answer_link status
        err = updateAnswerLink(tx, answerID, requestID, "confirmed", actorID)
        if err != nil {
            return err
        }
        
        // 3. Broadcast (idempotent)
        err = BroadcastAnswer(ctx, tx, answerID)
        if err != nil {
            return err
        }
        
        return nil
    })
}

Atomic operation 2: Publish Answer

func PublishAnswer(ctx context.Context, answerID, actorID string) error {
    return db.Transaction(func(tx *sql.Tx) error {
        // 1. Update answer status
        err := updateAnswerStatus(tx, answerID, "published", actorID)
        if err != nil {
            return err
        }
        
        // 2. Update entry stage to dataroom
        err = updateEntryStage(tx, answerID, "dataroom")
        if err != nil {
            return err
        }
        
        // 3. Run retroactive matching (creates pending answer_links)
        matches, err := MatchAnswerToRequests(ctx, answerID)
        if err != nil {
            return err
        }
        
        for _, m := range matches {
            err = insertAnswerLink(tx, answerID, m.RequestID, "pending", m.Score)
            if err != nil {
                return err
            }
        }
        
        // 4. Broadcast already-confirmed links (if any pre-existed)
        err = BroadcastAnswer(ctx, tx, answerID)
        if err != nil {
            return err
        }
        
        return nil
    })
}

8.3 Optimistic Locking

Add version column to prevent concurrent modifications:

ALTER TABLE entries ADD COLUMN version INTEGER NOT NULL DEFAULT 1;

The status update then takes the caller's expected version, reports a conflict if another writer got there first, and returns the new version:

func updateAnswerStatus(tx *sql.Tx, answerID, status string, expectedVersion int) (int, error) {
    result, err := tx.Exec(`
        UPDATE entries 
        SET data = json_set(data, '$.status', ?),
            version = version + 1,
            updated_at = ?
        WHERE entry_id = ? AND version = ?
    `, status, time.Now().UnixMilli(), answerID, expectedVersion)
    
    if err != nil {
        return 0, err
    }
    
    rows, _ := result.RowsAffected()
    if rows == 0 {
        return 0, ErrConcurrentModification
    }
    
    return expectedVersion + 1, nil
}

9. SQLite Cosine Similarity & Qdrant Migration

9.1 Pure Go Cosine Similarity

SQLite doesn't have native vector operations. Compute in Go:

// CosineSimilarity computes similarity between two vectors.
// Vectors must be same length. Returns value in [-1, 1].
func CosineSimilarity(a, b []float32) float32 {
    if len(a) != len(b) {
        panic("vector length mismatch")
    }
    
    var dotProduct, normA, normB float32
    for i := range a {
        dotProduct += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    
    if normA == 0 || normB == 0 {
        return 0
    }
    
    return dotProduct / (float32(math.Sqrt(float64(normA))) * float32(math.Sqrt(float64(normB))))
}

// BatchCosineSimilarity computes query vs all candidates.
// The tight loops compile to efficient code; pre-normalizing candidate
// vectors at write time would eliminate the per-candidate norm pass.
func BatchCosineSimilarity(query []float32, candidates [][]float32) []float32 {
    scores := make([]float32, len(candidates))
    
    // Pre-compute query norm
    var queryNorm float32
    for _, v := range query {
        queryNorm += v * v
    }
    queryNorm = float32(math.Sqrt(float64(queryNorm)))
    
    for i, candidate := range candidates {
        var dot, candNorm float32
        for j := range query {
            dot += query[j] * candidate[j]
            candNorm += candidate[j] * candidate[j]
        }
        candNorm = float32(math.Sqrt(float64(candNorm)))
        
        if queryNorm == 0 || candNorm == 0 {
            scores[i] = 0
        } else {
            scores[i] = dot / (queryNorm * candNorm)
        }
    }
    
    return scores
}

9.2 Performance Characteristics (SQLite + Go)

| Embeddings | Load Time | Search Time | Memory |
|---|---|---|---|
| 1,000 | 50ms | 2ms | 3 MB |
| 10,000 | 500ms | 20ms | 30 MB |
| 100,000 | 5s | 200ms | 300 MB |
| 1,000,000 | 50s | 2s | 3 GB |

Acceptable for MVP: Most deals have <10,000 documents. Search under 100ms is fine.

9.3 Qdrant Migration Threshold

Migrate to Qdrant when:

  1. Embedding count > 100,000 — search latency exceeds 200ms
  2. Memory pressure — embeddings consume >500MB RAM
  3. Multi-tenancy — need isolated collections per client (compliance)

9.4 Qdrant Integration (Future)

type VectorStore interface {
    Upsert(id string, vector []float32, metadata map[string]any) error
    Search(query []float32, filter map[string]any, limit int) ([]SearchResult, error)
    Delete(id string) error
}

// SQLiteVectorStore implements VectorStore using embeddings table
type SQLiteVectorStore struct { ... }

// QdrantVectorStore implements VectorStore using Qdrant API
type QdrantVectorStore struct { ... }

Abstract behind interface now; swap implementation later without code changes.
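A minimal in-memory implementation shows the shape both stores share. The interface is narrowed here for brevity (metadata filtering and Delete omitted), and the SearchResult fields are assumptions:

```go
package main

import (
    "fmt"
    "math"
    "sort"
)

// SearchResult carries an entry ID and its similarity score.
type SearchResult struct {
    ID    string
    Score float32
}

// VectorStore is a narrowed version of the section 9.4 interface.
type VectorStore interface {
    Upsert(id string, vector []float32) error
    Search(query []float32, limit int) ([]SearchResult, error)
}

// memStore is a brute-force in-memory store, the same shape a
// SQLiteVectorStore takes after loading the embeddings table.
type memStore struct{ vecs map[string][]float32 }

var _ VectorStore = (*memStore)(nil) // compile-time interface check

func (m *memStore) Upsert(id string, v []float32) error { m.vecs[id] = v; return nil }

func (m *memStore) Search(q []float32, limit int) ([]SearchResult, error) {
    var out []SearchResult
    for id, v := range m.vecs {
        out = append(out, SearchResult{id, cosine(q, v)})
    }
    sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
    if len(out) > limit {
        out = out[:limit]
    }
    return out, nil
}

// cosine duplicates section 9.1 to keep the sketch self-contained.
func cosine(a, b []float32) float32 {
    var dot, na, nb float32
    for i := range a {
        dot, na, nb = dot+a[i]*b[i], na+a[i]*a[i], nb+b[i]*b[i]
    }
    if na == 0 || nb == 0 {
        return 0
    }
    return dot / (float32(math.Sqrt(float64(na))) * float32(math.Sqrt(float64(nb))))
}

func main() {
    s := &memStore{vecs: map[string][]float32{}}
    s.Upsert("a1", []float32{1, 0})
    s.Upsert("a2", []float32{0, 1})
    res, _ := s.Search([]float32{1, 0.1}, 1)
    fmt.Println(res[0].ID) // a1
}
```

The `var _ VectorStore = (*memStore)(nil)` line is the idiomatic compile-time check that an implementation satisfies the interface; QdrantVectorStore would carry the same line.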

9.5 Hybrid Mode (Transition)

During migration:

  1. Write to both SQLite and Qdrant
  2. Read from Qdrant (with SQLite fallback)
  3. Validate results match for first 1000 queries
  4. Drop SQLite embeddings table after validation

10. Privacy: Plaintext Only, Never Files

10.1 Embedding Content Policy

ALLOWED to embed:

  • Request title
  • Request body text
  • Answer title
  • Answer body text (the explanation, not file contents)
  • Workstream name (context)

NEVER embed:

  • File contents (PDF, DOCX, XLSX, images)
  • File names (may contain deal names, party names)
  • Internal comments
  • Routing/assignment metadata
  • User names or email addresses

10.2 Why No File Embedding?

  1. Privacy: M&A documents contain material non-public information. Sending to ANY external API (even zero-retention) creates compliance risk.

  2. Size: A single PDF may be 100+ pages. Embedding would require chunking, storage, and search across potentially millions of chunks. Overkill for request-matching.

  3. Semantic mismatch: Request asks "audited financials for FY2024." The answer body says "Please find attached the FY2024 audited financial statements." The body text + title is sufficient for matching — we don't need to embed page 47 of the PDF.

10.3 Future: On-Premise OCR + Embedding

If file-level search becomes required:

  1. Run OCR on-premise (GLM-OCR on forge, not external API)
  2. Store extracted text in entry.data (encrypted at rest)
  3. Embed extracted text (still goes to Fireworks, but it's our extracted text, not raw file)

This is out of scope for MVP.

10.4 Audit Trail

Log every embedding request for compliance:

CREATE TABLE embed_audit (
    id         TEXT PRIMARY KEY,
    entry_id   TEXT NOT NULL,
    text_hash  TEXT NOT NULL,     -- SHA-256 of text sent
    text_len   INTEGER NOT NULL,  -- character count
    model      TEXT NOT NULL,
    requested_at INTEGER NOT NULL,
    latency_ms INTEGER,
    success    INTEGER NOT NULL
);

Do NOT log the actual text — that defeats the privacy purpose. Log the hash for correlation if needed.


11. lib/embed.go — Function Signatures

11.1 Public API

package lib

import (
    "context"
    "database/sql"
    "errors"
    "sync"
    "time"
)

// EmbedConfig holds embedding service configuration.
type EmbedConfig struct {
    APIKey      string  // FIREWORKS_API_KEY
    Endpoint    string  // defaults to Fireworks endpoint
    Model       string  // defaults to nomic-embed-text-v1.5
    Timeout     time.Duration
    MaxRetries  int
}

// EmbedResult contains the embedding and metadata.
type EmbedResult struct {
    Vector    []float32
    TextHash  string  // SHA-256 of input text
    Model     string
    TokenCount int
    LatencyMs int64
}

// MatchResult represents a potential match with score.
type MatchResult struct {
    EntryID   string
    ChunkIdx  int
    Score     float32
    EntryType string  // "request" | "answer"
}

// Embed generates an embedding for the given text.
// Returns ErrTextTooLong if text exceeds model context.
// Returns ErrEmptyText if text is empty or whitespace only.
func Embed(ctx context.Context, cfg *EmbedConfig, text string) (*EmbedResult, error)

// EmbedBatch generates embeddings for multiple texts.
// More efficient than calling Embed in a loop (single API call).
// Max 100 texts per batch.
func EmbedBatch(ctx context.Context, cfg *EmbedConfig, texts []string) ([]*EmbedResult, error)

// EmbedRequest creates and stores embedding for a request entry.
// Idempotent: skips if embedding exists and text_hash matches.
func EmbedRequest(ctx context.Context, db *sql.DB, cfg *EmbedConfig, requestID string) error

// EmbedAnswer creates and stores embedding(s) for an answer entry.
// May produce multiple chunks for long answers.
// Idempotent: skips chunks where text_hash matches.
func EmbedAnswer(ctx context.Context, db *sql.DB, cfg *EmbedConfig, answerID string) error

// MatchRequestToAnswers finds published answers matching the request.
// Returns matches above the workstream's configured threshold.
// Respects cross-workstream config and RBAC.
func MatchRequestToAnswers(ctx context.Context, db *sql.DB, actorID, requestID string) ([]MatchResult, error)

// MatchAnswerToRequests finds open requests matching the answer.
// Returns matches above the workstream's configured threshold.
// Respects cross-workstream config and RBAC.
func MatchAnswerToRequests(ctx context.Context, db *sql.DB, actorID, answerID string) ([]MatchResult, error)

// FindDuplicateRequests finds existing requests similar to the given text.
// Used for deduplication suggestions before/after submission.
func FindDuplicateRequests(ctx context.Context, db *sql.DB, actorID, workstreamID, text string) ([]MatchResult, error)

// CosineSimilarity computes similarity between two vectors.
func CosineSimilarity(a, b []float32) float32

// DeleteEmbeddings removes all embeddings for an entry.
// Called when entry is deleted.
func DeleteEmbeddings(ctx context.Context, db *sql.DB, entryID string) error

// RefreshEmbedding re-embeds an entry if content changed.
// Compares text_hash to detect changes.
// Returns true if embedding was updated.
func RefreshEmbedding(ctx context.Context, db *sql.DB, cfg *EmbedConfig, entryID string) (bool, error)

11.2 Async Embedding on Publish

Embedding should not block the user action. Use async processing:

// EmbedQueue is a background worker that processes embedding requests.
type EmbedQueue struct {
    cfg    *EmbedConfig
    db     *sql.DB
    queue  chan embedJob
    wg     sync.WaitGroup
}

type embedJob struct {
    EntryID   string
    EntryType string  // "request" | "answer"
    Priority  int     // 0 = normal, 1 = high (new request needs matching)
}

// Start begins processing the embedding queue.
// Workers defaults to 2 (Fireworks rate limit friendly).
func (q *EmbedQueue) Start(workers int)

// Stop gracefully shuts down the queue.
func (q *EmbedQueue) Stop()

// Enqueue adds an entry for embedding.
// Non-blocking; returns immediately.
func (q *EmbedQueue) Enqueue(entryID, entryType string, priority int)

11.3 Integration Points

On Request Create:

func HandleCreateRequest(w http.ResponseWriter, r *http.Request) {
    // ... validation, RBAC, insert entry ...
    
    // Queue embedding (non-blocking)
    embedQueue.Enqueue(request.ID, "request", 1)  // high priority
    
    // Return success immediately
    respondJSON(w, request)
}

On Answer Publish:

func HandlePublishAnswer(w http.ResponseWriter, r *http.Request) {
    err := db.Transaction(func(tx *sql.Tx) error {
        // ... update status, stage ...
        
        // Embedding happens inline for matching (within transaction timeout)
        err := EmbedAnswer(ctx, tx, cfg, answer.ID)
        if err != nil {
            // Log but don't fail — matching can happen later
            log.Warn("embedding failed, will retry", "error", err)
        }
        
        // Match and create answer_links
        matches, _ := MatchAnswerToRequests(ctx, tx, actorID, answer.ID)
        for _, m := range matches {
            insertAnswerLink(tx, answer.ID, m.EntryID, "pending", m.Score)
        }
        
        // Broadcast confirmed links
        return BroadcastAnswer(ctx, tx, answer.ID)
    })
    if err != nil {
        http.Error(w, "publish failed", http.StatusInternalServerError)
        return
    }
    
    respondJSON(w, answer)
}

11.4 Error Handling

var (
    ErrTextTooLong       = errors.New("text exceeds model context limit")
    ErrEmptyText         = errors.New("text is empty or whitespace only")
    ErrEmbeddingFailed   = errors.New("embedding API call failed")
    ErrRateLimited       = errors.New("embedding API rate limited")
    ErrNoEmbedding       = errors.New("entry has no embedding")
    ErrWorkstreamConfig  = errors.New("workstream missing match configuration")
)

Retry policy for transient errors:

  • ErrRateLimited: exponential backoff, max 3 retries
  • ErrEmbeddingFailed: retry once after 1s
  • All others: fail immediately

12. answer_links Schema

Incorporates SPEC-REVIEW feedback on rejection tracking:

CREATE TABLE answer_links (
    answer_id    TEXT NOT NULL REFERENCES entries(entry_id),
    request_id   TEXT NOT NULL REFERENCES entries(entry_id),
    
    -- Who created the link
    linked_by    TEXT NOT NULL,
    linked_at    INTEGER NOT NULL,
    
    -- AI matching metadata
    ai_score     REAL,           -- cosine similarity at time of match
    ai_model     TEXT,           -- model used for matching
    
    -- Review status
    status       TEXT NOT NULL DEFAULT 'pending',
        -- 'pending': AI suggested, awaiting human review
        -- 'confirmed': IB confirmed the match
        -- 'rejected': IB rejected the match
        -- 'self_confirmed': requester accepted dedup suggestion
        -- 'rejected_by_requester': requester declined dedup suggestion
    
    reviewed_by  TEXT,           -- who reviewed (if status != pending)
    reviewed_at  INTEGER,        -- when reviewed
    reject_reason TEXT,          -- why rejected (if status = rejected)
    
    PRIMARY KEY (answer_id, request_id)
);

CREATE INDEX idx_links_answer ON answer_links(answer_id);
CREATE INDEX idx_links_request ON answer_links(request_id);
CREATE INDEX idx_links_status ON answer_links(status);

13. Summary: What Gets Built

| Component | Location | Purpose |
|---|---|---|
| lib/embed.go | Core embedding logic | API calls, similarity, storage |
| embeddings table | Schema | Vector storage |
| broadcasts table | Schema | Idempotency |
| answer_links | Schema update | Status + rejection tracking |
| embed_audit table | Schema | Compliance logging |
| EmbedQueue | Background worker | Async processing |
| Workstream config | Entry.Data | Per-workstream thresholds |

Not built (future):

  • Qdrant integration (interface defined, impl deferred)
  • File content embedding (privacy: out of scope)
  • Auto-confirm (threshold defined, feature disabled for MVP)

This document extends SPEC.md. If conflicts exist, discuss before implementing.