# Dealspace — Architecture Specification

**Version:** 0.1 — 2026-02-28
**Status:** Pre-implementation. This document is the ground truth. Code follows the spec, never the reverse.

---

## 1. What This Is

A workflow platform for M&A deal management. Investment Banks, Sellers, and Buyers collaborate on a structured request-and-answer system. The core primitive is a **Request** — not a document, not a folder. Documents are how requests get resolved.

**Not** a VDR that grew features. Designed clean, from first principles.

---

## 2. What This Is Not

- Not a document repository with a request list bolted on
- Not a project management tool with deal branding
- Not a clone of any existing product
- Not feature-complete on day one — the spec defines the architecture; MVP scope is separate

---

## 3. The Flow

```
IB creates Project
  → configures Workstreams (Finance, Legal, IT, HR, Operations...)
  → invites Participants (assigns role per workstream)
  → issues Request List to Seller

Seller receives Requests
  → assigns internally
  → uploads Answers (documents, data)
  → marks complete

IB vets Answers
  → approves → Answer published to Data Room
  → rejects → back to Seller with comment

Buyers enter
  → submit Requests (via Data Room interface)
  → AI matches against existing Answers (human confirms)
  → unmatched → routed to IB/Seller for resolution
  → Answer published → broadcast to all Buyers who asked equivalent question
```

---

## 4. Core Data Model — Entry-Based

Inspired directly by inou's entry architecture. **One table to rule them all.**

### 4.1 The Entry

```go
type Entry struct {
	EntryID    string // UUID, plain (never encrypted)
	ProjectID  string // UUID, plain
	ParentID   string // UUID, plain (empty = project root)
	Type       string // structural kind (see 4.2)
	Depth      int    // 0=project, 1=workstream, 2=list, 3=request/answer
	SearchKey  string // packed: primary lookup (workstream slug, request ref#)
	SearchKey2 string // packed: secondary lookup (requester org, answer hash)
	Summary    string // packed: structural/navigational ONLY — no content
	Data       []byte // packed: all content, metadata, routing, status
	Stage      string // plain: "pre_dataroom" | "dataroom" | "closed"
	CreatedAt  int64  // unix ms, plain
	UpdatedAt  int64  // unix ms, plain
	CreatedBy  string // user ID, plain
}
```

**Rule:** Summary is navigational only. Never put content in Summary. LLMs and MCP tools read Data.

### 4.2 Entry Types (Type field)

| Type           | Depth | Description                              |
|----------------|-------|------------------------------------------|
| `project`      | 0     | Top-level container                      |
| `workstream`   | 1     | RBAC anchor (Finance, Legal, IT...)      |
| `request_list` | 2     | Named collection of requests             |
| `request`      | 3     | A single request item                    |
| `answer`       | 3     | A response to one or more requests       |
| `comment`      | *     | Threaded comment on any entry            |

### 4.3 Answer → Request Links (many-to-many)

One answer can satisfy N requests. When an answer is published, all linked requests are notified — broadcast to all requesting parties who have access.

```sql
CREATE TABLE answer_links (
    answer_id   TEXT NOT NULL REFERENCES entries(entry_id),
    request_id  TEXT NOT NULL REFERENCES entries(entry_id),
    linked_by   TEXT NOT NULL,
    linked_at   INTEGER NOT NULL,
    confirmed   INTEGER NOT NULL DEFAULT 0,  -- 1 = human confirmed
    ai_score    REAL,
    PRIMARY KEY (answer_id, request_id)
);
```

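The broadcast fan-out described above can be sketched as follows. This is a minimal illustration, not the intended implementation: `AnswerLink`, `broadcastRecipients`, and the `requesterOf` map are hypothetical names, the rule that only confirmed links trigger a broadcast is an assumption taken from the matching pipeline in section 8, and the access check on each recipient is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// AnswerLink mirrors the answer_links columns this sketch needs.
type AnswerLink struct {
	AnswerID  string
	RequestID string
	Confirmed bool
}

// broadcastRecipients returns the distinct requesters to notify when answerID
// is published. requesterOf maps request ID to requesting user ID and is a
// hypothetical stand-in for reading the request entries.
func broadcastRecipients(links []AnswerLink, requesterOf map[string]string, answerID string) []string {
	seen := map[string]bool{}
	var out []string
	for _, l := range links {
		if l.AnswerID != answerID || !l.Confirmed {
			continue // assumption: only human-confirmed links trigger a broadcast
		}
		user, ok := requesterOf[l.RequestID]
		if !ok || seen[user] {
			continue // unknown request, or this requester is already listed
		}
		seen[user] = true
		out = append(out, user)
	}
	sort.Strings(out) // stable order for callers and tests
	return out
}

func main() {
	links := []AnswerLink{
		{"ans-1", "req-1", true},
		{"ans-1", "req-2", true},
		{"ans-1", "req-3", false}, // AI suggestion, not yet confirmed
	}
	requesterOf := map[string]string{"req-1": "buyer-a", "req-2": "buyer-b", "req-3": "buyer-c"}
	fmt.Println(broadcastRecipients(links, requesterOf, "ans-1")) // [buyer-a buyer-b]
}
```

Deduplicating by requester keeps a buyer who asked the same question twice from receiving two notifications for one answer.
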
### 4.4 Entry Data (JSON inside packed blob)

For `request`:
```json
{
  "title": "Provide audited financials FY2024",
  "body": "...",
  "priority": "high|normal|low",
  "due_date": "2026-03-15",
  "assigned_to": ["user_id"],
  "status": "open|assigned|answered|vetted|published|closed",
  "ref": "FIN-042"
}
```

For `answer`:
```json
{
  "title": "...",
  "body": "...",
  "files": ["object_id"],
  "status": "draft|submitted|approved|rejected|published",
  "rejection_reason": "...",
  "broadcast_to": "linked_requesters|all_workstream|all_dataroom"
}
```

---

## 5. RBAC

### 5.1 Roles

| Role            | Scope          | Permissions                                              |
|-----------------|----------------|----------------------------------------------------------|
| `ib_admin`      | Project        | Full control, all workstreams                            |
| `ib_member`     | Workstream(s)  | Manage requests + vet answers in assigned workstreams    |
| `seller_admin`  | Project        | See all requests directed at seller, manage seller team  |
| `seller_member` | Workstream(s)  | Answer requests in assigned workstreams                  |
| `buyer_admin`   | Project        | Manage buyer team, see data room                         |
| `buyer_member`  | Workstream(s)  | Submit requests, view published data room answers        |
| `observer`      | Workstream(s)  | Read-only, no submission                                 |

### 5.2 Access Table

```sql
CREATE TABLE access (
    id            TEXT PRIMARY KEY,
    project_id    TEXT NOT NULL,
    workstream_id TEXT,              -- null = all workstreams in this project
    user_id       TEXT NOT NULL,
    role          TEXT NOT NULL,
    ops           TEXT NOT NULL,     -- "r", "rw", "rwdm"
    granted_by    TEXT NOT NULL,
    granted_at    INTEGER NOT NULL
);
```

**RBAC anchor:** The `workstream` entry (depth 1) is the access root. Every operation walks up to the workstream to resolve permissions.

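The walk-up rule can be sketched in a few lines. This is an illustration under stated assumptions rather than the real `CheckAccess`: `Grant` and `Node` are hypothetical in-memory stand-ins for the access table and the entry tree, an empty `WorkstreamID` models the SQL null (all workstreams), and each letter in `ops` is assumed to name one permitted operation.

```go
package main

import (
	"fmt"
	"strings"
)

// Grant mirrors the access-table columns this sketch needs.
type Grant struct {
	ProjectID    string
	WorkstreamID string // empty = all workstreams in the project
	UserID       string
	Ops          string // "r", "rw", "rwdm"
}

// Node is a minimal stand-in for an Entry: just enough to walk the tree.
type Node struct {
	ID, ParentID, ProjectID, Type string
}

// workstreamOf follows ParentID links until it reaches a workstream entry.
func workstreamOf(nodes map[string]Node, id string) (string, bool) {
	for steps := 0; steps < 16; steps++ { // bounded, guards against cycles
		n, ok := nodes[id]
		if !ok {
			return "", false
		}
		if n.Type == "workstream" {
			return n.ID, true
		}
		id = n.ParentID
	}
	return "", false
}

// CheckAccess reports whether userID may perform op ('r', 'w', 'd' or 'm')
// on entryID. It fails closed when no workstream ancestor is found.
func CheckAccess(nodes map[string]Node, grants []Grant, userID, entryID string, op byte) bool {
	n, ok := nodes[entryID]
	if !ok {
		return false
	}
	ws, ok := workstreamOf(nodes, entryID)
	if !ok {
		return false
	}
	for _, g := range grants {
		if g.UserID != userID || g.ProjectID != n.ProjectID {
			continue
		}
		if g.WorkstreamID != "" && g.WorkstreamID != ws {
			continue
		}
		if strings.IndexByte(g.Ops, op) >= 0 {
			return true
		}
	}
	return false
}

func main() {
	nodes := map[string]Node{
		"ws-fin": {ID: "ws-fin", ParentID: "proj", ProjectID: "proj", Type: "workstream"},
		"list-1": {ID: "list-1", ParentID: "ws-fin", ProjectID: "proj", Type: "request_list"},
		"req-1":  {ID: "req-1", ParentID: "list-1", ProjectID: "proj", Type: "request"},
	}
	grants := []Grant{{ProjectID: "proj", WorkstreamID: "ws-fin", UserID: "alice", Ops: "rw"}}
	fmt.Println(CheckAccess(nodes, grants, "alice", "req-1", 'w')) // true
	fmt.Println(CheckAccess(nodes, grants, "alice", "req-1", 'd')) // false
}
```

Project-level entries have no workstream ancestor, so this sketch denies them; how the real function treats depth 0 is not specified here.
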
### 5.3 The Single Throat

Three entry choke points, plus one for object I/O. No exceptions. Not even "just this once."

```
All Reads   → EntryRead(actorID, projectID, filter)   → CheckAccess → query
All Writes  → EntryWrite(actorID, entries...)         → CheckAccess → save
All Deletes → EntryDelete(actorID, projectID, filter) → CheckAccess → delete
Object I/O  → ObjectRead/ObjectWrite/ObjectDelete     → CheckAccess → storage
```

No handler ever touches the DB directly. No raw SQL outside lib/dbcore.go.

### 5.4 Data Room Visibility

`Stage = "pre_dataroom"` entries are invisible to buyer roles — not filtered in the UI, invisible at the DB layer via CheckAccess. Buyers cannot see a question exists, let alone its answer, until the IB publishes it to the data room.

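A minimal sketch of that stage gate, assuming it is a pure function of role and stage that CheckAccess consults before any query runs. `stageVisible` is a hypothetical name, and the sketch restricts only the two buyer roles from the table in 5.1; how `observer` and the `closed` stage are treated is not stated in this spec and is left permissive here.

```go
package main

import (
	"fmt"
	"strings"
)

// stageVisible reports whether a role may see entries in the given stage.
// Assumption: any role named "buyer_*" is a buyer role.
func stageVisible(role, stage string) bool {
	if strings.HasPrefix(role, "buyer_") {
		return stage != "pre_dataroom"
	}
	return true
}

func main() {
	fmt.Println(stageVisible("buyer_member", "pre_dataroom")) // false
	fmt.Println(stageVisible("buyer_member", "dataroom"))     // true
	fmt.Println(stageVisible("ib_member", "pre_dataroom"))    // true
}
```
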

---

## 6. Security

### 6.1 Encryption & Compression

All string content fields use Pack / Unpack:

```
Pack:   raw string → zstd compress → AES-256-GCM encrypt → []byte
Unpack: []byte → decrypt → decompress → string
```

- SearchKey/SearchKey2: Deterministic encryption (HMAC-derived nonce) — allows indexed lookups
- Summary, Data: Non-deterministic (random nonce) — never queried directly
- IDs, integers, Stage: Plain text — structural, never sensitive
- Files: ObjectWrite encrypts before storage; ObjectRead decrypts on serve

Per-project encryption key derived from master key + project_id. One project's key compromise does not affect others.

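The Pack / Unpack round trip can be sketched with the Go standard library alone. Several things here are assumptions, not the spec: `compress/flate` stands in for zstd (the standard library has no zstd), HMAC-SHA256 over the project ID stands in for the unnamed key-derivation step, the nonce is stored in front of the ciphertext, and the deterministic nonce reuses the encryption key where a production design would likely use a separate subkey.

```go
package main

import (
	"bytes"
	"compress/flate"
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
	"io"
)

// projectKey derives a 32-byte per-project key (assumed KDF: HMAC-SHA256).
func projectKey(master []byte, projectID string) []byte {
	mac := hmac.New(sha256.New, master)
	mac.Write([]byte(projectID))
	return mac.Sum(nil)
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Pack compresses s, then seals it with AES-256-GCM as nonce||ciphertext.
// With deterministic=true the nonce is an HMAC of the compressed plaintext,
// so equal inputs under one key produce equal output.
func Pack(key []byte, s string, deterministic bool) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	zw, err := flate.NewWriter(&buf, flate.BestSpeed) // stand-in for zstd
	if err != nil {
		return nil, err
	}
	if _, err := zw.Write([]byte(s)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if deterministic {
		mac := hmac.New(sha256.New, key)
		mac.Write(buf.Bytes())
		copy(nonce, mac.Sum(nil))
	} else if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, buf.Bytes(), nil), nil
}

// Unpack reverses Pack: split off the nonce, decrypt, decompress.
func Unpack(key, packed []byte) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	ns := gcm.NonceSize()
	if len(packed) < ns {
		return "", errors.New("packed value too short")
	}
	plain, err := gcm.Open(nil, packed[:ns], packed[ns:], nil)
	if err != nil {
		return "", err
	}
	out, err := io.ReadAll(flate.NewReader(bytes.NewReader(plain)))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	key := projectKey([]byte("demo master key"), "project-123")
	a, _ := Pack(key, "FIN-042", true)
	b, _ := Pack(key, "FIN-042", true)
	c, _ := Pack(key, "FIN-042", false)
	s, err := Unpack(key, c)
	fmt.Println(bytes.Equal(a, b), s, err)
}
```

The deterministic mode is what makes `WHERE search_key = ?` possible on an encrypted column: the query packs the lookup value with the same key and compares bytes.
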
### 6.2 File Protection Pipeline

Files are never served raw. Every file goes through the protection pipeline at serve time. The stored file is always the clean original.

| Type          | Protection                                                       |
|---------------|------------------------------------------------------------------|
| PDF           | Dynamic watermark (user + timestamp + org) rendered per-request  |
| Word (.docx)  | Watermark injected into document XML before serve                |
| Excel (.xlsx) | Sheet protection + watermark header row injected before serve    |
| Images        | Watermark text burned into pixel data per-request                |
| Video         | Watermark overlay via ffmpeg, served as stream                   |
| Other         | Encrypted download only, no preview                              |

Watermark content (configurable per project):
```
{user_name} · {org_name} · {datetime} · CONFIDENTIAL
```

Watermarks are generated at serve time. Parameters are project-level config, not hardcoded.

### 6.3 Storage Pricing (Competitive Advantage)

Files stored compressed + encrypted. No per-MB extortion. Competitors charge up to $20/MB for "secure storage." We store at actual cost. This is a direct and easy competitive win for Misha.

### 6.4 Audit Log

Every access grant change, file download, status transition — logged.

```sql
CREATE TABLE audit (
    id         TEXT PRIMARY KEY,
    project_id TEXT NOT NULL,
    actor_id   TEXT NOT NULL,
    action     TEXT NOT NULL,   -- packed
    target_id  TEXT,
    details    TEXT,            -- packed
    ip         TEXT,
    ts         INTEGER NOT NULL
);
```

---

## 7. Object Store

```go
type ObjectStore interface {
	Write(id string, data []byte) error
	Read(id string) ([]byte, error)
	Delete(id string) error
	Exists(id string) bool
}
```

Object ID = SHA-256 of encrypted content. Content-addressable — automatic dedup.

Implementations: local filesystem (default), S3-compatible (plug-in). App code never knows the difference.

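To show how a backend plugs into this interface, here is a minimal in-memory implementation. It is a sketch for tests and illustration only: `MemStore`, `NewMemStore`, `ObjectID`, and `ErrNotFound` are hypothetical names, and the ID is the full hex SHA-256 digest of the bytes handed in, with any shortening left to the real implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

type ObjectStore interface {
	Write(id string, data []byte) error
	Read(id string) ([]byte, error)
	Delete(id string) error
	Exists(id string) bool
}

// ErrNotFound is returned by Read for unknown IDs (hypothetical error value).
var ErrNotFound = errors.New("object not found")

// ObjectID returns the content address of already-encrypted bytes.
func ObjectID(encrypted []byte) string {
	sum := sha256.Sum256(encrypted)
	return hex.EncodeToString(sum[:])
}

// MemStore keeps objects in a map; safe for concurrent use.
type MemStore struct {
	mu      sync.RWMutex
	objects map[string][]byte
}

var _ ObjectStore = (*MemStore)(nil) // compile-time interface check

func NewMemStore() *MemStore { return &MemStore{objects: map[string][]byte{}} }

func (m *MemStore) Write(id string, data []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.objects[id]; ok {
		return nil // same ID means same bytes, so the write is a no-op (dedup)
	}
	m.objects[id] = append([]byte(nil), data...) // copy: caller may reuse data
	return nil
}

func (m *MemStore) Read(id string) ([]byte, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	data, ok := m.objects[id]
	if !ok {
		return nil, ErrNotFound
	}
	return append([]byte(nil), data...), nil
}

func (m *MemStore) Delete(id string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.objects, id)
	return nil
}

func (m *MemStore) Exists(id string) bool {
	m.mu.RLock()
	defer m.mu.RUnlock()
	_, ok := m.objects[id]
	return ok
}

func main() {
	var store ObjectStore = NewMemStore()
	blob := []byte("ciphertext bytes")
	id := ObjectID(blob)
	_ = store.Write(id, blob)
	_ = store.Write(id, blob) // second write of identical content is a no-op
	fmt.Println(len(id), store.Exists(id)) // 64 true
}
```
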

---

## 8. AI Matching Pipeline

When a buyer submits a request:

1. Embed request text (Fireworks nomic-embed-text-v1.5 — zero retention)
2. Cosine similarity vs all published answers in same workstream
3. Score ≥ 0.72 → suggest match, require human confirmation
4. Score < 0.72 → route to IB/Seller for manual response
5. Human confirms → answer_links row `confirmed=1`, broadcast fires

Request text is sent only to Fireworks, which operates under a zero-retention policy. Same infra as inou.

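Steps 2 through 4 reduce to a cosine-similarity scan with a fixed threshold. The sketch below assumes embeddings are already available as `[]float64` slices; `Candidate`, `cosine`, and `bestMatch` are hypothetical names, and the embedding call itself is omitted.

```go
package main

import (
	"fmt"
	"math"
)

const matchThreshold = 0.72 // from the pipeline above

// Candidate is a published answer with its stored embedding.
type Candidate struct {
	AnswerID string
	Vector   []float64
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors
// are empty, of different length, or have zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// bestMatch scans the answers of one workstream and reports the highest
// scoring one. suggest is true when the score reaches the threshold; the
// match still needs human confirmation before any link is confirmed.
func bestMatch(request []float64, answers []Candidate) (id string, score float64, suggest bool) {
	for _, c := range answers {
		if s := cosine(request, c.Vector); id == "" || s > score {
			id, score = c.AnswerID, s
		}
	}
	return id, score, id != "" && score >= matchThreshold
}

func main() {
	answers := []Candidate{
		{"ans-audit", []float64{1, 0}},
		{"ans-leases", []float64{0, 1}},
	}
	id, score, suggest := bestMatch([]float64{1, 0}, answers)
	fmt.Println(id, score, suggest) // ans-audit 1 true
}
```

When `suggest` is false the request falls through to step 4 and is routed to the IB/Seller for a manual response.
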

---

## 9. Themes

Theme = CSS custom properties bundle. Zero hardcoded colors in templates. Every color references a CSS var.

```go
type Theme struct {
	ID         string
	Name       string
	ProjectID  string // empty = system theme
	Properties string // packed — CSS vars as JSON
}
```

Built-in: Light, Dark, High-contrast. Projects can define a custom theme (brand colors, logo). Users can override with personal preference. Theme switching = swap one class on `<html>`. No JavaScript framework required.

---

## 10. MCP Support

MCP server exposes deal context to AI tools. Follows inou's MCP pattern:

- All tools operate within `(actor, project)` context — full RBAC enforced
- Read tools: list requests, query answers, check status, get workstream summary
- Write tools: AI-suggested routing (human confirmation required before any state change)
- Gating: AI cannot read pre-dataroom content without explicit unlock (mirrors inou's tier-1/tier-2 pattern)

Detailed MCP spec written separately after core schema is stable.

---

## 11. Go Implementation Rules

Non-negotiable. Violations require explicit discussion.

### 11.1 Package Structure

```
dealspace/
  cmd/server/       main entry point, config loading
  lib/
    types.go        All shared types — Entry, User, Project, Theme, etc.
    dbcore.go       EntryRead, EntryWrite, EntryDelete — the three choke points
    rbac.go         CheckAccess, permission resolution, role definitions
    crypto.go       Pack, Unpack, ObjectEncrypt, ObjectDecrypt
    store.go        ObjectStore interface + implementations
    watermark.go    Per-type watermark injection (PDF, DOCX, XLSX, image, video)
    embed.go        AI embedding client + cosine similarity
    notify.go       Broadcast logic — answer published → notify requesters
  api/
    middleware.go   Auth, logging, rate limiting, CORS
    handlers.go     Thin handlers only — extract input, call lib, return response
    routes.go       Route registration
  portal/
    templates/      HTML templates (no hardcoded colors)
    static/         CSS (theme vars), JS (minimal)
  mcp/
    server.go       MCP tool registration and dispatch
```

### 11.2 Handler Rules

- Handlers: extract input → call lib → return response. Nothing else.
- No SQL in handlers. Ever.
- No business logic in handlers. Ever.
- If two handlers share logic → extract to lib.
- Error responses: one helper function, used everywhere. `{"error": "...", "code": "..."}`

### 11.3 DB Access Rules

- No `db.Query` / `db.Exec` outside `lib/dbcore.go`
- No raw SQL in any file outside `lib/dbcore.go`
- Entry access: `EntryRead`, `EntryWrite`, `EntryDelete` only
- Object access: `ObjectRead`, `ObjectWrite`, `ObjectDelete` only
- User/project/access operations: dedicated functions in dbcore, never inline SQL

### 11.4 Naming Conventions

- RBAC-enforced functions: exported, full name (`EntryRead`, `EntryWrite`)
- System-only bypass: unexported, explicit suffix (`entryReadSystem`)
- The distinction must be obvious from the name alone

---

## 12. UI Philosophy

- **Project = select box at the top.** One line. You pick your project and you're in it.
- No project browser consuming 20% of screen real estate.
- Workstream tabs within a project: Finance | Legal | IT | HR | Operations
- Information hierarchy: Workstream → Request List → Request → Answer
- Status visible without clicking in
- Competitor trap to avoid: adding features without removing complexity. Every new feature must justify its screen cost.

---

## 13. Out of Scope for MVP

- Email notifications
- Mobile app
- Third-party integrations (DocuSign, Salesforce)
- Public API
- Per-firm white-labeling

---

## 14. Retired Code

Previous attempt archived at `/home/johan/dev/dealroom-retired-20260228/`

**Carried forward:**
- AI matching concept (embeddings + cosine similarity at 0.72 threshold)
- Broadcast answer semantics
- Color palette

**Everything else starts fresh.**

---

## 15. Schema Change Checklist

When modifying the data model:
1. Update this SPEC.md first
2. Update `lib/types.go`
3. Update `lib/dbcore.go`
4. Update `lib/rbac.go` if access model changes
5. Update migration files
6. Update MCP tools if query patterns changed
7. No exceptions to this order

---

*This document is the ground truth. If code disagrees with the spec, the code is wrong.*

---

## 16. Workflow & Task Model (added 2026-02-28)

### 16.1 The Core Insight

Most users are **workers, not deal managers**. When the accountant logs in they see their task inbox — not a deal room, not workstream dashboards, not buyer activity. Just: what do I need to do today.

The big picture (deal progress, buyer activity, request completion %) is the IB admin's view. Role determines UI surface entirely. Same platform, completely different experience.

### 16.2 The Routing Chain

Tasks don't just get assigned — they have a return path. Every forward creates an obligation to return.

```
Buyer → IB analyst → CFO → accountant
                            ↓ (done)
Buyer ← IB analyst ← CFO ←──┘
```

Each hop knows where it came from and where it goes back when done. The IB analyst sees "waiting on CFO" — the buyer sees nothing until the answer is published. Internal routing is invisible to external parties.

### 16.3 Entry Fields for Workflow (plain, indexed — never packed)

| Field          | Purpose                                                       |
|----------------|---------------------------------------------------------------|
| `assignee_id`  | Who has it RIGHT NOW — powers the personal task inbox         |
| `return_to_id` | Who it goes back to when done                                 |
| `origin_id`    | The ultimate requestor (buyer) who triggered the chain        |

`routing_chain` — packed in Data: full hop history with actors + timestamps, visible to IB admin only.

When the accountant marks done → automatically lands in CFO's inbox.
When CFO approves → automatically lands in IB analyst's inbox.
When IB analyst publishes → buyer is notified.

No manual re-routing at each hop. The chain is set when the task is forwarded.

### 16.4 entry_events Table

The thread behind every entry. This IS the workflow history.

```sql
CREATE TABLE entry_events (
    id        TEXT PRIMARY KEY,
    entry_id  TEXT NOT NULL REFERENCES entries(entry_id),
    actor_id  TEXT NOT NULL,
    channel   TEXT NOT NULL,  -- "web" | "email" | "slack" | "teams"
    action    TEXT NOT NULL,  -- packed: "message"|"upload"|"forward"|"approve"|"reject"|"publish"
    data      TEXT NOT NULL,  -- packed: message body, file refs, status transition details
    ts        INTEGER NOT NULL
);
CREATE INDEX idx_events_entry ON entry_events(entry_id);
CREATE INDEX idx_events_actor ON entry_events(actor_id);
```

Full thread visible to deal managers. Workers (accountant, CFO) see only their task queue — `WHERE assignee_id = me`.

### 16.5 channel_threads Table

Maps external thread IDs to entries. Enables email/Slack/Teams participation without login.

```sql
CREATE TABLE channel_threads (
    id         TEXT PRIMARY KEY,
    entry_id   TEXT NOT NULL REFERENCES entries(entry_id),
    channel    TEXT NOT NULL,  -- "email" | "slack" | "teams"
    thread_id  TEXT NOT NULL,  -- Message-ID, thread_ts, conversationId
    project_id TEXT NOT NULL,
    UNIQUE(channel, thread_id)
);
```

Inbound message on a known thread_id → creates entry_event on the mapped entry → triggers next hop in routing chain.

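The inbound lookup can be sketched as a keyed map from `(channel, thread_id)` to entry. Everything named here is illustrative: `threadKey`, `Event`, and `routeInbound` are hypothetical, the map stands in for the `channel_threads` table, and the sketch stops at building the event. Persisting it and advancing the routing chain are left out.

```go
package main

import "fmt"

// threadKey mirrors the UNIQUE(channel, thread_id) constraint.
type threadKey struct{ Channel, ThreadID string }

// Event carries the entry_events columns this sketch fills in.
type Event struct {
	EntryID, ActorID, Channel, Action, Data string
}

// routeInbound resolves an external message to its entry. ok is false for
// unknown threads, which the caller should drop or quarantine.
func routeInbound(threads map[threadKey]string, channel, threadID, actorID, body string) (ev Event, ok bool) {
	entryID, found := threads[threadKey{channel, threadID}]
	if !found {
		return Event{}, false
	}
	return Event{EntryID: entryID, ActorID: actorID, Channel: channel, Action: "message", Data: body}, true
}

func main() {
	threads := map[threadKey]string{{"email", "<msg-1@example.com>"}: "req-42"}
	ev, ok := routeInbound(threads, "email", "<msg-1@example.com>", "user-cfo", "Attached the FY2024 audit.")
	fmt.Println(ev.EntryID, ev.Action, ok) // req-42 message true
}
```

Keying on the pair rather than on `thread_id` alone matters because identifiers from different channels are not guaranteed to be distinct.
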
### 16.6 Final Table List

```
entries          — the tree + workflow state (assignee_id, return_to_id, origin_id plain+indexed)
entry_events     — the thread / workflow history per entry
channel_threads  — external channel routing (email/Slack/Teams → entry)
answer_links     — answer ↔ request, ai_score, confirmed, rejected, vetting
users            — accounts + auth
access           — RBAC: (user, project, workstream, role, ops)
embeddings       — (entry_id, vector BLOB) for AI matching
audit            — security events: grants, downloads, logins, key transitions
```

Eight tables. No objects table — files referenced by 16-char hex ObjectID stored in entry Data.