# Study: Access Rights as a Field on Entries
## Design
Access permissions are stored as a packed BLOB field (`Access`) on the entry itself — like filesystem permissions on an inode. No separate access table.
Every entry already has `DossierID` (whose data) and `ParentID` (hierarchy). Adding an `Access` field means each entry can declare: "these grantees have these permissions here and on all children."
### The Access Field
```
Access BLOB — packed JSON, same as other string fields
```
Format when unpacked:
```json
[
  {"grantee": "6e4e8192881a7494", "ops": 1, "relation": 3},
  {"grantee": "a1b2c3d4e5f60718", "ops": 3}
]
```
- `grantee` — int64 ID (hex16 representation) of who gets access
- `ops` — bitmask (1=read, 2=write, 4=delete, 8=manage)
- `relation` — relationship type (optional, for display: parent, doctor, etc.)
Most entries have an empty Access field. Only entries where access is explicitly granted have it populated.
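
As a rough illustration of the unpacked format, the grants could be modelled in Go as below. This is a sketch only: the type name `Grant`, the helpers `ParseAccess` and `Allows`, and the constant names `PermRead`, `PermWrite` and `PermDelete` are assumptions made for this example (only the bit values and `PermManage` appear in this study), and the field is assumed to be already unpacked.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Permission bits, matching the ops bitmask described above.
// Only the bit values come from this study; most constant names are assumed.
const (
	PermRead   = 1
	PermWrite  = 2
	PermDelete = 4
	PermManage = 8
)

// Grant is one element of the unpacked Access array (hypothetical type name).
type Grant struct {
	Grantee  string `json:"grantee"`            // hex16 form of the int64 ID
	Ops      int    `json:"ops"`                // bitmask of Perm* values
	Relation int    `json:"relation,omitempty"` // optional, display only
}

// ParseAccess decodes an already-unpacked Access value.
// An empty field means there are no grants on this entry.
func ParseAccess(unpacked string) ([]Grant, error) {
	if unpacked == "" {
		return nil, nil
	}
	var grants []Grant
	if err := json.Unmarshal([]byte(unpacked), &grants); err != nil {
		return nil, err
	}
	return grants, nil
}

// Allows reports whether some grant gives grantee every bit in perm.
func Allows(grants []Grant, grantee string, perm int) bool {
	for _, g := range grants {
		if g.Grantee == grantee && g.Ops&perm == perm {
			return true
		}
	}
	return false
}

func main() {
	grants, err := ParseAccess(`[{"grantee":"6e4e8192881a7494","ops":1,"relation":3},{"grantee":"a1b2c3d4e5f60718","ops":3}]`)
	if err != nil {
		panic(err)
	}
	fmt.Println(Allows(grants, "6e4e8192881a7494", PermRead))           // true
	fmt.Println(Allows(grants, "6e4e8192881a7494", PermWrite))          // false
	fmt.Println(Allows(grants, "a1b2c3d4e5f60718", PermRead|PermWrite)) // true
}
```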
### How CheckAccess Works
Walk up the hierarchy from the requested entry, checking the Access field at each level:
```
CheckAccess(accessorID, dossierID, entryID, perm):
  1. accessorID == "" or system → true (system access)
  2. accessorID == dossierID → true (self-access)
  3. Load entry (entryID)
     - Check entry.Access for matching grantee + perm → granted
  4. If entry.ParentID != "" and entry.ParentID != entry.EntryID:
     - Load parent, check parent.Access → granted
     - Continue walking up via ParentID
  5. Load dossier root (dossierID, cat 0)
     - Check root.Access → granted
  6. No match at any level → denied
Grants cascade downward. A grant on the dossier root gives access to everything. A grant on an imaging study gives access to that study and all its series/slices. This is why ParentID must form a clean tree to the dossier root.
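
The walk above might look roughly like this in Go. It is a sketch under several assumptions: a plain map stands in for `dbLoad`, the Access field is shown already unpacked, IDs are plain strings, the dossier root's entry ID equals `dossierID`, and only the empty accessor is treated as "system". The `seen` guard against cycles and the accessor name `drsmith` are additions for this example.

```go
package main

import "fmt"

// Grant is a simplified, already-unpacked grant.
type Grant struct {
	Grantee string
	Ops     int
}

// Entry holds only the fields the walk needs.
type Entry struct {
	EntryID  string
	ParentID string
	Access   []Grant
}

// store stands in for dbLoad: a raw primary-key lookup with no access check.
type store map[string]Entry

// grants reports whether e.Access gives accessor every bit in perm.
func grants(e Entry, accessor string, perm int) bool {
	for _, g := range e.Access {
		if g.Grantee == accessor && g.Ops&perm == perm {
			return true
		}
	}
	return false
}

// CheckAccess walks from entryID up towards the dossier root.
func CheckAccess(db store, accessorID, dossierID, entryID string, perm int) bool {
	if accessorID == "" || accessorID == dossierID {
		return true // system access or self-access
	}
	seen := map[string]bool{} // guards against cycles in malformed data
	id := entryID
	for id != "" && !seen[id] {
		seen[id] = true
		e, ok := db[id]
		if !ok {
			break
		}
		if grants(e, accessorID, perm) {
			return true
		}
		if e.ParentID == e.EntryID {
			break
		}
		id = e.ParentID
	}
	// Fallback: check the dossier root if the chain did not reach it.
	if root, ok := db[dossierID]; ok && !seen[dossierID] {
		return grants(root, accessorID, perm)
	}
	return false
}

func main() {
	db := store{
		"dossier": {EntryID: "dossier", Access: []Grant{{Grantee: "johan", Ops: 1}}},
		"study":   {EntryID: "study", ParentID: "dossier", Access: []Grant{{Grantee: "drsmith", Ops: 1}}},
		"series":  {EntryID: "series", ParentID: "study"},
		"lab":     {EntryID: "lab", ParentID: "dossier"},
	}
	fmt.Println(CheckAccess(db, "johan", "dossier", "series", 1))   // true: root grant cascades
	fmt.Println(CheckAccess(db, "drsmith", "dossier", "series", 1)) // true: study grant covers its series
	fmt.Println(CheckAccess(db, "drsmith", "dossier", "lab", 1))    // false: lab is outside the study
}
```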
### Hierarchy Requirement
For the walk to work, every root-level entry (study, lab batch, genome extraction, etc.) must have `ParentID = dossierID`. This connects the permission tree:
```
Dossier entry (cat 0, Access: [{grantee: "johan", ops: 1}])
├── Study (cat 1, ParentID = dossierID)
│   ├── Series (ParentID = studyID)
│   │   └── Slice (ParentID = seriesID)
│   └── Series
├── Lab batch (cat 3, ParentID = dossierID)
│   └── Lab result (ParentID = batchID)
└── Genome extraction (cat 4, ParentID = dossierID)
    └── Tier (ParentID = extractionID)
        └── Variant (ParentID = tierID)
A grant on the dossier entry → access to everything.
A grant on the study → access to that study's series and slices only.
### entryReadAccessible (Dashboard: "Show me all dossiers I can access")
This is the one query that currently needs the access table: "find all dossiers where I have a grant." Without a separate table, this becomes a cross-dossier query.
Options:
**Option A: Scan all cat-0 entries for matching Access field**
```sql
SELECT ... FROM entries WHERE Category = 0
```
Then in Go, unpack each entry's Access field and check for a matching grantee. This is O(all dossiers) — works at inou's current scale (hundreds, not millions), but doesn't scale.
**Option B: Index table (materialized view)**
Maintain a lightweight lookup: `access_index(GranteeID, DossierID)` — plain text, no packing. Updated whenever an Access field is written. CheckAccess still walks entries (source of truth). The index only serves the "list my accessible dossiers" query.
This is similar to the current access table but reduced to a two-column index. Not the source of truth — just a lookup accelerator.
**Option C: SearchKey on dossier entries encodes grantees**
Store grantee IDs in a queryable field. But SearchKey is already used for email on cat-0 entries.
**Option D: Separate query with LIKE on packed Access field**
Not feasible — Access is packed (compressed + encrypted), not queryable with SQL.
**Recommendation: Option A for now, Option B if scale demands it.** At current scale (< 1000 dossiers), scanning cat-0 entries and checking Access fields in Go is fast enough. If it becomes a problem, add a two-column index table; the source of truth remains the Access field on entries.
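
The Option A scan could be sketched as follows, assuming the cat-0 rows have already been loaded and their Access fields unpacked. The function name `AccessibleDossiers` and the simplified `Entry` and `Grant` types are made up for this example; the real `entryReadAccessible` may differ.

```go
package main

import "fmt"

// Grant is a simplified, already-unpacked grant.
type Grant struct {
	Grantee string
	Ops     int
}

// Entry holds only the fields this scan needs.
type Entry struct {
	EntryID  string
	Category int
	Access   []Grant
}

// AccessibleDossiers returns the IDs of cat-0 entries whose Access field
// grants accessor the read bit (1). Cost is linear in the number of dossiers.
func AccessibleDossiers(entries []Entry, accessor string) []string {
	var ids []string
	for _, e := range entries {
		if e.Category != 0 {
			continue
		}
		for _, g := range e.Access {
			if g.Grantee == accessor && g.Ops&1 != 0 {
				ids = append(ids, e.EntryID)
				break
			}
		}
	}
	return ids
}

func main() {
	entries := []Entry{
		{EntryID: "dossierA", Category: 0, Access: []Grant{{Grantee: "johan", Ops: 1}}},
		{EntryID: "dossierB", Category: 0},
		{EntryID: "dossierC", Category: 0, Access: []Grant{{Grantee: "johan", Ops: 3}}},
	}
	fmt.Println(AccessibleDossiers(entries, "johan")) // [dossierA dossierC]
}
```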
### What Happens to the Access Table
Eliminated. `DROP TABLE access`. The schema becomes:
```
entries — everything, including permissions (via Access field)
audit — immutable log (stays separate — different concern, append-only)
```
Two tables instead of three.
### What Happens to Access Functions
All the current stubs (`AccessGet`, `AccessList`, `AccessWrite`, `GrantAccess`, `RevokeAccess`, `RevokeAllAccess`, `ListGrants`, `ListGrantees`, `AccessListByAccessor`, `AccessListByTargetWithNames`, `CanManageDossier`) are deleted.
Replaced by:
- **`CheckAccess`** in `dbcore.go` walks the hierarchy and checks Access fields. It uses `dbLoad` internally, so there is no RBAC recursion: access checks are *about* RBAC, not *subject to* it.
- **Granting access** = `EntryRead` the target entry, modify its Access field, `EntryWrite` it back. The caller must have PermManage. No special function is needed; it's a field update.
- **Revoking access** = same pattern, remove grantee from Access field.
- **Listing grantees** = `EntryRead` the entry, parse its Access field.
- **Listing accessible dossiers** = `entryReadAccessible` scans cat-0 entries (as today, but checks Access field instead of access table).
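
Between the `EntryRead` and the `EntryWrite`, granting and revoking reduce to small transformations of the grant list. A sketch, assuming the Access field is already unpacked into a slice; the helper names `AddGrant` and `RemoveGrant` are hypothetical, and replacing an existing grant for the same grantee (rather than appending a duplicate) is a choice made for this example:

```go
package main

import "fmt"

// Grant is a simplified, already-unpacked grant.
type Grant struct {
	Grantee  string
	Ops      int
	Relation int
}

// AddGrant returns a copy of grants in which g replaces any earlier grant
// for the same grantee.
func AddGrant(grants []Grant, g Grant) []Grant {
	out := make([]Grant, 0, len(grants)+1)
	for _, old := range grants {
		if old.Grantee != g.Grantee {
			out = append(out, old)
		}
	}
	return append(out, g)
}

// RemoveGrant returns a copy of grants without any grant for grantee.
func RemoveGrant(grants []Grant, grantee string) []Grant {
	out := make([]Grant, 0, len(grants))
	for _, old := range grants {
		if old.Grantee != grantee {
			out = append(out, old)
		}
	}
	return out
}

func main() {
	var access []Grant
	access = AddGrant(access, Grant{Grantee: "6e4e8192881a7494", Ops: 1, Relation: 3})
	access = AddGrant(access, Grant{Grantee: "a1b2c3d4e5f60718", Ops: 3})
	access = AddGrant(access, Grant{Grantee: "6e4e8192881a7494", Ops: 3, Relation: 3}) // upgrade to read+write
	fmt.Println(len(access)) // 2
	access = RemoveGrant(access, "a1b2c3d4e5f60718")
	fmt.Println(len(access)) // 1
}
```

The caller would still need to pass the PermManage check before writing the entry back.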
### The Circular Dependency
There is none. CheckAccess uses `dbLoad` to read entries and inspect their Access field. It never calls `EntryRead`. `EntryRead` calls `CheckAccess`, which reads entries directly. This is the same pattern as today (CheckAccess queries the access table directly via `dbQuery`). The only difference: instead of a separate table, it reads from the same table using a different code path.
Access checks are *about* RBAC, not *subject to* RBAC. You don't need permission to check if you have permission.
### Performance
**Current (access table):** One `dbQuery` per CheckAccess call: `SELECT ... FROM access WHERE GranteeID = ? AND DossierID = ?`.
**Proposed (Access field):** One `dbLoad` per level in the hierarchy. Worst case: 4 loads (entry → parent → grandparent → dossier root). Best case: 1 load (the grant is on the entry itself), or none at all when the self-access shortcut applies.
The loads involve Unpack (decompress + decrypt) of the Access field. But:
- Most entries have empty Access fields, and Unpack of an empty field costs almost nothing
- Grants are typically at the dossier root level, and the walk usually reaches it in 1-3 hops
- `dbLoad` hits SQLite by primary key, which is fast
For the common case (dossier-level grant), CheckAccess loads the dossier root entry, unpacks its Access field, finds the match. One primary key lookup + one Unpack. Comparable to the current single query on the access table.
### Schema Change
```sql
ALTER TABLE entries ADD COLUMN Access BLOB;
```
On both staging and production. The Entry struct gains:
```go
Access string `db:"Access"` // packed JSON: [{grantee, ops, relation}]
```
### Migration
1. For each row in the current `access` table, read the grantee, the granted ops, `DossierID` and `EntryID`
2. Load that entry, parse its Access field (or initialize empty array)
3. Append the grant, write back
4. After migration, drop the access table
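
The grouping step of the migration could be sketched like this. The row layout is partly assumed: `GranteeID`, `DossierID` and `EntryID` are named in this study, while the `Ops` field name and the rule that an empty `EntryID` means a dossier-wide grant (written to the dossier root) are assumptions for this example.

```go
package main

import "fmt"

// Grant is a simplified, already-unpacked grant.
type Grant struct {
	Grantee string
	Ops     int
}

// AccessRow mirrors one row of the old access table.
// The Ops field name is an assumption.
type AccessRow struct {
	GranteeID string
	DossierID string
	EntryID   string
	Ops       int
}

// Migrate groups old rows by target entry and returns, per entry ID,
// the grants to append to that entry's Access field.
func Migrate(rows []AccessRow) map[string][]Grant {
	out := map[string][]Grant{}
	for _, r := range rows {
		target := r.EntryID
		if target == "" {
			target = r.DossierID // assumed: no EntryID means a dossier-wide grant
		}
		out[target] = append(out[target], Grant{Grantee: r.GranteeID, Ops: r.Ops})
	}
	return out
}

func main() {
	rows := []AccessRow{
		{GranteeID: "johan", DossierID: "dossier1", Ops: 1},
		{GranteeID: "drsmith", DossierID: "dossier1", EntryID: "study7", Ops: 1},
		{GranteeID: "anna", DossierID: "dossier1", Ops: 3},
	}
	byEntry := Migrate(rows)
	fmt.Println(len(byEntry["dossier1"]), len(byEntry["study7"])) // 2 1
}
```

Each resulting list would then be merged into the loaded entry's Access field and written back before the access table is dropped.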
### IDs as int64
Per the agreed design, IDs are stored as int64 internally. The Access field stores grantee IDs as int64 (represented as hex16 in JSON for readability, parsed via `ParseID`). This aligns with the planned migration from string IDs to int64.
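
A sketch of the hex16 to int64 conversion follows. The real `ParseID` signature is not shown in this study, so `parseHex16ID` and `formatHex16ID` are hypothetical stand-ins. One detail worth noting: a hex16 value with the top bit set (such as `a1b2c3d4e5f60718`) does not fit a positive int64, so this sketch parses as uint64 and reinterprets the bits, which yields a negative int64 that still round-trips.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseHex16ID converts a 16-digit hex string into an int64 by parsing it
// as an unsigned 64-bit value and reinterpreting the bits.
func parseHex16ID(s string) (int64, error) {
	if len(s) != 16 {
		return 0, fmt.Errorf("id %q: want 16 hex digits, got %d characters", s, len(s))
	}
	u, err := strconv.ParseUint(s, 16, 64)
	if err != nil {
		return 0, err
	}
	return int64(u), nil
}

// formatHex16ID renders an int64 ID as 16 lowercase hex digits.
func formatHex16ID(id int64) string {
	return fmt.Sprintf("%016x", uint64(id))
}

func main() {
	for _, s := range []string{"6e4e8192881a7494", "a1b2c3d4e5f60718"} {
		id, err := parseHex16ID(s)
		if err != nil {
			panic(err)
		}
		// Prints whether the int64 is negative and whether it round-trips.
		fmt.Println(id < 0, formatHex16ID(id) == s)
	}
}
```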
### Summary
| Aspect | Current (access table) | Proposed (Access field) |
|--------|----------------------|------------------------|
| Tables | 3 (entries, access, audit) | 2 (entries, audit) |
| Source of truth | Two places | One place (entries) |
| Grant/revoke | Special functions | Field update via EntryWrite |
| CheckAccess | Query access table | Walk hierarchy, check Access field |
| "My dossiers" | Query access table | Scan cat-0 entries |
| Functions needed | 11 stubs | 0 new (CheckAccess + EntryRead/Write) |
| Hierarchy | Implicit (EntryID = scope) | Explicit (ParentID tree) |
| Scoped access | EntryID field on grant | Grant on any node in tree |