inou API Design Decisions

January 2026

Overview

This document captures the API redesign decisions for inou, focusing on making health data AI-ready through progressive disclosure, efficient queries, and secure token-based access.


Authentication

Token-Based Access

Tokens carry a minimal payload, encrypted before it leaves the server:

{
  "d": "dossier_id",
  "exp": 1736470800
}
  • d: The authenticated dossier (who logged in)
  • exp: Unix timestamp expiration (a few hours for external LLMs like Grok)
  • Token is encrypted using existing lib.CryptoEncrypt
  • No raw dossier IDs in URLs that live forever
  • Backend looks up permissions from dossier_access table (not in token)

Why Tokens?

  • Grok/ChatGPT users were querying with raw dossier IDs days later
  • Tokens expire, preventing stale access
  • Simpler than passing dossier in every request

API Endpoints

REST Style with Versioning

GET /api/v1/dossiers
GET /api/v1/dossiers/{id}
GET /api/v1/dossiers/{id}/entries
GET /api/v1/dossiers/{id}/entries/{entry_id}
GET /api/v1/dossiers/{id}/entries/{entry_id}?detail=full

Token in Header

Authorization: Bearer <token>

Query Parameters

Param                 Purpose
detail=full           Return full image/text data
search=pons           Search summaries/tags
category=imaging      Filter by category (English)
anatomy=hypothalamus  Find slices by anatomy
W=200&L=500           DICOM windowing for images

Dossier Always Explicit

Even if a user has only one dossier, it appears in the URL. No special cases.


Progressive Disclosure

The LLM gets everything it needs to decide in the first call; full detail comes only when needed.

Entry Type  Quick Glance              Full Detail
Study       anatomy + summary         -
Series      slice count, orientation  -
Slice       thumbnail (150x150)       full image
Document    summary + tags            full text
Lab         values + ranges           historical trend
Genome      category + count          variant list

Fewer Round-Trips

Before: LLM guesses slices, fetches them one by one, backtracks.
After: LLM sees the anatomy index, requests the exact slices needed.


Thumbnails

Specification

  • Size: 150x150 max dimension (preserve aspect ratio)
  • Format: PNG (8-bit greyscale, lossless)
  • Target: ~5KB per thumbnail
  • Storage: Database (in Data JSON field or BLOB column)

Why DB Not Filesystem?

  • Batch queries: "get 50 slices with thumbnails" = 1 query
  • Fewer IOPS (no 50 small file reads)
  • DB file stays hot in cache
  • 4000 slices x 5KB = 20MB (trivial)

Full Images

  • Stay on filesystem (/tank/inou/objects/)
  • Fetched one at a time with ?detail=full

Anatomical Analysis

Problem

LLM spends many round-trips finding the right slice ("find the pons" → guess → wrong → try again)

Solution

Analyze reference slices at ingest, store anatomy with z-ranges.

Approach

  1. For each orientation (SAG, COR, AX) present in study
  2. Pick mid-slice from T2 (preferred) or T1
  3. Run vision API: "identify anatomical structures with z-ranges"
  4. Store in study entry

Storage

{
  "anatomy": {
    "pons": {"z_min": -30, "z_max": -20},
    "cerebellum": {"z_min": -45, "z_max": -25},
    "hypothalamus": {"z_min": 20, "z_max": 26}
  },
  "body_part": "brain"
}

Query Time

  • LLM asks "find pons in this series"
  • Lookup: pons at z=-30 to -20
  • Find slices in series where slice_location BETWEEN -30 AND -20
  • Return matching slices with thumbnails
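The lookup above is a simple range filter. Type and field names in this sketch are assumptions, not the actual entry schema:

```go
package main

// ZRange is the stored z-extent of a structure, as in the study
// entry's anatomy map.
type ZRange struct {
	Min, Max float64
}

// Slice is a minimal stand-in for a slice entry.
type Slice struct {
	ID       string
	Location float64 // DICOM SliceLocation
}

// slicesForStructure returns the slices whose location falls inside
// the stored z-range for the named structure (the BETWEEN filter above).
func slicesForStructure(anatomy map[string]ZRange, slices []Slice, name string) []Slice {
	r, ok := anatomy[name]
	if !ok {
		return nil
	}
	var out []Slice
	for _, s := range slices {
		if s.Location >= r.Min && s.Location <= r.Max {
			out = append(out, s)
		}
	}
	return out
}
```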

Cost

  • 1-3 vision API calls per study (~$0.01-0.03)
  • Stored once, used forever
  • No per-query cost

Why Not Position-Based Lookup?

Tried it, deleted it. Pediatric and adult brains have different z-coordinates for the same structures.

Generalization

  • Brain: mid-sagittal reference
  • Spine: mid-sagittal reference
  • Knee: mid-sagittal reference
  • Abdomen: may need coronal + axial references
  • Animals: same principle, vision model identifies structures

Categories

Consolidated Structure

27 categories (down from 31), using integers internally.

Int  Name      Notes
0    imaging   Unified: slice, series, study (Type differentiates)
1    document
2    lab
3    genome    Unified: tier, rsid, variant (Type differentiates)
...  ...

Translation

  • API Input: Accept English OR any translated name
  • API Output: Return translated name (user's language)
  • Navigation: By entry ID, not category name

var categoryFromAny = map[string]int{}  // "genome", "Геном", "ゲノム" → 3
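Expanded into a runnable sketch (the map entries are illustrative, not the full table; keys are stored lowercase so lookups can be case-insensitive):

```go
package main

import "strings"

// categoryFromAny maps every known name, English or translated, to the
// internal integer.
var categoryFromAny = map[string]int{
	"imaging": 0, "document": 1, "lab": 2,
	"genome": 3, "геном": 3, "ゲノム": 3,
}

// categoryID resolves a user-supplied category name case-insensitively.
func categoryID(name string) (int, bool) {
	id, ok := categoryFromAny[strings.ToLower(strings.TrimSpace(name))]
	return id, ok
}
```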

Database

SQLite Stays

  • Read-heavy workload (LLM queries)
  • Occasional writes (imports)
  • Single server, few users
  • Millions of rows, no problem
  • Single file backup
  • WAL mode for concurrency

Alternatives Considered

  • FoundationDB: Overkill for single server
  • bbolt/BadgerDB: Lose SQL convenience, maintain indexes manually
  • Postgres: "Beautiful tech from 15 years ago" - user preference

What We Actually Use

Just key-value with secondary indexes:

  1. Get by ID
  2. Get children (parent_id = X)
  3. Filter by dossier
  4. Filter by category
  5. Order by ordinal/timestamp

SQLite handles this perfectly.
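The five access patterns map onto a handful of indexes. This schema is a hypothetical sketch; column names are inferred from the list above, not the actual inou schema:

```sql
-- Hypothetical entries table supporting the five access patterns.
CREATE TABLE IF NOT EXISTS entries (
    id         INTEGER PRIMARY KEY,  -- 1. get by ID
    parent_id  INTEGER,
    dossier_id INTEGER NOT NULL,
    category   INTEGER NOT NULL,
    ordinal    INTEGER,
    created_at INTEGER,
    data       TEXT                  -- JSON payload, incl. thumbnail
);
CREATE INDEX IF NOT EXISTS idx_entries_parent  ON entries(parent_id);             -- 2. children
CREATE INDEX IF NOT EXISTS idx_entries_dossier ON entries(dossier_id, category);  -- 3 + 4. filters
CREATE INDEX IF NOT EXISTS idx_entries_ordinal ON entries(parent_id, ordinal);    -- 5. ordering
```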


RBAC (Future)

Concept

Category-level permissions per relationship.

Example: Trainer can see exercise, nutrition, supplements but NOT hospitalization, fertility.

Implementation

  • Presets by relationship type (trainer, doctor, family)
  • User overrides per category
  • Stored in dossier_access table
  • Backend lookup (not in token)
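A sketch of the lookup order (user override first, then the relationship preset). The category integers and preset contents here are placeholders, not real inou values:

```go
package main

// presets maps relationship type to the categories it may see by default.
var presets = map[string]map[int]bool{
	"trainer": {10: true, 11: true, 12: true}, // e.g. exercise, nutrition, supplements
}

// Access is a stand-in for a dossier_access row.
type Access struct {
	Relationship string       // e.g. "trainer"
	Overrides    map[int]bool // per-category user overrides
}

// canSee applies overrides first, then falls back to the preset;
// anything not granted anywhere is denied.
func canSee(a Access, category int) bool {
	if v, ok := a.Overrides[category]; ok {
		return v
	}
	return presets[a.Relationship][category]
}
```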

Not Priority Now

Token + expiration first. RBAC layers on later.


Summary

Decision     Choice
Auth         Token with dossier + expiration
API Style    REST, versioned (/api/v1/)
Endpoints    /dossiers, /entries (2 main)
Thumbnails   150x150 PNG, ~5KB, in DB
Full images  Filesystem, on-demand
Anatomy      Reference slice analysis at ingest
Categories   Integers internally, translated strings externally
Database     SQLite
RBAC         Future work

Core Principle: AI-ready health data through progressive disclosure. LLM gets context upfront, fetches details only when needed.