inou/import-genome
James cc1dd7690c Lab reference charts, import tracking, DossierFromEntry consolidation
- Fix lab chart reference bands: parse DOB in DossierFromEntry, generate
  deterministic ref_ids in import-caliper (was collapsing 4363 rows to 1)
- Consolidate DossierFromEntry into lib/dbcore.go (eliminate portal duplicate)
- Add Import field to entries for batch undo (NextImportID, all import paths)
- MyChart direct JSON parsing (skip Gemini for structured lab data)
- Multi-order extraction from markdown/text tables
- Normalize progress callback for UI feedback
- DICOM import, genome import, API, portal, MCP, translation updates
- Remove test DICOM data from repo

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 05:15:03 -05:00
README.md Initial commit 2026-02-01 02:43:27 -05:00
main.go Lab reference charts, import tracking, DossierFromEntry consolidation 2026-02-24 05:15:03 -05:00

README.md

import-genome

Fast genetic data importer using lib.Save() for direct database access.

Performance

About 1.5 seconds end to end (timings from the example run below) to:

  • Read 18MB file
  • Parse 674,160 variants
  • Sort by rsid
  • Match against 9,403 SNPedia rsids
  • Insert 5,382 entries via lib.Save() (1 parent + 5,381 matched variants)

Installation

cd ~/dev/inou
make import-genome

Usage

import-genome <plain-file> <dossier-id>

# Help
import-genome --help

Supported Formats

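| Format      | Delimiter    | Columns | Alleles  |
|-------------|--------------|---------|----------|
| AncestryDNA | Tab          | 5       | Split    |
| 23andMe     | Tab          | 4       | Combined |
| MyHeritage  | CSV (quoted) | 4       | Combined |
| FTDNA       | CSV          | 4       | Combined |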

The format is auto-detected from the structure of the first data line.
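
A minimal sketch of how detection could work from the table alone, assuming the delimiter, quoting, and column count are enough to tell the formats apart. `detectFormat` is a hypothetical helper, and only the name `ancestry` appears in the tool's output; the other format names and the sample lines are illustrative. The actual logic in main.go may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// detectFormat guesses the vendor format from one data line, using only the
// delimiter, quoting, and column count from the Supported Formats table.
// Hypothetical helper: not taken from main.go.
func detectFormat(line string) string {
	line = strings.TrimRight(line, "\r\n")
	if strings.Contains(line, "\t") {
		switch len(strings.Split(line, "\t")) {
		case 5:
			return "ancestry" // tab-delimited, alleles in two columns
		case 4:
			return "23andme" // tab-delimited, combined genotype column
		}
		return "unknown"
	}
	if len(strings.Split(line, ",")) == 4 {
		if strings.HasPrefix(line, `"`) {
			return "myheritage" // CSV with quoted fields
		}
		return "ftdna" // CSV without quotes
	}
	return "unknown"
}

func main() {
	// Illustrative sample lines, not real export data.
	fmt.Println(detectFormat("rs1801133\t1\t100\tT\tT"))
	fmt.Println(detectFormat(`"rs1801133","1","100","TT"`))
}
```

Returning `unknown` instead of guessing lets the caller stop before any existing entries are deleted.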

Data Model

Creates hierarchical entries:

Parent (genome/extraction):
  id: c286564f3195445a
  data: {"source": "ancestry", "variants": 5381}

Children (genome/variant):
  parent_id: c286564f3195445a
  type: rs1801133 (rsid)
  value: TT (genotype)
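
The hierarchy above could be built roughly as follows. This is a sketch under stated assumptions: `entry` is a hypothetical stand-in for lib.Entry (its real fields are not shown in this README), and `variant`, `extractionData`, and `buildEntries` are illustrative names. It shows one parent followed by one child per matched variant, which is why 5,381 matches produce 5,382 entries.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// entry is a hypothetical stand-in for lib.Entry; field names mirror the
// data model above and are assumptions, not the lib package's definition.
type entry struct {
	ID       string
	ParentID string
	Category string // "genome/extraction" or "genome/variant"
	Type     string // rsid, for variant entries
	Value    string // genotype, for variant entries
	Data     string // JSON payload, for the parent entry
}

type variant struct{ RSID, Genotype string }

type extractionData struct {
	Source   string `json:"source"`
	Variants int    `json:"variants"`
}

// buildEntries returns the parent entry followed by one child per variant.
func buildEntries(parentID, source string, matched []variant) ([]entry, error) {
	data, err := json.Marshal(extractionData{Source: source, Variants: len(matched)})
	if err != nil {
		return nil, err
	}
	entries := make([]entry, 0, len(matched)+1)
	entries = append(entries, entry{ID: parentID, Category: "genome/extraction", Data: string(data)})
	for _, v := range matched {
		entries = append(entries, entry{
			ParentID: parentID,
			Category: "genome/variant",
			Type:     v.RSID,
			Value:    v.Genotype,
		})
	}
	return entries, nil
}

func main() {
	entries, err := buildEntries("c286564f3195445a", "ancestry",
		[]variant{{RSID: "rs1801133", Genotype: "TT"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(entries), entries[0].Data)
}
```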

Databases

  • SNPedia reference: ~/dev/inou/snpedia-genotypes/genotypes.db (read-only, direct SQL)
  • Entries: via lib.Save() to /tank/inou/data/inou.db (single transaction)

Algorithm

  1. Read plain-text genome file
  2. Auto-detect format from first data line
  3. Parse all variants (rsid + genotype)
  4. Sort by rsid
  5. Load SNPedia rsid set into memory
  6. Match user variants against the in-memory SNPedia set (O(1) lookup per variant)
  7. Delete existing genome entries for dossier
  8. Build []lib.Entry slice
  9. lib.Save() - single transaction with prepared statements
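
Steps 3 to 6 can be sketched as below. `parseLine`, `matchVariants`, and `variant` are hypothetical names, the sort is assumed to be plain lexicographic order on the rsid string, and the SNPedia set is modelled as a `map[string]struct{}`. The normalization done in phase 5 is not described in this README and is left out.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type variant struct{ RSID, Genotype string }

// parseLine extracts rsid and genotype from one data line. With split=true
// the two alleles sit in separate trailing columns (AncestryDNA layout) and
// are joined; otherwise the last column already holds the combined genotype.
// Comment lines and malformed rows fail the column-count check.
func parseLine(line, sep string, split bool) (variant, bool) {
	fields := strings.Split(strings.TrimSpace(line), sep)
	want := 4
	if split {
		want = 5
	}
	if len(fields) != want {
		return variant{}, false
	}
	for i := range fields {
		fields[i] = strings.Trim(fields[i], `"`) // quoted CSV fields
	}
	genotype := fields[3]
	if split {
		genotype += fields[4]
	}
	return variant{RSID: fields[0], Genotype: genotype}, true
}

// matchVariants sorts by rsid, then keeps only variants whose rsid is in the
// known set. Each membership test is a single map lookup.
func matchVariants(variants []variant, known map[string]struct{}) []variant {
	sort.Slice(variants, func(i, j int) bool { return variants[i].RSID < variants[j].RSID })
	var matched []variant
	for _, v := range variants {
		if _, ok := known[v.RSID]; ok {
			matched = append(matched, v)
		}
	}
	return matched
}

func main() {
	// Illustrative input lines, not real export data.
	lines := []string{"# comment", "rs9\t1\t200\tA\tG", "rs1801133\t1\t100\tT\tT"}
	var variants []variant
	for _, l := range lines {
		if v, ok := parseLine(l, "\t", true); ok {
			variants = append(variants, v)
		}
	}
	known := map[string]struct{}{"rs1801133": {}}
	for _, v := range matchVariants(variants, known) {
		fmt.Println(v.RSID, v.Genotype) // prints: rs1801133 TT
	}
}
```

A header row with the right column count would be parsed like any other line in this sketch, but it drops out at the match step because its first field is not in the set.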

Example

./bin/import-genome /path/to/ancestry.txt 3b38234f2b0f7ee6

# Output:
# Phase 1 - Read: 24ms (18320431 bytes)
# Detected format: ancestry
# Phase 2 - Parse: 162ms (674160 variants)
# Phase 3 - Sort: 306ms
# Phase 4 - Load SNPedia: 47ms (9403 rsids)
# Phase 5 - Match & normalize: 40ms (5381 matched)
# Phase 6 - Init & delete existing: 15ms
# Phase 7 - Build entries: 8ms (5382 entries)
# Phase 8 - lib.Save: 850ms (5382 entries saved)
#
# TOTAL: 1.5s
# Parent ID: c286564f3195445a