# Filescanner

Cross-platform file inventory scanner with ClickHouse backend.

## Quick Start

```bash
# Get dependencies
go mod tidy

# Build all platforms
make all

# Or build current platform only
make build
```

## Usage

### Scan files

```bash
# Scan with dry-run (no DB)
./filescan -server myserver -path /home -dry-run

# Scan to ClickHouse
./filescan -server myserver -path /home -ch 192.168.1.253:9000

# Verbose
./filescan -server myserver -path /home -v
```

### Add hashes for duplicate detection

```bash
# Only hashes files with non-unique sizes
./hashupdate -server myserver -ch 192.168.1.253:9000
```

### Find duplicates

```sql
SELECT
    hash,
    count(*) as cnt,
    groupArray(concat(server, ':', folder, '/', filename)) as files
FROM files.inventory
WHERE hash != ''
GROUP BY hash
HAVING cnt > 1
ORDER BY any(size) DESC;
```

## Binaries

After `make all`:

- `bin/filescan-mac-arm64` - Mac M1/M2/M3
- `bin/filescan-mac-amd64` - Mac Intel
- `bin/filescan-linux` - Linux
- `bin/filescan.exe` - Windows

## Excluded Directories

Automatically skips:

- Windows: `$RECYCLE.BIN`, `Windows`, `Program Files`, `AppData`, etc.
- macOS: `.Trash`, `Library`, `.Spotlight-V100`, etc.
- Linux: `/proc`, `/sys`, `/dev`, `/run`, etc.
- Common: `node_modules`, `.git`, `__pycache__`

## ClickHouse Schema

See `queries.sql` for schema and useful queries.
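
The note that `hashupdate` only hashes files with non-unique sizes reflects a common pre-filter for duplicate detection: two files of different sizes cannot be identical, so only files that share a size with at least one other file need hashing. The Go sketch below illustrates that idea under stated assumptions. The `fileEntry` type and `hashCandidates` function are hypothetical names for illustration and are not taken from the `hashupdate` source, which may select candidates differently (for example, with a query against ClickHouse).

```go
package main

import "fmt"

// fileEntry is a hypothetical, simplified inventory row used only for
// this illustration; the real schema lives in queries.sql.
type fileEntry struct {
	Path string
	Size int64
}

// hashCandidates returns the entries whose size is shared by at least one
// other entry, preserving input order. Entries with a unique size cannot
// have a duplicate, so they can be skipped when hashing.
func hashCandidates(entries []fileEntry) []fileEntry {
	// First pass: count how many entries have each size.
	countBySize := make(map[int64]int, len(entries))
	for _, e := range entries {
		countBySize[e.Size]++
	}

	// Second pass: keep only entries whose size occurs more than once.
	var out []fileEntry
	for _, e := range entries {
		if countBySize[e.Size] > 1 {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	entries := []fileEntry{
		{Path: "/home/a/report.pdf", Size: 1024},
		{Path: "/home/a/notes.txt", Size: 300},
		{Path: "/home/b/report-copy.pdf", Size: 1024},
	}

	// Only the two 1024-byte files are candidates for hashing.
	for _, e := range hashCandidates(entries) {
		fmt.Printf("%d\t%s\n", e.Size, e.Path)
	}
}
```

Matching sizes only make files candidates; the hash comparison in the duplicates query is still what decides whether they are treated as duplicates.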