# Performance
sql-splitter is designed for high throughput with minimal memory usage.
## Benchmarks

Tested on Apple M2 Max:

| Metric | Value |
|---|---|
| Parser throughput | 600+ MB/s |
| Memory usage | ~50 MB constant |
| Cold start | ~5 ms |
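These figures reduce to simple arithmetic: throughput is bytes processed divided by elapsed seconds. A small helper like the one below (our own convenience function, not part of sql-splitter) turns your own `time` measurements into MB/s for comparison against the table:

```bash
# Compute throughput in MB/s (1 MB = 10^6 bytes) from a byte count and
# elapsed seconds. Helper for comparing your own timings against the
# table above; not part of sql-splitter itself.
throughput_mb_s() {
  # $1 = bytes processed, $2 = elapsed seconds
  awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b / s / 1000000 }'
}

# Example: 100 MB (10^8 bytes) processed in 0.16 s, as in the split benchmark
throughput_mb_s 100000000 0.16   # prints 625
```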
## Split Benchmark

| File Size | Time | Throughput |
|---|---|---|
| 100 MB | 0.16s | 625 MB/s |
| 1 GB | 1.6s | 625 MB/s |
| 10 GB | 16s | 625 MB/s |
## vs. Shell Alternatives

| Tool | Time (1 GB file) |
|---|---|
| sql-splitter | 1.6s |
| awk-based | 8.5s |
| Python script | 12s |
sql-splitter is roughly 5–7x faster than shell-based alternatives.
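For context, the "awk-based" row refers to the common pattern of splitting a dump on its table-marker comments. A minimal sketch of that approach (assuming mysqldump-style `-- Table structure` markers; the synthetic dump here is just to make the snippet self-contained) shows why it lags: every byte passes through the awk interpreter.

```bash
# Tiny synthetic dump standing in for a real mysqldump file:
cat > dump.sql <<'EOF'
-- Table structure for table `users`
CREATE TABLE users (id INT);
-- Table structure for table `orders`
CREATE TABLE orders (id INT);
EOF

# Minimal awk-based per-table splitter (mysqldump-style markers assumed).
mkdir -p tables
awk '
  /^-- Table structure for table/ {
    match($0, /`[^`]+`/)                       # table name between backticks
    name = substr($0, RSTART + 1, RLENGTH - 2)
    out = "tables/" name ".sql"
  }
  out != "" { print > out }
' dump.sql
```

Unlike a native streaming parser, this line-by-line interpretation is where the several-fold slowdown comes from.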
## Optimization Tips

### Use Native CPU Optimizations

Build with CPU-specific optimizations:

```bash
RUSTFLAGS="-C target-cpu=native" cargo build --release
```

### Compressed Input
Compressed files can be faster than uncompressed when I/O is the bottleneck:

```bash
# Often faster than reading uncompressed
sql-splitter split backup.sql.gz -o tables/
```

### Streaming to Database
Avoid intermediate files when possible:

```bash
# Direct stream (no intermediate file)
sql-splitter convert mysql.sql.gz --to postgres -o - | psql "$PG_CONN"

# vs. intermediate file (slower)
sql-splitter convert mysql.sql.gz --to postgres -o temp.sql
psql "$PG_CONN" < temp.sql
rm temp.sql
```

### Parallel Operations
For many files, use parallel execution:

```bash
find dumps -name '*.sql.gz' -print0 | \
  xargs -0 -n1 -P4 sql-splitter validate --strict
```

### Query Caching
Cache imported databases for repeated queries:

```bash
# First query (slow - imports data)
sql-splitter query dump.sql "SELECT COUNT(*) FROM users" --cache

# Second query (fast - uses cache)
sql-splitter query dump.sql "SELECT * FROM users WHERE active = 1" --cache
```

### Disk Mode for Large Files
For very large dumps:

```bash
sql-splitter query huge.sql "SELECT ..." --disk
```

## Memory Usage
sql-splitter maintains a constant ~50 MB memory footprint regardless of file size:
- Streaming: Reads in chunks, processes line-by-line
- Buffered I/O: Uses 64 KB buffers
- No full load: Never loads entire file into memory
This means 10 GB files use the same memory as 10 MB files.
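The streaming pattern above is the same one shell pipelines rely on: read a bounded chunk, process it, discard it. As a rough illustration of the principle (not sql-splitter's actual internals), this counts SQL statements in a file of any size while holding only one line in memory at a time:

```bash
# Illustration of constant-memory streaming (not sql-splitter internals):
# a tiny synthetic file keeps the snippet self-contained.
cat > sample.sql <<'EOF'
INSERT INTO users VALUES (1);
INSERT INTO users VALUES (2);
CREATE TABLE t (id INT);
EOF

# Read line-by-line, count statement terminators, never load the whole file.
statements=0
while IFS= read -r line; do
  case $line in
    *';') statements=$((statements + 1)) ;;
  esac
done < sample.sql
echo "$statements"   # prints 3
```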