split

Split a large SQL dump into separate files, one per table.

Alias: sp (e.g., sql-splitter sp dump.sql -o tables/)

  • Extract specific tables — Pull out just the tables you need without loading the entire dump
  • Enable parallel processing — Split once, then process tables concurrently in separate jobs
  • Debug specific tables — Isolate a problematic table’s DDL and data for investigation
  • Create partial backups — Extract only customer-facing tables while ignoring analytics/logging tables
  • Separate schema from data — Use --schema-only for migrations, --data-only for data refreshes

The split command uses streaming to process dumps in a single pass with constant memory (~10MB). Here’s what happens:

  1. Read — Statements are read using buffered I/O (64KB buffer for files under 1GB, 256KB for larger files)
  2. Parse — Each statement is identified and its target table extracted
  3. Route — Statements are written to the appropriate <table>.sql file via 256KB write buffers
  4. Flush — Buffers are flushed every 100 statements to balance memory and I/O
| Statement Type | Destination |
| --- | --- |
| CREATE TABLE users | users.sql |
| INSERT INTO users | users.sql |
| CREATE INDEX ON users | users.sql |
| ALTER TABLE users | users.sql |
| COPY users FROM stdin (PostgreSQL) | users.sql |
| SET, BEGIN, COMMIT, header comments | _global.sql |
| Unrecognized statements | _global.sql |
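The routing above can be sketched as a toy awk script, assuming a simplified dump where every statement fits on one line. This is illustrative only: the real parser also handles multi-line statements, COPY data blocks, quoting, and dialect quirks.

```shell
# Build a tiny one-statement-per-line dump, then route each line to a
# per-table file, with everything else going to _global.sql.
rm -rf toy_out && mkdir -p toy_out
cat > toy.sql <<'EOF'
SET NAMES utf8mb4;
CREATE TABLE users (id INT);
INSERT INTO users VALUES (1);
CREATE TABLE orders (id INT);
EOF
awk '
  /^CREATE TABLE |^INSERT INTO / {
    t = $3                                 # table name is the third field
    print >> ("toy_out/" t ".sql")         # route to <table>.sql
    next
  }
  { print >> "toy_out/_global.sql" }       # everything else is global
' toy.sql
```

After this runs, toy_out/ contains users.sql (two statements), orders.sql (one), and _global.sql (the SET line), mirroring the table above.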

Table names are sanitized for the filesystem: special characters become underscores, and names are lowercased.
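An approximation of that sanitization rule in shell (assumed for illustration, not the tool's exact code): lowercase the name, then map anything outside [a-z0-9_] to an underscore.

```shell
# Hypothetical sanitize helper matching the documented rule.
sanitize() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9_]/_/g'
}
sanitize 'Order-Items'; echo    # order_items
sanitize 'schema.events'; echo  # schema_events
```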

```shell
sql-splitter split <INPUT> [OPTIONS]
```
```shell
# Split all tables to tables/
sql-splitter split database.sql --output=tables/

# Preview what would be created (no files written)
sql-splitter split database.sql --dry-run --verbose
```
```shell
# PostgreSQL pg_dump output
sql-splitter split pg_dump.sql -o tables/ --dialect=postgres

# SQLite .dump output
sql-splitter split sqlite_dump.sql -o tables/ --dialect=sqlite
```
```shell
# Only extract users and orders
sql-splitter split dump.sql --tables users,orders -o subset/

# Extract schema for all tables, data for just users
sql-splitter split dump.sql -o schema/ --schema-only
sql-splitter split dump.sql -o data/ --data-only --tables users
```
```shell
# Automatically decompresses based on extension
sql-splitter split backup.sql.gz -o tables/
sql-splitter split backup.sql.zst -o tables/
sql-splitter split backup.sql.xz -o tables/
```
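Extension-based detection amounts to dispatching on the file suffix; a hypothetical shell sketch of that idea (the helper name and mapping are illustrative, and the tool does this internally):

```shell
# Pick a decompression command from the file extension.
decompress_cmd() {
  case "$1" in
    *.gz)  echo "gzip -dc" ;;
    *.zst) echo "zstd -dc" ;;
    *.xz)  echo "xz -dc" ;;
    *)     echo "cat" ;;
  esac
}
decompress_cmd backup.sql.zst   # zstd -dc
```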

See Compression for all supported formats.

```shell
# Split multiple dumps
sql-splitter split "backups/*.sql" -o output/ --fail-fast

# Process all dumps recursively
sql-splitter split "dumps/**/*.sql.gz" -o tables/
```
| Flag | Short | Description | Default |
| --- | --- | --- | --- |
| --output | -o | Output directory | output |
| --dialect | -d | SQL dialect: mysql, postgres, sqlite, mssql | auto-detect |
| --tables | -t | Only split these tables (comma-separated) | all |
| --schema-only | | Only DDL (CREATE TABLE, indexes) | false |
| --data-only | | Only DML (INSERT, COPY) | false |
| --verbose | -v | Show verbose output | false |
| --progress | -p | Show progress bar | false |
| --dry-run | | Preview without writing files | false |
| --fail-fast | | Stop on first error (for glob patterns) | false |
| --json | | Output results as JSON | false |
```
tables/
├── users.sql     # CREATE TABLE + INSERT for users
├── orders.sql    # CREATE TABLE + INSERT for orders
├── products.sql  # CREATE TABLE + INSERT for products
└── _global.sql   # Header statements, SET commands
```

The _global.sql file contains:

  • Session settings (SET NAMES utf8mb4, SET FOREIGN_KEY_CHECKS=0)
  • Transaction markers (BEGIN, COMMIT)
  • Header comments from the dump tool
  • Any statements that couldn’t be associated with a specific table
```shell
sql-splitter split dump.sql -o tables/ --json
```

```json
{
  "input": "dump.sql",
  "output_dir": "tables/",
  "tables": [
    {
      "name": "users",
      "file": "tables/users.sql",
      "rows": 150,
      "bytes": 12345
    },
    {
      "name": "orders",
      "file": "tables/orders.sql",
      "rows": 500,
      "bytes": 45678
    }
  ],
  "statistics": {
    "total_tables": 2,
    "total_rows": 650,
    "total_bytes": 58023,
    "elapsed_ms": 42,
    "throughput_mb_per_sec": 612.5
  }
}
```
| Code | Meaning |
| --- | --- |
| 0 | Success: all files written |
| 1 | Error: I/O failure, parse error, or invalid input |
| 2 | Invalid arguments: bad CLI flags |

See Exit Codes for scripting examples.
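The codes above can drive simple script logic. In this runnable sketch, `sh -c 'exit 1'` stands in for a failing `sql-splitter split` invocation; substitute the real command in your scripts.

```shell
# Capture the exit status without aborting under `set -e`, then dispatch.
status=0
sh -c 'exit 1' || status=$?   # stand-in for: sql-splitter split dump.sql -o tables/
case $status in
  0) echo "split succeeded" ;;
  1) echo "runtime error (I/O, parse, or invalid input)" >&2 ;;
  2) echo "invalid arguments" >&2 ;;
esac
```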

```shell
# Split, then compress each table file
sql-splitter split dump.sql -o tables/
for f in tables/*.sql; do gzip "$f"; done
```
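The same compression can run in parallel with xargs. This variant is demonstrated on two placeholder files in a demo/ directory so the commands run anywhere:

```shell
# Compress table files 4 at a time; -0 handles unusual filenames safely.
mkdir -p demo
printf 'SELECT 1;\n' > demo/a.sql
printf 'SELECT 2;\n' > demo/b.sql
printf '%s\0' demo/*.sql | xargs -0 -P4 -n1 gzip
```

gzip replaces each file with its .sql.gz counterpart, so demo/ ends up containing a.sql.gz and b.sql.gz.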
```shell
# Extract users table, then query it
sql-splitter split dump.sql --tables users -o tmp/
sql-splitter query tmp/users.sql "SELECT * FROM users WHERE created_at > '2024-01-01'"
```
```shell
# Split once, import tables in parallel (4 jobs at a time)
sql-splitter split dump.sql -o tables/
ls tables/*.sql | xargs -P4 -I{} sh -c 'mysql mydb < "$1"' sh {}
```
```shell
# Find largest tables, then extract just those
sql-splitter analyze dump.sql --json | jq -r '.tables | sort_by(-.bytes) | .[0:3] | .[].name'
# Output: orders, logs, events
sql-splitter split dump.sql --tables orders,logs,events -o large_tables/
```
```shell
# Extract all tables, merge back a subset in dependency order
sql-splitter split dump.sql -o all/
sql-splitter merge all/users.sql all/orders.sql all/order_items.sql -o subset.sql
```

Symptom: You expect a table but it’s not in the output directory.

Causes:

  • The dump contains only DDL (schema) or only DML (data), and you passed the flag that filters that part out (--data-only or --schema-only, respectively)
  • The table name in the dump differs from what you expect (check case sensitivity)
  • The table has no statements at all in the dump

Fix: Run sql-splitter analyze dump.sql to see exactly which tables exist and what statements they have.

Symptom: Table files are created but contain no data.

Causes:

  • Using --data-only on a dump that only has CREATE TABLE statements
  • Using --schema-only on a data-only dump

Fix: Remove the --schema-only or --data-only flag, or confirm with analyze that the dump contains what you expect.

Symptom: An error saying the dialect could not be detected, or output containing the wrong SQL syntax.

Causes:

  • The dump lacks dialect-specific markers (backticks for MySQL, COPY for PostgreSQL)
  • Mixed-dialect dump (manually edited)

Fix: Explicitly specify the dialect: --dialect=postgres or --dialect=mysql.

Symptom: You want .sql.gz files instead of .sql.

sql-splitter writes uncompressed output. Compress the files afterwards:

```shell
sql-splitter split dump.sql -o tables/
gzip tables/*.sql
```

Or for a single compressed output, use merge with compression:

```shell
sql-splitter split dump.sql -o tables/
sql-splitter merge "tables/*.sql" -o combined.sql.gz
```

Symptom: Permission denied error when writing to output directory.

Fix: Ensure the output directory exists and is writable:

```shell
mkdir -p tables/
sql-splitter split dump.sql -o tables/
```