shard
Extract data for specific tenants from multi-tenant SQL dumps.
Alias: sh (e.g., sql-splitter sh dump.sql --tenant-value 123 -o tenant.sql)
When to Use This
Section titled “When to Use This”- Tenant data export - Extract a single customer’s data for migration or analysis
- Development environments - Create small, focused dev databases from a multi-tenant production dump
- Data isolation - Separate tenant data for compliance, backup, or legal requirements
- Debugging - Reproduce issues with a specific tenant’s data without the full database
Use sample instead if you want a random subset of data across all tenants.
How It Works
Section titled “How It Works”Sharding follows your database’s foreign key relationships to extract complete, consistent data:
- Identifies tenant column - Auto-detects
tenant_id,company_id,org_id, etc., or uses your explicit--tenant-column - Starts from root tables - Tables with the tenant column that have no incoming foreign keys
- Follows FK relationships - Recursively includes all related rows that belong to the tenant
- Includes lookup tables - Optionally includes global/shared tables (countries, currencies, etc.)
The result is a self-contained dump that maintains referential integrity.
sql-splitter shard <INPUT> --tenant-value <VALUE> [OPTIONS]Examples
Section titled “Examples”Single Tenant Extraction
Section titled “Single Tenant Extraction”# Extract all data for tenant 123sql-splitter shard dump.sql --tenant-value 123 -o tenant_123.sql
# Specify the tenant column explicitlysql-splitter shard dump.sql --tenant-column account_id --tenant-value 42 -o out.sqlMultiple Tenants
Section titled “Multiple Tenants”Extract multiple tenants to separate files in one pass:
sql-splitter shard dump.sql --tenant-values "1,2,3" -o shards/This creates:
shards/├── tenant_1.sql├── tenant_2.sql└── tenant_3.sqlControlling Global Tables
Section titled “Controlling Global Tables”# Include all shared/lookup tables (countries, currencies, config)sql-splitter shard dump.sql --tenant-value 123 --include-global all -o tenant.sql
# Skip global tables entirelysql-splitter shard dump.sql --tenant-value 123 --include-global none -o tenant.sql
# Only include small lookup tables (default behavior)sql-splitter shard dump.sql --tenant-value 123 --include-global lookups -o tenant.sqlPreview Before Extracting
Section titled “Preview Before Extracting”# See what would be extracted without writing filessql-splitter shard dump.sql --tenant-value 123 --dry-runOptions
Section titled “Options”| Flag | Short | Description | Default |
|---|---|---|---|
--output | -o | Output SQL file or directory | stdout |
--dialect | -d | SQL dialect | auto-detect |
--tenant-column | Column name for tenant identification | auto-detect | |
--tenant-value | Single tenant value to extract | - | |
--tenant-values | Multiple tenant values (comma-separated) | - | |
--root-tables | Explicit root tables with tenant column | auto | |
--include-global | Global table handling: none, lookups, all | lookups | |
--config | -c | YAML config file for table classification | - |
--max-selected-rows | Maximum rows to select | unlimited | |
--no-limit | Disable row limit | false | |
--strict-fk | Fail if any FK integrity issues | false | |
--no-schema | Exclude CREATE TABLE statements | false | |
--progress | -p | Show progress bar | false |
--dry-run | Preview without writing | false | |
--json | Output as JSON | false |
Tenant Column Detection
Section titled “Tenant Column Detection”sql-splitter auto-detects common tenant columns:
tenant_idcompany_idaccount_idorg_idorganization_idworkspace_id
Override with --tenant-column:
sql-splitter shard dump.sql --tenant-column customer_id --tenant-value 42Multiple Tenants
Section titled “Multiple Tenants”Extract multiple tenants to separate files:
sql-splitter shard dump.sql --tenant-values "1,2,3" -o shards/Creates:
shards/├── tenant_1.sql├── tenant_2.sql└── tenant_3.sqlGlobal Tables
Section titled “Global Tables”The --include-global option controls shared/lookup tables:
none: Skip global tableslookups: Include small lookup tables (default)all: Include all non-tenant tables
FK Traversal
Section titled “FK Traversal”Sharding follows foreign key relationships:
- Finds rows matching tenant value in root tables
- Follows FKs to include related rows
- Ensures complete, consistent tenant slice
Troubleshooting
Section titled “Troubleshooting””No tenant column found” error
Section titled “”No tenant column found” error”Auto-detection looks for common column names. Specify yours explicitly:
sql-splitter shard dump.sql --tenant-column customer_id --tenant-value 123 -o tenant.sqlMissing related data in output
Section titled “Missing related data in output”Foreign key relationships might not be detected if:
- They’re not defined in the schema (implicit FKs)
- The FK references a table that’s excluded
Use --root-tables to explicitly specify which tables have the tenant column:
sql-splitter shard dump.sql --tenant-value 123 --root-tables users,projects,teams -o tenant.sqlShard is larger than expected
Section titled “Shard is larger than expected”Check which global tables are being included:
# See what would be extractedsql-splitter shard dump.sql --tenant-value 123 --dry-run --json | jq '.tables'
# Exclude large lookup tablessql-splitter shard dump.sql --tenant-value 123 --include-global none -o tenant.sqlFK integrity warnings
Section titled “FK integrity warnings”Some rows might reference data outside the tenant’s scope. Options:
--strict-fk: Fail immediately on any integrity issue- Review warnings: Some cross-tenant references might be intentional (shared resources)
- Include global tables:
--include-global allincludes all referenced lookup data
See Also
Section titled “See Also”sample- Create reduced datasetssplit- Split by tablegraph- Visualize FK relationships before sharding- JSON Output Schema - Schema for
--jsonoutput