How to Verify and Repair Archives#
This guide covers maintaining archive integrity: verifying checksums, defragmenting, resharding, and upgrading the database schema.
Verifying Integrity#
Full Verification#
Verify all file checksums:
barecat verify myarchive.barecat
This reads every file and compares its CRC32C checksum against the index, showing a progress bar during verification. On success it prints a summary:
Verified 1234567 files (5.2 GB), all checksums OK.
On failure, errors are printed to stderr and the command exits with code 1:
ERROR: Checksum mismatch for path/to/corrupted/file.jpg
ERROR: 3 files failed verification.
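To make the checksum comparison concrete, here is a minimal pure-Python sketch of CRC32C (the Castagnoli variant of CRC-32). It is not barecat's implementation, which this guide does not show. The function names `crc32c` and `matches_index` are hypothetical, and real tools typically use a hardware-accelerated library instead.

```python
def _make_crc32c_table():
    """Build the 256-entry lookup table for CRC32C (Castagnoli polynomial)."""
    poly = 0x82F63B78  # bit-reflected form of 0x1EDC6F41
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c(data: bytes, crc: int = 0) -> int:
    """Return the CRC32C of data, optionally continuing from a previous value."""
    crc ^= 0xFFFFFFFF
    for b in data:
        crc = _CRC32C_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def matches_index(data: bytes, stored_crc: int) -> bool:
    """Hypothetical helper: compare freshly computed CRC32C with a stored value."""
    return crc32c(data) == stored_crc


# Standard CRC32C check value for the ASCII string "123456789"
assert crc32c(b"123456789") == 0xE3069283
```

Because the checksum can be continued from a previous value, a large file can be checked in chunks without loading it fully into memory.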
Quick Verification#
Check index integrity without reading file contents:
barecat verify --quick myarchive.barecat
This verifies:
SQLite database integrity
Index consistency (parent directories exist, etc.)
Shard file existence and sizes
This is much faster than full verification, but it won’t detect corrupted file contents. On success it prints:
Verified 1234567 files (quick check), all OK.
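Both verification modes can be driven from a script. The sketch below wraps the CLI with `subprocess` and collects the paths reported in `Checksum mismatch` lines. It assumes the error line format matches the sample output in this guide, and that quick mode also exits with a nonzero code on failure (the guide only states this for full verification). The helper names `failed_paths` and `verify_archive` are hypothetical.

```python
import subprocess

# Prefix taken from the sample error output shown in this guide.
ERROR_PREFIX = "ERROR: Checksum mismatch for "


def failed_paths(stderr_text: str) -> list[str]:
    """Extract archive paths from 'Checksum mismatch' lines in stderr text."""
    return [
        line[len(ERROR_PREFIX):].strip()
        for line in stderr_text.splitlines()
        if line.startswith(ERROR_PREFIX)
    ]


def verify_archive(archive: str, quick: bool = False) -> tuple[bool, list[str]]:
    """Run `barecat verify` and return (ok, paths with checksum mismatches)."""
    cmd = ["barecat", "verify"]
    if quick:
        cmd.append("--quick")
    cmd.append(archive)
    # Only stderr is captured; any other output written there (for example a
    # progress bar) is ignored by failed_paths().
    result = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    return result.returncode == 0, failed_paths(result.stderr)
```

A typical pattern would be to call `verify_archive(path, quick=True)` first and run the full check only when the quick check passes.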
Defragmentation#
When files are deleted, they leave gaps in the shard files. Defragmentation reclaims this space.
Check Fragmentation#
barecat du myarchive.barecat
Look at the “wasted space” or gap statistics.
Run Defrag#
barecat defrag myarchive.barecat
This rewrites files to eliminate gaps, compacting the archive.
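As a rough mental model of compaction, the toy sketch below represents one shard as a list of entries with an offset and a size, measures the bytes lost to gaps, and then packs the entries back to back. It only illustrates the idea. The `Entry` type, the function names, and the example layout are hypothetical and do not reflect barecat's internal data structures.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    path: str
    offset: int  # byte offset of the file inside the shard
    size: int    # file size in bytes


def wasted_bytes(entries: list[Entry]) -> int:
    """Bytes inside the used region of the shard that belong to no file."""
    if not entries:
        return 0
    end = max(e.offset + e.size for e in entries)
    return end - sum(e.size for e in entries)


def compact(entries: list[Entry]) -> list[Entry]:
    """Return a new layout with all files packed back to back, in order."""
    result = []
    cursor = 0
    for e in sorted(entries, key=lambda e: e.offset):
        result.append(Entry(e.path, cursor, e.size))
        cursor += e.size
    return result


# Hypothetical layout after deleting the files that used to sit between these.
layout = [Entry("a", 0, 100), Entry("c", 250, 50), Entry("d", 400, 100)]
print(wasted_bytes(layout))           # 250
print(wasted_bytes(compact(layout)))  # 0
```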
Quick Defrag#
For faster but less thorough defragmentation:
barecat defrag --quick myarchive.barecat
Quick mode uses a best-fit algorithm that may leave some small gaps.
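The following toy sketch shows what best-fit placement means in general and why small gaps can remain afterwards. It is an illustration under simplifying assumptions, not barecat's actual quick-defrag code. The function name `best_fit_place` and the example numbers are hypothetical.

```python
def best_fit_place(gaps: dict[int, int], sizes: list[int]):
    """Place each file size into the smallest gap that can hold it.

    gaps maps gap offset -> gap size. Returns (placements, remaining_gaps),
    where placements holds the chosen offset per file, or None if no gap fits.
    """
    gaps = dict(gaps)  # work on a copy
    placements = []
    for size in sizes:
        candidates = [(gsize, off) for off, gsize in gaps.items() if gsize >= size]
        if not candidates:
            placements.append(None)  # the file stays where it is
            continue
        gsize, off = min(candidates)  # smallest gap that is large enough
        placements.append(off)
        del gaps[off]
        if gsize > size:
            # The unused tail of the gap remains as a smaller gap.
            gaps[off + size] = gsize - size
    return placements, gaps


placements, leftover = best_fit_place({100: 150, 300: 100, 450: 60}, [90, 60, 200])
print(placements)  # [300, 450, None]
print(leftover)    # {100: 150, 390: 10}
```

In this example a 10-byte gap is left at offset 390 and the 150-byte gap is never filled, which is the kind of residue a full defrag would eliminate.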
Resharding#
Change the shard size limit of an existing archive:
# Consolidate into larger shards
barecat reshard -s 50G myarchive.barecat
# Split into smaller shards
barecat reshard -s 1G myarchive.barecat
This reorganizes all data according to the new shard size limit.
Use cases:
Consolidate many small shards into fewer large ones
Split large shards for easier distribution
Prepare an archive for a filesystem with file size limits
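To see how the shard size limit affects the number of shards, here is a toy model. It assumes files are stored whole and packed sequentially, with a new shard started whenever the next file would exceed the limit. This is a simplification for illustration, and the function name `assign_shards` is hypothetical.

```python
def assign_shards(file_sizes: list[int], shard_size_limit: int) -> list[int]:
    """Return the shard index for each file under greedy sequential packing."""
    shard, used = 0, 0
    assignment = []
    for size in file_sizes:
        if size > shard_size_limit:
            # Toy-model choice: a file larger than the limit cannot be placed.
            raise ValueError(f"file of {size} bytes exceeds the shard limit")
        if used > 0 and used + size > shard_size_limit:
            shard += 1  # start a new shard
            used = 0
        assignment.append(shard)
        used += size
    return assignment


sizes = [40, 40, 40, 40, 40]
print(assign_shards(sizes, 100))   # [0, 0, 1, 1, 2]
print(assign_shards(sizes, 1000))  # [0, 0, 0, 0, 0]
```

Raising the limit consolidates the same files into fewer shards, and lowering it spreads them across more.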
Database Upgrade#
When the barecat schema changes, upgrade existing archives:
barecat upgrade myarchive.barecat
With multiple workers for faster processing:
barecat upgrade -j 8 myarchive.barecat
This migrates the SQLite schema while preserving all data.
Common Issues#
“Database schema version X.Y is older than supported”#
Run the upgrade:
barecat upgrade myarchive.barecat
“Checksum mismatch”#
A file is corrupted. Options:
Restore from backup - If you have one
Delete the corrupted file:
import barecat

with barecat.Barecat('archive.barecat', readonly=False) as bc:
    del bc['path/to/corrupted/file.jpg']
Re-add from source - If the original still exists:
import barecat

with barecat.Barecat('archive.barecat', readonly=False) as bc:
    del bc['path/to/corrupted/file.jpg']
    bc.add_by_path('/original/path/file.jpg', store_path='path/to/corrupted/file.jpg')
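When several files fail verification, the delete and re-add steps can be looped. The sketch below is a minimal helper built only on the two calls shown above (`del bc[...]` and `bc.add_by_path(..., store_path=...)`). It assumes the originals live under a single directory with the same relative paths as in the archive. The function name `repair_entries` and the `source_root` parameter are hypothetical. Files without an available original are left untouched and reported, so you can decide separately whether to delete them or restore from backup.

```python
import os


def repair_entries(bc, failed_paths, source_root):
    """Re-add corrupted files from source_root where an original exists.

    bc is an archive opened for writing. Returns (readded, unresolved),
    two lists of archive paths.
    """
    readded, unresolved = [], []
    for store_path in failed_paths:
        # Assumption: originals mirror the archive's relative paths.
        source = os.path.join(source_root, store_path)
        if os.path.isfile(source):
            del bc[store_path]
            bc.add_by_path(source, store_path=store_path)
            readded.append(store_path)
        else:
            unresolved.append(store_path)
    return readded, unresolved


# Usage sketch (requires barecat to be installed):
# import barecat
# with barecat.Barecat('archive.barecat', readonly=False) as bc:
#     readded, unresolved = repair_entries(
#         bc, ['path/to/corrupted/file.jpg'], '/original/root')
```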
Python API#
import barecat

# Verify
with barecat.Barecat('archive.barecat') as bc:
    bc.verify_integrity(quick=False)

# Defrag
with barecat.Barecat('archive.barecat', readonly=False, append_only=False) as bc:
    bc.defrag(quick=False)
Maintenance Schedule#
Recommended practices:
After bulk deletions: Run defrag to reclaim space
Periodically: Run barecat verify --quick to catch issues early
After upgrading barecat: Run barecat upgrade if prompted
See Also#
Command Line Interface - CLI reference
Architecture - How barecat stores data