Data rot is a reality, even in environments with ECC memory, RAID and other data redundancy mechanisms.

md5sum can easily be used to check data integrity, e.g. after file transfers.

Checksums have to be created before they can be verified. The following script generates a file in md5sum format, which can be used to check integrity at a later stage.

root@ganymede:~# cat ./generate-md5sum.sh
#!/bin/sh
find "$1" -type f -exec md5sum {} \;
root@ganymede:~# ./generate-md5sum.sh archive/
d6e02966dc93d4b6bbd3a651acea0176  archive/jre-8-ea-bin-b106-linux-i586-05_sep_2013.tar.gz
e868ab86df2eb20a1d98c11e8564e52c  archive/inadyn-mt.v.02.24.38.tar.gz
13a91d9e50695dbfa086ffbacf81cfa6  archive/spigot.jar
6d790745b95d0ece9d1b717c8b4f1d15  archive/cli32
58a014a5a4f2fc3596caf40e60584db0  archive/archttp32
root@ganymede:~# ./generate-md5sum.sh archive/ > archive.md5
root@ganymede:~#

The script (generate-md5sum.sh) takes a directory path as its parameter and recursively calculates the md5 checksum of every file below it. The output can be redirected into a file (archive.md5) for subsequent checksum verification.
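
A slightly hardened variant of the script might look like the sketch below. It is not part of the original setup: the function name generate_md5sum, the throwaway demo directory and the file names are made up for illustration. It assumes GNU coreutils and findutils. It refuses to run without a directory argument, copes with spaces in path names, and sorts the output by path so that two runs over the same tree produce the same file.

```shell
#!/bin/sh
# Hypothetical variant of generate-md5sum.sh (names are illustrative only).
generate_md5sum() {
    # Abort with a usage message if no directory was given.
    dir=${1:?usage: generate_md5sum DIRECTORY}
    # "{} +" passes many files to each md5sum call instead of one per call.
    # Sorting by path (second field onward) keeps the output stable between runs.
    find "$dir" -type f -exec md5sum {} + | sort -k 2
}

# Demo on a temporary directory, including a path that contains a space.
demo=$(mktemp -d)
mkdir -p "$demo/archive/sub dir"
printf 'hello\n' > "$demo/archive/a.txt"
printf 'world\n' > "$demo/archive/sub dir/b.txt"

# Run from the parent directory so the stored paths stay relative.
(cd "$demo" && generate_md5sum archive > archive.md5 && cat archive.md5)
```

Keeping the paths relative, as in the original example, means the checksum file can later be verified from any location where the archive directory sits next to it.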

root@ganymede:~# md5sum -c archive.md5
archive/jre-8-ea-bin-b106-linux-i586-05_sep_2013.tar.gz: OK
archive/inadyn-mt.v.02.24.38.tar.gz: OK
archive/spigot.jar: OK
archive/cli32: OK
archive/archttp32: OK
root@ganymede:~# md5sum -c --quiet archive.md5
root@ganymede:~#

md5sum verifies data integrity with the -c option, which prints the result for every file. To display only errors, add the --quiet option.
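
To see what a failed check looks like, the following sketch creates a throwaway file, records its checksum, alters the file and verifies again. The directory and file names are invented for the demo, and GNU md5sum is assumed. The point of interest is the exit status: it is 0 when everything matches and non-zero otherwise, which makes the check usable from scripts or cron jobs.

```shell
#!/bin/sh
# Throwaway demo: record a checksum, corrupt the file, verify again.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir archive
printf 'original contents\n' > archive/file.bin
find archive -type f -exec md5sum {} + > archive.md5

# Intact data: --quiet prints nothing and the exit status is 0.
md5sum -c --quiet archive.md5
status_ok=$?

# Simulate data rot by appending a byte to the file.
printf 'x' >> archive/file.bin

# Altered data: the file is reported as FAILED and the exit status is non-zero.
# The summary warning on stderr is discarded here to keep the output short.
md5sum -c --quiet archive.md5 2>/dev/null
status_bad=$?

echo "intact: $status_ok, corrupted: $status_bad"
```

In an unattended job, testing the exit status of md5sum -c --quiet is usually enough to decide whether to raise an alert.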