NAME¶
rdd-copy - copy a file, even if read errors occur
SYNOPSIS¶
rdd-copy [OPTION] src [dst]
rdd-copy -C [CLIENT OPTION] src [host:]dst
rdd-copy -S [SERVER OPTION]
DESCRIPTION¶
Rdd-copy is a file and device copying utility that includes features that are
useful in a forensic environment. In particular, rdd-copy can compute
cryptographic hashes over the data it copies, is robust with respect to read
errors, and can copy data across a network.
Rdd-copy is best understood as a program that consists of a reader stage and one
or more processing stages. The reader stage reads input data in a robust way.
It will retry failed reads. If a read error persists, the reader stage
substitutes zero bytes for the input bytes that it fails to read. The
resulting bytes are passed to all subsequent processing stages.
The processing stages are enabled through command-line options. The current
stages are: checksumming (Adler32 and CRC32), hashing (MD5 and SHA1), file
output, network output, and statistics.
Rdd-copy can be run in local mode, in client mode, and in server mode. The mode
is indicated by the first command-line argument.
Copying data across a network requires two rdd-copy processes: a client process
that reads the data from disk and transmits it across the network, and a
server process that reads the data from the network and writes it to a file or
device.
LOCAL MODE¶
In local mode, rdd-copy copies source file src to destination file dst,
handling read errors according to the options. If dst is not specified, the
data in src will be read and optionally hashed, but it will not be written. To
write to standard output, specify - as dst.
Rdd-copy will optionally compute an MD5 or a SHA1 hash value over the input
bytes and the zero bytes it substitutes for blocks it cannot read. These hash
values should be interpreted with care (see below).
Rdd-copy does NOT guarantee that the bytes it reads are the same bytes that are
stored on the input medium. It simply takes what read(2) returns. Any hash
values (see options) are computed over the bytes that read(2) returns or, if
read(2) fails, over zero-valued fill bytes.
Rdd-copy does NOT guarantee that the bytes that it reads into memory (or the
zero-valued bytes that it substitutes when a read error occurs) will be
written to the output file correctly. If you wish to verify the correspondence
between what rdd-copy saw and what got written to disk, you will have to
recompute the MD5 and/or SHA1 hash values over the output file and compare
them with the hash values reported by rdd-copy. This is a useful verification
step, but beware that even this step cannot guarantee perfect correspondence
with the data stored on the source medium.
The best end-to-end test is probably to read back the output file and compare
each output byte to the corresponding input byte, unless that input byte was
part of a block for which rdd-copy reported a read error.
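The hash comparison described above can be sketched in Python. This is a
minimal sketch and not part of rdd-copy: the helper names file_digest_hex and
matches_reported are hypothetical, and the hash value reported by rdd-copy is
assumed to be a hexadecimal string that you copy from its output by hand.

```python
import hashlib

def file_digest_hex(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of the file at `path` (hypothetical helper)."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)   # read the output file in 1 MiB pieces
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()

def matches_reported(path, reported_hex, algorithm="md5"):
    """Compare the recomputed digest with a value copied from rdd-copy's output.

    Assumption: the reported value is hexadecimal; case and surrounding
    whitespace are ignored.
    """
    return file_digest_hex(path, algorithm) == reported_hex.strip().lower()
```

A match shows only that the output file holds the bytes rdd-copy hashed. As
noted above, it says nothing about blocks that were zero-filled after a read
error.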
Rdd-copy does NOT recover from persisting write errors. Rdd-copy was designed to
handle unfriendly source media only. If you get write errors, you should
replace your target medium.
READ ERRORS¶
In local mode and in client mode, rdd-copy reads from disk. Rdd-copy assumes
that the source disk may be faulty and tries to be robust with respect to
disk-read errors. In server mode, rdd-copy reads from the network and makes no
attempt to survive read errors. The explanation below applies only to read
errors that occur in local mode and in client mode.
When a read error occurs, rdd-copy reduces the block size to the minimum block
size (see --min-block-size) and resets the read pointer to the location at
which it started the read that failed.
Next, rdd-copy tries to read a series of minimum-sized blocks (see
--min-block-size). When such a read fails, it is retried a user-specified
number of times (see --nretry). If the read failure persists, rdd-copy normally
will skip a minimum-sized block of input data and will write a minimum-sized
block of zero bytes to the destination file. These zero bytes are also passed
to all other rdd-copy processing stages (checksumming, hashing, and
statistics).
Any persistent read failure counts toward the maximum number of read errors that
the user will tolerate (see --max-read-err). If this maximum is reached,
rdd-copy will exit immediately. By default, however, the number of read errors
is unlimited.
After a read failure, rdd-copy continues to use the minimum block size to read
data until it has read block-size bytes of data without errors. (block-size is
the user-specified block size, see --block-size.) Only then will rdd-copy
increase its block size again, doubling the size at each successful read, until
it reaches the default block size.
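The recovery strategy described in this section can be sketched in Python. This
is an illustration of the documented behavior under stated assumptions, not
rdd-copy's actual code. The function robust_read, the read_at callback (which
raises OSError on a failed read), and the total_size parameter are all
hypothetical. This page does not say whether a zero block is emitted for the
error that reaches the --max-read-err limit; the sketch stops before emitting
one.

```python
def robust_read(read_at, total_size, block_size=65536, min_block_size=512,
                nretry=1, max_read_err=None):
    """Yield (offset, data) pieces covering [0, total_size) (hypothetical)."""
    offset = 0
    cur = block_size   # current read size
    clean = 0          # bytes read without error since the last failure
    errors = 0         # persistent read errors seen so far
    while offset < total_size:
        want = min(cur, total_size - offset)
        try:
            data = read_at(offset, want)
        except OSError:
            if cur > min_block_size:
                # Fall back to the minimum block size and restart the
                # failed read from the same offset.
                cur, clean = min_block_size, 0
                continue
            data = None
            for _ in range(nretry):          # retry the minimum-sized read
                try:
                    data = read_at(offset, want)
                    break
                except OSError:
                    pass
            if data is None:                 # persistent failure
                errors += 1
                if max_read_err is not None and errors >= max_read_err:
                    raise RuntimeError("too many read errors")
                yield offset, bytes(want)    # substitute zero bytes
                offset += want
                clean = 0
                continue
        if not data:
            break                            # unexpected end of input
        yield offset, data
        offset += len(data)
        if cur < block_size:
            clean += len(data)
            if clean >= block_size:
                # Enough clean data: double the read size, up to block_size.
                cur = min(cur * 2, block_size)
```

Each yielded piece stands for what the reader stage would hand to the
processing stages, so hashing the pieces in order includes the zero fill.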
CLIENT MODE¶
In client mode, rdd-copy operates as in local mode, except that the data will
not be copied to a file, but will be written to a TCP connection to an
rdd-copy server process.
In client mode, a destination file, dst, on a destination host must be
specified. If no host is specified, localhost will be used.
SERVER MODE¶
In server mode, rdd-copy accepts one TCP connection from an rdd-copy client. The
server process must be started before the client process. In server mode,
rdd-copy will read data from a TCP connection and write it to a target file.
For now, the target file must always be specified by the client. The main
reason for this decision is to keep open the option of having inetd(8) or
xinetd(8) start an rdd-copy server process.
OUTPUT¶
Informative messages, error messages, and statistics are all written to
stderr.
OPTIONS¶
- -C, --client
- Run rdd-copy in client mode. If you use this option, it
must come first.
- -S, --server
- Run rdd-copy in server mode. If you use this option, it
must come first.
- -p, --port <portnum>
- Modes: client, server.
Specifies the port number <portnum> at which the server listens
for an incoming connection. The default port is 4832.
- -?, --help
- Modes: all.
Print a usage message that includes this list of options.
- -V, --version
- Modes: all.
Print version information and exit.
- -v, --verbose
- Modes: all.
Be verbose.
- -q, --quiet
- Modes: all.
Do not pose interactive questions.
- -l, --log-file <logfile>
- Modes: all.
Log all messages except progress messages to <logfile>.
- -f, --force
- Modes: local, server.
Force existing files to be overwritten. The default behavior is to bail out
when the output file already exists.
- -b, --block-size <size>
- Modes: local, client.
Specify the default block size; <size> must be a power of two. While
no read errors occur, rdd-copy will read and write blocks of <size>
bytes.
- -m, --min-block-size <size>
- Modes: local, client.
Specify the minimum read size; <size> must be a power of two. When a
persistent read error occurs, at least this many bytes of data will be
skipped and replaced with zero bytes in the destination file.
- -n, --nretry <count>
- Modes: local, client.
Retry failed reads up to <count> times. In many cases, using a large
retry value makes little sense, because the operating system's device
driver will not indicate a failed read until it has, itself, retried the
read several times.
- -o, --offset <size>
- Modes: local, client.
Skip <size> bytes from the start of the input file before reading any
data. The bytes that are skipped will not be included in any hash
computation and will not be written to the output file.
- -c, --count <size>
- Modes: local, client.
Read at most <size> input bytes, or read until end-of-file if that comes first.
- -z, --compress
- Modes: client.
Compress network data.
- -s, --split <size>
- Modes: local, server.
If necessary, create multiple output files, none of which will be larger
than <size> bytes. Each output file will have a name that consists
of a sequence number followed by a dash and the name specified on the
command line.
- -r, --raw
- Modes: local, client.
Access the device using the raw device. The data will not travel through the
buffer cache.
- -P, --progress <sec>
- Modes: all.
Report progress (bytes read and percentage of data covered) every
<sec> seconds.
- -M, --max-read-err <count>
- Modes: local, client.
Give up after <count> read errors.
- --md5
- Modes: all.
Compute an MD5 hash value over all data that was read without errors and
over the zero-filled blocks that are used to replace bad blocks.
- --sha, --sha1
- Modes: all.
Compute a SHA1 hash value over all data that was read without errors and
over the zero-filled blocks that are used to replace bad blocks.
- --checksum, --adler32 <file>
- Modes: all.
Compute an Adler32 checksum value over blocks of data produced by the reader
stage. The last block to be checksummed may be smaller than the block size
that is used. All checksum values are written to <file>.
- --checksum-block-size, --adler32-block-size <size>
- Modes: all.
Compute Adler32 checksum values over data blocks with a size of <size>
bytes. Only the last data block to be checksummed may be smaller than
<size>. The default block size is 32 Kbyte.
- --crc32 <file>
- Modes: all.
Compute a CRC32 checksum value over blocks of data produced by the reader
stage. The last block to be checksummed may be smaller than the block size
that is used. All checksum values are written to <file>.
- --crc32-block-size <size>
- Modes: all.
Compute CRC32 checksum values over data blocks with a size of <size>
bytes. Only the last data block to be checksummed may be smaller than
<size>. The default block size is 32 Kbyte.
- -H, --histogram <file>
- Modes: all.
Compute a histogram over each block of data produced by the reader stage.
The histogramming block size can be set by the user (see --hist-block-size).
For each block, write a single text line of statistics to <file>.
- -h, --hist-block-size <size>
- Modes: all.
Set the histogramming block size to <size> bytes. The default block
size is 256 Kbyte.
- --block-md5 <file>
- Modes: all.
Compute the MD5 hash value over blocks of data produced by the reader stage.
The last block to be hashed may be smaller than the block size. All MD5
values are written to text file <file>. Each line in this file
contains a block number, followed by a space, followed by the hash value
of the corresponding block.
- --block-md5-size <size>
- Modes: all.
Sets the block size of the block-wise MD5 computation. The default block
size is 4 Kbyte.
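The per-block MD5 file described under --block-md5 lends itself to simple
post-processing. The Python sketch below parses two such files and reports
which blocks differ. It relies only on the line layout stated above (block
number, space, hash value). The function names are hypothetical. It is an
assumption, not a documented fact, that block numbers are decimal and that hash
values can be compared as case-insensitive text.

```python
def parse_block_md5(lines):
    """Map block number -> hash string from 'number<space>hash' lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # tolerate blank lines
        number, digest = line.split(" ", 1)
        # Assumption: decimal block numbers, case-insensitive hash text.
        table[int(number)] = digest.strip().lower()
    return table

def differing_blocks(a, b):
    """Return sorted block numbers whose hashes differ or exist on one side only."""
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
```

Comparing the file written during acquisition with one computed later over the
image narrows a mismatch down to individual blocks.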
A <size> argument may be followed by one of the following multiplicative
suffixes: c 1, w 2, b 512, k 1024, M 1,048,576, and G 1,073,741,824.
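As a worked illustration of the suffix table, the Python sketch below converts
a <size> argument to a byte count. It is not rdd-copy's own parser: the name
parse_size is hypothetical, and the sketch assumes that suffixes are
case-sensitive exactly as listed and that the number is a decimal integer.

```python
# Multipliers taken from the suffix list above.
SUFFIXES = {"c": 1, "w": 2, "b": 512, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(text):
    """Convert a size argument such as '16k' to a number of bytes."""
    text = text.strip()
    if text and text[-1] in SUFFIXES:
        # Assumption: a single trailing suffix character, case-sensitive.
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)
```

With these multipliers, 16k is 16 * 1024 = 16384 bytes and 2b is
2 * 512 = 1024 bytes.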
EXAMPLES¶
- rdd-copy --md5 /dev/hda1
-
Compute and print the MD5 hash value over /dev/hda1. On Linux,
/dev/hda1 denotes the first partition of the primary master
disk.
- rdd-copy -b 16k -m 512 -l rdd-log.txt /dev/fd0 f.img
-
Create an image of a floppy disk (/dev/fd0). Copy 16 Kbyte at a
time, but use blocks as small as a single sector (512 bytes) when read
errors occur. Write all log messages to the file rdd-log.txt.
- On the server: rdd-copy -S --sha1
- On the client: rdd-copy -C --sha1 /dev/hdb snake:/images/disk.img
-
Copy the primary slave disk to host snake and store the data in file
/images/disk.img. The client host computes a SHA1 hash over the
data it reads from the disk; the server host computes a SHA1 hash over the
data it receives from the network.
- rdd-copy --count 512 /dev/hda mbr.img
-
Copy the master boot record (MBR) from the primary master disk to file
mbr.img.
SEE ALSO¶
- rdd-verify(1), raw(8)
NOTES¶
If you encounter read errors, do examine
/var/log/messages (or the
equivalent file on your operating system). It may contain useful device driver
error messages.
On Linux (kernel 2.4 and lower), rdd-copy and other programs that read from a
block device may yield an I/O error when they reach the end of the device,
even if there's nothing wrong with the device. To the best of my knowledge,
this is a Linux problem rather than an rdd-copy problem; the same problem
occurs with GNU dd and other programs. The problem is described in the
following document:
http://www.cftt.nist.gov/Notes_on_dd_and_Odd_Sized_Disks4.doc. The problem has
apparently been solved in the Linux 2.6 kernel.
If you use rdd-copy to access a device, consider using the raw device (see
raw(8)). This way, your data will not travel through the buffer cache.
BUGS¶
Server-side errors are not reported back to the client. Users must watch the
server's output.
REPORTING BUGS¶
Report bugs to <rdd@holmes.nl>.
ACKNOWLEDGEMENTS¶
Many thanks to all who reported bugs and successes, and who suggested
improvements. You know who you are.
COPYRIGHT¶
Copyright © 2002-2003 Netherlands Forensic Institute
This software comes with NO warranty; not even for MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.
HISTORY¶
Up to version 1.2-7a, rdd-copy (then called rdd) used a different error recovery
strategy. With the new strategy, users can no longer set the recovery
threshold, so the --recovery-len option has been retired.