Imagine that we have two very large directories (tens or hundreds of GB) that we want to keep synchronized between two PCs.
The idea is that after modifying the files on one of the PCs, we could copy only the changes to the other PC (which in principle would be a small volume of data), without having to copy all the files every time (remember that we are talking about many GB of data).
If both systems are connected by a network (local or the Internet), my preferred solution, although there are certainly others, would be to use rsync.
We could also choose to store the data in "the cloud". But with that much data, the upload alone would take a very long time; for example, 100 GB over a 1 Mbit/s upstream connection:

100,000,000,000 bytes / (1,000,000 bits/s / 8 bits/byte) = 800,000 s ≈ 9.26 days
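The arithmetic above can be checked with a few lines of Python (100 GB and 1 Mbit/s are just the example figures from the calculation):

```python
# Transfer time for the example above: 100 GB over a 1 Mbit/s uplink.
size_bytes = 100_000_000_000      # 100 GB of data
rate_bits_per_s = 1_000_000       # 1 Mbit/s upstream connection

rate_bytes_per_s = rate_bits_per_s / 8      # 125,000 bytes/s
seconds = size_bytes / rate_bytes_per_s     # 800,000 s
days = seconds / 86_400                     # 86,400 seconds per day

print(f"{seconds:,.0f} s = {days:.2f} days")  # 800,000 s = 9.26 days
```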
If we do not have rsync or cloud storage available, and perhaps no network access at all, some would copy all the files to an external hard drive (hundreds of GB), and others might keep track of which files have changed in order to copy only those to a USB stick, and afterwards still have to copy the files again onto the target system…
Well, it turns out that rdiffdir is an excellent solution to this problem.
rdiffdir is written in Python, is part of the application duplicity, and uses the library librsync.
With rdiffdir we can easily create a file that contains only the changes made to a directory with respect to a previous state.
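The idea behind the signature/delta/patch cycle can be sketched in a few lines of Python. This is only a conceptual illustration (whole-file hashes instead of rdiffdir's rolling block checksums, and a toy in-memory delta), not rdiffdir's actual format:

```python
import hashlib
from pathlib import Path

def signature(directory):
    """Map each relative file path to a hash of its contents."""
    return {
        str(p.relative_to(directory)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in Path(directory).rglob("*") if p.is_file()
    }

def delta(old_sig, directory):
    """Collect files that are new or modified, plus paths that were deleted."""
    new_sig = signature(directory)
    changed = {
        path: (Path(directory) / path).read_bytes()
        for path, digest in new_sig.items()
        if old_sig.get(path) != digest
    }
    deleted = [path for path in old_sig if path not in new_sig]
    return changed, deleted

def patch(directory, changes):
    """Apply a delta to another copy of the directory."""
    changed, deleted = changes
    for path, content in changed.items():
        target = Path(directory) / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
    for path in deleted:
        (Path(directory) / path).unlink(missing_ok=True)
```

Unlike this sketch, rdiffdir stores block-level differences, so a small edit to a huge file produces a small delta.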
Let’s take as an example the directory of professional documents that we want to synchronize between the home PC and the work PC, to see how we would do it with rdiffdir.
We start from a scenario in which the directories are perfectly synchronized in both systems. We just got to work, and the directory we want to sync contains the following:
work:~$ find directory/
directory/
directory/subdirectory1
directory/subdirectory1/file_A.txt
directory/subdirectory2
directory/subdirectory2/file_B.txt
work:~$ cat directory/subdirectory1/file_A.txt
Test A
work:~$ cat directory/subdirectory2/file_B.txt
Test B
Before starting to work, we generate a file with the signature of the directory:
work:~$ rdiffdir signature directory signature_$(date +%y%m%d).rdiffdir
And we start working: we edit files, add files and directories, delete files…
work:~$ echo "New line" >> directory/subdirectory1/file_A.txt
work:~$ rm -rf directory/subdirectory2/
work:~$ mkdir directory/subdirectory3
work:~$ echo "Test C" > directory/subdirectory3/file_C.txt
And at the end of the day, using the signature file, we generate a file with the changes:
work:~$ rdiffdir delta signature_130208.rdiffdir directory changes_$(date +%y%m%d).rdiffdir
We go home, copy the changes file, and apply it to the directory:
home:~$ rdiffdir patch directory changes_130208.rdiffdir
And we verify that, indeed, we have the changes of the day:
home:~$ find directory/
directory/
directory/subdirectory1
directory/subdirectory1/file_A.txt
directory/subdirectory3
directory/subdirectory3/file_C.txt
home:~$ cat directory/subdirectory1/file_A.txt
Test A
New line
home:~$ cat directory/subdirectory3/file_C.txt
Test C
If we plan to make changes at home, we would have to repeat the process: generate a signature file before starting, and one of changes at the end.
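To avoid typing the commands by hand every day, the routine could be wrapped in a couple of shell functions. This is just a sketch: the function names are made up, and rdiffdir is assumed to be on the PATH.

```shell
# Hypothetical helpers for the daily routine; rdiffdir must be installed.
start_of_day() {
    # Before touching anything: record a signature of the directory.
    rdiffdir signature "$1" "signature_$(date +%y%m%d).rdiffdir"
}

end_of_day() {
    # After working: generate a delta against this morning's signature.
    rdiffdir delta "signature_$(date +%y%m%d).rdiffdir" "$1" \
        "changes_$(date +%y%m%d).rdiffdir"
}
```

On the other machine, the corresponding rdiffdir patch command applies the delta, exactly as shown above.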
But wait: could the cure be worse than the disease? What does the signature file of a huge directory contain? How long does it take to generate? Well, I just tried it on a directory with 22,000 files, 1,400 directories and 10 GB of data: generating the signature took about 6 minutes, and the file takes up about 130 MB — a suitable size to carry on a USB stick.
After installing the librsync-devel package, you just have to download the file with the duplicity sources, unpack it, enter the resulting directory from the Cygwin shell, and execute:

python setup.py install