When developing web services and their related web sites, transferring code is easy. It is normally handled via git, which creates patches behind the scenes, making the process fast and efficient.
However, few websites are complete without substantial collections of images (file sets) that are often, quite reasonably, kept separate from the code repository. Sometimes you need to make lots of minor changes to these collections, and the process may involve several very large transfers as the image collection is first downloaded and then re-uploaded with the changes. Judicious use of rsync can mitigate this to some extent, but where file ownership or other attributes change, rsync often gets confused and does little better than the tried and trusted process of creating a tarball and transferring the collection by hand.
I generally try to get the relevant files/folders made into a git repo but sometimes this is not possible (git slows down with fifty thousand plus scanned book files, for example) so I'll assume a git repo is not an option here.
If we are working on the file set for the first time, create a tarball or rsync to get a local copy. Once you have a local copy (`~/original`), make a copy of it (`~/updated`) to work on (that would be a git branch normally, of course).
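For that initial copy, one option is to stream the tarball straight over ssh rather than creating it on disk first. A minimal sketch, using a local `server` directory to stand in for the remote machine (the directory names and the commented ssh command are illustrative, not from the workflow above):

```shell
# Stand-in for the remote machine: a local 'server' directory
mkdir -p server/fileset
echo 'an image' > server/fileset/pic.jpg

# On a real server you might stream the tarball over ssh instead
# (hypothetical host and path):
#   ssh user@host 'tar czf - -C /srv fileset' | tar xzf -
tar czf - -C server fileset | tar xzf -

# Make the working copy to edit (the "branch")
cp -R fileset fileset.updated
```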
For fun, I'll just create an original with some files in it:
```
~ $ mkdir -p original/a/aa
~ $ mkdir -p original/b
~ $ echo 'original content' > original/a/aa/foo
~ $ echo 'original content' > original/b/bar
```
When you finish work on `~/updated`, create the patch. Here's some fictitious work on `updated`:

```
~ $ cp -R original updated
~ $ rm updated/b/bar
~ $ echo 'more content' >> updated/a/aa/foo
~ $ mkdir -p updated/c/cc
~ $ echo 'new file in new folder' > updated/c/cc/grumble
```
Now we make our patch. This will be a nice small file, easily transferred back to the server.
```
~ $ diff -urNa original updated > my_patch
```
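The patch is plain text and compresses very well, so it's worth gzipping before the trip back to the server. A quick sketch (the stand-in patch file and the commented scp line are illustrative):

```shell
# A stand-in patch file (use your real my_patch instead)
printf 'example patch content\n' > demo_patch

# Compress a copy for transfer; -c leaves the original intact
gzip -c demo_patch > demo_patch.gz
ls -l demo_patch demo_patch.gz

# Then send it, e.g. (hypothetical host):
#   scp demo_patch.gz user@host:/tmp/
```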
Diff options: `-u` (unified format), `-r` (recurse into directories), `-N` (treat absent files as empty, so new and deleted files show up in the patch) and `-a` (treat all files as text) are probably all you'll need. The `-a` switch is important if you are patching anything involving binaries (such as file sets containing images).
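To see `-a` in action, here's a sketch with a file containing some raw non-text bytes (the directory and file names are made up for the demo). Without `-a`, diff would just report "Binary files differ" and produce nothing patch can use:

```shell
# A small 'binary' file: a few raw high bytes mixed with text
mkdir -p bin_orig
printf 'header\n\377\376\001raw\n' > bin_orig/logo.img
cp -R bin_orig bin_upd
printf 'extra\n' >> bin_upd/logo.img

# -a forces diff to treat the files as text so a usable patch is produced
diff -urNa bin_orig bin_upd > bin.patch || true

cp -R bin_orig bin_tgt
patch -p1 -d bin_tgt -i ../bin.patch
cmp bin_upd/logo.img bin_tgt/logo.img
```

Note that patch matches context byte-for-byte, so for a real image the patch for a changed file can approach the size of the file itself; the win is that unchanged files cost nothing.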
To test the patch we'll create a copy of the original (just in case) and patch it:
```
~ $ cp -R original target
~ $ patch -p1 -d target -i ../my_patch
patching file a/aa/foo
patching file b/bar
patching file c/cc/grumble
```
Let's see how it did. The first diff produces no output, confirming `target` now matches `updated`; the second shows the changes relative to `original`:
```
~ $ diff -urNa updated target
~ $ diff -urNa original target
diff -urNa original/a/aa/foo target/a/aa/foo
--- original/a/aa/foo	2014-09-16 15:57:49.124409329 +1000
+++ target/a/aa/foo	2014-09-16 16:30:56.984435375 +1000
@@ -1 +1,2 @@
 original content
+more content
diff -urNa original/b/bar target/b/bar
--- original/b/bar	2014-09-16 15:58:01.700409494 +1000
+++ target/b/bar	1970-01-01 10:00:00.000000000 +1000
@@ -1 +0,0 @@
-original content
diff -urNa original/c/cc/grumble target/c/cc/grumble
--- original/c/cc/grumble	1970-01-01 10:00:00.000000000 +1000
+++ target/c/cc/grumble	2014-09-16 16:30:56.984435375 +1000
@@ -0,0 +1 @@
+new file in new folder
```
Perfect. The usual gotcha is the `-p` parameter to `patch`, which strips that many leading components from the file names in the patch. Look at the file names in the patch and add 1 to `-p` for each leading parent folder that isn't required (here `-p1` strips the `original/` and `updated/` prefixes). Use the `--dry-run` option first if you're worried.
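The whole round trip, including the `--dry-run` rehearsal, can be sketched end to end with a self-contained toy layout mirroring the example above:

```shell
mkdir -p demo/original demo/updated
echo 'one' > demo/original/file
echo 'two' > demo/updated/file
(cd demo && diff -urNa original updated > my_patch) || true

# The file names in the patch start with original/ and updated/,
# one folder that doesn't exist inside target - hence -p1
head -n 2 demo/my_patch

cp -R demo/original demo/target
patch --dry-run -p1 -d demo/target -i ../my_patch   # rehearse first
patch -p1 -d demo/target -i ../my_patch
```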
The result of all this: that 256 GB file transfer only ever needs doing once. Thereafter, patching has you transferring just the changes. This can save heaps of time... although I still aim to turn anything where this is going on into a git repo first, if at all possible - git makes life so much easier by doing all the patching for you.
If others on the server are changing the file set (or perhaps the service changes it), make sure you keep a copy of the original on the server before transferring it locally. That way you can make patches in future to bring their updates down to your local copy.
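A minimal sketch of that habit, with a local `images` directory standing in for the live file set on the server (all names here are illustrative):

```shell
# Live file set on the server (stand-in)
mkdir -p images
echo 'v1' > images/pic

# Keep a pristine copy before you download the set
cp -R images images.pristine

# Later, someone (or the service) changes the live set...
echo 'v2' > images/pic

# ...and their changes come back to you as a small patch
diff -urNa images.pristine images > server_updates.patch || true
```

Apply `server_updates.patch` to your local copy with `patch -p1` as before, then refresh `images.pristine` so the next round of diffs starts from a clean baseline.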