One of our servers has over 7 GB of backup data, mostly due to content (images, videos, etc.) loaded into the content management system. When attempting to launch a new VM from a backup of this system with hub-launch, the restore seems to run smoothly at first but eventually stalls. After manually running the restore from the command line, it was clear why hub-launch was stalling: the root '/' partition ran out of space.
Restoring duplicity archive from s3://s3-us-west-1.amazonaws.com/tklbam-gkp7dhx45incclo2
Synchronizing remote metadata to local cache...
Copying duplicity-full-signatures.20110914T195922Z.sigtar to local cache.
Copying duplicity-full.20110914T195922Z.manifest to local cache.
Last full backup date: Wed Sep 14 19:59:22 2011
Traceback (most recent call last):
File "/usr/lib/tklbam/deps/bin/duplicity", line 1252, in <module>
with_tempdir(main)
File "/usr/lib/tklbam/deps/bin/duplicity", line 1245, in with_tempdir
fn()
File "/usr/lib/tklbam/deps/bin/duplicity", line 1199, in main
restore(col_stats)
File "/usr/lib/tklbam/deps/bin/duplicity", line 539, in restore
restore_get_patched_rop_iter(col_stats)):
File "/usr/lib/tklbam/deps/lib/python2.6/site-packages/duplicity/patchdir.py", line 522, in Write_ROPaths
ITR( ropath.index, ropath )
File "/usr/lib/tklbam/deps/lib/python2.6/site-packages/duplicity/lazy.py", line 335, in __call__
last_branch.fast_process, args)
File "/usr/lib/tklbam/deps/lib/python2.6/site-packages/duplicity/robust.py", line 37, in check_common_error
return function(*args)
File "/usr/lib/tklbam/deps/lib/python2.6/site-packages/duplicity/patchdir.py", line 575, in fast_process
ropath.copy( self.base_path.new_index( index ) )
File "/usr/lib/tklbam/deps/lib/python2.6/site-packages/duplicity/path.py", line 416, in copy
other.writefileobj(self.open("rb"))
File "/usr/lib/tklbam/deps/lib/python2.6/site-packages/duplicity/path.py", line 595, in writefileobj
fout.write(buf)
IOError: [Errno 28] No space left on device
I attempted a quick-and-dirty solution (or so I thought) by logging into the newly launched server and running the following commands:
mkdir /mnt/tmp
chmod 777 /mnt/tmp
chmod +t /mnt/tmp
rm -rf /tmp
ln -s /mnt/tmp /tmp
tklbam-restore [backup id]
but I ran into the same problem.
I am hoping that there is a solution that would allow me to continue to use hub-launch without having to perform the restore manually, but would live with a quick-and-dirty manual solution that actually works.
Thanks!
-Ken
Update
I tried setting the TMPDIR environment variable, per the Duplicity FAQ (http://duplicity.nongnu.org/FAQ.html), to:
/mnt/duplicity/tmp
since the /mnt partition is 335 GB, but still no luck. While Duplicity appeared to be using /mnt/duplicity/tmp (9.6 GB used), the restore failed with "[Errno 28] No space left on device". For what it's worth, here is what the file system disk space usage looks like after the error:
TKLBAM doesn't hardwire /tmp
I haven't seen anything to indicate that Duplicity has difficulty restoring big backups, so I don't think this is likely to be an inherent limitation. By design, TKLBAM should be able to back up and restore arbitrarily large amounts of data.
Obviously if you run out of disk space that's going to be an issue regardless of what backup/restore method you are using. There's no magic involved.
Anyhow, I've just taken a look at the TKLBAM source code and confirmed that /tmp isn't hardwired anywhere. Maybe you just didn't set TMPDIR right? You need to export the variable if you want your shell to set it not just in the local environment but also in the environment of the programs it executes. Like this:
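A minimal sketch of the difference, assuming the /mnt/duplicity/tmp directory from the earlier post as the scratch location (the `scratch` variable name is only for illustration). A plain assignment stays inside the shell, while `export` passes the variable on to child processes such as the restore:

```shell
# Assumption: /mnt/duplicity/tmp is the scratch directory from the earlier post.
scratch=/mnt/duplicity/tmp

# A plain assignment stays local to the shell; child processes do not see it.
unset TMPDIR
TMPDIR="$scratch"
sh -c 'echo "without export: ${TMPDIR:-unset}"'

# export puts the variable into the environment of every program the shell starts.
export TMPDIR
sh -c 'echo "with export: ${TMPDIR:-unset}"'

# The restore would then be run from this same shell:
# tklbam-restore [backup id]
```

The directory has to exist and be writable before the restore starts; the sketch only shows how the variable reaches child processes.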
If that still doesn't work, try redirecting /tmp to /mnt like this:
Please tell me if any of these workarounds help. We'll look into solving the underlying EC2 configuration issue...
Update
So I decided to go with:
right before I do a large restore and it appears to be working. All files get downloaded and cached by Duplicity to the /tmp directory on the /mnt partition and the root / partition grows as expected (new packages are installed), but does not run out of space.
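One way to confirm this kind of behaviour before and during a restore is to compare the free space on the filesystem holding the temp directory with the free space on root. This is a rough sketch, not part of TKLBAM; the `avail_kb` helper is a hypothetical name and it relies only on the POSIX output format of `df -P`:

```shell
# Print the available space (in KB) on the filesystem that holds a directory.
# df -P guarantees one line per filesystem; column 4 is "Available".
avail_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Use TMPDIR if it is set, otherwise fall back to /tmp.
tmp_dir="${TMPDIR:-/tmp}"

echo "temp dir $tmp_dir: $(avail_kb "$tmp_dir") KB free"
echo "root /: $(avail_kb /) KB free"
```

If both lines report the same small number, the temp directory is probably still on the root partition and a large restore is likely to fail the same way.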
Thanks for the help!
Added fix to Hub that mount --binds /tmp to instance storage
tklbam restore failure because of -No space left on device
A heads up: I have tried everything mentioned above to do a restore, with no joy at all. My situation is dire because my PRODUCTION server crashed. Does anyone have ideas on how I can do a tklbam restore without generating the error below?
File "/usr/bin/tklbam-restore", line 444, in main
log_fh.write(trap.std.read())
IOError: [Errno 28] No space left on device
Is it a locally hosted server?
If so, then perhaps you are out of physical space?
tklbam restore failure because of -No space left on device
I was able to resolve this problem by running these commands in the following order. This was after googling profusely for two days.
mkdir /temp
chmod 1777 /temp/
export TMPDIR=/temp/
mount --bind /temp/ /tmp/
tklbam-restore 23 --time 2015-09-20T07:23:16