getafix

Archiving several GB of data

I have ~200 GB of bzip2-compressed files that I would like to archive to DVDs. I am working under Linux, and the simplest solution I can think of is using RAR or tar to create 4.7 GB archive volumes that I can then burn using mkisofs plus DVD writing software. This approach has the disadvantage that if I want to restore a few files from the archive, I have to have all the disks that constitute a single tar or RAR archive to perform the restore. Am I mistaken regarding this assumption?

Using archiving software, or adapting backup software for this purpose, also means that if I change platform down the road, I have to have the same or compatible software on that platform to perform the restore. This problem does not arise if I use an ISO-type filesystem approach.

If the limitation with tar or RAR is real, then I may be better off creating ISO images that optimally fit files on each DVD. The advantage of this approach is that restoring data only involves identifying the DVD(s) that contain the file(s), using an online index of what is on each DVD, and then loading the identified DVD(s) to simply copy the data back onto the system.
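To make the index idea concrete, here is a minimal sketch in Python. It assumes the layout is already known as a list of per-DVD file lists; the file names and the helper names `build_index` and `discs_needed` are made up for illustration, not taken from any existing tool.

```python
def build_index(layout):
    """Map each file name to the number (1-based) of the DVD holding it."""
    index = {}
    for disc_number, names in enumerate(layout, start=1):
        for name in names:
            index[name] = disc_number
    return index


def discs_needed(index, wanted):
    """Return the sorted list of DVD numbers required to restore `wanted`."""
    return sorted({index[name] for name in wanted})


# Hypothetical layout: one inner list per burned DVD.
layout = [
    ["logs-2004.tar.bz2", "mail-2004.tar.bz2"],
    ["home-a.tar.bz2"],
    ["home-b.tar.bz2", "etc.tar.bz2"],
]
index = build_index(layout)
needed = discs_needed(index, ["etc.tar.bz2", "mail-2004.tar.bz2"])
print(needed)
```

In practice the index would be saved to a text file kept on the hard drive, so a restore starts with a lookup rather than with loading disks one by one.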

I have not come across any software that lets me select a set of files and have it split them into groups that minimize the number of DVDs required. For example, with three files A (1 GB), B (4 GB), and C (1 GB), an optimized layout needs 2 DVDs: A + C on disk I and B on disk II. Without optimization, going through the list sequentially, A would go on disk I, B on disk II (as it will not fit with A), and C on disk III (as it will not fit with B).

This problem does not appear to be trivial, but it seems to me that someone must have had to write software to address a similar problem. A description of a similar problem and algorithms can be seen here.
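For what it is worth, this is usually described as a bin packing problem, and a common heuristic for it is "first-fit decreasing": sort the files from largest to smallest, then put each one on the first DVD that still has room. The sketch below is only an illustration of that heuristic (it is not guaranteed to find the true minimum in every case); the function name, the sizes in MB, and the 4700 MB capacity are assumptions chosen to match the A/B/C example.

```python
def pack_first_fit_decreasing(sizes, capacity):
    """Group files onto discs using the first-fit decreasing heuristic.

    sizes:    dict mapping file name -> size (same unit as capacity)
    capacity: usable size of one disc
    Returns a list of discs, each a list of file names.
    """
    discs = []  # each entry is [free_space, [file names]]
    # Largest files first; ties broken by name so the result is repeatable.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > capacity:
            raise ValueError(f"{name} ({size}) does not fit on one disc")
        for disc in discs:
            if size <= disc[0]:
                disc[0] -= size
                disc[1].append(name)
                break
        else:
            # No existing disc has room, so start a new one.
            discs.append([capacity - size, [name]])
    return [names for _, names in discs]


# The example from the post, in MB, with an assumed 4700 MB per DVD.
files_mb = {"A": 1000, "B": 4000, "C": 1000}
layout = pack_first_fit_decreasing(files_mb, capacity=4700)
print(layout)  # [['B'], ['A', 'C']]
```

Here B gets a disc to itself and A and C share the second one, which is the two-DVD layout described above. The real usable capacity of a DVD is somewhat below the nominal figure once filesystem overhead is counted, so the capacity value would need adjusting.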

Any ideas?


Don't TAR or RAR have an option to create independent parts?

It has been a while since I used this but it should come in handy:

http://www.pcworld.com/downloads/file_desc...fid,5132,00.asp

That appears to be a file splitter, for making a big file fit on media whose capacity is smaller than the file. My problem is to fit as many files as possible on each disk without having to split them.

Maybe I should think of splitting the last file to make it fit optimally.


Isn't the table of contents stored only on the first or last disk? If there is a way to get tar to store independent parts, each disk with its own table of contents, that may not be a bad option, as it will do all the splitting to make files fit. All I have to do is maintain an online copy of the table of contents from each disk to determine which disk(s) I need for any particular restore.

Isn't the table of contents stored only on the first or last disk?

I think tar interleaves the two: metadata, data, metadata, data, and so on, with no separate table of contents.

But what you want isn't that special, so I'd expect such an option to exist.


Have you looked at dar? I think it does what you are after, though I have not used it myself.

From the docs:

Slices

Dar stands for Disk ARchive. From the beginning, it was designed to be able to split an archive over several pieces of removable media -- no matter how many or what size. Thus, dar is able to save over floppy disks, CD-Rs, DVD-Rs, CD-RWs, DVD-RWs, Zip disks, Jazz disks, etc. Dar is not concerned by un/mounting a removable medium; instead it operates independently of the hardware. Given the size, it will split the archive in several files (called slices), pausing before creating each new slice. This allows the user to un/mount a medium, burn the slice to a CD-R, send it by email (if your mail system does not allow huge file in emails, dar can help you here also). By default, (no size specified), dar will make only one slice. If a slice size is specified and dar creates multiple slices, the size of the first slice can be specified separately. This is useful if, for example, you want to fill up a partially filled disk before starting use of an empty one. At restoration time, dar will look for the slices it needs, asking for a slice only if it is missing and required.

Direct Access

Even when using compression, dar does not have to read the whole backup to extract one file. If you just want to restore one file from a huge backup, the process will be much faster than using tar. To extract one or more files, dar first reads the catalogue (i.e. the contents of the backup), then goes directly to the location of the saved files you want to restore, and proceeds with restoration. When using slices, dar will ask only for the slices containing the files to restore. You can also restore all files from an archive, in which case dar will read the slices sequentially. When doing a full restore, no slice (except the first and last slices) will be asked for more than once.


Note: As I mentioned in this older thread, an ISO 9660 formatted DVD will not reliably hold files larger than 2 gigabytes (many tools and drivers stop there, and the format itself caps a single extent at 4 GiB). You will need to adjust your splitting scheme to make optimal use of the available space, or write your DVDs in UDF (Universal Disk Format).

Anyway, I still think the simplest solution to your problem is an external 250 GB USB 2.0 hard drive instead of playing a human DVD jukebox.

JD

Have you looked at dar? I think it does what you are after, though I have not used it myself.

[... snip]

This sounds very promising. I will take a look at it and see how it works. As the source is available, portability is probably not an issue in the long term.

Thanks!

