olecranon

Question about chunk size in RAID 5?

I have a file server running Ubuntu 7.10 (server) with 4×500 GB disks in RAID 5. (Mostly DVDs, music, pictures and some documents.)

During creation of the RAID array, I left the chunk size at the default of 4 KB.

I haven't been able to find out exactly what size is most recommended, but it seems that 32-128 KB is the most common range.

My only concern is whether leaving the chunk size at 4 KB could be harmful to my system in any way.

I think read/write speed is OK.

Should I change the chunk size?

Olec

Yes, Linux (md devices, actually) uses the term chunk size, but that's just stripe size. Generally you'll find most systems default the stripe size to 64 KiB. 4 KiB is pretty low (I thought the default in md was 64 KiB as well, or are you confusing that with the block size of the filesystem, which would be 4 KiB on most systems?).

Unless you know your workload very well, I would suggest you keep the default.
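
If you want to double-check what md reports for the existing array (assuming the array is /dev/md0 and mdadm is installed), something like this will show it:

cat /proc/mdstat
# the md0 line should end with something like "... level 5, 4k chunk, algorithm 2 [4/4] [UUUU]"

sudo mdadm --detail /dev/md0 | grep -i chunk
# prints a line like "Chunk Size : 4K"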

Use 256 KB for 7,200 rpm drives or 1 MB for 10,000 rpm drives; those typically work best, and anything from 128 KB up will boost performance. There are a lot of other things you can do as well; search the linux-raid mailing list for the threads on optimization.
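
For reference, the chunk size is chosen at creation time with mdadm's --chunk option (the value is in KiB), so a rebuild with 256 KB chunks would look roughly like this (device names here are only an example, adjust to your disks):

sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 --chunk=256 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1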

Larger stripe sizes only increase performance if your request size is large and contiguous; if not, they will hurt performance. Without knowing the user's workload it's hard to make any recommendations.
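
For example (rough numbers, assuming a 4-drive RAID 5): with 64 KiB chunks a full stripe holds 3 × 64 KiB = 192 KiB of data, so a 192 KiB sequential write can go out as one full-stripe write across all four drives. With 1 MiB chunks that same write typically lands inside a single chunk on one drive and forces a read-modify-write of the parity, while only a large streaming transfer would benefit from the bigger per-drive requests.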

Well, I'm not sure of anything :unsure: Here is what the array reports:

Device file: /dev/md0
RAID level: Redundant (RAID 5)
Filesystem status: Mounted on /DataRAID
Usable size: 1465151808 blocks (1.36 TB)
Persistent superblock? Yes
Parity algorithm: Default
Chunk size: 4 kB
RAID status: clean
Partitions in RAID:
SCSI device B, partition 1
SCSI device C, partition 1
SCSI device D, partition 1
SCSI device E, partition 1
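
(For what it's worth, that usable size matches the RAID 5 math: data capacity = (disks − 1) × disk size, so 3 × 500 GB ≈ 1.36 TiB, i.e. roughly 3 × 488 million 1 KiB blocks ≈ the 1465151808 blocks shown.)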

It will take some time and effort to move all the data off and rebuild the array... will it be harmful to my disks, or cause them any distress, if I leave it this way?

If there is any chance of that, I will do it :rolleyes:

Yes, that's the stripe size. That was the default? Strange, as I've never seen it that small except on purpose. Anyway, to your question: that small a size increases the number of requests to each drive, assuming you have requests greater than 8 sectors (4 KiB), since larger requests will be broken down into chunk-sized pieces. This will slow down the system, yes, but it will not cause a functional issue unless you have a response-time/bandwidth requirement that you have to hit. Actually, with the smaller stripe size you have more of a chance of always writing a full stripe width when you write files, so with parity RAID (4/5/6) it may even help some (though I wouldn't do this on purpose ;) ).

So at the end of the day, you won't cause any harm; it just may not perform the best. If it's going to be a pain now, you can fix it whenever you get around to re-doing the array in the future.
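
To put rough numbers on it for your 4-disk array: with 4 KiB chunks, a single 64 KiB sequential read gets chopped into 16 chunk-sized requests spread over the drives (roughly 4 per disk before any merging), whereas with a 64 KiB chunk it would normally be one request to one disk. On the write side a full stripe is only 3 × 4 KiB = 12 KiB of data, so almost any reasonably sized write covers whole stripes and parity can be computed without the extra read-modify-write.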

Thanks a lot... :) Very good answer, it made some things clearer.

One last question, if you don't mind... which stripe/chunk size would you use if you had a file server with mostly DVDs (not .iso, but VIDEO_TS folders), MP3s and pictures?

About the default 4 KB: I built the array with mdadm through Webmin, maybe that explains it. I'm pretty sure I didn't do it on purpose ;)

Olec

Stripe and chunk are not the same :)

The chunk is the part of the stripe on each disk, so 4k is reasonable, though I tend to go with 32k for general-purpose applications.

@wimcle actually they are. Stripe size is the size PER DISK of contiguous space used for an array. The confusion is that in a lot of places people do not explain clearly what they mean by 'stripe width', which is stripe size * number of drives in the array. To avoid some of this confusion, the authors of md use the term 'chunk size' for 'stripe size' and then use 'stripe' for what 'stripe width' means. Simply put, in the industry "stripe size" = "chunk size", and the entire parity group is called the "stripe width". (You can blame whoever came up with it back in the '70s. ;) )
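
Concretely, for the array in this thread (4 disks in RAID 5): with a 64 KiB chunk/stripe size, the stripe width is 4 × 64 KiB = 256 KiB on disk, of which 3 × 64 KiB = 192 KiB is data and 64 KiB is parity per stripe.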

@olecranon if the namespace (filesystem) only holds large files, then the real questions are how they are being accessed (SMB/CIFS, NFS, iSCSI, et al.) and how many outstanding requests you have at a time (to indicate queue depth). Generally I would suggest 64 KiB: most people probably use CIFS/SMB for drive mapping, and with Windows your usual largest request size over the network is 64 KiB, so this would let you increase parallelism in the array.

One comment, though: most of the 'performance items' that get mentioned do not really show up with small arrays, which is one reason there is a lot of competing 'tribal knowledge' out there. A small array is basically anything with a stripe width of <= 8 drives. With arrays like that, performance is governed more by the bit density of the media (which is mainly what streaming read/write tests show). For file serving with multiple clients (>1), the access patterns will turn out to be much more random. (Also, FYI, randomness is _not_ really fragmentation of a file or a type of file; by randomness I mean the actual requests hitting the drive subsystem. I.e., a subsystem (RAID, drives, whatever) only really sees something like "read starting at block X for Y sectors". Drives will try to merge commands together if they are able, but with multiple users each request will need to move the head, so even for a sequential file it's rare, at least from my observations, that you access it sequentially from a drive's point of view in a multi-user environment.)

I haven't used Webmin in a long time, so that may be what it's using. I just threw together a quick md RAID here on an Ubuntu test box and it used 64 KiB. Like I said, it's not going to NOT work. ;)
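
(If anyone wants to reproduce that kind of throwaway test without spare disks, an md array can be built on loopback devices; a rough sketch, with /dev/md9 and the file names purely as examples:

dd if=/dev/zero of=/tmp/d0.img bs=1M count=128      # repeat for d1.img .. d3.img
sudo losetup /dev/loop0 /tmp/d0.img                 # repeat for loop1 .. loop3
sudo mdadm --create /dev/md9 --level=5 --raid-devices=4 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
cat /proc/mdstat                                    # shows the chunk size mdadm picked by default

Stop it afterwards with "sudo mdadm --stop /dev/md9" and detach the loop devices with losetup -d.)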
