
Multitrack audio recording: to RAID or not to RAID


I've been doing a lot of multitrack sound recording on my PC lately, and I've been wondering how best to organise my storage subsystem for this kind of application. I've read some of the excellent posts on these forums on the benefits of using independent disks rather than RAID striping, and thought I'd see how some of these ideas apply to DAW (Digital Audio Workstation) applications.

Like many of you out there, I started out thinking that RAID 0 would be the way to go - everyone says the increased STR (sustained transfer rate) is great for multimedia applications such as video editing. Well, I'm not sure that it's any good at all for large multitrack sound recording sessions. I'll explain...

When I was setting up my system for my first big recording session, I picked up a second-hand LSI/AMI MegaRAID Elite 1600 (dual Ultra160 SCSI channels, i960 processor, 128 MB buffer cache) and four 6th gen 10k rpm U160 Cheetahs for some pretty reasonable prices. Things worked well for a while, but as we added more tracks and effects to the recordings, things started to get very bogged down. One major problem was that Adobe Audition, the software we were using, creates a new thread for each real-time effect you add. Each effect processing thread adds a significant memory overhead - after adding a dozen effects, the Audition process alone was consuming more than a gigabyte of RAM. The system only had 512 MB RAM, so there was a lot of virtual memory thrashing, and we had to turn off most of the effects in order to continue recording.

I upgraded to a dual Opteron system earlier this year, and gave it a healthy dose of 2 GB RAM, but the system was still struggling to get the data off the disks fast enough. The RAID array couldn't provide the data fast enough to keep the CPUs at all busy when mixing down the tracks. After doing some more reading of the Storage Review forums and thinking things over, I realised that with a large number of audio tracks, RAID 0 was probably completely inappropriate.

With video editing, you normally have a small number of tracks (files) to play back simultaneously, probably only one, two or three. The data rates can be quite high, especially if you're doing high-definition work, but consumer-level DV only requires about 4 MB/s. A complex multitrack audio session, on the other hand, might have 48 tracks (files) being played simultaneously, each of which might require 384,000 byte/s (96 kHz sampling rate, 32-bit resolution). That's about 18 MB/s in total, which in itself is quite a lot of data to be streaming off a disk.
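To make that arithmetic explicit, here is a minimal Python sketch. The function name is made up for illustration, and it assumes uncompressed PCM with one channel per track, which is what the 384,000 byte/s figure implies.

```python
def pcm_rate_bytes_per_s(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Data rate of one uncompressed PCM track, in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


# Figures from the post: 96 kHz, 32-bit, one channel per track, 48 tracks.
per_track = pcm_rate_bytes_per_s(96_000, 32)
session_total = 48 * per_track

print(f"per track : {per_track:,} byte/s")
print(f"48 tracks : {session_total / 1e6:.1f} MB/s")
```

Swapping in 44,100 Hz or 24-bit samples shows how quickly the total changes with the recording format.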

However, the real problem is that those 48 tracks are going to be stored on different regions of the disk, and the system must read a little bit of each track, process and mix (sum) the data, send it to the audio hardware, then read the next little bit from each track's file. That's 48 disk accesses at different locations on the disk just for one small portion of the session. If you have a reasonably fast drive, the average access time might be around 12 ms, so your drive can access at most about 80 locations per second, and that's without leaving any extra time to read more data from that location. Depending on your drive's firmware, the readahead buffer cache will alleviate this somewhat, but it's still going to take a pretty fast, intelligent drive to perform this many disk seeks and still give you the required 18 MB/s to play back the 48-track session without dropping out.
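As a rough sketch of that seek budget, the Python below works out how many random accesses a drive with a 12 ms access time can make per second, and how much data each access would have to return to keep 48 tracks fed. The function names are hypothetical. The model deliberately ignores transfer time, the drive's readahead cache and command reordering, so treat the result as an optimistic bound rather than a prediction.

```python
def max_accesses_per_s(access_time_ms: float) -> float:
    """Upper bound on random accesses per second, ignoring transfer time."""
    return 1000.0 / access_time_ms


def min_read_bytes(total_rate_bytes_per_s: float, access_time_ms: float) -> float:
    """Average bytes each access must return to sustain the total rate."""
    return total_rate_bytes_per_s / max_accesses_per_s(access_time_ms)


accesses = max_accesses_per_s(12.0)           # about 83 per second
passes = accesses / 48                        # full passes over all 48 tracks per second
chunk = min_read_bytes(48 * 384_000, 12.0)    # bytes needed from each access

print(f"{accesses:.0f} accesses/s, {passes:.2f} passes/s, {chunk / 1024:.0f} KiB per read")
```

In other words, with fewer than two passes over the track list per second, each read has to pull in a couple of hundred kilobytes per track before any transfer time is even counted.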

[As an aside, if you want to see for yourself just how much sequential transfers are hurt by the drive having to seek to other locations, try running two instances of your favourite STR benchmark (HDTach, HD Tune, ...) at the same time. When you start the second one, you'll probably find the transfer rate of the first drops to maybe 10% of what it was. Granted, the combined rate of the two tasks is still 20% of the maximum STR of the drive, but that's still a pretty big hit. Try running a few more at the same time and watch it go down further. It doesn't take much to turn your 70 MB/s monster into a quivering 5 MB/s jelly.]
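The effect in that aside can be approximated with a toy model: assume every read of a given size is preceded by one full access (seek plus rotational latency), then see what is left of the sequential rate. The 70 MB/s and 12 ms figures come from the post; the read sizes are assumptions chosen for illustration, and real drives with readahead and queueing will behave somewhat better than this.

```python
def effective_rate_mb_s(chunk_kb: float, str_mb_s: float = 70.0, access_ms: float = 12.0) -> float:
    """Throughput when every read of chunk_kb costs one full access first."""
    chunk_mb = chunk_kb / 1000.0
    time_per_read_s = access_ms / 1000.0 + chunk_mb / str_mb_s
    return chunk_mb / time_per_read_s


# Hypothetical read sizes; larger reads amortise the access time better.
for chunk_kb in (64, 256, 1024, 4096):
    print(f"{chunk_kb:5d} KB reads -> {effective_rate_mb_s(chunk_kb):5.1f} MB/s")
```

With 64 KB reads this model lands at roughly 5 MB/s, which is in line with the "quivering jelly" figure above; only multi-megabyte reads get anywhere near the drive's rated STR.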

So, I had to come up with a more intelligent approach; in particular, I had to try to minimise the number of seeks being performed by each drive. I realised that the ideal would be to use a dedicated hard drive for each track, just like the way tracks are stored on traditional sequential tape media. Obviously this would be too expensive, but that's what I should be trying to approximate: storing each track entirely on one drive, minimising the number of tracks per drive, and trying to distribute tracks across drives so that tracks that play simultaneously can be read from different drives. You'll notice that this is pretty much the opposite of what RAID striping does: RAID 0 would take a single audio track file, break it into small chunks, and write each chunk to a different drive. If you're only reading a single file, the striped configuration might well be faster, but if you're reading 50 tracks, each disk is going to experience a punishing number of localised I/Os.

So, I broke up the RAID array, moved the drives onto an Adaptec 39160 U160 non-RAID HBA, and set them up as separate volumes AV1..AV4. I usually use four microphones for each instrument - close mic, mid-distance mic, and an L-R stereo pair for ambience - and this lends itself nicely to being distributed across the four drives: put all the close-miked tracks on AV1, the mid-distance tracks on AV2, and the L and R room tracks on AV3 and AV4 respectively. This results in a nice degree of parallelism, without overloading any single drive with I/O. Here's how I might distribute the tracks across physical drives for a hypothetical session:



AV1                  AV2                  AV3                AV4
[...]                [...]                Tom 1              Tom 2
Drums overhead 1     Drums overhead 2     Drums room L       Drums room R
Bass direct          Bass close mic       Bass mid mic       Bass room mic
Guitar 1 close mic   Guitar 1 mid mic     Guitar 1 room L    Guitar 1 room R
Guitar 2 close mic   Guitar 2 mid mic     Guitar 2 room L    Guitar 2 room R
Lead vox close mic   Lead vox mid mic     Lead vox room L    Lead vox room R

Remember, the key is to get files that are being accessed simultaneously onto different drives. If you put them on the same drive, the drive will have to frantically hunt back and forth between the two, wasting much time in the process.

Notice how each drive only has to read 6 tracks in this example, for a total of 24 tracks. (Of course, the usual stuff about using separate, dedicated drives for swap and A/V capture still applies as well.)
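The placement rule can be written down as a small Python sketch: each microphone position maps to one drive, so the four tracks of any instrument always end up on four different spindles. The mapping table and function name are illustrative only, and instruments that don't follow the four-mic pattern (such as the bass above) would need to be placed by hand.

```python
from collections import defaultdict

# Mic position -> volume, following the scheme described above.
DRIVE_FOR_POSITION = {"close": "AV1", "mid": "AV2", "room L": "AV3", "room R": "AV4"}


def place_tracks(tracks):
    """Group (instrument, mic position) pairs by the drive they should live on."""
    layout = defaultdict(list)
    for instrument, position in tracks:
        layout[DRIVE_FOR_POSITION[position]].append(f"{instrument} {position}")
    return dict(layout)


# Three instruments, four mics each: 12 tracks in total.
session = [(inst, pos)
           for inst in ("Guitar 1", "Guitar 2", "Lead vox")
           for pos in DRIVE_FOR_POSITION]
layout = place_tracks(session)

for drive in sorted(layout):
    print(drive, layout[drive])
```

Every drive ends up with the same number of tracks, and no two tracks of the same instrument share a drive.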

Using this approach has worked very well for me, although there are logistical issues in setting it all up (I'll address these in another posting). With only a dozen tracks per drive, each drive can comfortably provide about 7 MB/s, and keep the CPUs pretty busy. I was able to mix down a particular 5-minute session with 24-40 simultaneous tracks in under 2 minutes.

The one area where I have found RAID 0 striping to be useful in audio workstation use is scratch space. Audition allows you to define a primary and secondary temporary storage path, but it tends to use only the primary one until it fills up (which is not very sensible of it, IMHO). When applying destructive effects (e.g. normalising a mixdown of a song, correcting DC offset, etc.) it needs to read and write significant amounts of data on the scratch drive. Because it's doing both simultaneously, you get multiple regions of localisation, slightly longer queue depths, and a single drive will start to struggle. Because there was no obvious way to get Audition to use the primary and secondary scratch space in parallel, I put the primary scratch on a RAID 0 array (2 x Fujitsu MAN U160 drives on the MegaRAID Elite 1600, readahead and write-back caching enabled). This works really well: I can pretty much max out a CPU while doing non-CPU-intensive operations like a normalise. Overall disk throughput reads approx. 30 MB/s while doing this, which I'm quite happy with, considering it's reading and writing simultaneously.

Whew, long post! Hope it's helpful...




One nice thing about a RAID volume is that it is completely transparent: the operating system sees it as a single logical drive, and doesn't care that it's split across some number of physical drives underneath. If you switch to using the drives independently, you lose this transparency: you have to manually save each waveform file to a separate path to distribute the files across drives.

You can ease the pain by using techniques such as mount points and symbolic filesystem links to conceal the actual physical location of the data. Unix people will no doubt be familiar with these already, but if you're using a recent version of Windows, there are similar features available. For example, you can transparently attach a volume to a folder on an NTFS volume instead of (or in addition to) assigning it a drive letter (ugh) using the Disk Management snap-in in the Microsoft Management Console (Start:Programs:Administrative Tools:Computer Management). You can also transparently link to a folder in another location, similar to using a shortcut, by using the "linkd" (Microsoft resource kit) or "junction" (Sysinternals) programs. For example, to link to the existing path "I:\Recordings\Song1\bin1" from "C:\Recordings\Song1\bin1", you could run the following command:

junction C:\Recordings\Song1\bin1 I:\Recordings\Song1\bin1

To organise things, I created a folder for the entire recording session, with a subfolder for each song. NT filesystem junctions can only point to folders, not files, so I created a further set of "bin" subfolders within each song to hold the waveforms for the song. This directory structure will be duplicated on each of the drives (AV1 .. AV4 in my case). I also have a master folder (on another drive) which contains the main session/project file and links to the different storage bins. When you record a new track, you have to save it to the right bin folder.

Master Folder:


bin1 --> AV1:\Recordings\Session1\Song1\bin1

bin2 --> AV2:\Recordings\Session1\Song1\bin2

bin3 --> AV3:\Recordings\Session1\Song1\bin3

bin4 --> AV4:\Recordings\Session1\Song1\bin4

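Storage folders:

AV1:\Recordings\Session1\Song1\bin1

AV2:\Recordings\Session1\Song1\bin2

AV3:\Recordings\Session1\Song1\bin3

AV4:\Recordings\Session1\Song1\bin4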

That's it! It's a bit laborious to set up, but once it's done, you have a pretty much optimal physical organisation.
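To take some of the labour out of it, a short script can generate the junction commands for a song. The Python sketch below only builds and prints the command lines (it does not run them), using the same "junction <link> <target>" form as the example earlier. The drive letters I: to L: and the folder names are assumptions for illustration; substitute wherever your AV1..AV4 volumes are actually mounted.

```python
def junction_commands(master_song_dir: str, song_path: str, drives: dict) -> list:
    """Build one 'junction <link> <target>' command line per storage bin."""
    commands = []
    for bin_name, drive_root in drives.items():
        link = f"{master_song_dir}\\{bin_name}"               # link in the master folder
        target = f"{drive_root}\\{song_path}\\{bin_name}"     # real folder on the audio drive
        commands.append(f'junction "{link}" "{target}"')
    return commands


# Hypothetical drive letters standing in for the AV1..AV4 volumes.
drives = {"bin1": "I:", "bin2": "J:", "bin3": "K:", "bin4": "L:"}
cmds = junction_commands(r"C:\Recordings\Session1\Song1", r"Recordings\Session1\Song1", drives)

for cmd in cmds:
    print(cmd)
```

The printed lines can be pasted into a command prompt or saved as a batch file once the target folders exist on each drive.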

A couple of other things to note about Adobe Audition:

  1. You can specify two capture paths in the program settings, but Audition won't complain if you put both on the same physical drive. If you only have one drive available for capture, you would probably be better off using only one capture path rather than using two paths that map to the same physical drive. If you want to use two capture paths, make sure they're on separate drives.
  2. Audition mixdowns are written to the scratch drives as well. Even when playing back real-time, Audition will write the mixdown to the scratch drives. A bottleneck in writing to the scratch drives can cause stuttering playback in Audition! It's a good idea to dedicate a set of drives for scratch so that you're not trying to write to a drive that's already busy reading saved waveforms for the session.




"However, the real problem is that those 48 tracks are going to be stored on different regions of the disk, and the system must read a little bit of each track, process and mix (sum) the data, send it to the audio hardware, then read the next little bit from each track's file. That's 48 disk accesses at different locations on the disk just for one small portion of the session. If you have a reasonably fast drive, the average access time might be around 12 ms, so your drive can access at most about 80 locations per second, and that's without leaving any extra time to read more data from that location."

How long is one mixing session?

In theory, 18 MB/s from 48 streams should be doable by even a single simple IDE disk if proper smart buffering is being used.


Hi Olaf,

One thing I didn't mention in my first post: I think the actual number of disk accesses per second would depend on the buffer size used by the recording software. I've noticed that Audition can get through noticeably more data if you set it to use larger buffers (at the expense of responsiveness to stop/pause/play commands and changing panning, gain and FX controls). Interestingly, you can't control the buffer size directly: you instead specify the total combined buffer length in seconds and the number of buffers to use. Presumably the buffer size would therefore also depend on the sampling rate and bit depth of the recording.
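If that presumption is right, the size of each buffer would follow from the settings roughly as in the Python sketch below. This is a guess at how the two settings combine, not Audition's documented behaviour, and the example values (2 seconds split into 10 buffers) are hypothetical.

```python
def bytes_per_buffer(total_buffer_s: float, n_buffers: int,
                     sample_rate_hz: int, bits_per_sample: int) -> int:
    """Size of one buffer if the total buffered time is split evenly across n_buffers."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8)
    return round(total_buffer_s * bytes_per_second / n_buffers)


# Hypothetical settings: 2 seconds of total buffering, split into 10 buffers.
size_44k = bytes_per_buffer(2.0, 10, 44_100, 32)
size_96k = bytes_per_buffer(2.0, 10, 96_000, 32)

print(f"44.1 kHz / 32-bit: {size_44k:,} bytes per buffer")
print(f"96 kHz / 32-bit  : {size_96k:,} bytes per buffer")
```

Under this assumption, the same settings produce buffers more than twice as large at 96 kHz as at 44.1 kHz, so the number of disk accesses per second of audio would stay the same while the size of each access grows.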

Another thing I've noticed is that queue depths (as reported in the Windows Performance Monitor) remain short even with many tracks. I'm not sure of the exact significance of this, or whether other multitrack programs behave differently.

The recordings I've been working on recently have been recorded at 44.1 kHz 32-bit resolution, so for a 4-minute song, each track in the session would require 40 MB. So the total storage for a 50-track session would be about 2 GB (more if recording multiple alternative takes).
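The storage estimate works out as follows, assuming uncompressed mono tracks with no file header overhead (the function name is made up for illustration):

```python
def track_bytes(duration_s: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Size of one uncompressed mono track, ignoring any file header."""
    return duration_s * sample_rate_hz * (bits_per_sample // 8)


one_track = track_bytes(4 * 60, 44_100, 32)   # a 4-minute song at 44.1 kHz / 32-bit
session = 50 * one_track                      # 50 tracks, one take each

print(f"one track : {one_track / 1e6:.1f} MB ({one_track / 2**20:.1f} MiB)")
print(f"50 tracks : {session / 1e9:.2f} GB ({session / 2**30:.2f} GiB)")
```

The exact figures come out slightly above or below 40 MB and 2 GB depending on whether you count in decimal or binary units.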




Hmm, that sounds like output buffering while it should be doing input buffering (which wouldn't affect play/stop/etc).

For example, if you have a 256 MB input buffer, it can buffer about 14 seconds of an 18 MB/s session. Refilling it then means 48 track reads every 14 seconds or so, so only about 4 seeks/second would be required.
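A quick Python check of those numbers, assuming the 48-track, 96 kHz, 32-bit session from the first post, and assuming each track file is contiguous so that one refill costs one seek per track (function names are illustrative):

```python
SESSION_RATE = 48 * 384_000   # bytes per second for the whole session


def buffer_seconds(buffer_bytes: int, rate_bytes_per_s: int) -> float:
    """How many seconds of playback a buffer of this size holds."""
    return buffer_bytes / rate_bytes_per_s


def seeks_per_second(n_tracks: int, buffer_bytes: int, rate_bytes_per_s: int) -> float:
    """Seeks per second if each refill of the whole buffer needs one seek per track."""
    return n_tracks / buffer_seconds(buffer_bytes, rate_bytes_per_s)


buf = 256 * 1024 * 1024
secs = buffer_seconds(buf, SESSION_RATE)
seeks = seeks_per_second(48, buf, SESSION_RATE)

print(f"{secs:.1f} s buffered, {seeks:.1f} seeks/s")
```

Fragmented files or a smaller buffer would push the seek rate up, but it stays far below the roughly 80 accesses per second that the drive can manage.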


I might have to check out some other multitrack programs and see if they behave differently with their buffering.

The other thing I was trying to figure out was how to configure my four ST318406LW (Cheetah 36ES) drives for best performance in this kind of environment. I found this in Seagate's product manual for this drive:

4.5.3  Optimizing cache performance for desktop and server applications

Desktop and server applications require different drive caching operations for optimal performance. This means it is difficult to provide a single configuration that meets both of these needs. In a desktop environment, you want to configure the cache to respond quickly to repetitive accesses of multiple small segments of data without taking the time to “look ahead” to the next contiguous segments of data. In a server environment, you want to configure the cache to provide large volumes of sequential data in a non-repetitive manner. In this case, the ability of the cache to “look ahead” to the next contiguous segments of sequential data is a good thing.

The Performance Mode (PM) bit controls the way the drive switches the cache buffer into different modes of segmentation. In “server mode” (PM bit = 0), the drive can increase the number of cache buffer segments above the value defined in Mode Page 8, Byte 13, as needed to optimize the performance, based on the command stream from the host. In “desktop mode” (PM bit = 1), the number of segments is maintained at the value defined in Mode Page 8, Byte 13, at all times. For additional information about the PM bit, refer to the Unit Attention Parameters page (00h) of the Mode Sense command (1Ah) in the SCSI Interface Product Manual, part number 75789509.

I don't understand the comments about desktop and server patterns: I would have said that desktop environments would benefit hugely from readahead, while servers would tend to have highly random patterns that would actually be slowed down by too much readahead. What's the story?

I'm also wondering how many cache segments I should use on each drive. Originally I was thinking that one cache segment per audio stream would be sensible. According to the product manual, the LW (68-pin) and LC (SCA) models ship with different default settings, as follows:

LW (68-pin): PM = 1 (desktop mode), #cache segs = 16

LC (SCA): PM = 0 (server mode), #cache segs = 3

I would have thought that fewer, larger segments would give better desktop performance, and that it would be a good idea to set the number of segments to be a little larger than the number of concurrent streams or areas of localisation under typical use. Can someone please explain why this is not the case, and explain what the performance implications of changing the number of segments really are?




