
Even on a DMX3000 with 64GB of cache, you still can't do anything about the extra ms required to get traffic onto external media, through a switch to a controller (then to cache or drives as necessary) and back. We have several 2003 boxes which boot from SAN, and the only local drives they use are for paging.

Technically, you're right.

This is what I was talking about:

http://www.platypus.net/products/direct_attach.asp

It's not really a device designed for a SAN. That sucker has about 1 microsecond seek. :D

ddrueding, NAS is a file server appliance (NFS or CIFS/SMB), possibly with iSCSI and drivers which let remote machines mount volumes as devices. It may also have SAN features such as hot-splits, etc.

I can definitely appreciate the uses for a SAN. Server clustering or rendering farms could hardly operate without that concept. But I must still be missing something with the NAS...

From this I gathered that it uses a standard filesystem, that it allows its "shares" to be mounted as drives (unless you mean boot-time drivers, which might be cool), and that it is easily manageable.

I must still be missing something.

The filesystem is fairly straightforward and I see no reason for it not to be.

Drivers? Something like daemon tools to mount network shares? Like drive mapping? Is it bootable?

Manageability is always a plus. The ability to expand a RAID5 or RAID10 array sounds very attractive.

But these things are also available in servers at about the same cost. Sorry if I'm sounding dumb here, but I'd like to appreciate the technology.

SAN solutions 'present' disks or volumes to a host (or hosts) in a manner that makes them appear essentially as a local SCSI drive. This means you can typically do anything with them that you would a local drive or array, and they are mounted directly to the OS on the host.

NAS solutions typically are mostly just fileshare servers with enhanced flexibility/manageability. The filesystem is mounted on the NAS system, then shared through the desired network protocol. This is then accessible (at a low level) by any user/system on the network, although of course any file-sharing permissions, etc. you apply may deny you access.

High-end NAS solutions have extra software to make them behave more like a SAN. The NAS server takes a disk or volume, and rather than share it across a typical file-sharing protocol, it essentially 'serves' a virtual disk to a client. This requires a client driver specific to the NAS server software in use, and it then appears as a local disk, to be formatted and mounted as a local filesystem. I haven't seen any that support booting from them; I'd imagine iSCSI solutions could, but I haven't had anything to do with iSCSI.
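
To make the 'serving a virtual disk' idea a little more concrete, here's a rough Python sketch. It's purely illustrative: the block size, class and method names are made up, and a local file stands in for storage that would really be served over the network. The point is just that the client-side driver presents the OS with a block device whose reads and writes become offset-based requests against a backing store.

```python
# Illustrative only: a toy "virtual disk" like the one a high-end NAS might
# serve to a client driver. A local file stands in for the remote storage.
BLOCK_SIZE = 4096  # bytes per block (assumed for this sketch)

class VirtualDisk:
    """Presents a file as a block device: read/write whole blocks by number."""

    def __init__(self, backing_path: str, num_blocks: int):
        self.backing_path = backing_path
        self.num_blocks = num_blocks
        # Pre-allocate the backing file so every block exists (zero-filled).
        with open(backing_path, "wb") as f:
            f.truncate(num_blocks * BLOCK_SIZE)

    def read_block(self, block_no: int) -> bytes:
        if not 0 <= block_no < self.num_blocks:
            raise ValueError("block out of range")
        with open(self.backing_path, "rb") as f:
            f.seek(block_no * BLOCK_SIZE)
            return f.read(BLOCK_SIZE)

    def write_block(self, block_no: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        with open(self.backing_path, "r+b") as f:
            f.seek(block_no * BLOCK_SIZE)
            f.write(data)

if __name__ == "__main__":
    disk = VirtualDisk("virtual_disk.img", num_blocks=256)  # ~1 MB toy disk
    disk.write_block(0, b"hello".ljust(BLOCK_SIZE, b"\x00"))
    print(disk.read_block(0)[:5])  # b'hello'
```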

Essentially people go with SAN if they require the extra flexibility of treating the storage as local disks, or NAS if they need all their fileshares in one spot for ease of manageability and backup. The in-between solution (high-end NAS) can be used in place of a NAS if one is willing to accept the reduced network performance incurred by having all the normal network traffic on the same network as the storage traffic.

On your example of needing a SAN for a rendering farm, it's probably more likely to use a NAS in that situation, as more than likely all the nodes in the farm will need access to the same data. If you used a SAN, each node would have its own 'local disk' to write data to, and then the data would need to be copied to a share anyway.

The in-between solution (high-end NAS) can be used in place of a NAS if one is willing to accept the reduced network performance incurred by having all the normal network traffic on the same network as the storage traffic.

Some companies mix and match.

They have NAS at local locations and SANs at data centers, etc.

If you used a SAN, each node would have its own 'local disk' to write data to, and then the data would need to be copied to a share anyway.

click Got it. Thanks.

So a SAN has "partitions" that can be served to different machines? I was under the impression that a SAN could provide multiple systems access to the same volume through a SCSI or FC interface, allowing faster access to common files.

So a SAN has "partitions" that can be served to different machines? I was under the impression that a SAN could provide multiple systems access to the same volume through a SCSI or FC interface, allowing faster access to common files.

It can do both, depending on the setup.

Here's a little info:

http://www.extremetech.com/article2/0,3973,10680,00.asp

IIRC, it has a minor mistake with the connectors.

If you used a SAN, each node would have its own 'local disk' to write data to, and then the data would need to be copied to a share anyway.

click Got it. Thanks.

So a SAN has "partitions" that can be served to different machines? I was under the impression that a SAN could provide multiple systems access to the same volume through a SCSI or FC interface, allowing faster access to common files.

Speaking from the DMX perspective, each usable volume does not map one-to-one to a physical disk on the array. No disk is physically addressable by the host; host-addressable volumes are created internally to the specifications you require.

So for example, you have an EMC DMX2000 filled to the brim with 288 physical drives.

None of these drives will ever be seen by the host. Using the service processor on the array, you would create the volume to meet your needs. So, let's say you need a mirrored volume of 200 GB. Using the service processor, the software will create a single addressable volume for you using many of the 288 disks on the array. Since the volume was specified to be mirrored, the array will make sure that the data is spanned onto different physical drives on different buses which are controlled by different directors for the highest availability.

Now you have a volume with an ID number. In order to access this volume, a LUN needs to be assigned on a specific director card (a director card being an 8-port fibre board). In this example, let's say this volume was assigned to a single fibre adapter on the director card. This specific fibre port now has the ability to access this 200GB volume on the array through a specific LUN. If you want to access this volume, you could connect your host using a fibre cable directly from the HBA to the fibre port on the DMX…but this defeats the purpose of a SAN.
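
Purely as an illustration of that 'carve a mirrored volume out of many drives' step (this is not EMC's software or API; the drive sizes, slice size and bus layout are invented), a Python sketch might look like this:

```python
# Hypothetical model of carving a mirrored volume out of an array's drive pool.
# Names and sizes are invented; real arrays do all of this internally.
from dataclasses import dataclass, field
from itertools import cycle

@dataclass
class PhysicalDrive:
    drive_id: int
    bus: int              # which internal bus the drive sits on
    free_gb: int = 146    # assume 146 GB drives for this sketch

@dataclass
class Volume:
    volume_id: str
    size_gb: int
    mirrored: bool
    extents: list = field(default_factory=list)  # (drive_id, gb) slices

def create_volume(drives, volume_id, size_gb, mirrored=True, slice_gb=10):
    """Spread a volume across drives; mirrored copies land on different buses."""
    vol = Volume(volume_id, size_gb, mirrored)
    copies = 2 if mirrored else 1
    for bus_group in range(copies):
        remaining = size_gb
        # Keep the two copies on disjoint sets of buses, a stand-in for
        # "different buses behind different directors" on the real array.
        pool = cycle(d for d in drives if d.bus % copies == bus_group)
        while remaining > 0:
            d = next(pool)
            take = min(slice_gb, remaining, d.free_gb)
            if take == 0:
                continue        # this drive is full, move on to the next
            d.free_gb -= take
            vol.extents.append((d.drive_id, take))
            remaining -= take
    return vol

drives = [PhysicalDrive(i, bus=i % 4) for i in range(288)]
vol = create_volume(drives, "0A3", size_gb=200)
print(len(vol.extents), "extents across", len({d for d, _ in vol.extents}), "drives")
```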

If you want to take advantage of a SAN, you would then enable a feature on this director for LUN masking. LUN masking allows this single fibre port to manage volume access. Let's say we need more than a single 200GB volume, so we add 24 more volumes and assign a different LUN to each of them on this one fibre port. If we want more than one host to utilize these 25 volumes, we need to control access to each volume.

This is done through the use of what is known as a VCM database. This database is located on a small volume on the array (generally volume 000). This volume keeps track of the WWN assignments.

This database is used to map the World Wide Name of the host's HBA and permit access to a specific volume on a director. This allows a single fibre port to manage volume access to many different hosts. For obvious reasons you don't want too many hosts on a single fibre port or performance will suffer.
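
Conceptually the VCM database is just a lookup table. A loose sketch in Python (the port names, WWNs and LUN numbers are made up for the example, and a real array obviously stores this very differently):

```python
# Hypothetical, simplified view of LUN masking: the "VCM database" here is just
# a dict mapping (director_port, host_hba_wwn) -> set of LUNs that host may see.
vcm_db = {
    ("FA-7A:0", "10:00:00:00:c9:2b:ff:01"): {0x00, 0x01, 0x02},
    ("FA-7A:0", "10:00:00:00:c9:2b:ff:02"): {0x03},
}

def visible_luns(director_port: str, host_wwn: str) -> set:
    """Return the LUNs the masking table exposes to this host on this port."""
    return vcm_db.get((director_port, host_wwn), set())

def grant(director_port: str, host_wwn: str, lun: int) -> None:
    """Add a masking entry, i.e. let this host see one more LUN on the port."""
    vcm_db.setdefault((director_port, host_wwn), set()).add(lun)

grant("FA-7A:0", "10:00:00:00:c9:2b:ff:02", 0x04)
print(visible_luns("FA-7A:0", "10:00:00:00:c9:2b:ff:02"))  # {3, 4}
```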

In order to get 10 hosts to access this one fibre port, a switch can be used. The next part of the SAN (simplistically) would be the fibre switches. Using the zoning abilities of a switch, you could enable WWN zoning. You would then create a zone containing both the WWN of the fibre port on the array and the WWN of the fibre adapter on the host. This zone would then have a user-defined name. Next, you would add this zone to the current active zone set. After you save the changes and activate the new zone set, your host should be able to see the new array and any volumes to which it has been granted access.

The benefit here is that as long as every switch in your SAN fabric is connected in some way or another through an ISL (Inter-Switch Link), the fabric can manage your host-to-array connectivity. Want to grant another host access to this one fibre port? Plug the host into your SAN, grant access by placing the host's WWN into a zone with the array's WWN, and add that zone to your active zone set.
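
Here's a rough sketch of WWN zoning as described above (not any switch vendor's CLI or API; the zone names and WWNs are invented): a zone is just a named set of WWNs, the active zone set is a collection of zones, and two ports can talk only if some zone in the active set contains both.

```python
# Rough illustration of WWN zoning in a fabric. Zone names and WWNs are invented.
zones = {
    "host1_dmx_fa7a": {"10:00:00:00:c9:2b:ff:01",    # host1 HBA WWN
                       "50:06:04:8a:cc:c8:6a:32"},   # array fibre port WWN
    "host2_dmx_fa7a": {"10:00:00:00:c9:2b:ff:02",
                       "50:06:04:8a:cc:c8:6a:32"},
}
active_zone_set = {"host1_dmx_fa7a", "host2_dmx_fa7a"}

def can_communicate(wwn_a: str, wwn_b: str) -> bool:
    """Two WWNs can see each other only if some active zone contains both."""
    return any(
        wwn_a in zones[z] and wwn_b in zones[z]
        for z in active_zone_set
    )

# host1 can reach the array port, but host1 and host2 cannot see each other.
print(can_communicate("10:00:00:00:c9:2b:ff:01", "50:06:04:8a:cc:c8:6a:32"))  # True
print(can_communicate("10:00:00:00:c9:2b:ff:01", "10:00:00:00:c9:2b:ff:02"))  # False
```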

Now let's say you're a large company with 60 arrays and 500 hosts…hence the benefit of a SAN.

But to answer your question ddrueding, granting two hosts access to the same volume could lead to data corruption. As far as I see it, it would be no different than two hosts sharing the same physical hard drive. It would be best to implement some type of clustering software like MSCS or VCS to manage access to the volume. You could also make the volume read-only and then grant multiple hosts access to it. I've never done it, but I know it's possible to write-protect a volume on a DMX. You could also assign the LUN to 8 different fibre ports or even multiple fibre directors for performance gains. This would consume valuable fibre port resources on the DMX, but it's feasible.
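
A toy illustration of why that arbitration matters (this isn't how MSCS or VCS actually work; it just shows the 'only the current owner may write' idea):

```python
# Toy illustration of shared-volume arbitration: only the node that currently
# owns the volume (as granted by the cluster) is allowed to write to it.
class ClusteredVolume:
    def __init__(self, volume_id: str):
        self.volume_id = volume_id
        self.owner = None          # a real cluster arbitrates and fails this over

    def take_ownership(self, node: str) -> None:
        if self.owner is not None and self.owner != node:
            raise RuntimeError(f"{self.volume_id} is owned by {self.owner}")
        self.owner = node

    def write(self, node: str, data: str) -> None:
        if node != self.owner:
            raise PermissionError(f"{node} does not own {self.volume_id}")
        print(f"{node} wrote {data!r} to {self.volume_id}")

vol = ClusteredVolume("0A3")
vol.take_ownership("node1")
vol.write("node1", "payroll.db")
# vol.write("node2", "oops")     # would raise: node2 doesn't own the volume
```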

In a SAN environment, it's not uncommon to have 2 to 4 HBAs all going to the same array, but through different switches. We have a piece of software at work known as PowerPath. This software manages the redundancy of each HBA and its access to a mapped volume. This way, if an HBA, cable, switch, ISL, or director card dies, there is an alternate path that takes over almost instantly for high availability. And in many cases, you mirror the array (or parts of the array) in a different physical building in a different state...but that's a whole different post.
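
And a toy version of the multipath failover idea (again, not PowerPath's actual design, just the concept of routing I/O down whichever path is still alive):

```python
# Toy multipath failover logic: reroute I/O when one path to a volume dies.
class Path:
    def __init__(self, hba: str, switch: str, director_port: str):
        self.hba, self.switch, self.director_port = hba, switch, director_port
        self.alive = True

    def __repr__(self):
        return f"{self.hba} -> {self.switch} -> {self.director_port}"

class MultipathDevice:
    def __init__(self, volume_id: str, paths):
        self.volume_id = volume_id
        self.paths = list(paths)

    def send_io(self, request: str) -> str:
        for path in self.paths:          # try paths in order until one works
            if path.alive:
                return f"{request} for {self.volume_id} sent via {path}"
        raise IOError(f"all paths to {self.volume_id} are down")

dev = MultipathDevice("0A3", [
    Path("hba0", "switch_a", "FA-7A:0"),
    Path("hba1", "switch_b", "FA-8A:0"),
])
print(dev.send_io("READ"))
dev.paths[0].alive = False               # e.g. switch_a goes down for firmware
print(dev.send_io("READ"))               # I/O continues over the second path
```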

A SAN in and of itself doesn't have partitions; it's the array connected to the SAN that has the partitions...or virtually created volumes in the array. It all depends on the manufacturer of the array. On the DMX, they are known as splits.


Array:

Directors (consisting of Fibre, iSCSI, ESCON, FICON...etc)

Volumes assigned to the directors using LUNs

Directors connected to hosts

Directors connected to Switches

Directors connected to other Directors on other arrays (if supported)

Switches connected to fibre directors

Switches connected to other switches (ISLs) (creates a switched fabric)

Hosts connected to Directors

Hosts connected to Switches

=======

Individual switches have their own software for managing the zones and zone sets. This software can be web-driven or telnet-based. When you have multiple switches working together in a fabric environment, they need to share the same active zone set, so you have to make sure each switch in a given family supports this mode of operation. Depending on the manufacturer, you can put switches of the same family into a fabric. There are also heterogeneous fabrics that mix vendors.
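
A trivial sketch of that 'same active zone set everywhere' requirement (switch and zone names invented):

```python
# Toy check that every switch in a fabric agrees on the same active zone set.
fabric = {
    "switch_a": {"host1_dmx_fa7a", "host2_dmx_fa7a"},
    "switch_b": {"host1_dmx_fa7a", "host2_dmx_fa7a"},
    "switch_c": {"host1_dmx_fa7a"},     # hasn't picked up the latest change
}

distinct = {frozenset(zone_set) for zone_set in fabric.values()}
print("fabric consistent" if len(distinct) == 1
      else "switches do not all share the same active zone set")
```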

=======

LUN masking tools exist for arrays, but they are generally specific to the manufacturer of the array. For the DMX2000, you can use the SAN Manager software in ControlCenter to assign LUNs to ports. This tool edits the VCM database on the DMX. You can also use ControlCenter to edit your zones and zone sets in each fabric, basically giving you a tool to manage all aspects of your SAN.


Handruin, I hope you didn't ruin your hands typing all that... you're a champ.

To give you an idea of how mature SANs are, EMC *just* released a version of PowerPath for Windows which correctly reconnects a path after it is restored following a failure.

Say, for example, just hypothetically, that the SAN team wants to do maintenance on the switches which connect all hosts. Since multiple paths are mandatory in the environment, they don't notify the server teams about the maintenance--they just ensure that only one switch (hence one path, max) will be down at a time, for a firmware update, let's say. So they perform their maintenance on all the switches, and, lo and behold, all SAN volumes on all systems are inaccessible w/o a reboot. Whoops! Not that that happened or anything. Oh, wait, it happened a few months ago. It's even better that most of the systems are clustered for availability, but since neither node could see the resources, it caused many, many failures. I'm sure we weren't the only customer this has happened to, although I think the SAN team is largely at fault for not testing their maintenance plans.

Don't even ask about Veritas' multipath.

Hand, aren't you going to explain stitching together hypervolumes for proprietary RAID fun?

If you used a SAN, each node would have its own 'local disk' to write data to, and then the data would need to be copied to a share anyway.

click Got it. Thanks.

So a SAN has "partitions" that can be served to different machines? I was under the impression that a SAN could provide multiple systems access to the same volume through a SCSI or FC interface, allowing faster access to common files.

While Handruin's response to this is great, I have a feeling most of it went over many people's heads... :)

The SAN disk storage system (which is more specifically what we are talking about here, rather than the rest of the SAN environment) will have a RAID controller. Or in reality, two controllers. All the disks connect to the controllers in the storage system, and the controller creates typical arrays out of them. Some use the old-fashioned 'put these x disks into a RAID 5 set, and these x disks into a mirror, etc.' approach to create what I'll term a volume, while others essentially pool all or most of the disks into an array and divide that up into multiple volumes. These volumes, commonly referred to as LUNs (due to the use of SCSI Bus:Target:LUN addressing in SANs), can then be selectively presented to any host (server) on the SAN. This then appears on the server as a local disk, to be initialised, formatted, etc.
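
A hedged illustration of the 'pool the disks, then slice the pool into LUNs and selectively present them' approach (the disk counts, overhead figure and host names are made up):

```python
# Invented example of pooling disks and carving LUNs out of the pool, as opposed
# to dedicating whole disks to each RAID set. Capacities are made up.
class StoragePool:
    def __init__(self, disk_count: int, disk_gb: int, raid_overhead: float = 0.25):
        # some fraction of raw capacity is lost to RAID protection (assumed 25%)
        self.free_gb = int(disk_count * disk_gb * (1 - raid_overhead))
        self.luns = {}            # lun_id -> size_gb
        self.presented = {}       # lun_id -> set of host names

    def create_lun(self, lun_id: int, size_gb: int) -> None:
        if size_gb > self.free_gb:
            raise ValueError("not enough free capacity in the pool")
        self.free_gb -= size_gb
        self.luns[lun_id] = size_gb
        self.presented[lun_id] = set()

    def present(self, lun_id: int, host: str) -> None:
        """Selectively present a LUN; only listed hosts see it as a local disk."""
        self.presented[lun_id].add(host)

pool = StoragePool(disk_count=14, disk_gb=146)
pool.create_lun(0, 200)
pool.present(0, "fileserver1")
pool.present(0, "fileserver2")   # e.g. both cluster nodes see LUN 0
print(pool.free_gb, pool.presented[0])
```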

Or you can present it to multiple hosts. As was previously mentioned, this is the same as having a single disk connected to the SCSI controllers of two systems at the same time. They'll both try to mount the filesystem independently, writing data that doesn't make sense to each other and corrupting it. Clustering enables you to control which server has control of (and access to) the volume. So, you can have the same volume presented to two servers, and when one server is shut down for maintenance, the other takes control of the volume and the data on it is still available, to be shared as a network file share or whatever it's used for.

Next, you can have each LUN presented to multiple HBAs in the same server, and/or the LUN presented by two controllers in the storage system. This means the server will see multiple instances of the same LUN, but it doesn't know that they're actually the same data and will treat each one independently, again creating a scenario where corruption is likely. Some OSes have built-in 'multi-path' functionality, while others have applications that can be installed to provide it. This multi-path functionality enables the server to recognise that the multiple copies of the same LUN it's seeing are in fact the same LUN and treat them as such. It also provides redundancy, so that when a path becomes unavailable (such as from a failed HBA or SAN switch), it simply uses an alternative path to access the LUN. The volume at the OS level never needs to be dismounted and nobody notices any problem (except the administrator, who hopefully has some form of notification set up!).
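
A small sketch of the 'recognise these are really the same LUN' part: real multipath software keys off a unique identifier reported by the LUN, but the serial numbers and device names below are invented.

```python
# Illustration only: group the multiple "disks" the OS sees by a unique LUN
# identifier, so each physical LUN is treated as one device with several paths.
from collections import defaultdict

# Each tuple is (os_device_name, lun_serial) as the OS might discover them.
discovered = [
    ("disk2", "SYMM-0001-0A3"),   # same LUN seen via HBA0 / controller A
    ("disk3", "SYMM-0001-0A3"),   # same LUN seen via HBA1 / controller B
    ("disk4", "SYMM-0001-0B1"),   # a different LUN
]

multipath_groups = defaultdict(list)
for device, serial in discovered:
    multipath_groups[serial].append(device)

for serial, devices in multipath_groups.items():
    print(f"LUN {serial}: one logical device, {len(devices)} path(s): {devices}")
```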

NAS solutions typically can provide only the functionality in the first paragraph above, if even that -- low-end solutions will only do normal network file sharing. The rest is the real reason people go for SANs. The potential performance impact of using a SAN-like NAS is probably given less consideration.


As much fun as this stuff is to read (really it is), doesn't NAS vs SAN boil down to presentation (purpose)?

NAS presented as a shared file system via a "thin server".

SAN presented as a disk-equivalent volume.

Again, iSCSI will do both, although I believe it is targeted at the SAN market.

Basically, replace Fibre Channel with iSCSI in the SAN model and you have a less expensive infrastructure, or a faster one thanks to 10Gbit Ethernet.

A lot of what is identified with SANs would be handled in the software (or handled in firmware with a client/thin server model).

Intriguing to say the least.

Sorry to hear about the communication problems (human & electronic) during the SAN maintenance,

Dogeared

8^)


Holy cow...my power just came back on and I'm a bit overwhelmed.

Many thanks to Handruin, Lidlesseye and Chew for their great posts in this thread to help LOM (little old me) and hopefully a few others understand this really cool technology.

More killer info that should be preserved....oh wait....it is :D

Basically, replace Fibre Channel with iSCSI in the SAN model and you have a less expensive infrastructure, or a faster one thanks to 10Gbit Ethernet.

A lot of what is identified with SANs would be handled in the software (or handled in firmware with a client/thin server model).

Intriguing to say the least.

SAN gear running over fibre channel will also support 10Gbit, so iSCSI doesn't really have an advantage there. Many (perhaps even most?) new switches being released on the market support both iSCSI and fibre channel. iSCSI, to my understanding, is just a technology enabling SCSI to run over ethernet, effectively doing away with the need for fibre channel cards.

It breaks down like this. You have the media level, which can be either copper or fiber*. Above that, you have what I'll term the transfer level, which can be either ethernet or fibre channel. On top of that is what I'll term the application level, which is networking or SCSI. You can get technology in almost any combination of the above. So you can get fiber ethernet network cards. Or copper fibre channel adapters for your SAN (using SCSI). iSCSI I believe is running SCSI over ethernet, probably typically copper based.

* For those that aren't aware, there's a distinction between the use of the two terms fiber and fibre. Fiber is used when referencing fiber-optic technology - the use of light to transmit data. Fibre is used when referencing fibre channel technology, whether over fiber or copper.
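
Just to restate that layering as data (nothing vendor-specific, only the combinations described above):

```python
# Summary of the layering described above: either physical medium can carry
# either transport, and each transport carries its usual upper-level protocols.
physical = ["copper", "fiber (optical)"]
transports = {
    "ethernet": ["networking (IP)", "SCSI (iSCSI)"],
    "fibre channel": ["SCSI (FCP)", "networking (IP over FC, uncommon)"],
}

for medium in physical:
    for transport, protocols in transports.items():
        for protocol in protocols:
            print(f"{medium:16} -> {transport:13} -> {protocol}")
```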


Physical level -- electrical or optical.

Transport level - fibre channel or ethernet (or IB, Myrinet, Quadrics...)

Protocol level - IP, iSCSI, FC, etc.

That's it!
