Tech Notes: Storage Hardware and Terminology

iSCSI. Fibre Channel switch. TOE NIC. When do I need an "HBA" for storage, and when can I just use a standard Ethernet adapter? Storage terms and concepts can further muddy waters that are already murky for many IT pros. After all, until fairly recently, a storage area network (SAN) was the purview of only large organizations with large IT budgets and specialized staff. Now, however, as more storage options become available to small- and medium-size businesses, these IT departments are starting to discuss the topic too. In this article, I'll go over some of the terms and equipment related to different types of storage solutions to help you better understand these increasingly cost-effective ways to handle your storage needs.

Network Attached Storage

NAS systems are ones that connect directly to your network, but that do not generally provide block level communication with the host, making them unsuitable for most database and Exchange applications. A NAS system is really just a mondo file server running its own operating system and providing direct access to users. That is, users can directly access the files on the NAS device just like they access files on a file server. Each whole file is transferred between the NAS device and the requesting client.

NAS terminology

NAS head: This is the part of the NAS to which clients connect. Behind the NAS head may lie hundreds or thousands of gigabytes of available storage, but clients need to access this space via the NAS head.
NFS: NFS (Network File System) is one of the communications protocols usually supported by NAS heads for communication with network clients, particularly those of the UNIX or Linux flavor, although NFS clients are available for just about any operating system these days.
CIFS (the protocol formerly known as SMB [Server Message Block]): CIFS (Common Internet File System), the protocol primarily responsible for file sharing communication with Windows (and Linux-based Samba) servers, is another protocol supported by most NAS heads. CIFS/SMB is used for communication with the NAS head by most Windows clients. Both NFS and CIFS use TCP/IP for their underlying communication.

Hardware and software needed to support NAS systems

One beauty of NAS systems is simplicity. If you have an existing Ethernet (Fast or Gigabit are the best choices here) network—and who doesn’t these days?—you can almost literally just pop a NAS head and storage on your network and be on your way. In short, the only equipment you need to support a NAS system in your environment is an Ethernet connection to the NAS head. For additional reliability, you might want to configure your NAS hardware with multiple connections, but at the end of the day, just typical Ethernet switch ports are all you need for a NAS system to work.

On the software side, you might need an NFS client on your Windows computers, or an SMB client (such as Samba) on your Linux computers to access the NAS system. However, this is only true if you're trying to access a NAS device that does not include support for your client operating system.

iSCSI-based Storage Area Networks

Compared to NAS and Fibre Channel-based SANs, iSCSI is the relative new kid on the block in the storage world, but due in no small part to its very low cost, it has started to give Fibre Channel a serious run for its money. iSCSI storage networks are a complete technology, ranging from iSCSI drivers on your servers to storage hardware based on iSCSI standards. Unlike NAS systems, iSCSI SANs are perfect for database and Exchange applications because iSCSI transmits block level data rather than complete files.

"Block level?" you might ask, and it's a really good question, particularly since storage vendors throw the term around pretty loosely. Block level communication means that data is transferred between the host and the storage device in fixed-size chunks called blocks, so the host can read or rewrite one small piece of a large data store without moving the whole thing. Databases and Exchange servers depend on this type of communication (as opposed to the file level communication used by most NAS systems) in order to work properly. That said, some NAS vendors' devices are certified for use by databases and Exchange, but I don't generally recommend this configuration unless it's a last resort.
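To make the file level versus block level distinction a little more concrete, here is a minimal Python sketch. It is only an illustration: an ordinary temporary file stands in for a disk, the 512-byte block size is an assumption, and the function names are mine rather than part of any storage product or protocol.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size in bytes; real devices may use other sizes


def read_whole_file(path):
    """File-level access: the entire file is transferred to the caller."""
    with open(path, "rb") as f:
        return f.read()


def read_block(path, block_number, block_size=BLOCK_SIZE):
    """Block-level access: only the requested fixed-size block is transferred."""
    with open(path, "rb") as f:
        f.seek(block_number * block_size)
        return f.read(block_size)


def write_block(path, block_number, data, block_size=BLOCK_SIZE):
    """Overwrite one block in place without touching the rest of the store."""
    if len(data) != block_size:
        raise ValueError("data must be exactly one block long")
    with open(path, "r+b") as f:
        f.seek(block_number * block_size)
        f.write(data)


def demo():
    # An ordinary temp file stands in for a disk in this sketch.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(BLOCK_SIZE * 8))  # eight zero-filled blocks
        path = tmp.name
    try:
        write_block(path, 3, b"\xab" * BLOCK_SIZE)
        whole = read_whole_file(path)   # moves all eight blocks
        one = read_block(path, 3)       # moves just one block
        return len(whole), len(one), one[:1]
    finally:
        os.remove(path)
```

A database that needs to change one record behaves like write_block: it touches a small region and leaves everything else alone, which is why moving whole files back and forth is such a poor fit for it.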

iSCSI terminology

Because iSCSI is fairly new, its introduction to the marketplace has been accompanied by some new terminology.

iSCSI: It might be nice to talk a little about the term iSCSI itself. You probably know about SCSI (Small Computer System Interface) and probably even use it on most of your servers. SCSI has long been known for its reliability and speed. iSCSI is simply the commands used by SCSI systems encapsulated inside TCP/IP, hence the 'i' (for Internet) in iSCSI. Put another way, as some iSCSI documentation does, iSCSI is a way for a storage initiator such as a server to send commands to a storage target such as an array of disks. iSCSI "targets" don't even have to use SCSI disks. In fact, many of them use newer serial ATA (SATA) disks and translate the SCSI commands for use on these less expensive devices.
iSCSI driver: An iSCSI driver attaches to a standard Ethernet adapter (usually of the gigabit variety) and facilitates communication with an iSCSI storage array. What does this mean for you? Most importantly, it means that you can start using the features of a storage area network without having to buy expensive, specialized adapters for your servers. As long as your server has a gigabit Ethernet adapter, you can use iSCSI. If you do decide to use just an iSCSI driver (a lot of people who use iSCSI do), I recommend installing a second gigabit Ethernet adapter in your server and creating a separate network for storage communication. With today's overpowered dual- and quad-processor servers, this type of communication is almost always sufficient and you don't need to worry about TOE NICs (below).
TOE (TCP Offload Engine) NIC: For servers that are under a very heavy load, the additional work required to encapsulate commands destined for the iSCSI target can be a killer. By some estimates, depending on what you're doing, you might eat up to 30% of your CPU with iSCSI overhead, although this is not very common. For instances in which this level of overhead is unacceptable, you can offload the work to a specialized NIC called a TOE NIC. As the name implies, a TOE NIC handles the encapsulation, thus freeing up the CPU for other tasks. I recommend serious testing before you invest in TOE NICs. Measure your server's CPU utilization to see how much processing is dedicated to the encapsulation task for iSCSI.
iSCSI initiator software: This is software that either comes with the host operating system and binds to a standard Ethernet NIC, or that resides on an iSCSI TOE adapter. The iSCSI initiator software is responsible for processing iSCSI commands and for managing the TCP/IP communications with an iSCSI storage array. Most modern operating systems include iSCSI initiator software at no additional charge. In a software-only scenario using the OS's iSCSI drivers and a standard Ethernet NIC, the host processor is responsible for translating iSCSI commands. These initiators work with just about any gigabit server NIC and are generally more than adequate with respect to performance. For older, slower servers, you might want to consider a hardware-based initiator such as a TOE NIC.
iSCSI target: This can be any device with which your host communicates using iSCSI, including an iSCSI disk array or iSCSI-aware tape unit.
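
The idea of "SCSI commands encapsulated inside TCP/IP" can be sketched in a few lines of Python. The READ(10) command layout below (opcode 0x28, a 32-bit block address, a 16-bit block count) is standard SCSI, but the envelope around it is a toy layout I made up for illustration. It is not the real iSCSI packet format, which is defined by the iSCSI specification and carries much more information.

```python
import struct

SCSI_READ_10 = 0x28  # standard SCSI opcode for READ(10)


def build_read10_cdb(lba, num_blocks):
    """Build a 10-byte SCSI READ(10) command descriptor block (CDB)."""
    # Fields: opcode, flags, 32-bit logical block address,
    # group number, 16-bit transfer length in blocks, control byte.
    return struct.pack(">BBIBHB", SCSI_READ_10, 0, lba, 0, num_blocks, 0)


def wrap_for_tcp(lun, cdb):
    """Wrap a CDB in a TOY envelope: 2-byte LUN, 2-byte CDB length, then the CDB.

    Hypothetical layout for illustration only; not the iSCSI wire format.
    """
    return struct.pack(">HH", lun, len(cdb)) + cdb


def unwrap_from_tcp(message):
    """What a target would do on receipt: recover the LUN and the original CDB."""
    lun, length = struct.unpack(">HH", message[:4])
    return lun, message[4:4 + length]
```

An initiator would send the wrapped bytes over an ordinary TCP connection, and the target would unwrap them and hand the untouched SCSI command to its disks. That wrapping and unwrapping is the work a software initiator does on the host CPU and that a TOE NIC takes over.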

Hardware and software needed to support iSCSI systems

I've already gone over most of the items you need to support an iSCSI infrastructure. The really great part about iSCSI is that implementation is fairly inexpensive since you probably have everything you need, except the disks. On the server side, new versions of Windows, Linux, UNIX, and NetWare all include iSCSI initiators, and with today's really fast servers, iSCSI overhead using a standard NIC is negligible on all but the most loaded servers. Beyond the host, to interconnect your servers and storage devices, all you need is a standard gigabit Ethernet switch on a network separate from your client communications. I recommend a separate network for two reasons: (1) when you separate client traffic and storage traffic, overall storage network performance stays high; and (2) since you probably don't want your clients directly accessing storage except via the server, you can help to secure your storage network by keeping it separate from your primary network. Last, but certainly not least, you need iSCSI targets—namely a storage array—with which to work.
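
If you want a quick sanity check that your planned storage network really is separate from your client network, something like the following Python sketch can help. The subnets and addresses in the usage note are made-up examples, and the function names are mine. Keep in mind that this only checks IP addressing; it says nothing about whether the two networks share physical switches or VLANs.

```python
import ipaddress


def networks_overlap(client_cidr, storage_cidr):
    """Return True if the client and storage subnets share any addresses."""
    client = ipaddress.ip_network(client_cidr)
    storage = ipaddress.ip_network(storage_cidr)
    return client.overlaps(storage)


def misplaced_hosts(storage_cidr, addresses):
    """Return the storage-side addresses that fall outside the storage subnet."""
    storage = ipaddress.ip_network(storage_cidr)
    return [a for a in addresses if ipaddress.ip_address(a) not in storage]
```

For example, with clients on 192.168.10.0/24 and storage on 10.10.50.0/24, networks_overlap returns False, and misplaced_hosts("10.10.50.0/24", ["10.10.50.11", "192.168.10.20"]) flags the second address as an initiator or target that was configured on the wrong network.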

Fibre Channel-based Storage Area Networks

The granddaddy of storage networks, Fibre Channel-based storage remains the strongest player in the networked storage market, although iSCSI has quickly become a formidable competitor. Like iSCSI, FC SANs transfer data at the block level, making them more than suitable for database applications and Exchange rollouts. Also like iSCSI, FC uses its own terminology and introduces some new technology to the IT infrastructure.

FC SANs have long enjoyed their position as the respected top tier in storage architecture, mainly due to their reliability, performance, and ability to protect their data.

Although iSCSI has become a fierce competitor in some markets, the future for FC still looks good, with plans to increase speeds from the current standard of 2Gbps to 4Gbps. The jury is still out, however, on how much this will really improve overall performance, given other limitations in storage.
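
A bit of back-of-the-envelope arithmetic shows why a faster link does not automatically mean faster storage. In this Python sketch, the throughput figures are assumptions chosen as round numbers (roughly 200 MB/s usable on a 2Gbps link and 400 MB/s on a 4Gbps link), and the array speeds are hypothetical. The point is only that a transfer runs at the speed of the slowest component.

```python
def transfer_seconds(size_mb, link_mb_per_s, array_mb_per_s):
    """Estimate transfer time when the slower of link and array sets the pace."""
    effective = min(link_mb_per_s, array_mb_per_s)
    return size_mb / effective


# Assumed round-number throughputs, for illustration only.
LINK_2G = 200.0  # MB/s, assumed usable rate of a 2Gbps FC link
LINK_4G = 400.0  # MB/s, assumed usable rate of a 4Gbps FC link

# Hypothetical array that sustains 150 MB/s: the link upgrade changes nothing.
slow_array_2g = transfer_seconds(10000, LINK_2G, 150.0)
slow_array_4g = transfer_seconds(10000, LINK_4G, 150.0)

# Hypothetical array that sustains 500 MB/s: the link upgrade halves the time.
fast_array_2g = transfer_seconds(10000, LINK_2G, 500.0)
fast_array_4g = transfer_seconds(10000, LINK_4G, 500.0)
```

With the slower array, both results are identical because the disks are the bottleneck. With the faster array, the 4Gbps link finishes in half the time of the 2Gbps link. Which case applies to you depends on your own hardware and workload.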

FC terminology and required hardware

FC-based SANs brought a slew of terminology to the forefront of the minds of storage experts. I'm not going to go over every term here, but will provide you with the ones that are important to know when you're comparing storage solutions. I'm combining the sections on terminology and hardware for this topic since they are both pretty much the same.

HBA (Host Bus Adapter): An FC HBA is server hardware that enables communication with FC storage hardware. In most cases, the term HBA is coupled with an FC solution, but some vendors also refer to iSCSI TOE NICs as HBAs, so just watch what you're reading. An FC HBA is generally an add-on card that utilizes a PCI slot.
FC switch: FC SANs use specialized equipment, including the switches that interconnect the hosts and the storage. If you're new to the storage game, prepare for a jaw drop when you see the price tags associated with FC switches! They're expensive and usually use GBICs (gigabit interface converters) so you can pick the kind of connectors you want to use. You can connect FC switches together to expand your overall storage fabric to support thousands of nodes.
Node: A node is any device connected to a fibre channel switch, be it a server, a storage system, or a tape drive.

Adaptec and other suppliers have offloaded the TCP/IP processing by adding dedicated TOEs (TCP/IP Offload Engines) to the Ethernet cards they sell for storage networks. These cards cost more than ordinary Ethernet NICs, however, which reduces the price advantage of iSCSI SANs over Fibre Channel SANs.
Intel sees an opportunity here and aims to add data copying to its server chipsets and network controllers via I/OAT (I/O Acceleration Technology). The copying will be done in parallel with the TCP/IP processing taking place in the server CPU, thus shortening the overall data transmission time. In effect, iSCSI command processing and data movement happen simultaneously, with the Intel network controller's memory being accessed directly.

Friday, March 11, 2005
