2005-08-24

Serial Storage Is the Future ...

The storage world is going serial. One thing I regularly have trouble getting across to some "die-hards" is that Serial Attached SCSI (SAS) IS SCSI-2! The only difference is the interface; the drive, protocol, etc. are the same!

- ASICs are the name of the game

SCSI is typically implemented with a host adapter. This host adapter handles queuing, targeting, transfer setup and other details of the SCSI bus it controls. In legacy SCSI, this is one or more parallel SCSI devices that share the same bus, possibly with two or more busses (never more than 3 or 4) per host adapter. In a typical, high throughput SCSI implementation, you limit the number of drives to 3 or 4 per bus so the bus isn't saturated with contending transfers (even though wide SCSI allows up to 15 devices, plus the host adapter).

Serial changes the game. Instead of a single bus shared by all devices, serial busses are typically implemented with only 2 points per disk -- one for the host adapter, one for the end device. This is because serial only requires a few lines -- typically only 2 or 4 lines, plus ground. So a single, typical Application Specific Integrated Circuit (ASIC) can easily handle 4, 8 or even 16 devices, where a traditional parallel interface needs 25, 40, 50 or 68 conductors. This concept of an ASIC "switch" for I/O is not unheard of in the networking space, or even in some storage devices (e.g., 3Ware's Escalade ASIC+SRAM "Storage Switch" design for its ATA RAID controllers).

So now instead of controlling a parallel bus that everyone contends for, I can directly drive devices independently from my host adapter.

- Keeping the advantages of SCSI

Serial Attached SCSI (SAS) is the SCSI-2 protocol. SCSI devices can be made SAS without much difficulty. There is still sector remapping**, host adapter queuing, etc. These are not commodity options in other interfaces, even if some lower-cost SCSI drives use the same commodity disks (or some other interfaces use enterprise disks -- e.g., Western Digital's Raptor series are SATA versions of Hitachi's 36, 73 and 146GB SCSI/FibreChannel products).

[ **NOTE: Many intelligent ATA RAID controllers, like 3Ware, reserve parts of the ATA disk for sector remapping when they use a block volume -- i.e. RAID-0, RAID-10, RAID-5. RAID-1 mirroring on 3Ware is the only time it uses the "raw" (non-volume) disk. ]

One thing I regularly point out as an advantage of SCSI, and now SAS, over Serial ATA (SATA) is that command queuing is done at the host adapter level. In ATA with Native Command Queuing (NCQ), the queuing is still done per drive, at the Integrated Drive Electronics (IDE) level. That means a SCSI host adapter can queue up operations for any drives it controls, whereas NCQ just means the host OS can queue up operations for each individual drive separately. And even then NCQ is still not well arbitrated by ATA controllers between the host OS and the end IDE.

Some might point to the Advanced Host Controller Interface (AHCI) of ATA. Understand AHCI is a _software_ organization standard so you can target up to 32 ATA devices as a single, functional unit. It is not hardware, but done on the host system -- i.e., software. Queuing, c/o NCQ, is still done on a per end-IDE device basis, because ATA is dumb. It's just a bus arbitrator between the host system and end IDE device -- a few switches, a few timing registers, etc...

- And then adding some

SAS goes beyond what SCSI can do. Instead of being limited to 12m of an entire Low Voltage Differential (LVD) SCSI bus for all devices, with minimum spacing between devices, I can now have up to 8m per SAS twisted pair cable.

Furthermore, because it does connections on a point-to-point basis, I can plug in different devices that may not operate the same as all the other disks on the host adapter. So it is also backward compatible with 1.5GHz/150MBps SATA-I/II** as well as 3GHz/300MBps SATA-IO**. Of course, there are length limitations when using SATA (1m typical spec), but it's still an option. So nearly all SAS host adapters can do SATA for free.

[ **NOTE: SATA-II is now a _marketing_ term much like USB 2.0 is. You have to have a SATA-IO drive for 300MBps, just like you have to have an EHCI controller for 480Mbps/60MBps with USB 2.0. I haven't checked to see if SATA-IO requires a twisted pair cable, but the original SATA committee expected 3 and 6GHz signaling to require it. This might be why they have created the SATA-IO spec, whereas vendors are claiming and shipping SATA-II with only a 1.5GHz capable cable/EMF and no considerations in the logic for using a twisted pair cable. ]

Third, RAID-0 (striping), 1 (mirroring) and a simultaneous combination (sometimes called RAID-10 or RAID-1e[nhanced]) cost only a little extra overhead in logic. The host adapter is already a "storage switch" of 4, 8 or even 16 channels, so it can mirror, stripe or otherwise distribute data easily between channels. Most first generation SAS devices offer these integrated, transparent "hardware RAID" functions, possibly with software RAID-0 or even RAID-4/5/6 across multiple cards.
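
To make "a little extra overhead in logic" concrete, here is a minimal Python sketch of the address arithmetic behind striping and mirroring. The function names, the stripe size parameter and the layout (RAID-10 as a stripe across mirrored pairs) are my own assumptions for illustration, not how any particular SAS host adapter implements it.

```python
def raid0_map(lba, n_drives, stripe_blocks=1):
    """Map a logical block to (drive, block_on_drive) under striping."""
    stripe, offset = divmod(lba, stripe_blocks)
    drive = stripe % n_drives                      # stripes go round-robin
    block = (stripe // n_drives) * stripe_blocks + offset
    return drive, block

def raid1_map(lba, n_drives=2):
    """Mirroring: every drive holds the same block at the same address."""
    return [(drive, lba) for drive in range(n_drives)]

def raid10_map(lba, n_pairs, stripe_blocks=1):
    """Assumed layout: stripe across mirrored pairs (drives 2p and 2p+1)."""
    pair, block = raid0_map(lba, n_pairs, stripe_blocks)
    return [(2 * pair, block), (2 * pair + 1, block)]

print(raid0_map(5, 4))    # (1, 1)
print(raid1_map(5))       # [(0, 5), (1, 5)]
print(raid10_map(5, 2))   # [(2, 2), (3, 2)]
```

Each mapping is a handful of integer operations per request, which is why a "storage switch" that already routes per channel can offer these levels cheaply.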

Lastly, trunking SAS channels is an option. This is the new "killer app" of SAS -- trunking 4 or even 8 lines into 1.2 or 2.4GBps to the next hub. Although the distance is not nearly as far as FibreChannel or iSCSI (SCSI over IP), it is far cheaper than FibreChannel and far less overhead than iSCSI, while being as fast as FibreChannel or even faster than most iSCSI. In a nutshell, SAS is a great solution for multi-targetable storage in the same data closet / server room, without having to shell out for FibreChannel or deal with the inefficiency/overhead of iSCSI.
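
The trunk figures above are just lanes multiplied by the per-lane rate. A quick Python check, assuming 300MBps per lane (the 3GHz figure quoted earlier in this post) and ignoring any protocol overhead; the function name is hypothetical:

```python
LANE_MBPS = 300  # assumed per-lane rate in MBps (3GHz signaling)

def trunk_bandwidth_gbps(lanes, lane_mbps=LANE_MBPS):
    """Aggregate bandwidth in GBps of `lanes` trunked lanes."""
    return lanes * lane_mbps / 1000

for lanes in (1, 4, 8):
    print(f"{lanes} lane(s): {trunk_bandwidth_gbps(lanes):.1f} GBps")
# 1 lane(s): 0.3 GBps
# 4 lane(s): 1.2 GBps
# 8 lane(s): 2.4 GBps
```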

- Which is faster? SATA (ATA) or SAS (SCSI)?

Well, in a nutshell, the protocol is not the root issue. Interface speed, which has *0* to do with data transfer rate (DTR) of the disk itself, is the main consideration. Most commodity capacities (160, 200, 250, 300, 320, 400, 500GB) can't break 80MBps yet, and most enterprise capacities (36, 73, 146GB) are about 50MBps -- individually. Most vendor specification sheets list the maximum internal DTR in Mbps (divide by 8 for MBps).
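
Since the two kinds of numbers get mixed up a lot, here is a small Python sketch of both conversions. The spec-sheet value of 640Mbps is a made-up example, not from any real data sheet, and the function names are my own. The link conversion assumes 10 line bits per byte (8b/10b encoding), which is consistent with the 1.5GHz/150MBps and 3GHz/300MBps figures in this post.

```python
def spec_mbps_to_MBps(megabits_per_second):
    """Spec-sheet media rate in megabits/s to megabytes/s."""
    return megabits_per_second / 8

def link_gbps_to_MBps(line_rate_gbps, line_bits_per_byte=10):
    """Serial line rate to payload MBps, assuming 8b/10b encoding."""
    return line_rate_gbps * 1000 / line_bits_per_byte

HYPOTHETICAL_SPEC_MBPS = 640  # made-up internal DTR for illustration

print(spec_mbps_to_MBps(HYPOTHETICAL_SPEC_MBPS))  # 80.0
print(link_gbps_to_MBps(1.5))                     # 150.0
print(link_gbps_to_MBps(3.0))                     # 300.0
```

The first number is what the platters can deliver; the second is only the ceiling of the pipe.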

Yes, this means that 10,000rpm and even 15,000rpm "enterprise" spindle disks typically have _lower_ DTRs than more "commodity" 7,200rpm disks because they have greatly reduced capacities. Their faster spindles cannot overcome the fact that more data is swept out per revolution by the higher density commodity disks. Now this is, of course, assuming a continuous, linear transfer. The more seeks, the more quickly a higher spindle speed can and does make a difference (even on single user workstations).

DTR _does_ become a consideration, though, when your bus is _parallel_ with _multiple_ devices. I.e., a Low Voltage Differential (LVD) [parallel] SCSI bus like Ultra2/80 (80MBps), Ultra3/160 (160MBps) and Ultra4/320 (320MBps) has to share all that DTR across all the devices on a channel. Hence SAS is looking really good these days, as even Ultra5/640 (640MBps) doesn't solve the root problem! Serial is the future.
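
A rough Python sketch of the shared-bus arithmetic, using the 50MBps and 80MBps per-drive figures quoted earlier in this post. It assumes every drive streams sequentially at full DTR at the same time and ignores protocol overhead, so treat it as a back-of-the-envelope model; the function names are hypothetical.

```python
import math

BUSES_MBPS = {"Ultra2/80": 80, "Ultra3/160": 160, "Ultra4/320": 320}

def drives_to_saturate(bus_mbps, drive_mbps):
    """Fewest drives whose combined DTR meets or exceeds the bus rate."""
    return math.ceil(bus_mbps / drive_mbps)

def per_drive_share(bus_mbps, drive_mbps, n_drives):
    """MBps each drive actually gets when n_drives stream at once."""
    return min(drive_mbps, bus_mbps / n_drives)

for name, bus in BUSES_MBPS.items():
    print(name,
          "| 50MBps drives to saturate:", drives_to_saturate(bus, 50),
          "| 80MBps drives to saturate:", drives_to_saturate(bus, 80))

# A full wide bus of 15 drives on Ultra4/320:
print(round(per_drive_share(320, 50, 15), 1))  # 21.3
```

With point-to-point serial links, each drive has its own link, so the per-drive share does not shrink as you add drives.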

But looking back at interface considerations ...

ATA (including SATA) is a _dumb_ bus arbitrator between PCI[-X|e] and the Integrated Drive Electronics (IDE). ATA is dead _dumb_ and other than some registers for bus timing/configuration, it's the system memory/CPU talking to the drive. ATA provides dead _dumb_ block I/O without any blocking. That's great for 1 drive at 1 operation, such as typical desktop usage -- especially in the latest densities where DTR is absolute, and seek is of minimal consideration.

SCSI has its own _hardware_ host adapter with intelligent management and queuing, plus a full command set. SCSI host adapters are already half-way to a full, intelligent hardware RAID design. The second you start queuing a lot of operations, SCSI wins. ATA can't service requests on its own; it relies on the system CPU/OS. That holds especially with higher spindle rates, which are scarce in the ATA world (with only a few exceptions, like the WD Raptor SATA version of the Hitachi 10k SCSI/FC/SAS series, and no 15k).

Note that the _dumb_ nature of ATA-IDE is why ATAPI (ATA Peripheral Interface) was required for non-simple block transfers like most optical drives require. But even then, ATAPI is done in software, between the system memory/CPU and the end-drive. It's still not intelligent, it just adds some commands for the end-device at the host system/OS level.

Again, ATA with NCQ may now add queuing, but it only does so for _individual_ drives. That means it's great for a desktop or even a workstation with 1 drive, but once you start adding drives, NCQ loses its benefits. SCSI host adapters queue for _all_ drives, not just 1, and can better balance I/O requests, especially in a RAID configuration (although an intelligent ATA RAID card can do the same -- see RAID levels below).

Straight Just a Bunch of Disks (JBoD) really depends on the application, and ATA is typically all you need today. Things change once you start talking about an intelligent ATA RAID controller. Now you have ATA with intelligence, queuing, SRAM (non-blocking) or DRAM (buffering).

- What's good for what RAID levels?

For RAID-0, 1 and 10 (simultaneous RAID-0 and 1 in hardware), ATA with a non-blocking ASIC and SRAM is ideal. That's 3Ware's legacy Escalade design (pre-9000 series), using the direct I/O of ATA. You have non-blocking end-to-end -- from the ASIC+SRAM to the storage interface, especially with today's commodity disk densities.

As I noted above, some of the new generation of SAS host adapters come with RAID-0, 1 and 10 "for free." They are a consideration as well because they too are doing "non-blocking I/O" for their channels (which can be SATA as well as SAS). Especially with today's commodity disk densities.

For RAID-3, a non-blocking ASIC and SRAM, plus a little DRAM for extra XOR buffer, is also ideal -- especially when the width of the bus matches the data channel (not including parity). That's the NetCell SR3x00 (32-bit -- 2 drive + parity) and SR5x00 (64-bit -- 4 drive + parity). ATA is still ideal because it's direct I/O, and RAID-3 is not a blocked I/O (unlike RAID-0, 4 and 5).

For RAID-4 or RAID-5, you're now dealing with blocks of (typically) 32KB striped, with dedicated parity (RAID-4) or striped parity (RAID-5). Now you want a microcontroller with lots of buffer (DRAM). ATA or SCSI doesn't matter -- the I/O isn't direct, so non-blocking is useless. Furthermore, SCSI can have lots of benefits with its higher spindles for response time (especially for RAID-5), let alone other features (like sector remapping standard -- although a few intelligent ATA RAID controllers reserve ATA sectors as well).
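
For anyone who hasn't seen the parity math, here is a minimal Python sketch of XOR parity and of dedicated versus rotating parity placement. The tiny 2-byte blocks stand in for the 32KB blocks mentioned above, and the rotation scheme shown is just one possible layout that I picked for illustration; real controllers differ.

```python
from functools import reduce
from operator import xor

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

def parity_drive(stripe, n_drives, rotate):
    """RAID-4 (rotate=False): parity always on the last drive.
    RAID-5 (rotate=True): parity moves one drive per stripe (assumed layout)."""
    if rotate:
        return n_drives - 1 - (stripe % n_drives)
    return n_drives - 1

data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]   # 3 data blocks in one stripe
parity = xor_blocks(data)
print(parity.hex())  # ee22

# Lose the middle block, rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True

print([parity_drive(s, 4, rotate=False) for s in range(5)])  # [3, 3, 3, 3, 3]
print([parity_drive(s, 4, rotate=True) for s in range(5)])   # [3, 2, 1, 0, 3]
```

Every small write means reading, XORing and rewriting parity, which is why buffer (DRAM) matters more here than a non-blocking path.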

In the end, the future is Serial Attached SCSI (SAS). Almost _all_ new intelligent RAID controllers being designed are SAS because they also do SATA. SAS is basically an intelligent host, point-to-point SATA with SCSI-2 atop. It's basically like talking about the difference between the quality of an ASIC in Ethernet hardware, only now the concentrator is the storage controller -- a storage switch.
