Introduction to Computer Storage

April 19, 2019

Computer storage systems have evolved massively over the last 100 years or so. The changes in storage capacity, price and access speed have been dramatic. They were made possible by advances in technology that initially took years to mature but that have accelerated in the last decade through rapid innovation and the concerted, collaborative efforts of industry giants. These advances have revolutionized the way businesses and consumers use storage today, from the ubiquitous smartphone and the personal computer all the way to businesses that have come to rely more and more on cloud storage services.

In this article we try to capture the evolution of storage technologies over the years and examine the fundamental technical reason that has accelerated storage evolution over the last decade.

Storage evolution over the last 70 years relative to size, capacity and price

The table below highlights the trends in storage. Initially, advances in technology were slow and were led mostly by IBM, which drove the Megabyte revolution in the 60s and 70s. In the 80s and 90s other companies jumped onto the storage bandwagon and started the Gigabyte revolution, soon to be followed by the Terabyte revolution. Today storage devices come in many physical form factors, from traditional mechanical Hard Disk Drives (HDD) to Non-Volatile Memory or Solid-State Drives (SSD). However, the fastest and most dynamic revolution is occurring in the Cloud.

Storage Technology Evolution over the last 100 years (Another View)

The diagram below is a pictorial representation of how storage technologies have evolved over the last 100 years. It complements the table shown earlier in this article.

The fastest growing storage technology today

Most storage systems today use some form of mechanical device. These devices, known as Hard Disk Drives (HDD), are found in almost all storage systems. HDDs are the dominant technology for several reasons: (1) very high recording density per platter, (2) more than one platter per HDD, (3) high rotational speeds, up to 15000 RPM for enterprise-class drives, and (4) reduced costs due to economies of scale. However, they have inherent disadvantages: (1) further increases in recording density have hit physical limits, (2) increasing the rotational speed of the platter increases the cost exponentially, and (3) being a mechanical device with moving parts, an HDD is bound to fail physically at some point. A single HDD with a single platter at 15000 RPM can deliver a transfer speed of at most 100 MB/s for sequential block reads. For random reads on the same configuration, the transfer speed drops to as low as 10 MB/s.
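To put the 15000 RPM and 100 MB/s versus 10 MB/s figures in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names and the 1 TB data set are assumptions made for the example, and seek time is ignored.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: on average the head waits half a revolution."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


def transfer_time_s(size_mb: float, rate_mb_per_s: float) -> float:
    """Seconds needed to move size_mb megabytes at a sustained rate."""
    return size_mb / rate_mb_per_s


if __name__ == "__main__":
    rpm = 15000
    print(f"Average rotational latency at {rpm} RPM: "
          f"{avg_rotational_latency_ms(rpm):.1f} ms")

    size_mb = 1_000_000  # hypothetical 1 TB data set, in decimal megabytes
    for label, rate in (("sequential", 100), ("random", 10)):
        hours = transfer_time_s(size_mb, rate) / 3600
        print(f"Reading 1 TB at {rate} MB/s ({label}): {hours:.1f} hours")
```

Even at the highest rotational speed, the platter needs a couple of milliseconds on average just to bring the right sector under the head, which is one reason random reads fall so far behind sequential reads.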

Given the inherent limitations of mechanical storage devices and the rapid drop in prices of Non-Volatile Memory (NVM), NVM is the next revolution in storage. It is found in almost all mobile devices and continues to replace mechanical devices across the board. There are several reasons for this: (1) NVM is now cost competitive with HDDs per terabyte of storage capacity, and the price gap will continue to shift in favor of NVM in the coming years, (2) NVM is far more reliable in the longer term because it has no moving parts, (3) NVM is over 100 times faster than HDD and, unlike HDD, has similar transfer speeds for sequential and random reads and writes, and (4) NVM has microsecond read/write latency compared to millisecond latency for HDDs.
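The effect of microsecond versus millisecond latency can be illustrated with Little's law, which relates the number of requests in flight to latency and throughput. The sketch below is a rough model only: the 5 ms and 100 µs latencies, the 4 KiB block size and the function names are assumptions chosen for the example, not measurements.

```python
def iops(latency_s: float, queue_depth: int = 1) -> float:
    """Little's law: completed requests per second = requests in flight / latency."""
    return queue_depth / latency_s


def throughput_mb_per_s(iops_value: float, block_bytes: int = 4096) -> float:
    """Throughput in decimal MB/s for a given request rate and block size."""
    return iops_value * block_bytes / 1e6


if __name__ == "__main__":
    # Assumed per-request latencies, for illustration only.
    devices = {"HDD (5 ms)": 5e-3, "NVM (100 us)": 100e-6}
    for name, latency in devices.items():
        rate = iops(latency)  # one outstanding request at a time
        print(f"{name}: {rate:,.0f} IOPS, "
              f"{throughput_mb_per_s(rate):.2f} MB/s with 4 KiB blocks")
```

With a single outstanding request, the ratio of the two request rates is simply the ratio of the two latencies, which is why latency matters so much for small random I/O.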

Companies like Intel and Samsung have now developed three-dimensional NVM technologies, which means increased storage density per unit of volume without any performance degradation.

The first system implementations of NVM were Solid State Drives (SSD) that use the Serial ATA (SATA) protocol. The reason was simple: the industry could immediately achieve a 100-fold increase in transfer speeds relative to HDD without changing the upper-level Small Computer System Interface (SCSI) protocol. The SCSI protocol is over four decades old and is used by many I/O protocols such as Fibre Channel. All major operating systems also support the SCSI protocol, so for quick gains it was easy for the industry to replace the HDD with an NVM-based SSD.

The next disruption was soon born because the SCSI protocol and its associated software stack are heavy in terms of execution time. This overhead directly increases the latency of reads and writes, which is detrimental to high-performance applications such as high-frequency trading, small banking transactions and numerous database applications for machine learning and artificial intelligence. An industry working group was therefore formed to address the latency problem associated with the SCSI protocol. It came up with a new protocol that takes advantage of the native speed of Non-Volatile Memory and eliminates SCSI completely: the NVMe (Non-Volatile Memory Express) protocol. In this protocol, the NVM device is attached directly to the PCI Express I/O bus, hence the “e” in NVMe. This eliminates the need for a Host Bus Adapter (HBA) and for the hardware and firmware that traditionally reside in it. The following diagram illustrates this concept.

Latest Non-Volatile Memory Highlights

  1. PCIe Gen1 is 2.5 Gbps per lane per direction. Today's SSDs pack Gen3 x2 or Gen3 x4 (8 Gbps x 2 or 4 lanes = up to 32 Gbps) bandwidth in a very tiny M.2 gumstick form factor
  2. 3D NAND and 3D XPoint - DRAM bandwidth at flash economics - very low-latency flash (20 µs I/O read/write latency compared to 200 µs latency for enterprise flash)
  3. A new form factor is coming to pack terabytes of capacity - the "ruler" form factor from Intel
  4. NVMe enables performance to scale with capacity - traditionally, denser HDDs did not bring any performance improvement
  5. NVMe over TCP enables low-cost SAN deployment compared to InfiniBand, RoCE, iWARP or FC
  6. NVMe allows dual-ported drives, which provide high availability (the same PCIe connector carries either one Gen3/Gen4 x4 link or two separate Gen3/Gen4 x2 links), and the same connector can connect a native SATA drive as well
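The PCIe figures in the first point can be reproduced with a short calculation. The sketch below multiplies the per-lane signalling rate by the lane count and then applies the line encoding each generation uses (8b/10b for Gen1 and Gen2, 128b/130b for Gen3) to estimate usable bandwidth. The table and function names are made up for this example, and packet and protocol overhead are ignored, so real drives will deliver somewhat less.

```python
# Per-lane signalling rate (gigatransfers per second, one direction) and
# line encoding (payload bits, bits on the wire) for each PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8, 10),
    2: (5.0, 8, 10),
    3: (8.0, 128, 130),
}


def raw_gbps(gen: int, lanes: int) -> float:
    """Raw link rate per direction, before encoding overhead."""
    rate, _, _ = PCIE_GENERATIONS[gen]
    return rate * lanes


def usable_gbytes_per_s(gen: int, lanes: int) -> float:
    """Approximate usable bandwidth in GB/s after line encoding."""
    rate, payload_bits, wire_bits = PCIE_GENERATIONS[gen]
    return rate * payload_bits / wire_bits * lanes / 8


if __name__ == "__main__":
    for gen, lanes in ((1, 1), (3, 2), (3, 4)):
        print(f"Gen{gen} x{lanes}: {raw_gbps(gen, lanes):.1f} Gbps raw, "
              f"about {usable_gbytes_per_s(gen, lanes):.2f} GB/s usable")
```

For a Gen3 x4 link this gives the 32 Gbps raw figure quoted above, which corresponds to a little under 4 GB/s of usable bandwidth per direction.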

Further evolutions in NVMe – Non-Volatile Memory Express Over Fabric (NVMe-o-F)

Restricting NVMe to the inside of the box, the server chassis, has implications for scaling a storage system. Network Attached Storage (NAS) and Storage Area Networks (SAN) do not have these scaling issues. They are field-proven technologies for large enterprises, each with advantages and disadvantages for certain types of applications. NVMe-o-F is now about to challenge the dominance of NAS and SAN storage networks. Several competing technologies are defining new protocols to replace traditional FC and NAS deployments in enterprise and cloud environments. These are iWARP, FC and RoCE. They intend to use existing Ethernet and Fibre Channel networks to deploy NVMe outside the server chassis. TCP/IP, a very old protocol that runs over existing Ethernet infrastructure, is also in the process of being extended to carry NVMe outside the server chassis. It is only a matter of time before all new (green field) storage projects adopt NVMe-o-F as the preferred storage technology. The following diagram illustrates NVMe-o-F in a CCTV surveillance application.

Data Security and Data Protection

Data Security has to do with unauthorized access to data, whereas Data Protection has to do with computer component failure. Both are important and must coexist, with interrelated recovery policies. As this subject is too vast to discuss in full, we highlight the main points here:

  1. Data growth: Sufficient policies must be in place to handle future growth
  2. Cyberattack growth: Cyberattacks are bound to increase, so this also needs to be considered in the overall security policy
  3. Cost of data breaches: Breaches can become very expensive, which should act as a powerful incentive to have a strong security policy and mechanism in place
  4. Increasing data value: All the more reason to prevent data breaches by having a solid security policy

Storage vulnerabilities are inherent in storage systems. They include the following:

  1. Lack of encryption
  2. Cloud storage: Understand the security features offered by cloud storage providers. All these features come at a cost
  3. Incomplete data destruction: All efforts and methods must be implemented to ensure the total destruction of data when it is no longer needed
  4. Lack of physical security: This is one of the most common causes of data breaches and is often overlooked. Policies and procedures must be put in place to address this issue, as critical data breaches are carried out by human beings

The following are some of the best data security practices:

  1. Data storage security policies: These vary between organizations as well as departments. Based on access priority, the relevant data must be sufficiently secured against unauthorized access
  2. Encryption and access control: Data must be encrypted both in transit and at rest with appropriate access control
  3. Strong network security: Many devices, including smartphones, will have access to the data, and these endpoints can be weak points for cyber breaches
  4. Backup and recovery: These are fundamental requirements for any organization to implement. Without them, any data breach, loss or destruction of data can be catastrophic
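Access control, mentioned in the second point, can take many forms. One common form is role-based access control, where permissions are attached to roles rather than to individual users. The snippet below is a minimal sketch of the idea: the role names, the actions and the `is_allowed` helper are hypothetical, and a real deployment would rely on the access control features of its storage platform or directory service.

```python
# Hypothetical roles and the actions each one may perform on stored data.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "backup-operator": {"read", "write"},
    "analyst": {"read"},
}


def is_allowed(role: str, action: str) -> bool:
    """Deny by default: unknown roles and unlisted actions are rejected."""
    return action in ROLE_PERMISSIONS.get(role, set())


if __name__ == "__main__":
    for role, action in (("analyst", "read"),
                         ("analyst", "delete"),
                         ("guest", "read")):
        verdict = "allowed" if is_allowed(role, action) else "denied"
        print(f"{role} -> {action}: {verdict}")
```

The deny-by-default lookup is a deliberate choice in this sketch: a role that is missing from the table gets no access at all instead of falling back to some implicit permission.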

Eight Data Security Best Practices

  1. Write and enforce data security policies that include data security models.
  2. Implement role-based access control and use multi-factor authentication where appropriate.
  3. Encrypt data in transit and at rest.
  4. Deploy a data loss prevention solution.
  5. Surround your storage devices with strong network security measures.
  6. Protect user devices with appropriate endpoint security.
  7. Provide storage redundancy with RAID and other technologies.
  8. Use a secure backup and recovery solution.
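The storage redundancy in item 7 can be illustrated with the XOR parity idea used by parity-based RAID levels such as RAID 5. The sketch below is a toy model under simplifying assumptions: each "disk" is a short byte string, there is a single parity block, and the `xor_blocks` helper is a name invented for this example. Real RAID implementations also handle striping, parity rotation and rebuild scheduling.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields, for each byte position, the bytes from every block.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    data_disks = [b"AAAA", b"BBBB", b"CCCC"]  # toy data blocks
    parity = xor_blocks(data_disks)           # stored on a fourth disk

    # Simulate losing the second disk and rebuilding it from the survivors.
    survivors = [data_disks[0], data_disks[2], parity]
    rebuilt = xor_blocks(survivors)
    print("Rebuilt block matches the lost one:", rebuilt == data_disks[1])
```

Because XOR is its own inverse, any single missing block can be recomputed from the remaining blocks and the parity. Losing two blocks at once cannot be repaired this way, which is one reason redundancy does not replace backup and recovery.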

Summary

The following points become clear and must be considered in storage adoption, data security and data protection:

  1. The rate at which data sets continue to grow across all market segments must be taken seriously when developing a broad implementation strategy
  2. Storage technologies will continue to drop in price, increase in performance and increase in storage volume density
  3. Develop a consistent data security policy that is applied across all branches of the organization
  4. Develop and implement a corresponding data protection policy