Moving to Cloud Storage
If your organization is building or maintaining software services, scalable and performant storage is a prime concern. Most modern applications have heavy data requirements, and mission-critical use cases require storage with low latency and high throughput.
In the traditional enterprise, the solution to these concerns was purchasing and deploying storage equipment, maintaining it, and scaling it over time. In the age of cloud computing, this is a much less appealing prospect. The cloud lets you gain instant access to storage resources, requiring no setup and no upfront investment, offering multiple levels of availability and performance.
What Is Cloud Storage?
Cloud storage services are typically based on a cluster of data servers, which are accessible over the Internet (although access can be restricted to private networks). Commonly, data is replicated between two or more servers for durability and high availability.
Cloud data storage resources can be provisioned in several ways:
- Users can pre-provision cloud storage resources with a fixed capacity—such as managed hard disks or file shares
- Users can use cloud resources in a flexible manner, adding or removing storage and paying per actual GB-hour of storage used
- In some cases, storage is provided as part of a complete platform as a service (PaaS) package, and is priced together with a larger service (for example, a storage component provided as part of an application hosting service)
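To make the pay-per-use model concrete, here is a minimal sketch of how metered GB-hours could translate into a monthly charge. The price of 0.02 per GB-month, the 730-hour month, and the usage figures are illustrative assumptions, not any provider's actual rates or billing rules.

```python
# Assumption: a 730-hour month, a common approximation (8,760 hours / 12).
HOURS_PER_MONTH = 730


def gb_hours(usage):
    """Sum GB-hours from (gigabytes_stored, hours_held) pairs."""
    return sum(gb * hours for gb, hours in usage)


def monthly_cost(usage, price_per_gb_month):
    """Convert metered GB-hours into a charge using a per-GB-month price."""
    return gb_hours(usage) / HOURS_PER_MONTH * price_per_gb_month


# Hypothetical month: 100 GB for the first 365 hours, then 300 GB for the rest.
usage = [(100, 365), (300, 365)]

print(gb_hours(usage))                      # 146000
print(round(monthly_cost(usage, 0.02), 2))  # 4.0
```

In this example the customer is billed for an average of 200 GB over the month, not for the 300 GB peak, which is the practical difference from pre-provisioning a fixed capacity.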
End users or developers can upload and download files and objects via a web interface, command line interface (CLI), or application programming interface (API). For example, the Amazon S3 API has become a de facto standard for cloud storage management.
How Does Cloud Storage Work?
Cloud providers set up and maintain large data centers in many locations across the world. These resources are made available to customers, who can purchase cloud-based storage from various providers. To access data stored on the cloud, customer applications can use application programming interfaces (APIs) or traditional storage protocols.
There are three main cloud storage technologies used by cloud providers:
- Block storage—splits big data volumes into smaller units, which are called blocks. Each block gets a unique identifier and is placed on a storage drive. Typically block storage is directly attached to cloud virtual machines (VMs). This storage option is considered the fastest, and also the most expensive of the three—it provides the low latency needed to run databases as well as high-performance workloads.
- File storage—organizes all data into a hierarchical system of folders and files. Data is placed in files, and these files are placed inside folders. To organize folders and efficiently search for data and files, you can use directories and subdirectories. Typically, file storage is exposed as network file shares, using protocols like Network File System (NFS).
- Object storage—stores data in the form of objects. Each object consists of three components—a unique identifier, data stored in a file, and all metadata associated with the file. Object storage protocols use a RESTful API to receive a file, store its data and metadata as a single object, and assign an identifier to the object. Similarly, when a client requests an object, providing the ID, the system assembles the object and delivers it. This is considered the lowest cost cloud storage option. Object storage systems let you customize metadata to streamline data access as well as analysis.
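The difference between the block and object models can be sketched in a few lines of Python. This is a simplified in-memory illustration, not how any provider implements its service: the names `split_into_blocks`, `ObjectStore`, `put`, `get`, and `find` are hypothetical, and a real object store would expose these operations over a RESTful API.

```python
import uuid


def split_into_blocks(data: bytes, block_size: int) -> dict:
    """Block model: cut a volume into fixed-size blocks keyed by block number."""
    return {
        offset // block_size: data[offset:offset + block_size]
        for offset in range(0, len(data), block_size)
    }


class ObjectStore:
    """Object model: each object is an identifier plus data plus metadata."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        """Store data and metadata as one object and return its new ID."""
        object_id = str(uuid.uuid4())  # the store assigns the identifier
        self._objects[object_id] = {"data": data, "metadata": dict(metadata)}
        return object_id

    def get(self, object_id: str) -> dict:
        """Return the object (data and metadata) for the given ID."""
        return self._objects[object_id]

    def find(self, key: str, value) -> list:
        """Return IDs of objects whose metadata entry `key` equals `value`."""
        return [
            object_id
            for object_id, obj in self._objects.items()
            if obj["metadata"].get(key) == value
        ]


blocks = split_into_blocks(b"abcdefghij", 4)
print(blocks)  # {0: b'abcd', 1: b'efgh', 2: b'ij'}

store = ObjectStore()
oid = store.put(b"quarterly report", {"content-type": "text/plain", "team": "finance"})
print(store.get(oid)["metadata"]["team"])      # finance
print(store.find("team", "finance") == [oid])  # True
```

The `find` method hints at why custom metadata is useful: objects can be located by their attributes without reading the data itself.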
Cloud Storage Pros and Cons
Pros of cloud storage include:
- Off-site management—the cloud provider is responsible for maintaining and protecting the underlying infrastructure. There is no need to set up and maintain an in-house data center. The cloud vendor provides the infrastructure and cloud customers can focus on other priorities.
- Quick implementation—cloud services provide instant deployment of storage. You can quickly set up and scale your storage resources, provision the service, and then start using it within hours (or even minutes, depending on the details of the service).
- Cost-effective—cloud storage is offered on demand, and cloud customers pay only for the resources they use. This flexibility enables organizations to treat storage costs as ongoing operating expenses and to avoid upfront capital expenditure and its tax implications.
- Scalability—the cloud computing model enables customers to quickly and easily scale resources on demand. Unlike on-premises facilities, capacity is not limited by the hardware you own. Cloud customers can leverage a global network of cloud resources ready for use.
Cons of cloud storage include:
- Security—cloud providers are responsible for securing their network of data centers and all underlying infrastructure. However, breaches and outages do occur. In addition, your organization is responsible for securing your data and workloads, and a simple misconfiguration, like forgetting to turn on authentication for a cloud storage bucket, can result in massive exposure of sensitive data. Remember that cloud storage endpoints are commonly reachable over public networks, so access controls need to be configured with care.
- Administrative control—cloud providers operate on a shared responsibility model. This means that the cloud provider operates many aspects of the service, and your organization does not have complete control over the environment.
- Latency—cloud storage is accessed over a network, and delays may occur when data is transmitted to and from the cloud. This is typically caused by traffic congestion, especially when customers are using a shared public Internet connection to access resources.
- Regulatory compliance—many industries require compliance with specific regulations and standards. Healthcare providers, for example, are required to comply with the Health Insurance Portability and Accountability Act of 1996 (HIPAA). Check that your cloud provider is certified for the required compliance standards, and understand your organization’s role in ensuring compliance (for example, by securely configuring storage services). In many cases it is more complex to achieve compliance on the cloud than on-premises.
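The bucket misconfiguration risk mentioned under security can be checked for programmatically. The sketch below scans a policy document in the JSON shape used by S3 bucket policies and flags statements that allow access to any principal. It is a deliberately simplified check under stated assumptions: the function names and the `example-bucket` policy are hypothetical, and a real audit would also need to consider access control lists, account-level public access settings, and policy conditions.

```python
import json


def _is_wildcard(principal) -> bool:
    """True if the principal grants access to everyone."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False


def public_statements(policy_json: str) -> list:
    """Return Allow statements open to any principal with no Condition."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be in a list
        statements = [statements]
    return [
        stmt
        for stmt in statements
        if stmt.get("Effect") == "Allow"
        and _is_wildcard(stmt.get("Principal"))
        and "Condition" not in stmt  # conditioned statements need manual review
    ]


# Hypothetical policy: the first statement makes every object world-readable.
policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
    ],
})

for stmt in public_statements(policy):
    print("Public access:", stmt["Action"], "on", stmt["Resource"])
```

Running a check like this as part of a deployment pipeline is one way to catch an open bucket before it reaches production.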
Comparison of Cloud Storage Services
There are many cloud providers, but three dominate the market—Amazon Web Services (AWS), Microsoft Azure, and Google Cloud. Let’s take a look at the storage services provided by the big three cloud providers.
- Amazon S3—a popular object storage service from AWS that comes with a powerful API, letting you integrate with third-party applications and services. S3 offers several storage classes with different pricing and performance, such as standard, infrequent access for single and multiple availability zones, Glacier for archives, and automatic tiering.
- Azure Blob Storage—a cloud-based object storage solution offered by Microsoft. It offers four tiers of performance: archive, cool, hot, and premium. Each tier offers an on-demand price as well as a volume discount for reserved capacity.
- Google Cloud object storage—offers several performance tiers, including archive, coldline, nearline, and standard.
- Amazon Elastic File System (Amazon EFS)—a managed service offered by AWS. It is designed as a petabyte-scale NFS service that comes with multi-availability zone redundancy. Amazon EFS is available in several modes, including standard and infrequent access.
- Azure Files—a fully managed service offered by Microsoft. It is designed to be mounted concurrently by on-premises or cloud deployments of Linux, macOS, and Windows. It offers four performance tiers as well as optional snapshots and metadata backups at additional cost.
- Google Cloud Filestore—a managed NFS service offered by Google Cloud. It is designed to provide shared file storage that Google Cloud VMs can mount over NFS. It is available in three performance tiers, including SSD, High Scale SSD, and HDD, with a wide range of input/output operations per second (IOPS) and I/O throughput.
- Amazon Elastic Block Store (EBS)—offers HDD-based and SSD-based block storage devices, which you can attach to Amazon Elastic Compute Cloud (EC2) instances, at several latency and IOPS levels.
- Azure Page Blobs and Azure Managed Disks—page Blobs are optimized for random read/write operations and are available in HDD and SSD tiers. Managed Disks can be attached to Azure VMs, and are available in standard SSD, HDD, premium, and ultra tiers.
- Google Persistent Disk—offers block storage devices, which can be attached to Google Cloud VMs. Google provides eight tiers, each providing different reliability and performance.
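Choosing among object storage tiers usually comes down to how often the data is read. As a rough sketch, the function below maps the number of days since an object was last accessed to one of the Google Cloud tier names listed above. The 30, 90, and 365 day thresholds and the function name `suggest_tier` are assumptions chosen for illustration. Actual tier selection should also weigh retrieval fees and minimum storage durations from the provider's current documentation.

```python
# Illustrative thresholds in days, ordered from coldest to warmest tier.
TIER_THRESHOLDS = [
    (365, "archive"),
    (90, "coldline"),
    (30, "nearline"),
]


def suggest_tier(days_since_last_access: int) -> str:
    """Suggest a storage tier from the time since the last access."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    for min_days, tier in TIER_THRESHOLDS:
        if days_since_last_access >= min_days:
            return tier
    return "standard"  # frequently accessed data stays in the default tier


for days in (3, 45, 120, 400):
    print(days, "->", suggest_tier(days))
# 3 -> standard
# 45 -> nearline
# 120 -> coldline
# 400 -> archive
```

The same idea underlies the lifecycle rules and automatic tiering features that providers offer, which move objects between tiers without application changes.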
AI-based Data Services
The leading public cloud providers offer value-added services that can help you do more with the files stored in the cloud. These services are especially useful when data volumes are large. For example, if you store huge volumes of unstructured big data or a large amount of video content in the cloud, it is natural to process and work with the data directly in the cloud, rather than having to transfer it elsewhere.
Cloud providers offer a rich array of API-driven, managed services such as stream processing, big data analytics, computer vision, and speech-to-text. Here are a few examples of value-added services that you can use when storing data in a public cloud:
- Amazon Rekognition – analyzes image and video content to detect objects, scenes, and faces
- Azure Cognitive Search – enables AI-powered search on textual, image and video data
- Cloudinary video API – based in the AWS cloud, performs AI-driven transformation and quality enhancement of video and image content
Object Storage Pricing Comparison
Object storage is often a major part of cloud storage expenditure, because it can be used for large-volume data storage and archival of historical data. The following table shows a quick comparison of object storage services across the big three cloud providers. Costs shown are for US regions and are subject to change, so consult the official pricing pages for up-to-date pricing.
In this article I explained the basics of cloud storage, touched on the three main storage technologies offered by cloud services (block storage, file storage, and object storage), and compared storage solutions offered by the big three cloud providers.
Finally, I provided a detailed comparison of object storage costs, which often form the biggest component in an organization’s cloud storage spending.
I hope this will be of help in your journey to a more efficient, scalable storage strategy.
By Gilad Maayan
Gilad David Maayan is a technology writer who has worked with over 150 technology companies including SAP, Samsung NEXT, NetApp and Imperva, producing technical and thought leadership content that elucidates technical solutions for developers and IT leadership.