Object storage offers a new paradigm. As a data storage system, it is something that can be installed on site, but it is also the basis for most of the storage available on the public cloud. However, its use as a valuable technology in M&E — both for active workflow storage and long-term asset preservation — is less understood. This tutorial will explain why it is so useful, how it works, the problems it solves, and how it differs from other approaches.
Like many things in IT, the way data is stored, especially in a shared storage environment, can be thought of as a “solution stack.” At a base level, data is stored on storage devices, such as hard drives and flash drives, as blocks of data. Every individual file gets broken up into several blocks of data with each block being a particular number of bytes. These data blocks are mapped to regions of the data storage device, such as sectors on a hard drive. These mappings are stored in a file system, which is a database of these mappings, along with metadata about the files, such as access rights, creation and modification dates, etc. The file system is layered onto the raw storage device when a drive is formatted.
File systems are organized into hierarchies of directories, or folders, and within these directories are files. They are certainly useful for organization, and many users and workgroups come up with elaborate hierarchies with strong naming conventions for directories and files. We have the means of sharing these file systems out to workgroups of users, server systems and associated software platforms, via SANs and file servers.
But there is something rigid about hierarchies. Perhaps there is a better paradigm than the traditional file system.
Blocks, Files, and Objects, Oh My!
Object storage answers this by borrowing the notion of a data object from areas such as object-oriented programming and databases. So, what is a data object? It is the data — any data but probably a file or other captured data stream — referred to by an arbitrary set of attributes, usually expressed as a number of key-value pairs. “File name” would be a key, and the name of the file would be the value for that key. “File creation date” and “file modification date” would be other keys, with their own values.
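As a rough illustration of that idea, a data object can be sketched as a payload plus a flat dictionary of attributes. The class name, the sample file name, and the dates below are hypothetical; this is not any particular object store's data model.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A payload plus an arbitrary, flat set of key-value attributes."""
    data: bytes
    metadata: dict[str, str] = field(default_factory=dict)


# Hypothetical asset: the keys and values are illustrative, not a standard schema.
clip = DataObject(
    data=b"...captured media essence...",
    metadata={
        "file name": "interview_take3.mxf",
        "file creation date": "2018-03-14",
        "file modification date": "2018-03-15",
    },
)

print(clip.metadata["file name"])  # interview_take3.mxf
```

Note that nothing in the structure implies a directory hierarchy; the attributes are simply a set of pairs attached to the data.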
What object storage gives you that traditional file systems do not is the ability to create your own sets of key-value pairs to associate with the data objects you are storing, which can be integrated, more or less, any way you please through a software application interface. Think of the key-value metadata pairs you may have come up with for different classes of assets stored in a MAM database. You can come up with whatever you want, and they are not inherently hierarchically arranged. This means you could have a search engine application integrate with the metadata index of an object storage system, based on a customized query looking to bring back a list of all files that adhere to that particular set of criteria.
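A minimal sketch of such a query, assuming the metadata index can be modeled as a dictionary keyed by object ID. The function name `find_objects`, the IDs, and the metadata keys are all invented for illustration; real systems expose their own search interfaces.

```python
def find_objects(index, **criteria):
    """Return IDs of objects whose metadata matches every key-value pair given."""
    return sorted(
        object_id
        for object_id, metadata in index.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    )


# Hypothetical metadata index: object ID -> custom key-value pairs.
metadata_index = {
    "obj-001": {"project": "doc-series", "resolution": "4K", "status": "approved"},
    "obj-002": {"project": "doc-series", "resolution": "HD", "status": "approved"},
    "obj-003": {"project": "promo", "resolution": "4K", "status": "draft"},
}

matches = find_objects(metadata_index, project="doc-series", status="approved")
print(matches)  # ['obj-001', 'obj-002']
```

The same index answers a completely different question, such as `find_objects(metadata_index, resolution="4K")`, without any reorganization of the stored data.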
This is an exceptionally flexible way to organize things on an object store, which might mean the data is not really structured at all. You may find it more useful to keep an object’s searchable metadata in a totally separate place, like your MAM database. What the MAM and the object store both need to track is the file’s main object ID, which the object system assigns to each file stored on it. This ID is what a MAM or other software application passes to the object store, via a GET API call, for example, in order to pull the file back to a file system for processing or use.
Many media software applications today cannot modify data in place on an object store via that system’s APIs, because they are written to use file systems, either on local storage or shared via SAN and NAS file-sharing technologies. Your NLE cannot really edit off object storage. Your transcoder cannot really decode from and encode to your object store. Not via the native APIs, anyway. However, many object storage systems do offer file-sharing protocol “front-ends” to their systems in order to support applications or users that need that interface to the data today. These do work decently, but tend not to match the performance of traditional shared file systems for media processing workloads.
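The division of labor described here can be sketched with a toy in-memory store. `ToyObjectStore`, `mam_records`, and the asset names are hypothetical stand-ins, not a vendor API; the point is only that the store assigns the ID and the MAM keeps it alongside the searchable metadata.

```python
import uuid


class ToyObjectStore:
    """In-memory stand-in for an object store; not any vendor's API."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        object_id = uuid.uuid4().hex  # the store, not the caller, assigns the ID
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id]


store = ToyObjectStore()
mam_records = {}  # hypothetical MAM table: asset name -> metadata plus object ID

object_id = store.put(b"finished master")
mam_records["episode_101_master"] = {"codec": "ProRes", "object_id": object_id}

# Later: the MAM looks up the ID and asks the store for the bytes.
restored = store.get(mam_records["episode_101_master"]["object_id"])
```

In a real deployment the `get` step would be an API call over the network, and the returned bytes would be written to a file system for processing.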
Where this is starting to change more is in the public cloud space. Some providers like Amazon Web Services also offer media-specific services such as transcoding. These cloud providers are based around files being stored on their object platforms, like S3. They have been motivated to adapt and build media tool sets that are able to work with data on an object store. These capabilities will likely, over time, “trickle down” to on-premises deployment models.
For on-premises object storage deployments, the object storage platform is usually the second tier of a two-tier data storage setup: SAN or NAS for processing, and object for longer-term storage. Even smaller object stores can compete performance-wise with very large tape library installations. Tape may still be useful as part of a disaster recovery, or DR, strategy, but it seems as though object storage is set to supplant it for many so-called “active archive” applications, meaning archives that store data that is actually utilized on a regular basis.
Preservation at Scale
Another strength of many object storage platforms is how well they scale. Many can grow to tens and hundreds of petabytes of capacity per “namespace,” or unified object storage system. Many traditional shared file system technologies fall apart at this scale, but object truly is “cloud-scale.” We generate a lot of data in media these days, and by the looks of things, data rates and storage requirements are only going to keep increasing with 4K, 8K, HDR, volumetric video, 360-degree video, etc.
But what’s especially exciting about object for those of us in media is that it’s not just storage at scale, but storage with exceptional preservation qualities for the underlying data. Data integrity is usually talked about in terms of “durability,” and is referred to as X number of nines — much like data availability which, unlike durability, speaks more to the accessibility of data at a given moment. Durability for object storage is achieved through a few mechanisms and they result in systems that make it very likely you will never experience any data ever going “bad” or being lost due to bit-flipping, drive loss, or other failures, even when storing many petabytes of data.
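To get a feel for what a number of nines means at scale, here is a back-of-the-envelope sketch. It assumes durability is quoted as the probability that a single object survives one year, and that losses are independent; vendors define and measure durability in different ways, so treat the function names and the model as illustrative only.

```python
def annual_loss_probability(nines: int) -> float:
    """Per-object loss probability per year for a durability of `nines` nines."""
    return 10.0 ** -nines


def expected_objects_lost_per_year(object_count: int, nines: int) -> float:
    """Expected number of objects lost per year under the simple model above."""
    return object_count * annual_loss_probability(nines)


for nines in (4, 9, 11):
    lost = expected_objects_lost_per_year(10_000_000, nines)
    print(f"{nines} nines: about {lost:g} objects lost per year out of 10 million")
```

Under these assumptions, each additional nine cuts the expected loss by a factor of ten, which is why the difference between a handful of nines and eleven matters so much for an archive of millions of assets.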
The first way this is achieved is through erasure coding algorithms. Similar to RAID, they generate extra data based on all file data that lands on the system. Unlike the generation of RAID parity data in RAID 5 and 6, however, erasure coding does not require costly specialized controllers; rather, it uses the CPU of the host server to do the calculations.
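The simplest possible illustration of "extra data computed on the CPU" is a single XOR parity block, sketched below. This is a deliberately minimal stand-in: it tolerates the loss of only one block, whereas production object stores typically use stronger codes (Reed-Solomon is a common family) that tolerate several. The block contents and helper name are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_blocks = [b"AAAA", b"BBBB", b"CCCC"]  # three equal-sized data blocks
parity = xor_blocks(data_blocks)           # one extra block, computed in software

# Simulate losing the second block, then rebuild it from the survivors plus parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'BBBB'
```

No dedicated controller is involved: the parity is ordinary arithmetic on the host, which is the property the paragraph above highlights.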
Erasure coding can also tolerate more drive losses per disk set than RAID 6, which can lose only two of its constituent drives. If a third drive fails before a rebuild is complete, all data on the RAID set is lost. As such, it is imperative to limit the total number of disks when creating a traditional RAID set, in order to balance the storage space you want against the risk you are willing to take that multiple disk failures will cause data loss. Erasure coding algorithms are much more flexible: you can assign six out of 18 drives in a set for parity, so one third of the raw capacity is unavailable for actual storage. In return, up to six drives in the set can be lost without data loss! Other ratios, which affect overall system data durability and storage efficiency, can also be selected.
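The trade-off in that example is simple arithmetic, sketched here. The helper name is hypothetical, and the sketch assumes a scheme in which the number of tolerated failures equals the number of parity drives, as in the 18-drive example above.

```python
def layout_summary(total_drives: int, parity_drives: int):
    """Return (usable fraction of raw capacity, drive failures tolerated)."""
    data_drives = total_drives - parity_drives
    return data_drives / total_drives, parity_drives


ec_efficiency, ec_tolerance = layout_summary(18, 6)        # the example above
raid6_efficiency, raid6_tolerance = layout_summary(18, 2)  # RAID 6 on the same drive count

print(f"12+6 erasure coding: {ec_efficiency:.0%} usable, survives {ec_tolerance} failures")
print(f"18-drive RAID 6:     {raid6_efficiency:.0%} usable, survives {raid6_tolerance} failures")
```

The erasure-coded layout gives up capacity (about two thirds usable, against about eight ninths for RAID 6) in exchange for surviving three times as many drive failures.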
Another mechanism some object storage systems use to achieve durability, even in the case of entire site failures, is the ability to spread erasure-coded disk sets across subsystems housed in multiple geographic locations. There are often some fairly stringent networking requirements between sites, but it is reassuring to be able to have, for instance, a single object store that is spread between three locations and erasure coded so that all data survives even if one of the three locations is wiped off the map, all while still achieving the storage efficiency of a single erasure code. If this is a larger geographic footprint than you want, direct one-to-one replication between systems in different locations is often also available. This really means two separate object stores fully replicate to one another. There are some whispers of two-site erasure coding becoming an option with some future systems, so things may improve down the road from a storage efficiency perspective for two-site setups.
Finally, as far as preservation technologies go, some object stores feature ongoing data integrity checks, via checksum (or hash value comparison) techniques. Hashing algorithms generate a value that is, for practical purposes, unique to the original set of bits in a file. If the hash is run again at a later time and the value generated is the same as the original, you can be confident that all of the bits in the file are identical to the originals. If the hash value changes, however, you know at least one bit in the file has changed, and that is enough to consider a file ruined in most circumstances.
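A minimal sketch of that comparison, using SHA-256 from Python's standard library. The choice of SHA-256 is an assumption for illustration; object stores differ in which checksum or hash they use, and the sample data is invented.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex digest used as the file's fingerprint."""
    return hashlib.sha256(data).hexdigest()


original = b"frame data " * 1000
stored_checksum = checksum(original)  # recorded when the file is ingested

# Flip a single bit in one byte to simulate silent corruption.
damaged = bytearray(original)
damaged[500] ^= 0b00000001
corrupted = bytes(damaged)

print(checksum(original) == stored_checksum)   # True
print(checksum(corrupted) == stored_checksum)  # False
```

A single flipped bit among roughly 11,000 bytes is enough to change the digest, which is what makes the check useful for detecting silent corruption.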
Thankfully, because redundant data is stored using the previously described erasure coding techniques, object stores that offer this kind of feature are capable of figuring out which parts of a file’s data are uncorrupted and can use them to repair the corrupted bits. This kind of operation can be set up to run periodically across the entire data set, so that every file is re-verified every six or 12 months, for example. Given how rarely bits flip in the first place, checking at that interval makes it very unlikely that corruption accumulates faster than it can be repaired. While some data tape systems do offer such checksum features, these are much slower to complete due to the inherent latencies and bandwidth limitations of the data tape format.
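The detect-then-repair cycle can be sketched by combining per-block checksums with redundancy. The toy below uses a single XOR parity block, so it can repair only one bad block per set; real systems use stronger erasure codes and their own scrubbing logic. The function names and block layout are assumptions made for the example.

```python
import hashlib
from functools import reduce


def sha256(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def scrub(blocks, checksums):
    """Verify each block against its stored checksum; rebuild a single bad block.

    `blocks` is the data blocks followed by one XOR parity block (toy layout).
    Returns the index of the repaired block, or None if everything verified.
    """
    bad = [i for i, block in enumerate(blocks) if sha256(block) != checksums[i]]
    if not bad:
        return None
    if len(bad) > 1:
        raise ValueError("single XOR parity cannot repair more than one block")
    index = bad[0]
    blocks[index] = xor_blocks([b for i, b in enumerate(blocks) if i != index])
    return index


data = [b"AAAA", b"BBBB", b"CCCC"]
blocks = data + [xor_blocks(data)]
checksums = [sha256(b) for b in blocks]  # recorded at write time

blocks[1] = b"BBBC"                      # simulate silent corruption
repaired_index = scrub(blocks, checksums)
```

After the scrub, `blocks[1]` holds the original contents again, and a second pass over the same set finds nothing to repair.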
So Really, in a Nutshell
Object storage is metadata-friendly, and thus extremely flexible when it comes to organization and discovery of data. It is very easy to integrate with applications like MAMs and media supply chain platforms. It offers quick access to data and can scale to extremely large capacities while protecting against data loss due to mechanical part or subsystem failure, data corruption, or even site loss. It is not wedded to hard drive technology; you can build object stores out of flash drives if you want (we do not advise this, but it is possible). You can own it and host it yourself, lease it via the cloud and a broadband connection, and in some cases create hybrid systems of these approaches. And it’s not subject to the risks of RAID when deployed at scale.
I think a common model that will emerge for many media-centric companies will be to build a single-site object store that is relied on for daily use, but a copy of all data is also put into a very low-cost public cloud storage tier as a disaster recovery backup. This will essentially keep any egress or recovery fees for the public cloud tier minimal, other than in a disaster scenario, because you are using the copy of the data that’s on your own system.
There is finally something new under the sun in data storage that offers real value. Object storage is available for everyone, and we look forward to seeing how we can use it to build smarter, more efficient systems for our clients over the coming years.