The aggravation, the unexpected delays, the lost time, the high costs: commuting is routinely considered the worst part of the day for people around the world and is one of the big drivers of work-from-home policies.
Computers feel the same way. Computational storage is part of an emerging trend to make data centers, edge servers, IoT devices, cars and other digitally enhanced things more productive and efficient by moving less data. With computational storage, a full-fledged computing system—complete with DRAM, I/O, application processors, dedicated storage, and system software—is squeezed into the confines of an SSD to locally manage repetitive, intermittent, and/or data-intensive tasks.
Why? Because moving data can consume inordinate amounts of money, time, energy and computing resources. “For some applications like in-drive compression, hardware engines consuming less than one watt can achieve the same throughput as over 140 traditional server cores,” said JB Baker, VP of marketing and product management at ScaleFlux. “That’s 1,500 watts and we can do the same work with one watt.”
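The arithmetic behind Baker's comparison can be sketched in a few lines of Python. The figures come from the quote; the one-watt value is treated as an upper bound because the quote says "less than one watt," and the variable names are illustrative only.

```python
# Figures quoted above: 140 server cores drawing about 1,500 watts,
# versus an in-drive hardware engine drawing under one watt.
SERVER_CORES = 140
SERVER_WATTS = 1500.0
ENGINE_WATTS = 1.0  # assumption: upper bound taken from "less than one watt"

# Implied power draw per traditional server core.
watts_per_core = SERVER_WATTS / SERVER_CORES

# Minimum power advantage of the in-drive engine for the same throughput.
power_ratio = SERVER_WATTS / ENGINE_WATTS

print(f"{watts_per_core:.1f} W per server core")
print(f"at least {power_ratio:.0f}x less power for the same throughput")
```

Because the engine draws less than one watt, the real ratio would be somewhat higher than the figure computed here.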
Unnecessary data movement is also bad for the environment. A 2018 study sponsored by Google found that 62.7% of computing energy is consumed moving data back and forth between memory, storage, and the CPU in a variety of applications. So computational storage could reduce emissions while improving performance.
And then there’s the looming capacity issue. Cloud workloads and internet traffic have grown 10-16x over the past decade and are likely to grow at this rate or faster in the coming years as AI-powered medical imaging, autonomous robots, and other data-intensive applications advance from concept to commercial deployment.
Unfortunately, servers, rack space, and operational budgets struggle to grow at the same exponential rate. For example, Amsterdam and other cities have imposed strict limits on data center size, forcing cloud providers and their customers to figure out how to do more with the same footprint.
Imagine a traditional two-socket server setup with 16 drives. A typical server could have 64 cores (two processors with 32 cores each). With computational storage, the same server could potentially have 136: 64 server cores and 72 application accelerators tucked into its drives for preliminary work. Multiplied by the number of servers per rack, racks per data center, and data centers per cloud empire, computational drives have the power to increase the potential ROI of millions of square feet of real estate.
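As a rough sketch of that tally, the Python below adds the host cores to the in-drive accelerators. The function name and dictionary keys are made up for illustration; the 72-accelerator figure is the article's total across the server's drives, not a per-drive number.

```python
def processing_elements(sockets: int, cores_per_socket: int,
                        drive_accelerators: int) -> dict:
    """Tally host CPU cores plus in-drive application accelerators."""
    host_cores = sockets * cores_per_socket
    return {
        "host_cores": host_cores,
        "drive_accelerators": drive_accelerators,
        "total": host_cores + drive_accelerators,
    }


# Two 32-core processors plus the 72 accelerators cited for the drives.
server = processing_elements(sockets=2, cores_per_socket=32,
                             drive_accelerators=72)
print(server)
```

Multiplying the total by servers per rack and racks per data center gives the fleet-wide figure, although those counts vary by operator and are not stated here.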
The fine print
So if computational storage is so beneficial, why isn’t it ubiquitous yet? The reason is simple: a confluence of advances, from hardware to software to standards, must come together to make a paradigm shift in processing economically feasible. Those factors are now aligning.
For example, computational storage drives must fit within the same power and space constraints as regular SSDs and servers. That means the compute element can consume only two to three watts of the 8 watts allotted to a drive in a server.
While some early computational SSDs relied on FPGAs, companies like NGD Systems and ScaleFlux are adopting systems-on-chips (SoCs) built around Arm processors originally designed for smartphones. (An eight-core computational drive SoC can use four cores for managing the drive and the rest for applications.) SSDs typically already have quite a bit of DRAM: 1GB for every terabyte in a drive. In some cases, the processing unit can use this as a resource. Manufacturers can also add more DRAM.
In addition, a computational storage drive can support standard cloud-native software stacks: Linux operating systems and containers built with Kubernetes or Docker. Databases and machine learning algorithms for image recognition and other applications can also be loaded into the drive.
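The resource figures above can be combined into a back-of-the-envelope budget for a single drive. This is a minimal sketch: the `DriveBudget` class, the `drive_budget` function, and the 16TB capacity are hypothetical, while the defaults mirror the numbers in the text (eight SoC cores with four reserved for drive management, two to three watts of an 8-watt allotment, 1GB of DRAM per terabyte).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriveBudget:
    app_cores: int             # SoC cores left over for applications
    dram_gb: float             # DRAM implied by the 1GB-per-terabyte ratio
    compute_share_low: float   # compute fraction of drive power, low end
    compute_share_high: float  # compute fraction of drive power, high end


def drive_budget(capacity_tb: float, soc_cores: int = 8, mgmt_cores: int = 4,
                 drive_watts: float = 8.0, compute_watts=(2.0, 3.0),
                 dram_gb_per_tb: float = 1.0) -> DriveBudget:
    """Estimate what one computational drive can spare for applications."""
    low, high = compute_watts
    return DriveBudget(
        app_cores=soc_cores - mgmt_cores,
        dram_gb=capacity_tb * dram_gb_per_tb,
        compute_share_low=low / drive_watts,
        compute_share_high=high / drive_watts,
    )


# Hypothetical 16TB drive; the capacity is an assumption for illustration.
budget = drive_budget(capacity_tb=16)
print(budget)
```

On these assumptions the compute element gets between a quarter and three-eighths of the drive's power allotment.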
Standards also need to be finalized. The Storage Networking Industry Association (SNIA) published its 0.8 specification last year, covering a wide range of topics such as security and configuration; a full specification is expected later this year.
Other innovations to expect: more ML acceleration and specialized SoCs, faster interconnects, improved on-chip security, better software to analyze data in real-time, and tools to merge data from distributed networks of drives.
Over time, we could also see computing power added to traditional spinning disks, which are still the workhorse of cloud storage.
A double-edged edge
Some early use cases will occur at the edge, with the computational drive acting as an edge for the edge. For example, Microsoft Research and NGD Systems found that computational storage drives (CSDs) could dramatically increase the number of image queries that can be performed by processing the data directly on the CSDs, one of the most widely discussed use cases, and that throughput grows linearly with more drives.
Bandwidth-constrained devices, often with low latency requirements, such as airplanes or autonomous vehicles, are another key target. Over 8,000 aircraft carrying over 1.2 million people are in the air at any given time. Machine learning for predictive maintenance can be efficiently performed in-flight with computational storage to increase safety and reduce turnaround time.
Cloud providers are also experimenting with computational drives and will soon start moving to commercial deployments. In addition to offloading tasks from more powerful application processors, computational drives could improve security by scanning for malware and other threats locally.
The alternative?
Some may argue that the solution is obvious: reduce the computational load! Companies collect far more data than they use anyway.
However, this approach ignores one of the unfortunate truths about the digital world. We don’t know what data we need until we already have it. The only realistic choice is to find ways to efficiently handle the massive data onslaught that is coming our way. Computational drives will be a crucial pivot, allowing us to filter the data without getting bogged down in the details. Insights generated from this data can unlock capabilities and use cases that can transform entire industries.
Mohamed Awad is Vice President of IoT and embedded at Arm.