At the start of November I was lucky enough to be invited to the Next Generation Storage Symposium in San Jose, California as well as Storage Field Day #2 on the following two days. There were sessions from upcoming storage vendors as well as a keynote from well-known analyst Robin Harris and some thought-provoking panel discussions about next generation storage technologies. Having spent some time digesting the flood of information and ideas from this trip, there are two trends which ‘emerge out of the haze’ for me:
- Flash storage continues to rapidly disrupt the storage marketplace
- The desire for more scalable systems is driving changes in the architecture of enterprise storage
To people immersed in storage these trends are well known and have been covered increasingly over the last couple of years (flash has been in mainstream arrays in some form since 2008). I’ve written this article to consolidate my own thoughts rather than as the traditional end of year predictions – ‘if you can’t explain it you don’t understand it’ is something I believe and one of the principal reasons I enjoy blogging. Despite my interest in storage these events made me realise I’ve taken my eye off the ball and I’m now playing catch-up!
Most of the vendor and panel sessions at the conference concentrated on the flash aspects – tiered vs cache, should you use a hybrid or all flash array (or no array at all in Nutanix’s case) although there was also discussion of the scalability of various architectures and technologies. Flash’s key advantage is performance – when compared to spinning disk it’s orders of magnitude faster and much of the current innovation is in trying to overcome its other constraints of cost, lifecycle, form factor etc. It’s not just performance that’s driving the current industry changes however – the desire for greater scalability is driving storage from a centralised model to a more distributed architecture (as described very eloquently by Chris Evans).
Combined these factors imply a major shakeup in the storage industry – it’s going to be a fun few years!
Flash disrupts the marketplace
Unless you’ve been hiding under a very big rock you can’t have missed the mass market arrival of flash devices into almost every aspect of the market, both for consumers and enterprises. There are many drivers behind their adoption, from mobile devices to virtualisation with its I/O blender and VDI bootstorms. Ruben Spruijt has written a great primer on where and how flash is being used – read it! In his article he not only summarises the different flash implementations (hybrid arrays, all flash arrays, tiering, caching etc), which I’m repeating here for reference, but also gives the pros and cons of each and a typical use case:
- Flash-cached (performance acceleration of magnetic disks by extending cache)
- Flash-tiered (performance acceleration of magnetic disks by using a tiered approach)
- Hybrid file systems (performance acceleration of magnetic disks by using a tiered approach)
- Flash-Only arrays (using SSDs in a purpose-built all flash array)
- Server side flash (connecting SSD directly to the PCI bus of the server)
- Network flash (combination of flash-only arrays and server side flash)
I spent quite a while mulling over SSD implementations before finding Ruben’s post and categorisations – it could have saved me significant brainpower!
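The difference between the first two categories is easy to blur, so here is a minimal sketch of how I think about it. It is not any vendor’s implementation: the class names, the LRU eviction policy and the hit-count based `rebalance` step are all assumptions made for illustration. The point is only that a cache holds copies (the disk stays authoritative), while a tier moves blocks so each one lives in exactly one place.

```python
from collections import OrderedDict


class FlashCache:
    """Flash holds *copies* of hot blocks; the disk remains authoritative."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict: block id -> data
        self.capacity = capacity      # number of blocks flash can hold
        self.flash = OrderedDict()    # kept in LRU order, oldest first

    def read(self, block):
        if block in self.flash:
            self.flash.move_to_end(block)       # mark as recently used
            return self.flash[block], "flash"
        data = self.disk[block]
        self.flash[block] = data                # populate the cache on a miss
        if len(self.flash) > self.capacity:
            self.flash.popitem(last=False)      # evict; nothing is lost
        return data, "disk"


class FlashTier:
    """Each block lives in one place; hot blocks are *moved* to flash."""

    def __init__(self, disk, capacity):
        self.disk = disk
        self.capacity = capacity
        self.flash = {}
        self.hits = {}                # access counts since the last rebalance

    def read(self, block):
        self.hits[block] = self.hits.get(block, 0) + 1
        if block in self.flash:
            return self.flash[block], "flash"
        return self.disk[block], "disk"

    def rebalance(self):
        """Periodic job (hypothetical): promote the hottest blocks, demote the rest."""
        everything = {**self.disk, **self.flash}
        ranked = sorted(everything, key=lambda b: self.hits.get(b, 0), reverse=True)
        hot = set(ranked[:self.capacity])
        self.flash = {b: d for b, d in everything.items() if b in hot}
        self.disk = {b: d for b, d in everything.items() if b not in hot}
        self.hits.clear()
```

In the cached version usable capacity is just the disk capacity and a flash failure loses nothing, whereas in the tiered version flash adds to usable capacity but a block only becomes fast after the next rebalance has run.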
There are more storage startups than you can shake a stick at and having seen many flash vendors pitch their products* over the last few months it’s clear that this is still an evolving market. There is more clarity now than a year ago about flash products but there’s still enough ‘devil in the detail’ and disputed facts that each solution will have to be evaluated on its own merits. One vendor will claim that dedupe is easy and quality of service is the distinguishing feature while another will argue that effective dedupe is difficult yet critical (do you do inline or post-process?) and quality of service unnecessary if you can scale easily.
*Storage vendors I’ve seen this year either at VMworld, VMUGs or through Storage Field Day include Nimble Storage, Nimbus Data, PureStorage, Tintri, Nutanix, NexSAN, Violin Memory, Whiptail, Virsto, and Fusion-IO. As you can see these guys are mainly startups and they’re on the campaign trail!
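On the inline versus post-process dedupe question, a rough sketch helps me see what the vendors are arguing about. Everything here is an assumption for illustration (fixed chunks, SHA-256 fingerprints, the `InlineDedupe` and `PostProcessDedupe` names); real arrays differ in chunking, hashing and metadata handling. Inline pays the fingerprinting cost on the write path but never stores a duplicate, while post-process accepts writes as-is and reclaims the space later.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Content hash used to recognise identical chunks."""
    return hashlib.sha256(chunk).hexdigest()


class InlineDedupe:
    """Fingerprint every chunk on the write path; duplicates are never stored."""

    def __init__(self):
        self.chunks = {}    # fingerprint -> chunk, each stored once
        self.layout = []    # logical order of the volume, as fingerprints

    def write(self, chunk: bytes):
        fp = fingerprint(chunk)             # cost paid before the write completes
        self.chunks.setdefault(fp, chunk)
        self.layout.append(fp)

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


class PostProcessDedupe:
    """Land writes untouched, then dedupe in a later background sweep."""

    def __init__(self):
        self.landing = []   # raw chunks waiting to be processed
        self.chunks = {}
        self.layout = []

    def write(self, chunk: bytes):
        self.landing.append(chunk)          # fast path: no hashing at all

    def sweep(self):
        for chunk in self.landing:
            fp = fingerprint(chunk)
            self.chunks.setdefault(fp, chunk)
            self.layout.append(fp)
        self.landing.clear()

    def stored_bytes(self):
        landed = sum(len(c) for c in self.landing)
        return landed + sum(len(c) for c in self.chunks.values())


def read_back(store) -> bytes:
    """Rebuild the logical volume from the deduplicated chunks."""
    return b"".join(store.chunks[fp] for fp in store.layout)
```

The trade-off shows up in `stored_bytes()`: the post-process store needs room for the full, undeduplicated data until `sweep()` runs, which matters a lot when the media is expensive flash.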
So is flash the future of storage? At the Next Generation Storage Symposium all nine sponsors were touting flash products and at Storage Field Day five out of nine sponsors were presenting flash storage arrays (either hybrid or all flash) so you’d be forgiven for assuming it is. Some research has thrown doubt on the long term future of flash and there are commercial challenges, but others have rebutted the findings and ROM manufacturer Macronix has just claimed to extend the lifecycle of flash by heating the chips to restore their condition. Whatever the truth, I suspect only time will tell and until then vendors will continue to push flash into their product lines.
The big boys have been slow to respond but they’re getting into gear. EMC’s server based flash product, VFCache, is now in its second generation and their all flash array, codenamed Project X, is due sometime in 2013 (although not everyone is impressed with it). There’s also Project Thunder which seems to be VFCache on steroids – by the time these are all in place they’ll be competing on all fronts with the startups I’ve mentioned previously. NetApp seem slightly unsure about how best to use flash as their stance is that it shouldn’t be considered primary storage. That would mean its only use is for caching and arguably tiering, yet NetApp are famous for saying that tiering is dead (back in 2010!). Their flash offerings today are all based on caching (FlashAccel for server based flash, FlashCache at the controller level and Flash Pools at the aggregate level) so it’ll be interesting to see what project MARS is all about. Dell meanwhile are apparently on the lookout to make an all-flash acquisition.
What about VMware? They admit they’ve not done much with flash yet (apart from swap to SSD in vSphere 5.0) but they’ve also shown that’s going to change. At VMworld this year they showed previews of vFlash which lets you use flash to accelerate specific or generic VM workloads (much like Fusion-IO, FlashSoft or OCZ’s VXL let you do today). I suspect we’ll see mainstream adoption of hypervisor based flash as it offers compelling benefits.
There’s a lot to learn about flash in all its forms and it’s making big changes to the storage world – you may have to unlearn some things you’ve taken for granted previously. One thing’s for sure – just adding SSDs to an existing solution is unlikely to cut the mustard!
Scalability drives changes
As the internet has matured so has the demand for scalable systems to keep up with the potentially worldwide customer base. Companies born out of the Web 2.0 generation like Google, Amazon, Facebook, and Netflix have all built hugely scalable systems but they didn’t use ‘enterprise’ infrastructure to do it – they used commodity hardware (which makes for an interesting read) with specially written applications which assumed failure of the infrastructure (ably demonstrated by the Netflix chaos monkey). These massively scalable systems do not lend themselves to the centralised storage model that enterprises have been using for the last decade or two – they rely on scaling linearly, where distributed resources are crucial. Many of these applications rely on the concept that it’s cheaper to move the computation nearer to the storage than to focus on high speed storage transports, which is what traditional enterprise systems have been doing for at least a decade. MapReduce/HDFS is a great example of this (and Ceph, a newcomer worth watching which aims to alleviate some of HDFS’s scalability limitations). You could argue those companies created the first, although bespoke, ‘software defined’ infrastructures….
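To make ‘move the computation nearer to the storage’ concrete, here is a toy locality-aware scheduler. It is a deliberately naive greedy sketch and not how Hadoop’s scheduler is actually written; the `schedule` function, node names and slot counts are invented for the example. Each task is placed on a node that already holds a replica of its block whenever one has a free slot, and only falls back to reading over the network when none does.

```python
def schedule(block_locations, free_slots):
    """Place one task per block, preferring nodes that hold a replica.

    block_locations: dict of block id -> list of nodes storing a replica
    free_slots:      dict of node -> number of tasks it can still accept
    Returns a dict of block id -> (node, "local" or "remote").
    """
    slots = dict(free_slots)    # work on a copy, leave the caller's dict alone
    plan = {}
    for block, replicas in block_locations.items():
        local = [n for n in replicas if slots.get(n, 0) > 0]
        if local:
            # Computation goes to the data: no block crosses the network.
            node = max(local, key=lambda n: slots[n])
            kind = "local"
        else:
            # No replica holder is free, so the data must travel instead.
            candidates = [n for n in slots if slots[n] > 0]
            if not candidates:
                raise RuntimeError("no free task slots in the cluster")
            node = max(candidates, key=lambda n: slots[n])
            kind = "remote"
        slots[node] -= 1
        plan[block] = (node, kind)
    return plan
```

The fewer ‘remote’ placements a cluster ends up with, the less it depends on a fast shared storage transport, which is exactly the opposite bet to the one a centralised SAN makes.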
The adoption of cloud computing is now driving similar shifts into traditional enterprises and a slew of companies are trying to solve the challenge of scaling the infrastructure without the need to rewrite the applications – enter the ‘software defined datacenter’ offering automation and policy based control. EMC’s Chad Sakac described the concept of software defined very clearly:
the idea of decoupling the control plane from the infrastructure and running that in software on commodity hardware via an open API model
There are those same ideas that fostered the Web 2.0 guys – commodity hardware with the control/intelligence baked into ‘open’ software. Of course whether this ‘software defined’ layer becomes open versus proprietary or even succeeds is very much open to debate given EMC/VMware’s market dominance. Is it in EMC’s interest, as the world’s largest storage company, to cannibalize and disrupt its existing revenue stream from ‘legacy’ storage solutions? Chad touches on this in his post and acknowledges that change is necessary and that as a company they’re ready for it.
Virtualization and, more recently, hyper-converged infrastructure are both enabling new architectures which share some of these Web 2.0 traits. Virtualization has pushed x86 hardware towards being a commodity and, maybe equally importantly, has allowed any application to be moved around dynamically, opening up the opportunity to locate it nearer the storage (à la HDFS). Companies such as Nutanix have incorporated this functionality into a ‘virtualised building block’ which lets you build a modular, scalable cluster where the storage and compute are both virtualised and storage locality is maximised to improve performance. This completely removes any reliance on centralised storage – hence their NoSAN movement (unsurprisingly one of their founders was involved with Google’s distributed filesystem).
Other companies offering similar solutions are Simplivity and Scale Computing, and more recently ScaleIO came out of stealth mode. Steve Foskett has a good blog post detailing the Scale Computing solution and what distinguishes it from the competition. I’m impressed with Nutanix’s offering as it combines both major trends (flash and distributed storage) into a single, modular, shipping product.
VMware’s recent previews of vVols (plus a good article here) and distributed storage are the virtualisation giant’s first forays into the software defined storage concept (discounting VAAI/VASA) but other companies are already working on competing and arguably more advanced products. Virsto launched v2.0 of their ‘storage hypervisor’ a couple of weeks ago, and DataCore’s SANsymphony (now on v9, it’s been around a while!) offers similar functionality but runs on a Windows platform rather than as an appliance. Interestingly these products don’t offer an API so won’t be easy to integrate with.
I wanted to cover what the big guys like EMC, NetApp, HDS, IBM, and HP are doing but there’s so much going on I’d be investigating and writing forever! Certainly they’re not standing still and technologies like NetApp’s cluster mode and EMC’s Isilon already offer a significant ability to scale.
Moving storage closer to the application, along with the growth of converged infrastructure and a renewed focus on modular systems, seems to be the order of the day.
There’s so much going on in the storage world right now that keeping track of it all is a tough task. Writing this article took me much longer than planned because I constantly found interesting articles I wanted to explore and I’ve ended up with more questions than I had to begin with. I’d hoped to create a mental framework for myself which helped me understand the current developments and product announcements and put them in context against one another but there are so many facets that I’ve not succeeded – and I’ve not even touched on object storage, big data, successors to NAND flash, VM centric storage, VMs running directly on storage arrays, cloud storage etc. I’m putting this post ‘out there’ in case it’s of interest to someone else even though it’s ended up being a rambling mess of semi-coherent thoughts!
The panel presentations from the Next Generation Storage Symposium
- Next-Generation Storage: Robin Harris Keynote
- Next-Generation Array Architecture Panel
- Solid State Tiering and Caching Panel
- Engineering Storage for Virtualization and Cloud
- Scaling Storage for the Future
The year of software defined storage? (VMware’s office of the CTO)
SSDs are in your future – how, what, where, when (StorageIO)
Storage Hypervisors – Worth the Hype (Virtualization Practice)