2014-02-24

Myth: SAP HANA runs on cheap hardware

This question has come up in customer discussions quite a few times now.

Many are surprised by the fact that there are a lot more HANA licenses sold than HANA systems installed. Maybe this is one contributing factor!

What am I talking about? Many customers confuse the fact that SAP HANA runs on "Industry Standard Servers", often also called "commodity hardware", with the idea that because it's industry standard or commodity, it's cheap.

Why? More than once I've seen "certain account teams" telling customers that SAP HANA is well worth its cost because it can even run on commodity hardware!

Industry standard (or commodity) means that the hardware is built out of components that multiple providers use, and which, due to increased adoption, see their prices become lower than those of equivalent proprietary hardware.

So, running your applications on industry standard hardware has the benefit of lower hardware investment costs than running them on equivalent proprietary platforms. It also prevents you from getting locked into a vendor, as there are other providers in the market with similar architectures to which you can easily move your applications.

Many customers, after their discussions with those "certain account teams", get the sense that as soon as they buy their HANA license, they'll just need to install it on any spare piece of hardware they have in the datacenter. Later, many are surprised to find that it's not like that.

But what does it mean for SAP HANA to run on industry standard hardware? It means that it uses architectures based on Intel chips, for which there are a dozen providers in the market.

So, why isn't it cheap, and why can't I install it on my existing hardware?

First of all, SAP has chosen to optimize the HANA code for the Intel E7 CPUs, which happen to be Intel's most expensive ones. Why? Because these CPUs offer availability and error protection features that Intel's other CPUs don't, and they also include code for optimizing compression along with increased cache sizes. All of this benefits HANA, as it runs in memory, where any failure might become a serious problem. So, you need Intel E7 based servers.

Also, as HANA loads a lot of data in memory, it consumes a lot of RAM. Considering that some customers still negotiate their Intel servers with 128 or 256 GB of RAM to save some money, imagine buying servers with several terabytes of RAM! Yes, RAM is becoming cheaper, but a 16 or 32 GB DIMM still costs some money!
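
To get a feel for how many memory modules "several terabytes" implies, here is a small back-of-the-envelope sketch. The function name and the 1, 2 and 4 TB targets are my own illustrative assumptions, not sizing guidance from SAP or any vendor.

```python
import math


def dimms_needed(target_ram_gb: int, dimm_size_gb: int) -> int:
    """Number of DIMMs needed to reach at least target_ram_gb of RAM."""
    return math.ceil(target_ram_gb / dimm_size_gb)


# Illustrative targets only (assumption): 1, 2 and 4 TB per server.
for dimm_size_gb in (16, 32):
    for target_tb in (1, 2, 4):
        count = dimms_needed(target_tb * 1024, dimm_size_gb)
        print(f"{target_tb} TB with {dimm_size_gb} GB DIMMs: {count} modules")
```

Multiply the module count by whatever a DIMM costs you today and the "RAM is cheap" argument looks different.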

Further contributing to this is the fact that, up until recently, customers could only run SAP HANA on specially designed appliances. That meant buying hardware dedicated to HANA, so always buying new hardware and building a silo in the datacenter, with the increased operation costs that silos imply. Fortunately that time is over, with the general availability of Tailored Datacenter Integration announced last November 6th.

Then, the most surprising of all is that HANA needs a very fast disk subsystem.

If HANA runs in memory, why the hell does it need a fast disk configuration?

RAM is volatile, so in case of a server failure (and from what I've heard today from some customers, it happens a lot more frequently than I would initially think) all data that you have in memory will be lost. When you are running business critical applications, can you afford this?

No, you cannot. So all HANA data needs to be persisted!
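
To illustrate the general idea (this is not HANA's actual implementation, just a minimal sketch of an in-memory store with a change log), the toy class below keeps its data in a Python dict but appends every change to a log file and forces it to disk before acknowledging the write. `TinyStore` and the file name are hypothetical names I made up for the example.

```python
import os
import tempfile


class TinyStore:
    """Toy in-memory key/value store that logs every change to disk.

    Keys and values must not contain tabs or newlines (a simplification).
    """

    def __init__(self, log_path: str) -> None:
        self.data = {}
        # On startup, rebuild the in-memory state by replaying the log.
        if os.path.exists(log_path):
            with open(log_path, "r", encoding="utf-8") as log:
                for line in log:
                    key, value = line.rstrip("\n").split("\t", 1)
                    self.data[key] = value
        self.log = open(log_path, "a", encoding="utf-8")

    def put(self, key: str, value: str) -> None:
        # Write the change to the log and wait for the disk first...
        self.log.write(f"{key}\t{value}\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # ...and only then apply it in memory.
        self.data[key] = value

    def close(self) -> None:
        self.log.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "changes.log")
    store = TinyStore(path)
    store.put("order-1", "shipped")
    store.close()  # stand-in for a crash: the in-memory dict is gone

    recovered = TinyStore(path)  # a fresh process would do the same replay
    recovered_value = recovered.data["order-1"]
    recovered.close()

print(recovered_value)
```

Notice that every `put` waits on `os.fsync`, so reads come from memory, but the speed of each write is bounded by how fast the disk can take the log entry.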

Running in memory has the benefit that all those reports and other read-intensive operations get an outstanding boost! And with HANA, having the calculation engine address the data directly in memory also lets new predictive, planning and forecasting applications run in real time, because reading and processing all that data directly in RAM, without network and disk access latency, is a whole lot faster.

So you are only left with write operations. So, why do you need such a fast disk system?

How many CPU cores do you have today on the database server of your ERP? 32? 64? 128?

Let's remember for a moment that HANA nodes these days come with either 40 or 80 cores, and that you can have 8, 16 or 32 nodes in a HANA cluster. In the 8-node cluster scenario with 80-core nodes, we are talking about 640 cores! How much IO capacity do you need to ingest the change logs of the write operations performed by those 640 CPU cores?
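
The arithmetic is trivial, but laying out the combinations mentioned above makes the scale obvious. A quick sketch (the function name is mine):

```python
def cluster_cores(nodes: int, cores_per_node: int) -> int:
    """Total CPU cores in a cluster of identical nodes."""
    return nodes * cores_per_node


# Node counts and per-node core counts taken from the paragraph above.
for nodes in (8, 16, 32):
    for cores_per_node in (40, 80):
        total = cluster_cores(nodes, cores_per_node)
        print(f"{nodes} nodes x {cores_per_node} cores = {total} cores")
```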

Today's requirements for disk subsystems point to more than 50k IO operations per second for the log volumes of each HANA node, and a throughput of several hundred MB per second, even up to 1 GB/s in some cases, for the data volumes of each HANA node.
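
If all nodes sit on the same storage system, those per-node figures add up. The sketch below simply multiplies them out; it assumes the worst case where every node hits its peak at the same time, and it takes 1000 MB/s as the upper end of the data volume throughput. Both are my simplifying assumptions, not official sizing rules.

```python
LOG_IOPS_PER_NODE = 50_000         # "above 50k" IOPS per node, from the text
DATA_MB_PER_SEC_PER_NODE = 1_000   # assumed upper bound: "up to 1 GB/s"


def aggregate_storage_demand(nodes: int) -> tuple[int, int]:
    """Return (log IOPS, data MB/s) if all nodes peak at the same time."""
    return nodes * LOG_IOPS_PER_NODE, nodes * DATA_MB_PER_SEC_PER_NODE


for nodes in (8, 16, 32):
    iops, mb_per_sec = aggregate_storage_demand(nodes)
    print(f"{nodes} nodes: {iops:,} log IOPS, {mb_per_sec:,} MB/s data throughput")
```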

If you have no clue what this means, just ask anyone on the storage team of your company. This is a lot.

One thing is certain: it makes no sense to have all that memory and CPU capacity on the system, and then have it waiting for writes to get committed to disk.

Finally, although HANA is a "shared nothing" cluster architecture, in some circumstances the nodes in a cluster need to talk to each other, so the requirement for the network connection of those server nodes today is 10 Gbit/s (although some say that 3 x 1 Gbit ports would work fine). And the fact is that there are tests underway with 40 Gbit network connections.

Conclusion: HANA is a system that requires a performance-optimized hardware configuration to unleash all its potential. Such a hardware configuration will never be cheap.

It will be cheaper than buying an equivalent configuration on proprietary hardware, based for example on the old RISC Unix systems. You may even try to cut its cost by cutting some options regarding availability, management and automation features, but those will bite you in the butt later with increased operations and management costs.

So, my advice is to have this clear in your mind and really study the business case for HANA well, and when you decide to move forward, also consider your hardware budget in the overall calculations.

Some news is coming that will make it cheaper to implement SAP HANA, like "SAP HANA Tailored Datacenter Integration", which gives you a bit more choice when selecting the hardware components to run HANA on. It enables you to use existing hardware in your datacenter (still with certain restrictions) and also to share that hardware with non-HANA systems, avoiding silos and extra operational costs.

Hopefully 2014 will be a year filled with news that makes it even easier to integrate HANA in the datacenter, and makes it possible for more organizations to decide to adopt it.
