2014-02-24

Myth: SAP HANA runs on cheap hardware

This topic has come up in customer discussions quite a few times now.

Many are surprised by the fact that there are a lot more HANA licenses sold than HANA systems installed. Maybe this is one contributing factor!

What am I talking about? Many customers confuse the fact that SAP HANA runs on "industry standard servers", often also called "commodity hardware", with the idea that because it's industry standard or commodity it must be cheap.

Why? More than once I've seen "certain account teams" telling customers that SAP HANA is well worth its cost because it can even run on commodity hardware!

Industry standard (or commodity) means hardware built out of components that multiple providers use, and which, due to increased adoption, ends up cheaper than equivalent proprietary hardware.

So, running your applications on industry standard hardware has the benefit of lower hardware investment costs than running them on equivalent proprietary platforms, and also prevents you from getting locked into a vendor, as there are other providers in the market with similar architectures to which you can easily move your applications.

Many customers, after their discussions with those "certain account teams", get the sense that as soon as they buy their HANA license, they'll just need to install it on any spare piece of hardware they have in the datacenter. Later, many are surprised to find that it's not like that.

But what does it mean, then, for SAP HANA to run on industry standard hardware? It means that it runs on Intel-based architectures, for which there are a dozen providers in the market.

So, why isn't it cheap, and why can't I install it on my existing hardware?

First of all, SAP has chosen to optimize the HANA code for the Intel E7 CPUs, which happen to be Intel's most expensive ones. Why? Because these CPUs offer availability and error-protection features that the other CPUs don't, and they also include instructions for optimizing compression, alongside larger cache sizes, all of which benefit HANA, as it runs in memory, where any failure can become a serious problem. So, you need Intel E7 based servers.

Also, as HANA loads a lot of data into memory, it consumes a lot of RAM. Considering that some customers still negotiate their Intel servers down to 128 or 256 GB of RAM to save some money, imagine buying servers with several terabytes of RAM! Yes, RAM is getting cheaper, but a 16 or 32 GB DIMM still costs real money!
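Just to make the order of magnitude concrete, here is a minimal back-of-envelope sketch; the DIMM size and price are my own illustrative assumptions, not vendor quotes:

```python
# Back-of-envelope check: how many DIMMs does a multi-terabyte HANA node need,
# and what does the memory alone roughly cost? Prices are illustrative
# assumptions, not quotes.

def memory_cost(node_ram_gb, dimm_size_gb=32, price_per_dimm=400.0):
    """Return (number of DIMMs, estimated memory cost) for one node."""
    dimms = -(-node_ram_gb // dimm_size_gb)   # ceiling division
    return dimms, dimms * price_per_dimm

for ram in (256, 1024, 2048, 4096):           # GB of RAM per node
    dimms, cost = memory_cost(ram)
    print(f"{ram:>5} GB node -> {dimms:>3} x 32 GB DIMMs, ~{cost:,.0f} (currency units)")
```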

Further contributing to this is the fact that, until recently, customers could only run SAP HANA on specially designed appliances, meaning they had to buy hardware dedicated to HANA, always new, building a silo in the datacenter, with all the increased operating costs that silos imply. Fortunately that time is over with the general availability of Tailored Datacenter Integration, announced last November 6th.

Then, the most surprising of all: HANA needs a very fast disk sub-system.

If HANA runs in memory, why the hell does it need a fast disk configuration?

RAM is volatile, so in case of a server failure (and from what I've heard from some customers, that happens a lot more frequently than I would have thought) all the data you have in memory is lost. When you are running business-critical applications, can you afford that?

No, you cannot. So all HANA data needs to be persisted!

Running in memory has the benefit that all those reports and other read-intensive operations get an outstanding boost! And with HANA, having the calculation engine address the data directly in memory also allows new predictive, planning and forecasting applications to run in real time, as reading and processing all that data directly in RAM, without the network and disk access latency, is a whole lot faster.

So you are only left with write operations. Why, then, do you need such a fast disk system?

How many CPUs do you have today on your ERP's database server? 32? 64? 128?

Let's remember for a moment that HANA nodes these days come with either 40 or 80 cores, and that you can have 8, 16 or 32 nodes in a HANA cluster. In an 8-node cluster of 80-core servers, we are talking about 640 cores! How much IO capacity do you need to ingest the change logs of the write operations performed by those 640 CPU cores?

Today's requirements for the disk sub-system point to over 50k IO operations per second for the log volumes of each HANA node, and a throughput of several hundred MBytes per second, in some cases up to 1 GB/s, for the data volumes of each HANA node.
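To make those per-node figures tangible, here is a small sketch that simply scales them linearly across a scale-out cluster; the linear scaling is my own simplification, not an official sizing rule:

```python
# Rough aggregation of the per-node figures above across a scale-out cluster,
# assuming requirements grow linearly with the node count (a simplification).

LOG_IOPS_PER_NODE = 50_000      # log volume, IO operations per second
DATA_MBPS_PER_NODE = 1_000      # data volume throughput, worst case ~1 GB/s

def cluster_storage_demand(nodes, cores_per_node=80):
    return {
        "cores": nodes * cores_per_node,
        "log_iops": nodes * LOG_IOPS_PER_NODE,
        "data_gbps": nodes * DATA_MBPS_PER_NODE / 1_000,
    }

for n in (8, 16, 32):
    d = cluster_storage_demand(n)
    print(f"{n:>2} nodes: {d['cores']} cores, "
          f"{d['log_iops']:,} log IOPS, {d['data_gbps']:.0f} GB/s data throughput")
```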

If you have no clue what this means, just ask anyone on the storage team of your company. This is a lot.

One thing is certain: it makes no sense to have all that memory and CPU capacity on the system, and then have it waiting for writes to get committed to disk.

Finally, although HANA has a "shared nothing" cluster architecture, in some circumstances its nodes need to talk to each other, so the requirement for the network connection of those server nodes today is 10 Gbit/s (although some say that 3 x 1 Gbit ports would work fine). And there are already tests underway with 40 Gbit network connections.
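As a rough illustration of why the link speed matters, here is a toy calculation of how long it takes to ship a given amount of data between nodes at those speeds; the 50 GB figure is just an example, and protocol overhead and latency are ignored:

```python
# Time to move a given amount of inter-node traffic at different link speeds.
# Sizes are made-up examples; real traffic depends entirely on the workload.

LINKS_GBIT = {"3 x 1 GbE": 3, "10 GbE": 10, "40 GbE": 40}

def transfer_seconds(data_gb, link_gbit):
    # 1 GB = 8 Gbit; overhead and latency ignored for simplicity
    return data_gb * 8 / link_gbit

for name, speed in LINKS_GBIT.items():
    print(f"{name:>10}: {transfer_seconds(50, speed):6.1f} s to move 50 GB between nodes")
```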

Conclusion: HANA is a system that requires a performance-optimized hardware configuration to unleash all its potential. Such a hardware configuration will never be cheap.

It will be cheaper than buying an equivalent configuration on proprietary hardware based, for example, on the old RISC Unix systems. You may even try to cut its cost by dropping some options regarding availability, management and automation features, but those cuts will bite you later with increased operations and management costs.

So, my advice is to keep this clear in your mind and really study the business case for HANA well, and when you decide to move forward, also include your hardware budget in the overall calculations.

Some news is coming that will make it cheaper to implement SAP HANA, like "SAP HANA Tailored Datacenter Integration", which gives you a bit more choice when selecting the hardware components to run HANA on, enables you to use existing hardware in your datacenter (still with certain restrictions), and allows sharing that hardware with non-HANA systems, avoiding silos and extra operational costs.

Hopefully 2014 will be a year filled with news that makes it even easier to integrate HANA into the datacenter, and makes it possible for more organizations to decide to adopt it.

2014-02-20

SAP HANA Productive, virtualized with VMware

2014-05-06 update: productive support has just been announced! Check my new blog post about it at: http://sapinfrastructureintegration.blogspot.com/2014/05/sap-hana-productive-on-vmware-is.html

Has EMC gone productive with a SAP HANA instance virtualized on VMware? Yes it has!

The question that follows is how and why, since SAP hasn't yet released productive support for SAP HANA virtualized.

The first thing to realize is that EMC has done it without SAP's support!

Is this a big issue? No, it's not! Anyone who has worked either as a consultant or in operations management has done something "unsupported" by SAP at some point in their life.

Why? Because SAP support is by nature conservative. And if you had multiple contracts with SLAs on resolution time, you would be too. So, when SAP supports something, it always means it has been sufficiently tested, and any problems that arise can be easily tracked and solved in due time.

Any experienced consultant knows that "not being supported" doesn't mean it won't work. It means it carries risks, may fail, and if "shit happens" you are on your own, so you'd better have a backup plan.

Why has EMC done it? From my personal perspective, for a number of valid reasons:
  1. Because EMC owns VMware, and no one is better placed than the owner of the technology to take the first step and prove it;
  2. Because EMC has embraced the principle of "proven IT", meaning, "we want to test on our own internal IT the technologies and the concepts we sell to our customers, so that we have real-world experience we can share";
  3. Because SAP has said it themselves: to support productive operations of SAP HANA virtualized, they feel further experience managing it is needed. So, what is better than having a customer take the risk of testing it and gathering the knowledge?
So, for EMC this is not a big risk. EMC IT is closely supported by VMware, they have run SAP HANA in a TDI model for some time, and through that process have gained relevant knowledge of HANA behavior and operations.

Also, the experience of running SAP HANA virtualized for non-productive use is already quite extensive; for example, SAP's own internal private cloud for the Education department has deployed HANA virtualized to support training needs for quite a few years now. So if you take, for example, the courses HA100 or HA200, you'll work on a test system deployed on VMware. Other large customers have also been doing this for some time.

EMC moving forward with the announcement of a running productive instance of SAP HANA virtualized with VMware just confirms that HANA can run virtualized with minimal performance impact, and with significant benefits in terms of availability and cost reduction (both in setup and in operations).

So, what is making SAP take so long to announce this widely expected support? I've heard comments about a wide variety of reasons, but in fact only SAP can say what is still missing.

The fact is that EMC has been productive since last November, and many customers have been running it for non-production for years now. For smaller installations, virtualizing HANA would already be a great solution today: it would boost the service provider business, for example, by making it possible to deploy smaller HANA instances (up to 1 TB), which fit many smaller customers' needs, and so boost HANA adoption.

On my side, I'm preparing to have some fun installing a HANA SP7 system on a VM and playing around with vMotion, backups and other stuff, just to confirm how easy it is to get that functionality working.

If you want to know more about running SAP HANA virtualized, let me suggest the following resources:
·         EMC story regarding virtualizing SAP HANA: https://community.emc.com/docs/DOC-33197
·         SAP HANA guidelines for being virtualized with VMware vSphere: http://www.saphana.com/docs/DOC-4192
·         SAP HANA Virtualized overview (roadmap and highlights): http://www.saphana.com/docs/DOC-3334
·         Joint SAP/VMware presentation at the SAP TechEd Amsterdam 2013 on SAP HANA Virtualization: http://www.vmware-sapteched2013.com/ITM136.pdf 

2014-02-13

The orchestration tool for the SAP Private Cloud – my review of SAP NW LVM 2.0


Is there really a case for integrating SAP Netweaver Landscape Virtualization Management as an orchestration tool for large SAP customers?

My belief is YES. Although there are intense discussions around the benefits of the multiple orchestration tools in the market, potential is one thing, and "out of the box functionality" is another.

Business/IT alignment has been a widely discussed topic, with pages and pages of blogging and experience sharing, mostly focusing on the IT governance layer.

But many of these discussions become so high-level that they forget the reality of the datacenter. Since poor communication between infrastructure and application teams is a key problem in many IT organizations, tools that bring transparency to the "connection points" between the various teams can be a key contributor to reducing inefficiency across the multiple dimensions of IT operations.

On the other hand, when thinking of tools, it's also critical to think about who their intended users are. LVM doesn't aim to replace any of the other teams' tools (virtualization console, storage console, network console, etc.), but rather to provide a tool that SAP system admins didn't have until today, one that gives them automation capabilities and visibility into infrastructure components on which they previously had no information.

So, LVM gives the SAP technical teams increased visibility into the infrastructure, lets them perform certain regular maintenance tasks in a more autonomous and automated way, and gives them added capacity to fulfil their role of ensuring SAP systems availability and performance, through better communication with the underlying infrastructure teams, with everyone seeing the systems through the same set of metrics.

Finally, bridging the communication between the networking, storage, server and application teams is key to enabling the private cloud operating model, and to running SAP better, fully automated, in an industrialized way.

Over the next paragraphs, I’ll be sharing my key takeaways from my training today on LVM 2.0, as well as what I’ve found most interesting to consider as potential use cases for LVM based on my own SAP Operations Management experience.
 

Brief summary of some of the interesting features in LVM 2.0

Going through the details of TechEd session ITM160, and with a virtual lab just for me for the full day, I took the opportunity both to take some notes on what I liked most in the new LVM 2.0 and to document certain usage scenarios I find interesting.

First, a summary of the LVM 2.0 features I liked the most:

·         Possibility to integrate custom scripting, for example when provisioning a system from a backup/restore, with LVM afterwards picking up that restore to continue the provisioning;

·         Ability to manage in LVM systems provisioned through the SAP Cloud Appliance Library on public clouds, with LVM as the single orchestration tool for start/stop and other mass operations, and for control of all SAP systems within a company;

·         Ability to organize systems and resources in multi-level pools and containers, enabling an easier organization of the landscapes and reducing the risk of human error in mass operations;

·         In this new version, on top of the existing systems visualization, storage visualization has been added, mapping disk pools down to the physical devices. A great tool for performance troubleshooting and capacity planning;

·         Possibility to configure 3rd party apps to be viewed in LVM, bringing into LVM, for example, infrastructure consoles with relevant data on the infrastructure specific to SAP systems, making a "never seen before" level of infrastructure information available to the SAP systems administration teams;

·         Possibility to customize the post-copy automation, down to adding customer-specific actions and completely tailoring the system setup, making the refresh of QA systems a predictable and fully automated process.
 

Future features and my view on their benefits / use cases

There were also some previews of upcoming / desired features for the future of LVM, which I found very interesting, as well as where I see interesting potential:

·         Inclusion of HANA task lists for system clone / copy / refresh:

o   As customers adopt a private cloud operating model, HANA datacenter practices will need to further integrate with the private cloud specifics, and being able to deploy HANA systems through the simple processes of LVM will be a great add-on!

·         Use of the SAP Cloud Appliance Library also for the private cloud:

o   The SAP Cloud Appliance Library today is only available to service providers, and enables them to offer a menu of pre-defined SAP system images that customers can easily deploy. Imagine that you are starting a CRM project and need to quickly deploy a test system for your consultant to start testing. Having this option in the private cloud, with LVM as the management tool, will be a great reinforcement of the benefits of adopting a cloud operating model for internal IT;

·         Enhancements to system visualization:

o   One of the key challenges when virtualizing SAP systems is that SAP systems teams completely lose track of where those systems are, and the tendency of infrastructure teams to "over-provision" and abuse "pooling and sharing" practices leads systems to degraded performance over time. It then becomes a nightmare to manage performance! On one side the infrastructure teams say everything is all right (according to their standards, which may not suit the needs of SAP applications), and on the other side the SAP teams have no tools to argue. End-to-end system visibility, from the application down to the disk, is an outstanding tie-breaker in these situations, and will be key in accelerating virtualization adoption for productive SAP systems, being particularly important in the future reality of SAP HANA virtualized.

·         System provisioning using backup / restore:

o   Having LVM also able to launch backup and restore jobs makes complete sense. It's no accident that so many SAP customers on Oracle databases used BRtools. I don't know how far this functionality will go, but if the goal is to be able to provision systems from previous backups, why not extend it a bit further and have a central tool for backup management across the whole SAP landscape? To me it makes sense from the perspective of LVM being a tool for the most experienced SAP resources in customers' organizations, and a central point for infrastructure management. Also, for service providers moving towards a cloud operating model for their internal systems, it may be a good evolution.
 

Some images from my tour of LVM 2.0

Having the system just for me to play with for a full day gave me time to go a bit beyond the scenarios presented in SAP TechEd session ITM160.

Having worked for many years as a system admin, Basis team leader and operations manager, the thing I loved the most was the configuration of the "post-processing task list"!

Man, I do remember all the hours I've spent doing these tasks!

I’ve heard some consultants claiming that this functionality will hurt severely their number of consulting hours. Well, on my perspective, doing it a couple of times in different platforms is fun, but repeating exactly the same thing over and over again… I’d rather spend my time improving other aspects of the landscape than keep repeating one thing I already knew from memory.

I’m all for automation! Once you’ve learned something new, if it is to be repeated, automate it and move forward! I believe one of the key challenges today in managing SAP systems (and IT in general) is the excessive manual work.

So, here is a look at transaction STC01 on a managed system – Maintain Post Processing task list:
You can change, add or delete tasks, as well as customize (through variants) a big part of them.
Note that today these task lists are delivered both for Business Suite and for BW!

Another cool thing is the easy navigation and look of the overall LVM user interface.

LVM 2.0 – Monitoring > Activities


In the above image you have a summary of all activities, and you can then drill down and check the details of each activity.


LVM 2.0 – Monitoring > Activities > Steps (detail of the steps of a system copy)



I really liked the fact that it keeps a log of the time spent on each of the tasks within an activity. This is fantastic for planning future activities.

Working as an operations manager, one key question I always faced when performing planned maintenance tasks was how long they would take. Now you can get that easily from the records in the system!
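As an illustration of what I mean, here is a small sketch of how those recorded durations could feed a maintenance window estimate; the step names and durations below are made up, not pulled from LVM:

```python
# Planning a maintenance window from historical per-step durations.
# The numbers are hypothetical; in practice you would export the durations
# that LVM logged for past activities.

from statistics import mean

history = {                       # minutes per step, over the last few runs
    "stop system":      [12, 15, 11],
    "system copy":      [95, 110, 102],
    "post-copy tasks":  [40, 38, 45],
    "start and checks": [18, 20, 17],
}

def planned_window(history, safety_margin=1.2):
    """Sum the worst observed duration per step, plus a safety margin."""
    worst_case = sum(max(times) for times in history.values())
    return worst_case * safety_margin

print(f"Average past run: {sum(mean(t) for t in history.values()):.0f} min")
print(f"Planned window:   {planned_window(history):.0f} min")
```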

The image below gives you an idea of the activities you can perform with a system. Note that the buttons change depending on the system.

LVM 2.0 – Provisioning > System and AS Provisioning (view of available options for a productive system)




Regarding the options available, note in the example below that the "refresh" option only shows up for a QA system related to an existing productive system.

LVM 2.0 – Provisioning > System and AS Provisioning (view of available options for a quality assurance system)



I’ve found the dashboards also very usefull. When you are managing dozens of systems, unless you have some automation, its likely that once and a while some of the non-productive systems to be down on the beginning of the day. Many organizations do not implement monitoring for non-productive systems, and its costly to have a human resource (even if it’s a junior consultant) logging on to all systems to check that they are online every single morning (yes, I’ve seen it happening in more than one organizations…).

LVM 2.0 – Overview > Dashboard


As I said before, one key value for me is the end-to-end visualization. In this demo I could only see separately the relationship between the disk volumes and the underlying layers, but not the end-to-end relationship from the SAP system down to the disk. So, I'm curious to see the demos we will be building internally to check whether that is possible. If not, it's a "must" addition for future releases.

LVM 2.0 – Overview > Visualization > Storage Managers


One thing you already have is the relationship between the SAP systems and the hosts they are running on. I do believe, though, that it would be useful to have the full picture of all SAP systems on a single physical host.

LVM 2.0 – Overview > Visualization > Pools


Another great feature is the mass operations. Imagine a datacenter maintenance on a weekend, and you have hundreds of SAP systems. Do you know how many resources and hours you need just for the stop/start if you don't have it all scripted already? And even if you have, in larger environments it's possible that some systems have changed and are not captured by the scripts. This is definitely a major "human cost" saver in IT operations.

LVM 2.0 – Automation > Tasks (overview of an example of a planned mass operation – non-prod weekly restart)


Although I didn’t have the opportunity to test it (as my systems had no workload, even if I’ve launched some SGEN jobs to try and get it busy), this feature is quite interesting for very large systems. Having LVM starting new dialog instances or changing operation modes in case response times go over certain thresholds is definitely something usefull. Again, it’s all about automation and having the system self-adjust instead of waiting for the users to complain.

LVM 2.0 – Automation > Capacity Management (overview of systems available)


Summary

Definitely, from my perspective, SAP NW LVM, with the enhancements of version 2.0, is becoming a very interesting tool for enabling the operation of SAP systems in the private cloud.

Some numbers I saw a few days ago pointed out that hardware usually accounts for less than 10% of the IT infrastructure budget these days, with the biggest chunk being IT staff and consulting.

This makes absolutely no sense.

IT came with the promise of automating companies' processes, so it doesn't make sense that IT itself is managed in a manual, ad-hoc way.

IT has reached a level of complexity where managing operations manually completely blocks the rapid adaptation and real-time response to disruptions that the business needs.

The “private cloud” operating model is all about having an industrialized automated datacenter.

And LVM will be a key tool enabling it for SAP system landscapes, due to both its out-of-the-box functionality and its tight integration with the reality of SAP systems.

Increased integration led by SAP partners to provide additional functionality to LVM will only increase its value; one example is all the integrations already delivered at the storage level for snapshots, cloning, etc., with LVM as the single front-end for all SAP systems operations automation.

Learn more about EMC integration with SAP NW LVM at the SCN page, the support page or the community site.
Learn more about cloud computing by reading the NIST definition of cloud computing and its reference architecture.
Learn more about SAP NW LVM 2.0 at SCN's page on Virtualization and Cloud Infrastructure.