SAP HANA interview questions and answers
SAP HANA 1.0 went GA this week (SAP-speak for Generally Available), and it seemed like a good time to collect all the facts about SAP's new in-memory business and put them in one place. I hope you enjoy it. It's not an official SAP FAQ, but it is information pulled together from industry experts at various organizations.
1. HANA Overview
1.1 What are the product names?
The short answer is: it's a mystery. SAP has changed them around a lot and now they call it SAP HANA Appliance, SAP HANA Database and SAP HANA Studio. Applications built on HANA will be marked "powered by SAP HANA". Probably they will change it all again.
1.2 What is SAP HANA Appliance 1.0?
SAP HANA 1.0 is an analytics appliance that consists of certified hardware, an In Memory DataBase (IMDB), an Analytics Engine and some tooling for getting data in and out of HANA. You build the logic and structures yourself, and use a tool, e.g. SAP BusinessObjects, to visualise or analyse data.
1.3 What are the limitations of HANA 1.0?
Quite a few so far - it can only replicate certain data, from certain databases, in certain formats, using the Sybase Replication Server. Batch loading is done using SAP BusinessObjects Data Services 4.0, and HANA is optimised only for SAP BusinessObjects BI 4.0 reporting.
1.4 What is SAP HANA 1.5, 1.2 or 1.0 SP03?
These are all the same thing, and 1.0 SP03 is touted to be the final name for what should go into RampUp (beta) in Q4 2011. This will allow any SAP NetWeaver BW 7.3 Data Warehouse to be migrated into a HANA appliance. HANA 1.0 SP03 specifically also accelerates BW calculations and planning, which means you get even more performance gains.
1.5 What's the difference between HANA and IMDB?
HANA is the name for the current BI appliance (HANA 1.0) and the BW Data Warehouse appliance (HANA 1.0 SP03). Both of these use the SAP IMDB Database Technology (SAP HANA Database) as their underlying RDBMS. Expect SAP to start to differentiate this more clearly as they start to position the technology for use cases other than Analytics.
1.6 If I can run NetWeaver BW on IMDB/HANA, why can't I run the Business Suite/ERP 6.0?
Simply because it's not mature enough yet to support business critical applications. From a technology perspective, it is already possible to run the Business Suite on IMDB and SAP has trialled moving some large databases into IMDB already.
1.7 What is HANA great at?
The best thing that HANA brings to the table is the ability to aggregate large data volumes in near real-time - and to have the data updated in near real-time. SAP's demos show hundreds of billions of records of data being aggregated in a matter of seconds. SAP has built a set of Analytics Apps on top of HANA and these are set to be great point use cases to get customers up and running quickly.
1.8 What is HANA bad at?
There are some current issues around HANA when delivering ad-hoc analytics, especially when using the SAP BusinessObjects Webi tool. Essentially the problem is that you can ask computationally very difficult questions with Webi, which can cause very long response times with HANA. SAP will need to build optimization for both Webi and HANA to reduce the computational complexity of these questions, but they're not there yet.
What's more, it's worth noting that HANA 1.0 is not a Data Warehouse and it is more of a Data Mart - that is, suited to point applications where there is a clear use case.
1.9 What does HANA cost?
SAP hasn't entirely confirmed HANA licensing costs but the hardware is somewhere around $1-200k per TB. Add to this the licensing costs, which are still being set on a per-customer basis.
1.10 Why is HANA so fast?
Regular RDBMS technologies put the information on spinning plates of iron (hard disks) from which the information is retrieved. HANA stores information in electronic memory, which is some 50x faster (depending on how you calculate). HANA stores a copy on magnetic disk, in case of power failure or the like. In addition, most SAP systems have the database on one system and a calculation engine on another, and they pass information between them. With HANA, this all happens within the same machine.
1.11 Does HANA/IMDB replace Oracle?
It's the elephant in the room, but once the Business Suite runs on IMDB, Oracle won't be needed any more by SAP customers who purchase HANA. This doesn't affect anything in the short term because those people buying HANA today will still need an Oracle ERP system.
1.12 What is this about 10:1 compression with HANA compared to Oracle?
A typical uncompressed Oracle or Microsoft SQL Server database, when put into HANA, will be 10x smaller than before, because HANA stores information in a compressed format. Note that most databases are now compressed already, so these numbers may not fit your scenario. On top of this, you need RAM equal to 2x the size of your HANA database, plus room for growth. HANA sizing is still a dark art.
1.13 You mean I have to buy a HANA only 2.5x smaller than my big Oracle RDBMS? What about archiving and data ageing?
Yes, in some instances you may have to buy a HANA appliance that is only 2.5x smaller than it would be under Oracle. And data ageing isn't part of the 1.0 release, but SAP is certainly working on it pretty hard. Let's hope they release something before you need to buy a bigger HANA appliance!
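The arithmetic behind 1.12 and 1.13 can be sketched roughly as follows. This is only an illustration of the rule of thumb above (compressed size times 2x RAM), not SAP sizing guidance. The function name is made up, the 5:1 ratio for an already-compressed source is an assumption, and room for growth is left out.

```python
def hana_ram_tb(source_db_tb: float, compression_ratio: float,
                ram_factor: float = 2.0) -> float:
    """Rough RAM estimate in TB for a source database of a given size.

    compression_ratio is how many times smaller the data becomes in HANA
    (10 for the uncompressed case quoted above; lower if the source is
    already compressed). ram_factor is the "2x the RAM" rule of thumb.
    Growth headroom is NOT included.
    """
    if source_db_tb < 0 or compression_ratio <= 0 or ram_factor <= 0:
        raise ValueError("sizes and ratios must be positive")
    compressed_tb = source_db_tb / compression_ratio
    return compressed_tb * ram_factor


# Uncompressed 10TB source at 10:1 compression.
print(hana_ram_tb(10, 10))  # 2.0

# Already-compressed 10TB source, assuming only 5:1 compression:
# 4TB of RAM, i.e. an appliance just 2.5x smaller than the source.
print(hana_ram_tb(10, 5))   # 4.0
```

The second call shows where a figure like "only 2.5x smaller" can come from: halve the compression ratio and the 2x RAM rule eats most of the saving.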
1.14 What's the wider market opportunity for IMDB?
This is the interesting thing - no one knows yet, and few analysts seem to have cottoned on that the wider market opportunity might be huge. Think not just SAP applications but any third party that requires ultra-high speed. Think not just an appliance but a development platform. Time will tell.
2. SAP HANA database hardware
2.1 What hardware is supported right now?
Talk to your hardware vendor - all of the major vendors, e.g. HP, IBM and Dell, have HANA offerings now. Technically HANA will run on any Intel x64-based system, from your laptop through to the big 40-core, 2TB RAM servers. It is, however, only supported on a small number of big rack-mount servers like the Dell R910 and HP DL980.
2.2 Why doesn't HANA run on blades?
It's unclear but probably because the blades don't yet offer the same performance. HANA is optimized for the Intel X7560 CPU and will run fastest on this. And for instance, the Dell M910 blade can only run 2x X7560 CPUs and 512GB RAM in this configuration, which probably explains the limitations. What's certain is that HANA will eventually run on blades - it's born to run on blade technology!
2.3 Does SAP make their own IMDB/HANA hardware?
Yes, but only in the labs so far. There are no public plans to compete against IBM/HP/Dell in this space, but it may make sense for SAP to enter the appliance market, especially in the context of Data Centres and even more so in the context of the SAP Business ByDesign cloud offering, which will run on IMDB.
2.4 How big does HANA scale?
Theoretically at least - very well. The biggest single-server HANA hardware will run most mid-size workloads - 2TB of in-memory storage is equivalent to 5-20TB of Oracle storage. The way that HANA works means that it is possible to chain multiple systems together, so scalability has thus far been determined by the size of customers' wallets. Do note that whilst SAP talks up "Big Data" quite a lot, HANA currently only scales to the small end of Big Data, a term that refers to the kind of huge datasets that Facebook or Google have to store - not Terabytes, but rather Petabytes.
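To make the scaling numbers a little more concrete, here is a small sketch. It assumes the 2x RAM rule of thumb from section 1.12 and a 2TB server as quoted above. The function names are invented for illustration, and the idea that data spreads evenly across chained servers with no per-server overhead is a simplifying assumption, not a statement about how HANA distributes data.

```python
import math


def source_equivalent_tb(ram_tb: float, compression_ratio: float,
                         ram_factor: float = 2.0) -> float:
    """Source database size (TB) that a given amount of RAM could hold.

    Half the RAM holds compressed data (ram_factor = 2), and each TB of
    compressed data stands for compression_ratio TB at the source.
    """
    return ram_tb / ram_factor * compression_ratio


def servers_needed(ram_needed_tb: float, ram_per_server_tb: float = 2.0) -> int:
    """Number of chained servers, assuming data splits evenly (an assumption)."""
    return max(1, math.ceil(ram_needed_tb / ram_per_server_tb))


# A 2TB server at 5:1 and 20:1 compression gives the 5-20TB range above.
print(source_equivalent_tb(2, 5))   # 5.0
print(source_equivalent_tb(2, 20))  # 20.0

# 5TB of RAM needed on 2TB servers.
print(servers_needed(5))            # 3
```

Read this way, the 5-20TB range corresponds to compression ratios of roughly 5:1 to 20:1, which is an inference from the figures rather than something SAP has stated.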
2.5 What storage subsystem does HANA use?
This varies from vendor to vendor but it is shared network attached storage (NAS). Both regular magnetic disks and SSD storage can be used for the backup of the database (remember that HANA runs in memory, so disk storage is just for backup and, later, for data ageing). Note that you require 2x as much storage as you have RAM, and RAM is itself 2x the database size - i.e. storage size = 4x database size. In most cases there is additional ultra-high speed SSD storage for log files.
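The 4x rule above works out as below. This is a minimal sketch: the function name is hypothetical, "database size" is taken to mean the compressed size inside HANA (an assumption), and the extra SSD storage for log files is not counted.

```python
def hana_storage_tb(database_tb: float, ram_factor: float = 2.0,
                    disk_factor: float = 2.0) -> dict:
    """Return RAM and disk estimates (TB) for a given HANA database size.

    RAM  = ram_factor  x database size  (the 2x RAM rule)
    disk = disk_factor x RAM            (the 2x storage rule)
    Log-file SSD storage is vendor specific and excluded here.
    """
    if database_tb < 0:
        raise ValueError("database size must not be negative")
    ram_tb = database_tb * ram_factor
    disk_tb = ram_tb * disk_factor
    return {"ram_tb": ram_tb, "disk_tb": disk_tb}


# A 1TB HANA database: 2TB of RAM and 4TB of disk.
print(hana_storage_tb(1))  # {'ram_tb': 2.0, 'disk_tb': 4.0}
```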
3. Technical FAQ
3.1 What source databases does HANA support in real-time?
If you use Sybase Replication Server (SRS) for near real-time data then you need to watch out for licensing still (SAP have license deals pending). If you run DB2 then you're fine but with Oracle and Microsoft SQL Server there are some license challenges if you buy your license through SAP, because you may have a limited license that does not allow extraction. Talk to SAP for further information on this.
3.2 What source databases does HANA support for batch loads?
If you use SAP BusinessObjects Data Services 4.0 for bulk loads then pretty much anything. BO-DS is a very flexible Extract, Transform & Load tool that supports many databases - check out the specs for more details.
3.3 What additional limitations does Sybase Replication Server present?
SRS has additional restrictions which are worth bearing in mind. It can only replicate Unicode data and does not support IBM DB2 compressed tables.
4. Follow-ons, corrections & credits
This is a work in progress, and your help would be really useful for the wider market: correcting me, clarifying things I may not have explained so well, or even just asking a question that I haven't covered. Let me know and I'll expand this as the months go on!