Hadoop Integration with OBIEE 11g

OBIEE 11g is Oracle software used to generate reports and dashboards. Earlier versions of OBIEE did not support Hadoop. The software now integrates with Hadoop, so we can generate reports over large datasets using OBIEE 11g.

MapReduce jobs are typically written in Java, but Hive can make this simpler:

1) Hive is a query environment over Hadoop/MapReduce that supports SQL-like queries

2) The Hive server accepts HiveQL queries via HiveODBC or HiveJDBC and automatically creates MapReduce jobs against data previously loaded into the Hive tables on HDFS

3) This is the approach used by ODI and OBIEE to gain access to Hadoop data

4) It allows Hadoop data to be accessed just like any other data source (sort of…)

5) Exalytics, through in-memory aggregates and an InfiniBand connection to Exadata, can analyze vast (structured) datasets held in relational and OLAP databases

6) Endeca Information Discovery can analyze unstructured and semi-structured sources

7) The InfiniBand connector to the Big Data Appliance, together with the Hadoop connector in OBIEE, supports analysis via MapReduce

8) Oracle R Distribution together with Oracle R Enterprise supports SAS-style statistical analysis of large data sets, as part of the Oracle Advanced Analytics Option

9) OBIEE can access a Hadoop data source through Hive, and use its in-memory cache to speed up Hive queries
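
To make points 1 and 2 more concrete, here is a minimal Python sketch of what a SQL-like aggregate roughly turns into as a map phase and a reduce phase. It is only a toy model of the idea, not what Hive actually generates. The `sales` table, its rows, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows standing in for a Hive table sales(region, amount).
ROWS = [
    ("EMEA", 120),
    ("APAC", 80),
    ("EMEA", 30),
    ("AMER", 200),
    ("APAC", 20),
]

# The kind of HiveQL a reporting tool could send instead of hand-written Java.
HIVEQL = "SELECT region, SUM(amount) FROM sales GROUP BY region"


def map_phase(rows):
    """Emit one (key, value) pair per input row, as a mapper would."""
    for region, amount in rows:
        yield region, amount


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's values into a single total, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def run_query(rows):
    """Chain the three steps that the GROUP BY query conceptually needs."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    print(HIVEQL)
    print(run_query(ROWS))
```

The one-line HiveQL string replaces all three hand-written functions, which is the simplification Hive offers to tools such as OBIEE.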

Importing Hadoop/Hive Metadata into the RPD:

1) The HiveODBC driver has to be installed in the Windows environment, so that the BI Administration tool can connect to Hive and return table metadata

2) Import as an ODBC data source, then change the physical database type to Apache Hadoop afterwards

Set up ODBC Connection at the OBIEE Server:

1) OBIEE ships with HiveODBC drivers, but you need to use the 7.x versions (only Linux is supported)

2) Configure the ODBC connection in odbc.ini; the name needs to match the ODBC name used in the RPD

3) The BI Server should then be able to connect to the Hive server, and through it to Hadoop/MapReduce
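
The name-matching requirement in step 2 is easy to get wrong, so a small check can help. The sketch below parses an odbc.ini-style text with Python's standard `configparser` and confirms that a DSN with the expected name exists. The DSN name `HiveDSN`, the host, the driver path, and the key names in the sample are all placeholders and assumptions, not values taken from an OBIEE installation.

```python
import configparser

# Illustrative odbc.ini content. The DSN name, host and driver path are
# placeholders, and the key names are assumptions about the driver settings.
ODBC_INI = """
[ODBC Data Sources]
HiveDSN = Hive ODBC driver (placeholder description)

[HiveDSN]
Driver = /path/to/hive/odbc/driver.so
HostName = hive-server.example.com
PortNumber = 10000
Database = default
"""


def load_dsns(ini_text):
    """Return {dsn_name: settings} for every DSN section in the text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(ini_text)
    return {
        name: dict(parser[name])
        for name in parser.sections()
        if name != "ODBC Data Sources"
    }


def dsn_matches_rpd(ini_text, rpd_dsn_name):
    """True if odbc.ini defines a DSN with exactly the name used in the RPD."""
    return rpd_dsn_name in load_dsns(ini_text)


if __name__ == "__main__":
    print(dsn_matches_rpd(ODBC_INI, "HiveDSN"))
    print(dsn_matches_rpd(ODBC_INI, "hive_dsn"))
```

The comparison here is an exact, case-sensitive string match, which is the strictest interpretation of "needs to match"; in practice you would read the real odbc.ini file rather than an inline string.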

Dealing with Hadoop / Hive Latency:

1) Hadoop access through Hive can be slow, due to inherent latency in Hive

2) Hive queries use MapReduce in the background to query Hadoop

Each query spins up a Java VM, generates a MapReduce job, then runs it and collates the answer

3) Great for large, distributed queries …

4) … but not so good for “speed-of-thought” dashboards
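
The reason an in-memory cache helps with this kind of latency can be shown with a toy model. The sketch below is not how the BI Server cache is implemented; `SimulatedHive` and `CachingFrontEnd` are hypothetical classes that only count how often a job would be launched.

```python
class SimulatedHive:
    """Toy stand-in for a Hive server; counts how many jobs it launches."""

    def __init__(self):
        self.jobs_launched = 0

    def execute(self, query):
        # On a real cluster, JVM start-up and MapReduce scheduling would
        # add a fixed delay here for every single query.
        self.jobs_launched += 1
        return f"result of: {query}"


class CachingFrontEnd:
    """Answers repeated queries from memory instead of re-running them."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def query(self, text):
        # Only the first occurrence of a query reaches the backend.
        if text not in self.cache:
            self.cache[text] = self.backend.execute(text)
        return self.cache[text]


if __name__ == "__main__":
    hive = SimulatedHive()
    front_end = CachingFrontEnd(hive)
    for q in ["sales by region", "sales by region", "sales by month"]:
        front_end.query(q)
    print(hive.jobs_launched)
```

Three dashboard requests cause only two simulated jobs, because the repeated query is served from memory. The fixed per-query cost is paid once per distinct query rather than once per request.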

Finally, Oracle's position is that:

  • Oracle Exalytics is an excellent platform to run these on, given its RAM and CPU count
  • OBIEE, with TimesTen for Exalytics and the Summary Advisor, supports “speed of thought” analytics using a rich, interactive dashboard
  • OBIEE can now connect to Hadoop/MapReduce, through Hive
  • Exalytics can accelerate Hadoop / Hive queries through in-memory aggregation
  • There are also potential benefits from the InfiniBand connection between Exalytics and the Big Data Appliance

