Category: Spark

Apache Spark is an open source big data processing framework built around speed, ease of use, and sophisticated analytics.

The Future of SQL on Apache Spark SQL

When the Shark project started three years ago, Hive (on MapReduce) was the only choice for SQL on Hadoop. Hive compiled SQL into scalable MapReduce jobs and could work with a variety of formats (through its SerDes). However, it delivered less-than-ideal performance. In order […]

Introduction to Apache Spark

Apache Spark is an open source processing engine for Hadoop data, built around speed, ease of use, and sophisticated analytics. It was originally developed in 2009 in UC Berkeley’s AMPLab and open sourced in 2010. Spark has quickly become one of the largest open source communities in big […]

HadoopTpoint © 2017