Apache Tez Introduction
In simple terms, Apache Tez is a framework for YARN-based data processing applications in Hadoop. In more detail, Apache Tez is an extensible framework for building YARN-based, high-performance batch and interactive data processing applications in Hadoop, and it can handle data sets ranging from terabytes to petabytes. Tez is used by projects in the Hadoop ecosystem such as Apache Hive, Apache Pig, Cascading, and other engines. By running on Tez, these engines can achieve faster response times and high throughput at petabyte scale.
What Apache Tez Does
Tez provides a developer API and framework for writing native YARN applications that bridge the spectrum of interactive and batch workloads. Tez allows applications to scale from gigabytes to petabytes of data and from tens to thousands of nodes. It lets users create Hadoop applications that integrate with YARN and perform well within mixed-workload Hadoop clusters.
Tez is extensible and embeddable. It gives developers the freedom to express highly optimized data processing applications, which gives those applications an advantage over general-purpose, end-user-facing engines such as MapReduce and Spark. Tez lets you express complex computations as dataflow graphs and allows for dynamic performance optimizations based on real information about the data and the resources required to process it.
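To make the idea of a dataflow graph more concrete, here is a minimal sketch in plain Java. It does not use the Tez API. The class DataflowSketch, its methods, and the vertex names (scanOrders, scanCustomers, join, aggregate) are all hypothetical and chosen only for illustration. The sketch models processing steps as vertices, data movement as edges, and works out an order in which the steps could run so that every step starts only after its inputs are ready. A real engine would also run independent vertices in parallel and could adjust the plan at runtime, which this sketch does not attempt.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual sketch only: plain Java, NOT the Tez API. All names are hypothetical. */
public class DataflowSketch {
    // Maps each vertex (processing step) to the vertices that consume its output.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    public DataflowSketch addVertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    /** Declares that the output of 'from' flows into 'to'. */
    public DataflowSketch addEdge(String from, String to) {
        addVertex(from);
        addVertex(to);
        edges.get(from).add(to);
        return this;
    }

    /** Returns one valid execution order using Kahn's topological sort. */
    public List<String> executionOrder() {
        // Count how many upstream inputs each vertex is still waiting for.
        Map<String, Integer> pending = new LinkedHashMap<>();
        for (String vertex : edges.keySet()) {
            pending.put(vertex, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String target : targets) {
                pending.merge(target, 1, Integer::sum);
            }
        }

        // Vertices with no pending inputs can start immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pending.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String vertex = ready.remove();
            order.add(vertex);
            // Finishing this vertex satisfies one input of each downstream vertex.
            for (String target : edges.get(vertex)) {
                if (pending.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }

        if (order.size() != edges.size()) {
            throw new IllegalStateException("Graph contains a cycle, so it is not a valid dataflow graph");
        }
        return order;
    }

    public static void main(String[] args) {
        DataflowSketch dag = new DataflowSketch()
                .addEdge("scanOrders", "join")
                .addEdge("scanCustomers", "join")
                .addEdge("join", "aggregate");
        // Prints: [scanOrders, scanCustomers, join, aggregate]
        System.out.println(dag.executionOrder());
    }
}
```

In this example the two scan steps have no inputs, so both are ready at the start. The join can only run after both scans finish, and the aggregation runs last. Expressing the whole computation as one graph, rather than as a chain of separate jobs, is what gives an engine room to optimize across steps.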
Companies Using Apache Tez
Tez was originally developed by Hortonworks. Within a short time it gathered 31 committers, who represent a who's who of leading Hadoop companies, including:
v) NASA JPL