Multiple Output Files in MapReduce Program using MultipleTextOutputFormat

In our previous post we discussed MultipleInputs, the way to read multiple input files in MapReduce. Now it is time to discuss writing multiple output files from a MapReduce program using MultipleTextOutputFormat.

First of all, we have to import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs from the Hadoop libraries.

Multiple Outputs in MapReduce

FileOutputFormat and its subclasses generate a set of files in the output directory. There is one file per reducer, and files are named by the partition number: part-00000, part-00001, etc. There is sometimes a need to have more control over the naming of the files or to produce multiple files per reducer. MapReduce comes with two libraries to help you do this: MultipleOutputFormat and MultipleOutputs.
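To make the default naming concrete, here is a small plain-Java sketch (no Hadoop involved) that builds names of the form shown above: the word "part", a hyphen, and the partition number padded to five digits. The class and method names are hypothetical and only illustrate the pattern; the newer org.apache.hadoop.mapreduce API uses a slightly different pattern such as part-r-00000.

```java
import java.util.Locale;

final class PartFileNames {
    // Builds a default-style output file name for a partition (reducer) number:
    // "part-" followed by the number zero-padded to five digits.
    static String partFileName(int partition) {
        return String.format(Locale.ROOT, "part-%05d", partition);
    }

    public static void main(String[] args) {
        // One file name per reducer, here for three reducers.
        for (int partition = 0; partition < 3; partition++) {
            System.out.println(partFileName(partition));
        }
    }
}
```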


MultipleOutputFormat allows you to write data to multiple files whose names are derived from the output keys and values. MultipleOutputFormat is an abstract class with two concrete subclasses, MultipleTextOutputFormat and MultipleSequenceFileOutputFormat, which are the multiple file equivalents of TextOutputFormat and SequenceFileOutputFormat. MultipleOutputFormat provides a few protected methods that subclasses can override to control the output filename.
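As a minimal sketch of such an override, assuming the old org.apache.hadoop.mapred API (where MultipleTextOutputFormat lives) and Text keys and values, a subclass can place each record under a directory named after its key. The class name KeyBasedTextOutputFormat is hypothetical, and the Hadoop client libraries must be on the classpath to compile it.

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

// Hypothetical subclass: routes each record to a file under a directory named after its key.
public class KeyBasedTextOutputFormat extends MultipleTextOutputFormat<Text, Text> {

    // "name" is the default leaf file name (for example part-00000).
    // Returning key/name produces paths such as color/part-00000
    // below the job's output directory.
    @Override
    protected String generateFileNameForKeyValue(Text key, Text value, String name) {
        return key.toString() + "/" + name;
    }
}
```

A job written against the old API would select this format with JobConf.setOutputFormat(KeyBasedTextOutputFormat.class).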






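A program that uses MultipleOutputs has to override both the setup method and the cleanup method: setup creates the MultipleOutputs instance, and cleanup closes it so that all output is flushed to the files.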
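The role of the setup and cleanup methods can be sketched with a reducer like the one below. This is not the original program from this post, only an illustration under assumptions: the class name KeyNamedReducer is hypothetical, the mapper is assumed to emit the first word of each line as the key and the second word as the value, and the Hadoop client libraries must be on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// Hypothetical reducer: writes each value to a file whose name starts with its key.
public class KeyNamedReducer extends Reducer<Text, Text, NullWritable, Text> {

    private MultipleOutputs<NullWritable, Text> multipleOutputs;

    @Override
    protected void setup(Context context) {
        // Create the MultipleOutputs wrapper once per reduce task.
        multipleOutputs = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            // The third argument is the base output path; Hadoop appends
            // a suffix such as -r-00000 to form the actual file name.
            multipleOutputs.write(NullWritable.get(), value, key.toString());
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Close the underlying record writers so all data reaches the files.
        multipleOutputs.close();
    }
}
```

Leaving out the close() call in cleanup can result in empty or incomplete output files, which is why both methods matter here.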

Input Data

color GREEN
color BLACK
color RED
fruit BANANA
fruit ORANGE
fruit APPLE
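
To show how this input would be divided when the output file name is derived from the key, here is a plain-Java sketch (no Hadoop involved) that groups the sample records by their first word. Treating the first word as the key is an assumption made for illustration, and the class name GroupByKeyDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupByKeyDemo {
    // Splits each "key value" line on whitespace and collects the values per key,
    // preserving the order in which keys and values first appear.
    static Map<String, List<String>> groupByKey(List<String> lines) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length < 2) {
                continue; // skip lines that have no value
            }
            groups.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(parts[1]);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "color GREEN", "color BLACK", "color RED",
                "fruit BANANA", "fruit ORANGE", "fruit APPLE");
        // Each key corresponds to one output file (or directory) in the real job.
        groupByKey(input).forEach((key, values) -> System.out.println(key + " -> " + values));
    }
}
```

With this grouping, the three color records would land in one output and the three fruit records in another.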

So this is how to produce multiple output files in a MapReduce program using MultipleTextOutputFormat.


HadoopTpoint © 2017 Frontier Theme