Hadoop MapReduce Programming in Java: Max Word Length

Here we provide the Java code for finding the longest word (and its length) in a given input file using MapReduce.

Sample input file – Input.txt

Hi Welcome to Big Data World

Hadoop is one of the Best Technology now a days

Final output for the given input file

Technology 10

Mapper Code 

public static class MaxWordMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each call receives one line of the input file.
        String txt = value.toString();
        // Split the line on spaces and emit a (word, word length) pair for each word.
        for (String word : txt.split(" ")) {
            if (word.length() > 0) {
                context.write(new Text(word), new IntWritable(word.length()));
            }
        }
    }
}

Output format of the mapper class (this output is the input for the reducer class):

Text    IntWritable

Description of the Mapper Class

The mapper class takes every word in the given input file by splitting each line on the space delimiter, and calculates each word's length.

The output of the mapper class is as follows:

Hi 2

Welcome 7

to 2

Big 3

Data 4

World 5

Hadoop 6

is 2

one 3

of 2

the 3

Best 4

Technology 10

now 3

a 1

days 4

This output is used as the input for the reducer class.

Reducer Class

public static class MaxWordReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private String maxWord;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Runs once, before any reduce() call: start with an empty word.
        maxWord = "";
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Keep the longest word seen so far.
        if (key.toString().length() > maxWord.length()) {
            maxWord = key.toString();
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Runs once, after all reduce() calls: emit the single longest word.
        context.write(new Text(maxWord), new IntWritable(maxWord.length()));
    }
}

In the above reducer class we have three methods:

  1. setup (called only once, when the reducer is instantiated)

  2. reduce (called once per key, i.e. many times)

  3. cleanup (like setup, called only once, but at the end of processing)

Our requirement is to display the largest word length in the given input file, so we have to compare each word against the longest one seen so far. If we wrote the output directly in the reduce method, we would get one record per key, but according to our requirement we need only a single output: the maximum word and its length. Therefore we initialize the string in the setup method and write the longest word and its length in the cleanup method.
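The core logic (split on spaces, keep the longest word) can be checked locally without a cluster. The following is a minimal plain-Java sketch of that logic; the class and method names (`MaxWordLocal`, `maxWord`) are our own for illustration and are not part of the MapReduce code above:

```java
public class MaxWordLocal {

    // Return the longest word in the text, mirroring the mapper's
    // space-delimited split and the reducer's length comparison.
    public static String maxWord(String text) {
        String max = "";
        for (String word : text.split(" ")) {
            if (word.length() > max.length()) {
                max = word;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        String input = "Hi Welcome to Big Data World "
                     + "Hadoop is one of the Best Technology now a days";
        String result = maxWord(input);
        System.out.println(result + " " + result.length()); // Technology 10
    }
}
```

Running this on the sample input prints the same result the MapReduce job produces, which is a quick way to sanity-check the comparison logic before submitting the job.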

This completes the example of Hadoop MapReduce programming in Java for finding the max word length.
