25-Jun-2015

    Install Hadoop.

Hadoop installation
                  1) Download the Hadoop tar file from Apache and set the Hadoop path in .bashrc.

NOTE: This post uses Hadoop 2.6.0 and mongodbConnector r1.4.0-rc0.

If you are using Maven to build your project, follow these steps to process MongoDB data with Hadoop.

step 1 - Add the connector dependency to your pom.xml file, and also download the jars that will be required later to run the MapReduce programme from the command line
             Download the mongodbConnector jars from https://github.com/mongodb/mongo-hadoop/releases

step 2 - Create a Maven-based Java project 'HadoopWithMongo'

step 3 - Add the mongo-hadoop-core-1.4-rc0 dependency to the pom.xml file

step 4 - Add the Hadoop libraries to your project classpath

             NOTE: The location of the Hadoop lib folder varies with the Hadoop version.
             In Hadoop 2.6.0, use the path "hadoop/share/hadoop/common/lib" and ignore the "hadoop/lib" directory.

step 5 - Create a Java class MongoConnector

step 6 - Write a MapReduce programme

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.util.ToolRunner;
import org.bson.BSONObject;

import com.mongodb.hadoop.MongoConfig;
import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.MongoOutputFormat;
import com.mongodb.hadoop.util.MapredMongoConfigUtil;
import com.mongodb.hadoop.util.MongoConfigUtil;
import com.mongodb.hadoop.util.MongoTool;

public class MongoConnector extends MongoTool {

	public static class Map extends Mapper<Object, BSONObject, Text, IntWritable> {
		@Override
		public void map(final Object key, final BSONObject value, final Context context) throws IOException, InterruptedException {
			/*
			 * write your mapper logic
			 */
			context.write(new Text(), new IntWritable(1));
		}
	}

	public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
		@Override
		public void reduce(final Text key, final Iterable<IntWritable> values, final Context context) throws IOException, InterruptedException {
			/*
			 * write your reducer logic
			 */
			context.write(new Text(), new IntWritable(1));
		}
	}

	public MongoConnector() {
		Configuration conf = new Configuration();
		// Register the configuration first: getConf() returns null until setConf() has been called.
		setConf(conf);
		MongoConfig mongoConfig = new MongoConfig(conf);
		if (MongoTool.isMapRedV1()) {
			// Old (mapred) API
			MapredMongoConfigUtil.setInputFormat(getConf(), com.mongodb.hadoop.mapred.MongoInputFormat.class);
			MapredMongoConfigUtil.setOutputFormat(getConf(), com.mongodb.hadoop.mapred.MongoOutputFormat.class);
		} else {
			// New (mapreduce) API
			MongoConfigUtil.setInputFormat(getConf(), MongoInputFormat.class);
			MongoConfigUtil.setOutputFormat(getConf(), MongoOutputFormat.class);
		}
		mongoConfig.setMapper((Class) Map.class);
	}

	public static void main(String[] args) throws Exception {
		System.exit(ToolRunner.run(new MongoConnector(), args));
	}
}
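The "write your mapper logic" and "write your reducer logic" placeholders above depend on your data. As a rough illustration only, here is a Hadoop-free sketch of what such logic could look like: the map step emits (value of a field, 1) for each document, and the reduce step sums the counts per key. Everything in it is an assumption made for the example: the class name CountByFieldSketch, the field name "city", and the use of java.util.Map as a stand-in for BSONObject. It is not part of the connector API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free illustration of the map and reduce steps.
public class CountByFieldSketch {

    // "Map" step: emit (field value, 1) for one document.
    // A java.util.Map stands in for the BSONObject the real mapper receives.
    public static Map.Entry<String, Integer> map(Map<String, Object> document, String field) {
        Object value = document.get(field);
        String key = (value == null) ? "unknown" : value.toString();
        return new AbstractMap.SimpleEntry<>(key, 1);
    }

    // "Reduce" step: sum all counts emitted for one key.
    public static int reduce(Iterable<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Tiny driver: groups the map output by key (the "shuffle"), then reduces each group.
    public static Map<String, Integer> run(List<Map<String, Object>> documents, String field) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map<String, Object> doc : documents) {
            Map.Entry<String, Integer> pair = map(doc, field);
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (String city : new String[] {"Pune", "Delhi", "Pune"}) {
            Map<String, Object> doc = new HashMap<>();
            doc.put("city", city);
            docs.add(doc);
        }
        System.out.println(run(docs, "city")); // prints {Delhi=1, Pune=2}
    }
}
```

In the real job, the body of map would read the field from the BSONObject and call context.write with a Text key, and the body of reduce would write the summed IntWritable.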

NOTE: Your MongoDB instance must be running.

Your connection is now set up. If you want to run the MapReduce programme from a jar, follow these steps:

step 1 - Put the mongo connector jars downloaded in the first step into the Hadoop lib directory
step 2 - Start the Hadoop services
step 3 - Create a jar file of the above Java project
step 4 - Run this command - hadoop jar HadoopWithMongo.jar MongoConnector

    This will start your MapReduce programme.
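If you would rather script step 1 than copy the jars by hand, the following is a minimal sketch of one way to do it. The class name CopyConnectorJars and the two command-line arguments are made up for this example, and it assumes the Hadoop 2.6.0 lib path mentioned earlier ("share/hadoop/common/lib" under the Hadoop folder); adjust the path for other versions.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: copies downloaded connector jars into the Hadoop lib directory.
public class CopyConnectorJars {

    // Copies every *.jar file in sourceDir into libDir and returns the sorted file names.
    public static List<String> copyJars(Path sourceDir, Path libDir) throws IOException {
        List<String> copied = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(sourceDir, "*.jar")) {
            for (Path jar : jars) {
                Files.copy(jar, libDir.resolve(jar.getFileName()), StandardCopyOption.REPLACE_EXISTING);
                copied.add(jar.getFileName().toString());
            }
        }
        Collections.sort(copied);
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: folder holding the downloaded connector jars (assumed argument)
        // args[1]: Hadoop installation folder (assumed argument)
        Path source = Paths.get(args[0]);
        Path lib = Paths.get(args[1], "share", "hadoop", "common", "lib");
        System.out.println("Copied: " + copyJars(source, lib));
    }
}
```

Files that are not jars are left alone, and an existing jar with the same name in the lib directory is overwritten.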

Hope this blog helps you establish a connection between Hadoop and MongoDB!

