Connecting Hadoop to MongoDB using mongoDBConnector
Posted By : Md Qasim Siddiqui | 25-Jun-2015
Prerequisite:
Install Hadoop.
Hadoop installation
1) Download the Hadoop tar file from Apache and set the Hadoop path in .bashrc.
NOTE: This post uses Hadoop 2.6.0 and mongodbConnector r1.4.0-rc0.
If you are using Maven to build your project, follow these steps to process MongoDB data with Hadoop.
Step 1 - Add the dependency to your pom.xml file, and also download the jars that will be needed later to run the MapReduce program from the command line.
Download the mongodbConnector jars from https://github.com/mongodb/mongo-hadoop/releases
Step 2 - Create a Maven-based Java project 'HadoopWithMongo'
Step 3 - Add the mongo-hadoop-core-1.4-rc0 dependency to the pom.xml file
Step 4 - Add the Hadoop libraries to your project classpath
NOTE: The location of the Hadoop lib folder varies with the Hadoop version.
In Hadoop 2.6.0, use the path "hadoop/share/hadoop/common/lib" rather than the "hadoop/lib" directory.
Step 5 - Create a Java class MongoConnector
Step 6 - Write a MapReduce program
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.util.ToolRunner;
import org.bson.BSONObject;
import com.mongodb.hadoop.MongoConfig;
import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.MongoOutputFormat;
import com.mongodb.hadoop.util.MapredMongoConfigUtil;
import com.mongodb.hadoop.util.MongoConfigUtil;
import com.mongodb.hadoop.util.MongoTool;
public class MongoConnector extends MongoTool{
public static class Map extends Mapper
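The listing above stops at the Mapper declaration. As a rough illustration of what a map step and a reduce step do with document-style records, here is a plain Java sketch that uses no Hadoop or connector classes at all. The class name FieldCountSketch, the "city" field and the sample documents are hypothetical and are not taken from the original program; in a real job the documents would arrive as BSONObject values read by MongoInputFormat.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only: no Hadoop or mongo-hadoop classes are used.
class FieldCountSketch {

    // "Map" phase: emit the value of one field for every document that has it.
    static List<String> mapPhase(List<Map<String, Object>> documents, String field) {
        List<String> keys = new ArrayList<>();
        for (Map<String, Object> doc : documents) {
            Object value = doc.get(field);
            if (value != null) {
                keys.add(value.toString());
            }
        }
        return keys;
    }

    // "Reduce" phase: count how many times each emitted key occurs.
    static Map<String, Integer> reducePhase(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample documents standing in for a MongoDB collection.
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(Collections.<String, Object>singletonMap("city", "Delhi"));
        docs.add(Collections.<String, Object>singletonMap("city", "Noida"));
        docs.add(Collections.<String, Object>singletonMap("city", "Delhi"));

        System.out.println(reducePhase(mapPhase(docs, "city")));
    }
}
```

In the Hadoop version of this idea, the map step would live in a Mapper subclass and the counting in a Reducer subclass, with the framework handling the grouping of keys between the two.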
NOTE: Your MongoDB instance must be running.
Your connection is now set up. If you want to run the MapReduce program from a jar, follow these steps:
Step 1 - Put the mongo connector jars downloaded in the first step into the Hadoop lib directory
Step 2 - Start the Hadoop services
Step 3 - Create a jar file of the above Java project
Step 4 - Run this command: hadoop jar HadoopWithMongo.jar MongoConnector
This will start your MapReduce program.
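The steps above depend on the connector jars being visible to Hadoop. One way to check this before submitting the job is a small JDK-only probe such as the sketch below. The class name ConnectorClasspathCheck is hypothetical; the class names it looks for are taken from the imports in the program above.

```java
// Classpath check using only the JDK; no Hadoop or connector API is called.
class ConnectorClasspathCheck {

    // Returns true if the named class can be located on the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ConnectorClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Classes the MapReduce program above imports.
        String[] required = {
            "org.apache.hadoop.mapreduce.Mapper",
            "com.mongodb.hadoop.MongoInputFormat",
            "org.bson.BSONObject"
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

The result is only meaningful if the probe is run with the same classpath the job will use, for example the one printed by the hadoop classpath command.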
I hope this blog helps you establish a connection between Hadoop and MongoDB!
About Author
Md Qasim Siddiqui
Qasim is an experienced web app developer with expertise in Groovy and Grails, Hadoop, Hive, Mahout, AngularJS and Spring frameworks. He likes listening to music in his free time and plays Counter-Strike.