Scaling MongoDB and setting up sharding for MongoDB on Ubuntu

Posted By Akash Sharma | 07-Feb-2013

In this blog I describe how to set up sharding in MongoDB on Ubuntu and how to check that the data is actually split across the shards.

Before starting I want you to go through my blog:

FAQs : Scaling MongoDB using Sharding

That blog explains the terms used when sharding a MongoDB database.

To get started with sharding, here are the requirements:

  • shard servers: at least two, so that you can see the data being split
  • config servers: one for testing, three for production
  • routing server: one server for the mongos process

I implemented this myself as a test setup. I launched two shard servers, one config server and one routing server, all on EC2 Ubuntu machines. Throughout the blog I use the hostnames and private IPs of my own machines. Those machines have since been terminated, so launch your own instances and substitute their values. Keep in mind that the following ports should be open on these machines:
22, 80, 8080, 27017, 27018, 27019, 28017, 28018, 28019
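
The MongoDB ports in this list follow a pattern: 27017 is the default port for mongod and mongos, 27018 is the default when mongod runs with --shardsvr, 27019 is the default with --configsvr, and each status-page port in the list is 1000 above one of those. The sketch below only restates that pattern; the names `portsFor` and `requiredPorts` are illustrative and not part of MongoDB.

```javascript
// Default database ports by role (names of the roles are illustrative).
const MONGO_PORTS = { router: 27017, shard: 27018, config: 27019 };

// In the port list above, each status-page port is the database port plus 1000.
const HTTP_PORT_OFFSET = 1000;

// Return the database port and the status-page port for one role.
function portsFor(role) {
  const dbPort = MONGO_PORTS[role];
  if (dbPort === undefined) {
    throw new Error("unknown role: " + role);
  }
  return { db: dbPort, http: dbPort + HTTP_PORT_OFFSET };
}

// Rebuild the full list of ports from the post: SSH/web ports first,
// then the database ports, then the matching status-page ports.
function requiredPorts() {
  const dbPorts = Object.values(MONGO_PORTS);
  const httpPorts = dbPorts.map((port) => port + HTTP_PORT_OFFSET);
  return [22, 80, 8080, ...dbPorts, ...httpPorts];
}

console.log(requiredPorts().join(", "));
```

Keeping the roles and ports in one place like this can make it easier to see which port to test on which machine later in the setup.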

The following is the configuration of all the servers that I used:

2 shard servers:
machine name: Mongo-Machine-1
command to connect server :
ssh -i mongodbconnect.pem ubuntu@ec2-23-22-195-45.compute-1.amazonaws.com
hostname: ec2-23-22-195-45.compute-1.amazonaws.com
Private IP : 10.204.22.130

machine name: Mongo-Machine-2
command to connect server:
ssh -i mongodbconnect.pem ubuntu@ec2-75-101-188-189.compute-1.amazonaws.com
hostname: ec2-75-101-188-189.compute-1.amazonaws.com
Private IP: 10.207.25.190

1 Config server:
machine name: Mongo-Machine-3
command to connect server:
ssh -i mongodbconnect.pem ubuntu@ec2-23-20-67-5.compute-1.amazonaws.com
hostname: ec2-23-20-67-5.compute-1.amazonaws.com
private IP : 10.210.50.142
(The IP of the config server will be needed when setting up the cluster.)

1 Routing server:
machine name: Mongo-Machine-4
command to connect server:
ssh -i mongodbconnect.pem ubuntu@ec2-23-20-79-29.compute-1.amazonaws.com
hostname: ec2-23-20-79-29.compute-1.amazonaws.com

Cluster Setup:
Install MongoDB on all servers. For installing MongoDB on Ubuntu, see my blog:
Installing MongoDB on Windows or Linux ( Ubuntu ) environment


Settings on all shard servers:

//Connect to shard server
ssh -i mongodbconnect.pem ubuntu@ec2-23-22-195-45.compute-1.amazonaws.com

//first stop the mongodb service
sudo service mongodb stop

//open mongodb.conf file at /etc/init and add --shardsvr parameter
sudo vim /etc/init/mongodb.conf

"exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --config /etc/mongodb.conf;"
//Replace the above line with the line below
"exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --shardsvr --config /etc/mongodb.conf;"

//Now start mongodb service
sudo service mongodb start
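
The same one-line change is needed on every shard server and, with a different flag, on every config server. If you would rather script the edit than make it by hand in vim, the string change can be sketched as below. This is only an illustration under the assumption that the exec line looks exactly like the one quoted above; `addRoleFlag` is a hypothetical helper, and reading and writing /etc/init/mongodb.conf is left out.

```javascript
// The exec line as quoted in the post (note the two spaces before /usr/bin/mongod).
const ORIGINAL_EXEC_LINE =
  "exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --config /etc/mongodb.conf;";

// Insert --shardsvr or --configsvr right after the "--" that separates
// start-stop-daemon options from mongod options.
function addRoleFlag(execLine, flag) {
  if (flag !== "--shardsvr" && flag !== "--configsvr") {
    throw new Error("unsupported flag: " + flag);
  }
  if (execLine.includes(flag)) {
    return execLine; // already edited, leave the line unchanged
  }
  const marker = "/usr/bin/mongod -- ";
  if (!execLine.includes(marker)) {
    throw new Error("exec line does not have the expected form");
  }
  return execLine.replace(marker, marker + flag + " ");
}

console.log(addRoleFlag(ORIGINAL_EXEC_LINE, "--shardsvr"));
```

Calling the helper a second time on its own output returns the line unchanged, so running such a script twice would not add the flag twice.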


 

A shard server listens on port 27018 for the mongo shell and on port 28018 for the HTTP (REST) interface.
To check that MongoDB is running, open this URL in a browser:
hostname:28018
e.g. for machine name Mongo-Machine-1:
ec2-23-22-195-45.compute-1.amazonaws.com:28018
OR
test it in a terminal on the shard server:

	mongo localhost:27018

Settings on all config servers:

//Connect to config server
ssh -i mongodbconnect.pem ubuntu@ec2-23-20-67-5.compute-1.amazonaws.com

//first stop the mongodb service
sudo service mongodb stop

//open mongodb.conf file at /etc/init and add --configsvr parameter
sudo vim /etc/init/mongodb.conf

"exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --config /etc/mongodb.conf;"
// Replace the above line with the line below
"exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --configsvr --config /etc/mongodb.conf;"

//start mongodb service
sudo service mongodb start

A config server listens on port 27019 for the mongo shell and on port 28019 for the HTTP (REST) interface.
To check that MongoDB is running, open this URL in a browser:
hostname:28019
e.g.:
ec2-23-20-67-5.compute-1.amazonaws.com:28019
OR
test it in a terminal on the config server:

	mongo localhost:27019

Perform the following steps on the routing server:

//Connect to Routing server
ssh -i mongodbconnect.pem ubuntu@ec2-23-20-79-29.compute-1.amazonaws.com

//stop mongodb service
sudo service mongodb stop

//start the mongos process; the configdb parameter tells it where the config server is
//the value of configdb should be an IP address or a hostname
//for testing use one config server, for production use three config servers
mongos --configdb 10.210.50.142

//you will get a message like this:
//"config servers and shards contacted successfully"
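
For production, the three config servers are passed to --configdb as one comma-separated list. The sketch below builds that value and rejects any count other than one or three, which matches the requirement stated at the start of this blog. `buildConfigdbArg` is an illustrative helper and the three `cfg*.example.internal` hostnames are placeholders, not machines from this setup.

```javascript
// Default port of a mongod started with --configsvr.
const CONFIG_SERVER_PORT = 27019;

// Build the value for "mongos --configdb" from a list of hosts.
// Hosts without an explicit port get the config server default.
function buildConfigdbArg(hosts, port = CONFIG_SERVER_PORT) {
  if (hosts.length !== 1 && hosts.length !== 3) {
    throw new Error("expected 1 or 3 config servers, got " + hosts.length);
  }
  return hosts
    .map((host) => (host.includes(":") ? host : host + ":" + port))
    .join(",");
}

// Test setup from this post: a single config server.
console.log("mongos --configdb " + buildConfigdbArg(["10.210.50.142"]));

// Production-style setup with three placeholder hostnames.
console.log(
  "mongos --configdb " +
    buildConfigdbArg([
      "cfg1.example.internal",
      "cfg2.example.internal",
      "cfg3.example.internal",
    ])
);
```

Writing the port explicitly is optional here, but it makes the resulting command easier to read next to the shard addresses, which also carry their port.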

//mongos keeps running in this terminal, so open a new terminal tab and connect to the routing server again

//Open the mongo shell on the admin database
mongo admin

//Add the shards through the routing server.
//Shards can be added only from the admin database
mongos> db.runCommand( { addshard : "10.204.22.130:27018", name : "Shard_A" } );
mongos> db.runCommand( { addshard : "10.207.25.190:27018", name : "Shard_B" } );

//enabling the database “testDBSharding” for sharding
mongos> db.runCommand( { enablesharding : "testDBSharding" } );
#output
{ "ok" : 1 }

//Shard the collection "testData" by giving its full name and the shard key
mongos> db.runCommand( { shardcollection : "testDBSharding.testData", key : { _id : 1 } } );
#output
{ "collectionsharded" : "testDBSharding.testData", "ok" : 1 }
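
The admin commands so far follow a fixed order: one addshard per shard, then enablesharding for the database, then shardcollection for the namespace. The sketch below builds the same command documents from a small description of the cluster. `shardingCommands` is an illustrative helper, not a MongoDB function; in the mongo shell each returned document could be passed to db.runCommand on the admin database.

```javascript
// Build the admin command documents in the order used in this post.
function shardingCommands(shards, database, collection, key) {
  // One addshard command per shard server.
  const commands = shards.map((shard) => ({
    addshard: shard.host,
    name: shard.name,
  }));
  // The database must be enabled for sharding before a collection is sharded.
  commands.push({ enablesharding: database });
  // The namespace is "<database>.<collection>".
  commands.push({ shardcollection: database + "." + collection, key: key });
  return commands;
}

const commands = shardingCommands(
  [
    { host: "10.204.22.130:27018", name: "Shard_A" },
    { host: "10.207.25.190:27018", name: "Shard_B" },
  ],
  "testDBSharding",
  "testData",
  { _id: 1 }
);

commands.forEach((command) => console.log(JSON.stringify(command)));
```

Building the list first and running it afterwards keeps the order visible and makes it harder to shard a collection before its database has been enabled.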

mongos> show dbs
#output
admin    (empty)
config    0.046875GB
testDBSharding    0.203125GB

mongos> use testDBSharding
#output
switched to db testDBSharding

//Script for inserting a large amount of data
mongos> myData = "";
mongos> while ( myData.length < 200000 ) myData += "My data for mongodb sharding";
mongos> for ( var num = 0; num < 5000; num++ ) { db.testData.save( { myData : myData } ); }

mongos> use config
#output
switched to db config

//Showing all chunk information
mongos> db.chunks.find();

//Showing status of sharded cluster
mongos> db.printShardingStatus();
#Output
--- Sharding Status ---
 sharding version: { "_id" : 1, "version" : 3 }
 shards:
   {  "_id" : "Shard_A",  "host" : "10.204.22.130:27018" }
   {  "_id" : "Shard_B",  "host" : "10.207.25.190:27018" }
 databases:
   {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
   {  "_id" : "testDBSharding",  "partitioned" : true,  "primary" : "Shard_A" }
            testDBSharding.testData chunks:
                                    Shard_A    15
                                    Shard_B    12
                        too many chunks to print, use verbose if you want to force print

mongos>
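
As a rough cross-check of the chunk counts above, the following sketch recomputes the payload size produced by the insert script and a lower bound on the number of chunks. It assumes the default maximum chunk size of 64 MB used by MongoDB releases of that period, and it ignores BSON overhead and the _id field, so the figures are approximate. The constant names are illustrative.

```javascript
// Values taken from the insert script in this post.
const PIECE = "My data for mongodb sharding";
const TARGET_LENGTH = 200000;
const DOC_COUNT = 5000;

// Assumption: default maximum chunk size of 64 MB.
const ASSUMED_CHUNK_BYTES = 64 * 1024 * 1024;

// Rebuild the string exactly as the script does.
let myData = "";
while (myData.length < TARGET_LENGTH) myData += PIECE;

// The text is plain ASCII, so one character is one byte.
const payloadBytes = myData.length;
const totalBytes = payloadBytes * DOC_COUNT;

// Fewest chunks possible if every chunk were completely full.
const minChunks = Math.ceil(totalBytes / ASSUMED_CHUNK_BYTES);

// Chunk counts reported by printShardingStatus() above.
const reportedChunks = 15 + 12;

console.log("bytes per document payload:", payloadBytes);
console.log("total payload bytes:", totalBytes);
console.log("lower bound on chunks:", minChunks);
console.log("chunks reported:", reportedChunks);
```

Under these assumptions the payload is about 1 GB and needs at least 15 full chunks. The 27 chunks reported are above that bound, which would be consistent with chunks being split before they reach the maximum size.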



 

Checking data on the Shard_A server

/*Note: Do not try to update or delete data directly on a shard server
 in a sharded collection. This is only for confirmation that
 data is saved on the shard.*/

//Connect to shard server
ssh -i mongodbconnect.pem ubuntu@ec2-23-22-195-45.compute-1.amazonaws.com

//connect with the mongo shell on port 27018
mongo localhost:27018/testDBSharding

> show collections

> db.testData.stats();

Akash Sharma

akash.sharma@oodlestechnologies.com
