  • Scaling MongoDB

    In this post I describe how to set up sharding in MongoDB on Ubuntu and how to check that data is actually split across the shards.

    Before starting, please read my earlier post:

    FAQs : Scaling MongoDB using Sharding

    It explains the terms related to sharding a MongoDB database.

    To get started, here are the requirements:

    • shard server : minimum two servers for checking the splitting of data
    • config server : one server for testing or three servers for production
    • routing server : one server for mongos process
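
The requirements above can be written down as a small check. This is a minimal sketch in plain JavaScript (runnable with Node.js), not part of the original setup; `validateTopology` and the host names are hypothetical placeholders.

```javascript
// Hypothetical helper: checks a planned cluster against the minimum
// server counts listed in the requirements.
function validateTopology(topology) {
  const problems = [];
  if (topology.shards.length < 2) {
    problems.push("need at least 2 shard servers to see data being split");
  }
  const configCount = topology.configServers.length;
  if (configCount !== 1 && configCount !== 3) {
    problems.push("use 1 config server for testing or 3 for production");
  }
  if (topology.routers.length < 1) {
    problems.push("need at least 1 routing server running mongos");
  }
  return problems;
}

// Placeholder host names, not real machines.
const testCluster = {
  shards: ["shard-a", "shard-b"],
  configServers: ["config-1"],
  routers: ["router-1"],
};

// An empty array means every requirement is met.
console.log(validateTopology(testCluster));
```

The function returns a list of problems instead of throwing, so all unmet requirements are reported at once.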

    I implemented this myself as a test. I launched two shard servers, one config server and one routing server, all on EC2 Ubuntu machines. Throughout this post I use the hostnames and private IPs from my own setup. Those machines have since been terminated, so launch your own instances and substitute their values. Make sure the following ports are open on these machines:
    22, 80, 8080, 27017, 27018, 27019, 28017, 28018, 28019
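
The six MongoDB ports in that list pair up by role, as the later sections of this post show: each process has a main port for shell connections, and its HTTP interface sits 1000 above it. The sketch below only restates that mapping in JavaScript; the `DEFAULT_PORTS` table and `portsForRole` are illustrative names, and the values are the defaults assumed by this walkthrough.

```javascript
// Default main port per role, as used in this walkthrough.
const DEFAULT_PORTS = {
  router: 27017, // mongos
  shard: 27018,  // mongod started with --shardsvr
  config: 27019, // mongod started with --configsvr
};

// Hypothetical helper: returns the shell port and the HTTP port
// (main port + 1000) for a given role.
function portsForRole(role) {
  if (!Object.prototype.hasOwnProperty.call(DEFAULT_PORTS, role)) {
    throw new Error("unknown role: " + role);
  }
  const port = DEFAULT_PORTS[role];
  return { shell: port, http: port + 1000 };
}

for (const role of Object.keys(DEFAULT_PORTS)) {
  const { shell, http } = portsForRole(role);
  console.log(`${role}: shell port ${shell}, HTTP port ${http}`);
}
```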

    The following is the configuration of each server I used:

    2 Shard servers:
    machine name: Mongo-Machine-1
    command to connect server :
    ssh -i mongodbconnect.pem ubuntu@ec2-23-22-195-45.compute-1.amazonaws.com
    hostname: ec2-23-22-195-45.compute-1.amazonaws.com
    Private IP : 10.204.22.130

    machine name: Mongo-Machine-2
    command to connect server:
    ssh -i mongodbconnect.pem ubuntu@ec2-75-101-188-189.compute-1.amazonaws.com
    hostname: ec2-75-101-188-189.compute-1.amazonaws.com
    Private IP: 10.207.25.190

    1 Config server:
    machine name: Mongo-Machine-3
    command to connect server:
    ssh -i mongodbconnect.pem ubuntu@ec2-23-20-67-5.compute-1.amazonaws.com
    hostname: ec2-23-20-67-5.compute-1.amazonaws.com
    private IP : 10.210.50.142
    (The config server's IP is needed later, when setting up the cluster.)

    1 Routing server:
    machine name: Mongo-Machine-4
    command to connect server:
    ssh -i mongodbconnect.pem ubuntu@ec2-23-20-79-29.compute-1.amazonaws.com
    hostname: ec2-23-20-79-29.compute-1.amazonaws.com
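
To keep the four machines straight, the inventory above can be held in one structure and the connect commands generated from it. This is only a convenience sketch; `machines` and `sshCommand` are illustrative names, and the values are copied from the list above (the routing server's private IP is not given there, so it is left out).

```javascript
const KEY_FILE = "mongodbconnect.pem";

// Inventory copied from the server list in this post.
const machines = [
  { name: "Mongo-Machine-1", role: "shard",
    hostname: "ec2-23-22-195-45.compute-1.amazonaws.com", privateIp: "10.204.22.130" },
  { name: "Mongo-Machine-2", role: "shard",
    hostname: "ec2-75-101-188-189.compute-1.amazonaws.com", privateIp: "10.207.25.190" },
  { name: "Mongo-Machine-3", role: "config",
    hostname: "ec2-23-20-67-5.compute-1.amazonaws.com", privateIp: "10.210.50.142" },
  { name: "Mongo-Machine-4", role: "router",
    hostname: "ec2-23-20-79-29.compute-1.amazonaws.com" }, // private IP not listed
];

// Hypothetical helper: builds the ssh command for one machine.
function sshCommand(machine) {
  return `ssh -i ${KEY_FILE} ubuntu@${machine.hostname}`;
}

for (const machine of machines) {
  console.log(`${machine.name} (${machine.role}): ${sshCommand(machine)}`);
}
```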

    Cluster Setup:
    Install MongoDB on all servers. For installing MongoDB on Ubuntu, see my post:
    Installing MongoDB on Windows or Linux ( Ubuntu ) environment


    Settings on all shard servers:

    //Connect to shard server
    ssh -i mongodbconnect.pem ubuntu@ec2-23-22-195-45.compute-1.amazonaws.com
    
    //first stop the mongodb service
    sudo service mongodb stop
    
    //open mongodb.conf file at /etc/init and add --shardsvr parameter
    sudo vim /etc/init/mongodb.conf
    
    "exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --config /etc/mongodb.conf;"
    //Replace the above line with the line below
    "exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --shardsvr --config /etc/mongodb.conf;"
    
    //Now start mongodb service
    sudo service mongodb start
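
The edit to the exec line is a single insertion right after the `--` separator. The sketch below shows that transformation on the string itself, in plain JavaScript; `addRoleFlag` is a hypothetical helper, and it assumes the exec line has exactly the form quoted above. It does not touch any file. The same edit with `--configsvr` is what the config server section of this post performs.

```javascript
// Hypothetical helper: inserts a role flag right after the "--" separator
// that precedes mongod's own arguments. Leaves the line unchanged if the
// flag is already there.
function addRoleFlag(execLine, flag) {
  if (execLine.includes(flag)) {
    return execLine;
  }
  const marker = "/usr/bin/mongod -- ";
  const at = execLine.indexOf(marker);
  if (at === -1) {
    throw new Error("mongod exec line not recognised");
  }
  const insertAt = at + marker.length;
  return execLine.slice(0, insertAt) + flag + " " + execLine.slice(insertAt);
}

// The exec line as quoted in this post.
const original =
  "exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --config /etc/mongodb.conf;";

console.log(addRoleFlag(original, "--shardsvr"));
```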
    


     

    With --shardsvr, the shard server listens on port 27018 for shell connections and on port 28018 for the HTTP/REST interface.
    To check that MongoDB is running, open this URL in a browser:
    hostname:28018
    e.g. for Mongo-Machine-1:
    ec2-23-22-195-45.compute-1.amazonaws.com:28018
    Or test it from a terminal on the shard server:

    mongo localhost:27018

    Settings on all config servers:

    //Connect to config server
    ssh -i mongodbconnect.pem ubuntu@ec2-23-20-67-5.compute-1.amazonaws.com
    
    //first stop the mongodb service
    sudo service mongodb stop
    
    //open mongodb.conf file at /etc/init and add --configsvr parameter
    sudo vim /etc/init/mongodb.conf
    
    "exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --config /etc/mongodb.conf;"
    // Replace the above line with the line below
    "exec start-stop-daemon --start --quiet --chuid mongodb --exec  /usr/bin/mongod -- --configsvr --config /etc/mongodb.conf;"
    
    //start mongodb service
    sudo service mongodb start
    

     

     

    With --configsvr, the config server listens on port 27019 for shell connections and on port 28019 for the HTTP/REST interface.
    To check that MongoDB is running, open this URL in a browser:
    hostname:28019
    e.g.:
    ec2-23-20-67-5.compute-1.amazonaws.com:28019
    Or test it from a terminal on the config server:

    mongo localhost:27019

    Settings on the routing server:

    //Connect to Routing server
    ssh -i mongodbconnect.pem ubuntu@ec2-23-20-79-29.compute-1.amazonaws.com
    
    //stop the mongodb service so that mongos can use the default port 27017
    sudo service mongodb stop
    
    //start the mongos process; --configdb tells it where the config server is
    //the value can be either an IP or a hostname
    //for testing use one config server, for production use three
    mongos --configdb 10.210.50.142
    
    //you will get a message like this:
    //"config servers and shards contacted successfully"
    
    //open a new terminal tab on the routing server, since mongos keeps running in the first one
    
    //Enter in admin database
    mongo admin
    
    //add the shards through the routing server
    //shards can be added only from the admin database
    mongos> db.runCommand( { addshard : "10.204.22.130:27018", name : "Shard_A" } );
    mongos> db.runCommand( { addshard : "10.207.25.190:27018", name : "Shard_B" } );
    
    //enabling the database “testDBSharding” for sharding
    mongos> db.runCommand( { enablesharding : "testDBSharding" } );
    #output
    { "ok" : 1 }
    
    //shard the collection “testData”, using _id as the shard key
    mongos> db.runCommand( { shardcollection : "testDBSharding.testData", key : { _id : 1 } } );
    #output
    { "collectionsharded" : "testDBSharding.testData", "ok" : 1 }
    
    mongos> show dbs
    #output
    admin    (empty)
    config    0.046875GB
    testDBSharding    0.203125GB
    
    mongos> use testDBSharding
    #output
    switched to db testDBSharding
    
    //script to insert a large amount of data: 5000 documents of about 200 KB each
    mongos> myData = "";
    mongos> while ( myData.length < 200000 ) myData += "My data for mongodb sharding";
    mongos> for ( var num = 0; num < 5000; num++ ) { db.testData.save( { myData : myData } ); }
    
    mongos> use config
    #output
    switched to db config
    
    //Showing all chunk information
    mongos> db.chunks.find();
    
    //Showing status of sharded cluster
    mongos> db.printShardingStatus();
    #Output
    --- Sharding Status ---
     sharding version: { "_id" : 1, "version" : 3 }
     shards:
       {  "_id" : "Shard_A",  "host" : "10.204.22.130:27018" }
       {  "_id" : "Shard_B",  "host" : "10.207.25.190:27018" }
     databases:
       {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
       {  "_id" : "testDBSharding",  "partitioned" : true,  "primary" : "Shard_A" }
                testDBSharding.testData chunks:
                                        Shard_A    15
                                        Shard_B    12
                            too many chunks to print, use verbose if you want to force print
    
    mongos>
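
To put the chunk counts in context, it helps to estimate how much data the insert script produced. The sketch below rebuilds the same payload string in plain JavaScript (Node.js, no MongoDB connection) and divides the total by the 27 chunks reported above. It is a rough estimate only: it counts the `myData` string and ignores BSON overhead and the `_id` field. `buildPayload` is an illustrative name.

```javascript
// Rebuild the payload the same way the insert script does.
function buildPayload(fragment, minLength) {
  let payload = "";
  while (payload.length < minLength) {
    payload += fragment;
  }
  return payload;
}

const payload = buildPayload("My data for mongodb sharding", 200000);
const docCount = 5000;

// The text is plain ASCII, so one character is one byte.
const totalBytes = payload.length * docCount;

// Chunk counts copied from the sharding status output above.
const chunkCounts = { Shard_A: 15, Shard_B: 12 };
const totalChunks = Object.values(chunkCounts).reduce((sum, n) => sum + n, 0);

const MIB = 1024 * 1024;
console.log("payload length per document:", payload.length);
console.log("approximate total data (MiB):", (totalBytes / MIB).toFixed(1));
console.log("total chunks:", totalChunks);
console.log(
  "approximate average chunk size (MiB):",
  (totalBytes / totalChunks / MIB).toFixed(1)
);
```

The average is only an indication; it says nothing about how evenly the documents are spread inside individual chunks.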
    



     

    Checking data on the Shard_A server

    /*Note: Do not update or delete data in a sharded collection
     directly on a shard server. This step is only to confirm that
     the data is saved on the shard.*/
    
    //Connect to shard server
    ssh -i mongodbconnect.pem ubuntu@ec2-23-22-195-45.compute-1.amazonaws.com
    
    //connect with the mongo shell on port 27018
    mongo localhost:27018/testDBSharding
    
    > show collections
    
    > db.testData.stats();
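
Once `stats()` has been run on each shard, the `count` values can be added up and compared with the 5000 documents inserted through mongos. The sketch below does that arithmetic in plain JavaScript. `summarizeShardCounts` is a hypothetical helper, and the counts passed to it are made-up numbers for illustration, not measurements from this cluster. Counts read directly from a shard may be briefly off while chunks are being migrated, so treat a mismatch as a prompt to re-check rather than as an error.

```javascript
// Hypothetical helper: sums per-shard document counts and reports each
// shard's share of the total.
function summarizeShardCounts(countsByShard, expectedTotal) {
  const total = Object.values(countsByShard).reduce((sum, n) => sum + n, 0);
  const shares = {};
  for (const [shard, count] of Object.entries(countsByShard)) {
    shares[shard] = total === 0 ? 0 : count / total;
  }
  return { total, matchesExpected: total === expectedTotal, shares };
}

// Made-up counts for illustration only.
const summary = summarizeShardCounts({ Shard_A: 2800, Shard_B: 2200 }, 5000);
console.log(summary);
```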
    

     

     

     

    Akash Sharma

    akash.sharma@oodlestechnologies.com

Tags: nosql , bigdata , mongodb , sharding , clustering