Mongodb – Move Collection to Another Mongo Server File Size Analysis

You can check the stats of a mongodb collection by using the command below:

db.<collection>.stats()

> db.seCrossReferences.stats()
{
“ns” : “result.<collection>”,
“count” : 256292,
“size” : 3049450896,  # which is about 2.8GB
“avgObjSize” : 11898.346011580541,
“storageSize” : 4109500416,
“numExtents” : 18,
“nindexes” : 1,
“lastExtentSize” : 1071394816,
“paddingFactor” : 1,
“systemFlags” : 1,
“userFlags” : 0,
“totalIndexSize” : 8323168,
“indexSizes” : {
“_id_” : 8323168
},
“ok” : 1
}

Then you can export/dump the data using mongoexport or mongodump, the syntax for mongodump is as follows, it will dump the binary data out of mongo and save to a directory:

################### MONGODUMP ####################################

mongodump –db <dbname> –collection <collectionname> –out <outputpath/name>

It turns out that the size of the dumped folder is about the same size as what is showed in the db.<collection>.stats(), which makes sense because the dumped result is in binary format, result.bson.

Then you can use gzip to make it into a tar ball where the file size is only 177MB, which is less than 7% of the original file.

You can also dump the file to standard output and gzip on the go in one liner:

mongodump –db <dbname> –collection <collectionname> –out – | gzip -9 > <outputfilename>.gz

it took 6 minutes in total to dump and gzip the file, -9 flag make sure it is the best compression rate, but after all, the file size is 171 MB which is almost the same.

################### MONGOEXPORT ####################################

mongoexport will export the mongodb data to a human readable file JSON/CSV. The command syntax is similar and lets take a look:

mongoexport –db <dbname> –collection <collectionname> –out <outputfilename>.json

The exported json format file is 2.9 GB in this case and it took about the 7 minutes also.

mongoexport –db result –collection seCrossReferences | gzip > seCrossReferences.json.gz

This takes about 10 minutes but… the exported zipped file is 144MB…

In conclusion, I think maybe mongoexport + zip is a better solution, because:

1. its output is json/csv which could be used for something else except for moving database.

2. the output has a smaller size

############# IMPORT DATA #######################################
mongoexport + mongoimport
mongoimport –db –collection –file .json –type json

mongodump + mongorestore
http://docs.mongodb.org/manual/reference/program/mongorestore/

Scrum – Agile Software Development

Scrum is an iterative and incremental agile software developmentframework for managing software projects and product or application development. It defines “a flexible, holistic product development strategy where a development team works as a unit to reach a common goal”. It challenges assumptions of the “traditional, sequential approach” to product development. Scrum enables teams to self-organize by encouraging physical co-location or close online collaboration of all team members and daily face to face communication among all team members and disciplines in the project.

A key principle of Scrum is its recognition that during a project the customers can change their minds about what they want and need (often called requirements churn), and that unpredicted challenges cannot be easily addressed in a traditional predictive or planned manner. As such, Scrum adopts an empirical approach—accepting that the problem cannot be fully understood or defined, focusing instead on maximizing the team’s ability to deliver quickly and respond to emerging requirements.

Reference: http://en.wikipedia.org/wiki/Scrum_(software_development)