You can check the stats of a MongoDB collection with the command below:
db.<collection>.stats()
> db.seCrossReferences.stats()
{
  "ns" : "result.<collection>",
  "count" : 256292,
  "size" : 3049450896, # which is about 2.8GB
  "avgObjSize" : 11898.346011580541,
  "storageSize" : 4109500416,
  "numExtents" : 18,
  "nindexes" : 1,
  "lastExtentSize" : 1071394816,
  "paddingFactor" : 1,
  "systemFlags" : 1,
  "userFlags" : 0,
  "totalIndexSize" : 8323168,
  "indexSizes" : {
    "_id_" : 8323168
  },
  "ok" : 1
}
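As a quick check of the 2.8GB note next to "size", the byte counts can be converted by dividing by 1024^3. This is only a small sketch using awk; the to_gib helper name is made up for illustration:

```shell
# Hypothetical helper: convert a byte count to GiB (1024^3 bytes), two decimals.
to_gib() {
  awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / (1024 * 1024 * 1024) }'
}

to_gib 3049450896   # "size" from the stats output
to_gib 4109500416   # "storageSize" from the stats output
```

The first call prints 2.84 and the second 3.83, so the data is about 2.8GB while the space allocated on disk is closer to 3.8GB.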
Then you can export or dump the data using mongoexport or mongodump. mongodump dumps the binary data out of MongoDB and saves it to a directory; its syntax is as follows:
################### MONGODUMP ####################################
mongodump --db <dbname> --collection <collectionname> --out <outputpath/name>
It turns out that the dumped folder is about the same size as the figure shown by db.<collection>.stats(), which makes sense because the dumped result is in binary format (result.bson).
Then you can pack the folder into a gzipped tarball, which is only 177MB, less than 7% of the original size.
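The exact command used for this step is not shown, so the following is only one common way to do it: tar with gzip compression. To keep the sketch runnable anywhere, it packs a throwaway stand-in directory; the "dump" folder name and the placeholder file are assumptions, and in a real run you would point tar at the mongodump output path instead.

```shell
# Stand-in for the dump folder; a real run would use the mongodump --out path.
workdir="$(mktemp -d)"
mkdir -p "$workdir/dump/result"
printf 'placeholder bytes\n' > "$workdir/dump/result/seCrossReferences.bson"

# -c create, -z gzip, -f archive name; -C changes directory first so the
# archive stores the relative path "dump/..." rather than an absolute one.
tar -czf "$workdir/dump.tar.gz" -C "$workdir" dump

# List the archive contents to confirm what was packed.
tar -tzf "$workdir/dump.tar.gz"
```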
You can also dump the file to standard output and gzip it on the fly in a one-liner:
mongodump --db <dbname> --collection <collectionname> --out - | gzip -9 > <outputfilename>.gz
It took 6 minutes in total to dump and gzip the file. The -9 flag asks gzip for its best compression level, but the resulting file is 171 MB, which is almost the same as before.
################### MONGOEXPORT ####################################
mongoexport exports the MongoDB data to a human-readable JSON or CSV file. The command syntax is similar, so let's take a look:
mongoexport --db <dbname> --collection <collectionname> --out <outputfilename>.json
The exported JSON file is 2.9 GB in this case, and the export also took about 7 minutes.
mongoexport --db result --collection seCrossReferences | gzip > seCrossReferences.json.gz
This takes about 10 minutes, but the gzipped export is only 144MB.
In conclusion, I think mongoexport + gzip may be the better solution, because:
1. Its output is JSON/CSV, which can be used for things other than moving a database.
2. The compressed output is smaller.
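To put a rough number on the second point, the two gzipped sizes reported above (171 MB for mongodump and 144 MB for mongoexport) can be compared directly. A small sketch; the pct_smaller helper is a made-up name for illustration:

```shell
# Hypothetical helper: how much smaller (in percent) is $2 compared with $1?
pct_smaller() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (1 - b / a) * 100 }'
}

# mongodump + gzip -9 (171 MB) versus mongoexport + gzip (144 MB)
pct_smaller 171 144
```

This prints 15.8, so the gzipped JSON export is roughly 16% smaller than the gzipped BSON dump for this collection. Other data sets may behave differently.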
############# IMPORT DATA #######################################
mongoexport + mongoimport
mongoimport --db <dbname> --collection <collectionname> --file <outputfilename>.json --type json
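If the export was gzipped as in the mongoexport section, it does not have to be unpacked to disk first: mongoimport reads from standard input when no --file is given. A hedged sketch follows; the import_gz function name is made up for illustration, and the function only prints the command unless mongoimport is installed and the file exists.

```shell
# Hypothetical wrapper: stream a gzipped JSON export into mongoimport.
import_gz() {
  db="$1"; coll="$2"; gz="$3"
  if command -v mongoimport >/dev/null 2>&1 && [ -f "$gz" ]; then
    # gunzip -c writes the decompressed JSON to stdout, leaving the .gz intact.
    gunzip -c "$gz" | mongoimport --db "$db" --collection "$coll" --type json
  else
    echo "would run: gunzip -c $gz | mongoimport --db $db --collection $coll --type json"
  fi
}

import_gz result seCrossReferences seCrossReferences.json.gz
```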
mongodump + mongorestore
http://docs.mongodb.org/manual/reference/program/mongorestore/
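For completeness, here is a sketch of the mongorestore side. mongodump writes each collection to <outputpath>/<dbname>/<collectionname>.bson, and mongorestore can be pointed at that file. The "dump" output path and the restore_collection function name are assumptions for illustration. Newer versions of the tools may prefer other options for selecting the namespace, so check the page linked above for your version. The function only prints the command unless mongorestore is installed and the file exists.

```shell
# Hypothetical wrapper: restore one collection from its .bson dump file.
restore_collection() {
  db="$1"; coll="$2"; bson="$3"
  if command -v mongorestore >/dev/null 2>&1 && [ -f "$bson" ]; then
    mongorestore --db "$db" --collection "$coll" "$bson"
  else
    echo "would run: mongorestore --db $db --collection $coll $bson"
  fi
}

# "dump" is an assumed --out path; result/seCrossReferences are the names used above.
restore_collection result seCrossReferences dump/result/seCrossReferences.bson
```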