In Mongo we trust
10 minute crash course in MongoDB
Recently I've been working more and more with MongoDB. I previously worked heavily with Couchbase and blogged a lot about how to query Couchbase when coming from the SQL world. The following is a ten-minute crash course in the most common queries you'll want to execute when using MongoDB, or when you have an interview and forgot to prep!
MongoDB 10'000m overview
MongoDB is an open-source, document-based database that stores data in BSON (Binary JSON). Instead of the tables of a traditional relational database, MongoDB works with schema-less documents, which makes it much more flexible about what data can be stored together in the same collection.
As Mongo is schema-less, you lose a distinct advantage of a traditional RDBMS: the ability to join multiple tables in an ordinary query. (Since version 3.2 the aggregation framework has a $lookup stage that performs a left outer join, but plain find() queries still cannot join.) In practice this means data is often duplicated across collections, and several queries often have to be executed to extract what a single SQL query would return. Mongo is really fast, so don't sweat having to execute multiple requests.
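To make the "multiple queries instead of a join" idea concrete, here is a minimal sketch of an application-side join. It runs as plain JavaScript, with in-memory arrays standing in for collections. The companies collection, its region field and all sample values are hypothetical and only there for illustration; the comments show roughly what the equivalent shell queries might look like.

```javascript
// In-memory stand-ins for two collections (hypothetical sample values).
const companies = [
  { _id: 101, name: "Acme", region: "EU" },
  { _id: 102, name: "Globex", region: "US" },
];
const users = [
  { name: "Ann", companyId: 101, active: true },
  { name: "Bob", companyId: 102, active: true },
  { name: "Cat", companyId: 102, active: false },
];

// Query 1: find the ids of the companies we care about.
// Shell equivalent (assumed collection): db.companies.find({region: "US"}, {_id: 1})
const companyIds = companies
  .filter((c) => c.region === "US")
  .map((c) => c._id);

// Query 2: fetch the active users that belong to those companies.
// Shell equivalent: db.users.find({companyId: {$in: companyIds}, active: true})
const activeUsers = users.filter(
  (u) => companyIds.includes(u.companyId) && u.active
);

console.log(activeUsers.map((u) => u.name)); // prints [ 'Bob' ]
```

The first result set feeds the $in filter of the second query, which is the usual shape of a two-step lookup when no join is available.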
Having said the above, Mongo really shines when working with unstructured data. Being able to store different types of data on the fly can be useful; just make sure you are picking the correct tool for the job.
Time to get set up
We're going to load in some sample data and start executing some queries; it's the fastest way to learn how to work with Mongo. All this assumes you have MongoDB running locally on your machine. Save the following JSON file locally as users.json:
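If you don't have a sample file to hand, a small script along these lines can generate one. Treat it as a sketch: the field names are taken from the queries later in this post and the first two ids are the ones those queries use, but the third id and every other value are made-up placeholders. It writes one document per line, using the Extended JSON forms ($oid, $date) that mongoimport understands.

```javascript
// Hypothetical sample users. Field names mirror the queries in this post;
// names, ages, locations and dates are placeholder values.
const sampleUsers = [
  {
    _id: { $oid: "57037ae071df3738ecf2b4d7" },
    name: "Alice",
    age: 29,
    companyId: 102,
    active: true,
    location: "London",
    createdAt: { $date: "2016-01-03T10:15:00.000Z" },
  },
  {
    _id: { $oid: "57037b4571df3738ecf2b4d8" },
    name: "Bob",
    age: 31,
    companyId: 102,
    active: false,
    location: null, // field present but null
    createdAt: { $date: "2015-11-20T08:00:00.000Z" },
  },
  {
    _id: { $oid: "57037b9a71df3738ecf2b4d9" }, // made-up id
    name: "Carol",
    age: 42,
    companyId: 101,
    active: true,
    // no location field at all
    createdAt: { $date: "2016-06-15T16:30:00.000Z" },
  },
];

// mongoimport reads one JSON document per line by default.
const fileContents =
  sampleUsers.map((user) => JSON.stringify(user)).join("\n") + "\n";

process.stdout.write(fileContents);
```

Assuming the script is saved as make-users.js (a hypothetical name), running `node make-users.js > users.json` would produce the file.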
Running the following command will load the data from above into MongoDB under a database named inmongowetrust and a collection named users. (Think of collections as being similar to tables in the RDBMS world.)
mongoimport --verbose --host localhost:27017 --db inmongowetrust --collection users --file users.json
Of the tools I've tried for running test queries and working with data, RoboMongo is the one I'd recommend. If you run the following queries in RoboMongo, you'll easily see which documents are being returned and can experiment with the result sets!
Example time
- SELECT ALL USERS:
db.users.find()
- SELECT COUNT OF ALL USERS:
db.users.find().count()
- SELECT ALL FROM COMPANY:
db.users.find({companyId: 102})
- SELECT ALL FROM COMPANY WHERE ACTIVE:
db.users.find({companyId: 102,active: true})
- SELECT ALL INACTIVE USERS:
db.users.find({active: false})
- SELECT VIA ID:
db.users.find({_id: ObjectId("57037ae071df3738ecf2b4d7")})
- SELECT VIA MULTIPLE IDS:
db.users.find({_id: {$in: [ObjectId("57037ae071df3738ecf2b4d7"),ObjectId("57037b4571df3738ecf2b4d8")]}})
- SELECT VIA MULTIPLE IDS AND LIMIT:
db.users.find({_id: {$in: [ObjectId("57037ae071df3738ecf2b4d7"),ObjectId("57037b4571df3738ecf2b4d8")]}}).limit(1)
- SELECT USERS WITH AGE GREATER THAN:
db.users.find({age: {$gt: 31}})
- SELECT USERS WITH AGE GREATER THAN OR EQUAL TO:
db.users.find({age: {$gte: 31}})
- SELECT ONLY ID FROM QUERY:
db.users.find({age: 29},{_id: 1})
- SELECT ONLY ID AND ACTIVE:
db.users.find({age: 29},{_id: 1,active: 1})
- SELECT ONLY ACTIVE FROM QUERY:
db.users.find({age: 29},{_id: 0,active: 1})
- SELECT ALL USERS BUT ONLY RETRIEVE LOCATION AND AGE:
db.users.find({},{_id: 0,location: 1,age: 1})
- SELECT ALL USERS AND SORT ASC BY AGE WITH ONLY AGE AND LOCATION FIELDS SELECTED:
db.users.find({},{_id: 0,location: 1,age: 1}).sort({age: 1})
- SELECT ALL USERS AND SORT VIA FIELD NOT RETURNED:
db.users.find({},{_id: 0,location: 1,age: 1}).sort({name: 1})
- SELECT ALL USERS WHERE THE LOCATION FIELD IS MISSING OR NULL:
db.users.find({location: null})
- SELECT ALL USERS WHERE LOCATION FIELD EXISTS AND IS NOT NULL:
db.users.find({location: {$exists: true, $ne: null}})
- SELECT ALL USERS THAT WERE CREATED IN 2016:
db.users.find({createdAt: {$gte: ISODate("2016-01-01T00:00:00.000Z"), $lt: ISODate("2017-01-01T00:00:00.000Z")}})
- SELECT ALL USERS CREATED ON SINGLE DAY IN 2016:
db.users.find({createdAt: {$gte: ISODate("2016-01-03T00:00:00.000Z"), $lt: ISODate("2016-01-04T00:00:00.000Z")}})
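Date windows are easiest to get right as half-open ranges: from midnight (inclusive) up to the next midnight (exclusive), so no millisecond at the end of the day is missed. The sketch below builds such a filter in plain JavaScript. The helper name dayRange and its parameters are my own invention for illustration, not part of MongoDB, and it assumes dates are stored in UTC.

```javascript
// Build a half-open [start of day, start of next day) filter in UTC.
// `dayRange` is an illustrative helper, not a MongoDB API.
function dayRange(field, isoDay) {
  const start = new Date(`${isoDay}T00:00:00.000Z`);
  // A UTC day is always exactly 24 hours long.
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  return { [field]: { $gte: start, $lt: end } };
}

const filter = dayRange("createdAt", "2016-01-03");
console.log(filter.createdAt.$gte.toISOString()); // 2016-01-03T00:00:00.000Z
console.log(filter.createdAt.$lt.toISOString()); // 2016-01-04T00:00:00.000Z

// In the shell the result could be passed straight to find(), e.g.:
// db.users.find(dayRange("createdAt", "2016-01-03"))
```

The same pattern extends to months or years by changing how the end of the window is computed.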
10 second summary
- Mongo is flexible when working with unstructured data
- If you need joins then you've got structured data; just use an RDBMS
- Don't worry about executing multiple queries
As always, feel free to reach out to discuss, argue or even perhaps agree with my thoughts! You can reach me on Twitter or leave a comment below!