in Databases, Tutorials

What’s the difference between a SQL and NoSQL database?

While learning to code (with Ruby and Rails, among others), I’ve been using SQL-based databases, such as SQLite and ActiveRecord. These are what are known as relational databases (or query interfaces, or something…), in which we use a schema to model the database and SQL queries to conduct operations such as adding or removing records, or querying the database for information, such as the quantity of records found for some attribute.

Recently, however, I’ve been hearing a lot about NoSQL databases, and I decided to learn more about them.

What is NoSQL?

Here is the Wikipedia definition:

NoSQL is an umbrella term for a loosely defined class of non-relational data stores that break with a long history of relational databases and ACID guarantees. Data stores that fall under this term may not require fixed table schemas, and usually avoid join operations. The term was first popularised in early 2009.

A commenter on StackOverflow explains:

The idea with NoSQL isn’t so much to replace SQL as it is to provide a solution for problems that aren’t solved well with [a] traditional RDBMS.

NoSQL supposedly stands for “Not only SQL” but I think that’s a PC definition of the term, like saying that the ‘F’ in SNAFU stands for ‘Fouled’ when we all know what it really stands for.  At least in my research of the topic, I have yet to find a single SQL query anywhere near a NoSQL database.

Ok, cool, that’s a bit of a definition, but what is a NoSQL database really?

I think the best way to answer that is to contrast it against a traditional database.

The traditional relational database: Great for stable, predictable domains

Relational databases emerged in the pre-internet era, when making software applications involved more of a up-front design approach. You’d analyze the domain, define the various entities and their relationships and the create your data model.  A relational database, in other words, seems to be intended more for traditional software applications, such as an accounting system, which is relatively stable domain. Ok, there is no such thing as completely stable when it comes to software, but *more* stable than the Wild west that is the internet, which brings us to NoSQL databases.

NoSQL: Designed for the big messy interwebs

The internet in general and web-based applications specifically are anything but predictable.  In fact the only thing that is predictable about the interwebs is that it is messy and unpredictable.  NoSQL databases are created specifically for that messiness and the need for rapid change and evolution.

Comparing a SQL-based and a NoSQL-based data model

A blog post at Cloudant offers a great comparison between a SQL-based database and it’s NoSQL equivalent.  They used the data modeling of the product Foundbite as an example, which, according to the product website, “combines photos and sound to capture the real atmosphere of a place, event or experience.”

The SQL version of Foundbit

Here is what a SQL-based data model for this product might look like (as represented in the form of an Entity-Relational Diagram):

Updating the SQL Model

As with most products, you are likely to keep adding new features, which means continually modifying the database. Here, they’ve added reverse geo-location in this update to the above model:

Creating these updates is not trivial in a SQL-based database.  We need to add new fields in different tables, define any new relationships, and also write any necessary queries to pull the various pieces together into various product views.

But Cloudant felt that a traditional data model was too constraining for them, so they instead went with a NoSQL approach…

A NoSQL Version

Here is what a NoSQL equivalent version of the above looks like for the Foundbite product.

As you can see, all the data is stored in a single document or JSON object. In other words, a NoSQL database is managed using a simple text file. Just as importantly, many NoSQL databases, such as MongoDB, support end-to-end use of JSON, which means you can work in a single language or technology rather than have to move between multiple language/technologies, such as is the case with, say, PHP/Apache or Ruby/Passenger.

Adding features

Now, as with the SQL example, let’s say we want to add that reverse geo-location feature, and while we’re at it, why not add some other features like FourSquare Venue ID and Hash Tags.

Here is an update to the above JSON object, in which we’ve added those changes.

All we needed to do was update this one document. That is sort of scarily simple.  And since I’m a total NoSQL N00b, I’m guessing there must be a lot of other stuff that is needed for this to work but so far, it seems that really is all that is needed. Don’t ask me how this works because I have no idea.

Look ma, no migrations!

Looking at the above, if you’ve been working in Rails, or some other technology that uses traditional databases, you’re likely to notice a lot of things that you we did NOT have to do to modify the database.

Here are just a few of those:

  • No schema changes. In fact, apart from the above document, there is actually no schema.
  • No migrations (ie no creating of a migration file and running rake db:migrate.)
  • No complex joins to define entity relationships.

In fact, the above changes can be made with no database downtime or table locks. A NoSQL database, therefore is highly agile (and Agile too, I’d say.)

And now the bad news (I think)

Based on my limited research into this topic, I also found some downsides to NoSQL databases.

First, a NoSQL db may not provide full ACID (atomicity, consistency, isolation, durability) guarantees, which according to Wikipedia is “a set of properties that guarantee that database transactions are processed reliably.”  I’m no DB expert, but I can imagine that each of these attributes are pretty essential to a robust database.

Atomicity: Each transaction is “all or nothing”: if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged.

Consistency: a database operation does not violate any integrity constraints during its execution. If a transaction leaves the database in an illegal state, it is aborted and an error is reported.

Isolation: concurrent execution of transactions results in a system state that identical to if transactions were executed serially.

Durability: Once a transaction is committed, it will remain so, even in the event of power loss, crashes, or errors.

Mathy Stuff: NoSQL databases are supposedly also not as good at handling math stuff, like telling you how many you have of a certain record or element based on a set of attributes.

I’m sure the creators of NoSQL DBs have thought of ways to address these issues, I just didn’t have a chance to look into them.

Creating a Rails App with a NoSQL Database

Since I’m working in Rails, I decided to try to create a NoSQL database in Rails. Here’s how I did that, with a lot of help from Stackoverflow and Railscasts, among others.

1. Install a NoSQL database

A lot of tutorials you see on adding NoSQL to Rails just seem to imply that all you have to do is add a gem for a NoSQL database, but that is only a interface helper to your database.  You have to also install the actual database if you plan to try this on your local machine.

Here, we’ll be using MongoDB, which seems to be one of the more predominant NoSQL databases, and is used by the New York Times, LinkedIn and other big-dogs.

The best way to install Mongo if you’re on OSX is to use Homebrew.  If so, just do the following:

Now, I know this looks suspiciously simple, but that is in fact all there is to it. HOWEVER, make sure you have total sudo (ie God-like) permissions on your machine, or your installation may hang with no explanation.

Speaking of God, after installing, you’ll want to run the db, which you can do by typing:

(You may need to type ‘sudo mongod’)

You should then see the following message:

To stop the db, just enter Ctrl+C (but don’t do that now.)

Ok, we’re now ready to create a NoSQL Rails app.

2. Create a new Rails app, without active record

First, create a new app, but without Active Record.

3. Update your Gemfile

Then, go into your Gemfile and add the following:

We’ll be using the mongoid gem.  There are other gems for MongoDB, but this is the one I found least difficult to use.  You’ll notice the reference to a specific repo for the gem source. Also, the ‘bson_ext’ gem is needed but I’m not sure why. This is needed if you are using Rails 4.

4. Configure Mongoid

Run the following generator:

This will generate the file config/mongoid.yml and probably also a bunch of other craziness.

5. Create a NoSQL model

As mentioned earlier, there are no migrations or anything, you just add fields directly to your model.  You could just create a model here using a regular Rails generator, but if you’re lazy like me, you’ll instead just create a whole scaffold, so that you can start entering data asap.

Let’s imagine we’re creating a blog app, and we want to have posts in our blog:

Just to make sure everything is working ok, fire up your server and head over to localhost:3000/posts You should be able to start creating posts, as per usual, like so:

Adding data in NoSQL app

You’ll notice that we were able to do this without any migrations.

6. Take a look at the model file

All of our field info is instead collected in the model itself:

8. Muck about!

This example is, in part, based on a MongoDB railscast by the inimitable Ryan Bates.  To really get a feel for the flexibility, follow the railscast and learn how to quickly make changes to your db.

Now, this is a nice little example, but probably doesn’t really do justice to the power of a NoSQL database.  For that, I think one needs a more real-world context, with lots of data, lots of users, and a need for constant feature updates.

Rails + NoSQL = BFF?

Since Rails is all about being Agile and all about creating stuff for the web, and since Ruby is all about objects, it would seem as if Rails and NoSQL would be a match made in heaven.  I think a main reason why this is not the case is because the RDBMS ActiveRecord is such a core part of Rails and so Rails would sort of have to shed its own skin in order to integrate full with a NoSQL model.

But, as NoSQL databases become more widely used (as I at least am guessing will happen), how future versions of Rails might support more and/or easier integration with NoSQL databases.