I'm an old fashioned SQL database programmer

There's been a lot of buzz growing in the blogosphere about NoSQL databases. As of today I count six sessions on the schedule for RailsConf 2010 that are directly or indirectly related to non-relational database storage engines. None of the planned sessions cover SQL exclusively (although two cover both realms to some extent).

Certainly there are some appealing features about NoSQL. Let's face it: SQL is an awkward language. Many relational concepts are difficult for an inexperienced programmer to grasp. What's the difference between WHERE and HAVING in an query? Do you know the syntax for constructing an outer join? Most RDBMS engines to this day treat data types in a rather primitive manner. And the biggest failure of SQL is its inability to handle hierarchical queries in an efficient manner. Try to fetch a subtree of objects with an arbitrary depth, for instance. And let's not get started with the differences in dialects between the various engines. I had to deal with that once when I ported a database application from SQL Server to Oracle, and had to write an RDBMS-agnostic query to obtain paged results from a table without using a cursor.

When it comes to writing web applications, there are certain problems that are best solved in a non-relational environment. Document storage, structured data storage, Cache lookup, queue management, and full text searches are several examples that can be best served by systems such as Memcached, Redis, Tokyo Cabinet, Lucene, or the new Cassandra. I've used them in quite a few Rails projects to manage these sorts of tasks. And when it comes to dealing with networks of data, such as connections for a social web site, you'll want to break out a dedicated engine to crunch that data continuously, as most of the big sites do.

But when it comes down to the nitty gritty of storing key business data for an application, and generating reports on that data, I always come back to good old SQL. Need to run an aggregation with grouping? Want to find documents with the most common tags to the document you're currently displaying? Need to enforce referential integrity in the event that a user record or a document is deleted from the database? It's hard to do that efficiently with non-relational databases.

Back in the 90's, I experimented with several object database and network-relational database solutions such as GemStone and ObjectStore. I was seduced with the idea of a native object storage or network object engine. I even convinced one of my employers to pluck down $25,000 for an ObjectStore license -- a very dear sum for a startup 12 years ago. But sadly, none of those systems ever rose to the popularity of Oracle or even DB2. And the core reason is this: you can't get the query performance from object, network, or key-value storage systems that you can get from an RDBMS. And maintaining referential integrity in a network database takes an incredible amount of overhead.

SQL, for all its faults, happens to be the one language that has remained constant throughout my career. And I'm very excited about the prospect of building my queries using the hot new ActiveRecord Query interface in Rails3. But remember that it's all still SQL under the hood. And understanding how SQL works will help you create more efficient queries.

Posted by george Thu, 15 Apr 2010 00:12:00 GMT


Comments

Leave a response

  1. Avatar
    administrative assistant 8 months later:
    this post is very usefull thx!
  2. Avatar
    Houston Home Security 9 months later:
    Dude.. I am not much into reading, but somehow I got to read lots of articles on your blog. Its amazing how interesting it is for me to visit you very often.

Leave a comment