So what's the deal with NoSQL?
Is NoSQL just a controversial buzzword? Could you imagaine if the term 'Object Orientated' didn't exist and instead architectures based on concepts such as encapsulation, polymorphism and inhertiance were referred to as 'NoProcedural'? Could you imagine if .net was called 'NoJava'? Leinster was called 'NoMunster'?
Well controversial name aside, a good way to appreciate the hype about NoSQL is to consider scalability - the classical non-functional architectural concern. In a classical OLTP architecture, when load increases and your JVM is under pressure, you need to scale. You have two choices:
The leading relational databases will scale horizontally. But only by so much. To get around this an architect usually will consider:
With a NoSQL database there is no strict schema. Everything is effectively collapsed into one very fat table - a bit like an old skool flat file, but where each row stores a huge amout of data. So, instead of having a table for Users and a table for Activities (representing User's activities), you put all the User information together in one fat row. This means there are no joins across tables. It also means there is a lot of data redundancy which means more storage space required. In addition, more computational power will be needed for writes. But because data that is used together is located at the very same place - within the same row - it means no complex joins and hence it is easier to scale. The computational requirement for reads is also less. So reads can go faster.
Another advantage of NoSQL databases is derived from the freedom that comes with not having to be tied to strict schema. You know that headache where a change to a data model can cause big problems? Well since there is no strict schema with NoSQL - this problem does not exist. This makes the architecture more flexible and more extensible.
Right now, it's fair to say NoSQL is only relevant in the minority of architectures. But could this be another case of technical innovation driving business innovation as we have seen with smart phones? There wasn't a need for smart phones but the technical innovation provided business opportunities. I think the same could happen with NoSQL Architectures.
Take a step back from Computer Science and just think Science. Science used to be hypotheisis centric, now it is becoming more and more data centric. CERN, genome sequencing, climate change analysis - all involve tonnes and tonnes of data. Surely NoSQL architectures allied with searching technologies such as MapReduce / Hadoop will open up new ways to do Science?
So any disadvantages with NoSQL architectures? Well it's still an immature technology. Indexing, Security models are just not as sophisticated as they are with classical relational databases. And because most of it is coming from the open source community the support is not as good as it is for relational databases. So don't throw out your SQL just yet!
PS Well done Dublin and winning the All Ireland!
References:
1. http://about.digg.com/blog/looking-future-cassandra
2. http://www.techrepublic.com/blog/10things/10-things-you-should-know-about-nosql-databases/1772
3. http://nosqltapes.com/
Is NoSQL just a controversial buzzword? Could you imagaine if the term 'Object Orientated' didn't exist and instead architectures based on concepts such as encapsulation, polymorphism and inhertiance were referred to as 'NoProcedural'? Could you imagine if .net was called 'NoJava'? Leinster was called 'NoMunster'?
Well controversial name aside, a good way to appreciate the hype about NoSQL is to consider scalability - the classical non-functional architectural concern. In a classical OLTP architecture, when load increases and your JVM is under pressure, you need to scale. You have two choices:
- vertical scaling - adding more CPU power to your JVM
- horizontal scaling - adding more JVMs (usally one more boxes)
The leading relational databases will scale horizontally. But only by so much. To get around this an architect usually will consider:
- Scaling vertically - putting the database on the best hardware that can be afforded
- Partitioning out legacy data and thus reduce things like the size of index tables. This will boost performance and put less pressure on the need to scale
- Remove the amount of pressure on the database by caching more in the business tier
- Pay a DBA a lot of money!
With a NoSQL database there is no strict schema. Everything is effectively collapsed into one very fat table - a bit like an old skool flat file, but where each row stores a huge amout of data. So, instead of having a table for Users and a table for Activities (representing User's activities), you put all the User information together in one fat row. This means there are no joins across tables. It also means there is a lot of data redundancy which means more storage space required. In addition, more computational power will be needed for writes. But because data that is used together is located at the very same place - within the same row - it means no complex joins and hence it is easier to scale. The computational requirement for reads is also less. So reads can go faster.
Another advantage of NoSQL databases is derived from the freedom that comes with not having to be tied to strict schema. You know that headache where a change to a data model can cause big problems? Well since there is no strict schema with NoSQL - this problem does not exist. This makes the architecture more flexible and more extensible.
Right now, it's fair to say NoSQL is only relevant in the minority of architectures. But could this be another case of technical innovation driving business innovation as we have seen with smart phones? There wasn't a need for smart phones but the technical innovation provided business opportunities. I think the same could happen with NoSQL Architectures.
Take a step back from Computer Science and just think Science. Science used to be hypotheisis centric, now it is becoming more and more data centric. CERN, genome sequencing, climate change analysis - all involve tonnes and tonnes of data. Surely NoSQL architectures allied with searching technologies such as MapReduce / Hadoop will open up new ways to do Science?
So any disadvantages with NoSQL architectures? Well it's still an immature technology. Indexing, Security models are just not as sophisticated as they are with classical relational databases. And because most of it is coming from the open source community the support is not as good as it is for relational databases. So don't throw out your SQL just yet!
PS Well done Dublin and winning the All Ireland!
References:
1. http://about.digg.com/blog/looking-future-cassandra
2. http://www.techrepublic.com/blog/10things/10-things-you-should-know-about-nosql-databases/1772
3. http://nosqltapes.com/

