SQL Vs NoSQL Vs NewSQL Databases
Before I join the database field in 2011, I was told that the role of a Database Administrator(DBA) is extremely dull and boring. However, until now, it has been nothing but exciting. Buzz words like Big Data and IoT have generated new found interest in databases. The number of new databases that have sprung out of nowhere is tremendous. CockroachDB, Riak, Voldemort and etc.. In this post, I will be sharing about 3 different category of databases.
SQL, also known as relational database management system(RDBMS), is what most companies are running on today. Popular relation databases are Oracle, SQL Server, MySQL and PostgreSQL. Data are store in rows and columns. You can use SQL to query for data in a RDBMS. A lot of applications are running on either of them. Bank transactions, Data Warehouse operations and payment transactions. These are the core applications that any companies will be supporting.
Relational databases are famous for ACID. Atomicity, Consistency, Isolation and Durability. You can read about them over here. Basically, it means that your transaction, once committed, it will be safe and there should not be any data loss. Simultaneous transactions should not interfere with each other.
NoSQL is a term for databases that does not store its data in rows and columns. They are typically not ACID but BASE compliant. This is because they sacrifice C for AP in CAP theorem. Also, you don’t have to define your schema upfront and is considered “schema-less”. Below are the different types of NoSQL databases
- Document => MongoDB
- Key-Value => Redis
- Graph => Neo4j
- Column => Cassandra
Different types of NoSQL database are specific for different use case. For example, Neo4j is perfect as a data-store for social media sites because supports defining relations between entities. MongoDB are typically used for e-commerce sites because each cart can be defined as 1 collection. Redis is often being used as a queue or caching tier that sits in-front of another database. NoSQL can be an entirely subject of it’s own and I’m probably not qualified to write anything more than an introduction on it. (But this won’t be true after I finish my MongoDB course!)
NoSQL can be accessed use SQL-LIKE language or their own native language like Neo4j-Cypher.
NewSQL is our shorthand for the various new scalable/high-performance SQL database vendors. We have previously referred to these products as “ScalableSQL” to differentiate them from the incumbent relational database products. Since this implies horizontal scalability, which is not necessarily a feature of all products, we adopted the term NewSQL in the new report. And to clarify, like NoSQL, NewSQL is not to be taken too literally: the new thing about the NewSQL vendors is the vendor, not the SQL. NewSQL is a set of various new scalable/high-performance SQL database vendors (or databases). These vendors have designed solutions to bring the benefits of the relational model to the distributed architecture, and improve the performance of relational databases to an extent that the scalability is no longer an issue.
– 451 Group’s senior analyst, Matthew Aslett
NewSQL are pretty new to the database industry. They are supposed to be ACID-compliant yet be highly scalable. Also, they are also using SQL-LIKE for CRUD operations. Below are some NewSQL databases.
In order to achieve the above requirements, different NewSQL vendors have implemented their design different.For example, NuoDB uses a “2 tier approach”. Transaction Engines hold a subset of the objects in memory while Storage Managers are servers that have a complete copy of all objects on disks. Also, NuoDB uses Durable Distributed Cache (DDC) which sound like a massive RAC-Cache Fusion to me. VoltDB uses a mixture of partitioning, stored procedures as a unit of transaction and deterministic command to attain the same. Each of the NewSQL databases have very different concepts to it!
In my opinion, NoSQL will never replace SQL databases. They are simply used for different scenario and it’s never meant to replace each other. The RDBMS is just so good in handling OTLP transactions that you will never want to migrate it to MongoDB or Riak because of performance or ACID compliance. Likewise, you wouldn’t want your 25-node MongoDB cluster to be migrated to Oracle databases. You will either die trying or your company will go bankrupt first.
However, I believe that we will see a huge increase in companies adopting NoSQL because storing and analysing data is becoming very crucial or business to succeed these days. It’s becoming a key differentiator in the corporate world. It helps understand your customers better. Look at companies like Uber or Amazon. They are real good examples of using data to succeed.
Some people may ask, if NoSQL/NewSQL is so powerful, why don’t people use it to replace their existing Oracle Data Warehouse. Typical Oracle shops spend at least 7-figure sum on the Data Warehouse licensing per year (EE Edition, RAC, AWR packs..). In my opinion, there are a couple of reasons why:
- Current analytics tool are pretty robust and mature. But they are only support on ANSI compliant SQL language. Using non-SQL or SQL-LIKE tool will require a re-write of all their existing software.
- RDBMS contains dimension and fact tables which are already “cleaned” as compared to NoSQL
- SQL is a very easy language to learn and the existing users are just too comfortable on it
However, I believe that NoSQL will soon rise to join the ranks of SQL database in the Data Warehouse scene. This is because companies now want to store their non-structured data as part of their data lake. So they could be using RDBMS for their existing applications and using NoSQL for non-structured data. Then they could use something like Big SQL to query data from both type of databases, forming something called Data Lake.
For DBAs like myself, it means you will get to play with more databases! 🙂