Thoughts on PostgreSQL JSON Vs MongoDB JSON Performance – Part III (Final)
In part I and part II of this blog series, I shared my opinions on the test results that were released. I also ran the same test on newer versions of both MongoDB and PostgreSQL. I didn’t want to use an existing benchmarking tool like YCSB because it wasn’t fast enough to push the databases to their limits. Hence, I enlisted the help of my colleague, Dominic, to write a new tool in Golang to do it. It uses very little memory and doesn’t use any locks! Dominic and I also presented the results at Percona Live Europe 2017.
The correct way to do performance benchmark
In naive performance benchmarking, people measure the average latency, or the time taken to perform a fixed number of transactions. Doing this hides important details about the performance. Completing 1 million operations in 10 minutes is not good if a single operation can take 5s, or if 10% of the operations take 10s to complete. We should measure performance in two ways:
- 99th percentile latency of the operations
- Throughput (number of operations per second)
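The difference between the two views can be made concrete with a short sketch in Go, the language our benchmarking tool is written in. The function names and the sample data below are illustrative, not taken from the actual tool: one slow tail in a hundred operations barely moves the average, but the 99th percentile exposes it immediately.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 100) of the given
// latencies, using the nearest-rank method on a sorted copy.
func percentile(latencies []float64, p float64) float64 {
	s := append([]float64(nil), latencies...)
	sort.Float64s(s)
	rank := int(math.Ceil(p / 100 * float64(len(s))))
	return s[rank-1]
}

// mean returns the average latency.
func mean(latencies []float64) float64 {
	var sum float64
	for _, v := range latencies {
		sum += v
	}
	return sum / float64(len(latencies))
}

func main() {
	// 98 fast operations at 10 ms, plus 2 stalls at 5000 ms.
	lat := make([]float64, 0, 100)
	for i := 0; i < 98; i++ {
		lat = append(lat, 10)
	}
	lat = append(lat, 5000, 5000)

	fmt.Printf("mean: %.1f ms\n", mean(lat))           // mean: 109.8 ms — looks healthy
	fmt.Printf("p99:  %.1f ms\n", percentile(lat, 99)) // p99:  5000.0 ms — reveals the tail
}
```

The average suggests a database that responds in about 110 ms, while the 99th percentile shows that 2% of requests take 5 seconds, which is exactly the kind of behaviour an average-only benchmark hides.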
In our results, MongoDB takes less time to insert 10 million records. So MongoDB > PostgreSQL, right?
However, when we look at the 99th percentile graph, we notice something strange.
From the graph, we can see that MongoDB has a very long “tail”. This means it has a significant number of operations taking more than 1s. For a platform with an SLA of a 1s response time, if your database alone takes 1s to complete a request, you are in a bad situation.
FreeBSD ships with an extremely powerful tool called DTrace. It allows you to write scripts that hook into the kernel to see exactly what the database is waiting on. With it, we can measure the database’s file system operations.
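As a minimal sketch of the kind of script we used: the probes below time every `write(2)` call made by the database and aggregate the results into a nanosecond histogram. The process name and probe choice here are illustrative, not the exact script from our runs.

```d
#!/usr/sbin/dtrace -s
/* Time spent in write(2) by postgres processes, as a ns histogram. */

syscall::write:entry
/execname == "postgres"/
{
	self->ts = timestamp;
}

syscall::write:return
/self->ts/
{
	@latency["write(2) ns"] = quantize(timestamp - self->ts);
	self->ts = 0;
}
```

Swapping the `execname` predicate for `"mongod"` gives the equivalent view of MongoDB’s writes, which is how the two graphs below were produced.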
PostgreSQL – Time taken to write in nanoseconds vs throughput
MongoDB – Time taken to write in nanoseconds vs throughput
MongoDB’s graph fluctuates a lot, whereas PostgreSQL’s is stable and consistent. Although MongoDB sometimes outperforms PostgreSQL tremendously, at other times it takes 4x as long as PostgreSQL. This probably correlates with the 1s to 3s response times in the INSERTs graph above. IMO, a good database should provide consistent and predictable performance. MongoDB’s performance behaviour is not acceptable in this scenario.
MongoDB cache eviction/checkpoint bug?
Based on our diagnosis, we discovered that the issue happens during checkpoints (every 60s). During a checkpoint, WiredTiger writes all the data in a snapshot to disk in a way that is consistent across all data files. The now-durable data act as a checkpoint in the data files. This means that every 60s there is a massive drop in throughput, because the checkpoint flushes everything to disk and fills up the disk I/O queue.
In PostgreSQL, checkpoint_completion_target lets you spread the checkpoint writes continuously over the interval until the next checkpoint. Yes, this means your peak throughput will be lower. In return, however, you get predictable database performance.
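For illustration, a postgresql.conf fragment along these lines spreads the checkpoint I/O instead of bursting it; the values here are examples, not the settings used in our benchmark runs.

```ini
# postgresql.conf — smooth out checkpoint I/O (illustrative values)
checkpoint_timeout = 5min            # time between scheduled checkpoints
checkpoint_completion_target = 0.9   # spread writes over 90% of that interval
max_wal_size = 4GB                   # avoid extra, size-triggered checkpoints
```

With a target of 0.9 and a 5-minute interval, PostgreSQL paces the checkpoint writes over roughly 4.5 minutes rather than dumping them into the I/O queue all at once.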
This is not a FreeBSD/ZFS issue; it applies to Linux as well. We have reproduced it with CentOS 7 on XFS.
Indeed, we have seen MongoDB achieve a higher maximum throughput than PostgreSQL. However, that maximum throughput will drop significantly during checkpoints. PostgreSQL, on the other hand, delivers an extremely stable and predictable throughput. The design of MongoDB/WiredTiger needs to be improved to minimise the impact of checkpoints. MongoDB has also acknowledged this problem with a new Jira ticket to track the issue. With this MongoDB issue outstanding, it’s not fair to declare which database is faster with JSON/JSONB.
I do hope you have enjoyed this series. I learnt loads while writing these blog posts. Also, big thanks to Dom for helping me with the benchmarking tool and for preparing for Percona Live Europe together.
Wei Shan ANG