Compound Index Vs Multiple Single Field Index in MongoDB 3.x

MongoDB supports both compound indexes and single-field indexes. It’s common knowledge that indexes slow down your inserts and updates but speed up your reads. However, if you need to index multiple fields, is it better to use a single compound index or multiple single-field indexes? This post investigates which is the better option in terms of performance.

Insert Performance

Test Setup:

  • Single MongoDB 3.2 instance running on my Mac
  • POC Driver uses single thread to perform inserts only
  • Test run for 5 mins
  • Empty collection after every run
  • Default readConcern
  • Default writeConcern

Sample Document:

> db.testCol.findOne()
{
 "_id" : {
 "w" : 2,
 "i" : 0
 },
 "fld0" : NumberLong(530752),
 "fld1" : ISODate("2015-06-02T08:14:18.687Z"),
 "fld2" : "tempor invidunt ut labore et",
 "fld3" : "eirmod tempor invidunt ut",
 "fld4" : "Lorem ipsum dolor sit amet,",
 "fld5" : ISODate("2016-10-12T01:58:24.580Z"),
 "fld6" : NumberLong(1712708),
 "fld7" : "sit amet. Lorem ipsum dolor",
 "fld8" : "sit amet, consetetur",
 "fld9" : NumberLong(652363),
 "bin" : BinData(0,"")
}

Phase 1 (No indexes except on the _id field)

After 295 seconds, 13340672 new records inserted - collection has 13340672 in total
46498 inserts per second since last report 99.97 % in under 50 milliseconds
0 keyqueries per second since last report 100.00 % in under 50 milliseconds
0 updates per second since last report 100.00 % in under 50 milliseconds
0 rangequeries per second since last report 100.00 % in under 50 milliseconds

~45,222 inserts/second

Phase 2 (Compound index on fld0 to fld9)
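
For reference, the compound index for this phase can be created along these lines (a sketch; the exact command isn't part of the POC Driver output, and the key order is assumed from the index name reported in the storage section below):

> db.testCol.createIndex(
   { fld0: 1, fld1: 1, fld2: 1, fld3: 1, fld4: 1,
     fld5: 1, fld6: 1, fld7: 1, fld8: 1, fld9: 1 }
 )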

After 295 seconds, 7739904 new records inserted - collection has 7739904 in total
20830 inserts per second since last report 99.72 % in under 50 milliseconds
0 keyqueries per second since last report 100.00 % in under 50 milliseconds
0 updates per second since last report 100.00 % in under 50 milliseconds
0 rangequeries per second since last report 100.00 % in under 50 milliseconds

~26,236 inserts/second

Phase 3 (Single-field index on each field, fld0 to fld9)
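
Similarly, the ten single-field indexes can be created with a small loop (again a sketch, not taken from the POC Driver output):

> for (var i = 0; i <= 9; i++) {
   var spec = {};
   spec["fld" + i] = 1;  // e.g. { fld0: 1 }
   db.testCol.createIndex(spec);
 }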

After 295 seconds, 2998272 new records inserted - collection has 2998272 in total
7012 inserts per second since last report 63.61 % in under 50 milliseconds
0 keyqueries per second since last report 100.00 % in under 50 milliseconds
0 updates per second since last report 100.00 % in under 50 milliseconds
0 rangequeries per second since last report 100.00 % in under 50 milliseconds

~10,163 inserts/second

Storage Utilisation

On the same collection with 2,817,024 documents:

Compound Index

> db.testCol.stats().indexSizes
{
 "_id_" : 75079680,
 "fld0_1_fld1_1_fld2_1_fld3_1_fld4_1_fld5_1_fld6_1_fl7_1_fld8_1_fld9_1" : 2617061376
}

~2.44GB

Single-Field Index

> db.testCol.stats().indexSizes
{
 "_id_" : 75079680,
 "fld0_1" : 65343488,
 "fld1_1" : 79781888,
 "fld2_1" : 35471360,
 "fld3_1" : 35475456,
 "fld4_1" : 35475456,
 "fld5_1" : 79781888,
 "fld6_1" : 65339392,
 "fld7_1" : 35475456,
 "fld8_1" : 35475456,
 "fld9_1" : 65347584,
}

~0.53GB
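
As a side note, the combined figure can also be read straight from the collection stats (total across all indexes, in bytes):

> db.testCol.stats().totalIndexSize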

Index Selectivity Usage

For single-field indexes, there’s no real limitation: the MongoDB query optimiser will pick whichever index it deems fit. For a compound index, it’s slightly different.

Queries like these will not use the compound index:

  • db.testCol.find({fld9:1,fld8:1,fld7:1})
  • db.testCol.find({fld9:1,fld3:1,fld6:1})

Queries like these will use the compound index:

  • db.testCol.find({fld0:1,fld1:1})
  • db.testCol.find({fld5:1,fld8:1,fld0:1}) (not very efficient..)
  • db.testCol.find({fld0:1,fld3:1,fld5:1})

The general idea is that MongoDB will use the compound index if the query matches an index prefix. Even when the query doesn’t match a prefix exactly, MongoDB might still use the index if that is more efficient than performing a full collection scan. The only way to find out is to test the queries yourself!
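
A quick way to test is explain(): if the winning plan contains an IXSCAN stage the query uses an index, while COLLSCAN means a full collection scan. A minimal sketch:

> db.testCol.find({fld0:1,fld1:1}).explain()
// inspect queryPlanner.winningPlan: an IXSCAN stage means the
// compound index was used; COLLSCAN means a full collection scan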

Summary

A very short summary in key points below:

  1. Having no indexes gives the best insert performance (duh!!)
  2. A compound index gives better insert performance than multiple single-field indexes
  3. A compound index uses more storage than multiple single-field indexes

Depending on the situation, choose your poison carefully 🙂

P.S: The test above does not attempt to quantify absolute performance numbers, but rather to give a good indication of the performance degradation ratio caused by indexes.

Regards,
Wei Shan

PostgreSQL 9.6 on RHEL 7.2 - FATAL: semctl(SETVAL) failed: Invalid argument

The error below started occurring after we upgraded to PostgreSQL 9.6 on RHEL 7.2. It was strange, because all we did was have a user log in, check the status of pgsql and log out.

 Dec 7 22:50:01 hostname systemd: Created slice user-1004.slice.
 Dec 7 22:50:01 hostname systemd: Starting user-1004.slice.
 Dec 7 22:50:01 hostname systemd: Started Session 2243 of user postgres.
 Dec 7 22:50:01 hostname systemd: Starting Session 2243 of user postgres.
 Dec 7 22:50:01 hostname systemd: Created slice user-0.slice.
 Dec 7 22:50:01 hostname systemd: Starting user-0.slice.
 Dec 7 22:50:01 hostname systemd: Started Session 2244 of user root.
 Dec 7 22:50:01 hostname systemd: Starting Session 2244 of user root.
 Dec 7 22:50:01 hostname postgres[12064]: [432-1] [unknown]: [local]: 58481389.2f20: LOG: connection received: host=[local]
 Dec 7 22:50:01 hostname postgres[12064]: [433-1] postgres: [local]: 58481389.2f20: FATAL: semctl(153976863, 2, SETVAL, 0) failed: Invalid argument
 Dec 7 22:50:01 hostname postgres[21799]: [432-1] : : 58444abd.5527: LOG: server process (PID 12064) exited with exit code 1
 Dec 7 22:50:01 hostname postgres[21799]: [433-1] : : 58444abd.5527: LOG: terminating any other active server processes
 Dec 7 22:50:01 hostname postgres[11514]: [434-1] repmgr: <IP_ADDRESS>(41171): 58480edb.2cfa: WARNING: terminating connection because of crash of another server process
 Dec 7 22:50:01 hostname postgres[11514]: [434-2] repmgr: <IP_ADDRESS>(41171): 58480edb.2cfa: DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
 Dec 7 22:50:01 hostname postgres[11514]: [434-3] repmgr: <IP_ADDRESS>(41171): 58480edb.2cfa: HINT: In a moment you should be able to reconnect to the database and repeat your command.
 Dec 7 22:50:01 hostname postgres[12074]: [434-1] [unknown]: ::1(48077): 58481389.2f2a: LOG: connection received: host=::1 port=48077
 Dec 7 22:50:01 hostname postgres[12074]: [435-1] repmgr: ::1(48077): 58481389.2f2a: FATAL: the database system is in recovery mode
 Dec 7 22:50:01 hostname Backup: [INFO]DB check start.
 Dec 7 22:50:01 hostname postgres[21799]: [434-1] : : 58444abd.5527: LOG: all server processes terminated; reinitializing
 Dec 7 22:50:01 hostname postgres[21799]: [435-1] : : 58444abd.5527: LOG: could not remove shared memory segment "/PostgreSQL.1721076088": No such file or directory
 Dec 7 22:50:01 hostname postgres[21799]: [436-1] : : 58444abd.5527: LOG: semctl(152961024, 0, IPC_RMID, ...) failed: Invalid argument
 Dec 7 22:50:01 hostname postgres[21799]: [437-1] : : 58444abd.5527: LOG: semctl(152993793, 0, IPC_RMID, ...) failed: Invalid argument

After researching the issue, we found that it is a known RHEL 7 bug (see the references below) :).

Resolution

1) Set RemoveIPC=no in /etc/systemd/logind.conf

2) Reboot the server or restart systemd-logind as follows:
 # systemctl daemon-reload
 # systemctl restart systemd-logind

References:

  1. https://bugzilla.redhat.com/show_bug.cgi?id=1287618
  2. https://bugzilla.redhat.com/show_bug.cgi?id=1264533

Note:

  • Please note that I have seen this RHEL 7 issue with Oracle databases too. It’s a RHEL 7 bug, so it should apply to most databases that use System V IPC calls.

Hope this helps.

Regards,
Wei Shan

MySQL 5.x – How to capture the number of inserts in a MySQL database

I was asked to capture the number of DML operations on a MySQL database recently. However, the performance schema was not enabled, so I couldn’t query it. Instead, I found a status variable called Innodb_rows_inserted.

mysql> show status like 'Innodb_rows_inserted'; select sleep(60) ; show status like 'Innodb_rows_inserted' ;
+----------------------+-----------+
| Variable_name | Value |
+----------------------+-----------+
| Innodb_rows_inserted | 576654182 |
+----------------------+-----------+
1 row in set (0.00 sec)

+-----------+
| sleep(60) |
+-----------+
| 0 |
+-----------+
1 row in set (1 min 0.00 sec)

+----------------------+-----------+
| Variable_name | Value |
+----------------------+-----------+
| Innodb_rows_inserted | 576684821 |
+----------------------+-----------+
1 row in set (0.01 sec)

To get the rate per second, simply perform the calculation below:

(576684821-576654182)/60 = ~510 inserts per second

Obviously, the number above covers inserts only. You can also modify the command above to include the following status variables:

  • Innodb_rows_inserted
  • Innodb_rows_read
  • Innodb_rows_updated
  • Innodb_rows_deleted

Hope this helps!

Regards,
Wei Shan

How to perform MongoDB Sharded Cluster Cross-DC Failover

Imagine the following MongoDB architecture: how do you perform a manual failover when the entire DC 1 is unavailable? I couldn’t find any official MongoDB documentation on this, so I decided to write my own.

Note:

  1. MongoDB version 3.2
  2. Config Server Replica Set
  3. DC1 is assumed to have failed completely and to be unrecoverable

[Figure: MongoDB sharded cluster architecture spanning DC1 and DC2]

High Level Steps:

  1. Stop mongos
  2. Update the mongos configuration to point to the new single-node config server replica set
  3. Reconfigure the config server to be a single-node replica set
  4. Update the shards collection in the config database
  5. Reconfigure the DC2 shards to be single-node replica sets
  6. Clean up the shard caches
  7. Start mongos

Detailed Steps

Stop mongos

systemctl stop mongos

Update the mongos configuration file to point to the new single-node config server replica set

From:
configDB: rs0/node1:27019,node2:27019,node3:27019

To:
configDB: rs0/node3:27019

Reconfigure the config server to be a single-node replica set

cfg = rs.conf()
// keep only the surviving config server (node3 here; adjust the array index to match your config)
cfg.members = [cfg.members[2]]
rs.reconfig(cfg, {force : true})

Update the shards collection in the config database

mongo localhost:27019
mongo> use config
mongo> db.shards.update(
 { _id: "<replSetName>" },
 { $set:
 {
 "host": "<replSetName>/<dc2_shard_hostname>:27017"
 }
 }
)
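
You can verify the change before moving on (still in the config database):

mongo> db.shards.find().pretty()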

Reconfigure the DC2 shards to be single-node replica sets

cfg = rs.conf()
// keep only the surviving DC2 member (adjust the array index to match your config)
cfg.members = [cfg.members[2]]
rs.reconfig(cfg, {force : true})

Clean up shard caches (on each new single-node shard replica set)

Add the following to the mongod configuration file and restart the mongod process:
setParameter:
   recoverShardingState: false
# systemctl restart mongod

Clean up the config database cache

mongo> use admin
mongo> db.system.version.remove(
   { _id: "minOpTimeRecovery" },
   { writeConcern: { w: "majority" } }
)

Then remove recoverShardingState: false from the mongod configuration file and restart the mongod process

Start mongos

systemctl start mongos
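
Once mongos is back up, it’s worth confirming that the cluster metadata looks sane, for example:

mongo> sh.status()
mongo> db.getSiblingDB("config").shards.find()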

Regards,
Wei Shan

A neutral perspective of the current database industry

A neutral perspective of the current database industry

According to db-engines, there are more than 317 database products in the market right now. If you need to develop a new application, how do you know which database to use? When I started my career in 2011, the de facto standard was to use Oracle Database. It was mature, reliable and able to support multiple types of workload. If you couldn’t afford Oracle Database, you could use MySQL if you required high performance with HA capabilities. If you required a mature product with strict compliance with ANSI SQL standards, there was always PostgreSQL. However, Oracle was still the #1 choice for a lot of reasons, and Oracle DBAs were also paid the most compared to DBAs of other databases.

Five years have passed, and a lot has changed in the database industry. Even though Oracle is still #1 in the market, there are many other comparable choices now. Oracle supporters will claim that Oracle Database is still the best and that no other database product comes close. They will also claim that NoSQL products are overhyped and will die down soon enough. Open-source supporters will spread hate on Oracle, claiming that it’s overpriced and that you don’t need Oracle Database at all.

In my opinion, enterprise companies, AKA companies that are loaded with cash, will continue to pay 7 figures for their Oracle support contracts. They continue to do so because they are not willing to migrate their legacy applications to something else. It also doesn’t make sense for them to use other database software when they have unlimited licence usage for Oracle software. And it’s possible that the CIO isn’t willing to risk his/her neck by changing the database stack 🙂 For companies that aren’t as rich, their ERP software will usually still be stuck on an Oracle database, at least until they move to a cloud ERP. For new applications, they are already looking at alternatives.

Oracle is still the #1 database in terms of technical superiority. No other database vendor has implemented anything like Oracle’s Real Application Clusters (RAC). It’s an extremely powerful feature that offers possibly zero downtime on node failure. However, how many real-life applications require that? Is it feasible to have that feature for an additional 23,000 USD per core? Is it acceptable to have 15 seconds of downtime during a node failure instead? That would cut your licensing cost down to almost nothing.

If you have a multi-terabyte application, the cost of running it on an Oracle database is not going to be cheap. A four-core EE-licensed server is going to cost you at least 200,000 USD already. If you run it on MongoDB or MySQL, the licence isn’t going to cost you anything. Obviously, your application needs to be able to work with that choice, including its ACID guarantees. If you help your company save that 200k USD, you might find a sweeter bonus at the end of the year!

Oracle Database has a lot of great technology behind it. Its query optimisation engine is far more impressive than that of any other vendor I know. Oracle had parallel query even before version 8i; PostgreSQL only got it recently. However, the Oracle database sales team has been pissing off its customers for a while now. I know that several banks in the ASEAN region are using PostgreSQL/MySQL for their new applications. Also, many companies are implicitly banning new Oracle database implementations unless there is no alternative. People didn’t have much choice when picking database software back then. It is vastly different now.

It’s a lot harder to pick the right database for new applications now. It used to be just Oracle and Oracle only. These days, you have to think about the nature and volume of the data and the application. You also need to decide whether your application requires ACID guarantees. If you are using NoSQL, there are graph, key-value, document and columnar stores. You don’t really want to use the wrong database for your application: using the wrong datastore will result in sub-optimal performance or ugly workarounds that you won’t like. For example, you could use MongoDB for text search, but a better database for a heavy text-search use case would be Elasticsearch. Just because MongoDB has text indexes and is able to perform text search doesn’t mean it’s the right datastore for it. Heck, you could even use MySQL as a pure key-value store. Databases are all about polyglot persistence these days.

I believe that in the years to come, Oracle Database will become the mainframe of database software: a highly profitable yet legacy piece of software that companies keep using only because they are too afraid to move to something else. For all my friends who still work actively on Oracle databases, perhaps it’s time to learn a new skill or move on to something else.

Regards,
Wei Shan

Tips on using MongoDB Shell

I’m currently taking the MongoDB online course M202. It’s an advanced course that covers sharded clusters, performance tuning and other useful topics. I have learnt a couple of tricks about the mongo shell from the course so far.

show profile

On a database that has profiling enabled, the show profile helper in the mongo shell displays the 5 most recent operations that took at least 1 millisecond to execute. Issue show profile from the mongo shell, as follows:

show profile

This is particularly useful when you don’t want to go through the entire collection. MongoDB’s query syntax isn’t the most intuitive language in the world, and it helps when you can just do a quick analysis of the slow queries. If you need something more detailed, query the system.profile collection directly.
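
For example, a query along these lines (a sketch; the 100ms threshold is arbitrary) pulls the five slowest recent operations from the profiling collection:

> db.system.profile.find({ millis: { $gt: 100 } }).sort({ ts: -1 }).limit(5).pretty()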

show log

There are many ways to view MongoDB log files:

  1. cat mongod.log
  2. via Kibana
  3. mongo shell, show log

I find that using #3 is handy, especially during a severity 1 issue. It shows all the log lines that are currently in memory, which also means it will only show recent, relevant logs 🙂
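
For reference, show logs lists the available log names and show log <name> prints one of them; on a typical instance it looks roughly like this:

> show logs
global
startupWarnings
> show log global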

Hope this helps!

Regards,
Wei Shan

Leave a comment

MongoDB 3.x – Maximum Oplog Size

The official MongoDB documentation says that the maximum oplog size is 50GB. With 5 years of Oracle database experience, I had developed the bad habit of trusting official documentation entirely. Following the procedure to resize the oplog on my laptop, I was able to resize it to 60GB instead.

To check the current oplog size:

my_rs_1:PRIMARY> rs.printReplicationInfo() 

configured oplog size:   61440MB
log length start to end: 1568secs (0.44hrs)
oplog first event time:  Sun Jan 29 2017 17:55:26 GMT+0000 (GMT)
oplog last event time:   Sun Jan 29 2017 18:21:34 GMT+0000 (GMT)
now:                     Sun Jan 29 2017 18:25:29 GMT+0000 (GMT)
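
Another way to read the configured maximum is from the oplog collection’s stats (a sketch; maxSize is reported in bytes):

my_rs_1:PRIMARY> use local
my_rs_1:PRIMARY> db.oplog.rs.stats().maxSize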

I have resized it to 60GB instead! 🙂

Working with MongoDB for the last 3 months has made me realise a lot of things that Oracle DBAs take for granted. Things that are supposed to work the way they are documented seem to be a bonus instead of a requirement.

P.S: In hindsight, I believe the 50GB upper bound only applies to the default size (5% of free disk space), not the maximum configurable size.

Hope this helps!

Regards,
Wei Shan

MongoDB 3.2 Sharded Cluster – rs.reconfig() does not work

Following this document to reconfigure a sharded replica set does not work. The replica set seems to be confused: rs.status() shows the correct output, but db.serverStatus() shows otherwise.

rs:PRIMARY> rs.status()
{
    "set" : "rs",
    "date" : ISODate("2017-01-18T03:05:06.202Z"),
    "myState" : 1,
    "term" : NumberLong(20),
    "configsvr" : true,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "members" : [
        {
            "_id" : 1,
            "name" : "node1:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 439,
            "optime" : {
                "ts" : Timestamp(1484708697, 1),
                "t" : NumberLong(20)
            },
            "optimeDate" : ISODate("2017-01-18T03:04:57Z"),
            "infoMessage" : "could not find member to sync from",
            "electionTime" : Timestamp(1484708696, 1),
            "electionDate" : ISODate("2017-01-18T03:04:56Z"),
            "configVersion" : 83051,
            "self" : true
        }
    ],
    "ok" : 1
}
rs:PRIMARY> db.serverStatus()
{
    "host" : "node1:27017",
    "process" : "mongod",
    "repl" : {
        "hosts" : [
            "node1:27017"
        ],
        "setName" : "rs",
        "setVersion" : 83051,
        "ismaster" : false,
        "secondary" : true,
        "primary" : "node1:27017",
        "me" : "node1:27017",
        "electionId" : ObjectId("7fffffff0000000000000014"),
        "rbid" : 398723451
Error log on mongos

2017-01-17T21:38:40.143Z W NETWORK [ReplicaSetMonitorWatcher] No primary detected for set rs0
2017-01-17T21:38:40.144Z W NETWORK [ReplicaSetMonitorWatcher] No primary detected for set rs1
2017-01-17T21:38:50.440Z W NETWORK [ReplicaSetMonitorWatcher] No primary detected for set rs0
2017-01-17T21:39:00.762Z W NETWORK [ReplicaSetMonitorWatcher] No primary detected for set rs1
2016-12-20T21:18:27.060Z W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: FailedToSatisfyReadPreference: could not find host matching read preference { mode: "primary" } for set rs0(config)
2016-12-20T21:18:29.187Z W NETWORK [mongosMain] Failed to connect to , reason: errno:65 No route to host
2016-12-20T21:18:29.188Z W NETWORK [mongosMain] Failed to connect to , reason: errno:65 No route to host
2016-12-20T21:19:44.217Z W NETWORK [mongosMain] Failed to connect to after 5000 milliseconds, giving up.
2016-12-20T21:19:44.218Z W NETWORK [mongosMain] No primary detected for set rs0(config)
2016-12-20T21:19:44.218Z I SHARDING [mongosMain] Error initializing sharding state, sleeping for 2 seconds and trying again :: caused by :: FailedToSatisfyReadPreference: could not get updated shard list from config server due to could not find host matching read preference { mode: "nearest" } for set rs0(config)

Error log on the single-node replica set mongod process

2017-01-17T23:50:36.601Z I NETWORK [HostnameCanonicalizationWorker] Starting hostname canonicalization worker 
2017-01-17T23:50:36.602Z I NETWORK [initandlisten] waiting for connections on port 27017 ssl 
2017-01-17T23:50:36.602Z I REPL [ReplicationExecutor] New replica set config in use: {...} 
2017-01-17T23:50:36.603Z I REPL [ReplicationExecutor] This node is in the config 
2017-01-17T23:50:36.603Z I REPL [ReplicationExecutor] transition to STARTUP2 
2017-01-17T23:50:36.603Z I REPL [ReplicationExecutor] Starting replication applier threads 
2017-01-17T23:50:36.604Z I REPL [ReplicationExecutor] transition to RECOVERING 
2017-01-17T23:50:36.608Z I REPL [ReplicationExecutor] transition to SECONDARY 
2017-01-17T23:50:36.608Z I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected 
2017-01-17T23:50:36.608Z I REPL [ReplicationExecutor] dry election run succeeded, running for election 
2017-01-17T23:50:36.608Z I REPL [ReplicationExecutor] election succeeded, assuming primary role in term 38 
2017-01-17T23:50:36.608Z I REPL [ReplicationExecutor] transition to PRIMARY 
2017-01-17T23:50:37.668Z I SHARDING [rsSync] Sharding state recovery process found document { _id: "minOpTimeRecovery", configsvrConnectionString: "rs0/", shardName: "rs0", minOpTime: { ts: Timestamp 1481206005000|6, t: 15 }, minOpTimeUpdaters: 0 } 
2017-01-17T23:50:37.669Z I SHARDING [ShardingState initialization] first cluster operation detected, adding sharding hook to enable versioning and authentication to remote servers
2017-01-17T23:50:37.690Z I SHARDING [ShardingState initialization] Updating config server connection string to: rs0/
2017-01-17T23:50:37.690Z I NETWORK [ShardingState initialization] Starting new replica set monitor for rs0/ 
2017-01-17T23:50:37.690Z I NETWORK [ReplicaSetMonitorWatcher] starting 
2017-01-17T23:50:37.703Z I SHARDING [thread1] creating distributed lock ping thread for process rs0/ (sleeping for 30000ms) 
2017-01-17T23:51:32.666Z I ACCESS [conn2] Successfully authenticated as principal user on admin 
2017-01-17T23:51:52.712Z W NETWORK [replSetDistLockPinger] Failed to connect to after 5000 milliseconds, giving up. 
2017-01-17T23:51:52.712Z W NETWORK [ShardingState initialization] Failed to connect to after 5000 milliseconds, giving up. 
2017-01-17T23:51:52.712Z W NETWORK [ShardingState initialization] No primary detected for set rs0(config)
2017-01-17T23:51:52.712Z W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: FailedToSatisfyReadPreference: could not find host matching read preference { mode: "primary" } for set rs0(config)
2017-01-17T23:51:52.712Z I SHARDING [ShardingState initialization] Error initializing sharding state, sleeping for 2 seconds and trying again :: caused by :: FailedToSatisfyReadPreference: could not get updated shard list from config server due to could not find host matching read preference { mode: "nearest" } for rs0(config)

Root Cause:

The config database information isn’t updated on the shard members. We need to remove the cached data from db.system.version, as in step 6 of the cross-DC failover procedure above.
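
For reference, that cleanup (run on each affected shard) is the same command shown in the failover post:

mongo> use admin
mongo> db.system.version.remove(
   { _id: "minOpTimeRecovery" },
   { writeConcern: { w: "majority" } }
)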

Hope this helps.

Regards,
Wei Shan
