Archive for category PostgreSQL 9.4

Thoughts on PostgreSQL JSON Vs MongoDB JSON Performance

Recently, via a random Quora answer from someone bashing MongoDB performance, I came across a three-year-old blog post from EnterpriseDB about benchmarking JSON performance in MongoDB vs PostgreSQL. I use both PostgreSQL and MongoDB on a daily basis in my current role. They are both great databases and good at what they were built to do. However, the benchmarking results were absolutely unbelievable. I couldn't believe them and had to read the source code myself.

There were a couple of reasons why the results were highly inaccurate.

INSERTs Test

It uses bash to generate massive INSERT shell scripts and inserts into mongod via the Mongo shell. This means that it is not benchmarking PostgreSQL against MongoDB. It is benchmarking PostgreSQL's psql against MongoDB's JavaScript shell (SpiderMonkey/V8). That's not a very fair benchmark, is it?

SELECTs Test

Same issue here. It uses the Mongo shell to run the SELECT queries, so the performance is effectively bounded by the speed of the JavaScript engine. MongoDB was still using V8 back in version 2.6. (Link)

Data Loading Test

This is the only fair comparison so far. It compares PostgreSQL's COPY against MongoDB's mongoimport. It was indeed true that PostgreSQL was faster than MongoDB in this test.

Summary

I love using both PostgreSQL and MongoDB. However, this benchmarking test fails to prove that PostgreSQL is indeed faster than MongoDB in JSON performance. Also, it did not share the postgresql.conf and mongod.conf configurations. MongoDB doesn't work well with the ext4 filesystem, so if they were using the AWS Linux AMI, it would have been on ext4 unless they took the effort to mount external EBS volumes.

Since the results are already 3 years old, I plan to benchmark PostgreSQL and MongoDB over the next few weeks using PostgreSQL 9.6 against MongoDB 3.4 with the WiredTiger storage engine. Also, I don't work for MongoDB or EnterpriseDB, so this will be an unbiased test 🙂

Regards,
Wei Shan


How to generate authentication file for PgBouncer

PgBouncer is a lightweight connection pooler that prevents the MAX_CONNECTION reached error in PostgreSQL. Basically, it does a very good job at what it says: it maintains a pool of connections for your applications, which reduces the overhead of setting up and tearing down a connection for every incoming client request. [link]
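For context, the pooling behaviour is driven by a handful of settings in pgbouncer.ini. A minimal sketch (the values below are illustrative assumptions, not the settings used in this post):

[databases]
; route connections for the "postgres" database to the local PostgreSQL instance
postgres = host=localhost port=5432 dbname=postgres

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = session
max_client_conn = 500
default_pool_size = 20

Clients then connect to port 6432 instead of 5432, and PgBouncer multiplexes them over at most 20 server connections per database/user pair.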

By default, it uses a plain-text password file for authenticating users to the PostgreSQL database. This will obviously raise a red flag during a security audit! Let's see what we can do to prevent that red flag 🙂

First, we have to install PgBouncer and psycopg2.

yum install pgbouncer.x86_64 python-psycopg2.x86_64

Now that we have PgBouncer installed, we need to configure it.

# cd /etc
# chown -R postgres. pgbouncer/
# su - postgres
$ cd /etc/pgbouncer/
$ ls
mkauth.py mkauth.pyc mkauth.pyo pgbouncer.ini

Now, let’s generate that password file 🙂

$ ./mkauth.py "/etc/pgbouncer/password.txt" "dbname='postgres' user='postgres' host='localhost'"
$ ls
mkauth.py mkauth.pyc mkauth.pyo password.txt pgbouncer.ini

Let’s see if we can view the password!

$ cat password.txt
"postgres" "" ""
"repmgr" "md58ea99ab1ec3bd8d8a6162df6c8e1ddcd" ""
"testuser" "md5d70a0452418aeb8fb4030eae69ca2856" ""

It reads the pg_shadow table and generates the password file. Obviously, the passwords are only salted with the username and hashed with MD5. It is not as secure as SHA-256, but it is still a lot more secure than reading the passwords in plain text, isn't it?
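For the curious, the stored value is simply md5(password || username) with an 'md5' prefix, so you can reproduce it yourself. A quick sketch, assuming a hypothetical password of 'secret' for testuser:

postgres=# SELECT 'md5' || md5('secret' || 'testuser') AS shadow_style_hash;

To make PgBouncer use the generated file, pgbouncer.ini would then point at it with something like auth_type = md5 and auth_file = /etc/pgbouncer/password.txt.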

Regards,
Wei Shan


WARNING – Granting of psql/sqlplus executables to sudoers in Linux/Unix

I discovered this behaviour accidentally during recent project work. I needed psql access to a RHEL 6 system, but the company enforces a minimum-access policy (well done!). Thus, my account was given sudoer permission to use only the psql executable, since I needed to perform DBA tasks on this system.

Let me show you the issue.

[weishan]$ sudo su -
[sudo] password for weishan:
Sorry, user weishan is not allowed to execute '/bin/su -' as root on this host.

[weishan]$ tail -n 10 /var/log/messages
tail: cannot open `messages' for reading: Permission denied

[weishan]$ sudo /home/postgres/pgsql/bin/psql -h localhost -p 5432 -U postgres postgres
[sudo] password for weishan:
psql (9.2.4)
Type "help" for help.

Looking at the output above, the sudoers entry was configured correctly! Now, let's see...

psql> \!
[root@gsdb03s log]# tail -n 5 /var/log/messages
May 10 15:34:47  snmpd[22564]: Connection from UDP: [ip]:63484->[ip]
May 10 15:34:47 snmpd[22564]: Connection from UDP: [ip]:63484-> [ip]
May 10 15:34:47 snmpd[22564]: Connection from UDP: [ip]:63484-> [ip]
May 10 15:34:47 snmpd[22564]: Connection from UDP: [ip]:63484-> [ip]

From the above, we can see that it is technically possible for unintended DBAs to get root access even if you did not grant it explicitly. Thus, we should be mindful when granting sudo access to the psql or sqlplus executables.

I did not test this with sqlplus, but the behaviour should be the same, since both psql and sqlplus allow the user to "temporarily" drop out of the console to OS-level access (\! in psql, host in SQL*Plus).

Moral of the story:

  1. Do not configure sudo to let users run psql/sqlplus as root directly just to log in to postgres.
  2. Instead, configure sudo to only allow the weishan user to “sudo su -” to the postgres user (see the sketch below).
  3. Configure logging for sudo access.
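
A minimal sketch of such a sudoers entry, assuming the OS account is named weishan and PostgreSQL runs as the postgres OS user (adjust names and paths for your environment):

# /etc/sudoers.d/weishan - only allow switching to the postgres OS user
weishan ALL=(root) /bin/su - postgres

# Keep an audit trail of all sudo activity
Defaults logfile="/var/log/sudo.log"

With this in place, weishan can become postgres (and use psql as that user) but cannot obtain a root shell through the database client.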

Regards,
Wei Shan


Playing Around with PostgreSQL date/time Datatype

To check current timestamp. (Notice it’s GMT+8 in the results)

postgres=# select now();
 now
-------------------------------
 2016-02-29 14:19:36.046499+08
(1 row)

To check current timestamp 1 hour ago.

postgres=# select (now() - interval '1 hour');
 ?column?
-------------------------------
 2016-02-29 13:20:47.030513+08
(1 row)

To check current timestamp 1 day ago.

postgres=# select (now() - interval '1 day');
 ?column?
-------------------------------
 2016-02-28 14:20:53.820484+08

To check current timestamp in epoch time. (Unix timestamp in seconds)

postgres=# select extract(epoch from NOW());
 date_part
------------------
 1456726955.54147
(1 row)

To check current timestamp in epoch time. (Unix timestamp in milliseconds)

postgres=# select extract(epoch from NOW())*1000;
 ?column?
-----------------
 1456727010006.5
(1 row)

Bonus! By default, dividing two integers returns an integer. You need to cast to float to get a fractional result.

postgres=# select 1/2;
 ?column?
----------
 0
(1 row)
postgres=# select 1/2::float;
 ?column?
----------
 0.5
(1 row)

I had quite some “fun” playing with all these date/time datatypes recently.

Regards,
Wei Shan


PostgreSQL 9.x – pg_basebackup: could not get transaction log end position from server: FATAL:  archive member “core.56229” too large for tar format

While I was trying to re-deploy a PostgreSQL streaming replication slave from the master node, I hit the following error:

[root@node2~]# su - postgres
-bash-4.1$ /usr/pgsql-9.2/bin/pg_basebackup -D /data/var/lib/pgsql/9.2/data -h node1 --xlog -P -v
transaction log start point: 70/CD000020
166537602/235436024 kB (100%), 2/2 tablespaces
pg_basebackup: could not get transaction log end position from server: FATAL: archive member "core.56229" too large for tar format

From node1, I found the following files in the $PGDATA:

4.2G ./core.53808
4.2G ./core.55595
4.2G ./core.55698
4.2G ./core.56229

These were core dumps from previous PostgreSQL crashes. By default, PostgreSQL does not create data files larger than 1GB, so I figured these were not datafiles. I asked Uncle Google and confirmed that they were core dumps produced when the PostgreSQL server crashed.
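
A quick way to confirm what is eating the space before cleaning up (a hedged sketch; it assumes $PGDATA points at the data directory on node1):

# list any core dumps larger than 1GB sitting in the data directory
find "$PGDATA" -maxdepth 1 -name 'core.*' -size +1G -exec ls -lh {} \;

# once confirmed, move them out of $PGDATA (or delete them) before re-running pg_basebackup
mv "$PGDATA"/core.* /tmp/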

Once I removed the files from the $PGDATA directory, the pg_basebackup command executed successfully. The interesting thing I learnt is that pg_basebackup uses the tar format in the backend to package and send the files over to the slave database.

Regards,
Wei Shan


PostgreSQL 9.4 – Configuring Replication Slots

PostgreSQL 9.4 introduces a very useful feature called replication slots. Previously, we had to configure a high value for the wal_keep_segments parameter in postgresql.conf, estimated from the delta change on the master database. Alternatively, we could configure archive_command to move the WAL files to another location for the slave to pick up. Both methods require more capacity than strictly necessary to store the WAL archives. This new feature is just plain cool 🙂

You can read more about it over here.

Step 1: Modify MAX_REPLICATION_SLOTS parameter in postgresql.conf on master

1.1) Change max_replication_slots = 0 to max_replication_slots = 3
1.2) pg_ctl restart

Step 2: Create the replication slot on master

postgres=# SELECT * FROM pg_create_physical_replication_slot('postgres_slave');
 slot_name | xlog_position
 ----------------+---------------
 postgres_slave |

Step 3: Configure PRIMARY_SLOT_NAME in recovery.conf on slave

# echo "primary_slot_name = 'postgres_slave'" >> recovery.conf
# pg_ctl restart

Step 4: Verify configuration on master

postgres=# SELECT * FROM pg_replication_slots;
 slot_name | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn
 ----------------+--------+-----------+--------+----------+--------+------+--------------+------
 postgres_slave | | physical | | | t | | | 1/50002B0
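
One thing to keep an eye on: a slot retains WAL on the master for as long as its slave is disconnected. A rough way to check how much WAL each slot is holding back on 9.4 (a sketch, run on the master):

postgres=# SELECT slot_name, active,
postgres-#        pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn)) AS retained_wal
postgres-#   FROM pg_replication_slots;

If retained_wal keeps growing while a slot is inactive, the slave is down and pg_xlog will keep filling up until the slot is dropped or the slave reconnects.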

Cheers,
Wei Shan
