Puppet – err: Could not retrieve catalog from remote server: Error 400 on SERVER

While I was deploying a simple Puppet setup, I faced the above error whenever I tried to execute the following on the Puppet client:

[root@puppetclient ~]# puppet agent --test --noop
info: Retrieving plugin
info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
info: Loading facts in /var/lib/puppet/lib/facter/oracle_database_homes.rb
info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
err: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find default node or by name with 'puppetclient.local, puppetclient' on node puppetclient.local
warning: Not using cache on failed catalog
err: Could not retrieve catalog; skipping run

I tried the PuppetLabs troubleshooting guide; however, I still could not find the root cause. I also tried to “clean up” the Puppet SSL certificates.

[root@puppetclient ~]# rm -f /var/lib/puppet/ssl/certs/puppetclient.local.pem
[root@puppetmaster ~]# puppet cert clean puppetclient.local
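
If you do clean the certificates, note that the client will need a fresh certificate signed on the master before it can fetch a catalog again. A sketch of that step (Puppet 3.x syntax, matching the commands above; you may also need to remove the rest of the client’s /var/lib/puppet/ssl directory first):

[root@puppetclient ~]# puppet agent --test --noop
[root@puppetmaster ~]# puppet cert sign puppetclient.local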

After further investigation, it turned out to be yet another silly mistake! On the Puppet master, puppetclient.local was not defined in nodes.pp. After adding the node definition, the issue was resolved:

node 'puppetclient.local' {
  include ntp
  include sysctl
}

Wei Shan




Puppet, a tool in the DevOps engineer’s Arsenal

What is Puppet?

Puppet is an open-source configuration management tool. Adopting DevOps means that you will need some sort of configuration management tool; otherwise you can’t automate your software delivery. Puppet is one of many configuration management tools out there, alongside Chef, Ansible and Salt. There are already a HUGE number of discussions online about Puppet vs Chef vs Ansible vs Salt, so I’m not going to cover that here.

What is the value of a configuration management tool?

The power of a configuration management tool becomes obvious when you have hundreds or thousands of nodes to manage. When you have fewer than 50 nodes, you can decide the purpose of each node and manually install the required packages on it. If there’s a patch, you can just install the fix easily. If there’s a need to disable SSLv2, you can just log in to each server and disable it. But consider the following:

  • What happens if you have 1000 nodes, and you need to disable SSLv2 on all of them?
  • What if you need to update the gcc packages on 40/1000 nodes?
  • What if the annoying IT auditors want to audit your security patches in your environment?

Why do I like Puppet?

I prefer using Puppet for the following reasons:

  • Non-programmer friendly. You don’t have to write any OO code. The Puppet DSL is a declarative language: you just specify the end state (see the sketch after this list).
  • Gentle learning curve. The Puppet DSL is quite intuitive.
  • Largest community around. This is important because you can get help easily via Google.
  • Additional plugin modules are available if you want to provision specific software using Puppet (for example, Oracle).
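
As a taste of the declarative style, the SSLv2 question above boils down to a single resource declaration. A minimal sketch, assuming Apache and the file_line resource type from the puppetlabs-stdlib module (paths and file names here are illustrative):

# Write a throwaway manifest and test it locally in noop mode
cat > /tmp/disable_sslv2.pp <<'EOF'
file_line { 'disable_sslv2':
  path  => '/etc/httpd/conf.d/ssl.conf',
  line  => 'SSLProtocol all -SSLv2 -SSLv3',
  match => '^SSLProtocol',
}
EOF
puppet apply --noop /tmp/disable_sslv2.pp

Once a resource like this lives in a module on the Puppet master, every agent converges to the same end state on its next run, whether you manage 10 nodes or 1000.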

How is using Puppet going to help me as a Database Administrator?

The typical databases-to-DBA ratio is about 250:1. The volume and complexity of the databases you manage keep growing. On top of that, you’ve got demands from the business to do more, faster, with the same amount of resources and manpower. Now, with additional Puppet modules, you can actually manage database-level configuration using Puppet.

For example:

tablespace { 'PIO_DATA':
  ensure                   => 'present',
  bigfile                  => 'yes',
  datafile                 => 'pio_data.dbf',
  size                     => '200G',
  logging                  => 'yes',
  autoextend               => 'on',
  next                     => '100M',
  max_size                 => '12288M',
  extent_management        => 'local',
  segment_space_management => 'auto',
}


I believe that using a configuration management tool to deploy databases is a new trend for DBAs. It allows us to manage thousands of databases easily. However, there are certain caveats. The Puppet oradb module is still quite new, and there are bugs that need to be ironed out. That said, I still believe it is an alternative to building your own DBaaS using Oracle OEM.

It is something I believe DBAs should explore to determine whether it’s suitable for their own environment.

Wei Shan


SQL Vs NoSQL Vs NewSQL Databases

Before I joined the database field in 2011, I was told that the role of a Database Administrator (DBA) is extremely dull and boring. So far, however, it has been nothing but exciting. Buzzwords like Big Data and IoT have generated new-found interest in databases, and a tremendous number of new databases have sprung up out of nowhere: CockroachDB, Riak, Voldemort and so on. In this post, I will be sharing about three different categories of databases.

SQL Databases

SQL databases, also known as relational database management systems (RDBMS), are what most companies run on today. Popular relational databases are Oracle, SQL Server, MySQL and PostgreSQL. Data is stored in rows and columns, and you use SQL to query it. A lot of applications run on these systems: bank transactions, data warehouse operations and payment processing. These are the core applications that most companies support.

Relational databases are famous for ACID: Atomicity, Consistency, Isolation and Durability. You can read about them over here. Basically, it means that once your transaction is committed, it is safe and there should not be any data loss, and simultaneous transactions should not interfere with each other.

NoSQL Databases

NoSQL is a term for databases that do not store their data in rows and columns. They are typically not ACID- but BASE-compliant, because they sacrifice consistency (C) for availability and partition tolerance (AP) in the CAP theorem. Also, you don’t have to define your schema upfront, so they are considered “schema-less”. Below are the different types of NoSQL databases:

  • Document => MongoDB
  • Key-Value => Redis
  • Graph => Neo4j
  • Column => Cassandra

Different types of NoSQL databases suit different use cases. For example, Neo4j is perfect as a data store for social media sites because it supports defining relationships between entities. MongoDB is typically used for e-commerce sites because each cart can be defined as one collection. Redis is often used as a queue or a caching tier that sits in front of another database. NoSQL is an entire subject of its own, and I’m probably not qualified to write anything more than an introduction on it. (But this won’t be true after I finish my MongoDB course!)

NoSQL databases can be accessed using SQL-like languages or their own native languages, like Neo4j’s Cypher.
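
To make that concrete, here is a sketch of the same lookup in SQL and in Cypher (the table, label and property names are purely illustrative):

-- SQL: find the names of person 1's friends
SELECT f.name
FROM people p
JOIN friendships fr ON fr.person_id =
JOIN people f ON f.id = fr.friend_id
WHERE = 1;

// Cypher: the same query, expressed as graph traversal
MATCH (p:Person {id: 1})-[:FRIEND]->(f:Person)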

NewSQL Databases

NewSQL is our shorthand for the various new scalable/high-performance SQL database vendors. We have previously referred to these products as “ScalableSQL” to differentiate them from the incumbent relational database products. Since this implies horizontal scalability, which is not necessarily a feature of all products, we adopted the term NewSQL in the new report. And to clarify, like NoSQL, NewSQL is not to be taken too literally: the new thing about the NewSQL vendors is the vendor, not the SQL.

– 451 Group’s senior analyst, Matthew Aslett

In short, the NewSQL vendors have designed solutions that bring the benefits of the relational model to distributed architectures, improving the performance of relational databases to the point where scalability is no longer an issue.

NewSQL databases are pretty new to the database industry. They are supposed to be ACID-compliant yet highly scalable, and they use SQL-like languages for CRUD operations. Below are some NewSQL databases:

  • VoltDB
  • Clustrix
  • NuoDB

To achieve this, different NewSQL vendors have implemented their designs differently. For example, NuoDB uses a two-tier approach: Transaction Engines hold a subset of the objects in memory, while Storage Managers are servers that hold a complete copy of all objects on disk. NuoDB also uses a Durable Distributed Cache (DDC), which sounds like a massive RAC Cache Fusion to me. VoltDB uses a mixture of partitioning, stored procedures as the unit of transaction, and deterministic command execution to achieve the same goals. Each of the NewSQL databases takes a very different approach!


In my opinion, NoSQL will never replace SQL databases. They are simply used for different scenarios and were never meant to replace each other. The RDBMS is just so good at handling OLTP transactions that you will never want to migrate to MongoDB or Riak, for performance and ACID-compliance reasons. Likewise, you wouldn’t want your 25-node MongoDB cluster to be migrated to Oracle databases. You would either die trying or your company would go bankrupt first.

However, I believe we will see a huge increase in companies adopting NoSQL, because storing and analysing data has become crucial for businesses to succeed. It’s becoming a key differentiator in the corporate world: it helps you understand your customers better. Look at companies like Uber or Amazon. They are really good examples of using data to succeed.

Some people may ask: if NoSQL/NewSQL is so powerful, why don’t people use it to replace their existing Oracle data warehouse? Typical Oracle shops spend at least a seven-figure sum on data warehouse licensing per year (EE Edition, RAC, AWR packs…). In my opinion, there are a couple of reasons why:

  1. Current analytics tools are pretty robust and mature, but they are only supported on ANSI-compliant SQL. Using a non-SQL or SQL-like tool would require a rewrite of all the existing software.
  2. An RDBMS contains dimension and fact tables which are already “cleaned”, compared to NoSQL.
  3. SQL is a very easy language to learn, and the existing users are just too comfortable with it.

However, I believe that NoSQL will soon rise to join the ranks of SQL databases in the data warehouse scene, because companies now want to store their unstructured data as well. They could use an RDBMS for their existing applications and NoSQL for unstructured data, then use something like Big SQL to query data from both types of databases, forming a data lake.

For DBAs like myself, it means you will get to play with more databases! :)


Pacemaker/Corosync/PostgreSQL Issue

The moment I sat down at my desk, there were tons of tickets complaining that the database was down. It was a 2-node PostgreSQL HA cluster running the following stack:

  • RedHat Linux 6.x
  • Pacemaker/Corosync
  • PostgreSQL 9.2.x
  • Master-Slave synchronous streaming replication between the 2 PostgreSQL nodes

I ran the crm_mon command immediately and found that something was wrong with PostgreSQL. The Pacemaker/Corosync HA wasn’t working as intended and the PostgreSQL database was down. Both nodes were online, but node1 was running as a slave and PostgreSQL wasn’t running at all on node2. I started going through the logs.

On Node1

Sep 21 04:10:19 node1 postgres[32318]: [5-1] FATAL: could not connect to the primary server: could not connect to server: No route to host
Sep 21 04:10:19 node1 postgres[32318]: [5-2] #011#011Is the server running on host "<IP address>" and accepting
Sep 21 04:10:19 node1 postgres[32318]: [5-3] #011#011TCP/IP connections on port 5432?
Sep 21 04:10:19 node1 postgres[32318]: [5-4] #011
Sep 21 04:10:22 node1 lrmd[3185]: notice: operation_finished: pgsql_monitor_7000:32319 [ 2015/09/21_04:10:22 INFO: Master does not exist. ]
Sep 21 04:10:22 node1 lrmd[3185]: notice: operation_finished: pgsql_monitor_7000:32319 [ 2015/09/21_04:10:22 WARNING: My data is out-of-date. sta

On Node2

Sep 21 15:28:41 node2 pengine[3194]: notice: LogActions: Start pgsql:1#011(node2)
Sep 21 15:28:41 node2 pengine[3194]: notice: process_pe_message: Calculated Transition 1120: /var/lib/pacemaker/pengine/pe-input-234.bz2
Sep 21 15:28:42 node2 postgres[61043]: [1-1] LOG: database system was shut down in recovery at 2015-09-19 23:36:56 SGT
Sep 21 15:28:42 node2 postgres[61044]: [1-1] FATAL: the database system is starting up
Sep 21 15:28:42 node2 postgres[61043]: [2-1] LOG: entering standby mode
Sep 21 15:28:42 node2 postgres[61043]: [3-1] LOG: could not read from log file 59, segment 114, offset 0: No such file or directory
Sep 21 15:28:42 node2 postgres[61043]: [4-1] LOG: invalid primary checkpoint record
Sep 21 15:28:42 node2 postgres[61043]: [5-1] LOG: could not read from log file 59, segment 114, offset 0: No such file or directory
Sep 21 15:28:42 node2 postgres[61043]: [6-1] LOG: invalid secondary checkpoint record
Sep 21 15:28:42 node2 postgres[61043]: [7-1] PANIC: could not locate a valid checkpoint record


  • Node1 thought it was the slave and hence was trying to connect to the master database on node2.
  • Node1 wasn’t able to connect to the IP address because that IP address was managed by Pacemaker/Corosync, and since PostgreSQL was down, the “IP address” resource wouldn’t be running!
  • Node2 couldn’t start the database because it was corrupted.

What I needed to do was promote the non-corrupted database (node1) to master and re-create the slave on node2!


Execute on node2

Stop the pacemaker/corosync services on node2

root# crm node standby node2
root# /etc/init.d/pacemaker stop
root# /etc/init.d/cman stop

Execute on node1

Promote node1 to master at both the Pacemaker/Corosync and database levels. You don’t have to reboot the server, but I did because I wanted a “clean” state on node1.

root#"crm resource migrate master-group node1
postgres#pg_ctl promote -D $PGDATA
root# init 6

Execute on node2

Re-create node2 as a slave by seeding it from node1, then start the Pacemaker/Corosync services on node2 once the database has been re-seeded.

root# rm -rf $PGDATA
postgres# pg_basebackup -D /var/lib/pgsql/9.2/data -h node1 --xlog -P -v
root# /etc/init.d/cman start
root# /etc/init.d/pacemaker start
root# crm node online node2
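
To confirm that the cluster and the replication are healthy afterwards, a quick sanity check along these lines helps (a sketch; resource names vary per cluster):

root# crm_mon -1
postgres# psql -x -c "SELECT * FROM pg_stat_replication;"

crm_mon should show both nodes online with the PostgreSQL resource promoted to master on node1, and pg_stat_replication on node1 should show node2 connected and streaming.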

Wei Shan


DRBD – &lt;resource&gt; not defined in your config (for this host)

I encountered the following error while creating a new DRBD resource for the HA cluster.

# drbdadm create-md r0
--== Thank you for participating in the global usage survey ==--
The server's response is:
'r0' not defined in your config (for this host).

Troubleshooting process:

The hostname is linuxzfs1.local:

[root@linuxzfs1 drbd.d]# uname -a
Linux linuxzfs1.local 3.10.0-229.11.1.el7.x86_64 #1 SMP Thu Aug 6 01:06:18 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

However, the resource was defined with the short hostname linuxzfs1 (note the on lines below):

[root@linuxzfs1 drbd.d]# cat /etc/drbd.d/r0.res
resource r0 {
 device /dev/drbd1;
 disk /dev/sdb;
 meta-disk internal;
 on linuxzfs1 {
 ...
 }
 on linuxzfs2 {
 ...
 }
}

Modify r0.res to ensure the hostnames match the full hostname in /etc/hostname:

[root@linuxzfs1 drbd.d]# cat r0.res
resource r0 {
 device /dev/drbd1;
 disk /dev/sdb;
 meta-disk internal;
 on linuxzfs1.local {
 ...
 }
 on linuxzfs2.local {
 ...
 }
}

After the file was modified, the drbdadm command executed successfully:

# drbdadm create-md r0
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
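
With the metadata in place, the usual next steps are to bring the resource up and check its state (a sketch of the standard DRBD 8.4 workflow: run the first two commands on both nodes, and the forced promotion on one node only, for the initial sync):

# drbdadm up r0
# cat /proc/drbd
# drbdadm primary --force r0

Watch /proc/drbd until both nodes report UpToDate before layering the HA cluster on top.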

Wei Shan


Disabling Transparent Huge Pages in CentOS 7.x

Transparent Huge Pages (THP) was introduced back in RedHat/CentOS 6, and in CentOS 7 the feature is turned on by default. Even though THP is supposed to improve memory performance, various database vendors (Oracle, MariaDB) recommend turning it off, because it can cause performance degradation for database workloads.

To verify that THP is enabled:

# cat /sys/kernel/mm/transparent_hugepage/defrag
[always] madvise never
# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never

Edit the rc.local file

Add the following to the bottom of /etc/rc.d/rc.local

if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
  echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi

Make rc.local file executable

chmod u+x /etc/rc.d/rc.local
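
The rc.local entry only takes effect at boot. To disable THP immediately on the running system, you can also write to the sysfs files directly as root:

# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# echo never > /sys/kernel/mm/transparent_hugepage/defrag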


Otherwise, reboot the server so that the rc.local change takes effect.

# init 6

To verify that THP is disabled:

# cat /sys/kernel/mm/transparent_hugepage/defrag
always madvise [never]
# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]

Wei Shan


SQL Server 2014 – Unable to add AlwaysOn AG Listener

We need an Availability Group (AG) listener to enable clients to connect to an availability replica without knowing the name of the physical SQL Server instance they are connecting to. This means that you do not have to manage client connectivity during a failover.

Recently, I was deploying a new setup with SQL Server 2014 on Windows Server 2012. It was a 2-node AlwaysOn AG setup. When I was trying to add the AG listener, I hit the following error:

[Screenshot: error message]

I tried troubleshooting the issue using the Microsoft documentation over here. However, the solutions provided did not work.

In the end, we found out that we needed a patch: SQL Server 2014 SP1. This patch contains more than just security fixes. I highly recommend installing it before you perform any SQL Server 2014 deployment!

Wei Shan


PostgreSQL 9.2 – Configuring plsh extension

This extension allows you to run shell commands from within the database, which is useful for moving or renaming files at the Unix level. The GitHub repository is over here. Below is a quick guide on how I got it up and running.

Create an auxiliary directory to hold the files.

# mkdir /tmp/plsh
# cd /tmp/plsh

Download and extract the tar files

# wget https://github.com/petere/plsh/archive/1.20130823.tar.gz
# tar xvf 1.20130823.tar.gz
# cd plsh-1.20130823

Install gcc and postgresql92-devel in order to compile plsh, then build and install it.

# yum install postgresql92-devel.x86_64
# yum install gcc
# make PG_CONFIG=/usr/pgsql-9.2/bin/pg_config
# make install PG_CONFIG=/usr/pgsql-9.2/bin/pg_config

Create the plsh extension. (This needs to be done in each database that requires plsh.)

# psql weishan
weishan=# create extension plsh;

Verify that the plsh extension has been installed correctly.

weishan=# \dx
                 List of installed extensions
  Name   | Version |   Schema   |         Description
---------+---------+------------+------------------------------
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
 plsh    | 2       | public     | PL/sh procedural language
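
As a quick sanity check, you can create a trivial function (the function name here is illustrative). A plsh function body is simply a shell script that starts with a shebang line:

weishan=# CREATE FUNCTION shell_hostname() RETURNS text AS '
#!/bin/sh
hostname
' LANGUAGE plsh;
weishan=# SELECT shell_hostname();

If this returns the database server’s hostname, shell commands are running from within the database as expected.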

Wei Shan


