Solaris 11 – Free Hands On Practise Lab!

Good Day!

I was doing some researching on database migration from AIX and Solaris and I found some juicy stuff! Oracle has provided some free lab session for new Solaris administrators to get their hands dirty. If you are new to Unix and want to try them out, you can check out the links below!

Just a side note, I was recently exposed to Wintel environments, either  Oracle/MSSQL on Windows Server 2008/2012. It was a utter pain to work with the permissions and services. Unix/Linux is still the number one enterprise operating systems, period.

Regards,
WeiShan

Leave a comment

Oracle DataGuard VS Storage Replication

Oracle DataGuard VS Storage Replication

Recently, I was involved in various projects on the design of the DR solution for Oracle databases. This question popped up at almost every time. I strongly believe that Oracle DataGuard is the correct solution for any Oracle DR solution.

Cost Effective

Oracle DataGuard comes free with Oracle Enterprise Edition (EE)! The only reason why a company might NOT use Oracle DataGuard is because they are not on EE and the cost of upgrading to EE just to use DataGuard does not justify the price.

Lower Bandwidth Utilisation

An insert to a table in Oracle generates alot of IO activities.

  • 8k insert into table
  • update of leaf indexes
  • 8k block of undo generated
  • redo log generated
  • archive log generated

All these unnecessary IOs are replicated to your DR site. If you are using DataGuard, only the stream of redo will be sent over to the DR :) Link

Lower Complexities and easier to manage.

During DR implementation.

The steps to create an Oracle DataGuard configuration are very well documented and straightforward. It is especially so if you are an Oracle DBA. If you find that the manual configuration (link) are too difficult, you can always create the physical standby using Oracle Enterprise Manager (OEM).

However, for storage replication, the LUN configuration must be extremely precise. All the LUNs used by the same database should be in the same consistency group. If it is a RAC database, then the OCR diskgroup must not be mirrored.

During DR exercise.

In Oracle DataGuard, to switchover to the standby database, you will only need to run a single command through the DataGuard broker.

DGMGRL> switchover to 'standby';

However, if we were to use storage replication, we will have to perform the following. Do note that all these steps are manual unless you can script them to run automatically.

  1. Break mirrored LUN
  2. Mount LUN onto standby database
  3. Startup database (Database will perform automatic instance recovery)

Data Block Corruption Detection

DataGuard is able to detect block corruption automatically as long as you have the parameters configured in the SPFile.

DB_BLOCK_CHECKSUM=FULL
DB_LOST_WRITE_PROTECT=TYPICAL

Storage replication does not have such capability because it does not have visibility in the Oracle database. To the storage layer, everything is in OS data block.

Oracle DataGuard Rolling Upgrade

Oracle DataGuard has a feature called rolling upgrade. It is part of the Oracle Maximum Availability Architecture (MAA). The purpose of rolling upgrade is to reduce/prevent planned downtime.

The concept of rolling upgrading is simple;

  • Patch the standby database first
  • Test if the patch is working
  • Switchover to standby database
  • Upgrade primary database
  • Switchback to primary database
  • DBA saves the day!!

If you are a more paranoid DBA, you could do the following prior to the steps above to further test the patch/upgrade;

  • Snapshot the primary database LUNs
  • Mount the snapshot as a test database
  • Run a test patch on the test database
  • Destroy the test database

Summary

I am an Oracle DBA by nature and an Oracle DBA likes full control. By using DataGuard, I have absolute control over the DR solution. I will not have to depend on the storage engineer and system engineer during the DR exercise. This is precisely why ASM technology was created, to allow DBA to have control over the database filesystems. Why would you want to have multiple stakeholders in the DR exercise when you can have a single person to do everything (hint:DBA!).

Of course, all the reasons above are pertaining to Oracle database. Storage replication is still a major advantage to have in your DR solution arsenal. It is perfect for non-clustered application.

Regards,
WeiShan

Leave a comment

Food for thought: Troubleshooting issues

Today, I found out that I made an extremely silly mistake.

This was a very simple project. A 2-Node Oracle RAC setup connected to a IBM v7000 storage. The deployment was smooth. I configured the ASM disk using udev and device-mapper. However, after the system was handover to the customer, they begin encountering reboots and the following error.

qla2xxx [0000:1b:00.0]-801c:1: Abort command issued nexus=1:1:0 -- 1 2002.

These were the steps I took to resolve the issue.

  1. Logging of service ticket to IBM, Red Hat and QLogic.
  2. Flashing of QLogic HBA firmware
  3. Patching of QLogic HBA driver

However, the issue was still not resolved. The root cause was infact that the udev was configured incorrectly! The SCSI timeout was supposed to be set to 120s according to IBM documentation. This was a mistake that made me angry with myself. These were the few reasons for my mistake

  1. Check and verify all configurations again. (This was the part I did not perform. I was VERY confident that I configured it correctly)
  2. Do not assume and bet on the root cause so quickly. (I was too quick to assume to root cause to be the firmware/driver issues)
  3. Do not be sway by the customer no matter how experienced they are. Have your own understanding and judgement!

 

Regards,
Wei Shan

Leave a comment

Oracle RAC 11gR2 – ORA-01102: cannot mount database in EXCLUSIVE mode

Oracle RAC 11gR2 – ORA-01102: cannot mount database in exclusive mode

In one of the customer site, I was unable to start both of the RAC instance at the same time. When 1 of the RAC instance was up, I couldn’t start the other RAC instance.

#sqlplus / as sysdba
SQL> startup;
ERROR at line 1:
ORA-01102: cannot mount database in EXCLUSIVE mode

ORA-01102 error occurs when 2 separate (Not RAC configured) Oracle instance tries to open the same database. In one of the RAC instance, one of the parameter was configured wrongly

CLUSTER_DATABASE = FALSE;

Solution;

SQL>alter system set cluster_database=TRUE scope=spfile sid='SID1';
SQL>shutdown immediate;
SQL>startup

Hope this helps.

Regards,
Wei Shan

 

 

Leave a comment

Oracle RAC – Configuring udev and device-mapper for ASM

Oracle RAC – Configuring udev and device-mapper for ASM

Since Oracle no longer provides asmlib RPMs for RHEL 6, we will need to take care of that during Oracle RAC installation. The most common way is to use udev rules to ensure the ASM disks permission survives through reboot. If the provided storage does not have a multi-pathing driver like SDDPCM or PowerPath for the OS, then we will need to use RHEL 6 default driver, device-mapper.

  • device-mapper => Default multi-pathing software in RHEL 6
  • udev => Used to create rules for the ASM disks

1. Ensure device-mapper rpm is installed in the system

rpm -qa | grep -i device-mapper*

2. Get the unique SCSI device ID for all the disks

# scsi_id -gud /dev/sd*

3. Create/Edit /etc/multipath.conf file – This configuration needs to be provided by the storage vendor.

RHEL62
 device {
 vendor "IBM"
 product "2145"
 path_grouping_policy group_by_prio
 getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
 features "1 queue_if_no_path"
 prio alua
 path_checker tur
 failback immediate
 no_path_retry "5"
 rr_min_io 1
# polling_interval 30
 dev_loss_tmo 120
 }
blacklist {
 wwid <local_disk>
 devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*" 
 devnode "^hd[a-z]" 
 devnode "^dcssblk[0-9]*"
}
defaults {
polling_interval 30 
}
multipaths { 
 multipath {
 wwid 360026b9000618571000008b54ea5360b 
 alias db
 }
}

4. Create the udev rules to ensure the ASM disks permission is set correctly – This is very specific to the version of the Linux. Please verify and test this out.

vi /etc/udev/rules.d/97-oracle-asmdevices.rules
KERNEL=="dm-*", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", OWNER="grid", GROUP="asmadmin", MODE="0660"

5. Ensure device-mapper service is always started.

#chkconfig multipathd on

6. Restart system

#init 6

7. Verify everything

#service --status-all | grep -i multipathd
#ls -al /dev/dm-*

Hope this helps!

Regards,
Wei Shan

Leave a comment

Being a database IT professional

Where to get knowledge on databases (Oracle/MSSQL/MongoDB)?

Being in a IT profession is one of the most challenging career. There are many areas of IT you can focus on. For example, security, networking, systems, application and many more! The worst thing is that all these demands for skill sets changes over time. If you bet your luck on a single technology, you WILL be out of the job some day in the future. (hint: lotus notes)

However, that is also the most interesting aspects of being a IT professional. You learn new things and adapt/evolve to the ever changing landscape of the IT industry. I have less than 3 years of experience in IT and have yet to experience the downfall of any major technology. However, I firmly believe  in the following;

  • You need to embrace changes. If you stop changing, you will get changed!
  • Be curious to all things
  • Be open-minded to new technologies

I would also like to recommend some goodies for my readers. The following lists are where I get my usual “self-improvement” materials from. Do note that they are mostly database related. Its a database blog after all!

  1. Microsoft SQL Server
  2. MongoDB
  3. Oracle YouTube Free Tutorial
  4. CBT Nuggets Training (Chargeable)

Globalisation is happening around the globe where IT team can be in anywhere supporting the business. We are competing with IT professionals globally. This does not just happen in IT industry alone. Being knowledgable in more than one database technology will definitely helps in the long run.

Cheers,
Wei Shan

 

1 Comment

[ALERT] – Oracle products affected by OpenSSL Heartbleed Exploit

[ALERT] – Oracle products affected by OpenSSL Heart Bleed Exploit

If you have been surfing the internet recently, then you should know that there is a known critical exploit affecting a majority of the websites using OpenSSL for encrypted session. I am not a security expert, but I do know that you have to start changing your passwords on most of the websites that you surf regularly.

Here is a list of the websites that are affected.

Oracle has also released document highlighted the products affected by the Heart Bleed bug. It includes the affected products and whether is there a security fix released for it.

You can find the document here.

Please take some time to go through the document and schedule the patch as soon as possible. If you databases is public facing, you should patch it IMMEDIATELY.

 

Cheers,
Wei Shan

Leave a comment

Oracle RAC 11gR2 – rootcrs.pl/roothas.pl -unlock hangs

Oracle RAC 11gR2 – rootcrs.pl/roothas.pl -unlock hangs

In order to patch the Oracle Grid Infrastructure, we will need to unlock the $GRID_HOME by running either of the command.

# $GRID_HOME/crs/install/rootcrs.pl -unlock
# $GRID_HOME/crs/install/roothas.pl -unlock

If the command hangs, please check the audit directory under $GRID_HOME/rdbms/audit.

rootcrs.pl -unlock or roothas.pl command will go through all the files in the audit directory and change the ownership and permission individually. If the list is huge (1 of my customers had 100k of files!), it will appear to be hanging.

It is part of the operational duty to house keep the audit directory. You may choose to run a cronjob to purge the audit files older than 30 days.

For more information, please refer to MOS [DOC ID 1352884.1].

Cheers,
Wei Shan

Leave a comment

Follow

Get every new post delivered to your Inbox.