ASM Filter Driver CPU load – don’t care

Some time ago I wrote about ASM Filter Driver installation. If you are using AFD, then you might notice a permanent CPU load of 1.0 as we did. There was nothing else running, we stopped the Oracle Clusterware, stopped the Cloud Control Agent, still 1.0 load. Even after disabling the whole cluster stack and a reboot. But actually there were no processes visible that consumed CPU.

[root@vm101 ~]# crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.

That’s what “top” said after reboot:

top - 20:44:51 up 4 min,  1 user,  load average: 0.00, 0.03, 0.02
Tasks: 108 total,   2 running, 106 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.2 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  5889124 total,  5664764 free,   118360 used,   106000 buff/cache
KiB Swap:  6143996 total,  6143996 free,        0 used.  5718728 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2291 root      20   0  129880   1676   1196 R   0.3  0.0   0:00.06 top
    1 root      20   0   54256   3968   2320 S   0.0  0.1   0:02.13 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0

But when we loaded the AFD kernel driver, the load immediately went up to 1.0.

[root@vm101 ~]# lsmod |grep afd
[root@vm101 ~]# modprobe -r oracleafd
[root@vm101 ~]# lsmod |grep afd
oracleafd             205593  0

That’s what “top” told us now:

top - 20:52:20 up 12 min,  1 user,  load average: 1.06, 0.76, 0.37
Tasks: 104 total,   2 running, 102 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.1 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  5889124 total,  5647164 free,   129004 used,   112956 buff/cache
KiB Swap:  6143996 total,  6143996 free,        0 used.  5707868 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    1 root      20   0   54256   3972   2320 S   0.0  0.1   0:02.14 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd

So a SR with Oracle was opened to clarify that. The explanation is quite simple. Linux counts processes in an uninterruptible wait in the load average, even though these processes don’t use any CPU. Those processes can be identified by the “D” state in ps -l output.

Let’s check the “D” state as Oracle Support mentioned:

[root@vm101 ~]# ps -efl | grep -E '^. D'
1 D root      2322     2  0  80   0 -     0 AfdgWa 20:45 ?        00:00:00 [afd_log]

You can see the “D” state in the second column. So that’s it. Don’t care about the load. It does not tell the truth.

And many thanks to my collegue for investigating this issue.

Advertisements

All about Oracle naming

The todays blog post is meant to bring some light to the different names that are used in an Oracle database environment. I’ve been asked these questions about naming a million times. And there are so many of these naming parameters:

  • ORACLE_SID environment variable
  • SID_NAME and GLOBAL_DB_NAME in listener.ora
  • the TNS-Alias and SERVICE_NAME parameter in tnsnames.ora
  • DB_NAME, DB_UNIQUE_NAME, SERVICE_NAMES and INSTANCE_NAME in [s]pfile

These parameters are all there for a reason, but it is not quite obvious how they interact. So let’s start with a single database server and see how the startup procedure for an instance works:

  1. set ORACLE_SID environment variable to something, e.g. ORCL
  2. start SQL*Plus and connect “/ as sysdba”
  3. type “startup”
  4. Oracle is trying to find the parameters in $ORACLE_HOME/dbs/spfile<ORACLE_SID>.ora, e.g. spfileORCL.ora
    if there is no SPfile, it tries the pfile ORACLE_HOME/dbs/init<ORACLE_SID>.ora, e.g. initORCL.ora
  5. processes are started and named ora_<process>_<ORACLE_SID>, e.g. ora_smon_ORCL
  6. if a password file should be used, Oracle looks for “orapw<ORACLE_SID>” and opens it
  7. the “instance_name” parameter defines the name of the instance and defaults to $ORACLE_SID
  8. the “db_unique_name” parameter uniquely identifies the database system and defaults to “db_name”. This important for Data Guard environments where you have the same database running several times
  9. the trace directories in DIAGNOSTIC_DEST are named diag/rdbms/<db_unique_name>/<ORACLE_SID>
  10. the freshly started instance registers the following things with the listener
    1. the “instance_name” for the default service which is “db_unique_name”.”db_domain”
    2. the “instance_name” for any other service given in “service_names” parameter
  11. the parameter “db_name” is being used to check if the database files really belong to to this instance

 
Let’s look at the following example where I set all the parameters to different values to outline their usage:

[oracle@vm104 dbs]$ export ORACLE_SID=ORCL

[oracle@vm104 dbs]$ cat initORCL.ora
db_name=ORCLDB
db_unique_name=ORCLUQ
db_domain=DOMAIN.COM
instance_name=ORCLINST
service_names=ORCLSVC.FOO.ORG
memory_target=800M

[oracle@vm104 dbs]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Tue Aug 9 10:26:14 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount
ORACLE instance started.

Total System Global Area  838860800 bytes
Fixed Size                  2929936 bytes
Variable Size             520096496 bytes
Database Buffers          310378496 bytes
Redo Buffers                5455872 bytes
SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics
and Real Application Testing options

[oracle@vm104 dbs]$ ps -ef|grep ora_smon | grep -v grep
oracle   10260     1  0 10:26 ?        00:00:00 ora_smon_ORCL

[oracle@vm104 dbs]$ ls /u01/app/oracle/diag/rdbms/orcluq/ORCL/
alert  hm        incpkg  lck  metadata       metadata_pv  sweep
cdump  incident  ir      log  metadata_dgif  stage        trace


[oracle@vm104 dbs]$ lsnrctl status

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 09-AUG-2016 10:31:51

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date                07-JUN-2016 12:58:30
Uptime                    62 days 21 hr. 33 min. 21 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/grid/12.1.0/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/vm104/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.16.36.104)(PORT=1521)))
Services Summary...
Service "ORCLSVC.FOO.ORG" has 1 instance(s).
  Instance "ORCLINST", status BLOCKED, has 1 handler(s) for this service...
Service "ORCLUQ.DOMAIN.COM" has 1 instance(s).
  Instance "ORCLINST", status BLOCKED, has 1 handler(s) for this service...
The command completed successfully

So that’s it for the local naming. Now we come to the networking and to the corresponding files “tnsnames.ora” and “listener.ora”. There are other parameter to take care of.
You might need to add some static services to the listener if you want to do RMAN duplicates or simple want to connect as SYSDBA to a non-running instance from a remote host. It is similar to connecting locally, but instead of setting the environment variables, we just tell the listener the details. And we will need to set up the clients tnsnames.ora properly.

First the listener, we can define static services for each listener following this pattern:

SID_LIST_<name of listener> =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = <name that will be used as service>)
      (ORACLE_HOME = /u01/app/oracle/product/12.1.0/dbhome_1)
      (SID_NAME=<that will be ORACLE_SID>)
    )
  )

This will result in the listener.ora:

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = ORCL_GLB_NAME.MYDOMAIN.COM)
      (ORACLE_HOME = /u01/app/oracle/product/12.1.0/dbhome_1)
      (SID_NAME=ORCL)
    )
  )

Now let’s check what happens to the listener services:

[oracle@vm104 ~]$ lsnrctl reload

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 09-AUG-2016 10:47:18

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
The command completed successfully

[oracle@vm104 ~]$ lsnrctl status

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 09-AUG-2016 10:47:23

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date                07-JUN-2016 12:58:30
Uptime                    62 days 21 hr. 48 min. 52 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/grid/12.1.0/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/vm104/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
Services Summary...
Service "ORCL_GLB_NAME.MYDOMAIN.COM" has 1 instance(s).
  Instance "ORCL", status UNKNOWN, has 1 handler(s) for this service...

Ok, so far, so good. To establish a connection from a remote host, we need to modify the “tnsnames.ora” on the remote host.

First the pattern:

<My TNS Alias>=
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vm104)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = <service name that is known to the listener>)
    )
  )

We can use any value for “My TNS Alias” here, but we have to care about the SERVICE_NAME. So the “tnsnames.ora” will look like this:

ORCL_ARTIFICIAL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vm104)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ORCL_GLB_NAME.MYDOMAIN.COM)
    )
  )

And to finally connect to the idle instance remotely, we need a password file to authenticate the user:

[oracle@vm104 dbhome_1]$ orapwd file=$ORACLE_HOME/dbs/orapwORCL password=oracle

[oracle@vm104 dbhome_1]$ ls $ORACLE_HOME/dbs/*ORCL*
/u01/app/oracle/product/12.1.0/dbhome_1/dbs/initORCL.ora
/u01/app/oracle/product/12.1.0/dbhome_1/dbs/orapwORCL

Now this is the point when we can use “My TNS Alias” for connecting to our idle instance:

[oracle@vm104 dbhome_1]$ sqlplus sys/oracle@ORCL_ARTIFICIAL as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Tue Aug 9 11:03:49 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount
ORACLE instance started.

Total System Global Area  838860800 bytes
Fixed Size                  2929936 bytes
Variable Size             520096496 bytes
Database Buffers          310378496 bytes
Redo Buffers                5455872 bytes

This is the complete way:

  1. hand the alias <My TNS Alias> to SQL*Plus
  2. find alias <My TNS Alias> in tnsnames.ora and get <hostname>, <port> and <registered SERVICE_NAME>
  3. connect to given <hostname> and <port> and ask for connection to <registered SERVICE_NAME>
  4. if the listener does not know this <registered SERVICE_NAME> this will fail
  5. the listener forks a process for <ORACLE_SID> given in the SID_NAME parameter of the SID_LIST
  6. this process tries to find the password file orapw<ORACLE_SID> and authenticate the user
  7. the startup procedure is then identical to the one I described above

I hope this helps to get a better understanding of what parameter is being used for what purpose and how they interact. If you have any questions or additional hints, please feel free to comment.