Change time zone in Oracle’s DBaaS

Recently I assisted in a project that is utilizing Oracle Cloud Infrastructure (OCI) to provide application servers and Oracle databases. The customer is located in New Zealand, so they obviously wanted to work with dates and times in their local time zone. But when it came to the Database Scheduler, we found, that is was using UTC rather than the local time zone. This makes totally sense in the first place, but it’s not what we wanted to have.
Here is, what it looked like initially.

[oracle@ecotes-dbt ~]$ date
Thu Nov 14 07:05:29 UTC 2019

SQL> select sysdate from dual;

SYSDATE
----------------
14.11.2019 07:06

So at first we checked and changed the time zone for the underlying server. This is done by pointing /etc/localtime to another file.

[opc@ecotes-dbt ~]# sudo su - 
[root@ecotes-dbt ~]# ll /etc/localtime
lrwxrwxrwx 1 root root 23 Oct  7 06:51 /etc/localtime -> /usr/share/zoneinfo/UTC
[root@ecotes-dbt ~]#  ll /usr/share/zoneinfo/NZ
-rw-r--r-- 4 root root 2434 Jan 29  2018 /usr/share/zoneinfo/NZ
[root@ecotes-dbt ~]# ll /usr/share/zoneinfo/Pacific/Auckland
-rw-r--r-- 4 root root 2434 Jan 29  2018 /usr/share/zoneinfo/Pacific/Auckland
[root@ecotes-dbt ~]# rm -f /etc/localtime
[root@ecotes-dbt ~]# ln -s /usr/share/zoneinfo/Pacific/Auckland /etc/localtime
[root@ecotes-dbt ~]# ll /etc/localtime
lrwxrwxrwx 1 root root 36 Nov 14 20:10 /etc/localtime -> /usr/share/zoneinfo/Pacific/Auckland

[oracle@ecotes-dbt ~]$ date
Thu Nov 14 20:11:31 NZDT 2019

Now the database is picking up the new settings from the OS.

SQL> select sysdate from dual;

SYSDATE
----------------
14.11.2019 20:11

Unfortunately, the database scheduler has it’s own settings. So we needed to change that too.

SQL> SELECT *
2    FROM   dba_scheduler_global_attribute
3    WHERE  attribute_name = 'DEFAULT_TIMEZONE';   

ATTRIBUTE_NAME         VALUE
---------------------- -----------
DEFAULT_TIMEZONE       Etc/UTC

SQL> select dbms_scheduler.stime from dual;

STIME
---------------------------------------------------------------------------
14-NOV-19 08.26.59.556872000 AM ETC/UTC

But we can simply change this default with a single call to DBMS_SCHEDULER.SET_SCHEDULER_ATTRIBUTE.

SQL> exec DBMS_SCHEDULER.SET_SCHEDULER_ATTRIBUTE('default_timezone', 'Pacific/Auckland');

PL/SQL procedure successfully completed.

SQL> SELECT *
2    FROM   dba_scheduler_global_attribute
3    WHERE  attribute_name = 'DEFAULT_TIMEZONE';   

ATTRIBUTE_NAME         VALUE
---------------------- -----------
DEFAULT_TIMEZONE       Pacific/Auckland

SQL> select dbms_scheduler.stime from dual;

STIME
---------------------------------------------------------------------------
14-NOV-19 09.31.26.365519000 PM PACIFIC/AUCKLAND

So that’s it. Now we have our DBaaS running with the time zone specific to New Zealand.

Privilege Analysis

Todays blog post is about security. When creating database users, the least privilege principle should apply. That means, the user gets only those privileges that are needed for the work it needs to do. But we all know that many applications come with extensive requirements for the database user accounts. In most cases the development was done using DBA privileges or similar so the final installation requires these privileges too. A DBA will not like that at all, but it is difficult to identify all the privileges that are needed to apply the least-privilege-principle. That is where the “Privilege Analysis” feature comes into play. It was introduced with Oracle Database 12.1 but unfortunately it was part of the “Database Vault” extra-cost option. With the Database Release 18 this limitation was lifted and backported to 12.1 so it is now available as a standard feature of Enterprise Edition.
The following small and simple example will outline the basic functionality of the feature. The main parts are the DBMS_PRIVILEGE_CAPTURE package and the DBA_USED_PRIVS view. In the example there is a central schema named “NEXUS” that is being accessed by other users named “NEXUS”. There are roles in place wich contain all required privileges and are granted to those users. But in addition to that the “SELECT ANY TABLE” system privilege is required to make the application work. This requirement needs to be eliminated.
The first step is to create a rule that defines what should be monitored. Since a “like” expression cannot be used, it is done with a “between” that actually does the same.

BEGIN
  DBMS_PRIVILEGE_CAPTURE.CREATE_CAPTURE(
    name        => 'COLLECT_NEXUS' ,
    description => 'Ermittlung verwendeter Privilegien von Nexus' ,
    type        => DBMS_PRIVILEGE_CAPTURE.G_CONTEXT ,
    condition   => 'SYS_CONTEXT (''USERENV'',''SESSION_USER'') between ''NEXUS'' and ''NEXUT'''
  );
END;
/

In order to gather some data, the rule needs to be enabled.

BEGIN DBMS_PRIVILEGE_CAPTURE.ENABLE_CAPTURE('COLLECT_NEXUS'); END;
/

Having done this, we run through the questionable part of the application. The database will capture all the privileges we use during that time. So it is essential to deactivate this right after this.

BEGIN DBMS_PRIVILEGE_CAPTURE.DISABLE_CAPTURE('COLLECT_NEXUS'); END;
/

The captured data can now be analyzed using the DBA_USED_PRIVS view. In our case this looked like this:

SQL> SELECT DISTINCT USERNAME, SYS_PRIV, OBJ_PRV, OBJECT_OWNER,OBJECT_NAME,USERNAME FROM DBA_USED_PRIVS WHERE CAPTURE='COLLECT_NEXUS';

USERNAME	SYS_PRIV	     OBJ_PRIV	 USER_PRIV	 OBJECT_OWNER	 OBJECT_NAME
----------- ---------------- ----------- ----------- --------------- ---------------
NEXUS13	    CREATE PROCEDURE				
NEXUS11	    CREATE PROCEDURE				
[...]
NEXUS11	    CREATE TABLE				
NEXUS13	    CREATE VIEW				
NEXUS11	    CREATE VIEW				
NEXUS11	    SELECT ANY TABLE                        NEXUS	        KASSEKAT
NEXUS11	    SELECT ANY TABLE			            NEXUS           SPRACHEN
NEXUS13		                  SELECT		        SYS             ALL_TAB_COLUMNS
NEXUS13		                  SELECT		        SYS             ALL_CONSTRAINTS
NEXUS11		                  SELECT		        NEXUS	        BEREICHSZAHLEN
NEXUS13		                  SELECT		        SYS	            ALL_OBJECTS
NEXUS		                  EXECUTE		        SYS	            DBMS_OUTPUT
NEXUS13		                  SELECT		        NEXUS	        KAPITEL
[...]

As we can see, the application used the SELECT ANY TABLE privilge to access two tables in the main schema. The users already have SELECT privileges granted on those tables, but through a role. Since the application is using PL/SQL as well, this grant cannot be used. In this case we need to do a direct grant for these two tables. And as soon as we did that, we were able to revoke the SELECT ANY TABLE privilege with no negative impact to the application.
This example outlines the possibilities of the feature. Since it is now free-of-charge for Enterprise Edition users, it becomes quite useful. Only Standard Edition users are not allowed to make use of it at all.

Duplicate Column Names?

Recently, at a customers site, we stumbled upon an error in one of the databases alert.log. It turned out, that it was caused by the nightly statistics gathering job. There was one table for which statistics could not be generated.

SQL> exec dbms_stats.gather_table_stats('BASIS','VG_XXXXXXXXXXX', method_opt=>'for all columns size 1');

BEGIN dbms_stats.gather_table_stats('BASIS','VG_XXXXXXXXXXX', method_opt=>'for all columns size 1'); END;
 
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00001: unique constraint (SYS.I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST) violated
ORA-06512: at "SYS.DBMS_STATS", line 34850
ORA-06512: at line 1

So we first tried to remove the statistics, but we were unable to.

SQL> exec dbms_stats.delete_table_stats('BASIS','VG_XXXXXXXXXXX');
BEGIN dbms_stats.delete_table_stats('BASIS','VG_XXXXXXXXXXX'); END;
 
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00001: unique constraint (SYS.I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST) violated
ORA-06512: at "SYS.DBMS_STATS", line 17279
ORA-06512: at line 1
 
SQL> exec dbms_stats.delete_table_stats('BASIS','VG_XXXXXXXXXXX',force=>true);
BEGIN dbms_stats.delete_table_stats('BASIS','VG_XXXXXXXXXXX',force=>true); END;
 
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00001: unique constraint (SYS.I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST) violated
ORA-06512: at "SYS.DBMS_STATS", line 17279
ORA-06512: at line 1

So we started to dig deeper into that issue. We found out, that there is a duplicate column name when querying the Data Dictionary. But on the other hand, a simple “describe” does not show that.

SQL> desc owner.vg_xxxxxxxxxx

Name                                                  Null?    Type
----------------------------------------------------- -------- -----------------
[...]
XXXXXXXXXXXXXXX_ID                                             NUMBER(19)
XXXXXXXXXXXXXXXXX_XXX                                          NUMBER(19)
XXXXXXXXXXXXXXXXX_ID                                           NUMBER(19)
NAMENSZUSATZ                                                   NVARCHAR2(4000)
ENTITY_TIMESTAMP                                               TIMESTAMP(9)
OPTLOCKVERSION                                        NOT NULL NUMBER(19)
DATENUEBERNAHMETYP                                             NVARCHAR2(4000)
ZUGANGGEBUCHT                                         NOT NULL NUMBER(1)
UEBERSTELLER                                          NOT NULL NUMBER(1)
XXXXXXXXXXXXXX                                        NOT NULL NUMBER(1)
 
 
SQL> select column_id, column_name  from dba_tab_columns where table_name='VG_XXXXXXXXXXX' and column_name like 'Z%'
 
COLUMN_ID  COLUMN_NAME
---------- --------------------
        59 ZUGANGGEBUCHT
        59 ZUGANGGEBUCHT

Next step was to find out, where the data from the dictionary view is comming from.

SQL> select text from dba_views where view_name='DBA_TAB_COLUMNS';

TEXT
--------------------------------------------------------------------------------
select OWNER, TABLE_NAME,
       COLUMN_NAME, DATA_TYPE, DATA_TYPE_MOD, DATA_TYPE_OWNER,
       DATA_LENGTH, DATA_PRECISION, DATA_SCALE, NULLABLE, COLUMN_ID,
       DEFAULT_LENGTH, DATA_DEFAULT, NUM_DISTINCT, LOW_VALUE, HIGH_VALUE,
       DENSITY, NUM_NULLS, NUM_BUCKETS, LAST_ANALYZED, SAMPLE_SIZE,
       CHARACTER_SET_NAME, CHAR_COL_DECL_LENGTH,
       GLOBAL_STATS, USER_STATS, AVG_COL_LEN, CHAR_LENGTH, CHAR_USED,
       V80_FMT_IMAGE, DATA_UPGRADED, HISTOGRAM, DEFAULT_ON_NULL,
       IDENTITY_COLUMN, SENSITIVE_COLUMN,
       EVALUATION_EDITION, UNUSABLE_BEFORE, UNUSABLE_BEGINNING
  from DBA_TAB_COLS
where USER_GENERATED = 'YES'


SQL> select text from dba_views where view_name='DBA_TAB_COLS';

TEXT
--------------------------------------------------------------------------------
select
     OWNER, TABLE_NAME,
     COLUMN_NAME, DATA_TYPE, DATA_TYPE_MOD, DATA_TYPE_OWNER,
     DATA_LENGTH, DATA_PRECISION, DATA_SCALE, NULLABLE, COLUMN_ID,
     DEFAULT_LENGTH, DATA_DEFAULT, NUM_DISTINCT, LOW_VALUE, HIGH_VALUE,
     DENSITY, NUM_NULLS, NUM_BUCKETS, LAST_ANALYZED, SAMPLE_SIZE,
     CHARACTER_SET_NAME, CHAR_COL_DECL_LENGTH,
     GLOBAL_STATS,
     USER_STATS, AVG_COL_LEN, CHAR_LENGTH, CHAR_USED,
     V80_FMT_IMAGE, DATA_UPGRADED, HIDDEN_COLUMN, VIRTUAL_COLUMN,
     SEGMENT_COLUMN_ID, INTERNAL_COLUMN_ID, HISTOGRAM, QUALIFIED_COL_NAME,
     USER_GENERATED, DEFAULT_ON_NULL, IDENTITY_COLUMN, SENSITIVE_COLUMN,
     EVALUATION_EDITION, UNUSABLE_BEFORE, UNUSABLE_BEGINNING
from dba_tab_cols_v$


SQL> select text from dba_views where view_name='DBA_TAB_COLS_V$';

TEXT
--------------------------------------------------------------------------------
select u.name, o.name,
       c.name,

[...]

Ok, DBA_TAB_COLUMNS gets its data from DBA_TAB_COLS which in turn gets its data from DBA_TAB_COLS_V$. That view has the final SQL, that is being used to retrieve the data, I shortened it, you get the idea.
We then used the SQL from DBA_TAB_COLS_V$ to find the root cause of our duplicate column.

SQL> select
  2    , sys."_CURRENT_EDITION_OBJ" o
  3    , sys.hist_head$ h
  4    c.col#, c.obj#, c.name,  c.intcol#
  5    , sys.tab$ t
  6  where o.obj# = c.obj#
  7    , h.obj#, h.intcol#
  8  from sys.col$ c
  9        or
 10        (o.type# = 2     /* tables, excluding iot - overflow and nested tables */
 11          and
 12   , sys."_CURRENT_EDITION_OBJ" o
 13                         and (bitand(t.property, 512) = 512 or
 14                              bitand(t.property, 8192) = 8192))))
 15    , sys.hist_head$ h
 16    , sys.user$ u
 17    , sys.coltype$ ac
 18    , sys.obj$ ot
 19    , sys."_BASE_USER" ut
 20    , sys.tab$ t
 21  where o.obj# = c.obj#
 22    and o.owner# = u.user#
 23    and o.obj# = t.obj#(+)
 24    and c.obj# = h.obj#(+) and c.intcol# = h.intcol#(+)
 25    and c.obj# = ac.obj#(+) and c.intcol# = ac.intcol#(+)
 26    and ac.toid = ot.oid$(+)
 27    and ot.type#(+) = 13
 28    and ot.owner# = ut.user#(+)
 29    and (o.type# in (3, 4)                                     /* cluster, view */
 30         or
 31         (o.type# = 2     /* tables, excluding iot - overflow and nested tables */
 32          and
 33          not exists (select null
 34                        from sys.tab$ t
 35                       where t.obj# = o.obj#
 36                         and (bitand(t.property, 512) = 512 or
 37                              bitand(t.property, 8192) = 8192))))
 38    and c.name='ZUGANGGEBUCHT';

      COL#       OBJ# NAME                    INTCOL#       OBJ#    INTCOL#
---------- ---------- -------------------- ---------- ---------- ----------
        60      95556 ZUGANGGEBUCHT                60
        68      95523 ZUGANGGEBUCHT                68
        59      94177 ZUGANGGEBUCHT                59      94177         59   
        59      94177 ZUGANGGEBUCHT                59      94177         59   
        74      95522 ZUGANGGEBUCHT                74
        74      95550 ZUGANGGEBUCHT                74

6 rows selected.

Have a look at the highlighted rows, there’s the duplicate column. Obviously the join to HIST_HEAD$ introduced the second row. That means, there must be something wrong inside that table. And that turned out to be true.

SQL> desc hist_head$
Name                        Null?    Type
--------------------------- -------- -------------
OBJ#                        NOT NULL NUMBER
COL#                        NOT NULL NUMBER
BUCKET_CNT                  NOT NULL NUMBER
ROW_CNT                     NOT NULL NUMBER
CACHE_CNT                            NUMBER
NULL_CNT                             NUMBER
TIMESTAMP#                           DATE
SAMPLE_SIZE                          NUMBER
MINIMUM                              NUMBER
MAXIMUM                              NUMBER
DISTCNT                              NUMBER
LOWVAL                               RAW(1000)
HIVAL                                RAW(1000)
DENSITY                              NUMBER
INTCOL#                     NOT NULL NUMBER
SPARE1                               NUMBER
SPARE2                               NUMBER
AVGCLN                               NUMBER
SPARE3                               NUMBER
SPARE4                               NUMBER

SQL> select  COL#, BUCKET_CNT, ROW_CNT, NULL_CNT, TIMESTAMP#, SAMPLE_SIZE from  hist_head$ where obj#=94177 and intcol#=59;

      COL# BUCKET_CNT    ROW_CNT   NULL_CNT TIMESTAMP#          SAMPLE_SIZE
---------- ---------- ---------- ---------- ------------------- -----------
         0          1          0      35656 2017-01-05 22:00:23        5502
         0          1          0      35383 2017-01-05 22:00:23        5775

A quick MOS-search revealed two notes, that describe the issue, Alter Table Drop Column Failing With ORA-00600[16515] (Doc ID 2375301.1) and DBMS_STATS.DELETE_TABLE_STATS Fails With ORA-600 [16515] (Doc ID 1233745.1). The latter one has the final solution for this issue.
We first identified the older of the two rows which we then deleted.

SQL> SELECT rowid,obj#,intcol#,timestamp# FROM hist_head$ WHERE obj#=94177 and intcol#=59 order by timestamp#;

ROWID                    OBJ#    INTCOL# TIMESTAMP#
------------------ ---------- ---------- -------------------
AAAABEAABAAAWh0AAj      94177         59 2017-01-05 22:00:23
AAAABEAABAAAWh0AAi      94177         59 2017-01-05 22:00:23

SQL> DELETE FROM hist_head$ WHERE ROWID='AAAABEAABAAAWh0AAj';

1 row deleted.

SQL>  COMMIT;

Commit complete.

After this, DBA_TAB_COLUMNS is back to distinct column names.

SQL> select column_id, column_name  from dba_tab_columns where table_name='VG_XXXXXXXXXXX' and column_name like 'Z%';

COLUMN_ID COLUMN_NAME
---------- --------------------
        59 ZUGANGGEBUCHT

And of cause, the statistics gathering was and is running fine again.

Auto Resize and Shrink ACFS Volumes

Today I want to write a little about ACFS. In case you are using the latest and greatest Oracle Grid Infrastructure and you still run 11.2 databases, you are required to use ACFS. This especially applies to the Database Appliances, ExaDatas and for the Cloud Services, if you use Grid Infrastructure there. Having that, you might run into situations where your ACFS filesystem grows which it does automatically. But what if you want to shrink the filesystem to it’s original or at least a smaller size? With version 18.3 this should work quite well. Version before did had some issues when there was no contigous free space at the end of the volume to do the shrinking. Read more about that in MOS Note “ODA (Oracle Database Appliance): Resizing ACFS DBStorage File System (Doc ID 1990430.1)“.
Basically, there are two major topics. First, the shrinking of ACFS volumes and second, limiting the auto resize of ACFS.

Current situation

Given a customers ODA, this is the current situation.

[root@oda-x7m ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot
                       30G  3.4G   25G  13% /
tmpfs                 189G  1.3G  187G   1% /dev/shm
/dev/md0              477M   40M  412M   9% /boot
/dev/sda1             500M  320K  500M   1% /boot/efi
/dev/mapper/VolGroupSys-LogVolOpt
                       59G   24G   33G  43% /opt
/dev/mapper/VolGroupSys-LogVolU01
                       99G   78G   16G  84% /u01
/dev/asm/commonstore-241
                      5.0G  319M  4.7G   7% /opt/oracle/dcs/commonstore
/dev/asm/dattfstar01-241
                      170G  158G   13G  93% /u02/app/oracle/oradata/TFSTAR01
/dev/asm/reco-135     883G  693G  191G  79% /u03/app/oracle
/dev/asm/datxvgvg-241
                      100G  5.9G   95G   6% /u02/app/oracle/oradata/XVGVG
/dev/asm/datxdovg-241
                      410G  399G   12G  98% /u02/app/oracle/oradata/XDOVG
/dev/asm/datxvgovg-241
                      100G  5.7G   95G   6% /u02/app/oracle/oradata/XVGOVG
/dev/asm/datxdoovg-241
                      390G  376G   15G  97% /u02/app/oracle/oradata/XDOOVG
/dev/asm/datxvgfg-241
                      100G  5.1G   95G   6% /u02/app/oracle/oradata/XVGFG
/dev/asm/datxdofg-241
                      100G   46G   55G  46% /u02/app/oracle/oradata/XDOFG
/dev/asm/datsfstar01-241
                      100G  6.3G   94G   7% /u02/app/oracle/oradata/SFSTAR01
/dev/asm/dattvgfg01-241
                      100G  5.1G   95G   6% /u02/app/oracle/oradata/TVGFG01
/dev/asm/dattfstar03-241
                      100G  4.9G   96G   5% /u02/app/oracle/oradata/TFSTAR03
/dev/asm/dattfstar04-241
                      100G  7.0G   94G   7% /u02/app/oracle/oradata/TFSTAR04
/dev/asm/dattfstar02-241
                      1.9T  1.9T   76M 100% /u02/app/oracle/oradata/TFSTAR02
/dev/asm/dattdomea01-241
                      100G   20G   81G  20% /u02/app/oracle/oradata/TDOMEA01
/dev/asm/dattfstar05-241
                      100G  5.2G   95G   6% /u02/app/oracle/oradata/TFSTAR05
/dev/asm/datifstar01-241
                      100G  5.1G   95G   6% /u02/app/oracle/oradata/IFSTAR01A
/dev/asm/datrfstar01-241
                      1.1T  1.1T  9.4G 100% /u02/app/oracle/oradata/RFSTAR01A
jus-srv-rz50.justiz.sachsen.de:/u01/app/DB-dumps
                      3.0T  2.0T  924G  68% /u99/nfs/DB-dumps
/dev/asm/datsfstar02-241
                      100G  3.6G   97G   4% /u02/app/oracle/oradata/SFSTAR02

Shrinking an ACFS Filesystem

As to be seen above, there are quite a lot of databases that have a lot of free space in their filesystems. The initial size is defined by the shapes that an ODA provides. But in our case, even 100GB are too much, so we wanted to reduce the occupied space.
First, we get some basic information.

[oracle@oda-x7m ~]$ /sbin/advmutil volinfo /dev/asm/datxvgvg-241
Device: /dev/asm/datxvgvg-241
Size (MB): 102400
Resize Increment (MB): 512
Redundancy: mirror
Stripe Columns: 8
Stripe Width (KB): 1024
Disk Group: DATA
Volume: DATXVGVG
Compatible.advm: 12.2.0.1.0

The “resize” feature requires the ADVM compatibility to be at least 12.2.0.1, so this is the case. Now let’s do the shrink.

[oracle@oda-x7m ~]$ acfsutil size -h
Usage: acfsutil [-h] size [[-|+]nnn[K|M|G|T|P]] [-a nnn[K|M|G|T|P]] [-x nnn[K|M|G|T|P]] [-d ]  
                                         - Resize file system and configure auto-resize
                [-a]                     - Auto-resize increment
                [-x]                     - Auto-resize maximum
                [-d]                     - Resize only this device

[oracle@oda-x7m ~]$ acfsutil size -80G /u02/app/oracle/oradata/XVGVG
acfsutil size: new file system size: 21474836480 (20480MB)

After that, the volume is at the desired size of 20GB.

[oracle@oda-x7m ~]$ /sbin/advmutil volinfo /dev/asm/datxvgvg-241
Device: /dev/asm/datxvgvg-241
Size (MB): 20480
Resize Increment (MB): 512
Redundancy: mirror
Stripe Columns: 8
Stripe Width (KB): 1024
Disk Group: DATA
Volume: DATXVGVG
Compatible.advm: 12.2.0.1.0

Limiting auto increment of ACFS

The automatic increment of ACFS volumes has two implications. First, it makes monitoring difficult because you never know is the filesystem is really full or will it automatically resize if needed. To figure out, if you are having a critical issue, you need to perform several steps. Implementing this with a monitoring system will be cumbersome in most cases. The other thing is, managing the storage resources is not straight-forward. You might overcommit your storage. My approach and advise is, to disable automatic resizing by setting the ACFS size and maximum size to the value, that you commited for this specific database. Let’s check the setup before doing that.

[oracle@oda-x7m ~]$ acfsutil info fs /u02/app/oracle/oradata/ICM01A
/u02/app/oracle/oradata/ICM01A
    ACFS Version: 12.2.0.1.0
    on-disk version:       46.0
    compatible.advm:       12.2.0.0.0
    ACFS compatibility:    12.2.0.0.0
    flags:        MountPoint,Available,AutoResizeEnabled
    mount time:   Fri Mar 15 10:09:28 2019
    mount sequence number: 4
    allocation unit:       4096
    metadata block size:   4096
    volumes:      1
    total size:   107374182400  ( 100.00 GB )
    total free:   91318923264  (  85.05 GB )
    auto-resize increment:   10737418240  (  10.00 GB )
    auto-resize maximum:     0  ( 0.00 )
    file entry table allocation: 262144
    primary volume: /dev/asm/daticm01-411
        label:
        state:                 Available
        major, minor:          246, 210435
        logical sector size:   512
        size:                  107374182400  ( 100.00 GB )
        free:                  91318923264  (  85.05 GB )
        metadata read I/O count:         54838
        metadata write I/O count:        51022
        total metadata bytes read:       242651136  ( 231.41 MB )
        total metadata bytes written:    233385984  ( 222.57 MB )
        ADVM diskgroup:        DATA
        ADVM resize increment: 536870912
        ADVM redundancy:       mirror
        ADVM stripe columns:   8
        ADVM stripe width:     1048576
    number of snapshots:  0
    snapshot space usage: 0  ( 0.00 )
    replication status: DISABLED
    compression status: DISABLED

So I set the maximum size to the current size.

[oracle@oda-x7m ~]$ acfsutil size -x 100G /u02/app/oracle/oradata/ICM01A
acfsutil size: ACFS-03642: successfully updated auto-resize settings

Just crosscheck that.

[oracle@oda-x7m ~]$ acfsutil info fs /u02/app/oracle/oradata/ICM01A
/u02/app/oracle/oradata/ICM01A
    ACFS Version: 12.2.0.1.0
    on-disk version:       46.0
    compatible.advm:       12.2.0.0.0
    ACFS compatibility:    12.2.0.0.0
    flags:        MountPoint,Available,AutoResizeEnabled
    mount time:   Fri Mar 15 10:09:28 2019
    mount sequence number: 4
    allocation unit:       4096
    metadata block size:   4096
    volumes:      1
    total size:   107374182400  ( 100.00 GB )
    total free:   91318923264  (  85.05 GB )
    auto-resize increment:   10737418240  (  10.00 GB )
    auto-resize maximum:     107374182400  ( 100.00 GB )
    file entry table allocation: 262144
    primary volume: /dev/asm/daticm01-411
        label:
        state:                 Available
        major, minor:          246, 210435
        logical sector size:   512
        size:                  107374182400  ( 100.00 GB )
        free:                  91318923264  (  85.05 GB )
        metadata read I/O count:         54840
        metadata write I/O count:        51022
        total metadata bytes read:       242659328  ( 231.42 MB )
        total metadata bytes written:    233385984  ( 222.57 MB )
        ADVM diskgroup:        DATA
        ADVM resize increment: 536870912
        ADVM redundancy:       mirror
        ADVM stripe columns:   8
        ADVM stripe width:     1048576
    number of snapshots:  0
    snapshot space usage: 0  ( 0.00 )
    replication status: DISABLED
    compression status: DISABLED

You could also combine the resize operation which returns the final ACFS size which in turn can be used for setting the maximum size.
And that’s basically it, there is nothing more to do.

ADR Home is growing permanently

Housekeeping is an often underestimated topic when running databases in production. In the blog post about the size based deletion policy I described a new way of keeping the size of an ADR home limited. But what if you are not yet on Oracle Database 12.2+? Then you have to rely on the existing time based deletion policies. That is what we did and do for our customers. A couple of weeks ago one of them experienced space pressure on their /u01 filesystem on one of their ODAs. The automatic purging worked fine, but the filesystem usage was still increasing. The Enterprise Manager metric outlines this behaviour.

adr-growth-01

So we started to investigate this and came to the trace-directory of some databases. This is an example of one of them.

[root@odax7-2m trace]#  du -hs * | sort -h | tail -6

68M     DOMEADD_gen0_82205.trm
69M     DOMEADD_mmon_82354.trm
131M    DOMEADD_ipc0_82195.trm
419M    DOMEADD_gen0_82205.trc
423M    DOMEADD_mmon_82354.trc
462M    DOMEADD_ipc0_82195.trc

We started with the largest trace and had a look at its end.

[root@odax7-2m trace]# tail DOMEADD_ipc0_82195.trc
2019-03-15 09:23:07.749*:kjuinc(): no cluster database, return inc# KSIMINVL

*** 2019-03-15T09:23:10.913816+01:00
2019-03-15 09:23:10.913*:kjuinc(): no cluster database, return inc# KSIMINVL

Having this, we did a quick research on My Oracle Support which revealed Bug 27989556  Excessive Trace Message: no cluster database, return inc# ksiminvl. And of cause there was no fix available. This changed by April 16th, now there’s a fix for 18c available with the 18.6.0 Release Update.

Ok, next file. Let’s have a look.

[root@odax7-2m trace]# tail DOMEADD_mmon_82354.trc
AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1

*** 2019-03-15T09:29:55.474098+01:00
AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1

*** 2019-03-15T09:29:58.474092+01:00
AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1

Again MOS has the answer: AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1 Excessive Trace Files were generated on one node after Migration to 12.2 and MMON trace file grows (Doc ID 2298766.1). At least there are patches for that this time.

Since patching would have required a downtime and we might fix only one of two issues, we decided against it. Instead, we created a simple but effective workaround:

for i in $(find /u01/app/oracle/diag/rdbms -type f  -name "*mmon*tr*"); do echo "" > $i; done
for i in $(find /u01/app/oracle/diag/rdbms -type f  -name "*ipc0*tr*"); do echo "" > $i; done

Once this was run, and we now do that regularly, the filesystem usage dropped immediatly. This is what Enterprise Manager showed right after running the workaround.

adr-growth-02

Adding VLANs on an ODA

The Oracle Database Appliances X7-2 come only with one redundant network interface, so it is not possible to separate network traffic between distinct fault-tolerant bonded interfaces. But at least we can use VLANs on the one and only bonded interface that all the ODA X7-2 provide. If you choose to use VLANs, this choice must be taken at the point of installing the ODA. There is no easy way to change that afterwards. So if you use VLANs, you do it right from the beginning when you do the “configure-firstnet”. Once the setup is finished, it will end up in a configuration simiar to this.

[root@odax7-2m ~]# odacli list-networks

ID                                       Name                 NIC        InterfaceType IP Address         Subnet Mask        Gateway            VlanId
---------------------------------------- -------------------- ---------- ---------- ------------------ ------------------ ------------------ ----------
32812bc3-5bb7-4f8e-9070-b011acde665d     Private-network      priv0      DUMMY      192.168.16.24      255.255.255.240
735e2695-d68a-4696-a596-ba39a7180de6     Public-network       btbond1.26 Vlan       172.17.11.165      255.255.255.0      172.17.11.10       26

Now we can add more networks. The additional networks must be added from the operating system point of view which is done by the appliance management “odacli”, and from the Grid Infrastructure point of view which in turn is done by using “srvctl”. Finally, the databases must be configured to use the propper VLANs for their communication. Let’s walk through these steps.

1. Create a new VLAN

[root@odax7-2m ~]# odacli create-network -no-d -g 172.17.2.10 -n btbond1 -t VLAN -p 172.17.2.9 -m lsnr2 -w Database -s 255.255.255.0 -v 21
{
  "jobId" : "bd4d9ba0-7f59-491c-878b-5c03d3be51b6",
  "status" : "Created",
  "message" : null,
  "reports" : [ ],
  "createTimestamp" : "March 15, 2019 10:56:15 AM CET",
  "resourceList" : [ ],
  "description" : "Network service creation with names btbond1:lsnr2 ",
  "updatedTime" : "March 15, 2019 10:56:15 AM CET"
}

[root@odax7-2m ~]# odacli describe-job -i "bd4d9ba0-7f59-491c-878b-5c03d3be51b6"

Job details
----------------------------------------------------------------
                     ID:  bd4d9ba0-7f59-491c-878b-5c03d3be51b6
            Description:  Network service creation with names btbond1:lsnr2
                 Status:  Success
                Created:  March 15, 2019 10:56:15 AM CET
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Setting network                          March 15, 2019 10:56:15 AM CET      March 15, 2019 10:56:20 AM CET      Success
Setting up Vlan                          March 15, 2019 10:56:15 AM CET      March 15, 2019 10:56:20 AM CET      Success

[root@odax7-2m ~]# odacli list-networks

ID                                       Name                 NIC        InterfaceType IP Address         Subnet Mask        Gateway            VlanId
---------------------------------------- -------------------- ---------- ---------- ------------------ ------------------ ------------------ ----------
32812bc3-5bb7-4f8e-9070-b011acde665d     Private-network      priv0      DUMMY      192.168.16.24      255.255.255.240
735e2695-d68a-4696-a596-ba39a7180de6     Public-network       btbond1.26 Vlan       172.17.11.165      255.255.255.0      172.17.11.10       26
e69efc7a-5973-4911-85c7-9d8e8b4138be     lsnr2                btbond1.21 VLAN       172.17.2.9         255.255.255.0      172.17.2.10        21

[root@odax7-2m ~]# odacli describe-network -i e69efc7a-5973-4911-85c7-9d8e8b4138be

Network details
----------------------------------------------------------------
                     ID:  e69efc7a-5973-4911-85c7-9d8e8b4138be
                   Name:  lsnr2
                    NIC:  btbond1.21
          InterfaceType:  VLAN
             IP Address:  172.17.2.9
            Subnet Mask:  255.255.255.0
                Gateway:  172.17.2.10
                 VlanId:  21
                   Type:  Database
                Default:  false
                Created:  March 15, 2019 10:56:15 AM CET

[root@odax7-2m ~]# odaadmcli show vlan
        NAME                     ID    INTERFACE   CONFIG_TYPE IP_ADDRESS      NETMASK         GATEWAY         NODENUM
        lsnr2                    21    btbond1     Database    172.17.2.9      255.255.255.0   172.17.2.10     0
        vlan26                   26    btbond1     public      172.17.11.165   255.255.255.0   172.17.11.10    0

[root@odax7-2m ~]# ifconfig btbond1.21
btbond1.21 Link encap:Ethernet  HWaddr 00:10:E0:DE:6F:CB
          inet addr:172.17.2.9  Bcast:172.17.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:501 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:35567 (34.7 KiB)  TX bytes:720 (720.0 b)

2. Add the VLAN to the Grid Infrastructure
Note, you must set the environment variables to point to the Grid Infrastructure. We have a toolbox for Linux systems, that makes it easy to set and maintain different environment settings. This toolbox provides a function to set the propper Oracle environment which is called “soe” –  set oracle environment. You’ll see it’s usage in the following step.

First, we add the new VLAN as a network resource.

[root@odax7-2m ~]# soe +ASM

Environment :
===================

Setting         Value
--------------- ---------------------------
NLS_DATE_FORMAT "YYYY-MM-DD HH24:MI:SS"
NLS_LANG        "AMERICAN_GERMANY.AL32UTF8"
ORACLE_BASE     /u01/app/oracle
ORACLE_HOME     /u01/app/12.2.0.1/oracle
ORACLE_SID      +ASM1
ORACLE_UNQNAME  +ASM

[root@odax7-2m ~]# srvctl add network -netnum 21 -subnet 172.17.2.0/255.255.255.0/btbond1.21 -nettype static

[root@odax7-2m ~]# srvctl config network
Network 1 exists
Subnet IPv4: 172.17.11.0/255.255.255.0/btbond1.26, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:
Network 21 exists
Subnet IPv4: 172.17.2.0/255.255.255.0/btbond1.21, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:
Network 28 exists
Subnet IPv4: 172.17.5.0/255.255.255.0/btbond1.28, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:

Second, we definde the VIP and the listener. Please be careful, which OS user you use for which command. Have a look at the command prompt.

[root@odax7-2m ~]# srvctl add vip -node odax7-2m -address odax7-2m-lsnr2/255.255.255.0 -netnum 21
[root@odax7-2m ~]# su - oracle
[oracle@odax7-2m ~]$ soe +ASM

Environment :
===================

Setting         Value
--------------- ---------------------------
NLS_DATE_FORMAT "YYYY-MM-DD HH24:MI:SS"
NLS_LANG        "AMERICAN_GERMANY.AL32UTF8"
ORACLE_BASE     /u01/app/oracle
ORACLE_HOME     /u01/app/12.2.0.1/oracle
ORACLE_SID      +ASM1
ORACLE_UNQNAME  +ASM

[oracle@odax7-2m ~]$ srvctl add listener -netnum 21 -endpoints "TCP:1523" -listener listener2
[oracle@odax7-2m ~]$ srvctl start listener -listener listener2
[oracle@odax7-2m ~]$ lsnrctl status listener2

LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 15-MAR-2019 11:10:26

Copyright (c) 1991, 2016, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER2)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER2
Version                   TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date                15-MAR-2019 11:10:03
Uptime                    0 days 0 hr. 0 min. 22 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.2.0.1/oracle/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/odax7-2m/listener2/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER2)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.17.2.9)(PORT=1523)))
The listener supports no services
The command completed successfully

3. Point a database to the new Listener
To make use of the new listener, the database(s) must register with that listener. So we simply set the “local_listener” parameter to point to the new listener and we are done.

[oracle@odax7-2m ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on MAR-15-2019 11:11:58 

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics,
Real Application Testing and Unified Auditing options

SQL> alter system set local_listener='odax7-2m-lsnr2:1523';

System altered.

SQL> exit

[oracle@odax7-2m ~]$ soe +ASM -s
[oracle@odax7-2m ~]$ lsnrctl status listener2

LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 15-MAR-2019 11:15:23

Copyright (c) 1991, 2016, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER2)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER2
Version                   TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date                15-MAR-2019 11:10:03
Uptime                    0 days 0 hr. 5 min. 19 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.2.0.1/oracle/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/odax7-2m/listener2/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER2)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.17.2.9)(PORT=1523)))
Service "MMI.support.robotron.de" has 1 instance(s).
  Instance "MMI", status READY, has 2 handler(s) for this service...
Service "MMIXDB.support.robotron.de" has 1 instance(s).
  Instance "MMI", status READY, has 2 handler(s) for this service...
The command completed successfully

That’s it, it’s as simple as that. 

Instance Caging

Last week we had a funny issue at a customers ODA X7-2M. You may know, that there are different database shapes, each having different settings for CPU_COUNT, memory etc. The documentation is quite clear about that. In this special case they had a database with shape “odb4” which sets CPU_COUNT to 4. Now they went live with their appication on that database and experienced the following behaviour.
cpu_count_instance_caging
You see the CPU limit of and you can also see, that the load goes well over that limit. But why? The answer is quite simple. Setting CPU_COUNT does not neccessarily limit the number of CPUs the database will use. An active Resource Manager Plan is required to achieve that. MOS Note “Why are a Higher Number of CPUs Used Than the CPU_COUNT Parameter Setting (Doc ID 2017395.1)” is explainig that. If you wish to limit the CPU usage, you will have to use Instance Caging which is a combination of setting CPU_COUNT and using Resource Manager. There is another MOS Note “Configuring and Monitoring Instance Caging (Doc ID 1362445.1)” for that.
That means, if you want your databases on ODA be limited to the CPU usage their shape defines, you need to activate a Resource Manager Plan, at least the “DEFAULT_PLAN”. This is not the default, keep that in mind.