ADR Home is growing permanently

Housekeeping is an often underestimated topic when running databases in production. In the blog post about the size-based deletion policy I described a new way of keeping the size of an ADR home limited. But what if you are not yet on Oracle Database 12.2+? Then you have to rely on the existing time-based deletion policies, which is what we do for our customers. A couple of weeks ago one of them experienced space pressure on the /u01 filesystem of one of their ODAs. The automatic purging worked fine, but the filesystem usage was still increasing. The Enterprise Manager metric illustrates this behaviour.

adr-growth-01

So we started to investigate and ended up in the trace directories of some databases. Here is an example from one of them.

[root@odax7-2m trace]#  du -hs * | sort -h | tail -6

68M     DOMEADD_gen0_82205.trm
69M     DOMEADD_mmon_82354.trm
131M    DOMEADD_ipc0_82195.trm
419M    DOMEADD_gen0_82205.trc
423M    DOMEADD_mmon_82354.trc
462M    DOMEADD_ipc0_82195.trc

We started with the largest trace and had a look at its end.

[root@odax7-2m trace]# tail DOMEADD_ipc0_82195.trc
2019-03-15 09:23:07.749*:kjuinc(): no cluster database, return inc# KSIMINVL

*** 2019-03-15T09:23:10.913816+01:00
2019-03-15 09:23:10.913*:kjuinc(): no cluster database, return inc# KSIMINVL

With this information, we did a quick search on My Oracle Support, which revealed Bug 27989556 Excessive Trace Message: no cluster database, return inc# ksiminvl. And of course there was no fix available at the time. This changed on April 16th: there is now a fix for 18c, delivered with the 18.6.0 Release Update.

Ok, next file. Let’s have a look.

[root@odax7-2m trace]# tail DOMEADD_mmon_82354.trc
AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1

*** 2019-03-15T09:29:55.474098+01:00
AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1

*** 2019-03-15T09:29:58.474092+01:00
AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1

Again MOS has the answer: AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1 Excessive Trace Files were generated on one node after Migration to 12.2 and MMON trace file grows (Doc ID 2298766.1). At least this time there are patches available.

Since patching would have required a downtime and we might fix only one of two issues, we decided against it. Instead, we created a simple but effective workaround:

for i in $(find /u01/app/oracle/diag/rdbms -type f -name "*mmon*tr*"); do echo "" > "$i"; done
for i in $(find /u01/app/oracle/diag/rdbms -type f -name "*ipc0*tr*"); do echo "" > "$i"; done

Once this was run (and we now do it regularly), the filesystem usage dropped immediately. This is what Enterprise Manager showed right after running the workaround.

adr-growth-02
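The one-liners above can be turned into a cron-ready script. The following is just a sketch: the diag base path and the name patterns match our environment, and `truncate_traces` is a hypothetical helper name, so adjust it to your installation.

```shell
#!/bin/sh
# Sketch of a cron-ready variant of the workaround above.
# The diag base path and the name patterns are from our environment;
# truncate_traces is a hypothetical helper name.

truncate_traces() {
    # $1: base directory to search; remaining args: file name patterns
    base="$1"; shift
    for pattern in "$@"; do
        # read line by line so file names with spaces stay intact
        find "$base" -type f -name "$pattern" | while IFS= read -r f; do
            : > "$f"   # truncate in place; the process keeps writing to the same inode
        done
    done
}

# Example invocation (commented out on purpose):
# truncate_traces /u01/app/oracle/diag/rdbms "*mmon*tr*" "*ipc0*tr*"
```

Truncating with `: > file` keeps the inode, so the background process happily continues writing into the now-empty file, which is exactly what we want here.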


Adding VLANs on an ODA

The Oracle Database Appliance X7-2 comes with only one redundant network interface, so it is not possible to separate network traffic between distinct fault-tolerant bonded interfaces. But at least we can use VLANs on the one and only bonded interface that all ODA X7-2 models provide. If you choose to use VLANs, this decision must be made when installing the ODA; there is no easy way to change it afterwards. So if you use VLANs, you do it right from the beginning when you run "configure-firstnet". Once the setup is finished, you will end up with a configuration similar to this.

[root@odax7-2m ~]# odacli list-networks

ID                                       Name                 NIC        InterfaceType IP Address         Subnet Mask        Gateway            VlanId
---------------------------------------- -------------------- ---------- ---------- ------------------ ------------------ ------------------ ----------
32812bc3-5bb7-4f8e-9070-b011acde665d     Private-network      priv0      DUMMY      192.168.16.24      255.255.255.240
735e2695-d68a-4696-a596-ba39a7180de6     Public-network       btbond1.26 Vlan       172.17.11.165      255.255.255.0      172.17.11.10       26

Now we can add more networks. The additional networks must be added from the operating system point of view, which is done with the appliance management tool "odacli", and from the Grid Infrastructure point of view, which in turn is done using "srvctl". Finally, the databases must be configured to use the proper VLANs for their communication. Let's walk through these steps.

1. Create a new VLAN

[root@odax7-2m ~]# odacli create-network -no-d -g 172.17.2.10 -n btbond1 -t VLAN -p 172.17.2.9 -m lsnr2 -w Database -s 255.255.255.0 -v 21
{
  "jobId" : "bd4d9ba0-7f59-491c-878b-5c03d3be51b6",
  "status" : "Created",
  "message" : null,
  "reports" : [ ],
  "createTimestamp" : "March 15, 2019 10:56:15 AM CET",
  "resourceList" : [ ],
  "description" : "Network service creation with names btbond1:lsnr2 ",
  "updatedTime" : "March 15, 2019 10:56:15 AM CET"
}

[root@odax7-2m ~]# odacli describe-job -i "bd4d9ba0-7f59-491c-878b-5c03d3be51b6"

Job details
----------------------------------------------------------------
                     ID:  bd4d9ba0-7f59-491c-878b-5c03d3be51b6
            Description:  Network service creation with names btbond1:lsnr2
                 Status:  Success
                Created:  March 15, 2019 10:56:15 AM CET
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Setting network                          March 15, 2019 10:56:15 AM CET      March 15, 2019 10:56:20 AM CET      Success
Setting up Vlan                          March 15, 2019 10:56:15 AM CET      March 15, 2019 10:56:20 AM CET      Success

[root@odax7-2m ~]# odacli list-networks

ID                                       Name                 NIC        InterfaceType IP Address         Subnet Mask        Gateway            VlanId
---------------------------------------- -------------------- ---------- ---------- ------------------ ------------------ ------------------ ----------
32812bc3-5bb7-4f8e-9070-b011acde665d     Private-network      priv0      DUMMY      192.168.16.24      255.255.255.240
735e2695-d68a-4696-a596-ba39a7180de6     Public-network       btbond1.26 Vlan       172.17.11.165      255.255.255.0      172.17.11.10       26
e69efc7a-5973-4911-85c7-9d8e8b4138be     lsnr2                btbond1.21 VLAN       172.17.2.9         255.255.255.0      172.17.2.10        21

[root@odax7-2m ~]# odacli describe-network -i e69efc7a-5973-4911-85c7-9d8e8b4138be

Network details
----------------------------------------------------------------
                     ID:  e69efc7a-5973-4911-85c7-9d8e8b4138be
                   Name:  lsnr2
                    NIC:  btbond1.21
          InterfaceType:  VLAN
             IP Address:  172.17.2.9
            Subnet Mask:  255.255.255.0
                Gateway:  172.17.2.10
                 VlanId:  21
                   Type:  Database
                Default:  false
                Created:  March 15, 2019 10:56:15 AM CET

[root@odax7-2m ~]# odaadmcli show vlan
        NAME                     ID    INTERFACE   CONFIG_TYPE IP_ADDRESS      NETMASK         GATEWAY         NODENUM
        lsnr2                    21    btbond1     Database    172.17.2.9      255.255.255.0   172.17.2.10     0
        vlan26                   26    btbond1     public      172.17.11.165   255.255.255.0   172.17.11.10    0

[root@odax7-2m ~]# ifconfig btbond1.21
btbond1.21 Link encap:Ethernet  HWaddr 00:10:E0:DE:6F:CB
          inet addr:172.17.2.9  Bcast:172.17.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:501 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:35567 (34.7 KiB)  TX bytes:720 (720.0 b)

2. Add the VLAN to the Grid Infrastructure
Note that you must set the environment variables to point to the Grid Infrastructure. We have a toolbox for Linux systems that makes it easy to set and maintain different environment settings. This toolbox provides a function to set the proper Oracle environment, called "soe" – set oracle environment. You'll see its usage in the following steps.

First, we add the new VLAN as a network resource.

[root@odax7-2m ~]# soe +ASM

Environment :
===================

Setting         Value
--------------- ---------------------------
NLS_DATE_FORMAT "YYYY-MM-DD HH24:MI:SS"
NLS_LANG        "AMERICAN_GERMANY.AL32UTF8"
ORACLE_BASE     /u01/app/oracle
ORACLE_HOME     /u01/app/12.2.0.1/oracle
ORACLE_SID      +ASM1
ORACLE_UNQNAME  +ASM

[root@odax7-2m ~]# srvctl add network -netnum 21 -subnet 172.17.2.0/255.255.255.0/btbond1.21 -nettype static

[root@odax7-2m ~]# srvctl config network
Network 1 exists
Subnet IPv4: 172.17.11.0/255.255.255.0/btbond1.26, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:
Network 21 exists
Subnet IPv4: 172.17.2.0/255.255.255.0/btbond1.21, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:
Network 28 exists
Subnet IPv4: 172.17.5.0/255.255.255.0/btbond1.28, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:

Second, we define the VIP and the listener. Please be careful which OS user you use for each command; have a look at the command prompts.

[root@odax7-2m ~]# srvctl add vip -node odax7-2m -address odax7-2m-lsnr2/255.255.255.0 -netnum 21
[root@odax7-2m ~]# su - oracle
[oracle@odax7-2m ~]$ soe +ASM

Environment :
===================

Setting         Value
--------------- ---------------------------
NLS_DATE_FORMAT "YYYY-MM-DD HH24:MI:SS"
NLS_LANG        "AMERICAN_GERMANY.AL32UTF8"
ORACLE_BASE     /u01/app/oracle
ORACLE_HOME     /u01/app/12.2.0.1/oracle
ORACLE_SID      +ASM1
ORACLE_UNQNAME  +ASM

[oracle@odax7-2m ~]$ srvctl add listener -netnum 21 -endpoints "TCP:1523" -listener listener2
[oracle@odax7-2m ~]$ srvctl start listener -listener listener2
[oracle@odax7-2m ~]$ lsnrctl status listener2

LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 15-MAR-2019 11:10:26

Copyright (c) 1991, 2016, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER2)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER2
Version                   TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date                15-MAR-2019 11:10:03
Uptime                    0 days 0 hr. 0 min. 22 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.2.0.1/oracle/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/odax7-2m/listener2/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER2)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.17.2.9)(PORT=1523)))
The listener supports no services
The command completed successfully

3. Point a database to the new Listener
To make use of the new listener, the database(s) must register with that listener. So we simply set the “local_listener” parameter to point to the new listener and we are done.

[oracle@odax7-2m ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on MAR-15-2019 11:11:58 

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics,
Real Application Testing and Unified Auditing options

SQL> alter system set local_listener='odax7-2m-lsnr2:1523';

System altered.

SQL> exit

[oracle@odax7-2m ~]$ soe +ASM -s
[oracle@odax7-2m ~]$ lsnrctl status listener2

LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 15-MAR-2019 11:15:23

Copyright (c) 1991, 2016, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER2)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER2
Version                   TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date                15-MAR-2019 11:10:03
Uptime                    0 days 0 hr. 5 min. 19 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.2.0.1/oracle/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/odax7-2m/listener2/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER2)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.17.2.9)(PORT=1523)))
Service "MMI.support.robotron.de" has 1 instance(s).
  Instance "MMI", status READY, has 2 handler(s) for this service...
Service "MMIXDB.support.robotron.de" has 1 instance(s).
  Instance "MMI", status READY, has 2 handler(s) for this service...
The command completed successfully

That’s it, it’s as simple as that. 

Instance Caging

Last week we had a funny issue at a customer's ODA X7-2M. You may know that there are different database shapes, each having different settings for CPU_COUNT, memory etc.; the documentation is quite clear about that. In this particular case they had a database with shape "odb4", which sets CPU_COUNT to 4. Then they went live with their application on that database and experienced the following behaviour.
cpu_count_instance_caging
You see the CPU limit of 4, and you can also see that the load goes well over that limit. But why? The answer is quite simple: setting CPU_COUNT does not necessarily limit the number of CPUs the database will use. An active Resource Manager plan is required to achieve that. MOS Note "Why are a Higher Number of CPUs Used Than the CPU_COUNT Parameter Setting (Doc ID 2017395.1)" explains this. If you wish to limit the CPU usage, you will have to use Instance Caging, which is a combination of setting CPU_COUNT and using Resource Manager. There is another MOS Note "Configuring and Monitoring Instance Caging (Doc ID 1362445.1)" for that.
That means, if you want your databases on ODA to be limited to the CPU usage their shape defines, you need to activate a Resource Manager plan, at least the "DEFAULT_PLAN". This is not the default, so keep that in mind.
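To actually cage the instance, both pieces have to be in place: the CPU_COUNT setting and an active Resource Manager plan. A minimal sketch, run as SYSDBA and using the value from the odb4 shape in this example:

```sql
-- Without an active Resource Manager plan, CPU_COUNT is not enforced
ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN' SCOPE=BOTH;

-- Limit the instance to the 4 CPUs its odb4 shape defines
-- (on an ODA this is already set by the shape)
ALTER SYSTEM SET cpu_count = 4 SCOPE=BOTH;

-- Verify both settings
SHOW PARAMETER resource_manager_plan
SHOW PARAMETER cpu_count
```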

Huge Oracle Database traces caused by EM Agent

Recently, I found some huge tracefiles in the ADR of some databases. Look at this:

[oracle@server01b trace]$ ls -lrth | tail -10
-rw-r----- 1 oracle dba 1.2K Apr 9 03:44 ICM01_vktm_29023.trc
-rw-r----- 1 oracle dba 329K Apr 9 06:36 ICM01_dbrm_29047.trm
-rw-r----- 1 oracle dba 4.7M Apr 9 06:36 ICM01_dbrm_29047.trc
-rw-r----- 1 oracle dba 1.5K Apr 9 07:58 ICM01_mmon_29120.trm
-rw-r----- 1 oracle dba 14K Apr 9 07:58 ICM01_mmon_29120.trc
-rw-r----- 1 oracle dba 471M Apr 9 08:02 ICM01_ora_29162.trm
-rw-r----- 1 oracle dba 33G Apr 9 08:02 ICM01_ora_29162.trc
-rw-r----- 1 oracle dba 1.3M Apr 9 08:02 alert_ICM01.log
-rw-r----- 1 oracle dba 12M Apr 9 08:02 ICM01_lmhb_29080.trm
-rw-r----- 1 oracle dba 112M Apr 9 08:02 ICM01_lmhb_29080.trc

The file(s) were not purged automatically from the ADR because they are constantly being written to and thus never age out. So I had a look into the file and found repeating content like this:

Trace file /u01/app/oracle/diag/rdbms/icm01a/ICM01/trace/ICM01_ora_29162.trc
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics
and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/12.1.0.2/dbhome_5
System name: Linux
Node name: server01b
Release: 4.1.12-112.16.8.el6uek.x86_64
Version: #2 SMP Fri Apr 13 15:13:46 PDT 2018
Machine: x86_64
Instance name: ICM01
Redo thread mounted by this instance: 1
Oracle process number: 35
Unix process pid: 29162, image: oracle@jus-oda-lrz01b

*** 2019-03-15 14:28:19.983
*** SESSION ID:(464.27491) 2019-03-15 14:28:19.983
*** CLIENT ID:() 2019-03-15 14:28:19.983
*** SERVICE NAME:(SYS$USERS) 2019-03-15 14:28:19.983
*** MODULE NAME:(emagent_SQL_oracle_database) 2019-03-15 14:28:19.983
*** CLIENT DRIVER:(jdbcthin) 2019-03-15 14:28:19.983
*** ACTION NAME:(problemTbspUndo) 2019-03-15 14:28:19.983

psdgbt: bind csid (1) does not match session csid (873)
psdgbt: session charset is AL32UTF8

*** 2019-03-15 14:28:19.983
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
----- Current SQL Statement for this session (sql_id=a0qc12302fzfk) -----
begin dbms_application_info.set_module(:1 , :2 ); end;

There was some more call stack dumped etc., but this was not relevant for identifying the root cause. Obviously the Enterprise Manager Agent is doing something nasty to the database. There are several documents in MOS; the best match is "After restart of database suddenly PLS-00553: character set name is not recognized are seen in the alert.log (Doc ID 1599864.1)". Basically, the Agent connects too early to a starting database and picks up the US7ASCII character set. The only solution as of now is to bounce the Agent.
As a conclusion, it might be a good idea to bounce the Agent every time a database is started, be it because the database was newly created, restarted due to maintenance, or involved in a Data Guard switchover/failover. Or just do it on a regular basis so no one has to do it by hand.
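Such a regular bounce could be scripted along these lines. This is only a sketch: the AGENT_HOME path is an assumption and must point to your EM Agent installation; `bounce_agent` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch of an Agent bounce that could be hooked into database startup
# procedures or scheduled via cron. AGENT_HOME is an assumption; point it
# to your EM Agent installation.
AGENT_HOME="${AGENT_HOME:-/u01/app/oracle/agent/agent_inst}"

bounce_agent() {
    # restart the Agent so its new database connections pick up the
    # correct character set again
    "$AGENT_HOME/bin/emctl" stop agent
    "$AGENT_HOME/bin/emctl" start agent
}
```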

New Size Based ADR Purge Policy in 12.2

With the introduction of the Automatic Diagnostic Repository (ADR) we got some automatic purging of logs and traces too, at least for the database. This purging is based on the age of files. All files in the ADR fall into two categories: files with a short life and files with a long life. Accordingly, there are two retention policies for these files, SHORTP_POLICY and LONGP_POLICY. SHORTP_POLICY defaults to 720 hours, which is roughly a month, whilst LONGP_POLICY defaults to 8760 hours, which is roughly a year. You can read more on setting the policies in MOS Note "Retention Policy for ADR (Doc ID 564269.1)". If you are unsure which files are controlled by which policy, have a look at MOS Note "Which Files Are Part Of SHORTP_POLICY And LONGP_POLICY In ADR? (Doc ID 975448.1)"; basically traces, core dumps and IPS files have a short life, all others have a long life.

This is a good starting point for housekeeping, but there might be situations where a lot of large traces are written in a short amount of time. This might fill up your filesystem, leading to a database outage. With Oracle Database 12.2 onwards, there is an additional size-based policy. This policy is not immediately visible:

[oracle@odax7-2m rdbms]$ adrci

ADRCI: Release 12.2.0.1.0 - Production on Fri Mar 15 08:07:49 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/oracle"
adrci> set home mmidb
adrci> show homes
ADR Homes:
diag/rdbms/mmidb/MMIDB

adrci>  show control

ADR Home = /u01/app/oracle/diag/rdbms/mmidb/MMIDB:
*************************************************************************
ADRID                SHORTP_POLICY        LONGP_POLICY         LAST_MOD_TIME                            LAST_AUTOPRG_TIME                        LAST_MANUPRG_TIME                        ADRDIR_VERSION       ADRSCHM_VERSION      ADRSCHMV_SUMMARY     ADRALERT_VERSION     CREATE_TIME
-------------------- -------------------- -------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- -------------------- -------------------- -------------------- -------------------- ----------------------------------------
1312824711           72                   72                   2019-03-14 22:37:18.660133 +01:00        2019-03-14 09:13:26.445071 +01:00        2019-03-15 07:58:30.921596 +01:00        1                    2                    107                  1                    2018-10-02 09:11:29.809205 +02:00
1 row fetched

To see the size based policy (SIZEP_POLICY), we need to query it explicitly:

adrci> select sizep_policy from adr_control_aux;

ADR Home = /u01/app/oracle/diag/rdbms/mmidb/MMIDB:
*************************************************************************
SIZEP_POLICY
--------------------
18446744073709551615
1 row fetched

This number represents the maximum size of the specific ADR home in bytes. To get an idea of how much space is required to keep files for a given amount of time, or vice versa, we can use the "estimate" command.

adrci> estimate (LONGP_POLICY = 72)
Estimate
Short Policy Hours: 72
Long Policy Hours: 72
Size Policy Bytes: 2459925363

adrci> estimate (SIZEP_POLICY = 1000000000)
Estimate
Short Policy Hours: 26
Long Policy Hours: 26
Size Policy Bytes: 1000000000

So if we want to limit the size to 1GB, we can set the SIZEP_POLICY accordingly.

adrci> set control (SIZEP_POLICY = 1000000000)

adrci> select sizep_policy from adr_control_aux;

ADR Home = /u01/app/oracle/diag/rdbms/mmidb/MMIDB:
*************************************************************************
SIZEP_POLICY
--------------------
1000000000
1 row fetched

But remember, this does not take immediate effect. The database is the only facility that does automatic purging; all other ADR homes from other facilities need explicit purging via the "purge" command. And the database does its purging only once a week, and there is no way to change that, see MOS Note "Is there a way to control Auto_Purge Frequency done by the MMON ? (Doc ID 1446242.1)" for details. So you should consider using a separate purge job. The "purgeLogs" script (MOS Note "purgeLogs: Cleanup traces, logs in one command (Doc ID 2081655.1)") might be a good starting point; even though it was designed for Exadata, Database Appliance etc., it should also do its job on standard installations.
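A homegrown purge job could be as simple as looping over all ADR homes with adrci. The following is only a sketch, assuming adrci is in the PATH; the 3-day retention (note that "purge -age" takes minutes, hence 4320) is just an example value, and `purge_all_homes` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch of an explicit purge over all ADR homes, e.g. run from a daily
# cron job. Assumes adrci is in the PATH; 4320 minutes (3 days) is only
# an example retention.
PURGE_AGE_MIN=4320

purge_all_homes() {
    # "show homes" prints a header line followed by one ADR home per line
    adrci exec="show homes" | grep -v "^ADR Homes:" | while IFS= read -r home; do
        [ -n "$home" ] || continue
        adrci exec="set home $home; purge -age $PURGE_AGE_MIN"
    done
}
```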

Platform change and upgrade in one step – Manually

In a previous blog post I outlined the steps to migrate between platforms and versions in one step, and ended up with an error from the XTTS scripts provided by Oracle.
Now I'll show you how easy this is, even when you do it by hand.

Step 1: create initial backup

At first, we get the current SCN of the source database. This SCN will be used later on.

SQL>  select current_scn from v$database;

CURRENT_SCN
-----------
    1446020

Then we create a classic backup of all tablespaces that shall be transported. Note that everything is still online and read/write at this time, so there is no impact on any production processes besides the load generated by the backup itself.

RMAN> backup for transport allow inconsistent incremental level 0 format '/tmp/MMI_%U.bkp' tablespace mmi;

Starting backup at 2018-09-24 13:43:45
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=271 device type=DISK


channel ORA_DISK_1: starting incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00005 name=+DATA/MMI/DATAFILE/mmi.296.987685353
channel ORA_DISK_1: starting piece 1 at 2018-09-24 13:43:54
channel ORA_DISK_1: finished piece 1 at 2018-09-24 13:43:57
piece handle=/tmp/MMI_02tdtqsq_1_1.bkp tag=TAG20180924T134353 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 2018-09-24 13:43:57

Step 2: restore the backup at the destination

For the restore, we need to have an empty database in place. This database may be of a different, e.g. higher, version than the source. We do the restore and the necessary platform conversion in one step. You could also restore first and convert afterwards, but this would require twice the storage.

RMAN> restore  from platform 'Linux x86 64-bit'  all foreign datafiles to new from backupset '/tmp/MMI_02tdtqsq_1_1.bkp';

Starting restore at 2018-09-24 14:05:00
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring all foreign files in backup piece
channel ORA_DISK_1: reading from backup piece /tmp/MMI_02tdtqsq_1_1.bkp
channel ORA_DISK_1: restoring foreign file 5 to +DATA/MMIRECO/DATAFILE/mmi.301.987689101
channel ORA_DISK_1: foreign piece handle=/tmp/MMI_02tdtqsq_1_1.bkp
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:16
Finished restore at 2018-09-24 14:05:16

At this point we have the files available at the destination, but they are not yet part of the destination database.

Step 3: create incremental backup

To keep the destination files up-to-date, we create an incremental backup at the source that covers all changes since the start of the initial (or previous incremental) backup. That's why we gathered the SCN beforehand. And we get the SCN again right before the incremental backup, for the next increment.

SQL>  select current_scn from v$database;

CURRENT_SCN
-----------
    1458731

Now we create the incremental backup.

RMAN>  backup for transport allow inconsistent incremental from scn 1446020 format '/tmp/MMI_%U.bkp' tablespace mmi;

Starting backup at 2018-09-24 13:54:20
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00005 name=+DATA/MMI/DATAFILE/mmi.296.987685353
channel ORA_DISK_1: starting piece 1 at 2018-09-24 13:54:21
channel ORA_DISK_1: finished piece 1 at 2018-09-24 13:54:22
piece handle=/tmp/MMI_03tdtrgd_1_1.bkp tag=TAG20180924T135421 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 2018-09-24 13:54:22

Step 4: apply incremental backup

The incremental backup that we created in step 3 can now be converted to the destination platform and applied to the datafiles.

RMAN> recover foreign datafilecopy '+DATA/MMIRECO/DATAFILE/mmi.301.987689101' from backupset '/tmp/MMI_03tdtrgd_1_1.bkp' from platform 'Linux x86 64-bit';

Starting restore at 2018-09-24 14:06:19
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring foreign file +DATA/MMIRECO/DATAFILE/mmi.301.987689101
channel ORA_DISK_1: reading from backup piece /tmp/MMI_03tdtrgd_1_1.bkp
channel ORA_DISK_1: foreign piece handle=/tmp/MMI_03tdtrgd_1_1.bkp
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 2018-09-24 14:06:21

Step 5: keep datafile copies current

This is simply done by repeating steps 3 & 4 until the final downtime takes place.
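The source-side half of that cycle could be scripted roughly like this. It is only a sketch: the tablespace name and backup location are taken from the example above, the connection strings are simplified, the SCN is carried between runs in a state file, and the apply on the destination host (step 4) is not included. `get_current_scn` and `refresh_copy` are hypothetical helper names.

```shell
#!/bin/sh
# Rough sketch of the source side of the refresh cycle (step 3), meant to
# be run repeatedly until the final downtime. The resulting backup piece
# still has to be applied on the destination host (step 4).
SCN_FILE="${SCN_FILE:-/tmp/mmi_last_scn}"

get_current_scn() {
    sqlplus -s "/ as sysdba" <<'EOF' | tr -d '[:space:]'
set heading off feedback off
select current_scn from v$database;
EOF
}

refresh_copy() {
    last_scn=$(cat "$SCN_FILE")
    # grab the SCN *before* taking the backup so the next increment has no gap
    next_scn=$(get_current_scn)
    rman target / <<EOF
backup for transport allow inconsistent incremental from scn $last_scn
  format '/tmp/MMI_%U.bkp' tablespace mmi;
EOF
    echo "$next_scn" > "$SCN_FILE"
}
```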

Step 6: final incremental backup

For the final transport, we switch the tablespaces to read-only and then create a last incremental backup.

RMAN> sql "alter tablespace mmi read only";

sql statement: alter tablespace mmi read only


RMAN> backup for transport incremental from scn 1458731 format '/tmp/MMI_%U.bkp' tablespace mmi DATAPUMP FORMAT '/tmp/mmi_ts.dmpdp';

Starting backup at 2018-09-24 14:10:48
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=271 device type=DISK
Running TRANSPORT_SET_CHECK on specified tablespaces
TRANSPORT_SET_CHECK completed successfully

Performing export of metadata for specified tablespaces...
   EXPDP> Starting "SYS"."TRANSPORT_EXP_MMI_nffu":
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/TABLE
   EXPDP> Master table "SYS"."TRANSPORT_EXP_MMI_nffu" successfully loaded/unloaded
   EXPDP> ******************************************************************************
   EXPDP> Dump file set for SYS.TRANSPORT_EXP_MMI_nffu is:
   EXPDP>   /u01/app/oracle/product/12.2.0.1/db/dbs/backup_tts_MMI_95169.dmp
   EXPDP> ******************************************************************************
   EXPDP> Datafiles required for transportable tablespace MMI:
   EXPDP>   +DATA/MMI/DATAFILE/mmi.296.987685353
   EXPDP> Job "SYS"."TRANSPORT_EXP_MMI_nffu" successfully completed at Mon Sep 24 14:12:46 2018 elapsed 0 00:01:29
Export completed

channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00005 name=+DATA/MMI/DATAFILE/mmi.296.987685353
channel ORA_DISK_1: starting piece 1 at 2018-09-24 14:12:51
channel ORA_DISK_1: finished piece 1 at 2018-09-24 14:12:54
piece handle=/tmp/MMI_05tdtsj3_1_1.bkp tag=TAG20180924T141050 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting full datafile backup set
input Data Pump dump file=/u01/app/oracle/product/12.2.0.1/db/dbs/backup_tts_MMI_95169.dmp
channel ORA_DISK_1: starting piece 1 at 2018-09-24 14:12:55
channel ORA_DISK_1: finished piece 1 at 2018-09-24 14:12:56
piece handle=/tmp/mmi_ts.dmpdp tag=TAG20180924T141050 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 2018-09-24 14:12:56

This step creates the last incremental backup that we need to apply at the destination, as well as a metadata export that we'll need to plug the datafiles into the destination database.

Step 7: final apply and plugin

The application of the incremental backup is identical to the previous apply from step 4.

RMAN> recover foreign datafilecopy '+DATA/MMIRECO/DATAFILE/mmi.301.987689101' from backupset '/tmp/MMI_05tdtsj3_1_1.bkp' from platform 'Linux x86 64-bit';

Starting restore at 2018-09-24 14:14:29
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring foreign file +DATA/MMIRECO/DATAFILE/mmi.301.987689101
channel ORA_DISK_1: reading from backup piece /tmp/MMI_05tdtsj3_1_1.bkp
channel ORA_DISK_1: foreign piece handle=/tmp/MMI_05tdtsj3_1_1.bkp
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 2018-09-24 14:14:31

Now we need to get the metadata export dump from the final backupset.

RMAN> restore dump file datapump destination '/u01/app/oracle/admin/mmireco/dpdump/' from backupset '/tmp/mmi_ts.dmpdp';


Starting restore at 2018-09-24 15:04:15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=267 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring Data Pump dump file to /u01/app/oracle/admin/mmireco/dpdump/backup_tts_MMIRECO_22317.dmp
channel ORA_DISK_1: reading from backup piece /tmp/mmi_ts.dmpdp
channel ORA_DISK_1: foreign piece handle=/tmp/mmi_ts.dmpdp
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:02
Finished restore at 2018-09-24 15:04:20

Finally, we can import the metadata export, making the foreign datafiles part of the destination database. In my example I used just one file/tablespace; you can easily plug in several datafiles at once by using wildcards in the file list. As a prerequisite, create all the users that own segments inside the transported tablespaces; otherwise the import will fail.

[oracle@vm141 ~]$ impdp  dumpfile=data_pump_dir:backup_tts_MMIRECO_22317.dmp logfile=data_pump_dir:mmi_imp_tts.impdp.log transport_datafiles='+DATA/MMIRECO/DATAFILE/mmi.301.987689101';

Import: Release 12.2.0.1.0 - Production on Mon Sep 24 15:06:15 2018

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

Username: / as sysdba

Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
Master table "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_TRANSPORTABLE_01":  /******** AS SYSDBA dumpfile=data_pump_dir:backup_tts_MMIRECO_22317.dmp logfile=data_pump_dir:mmi_imp_tts.impdp.log transport_datafiles=+DATA/MMIRECO/DATAFILE/mmi.301.987689101
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully completed at Mon Sep 24 15:07:02 2018 elapsed 0 00:00:43

Step 8: Finalization

Finally, we can switch the tablespace(s) read/write again at the destination.

SQL> alter tablespace mmi read write;

Tablespace altered.

And since the export only includes metadata for the segments inside the transported files, we still need to copy the metadata that resides in other tablespaces, e.g. the SYSTEM tablespace: views, PL/SQL, grants etc. The best approach is to do a metadata-only export at the source and import it into the destination database. I assume you know how to do that, so I safely omit this step.
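
For completeness, such a metadata-only transfer might be sketched like this (schema and dump file names are just examples, not from my actual migration):

```shell
# Export metadata only for the application schema at the source.
# Schema name and file names are placeholders.
expdp system schemas=MMI_OWNER content=metadata_only \
      dumpfile=data_pump_dir:mmi_meta.dmp logfile=data_pump_dir:mmi_meta_exp.log

# Import it at the destination. Objects that already arrived via the
# transportable tablespace import are skipped (table_exists_action=skip
# is the Data Pump default), so the resulting errors can usually be ignored.
impdp system dumpfile=data_pump_dir:mmi_meta.dmp logfile=data_pump_dir:mmi_meta_imp.log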

Conclusion

Transporting tablespaces between platforms and versions is not a one-liner, but it is an acceptable effort when you consider the downtime needed to switch from one system to another. And in my opinion, when you do it manually without the scripts from Oracle, it becomes much clearer how things work.

Platform change and upgrade in one step – XTTS

Today I want to tell you something about cross platform transportable tablespaces (XTTS) and incrementally updated datafile copies. This is a very nice feature for moving to another platform and changing the Oracle version in one step. There are several MOS notes, whitepapers and of course some upgrade slides by Mike Dietrich covering this topic.

The basic steps in all cases and versions are these:

  1. set tablespace(s) read only
  2. export metadata for those tablespaces
  3. transport datafiles and dump to target system
  4. create necessary database users
  5. import dump into target database which makes the datafiles part of the target database
  6. set tablespace(s) read write
  7. optionally, transport other objects like views, PL/SQL etc.
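
The first two steps of this basic process can be sketched like this (tablespace and file names are placeholders):

```shell
# Step 1: set the tablespace read only at the source.
sqlplus -s / as sysdba <<EOF
ALTER TABLESPACE mmi READ ONLY;
EOF

# Step 2: export the metadata for the transport set.
expdp system transport_tablespaces=MMI \
      dumpfile=data_pump_dir:mmi_tts.dmp logfile=data_pump_dir:mmi_tts_exp.log
```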

But what if the tablespaces are too big and the copy time would exceed the allowed downtime window? Then we can use incremental backups to shorten the downtime. The steps to do this are quite similar.

  1. backup tablespace(s) for transport
  2. copy backup to target system
  3. restore and convert datafiles to target destination and platform
  4. create incremental backup for transported tablespace(s)
  5. recover transported datafiles using the incremental backup
  6. repeat steps 4 and 5 until final downtime window
  7. set tablespace(s) read only
  8. create final incremental backup for transported tablespace(s)
  9. apply incremental backup to transported datafiles
  10. continue with step 2 of the initial process above
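
Steps 4 and 5 can be sketched in RMAN roughly like this; the SCN and file names are placeholders, and the exact cross-platform syntax depends on the version, so refer to the MOS notes below for the authoritative commands:

```sql
-- Source: incremental backup of the transported tablespace, starting
-- at the SCN of the previous backup (the SCN value is a placeholder).
BACKUP INCREMENTAL FROM SCN 1234567 TABLESPACE mmi
  FORMAT '/tmp/mmi_inc_%U.bkp';

-- Destination (12c+): roll the restored foreign datafile copy forward
-- with the incremental backup piece (name is a placeholder).
RECOVER FOREIGN DATAFILECOPY '+DATA/MMIRECO/DATAFILE/mmi.301.987689101'
  FROM BACKUPSET '/tmp/mmi_inc_01.bkp';
```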

Sounds a bit complicated, doesn’t it? Fortunately, Oracle was kind enough to put the whole process into scripts. The following MOS notes contain and describe these scripts:

  • 11G – Reduce Transportable Tablespace Downtime using Cross Platform Incremental Backup (Doc ID 1389592.1)
  • 12C – Reduce Transportable Tablespace Downtime using Cross Platform Incremental Backup (Doc ID 2005729.1)
  • V4 Reduce Transportable Tablespace Downtime using Cross Platform Incremental Backup (Doc ID 2471245.1)

During a migration project from Linux to SPARC (don’t ask) I wanted to use those scripts. But I experienced the following during “Phase 2”:

oracle@server2:~/xtts> $ORACLE_HOME/perl/bin/perl xttdriver.pl --backup
============================================================
trace file is /oracle/xtts/backup_Sep20_Thu_12_20_52_917//Sep20_Thu_12_20_52_917_.log
=============================================================

[…]

--------------------------------------------------------------------
Done with prepare phase
--------------------------------------------------------------------


--------------------------------------------------------------------
Find list of datafiles in system
--------------------------------------------------------------------

SP2-0027: Input is too long (> 2499 characters) - line ignored
SP2-0027: Input is too long (> 2499 characters) - line ignored
           from dual
           *
ERROR at line 19:
ORA-00936: missing expression

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Error:
------
Error in executing /oracle/xtts/backup_Sep20_Thu_12_27_07_889//diff.sql

Okay. The reason is that I have lots of tablespaces to transfer, and this long list of datafiles messed up the whole process. So I tried to understand the process and just do it myself. What I learned is that it is quite easy to do manually. I will share my findings with you by outlining the basic process in an upcoming blog post.