Patching an OMS 13.2 on Windows

Yesterday I had to patch an Enterprise Manager 13.2 running on a Windows box. It had never been patched before, so it was clear that some preparation would be required beforehand. But there were some other unforeseen things, which I will describe in this post. The system's locale was German, so I have translated its output in the snippets into English….

First, I needed to update “OPatch” and “OMSPatcher”. Refreshing the latter was easy, since I just needed to replace the “OMSPatcher” directory in the Middleware Home with the new one. Refreshing “OPatch” used to work the same way, but I learned that this has changed. Reading My Oracle Support docs is really helpful sometimes. Here is how it works now:

D:\CloudControl_cc13r2\Update\opatch_13.9.1.3.0\6880880>D:\oracle\product\mw13cR2\oracle_common\jdk\bin\java.exe -jar .\opatch_generic.jar -silent ORACLE_HOME=%ORACLE_HOME%
The launcher log file is C:\Users\XXX\AppData\Local\Temp\2\OraInstall2017-04-06_11-15-14AM\launcher2017-04-06_11-15-14AM.log.
Extracting the installer... . . Done
Checking whether the CPU speed is above 300 MHz   Actual 3500    Passed
Checking swap space: must be greater than 512 MB    Passed
Checking whether this platform requires a 64-bit JVM   Actual 64    Passed (64-bit not required)
Checking temporary disk space: must be greater than 300 MB   Actual 46996 MB    Passed


Preparing to launch Oracle Universal Installer from C:\Users\XXX\AppData\Local\Temp\2\OraInstall2017-04-06_11-15-14AM
Installation summary


Disk space: 27 MB required, 397,477 MB available
Feature sets to install:
        Next Generation Install Core 13.9.1.0.1
        OPatch 13.9.1.3.0
        OPatch Auto OPlan 13.9.1.0.0
The session log file is C:\Users\XXX\AppData\Local\Temp\2\OraInstall2017-04-06_11-15-14AM\install2017-04-06_11-15-14AM.log

Loading the product list. Please wait.
 1%

[...]

The logs can be found here: C:\Users\XXX\AppData\Local\Temp\2\OraInstall2017-04-06_11-15-14AM.

Press Enter to exit
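
To make sure the refresh actually worked, a quick version check does not hurt; the freshly installed OPatch should now report 13.9.1.3.0:

D:\CloudControl_cc13r2\Update>%ORACLE_HOME%\OPatch\opatch version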

Now I wanted to apply the OMS-side patch using omspatcher:

D:\CloudControl_cc13r2\Update\25501489>omspatcher apply 
OMSPatcher Automation Tool
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


OMSPatcher version : 13.8.0.0.2
OUI version        : 13.9.1.0.0
Running from       : d:\oracle\product\mw13cR2
Log file location  : d:\oracle\product\mw13cR2\cfgtoollogs\omspatcher\opatch2017-04-06_11-22-19AM_1.log

OMSPatcher log file: d:\oracle\product\mw13cR2\cfgtoollogs\omspatcher\25501489\omspatcher_2017-04-06_11-22-22AM_deploy.log

Please enter OMS weblogic admin server URL(t3s://omshost.acme.com:7101):>
Please enter OMS weblogic admin server username(weblogic):>
Please enter OMS weblogic admin server password:>


OMSPatcher could not read installed OMS owner from OUI inventory by itself.
Please add OMSPatcher.OMS_USER=<OMS installed user> to command line and try again.


[ Error during Get Central Inventory Information Phase]. Detail: OMSPatcher was not able to read OUI inventory to retrieve installed user & system details.
OMSPatcher failed: OMSPatcher could not read installed OMS owner from OUI inventory by itself.
Please add OMSPatcher.OMS_USER=<OMS installed user> to command line and try again.

Log file location: d:\oracle\product\mw13cR2\cfgtoollogs\omspatcher\25501489\omspatcher_2017-04-06_11-22-22AM_deploy.log

Recommended actions: Please check if OUI inventory is locked by some other processes. Please check if OUI inventory is readable.

OMSPatcher failed with error code = 236

…and failed. But the required action is stated quite clearly, so I added the owner of the OMS home:

D:\CloudControl_cc13r2\Update\25501489>omspatcher apply OMSPatcher.OMS_USER=adkaiser
OMSPatcher Automation Tool
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


OMSPatcher version : 13.8.0.0.2
OUI version        : 13.9.1.0.0
Running from       : d:\oracle\product\mw13cR2
Log file location  : d:\oracle\product\mw13cR2\cfgtoollogs\omspatcher\opatch2017-04-06_11-25-23AM_1.log

OMSPatcher log file: d:\oracle\product\mw13cR2\cfgtoollogs\omspatcher\25501489\omspatcher_2017-04-06_11-25-29AM_deploy.log

Please enter OMS weblogic admin server URL(t3s://omshost.acme.com:7101):>
Please enter OMS weblogic admin server username(weblogic):>
Please enter OMS weblogic admin server password:>


OMSPatcher could not read installed OMS owner from OUI inventory by itself.
Please add OMSPatcher.OMS_USER=<OMS installed user> to command line and try again.


[ Error during Get Central Inventory Information Phase]. Detail: OMSPatcher was not able to read OUI inventory to retrieve installed user & system details.
OMSPatcher failed: OMSPatcher could not read installed OMS owner from OUI inventory by itself.
Please add OMSPatcher.OMS_USER=<OMS installed user> to command line and try again.

Log file location: d:\oracle\product\mw13cR2\cfgtoollogs\omspatcher\25501489\omspatcher_2017-04-06_11-25-29AM_deploy.log

Recommended actions: Please check if OUI inventory is locked by some other processes. Please check if OUI inventory is readable.

OMSPatcher failed with error code = 236

The logfile is not very helpful either:

D:\CloudControl_cc13r2\Update\25501489>type d:\oracle\product\mw13cR2\cfgtoollogs\omspatcher\25501489\omspatcher_2017-04-06_11-25-29AM_deploy.log
[06.04.2017 11:25:29]        OMSPatcher has successfully verified min_patching_tool_version check
[06.04.2017 11:25:47]        [ Error during Get Central Inventory Information phase ] Detail: OMSPatcher was not able to read OUI inventory to retrieve installed user & system details.
OMSPatcher could not read installed OMS owner from OUI inventory by itself.
Please add OMSPatcher.OMS_USER=<OMS installed user> to command line and try again.

So I started researching and found “EM13c: OMSPatcher Analyze Command For 13c OMS Fails With Error ‘OMSPatcher failed with error code 236’” (Doc ID 2136840.1), which says that the parameter needs to be specified inside a properties file. Another document, “EM 13c: How to Apply a Patch to the Enterprise Manager 13c Cloud Control OMS Oracle Home” (Doc ID 2091619.1), describes how to create this properties file.

The first step is to create the WebLogic encrypted configuration and key files.

D:\CloudControl_cc13r2\Update>%ORACLE_HOME%\OMSpatcher\wlskeys\createkeys.cmd -oh %ORACLE_HOME% -location D:\CloudControl_cc13r2\Update

[...]

Your environment has been set.
Please enter weblogic admin server username::> weblogic
Please enter weblogic admin server password::>
CreateKeys Weblogic API executed successfully.

User configuration file created: D:\CloudControl_cc13r2\Update\config
User key file created: D:\CloudControl_cc13r2\Update\key
'createkeys.bat' succeeded.

I can now use these two files in my properties file, along with the OMS owner. This is the content of my properties file:

AdminServerURL=t3s://omshost.acme.com:7101
AdminConfigFile=D:\\CloudControl_cc13r2\\Update\\config
AdminKeyFile=D:\\CloudControl_cc13r2\\Update\\key
OPatchAuto.OMS_USER=administrator

Be aware of the double backslashes; otherwise it won’t work.
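
Before the real apply, it is worth letting OMSPatcher verify the properties file and the patch in analyze mode first; the -analyze flag runs all the checks without changing anything:

D:\CloudControl_cc13r2\Update\25501489>omspatcher apply -analyze -property_file D:\CloudControl_cc13r2\Update\properties.txt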

Now, finally, I am able to patch my OMS using this properties file:

D:\CloudControl_cc13r2\Update\25501489>omspatcher apply -property_file D:\CloudControl_cc13r2\Update\properties.txt
OMSPatcher Automation Tool
Copyright (c) 2017, Oracle Corporation.  All rights reserved.


OMSPatcher version : 13.8.0.0.2
OUI version        : 13.9.1.0.0
Running from       : d:\oracle\product\mw13cR2
Log file location  : d:\oracle\product\mw13cR2\cfgtoollogs\omspatcher\opatch2017-04-06_13-08-43PM_1.log

OMSPatcher log file: d:\oracle\product\mw13cR2\cfgtoollogs\omspatcher\25501489\omspatcher_2017-04-06_13-08-53PM_analyze.log



WARNING: Could not apply the patch "25414328" because the "oracle.sysman.vi.oms.plugin with version 13.2.1.0.0" core component of the OMS or the plug-in for which the patch is intended is either not deployed or deployed with another version in your Enterprise Manager system.
WARNING: Could not apply the patch "25414306" because the "oracle.sysman.emfa.oms.plugin with version 13.2.1.0.0" core component of the OMS or the plug-in for which the patch is intended is either not deployed or deployed with another version in your Enterprise Manager system.
WARNING: Could not apply the patch "25118889" because the "oracle.sysman.vt.oms.plugin with version 13.2.1.0.0" core component of the OMS or the plug-in for which the patch is intended is either not deployed or deployed with another version in your Enterprise Manager system.
WARNING: Could not apply the patch "25414263" because the "oracle.sysman.csm.oms.plugin with version 13.2.1.0.0" core component of the OMS or the plug-in for which the patch is intended is either not deployed or deployed with another version in your Enterprise Manager system.
WARNING: Could not apply the patch "25414255" because the "oracle.sysman.ssa.oms.plugin with version 13.2.1.0.0" core component of the OMS or the plug-in for which the patch is intended is either not deployed or deployed with another version in your Enterprise Manager system.
WARNING: Could not apply the patch "25414356" because the "oracle.sysman.smf.oms.plugin with version 13.2.1.0.0" core component of the OMS or the plug-in for which the patch is intended is either not deployed or deployed with another version in your Enterprise Manager system.

Configuration Validation: Success


Running apply prerequisite checks for sub-patch(es) "25414294,25414339,25414245,25414317,25414281" and Oracle Home "d:\oracle\product\mw13cR2"...
Sub-patch(es) "25414294,25414339,25414245,25414317,25414281" are successfully analyzed for Oracle Home "d:\oracle\product\mw13cR2"

[...]
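
Once the apply has run through, the result can be verified with OMSPatcher's lspatches command, which lists the patches installed in the OMS home:

D:\CloudControl_cc13r2\Update\25501489>omspatcher lspatches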

12c Upgrade stuck at fixed object stats

A while ago I did some database upgrades to version 12.1.0.2. One of those upgrades made me a bit nervous: I used “dbca” to perform the upgrade, and it seemed to be stuck. So I went to the command prompt and started to investigate. This is what I found:

[Screenshot: a session waiting on “ADR block file read” while gathering statistics on X$DBKFDG]

A session was waiting on “ADR block file read” again and again. The SQL causing these waits can be seen in the screenshot too: it was gathering statistics on X$DBKFDG, part of the “gather fixed object stats” step.

A quick search in My Oracle Support did not bring up any helpful information. My first idea was to check the ADR, but I found it to be nearly empty, at least in the suspicious directories like “trace”. So I posted the issue on Twitter and asked for help. @realadrienne pointed me to a similar issue related to the Recovery Advisor and its list of critical issues. So I checked that using

SQL> select * from v$ir_failure;

which returned loads of records that had a priority of “critical” and a status of “closed”. So I tried to get rid of these records using “adrci”.

adrci> purge -age 1 -type hm

This took quite a while, about 10 minutes, to complete. But in the end the v$ir_failure view had no records anymore, and the fixed object stats were gathered quickly.

So now there is an additional pre-upgrade check on my personal list that says “clean up v$ir_failure”. You should add it to your list too, to prevent unnecessary delays during database upgrades.
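
A minimal version of that check could look like this; run the query as SYSDBA before starting the upgrade, and note that the homepath below is just a placeholder for your database's ADR home:

SQL> select count(*) from v$ir_failure;

$ adrci
adrci> set homepath <your database ADR home>
adrci> purge -age 1 -type hm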

Thanks to @realadrienne, @FranckPachot and @OracleSK for the quick and very helpful assistance.

Thanks also to @MikeDietrichDE for the feedback on this issue.

 

 

DataPatch stuck on RAC – PSU October 2016

Yesterday one of my customers wanted to patch two 2-node clusters with the current PSU October 2016 (161018). Both are running 12.1.0.2 Grid Infrastructure and 12.1.0.2 Database; the servers run SPARC Solaris 10. When applying the patch on the first cluster using “opatchauto”, everything went fine until the “trying to apply SQL Patch” part on the 2nd node. So I went to the log directory and found the following:

$ cd $ORACLE_BASE/cfgtoollogs/sqlpatch/sqlpatch_27075_2016_11_30_17_12_08
$ tail sqlpatch_catcon_0.log

SQL> GRANT SELECT ON sys.gv_$instance TO dv_secanalyst
  2  /

It was stuck at that line. Searching My Oracle Support brought up nothing helpful. So I had a look at the database sessions:

SQL> select sid, username, event, state, seconds_in_wait
  2  from v$session where username='SYS';

       SID USERNAME   EVENT                        STATE             SECONDS_IN_WAIT
---------- ---------- ---------------------------- ----------------- ---------------
        13 SYS        SQL*Net message from client  WAITING                       226
        30 SYS        SQL*Net message from client  WAITING                       473
        32 SYS        SQL*Net message to client    WAITED SHORT TIME               0
       411 SYS        SQL*Net message from client  WAITING                       473
       783 SYS        library cache lock           WAITING                       211
       786 SYS        SQL*Net message from client  WAITING                         4
      1155 SYS        SQL*Net message from client  WAITING                       467

Session 783 is waiting on a “library cache lock”, so something dictionary-related is blocked. Since the hanging statement was a grant on a RAC-related view (gv_$instance), I stopped the other instance, which made sqlpatch continue immediately. So the workaround looked like this:

$ srvctl stop instance -db <dbname> -node <node1>
$ srvctl start instance -db <dbname> -node <node1>
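
With hindsight (see the update from 13-DEC below), a less invasive first step would have been to identify the session holding the lock. In 12c the blocker is exposed directly in gv$session, so a sketch like this should point at the culprit:

SQL> select inst_id, sid, serial#, username, program,
  2         final_blocking_instance, final_blocking_session
  3  from gv$session
  4  where event = 'library cache lock';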

This happened on both clusters. So be aware of that in case you are applying that PSU patch to RAC databases.
In case you fail to stop the first instance in time, the GRANT statement will run into a timeout (ORA-4021) and the SQL patch will be marked “ERROR” in DBA_REGISTRY_SQLPATCH. In that case, just re-run “datapatch” and monitor the logfile.
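
Checking for that error state and re-running the SQL patch step boils down to something like this (datapatch lives in the OPatch directory of the database home):

SQL> select patch_id, action, status, action_time
  2  from dba_registry_sqlpatch
  3  order by action_time;

$ $ORACLE_HOME/OPatch/datapatch -verbose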
Happy patching.

Update 07-DEC-2016

I was not able to reproduce this issue on a Linux x86-64 system. So there is a chance that the issue is OS related.

Update 12-DEC-2016

Finally I reproduced the issue on my Linux x86-64 test system, so I opened an SR for it.

Update 13-DEC-2016

Thanks to a quick and efficient Oracle Support engineer (yes, such people exist!) we found the root cause of the issue. There is a bug in the Enterprise Manager Agent (maybe in DB Express too): it holds a shared lock on some GV$ views during the whole lifetime of a session. That is why datapatch got stuck. If you just stop the Agent, datapatch will continue immediately. There is no need to stop the whole instance; we just need to get rid of the Agent's sessions.
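
So on each node, stopping the agent around the datapatch run should be enough; the AGENT_HOME path is of course specific to your installation:

$ $AGENT_HOME/bin/emctl stop agent
$ # ... run datapatch ...
$ $AGENT_HOME/bin/emctl start agent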
Thanks a lot to Prakash from Oracle Support for his engagement in investigating this issue.

opatchauto Odyssey

A couple of days ago a customer asked for assistance in installing the January PSU in their RAC environment. The patch was to be applied to two systems: first the test cluster, then the production cluster. Makes sense so far. So we planned the steps that needed to be done:

  • Download the patch
  • Copy the patch to all nodes and extract it
  • Check the OPatch version (see the sketch after this list)
  • Create a response file for OCM and copy it to all nodes
  • Clear the ASM adump directory, since a large number of audit files may slow down the pre-patch steps
  • Run “opatchauto” on the first node
  • Run “opatchauto” on the second node
  • Run “datapatch” to apply the SQL changes to the databases
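
For the OPatch and OCM preparation, the commands boil down to something like this; the emocmrsp utility ships inside the OPatch directory, and the output path is just an example:

$ $ORACLE_HOME/OPatch/opatch version
$ $ORACLE_HOME/OPatch/ocm/bin/emocmrsp -no_banner -output /tmp/ocm.rsp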

The whole procedure went through without any issues on the test system. We even skipped the last step, running “datapatch”, since “opatchauto” had already done that for us. This is in contrast to the Readme, which does not mention it.

So that was easy. But unfortunately things did not go as smoothly on the production system. “opatchauto” shut down the cluster stack and patched the RDBMS home successfully. But during the patch phase of the GI home, the logfile told us that there were still processes blocking some files. I checked that and found a handful; one of those processes was the “ocssd”. The moment I killed all the left-over processes, I knew that this had not been the best idea: the server fenced and rebooted straight away. That left my cluster in a fuzzy state. The cluster stack came up again, but “opatchauto -resume” told me that I should proceed with some manual steps. So I applied the patches to the GI home, which had not been done before, and ran the post-patch script, which failed. Starting “opatchauto” in normal mode failed too, since the cluster was already in “rolling” mode.
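
Checking for such left-over processes before letting opatchauto touch the GI home is cheap insurance. A sketch, assuming a Grid home of /u01/app/12.1.0.2/grid; both commands exist on Solaris and Linux:

$ ps -ef | grep /u01/app/12.1.0.2/grid | grep -v grep
$ fuser -u /u01/app/12.1.0.2/grid/bin/oracle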

So finally I removed all the applied patches manually, put the cluster back into normal mode following MOS Note 1943498.1, and started the whole patching all over. This time everything went fine.

Conclusion

  1. Think before you act. Killing OCSSD is not a good idea at all.
  2. In contrast to the Readme, “datapatch” is executed by “opatchauto” as part of the patching process.
  3. Checking the current cluster status can be done like this:
[oracle@vm101 ~]$ crsctl query crs activeversion -f
Oracle Clusterware active version on the cluster is [12.1.0.2.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [3467666221].