Housekeeping is an often underestimated topic when running databases in production. In my blog post about the size-based deletion policy I described a new way of limiting the size of an ADR home. But what if you are not yet on Oracle Database 12.2 or later? Then you have to rely on the existing time-based deletion policies, which is what we do for our customers. A couple of weeks ago one of them experienced space pressure on the /u01 filesystem of one of their ODAs. The automatic purging worked fine, but the filesystem usage was still increasing. The Enterprise Manager metric illustrates this behaviour.
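For reference, these time-based policies are configured per ADR home using adrci. The home path and retention values below are just examples, adjust them to your environment (SHORTP_POLICY and LONGP_POLICY are given in hours):

[oracle@odax7-2m ~]$ adrci
adrci> show homes
adrci> set home diag/rdbms/domeadd/DOMEADD
adrci> show control
adrci> set control (SHORTP_POLICY = 720, LONGP_POLICY = 2160)
adrci> purge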
So we started to investigate and ended up in the trace directory of some of the databases. Here is an example from one of them.
[root@odax7-2m trace]# du -hs * | sort -h | tail -6
68M     DOMEADD_gen0_82205.trm
69M     DOMEADD_mmon_82354.trm
131M    DOMEADD_ipc0_82195.trm
419M    DOMEADD_gen0_82205.trc
423M    DOMEADD_mmon_82354.trc
462M    DOMEADD_ipc0_82195.trc
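To spot similar offenders across all databases on the machine at once, a size-filtered find can help (the 100M threshold is an arbitrary choice):

[root@odax7-2m trace]# find /u01/app/oracle/diag/rdbms -type f -name "*.tr[cm]" -size +100M -exec ls -lh {} \;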
We started with the largest trace and had a look at its end.
[root@odax7-2m trace]# tail DOMEADD_ipc0_82195.trc
2019-03-15 09:23:07.749*:kjuinc(): no cluster database, return inc# KSIMINVL

*** 2019-03-15T09:23:10.913816+01:00
2019-03-15 09:23:10.913*:kjuinc(): no cluster database, return inc# KSIMINVL
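A quick grep makes the scale of the problem visible by counting how many times this single message repeats in the file:

[root@odax7-2m trace]# grep -c "no cluster database, return inc# KSIMINVL" DOMEADD_ipc0_82195.trc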
With this in hand, a quick search on My Oracle Support revealed Bug 27989556 Excessive Trace Message: no cluster database, return inc# ksiminvl. And of course there was no fix available at the time. This changed on April 16th: there is now a fix for 18c, delivered with the 18.6.0 Release Update.
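Whether a given home already contains the fixing Release Update can be verified with OPatch, for example:

[oracle@odax7-2m ~]$ $ORACLE_HOME/OPatch/opatch lspatches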
Ok, next file. Let’s have a look.
[root@odax7-2m trace]# tail DOMEADD_mmon_82354.trc
AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1

*** 2019-03-15T09:29:55.474098+01:00
AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1

*** 2019-03-15T09:29:58.474092+01:00
AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1
Again, MOS had the answer: AUTO SGA: kmgs_parameter_update_timeout gen0 0 mmon alive 1 Excessive Trace Files were generated on one node after Migration to 12.2 and MMON trace file grows (Doc ID 2298766.1). At least this time there are patches available.
Since patching would have required downtime and would have addressed only one of the two issues, we decided against it. Instead, we came up with a simple but effective workaround: truncating the trace files in place. Simply deleting them would not have released the space, because the background processes keep the files open; truncating them via redirection frees it immediately.
for i in $(find /u01/app/oracle/diag/rdbms -type f -name "*mmon*tr*"); do echo "" > "$i"; done
for i in $(find /u01/app/oracle/diag/rdbms -type f -name "*ipc0*tr*"); do echo "" > "$i"; done
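To run the cleanup regularly, a cron entry is one option. A minimal sketch, assuming an /etc/cron.d file and a nightly schedule (the file name and timing are made up, adjust to your needs):

# /etc/cron.d/truncate-noisy-traces (hypothetical)
# nightly at 02:00, truncate the mmon and ipc0 traces of all databases
0 2 * * * oracle for i in $(find /u01/app/oracle/diag/rdbms -type f \( -name "*mmon*tr*" -o -name "*ipc0*tr*" \)); do echo "" > "$i"; done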
Once this was run, and we now do so regularly, the filesystem usage dropped immediately. This is what Enterprise Manager showed right after running the workaround.