Benchmarking Data Guard Throughput

Motivation

When planning a new Oracle Data Guard setup, we do a lot of benchmarking beforehand. We do I/O benchmarks using Orion, and application benchmarks using SwingBench or SLOB. For network bandwidth and throughput there is iPerf. But iPerf tells us only half the truth: it just measures the network. In a Data Guard setup with synchronous redo transport there is more to it. The receiving site needs to write the information to disk and, after successful completion, acknowledge back to the sender.

Oracle provides a tool called “OraTCPTest” for exactly that purpose. It is pretty similar to iPerf, but has the additional capability to reflect the behaviour of Data Guard. The tool and basic instructions can be found in My Oracle Support document “Measuring Network Capacity using oratcptest (Doc ID 2064368.1)“.

Basically we start a server process at the receiving site, which can optionally write the received information to disk, and a client process at the sending site. Let’s walk through a basic example. I did this on Windows boxes, it works even there 🙂

1. Start the server process

We just need to specify the port which should be used for listening.

D:\install>java -jar oratcptest.jar -server -port=4711
OraTcpTest server started.

There is nothing more than that.

2. Start the client process

Here we need a few more parameters: the target host, the port, and some timing settings.

D:\install>java -jar oratcptest.jar myotherserver -port=4711 -duration=10s -interval=2s
[Requesting a test]
        Message payload        = 1 Mbyte
        Payload content type   = RANDOM
        Delay between messages = NO
        Number of connections  = 1
        Socket send buffer     = (system default)
        Transport mode         = SYNC
        Disk write             = NO
        Statistics interval    = 2 seconds
        Test duration          = 10 seconds
        Test frequency         = NO
        Network Timeout        = NO
        (1 Mbyte = 1024x1024 bytes)

(15:05:55) The server is ready.
                        Throughput                 Latency
(15:05:57)        101.832 Mbytes/s                9.820 ms
(15:05:59)        102.159 Mbytes/s                9.789 ms
(15:06:01)        102.169 Mbytes/s                9.788 ms
(15:06:03)        101.954 Mbytes/s                9.808 ms
(15:06:05)        101.998 Mbytes/s                9.804 ms
(15:06:05) Test finished.
               Socket send buffer = 64 kbytes
                  Avg. throughput = 101.968 Mbytes/s
                     Avg. latency = 9.807 ms

And that’s it. You can see all the parameters that were used and their values. The output then gives us the performance details: the average throughput and latency.

Finding the truth

Assuming we are migrating an existing system, we need to know whether the network can handle the amount of redo that needs to be shipped. AWR or Statspack reports tell us these numbers, but they are averages; peaks might be masked in those reports. Using the information in V$ARCHIVED_LOG is much more accurate.

Let’s look at a timeframe during daytime workload and find the maximum redo rate.

select
  thread#,
  sequence#,
  blocks*block_size/1024/1024 MB,
  (next_time-first_time)*86400 sec,
  (blocks*block_size/1024/1024)/((next_time-first_time)*86400) "MB/s"
from
  v$archived_log
where
  ( (next_time-first_time)*86400<>0)
and
  first_time between
    to_date('2016/11/22 08:00:00','YYYY/MM/DD HH24:MI:SS') and
    to_date('2016/11/22 16:00:00','YYYY/MM/DD HH24:MI:SS')
and
  dest_id=2
order by 5;

   THREAD#  SEQUENCE#         MB        SEC       MB/s
---------- ---------- ---------- ---------- ----------
         1     349010 226.600586        453 .500222044
         1     348976 226.943848        415 .546852645
[...]
         1     349048 240.336914         56 4.29173061
         1     348953  240.79248         53 4.54325435
         1     349049 246.279785         47 5.23999543
         1     348987 228.593262         37 6.17819626

So at peak workload one redo file of roughly 230 MB is written in 37 seconds, which results in 6-7 MB/s. In the first simple example we achieved over 100 MB/s, so this is easy, isn’t it? No, it isn’t, as we will see in a moment. First we need to find out how large a typical redo write is. We can get these values from AWR reports or just query V$SYSSTAT.

select
  (select value from v$sysstat where statistic# = (select statistic# from v$statname where name = 'redo size')) /
  (select value from v$sysstat where statistic# = (select statistic# from v$statname where name = 'redo writes')) "avg redo write size"
from
  dual;

avg redo write size
-------------------
         9399.55267

The typical redo write size is roughly 9400 bytes in this case. Now we can start over and run the benchmark again with the proper parameters.

Real Life Benchmark

Again we start the server process, but this time we specify a file that the server can write to. Since the tool cannot write directly to ASM, I created an ACFS volume to get roughly the performance of the ASM disks that will host the standby redo logs in the future environment.
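
For reference, creating such an ACFS volume could look roughly like the following sketch. The disk group name, volume name, size and mount point are assumptions; the ADVM device name is generated, so check "asmcmd volinfo"; and on Windows the volume is formatted with acfsformat instead of mkfs.

SQL> alter diskgroup DATA add volume oratcp size 10G;

Diskgroup altered.

# mkfs -t acfs /dev/asm/oratcp-123
# mount -t acfs /dev/asm/oratcp-123 /u01/app/oracle/oradata/acfs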

D:\install>java -jar oratcptest.jar -server -port=4711 -file=d:\app\oracle\oradata\acfs\oratcptest.tmp
OraTcpTest server started.

Now we can start several tests against this server process. Let’s start with asynchronous transmission without writing the information at the receiving site.

D:\install>java -jar oratcptest.jar myotherserver -port=4711 -duration=10s -interval=2s -mode=async -length=9400
[Requesting a test]
        Message payload        = 9400 bytes
        Payload content type   = RANDOM
        Delay between messages = NO
        Number of connections  = 1
        Socket send buffer     = (system default)
        Transport mode         = ASYNC
        Disk write             = NO
        Statistics interval    = 2 seconds
        Test duration          = 10 seconds
        Test frequency         = NO
        Network Timeout        = NO
        (1 Mbyte = 1024x1024 bytes)

(08:42:53) The server is ready.
                        Throughput
(08:42:55)        112.814 Mbytes/s
(08:42:57)        112.731 Mbytes/s
(08:42:59)        112.641 Mbytes/s
(08:43:01)        112.622 Mbytes/s
(08:43:03)        112.665 Mbytes/s
(08:43:03) Test finished.
               Socket send buffer = 16 Mbytes
                  Avg. throughput = 112.642 Mbytes/s

Ok, about 10 MB/s more than in the first test; we are transferring just 9 kB per message instead of the default 1 MB.
Next test: again asynchronous transfer, but this time the server writes the message content to disk before replying.

D:\install>java -jar oratcptest.jar myotherserver -port=4711 -duration=10s -interval=2s -mode=async -write -length=9400
[Requesting a test]
        Message payload        = 9400 bytes
        Payload content type   = RANDOM
        Delay between messages = NO
        Number of connections  = 1
        Socket send buffer     = (system default)
        Transport mode         = ASYNC
        Disk write             = YES
        Statistics interval    = 2 seconds
        Test duration          = 10 seconds
        Test frequency         = NO
        Network Timeout        = NO
        (1 Mbyte = 1024x1024 bytes)

(08:42:34) The server is ready.
                        Throughput
(08:42:36)         25.230 Mbytes/s
(08:42:38)         26.655 Mbytes/s
(08:42:40)         27.600 Mbytes/s
(08:42:42)         27.578 Mbytes/s
(08:42:44)         27.603 Mbytes/s
(08:42:44) Test finished.
               Socket send buffer = 1 Mbyte
                  Avg. throughput = 26.922 Mbytes/s

The throughput went down massively. Only 27 MB/s are left, roughly a quarter of the throughput we achieved without writing the message payload. A massive impact.
So what happens when we do the transfer synchronously, as Data Guard does with Maximum Availability? Let’s see.

D:\install>java -jar oratcptest.jar myotherserver -port=4711 -duration=10s -interval=2s -mode=sync -write -length=9400
[Requesting a test]
        Message payload        = 9400 bytes
        Payload content type   = RANDOM
        Delay between messages = NO
        Number of connections  = 1
        Socket send buffer     = (system default)
        Transport mode         = SYNC
        Disk write             = YES
        Statistics interval    = 2 seconds
        Test duration          = 10 seconds
        Test frequency         = NO
        Network Timeout        = NO
        (1 Mbyte = 1024x1024 bytes)

(08:44:28) The server is ready.
                        Throughput                 Latency
(08:44:30)         15.082 Mbytes/s                0.595 ms
(08:44:32)         15.959 Mbytes/s                0.562 ms
(08:44:34)         16.402 Mbytes/s                0.547 ms
(08:44:36)         16.603 Mbytes/s                0.541 ms
(08:44:38)         16.579 Mbytes/s                0.541 ms
(08:44:38) Test finished.
               Socket send buffer = 64 kbytes
                  Avg. throughput = 16.117 Mbytes/s
                     Avg. latency = 0.557 ms

Again a degradation: only 16 MB/s are left. And remember, we’ll need 7 MB/s during peak load. Beyond that, we now also see the latency, which will later impact commit performance since the sending (primary) server waits until the redo is written and acknowledged by the receiving (standby) server.
Just for the sake of completeness, here is the test with synchronous transfer but without writing.

D:\install>java -jar oratcptest.jar myotherserver -port=4711 -duration=10s -interval=2s -mode=sync -length=9400
[Requesting a test]
        Message payload        = 9400 bytes
        Payload content type   = RANDOM
        Delay between messages = NO
        Number of connections  = 1
        Socket send buffer     = (system default)
        Transport mode         = SYNC
        Disk write             = NO
        Statistics interval    = 2 seconds
        Test duration          = 10 seconds
        Test frequency         = NO
        Network Timeout        = NO
        (1 Mbyte = 1024x1024 bytes)

(08:44:11) The server is ready.
                        Throughput                 Latency
(08:44:13)         23.547 Mbytes/s                0.381 ms
(08:44:15)         31.800 Mbytes/s                0.282 ms
(08:44:17)         33.249 Mbytes/s                0.270 ms
(08:44:19)         32.639 Mbytes/s                0.275 ms
(08:44:21)         31.627 Mbytes/s                0.284 ms
(08:44:21) Test finished.
               Socket send buffer = 64 kbytes
                  Avg. throughput = 30.556 Mbytes/s
                     Avg. latency = 0.294 ms

It is somewhat faster than asynchronous transfer with writing.

Conclusion

Simple network bandwidth tests are not sufficient to predict future Data Guard redo transport performance. OraTCPTest provides a nice opportunity to measure throughput and get realistic values that can be used to plan and size the network. Again, here are the numbers of the different tests:

ASYNC nowrite   ASYNC write   SYNC nowrite   SYNC write
-------------   -----------   ------------   ----------
112 MB/s        27 MB/s       32 MB/s        16 MB/s

We could now try to tune this further by playing around with the send and receive buffer sizes. But for the time being the defaults are sufficient in my case, so I skipped that.
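
If I read the MOS note correctly, the socket buffer size can be influenced with the -sockbuf parameter (value in bytes, ideally given to both the server and the client invocation). A sketch of such a test with a 2 MB buffer:

D:\install>java -jar oratcptest.jar myotherserver -port=4711 -duration=10s -interval=2s -mode=sync -write -length=9400 -sockbuf=2097152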

ODA: Performance Impact due to Flashback Database

Here and there I am planning and actually doing migrations to the Oracle Database Appliance (ODA). I like this appliance because of its relative simplicity and ease of management. In most cases we connect two ODAs in different locations using Data Guard with the goal of maximizing availability. That means we have a primary 2-node RAC on ODA-1 and another 2-node RAC on ODA-2 which operates as standby.
In order to fulfill the requirements of availability and fault tolerance the databases are configured as “Maximum Availability”:

  • sufficient Standby Redologs configured – on the dedicated Redo SSDs
  • Force Logging mode
  • Flashback Database configured and activated – located in the FRA by definition

In terms of availability and manageability this is quite good, but in terms of performance it sometimes turned out to be really bad. Look at these AWR snippets:

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:     88808 01-Sep-15 17:30:06       480       2.7
  End Snap:     88855 02-Sep-15 17:00:10       478       6.8
   Elapsed:            1,410.06 (mins)
   DB Time:            3,311.41 (mins)
Wait Classes by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                        Avg             Avg
                                        Total Wait     Wait   % DB   Active
Wait Class                  Waits       Time (sec)     (ms)   time Sessions
---------------- ---------------- ---------------- -------- ------ --------
System I/O             11,522,327          161,304       14   81.2      1.9
DB CPU                                     101,562            51.1      1.2
User I/O               19,956,840           88,663        4   44.6      1.0
Background Wait Events                 DB/Inst: XXX/xxx  Snaps: 88808-88855
-> ordered by wait time desc, waits desc (idle events last)
-> Only events with Total Wait Time (s) >= .001 are shown
-> %Timeouts: value of 0 indicates value was > .5%.  Value of null is truly 0

                                                             Avg                
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
db file async I/O submit        969,898     0     90,151      93      0.3   46.5
db file parallel write        5,015,605     0     33,520       7      1.3   17.3
flashback log file write        617,118     0     30,295      49      0.2   15.6
Backup: MML write backup p   10,608,828     0     11,727       1      2.7    6.0
RMAN backup & recovery I/O      288,934     0      3,508      12      0.1    1.8

Remember, the flashback logs are stored inside the FRA. And where does the ODA place the FRA? On the slower part of the spinning disks. The system cannot be faster than the slowest link in the chain. And in this configuration the user I/Os are influenced by the flashback I/Os, making things even worse.
So we turned off Flashback Database in order to improve performance.
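
A minimal sketch of how to check and disable it (FLASHBACK OFF can be done while the database is open):

SQL> select flashback_on from v$database;

FLASHBACK_ON
------------------
YES

SQL> alter database flashback off;

Database altered.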

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:     89528 16-Sep-15 17:30:45       393       5.0
  End Snap:     89575 17-Sep-15 17:00:44       576       5.2
   Elapsed:            1,409.98 (mins)
   DB Time:            2,792.69 (mins)
Wait Classes by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                        Avg             Avg
                                        Total Wait     Wait   % DB   Active
Wait Class                  Waits       Time (sec)     (ms)   time Sessions
---------------- ---------------- ---------------- -------- ------ --------
System I/O              7,424,187           97,029       13   57.9      1.1
DB CPU                                      96,595            57.6      1.1
User I/O                8,191,268           63,769        8   38.1      0.8

Only half the System I/O compared to the run with Flashback Database enabled.

Background Wait Events                 DB/Inst: XXX/xxx  Snaps: 89528-89575
-> ordered by wait time desc, waits desc (idle events last)
-> Only events with Total Wait Time (s) >= .001 are shown
-> %Timeouts: value of 0 indicates value was > .5%.  Value of null is truly 0

                                                             Avg                
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
db file async I/O submit        531,487     0     73,471     138      0.1   57.5
db file parallel write        2,384,365     0     15,169       6      0.7   11.9
Backup: MML write backup p   10,121,315     0     11,769       1      2.8    9.2
RMAN backup & recovery I/O      197,080     0      4,906      25      0.1    3.8

The background wait event “flashback log file write” is completely gone.

Keep that in mind when using Flashback Database. This applies to hardware other than ODA as well: be careful which disks you use for the FRA and thus the flashback logs.
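
To get a feeling for how much flashback data a database actually writes compared to redo, the hourly statistics in V$FLASHBACK_DATABASE_STAT can be queried, roughly like this:

SQL> select begin_time, end_time,
  2         round(flashback_data/1024/1024) flashback_mb,
  3         round(redo_data/1024/1024) redo_mb
  4  from v$flashback_database_stat
  5  order by begin_time;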

SQL Patches – what hints do they exactly use?

There are a couple of blog posts out there that describe how to inject hints into given SQL statements without modifying anything inside the code. One of these blog posts was written by the Optimizer Team. There is quite a lot of information about how to create SQL Patches and how to check execution plans for the usage of these SQL Patches, but I wasn’t really able to find out where the details are stored inside the database.

Let’s walk through it. First things first, the base query:

SQL> conn scott/tiger
Connected.
SQL> select count(*) from emp;

  COUNT(*)
----------
        14

SQL> select * from table(dbms_xplan.display_cursor('g59vz2u4cu404', 0, 'allstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  g59vz2u4cu404, child number 0
-------------------------------------
select count(*) from emp

Plan hash value: 2937609675

--------------------------------------------
| Id  | Operation        | Name   | E-Rows |
--------------------------------------------
|   0 | SELECT STATEMENT |        |        |
|   1 |  SORT AGGREGATE  |        |      1 |
|   2 |   INDEX FULL SCAN| PK_EMP |     14 |
--------------------------------------------

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

You see, I executed a query and wanted to see the exact execution statistics. Since they are not gathered by default, the DBMS_XPLAN output tells me how to get this information.
Assuming that we are not able to change the code to introduce the hint, I create a SQL Patch.

SQL> conn / as sysdba
Connected.
SQL> declare
  2    txt varchar2(2000);
  3  begin
  4    select sql_fulltext into txt
  5    from v$sql where sql_id = 'g59vz2u4cu404';
  6    dbms_sqldiag_internal.i_create_patch(
  7      sql_text => txt,
  8      hint_text => 'gather_plan_statistics',
  9      name => 'my_patch'
 10    );
 11  end;
 12  /

PL/SQL procedure successfully completed.

I used V$SQL to get the exact SQL text for creating the SQL patch.

Now the query is using my newly created SQL patch:

SQL> conn scott/tiger
Connected.
SQL> select count(*) from emp;

  COUNT(*)
----------
        14

SQL> select * from table(dbms_xplan.display_cursor('g59vz2u4cu404', 0, 'allstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  g59vz2u4cu404, child number 0
-------------------------------------
select count(*) from emp

Plan hash value: 2937609675

-------------------------------------------------------------------------------------
| Id  | Operation        | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |      1 |        |      1 |00:00:00.01 |       1 |
|   1 |  SORT AGGREGATE  |        |      1 |      1 |      1 |00:00:00.01 |       1 |
|   2 |   INDEX FULL SCAN| PK_EMP |      1 |     14 |     14 |00:00:00.01 |       1 |
-------------------------------------------------------------------------------------

Note
-----
   - SQL patch "my_patch" used for this statement

But this does not show what the patch exactly does. There is a view DBA_SQL_PATCHES that has some information, but nothing about the hints used by these patches.

SQL> desc dba_sql_patches
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 NAME                                      NOT NULL VARCHAR2(128)
 CATEGORY                                  NOT NULL VARCHAR2(128)
 SIGNATURE                                 NOT NULL NUMBER
 SQL_TEXT                                  NOT NULL CLOB
 CREATED                                   NOT NULL TIMESTAMP(6)
 LAST_MODIFIED                                      TIMESTAMP(6)
 DESCRIPTION                                        VARCHAR2(500)
 STATUS                                             VARCHAR2(8)
 FORCE_MATCHING                                     VARCHAR2(3)
 TASK_ID                                            NUMBER
 TASK_EXEC_NAME                                     VARCHAR2(128)
 TASK_OBJ_ID                                        NUMBER
 TASK_FND_ID                                        NUMBER
 TASK_REC_ID                                        NUMBER

When looking at the definition of that view, we see a join over the underlying dictionary tables “sqlobj$” and “sqlobj$auxdata”. The related table “sqlobj$data” has a column “COMP_DATA” which contains everything I wanted to know, in XML format. To extract this information I use queries like the following: first identify the SQL signature, then use this signature to get the SQL patch details.

SQL> conn / as sysdba
Connected.
SQL> set numformat 99999999999999999999999999
SQL> select EXACT_MATCHING_SIGNATURE from v$sql where sql_id = 'g59vz2u4cu404';

   EXACT_MATCHING_SIGNATURE
---------------------------
        2640606212120450132

SQL> select cast(extractvalue(value(x), '/hint') as varchar2(500)) as outline_hints
  2  from   xmltable('/outline_data/hint'
  3         passing (select xmltype(comp_data) xml
  4                  from   sys.sqlobj$data
  5                  where  signature = 2640606212120450132)) x;

OUTLINE_HINTS
--------------------------------------------------------------------------------
gather_plan_statistics

That’s it.

Finally, I get rid of the SQL patch by dropping it.

SQL> exec dbms_sqldiag.drop_sql_patch(name => 'my_patch');

PL/SQL procedure successfully completed.

A nice feature, but remember that the use of the DBMS_SQLDIAG package may require additional licensing. Hope that helps anyway.
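
As a side note: if I am not mistaken, beginning with 12.2 this functionality is exposed through the documented function DBMS_SQLDIAG.CREATE_SQL_PATCH, so the internal package is no longer needed. A minimal sketch:

SQL> declare
  2    patch_name varchar2(128);
  3  begin
  4    patch_name := dbms_sqldiag.create_sql_patch(
  5      sql_id    => 'g59vz2u4cu404',
  6      hint_text => 'gather_plan_statistics',
  7      name      => 'my_patch');
  8  end;
  9  /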

High parse time in Oracle 12c

One of our customers moved their main application to Oracle Database 12c a couple of days ago. With the new version they experienced a couple of issues that I helped them resolve or work around, and that I will write about in this and further blog posts.

The first and very heavy issue appeared right from the start. When several people started using the application, they all waited a long time at a specific point. What we saw at database level was a high number of “Concurrency” waits, namely “cursor: pin S wait on X”. There had to be something else, since these queries ran fine in 11.2.0.4. So we first set “optimizer_features_enable” step by step back to “11.2.0.2”, but it did not change anything. It continued to look like this:

[Image: 12c-start-work]

And this was only the beginning… It went on:

[Image: 12c-race-condition]

This was all caused by only a few statements. It turned out that during the concurrency periods, V$SQL showed a PLAN_HASH_VALUE of zero for the affected statements and child cursors. The parsing of these statements took that long, up to five minutes per parse. All the other sessions that were about to run the same statement waited for the first session to complete the parse. And this happened over and over again, either due to new child cursors being created or due to the cursors being aged out of the shared pool.
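
To see which session is actually holding the mutex while everybody else is waiting, a query like this sketch can help (BLOCKING_SESSION is populated for these mutex waits in recent releases):

SQL> select sid, sql_id, event, blocking_session
  2  from v$session
  3  where event = 'cursor: pin S wait on X';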

Currently there is an SR open with Oracle to identify the reason for these extremely long parse times. I will update this post as soon as I get news.

Meanwhile the workaround is to simply pin the statements in the shared pool. This way we do not need to parse over and over again; we keep all the information in the shared pool regardless of whether it is used or not. We still need to do the parse once for each child cursor that is needed, but then we are done. So after restarting a database instance there is a period when the parsing needs to be done; during this time we identify the statements and pin them again.
The following are examples only; they have nothing to do with the real issue.

SQL> select sql_id, executions, plan_hash_value, address, hash_value from v$sql where sql_id='94qn6y14kw01g';

SQL_ID        EXECUTIONS PLAN_HASH_VALUE ADDRESS          HASH_VALUE
------------- ---------- --------------- ---------------- ----------
94qn6y14kw01g        255      1388734953 00000000CE3DB1E8 1227751471

We use ADDRESS and HASH_VALUE to pin the cursor:

SQL> exec dbms_shared_pool.keep('00000000CE3DB1E8,1227751471', 'C');

PL/SQL procedure successfully completed.

To identify the statements that are already pinned, use a query similar to this:

SQL> select name, executions, status,
  2  (select sql_id from v$sql s where s.hash_value = c.hash_value and rownum=1) sql_id
  3  from v$db_object_cache c
  4  where kept='YES' and type='CURSOR';

NAME
--------------------------------------------------------------------------------
EXECUTIONS STATUS              SQL_ID
---------- ------------------- -------------
SELECT NVL(TO_NUMBER(EXTRACT(XMLTYPE(:B2 ), :B1 )), 0) FROM DUAL
       273 VALID               94qn6y14kw01g

SELECT NVL(TO_NUMBER(EXTRACT(XMLTYPE(:B2 ), :B1 )), 0) FROM DUAL
       273 VALID               94qn6y14kw01g
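
To avoid pinning every cursor by hand, a small PL/SQL sketch like the following could loop over the affected statements (the SQL_ID list here is a placeholder) and pin all their child cursors:

SQL> begin
  2    for c in (select address, hash_value
  3              from v$sql
  4              where sql_id in ('94qn6y14kw01g')) loop
  5      -- pin each child cursor by its address and hash value
  6      dbms_shared_pool.keep(rawtohex(c.address) || ',' || c.hash_value, 'C');
  7    end loop;
  8  end;
  9  /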

After a while, the load went down and then continued at a normal level.

[Image: 12c-normal-load]

I am now waiting for Oracle Support to provide information about the root cause and a fix for this issue.

Did you ever see “enq: CF – contention”?

I did. On a customer’s database we saw massive waits for “enq: CF – contention”. The sessions waiting for that event were all executing simple INSERT statements. So why do they wait for controlfile operations?
My Oracle Support note 1380939.1 pointed me to the root cause. The INSERTs are doing NOLOGGING operations, and hence the database updates the UNRECOVERABLE_TIME and UNRECOVERABLE_SCN information inside the controlfile over and over again. This serialization leads to massive performance degradation.
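
To see which datafiles were actually hit by such unrecoverable operations, the corresponding columns of V$DATAFILE can be checked, roughly like this:

SQL> select file#, unrecoverable_change#, unrecoverable_time
  2  from v$datafile
  3  where unrecoverable_change# > 0;
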
But who is doing NOLOGGING operations? That is bad in terms of recoverability, and in our special case really, really bad from a performance perspective. We started investigating. The database runs in NO FORCE LOGGING mode and all tablespaces are LOGGING too:

SQL> select force_logging from v$database;

FOR
---
NO

SQL> select distinct logging from dba_tablespaces where contents='PERMANENT';

LOGGING
---------
LOGGING

Let’s check the logging attribute of the table:

SQL> select logging from dba_tables where table_name='TABLE_IN_QUESTION';

LOG
---

SQL> select partitioned,logging from dba_tables where table_name='TABLE_IN_QUESTION';

PAR LOG
--- ---
YES

Obviously the table is partitioned, so I needed to check the partitions themselves:

SQL> select distinct logging from dba_tab_partitions where table_name='TABLE_IN_QUESTION';

LOGGING
-------
YES

So why are there NOLOGGING operations? The logging attribute belongs to segments, so I checked for LOB columns in the table.

SQL> select column_name, logging, partitioned from dba_lobs  where table_name='TABLE_IN_QUESTION';

COLUMN_NAME          LOGGING PAR
-------------------- ------- ---
VALUE                NONE    YES

SQL> select distinct column_name, logging from dba_lob_partitions  where table_name='TABLE_IN_QUESTION';

COLUMN_NAME          LOGGING
-------------------- -------
VALUE                NO

That’s it. The LOB partitions are defined to do NOLOGGING operations. But we were not able to change that:

SQL> alter table TABLE_IN_QUESTION modify lob (value) (nocache logging);
alter table TABLE_IN_QUESTION modify lob (value) (nocache logging)
                   *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

And of course the application owner, who by the way was complaining most about the bad performance, did not agree to switch the data loading off for a short period of time. You know, availability, etc. bla bla… So we just enabled FORCE LOGGING mode for the database, which immediately removed the “enq: CF – contention” waits.

SQL> alter database force logging;

Database altered.

That is what Cloud Control looked like…
[Image: CF contention gone]

But we need to keep in mind that at some point, when we are allowed to, we must change the LOB parameters.
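
When that maintenance window finally arrives, setting DDL_LOCK_TIMEOUT for the session may help to get the DDL through without immediately hitting ORA-00054 again; a sketch:

SQL> alter session set ddl_lock_timeout = 120;

Session altered.

SQL> alter table TABLE_IN_QUESTION modify lob (value) (nocache logging);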