Oracle ASM – why user defined filenames should be avoided

Last week my night was interrupted by a customer call. He told me, that on one of his systems an ASM diskgroup ran out of space. All the instances are stuck and so on, you know those stories. Luckily, he already found two files that were consuming some terrabytes which he considered as obsolete. I’ll tell you in a minute, why he was assuming that. This is an Exadata system running several databases, they use bigfile tablespaces, just to mention that. But unfortunately he was not able to delete these two files and that’s why he wanted me to get rid of these files.

So I logged in to his system and had a look for myself. From what the customer told me, the files seemed to be not in use anymore. At least when reading the directory name.

ASMCMD> cd +DATAC1/ASM/DATAFILE_OLD
ASMCMD> pwd
+DATAC1/ASM/DATAFILE_OLD
ASMCMD> ls -lsh
Type      Redund  Striped  Time             Sys  Block_Size     Blocks          Bytes           Space  Name
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K  857110785           6.4T           19.2T  EXCON_DATEN_TS.1157.1127667053
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K  475892737           3.5T           10.6T  t_bundle_uses_tbs.1574.1110647865

Because the directory was named “DATAFILE_OLD”, these two files were probably some orphaned files. So I gave it another try to get rid of them:

ASMCMD> rm -rf EXCON_DATEN_TS.1157.1127667053
ORA-15032: not all alterations performed
ORA-15028: ASM file '+DATAC1/ASM/DATAFILE_OLD/EXCON_DATEN_TS.1157.1127667053' not dropped; currently being accessed (DBD ERROR: OCIStmtExecute)

But I experienced the same error that the customer got beforehand. That is where the story really starts. The reason for not being able to delete the files is quite obvious, the files are still in use by some instance. Fortunately ASM takes care of that and prevents us from deleting files, that are currently in use. But the question is, to which instance/database these files belong to? ASM has a “lsof” command similar to the “lsof” in Linux that shows a list of instances and files that are currently in use. So let’s start with that.

[grid@some-server ~]$ asmcmd lsof -G DATAC1 | grep -i datafile_old
[grid@some-server ~]$ asmcmd lsof -G DATAC1 | grep -i excon
APLOUAT_  APLOUAT1       +DATAC1/APLOUAT_FRA1XP/DATAFILE/local_excon_index_ts.680.1072824757
APLOUAT_  APLOUAT1       +DATAC1/APLOUAT_fra1xp/DATAFILE/excon_daten_ts
LIONCDB_  LIONCDB1       +DATAC1/LIONCDB_DB/FB877908850202CFE0538C200A0A1A63/DATAFILE/excon_daten_ts.323.1136681069
LIONCDB_  LIONCDB1       +DATAC1/LIONCDB_DB/FB877908850202CFE0538C200A0A1A63/DATAFILE/excon_index_ts.1137.1136679463
LIONCDB_  LIONCDB1       +DATAC1/LIONCDB_DB/FB877908850202CFE0538C200A0A1A63/DATAFILE/local_excon_data_ts.1139.1136679463
LIONCDB_  LIONCDB1       +DATAC1/LIONCDB_DB/FB877908850202CFE0538C200A0A1A63/DATAFILE/local_excon_index_ts.1160.1136679461

Ok, no datafiles containing the strange “DATAFILE_OLD” and some files containing at least parts of the filename in question, but not exactly what I was looking for.

After some time of thinking about this, I had an idea. It could be related to aliases. If you create a datafile from within a database inside ASM and specify a filename, you will get two things: A datafile in Oracle Managed File format and an alias with in the given directory and the given filename pointing to that datafile. ASM obviously knows, that the file is in use, but since the database is opening the file using the user defined filename, “lsof” will show the name of the alias rather than the real filename.

How can I find out, to which database the files belong? There are views like V$ASM_ALIAS and V$ASM_FILE, but they are not easy to query. Some searching lead me to this very helpful thread in the Oracle Forums. The query was exactly what I needed. I simply had to adapt the diskgroup number and then walk through all the databases to find the aliases pointing to the two files in question. Here’s the query again:

WITH qry1 AS
(
SELECT aa.name,
       ag.name group_name,
       af.type,
       aa.file_number,
       aa.file_incarnation,
       aa.parent_index parent_index
FROM v$asm_alias aa,
     v$asm_diskgroup ag,
     v$asm_file af
WHERE aa.file_number = af.file_number
AND aa.group_number = ag.group_number
AND ag.group_number = af.group_number
AND aa.file_incarnation = af.incarnation
AND aa.system_created = 'N'
AND ag.group_number = &&groupnumber.
)
SELECT alias_name,
       '+'||group_name||'/'||LISTAGG(CASE WHEN alias_name = root_name THEN name END,'/') WITHIN GROUP(ORDER BY lvl DESC) alias_path,
       '+'||group_name||'/'||LISTAGG(CASE WHEN alias_name != root_name THEN name END,'/') WITHIN GROUP(ORDER BY lvl DESC) file_path
FROM
(
SELECT aa.name,
       q1.group_name,
       q1.file_number,
       q1.name alias_name,
       LEVEL lvl,
       CONNECT_BY_ROOT aa.name root_name
FROM v$asm_alias aa,
     qry1 q1
WHERE aa.group_number = &&groupnumber.
START WITH (aa.name = q1.name OR (aa.name != q1.name AND aa.file_number = q1.file_number))
CONNECT BY PRIOR aa.parent_index = aa.reference_index
AND q1.name = PRIOR q1.name
)
GROUP BY group_name,file_number,alias_name;

I found the aliases in database APLOUAT. Let’s check the ASM directory:

ASMCMD> cd DATAC1/APLOUAT_fra1xp/DATAFILE/
ASMCMD> ls -lsh
Type      Redund  Striped  Time             Sys  Block_Size     Blocks          Bytes           Space  Name
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K       9185          71.8M            216M  CDC_METADATA_TS.667.1072824731
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  N            8K  857110785           6.4T           19.2T  EXCON_DATEN_TS => +DATAC1/ASM/DATAFILE_OLD/EXCON_DATEN_TS.1157.1127667053
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K   25586305         195.2G          585.6G  IIMEX_BRIDGEHEAD_TS.295.1072823641
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K   80536705         614.4G            1.8T  IMEX_TS.302.1072792101
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K  603979777           4.5T           13.5T  LICENSING_SMALL_TS.306.1072792099
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K   94317569         719.6G            2.1T  LION_STAGE_DATA_TS.298.1072792101
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K     262145             2G              6G  LOCAL_EXCON_INDEX_TS.680.1072824757
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K   10838145          82.7G          248.1G  PROCESS_INDEX_TS.315.1072823415
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K   15859713           121G            363G  UNDOTS1.676.1132578301
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  Y            8K   19660801           150G            450G  UNDOTS2.2364.1132578301
DATAFILE  HIGH    COARSE   MAY 29 20:00:00  N            8K  475892737           3.5T           10.6T  t_bundle_uses_tbs => +DATAC1/ASM/DATAFILE_OLD/t_bundle_uses_tbs.1574.1110647865

At that point, I was able to tell the customer, that these two files belong to tablespaces in a running database which is the reason, that these files cannot be easily deleted. The customer story ends here, but I have something more to tell. There might be an easier way to find the alias. Remember the output of “asmcmd lsof”:

[grid@some-server ~]$ asmcmd lsof -G DATAC1 | grep -i excon
APLOUAT_  APLOUAT1       +DATAC1/APLOUAT_FRA1XP/DATAFILE/local_excon_index_ts.680.1072824757
APLOUAT_  APLOUAT1       +DATAC1/APLOUAT_fra1xp/DATAFILE/excon_daten_ts
LIONCDB_  LIONCDB1       +DATAC1/LIONCDB_DB/FB877908850202CFE0538C200A0A1A63/DATAFILE/excon_daten_ts.323.1136681069
LIONCDB_  LIONCDB1       +DATAC1/LIONCDB_DB/FB877908850202CFE0538C200A0A1A63/DATAFILE/excon_index_ts.1137.1136679463
LIONCDB_  LIONCDB1       +DATAC1/LIONCDB_DB/FB877908850202CFE0538C200A0A1A63/DATAFILE/local_excon_data_ts.1139.1136679463
LIONCDB_  LIONCDB1       +DATAC1/LIONCDB_DB/FB877908850202CFE0538C200A0A1A63/DATAFILE/local_excon_index_ts.1160.1136679461

See the highlighted line, this is one of the two aliases pointing to the files in question. Since the names are somehow similar, doing a case-insensitive search for open files containing parts of the name can reveal the aliases, that are pointing to these files. But this only works, when the user defined filename contains the tablespace name. If the filename is completely different, then you have to fall back to the query above.

At the end, this is a good example, why one should not use user defined filenames when using ASM. Better stick to Oracle Manged Filenames (OMF), this will make things much easier.