Wednesday, 25 September 2013

How to quickly check that Database is consistent after incomplete recovery (Point in Time Recovery) before OPEN RESETLOGS

At first, It's helpful to enable the session to display the timestamp for DATE type columns :

SQL> alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS' ;

Check 1:

Objective: Verify that the datafiles are recovered to the intended point in time (PIT) and they are consistent (FUZZY=NO)
Query the current status and PIT (P-oint I-n T-ime upto which the datafiles have been recovered) of datafiles by reading datafile headers directly from the physical datafiles:

SQL> select fuzzy, status, error, recover, checkpoint_change#, checkpoint_time, count(*) from v$datafile_header group by fuzzy, status, error, recover, checkpoint_change#, checkpoint_time ;

FUZ STATUS  ERROR           REC CHECKPOINT_CHANGE# CHECKPOINT_TIME        COUNT(*)
--- ------- --------------- --- ------------------ -------------------- ----------
NO  ONLINE                                 5311260 31-AUG-2011 23:10:14          6
YES ONLINE                                 5311260 31-AUG-2011 23:10:14          1


+ Verify that the checkpoint_time / checkpoint_change# is in line with your intended UNTIL TIME/SCN. If not, recover the database further if you have more archived logs available.

+ If FUZZY=YES for some datafiles, it means more recovery is required. If no more archived logs are available, identify such datafiles and determine if we can take them offline because we will loose the data in those datafiles. If the datafiles belong to SYSTEM or UNDO tablespace, we can / MUST not bring such datafile offline without proper analysis. Please consult Oracle Support for further actions.
SQL> select file#, substr(name, 1, 50), substr(tablespace_name, 1, 15), undo_opt_current_change# from v$datafile_header where fuzzy='YES' ;

     FILE# SUBSTR(NAME,1,50)                                  SUBSTR(TABLESPA UNDO_OPT_CURRENT_CHANGE#
---------- -------------------------------------------------- --------------- ------------------------
         3 /u01/app/oracle/oradata/prod111/undotbs01.dbf      UNDOTBS1                         5117431

Occasionally, if the tablespace name doesn't indicate it to be UNDO tablespace, if we see non-zero value in column UNDO_OPT_CURRENT_CHANGE#, it indicates that the datafile contains undo segments.


To bring a datafile offline :

SQL> alter database datafile <file#> offline ;

Check 1 can be considered Passed when :

+ Verified that all the datafiles have been recovered upto the intended Point in time.

+ Fuzzy=NO for SYSTEM, UNDO and all intended datafiles. For datafiles with Fuzzy=YES, either recover them further or bring them OFFLINE if no further archived logs available.

Check 2:

Objective: Verify that the files with status=RECOVER are not OFFLINE unintentionally
SQL> select status, enabled, count(*) from v$datafile group by status, enabled ;

STATUS  ENABLED      COUNT(*)
------- ---------- ----------
SYSTEM  DISABLED            1
ONLINE  READ WRITE          4
RECOVER DISABLED            2


If the files are in RECOVER status, verify if they are OFFLINE :

SQL> select file#, substr(name, 1, 50), status, error, recover from v$datafile_header ;


If you want the data for these files to be accessible, then bring them ONLINE :

SQL> alter database datafile <file#> ONLINE ;


If a file remains offline at the time of OPEN RESETLOGS, the datafile may not be brought back online again in the same OPENED database.

Check 2 can be considered Passed when:

a) All the intended datafiles are not OFFLINE

Check 3:

Objective: Additional Fuzzy check (Absolute Fuzzy check)


Occasionally, it is possible to see Fuzzy=NO and same checkpoint_change# for all the intended datafiles ; still some of the datafiles might be fuzzy and OPEN RESETLOGS will return error, e.g.
SQL> select fuzzy, status, error, recover, checkpoint_change#, checkpoint_time, count(*) from v$datafile_header group by fuzzy, status, error, recover, checkpoint_change#, checkpoint_time ;

FUZ STATUS  ERROR           REC CHECKPOINT_CHANGE#      CHECKPOINT_TIME   COUNT(*)
--- ------- --------------- --- ------------------ -------------------- ----------
NO  ONLINE                                 5311260 31-AUG-2011 23:10:14          7


SQL> ALTER DATABASE OPEN RESETLOGS ;

ORA-01194: file 4 needs more recovery to be consistent
ORA-01110: data file 3: '/u01/app/oracle/oradata/prod111/undotbs02.dbf'


Hence, we should perform additional fuzzy check known as Absolute Fuzzy Check:

SQL> select hxfil file#, substr(hxfnm, 1, 50) name, fhscn checkpoint_change#, fhafs Absolute_Fuzzy_SCN, max(fhafs) over () Min_PIT_SCN from x$kcvfh where fhafs!=0 ;

FILE#      NAME                                               CHECKPOINT_CHANG ABSOLUTE_FUZZY_S     MIN_PIT_SCN
---------- -------------------------------------------------- ---------------- ---------------- ----------------
         4 /u01/app/oracle/oradata/prod111/undotbs01.dbf               5311260          5311524          5311524
         6 /u01/app/oracle/oradata/prod111/system01.dbf                5311260          5311379          5311524

Note: Column Min_PIT_SCN will return same value even for multiple rows as we have applied ANALYTICAL "MAX() OVER ()" function on it.

Above query indicates that the recovery must be performed at least UNTIL SCN 5311524 to make datafiles consistent and ready to OPEN. Since the checkpoint_change# is smaller than Min_PIT_SCN, the datafiles will ask for more recovery.


Check 3 can be considered Passed when,

a) No rows selected from above query (i.e. Min_PIT_SCN is 0 (Zero) for all the datafiles)

b) Min_PIT_SCN is returned less than Checkpoint_Change#


Check 4 (After successful OPEN RESETLOGS) :

Monitor the alert.log for the time of OPEN RESETLOGS activities. You might see some messages like below during dictionary check:
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary, <(============================== (1)
but not in the controlfile. Adding to controlfile.
Tablespace 'USERS' #4 found in data dictionary,
but not in the controlfile. Adding to controlfile.
File #4 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00004' in the controlfile. <(==================== (2)
File #5 is online, but is part of an offline tablespace. <(==================== (3)
data file 5: '/u01/app/oracle/oradata/prod111/example01.dbf'
File #7 found in data dictionary but not in controlfile. <(==================== (2)
Creating OFFLINE file 'MISSING00007' in the controlfile.
File #8 is offline, but is part of an online tablespace. <(==================== (4)
data file 8: '/u01/app/oracle/oradata/prod111/mydata02.dbf'
File #9 is online, but is part of an offline tablespace. <(==================== (3)
data file 9: '/u01/app/oracle/oradata/prod111/example02.dbf'
Dictionary check complete

We discuss below the points highlighted :

(1) Check if the temp files exist. If not, add them as per your preference:
SQL> select file#, name from v$tempfile ;

no rows selected

SQL> select file#, name from dba_temp_files ;

no rows selected


SQL> select tablespace_name, status, contents from dba_tablespaces where contents='TEMPORARY' ;

TABLESPACE_NAME                STATUS    CONTENTS
------------------------------ --------- ---------
TEMP                           ONLINE    TEMPORARY


SQL> alter tablespace temp add tempfile '/u01/app/oracle/oradata/temp01.dbf' size 10m ;

Tablespace altered.

SQL> select file#, substr(name, 1, 50), status, enabled from v$tempfile

FILE#    SUBSTR(NAME,1,50)                                  STATUS  ENABLED
-------- -------------------------------------------------- ------- ----------
       1 /u01/app/oracle/oradata/temp01.dbf                 ONLINE  READ WRITE

(2) It appears that the tablespace was brought offline using "ALTER TABLESPACE USERS OFFLINE" command. So, verify if the missing files really exist with original name. You may need to consult your pear DBAs, or refer alert.log / RMAN backup log or any such information which may provide clue about the actual file name.

If you find the file, try to rename them. If not, we can offline the datafile or drop associated tablespace:
SQL> select file#, status, enabled, substr(name, 1, 50) from v$datafile where name like '%MISSING%' ;

FILE#    STATUS  ENABLED    SUBSTR(NAME,1,50)
-------- ------- ---------- --------------------------------------------------
       4 OFFLINE DISABLED   /u01/app/oracle/product/11.1.0/db_1/dbs/MISSING000
       7 OFFLINE DISABLED   /u01/app/oracle/product/11.1.0/db_1/dbs/MISSING000


SQL> alter database datafile 4 online ;
alter database datafile 4 online
*
ERROR at line 1:
ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01111: name for data file 4 is unknown - rename to correct file
ORA-01110: data file 4: '/u01/app/oracle/product/11.1.0/db_1/dbs/MISSING00004'


SQL> alter database rename file 'MISSING00004' to '/u01/app/oracle/oradata/prod111/users01.dbf' ;

Database altered.

SQL> alter database rename file 'MISSING00007' to '/u01/app/oracle/oradata/prod111/users02.dbf' ;

Database altered.

SQL> select tablespace_name, status from dba_tablespaces where tablespace_name in (select tablespace_name from dba_data_files where file_id in (4, 7)) ;

TABLESPACE_NAME                STATUS
------------------------------ ---------
USERS                          OFFLINE

SQL> ALTER TABLESPACE USERS ONLINE ;

Tablespace altered.


Before proceedig further, let's query the status for these files in alert.log:
SQL> select a.file#, substr(a.name, 1, 50) file_name, a.status file_status, a.error, substr(a.tablespace_name, 1, 10) tablespace_name, b.status tablespace_status from v$datafile_header a, dba_tablespaces b
where a.tablespace_name=b.tablespace_name /* and a.file# in (4, 5, 7, 8, 9) */ ;
FILE# FILE_NAME                                     FILE_STATUS ERROR           TABLESPA TABLESPACE_STATUS
----- --------------------------------------------- ----------- --------------- -------- ------------------
    1 /u01/app/oracle/oradata/prod111/system01.dbf  ONLINE                      SYSTEM   ONLINE
    2 /u01/app/oracle/oradata/prod111/sysaux01.dbf  ONLINE                      SYSAUX   ONLINE
    3 /u01/app/oracle/oradata/prod111/undotbs01.dbf ONLINE                      UNDOTBS1 ONLINE
    4 /u01/app/oracle/oradata/prod111/users01.dbf   OFFLINE     OFFLINE NORMAL  USERS    OFFLINE <(== related to (2) in alert.log excerpt above
    5 /u01/app/oracle/oradata/prod111/example01.dbf ONLINE                      EXAMPLE  OFFLINE <(== related to (3) in alert.log excerpt above
    6 /u01/app/oracle/oradata/prod111/mydata01.dbf  ONLINE                      MYDATA   ONLINE
    7 /u01/app/oracle/oradata/prod111/users02.dbf   OFFLINE     OFFLINE NORMAL  USERS    OFFLINE <(== related to (2) in alert.log excerpt above
    8 /u01/app/oracle/oradata/prod111/mydata02.dbf  OFFLINE     WRONG RESETLOGS MYDATA   ONLINE <(=== related to (4) in alert.log excerpt above
    9 /u01/app/oracle/oradata/prod111/example02.dbf ONLINE                      EXAMPLE  OFFLINE <(== related to (3) in alert.log excerpt above
9 rows selected.

So, we can attempt to correct the "ERROR" as displayed in above query depending on the availability of file / archived logs and other possible factors.

Let's continue,
(3) It seems that tablespace was brought offline inconsistently ( ALTER TABLESPACE EXAMPLE OFFLINE IMMEDIATE ). If the archived log generated at that time has got applied, the file may be back online :

SQL> alter tablespace example ONLINE ;

Tablespace altered.

(4) This tablespace MYDATA has 2 datafiles File# 6 & 8. It appears that File# 8 was brought offline ( using ALTER DATABASE DATAFILE 8 OFFLINE ) and it was OFFLINE before OPEN RESETLOGS. If the archived log generated at that time has got applied during recovery or all the archived logs are available for recovery since that time, the file may be back online :
SQL> alter database datafile 8 online ;
alter database datafile 8 online
*
ERROR at line 1:
ORA-01190: control file or data file 8 is from before the last RESETLOGS
ORA-01110: data file 8: '/u01/app/oracle/oradata/prod111/mydata02.dbf'


SQL> alter tablespace mydata online ;
alter tablespace mydata online
*
ERROR at line 1:
ORA-01190: control file or data file 8 is from before the last RESETLOGS
ORA-01110: data file 8: '/u01/app/oracle/oradata/prod111/mydata02.dbf'


SQL> recover datafile 8 ;
Media recovery complete.
SQL> alter database datafile 8 online ;

Database altered.

SQL> alter tablespace mydata online ;

Tablespace altered.

Please note that it is not always possible to recover and bring the file online which is failing with error " ORA-01190: control file or data file x is from before the last RESETLOGS".

No comments:

Post a Comment