Sunday, 29 September 2024

Patching a DB System in OCI


Let's learn how to patch an Oracle Database 19c DB System in OCI from 19.23 to 19.24.

1. Validate the current version:

Log in to the OCI Cloud Console.

Navigate to Oracle Database --> Database (BM/VM), where BM = Bare Metal and VM = Virtual Machine.



Now navigate to the DB System in question, click on the node, and log in to the host to validate the current version: 19.23.0.0.0

You can do the same check using the database command line (CLI).
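
For example (a minimal sketch, assuming the oracle OS user on the database node and a local SYSDBA connection), the current level can be confirmed from SQL*Plus or from the OPatch inventory:

[oracle@srv1 ~]$ sqlplus / as sysdba

SQL> select banner_full from v$version;
-- Expect the banner to report 19.23.0.0.0 before patching

[oracle@srv1 ~]$ $ORACLE_HOME/OPatch/opatch lspatches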


2. Check the latest version available.

Now navigate to the console:

Click on the DB System, then at the bottom left of the screen click on Databases.

This takes you to the Database page.

Scroll down to the section where it says Version:

Click on the View hyperlink to view the latest version.

Click View Latest Version Patch.

As you can see below, the latest is 19.24. Let's patch this database.

Latest 19.24 version

Note: Patching a database requires downtime, so perform this activity during a maintenance window.

3. Run precheck



Once the precheck completes successfully, apply the actual patch.

4. Apply Database Patch

Navigate to the database, open the actions (hamburger) menu next to the patch, and click Apply.

Note: With Apply, a precheck runs first and then the patch is applied.

You will now see the status change from Available to Applying.

Applying the Patch 19.24.0.0.0

You can monitor the status from the CLI using the commands below.

[root@srv1 ~]# dbcli list-jobs

[root@srv1 ~]# dbcli describe-job -i a276b6a6-af06-4baa-ac9d-dd45c55f9ae7

Now the database patch has completed successfully.


5. Verify the database patch

Let's verify the same using the database command line (CLI).

Successful Patch 19.24.0.0.0
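
As an additional cross-check (a sketch, assuming a SYSDBA connection on the patched node), the applied release update should now appear in both the OPatch inventory and the SQL patch registry:

[oracle@srv1 ~]$ $ORACLE_HOME/OPatch/opatch lspatches

SQL> select patch_id, patch_type, action, status, description
     from dba_registry_sqlpatch
     order by action_time;
-- The most recent row should show the 19.24 release update with STATUS = SUCCESS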

Congratulations! We have successfully patched an OCI DB System!



I hope this article helped you. Your suggestions/feedback are most welcome.

Keep learning... Have a great day!!!


Thank you,
Amit Pawar
Email: amitpawar.dba@gmail.com
WhatsApp No: +91-8454841011

Saturday, 14 September 2024

Oracle Exadata Smart Flash Cache

Exadata Smart Flash Cache, part of the cell (storage) server, caches frequently accessed data in flash, while the companion Exadata Smart Flash Log temporarily holds redo data before it is securely written to disk. Exadata Storage Servers feature extensive flash storage, with a portion reserved for database logging and the rest utilized for caching user data.

On a full rack Exadata server, there is approximately 5 TB of flash cache available, offering considerable storage capacity for caching.

The Exadata Database Machine’s exceptional performance is largely due to two key features in the Exadata Storage Server Software, which optimize the flash hardware:

  1. Exadata Smart Flash Cache – enables the staging of active database objects in flash.
  2. Exadata Smart Flash Logging – accelerates database logging tasks.

These features, combined with Exadata’s mission-critical resilience, make it an ideal platform for Oracle Database deployment. Flash cache management is automatic, but users can influence caching priorities through hints, and administrators have the ability to disable it for specific databases.
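
For example, flash caching can be switched off for a particular database from CellCLI through the I/O Resource Manager plan. This is only a sketch: the database name DBTEST is hypothetical, the syntax assumes a recent Exadata storage software release, and the directive list should be adapted to your environment.

CellCLI> alter iormplan dbplan=((name=DBTEST, flashcache=off), (name=other, flashcache=on))

CellCLI> list iormplan detail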

Intelligent Caching: 

The Smart Flash Cache recognizes different database I/O patterns, prioritizing frequently accessed data and index blocks for caching. Control file and file header operations are also cached, and database administrators can adjust caching priorities to suit workload demands.

However, monitoring the cache contents is not straightforward. Oracle provides the list flashcachecontent command via the cellcli tool, but it lacks summary options and only displays object numbers.

Example:

CellCLI> list flashcachecontent where objectNumber = 43215 detail;
        cachedKeepSize:         0
        cachedSize:             42384
        dbID:                   191919191
        dbUniqueName:           TEST
        hitCount:               18
        missCount:              1
        objectNumber:           43215
        tableSpaceNumber:       8

Data that is Never Cached in Flash Cache:

Backup related I/O is not cached
Data pump I/O is not cached
Datafile formatting data is not cached
Table scans do not monopolize the cache
I/Os to mirror copies are managed intelligently. 

FlashDisk-based Grid Disks:

To speed up I/O performance for random reads, Exadata V2 introduced solid state storage called Flash Cache.

• Flash cache is configured as a cell disk of type FlashDisk, and just as grid disks are created on HardDisk cell disks, they may also be created on FlashDisk cell disks.

• FlashDisk type cell disks are named with a prefix of FD and a diskType of FlashDisk.

 Creating FlashDisk based Grid Disks:

It is not recommended to dedicate all of your flash storage to grid disks. When creating the flash cache, use the size parameter to hold back some space that can then be used for grid disks.

 CellCLI> create flashcache all size=300g;

• We can create grid disks using the remaining free space on the flash disks, using the familiar 'create griddisk' command.

CellCLI> create griddisk all flashdisk prefix='RAMDISK';

CellCLI> list griddisk attributes name, diskType, size where diskType='FlashDisk';

 The beauty of Flash Cache configuration is that all this may be done while the system is online and servicing I/O requests.

Data Processing modes of Flash Cache:

1. Write-through mode - excellent for absorbing repeated random reads.

2. Write-back mode - best for write-intensive workloads commonly found in OLTP applications.

By default, the Exadata flash cache operates in write-through mode. DBAs can influence caching priorities by using the CELL_FLASH_CACHE storage attribute for specific database objects.
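
For example (a minimal sketch; the table name SALES_HIST is hypothetical), the segment-level priority is set with the CELL_FLASH_CACHE storage clause and can be checked from the dictionary:

SQL> alter table sales_hist storage (cell_flash_cache keep);
-- Valid values are KEEP, DEFAULT and NONE

SQL> select table_name, cell_flash_cache from user_tables where table_name = 'SALES_HIST';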

 1. Write Through mode 

The following points describe how write and read operations work on the Exadata storage server in write-through mode.

In write-through mode, the smart flash cache works as follows:

 - For write operations, CELLSRV writes data to disk and sends an acknowledgment to the database so it can continue without interruption. Then, if the data is suitable for caching, it is written to the smart flash cache. Write performance is neither improved nor diminished by this method. However, if a subsequent read operation needs the same data, it is likely to benefit from the cache. When data is inserted into a full cache, a prioritized least recently used (LRU) algorithm determines which data to replace.

 - For read operations on cached data, CELLSRV must first determine whether the request should use the cache. This decision is based on various factors, including the reason for the read, the CELL_FLASH_CACHE setting for the associated object, and the current load on the cell. If it is determined that the cache should be used, CELLSRV uses an in-memory hash table to quickly determine whether the data resides in the flash cache. If the requested data is cached, a cache lookup satisfies the I/O request.

 - For read operations on un-cached data that cannot be satisfied from the flash cache, a disk read is performed and the requested information is sent to the database. Then, if the data is suitable for caching, it is written to the flash cache.

 2. Write Back mode




In this mode, write operations work as follows:

 - CELLSRV receives the write operation and uses intelligent caching algorithms to determine if the data is suitable for caching. 

- If the data is suitable for caching, it is written to flash cache only. If the cache is full, CELLSRV determines which data to replace using the same prioritized least recently used (LRU) algorithm as in write through mode.

- After the data is written to flash, an acknowledgement is sent back to the database.

- Data is only written back to disk when it is aged out of the cache. 

Note the following regarding write-back flash cache:

- Write-back flash cache allows 20 times more write I/Os per second on X3-4 systems, which makes it ideal for write-intensive applications that would otherwise saturate the disk controller write cache.

- The large flash capacity on X5 systems means that for many applications a very high proportion of all I/O can be serviced by flash.

- An active data block can remain in write-back flash cache for months or years. Also, the flash cache is persistent through power outages, shutdown operations, cell restarts, and so on.

- With write back flash cache, data redundancy is maintained by writing primary and secondary data copies to cache on separate cell(storage) servers.   

- Secondary block copies are aged out of the cache and written to disk more quickly than primary copies. Hence, blocks that have not been read recently only keep the primary copy in cache, which optimizes the utilization of the premium flash cache.

- If there is a problem with the flash cache on one storage server, then operations transparently fail over to the mirrored copies (on flash or disk) on other storage servers. No user intervention is required. The unit for mirroring is the ASM allocation unit. This means that the amount of data affected is proportional to the lost cache size, not the disk size.

- With write-back flash cache, read operations are handled the same way as with a write-through flash cache.

LIST CELL shows the current value.

CELLCLI> list cell attributes flashcachemode

CELLCLI> list cell detail

How to enable Write-Back Flash Cache:

Two methods are available:

1. Rolling Method - Assuming that RDBMS & ASM instances are UP and enabling Write-Back Flash Cache in One Cell Server at a time

2. Non-Rolling Method - Assuming that RDBMS & ASM instances are DOWN while enabling Write-Back Flash Cache

Note: Before performing the steps below, perform the following checks as root from one of the compute nodes:

Check the "asmdeactivationoutcome" and "asmmodestatus" attributes of all grid disks to ensure that all grid disks on all cells report "Yes" and "ONLINE" respectively.

# dcli -g cell_group -l root cellcli -e list griddisk attributes asmdeactivationoutcome, asmmodestatus

Check that all of the flash caches are in the "normal" state and that no flash disks are in a degraded or critical state:

# dcli -g cell_group -l root cellcli -e list flashcache detail

Also confirm the current flash cache mode of each cell; at this point all cells are still in WriteThrough mode:

# dcli -g cell_group -l root cellcli -e list cell attributes flashcachemode

exadata01cell01: WriteThrough
exadata01cell02: WriteThrough
exadata01cell03: WriteThrough

1.     Rolling Method:

(Assuming that RDBMS & ASM instances are UP and enabling Write-Back Flash Cache in One Cell Server at a time)

Log in to the cell server:

Step 1. Drop the flash cache on that cell

# cellcli -e drop flashcache

Flash cache exadata01cell01_FLASHCACHE successfully dropped

Step 2. Check the grid disk status in ASM. The following command should return 'Yes' for asmdeactivationoutcome for all the grid disks listed:

 # cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

         DATAC1_CD_00_exadata01cell01   OFFLINE  Yes
         DATAC1_CD_01_exadata01cell01   OFFLINE  Yes
         DATAC1_CD_02_exadata01cell01   OFFLINE  Yes
         DATAC1_CD_03_exadata01cell01   OFFLINE  Yes
         DATAC1_CD_04_exadata01cell01   OFFLINE  Yes
         DATAC1_CD_05_exadata01cell01   OFFLINE  Yes
         DBFS_DG_CD_02_exadata01cell01  OFFLINE  Yes
         DBFS_DG_CD_03_exadata01cell01  OFFLINE  Yes
         DBFS_DG_CD_04_exadata01cell01  OFFLINE  Yes
         DBFS_DG_CD_05_exadata01cell01  OFFLINE  Yes
         RECOC1_CD_00_exadata01cell01   OFFLINE  Yes
         RECOC1_CD_01_exadata01cell01   OFFLINE  Yes
         RECOC1_CD_02_exadata01cell01   OFFLINE  Yes
         RECOC1_CD_03_exadata01cell01   OFFLINE  Yes
         RECOC1_CD_04_exadata01cell01   OFFLINE  Yes
         RECOC1_CD_05_exadata01cell01   OFFLINE  Yes

Step 3. Inactivate the griddisk on the cell

# cellcli -e alter griddisk all inactive

 Step 4. Shut down cellsrv service

# cellcli -e alter cell shutdown services cellsrv 

 Stopping CELLSRV services...

The SHUTDOWN of CELLSRV services was successful.

Step 5. Set the cell flashcache mode to writeback 

# cellcli -e "alter cell flashCacheMode=writeback"

 Cell exadata01cell01 successfully altered

 Step 6. Restart the cellsrv service 

# cellcli -e alter cell startup services cellsrv 

Starting CELLSRV services...

The STARTUP of CELLSRV services was successful.

Step 7. Reactivate the griddisks on the cell

# cellcli -e alter griddisk all active
GridDisk DATAC1_CD_00_exadata01cell03 successfully altered
GridDisk DATAC1_CD_01_exadata01cell03 successfully altered
GridDisk DATAC1_CD_02_exadata01cell03 successfully altered
GridDisk DATAC1_CD_03_exadata01cell03 successfully altered
GridDisk DATAC1_CD_04_exadata01cell03 successfully altered
GridDisk DATAC1_CD_05_exadata01cell03 successfully altered
GridDisk DBFS_DG_CD_02_exadata01cell03 successfully altered
GridDisk DBFS_DG_CD_03_exadata01cell03 successfully altered
GridDisk DBFS_DG_CD_04_exadata01cell03 successfully altered
GridDisk DBFS_DG_CD_05_exadata01cell03 successfully altered
GridDisk RECOC1_CD_00_exadata01cell03 successfully altered
GridDisk RECOC1_CD_01_exadata01cell03 successfully altered
GridDisk RECOC1_CD_02_exadata01cell03 successfully altered
GridDisk RECOC1_CD_03_exadata01cell03 successfully altered
GridDisk RECOC1_CD_04_exadata01cell03 successfully altered
GridDisk RECOC1_CD_05_exadata01cell03 successfully altered

Step 8. Verify all grid disks have been successfully put online using the following command:

# cellcli -e list griddisk attributes name, asmmodestatus, asmdeactivationoutcome
         DATAC1_CD_00_exadata01cell02   ONLINE         Yes
         DATAC1_CD_01_exadata01cell02   ONLINE         Yes
         DATAC1_CD_02_exadata01cell02   ONLINE         Yes
         DATAC1_CD_03_exadata01cell02   ONLINE         Yes
         DATAC1_CD_04_exadata01cell02   ONLINE         Yes
         DATAC1_CD_05_exadata01cell02   ONLINE         Yes
         DBFS_DG_CD_02_exadata01cell02  ONLINE         Yes
         DBFS_DG_CD_03_exadata01cell02  ONLINE         Yes
         DBFS_DG_CD_04_exadata01cell02  ONLINE         Yes
         DBFS_DG_CD_05_exadata01cell02  ONLINE         Yes
         RECOC1_CD_00_exadata01cell02   ONLINE         Yes
         RECOC1_CD_01_exadata01cell02   ONLINE         Yes
         RECOC1_CD_02_exadata01cell02   ONLINE         Yes
         RECOC1_CD_03_exadata01cell02   ONLINE         Yes
         RECOC1_CD_04_exadata01cell02   ONLINE         Yes
         RECOC1_CD_05_exadata01cell02   ONLINE         Yes

Step 9. Recreate the flash cache 

# cellcli -e create flashcache all
Flash cache exadata01cell01_FLASHCACHE successfully created

If the flash disk is used for flash cache, then the effective cache size increases. If the flash disk is used for grid disks, then the grid disks are re-created on the new flash disk. If those grid disks were part of an Oracle ASM disk group, then they are added back to the disk group, and the data is rebalanced on them based on the disk group redundancy and the ASM_POWER_LIMIT parameter.
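
If grid disks were re-created and added back to an ASM disk group, the resulting rebalance can be watched from an ASM instance (a quick check, not part of the cell-side procedure itself):

SQL> select group_number, operation, state, power, est_minutes from gv$asm_operation;
-- No rows returned means the rebalance has completed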

Step 10. Check the status of the cell to confirm that it's now in WriteBack mode:

# cellcli -e list cell detail | grep flashCacheMode 
flashCacheMode:         WriteBack                            
 

Step 11. Repeat the same steps on each remaining cell, one at a time, through the final cell. However, before taking the next storage server offline, execute the following and make sure 'asmdeactivationoutcome' displays YES for all grid disks:

 # cellcli -e list griddisk attributes name,asmmodestatus, asmdeactivationoutcome
         DATAC1_CD_00_exadata01cell01   ONLINE  Yes
         DATAC1_CD_01_exadata01cell01   ONLINE  Yes
         DATAC1_CD_02_exadata01cell01   ONLINE  Yes
         DATAC1_CD_03_exadata01cell01   ONLINE  Yes
         DATAC1_CD_04_exadata01cell01   ONLINE  Yes
         DATAC1_CD_05_exadata01cell01   ONLINE  Yes
         DBFS_DG_CD_02_exadata01cell01  ONLINE  Yes
         DBFS_DG_CD_03_exadata01cell01  ONLINE  Yes
         DBFS_DG_CD_04_exadata01cell01  ONLINE  Yes
         DBFS_DG_CD_05_exadata01cell01  ONLINE  Yes
         RECOC1_CD_00_exadata01cell01   ONLINE  Yes
         RECOC1_CD_01_exadata01cell01   ONLINE  Yes
         RECOC1_CD_02_exadata01cell01   ONLINE  Yes
         RECOC1_CD_03_exadata01cell01   ONLINE  Yes
         RECOC1_CD_04_exadata01cell01   ONLINE  Yes
         RECOC1_CD_05_exadata01cell01   ONLINE  Yes

After changing the flash cache mode on all cells, confirm that it is now set to WriteBack on every cell:

# dcli -g ~/cell_group -l root cellcli -e "list cell attributes flashcachemode"
exadata01cell01: WriteBack
exadata01cell02: WriteBack
exadata01cell03: WriteBack

  2.     Non-Rolling Method:

 (Assuming that RDBMS & ASM instances are DOWN while enabling Write-Back Flash Cache)

Step 1. Drop the flash cache on that cell

# cellcli -e drop flashcache 

 Step 2. Shut down cellsrv service

 # cellcli -e alter cell shutdown services cellsrv 

 Step 3. Set the cell flashcache mode to writeback 

 # cellcli -e "alter cell flashCacheMode=writeback" 

 Step 4. Restart the cellsrv service 

 # cellcli -e alter cell startup services cellsrv 

 Step 5. Recreate the flash cache 

 # cellcli -e create flashcache all

Write-Back Flash Cache Not Required for a Disk Group:

Note: We can disable flash caching for grid disks belonging to disk groups (such as RECO) that do not require this feature. This can save space in the flash cache.

The CACHINGPOLICY attribute is used to change the flash cache policy of a grid disk.

Before changing the cache policy from default to none, ensure there is no cached data in flash cache for the grid disk:

CellCLI> create griddisk all harddisk prefix=RECO, size=1006, cachingPolicy="none";

 OR

CELLCLI>ALTER GRIDDISK grid_disk_name FLUSH;

CELLCLI>ALTER GRIDDISK grid_disk_name CACHINGPOLICY="none";

 Flushing the data from Flash Cache to Disk – Manual Method:

Data that has not yet been synchronized to the grid disk can be synchronized using the FLUSH option.

CELLCLI>ALTER GRIDDISK grid_disk_name FLUSH

 Use the following command to check the progress of this activity:

 CELLCLI>LIST GRIDDISK ATTRIBUTES name, flushstatus, flusherr

 Reinstating WriteThrough FlashCache:

1.   To reinstate Writethrough caching, FlashCache must first be flushed

2.   FlashCache must then be dropped and cellsrv stopped.

Step 1. CELLCLI> alter flashcache all flush

Step 2. CELLCLI> drop flashcache

Step 3. CELLCLI> alter cell shutdown services cellsrv

Step 4. CELLCLI> alter cell flashCacheMode = WriteThrough

Step 5. CELLCLI> alter cell startup services cellsrv

Monitoring Flash Cache Usage:

CELLCLI> list metricdefinition attributes name, description where name like '.*_DIRTY'

CD_BY_FC_DIRTY    - Number of unflushed bytes cached in FLASHCACHE on a cell disk

FC_BY_DIRTY       - Number of unflushed bytes in FlashCache

FC_BY_STALE_DIRTY - Number of unflushed bytes in FlashCache which cannot be flushed because the cached disks are not accessible

GD_BY_FC_DIRTY    - Number of unflushed bytes cached in FLASHCACHE for a grid disk

SUMMARY

Use the Write-Back Flash Cache feature to leverage the Exadata flash hardware and make the Exadata Database Machine a faster system for Oracle Database deployments. By default, the flash storage inside the Oracle Exadata Database Machine is used entirely as flash cache, effectively working as an extension of the database buffer cache and delivering faster access together with a very high I/O-per-second rate, which is especially important for OLTP. Additionally, we may take part of the flash storage to build ASM disk groups on it; files placed on these disk groups reside permanently on flash storage, with no caching needed.

 

Friday, 30 August 2024

Step by Step Oracle RAC Standby Database Switchover

1. On Primary:

a. Verify there is no log file gap between the primary and the standby database

SELECT status, gap_status 
FROM v$archive_dest_status 
WHERE dest_id = 2;

STATUS    GAP_STATUS
--------- ------------------------
VALID     NO GAP

b. Verify that the primary can be switched to the standby role.

SET lines 4000
SELECT name, log_mode, controlfile_type, open_mode, database_role, switchover_status 
FROM v$database;

NAME      LOG_MODE     CONTROL OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS
--------- ------------ ------- -------------------- ---------------- --------------------
ORADB     ARCHIVELOG   CURRENT READ WRITE           PRIMARY          TO STANDBY

c. Check that the temp file count matches between the primary and the standby

COL name FOR a45
SELECT ts#, name, status FROM v$tempfile;

       TS# NAME                                          STATUS
---------- --------------------------------------------- -------
         3 /u01/app/oracle/oradata/ORADB/temp01.dbf      ONLINE
         3 /u01/app/oracle/oradata/ORADB/pdbseed/temp012 ONLINE
           020-09-03_00-49-51-498-AM.dbf

         3 /u01/app/oracle/oradata/ORADB/pdb1/temp01.dbf ONLINE


d. Clear Potential Blocking Parameters & Jobs

SHOW PARAMETER job_queue_processes

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
job_queue_processes                  integer     40

ALTER SYSTEM SET job_queue_processes=0 SCOPE=BOTH;

System altered.



COL owner FOR a30
COL job_name FOR a40
COL start_date FOR a30
COL end_date FOR a30        
SELECT owner, job_name, start_date, end_date, enabled 
FROM dba_scheduler_jobs 
WHERE enabled='TRUE' AND owner NOT IN 'SYS';

OWNER                JOB_NAME                         START_DATE                     END_DATE                       ENABL
-------------------- -------------------------------- ------------------------------ ------------------------------ -----
ORACLE_OCM           MGMT_CONFIG_JOB                  04-SEP-20 12.53.14.000000 AM +                                TRUE
                                                      04:00

ORACLE_OCM           MGMT_STATS_CONFIG_JOB            04-SEP-20 12.53.14.000000 AM +                                TRUE
                                                      04:00

EXECUTE DBMS_SCHEDULER.DISABLE('ORACLE_OCM.MGMT_CONFIG_JOB');

PL/SQL procedure successfully completed.

EXECUTE DBMS_SCHEDULER.DISABLE('ORACLE_OCM.MGMT_STATS_CONFIG_JOB');

PL/SQL procedure successfully completed.

e. Create a guaranteed restore point on the primary and the standby, and make sure they are dropped after a successful switchover

SHOW PARAMETER db_recovery_file_dest

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest                string      /u01/app/oracle/fra
db_recovery_file_dest_size           big integer 22732M

SQL> show parameter db_recovery_file_dest_size
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest_size           big integer 22732M
 
 
SQL> alter system set db_recovery_file_dest_size=5G;
System altered.
 
SQL> show parameter db_recovery_file_dest_size
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest_size           big integer 5G

SQL> create restore point switchover_restore_point  guarantee flashback database;
 
Restore point created.

select NAME,TIME,GUARANTEE_FLASHBACK_DATABASE,storage_size/1024/1024/1024 from v$restore_point;
 
NAME                           TIME                                     GUA STORAGE_SIZE/1024/1024/1024
------------------------------ ---------------------------------------- --- ---------------------------
SWITCHOVER_RESTORE_POINT       08-FEB-24 10.04.30.000000000 AM          YES                    .1953125


2. On Primary:

a. Switch the Primary to the Standby Role:

ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;

Database altered.

b. Verify the standby has received the end-of-redo (EOR) log(s) in standby alert log

Alert log sample message:
PR00 (PID:5275): Resetting standby activation ID 2803353007 (0xa717c5af)
2020-09-04T01:32:44.494716+04:00
Media Recovery End-Of-Redo indicator encountered
2020-09-04T01:32:44.494784+04:00
Media Recovery Continuing
PR00 (PID:5275): Media Recovery Waiting for T-1.S-32
2020-09-04T01:32:45.535769+04:00
.... (PID:5531): The Time Management Interface (TMI) is being enabled for role transition
.... (PID:5531): information.  This will result in messages being output to the alert log
.... (PID:5531): file with the prefix 'TMI: '.  This is being enabled to make the timing of
.... (PID:5531): the various stages of the role transition available for diagnostic purposes.
.... (PID:5531): This output will end when the role transition is complete.
SWITCHOVER: received request 'ALTER DATABASE COMMIT TO SWITCHOVER  TO PRIMARY' from primary database.
2020-09-04T01:32:45.536371+04:00
ALTER DATABASE SWITCHOVER TO PRIMARY (oradb_s2)
Maximum wait for role transition is 15 minutes.
TMI: kcv_commit_to_so_to_primary wait for MRP to finish BEGIN 2020-09-04 01:32:45.536801
Switchover: Media recovery is still active
 rmi (PID:5531): Role Change: Canceling MRP - no more redo to apply
2020-09-04T01:32:45.538289+04:00
PR00 (PID:5275): MRP0: Background Media Recovery cancelled with status 16037

c. Shut down and restart the database in standby mode

[oracle@srv1 ~]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Thu Feb 8 10:10:15 2024
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
Connected to an idle instance.
 
SQL> startup nomount
ORACLE instance started.
 
Total System Global Area 1073738888 bytes
Fixed Size                  9143432 bytes
Variable Size             276824064 bytes
Database Buffers          780140544 bytes
Redo Buffers                7630848 bytes
SQL> alter database mount standby database;
 
Database altered.


3. On Standby:

a. Check the MRP process state

Select process,status,sequence# from v$managed_standby;
PROCESS   STATUS        SEQUENCE#
--------- ------------ ----------
ARCH      CLOSING              86
DGRD      ALLOCATED             0
DGRD      ALLOCATED             0
ARCH      CLOSING              94
ARCH      CONNECTED             0
ARCH      CONNECTED             0
MRP0      WAIT_FOR_LOG         95
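
Before issuing the switchover on the standby, it is also worth confirming that the standby reports it is ready to take over (a quick sanity check; SESSIONS ACTIVE is acceptable because the switchover below uses WITH SESSION SHUTDOWN):

SQL> select database_role, switchover_status from v$database;
-- Expect DATABASE_ROLE = PHYSICAL STANDBY and SWITCHOVER_STATUS = TO PRIMARY (or SESSIONS ACTIVE)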


b. Switch the Standby to Primary

SQL> Alter database commit to switchover to primary with session shutdown;
 
Database altered.
 
SQL> Alter database open;
 
Database altered.
SQL> Select name, open_mode, database_role from v$database;
 
NAME      OPEN_MODE            DATABASE_ROLE
--------- -------------------- ----------------
ORADB     READ WRITE           PRIMARY

4. Start Redo Apply on the new physical standby database

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
 
Database altered.

5. Enable log shipping on new primary

ALTER SYSTEM SET log_archive_dest_state_2=ENABLE SCOPE=BOTH;

System altered.
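
To confirm redo is again flowing to the new standby, the gap check from step 1a can be reused on the new primary (assuming dest_id 2 points to the standby, as in step 1a):

SQL> select dest_id, status, gap_status from v$archive_dest_status where dest_id = 2;
-- STATUS should be VALID and GAP_STATUS should be NO GAP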

6. Set job_queue_processes parameter on new standby database

SHOW PARAMETER job_queue_processes    
ALTER SYSTEM SET job_queue_processes=40 SCOPE=BOTH;

7. Start the remaining instances on the new primary

In the case of a standalone database, no action is needed, as the database is already open. In a RAC environment, start all the instances that were stopped before the start of the activity.
srvctl status database -d prod -v
srvctl start instance -d prod -i prod2
srvctl status database -d prod -v

8. Start the remaining instances on the new standby

In the case of a standalone database, no action is needed, as the database is already mounted as a standby. In a RAC environment, start all the instances that were stopped before the start of the activity.
srvctl status database -d stby -v
srvctl start instance -d stby -i stby2
srvctl status database -d stby -v

9. Schedule backups on the new primary and create DB links if they existed on the old primary

Start all cron jobs and DBMS_SCHEDULER jobs:

EXECUTE DBMS_SCHEDULER.ENABLE('ORACLE_OCM.MGMT_CONFIG_JOB');

PL/SQL procedure successfully completed.

EXECUTE DBMS_SCHEDULER.ENABLE('ORACLE_OCM.MGMT_STATS_CONFIG_JOB');

PL/SQL procedure successfully completed.

SQL> SELECT OWNER, JOB_NAME, START_DATE, END_DATE, ENABLED FROM DBA_SCHEDULER_JOBS WHERE ENABLED='TRUE' AND OWNER NOT IN 'SYS';
 
OWNER      JOB_NAME                       START_DATE                               END_DATE                       ENABL
---------- ------------------------------ ---------------------------------------- ------------------------------ -----
ORACLE_OCM MGMT_CONFIG_JOB                04-SEP-20 12.53.14.000000 AM +04:00                                     TRUE
ORACLE_OCM MGMT_STATS_CONFIG_JOB          04-SEP-20 12.53.14.000000 AM +04:00                                     TRUE
 

Note: Post switchover, ensure sufficient temporary tablespace is available on the new primary. If not, add it as sketched below.
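
A minimal sketch for checking and adding temporary space (the file path and size below are placeholders for your environment):

SQL> select tablespace_name, round(sum(bytes)/1024/1024) mb from dba_temp_files group by tablespace_name;

SQL> alter tablespace temp add tempfile '/u01/app/oracle/oradata/ORADB/temp02.dbf' size 2g autoextend on;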

On New Standby:

10. Drop the restore point created earlier (remember to drop the one on the new primary as well, as noted in step 1e)

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
 
Database altered.
 
SQL> select NAME,TIME,GUARANTEE_FLASHBACK_DATABASE,storage_size/1024/1024/1024 from v$restore_point;
NAME
--------------------------------------------------------------------------------
TIME                                                                        GUA
--------------------------------------------------------------------------- ---
STORAGE_SIZE/1024/1024/1024
---------------------------
SWITCHOVER_RESTORE_POINT
08-FEB-24 10.04.30.000000000 AM                                             YES
                   .1953125
  
SQL> DROP RESTORE POINT switchover_restore_point;
 
Restore point dropped.
 
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT from session parallel 6;
Database altered.

Friday, 19 July 2024

OLR Internals


In Oracle 11g Release 2, Oracle introduced the Oracle Local Registry (OLR), a new feature that complements the Oracle Cluster Registry (OCR) within the Grid Infrastructure. The OLR is an essential component stored locally on each node in the Grid Infrastructure home directory, specifically at 'grid_home/cdata/<hostname>.olr', with its location documented in '/etc/oracle/olr.loc'.
Each node maintains its unique OLR file within the Grid Infrastructure software home. The OLR contains critical security contexts required by the Oracle High Availability Service during the early stages of Clusterware startup. The data within the OLR is vital for the Oracle High Availability Services daemon (OHASD) to initialize, encompassing information about GPnP wallets, Clusterware configurations, and version details. This information, alongside the Grid Plug and Play (GPnP) configuration file, is crucial for locating the voting disks. If these disks are stored in ASM, the GPnP profile’s discovery string will be used by the cluster synchronization daemon for their retrieval.
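
To see where the OLR lives on a node and to verify its integrity, the standard clusterware tools can be used (run as root on a cluster node; output varies by environment):

# cat /etc/oracle/olr.loc
# ocrcheck -local
# ocrdump -local -stdout | more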

In this post, I'll delve into the purpose of the OLR, its necessity, and the type of data it contains. To fully understand these aspects, we need to examine the contents of an OLR dump:


ocrdump -local -stdout
[SYSTEM]
[SYSTEM.version]
[SYSTEM.version.activeversion]
[SYSTEM.version.hostnames.rac1]
[SYSTEM.version.hostnames.rac2]
[SYSTEM.ORA_CRS_HOME]
[SYSTEM.evm]
[SYSTEM.GPnP.profiles]
[SYSTEM.CRSADMIN]
[SYSTEM.CRSUSER]
[SYSTEM.CRSD]
[SYSTEM.CRSD.SERVERPOOLS]
[SYSTEM.CRSD.SERVERS]
[SYSTEM.CRSD.SERVERS.rac1.STATE]
[SYSTEM.CRSD.TYPES]
[SYSTEM.CRSD.TYPES.ora!local_resource!type.START_DEPENDENCIES]
[SYSTEM.CRSD.TYPES.ora!local_resource!type.START_DEPENDENCIES.CONFIG]
[SYSTEM.CRSD.TYPES.ora!local_resource!type.STOP_DEPENDENCIES]
[SYSTEM.CRSD.TYPES.ora!local_resource!type.STOP_DEPENDENCIES.CONFIG]
[SYSTEM.CRSD.TYPES.ora!network!type]
[SYSTEM.CRSD.TYPES.ora!cluster_resource!type.AUTO_START]
[SYSTEM.CRSD.TYPES.ora!database!type.INSTANCE_FAILOVER]
[SYSTEM.CRSD.TYPES.ora!database!type.ORACLE_HOME]
[SYSTEM.CRSD.RESOURCES.ora!net1!network]
[SYSTEM.CRSD.RESOURCES.ora!gsd]
[SYSTEM.CRSD.RESOURCES.ora!LISTENER_SCAN1!lsnr.INTERNAL]
[SYSTEM.CRSD.RESOURCES.ora!LISTENER_SCAN2!lsnr]
[SYSTEM.CRSD.RESOURCES.ora!oc4j]
[SYSTEM.CRSD.RESOURCES.ora!rac2!vip.INTERNAL]
[SYSTEM.CRSD.RESOURCES.ora!LISTENER!lsnr.INTERNAL]
[SYSTEM.CRSD.RESOURCES.ora!orcl!db.INTERNAL]
[DATABASE.NODEAPPS.rac2]
[DATABASE.VIP_RANGE]
[DATABASE.ASM]
[DATABASE.ASM.rac1.+asm1.VERSION]
[DATABASE.ASM.rac2.+asm2.ORACLE_HOME]
[CRS]
[CRS.CUR]
[CRS.HIS]
[CRS.SEC]
[CRS.STAGE]
[CRS.STAGE.node1]

I've tried to format the output for clarity. The OLR contains extensive information, including ORA_CRS_HOME, Clusterware versions, configurations, local host versions, active versions, GPnP details, OCR latest backup times and locations, node names, and the status of node resources—whether they need to start or not—and the start & stop dependencies of these resources. These dependencies are classified as either weak (optional) or hard (mandatory).

Understanding the purpose of the OLR is crucial. Although the OCR needs to be accessible by Clusterware to determine which resources to start on a node, Oracle 11gR2 allows the OCR to be stored in ASM. This poses a challenge because ASM itself is a resource that must be started. Here, the OLR comes into play. Being a locally available file on the operating system, the OLR can be read by any process with the appropriate privileges, without any dependencies.

The High Availability Services stack comprises daemons that communicate with their counterparts on other nodes. Once the High Availability Services stack is operational, the cluster node can join the cluster and use shared components like the OCR. Part of the startup sequence for the High Availability Services stack is stored in the GPnP profile, but it also relies on information from the OLR.

A pertinent question is why OCR is still needed if we have OLR. Comparing the OLR and OCR reveals that the OLR contains fewer keys. For example, an 'ocrdump' might show 704 keys for the OCR versus 526 keys for the OLR. Most keys in the OLR pertain to the OHASD process, while those in the OCR are related to CRSD. This indicates the necessity of the OLR (along with the GPnP profile) for starting the High Availability Services stack.
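
The key-count comparison mentioned above can be reproduced with a rough count of the bracketed key names in each dump (a sketch only; exact counts vary by Grid Infrastructure version and cluster configuration):

# ocrdump -stdout | grep -c '^\['
# ocrdump -local -stdout | grep -c '^\['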

I hope this explanation helps you understand the OLR, its purpose, its content, and why it is essential.


I hope this article helped you. Your suggestions/feedback are most welcome.

Keep learning... Have a great day!!!


Thank you,
Amit Pawar
Email: amitpawar.dba@gmail.com
WhatsApp No: +91-8454841011