External Table

Monday, March 25, 2013

Testing Lost Writes with Oracle and Data Guard

Topic: This post is about lost writes in Oracle, on techniques for reproducing and investigating the effects of lost writes and on the use of Active Data Guard to protect against lost writes

Motivations. Imagine this scenario: a production system has two standbys to protect against disaster and to load balance read-only load (with Active Data Guard (ADG)). A lost write happens in the primary and remains unnoticed for a few days. Finally the block that suffered a lost write is updated again. Both standbys stop applying redo throwing ORA-600 [3020] (also known as stuck recovery). The primary DB keeps working fine, although it is logically corrupted by the lost write.
You are the DBA in charge of fixing this, what would you do?

The importance of testing: I hope that the example above illustrates that lost write can generate quite complex recovery scenarios and overall a few headaches to support DBAs. In this post I illustrate a few techniques and examples that can be used to test the effects of lost writes in Oracle and therefore prepare in case of a real-world issue strikes. Of particular interest will be to test the effect of lost writes in an environment with (Active) Data Guard.
We need to cover some ground first on techniques and how to setup the test. But first a definition.

Lost writes: "A data block lost write occurs when an I/O subsystem acknowledges the completion of the block write, while in fact the write did not occur in the persistent storage" (from support note 1302539.1). Lost writes can be caused by faulty storage, but also by Oracle bugs, in general anything in between our data in RAM and storage can corrupt our data, including controllers and network.

Digression on techniques.
1. A useful technique that we will need in the following is the ability to read and write a single block from Oracle data files (in a test environment). For databases on filesystems (and also DBs on NFS) dd is the tool for this job (I am using the Linux platform as reference). Examples:

read one 8KB block from filesystem (block 134 in this example):
dd if=testlostwrite.dbf bs=8192 count=1 skip=134 of=blk134.dmp

write one 8KB block to filesystem (block 134 in this example)::
dd of=testlostwrite.dbf bs=8192 count=1 seek=134 if=blk134.dmp conv=notrunc

Note when writing to an Oracle datafile we must use conv=notrunc or else we will end up with an unusable (truncated) output file. Note also the syntax for specifying the block offset, skip is used for input files and seek for output files (see dd manual).

How to read and write single blocks on ASM data files. One possibility is to take a backup copy of the datafile with RMAN, edit it with dd (as detailed above), then (with rman again) restore the backup copy. With a little knowledge of ASM internals, more direct ways to access files in ASM are available: one can find the position of the block (and its mirror copies if relevant) and then use dd to read/write data directly. Currently my preferred way is sightly different and it exploits the dbms_diskgroup package. This is an undocumented package (see again the link above on ASM internals for some additional details) although it is extensively used by Oracle's asmcmd utility. I have packaged the dbms_diskgroup.read and dbms_diskgroup.write calls into a small utility written in perl (that I called asmblk_edit, follow this link to download asmblk_edit). Similar ideas can be found also in support note 603962.1.
The following example illustrates using the utility asmblk_edit to read and write block number 134 from and to a data files stored in ASM:

read one 8KB block from an Oracle data file in ASM (block 134 in this example):
./asmblk_edit -r -s 134 -a +TESTDB_DATADG1/TESTDB/datafile/testlostwrite.353.810578639 -f blk134.dmp

write one 8KB block to and Oracle data file in ASM (block 134 in this example):
./asmblk_edit -w -s 134 -a +TESTDB_DATADG1/TESTDB/datafile/testlostwrite.353.810578639 -f blk134.dmp

2. Another technique that we need is quite straightforward and allows us to find the offset of the Oracle block that we want to read/write to for our tests.
The example here below shows how to find block number where data is stored, using rowid:
SQL> select rowid, dbms_rowid.ROWID_BLOCK_NUMBER(rowid), a.* from testlosttable a;

Incidentally finding the block number for an index leaf block can be done with the following (using the undocumented function sys_op_lbid):
SQL> select rowid rowid_table_from_index_leaf, sys_op_lbid(18297477,'L',t.rowid) index_leaf_rowid from testlosttable t --note: 18297477 in this example is the data_object_id of the index I am examining, edit with the actual number as relevant

3. Last but not least, we need a way to modify data blocks 'under the nose of Oracle', in particular we want to make sure we flush/invalidate the relevant cached data and metadata. The method we will use is:

Offline the tablespace where data resides (this flushes dirty blocks and invalidates cache entries)
Perform the read/write modifications to the block, with dd or asmblk_edit, as needed
Online the tablespace again before the next usage

A basic recipe to reproduce the effects of a lost write.
We can now put together the ideas and techniques described above into a working example aimed at reproducing the effects of a lost write in Oracle:

SQL> create bigfile tablespace testlostwrite datafile '{edit with datafile directory}/testlostwrite.dbf' size 10m;
SQL> create table testlosttable (id number, payload varchar2(100)) tablespace testlostwrite ;
SQL> create index i_testlosttable on testlosttable (id) tablespace testlostwrite ;
SQL> insert into testlosttable values (10,'aaaaaaaaaaaaaaaaaaaaaaaaa');
SQL> insert into testlosttable values (20,'bbbbbbbbbbbbbbbbbbbbbbbbb');
SQL> commit;

SQL> select rowid, dbms_rowid.ROWID_BLOCK_NUMBER(rowid), a.* from testlosttable a;
--note: this will allow to find the block_id where data resides, let's say it's block 134

SQL> alter tablespace testlostwrite offline;

-- read block either with dd or with asmblk_edit and create backup copy. Example:
-- ./asmblk_edit -r -s 134 -a +TESTDB_DATADG1/TESTDB/datafile/testlostwrite.353.810578639 -f blk134.dmp
SQL> alter tablespace testlostwrite online;
SQL> insert into testlosttable values (30,'cccccccccccccccccccccccccc');
SQL> commit;
SQL> alter tablespace testlostwrite offline;

-- write block either with dd or with asmblk_edit and from previously created backup copy. Example:
-- ./asmblk_edit -w -s 134 -a +TESTDB_DATADG1/TESTDB/datafile/testlostwrite.353.810578639 -f blk134.dmp
SQL> alter tablespace testlostwrite online;
SQL> -- our database has now a lost write in the table testlosttable block 134

The effect of the lost write on the table is that the row with id=30 has disappeared from the table. However the entry with id=30 is still visible in the index i_testlosttable. This observation can be confirmed with the 2 queries reported here below. Note that in case of normal operations (i.e. no lost writes) the 2 queries would both return three rows, this is not the case here because of our manual editing of the table block with the asmblk utility.

SQL> select /*+ INDEX_FFS(a)*/ id from testlosttable a where id is not null;
SQL> select /*+ FULL(a)*/ id from testlosttable a;
SQL> -- these 2 queries return different results in our test with a lost write.. and only one of them is correct!

We can now proceed with this additional test: insert a new row into the table, for example:

SQL> insert into testlosttable values (40,'ddddddddddddddddddddddd');
SQL> commit;

What we should notice notice at this point is that Oracle keeps working fine and no errors are shown to the user. With the techniques discussed above we can easily show that this new row has been inserted into block 134 (the block with a lost write). Let's postpone further investigations for a later paragraph and for now just note that Oracle has not detected that we have suffered a lost write.

An example of the effects of lost writes with Standby (Data Guard)
A standby database, among others, provides a copy of the database that Oracle can use to detect lost writes (i.e. the standby provides a reference copy of 'good data'). A simple test to see how this works in practice:

We start by going through the same steps described above and we create a lost write in the primary.

Note that at this point the standby has no knowledge that something has gone wrong in the primary.

Now we can go ahead and run DML against the block that has suffered the lost write

i.e. we insert the row with id=40 as detailed above.

At the moment when the standby database will try to apply the redo log entry (change vector) to the block with a lost write, it will compare SCN numbers and find that something is wrong. MRP will stop and throw ORA-600 [3020] (stuck recovery).

An example of the error stack for a stuck recovery:

ORA-00600: internal error code, arguments: [3020], [10], [134], [134], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 10, block# 134, file offset is 1097728 bytes)
ORA-10564: tablespace TESTLOSTWRITE
ORA-01110: data file 10: '+STDBY_DATADG1/TESTDB/datafile/testlostwrite.347.810578685'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 18299147

Additional information is dumped in the trace files of MRP and of the recovery slaves.
It's worth noting that ORA-600 [3020] can be caused by a lost write (as in this example) but also by other issues, in particular with the redo consistency (outside the scope of this discussion).

Oracle 11g and lost write protection
In the example above Oracle only throws an error alerting that something is wrong (in this case ORA-600 [3020], stuck recovery) when a DML operation is performed on the primary DB against a block that had suffered a lost write. This means that lost writes may also remain silent and unnoticed in the primary DB for a very long period of time. A new feature of 11g can be used to make Oracle more active in detecting lost write issues, in particular to check the blocks that are subject to physical reads into the buffer cache. This is how to activate these checks:

set the initialization parameter DB_LOST_WRITE_PROTECT = TYPICAL (or FULL if we prefer) on the primary database. This will cause the generation of additional redo entries when Oracle performs physical reads.
set DB_LOST_WRITE_PROTECT = TYPICAL also on the standby, this will make MRP and its recovery slaves to check for lost writes using the extra information in the redo log stream.
Outcome: if we hit a block suffering from lost write, MRP and its slaves will stop and throw ORA-752: recovery detected a lost write of a data block

An example of the error stack for lost writes with db_lost_write_protect set to typical is pasted here below (note that additional information is dumped in the trace files of MRP and of the recovery slaves):

Hex dump of (file 10, block 134) in trace file {path..}/standby_pr09_26471.trc
Reading datafile '+STDBY_DATADG1/TESTDB/datafile/testlostwrite.347.810578685' for corruption at rdba: 0x00000086 (file 10, block 134)
Read datafile mirror 'STDBY_DATADG1_0000' (file 10, block 134) found same corrupt data (logically corrupt)
Read datafile mirror 'STDBY_DATADG1_0011' (file 10, block 134) found same corrupt data (logically corrupt)

STANDBY REDO APPLICATION HAS DETECTED THAT THE PRIMARY DATABASE

LOST A DISK WRITE OF BLOCK 134, FILE 10
NO REDO AT OR AFTER SCN 6367748422450 CAN BE USED FOR RECOVERY.
ORA-00752: recovery detected a lost write of a data block
ORA-10567: Redo is inconsistent with data block (file# 10, block# 134, file offset is 1097728 bytes)
ORA-10564: tablespace TESTLOSTWRITE
ORA-01110: data file 10: '+STDBY_DATADG1/TESTDB/datafile/testlostwrite.347.810578685'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 18299538

Comparing this error message with the ORA-600 [3020] reported in the previous paragraph,the first thing we notice is that we have more information and more details. Moreover Oracle has checked that this is indeed a case of lost write and has even tried to resolve the 'logical corruption' trying to read the second mirror copy from the normal-redundancy ASM diskgroup used in this example (an attempt that did not help in this case as both mirror copies of the block have suffered a lost write, as generated in our example using the asmblk utility).

DB_LOST_WRITE_PROTECT and its impact on primary and standby DBs
From the example here above we see that the extra checks that oracle recovery process does when setting DB_LOST_WRITE_PROTECT with Oracle 11g are definitely an improvement from 10g behavior, although this mechanism does not provide a complete protection against lost writes but only gives us a higher probability that a lost write can be found.

How about the impact of setting this parameter? Extra redo entries are generated on the primary: they are called block read redo (BRR). We can directly investigate BRR entries for example by dumping redo (BRR corresponds to redo at layer 23 and opcode 2), see this example on how to do this type of dump.
SQL> alter system dump logfile '...{path to storage}.../thread_2_seq_1816.510.810661967' layer 23 opcode 2;

Another method is by querying v$mystat or v$sysstat. We will look for stats containing the text 'lost write' in their name:
SQL> select name, sum(value) from v$mystat se, v$statname n where n.statistic#=se.statistic# and (n.name like '%lost write%' or name like '%physical read%') group by name;

The size of BRR entries in the redo stream varies as Oracle's optimizations can come into play, such as batching several BRR entries in one redo record. Based on a limited number of observations of a production system I estimate than on average we can expect 30 bytes of extra redo generated by BRR for each physical block read performed by the database, although mileage may vary and definitely the impact of the parameter should be tested before applying it to a busy production system! Another observation based on testing is that direct read operations do not generate BRR (for example reads for parallel query).

Example of BRR entry from a logfile dump:

CHANGE #5 TYP:0 CLS:4 AFN:10 DBA:0x00000086 OBJ:18299943 SCN:0x05ca.9b6e95f9 SEQ:1 OP:23.2 ENC:0 RBL:1
Block Read - afn: 10 rdba: 0x00000086 BFT:(1024,134) non-BFT:(0,134)
scn: 0x05ca.9b6e95f9 seq: 0x01
flags: 0x00000004 ( ckval )

After setting db_lost_write_protect to typical (or full) and rebooting the instance Oracle will set _log_committime_block_cleanout to true (based on observations on my test system 11.2.0.3). The effect is that (on the primary instances) Oracle will write into the redo logs additional entries related to commit time block cleanout operations (layer 4 opcode 6). Logging block cleanout redo record (and subsequently applying them to the standby database) is a desirable feature especially in the case of an environment deploying Active Data Guard.

On a standby instance Oracle setting db_lost_write_protect to TYPICAL or FULL will make MRP slaves do the extra work of checking for lost writes by comparing SCNs in BRRs with SCN in the block headers. Notably the extra work implies additional physical reads on the standby. This additional work on the standby can be monitored by querying v$sysstat (statistics name = 'recovery blocks read for lost write detection' and also the related stat 'recovery blocks skipped lost write checks').

Note: checks for lost write based on SCN from BRR entries are performed also when doing media recovery (i.e. a recover database command). Therefore even if Data Guard is not available one can use a simple restore of the database from backup to perform validation of BRR records to search for lost writes.

Analysis and troubleshooting.
From the steps details above we have a test environment with a lost write that we have generate. This will allow us to have some fun investigating the issue at low level and developing techniques for analysis and troubleshooting. All this with the great advantage, compared to a real-life case, that now we know for sure what the root cause of this incident is!
A good starting point for troubleshooting lost writes in a production system is support note 1265884.1 "Resolving ORA-752 or ORA-600 [3020] During Standby Recovery". A few more pointers here below.

- One technique to investigate the affected blocks both on primary and standby is by dumping the block contents to a trace file. The idea is to compare the contents of the primary and standby. Of particular interest will be the SCN of the last change to the block and also SCNs in the ITL list. Example:
SQL> alter system dump datafile 10 block 134;
SQL> --the following performs a dump of the cached version of the block
SQL> alter session set events 'immediate trace name set_tsn_p1 level <ts#+1>'; -- where ts# is the tablespace number
SQL> alter session set events 'immediate trace name buffer level <decimal rdba>'; --rba is 134 in our example

- We can also investigate the contents of the block suffering a lost write using SQL. This has the advantage of allowing the use of flashback query. The current content of the block on the primary database may not be what we want: we are interested in a consistent image of the block at the SCN equal to the SCN reported in our ORA-752 message for example (this would also be current SCN of the standby as recovery is stuck there at that point in time). Example:
SQL> set num 16
SQL> select ora_rowscn, rowid, dbms_rowid.rowid_row_number(a.rowid) row_number, a.* from testlosttable as of scn 6367748219413 a where rowid like 'ABFzkJAAAAAAACG%'; -- edit values for scn and rowid, use SQL below to find the values to use

- Digression: how to find the rowid of the block with lost write (block 134 of file 10 and object_id=18299145). Actually what we need is just the first 15 characters of the rowid (the last three characters are the row_number inside the block). Example:
SQL> select substr(DBMS_ROWID.ROWID_CREATE(rowid_type =>1, object_number =>18299145, relative_fno =>0, block_number =>134, row_number =>0),1,15) rowid_prefix from dual;

- How to find the current SCN (at the standby):
SQL> select current_scn from v$database;

- Another powerful technique is to dump redo log information from all relevant logfiles searching for entries related to the block with lost writes (in our example it's block 134 of file 10). The dump will include transaction details and most notably redo marking the time when DBWR has written the give block (this info is sotre in block written redo, BWR). If the parameter db_lost_write_protect is set to typical or full the redo dump will also show details of the block read redo (see BRR discussed above). For further info on logfile dumps see also Julian Dyke's website. Example:
SQL> alter system dump logfile '...{path to storage}.../thread_2_seq_1816.510.810661967' DBA MIN 10 134 DBA MAX 10 134; -- edit file number and block number as needed

- Additional investigation on data affected by the lost write can be performed by log mining. The goal is to find the SQL of all the transactions for the affected block. Identify the relevant redo logs to mine first. Example:
SQL> BEGIN

SYS.DBMS_LOGMNR.ADD_LOGFILE(LogFileName=>'...{path to storage}.../thread_2_seq_1816.510.810661967',Options=>SYS.DBMS_LOGMNR.NEW);

SYS.DBMS_LOGMNR.START_LOGMNR(Options=> SYS.DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);

END;

SQL> SELECT scn,sql_redo FROM SYS.V_$LOGMNR_CONTENTS WHERE data_obj#=18299145 and row_id like 'ABFzkJAAAAAAACG%'; -- calculate rowid with dbms_rowid package as detailed above

- Other analysis that can be done in case we have indexes on the tables:

In case one or more indexes are present we can read data from the index and compare the results with what we have on the table. Example to read from the index:
SQL> select rowid rowid_table_from_index_leaf, id, sys_op_lbid(18299146,'L',t.rowid) index_leaf_rowid from testlosttable t where rowid like 'ABFzkJAAAAAAACG%';
--note: update 18299146 with the the data_object_id of the index of interest
-- in this example this is the data_object_id of I_TESTLOSTTABLE

- We can also check if there are inconsistencies (corruption) between the table and its indexes with 'analyze table validate', although in my experience this can be quite time consuming and not necessarily add more information to the analysis. Example:
SQL> analyze table testlosttable validate structure cascade online;

Actions that we can take to restore the services
Suppose that our analysis has confirmed that a lost write happened and also that we have the details of what 'is lost'. We need now to fix the corrupted block on the primary and restore the service on the standby. In particular if we have an Active Data Guard with a SLA, the latter may be quite an urgent action. Hopefully we also have some ideas of what the root cause was and a way to fix it in the future.

One possible action is to failover to standby. This action plan however may provide to be unacceptable in many circumstances, given the potential for data loss it implies. A failover would likely not be acceptable if the primary database has continued working and accepting users transactions since the time (SCN) of the incident that has generated ORA-600 (or ORA-752).

Another possibility is to use our knowledge of the lost transactions gathered in the analysis phase to run SQL actions to 'fix the primary'. This has to be evaluated case by case. In some circumstances we can also get away with just dropping and recreating the object with a lost write. In our simple example of a lost insert on table testlosttable, the action to perform on the primary is:

SQL> alter index i_testlosttable rebuild online;
SQL> insert into testlosttable values (30,'cccccccccccccccccccccccccc');
SQL> commit;

What about fixing the standby? We can unstuck the recovery on the standby by allowing it to corrupt the block with a lost write (and fix it later, as detailed below). Example:

SQL> alter system set db_lost_write_protect=none; --temporarily disable lost write check if needed
SQL> alter database recover automatic standby database allow 1 corruption;
SQL> --wait till the redo that cause the error message has been applied
SQL> alter database recover cancel;
SQL> -- restart normal Data Guard operations. An example for ADG:
SQL> alter system set db_lost_write_protect=typical;
SQL> alter database open read only;
SQL> alter database recover managed standby database nodelay using current logfile disconnect;

At this point all is OK except for one corrupted block on the standby. How to restore the corrupted block on standby? This depends on the case, we may just rebuild the object on primary this will fix the problem on standby. Another option is to copy over the datafile from primary to standby
In our example the corrupted block is block 134 of file 10 and we have an active data guard in real time apply. We can use automatic block media recovery (ABMR) to fix it. In my tests AMBR is attempted but does not really work against the corrupted block 134, I can work around this by zeroing out the block. This is an example (intended to be used on test databases):

./asmblk_edit -w -s 134 -a +STDBY_DATADG1/TESTDB/datafile/testlostwrite.347.810578685 -f zeroblock

where zeroblock file is created with: dd if=/dev/zero bs=8192 count=1 of=zeroblock

if my standby was on filesystem I could have used:
dd if=/dev/zero of=testlostwrite.dbf bs=8192 count=1 seek=134 if=blk134.dmp conv=notrunc

If we now query the table (the table testlostwrite in our example) when Oracle reaches the zeroed block it will fetch a copy from production (see Oracle documentation for the details about configuring and using the ABMR feature). This will happen in a transparent way for the user issuing the query, the operation is logged in the alert log of the instance (the 2 lines here below appeared repeated twice in my test using 11.2.0.3):

Automatic block media recovery requested for (file# 10, block# 134)

Automatic block media recovery successful for (file# 10, block# 134)

Oracle's OTN demo video on lost write
Short videos with demos on testing lost writes and automatic block media recovery in a Data Guard environment can be found on OTN.
See also support document "Best Practices for Corruption Detection, Prevention, and Automatic Repair - in a Data Guard Configuration" [ID 1302539.1]

Conclusions and future work
This article illustrates a simple example on how to reproduce and investigate the effects lost write in a test environment, which is intended as a training exercise for DBAs.
I would like to end with four lessons learned on the topic of how lost writes can affect a high-availability Oracle deployment:

Finding the root causes of a lost write can prove to be very hard. Two possible sources of the root causes of these type of issues are the storage and Oracle bugs. How to find which one (if any of the two) is to blame for a particular incident? If the issue cannot be re-produced we can easily find ourselves in between a finger-pointing discussion between vendors.
Having a standby (Data Guard) is of great help for discovering lost writes. Without a standby there are high chances that we will not realize we had a lost write in the first place. However, once a lost write is found the redo apply stops: the recovery is stuck and throws error ORA-600 [3020] or ORA-752 if we try to restart it. A support DBA most likely will have to analyze the situation and decide what to do to fix the issue (for example perform a failover or rather going through a process similar to what described in this article). Moreover if one or more Active Data Guards are used for critical read-only activity, there is time pressure to restart the redo apply.
How can we check that our primary and standby databases are 'in sync', that is that there are no lost writes waiting to 'explode' as time bombs? I am not aware of a utility that could do such check. This is a very interesting topic, probably material for another post. A brief discussion of this problem and possible solutions can be found at this link and also here.
Database restores can be affected by lost writes too. For example the restore of a database backup can fail (get stuck on applying redo) because of a lost write that has happened in production after the latest data file backup. This has potential impacts on the disaster and recovery strategy.

Note: the examples reported in this article have been tested against Oracle 11.2.0.3 64 bit for Linux (RHEL5).

Thursday, February 14, 2013

AWR Analytics and Oracle Performance Visualization with PerfSheet4

Topic: This post describes PerfSheet4, a tool for performance analysis aimed at streamlining access and visualization of Oracle's AWR data. The tool is aimed at DBAs and Oracle performance analysts. PerfSheet4 is a spin off of previous original work by Tanel Poder, rewritten and integrated with additional functionality for AWR analysis and with important changes to the user interface.

Note added: An update to this post with the details of a more recent version of PerfSheet4 can be found at this link. See also the work on PerfSheet.js.

Context: There is much information in the counters and metrics of Oracle's AWR that can be of substantial help for troubleshooting and for capacity planning. Besides the standard AWR report, time-based analysis is often very useful. However this type of access is made slow and tedious by a list of tasks, such as, fetching data, calculating incremental changes and producing interesting graphs.This normally requires developing or downloading a bunch of scripts and adding some visualization techniques on top. See for example Karl Arao's scripts and blog.
Tanel Poder has already demonstrated a few years ago that the pivot chart functionality of Excel can be an excellent platform for performance data visualization including AWR (and Perfstat) data. See his work on Perfsheet 2.0 and PerfSheet3.

What's new: The main drive to develop PerfSheet4 was to (try to) make the user interface as simple as possible, without compromising the general nature of the tool. A set of pre-defined queries for some common AWR analysis come with the tool. The plotting functionality will generate custom graphs for such data, an easy entry point to analysis.
If direct access to the DB is not possible, data can be extracted from the DB with the scripts supplied in the sql*plus_scripts directory and then loaded as csv. Similarly, export to csv can be used to archive data for later analysis.

PerfSheet4 is a free tool and is available for download from GitHub or download from this link. The tool has been developed and tested on Excel 2010 against Oracle 11gR2. See here below a snapshot of the user interface with some additional details.

Why Excel: Pivot tables and charts are a great feature available in Excel that allows to quickly and easily analyze multidimensional data. This can be of great help when troubleshooting problems in production, for example with the need to try out different ideas and data combinations

Why pre-defined graphs: This feature is again to make life easier for the user. The idea is to automates the tedious task of setting up the pivot chart fields and provides an entry point to get to the needed graph.

Available queries: version 3.2p beta comes with the following queries (and pre-defined graphs):

Query name name	Short description
Wait events	Extracts data from dba_hist_system_event, computes delta values between snapshots and rates (i.e. delta values divided over delta time)
System statistics	Extracts data from dba_hist_sysstat, computes delta values between snapshots and rates (i.e. delta values divided over delta time)
Stats per service	Extracts data from dba_hist_service_stat, computes delta values between snapshots and rates (i.e. delta values divided over delta time)
Workload data	Extracts data from dba_hist_sysmetric_summary, metrics give details on the usage of system resources over time ranges
IO wait latency	Extracts data from dba_hist_event_histogram for io related events, computes delta values between snapshots and rates (i.e. delta values divided over delta time)
Top 5 wait per instance	Extracts data from dba_hist_system_event and dba_hist_sysstat, computes delta values and rates (delta value over delta time) and selects top 5 non idle events for each instance
Wait events per class	Extracts data from dba_hist_system_event, computes delta values and rates (delta value over delta time) aggregating over wait class

Future work: Stabilize Perfsheet4 against bugs and for different execution environments. For this feedback from other users would be very much appreciated. Possibly add more pre-defined queries (for AWR analytics but not necessarily only that). Add a post with examples of how analysis performed with PerfSheet4 have been useful for my job in various occasions. In the mean time if you find the tool useful or find bugs or have ideas for possible improvements, I'll be interested to know! Acknowledgments: Many thanks to Tanel Poder, the co-author of the tool.

Getting started video: http://youtu.be/kC0EJCLZWIw

Wednesday, November 28, 2012

Active Data Guard and UKOUG 2012

I am looking forward to participating again to the UKOUG annual conference (and also to attend the Sunday's OakTable event). This is for me a great opportunity to meet and discuss with many passionate Oracle experts who regularly attend the conference and also to get up-to-date with the latest technology and news from the Oracle community.
My colleague Marcin and I have just finished preparing our presentation on the subject of Active Data Guard. I have enjoyed the work even though, as it is often the case, this has taken quite some effort from both of us to reach the level of details we wanted.
Our goal with this presentation is to share our thoughts and experience on Active Data Guard as we have seen it mature in our production environment in the last 10 months. We will discuss, with examples, what we think are the great points of the technology for our use cases and what we have learned from production deployment. What has worked.. and what has not worked!
If you read this and are going to participate to the UKOUG conference, you are welcome to come to our presentation on December 4th at 15:10 in Hall 5!

Slides can be downloaded from this link.

Wednesday, September 26, 2012

Command-Line DBA Scripts

Topic: A review of the command-line monitoring scripts that I currently use for Oracle.

Note: last updated in June 2015. Download links: http://canali.web.cern.ch/canali/resources.htm and https://github.com/lucacanali

Command-line is still very useful for Oracle database administration, troubleshooting and performance monitoring. One of the first thins that I do when I want to work with a given Oracle database is to connect as a DBA user with Sql*plus and get information on the active sessions with a script. I will run the script a few times to get an idea of what's happening 'right now' and get a possible starting point for more systematic tuning/troubleshooting if needed.
I liken this approach to examining a few pictures from a remote webcam and trying to find out what's happening out there!.. The method can be powerful, although with some limitations. One of its advantages is that it is easy to implement in Oracle! One of the main source of information for this type of monitoring is GV$SESSION, which is available in Oracle since ages. More recently GV$ACTIVE_SESSION_HISTORY has proven very useful and I quite like the additional improvements introduced there in11gR2.

Examples: One of my favorites is a script I have called top.sql (not very much an original name I know :) which queries gv$session (RAC-aware) and prints out what I currently find the most interesting details (for my environment) on active sessions that I can fit in roughly 200 characters. Another script that does a similar job taking data from v$active_session_history is ash.sql. the scripts ash2.sql, ash3.sql and ash4.sql are the RAC versions for 2, 3 and 4 node RACs respectively.

One of the typical pain points when writing CLI monitoring scripts is deciding which information to display and how to pack it in a relatively short line. I have also found that with each major Oracle versions I want to refresh my scripts with a few changes.

Scripts details: I have packaged some of the scripts that I have written and that I find useful in a zip file which is available at this link (unzip locally and run using sqlplus). Note that the scripts are tested on 11gR2, many of them will also work on 10g and 12c.

Monitoring Class (Active Session Monitoring)

Script Name	Description
top.sql	List with selected details of the active sessions, from gv$session (RAC aware)
top_10g.sql	Active sessions details, from gv$session, 10g version (RAC aware)
ash.sql	List with selected details of the active sessions, from v$active_session_history (one instance only)
ash_h.sql	Usage @ash_h <n#_sec>. Active sessions details for the last n#_sec seconds, from v$active_session_history (current instance only)
ash2.sql, ash3.sql ash4.sql	Active sessions details, from gv$active_session_history (for RAC 2, 3 and 4 nodes).

SQL Monitoring

Script Name	Description
monitor.sql	Display information form gv$sql_monitor for running statements statements. See also monitor_details.sql
monitor_plan.sql	Usage: @monitor_plan <key>. A drill-down script from monitor.sql. Prints details of the execution plan from gv$sql_plan_monitor
monitor_report_sid.sql, monitor_report_sid_active.sql	Usage: @monitor_report_sid_active <sid>. Generates an active/html sql monior report for the given sid. See also @monitor_report_sid <sid> for the simpler html version.
monitor_report_sqli_d.sql, monitor_report_sql_id_active.sql	Usage: @monitor_report_sql_id_active <sql_id>. Generates an active/html sql monior report for the given sql_id. See also @monitor_report_sql_id <sql)id> for the simpler html version.

Drill-down Class

Script Name	Description
ash_sess.sql	Usage: @ash_sess <inst_id> <sid> <n#_sec>. Prints details from gv$active_session_history for a given session (RAC aware)
explain.sql	Usage: @explain <sql_id>. Prints detailed explain plan info for given sql_id from library cache (current instance only)
explain_awr.sql	Usage: @explain_awr sql_id. Prints detailed explain plan info for given sql_id from AWR (rac aware)
obj.sql	Usage: @obj <object_id>. Find details for a given object id, or data_object_id
sessinfo.sql	Usage example: @sessinfo sid=<num>. Prints info on sessions that match the search pattern (RAC aware)
sql.sql	Usage: @sql <sql_id>. Prints sql text for given sql_id (RAC aware)

Workload Characterization Class

Script Name	Description
sysmetric.sql, sysmetric_details.sql	Display values of selected system metrics (RAC aware)
iometric.sql, iometric_details.sql	IO utilization metrics and details (RAC aware)
ehm.sql, ehm_local.sql ehm_micro, ehm_micro_local	Usage: @ehm 10 "db file sequential read". Event history metric data, helps finding skew in wait times, for example to study random reads latency skew (RAC aware). ehm_micro -> 12c version, latency drill down at microsecond.
eventmetric.sql	Wait event metric details for top wait events (RAC aware)
filemetric.sql	File metric details for top-usage files (RAC aware)
ash_top.sql	Usage @ash_top <n#_sec>. Aggregates ash data over n#_sec seconds (current instance only)
jobs_running.sql	List running jobs details (RAC aware)
load.sql	Displays OS-load values (RAC aware)
longops.sql, longops_10g.sql, longops_details.sql	Info on long running operations from gv$sessoin_longops (RAC aware)
transactions.sql	List of open transactions and crash recovery (RAC aware)

ASH and AWR Reports

Script Name	Description
ash_report.sql	Usage example: @ash_report sysdate-1/24 sysdate Generates an ASH report for the current instance and displays it in a browser.
awr_report.sql	Usage: @awr_report <snap_begin> <snap_end>. Generates an AWR report for the current instance and displays it in a browser.
awr_global_report.sql	Usage: @awr_global_report <snap_begin> <snap_end>. Generates a global AWR report for the current database and displays it in a browser.
awr_snapshots.sql	Lists AWR snapshots for the last 24 hours.

Miscellaneous

Script Name	Description
kill.sql	Usage @kill <sid> <serial#> kill the given session (current instance only)
kill_select.sql	Usage example: @kill_select "sql_id='xxxx'". Helps in preparing scripts for killing large number of sessions (current instance only)
locks.sql, locks_10g.sql, locks_details.sql	Scripts to help investigating locks and blocking enqueues issues (RACaware)
binds.sql	Usage: @binds <sql_id>. Helps in finding bind values for a given sql_id (RAC aware)
login.sql	Sql*plus startup/config script for my environment (windows client version 11.2.0.3)

Conclusions
Command-line scripts are still of great help for monitoring and performance troubleshooting of Oracle databases. Of particular interest are queries to print the details of active sessions, the SQL they execute and their wait events details. This makes for a quick and easy entry point to troubleshooting and to get an idea of what's happening on the DB server. More systematic troubleshooting may be needed beyond that. Command-line scripts are easy to modify and customize, many DBAs indeed have their own tool-set. A selection of my own DBA scripts can be downloaded from this link.

A few links to end this post: First of all the excellent session snapper and the rest of the collection of scripts by Tanel Poder. In addition a mention to a few cool tools that are a sort of bridge between command-line and graphical-interface approaches: MOATS by Tanel Poder and Adrian Billington , SQL Dashboard by Jagjeet Singh and topaas by Marcin Przepiorowski.

Thursday, August 16, 2012

How to Turn Off Adaptive Cursor Sharing, Cardinality Feedback and Serial Direct Read

Scope: New features and improvements on the SQL (cost-based) execution engine are great. However sometimes the need comes to turn some of those new features off. Maybe it's because of a bug that has appeared or simply because we just want consistency, especially after an upgrade. This is often the case when upgrading an application that has been tuned already with heavy usage of hints for critical SQL.

When this is relevant: using a full set of hints (an outline) often provides consistent SQL execution for a given Oracle version (and set of session/instance parameter values). However this is not guaranteed to be stable across upgrades. The reason is that additional functionality of the cost based optimization engine can play a role. This is indeed the case when moving from 10g to 11g and in particular to 11gR2. It's a good idea to investigate the new features and also to keep a note aside on how to disable them in case of need. The discussion of pros and cons of using hints is a very interesting topic but outside the scope of this entry.

Three 11g features of cost-based optimization and SQL execution to consider when upgrading from 10g:

Cardinality feedback. This feature, enabled by default in 11.2, is intended to improve execution plans for repeated executions. See Support note 1344937.1 and documentation for additional info. You can be disable cardinality feedback with an underscore parameter which can be set at session or instance level. Usual caveats on setting underscore parameters apply:
alter session set "_OPTIMIZER_USE_FEEDBACK"=FALSE;
This can also be done with a hint, therefore at statement level, using the opt_param syntax: /*+ opt_param('_OPTIMIZER_USE_FEEDBACK','FALSE') */

Adaptive cursor sharing. This was introduced in 11gR1 to address SQL performance consistency issues related to bind variable peeking. See support note 740052.1 and documentation for details. For the purposes of this doc, if you want to get the 10g behavior with an Oracle version higher than 10g and overall if you want to disable this feature, you may want to add the hint:
/*+ NO_BIND_AWARE */.
According to support note 11657468.8 adaptive cursor sharing can be disabled by setting the following 2 parameters (say at session level): _optimizer_adaptive_cursor_sharing = false, _optimizer_extended_cursor_sharing_rel = "none"

Serial direct read. This feature, introduced in 11gR2, is about SQL execution and not strictly speaking about cost-based optimization. Serial direct reads allow Oracle to perform full scan very efficiently, using asynchronous I/O if available and also bypassing the the buffer cache, much like what parallel query does, but in serial mode. The decision of when to use or not to use serial direct reads is not taken by the optimizer. It still makes sense to discuss this here as this feature can change the behavior of full scans: by forcing the use of direct read instead of 'normal cache-based access'. There are many interesting details to this feature that are outside the scope of this discussion. To revert to 10g behavior and overall to disable the feature you need to set the following parameter (there is no hint for this unfortunately, as this feature is not directly linked to CBO):
alter session set "_SERIAL_DIRECT_READ"=never;
Note, at the opposite side of the spectrum one can force serial direct reads with: alter session set "_serial_direct_read"=always;

Additional info and comments:

1. When testing Oracle version upgrades and facing regression of SQL, it can be useful to set cost-based optimization to behave as it did in the older version. For example when upgrading from 10.2.0.5 to 11gR2 we can use the following hint on SQL we are testing: SQL hint: /*+ OPTIMIZER_FEATURES_ENABLE('10.2.0.5') */ or set the parameter OPTIMIZER_FEATURES_ENABLE at session level (or in some cases even at system level).
However as discussed above, setting optimizer_features_enable parameter sometimes is not enough to ensure that a given SQL will execute in exactly the same way in the new system as it did in the older version. In particular related to the 3 examples above, adaptive cursor sharing and cardinality feedback are tuned off by reverting CBO compatibility to a 10g version, although the behavior of serial direct read is not affected by this hint (or any other hint for that matter).

2. How to get a full list of available hints? query V$SQL_HINT in Oracle 11.2 or higher

3a. Example of a logon trigger that can be used to set session-level parameters:

create or replace trigger change_session_parameters
AFTER LOGON ON mydbuser.SCHEMA --customize username, replace "mydbuser" with relevant user name
BEGIN
execute immediate 'alter session set "_serial_direct_read"=never';
END;
/

3b. Alternative trigger definition

create or replace TRIGGER trace_trig
AFTER LOGON
ON DATABASE
DECLARE
sqlstr1 VARCHAR2(100) := 'alter session set "_serial_direct_read"=never';
BEGIN
IF (USER IN ('USERNAME1', 'USERNAME2')) THEN -- customize user names as needed
execute immediate sqlstr1;
END IF;
END trace_trig;
/

4. Example of how to get the outline (full set of hints) for a given query:

select * from table(dbms_xplan.display_cursor('sql_id',null,'OUTLINE')); -- edit sql_id with actual value

Wednesday, July 18, 2012

Recursive Subquery Factoring, Oracle SQL and Physics

Introduction: this post is about the use of recursive subquery factoring (recursive common table expressions) in Oracle to find numerical solutions to the equations of classical mechanics, for demonstration purposes. This technique will be introduced using 4 examples or growing complexity. In particular by the end of the post we will be able to compute the length of the (sidereal) year using the laws of physics and SQL.

Recursive subquery factoring (aka recursive common table expressions) is a feature first introduced in Oracle version 11gR2 that has been used by many both for business and for play. See for example this post on Jonathan Lewis blog, this post on solving sudokus and this about the Mandelbrot set. The computational strength of the feature comes from the fact that it makes recursion easily available from within SQL and so opens the door to natural ways of implementing several algorithms for numerical computation.
Additional note: the amount of physics and maths used have been kept to a minimum and simplified where possible, hopefully without introducing too many mistakes in the process.

Example 1: projectile in gravity field

The motion is on 1 dimension, say the x axis. Our first task is to find a representation of the state of the system with a structure that can be computed in the database. A natural choice is to use the tuple (t, x, v, a). Where t is time, x is the position along the 1-dimensional axis of motion, v is the velocity and a is the acceleration.
Newton's second law (F=ma) gives us the rule of the game when we want to compute the state of the system subject to and external force. Using calculus this can be written as:

Next we need a method to find numerical (approximate) solutions. We can use the Euler integration method, which basically means doing the following:

Another important point is that we need to know the initial conditions for the state tuple, in particular initial position and velocity. In the example we will take x(t=0)=0 m and v(t=0)=100 m/sec. The force is the gravitational pull at the surface of the Earth: F=--m*g, where g=9.81 m/sec^2. Note the mass cancels out in our equations. This example models the motion of a projectile that we shoot vertically into the air and we observe as it rises up to about 500 meters and then falls back down, all this happens in about 20.5 seconds. The SQL used and a graph of the results are here below:

define dT=0.1
define g=9.81
define maxT=20

-- SQL to compute the motion of a projectile subject to gravity
WITH data(t,x,v,a) AS (
SELECT cast(0 as binary_double) t, cast(0 as binary_double) x, cast (100 as binary_double) v, cast(-&g as binary_double) a FROM dual
UNION ALL
SELECT t+&dT, x+v*&dT, v+a*&dT, -&g FROM data WHERE t < &maxT
)
SELECT t,x,v,a FROM data;

Example 2: Harmonic oscillator

In this example we investigate the motion of a mass attached to a spring. We expect the mass to oscillate around a central point (x=0).
For greater accuracy in calculations we use a different integration method: the Verlet integration method (see also this link). The equation for the acceleration is: a = F/m =-K*x and initial conditions are: x(t=0)=1 m and v(t=0)=0 m/sec. Moreover we take K= 0.1 sec^-2. See below the SQL used for the calculation and a graph of the results.

define dT=0.1
define K=0.1
define maxT=20

-- SQL to compute the motion of a harmonic oscillator
WITH data(t,x,v,a) AS (
SELECT cast(0 as binary_double) t, cast(1 as binary_double) x, cast (0 as binary_double) v, cast(-&K*1 as binary_double) a FROM dual
UNION ALL
SELECT t+&dT, x+v*&dT+0.5*a*&dT*&dT, v+0.5*(a-&K*(x+v*&dT))*&dT, -&K*(x+v*&dT+0.5*a*&dT*&dT) FROM data WHERE t < &maxT
)
SELECT t,x,v,a FROM data;

Example 3: Motion of the Earth around the Sun

The motion of the system Earth-Sun is a problem of 2 bodies moving in space. With a 'standard trick' this can be reduced to a problem of 1 body, and 2 spatial variables, which represent the (vector) distance of the Earth from the Sun in the place of the orbit. Our description of the system will use the following tuple: (t,x,vx,ax,y,vy,ay), that is time, position, velocity and acceleration for a 2-dimensional problem in the (x,y) plane. The equation for the force is Newton's law of universal gravitation. Another important point again is to use the correct initial conditions. These can be taken from astronomical observations, x(t=0)=152098233 Km (also know as the aphelion point) and v=29.291 Km/sec (credits to this link and this other link). We will use again the Verlet integration method as in example 2 above. Note, for ease of computation, time and space will be re-scaled so that t=1 means 1000 sec and x=1 means 1 Km (same is true for the y axis). The SQL used is pasted here below as well as a graph of the computed results, that is our approximate calculation of the Earth's orbit.

-- length unit = 1 km
-- time unit = 1000 sec
define dT=.1
define aph=152098233
define GM=132712838618000000
-- note this is GM Sun + Earth (reduced mass correction)
define v_aph=29291
define maxT=40000

-- SQL to compute the trajectory of the Earth around the Sun
WITH data(step, t,x,y,vx,vy,ax,ay) AS (
SELECT 0 step,cast(0 as binary_double) t,cast(&aph as binary_double) x,cast(0 as binary_double) y,
cast(0 as binary_double) vx, cast(&v_aph as binary_double) vy,
cast(-&GM/power(&aph,2) as binary_double) ax, cast(0 as binary_double) ay FROM dual
UNION ALL
SELECT step+1, t+&dT, x+vx*&dT+0.5*ax*&dT*&dT, y+vy*&dT+0.5*ay*&dT*&dT,
vx+0.5*(ax-&GM*(x+vx*&dT)/power(x*x+y*y,1.5))*&dT,
vy+0.5*(ay-&GM*(y+vy*&dT)/power(x*x+y*y,1.5))*&dT,
-&GM*(x+vx*&dT+0.5*ax*&dT*&dT)/power(power(x+vx*&dT,2)+power(y+vy*&dT,2),1.5),
-&GM*(y+vy*&dT+0.5*ay*&dT*&dT)/power(power(x+vx*&dT,2)+power(y+vy*&dT,2),1.5)
FROM data WHERE t < &maxT
)
SELECT t,x,y,vx,vy,ax,ay FROM data WHERE mod(step,100)=0; -- output only one point every 100

Example 4: Compute the length of the sidereal year using the equations of motion of the Earth around the Sun and SQL

As a final example and an 'application' of the techniques above we can use SQL to compute the number of days year (or rather in a sidereal year, see the link for additional details). We find a result that is in agreement with measurements with 6 significant digits. This is an interesting result considering that it is obtained with just a few lines of SQL!

-- SQL to compute the number of days in one sidereal year
-- A sidereal year is the time taken by the Earth to orbit
-- the Sun once with respect to the fixed stars.
-- This builds on the equation and SQL discussed in example N.3

define dT=.1
define aph=152098233
define GM=132712838618000000
define v_aph=29291
define maxT=40000

select round(t*1000/(3600*24),3) days_in_a_sidereal_year from (

WITH data(step, t,x,y,vx,vy,ax,ay) AS (

SELECT 0 step,cast(0 as binary_double) t,cast(&aph as binary_double) x,cast(0 as binary_double) y,

cast(0 as binary_double) vx, cast(&v_aph as binary_double) vy,

cast(-&GM/power(&aph,2) as binary_double) ax, cast(0 as binary_double) ay FROM dual

UNION ALL

SELECT step+1, t+&dT, x+vx*&dT+0.5*ax*&dT*&dT, y+vy*&dT+0.5*ay*&dT*&dT,

vx+0.5*(ax-&GM*(x+vx*&dT)/power(x*x+y*y,1.5))*&dT,

vy+0.5*(ay-&GM*(y+vy*&dT)/power(x*x+y*y,1.5))*&dT,

-&GM*(x+vx*&dT+0.5*ax*&dT*&dT)/power(power(x+vx*&dT,2)+power(y+vy*&dT,2),1.5),

-&GM*(y+vy*&dT+0.5*ay*&dT*&dT)/power(power(x+vx*&dT,2)+power(y+vy*&dT,2),1.5)

FROM data WHERE t < &maxT

)

SELECT step,t,y,lead(y) over(order by step) next_y FROM data

) where y<0 and next_y>0;

DAYS_IN_A_SIDEREAL_YEAR

-----------------------

365.256

Conclusions:

We have discussed in 4 examples the usage of Oracle SQL and in particular of recursive subquery factoring (recursive common table expressions), applied to solve selected cases of differential equations of classical mechanics. Despite the simplifications and approximations involved, the technique has allowed us to compute the length of the sidereal year with good precision. This method can be extended to make calculation for more complex systems, such as systems of many particles.

Wednesday, July 11, 2012

Listener.ora and Oraagent in RAC 11gR2

Topic: An investigation of a few details of the implementation of listeners in 11gR2, including the configuration of listener.ora in RAC and the role of the cluster process 'oraagent'.

11gR2 comes with several important changes and improvements to the clusterware in general and in particular the way listeners are managed. While the listener process is still the 'good old' process tnslsnr (Linux and Unix), it is now started from the grid home (as opposed to database oracle home). Moreover listeners are divided in two classes: node listeners and scan listeners, although they use the same binary for both functions. There are many more details and instead of covering them here I'd rather reference this excellent review: Markus Michalewicz's presentation at TCOUG.

Oraagent takes good care of our listeners

Node listeners and scan listeners are configured at the clusterware level, for example with srvctl and the configuration is propagated to the listeners accordingly

this integration is more consistently enforced in 11gR2 than in previous versions

The oraagent process spawned by crsd takes care of our listeners in terms of configuration and monitoring

note that there is normally a second oraagent on the system which is spawned by ohasd and does not seem to come into play here

Notably in the listener.ora file we can see which configuration lines are being taken care of automatically by oraagent, as they are marked with the following comment: # line added by Agent

experiment on a test DB: delete one or more of the listener.ora configuration lines and restart the listener (for example with srvctl stop listener; srvctl start listener). The original configuration will reappear in listener.ora and the manually modified listener.ora will be renamed (with a timestamp suffix)

The agent creates and maintains the file: endpoints_listener.ora

this file is there for backward compatibility, see docs and/or support site for more info.
experiment on test: delete the file and restart the listerner, oraagent will recreate the file.

Oraagent log can be found at: $GRID_HOME/log/{nodename}/agent/crsd/oraagent_oracle/oraagent_oracle.log
Oraagent monitors the functioning of each listener

from the log file entries (see above about the location of the oraagent log file) we can see that each listener is monitored with a frequency of 60 seconds

Oraagent comes into action also at instance startup, when the instance is started with srvctl (as opposed to 'manually' started instance from sqlplus) and sets LOCAL_LISTENER parameter, dynamically (this is done with an alter system command and only if the parameter has not been set on spfile).

Dynamic listening endpoints

Where are the TCP/IP settings of my listeners in listener.ora?

Only IPC endpoints are listed in listener.ora, this is at first puzzling, where are the TCP settings that in previous versions were listed in listener.ora? ..read on!

Notes:

endpoint=the address or connection point to the listener. The most known endpoint in the oracle DB world being TCP, port 1521
dynamic listening endpoint and dynamic service registration are both concepts related to listener activities, but they are distinct.

Oraagent connects to the listener via IPC and activates the TCP (TCPS, etc) endpoints as specified in the clusterware configuration

experiment on test: $GRID_HOME/bin/lsnrctl stop listener; $GRID_HOME/bin/lsnrctl start listener; Note that the latter command starts only the IPC endpoint. However oraagent is posted at listener startup and makes active the rest of the endpoints (notably listening on the TCP port), this can be seen for example by running the following a few seconds after listener restart: $GRID_HOME/bin/lsnrctl status listener (which will list all the active endpoints)

Another way to say that is that the endpoints for the listener in RAC 11gR2 are configured in a dynamic way: TCP (TCPS, etc) endpoints are activated at runtime by oraagent

this is indicated in the listener.ora by the new (undocumented) parameters ENABLE_GLOBAL_DYNAMIC_ENDPOINT_{LISTENER_NAME}=ON

Experiment on test on disabling dynamic listening endpoint:

stop the listener: $GRID_HOME/bin/lsnrctl stop listener
edit listener.ora and set ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=OFF
start the listener: $GRID_HOME/bin/lsnrctl start listener
check listener.ora and check that the parameter edited above has not changed, that is ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=OFF
in this case the TCP endpoint will not be started, that is the listener will be listening only on IPC. Check with: $GRID_HOME/bin/lsnrctl status listener
note: if we try do the same exercise by stopping and starting the listener with srvctl, as it would be the typical way to do it, we will see that the parameter ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER in listener.ora will be set again to ON. This will activate again dynamic listening endpoints, which is something we want in production, but not for this exercise!

Note: the examples have been tested on 11.2.0.3 RAC for Linux. ORACLE_HOME needs to be set to point to the grid home (and TNS_ADMIN if used at all, needs to be set to $GRID_HOME/network/admin)

I don't need to edit listener.ora any more, or do I?

We have seen above that the most important configurations related to listener.ora are automated via oraagent and the clusterware in general
There are additional listener.ora parameters that are not managed by the clusterware in11gR2 and need to be configured in listener.ora in case we want to use them

notably parameters for COST (Class of Secure Transports) parameters needed to secure against a security vulnerability, detailed at Oracle Security Alert for CVE-2012-1675

The steps for the configuration are very well described in support note 1340831.1. Here we just mention that for a 2-node RAC the following parameters will be needed in listener.ora (note parameters taken from listener.ora on 11.2.0.3 RAC for Linux):

SECURE_REGISTER_LISTENER = (IPC,TCP)

SECURE_REGISTER_LISTENER_SCAN1 = (IPC,TCPS)

SECURE_REGISTER_LISTENER_SCAN2 = (IPC,TCPS)

WALLET_LOCATION = (SOURCE =(METHOD = FILE)(METHOD_DATA =(DIRECTORY = ..put here directory of wallet..)))

Conclusions

The handling of listeners in 11gR2 RAC has become much more integrated with the clusterware and for most RAC configurations there is not much need to play with listener.ora any more. At a closer look of however, new parameters and overall some new rules of the game in the 11gR2 implementation are revealed, most notably the use of dynamic endpoint registration by the clusterware.