Showing posts with label RAC. Show all posts

Friday, January 31, 2014

Clusterware 12c and Restricted Service Registration for RAC

Topic: This post explores the mechanisms used by Oracle Clusterware 12.1.0.1 to restrict remote service registration, i.e. the 12c new feature "Restricting Service Registration for Oracle RAC Deployments".

Why is this useful? This improvement of 12c clusterware and listeners over version 11.2 is useful mainly for security, for example as a measure against TNS poisoning attacks (see also CVE-2012-1675), and it is particularly relevant for RAC deployments. It also makes the DBA's job easier by avoiding the complexity of COST (Class of Secure Transport) configurations (see also support Doc ID 1453883.1).
Notably, Oracle 11.2 databases can benefit from this 12c improvement too (when the 11g RAC database is installed under 12c clusterware).

Spoiler: If you have heard already about the 12c new feature of valid node checking for registration (VNCR) you are still in for a surprise.

Listeners in Oracle RAC in 11.2 and 12.1:
There have been important changes in how listeners are used in RAC starting with clusterware 11.2. The details are discussed in a previous post. The main points are that (1) we now have local listeners and scan listeners, all using the same binary tnslsnr but with different scope; (2) most of the listener configuration parameters, in RAC, are taken care of by the clusterware (Oraagent).
Database instances perform remote service registration to remote listeners as specified by the instance parameter remote_listener (BTW remote service registration is needed to enable the server-side load balancing mechanism of RAC). PMON takes care of the registration in versions up to 11.2; in 12c a new LREG process has been introduced. Local service registration is configured at the instance level using the parameter local_listener. Normally we leave its value unset and the clusterware (Oraagent) takes care of setting it to the address of the local listener. In 11.2 and higher the parameter listener_networks is also relevant, typically in setups with more than one public network (the details are outside the scope of this discussion).

Why restrict service registration?
If service registration is not restricted, anybody who can reach the listener via the network can register a service with it. This opens the way to abuses, such as crafting an attack aimed at redirecting legitimate TNS traffic towards the attacker's machine (see also TNS poisoning attack mentioned above).

What's new in 12.1.0.1?
After performing a vanilla 12.1.0.1 RAC installation we notice that the scan listeners only accept remote registration over TCP from local cluster nodes. This is an improvement over 11g, where to obtain the same result the DBA had to manually execute a series of steps for COST (Class of Secure Transport) configuration, in order to configure the TCPS protocol to restrict remote registration. This improvement is listed as a new feature of 12c for RAC as "Restricting Service Registration for Oracle RAC Deployments".

How does 12c restrict service registration?
In $GRID_HOME/network/admin we can examine listener.ora: we notice a few additional lines compared with the equivalent file in version 11.2 (see also this post for more details on listener.ora in 11.2), namely lines containing the valid node checking configuration. For example:

VALID_NODE_CHECKING_REGISTRATION_LISTENER=SUBNET # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN1=OFF # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN2=OFF # line added by Agent

With the first of the lines listed above, the clusterware (OraAgent), which takes care of most of the listener configuration, informs us that after starting the listener it has also connected to it (via IPC) and activated the new feature of valid node checking for registration (VNCR) for the listener called LISTENER, that is for the local listener.
What VNCR does is restrict remote service registration, in this case to the local subnets. A short recap of the possible values for VNCR, from support Note ID 1592571.1, is listed below. See the documentation for further details.
  • OFF/0 - Disable VNCR
  • ON/1/LOCAL - The default. Enable VNCR. All local machine IPs can register.
  • SUBNET/2 - All machines in the subnet are allowed registration. This is for RAC installations.
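As a minimal sketch (not an Oracle tool), the Python snippet below reads listener.ora-style text and reports the VNCR mode found for each listener, using the value recap above. The function name vncr_modes and the SAMPLE text are illustrative assumptions. Listeners with no VALID_NODE_CHECKING_REGISTRATION_ line are simply not reported, although per the recap above they would fall back to the default.

```python
import re

# Meanings as recapped above from the support note; keys are the accepted spellings.
VNCR_MEANING = {
    "OFF": "VNCR disabled",
    "0": "VNCR disabled",
    "ON": "local machine IPs only",
    "1": "local machine IPs only",
    "LOCAL": "local machine IPs only",
    "SUBNET": "all machines in the local subnets",
    "2": "all machines in the local subnets",
}

PREFIX = "VALID_NODE_CHECKING_REGISTRATION_"


def vncr_modes(listener_ora_text):
    """Return {listener_name: (raw_value, meaning)} for each VNCR line found."""
    modes = {}
    for line in listener_ora_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        m = re.match(rf"{PREFIX}(\w+)\s*=\s*(\w+)$", line, re.IGNORECASE)
        if m:
            name, value = m.group(1).upper(), m.group(2).upper()
            modes[name] = (value, VNCR_MEANING.get(value, "unknown value"))
    return modes


# Sample lines copied from the listener.ora snippet shown in this post.
SAMPLE = """\
VALID_NODE_CHECKING_REGISTRATION_LISTENER=SUBNET # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN1=OFF # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN2=OFF # line added by Agent
"""

for name, (value, meaning) in vncr_modes(SAMPLE).items():
    print(f"{name}: {value} ({meaning})")
```

On a real system the text would be read from $GRID_HOME/network/admin/listener.ora instead of the SAMPLE string.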

What about the scan listener?
From the snippet of listener.ora reported above we can also see that VNCR is surprisingly turned OFF for the scan listeners (at least in this configuration that I have obtained after a vanilla 12.1.0.1 clusterware installation). However, as we can easily check (see more on techniques below) remote service registration to the scan listeners is indeed restricted and not possible for servers outside the RAC cluster. Therefore another mechanism, different from VNCR, is in place for scan listeners. Let's investigate!

Some explanations:
It turns out that the clusterware (OraAgent) does the work again, but this time without making it visible with an entry in listener.ora. OraAgent takes care of setting the parameter REMOTE_REGISTRATION_ADDRESS for the scan listener, setting an endpoint for it on the HAIP network. Note: for more info on HAIP see the documentation and support Doc ID 1210883.1.
A log of this listener parameter change can be found in the logfile of the Clusterware Agent (OraAgent for crsd): $GRID_HOME/{NODE_NAME}/agent/crsd/oraagent_oracle/oraagent_oracle.log. The result can also be observed by using lsnrctl:

$ lsnrctl show remote_registration_address dbserver1-s:1521

Connecting to (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=))(ADDRESS=(PROTOCOL=TCP)(HOST=xx.xx.xx.xx)(PORT=1521)))
dbserver1-s:1521 parameter "remote_registration_address" set to (DESCRIPTION=(ASYNC_TIMER=yes)(EXPIRE_TIME=1)(TRANSPORT_CONNECT_TIMEOUT=15)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=169.254.163.245)(PORT=11055))))

What does REMOTE_REGISTRATION_ADDRESS do?
When the listener receives a remote registration request it will reply to the client (which would normally be Oracle's PMON or LREG process) with a request to re-send the registration message via the HAIP network, in the example above: (HOST=169.254.163.245)(PORT=11055). This would only be possible if the instance trying to register its services has access to the same cluster interconnect (that is if it belongs to the same RAC cluster or to another RAC cluster which shares the same private network).
The default value for REMOTE_REGISTRATION_ADDRESS is OFF, therefore the redirection mechanism described above is not in place unless explicitly activated (by the 12c clusterware in this case). See also the documentation for REMOTE_REGISTRATION_ADDRESS.
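A small Python sketch can make the check on the redirection endpoint repeatable: it extracts HOST and PORT from a connect descriptor in the form reported by lsnrctl and tests whether the host is an IPv4 link-local address. The helper names are illustrative, and the test assumes that HAIP addresses fall in 169.254.0.0/16, as they do in the examples in this post.

```python
import ipaddress
import re


def registration_endpoint(descriptor):
    """Extract (host, port) of the first ADDRESS in a connect descriptor."""
    host = re.search(r"\(HOST=([^)]+)\)", descriptor, re.IGNORECASE)
    port = re.search(r"\(PORT=(\d+)\)", descriptor, re.IGNORECASE)
    if not host or not port:
        raise ValueError("no HOST/PORT found in descriptor")
    return host.group(1), int(port.group(1))


def looks_like_haip(host):
    """True if host is an IPv4 link-local address (assumption: HAIP uses 169.254.0.0/16)."""
    try:
        return ipaddress.ip_address(host) in ipaddress.ip_network("169.254.0.0/16")
    except ValueError:  # host is a name, not an IP literal
        return False


# Descriptor in the form reported by "lsnrctl show remote_registration_address".
DESCRIPTOR = (
    "(DESCRIPTION=(ASYNC_TIMER=yes)(EXPIRE_TIME=1)(TRANSPORT_CONNECT_TIMEOUT=15)"
    "(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=169.254.163.245)(PORT=11055))))"
)

host, port = registration_endpoint(DESCRIPTOR)
print(host, port, looks_like_haip(host))
```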

VNCR can be used for scan listeners too. 
VNCR appears to be used when an invited list is specified, that is when we want to further restrict the nodes allowed to perform remote service registration. Below is an example of how this can be done; see the documentation for details:
$ srvctl modify scan_listener -update -invitednodes dbserver1,dbserver2

After this change we notice that (1) VNCR (valid node checking) is now used for the scan listeners, (2) the invited nodes are limited to the listed nodes and the local subnet, and (3) the parameter remote_registration_address is no longer used in this case. Below is a relevant snippet from listener.ora:
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN1=SUBNET  # line added by Agent
REGISTRATION_INVITED_NODES_LISTENER_SCAN1=(dbserver1,dbserver2) # line added by Agent

Additional comments on configuring VNCR for RAC 
With the clusterware version used for these tests (12.1.0.1) I was not able to set VALID_NODE_CHECKING_REGISTRATION_LISTENER to the value ON using srvctl; Oracle was using the value SUBNET instead. The value ON is more restrictive than SUBNET (see the discussion above) and I believe it is more appropriate in most cases as a setting for the local listener.
In a test system I have noticed that when manually editing VALID_NODE_CHECKING_REGISTRATION_LISTENER=ON in listener.ora, the change persisted after a listener restart.
Moreover, in 12.1.0.1 clusterware, when specifying a list of invited nodes for the scan listener (as in the example above), the VNCR parameter is set to SUBNET rather than to ON. In this case a manual update to VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN1=ON did not persist after a scan listener restart (OraAgent would overwrite the value).

Some pointers to investigation techniques

How to test if a listener will accept remote service registration:
We can use a test database as the 'attacker' and set the remote_listener parameter to point to the listener under test. This will send a remote service registration request from the 'attacker' instance (from PMON or LREG depending on the version) towards the target listener. This operation and its result will be visible in the target listener's listener.log file. If the registration has been successful it will also be visible by running the command lsnrctl services target-listener:port.
A basic example showing how to set the remote_listener parameter:
SQL> alter system set remote_listener = '<endpoint>:<port>' scope=memory sid='*';

Listener configuration details from the clusterware logs:
Most of the parameters for the listener in 11.2 and 12c clusterware are set by the clusterware (for those parameters the configuration in the clusterware takes precedence over the values set in listener.ora). The log file of interest to see what operations the clusterware has performed on the listeners is the crsd oraagent log file: $GRID_HOME/{NODE_NAME}/agent/crsd/oraagent_oracle/oraagent_oracle.log

Listener logs tailing
The listener log files are an obvious source where we can get information on what is happening with remote registration.
Below is an example of listener.log entries generated by a remote registration request blocked by VNCR:
Listener(VNCR option 2) rejected Registration request from destination xx.xx.xx.xx
DD-MON-YYYY HH:MI:SS * service_register_NSGR * 1182
TNS-01182: Listener rejected registration of service ""
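When tailing a busy listener.log, a few lines of Python can pull out just the rejected registrations. The sketch below is based only on the message format shown in the example above; the function name and the regular expression are illustrative, and other Oracle versions may word the message differently.

```python
import re

# Pattern modelled on the rejection line shown in the example above.
REJECT_RE = re.compile(
    r"Listener\(VNCR option (\d+)\) rejected Registration request from destination (\S+)"
)


def rejected_registrations(log_text):
    """Return a list of (vncr_option, source_address) for each rejection found."""
    return [(int(m.group(1)), m.group(2)) for m in REJECT_RE.finditer(log_text)]


# Sample entries copied from this post (the source address is masked there).
SAMPLE_LOG = """\
Listener(VNCR option 2) rejected Registration request from destination xx.xx.xx.xx
DD-MON-YYYY HH:MI:SS * service_register_NSGR * 1182
TNS-01182: Listener rejected registration of service ""
"""

for option, source in rejected_registrations(SAMPLE_LOG):
    print(f"VNCR option {option} rejected registration from {source}")
```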

Network connections
When the clusterware sets remote_registration_address to provide redirection of the remote registrations, it also sets up an additional endpoint for the scan listener on the HAIP network. Moreover, LREG (or PMON in 11.2) of the remote instance can be seen connecting to this endpoint. Netstat is a handy tool to expose this. Example:
$ netstat -anp|grep tcp|grep 169.254

tcp   169.254.143.245:57688  0.0.0.0:*             LISTEN      12904/tnslsnr
tcp   169.254.143.245:57688  169.254.80.219:23239  ESTABLISHED 12904/tnslsnr
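The same filtering can be sketched in Python, for example to script the check across several nodes. The function below is a hypothetical helper: it reads the columns from the right-hand side of each line so that it copes both with the trimmed output shown above and with full netstat -anp output (which also carries Recv-Q and Send-Q columns), and it assumes that HAIP addresses start with 169.254.

```python
def haip_listener_sockets(netstat_text):
    """Return (listening, established) tnslsnr TCP sockets bound to 169.254.x.x."""
    listening, established = [], []
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 5 or not fields[0].startswith("tcp"):
            continue
        # Count columns from the end: local, remote, state, pid/program.
        local, remote, state, owner = fields[-4:]
        if not owner.endswith("/tnslsnr") or not local.startswith("169.254."):
            continue
        if state == "LISTEN":
            listening.append(local)
        elif state == "ESTABLISHED":
            established.append((local, remote))
    return listening, established


# Sample lines copied from the netstat example in this post.
SAMPLE_NETSTAT = """\
tcp   169.254.143.245:57688  0.0.0.0:*             LISTEN      12904/tnslsnr
tcp   169.254.143.245:57688  169.254.80.219:23239  ESTABLISHED 12904/tnslsnr
"""

listening, established = haip_listener_sockets(SAMPLE_NETSTAT)
print("listening endpoints:", listening)
print("registered peers:", established)
```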

Trace Oracle processes
A simple technique to see the messaging between pmon/lreg while registering to the remote listener is to use strace. For example we can identify the pid of PMON (or LREG as relevant) and run: strace -s 10000 -p <pid>  (see above the syntax for alter system set remote_listener to trigger remote registration). Example from the output:

read(23, "\0\315\0\0\5\0\0\0\0\303(DESCRIPTION=(ASYNC_TIMER=yes)(EXPIRE_TIME=1)(TRANSPORT_CONNECT_TIMEOUT=15)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=169.254.163.245)(PORT=19771)))(CONNECT_DATA=(COMMAND=service_register_NSGR)))", 8208) = 205
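The bytes in front of the descriptor can be decoded to confirm that this message is the listener's redirect reply. The Python sketch below assumes the TNS header layout described in public protocol analyses (it is not taken from Oracle documentation): a 2-byte big-endian packet length, a 2-byte checksum, a 1-byte packet type where 5 means redirect, a 1-byte flags field, a 2-byte header checksum, and for redirect packets a 2-byte payload length. The function name decode_redirect is illustrative.

```python
import struct

# Bytes copied from the strace output above (octal escapes as printed by strace).
PACKET = (
    b"\0\315\0\0\5\0\0\0\0\303"
    b"(DESCRIPTION=(ASYNC_TIMER=yes)(EXPIRE_TIME=1)(TRANSPORT_CONNECT_TIMEOUT=15)"
    b"(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=169.254.163.245)(PORT=19771)))"
    b"(CONNECT_DATA=(COMMAND=service_register_NSGR)))"
)

TNS_REDIRECT = 5  # assumed packet type code for a redirect


def decode_redirect(packet):
    """Return (packet_length, payload_length, payload) for a TNS redirect packet."""
    length, _cksum, ptype, _flags, _hdr_cksum = struct.unpack(">HHBBH", packet[:8])
    if ptype != TNS_REDIRECT:
        raise ValueError(f"not a redirect packet (type {ptype})")
    (data_len,) = struct.unpack(">H", packet[8:10])
    payload = packet[10:10 + data_len].decode("ascii")
    return length, data_len, payload


length, data_len, payload = decode_redirect(PACKET)
print(length, data_len)
print(payload)
```

Under these assumptions the length field decodes to the same value that read() returned in the strace line (205), and the payload is the address on the HAIP network to which the registration has to be re-sent.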

Listener tracing
Another useful technique is to set the tracing level for the scan listener to level 16 and look at the trace file, for example while triggering service registration. Example of how to set the trace level:

$ lsnrctl 
LSNRCTL> set current_listener dbserver1-s:1521
LSNRCTL> set trc_level 16

Conclusions
The Oracle Clusterware 12c new feature "Restricting Service Registration for Oracle RAC Deployments" makes it possible to restrict service registration for security purposes and reduces the complexity of RAC installations. This new feature can be used by 12c and 11g RDBMS engines installed under 12c clusterware. This article investigates the details of the implementation of restricted service registration. One of the findings is that Oracle Clusterware does most of the configuration work in the background by setting the relevant parameters for the local and scan listeners.
A new 12c listener.ora parameter, REMOTE_REGISTRATION_ADDRESS, is used to secure scan listeners, at least in the case of a 12.1.0.1 vanilla clusterware installation. Another mechanism to restrict service registration is used for local listeners and for scan listener in particular cases: Valid Node Checking for Registration (VNCR), also a new feature of 12c.

Wednesday, July 11, 2012

Listener.ora and Oraagent in RAC 11gR2

Topic: An investigation of a few details of the implementation of listeners in 11gR2, including the configuration of listener.ora in RAC and the role of the cluster process 'oraagent'.

11gR2 comes with several important changes and improvements to the clusterware in general and in particular the way listeners are managed. While the listener process is still the 'good old' process  tnslsnr (Linux and Unix), it is now started from the grid home (as opposed to database oracle home). Moreover listeners are divided in two classes: node listeners and scan listeners, although they use the same binary for both functions. There are many more details and instead of covering them here I'd rather reference this excellent review: Markus Michalewicz's presentation at TCOUG.

Oraagent takes good care of our listeners

  • Node listeners and scan listeners are configured at the clusterware level, for example with srvctl and the configuration is propagated to the listeners accordingly
    • this integration is more consistently enforced in 11gR2 than in previous versions
  • The oraagent process spawned by crsd takes care of our listeners in terms of configuration and monitoring
    • note that there is normally a second oraagent on the system which is spawned by ohasd and does not seem to come into play here
  • Notably in the listener.ora file we can see which configuration lines are being taken care of automatically by oraagent, as they are marked with the following comment: # line added by Agent
    • experiment on a test DB: delete one or more of the listener.ora configuration lines and restart the listener (for example with srvctl stop listener; srvctl start listener). The original configuration will reappear in listener.ora and the manually modified listener.ora will be renamed (with a timestamp suffix)
  • The agent creates and maintains the file: endpoints_listener.ora
    • this file is there for backward compatibility, see docs and/or support site for more info. 
    • experiment on test: delete the file and restart the listener, oraagent will recreate the file.
  • Oraagent log can be found at: $GRID_HOME/log/{nodename}/agent/crsd/oraagent_oracle/oraagent_oracle.log
  • Oraagent monitors the functioning of each listener 
    • from the log file entries (see above about the location of the oraagent log file) we can see that each listener is monitored with a frequency of 60 seconds
  • Oraagent also comes into action at instance startup, when the instance is started with srvctl (as opposed to an instance started 'manually' from sqlplus), and sets the LOCAL_LISTENER parameter dynamically (this is done with an alter system command and only if the parameter has not been set in the spfile).

    Dynamic listening endpoints 

    • Where are the TCP/IP settings of my listeners in listener.ora? 
      • Only IPC endpoints are listed in listener.ora, this is at first puzzling, where are the TCP settings that in previous versions were listed in listener.ora? ..read on!
    • Notes: 
      • endpoint=the address or connection point to the listener. The most known endpoint in the oracle DB world being TCP, port 1521
      • dynamic listening endpoint and dynamic service registration are both concepts related to listener activities, but they are distinct.
    • Oraagent connects to the listener via IPC and activates the TCP (TCPS, etc) endpoints as specified in the clusterware configuration
      • experiment on test: $GRID_HOME/bin/lsnrctl stop listener; $GRID_HOME/bin/lsnrctl start listener; Note that the latter command starts only the IPC endpoint. However oraagent is posted at listener startup and makes active the rest of the endpoints (notably listening on the TCP  port), this can be seen for example by running the following a few seconds after listener restart: $GRID_HOME/bin/lsnrctl status listener (which will list all the active endpoints)
    • Another way to say that is that the endpoints for the listener in RAC 11gR2 are configured in a  dynamic way: TCP (TCPS, etc) endpoints are activated at runtime by oraagent
      • this is indicated in the listener.ora by the new (undocumented) parameters ENABLE_GLOBAL_DYNAMIC_ENDPOINT_{LISTENER_NAME}=ON
    • Experiment on test on disabling dynamic listening endpoint:  
      • stop the listener: $GRID_HOME/bin/lsnrctl stop listener
      • edit listener.ora and set  ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=OFF
      • start the listener: $GRID_HOME/bin/lsnrctl start listener
      • check listener.ora and check that the parameter edited above has not changed, that is  ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=OFF 
      • in this case the TCP endpoint will not be started, that is the listener will be listening only on IPC. Check with: $GRID_HOME/bin/lsnrctl status listener 
      • note: if we try to do the same exercise by stopping and starting the listener with srvctl, as would be the typical way to do it, we will see that the parameter ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER in listener.ora is set back to ON. This activates dynamic listening endpoints again, which is something we want in production, but not for this exercise!
    • Note: the examples have been tested on 11.2.0.3 RAC for Linux. ORACLE_HOME needs to be set to point to the grid home (and TNS_ADMIN if used at all, needs to be set to $GRID_HOME/network/admin)
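    To check the state of the dynamic endpoint parameter before and after such an experiment, the following Python sketch lists the ENABLE_GLOBAL_DYNAMIC_ENDPOINT_ settings found in listener.ora-style text. The function name and the SAMPLE text are illustrative assumptions (the sample mimics the situation after the manual edit described above); on a real system the text would be read from $GRID_HOME/network/admin/listener.ora.

```python
import re

PATTERN = re.compile(r"ENABLE_GLOBAL_DYNAMIC_ENDPOINT_(\w+)\s*=\s*(\w+)", re.IGNORECASE)


def dynamic_endpoint_flags(listener_ora_text):
    """Return {listener_name: True/False}, True meaning dynamic endpoints are ON."""
    flags = {}
    for line in listener_ora_text.splitlines():
        m = PATTERN.match(line.strip())
        if m:
            flags[m.group(1).upper()] = m.group(2).upper() == "ON"
    return flags


# Illustrative content: LISTENER edited by hand to OFF, SCAN1 left as set by the agent.
SAMPLE = """\
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=OFF
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1=ON # line added by Agent
"""

for name, enabled in dynamic_endpoint_flags(SAMPLE).items():
    state = "dynamic endpoints ON" if enabled else "IPC only until re-enabled"
    print(f"{name}: {state}")
```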

    I don't need to edit listener.ora any more, or do I?

    • We have seen above that the most important configurations related to listener.ora are automated via oraagent and the clusterware in general
    • There are additional listener.ora parameters that are not managed by the clusterware in 11gR2 and need to be configured in listener.ora in case we want to use them
    • The steps for the configuration are very well described in support note 1340831.1. Here we just mention that for a 2-node RAC the following parameters will be needed in listener.ora (note parameters taken from listener.ora on 11.2.0.3 RAC for Linux):
    SECURE_REGISTER_LISTENER = (IPC,TCP)
    SECURE_REGISTER_LISTENER_SCAN1 = (IPC,TCPS)
    SECURE_REGISTER_LISTENER_SCAN2 = (IPC,TCPS)
    WALLET_LOCATION =  (SOURCE =(METHOD = FILE)(METHOD_DATA =(DIRECTORY = ..put here directory of wallet..)))


    Conclusions

    The handling of listeners in 11gR2 RAC has become much more integrated with the clusterware and for most RAC configurations there is not much need to play with listener.ora any more. On closer inspection, however, new parameters and overall some new rules of the game in the 11gR2 implementation are revealed, most notably the use of dynamic endpoint registration by the clusterware.