Extending Oracle Enterprise Manager (EM) monitoring. 29 January 2012
Posted by David Alejo Marcos in Grid Control, Oracle 11.2, SQL - PL/SQL.Tags: Grid Control, Oracle 11.2, SQL, SQL - PL/SQL
trackback
I always found Oracle Enterprise Manager (EM) to be an interesting tool for different reasons. The only thing I missed was an easy way to create my own alerts.
It is very simple to create a KSH, Perl, etc script to do some customised monitoring and notify you by email, Nagios, NetCool, etc.
By integrating your scripts with OEM, you will have an easy way to enhance your monitoring and still have notification by email, SNMP traps, etc. as you would currently have if your company is using OEM for monitoring your systems.
The problem:
Develop an easy way to integrate your monitoring scripts with OEM.
The solution:
I decided to use an Oracle type and an Oracle function to accomplish this goal. Using the steps described below, we can monitor pretty much whatever aspect of the database providing you can put the logic into a function.
As example, I had added the steps to create two new User-Defined SQL Metrics, as Oracle calls them:
1.- Long Running Sessions (LRS).
2.- Tablespace Monitoring.
The reason to have my own TBS monitoring is to enhance the existing as it has “hard-coded” thresholds. I might have tablespaces in your database which are 6TBS in size and other with only 2Gb, so raising an alert at 95% for both of them is, in my opinion, not adequate.
You can find more about the query I developed here.
The steps to create a script to monitor long running sessions (LRS) are:
1.- create types
CREATE OR REPLACE TYPE lrs_obj as OBJECT ( user_name VARCHAR2(256), error_message varchar(2000)); / CREATE OR REPLACE TYPE lrs_array AS TABLE OF lrs_obj; /
2.- create function.
CREATE OR REPLACE FUNCTION lrs RETURN lrs_array IS long_running_data lrs_array := lrs_array(); ln_seconds_active number := 300; BEGIN SELECT lrs_obj(username||' Sec: '||sec_running, ', Inst_id: '||inst_id||', SID: '||sid||', Serial: '|| serial||', Logon: '||session_logon_time||', sql_id: '||sql_id) BULK COLLECT INTO long_running_data FROM (SELECT /*+ FIRST_ROWS USE_NL(S,SQ,P) */ s.inst_id inst_id, s.sid sid, s.serial# serial, s.last_call_et sec_running, NVL(s.username, '(oracle)') AS username, to_char(s.logon_time, 'DD-MM-YYYY HH24:MI:SS') session_logon_time, s.machine, NVL(s.osuser, 'n/a') AS osuser, NVL(s.program, 'n/a') AS program, s.event, s.seconds_in_wait, s.sql_id sql_id, sq.sql_text from gv$session s, gv$sqlarea sq where s.sql_id = sq.sql_id and s.inst_id = sq.inst_id and s.status = 'ACTIVE' and s.last_call_et > ln_seconds_active and s.paddr not in ( select paddr from gv$bgprocess where paddr != '00' ) and s.type != 'BACKGROUND' and s.username not in ( 'SYSTEM', 'SYS' ) AND s.event != 'SQL*Net break/reset to client' ) CUSTOMER_QUERY; RETURN long_running_data; END lrs; /
3.- Grant privileges to the users will be executing monitoring scripts:
grant execute on lrs_obj to public; grant execute on lrs_array to public; grant execute on lrs to public;
4.- create synonyms
Create public synonym lrs_obj for lrs_obj; Create public synonym lrs_array for lrs_array; Create public synonym lrs for lrs;
5.- Query to monitor
SELECT user_name, error_message FROM TABLE(CAST(lrs as lrs_array));
Once we are satisfied with the thresholds (300 seconds on the script), we are ready to add it to EM.
1.- Navigate to User-Defined SQL Metrics (you need to navigate to your database and you will find the link at bottom).
2.- Create new User-Defined SQL Metric and fill the gaps (I have attached some values for reference). The most important thing right now is to make sure metric_Type = String, Sql Query Output = two columns, comparison Operator = CONTAINS and warning has a value returned by the query (i did decide to go for Inst_id).
The only thing left now is to add this monitoring to your templates and to your rules so notifications are being sent.
Once all has been configure, you should start seeing alerts like this:
Target Name=lab01 Target type=Cluster Database Host=mylab Occurred At=Jan 22, 2012 14:35:47 PM GMT Message=LRS alert: key = DAVIDLAB Sec: 530, value = , Inst_id: 1, SID: 153, Serial: 1597, Logon: 22-01-2012 12:21:46, sql_id: 3m72fjep12w8r Metric=StrValue Metric value=, Inst_id: 1, SID: 153, Serial: 1597, Logon: 22-01-2012 12:21:46, sql_id: 3m72fjep12w8r Metric ID=lrs Key=DAVIDLAB Sec: 530 Severity=Warning Acknowledged=No Notification Rule Name=david alerts Notification Rule Owner=DAVIDLAB
For Tablespace monitoring the steps will be the same as described above:
1.- create types
CREATE OR REPLACE TYPE tbs_obj as OBJECT ( tablespace_name VARCHAR2(256), error_message varchar(2000)); / CREATE OR REPLACE TYPE tbs_array AS TABLE OF tbs_obj; /
2.- create function.
CREATE OR REPLACE FUNCTION calc_tbs_free_mb RETURN tbs_array IS
tablespace_data tbs_array := tbs_array();
BEGIN
SELECT tbs_obj(tablespace_name, alert||', Used_MB: '||space_used_mb||', PCT_Free: '||pct_free||', FreeMB: '|| free_space_mb) BULK COLLECT
INTO tablespace_data
FROM
(SELECT (CASE
WHEN free_space_mb <= DECODE (allocation_type, 'UNIFORM', min_extlen, maxextent) * free_extents THEN 'CRITICAL'
WHEN free_space_mb <= DECODE (allocation_type, 'UNIFORM', min_extlen, maxextent) * free_extents + 20 THEN 'WARNING'
ELSE 'N/A'
END)
alert,
tablespace_name,
space_used_mb,
ROUND (free_space/power(1024,3), 2) free_gb,
free_space_mb,
pct_free,
ROUND (extend_bytes/power(1024,3), 2) extend_gb,
free_extents,
max_size_gb,
maxextent
FROM (SELECT c.tablespace_name,
NVL (ROUND ( (a.extend_bytes + b.free_space) / (bytes + a.extend_bytes) * 100,2), 0) pct_free,
NVL ( ROUND ( (a.extend_bytes + b.free_space) / 1024 / 1024, 2), 0) free_space_mb,
(CASE
WHEN NVL ( ROUND ( (maxbytes) / power(1024,3), 2), 0) <= 30 THEN 60
WHEN NVL ( ROUND ( (maxbytes) / power(1024,3), 2), 0) <= 100 THEN 120
WHEN NVL ( ROUND ( (maxbytes) / power(1024,3), 2), 0) <= 300 THEN 200
WHEN NVL ( ROUND ( (maxbytes) / power(1024,3), 2), 0) <= 800 THEN 300
ELSE 340
END) free_extents,
a.extend_bytes, b.free_space,
ROUND (maxbytes /power(1024,3), 2) max_size_gb,
nvl (round(a.bytes - b.free_space ,2) /1024/1024, 0) space_used_mb,
c.allocation_type,
GREATEST (c.min_extlen / 1024 / 1024, 64) min_extlen,
64 maxextent
FROM ( SELECT tablespace_name,
SUM(DECODE (SIGN (maxbytes - BYTES), -1, 0, maxbytes - BYTES)) AS extend_bytes,
SUM (BYTES) AS BYTES,
SUM (maxbytes) maxbytes
FROM DBA_DATA_FILES
GROUP BY tablespace_name) A,
( SELECT tablespace_name,
SUM (BYTES) AS free_space,
MAX (BYTES) largest
FROM DBA_FREE_SPACE
GROUP BY tablespace_name) b,
dba_tablespaces c
WHERE c.contents not in ('UNDO','TEMPORARY') and
b.tablespace_name(+) = c.tablespace_name
AND a.tablespace_name = c.tablespace_name
)
) CUSTOMER_QUERY;
RETURN tablespace_data;
END calc_tbs_free_mb;
/
3.- Grant privileges to the users will be executing monitoring scripts:
grant execute on tbs_obj to public; grant execute on tbs_array to public; grant execute on calc_tbs_free_mb to public;
4.- create synonyms
Create public synonym tbs_obj for tbs_obj; Create public synonym tbs_array for tbs_array; Create public synonym calc_tbs_free_mb for calc_tbs_free_mb;
5.- Query to monitor
SELECT * FROM TABLE(CAST(calc_tbs_free_mb as tbs_array));
Please, remember to use comparison operator = CONTAINS, warning = WARNING and critical=CRITICAL
As always, comments are welcome.
David Alejo-Marcos.
David Marcos Consulting Ltd.


Thx David -
I have this on my to-do stack to try out. May be a while before I get to it. I’ll let you know how it goes.
thx again…
Hi. I am getting an error when I try to test the metric.
Received unexpected response text :
That’s all it gives me. I know for sure the query isn’t returning any rows however. Is this expected?
thx,
Carl
oh. forgot to mention that I have created a udm for the lrs function.
thx
Hello Carl,
Thank you for your comments.
I have not experienced this error. If the query does not return any rows, no alert will be raised.
The following link might help
https://forums.oracle.com/forums/thread.jspa?threadID=518881
Please, post any errors on the em log files as they might be helpful.
Regards,
David Marcos.