Performance "Statspack" Report

Lab128 - Tools for Advanced Oracler Tuning and Monitoring.

Reference Guide.

Performance "Statspack" Report

Lab128 can generates Statspack-like report using user's selected time interval. Once the interval is selected, a menu item Reports | Performance | Statspack Report becomes enabled. Also you can use shortcut Ctrl-S to call the report. The report presents the most important data used in performance analysis:

Load Profile. The "Load Profile" section shows you the load on your instance per second and per transaction. You can compare this section between two Reports to see how the load on your instance is increasing or decreasing over time.

Redo size - This is the amount of redo generated during this report.
Logical Reads - This is calculated as Consistent Gets + DB Block Gets = Logical Reads
Block changes - The number of blocks modified during the sample interval
Physical Reads - The number of requests for a block that caused a physical I/O.
Physical Writes - The number of physical writes issued.
User Calls - The number of queries generated
Parses Total of all parses - both hard and soft
Hard Parses - Those parses requiring a completely new parse of the SQL statement. These consume both latches and shared pool area.
Soft Parses - Not listed but derived by subtracting the hard parses from parses. A soft parse reuses a previous hard parse and hence consumes far fewer resources.
Sorts, Logons, Executes and Transactions - are all self explanatory.

Instance Efficiency Percentages. Hit ratios are calculations that may provide information regarding different operations in the Oracle instance. Database tuning never must be driven by hit ratios. They only provide additional information to understand how the instance is operating.

Buffer Nowait % This is the percentage of time that the instance made a call to get a buffer (all buffer types are included here) and that buffer was made available immediately (meaning it didn't have to wait for the buffer...hence "Buffer Nowait").
Buffer Hit % This means that when a request for a buffer took place, the buffer was available in memory and physical disk I/O did not need to take place.
Library Hit % If your Library Hit percentage is low it could mean that your shared pool size is too small or that the bind variables are not being used (or at least being used properly).
Execute to Parse % This is the formula used to get this percentage: (executevalue - parsevalue ) / executevalue. So, if you run some SQL and it has to be parsed every time you execute it (even if it was a soft parse) then your percentage would be 0%. The more times that you can reuse the parsed cursor the higher your Execute to Parse ratio is. If the application parses SQL statement and never executes it (really bad!), this ratio will be negative. Generally you should be concerned if this ratio is below 50%.
Parse CPU to Parse Elapsed % Generally, this is a measure of how available your CPU cycles were for SQL parsing. If this is low, you may see "latch free" as one of your top wait events.
Redo NoWait % The instance didn't have to wait to use the redo log if this is 100%.
In-memory Sort % This means the instance could do its sorts in memory as opposed to doing physical I/O. You don't want to be doing your sorts on disk especially in an OLTP system. Try increasing your SORT_AREA_SIZE or PGA_AGGREGATE_TARGET in your spfile/pfile to see if that helps if your in-memory sorting is not between 95% and 100%.
Soft Parse % This is an important one, at least for OLTP systems. This means that your SQL is being reused. If this is low (< 95%) then make sure that you're using bind variables in the application and that they're being used properly.
Latch Hit %: This should be pretty close to 100%; if it's not then check out what your top wait events are to try to fix the problem (pay specific attention to 'latch free' event).
% Non-Parse CPU: Shows the percentage of how much CPU resources were spent on the actual SQL execution.

Top 10 Wait Events. This section shows the Top 10 timed events that must be considered to focus the tuning efforts. This section is crucial in determining what some of the performance drains in your database are. It will actually tell you the amount of time the instance spent waiting.

Here are some common reasons for high wait events:

CPU Time: Is not really a wait event, but rather the amount of CPU time used during the snapshot window If this is your largest wait event then it could mean that you have some CPU intensive SQL going on. You may want to examine some of the SQL further down in the Report for SQL statements that have large CPU Time.
DB file scattered read: This can be seen fairly often. Usually, if this number is high, then it means there are a lot of full table scans going on. This could be because you need indexes or the indexes you do have are not being used.
DB file sequential read: This could indicate poor joining orders in your SQL or waiting for writes to 'temp' space. It could mean that a lot of index reads/scans are going on. Depending on the problem it may help to tune PGA_AGGREGATE_TARGET and/or DB_CACHE_SIZE.
SQL*Net more data to client: This means the instance is sending a lot of data to the client. You can decrease this time by having the client bring back less data. Maybe the application doesn't need to bring back as much data as it is.
log file sync: A Log File Sync happens each time a commit takes place. If there are a lot of waits in this area then you may want to examine your application to see if you are committing too frequently (or at least more than you need to).
Logfile buffer space: This happens when the instance is writing to the log buffer faster than the log writer process can actually write it to the redo logs. You could try getting faster disks but you may want to first try increasing the size of your redo logs; that could make a big difference (and doesn't cost much).
Logfile switch: This could mean that your committed DML is waiting for a logfile switch to occur. Make sure your filesystem where your archive logs reside are not getting full. Also, the DBWR process may not be fast enough for your system so you could add more DBWR processes or make your redo logs larger so log switches are not needed as much.

ASH - Top SQL Statements. The Active Session History (ASH) collection engine provides very important report on most active SQL statements and shows breakdown by wait events.

Activity,% - percentage of the SQL statement elapsed time in the sum of all SQL elapsed time. This column indicates how big is given SQL statement as compared to the rest of SQL statements.
Time (s) - combined elapsed time for the SQL statement. Remember, the combined elapsed time can be bigger than measured time interval if several sessions were executing same statement concurrently.
% Total Elapsed Time - combined elapsed time from "Time (s)" column divided by Report elapsed time expressed in percents. This can be bigger than 100%. For example, if 3 sessions were executing same SQL statement during entire report time, the SQL statement will have 300% in this column. In other words, this column shows how many sessions on average were executing this statement (100% - means 1 session).

Each SQL statement has a list of events and percentage of time spent on the event. Depending on nature of waits, this shows whether the statement was CPU intensive, or I/O intensive, or experienced row lock contention etc.

ASH - Top Service/Module. This shows distribution of the load between services. The report has same columns as in previous ASH - Top SQL Statements section.

Top SQL ordered by Elapsed Time. If your goal is to reduce the response time of the database, you should start with this section. You should look at total elapsed time for the query and elapsed time per a single execution. Sometimes the single execution time is reasonable but the frequency of executions can point out to the problem in the application. The percentage of DB Time gives an estimation of how big impact the query is making to overall response time. Therefore it gives an idea how big will be payoff if the query is significantly improved. CPU column should be compared to the Elapsed time column to see the percentage of time spent on CPU. That will tell how CPU intensive the query is.

Top SQL ordered by CPU Time. If your goal is to reduce / analyze the CPU usage, you should look into this section. The CPU time probably will be a substantial percentage of total elapsed time. If the query spent all the time on CPU, values in these columns will be very close. As with elapsed time analysis in previous section, you should look at total time and time per a single execution. Sometimes the single execution time is reasonable but the frequency of executions can point out to the problem in the application. Your decision whether to optimize the query or fix the application or add more hardware should be based on those factors. The percentage of DB Time gives an estimation of how big impact the query is making to overall response time. Therefore it gives an idea how big will be payoff if the query is significantly improved.

Top SQL ordered by Gets. Gets are logical reads of DB blocks. They can be satisfied from the disk (physical read) from the DB cache. Quite often in OLTP environment the queries with highest gets will be gets from cache because they have much higher rate than physical reads. These queries will have high hit ratio (cache gets divided by total gets number) and typically there is a good correlation between top CPU queries and top Gets queries. In DSS environment, gets are mostly read gets, so there is a correlation between top Reads and top Gets queries. You should check if the number of gets per execution is reasonable. You should take into consideration how many rows the query is expected to return. If the number of rows is small but the number of gets per row is high, there are potentials to improve the query. Generally the number of gets per row > 100 in OLTP should be considered as too large but it depends on the query. If the query does aggregations across many rows, this criterion doesn't apply. You should remember that large number of gets may indicate the opportunity to improve the query.

Top SQL ordered by Reads. If your goal is to reduce / analyze the I/O load, you should look into this section. You should check if the number of reads per execution is reasonable. In OLTP environment the large number of reads can indicate full scans or range scans on indexes with poor selectivity. You should also check the number of gets per execution. If the number of gets is reasonably small, and your system is experiencing high I/O, you should increase DB cache. This doesn't apply to DSS database. Even though large number of reads can be normal in DSS, you should attempt to optimize SQL and reduce number of reads.

Top SQL ordered by Executions. High-frequency queries can indicate poorly designed application. The excessive number of execution can tax CPU and network.

Top SQL ordered by Parse Calls. You should always avoid hard parses for frequent queries which is typical to OLTP environment. Even if all parses are soft parses, they present substantial load to the system and should be avoided. This generally can be done at application level by re-coding and using principle "parsing once and executing the query several times."

Top SQL ordered by Cluster Wait Time. This is RAC specific section. These are top queries used to wait most on RAC inter-instance events. Compare values in CWT (Cluster Wait Time) columns with the elapsed time, if they are close, then the global cache and supporting services are a bottleneck for this query. Check which tables / indexes involved in the query and relate that information with segment statistics in Segments and Extents screen, sorted by "gc cr blocks received" and "gc current blocks received". This will give you understanding which segments are hot. Then you will need to take measures, which depending on type of segments, can include reverse indexing, hash partitioning, or application partitioning.

OS Statistics. Most of statistics are self-explanatory. Please note that Physical memory size is not reported correctly in some Oracle versions. The "CPU Total Busy Time" is a sum of "CPU User Time" and "CPU System Time". The "% of Elapsed Time" column shows how much of all available CPU resources have been utilized.

RAC Statistics.

Global Cache Load Profile.

Global Cache blocks received - number of blocks received from the remote instance over the hardware interconnect. This occurs when a process requests a consistent read for a data block that is not in its local cache. Oracle sends a request to another instance. Once the buffer has been received, this statistic is incremented.
Global Cache blocks served - number of blocks sent to the remote instances over the hardware interconnect.
GCS/GES messages received - number of messages received from remote instances over the hardware interconnect. This statistics generally represents overhead caused by functioning of RAC services.
GCS/GES messages sent - number of messages sent to remote instances over the hardware interconnect. This statistics generally represents overhead caused by functioning of RAC services.
DBWR Fusion writes - number of fusion writes. In RAC, as in a single instance Oracle database, blocks are only written to disk for aging, cache replacement, or checkpoints. When a data block is replaced from the cache due to aging or when a checkpoint occurs and the block was changed in another instance but not written to disk, Global Cache Service will request that instance to write the block to disk.
Estd Interconnect traffic - number of KB transferred by the interconnect.

Global Cache Efficiency Percentages.

Buffer access - local cache % - percentage of blocks satisfied from local cache to the total number of blocks requested by sessions. It is desirable to maintain this ratio as high as possible because this is the least expensive and fastest way to get the database block.
Buffer access - remote cache % - percentage of blocks satisfied from remote instance cache to the total number of blocks requested by sessions. The sum of this ratio and Buffer access - local cache % described above should be maintained as high as possible because these two paths of accessing DB blocks are fastest and least expensive to the system. Getting block from the cache of remote instance is about 10 times faster than reading it from the disk.
Buffer access - disk % - percentage of blocks read from disk into the cache to the total number of blocks requested by sessions. It is desirable to maintain this ratio low because physical reads is the slowest way to access database blocks.

Global Cache and Enqueue Services - Workload Characteristics.

This section provides average times for obtaining global enqueue (locks) and global cache blocks.

Avg global enqueue get time (ms) - time spent on sending message through the interconnect, opening a new global enqueue for the resource or converting access mode of the enqueue if it is already open. If get time is more than 20 ms, then your system may be experiencing timeouts.
Avg global cache cr block receive time (ms) - The time is spent on sending message from requesting instance to the block mastering instance (2-way get) and sometimes to the holding instance (3-way get). This time also includes building of the consistent read image of the block in the holding instance. The CR Block get time should not exceed 15 ms.
Avg global cache current block receive time (ms) - The time is spent on sending message from requesting instance to the block mastering instance (2-way get) and sometimes to the holding instance (3-way get). This time also includes log flush time in the holding instance. The Current Block get time should not exceed 30 ms.
Avg global cache cr block build time (ms), Avg global cache cr block send time (ms), Avg global cache cr block flush time (ms) - are components of GCS activity to serve remote requests for consistent read (CR) blocks.
Global cache log flushes for cr blocks served % - is the percentage of CR block that needed log flush to the total number of CR blocks served.
Avg global cache current block build time (ms), Avg global cache current block send time (ms), Avg global cache current block flush time (ms) - are components of GCS activity to serve remote requests for current blocks.
Global cache log flushes for current blocks served % - is the percentage of current block that needed log flush to the total number of current blocks served.

Global Cache and Enqueue Services - Messaging Statistics.

This section provides performance numbers on messaging, although Oracle documentation doesn't say how to use that information for the purpose of tuning or troubleshooting . If you have your ideas, please sent e-mail message to support@lab128.com.

Avg message sent queue time (ms) - calculated as v$ges_statistics 'msgs sent queue time (ms)' / 'msgs sent queued';
Avg message sent queue time on ksxp (ms) - calculated as v$ges_statistics 'msgs sent queue time on ksxp (ms)' / 'msgs sent queued on ksxp';
Avg message received queue time (ms) - calculated as v$ges_statistics 'msgs received queue time (ms)' / 'msgs received queued';
Avg GCS message process time (ms) - calculated as v$ges_statistics 'gcs msgs process time(ms)' / 'gcs msgs received';
Avg GES message process time (ms) - calculated as v$ges_statistics 'ges msgs process time(ms)' / 'ges msgs received';
% of direct sent messages - calculated as v$ges_statistics 'messages sent directly' / ( 'messages sent directly' + 'messages sent indirectly' + 'messages flow controlled';
% of indirect sent messages - calculated as v$ges_statistics 'messages sent indirectly' / ( 'messages sent directly' + 'messages sent indirectly' + 'messages flow controlled';
% of flow controlled messages - calculated as v$ges_statistics 'messages flow controlled' / ( 'messages sent directly' + 'messages sent indirectly' + 'messages flow controlled';