Thursday, July 27, 2006

Training Class (Final Day)

To round off the material covered in this class the following topics were covered today:
  1. Tuning Block Space Usage.
  2. Tuning I/O.
  3. Tuning PGA and Temporary Space.
  4. Performance Tuning: Summary.

I found the Tuning I/O lecture somewhat interesting. The first portion of the lecture focused on the advantages and disadvantages of the various forms of RAID protection. While informative, I could've spent 5 minutes on Google had I not already been armed with the knowledge of this technology. The remainder of this lecture focused on ASM (Automatic Storage Management). This rather non-trivial feature in 10g sounds very cool; define some Data disk group(s) , the relevant protection and striping granularity and let Oracle do the all of the I/O tuning. Of course, this is a severe over simplification of what it really does (or doesn't, as your mileage may vary). But, the point is, this feature is supposed to free the DBA from the often times laborious chore of tuning the I/O subsystem. Truthfully, I think the degree to which Oracle touts the hands-off nature of this feature is overstated; especially for busy production systems. I, nor anyone in the class, had worked with the product. Consequently, I feel there are probably very few shops out there migrating their production databases to ASM. Is it more of a political battle? After all, if DBAs will be able to someday create and manage the logical volumes/file systems this might make the System Administrators feel a little encroached upon. It is just a hunch, but widespread conversions to ASM will probably not happen anytime soon. Anyone reading this blog have any good/bad experience with ASM in a production environment? I am very interested in your feedback.

The most engaging lecture of the day was the Tuning Block Space Usage. I am really keen to the Automatic Segment Space Management (ASSM) feature. This feature warrants serious consideration given the upside: free list elimination and a considerably more robust approach to reusing blocks for inserts. As much as I liked the discussion on ASSM, the subsequent topic grabbed my utmost attention: segment shrinking. What a great (and might I add way overdue) feature. If one of my production environments was on 10g today I could see using this tool to reclaim vast amounts of space in some of my very large heap tables, index-organized tables and indexes. Oracle claims that the majority of the work can be done online. Moreover, the indexes associated with your heap tables are still usable even after the row movement inherent to the SHRINK has completed. I like the idea of having the freedom to perform these "online" activities, but I still prefer to perform these kinds of operations during quite periods. The course material gives a fantastic, albeit brief, description of the mechanics. Very nice Oracle! Once again, are there any readers of this blog that have experience with this feature and want to share your experiences?

The final two lectures, Tuning PGA and Temporary Space and Performance Tuning Summary, were good, but not great. The material seemed to belabor a few points.

In summary, if you are considering taking this course I think you are best served if you do not have much 10g experience in production environments. If your experience with 10g and some of the "tuning" features is even moderate, I recommend you not take the course. Your time would be better spent reading up on this material in the Oracle documentation set.

Eric's rating of the course: B+.

Wednesday, July 26, 2006

Training Class (Day 3)

Another day of training is in the books. What was on today's menu?
  1. Tuning the Shared Pool.
  2. Tuning the Buffer Cache.
  3. Automatic Shared Memory Management.
  4. Checkpoint and Redo Tuning.

Apparently, Oracle is migrating some of its serialization protection from latches to mutexes. For example, the structures previously protected by the Library Cache Pin latch are now protected by a mutex and evidenced by the cursor:pin S wait event. Actually there are several new mutexes and mutex related wait events new to 10g. For example:

- cursor:mutex indicates mutex waits on parent cursor operations and statistic block operations.

- cursor:pin events are waits for cursor pin operations (library cache pin now protected by mutex).

There are a couple interesting facts about Oracle and mutexes. A mutex get is about 30-35 instructions, compared to 150-200 instructions for a latch get. Also, a mutex is around 16 bytes in size, compared to 112 bytes for a latch in Release 10.2 (in prior releases, it was 200 bytes).

One of the appeals of the mutex, per the documentation, is the reduced potential for false contention. That is, a mutex can protect a single structure; often times stored with the structure it protects. However, latches often protect many structures (see cache buffers chain latch) and can yield what the documentation calls false contention. It is called false contention because "the contention is for the protection mechanism rather than the target object you are attempting to access." This all sounds really great, right? Well, maybe. If Oracle goes to more widespread use of mutexes instead of latches to protect target objects that would be a boatload more mutexes. I am sure the porters at Oracle are not intending to use mutexes exclusively in the future. But, I can see where contention in Oracle could be dramatically reduced at the cost of CPU cycles and memory. What would happen if Oracle protected each buffer with a mutex? While each mutex is less expensive with regard to memory and CPU than an individual latch, you will need considerably more mutexes for each replaced latch. 50 mutexes used to replace a single latch could run the CPU up considerably for the "same" application workload.

I have one final note on mutexes. As of version 10.2.0.2 a SELECT against V$SQLSTAT and searches of child cursor lists are mutex protected.

I found the Tuning the Buffer Cache discussion somewhat interesting. Unless you have been hiding under a rock the past 4-5 years, I am sure you have heard the Oracle experts preaching the notion that ratios are not very helpful in diagnosing the health of a database. In particular, the buffer cache hit ratio is frequently tagged as meaningless. A smile came to my face when I read the following excerpt from the course material:

"A badly tuned database can still have a hit ratio of 99% or better...hit ratio is only one part in determining tuning performance...hit ratio does not determine whether a database is optimally tuned..."

Oracle is finally teaching what the experts have been saying for years!

I have been to several Hotsos events/training classes. They often talk about the need to include the buffer is pinned count statistic in the tally for logical reads. These operations are simply latch-reduced logical reads. Why doesn't Oracle integrate this information into their course material or documentation set? They still only claim that db block gets and consistent gets constitute logical reads. I monitored a process recently in one of my production environments and noticed the process did 2 buffer is pinned count logical reads for every 1 (db block gets + consistent gets). That is a substantial percentage of work owed to operations not officially categorized as a measure of work by Oracle.

Lastly, the on-topic impromptu discussions were fruitful. That always makes the training session more interesting :)

Training Class (Day 2)

The second day of training was much better than the first. I suspected it would get better based on the material to be covered. The topic set de jour was:

  1. Metrics, Alerts and Baselines.
  2. Using Statspack.
  3. Using Automatic Workload Repository.
  4. Reactive Tuning.

Having limited exposure to 10g in any true production environment, I found 75% of these topics interesting (Statspack chapter was not valuable to me). I really like what Oracle has accomplished with 10g with regard to the gathering and reporting of statistics and metrics (the rates of changes for given statistics). About 5 years ago I wrote a utility for 9i that allowed me to compare Oracle-captured statistics and wait event durations to similar reference points. This utility, I dubbed AppSnap (written in PL/SQL), captured the statistics and wait event durations each hour and calculated and stored the deltas in a separate tablespace. This permitted me to compare what is considered "typical" load to current load and evaluate the deviations rather quickly. I wrote a Unix shell script reporting tool called Instance Health that reports each hour the deltas as they relate to what I call peer hours. For example, each hour a report is generated as a text file and stored in a log directory. The most previous delta is compared to the same hour of day for the past 30 days, the same hour of day and day of week for the past 12 weeks and against all hours for the past 30 days. This has proved to be very valuable for detecting systemic anomalies after application upgrades, etc.

Okay. Now Oracle has come along with 10g and provides the same functionality (albeit not free). I appreciate the graphical conveyance of this type of analysis provided by Enterprise Manager. Shoot, Oracle even calculates the variance within the sampled timeframe for each metric. This is really cool because you can easily write a query that can ascertain if some metric is statistically anomalous (i.e. +-3 standard deviations). At first glance, some of the AWR reports are not very intuitive. But, the more you stare at them the more sense they appear to make. The Active Session History reporting is also a very nice feature (once again, not free).

If you already have considerable work experience with AWR/ASH/ADDM then this class probably won't provide you much value. The course does go into the mechanics of the data capturing and touches rather superficially on the reporting capabilities. So there is a good chance you probably have more knowledge about these products than this class affords. However, if you are like me and have yet to dig in your heels on a 10g production environment this class could serve as a very nice primer.

Well, I am off to day 3 of this training class.

Monday, July 24, 2006

Training Class (Day 1)

As I mentioned in my last post, I am attending an Oracle training class this week. The class is Oracle Database 10g: Performance Tuning. Having been to the Oracle 10g New Features for Database Administrators class last year, I was hoping for a substantially more in-depth look at 10g instance tuning in this class.

The first day was just okay for me. I learned a few things but I felt the class dragged a bit: 30 minutes getting to know each other and too many unrelated tangents or related, yet gratuitous, topic embellishments. I don't mind an occasional anecdotal deviation as it relates to the topic at hand, but those that are completely off topic really slow down the course. You know when you are reading a story and everything seems to flow nicely (proper balance of dialogue and narrative), then you run into a couple pages of narrative? It really puts the brakes on the interest.

I tend to evaluate courses on:
  1. The conveyed knowledge of the instructor.
  2. The presentation of the subject matter.
  3. The quality of the course material.
  4. The value of the impromptu discourse along the way (icing on the cake stuff).

Based on the first day, I feel the instructor has a good command of the subject matter and adds value to the class based on relevant experience. I have been to several Oracle classes where the instructor did nothing more than read the slides verbatim and/or appeared to have little relevant experience. Aside from the introductions, we spent the remainder of the day on two topics: a performance tuning overview and 10g statistics/wait events. The study material that accompanies the course is rather good; I have taken the liberty to skip ahead and get into the meat of the course material. I am looking forward to the class tomorrow as I feel we will be digging our heels in a bit more (or at least I hope).

Given the sparse amount of substantive material covered on the first day, I don't have any really interesting takeaways. I'll give the first day of class a B.

Monday, July 17, 2006

iLoveit

Well, this weekend I replaced my "archaic" Generation 4 IPod with a snazzy new Generation 5 Video IPod. I chose the black facing this time for a change of pace.


I really like the IPod product - versatile, convenient and just loads of entertainment/educational potential for my ~2 hours of public transit commute each day.

With more and more news and entertainment mediums adding video Podcasts this little device has become a cool way to pass the commuting time. I enjoy reading on my commute, but watching an interesting show or funny movie can pass the time at record clips. I won't be surprised if I ask the engineer to make another loop so I can finish watching my movie before heading into work :)

I wonder when Mr. Kyte will be Podcasting some material...

I'll be in Oracle training next week attending the Oracle Database 10g: Performance Tuning (Database) class. It should be a really good course. I'll be blogging some of the interesting takeaways from the class, so stay "tuned"!