Thursday, March 09, 2006

Hotsos 2006: Day 3

I started off the day with a pastry, a glass of orange juice and an hour of Tom Kyte. Not a bad way to start a day, now is it? Tom gave a presentation on the performance impact of the LOG ERRORS clause in 10gR2. This feature of Oracle is really nice. As Tom said, "It will change the way we extract, transform and load data in the future." I totally agree. The gist of this new feature is the ability to log error messages (and other relevant row data) for the failed rows in a multi-row DML statement. Much like the .bad file in SQL*Loader gives us the ability to record failed rows outside the database, the LOG ERRORS clause permits us to store essentially the same data in an "error" table inside it. As of 10gR2, our multi-row DML operations no longer need to be an "all or nothing" proposition; it is perfectly permissible for a pre-configured number of rows to fail. Moreover, the logging of errors is done in an autonomous transaction, so even if you roll back the DML the error data persists in the "error" table.
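To make that concrete, here is a minimal sketch of the kind of load Tom was describing. The SALES and SALES_STAGING tables are placeholders of mine, not anything from his examples; the error table is created with the supplied DBMS_ERRLOG package and then named in the LOG ERRORS clause:

  -- create the error table (defaults to ERR$_SALES)
  EXEC DBMS_ERRLOG.CREATE_ERROR_LOG('SALES')

  -- load everything that is clean; log the failures instead of aborting
  INSERT INTO sales
  SELECT *
    FROM sales_staging
     LOG ERRORS INTO err$_sales ('nightly load')
  REJECT LIMIT UNLIMITED;

Rows that would have killed the whole statement land in ERR$_SALES along with the Oracle error number and message, and the insert simply carries on.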

As for the performance impact of the LOG ERRORS clause, it really varies and is a function of the number of errors encountered and of the path (conventional or direct) taken by the statement. In general, DML error logging should use direct path processing, and it is a very safe bet when a high percentage of your data is clean. If you are running 10gR2 this is definitely an option to seriously consider for your data loads.
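For reference, the direct path flavour is the same statement with an APPEND hint; again, the table names are just my placeholders:

  INSERT /*+ APPEND */ INTO sales
  SELECT *
    FROM sales_staging
     LOG ERRORS INTO err$_sales ('direct path load')
  REJECT LIMIT 50;

Here only 50 bad rows are tolerated before the statement gives up, though the errors already logged still persist.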

For me there was one unresolved issue with using this option. Based on Tom's testing, updates that use the LOG ERRORS clause generate nearly twice the redo and nearly twice the undo when compared to a variety of clunkier "coding" alternatives that produce the same result. Naturally, I will need to determine why this is the case :) When I get the answer I will create a follow-on post. Like I always say, Oracle cannot hide the nature of the redo!
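When I do dig in, the measurement side is at least straightforward. A sketch of the before-and-after check I have in mind (the statistic names come from v$statname; nothing here is specific to Tom's test cases):

  SELECT n.name, s.value
    FROM v$mystat s, v$statname n
   WHERE s.statistic# = n.statistic#
     AND n.name IN ('redo size', 'undo change vector size');

  -- run the UPDATE ... LOG ERRORS statement, re-run the query,
  -- and compare the deltas against the same UPDATE without error logging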

In the afternoon Doug Burns gave another very good presentation titled "How Many Slaves? (Parallel Execution and the Magic of Two)." While I haven't digested the entire paper that accompanies the presentation, it is very obvious Doug has done his homework. What I truly appreciate about his paper and presentation material is the rigor he has imposed on his investigations. Doug will be the first to tell you that countless more testing scenarios can be envisioned and that your results will be platform-specific; the key is to know your system and to understand that choosing a degree of parallelism is not an exact science. I encourage you to read his paper, because the details are extremely interesting. And remember, just because your system lets you run at a higher degree of parallelism to improve response time, that is not a license to abuse it. Everything has a cost. We should carefully evaluate our "need" for fast throughput on the jobs we deem worthy of parallelizing, because we could easily starve other deserving processes of CPU.
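For what it is worth, the mechanics of requesting a particular degree are trivial; the judgment lies entirely in choosing the number. A small sketch against a hypothetical BIG_TABLE of mine:

  -- request a modest, explicit degree of parallelism for a single query
  SELECT /*+ PARALLEL(t, 2) */ COUNT(*)
    FROM big_table t;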

Tomorrow, Jonathan Lewis is giving a training day on "Understanding and Assisting the Cost Based Optimizer." I am sure this will be a terrific ending to a fantastic week of learning from some of the world's best Oracle minds.
