Optimizing Oracle stored procedures

Ask Time:2012-02-03T21:12:22         Author:bjsample

I was recently tasked with optimizing some existing Oracle stored procedures. Each of the stored procedures queries the database and generates XML file output. One in particular was taking about 20 minutes to finish execution. Looking at it, I found several nested loops and unnecessary queries. For example, rather than doing a

SELECT * from Employee e, Department d WHERE e.DEPT_ID = d.ID
--write data from query to XML

it was more like

FOR emp_rec in ( SELECT * from employee )
LOOP
   SELECT * from Department WHERE id = emp_rec.DEPT_ID;
   --write data from query to XML
END LOOP;

Changing all these cases to look more like the first option sped up the procedures immensely. My question is why? Why is doing a join in the select query quicker than manually combining the tables? What are the underlying processes?
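
The actual procedures are not shown, so the following is only a minimal sketch of what a rewritten loop along the lines of the first option might look like. The columns emp_name and dept_name are hypothetical, and DBMS_OUTPUT.PUT_LINE stands in for whatever code writes the XML. The select list is spelled out with distinct names because a cursor FOR loop record cannot hold two columns with the same name (both tables may well have an id column).

```sql
-- Sketch only: emp_name and dept_name are hypothetical columns, and
-- DBMS_OUTPUT stands in for the real XML-writing code.
BEGIN
  -- One cursor over the joined result replaces the per-employee lookup.
  FOR rec IN (
    SELECT e.emp_name, d.dept_name
    FROM   employee e
    JOIN   department d ON e.dept_id = d.id
  )
  LOOP
    -- Each iteration already has both the employee and department data.
    DBMS_OUTPUT.PUT_LINE(rec.emp_name || ' works in ' || rec.dept_name);
  END LOOP;
END;
/
```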

Author: bjsample. Reproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/9129201/optimizing-oracle-stored-procedures
Dave Costa :

Let's look at how the original version is likely to be processed.

FOR emp_rec in ( SELECT * from employee )
LOOP
   SELECT * from Department WHERE id = emp_rec.DEPT_ID;
   --write data from query to XML
END LOOP;

The loop query is likely to do a full table scan on employee. Then, for each row returned, it will execute the inner query. Assuming that id is the primary key of department, each execution of the query is likely to do a unique lookup using the primary key index.

Sounds great, right? Unique index lookups are usually the fastest way to get a single row (except for explicit lookup by ROWID). But think about what this is doing over multiple iterations of the loop. Presumably, every employee belongs to a department; every department has employees; and most or all departments have multiple employees.

So on multiple iterations of the loop, you're repeating the exact same work for the inner query multiple times. Yes, the data blocks may be cached so you don't have to repeat physical reads, but accessing data in the cache does have some CPU overhead, which can become very significant when the same blocks are accessed over and over again.

Furthermore, ultimately you will likely want every row in department at least once, and probably more than once. Since every single block in the table will need to be read, you're not really saving work by doing an index lookup -- you're adding work.

When you rewrite the loop as a single query, the optimizer is able to take this into account. One option it has would be to do a nested loop join driven by employee, which would be essentially the same as the explicit loop in PL/SQL (minus the context switching as pointed out by Mark). However, given the relationships between the two tables, and the lack of any filtering predicate, the optimizer will be able to tell that it's more efficient to simply full-scan both tables and do a merge or hash join. This actually results in fewer physical IOs (assuming a clean cache at the start of each execution) and far fewer logical IOs.
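
One way to check which join method the optimizer actually picks is to ask for the execution plan. The sketch below assumes the employee and department tables from the question exist in the current schema. The second statement uses the LEADING and USE_NL hints to push the optimizer toward a nested loop join driven by employee, which roughly mirrors what the hand-written PL/SQL loop was doing, so the two plans can be compared side by side.

```sql
-- Assumes the employee and department tables from the question exist.
-- Plan for the single-query join; the optimizer picks the join method.
EXPLAIN PLAN FOR
SELECT *
FROM   employee e
JOIN   department d ON e.dept_id = d.id;

-- Show the plan that was just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Same query, hinted toward a nested loop join driven by employee,
-- which approximates what the PL/SQL loop was doing by hand.
EXPLAIN PLAN FOR
SELECT /*+ LEADING(e) USE_NL(d) */ *
FROM   employee e
JOIN   department d ON e.dept_id = d.id;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The plan chosen for the unhinted query depends on table sizes, indexes, and statistics, so it may not be a hash join on every system. To compare logical and physical IO rather than just the plan shape, running both queries in SQL*Plus with SET AUTOTRACE TRACEONLY STATISTICS reports figures such as consistent gets and physical reads for each execution.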