Steve Callan, firstname.lastname@example.org
Is Oracle always the best tool or application to manipulate data? The answer lies within what you mean by manipulate. Data can be transformed via data manipulation language, and it can also be transformed via string or numerical manipulation. To distinguish between the types of manipulation, let’s say that stored data is in the purview of DML, and output or displayed data is what or how stored data is presented to the user. A simple example of this is a date. What is stored can be forced to display in numerous ways. March 8th, 2007 is just as valid an output as 03/08/2007; it just depends on what you or your users want to see.
The point of this article concerns how you get from what is stored to what is displayed. The construct for this point revolves around a string/numeric manipulation problem. Suppose you have an application that supports surveys or offers a coupon to users via the Internet. After clicking Submit on an online survey form, it is very common for your identifier, whether it is a survey ID or an account number, to be passed back as one of many name-value pairs in a URL.
Another use of this “pass back data” transport mechanism is to validate or limit your use of a promotional coupon. Coupons, whether paper or electronic in form, are typically represented in a numeric format, and quite often, as very large numbers. The very large number aspect makes it hard to counterfeit or fake numbers since the density of numbers can be made to be sparse. Issue one million coupons with barcode numbers ranging from 1,000,000 to 1,999,999 and you have a density of one. Using a range of 77,000,000,000 to 78,000,000,000 and finding a valid coupon number just became the equivalent of searching for a needle in the haystack.
The problem or constraint, however, is that passing back a 22-digit barcode number consumes valuable space or length in the URL (using the 255 or 256 character limit). What we need is a way to shorten up the big number into something a bit smaller/shorter. The method is what leads us into the “how” of how are we going to accomplish this.
One method is to change the base of the number. If you take a base-10 number and convert it into a base-36 number, you save some space. Using Oracle, how would you perform the conversion?
A PL/SQL approach
Let’s look at a PL/SQL approach first. The code has been optimized a bit, in that the factorization has a jump-start in terms of position or iterations already taken into account. It is also based on the input being close to at least 22 digits long.
A different version of this has some output attached to it so you can see how the number is reduced.
This isn’t meant to be security through obscurity, although a relatively strange looking string will discourage most users from trying to figure out what it represents.
One at a time, the data manipulation performance is okay, but how long would it take to generate a million converted values? Let’s make a table with three columns. The first is the base-10 number, the second is the base-36 converted/manipulated value using the PL/SQL code above, and the third is reserved for the output using a different means to manipulate the numbers. Create the table and populate it with a million records. Turn timing on and see how long it takes to generate the base-36 values.
The rate works out to be just over 500 updates per second.
Might there be a faster way outside of PL/SQL to accomplish this task? This problem is a good example of when to use a better language, so to speak, than PL/SQL to transform data. A better language or approach in this case is to manipulate the string using Java and a built-in method (toString).
How do you get Java code into a database?
First, we need to have a java file with the applicable Java code in it. Second, is to upload the code (either the source or compiled version) into the database. Third is to compile the code, and then the last step is to create a wrapper function or procedure around the Java code. The function calls the Oracle-named object, which in turn calls the Java-named object.
The source file/Java code is pretty straightforward.
Load the source file and compile it.
Create a wrapper function/publish the class.
Now we’re ready to test the difference.
The rate above works out to almost 1200 per second, well more than twice as fast as the PL/SQL version. Why is that? Oracle acknowledges that PL/SQL isn’t the fastest language in the world or the best at string manipulation.
A more formal list of steps is shown in the section titled “Java Stored Procedure Steps” in the Java Developer’s Guide.
PL/SQL can do many things well, but other languages can do some things much better, with “better” being measured in terms of speed. If you find that stored data must be manipulated for display type output, don’t be afraid to try other languages supported by Oracle.