Noferblatz

Stored Procedures Are Bad
(Programming)

noferblatz (02 December 2019 15:15:41)

Stored procedures are bits of code embedded in a SQL RDBMS. They operate solely on the data in the database, hopefully doing something worthwhile. For example, if you wanted to serialize the invoice numbers on invoices, you could set up a stored procedure to fetch a number from a numbers table, increment it, and use that as the next invoice number. Stored procedures were introduced into the SQL standard in 1999, and were not part of the original specification. Like many other standards, they were introduced because the various SQL makers were including this in their products, and standards often follow industry practice.

There are a variety of reasons for creating and using stored procedures, but this isn't a dissertation on why stored procedures are a good thing. In fact, it is the opposite. I believe stored procedures are bad.

I can't think of any case where the code of a stored procedure couldn't be written instead in the application language, rather than stored in the database. Using the example above, one could write a routine in PHP to query the numbers table, increment the result number, and store that in the numbers table. Whether there is an advantage in having code be in a stored procedure or an application function is a debatable question. Like I said, there are numerous reasons for preferring stored procedures.

One significant failing of stored procedures is that they run in such a way that you don't see them. I can look at any of my PHP code, and PHP will dump a trace, and I can find offending bad code. Not so with stored procedures. Their execution is not obvious. Years ago, I wrote a stored procedure to generate new invoice numbers. Then came the day I was trying to find out how my invoice numbers were getting generated. It wasn't in my code. How was it happening? Oh yeah, there's a bit if hidden code I wrote in the SQL engine to do that.

The biggest problem with stored procedures, from my perspective, is based in the fact that SQL vendors do not adhere to the SQL standard. And there's a very good reason for that: vendor lock-in.

Each implementation of SQL generally uses its own language to implement stored procedures, and such languages are unique to that vendor. There are exceptions, but this is the general rule among major vendors. This means that if you write a stored procedure in the "language" of your SQL vendor, if you ever want to move that database to a different platform, you likely won't be able to easily.

Of course, this is true for other aspects of SQL. Mainstream SQL statements generally port over easily from SQL dialect to SQL dialect. But edge cases and databases using more exotic SQL features do not port over, one for one. For example, sequential ID table columns don't share the same "CREATE TABLE" syntax, vendor to vendor.

And that's the essence of vendor lock-in. If you were Oracle, would you want some customer to decide to change to PostgreSQL, and be able to easily port their data over? No. As a result, adherence to the SQL standard is considered desirable, but optional. Having the majority of the SQL implementation "standard" leverages the skills of existing database administrators and programmers. But many major vendors have training programs which teach the subtle ins and outs of their particular dialect of SQL.

But let's say you're a SQL user. If for some reason you decide one day to move your data from MySQL to PostgreSQL, don't you want that process to be simple and easy? Of course. Vendor lock-in can make that nearly impossible.

Maybe you're not so picky about your data, and maybe you love your vendor and always will. Good on ya. But personally, I believe my data is mine, and you (SQL vendor) have no right to lock me in to your implementation. It's kind of like, if I write something in C, I expect every compiler I might use to generate code which functions as I designed it. I don't want my new compiler to barf a lung on some standard C statement in the code.

Of course, this is a large part of the reason I use Linux and LibreOffice. I object to Microsoft locking me in to its proprietary Word document format. So I use LibreOffice's vendor-independent open source document format. I object to Microsoft dictating that I have to ask them for new authorization every time I change a piece of hardware on my computer. Ergo, Linux.

I want my data to be portable. And it's not just a matter of principle. I've written numerous internal applications for my business. And at one time, I used PostgreSQL exclusively. But one day, I upgraded my operating system and installed the latest version of PostgreSQL. Suddenly, PostgreSQL couldn't work with my data. No, it wasn't my mistake. Something about the PostgreSQL install went south, and I couldn't fix it. So none of my applications would work. At that point, I did some research and realized that SQLite would be a better fit. It scales fine for small to medium size websites, and although it doesn't have all the features of most other SQL implementations, it works fine with my data. Moreover, while other varieties of SQL store their data in various files somewhere on the disk, SQLite stores data in a single file which can be copied or moved anywhere you like.

Needless to say, SQLite doesn't allow stored procedures, which is just fine with me. Don't get me wrong. When I used PostgreSQL, I did write and use a few stored procedures. But when I transitioned from PostgreSQL to SQLite, all that was gone, because SQLite didn't support them. In addition, I had to rewrite all my table creation code which utilized serialize table indexes, because PostgreSQL and SQLite don't use the same syntax there.

My point, obviously, is that in the database world, portability is king. I want as much portability as possible. The ideal is impossible to achieve, but we can get as close to it as possible. That means stored procedures, which are inherently non-portable, are out. I can rewrite any of that code in PHP at the application level. It also means I don't use some of the more exotic aspects of SQL which don't translate straight across between SQL vendors. Straight tables, no weird data types, no strange table constraints. Vanilla.

You may disagree. You may love MySQL and want to use every edge case feature of their SQL dialect. I'm fine with that. Do as you like. I'm only making my own case for portability.