MySQL versus PostgreSQL (Information Technology)

Paul M. Foster (12 February 2012 01:07:51 Revised )

In this post, I'm going to discourage you from using MySQL, in favor of PostgreSQL.

I originally named this post "MySQL Sucks". But there are too many reasons to say that, many of them out of date. Like all FOSS (Free and Open Source Software) projects, MySQL is a moving target. Things improve and bugs emerge and get fixed all the time. So it's hard to make a statement about a particular piece of software in this area and have it be true for a long time. But in the case of MySQL, there are longer-term reasons why you'd want to avoid its use, and instead prefer the less popular FOSS DBMS, PostgreSQL.

First off, understand that in the FOSS database world, there are three major choices: MySQL, PostgreSQL and SQLite. There are other databases out there, but these garner the majority of use. I'm excluding SQLite from the present comparison because, while it is exceptional for some uses, even its author warns that it is a niche DBMS, only suitable in certain limited types of application. So our present comparison is between MySQL and PostgreSQL.

MySQL was originally developed by programmers. PostgreSQL was originally developed by database administrators (DBAs). This makes a difference because MySQL was originally built for speed at the expense of data integrity. PostgreSQL had data integrity built into it from the beginning.

What do I mean by "data integrity"? Mostly two things:

  1. Referential Integrity. This means that foreign key specifications are honored and enforced by the DBMS. Where referential integrity is eschewed, the burdon of enforcing foreign key constraints and the like falls to the application, or the programs which use the DBMS, and is not enforced by the database.
  2. Transactional Integrity. This means that, where a combination of multiple inserts or updates are required to complete a transaction, their full completion is enforced by the database. For example, with the creation of an invoice involving a header record and several detail records. If transactional integrity isn't important to the DBMS, it's up to you to ensure that all the detail and header record inserts actually go through. By contrast, true transactional integrity, where honored by the database, means that the database will not allow any of the inserts to proceed unless the all can be completed properly. It will also undo them, if for some reason, proceeding will violate other constraints on the tables involved.

In the case of referential and transactional integrity, PostgreSQL was originally built with these things in mind. MySQL was not. In MySQL, they were originally the responsibility of the application programmer, not the database. In PostgreSQL, the reverse was true.

(It's worth noting also that, because MySQL was built by programmers rather than DBAs, its adherence to SQL standards has been less strict and complete than with PostgreSQL.)

Now, this situation has changed somewhat over time. MySQL has what are called "storage engines". This means that the part of MySQL which ultimately stores and manages the actual data on disk is a separate piece of software from the part that carries out queries and interprets the SQL language. MySQL has several storage engines available to it. The most popular of these engines used to be MyISAM, which had the same lack of emphasis on data integrity as MySQL itself. However, over time, another storage engine has gained popularity, InnoDB. InnoDB does enforce data integrity, and is, at present, the only MySQL storage engine which does. PostgreSQL has no such thing as a separate storage engine.

InnoDB was developed as a separate product from MySQL, by different programmers in a different company. This meant that, while it might be used by MySQL for a given table/database, the storage engine itself was not owned or controlled by MySQL, even though it was, like MySQL, licensed under the GPL (GNU Public License). As it turns out, InnoDB was ultimately acquired by Oracle.

Perhaps not by coincidence, MySQL was acquired by Sun, which was ultimately acquired by Oracle, where it sits today. Meaning that Oracle now owns both MySQL and InnoDB. But Oracle also owns the Oracle database product.

It's worth considering what happens when a company acquires another company with a competng product. Adobe all but abandoned work on PageMaker and developed its own in-house competing product, InDesign. Same with Freehand, which competed directly with Adobe's Illustrator product. Microsoft recently announced "end-of-life" for FoxPro, as it has its own database product.

It's not hard to imagine a likely scenario where, at some point, Oracle will "end-of-life" MySQL. Fortunately, as soon as MySQL was acquired by Oracle, forks were started in the FOSS community. So when Oracle decides MySQL's time has come, one might perhaps be able to find a forked software product available. The question is, how much different will it be, and how difficult will the transition from MySQL to the forked database be.

Meantime, PostgreSQL is not owned by any company, nor is it likely to be.

Lastly, there is the perpetual question of speed. MySQL has long been touted for its speed of operation, particularly as compared to PostgreSQL. But benchmarks always depend on who ran them and the environment in which they ran. MySQL was always designed to run fast on one-off, single-file queries. For any given benchmark, it may or may not run faster on such queries than PostgreSQL. The two are usually closely matched in this area. Where MySQL dogs it is in multi-table, multiple-join queries. In this respect, PostgreSQL is generally the superior product and always has been.

In addition, there are the one-off odd complaints about MySQL. For example, it has no true boolean type. Booleans are stored as integers, and so it's possible to store a number like 27 in MySQL's "boolean" type. It also has peculiar non-standard rules for quoting identifiers and string field constants. However, these are generally minor inconveniences and may change over time. In some cases, PostgreSQL has similar quirks, like folding all identifiers into lower case, unless quoted (SQL standard demands they be folded into upper case, rather than lower case).

In summary, if you want a database which enforces data integrity, adheres relatively closely to SQL standards, is unencumbered by corporate ownership (and whims), and has reasonable speed under a wide range of query scenarios, your proper choice is PostgreSQL, not MySQL.