The evil of S(E)Q(UE)L

Mountains

For they bind heavy burdens and grievous to be borne, and lay them on men's shoulders; but they themselves will not move them with one of their fingers.

Matthew 23:4, The Bible
Poem: Database Doom
You know what rhymes with SQL?
Next to hell and Save yourself

Any techie that has worked extensively with data has probably used SQL once in his lifetime. I started using SQL early in my programming days, when I worked on a record-keeping software soon after I learned C++. It was very instrumental in that project, and pretty much in any job I've worked for in my career. As long as there was data, there was SQL (in most cases).

With this foundation, you could probably see how surprised I was to come to the conclusion of what SQL truly is, a gatekeeper of data. In this blog post, I'm going to explore what this means in detail.

What is a database?

First, let's start with what a database is.

According to Wikipedia, a database is an organized collection of data. It can also mean a database management system (like SQL), though I don't believe this to be its original definition.

A database management system (DBMS) is the software system that enables users to define create, maintain and control access to the database.

As you can see, the term database and database management system can both mean 2 different things. This is where I believe the confusion comes from. We conflate what a database and a DBMS is and that could get in our way of truly understanding the role of a database, and not the software around it (the DBMS).

The problem with SQL

The main problem with SQL is that it merges the database and DBMS. This may be desired in some situations, but not all. This creates a lack of separations of concerns; SQL tries to do everything for you, leaving little room for the developer to manipulate data easily.

This all-in-one solution by SQL causes some issues:

Data is stored as binary instead of as text

Programs do not create data, people do.

Mike Gancarz, The Unix Philosophy

SQL violates tenet 5 of the UNIX philosophy:

Store data in flat text files.

In my experience, storing data in flat text files makes data handling a breeze. Anybody can take a look at the data at anytime, ANYBODY! Not a SQL expert, not a programmer, but a basic computer user can make complete sense of the data that is available.

Imagine searching through files with common Linux tools like grep and manipulating data with sed. It's much easier. And better yet, you can store data in a format that is more suited for its application. For example, you could store structured data in JSON, but key values in INI files. Life becomes a lot easier when you are in full control of your data.

Bloat

SQL violates tenet 2 of the UNIX philosophy:

Make each program do one thing well.

SQL is more than just a query language; it provides security, data validation and so many other things. Many claim that handling this yourself could be tricky, but I don't believe that to be true. These choices should be left in the hands of the developer.

It's hefty

SQL violates tenet 1 of the UNIX philosophy:

Small is beautiful.

Having to install a server and client makes running SQL a pain. Why do I have to install 2 programs just to read my data? Why do I have to run a full server just to read data made for human consumption?

It has different versions

SQLite, MySQL, PostgreSQL, MariaDB and a lot more. All of them have their nuances when it comes to the SQL language and are not really compatible with themselves.

Exported SQL needs to be built

A SQL database to be used by another SQL database first needs to be exported to SQL code. Exporting a SQL database converts the data back into SQL code. Unfortunately, this could be time-consuming if you have a large database, and it is pretty annoying that you have to convert it from one form to another, just to transfer it, only to convert it back when it needs to be used by a database.

The syntax can be hard to use for beginners

Beginners are not really familiar with how SQL works and it is not a trivial technology to learn.

The syntax can only be used in SQL environments

Only SQL programs understand SQL; humans don't read SQL naturally.

SQL databases need to be constantly updated

This should be the most scary disadvantage: if data is very old, you may have difficulty putting data into a database because of updated syntax. By keeping data security separate from the data itself, you can update security without touching your data at all.

A better alternative?

We have persistent objects, they’re called files.

Ken Thompson

A better alternative would be to keep your data as flexible as possible. In order to do that, I suggest using different technologies that do one (main) thing well, then find a way to make them all work together.

For me, I have decided to use the Linux filesystem as my database, and JSON to store most of my data. However, I could use any format that is appropriate and easy to read and query like INI or CSV.

The beauty of using the filesystem is that everyone knows how to use it. It's as easy as creating files and folders and there are a number of terminal commands that make that relatively easy.

The beauty of using JSON is that you can easily make queries to your data using jq (or any other JSON query language). Therefore you could make the same types of queries that you make in SQL, just with the added advantage of being able to manipulate or view data yourself. When you export your data, you don't need a special program to read it. Everything is ready to be used immediately.

jq also has the added benefit of piping, something that SQL lacks.

For data integrity with JSON, JSON schema can be used to ensure the data is saved properly.

Some other formats that can be used instead of SQL are plain text, XML, HTML, YAML, INI, TOML, and the list goes on. The importance of storing data as plain text cannot be overstated. This is exactly why Unix has had this as their philosophy for years, and why it scales so well in the software industry. Learn from the pros.

The bottom line

Stop using SQL if you can. Try to explore other data formats that are less bloated and completely readable.

Take the power in your own hands!

If you would like to reply to or comment on this blog post, feel free to email me at efe@mmhq.me.