SQL currently forces two computer programs to discuss what they want to do in an English-based textual representation (wtf?), and both programs jump through hoops to adapt to that.

The database itself uses lexers, parsers, optimizers and interpreters to understand and execute the requested SQL statements. The consumer application uses DBALs, ActiveRecords, ORMs or other means of SQL statement construction to convey those statements to the database.
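
To make the application side of this concrete, here is a minimal sketch of those hoops, assuming Python with SQLAlchemy 1.4+ purely as an illustration: the query already exists as a structured, native object graph inside the application, and is then flattened into SQL text that the server will have to lex and parse all over again.

```python
# A minimal sketch, assuming Python with SQLAlchemy 1.4+ installed.
# The application already holds the query as a structured expression tree,
# then serializes it into English-like text for the server to re-parse.
from sqlalchemy import column, select, table

# Lightweight table/column handles (no database connection needed here).
users = table("users", column("id"), column("name"))

# Build the query as a native object graph...
stmt = select(users.c.name).where(users.c.id == 42)

# ...only to flatten it back into SQL text destined for the server's parser.
print(stmt)
# SELECT users.name
# FROM users
# WHERE users.id = :id_1
```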

The fact that two computer programs have to talk to each other in a language foreign to them, and that entire third-party projects are dedicated to helping them cross this communication boundary more easily, should give us enough pause (once again, wtf?).

Some of the disadvantages of forcing two natively binary machines to speak broken English:

  • computational overhead on both sides — generating queries on one end and parsing them on the other
  • representational overhead on both sides — data communicated between the two entities is needlessly converted into text (including integers, floats, enums, sets) or is at least syntactically escaped (strings, blobs)
  • structural duplication — a logical entity is declared once in the application logic and once as a (set of) table(s) in the database
  • added vulnerabilities — the entire spectrum of SQL injection attacks is caused by the textual nature of the communication (demonstrated in the sketch after this list)
  • feature restriction — new features need to be added to the database server logic, to the consumer application logic and to the SQL syntax itself (buzzwords: standardization, vendor acceptance, uniformity), which then ripples into implementations at the DBAL, ActiveRecord or ORM level before the consumer application logic can access them
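
As a quick illustration of the injection point above, here is a minimal sketch using Python's built-in sqlite3 module. Nothing here is specific to SQLite; the point is that the vulnerability exists only because data and query logic share the same piece of text, and the parameterized form avoids it precisely by keeping the value out of the SQL text.

```python
# A minimal sketch, assuming Python's built-in sqlite3 module.
# The vulnerability exists only because the query is assembled as text:
# data and code share one string, so data can be mistaken for code.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

user_input = "1 OR 1=1"  # attacker-controlled value

# Textual construction: the value is rendered into the SQL text,
# so the attacker's payload becomes part of the query's logic.
unsafe = conn.execute(f"SELECT name FROM users WHERE id = {user_input}")
print(unsafe.fetchall())  # [('alice',), ('bob',)] -- every row leaks

# Parameterized form: the value travels outside the SQL text,
# so it can only ever be treated as data.
safe = conn.execute("SELECT name FROM users WHERE id = ?", (user_input,))
print(safe.fetchall())    # [] -- "1 OR 1=1" is just a non-matching value
```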

SEQUEL, the “Structured English Query Language”, was born in the 1970s and was used directly by (not necessarily technical) humans, which is why it was English-based. SQL is now a mature technology provided by many vendors with very high performance, which made it a de facto standard for every data-driven development project — a classic case of exaptation.

The very same technology backbone without the “English” part and the “textual” part would mean:

  1. Less code to maintain — less effort and fewer errors
  2. Less market fragmentation from vendors in terms of behavior — or at least in terms of syntax
  3. No need for DBALs, ActiveRecords or ORMs — but complex client libraries and APIs would be needed to expose the same functionality (a hypothetical sketch of such an API follows this list)
  4. Deduplication of structural code — consumer applications would provide the authoritative structure of database objects
  5. Access to the entire consumer application library in the database — complex mathematical or logical functions sometimes need to be reimplemented at the database level in the (more limited) SQL or PL/SQL languages even when they are already available at the application level; e.g. hashing functions, geo-positioning arithmetic, complex string manipulation, output format manipulation etc.
  6. More permeability of features between the database server and the consumer application — for example, C-level datatypes such as structs, enums and unions with all their nested structuring and functionality, which are currently not available in MySQL (unions, nested structs) but are partially available in PostgreSQL
  7. Added permeability allows shifting responsibilities when necessary — moving some of the processing from an overburdened database server to an easier-to-scale application level
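
To illustrate points 3 and 4, here is a purely hypothetical sketch of what a non-textual client API could look like. Every name in it (BinaryStore, ensure_schema, insert, find) is invented for illustration and does not refer to any existing library; the idea is simply that the application's own type declaration becomes the authoritative schema and queries become native calls instead of strings destined for a parser.

```python
# A purely hypothetical sketch -- the BinaryStore interface and every method
# on it are invented for illustration; no such library is implied to exist.
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass
class User:
    # The application-side declaration is the single, authoritative
    # definition of the stored structure (point 4: no duplicate DDL).
    id: int
    name: str
    email: str


class BinaryStore(Protocol):
    """What a non-textual database client could expose instead of SQL text."""

    def ensure_schema(self, entity: type) -> None:
        """Derive and synchronize the storage layout from the dataclass itself."""
        ...

    def insert(self, row: object) -> None:
        """Ship the row in its native binary layout; no quoting, no escaping."""
        ...

    def find(self, entity: type, **equals: object) -> Iterable[object]:
        """Filters expressed as structured arguments, never as parseable text."""
        ...


# Intended usage against any implementation of this hypothetical interface:
#   store.ensure_schema(User)
#   store.insert(User(id=1, name="alice", email="alice@example.com"))
#   rows = store.find(User, name="alice")
```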