Surprises in SQL – State-of-the-art options in the standard query language
And the Performance?
The main benefit of window functions is the variety of options you have with simple syntax. Window functions are also typically much faster than their purely relational equivalents. Despite the huge range of options that window functions offer, you can easily estimate the additional overhead they cause in a database. You just need to remember that most window functions are performed in two simple steps: preparation and application.
In the preparation phase, you need to make sure the data is sorted as per the PARTITION BY
and ORDER BY
clauses. In other words, the columns of the PARTITION BY
clause – if they exist – are placed in front of the ORDER BY
clause, and the data is sorted accordingly. You can omit this sorting if the data is already sorted in this order. The data could be in order if it was sorted previously – for example, for a different window function – or, as in the previous example, by a global ORDER BY
clause. Sorting is also unnecessary if the data comes from an index with appropriate sorting.
The additional overhead of the preparatory phase is thus mainly equivalent to the overhead for an ORDER BY
clause for each different OVER
clause. Identical OVER
clauses do not cause this overhead multiple times. The previous example requires no preparatory overhead, because the global ORDER BY
clause has already sorted the data as required in advance.
After this preparation, the individual window functions are then executed. This overhead can be roughly compared with normal use of aggregate functions with GROUP BY
, with the exception that the function is executed for each row in the worst case. Of course, each SQL vendor does its best to avoid this worst case.
Time Travel with SQL:2011
The final trailblazing SQL feature I'll examine in this article is temporal and bi-temporal tables. Although the term temporal itself does not give you many clues, this feature encompasses functions that everyone has already used. Generally speaking, you can use temporal and bi-temporal tables to make data changes traceable. Consider a customer database: how does the user handle the fact that a customer has changed their name? Is it sufficient to overwrite the name with a simple UPDATE
? Or do you need to be able to see that the name has changed? Where traceability was required, users previously had to take care of this themselves. That is, mark the old row in the customer database as deprecated and create a new row with the new name.
SQL:2011 now offers time-versioned tables. An UPDATE
of this kind of table would look something like the following:
UPDATE customers FOR PORTION OF validity FROM DATE'2015-01-01' TO DATE'9999-12-31' SET name = 'New name' WHERE id = 123
The user defines in the UPDATE
– immediately after the table name – the period of time for which the new name is valid. To allow this to happen, the table has to be prepared appropriately by creating two columns for the start and end times and grouping the data to create an application time period (validity
in this example). SQL does all the rest. Whatever you need to achieve, the desired state is done – for example, an UPDATE
of the last valid row and an INSERT
for the new name.
SQL:2011 naturally offers everything else you need to handle these tables, including constraints that understand the time concept (WITHOUT OVERLAPS
), but also queries with a specific time:
SELECT * FROM customers WHERE id = 123 AND validity CONTAINS DATE '2015-12-01'
This query returns the new name, because SQL periods always include the start time and exclude the end time.
Application time versioning is designed to reflect reality. For instance, the valid start for a new name is the day of the wedding – not the time when the new name is entered on the system. The application can choose validity
arbitrarily – which explains the "application time" label.
Validity periods are different for system-versioned tables. Once a table is system versioned by means of appropriate ALTER TABLE
instructions, the database handles versioning in a completely transparent way. All SELECT
, UPDATE
, … instructions still continue to work unchanged; you don't need to modify the application. However, the application cannot choose the validity periods itself. Instead, the database always uses the current system time.
All INSERT
, UPDATE
, and DELETE
instructions always take immediate effect. System versioning thus does not reflect reality but always references the time at which the change was made in the database.
Both types of versioning – application time or system versioning – are optional and can be used independently of one another. It is also possible to apply both kinds of versioning to a single table at the same time; however, this approach creates bi-temporal tables. Application time versioning would use the wedding date as the date for the change, but system versioning uses the time at which the change was entered on the system, which could be weeks later. (Probably very few newlyweds inform their banks of a name change just one day after the ceremony.)
Which Databases?
If you try out one or more of the functions described in this article, you might discover they don't work! Not all SQL databases are state-of-the-art. To be more precise: the SQL standard only mandates a couple of functions. Most of the functions in the standard – including all the functions described in this article – are optional. Additionally, many manufacturers still regard SQL:92 as the definitive statement on SQL. Just imagine this headline for comparison: "Leading NoSQL database now also supports Windows 3.1!"
These modern SQL functions are useful, even though just a few databases (including some open source databases) support them. If you work with SQL on a regular basis, it is worth your while to learn these state-of-the-art SQL features so you can select a database that supports the functionality when you need it. Figure 3 shows some SQL database systems that support the functionality described in this article.
« Previous 1 2
Buy this article as PDF
(incl. VAT)
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
News
-
TUXEDO Computers Unveils Linux Laptop Featuring AMD Ryzen CPU
This latest release is the first laptop to include the new CPU from Ryzen and Linux preinstalled.
-
XZ Gets the All-Clear
The back door xz vulnerability has been officially reverted for Fedora 40 and versions 38 and 39 were never affected.
-
Canonical Collaborates with Qualcomm on New Venture
This new joint effort is geared toward bringing Ubuntu and Ubuntu Core to Qualcomm-powered devices.
-
Kodi 21.0 Open-Source Entertainment Hub Released
After a year of development, the award-winning Kodi cross-platform, media center software is now available with many new additions and improvements.
-
Linux Usage Increases in Two Key Areas
If market share is your thing, you'll be happy to know that Linux is on the rise in two areas that, if they keep climbing, could have serious meaning for Linux's future.
-
Vulnerability Discovered in xz Libraries
An urgent alert for Fedora 40 has been posted and users should pay attention.
-
Canonical Bumps LTS Support to 12 years
If you're worried that your Ubuntu LTS release won't be supported long enough to last, Canonical has a surprise for you in the form of 12 years of security coverage.
-
Fedora 40 Beta Released Soon
With the official release of Fedora 40 coming in April, it's almost time to download the beta and see what's new.
-
New Pentesting Distribution to Compete with Kali Linux
SnoopGod is now available for your testing needs
-
Juno Computers Launches Another Linux Laptop
If you're looking for a powerhouse laptop that runs Ubuntu, the Juno Computers Neptune 17 v6 should be on your radar.