I am continuing with presenting different solutions for interval queries in SQL Server. For an introduction, please refer to the blog post Interval Queries in SQL Server Part 1. Note that you also need to read an excellent article by Itzik Ben-Gan wrote on interval queries in SQL Server (http://sqlmag.com/t-sql/sql-server-interval-queries) by using the Relational Interval Tree model. I am using the tables and data Itzik has prepared. In order to test the solutions, you can download the code from Itzik’s article by using the link in this paragraph.
The second solution in addition to Itzik’s two RI tree solutions and to the classical solution was introduced to me by my friend and distinguished SQL Server MVP Davide Mauri. Therefore, the merits for this post go to him.
Davide found a way to use the existing SQL Server infrastructure. Although there is no temporal data support, spatial data support exists in SQL Server from version 2008, if I remember correctly. Besides two spatial data types, the geometry and geography types, SQL Server also features specific spatial indexes. Both data types include a method called STIntersects, which returns 1 if a spatial type (geometry or geography) instance intersects another spatial type instance and returns 0 if it does not. If you remember, Albert Einstein actually unified space and time; Davide did the same, therefore we can call him ZweisteinJ
Without further delays, let’s create and populate the table that uses the geometry data type to store the intervals. Here is the creation and population code that also creates a regular primary key and a spatial index.
— SLOW! (20+ min!)
‘ 0, ‘
‘ 0)’, 0)
‘ 0)’, 0)
— ~ 1 min
— ~ 3 min
WITH (BOUNDING_BOX =
(0, 0, 10000000,1));
Without further delays, let’s create and populate the table that uses the geometry data type to store the intervals. Here is the creation and population code that also creates a regular primary key and a spatial index. From the comments in the code, you can notice that the table population is pretty slow. It took more than 20 minutes on my computer. Nevertheless, remember I am inserting 10,000,000 rows in the table; for a production system, 10,000,000 rows in 20 minutes might be completely adequate performance.
Time for querying the table. I am showing here the query that selects the same rows from the middle of the data as the query in my previous article and as the queries in Itzik’s article.
— middle of data
(geometry::STGeomFromText(‘LINESTRING (5000000 0, 5000020 0)’, 0))
— logical reads: 396 (284 table + 112 index), CPU time: 16 ms
The performance is slightly worse that you can get with the RI tree solution, or with the solution I introduced in the part 1 of this blog series. In addition, the insert performance is much worse. Still, this is a viable solution, because it is very simple and uses only existing SQL Server features, without complex queries and mathematics. I also like Davide’s Italo-Balkan way of thinking. We are actually very good at NOT using things for what they are intended for
- Python for SQL Server Specialists Part 4: Python and SQL Server - April 24, 2018
- Python for SQL Server Specialists Part 3: Graphs and Machine Learning - April 11, 2018
- Python for SQL Server Specialists Part 2: Working with Data - March 22, 2018