Skyline Query in T-SQL

January 5, 2012 by
Filed under: Programming, SQL 

Here is the sample database schema as an example to illustrate the procedure and process of implementing a Skyline Query that works with Microsoft Sql Server, this should work with all SQL Server that support nested queries. 

 
 
CREATE TABLE [dbo].[Housing](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [DistanceFromWork] [numeric](13, 4) NOT NULL,
    [Price] [money] NOT NULL,
    [SquareFootage] [int] NOT NULL,
    [Name] [nvarchar](50) NOT NULL,
    [City] [nchar](10) NULL,
 CONSTRAINT [PK_Housing] PRIMARY KEY CLUSTERED 
(
    [ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, 
ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
 
 
SET IDENTITY_INSERT [dbo].[Housing] ON
INSERT [dbo].[Housing] ([ID], [DistanceFromWork], [Price], [SquareFootage], [Name], [City]) 
VALUES (1, CAST(1.2000 AS Numeric(13, 4)), 320394.0000, 1600, N'Brooke', N'Boston    ')
INSERT [dbo].[Housing] ([ID], [DistanceFromWork], [Price], [SquareFootage], [Name], [City]) 
VALUES (2, CAST(8.8000 AS Numeric(13, 4)), 179000.0000, 1924, N'Mica', N'Boston    ')
INSERT [dbo].[Housing] ([ID], [DistanceFromWork], [Price], [SquareFootage], [Name], [City]) 
VALUES (3, CAST(1.1000 AS Numeric(13, 4)), 280000.0000, 1800, N'Blackbone', N'Boston    ')
INSERT [dbo].[Housing] ([ID], [DistanceFromWork], [Price], [SquareFootage], [Name], [City]) 
VALUES (4, CAST(12.2000 AS Numeric(13, 4)), 180000.0000, 1923, N'Vienna', N'Boston    ')
INSERT [dbo].[Housing] ([ID], [DistanceFromWork], [Price], [SquareFootage], [Name], [City]) 
VALUES (5, CAST(0.1000 AS Numeric(13, 4)), 500000.0000, 1200, N'Zappod', N'Boston    ')
INSERT [dbo].[Housing] ([ID], [DistanceFromWork], [Price], [SquareFootage], [Name], [City]) 
VALUES (6, CAST(14.2000 AS Numeric(13, 4)), 88000.0000, 3200, N'Rancher', N'Boston    ')
INSERT [dbo].[Housing] ([ID], [DistanceFromWork], [Price], [SquareFootage], [Name], [City]) 
VALUES (7, CAST(1.3000 AS Numeric(13, 4)), 310000.0000, 2300, N'Joomba', N'Boston    ')
INSERT [dbo].[Housing] ([ID], [DistanceFromWork], [Price], [SquareFootage], [Name], [City]) 
VALUES (8, CAST(0.9000 AS Numeric(13, 4)), 312222.0000, 1900, N'Dumbo', N'Boston    ')
SET IDENTITY_INSERT [dbo].[Housing] OFF
 

Here is the query that will do a three-dimensional skyline query here is an output before and after.

------listing for compare------
SELECT *
  FROM [Test].[dbo].[Housing]
order by distancefromwork, price
 
--------optimization query---------
 
    SELECT *
FROM [Housing] h
WHERE h.city = N'Boston' AND NOT EXISTS(
SELECT *
FROM [Housing] h1
WHERE h1.city = N'Boston' AND h1.DistanceFromWork <= h.DistanceFromWork AND
h1.price <= h.price AND  h1.SquareFootage >= h.SquareFootage and
(h1.DistanceFromWork < h.DistanceFromWork OR h1.price < h.price OR h1.squarefootage > h.squarefootage))
order by distancefromwork, price 

 

The performance of this is not very good as it cannot be easily optimized since you are nesting queries to compare and remove other records that are dominated by the ones that you care about. 

BEFORE

ID    DistanceFromWork    Price    SquareFootage    Name    City
5    0.1000    500000.00    1200    Zappod    Boston    
8    0.9000    312222.00    1900    Dumbo    Boston    
3    1.1000    280000.00    1800    Blackbone    Boston    
1    1.2000    320394.00    1600    Brooke    Boston    
7    1.3000    310000.00    2300    Joomba    Boston    
2    8.8000    179000.00    1924    Mica    Boston    
4    12.2000    180000.00    1923    Vienna    Boston    
6    14.2000    88000.00    3200    Rancher    Boston    

 

AFTER

ID    DistanceFromWork    Price    SquareFootage    Name    City
5    0.1000    500000.00    1200    Zappod    Boston    
8    0.9000    312222.00    1900    Dumbo    Boston    
3    1.1000    280000.00    1800    Blackbone    Boston    
7    1.3000    310000.00    2300    Joomba    Boston    
2    8.8000    179000.00    1924    Mica    Boston    
6    14.2000    88000.00    3200    Rancher    Boston    

As you can see it removed item # 1 and #4, as these were dominated by other properties in the system.

 

I will explore this in the future to look at other methods for achieving similar results, getting what you really want out of a system.

Comments

Tell me what you're thinking...
and oh, if you want a pic to show with your comment, go get a gravatar!





%d bloggers like this: