Skyline Query

January 4, 2012 by
Filed under: Design, Programming 

A Skyline query is one of the most useful but least obvious data programming patterns.  When you have a set of data and a query with 2 constraints that are conflicting, the output of this is the “Skyline” Query.  If you want to thing about what skyline means and why it describes this issue take the example of searching for cars to drive, and lets say you want the biggest car with the best gas mileage.  These two constraints are conflicting since as cars get bigger, gas mileage usually goes down, but a skyline query essentially runs this query across a set of data and then grabs the tops of the skylines. This would take a cluster of data and grab the top 2 or 3 in that set, then move along to find the next set, this way you can give the user choices without having to come up with an artificial weight or

 

A more realistic example would be searching for a house that is the cheapest cost, that is the biggest, and say closest to where you work.  This is one of the rare data patterns that has so many real world applications it is not commonly known below we will show an implementation of an efficient algorithm shown in this paper to offer stellar performance.

The Skyline Operator

They showed that the best performance for a multi-dimensional skyline processor is a Block – Nested Loop.

Here is some psudo-code with implementation coming next. I’ll show brute-force then also

an implementation with the block-nested loop for T-SQL..

   1:  M – Input of dimensional points
   2:  R output of dimensional points
   3:  T temporary file for dimensional points
   4:  S set of dimensional points
   5:  p < q point p is dominated by point q
   6:   
   7:  function SkylineBNL(Input)
   8:  {
   9:  Output = 0; 
  10:  Temp = 0; 
  11:  Set=0;
  12:  CountIn = 0;
  13:  CountOut = 0;
  14:   
  15:  while(input)
  16:  do
  17:  {
  18:  foreach p in Set
  19:  {
  20:  if(TimeStamp(p)=countin 
  21:  {
  22:  save(Output, p), Release(p)
  23:  }
  24:  Load(Input,p)
  25:  TimeStamp(p) = CountOut;
  26:  CountIn++
  27:  foreach(q in Set)
  28:  {
  29:  if(p>q) release (p) break;
  30:  if(p<q) then reelase(q);
  31:  }
  32:  if(MemoryAvailable)
  33:  {
  34:  save(Temp,p), release(p)
  35:  CountOut ++;
  36:  }
  37:  //keep it running to iterate through entire sets.
  38:  if(Main == empty)
  39:  {
  40:  Main = Temp;
  41:  Temp=0;CountIn=0;CountOut=0;
  42:  }
  43:  }
  44:  }
  45:   
  46:  //flush
  47:  foreach(p in Set)
  48:  {
  49:  save(Output,p), release(p)
  50:  }
  51:   
  52:  return Output;

Comments

One Comment on Skyline Query

  1. S.Deepam Kanmani on Wed, 21st Aug 2013 6:18 am
  2. how to write skyline query concept in oracle?
    b’coz i have seen skyline of operator but this operator is not supported by oracle…
    give me some suggestions on skyline using oracle database

Tell me what you're thinking...
and oh, if you want a pic to show with your comment, go get a gravatar!





%d bloggers like this: