June, 2010


21
Jun 10

Lazy cheap flight calculations with priority queues

There is an interesting problem of utilizing priority queues to figure out the best price combination in a set of flight legs. The problem is as follows:

We need to calculate the cheapest combination of flight legs (connections) for a flight to a particular destination. We’re given a price ordered N set of flight legs and we need to find the winning combination. Each combination would be evaluate for eligibility and would either pass or fail, so the cheapest combination doesn’t necessarily reflect the cheapest possibly combination of prices from the legs. A black box predicate function is consulted to ensure the combination is eligible. This reflects various airline rules, like overlapping times, specials that are only available to certain people, routes, or connections.

Solution: A naive approach for say a two leg flight is to say construct a (n x m) ordered matrix and evaluate each priced ordered combination through the black box predicate routing until one passes. The problem with this approach is that we unendingly construct a full matrix when in many cases one of the first combinations is enough to present the cheapest “valid” price. The key to reducing this is to construct a lazy data structure which will prioritize the cheapest flights and can then be iterated to find one that’s valid. We do so at runtime while constructing matrix combinations. The solutions is generalized, so the same can be used for n leg flights.

The algorithm goes something like this…

Construct the first set of combinations which can reflect the cheapest flight. The first cheapest combinations is always n1 + m1. If that doesn’t pass, the next possible set of cheapest combinations is either n2 + m1 or n1 + m2. We then continue to n1 + ma and na + m1, where a is incremented until the end of the route leg set for either leg.

The worst case running time is quadratic O(n2), but because of the lazy data structure, the algorithm runs in rather constant time, depending on how lucky we are that the first few combinations will yield a “rule valid” price combination.

This problem idea came from reading The Algorithm Design Manual by Steven S. Skiena. I recommend this book for anyone wishing to delve into the world of more advanced algorithm design.

Here is the solution in python. You’ve probably noticed I’ve been using a lot of python. Besides the fact that I like the language, python is an incredibly good language for conveying algorithmic ideas in a concise but very readable way.

The only two functions that matter, are cheapest_price and _pick_combo, the rest are just auxiliary functions used to support an OO structure and running a sample.

  import heapq, random, time

  class Route(object):
      """docstring for TicketFinder"""
      def __init__(self):
          self.heap = []
          self.unique = dict()
          self.legs = []
          self.max_leg_len = 0
          self._counter = 0
          self._loop_counter = 0

      def add_leg(self, leg):
          leg.sort()
          self.legs.append(leg)
          leg_len = len(leg)
          if leg_len > self.max_leg_len:
              self.max_leg_len = leg_len

      def cheapest_price(self, pred_func=lambda x: True):
          for i in range(0, self.max_leg_len):
              combo = self._pick_combo(i, pred_func)
              if combo: return combo

      def print_stats(self):
          print("""Legs: %s
  Combos examined: %s
  Loops: %s
  """ % (len(self.legs), self._counter, self._loop_counter))

      def _pick_combo(self, curr_idx, pred_func):
          num_legs = len(self.legs)
          price_combo = [ leg[curr_idx] for leg in self.legs if not curr_idx >= len(leg) ]
          self._add_combo(price_combo)
          cheapest_price = self._eval_price_combo(pred_func)
          if cheapest_price: return cheapest_price
          for idx in range(1, self.max_leg_len-curr_idx):
              for j in range(0, num_legs):
                  if len(self.legs[j]) &lt= (curr_idx+idx): continue
                  combo = []
                  for k in range(0, num_legs):
                      self._loop_counter += 1
                      if j == k:
                          combo.append(self.legs[k][curr_idx+idx])
                      elif curr_idx &lt len(self.legs[k]):
                          combo.append(self.legs[k][curr_idx])
                  self._add_combo(combo)

              cheapest_price = self._eval_price_combo(pred_func)
              if cheapest_price: return cheapest_price

      def _add_combo(self, combo):
          self._counter += 1
          if len(combo) == len(self.legs) and not self.unique.has_key(str(combo)):
              heapq.heappush(self.heap, combo)
              self.unique[str(combo)] = True

      def _eval_price_combo(self, pred_func):
          for i in range(0, len(self.heap)):
              least_combo = heapq.heappop(self.heap)
              if pred_func(least_combo):
                  print("Winning combo: %s" % [ "%.2f" % l for l in least_combo ])
                  return sum(least_combo)
          return None


  ############### Samples below ##################

  def sample_run(num_legs, pred_func):
      print(("#" * 30) + " Sample Run " + ("#" * 30))
      route = Route()
      for i in range(0, num_legs):
          route.add_leg( [ random.uniform(100, 500) for i in range(0, 100) ] )

      start = time.clock()
      price = route.cheapest_price(pred_func)
      calc_time = time.clock() - start

      if price:
          print("Cheapest price: %.2f" % price)
      else:
          print("No valid route found")
      route.print_stats()
      print(("#" * 72) + "\n")

  if __name__ == '__main__':
      sample_run(2, lambda x: True)
      def pred(x):
          for price in x:
              if price &lt 150: return False
          return True
      sample_run(3, pred)

I haven’t thoroughly tested this for correctness besides numerous runs and some basic validation so let me know if you see anything apparently wrong here.

Running the above yields

    ############################## Sample Run ##############################
    Winning combo: ['103.62', '106.40']
    Cheapest price: 210.03
    Legs: 2
    Combos examined: 1
    Loops: 0

    ########################################################################

    ############################## Sample Run ##############################
    Winning combo: ['150.74', '150.25', '173.95']
    Cheapest price: 474.95
    Legs: 3
    Combos examined: 2852
    Loops: 8523

    ########################################################################

For the first sample run, we use a predicate function which yields True, so we never examine anything other than the first combo n1 + m1. For the second sample, I add a predicate function which only accepts any price combination where all legs are above $150. (Of course this is not anything resembling airline rules, just good enough to simulate some sample cases, where the first n combinations are rejected). In the second sample run, we utilized 3 legs and examined 2852 combinations before coming up with the winning leg combination for the route. Each price within the combination is the smallest possible price above $150 for each leg.