Excuse me while I belabor what may be a triviality. Here’s a common pattern:
A function f: (X,Y) → Z is implemented as f(x,y) = f'(g(x),y).
In other words, evaluating f(x,y) involves an intermediate evaluation g(x) that is independent of y. If g(x) is expensive and f(x,y) is evaluated more than once for the same x, you probably want to cache the result of g(x) and reuse it across multiple f calls. Let’s contrive an example for the sake of clarity:
The first line of input contains N whitespace-separated integers. The remaining lines of input each contain a single index 0 ≤ i < N. For each index i, print the ith-smallest integer from the first line.
The solution is obvious: sort the N integers and index the resulting array using the given indices. And it fits our pattern: the N integers are our singular x input, and the given indices are our y inputs. The intermediate value g(x) is just the sort of the N integers. The point made above seems trivial in the context of this example: it would be idiotic to re-sort the integers for each index! Bear with me; let’s see what it might look like in Haskell:12
import Data.List (sort) import Data.Vector.Unboxed ((!), fromList) main :: IO () main = do -- get the integers in a list ints <- map read . words <$> getLine -- sort them and put them in an array let sorted_ints = toList $ sort ints -- get the indices and find the values indices <- map ((sorted_ints !) . read) . lines <$> getContents -- print the values mapM_ (putStrLn . show) results
This implementation gets the job done, and as a code snippet it’s fine. But in
a real codebase we might want to encapsulate the operation “find the
ith-smallest element in a list of (orderable) values” into a single
function. The intermediate function g—toList . sort in the example—is an
implementation detail that should be hidden from the surrounding code and the
programmers who use f. Let’s try writing ithSmallest:
ithSmallest :: Ord a => [a] -> Int -> a ithSmallest xs i = sort (toList xs) ! i
But this function does exactly what we set out not to do: it rebuilds the
sorted array every time it’s called. To fix it, we can take advantage of the
fact that all functions in Haskell are curried. In other words, all functions
take a single argument, and any function that appears to have multiple
arguments actually takes a single argument and returns a new function with one
fewer arguments. E.g. f :: a -> b -> c -> d takes a value x :: a and
returns f' :: b -> c -> d, which takes a value y :: b and returns f'' :: c
-> d, etc. This is easy to understand if you realize that the type operator
-> is right associative, so f :: a -> b -> c -> d is really f :: a -> (b
-> (c -> d)).
So ithSmallest already takes a list of values and returns a new function with
takes an index and finds the answer. Let’s take matters into our own hands and
compute the sorted array before explicitly returning the one-argument function.
It’s easy with a simple let statement and an anonymous function:
ithSmallest' :: Ord a => [a] -> Int -> a ithSmallest' xs = let svec = fromList $ sort xs in \i -> svec ! i
This does exactly what we want. We can partially apply it as an argument to
map and the sorted array is built just once:
import Data.List (sort) import Data.Vector.Unboxed ((!), fromList) ithSmallest' :: Ord a => [a] -> Int -> a ithSmallest' xs = let svec = fromList $ sort xs in \i -> svec ! i main :: IO () main = do -- get the integers in a list ints <- map read . words <$> getLine -- get the indices and find the values results <- map (ithSmallest' ints) . lines <$> getContents -- print the values mapM_ (putStrLn . show) results
Of course, there are other ways to to reuse the sorted array while hiding it from the programmer. Many of those methods involve “heavier” abstractions like the ST monad or explicit memoization via unsafe IO calls. Those techniques are useful in certain cases, but it’s important to realize when they’re overkill. The simplest abstraction is almost always the best choice.
1
vector, which exports
Data.Vector.Unboxed and other useful modules, is the de facto package for
efficient array operations these days.
2 Data.List.sort is used for brevity
and simplicity. Better sorting algorithms, which take advantage of mutable
arrays, are available in
vector-algorithms.