Haskell is a functional programming language that emphasizes purity and lazy evaluation. In order to understand how Haskell works, it's important to first understand the concepts of function evaluation and termination.
Born in 1990, Haskell is a functional programming language designed by a committee of experts to embody the following characteristics:
- Purely Functional: Haskell is a purely functional language, meaning that all computations are performed using functions and function application.
- Call-by-Need (Lazy Evaluation): values are not computed until they are actually needed. This permits us to make infinite computations and infinite lists.
- Strong Polymorphic and Static Typing: every expression has a well-defined type that can be checked at compile time. It also has polymorphic typing, which allows you to write functions that can operate on values of any type as long as they meet certain requirements (type inference ... indeed usually we don´t need to explicitly declare types).
In mathematics, functions do not have side-effects. Haskell is purely funtional, so we will see later how to manage inherently side-effectful computations (e.g. those with I/O).
A function application ready to be performed is called a reducible expression (or redex). We basically can have two strategies regarding evaluations of functions:
- call-by-value: in this strategy, arguments of functions are always evaluated before evaluating the function itself - this corresponds to passing arguments by value.
- call-by-name:We start with the redex that is not contained in any other redex. Functions are always applied before their arguments, this corresponds to passing arguments by name.
In call-by-name, if the argument is not used, it is never evaluated; if the argument is used several times, it is re-evaluated each time. In Haskell it's used Call-by-need which is a memoized version of call-by-name where, if the function argument is evaluated, that value is stored for subsequent uses. In a "pure" (effect-free) setting, this produces the same results as call-by-name, and it is usually faster. Call-by-need is very convenient for dealing with never-ending computations that provide data
makeStream :: [l] -> [l]
makeStream l = l ++ makeStream l
ghci> i = makeStream 1
ghci> take 5 i
[1,1,1,1,1]
Currying is an important concept and refers to the fact that functions have only one argument.
answerToEverything = 42
complexcalc x y z = x * y * z * answerToEverything
:t complexcalc
complexcalc : : Integer -> Integer -> Integer -> Integer
complexcalc : : Integer-> (Integer -> (Integer -> Integer))
The term is a reference to logician Haskell Curry. The alternative name Schönfinkelisation has been proposed as a reference to Moses Schönfinkel but didn't catch on.
Functions are declared through a sequence of equations using pattern matching. Haskell has a powerful mechanism for defining custom data types called "algebraic data types". You should become familiar with how to define and use these types, as well as pattern matching.
first (x,_,_) = x
second (_,x,_) = x
third (_,_,z) = z
first (1,2,3) -> 1
second (1,2,3) -> 2
third (1,2,3) -> 3
The _
means the same thing as it does in list comprehensions. It means that we really don't care what that part is, so we just write a _.
Other useful stuff regarding functions:
.
is used for composing functions(f . g)(x)
isf(g(x))
$
symbol in Haskell is called the "function application operator". It has a very low precedence, which means that it allows you to avoid using parentheses when applying functions:f (g x)
is like:f $ g x
. Another way to think about it is that$
acts as a "parentheses eraser" in function chaining. It is not necessary but makes code concise and readable.
data
and type
keywords are both used to define new data structures, but they have different purposes.
- A type is a set of values and it's made using the keyword
data
(and not type lol) - A type-class is a set of types.
Show
,Eq
,Ord
are typeclasses. - A type-class is a way to define operations to other types ... very similar to interfaces in for example Java. They are the mechanism provided by Haskell for ad hoc polymorphism.
Haskell | Java (or similar OOP) |
---|---|
Type Class | Interface |
Type | Class |
Value | Object |
Method | Method |
data
is used when you want to define a new type you usedata
.
data Tree a = Empty | Node a (Tree a) (Tree a)`
type
is used to define type synonyms, which are alternative names for existing types. For example, you could usetype
to define a synonym for a list of integers:
type IntList = [Int]`
newtype
is 'weird' since we want to make an alias 'basically identical' to another already definetype
but I want to say to ghc to consider them two different types.
Recursive Types:
-- Trees
data Tree a = Leaf a | Branch (Tree a) (Tree a)
Branch :: Tree a -> Tree a -> Tree a
aTree = Branch (Leaf 'a') (Branch (Leaf 'b') (Leaf 'c'))
-- Lists are recursive types
data List a = Null | Cons a (List a)
data [a] = [] | a : [a]
Lists in Haskell are represented by square brackets []
and the elements (of the same type) are separated by commas.
numbers = [1,3,5,7]
numbers !! 2 -> 5
null numbers -> False --tell us if the list is empty
head numbers -> 1
tail numbers -> [3,5,7]
init numbers -> [1,3,5]
last numbers -> 7
drop 2 numbers -> [5,7]
take 2 numbers -> [1,3]
elem 1 [1,2,3] -> True
elem 21 [1,2,3] -> False
take
is used to take n
items from a list: take n (x:xs)
.
If n
is equal to zero, an empty list is returned. If n
is greater than the length of the list, the entire list is returned.
The zip
function is a common operation in programming languages that takes two or more lists as input and combines them into a single list of tuples, where each tuple contains corresponding elements from the input lists.
- Resulting list will have the same length as the shortest input list.
- Especially useful for when you want to combine two lists in a way or traverse two lists simultaneously.
zip [1,2,3,4,5] [5,5,5,5,5] -- [(1,5),(2,5),(3,5),(4,5),(5,5)]
zip [1 .. 5] ["one", "two", "three", "four", "five"] -- [(1,"one"),(2,"two"),(3,"three"),(4,"four"),(5,"five")]
Implementation:
zippa :: [a] -> [b] -> [(a,b)]
zippa [] _ = []
zippa _ [] = []
zippa (x:xs) (y:ys) = (x,y) : zippa (xs) (ys)
The : operator is called the cons
operator. It is used to prepend an element to a list.
0 : [1, 2, 3] -> [0, 1, 2 ,3]
Example of a list comprehension:
length' xs = sum [1 | _ <- xs]
fi xs = [x | x <- xs, odd x == False]
(1,2,3)
in Haskell is a tuple (while [1,2,3]
is a list). A tuple is a fixed-length, ordered collection of elements. While lists are homogeneous, tuples are heterogeneous. For example, (1, "Hello", 3.14)
is a valid tuple in Haskell.
Tuples are useful for when you want to group together a fixed number of elements that may have different types, while lists are useful for when you want to store a sequence of elements that are all of the same type.
fst
and snd
are built-in functions that operate on pairs:
pair = (3, 5)
firstElement = fst pair -- returns 3
pair = (3, 5)
secondElement = snd pair -- returns 5
let x = 3
y = 12
in x+y -- => 15
--or
let {x = 3 ; y = 12} in x+y
euler :: Point -> Point -> Float
euler (Point x1 y1) (Point x2 y2) =
let dx = x1 - x2
dy = y1 - y2
in sqrt ( dx*dx + dy*dy )
Example with where
:
bmi :: Float -> Float -> String
bmi w h
| calc <= 18.5 = "Underweight"
| calc <= 25.0 ="Normal"
| calc <= 30.0 = "Overweight"
| otherwise = "Obese"
where calc = w/h^2
checkSign :: Int -> String
checkSign n =
if n >= 0
then "Positive"
else "Negative"
To make instances of the Eq
typeclass, we need to implement the (==)
function.
Example on how to create an instance of the Eq
typeclass:
-- Define a custom data type called Person
data Person = Person String Int
-- Implementing Eq instance for Person
instance Eq Person where
(Person name1 age1) == (Person name2 age2) =
name1 == name2 && age1 == age2
data Queue a = Queue [a] [a]
instance (Show a) => Show (Queue a) where
show (Queue x y) = show x ++ "|" ++ show y
data Tree a = Empty | Leaf a | Node (Tree a) (Tree a)
instance Show x => Show (Tree x) where
show Empty = "()"
show (Leaf a) = "(" ++ show a ++ ")"
show (Node a b) = "[" ++ show a ++ show b ++ "]"
Usually it is not necessary to explicitly define instances of some classes. Haskell can be quite smart and do it automatically, by using deriving.
data Point = Point Float Float deriving (Show,Eq)
Ord
typeclass is used for types that can be ordered or sorted.
data Student = Student { rollNumber :: Int, grade :: Char }
instance Ord Student where
compare student1 student2 = compare (rollNumber student1) (rollNumber student2)
data RPS = Rock | Paper | Scissors deriving (Show, Eq)
instance Ord RPS where
x <= y | x == y = True
Rock <= Paper = True
Paper <= Scissors = True
Scissors <= Rock = True
_ <= _ = False
Here is a high-level summary of the steps to reach Monads. Monads are the most powerful of these type classes, and they are used for dealing with side effects like error handling or state management.
- Foldable: A type class that enables folding over data structures, such as lists or trees.
- Functor: A type class that defines a way to apply a function over a structure, preserving the structure's type. Note that in C++, functors exist and are involving "mapping operations" but are a different thing.
- Applicative Functors: A type class that extends the Functor class to allow for sequential application of functions over multiple structures, with the ability to lift functions of multiple arguments into the context of the structures.
- Monad: A type class that extends the Applicative Functor class to enable sequencing of actions or computations, with the ability to chain together actions that may produce values, handle errors, or make decisions based on the results of previous actions.
Actually it's not necessary to implement all these steps to have a Monad.
Foldable is a class used for folding. The main idea is the same as foldl
and foldr
for lists: given a container and a binary operation f
, we want to apply f
to all elements in the container.
A minimal implementation of Foldable requires foldr
.
foldr
starts from right.
ghci> :t foldr
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
data Tree a = Empty | Leaf a | Node (Tree a) (Tree a)
tfoldr f z Empty = z
tfoldr f z (Leaf x) = f x z
tfoldr f z (Node l r) = tfoldr f (tfoldr f z r) l
instance Foldable Tree where
foldr = tfoldr
> foldr (+) 0 (Node (Node (Leaf 1) (Leaf 3)) (Leaf 5))
9
The foldl
function in Haskell is short for "fold left". This means that it starts at the leftmost element of a list and applies a given binary operator to each element and an accumulator value, accumulating the result as it goes.
Note that in Racket it is defined with (f x z)
.
ghci> :t foldl
foldl :: Foldable t => (b -> a -> b) -> b -> t a -> b
foldl (\acc x -> x + acc) 0 [1,2,3]
In foldl (+) 0 [1,2,3]
the steps are:
- Start with an accumulator value of
0
. - Apply
(+) 0 1
, which gives us a new accumulator value of1
. - Apply
(+) 1 2
, which gives us a new accumulator value of3
. - Apply
(+) 3 3
, which gives us a final result of6
.
Example of multiple arguments in the lambda expression of foldl
:
foldl (\acc (x, y) -> x + y + acc) 0 [(1, 2), (3, 4), (5, 6)]
Implementation of elem
with foldl
:
isThere y ys = foldl (\acc x -> if x == y then True else acc) False ys
Foldl
can be expressed in term of foldr (id is the identity function):
foldl f a bs = foldr (\b g x -> g (f x b))
Actually foldr
may work on infinite lists, unlike foldl
: foldl
starts from left to right but actually it starts to compute the recurrence from the last element (but if it doesn't exists .. that's not possible).
Functor is the class of all the types that offer a map operation Generally it's natural to make every data structure an instance of functor.
fmap id = id
(where id is the identity function)fmap (f.g) = (fmap (f)).(fmap (g))
(homomorphism)
ftmap :: (a->b) -> (Tree a) -> (Tree b)
ftmap _ Empty = Empty
ftmap f (Leaf x) = (Leaf (f x))
ftmap f (Node z1 z2) = (Node (ftmap f z1) (ftmap f z2))
instance Functor Tree where
fmap = ftmap
class (Functor f) => Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b -- <*> means apply
The pure
method is a useful helper function for taking an ordinary value or function and putting it into a context.
concat [[1,2],[3],[4,5]]
[1,2,3,4,5]
ghci> concatMap (\x -> [x, x+1]) [1,2,3]
[1,2,2,3,3,4]
ghci> concatMap (++"?") ["a","b","c"]
"a?b?c?"
ghci> concatMap (++"") ["a","b","c"]
"abc"
ghci> concatMap (return "a") ["a","b","c"]
"aaa"
-- with concatMap, we get the standard implementation of <*> (the main op of applicative)
ghci> [(+1),(*2)] <*> [1,2,3]
[2,3,4,2,4,6]
ghci> z = Node (Leaf (*3)) (Leaf (*2))
ghci> y = Node (Leaf 2) (Leaf 1)
ghci> z <*> y
[[(6)(3)][(4)(2)]]
Applicative instance example on a binary tree where in each node is stored data, together with the number of nodes contained in the subtree of which the current node is root.
data Ctree a = Cnil | Ctree a Int (Ctree a) (Ctree a) deriving (Show, Eq)
cvalue :: Ctree a -> Int
cvalue Cnil = 0
cvalue (Ctree _ x _ _) = x
cnode :: a -> Ctree a -> Ctree a -> Ctree a
cnode x t1 t2 = Ctree x ((cvalue t1) + (cvalue t2) + 1) t1 t2
cleaf :: a -> Ctree a
cleaf x = cnode x Cnil Cnil
instance Functor Ctree where
fmap f Cnil = Cnil
fmap f (Ctree v c t1 t2) = Ctree (f v) c (fmap f t1)(fmap f t2)
instance Foldable Ctree where
foldr f i Cnil = i
foldr f i (Ctree x _ t1 t2) = f x $ foldr f (foldr f i t2) t1
x +++ Cnil = x
Cnil +++ x = x
(Ctree x v t1 t2) +++ t = cnode x t1 (t2 +++ t)
ttconcat = foldr (+++) Cnil
ttconcmap f t = ttconcat $ fmap f t
instance Applicative Ctree where
pure = cleaf
x <*> y = ttconcmap (\f -> fmap f y) x
Example with a Slist
data structure for lists that store their length.
data Slist a = Slist Int [a] deriving (Show, Eq)
makeSlist v = Slist (length v) v
instance Foldable Slist where
foldr f i (Slist n xs) = foldr f i xs
instance Functor Slist where
fmap f (Slist n xs) = Slist n (fmap f xs)
instance Applicative Slist where
pure v = Slist 1 (pure v)
(Slist x fs) <*> (Slist y xs) = Slist (x*y) (fs <*> xs)
instance Monad Slist where
fail _ = Slist 0 []
(Slist n xs) >>= f =
makeSlist (xs >>= (\x -> let Slist n xs = f x in xs) )
Other example
instance Applicative Tree where
pure x = (Leaf x)
_ <*> Empty = Empty
Leaf f1 <*> Leaf x = Leaf (f1 x)
Node fl fr <*> Node l r =
Node (Node (fl <*> l) (fl <*> r))
(Node (fr <*> l) (fr <*> r))
Introduced by Eugenio Moggi in 1991, a monad is a kind of algebraic data type used to represent computations.
:t (>>=)
Monad m => m a -> (a -> m b) -> m b
The implementation with monad permits me to perform specific operations with >>=
. >>=
sequentially compose two actions, discarding any value produced by the first, like sequencing operators in imperative languages. Note that do
is a "syntax sugar" (alternative to >>=
) since it provides a more readable and convenient way of writing Monadic code (this is like the begin
structure in Scheme).
-- Define a Monad instance for Tree.
instance Monad Tree where
-- Define 'return' (also known as 'pure') to create a new leaf node with the given value.
return = Leaf
-- For an empty tree, simply return another empty tree.
Empty >>= f = Empty
-- For a leaf node containing value x, apply function f to x and return the result.
Leaf x >>= f = f x
-- For a non-leaf node with left subtree l and right subtree r,
-- recursively apply function f to both subtrees and combine them into one larger tree using Node constructor.
Node l r >>= f = Node (l >>= f) (r >>= f)
This implementation allows us to use standard monadic operations like bind (>>=
) on trees constructed using our custom datatype.
type Log = [String] --remember that is just an alias
data Logger a = Logger a Log
instance (Eq a) => Eq (Logger a) where
(Logger x _) == (Logger y _) = x == y
instance (Show a) => Show (Logger a) where
show (Logger x s) = show x ++ "{" ++ show s ++ "}"
instance Functor Logger where
fmap f (Logger x s) = Logger (f x) s
instance Applicative Logger where
pure x = Logger x ["Init"]
Logger f f_name <*> (Logger x log) = (Logger (f x) (log ++ f_name))
instance Monad Logger where
return = pure
(Logger x l) >>= f =
let Logger x' l' = f x
in Logger x' (l ++ l')
data CashReg a = CashReg { getReceipt :: (a,Float) } deriving (Show, Eq)
getCurrentItem = fst . getReceipt
getPrice = snd . getReceipt
instance Functor CashReg where
fmap f cr = CashReg (f $ getCurrentItem cr, getPrice cr)
instance Applicative CashReg where
pure a = CashReg (a,0.0)
fcr <*> cr = CashReg (getCurrentItem fcr $ getCurrentItem cr, getPrice fcr + getPrice cr)
instance Monad CashReg where
return = pure
cr >>= f =
let newCr = f $ getCurrentItem cr
in CashReg (getCurrentItem newCr, getPrice cr + getPrice newCr)
Maybe is used to represent computations that may fail: data Maybe a = Nothing | Just a
. It is adopted in many recent languages, to avoid NULL and limit exceptions usage: mainly because exceptions are sometimes complex to manage.
Haskell has a simple module system, with import, export and namespaces. Modules provide the only way to build abstract data types (ADT)
IO is not 'compatible' with the functional philosophy. In general, IO computation is based on state change (e.g. of a file), hence if we perform a sequence of operations, they must be performed in order (and this is not easy with call-by-need). IO
is an instance of the monad class.
main = do { --sequence of IO actions
putStr "Please, tell me something>";
thing <- getLine; --a way to get the input
putStrLn $ "You told me \"" ++ thing ++ "\".";
}
import System.IO
import System.Environment
readfile = do {
args <- getArgs; -- command line arguments
handle <- openFile (head args) ReadMode;
contents <- hGetContents handle;
putStr contents;
hClose handle;
}
main = readfile