Skip to content

Latest commit

 

History

History
615 lines (427 loc) · 19.4 KB

File metadata and controls

615 lines (427 loc) · 19.4 KB

Haskell

Introduction

Haskell is a functional programming language that emphasizes purity and lazy evaluation. In order to understand how Haskell works, it's important to first understand the concepts of function evaluation and termination.

Born in 1990, Haskell is a functional programming language designed by a committee of experts to embody the following characteristics:

  1. Purely Functional: Haskell is a purely functional language, meaning that all computations are performed using functions and function application.
  2. Call-by-Need (Lazy Evaluation): values are not computed until they are actually needed. This permits us to make infinite computations and infinite lists.
  3. Strong Polymorphic and Static Typing: every expression has a well-defined type that can be checked at compile time. It also has polymorphic typing, which allows you to write functions that can operate on values of any type as long as they meet certain requirements (type inference ... indeed usually we don´t need to explicitly declare types).

What is a functional language?

In mathematics, functions do not have side-effects. Haskell is purely funtional, so we will see later how to manage inherently side-effectful computations (e.g. those with I/O).

Evaluation of functions

A function application ready to be performed is called a reducible expression (or redex). We basically can have two strategies regarding evaluations of functions:

  • call-by-value: in this strategy, arguments of functions are always evaluated before evaluating the function itself - this corresponds to passing arguments by value.
  • call-by-name:We start with the redex that is not contained in any other redex. Functions are always applied before their arguments, this corresponds to passing arguments by name.

Haskell is lazy: call-by-need

In call-by-name, if the argument is not used, it is never evaluated; if the argument is used several times, it is re-evaluated each time. In Haskell it's used Call-by-need which is a memoized version of call-by-name where, if the function argument is evaluated, that value is stored for subsequent uses. In a "pure" (effect-free) setting, this produces the same results as call-by-name, and it is usually faster. Call-by-need is very convenient for dealing with never-ending computations that provide data

makeStream :: [l] -> [l]
makeStream l = l ++ makeStream l

ghci> i = makeStream 1
ghci> take 5 i
[1,1,1,1,1]

Currying

Currying is an important concept and refers to the fact that functions have only one argument.

answerToEverything = 42

complexcalc x y z = x * y *  z * answerToEverything

:t complexcalc
complexcalc : : Integer -> Integer -> Integer -> Integer
complexcalc : : Integer-> (Integer -> (Integer -> Integer))

The term is a reference to logician Haskell Curry. The alternative name Schönfinkelisation has been proposed as a reference to Moses Schönfinkel but didn't catch on.

Function definition

Functions are declared through a sequence of equations using pattern matching. Haskell has a powerful mechanism for defining custom data types called "algebraic data types". You should become familiar with how to define and use these types, as well as pattern matching.

first (x,_,_) = x
second (_,x,_) = x
third (_,_,z) = z
first (1,2,3) -> 1
second (1,2,3) -> 2
third (1,2,3) -> 3

The _ means the same thing as it does in list comprehensions. It means that we really don't care what that part is, so we just write a _.

Other useful stuff regarding functions:

  • . is used for composing functions (f . g)(x) is f(g(x))
  • $ symbol in Haskell is called the "function application operator". It has a very low precedence, which means that it allows you to avoid using parentheses when applying functions: f (g x) is like: f $ g x. Another way to think about it is that $ acts as a "parentheses eraser" in function chaining. It is not necessary but makes code concise and readable.

Data and type

data and type keywords are both used to define new data structures, but they have different purposes.

  • A type is a set of values and it's made using the keyword data (and not type lol)
  • A type-class is a set of types. Show , Eq, Ord are typeclasses.
  • A type-class is a way to define operations to other types ... very similar to interfaces in for example Java. They are the mechanism provided by Haskell for ad hoc polymorphism.
Haskell Java (or similar OOP)
Type Class Interface
Type Class
Value Object
Method Method
  • data is used when you want to define a new type you use data.
data Tree a = Empty | Node a (Tree a) (Tree a)`
  • type is used to define type synonyms, which are alternative names for existing types. For example, you could use type to define a synonym for a list of integers:
type IntList = [Int]`
  • newtype is 'weird' since we want to make an alias 'basically identical' to another already define type but I want to say to ghc to consider them two different types.

Recursive Types:

-- Trees
data Tree a = Leaf a | Branch (Tree a) (Tree a)

Branch :: Tree a -> Tree a -> Tree a
aTree = Branch (Leaf 'a') (Branch (Leaf 'b') (Leaf 'c'))

-- Lists are recursive types
data List a = Null | Cons a (List a)
data [a] = [] | a : [a]

Lists

Lists in Haskell are represented by square brackets [] and the elements (of the same type) are separated by commas.

numbers = [1,3,5,7]
numbers !! 2  -> 5 
null numbers -> False --tell us if the list is empty
head numbers -> 1 
tail numbers -> [3,5,7]
init numbers -> [1,3,5]
last numbers -> 7 
drop 2 numbers -> [5,7]
take 2 numbers -> [1,3]
elem 1 [1,2,3] -> True
elem 21 [1,2,3] -> False

Take

take is used to take n items from a list: take n (x:xs).

If n is equal to zero, an empty list is returned. If n is greater than the length of the list, the entire list is returned.

Zip

The zip function is a common operation in programming languages that takes two or more lists as input and combines them into a single list of tuples, where each tuple contains corresponding elements from the input lists.

  • Resulting list will have the same length as the shortest input list.
  • Especially useful for when you want to combine two lists in a way or traverse two lists simultaneously.
zip [1,2,3,4,5] [5,5,5,5,5]  -- [(1,5),(2,5),(3,5),(4,5),(5,5)]  
zip [1 .. 5] ["one""two""three""four""five"]  -- [(1,"one"),(2,"two"),(3,"three"),(4,"four"),(5,"five")]

Implementation:

zippa :: [a] -> [b] -> [(a,b)]
zippa [] _ = []
zippa _ [] = []
zippa (x:xs) (y:ys) = (x,y) : zippa (xs) (ys)

Cons

The : operator is called the cons operator. It is used to prepend an element to a list.

0 : [1, 2, 3] -> [0, 1, 2 ,3]

Example of a list comprehension:

length' xs = sum [1 | _ <- xs]
fi xs = [x | x <- xs, odd x == False]

Tuples

(1,2,3) in Haskell is a tuple (while [1,2,3] is a list). A tuple is a fixed-length, ordered collection of elements. While lists are homogeneous, tuples are heterogeneous. For example, (1, "Hello", 3.14) is a valid tuple in Haskell.

Tuples are useful for when you want to group together a fixed number of elements that may have different types, while lists are useful for when you want to store a sequence of elements that are all of the same type. fst and snd are built-in functions that operate on pairs:

pair = (3, 5)
firstElement = fst pair -- returns 3

pair = (3, 5)
secondElement = snd pair -- returns 5

Let

let x = 3
y = 12
in x+y -- => 15

--or 
let {x = 3 ; y = 12} in x+y


euler :: Point -> Point -> Float
euler (Point x1 y1) (Point x2 y2) = 
    let dx = x1 - x2 
        dy = y1 - y2 
    in sqrt ( dx*dx + dy*dy )

Example with where:

bmi :: Float -> Float -> String 
bmi w h 
    | calc <= 18.5 = "Underweight"
    | calc <= 25.0 ="Normal"
    | calc <= 30.0 = "Overweight"
    | otherwise = "Obese"
    where calc = w/h^2

If

checkSign :: Int -> String
checkSign n =
  if n >= 0 
    then "Positive" 
    else "Negative"

Equality

To make instances of the Eq typeclass, we need to implement the (==) function. Example on how to create an instance of the Eq typeclass:

-- Define a custom data type called Person
data Person = Person String Int

-- Implementing Eq instance for Person
instance Eq Person where
    (Person name1 age1) == (Person name2 age2) =
        name1 == name2 && age1 == age2

Show

data Queue a = Queue [a] [a] 

instance (Show a) => Show (Queue a) where 
  show (Queue x y) = show x ++ "|" ++ show y  


data Tree a = Empty | Leaf a | Node (Tree a) (Tree a)

instance Show x => Show (Tree x) where
    show Empty = "()"
    show (Leaf a) = "(" ++ show a ++ ")"
    show (Node a b) = "[" ++ show a ++ show b ++ "]"

Usually it is not necessary to explicitly define instances of some classes. Haskell can be quite smart and do it automatically, by using deriving.

data Point = Point Float Float deriving (Show,Eq)

Ord

Ord typeclass is used for types that can be ordered or sorted.

data Student = Student { rollNumber :: Int, grade :: Char }

instance Ord Student where
    compare student1 student2 = compare (rollNumber student1) (rollNumber student2)

The Enumeration Class

data RPS = Rock | Paper | Scissors deriving (Show, Eq)

instance Ord RPS where
x <= y | x == y = True
Rock <= Paper = True
Paper <= Scissors = True
Scissors <= Rock = True
_ <= _ = False

Road to Monads

Here is a high-level summary of the steps to reach Monads. Monads are the most powerful of these type classes, and they are used for dealing with side effects like error handling or state management.

  • Foldable: A type class that enables folding over data structures, such as lists or trees.
  • Functor: A type class that defines a way to apply a function over a structure, preserving the structure's type. Note that in C++, functors exist and are involving "mapping operations" but are a different thing.
  • Applicative Functors: A type class that extends the Functor class to allow for sequential application of functions over multiple structures, with the ability to lift functions of multiple arguments into the context of the structures.
  • Monad: A type class that extends the Applicative Functor class to enable sequencing of actions or computations, with the ability to chain together actions that may produce values, handle errors, or make decisions based on the results of previous actions.

Actually it's not necessary to implement all these steps to have a Monad.

Foldable

Foldable is a class used for folding. The main idea is the same as foldl and foldr for lists: given a container and a binary operation f, we want to apply f to all elements in the container. A minimal implementation of Foldable requires foldr.

Foldr

foldr starts from right.

ghci> :t foldr
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b 
data Tree a = Empty | Leaf a | Node (Tree a) (Tree a)

tfoldr f z Empty = z
tfoldr f z (Leaf x) = f x z
tfoldr f z (Node l r) = tfoldr f (tfoldr f z r) l

instance Foldable Tree where
foldr = tfoldr

> foldr (+) 0 (Node (Node (Leaf 1) (Leaf 3)) (Leaf 5))
9

Foldl

The foldl function in Haskell is short for "fold left". This means that it starts at the leftmost element of a list and applies a given binary operator to each element and an accumulator value, accumulating the result as it goes. Note that in Racket it is defined with (f x z).

ghci> :t foldl
foldl :: Foldable t => (b -> a -> b) -> b -> t a -> b

foldl (\acc x -> x + acc) 0 [1,2,3]

In foldl (+) 0 [1,2,3] the steps are:

  • Start with an accumulator value of 0.
  • Apply (+) 0 1, which gives us a new accumulator value of 1.
  • Apply (+) 1 2, which gives us a new accumulator value of 3.
  • Apply (+) 3 3, which gives us a final result of 6.

Example of multiple arguments in the lambda expression of foldl :

foldl (\acc (x, y) -> x + y + acc) 0 [(1, 2), (3, 4), (5, 6)]

Implementation of elem with foldl:

isThere y ys = foldl (\acc x -> if x == y then True else acc) False ys

Foldl can be expressed in term of foldr (id is the identity function):

foldl f a bs = foldr (\b g x -> g (f x b))

Actually foldr may work on infinite lists, unlike foldl: foldl starts from left to right but actually it starts to compute the recurrence from the last element (but if it doesn't exists .. that's not possible).

Functor

Functor is the class of all the types that offer a map operation Generally it's natural to make every data structure an instance of functor.

  • fmap id = id (where id is the identity function)
  • fmap (f.g) = (fmap (f)).(fmap (g)) (homomorphism)
ftmap :: (a->b) -> (Tree a) -> (Tree b)
ftmap _ Empty = Empty 
ftmap f (Leaf x) = (Leaf (f x))
ftmap f (Node z1 z2) = (Node (ftmap f z1) (ftmap f z2)) 

instance Functor Tree where 
    fmap = ftmap

Applicative Functors

class (Functor f) => Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b -- <*> means  apply 

The pure method is a useful helper function for taking an ordinary value or function and putting it into a context.

concat [[1,2],[3],[4,5]] 
[1,2,3,4,5] 

ghci> concatMap (\x -> [x, x+1]) [1,2,3] 
[1,2,2,3,3,4]

ghci> concatMap (++"?") ["a","b","c"]
"a?b?c?"

ghci> concatMap (++"") ["a","b","c"]
"abc"

ghci> concatMap (return "a") ["a","b","c"]
"aaa"


-- with concatMap, we get the standard implementation of <*> (the main op of applicative)

ghci> [(+1),(*2)] <*> [1,2,3]
[2,3,4,2,4,6] 

ghci> z = Node (Leaf (*3)) (Leaf (*2))
ghci> y = Node (Leaf 2) (Leaf 1)
ghci> z <*> y
[[(6)(3)][(4)(2)]]

Applicative instance example on a binary tree where in each node is stored data, together with the number of nodes contained in the subtree of which the current node is root.

data Ctree a = Cnil | Ctree a Int (Ctree a) (Ctree a) deriving (Show, Eq)

cvalue :: Ctree a -> Int
cvalue Cnil = 0
cvalue (Ctree _ x _ _) = x

cnode :: a -> Ctree a -> Ctree a -> Ctree a 
cnode x t1 t2 = Ctree x ((cvalue t1) + (cvalue t2) + 1) t1 t2 

cleaf :: a -> Ctree a
cleaf x = cnode x Cnil Cnil

instance Functor Ctree where
 fmap f Cnil = Cnil
 fmap f (Ctree v c t1 t2) = Ctree (f v) c (fmap f t1)(fmap f t2)

instance Foldable Ctree where 
foldr f i Cnil = i 
foldr f i (Ctree x _ t1 t2) = f x $ foldr f (foldr f i t2) t1

x +++ Cnil = x 
Cnil +++ x = x 
(Ctree x v t1 t2) +++ t = cnode x t1 (t2 +++ t)

ttconcat = foldr (+++) Cnil 

ttconcmap f t = ttconcat $ fmap f t 

instance Applicative Ctree where 
pure = cleaf 
x <*> y = ttconcmap (\f -> fmap f y) x

Example with a Slist data structure for lists that store their length.

data Slist a = Slist Int [a] deriving (Show, Eq)

makeSlist v = Slist (length v) v

instance Foldable Slist where
    foldr f i (Slist n xs) = foldr f i xs

instance Functor Slist where
    fmap f (Slist n xs) = Slist n (fmap f xs)

instance Applicative Slist where
 pure v = Slist 1 (pure v)
 (Slist x fs) <*> (Slist y xs) = Slist (x*y) (fs <*> xs)

instance Monad Slist where
 fail _ = Slist 0 []
 (Slist n xs) >>= f = 
	 makeSlist (xs >>= (\x -> let Slist n xs = f x in xs) )

Other example

instance Applicative Tree where 
    pure x = (Leaf x)
    _ <*> Empty = Empty
    Leaf f1 <*> Leaf x = Leaf (f1 x)
    Node fl fr <*> Node l r = 
        Node (Node (fl <*> l) (fl <*> r))
        (Node (fr <*> l) (fr <*> r))

Monad

Introduced by Eugenio Moggi in 1991, a monad is a kind of algebraic data type used to represent computations.

:t (>>=)
Monad m => m a -> (a -> m b) -> m b 

The implementation with monad permits me to perform specific operations with >>=. >>= sequentially compose two actions, discarding any value produced by the first, like sequencing operators in imperative languages. Note that do is a "syntax sugar" (alternative to >>=) since it provides a more readable and convenient way of writing Monadic code (this is like the begin structure in Scheme).

-- Define a Monad instance for Tree.
instance Monad Tree where
  -- Define 'return' (also known as 'pure') to create a new leaf node with the given value.
  return = Leaf
  -- For an empty tree, simply return another empty tree.
  Empty >>= f = Empty
  -- For a leaf node containing value x, apply function f to x and return the result.
  Leaf x >>= f = f x
  -- For a non-leaf node with left subtree l and right subtree r,
  -- recursively apply function f to both subtrees and combine them into one larger tree using Node constructor.
  Node l r >>= f = Node (l >>= f) (r >>= f)

This implementation allows us to use standard monadic operations like bind (>>=) on trees constructed using our custom datatype.

type Log = [String] --remember that is just an alias
data Logger a = Logger a Log

instance (Eq a) => Eq (Logger a) where
    (Logger x _) == (Logger y _) = x == y

instance (Show a) => Show (Logger a) where
   show (Logger x s) = show x ++ "{" ++ show s ++ "}"

instance Functor Logger where 
    fmap f (Logger x s) = Logger (f x) s 

instance Applicative Logger where
    pure x = Logger x ["Init"]
    Logger f f_name <*> (Logger x log) =  (Logger (f x) (log ++ f_name))

instance Monad Logger where 
    return = pure 
    (Logger x l) >>= f = 
        let Logger x' l' = f x
        in Logger x' (l ++ l') 
data CashReg a = CashReg { getReceipt :: (a,Float) } deriving (Show, Eq)

getCurrentItem = fst . getReceipt 
getPrice = snd . getReceipt

instance Functor CashReg where 
    fmap f cr = CashReg (f $ getCurrentItem cr, getPrice cr)

instance Applicative CashReg where 
    pure a = CashReg (a,0.0)
    fcr <*> cr = CashReg (getCurrentItem fcr $ getCurrentItem cr, getPrice fcr + getPrice cr) 

instance Monad CashReg where 
    return = pure 
    cr >>= f = 
        let newCr = f $ getCurrentItem cr 
        in CashReg (getCurrentItem newCr, getPrice cr + getPrice newCr)

Maybe

Maybe is used to represent computations that may fail: data Maybe a = Nothing | Just a . It is adopted in many recent languages, to avoid NULL and limit exceptions usage: mainly because exceptions are sometimes complex to manage.

Misc

Modules

Haskell has a simple module system, with import, export and namespaces. Modules provide the only way to build abstract data types (ADT)

IO

IO is not 'compatible' with the functional philosophy. In general, IO computation is based on state change (e.g. of a file), hence if we perform a sequence of operations, they must be performed in order (and this is not easy with call-by-need). IO is an instance of the monad class.

main = do { --sequence of IO actions
putStr "Please, tell me something>";
thing <- getLine; --a way to get the input
putStrLn $ "You told me \"" ++ thing ++ "\".";
}
import System.IO
import System.Environment

readfile = do {
    args <- getArgs; -- command line arguments
    handle <- openFile (head args) ReadMode;
    contents <- hGetContents handle; 
    putStr contents;
    hClose handle;
}

main = readfile