Let's talk about the Unison language.
Unison is a statically typed langage whose main influences are Haskell, Erlang and Frank. It is built upon a simple idea that has really appealing consequences:
Code is content-addressed and immutable
In many langages, the main way to identify something is its name. In Unison, definitions are identified by a hash computed after its content. For instance, those two definitions below have different names but their body is identical. So they share the same hash.
inc x = x + 1
add1 z = z + 1
For each hash, an AST is stored in the unison codebase.
In fact, the text representation that we have seen above is not the primary representation for unison code. The AST is. Text is one of many possible way to create/view/update unison code.
For instance, if we need our previously defined inc
function to define something else, its name will be substituted by its hash in the resulting AST.
add2 x = inc (inc x)
Assuming that the corresponding hash for inc
is #12345
, the below code will be preprocessed as this:
add2 x = #12345 (#12345 x)
Then, argument names are substituted by normalized ones:
add2 arg1 = #12345 (#12345 arg1)
With this hash based representation of our definition, we can compute an AST that is independant of names declared by developers.
This implies that we can change the name inc
for another name without changing the definition of add2
.
Also once inc
has been compiled, if you change its name, then is hash will not be modified. This mean that you do not need to recompile that function or other already compiled definitions that depends on it!
Installation
It is much more fun to try those things on your own machine while following along this article. So I suggest to do so, knowing that Unison is really easy to install, so I invite you to do so:
- OSX (with homebrew)
brew tap unisonweb/unison
brew install unison-language
Once Unison is installed on your machine, you need to create a directory that unison will be launched against and cd inside it.
mkdir Unison # create a dir to start with
cd Unison
then initialize unison and launch the unison codebase manager
ucm init # init the codebase
ucm # entering unison
You should see something like this:
_____ _
| | |___|_|___ ___ ___
| | | | |_ -| . | |
|_____|_|_|_|___|___|_|_|
Welcome to Unison!
You are running version: release/M1m
I'm currently watching for changes to .u files under ~/Code/Unison
Type help to get help.
.>
Once in the Unison REPL is launched, we import the base library.
.> pull https://github.com/unisonweb/base:.releases._latest base
And we are good to go!
The unison codebase manager
Once you have launched ucm
, any file within the current directory is watched. Let's first create a file named scratch.u
with our inc
definition in it.
inc x = x + 1
When saving this file, you can see the output of the ucm
process.
I found and typechecked these definitions in scratch.u. If you do an `add` or `update`, here's
how your codebase would change:
⍟ These new definitions are ok to `add`:
inc : Nat -> Nat
Indicating that our definition is valid.
Now, if we want to try our inc
function, we can add a watch expression. To do so, just add the following line at the end of scratch.u
(you have to include the '>' symbol).
> inc 10
On saving the file, ucm
will spot it and display:
Now evaluating any watch expressions (lines starting with `>`)... Ctrl+C cancels.
3 | > inc 10
⧩
11
Since our inc
function seems to work as intended, we can add it to the codebase.
At the ucm
prompt just type add
:
.> add
⍟ I've added these definitions:
inc : Nat -> Nat
Now that our inc
function is in the codebase, we can delete the content of scratch.u
.
We can ask ucm
to find it like this:
.> find inc
1. inc : Nat -> Nat
2. base.Int.increment : Int -> Int
3. base.List.distinct : [a] -> [a]
4. base.Nat.increment : Nat -> Nat
5. base.Store.modify.examples.increment : Nat
In the list of results, the first item is our function. We can ask ucm
to view it.
.> view 1
inc : Nat -> Nat
inc x =
use Nat +
x + 1
This is it! Enhanced with its infered type declaration and import statement (+ function from module Nat).
The find
operation is handful to explore the current codebase. For instance, we can search for all definitions of a given type:
.> find : Nat -> Nat
1. base.Nat.complement : Nat -> Nat
2. base.Nat.increment : Nat -> Nat
3. base.Nat.leadingZeros : Nat -> Nat
4. base.Nat.trailingZeros : Nat -> Nat
5. base.Nat.decrement : Nat -> Nat
Changing names
Let's define another useful function:
scratch.u
add2 x = inc (inc x)
Once add2
is added to our codebase (with the add
command), we can retrieve its definition with the view
command
.> view add2
add2 : Nat -> Nat
add2 x = inc (inc x)
We suddenly decide that we want to change the name of the inc
function to add1
.
In any other langage we would had to navigate to the inc
definition, change its name, then change all its usages accordingly, but in unison we only have to run this simple ucm
command:
.> move.term inc add1
As we can see, the add2
definition has been correctly updated:
.> view add2
add2 : Nat -> Nat
add2 x = add1 (add1 x)
No recompilation is needed neither, it just works!
Navigating the history
With immutability comes great benefits, as functional programmers often says.
Since our codebase is always a well-typed immutable value, we have time travel for free.
.> reflog
Here is a log of the root namespace hashes, starting with the most recent, along with the command
that got us there. Try:
`fork 2 .old`
`fork #fpkooer5hq .old` to make an old namespace accessible again,
`reset-root #fpkooer5hq` to reset the root namespace and its history to that of the specified
namespace.
1. #fiviinv92c : move.term ..inc .add1
2. #1mlujphnkc : add
3. #cb6lu9ahad : add
4. #lqs21dco5d : pull https://github.com/unisonweb/base:.releases._latest .base
5. #7asfbtqmoj : (initial reflogged namespace)
Documentation
in Unison documentation is first class, it has type Doc
and a dedicated literal.
As an example, let's write documentation for our inc
function:
inc.docs = [:
`@inc n` returns `n + 1`
the source for @inc is
@[source] inc
the result of evaluating `(@inc 1)` is @[evaluate] inc.ex1
its signature is:
`@[signature] inc`
:]
You can then display this docs in ucm:
.> display inc.docs
`inc n` returns `n + 1`
the source for inc is
inc x =
use Nat +
x + 1
the result of evaluating `(inc 1)` is 2
its signature is: `inc : Nat -> Nat`
To get more details on documentation inner syntax please see this page.
One neat thing to keep in mind is that the names you use in your docs have the same behaviour in case of renaming. For instance if we rename inc
to add1
with:
.> move.term inc add1
and then display the doc again:
.> display inc.docs
`add1 n` returns `n + 1`
the source for add1 is
add1 x =
use Nat +
x + 1
the result of evaluating `(add1 1)` is 2
its signature is: `add1 : Nat -> Nat`
Every dynamic value (the one starting by @
) are updated accordingly.
Abilities
Abilities lets you use the same ordinary Unison syntax for programs that do (asynchronous) I/O, stream processing, exception handling, parsing, distributed computation, and lots more...
Hello world!
scratch.u:
greet name = printLine ("Hello " ++ name ++ "!")
program = '(greet "world")
ucm:
.> run program
Hello world!
When saving our scratch.u file, we can see the type of the greet function:
greet : Text -> {IO} ()
which means that the greet function takes a Text
as argument, needs the IO
ability and return nothing (()
or Unit
).
The functions that we have seen up to this point are pure (inc
and add2
). If ucm
display their type as Nat → Nat
, their real type is Nat → {} Nat
, indicating that no abilities were needed.
A function from type a
to type b
that needs several abilities A1
, A2
and A3
will have the type like a → {A1,A2,A3} b
Motivation
There are three main motivations for abilities in Unison:
- Make side effects visible in the type signature.
- Decouple interface from implementation.
- Customize control flow.
Lazyness
As you may have noticed in the hello world example, the '(greet "world")
expression was prefixed by a single quote. This is syntax sugar for (_ → (greet "world"))
which denotes a function taking ()
as argument and returning the result of evaluating (greet "world")
.
scratch.u
:
expr = '(1 + 1)
-- is equivalent to
expr = (_ -> (1 + 1))
-- and can be resumed/evaluated like this
> expr ()
-- or with syntax sugar
> !expr
This kind of quoting is often used in conjunction of abilities, as we have seen with the hello world example.
This is because only functions can use abilities:
Unison is a 'purely functional' language. That means that evaluation of terms cannot in itself be effectful. Having to add the'
delay to effectfully-computed values is a consequence of that. Any effectful code needs to pull its punches — it's not causing effects to happen during evaluation, but rather it's evaluating to a function which can then explain to a handler what effects it would like to be performed. We'll see how that handler in turn can only either (a) turn the requested effects into pure computations, ready for evaluation, or (b) pass the buck, by translating them to requests in another ability, yielding another function. If there are abilities like network or disk I/O, then the buck gets passed all the way to the very outside of the program, and into the Unison runtime system, using theIO
ability. At no point does a Unison term's evaluation ever directly generate an effect.
the IO ability
IO
is a special ability, built in to Unison. It is through IO
that effectful Unison programs can actually interact with the outside world — writing to the console, reading from files and sockets, and any other behavior that goes beyond simply evaluating Unison expressions to a result. When you run your program (for example using ucm's run
command), operations from the IO
ability are handled by the Unison runtime system.
Asking ucm
about IO
:
.> view IO
ability base.io.IO where
getFileSize_ : FilePath ->{base.io.IO} Either Error Nat
kill_ : ThreadId ->{base.io.IO} Either Error ()
send_ : Socket -> Bytes ->{base.io.IO} Either Error ()
bracket_ :
'{base.io.IO} a -> (a ->{base.io.IO} b) -> (a ->{base.io.IO} c) ->{base.io.IO} Either Error c
getLine_ : Handle ->{base.io.IO} Either Error Text
getText_ : Handle ->{base.io.IO} Either Error Text
getFileTimestamp_ : FilePath ->{base.io.IO} Either Error EpochTime
closeFile_ : Handle ->{base.io.IO} Either Error ()
getTemporaryDirectory_ : {base.io.IO} (Either Error FilePath)
getCurrentDirectory_ : {base.io.IO} (Either Error FilePath)
renameDirectory_ : FilePath -> FilePath ->{base.io.IO} Either Error ()
renameFile_ : FilePath -> FilePath ->{base.io.IO} Either Error ()
receive_ : Socket -> Nat ->{base.io.IO} Either Error (Optional Bytes)
fileExists_ : FilePath ->{base.io.IO} Either Error Boolean
isDirectory_ : FilePath ->{base.io.IO} Either Error Boolean
directoryContents_ : FilePath ->{base.io.IO} Either Error [FilePath]
listen_ : Socket ->{base.io.IO} Either Error ()
closeSocket_ : Socket ->{base.io.IO} Either Error ()
clientSocket_ : HostName -> ServiceName ->{base.io.IO} Either Error Socket
delay_ : Nat ->{base.io.IO} Either Error ()
seek_ : Handle -> SeekMode -> Int ->{base.io.IO} Either Error ()
serverSocket_ : Optional HostName -> ServiceName ->{base.io.IO} Either Error Socket
accept_ : Socket ->{base.io.IO} Either Error Socket
setBuffering_ : Handle -> Optional BufferMode ->{base.io.IO} Either Error ()
openFile_ : FilePath -> Mode ->{base.io.IO} Either Error Handle
throw : Error ->{base.io.IO} a
fork_ : '{base.io.IO} a ->{base.io.IO} Either Error ThreadId
getBuffering_ : Handle ->{base.io.IO} Either Error (Optional BufferMode)
position_ : Handle ->{base.io.IO} Either Error Int
setCurrentDirectory_ : FilePath ->{base.io.IO} Either Error ()
createDirectory_ : FilePath ->{base.io.IO} Either Error ()
removeDirectory_ : FilePath ->{base.io.IO} Either Error ()
removeFile_ : FilePath ->{base.io.IO} Either Error ()
systemTime_ : {base.io.IO} (Either Error EpochTime)
isFileEOF_ : Handle ->{base.io.IO} Either Error Boolean
isFileOpen_ : Handle ->{base.io.IO} Either Error Boolean
isSeekable_ : Handle ->{base.io.IO} Either Error Boolean
putText_ : Handle -> Text ->{base.io.IO} Either Error ()
Other abilities
Abilities are declared in an abstract way like this:
ability Store v where
get : v
put : v -> ()
This declaration introduces a Store
ability which contains 2 actions: get
and put
.
Unison has no builtin mutable state mechanism. So this ability is introducing one.
This declaration does not specify any implementation (it is abstract). One nice thing about this is that we can plug any implementation we need (depending on the use-case) without modifying the code that is using those actions.
In order to be able to use this ability, we have to handle it. In Unison this is done by an Ability handler.
If e
is an expression containing action calls and h
is an ability handler then handle e with h
is the way to get the result of evaluating e
.
An ability handler is a function that states how to handle each action we want to use in the handled code (e
). A handler for the Store
ability will have to take care of Store.get
and Store.put
individually. But we will delay this and start with a simpler ability.
Abort
One of the most basic ability one can imagine is the Abort
ability which is able to cut
the execution of a program.
ability Abort where
cut : a
Here is a handler for it:
abortHandler returnedValue = cases
-- handling the cut action
{Abort.cut -> continuation} -> returnedValue
-- Exit case
{a} -> a
There is 2 branches in our cases statement, the first one is handling our cut
action by ignoring the continuation and immediately returning returnedValue
.
The first branch starts with the action pattern {Abort.cut → continuation}
that matches our only action Abort.cut
and specifies what to do when this action is encountered. the continuation
binding is holding the rest of the program. Any action pattern can do many different things with it: ignore it, call it, call it several times, record it... In this case, we are simply ignoring it and immediately returning returnedValue
(the only argument of our handler).
The second branch {a} → a
is the exit branch, every ability handler have to have one. It is taken when all actions calls have been handled.
Let's try it by adding this watch expression:
> handle
x = 4
Abort.cut
x + 2
with
abortHandler 0
it evaluates to 0
as intended (our abortHandler
was initiated with 0
). If we remove the Abort.cut
line it evaluates to 6
.
So what has happened here?
- The first expression was evaluated normally,
x
was bound to4
. - Then we hit an action that has to be handled, so we defer the control of the execution to our ability handler giving him access to the continuation of the program, the first branch of our handler is taken, the continuation is ignored and the handler first argument (
0
in this case) is returned. - The
x + 2
expression is never computed.
Store again
Let's take a look at a more interesting handler.
A handler for the Store
ability may look like this
storeHandler storedValue = cases
-- handling the get action
{Store.get -> cont} ->
handle cont storedValue with storeHandler storedValue
-- handling the put action
{Store.put v -> cont} ->
handle cont () with storeHandler v
-- the exit case
{a} -> a
In general an action pattern has the following structure:
{A.c p_1 p_2 p_n -> k}
Where A
is the name of the ability, c
is the name of the action, p_1
through p_n
are patterns matching the arguments to the action, and k
is a continuation for the program.
Here, our handler has 3 cases, 2 for our get
and put
actions plus the exit case.
The 1st and 2nd branches are using the handler recursively to handle the evaluation of the rest of the computation.
The handler is initiated with an argument (storedValue
) that we will use to keep some piece of state along handling, this is a very common pattern.
One really dummy example that uses the Store
ability:
-- an helper function to update the value of our store
storeUpdate f =
Store.put (f Store.get)
-- our dummy function that uses a mutable state
-- in order to count the number of zeros present in a list
countZeros = cases
-- non empty list case
head +: tail ->
-- first we check if the head is zero and incrementing our store value if so
if (head == 0) then storeUpdate inc else ()
-- then we recur with the tail of the list
countZeros tail
-- empty list case, we are using Store.get to obtain our stored value and return it
[] -> Store.get
-- we are initiating the storeHandler with the value 0
-- and handling the countZeros call with it
> handle countZeros [1,0,3,0,1,0]
with storeHandler 0
-- it returns: 3
Let see what happens on the evaluation of this watch expression:
storeHandler 0
is returning a handler function that will be used to handle thecountZeros [1,0,3,0,1,0]
expression.[1,0,3,0,1,0]
is matching the first branch of thecountZeros
function, we are binding1
tohead
and[0,3,0,1,0]
totail
.- We test if
head
equals0
: this is not the case so the line evaluate to()
. - On the next line, we are recurring with
tail
. - We still match the first branch, binding
0
tohead
and[3,0,1,0]
totail
. - This time
head
equals0
. So we callstoreUpdate
with theinc
function. - The
storeUpdate
function is callingStore.get
. Store.get
is an action. So the control is passed to our handler.- The first branch of
storeHandler
is matching theStore.get
action. - The continuation of the program has to be called with the value that the action is meant to return, here a
Nat
, that is the currently stored value (the argument passed tostoreHandler
initially:0
). - Since the rest of the computation may contains other actions we still have to handle, we use the same handler to do so (wrapping it around our continuation call with an handle expression).
- Now we are back in the
storeUpdate
function, the value that we got withStore.get
(0
) is incremented byinc
and theStore.put
action is called with this new value (1
). - Again, we are entering our handler, this time matching the second branch.
- Like the first branch, we will resume the computation, this time passing
()
to the continuation since theStore.put
is meant to return a value of typeUnit
(()
). This time we will slightly change the handler to capture the new value that we want to store (1
). - And it continues, incrementing our store each time a
0
is encountered until the list is empty. - When the list is empty, the second branch of
countZeros
is taken and we callStore.get
a last time to return the result (in this particular case:3
).