In my previous post, Clojure STM - What? Why? How? I gave a high level run through of Software Transactional Memory in clojure via Multiversion concurrency control. The first example demonstrated how a transfer between two bank accounts could be achieved transactionally. A second example talked about a report over all the bank accounts in the system, whilst transactions like those in the first example were proceeding.
This second example prompted some discussion on reddit.com. The main question was ‘are the values of the refs in all the accounts consistent’
This post is an investigation of the points raised
Lets look again at the code of that example…
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
The key point to notice here that the current value of each account is
retrieved via the deref
in the call (map deref accounts)
. This makes
the current value of the ref the in-transaction value for the
remainder of the transaction. Since there is a call to deref
for
each account and traversing the list takes finite time
(especially since map
returns a lazy sequence) is there a
potential for a change to a value in the accounts collection while
we’re traversing the list?
All reads of Refs will see a consistent snapshot of the ‘Ref world’ as
of the starting point of the transaction (its ‘read point’). The
transaction will see any changes it has made. This is called the
in-transaction-value.
How can we validate that this is the case? Here are two functions
- A function that takes a ref and increments it after a given delay.
- A function that derefs a given ref, delays a given amount then derefs a different ref.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
UPDATED: Code above updated at 10:17GMT 03 Jul 2014 following comments in #clojure by hiredman
Given these two methods we can engineer a situation where we
(deref ref2)
in one transaction, update ref1
a few times in other
transactions, then (deref ref1)
in the first transaction. If the
assertion about a consistent snapshot is correct then the updates to
ref1
shouldn’t be visible in our first transaction.
So what happens when we run this?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
The r1=3
result is not what we’re expecting. If the read of
ref2
creates a ‘read point’ that all other refs are aligned with,
the updates to ref1
should not be visible here. Also we see that
the READ transaction is unexpectedly re-starting.
Remember that the ‘READ’ transaction here is read-only.
It seems that clojure has something of an inconsistency here. When the ‘READ’ transaction is complete, the runtime looks for an entry in the history of all the refs from before the transaction started. If it finds an entry then the transaction completes, otherwise the transaction is aborted and retried.
UPDATE
In this case this means that it (incorrectly in my opinion)picks up
the latest version of ref1 and completes, returning r1=3.
So it seems that the concerns raised on reddit were valid, because clojure
doesn’t strictly follow the behaviour described in the documentation.
Luckily there is a way to get the desired behaviour - add some history to the
refs…
In this case this means that the transaction picks up the latest
version of ref1
and completes, returning r1=3
. This is a
‘correct’/consistent result, but the system has done unnecessary work.
So it seems that the concerns raised on reddit were valid, because clojure doesn’t strictly follow the behaviour described in the documentation.
There is a way to get the desired behaviour, but it feels like a hack. By adding history to the ref, the read-only transaction completes first time. To me, this change in behaviour, which isn’t incorrect exactly, is wasteful of resources…
END UPDATE
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
|
Although this works, and would apply to our original model for accounts, I don’t really like it.
A better use of ref
There’s a better way to model this; Rather than make each account a ref in it’s own right, store the collection of accounts under a single ref and update the whole collection for each transfer.
Here’s how a simple implementation of that might look:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
|
The transfer function would be updated accordingly…
1 2 3 4 5 6 7 8 9 10 |
|
In this post we’ve investigated consistency between ref
s, found a
potential problem with the clojure runtime and eventually improved out
model to avoid the problem.
UPDATED: Text above updated at 15:45GMT 17 Apr 2012 following comments from Antti-Ville Tuunainen
In the next post we’ll discuss a subtle requirement when dealing with refs. Namely - How do you ensure that a ref hasn’t changed during a transaction?