At the backend of computations, being lazy is mighty powerful. Clojure is beautiful in many aspects, and laziness is a core component. There are plenty of articles on it, about it and around it. They cover how to write such code, mostly with examples like Fibonacci or a generic number generator. Some questions should pop up with that. I definitely had some.
So here is a stab at answering those with the lessons I learned while getting lazy.
First the “what”. Laziness, i.e. lazy evaluation, means that parts of the code are not evaluated to completion until their return values are actually required. Think of it like putting off filing tax returns until the last possible moment. That frees us up to do things that are actually useful in the meantime. We could file them early, but we don’t have to. It definitely makes me feel “efficient”. (Side note: for better puns, check out the posts in the references below (: )
Bringing it back to the backend, there are expressions that do not need to be evaluated fully until a later point in the flow. Makes sense on that level. Now the “how”. For this part, let’s take a close look at Clojure’s core principles. Lazy evaluation is definitely one of them. In particular, laziness is built on lazy sequences, which can be passed between functions and are evaluated only when something actually consumes their values. Here is a quick look at what I mean (as given in https://clojuredocs.org/clojure.core/lazy-seq):
;; The following defines a lazy-seq of all positive numbers. Note that
;; the lazy-seq allows us to make a recursive call in a safe way because
;; the call does not happen immediately but instead creates a closure.
user=> (defn positive-numbers
         ([] (positive-numbers 1))
         ([n] (lazy-seq (cons n (positive-numbers (inc n))))))
#'user/positive-numbers
user=> (take 5 (positive-numbers))
(1 2 3 4 5)
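One way to see that nothing runs until a value is asked for is `realized?`, which reports whether a lazy sequence has been forced yet. This is a minimal sketch that reuses the `positive-numbers` definition from above; the var name `nums` is only for illustration.

```clojure
(defn positive-numbers
  ([] (positive-numbers 1))
  ([n] (lazy-seq (cons n (positive-numbers (inc n))))))

;; Creating the sequence does no work yet.
(def nums (positive-numbers))
(realized? nums)        ;=> false

;; Asking for the first element forces only the head of the sequence.
(first nums)            ;=> 1
(realized? nums)        ;=> true
(realized? (rest nums)) ;=> false
```

The tail stays unrealized until something walks further into the sequence, which is why an "infinite" definition like this one is safe to create.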
Awesome! With that, here is my first lesson learned
Lesson 1 — Great that I want to be lazy, but what exactly do I want to be lazy at?
“Why?”, you ask. There are always parts of code that don’t need to finish until their results are required. Even when they are, we often do not need all of the return values from the evaluation, maybe just n out of (count allresults). That sounds like a fair advantage to have.
Let’s rewrite the positive-numbers example without laziness and look at the potential costs involved.
(defn not-lazy-positive-numbers [n]
  (mapv #(let [v (inc %)]
           ; to know when evaluation happens
           (println "executing" v)
           v)
        (range (- n 1) (+ n 10))))

;returns n to n+10 values starting from n
(not-lazy-positive-numbers 10)
;executing 10
;executing 11
;executing 12
;executing 13
;executing 14
;executing 15
;executing 16
;executing 17
;executing 18
;executing 19
;executing 20
;[10 11 12 13 14 15 16 17 18 19 20]
not-lazy-positive-numbers evaluates immediately. To show this, let’s say we have to pick 15 numbers from the pool of positive numbers between 10 and 40.
(take 15 (concat (not-lazy-positive-numbers 10)
                 (not-lazy-positive-numbers 20)
                 (not-lazy-positive-numbers 30)))
;executing 10
;executing 11
;executing 12
;executing 13
;executing 14
;executing 15
;executing 16
;executing 17
;executing 18
;executing 19
;executing 20
;executing 20
;executing 21
;executing 22
;executing 23
;executing 24
;executing 25
;executing 26
;executing 27
;executing 28
;executing 29
;executing 30
;executing 30
;executing 31
;executing 32
;executing 33
;executing 34
;executing 35
;executing 36
;executing 37
;executing 38
;executing 39
;executing 40
;(10 11 12 13 14 15 16 17 18 19 20 20 21 22 23)
Wow, it evaluates everything, i.e. all 33 values, even though we just needed 15 of them.
Lesson 2 — There are plenty of code sections that do more than what is needed, look for them
Now, applying some laziness back from the original example,
(defn lazy-positive-numbers [n]
  (println "executing" n) ; to know what's executing, returns a lazy seq of max 1+10 executions
  (lazy-seq (cons n (take 10 (lazy-positive-numbers (inc n))))))

(lazy-positive-numbers 10)
;executing 10
;executing 11
;executing 12
;executing 13
;executing 14
;executing 15
;executing 16
;executing 17
;executing 18
;executing 19
;executing 20
;executing 21
;(10 11 12 13 14 15 16 17 18 19 20)
lazy-positive-numbers always returns 11 values (n through n+10), as a lazy sequence. (Side note: ugly code for the lazy seq, could have been thread-lasted :/)
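For reference, here is roughly what a thread-last form could look like. This is a sketch, not the version used for the outputs in this post, and the name `lazy-positive-numbers-threaded` is made up here. It keeps the `println` outside `lazy-seq` so the printing behaviour stays the same.

```clojure
(defn lazy-positive-numbers-threaded [n]
  (println "executing" n) ; runs as soon as the function is called
  (lazy-seq
    (->> (inc n)
         (lazy-positive-numbers-threaded) ; recursive call, deferred by lazy-seq
         (take 10)                        ; cap the tail at 10 elements
         (cons n))))                      ; put n in front of the tail

(lazy-positive-numbers-threaded 10)
```

Whether this reads better than the nested form is a matter of taste; the steps now appear in the order they are applied.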
Again, let’s say we have to pick 15 numbers from the pool of positive numbers between 10 and 40, this time lazily.
(take 15 (concat (lazy-positive-numbers 10)
                 (lazy-positive-numbers 20)
                 (lazy-positive-numbers 30)))
;executing 10
;executing 20
;executing 30
;executing 11
;executing 12
;executing 13
;executing 14
;executing 15
;executing 16
;executing 17
;executing 18
;executing 19
;executing 20
;executing 21
;executing 21
;executing 22
;executing 23
;executing 24
;(10 11 12 13 14 15 16 17 18 19 20 20 21 22 23)
Awesome! With lazy eval, the number of executions came down to 18, and 3 of those are just the up-front initialization of the three sources.
Lesson 3 — Laziness comes with the overhead of the top-level initialization. More lazy sources, more overhead.
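If that up-front cost matters, one option (not used in the rest of this post) is `lazy-cat` from `clojure.core`, which wraps each argument in its own `lazy-seq` so that a source is only initialized when the consumer reaches it. This is a minimal sketch, assuming the same `lazy-positive-numbers` as above. The helper `printed-lines` is hypothetical and only captures what got printed.

```clojure
(require '[clojure.string :as str])

(defn lazy-positive-numbers [n]
  (println "executing" n)
  (lazy-seq (cons n (take 10 (lazy-positive-numbers (inc n))))))

(defn printed-lines
  "Hypothetical helper: runs thunk f, forces its result, returns the printed lines."
  [f]
  (str/split-lines (with-out-str (doall (f)))))

(def lines
  (printed-lines
    #(take 15 (lazy-cat (lazy-positive-numbers 10)
                        (lazy-positive-numbers 20)
                        (lazy-positive-numbers 30)))))

;; The third source is never reached, so its initialization never prints.
(some #{"executing 30"} lines) ;=> nil
```

The trade-off is that the initialization of a later source now happens in the middle of consumption rather than up front, which may or may not suit a given data source.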
(Unimportant lesson: IRL = “In Real Life”)
Now let’s take a real-world example. We are rendering a user feed page with posts and updates. The number of posts returned is based on certain relevance parameters and is paginated. The posts come from multiple sources, to build a real-time feed from the different signals received, like an orchestration. A source could be a database, an ML data model, a cache, or some preset back-fill data. Every source has a computation and a latency cost to retrieve data. In pseudo-code, it should look like
posts = []
sourcelist = get-data-sources()
until (posts.length >= limit or no sources left) {
  data = get-data-from-source(next source in sourcelist)
  remdata = data.slice(0, limit - posts.length)  // only what is still needed
  posts.push(remdata)
}
This looks well placed to be made lazy. Each entry in sourcelist is a generator, i.e. it returns a sequence of data that can be pooled together, transformed and added to the returned posts. Applying the lazy principles, this is what it looks like with all the functions:
(defn get-data-sources []
  (lazy-seq
    [#(lazy-positive-numbers 10) ;data generators, can be replaced with actual db calls
     #(lazy-positive-numbers 20)
     #(lazy-positive-numbers 30)]))

(defn get-data-from-sources [sourcelist]
  (map #(apply % []) sourcelist))

(defn get-posts [limit]
  (->> (get-data-sources)      ;returns seq of data-sources
       (get-data-from-sources) ;returns a lazy-seq of results
       (apply concat)          ;concat all lazy-seq before taking
       (take limit)))
;Executing should call other data sources only after exhausting the current one
(get-posts 15)
;executing 10
;executing 20
;executing 30
;executing 11
;executing 12
;executing 13
;executing 14
;executing 15
;executing 16
;executing 17
;executing 18
;executing 19
;executing 20
;executing 21
;executing 21
;executing 22
;executing 23
;executing 24
;(10 11 12 13 14 15 16 17 18 19 20 20 21 22 23)
Woohoo! Imagine executing all 33 calls, as the eager version would, to retrieve 15 entries, phew.
Lesson 4 — Laziness is good for filling up a sequence from different sources.
Another important thing to note, which is not obvious, is that every action from the pseudo-code has been broken down into functions. It would be much harder to use laziness without being functional.
Lesson 5 — It is difficult to be lazy if the code is not functional.
Laziness usually gives a sense of low speed. But is that the case? Let’s add a tiny bit of instrumentation to the output:
(defn eval-not-lazy []
  (time
    (let [result (take 15 (concat (not-lazy-positive-numbers 10)
                                  (not-lazy-positive-numbers 20)
                                  (not-lazy-positive-numbers 30)))]
      (println result))))

(defn eval-lazy []
  (time
    (let [result (take 15 (concat (lazy-positive-numbers 10)
                                  (lazy-positive-numbers 20)
                                  (lazy-positive-numbers 30)))]
      (println result))))
; (the "executing" lines are omitted from the output below)
(eval-not-lazy)
;(10 11 12 13 14 15 16 17 18 19 20 20 21 22 23)
;"Elapsed time: 0.571285 msecs"
;nil

(eval-lazy)
;(10 11 12 13 14 15 16 17 18 19 20 20 21 22 23)
;"Elapsed time: 0.415577 msecs"
;nil
In this single run, the lazy version is definitely in the same order of speed as the not-lazy version.
Lesson 6 — Laziness doesn’t impact speed negatively, it can be the same or better
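A single `time` call on a sub-millisecond expression is noisy, since JIT warm-up and the `println` calls themselves weigh heavily, so the numbers above are best read as a rough indication. A slightly steadier sketch is to drop the printing and average over many runs. Everything named here (`eager-source`, `lazy-source`, `pick-15`, `avg-msecs`) is hypothetical and not from the original talk, and no timings are claimed because results vary by machine.

```clojure
(defn eager-source
  "Eagerly builds the vector n..n+10, like not-lazy-positive-numbers minus the println."
  [n]
  (mapv inc (range (- n 1) (+ n 10))))

(defn lazy-source
  "Lazily yields n..n+10, like lazy-positive-numbers minus the println."
  [n]
  (lazy-seq (cons n (take 10 (lazy-source (inc n))))))

(defn pick-15
  "Forces 15 values out of three concatenated sources built by `source`."
  [source]
  (doall (take 15 (concat (source 10) (source 20) (source 30)))))

(defn avg-msecs
  "Average wall-clock milliseconds per call of thunk f over `runs` calls."
  [runs f]
  (let [start (System/nanoTime)]
    (dotimes [_ runs] (f))
    (/ (- (System/nanoTime) start) 1e6 runs)))

;; Warm up both variants first, then measure; print rather than assume a winner.
(avg-msecs 1000 #(pick-15 eager-source))
(avg-msecs 1000 #(pick-15 lazy-source))
(println "eager:" (avg-msecs 10000 #(pick-15 eager-source)) "msecs/call")
(println "lazy: " (avg-msecs 10000 #(pick-15 lazy-source)) "msecs/call")
```

For anything beyond a quick sanity check, a dedicated benchmarking library would be the more trustworthy route.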
Now, let’s see how laziness reacts to concurrent execution, and whether concurrency can even be used. In our example, we use pmap to run the execution in parallel. (The joys of Clojure :)).
(defn parallel-get-data-from-sources [sourcelist]
  (pmap #(apply % []) sourcelist))

(defn parallel-get-posts [limit]
  (->> (get-data-sources)               ;returns seq of data-sources
       (parallel-get-data-from-sources) ;returns a lazy-seq of results
       (apply concat)                   ;concat all lazy-seq before taking
       (take limit)))
(parallel-get-posts 15)
; (the first line is interleaved because the sources print from different threads)
;executingexecuting 1020
;
;executing 30
;executing 11
;executing 12
;executing 13
;executing 14
;executing 15
;executing 16
;executing 17
;executing 18
;executing 19
;executing 20
;executing 21
;executing 21
;executing 22
;executing 23
;executing 24
;(10 11 12 13 14 15 16 17 18 19 20 20 21 22 23)
Yup, it can be used, but it doesn’t make much sense here: we are filling up a sequence in order, so parallelism only helps if the sources have initialization work that can happen in parallel.
Lesson 7 — Concurrency doesn’t affect laziness; the sequence is still filled in order
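To illustrate the one case where `pmap` does pay off, here is a sketch with sources whose setup is slow. `slow-source` and `fetch-with` are hypothetical names, and the `Thread/sleep` merely stands in for something like opening a connection.

```clojure
(defn slow-source
  "Hypothetical source: blocks for `ms` (setup cost), then returns a lazy seq of 3 values."
  [ms start]
  (Thread/sleep (long ms))
  (take 3 (iterate inc start)))

(def sources
  [#(slow-source 100 10)
   #(slow-source 100 20)
   #(slow-source 100 30)])

(defn fetch-with
  "Initializes the sources via `mapper` (map or pmap), then takes `limit` values."
  [mapper limit]
  (->> sources
       (mapper #(%))   ; call each zero-argument source fn
       (apply concat)
       (take limit)
       (doall)))

;; Same values either way; only the setup runs one-by-one vs. side-by-side.
(time (fetch-with map 5))  ; setup costs add up
(time (fetch-with pmap 5)) ; setup costs overlap
```

When the sources are cheap to create, as in the example in this post, the two behave alike and `pmap` mostly adds thread coordination.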
Like all hammers out there, this is a hammer that serves a particular group of nails, and it is good to understand when it shouldn’t be used. Many core Clojure functions like take, map, repeat, etc. return lazy sequences. In cases where the whole set of expressions should be evaluated, laziness should be short-circuited to evaluate everything.
In our example, the short circuit can be triggered with a doall
(->> (concat (lazy-positive-numbers 10)
             (lazy-positive-numbers 20)
             (lazy-positive-numbers 30))
     (doall)
     (take 15))
;executing 10
;executing 20
;executing 30
;executing 11
;executing 12
;executing 13
;executing 14
;executing 15
;executing 16
;executing 17
;executing 18
;executing 19
;executing 20
;executing 21
;executing 21
;executing 22
;executing 23
;executing 24
;executing 25
;executing 26
;executing 27
;executing 28
;executing 29
;executing 30
;executing 31
;executing 31
;executing 32
;executing 33
;executing 34
;executing 35
;executing 36
;executing 37
;executing 38
;executing 39
;executing 40
;executing 41
;(10 11 12 13 14 15 16 17 18 19 20 20 21 22 23)
If we miss the doall, not all expressions will be evaluated.
Lesson 8 — Be wary of core Clojure functions: **map != mapv**, **filter != filterv** and so on
Lesson 9 — Use **doall** to short-circuit laziness and evaluate all
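To make the last two lessons concrete, here is a small sketch with a side-effecting function. The names `log`, `record!` and `pending` are invented for this illustration.

```clojure
(def log (atom []))

(defn record!
  "Appends x to the log and returns x."
  [x]
  (swap! log conj x)
  x)

;; map is lazy: building the sequence runs nothing.
(def pending (map record! [1 2 3]))
@log                   ;=> []

;; doall forces every element, so all the side effects happen.
(doall pending)
@log                   ;=> [1 2 3]

;; mapv is eager: the effects happen right away and a vector comes back.
(reset! log [])
(mapv record! [4 5 6]) ;=> [4 5 6]
@log                   ;=> [4 5 6]
```

If only the side effects matter and the values can be thrown away, `dorun` forces the sequence the same way without holding on to the results.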
In conclusion, “Laziness”, you are awesome.
The above lessons came out of an internal dev/engineering talk at Swym Corporation. The talk had a broader outline of functional patterns with Clojure, which I hope to cover in my next post.
Special thanks to Saumitra, Supritha and Shivam for helping make this post look good. Hope you enjoy reading it as much as we enjoyed hacking it together. Please feel free to chime in with your thoughts. Appreciate your comments and questions! :)
P.S: Maybe the best of things? Nope, sorry. That’s kinda taken by “Hope”. I am not saying that. Andy Dufresne did :)