Example of a functional programming refactoring pattern

Often, discussions and tutorials about refactoring focus on object-oriented languages. It's not clear if this stems from the fact that Martin Fowler's absolute masterpiece, Refactoring: Improving the Design of Existing Code, targets OO, or if it's simply that OO is still the dominant paradigm in the code out there needing refactoring, or some other reason.

And of course, many of the patterns that Fowler describes can be applied directly to functional programming languages. In fact, Martin Fowler himself now lists Extract method as an alias for Extract function. Rename variable will always be paradigm-agnostic, but there's no clear functional equivalent for Replace Superclass with Delegate.

I would argue that this is not a problem at all, and that understanding techniques that don't quite match your language of choice is still a good exercise and a good way to improve how we think about our own code. However, one side effect of all this is that there are probably some patterns that are, if not specific to functional programming, at least more useful in that context.

Here's a variation on Extract function that illustrates this.

When moving some code around, CodeScene's pull request integration starts to complain about this function. Specifically, CodeScene says that it has "Deeply nested complexity", which, to be honest, it does have:


(defn insert-emails-for-existing-users [db-spec detected-users-with-emails]
  (sql/with-db-transaction [tx db-spec]
    (doseq [[email name] (normalized-unique-user-emails detected-users-with-emails)]
      (when-not (string/blank? email)
        (let [{user-nickname :nick} (user-by-name tx name)]
          (cond (not user-nickname)
                nil
                (not (user-exists-with-email? tx name email))
                (try
                  (insert-email-for-user tx user-nickname email)
                  (catch SQLException e
                    (if (db/duplicate-entry-error? e)
                      (log/errorf "User '%s' with email '%s' already exists, despite select returning none."
                                  user email)
                      ;; re-throw any other SQL exceptions
                      (throw e))))))))))

The part of the function that actually does something useful is the call to insert-email-for-user. It is nested inside a with-db-transaction block, a doseq loop, a when-not conditional, a let block, another conditional with cond, and finally a try block.

There's even another level of nesting inside the catch expression.

So yes, that's a lot.

A first glance at the code doesn't reveal any easy refactoring solutions. There aren't any five-line blocks of code that could easily be extracted. It's hard to see where to break things apart.

Can that (when-not (string/blank? email)...) test be included in the cond? Not really, because we want to avoid hitting the database when grabbing user-nickname... which we need in the cond.

We might be tempted by a Replace Temp with Query to remove


(let [{user-nickname :nick} (user-by-name tx name)] ;...)

except that we do need user-nickname twice. Martin Fowler might tell us to go for it and worry about the optimization later, but avoiding a database query in this case still seems worth it, both in terms of performance and readability. Plus, we shouldn't be punished for using let blocks, right?

Maybe it's time for one more cup of coffee and to take a step back.

The problem here is our doseq loop. It's built in a slightly non-idiomatic style. doseq itself is only used for extracting the two bindings, email and name, from each item in the list returned by normalized-unique-user-emails. The when-not and part of the cond expressions are actually just filtering out some of the items from that list. The (not user-nickname) condition just logs and proceeds to the next interation of the loop for example.

Looking at things this way, we can see that the doseq loop is really mixing two roles: performing an action, (insert-email-for-user tx user-nickname email), and deciding which items to apply that to.

Maybe if we separate these two roles, we can finally extract some logic. Our strategy is going to be to move the filtering logic to the list returned by the call to normalized-unique-user-emails.

As a first step, let's try removing the when-not nesting:


(doseq [[email name] (->> detected-users-with-emails
                          normalized-unique-user-emails
                          (remove (fn [[email]] (string/blank? email))))]
  (let [{user-nickname :nick} (user-by-name tx name)]
	;;; 
  ))

By pushing the filtering to the list, with the call to remove, we get rid of one level of nesting.

Now let's see what we can do with the user-nickname. Since we want to avoid unnecessary database calls, we will need to be careful. On the other hand, the string/blank? test is out of the way, so that helps.

Here, we will move our database call to our "list preparation code" and incorporate the resulting nickname into our loop bindings:


(doseq [[email name user-nickname] (->> detected-users-with-emails
                                        normalized-unique-user-emails
                                        (remove (fn [[email]] (string/blank? email)))
                                        (map (fn [[_ name :as user-vec]]
                                               (conj user-vec (:nick (user-by-name tx name))))))]
  (cond (not user-nickname)
        nil
        ;; 
        ))

With the call to map, we append the nickname for each user to each item in the list. And then we use that as one of the doseq bindings. The let block and its level of nesting? Gone!

We're still not done though. The (not user-nickname) check is really a filter. It can move up to the list as well. We can be even greedier though: let's eliminate the (not (user-exists-with-email? tx name email)) too.


(defn insert-emails-for-existing-users [db-spec detected-users-with-emails]
  (sql/with-db-transaction [tx db-spec]
    (doseq [[email name user-nickname] (->> detected-users-with-emails
                                            normalized-unique-user-emails
                                            (remove (fn [[email]] (string/blank? email)))
                                            (map (fn [[_ name :as user-vec]]
                                                   (conj user-vec (:nick (user-by-name tx name)))))
                                            (filter (fn [[_ _ user-nickname]] user-nickname)) ; replaces (not user-nickname)
                                            (remove (fn [[email name]] (user-exists-with-email? tx name email))))]
      (try
        (insert-email-for-user tx user-nickname email)
        (catch SQLException e
          (if (db/duplicate-entry-error? e)
            (log/errorf "User '%s' with email '%s' already exists, despite select returning none."
                        user email)
            ;; re-throw any other SQL exceptions
            (throw e)))))))

And now that we've moved all the filtering and data-gathering to the input list, we finally have some code that we can apply Extract function to.


(defn prepare-user-list [tx detected-users-with-emails]
  (->> detected-users-with-emails
       normalized-unique-user-emails
       (remove (fn [[email]] (string/blank? email)))
       (map (fn [[_ name :as user-vec]]
              (conj user-vec (:nick (user-by-name tx name)))))
       (filter (fn [[_ _ user-nickname]] user-nickname)) ; replaces (not user-nickname)
       (remove (fn [[email name]] (user-exists-with-email? tx name email)))))

(defn insert-emails-for-existing-users [db-spec detected-users-with-emails]
  (sql/with-db-transaction [tx db-spec]
    (doseq [[email name user-nickname] (prepare-user-list tx detected-users-with-emails)]
      (try
        (insert-email-for-user tx user-nickname email)
        (catch SQLException e
          (if (db/duplicate-entry-error? e)
            (log/errorf "User '%s' with email '%s' already exists, despite select returning none."
                        user email)
            ;; re-throw any other SQL exceptions
            (throw e)))))))

This looks better. We've solved the nested complexity problem in insert-emails-for-existing-users, which no longer has any conditionals at all. On the other hand, prepare-user-list looks like it could use some love. Seeing all the filtering logic combined like this, it becomes clear that we have redundant database queries: user-by-name and user-exists-with-email? are obviously doing almost the same the same thing.

Let's write a new function called user-by-name-without-matching-email.


(defn user-by-name-without-matching-email
  [tx user-name email]
  ;; TODO: consider pushing this logic down to DB layer
  (when-let [u (user-by-name tx user-name)]
    (when (not= email (:email u))
      u)))

As mentioned in the comment: if the existing user-by- functions don't let us do this, we could write a new database query that gives us the logic we need. That would allow us to eliminate those nested when expressions. Combining two database hits into one might also provide a performance benefit too. If so, it means that all this refactoring actually helped us improve performance simply by organizing the code better and helping us to see a better solution.

Now we can simplify prepare-user-list even more:


(defn prepare-user-list [tx detected-users-with-emails]
  (->> detected-users-with-emails
       normalized-unique-user-emails
       (remove (fn [[email]] (string/blank? email)))
       (keep (fn [[email name :as user-vec]]
               (when-let [u (user-by-name-without-matching-email tx name email)]
                 (when (:nick u)
                   (conj user-vec (:nick (user-by-name tx name)))))))))

For readers who aren't Clojure programmers and somehow made it this far: keep combines the transforming logic of map while filtering out nils. This allows us to combine map, filter and remove here. In fact, we can even skip the preceeding call to remove entirely by making the string/blank? check part of the keep predicate.

The test for (:nick u) should be pushed to the query function too. If we put this all together, we end up with something like this:


(defn user-by-name-with-nick-and-without-matching-email
  [tx user-name email]
  ;; TODO: consider pushing this logic down to DB layer
  (when-let [u (user-by-name tx user-name)]
    (when (and (:nick u) (not= email (:email u))
      u))))

(defn prepare-user-list [tx detected-users-with-emails]
  (->> detected-users-with-emails
       normalized-unique-user-emails
       (keep (fn [[email name :as user-vec]]
               (when-not (string/blank? email) 
                 (when-let [u (user-by-name-with-nick-and-without-matching-email tx name email)]
                   (conj user-vec (:nick (user-by-name tx name)))))))))

(defn insert-emails-for-existing-users [db-spec detected-users-with-emails]
  (sql/with-db-transaction [tx db-spec]
    (doseq [[email name user-nickname] (prepare-user-list tx detected-users-with-emails)]
      (try
        (insert-email-for-user tx user-nickname email)
        (catch SQLException e
          (if (db/duplicate-entry-error? e)
            (log/errorf "User '%s' with email '%s' already exists, despite select returning none."
                        user email)
            ;; re-throw any other SQL exceptions
            (throw e)))))))

CodeScene won't complain about the complexity anymore. Your eyes and your brain won't either.

What did we actually do here? This refactoring process ended with some function extractions. To get there, though, we had to regroup our logic first, moving from an iterative approach to a typically functional and data-centric list transformation approach. In Clojure, doseq is a macro reserved for non-functional tasks, that is doing something on each iteration. This refactoring makes our code more functional in the sense that we moved as much logic as possible to operations on a list, rather than mixing the data logic together with the mechanics of database insertion and database error handling. The work being done by the conditionals in the original version of the function is now being accomplished by the predicate arguments to filter and remove.

Martin Fowler does in fact identify some patterns that are similar to what we've done here. Replace Loop with Pipeline might be the closest. Or maybe Separate Query from Modifier. There is also Split Loop. Should we call this Process List before Looping? Isolate mutation? It probably doesn't matter. This combination of changes, combined with Extract function, are useful with doseq that it's worth keeping the pattern in mind the next time you run into similar code.