Uberjar dependencies

Let me tell you a story about debugging a CI build in which lein uberjar pulled unexpected dependencies into our JAR.

 

Intro

 

Recently, we spotted a problem in our CodeScene test environment - the codescene docker container would not start. The app failed early in the initialization sequence with a mysterious dependency error:

 

Exception in thread "main" java.lang.ExceptionInInitializerError
Caused by: Syntax error macroexpanding at (duct/core/repl.clj:1:1).
at clojure.lang.Compiler.load(Compiler.java:7665)
at clojure.lang.RT.loadResourceScript(RT.java:381)
...
at user$eval138$loading__6789__auto____139.invoke(user.clj:1)
at user$eval138.invokeStatic(user.clj:1)
at user$eval138.invoke(user.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:7194)
...
at ***.core.<clinit>(Unknown Source)
Caused by: java.io.FileNotFoundException: Could not locate fipp/ednize__init.class, fipp/ednize.clj or fipp/ednize.cljc on classpath.
at clojure.lang.RT.load(RT.java:462)
...
at clojure.core$require.doInvoke(core.clj:6038)
at clojure.lang.RestFn.invoke(RestFn.java:482)
at duct.core.repl$eval1257$loading__6789__auto____1258.invoke(repl.clj:1)
at duct.core.repl$eval1257.invokeStatic(repl.clj:1)
at duct.core.repl$eval1257.invoke(repl.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:7194)
at clojure.lang.Compiler.eval(Compiler.java:7183)
at clojure.lang.Compiler.load(Compiler.java:7653)
... 35 more

 

 

 

Debugging

 

 

Reproduce the problem

 

The first step should be to reproduce the bug yourself. See it fail - on your machine.

 

Obviously, everything was working in the REPL (otherwise we would have found the problem earlier), but the uberjar was not starting.

 

I ran the build script on my laptop and tried to run it via java -jar …. It worked without a problem.

 

Then I built a whole docker image and tried to run it. That worked too…​

 

 

Welcome to the (CI) caves!

 

I was thinking: "there must be some difference between my laptop (macOS) and the GitHub Actions machine (Ubuntu)".

My first idea: different versions of Java. But how could that cause the problem? So maybe Leiningen?

 

Let’s see what I had locally:

 

lein --version
Leiningen 2.9.8 on Java 17.0.2 OpenJDK 64-Bit Server VM

 

But how to find out the version that the CI uses? Perhaps by debugging the build?

 

I found the convenient debug-via-ssh action on GitHub Marketplace. I added it to our CI workflow, restarted the build and connected via SSH to the machine. There I ran the same lein --version command as before:

 

lein --version
Leiningen 2.9.9 on Java 11.0.16 OpenJDK 64-Bit Server VM
 

Hmm, OK. So the Java version was different, but I wasn’t sure how that could cause the problem. Besides, I had also tried Java 8 earlier and it worked.[1]
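Comparing the two environments by eye is easy to get wrong, so it can help to normalize the version string on each machine and compare the results. This is only a minimal sketch, assuming the "Leiningen X on Java Y" output format shown above; the function name tool_versions is made up for illustration.

```shell
# Reduce `lein --version` output (read from stdin) to "lein=<ver> java=<ver>".
# Assumes the "Leiningen X on Java Y ..." format shown above.
tool_versions() {
  sed -n 's/^Leiningen \([0-9][0-9.]*\) on Java \([0-9][0-9._]*\).*/lein=\1 java=\2/p'
}

# Usage: run this on the laptop and on the CI machine, then compare the two lines.
#   lein --version | tool_versions
```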

 

Then maybe Leiningen? I decided to upgrade it on my machine:

 

brew upgrade leiningen
...
lein --version
Leiningen 2.9.9 ...

 

Then I made a new build and ran it again. Finally, I got the same error!

 

 

 

Digging in

 

We had the reproducer - a good start.

 

Now back to the stacktrace. The key message there is "Could not locate fipp/ednize__init.class":


Caused by: java.io.FileNotFoundException: Could not locate fipp/ednize__init.class, fipp/ednize.clj or fipp/ednize.cljc on classpath.

 

It suggests a missing or conflicting (version) dependency - in this case, the fipp library.

 

This was surprising because everything was working on our machines and the CI tests were passing too.

 

Reading the stacktrace, we can see that:

 

  1. There’s a problem in duct/core/repl.clj file

  2. This file is being loaded via some user.clj file on the classpath

  3. user.clj is requiring, at least transitively, duct.core.repl

  4. duct.core.repl requires fipp.ednize
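In a long trace, the chain of causes walked through in the list above can be pulled out mechanically. The following is a minimal sketch, assuming the trace has been saved to a file; the function name cause_chain and the file name app.log are made up for illustration.

```shell
# Print only the "Caused by:" lines of a Java stack trace read from stdin.
# This yields the exception chain, ending with the root cause.
cause_chain() {
  grep -E '^[[:space:]]*Caused by:'
}

# Usage (app.log is a hypothetical file holding the trace):
#   cause_chain < app.log
```

The last line printed is usually the one to start from; here it would be the FileNotFoundException for fipp/ednize.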

 

But wait, what user.clj? How come there’s a user.clj on the classpath when we don’t have it anywhere in our production dependencies?

 

 

Exploring the JAR

 

To answer those questions, I decided to dig into the JAR file. I had a broken docker image, so it was easy to copy the JAR file from the container to my host OS.

 

Opening the JAR with Midnight Commander (a JAR is just a special zip archive) led to a surprise: there was a user.clj right in the JAR’s root.

 

By looking at the file content, I could see that this was the user.clj from one of our modules that we include as a dependency in the application. This file indeed requires duct.core.repl, which requires fipp.ednize.
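The same check can be done from the command line without a file manager. This is a rough sketch, not the way the JAR was inspected here: it filters a JAR listing for Clojure sources sitting directly in the archive root. The function name root_clj_entries and the JAR file name are hypothetical; `jar tf` ships with the JDK.

```shell
# Filter a JAR listing (one entry per line on stdin) down to Clojure source
# files that sit directly in the archive root, such as a stray user.clj.
# Exits non-zero when nothing matches, because grep does.
root_clj_entries() {
  grep -E '^[^/]+\.clj[cs]?$'
}

# Usage (app-standalone.jar is a hypothetical file name):
#   jar tf app-standalone.jar | root_clj_entries
```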

 

But this user.clj was a dev-only source file (in the dev/src directory) that was specifically included only in leiningen’s dev profile in the module’s project.clj:

 

  :profiles {:dev {:dependencies [[fipp "0.6.24"]
                                  ... ]
                   :source-paths ["dev/src"]}}

 

 

Looking for advice

 

I had figured out what was happening (a "dev-only" user.clj file present in the uberjar) but not why, or how to fix it.

 

To learn more, I turned, as many times before, to the Clojurians Slack. I posted a question in the #leiningen channel, hoping that somebody could help me find the reason.

 

Very soon, I got advice to exclude the dev profile via lein with-profile -dev ...:

 

For a few years I've invoked all important commands with an with-profile-dev - one never know when that implicit profile might be activated.

— vemv on Slack
 

A couple of hours later, I learned about the root cause:

 

This is a known bug in 2.9.9, as soon as it was noticed yesterday technomancy jumped on it and started fixing it. It is recommended to stay on 2.9.8 and wait until 2.9.10 if possible; the bug has been fixed, but turns out to be one of those things where fixing one bug introduces another one.

 

Because of move to Codeberg there’s nothing on the GitHub side about this; the issue in question is being tracked here

https://codeberg.org/leiningen/leiningen/issues/5

 

Recommended workaround is lein with-profile production uberjar

— Esko on Slack
 

Excellent, I learned why exactly it’s happening with lein 2.9.9 and a couple of approaches to solve it. Now, let’s fix it!

 

 

Resolution

 

My first thought was to simply downgrade Leiningen to the previous version (2.9.8). However, this turned out to be a bit more complicated than I had hoped: Leiningen is automatically installed (and updated) on all GitHub Actions nodes. That is normally convenient and useful.

 

Thus I leaned toward the workaround suggested on Slack: turn off the dev profile, or specify the production profile explicitly, when installing the dependencies:

 

lein with-profile production install

 

And to be safe, do the same thing when building the uberjar:

 

lein with-profile production uberjar
 

I updated our build script and verified that everything worked as before. Sweet!

 

 

Parting thoughts

 

Build tools (like all software) have bugs and sometimes break unexpectedly. Automatic version updates are convenient, but they can break your software seemingly "without a reason", with no change on your side.

 

  • Make sure to know your tools and review how you are using them. When in doubt, ask experts - there’s plenty of free advice out there!

  • (Maybe) fix versions of the tools and use the same versions in both CI and development. If you do so, make sure to review and update the versions regularly.
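One way to notice a silent tool upgrade is to have the build script fail when the Leiningen version differs from the one you expect. The following is a minimal sketch under that assumption, not something our build actually does; the function name require_lein_version is made up for illustration, and it relies on the `lein --version` output format shown earlier.

```shell
# Succeed only if `lein --version` output (read from stdin) reports the
# expected Leiningen version; otherwise print a message to stderr and return 1.
require_lein_version() {
  expected="$1"
  actual=$(sed -n 's/^Leiningen \([0-9][0-9.]*\).*/\1/p')
  if [ "$actual" != "$expected" ]; then
    echo "expected Leiningen $expected but found ${actual:-unknown}" >&2
    return 1
  fi
}

# Usage in a build script (2.9.8 is the version recommended in the Slack thread above):
#   lein --version | require_lein_version 2.9.8 || exit 1
```

A check like this does not pin anything by itself; it only turns an unexpected upgrade into an early, readable failure.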

 


 

 

 
 
1. Our build uses Java 8 too, but GitHub Actions comes with Java 11 preinstalled. Later in the workflow file, we specify that we want Java 8, and that is used for the actual build.