Leiningen, uberjars and a mysterious "dev-only" dependency problem
Let me tell you a story about debugging our CI build, where lein uberjar pulled unexpected dependencies into the JAR.
Intro
Recently, we spotted a problem in our CodeScene test environment - the codescene docker container would not start. The app failed early in the initialization sequence with a mysterious dependency error:
Exception in thread "main" java.lang.ExceptionInInitializerError
Caused by: Syntax error macroexpanding at (duct/core/repl.clj:1:1).
at clojure.lang.Compiler.load(Compiler.java:7665)
at clojure.lang.RT.loadResourceScript(RT.java:381)
...
at user$eval138$loading__6789__auto____139.invoke(user.clj:1)
at user$eval138.invokeStatic(user.clj:1)
at user$eval138.invoke(user.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:7194)
...
at ***.core.(Unknown Source)
Caused by: java.io.FileNotFoundException: Could not locate fipp/ednize__init.class, fipp/ednize.clj or fipp/ednize.cljc on classpath.
at clojure.lang.RT.load(RT.java:462)
...
at clojure.core$require.doInvoke(core.clj:6038)
at clojure.lang.RestFn.invoke(RestFn.java:482)
at duct.core.repl$eval1257$loading__6789__auto____1258.invoke(repl.clj:1)
at duct.core.repl$eval1257.invokeStatic(repl.clj:1)
at duct.core.repl$eval1257.invoke(repl.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:7194)
at clojure.lang.Compiler.eval(Compiler.java:7183)
at clojure.lang.Compiler.load(Compiler.java:7653)
... 35 more
Debugging
Reproduce the problem
The first step should be to reproduce the bug yourself. See it fail - on your machine.
Obviously, everything was working in the REPL (otherwise we would have found the problem earlier), but the uberjar was not starting.
I ran the build script on my laptop and started the resulting JAR via java -jar … It worked without a problem.
Then I built a whole docker image and tried to run it. That worked too…
Welcome to the (CI) caves!
I was thinking: "there must be some difference between my laptop (macOS) and the GitHub Actions machine (Ubuntu)".
The first idea: different versions of Java. But how could that cause the problem? So maybe leiningen?
Let’s see what I had locally:
lein --version
Leiningen 2.9.8 on Java 17.0.2 OpenJDK 64-Bit Server VM
But how to find out the version that the CI uses? Perhaps by debugging the build?
I found the convenient debug-via-ssh action on GitHub Marketplace. I added it to our CI workflow, restarted the build and connected via SSH to the machine. There I ran the same lein --version command as before:
lein --version
Leiningen 2.9.9 on Java 11.0.16 OpenJDK 64-Bit Server VM
Hmm, Ok. So the Java version was different but I wasn’t sure how that could affect the problem. Besides, I also tried Java 8 before and it worked.[1]
Then maybe leiningen? I decided to upgrade it on my machine:
brew upgrade leiningen
...
lein --version
Leiningen 2.9.9 ...
Then I made a new build and ran it again. Finally, I got the same error!
Digging in
We had the reproducer - a good start.
Now back to the stacktrace. The key message there is Could not locate fipp/ednize__init.class:
Caused by: java.io.FileNotFoundException: Could not locate fipp/ednize__init.class, fipp/ednize.clj or fipp/ednize.cljc on classpath.
It suggests a missing dependency or a version conflict - in this case, involving the fipp library.
This was surprising because everything was working on our machines and the CI tests were passing too.
Reading the stacktrace, we can see that:
- There’s a problem in the duct/core/repl.clj file
- This file is being loaded via some user.clj file on the classpath
- user.clj is requiring, at least transitively, duct.core.repl
- duct.core.repl requires fipp.ednize
But wait, what user.clj? How come there’s a user.clj on the classpath when we don’t have it anywhere in our production dependencies?
Exploiting the JAR
To answer those questions, I decided to dig into the JAR file. I had a broken docker image, so it was easy to copy the JAR file from the container to my host OS.
Opening the JAR with Midnight Commander (a JAR is just a special zip archive) led to a surprise: there was a user.clj right in the JAR’s root.
By looking at the file content, I could see that this was the user.clj from one of our modules that we include as a dependency in the application. This file indeed requires duct.core.repl, which requires fipp.ednize.
But this user.clj was a dev-only source file (in the dev/src directory) that was specifically included only in leiningen’s dev profile in the module’s project.clj:
:profiles {:dev {:dependencies [[fipp "0.6.24"]
... ]
:source-paths ["dev/src"]}}
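The same inspection can be done from the command line instead of a file manager. Here is a minimal sketch, assuming the JDK's jar tool is on the PATH. The helper name find_root_user_clj and the file name app-standalone.jar are made up for illustration and are not part of our build:

```shell
#!/bin/sh
# Read a JAR entry listing (one path per line) from stdin and keep only
# a user.clj located at the archive root. That is the file Clojure
# picks up from the classpath root at startup.
# Exit status is non-zero when no such entry exists (grep's behaviour).
find_root_user_clj() {
  grep -E '^user\.clj$'
}

# Usage (app-standalone.jar is a hypothetical file name):
#   jar tf app-standalone.jar | find_root_user_clj
```

Because the function returns grep's exit status, it could also serve as a post-build guard: if it finds a match, the build script can print a message and stop before the image is published.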
Looking for advice
I had figured out what was happening (a "dev-only" user.clj file present in the uberjar) but not how to fix it.
To learn more, I turned, as many times before, to the Clojurians Slack. I posted a question in the #leiningen channel hoping that somebody could help me find the reason.
Very soon, I got advice to exclude the dev profile via lein with-profile -dev ...:
For a few years I've invoked all important commands with an with-profile-dev - one never know when that implicit profile might be activated.
A couple of hours later, I learned about the root cause:
This is a known bug in 2.9.9, as soon as it was noticed yesterday technomancy jumped on it and started fixing it. It is recommended to stay on 2.9.8 and wait until 2.9.10 if possible; the bug has been fixed, but turns out to be one of those things where fixing one bug introduces another one.
Because of move to Codeberg there’s nothing on the GitHub side about this; the issue in question is being tracked here
https://codeberg.org/leiningen/leiningen/issues/5
Recommended workaround is
lein with-profile production uberjar
Excellent, I learned why exactly it’s happening with lein 2.9.9 and a couple of approaches to solve it. Now, let’s fix it!
Resolution
First, I thought: Simply downgrade leiningen to the previous version (2.9.8). However, this turned out to be a bit more complicated than I hoped. Leiningen is automatically installed (and updated) on all GitHub Actions nodes. That is convenient and useful.
Thus I leaned toward the workaround suggested on Slack: make sure to turn off the dev profile, or specify the production profile explicitly, when installing the dependencies:
lein with-profile production install
And to be safe, do the same thing when building the uberjar:
lein with-profile production uberjar
I updated our build script and verified that everything worked as before. Sweet!
Parting thoughts
- Build tools (like all software) have bugs and sometimes break unexpectedly. Automatic version updates are convenient but can break your software "without a reason" and with no change on your side.
- Make sure to know your tools and review how you are using them. When in doubt, ask experts - there’s plenty of free advice out there!
- (Maybe) fix versions of the tools and use the same versions in both CI and development. If you do so, make sure to review and update the versions regularly.
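One way to act on the last point is to make the build script fail fast when the installed Leiningen differs from the version the team expects. This is a minimal sketch, not something we have in our build. The function names and the pinned value 2.9.8 are illustrative assumptions, and the parsing relies on the "Leiningen X.Y.Z on Java ..." output format shown earlier:

```shell
#!/bin/sh
# Extract the version number from a line such as
# "Leiningen 2.9.8 on Java 17.0.2 OpenJDK 64-Bit Server VM".
# Prints nothing if no line has that shape.
parse_lein_version() {
  printf '%s\n' "$1" | sed -n 's/^Leiningen \([0-9][0-9.]*\).*/\1/p'
}

# Succeed only if the reported version equals the expected one.
# $1 = expected version, $2 = raw output of `lein --version`.
check_lein_version() {
  expected="$1"
  actual=$(parse_lein_version "$2")
  if [ "$actual" != "$expected" ]; then
    echo "Expected Leiningen $expected but found '${actual:-unknown}'" >&2
    return 1
  fi
}

# Usage in a build script (2.9.8 is only an example pin):
#   check_lein_version 2.9.8 "$(lein --version)" || exit 1
```

A check like this does not install the right version. It only turns a silent tool upgrade into an explicit, early build failure.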
Resources
- The leiningen bug: https://codeberg.org/leiningen/leiningen/issues/5
- Fixed in 2.9.10: https://codeberg.org/leiningen/leiningen/releases
- See also duct-framework/core’s project.clj referencing the fipp dependency
- Related clojureverse discussion
- GitHub Actions - the list of software preinstalled on the ubuntu image
- A clojure.org guide mentioning user.clj: https://clojure.org/guides/dev_startup_time