Tuesday, July 30, 2024

Maven version ranges inception

Have you ever encountered a situation where your changes to a configuration file (like owasp-suppressions.xml) aren't being reflected? If you're working on a Maven project that uses version ranges, you might be in for a surprise.

In a Maven project that uses version ranges, there is an implicit expectation that can lead to great frustration for developers, much like an intermittent issue that is hard to pin down. Let's say you've defined the version range [1,2) for a dependency you also develop. You'd expect it to always point to the latest SNAPSHOT, as long as that SNAPSHOT comes before the release version 2.0.0. However, this expectation does not always hold, and the latest SNAPSHOT may not be picked up during a local build.

In my case, the dependency was released a few days prior. But due to unrelated build issues, like Docker base Debian images suddenly failing to run apt-update, the release had to be done twice. So instead of release 1.5, the version was bumped to 1.6. Our build process uses the Maven release plugin to automatically update the version to the next development SNAPSHOT. However, due to the build issues, the release for 1.6 had to be done manually, and updating to the correct version of 1.7-SNAPSHOT was forgotten. The dependency project was left at version 1.6-SNAPSHOT.

During a Maven build, the version range resolves to the latest version. In my case, the latest was the release version 1.6, as release versions are considered greater than SNAPSHOTs. Additionally, Maven typically checks both local and remote repositories for a given dependency. So, when the remote repository has the released 1.6, but locally you're still on 1.6-SNAPSHOT, your build will use the release version instead of your local SNAPSHOT version. This causes a situation where your local code changes aren't reflected when you run your code.
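To make the ordering concrete, here is a minimal sketch of the rule described above. It is a simplified illustration and not Maven's actual resolver: the class `VersionRangeSketch`, its `resolveHighest` method, and the "major.minor" parsing are all hypothetical, and the range is reduced to a check on the major version.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Simplified illustration only; this is NOT how Maven itself is implemented. */
public class VersionRangeSketch {

    /** Parsed form of a "major.minor" or "major.minor-SNAPSHOT" string. */
    static final class Version {
        final int major;
        final int minor;
        final boolean snapshot;
        final String raw;

        Version(String raw) {
            this.raw = raw;
            this.snapshot = raw.endsWith("-SNAPSHOT");
            String base = snapshot
                    ? raw.substring(0, raw.length() - "-SNAPSHOT".length())
                    : raw;
            String[] parts = base.split("\\.");
            this.major = Integer.parseInt(parts[0]);
            this.minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        }
    }

    /** Orders by major, then minor; for the same base version the release wins. */
    static int compare(Version a, Version b) {
        if (a.major != b.major) return Integer.compare(a.major, b.major);
        if (a.minor != b.minor) return Integer.compare(a.minor, b.minor);
        return Boolean.compare(!a.snapshot, !b.snapshot);
    }

    /** Picks the highest version whose major is in [lowerMajor, upperMajor). */
    public static Optional<String> resolveHighest(List<String> available,
                                                  int lowerMajor, int upperMajor) {
        return available.stream()
                .map(Version::new)
                .filter(v -> v.major >= lowerMajor && v.major < upperMajor)
                .max(VersionRangeSketch::compare)
                .map(v -> v.raw);
    }

    public static void main(String[] args) {
        // Only the local repository is considered: the SNAPSHOT wins.
        List<String> localOnly = Arrays.asList("1.5", "1.6-SNAPSHOT");
        // The remote repository also offers the released 1.6: the release wins.
        List<String> localPlusRemote = Arrays.asList("1.5", "1.6-SNAPSHOT", "1.6");

        System.out.println(resolveHighest(localOnly, 1, 2).orElse("none"));       // 1.6-SNAPSHOT
        System.out.println(resolveHighest(localPlusRemote, 1, 2).orElse("none")); // 1.6
    }
}
```

The second call mirrors my situation: once the released 1.6 is visible, the local 1.6-SNAPSHOT is never chosen.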

I was able to diagnose this problem by looking at the metadata files that Maven keeps in:

cat ~/.m2/repository/path-to-your-dependency/maven-metadata-<remote-hostname>.xml
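
If reading the raw XML gets tedious, the same check can be sketched in Java. This assumes the usual repository metadata layout, where versions are listed under /metadata/versioning/versions/version; the class name `MetadataVersions` and the embedded sample (including com.example and my-dependency) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class MetadataVersions {

    /** Returns every version listed under /metadata/versioning/versions. */
    public static List<String> listVersions(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(xml);
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("/metadata/versioning/versions/version", doc,
                        XPathConstants.NODESET);
        List<String> versions = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            versions.add(nodes.item(i).getTextContent().trim());
        }
        return versions;
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            // Pass the path to a maven-metadata-*.xml file as the first argument.
            try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
                System.out.println(listVersions(in));
            }
            return;
        }
        // Hypothetical sample used when no file is given.
        String sample = "<metadata>"
                + "<groupId>com.example</groupId>"
                + "<artifactId>my-dependency</artifactId>"
                + "<versioning><release>1.6</release>"
                + "<versions><version>1.5</version><version>1.6</version></versions>"
                + "</versioning></metadata>";
        System.out.println(listVersions(
                new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8)))); // [1.5, 1.6]
    }
}
```

Seeing a released 1.6 in that list while the local project is still on 1.6-SNAPSHOT is the hint that the range will resolve to the release.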

Sunday, April 28, 2024

Reflection of thoughts #1 - Logs and Multithreaded applications

Ever since I started working, applications have had to run an increasing number of threads. This is mostly due to the adoption of web applications in favor of desktop applications. Because of this change, one of the more difficult tasks that has come up is looking at the logs and figuring out what happened.
For a significantly busy server, looking at the logs is not as straightforward as I would expect.
  1. In the logs, each HTTP request is handled by its own thread, so you can identify which log lines belong to which HTTP request because the thread name is included in the log. However, it is still hard to follow, because these threads run simultaneously and their lines end up interleaved in the log.
  2. Much more difficult is when each HTTP request also spawns new background threads, each with its own name, which you cannot reliably associate with the parent HTTP request. An example is a spawned background thread called acme_Quartz_Worker-9. With many threads appending to the log on demand, I think it is impossible to identify which worker thread belongs to which HTTP request thread.
There are some basics I have learned over the years that are must-haves for keeping the logs as helpful as possible:
  1. In Java, always include the exception when you call logger.error(). Failing to include it may hide valuable clues, since most exceptions have a root cause exception that is usually the most important clue.
  2. If available, always add contextual information to the log message. Usually this would be the id of the object being processed (like the UUID of a particular entity). Human-readable information is also welcome.
  3. Have a well-designed exception system where it is clear which layers log errors and which do not. For instance, logging an error in a class called StringUtils is a very bad idea, since different callers of this utility class may want to handle errors in different ways. Also, it is hard to determine which process had an error with the StringUtils. In addition, you usually do not (and should not) pass contextual information to a StringUtils method - information that #2 recommends including.
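The three points above can be sketched together as follows. This is only an illustration under a few assumptions: it uses java.util.logging so that it runs without extra libraries (with SLF4J, the equivalent call would be logger.error(message, exception)), and the class `OrderService` with its `store` and `process` methods is hypothetical.

```java
import java.io.IOException;
import java.util.UUID;
import java.util.logging.Level;
import java.util.logging.Logger;

public class OrderService {

    private static final Logger LOG = Logger.getLogger(OrderService.class.getName());

    /** Low-level helper: keeps the root cause attached and does NOT log. */
    static void store(UUID orderId) {
        try {
            // Stand-in for a real failure in the storage layer.
            throw new IOException("disk full");
        } catch (IOException e) {
            throw new IllegalStateException("Could not store order", e);
        }
    }

    /** Service layer: it knows the context, so it is the one that logs. */
    public static boolean process(UUID orderId) {
        try {
            store(orderId);
            return true;
        } catch (IllegalStateException e) {
            // Point 2: the id goes into the message.
            // Point 1: the exception is passed along, so the root cause is printed.
            LOG.log(Level.SEVERE, "Failed to process order id=" + orderId, e);
            return false;
        }
    }

    public static void main(String[] args) {
        process(UUID.randomUUID());
    }
}
```

Running it prints a single SEVERE entry that contains the order id and a stack trace ending in the "Caused by" IOException, which is the part that tends to get lost when the exception is left out of the call.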
Further thoughts: I found this great blog post reflecting on how our logs are not human-readable, yet they are not computer-readable either (which matters now that software stacks like ELK are popular).