Tales from the jar side: Spring AI, Java String Templates, 1BRC, PDQ Bach, and the usual silly tweets and toots
People always talk about animal husbandry, but doesn't that neglect animal wivery? (rimshot)
Welcome, fellow jarheads, to Tales from the jar side, the Kousen IT newsletter, for the week of January 14 - 21, 2024. This week I taught my first Spring AI course on the O’Reilly Learning Platform and a Latest Features in Java course as an NFJS Virtual Workshop.
Here are the regular info messages:
Regular readers of, listeners to, and video viewers of this newsletter are affectionately known as jarheads, and are far more intelligent, sophisticated, and attractive than the average newsletter reader or listener or viewer. If you wish to become a jarhead, please subscribe using this button:
As a reminder, when this message is truncated in email, click on the title to open it in a browser tab formatted properly for both the web and mobile.
Spring AI
This week I taught my new Spring AI course on the O’Reilly Learning Platform for the first time. That was much more stressful than I expected, mostly because the project is (still) only at version 0.8.0-SNAPSHOT. That’s awfully early to start adopting a new framework. The other problem with early versions like that is that many projects leave documentation and tutorials until late in the process, because very few people enjoy writing documentation, and nobody enjoys writing it twice. Spring AI has some of both, but there are, shall we say, issues.
Despite all that, the class went remarkably well, helped by the fact that I lowered expectations right at the beginning. Spring AI as a framework does bring a few nice features to the table:
Prompt templates that allow you to reuse text prompts prepared ahead of time.
The Resource abstraction, which means you can save templates in a directory under src/main/resources and easily load them using the @Value annotation.
A few output parsers, like BeanOutputParser<T>, which makes it (relatively) easy to convert from the AI response into a Java class. I had some issues with that one, but was able to tweak the AI request enough to get it to produce what I wanted.
An embedding client and several vector databases, which enable you to add your own documents for the AI to scan. That’s called Retrieval Augmented Generation (RAG), a term I still seem to be incapable of remembering.
That’s a lot. For example, to use the output parser, this code example is suggested:
The idea is to tell the AI that you want back an instance of the ActorsFilms record, give it a prompt that knows it should generate that in JSON form, and then parse the result to get what you want. The problem is that most of the time the returned response isn’t just the JSON data; it has a wrapper around it, usually starting with ```json and ending with ```. That causes the parse method to throw an exception, because the embedded JSON parser (Jackson in this case) doesn’t know what to do with that.
I modified the template to say this instead:
"""
Generate the filmography for the actor {actor}. In your output,
please do NOT include the backticks and json expression, as in
```json (and the corresponding close backticks). Just include
{format}
""";
Fortunately, that worked, at least the time I ran it during class. That is, however, a good illustration of the problems you face with an LLM: you really need to specify exactly what you want, and you still may not get it. You therefore need to build in some kind of validation or error checking, and so far the project doesn’t have any of that. It’ll get better, I’m sure, but we’re not there yet.
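To make the "validation or error checking" idea concrete, here is a minimal sketch of the kind of defensive cleanup you could run on a reply before handing it to a JSON parser. This is plain Java and not part of Spring AI; the class and method names (FencedJson, stripFence, extractJsonObject) are made up for illustration, and the check is deliberately crude (it only confirms that the payload looks like a JSON object, not that it is valid JSON).

```java
import java.util.Optional;

/** Hypothetical helper, NOT part of Spring AI: cleans up a model reply before JSON parsing. */
final class FencedJson {

    // Three backticks, the Markdown code-fence marker the model tends to add.
    private static final String FENCE = "`".repeat(3);

    /** Removes a leading fence line (with or without a language tag) and the trailing fence. */
    static String stripFence(String reply) {
        String text = reply.strip();
        boolean fenced = text.length() >= 2 * FENCE.length()
                && text.startsWith(FENCE)
                && text.endsWith(FENCE);
        if (!fenced) {
            return text; // nothing to remove
        }
        int firstNewline = text.indexOf('\n');
        if (firstNewline < 0) {
            return text; // fence without a body line; leave it for the caller to reject
        }
        // Drop the opening fence line and the closing fence, keep what is in between.
        return text.substring(firstNewline + 1, text.length() - FENCE.length()).strip();
    }

    /** Crude sanity check: returns the payload only if it at least looks like a JSON object. */
    static Optional<String> extractJsonObject(String reply) {
        String text = stripFence(reply);
        boolean looksLikeObject = text.startsWith("{") && text.endsWith("}");
        return looksLikeObject ? Optional.of(text) : Optional.empty();
    }
}
```

With something like this in place, an empty Optional becomes the signal to retry the request or report an error, rather than letting the parser throw on the wrapper.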
As for the RAG example, the one they suggest tends to hang on my machine. I think that’s happening because the request exceeds the threshold for the LLM, and unfortunately there’s no warning or diagnostic output. I “fixed” it by cutting way back on the number of matching documents (h/t to Craig Walls, who was very helpful during all this) and got a result, but it wasn’t necessarily the best result. Again, I had a demo that illustrated the concepts, but wasn’t something I would want to put into practice.
Craig also introduced me to the Apache Tika project, which provides a reader that can handle PDFs, Excel spreadsheets, PowerPoint files, and more. He even gave me a nice demo (at his GitHub repository) that read a copy of his Spring in Action book and answered questions about it. I showed that to the students, with credit, of course. :)
All in all the class went reasonably well, or, as I like to tell my wife, “In the hands of a lesser instructor, that might have been a problem.” I did mess up my own example of using Spring without Spring AI when I misread one of the elements in a JSON response, but otherwise we were fine.
I’m giving the class again in March. Hopefully we’ll be on at least version 1.0 by then.
String Templates
Keeping to Java for the moment, in my NFJS Virtual Workshop on Latest Features in Java, I wound up showing examples for string templates, which are a preview feature in both Java 21 and the upcoming Java 22. I wasn’t going to do that, but I updated my IntelliJ IDEA environment before the class, and every time it saw me use string concatenation in my code, it wanted me to use the new templates instead.
For example, when I wrote:
System.out.println(length + ": " + wordList);
It wanted to change that to:
System.out.println(STR."\{length}: \{wordList}");
Originally I thought that syntax was about the ugliest possible choice for anything in Java ever (and I used to write anonymous inner classes on a regular basis), but I’m forced to admit I’m slowly getting used to it. Still, wow, that’s ugly, and I don’t look forward to introducing it to a new group of students each time and having them recoil in horror.
Besides, that example is silly. That’s just a formatted string. What you need templates for is when you evaluate an expression with some code, though I suppose there are ways to rewrite that as well. We’ll see what the team recommends when this is fully released, which apparently won’t be until Java 23 in September at the earliest.
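Since string templates need the preview flag to compile, here is a sketch of the "just a formatted string" point using only released features. The class and method names are invented for illustration; the first two methods should produce the same text, and the third shows the embedded-computation case rewritten with formatted(), which has been available since Java 15.

```java
import java.util.List;

/** Illustrative only: three ways to build the same kind of output without preview features. */
final class FormattedLine {

    /** Plain concatenation, as in the original println call. */
    static String withConcatenation(int length, List<String> wordList) {
        return length + ": " + wordList;
    }

    /** The same text via String.formatted; %s uses the list's toString(). */
    static String withFormatted(int length, List<String> wordList) {
        return "%d: %s".formatted(length, wordList);
    }

    /** A value computed inline, the case where templates are supposed to help. */
    static String withExpression(List<String> wordList) {
        return "%d words, longest has %d letters".formatted(
                wordList.size(),
                wordList.stream().mapToInt(String::length).max().orElse(0));
    }
}
```

The trade-off is the usual one with format strings: the expressions sit at the end instead of next to the text they fill in, which is the readability gap the template syntax is trying to close.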
1BRC
I meant to comment last week on the programming contest proposed by Gunnar Morling called the 1 Billion Row Challenge, but fortunately Artur Skowronski’s JVM Weekly covered it better. After all, that’s exactly the sort of topic JVM Weekly is really good at discussing.
The basic idea behind the challenge is to write Java code that reads a text file with one billion rows of temperature data and computes the average, min, and max values for each weather station as fast as possible. The baseline solution uses streams and solves the problem in about 3 minutes. Then developers got to work, using a variety of optimizations both simple and mind-bending. If you check the table on that page, the current record stands at just a hair over 2.3 seconds (!).
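For a feel of what a streams-based starting point looks like, here is a minimal sketch. It is not the actual baseline from the challenge repository, and it assumes each row has the form station;temperature (for example Hamburg;12.0). The class name StationStats is made up for illustration.

```java
import java.util.DoubleSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative sketch, not the official 1BRC baseline. */
final class StationStats {

    /** Groups "station;temperature" rows by station and computes min, average, and max for each. */
    static Map<String, DoubleSummaryStatistics> summarize(Stream<String> lines) {
        return lines
                .map(line -> line.split(";")) // assumed row format: name;value
                .collect(Collectors.groupingBy(
                        parts -> parts[0],
                        TreeMap::new, // keeps stations sorted by name
                        Collectors.summarizingDouble(parts -> Double.parseDouble(parts[1]))));
    }
}
```

To run it against a real file you would pass in Files.lines(Path.of(...)) inside a try-with-resources block. Everything the fast entries do (custom parsing, memory-mapped files, parallel chunks) is about avoiding the per-row String and array allocations this version makes.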
I know a lot of developers who really got into that. That’s not my skill, but I enjoyed it as a spectator sport.
The best part for me is that Artur included in his newsletter a taxonomy of the types of developers that participate in the annual Advent of Code challenge, and that was about the funniest thing I’ve seen in months. I was going to include it here, but it’s fairly long and a lot of my readers probably wouldn’t get the inside jokes. If you’re a developer, however, definitely check it out, but you should probably be reading JVM Weekly anyway.
PDQ Bach, RIP
The New York Times had a good article (gift link provided) summarizing the life of Peter Schickele, who passed away this week. Schickele was the “discoverer” of the awesome fictional composer PDQ Bach, “last, and definitely the least,” of the 20-odd children of Johann Sebastian Bach (“and by far the oddest”). The whole subject of PDQ is a great comedy routine from a serious composer, especially as Schickele kept unearthing previously unknown works from local garbage cans or other hidden locations. The article references a few of them, like “The Abduction of Figaro,” “Hansel and Gretel and Ted and Alice” (an opera in one unnatural act) and the haunting carol, “O Little Town of Hackensack.”
As the Wikipedia article on PDQ says, “Schickele divides P. D. Q. Bach's fictional musical output into three periods: the Initial Plunge, the Soused Period, and Contrition.” It’s all like that.
I first learned about PDQ Bach from a friend at MIT who had Schickele’s “biography” of PDQ, which was filled with puns and other gags. I even saw him in concert once (Schickele, not PDQ), on one of his final tours. That was in the Hartford Bushnell Auditorium, long ago (I found a Hartford Courant review of the concert, dated April 21, 2001, so apparently that was the actual date). Apparently he performed the tragic oratorio “Oedipus Tex,” though I must admit I don’t remember much of it. I do remember Schickele running in down the aisle like he was late to his own concert. I didn’t know at the time that was one of his regular gags. Years later I got to perform the madrigal My Bonnie Lass She Smelleth as part of our church choir:
My bonnie lass she smelleth, Making the flowers Jealouth. Fa la la (etc.)
My bonnie lass dismayeth Me; all that she doth say ith: Fa la la (etc.)
My bonnie lass she looketh like a jewel And soundeth like a mule. My bonnie lass she walketh like a doe And talketh like a crow. Fa la la (etc.)
My bonnie lass liketh to dance a lot; She’s Guinevere and I’m Sir Lancelot. Fa la la (etc.)
My bonnie lass I need not flatter; What she doth not have doth not matter. Oo la la (etc.)
My bonnie lass would be nice, Yea, even at twice the price. Fa la la (etc.)
I was going to include a YouTube video here, only to realize that there are way too many to choose from. Just go to YouTube and search for PDQ Bach and you’ll be inundated. The Seasonings is a good one.
Tweets / Toots / Etc
Too soon?
You mean like being down 28 - 3 with just over 3 minutes left in the third quarter of a Super Bowl? Sure, I can talk about that.
Bill Belichick is out as coach of the New England Patriots, which is truly the end of an era. Growing up, my team was Washington, but I always rather liked the Pats. After Daniel Snyder took over in DC and destroyed them (as he did everything he touched) and I moved to Connecticut in the late 1980s, I became a serious Patriots fan, even though they’d never won anything. Then Belichick showed up, and Tom Brady was drafted soon after, and the rest is history.
It will feel very weird watching him anywhere else, but it really was the end. That interview question in Atlanta, though, is priceless.
Da-Elrond-rond-rond
Camouflage
I don’t see a bear at all. Do you?
Now you know
Answers the age old question.
That ought to work
Anti-Vaxxers Be Like
Yup, in a nutshell.
One more thing…
Too close to home
Hey, I resemble that remark!
Guilty, your honor
And finally:
Dogs vs Cats
Have a great week everybody!
The video version of this newsletter will be on the Tales from the jar side YouTube channel tomorrow.
Last week:
Spring AI, on the O’Reilly Learning Platform
Latest Java Features, an NFJS Virtual Workshop
This Week:
No training class, but first week of the new semester at Trinity College in Hartford, with my class Large Scale and Open Source Computing