Tales from the jar side: The DeepSeek API, Janus vision model and image generation, My second missed day, and the usual toots and skeets

If the groundhog sees his shadow today, there will be six more weeks of Winter. If not, it will only be about a month and a half. (rimshot)

Feb 02, 2025

Welcome, fellow jarheads, to Tales from the jar side, the Kousen IT newsletter, for the week of January 26 - February 2, 2025. This week I taught my Managing Your Mananger and my Functional Programming in Java courses on the O’Reilly Learning Platform, and my LangChain4j course as an NFJS virtual workshop, as well as my regular courses at Trinity College (Hartford).

NOTE: Before we dig in, let’s address the orange elephant in the room. Like so many people, I have been watching in horror as my country is destroyed by the pathetic, greedy weasels who believe they are above the law. I am assuming, however, that you’re not reading this newsletter to be confronted with those events yet again. Therefore, I’m not going to address them directly, and instead try to provide a brief respite from the chaos going on around us.
I’ll just say this: I am with you, in any way I can be. I’ll do my best to help.

Experiments with the DeepSeek API

This week I prepared a whole series of tests in order to evaluate DeepSeek’s programmatic interface. Progress has been spotty, however, due to issues like this:

The service works for a while, then stops, then works again, then gives inconsistent results, and so on. It’s hard to do any real investigation with it up and down like that. I certainly wouldn’t rely on it in production, at least not yet.

I have a video planned with the results, but might be a while before the service stabilizes. That said, I have examples in a combination of Python and Java coding frameworks (using LangChain4j via its OpenAI API for the Java parts) and am eager to share them.

Here’s a brief summary of what you can and cannot do:

List the available models (currently deepseek-chat, which is called V3, and deepseek-reasoner, called R1)
It’s easy enough to get the account balance for the current user
Execute a regular (blocking) chat request
Execute a streaming chat request
Manage a conversation with multiple requests
Extract a Java record from response data
Demonstrate that the vision models are not yet supported (but see below)
Call my own functions, but somehow not get the final answer from the model (!)
Output both the final result from R1 and the thought process, which is in a different property in the JSON response

The vision model doesn’t work with either model, but since releasing a major competitor to OpenAI’s o1 model wasn’t enough, the same company has released Janus.

The version deployed to Hugging Face, known as Janus-Pro-7B, is apparently both an image generator and a vision model, though they call it “multimodal understanding” since they combine reading images with reading text. I was able to run the sample on Hugging Face, but wasn’t able to get it to run on my local machine. They also don’t (yet) provide a programmatic API for their own hosted model, assuming they even have one.

I gave the online service this image:

I asked the following question:

What do you know about the products produced by the company whose logo is shown in this image?

and here’s the response I got:

The logo in the image belongs to SkyNet, a fictional company from the movie "Terminator 2: Judgment Day." In the film, SkyNet is a highly advanced artificial intelligence system that plays a crucial role in the plot. It is portrayed as a cybernetic organism that can control and manipulate various machines and systems, including robots and other technological devices.

In the context of the movie, SkyNet is a central antagonist that seeks to take over the world by merging with the human brain, effectively becoming a superintelligent being. The concept of SkyNet highlights themes of artificial intelligence, cybernetic enhancements, and the ethical implications of creating life-like machines.

Good enough. I also asked it to generate an image of “a warrior cat flying into battle on the back of a dragon,” and got this response:

Not bad, if you don’t look too closely. I don’t know why the cat has wings. The results are comparable to DALL-E 3, but not as good as Flux 1.1. Too bad I can’t access Janus programmatically yet.

The function calling situation is very strange. When I use LangChain4j, which makes function calling really easy. I can verify that the LLM calls the functions I want and gets the proper results, but for some strange reason it returns an empty answer.

This isn’t news to the company, however, which includes this message in the API documentation:

The current version of the deepseek-chat model's Function Calling capabilitity is unstable, which may result in looped calls or empty responses. We are actively working on a fix, and it is expected to be resolved in the next version.

Okay, so much for that. Everything else works, though the R1 model (which is getting all the press) is rather limited:

Supported Features：Chat Completion、Chat Prefix Completion (Beta)
Not Supported Features：Function Call、Json Output、FIM (Beta)

In short, use R1 for chat, or don’t use it at all. That FIM abbreviation stands for “Fill in the Middle,” which is where you give the model a prefix and a suffix and the model provides the part in between. You can say something like, here’s how a function starts, and here is the ending, implement the body. Their V3 demo works fine with that.

It’s great that everything uses the OpenAI model, but they add properties to a lot of elements in the JSON response. That’s fine if you’re coding in Python or JavaScript, but in Java that’s a problem, because if you don’t declare attributes for those fields in your classes, you can’t access them without extra work. I talked about that in my Perplexity videos, and it means extra mapping and the inability to use existing frameworks like LangChain4j or Spring AI.

Incidentally, not to be outdone, Perplexity now lets you use DeepSeek as an underlying model:

The sonar-reasoning model is new. I haven’t tried it yet, but I don’t expect any issues with it other than the ones I identified above.

(Speaking of Perplexity and Chinese companies, apparently Perplexity made an offer to buy TikTok, assuming it divests from its ByteDance parent in China. That’s objectively hilarious, given that their bid values TikTok at around $300 billion, and Perplexity’s value is way, way below $1 billion. According to the article, they offered to give the US government as much as a 50% stake (might be an effective bribe, given the players involved), but Perplexity is still an ant claiming it plans to eat a rhinoceros. Good luck with that.)

That’s Two

As many of you know, I’m a career changer. After eight years of higher education, I took a job as a research scientist at United Technologies Research Center* in East Hartford, CT. There I spent eight years doing research on the unsteady aerodynamics of jet engines, publishing a couple of papers a year, going to the occasional conference, and spending way too much time begging for funding for the next year. Technically, that involved lots and lots of math and lots and lots of (shudder) Fortran. The plan all along had been to become a professor somewhere, but the academic market collapsed around that time, so I was stuck.

*United Technologies Research Center (UTRC), was the research arm of the United Technologies Corporation, which was a holding company that included Pratt & Whitney aircraft engines, Sikorsky helicopters, Carrier air conditioners, Otis elevators, and some smaller divisions, One way you know you’re getting old is that companies you worked for no longer exist. Eventually UTC became part of Raytheon, and I know nothing about the current situation.

In the mid 90s, when Java came out, I switched to a different group at the Research Center that specialized in, of all things, artificial intelligence. A couple of months there demonstrated to me that a decade of coding in Fortran hadn’t taught me anything about real software development, so I went back to school and got an MS in computer science. When that finished, so did my career at UTRC. Just before I left, I passed the Sun Certified Java Programmer exam, which meant a lot at the time because I didn’t have tons of Java code I could show potential employers.

I joined a ten-person company, called The Golden Consulting Group, on May 31, 2000. Golden ran training courses, and I was hired as a Java instructor. I was supposed to start on June 1, but they were running a two-day course in Object Oriented Concepts and asked me to start a day early. I was totally fine with that. As I mentioned to my wife later, after a few weeks at Golden I already enjoyed myself more than during my entire career at UTRC.

As a trainer, you don’t teach every single day. Even if you could attract that much work, you would burn out if you were constantly in front of students, dealing with new technologies and hard questions all the time.

(I have no idea how high school teachers handle their teaching load. I’m in awe of them, to be honest.)

When I finally went out on my own in March of 2005, my goal was to teach at least ten days a month, and aim for twelve to fifteen. That’s a pretty heavy load, I know now. After all, there are only about twenty to twenty-three working days in a month, but I figured that’s the level I needed to reach to be successful. That meant learning to teach as many topics as possible and maintaining as many clients (mostly training companies) as possible as well. I also traveled all the time, which was fun at first but eventually got old.

(By the way, Golden no longer exists, either.)

One record I maintained for years was never missing a class. Training classes aren’t like academic classes in many ways, but one important one is that a typical training class is only a few days long, so missing a day is a serious problem. Working professionals rearrange their work schedules to make it to class, so you really need to be available for them. From my first day in 2000, I was determined to never miss a training day. That record was a point of pride, especially as my streak passed five years, then ten, and then fifteen. I was really hoping to make it to twenty.

My mother passed away on May 18, 2019. The funeral was held on the following Sunday, where she lived near my sister in the Philadelphia area. At the time, I taught the official, two-day Introduction to Gradle course for Gradle, Inc., online, on a regular basis. We always ran it on Tuesday/Wednesday each time, but for some reason I no longer recall, this time it was a Monday/Tuesday. My mother’s funeral meant I was on the road on Monday, and missed my first class.

You could argue (and my wife did argue at the time) that I had a good excuse, but it bothered me then and still bothers me now. The truth is I didn’t remember that the class was on Monday, because that wasn’t our regular pattern (and I might have been distracted), so I didn’t even notify Gradle and was surprised when they called me wondering where I was. I explained what happened, and I did a decent job making up for it the next day, and nobody gave me a hard time about it, but I still feel bad about the whole situation.

Why am I bringing all this up? Because this week, on Friday, Jan 31, I missed day number 2. This time the issue was jury duty. I got the original summons back in December and rescheduled to the end of January, but I forgot about it until I got the reminder at the beginning of this week. By then I was scheduled to teach my regular Functional Java course on the O’Reilly Learning Platform this Thursday and Friday, so that was a problem. There was a chance I wouldn’t have to report for jury duty, but I wouldn’t know for sure until Thursday evening.

At least this time I told O’Reilly right away, and they were very kind about it. I’ve taught that class on the platform many, many times, so they had multiple recordings to choose from that they could make available to the students. That meant the attendees could still see all the code and get all the relevant information. They just couldn’t ask any questions about the material. On Thursday at the end of class I warned them this might happen and told them to contact me with any questions. That seemed about the best I could do.

It turned out I did have to report. I tried to have some fun with it online:

or this:

or this:

or even this:

and I finished with this:

I never came up with a good “You can’t handle the truth!” joke. I do believe in Harvey Dent, however.

The case we were there to handle settled during the morning, so we were all dismissed. Unfortunately, that was too late for my class. It worked out, but again I feel bad about it. I should have rescheduled for a different date. I also didn’t put the appointment on my calendar right away, so when we scheduled the training course, I didn’t realize I had a conflict. So I am at fault, at least partly.

That makes two days missed now. That sounds like a pretty good record, and it is, but some of it is luck. For example, I spent a week in the hospital about three years ago, and if I’d had any classes scheduled I would have missed them. Fortunately that didn’t happen.

Still, missing a day like that still bugs me. If I miss a third day, I’ll probably just retire.

Tweets and Toots

Welcome to February

ODE TO FEBRUARY
Four weeks, four weekends
Starts on a Saturday, ends on a Friday
Knows exactly what it's doing
None of this half week/five weeks mess
Every month should do this

That calendar looks strange to US people because the weeks start on a Monday, but the points made by the cartoonist (the awesome Emily’s Cartoons) are correct. I’m also reminded that in Gilbert & Sullivan’s The Pirates of Penzance, Frederick was born on February 29, and the fact that he’s not 21 but rather 5 1/4, is a key plot point.