Software Huddle - Java and Building AI Applications with Kevin Dubois

Starting point is 00:00:00 Going back to what we were talking about before in terms of your Java background and how Java's evolved, where does Java fit into this world of AI? I think a lot of people's go-to when they think about AI is Python because of its history with, I think, data science and the popularity that it has there and frameworks like TensorFlow and PyTorch. But where does Java fit into this world? For me, where Java really becomes interesting with AI is to actually bring value to these AI models, right? Because one thing is, you know, creating models and experimenting with them. Another thing is like, how do we integrate

Starting point is 00:00:39 them into, you know, enterprise systems, you know, like banking and insurance and government. That's where you see a lot of Java being used. In terms of the Java ecosystem supporting some of these AI integrations, what tooling exists? What is the state of the language in terms of being able to take advantage of some of this stuff? Right now, let's say there's two main kind of uh projects out there to to integrate the you know ai into your java code so for the spring ecosystem there's a spring ai uh project that's of course specific to spring and then there's a there's a link chain for J, which is a which is a project

Starting point is 00:01:26 that's kind of, you know, framework agnostic for Java. So for those that, you know, love various programming languages, but are used to using Java, what would you tell them about Java that would encourage them to maybe give it a try? Hey, everyone, Sean here. And today on the show, we have Kevin Dubois. Kevin is a senior principal developer advocate at Red Hat, Java champion, and well-known open source contributor. In my conversation with Kevin, we talk about his history with Java and the evolution of the language and where it now fits within the world of AI. Kevin's been building AI applications with Java using Quarkus and Langchain for J. Kevin's a Java expert. He's not an AI expert,

Starting point is 00:02:05 and I don't think he would claim to be one. So personally, I find it amazing to see how much he's building with AI, even without having that background. We also talk a lot about the mindset shift you need to successfully build with generative AI models. Everything's non-deterministic. You're living in this area of gray,

Starting point is 00:02:22 and that creates all kinds of problems that we don't normally face with building conventional applications. So I think there's something for everyone in this one, and I hope you enjoy it. If you have any suggestions for the show, please reach out to Alex or me. And with that said, let's get you over to my interview with Kevin. Kevin, welcome to Software Huddle. Hello, thanks for having me. Yeah, thanks for being here. So you've been working in Java and open source for I think like the bulk of your career. How long has it been and how did you kind of get your start with that career path? Yeah, that's a good question. I originally wanted to be a pilot, right? And so somehow that didn't quite work out then i got into linguistics

Starting point is 00:03:07 out of all things uh but um i was always really interested in computers too so i i ended up going to the university in italy and they had this course that was kind of a combination of linguistics and computer science and i learned about java there at the university and i was like oh this is actually a lot of fun you know like programming um so that's kind of how it all started um the first few years of my career i i didn't work with uh with java i first started with uh cold fusion oh wow um yeah that was that was something that I quickly realized, OK, this is not a good career path. But yeah, so eventually I did end up, you know, kind of getting a more, let's say, enterprise job working with with Java. And yeah, kind of never looked back from there in terms of like your interest in in background in linguistics like

Starting point is 00:04:07 i would think that there's some connective tissue i think between like studying linguistics and even just the nature of how like programming languages come together because it's all kind of like you know it's a language of a source just uh you know a different type of language is not spoken language yeah exactly i mean yeah they, they're kind of different and kind of the same, right? I mean, there's a similar structure and it is fascinating. Yeah, I mean, in a way,

Starting point is 00:04:35 I mean, I speak different languages and I don't know just Java. Like I used to also work for a while with PHP and did some Python for a while. And yeah, I mean, there's definitely similarities between kind of code switching between languages and code switching between programming languages. So yeah, it's fun. Yeah. And there's also families of programming languages. It's just like there's families of spoken human languages as well.

Starting point is 00:05:06 If you know French, then it's probably easier to jump to learn Spanish than it is to jump to learn, I don't know, Chinese or something. Something that's like a symbolic, tonal language. And just like in programming, if you know C++, the jump to Java is probably a more direct path than C++ to, I don't know, like Smalltalk or something like that. Yeah, yeah, yeah. And even like within the languages, right?

Starting point is 00:05:28 I mean, yeah, like different frameworks within Java, you could kind of consider them to be like different dialects of a language. Yeah, it's true. I mean, I definitely have found that when working with like front-end JavaScript frameworks. Like for me, for whatever reason, like Angular always felt kind of like an unnatural syntax for me, whereas some other front end frameworks

Starting point is 00:05:53 felt more natural to the way that I guess I was used to sort of programming. So I think part of probably becoming successful as an engineer is like learning enough to figure out what are the things that make sense to you and maybe leaning into those things versus trying to like, always like butt heads against like, oh, I need to like wrap my head around this thing that maybe is never going to like really feel that natural to you. Yeah, yeah, exactly. Yeah, very true. So looking back at sort of the time that you've spent working on Java, what are some of the biggest changes that have happened in terms of the language constructs in that time?

Starting point is 00:06:32 Yeah. Yeah, I was looking back, you know, like what Java version we were actually using back. The first time that I got exposed to Java was probably 2002 or 2003. And I was looking at what version of Java that was. And it's like, oh, wow, that's like really 1.1 or 1.2 or something, which is crazy to imagine because it was definitely a different language than it is today. And we were just talking about that. because it was definitely a different language than it is today. And we were just talking about that. So I'm a Java champion, right? So we have this kind of Java champions program.

Starting point is 00:07:20 And we also talk about, you know, like, hey, how do we evolve Java? And how can we get more adoption? And what we're seeing is that a lot of like these preconceived notions of of java you know like being very verbose and and and kind of slow and hard to adapt and everything are kind of you know from back in the day you know like a lot of these newer uh ways of working with Java, especially some of the newer frameworks, too, they make it relatively, at least, super easy to get started with. I mean, I wouldn't say it's as easy as maybe creating a Hello World with PHP or Python or something where you don't need a JVM or something. But yeah, it's come a long way for sure. Yeah.

Starting point is 00:08:08 I mean, I think that's an interesting notion that some of the legacy that Java has from back in its sort of relative infancy of the late 90s, early 2000s, was this perception that it was kind of slow. You had to rely on the JVM, which was, you know, not as performant back then on like whatever the machines, you know, like it's like a 133 megahertz CPU or something like that, you know, and it just wasn't, you know, where it is today in terms of the web and the hardware that that's available.

Starting point is 00:08:45 And also even a lot of development back then was building desktop applications. And building desktop applications with Java and Swing clearly was not native. It had a very Java look and feel. And it wasn't as performant. And then some of that history probably has carried forward with the language, even though the language has changed a lot. Yeah, absolutely. I mean, and you see that too. Like, you know, the traditional even running services, you know, let's say in the cloud or at least in some sort of, you know, exposure, let's say um you see that the evolve they've evolved from you know these big

Starting point is 00:09:28 application servers and they would run on these dedicated systems and you would have you know like just give java all it needs like the entire resources of the system and then uh now with the cloud native world how that's completely completely kind of the other way around. Like you want, you know, very fast starting systems and use little resources. And it's cool to see how the Java world has adapted to that too, because, you know, now you can have a Java application that starts up in, you know, a couple of milliseconds, which is something that, you know, like five, 10 years ago would be insane because it'd be like, yeah, I mean, my application starts up in five minutes.

Starting point is 00:10:10 That's pretty fast, right? Yeah. Well, I think like you see a similar thing that's happening even with like C sharp, which has, you know, some similarities to Java as a language is clearly was like inspiration. I think like even from some of the early syntax of, of C sharp Java as a language. It clearly was like inspiration, I think, like even from some of the early syntax of C Sharp as like a capital main rather than a lowercase main in terms of what the program entry point was. And, you know,

Starting point is 00:10:35 they've also had to adapt to, I think, stay relevant and appealing to people who are learning languages like Python as their first language, where you might not need to do a whole lot to get sort of that an instant gratification of like, hey, I put together this like program, I don't have to worry about, you know, sending this through a compiler and, you know, spinning up a JVM and all that sort of stuff. Right, right. Exactly. Yeah. So for those

Starting point is 00:11:01 that, you know, love various programming languages but aren't used to using Java, what would you tell them about Java that would encourage them to maybe give it a try? I mean, in my opinion, and this is also kind of my career path as well, it's like I think it depends on where you want to go in your career, right? If you want to create software that's going to be robust and it's going to be around for the next five, ten years, and you're going to work in environments that do need the stability, that need the observability and all that stuff, I think Java is still very much the way to go. If you want to do some experimentation and you want to do some quick iterations, perhaps there's other languages that you might find more interesting. But if you want to go into enterprise development, I think as far as I know, it's still by far dominated by Java. And I don't think that's going to go anywhere. That's also kind of the mindset of Java, right? I mean, even if you look at some applications that were created 10 years ago, you can run them on the latest

Starting point is 00:12:21 version of Java, right? There's backwards compatibility, which is very rare in this world of programming languages. I mean, when I did my fair amount of Python work was kind of the transition between Python 2 and Python 3. And it was a major pitch. Yeah, so with Java, that's one of the core ideas, like backwards compatibility, which is pretty impressive that they're able to add new features and make it faster and make it more performant

Starting point is 00:12:59 and adopt all these kind of new techniques. And yet still, you're able to run your older programs on it. So, yeah. Yeah, I found some old code that I'd written during graduate school. That's probably, it was well over 15 years old. It was probably close to 20 years old. And I was able, and this was like in the last couple of years, I was able to actually run it on like the latest version of JVM. There was maybe a couple of little hiccups and stuff like that, but

Starting point is 00:13:28 you know, it pretty much functioned as you would expect. Looked terrible, but it was an old like swing application essentially. Yeah. So, you know, we saw each other not too long ago at the InfoBip Shift conference

Starting point is 00:13:43 in Croatia. A fantastic conference for those that are listening. I'm sure people who are radio listeners have heard me talk about this conference before. But you were a speaker there. What was the topic that you spoke on? So I talked about, you know, surprise, surprise, AI, right? I almost said Java, but no, it was about AI and really about kind of democratizing AI and about what role open source or even the cloud. I mean, when we move to adopt open source, we're kind of making all the organizations work together and create shared standards. And in the end, we get more innovation, um, by collaborating. Right. So

Starting point is 00:14:47 I think, I think that's, that's pretty much key right now in the, um, in the AI space as well, because we see that, you know, there's definitely, you know, like some movement towards open source, but it's still debatable exactly what that open source means, right? Because does open source mean that I can, you know, kind of use, let's say, a model that was provided? Or does open source mean that I have, you know, kind of a free license to modify it and potentially also contribute it back to the project. Do I see the sources of these models? So, you know, there's some big question marks there as to how far a lot of these models

Starting point is 00:15:34 go. So, yeah, I think that's an interesting topic also going forward, right? Yeah, absolutely. And I think like there's through history, like there's these, I don't know, different sort of evolutionary paths to open source. So if you look at like programming languages, there was a time when I think companies saw value in creating proprietary programming languages and sort of like, you know, monetizing those languages. And now, pretty much to be relevant in any way, like every programming language is like open source, and the value isn't really the language itself. It's people basically developing on that language and that ecosystem, and then people figure out other ways to sort of, you know, indirectly sort of monetize it. And then there's other paths, like if you look at databases, you have kind of a mixture of proprietary databases that,

Starting point is 00:16:25 as well as open source databases. And I think, who knows, like who's going to end up winning out in the long run, but people have figured out ways of building businesses off of open source databases, but also, you know, building businesses sort of off of the proprietary. And I'm curious to see, like, even when we think about like foundation models, right now, it's kind of a little bit more similar, I guess, to the database world where we have sort of proprietary models, and then we also have the open source models. And even within the open source ecosystem, you have this mixture of what does open source really mean? So in terms of the open source models that are available now, how do they break down like do you do you know how like which models can someone like actually contribute to versus hey i can like you know sort of download it and use it

Starting point is 00:17:12 for free yeah that's that's a good question i mean as far as i know um ibm out of all the companies, offers these granite models. And so they started this project together with Red Hat. So that's kind of how I know about it. But it's called Instruct Lab. And so the idea is to kind of democratize not only the models, but how you can contribute to models. So they have this taxonomy kind of structure for adding knowledge or skills to a model based on a folder structure and YAML files. And so, you know,

Starting point is 00:17:56 by, you know, making that a lot more transparent than your usual kind of data scientist's way of writing, you know, Python and having using libraries and being, you know, knowing a lot about, you know, the data that's behind it and putting it all together. The idea is here that people can contribute, you know, certain parts of knowledge or organizations can train their own models, uh, based on some knowledge that they have without like being deep data, uh, science, you know, without like being deep data science, you know, like have a deep data science background. And so the nice side effect of that is

Starting point is 00:18:32 if you have these YAML files, it becomes relatively trivial to, you know, allow people to contribute those changes back to, you know, the source code of those models that they're trained on. So because you can see in the pull request immediately, okay, this is the data that this

Starting point is 00:18:51 organization or this person wants to add to the model without having to go, you know, like dig through source code and seeing like, you know, where is the source that they're even trained on because it could just be a reference somewhere. like, you know, where's the source that they're even trained on because it could just be a reference somewhere. So, you know, that's an interesting approach,

Starting point is 00:19:11 I think to, to the whole open source space of AI. What are you seeing in the industry? Like are, when, when people are actually building AI based applications, like real, like, you know, not, not like a, you know, demo, but something that's actually going to be used for production. Are you seeing more open source?

Starting point is 00:19:32 Are you seeing more proprietary? Or is it a mix? What's that look like? I think it's a bit of a mix and maybe even still leaning more towards proprietary because I think we're kind of in this phase still where I think organizations and individuals

Starting point is 00:19:52 are trying to find a path. And so it's a little scary to open everything up, right? Because then they're afraid that it's going to get taken and somebody else is going to run away with their ideas. So I think that's the natural evolution, you know, first kind of seeing, you know, how can I position myself in this space? And then, but eventually I think we will get there. Yeah, I mean, I guess it's kind of, you know, similar to other like software evolutions,

Starting point is 00:20:21 like even going back to what we were talking about before with programming languages, or if you look at like operating systems as well, like I think in the early days, people don't know necessarily like what's going to work from, you know, a business standpoint or go to market standpoint, who's going to win out in the end. So there's probably a little,

Starting point is 00:20:38 people are a little bit more cautious and reserved and they don't know whether like open sources is maybe the right path for, for them or what does that really mean? If we do an an open source model like how do we turn that into a business and survive and stuff yeah absolutely yeah i mean it's it's like ideally when we're running a business we want to make money right so yeah there's and someone has to pay the bills exactly yeah so i mean the the easy i mean at the face of it, the easiest path is, you know, like have something that nobody else has and, you know, like that somebody can't copy, let's say, and you can monetize that. So, of course, yeah, that's where I think these organizations are looking at. But I think at the same time, you have the larger players like the Metas and the Googles and so on and so forth that do see from their experience also. And of course, they have the last five to 10 years.

Starting point is 00:21:57 Right. I think there's there's more trust. And what I'm seeing from talking to organizations and customers, when I talk to them about, hey, where are you at with your AI? And especially like developers, a lot of the times what I'm hearing is we're not allowed to use much of it because our organization wants to know, like, what sources has it been trained on? Where is it all coming from and what's the the licensing and and that part is very unclear especially to developers that don't have the time to really dig into you know what the specifics are yeah and nor is it like a reasonable thing for them to take on as a responsibility to like have to to dig into all the details on that so you know going back to you know what we were talking about before in terms of your Java background and how Java's evolved,

Starting point is 00:22:52 where does Java fit into this world of AI? I think a lot of people's go-to when they think about AI is Python because of its history with, I think, data science and the popularity that it has there, and frameworks like TensorFlow and PyTorch. But where does Java fit into this world? Yeah, that's a great question. So for me, where Java really becomes interesting with AI is to actually bring value to these

Starting point is 00:23:22 AI models, right? Because one thing is creating models and experimenting with them. Another thing is like, how do we integrate them into, you know, enterprise systems, you know, like banking and insurance and government, you know, that's where you see a lot of Java being used. And there's some really, you know, some really interesting use cases, but you know,

Starting point is 00:23:43 like these developers are not Python developers, right? The vast majority are Java developers or something close to that. So there's a lot of interest to integrate these models into existing applications. I mean, if you think like maybe as an example, insurance companies, like they can leverage AI perhaps to do some sort of,

Starting point is 00:24:13 you know, like train their, based on their data, hey, if there's an insurance case coming in, like we can already kind of provide the agent with a lot of information about this case and what, you know can already kind of provide the agent with a lot of information about this case and what, you know, we think is going to, you know, like summarize it and then also get some sort of an idea of, you know, this is, you know, an acceptable case or this is not. And they can, you know, start classifying this kind of data.

Starting point is 00:24:41 So, you know, whether it's insurance or banking or, you know, there's lots of systems where this kind of use case becomes interesting. I mean, this is, of course, just an example, but that's fairly trivial to do with AI, right? This kind of classification or sentiment analysis where, you know, like, is this customer really angry? Then maybe we should prioritize this message, even though that's not really fair, but, you know, those kinds of use cases. So, yeah. So we see that that's starting to be adopted.

Starting point is 00:25:19 Yeah, and as you talked about earlier, like a lot of enterprise applications are built on Java. So if they're going to do something in the world, integrating AI into whatever existing software they have, it's probably a more natural place to begin. Same with similar people building a lot of backends on Go today, there's probably less enterprise software that's running Python on their backend as a Django or something like that versus relying on Java or Go or even say Sharp or some of these other languages.

Starting point is 00:25:57 So you need to essentially develop an ecosystem that allows those languages to plug in and take advantage of everything that's coming out of what's available in Gen AI. Yeah, absolutely. There's also a couple of interesting things that I saw. So I was at the DevOps conference last week. And so it's a conference in Belgium.

Starting point is 00:26:20 It's fairly Java focused, but also the organizer is really into AI as well. And so you had kind of these paths crossing a little bit into some interesting use cases. So, you know, like there were some examples of new projects where they're creating these kind of pure Java-based inference servers, which is interesting because once you start looking at Java in terms of performance, it's quite a powerful language as well, right? So for serving models, it might actually be an interesting use case, especially like you can compile Java down to native binaries these days

Starting point is 00:27:09 with GraalVM too, and really get some fantastic performance and startup time out of it. So I'm curious to see, like where all that is going as well. Yeah. In terms of like the Java ecosystem supporting some of these ai integrations like what what tooling exists you know what is sort of the state of the language

Starting point is 00:27:34 in terms of being able to to take advantage of some of the stuff so right now there's let's say there's two main kind of uh projects there to integrate AI into your Java code. So for the Spring ecosystem, there's a Spring AI project that's, of course, specific to Spring. And then there's a link chain for J, which is a project that's kind of, you know, framework diagnostic for Java. And that's the one where, you know, I'm also a little bit invested in, in terms of, you know, like, that's something that we at Red Hat are playing around with quite a bit. And it's a really interesting project. So basically, it's a library to make it relatively simple to work with AI models. I mean, at the end of the day, it's kind of like glorified kind of abstraction layer between, you know, like the REST calls and your Java code, of course. But it makes it, you know, relatively easy to work with Java. So for example,

Starting point is 00:28:52 we at Red Hat work on the Quarkus framework. And so we also have like these extensions in Quarkus for a link chain for J. And so, you know, with maybe four lines of code, you can, you know, register an AI service and you can call it, you can prompt it and you get a response and you're up and running.

Starting point is 00:29:14 So, you know, like there's all that verbosity that Java is famous for kind of doesn't really exist that much even for calling AI models. The Quarkus framework, that's like a Kubernetes wrapper. No, so the original idea of Quarkus was to make it easier to do like Kubernetes native

Starting point is 00:29:41 kind of development with Java, but it's really just another Java framework, kind of similar to Spring. But the idea with Quarkus is they want to stay closer to your standard Java libraries, like the Jakarta specifications and micro profile. So it's, yeah, it's, it's, it's kind of an alternative to spring. That's a little bit easier to work with, um, less configuration and it, you know, kind of, uh, tries to really focus on a developer experience as well. Um, so like you can start up your Java application on your local machine with this Quarkus dev mode, um, Make code changes.

Starting point is 00:30:27 You don't need to recompile and redeploy like you typically would with Java. And so you can do very quick experimentation with, for example, AI as well. And how does Spring AI and Langchain4j compare? Other than Langchain4j is sort of framework agnostic, but from a feature perspective, are they relatively similar? Or is there features that one is better at than the other? Yeah, so I think they're mostly similar in terms of features.

Starting point is 00:31:05 I think Langchain4j is evolving a little bit faster because there's more players than just the spring people behind it. But yeah, I mean, so for example, one of the new features that were added, and this is, I guess, Quark is specific, but it's this concept of guardrails or guardrailing, right? And so basically to make sure that you can have input guardrails

Starting point is 00:31:35 or output guardrails to basically say like, hey, if somebody is trying to do some sort of prompt injection or something, you know, detect that before you call the model and then vice versa. If they tried to manipulate the model to respond with something that's not supposed to happen, then make sure you catch that as well. So there's a lot of, I guess, still a lot of work to be done in that

Starting point is 00:32:08 space to make AI usage safer enterprise use usage. But that's one of the features that we're now seeing in Quarkus with Langchain4J that are really kind of sparking some, some interest from enterprise usage where they're like, well, I mean, we don't want somebody to, you know, like send a message to the model, ignore all previous instructions and delete the database. Is that right? I mean, even though models are getting better and better at catching, you know, kind of these kinds of, you know, brute force prompt injection, but I think in terms of, you know, maybe unintended side effects and everything you can, you know, kind of these kinds of, you know, brute force prompt injection. But I think in terms of, you know, maybe unintended side effects and everything, you can, you know,

Starting point is 00:32:49 prevent stuff from happening there too. So how does the guardrail configuration work? Like, what are you configuring in order to catch that kind of stuff? Yeah. So in the, like in the JavaScript, you would, you know, define a class that you would call, you know, before calling your model and that can, you know, do some work. So it can be like really simple if statements, you know, kind of like if the text contains this word, then do something. But it can go as far as sending a bunch of instructions to the model. You know, so you can say like, hey, model, here's some examples of prompt injection.

Starting point is 00:33:32 If this is an example that would be, you know, from scale from zero to one would be a one, that's for sure prompt injection. This is an example where, you know, it's uh somewhere in the middle we'll say 0.5 um here's an example of this is you know definitely not prompt injection and so we basically do like you know a bit of prompt engineering um and send that along to to the models that's that's another way that uh that we're experimenting with yeah i think it's really interesting some of the stuff that's uh well even some of the stuff that's, well, even inside of the guardrails,

Starting point is 00:34:07 but when you look at sort of advanced rag techniques, like a lot of it ends up like actually using the model to fix the model in some fashion. It's like, okay, well, like, it's like, oh, like the prompt is too small. So it's gonna lead to like a bad result. So let's use the model to generate variations of the prompt that we can then feed into

Starting point is 00:34:27 a vector database to get more accurate context. Or let's take the prompt and we'll match that to a document, and use the model to summarize the document to have a more general context. It's crazy. A lot of it is like, I mean, there's a lot of experimentation and it feels a little bit hacky in some ways, but it's kind of incredible what you can do with these generalized models to kind of fix the problems with the generalized model. a little hacky because it totally... I'm pretty sure in a few years, we're going to look back at the way that we're tackling these challenges right now. We're going to laugh about it,

Starting point is 00:35:13 but it's surprisingly effective. By giving the model some additional instructions and like link chain for J2, it'll generate, you know, additional context, you know, kind of send that along to the model in addition to what the end user is prompting. But it's all very much like text-based, right? I mean, it's all just like natural language, you know, hey, you know, like, do this and don't do that. Yeah. In some ways it's like, it's, it's like not that different than what people used to do back when I was studying machine learning in graduate school, where there was a lot more guesswork around like massaging some of the input parameters,

Starting point is 00:35:58 like, Oh, like adjust the state of value. And then it leads to like a better precision and recall, but no one really knows why why that you know why that's the case so it ends up being very specific to like the problem that you're solving and now we're kind of using a lot of these like prompt techniques to use that to sort of massage the the entire um application into helping us solve the problem that we're

Starting point is 00:36:20 trying to solve in a like more accurate way yeah Yeah, that's funny. And that's an interesting thought too of the different approach of, let's say, the classic software engineer that's very deterministic in the way that they approach software, right? It's like they expect, you know, like very defined, like if I do this, then this is going to be outcome and I can write my unit test exactly to expect this value. And now we're going into this kind of generative AI world where that's not the case. You don't have these deterministic outcomes. And I see like the very kind of traditional software engineer struggling with that concept.

Starting point is 00:37:13 Like, it's like, wait, but how do I test this? You know, because I'm used to testing this, you know, it's true or false. And yeah, so I think there's some interesting challenges for for those personas but yeah i mean i think we need to evolve uh into accepting that this is part of the new reality right i mean that natural language is is part of uh like that's that's how we approach it in real life too if you ask somebody for a question they're not always going to answer exactly in the same words and the same meaning especially if you ask different people aka different models um you get you know variations of that yeah absolutely and like i

Starting point is 00:38:00 think what this changes is it's a lot less black and white, and you're dealing sort of with a lot more of this gray area. And you're dealing somewhat in sort of ranges of probabilities. And when you're putting together any, going back to RAG, each sort of choice that you're making from the pipeline choices that you're making for populating a vector database or how you're doing the information retrieval step. Each of those have trade-offs in terms of accuracy

Starting point is 00:38:30 and efficiency with the choices you're going to make there, like down to like, how are we going to do chunking for this? How are we going, what are we using for embedding models? So each choice could throw off essentially the accuracy of what you're doing. And that's why I think this gets like really complicated when you start to try to do this in like an enterprise application at scale for real users

Starting point is 00:38:53 is because if things are kind of wildly off, it's hard to know, trace back to like exactly where that stuff happened because there was, you know, probably 25 decisions made along the way, at least that could have thrown off the accuracy. Yeah. Yeah. Yeah, absolutely. And I think we, you know, definitely have, have some challenges to educate ourselves. Um, because I think a lot of developers, um, that start with AI, they're, you know, like, oh, here's a prompt. And oh, wow, it comes back with a surprisingly accurate answer to the specific question that I'm asking.

Starting point is 00:39:33 And then when they really start integrating into their systems, we were creating this tutorial based on this fake insurance company, and we feed it some documents, right? With RAG and really what seemed like a fairly straightforward outcome of like, okay, well, if we ask the model to not allow this particular use case, then we give it information that, you know,

Starting point is 00:40:08 if this request contains this kind of data, then don't allow this request. And then you start playing around with the different parameters and you come back with, or different models even, and you come back with completely different parameters and you come back with, or different models even, and you come back with completely different behaviors and where you thought it was going to be really straightforward. It's completely off. And I mean, I struggled with that at the start too,

Starting point is 00:40:36 and I probably still do because I don't really have that data scientist background either. So I have to get on board with these new concepts too. And it's, but it's a fascinating world. And I think, I mean, it's going to be part of the tool set of the developers and, you know, like going forward, I think we're all kind of going to become maybe not, I mean, definitely not data scientists,

Starting point is 00:41:02 but, you know, some of those concepts we're definitely going to need to pick up. Yeah, well, I think some of the challenge shifts from sort of what we might consider like a traditional machine learning challenge to being like an infrastructure challenge. Like every prompt essentially is a sequence of data engineering steps that is being executed. So that is kind of like a different skill set. But I think if you're coming from the data engineering world, then you have to get comfortable, like you said, with dealing with some of these kind of like non-deterministic results that you might have to

Starting point is 00:41:37 try to navigate through. So if you were building, you know, some sort of AI-based application from scratch, what's your sort of like-based application from scratch. What's your tool stack look like? Yeah, I mean, I think it's really based on what I said before, right? Quarkus and LaneChain4J, if you use those two together, you can go pretty far away with getting your application up and running. You can prompt or you can add system messages, user messages.

Starting point is 00:42:09 You can add the guardrails. There's different ways of doing RAG. So you have your traditional RAG where you're going to use vector databases and make sure your embedding models are all in place. The cool thing with that Quarkus stack is that they also have this easy rag functionality because again, for developers who are getting started, getting your mind wrapped around this whole kind of RAG concept and vector databases and the whole to a folder in, you know, like where you store documents, PDFs or text file, whatever it is, right, that you can parse. And then it can just parse through those documents and then basically, you know, vectorize them either in memory or,

Starting point is 00:43:22 you know, also, you know, back them up with a caching system. But I think those kinds of tools and tool sets are really key to democratize this whole AI space for not just experts that know exactly how to work with these AI models in a fine grained way, but at least, you know, for developers who want to use to create simple use cases that they at least can get started and then probably, you know, work their way deeper into it. But yeah, I think that's what I'm seeing, what I'm kind of thinking where we're going to go with enterprise integration with AI is these kinds of simpler use cases where the complexity kind of maybe gets shifted and perhaps simplified in some ways. But we'll see. I think we do need to learn the concepts a little bit, but.

Starting point is 00:44:27 Yeah. Yeah. I mean, I, I, it's, it's kind of like, you know, interfacing with the database. Like, do I need to know necessarily how to build like a B-tree index from scratch? Maybe not, but it's helpful for me to understand that concept so that I understand what it means to create like an index on a database and things like that. So there's a difference, I think, between like, hey, I'm actually coding my own deep learning network versus I understand sort of conceptually what that thing is and what the value it is that brings me so I can use it like a Lego block in my tool set. Going back to the question I asked around your toolchain, besides Langchain and Quarkus, what about model choice or even where you're going to store your

Starting point is 00:45:12 embeddings? What kind of vector database or vector index would you use? Yeah, that's a good question. For vector databases databases i don't have a preference uh my my simple go-to is like a redis because you know it's it's really simple to start up so um that that's that's another kind of cool feature with uh with quarks is they have these dev services that basically like if you have a dependency on a database or like a p vector, you know, whatever, and you don't have one running on your local machine, it's going to start it up with a test container. So it's just going to start up a container that has like a little ephemeral database, you know, that you can run when you're, when you're developing.

Starting point is 00:46:01 So that's, you know, in terms of, of the the database that's why i'm like you know for my use case it doesn't really matter because i create demos and i don't need to think about too much about you know how this is going to be supported on production you don't have a billion vectors that you're storing yeah exactly um yeah and then in terms of models, yeah, it's fascinating, like, where do I go for these models, right? Do I get an OpenAI account because everybody knows ChatGPT? Or do I go, you know, like with a different vendor? Or do I go to HuggingFace? But then, you know, like, how do I serve these models?

Starting point is 00:46:58 And where? So, yeah. So for me personally, my kind of tool set is um i i try to if if i have the patience at least i try to run it on my local machine um now i don't have like a super powerful system uh i ordered one with uh dedicated gpus and that was months ago, and I still haven't received it. So there's a couple of ways that I run it on my local machine. So Olam is, of course, the one that's the most known. So we also work on Podman and Podman Desktop.

Starting point is 00:47:41 And so there's also a Podman AI lab extension for Podman Desktop, which makes it relatively easy to just like you go into the UI and then you have a kind of a list of models that are already kind of there for you to download. So they're like some open source or relatively open source models. Or you can get your own kind of from Hugging Face. So they're usually the GGUF ones. But yeah, and then you can just run that on your local machine, open the ports to it,

Starting point is 00:48:21 and then you can embed that into your uh into your application um it doesn't always respond as quickly as i would like but it works in a pinch besides like the you know hit that you're taking from like an inference standpoint for performance like is there a do you notice a big like accuracy uh difference between using radio running those models that you can run locally versus something like you know bigger and more robust yeah absolutely yeah i mean these models are like you know like compressing quantize or or whatever it is and uh so yeah they're definitely not as accurate uh sometimes hilariously so um but again, if you're kind of experimenting with building a system

Starting point is 00:49:08 and in the first few iterations, you're not too concerned about the accuracy of it all, it works pretty well. The other thing is the customers that we work with, a lot of them are looking for very specific models for their use cases, right? And they're like, well, I don't need this massive model that's able to do everything like the kitchen sink.

Starting point is 00:49:34 We want just these trained models for our use case. And in that case, it could be interesting to run them locally as well because most likely they'll be run on some kind of server in their local data center or something that likely doesn't have all the resources that they need. Kubernetes clusters where maybe they have, you know, some serverless capability that, you know, like right now where we want to serve this one, scale it up and it's not being used, scale it down so we can have this efficient usage of GPUs. You know, so yeah, there's quite a bit of variance, but yeah, so we're playing around with these, I don't know if you call them small language models or what the definition is of those these days. But there's some interesting use cases for those as well that we play around with. Yeah.

Starting point is 00:50:35 Well, let's switch gears here as we start to come up on time and go to some quickfire questions. So, you know, if you could master one skill you don't have right now, what would it be? It's a good question. I was thinking about something like that earlier. And it's like, like I said, I'm not a data scientist, right? And right now it definitely feels like, man, if I had a little more skills in that space,

Starting point is 00:51:03 I think right now that there would be really interesting because we all have these ideas of like oh yeah we could do this with ai or that with ai but you know like i don't have you know like the the skills to really do like uh deep grain you know deep-grained uh model training so i mean that's a very kind of uh right now kind of skill that that i would like to learn you you know, but we'll see in a couple of years. Yeah. What was the most time in your day? So for me personally, because I travel quite a bit to, you know, to speak at conferences or to, you know, like go visit events. We do have to do like a lot of travel bookings and dealing with the administration of that and then creating trip reports and all that, man,

Starting point is 00:51:52 it takes a lot of time that I wish I could spend on something else. Yeah. I hear you. If you could invest in one company that's not the company you've worked for, what would it be? Does IBM can? Sure. not the company you've worked for what would it be um does ibm can't sure you can legitimately buy go out and buy some ibm uh stock so yeah no actually uh it's funny because uh when i started at uh at red hat it was just before they announced the uh the acquisition of of uh of i well of Red Hat by IBM. And I was like, okay, well, that seems like they might have some vision.

Starting point is 00:52:30 So I'll buy some IBM stock. And it has doubled over those few years. But no, probably what I would do is buy stock in healthcare AI applications, right? I think there's some massive, massive wins in the AI space that I think we're just kind of scraping the surface of. Yeah, absolutely. What tool or technology could you not live without? Well, sadly, my cell phone, right?

Starting point is 00:53:04 Especially when you're traveling and you need a hotspot to get anything. Yeah, that's a really boring answer, but yeah. Well, I mean, I broke my phone traveling last year through India, and it's so difficult to actually travel now without a phone because everyone just assumes you have a phone. So it's not even like 20 years ago before when people really were carrying around smartphones. Back then, the world was set up better to facilitate travel without a phone than it is now because essentially everything assumes that you have one. Which person influenced you the most in your career? So one person was fairly early in my career who was, you know, like the software architect of our team.

Starting point is 00:53:53 And he really inspired me to, you know, like go beyond just creating some, let's say, building some code and really think about how to architect systems and also go towards, let's say, the more platform engineering part of the software. And that's allowed me to evolve my career and also probably get hired by Red Hat like not just for my my software engineering skills um and then more more recently and and he's probably not aware of it so you know there's this uh person at red hat bar sutter who's kind of the the uh the og developer advocates uh he's not like doing this anymore but you know he he was a one of the first red hat people that I, you know, like saw on YouTube and talking about, you know, open shift or whatever it was back in the day. And, uh, I was like, wow,

Starting point is 00:54:53 that's really cool. You know, how he can talk about those things with, with passion. You know, I really looked, looked, uh, looked up to that. And, you know, like, it's cool that now I'm doing a similar role. Maybe I'm not as good, but you know, at least it's cool that now I'm doing a similar role. Maybe I'm not as good, but, you know, at least he's inspired me. Cool. And then five years from now, will there be more people writing code or less? I think more. It depends on what you consider writing code, right?

Starting point is 00:55:19 Because, I mean, this evolution of, I mean, if you think about it, back in the day, it was, you know, punch cards. And then at some point it was literally like writing everything. But if you think about, even without AI right now, we do everything in IDEs. And in a way, we're already writing a lot less than we were. And I think that's going to evolve probably a little bit more into not writing as much, but I mean, it's still in the same way, we're still developing code, right? So I think that's definitely not going to go away and probably going to pick up more as we find more use cases.

Starting point is 00:55:58 I mean, AI is just like this additional evolution in our software world. So I think more. Well, great. Awesome, awesome job, Kevin. It's great connecting with you again. How can people follow you and figure out what are you up to? Where are you going to be next in the world? Right.

Starting point is 00:56:19 I'm probably the most active still on Twitter Twitter just because I can't find a different channel that you could just kind of send some random messages, right? But yeah, and I'm using a lot of LinkedIn these days as well. I really tried like some of the other platforms like Blue Sky and Macedon, but yeah, at the end of the day, it just doesn't have the same traction, I think, which is a little bit unfortunate. But yeah, so definitely probably Twitter and LinkedIn.

Starting point is 00:56:50 Awesome. Well, Kevin, thanks so much for being here. Yeah, thanks for having me. Cheers. Really appreciate it. Yeah. Cheers.

Software Huddle - Java and Building AI Applications with Kevin Dubois

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.