Orchestrate all the Things - Evaluating and building applications on open source Large Language Models. Featuring Deci CEO / Co-founder Yonatan Geifman
Episode Date: March 6, 2024

If we look at the current status quo in AI as a case of demand and supply, what can we do to close the gap between the exponentially growing demand on the side of AI models and the linearly growing supply on the side of AI hardware? This formulation was the premise on which Yonatan Geifman co-founded Deci in 2019. Today, with the generative AI explosion in full bloom, demand is growing faster than ever, and Deci is a part of this by contributing a number of open source models. Join us as we explore:
- How AI models are different than traditional software and what open source means in AI
- Choosing between GPT-4, Claude 3 and open source LLMs
- Customizing LLMs and fine-tuning vs. RAG
- Evaluating LLMs
- Market outlook

Article published on Orchestrate All the Things: https://linkeddataorchestration.com/2024/03/06/evaluating-and-building-applications-on-open-source-large-language-models/
Transcript
Welcome to Orchestrate All the Things.
I'm George Anadiotis and we'll be connecting the dots together.
Stories about technology, data, AI and media, and how they flow into one another, shaping our lives.
If we look at the current status quo in AI as a case of demand and supply,
what can we do to close the gap between the exponentially growing demand on the side of AI models
and the linearly growing supply on the side of AI hardware?
This formulation was the premise on which Yonatan Geifman co-founded Deci in 2019.
Today, with the generative AI explosion in full bloom, demand is growing faster than ever,
and Deci is a part of this by contributing a number of open source models.
Join us as we explore how AI models are different from traditional software and what open source means in AI,
choosing between GPT-4, Claude 3 and open source LLMs, evaluating large language models, and market outlook. I hope you will enjoy this. If you like my work on Orchestrate All the Things,
you can subscribe to my podcast, available on all major platforms,
my self-published newsletter, also syndicated on Substack, Hacker Noon, Medium and DZone,
or follow Orchestrate All the Things on your social media of choice.
Hi George, I'm happy to be here. I'm Yonatan, the CEO and co-founder at Deci.
And in my background, I did my PhD in computer science at the Technion from 2015 to 2019.
It was the early days of deep learning.
And at that point in time, it was mostly around CNNs and understanding what people can do with
deep learning.
But one of the understandings that I had at that point in time is that the computational complexity
of deep learning is something that is really going to block us from achieving or getting
to the full potential of adopting AI in industry applications and real-life applications.
And we can see it today, but the hypothesis that we had in the past was that basically
the computational complexity of models is growing exponentially.
If we'll think about GPT-3, GPT-4, and those large language models, we can see
larger models that require more and more compute. And the available compute given by the hardware
is something that is growing somewhat linearly, especially when we're comparing it per watt or
per cost, per dollar. And the idea is that there's a gap between those two numbers,
kind of a supply-demand problem,
that the demand is coming from the algorithm
and the supply is coming from the hardware,
and there's some gap that needs to be resolved,
either by building better hardware,
which is something that I don't know how to do,
or building better algorithms that are more efficient
and better utilizing or leveraging the hardware.
So that's where we started with this understanding.
It was pre the LLM and generative AI world of models that are getting
larger and larger and things like that.
And it was really based on that hypothesis.
So we thought, okay, that process of building AI is very manual.
How can we automate that?
How can we use AI to build better AI?
And we ended up building a company that is focused on the algorithmic part of AI,
building models that are more efficient and more accurate.
And today we have a range of foundation models that we built, most of them in open source
models, like DeciLM and other models, that are showing great accuracy and efficiency
together where companies can take them and benefit from those models together with our
commercial offering, which is a set of tools to take those models, customize them to the use case with fine-tuning and training techniques,
and then deploy them very efficiently.
So that's kind of the short story about what we saw in the early days of deep learning
and how we decided to start Deci
and what is the mission of Deci.
So that's the origin of Deci.
And people ask me also, why Deci?
What is the name Deci?
So it's coming from Latin, from decimus,
which is one-tenth.
And this is exactly what we're trying to do
for modern AI,
to make it one-tenth of the computational complexity and of the development time,
and help companies to ship AI faster with better performance.
Great. Great. Thank you for the intro.
I have to say, personally, I was familiar with a big part of that
because we've had the pleasure of connecting previously, but I'm sure it's been very useful for people who may not have heard about you
and what you do previously.
And actually, I also learned something new about the origin of the name,
which I didn't know.
So thanks for that.
And since we've connected before, I was already aware of the fact that you work on the algorithmic side of things, of models.
And I was also familiar with your previous, let's say, line of products, which was also open source, by the way, and was focused on computer vision.
And last time we connected was in 2022.
So before this whole large language model craze took over the world.
But I know that you also have a line of products, a line of language models, that are open source. So around open source and large language models and how these two interplay,
let's say, I thought you are a very suitable person to have this kind of conversation.
Before we get into the weeds, however, I thought it's worth
starting by actually defining open source models, because, well, it's not as self-evident
as it may seem at first and the reason I'm saying that is well because open source software has been
around for a long time and it's sufficiently well understood even though you know different
licensing schemes always have their peculiarities let's say, that somehow require a bit of expert
knowledge to navigate. However, open source AI models are relatively new and they are different
from traditional software because there are more artifacts involved. So it's not just code.
And also their lifecycle is different. So you have the whole AI model life cycle from training to convergence and release
and subsequent release and so on.
So I would like to start by just going through the different artifacts that constitute an AI model
and by doing that, we should also be able to more properly understand and define what constitutes
an open source AI model.
So we have the source code, we have training data and processes and weights and different
metrics.
So some people, when releasing models, they also include metrics and the binary code,
so the final product, let's say. So based on your own understanding and involvement, since you also deal in open source models,
which of these parts would you say constitutes an open source model?
That's a good question.
And I think that in order to answer it, we should go back to the goal of open source. In my perspective, it's to let people build and use on top of other work
and collaborate around those derivatives that are in open source.
And if we think about those goals,
first and foremost is having the model available out there for people to use.
So it's the weights and the code to run the model.
Those are the most important thing to put in open source.
The collaterals on what is the accuracy and what is the metrics are important,
but those could be also reproduced by testing the model.
And we see great initiatives like LLM Leaderboard that takes any model and run it through a
lot of tests to show the numbers and the accuracies on various data sets and things like that.
So I think that the most important aspect is the weights and the code to run the model,
to help or enable people to use that model in their applications
and to build on top of those models either by fine-tuning and other techniques,
continue the work of that model and bring derivatives out of it.
So those are the most important aspects.
Alongside those, there are the data and the training process,
which are also important, but not at the level of the model and the weights.
And some companies, like Deci and Mistral, keep those a little bit more closed.
And I don't think that it affects the fact that the models are open,
or prevents people from using those models,
because those are kind of trade secrets today.
So it's not like what is the exact data,
but it's also how to mix the data
and what is the right mixture of different components of the data
in order to get to this and that model accuracy.
So those things that are significantly less important than actually putting the model
itself in open source.
So those are the components.
And in terms of, you know, the traditional definition of open source, I think that the
community converged to a point where, instead of calling them open source models,
we can call them open weights models.
So I think that that's the current stage of open source in AI.
Obviously, there are also examples that, you know,
the training data is shared and the training process also,
that's shared for most of our models.
Sometimes it's useless because most of the organizations don't have the amount of compute
in order to reproduce the training of those models.
So basically, the most important aspect is to release the techniques, the code,
and the weights of the model, so that people are able to use that model
and also build new models on top of it, derivatives of the open source that the company shared.
So those are the important aspects of open source AI.
I think that the training process could be perceived as the development process. So in terms of open source,
if you have some algorithm in open source,
in some cases you see code,
and in some cases it's hard to understand the code,
and in some cases it's easy to understand the code,
but you don't get a log of the brain function
of the developer that built that open source.
And that's similarly the training process.
So you don't really need a training process in order to use those open source models.
Yeah, it's interesting.
I think I tend to agree with the gist of what you said, at least if I got it right, which is basically that, well, open source models
are a different animal compared to open source software.
So, precisely because of the fact that the process is different, the artifacts are different,
maybe people should come up with a different definition that covers open source in AI models,
because, to be honest with you, I think it may be a bit confusing for people who are not familiar with the intricacies of everything that producing an AI model entails,
to call a model open source when there's no one-to-one equivalence, let's say.
So maybe we should come up with a new terminology that makes things
more clear. Also, interestingly, what you mentioned about the training data and process
not being that important, I see the point and I think that makes sense if the intent is what you
also said, for people to take that model and use it in whatever applications
they're building. If that is the case, then yes, I think you're right, and the training data
and process are not that important, because what matters is what you can do with the end product.
But if the intent is for people to be able to reproduce what has been generated as an
open source model, I think it's quite
important.
I just saw yesterday, actually, a statement by an engineer that works at OpenAI,
and he was basically saying, well, in my experience, you know, all the hyperparameter
tuning and even all the different architectures don't matter as much as the data set.
Eventually, you know, no matter how much tweaking you may do,
no matter how many different architectures you may try,
what matters most in terms of converging towards something
is the data that you're using.
And I'm sure that the process and what you also highlighted,
so how you take different parts of that data set and what you emphasize and what you de-emphasize is also important.
So, again, it depends on what you want to do with the end product. There is a spectrum from the closed source models like OpenAI, GPT-3, GPT-4,
to full open source as we think about it with the data hyperparameters and everything.
And most of the AI models in the generative AI area are somewhere in the middle.
I'm talking about Llama models, Mistral, Deci.
Most of those are sharing all the things that are needed in order to take and use that end product,
but keeping some of the aspects closed.
And we're in an ongoing debate about sharing more and more.
So in some cases where the data is not proprietary, we are sharing what data we use and what learning rate and hyperparameters
and in some cases we cannot share that.
The goal is to do as much as we can to push the community,
the open source community, and release as much models
with the best accuracy that we can. But sometimes we have limitations
on that. I believe that also Meta has some limitations
in sharing what data they used to train Llama,
and the same happens for Mistral.
Yes, you're right.
It's a spectrum, and it all depends on what you want to achieve in the end.
And definitely there's a whole debate
whether it's best to use open source models and customize them and tweak them or just use something off the shelf.
And there are many aspects to that debate. So obviously, leading a company that is releasing open source models,
your position is de facto clear.
You're obviously for open source models.
However, let's try and just go through the sides, let's say, in this debate.
And I think the first thing to examine in terms of looking at this from the end user,
let's say, point of view is the classical build versus buy.
So the question there is, so is it worth for a company to invest in having their own model?
So can they see it as a strategic asset that they need to own?
Or is it fine for them to just sort of outsource that to OpenAI or, I don't know,
whatever other closed source company and, you know, just use their APIs and not worry about that.
And there's many things to consider there.
So the competitive edge.
So if you're using something that's the same for everyone in your domain,
then what's the differentiation? Then there's data ownership and, you know, concerns about how much of the data that enterprises share through API calls really belongs to them
and what happens with data that goes over the wire.
And I'm not talking about technical considerations, but more like things like terms of use or potential data leaks and so on, and also safety and robustness.
So how safe it is to use these third party models.
So, in your experience, and I'm sure that you talk to lots of clients that bring those issues in front of you, how do you advise them? Yeah, so first, I suggest to zoom out
and to understand the role of open source in AI
and the risk that we are seeing in reducing open source
or leaning towards closed source, as we see today.
If Google and Facebook wouldn't write PyTorch and TensorFlow,
and if Google wouldn't open source or write the academic paper about transformers, "Attention Is All You Need", we wouldn't be even close to the point that we're at today in terms of the capabilities of AI, the potential of AI, and the adoption of AI.
So AI is built on top of open source, either in framework for development,
algorithms, training, innovation, and things like that.
So today, the benefit and all what we're seeing is based on open source,
open source academic research that some or even most of it have been done in the large tech companies
like Google, Microsoft, and Facebook and others.
So there's a very important role of open source in the field of AI
to the continuous progress of AI in the world,
which I am very positive about, although there are some discussions about safety and things like that,
but I am very, very positive and supportive for the progress in generative AI in general these days. So that's just to put things in context about the importance of continuing to contribute to open source, publishing academic paper with innovative findings and algorithms, etc.
So that's from a high-level perspective about the question.
Now we're diving deeper today.
AI practitioners have two options.
One of them is to go to the closed source APIs,
things like OpenAI and Anthropic and others.
And the other alternative is to use open source AI models,
completely on-premise, taking them from Hugging Face,
using some open source frameworks.
And as discussed, there is a spectrum.
You can take those open source models and use them on inference-as-a-service providers, companies like Together AI, but you can also use some tools, either open source or commercial solutions, to run those models on your premises.
So there is a range. What you need to think about regarding that range is that probably the easiest thing that you can do
in order to get up and running is to use one of the closed source API.
The time that it will take you, the proficiency that you need,
and the effort that you will need to put in place in order to get that up and running is relatively low,
and you can do it really quick, and that will help you to get to the POC level.
But in some cases, that will not scale as nicely as your organization is expecting or needs,
because of several kinds of drawbacks of closed source APIs.
First of all, it's a black box.
So you will probably be able to build a nice demo,
but when you want further customizations,
you will be blocked in the amount of fine-tuning or customizations that you can do to those
and the control that you have on the algorithms that are running on those models.
Also, for example, when the model is updated, it will update your application without you having any control over any deviation in quality that you will perceive in your scenario.
Another aspect is the cost of closed source AI.
Basically, you are paying a premium here that is based on the number of API calls that you're running. And when you scale your application, you can get to very high costs
based on the amount of tokens
that you are generating through those APIs.
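To make that scaling concern concrete, here is a minimal back-of-the-envelope sketch in Python; the per-token prices and traffic volumes are made-up assumptions for illustration, not figures quoted in the conversation or by any provider.

```python
# Rough cost-of-scale sketch for a hosted LLM API.
# All numbers below are hypothetical assumptions for illustration only.

PRICE_PER_1K_INPUT_TOKENS = 0.01   # assumed $ per 1K prompt tokens
PRICE_PER_1K_OUTPUT_TOKENS = 0.03  # assumed $ per 1K generated tokens

def monthly_api_cost(requests_per_day: int,
                     input_tokens_per_request: int,
                     output_tokens_per_request: int) -> float:
    """Estimate monthly spend as a pure function of token volume."""
    daily_cost = requests_per_day * (
        input_tokens_per_request / 1000 * PRICE_PER_1K_INPUT_TOKENS
        + output_tokens_per_request / 1000 * PRICE_PER_1K_OUTPUT_TOKENS
    )
    return daily_cost * 30

# The same application at demo scale versus production scale.
print(monthly_api_cost(1_000, 1_500, 500))      # roughly $900 per month
print(monthly_api_cost(1_000_000, 1_500, 500))  # roughly $900,000 per month
```

The point of the sketch is simply that spend is linear in token volume, so an architecture that looks cheap at POC scale can become the dominant cost once traffic grows.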
And many, many, many other challenges.
We talked about the data privacy issue.
Most of those APIs are working as a hosted solution
that you need to share the data with the API provider.
In some cases, you can take those into your VPC,
like with the Microsoft Azure OpenAI service.
But in most cases, you will use the API,
a kind of a pay-as-you-go approach.
So there's also data privacy issues with sharing your data
and ownership and license issues about the data that is generated
and things like that.
So all of those are challenges or drawbacks of using the API that works so well
in order to build a demo, the POC for the integration that you are trying to build.
I'm not saying that open source is very easy and will bring you what you need very, very fast,
but there's also challenges for open source.
But in a general perspective, we can think about open source in model development as open source in software.
You're getting more openness in terms of you understand what is running there underneath.
You're getting more control.
You can customize the application.
It's an open source model.
You can fine-tune and you can customize the model to your specific needs
based on your data.
It's cheaper.
You're not paying for the model.
You can run it on your infrastructure, scale that infrastructure in the cost
that you're paying for your cloud provider,
and that will be your cost at scale, basically. So all of those are kind of the benefits of
using open source models. And what we see, and I think that it's the state of the market at the
moment, is that companies are starting to experiment with the closed source API, and then when they understand that they need something more complex
or more sophisticated than what they can get with those APIs,
they are starting to work with open source.
Because one of the considerations that I mentioned,
either it's the data privacy or the scalability in terms of cost
or the control and product customization capabilities
that you are getting with open source.
And all of those are usually the motivation for organizations to go from their first POC
that they usually do on OpenAI or one of its competitors
to implementing an open source approach for generative AI.
And as the market will mature and we'll get more sophisticated,
I believe that we'll see more and more open source model development
and open source adoption in AI.
Another thing that we need to take into consideration
is the gap in model accuracy when you are comparing GPT-4
and, I don't know,
the best open source available out there.
So there is still a gap.
GPT-4 is the best model today, I guess.
And you can't get something
that is even close to that in open source.
But for most B2B applications,
you don't need the GPT-4 accuracy.
Open source gives you that customizability and control,
letting you choose a model that is the right size and the right complexity
for the task that you're trying to solve.
If you're trying to summarize documents, you don't need GPT-4 to do that.
You can do it with mid-size LLMs, with good performance, lower latency,
better cost performance.
So, yeah, if you need high-end use cases, you should go to GPT-4. But if you're trying to build some B2B application that can work with smaller models,
that's probably the way to go to open-source AI models and use them.
So, yeah, and I believe that this gap will be closed.
I believe that we'll see by the end of the year
models that are with the capabilities
or very close to GPT-4 in open source.
So that gap will probably
be closed towards the end of the year,
I believe.
Yeah, you raised a number of interesting points.
Let's see, I'll try and highlight them in reverse order.
So starting by the last thing you said about closing the gap in performance,
I actually just saw yesterday a new release of LoRA fine-tuned models,
a collection actually of them for specific tasks that purportedly actually match GPT performance
for specific tasks. So, you know, it just goes to show that if you want to highlight, if you want to
focus on specific tasks and you are able to fine tune a smaller open source model, you may be
actually able to get GPT level performance. And also, well, on this classical, let's say,
dilemma, the build versus buy, I think
yes, it makes sense what you said. So many organizations
probably start by just leveraging closed source
APIs because, well, it's the easiest thing to do and the easiest thing to
get them off
the ground.
But as they move along, I think what we'll probably end up seeing is what we also see
in any other type of software.
So you'll have the organizations that want to own and control and customize their deployment
and also the ones that don't see it as such a strategic investment
or just prefer to have someone else to take over
and actually also point fingers in case something breaks.
And so eventually, I think organizations will find their place along this spectrum.
Talking about performance, actually,
which is a very important consideration in this
evaluation, I think, again, we need maybe to take a step back and actually ask ourselves,
so what do we mean when we talk about performance here?
You mentioned accuracy, and that's a good metric.
There's also speed.
So how fast can you get a reply?
But then, you know, there's also the interplay of those.
So are we talking about easy things?
You mentioned document summarization, for example.
And again, there are many parameters to consider there. What kind of document?
What kind of context windows are we talking about?
So small documents, extensive documents, technical documents, general content documents.
So already, just by getting a bit into the weeds and just asking these questions out loud,
I think it becomes quite evident that evaluating models is not at all straightforward.
So the question there is, how do people currently do it?
What do you think are some meaningful metrics that organizations can consult in order to have a better idea of what works for their specific needs?
You mentioned the Hugging Face leaderboards, which I guess are kind of the de facto evaluation that people use.
You also mentioned open source models kind of closing the gap.
And I think we're seeing that happen.
There is a lot of documentation of that.
Some people are also even saying that maybe there's a sort of plateau that we've reached,
at least a temporary one in terms of performance.
So how do you evaluate performance, actually?
Yeah, so as you mentioned, performance is not only accuracy.
It's a combination of metrics, accuracy, latency, throughput, cost,
and other factors that we need to take into consideration
when we're thinking about AI inference.
And I think that one of the things that I didn't mention before
is that using the largest models like GPT-4 comes
not only with a cost, but with high latency that is really hard to incorporate
into real-time
applications. So the models that are 7
or 13 billion parameters are working faster by an
order of magnitude compared to GPT-4 models. So if you need speed, you are better off working with
smaller fine-tuned tailored models for your applications. In terms of accuracy evaluation, it's still an unsolved problem, but we see a lot of organizations trying to
understand how they want to evaluate their models.
Some people are using human evaluation, so writing some tests and comparing two models
to see which responses are better in which cases and try to analyze which model is better.
That's one approach.
Another approach is using a large language model
to evaluate another large language model.
So that's AlpacaEval and MT-Bench.
So those are very useful approaches to use LLMs
or very large LLMs, it could be GPT-4, in order to do
the evaluation of which
model produces better results.
But we've seen also that, for example in AlpacaEval,
there is a bias towards long answers,
which the LLM judge considers to be more accurate, more comprehensive, or higher quality,
which is not always the truth.
So, for example, if you consider measuring summarization results
with large language models,
and the judge tends to prefer longer answers,
it will probably miss the target that you are trying to achieve
with summarized answers.
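As a rough illustration of the LLM-as-a-judge approach discussed here, the following is a minimal pairwise comparison sketch using the OpenAI Python client as the judge. The judge model name, the prompt wording, and the order-swapping trick to dampen position bias are assumptions of this sketch, not the actual AlpacaEval or MT-Bench implementations.

```python
# Minimal LLM-as-a-judge sketch: ask a strong model which of two answers is better.
# Assumes the `openai` package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = """You are evaluating two answers to the same user question.
Question: {question}

Answer A: {answer_a}

Answer B: {answer_b}

Reply with exactly one letter, "A" or "B", for the better answer.
Judge on correctness and relevance, not on length."""

def judge_pair(question: str, answer_a: str, answer_b: str, model: str = "gpt-4") -> str:
    """Return the judge's raw verdict for one ordering of the two answers."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(question=question,
                                                  answer_a=answer_a,
                                                  answer_b=answer_b)}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

def judge_with_swap(question: str, answer_a: str, answer_b: str) -> str:
    """Run the comparison twice with the answers swapped and only count consistent wins."""
    first = judge_pair(question, answer_a, answer_b)
    second = judge_pair(question, answer_b, answer_a)
    if first == "A" and second == "B":
        return "A wins"
    if first == "B" and second == "A":
        return "B wins"
    return "tie / inconsistent"
```

The instruction not to judge on length and the order swap are small mitigations for exactly the verbosity and position biases mentioned above; they reduce, but do not eliminate, them.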
So it's still an unsolved problem.
Most of the organizations today
are employing human evaluation
or human-in-the-loop evaluation.
Some of them are using
large language models
in order to evaluate the models,
but the idea is that
you really need to evaluate the model
in the context that you're going to use it.
The LLM leaderboard means nothing for your company if you're trying to do summarization, for example.
You need to evaluate the model in summarization tasks.
And as you mentioned, summarization for long or short documents would be very different
and for technical documents would be very different from general documents. So you really need to build the evaluation pipelines for the models that you're using in the scenario,
in the data distribution that you are using.
And it's a challenging problem today.
So I don't have a good solution, except human evaluation and large language models being used for evaluating models.
Yeah, I can totally understand how challenging it can be.
And yeah, it's an evolving field. But even in fields that are not evolving,
if you take something like a database, for example, sure, you have benchmarks.
But in the end, what matters most is you need to set up a representative environment of the application that you intend to use the software for and evaluate different options there.
And I guess the same goes for LLMs as well.
And just the last one before we actually go to talk a little bit more about your specific offering.
So the open source LLMs that you're providing,
just to cover one base on customization.
So lots of people these days,
whether they're using off-the-shelf closed source APIs
or open source customized models,
they want to sort of fine-tune them
or use something called RAG, short for Retrieval-Augmented Generation,
in order to basically tailor the model more to their data
sets and therefore to their needs. So do you think it's easier
to employ something like that, whether it's fine-tuning or
RAG, using a proprietary or an open-source LLM?
I would tend to say, intuitively, probably an open source one,
but I wonder what your take is.
That's a good question.
And there's a huge difference between fine-tuning and RAG.
Fine-tuning is changing the actual model for a specific dataset,
and RAG is more around the model, around the generation,
which is the retrieval and augmentation part.
I think that it will be much easier to do,
or I would say that in RAG there is not a lot of difference if you're looking to do it for open source models
versus to closed source models.
In fine-tuning, on the other hand, it will be much easier to do
for an open source model, and to experiment
and get the results and understand them better,
etc. But I
think that in the debate of fine tuning
versus RAG, the future is kind of a
combination of both of them.
There's some benefits of doing RAG, mostly
from data privacy. You don't want
to train the model on private data, but
in RAG, you can supplement the model with private data in the prompt.
So that's one of the benefits of using RAG,
and also the updates, the fact that you don't need to fine-tune the models all the time
in order to keep them updated, so you can give them updated data.
You see very interesting products today that are using RAG,
like Perplexity and You.com and other LLM-based systems that are using RAG
in order to bring up-to-date data from the Internet.
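For readers who want to see what the retrieval and augmentation part can look like in practice, here is a minimal RAG sketch. It assumes the sentence-transformers package for embeddings, a tiny in-memory document list, and a placeholder for whichever LLM (open source or API-based) actually generates the answer.

```python
# Minimal RAG sketch: retrieve the most relevant private documents
# and prepend them to the prompt, instead of fine-tuning on them.
# Assumes `pip install sentence-transformers numpy`; the LLM call is a placeholder.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 6pm CET.",
    "Enterprise plans include a dedicated account manager.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    query_vector = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ query_vector
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

def build_prompt(query: str) -> str:
    """Stuff the retrieved context into the prompt for whatever LLM you deploy."""
    context = "\n".join(retrieve(query))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

# The prompt is then passed to the model of your choice, e.g.:
# answer = generate(build_prompt("Can I return a product after three weeks?"))
print(build_prompt("Can I return a product after three weeks?"))
```

Because the private data only travels through the prompt, it never becomes part of the model weights, which is the privacy benefit described above; updating the knowledge base is just a matter of re-indexing the documents.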
So those are kind of the benefits or the pros and cons of each one of them.
And the future is probably the mix of RAG and fine-tuning models
and obviously fine-tuning is much easier in open source.
So overall, I would say that building those more complex use cases
will be much easier to do in open source versus closed source in the future,
especially when the accuracy gap will be closed completely
between the closed source and open source models.
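As a hedged illustration of why fine-tuning is more approachable with open weights, here is a minimal LoRA sketch using the Hugging Face transformers and peft libraries. The base model id, target modules, and hyperparameters are assumptions chosen for the example, not Deci's or anyone else's training recipe.

```python
# Minimal LoRA fine-tuning sketch for an open-weights causal LM.
# Assumes `pip install transformers peft`; all hyperparameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model_id = "mistralai/Mistral-7B-v0.1"  # assumed example; any open-weights LM works

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)

# LoRA adds small trainable matrices to selected projection layers,
# so only a fraction of the parameters is updated during fine-tuning.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed; depends on the architecture
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights

# From here, train with a regular Trainer / SFT loop on your own dataset,
# then ship the small adapter weights alongside the base model.
```

None of this is possible against a closed API, where the provider decides which, if any, fine-tuning options are exposed.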
So with that, I think it's about time we actually shift focus
to what you do specifically around open source models,
and open source language models, to be more precise.
So I was wondering, basically, if you can just quickly
introduce us to how it works for you. So I know that you had a line of open source
models even before you started focusing on language models. So how do those fit in your
business model, in your overall offering? And we can start from that part.
And then we can focus specifically on the language model.
So what is your offering there?
And what are the different features?
How do people use them?
And so on.
Sure.
So Deci is building foundation models.
We usually put them in open source.
So most of our models are available in open source. You can check either SuperGradients,
which is an open source repository for training computer vision models
with computer vision open source models inside,
or you can check the Hugging Face page of Deci
where you can find text-to-image models and LLMs. So basically we have one
component in our offering,
which is the open source models that are free to use,
and you can try and use them.
And then comes the tooling layer,
which is tools to customize those models, like SuperGradients,
to fine-tune those models and to adapt them to specific data,
and then the tool to deploy them, which is called Infery.
So the idea, let's take for example DeciLM-7B. DeciLM-7B is a model that is similar in its
accuracy to Mistral 7B, but the interesting aspect about DeciLM-7B is that it's significantly
more efficient or performant in terms of inference speed than Mistral in this example.
So DeciLM is running at 4.5x better throughput compared to Mistral 7B.
If you take DeciLM from open source,
what you'll see is that it's running around 2x faster than Mistral.
But if you're using Infery, which is Deci's runtime engine,
Infery-LLM is for LLMs,
and Infery is for regular models,
you will see that you can get additional boost
out of the commercial offering.
So basically, it's an open-core approach
where the models are open source.
But if you want to get extended value
in the area of performance,
you can use our commercial offering,
which is called Infery, and the tools that are complementing the models
in order to shorten the development cycles and improve the performance of the models
as they run at the inference workloads in production.
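Since the 2x and 4.5x figures above are throughput comparisons, here is a minimal sketch of how one could measure tokens per second for such a model with plain Hugging Face transformers. The model id follows Deci's public Hugging Face naming, the other settings are assumptions, and the sketch does not use Infery, so it would reflect the open source baseline rather than the optimized runtime.

```python
# Minimal sketch for measuring generation throughput (tokens per second) of a local model.
# Assumes `pip install transformers torch accelerate`; settings are illustrative.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Deci/DeciLM-7B"  # assumed Hugging Face id for the model discussed here
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

prompt = "Summarize the benefits of efficient inference for production LLM workloads."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.perf_counter()
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
elapsed = time.perf_counter() - start

generated_tokens = outputs.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{generated_tokens / elapsed:.1f} tokens/sec")  # compare across models and runtimes
```

Running the same measurement for two models, or for the same model with and without an optimized runtime, is the simplest way to sanity-check throughput claims on your own hardware.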
So that's the general approach on how the open source models work with our commercial offering. And we have a variety of open source
models for computer vision, like YOLO-NAS and YOLO-NAS Pose for pose estimation, and now a new
model that is called YOLO-NAS-Sat for small object detection, with examples from satellite
image analysis. And then in the LLM world, we have DeciCoder-1B,
which is a coding assistant model, which is very small
and can work on the edge or in the cloud.
And DeciCoder-6B, which outperforms CodeLlama 7B and StarCoder 15B.
And DeciLM-7B that I just mentioned, which is similar to Mistral
in terms of accuracy and running about 4.5x faster at
inference time. So those are the LLMs, and we have some text-to-image models that are similar to
Stable Diffusion 1.5 and 2.1, and all of those are available on Hugging Face on our company's page. Okay, great. Thank you. And yeah, I can definitely see
why you decided to address
the other language model market,
let's say, obviously,
because there's been a huge demand for that.
And interesting that you mentioned
you have models that are focused
on programming, basically,
sort of like coding co-pilots.
And I wonder specifically for those, if you could share,
are they also able to integrate with coding, with IDEs,
or people's coding setups in some way?
Yes, so one of the nice things when you release open source models
is that people build cool stuff around them.
So one of the open source contributions is a tool that can take that 1 billion parameter model
and use it as a coding assistant in an IDE.
And that model can work locally at the speed that you're typing and coding
and assist you with code completion and code correction in real time on your laptop.
So those can be easily integrated with tools, either through API tools that are working in the cloud
or tools that are working locally on laptops.
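As a rough sketch of what running such a small code model locally can look like, here is an example with Hugging Face transformers. The Deci/DeciCoder-1b model id is taken from Deci's public Hugging Face page, while the prompt and generation settings are illustrative assumptions, not the specific IDE tool mentioned above.

```python
# Minimal local code-completion sketch with a small open source code model.
# Assumes `pip install transformers torch`; generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Deci/DeciCoder-1b"  # small coding model published by Deci on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # the repository ships custom model code
)

# A partial function definition that the model should complete.
prompt = 'def fibonacci(n: int) -> int:\n    """Return the n-th Fibonacci number."""\n'
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

An IDE integration essentially wraps this loop, feeding the code around the cursor as the prompt and streaming the completion back, which is why a model this small can keep up with typing speed on a laptop.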
What kind of use cases are you seeing?
Obviously, for the ones that you just mentioned, it's pretty clear.
And even so,
if you have specific clients that are using it that you're able to share, I'm sure that would
be interesting. But for the other ones, for the generic, let's say, language models, what are you
seeing people using them for? Yeah, we see a lot of use cases and there's a lot of excitement in trying new approaches and new use cases.
We try to focus on the business area or the business vertical where our uniqueness is the most compelling,
which is usually the performance side of the model.
So if we think, for example, about chatbots for customer care, there are two approaches.
One of them is chatbots that are really talking with the customers,
and the second one is the co-pilot approach,
that is a chatbot that's assisting the customer care representative
in giving service to the customer during the call.
And in both of those cases, we have customers that are using our models
and tools in order to support their workflows and their customers.
And in those cases, it's very important to have a low latency model that can respond in real time
and really be of assistance, so the customer won't need to wait a lot of time for the LLM to compute
or to generate the text that is needed.
Another area that we're working in with customers is the financial services. We have customers there that are doing financial analysis
and other aspects of assets analysis with LLMs.
There's a lot of business use cases in enterprises like summarizations
and highlights from calls,
either from calls with customers or from internal calls and other aspects that people are summarizing,
also summarization of documents.
So those are mainly the use cases that we see people using there,
with, let's call them, the mid-size LLMs that we're currently providing,
together with the tools, with a very strong emphasis on working on-premise with high performance.
Yeah, you did mention that technically those models that you have,
I think you said they go up to 15 billion parameters,
so I think technically they would be called probably mid-size, not large-size.
Do you have plans for potentially maybe releasing also larger-sized models?
Do you actually see the need for that?
Or are the use cases that you're targeting well-served by the models in the range that you already have?
Yeah, so we are going towards larger models,
and we'll probably release later this year models that are larger than the models
that we released so far. So definitely, we see a need for models that are more capable. We're trying
to push the boundaries of the capabilities of the models in the mid-size range, but we'll also get
to release models that are larger than what we have today, for sure. And what's the process, actually, of you building those models?
Obviously, you have to start by finding the right data set,
which is something you talked about in the beginning.
But let's put it that way.
Is there something special?
Is there maybe some kind of secret sauce that you apply besides the obvious of having this
extra speed inference layer
that is available as your value-add proposition.
In the actual process of training the model,
is there something special that sets your models apart?
Yes, so we're trying to build
the most accurate model. So at the moment, our model
is slightly more accurate
than Mistral and more accurate than Llama 2. We're working on a more accurate model in the
7 billion parameter range and also on an 11 billion parameter range. And we're trying to
break the state of the art and the capabilities of the models in those size categories. So first, we put a lot of emphasis on the instruction following capabilities
of the models and their accuracy.
Second, we try to also think about how easy to use those models,
to customize them, fine-tune them,
and to use them in your workflows and in your organizations
with our set of tools that we're providing.
And third, we're thinking about high performance.
How can we improve that performance, reducing the latency,
reducing the cost of using those models,
and improving the throughput that you're seeing on the hardware that you're using, etc.
So those are kind of the priorities.
The most accurate models, easy to use, time to value,
short development cycles, and cost performance.
I see.
And I wonder, as lately, obviously,
the shift of focus to language models and coding
or assistant models as well has been evident.
I wonder how that has played into, well, your overall trajectory,
let's say, as a company.
I saw, for example, that you had a very recent, let's say, Series B round.
And so I wonder how, if you're able to, let's say, pinpoint the role
that this new line of products that you are releasing now with the language and
coding assistants, how it has been contributing towards your growth and, let's say, what
are your future plans and what part do you see for this specific line of models in those plans.
Yeah, that's a good question. I think that a year ago, maybe a bit more than a year ago,
with the, we call it, the ChatGPT moment,
we all saw new capabilities emerging in AI
and a new set of enterprises that didn't think about integrating AI
started to think about it.
So basically, what we can say is that, as an AI platform,
our market just tripled or even more in the last year
with the amount of companies that are trying to integrate AI into their products and workflows.
For us, it was natural to expand to those areas.
Originally, our mission was to solve that problem that I mentioned at the beginning of the podcast about the gap between the computational complexity of running those models and the available compute.
So basically, that's a natural gap that even increased or the problem even increased
when thinking about those large LLMs.
So it was very clear to us that we will have to enter into that territory,
and it's a territory that is very interesting to solve
both because of the excitement from those models and the new capabilities
and the potential of them being integrated in enterprises across many verticals
and also the technical challenge of making those models to run faster with the high accuracy that we see from going
to larger and larger models.
So those are kind of the two reasons why it was very natural for us to expand to those areas.
So today, we have two segments in the business.
One of them is the computer vision one, which is the older one, where we're continuing to grow
very fast with more and more customers,
as this side of the business is more mature in terms of market.
The market maturity is higher.
People know what they want, they know what they need,
and they are more looking towards production use cases
and more and more multiple use cases per company and team.
And on the generative AI side, people are earlier in the journey,
more in an exploratory phase.
And we believe that this year
will be kind of the production year
that more and more companies
will try to get to production.
So you can think about it
as tackling two segments
of the market,
which one of them is more mature
and more business-oriented,
and the second one is more exploratory at the moment. But both of them have
a lot of potential to grow the business of
the company in the long term. So I'm very positively
thinking about both of them as being kind of two components of
an AI platform that is enabling the organization to benefit
from the potential of AI.
Yeah, thanks. And I think you're right. I think lots of people are expecting that, well,
this year should be the year that actual deployments start to happen, because, yes,
so far for many organizations, it has been mostly around experimenting and understanding their own needs and trying to find what works for them and so on.
But hopefully this should be actually the year where the rubber hits the road, so to speak.
And I guess you'll be in a good position to tell us how things have actually worked out, let's say, about this time next year. Thanks for sticking around.
Thank you for listening. For more stories like this, subscribe to the newsletter and follow Orchestrate All the Things on your social media of choice.