Latent Space: The AI Engineer Podcast - 2024 in Open Models [LS Live @ NeurIPS]

Episode Date: December 23, 2024

Happy holidays! We’ll be sharing snippets from Latent Space LIVE! through the break, bringing you the best of 2024! We want to express our deepest appreciation to event sponsors AWS, Daylight Compute...r, Thoth.ai, StrongCompute, Notable Capital, and most of all our LS supporters who helped fund the venue and A/V production!

For NeurIPS last year we did our standard conference podcast coverage interviewing selected papers (that we have now also done for ICLR and ICML), however we felt that we could be doing more to help AI Engineers 1) get more industry-relevant content, and 2) recap 2024 year in review from experts. As a result, we organized the first Latent Space LIVE!, our first in-person miniconference, at NeurIPS 2024 in Vancouver.

Since Nathan Lambert (Interconnects) joined us for the hit RLHF 201 episode at the start of this year, it is hard to overstate how much Open Models have exploded this past year. In 2023 only five names were playing in the top LLM ranks: Mistral, Mosaic's MPT, TII UAE's Falcon, Yi from Kai-Fu Lee's 01.ai, and of course Meta's Llama 1 and 2. This year a whole cast of new open models have burst on the scene, from Google's Gemma and Cohere's Command R, to Alibaba's Qwen and DeepSeek models, to LLM360 and DCLM, and of course to the Allen Institute's OLMo, OLMoE, PixMo, Molmo, and OLMo 2 models. We were honored to host Luca Soldaini, one of the research leads on the OLMo series of models at AI2.

Pursuing Open Model research comes with a lot of challenges beyond just funding and access to GPUs and datasets, particularly the regulatory debates this year across Europe, California and the White House.
We were also honored to hear from Sophia Yang, head of devrel at Mistral, who also presented a great session at the AI Engineer World's Fair Open Models track!

Full Talk on YouTube

Please like and subscribe!

Timestamps

* 00:00 Welcome to Latent Space Live
* 00:12 Recap of 2024: Best Moments and Keynotes
* 01:22 Explosive Growth of Open Models in 2024
* 02:04 Challenges in Open Model Research
* 02:38 Keynote by Luca Soldaini: State of Open Models
* 07:23 Significance of Open Source AI Licenses
* 11:31 Research Constraints and Compute Challenges
* 13:46 Fully Open Models: A New Trend
* 27:46 Mistral's Journey and Innovations
* 32:57 Interactive Demo: Le Chat Capabilities
* 36:50 Closing Remarks and Networking

Transcript

Session3Audio

[00:00:00] AI Charlie: Welcome to Latent Space Live, our first mini conference held at NeurIPS 2024 in Vancouver. This is Charlie, your AI co-host. As a special treat this week, we're recapping the best of 2024 going domain by domain. We sent out a survey to the over 900 of you who told us what you wanted, and then invited the best speakers in the Latent Space network to cover each field.

[00:00:28] AI Charlie: 200 of you joined us in person throughout the day, with over 2,200 watching live online. Our next keynote covers the state of open models in 2024, with Luca Soldaini and Nathan Lambert of the Allen Institute for AI, with a special appearance from Dr. Sophia Yang of Mistral. Our first hit episode of 2024 was with Nathan Lambert on RLHF 201 back in January.

[00:00:57] AI Charlie: Where he discussed both reinforcement learning for language [00:01:00] models and the growing post-training and mid-training stack, with hot takes on everything from constitutional AI to DPO to rejection sampling, and also previewed the sea change coming to the Allen Institute.
And to Interconnects, his incredible Substack on the technical aspects of state-of-the-art AI training.

[00:01:18] AI Charlie: We highly recommend subscribing to get access to his Discord as well. It is hard to overstate how much open models have exploded this past year. In 2023, only five names were playing in the top LLM ranks: Mistral, Mosaic's MPT, TII UAE's Falcon, Yi from Kai-Fu Lee's 01.ai, and of course, Meta's Llama 1 and 2.

[00:01:43] AI Charlie: This year, a whole cast of new open models have burst on the scene, from Google's Gemma and Cohere's Command R, to Alibaba's Qwen and DeepSeek models, to LLM360 and DCLM, and of course, to the Allen Institute's OLMo, [00:02:00] OLMoE, PixMo, Molmo, and OLMo 2 models. Pursuing open model research comes with a lot of challenges beyond just funding and access to GPUs and datasets, particularly the regulatory debates this year across Europe,

[00:02:14] AI Charlie: California and the White House. We also were honored to hear from Mistral, who also presented a great session at the AI Engineer World's Fair Open Models track. As always, don't forget to check the show notes for the YouTube link to their talk, as well as their slides. Watch out and take care.

[00:02:35] Luca Intro

[00:02:35] Luca Soldaini: Cool. Yeah, thanks for having me over. I'm Luca. I'm a research scientist at the Allen Institute for AI. I threw together a few slides on sort of like a recap of interesting themes in open models for 2024. I have about maybe 20, 25 minutes of slides, and then we can chat if there are any questions.

[00:02:57] Luca Soldaini: If I can advance to the next slide. [00:03:00] Okay, cool. So I did a quick check to sort of get a sense of how much 2024 was different from 2023.
So I went on Hugging Face and tried to get a picture of what kind of models were released in 2023 and, like, what did we get in 2024?

[00:03:16] Luca Soldaini: In 2023 we got things like both Llama 1 and 2, we got Mistral, we got MPT, Falcon models, and I think the Yi model came in at the tail end of the year. It was a pretty good year. But then I did the same for 2024, and it's actually quite a stark difference. You have models that are, you know, rivaling frontier-level

[00:03:38] Luca Soldaini: performance of what you can get from closed models, from, like, Qwen, from DeepSeek. We got Llama 3. We got all sorts of different models. I added our own OLMo at the bottom. There's this growing group of, like, fully open models that I'm going to touch on a little bit later. But you know, just looking at the slides, it feels like 2024 [00:04:00] was just smooth sailing, happiness, much better than the previous year.

[00:04:04] Luca Soldaini: And you know, you can pick your favorite benchmark, or least favorite, I don't know, depending on what point you're trying to make, and plot, you know, your closed model, your open model, and sort of spin it in ways that show that, oh, you know, open models are much closer to where closed models are today versus last year, where the gap was fairly significant.

[00:04:29] Luca Soldaini: So one thing that, I don't know if I have to convince people in this room, but usually when I give these talks about open models, there is always this background question in people's minds of, like, why should we use open models? The API argument is, you know, it's just an HTTP request to get output from one of the best models out there.

[00:04:53] Luca Soldaini: Why do I have to set up infra and use local models? And there are really, like, two answers. There is the more [00:05:00] researchy answer for this, which is where my background lies, which is just research.
If you want to do research on language models, research thrives on open models. There is, like, a large swath of research on modeling, on how these models behave, on evaluation and inference, on mechanistic interpretability, that could not happen at all if you didn't have open models. And for AI builders, there are also, like,

[00:05:30] Luca Soldaini: good use cases for using local models. You know, this is, like, a very non-comprehensive slide, but there are some applications where local models just blow closed models out of the water. So, like, retrieval, it's a very clear example. We might have, like, constraints, like edge AI applications, where it makes sense.

[00:05:51] Luca Soldaini: But even just in terms of stability, being able to say this model is not changing under the hood. There's plenty of good cases for [00:06:00] open models. And the community is not just models. I stole this slide from one of the Qwen2 announcement blog posts. But it's super cool to see how much tech exists around open models, on serving them, on making them efficient, and on hosting them.

[00:06:18] Luca Soldaini: It's pretty cool. And so, if you think about where the term open comes from, it comes from, like, open source. Really, open models meet the core tenets of open source, specifically when it comes to collaboration. There is truly a spirit that, through these open models, you can build on top of other people's

[00:06:41] Luca Soldaini: innovation. We see a lot of this even in our own work. Like, you know, as we iterate on the various versions of OLMo, it's not just like every time we collect from scratch all the data.
No, the first step is like, okay, what are the cool data sources and datasets people have put [00:07:00] together for language model training?

[00:07:01] Luca Soldaini: Or when it comes to our post-training pipeline, one of the steps is you want to do some DPO, and you use a lot of outputs of other models to improve your preference model. So really, having an open sort of ecosystem benefits and accelerates the development of open models.

[00:07:23] The Definition of Open Models

[00:07:23] Luca Soldaini: One thing that we got in 2024, which is not a specific model, but I thought it was really significant, is we got our first open source AI definition. So this is from the Open Source Initiative. They've been generally the steward of a lot of the open source licenses when it comes to software, and so they embarked on this journey of trying to figure out, okay, what does an open source license for a model look like?

[00:07:52] Luca Soldaini: The majority of the work is very dry because licenses are dry. So I'm not going to walk through the license step by [00:08:00] step, but I'm just going to pick out one aspect that is very good and then one aspect that personally feels like it needs improvement. On the good side, this open source AI license actually,

[00:08:13] Luca Soldaini: this is very intuitive. If you ever built open source software and you have some expectation around what open source looks like for software, for AI it sort of matches your intuition. So, the weights need to be freely available, the code must be released with an open source license, and there shouldn't be, like, license clauses that block specific use cases.

[00:08:39] Luca Soldaini: So under this definition, for example, Llama or some of the Qwen models are not open source, because the license says you can't use this model for this, or it says if you use this model you have to name the output this way, or a derivative needs to be named that way.
Those clauses don't meet the open source [00:09:00] definition, and so they will not be covered.

[00:09:02] Luca Soldaini: The Llama license will not be covered under the open source definition. It's not perfect. One of the things that, internally, you know, in discussion with OSI, we were sort of disappointed by is around the language for data. So you might imagine that an open source AI model means a model where the data is freely available.

[00:09:26] Luca Soldaini: There were discussions around that, but at the end of the day, they decided to go with a softened stance where they say a model is open source if you provide sufficiently detailed information on how to sort of replicate the data pipeline, so you have an equivalent system. Sufficiently detailed,

[00:09:46] Luca Soldaini: it's very fuzzy. I don't like that. An equivalent system is also very fuzzy. And this doesn't take into account the accessibility of the process, right? It might be that you provide enough [00:10:00] information, but this process costs, I don't know, 10 million to do. Now, the open source definition, like, any open source license, has never been about accessibility, so that's never a factor in open source software, how accessible software is.

[00:10:14] Luca Soldaini: I can make a piece of open source software, put it on my hard drive, and never access it. That software is still open source. The fact that it's not widely distributed doesn't change the license. But practically, there are expectations of what we want good open source to be. So it's kind of sad to see that the data component in this license is not as open as some of us would like it to be.

[00:10:40] Challenges for Open Models

[00:10:40] Luca Soldaini: And I linked a blog post that Nathan wrote on the topic that is less rambly and easier to follow through.
One thing that in general, I think, is fair to say about the state of open models in 2024 is that we know a lot more than what we knew in [00:11:00] 2023. Like, both on the training data, like the pre-training data you curate, and on how to do all the post-training, especially on the RL side.

[00:11:10] Luca Soldaini: You know, 2023 was a lot of throwing random darts at the board. I think in 2024 we have clear recipes that, okay, don't get the same results as a closed lab, because there is a cost in actually matching what they do. But at least we have a good sense of, like, okay, this is the path to get a state-of-the-art language model.

[00:11:31] Luca Soldaini: I think that one thing that is a downside of 2024 is that we are more resource constrained than in 2023. It feels that, you know, the barrier for compute that you need to move innovation along has just been rising and rising. So if you go back to this slide, there is now this cluster of models that are sort of released by the

[00:11:57] Luca Soldaini: compute-rich club. Membership is [00:12:00] hotly debated. You know, some people don't want to be called rich because it comes with expectations. Some people want to be called rich, but I don't know, there's debate. But these are players that have, you know, 10,000, 50,000 GPUs at minimum. And so they can do a lot of work and a lot of exploration in improving models that is not very accessible.

[00:12:21] Luca Soldaini: To give you a sense of how I personally think about the research budget for each part of the language model pipeline: on the pre-training side, you can maybe do something with a thousand GPUs, but really you want 10,000. And if you want real state of the art, you know, your DeepSeek minimum is like 50,000, and you can scale to infinity.

[00:12:44] Luca Soldaini: The more you have, the better it gets.
Everyone on that side still complains that they don't have enough GPUs. Post-training is a super wide sort of spectrum. You can do it with as little as, like, eight GPUs. As long as you're able to [00:13:00] run, you know, a good version of, say, a Llama model, you can do a lot of work there.

[00:13:05] Luca Soldaini: You can scale. A lot of the methodology just, like, scales with compute, right? If you're interested in, you know, your open replication of what OpenAI's o1 is, you're going to be on the 10K spectrum of GPUs. Inference, you can do a lot with very few resources. Evaluation, you can do a lot with, well, I should say at least one GPU if you want to evaluate

[00:13:30] Luca Soldaini: open models. But in general, if you care a lot about interventions to do on these models, which is my preferred area of research, then, you know, the resources that you need are quite significant. Yeah. One other trend that has emerged in 2024 is this cluster of fully open models.

[00:13:54] Luca Soldaini: So OLMo, the model that we built at AI2, being one of them. And you know, it's nice [00:14:00] that it's not just us. There's, like, a cluster of other, mostly research, efforts who are working on this. And so it's good to give you a primer of what fully open means. So fully open, the easy way to think about it is, instead of just releasing a model checkpoint that you run, you release a full recipe so that other people

[00:14:24] Luca Soldaini: working on that space can pick and choose whatever they want from your recipe and create their own model or improve on top of your model. You're giving out the full pipeline and all the details there instead of just the end output. So I pulled up the screenshot from our recent MoE model.

[00:14:43] Luca Soldaini: And for this model, for example, we released the model itself.
Data that it was trained on, the code, both for training and inference, all the logs that we got through the training run, as well as every intermediate checkpoint. And the fact that you release different parts of the pipeline [00:15:00] allows others to do really cool things.

[00:15:02] Luca Soldaini: So for example, this tweet from early this year from folks at Nous Research: they used our pre-training data to do a replication of the BitNet paper in the open. So they took just, really, the initial part of the pipeline and then did their thing on top of it. It goes both ways.

[00:15:21] Luca Soldaini: So for example, for the OLMo 2 model, a lot of our pre-training data for the first stage of pre-training was from this DCLM initiative that was led by folks at a variety of institutions. It was a really nice group effort. But, you know, it was nice to be able to say, okay, the state of the art in terms of what is done in the open has improved.

[00:15:46] AI2 Models - Olmo, Molmo, Pixmo etc

[00:15:46] Luca Soldaini: We don't have to do all this work from scratch to catch up to the state of the art. We can just take it directly, integrate it, and do our own improvements on top of that. I'm going to spend a few minutes doing, like, a [00:16:00] shameless plug for some of our fully open recipes. So indulge me in this.

[00:16:05] Luca Soldaini: So a few things that we released this year: as I was mentioning, there's the OLMoE model, which I think still is the state-of-the-art MoE model in its size class. And it's also fully open, so every component of this model is available. We released a multimodal model called Molmo. Molmo is not just a model, but it's a full recipe of how you go from a text-only model to a multimodal model, and we applied this recipe on top of Qwen checkpoints, on top of OLMo checkpoints, as well as on top of OLMoE.

[00:16:37] Luca Soldaini: And I think there's been a replication doing that on top of Mistral as well.
On the post-training side, we recently released Tulu 3. Same story. This is a recipe on how you go from a base model to a state-of-the-art post-trained model. We used the Tulu recipe on top of OLMo, on top of Llama, and then there's been an open replication effort [00:17:00] to do that on top of Qwen as well.

[00:17:02] Luca Soldaini: It's really nice to see, you know, when your recipe is kind of turnkey, you can apply it to different models and it kind of just works. And finally, the last thing we released this year was OLMo 2, which so far is the best state-of-the-art fully open language model. It sort of combines aspects from all three of these previous models:

[00:17:22] Luca Soldaini: what we learned on the data side from OLMoE, and what we learned on making models that are easy to adapt from the Molmo project and the Tulu project. I will close with a little bit of reflection on ways this ecosystem of open models, like, it's not all roses. It's not all happy. It feels like, day to day, it's always in peril.

[00:17:44] Luca Soldaini: And, you know, I talked a little bit about the compute issues that come with it. But it's really not just compute. One thing that is on top of my mind is, due to the environment and, you know, growing feelings about how AI is treated, [00:18:00] it's actually harder to get access to a lot of the data that was used to train a lot of the models up to last year.

[00:18:06] Luca Soldaini: So this is a screenshot from really fabulous work from Shayne Longpre, who, I think, is in Europe, about diminishing access to data for language model pre-training. So what they did is they went through every snapshot of Common Crawl. Common Crawl is this publicly available scrape of a subset of the internet.

[00:18:29] Luca Soldaini: And they looked at, for any given website, whether a website that was accessible in, say, 2017 was still accessible or not in 2024.
And what they found is that, as a reaction to the existence of closed models like ChatGPT or Claude, a lot of content owners have blanket blocked any type of crawling of their websites.

[00:18:57] Luca Soldaini: And this is something that we see also internally at [00:19:00] AI2. One project that we started this year is we wanted to understand, if you're a good citizen of the internet and you crawl following the sort of norms and policies that have been established in the last 25 years, what can you crawl?

[00:19:17] Luca Soldaini: And we found that there are a lot of websites where the norms of how you express a preference on whether to crawl your data or not are broken. A lot of people would block a lot of crawling, but do not advertise that in robots.txt. You can only tell that they're blocking you from crawling when you try doing it.

[00:19:37] Luca Soldaini: Sometimes you can't even crawl the robots.txt to check whether you're allowed or not. And then, for a lot of websites, there are all these technologies that historically have existed to make serving websites easier, such as Cloudflare or DNS. They're now being repurposed for blocking AI or any type of crawling [00:20:00] in a way that is very opaque to the content owners themselves.

[00:20:04] Luca Soldaini: So, you know, you go to these websites, you try to access them, and they're not available, and you get a feeling it's like, oh, something changed on the DNS side that is blocking this, and likely the content owner has no idea. They're just using Cloudflare for better, you know, load balancing.

[00:20:25] Luca Soldaini: And this is something that was sort of sprung on them with very little notice. And I think the problem is that this blocking really impacts people in different ways.
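For readers who have not worked with crawl policies, the "good citizen" check described above can be sketched with Python's standard-library `urllib.robotparser`. This is a minimal illustration, not AI2's crawler: the user agent names, the `example.com` URLs, and the sample robots.txt content are all made up for the example. It only covers what a site advertises in robots.txt; as the talk notes, a server or CDN may still block a crawler that robots.txt nominally allows, which can only be discovered by actually making the request.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    This checks only the advertised policy. It cannot detect blocking done
    elsewhere (for example at a CDN or firewall).
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: one AI crawler is blocked everywhere,
# every other crawler is only kept out of /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        for path in ("/articles/1", "/private/notes"):
            url = "https://example.com" + path
            print(agent, path, allowed_by_robots(SAMPLE_ROBOTS_TXT, agent, url))
```

In a real crawler one would fetch the site's own robots.txt first (for instance with `RobotFileParser.set_url` and `read`) and treat a failed fetch conservatively.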
It disproportionately helps companies that have a head start, which are usually the closed labs, and it hurts incoming newcomer players, who either now have to do things in a sketchy way or are never going to get that content that the closed labs might have.

[00:20:54] Luca Soldaini: So there was a lot of coverage. I'm going to plug Nathan's blog post again. [00:21:00] I think the title of this one is very succinct, which is like, you know, before thinking about running out of training data, we're actually running out of open training data. And so if we want better open models, this should be on top of our mind.

[00:21:13] Regulation and Lobbying

[00:21:13] Luca Soldaini: The other thing that has emerged is that there are strong lobbying efforts trying to define any kind of AI as, like, new and extremely risky. And I want to be precise here. The problem is not considering the risk of this technology. Every technology has risks that should always be considered.

[00:21:37] Luca Soldaini: The thing that, to me, is, sorry, disingenuous is just putting this AI on a pedestal and calling it, like, an unknown alien technology that has new and undiscovered potential to destroy humanity. When in reality, all the dangers, I think, are rooted in [00:22:00] dangers that we know from the existing software industry, or existing issues that come with using software in a lot of sensitive domains, like medical areas.

[00:22:13] Luca Soldaini: And I also noticed a lot of efforts that have actually been going on in trying to make these open models safe. I pasted one here from AI2, but there's actually a lot of work that has been going on, like, okay, if you're distributing this model openly, how do you make it safe?

[00:22:31] Luca Soldaini: What's the right balance between accessibility of open models and safety?
And then also there's this annoying brushing under the rug of concerns that are then proved to be unfounded. You know, if you remember the beginning of this year, it was all about the bio risk of these open models.

[00:22:48] Luca Soldaini: The whole thing fizzled because there's finally been rigorous research, not just this paper from Cohere folks, but there's been rigorous research showing [00:23:00] that this is really not a concern that we should be worried about. Again, there are a lot of dangerous uses of AI applications, but this one was just, like, a lobbying ploy to make things sound scarier than they actually are.

[00:23:15] Luca Soldaini: So I've got to preface this part: this is my personal opinion, it's not my employer's. But I look at things like SB 1047 from California, and I think we kind of dodged a bullet on this legislation. You know, the open source community, a lot of the community, came together at sort of the last minute and made a very good effort trying to explain all the negative impacts of this bill.

[00:23:43] Luca Soldaini: But I feel like there's a lot of excitement about building these open models or researching these open models. And lobbying is not sexy, it's kind of boring, but it's sort of necessary to make sure that this ecosystem can really [00:24:00] thrive. This is the end of the presentation. I have some links, emails, sort of the standard thing in case anyone wants to reach out, and if folks have questions or anything they want to discuss.

[00:24:13] Luca Soldaini: Is there an open floor? I think we have Sophia

[00:24:16] swyx: who wants to... one very important open model that we haven't covered is Mistral. Ask her on this slide. Yeah, yeah. Well, it's nice to have the Mistral person recap the year in Mistral.
But while Sophia gets set up, does anyone have, like, just thoughts or questions about the progress in this space?

[00:24:32] Questions - Incentive Alignment

[00:24:32] swyx: Do you always have questions?

[00:24:34] Question: I'm very curious how we should build incentives to build open models, things like Francois Chollet's ARC Prize, and other initiatives like that. What is your opinion on how we should better align incentives in the community so that open models stay open?

[00:24:49] Luca Soldaini: The incentive bit is, like, really hard.

[00:24:51] Luca Soldaini: It's something that we actually think a lot about internally, because building open models is risky. [00:25:00] It's very expensive. And so people don't want to take risky bets. I think definitely challenges like the ARC challenge, those are very valid approaches for it.

[00:25:13] Luca Soldaini: And then I think in general, for any kind of effort to participate in those challenges, if we can promote doing that on top of open models and sort of really lean into this multiplier effect, I think that is a good way to go. If only there were more money

[00:25:35] Luca Soldaini: for research efforts around open models. I think there's a lot of investment in companies that at the moment are releasing their models in the open, which is really cool. But it's usually more because of commercial interest and not wanting to support these open models in the long term. It's a really hard problem because I think everyone is operating sort of [00:26:00] in what...

[00:26:01] Luca Soldaini: Everyone is at their local maximum, right? In ways that really optimize their position on the market. The global maximum is harder to achieve.

[00:26:11] Question2: Can I ask one question?
No.

[00:26:12] Luca Soldaini: Yeah.

[00:26:13] Question2: So I think one of the gaps between the closed and open source models is the multilinguality. So the closed source models like ChatGPT work pretty well on low-resource languages, which is not the same for the open source models, right?

[00:26:27] Question2: So is it in your plan to improve on that?

[00:26:32] Luca Soldaini: I think in general,

[00:26:32] Luca Soldaini: yes. I think we'll see a lot of improvements there in, like, 2025. Like, there's groups like, Procurement English on the smaller side that are already working on, like, better crawl support, multilingual support. I think what I'm trying to say here is you really want experts

[00:26:54] Luca Soldaini: who are actually in those countries and who speak those languages to [00:27:00] participate in the international community. To give you, like, a very easy example, I'm originally from Italy. I think I'm terribly equipped to build a model that works well in Italian. Because one of the things you need to be able to do is have that knowledge of, like, okay, how do I access, you know, libraries, or content that is from this region that covers this language.

[00:27:23] Luca Soldaini: I've been in the US long enough that I no longer know. So, I think that's the effort that folks in Central Europe, for example, are making. Around, like, okay, let's tap into regional communities to, you know, bring in collaborators from those areas. I think it's going to be, like, very crucial for getting products there.

[00:27:46] Mistral intro

[00:27:46] Sophia Yang: Hi everyone. Yeah, I'm super excited to be here to talk to you guys about Mistral. A really short and quick recap of what we have done, what kind of models and products we have released in the [00:28:00] past year and a half.
So most of you may already know that we are a small startup founded about a year and a half ago in Paris. In May 2023, it was founded by our three co-founders, and in September 2023, we released our first open source model, Mistral 7B. Yeah, how many of you have used or heard about Mistral 7B?

[00:28:24] Sophia Yang: Hey, pretty much everyone. Thank you. Yeah, it's pretty popular, and our community really loved this model. And in December 2023, we released another popular model with the MoE architecture, Mixtral 8x7B. And going into this year, you can see we have released a lot of things this year.

[00:28:46] Sophia Yang: First of all, in February 2024, we released Mistral Small, Mistral Large, and Le Chat, which is our chat interface. I will show you in a little bit. We released an embedding model for, you [00:29:00] know, converting your text into embedding vectors, and all of our models are available through the big cloud providers. So you can use our models on Google Cloud, AWS, Azure, Snowflake, IBM.

[00:29:16] Sophia Yang: So very useful for enterprises who want to use our models through the cloud. And in April and May this year, we released another powerful open source MoE model, Mixtral 8x22B. And we also released our first code model, Codestral, which is amazing at 80-plus languages. And then we provided a fine-tuning service for customization.

[00:29:41] Sophia Yang: So because we know the community loves to fine-tune our models, we provide you a very nice and easy option to fine-tune our models on our platform. And also we released our fine-tuning code base called mistral-finetune. It's open source, so feel free to take a look.

[00:29:58] Sophia Yang: More models. [00:30:00] From July to November this year, we released many, many other models. First of all are the two new best small models.
We have Ministral 3B, great for deploying on edge devices. We have Ministral 8B. If you used to use Mistral 7B, Ministral 8B is a great replacement with much stronger performance than Mistral 7B.

[00:30:25] Sophia Yang: We also collaborated with NVIDIA and open sourced another model, NeMo 12B, another great model. And just a few weeks ago, we updated Mistral Large with version 2, with updated state-of-the-art features and really great function calling capabilities. It supports function calling natively.

[00:30:45] Sophia Yang: And we released two multimodal models: Pixtral 12B, which is open source, and Pixtral Large, just amazing models for not only understanding images, but also great at text understanding. So, yeah, a [00:31:00] lot of the image models are not so good at textual understanding, but Pixtral Large and Pixtral 12B are good at both image understanding and textual understanding.

[00:31:09] Sophia Yang: And of course, we have models for research. Codestral Mamba is built on the Mamba architecture, and Mathstral is great at working with math problems. So yeah, that's another model.

[00:31:29] Sophia Yang: Here's another view of our model reference. We have several premier models, which means these models are mostly available through our API. I mean, all of the models are available through our API, except for Ministral 3B. But the premier models have a special license, the Mistral Research License. You can use them for free for exploration, but if you want to use them for enterprise, for production use, you will need to purchase a license [00:32:00] from us.

[00:32:00] Sophia Yang: So on the top row here, we have Ministral 3B and 8B as our premier models. Mistral Small is for the best low-latency use cases, Mistral Large is great for your most sophisticated use cases. Pixtral Large is the frontier-class multimodal model.
And we have Codestral, great for coding, and then again, the Mistral embedding model.

[00:32:22] Sophia Yang: And at the bottom of the slide here, we have several Apache 2.0 licensed open-weight models. They are free for the community to use, and if you want to fine-tune them or use them for customization or production, feel free to do so. The latest, we have Pixtral 12B. We also have Mistral NeMo, Codestral Mamba, and Mathstral, as I mentioned, and we have three legacy models that we don't update anymore.

[00:32:49] Sophia Yang: So we recommend you move to our newer models if you are still using them. And then, just a few weeks ago, [00:33:00] we made a lot of, uh, improvements to our chat interface, Le Chat. How many of you have used Le Chat? Oh, no. Only a few. Okay. I highly recommend Le Chat. It's chat.mistral.ai. It's free to use.

[00:33:16] Sophia Yang: It has all the amazing capabilities I'm going to show you right now. But before that, Le Chat in French means cat. So this is actually a cat logo. If you can tell, these are the cat eyes. Yeah. So first of all, I want to show you something. Maybe let's take a look at image understanding.

[00:33:36] Sophia Yang: So here I have a receipt and I want to ask, I'm just going to get the prompts. Cool. So basically I have a receipt and I said I ordered, I don't know, coffee and the sausage. How much do I owe? Add an 18 percent tip. So hopefully it was able to get the cost of the coffee and the [00:34:00] sausage and ignore the other things.

[00:34:03] Sophia Yang: And yeah, I don't really understand this, but I think this is coffee. It's, yeah, nine, eight. And then the cost of the sausage, we have 22 here. And then it was able to add the cost, calculate the tip, and all that. Great. So, it's great at image understanding, it's great at OCR tasks. So, if you have OCR tasks, please use it.

[00:34:28] Sophia Yang: It's free on Le Chat. It's also available through our API. And I also want to show you a Canvas example.
A lot of you may have used Canvas with other tools before. But with Le Chat, it's completely free again. Here, I'm asking it to create a canvas that uses PyScript to execute Python in my browser.

[00:34:51] Sophia Yang: Let's see if it works. Import this. Okay, so, yeah, so basically it's executing [00:35:00] Python here. Exactly what we wanted. And the other day, I was trying to ask Le Chat to create a game for me. Let's see if we can make it work. Yeah, the Tetris game. Yep. Let's just get one row. Maybe. Oh no. Okay. All right. You get the idea. I failed my mission. Okay. Here we go. Yay! Cool. Yeah. So as you can see, Le Chat can write the code for a simple game pretty easily. And you can ask Le Chat to explain the code and make updates however you like. Another example. There is a bar here I want to move.

[00:35:48] Sophia Yang: Okay, great, okay. And let's go back to another one. Yeah, we also have web search capabilities. Like, you can [00:36:00] ask what's the latest AI news. Image generation is pretty cool. Generate an image about researchers. Okay. In Vancouver? Yeah, it's Black Forest Labs' FLUX Pro. Again, this is free, so... Oh, cool.

[00:36:19] Sophia Yang: I guess researchers here are mostly from the University of British Columbia. That's smart. Yeah. So this is Le Chat. Please feel free to use it. And let me know if you have any feedback. We're always looking for improvement and we're going to release a lot more powerful features in the coming years.

[00:36:37] Sophia Yang: Thank you.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe

Transcript
Starting point is 00:00:02 Welcome to Latent Space Live, our first mini-conference held at NeurIPS 2024 in Vancouver. This is Charlie, your AI co-host. As a special treat this week, we're recapping the best of 2024 going domain by domain. We sent out a survey to the over 900 of you who told us what you wanted and then invited the best speakers in the Latent Space network to cover each field. 200 of you joined us in person throughout the day, with over 2,200 watching live online. Our next keynote covers the state of open models in 2024
Starting point is 00:00:38 with Luca Soldaini and Nathan Lambert of the Allen Institute for AI, with a special appearance from Dr. Sophia Yang of Mistral. Our first hit episode of 2024 was with Nathan Lambert on RLHF 201 back in January, where he discussed both reinforcement learning for language models and the growing post-training and mid-training stack, with hot takes on everything from constitutional AI to DPO to rejection sampling, and also previewed the sea change coming to the Allen Institute and to Interconnects, his incredible Substack on the technical aspects of state-of-the-art AI training.
Starting point is 00:01:17 We highly recommend subscribing to get access to his Discord as well. It is hard to overstate how much open models have exploded this past year. In 2023, only five names were playing in the top LLM ranks: Mistral, Mosaic's MPT, TII UAE's Falcon, Yi from Kai-Fu Lee's 01.AI, and of course, Meta's Llama 1 and 2. This year, a whole cast of new open models have burst on the scene, from Google's Gemma and Cohere's Command R,
Starting point is 00:01:50 to Alibaba's Qwen and DeepSeek models, to LLM360 and DCLM, and of course to the Allen Institute's OLMo, OLMoE, PixMo, Molmo, and OLMo 2 models. Pursuing open model research comes with a lot of challenges beyond just funding and access to GPUs and datasets, particularly the regulatory debates this year across Europe, California and the White House.
Starting point is 00:02:16 We also were honoured to hear from Mistral, who also presented a great session at the AI Engineer World's Fair Open Models track. As always, don't forget to check the show notes for the YouTube link to their talk as well as their slides. Take care. Yeah, thanks for having me over. I'm Luca. I'm a research scientist at the Allen Institute for AI. I threw together a few slides on sort of like a recap of interesting themes in
Starting point is 00:02:49 open models for 2024. I have about maybe 20, 25 minutes of slides, and then we can chat if there are any questions. If I can advance to the next slide. Okay, cool. So I did a quick check to sort of get a sense of how much 2024 was different from 2023. So I went on Hugging Face and sort of tried to get a picture of what kind of models were released in 2023 and what we got in 2024. In 2023 we got things like both Llama 1 and 2, we got Mistral, we got MPT, Falcon models,
Starting point is 00:03:30 I think the Yi model came at the tail end of the year. It was a pretty good year. But then I did the same for 2024, and it's actually quite a stark difference. You have models that are rivaling the frontier-level performance of what you can get from closed models, from Qwen, from DeepSeek; we got Llama 3, we got all sorts of different models.
Starting point is 00:03:57 I added our own OLMo at the bottom. There's this growing group of fully open models that I'm going to touch on a little bit later. But just looking at this slide, it feels like 2024 was just smooth sailing, happiness, much better than the previous year. And you can pick your favorite benchmark, or least favorite, I don't know,
Starting point is 00:04:25 depending on what point you're trying to make, and plot your closed model, your open model, and sort of spin it in ways that show that, oh, you know, open models are much closer to where closed models are today versus last year, where the gap was fairly significant. So one thing that I think, I don't know if I have to convince people in this room, but usually when I give these talks about open models,
Starting point is 00:04:58 there is always this background question in people's minds of, like, why should we use open models? It's the "just use model APIs" argument: it's just an HTTP request to get output from one of the best models. Why do I have to set up infra to use local models? And there are really two answers. There is a more researchy answer for this, which is where my background lies,
Starting point is 00:05:26 which is just research. If you want to do research on language models, research thrives on open models. There is a large swath of research on modeling, on how these models behave, on evaluation and inference, on mechanistic interpretability, that could not happen at all if you didn't have open models. There's also an answer for AI builders.
Starting point is 00:05:53 There are also good use cases for using local models. You know, this is a far from comprehensive slide, but there are some applications where local models just blow closed models out of the water. So, like, retrieval is a very clear example. You might have constraints, like edge AI applications, where it makes sense. But even just in terms of stability, being able to say this model is not changing under the hood, there are plenty of good cases for open models. And the community is not just models. I stole this slide from one of the Qwen2 announcement blog posts,
Starting point is 00:06:40 but it's super cool to see how much tech exists around open models, on serving them, on making them efficient, and hosting them. It's pretty cool. And if you think about where the term "open" comes from, it comes from open source. Open models really meet the core tenets of open source, specifically when it comes to collaboration. There is really a spirit, like, through these open models, you can build on top of other people's innovation. We see a lot of this even in our own work,
Starting point is 00:07:25 as we iterate on the various versions of OLMo. It's not like every time we collect all the data from scratch. No, the first step is like, okay, what other cool data sources and datasets have people put together for language model training? Or when it comes to our post-training pipeline, one of the steps is you want to do some DPO, and you use a lot of outputs of other models to improve your preference model. So really, having an open ecosystem benefits and accelerates the development of open models. One thing that we got in 2024, which is not a specific model, but I thought it was really
Starting point is 00:08:14 significant, is we got our first open source AI definition. So this is from the Open Source Initiative. They've been generally the steward of a lot of the open source licenses when it comes to software. And so they embarked on this journey of trying to figure out, okay, what does an open source license for a model look like? The majority of the work is very dry because licenses are dry, so I'm not going to walk through the license step by step, but I'm just going to pick out one aspect that is very good, and
Starting point is 00:08:55 then one aspect that I personally feel needs improvement. On the good side, this open source AI definition is actually very intuitive. If you have built open source software and you have some expectation around what open source looks like for software, for AI it sort of matches your intuition. So the weights need to be freely available. The code must be released with an open source license. There shouldn't be license clauses that block specific use cases. So under this definition, for example, Llama or some of the Qwen models are not open source, because the license says you can't use this model for this, or it says if you use this model,
Starting point is 00:09:46 you have to name the output this way, or a derivative needs to be named that way. Those clauses don't meet the open source definition, and so the Llama license would not be covered under the open source definition. It's not perfect. One of the things that internally, in discussion with OSI, we were sort of disappointed about is the language for data.
Starting point is 00:10:20 So you might imagine that an open source AI model means a model where the data is freely available. There was discussion around that. But at the end of the day, they decided to go with a softened stance, where they say a model is open source if you provide sufficiently detailed information on how to sort of replicate the data pipeline so you have an equivalent system. "Sufficiently detailed" is very fuzzy; I don't like that. "An equivalent system" is also very fuzzy. And this doesn't take into account the accessibility
Starting point is 00:10:59 of the process, right? It might be that you provide enough information, but this process costs, I don't know, $10 million to do. Now, the open source definition, like any open source license, has never been about accessibility, so how accessible software is has never been a factor in open source software. I can make a piece of open source software, put it on my hard drive, and never access it. That software is still open source. The fact that it's not widely distributed doesn't change the license. But practically, there are expectations of what we want good open source to be. So it's kind of sad to see that the data component in this license is not as open as some of us would like it to be.
Starting point is 00:11:46 And I linked to a blog post that Nathan wrote on the topic that is easier to follow. One thing that in general, I think, is fair to say about the state of open models in 2024 is that we know a lot more than what we knew in 2023, both on the training data, like the pre-training data and how you curate it, and on how to do all the post-training, especially on the RL side. You know, 2023 was a lot of throwing random darts at the board. I think in 2024 we have clear recipes that, okay,
Starting point is 00:12:29 don't get the same results as the closed labs, because there is a cost in actually matching what they do. But at least we have a good sense of, okay, this is the path to get a state-of-the-art language model. I think that one thing that is a downside of 2024 is that we are more research constrained than in 2023. It feels that, you know, the barrier for compute that you need to move innovation along has just been rising and rising.
Starting point is 00:13:02 So if you go back to this slide, there is now this cluster of models that are sort of released by the compute-rich club. Membership is hotly debated. You know, some people don't want to be called rich because it comes with expectations. Some people want to be called rich, but I don't know, there's debate. But these are players that have, you know, 10,000, 50,000 GPUs at minimum. And so they can do a lot of work and a lot of exploration in improving models that
Starting point is 00:13:35 is not very accessible. To give you a sense of how I personally think about the research budget for each part of the language model pipeline: on the pre-training side, you can maybe do something with a thousand GPUs. Really, you want 10,000. And if you want real state of the art, you know, your DeepSeek minimum is like 50,000. And you can scale to infinity.
Starting point is 00:14:03 The more you have, the better it gets. Everyone on that side still complains that they don't have enough GPUs. Post-training is a super wide sort of spectrum. You can do a little with, like, eight GPUs. As long as you're able to run, you know, a good version of, say, a Llama model, you can do a lot of work there. And a lot of the methodology just scales with compute, right?
Starting point is 00:14:34 If you're interested in, you know, your open replication of what OpenAI's o1 is, you're going to be on the 10K end of the spectrum of GPUs. Inference, you can do a lot with very few resources. Evaluation, you can do a lot with, well, I should say at least one GPU, if you want to evaluate open models. But in general, if you care a lot about interventions to do on these models, which is my preferred area of research, then the resources that you need are quite significant.
Starting point is 00:15:15 One of the trends that has emerged in 2024 is this cluster of fully open models, the models that we built at AI2 being among them. And it's nice that it's not just us. There's a cluster of other, mostly research, efforts who are working on this. And so it's good to give you a primer on what fully open means. So fully open, the easy way to think about it is, instead of just releasing a model checkpoint that you run, you release a full recipe, so that other people working in that space can pick and choose whatever they want
Starting point is 00:16:04 from your recipe and create their own model or improve on top of your model. You're giving out the full pipeline and all the details there, instead of just the end output. So I pulled up a screenshot from our recent MoE model. For this model, for example, we released the model itself, the data it was trained on, the code, both for training and inference, all the logs that we got through the training run, as well as every intermediate checkpoint. And the fact that you release different parts of the pipeline allows others to do really cool things. So for example, this tweet from early this year from Nous Research: they used our pre-training data to do a replication of the BitNet paper in the open. So they took just the initial part of the pipeline and then did their thing on top of it. It goes both ways.
Starting point is 00:17:08 So for example, for the OLMo 2 model, a lot of our pre-training data for the first stage of pre-training was from this DCLM initiative that was led by folks at a variety of institutions. It was a really nice group effort. And for us, it was nice to be able to say, okay, you know, the state of the art in terms of what is done in the open has improved. We don't have to do all this work from scratch to catch up to the state of the art. We can just take it directly, integrate it, and do our own improvements on top of that.
Starting point is 00:17:50 I'm going to spend a few minutes doing a shameless plug for some of our fully open recipes. So indulge me in this. A few things that we released this year: as I was mentioning, there's this OLMoE model, which I think still is the state-of-the-art MoE model in its size class. And it's also fully open, so every component of this model is available. We released a multimodal model called Molmo.
Starting point is 00:18:23 Molmo is not just a model, but it's a full recipe of how you go from a text-only model to a multimodal model. And we applied this recipe on top of Qwen checkpoints, on top of OLMo checkpoints, as well as on top of OLMoE. And I think there have been replications doing that on top of Mistral as well. On the post-training side, we recently released Tulu 3. Same story. This is a recipe on how you go from a base model to a state-of-the-art post-trained model.
Starting point is 00:19:00 We used the Tulu recipe on top of OLMo, on top of Llama, and then there's been an open replication effort to do that on top of Qwen as well. It's really nice to see when your recipe is kind of turnkey: you can apply it to different models and it kind of just works. And finally, the last thing we released this year was OLMo 2, which so far is the best state-of-the-art fully open language model. It sort of combines aspects from all three of these previous models: what we learned on the data side from OLMoE, and what we learned on making models that are easy to adapt from the Molmo project and the Tulu project. I will close with a little bit of reflection on this ecosystem of open models.
Starting point is 00:19:48 It's not all roses. It's not all happy. It feels like, day to day, it's always in peril. And I talked a little bit about the compute issues that come with it, but it's really not just compute. So one thing that is top of mind is that, due to the environment and growing feelings about how AI is treated, it's actually harder to get access to a lot of the data that was used to train a lot of the models up to last year. So this is a screenshot from really fabulous work from Shayne Longpre, who I think is in Europe, about diminishing access to data for language model pre-training.
Starting point is 00:20:36 So what they did is they went through every snapshot of Common Crawl. Common Crawl is this publicly available scrape of a subset of the internet. And they looked at, for any given website, whether a website that was accessible in, say, 2017 was still accessible in 2024. And what they found is that, as a reaction to the existence of closed models like OpenAI's ChatGPT or Claude, a lot of content owners have blanket blocked any type of crawling of their websites. And this is something that we see also internally at AI2.
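The blanket blocking described here is usually advertised through a site's robots.txt file. As a minimal sketch, assuming Python and its standard `urllib.robotparser` module, the check a well-behaved crawler makes looks roughly like this. The crawler names `ExampleAIBot` and `SomeOtherBot`, the robots.txt contents, and the helper `allowed_by_robots` are hypothetical and only illustrate the idea; they are not taken from the study or from AI2's crawler.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blanket-blocks one AI crawler and allows everyone else.
EXAMPLE_ROBOTS = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

print(allowed_by_robots(EXAMPLE_ROBOTS, "ExampleAIBot", "https://example.com/article"))  # False
print(allowed_by_robots(EXAMPLE_ROBOTS, "SomeOtherBot", "https://example.com/article"))  # True
```

This only reflects preferences that a site actually publishes; blocking enforced elsewhere (for example at a CDN) would not show up in such a check.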
Starting point is 00:21:24 Like, one project that we started this year is we wanted to understand: if you're a good citizen of the internet and you crawl following the norms and policies that have been established in the last 25 years, what can you crawl? And we found that there are a lot of websites where the norms of how you express a preference of whether to crawl your data or not are broken. A lot of people would block a lot of crawling, but do not advertise that in robots.txt. You can only tell that they're blocking you from crawling when you try doing it. Sometimes you can't even crawl the robots.txt to check whether you're allowed or not. And then a lot of websites, there's all these technologies that historically have been,
Starting point is 00:22:15 have existed to make serving websites easier, such as Cloudflare or DNS. They're now being repurposed for blocking AI, or any type of crawling, in a way that is very opaque to the content owners themselves. So you go to these websites, you try to access them, and they're not available. You get a feeling it's like, oh, something changed on the DNS side that is blocking this,
Starting point is 00:22:47 and likely the content owner has no idea. They're just using Cloudflare for better, you know, load balancing, and this is something that was sort of sprung on them with very little notice. And I think the problem is this blocking really impacts people in different ways. It disproportionately helps companies that have a head start,
Starting point is 00:23:13 which are usually the closed labs, and it hurts newcomer players, where you either now have to do things in a sketchy way, or you're never going to get that content that the closed labs might have. So there's a lot of coverage. I'm going to plug Nathan's blog post again. I think the title of this one is very succinct, which is like, we're actually not, you know,
Starting point is 00:23:43 before thinking about running out of training data, we're actually running out of open training data. And so if we want better open models, this should be top of mind. The other thing that has emerged is that there are strong lobbying efforts trying to define any kind of open source AI as a new, extremely risky danger. And I want to be precise here. The problem is not considering the risk of this technology. Every technology has risks that should always be considered.
Starting point is 00:24:23 The thing that, to me, is, sorry, disingenuous is just putting this AI on a pedestal and calling it an unknown alien technology that has new and undiscovered potential to destroy humanity, when in reality, all the dangers, I think, are rooted in dangers that we know from the existing software industry, or existing issues that come when using software in a lot of sensitive domains like medical areas. And I also noticed a lot of efforts that have actually been going on in trying to make these open models safe. I posted one here from AI2, but there's actually a lot of work that has been going on, like, okay, if you're distributing this model openly, how do you make it safe? What's the right balance between accessibility
Starting point is 00:25:24 of open models and safety? And then there's also this annoying brushing under the rug of concerns that are then proved to be unfounded. If you remember, at the beginning of this year, it was all about the bio-risk of these open models. The whole thing fizzled out because, finally, there's been,
Starting point is 00:25:47 like, rigorous research, not just this paper from the Cohere folks, but there's been rigorous research showing that this is really not a concern that we should worry about. Again, there are a lot of dangerous uses of AI applications, but this one was just a lobbying ploy to make things sound scarier than they actually are. So I've got to preface this part and say this is my personal opinion, not my employer's, but I look at things like SB 1047 from California, and I think
Starting point is 00:26:20 we kind of dodged a bullet on this legislation. You know, the open source community, a lot of the community, came together at sort of the last minute and made a very good effort trying to explain all the negative impacts of this bill. But I feel like there's a lot of excitement about building these open models, or researching these open models.
Starting point is 00:26:48 And lobbying is not sexy. It's kind of boring, but it's sort of necessary to make sure that this ecosystem can really thrive. This is the end of the presentation. I have some links, emails, sort of the standard thing in case anyone wants to reach out. And if folks have questions or anything they want to discuss, I'll sort of open the floor. One very important open model that we haven't covered is Mistral. Yeah, yeah. Well, it's nice to have the Mistral person. Yes.
Starting point is 00:27:34 Recap the year in Mistral. But while Sophia gets set up, does anyone have just thoughts or questions about the progress in this space? You always have questions. Always. I'm very curious how we should build incentives to build open models, things like François Chollet's ARC Prize and other initiatives like that. What is your opinion on how we should better align incentives in the community so that open models stay open? The incentive is really hard.
Starting point is 00:28:01 It's something that we actually think a lot about internally, because building open models is risky. It's very expensive. And so people don't want to take risky bets. I think definitely the challenges, like the ARC challenge, I think those are very valid approaches for it. And then I think in general, promoting building,
Starting point is 00:28:30 so any kind of effort to participate in those challenges, if we can promote doing that on top of open models and sort of really lean into this multiplier effect, I think that is a good way to go. It would help if there were more money for research efforts around open models. I think there's a lot of investment in companies that at the moment are releasing their models in the open, which is really cool.
Starting point is 00:29:05 But it's usually more because of commercial interest, and not wanting to support open models in the long term. It's a really hard problem, because I think everyone is operating sort of in what, everyone is at their local maximum, right? In ways that really optimize their position on the market. The global maximum is harder to achieve. Can I ask one question? You know?
Starting point is 00:29:34 Yeah. So I think one of the gaps between the closed and open source models is multilinguality. The closed source models like ChatGPT work pretty well on low-resource languages, which is not the same for the open source models, right? So is it in your plan to improve in that space? I think in general, yes.
Starting point is 00:29:58 I think we'll see a lot of improvements there in, like, 2025. There are groups on the smaller side that are already working on, like, better crawls for multilingual support. I think what I'm trying to say is you really want experts who are actually in those countries, and who speak those languages, to participate. To give you a very easy example, I'm originally from Italy. I think I'm terribly equipped to build a model that works well in Italian.
Starting point is 00:30:41 Because one of the things you need to be able to do is have knowledge of, like, okay, how do I access, you know, libraries or content that is from this region? I've been in the US long enough that I no longer know. So I think the efforts that folks are doing around, okay, let's tap into regional communities to bring in collaborators from those areas, I think that's going to be very crucial. Hello everyone. Yeah, I'm super excited to be here to talk to
Starting point is 00:31:22 you guys about Mistral. A really short and quick recap of what we have done, what kind of models and products we have released in the past year and a half. So most of you already know that we are a small startup founded about a year and a half ago in Paris. In May 2023, it was founded by our three co-founders.
Starting point is 00:31:48 And in September 2023, we released our first open source model, Mistral 7B. How many of you have used or heard about Mistral 7B? Hey, pretty much everyone. Thank you. Yeah, it's pretty popular, and our community really loved this model. And in December '23, we released another popular model
Starting point is 00:32:13 with the MoE architecture, Mixtral 8x7B. And going into this year, you can see we have released a lot of things this year. First of all, in February '24, we released Mistral Small, Mistral Large, and Le Chat, which is our chat interface. I will show you in a little bit. We released an embedding model
Starting point is 00:32:36 for converting your text into embedding vectors. And all of our models are available through the big cloud providers. So you can use our models on Google Cloud, AWS, Azure, Snowflake, IBM. So very useful for enterprises who want to use our models through the cloud. And in April and May this year, we released another powerful open source MoE model, Mixtral 8x22B, and we also released our first code model, Codestral, which is amazing at 80-plus
Starting point is 00:33:14 languages. And then we provided another fine-tuning service for customization. Because we know the community loves to fine-tune our models, we provide a very nice and easy option for you to fine-tune on our platform, and we also released our fine-tuning code base, called mistral-finetune. It's open source, so feel free to take a look.
Starting point is 00:33:37 And more models: from July to November this year, we released many, many other models. First of all are the two new best small models. We have Ministral 3B, great for deploying on edge devices. We have Ministral 8B. If you used to use Mistral 7B, Ministral 8B is a great replacement with much stronger performance than Mistral 7B. We also collaborated with NVIDIA and open sourced another model, NeMo 12B, another great model.
Starting point is 00:34:15 And just a few weeks ago, we updated Mistral Large to version 2, with updated state-of-the-art features and really great function calling capabilities. It supports function calling natively. And we released two multimodal models: Pixtral 12B, which is open source, and Pixtral Large, just amazing models for not only understanding images, but also great at text understanding. So yeah, a lot of the image models are not so good at text understanding, but Pixtral Large and Pixtral 12B are good at both image understanding and text understanding.
Starting point is 00:34:55 And of course we have models for research. Codestral Mamba is built on the Mamba architecture, and Mathstral is great at working with math problems. So yeah, that's another model. Here's another view of our model offerings. We have several premier models, which means these models are mostly available through our API. I mean, all of the models are available through our API, except for Ministral 3B.
Starting point is 00:35:33 But the premier models have a special license, the Mistral Research License. You can use them for free for exploration, but if you want to use them for enterprise, for production use, you will need to purchase a license from us. So on the top row here, we have Ministral 3B and 8B as our premier models. Mistral Small for best low-latency use cases.
Starting point is 00:35:56 Mistral Large is great for your most sophisticated use cases. Pixtral Large is the frontier-class multimodal model. And we have Codestral, great for coding, and then again, the Mistral embedding model. And at the bottom of the slide here, we have several Apache 2.0 licensed open-weight models, free for the community to use. And also if you want to fine-tune them, or use them for any kind of customization or production, feel free to do so. The latest we have is Pixtral 12B.
Starting point is 00:36:28 We also have Mistral Nemo, Codestral Mamba, and Mathstral, as I mentioned. And we have three legacy models that we don't update anymore. So we recommend you move to our newer models if you are still using them. And then just a few weeks ago, we made a lot of improvements to our chat interface, Le Chat. How many of you have used Le Chat?
Starting point is 00:37:00 Oh no, only a few. Okay, I highly recommend it. Le Chat is at chat.mistral.ai. It's free to use. It has all the amazing capabilities I'm going to show you right now. But before that, "le chat" in French means "the cat." So this is actually a cat logo. Yeah, if you can tell, these are the cat eyes.
Starting point is 00:37:25 Yeah, so first of all, I want to show you something. Maybe let's take a look at image understanding. So here I have a receipt and I want to ask, I'm just going to get the prompt. Cool. So basically, I have a receipt and I said I ordered coffee and the sausage. How much do I owe? Add an 18% tip.
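The arithmetic behind that prompt can be sketched in a few lines of Python. This is only an illustration, not what Le Chat runs: the function name `amount_owed` is hypothetical, the prices (9 for the coffee, 22 for the sausage) are the figures the speaker reads off the receipt in the demo, and it assumes the 18% tip is applied to the sum of just those two items, with no tax. The transcript does not record the total Le Chat displayed.

```python
from decimal import Decimal, ROUND_HALF_UP


def amount_owed(item_prices, tip_rate):
    """Return (subtotal, tip, total) for the selected receipt items.

    Assumption: the tip is computed on the subtotal of the selected
    items only and rounded to two decimal places.
    """
    subtotal = sum(item_prices, Decimal("0"))
    tip = (subtotal * tip_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return subtotal, tip, subtotal + tip


# Figures taken from the demo: coffee 9, sausage 22, tip 18%.
subtotal, tip, total = amount_owed([Decimal("9"), Decimal("22")], Decimal("0.18"))
print(subtotal, tip, total)  # 31 5.58 36.58
```

Using `Decimal` rather than floats keeps the tip exact to the cent, which is the kind of check you might run against a model's answer on an OCR task like this one.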
Starting point is 00:37:58 So hopefully it was able to get the cost of the coffee and the sausage and ignore the other things. And yeah, I don't really understand this, but I think this is coffee. It's, yeah, nine. Yeah. And then cost of the sausage, we have 22 here. Yep. And then it was able to add the cost, calculate the tip, and all that. Great. So it's great at image understanding. It's great at OCR tasks. So if you have OCR tasks, please use it. It's free on Le Chat. It's also available through our API. And also I want to show you a canvas example. A lot of you may have used canvas with other tools before. But with Le Chat, it's completely free, again. Here I'm asking it to create a canvas that uses PyScript to execute Python in my browser.
Starting point is 00:38:57 Let's see if it works. Import this. Oh, yep. Okay, so yeah, so basically it's executing Python here. Exactly what we wanted. And the other day I was trying to ask Le Chat to create a game for me. Let's see if we can make it work. Yeah, the Tetris game.
Starting point is 00:39:24 Yeah. Let's just get one row, maybe. Oh no. Okay, alright, you get the idea. I failed my mission. Okay, here we go. Yay! Yay!
Starting point is 00:39:47 Cool. Yeah, so as you can see, Le Chat can write the code for a simple game pretty easily, and you can ask Le Chat to explain the code, make updates, however you like. Another example. There's a bar here. I want to move. Right, okay. And let's go back.
Starting point is 00:40:14 Go back, another one. Yeah, we also have web search capabilities, like you can ask, what's the latest AI news? Image generation is pretty cool. Generate an image about researchers in Vancouver. Yeah, Black Forest Labs, FLUX Pro. Again, this is free. So, oh, cool. I guess researchers here are mostly from University of British Columbia. That's smart. Yeah, so this is Le Chat. Please feel free to use it and let me know if you have any feedback.
Starting point is 00:40:59 We're always looking for improvement and we're going to release a lot more powerful features in the coming years. Thank you. Okay, maybe you get something off now. Yeah, yeah. Okay, cool. Yeah, hi, everyone.
Starting point is 00:41:18 Thank you so much for coming today. Huge shout out to swyx and the Latent Space team. I think it's been great. Yeah, let's just give it up for swyx. Just real quick. I did a little bit in terms of helping with the planning, but I work at Notable Capital. Some of you may have heard of GGV,
Starting point is 00:41:34 which was our former name, on the cloud infrastructure team. So basically anything: data, dev tools, AI infrastructure, as well as AI applications. And so we like to stay close to those that are smarter than us, which is all of you in this room. So if anyone ever wants to, you know, brainstorm, or is thinking about starting a company,
Starting point is 00:41:53 we're happy to collaborate. We've had the opportunity to partner with, like, amazing companies such as HashiCorp, Vercel, Neon, and many others over the years. And we're based in San Francisco and New York. So, yeah, feel free to find me, Laura Hamilton, on X, LinkedIn. You know, if we become friends, Instagram.
Starting point is 00:42:10 Yeah. Thank you all for coming. And then we'll kick off some of the chats with AWS after everyone gets lunch. All right.
