The Changelog: Software Development, Open Source - Livebook's big launch week (Interview)
Episode Date: May 3, 2023. José Valim joins Jerod to talk all about what's new in Livebook – the Elixir-based interactive code notebook he's been working on the last few years. José made a big bet when he decided to bring machine learning to Elixir. That bet is now paying off with amazing new capabilities such as building and deploying a Whisper-based chat app to Hugging Face in just 15 minutes. José demoed that and much more during Livebook's first-ever launch week. Let's get into it.
Transcript
What's up, friends? Welcome back. This week on The Changelog, Jerod Santo is talking to
Jose Valim about what's new in Livebook.
Livebook is the Elixir based interactive code notebook and Jose has been working on this
for the last few years.
He made a big bet when he decided to bring machine learning to Elixir.
Check out episode 439 for those details.
The link is in the show notes, but the bet is now paying off with amazing new capabilities,
such as building and deploying a whisper-based chat app to Hugging Face in just 15 minutes.
Jose demoed that and much more during Livebook's recent first ever launch week. The link to that is also in the show notes. Before we get to the show, I want to mention this. We're going to be at the Open Source Summit next week, May 9th through May
12th. So we're going to be there. Make sure you come say hi. We'll be in the expo hall. Look for
the maintainer month booth. We'll be there recording podcasts, saying hello and all that
fun stuff. So hope to see you there. A massive thank you to our friends and our partners at Fastly and Fly.
This pod got you fast because Fastly is fast globally.
Check them out at Fastly.com.
And our friends at Fly help us put our app and our database,
close to our users with no ops, and they'll do it for you too.
Check them out at fly.io.
What's up, friends?
Before this episode kicks off, I'm here with one of our sponsors.
I'm here with Ben Vinegar, VP of Emerging Technology at Sentry.
So Ben, let's start off with this.
What is it you do here?
So my role at Sentry is I'm VP Emerging Technology.
What that means is I've sort of been tasked with finding new innovative ways that we can extend the Sentry platform.
I think this is more about trying to be a little risky. You know, what if we were, you know, 30% riskier? What would that look like in terms of product stuff?
So given that with emerging technologies and this risk, this 30% risk, how do you build and
ship features in that kind of environment? I think that the Sentry development model is that
we mostly build for ourselves. Whenever we're introducing new capabilities, we are our biggest users,
first and foremost. So, you know, when it came to building the session replay thing,
it began with how can we use this? Like, can we build something that we use every day that really
provides value to us? And once we've crossed that threshold, like once we're using it and we're
seeing how it's sort of like improving our debugging workflow, then that's when we start
bringing it to other people, maybe in like an alpha or like a beta, like, hey, this
is working for us. Does it work for you? And so I think that like everything that we've worked on,
including features outside of this group, it's all about dogfooding really aggressively.
Engineers are really sort of like really baked into our product development process here, right?
What we're building has to work for them. So this isn't like a company where PMs go away,
say, hey, it's got to look like X.
Our enterprise customers want Y.
So here's a big task list for you to implement.
It's really more about building solutions
that our team wants to use.
And we have this theory that if it works for us,
there are probably many teams that look something like us
that could also benefit similarly.
Very cool. Thank you, Ben. So, hey, listeners, check out our friends at Sentry at Sentry.io and use the code CHANGELOG when you sign up and you're going to get the team plan for
free for three months. Make sure you tell them we sent you because they love hearing from our
listeners. Sentry.io and use the code CHANGELOG when you sign up. Enjoy. All right, I'm here with Jose Valim. Welcome back to the show, Jose.
Thanks for having me once again.
Always a pleasure. Always ready to see what you have been up to.
So it's been a couple of years since you were on the pod.
Last time we were talking about bringing machine learning things into the Elixir world.
Of course, Elixir is API complete. It's somewhat complete.
I'm sure there's still advancements going on, but your focus has changed to the world of NX and now Livebook.
And it seems like Livebook, of all things, has really taken your primary focus.
Is that fair?
Or you're also working on other stuff?
I think it's half Livebook,
half Numerical Elixir NX.
Okay.
And then of course they play in the same pool, right?
Like these things are related.
Yeah, definitely.
So we started this idea of Numerical Elixir
to make Elixir good for numerical computing, machine learning, AI.
And then we were like, well, if we're going to work on this, obviously using the Python community as a reference, we were like, well, maybe we should have a code notebook platform as well.
And Elixir is known for like doing web stuff. So like, well, maybe we can try different things,
different ideas and build this cool notebook platform.
So kind of like we only have Livebook
because of the numerical Elixir effort,
but they go hand in hand.
So for example, I think at the very end of last year,
one of the big features that we announced
is that we have like pre-trained machine learning models now in Elixir.
So you can like go to Hugging Face, pick a model, and now you can run.
So we had to implement that in Elixir,
and now you can run them in Elixir on the CPU, GPU, whatever.
And when we announced it, we announced it with Livebook.
So what you could do is that you start Livebook,
which is a code notebook,
and with three clicks, you can have stable diffusion running on your machine and all
powered by Elixir. So a lot of the work they are doing is side by side. We developed this
feature here for machine learning, but at the same time, we're thinking, what is the best way
to expose developers to this new feature, this new idea, this new library?
And we figure out how to do it together.
So the last time we talked two years ago, that show's called Elixir Meets Machine Learning.
And my co-host that day was Daniel Whitenack from our Practical AI podcast.
A lot has happened since then, not the least of which machine learning has kind of blown up.
Even practical AI has kind of blown up.
It's like a very popular show now because they were well placed for this moment when all of a sudden models are the thing to have.
You know, you got your language models, you have your image models.
And really, I think you're ahead of the curve to a certain degree.
Or I guess maybe you're just ready for this moment as well with Elixir.
Well positioned to just integrate,
take advantage of these cool things,
whisper, stable diffusion like you said.
Do you feel validated in your decision back then to focus on this as the next thing for Elixir?
Because it seems like it paid off so far.
Oh, that's a very good question.
It's very good because I haven't thought about it
if I feel validated.
I'm actually not sure I have an answer. Because you could have spent the last two years building blockchains,
Jose. You could have been all in on blockchain for two years. I just woke up the kids. But
no, yes. No, no, that's pretty good. Yeah, no, I think putting things in perspective,
no, you're right. I think that's pretty good.
I mean, the thing that I think about the most is that I'm really enjoying working on all those things.
And I'm not sure, maybe somebody can correct us later, but I think, when it comes to stable diffusion, for example, Elixir was the second high-level language to implement that, and in Elixir itself. Because everybody could say, hey, I'm going to shell out to Python, or I'm going to shell out to, like, invoke something, you see. But we have the model actually implemented in Elixir. And when this part came out, I think we had the version running in like a week as well, which speaks really well about the abstractions we built. And I'm still enjoying it a lot, working on Livebook and thinking about problems differently.
So from that perspective, I feel like very validated in the sense that, hey, I'm working
on things that I'm really enjoying with a team that I enjoy working with. But yeah, I think generally validated,
if it all made sense,
we'll probably need a couple of years more
to say like, hey, that was the absolutely right call.
It doesn't feel like it was the wrong call.
So, you know.
No, I think you're well positioned right now.
And speaking of bets,
just to maybe pat myself on the back
and give you some more credit,
I was just coding on our Elixir app today and have been just over the last couple of
weeks and thinking, I made this bet more than seven years ago on Elixir and Phoenix for
our platform that we run our business on.
And yeah, we're a small website with very limited scope and all these things.
But it feels pretty good.
Like just, and I'm still productive. I'm still enjoying it and still working for us. And I still
like the language. I mean, seven years is a long time to be working with one thing and not to have,
I mean, of course I get wanderlust, you know, I kind of wonder like, well, what if we rewrote
this part and this thing, or maybe this would be better. But all in all, Jose, I mean, thanks again for creating these tools
because I'm just like a happy user.
Seven, eight, probably a decade later, I can't see us changing anytime soon.
Just write an Elixir code and run our business with it.
So that feels like a good bet.
Yeah, that's great.
Thanks.
Yeah, thanks for reminding me.
Yeah.
Yeah.
And there's something that you said about which,
so I'm just going into tangents here, but.
Sure.
Well, we've done episodes together,
but you said like, oh, we are small.
Yeah.
And like, you know,
we just want to have those things running.
And that's one of the things that I think can be interesting
about machine learning in Elixir,
because I assume it would come up at some point
or people probably listening,
like why machine learning in Elixir?
And I think it's the,
because the Erlang virtual machine that Elixir runs on,
you can kind of like, you know,
if you have to do a lot of IO work,
it's going to do that well.
If I have to do a lot of concurrency work,
it can use all the CPUs at the same time.
It allows you to say, hey, I want to run a machine learning model, and you can actually just run that machine learning model as part of your app. You don't need to be like, hey, you know, I have this machine learning model, now I have to pay, like, Amazon to run it for me, or I have to figure out this whole other service that I have to version separately.
If you want, you don't have to, but if you want, you can just run it with your servers.
It's going to depend a lot on what is the use case, the application, right?
But at the same time, because again, like Erlang is distributed.
If you say, hey, I started running with my web servers, but now the load is increasing and I want some machines with GPU,
you just connect them with distributed Erlang.
So if you're running on Fly, that happens by default.
And I would say, hey,
I just want this machine learning task
to happen on those machines with GPU.
And you don't change anything in the code.
You just say which part is running the machine learning model.
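Roughly, the pattern Jose describes maps onto Nx.Serving; here's a minimal sketch, assuming a hypothetical MyApp.build_serving/0 and serving name (option names vary across Nx versions):

# On the GPU nodes only, start the serving in the application's supervision tree:
children = [
  {Nx.Serving,
   serving: MyApp.build_serving(),   # hypothetical function returning an Nx.Serving
   name: MyApp.MLServing,
   batch_size: 4,
   batch_timeout: 100}
]

# From any node in the cluster (a web node, for example), batched_run/2
# finds a node running the named serving over distributed Erlang:
Nx.Serving.batched_run(MyApp.MLServing, input)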
So I think that's one of the exciting things. And when you say, like, oh, we are small, I think it's one of the appeals of just allowing people to play with those ideas and embed them within their apps. Yeah. And, you know, then when trouble comes, if it comes, then you can figure out what is the next step, and you don't have to worry about all those things up front.
Right. Well, I'm definitely watching with great interest with
regards to Whisper. Of course, we have transcripts that we have been transcribing for years manually,
and we're kind of waiting a little bit and watching and seeing how to get speaker identification
going on with Whisper, which is kind of a feature that doesn't currently exist. So it's a holding pattern, but like I'm ready to integrate Whisper into our little web
app and just like have this new functionality. So that's super cool. Maybe you can help demystify
to a certain degree. For those of us who are maybe like web developers or not really in the
whole machine learning world, data science world, when it comes to the models, I hear Hugging Face thrown
around. I've been there and downloaded things. I know we're going to talk about how you can deploy
to Hugging Face some stuff now, but if I'm going to have a stable diffusion like up in my Phoenix
app and it's going to run on fly, when you say like run it in Elixir, are you downloading the
model to a local disk and then running inference against it
on the fly side? Or are you referencing something on Hugging Face? Maybe just
clear up what is, I guess for me, a little bit confusing.
Yeah, great question. So when you build the image, if you want, you can already like
pre-download the model and have that-
As part of your image.
Yeah, as part of our image.
So when the app boots, it's there.
So one of the things that we do, yeah.
One of the things that we do,
and you're going to download from Hugging Face,
one of the ways, but you know,
you could put on your S3 or, you know, anywhere else.
So anyway, it has to be in disk at some point.
And then you can run from your machine.
That's one of the ways that you can do it, right?
Or if you go with the other route I said,
you can have like some machines in your cluster
that is only for running that model.
And then we get that if you have a GPU,
we load that into the GPU and then you're good to go.
So let's say if you're running like a multi-node cluster
of Elixir nodes and you don't know or care where the actual inference is executed.
Does each one of those have a copy?
Like let's just use stable diffusion as an example.
It's like four gigs or something, right?
Does each one of those nodes just have that on disk there locally so they can just execute against it?
Or how does that work?
Do you have to manage that mess of like, well, here's my inference cluster and they have it,
but my other nodes don't have to have it
or do people not care because it's not that big of a deal?
Yes, exactly.
So you only need them in the clusters running inference.
The other ones, they don't have to worry about it at all.
They don't even need to have...
So the library that we have that has the pre-trained models,
it's called Bumblebee.
And you don't even need Bumblebee installed on your web nodes.
You only need NX, which is the base library for Numerical Elixir.
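For a sense of what that looks like in code, here's a rough sketch adapted from the Bumblebee docs; the model names and options are illustrative and may differ across versions:

{:ok, model_info} =
  Bumblebee.load_model({:hf, "finiteautomata/bertweet-base-sentiment-analysis"})

{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "vinai/bertweet-base"})

# Build an Nx.Serving that batches requests and runs the model in-process
serving =
  Bumblebee.Text.text_classification(model_info, tokenizer,
    compile: [batch_size: 1, sequence_length: 128],
    defn_options: [compiler: EXLA]
  )

Nx.Serving.run(serving, "Livebook launch week was great!")
# => %{predictions: [%{label: "POS", score: ...}, ...]}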
And so when we talk about Hugging Face, I'm sure some of our audience is very familiar.
Other ones are like, what is Hugging Face?
And why hasn't Jerod asked Jose to explain exactly what it is? You said download from Hugging Face. This is like a
repository similar to GitHub, correct? Of machine learning models, other things. But there's also
this concept of deploy to Hugging Face, which I don't want to get too far into that because we'll
talk about it with your Livebook launch stuff. But just at a general level, if you were to describe Hugging Face to me, if I was someone who didn't know what it was,
how would you describe it? Yeah, so the way I say it, and I don't know if they like this description
or not, and maybe they can let me know, is that they are exactly like the GitHub for machine learning. So the whole machine learning community and industry, like Microsoft, Google, OpenAI, they have the models there on Hugging Face. And what is really cool about Hugging Face is that not only do they have the models, but they also invest in research and they invest in the ecosystem. So, as I said, for you to run a model, what you do is that when you upload, you can think of a model as two things:
the code that specifies the model and the parameters of the model.
That's what a model is.
So you get those things together.
You send like, if you're talking about Whisper,
then you're going to give an audio input.
You transform it a little bit, give it to the model.
The model is going to give the output,
which is like, hey, you know, the transcribed audio.
And so the model has two parts.
So what is stored in Hugging Face is the model's parameters, the model's weights.
And then they also like have this library called Hugging Face Transformers, which has
the implementation of all those models, but for Python.
And what we did in Elixir is that we have this version of the library
and that's why it's called Bumblebee
because transformers, right?
So it's like the Bumblebee library,
which is our implementation in Elixir.
So they contribute a lot to the Python ecosystem
and they have been helping,
they have been interested in helping
on the developments on the Elixir front as well.
So that's one of the things,
like all the models that are there,
but they also allow you to run like your own Docker images,
which means that if I have a machine learning model
that you want to run on the GPU,
you can run your own, they call it the Hugging Face Spaces,
where you can run your own Docker images in there with GPU,
and then you can do cool stuff on that front as well.
And then they have a bunch of other stuff like inference API.
So we were talking about, well, you can run the model in Elixir itself, which it's going
to have its benefits.
But if you say, look, I actually don't want to care about it.
I just want to have a service that does it for me.
Hugging Face provides inference as well.
So yeah, I really love Hugging Face.
And a lot of people, I say that a lot of people talk about, oh, like we are democratizing
AI, right?
And it's like, look at this model that you need like 100 GPUs to run it all.
And they're like, well, you know.
And I feel like Hugging Face, it's like really democratizing AI.
If it was not for them, for us to do some of the things that we are doing today in the Elixir community would probably take one or two years longer.
So, yeah.
Yeah, very cool.
So what layer of abstraction does NX and Bumblebee and all of these elixir tools work with regards to
these models? So there's models coming out constantly at this point, right? We talk about
stable diffusion, there's GPT models, there's specifically, you know, Facebook's Llama,
there's Dolly, there's Alpaca. I mean, there's just like nouns and nouns and nouns of animal-based things that are out there.
Is this a situation where every time something's released,
like Llama, which is a large language text model,
which came out of Facebook,
is that something that Jose and his team at Dashbit
hustles out and then codes up support for it?
Or is it something that can be used by pointing a pointer at some sort of a binary, the model itself, and then you can just use it immediately?
How does that work?
Yeah, excellent question.
I'm going to break into parts.
One is how it's architectured internally.
And the other one is what is the work when there is a new model. So the way things work, and I've been saying this at this point for like two years, but you have a subset of Elixir that compiles to the GPU. And I've been saying it for two years, and it still amazes me when I say that. I'm like, that sounds very exciting.
If somebody told me this like 10 years ago, like, nah, not happening.
No chance.
So you have a subset of Elixir that compiles to the GPU.
And the subset, what it does is that it builds a graph of the computation.
So what people realize is that, you know, for these like large neural network models,
one of the ways to,
if you're just executing the operations,
like if you say, look, I want to add two tensors
or multiply two tensors.
And tensors are how we are representing
like multidimensional data, right?
So which are part of the neural network.
So it's like, if you want to multiply, if you did that immediately, that's not the most efficient way,
because imagine you're using Python,
you multiply two things
and now you go back to Python
and then you do the next operation,
which is a multiplication.
All this back and forth would leave
like the GPU idle, for example.
So they realize what they want to do
is that they want to build this graph,
which is all the operations
that you want to do in this neural network.
And then you compile it to run either on the GPU or on the CPU very efficiently.
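A minimal sketch of what such a numerical definition looks like (the module and function here are made up for illustration):

defmodule MyApp.Ops do
  import Nx.Defn

  # The defn body is a restricted subset of Elixir; Nx traces it into a
  # computation graph that a compiler like EXLA lowers to CPU or GPU in one go,
  # instead of bouncing back and forth operation by operation.
  defn scaled_sum(a, b) do
    (a + b)
    |> Nx.multiply(2)
    |> Nx.sum()
  end
end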
And it happens that, so I like to say, like, I'm going on a tangent here very quickly.
So one of our inspirations is Google JAX, which is a library for machine learning from Google. And it's a Python library, but they say,
they have things like in the library,
like when you're writing code in Jacks,
you should treat your data structures as immutable
because they want to build this graph, right?
So you need to express that in Python.
And then I'm like, well, I know a language
where everything is immutable by default, right? And then when you're writing JAX, you need to approach things in a functional style.
Because again, if you think about the neural network, it's input, output, and computations
in the middle. And they want to represent these computations to compile down to the GPU.
And then I was like, okay, this is very exciting. And that was one of the sparks that
led Sean Moriarty and me, and Jacko, to start working on those ideas. And then, so
that's how it works. Like, so we have like a subset of Elixir that compiles to the GPU and you can
write it. It's not exactly as Elixir code, but you can break things into functions.
So you're writing high level Elixir code that we are going to compile down and optimize.
And then we have a neural network library called Axon, which provides like the building blocks for neural networks.
So you say like, oh, there is a dense layer.
There is an attention layer.
Like all those things, they are part of the neural networks.
And then Bumblebee is on top of that, implementing the models,
which is building on top of those layers.
So you're talking about like three layers here, right?
Like there's NX, which is like a library
for those coming from Python.
It's equivalent to NumPy or JAX.
And then we have Axon,
which is more like equivalent to PyTorch or like TensorFlow
with the higher level building blocks or Keras. I think that's a better example. And then we have
Bumblebee at the top. So now you can start to have an idea because we have like a different
abstraction levels when you're building. So let's say Whisper comes out now. Two things happen. One is that models usually reuse parts of other models, so they are often going to reuse layers. So it's more like, yeah, we have to assemble things, but a lot of it has already been used in another model. Or, you know, if we have to do something, we already have the layers in Axon.
So a lot of the work has already been done.
So because of the abstractions that we have, it takes some work, but we are building on top of the existing infrastructure.
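To make the layering concrete, here's a tiny Axon sketch (not Bumblebee's actual Whisper code; the input name and shapes are made up): Nx provides tensors and defn, Axon composes layers like these, and Bumblebee assembles full pre-trained architectures on top.

model =
  Axon.input("features", shape: {nil, 128})
  |> Axon.dense(64, activation: :relu)
  |> Axon.dense(10, activation: :softmax)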
And it's not like, oh, well, we are starting from scratch every single time. What's up, friends?
This episode is brought to you by CIQ,
the founding sponsor and partner of Rocky Linux,
Enterprise Linux, the open source community way.
And I'm here with Gregor Kertzer, the founder and CEO of CIQ
and the creator of Rocky Linux.
So, Greg, I know that a lot of people are still sort of catching up to some degree with what
went down with CentOS, the Red Hat acquisition, and just the massive shift that required everyone
using CentOS to do.
Give me a glimpse into what happened there.
We've seen a number of cases in the open source community where projects were pivoted due
to business agenda or commercial needs.
We saw that happen with CentOS. CentOS was one of the primary, one of the biggest enterprise operating systems ever.
People were using it all over the place. Enterprise organizations and professional IT teams were all leveraging CentOS. For CentOS to be stripped away from the community and removed
as a suitable option to meet their needs created a massive pain point and a gap within the industry.
As one of the founders of CentOS, I really took this to heart and I wanted to ensure that this
does not happen again. And that is what we created with Rocky Linux and the RESF.
Okay, you mentioned the RESF.
What is that and what is its relationship to Rocky Linux?
The RESF is the Rocky Enterprise Software Foundation.
And it is an organization that we created to hold ourselves responsible to what it is that we've promised that we're going to do with the community. It is community run. It is community led. We have a board of directors, which is
comprised of a number of people that have a huge amount of experience, both with Linux, as well as
open source and community. And from this organization, we solidify the governance of how we are to
manage Rocky Linux and any other
projects that come and join in this vision.
Sounds good, Greg. I love it. So enterprise Linux, the open source way,
the community way has a home at Rocky Linux and the RESF.
Check it out and learn more at rockylinux.org slash changelog. Again, rockylinux.org slash changelog.
Let's focus in then on Livebook.
You recently had your very first Livebook launch week, which I thought might be inspired by our friends at Supabase. And I actually saw you reference them in one of your posts, like explicitly, their idea, you liked it. I think it's a really cool idea, especially for people who want to come on podcasts and talk, because it gives us a bunch of stuff to talk about. Right? Like, let's launch one thing a day for five days, and then we can come on The Changelog and talk about that. Just a lot easier than having, you know, one amorphous thing.
So launch week, you want to talk more about the idea and the inspiration and how it went for you?
Because it was just a week or so back, and now you're in the wake of launch week.
Yeah. So right now I'm saying, as a joke, it's like the first launch week and the last one, because the whole time, right, you are so tired. Because it's like a whole week you have to stay, like two weeks recording video. But then like four months will pass and they'll be like, maybe we should do another launch week, and then we do it, and then we get exhausted again.
It's like having a child, you know? After you have a child, you never want to have another one again. And then, you know, six months, eight months, a couple years later, you're like, you know what, it wasn't so bad, it was worth it, right? Let's have another one.
Yes, exactly. So yeah, but we were working on a bunch of
different features, and some were ready before, like three weeks, three months ago, but we didn't announce it. We were waiting for the next Livebook release, 0.9. And then we were preparing the changelog, or the things we want to talk about, and we're like, there's way too many things in here, we cannot possibly talk about everything. And then, highly inspired by Supabase, because their launch weeks are just fantastic. There's not a single dud in there, like, oh, maybe tomorrow is going to be a dud. It's like, no, they're all good, right?
Yeah, they do them good.
Yeah. And then I was like, well, maybe we should try to do this launch week thing. And it's even funny, because I sent an email to Paul, CTO or CEO from Supabase, and I said, hey, by the way, we are doing a launch week. Thank you, it was inspired by, you know, your marketing work, which is amazing. And it's like, we are doing a launch week on the same week.
So it coincided, like it was both of us doing at the same time.
Co-launch.
Yeah.
But that was the idea.
It was just like too many things to talk about.
And then we said, yeah, let's see if we can organize this content
where we can do one video a day and get people excited.
Very cool.
So this was a live book launch week.
You described live book in a few words a moment back, but can you go back and say it again,
maybe with more words?
And then also why this, like, why is this your area of focus today in early '23?
Yeah.
So Livebook is a code notebook platform for Elixir. But rather, let's say,
maybe I should start reworking that because it's a code platform. It's a data platform.
It's a machine learning platform. And it's in the notebook format. And what that means is that
we have a desktop app, but you're going to, when you install Livebook desktop, you're going to run it on
your machine, open up a browser page, and you can write notebooks, which is a mixture
of prose, text, documentation, and code.
In a summarized version, that's what is a code notebook.
There are a bunch of ideas in Livebook that maybe we should explore soon, but overall it's a code notebook
and you can explore data,
explore machine learning.
So what I said is like,
well, you can run a machine learning model
in Elixir,
like you just click in three buttons
because that's a workflow
that we have in Livebook.
And the reason why we're working on this,
as I said,
it's like when we started
this whole numerical Elixir effort,
it felt like we needed to have this, but it took a life completely of its own. So like,
for example, today, so we have Nerves. I think this is a great example. So Nerves is a framework for building embedded software, embedded systems. It runs on top of Elixir, right? But imagine that you say, hey, I want to teach somebody how to use Nerves, right?
Then you're like, okay, what I can do is that
we can write a small software,
I can burn it to an SD card
and then run in my Raspberry Pi, right?
And then I go back, burn another software in the SD card,
put in the device,
or maybe figure out a way of doing over-the-wire updates,
which all those things they have.
But now with Livebook, for example, they have a Nerves Livebook, which is an introductory
way to get people into embedded programming, where you just start a Livebook.
We start exploring our ideas.
You make, you know, you connect to the Wi-Fi, connect to Bluetooth, you make some lights
blink.
And, you know, it's all running in Livebook,
which originally started thinking about AI,
but, you know, it's a development environment.
And it's a really good development environment,
if I have to be honest.
You get all the features that you would expect
from an IDE, like code completion.
We have like built-in dependency management,
inline docs, right?
So it's a really nice environment to program and work with.
So last time we talked, this thing was brand new.
I mean, you had just kind of conceived it and launched like a 0.1.
And I haven't looked at it since, Jose, until I watched your launch week video.
And I'm over here thinking, I don't really have any uses for something like this.
And then I'm watching you use it.
I'm like, oh, I can actually think of ways of using something like this. This is really cool, especially for an Elixirist like me, where you have kind of the imperative, sorry,
the declarative way of setting things up. And then you can click on this little thing, it will show
you the exact code that it wrote for you in order to do that in Elixir. And it's very nicely formatted.
I'm thinking, I could actually learn how to use Bumblebee and these other tools just by using this tool
and then taking that code and integrating it into my app. And to me, that's a super cool thing.
Yeah, exactly. When we were here on the show, we just announced it and we didn't have this feature,
which is what we call smart cells. And the idea behind it,
it comes, one of the inspirations
is an academic paper called Mage.
But the idea is,
one of the reasons why we wanted to focus on this
because, well, we are working on machine learning
for Elixir, right?
But the amount of Elixir developers
who know machine learning
is probably going to be like, you know,
there are dozens of us, right?
Like when we started, there were very few.
And we also have machine learning engineers,
developers, but they don't know Elixir, right?
So if you get like the Venn diagram
of like those communities,
it's a small niche.
Yes, right?
So we're thinking like, you know,
we need to find a way that we can make
like machine learning developers
who know their stuff, but they don't know how to write Elixir, how to get them started.
And we need to help the Elixir developers as well.
So the whole idea of smart cells is that I say it's like it's metaprogramming UI.
So what you have is that you have a UI where you can say, look, I want to run this model.
I think more accessible is like, hey, I want to connect to a database.
For example, we have a smart cell for database connection.
And you know, all programming languages,
they have like libraries for connecting to the database.
You have to go check out the documentation,
learn how to use it,
figure out the exact parameters that you have to pass,
where the username go, password, yada, yada, yada, right?
And now with Livebook,
we have then this database connection smart cell.
So when you click it, it's going to appear, hey, which database you want to use?
What is your password?
What is your username?
And then you fill that in, and then I can execute the smart cell, and it's going to give me a database connection in a variable that I can run queries against it, right?
So in a way, it starts looking like low-code
or no-code tools.
Right.
But, and those tools, they have an issue,
which is like, but wait,
like maybe I'm connecting to a database
that requires me to pass a special SSL certificate
that I did not add in my UI, right?
And then it's like your tool, it completely falls apart
because we did not consider that particular use case.
So the insight with smart cells is that all they do
is that they execute code and you can see this code.
Like the smart cell cannot interact with our environment
in any way, it can only say, hey, execute this code.
And the code that it executes,
you can see it at any moment. And you can say, hey, I want you to convert this to actual code that I can edit. Which means that now,
if I have the specific use case, right, you can just do that. Or as you said, like I started with
a model, a machine learning model, and I want to run that inside my Phoenix application. The code
is there. You bring it, you put it in your app.
And I even have a video showing how to do that, like in a Phoenix application.
Yeah.
And then you ship it to production.
So the idea is exactly, you know, like how, like, how can we, how can we have the best
of both worlds, right?
I still have the UI for a learning perspective, or just to be productive, but that does not
limit me in any way.
I even have like a whole separate rant of how, at least for Livebook, it should not be called low code.
Because if somebody gives you a task, it doesn't matter if you achieved it visually, right?
Like it doesn't matter if you use a UI to write the code for you or if you write it
by hand, like as long as you deliver the task you're supposed to do, right? You're good. Right?
So I'm like, it's just code. It's not low code. It's just code. It doesn't matter how you write it.
It's code. Yeah. I think that's super powerful, especially once we get to your Explorer tool,
that's part of launch week, because I'm looking at that thing. You know, a lot of stuff that we do
as working developers is like, you know, take this CSV file, or grab this thing of JSON. And then I'm in my terminal and I write a little bit of code, I run a mix task, and I'm looping through, and I'm like, all right, I'm gonna print the third field of this thing, but I'm gonna skip these lists. And then I'm just console logging or whatever, and I run it and I log it, and I'm like, okay, then I tweak it and I log it and I look at it. You know what I'm saying? Like, this is kind of what we do, and it's just to figure out maybe the format of what I actually need to code. And then I throw that all away and I just write the code that works, you know? And with this tool, which I'll just tease up now and we'll return to it later, with this Explorer tool, I mean, you're really doing that with a very rich user interface, and able to filter and do all these things that you would in a GUI tool. And then, like, there's the code right there. Just, you know, copy-paste it into what you're up to, or throw it away if you don't need it. And that's spectacular.
Let's go meta for a second. So these live cells, how are they implemented? Is this like a very complicated React component? Like actually in Livebook itself,
the ones that generate your code for you,
is this like a mess of React components?
Is it like a Phoenix Live view thing?
How do you guys build the live cells inside Livebook?
Yeah, so the way they work is that,
so there are two ways you can make like Livebook come to life.
So Livebook, so you can think like Livebook
is the web application that you are interacting with.
And then you have the runtime,
which is running your Elixir code.
Right.
And your runtime actually doesn't know much about Livebook.
We have the separate library called Kino,
which is how it's called like,
so for example, in Poland, a cinema is called Kino.
And that's the idea.
It's how you
animate things, right?
Okay.
So the Kino library is what brings Livebook to life, and there are two ways of doing that: either it's a rich output, an output that you can interact with, or it's with
smart cells. And what they do is they execute code.
so you're going to render an HTML.
We render them inside an iframe.
So we have an iframe.
Okay.
You render things inside like that. And it communicates with the Elixir code that is at the runtime using JavaScript,
whatever messages you want to send.
So we open up a web socket between those two things.
So you have JavaScript running on the client,
and you have Elixir running
on the runtime, the actual runtime
where all the code, where if you say, hey,
I've downloaded this CSV.
So that's the Elixir runtime
and JavaScript running on the client.
And you can use whatever you want there.
We are just opening a page.
And what you do when we open up the page
is completely up to you.
Most of our smart cells and our outputs that we provide out of the box,
they have been implemented with Vue, I believe, Vue.js.
But we have one that is React because it's using a table,
like a spreadsheet implementation that is a React component.
And then we just use React. So it doesn't matter, it can be whatever you want. I think you should even be able, if you want, to plug in LiveView and make things work with LiveView.
Yeah, it doesn't matter. And that's the general architecture.
So it's effectively, each smart cell is an iframe, and inside that iframe, you can accomplish it with whatever JavaScript you want.
And you guys generally use Vue
unless it calls for something else.
Okay, that makes total sense.
Well, it's super cool.
I just imagine whenever I see rich tooling like this
and me being a longtime web developer,
I just imagine the JavaScript that's driving these things.
And I kind of, I shiver and then I stopped thinking about it and I'm like,
okay, well it works. So congratulations on that much.
Yeah.
One of the things that we really wanted to do is like to make it extensible.
So that's the cool thing is that our smart cells,
like anybody can define them and the outputs. So it's really extensible.
So, like, hey, imagine that, for some reason, you want to have an
audio splicing tool that you're using, and then you want to share with the team, you
can create your own smart cell that interfaces with FFmpeg or whatever, right?
And then you can share with your team and you install it as a package.
So that's really cool because it's not like, it doesn't depend only on the view of the Livebook team.
Anybody can create their own outputs,
their own smart cells,
put it in a package and ship with the community.
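For a sense of what defining your own smart cell involves, here's a heavily trimmed sketch based on the Kino.SmartCell docs; the module, names, and callback details are illustrative and differ a bit between Kino versions:

defmodule MyLib.GreeterCell do
  use Kino.JS
  use Kino.JS.Live
  use Kino.SmartCell, name: "Greeter"

  @impl true
  def init(attrs, ctx), do: {:ok, assign(ctx, name: attrs["name"] || "world")}

  @impl true
  def handle_connect(ctx), do: {:ok, %{name: ctx.assigns.name}, ctx}

  @impl true
  def to_attrs(ctx), do: %{"name" => ctx.assigns.name}

  @impl true
  def to_source(attrs) do
    # Smart cells only ever emit code; the user can convert it to a plain
    # code cell and edit it at any time.
    quote do
      IO.puts("Hello, " <> unquote(attrs["name"]))
    end
    |> Kino.SmartCell.quoted_to_string()
  end

  asset "main.js" do
    """
    export function init(ctx, payload) {
      ctx.root.innerHTML = `<input value="${payload.name}" />`;
    }
    """
  end
end

# Registered at application start so Livebook can offer it in the UI:
# Kino.SmartCell.register(MyLib.GreeterCell)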
And so like, yes, there is JavaScript,
but based on the abstractions on the accessibility,
it's contained to those iframes, right?
And everything is encapsulated in there.
Right.
And because it's essentially a desktop app
or a workstation-based application
that you're going to be using at your job or in your work,
it's totally fine to just shove as much JavaScript as you need
to get the job done.
Because we're used to downloading large things.
I'm going to be running local models here.
We're going to be downloading gigabytes and gigabytes of things and so a lot of the concerns
we have on typical web pages are kind of out the window, which has got to be nice.
Yeah.
All right. So that's how the smart cells work. Let's hop into Explorer a bit, because we'll regroup and talk about how you can run this machine learning stuff in the cloud, and all the crazy stuff you're doing with GPUs. But I watched the Explorer video at the end. It was your last launch day thing. And this is the one
where I'm like, okay, I could use this today. How do I download Livebook and give this a try?
So I think it's going to be a nice marketing thing for people who, hey, maybe they've been using
curl. Maybe they've been using, I don't know, Jupyter notebooks or something. They're data
scientists. They're whatever. They're just mungers.
You know, we're all kind of data mungers to a certain extent.
And this Explorer tool, I think, is really powerful.
You want to tell us about it?
Yeah, so the reaction to Explorer has been very nice
because I was like, I would say like,
super basic, those launch weeks
and they never have a dead.
And I honestly thought that the Explorer one
would be the dead.
Oh, so you saved it for last, just in case.
Yeah, so I put it for last.
And I was like, I don't think people are going to like it that much.
Maybe it's the point of just like, and I have my theories of why people like it so much.
But let me break down what it is, right?
Yeah, please do.
So we have this tool called Explorer, which brings data frames and series to Elixir.
So what are series and data frames?
Series, imagine one-dimensional data.
And so just like it can be like a huge vector of strings or numbers, whatever.
And data frames, they are two-dimensional data.
They are tables.
Think, you know, like Excel, like two-dimensional or a spreadsheet.
And so Chris Grainger, he started working on Explorer
for Elixir tools to work with those abstractions.
And it's implemented on top of a library in Rust called Polars
because we have pandas in Python, right?
And then we have Polars for Rust
and it ends with the "rs" extension.
So it's a nice naming scheme in there.
And so it's super fast, right?
Super fast.
Like when we see like the benchmarks on how Polars work, it's like super fast.
And Chris, when he was working on this, he really loves the API that comes from dplyr, from the R community.
So it's kind of like this mixture between, well, it's an Elixir project that has its foundation in Rust, and it's really inspired by R and dplyr.
And so we have this project,
one of the numerical Elixir efforts,
because we want to, yeah,
sometimes before you put data into a machine learning model, often you have to do a lot of massaging,
you have to do joins,
you have to do a lot of things with that data.
And this is an important project. So we have been working on this for a while.
And I said in the video that we're just starting our data journey. And maybe that's one of the reasons why I thought it was going to be a dud, because it's only the beginning. And I did not
think people would be excited enough. And again, it was the same question. Like we have this library,
Explorer, we love the API that we've designed.
I really think it feels great.
It feels elegant,
but people don't know how to use it, right?
So how can we teach Elixir people to use this library? Smart cells.
So the way you do with the smart cells,
I just say, hey, I have a data frame in here, right?
So I have this information.
It could have come from a CSV, whatever.
And I want to filter
some columns i want to group by so you have a smart cell where the ui is guiding you you're
clicking the buttons and adding all the operations that you want to perform and as you're adding the
operations the updated table is appearing like right next and the next thing that you could do
i didn't show this in the video i think is well, now you have that table as a result, you can create a smart cell to plot a chart from
that table. And from there, only with like going with the smart cells, I can now have a chart that
is plotting my filter table and going for that. Again, I'm not writing any Elixir code, right,
at all. And I'm doing all those things.
But if I want to see the code, because actually that data processing should be part of my application, I can just get the code and bring it.
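The code such a data-wrangling smart cell hands you looks roughly like this Explorer pipeline (a sketch; the file and column names are made up):

require Explorer.DataFrame, as: DF

df = DF.from_csv!("episodes.csv")

df
|> DF.filter(duration_min > 60)                      # keep only the rows I care about
|> DF.group_by("year")
|> DF.summarise(avg_duration: mean(duration_min))    # one summary row per group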
Yeah, so, you know, and it's just the beginning of like spoiler, we want to have things like, well, we want to be able for you to drag a CSV, for example, into Livebook.
And we automatically detect that and emit the code for parsing that CSV.
That would be pretty cool, right?
So, and it could be easy, but everything that we do, we want to make extensible, right?
So that's the challenge.
Because then you can drag and drop whatever you want. Like, if you want to, we were saying, maybe you want to do something for audio editing. You could hook into a way where you hook into audio files and do things specific to them.
Sure.
So yeah, and we want to do a lot of things related to data, like have chart suggestions and improve the whole plotting story. So it's only the beginning, but it was really nice to see that people were excited about
this and about this workflow.
And I think compared to machine learning, like I thought that, like we did a bunch of
machine learning features, but I feel like a lot of the machine learning, it still feels,
even though, I think, we made it really accessible.
It still feels far away because you have to think, well, how I'm going to use that machine
learning model or all those different machine learning tasks in my work.
There is still a conceptual gap that you have to do.
But with data, it's exactly what you said.
Everybody has worked with data.
Exactly.
Everybody has had at some point to like merge data, filter data, do weird stuff with data.
And I think it was a hit. It went to Hacker News' front page.
And I think exactly because it felt like a problem
that it, you know, everybody, you know,
has gone through at some point or will go through
and they're like, okay, this can be really useful.
Yeah, absolutely.
There's lots of tools out there that address it.
VisiData is what I'm thinking of. Of course, there's Simon Willison's Datasette and all of his SQLite tools. And I'm
sure there's plenty of other ones, but I just think that you might've just like nailed the
implementation. You know, the, at least what I saw, I haven't used it myself, but what I saw in
the demo, it just looks like it's so easy to use, yet also malleable along the way. Like, you can just stop it. You can take a smart cell that does some filtering and stuff, and you can convert it into code. Maybe you already said this, but you're talking too fast and I missed it. And you can convert it into code and then change the code.
No, you might have. I don't know.
Sometimes you get going and I'm just holding on for dear life, shall we say. And you can convert that into code and then just, you know, munge the code and save that.
And now you just have like,
you kind of forked off from what the smart cell would generate,
which I thought was super cool for, you know, just for malleability sake.
So happy to hear that it was successful or that it was interesting.
Yeah.
I start speaking too fast when I get very excited. And I am very excited about those things.
So, you know, that's why I get going because it's like the excitement.
But yeah.
The excitement.
So you mentioned CSV.
What about JSON?
What about like just drop a SQLite file in there?
And maybe it like detects all your tables and like allows you to just immediately start doing stuff with the data in there.
Can you do something like that?
Yeah, I mean, that's what I want to do. I want to work on the abstractions to make that possible. And from there on, because everything is extensible, it can be anything, right? Like, you can think, for example, I think there is even already kind of an image editor smart cell, where, you know, you can have an image and it gives you some ideas, like, oh, I can rotate the image, crop, do a couple of things.
And it gives you the code, right?
And then you can think that what somebody can do is that they can further integrate that, that if somebody drags and drop a JPEG or
a PNG, right, it automatically
brings this tool up.
And does Elixir have image
manipulation tools?
Or are you talking about like, so what I've done
historically is like shell out to ImageMagick and
then find ImageMagick's magic
flags that I have to send in order to
you know, center and
gravitate and crop and stuff.
But does Elixir have image manipulation tooling available, like natively?
We have OpenCV bindings called E-Vision.
That's a recent effort that has happened after Numerical Elixir.
I'm not sure if we have ImageMagick bindings.
Oh, there's also bindings for VIPs, I believe, which is kind of a new one compared to ImageMagick.
So they are here and there.
We also have a very small one,
which is just about creating the images
and putting them into like a tensor format
so you can further do manipulations with it.
Sorry, I'm going to go on another tangent here.
Okay, let's hear it.
But one of the things about, yeah,
one of the cool things about NX
and that we are building this whole graph
and compiling it to the GPU
is that for example, when you think about like,
sure, you can use like OpenCV
or something for doing image manipulation.
But because an image is like,
it's three-dimensional data,
you have like the height,
you have the width,
and then you have the third dimension
for the RGB colors
or maybe an alpha layer.
So you can represent this with a tensor,
which means that you can now implement
image manipulation tooling in Elixir,
like using Elixir itself,
that is going to be compiled to run on the CPU
or on the GPU,
right? And you can embed that
as part of our machine learning model, if you
want as well. So I think it's one of
the nice things that are coming out
of it. Yeah, you can use existing tools,
but depending on what you want to do,
depending if you have to do like a bunch of
image resizing, a large batch, you're like, all right, I'm going to do this on the GPU.
And you write some Elixir code and then you can do it.
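Since an image is just a height by width by channels tensor, a simple manipulation can live inside a numerical definition and be compiled for CPU or GPU; here's a sketch with made-up names and sizes:

defmodule MyApp.Image do
  import Nx.Defn

  # image: a {height, width, 3} tensor of 0..255 pixel values
  defn flip_and_normalize(image) do
    image
    |> Nx.reverse(axes: [1])   # mirror horizontally by flipping the width axis
    |> Nx.divide(255.0)        # scale to 0..1 before feeding a model
  end
end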
I think you explained to me last time how it actually works, like how it decides what to run where on the CPU or GPU.
But I can't recall.
And I probably can't even follow you into the depths of that, Jose.
So I won't ask.
Let's regroup on Explorer.
So just to be super clear,
just don't even try, Jose.
I just won't be able to.
Okay.
Can you bring it down to my level?
Can you layman's terms?
How does it know where to run what where?
No, it's just a single option. You say, I want to run on the host or on CUDA.
And it's a single option to specify
and everything figures out for you.
You don't have to do anything else.
That's the part that I thought would be confusing
is the figures it out for you part.
That's the part I thought you were going to explain
how it figures it out for you.
Oh, yeah, I mean.
That's why I said it was going to be hard.
Yeah, I can send the flag.
I can say CUDA.
I'm down with that part.
It was like the whole how it decides
which code to run where.
I don't know.
It sounds like some dark, dark magic.
Yeah.
So the way it works is that we have this thing called numerical definitions.
And so if you think about your machine learning model, it's implemented using those numerical definitions.
And what it is, so for an Elixir developer, we define functions using def.
And then the function name, the arguments and the implementation.
And a numerical definition is def n.
So you just put a n in front of it and that's a numerical definition.
And that's something that we guarantee we can compile that subset of Elixir that you have in there to the CPU or the GPU.
So when you're going to run something, when you pass this flag somewhere,
it's basically saying, well, when I see some numerical definition in the future,
like I pass this tensor to some numerical definition or something,
I want it to run or in the host or in the GPU.
And then we build the same graph, regardless if it's host or if it's GPU, we build the
same graph.
And then we call like Google XLA, which is the library that actually compiles the graph
to the CPU or GPU.
And that's doing like all that hard work, like the real, let's say magic stuff.
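So the single option Jose mentions amounts to something like this (a sketch; the usual EXLA client names are :host and :cuda, and the option can also be set per call):

# Run numerical definitions on the GPU...
Nx.Defn.default_options(compiler: EXLA, client: :cuda)

# ...or on the CPU, without touching the defn code itself:
Nx.Defn.default_options(compiler: EXLA, client: :host)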
Gotcha.
But yeah, but we carry the option forward so we know where to compile things
and where to run things.
But yeah, so it's not trying
to be smart to say,
oh, I figured out that this piece of code
would be more efficient in the host or
in the GPU. It doesn't do that
to that point. You still have to say
where it goes.
Okay, you still specify.
Fair enough. Let's close the loop on Explorer.
So just to be clear,
Explorer is an Elixir library.
This is if you have any needs
for one-dimensional
or two-dimensional data frames.
And then the cool stuff
that we're talking about
is Livebook using Explorer
to build on top.
Is that correct, Jose?
Yes, exactly. Hey, friends.
I'm here with one of our partners and sponsors, Jason Bosco, co-founder and CEO of TypeSense.
You may remember Jason from episode 505 of The Changelog.
We talked about TypeSense being truly open source search.
And that's kind of where we got interested in TypeSense because we've been hitting bottlenecks and issues with Algolia.
And so I reached out to Jason and said, hey, Jason, we'd love to work with you and partner with you.
But Jason, tell the listeners here why you all build TypeSense.
What do you believe? So we believe that fast search-as-you-type experiences
need to be widely available
and adopted by as many sites and apps as possible.
And what I mean by search-as-you-type
is you type in a letter
and it returns results right away
in, say, less than 50 milliseconds or 100 milliseconds.
And we've tried building experiences like this in the past
with other products. You
know, there's Solr, there's Elasticsearch, there's Algolia, and all of them are good in different
respects, but they either are very complex to deploy, or they're hard to scale, or they're
very expensive to use, even for moderate scale. So that's why we built TypeSense. We open sourced it.
We made sure that you can run TypeSense locally
or if you don't want to worry about infrastructure,
we also have TypeSense Cloud.
So you have cloud and you have open source
and you ship binaries in your open source
that you actually use in your cloud
with extra features, of course.
But what was making you think
that you should build cloud in the first place?
Based on what users have told us over the last several years, many folks wanted
us to host the search service. So we started building TypeSense Cloud. So whether you self
host or use TypeSense Cloud, it is the same binary that we run in TypeSense Cloud that we also
publish open source. So the feature set is identical, but in TypeSense Cloud, of course,
we manage the service for you. So you don't have to worry about infrastructure. And then we give you a nice UI to manage your data. And then we
give you role-based access control, the single sign-on, more collaboration aspects. But regardless
of whether you self-host it or use TypeSense Cloud, we want to bring this technology to as
broad an audience as possible without having to worry about cost. And that's one of the reasons
we decided to partner with you, Adam,
and talk about TypeSense here.
Yeah, I love the idea of getting this into as many developers' hands as possible.
The fact that you have blazing fast in-memory search like you do
that's open source, that competes with the likes of Elasticsearch or Algolia,
that you can just host yourself if you want to.
That's so awesome.
Of course, we're excited to partner with you.
We're using TypeSense Cloud, which is awesome
and very fortunate to have a chance to work with you
on this project.
Obviously, we have so much more in store
for our search feature,
so we're barely scratching the surface.
But hey, listeners, check out TypeSense at typesense.org
or at cloud.typesense.org.
I think Jason's awesome and he has an awesome team.
And of course,
we're using TypeSense, so we think you should check it out too. Again, typesense.org or cloud.typesense.org. All right, let's switch gears now because you teed up what we call distributed machine
learning notebooks with Elixir and Livebook.
You teed up
this whole ML thing that you thought would, you know, take the world by storm and everybody's
interested in this data thing. But surely there's a lot to talk about with regards to
now running machine learning stuff. You've been talking about it kind of roundabout,
but let's talk specifically about what went into building this and how it works.
Yeah. So I think we already covered a lot about it, but it was one of my favorite days
because so when we started this Numerical Elixir effort, right?
Like a lot of people, they would ask questions like, why, right?
What are potentially the advantages of doing numerical Elixir or running a machine learning
model or a neural network in Elixir?
And at the beginning, the answer was like, we don't know.
Like we are enjoying it.
And there were like some references, like I said, like, you know,
part of the Python community was saying, hey, functional programming is a good idea for that.
We were like, well, it's worth trying.
So, but a lot of the time it was like, we don't know.
Like we are having fun doing it.
We are going to see where it takes us.
And we have a hunch.
It would be interesting
to get the power
of the Erlang virtual machine in running
concurrent and distributed
software and have
that with machine learning and
see where that takes us. Because that's also
a direction that the Python community
is going. They are starting to have like distributed frameworks. I think it's Ray or Dask,
things like that. And then we're like, well, we have this, let's say technology, right? For
several decades at this point. So it'd be nice to see where it leads us.
So the announcement of the second day of the launch week was exactly distributed
squared machine
learning models in Elixir.
And it's distributed squared because
for an Elixir developer,
distributed means
running on multiple machines in
the same cluster. Right. But
for a
machine learning developer, distributed means
running on multiple GPUs.
And now we can do both.
We can, and that's what I show in the video.
Like we start with a simple machine learning model
and then you change like two lines of code.
You make it concurrent
and then you don't change anything.
It's already distributed.
And then you pass an option.
It's distributed over multiple GPUs.
So that was the idea, how easy,
like how little you have to change
to explore all those different architectures.
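(To make that concrete for readers, here is a rough sketch of the shape of it using Nx.Serving and Bumblebee. The model choice, option names, and supervision setup are assumptions on my part, not the exact demo code.)

```elixir
# Illustrative sketch; assumes a CUDA-capable node with EXLA configured.
{:ok, model_info} = Bumblebee.load_model({:hf, "gpt2"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "gpt2"})
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "gpt2"})

# A serving wraps the model; the same computation graph is built either way.
serving = Bumblebee.Text.generation(model_info, tokenizer, generation_config)

# Make it concurrent: run the serving under a supervisor so requests get
# batched; `partitions: true` is the option that spreads batches across GPUs.
children = [
  {Nx.Serving, serving: serving, name: MyApp.Serving, batch_size: 8, partitions: true}
]

Supervisor.start_link(children, strategy: :one_for_one)

# Callers don't change. In an Erlang cluster, batched_run can route the
# request to whichever node runs the serving, hence "distributed squared".
Nx.Serving.batched_run(MyApp.Serving, "Hello world")
```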
And for me, it was very exciting
because it was kind of a, you know,
it was like kind of,
I'm not sure if promise is the right word,
but it was like an idea that we had
and we were not sure if we would get there.
Yeah, potential.
And being able to get there and release it.
And for me, it was very exciting, right?
But maybe like that's why people should not have me selling stuff, right?
Because I focus on like the technically exciting things.
But yeah, the ones that are going to be very practical right now
is not going to be, you know, distributed squared machine learning. Yeah.
Right. Okay. So that brings me then to my next line of questioning. So you've been using
the first person plural a lot. You've been saying we, almost exclusively, by the way,
I don't think you've said I once on this show. Who is we? And who's the marketing director?
No, who is we?
Like, what's this team?
Who are these people?
I know like a lot of this,
or maybe all of it is open source,
but you know, there's a living to be made.
You have a family.
Let's talk a little bit about the we,
the Dashbit, the people involved. And then like, are you driving towards revenue?
Do you have a marketing department?
Like what's going to happen money-wise?
Yeah, no, love this question.
So I'll try to summarize because it's,
and I will definitely leave some people out.
So I apologize, but it started with,
so it started because somebody dared to write it.
I say it's the book that changed my life and that I never read.
So Sean Moriarty, he released a book called Genetic Algorithms in Elixir.
And that got us talking about Elixir.
You never read it?
No, I never read it.
But it started-
It changed your life, but you never read it?
Yes, yes, exactly.
Okay, fair.
And it started us talking, Sean and I, we got to talk and I asked a couple of things
on Twitter and then Jacko joined us and we started working on Numerical Elixir.
So that's when things started.
Today we have a lot of help.
We also have on the team Paulo Valente, and he's working at DockYard, which is a well-known
consultancy in the Elixir world. And he's working part-time to help us bring those ideas forward.
And that covers Nx, which is kind of the core.
On the Explorer front, we have Chris Grainger.
He's the one who created it.
He works at Amplified AI, which is a company that runs Elixir.
And they are migrating.
He gave a great keynote.
I think it was on the future of AI, the future of machine learning, at ElixirConf, saying
how it's the first company, really, I believe, to start running Elixir machine learning models
in production.
And today, Philip Sampaio from Dashbit, he's working on Explorer as well.
And then Bumblebee is mostly an effort from Jonatan Kłosko, who also started Livebook.
So both Livebook and Bumblebee are mostly Jonatan. And then we have Chris, who is, you know, working on the different smart cells, the different integrations.
We also have Wojtek, who is working on Livebook desktop.
And like, you know, it's like we're talking about all those features, but possibly one of the hardest was just like shipping Livebook desktop.
And Hugo is doing the marketing.
And Hugo, I have known him for almost 20 years.
I met him on my first day in university.
We met and we did Plataformatec together.
I know Hugo very well from Elixir Radar.
Yes, exactly.
I wouldn't say very well.
You know him much better than I do, but I know him.
Yeah.
Yes, exactly.
So, and we have been working, we have been doing things together.
We had a band together, then we had a company together, and now we're working.
You had a band together, is that what you just said?
Yes.
Oh, tell me more.
Yes.
A long, long time ago.
What we did was acoustic versions of, I think we wrote one song, which thankfully, it was like-
Is it on the internet anywhere?
Yes, exactly.
No, it isn't.
So we were like-
Dang it.
We were that period where we had internet, but things that went to the internet, they were not lasting forever.
It's not like, well, it's in the internet.
It exists.
But it can be a positive and a negative, because I found, so, one of the ways I learned how to program, actually, like my first real thing, was building a website
for our band.
Okay.
And I built it using Flash. It was not even Adobe Flash at the time, it was Macromedia Flash.
Nice.
So one of the downsides is that I actually found the source code for the website.
And you probably can't run it anymore.
I cannot run it, right? So, uh...
Man. Yeah, so you need
like a turn-of-the-century era Windows machine that still has all the stuff on it, you know, like some sort of image
of some sort of, you know,
like what computer you would have had back then to run
that thing. Something with Flash, you know,
on it. That'd be cool.
So tell me more.
What instruments did you guys play?
Was it a duet?
You said it's all covers. What kind of covers?
Mostly
pop, but more in the direction of
rock. So of course we had our tastes, but if we wanted people to actually come listen to us, we
had to play mostly what was pop rock at the time, I believe. And Hugo played the guitar,
did backing vocals. I could not sing for the life of me. So I was very far from the microphones.
And I played guitar and piano.
I'm not sure if I played the bass back then.
Okay.
Multi-talented.
Multi-instrument.
Yeah.
I mean, I was, I have not played anything for real for a really long time.
I really hope I could go back into it,
but I keep getting entertained with programming stuff.
You keep coding new stuff.
Maybe that could be a bucket list kind of thing
for our future ElixirConf.
Get you and Hugo and get the band back together, Jose,
and play on stage at ElixirConf.
How cool would that be?
That would be fun, actually.
That would be pretty cool, actually. Yeah.
Yeah. I know that some people at Gopher Con have done that over the years. They get together certain people. There's a lot of musical people in the Gopher community and they usually have
some sort of a band. It's like an ad hoc, you know, they were never a band in the first place,
but now they get together and play some cover songs and have a really good time. So I think
that would be something that Elixirists would totally...
I would.
I have not been to ElixirConf.
But if I heard that you and Hugo were going to be playing music, I would go.
I would fly for that, for sure.
All right.
Now we know.
Maybe it's too short notice for ElixirConf US.
But maybe ElixirConf US 2024.
OK.
Well, I'm glad.
Okay, book it.
We got it on the calendar.
We just got to get Hugo on board and make this magic happen.
That's amazing.
Okay, so you covered some of the people involved.
There's lots of projects.
So each little project has certain people who are involved.
What about the Dashbit side?
What about the financial side?
All this stuff is in the open source world. Are you making money? Are you hoping to make money?
Are you raising money? How are you managing this? Yeah. So yeah, a lot of those questions. So
Dashbit is, you know, we are, we have a service that has been doing well and that funds the rest of the work that is happening.
So we are like three of us,
we are working with clients
and everybody else is like full-time on open source,
which is really awesome.
Nice.
And we want to figure out ways of making Livebook sustainable.
Maybe we can find ways to break even.
So one of the things we did,
but at the same time,
I think like Livebook,
I think it's already playing,
but I think it can play a really large role
towards Elixir adoption.
So we don't want to say,
hey, you know, like,
well, like we have those features,
but you have to pay
and somebody is just deciding to learn Elixir, right?
So one of the things that we are thinking is that everything in Livebook for a single user is going to be there.
And we want to make everything easy to install and run on your machine.
But if you need to collaborate with other users, which by definition requires a central place
where you're going to put and share information,
then that's going to be a service that we develop
that we are calling Livebook Teams.
So if you want to deploy an application as a team,
manage it, share secrets, share code snippets,
all those kinds of things, it's going to be in Livebook Teams,
and it's something that we are exploring.
We hope that we're going to have a beta
for people to try out
in the second semester.
So that's how things are going,
but we're not in a hurry.
And regarding investment,
like VC,
it's so funny,
because I've been doing open source
for like 15 years, and I have never had a VC reach out to us because of any other work I did.
And with Livebook, I think mostly since the release, because there is so much happening.
And at the time, we're not even talking about machine learning, but because it was like data, right. And notebooks,
which is related to this data world, we had VCs reaching out to us.
And that was like the, it was the first time. And that was interesting.
But I always tell them, like, well, there are a couple of things.
So there are certain things in Livebook that really only work.
And I'm not trying to be like annoying about this,
you know, like functional programming,
but there are a couple of features in Livebook
that only work because everything is immutable by default.
So for example, if you think like,
have you used Jupyter Notebooks before?
I've loaded one, like I've never written one.
Right, okay, yeah.
So the way that Jupyter notebooks work
is that you have the cells
and you can think the way they operate
is that you have some code,
that code gets the variables from some global state,
changes those variables and puts them back in this global state.
So everything is global state.
And that makes it very hard
for the notebook to be reproducible. So the example
that I gave is that if I have a cell that is x equals x plus one, and x starts with zero,
every time we execute that cell in a Jupyter notebook, the value of x is going to increment,
which means that if you want to share that notebook with somebody, right, you have to say,
hey, for you to get the same result as I do, you have to go and execute the cell number three three times, right?
Right.
So there is this aspect.
But in Elixir, like in Livebook, if you do x equals x plus one and x starts at zero, you can execute that 100 times.
At the end, x is going to be one.
So we had this whole focus on making everything reproducible.
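(A concrete version of that example, as hypothetical Livebook cells:)

```elixir
# Cell 1
x = 0

# Cell 2: re-evaluate this cell as many times as you like.
x = x + 1
# In Livebook, x is 1 every time, because the cell always starts from the
# binding produced by Cell 1. Rebinding here never mutates shared global
# state, unlike the Jupyter example where x keeps incrementing.
```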
There are some like Python notebooks that try to solve that, but because everything
is global state, like I can append to a list that's global state that you have in the Jupyter
notebook.
And those things, they are greatly reduced in Livebook.
And because of that, we're able to do a lot of very specific tooling on top of this abstraction.
These smart cells, they work on top of this abstraction.
We can do caching because we know which variables a cell depends on,
and because everything is immutable, we know when those variables change.
So there are a lot of interesting things that we can do.
And we lean on that for Elixir.
So when we're thinking about VC, I'm really worried about, like, well, if we get VC money, they're likely to want us to grow, right?
And that's going to generate tension because, well, yes, Elixir is, you know, beyond my expectations of community size or where it would get to.
But compared to Python, it's a small
community, right? So I don't want to be in the middle of this tension where they're like, well,
Livebook's great, but it's Elixir only. So now you have to figure out a way to make Python work or
JavaScript work because that's where things are. So it's like, technically it's not going to be
easy and I don't want to be in that place.
So when the VCs, they come talk to me,
I say exactly that.
And then they're like, well, you know, it's great.
It looks like you have the vision for the product.
You know exactly what you want.
If anything changes and you feel like
you want to get investment, let us know.
But at this moment until I feel like,
I'm not really, I'm not planning to, but I'm not also ruling out getting investment.
But yeah, at this moment, I'm like, until I solve this problem and say, look, I think Livebook can grow because these smart cells are so good that people don't write in a certain way.
Or we make SQL our second language.
And, you know, SQL is also declarative and there are a bunch of interesting properties.
But until we solve this big problem, I wouldn't go after investment, I think, because I don't want to be in that position.
Well, you definitely thought through it.
I think that makes tons of sense.
I think if you really want the VCs to go away, you can just talk about, you know, the immutable characteristics of Elixir and just talk about it fast enough that their eyes glaze over.
And then they're like, I'm sorry, I guess we'll leave.
So that's a good tactic.
Well, it's really your fault because you're asking excellent questions, which makes me excited.
And then I just go on a roll. So yes, but I apologize.
That's what I'm here for. I tee him up and you knock him out of the park. All right, Jose.
I know there's other things in this launch week. We will leave them to our listener to go check
the show notes and check out the other three or four things that we didn't discuss because we're
out of time. It's always a blast. It always goes by so fast, even when you're speaking so fast. And I always learn something. I am actually
excited about Explorer. I'm actually excited about Whisper up in my Elixir. I was looking at Whisper
CPP for various reasons, but this makes tons of sense. As soon as we can get speaker diarization,
can I get you on a tangent about speaker identification
inside of the machine learning world?
Because Whisper doesn't do it.
And I would love for it to just support that out of the box.
Have you looked at that at all?
Yeah, I've looked at that.
So Chris McCord, the creator of Phoenix,
he actually did a nice video
because he built this thing called LiveBeats, which is like
Spotify, but built with LiveView and where you can like share music and so on. And he did a blog
post where he was adding Whisper to this LiveBeats application and he has a video. So it's really
neat. And at that moment, it was not even doing like audio
segmentation, so
you have to figure out ways to break the
audio into like 30 seconds.
But we know how to solve that problem
and do automatic audio segmentation.
It's something that we
plan to work on.
But I don't know enough about
the other problems,
like doing the diarization and figuring out
who is the speaker, unfortunately.
I think there are some tools in Python.
There are, yeah.
PyAnnote is one of them.
I don't know if that's how you say it.
PyAnnote. So I know some people,
there's a thread on the Whisper CPP
discussion about how to go about it
and it's basically like a pipeline situation
where you run it through Whisper,
then you run it through this other thing called PyAnnote,
and then you run it through Whisper again or something.
It's very complicated at the moment.
Not worth my development efforts.
I see.
Just kind of sitting around waiting.
I know there's other ways we could do it ourselves.
We actually had a lot of people reach out
after that Whisper episode saying,
you could do this, you could do that.
We have separate tracks. We do multi-track recordings so we could actually just
edit them and then do each track individually and then munge the files or something like this.
It's possible, it just really messes with our workflows, you know.
So we're trying to have like a, we would love to just upload our MP3 into our Phoenix app, and then in the background it just goes and transcribes it,
and it just is amazing.
So I'm just kind of waiting for that to be...
Potential.
Well, you could,
I mean, you
could upload three MP3s
then. Or is it because
getting the individual tracks is hard?
Yeah, so we start
with individual tracks, but once we
mix down,
we're basically just adding a bunch of steps.
Oh.
So we start with tracks,
we edit them,
and then we have a session.
And then we mix that down to a single WAV file,
and then we convert that to an MP3.
And that's what we finally ship.
And we're doing manipulation on the MP3 in our app,
like adding chapters and stuff,
but we're not doing the transcripts.
And so at that point,
you don't have any sort of multi-track information
and you don't want to do it too early
because then the timestamps are all going to change
because we're still going to edit it.
So it's like, how does it work in bulk?
Because we're shipping five or six shows a week.
It has to be like,
it can't add hours and hours of additional steps
and WAV files are large.
And anyways, podcaster problems.
But I appreciate all of our listeners who wrote in and gave us suggestions.
It's just, I'm not going to use any of them.
I'm just going to wait.
I'm just going to wait until something pops up.
I'm patient.
I've been waiting this long.
Until it's a smart cell, right?
So you can just click.
I'm just going to pop it in, drop my WAV file into Livebook and hit a button.
Boom.
And then copy that to Elixir code.
If you see a model in Hugging Face that does it, then you can ping us because then we can
port it and have everything running in Elixir.
But as far as I know, there is no such thing.
I think that's correct.
You know what I just realized? That I assume most people listen to podcasts
at like one and a half speed or two and a half.
And this time they won't need to.
That's right.
They'll have to turn it down just to keep up with you, man.
You were talking at one and a half to two X.
So that's all right.
I like to wind you up and let you go.
Yeah, I actually troll my mother this way
because I know she listens to everything
with like one and a half or two.
Yes.
So whenever I send her an audio,
I on purpose speak super, super fast.
Like, hello mother, how's it going?
So she has to be like swapping the settings on and off.
But yeah, I'm sure she appreciates that.
All right, Jose, we will let you go. All the links to all the things will be in the show notes,
the whole launch week, how to connect with Jose, how to check out the things, will be there for
you listeners, so you can catch up with him and the work he's doing. Always appreciate you, always
love the tools that you
build and looking forward to the next one and what you do next with Livebook. I think it's
got a lot of potential. It's already pretty cool. I'm definitely going to give it a try
next time I'm munging some data. All right. Thanks for having me and see you next time.
So as Jerod said, all the links to all the things are in the show notes, the Livebook launch week, all the demos, all the videos, all the things in the show notes.
Check them out.
Up next week, we have a fabulous episode celebrating Maintainer Month along with GitHub and many others. And we're talking to Alyssa Wright from Bloomberg, Chad Whitacre from Sentry,
and Duane O'Brien,
famous for creating the FOSS Contributor Fund,
among other things.
It's going to be an awesome show
talking about how companies can support open source.
Again, we'll also be in Vancouver next week
at Open Source Summit.
So if you're going to be there,
make sure you come check us out.
We'll be in the Expo Hall,
as I mentioned in the intro. We are going to be podcasting at the Maintainer Month booth. So
make sure you check us out. We'll be podcasting and high-fiving and all the good things. So come
see us. Once again, a big thanks to our friends at Fastly, Fly, and TypeSense, and also to Break
Master Cylinder. Those beats are banging.
But that's it.
The show's done.
Thanks for tuning in.
We will see you next week. Outro Music