Programming Throwdown - 151: Machine Learning Engineering with Liran Hason

Episode Date: February 13, 2023

Machine Learning Engineer is one of the fastest-growing professions on the planet. Liran Hason, co-founder and CEO of Aporia, joins us to discuss this new field and how folks can learn the skills and gain the experience needed to become an ML Engineer!

00:00:59 Introductions
00:01:44 How Liran got started making websites
00:07:03 College advice for getting involved in real-world experience
00:12:51 Jumping into the unknown
00:15:22 ML engineering
00:20:50 The missing part in data science development
00:29:16 How to build skills in the ML space
00:37:01 A horror story
00:41:34 Model loading questions
00:47:36 Must-have skills in an ML resume
00:50:41 Deciding about data science
00:59:08 Rust
01:06:27 How Aporia contributes to the data science space
01:14:26 Working at Aporia
01:16:53 Farewells

Resources mentioned in this episode:

Liran Hason:
LinkedIn: https://www.linkedin.com/in/hasuni/

Aporia:
Website: https://www.aporia.com/
Twitter: https://twitter.com/aporiaai
LinkedIn: https://www.linkedin.com/company/aporiaai/
GitHub: https://github.com/aporia-ai

The Mom Test (Amazon):
Paperback: https://www.amazon.com/Mom-Test-customers-business-everyone/dp/1492180742
Audiobook: https://www.amazon.com/The-Mom-Test-Rob-Fitzpatrick-audiobook/dp/B07RJZKZ7F

References:
Shadow Mode: https://christophergs.com/machine%20learning/2019/03/30/deploying-machine-learning-applications-in-shadow-mode/
Blue-green deployment: https://en.wikipedia.org/wiki/Blue-green_deployment
Coursera ML Specialization (Stanford): https://www.coursera.org/specializations/machine-learning-introduction
Auto-retraining: https://neptune.ai/blog/retraining-model-during-deployment-continuous-training-continuous-testing

If you've enjoyed this episode, you can listen to more on Programming Throwdown's website: https://www.programmingthrowdown.com/

Reach out to us via email: programmingthrowdown@gmail.com
You can also follow Programming Throwdown on Facebook | Apple Podcasts | Spotify | Player.FM
Join the discussion on our Discord
Help support Programming Throwdown through our Patreon

Transcript
Starting point is 00:00:00 Hey everybody, we are going to be covering something really, really important, which is ML engineering. Some folks might have never heard that term before, but ML and AI are becoming extremely important. I saw some stat the other day: basically, almost all the Fortune 500 companies have whole teams dedicated to AI and ML. It's kind of accelerating so many different areas. And so machine learning engineering is going to become a bigger and bigger field as we move forward. And so I'm super excited that we have Liran Hason here to talk to us about ML engineering, what it's all about, how you can get skills in that area, and what those folks do all day.
Starting point is 00:01:12 So thanks so much, Liran, for coming on the show. Thank you. Thank you for inviting me. Cool. So let's talk a little bit about kind of your background. So what has been kind of your experience with ML engineering? How did you, you know, kind of get into that? And how did you kind of move on to where now you're kind of building tools to serve that community? How did that journey go for you?
Starting point is 00:01:35 Sure. So, well, my journey started way, way back when I was 10 years old, just being super curious about building websites. It started with that. Wait, really? Building websites at 10? That's amazing. Yeah, well, a friend of mine just called me and kind of told me he built a website, and I was like, what does that mean, right? And he just read me kind of 200 characters of a website address. I was typing it in one by one. And suddenly I saw his name on my screen, and my mind was just blown. It was like, whoa, that's crazy. Yeah, so that got me curious.
Starting point is 00:02:18 And I was like, how did you do that? I have to learn that. So yeah, I learned, and then I saw an advanced capability called HTML. So I asked my mother to buy me a book about HTML, and I started learning, continued to ASP 3 and PHP, and all this is history; these languages are almost non-existent today. Then I kind of moved into more software engineering. So, for example, I saw a lot of people, kids my age, who were trying to learn software engineering but were struggling with the language barrier of having to know English.
Starting point is 00:02:55 So I wrote kind of a programming language in Hebrew. It was called Blue-White Programming. All my friends actually laughed at me about doing that in my summertime. So, hang on, we got to back up a little bit here so so at 10 you're already really adept at using a computer right so so you must have gotten a computer at maybe seven or eight or something or yeah something like that um that is wild and so was that uh like uh so you got a home pc around seven or eight and then um did you have in uh grade school you know typing classes and all of that or was this all done at home no it was all done at home wow amazing okay so so then you you got into uh you know the web programming
Starting point is 00:03:40 stuff, and then you got into kind of compilers and writing your own languages, and that was maybe later? Or was that also at 10? No, no, obviously that was later. I'm talking about 15, 16, something like that. I learned C, C++. I was really interested in understanding how everything works, down to learning x86 assembly and stuff like that. Wow, amazing. Okay, so yeah, tell us about that. I mean, how did you evangelize that? And was it the kind of thing where just folks locally knew about it?
Starting point is 00:04:16 Or were you able to kind of get the word out? What did that look like? No, so it was mainly online; those were the days of ICQ and then Messenger, and some online forums. There was no Facebook back in the day. I know it might sound weird to some people. There was no TikTok. Yeah. So, yeah, it was very, very different.
Starting point is 00:04:37 So I was sharing it with the world online on a couple of forums. One of my projects got published in a newspaper, which was super exciting for me. And then I got my first job at the age of 17 as a software engineer. So that's kind of how I got into software engineering in general. With machine learning specifically, my first experience was when I was about 18. I built software to do biometric identification by taking pictures of the iris, and this was built in MATLAB, again, another thing that, I guess, today, outside of academia, you won't see a lot of people using. Yeah, isn't it wild how MATLAB has, I think there are still, you know, probably some niches they have, but outside of there, MATLAB has kind of totally fallen out of favor. It's pretty wild. I remember using Octave back in the day at university, which, for folks who don't know, is an open source implementation, I guess, of MATLAB. But yeah, I really doubt universities teach MATLAB now.
Starting point is 00:05:46 Yeah, no, and their neural net library, I don't think it's gotten updated in many, many years. But today everyone is using either Python or R, so there are better languages to use. Anyway, kind of skipping ahead: I spent five years in a technological unit in the Israeli army and graduated as a captain. I won't get into the details of what I did there, but it wasn't related to software. So wait, at 17 you had a software engineering job, right? And was that the
Starting point is 00:06:18 army job, or was that pre-army? No, that was pre-army. Oh, I see. And then I think the army is at like 18 or something, right? Okay, got it. So you finished high school early, or maybe 17 is a normal age in Israel, and then you had a year where you were able to get some industry experience. And then you had some army experience. And so did you go to college at some point? Or did you go straight from high school into industry? Well, it's a great question. Yeah, I did go to university at some point in time, but that was in parallel to me working full time.
Starting point is 00:06:55 And that was when I was already working for a startup company as an engineer. And so what is your feeling about that? I mean, folks who are, you know, let's say graduating high school now, and maybe they don't know what they want to do in college and they're feeling that they want to go and get a job. What sort of advice would you give them? Would you recommend the full-time work, part-time or full-time school thing? Or what would you tell them if you were to give advice today? So I think it's very, very much individual. There are some people who need to have the frame of courses with homework, with projects, the guidance, and that's their best way to learn new stuff. And if you're that kind of person,
Starting point is 00:07:41 then yes, obviously go to university or college. Other people are better off just learning on their own. If you're that kind of person, I'd definitely leverage that and use your free time to work on some projects. And if you can land some real work at some company, even if that means starting at the junior level, then do it; at the end of the day, there's nothing like hands-on work. And to be honest, as someone who has experience both from university and from working for a startup company, I think each one of them provided me with different aspects and different perspectives, at least when I'm coming to machine learning projects.
Starting point is 00:08:27 It helped me not only to have the hands-on practice, but also to have a deep understanding of what's going on behind the scenes, right? Like how the math works. So I think this is also important if you really want to be good at what you're doing. Yeah, that makes sense. Okay, so the thing that you built in MATLAB: I also built a bunch of things in MATLAB that are almost impossible to put into production, you know? Like, they fail constantly, or they're just really slow, and so it's like, how do you run this multi-node? And so at some point you kind of ran into the brick wall of ML engineering, you know, actually making this create value. And so what was that like?
Starting point is 00:09:07 How did you kind of deal with that? Yeah, so just after the army, I joined a startup company in the cloud security space that was called Adallom. So I was one of the core initial team, and we built a product really from scratch. And we had our data science team, and they were building, you know, some machine learning models. And at one point in time, I was placed inside of
Starting point is 00:09:33 that team in order to kind of create the proper infrastructure for reaching production. Because, as you mentioned, and I'm really happy that you mentioned it, it's not trivial to get to production. It's one thing to conduct your research, building something in MATLAB or Python or R, versus taking this artifact and making it a properly running service in production. Yeah, and so that startup, was there a lot of ML?
Starting point is 00:10:04 Was that core to that startup's business? And so that's where they needed that infrastructure? ML was definitely one of our main lines. So what we were doing, essentially, is monitoring cloud services that enterprises were using. A lot of companies back in the day were starting to transition to the cloud, whether it's G Suite or Dropbox or Salesforce and all of these solutions. So we were helping them really get visibility into what activity they have in these services and identify malicious attacker behaviors. Now, in order to do that, that's where we utilized machine learning, right?
Starting point is 00:10:39 Like, we created the baseline for the behavior of each and every user and also for different devices. And we utilized machine learning to identify anomalies in the patterns of behavior of this user and say, oh, wait, this doesn't look like the traditional behavior of that user. That might be an attack. So this is where we used machine learning. Cool. That makes sense. And so then from there, at some point... we felt pain at every point in the chain in production, to be honest.
Starting point is 00:11:27 And I think that back in the day, we were missing the proper tooling, right? So we had to build everything ourselves. We had to build kind of a monitoring solution, and it was more a bunch of Python scripts than an actual solution, right? So we did have Datadog, for example, to monitor application workloads, but we didn't have a proper solution to track our machine learning models. So what happened? We just learned about issues in production from our customers, which is obviously not the ideal way to figure out about issues that you have in production. Yeah, so back to your question. I just realized that more and more companies, like we did, are utilizing machine learning
Starting point is 00:12:14 for all sorts of different use cases. And all of them, the same way that we did, are building the very same code again and again. And as a software engineer, I prefer to reuse code, right? So the same goes for ML monitoring. I just realized there has to be one centralized solution where you can track all of your models in production, see what predictions they're making, and automatically identify drift or data integrity issues.
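To make "automatically identify drift" a bit more concrete, here is a minimal sketch of one common drift statistic, the Population Stability Index (PSI). The feature, the data, and the alert threshold below are made up for illustration; this is not any particular vendor's implementation.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare a training-time feature distribution to a live one.

    Common rule of thumb: PSI < 0.1 is stable, 0.1-0.25 is moderate
    drift, and > 0.25 is significant drift worth alerting on.
    """
    # Bin edges come from the reference (training) distribution.
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)

    # Avoid log(0) / division by zero for empty bins.
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)

    return float(np.sum((actual_pct - expected_pct)
                        * np.log(actual_pct / expected_pct)))

# Hypothetical usage: compare this week's feature values to training data.
training_ages = np.random.normal(35, 10, 10_000)  # stand-in for training data
serving_ages = np.random.normal(42, 12, 5_000)    # stand-in for live traffic
psi = population_stability_index(training_ages, serving_ages)
if psi > 0.25:
    print(f"ALERT: feature 'age' drifted (PSI={psi:.3f})")
```

A monitoring platform essentially runs checks like this continuously, per feature and per model, instead of as a one-off script.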
Starting point is 00:12:47 So that's kind of what led me to leave and start Aporia. Cool. And so how do you, how do you say this, muster the courage to quit your job? You know, that's always something I'm really inspired by: people who say, okay, I'm going to stop getting that paycheck, I'm going to find my own insurance, and I'm going to build this thing out and start getting customers. How did you will yourself to do that? Yeah, actually, I just met with a friend this last weekend and discussed this very same subject. So it's interesting, especially as an engineer, where you're usually getting a very nice paycheck,
Starting point is 00:13:51 so why should you quit and go do something crazy, and maybe work without any funding for a year or even more than that? So I really feel it's the question of why you want to start your own startup or your own company. What's the drive? For me, it was clear. I really enjoyed creation. I enjoyed
Starting point is 00:14:14 the creation of products, I enjoyed the creation of teams. That's what brings me joy, and I wanted to create an impact. And I believe that building something of your own allows you to achieve that. So that's kind of what my drive was to go and do that. Yeah, that makes sense. I think you're creating a lot more when you start something, and you also can be very focused. There's not as much kind of red tape, so everyone just focuses on building as much as they can as quickly as they can. And yeah, I think there's something really exciting about that. Very cool. So yeah, that's an amazing, amazing journey.
Starting point is 00:14:53 And we should definitely talk more about Aporia at the end. We'll put a bookmark in that and jump to the main topic, which is ML engineering. So what is an ML engineer? Cool. So there are various descriptions of what an ML engineer is, and if you look on LinkedIn at open job positions, you might find different descriptions. But I think as time goes by,
Starting point is 00:15:17 there is a consensus about what a machine learning engineer is. And essentially, a machine learning engineer is usually someone who's coming from a software engineering background. They may also come from a data science background. And their purpose is to build and supply the proper infrastructure to run machine learning in production. So that really means, starting from data collection: how do you build maybe a feature store so data scientists can collaborate on different features?
Starting point is 00:15:52 How do you build a proper environment for data scientists to build their models and experiment with different types of models? How do you provide them with proper tooling to actually package these models, wrap them up, and deploy them to production? And lastly, how do you allow them to monitor and track what's going on in production, so they can derive insights and further improve their models?
Starting point is 00:16:16 So it's really about the end-to-end infrastructure of the ML pipeline. That makes sense. And so, you know, you can imagine Patrick, for example, does a lot of geospatial work, and so Patrick could be sort of like a geospatial engineer, right? But we don't really go so far as to classify that as its own job title. So what about ML engineering merits having its own job title? Is it the fact that there's so much technical depth required, or that ML is such a big space?
Starting point is 00:16:51 What separates ML engineer from somebody who's putting anything in production? Right. So machine learning workloads are very much different than traditional application workloads. And that's why when it comes to ML engineering, yes, you do need to have maybe a different skill set or experience. So you do need to have experience with getting software to production, right? But you also need to have some level of understanding of data science, ML models, how they work. So this kind of combination, I think, is unique.
Starting point is 00:17:27 And one more reason to make it a title: when I did that at Adallom, it wasn't a title, okay? So it's pretty new, just from the last few years. I was just a software architect. Sorry to interject, but I have a crazy stat. You know that, I think it's like 90%? So if you look at all the different job titles,
Starting point is 00:17:46 so not the people in each title, but just the number of titles, something like 90% of them were created in the past, I want to say like 20 years. And so that includes everything like farmer, blacksmith, you know, like take every profession. And I think a lot of it is just because the landscape
Starting point is 00:18:06 keeps changing. Similar to how, you know, Eskimos have like 100 words for ice or 80 words for snow or something like that, because, you know, they work with it so much and now they need to disambiguate. OK, here's snow that I'm going to use to build an igloo versus here's snow that we're going to use to turn into water, et cetera. You know, similarly, you know, ML engineering is becoming its own field. And so it's now another job title. And, you know, and it reflects kind of just how the landscape keeps changing so quickly and how AI and ML is becoming so important. Yeah, basically, you know, you hit the nail on the head that there's just a ton of new set of skills that folks need to have to do ML engineering. You could end up having a pipeline that doesn't have any errors, it doesn't
Starting point is 00:18:57 crash, there are no failures; your APM, your process monitoring, all looks great. But then the result of the model is just trash, right? And so it's like, clearly, there is something that you need to keep your eye on that's more than just the sort of software health and all of that. And that's where all of that background in data science and statistics comes in. Yeah. And by the way, just to add on that, I know you mentioned new titles. So even if you look on LinkedIn, the last time I checked, ML engineer was the number one most wanted, most asked-for position on LinkedIn. Wow, I didn't know that. That's wild. Yeah, so that's pretty crazy. And the second interesting fact is, if you go and search for the ML engineer title on LinkedIn today, you'd find about 14,000 people in the US alone.
Starting point is 00:19:54 Well, six months ago, it was 11,000, so about 30% growth in that time. It's pretty crazy. Wow, those numbers are mind-blowing. It's totally wild. Okay, so has there been any kind of standardization, or is it really all over the place in terms of what different companies think an ML engineer does? So I think that in the last year, a pretty clear standard has emerged for what a machine learning engineer does. And what you'll find is that a lot of companies, in the past five, six years,
Starting point is 00:20:32 have been investing hundreds of millions of dollars in building data science teams and data science groups who are building machine learning models. And what they realized is that they're missing a piece of the puzzle in order to realize the impact of what these folks are doing. And this missing piece is MLOps. And ML engineers are the ones who actually implement MLOps in the organization. So what you'll see is that the top tech companies, companies like Netflix and Uber and Airbnb, have built their own internal ML platforms.
Starting point is 00:21:09 Who built those platforms? The ML engineers. And what we are seeing is that more and more companies are hiring ML engineers to build their internal ML platform. So the ML platform is this kind of centralized place serving all the different needs of data scientists, allowing them to reach production and to properly manage their production. Yeah, that makes sense. Just to show an example, let me tell a little story of how this can go terribly wrong. I was in a research group in the past, at a huge company. There were definitely ML engineers at the company, but our research project
Starting point is 00:21:52 became really popular inside the company. A lot of people started using it and then it started breaking and then it started causing this waterfall effect where our research code would break and that would cause a whole bunch of other things to break. And then it would eventually cause, you know, production to break and you trace it back to a bunch of eggheads who are writing research papers. And so sort of short-term solution was, well, let's just set up an on-call rotation. But, you know, we didn't really know what we were doing. You know, I mean, we were kind of researchers and we didn't know really how to productionize anything. And so then, you know, people who really knew what they were doing,
Starting point is 00:22:35 you know, kind of were able to jump in and help out. And a bunch of ML engineers came in and ruggedized everything. But it was scary for a few months there. Yeah, that's very interesting. And by the way, how did you figure out when something broke? Like, how would the dev on call, or the data scientist or researcher on call, know that something had broken down? Oh, that's a great question.
Starting point is 00:23:00 So a number of things that we did. One was, when someone trained a model, if the training didn't complete successfully, so if it didn't actually finish or if the person canceled it early for any reason, we put up a little interstitial, like a modal survey: why are you stopping this job? And the answer could be, you know, the loss is spiraling out of control, or it could be just user error, you know, I didn't like what I built myself, et cetera. And so then we were able to sort of
Starting point is 00:23:31 track that and burn it down: the number of people who said, oh, I'm stopping this run because it never finished when it was supposed to, we tried to get that number down. And as I said, people who really knew what they were doing came in and did a whole bunch, so I'm only talking about the early days when it was not good. Then I moved on to some other research project where we were trying to track, you know, some production metrics, and some social sensation went viral. It was literally a giraffe giving birth in a zoo, and for whatever reason this went super viral. I think it was called... no, Sophie the Giraffe is the teething toy for kids; the giraffe had a different name, Something-the-Giraffe. The giraffe got super viral.
Starting point is 00:24:29 Then all of a sudden, all the alerts went off: all the models must be broken, because all of the predictions are way underperforming. That pointed out to me that we need some kind of A/A testing; we need to have a slow roll, so we could have two models at any given time, and if both of those models start generating way more clicks, then we know that it's not the model. Right. And so it's all this causal analysis, explainability; these were all things that we needed. And so, yeah, I think folks went on to build a lot of that stuff later. Cool.
Starting point is 00:25:07 Yeah, we do see a lot more data science and ML teams actually deploying models in shadow mode, where they have two different experiments of the same model running in parallel in production, so they can figure out which one is actually performing better and then do load balancing accordingly. And also blue-green deployments: really trying multiple experiments, seeing what's working better, and shifting traffic accordingly.
Starting point is 00:25:59 Yep. Another issue is that anything you deploy in production becomes a closed-loop system. So, for example, let's say you go on YouTube. There are people who go on YouTube and they search for, I'm just making this up, but the pattern still holds true, they search for "cat" and they click on the third video. And instead of just copying the link to that video, that's their way of getting the video. It's like, oh, I search for cat and I go to the third video and I get this exact video that I want. And so you can actually train another model, exact same parameters, same data, same everything, but just because of random numbers and timing and stuff, the model will be different, and now that video isn't the third result for "cat" anymore, and now that model underperforms. So it's like, you have to deal with all the humans that are in the loop too, and their behavior, and it just became really difficult to try to know if you've actually broken something. Yeah, and there's also the seasonality aspect of things, right? Like maybe, I don't know, in the last few months, people are more interested in cats for some reason.
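To make the shadow-mode idea from a moment ago concrete, here is a minimal sketch. The model objects and routing below are hypothetical (real setups often do this at the serving-infrastructure layer), but the core idea holds: the shadow model scores every live request while only the incumbent's answer is ever returned, so the candidate is evaluated on real traffic with zero user-facing risk.

```python
import logging

log = logging.getLogger("shadow")

def predict_with_shadow(request_features, live_model, shadow_model):
    """Serve the live model's prediction; run the shadow model on the
    same input and log both so they can be compared offline."""
    live_pred = live_model.predict(request_features)
    try:
        shadow_pred = shadow_model.predict(request_features)
        log.info("live=%s shadow=%s features=%s",
                 live_pred, shadow_pred, request_features)
    except Exception:
        # A shadow failure must never affect the user-facing response.
        log.exception("shadow model failed")
    return live_pred  # only the incumbent's answer reaches the user
```

Once the logged live-versus-shadow comparisons look good, the same plumbing can gradually shift real traffic to the candidate, which is where the blue-green style traffic shifting mentioned above comes in.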
Starting point is 00:26:50 Or we have the upcoming Black Friday. So I guess that a lot of more people are looking to buy all sorts of things on Amazon and other e-commerce websites. So the behavior of the population is changing dramatically. Is the model still relevant? Right.
Starting point is 00:27:07 It's a very interesting question. Yeah, I mean, I've never worked for Amazon, but I'm 100 percent sure they have a separate machine learning model, or even an entire machine learning stack, for the holidays. Or at least they have some type of residual model or something. But you're right, I mean, it would throw everything out of whack. I saw another issue because of daylight savings time in the US, which, I think they got rid of this, right, Patrick? At least in Texas. Oh, in Texas, I think they're getting rid of it. They'll be rid of it soon.
Starting point is 00:27:41 Oh, it's going to be gone soon? Yeah, it's still a couple more years. Oh, but is that nationally, or just... yeah. Okay. So for folks who aren't from the US, or other countries that have this: we have this daylight savings time, where the clock jumps forward an hour in the spring and then falls back an hour in the fall. I think we're going to get rid of it in a couple of years, but, you know, that means that there is a day where you only have 23 hours and a day where you have 25 hours. And every single year, all of our alarms would fire on that day. You have one twenty-fourth fewer clicks or something.
Starting point is 00:28:18 So yeah, these are just some of the statistical pitfalls that ML engineers have to deal with that's kind of unique to that profession. Yeah. And also dealing with the large scale of data that these models need to consume and process sometimes even in real time. So all sorts of really interesting challenges, by the way. It's interesting data engineering challenges, interesting software engineering challenges. So I think in my opinion, it's a super interesting profession to work at. Yeah, totally. So usually academia kind of lags behind a little bit.
Starting point is 00:28:53 So I know, for example, data science, it took a long time for you to find a data science program and academia, I'm sure they have them now. So what would you recommend to folks? Let's say someone is in high school or they're just my opinion, my recommendation would be first getting a degree in computer science, getting some hands-on experience in either a corporate or startup company, but just getting your first job as software engineer.
Starting point is 00:29:40 Really feeling, hands-on, what it's like to build software, take it to production, and feel how mature the practices are, because in traditional software, production practices are very mature, unlike in data science. And then, having gotten this kind of experience, let's say after spending about one or two years as a software engineer, I think it could be a good time to transition; maybe take a few courses online on Coursera.
Starting point is 00:30:10 There are some really, really good courses online for data science, so you can get into that field, understand different types of machine learning models, you know, decision trees and regression and neural nets and what's the difference between all of them. And from there, really transition either within your existing company or just look for a job as an ML engineer. Because the role itself is pretty much new, it just was introduced
Starting point is 00:31:19 in the last year, it will be difficult for companies to find someone with past experience in ML engineering, right? So having software engineering experience, plus some knowledge or experience with data science, is a huge advantage. Yeah, that makes sense. So, most folks: we've done tons of shows on how to be a software engineer, but we've never really done a show on how to be a data scientist. And we talked a lot about all these statistical funny things that can happen and how to try to pull causality out of that. So how do people get experience in that? Is it taking some stats classes? Is it YouTube? Where did you learn, kind of, your data science core? So
Starting point is 00:31:44 I did learn my data science core, part of it, in an academic program. That was actually while in high school. But then, while doing my master's, that's where I got most of, at least, the theoretical knowledge that I had to have in order to start doing that. You know, essentially, you can just take some Python library,
Starting point is 00:32:19 you can take XGBoost, for example, and train your model, and you probably will be able to train and build a model. But you won't be able to really understand how to make a good model. You need to have a good understanding of the different hyperparameters, what alpha means, what learning rate means. So you have to get this theoretical knowledge. So in my opinion, in order to become a good data scientist, you have to come from at least some theoretical background. I think what you'll see in the industry is a lot of people with master's degrees and PhDs working as data scientists.
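To ground that hyperparameter point, here is a small illustrative XGBoost example. The values are arbitrary, and the comments note what each knob, including the L1 "alpha" regularization and the learning rate mentioned above, actually controls:

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic data, just to make the example self-contained.
X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = xgb.XGBClassifier(
    n_estimators=300,
    learning_rate=0.05,  # eta: how much each new tree corrects the last
    max_depth=4,         # deeper trees fit more interactions, overfit faster
    reg_alpha=1.0,       # L1 regularization on leaf weights (the "alpha")
    subsample=0.8,       # row sampling per tree, a variance-reduction trick
)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```

Anyone can get this to run; the theory is what tells you whether `max_depth=4` or `reg_alpha=1.0` is sensible for your data, which is exactly Liran's point.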
Starting point is 00:32:34 And I think that is changing over time. But at least that's what I'm seeing. Yeah, I think that makes sense. So Liran, you didn't study in the US, did you? No, I did not. Okay.
Starting point is 00:32:34 And I studied in Israel. I feel like our statistics, even for engineers is not good. I think, Patrick, did you take statistics for engineers or how did that work at your school? You went to UF, right? Yeah, we had like, just a single basic statistics class. There was like a giant auditorium class across everyone, math, engineering, all of that. That was it.
Starting point is 00:32:57 We had the same thing. And so I was in this cohort of people who could apply early, like who could get first dibs on the classes, which is super convenient, loved it. But I happened to pick this and I learned my lesson. This was first year. I picked this statistics class and it was almost empty. It was at a really convenient time where I lived really far from the university. So I could, if I picked this particular statistics class, I would only have to go to the university
Starting point is 00:33:25 two days a week. I could just go and do all my courses. And then the other two days I could stay at home and just do homework or whatever. And so I picked that stats class. It was almost empty when I picked it, which really, almost full, sorry, when I picked it, which really surprised me. It turns out that was the stats class for all the professional athletes. So if you were effectively majoring in scoring touchdowns or something, like your coach told you, take this statistics class.
Starting point is 00:34:40 And I don't know if this is like a big exposé on the academic-industrial complex or something, but I walked into this class, and at one point the teacher asked this super basic question, like, how many ways could you pick two things from four things, or something. And this person says 23. And he's like, why did you say 23? He's like, because it's Michael Jordan's number. It's just like, oh my God, what have I done? This was my first college experience; I probably should have gone into industry. I definitely felt that way in the moment. But that was it; that was all the statistics coursework we had in undergrad. And so, I guess, you know, maybe now it's a lot better, but I implore folks to take some other statistics classes as electives if you can,
Starting point is 00:35:27 because it's by far the most important thing that you'll use as an engineer. Yeah, and I think today there are so many things online, right? So even if you end up in such a "fun" statistics course like you just had, you can still complement that by learning on your own in your free time, getting some more knowledge, more experience. Actually, there is a university in Israel, and that's actually how I learned statistics. I don't remember the name of the university, it was a long time ago, but they had a YouTube channel, and they had a bunch of statistics courses, and they have game theory courses there too. I mean, now I don't know if that's even a good recommendation, because it's like 15 years old or something.
Starting point is 00:35:53 But yeah, there are a bunch of amazing resources online, and you can build yourself up. And you can go at your own pace too. So if there's something that clicks right away, you just skip ahead. And if there's something you don't understand, you can watch it as many times as you want. You can watch three different professors give you three different takes on the same thing. Yeah, actually, one of the most popular ones is a course from Stanford. I'm not saying anything new; I think it's maybe the most popular course on Coursera, but it's highly recommended. Yep, yeah, totally. No matter how many courses you take, though, at least the courses I take, I will never figure out eigenvectors and eigenvalues; that will always occupy some part of my brain that I can't totally access. But maybe your odds will be different, and at least you can try as many times as you want. So yeah, that's true, you can do replay. Yeah, Andrew Ng's course is phenomenal.
Starting point is 00:36:30 I know he has, I think it's like deeplearning.ai is his domain. And he might have new content there. But his original course is fantastic. So what are kind of, you know, different, like, do you have any sort of ML engineering horror stories? Other than the ones I told, but do you have any of your own, like things that, you know, kind of went terribly wrong that could kind of give people insight into what that's like and how to recover from that and excel from that? Oh, yeah.
Starting point is 00:37:02 Well, there are plenty of them, right? A lot of times it happened to me that, you know, we'd be working on a model, it was functioning very, very well in training, and when we tested it, it seemed to be working fine, so we deployed it to production. At the beginning, everything seemed to be working fine as well. And a few weeks go by, and I'm talking about a previous company here, and we started getting some questions from our product teams, from our clients.
Starting point is 00:37:32 And the assumption was that something was wrong with the models. It's always the assumption, by the way. Yeah, that's right. And we went and really investigated what had changed, whether something had changed with our model, with our data. And what we realized is we didn't do anything wrong. It was just someone else on a different team, the backend team, that changed some logic
Starting point is 00:37:58 that affected the data that got to our model, and as a result, it completely changed the behavior of our model, right? So it wasn't even our fault, but it happened so many times, and it's so frustrating. Yeah, actually, that's a good segue into training-serving skew, and just training and serving in general. Like, we talked about MATLAB, and, you know, people might write a Python program, and a Python program is full of Python for loops and takes forever. And, you know, how do they get that model, how do you get some research code, into production? And then how do you iterate back and forth, right? The researcher,
Starting point is 00:38:44 you know, how do you keep that latency kind of low? Yeah, so how do you get models into production? I think this is a huge question a lot of companies are trying to figure out now: how do they get more models into production? And that is exactly where machine learning engineering comes into play. How do you create proper infrastructure so data scientists can release ML models more easily and more
Starting point is 00:39:10 frequently to production? So I think that in the past, we used to see some teams where the data scientists would package their model into a pickle file and hand it over to a software engineer from a different team, who would kind of rewrite it in a different language. So instead of Python, it would be rewritten in Java, for example, and that's what would run in production. That doesn't work very well, right? Because all this translation from one language to another is a huge opportunity for a lot of bugs.
Starting point is 00:39:47 So I think today there's pretty good tooling, good open source tools, for packaging. BentoML, for example, is a cool open source project that allows you to package your models and ship them to production. Now, how does that work? So if you're writing in Python, what does your production environment look like, and how does BentoML get you there?
Starting point is 00:40:10 Yeah, so I'll talk in general, like there's BentoML and Kubeflow and other solutions, but per your question, kind of how do you scale it up: I think the main way to do it is by leveraging Kubernetes and Docker. So what you'd usually do is build a service around your machine learning model, and that service will expose a REST API endpoint, or a GraphQL endpoint, to the machine learning model, for example, to the predict function.
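Here's a minimal sketch of that service pattern, assuming a pickled scikit-learn-style model and using FastAPI. The file name and feature layout are hypothetical, and tools like BentoML generate a comparable service for you rather than you hand-writing it:

```python
# serve.py -- run with: uvicorn serve:app --host 0.0.0.0 --port 8080
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the trained model once at startup, not per request.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

class PredictRequest(BaseModel):
    features: list[float]  # hypothetical flat feature vector

@app.post("/predict")
def predict(req: PredictRequest):
    # Wrap the model's predict function behind a REST endpoint.
    prediction = model.predict([req.features])[0]
    return {"prediction": float(prediction)}
```

This is exactly the thing that gets baked into a Docker image and handed to Kubernetes, which is the next step Liran describes.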
Starting point is 00:40:41 And then you run this service on top of Kubernetes so you can scale it horizontally. Yeah, that makes sense. So if you're going to do that, actually, you could run Python in production. I mean, it might not be as fast as C or something, but you just add more nodes, and it won't be orders of magnitude slower. Oh, yeah, you can definitely run Python in production, with Python 3, for sure. And to be honest, because of all the libraries that exist today in Python, Pandas, NumPy, PySpark, you have so many libraries to process a lot of data very, very efficiently. Usually it will be more efficient than rewriting it yourself in another language. So if you want to run models in production, unless you have some specific constraints,
Starting point is 00:41:29 yes, Python would usually be the main go-to. Got it, that makes sense. And so I think then the biggest challenge becomes the data. When you are loading the data at training, right, you need to load in huge volumes of this data that's frozen in place, like you have a data lake, like S3 or one of these things, and you're just streaming in huge volumes of data and iterating on your model. But then at serving time, it's totally different, right? At serving time, maybe you're getting an SMS or something,
Starting point is 00:42:02 and that SMS needs to get turned into a piece of data that then goes into your model. And so how do you keep those homogeneous? Yeah, so I'd say it's case by case. It's a very interesting question, how you scale this up from running your models in training, where in a way you run them in batch, versus... So in general, we split machine learning models into types. There are batch models; for example, they might be running once a week, and then they'll be getting like millions of predictions
Starting point is 00:42:36 at the same time. The next type is near real time. And then you have the last part, which is real time. For example, if you're, I don't know, Mastercard or Visa or Amex, right, and they're checking your transactions for fraud, that means that whenever you swipe your card, a machine learning model is running and evaluating whether this is fraud. Now, that has to happen in milliseconds. Actually, that's interesting. You know, you're kind of bringing up something that I hadn't really thought about. So
Starting point is 00:43:08 MasterCard, using your example, they could build a machine learning model offline that kind of says, if I see this type of person in the future, then it's fraud. And if I see this type of person, it's not fraud. And then the model can be just frozen and say, you know, if it sees a new person it hasn't seen before, it still can follow the same formula. Say like, okay, this person, I'm going to break them down into some history, process this history and say, yes, it's fraud. Or like alternatively, the model itself can run, can update in real time. It's not clear to me like when to use one or the other.
Starting point is 00:43:45 I guess you could have a model that just describes all these different archetypes of people and then it doesn't have to update in real time. The data is kind of updating in real time. So yeah, can you walk me through like that whole paradigm? Like when would you train a model in real time? And how would that even work? So what you're talking about is the concept of auto-retraining,
Starting point is 00:44:11 like automatically retraining your model on new data that arrived in production, and automatically releasing it to production so the system can leverage the newly trained model. So it is possible to do it. My recommendation is, if you're going that route, you have to have a lot of automated tests for your data and your model before you release to production automatically. It's kind of a best practice in general, by the way, to retrain your model every now and then. Because if you think about it, when you train a model, you train it on a specific snapshot of reality, right?
Starting point is 00:44:48 It might be like the last five years of loan applications or credit card transactions. But reality is constantly changing, right? We had COVID, now we have the holidays upcoming. A lot of things are constantly changing. So we need to make sure that our models are updating themselves to the new reality. So that's why, in general, it's a good practice to do retraining every now and then. But also, when doing retraining, you have to watch what data you're using for retraining.
Starting point is 00:45:19 Because if there is a malfunction with the data, for example, maybe there was an outage of some data source, so you got a lot of null values in, let's say, just 5% of your data points, and when we are talking about millions or even billions of records, you might not notice that. But the model might learn something wrong, a behavior that you don't want it to learn. So that's why, on one hand, I'm saying retraining is important and you have to do it
Starting point is 00:45:49 every now and then. But with that, I'm saying automatic retraining could also be dangerous if not done properly, with proper tooling and measures. Yeah, that makes sense. That totally makes sense. Yeah, I feel like most people probably don't need real time. So most people don't need automatic retraining that's done every minute or every hour or something like that; that seems like overkill. For a lot of things, it seems like it might be better to have some data, or some counters, that are updating in real time; the model doesn't have to be so real time. Yeah, and training might also be very expensive. And your question beforehand, about how you decide whether you should train a model for each specific person or have this one huge model for tons of transactions and just retrain it: these are kind of architectural decisions. This is why it's so important to understand, in depth, how these machine learning models work behind the scenes, how these algorithms work, and how the hyperparameters affect the resulting model, so you can make the right decision when thinking about this architecture.
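Here's a rough sketch of the kind of guardrails described above for automatic retraining. The thresholds, function arguments, and pandas-style data checks are illustrative assumptions, not a production recipe:

```python
def safe_retrain(train_fn, evaluate_fn, new_data, current_model,
                 max_null_fraction=0.01, min_relative_score=1.0):
    """Retrain only on validated data, and promote the new model only
    if it's at least as good as the incumbent on a held-out set."""
    # Data integrity gate: an upstream outage can silently poison training
    # with nulls, exactly the 5%-of-records failure mode described above.
    null_fraction = new_data.isnull().mean().max()  # worst column (pandas)
    if null_fraction > max_null_fraction:
        raise ValueError(
            f"too many nulls ({null_fraction:.1%}); aborting retrain")

    candidate = train_fn(new_data.dropna())

    # Quality gate: never auto-promote a worse model.
    old_score = evaluate_fn(current_model)
    new_score = evaluate_fn(candidate)
    if new_score < min_relative_score * old_score:
        raise ValueError(
            f"candidate underperforms ({new_score:.3f} < {old_score:.3f})")
    return candidate  # caller deploys this, e.g. in shadow mode first
```

The two `raise` statements are the "proper tooling and measures": the pipeline fails loudly instead of silently shipping a model trained on broken data.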
Starting point is 00:46:58 Yeah, that makes sense. So for folks, let's kind of role play here that we're hiring for an ML engineer. What would that resume look like? I mean, we talked about definitely,
Starting point is 00:47:19 you know, it's good to get an advanced degree. You know, you get hit much harder with stats and these other things in your master's and your doctorate. But beyond academics, what are other things you look for in a resume of an ML engineer if you're hiring one? So definitely experience in software engineering and delivering software to production. That will be very important.
Starting point is 00:47:45 And especially because machine learning is a new practice, a lot of the time it's all about building something from scratch, right? And I like to say that time spent planning is time well spent. So in order to build a proper ML engineering pipeline and a proper ML platform, doing the right planning and the right design and architecture from the very beginning is crucial. You don't want to be one year in and realize, oh, we made a bad architecture decision; we need to refactor or even rewrite our ML platform, or part of it.
Starting point is 00:48:24 So that's why having experience in writing, designing software, architecting software, I think is a key part in being a good ML engineer. Yeah, that makes sense. I think the nice thing about that too is, you know, ML engineer builds on top of a lot of other building blocks. And so if someone's just getting started, it might be hard for somebody to know. It's like, you know, everyone says,
Starting point is 00:48:48 my kids want to be astronauts when they grow up. None of them say, I want to be a machine learning engineer when I grow up, right? Maybe a software engineer. But so this is like something where, you know, you can build those foundations. If you love building software, you like solving, you know, kind of different kind of mathematical challenges at your company.
Starting point is 00:49:59 How do we capture this value accurately? I have a buddy who worked for, what's the company, Groupon, that does this, you know, internet coupon type thing. And this was pretty early, so they were just kind of getting started. And they had this issue where they wanted to predict how many coupons were going to sell before, you know, before they feature a merchant, right? They wanted to know, okay, is this merchant going to do well on Groupon before we give them a platform, right? And so, again, this was very early at Groupon, and this friend of mine, just a pure software engineer, was kind of faced with this sort of statistical, or predictive,
Starting point is 00:51:07 challenge, and went on and built a lot of those skills, and then got really interested in ML and AI and all of that. And so I think that's really good advice, Liran: folks out there, just get started by building cool stuff. And when you build anything, you will always reach some point where you need to predict something. If nothing else, you need to predict how many people are going to use that thing you built, right? Everyone's going to need to predict that. So yeah, building stuff is a good gateway to having to do stats. And that's a gateway to ML engineering. Yeah, that's a good summary of it. Yeah. So how would you decide whether you want to be an ML engineer or a data scientist before you start your studies?
Starting point is 00:51:07 And I think that maybe after studies, it's a question of what kind of person are you? Do you enjoy more doing research or do you enjoy more developing? Do you get excited about digging into data and playing around with different parameters and thinking about algorithms? Is that kind of what excites you? Or are you more excited about software architecture and this kind of thing? So it's very different, in my opinion. And based on these questions, you can identify whether you're more data scientist
Starting point is 00:51:47 or more of a software engineer. For example, if you do a project, like you mentioned, predicting how many users are going to sign up, right? Or something like that. Which parts of this project do you enjoy more? Do you enjoy more conducting the research and optimizing the algorithm
Starting point is 00:52:04 and squeezing out, you know, one more percent in accuracy? Is that kind of what drives you? Or are you crazy about how you think about the class architecture and this kind of thing, the microservices? It's two different things. Yeah, that makes sense. Well, would you agree with that? Like, you know, you being a researcher? Yeah, I think you hit that on the head. So, you know, I got into this through my PhD, which was on playing games with AI. And so we would do a lot of different sorts of ways of improving the system. A lot of it was through self-play, but then we would have other agents that you could play against. So, for example, we were doing chess, you know, we would
Starting point is 00:52:51 play against GNU Chess and see how many levels of GNU Chess we could beat with what we were doing, right? And then with Go, there were open source programs that were pretty good, and you could also run those for longer; you could kind of cheat on their behalf, right? So you could kind of make the domain more difficult. And so the exciting part was, you know, how good can we get this? Can we make a world-champion-level player? And what's the Elo of this chess player?
Starting point is 00:53:21 And how could we get a chess player to be better and better without more work, without more human work, just by spending computer time? What's the limit there? I got my PhD around 2009, so at that point, to answer that question, we had to write a zillion lines of code. I think it was all C++, which is super painful. But that was in service of trying to get that synthetic Elo as high as possible. So you're always kind of looking at that number and saying, can I do better there? So yeah, I think you hit the nail on the head.
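For listeners wondering what that rating number actually is: Elo boils down to two small formulas, an expected score and an update rule. A quick sketch (the K-factor of 32 is a common convention, not something from the episode):

```python
def elo_expected(rating_a, rating_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, score_a, k=32):
    """A's new rating given the actual result (1 win, 0.5 draw, 0 loss)."""
    return rating_a + k * (score_a - elo_expected(rating_a, rating_b))

# Example: a 1600-rated player beats an 1800-rated player and gains
# roughly 24 points, because the upset was unlikely (~24% expected).
print(round(elo_update(1600, 1800, 1.0)))  # -> 1624
```

The "synthetic Elo" of a self-play system is just this same bookkeeping applied to games between program versions instead of humans.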
Starting point is 00:54:00 I think data scientists look at, you know, can I either, you know, predict this thing more accurately than I did the day before? Or can I make this number go either as high or as low as I want? So the lowest amount of, you know, spam, the highest amount of revenue, and what is going to cause this to go up. And on the software side, you know, I think there is something really satisfying about building something that's really clean, that scales. There's that feeling where you've ever kind of ported one thing from one platform to another. And the first time you run it, there's a zillion errors. And then you go through and get it down to zero, and then it works. I mean, that's extremely rewarding. So I think those are kind of the two different dynamics there. Yeah, I agree. Yeah, I think now there is some,
Starting point is 00:54:50 there's still, I think, a decent amount of C++, but maybe it doesn't need to be. It's kind of, like nowadays, you know, unless you're doing some stock trading or one of these things, you can just get it done with Python. So I really like C++. I know it's not something to be proud of. What do you think of these C++ replacements, like Rust and Carbon and these other languages? So I really like C++ for various reasons. I don't think that C++ is a good language for most of the use cases and most of the projects or software today. I just enjoy
Starting point is 00:55:28 it very much, and it was kind of my first language; I learned very, very deeply how everything works behind the scenes in C++. And things are very complex behind the scenes in C++. And that's part of the reason I don't think it's a very good language today.
Starting point is 00:55:44 Yeah, I think that you can kind of shoot yourself in the foot with pound-defines and templates and some of these things. Even just when creating a new object, a constructor could fail, for example; it's not obvious. Yep, yep. Yeah, I think that a lot of people use templates when really they need to use other design patterns. And I think the problem with that is, once you use a template in C++, everyone else has to start using it too. And so, because of the way the specialization works and all of that, you end up in this mess. Oh yeah, here's a good example. You might say, I want to have a synthetic clock.
Starting point is 00:56:29 So I want this class to take in a clock, and it can ask the clock what the time is. And so when it asks the real clock, it gets the real time, but when I'm running tests, I want it to ask my fake clock, and my fake clock always says it's midnight or something. And so if you're writing this in Python, you would pass in a clock, or maybe you'd use a singleton, but most people would just pass in a clock. So you'd pass in the real clock when you want to do that. With C++, you have templates, so you could have a template parameter be the clock. The problem is that now that bleeds into everything: if I pass my class to somebody else, they need to know about the clock, because it's not just handled through polymorphism anymore.
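Here's a minimal Python sketch of that clock example (the class and method names are made up for illustration): the clock is injected as a plain constructor argument, so tests can swap in a fake without anything template-like leaking into callers.

```python
import datetime

class RealClock:
    def now(self):
        return datetime.datetime.now()

class FakeClock:
    """Test double: always reports midnight on a fixed date."""
    def now(self):
        return datetime.datetime(2023, 1, 1, 0, 0, 0)

class ReportScheduler:
    def __init__(self, clock):
        self.clock = clock  # any object with a .now() method works

    def is_quiet_hours(self):
        return self.clock.now().hour < 6

# Production code passes the real clock; tests pass the fake one.
assert ReportScheduler(FakeClock()).is_quiet_hours()
print(ReportScheduler(RealClock()).is_quiet_hours())
```

Callers of `ReportScheduler` never need to know which clock is inside, which is the contrast with the C++ template-parameter version being described.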
Starting point is 00:57:07 And so that's one example, but I've just seen so much pain around recursive pound-defines and crazy templates and all of that. Oh, yeah. And, you know, in order to understand what you just said, you really need to understand what templates are doing behind the scenes. So after you run the compiler, before it gets to the linker, what's the end
Starting point is 00:57:41 result, like what the templates are actually doing to your code, in order to understand the potential implications like what you just mentioned. And that's, in general, my recommendation to any engineer and any data scientist: whatever you decide to do, make sure to understand it very, very deeply, how it works behind the scenes. I think it's very important. Yeah, that's so true. I think with Python, every day I'm kind of amazed by what things are built into the language. There's a key-value store, Berkeley DB, I think, or LevelDB; one of these is actually built into Python. Like you can just say,
Starting point is 00:58:18 yeah, import, you know, BDB; there are just crazy amounts of stuff built in there. And, you know, I think in the case of C++, I mean, there's the STL, which is constantly expanding, but it's less about all these batteries being included and more about, as you said, understanding the craft so you can craft something really beautiful. Yeah, and if you're going with the C++ route, maybe, you know, Golang could be a good alternative. Yep. Yeah, Golang is super, super clean. I think if you're doing really low-level stuff... actually, Rust recently made it into the Linux kernel, so you could even write kernel drivers in Rust now. So that's looking pretty nice. And then this Carbon thing from Google,
Starting point is 00:59:01 who knows? I mean, it's so early, but it has a lot of big names behind it. Yeah, interesting. I didn't know. Like, how's the performance of running Rust drivers? That's a good question. You know, there are a bunch of people who are really, really fanatical about Rust. And so if you search for Rust performance, the results will be dominated by people telling you it's a million times faster than anything. It would be amazing to get someone on the show who's done embedded stuff in Rust. That would be phenomenal. I don't know, Patrick, have you ever used Rust? You do more embedded work. I haven't, but I think the end result when you use it in device drivers and stuff is it's
Starting point is 00:59:43 supposed to be pretty indistinguishable. I mean, it's sort of like all those arguments between C versus C++: people will say, oh, C is faster, and it is true that it is easier to use a feature of C++ that is expensive, versus C, which just won't have that feature. But it is entirely possible to write C++ code that is on par with C code. So I think the same is true in the Rust paradigm, right? It is entirely possible to write Rust code that uses a ton of dynamic allocation
Starting point is 01:00:15 and would not be suitable for use in a kernel driver. Or you write it in another style, and you get some of the benefits of the language, but you still get performance. So maybe I'm more old school here, but, you know, when writing these kinds of things, I'd go with C. I want to have full control over what's going on with the memory; I don't want any fluff in my code. Yeah, yeah, totally. I think the Rust memory checking has got to cost something, but on the flip side, you know, it does offer some kind of guarantees. Patrick, you used to do a lot of embedded stuff, right?
Starting point is 01:00:51 When you were doing the health stuff. So in that, did you do like bounds checking? And did you have a lot of memory sanitization? Or was there just not enough runtime for that? Yeah. I mean, I guess it depends. So running bare metal is what we'd call it, right? You don't have a formal operating system, so you don't even get most of the memory checks
Starting point is 01:01:12 you would get in an operating system. So you're accessing just, like, raw pointers, and stuff is memory-mapped in. And so that means you're just accessing some address, and there's no ability to do checking. There's no context switching. There's none of those things. So you don't normally get most of those internal things.
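To make that concrete, even MicroPython exposes this kind of raw memory-mapped access; the register address below is made up purely for illustration:

from machine import mem32  # MicroPython only, e.g. on a microcontroller

REG = 0x40000000           # hypothetical peripheral register address

mem32[REG] = 0x1           # writes a 32-bit word straight to that address
status = mem32[REG + 4]    # reads the next word; nothing validates any of this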
Starting point is 01:01:34 And C++ doesn't really have a bounds checker. You would have to be using, by convention, some abstraction or library which has it built in. So no, we didn't use any of that stuff. Got it. Yeah, that makes sense. Actually, this is kind of a segue, but every year for Halloween, we do robots that scare the kids. And so this year, I got my hands on a bunch of Pico W's, which are these Raspberry
Starting point is 01:02:07 Pi Picos. They're extremely small. I'm trying to think: they're maybe the size of a stick of gum or something. I mean, they're really tiny, but they have a Wi-Fi chip, which is amazing. And so the plan is, and it's actually behind me, ready to go now: when somebody reaches their hand into the candy dish, this motion sensor fires. And then all the Pico W's are sending little HTTP requests every half a second to this machine, saying, hey, have you seen someone? Have you seen someone?
Starting point is 01:02:41 As soon as it says yes, all these zombies start rising up from the dead. So it's a lot of fun. And I was able to do that whole thing in MicroPython, which really surprised me. I didn't even know that existed until about a week or two ago. It just shows you how powerful even these tiny embedded computers are getting, that they can run Python and it doesn't even matter to them. Yeah, yeah. Well, the Raspberry Pi became a very, very powerful computer. Like, you can run tons of things on it. It became very, very intuitive, in my opinion.
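A rough sketch of that polling loop in MicroPython; the network credentials, the URL, and the zombie-triggering function are all made up:

import time
import network    # MicroPython Wi-Fi API
import urequests  # MicroPython's small HTTP client

wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect("halloween-net", "secret-password")  # hypothetical credentials
while not wlan.isconnected():
    time.sleep(0.1)

while True:
    resp = urequests.get("http://192.168.1.10/seen-someone")  # made-up URL
    if resp.text.strip() == "yes":
        raise_zombie()  # hypothetical function that drives the zombie motors
    resp.close()
    time.sleep(0.5)     # poll every half second, as described in the episode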
Starting point is 01:03:15 Yeah, these are the Pico W's. I think they're... I don't know what they're running, to be honest. Patrick, do you know what hardware these guys have? I have really no idea, but it's powerful enough to run this MicroPython and do a web server and all sorts of other stuff. Yeah, I mean, they have their own framework, right? So it's not running with an OS; it's sort of like programming an Arduino. They have a bunch of libraries sitting around you to help you do common tasks, but there's not a ton of stuff going on in parallel. And so normally what happens is the Wi-Fi stuff is handled by an interrupt: an interrupt will fire, it'll do all the handling, and then go back to your program. But it's not true task switching like you would think about in an operating system. Oh, I see. Got it. Cool. So yeah, I think, you know, being an ML engineer has a lot of different, you know,
Starting point is 01:04:08 kind of traits: you might have to do serving-side work, you might have to do C++, you might not have to write a device driver, but you never know, you might have to do that. But you definitely have to be really concerned about latency, whether it's a web app or definitely anything in the financial industry. And so how do those folks keep the latency down? How do they fight that monster? So, to be honest, you'd be surprised, but a lot of times Python is able to meet the required latency goals. And when it's not, what we see most often is just taking the model, translating
Starting point is 01:04:47 it into Golang, and writing a microservice that runs this Golang-based model. And that's it. Oh, nice. It's not as big an issue as I thought it would be. Yeah. Oh, that is super cool.
Starting point is 01:05:03 Yeah, that's the issue. As soon as you go into C++ now, or even just as soon as you go into anything else, now you have your training code in one language and your serving code in the other. And so if you have some kind of feature transformation and it's not perfect, you know, it's not exactly the same in both languages, then you start introducing all these errors, and that can kind of go wildly out of control. Yeah, and I think that's part of the reason that we see more and more just Python being used both for training and for serving.
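One common way to avoid that skew, sketched here with made-up feature names, is to keep the transformation in a single module that both the training job and the serving endpoint import:

# features.py: the single source of truth for the transformation.
def transform(raw: dict) -> list[float]:
    # Made-up example features; a real pipeline would have many more.
    return [
        raw["age"] / 100.0,
        1.0 if raw["country"] == "US" else 0.0,
    ]

# In the training job:      X = [transform(row) for row in training_rows]
# In the serving endpoint:  y = model.predict([transform(request_payload)])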
Starting point is 01:05:36 And as for the transformations, I think that Apache Spark became very, very popular. So you can use Apache Spark both for training and for serving. In serving, though, it won't really work in real time; it would be more for batch models than real-time models. Oh, cool. I didn't know that. Neat. So yeah, I think we
Starting point is 01:05:55 kind of did a really good 360. We covered pretty much all of the different aspects of being an ML engineer. If this stuff all sounds exciting to you, then you might have a new career to look forward to. You should definitely dive into it. And it's growing at, you said, what, 30%, 40% annually.
Starting point is 01:06:15 So there's a ton of folks who are going to be looking for people with that skill set. So why don't we shift gears and talk about Aporia. So how does Aporia relate to ML engineering? Is it a tool for those folks, or is it kind of a supplement? How does that work? Sure. So Aporia is an observability platform dedicated to machine learning. What does that mean? It means that Aporia will usually integrate with the existing ML platform of the organization. And it allows the ML engineer really to have one centralized place where they can see which models are currently running in production,
Starting point is 01:07:01 what decisions they're making for the business. You can create your own dashboards and your own custom metrics, and you can define monitors within Aporia. So you can get alerts when there's a drift or a data integrity issue. So all the different incidents that we've discussed in the last hour
Starting point is 01:07:18 could be identified with Aporia. And one of the cool things with Aporia is the fact that the system is super customizable, up to the point that you can actually write Python code within our system and let Aporia run it for you. So it's as customizable as it gets. Very cool. So someone can say things like, you know, keep track of my... and they might have some key performance metrics for their team or their company... and they say, I'm going to feed to you
Starting point is 01:07:52 the performance metrics each day. And then you could do some anomaly detection. And then maybe I even feed you the performance per model. And so if a model starts underperforming, boom, they get a Slack or an email or an SMS or something. Exactly. It's just that sometimes
Starting point is 01:08:14 calculating performance metrics on live production data is a bit more complex. So Aporia is actually doing it for you. So all you need to report to Aporia is the feature vector, like the inputs of the model, and the predictions. And when you have ground truth,
Starting point is 01:08:30 you can also report it asynchronously. And Aporia will calculate, you know, F1 score, precision, recall, all these metrics on its own, and will identify anomalies. Oh, very cool. And so how does that work? So you hook up Aporia; let's say you're training using PyTorch Lightning or something. Is there sort of an Aporia callback that you do during training? Yeah, exactly. So there is a Python SDK or REST API that you can use to integrate it with your serving. And from there, Aporia will just have this data and work with it. Cool, that makes sense.
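For reference, those are standard metrics, and once the ground truth arrives they can be computed with scikit-learn, for example:

from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0]  # ground truth, often reported later (asynchronously)
y_pred = [1, 0, 0, 1, 1]  # the predictions that were logged at serving time

print(precision_score(y_true, y_pred))  # 0.666...
print(recall_score(y_true, y_pred))     # 0.666...
print(f1_score(y_true, y_pred))         # 0.666...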
Starting point is 01:09:09 So is it, I guess, scraping the calls to TensorBoard or something? Or do you manually tell Aporia, like, here's the loss, here are the other metrics? Oh, no. You can think about it like you do with any other logging library, right? So after you call model.predict or predict_proba, you'd call aporia.logPrediction.
Starting point is 01:09:32 You'll pass the feature vector, like the data frame, as well as the prediction data frame. Behind the scenes, Aporia will report this information to the back end of our system. Internally, our system will calculate all sorts of aggregations and metrics on your live production data. It will also use training as a baseline for that behavior, and it will find anomalies.
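Putting the flow being described into a sketch; the exact Aporia SDK function names and signatures here are assumptions based on the conversation, not the documented API:

import pandas as pd
import aporia  # assumed import name; check Aporia's docs for the real SDK

def serve(model, features: pd.DataFrame) -> pd.DataFrame:
    predictions = pd.DataFrame(model.predict(features), columns=["prediction"])
    # Log inputs and outputs right after predicting; Aporia's back end then
    # computes aggregations, drift, and anomaly detection on its side.
    aporia.log_prediction(features=features, predictions=predictions)  # assumed name
    return predictions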
Starting point is 01:09:51 Got it. Cool. So let's transition a little bit from the individual now to some giant. Let's say I work for, you know, giant Corporation X, and giant Corporation X wants to keep all their data in house, and they have all this paranoia about everything. You know, how does that work? Can you put Aporia into someone else's, you know, virtual private cloud?
Starting point is 01:10:23 Yeah, so it's a great question. And it's not only large, huge enterprises and corporates. I think that when it comes to machine learning, businesses are always very, very sensitive about it. It could be PII, it could be their business-sensitive information. They don't want to share it with an external vendor, for whatever reason. So Aporia was built from the very first day with security in mind. So you'll only deploy a very small workload in your VPC. This will be the only part that touches the data.
Starting point is 01:11:00 It will calculate the aggregations and metrics, and then Aporia will do the rest for you. So no sensitive data will ever leave the organization's VPC, for example. I see. So you have, maybe to use the term from Datadog, an agent that runs in someone's VPC, and then by the time the data exits that agent, it's aggregated at such a coarse granularity that you don't have to worry about it.
Starting point is 01:11:29 Exactly. Cool. That makes sense. Yeah. So I see. And so then folks can go to Aporia and see how their models are doing in aggregate and all of that. And then that can feed back into tickets for the developers and all of that. Yeah, like you can connect it with your Jira, for example, and create tickets saying, well, we have new data, we need to retrain our model. Maybe we have a segment of our data where the model is underperforming.
Starting point is 01:12:00 Maybe it would be worthwhile training a separate model built dedicated for that kind of data segment. So you can drive insights from the system and then connect it to the rest of the tooling that you're using. Cool. That makes sense. It sounds like another layer on Segment. So it's something like Segment... I hope I'm getting the right name, but there's a company, I'm pretty sure it's called Segment, where you can define different rules. So, you know, when this Amazon queue grows too large, then send a Slack message and a text message to Bob or something, right? And so Segment lets you kind of connect all these different rules. Or, when I get something in this queue,
Starting point is 01:12:48 put it into this data lake, right? And so you're doing that, plus a piece on the front to kind of filter and do some pre-processing. So, like, not every single feature vector needs to become an SMS to somebody, but you're getting all these feature vectors and then saying, oh, I've hit some threshold now, which is defined by the person who's using Aporia. Now I need to send something to somebody. Yeah. So you can literally build the sentence: when there is a data drift
Starting point is 01:13:26 in my churn prediction model, send an alert on a Slack channel to Jason, for example. Right, so you can do that. But you can also use Aporia to create your own dashboards. You know, you want to see what impact your model is making. You want to explore the different behavior of the data, like distributions, how different metrics are changing over time. So you can see all of that and create your own dashboard. Oh, that is super cool.
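As a purely illustrative sketch (an invented structure, not Aporia's actual configuration syntax), a rule like that might look something like:

# Hypothetical monitor definition; invented for illustration only.
churn_drift_monitor = {
    "model": "churn-prediction",
    "condition": "data_drift",     # fire when the input distribution drifts
    "action": {
        "type": "slack",
        "channel": "#ml-alerts",
        "mention": "jason",
    },
}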
Starting point is 01:13:57 Does it plug into these different dashboard things, like Superset and, what's another one... there's a whole bunch of these different ones... Looker. Like Superset and Looker, does it plug into those? Um, no. The solution itself is kind of a web service; it's like any other SaaS solution. You sign up, you log in, you get your dashboards, you can do the integration, you can set up alerts. So the dashboarding part is within the Aporia system. Okay. Very cool. What about folks who are just graduating, or in their final years of university? Are there internships? Are there full-time positions? And is it remote? Is it local?
Starting point is 01:14:42 What's the state of hiring over there? Sure. So we're constantly growing. We raised our Series A of $25 million not that long ago. Oh, congrats. Thank you very much. So we are always very, very happy to have more skillful and smart people join our team. So definitely. Very cool. What about interns, like summer internships? A lot of smaller companies don't really do that yet. Is that a thing at Aporia, or not yet? So we're doing it kind of on an opportunistic basis. So we've done it a couple of times in the past,
Starting point is 01:15:23 and it's case by case. Got it. Cool. Yeah, we'll post a link to Aporia's careers or jobs page; we'll find the appropriate link and put it in the show notes. Great. Well, this is super exciting. What is one unique thing about working at Aporia? Something that is kind of special, that kind of makes it distinct? You know, it could be that there are football tournaments on Thursdays, or it could be a team outing that you all do, or anything like that. I think we're solving very, very interesting, impactful, and challenging problems. And what you'll find, if you ask people working at Aporia, people that have been here for two, two and a half years, is that they're still learning
Starting point is 01:16:13 a lot of new things every day. So I think that's pretty crazy, right? Like, usually that happens only within the first month after you join a company, but that's not the case here. Very cool. That is awesome. Yeah, well, this is filling a huge need in the community. I think I really haven't come across anomaly detection as a service and kind of model tracking as a service before. It's a huge, huge opportunity. And I'm really excited that we were able to get you on the show.
Starting point is 01:16:45 And I'm going to definitely be following this and see where it goes. Thank you. Thank you very much for having me. I really enjoyed the conversation. Cool. And thanks, everybody, for listening. Thanks for supporting us on Patreon and on Audible. It's always kind of special to get emails.
Starting point is 01:17:03 We got... actually, a lot of folks have been emailing us lately, you know, about their kind of experiences. And, you know, it is a rather tough labor market, so it's always extra special when people reach out and they say, hey, I got ramped up on some new skills thanks to Programming Throwdown, and I got a job at this place. All of that kind of warms our hearts, fills our bucket. So thank you for that.
Starting point is 01:17:32 And we will catch you in a couple of weeks. Music by Eric Barndollar. Programming Throwdown is distributed under a Creative Commons Attribution-ShareAlike 2.0 license. You're free to share, copy, distribute, and transmit the work, and to remix and adapt the work, but you must provide attribution to Patrick and me, and share alike in kind.
