Drill to Detail Ep.77 'Keboola, Scaling Analytics and Winning the Looker Join Hackathon' with Special Guest Pavel Dolezal

Episode Date: January 24, 2020

We return with a new episode featuring Keboola CEO Pavel Dolezal on scaling analytics adoption beyond technical developers, the original story behind Keboola, and how their team and friends won the recent Looker Join 2019 Hackathon event in San Francisco.

Links:
- Keboola homepage
- “DataOps is NOT Just DevOps for Data” - Data Kitchen Blog
- Looker Brings the Data Community Together at JOIN 2019

Transcript
Starting point is 00:00:00 Hello and welcome to Drill to Detail, and I'm your host, Mark Rittman. So I'm joined today by Pavel Dolezal, CEO of Keboola, a company I first came across at the London Looker Join conference last year and more recently at the Looker Join event in San Francisco, where his company won the hackathon event. So Pavel, welcome to the show. It's great to have you with us. Hey Mark, thanks for having me on the show. It's really a pleasure. I'm a big listener to your podcast and I love them. I love the people you bring on and the level of technical detail. So thanks for this. And yeah, as you said, my name is Pavel and I was born in Europe, in the Czech Republic. And basically I was fascinated by technology since I was very, very,
Starting point is 00:00:56 very young. And then, you know, I, when the internet came, you know, I started to actually do my own internet businesses with a couple of friends. We started the search engine internet portals in Europe. I was part of the leadership team. And then we actually were part of a Microsoft network for a while. We were in several countries, developed over 50 different products, like consumer-facing, B2B-facing. It was late 90s. And we were all based on.NET, Microsoft SQL, and all that, you know, environment.
Starting point is 00:01:32 Then my second company actually later started, passed as growing performance media company. And within two years, we were actually servicing 15,000 businesses all over Europe. We moved our stack from Microsoft to, you know, open source, Ruby on Rails, CloudDB, Redis. That was that time, you know, everything was changing and then cloud started to catch on. And, you know, like servicing 15,000 clients meant that we were really automating a lot of things
Starting point is 00:01:58 and we were doing a lot of that with data and machine learning. But in both of those ventures, I would always have dozens of people taking care of data infrastructure and the process to get to something, to answers, insights, it was very linear. As a business owner, I would say, well, I have an idea, I would like to explore this.
Starting point is 00:02:22 And I would issue a ticket and ticket goes to IT, you know, then goes back. I was like, well, it's really good. I explored it, but something missing. So it goes back to IT. And before you know it, we have four iterations and, and the, you know, the answer is, you know, nobody cares, you know? So later on, I actually, you know, when I sold that company, I, I invested in a couple of startups. And one of these companies was actually doing Google Apps and Google Cloud implementations into large banks. And I was working with the CIO of that bank. And he actually wanted to quickly analyze data about engagement of their employees and from totally different sources, like internal systems, external systems, RESTful API. And I was like, well, it's going to take a year, right?
Starting point is 00:03:08 It's like, well, you tell this cloud, so I have a cloud challenge for you. You know, like I care about speed, accuracy, and cost. So that's how I found Kevula because nobody else was able to do it. And it was really interesting. Back then, Kevula was really an agency which was specialized in moving clients to cloud and helping with their systems and data integration for
Starting point is 00:03:32 analytics they were doing as integration projects right so but they already had this internal tool which I really fell in love and they were the only ones who were able to deliver what I wanted from them in 12 hours so that got my attention that's kind of like my story so my understanding of Kaboola is it's um it's it's a product that came out of the consulting work your company was doing originally um so you were developing uh I suppose tools and techniques which you eventually found that could be turned into product and is now being offered for sale as a packaged up accelerator, I suppose, really, for businesses looking to scale their analytics and to deliver things faster.
Starting point is 00:04:14 But there's quite a few companies out there that do this. So what is it particularly about Kaboola that is its unique selling point? And who are you aiming this product at particularly? So, you know, I have to say we typically would serve, you know, more and more two groups of, you know, customers. One of them would be hyper growth customers and the second one would be kind of like the large enterprises. And as you say, you know, like we've been around
Starting point is 00:04:40 for some time taking what we had internally and prioritize that. And what we see, there's some, you know, pressures in the enterprise kind of like world with more and more people wanting to scale analytics, right? And what we really kind of like address for those people, you know, is kind of like three problems. The first one is when, you know,
Starting point is 00:05:00 there is a rise of agile teams, you know, cross-functional teams across the enterprise, and their desire to actually do analytics and move from those linear ways to non-linear ways and actually analyze data from. Gartner says there is over 1,100 applications, which are SaaS, used in typical enterprise, plus all of the old systems. So this rise of people and citizen analytics, how you of the old systems. So this rise of, you know, like people and how to, how you, citizen analytics, how you scale it to them. Second, it's kind of like, you know, when you, when you want to democratize, you know, this analytics and scale it across the company, you know,
Starting point is 00:05:37 it's usually with our customers, we see the rise of, you know, like 10% of the whole company would really need to do, you know, deep analytics, not just, you know, looking at the dashboard or really do analytics. If you want to go the traditional, you know, IT way, what we have found, you would need actually, you know, to support for three to five business people, you would need one IT person. And that's just not scalable. If you talk about 10% of people doing analytics, because you would have to hire hundreds of IT personnel
Starting point is 00:06:06 to actually support that linear approach, right, that we talked about a minute ago. And then, as I said, there's that plethora of data sources. And not only data sources, but also tools for analytics. Somebody wants to use for something SQL. Somebody wants to use R. Somebody Python. Somebody Julia, MATLAB, OpenRefine, and on and on you go.
Starting point is 00:06:31 And also, you know, what you want to do with the data changes. You know, originally people just wanted to look at the data. Now they want to do, you know, a simple, you know, BI, move to machine learning and all the way, you know, do neural networks and applying third-party components. And all of that, you know, is for insights and automations, right? And all of that creates a huge stress on organization design where traditional IT and business roles are really separated, right?
Starting point is 00:06:58 The responsibility and stuff, right? And especially when you think about the legacy technology. And so what we saw, you know, like companies actually wanted to move from this. They wanted to become more agile, you know, like create cross-functional team. And really the reason was to compete with Silicon Valley tech giants, right, who are actually really fast and traditional business. How can we compete with it? We need to really, something. We need to augment the typical humans with data powers, make them into data heroes and automate the manual, the boring stuff so they have more time to be creative.
Starting point is 00:07:35 And the company moves from he does this, she does that, and they are siloed into, hey, let's bring the cross-functional teams into Agile and create tribes. So, but it's hard. What we typically see that companies take two routes, right? You know, previously. They would, you know, like either resignate and because we are in that old world, you know, kind of like Oracle, data warehouse,
Starting point is 00:08:00 a lot of data mars, very rigid, you know, structure. And that's, by the way, what Looker, CEO, talked to Looker, Frank, calls the first wave of enterprise architecture, where only a selected few could really do something. We talked about it a little bit
Starting point is 00:08:17 when we were preparing for this call, and the tools like the Oracle, they were brilliant when you talked about being used by IT people. And this didn't really scale. And the cloud came and, you know, suddenly you would have all these, you know, small tools, you know, as you talk about stage, DBD or whatever. And anybody could use it. And, you know, as Frank, you know, Lucrezio says, this is the second wave of, you know, like analytics architecture.
Starting point is 00:08:45 And then there is no security around it, right? So the second wave is kind of good, but not for enterprise, and it creates a lot of shadow IT. So what we see now is actually the need to kind of come at the third wave, which is the abstraction layer on top of these two waves, which actually gives, you know, the users the ability to use the tools they want in the teams they like, which are cross-functional, reuse what they build. And still, you know, like that has to be augmented with enterprise-grade security, you know,
Starting point is 00:09:18 metadata, lineage, you know, procurement, certificates, et cetera. So the way we think about this, it's kind of like if you think about what twilio did you know for communication they actually put an api layer on top of all communications right or stripe they put an api layer on top of all the payments so what we are in a sense doing we are putting a data ap API layer on top of the whole data analytics and automation journey, on all the infrastructure which is there, on all the, you know, processes. So we can abstract
Starting point is 00:09:51 the users, you know, from actually the complexities. This results kind of like in a serverless implementation, you know, like of the data kind of like analytics journey, where people say, well, this is what I want to do, you know, the code, and this is the data I want to work with, right? It's a little bit more difficult or different from the, you know, serverless implementation
Starting point is 00:10:14 in, you know, development, but I think it's very, it's actually very good. What this abstraction actually allows us is to, you know, have, you know, all the data goes in through API, goes out through API. All, you know, processes go in and out through API. And that, you know, going through API layer, we have all the metadata, we have all the payloads, user activity, changes in data. It creates automatic lineage, right, and creates all this data intelligence, which is done automatically on the fly and can be actually used by, you know, like company procurement and security departments.
Starting point is 00:10:53 We put all the data in GradDB. It's very easy for users to use. So if you think about the environment kind of like Looker or Tableau, it just usually kind of like starts with, you know, I want to do BI, right? I want to connect to new systems or old systems. Then, well, I want to do transformations. Well, I actually need to do some machine learning and I want to use Python. Oh, no, no, I want to use Julia.
Starting point is 00:11:16 Hey, we need to work together as a team. And we need to work, you know, all of this, all of this data operations, if you will, along the whole journey. It's kind of like, just like really data operations, if you will, along the whole journey, it's kind of like just like really difficult because too many moving pieces. So that's why we created that abstraction, you know, like API layer that would enable actually the companies to actually really move into data ops way. And I know if you're familiar with the guys from Data Kitchen, they actually wrote a very
Starting point is 00:11:44 good piece, you know, about DataOps. And they kind of like stated several, you know, like things, how DataOps is different from DevOps, right? So, yeah, like we all know that it's kind of like the DevOps is kind of people know, and you think about it, you know, how it applies to data analytics, you know, and how it enables the agile teams, it's kind of like the guys actually say three components, right? First of all, we have different users in data analytics. In DevOps, they have just the engineers, the developers, right? And they are very technical.
Starting point is 00:12:24 They want to, you know, like tweak things. They love to learn new technologies and new approaches. In the data analytics, you know, I just counted, you know, we have over, you know, six user groups, right? From IT developers, data engineers, data scientists, data analysts, business analysts, right? And, you know, except for IT developers, five of those groups don't really care about the code and tweaking the code.
Starting point is 00:12:50 They all care about business outcomes, right? And they, so we need to support them and automate the process around them so they can actually create agile, you know, teams. Second piece of the DataOps kind of like manifesto, if you will, that the Data Kitchen guys put together, it's kind of like, well, we need to adopt the DevOps methods
Starting point is 00:13:08 for continuous integration and continuous testing because we have two different systems, if you will, in data, right? We have innovation cycle where we need to be really speed, velocity, different tools, agile teams and everything.
Starting point is 00:13:25 But then we need to kind of like move it into the second cycle, which is a production cycle. So this is like DevOps methods, you know, thinking the second pillar enables us to do that. But the third pillar is actually the production cycle. And if you think about it, you know, like the production cycle, the models and pipelines, you know, and automation workflows, they need to run both in cloud and some of them on-prem, and they need to be, like, really rigid. They need to be scalable, right? And we need to be able to test them and actually measure that production cycle.
Starting point is 00:13:55 And it actually brings together the methods of lean manufacturing, right? And so when you think about this, you know, it's kind of like when you ask us, well, how do we fit in? We say, well, we are kind of like a serverless data operations platform. And what we enable is to kind of like scale business insights and revenue through the whole organization,
Starting point is 00:14:16 not just parts of it. And automate workflows and enable click and mainly risk-free secure experimentation in that innovation cycle with one click, you know, putting the models and pipelines into production. And that all is enabled by the API abstraction layer. That's how we fit very well in that world,
Starting point is 00:14:36 you know, of the new tech together with old tech. So the product sounds quite technical. And, yeah, although obviously you were talking about being aimed at business users, I mean, there's quite a few products in this space. You've got from one extreme maybe tools like DBT or Dataform where that's about trying to make analysts into engineers. But it's quite kind of basic and scripting and so on. And other extreme, you've got maybe more established,
Starting point is 00:15:05 maybe more old school tools like Wirescape Red. Who actually, who do you see as the target market for this? And who is the persona that you're developing this product for, really? Well, that is actually the great question. That's the only question that matters, right? You know, like what value you bring to your users and who are they, right? You know, like all of those guys we know, most of those guys are really, really great,
Starting point is 00:15:30 you know, technical people are really great people, but they come to the problem, you know, from a purely technical perspective, right? And I would say those kind of like, everybody comes to a problem, you know, like from two different angles, and I would say both are wrong. And the first one is kind of like everybody comes to a problem, you know, like from two different angles. And I would say both are wrong. And the first one is kind of like IT, right?
Starting point is 00:15:50 Everything says it's an IT problem, right? And therefore, we need to think about IT and their current needs and I just call this, you know, let's say Gartner Quadrant and my quadrant, which I know, you know, a little bit better, it's going to do the revolution, right? So if I kind of like, if I just, you know, like enable more access to people to data warehouse, or if I, you know, enable them to run just more Python notebooks, you know, better, or if I just, you know, enable better consumption of whatever Tableau Looker, you know, you name it, it just kind of solves the problem. But I believe this is dead wrong. Because, you know, people have their own bias and they box themselves into seeing, they don't see the whole journey of, you know,
Starting point is 00:16:38 they don't have the design thinking, right? They see just, you know, like they put themselves in the box. We do data warehouse. You know, we do data warehouse house virtualization we do transformations we do ETL we do whatever and then they actually think that the whole world is kind of like you know that better date our house or better whatever and I just I just don't think that it really works because like if you if you if you will live inside the white box your whole world seems to be white right so what you need to do is kind of like step out you know
Starting point is 00:17:12 think outside the box and you know like really look at the whole journey as a complex journey talk to users you know talk to them you know like talk to you know the end users and you know like see that the typical enterprise has 1100 you know outside SaaS applications together with old you know Oracle warehouses together with you know like like together with with you know like the Hadoop like together with everything right and you, things like ETL, MDM, or, you know, different ways of integration, then, you know, like suddenly become just a method that you need to do to actually get to the target, right? And so our target and our users,
Starting point is 00:17:57 we come from the business users. We come, you know, from the perspective, you know, of that changing world where, you know, like, you know, We come from the perspective of that changing world where this whole data democratization and moving company into agile tribes is not going to happen only with hiring more IT. We actually have people in business departments who are more and more tech savvy you know we have very very similar view of the world as looker has right they said well they know they i love your your your kind of like your podcast with colin right colin zima and he said well the guys at looker did very good thinking in the beginning that you know know, they need to enable more, you know, engineering,
Starting point is 00:18:47 not less of it, right? Not kind of like click and fall. It just doesn't work. And so, you know, if you look at the statistics, just DataCamp, you know, they have over 3.3 million students. 3.3 million. That's huge. I mean, those are not just IT folks, right? All of those, you know, students of DataCamp
Starting point is 00:19:09 are people who are within business units. They are the analysts in marketing. They are the guys in logistics, right? They sit in procurement. They are at risk departments. And they need to, you know, connect those 1,100 applications. And they are in sales, right? And they want to do something meaningful.
Starting point is 00:19:27 So if you can think about this, for them, just creating better design or data warehouse just doesn't count, right? It's all a whole journey. Insights, data into insights, insights across the whole company. From insights, there are actions and you need to automate,
Starting point is 00:19:44 you need to build machine learning models to automate those actions and the whole company. From inside, there are actions and you need to automate, you know, you need to build machine learning models to automate those actions and the whole customer journey, whole, you know, whole, you know, human work, you know, like people journey
Starting point is 00:19:54 inside the company. And that's what we help them to do. But, you know, we start with business and then we go to IT. And then, you know, IT actually tends to love us
Starting point is 00:20:04 very, very dearly because for them, you know, they are always thinking, well, they. And then, you know, IT actually tends to love us very, very dearly because for them, you know, they are always thinking, well, they are very smart, you know, all these people, right? And they see, well, if we need to support these thousands of, you know,
Starting point is 00:20:14 citizen users in the business department, how are we going to do it? We don't have enough people. We can't hire more people because first, there are not enough great people. Second, we don't know how to manage it.
Starting point is 00:20:25 So for them, for IT, we just become this great tool that enables them to scale business inside and to scale machine learning creation and automation of processes into every single business department with full control auditability metadata. And that's why, you know, like solving just for one, you know, persona in the whole journey just doesn't work, you know. So, you know, for the benefit of anybody who's never, you know, seen Kaboolian Action, just walk us through what a typical development session would look like. Yeah, well, you know, thanks for that. I will just use the
Starting point is 00:21:06 Looker hackathon as an example, right? But first, kind of like, you know, when you say people, you know, picture Kibuwa, right? I like to use this analogy, right? Like, I remember 20 years ago, you know, almost 30 years ago, when I started with, you know, with the IT and development,
Starting point is 00:21:22 when we wanted to send an email, right, we would have to buy a server, right? Then we would have to install an operation, you know, a system on the server. Then we would have to, you know, install service spec one, service spec two, whatever, right, patches. Then we would have to kind of like install the mail server and then blah, blah, blah.
Starting point is 00:21:39 And before the end of the day, when we would kind of like want to send an email, it was like three days of preparation, right? And that's kind of like the way of the day, when we would want to send an email, it was like three days of preparation. And that's the way, the state of the market that we have in analytics. It's very fragmented, a lot of different things. If you want to do something, send an email, do an insight or automate, you really need to be very technical and spend 80% of your time just putting together that stack, cleaning data. And then fast forward 20 years, and then we have Gmail, right? And just like Office 365 or whatever, right?
Starting point is 00:22:12 You just come to a survey. You say, I want to send an email. It's an address. You put in the code, your text of the email. You press send, and it's serverlessly done for you. You don't have to do anything. And that's such a huge productivity booster productivity booster or google docs right we are exchanging together right notes across you know through google docs i see your changes use your mind it's done you
Starting point is 00:22:36 know totally automatically and now let's you know talk about our user journey so for uh for the uh for the looker hackathon you know we we came there and we're like, hey, yeah, let's do a hackathon as well. What are we going to do? Well, what can we do? Okay, well, what can we do? Where is the data? Well, there's a lot of data on Twitter and on social media and how people talk about this Looker join, right? Well, you know, typically we're like, okay, how are we going to analyze it? Well, I can get some Python library, maybe hack it, you know. Well, I'm going to start our database. Well, what is the schema?
Starting point is 00:23:13 How are we going to do it? You know, okay, well, I'm going to. So for us, you know, we were like, okay, we start from the end. So our end goal was kind of like we want to show how are people talking about Looker join? How is Looker, you know, talking, you know, about sponsors? How are the sponsors talking about Looker? What is the frequency? What is the quality of the text?
Starting point is 00:23:34 You know, et cetera. What are we going to need on this journey? Well, first, we need to connect to some data sources, you know, some REST API of Twitter, let's say. Then maybe we'll need to scrape some data sources. You know, we need to, you know, at some RESTful API of Twitter, let's say, then maybe we'll need to scrape some data sources. You know, we need to, you know, take the data in. We need to put it in some blob storage because we don't know how much of it is going to be, you know, down.
Starting point is 00:23:53 Then somebody needs to spin a SQL server so we can put it, you know, structure it from blob into SQL. What about the coding of that? Well, it's a JSON. Who is going to, you know, materialize the JSON? And then once it's's a JSON. Who is going to materialize the JSON? And then once it's in SQL server, who is going to run transformation
Starting point is 00:24:09 on that? Who's going to clean it? Who's going to join it? Oh, by the way, how are we going to know what is the sentinel data? Well, somebody needs to connect Google API for NLP or some other software. Who is going to do it? And before you prepare the data for actual analytics, it's days and days, right?
Starting point is 00:24:30 Because you need to prepare the whole stack. With us, it's totally different. Still, the same steps are needed to be done, right? You need to connect the data. You need to scrape it. You need to warehouse it, blah, blah, blah. But with us, you need to scrape it, you need to, you know, warehouse it, blah, blah, blah, blah. But with us, you get one platform. We create
Starting point is 00:24:47 a project, and there are already prepared extractors, right? So, Twitter extractor, yeah, click, boom, you know, two minutes, and the data is being extracted, the data that we want, you know, the form that we want, it's not, you know, preset, from Twitter,
Starting point is 00:25:04 and it's being automatically put into our blob storage. Then automatically the JSON is being materialized into CSV files, which are being stored into Snowflake. So within half an hour, you have the data in Snowflake, and already somebody can actually, with all the metadata, with all the information, who touched it, who did that, right? And somebody can start, you know, actually do analysis and create transformation on that. So, you know, actually we got people from other teams, you know, actually joining us in that, right?
Starting point is 00:25:36 And so somebody said, well, I'm going to clean the data. I'm going to use, you know, SQL. And they went into Kibuwa, started with one click a sandbox, which actually, you know, gave them the data that was already warehoused in the Snowflake, which is connected under Kibuwa. And it gave them, you know, like separate Snowflake warehouse. So it's just sandbox for them. So they developed in the sandbox environment. You know, they clean the data, they join it.
Starting point is 00:26:02 And then they registered, you know, the code, which was production ready, into our transformation folders. And then, you know, in the meantime, you know, somebody else was in the same project actually, you know, pivoting the data using Python. So what they did, they actually went into sandbox. I want the sandbox, you know, one click. It has to be, you know, like with Jupyter Notebook, it has to have, you know, certain amount of, you know, like memory for it. And, you know, automatically, you know, our API gives them the data from that snowflake instance, right?
Starting point is 00:26:36 So here you go. You know, at the same time, you know, without touching all those, you know, underlying infrastructures, two people with just one click got two totally different sandboxes. They develop, they register it into our transformation layers. And then what about, you know, NLP? So somebody else actually meanwhile
Starting point is 00:26:55 went into our marketplace where there is over 900 different apps, right? And said, well, there is an NLP app from Genia. One click, you know, edit what is going to be the, you know, the file, right? And he said, well, there is an NLP app from Genia. One click, you know, edit what is going to be the, you know, the file, right, that's going to go in. And what do we want? This, you know, sentient, you know, scoring, positive, you know, okay, entities, click. And meanwhile, you know, data was sent to Genia and received automatically back through Kebula API into Snowflake. So within really just two hours, you know, four people worked on that, you know, together in one project.
Starting point is 00:27:31 And Kebula kept, you know, kept track of every single action that everybody did in this environment. Because again, all of that was happening on top of the data API. And all of the data actions and jobs, you know, I want to run the SQL in the production environment with this data. Then Kebula created the job, which was, you know, run through an API. And we had all the metadata about who did it, what data it touched, you know, how is the data tables, how are they connected together, the lineage. And what was the, you know, what was the you know what was the start what was the result and so then you know we just went and just created a looker dashboard on top of that
Starting point is 00:28:12 with one click we've automated this whole journey so it would be happening every hour for example right and so that's how kind of like you know what it was it was kind of like done you know like you didn't have to start anything you know you didn't have to integrate together five different pieces of technology because everything is just provided for you. themes of the conference and the product announcements by Looker was around packaged applications and building applications you know directly into into Looker creating marketplaces and so on you know is this is this something that's that's that's kind of a threat to Caboola is it something where you know it aligns with what you're trying to do what's your thoughts on that? No we love them we actually you know like with Looker team you know we were preparing you know like in the new marketplace we actually, you know, like with Looker team, you know, we were preparing, you know, like in the new marketplace, we are preparing, you know, several different, you know, applications with Kebula technology, which we call scaffolds.
Starting point is 00:29:11 And, you know, the idea is kind of like the Looker is a great tool. It's a great platform. And they have a very big focus on the front end, right? On the kind of like, you know, the insights and what you can see, how you deliver the data experiences to end users. Our focus is more on the platform side, like one layer below, and to automate all of these different orchestrate, all of these different tools,
Starting point is 00:29:40 different processes, and different people together. And so what we did a couple of weeks prior to Looker Join, we were actually using our scaffold, which basically, you know, everything I described about that Looker hackathon that we did, all of those analytics, you can actually now package together into one scaffold. And with just one API call, you can call it and it just creates a new Keboola project for you. It connects to data sources. It creates them.
Starting point is 00:30:07 It warehouses them, you know, and it runs all that, you know, SQL code, Python code, and creates a Looker application and all of those dashboards. So now if you go into the Looker marketplace, you know, you can see several of these. For example, there is a whole subscription
Starting point is 00:30:25 analytics with Looker powered by Keboola scaffolds. There is a whole user behavior application in Looker powered by Keboola scaffolds. So I think these two are very powerful when combined. And I totally subscribe to the Looker way, because when we want to democratize across the companies, we really need to allow the internal data tribe people to actually prepackage certain things
Starting point is 00:30:59 for their users. So people don't have to reinvent the wheel all the time. That's a big thing. You know, when we see companies go really, you know, like up to 10% of their workforce doing like advanced analytics, we see a couple of things, you know, like first they need templates so they would not have to kind of like rebuild the stuff. That's kind of like our scaffold and Looker application. Second, there needs to be, you know, the automated kind of like analytics
Starting point is 00:31:27 on the user behavior. And, you know, in the sense that we want to help the users if they get stuck. For that, you know, we are using all of that metadata and activated data that goes from our data API. And we have something we call predictive support. So, you know, imagine you own the data tribe, you own the data
Starting point is 00:31:48 initiative and you see somebody kind of like going, trying something and then it just goes down. They don't finish. Their jobs are still running with error. We will let you know automatically. So you can reach out to them and you can just like immediately jumpstart them. This works
Starting point is 00:32:04 really wonders at the company. And a third is, kind of like, you need to play in a wider ecosystem, because nobody's going to solve all, you know, like those 1,100 applications in the enterprise on their own. So that's where I think Looker is doing a fantastic job, and they are kind of like enabling that last mile. We are trying to enable the miles before that, and together it plays very nicely, as we have experienced.

So another topic we talk about a lot on the podcast is this idea of scaling analytics and the democratization of analytics, and I
Starting point is 00:32:42 suppose analytics going beyond just dashboards and graphs to be part of a process and part of a workflow. You know, what's your thoughts on this area?

Well, you know, I would say that's the whole reason for being for Keboola, right? So, you know, like, we did a count on the typical journey that goes beyond, you know, dashboarding, by creating the machine learning models, and then using the models to actually automate the processes, right? There's over 17 steps, you know, on the complete data journey, all the way from, you know, data is somewhere, to insights are actually done, the machine learning models are created, and all processes are automated.
Starting point is 00:33:30 And then there is a feedback loop back into the system, so it can get better and better. And how we fit in: we care about the whole data operations alongside the whole journey. So we care about the user experience and scalability to 10% of the workforce. And, you know, we really are thinking, you know, why, when we go to companies, like, there's an example of a large financial institution, and we start with them, and, you know, these institutions, they have everything, right? They have Oracles, they have Hadoop, they have everything. But somehow, as you said, it doesn't work together, right? People were not able to actually scale those analytics, or they were able to scale, you know, dashboards, but not, you know, like the model creation and automation of those processes.
Starting point is 00:34:21 And so when we come there, we kind of like connect to those sources. We connect to Hadoop. We connect to, you know, like the Oracle data warehouse. And we let the teams connect to their applications, right? But that's the first step. The second step, which I think is really important and was really undervalued before, is, you know, we need to allow the users to create, you know, like agile tribes.
Starting point is 00:34:47 And by definition in agile, right, these are cross-functional teams. So in one team, you know, you have a data scientist, data engineer, IT person, and business person. And together, they actually create the whole process. And at the end of the process, there is a metric, right? And it's kind of like they are responsible for it. It's registered in a data catalog or something like that. And then anybody who wants to shop for data or use that metric in their dashboard or in
Starting point is 00:35:18 their model, they can actually see who is responsible for it, how it got created, and mainly in which other processes it's already being used. And if the data is getting old or if it's fresh. And this kind of changes the logic from only tiering the layer on top of the layer, integration on top of integration, until everybody loses the context. This agile approach, on the other hand, gives people, you know, the context.
Starting point is 00:35:47 They own the context. And the technology, you know, is supporting their processes. And the processes are supporting the people. So it's like the typical, you know, DevOps manifesto, right? And, you know, like when we go to that large financial institution, we actually introduced Keboola. You know, we connected to Oracle, we connected to Hadoop.
Starting point is 00:36:17 We let them connect to their, you know, like, for example, risk guides to all of their, you know, external sources from, you know, that are out there. And within six months, people in 14 different departments, you know, started to use Kebula, not only for dashboarding, that was the first time, but they started to actually create what we call business data apps. And business data apps is the, you know, like, yes, you can use the scaffolds to automate, you know, recreation, but mainly it's, you know, like what you talk about, right?
Starting point is 00:36:41 You know, the whole process from acquiring the data, putting some logic into it, enriching the data, yes, looking at the insights, but then creating action and using those actions to automate the processes. For example, one of these apps would be in the risk department. They wanted to kind of like, you know, bring down the fraud. So they actually created a business data app on top of Keboola, which helped them to actually get the IDs from users, right?
Starting point is 00:37:08 Then they would use the recognition software, you know, like as an app on top of that, which would say it's a male or female, you know, they would read the name and surname, they would read the EXIF files. And all that they did, you know, to compare it with the database of stolen IDs, which were kind of like used for fraudulent actions, right? So they wanted to kind of like, you know, decrease the probability that somebody's using
Starting point is 00:37:36 a fake ID. And at the end, you know, they actually had, you know, the list of white-labeled IDs, kind of like not used in fraud, right? And all of that, you know, they created within four hours. And, you know, the result of this, you know, business data app or workflow automation is a clean data set that can now then be used, you know, in multiple other, you know, use cases within the company. And this is the power. You know, like another example is kind of like the data app, you
Starting point is 00:38:06 know, people in the CRM department, right? They would actually first connect the data from different sources, mobile, web, you know, their CRM, ERP, and they would look for, you know, like insights, you know, looking through dashboards, right? Hey, what are the groups? You know, how can we, what can we, you know, they would start thinking about questions, right? And then within the same environment in Keboola, they would actually create
Starting point is 00:38:30 the next best offer, right? They would invite somebody from, you know, the data science department, or they would already have someone, or somebody external, and they would create that. And within just a couple of clicks, they would actually automate the whole journey. So now the data comes in, you know, automatically. It's clean, it's being compared with CRM, ERP, you know, it's being compared with some credit, you know, some scoring of theirs. And then, you know, next best offer, and, you know, the system, the model, kind of like decides: should I contact this person via email, via SMS, or should I give it to the call center? So they would contact them. And before, you know, it gets sent or orchestrated by Keboola, there is the risk model, which we
Starting point is 00:39:09 talked about previously, right? And that's always updated, you know, within a minute. And it says, well, this is the risk, you know, portfolio. We can do it or we cannot do it on the fly. It has to go for, you know, recheck. And again, this is another example. So these are, you know, like things when you ask me what is the secret sauce, how to actually change that scale of analytics, I really think it's kind of like think about
Starting point is 00:39:33 the whole journey, you know, from a design perspective, from the user experience perspective, right? And, you know, like change into a DataOps mindset with agile teams, DevOps techniques, and lean manufacturing processes in the production cycle. And, you know, like if you don't think about the whole journey, then you're just optimizing for a local maximum. You know, if you think about, oh, we can do the ETL process, you know, the best. Well, you optimize for a local maximum, which is one of the 17 steps of the journey.
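The two business data apps Pavel walked through earlier, the stolen-ID check and the next-best-offer flow with a risk-model gate, come down to fairly simple logic once the data plumbing around them is handled. This is a toy sketch: all field names, thresholds, and data are invented for illustration.

```python
# Toy sketches of the two business data apps described above.

def split_ids(submitted_ids, stolen_ids):
    """Stolen-ID check: separate submitted IDs into a clean ('white') list
    and a flagged list of IDs known to have been used in fraud."""
    stolen = set(stolen_ids)
    clean = [i for i in submitted_ids if i not in stolen]
    flagged = [i for i in submitted_ids if i in stolen]
    return clean, flagged

def next_best_offer_channel(customer):
    """Next-best-offer routing: the risk model can veto the automated send;
    otherwise a value score picks call centre, email, or SMS."""
    if customer["risk_score"] > 0.8:
        return "manual-recheck"   # too risky: a human reviews it first
    if customer["value_score"] > 0.7:
        return "call-center"      # high-value customer: personal outreach
    if customer["email_opt_in"]:
        return "email"
    return "sms"

clean, flagged = split_ids(["A1", "B2", "C3"], stolen_ids=["B2"])
print(clean, flagged)  # → ['A1', 'C3'] ['B2']
print(next_best_offer_channel(
    {"risk_score": 0.2, "value_score": 0.9, "email_opt_in": True}))  # → call-center
```

In the flow described, each function would be one orchestrated step, with the risk check always running on fresh data just before anything is sent.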
Starting point is 00:40:09 So it's very helpful. It's very helpful. But it's helpful for that particular part of the journey and that particular persona. So our approach is kind of like, we look at the whole data operations, you know, and we care about the user experience and the collaboration of users. And in those 17 steps, if they want to use, let's say, for the model creation, R or Python or MATLAB or Julia or Databricks, we don't really care. We care about what is their experience, and what, you know, data intelligence we keep in the process, and how we abstract the users from actually having to run the systems themselves.

So, you know, a lot of, many of our listeners actually are consultants or work in a consultancy or are independents. And they're probably thinking, well, is Keboola something actually that would replace us on a project? You know, is the whole point of a tool like yours
Starting point is 00:41:09 to actually do away from the need of a consultant or a partner to help with the implementation? Mark, thanks. That's actually a very good question, and we get it a lot. And, you know, like, no, totally not. It just allows you to do more interesting jobs. Because, like, what we have actually seen, you know, a lot of consultants would kind of actually seen, a lot of consultants would start and help with integrations and stuff.
Starting point is 00:41:30 And for that, the budget would be used, all of that. And before they get to interesting stuff, the clients are like, oh, should we spend all the money? With us, you get there faster. You save that budget. And actually, you can do interesting stuff. You can move from that initial dashboarding to actually what you said, you have the money, resources and time to spend it on actually scaling the analytics across the organization. Just to give you a number, in our typical implementation, we are very, very partner focused.
Starting point is 00:42:02 Our professional service has just six people globally. That's it. We do, you know, we do a lot of architecture, but mainly we are there to support our consulting partners. And so, like, when we go, you know, in the first phases of the project, let's say first year in the large company,
Starting point is 00:42:18 the amount of money to be spent on Kibua, you know, compared to the amount to be spent on services is one to six. Sowa, you know, compared to the amount to be spent on services is one to six. So actually, you know, we can see how much more, you know, like it actually creates because we unlock the data into business department. That's a fundamental change because then, you know, business people actually, you know, like they go, they have, you know, problems, issues, challenges, they try something.
Starting point is 00:42:46 But ultimately, they don't want to spend days and years. They want to innovate, but then take it into operations. They actually call on the consultants more and more and more. And that's been great. There are several companies around the Czech Republic that started with us. They were our clients originally. And there's at least three, four companies I can think of now, and they actually scale their business from one to 20,
Starting point is 00:43:10 30 people within one and a half years. You know, and we love, actually, we love consulting partners, because they are the ones who can, you know, bring the context and who can keep the context. And they have the fundamental skills.

So, Pavel, just to wrap things up really, how would people find out more about Keboola? Maybe download the product or find out some more data sheets
Starting point is 00:43:34 or background information? So, we have a website, www.kibula.com. We actually, if you are technical, you might really like our near-day blog, which is 500.kib.com internal server error. There you can, you know, read a lot of things about, you know, like how we, how we actually, you know, program Kebula, et cetera. We have a GitHub slash Kebula where you can read a lot of code. If you really want to, you know, read a business kind of like a related blog post, it's blog.kibula.com, Twitter, obviously, Facebook.
Starting point is 00:44:06 And we are trying to do a lot of community events, a lot of hackathons. We support a lot of hackathons, and we create them. We also started, several years ago, with a women-in-tech group. We started the group for teaching females how to do data 101. It was originally called Data Girls because it was aimed at girls. It kind of grew, and new initiatives started on top of that, like She Loves Data in Asia.
Starting point is 00:44:36 And over 10,000 people have now kind of gone through those initiatives, which I love. It's one of my most favorite things to do. And stuff like that, yeah. We try to be at the conferences, and my Twitter is pabu01, where I try to publish interesting stuff. But yeah, really
Starting point is 00:44:57 on the internet. And you can also go to try.keboola.com, and if you use the code guide slash mode, you will actually get a free project for 14 days with guide mode already in it. So you can actually test all this and see if it's, yeah, what I say.
Starting point is 00:45:18 Pavel, it's been great speaking to you. Thank you very much for coming on the show, and great product. Take care, and hopefully see you at some point in the future.

Hey Mark, it's been a real pleasure, and thanks for inviting me here. I really love your show, as I said, and, you know, have a great weekend. Thank you.
