Drill to Detail - Drill to Detail Ep.57 'WhereScape Red and Data Warehouse Automation' With Special Guest Neil Barton

Episode Date: June 18, 2018

Mark Rittman is joined by Neil Barton, Chief Technology Officer at WhereScape to talk about metadata-driven data warehouse design, automating the build and management of data warehouse infrastructure ...and the thinking behind his company's WhereScape Red and WhereScape 3D tools.

Transcript
Starting point is 00:00:00 So welcome to another episode of Drill to Detail, the podcast series about the world of analytics, big data and data warehousing. So I'm very pleased to be joined today by Neil Barton, Chief Technology Officer at WhereScape, a company many of you will know from their data warehousing and infrastructure automation products. So welcome to the show, Neil, and it's great to have you with us. Thanks, Mark. Thanks for the invitation. It's a pleasure to be here. So, Neil, tell us a bit about how you ended up working at WhereScape. What's your kind of route into the industry and what were you doing kind of on the way there and what interested you really, I guess, about this kind of area of the industry? Yeah, sure. So I started out, my background is Unix and Oracle. And I spent, I grew up in New
Starting point is 00:00:51 Zealand and then moved to Australia and spent a couple of years working for Sequent Computers in Sydney and also Oracle. And then went back to Sequent to work for one of their spinoffs called Decision Point, which was a prepackaged data warehousing software solution. Got a job, you know, relocation to here in Portland, Oregon. And then I knew the founders from my previous life at Sequent and Decision Point. And then in 2006, they were looking to hire a, you know, US-based technical pre- and post-sales person. And I'd enjoyed working with Michael and Wayne
Starting point is 00:01:26 a long time ago, so I decided to go and join WhereScape. Did that for a couple of years and then ended up going out and doing my own thing for a little while, and then in 2014 they came back looking for someone here in the US to help focus on the big data space. And I always liked the company and particularly loved working with the people at WhereScape. So for me, it was a good opportunity to come back and work on something interesting with an excellent team. And four years later, here we are. Okay. So what is it you do at WhereScape then?
Starting point is 00:01:58 What's your current role now? And what do you do in terms of, I suppose, influencing the products and the direction they take? Yeah, sure. So my current role is CTO, so Chief Technology Officer, and I'm primarily focused on the overall product direction and ensuring we are addressing the needs of our customers today, and then also looking out for where we want to take the product in the future to meet the needs of our customers as we go forward. Okay. Okay. So for anybody who doesn't know what WhereScape do, give me a very kind of succinct, I suppose,
Starting point is 00:02:29 kind of summary of what WhereScape do and I suppose the problem you're trying to solve for your customers. Yeah, sure. It's a good question. So what our customers are struggling to do is deliver business value on their data warehouse projects in a timeframe that their business users want. WhereScape is coming at solving this problem by reducing the time,
Starting point is 00:02:51 the cost, and the risk in building and managing a data warehouse over the long haul. And we do that by improving the time to value for these projects by automating the relevant aspects of the data warehouse lifecycle, from discovery and design of a system through to development of the ELT code and deployment and operations. So we automate all those repetitive, time-consuming tasks,
Starting point is 00:03:15 which allows our IT developers within our customers to free up their development resources to focus on adding the important stuff, adding value to the business. It allows them to get more done with fewer resources in a cost-effective and risk-mitigated fashion. Okay, okay. So you talk about data warehousing there, and surely the days of people building data warehouses have kind of gone now. People tend to kind of throw their data into a kind of data lake, or maybe haven't got the kind of appetite for building kind of, you
Starting point is 00:03:49 know, curated kind of ETL systems and so on. You know, are data warehouses still being built and are they still being maintained, or is it a kind of dying market in your world? Yeah, that's a good one. This is a common debate, right, you know, raised about most technologies and approaches at one time or another, right? But I think it particularly misses the whole point in that companies are battling the ever-growing complexities of managing data, given the proliferation of new data sources and data volumes. But what we're hearing from our customers is that they see their data warehouses as important for the business operations, and most are increasing their investment in data warehousing. So to us, it appears that data warehousing remains very much alive. And now more than ever, getting data infrastructure right
Starting point is 00:04:32 in support of governance, lineage, and manageability has never been a bigger issue. And the new requirement we see these days is to design, develop, deploy, and manage data infrastructure at the pace of the business. And that's where WhereScape comes in. Okay. So when you say pace of the business, what does that mean then really?
Starting point is 00:04:49 I mean, how does that differ perhaps from the data warehousing projects maybe I worked on 10 years ago that were very manually written and so on? Yeah, sure. I think that's a good question. I mean, if you roll back a decade or so ago, right, the data warehouse was, like I said, typically manually written, not necessarily mission critical. So if the data warehouse died and was down for a couple of days, businesses would tolerate it.
Starting point is 00:05:10 And the development process was weeks, months before you'd see anything out of there. But businesses in this day and age just can't tolerate that, right? The data warehouse is mission critical and the business wants new data, new analytics in weeks,
Starting point is 00:05:25 if not days. So the ability for the IT team to add value for the business by developing these new analytic capabilities in a matter of hours or days has become really important if they want the data warehouse and the data infrastructure to be successful and manageable over the long haul. Okay. So who are the typical buyers of your product? And we'll go into the detail of what the product is in a minute.
Starting point is 00:05:49 But typically, who are the people you engage with? Is it the IT department? Is it the business? I mean, who typically is your buyer and your person you aim at in terms of the product? Absolutely. We typically sell into the IT organization. Our goal really is to help them deliver the value to the business. And from a development standpoint,
Starting point is 00:06:07 you know, they work hand in hand with the business users, but we're selling it to the IT team and, you know, making sure that they can provide successful deployments within their infrastructure. Okay. So,
Starting point is 00:06:17 so I remember, in my background when I first started in this area, I used to work with a tool called Oracle Designer, which was a CASE tool. And I've also seen the product being used in projects where it's actually been used by consultancies and so on. I mean, I suppose the first question there is, is WhereScape Red, and we'll get into what that is, is it another CASE tool or is it more than that really? And is it also kind of like being used by consultants as well? Yeah, it's certainly more than that. We obviously have our own professional services team
Starting point is 00:06:45 that helps customers out. Once we install it at a site, customers will spin up their own resources and train them. And we have third-party companies, consultancy firms who use us. Red is more than a CASE tool, right? It's for designing and developing the entire data warehouse and managing the lifecycle.
Starting point is 00:07:05 And like all WhereScape products, and I'm sure we'll touch on 3D in a little bit, it's metadata-driven. And what that does is allow developers to drag and drop from source to target to build out data warehouse structures. And in turn, it will then turn around and create all the necessary mappings and ELT code, transformations, et cetera,
Starting point is 00:07:25 and then generate that for the underlying platform. The key aspect here is the tool itself is focused on building and managing a data warehouse. So all of the best practices and methodologies for a relevant data warehouse methodology are built into the tool. So whether you're building a dimensional model or a 3NF model or a data vault, we've got all of those best practices and methodologies to help you walk through and build a data warehouse in a consistent, repeatable fashion.
Starting point is 00:07:52 And then also, I think just as importantly, right, when we generate the ELT code, it's not black box, right? It's native to the platform you're in and it's optimized for your platform. So if you're generating within SQL Server, it's going to be T-SQL, stored procedures, everything's SQL Server-based. If I go over to Teradata, then it's all going to be in Teradata code, with TPT code and Teradata procedures.
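To make the idea of metadata-driven, platform-native generation concrete, here is a minimal sketch in Python. It is not WhereScape's actual template engine or metadata format: the `TABLE_METADATA` dictionary, the `PLATFORM_TEMPLATES` mapping and the `generate_ddl` function are all hypothetical names invented for illustration. The point is only that one platform-neutral description of a table can be rendered into different DDL per target.

```python
from string import Template

# Hypothetical, platform-neutral metadata for one target table (illustrative only).
TABLE_METADATA = {
    "name": "dim_customer",
    "columns": [
        {"name": "customer_key", "type": "integer"},
        {"name": "customer_name", "type": "string"},
    ],
}

# Hypothetical per-platform type mappings and DDL templates.
PLATFORM_TEMPLATES = {
    "sqlserver": {
        "types": {"integer": "INT", "string": "NVARCHAR(255)"},
        "ddl": Template("CREATE TABLE dbo.$table (\n$columns\n);"),
    },
    "snowflake": {
        "types": {"integer": "INTEGER", "string": "VARCHAR"},
        "ddl": Template("CREATE OR REPLACE TABLE $table (\n$columns\n);"),
    },
}


def generate_ddl(metadata, platform):
    """Render the same table metadata as DDL for the chosen platform."""
    spec = PLATFORM_TEMPLATES[platform]
    column_lines = ",\n".join(
        f"    {col['name']} {spec['types'][col['type']]}"
        for col in metadata["columns"]
    )
    return spec["ddl"].substitute(table=metadata["name"], columns=column_lines)


if __name__ == "__main__":
    for platform in PLATFORM_TEMPLATES:
        print(f"-- {platform}")
        print(generate_ddl(TABLE_METADATA, platform))
```

The same pattern would extend to load and transformation code: the model stays constant while the templates carry each platform's nuances.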
Starting point is 00:08:14 Go up into something more modern, like, say, a Snowflake environment, we're going to utilize the best practices and the Snowflake capabilities. And for our customers, we see that as important. There's no black box; developers within our customer sites can work with and see code that is familiar to them, that's not hidden away. And that allows them to reduce the learning curve for their resources
Starting point is 00:08:36 and ease ongoing manageability of their environment. And I think the last thing really to focus on with Red, but also with WhereScape at its core, is we manage the full lifecycle of building a data warehouse all the way through from development, but onto deployment and operations, scheduling. And the thing that, as you know, from your background, right, if you build a data warehouse by hand, you're never going to document it. And if you do, it's always going to be out of date. So given that we're a metadata-driven tool, all the documentation is always up to date and generated at the click of a button. So you get full documentation, lineage, impact analysis, which really is key to managing the data warehouse over the full lifespan of a data warehouse, not just the initial development phases. So, I said that,
Starting point is 00:09:23 you know, it's not just a CASE tool, is it? And we've started to talk about WhereScape Red in that conversation. So just kind of, can you set out for me what WhereScape Red is? I mean, not so much in the kind of marketing sense of saying it, but what does it do? Put it in terms of the kind of data warehouse lifecycle and what it kind of automates and so on. If you're pitching this to an IT manager,
Starting point is 00:09:42 how would you describe how it works, say, with an Oracle backend or something? What's the kind of bits it provides for you? Sure. WhereScape Red has a metadata backend that will reside in your Oracle database and then a user interface which allows you to drag and drop from source system through the various stages of building a data warehouse. So take a simple dimensional model, for example. You can drag and drop to create dimensions and fact tables. And then the tool with its built-in methodologies will ask you the relevant questions around
Starting point is 00:10:14 how you want to design this dimension. Is it a type one or a type two? And then it will take that metadata, use that to build the relevant model, so the relevant dimensions and facts. The tool will generate all of the ELT code for loading and processing and transforming the data. And then from an operations standpoint, Red can also manage all the DDLs, all of the create tables, all the necessary indexes, it will manage that for you. And then once you deploy that into a particular environment, say production, it will also do all the scheduling to execute, do all the loading, all the logging and auditing of the processes and making sure that if it has a failure, jobs are stopped.
Starting point is 00:10:54 And then when you restart, they pick up and continue on as per normal. So it's managing the full lifecycle, but generating all of the structures and code for execution and operations of the data warehouse. Okay, okay. And is the scope of what you do limited to the kind of data warehouse layer, or do you have any integrations in with any kind of BI tools and the metadata they use at all? We don't provide any direct integration with BI tools like MicroStrategy or all that. We have a metadata layer that a lot of companies then will interrogate to extract out the definitions for the particular models and use that to integrate with. If you're doing cubes and tabular models,
Starting point is 00:11:33 then yes, we build into those structures as well as traditional database platforms. Okay, okay. So what about the product WhereScape 3D? What's that and how does that relate to what you've been talking about so far? Yeah, good question. So 3D, you can really think of it as the front end of the development.
Starting point is 00:11:51 So what we've designed that product for is it focuses around the discovery of a source system. So understanding the structure, the relationships between tables, for example, through foreign keys. What is the structure of my source systems? Profiling the data within those source systems so I can understand what the data looks like. I can look at data quality to understand where I potentially may have some issues that need to be addressed as part of my ELT process. And if I want to do a design,
Starting point is 00:12:19 I can do both a model or a data-driven design approach to rapidly design a model for the data warehouse. And then 3D will actually generate all of the metadata for Red and push that into Red, which will then take over the ongoing construction of the schemas and the ELT code generation and the ongoing operations. Okay, okay. So basically it helps you understand and profile the data and then maybe kind of introspect it and then use that to come up with some of the dimensional kind of design and that sort of thing?
Starting point is 00:12:48 Yeah, exactly. Yeah, introspect it and then we've got model conversion rules that will build out a dimensional model or, one that we see growing interest in these days, Data Vault. So based on that structure of that source system, you can build out all the necessary Data Vault hubs, links and satellite structures
Starting point is 00:13:04 and then it will do that in a bulk fashion. Then you can push all that into Red, which will take over and generate the code and do all the operations and management aspect of the data warehouse. Okay, okay. So that's interesting, you mentioned Data Vault there as well. And one thing I noticed looking through your website was, you know, support for things like Data Vault, which is a fairly kind of, I suppose, popular concept now. Maybe just explain what Data Vault is and why people might be interested in that and why you added that support into the product then, maybe. Yeah, absolutely. Yeah, so we have, to your point, we're seeing, certainly in mainland Europe and the UK,
Starting point is 00:13:38 a huge interest in Data Vault. And here in the US, we're seeing a growing interest in Data Vault. And we have a great, great partnership with Dan Linstedt, the inventor of Data Vault. So he helps work with us around the best practices for the methodology, as it were. And Data Vault really is a hybrid between a 3NF and a dimensional model. And really designed to allow you to incrementally build your data warehouse infrastructure and also adjust for schema change over time. So as my source systems change, the Data Vault methodology allows me
Starting point is 00:14:12 to continue to augment and grow that underlying Data Vault in an incremental fashion, reducing the ongoing overhead of maintaining changes over the long haul of the data warehouse. That's interesting. I mean, because I was actually going to ask you, you know, I suppose one of the criticisms of formal data warehousing is that, you know, the time you take to kind of curate and design the schema where you're going to load data into, you know, that people haven't got the interest, haven't got the time for that these days. But also with the more flexible schema data we get these days
Starting point is 00:14:45 and the way that can change, that can be a bit of an issue for data warehouses. Is that one of the ways that you deal with this? I mean, is that your recommendation for that? Or would you maybe use Data Vault with maybe a data lake or something? Or are they kind of different really problems really? Well, I think it's a good question. I mean, in terms of slowly changing schemas,
Starting point is 00:15:03 like we see in sales force automation systems, certainly something like a Data Vault methodology, it's designed to accommodate that, which is good. If we're talking, change it around a little bit, schema on read, if you're populating a data lake with native data from various source systems, we can handle both of those. We can bring data in in its native format,
Starting point is 00:15:25 manage transformations, then persist that into a data lake, whether that's HDFS or S3, or maybe you want to put it into something like Snowflake, which can handle native JSON structures in those variant data types. We're quite happy to manage any of those approaches. Okay.
Starting point is 00:15:42 Okay. So that's interesting. So you would use your product then in a kind of a data lake environment as well. You would use it maybe as running on top of, say, sort of like a Hadoop cluster or Hive or is that getting to the edges
Starting point is 00:15:53 of where you'd recommend using this really? What's your thoughts on that? Yeah, absolutely, we can support Hive. We have a number of platforms that we can support. We have the traditional on-prem Oracle, SQL Server, Teradata and some of the appliances. We have a number of customers running stuff out on Hive.
Starting point is 00:16:13 We can deal with the data lake if you want to put stuff into HDFS and S3. And then obviously the cloud now is becoming pretty relevant. So Snowflake, we've got a great partnership with. If you want to go into Redshift or on the Azure side of the house, we can manage all of those platforms. For us, it's about how we do the processing and take advantage of that underlying infrastructure
Starting point is 00:16:31 in each of those cases. Okay, okay. And do you find there's a lot of use of, say, Snowflake these days? I've seen that being used quite a lot with some of the clients I work with now. Is that one of the database types that's getting quite popular now in your view? Yeah, I think cloud in general is getting significant interest. And we're seeing quite a lot of interest in Snowflake. And we started a partnership with them pretty
Starting point is 00:16:54 about a year ago, I guess. And we've seen quite a lot of interest. We've had a lot of uptake there on the Snowflake space. So I think it's a great partnership for both companies. And certainly their value propositions are pretty unique and pretty compelling to a lot of organizations. And I think our value proposition around speed of development aligns well with them. Yeah, interesting. So when, again, when I used to work maybe in my Oracle world, I used to sort of often think that I wondered how a tool like WhereScape Red could write as efficient sort of SQL, for example, as a tool like one of Oracle's tools, or I suppose one of the dedicated ETL tools for those kind of platforms. I mean, how do you, I suppose, especially as CTO, how do you go about writing and keeping
Starting point is 00:17:36 up to date with kind of the nuances of all the different engines? And do you kind of, I suppose, in a way, do you make that a selling point that you'll do very efficient transformations, or is that a weakness and you're more to do with the process? What's your view on that? Very good question. I think that is, if we look at how we continue to evolve the product over each release, it's how do we maintain, I think our key value proposition is that while we have the methodologies of dimensional and Data Vault, the code that we generate for Teradata is optimized for Teradata. You build the same model on SQL Server, the code that we generate is going to be different
Starting point is 00:18:11 because it's going to be optimized for SQL Server. So from a customer standpoint, they've bought this hardware, this infrastructure, and we want to make the most of it. So our strong selling point really is the code that you generate is A, native to the platform in use, and B, optimized for that platform. And the way we store the metadata, each platform, the methodologies are consistent, but how we expose the code generation will include the nuances for each platform. And the underlying template engine that generates the code is tailored to provide optimized code for each platform. And as we move forward, as each new release of SQL Server or Teradata or Snowflake comes out, one of our key things is ensuring that we take advantage of new capabilities.
Starting point is 00:18:55 So for us, it really is the strength as opposed to a lot of tools which may provide a generic SQL across all platforms, which isn't going to scale to the data volumes that some of our customers deal with. Yeah, sure. So, I mean, what about things like cloud? I mean, your product, is it an on-premise install? Is it run as a sort of cloud service? I mean, what's your view about cloud
Starting point is 00:19:17 in this kind of environment? How do you kind of work with it, really? Yeah, I think we're seeing, certainly in the last 18 months, huge interest in cloud. I think a lot of companies are now looking at cloud first. I'll go cloud before I go on-prem. And a lot of companies are also taking this opportunity to look at re-platforming. So for us, it's a good opportunity.
Starting point is 00:19:41 If you want to install on-prem, you can install your stuff on-prem. You want to go up in the cloud, you can install our stuff up on the cloud. We have a number of customers who are cloud only. We have a number of customers which I think are going to be fairly common, which is they run a hybrid, which is they have a mix of on-prem and cloud, right? I've got a massive Teradata EDW that's been around forever. That's not going anywhere anytime soon. But maybe I want to augment it with a Snowflake. We can manage both of those hybrid environments
Starting point is 00:20:05 as a single logical data warehouse. Code that's generated on Teradata is native to Teradata, optimized for that, and then the code that's generated on Snowflake is optimized for that. So we can manage whichever combination you want, on-prem, cloud, or hybrid. Yeah, yeah. And your product itself,
Starting point is 00:20:20 you install your product on-premises, or is it run as a service? How do you typically deliver the kind of software that you kind of sell, really? Yeah, so it's installed, quote-unquote, on-prem. So you would install it on your desktop. If it's up in the cloud, you can still install on your desktop or, you know, companies would spin up from Amazon, right, an EC2 instance and just run it up there and then remote desktop into a machine. But it's a physically installed piece of software on a machine itself.
Starting point is 00:20:50 Okay. Okay. And so, I mean, last question about the actual kind of product itself here. So data lineage and end-to-end metadata management. I mean, again, in the work I'm doing at the moment, I'm working at a startup in London and they're just discovering at this point some of the ideas around data lineage and having a kind of like a managed metadata, kind of like an end-to-end metadata. I mean, tell us a bit the story around how the product helps with, I suppose, end-to-end data lineage and impact analysis and why that would be something that companies should be concerned about and actually kind of place value on. Absolutely. I think you're, what are we, four days in GDPR, so pretty important.
Starting point is 00:21:28 So our product is all metadata driven. So we have a metadata layer which describes everything. We use it for code generation, for schema management, but also just as importantly, documentation and lineage, which becomes really important when I go to enhance my warehouse a couple of months, three months, six months down the road, when I can look at a set of changes I want to make and understand what the downstream impact is. That allows me to de-risk my project if I know what objects I need to change over a particular upgrade process, rather than having to scroll through thousands of lines and hundreds of thousands of lines of ELT code. And the other thing that is really key is maintaining the knowledge that's in the developers' heads, right? The data warehouse is going to be around
Starting point is 00:22:08 for a long time. What happens if those developers walk out of the warehouse? You need to capture that knowledge somewhere. You do that in our metadata layer. From a lineage documentation standpoint, I think things like GDPR we can really help out and show you where your data is going through the system, understand what transformations you're doing with it, where is my social security number, where is that being stored on my environment. If you don't have lineage and documentation describing that, you're going to be in a world of hurt.
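As a rough illustration of the impact analysis described here, the sketch below walks a small column-level lineage graph to find everything downstream of a sensitive source column. It is only a sketch under stated assumptions: the `LINEAGE` mapping, the column names and the `downstream_impact` function are hypothetical and do not reflect how WhereScape stores or queries its metadata.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns listed in its value.
LINEAGE = {
    "src.customers.ssn": ["stage.customers.ssn"],
    "stage.customers.ssn": ["dw.dim_customer.ssn_masked", "dw.audit_customer.ssn"],
    "dw.dim_customer.ssn_masked": ["mart.customer_report.ssn_masked"],
}


def downstream_impact(lineage, start):
    """Return every column reachable downstream of `start`, in sorted order."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    # Where does the social security number end up in the warehouse?
    for column in downstream_impact(LINEAGE, "src.customers.ssn"):
        print(column)
```

Because the lineage lives in metadata rather than in hand-written ELT code, a question like "where is this field stored?" becomes a graph traversal rather than a code search.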
Starting point is 00:22:37 And a lot of our customers find, they buy the tool for the code generation, for lifecycle management. Documentation is typically, you probably know, second or third biggest value add to them once they've got the tool. And in a couple of cases, we've actually sold the tool purely because of the documentation we generate. I'm actually in exactly the same situation at the moment where one of our last engineers has walked out, not walked out, but has moved on, and we're now having to reverse engineer a data loading process. And it's kind of fun, you know, as in it's not really.
Starting point is 00:23:08 It's, um, it's one of those things. But yeah, I mean, I think that whole kind of, you know, end-to-end metadata management, data lineage, especially with GDPR as well, I think now being able to prove, you know, I suppose why this was made in a certain way and how the data was arrived at is important. Has GDPR been something that's driven a lot of kind of, I suppose, new interest or certainly, you know, I suppose you say put a spotlight on the value you guys deliver? Yeah, I think certainly interest is, I think automation in general is, the interest in automation has really taken an uptick over the last few years.
Starting point is 00:23:42 GDPR, I think, has certainly shone a spotlight on the value add, the value proposition around the documentation and lineage. So for us, we were already there doing the bulk of it already. So it's really just another value add from our sales standpoint and the value that customers get out of our product. Yeah, okay. So I suppose another trend that's happened in the industry has been the kind of the,
Starting point is 00:24:07 I suppose the move towards self-service and the move towards individual kind of, I suppose, power users doing work themselves with, say, BI tools that are more desktop, but also this new generation of data prep tools from the likes of Paxata or Trifacta and so on. I mean, where do you see, do you see a role for those tools?
Starting point is 00:24:25 Are they something that's complementary to what you guys do? Are they maybe a kind of a bit of a kind of a dead end? What's your view on the data prep market really at the moment? Yeah, I think from our standpoint, I think they're very complementary. We have a number of customers that run Paxata or Trifacta, Alteryx in-house in conjunction with the Red data warehouse. And those tools provide excellent value when used appropriately. I mean, for me, it's more of a fitness-for-purpose type thing,
Starting point is 00:24:51 like use the data prep and the self-service tools where they add value. But you're still going to want this data warehouse with governed, curated, full auditability and lineage. So I think there's a good fit there. Again, like everything in life, right, when used... Yeah, exactly. So, I mean, I suppose another trend really has been the move towards things like streaming data loading as well. So batch is like very much kind of like last century and everyone's now talking about streaming and Kafka and so on. I mean, is that something, is that
Starting point is 00:25:21 something that's kind of first of all do you think it's do you think it's kind of, is the next big thing? And secondly, is that something you guys can support? I mean, tell us how, what your thoughts are on that really. Yeah, very good,
Starting point is 00:25:31 very good question. From a streaming standpoint, I think we've seen a lot of customers have interest in streaming. And that really drove our initiative over the last 15 months to build our, you know, our streaming automation capabilities. And I don't see batch so much being replaced with streaming so much as augmented with streaming real-time feeds.
Starting point is 00:25:55 It's not about having everything in real time so much as having everything at the right time. So batch will continue to exist going forward. Batch it in where it makes sense, but augment with streamed data where you need that data in real time. And for us, we've done automation for streaming. So if you bring data in from Kafka, IoT type data bringing it in from Kafka,
Starting point is 00:26:17 say, do some processing, we can handle that. For our customers, they saw the same, actually the issues are worse, right? They don't have people with knowledge of Scala or Kafka or that. So the question was, how can you, WhereScape, help us bridge this gap? Our IT needs to bring
Starting point is 00:26:35 this data in, but we don't know Scala, Spark, Kafka, any of that stuff. Can you automate that for us? And the other key thing I think is important when you look at streaming is, you're typically going to need to augment it with data that's in your data warehouse. So for us, we added a streaming option that really fits hand in hand with our traditional batch Red-style processing. We can blend data as it's coming into the data warehouse. The technologies are different, but the value proposition is the same, right? How do we reduce the time, cost and risk
Starting point is 00:27:06 and improve the time to value for the business? Because there's a lot of business value in that streaming data. You just need to be able to process it appropriately. But again, still need to document it, still need to understand lineage and all of that stuff. Yeah.
Starting point is 00:27:19 So, I mean, quite a bit of wider questions here. How come you guys have done very well and, you know, WhereScape is still kind of there and it's still kind of respected? And, you know, there are other competitors that are in this space. You know, there's been the old kind of CASE tools. There's been various kind of code generators. I mean, Noetix, I think, were around a while ago.
Starting point is 00:27:36 What do you think it is about your approach and the way you do things that's meant that you are kind of the prime example, I suppose, of code generation in this space that has worked well and is still, I suppose, being bought and used? From a technology perspective and so on, what's the good thing about it really? Well, I think as new platforms come on board, like Snowflake, we've adopted those where it makes sense, so we can generate code for those platforms and the code's optimized.
Starting point is 00:28:08 And I think the other key things that our customers see are documentation, but also that we manage the full lifecycle. We manage all the way from discovery and design of a source system, all the way through to deploying it in production. And then the important stuff and the hard stuff, which is when I want to enhance my data warehouse, how do I do that in an easier fashion, I would say.
Starting point is 00:28:30 And we really allow customers to manage the data warehouse over the long haul, which I think has allowed us to be successful over the last 15, 16 years, and also speaks well to our customer retention rates, which are significantly high. And customers, once they get us in, just love us, and we stick around for quite a long time. I think you're quite focused as well, aren't you? I mean, again, observing your company from afar, you know, you seem to very much focus on the data warehouse market. You very much focus on this particular problem space, and it'd be very easy... I mean, take Informatica, for example. The areas they got into at certain points, it was
Starting point is 00:29:03 very much, you know, trying to broaden, I suppose, what they do and make themselves more sticky. But, you know, you can lose focus. Has it been a conscious decision to stay just working in this area and focus on that entirely, really? Yeah, absolutely. It's been a very conscious decision. You know, we stick to what we're good at. And I mean, there's lots of opportunities to go play in the nice new shiny space, trying to do self-service or any of that stuff. But our focus is what we do today around data warehouse automation or data infrastructure automation.
Starting point is 00:29:31 And we just maintain that focus. And I think that allows us to provide a really strong, valuable product for that space. I mean, I say that, and that's not meant in a kind of derogatory way because, again, look at what you do. You've embraced things like Data Vault and the new kind of database types out there as well. So it's admirable, I suppose, to focus on a problem.
Starting point is 00:29:51 And it seems to – I mean, how big is the company now in terms of operations and customers and that sort of thing? What kind of scale are you now really? I think we have, last count, north of 700, 750 customers. We've got, you know, obviously big operations here in the US and over in the UK and mainland Europe. So, you know, the bulk of our new growth comes in those regions. Okay, okay. So I'll ask in a second how people get hold of the software and kind of find out more, but just before I do that, you know, as the CTO of your company, you
Starting point is 00:30:24 know, what do you think, what's the thing that keeps you, kind of, not awake at night, but inspires you about maybe the next problem to be solved in this space, or the thing that you think, you know, if we or others could solve this or move this on, that would be a real kind of, you know, advance in the industry for data warehousing? Any kind of thoughts on that sort of thing? I think a couple of things. I mean, short term, what we've seen and what we're working on today is a lot of the companies are, now that the data warehouse is mission critical in a lot of cases, right? It's front and center for their business.
Starting point is 00:30:56 A lot of companies are wanting to apply software development best practices to their data warehouses. So how do I do continuous integration, continuous delivery as part of my data warehouse environment? How do I do things like test automation, which really falls squarely into what we are good at and into the value proposition for us. So for us, DevOps is at the core of what WhereScape does and has always done. And to us, it is an evolution of our product capabilities
Starting point is 00:31:23 where we can take our deep experience in this area and look to provide even more automation to those aspects of the lifecycle. So I think that's the stuff we're working on today. Beyond that, I think if you look at other tools that do AI and things like that, how do we bring some of those capabilities in under the covers of the automation of designing and building a data warehouse and enhancing it over the long haul to, again, reduce some of the manual processing that needs to be done by data warehouse developers.
Starting point is 00:31:52 Yeah, so I think just understanding how people are using the data warehouse and determining what changes, how do we automate some of those aspects rather than just relying on someone to know that they need to go and add a new schema or switch this off? I think that's an interesting area that requires some research. But again, sticking to what we do, how do we automate? How do we take something that people are doing and actually automate it where it makes sense? So it's the key of what we've been doing for 15, 16 years. And how do we just continue to evolve that as we move forward?
Starting point is 00:32:21 And as these data warehouses and environments change over the next, you know, two, five, ten years. Okay, so one thing I've also observed is a whole new, I mean, I don't know how old you are, Neil, but I've just turned 50, and, you know, I've noticed there's a kind of a new generation
Starting point is 00:32:36 of kind of developers that are getting into this kind of industry, but through that thing called data engineering, where, I suppose, in a way, they're discovering the need for kind of ETL and they're discovering the need for automation, but doing it through a kind of a very different route, not necessarily the old data warehousing, but doing it more to do with, say, kind of big data and so on. I mean, is there anything your company's doing to try and evangelize some of the things you're doing with that market?
Starting point is 00:33:02 Or is that a kind of a generation that has kind of lost you, really? What's your thoughts on that kind of movement, really? Yeah, I think we continue to look at those and really focus on, for those particular areas, how do we look at automating certain aspects that are repeatable, and patterns where we can help those resources actually deliver more with less, right? You know, those resources get more productivity by having the tool do the repetitive donkey work, and they can add value. So it's an area that we continue to keep abreast of as we move forward. If we want to maintain relevance in the market, we need to make sure we know where the space is
Starting point is 00:33:41 going. And I think we've got a good pedigree in automation, and we stick with it and stay abreast of where the market's going so we can help our customers. Okay. So just to kind of finish off then. So if somebody wanted to find out more about the product, maybe get a chance to, like you say in America, kick the tires or kind of have a play around with this, how would they go about getting some exposure to your product and getting a little feel for whether it's right for them? Yeah, absolutely. I think the easiest one would be to actually go to our website, you know, wherescape.com. We have a number of use cases and materials and content there. And then from there they can reach out and get a hold of relevant salespeople in their region and, you know, see a demo. If you're at a TDWI or a trade show, then
Starting point is 00:34:24 chances are pretty good you'll see a WhereScape booth there. You could stop by and say hi to the teams and they'll happily take you through the product and see what we can do to help you provide a successful data warehouse. Okay, okay, fantastic. One last question.
Starting point is 00:34:39 In WhereScape Red, what's the red bit mean then? Where did that come from? I was curious where that bit came from. I think if you talk to five people you'd get six different answers. I think I've forgotten these days. Yeah, I'd have to go back and ask Michael. I'm sure he'd come up with the factual answer to that question. Yeah, it's one of those things, isn't it? It's part of the mythology of a company, that I don't actually know where it originally came from, really. But well, Neil, it's been great speaking to you. It's good to speak to someone who is in a company that
Starting point is 00:35:07 is so focused on data warehousing and that kind of area, because, like you say, it is important, and, you know, I'm still finding these projects are going on, and, you know, a new generation really is discovering the need for automation and so on there as well. So it's good to speak to you, it's good to hear what you guys are doing, and what I'll do is I'll put some links in the show notes to your website and some of the kind of papers around there. But other than that, thank you very much. It's been great speaking to you. And, yeah, great.
Starting point is 00:35:35 Thanks for coming on the show. Yeah, thank you, Mark. Yeah, it was fun. I enjoyed it and always happy to talk about the space. It's interesting to me. Thank you.
