Software at Scale 57 - Scalable Frontends with Robert Cooke

Episode Date: May 16, 2023

Robert Cooke is the CTO and co-founder of 3Forge, a real-time data visualization platform. In this episode, we delve into Wall Street's high-frequency trading evolution and the importance of high-volume trading data observability. We examine traditional software observability tools, such as Datadog, and contrast them with 3Forge's financial observability platform, AMI.

GPT-4 generated summary

In this episode of the Software at Scale podcast, Robert Cooke, CTO and Co-founder of 3Forge, a comprehensive internal tools platform, shares his journey and insights. He outlines his career trajectory, which includes prominent positions such as Infrastructure Lead at Bear Stearns and Head of Infrastructure at Liquidnet, and his work on high-frequency trading systems that employ software and hardware to perform rapid, automated trading decisions based on market data.

Cooke explains how 3Forge empowers subject matter experts to automate trading decisions by encoding business logic. He underscores the criticality of robust monitoring around these automated trading systems, drawing an analogy with nuclear reactors due to the potentially catastrophic repercussions of any malfunction.

The dialogue then shifts to the impact of significant events like the COVID-19 pandemic on high-frequency trading systems. Cooke argues that these systems can falter under such conditions, as they are designed to follow developer-encoded instructions and lack the flexibility to adjust to unforeseen macro events. He refers to past incidents like the Facebook IPO and Knight Capital's downfall, where automated trading systems were unable to handle atypical market conditions, highlighting the necessity for human intervention in such scenarios.

Cooke then delves into how 3Forge designs software for mission-critical scenarios, making an analogy with military strategy.
Utilizing the OODA loop concept (Observe, Orient, Decide, Act), they can swiftly respond to situations like outages. He argues that traditional observability tools only address the first step, whereas their solution facilitates quick orientation and decision-making, substantially reducing reaction time. He cites a scenario involving a sudden surge in Facebook orders, where their tool allows operators to detect the problem in real time, comprehend the context, decide on the response, and promptly act on it. He extends this example to situations like government incidents or emergencies where an expedited response is paramount.

Additionally, Cooke emphasizes the significance of low-latency UI updates in their tool. He explains that their software uses an online programming approach, reacting to changes in real time and only updating the altered components. As data size increases and reaction time becomes more critical, this feature becomes increasingly important.

Cooke concludes this segment by discussing the evolution of their clients' use cases, from initially needing static data overviews to progressively demanding real-time information and interactive workflows. He gives the example of users being able to comment on a chart and that comment being immediately visible to others, akin to the real-time collaboration features in tools like Google Docs.

In the subsequent segment, Cooke shares his perspective on choosing the right technology to drive business decisions. He stresses the importance of understanding the history and trends of technology, having experienced several shifts in the tech industry since his early software-writing days in the 1980s. He projects that while computer speeds might plateau, parallel computing will proliferate, leading to CPUs with more cores.
He also predicts continued growth in memory, both in terms of RAM and disk space.

He further explains his preference for web-based applications due to their security and absence of installation requirements. He underscores the necessity of minimizing the data in the web browser and shares how they have built every component from scratch to achieve this. Their components are designed to handle as much data as possible, constantly pulling in data based on user interaction.

He also emphasizes the importance of constructing a high-performing component library that integrates seamlessly with different components, providing a consistent user experience. He asserts that developers often face confusion when required to combine different components, since those components tend to behave differently. He envisions a future where software development involves no JavaScript or HTML, a concept that he acknowledges may be unsettling to some developers.

Using the example of a dropdown menu, Cooke explains how a component initially designed for a small amount of data might eventually need to handle much larger data sets. He emphasizes the need to design components to handle the maximum possible data from the outset to avoid such issues.

The conversation then pivots to the concept of over-engineering. Cooke argues that building a robust and universal solution from the start is not over-engineering but an efficient approach. He notes the significant overlap in applications' use cases, making it advantageous to create a component that can cater to a wide variety of needs.

In response to the host's query about selling software to Wall Street, Cooke advocates targeting the most demanding customers first. He believes that if a product can satisfy such customers, it's easier to sell to others.
They argue that it's challenging to start with a simple product and then scale it up for more complex use cases, but it's feasible to start with a complex product and tailor it for simpler use cases.

Cooke further describes their process of creating a software product. Their strategy was to focus on core components, striving to make them as efficient and effective as possible. This involved investing years on foundational elements like string libraries and data marshalling. After establishing a robust foundation, they could then layer on additional features and enhancements. This approach allowed them to eventually produce a mature and capable product.

They also underscore the inevitability of users pushing software to its limits, regardless of its optimization. Thus, they argue for creating software that is as fast as possible right from the start. They refer to an interview with Steve Jobs, who argued that the best developers can create software that's substantially faster than others'. Cooke's team continually seeks ways to refine and improve the efficiency of their platform.

Next, the discussion shifts to team composition and the necessary attributes for software engineers. Cooke emphasizes the importance of a strong work ethic and a passion for crafting good software. He explains how his ambition to become the best software developer from a young age has shaped his company's culture, fostering a virtuous cycle of hard work and dedication among his team.

The host then emphasizes the importance of engineers working on high-quality products, suggesting that problems and bugs can sap energy and demotivate a team. Cooke concurs, comparing the experience of working on high-quality software to working on an F1 race car, and how the pursuit of refinement and optimization is a dream for engineers.

The conversation then turns to the importance of having a team with diverse thought processes and skillsets.
Cooke recounts how the introduction of different disciplines and perspectives in 2019 profoundly transformed his company.

The dialogue then transitions to the state of software solutions before the introduction of their high-quality software, touching upon the compartmentalized nature of systems in large corporations and the problems that arise from it. Cooke explains how their solution offers a more comprehensive and holistic overview that cuts across different risk categories.

Finally, in response to the host's question about open-source systems, Cooke expresses reservations about the use of open-source software in a corporate setting. However, he acknowledges the extensive overlap and redundancy among the many new systems being developed. Although he does not identify any specific groundbreaking technology, he believes the rapid proliferation of similar technologies might lead to considerable technical debt in the future.

Host Utsav wraps up the conversation by asking Cooke about his expectations and concerns for the future of technology and the industry. Cooke voices his concern about the continually growing number of different systems and technologies that companies are adopting, which makes integrating and orchestrating all these components a challenge. He advises companies to exercise caution when adopting multiple technologies simultaneously.

However, Cooke also expresses enthusiasm about the future of 3Forge, a platform he has devoted a decade of his life to developing. He expresses confidence in the unique approach and discipline employed in building the platform. Cooke is optimistic about the company's growth and marketing efforts and their focus on fostering a developer community. He believes that the platform will thrive as developers share their experiences and the product gains momentum.

Utsav acknowledges the excitement and potential challenges that lie ahead, especially in managing community-driven systems.
They conclude the conversation by inviting Cooke to return for another discussion in the future to review the progression and evolution of the topic. Both express their appreciation for the fruitful discussion before ending the podcast.

Transcript
Starting point is 00:00:00 Welcome to Software at Scale, a podcast where we discuss the technical stories behind large software applications. I'm your host, Utsav Shah, and thank you for listening. Hey, welcome to another episode of the Software at Scale podcast. Joining me here today is Robert Cooke, the CTO and co-founder of 3Forge, a full-stack internal tools platform. 3Forge helps many companies, including some of the largest financial organizations in the world, build systems that help them understand their own data and improve internal productivity. Welcome to the show.
Starting point is 00:00:43 Yeah, thanks for having me on today. Yeah, so I wanted to start off with some of your work experience and really your story. The first item on your LinkedIn, if you look at it chronologically, is infrastructure lead at Bear Stearns. First of all, I don't think I've seen the name Bear Stearns anywhere except for The Big Short. So it was cool to see that. But what does that mean? If you could just explain to us. Yeah. That's funny. That's the takeaway of Bear Stearns because it really was a great company in many ways. The department I worked in, equities, which happened to be just below mortgage-backed securities, that's where I
Starting point is 00:01:23 worked was in the equities department. And frankly, that was a great group to work with. We built some of the first high-frequency trading systems. And by high-frequency trading systems, I mean systems that were electronically looking at the world in terms of market data, where things were at, making decisions on that. And so you can imagine when you are building high frequency trading applications, and just to kind of overview as to what that is, you're taking in market data, you're taking in client requests for orders, you're kind of marrying those together. And then you're deciding at what point you want to place, you want to electronically place an order into the market to get that executed, a.k.a. to get that filled. So basically, we're building these systems to do that.
Starting point is 00:02:12 And this is back in the early 2000s when this really was starting. And this opened up this new concept really at the time, which was computers fighting other computers. It still goes on today in many industries. I mean, even airline tickets and things like that. But at that time, that was kind of being able to do it very, very fast, but also be able to do it reliably. You're dealing with large amounts of money. So when we talk about dealing with infrastructure, that's a big part of it, which is around the hardware, the network connectivity, all of those sorts of things. With that said, I should clarify, it was head of infrastructure actually at the Darkpool LiquidNet, which is another fascinating story. At Bear Stearns, I was running the high-frequency trading or working on the high-frequency trading part.
Starting point is 00:03:12 But at LiquidNet, I was head of infrastructure. And there it really meant, yes, paying attention to kind of the, I would call it the application or the layer, the software layer that sits between what developers are actually building, like custom business logic, and then the hardware itself. So you could think of it as a platform, kind of a generic platform that people build on top of. And when you think about developers, or when you say developers, does that mean they're building these rules? Like, oh, if X equity or X commodity goes to a certain price and a bunch of more complex logic than that? Is that kind of what you're describing? It's funny when you say, yeah, what do you mean by developer?
Starting point is 00:03:56 Because I can mean many things. And the analogy I always use is, well, when we say an engineer, what does that mean? Because an engineer can mean everything from someone who drives a train, the guy who sits in the front of a train, that's a train engineer, all the way to someone who's actually out there designing antennas or electrical engineers. So my point is that the term engineer can cover a huge spectrum. Same thing with developer. So I would think there's various degrees or various disciplines within development. develop software so that other developers can develop software quickly. To your point, the developers that would use our software, yes, they're generally SMEs, subject matter experts, who understand the industry that they're writing software in. And they're using that to basically build, and I don't want to trivialize it, but basically build, you know, if-then statements to the other end of the spectrum, which is there's someone out there
Starting point is 00:05:25 who's taking assembly code and writing that in order to create a compiler so that then we can compile our code, et cetera, et cetera. So there's many, many layers to the stack. And I think as the world matures, or at least the software world matures, we're getting better and better at being able to identify and layer the different
Starting point is 00:05:48 development approaches. And so 3Forge, we're focused on what I would call a development platform, meaning that we kind of sit between the operating system and the developer. And so that we've developed a lot of the components that end developers need in order to build business applications. No, I think that makes a lot of sense to me. I have to ask you one question about high frequency trading, which is like, do you think the systems,
Starting point is 00:06:19 like HFT systems would be equipped to deal with a larger event like the pandemic? Or I hear all these, I read all these news articles, Renaissance not doing well when COVID just happened or the lockdown just happened. I don't know. This is a pretty vague question, but what's your take on it? Right. Well, this is a perfect plug for my platform. Because I would say that's what was largely missing. And I guess I can talk about Bear Stearns now because it's no longer, well, it's no longer around. It's been wrapped into JP Morgan.
Starting point is 00:06:56 But that's something that was missing, which was these high frequency trading decisions can make decisions very, very rapidly. Obviously, it's baked into the name high frequency. But at the end of the day, they're executing software that some subject matter expert encoded into it. And that's what it does. That's what it's designed to do. And yes, you can have machine learning kind of things sitting in the background that analyze macroscopic things to make more deferred, delayed decisions and so on and so forth. I can go into that in great detail. But what I felt was missing and got me started on this in a way was you had these systems that
Starting point is 00:07:40 were almost like bombs. They were like a nuclear reactor. I mean, they could produce huge amounts of energy, but if something went wrong, they could explode. You know what I mean? And the thing is, imagine you've designed, you know, your GE, you've designed this nuclear reactor, you've got it sitting there. You know, you need to have a lot of periphery around that in order to monitor and understand what's happening inside the reactor. You can never have enough because when that anomaly happens, you want to understand what's gone wrong and you want to have the playbook ready for what you're going to do, etc. And so those companies, and by the way, we work with over half of the top tier one banks in North America. And I'm proud to say our customers, they have tooling in place to basically
Starting point is 00:08:27 try to get ahead of or at least not be last in line when an event like that takes place. And in fact, sadly, a lot of times that's when we've gone into these companies is after an event like that, because the decisions that need to be made need to be made split second. I mean, we literally have use cases where there are eyeballs on a screen. They're staring at data all day. When something goes wrong, it's alerted to them. If it's a very big event, they want to be able to then take a human, you know, they want to be able to very quickly have workflows with their team members to make an educated
Starting point is 00:09:00 decision to decide what to do. And that might mean stop trading. And I think most of us have seen all the way back, you know, Knight Capital's an interesting event to throw that one out there. The Facebook IPO. I mean, I could go on and on. The BATS IPO that was rescinded.
Starting point is 00:09:20 These are all cases where it really was good software executing in many ways as it was supposed to. But the problem is, was there the periphery around that to be able to get so that when something went wrong, that a human could see and quickly take action? And that's really what got us started, was helping companies install that. I think that makes sense. There's so much risk. And I think the nuclear reactor analogy makes a ton of sense to me. Maybe just in one minute, could you describe to the listener something about Knight Capital
Starting point is 00:09:53 or Facebook IPO? I've heard about what happened at Knight Capital, but what happened during the Facebook IPO? Well, the Facebook IPO, from my understanding, and at this point, I was at ThreeForge. So it's really me kind of talking through, I've been told by. So just to be clear, this was an issue with the IPO that was being managed by NASDAQ, NASDAQ's exchange. Our customers are interacting with that exchange. And what happened is that exchange, the exchange itself had an issue where they were receiving orders, but they weren't filling them properly or they weren't sending back the fills properly. And just like, I guess, to make an analogy for those that might not be familiar with this, imagine a ticketing.
Starting point is 00:10:40 And now, in fact, you know, we just had Southwest. I don't know if that's something that you've heard about. Southwest had an issue a week ago, right? So this is not limited to just, and boy, I wish they were a customer of ours because I think we could have helped them on this. But this is the sort of thing that we focus on. So, but let's go back to NASDAQ here before I lose my train of thought. So basically, oh, the analogy I was going to make is if you look at a ticketing system, imagine if everyone's buying their tickets and those tickets are actually getting executed, meaning that the credit card is getting charged, but the seats aren't getting filled. Let's just imagine that scenario. So all of a sudden, you have many, many people buying the same seats on the airplane. That's an example.
Starting point is 00:11:29 That's not a perfect analogy for what happened with the Facebook IPO, but it's a close enough example. So everyone's trying to participate this in, but the actual things that are taking place aren't getting reported back to the customers. And that got delayed for 40 minutes. And then all of a sudden, all the fills came rushing in with stale prices and everyone was really ticked off. Right. And so just to be clear, when we think about the Facebook IPO, I don't know the exact, I can't give the exact numbers. We could go look it up, the volumes, but we can just take a typical stock. Many stocks, when they IPO, will have millions of transactions basically during the opening session. And so that happens. You've got these millions of transactions. If something goes wrong,
Starting point is 00:12:21 you literally have a million different people that are ticked off that you need to basically be able to respond to and handle as quickly as possible. And the problem is usually when you play Monday morning quarterback, you can go back and say, you know what? We could have identified this in minute one, meaning when 10,000 transactions had taken place, but we didn't really figure it out until minute 40. And so you've compounded the problems 40x by being so delayed in getting that information. And this is what it means. And I don't know that much about nuclear reactors, but I would assume it's often the same case where knowing information early on is critical, right? That's how you can mitigate a lot of these issues.
Starting point is 00:13:06 Now, I think that makes a lot of sense to me. Like definitely an outage of one minute is not as bad as an outage of 40 minutes because the problem keeps compounding. There's a large space of like observability tools, right? Like the one that comes to my mind, of course, is Datadog because that's the one I use at work. Why can't I use like a traditional observability tool in this kind of system? It
Starting point is 00:13:29 may be a dumb question, but I think it really helps explain the difference. I think, well, what we're about, what I think observability tools, you're right, there's, again, there's a whole spectrum of observability tools. And there's kind of generic ones that'll look at your data. Datadog would be an example. Splunk would be an example, those sorts of things. But then there's also, I would say, much more specialized. And we focus on more specialized observability. Whereas instead of just saying, consume all this data and kind of give us a generic display where we can look at it, we want to actually have built in workflows. So there's something called OODA, Observe, Orient, Decide Act, O-O-D-A. It's actually a term that the military, US military came up with 50 years ago. And when you talk about
Starting point is 00:14:21 observability tools, they're only doing the O, literally. It's in the definition. And what the military found was that in order to effectively engage in war, you need to be able to observe. You then need to be able to orient yourself. You need to be able to make decisions, and you need to be able to act. So, you know, you and I are sitting in a room and we get into a fight and you're about to punch me in the face, right? If all I can do is observe you about to punch me in the face, this is a terrible analogy, but I'm doing this because I'm talking about the
Starting point is 00:14:57 military, right? But if you think about it, just being able to observe the issue, okay, great. That's the first step. But then you need to be able to orient yourself. And what do we mean by orient? Usually it would be, okay, maybe I should use a cleaner analogy, like you're in a car and you see a deer coming at you. You want to be able to very quickly orient yourself, which means look around. Is there a family of deer? Is there a deer to the left? Is there a deer to the right? And then you need to take a decision. I can miss this deer if I veer to the left. And then you need to be able to act on that, which is turn the wheel to the left, right? So this is what it means to be able to respond quickly to outages. And the
Starting point is 00:15:43 problem is when you have generic observability tools, first off, they only cover the first step of four. But beyond that, they don't really let you orient yourself. And I think orientation is just as big a part of it. And so I'll give an example in finance, which is, or with the Facebook thing, which is, oh my gosh, we can see that a lot more orders are going out right now at this very moment than are coming in. So first off, to do that with a generic observability tool would be very hard to do. But you could basically encode certain rules that operators can then look at so they can sort of see and get in touch and feel these things in real time once they observe that now they need to be able to um orient
Starting point is 00:16:31 themselves and what they usually what we call literally or what we what we've materialized that as is what we call um a drill down so we see that there is some sort of delay and orders aren't getting filled. And now we want to be able to drill down. So maybe we take the whole set of orders and we want to click on that and within moments be able to see what's going on with all of our Facebook orders. Okay, now we see this. Now we need to be able to take a decision. That might mean sending a message through the console to my team members saying, look, we've got an issue with Facebook. This is our exposure. What are we going to do?
Starting point is 00:17:11 And now the manager can say, we need to pause all open Facebook order. This is how you make a decision in split seconds. And actually another thing, we're talking with governments because this is the same sort of thing you need when you have a fire or, you know, any sort of incident in a government situation and in a civil situation, which is, you know, you need to be able to quickly find out who's around. And all these things you want to be able to just shorten that life cycle to make that decision. The other interesting aspect is the technology itself. One thing that's fascinating about Three Forges offering, from what I can tell, is this idea of really low latency UI updates. Why is that important?
Starting point is 00:17:58 Right. Well, I think it's... So we focus on mission-critical use cases. And so the amount of money on... How do I put this? And this is where it gets to the core of me writing software, where we've built sort of, I would say, a culture within 3Forge about how we build systems. And one saying I have is when it comes to building a generic platform like ours, the time as developers we invest is insignificant compared to the amount of time that the users end up looking at the product. So if we end up spending an extra thousand hours developing a little widget that goes on the screen, but it can respond milliseconds faster, and we have 100,000 users, you do the math, and it actually makes sense. Right? But it is far more complicated to build real-time systems. Now, for those listeners who are technical, there's actually, in computer science, there's
Starting point is 00:19:08 a pattern. It's called online programming versus offline programming. It's not the same as being online, like going on the web, or offline, like you're stuck at home and your internet's down. So what online computing means is that you are reacting to deltas in the world. And as things are changing, you're doing minimal effort to update what's changed on the screen. So an example would be is I have a chart and that chart is showing me a million points, right? I'm viewing lots of data in a chart. Now, the way almost all GUI software I've seen being built is that when anything changes, you re-render that entire chart.
Starting point is 00:19:50 But that's a very inefficient way to do it. It's far more efficient to say, when a particular point changes, figure out really what changed on the screen and only update that component. But this becomes important as the size of data gets larger. And it also becomes important as the amount of time it takes to react becomes more and more critical. But here's the thing at the end of the day that this is going back to the culture of ThreeForge, which is we might as well had solved it. It's software.
Starting point is 00:20:23 We might as well have coded it as well as we possibly can to cover the most critical use cases that we can. And that's fine for use cases that aren't mission critical, but it doesn't work the other way around. You see what I mean? If you build front-end, if you build front-ends that are designed for real-time, then they can handle real-time. But if they don't near real-time, then that's fine. You can use it in a static way. And by the way, I will say that most of our use cases, many times they start off in what we would call a static capacity, meaning that people just want to view a chart. They want to see what their profits or their P&L was for the last few months or something like that, a static chart kind of overview sort of BI type thing. But then over time, that does morph into being able to get very real-time information and being able to have these workflows. And by the way, when I talk about workflows, I mean the ability for multiple people sitting in maybe different locations looking at the same data to be able to react to that. So, for example, I see a chart.
Starting point is 00:21:34 Something looks interesting. I want to be able to put a comment on that. And the moment I put that comment, someone, you know, you're sitting in another office. You should be able to see that comment. Right. Similar to like Google Word doc, you know what I mean? If you're familiar with that, right. And that's a real time concept.
Starting point is 00:21:53 Yeah. No, I think that makes sense. And I want to ask from a technology perspective, then like 3Ford started in 2010 and like the front end development industry or like the whole stack evolves so rapidly. So how do you choose the right technology to power a business decision? We should start real-time on day one. Right.
Starting point is 00:22:15 Well, I have been writing software since I was very young, in the early 80s. So I've been through a lot of this. And I've seen different platforms. And I've seen how things were built. And once I understood, and this is kind of an arcane term for a lot of people, but it's still what really powers anything real time. Once I understood how Ajax worked and the web worked, I knew that that was the future. This was in the late 90s.
Starting point is 00:22:46 And I made several predictions, frankly, all of which have come true, about where technology would go. I felt that, first off, computers, the CPUs themselves, would stop getting faster. Clock cycles would cease to get faster and faster, which, by the way, was a fairly controversial prediction to make. I felt that computers, you know, that we could not, you know, just were hitting physical limits. And the idea of getting to 10 gigahertz, 100 gigahertz, 1,000 gigahertz, which is what the trend was suggesting, that just wasn't going to happen. We had to level out at some point. But with that, the ability to do parallel computing, which turned out to be GPUs, that has gone way up, right? Now we see CPUs with many, many cores. So that prediction influenced how we ended up building scalable software. That's one piece of it.
Starting point is 00:23:47 Another piece was that I felt that memory would get larger and larger and larger. I should say the capacity would get larger and larger. And that has happened. Memory has continued to, I'm not sure if it's doubled year after year, I'd have to look at a chart. And when I talk about this, I mean both RAM and on disk. You know, it's continued to get faster and larger. And then when it came to front end, I felt that the web was the ultimate solution because the web does not require any install
Starting point is 00:24:19 on the desktop. It's a much more secure way of building applications from an end user's perspective. And one of the issues we have in finance with heavyweight front end applications is that developers have a tendency to put way too much logic in that heavyweight desktop. That ends up slowing down the user's desktop. If you're building your web application in any sort of reasonable way, you will avoid that. And so I knew that this was kind of the way to go. And in fact, I realized once you had AJAX,
Starting point is 00:24:57 you could actually go all the way. We could basically re-implement this concept all the way back from the 70s: X Windows. And X Windows is also, I think, a very, very clever technology. The problem is it was very Unix-based, so it didn't really, you know, gain a lot of exposure. But the idea of X Windows was you basically write all your code server side, and then it's just pushing GUI updates to the client side. It's actually pretty incredible that they had built this in the 70s, in my opinion. And that's really what the web is. And that's how we are treating a web browser. So I would say it was based a lot on looking at the history of how software had been built in the past, what had worked, and what wasn't working. And I felt that heavyweight applications just weren't working well.
Starting point is 00:25:47 And they had a lot of overhead. So most of the computation, or like all of it, is already done on the server, and it keeps pushing down updates in some shape to the client? We do, yes. And in fact, this is why we've had to build every single component from scratch.
Starting point is 00:26:05 Of course, we let people embed their own, but we built every component from scratch, designed to put minimum data in the web browser. And I cannot overstate how important that is. So, for example, just like Google Maps allows you to zoom in on a map, that's how our charts work. So when you look at a chart, you're getting a certain set of data. When you zoom in on a quadrant, that's automatically pushing just the data for that portion of the chart to the front end. Otherwise, you would never be able to view 20 million points of data. Take just about any charting application you've ever used and try to load 20 million points into it. It won't work. Now, you could argue that there's information fatigue by having that much information. But when you have mission critical situations, you don't want a large amount of data to be crashing your browser
Starting point is 00:26:49 and stopping you from proceeding. Another example would be a dropdown. Something as silly as a dropdown. Let's say you've got a dropdown field and you want to have a list of all the symbols, where it autocompletes. You do not want to push all 10,000 instruments, aka symbols, to the client. You want it so that, as a user types, it's fetching the matches and bringing just those to the front end. Now, of course, a lot of times that's coded manually, but we've basically made it available out of the box without writing a single line of code. Same thing with our tables, same thing with our trees, same thing with our heat maps, all the way through. So the idea is at no point when you hit refresh on your browser,
Starting point is 00:27:26 is it going to move anything more than the minimal amount of data that you literally are seeing on the screen. And as you scroll, we're pulling some tricks to make it feel very, very fast. But as you're scrolling around, it's pulling in the data based on where your scroll bars are. Interesting. So, like, what you're really engineering is this extremely high performing component library, which is not how most people design component libraries, even today. And they have to interoperate with each other really well. Correct, all the different components. Yeah. And that's actually, I think, one of the beauties about building all these components ourselves. I think one of the issues when you start gluing a lot of GUI,
Starting point is 00:28:07 off-the-shelf GUI components together, they actually don't interact very well. Different things are designed different ways. So for example, I'll just pick something here, like you hit Control-A, right? If I'm on a table and I hit Control-A, I want it to select all the rows. If I'm in a chart, I want it to select all the points.
Starting point is 00:28:25 If I'm in a heat map, I want it to select all, you know what I mean? I can go through each one of these. And a lot of times when you end up gluing different components together, it's like, well, Control-A works this way on this, and it does this on that, something else on something else. And then you hit the tab key and it's moving around and it isn't consistent. So by building all components from scratch, it actually provides a very, I would say, seamless or user-friendly experience. Beyond that, more importantly, there's no development work necessary, which does scare developers a lot. The idea that you don't have to write JavaScript or HTML, zero JavaScript or HTML, and you can build fully interactive real-time web front ends is actually a little jarring to some software developers.
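The zoom-and-scroll behavior described above, where only the rows or points under the viewport ever reach the browser, can be sketched roughly like this. This is a generic illustration of viewport windowing, not 3Forge's actual implementation; all names are invented:

```python
def visible_rows(total_rows: int, row_height_px: int, viewport_px: int, scroll_px: int):
    """Compute the [first, last) row indices currently under the viewport."""
    first = scroll_px // row_height_px
    # ceil(viewport / row_height), plus one partially visible row
    count = -(-viewport_px // row_height_px) + 1
    last = min(total_rows, first + count)
    return first, last

def fetch_window(rows, first: int, last: int):
    # In a real system this would be a paged query against the server;
    # here it's just a slice of an in-memory sequence.
    return rows[first:last]

# 20 million points exist server side, but only a screenful is ever shipped.
first, last = visible_rows(20_000_000, row_height_px=24, viewport_px=600, scroll_px=48_000)
window = fetch_window(range(20_000_000), first, last)
```

The data tier does the slicing; the client only ever holds the few dozen rows it can actually display, whether the table has fifty rows or 20 million.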
Starting point is 00:29:14 But that is the future, period. That is where this will go. No, I think that makes sense, especially if you want to have this high performance. You can't expect everyone to think in the same principles of, this has to work with 20 million data points. Right. And let me just step back. I'm going to go back to my dropdown.
Starting point is 00:29:39 Now, let's say we have a dropdown, and that dropdown has 50 elements in it, all of the states in the United States. And it works fine. And the developers are told, look, here's the spec: this is going to be a state dropdown. You're like, all right, 50 states. That's not that bad. But then two years later, someone says, you know what? We're now a global product. We're not just going to have states in here. We're going to do all regions around the world. And suddenly that dropdown with 50 elements has 5,000 elements in it or 50,000 elements in it. And now you've got to go, and this isn't a great analogy, but I think it's one people can relate to, which is now the
Starting point is 00:30:22 engineers say, well, I designed this for 50 elements. But the question is, why not design the component right once, so that it's designed to handle as much data as possible? So whether it's 50, 500, 5,000, 50,000, or 500,000, that component is designed to handle that volume of data. And so you don't have to worry about it. And that way all your components are built the same way. So we always try to worry about the worst case scenario and then work backwards from there. I wonder, like, often developers may say,
Starting point is 00:30:57 oh, that's over engineering. But I feel like if you have this design principle in mind, it's not actually that much more work. It's just a different design. Correct. It would be over-engineering if you were building it over and over again. If I had 10 select fields on the screen, I don't know why I keep picking on select fields, but if you had 10 select fields on the screen and you went and did this 10 times, that would be over-engineering. But if you build it once and you build it right, and you build it so that the entire world can use this component, then it's not over-engineering.
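The server-side autocomplete idea from the dropdown example can be sketched in a few lines. The 10,000-symbol figure mirrors the conversation, but the code itself is illustrative, assuming an in-memory symbol universe, not 3Forge's actual implementation:

```python
# Stand-in instrument universe: 10,000 hypothetical symbols.
SYMBOLS = [f"SYM{i:05d}" for i in range(10_000)]

def autocomplete(prefix: str, limit: int = 20) -> list[str]:
    """Return at most `limit` symbols starting with `prefix`, so the client
    never receives the full instrument list up front."""
    p = prefix.upper()
    matches = []
    for s in SYMBOLS:
        if s.startswith(p):
            matches.append(s)
            if len(matches) == limit:
                break
    return matches

matches = autocomplete("SYM0012")  # 10 matches come back, not 10,000 rows
```

Whether the dropdown holds 50 states or 50,000 regions, the payload the browser sees stays bounded by `limit`, which is the point of building the component right once.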
Starting point is 00:31:30 It makes sense. You're getting economies of scale there. Yeah. And it's also like, if building it right is 10 times more work than building it wrong, like, one, okay, it may be worth considering deploying the hack, but secondly, you should really think about why it takes so much time to build it right. There's probably something wrong with the abstractions then. Exactly. I mean, well, the thing is, at the
Starting point is 00:31:57 end of the day, you have to ask yourself, how different is this than other applications? And by the way, one of the reasons I started 3Forge, amongst what I talked about before, was I felt as I was moving from job to job, from company to company or department to department or use case to use case, it's like the use cases had a 90% or more overlap, meaning that they all needed the same components. Some of them needed them really fast and in real time, and some of them didn't. And some of them needed really large data sets, and some of them didn't. But at the end of the day, why not build something that can handle really large data sets and be real time and have all the, you know, the creature comforts that users expect? Build that once, build it right. And then no matter the use case, you can use it. Yeah, so you've
Starting point is 00:32:47 been working on Wall Street for a while before 3Forge, and then you've decided, okay, I need to build a company, because I'm seeing the same use case come again and again, right? But, you know, for an engineer in, like, Silicon Valley, or just in general, the whole idea of selling to Wall Street directly, it just feels, at least for me, it seems like it's too far away. It's too big of a stretch. Where do you even begin? How do you know yourself?
Starting point is 00:33:11 You've been there, so you know some of the use cases, but it feels like the sales cycle is going to be really long. It seems like there's going to be a ton of approvals you need to get. How does that all work out? Well, in fact, I would say that, fitting in with the whole concept I've been talking about around why not build it for the hardest use case possible, I've also said to myself, why not sell this to the hardest customers possible? Because if you can sell it to the hardest, most demanding customers, then that makes it an easy sale for others. You have that brand reputation. And that's the angle I've gone for. Because I think it's very hard to work the other way around. In fact, I know the name of this podcast talks about scaling. It's one of those things where it's really hard to build a piece of software and say, okay, we're going to sell this to mom and pop buy side small firms with 50 to 100 employees. And then later on, we'll scale this up and we'll
Starting point is 00:34:19 go and we'll sell it to Morgan Stanley. That usually doesn't work, but it can work the other way, which is you say, we're going to build this for the very complex, large commercial use cases. And then we can use that same software basically for smaller, simpler use cases. That approach works. And so that's really why I felt it was important to go in and sell to the tier one banks first, in order to, A, prove that it works to myself, but also to prove that it works to, at that time, our future customers, the smaller customers. So you start with the game on hard mode. The game on hard mode means that it's going to be tough, right? So what did you have to do? I don't know how many details you could share, but anything would be interesting. Yeah, sure. Well, I can tell you this. And it's funny, because I do believe in the concept of over-engineering or premature optimization. I get that.
Starting point is 00:35:26 But actually, the term is used way more than it should be, especially if you're going to be building generic software. And by the way, I look back on it and I'm like, I can't believe the leap of faith I took, because I sat there for months building these ultra-efficient libraries under the presumption that the whole world would be using these things. But I had no facts or evidence to actually say that would be the case. Now it is the case, but I look back on the fact that I sat there for months writing ultra-fast string libraries, as an example, right? The ability to process strings, to do fuzzy comparisons and things like that really, really fast. I look back on it, it's pretty crazy that that's what I
Starting point is 00:36:13 did, but I knew I needed to focus. So I guess what I would say is that in order to build a really, really good mousetrap, you have to start with very, very good ingredients. And I felt that I needed to step back, work on these individual components individually, make them the best of my ability, and then basically, you know, keep layering on top of that to get to where we need to be. So while most people would see us as a GUI sort of solution or a full stack platform, years 2012 and 2013 were spent on how to marshal a piece of data as fast as possible, so that I could communicate it between two processes, so that I could do elastic scaling. And that was the first two years of this product. Then after that, we focused on how do we do real-time caching, columnar compression, things like that. But these are the things that you start with. And then at the end of the day, you can put the window dressing on it and then go out and sell
Starting point is 00:37:19 this. And so that was kind of where we got: at some point, 2014, 15, then we actually had enough of a GUI that we could go out and start to market this. Now, of course, it's a very mature product at this point, but it's all sitting on those same underpinnings from 2011, 2012. It's fascinating how similar that story is to, like, the Figma founding story, where they just spent years and years making the best design product possible. And then once you show it to someone, so I imagine something similar would have happened in your case, where you can have the killer demo of, like,
Starting point is 00:37:50 here's an app that sends me 200,000 data points a second, and look how seamlessly you can add and, like, show these things and zoom in and zoom out. And it's, like, obvious to the buyer that there's nothing else on the market that works as well as this thing. Yes. And I have to say, it was quite magical when you plug all the pieces together and you get it there. It's a pretty awesome feeling. But it's actually exactly what you said.
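As a rough illustration of the "fuzzy comparison" string work mentioned a moment ago: one classic building block is Levenshtein edit distance. This is an assumption about what fuzzy matching means here, and real high-performance libraries are far more optimized than this sketch:

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum single-character edits (insert, delete, substitute) to turn a into b,
    computed with the standard dynamic-programming recurrence, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute (free if equal)
        prev = cur
    return prev[-1]

assert edit_distance("MSFT", "MSF") == 1    # one deletion
assert edit_distance("GOOG", "GOOGL") == 1  # one insertion
```

A tolerance on this distance is one simple way a symbol lookup can forgive a typo instead of returning nothing.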
Starting point is 00:38:13 That's probably our best demo, where we say, look, stream in 200,000 points a second. I don't know how you guessed that number, because maybe you've done your research. But that's literally the numbers we're looking at, between 50,000 and a million messages a second. You stream that into our system, and now you can build real-time GUIs on top of that. Not that you need to day one, but it's nice to know that you have that scale when you get there. And by the way, another thing I was going to say about, you know, people talk about premature optimization and things like that. What I have realized in my travels is that no matter how fast I build a piece of software, the end users will find a way to max it out. It just doesn't matter. You know what I mean? I don't
Starting point is 00:39:00 know how to put it. It's like, if I can make the software run five times as fast, they're just going to put five times more data in it. And if I can make it run a hundred times faster, they're just going to put a hundred times more data in it. And the thing is, you can scale. Scaling works. I mean, you can go from one server to 10 servers to a hundred servers. But you actually start to hit some limits. I mean, do you really want to have to have a thousand servers to solve this? You're much better off writing the software a thousand times faster in the first place. And that's realistic, by the way. There's a great interview from Steve Jobs where he talks about, you know, the best developers: you can write software that's
Starting point is 00:39:36 a thousand times faster than the developer sitting next to you, if you really focus on it and you're focused on high quality code. And so that's been our thing: yes, write it as fast as possible. And in fact, we still are constantly looking at the platform. And, you know, as we get more use cases and people use it in unimaginable ways, literally ways I couldn't have imagined it would be used, we go back and try to refine it and say, can we make this more efficient, et cetera. So that's really been the focus.
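The columnar compression mentioned earlier isn't specified in the episode, but one common columnar trick, run-length encoding a column of repeated values, can be sketched as follows. This is a generic illustration of the idea, not 3Forge's actual wire format:

```python
def rle_encode(column):
    """Collapse runs of equal values into [value, run_length] pairs.
    Columns (unlike rows) tend to have long runs, so this pays off."""
    out = []
    for v in column:
        if out and out[-1][0] == v:
            out[-1][1] += 1
        else:
            out.append([v, 1])
    return out

def rle_decode(pairs):
    """Expand [value, run_length] pairs back into the original column."""
    return [v for v, n in pairs for _ in range(n)]

col = ["NYSE"] * 4 + ["NASDAQ"] * 2 + ["NYSE"]
packed = rle_encode(col)          # [["NYSE", 4], ["NASDAQ", 2], ["NYSE", 1]]
assert rle_decode(packed) == col  # lossless round trip
```

Storing a column together like this is why a feed full of repeated venue or symbol values can be cached and shipped in a fraction of its raw size.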
Starting point is 00:40:10 Yeah. In terms of team composition and hiring the right people to work on this stuff, I think there's a certain mindset of an engineer that you need to have. What are your thoughts on that? Well, first off, I think it started with, I mean, at the risk of sounding a bit narcissistic, I mean, I have spent my whole life thinking about software and how could I write the best software possible. And in fact, it's funny, when I first started writing code, I got genuinely concerned. I mean, here I am, I'm 10 years old, and I realized I really like writing software. And I started thinking, but all the fun software is going to be written. And there's going to be nothing left to do unless I get really, really good at software. There's not going to be any developer jobs left
Starting point is 00:40:48 because all the good code is going to get written, and then everyone's just going to reuse that. There's not going to be any developer jobs. So I need to get really, really good at writing software if I want to have a shot at being a developer. That was my big miscalculation: there's now more developer jobs than ever, right? So I was wrong on that. I will concede that I was wrong on that. But it still set something in my mind from a very young age, that I need to be the best I can possibly be. And so I think what that has done as I've built my engineering team, it's almost like a sports team. And when you work with the
Starting point is 00:41:28 team and there's one guy who's trying real hard, then you try really hard. And then you realize, when you join a team and you see people are really taking it seriously and love what they do, and they're smart people, that motivates you to want to be smart and try hard. And it feeds into itself. It's like the most awesome feedback loop. And I don't really spend, I mean, I'm the CTO of the company, I don't spend a lot of time, I don't know, criticizing or cracking the whip, saying, go do this, go do that. I genuinely believe, and I'm sure many of the employees who will listen to this podcast
Starting point is 00:42:03 would agree that they like doing the work. And it's fun to work with people that enjoy and are smart and build good software. And so it feeds into itself. And I will say that I like to think that it kind of started with my attitude. And what I love is that I've seen that that attitude, I've been able to hire the right people that embrace that attitude. And maybe they had that attitude before joining the company, but I think we certainly all share that now. Yeah, I think founders definitely set the culture. And I think what people don't realize often is that engineers like working on high quality things. It sucks your energy if you are working on a product and it's buggy and it's like customers
Starting point is 00:42:48 are upset. And even though you're, like, okay, moving fast and breaking things or whatever, it still sucks your energy sometimes. It's like, I just want to work on something that works, right? Like, people underestimate that. Yeah, exactly. No, you know, and I've definitely been on both sides of the aisle. I mean, you know, you take over a piece of software that's really buggy, but maybe it's really profitable because it's a great business idea. But that's just, you know, you just feel like you're working on this sinking ship or something, I don't know how to put it. And that's not nearly as much fun as working on really high quality stuff. It's like, you know, it's like
Starting point is 00:43:26 working on an F1 race car. You know what I mean? You're working with the best engineers. You know, you're like, my gosh, here we are trying to just refine the engine and do all this analysis to make it a little bit faster, so the car goes a little bit faster. And that's, I think, the dream of engineers, to work on stuff like that. That's the kind of culture, at least, I'm trying to build. And at the very least, I think a team has to be composed of different people who think on that spectrum in a different way. If you have too many people who are just trying to optimize that last 1%, maybe you're not thinking as much about the business use case. And if you have too many on the other side, then you're going to release shitty software. So you need that mix. That's an astute observation. And that's
Starting point is 00:44:10 probably one that I've actually learned through 3Forge because I hired the best engineers. And there was a point in 2019 where we had no salespeople. I mean, it's pretty remarkable that we had some of the top tier one banks in the world using our software. We had no salespeople. We had no marketing people. We had no graphics people. All I had were math and physics and software engineers. That's what we use to basically build our product. That's what I felt. And then at some point you realize, well, by getting different disciplines and different inputs and different types of people, that actually helps a lot. And it might not be exactly how I think, but that complements the company incredibly. And I think that was actually a big learning point in the 3Forge story. And 2019 was a big year when we started to say,
Starting point is 00:45:05 look, we need to get different opinions and different disciplines in here. And then that allowed us, I mean, dramatically changed the company
Starting point is 00:45:13 and the customer profile. Yeah. And maybe, like, one more question on a similar note, but geared towards customers, perhaps: I'd love to know, you know, what were customers using before, like, a high quality piece of software? And I'm sure you think about 3Forge as shipping high quality software. Like, what was the state before? And, like, why do you think they're jumping on to a better product? I think it's varying degrees. It's varying degrees. Now, I think one of the biggest problems with, and I could pick different customer segments and talk about what they're using. And I think a lot of times products get abused in ways they shouldn't. I mean, I could say it's sad the number of times I find out customers have Perl scripts or Python scripts or shell scripts to basically manage data during an outage.
Starting point is 00:46:06 I mean, that's just, you know, I could spend an hour talking about just the problems and the likelihood that that's taking place within many organizations. But one of the big things is people write these scripts, and then they leave, and that's gone. So it's really not growing the company's intellectual property at all. It's just a guy or a girl writing a script to help them do their job, and then they leave. Something as trivial as that.
Starting point is 00:46:32 We have other customers that are probably doing way too much in Excel. So we help them move away. And Excel is probably one of the best software products in the world. It's incredible the things you can do with it, but it also has its limits. And so a lot of times we can help with that and we're replacing things along those lines. So I think I could pick
Starting point is 00:46:50 the different segments and talk about what they were using and what they've moved to. But I think the biggest thing for us, when we talk about the large customers, is that it's extremely siloed. And by that, I mean, if you take any large bank or any large company, the company has organized their software in line with how they've organized their employee structure. At least this is my opinion. This is Robert Cooke's opinion. Conway's law or something. Yeah. Right. And so literally it's like, okay, we have our equities department and we have our fixed income department and we have this and we have that. And so they build their systems siloed in that way. And then they build their support systems siloed on top of that.
Starting point is 00:47:46 And so if a customer calls up and says, I'd like to understand what's going on within equities and within FX, that's two different groups, two different systems. There's no way to correlate between the two. It's kind of like back in the day, you know, the fire department and the police department didn't talk to each other, right? Because literally it was siloed. Like you had the police department doing one thing, you had a fire department doing another, and the communication wasn't there. Imagine that sort of thing. And the thing that we focus on is basically building systems that cut across that. And so that's a big thing that we help with is replacing these, I would say, very siloed specific solutions for a particular use case with a more broad solution that gives
Starting point is 00:48:32 them kind of a dashboard that cuts across, I guess you could call it, all the different risk categories. So as a technologist, you are definitely interested in making sure your systems work as well as possible. Are there open source systems out there that you're excited about? Things that you think are changing the way we're thinking about data, or the way the market is thinking about systems, and how good the performance of certain systems could be. One that comes to my mind is Redpanda. It's like a replacement for Kafka.
Starting point is 00:49:10 I don't know if you've heard about that. But that's the kind of... I'm wondering if you have things that you're excited about or technology changes you think are going to be instrumental over the next few years. Yeah, I think... I'll be blunt. I'm not a huge,
Starting point is 00:49:26 I think open source software has its place, but I believe it's hard in a corporate setting. Now there's a lot of products that will kind of position themselves as open source, but then you really end up paying for it through support. And it's basically closed in terms of being able to contribute to that open source. So it's not, I would say, necessarily
Starting point is 00:49:49 open source. With that said, I think there's a lot of really good things happening out there. To me, what's shocking, though, you mentioned Kafka and other messaging solutions, is how many new systems there are. And to me, that's what's actually scary about this. I think at this point, we integrate with over 100 different types of databases and real-time messaging systems. We integrate with dozens of different entitlement systems. We integrate with many different browsers and smart desktops. That's one of the big things we focus on, is trying to be kind of this platform that sits across your organization. So we are always integrating. But to me, it's almost scary.
Starting point is 00:50:34 It seems like we're going on a bad path here, where there's so much innovation of different technologies that have a 90% overlap. I think it creates a lot of technical debt down the road. Now, at the end of the day, I also think that, of course, we don't want to stymie that. I think that there's also a lot of interesting things that come out of that. I really pay attention to it from an academic aspect. At the same time, there's nothing in particular that I'm saying, wow, that to me is a game changer. I think that makes a ton of sense to me. And finally, just to wrap up, right: what are you excited about, just generally, for 3Forge? What do you think the next few years is going to look like? What do you think
Starting point is 00:51:23 is a little scary? What do you think is going to be different? Well, I'll start with the scary part. Like I mentioned, the number of different systems out there is continuously growing. And that, to me, I don't know if I'd use the word scary, but it is worth taking note of, you know. I mean, there's so many different ways. And just, if you could imagine being in our shoes, and we come into an organization, we say, okay, you know, we're going to help you cut across your different groups. What are you using? And then they list 25 different database technologies and 15 different messaging systems.
Starting point is 00:52:03 And then they're saying, we want to query this system, bring the results in, query that system. When those results come in, now you're going to query those five different types of databases, bring that in. And then you're going to blend that with some message coming in from Kafka and some other, you know, TIBCO. And we have to orchestrate and bring all that together. And that seems to be just the trajectory: more and more software out there. So I would say that's one of the things that I think companies should be wary of, is trying to limit the amount of different technologies they're embracing all at once. What gets me very excited about 3Forge? So first off, I spent 10 years of my life building a platform and a concept that really, I'd say, has no equal. And I'm not saying no equal in terms that we have no competitors. That's not what I mean. I'm saying in terms of the approach we took and, I like to believe, the discipline we had. And, you know, there's many years in the beginning where it's like, is this even going to work? Are our customers going to see the value in what it means to be able to view large amounts of data across many different systems and take action on that, et cetera, et cetera? We now know that we have the data and we have the customer base to say with absolute certainty that,
Starting point is 00:53:31 yes, this is the right direction. It was time well spent. It was a good investment to spend all those years doing that. To me, we started doing a lot more marketing in the last year, and we're seeing a huge growth in terms of our customer base. And the thing that we're working on now, through events, and we did a global tour last year, I'm doing another one this year, we did visit India, three different cities, is building that developer community, so that developers can start to interact with other developers on our platform, and so that they can, you know, basically share their experience. And then I think the product takes on a life of its own. It's no longer about 3Forge pushing it.
Starting point is 00:54:20 It's about users and the developer community just getting together, kind of globally, and being able to adapt. Yeah, I think that's a really exciting, and scary, time for sure. I guess you could say that's scary in another way. Yeah, I've seen, like, all of the community driven or, like, open source systems, or just general systems with communities. Like, I feel like there's all the discussions about hard forks and making sure everyone's on the same page, and the infinite list of GitHub issues. Definitely an exciting time, but yeah,
Starting point is 00:54:53 I know we're at the end of the podcast here, but I could talk about software forking and all the issues of that. Yeah, that's a scary topic in and of itself, but we have focused on that. Yes, single fork,
Starting point is 00:55:04 single branch of code for 3Forge. Yes. Well, I'd love to have you on the show again in a few months, as you see this thing evolve, and,
Starting point is 00:55:15 especially, like, in retrospect: oh, what we discussed a year ago that didn't end up working out, and we did something else. I think those are the most interesting kinds of conversations. I'd love to have you on again.
Starting point is 00:55:25 And thank you so much for being on the show. Hope you had fun. Okay. Thank you. It was a pleasure.
