The Pragmatic Engineer - Software architecture with Grady Booch
Episode Date: December 4, 2024

Brought to you by:
• WorkOS: The modern identity platform for B2B SaaS.
• Sevalla: Deploy anything from preview environments to Docker images.
• Chronosphere: The observability platform built for control.

Welcome to The Pragmatic Engineer! Today, I’m thrilled to be joined by Grady Booch, a true legend in software development. Grady is the Chief Scientist for Software Engineering at IBM, where he leads groundbreaking research in embodied cognition. He’s the mind behind several object-oriented design concepts, a co-author of the Unified Modeling Language, and a founding member of the Agile Alliance and the Hillside Group. Grady has authored six books, hundreds of articles, and holds prestigious titles as an IBM, ACM, and IEEE Fellow, as well as a recipient of the Lovelace Medal (an award for those with outstanding contributions to the advancement of computing).

In this episode, we discuss:
• What it means to be an IBM Fellow
• The evolution of the field of software development
• How UML was created, what its goals were, and why Grady disagrees with the direction of later versions of UML
• Pivotal moments in software development history
• How the software architect role changed over the last 50 years
• Why Grady declined to be the Chief Architect of Microsoft, saying no to Bill Gates!
• Grady’s take on large language models (LLMs)
• Advice to less experienced software engineers
• … and much more!

Timestamps
(00:00) Intro
(01:56) What it means to be a Fellow at IBM
(03:27) Grady’s work with legacy systems
(09:25) Some examples of domains Grady has contributed to
(11:27) The evolution of the field of software development
(16:23) An overview of the Booch method
(20:00) Software development prior to the Booch method
(22:40) Forming Rational Machines with Paul and Mike
(25:35) Grady’s work with Bjarne Stroustrup
(26:41) ROSE and working with the commercial sector
(30:19) How Grady built UML with Ivar Jacobson and James Rumbaugh
(36:08) An explanation of UML and why it was a mistake to turn it into a programming language
(40:25) The IBM acquisition and why Grady declined Bill Gates’s job offer
(43:38) Why UML is no longer used in industry
(52:04) Grady’s thoughts on formal methods
(53:33) How the software architect role changed over time
(1:01:46) Disruptive changes and major leaps in software development
(1:07:26) Grady’s early work in AI
(1:12:47) Grady’s work with Johnson Space Center
(1:16:41) Grady’s thoughts on LLMs
(1:19:47) Why Grady thinks we are a long way off from sentient AI
(1:25:18) Grady’s advice to less experienced software engineers
(1:27:20) What’s next for Grady
(1:29:39) Rapid fire round

The Pragmatic Engineer deepdives relevant for this episode:
• The Past and Future of Modern Backend Practices https://newsletter.pragmaticengineer.com/p/the-past-and-future-of-backend-practices
• What Changed in 50 Years of Computing https://newsletter.pragmaticengineer.com/p/what-changed-in-50-years-of-computing
• AI Tooling for Software Engineers: Reality Check https://newsletter.pragmaticengineer.com/p/ai-tooling-2024

Where to find Grady Booch:
• X: https://x.com/grady_booch
• LinkedIn: https://www.linkedin.com/in/gradybooch
• Website: https://computingthehumanexperience.com

Where to find Gergely:
• Newsletter: https://www.pragmaticengineer.com/
• YouTube: https://www.youtube.com/c/mrgergelyorosz
• LinkedIn: https://www.linkedin.com/in/gergelyorosz/
• X: https://x.com/GergelyOrosz

References and Transcripts:
See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com. Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
Transcript
the entire history of software engineering is one of rising levels of abstraction. So what we're seeing
here is the rise of another level of abstractions, which gives us all these extraordinarily powerful
frameworks from which I can build systems, and which, as I alluded to, the architectural decisions
that were front and center for us back then are now embodied in these. So now it becomes a decision:
what cloud service do I use? What messaging system do I use? What platform do I use? That's the decision
which has a lot of economic decisions
and not just software kinds of decisions associated with it.
So I think the role of the architect in effect has changed
because now I'm dealing with systemic problems,
not just software problems themselves.
Grady Booch is a trailblazer in software engineering.
He built his first computer 57 years ago at just the age of 12
and is known for his decades-long work
in advancing the field of software engineering and software architecture.
He is the co-author of UML
and originated the term and practice of object-oriented analysis and design.
He's an IBM Fellow, an ACM Fellow,
and has been awarded several other prestigious awards for his work in software architecture.
He's the author of six books and more than 100 technical papers on software engineering.
In this conversation, we cover the first two golden ages of software engineering,
how UML was created and why Grady disagrees with how it has evolved since version 1.0,
how the practice of software architecture has changed over time,
Grady's views on large language models,
interesting stories like how Grady was offered the role of Microsoft's chief architect
but said no to Bill Gates, and a lot more.
If you enjoy the show, please subscribe to our podcast on any podcast platform and on YouTube.
It's safe to say I'm talking with a living legend in the field of software engineering,
so welcome to the podcast.
Emphasis on living. I'm not done yet. Yes.
Absolutely.
So to kick off, you're a chief scientist at IBM.
That's a pretty fancy title.
What does it mean and what do you do?
I'm curious to know.
Well, there was a time that my business card said I was a free radical, but upper management
didn't like that.
So I had to find something a little bit more tame.
Actually, the more important title slash position is that of a Fellow.
There are, I think, 68 of us still active.
No, 89 of us still active.
And this has been out of 350 or thereabouts fellows,
throughout the history of IBM.
So we're a fairly rare breed.
I was made a fellow upon the acquisition
of our company Rational Software back in 2003.
And the great thing about being a fellow
is it's rather like having tenure,
meaning: we trust you, you've done good things.
We want you to continue doing things.
Let's give you the degrees of freedom to do that.
And so as a fellow,
and with my focus upon first software engineering
and then later upon AI,
I am given a lot of degrees of freedom to pursue what I think makes a lot of sense, to try to, as Alan Kay would say, invent the future.
So in my journey starting in 2003, I first stayed with the Rational division, but then quickly moved over to research because IBM's bureaucracy realized I was a person who worried about the next five to ten years, not the next quarter.
And indeed, my very early work in research was looking at finding ways to automate the discovery of patterns within legacy software systems.
This is something we were doing pre-neural network days.
And it was interesting trying to see if we could discern the design patterns from the Gang of Four and elsewhere.
It never went anywhere because it was a hard problem.
And so I then began, in the architecture sense of things, to work with a lot of customers,
and I actually had been doing that for decades, where I'd be
parachuted in.
A customer would say, come help me, Mr. Wizard, with this particular architectural problem.
And the exciting thing about it for me is that for many decades, I was engaged in projects
across every conceivable domain.
And I'll pause there to say that roughly around the turn of 2010-ish is when I began to be
drawn back into the space of AI.
So when you say you worked on legacy systems, what does a legacy system mean to you?
Well, the moment you write a line of code, it becomes a legacy system until you throw it away.
So all code to some degree is a legacy system.
Facebook is a legacy domain.
Google is legacy.
Heavens, even OpenAI has a legacy problem.
Because the reality is that, as I often say, old code never dies.
You have to kill it.
Once you have built something that's useful, then it's going to live on.
And so unless you have fully disposable code, there is some body of code that you have there
that represents something immutable, something that has a cost, something that has some degree
of technical debt to it. It may be very, very small. If you look at many of the classic
organizations in the big financial space, these are groups who have been working with codebases
since literally the 60s. I had one engagement with the Internal Revenue Service in the United
States because they've been trying to modernize their systems since the 1960s. Now, let's go back in
time. What was happening in the 60s? Well, you had the rise of the population, the increase of
Social Security and the like, and more and more organizations who were creating lots of paperwork.
And so the banks and the government realized by the late to mid-60s, there was simply so much going on that you couldn't do it by hand.
Why did banks use to close at 3 p.m.?
Because they needed the time for the humans to reconcile accounts.
The IRS was very much like that.
And so there was during the 60s this period of automation of human processes.
And so most of that was written in IBM 360 assembly language.
Zoom back to today, there is still code in the IRS system written in IBM 360 assembly language running on
emulators upon emulators. But that creates really difficult problems because some of that code
embodies business rules that are within the assembly language itself. So how do you change that?
And the answer is you're faced with a real human problem of how do I transmogrify old code in COBOL
or assembly language so that it can work on modern technology. There's only so much you can do via
emulation. And then consider that the government, you know, makes new business rules every
year. How do you keep up with that? That's a real and present legacy problem. Facebook has the
same thing, although their code doesn't date back that far. Google has the same problem as well.
OpenAI will soon have that problem.
This episode is brought to you by WorkOS. If you're building a SaaS app,
at some point your customers will start asking for enterprise features like SAML authentication,
SCIM provisioning, and fine-grained authorization.
That's where WorkOS comes in, making it fast and painless to add enterprise features to your app.
Their APIs are easy to understand, and you can ship quickly and get back to building other features.
WorkOS also provides a free user management solution called AuthKit for up to one million monthly active users.
It's a drop-in replacement for Auth0 and comes standard with useful features like domain verification,
role-based access control, bot protection, and MFA.
It's powered by Radix components, which means zero compromises in design.
You get limitless customizations as well as modular templates designed for quick integrations.
Today, hundreds of fast-growing startups are powered by WorkOS, including ones you probably know,
like Cursor, Vercel, and Perplexity.
Check it out at WorkOS.com to learn more.
That is WorkOS.com.
This episode is brought to you by Sevalla.
It's a true Heroku alternative where you can deploy
applications, manage databases, and host static sites for free.
Sevalla is a platform designed for teams.
With its preview and pipeline features, developers can collaborate on any stack,
while being assured of the security of their workloads from staging to production.
Sevalla holds all the major security certifications companies are typically looking for.
Their application hosting offers automatic Git integration, Docker image deployments,
hibernation for optimal cost savings, vertical and horizontal auto-scaling,
TCP proxy support, and optional private network connections for your databases.
Their free static site hosting is perfect for landing pages, documentation sites, and more.
It also includes preview deployments for easier iterations and seamless teamwork.
Sevalla features an easy-to-use interface, unlimited seats, no hidden tricks, and transparent usage-based pricing
with enterprise-level Cloudflare DDoS protection for workloads of any size.
Sign up and deploy today. Go to sevalla.com. That is Sevalla with a
When you say you were parachuted in to help a bunch of different types of companies,
can you give a sense of what types of companies you worked with over the years, the decades,
to help with their architecture, their legacy code, their tech debt?
Every conceivable domain, truly.
I've had the opportunity to work with, obviously, the financials.
I've done a lot of work in the defense sector.
In fact, to go way back in time, complex systems really did not
start in the commercial realm, but they really began in the world of defense. The phrase I also
use here is that all of modern computing was woven on a loom of sorrow. What we see in modern
computing was born from World War II and the Cold War, particularly a system called SAGE,
the Semi-Automatic Ground Environment, that came about during the 50s and indeed was operational
until the 1980s. This was a system built in response to the Soviet threat of them taking
bombers over the Arctic and coming into the United States before we had satellites and
pervasive radar. And that was a system that precipitated the creation of what we
called the software crisis. It was what triggered the creation of the NATO conference
later in that decade in which a group of folks came together from around the world saying,
you know, how do we attend to this problem? And it was really at the peak of what I'd call the first
golden age of software engineering.
So we have defense systems.
I worked with a lot of real-time systems, everything from pacemakers to subway systems to,
gosh, what else, CT scans and the like.
Truly, you name a domain and I probably spent some time in it.
The James Webb Space Telescope currently uses the UML in its design.
Pretty freaking cool if you think about it.
Jumping way forward.
You've now been working with a lot of companies, been involved in a lot of projects,
and also influenced a lot of software architecture, the broader field.
How would you describe the field evolving over the decades?
You were clearly part of some key techniques being invented and becoming commonplace.
What was this like?
So I alluded to a phase I called the first golden age of software engineering.
This is the realm of the time of functional, or not functional, but algorithmic languages such as Fortran and Cobol, APL, LISP, and the like, although LISP was sort of a multimodal kind of language.
But the dominant way that we decompose systems was through algorithms.
And so you saw the rise of structured analysis and design techniques, which made a whole lot of sense at that time.
This is where you had the Yourdons and DeMarcos and Constantines and the like, because the presenting problem for software systems was they were generally not distributed.
They were largely monoliths.
And how could we build larger and larger systems that were sustainable and economically interesting over time?
Well, that first golden age of software engineering began to change as we started to see the rise of distributed systems.
And that rise, again, happened not in the commercial world, but it happened in the defense world.
The ARPANET was, you know, funded by the government, funded by DARPA.
As I mentioned to you earlier, I got my first email address in 1979 when there weren't
that many email addresses around.
In fact, one small story there, when we had the ARPANET in the air, I was teaching at the
Air Force Academy at the time. We had a little mimeographed document that listed the email address of
everybody in the world. I think at that time there were a few thousand people. So we knew
who everyone was at the time, which is pretty cool to look back on. So the first distributed systems
were happening in that domain. Indeed, back at Vandenberg, I worked on a system called the Telemetry
Integrated Processing System, which was a closed network of some 32 minicomputers.
Minicomputers were beginning to be a thing. And so the problems of how do I deal with taking a
larger system and breaking it up into multiple distributed parts was beginning to emerge as a
problem, but it hadn't reached the commercial sector yet. So we, the industry, began to see
that there were limitations to what one could do with algorithmic decomposition. And so there were
these pressures all around to try to attend to the next kinds of software, software that was
distributed, software that was real-time, software that was multilingual, software that worked on a
variety of computers. And I had to deal with all the normal aspects of distributed systems,
which is they're going to fail at various times. And I've got communication issues and the like.
So it led to a realization that we needed to think about software in very different ways.
The other thing that was happening is in research, you saw the rise of languages such as Simula and Smalltalk, which were looking at the world through fundamentally different lenses.
So here we are again in the late 70s. And again, I'm, what, a 20-something?
I was asked by one of my former teachers at the Air Force Academy, who said, Grady, would you go help the Department of Defense figure out how to use this new
programming language called Ada so we can apply it to modern software engineering techniques.
Now, why was the government worried about this? By that time, software was a real problem for the
Department of Defense, actually for all of the federal government, because there were several
thousand languages in use that exploded because Fortran and Cobol were useful for some things,
but not for all things. And so there was a decision made to build one language to rule them all,
and that was the Ada programming language. Ada was
far ahead of its time. It was a language that was influenced by Simula, Smalltalk, and others,
but it used the ideas of abstract data types from Liskov and Guttag and others. It used the
ideas of information hiding from David Parnas. All ideas that were very new at the time, but
frankly are part of the atmosphere in which we breathe right now. And so as an industry,
we really didn't understand the methodologies to make that work. Thus was born the
Booch method. Here I was, in 79 till about 81, going back and forth across
the United States, helping the federal government and helping contractors try to apply this
new language in new ways. And this was the beginnings of the second golden age of software
engineering, in which it was not so much the complexity of the algorithms, but it became a systems
engineering problem, systems that were dealing with distributed systems that were very new at the
time. And that was the essence of the Booch method. It was the things I learned about helping
organizations architect systems with these new kinds of domains and new kinds of languages.
Could you explain what the Booch method is? I understand it has to do with object-oriented programming,
but coming from you as the person who invented it, it would be nice to explain what it is and
why it was important. Well, let's go back to Plato.
talking about going way back.
I wasn't around then,
but I've read about him a little bit.
There's this wonderful treatise he wrote,
the dialogue,
in which there's a debate
about how one should best look at the world.
Should I look at it as atoms?
Or should I look at it as processes?
Well, the first golden age
of software engineering was more focused upon
the processes, the algorithms.
But there's a parallel way
of looking at the world
and that's looking at it through the atoms, if you will, the classes and objects within them.
So, yes, I was influenced by abstract data type theory, by Plato, by a lot of other interesting
philosophical things that were coming together at the time of looking at the world in fundamentally
different ways. So the Booch method was really trying to codify that. How could we decompose systems,
not based upon algorithms, but how do we
decompose it based upon classes and objects. And again, that's where I was influenced by Liskov and
Parnas and Dijkstra and Hoare and the like, names that are probably unfamiliar to students these
days, but they were representing the theoretical underpinnings of the first and second generation.
The Booch method was basically saying, hey, here's a new way of thinking about the world.
And so it said, look not at algorithms, but look at combining data
and processes, algorithms, together in one thing, thereby classes.
Now, we did some things right.
We did some things wrong.
I think the things we did right was classes make a lot of sense in terms of abstraction.
What we did wrong is we overemphasized the notion of inheritance.
Inheritance was all about, let's save code because we can build generalizations and the
like.
That proved to not make a lot of sense because we ended up doing lots of disparate kinds of
That's okay. Fast forward to today and people say, well, what difference does it make? And the answer is,
it's part of the very atmosphere in which you breathe and so you don't even think about it. You look at, you know,
Redis and you may build things upon it, but you know, if you look at it, you're really dealing with a set of
abstractions that Redis offers you and those abstractions are class-based. So they're baked into the
way of thinking of those kinds of systems. So in short, the Booch method was, let's look at the world not
through algorithms, but instead through objects and classes.
The last thing I'll mention is one of the things the Booch system, the Booch method hinted at
that really did not catch full form into the UML was looking at systems through multiple
points of view.
Now, we'll come back to that bit when I talk about Philippe Kruchten.
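An illustrative sketch, not from the episode: the contrast Grady draws between decomposing a system by algorithms and decomposing it by classes and objects might look like the following in Python. The banking example and the names `apply_deposit`, `apply_withdrawal`, and `Account` are hypothetical, chosen only to show the same behavior organized two ways.

```python
from dataclasses import dataclass


# Algorithmic decomposition: the data is a passive value, and the system
# is organized around the steps that process it.
def apply_deposit(balance: int, amount: int) -> int:
    if amount <= 0:
        raise ValueError("deposit must be positive")
    return balance + amount


def apply_withdrawal(balance: int, amount: int) -> int:
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount


# Object-based decomposition: the data and the operations on it are
# combined in one abstraction, a class.
@dataclass
class Account:
    balance: int = 0

    def deposit(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount

    def withdraw(self, amount: int) -> None:
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


# The same scenario expressed both ways.
balance = apply_withdrawal(apply_deposit(0, 100), 30)

account = Account()
account.deposit(100)
account.withdraw(30)
```

Both versions compute the same result. The difference is where the rules live: in the first, every caller has to know which functions go with which data, while in the second the class carries that knowledge itself.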
So just so I understand, because myself and most of us listening will have started our careers
long after the Booch method was invented: for us, classes,
variables, inheritance, those are pretty common, pretty everyday things.
But as I understand, it wasn't like that back then, right?
So could you talk us through what the environment, the technology was like?
So what made the Booch method so new, interesting or innovative?
Well, let me go even further back to the 1950s and show you a
parallel story. There was a time in the growth of algorithmic programming languages where the idea
of a subroutine was considered controversial. Why? Because doing a function call added at least two or
three more instructions, which was computationally expensive. So even function calls and decomposing
something into subroutines was viewed as an architectural aberration. And people opposed
it because it was inefficient. Well, obviously, we think, well, that was stupid because we need it for
our management of complexity. The same thing I think was true back in the days of early object orientation.
I mean, people were doing object orientation in algorithmic languages because you'd have these things
in COBOL called, you know, common data areas. People would devise, here's all this common data.
and as a matter of practice but not language,
you would say this data is used in this way
by these algorithms and vice versa.
In fact, going back to that project I mentioned to you
at Vandenberg Air Force Base with those 32 computers,
on the side of every one of those computers,
every day we'd see a printout of here is the common data pool.
And so it was the abstractions right in your face
because they changed.
And so people were trying to do algorithms,
or rather object-oriented decompositions, but the languages didn't support it. There was a need to do
that kind of thing. And until the languages came into play, there was no way to bring those ideas
together efficiently. And so, yes, the Booch method was very much a reaction to the forces upon
building software intensive systems to look at classes, trying to apply it with modern
languages and building a methodology around it. Today, we take it for granted because our languages
make it easy for us to do this.
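An illustrative sketch, not from the episode: the "common data pool" practice Grady describes, where data ownership is a team convention and not a language feature, could be contrasted with a class-based version roughly as follows. All names here (`COMMON_POOL`, `record_reading`, `TelemetryChannel`) are hypothetical. Note that Python's leading underscore is itself only a convention; the read-only `property` is the part the language actually enforces.

```python
from typing import Optional

# Convention only: a shared data pool that any routine in the program
# could read or write. Nothing stops other code from modifying it.
COMMON_POOL = {"frame_count": 0, "last_reading": None}


def record_reading(value: float) -> None:
    # By team agreement (not by the language), only this routine
    # is supposed to update these two fields.
    COMMON_POOL["last_reading"] = value
    COMMON_POOL["frame_count"] += 1


class TelemetryChannel:
    """The same data, owned by one class with controlled operations."""

    def __init__(self) -> None:
        self._frame_count = 0
        self._last_reading: Optional[float] = None

    def record(self, value: float) -> None:
        self._last_reading = value
        self._frame_count += 1

    @property
    def frame_count(self) -> int:
        # Read-only view: assigning to channel.frame_count raises
        # AttributeError because no setter is defined.
        return self._frame_count


record_reading(1.5)
record_reading(2.5)

channel = TelemetryChannel()
channel.record(1.5)
channel.record(2.5)
```

In the first form, the relationship between data and the routines that use it exists only on a printout or in people's heads. In the second, the language carries that relationship, which is the shift Grady attributes to languages catching up with the practice.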
It's just fascinating to think back how revolutionary it was compared to just how commonplace
it is.
And also to think about how things that we invent today and that are revolutionary in 20
years, people will be like, oh, yeah, that's commonplace.
Right.
Exactly.
One thing that you're known for, and you also mentioned you're associated with it, is UML.
Can you share how this was created?
What was the goal of it back then, who was involved?
And what was the need that it was solving at the time?
Right. So in 1982, two of my classmates from the Air Force Academy, Paul Levy and Mike Devlin,
Paul had been a roommate of mine at the Air Force Academy.
He was an economics major.
Mike Devlin was a computer science major.
Mike and I had a few classes together.
The first time I ever met Mike was in an unarmed combat course, by the way.
So not your typical thing you get at colleges, but hey, you know, I was trained to be a warrior.
That's where I first met Mike, and I think he beat the stuffing out of me, if I'm not mistaken,
in the pugil stick competition.
But here I was at Vandenberg Air Force Base.
Mike and Paul were stationed up in the Bay Area.
They were working at the satellite control facility, and I engaged with them because they had
one of the first large Ada projects that was going on.
so I went up and helped consult with that project.
The two of them also went to Stanford, and I think, I swear there's something in the water
at Stanford because they then connected with Art Rock and Hambrecht & Quist, the two
premier venture capitalists at the time.
Art Rock and Hambrecht & Quist were the key funders of Apple, and they also contributed to
the funding of what became Rational Software.
So in 82, Mike and Paul got together with me and said, let's start a company.
And we did. It was a company called Rational Machines Incorporated, whose intent was to build a software
development environment for this new, upcoming Ada programming language. We saw that to be an opportunity
in which we could make piles of money. And we built hardware at first because this was the time when, you know,
minicomputers were becoming affordable. You had Sun coming into play here. But none of them were
powerful enough to do the kinds of things that we were doing. So Mike designed a system. I helped build
the methodology around using the system. It was a system called the R1000, and that was the dominant
system used around the world for Ada at the time. Well, around, I don't know, it would have been
the early... jump ahead a decade now. Here we are in the mid-90s. And I was getting a little tired of doing
that kind of stuff and branching out in other places. And I found that there was traction that I was
getting from the Booch Method into the commercial sector. I was giving a bunch of lectures at the time.
And at one time, there was a gentleman in the audience who asked a really insightful question.
And afterwards, he and I met up, a guy by the name of Bjarne Stroustrup. And it turns out
Bjarne was working on a thing called C with Classes, which was the predecessor to C++.
The two of us got together.
We hit it off.
We found that we were doing very similar things together.
And in fact, it led to the two of us doing a lecture series around the United States where I got to know him quite well.
And this was around the time he wrote his first book on C++.
If you look at the first edition, you'll see he references a lot of my ideas,
and that's when my book on object-oriented design came out,
and I reference his work a lot too.
So the Booch method and C++ kind of grew up together.
Well, I thought this is interesting.
And I remember a particularly important meeting I had with Mike and Paul around the time.
It was at the Red Carpet Club of United Airlines in Denver.
And they met with me and said,
Hey, Grady, we're thinking of moving the company in the direction of embedded systems.
And I said to them, well, good for you.
I think that's a stupid idea because you're missing the commercial sector.
Go off and have fun.
I'm going to do different things.
That gave them pause.
And I think they realized, wait a minute, maybe there is something here in the commercial space.
We were finding that we were having challenges continuing to grow the business in just the
defense sector, which is what led them to that.
And so they then made the decision, hey, let's take the Booch method and make it
real, and thus was begun the beginnings of a system called Rose, Rational Object-Oriented Software
Engineering, our first tool. The first prototype, by the way, I wrote in Smalltalk. It was a wonderful
system. I wish I'd kept the source code for that around, but I remember making changes to it
just literal minutes before we did the first demo. And so that's where we sort of broke out from the
defense sector into the commercial sector. And it was a big hit because that was the
time when I think lots of others were recognizing object orientation was a good way of looking at the
world and C++ actually supported that. Well, that led to two things. Not just commercial success on our
part. We began to make lots of money and we began acquiring other companies and we started filling out
the software engineering lifecycle.
Can you just tell us what Rational Software did, this commercial thing that was a hit?
Yep. So Rational Software, the
Rose, Rational Object-Oriented Software Engineering, was a personal productivity tool, if you
will, that ran on an IBM PC.
It also ran on, I think we eventually moved it to a number of other devices.
It ran under Windows and was the first one that basically allowed you to, you know, draw UML
diagrams, not UML, but Booch diagrams, so that you could then reason about and think about
your design.
We did a little bit of code generation, but really it was just a digital
design tool to help organizations think about their designs. And, you know, people used it quite well
to document and specify and build their systems. Now, we started making lots of money off of it.
And so we started acquiring companies. We bought a requirements company. Ed Yourdon came to me
and said, go look at these folks. We did. We bought a small company out of Cambridge called Pure Atria,
which was led by a gentleman by the name of Reed Hastings.
Reed came to us, we bought his company, and he said, you know.
And we're talking about the founder of Netflix, right?
Yes.
Yes.
So we bought his company.
Small world.
Yeah, Reed realized he's a lousy CEO.
And so he took his money, hung around for a few years, figuring out what he was going to do.
And actually, that was a lot of the seed money that helped form Netflix.
It is a small world in that regard.
So at that time, at the peak of what we were doing, by
the late 1990s, IBM Rational was sort of dominating the space of software engineering because we had tools across every part of the development life cycle.
And this is where the ideas of incremental and iterative software development came into play.
Long before what today would be called continuous integration and continuous deployment,
we were already doing that with the Rational machine and our tools.
We had pioneered those ideas because we had built incremental compilation tools and the like.
So here we were in the 90s, and we had a whole set of tool sets around this.
But because this was clearly gaining traction in the marketplace, we weren't the only ones,
and we were seeing the rise of hundreds, if not a few thousand companies that were beginning
to try to do object-oriented kinds of things.
And this was the beginnings of the second golden age of software engineering, where you had
Peter Coad and Constantine back again and Yourdon again and Martin.
We had Ivar Jacobson and Jim Rumbaugh.
And so it was a very vibrant time where organizations were trying to say, gosh, we've got these great tools.
We've got the ARPANET.
We've got personal computers.
How do we build software for it?
So the presenting problem was in software, what is the best way to design systems using this very, very robust, very powerful set of tools we have at our hands?
And so Rational being in a very interesting space, we said, you know, we're kind of dominating the market.
Let's keep going here.
So we hired Jim Rumbaugh, and the task Jim and I had was to combine his methodology, OMT, the Object Modeling Technique, with mine.
We were sort of the two leading ones at the time.
And then we bought Ivar Jacobson's company because both Jim and I were using what was the idea of use cases.
Again, talk about something that's part of the atmosphere.
Use cases are just something you think about, but they were new at the time in the 90s.
They were an idea invented by Ivar in his work primarily at Ericsson.
And so we were working together with Ivar on building software for base stations for the burgeoning cellular telephone networks, which were a thing back at the time as well, too.
So here we have the three of us who were brought together by Rational.
And our task was, let's unify our methods.
Now, you could never have found three more different people.
I'm pretty amazed that we didn't end up with one of us in the hospital and one of us in jail.
We were so, so very different.
And I won't go into further detail on that except to say that I'm very proud of what we created.
And from that was born the UML.
So we decided, you know, this is not just ours.
We need to make it something that the whole world could use.
So we made the decision to release it into the Object Management Group, and thus was born
UML 1.0. I sort of drove most of that working with Jim and Ivar. I wrote the primary document for it.
Obviously, it was the three of us working together. I don't want to, you know, I want to give them complete credit, believe me.
But after UML 1.0, I was emotionally exhausted and wanted to go off and do new things. So I kind of walked away from it at the time.
I want to pause and mention one other person who was important here, actually two other people.
The first was Philippe Kruchten. So we realized that our work was so big, we couldn't just do the methodology and the notation. So Jim, Ivar, and I worked on the notation. That was the UML. And Philippe worked primarily with some of Ivar's people on the methodology. And thus was born the Rational Unified Process. Notice the emphasis upon unified. Philippe brought to the table the very important idea that we'd begun to see hints of with the Booch method,
and that is looking at the world through multiple points of view.
Philippe has this idea of the 4+1 view model,
which he grew from his work with building the Canadian air traffic control system.
Again, a very complex distributed system.
Those ideas were eventually made manifest in IEEE/IEC/ISO standard 42010
on architectural description,
which basically says if you're looking at an architecture,
you have to look at it from multiple points of view:
use cases, logical view, process view, implementation view, and deployment view.
And that's a very important and profound piece.
The other thing that came into play, another person that came into play was Walker Royce.
Now, Walker's an interesting guy.
You talk about a small world.
His father, Win Royce, who I had the pleasure of working with when he was at Lockheed at the time,
Win Royce was the gentleman who wrote the paper on waterfall life cycles, and his son was basically working on, you know, spiral models in the Booch method.
Win was misunderstood because he was not endorsing waterfall methods. He said, that's a stupid idea.
In fact, look at Parnas's paper, "A Rational Design Process: How and Why to Fake It."
It's from that that the Rational Software name came, by the way, which says at one level it looks like waterfall; inside,
no, it's what today we would call Agile.
So here we are, here we are, what, the late 90s, early 2000s,
UML 1.0 was in the bag and that was the life of Grady at the time.
Hey, developers, we've all been there.
It's 3 a.m. and your phone blares, jolting you awake.
Another alert.
You scramble to troubleshoot,
but the complexity of your microservices environment makes it nearly impossible
to pinpoint the problem quickly.
That's why Chronosphere is on a mission to help you take back control
with Differential Diagnosis,
a new distributed tracing feature that takes the guesswork out of troubleshooting.
With just one click, DDx automatically analyzes all spans and dimensions
related to a service, pinpointing the most likely cause of the issue.
Don't let troubleshooting drag you down in the early hours of the morning.
Just DDx it and resolve issues faster.
See why Chronosphere was named a leader in the 2024 Gartner Magic Quadrant
for observability platforms at chronosphere.io/pragmatic.
That is chronosphere.io/pragmatic.
And do I understand correctly that the goal of UML was to describe a system?
So when I think back to college with UML, we have the different boxes of different classes.
We have the arrows between them.
They describe relationships depending on what type the arrow is.
And when you look at this whole diagram, you get a sense of the structure of the software you're building.
Yes. So if you look at the very first line of the UML 1.0 standard, I believe it says something to the effect that the UML is a visual language intended to reason about, visualize, specify, and document the artifacts of a software-intensive system.
It says nothing about it being a programming language. In fact, I vociferously pushed back against
that. It was a language meant to think about and reason about a system, to think about the world in
object-oriented ways, particularly in ways that you looked at it through multiple points of view.
DevOps today, by the way, is simply an amalgamation of deployment and implementation views.
But we didn't call it that back then. We looked at it from those kinds of views.
And that's really where UML 1.0 was. It was meant to be, how do I think about these things? How do I reason
about them. And I always intended for, you know, you'd write a UML diagram and you'd throw most of them away.
Now, unfortunately, many people didn't do that. And in the move from UML 1.0 to 2.0, there was a faction
of individuals and companies who said, no, we want to make the UML very precise. We want to turn
it into a programming language. And that was, I think, a profound mistake. I never intended the UML to be a programming
language. But the net result of that was to make the UML much more complex, much larger. And the emphasis
then was upon not using it to reason, but to generate code and to reverse engineer. Now, the
reverse engineering I can get, that makes a lot of sense. But turning it into a programming language was a
mistake. And I think that began the decline of the UML because people were using it in the wrong ways.
At its peak, the UML probably had a 20 to 30 percent, you know, penetration
in the marketplace, which is pretty cool if you think about it.
And so I'm proud of looking at the systems in which it was used.
But most of all, I'm proud of the fact that it helped people think of building software in
different ways.
So when you say at its peak it had 20 to 30 percent usage, does this mean that about 20 to 30
percent of commercial developers were using it at the time?
And what time was this?
Around what time?
Yeah.
Here we're talking around
2000 plus or minus a few years.
Remember that Microsoft
was big in the midst of this as well too
that they actually worked with us
to take our Rose product
and make it a part of Visual Studio.
So we had a team that was working up in Seattle
to make that happen.
And it was a major selling point
for Microsoft at the time
because it actually helped their customers
build more complex software.
So, yeah, we're talking around 2000 or so.
But what else happened in that time frame?
The answer is the internet was, so the ARPANET had moved over to the internet.
We were beginning to see companies in the late 1990s move on to the internet, which was great.
But there was a challenge there.
There was first, how do I even build systems for distributed work in the web?
And how do I make money off of it?
What do those systems look like?
And so that's why Microsoft was interested, because we were helping their customers move from the PC to distributed systems.
But on the other hand, there was also a lot of hype that, you know, the internet's going to improve your sex life.
It's going to do all these kinds of things.
It was just total overinvestment in that space.
So a little after the millennium, we saw the, we saw a backlash,
and there was this great downturn in the marketplace
where people had built things and realized
they weren't necessarily
economically sustainable.
So now we are here in 2003.
IBM and Microsoft were still using our tools heavily
because they were important for their customers.
IBM and Microsoft both bid for us
and I think IBM won with some bid of $2.7
billion to buy Rational, and so we went over to IBM and became part of it.
And that made sense because at that time there were 3,500 of us in 14 countries.
We had reached almost a billion in revenues, which was pretty extraordinary for a company around
that time, but it was time to be absorbed. Now one other story I'll tell before I move,
before I pause again. Here I was, IBM had acquired us,
They made me a fellow immediately, which had never happened before.
It usually takes years of being part of IBM.
And a couple of months after, I got a phone call.
It said, hey, Grady, it's Bill.
Come visit me.
So I went up and flew up to...
Bill, Bill Gates, right?
Bill Gates, yeah.
Wow.
I'd done some things with Bill before.
And so Bill was at the time still CEO of Microsoft.
Bill took me into his office.
We had like a 30-minute meeting scheduled.
We ran for two hours, much to the annoyance of his staff.
And he sat down and said,
Grady, it's not public yet,
but I'm going to be moving out of my role at Microsoft
because I want to, you know, do other things.
And Grady, you know I've got two roles.
I'm CEO and I'm chief architect of Microsoft.
I'd like to give you that job of chief architect for enterprise.
And so I said, Bill, that's very interesting.
And so I said, give me a little time.
And I went around and met all of his, you know,
his main reports. Most importantly, Ballmer never met me. And that was a red flag. And it was around the time, too, I realized
that Microsoft was a particularly nasty company. You had the Office group and you had the, you had the Windows
group that just couldn't stand one another. So I eventually came back to Bill through his hiring folks and
said, Bill, I'm flattered. But you know, you have a profoundly dysfunctional company.
and I'm not the one to fix it.
So, Bill, thank you, but no, thank you.
I think I said something to the effect of,
it would only end in tears for both of us if I accepted.
So let's move on.
So I stayed at IBM.
And it was a good decision to make.
It would have been a bad decision for me to go.
And there's this cartoon about Microsoft created by a software engineer cartoonist,
Manu Cornet, about the organizations of Microsoft,
the two organizations, and they're holding guns
against one another?
Yeah, yeah, that's where it was. They needed somebody to knock heads. I'm,
I'm a lover, not a fighter, and I was not the guy to break things up.
Wow, what a story.
So I studied UML in college, and to this day, it's part of several college curriculums.
But interesting enough, in the industry, I've just not really seen it used for anything, at least the
companies that I worked at and the startups and scale ups and large companies I work with.
I kind of see a resistance to using it when it's brought up, claiming that it's too formal
because we do use architecture, right?
We use boxes and arrows and we diagram.
I'm curious to know, you know, we talked about how UML was used by 20, 30 percent of the industry
at some point, but what happened in your view?
So this leads us to, you know, contemporary architecture.
I've got a shelf full of books on architecture, both in the, you know, older ones and new ones.
And if you look at what a lot of people speak of as software architecture today, I think it's
reasonable and sound and there's good stuff there.
But in many of the kinds of systems these architects talk about, the architectural decisions
have largely been made for you.
I'm going to build a system that requires message passing.
Well, let's go find, you know, RabbitMQ or whatever,
or Redis or whatever I need;
the architectural decisions have been made for you.
So a lot of the activities of contemporary architects
is simply taking very large frameworks and components
and weaving them together,
which is a very noble and wonderful thing to do.
They also represent systems like Meta has done, in particular.
They have grown their architecture and system over a few decades now,
and the stuff they're building on top of it is
largely evolving and building upon those APIs, which does not require the deeper kinds of
architectural thinking. So I use this, let me give you an image here of a three-axis system.
Along one axis, you have levels of ceremony. If I'm a startup, then it's just other people's money
and no one else's, and heck, I can write disposable software, and if I fail, I'll just go find
another venture capitalist. Of course, I'm not going to worry about any kinds of degree of ceremony,
because just build it, go hire some brilliant people and make it happen. That's wonderful.
On the other hand, if let's say I'm doing something like, I don't know, building the next generation
intercontinental ballistic missile system, which uses about, I don't know, about half a billion,
half a trillion dollars, you bet you're going to use more ceremony because you have to have
degrees of accountability. So that's one axis. The next axis is that of risk. So if I build a system and
say, oh, if I fail, you know, so-and-so's not going to find their Grindr match. Big deal. On the
other hand, if I fail and somebody dies, that's a problem. And you're going to use a more
disciplined architecture. The third axis is that of complexity. If I build a system that people have done
again and again, then heck, I don't need anything. Heavens, this is where prompt engineering comes into
play. I go build an app just by building prompts because we built these things. I don't need no stinking
UML for that thing. On the other hand, if I'm building something that I've never built before,
let's say I'm building not an LLM, but I'm building a constellation of LLMs that work together,
and I want to weave them together with non-neural systems,
then you begin to think about architecture,
and that's where the UML comes into play.
There's a sweet spot for a tremendous amount of software development going into place
that doesn't need the UML and does not need any kind of thing like that.
But on the other hand, you go a little further out in those three dimensions,
and yes, there are people all over the place using the UML.
I mentioned the James Webb Space Telescope.
I still work with financial companies
doing that, where the risk, the complexity, the ceremony is sufficiently high that it demands
a bit more formalism.
So do I understand it correctly that you're saying that software architecture
has changed from the 90s and 2000s, when systems were new, architectures were new, software architecture
was still a lot more novel? And if we look at venture-funded startups and big tech today,
what we see is, one, they're just not as risky: if they fail, big deal.
And then two, we can use a lot more software that's out there and been architected.
And we can, for example, use Redis as a cache.
And it is there.
It works.
We don't need to think too much about it.
And then the third is that there's a lot of startups who just don't need ceremony.
Basically, they don't need audits.
They don't need formality.
they don't, they can just go like, all right, let's just do it however we want to. We don't need that kind of
auditability. Do I understand that these are the changes that are in play with software architecture?
And is this why some of these formal methods are just not as popular with startups and scale-ups?
Yes. In fact, I think it goes to the root of the economics of software development.
Let's go back to the first age of software, first golden age. The machines were
far more expensive than the humans. And so it required one to do some thinking before I even got to the
machine because machine time was very, very expensive for me. And so, yes, algorithmic decomposition,
structured analysis and design techniques made a lot of sense because we needed that kind of
optimization. If you move to contemporary times, computational resources are like water to a fish:
they're available to anybody.
I've got on my desk behind me,
I've got my own personal cloud
of four Nvidia single board computers,
which has more processing power
than existed in the world in the 1970s.
And that's pretty amazing.
That's a few thousand dollars there.
And so the economics have changed vastly
such that you don't need to think about it so much
because, heck, it becomes
disposable. AI, I think, is also changing that because it allows me to build things where I don't even have to
think about design. Heck, I don't even have to think about software. I just prompt it for those things to
build something, one-offs. And once I'm done, I throw it away. So in that sense, it comes back to the
economics. But there will continue to remain a class of software that's new, unique, breaking new ground
that still requires that kind of architectural thinking.
Because just recently I read how Amazon is using formal methods for AWS S3.
They published a blog post detailing, like, how they're doing it.
And they're doing this to catch those really, really edge cases that only happen in one in a billion, one in a trillion.
But at their scale, this is a regular reoccurring event.
And I found it fascinating how some of these
methods are making their way back.
Yeah. Well, let me set aside formal methods for a moment
because it turns out that's a different topic for which I have some experience and opinions.
But go back to Amazon, you go to their websites and they have a whole language around architecture.
Microsoft does as well too around Azure. It's the way I describe an architecture in Amazon.
It's those, you know, they're blocky diagrams. It says, hey, I'm going to build this kind of thing.
and here's my particular notation for it.
And furthermore, here, there are some examples.
So even Amazon and Microsoft
have recognized that architecture plays a role,
but there are enough times
people have done these kinds of things.
It says, oh, you want to build this kind of system,
then you want to use these services.
In fact, here are some examples for it
you can find on our website.
So without them really acknowledging it,
Amazon and Microsoft's view is that
architecture is still important.
But there are enough patterns
that one doesn't have to go through the process of rediscovering those because I can build those things.
And this is a representation of the, I think, the maturation of our business.
We moved from algorithms, which everyone can use, to design patterns to now architectural patterns,
which Amazon and Microsoft have codified themselves.
So let me switch over now to formal methods.
Formal methods have always been a thing.
However, formal methods in my experience have been a niche part of every software-intensive system
because formal methods only go so far in what domains they can cover.
And so what you'll see is what your example described. Microsoft began using formal methods in their drivers
to validate the correctness of their things.
Their hardware.
With their hardware, exactly.
Yeah.
And that was an important move forward.
I've been with projects that use things like, you know, I've got this system in which people might die.
Let's run a formal analysis upon it.
The thing is, though, that those formal methods don't deal with real world things because they don't deal with space and time.
They deal with functionality.
So I have always found formal methods to be of use, but only for parts of a system and never as drivers of the architecture itself.
Speaking of software architecture, these days the role of software architect is not really popular anymore, at least at the likes of startups and big tech. Instead,
we do have architects, but they're increasingly called staff engineer, principal engineer, distinguished engineer.
They do still do architecture, but there's a different focus on it.
Now, you were there when software architecture was created, when the first software
architects were created as a role. Can you tell us how you've seen this role be created and then
evolve throughout the decades?
So two things influenced my understanding of the space. First, I didn't
really call myself an architect, but I helped people design the systems they were building. I was
heavily influenced by a dear friend of mine, Mary Shaw. She's a professor at Carnegie Mellon. I think she won the
National Medal of Technology under Obama, if I'm not mistaken. And she wrote this really profound book called
Software Architecture, in which she began the first exposition of architectural patterns. Mary's
a delightful human being. And that's when I began to understand the formalizations of what
architecture could be. And the other thing that influenced me was architecture from other domains,
particularly civil engineering and the like. And in the defense world, there's
also, you know, shipbuilding architecture and airplane architecture and the like. So the term
architecture is one that's very well respected outside the software engineering world. Ignore the title
for a moment. And let's go back to first principles. What is it at all? And you've probably heard me say
this. All architecture is design, but not all design is architecture. Architecture represents the
set of significant design decisions that shape the form and function of a system where significant
is measured by cost of change. So software architecture, an architect, has this horrible emotional
baggage around it. So think of it as it's all about making decisions. What are the decisions
that shape my system? As an architect, that's what I'm doing. As the project manager or whatever,
I'm also about making decisions. But one subtle difference is that it's no longer just the
decisions about the shape of the software, but it's the shape of the system itself where it embodies
itself in the physical world, with the other systems and humans themselves as well. That's what it is.
It's all about decision process.
And have you seen this role change? Did you see a golden age of
companies where they were employing software architects that were empowering them to churn out
design? I'm seeing a bit less of this. I'm curious. Are you seeing something similar? Is this just a
bubble that we're seeing? Just because it seems you're embedded in a lot of different companies.
There's another sound bite I'll give you, which is that the entire history of software engineering
is one of rising levels of abstraction. So what we're seeing here is the rise of another level
of abstractions, which gives us all these extraordinarily powerful frameworks from which I can build
systems and which, as I alluded to, the architectural decisions that were front and center for us
back then are now embodied in these. So now it becomes a decision: what cloud service do I use?
What messaging system do I use? What platform do I use? That's the decision which has a lot of
economic decisions and not just software
kinds of decisions associated with it.
So I think the role of the architect in effect has changed because now I'm dealing with
systemic problems, not just software problems themselves.
And have you heard about the role called Solutions Architect?
I think it was created maybe 10 years ago.
And it's about people who are doing cloud architecture.
It's fascinating.
It's a role specific to the cloud.
And as you said, they make economic decisions.
What services do I use?
Do I use AWS or GCP?
If I use AWS, do I use EC2 or do I use another service?
Yeah, and that's why it's a systemic issue.
You generally, if you're a startup, you're going to hire somebody who's done that before
who knows where the skeletons are buried, who knows, who knows, you know, what the cost
of these things are.
And so you'll hire those kind of folks because they'll accelerate you because they've made
those decisions and they sort of know in the shape of what you're building now,
what decisions next make sense.
And they are systemic decisions
because they have economic and long-term
associated consequences to them.
One thing that comes up with software architecture
is migrations.
I do notice that most large companies
are being hurt by long-running migrations.
And these migrations usually happen
thanks to software architecture changing,
for example, from going from a monolith to microservices
or changing technologies, for example,
going from Node to Go, or upgrading major versions
of frameworks, for example, one version of Angular to the other one.
How do you think software architecture and migrations are connected?
And why do you think software migrations are just so darn hard and they don't seem to be
going away?
They are.
Migrations will plague us until the heat death of the cosmos, I believe, because there's
always, you're always building economically viable software, but then the technology is
changing out from under you that compels you to consider migrating.
You know, consider the migration from monoliths from the 60s, 70s, 80s, to, all of a sudden,
economically we had minicomputers and now distributed systems.
You're still not going to use an iPhone and put everything up in my mainframe, but those
changes compel you to consider an architecture that better balances where the processing takes
place. If all of a sudden I can begin to do edge inferencing on my devices, that's going to be
something that's going to change architectures as well. So there are always these changes both in
the hardware as well as societal changes, if you will, that impact the structure of my
systems that are the forces that impel me, compel me to do this migration. But why is it hard?
Well, another sound bite I'll give you is that the code is the truth, but the code is not the whole truth.
There is so much that is outside of the code that represents design decisions, their rationale,
why I chose this versus why I didn't choose that, subtle things as to why did I name things this
way because of its impact, that is long misunderstood. And so while the
code may be the truth, the problem with moving migrations is that you lose, there's a loss of
information. And it's difficult to try to recreate those design decisions just from the code. The people
who wrote that initial code you're trying to move, they may have, they've probably cashed out or
they've died or some combination thereof. And so you don't really know why those decisions were made
and so you're working a little bit in the dark.
Yeah, you're mentioning people have died, and I don't think we usually think about that,
but I guess when you're thinking about systems that are 40, 50 years old, that is the reality
of some of them.
What will the Linux kernel look like when Linus eventually retires?
So how will it drift?
Because he has provided a firm and much needed hand upon the conceptual
integrity of that system. And that's what the chief decider makes. He or she is the one that
provides that conceptual integrity. And when that person is gone, then you naturally see drift.
And that's inevitable. It's the, it's entropy. Software, all software exhibits some degrees of
entropy without adding that kind of force to it.
One thing that is very relevant today in terms
of technologies and architecture is AI.
This technology is here.
It's revolutionary.
It is disruptive.
Now, you've been in the industry for close to 50 years.
If you look back, how do LLMs and AI compare to the industry's past innovations and events?
Because for many of us in the industry, for those of us who've been here for 20, 30 years or so,
LLMs do seem like the biggest change in software.
But when you look back, have you seen something that was comparable in terms of impact or the pace of change?
That's a great question.
Yeah.
The first, I think, was just the realization that I could build distributed systems as opposed to putting all my processing on one.
That was a change that rattled everything in the way we built systems.
So at the point in time that all of a sudden, I could have a network of minicomputers and then eventually that carried on to microcomputers and devices at the edge like phones themselves.
But that transition in the growth of minicomputers was seismic in a way that I don't think people understood the full implications of until much later, because it required us to rethink the way that we put systems together.
So we saw there the rise of a great degree of uncertainty because we didn't know how to build these systems.
Do I have messaging across them? Do I use RPC? Do I use something else? What do I use to communicate?
Do I use shared memory? And so there was this period of exploration that we didn't know about until finally
we realized, oh, these are the common ways that work.
And can you remind me around what time this was?
Was this around the 80s?
Was it around the 90s?
This was the late 70s.
Because here you had, well, let's go back a way in history.
There was the thing called, gosh, what's the name of it?
I'm drawing a bit of a blank here.
But we first, here we are in the late 60s, early 70s,
in which mainframes were roaming the earth.
But even then, it was a realization that these machines were sometimes
being underutilized. And so there was born the formation of the first time-sharing systems.
The next thing that had happened around the same time were systems such as Whirlwind,
which came out of the Lincoln Laboratories, which was beginning to take machines and touch
them to the real world. So all of a sudden now you had real-time computing. So you had this
confluence of two very interesting things. You had large machines in which we were beginning
to develop time-sharing kinds of operating systems. And then machines
off to the side that were touching the real world. These came together in really the rise of
minicomputers, in particular Digital Equipment Corporation and the like, where miniaturization,
which frankly came about because of the investment the Department of Defense was making in semiconductors,
made it economically interesting to have a computer that could sit on your desktop that one or
two people could use full time. So it took kind of the ideas of the embodiment
of things like Whirlwind and then the distributed systems
people were building.
Now all of a sudden, I could take these things and I could start building them together.
And that's what was happening in the mid to late 70s,
that we saw the rise of those distributed systems.
There was one last thing.
There was the rise of client-server systems where you saw dumb terminals.
That's what IBM was doing at the time, the green screens,
talking to mainframes.
But the rise of those new machines meant I could move some of the
processing off to the edge as opposed to leaving it to the mainframe. So those are the forces that
led us to rethinking systems.
So do I understand correctly that programmers at the time,
they were used to working on these large computers and then distributed computing came in?
Absolutely. And it just completely changed the dynamics.
So youngsters came in. They started to
embrace this distributed computing. But existing programmers, they kind of stuck to
what they knew, what was efficient on the mainframe?
Was that what it was like?
It was.
And then when you had things like TCP/IP and HTML and all those things around it,
all of a sudden, we now had a vocabulary.
We now had mechanisms that we could bind these things together.
So I would not say that LLMs are as pervasively important as the rise of distributed systems,
but there's a parallel to it.
The second thing I'd say that is sort of similar to that is the rise of GPUs, which came from the gaming industry.
Because there they were solving a very different problem.
How do I deal with more photorealistic things in my games?
And we realized it's all matrix multiplications.
And so Nvidia began to move into that marketplace.
And we had the rise of GPUs, which dominated that space.
And it wasn't until Andrew Ng came and said, wait a minute, those GPUs
used for gaming use the same kind of mathematics as our deep learning kind of things.
And poof, all of a sudden, we had this perfect storm of lots of data, powerful
hardware, and the rise of interesting algorithms, backpropagation and the like.
So it was a perfect storm.
So yes, this is an exciting time.
There's no doubt that large language models are interesting, but we must be careful.
And we'll go into that in a moment.
So let's go into that.
I'd love to hear your candid thoughts about LLMs:
their applicability, the innovation, and the trade-offs that they come with.
Well, so to set the stage before people worry that I'm just pontificating about things I know nothing about,
I alluded at the very beginning to the fact that, when I was 12, 14, whatever, I was interested in what was burgeoning,
becoming AI at the time. That has always kind of stuck with me, and
I've always, you know, pursued an interest in that space. And it wasn't until IBM drew me back to it
around the time of Watson that I began to make it a full-time thing. So David Ferrucci, who led
the development of Watson, had called me in and said, hey, Grady, I'm going to give this lecture. I can't do it.
Would you do it for me? And I said, David, happy to do so, but only on the condition that I can
choose the topic. And the topic I chose is, what is the architecture of Watson Jeopardy? And as it
turns out, nobody had documented it. So being an expert in the architecture space, I sat down with
David's team for several months and documented the as-built architecture of Watson Jeopardy. It's not a neural
network system. It turns out to be a pipeline architecture that through a pipeline brought together a
number of statistical systems.
AI at the time was all about predictive statistical methods, not neural networks and methods
in knowledge engineering.
And so I documented the architecture.
That caught the eye of IBM management.
And it was the first time I had really documented an AI architecture.
And for those of us who don't know, this was in Jeopardy.
This was the IBM computer that played the game Jeopardy and won, if I remember.
It won. Yeah, we beat all the humans in that space.
And at the time, this was in mainstream media, in the press everywhere,
and this was shown as an example of an AI system that can outperform a human on a very human task,
which is this popular show Jeopardy.
Yeah, right.
It was part of the trajectory IBM had been on in that space because before that,
we had Deep Blue, which beat the leading chess player at the time.
Garry Kasparov.
Garry Kasparov.
Yeah, yeah.
So we did that through, yeah, we did that through brute force methods.
Brute force methods.
Then Watson Jeopardy came along, beating humans in natural language processing at the time.
IBM had already been doing things in that space with a system, again, statistical systems.
We eventually sold it to Nuance and the like.
But we were pretty dominant in the space of natural language processing.
And IBM Watson Jeopardy was the peak of that.
So IBM asked me, hey, Grady, this is cool stuff.
We're going to commercialize this.
Would you please help us do a study as to what IBM should do with this?
So I led a study for about a year in cognitive systems.
What should IBM do?
Now, this was interesting because I studied what Watson was doing.
I looked at what was happening in the marketplace.
Here we were, what, 2010-ish or
thereabouts, a little bit later. It was that decade. And I made it very clear to IBM management,
this is pretty cool, but be careful because there are things we know it can't do. And I was very clear
to our management: be careful about hyping it. Well, I remember in my meeting with Ginni when I was
doing the briefing about the project, Ginni was the CEO at the time. She asked me a question, and
one of the VPs stepped in and sort of started talking over me and said, well, yada yada.
And I politely said to him, well, thank you, but I think you're wrong.
And I went on to explain to him.
Now, if I were not a fellow, I'm sure I would have been fired on the spot.
But as a fellow, I get to say things like that.
Now, they did not invite me to join the Watson group.
So I was happy with that.
So I worked off to the side.
But, you know, it's sort of an I told you so that, yeah, there were things that
we can't do. By God, we can't do them. So I was working in the underground for several years. And I'll get to the answer of your question in a moment, but it explains why I can say something meaningful. So I was kind of an outsider. I was kind of doing my own thing inside IBM. And then we had a team down in Austin who had been approached by Hilton Hotels, who said, we'd like to build a robotic concierge for our hotels. And so we chose some robots from outside,
Aldebaran, a company out of the south of France. They built these three-foot-tall robots called
Pepper and smaller ones called Nao. And they built a robotic system that was kind of cool. It was a
question-answering system based on Watson technology. But there was something missing out of it.
And the team came to me and said, Grady, would you please help us out? So I did. I helped improve
the architecture. We actually installed it in a few Hilton hotels. But as it turns out, just down the
road from that team was the Johnson Space Center. And so we began to be involved with them. And I
thought, this is all great because I can get back to my space roots. So they had a system called
Robonaut 2, which was a humanoid robot that at the time was on the International Space Station,
a beautiful piece of engineering. And NASA was exploring the idea of human-robot interactions
for the mission to Mars.
So the group of us sat down and said,
what would be an interesting design problem for us
to explore that would propel us to look at hard problems
because we want to do hard things?
And we realized it was the mission to Mars
because they had two interesting use cases.
The first is because of the speed of light issues,
you couldn't rely upon mission control on the ground.
You had to take it with you.
This was the HAL problem.
We needed to build a HAL minus the kill-all-the-humans
use case,
which for some reason NASA didn't want.
I don't know, but it's their choice.
The second is, at the time, NASA wanted to put robots on the surface of Mars
to help astronauts, you know,
build their scientific experiments and build their habitats and the like.
So I remember one afternoon, I'd sat down and said,
I know how to architect this.
This would have been around 2014.
And I built a neuro-symbolic architecture
we called Self, and it used the ideas of Marvin Minsky's Society of Mind,
together with Rodney Brooks's notion of subsumption architectures,
together with Hofstadter's ideas of strange loops,
and those three came together forming this Self architecture,
which we then built.
And so we experimented with NASA for a while to build some software behind the Robonaut 2,
to do things like, hey, robot,
here is a scientific process. Go do this, or go clean these filters, which is a common station-keeping
activity. So we were actually building the software that allowed humans to interact with it and also
serve as a question-answering thing. That led us to Soul Machines, a company out of New Zealand.
They were an offshoot of the work that James Cameron had done in the movie,
oh shoot, what's the one he's done? He's done a trilogy
of them now. He had this great technology where you put cameras on the artist and it would
measure their muscle movements. They took that. They built a neural model of human musculature, and they
had the great hardware for it, but they had no software. We helped them in that regard. We then also
worked with a company called Woodside, an oil and gas company out of Australia, because they had a
problem similar to NASA. They were building systems for oil rigs, which is a very dangerous environment.
And so they wanted to build cooperative robots, just like NASA was doing.
So here we were with this system called Self that was, frankly, a neuro-symbolic system.
It had neural pieces, but also symbolic pieces based upon those three architectural decisions.
It was pretty cool.
At its peak, I had 35 people working on this.
We were across six IBM laboratories around the world, I should mention,
which made it possible for me to live and work out of Maui, Hawaii, which is where I still am.
But then IBM recognized that Watson was bleeding cash.
So they fired everybody.
And again, being a fellow, they didn't fire me.
So I continued in that space.
But that was my exposure to the first set of new kinds of architectures.
For about six years now, I have been working with a set of neuroscientists
to try to understand the architecture of the organic brain.
And so rather than, you know, software kinds of things,
I've been studying things like cortical columns and the loops that exist with the hypothalamus.
So I've been looking at architecture from the lens of a software architect into these kinds of
organic systems.
And so that now leads us finally back to your question.
What do I think about large language models?
The answer is they're pretty freaking cool.
However, they are by the very nature of their architecture, unreliable narrators.
That's what I say politely.
If I'm going to be impolite, I will say they allow us to build global-scale bullshit generators.
Because they are, clearly, they are stochastic parrots.
They do not reason.
They do not understand.
But they do produce some very coherent results because they allow us to navigate a latent space
that has been made very complex through training
on the corpus of the internet.
So that, I think, is back to the perfect storm.
We have large language models based upon data and the algorithms and the hardware
that allow us to build these coherent kinds of things.
But again, we have to be careful.
Gary Marcus and I have been vociferous and consistent critics of Sam and others who are saying
we're just on the cusp of AGI.
Well, I think that diminishes the elegance of
what AGI actually is within us humans.
It diminishes the beauty of what our human intelligence is.
And frankly, we're not going to get there by scaling.
We've already seen we're hitting limitations upon it.
Elon, whom I've gotten into it with as well on the internet a number of times over his particular views of the world on AGI,
he's been promising full self-driving for years now.
We're not going to get there because those kinds of models
are simply the wrong architecture.
If I've got to build a nuclear power plant to build my systems,
I'm probably doing the wrong architecture.
And so, yes, I think there are valid use cases for large language models,
but we have to be careful because they are indeed dangerous.
Yann blocked me on Twitter because at the very beginning of their work on Galactica,
I called him out on it and said,
Yann, this is great stuff, but do you realize the implications?
And he simply dismissed them.
And I kept calling him out.
He got tired of me doing so and eventually he blocked me.
But this is another I told you so.
We have seen the joys and the dangers of large language models.
So just to double click on this, it sounds like you're saying that large language models by themselves will not get us to AGI.
But do I understand that if we combine them with other tools like neural nets or neuro-symbolic
systems, we can actually get closer to these intelligent systems? And if we take this example
that you just said, a robot on Mars that is able to clear filters and follow instructions,
what do you think we would need to get to that type of intelligence?
Right. And I'd even add to this,
if you look at cultures such as Japan, where you have an aging population, a shrinking population,
and the U.S. is very much in this place, you have tremendous need for elder
care. And so the robotic use case, and not just the full humanoid robotic use case, comes into play.
There are clear and present needs for these kinds of things as well. So yes, there are lots of
great use cases. So I've always believed this to be a systems engineering problem, which is why
if you go back to the architecture of self, that's why I've been focused upon neurosymbolic systems.
Now, let me get a little philosophical for the moment. You, I'm
pretty sure you're sentient. I can't guarantee it, but I'm just guessing that you are,
and you're not some amazing large language model. And why do I think that? It's a fair guess. It's a fair
guess. Why do I believe that? It's because I have a theory of mind about you that through
multimodal ways and my interactions with you, I have a theory that says you're sentient.
You're someone who has your own agency, who has your own feelings and behaviors and needs and the
like. And that's cool. You're not a robot. And so the human mind has evolved over
millennia to develop that through an architecture that, yes, is neural in nature. But on the one hand,
artificial neurons are a shadow, not a shadow, they're an echo of a whisper of what organic neurons
are. They are a small, small piece of it. Furthermore, if you look at the architecture of
large language models, which as you observed are relatively simple, there's a simple layering to them
which ignores the exquisite complexity of the human mind. We have in our cerebral cortex,
cortical columns, we have tens of millions of them. They're in humans about seven layers deep. Reptiles
have them as well. They're a bit smaller. Those are the ones that appear to be the place where
our homunculus works, where we build the predictive models of the world. But
we also have places where emotions and other decision-making take place among these entangled
architectures associated with the thalamus and the like.
And furthermore, we have other things going on with what appears to be our hormonal system,
which appears to be another way of message passing within our neural networks.
So in that same regard, large language models are interesting, but they also are a whisper
of a shadow of, I think, what reality is.
And so the next level is,
and Gary and I have observed
that we're reaching diminishing returns
on what scale can do.
The OpenAI folks,
and Elon believe scale will continue
for a long, long time.
Gary and I are saying,
no, we're reaching a limit there.
You need to think about it in other ways.
That brings us back to those architectures.
The last thing I'll mention is
at the very beginning,
you talked about me being involved
with embodied cognition.
That brings me back to my roots again.
What is embodiment?
It's building systems that are in and of the world
that respond to the world and act in the world.
Large language models are largely unimodal.
They work through text,
maybe static images woven together
through video and the like,
but they're very sensory sparse.
Our minds and our intelligence have grown
because they've been embodied in the world.
In that regard, I think there are many kinds of intelligence that exist, but our human intelligence grew because of our embodiment.
It's going to require some fairly complex architecture to get that.
That's why I've been studying human architectures for six years because I don't know enough about them.
Let's see if that influences the way I build software architectures.
And as closing, you previously tweeted something really interesting around this, and I'll quote this.
You tweeted, we need a standard way of visualizing the architecture as well as the activity of LLMs
and, to generalize, any artificial neural network, sort of a UML for AI.
Now, this was a year back.
Have you seen anything happen here?
And why do you think LLM structure is so important?
In fact, I have both thoughts and I have seen movement.
So internally, I've been helping out with providing a little bit of architectural adult supervision
in our large language model work.
And I've been trying to figure out what's the way to visualize the kinds of things we're doing.
It turns out it's still very UML-like, in that these are now boxes that are systems unto themselves,
and there are message-passing systems along the way. So I'm trying to figure out the best ways to do that.
As you may know, just this week, AlphaFold released its source code and weights publicly. So I'm
jumping on that and I'm going to use that as a basis: can I describe the architecture of AlphaFold 3 using some UML derivative?
So stay tuned. I think there's work to be done in that regard.
And what advice would you have for software engineers who are just starting out in the software industry?
There's plenty of recent graduates who find themselves in a chilly job market.
There's a threat of what feels like AI tools changing how we do software engineering.
You've gone through a lot of cycles of innovation in software engineering.
What advice do you have for those starting out to set them up for a successful software engineering career?
Indeed, when Copilot and ChatGPT came out, I received a flood of messages from folks saying,
my gosh, have I chosen the wrong career because you're not going to need developers at all.
You will always need people who make informed decisions, no matter what the language is.
Software engineering, again, is one of rising levels of abstraction.
It's just that our tools have changed.
I learned to program in assembly language.
People today are going to be learning to program with languages that are at a much higher level of abstraction.
So the advice I'd give such folks is twofold. Don't worry. Don't be afraid. There's always going to be some
really cool work for you to do there. And in fact, I would say it's quite the opposite. This is an
exciting time because there is so much opportunity, so many cool tools, such comprehensive
computational resources that in many ways you are limited only by your imagination. And so what I
encourage people to do is to first learn as much as you can. Second, don't get stuck in just one domain.
You need to become an expert in some space, but the world of computing is vast. And so find some
space that nobody's in right now and go make a name for yourself there because there are lots of
those kinds of places there for you. And the
third thing I advise is go have some fun. I mean, my gosh, the toys we have at our disposal,
they're amazing and wonderful and cheap. There's so much a single person can do at such low expense
to go change the world. I'd like to also close by saying, I'm not done yet either. I'm having a
tremendous amount of fun. And in addition to the things I told you in studying the human brain and
working with large language models. There are two projects that I'm engaged in that I've been working
with for a long time now. The first is I've been trying to write a book on software architecture.
I'm glad I did not write it because I know so much more now than I did before. And this is a different
architecture book rather than saying, here's how you do it. I've been working with a number of
companies to document their as-built architectures. So I'd like to put AlphaFold in it. Photoshop
is in it. What's the architecture of Photoshop? What's the architecture of a climate monitoring system? What's the architecture of Wikipedia?
So I'm trying to document the as-built architecture of systems that people use today that they may never have thought about.
The idea being that there are many different architectural styles. Let me expose you to them because you may have studied only one particular one. The world is a vast one.
The second is, I was heavily...
And just to affirm, this is the software architecture's guidebook, right?
Yeah, the software architecture handbook, which my long-suffering editor has been patient with me on for literally a decade now.
And it's something I hope to finish before I die.
The second book is, I was on the board of trustees for the Computer History Museum for about 10 years.
And we'd hired a new CEO.
He came to me, and he had been at PBS.
We had a conversation, and I said, hey, John, why don't you do a documentary like Carl Sagan's Cosmos?
He paused and said, well, Grady, why don't you be our Carl?
And I said, I'm no Sagan, but that's an intriguing idea.
So I've been on a journey to do a documentary and write a book about computing and the human experience,
which looks at the history of computing and what it means to be human.
So I'm looking at what computational thinking looks like,
how computing has changed the individual, society, nations,
how it has changed science and art and religion and such.
And ultimately it asks the question,
in the presence of computing, what does it mean to be human?
So that's the larger project I'm working on right now,
which I hope to finish up,
also before I die.
And let's wrap up with some rapid questions.
So I'm just going to ask and then just shoot,
you don't need to think too much about it.
What was the first programming language that
you used?
Fortran.
Fortran.
What project did you commit code to most recently, and what language did you use in it?
Most recently, my own project, Self. Python.
Python.
What do you do to recharge from doing software engineering related work?
I live in Maui.
I wake up.
That's enough.
Yep.
Enviable place and also answer.
What are two books you would recommend for those who would like to understand more
about software architecture.
Mary Shaw's book, Software Architecture.
I'd read that one.
Other books kind of pale in comparison, but I'd start there.
Well, thank you very much, Grady, for being on the show.
This was really interesting and fascinating.
It was a pleasure.
Thank you for having me.
Thanks to Grady for this great conversation.
You can find ways to contact Grady in the show notes below.
If you enjoy this podcast, please subscribe in your favorite podcasting platform and on YouTube.
For some additional takeaways from our conversation, please see the show notes.
Thank you, and see
you in the next one.
