Microarch Club - 11: Robert Garner
Episode Date: March 13, 2024
Robert Garner joins for a fascinating tour of the last 50 years of computing, told through his experiences working alongside pioneers of the industry on projects like the optical mouse, the Xerox STAR workstation, Sun Microsystems' SPARC instruction set architecture, and many more. We also discuss Robert's work preserving and restoring systems at the Computer History Museum, and his upcoming book on the technical history of Ethernet.
Robert on LinkedIn: https://www.linkedin.com/in/robertgarner/
Computer History Museum: https://computerhistory.org/
Detailed Show Notes: https://microarch.club/episodes/11
Transcript
Hey folks, Dan here. Today on the Microarch Club podcast, I am joined by Robert Garner.
Robert has had a long and impactful career working at many influential companies,
including Xerox PARC, Sun Microsystems, IBM, and others. At each of these stops,
he has contributed to, and in many cases led, the development of computer software and
hardware systems that shape the industry today. One of my favorite parts of interviewing Robert is the stories that he shares about working with
a star-studded cast of colleagues and how they worked together to design and build machines
that were often well ahead of their time. As with most episodes, we start with Robert's
upbringing and exposure to computing before jumping into his career and the products he
worked on, spending the majority of our time discussing his work on the Xerox STAR Dandelion hardware and his
experience leading the definition of Sun Microsystems' SPARC instruction set architecture.
We round out our discussion with the restoration work Robert does today at the Computer History
Museum and his upcoming book on the history of Ethernet.
This was an incredibly
enlightening conversation and I wanted to thank Tom Lyon who initially introduced me to Robert
and has served as an incredible resource for computing history through his social media
accounts and appearances on the Oxide and Friends podcast. With that, let's get into the conversation. All right, Robert, welcome to the show and thanks for joining me.
No, thank you.
Well, it's an honor to have you here. I wanted to give a little bit of background on how we got
connected because I always like to shout out folks who helped me meet other people who have interesting backgrounds and
stories and histories. And I had reached out to Tom Lyon, who I believe you worked with at Sun.
And I was asking Tom some questions about an Oxide and Friends podcast episode where they
had kind of talked about the history of SPARC, and I had written some about register windows, which we'll get into later.
And Tom said, you need to meet Robert and connected us.
And since then, you've shared a lot of awesome information with me, so shout out to Tom.
But I'm super glad to have you here, and I think we're going to be able to cover a lot.
It's great to be on your podcast.
Awesome.
Well, let's jump right into it.
Obviously, there's a number of things at Xerox and at Sun that we want to talk about, but
I always like to learn a little bit about folks' background because I think that kind
of influences where you end up later on.
So what is kind of your upbringing and what kind of led you to get into processor design and computers in general?
Yeah, that's a way back question.
Yeah, so as a younger person, I was interested in two things.
Natural history, the outdoors.
We had a cabin in the mountains in Arizona where I grew up.
But also electronics.
As a kid, Popular Electronics gave you the belief you could just build anything,
just get a few components. Of course, I couldn't afford them. So I would devour
each issue of Popular Electronics and just hope that the components would come raining into my
house somehow. If anyone offered free components, that was a heyday. My mother would, on her way into town, this is in
Phoenix, Arizona, I would say, please drop me off at a library. I would read, consume,
you know, the books on physics and engineering there. I, you know, I was, I guess you'd call
a young geek. Nice. Well, what kind of age did you start to
actually get exposed to, you know, physical machines, um, and maybe start to write some
code and that sort of thing? Yeah. Well, that, that happened pretty young. I think
in this category, I, when I was in grade school, I think seventh or eighth grade, um,
I learned that some friends of my parents had access to a computer.
So I said, well, if I write a program, will you run it for me?
So I think I wrote a little tic-tac-toe program and they would go run it for me. And it probably wasn't as good as Bill Gates' program I've read.
But I had this sense that I could get access to real computers.
And that was very exciting.
Very exciting.
What was the programming model?
Was that on punch cards or were you?
To be honest, I don't know what the adult did, but he probably had to put it on punch
cards.
Right.
It was a GE 225 or 425 mainframe, the same computer that was being offered in timesharing systems and was abysmally
slow as a timesharing system. But I probably wrote it in Fortran or something.
Right, right. And as you kind of got up into higher levels in high school and that sort of
thing, did your schools have any access to time-sharing systems or anything like that? They did.
We were very lucky.
One thing to keep in mind is post-Sputnik, pretty scary when this little thing goes around in the sky for the first time.
Right.
There was a lot of funding for science and technology in American high schools, including in Arizona.
In high school, we had fantastic mathematics teachers.
We had access, we had a time-sharing system
that had access to a Scientific Data Systems
Sigma 7 computer.
We could bang away on that teletype
any time we wanted to.
Although we learned that
if you actually went to where it was,
you know, in a building downtown, they would let you have access to it at night. If no one was logged
in, they let us have the whole $20 million computer, which was really cool. So we would
run our little dumb programs until someone logged in, let's say at 1 a.m. or 2 a.m.
so that was great fun.
Right.
And were you mostly at that point in time programming computers exclusively
or were you also getting into building them
and continuing with your electronics tinkering?
Well, that's an interesting question.
One thing that happened when I was in high school
was I would visit Arizona State University Library.
And I would sneak in on a Friday night.
No one noticed I wasn't a student, I guess.
I would just walk the show. I knew where
the codes were
for the books on computers
and I would find the books on computer design.
You'd look at them and go, okay, this is pretty
straightforward.
I thought, well, do I really want to
build my own little computer? Some friends
would do that for science fair projects. They just seemed so wimpy.
It was like, why would I spend my time doing that if I can get access to the real enchilada, you know?
So, you know, in high school, as in any time, you focus on what's right in front of you, uh, the social scene. Right. You know,
um,
and we actually got a fantastic,
like I mentioned,
a mathematics professor.
We,
several of us actually scored fourth place in the nation on the national math exam.
Wow.
And the school right above us was a place called Palo Alto high school,
which we'd never heard of in California.
So high school was just a great,
great,
I started, I... I started...
I mentioned I was into natural history, so I actually founded the Saguaro High Conservation
and Ecology Club, which was the first ecology club advocate in the state.
Oh, wow.
We set up the first recycling center in Scottsdale, Arizona in the late 1960s.
So, very busy at that time.
Right, right. That sounds like quite a lot for a high
school student. But afterwards, you went on to do your undergrad at Arizona State, is that right?
Yes, I had some friends, we had a science club, several of us, and one of my friends in that club,
and we built 3D chess sets like there were in Star Trek. Right. Built tube amplifiers.
He got a job with a math professor there,
and I joined him,
who was doing 3D surface illustration.
It was on a Tektronix 4010 storage screen,
and we would write the code on a PDP-10 time-sharing system,
and the bits would come over the 300-bit-per-second,
300-baud phone line and display on the screen.
I realized we could record on a reel-to-reel tape recorder at a slow speed
and then replay it back at a higher speed to get faster graphics.
Right, right.
I wrote a little hidden line removal algorithm, which wasn't that great,
but it worked on simple surfaces.
That job was with Dr. Greg Nielson. I also had a job with
Dr. Barry Leshowitz in the psychology department, where he had a PDP-15, an 18-bit computer,
with D-to-A's, and they would send audio signals into soundproof booths
and torture the students:
how can you tell where something is
by just listening to the sounds
we did a lot of assembly language programming there
the PDP-10 gave me experience
in assembly language programming for the 10
so you know
if the courses were boring
I had two great jobs as an undergrad
and a place in the math department
where I could hang out.
So it was just wonderful.
I really appreciate Arizona State for doing that.
Absolutely.
And so were you studying math or were you studying computer science?
I was in engineering, electrical engineering.
Although ASU had a broad curriculum in engineering.
You had to take mechanical engineering.
There was even still a slide rule class at the time. The HP-35 calculator had just come out, and we were
like, whoa, wait a minute. But exposing us to mechanical stuff really helped later in
my career because you have to package the electronic components. You have to package
them. So you have to be able to communicate and understand what the mechanical guys are doing.
So in retrospect, I wasn't as bitter about it later.
They also let me take math classes in the math department instead of in the EE department,
and they would count toward my engineering degree.
So they were really flexible.
I really appreciated that.
Yeah, that's really interesting.
I personally had a very theoretical computer
science education, which at the time I enjoyed quite a bit. And I think, you know, in my short
experience of kind of like the academic system, it seems like, you know, over the last few decades,
there's been more of a rise in the theoretical computer science. And a lot of people really
enjoyed that at the university I went to. But I definitely, looking back, wish I had had more of
that holistic engineering experience and a little more, you know, hardware and perhaps mechanical
engineering perspective as well. And that's something I kind of, I hear a lot from folks
that maybe worked kind of like in the early semiconductor industry and that sort of thing,
that they had more of this holistic view of the computer.
And, you know, perhaps the rising complexity in computing is one reason why it feels more specialized now.
But that seems like a trend that I hear from folks.
Yeah.
And then after Arizona State, you went to grad school.
What was that decision like, and how did you pick Stanford, and what was being there like?
Well, I knew that to get really exciting and challenging employment, you really had to get a graduate degree. So I applied to many universities, and Stanford accepted me.
And I, hey, California.
And from Arizona, that's a shining thing on the hill across the horizon.
So I bid on that one.
It turns out my father grew up in the Bay Area,
so that was kind of a nice connection later on.
I actually applied there.
You mentioned theoretical.
I actually applied for the information theory curriculum.
I was really into information theory in college as well,
in addition to taking all these practical engineering classes, and they accepted me there.
When I came in, I was with Marty Hellman and all these guys, and I was like, this is really intense just for a one-year master's program.
So I asked if I could switch to computer engineering, and they said yes, so that was great.
So the program was still just one year, though?
Yeah, really nice.
The master's program is just one year, which is really fantastic because I really wanted to get out.
I wanted to get out working for some place.
I took a lot of lab classes in that year, and labs are very intensive time-wise.
So I was pretty overwhelmed, and I took Russian language again. I took Russian
language in high school. That probably wasn't the greatest idea because that was pretty intense.
And once I signed up for Don Knuth's algorithms class,
after the first two weeks, it's like the assignments were impossible. And it seemed
like it was two classes in one. So I dropped it, you know. A few years ago,
Professor Don Knuth, he's very active at the Computer History Museum, I had dinner with him. I said, Don, I took your class as a young
kid, and I felt it was two classes in one. He said, oh, that's great, because that's
the way I constructed it. I was so overloaded. Maybe if I had more time, I could have stuck
with it. But I loved his Art of Computer Programming textbooks. You know, many kids today, I'm sure, are not exposed to him,
and a lot of people don't even know who he is, but he's just a marvelous person.
Absolutely. Well, if it's any signal, I definitely know who he is, and I think
folks in my program, at least, also were certainly exposed to him. They published all his articles in books, and one of the books is on his humorous articles, the ones that are all just jokes and fun.
Oh, nice.
Analyzing all the possible vanity license plates you could put together, and he discovered.
So he has a great sense of humor, which is just remarkable.
Absolutely. Well, I'm sure once we
get to towards the end of the podcast, we'll touch a little more on your work at the Computer History
Museum, which I think folks will find really interesting. But, you know, you mentioned you're
excited to kind of get into industry and get out of grad school. You ended up landing at Xerox.
What was the process like getting there
and obviously around that time
a lot of really important technology
came out of there that has led
to many of the
products and things we have today
what was it like kind of
getting into Xerox at that time
well I think we were vaguely
aware of Park but for me it was dumb luck
I went to an on-campus interview.
You know, people, employers would come in and interview you, and Bob Metcalfe,
basically the inventor of the Ethernet, was interviewing me.
And, you know, it sounded exciting, the kind of stuff he wanted me to come work on there.
And I'll never forget, he called me to come work on there.
I'll never forget, he called me at my... I was staying in someone's garage.
There wasn't great student housing at Stanford at the time.
He called me at 11 or 12 o'clock at night.
That was probably halfway through his work day.
PARC was like a startup company.
He said, do you want to try to do something besides all that easy coursework?
I'm like, easy coursework.
Anyways, I knew he had a sense of humor.
He was the most amazing, as a first manager, that was his first management job, actually, as well.
So it was his first management job and my first manager.
It was a pretty special experience.
He's a remarkable person. He really cares about people and opens his heart and speaks what's on his mind.
Great person.
That's awesome.
So you joined Xerox, and you're coming onto his team, right?
And what was kind of the first thing you worked on when joining?
Yeah, this was 1976.
Well, first of all, I'll never forget during the on-campus interview,
he showed me this thing called the Alto.
And I have one sitting next to me on the right here.
And it had a bitmap screen, so you could do graphics.
And it also had this three megabit ethernet.
And you could send a file or a whole disk pack between two machines in what seemed to be almost instantaneous.
That impressed me, not so much the bitmap screen.
But he hired me to help commercialize the 3 megabit Ethernet,
which he and David Boggs had developed at PARC.
By that time, there were probably 50 to 100 Altos working on a 3 megabit Ethernet.
They had developed the first protocols.
The email systems were coming up.
There was a file server on the main time-sharing computer, called MAXC.
It was like being in heaven.
I technically was in what was called the System Development Division,
which was set up to productize the Alto.
But we were co-located with the people at PARC,
so I got to know all the people as if I was working there.
Butler Lampson.
So I started working with Chuck Thacker.
I basically, you know, what do we do next
and Bob actually set up
a goal to try to run at 20 megabits
per second
which seemed a little aggressive
we ended up running at 10
and we had analog people
in the group
there was about 4 of us
in the group that were going to design the entire
workstation
what was called the Xerox STAR 8010 professional workstation.
And I started, actually I started, the first Ethernet I did was for what was called the Xerox Dolphin, or D0, which was meant to be the productized version of the Alto.
Right. As a kid out of school, I learned rather quickly that Chuck Thacker, who's equal
to about 10 normal people, he was just remarkably brilliant. He could design something in one minute
that would take you half an hour. He wrote his own schematic editor. He pulled the system together,
and it was like three times as complex as the
Alto, what I would call the second-system syndrome. But my
first job was to build an Ethernet controller for that system.
And so you mentioned D0, and then there was
kind of a series of these D-star processors, right?
And there was D1, and then Dandelion was the one
that the star workstation ended up being built on, right?
Yeah, the sequence was the Alto, the D0.
The D1 was a slightly later machine.
It was pure ECL.
Okay.
With ECL logic, extremely expensive and a lot of power.
Like you didn't want to have one in your office.
Right.
It was probably like a 10-MIPS machine. That's my watch today, right? The Alto was probably a half-MIPS machine. The Dandelion, as you mentioned, that was the code name for the
hardware that we designed. It was, yeah, maybe a half-a-MIPS machine, but it was faster than the Alto because it had a faster memory system.
A huge difference between the Alto and the Dandelion was that in the Alto, Chuck had come up with a brilliant idea.
It turned out on the TX-2 back east, they had done this idea.
But you have a bunch of hardware.
You're really trying to reduce the cost of the hardware.
So he basically came up with a multi-threaded idea to share the CPU and the microstore memory with all the devices. So the Ethernet controller, the display controller, the disk controller,
low-speed I.O. had microcode that did most of the work there. And the hardware was just
small buffers, enough for a few words from the disk or a few words from the Ethernet.
And the devices would, like on a normal bus, compete in priority order, compete for access to the CPU. One of the
challenges with that Alto design is that your microcode could keep control of the
CPU as long as you wanted. You had to relinquish it, you know, you had to let
someone else fight for control of the CPU. And that was an interesting
experience for me because I was writing a microcode for the
mouse, for displaying the mouse on the screen at the time.
Oh wow.
And it would
stop working and I would go look at the other microcode and Chuck Thacker had
spent too many cycles trying to get his microcode to work.
So I said, Chuck, you're not giving me the cycles just for the mouse to track. And so he'd go fix his,
and then mine would work,
and then he would go change his again, and mine would stop working.
Right.
So Butler Lampson, who was brilliant at, I don't know,
everything at PARC: operating systems and compilers and languages,
and he got familiar with hardware there,
he came up with a design called Wildflower.
He actually wanted to have a design with like zero chips in it,
because he saw the D0 had gone one way,
and let's try to make it even more focused.
Right.
And he had an idea where you should schedule fixed rounds for the microcode tasks,
so the display always knows when it's going to run.
The Ethernet always knows when it's going to get a chance to access the CPU and main memory.
Brilliant idea.
Computers aren't generally designed that way.
Usually it's a bus, and like I was saying, you ask for service,
and if you don't get service, your buffer underruns or overruns.
Right.
It's like, why do these computers work?
I mean, they don't at a certain point.
This idea,
where there's guaranteed access to memory in the
CPU on a regular basis
meant that
if the processor met
its cycle time, I.O. would never get
an underrun or overrun.
That concept that Butler came
up with, we adopted that
in the Dandelion. It basically followed the ideas that Butler had in the Wildflower.
We implemented that in the Dandelion.
The Ethernet got two slots.
The display got one.
The disk got one.
Low-speed I/O was based on the 8085; the 8086 kind of later took over the world.
Right. So we knew, when we designed the Dandelion, that if we met
the cycle time goal, which was 137 nanoseconds (137, the fine-structure constant of physics),
that it would work. And what a great project to work on.
Right. So the way I'm kind of conceptualizing this is you're essentially, like, checking each slot each time through.
Is there a performance penalty?
Like, let's say you have a scenario where just one of those devices or peripherals, or whatever you like to call it, one of those slots, is the only thing that's active and doing anything.
Are you paying a cost for kind of, like, checking, or
having each of them kind of occupy a part of that loop?
Well, if you don't have any I/O to move to or from memory, you don't ask for your slot. Okay. And then if no one takes that
slot, it goes to the emulator, the language emulator.
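[A minimal sketch in C of the fixed-slot idea being described here, not the actual Dandelion microcode: each device owns positions in a short repeating schedule, a device with nothing to transfer skips its turn, and an unclaimed slot falls through to the lowest-priority task, the language emulator. The slot counts follow the conversation; the has_work() condition is an illustrative assumption.]

    /* Sketch of the fixed-slot round; not real Dandelion microcode. */
    #include <stdbool.h>
    #include <stdio.h>

    enum task { ETHERNET, DISPLAY, DISK, LOW_SPEED_IO, EMULATOR };

    /* Fixed repeating schedule: Ethernet gets two slots per round. */
    static const enum task schedule[] = { ETHERNET, DISPLAY, ETHERNET, DISK, LOW_SPEED_IO };
    static const char *name[] = { "ethernet", "display", "disk", "low-speed i/o", "emulator" };

    /* Stand-in for "does this device have words to move this round?" */
    static bool has_work(enum task t, int round) {
        return t == ETHERNET && round % 2 == 0;  /* toy condition */
    }

    int main(void) {
        for (int round = 0; round < 2; round++) {
            for (unsigned s = 0; s < sizeof schedule / sizeof schedule[0]; s++) {
                enum task owner = schedule[s];
                /* An idle device never claims its slot; the emulator runs instead,
                   making the emulator effectively the lowest-priority microtask. */
                enum task runs = has_work(owner, round) ? owner : EMULATOR;
                printf("round %d, slot %u: %s microcode runs\n", round, s, name[runs]);
            }
        }
        return 0;
    }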
Okay. So in the case of the Alto, well, it's not slotted like this, but the language emulator was always lowest priority.
So it ran the Nova instruction set, the data general Nova instruction set.
By the way, I have a Nova computer behind me there.
You can see.
That's from the Sol, the new machine.
Novas were around PARC before they designed the Alto.
And then they built a Smalltalk emulator,
the language Smalltalk that Alan Kay
and his group designed,
and Lisp. So you
could kind of tailor the CPU
for the instruction
set that you wanted. And for the Dandelion
we just did the emulator
for the Mesa language, which was
a
language
designed at Xerox PARC.
It didn't go anywhere after that, but it was the basis for all our machines,
a very strong type-checking language,
kind of some aspects of object-oriented programming.
Just a beautiful language, but it didn't have enough users to survive.
Right. So, looking at the PCB, the architecture, the chips on the board, for the
Dandelion, it was a bit-slice architecture, right? Yes. You know, we won't be able to see it if you're listening to
the podcast, but this is the actual CPU card. Wow. There are four Am2901s, four bits each, so this is
a 16-bit computer. And then the control store is at the bottom, 4K 48-bit words. And that is enough to cause a lot of trouble.
Right.
I mean, with a writable microcode,
with a writable control store,
what a field day for the language designers.
Right, right.
So they would run applications,
representative applications,
and re-encode the Mesa instruction
set, the bytecodes,
almost every week to see if they
could get the code size smaller because
in the Dandelion, we weren't sure
if it would go out with 128
kilobytes of memory, 384 kilobytes
of memory, or maybe even 512
kilobytes of memory.
There was just so much functionality.
It was a GUI.
Right.
There was email.
There was networking.
There was storage.
It was remarkable how much functionality you could put
into just half a megabyte of memory.
Right.
Just remarkable.
Now, there was no,
there was all one unprotected address space.
There was no user and kernel.
But it was, yeah.
So the classic CISC, complex instruction set computer,
they could redesign it on a weekly basis and get the code size down.
There was even one point where they implemented microcode swapping.
You could swap microcode in and out of the micro store.
Wow.
Wow.
You know, just kind of the pinnacle of CISC at the time.
Right. And so the reason why, kind of looking back, the computers were designed in this
way at this time, right, is because it was too expensive or not possible to have, you know, a
single integrated microprocessor, right? And so you mentioned the 2901, which I believe was kind
of the control chip. And then there were all of these 2900-series logic chips that you'd chain
together depending on whatever the width of the logic that you were doing was. Is that right?
Well, the 2901 was the ALU, so these are the ALU chips.
Everything else is just decoding the instruction, the program counter,
accessing the microstore, talking.
This has the microtasking hardware so that the different microcode tasks
can run, so you have to save the PC for all the microtasks.
Yeah, that's basically about it. But, you know, fundamentally, you know, we'll get to SPARC and
RISC, but, I mean, now's as good a time as any since we're talking about architecture. Fundamentally,
the people at PARC were certainly
aware of caches.
IBM had implemented some
caches on their very high-end
360s, the model 195.
But
caches would just add more cost.
Significant more cost.
And without a cache, you know,
you're limited to the main memory access,
which is, in the case of the Dandelion,
was three micro-instructions.
So during one of those slots I was telling you
where the microcode could get access to the CPU and memory,
it could do one memory operation.
So now you've got memory running three times slower
than your basic instruction.
Whereas in RISCs, you have single-cycle, if possible, loads and stores.
And that's because you have a cache.
You can get to the cache typically in one cycle.
And caches just were not in the cards at this time.
And no one knew what the best instruction set looked like.
So having a writable control store gave the flexibility to,
especially as a research institution at PARC, to explore that. What are some... of course there are an infinite number of programming languages
so it would have taken the age of the universe to answer that question, but
you know, as a research vehicle it was a great thing. Now as a
product, it meant you could also, in the field, upgrade the microcode if you needed to, if there were bugs or
you needed some new feature. There are some very complex instructions
for process switching, so there were lightweight processes
and there was microcode for procedure calls,
saving the arguments away and all that kind of stuff. It was a stack-based
architecture, Mesa, unfortunately.
And stack-based architectures have very slow execution times.
Right.
Compared to others.
In fact, Forest Baskett once came around with a little benchmark called Puzzle,
and he ran it on our Dandelions, and it really sucked.
We were like so slow.
It's like, these guys, you know, developed this beautiful language, but
no one ever looked at how slow it was. Right, right. So I learned a lot of lessons there.
And when you say that there were no caches and it was stack-based, so there's no registers
or anything like that, right? Yeah, there's just top of stack. And so typically, like in this design,
there would be, it would be saved in a top of stack register on the processor card and a little
small stack. So there was a small area of registers in the hardware that I forgot to mention earlier.
Gotcha. And so how did you, you mentioned kind of like,
you know,
reprogramming the microcode in the field
and that sort of thing.
How did you actually program one of these machines?
That's a really good question.
So these days we're so used to bits
coming in over the network.
Well, okay.
So during debug of the dandelion, we set up an Alto with an umbilical
cord talking to the I.O. processor on the dandelion, which could read and write the
microstore and could set breakpoints, so we could use that to debug. For the Alto itself,
it was meant to be just a ROM-based microcode, but they added to it a writable microcode so they could access that through another Alto to read and write, or maybe through a Nova to read and write that.
So that's how you could program the microcode.
In terms of programming at the higher level, the language and application layer, typically we would all do it via the Ethernet.
So it's got an Ethernet, so we could boot from the Ethernet.
You could load code from the Ethernet.
Ethernet was so amazingly fast, 3 megabits per second.
You've got a half-MIPS computer, but 3 megabits coming in.
It actually was too fast almost for the speed of the machines at that time.
And then, of course, customers, it was floppy disks.
I've got some eight inch floppies around here somewhere.
Right.
Is the, I know, I know Ethernet was, you know, obviously a big part of the story at Xerox and at Park and obviously has become a big part of the story everywhere. Was it kind of an instant success in terms of adoption,
or was there a lot of work to make it become ubiquitous?
What was it like around that time?
Yeah, that's something I'm investigating.
As we'll talk about later, I'm doing a history of the Ethernet.
And it was a struggle.
I mean, back then, there were so many ways you could pump bits over a cable.
There wasn't wireless, really, at that point.
I mean, Ethernet was inspired by the Aloha Net at the University of Hawaii.
But there were several other cable networks at the time, so you had to contend with, okay,
I got mine, you got yours, what's the advantage, disadvantage? But the reality was,
Bob Metcalf and David Boggs
made it work, and it worked.
It just worked.
You put a cable,
you just, you know,
this is my hand here, for those on the podcast,
this is what the transceiver looked like.
It's about the size of two cigarette packs.
You take
your cable, which I have in my hand, and you put a tap
on it. First you put the tap on, then you actually screw this in and actually drill
the hole by hand into the cable and you took all the pieces out. Then you took the transceiver
and you put it in like that. Then you put a cable up to the transceiver that goes to your Alto, which could be fairly long, actually, 50 meters.
And you're on the network.
You could send email.
You could get onto the ARPANET, the Internet, and send messages to your buddies back east.
It started out maybe as an option in 1973.
But by '74, '75, by late '74,
there were probably a dozen Altos working on it
and by 75 and 76 there were
several dozen. By the time I got there
in 76 there were over 100
and then Xerox started setting up Ethernet
at different sites and they used the
50 kilobit lines that
ARPANET was using and built gateways
They used the Novas,
actually a little faster than this one,
to be a gateway.
So they actually set up the first internet.
Most people don't consider the ARPANET an internet because IMPs all talk to each other with a common protocol.
But the first internet where you're going over somebody else's network
to talk to another
Ethernet somewhere else, they had, by the late '70s or early '80s, I mean, this will all be
in my book, but they had over a dozen different sites, fully, you know, Ethernet-ed,
interconnected via 50-kilobit lines. Wow. So Xerox was way ahead of its time, and people didn't realize
how good it was. Right, right. And I know that the Alto was built as, you know, kind of a prototype
machine, and there weren't that many of them, but then you all were tasked with building STAR
and kind of making it, you know, a commercial product. Was it successful as a commercial product,
or how widely was it deployed?
Well, to be honest, at an engineering level,
we thought it was successful
because it did so much more than a minicomputer of its time.
So a VAX minicomputer would be the size of a washing machine
and it didn't have a display.
The board to connect it to the Ethernet was, you know,
three feet on the side, and it was three times, four times as expensive. And because of
Butler Lampson's idea and Chuck's idea of compressing the hardware, we were the cat's meow in terms of
cost. It was extremely inexpensive and so powerful. You know, it gave us all the paradigms we have today. You know, Charles Simonyi wrote Bravo, the editor,
which became Word when he went to Microsoft,
a little unheard of program like that.
You know, Doug Brotz wrote an email program,
which we're all familiar with.
We didn't have browsers because there wasn't much on the Internet at that point.
I'll never forget the first person who tried to advertise something on the ARPANET.
We were like, don't do that.
So, you know, the Dandelion
and the Xerox STAR
from our perspective were very cost
effective, but from a user
perspective, they were too expensive.
So the STAR
went out at about $16,000.
I'm sure we discounted it,
but in today's dollars, that would be like
$30,000, $40,000.
Right.
And really, you needed more than just the workstation.
You needed the file server, which was another Xerox machine, another Dandelion, with a large disk,
80 megabyte disk.
You needed a print server, and they had a low-end laser printer that printed beautiful
documents,
one every five seconds or ten seconds or so, one page every ten seconds.
You needed a communication server, a gateway server,
to get to the ARPANET or to another site, another Ethernet site.
Maybe that was optional.
But still, you needed, in today's dollars, like two hundred thousand
dollars' worth of equipment. Right. And most people at that time, the so-called information executives,
whatever you want to call them, didn't know how to type. Right. And they had no idea what this weird
thing called a mouse was. Right. It was a freaky thing. They didn't know how to announce it.
um i'll never forget driving down highway 280 and hearing my first ad for a mouse on the radio from Apple. I just about drove off the highway. You know, a mouse on
the radio. So, you know, they didn't know what, why they needed it and why would they
spend that much money? I mean, their administrative assistants, you know, their salaries were
less than a hundredth the cost of this whole system. I could hire
100 secretaries, right, and I don't want to type. So, you know, they only sold about 20,000 Stars,
I think, in the end, somewhere around there. I personally saw a large site at the NSA.
They could afford it. They were very popular in Japan. It was the first Kanji workstation
there
Voice of America used them
because it was
multilingual, you could have different languages
fonts
and that
I remember reading one story about that
where the person
from some Slavic country, we won't mention
which one was afraid he was getting radiation from the monitor.
He refused to work on them because it was pretty scary.
Right.
So it was unfortunate that it's been pegged as a failure.
But 20,000 units for a minicomputer would be considered a success.
So it just didn't have the...
Next was Apple's Lisa,
which was $12,000 at the time,
$24,000 to $30,000 today.
That was too expensive.
Next was the Apple Macintosh,
$5,000 or $6,000 then,
you know, maybe $12,000 to $15,000 today.
I mean, would you spend $15,000
for a little thing,
a little tiny screen with poor graphics, you know?
Right, right.
So, you know, it took a while for the...
The IBM PC was announced the same year and month
as the Xerox Star in 81.
It was announced a few months later.
And fine, it was inexpensive,
but it set the whole world back from our perspective.
It turned every male into an IT administrator.
You're probably too young to remember the PC, but it had floppy disks.
And God knows, Charlie Chaplin was in all the ads that IBM ran,
and everyone thought they needed one to catch up with the world.
It set the world back 20, 30 years in terms of an easy-to-use interface.
I mean, the Xerox Star designers, Charles Irby was one of the main people there.
Oh, I can't remember all the names.
David Smith.
By the way, there is the last demo of the Xerox Star.
You can see it online.
We held it in 1998.
I had two Dandelions working, and they were on stage.
Wow. And they ran all the software, and
you can see what that looks like online.
But it was just too expensive, ahead of its time. Did, after that, I mean, I'm not familiar
with many other machines that came out of, or many other computers that came out
of Xerox. Was it just kind of determined that that wasn't the path going forward? Like, why wasn't
there more iteration, since y'all, you know, felt strongly? There was. It actually was.
Xerox gets blamed for not pursuing it, but they did. I mean, the next version was called Daybreak,
the 6085. I had one here, but I... Did I give it away? No,
I still have it somewhere. So it was smaller, you know, not instead of a foot and a half
wide by four feet tall and three feet... No, not four feet. By three feet tall and three
feet deep, you know, it was more compact. Basically the same architecture, just compressed
a little bit. But, you know, the PC was out, people were doing spreadsheets, you know. Finally
the Star got a spreadsheet.
That didn't sell that well.
They then ported the user interface to a PC.
I forgot what that was called.
But that, Global View or something, you know, that sold.
But now all of a sudden the data is incompatible with what everybody else is doing.
It's not Microsoft Word and Excel and PowerPoint, whatever.
So it never – they tried.
They put effort into it, but it never – being first doesn't always mean you win.
Right, right.
Absolutely.
Well, you mentioned the mouse there, and maybe I'll use this to kind of wrap up your
time at Xerox and PARC. I know you worked on, it was the sensor chip, right, in
the optical mouse, is that right? Yeah. When I finished the Dandelion work and
the Ethernet, I moved into Xerox PARC proper. I worked for Lynn Conway of the
famous Mead-Conway design method, her VLSI group. So I learned about chip development
then. And Dick Lyon was there and had invented this amazing version of the mouse. Let's see,
yeah, this is actually it right here. There's an optical sensor. I'm showing this to people on your blog so you'll have to
bear with me. But there were three lenses and what Dick had done is he designed
like a six by six array of sensors and it would, and you had, your mouse pad was equidistant spots.
So a print pattern with equidistant spots or maybe your blue jeans would work
and it would be like neural, a neural net.
It'd be cross-coupled inhibitions and enhancements.
And it, it would, it was set up to try to recognize the dot pattern.
So it would stabilize on a dot pattern, save it,
stabilize on a new dot pattern,
compare which way the dots had moved
and decide which way the mouse had moved.
Brilliant design.
And I basically came on after he had it working
and worked with him to try to work out some bugs and stuff.
But he was a very inspirational person to work with.
Amazing person.
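[A minimal sketch in C of the dot-tracking idea Robert describes, not Dick Lyon's actual sensor logic: save a snapshot of the dot pattern, take another, and find the small shift of the first frame that best lines up with the second; that shift is the reported mouse motion. The 6x6 array size comes from the conversation; the toy frames and the matching score are illustrative assumptions.]

    /* Sketch only: estimate motion by correlating two 6x6 dot-pattern frames. */
    #include <stdio.h>

    #define N 6  /* 6x6 sensor array, as mentioned above */

    /* Count dots in the previous frame that land on dots in the current frame
       when the pattern is assumed to have moved by (dx, dy). */
    static int match_score(int prev[N][N], int cur[N][N], int dx, int dy) {
        int score = 0;
        for (int y = 0; y < N; y++)
            for (int x = 0; x < N; x++) {
                int cx = x + dx, cy = y + dy;
                if (prev[y][x] && cx >= 0 && cx < N && cy >= 0 && cy < N && cur[cy][cx])
                    score++;
            }
        return score;
    }

    /* Try shifts of -1, 0, +1 in each axis and keep the best-matching one. */
    static void estimate_motion(int prev[N][N], int cur[N][N], int *mx, int *my) {
        int best = -1;
        for (int dy = -1; dy <= 1; dy++)
            for (int dx = -1; dx <= 1; dx++) {
                int s = match_score(prev, cur, dx, dy);
                if (s > best) { best = s; *mx = dx; *my = dy; }
            }
    }

    int main(void) {
        /* Toy frames: two dots that both move one pixel to the right. */
        int prev[N][N] = {0}, cur[N][N] = {0};
        prev[1][1] = prev[4][3] = 1;
        cur[1][2] = cur[4][4] = 1;
        int mx, my;
        estimate_motion(prev, cur, &mx, &my);
        printf("mouse moved dx=%d, dy=%d\n", mx, my);  /* prints dx=1, dy=0 */
        return 0;
    }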
You mentioned the Mead-Conway and structured VLSI.
That was a pretty big difference from the approach to designing the dandelion, right,
where you had all of these individual chips.
VLSI was now you were designing the chips yourself.
Is that right?
That was the goal.
Right. You know, Lynn broke the barrier to people learning about how chips worked.
I mean, we used to teasingly say, you know, she came up with the idea that all you needed was red, green, and blue pencils, and you could design any chip because, you know, if the gate was one color green,
and poly was, you know, red, so you just crossed red with green, you got a transistor.
Right.
And, I mean, it was a joke, right?
But they, so it made it very accessible.
And then we had the chip project, the MOSIS.
They set up the whole procedure where you could create a layout, submit it, multiple little pieces of the chip.
The chip would get fabricated and come back.
I did a chip like that. It was a very simple chip. It was a serial adder.
Jim Clark did a chip, the precursor to the Geometry Engine, which, of course, he later became famous for, for other reasons.
So I even later learned that people at Intel read the Mead-Conway book because it was really educational.
Back then you really had to lay out a chip by hand.
The tools weren't ready yet.
John Ousterhout hadn't quite... He was working on the Magic
CAD tools for chip design,
but they weren't ready yet. So you were very
limited in what you could do.
I actually hand-designed
a MESA chip, because I thought
we need to get MESA into a chip.
Everyone knew this was coming.
We needed to do it. But it just wasn't
coming fast enough.
When you get
a catalog of chips, you can build something right away, right, whatever. But with VLSI, you've got to
sit down, you've got to lay it out. There was one guy at PARC, I'll never forget, I walked into his office, we
worked, you know, until 3 a.m., late at night, and he had the chip editor up, Icarus, I think it was
called. He was starting in this corner and heading for that corner.
And I was like, that's no way to design a chip.
And they started a project at PARC called Dragon,
which was going to be eight custom chips to do a multiprocessor
with an amazing cache coherency protocol,
memory coherency and cache coherency protocol.
And I just knew they couldn't possibly accomplish that based on the set of the tools.
And that was one of the reasons why I left and went to Sun Microsystems.
Right.
So getting into Sun Microsystems there, I know you obviously were hugely influential
in the SPARC instruction set architecture design.
What was, you know, coming into Sun, is that what you knew you were going to work on?
Did they come in and get you to come work on that?
Or what was that process like?
Yeah, so I, there were some events that kicked it off.
Xerox PARC had an unfortunate thing where the lab director, Harold Hall, came down and kind of interfered with Bob Taylor
and much of the computer science
lab actually revolted and left.
And formed a new lab at
Digital Equipment Corporation, DEC
System Research Center, SRC.
And they kind of just shut
down PARC as we knew it.
Alan Kay and SSL people still stayed
for a while.
So it was time to look around.
And Sun, to me, looked like just another Motorola 68000-based workstation company.
I couldn't see why it would last any longer than anybody else.
It was tiny.
There were probably about 20 workstation startups at that point.
And Dave Patterson, who was a professor from Berkeley, was consulting with the Smalltalk Group, actually.
He got roped into consulting at Sun.
Bill Joy was in his classes.
He'd always say how Bill talked more than he did.
Bill talks faster than a racing airplane. And, you know, Bill had been famous for creating BSD Unix.
So Bill had attracted Dave Patterson to consult at Sun Microsystems, and
Dave Patterson knew my positive background on the Dandelion and suggested I might join them.
I was like, why?
One Friday night, my wife sent me to Tower Records.
Remember vinyl records?
We used to have to go to a store,
there was no Amazon,
to walk around,
maybe listen to them first, buy a record.
My wife wanted a certain record.
And in the store
that night,
Friday night,
at Tower Records, there were Eric Schmidt and Bill Joy walking up and down the hallways.
And they said, we want you to come join us.
And I go, why?
They twisted my arm.
We want to do a RISC, is what Bill said.
I said, that sounds cool.
You know, the Dandelion, you know, it had a 137-nanosecond cycle time, but I needed three of
those just to do one instruction. So why not do a RISC? And I thought that was a good idea. So if my wife
hadn't sent me there that night, I wouldn't have joined Sun, probably. Wow. Who knows what I would have
done instead. But when I started at Sun, though, I was surprised that it wasn't quite the ticket I had been sold.
Bill said, well, my boss,
Bernie Lacroute, who had been at DEC,
Digital Equipment Corporation,
he had managed the VAX system,
he knew what a RISC could be like.
Patterson had written articles
about how the VAX was too complex
and that's where the CISC term was coined
Bill said well Bernie has just asked
us to do a floating point accelerator because Sun was last in the list. If you're an engineering
company, why would you be last in the list in performance, right? So this has got to
be fixed. So Bill said, well, if I'm going to do a coprocessor, I might as well make
it a RISC, the fastest processor. I said, Bill, I didn't join this company to do RISC as a coprocessor.
Right.
Come on, Bill.
And it was a 68000-based company.
There wasn't NFS.
There was no way to have different binaries on the network.
And so Dave and I basically spent several months convincing Bill
that we needed to do it as the main processor, not a coprocessor.
And people like Dave Goldberg, who had been at DEC WRL,
they had done the Titan, an ECL RISC machine, there.
He was like, you don't need a 68000 in this anywhere.
Just do it.
And so finally, we convinced Bill to do it as the main processor.
And at that point, Bill said, well, it's got to be open.
And we're like, I don't care.
Because he knew that Xerox had failed by keeping everything closed and proprietary.
He was so familiar with the Xerox's experience.
And so he became the cheerleader for SPARC and supported it in a huge way after that.
And what was the process like of kicking off that effort and, you know,
designing an instruction set?
I know there was lots of inspiration from the RISC designs from Berkeley, but, you
know, in some regards, starting from scratch with a new instruction set, how did y'all,
how'd y'all begin and start to make decisions about trade-offs and that sort of thing?
Yeah, it was a very contentious process.
Obviously, with Patterson, we knew all about RISC I and RISC II.
And I basically, it was an open process.
So I looked at all the RISCs I knew about.
So the 801 had been relatively well written up in the literature.
It wasn't that secretive a project.
So I looked at the
801 instruction set, I looked at the Titan instruction set at DEC, which is a research
lab as well. There were other RISCs. ARM was just starting at that time. Pyramid was doing
something we didn't know all the exact details. We knew HP was doing something and we didn't
know the details because they were all close-mouthed about it.
But I looked at them all, and they fundamentally weren't that different.
And since Patterson was there and since RISC had been done in the open, no one could go after us for patents.
So I said, let's just stick with the Berkeley RISC, but we need to add some
things to it. We need to add floating point. We need to give it an MMU, memory management
unit of some sort. A lot of people were doubting Patterson because, oh, this is just a student
project. It's not a real world thing. But we knew how to make it real-world.
Dave was a consultant there one day a week in my office for four or five years.
Great person to have as a consultant.
He was a very fatherly figure.
He would break up fights.
So it's very contentious because when you design an instruction set,
you can put anything in it you want.
And Bill wanted tags, so people could do tagged
architectures, like Lisp. We put in some tagged instructions. They actually had a bug
in them. I guess our heart wasn't in them. I'll never forget going to Texas Instruments
and they just called me out there. I said, why do you want me to come? And I walk in
the room and there are all these dour faces.
You guys screwed up the tagged add instruction.
I went, oh, shit.
They like roasted me.
Right.
So it took two instructions instead of one to do what you wanted.
You know, I was like, oh, God.
Right.
So Bill was so, he was just like a fast dealmaker.
And he would say, all right, I'll stop asking for X and Y if you do Z.
Right, right.
And he would just interrupt.
So I formed what was called an architecture committee. So the great thing about Sun is that we had system people
and we had OS people.
SunOS was known already to be stable and reliable.
We had compiler people.
We had actually user interface people.
We had system people.
So I had all those people on the committee and we together, you know, I ran that group, and we decided what to put in and what to take out.
And Bill would interrupt at one point.
I actually shot him with a water pistol.
Bill, quiet.
And I'd go to Bernie and say, Bernie would say, when can we get done defining the instruction set?
And I'd say, well, if you get this VP out of my office.
He was so into it.
In retrospect, we loved it, but it was kind of disruptive.
And the reality is, at the same time, we had no freaking idea how we were going to implement it.
Because we couldn't start laying out rectangles from a chip starting in this corner going to that
corner. We had no full custom. I had little experience. We had no real full custom design
experience. So we went out talking to chip firms to see who could do something for us.
Most of them were based on standard cells, which are a great idea for doing general logic.
I wish Xerox PARC had done standard cells. They would not have fallen behind
so far.
But we couldn't find
anything that would meet
our speed targets. We knew we had to go at least
16 megahertz.
I know that
sounds slow, but that was
fast for the time.
And the Mead-Conway design method
couldn't achieve that either
because it wasn't designed
for speed, really. Pass transistors
and dynamic storage.
Luckily, we hired
this guy, Anad Agarwal, from
STC Storage Technology, and they were
trying to build an IBM
mainframe clone. And they had experience
with Fujitsu. And Fujitsu was doing
a gate array. And they just stepped up to and Fujitsu was doing a gate array and they just
stepped up to a 20,000 gate gate array with a four kilobit memory and one quarter of the die
and that was just enough to do a risk so we as soon as we had Fujitsu in place I could finalize
on the instructions to be launched right and I know the uh at the first public i guess
release of spark was v7 right and so what were all the what happened to the previous versions before
that they were my document version numbers people normally do 0.1 0.2 i said what is this decimal
crap i mean what's wrong with integers?
So I guess the marketing people made it sound like,
you know, it was cool.
But it was just the version numbers on my documents.
Every month I would update the version number.
That's pretty funny.
Yeah, fascinating.
There wasn't room in the gate array for a multiply, an integer multiply instruction.
I heard this, yeah.
But the RISC mentality was that
the compiler only really needed, most of the time, to just multiply by two or four, you know,
to shift addresses.
And so, you know, just shift and add
is what we put in initially.
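[A minimal sketch in C of the shift-and-add point, not Sun's actual code generator: when one operand of a multiply is a known constant, it can be expanded into one shifted add per set bit of the constant, so the common multiply-by-2-or-4 address scaling is just a single shift. The helper name is an illustrative assumption.]

    #include <stdio.h>
    #include <stdint.h>

    /* Multiply x by a known constant k using only shifts and adds. */
    static uint32_t mul_by_const(uint32_t x, uint32_t k) {
        uint32_t result = 0;
        for (int bit = 0; bit < 32; bit++)
            if (k & (1u << bit))
                result += x << bit;   /* one add per set bit of k */
        return result;
    }

    int main(void) {
        printf("10 * 4 = %u\n", (unsigned)mul_by_const(10, 4));    /* a single shift: 10 << 2 */
        printf("10 * 10 = %u\n", (unsigned)mul_by_const(10, 10));  /* (10 << 3) + (10 << 1)   */
        return 0;
    }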
For floating point, certainly there's no room
for all the floating point.
We knew we needed floating point.
So we had another whole gate array,
the floating point control chip, which just controlled a Weitek 1164 and 1165.
Weitek had just come out with a floating point multiplier and an ALU,
which did adds conforming to the IEEE standard, single and double precision, 32-bit and 64-bit.
One of the members of the architecture committee was David Hough,
who worked with Professor Kahan,
who spearheaded the IEEE 754 floating point standard.
And he strongly believed we needed extended precision.
You have more than 64 bits,
so that when people write their algorithms,
if they're starting to accidentally underflow too much, they
won't lose significant bits.
They'll be held in that extended precision thing.
And we're like, really?
But he insisted so hard that we defined
it. Also because Motorola,
they were designing their 68881
floating point coprocessor. We knew
they were going to do 80-bit.
So I was like, we can't be worse than them.
So we put 80-bit in the architecture, but we never, ever implemented it in hardware.
So it was always done in emulation.
But that's the kind of thing that happens.
Right.
I know SPARC stands for Scalable Processor Architecture.
What was kind of the, you know,
you're talking about optimizing... Actually, if I can interrupt you, it actually didn't stand for that in
the beginning. Oh, really? No. When we started the project, we said, we're going to have to have a name for this
thing, you know. So Will Brown, who had done the instruction set simulator, okay, he wrote a
little C program. You could feed it a bunch of buzzwords, and it would print out all the combinations of the buzzwords.
So it was just one of those.
Well, if you looked at the listing, one of the things that came out
was Sun's Processor Architecture for RISC Computing.
So that's what SPARC stood for originally, Sun's Processor Architecture for RISC
Computing.
And then marketing got a hold of it?
Is that what happened?
Well, actually, Wayne Rosing realized, if we're going to open the standard, how can it be an open standard if Sun is in the name?
Right. So he changed the acronym to mean Scalable Processor Architecture.
Okay.
And was the change accurate for the goal of the architecture, though?
I know later on there was embedded variants and things like that.
Was it a goal from the beginning to be able to scale in terms of fitting lots of different size systems?
Well, yeah.
We actually thought that what scale meant is that it could scale with technology.
Okay. Not with the systems.
I mean, it turns out that you're right.
I try to tell people this, that the reason that SPARC lasted so long and was so successful per se was because it enabled us to design our own I.O. interfaces.
You know, multiprocessor coherency protocols, I.O. buses, memory interfaces.
You know, the Intel ecosystem was way behind us.
Right.
So it wasn't so much the instruction set itself, you know, sure, it has some interesting aspects,
but enabled us the full flexibility to design a system
the way we want it. Andy Bechtolsheim
and Bill, his
architects, and then all the other people,
great architects we hired,
had much more
freedom to design
a coherency system.
I mean, I mentioned the
Dragon project at Xerox PARC.
Right. Really novel. Pradeep Sindhu and Jean-Marc Frailong.
Pradeep went later and formed Juniper.
That system had a very interesting memory system.
It was packet-based.
You'd send an address out to the memory system
and get your answer back later, asynchronously.
So really slow from a uniprocessor
perspective. Oh, and by the way,
what happened is, I was going to finish that story,
is when I went to Sun
a few years after I was there,
we announced SPARC
in '87.
In '88 or '89,
all of a sudden, these guys from Xerox
PARC were walking around our hallways.
The guys I had been working with on the Dragon project at PARC.
And I go, what are you guys...
Pradeep was there.
What are you guys doing here?
Well, they had realized they couldn't possibly
get the chipset done at PARC,
so they'd come out and reached out
to work with us at Sun for doing the Dragon chipset.
Wow.
Which was then called Sun Dragon.
And the irony is that Chuck Thacker
had asked me to work on the I.O. chip for Dragon
at Xerox PARC when I left.
When they came back
to Sun, I ended up getting assigned
to work on the I.O. ASIC for the
Dragon. For me, it was a
full circle because Xerox had decided
I think I forgot to mention this actually,
before they ported the Star software
to an x86, they ported it to a Sun workstation, actually.
Okay.
So they had decided... A Spark workstation.
I'm sorry.
So they had decided the best next step for them
was to run on Sun's hardware.
So I go, great, you guys adopt the Spark.
I gave a talk about Spark at Xerox.
So you adopt the Spark, and now they're telling me I have to
finish my work and do the I.O. chip
for Dragon.
So it was a real closed circle.
I ended up being manager of that
chip and that was an ASIC chip we did
with LSI Logic.
That was an amazing system.
In fact, somewhere around here I have the
portal. Maybe I didn't bring it with me.
No, but that
it had multiple I.O. cards and vertical memory DIMMs
and multiple CPU chips done in the early 90s.
But that's an example of how, since we had full control of our I.O. interfaces,
we could tailor the system.
When we were designing the Dragon system, SunDragon system,
I'm like, what is this good for?
There's no benchmark.
Well, the Dragon came out as a Spark Center 1000, and then there was a Spark Center 2000. When the dot-com era came, then there was a Spark Center 10,000. That was the system that was used to bolster Sun during the dot-com era, because it was just serving out web pages, right? It was a coherent memory system of the highest order, but that wasn't its main purpose, right, for utilization at that point.
So, looking at Spark V7, and, you know, subsequent V8 and V9 as well, but, uh, specifically, I guess, V7 and V8, because they were both 32-bit, right?
And then V9 brought 64-bit.
So there were a couple of, like, looking through the architecture manuals,
there are a couple of features or design decisions that kind of stick out to me as maybe not unique, right? They take inspiration from elsewhere, but they're decisions that y'all made to include this functionality.
So I'll list them off and then let you kind of choose which one you want to start with.
Okay.
But register windows, we already mentioned.
The branch delay slot and some of the functionality around that with different variations of instructions you could put in there.
And then the alternate address space and using the address space identifiers.
Okay, well, don't get too far ahead of me.
Let's start with register windows.
Okay, let's do it.
Yeah, so I asked that question.
I said, why are we going to stick with register windows?
And, you know, they had been clearly invented and articulated at Berkeley with the RISC-I and II.
In fact, I've forgotten about this, but it was two master's students, Halbert and Kessler, who had actually proposed them in a master's project at Berkeley.
You know,
Chuck Seitz had talked about what to do with a thousand
registers and AT&T
was doing research on a lot of registers.
The whole idea is that you don't need to,
if you have a small number of registers
and you're doing a lot of procedure calls,
you have to save them right away
and reload them when you come back to your procedure.
So why waste all that bandwidth to the main memory or cache
when you can just stick them on a local register
cache and have overlapping windows so that outs can become ins and just reduce the number
of loads and stores.
But compiler technology had advanced.
Certainly on the 801 project, I forget his name, they had done better optimizing compilers.
How can you use a fixed number of registers like 16
and keep items there for multiple procedures at the same time?
So you have to do cross-procedure register allocation,
or register coloring as they called it.
Right.
And that made the compilers very complex.
So here we are, a little startup company.
It's hard to imagine we were a small startup company.
There were two people in the compiler group,
and there was Steve Muchnick from HP. And HP, as I mentioned, had a RISC project, actually slightly ahead of us, called Spectrum at the time. It later became Precision.
Steve was
extremely close-mouthed. He was
scared about saying anything about HP's project with us.
He was very, you know, closed-lipped, and I think properly so. But anyways, one day I called a meeting. On my left was Steve Muchnick, and on my right was Dave Patterson. I said, are we going to do windows or not? And I turned to Steve Muchnick, and he said,
well, if we're going to do cross-procedure call
register coloring, I need two more people.
And so, you know, and I turned to Patterson,
and I said, well, we kind of get that reduction in loads and stores automatically with windows. So I said, okay, decision, done. Right. And we stuck with windows.
So that's how it happened.
There was a, well, I don't know if this was unique,
but not all register window architectures did it this way,
where you all had explicit save and restore instructions, right?
That was a feature that we did.
The original Berkeley, when you did a call,
it changed the window pointer.
Right.
And the return decremented it.
So one of the OS guys, it could have been Steve Kleiman, who later became the CTO of NetApp.
Steve, I think it was Steve, suggested that we decouple the two.
Why not have a separate way to move the window pointer?
That way you could do things like tail recursion elimination. You don't have to advance the window pointer if you don't want to. In fact, you could actually make it a traditional
fixed register machine if you wanted to. Just never increment it. Go for it. So that was really an important change
that helped keep Windows going for so many decades at Sun
because when you start doing the pipelines and the very fast designs, a 300-register file might seem pretty big to read in one cycle.
And so by separating the two,
and especially if the change happened in the previous instruction,
or at least you could schedule the instructions,
you could actually maybe start the process of reading out a row early,
either by caching a whole window in one cycle
or by just charging the row early and reading it out. It's a two-step
memory access,
SRAM access in the chip.
So by decoupling those two, that was a good idea.
And so
there was a large number
of registers, but they weren't infinite,
right? So if you did enough
procedure calls,
you'd eventually run out.
You had to spill and refill.
So if you're going up and down the stack a lot,
but the programs we looked at tended to go within a range.
Maybe you'd bump up here and go like this,
go down here and go like this.
So it does count on not zipping down the call chain,
and zipping all the way back more than eight windows.
But that seemed to be the behavior at the time
at least. And object-oriented languages
like Smalltalk really benefited
because there you can't allocate registers across procedures.
For those type of environments, they were
a real win.
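To make the window mechanics described above concrete, here is a minimal C sketch of overlapping register windows with decoupled save/restore and spill/fill handling. The window count, the register grouping, and the trap handling are simplified stand-ins, not the actual Spark definitions; it only illustrates the behavior being discussed.

```c
#include <stdio.h>
#include <stdint.h>

/* Illustrative parameters only -- not the actual Spark register file layout. */
#define NWINDOWS 8
#define NREGS    8    /* registers per group: each window has ins, locals, outs */
#define MAXSPILL 64

static uint32_t outs[NWINDOWS][NREGS];     /* window w's outs are window w+1's ins */
static uint32_t locals[NWINDOWS][NREGS];
static uint32_t spill_stack[MAXSPILL][2 * NREGS];  /* "memory" backing spilled windows */

static int cwp = 0, windows_in_use = 1, spill_top = 0;
static int spill_traps = 0, fill_traps = 0;

static void spill_window(int w) {          /* overflow trap: oldest window goes to memory */
    for (int i = 0; i < NREGS; i++) {
        spill_stack[spill_top][i]         = locals[w][i];
        spill_stack[spill_top][NREGS + i] = outs[w][i];
    }
    spill_top++; spill_traps++;
}

static void fill_window(int w) {           /* underflow trap: window comes back from memory */
    spill_top--; fill_traps++;
    for (int i = 0; i < NREGS; i++) {
        locals[w][i] = spill_stack[spill_top][i];
        outs[w][i]   = spill_stack[spill_top][NREGS + i];
    }
}

/* Decoupled from call/return: only save and restore move the window pointer. */
static void do_save(void) {
    if (windows_in_use == NWINDOWS) { spill_window((cwp + 1) % NWINDOWS); windows_in_use--; }
    cwp = (cwp + 1) % NWINDOWS;
    windows_in_use++;
}

static void do_restore(void) {
    if (windows_in_use == 1) { fill_window((cwp + NWINDOWS - 1) % NWINDOWS); windows_in_use++; }
    cwp = (cwp + NWINDOWS - 1) % NWINDOWS;
    windows_in_use--;
}

/* A leaf routine can skip save/restore entirely and just read its caller's
 * outs (its ins, in window terms) -- the payoff of the decoupling. */
static uint32_t leaf_sum(int caller_window) {
    return outs[caller_window][0] + outs[caller_window][1];
}

/* A non-leaf routine takes a window; its outs become the callee's ins with no
 * loads or stores, and its locals survive the call without being saved by hand. */
static uint32_t walk(int depth) {
    do_save();
    locals[cwp][0] = (uint32_t)depth;
    outs[cwp][0] = 1; outs[cwp][1] = 2;          /* "arguments" for the leaf callee */
    uint32_t r = (depth == 0) ? leaf_sum(cwp) : walk(depth - 1);
    r += locals[cwp][0];
    do_restore();
    return r;
}

int main(void) {
    uint32_t r = walk(20);   /* deeper than NWINDOWS, so spills and fills must occur */
    printf("result=%u  spill traps=%d  fill traps=%d\n", r, spill_traps, fill_traps);
    return 0;
}
```

Running it with a call chain deeper than the window file shows the spill and fill traps firing only once the chain goes past the eight windows, which is the going-up-and-down-within-a-range behavior described above.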
Compiler technology by today,
the RISC-V obviously is just a fixed number of registers,
so people feel...
In reality, the memory system ends up dominating performance.
Today we have first, second, third-level caches,
maybe fourth-level caches.
So memory has historically dominated von Neumann designs.
I hate the word von Neumann, but he's a great guy.
So we still use it, you know, computer designs.
Right.
So any other notes?
You mentioned delay slots.
So that actually came from microcode.
You know, the microcode we had in the Alto
and many microcoded machines had a delay slot.
So since you have to fetch where you're going, that's going to take another cycle.
Why not do something?
This is a primitive form of threading with no overhead, essentially.
So it seemed like a no-brainer to do delay slots. Now I added the annul feature, where you could turn on a bit in the branch so that
if you don't really have anything to put there, it would
not do it, not execute anything. And so you could save code
space and execution time in more sophisticated
pipelines later. So the annul bit is a way just to
do a normal branch.
Right.
So you get it both ways that way.
And I'm sure that the future pipelines
were happy with that.
Right.
And so just to kind of dig into
the different options you could have
for leveraging that delay slot,
the most naive thing you could do is just put a no op instruction in there.
Right.
And, and that would be a total waste.
So the compiler was really trying to move something there that is independent of which
way you branched.
Right.
But you did have to be cognizant of, you know, what, what you were moving there and how it
impacted it.
Right.
Because, um, I think I was reading some literature from that time, and it was talking about how, if you are doing an unconditional branch,
then you could move, you know, whatever instruction was right before the branch to after it. And that
would be an optimization. If you're doing a conditional one, you could move an instruction
from wherever the target was up to just after
that conditional branch.
And then I assume if you set the annul bit, then if the branch is not taken, the instruction
is annulled.
Is that right?
Yeah.
Okay.
Yeah.
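Here is a tiny sketch of that execution rule, with made-up opcodes and field names (the real branch encodings differ): the instruction after a branch always executes, except that a conditional branch with the annul bit set squashes it when the branch is not taken.

```c
#include <stdio.h>
#include <stdbool.h>

/* Toy instruction stream: opcodes and fields are invented for illustration. */
typedef enum { OP_ADD, OP_BRANCH } Op;
typedef struct {
    Op   op;
    int  imm;        /* OP_ADD: value added to the accumulator       */
    int  target;     /* OP_BRANCH: index to jump to                  */
    bool taken;      /* OP_BRANCH: stand-in for the condition result */
    bool annul;      /* OP_BRANCH: annul bit                         */
} Insn;

/* Delayed branches: the instruction after a branch (the delay slot) has already
 * been fetched, so it normally executes regardless of the branch.  With the
 * annul bit set on a conditional branch, the delay-slot instruction is squashed
 * when the branch is NOT taken. */
static int run(const Insn *code, int len) {
    int acc = 0, pc = 0;
    int pending_target = -1;        /* a taken branch takes effect after the slot */
    bool squash_next = false;

    while (pc < len) {
        const Insn *i = &code[pc];
        bool squash = squash_next;
        squash_next = false;

        int next = (pending_target >= 0) ? pending_target : pc + 1;
        pending_target = -1;

        if (!squash) {
            if (i->op == OP_ADD) {
                acc += i->imm;
            } else {                                    /* OP_BRANCH */
                if (i->taken) pending_target = i->target;
                else if (i->annul) squash_next = true;  /* annul the delay slot */
            }
        }
        pc = next;
    }
    return acc;
}

int main(void) {
    /* Annulling branch, not taken: the add in the delay slot (index 1) is squashed. */
    Insn prog[] = {
        { OP_BRANCH, 0,   3, false, true  },  /* 0: conditional branch, annul=1, not taken */
        { OP_ADD,    100, 0, false, false },  /* 1: delay slot -- annulled here            */
        { OP_ADD,    1,   0, false, false },  /* 2: executes                               */
        { OP_ADD,    10,  0, false, false },  /* 3: executes                               */
    };
    printf("acc=%d\n", run(prog, 4));         /* prints acc=11, not 111 */
    return 0;
}
```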
And then I also read about the, let me look in my notes here, the delayed control transfer instruction couple
where you have a branch and then in the delay slot,
you have a branch.
And that gets a little bit more complex, right?
We elected to leave that undefined.
Yeah, right, right.
A little hole in the architecture,
but the compiler just never did it, and people weren't doing hand coding.
Because the problem, you could define it, what would happen, but we didn't want to
constrain future implementations to whatever that definition was.
So we just said, undefined.
Don't do it.
In the instruction set, it was undefined, or in the hardware?
Because I saw that in the...
In the architecture.
In the architecture, we said this will result in undefined outcomes, so don't do it. There were a few other little tiny things like that in the architecture which aren't clean, but, you know, hey, this is reality. You know, this shit happens, right? Another one was floating-point underflow. You know, um, really, IEEE would prefer not, but a lot of people want to round to zero, so there was a bit that, you know, may not be consistent in all implementations. I mean, all the implementations may not implement it.
Right. I kind of think the other feature that I kind of read a little bit about in research was the alternate address spaces and the address space identifiers. Was that something
that was taken from...
That was something I did, yeah.
I, well,
you know, putting I.O.
devices in the address space seemed like
the most sensible thing to do. I mean, Sun
had a thing where they had
an I.O. MMU
where devices could go through that to get their
virtual address.
And putting devices in the address space seemed like the simplest, most straightforward thing
to do.
And because the ASI field was 8 bits, I think, you could have a lot of alternate address spaces.
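The underlying idea, reaching devices with ordinary loads and stores, can be sketched in plain C with a volatile pointer. The device layout, names, and the stand-in "mapped" region below are invented for illustration; on Spark the kernel would additionally choose which space it is touching via the ASI on the alternate-space load and store instructions.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical device with two 32-bit registers; layout and names invented. */
typedef struct {
    volatile uint32_t status;   /* bit 0: data ready */
    volatile uint32_t data;
} uart_regs;

/* In a real kernel this would be the device's bus address (reached, on Spark,
 * through whichever address space the ASI selects).  For a runnable sketch we
 * point at ordinary memory standing in for the mapped region. */
static uart_regs fake_device = { .status = 1, .data = 0x55 };
#define UART (&fake_device)

static uint32_t uart_read_byte(void) {
    while ((UART->status & 1u) == 0)   /* ordinary load: poll until data is ready */
        ;
    return UART->data;                 /* ordinary load returns the device byte */
}

int main(void) {
    printf("read 0x%02x from the device registers\n", uart_read_byte());
    return 0;
}
```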
And we used it for all kinds of things.
One of the key insights that I brought, and the OS people supported it, in the definition of the Spark architecture was really to define what applications and users see and not get hung up on what the OS needs.
For instance, we did not define an MMU.
Our first implementation used Andy Bechtolsheim's brilliant MMU design.
One of the things Sun was known for, I'll talk about that for a moment, in his very first Sun 1 and Sun 2, I'm sorry, the Sun 2, yeah, was he put an MMU in, no cache, DRAM,
you know, you address it in two phases, the row address and column address.
So he would send out the low order bits immediately for the row address. And then while the DRAM was accessing the row internally, he would translate the upper bits from a virtual number to a physical number in a table in SRAM chips.
And then supply those bits for the column address.
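In rough code, the address arithmetic being described looks something like the sketch below; the page and table sizes are invented, and the timing overlap is of course a hardware property, noted here only in the comments.

```c
#include <stdint.h>
#include <stdio.h>

/* Invented sizes: 4 KB pages and a tiny direct-mapped translation table in "SRAM". */
#define PAGE_BITS   12
#define PAGE_SIZE   (1u << PAGE_BITS)
#define TABLE_SIZE  16                        /* toy table: 16 virtual pages */

static uint32_t page_table[TABLE_SIZE];       /* virtual page -> physical page */

/* Phase 1: the low-order bits need no translation, so the memory controller can
 * drive them to the DRAM (as the row address) immediately.
 * Phase 2: while the DRAM works on the row, the upper bits are looked up in the
 * SRAM table; the translated bits arrive in time to serve as the column
 * address, so the translation adds no cycles to the access. */
static uint32_t translate(uint32_t vaddr) {
    uint32_t offset = vaddr & (PAGE_SIZE - 1);           /* sent out immediately    */
    uint32_t vpage  = (vaddr >> PAGE_BITS) % TABLE_SIZE; /* translated in parallel  */
    uint32_t ppage  = page_table[vpage];
    return (ppage << PAGE_BITS) | offset;                /* resulting physical addr */
}

int main(void) {
    page_table[3] = 0x42;                                /* map virtual page 3      */
    printf("vaddr 0x3abc -> paddr 0x%05x\n", translate(0x3abc));  /* 0x42abc */
    return 0;
}
```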
So that meant he had a zero-cost MMU. It took zero cycles. In other words, it added no time to the memory access. A nice hack. And we actually implemented that in the first Spark chip. So in addition to the architecture, I co-designed the hardware.
This is a startup, right? You do more than one thing. You do it 23 hours a day.
I designed the CPU part,
the processor part of the Sun 4 board.
I do have a Sun 4 board
here.
Notice how the board's got bigger.
It occupies the
entire screen.
We can get into the components on it later.
In the Sun 4
board,
to get a single cycle cache,
who wants set associativity and
TLBs and all that
stuff in the way? So it's a virtually addressed
cache.
So the bits come
right out of the integer unit,
directly access the cache. In fact, I
patented this idea where the
bits come out unregistered,
just wiggling away, and then I have an Advanced Schottky TTL AS374 register latch them and
drive all the cache chips to get a single cycle cache with all these, you know, 2016
plus tag chips out there.
And we actually patented that idea.
HP had a similar thing.
They had a flow-through latch, which addressed the cache.
So really, that was a huge difference from the Motorola 68020.
The clock output delay time on a 68020 is like 20 nanoseconds.
Oh, wow. Yeah.
Yeah. What happened to our...
You know, we need a 60 nanosecond cycle time.
We've already lost a third of it
just getting off the chip.
And so by wiggling the chips
and registering in an external
beefy registered latch,
registered device,
and then driving the cache,
that helped us achieve the single-cycle cache. A virtually addressed cache. Well, you still want to map multiple virtual addresses together, so we implemented Andy's MMU, and we said, if you want to map virtual addresses together, they have to be equal modulo the cache size. So the lower-order bits have to match, up to the cache size.
And then we'll have tag bits on the side.
We'll check the physical higher order bits to see if they're different or not.
And if they're the same, we know it's the same location.
So the MMU saved us there too.
Right.
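A small sketch of that rule, with an invented cache size: the OS only allows two virtual mappings of the same physical page if they are equal modulo the cache size, and the physical tag kept alongside each line confirms that a hit really refers to the same data.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Invented parameter: a 64 KB virtually indexed, physically tagged cache. */
#define CACHE_SIZE (64u * 1024)

/* The rule the OS enforced: two virtual addresses may map to the same physical
 * memory only if they are equal modulo the cache size, so both always index
 * the same cache line and the data can never sit (and go stale) in two places. */
static bool alias_mapping_allowed(uint32_t vaddr_a, uint32_t vaddr_b) {
    return (vaddr_a % CACHE_SIZE) == (vaddr_b % CACHE_SIZE);
}

/* The hardware side: the cache is indexed with untranslated virtual bits, but
 * each line's tag holds the high-order physical bits, so a hit also confirms
 * that the entry really is the same physical location. */
static bool tag_hit(uint32_t stored_tag, uint32_t paddr) {
    return stored_tag == paddr / CACHE_SIZE;
}

int main(void) {
    uint32_t a = 0x00010040, b = 0x00430040, c = 0x00430080;
    printf("map a and b together? %s\n", alias_mapping_allowed(a, b) ? "yes" : "no");
    printf("map a and c together? %s\n", alias_mapping_allowed(a, c) ? "yes" : "no");

    uint32_t paddr = 0x00230040;     /* example physical address for the tag check */
    printf("tag check at that line: %s\n", tag_hit(paddr / CACHE_SIZE, paddr) ? "hit" : "miss");
    return 0;
}
```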
But I knew that this, I knew about TLBs, translation look-aside buffers.
Right.
You just couldn't implement it at that time.
And at Xerox, when we did the Xerox Star, the virtual to memory translation was in main memory.
So if you ever wanted to do a virtual access, you had to first access the translation table, then access memory. So actually every memory access was six cycles, two slots. Right. That's one reason why the Star was so slow, relatively speaking. Right. So, um, I knew that there was no reason to formalize on a TLB or MMU.
For instance, HP made the opposite decision.
They formalized on the memory management unit, and they formalized on I.O. devices.
And I thought that was a mistake.
So that's an example of the value of having the system people there. They said, fine.
Because the kernel is going to change with every system anyways, and the kernel's the only
person that cares about this stuff.
It's just a mess.
I.O. is a mess.
We're not aware of it so much now, but every
possible I.O. device you may ever plug in
your computer back then, and today, you have
to have the device driver preloaded
as part of the kernel image.
It's got to be gigabytes by now.
Anyway, so because it's so tied to the hardware, there was no reason to put it in the instruction architecture.
That's really interesting, the idea of, um, you know, defining less kind of gives you more flexibility in each individual system. And the fact that, despite being very open at multiple levels of the stack, y'all had a very vertically integrated team, which allowed you to, you know, understand. You could just go and grab the person in the meeting who was going to be working on the kernel driver, right, and talk to them, which really seems like that was kind of a superpower y'all had on both sides, on both the hardware and the software side.
Yeah, yeah. Like, for instance, the windows. You know, the OS people would think about them, and they'd go, they're kind of cool, because I can immediately get registers I can use. Very soon after, we realized that there was a security leak there, because if the kernel didn't zero them out before returning, user code could see what the kernel had left in the register file. Right. We weren't thinking about security leaks. Today everything's about security leaks, right? But anyways, we had to plug that one early.
Right, right. And so the Spark architecture, like we said, evolved. It had V8 and V9. Uh, I think in V8 the multiply and divide instructions were added. Um, was there anything about kind of the evolution that you found really
important in subsequent versions from that,
that first one?
Well,
I have my own view on that.
I,
I intentionally,
uh,
designed the instruction encoding to be really sparse.
Um,
I did not leave a lot of free bits. And I did that on purpose,
knowing that it would constrict our ability
to add a lot of stuff.
Maybe that helped us live with each other later on.
So, for instance,
a quarter of the entire opcode space
is just with a call instruction.
Just two bits. You can go anywhere in memory in one instruction. You don't have to do a procedure call; you can jump anywhere with one instruction, which I thought was pretty cool. But still, it's a quarter of the opcode space just for going somewhere. And it was pretty sparse otherwise.
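That call format can be sketched as below: a 2-bit op field and a 30-bit signed word displacement, so one instruction word reaches any word-aligned 32-bit address. The field layout follows the published Spark call format; the helper names are mine.

```c
#include <stdint.h>
#include <stdio.h>

/* CALL: a 2-bit op field (01) plus a 30-bit signed word displacement, i.e. one
 * quarter of the top-level opcode space, reaching any word-aligned address. */
static uint32_t encode_call(uint32_t pc, uint32_t target) {
    uint32_t disp30 = ((target - pc) >> 2) & 0x3FFFFFFFu;   /* displacement in words */
    return (0x1u << 30) | disp30;                            /* op field = 01 */
}

static uint32_t call_target(uint32_t pc, uint32_t insn) {
    /* Shift the 30-bit field to the top, then arithmetic-shift back down to
     * sign-extend it (assumes the usual arithmetic >> on signed values). */
    int32_t disp30 = (int32_t)(insn << 2) >> 2;
    return pc + ((uint32_t)disp30 << 2);                      /* PC-relative, in bytes */
}

int main(void) {
    uint32_t pc = 0x00001000u, target = 0xFFFF0000u;          /* "backward" wrap still fits */
    uint32_t insn = encode_call(pc, target);
    printf("insn=0x%08x  decoded target=0x%08x\n", insn, call_target(pc, insn));
    return 0;
}
```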
One thing I did was I defined a coprocessor interface, because everyone wanted one. There were so many ideas.
Vector instructions, not a bad idea.
More graphical-oriented instructions.
So I said, hell, we'll just make a coprocessor space.
Go do whatever coprocessor you want.
And you access it just like the floating point was accessed.
And that helped keep all that stuff at bay.
When it came time to do 64-bit, we weren't first,
but it was so easy to do because the instruction set was so simple.
Most things just expanded to a 64-bit format with no thinking involved at all.
Wow.
And I'll never forget, later, when I was working at IBM Almaden Research, meeting people at IBM back east, and they'd say, you guys were so brilliant in defining the first 32-bit instruction set, because it expanded to 64 bits with no work. I'm going, we did not have 64 bits in mind when we did 32. I'm not sure I had the guts to tell them that.
But because it was so simple, it was brainless to extend it to 64 bits.
Now, one thing that happened on V9 was Bill Joy, just a brilliant person,
realized that, being so familiar with the kernel, you can get in the kernel and you can get another trap, and another trap, you know, and you want to respond to them quickly. You don't want to have to save registers. So V9 added,
what was it, three sets of extra register sets just for the kernel. So now we've even extended
beyond register windows even to more. And those ideas happened later, actually,
in other processors that have register sets
for user kernel and virtual machines,
virtual register sets now.
So we weren't thinking of virtual machines, by the way,
at the time either, which is problematical in some cases,
like condition codes and stuff like that later, but I wasn't
around then.
Fair enough.
As far as
the rest of your time at Sun,
I know you worked on
I believe it was
MAJC
and Pico Java.
Those were two things I saw
in your background. Java obviously was, um, very, very popular at Sun. Um, what was the, uh, how closely were hardware decisions made with Java in mind?
Do you know the story of how Java started originally? It happened very early at Sun. There were very few hardware people at Sun. I was like the third hardware designer at Sun when I joined in, um, '84. But one of the other hardware guys, you know, was kind of burned out.
It was easy to burn out, you know.
Yeah, right. Um, working around the clock. Um, he just went off and said, you know, we have so many remote controls in our house, you know, the audio, the TV, this and that.
Let's design a universal remote control.
So they formed their own little group and it was called Oak at the time.
And they, well, we need a language for it.
Oh, we need an operating system, or operating environment, for it. So they designed both of those. And later, Wayne Rosing went around,
what are you guys doing?
Because he started the first Sun Labs,
and what have you got here?
This doesn't make any sense.
So he wanted to cancel it.
But Bill Joy came in and said,
oh, this is just perfect for the web now.
And they renamed it Java.
Wow.
And it was Bill Joy's intervention,
and he realized they could all be ported.
Gosling had done a great job
defining the language and the runtime environment.
It was all done on the side for a different purpose,
and Bill just repurposed it.
It's just the right timing, like many things Bill did.
And Java was great, but it didn't make Sun any money.
Right.
We couldn't charge for bytecode.
You know, the marketplace had gotten more competitive.
By 1998, Pentium chips were running the SPEC benchmark as fast as Sparks, by '98, '99.
By the way, I was one of the co-founders of the SPEC benchmark consortium.
There was this guy who had a restaurant in Campbell, and he closed it one night and invited MIPS, Sun (me), HP, and Apollo.
And we said, we need a better benchmark.
Dhrystone just sucks. Dhrystone is this old benchmark.
Fits in a cache.
You can eliminate code.
It's just horrible.
So we proposed all these applications.
And that became the first SPECmark, integer and floating point.
We were worried later because one of them
was all integer multiply, right?
Oops!
We didn't have it yet.
Right, right.
Oh, man. I said, well, that's the way it goes. Anyway, so, right, you know, it became a nice solid foundation for evaluating all the microprocessors of that era. Um, and, um, where was I going with that? Oh, yes, Java.
So Intel had now caught up with us: compilers, the techniques in the pipeline. You can do register renaming, you can do TLBs, you can just do all kinds of things to catch up with Spark. So Bill was looking for more magic, and there were two things. One of them was PicoJava, and one was UltraJava, which became MAJC, M-A-J-C.
Okay.
So I'm sitting in a company-wide staff meeting once.
I don't know why I was there.
Bill was sitting next to me, and Ed Zander was the CEO at the time, or acting CEO. And Bill leaned over to me and said, how big would a chip be if we implemented Java in silicon? And I said, oh, probably three millimeters on a side. Bill raises his hand. He goes, we're going to do Java in silicon. I'm going, Bill! And by that next Monday, the marketing people had already announced it. We had no design and no team. And I thought, this is how desperate a company can get.
I thought it was a dumb idea.
I ended up giving a talk
about it at Microprocessor Forum
one year just because I wanted to talk at Microprocessor
Forum.
They formed a team
that started working on it and it was eventually
canceled. In fact, my neighbor ended up being a manager on it.
I now know.
But I thought it was a dumb idea.
I had Bill try to defend it, but it was just one idea to try to...
The other idea we had, which I think was better, is I held an off-site at Asilomar here in Monterey.
Okay.
And I invited the graphics guy, Michael Deering, just a brilliant graphics person at Sun. By the way, the technique he thought was important then was to render a triangle fully, with complete illumination based on the scene. But most chips couldn't do that at the time. Jensen and two guys, I forgot their names now, at Sun, they were trying to do that with Michael's advice, and they left and formed this little company called
NVIDIA. Jensen, by the way, used to be our LSI field rep from LSI Logic. Okay. You know, and now he's CEO of NVIDIA.
Right, of course.
Yeah, amazing person.
But the three of them were sitting around one day
and said, one of us has to be CEO.
And so Jensen said, okay, I'll do that.
Yeah.
Really nice, amazing man, you know,
built himself up from the bootstraps
coming in here as an immigrant.
But they were kicked out of Sun, basically,
because it was just too hard to do that kind of graphics chip.
So anyways, I invited Michael Deering, and I said, what can we do to bring Sun back with a hardware story that makes us look like a leader again? So we had lots of walks on the beach, and we realized we could add a graphics pipeline to the CPU. There seemed to be just enough transistors now to do that. And since we had Java, we didn't need to make it Spark. We could make any new instruction set we wanted. Right, back to the drawing board, full open. And so we did that. Marc Tremblay was the architect, and he designed a whole new instruction set.
No register windows.
What was it named?
Well, at that time, it was Ultra Java, the internal code name.
But that became Magic.
That became N.
And it also had threading.
If you're going to go out to memory, let's do something else under that. Same ideas from delay branches and stuff.
But now to a much greater
extent same alto ideas like right you know while you're waiting for memory go execute another
stream so i think it was the first multi-threaded uh hardwood ship out there i think wow
Uh, but you know, it was tied to the Java story. And, well, we soon realized we couldn't fit
the entire graphics pipeline on the chip.
We could do all the triangle rendering,
you know, all the floating point.
Ha-ha, floating point.
You know, AI algorithms, anybody?
We could do all that on the chip,
and we'd do the back-end rendering on a separate chip.
But we needed to find clients,
and so I would fly to Japan and visit with Sega,
and they would go, Java? we need to find clients. And so I would fly to Japan and visit with Sega and,
and they would go Java,
you know,
they're,
they're hand coding primitive machines and they're doing patches where you,
you don't render the triangle with the lighting.
You just know,
well,
this kind of scene,
it's going to look like this and just fill it in.
Okay.
You know,
all shortcut methods today.
Right.
Graphics is so realistic.
It's just pathetic.
But back then, it was a hack, right? And so they thought, no, we don't need this. We couldn't find a user, and the only user ended up being a graphics accelerator at Sun. Right. You know, 200 million dollars just to speed up a graphics accelerator. So in some sense that project was not a success, but, you know, it hired a big team of people.
Speaking of non-successes, uh, it's funny. When we did... you know, we talked about Spark. Well, I next became, um, head of the architecture, logic design, and CAD group for UltraSpark. This was in the early 90s. And, um, Viking was the name of the chip before us.
People go, Spark is such a good idea.
Where is the full custom chip?
So we hired Jim Slager and somebody else from Intel who had done the 386.
They came and hired a team.
They wanted to do BiCMOS, which is bipolar plus CMOS. And we thought, ugh, and a very complex pipeline. Well, that project unfortunately fell like two years behind, and Sun was now behind the market. Luckily there was a recession right then, so the sales guys could still sell our current stuff, amazingly. Right. So Viking came out late. The Sun Dragon was based on Viking. Okay. But it came out late. Uh, so during that time, while it was late,
Anant Agrawal, who had been really the spirit leader behind our relationship with Fujitsu, started an ECL design project, an ECL Spark, working with BIT. And that was a lot of work, how to design in ECL with the whole board. And it was like we had two camps. We had Slager doing CMOS and Anant doing ECL. Spark was scalable, right? So maybe ECL was a good place for Spark. But we fell behind in CMOS.
And so one day Andy Bechtolsheim said, I just had it with this.
We need to be like Intel.
We need to have two CMOS teams leapfrogging each other.
Intel had Santa Clara and Oregon, you know, each working concurrently.
So you've got to get a chip out, like a pipeline; you've got to get a chip out every year.
Right.
So we can't stop the ECL project.
Well, actually, before we did that, oh, yeah, the guys were like,
well, let's try gallium arsenide.
So they actually put me in charge of the gallium arsenide program.
You know, Seymour Cray was looking at gallium arsenide.
We worked with Vitesse.
I would come into work on Mondays and go, let's go for it.
And by Friday, this is hopeless.
Gallium arsenide technology, it turned out that the P-channel devices were slower than the N-channel devices
and there was no memory.
Vitesse was excited to maybe do a processor, but they were too small of a company.
At that point, Andy came in and said, I've had it with these two teams. Guys, you ECL guys, you gallium arsenide guys, you're going to do a CMOS processor. So that became what we call Spitfire.
And now Slager's all pissed off because now there's a competing team.
But now we had the problem, we're behind in the market.
How do we hire a team to catch up?
And we had to hire 200 people in the course of six months.
And we would interview six people a day on Fridays, review 20 resumes
and decide who to hire. We did find people who wanted to help save Sun. Les Kohn did the i860 at Intel, the first million-transistor chip. He came as the lead architect of Spitfire.
I worked with him.
Anant was in charge of the whole project, and we pulled it off. We pulled off a 167 megahertz, 5.3 million transistor chip, on schedule, and brought Sun back up with reasonable performance and great systems, as the system people designed. So that chip, Spitfire, was called UltraSpark One, and then UltraSpark Two and UltraSpark Three and then some other Spark chips fueled Sun in the 90s. But by the end of the 90s, like I said, Intel with their Pentium chips had caught up with us in performance. And, uh, I was not at Sun anymore, but they did some amazing Sun chips through the early 2000s, up until Oracle, uh, Larry Ellison, bought them. Scott knew Larry Ellison from years ago. And they did some amazing chips while they were there. But, I mean, the Oracle products had kind of the same performance whether it was an Intel chip or a Spark chip, so Larry finally threw in the towel. Right. You know, fundamentally, I think Patterson has come around to this, that the fundamental difference between RISC and CISC back then was that CISC machines were microcoded and RISC machines were not.
And so, you know, that didn't matter anymore by the time the year 2000 came
around. The IBM PowerPC has become the most complex instruction set in the
world. There are literally hundreds of instructions, every instruction known to
mankind. Right. Every instruction known to mankind. Every instruction known to mankind.
Even ones you can't even imagine.
You know, there's
decimal floating point.
All the decimal instructions came back. There are sophisticated instructions for virtual machine handling.
Remarkable
main memory models.
The complexity... well, transistors are free. Metal and the size of the chip are important, right? But the instruction set size has become a non-issue now in running these things. Right. Times have really changed.
Right. Well, kind of rounding out your time at Sun there, I know you went on to do a number of other very interesting things that we
might cover in a follow-up episode, but I did want to hear and talk a little bit about what
you're up to today at the Computer History Museum. You talked about the history of Ethernet, so I'll
kind of let you take that whichever direction you think is best there. Okay, well, yeah, let me
definitely talk about the Computer History Museum.
So I always had an interest in history.
In the late 90s,
after I was discouraged,
I worked on a project called Genie for a while at Sun before I left.
And I started volunteering
at the Computer History Museum.
And it was like the Valhalla
of these great computers, right?
Crays and everything all over the place.
And I got to know the staff.
And in 2003,
I started working on a restoration
and demo project for a DEC PDP-1.
That was DEC's first computer in the early
60s. An 18-bit machine
that was used a lot in universities. In fact,
Space War was written on it at MIT
and Nolan Bushnell was wandering around as a grad kid not knowing what to do and saw Space War was written on it at MIT and Nolan Bushnell
was wandering around as a grad kid
not knowing what to do and saw Space War
formed Atari
from that experience
Peter B. Wunz was in universities
he used to say by the way that
women would walk up to him and thank him for
their marriage
and he would say what? well because
in all the bars there were these Atari machines and women are known
to be good at playing games
and so they would always win
and get to know that.
So a lot of marriages
where no one was proud of that.
So
while we were working on the PDP-1 restoration, Mike Cheponis, who was spearheading that, heard there was an IBM 1401 available in Europe. And he said, would anyone like to lead a team to restore it? And I had just joined IBM Almaden Research Center, and I didn't know anything about IBM computers. I cut my teeth on DEC and SDS computers,
and I was not interested in business machines, to be honest.
But I just joined IBM, and I said, sure, I'll do that.
And then I said, what's an IBM 1401?
And then I look online, and I go, holy shit.
Five-ton, 12-kilowatt computer from the early 1960s, late 1950s.
So the next question was, how am I going to get a team to restore that?
So someone said, there's a group of retired IBMers in the Bay Area, so I put an ad in
the IBM Retirement Newsletter, IBM 1401 needs help.
Two weeks later, about a dozen retired IBMers who had worked on these machines,
proudly worked on these machines in the field, came in and said,
we'll help you get it to work again.
Wow.
So after two years, three years, four years, it started to start working.
Germanium transistors, which are not the most reliable transistors in the world.
Solid state. IBM had switched to Solid State in 1958
from tube computers. You would think that Solid State, you know, would be pretty hardy.
Well, I soon learned that they had packaged the transistors in TO-5 cans that had iron in them.
And guess what happens when you put iron in a moist environment in Germany
for 20 years?
Right.
So I became worried
we had just purchased 10,000 rusted transistors
because we were finding
two to three bad transistors every week.
And one night
I get a call on my cell phone.
This guy in Connecticut calls and says,
my father has a 1401
in his basement. I said, what? And, uh, are you interested? I said, sure. And, you know, it had been operated up until 1995. Yeah, the internet existed in 1995, right? His customers were Westchester County golf courses or something, and he converted his home into a little data processing center and fell in love with the machine and never stopped.
Wow.
It had a little water damage, but we brought it in on a rainy day, of course.
And the day it arrived in our lab, the Computer History Museum, which is, by the way, SGI's old headquarters building, the server room of SGI's headquarters building.
At Sun,
we were really proud to help put SGI out of business because graphics wasn't enough to
get them going, but
servers were.
The day it arrived
in the lab,
the German system started to work.
It's like,
if I don't shape up, I'm going to ship out.
By having two, we can make sure at least one is working when we do live demos.
We have trained docents that give live demos of the systems,
and maybe at least one of them is working on any given day.
There are half a million discrete components in each one.
It was the world's most popular computer in the mid-1960s, before the IBM 360.
There were over
14,000 installed worldwide.
It was
it quickly became, and people
you would never imagine loved
it because it was their first computer. Alan Kay,
the author of Smalltalk, that's his favorite computer.
He didn't run business applications.
He wrote an OS and a metacompiler
for it. It was a computer. Kids think it's a T-Rex dinosaur, and we speak of it as a time machine. When you walk in that room and smell it, especially people from that era, it's a mainframe from that era. It has magnetic tape drives, it has the mechanical card reader, it has the mechanical printer, the chain printer, 90 inches per second, 132 hammers that fire.
It's like a cross, as students said, between a machine gun and a typewriter.
It looks like a T-Rex dinosaur, and people love it.
I mean, they love seeing something so real.
So it's been great fun for 20 years while I was still at IBM one day a week.
The CEO of IBM came to visit one time, Ginny,
and I gave her the tour and I said,
you know, I work at IBM.
And, you know, I was too young
when this computer came out.
I didn't know it back then.
She looks at me and goes,
how do you find time to be here?
And I said, oh, pink slip in the mail tomorrow. Yeah, right. I said, well, at IBM, you guys have a big community service component. Yes, you guys can support this kind of thing.
That's awesome. That's really cool.
And so come by the Computer History Museum if you want to see a live demo: Saturday mornings at 11 and 1 p.m., or Wednesday afternoons at three. They have trained docents who give the demos.
Absolutely. Well, next time I'm out that way, I'm definitely going to have to make it. Um, you also mentioned you're working on this history of the Ethernet. Tell me a little bit about that, and kind of what the goal is there, and the current state, maybe.
Yeah, I am. Yeah, that's a really exciting project. I, you know, I worked on it.
I did that Ethernet,
those two Ethernet adapters
for Bob Metcalfe at Xerox.
And I went to Sun Microsystems.
So, you know, Andy was really tuned in
to the importance of Ethernet.
He made sure it was on all of the cards.
And I mean, he had been at Xerox PARC
as a no-fee consultant,
but I was there working on the Dandelion. He was across the street, I didn't know it, building his own workstation based on a 68000 and an amazing small Multibus Ethernet card he did. So it was important, but I was not
part of that. But a friend of mine at Xerox, a dear friend of mine who's
passed away, who did the analog part, people contacted
me and said they wanted to know more about how Ethernet came out of Xerox.
And I said, oh, that's boring.
But I looked at it.
I could see it wasn't well written up technically.
So I started talking to people and expanded.
So I've decided to do a complete technical history of the Ethernet, starting from AlohaNet.
Its inspiration was AlohaNet. I found the AlohaNet designers; I've done an oral history with them, and it's now online. So from AlohaNet in Hawaii, which was the inspiration for Bob Metcalfe to do Ethernet, through his work with David Boggs and others at Xerox PARC, a little bit of our work with the Xerox STAR, and then all the subsequent initial cards that came out, and then standardization and the IEEE 802. That's a lot. I dare not get to the modern era, because that would take 10 more volumes. I've talked to 140 participants so far.
140.
And lots of interesting stories. And I'll be talking with, I'll be visiting with Bob Metcalfe next month, actually, at his home in Austin.
So it's, I guess it really left an impression for me and seeing how it played out, because it was a fierce battle out there. And its standardization, it got standardized at the same time as Token Ring and Token Bus.
But because it was so simple and so straightforward for people to build interfaces for it
and the protocol layers took care of problems that the hardware layer fundamentally had. You know, Token Ring, you have to wait for your turn, you know. Good thing that doesn't happen today, right? You can't make a telephone call till a red light comes on, you know; that would be bad. Because it allowed packets to
be lost, you know, so-called collisions in the original Ethernet,
originally on the cable,
somebody's transmitting here and somebody's transmitting here,
and they might listen, but they're too far apart.
They're not going to hear it in time,
so they're going to collide and the packets are going to be wiped out.
Right.
That scared people.
Right.
But PARC's Universal Packet format, and later TCP (ARPANET was unreliable too), would come back and retransmit the lost packets.
And the low-end net retransmitted lost packets, so they never had any collisions probably.
But that was so important because when it made the transition to switches, Ethernet switches, initially, the first Ethernet switches
didn't have enough bandwidth, all ports in to all ports out,
even assuming it's equally distributed.
Right.
So if a port got overloaded, the microcode could just go,
firmware could just go, oh, just throw some packets away.
So because it allowed, it promoted, you know, in a way, the destruction of packets, which would be fixed at a higher level (and Metcalfe really cared about reliability, but in the right way), right, that led to its total dominance of all communication. I like to say that it's possible all human-to-human, machine-to-machine communication goes through an Ethernet.
Right.
Or an Ethernet derivative like Wi-Fi at some point.
Wow.
That's a really interesting design insight of, you know,
being unreliable in a way at the lower level allows for a more reliable,
complete system because you're deferring to the upper levels.
You're not trying to solve every problem at that low level,
which people always tend to want to do.
They want to be good engineers.
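A toy sketch of that layering, with made-up loss rates: the "link" below is allowed to drop frames (a collision, or a switch port throwing packets away), and a simple stop-and-wait sender above it retransmits until the data gets through, the way the higher-level protocols recovered lost Ethernet packets.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

/* Toy "link layer": delivers a frame but is allowed to lose it, the way a
 * collision or an overloaded switch port just throws a packet away.  The loss
 * rate is invented for the demonstration. */
static bool link_send(int seq) {
    if (rand() % 4 == 0) {
        printf("  link: frame %d lost\n", seq);
        return false;
    }
    printf("  link: frame %d delivered\n", seq);
    return true;
}

/* Toy "transport layer": stop-and-wait.  Reliability lives up here, so the
 * wire protocol below can stay simple and lossy. */
static void reliable_send(int seq) {
    int attempts = 0;
    while (!link_send(seq)) {
        attempts++;
        printf("  transport: no ack for %d, retransmission #%d\n", seq, attempts);
    }
}

int main(void) {
    srand(42);                       /* fixed seed so the run is repeatable */
    for (int seq = 0; seq < 5; seq++)
        reliable_send(seq);
    printf("all 5 packets delivered despite a lossy link\n");
    return 0;
}
```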
Like Chuck Thacker, though. He was a brilliant designer of the Alto and in the work he did at DEC SRC, but he never thought it was a good idea.
He said, no, the customer deserves your ability to deliver all the packets.
I worked at a Fibre Channel company, Brocade. I was director of hardware there for a while.
There we actually did
deliver every packet in order,
but it was more of a constrained environment
and smaller, kind of a micro net
instead of a local area net.
Right. Interesting.
Well, how will folks
be able to, once the history of the Ethernet is complete or in a state where you're comfortable with it, how will folks be able to get eyes on it?
Well, I should put it on the web, but it'll be a book with pictures because people don't have time to read books. So, you know, I'll have a lot of pictures and diagrams. And if someone wants, they'll usually go to the section they're familiar with.
So it's going to be another year probably.
Gotcha.
Awesome.
Well, I'm sure that'll be really exciting when it comes out.
I'll definitely.
I'm worried that only the people who read it will be the people I've talked to.
But yeah, it's good to know that someone else might be interested.
Well, I'll definitely want a copy of it.
So I think that sounds incredible.
Okay.
Well, thank you.
Absolutely. Well, I think that probably can wrap us up for today. But like I said, I think there's
even more we could cover. So if sometime in the future you're willing to come back,
I'd love to have you back on. Oh, yeah. That'd be fantastic.
Awesome. All right. Well, thanks, Robert. And have a good rest of your week.
Well, thank you for having me on your show and your podcast.
Absolutely.