Grey Beards on Systems - 35: GreyBeards talk Flash Memory Summit wrap-up with Jim Handy, Objective Analysis
Episode Date: August 18, 2016. In this episode, we talk with Jim Handy (@thessdguy), memory and flash analyst at Objective Analysis. Jim's been on our podcast before, and last time we had a great talk on flash trends. As Jim, Howard, and Ray were all at Flash Memory Summit (FMS 2016) last week, we thought it appropriate to get together …
Transcript
Hey everybody, Ray Lucchesi here with Howard Marks here.
Welcome to the next episode of the Greybeards on Storage monthly podcast, a show where we get greybeard storage system bloggers to talk with storage system vendors, to discuss upcoming products and technologies and trends affecting the data center today. I'm sorry, and analysts. This is our 35th episode of Greybeards on Storage, which was recorded August 12, 2016. We have with us here today Jim Handy, analyst at Objective Analysis, who covers
the flash and memory chip industry. We all attended the Flash Memory Summit this past week,
and I just thought we'd start the
discussion about what happened at the summit. So, Jim, you were a major player at the summit this
last week. What do you think were the highlights of the Flash Memory Summit? Oh, there was an awful
lot to see there. I think that there were kind of contests going on between different vendors as
to who could have the SSD with the highest IOPS and who could have the SSD with the largest capacity.
I think Seagate won that one. 60 terabytes is pretty impressive.
Yeah, well, I was doing a little bit of back of the envelope math on that because, you know,
Samsung, first of all, Samsung said, yeah, last year we introduced a 16-terabyte drive and that was the biggest in the industry, and now we're introducing a 32-terabyte drive and that's the biggest in the industry. And only a few steps away from the Samsung booth was the Seagate booth with their 60-terabyte thing.
I said, how many chips would go into one of those things?
You know, it would be an awful lot.
And Samsung's got a 32-gigabyte chip
that they just introduced at the show.
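The "awful lot of chips" remark can be put to rough numbers. This is only a sketch using the figures from the conversation (a 32 GB die, decimal capacity units, and the 16-dies-per-package stacking described later for microSD cards); the actual internals of the Seagate drive weren't disclosed.

```python
# Back-of-the-envelope: how many 32 GB NAND dies would a 60 TB SSD need?
# Assumes decimal units and 16-die stacking; illustrative only.

DRIVE_CAPACITY_GB = 60_000      # 60 TB drive
DIE_CAPACITY_GB = 32            # the new 32 GB (256 Gbit) die
DIES_PER_PACKAGE = 16           # stacking level mentioned for microSD cards

dies = DRIVE_CAPACITY_GB // DIE_CAPACITY_GB
packages = -(-dies // DIES_PER_PACKAGE)   # ceiling division

print(f"{dies} dies in about {packages} packages")   # 1875 dies, ~118 packages
```

Even stacked 16 high, that is over a hundred packages to shoehorn into one enclosure.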
That's cool.
Yeah, you compare that to the capacity of a hard drive not too many years ago.
Yeah.
You know, when was it that the 50 gigabyte hard drives came out?
You know, this is now in a single thumbnail size chip.
Yeah, Jim, remember, you're on Greybeards.
Yeah, we all know when the 50 gigabyte drive came out.
Ray and I remember when the 50-megabyte drive came out. Yeah, yeah. No, I do remember that. My first hard drive that I had on a computer was 20 megabytes. So yeah, 50 megabytes. That was my Macintosh, my beloved Macintosh, back in 1985.
Yep. But, you know, looking at the math of these things, it's still a little bit mind-boggling what they're able to do. Something that the NAND flash guys do is they take these chips and they put several of them in a package. So, like, when you buy a 64-gigabyte microSD card, that thing is actually packed with, you know, probably 16 or 17 chips: 16 flash chips and a controller. And it's hard to imagine that it all goes in there.
But what these guys are doing is they're saying, okay, well, if we can get that much stuff into a little microSD card, then how much can we get inside a two-and-a-half-inch, or in Seagate's case, a three-and-a-half-inch, hard disk drive enclosure? And they shoehorn it all in. Might not make economic sense, but by golly, it'll certainly get people standing up to look.
Yeah, you know, my guess is that that product's gonna sell for about 20 grand. If you look at large SSDs, you know, two
terabytes and up, the cost per gigabyte doesn't decrease with size. But the controller and the case are hardly any of the difference in cost. It's hardly any of the cost when you get to a four-terabyte SSD.
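The cost structure being described can be sketched numerically. The dollar figures below are hypothetical placeholders; only the structure comes from the discussion (a small fixed cost for controller and enclosure, with flash at roughly 90% of the total at 4 TB).

```python
# Sketch of the SSD cost argument: once flash dominates, price scales
# almost linearly with capacity. Dollar amounts are made up for illustration.

FLASH_COST_PER_TB = 225.0    # hypothetical $/TB of raw NAND
FIXED_COST = 100.0           # hypothetical controller + enclosure + DRAM

def ssd_cost(capacity_tb: float) -> float:
    # total = fixed overhead + flash, the dominant term at high capacity
    return FIXED_COST + FLASH_COST_PER_TB * capacity_tb

c4, c16 = ssd_cost(4), ssd_cost(16)
print(f"4 TB: ${c4:.0f}, 16 TB: ${c16:.0f}, ratio: {c16 / c4:.2f}x")
```

With these placeholder numbers, flash is 90% of the 4 TB drive's cost and the 16 TB drive comes out at 3.7x, which is the "basically four times as much" point made next.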
So, you know, when you go from four to 16, it's basically four times as much because it's four
times as much flash and flash is 90% of the cost. I'm intrigued by the fact that this is the first
three-and-a-half-inch SSD we've seen in a long time. It seems to be targeted at the same kind of applications as, like, SanDisk's InfiniFlash. It's where I need not the hundred-times-faster-than-a-hard-drive that most flash stuff has been targeted at for the past 10 years, but I need 20-times-faster-than-a-hard-drive, and it's write once and read all the time. And, you know, things like Facebook's long tail, where that drunken frat boy photo is only being accessed by a million people instead of a billion people. You know, it's not anybody else's long tail, but we're now getting these use cases for bulk flash. And flash at, you know... Well, it's not about a million IOPS, it's about throughput. Yes. I didn't see the specs on it. Was there a read-intensive workload associated with that? They didn't say. Oh, okay. But it's a SAS drive, and 60 terabytes behind 12-gigabit SAS. It'd take a little while to pull it up, you think?
You know, you're giving up performance because the interface becomes a bottleneck when you've got that much flash behind it.
So if you compared it to four 16-terabyte drives, those are going to be substantially faster, just because you're not going to saturate the interfaces. But if you're, you know, Comcast, then this might be where you store all the hot video-on-demand stuff in each of your POPs. You know, I must have seen three different vendors showing Project Lightning kinds of configurations. Well, I wouldn't call them that. Then what would you call them? Well, I've been calling them the new Tier Zero.
So 10 years ago, we had Violin and Texas Memory Systems, and they defined Tier Zero as an order of magnitude faster than a Symmetrix.
But the only virtue it had was speed. There is, and always will be, a market for something an order of magnitude faster than normal, regardless of what else it does.
You know, high-performance trading right now is that market.
But there is and always will be somebody who thought of an application and then tried to run it and found out he needs another order of magnitude of performance.
And the application is going to deliver value so large that it doesn't matter what the storage system costs.
But I was impressed with the fact that they were trying to, you know, and it wasn't really a dense package, quite frankly, right?
It was just a PCIe way of expanding the PCIe bus, as far as I could tell.
There's several of them that are doing that, you know, the DSSD model.
But there were more of them, you know,
Apeiron and E8 and Mangstor,
who are using Ethernet or InfiniBand as the connection,
and they're all leading to NVMe over Fabrics.
You know, these are all the guys who built their products
before the standard was ready,
so you can't really call it NVMe over Fabrics. But it's about having end-to-end latency in the
100 microsecond range, where for an all-flash array, we talk about sub-millisecond latency.
So again, it's an order of magnitude faster. Yeah, Paul Prince of Mangstor, during his keynote, took special care to say
that they participated in the NVMe over Fabric standard. And as a result of that, what they
created before the standard was adopted is compatible with NVMe over Fabric.
They need to do a minor software rev to match the standard. I call what they do pre-standard NVMe over Fabrics.
Okay, okay. I thought maybe they just eased their way into it.
I was familiar with the NVMe-over-Fabrics Fibre Channel stuff, but this Ethernet stuff was completely unknown to me before. Actually, the Fibre Channel stuff is the weird variation in NVMe over Fabrics.
Really?
Yeah, because the mainstream of NVMe over Fabrics uses RDMA. RDMA, yeah, so it's RoCE and all that stuff, yeah. So it's InfiniBand, or it's iWARP, or it's RoCE.
The Fibre Channel part, you know: the standards organization, and Cisco and Brocade and QLogic and Emulex, subject to all of the acquisitions and name changes, said, well, people have Fibre Channel networks too, and this is a storage thing, and some people will want to use their Fibre Channel networks for the storage. And so that was more a secondary idea added on.
Works fine.
Don't get me wrong.
It's just the main thought when NVMe over Fabric
was in development was about the RDMA path. And, you know, considering the fact that 25 gig
Ethernet is now basically the same cost as 10 gig Ethernet, although there's only like six products
where you could say that. But next year it's going to be... Well, I have the Cisco Nexus 9200 switches that they're actually selling for cheaper than the Nexus 5548 10-gig switch.
So I know it's coming, but I also know this is like one of the first five products that does that.
So it's going to be a year or two. But that means in a year or two, when these products on the storage side mature, it can be, you know: I've got the four servers that are packed full of U.2 SSDs that are the NVMe-over-Fabrics target, connected via 100-gig Ethernet to the switch, and the client servers are connected via 25-gig Ethernet, and now you're talking about beaucoup bandwidth at 100-microsecond latencies. You can do some very
interesting things with that. Someday I'll understand that stuff. I'm the chip guy and the SSD guy, and so I look at all this stuff and I say, why in the world would anybody want to put hundreds of thousands of dollars' worth of flash into one single box? It seems like it'd be better for you if you peppered it around as, you know, a whole bunch of PCIe SSDs in your servers. The whole discussion of Facebook: he started the whole thing with this just a bunch of flash. Why are they consolidating the flash storage, Jim? Oh, I'll tell you the thing that I couldn't understand for the life of me about Facebook, and that is: why do they want write-once-read-many flash, which they were acting as if was going to be the last place they put anything? Because if it is truly write once, read many, then you're not going to want to eventually shuffle it off to colder storage.
Because if you do, then you'll have a block of flash that you'll just retire.
Right.
But that's the effect: their long tail is so much fatter than everybody else's long tail.
If you think about Facebook, uploaded images are write once, read many, and, you know, as the camera in everybody's phone gets better, they just get bigger and bigger. Well, that I understand. What I don't understand is that he admitted, during the question and answer after his talk, that they were not going to stop using hard drives. And if you're going to have hard drives, then why would you want flash where, when you move stuff to hard drive, you end up going to turn off the flash? It's the lifetime of an object for them. An object gets uploaded, and it's going to be accessed thousands of times a second for a short period of time, and then it's going to fade down. And for everybody else, the maintenance and engineering cost of maintaining a fast flash tier, a slow flash tier, and a disk tier would be more than building it would pay off. You guys have got to realize that Facebook is still doing optical for their WORM stuff, so they still have a real slow tier. Yeah, I know, I understand. But the Blu-ray... they did this Blu-ray thing, and it's like, you know, their data centers are 10-petabyte data centers in the Northwest. You know that. But all it proves is that
facebook's requirements are so different from enterprise requirements
that we should only look at them as entertainment, not as...
A multi-tier WORM storage environment.
It's like disk.
It's going to be QLC flash and it's going to be optical.
It's bizarre.
It's totally bizarre.
But I want to get back to this other question.
For the last decade, Facebook has been pushing server flash, Fusion-io cards, in-server direct-access storage, throughout the world.
And they were the leader in all this stuff. And all of a sudden, they announced this last week that they're moving to just a bunch of
flash types of storage architecture.
And the only thing I could gather as far as a rationale for doing that was that they wanted
to deal with these peaks that they get.
You know, every once in a while, I don't know, somebody wins a Super Bowl or somebody wins an NBA championship or, God forbid, Phelps wins the 22nd gold medal.
You know, the world just goes bonkers, you know.
I think, Ray, they actually discovered that the Fusion-io model concentrated too many IOPS near too little compute.
So, yeah, I see the scalability thing where they want to be able to independently scale the compute and or the storage over time.
And the only way that you can do that with DAS is to make more copies, which raises your costs.
So I would guess, and this is speculation,
that it's about managing the compute cycle to storage IO ratio.
We always talk about, because we're storage geeks,
we always talk about storage as if it didn't cost CPU.
That storage system could do 2 million IOPS.
It's like, yeah, but to get Oracle to do 2 million IOPS,
I need four data centers worth of x86 processors.
It's the part that we always leave out.
And I think it's more related to that integration.
But again, their requirements are just so different.
But no, no, no.
I believe that they've become almost leading edge for at least the hyperscale community.
And the hyperscale community is leading edge for the enterprise.
I'm starting to view Facebook less as leading edge and more as academic.
Oh, God.
No way.
No way.
No, in that the things that they're developing are new and innovative and some of the ideas might be useful, but that the problems they're trying to solve are so unique.
The whole "you should make your data center look like Facebook's data center" argument is just absurd now, where four or five years ago I would agree that, you know, Facebook and AWS were building what the corporate world needed to build five or ten years later. I think that, given the scale that they've achieved now, all of their problems are scale-related and are less and less applicable to anybody else. I don't know.
I don't know if I agree with that.
But, Ray, we're ignoring our guest.
Okay.
Jim.
No, that's all right.
Tell us about the 3D NAND and what's going on in that space and the
server storage class memory.
I keep hoping to see something.
Yeah, the storage class memory, I could probably do a whole podcast with
you guys on that.
Easily.
Maybe we should.
You probably should.
I was thinking about that when Howard was talking about the IOPS versus compute, and there are some issues there with that.
But let's not go down that rat hole, because we could spend the entire podcast on that. Let's instead talk about the fact that it's been a year since Intel and Micron introduced their 3D Crosspoint memory, and now it seems like an awful lot of people are willing to spend some money promoting their
stuff. And they had booths at the Flash Memory Summit, for example. The guys at Everspin, which is a magnetic RAM company, had a booth there, which they hadn't before. With MRAM, an SSD doesn't need to have the capacitors, and that buys them a couple of square inches of PC board space to do something else.
I would imagine some of your audience don't have a clue what we're talking about,
so let me just elucidate a little bit on that.
Most SSDs use either DRAM or some memory on the controller chip
to buffer some of the data that's going into the
chip because NAND flash is really slow to write into. The interesting part of that is that today
most of the flash systems I'm looking at actually report lower write latencies than read latencies
because they ack once it's in the DRAM. Exactly. And then the question is, well, if you've got
DRAM or if you've got memory inside the controller chip that you're using, that memory, if there's a power failure, is going to lose everything that you tried writing to the disk. And that's not an acceptable situation. And so these guys put capacitors on that have enough energy to allow the disk to move everything out of the DRAM and into the NAND flash storage before the solid-state drive actually shuts down.
With MRAM, you don't have that problem.
You've got high speeds, which is the reason why you put the DRAM or the memory on the controller chip,
and why those are on the SSD.
But MRAM gives you the high speeds in an external chip without the need to move the data off into the NAND flash in a power failure, because the MRAM will retain its data after the power failure.
I saw Crossbar, and theirs is a resistive RAM. Yeah, and they had a big booth there too. So that's also a nonvolatile memory. And they're only there because of Intel and Micron coming out with the 3D Crosspoint announcement last year. Yeah, well, I think both Crossbar and... well, Crossbar's technology looks so similar to 3D Crosspoint.
Given the limited information we've been able to get about 3D crosspoint,
all we have is drawings that look very much the same,
and everybody from Intel and Micron going, no, no, no, not the same. No, I can't tell you anything else, but it's not the same.
I can't tell you anything else, but it's not the same. Yeah, I actually wrote a report on that back last October, and I think that a couple of the people who bought the report from me thought that
I was going to tell the secret that Intel and Micron failed to disclose. But instead, what I
did was I explained where it fit into the system and then made a forecast as to how many dollars
of it would ship based on how it would interact with DRAM. Let's talk for a minute about this
next generation universal memory stuff: the byte-addressable, higher-speed, but still non-volatile
class. What do you see as the market impact over the next year or two? I mean, clearly, we've gotten a huge amount of 3D crosspoint hype, but we haven't seen any products in the first year of the hype cycle.
Yeah.
First of all, you've got to understand why Intel wants it. Years ago, Intel found that there were platforms, especially in the PC, where if you unplugged, let's say, a 25 or a 30 megahertz processor and plugged in a 35 megahertz processor, the performance didn't go up, because the rest of the platform was so slow. Right. The systems were platform-bound. Exactly. And Intel decided that they had to take ownership of this problem, or else they weren't going to be able to keep selling faster processors. Look at DDR to DDR2 to DDR3 to DDR4.
That was all driven by Intel, coordinated, sequenced.
The standards were facilitated by Intel. They really put a lot of effort into this kind of a thing, and they ran into a point where they said, okay, well, we need to insert a new memory layer in order to get the kind of performance that we need for the number of cores that we want to put on our processors.
And they figured that the best way to do that was to invent a new technology and to create support for that in their upcoming platforms.
And so that's where 3D Crosspoint really came from.
It's not that they need something that is storage.
It's just that they need a layer of cheaper, slower stuff than DRAM, but that outperforms
the NAND flashes used in SSDs.
But anytime you create something like that, the opportunities are significant here.
I mean, you're creating another tier of memory,
but it can be construed as another tier of storage.
Yes, and it can, too.
The difference between storage and memory is persistence. And in this case, it's not... Right? It's the same. It's both persistent. Well, no. Howard has drawn a line, and different people draw lines in different places. Howard's drawn the line at persistence. I know a lot of people who draw it at block storage. Right. Byte addressability would be the other place you could draw a line. Yeah, but the weird thing about byte addressability is that processors don't use byte addressability anymore.
Right.
Yeah, they write whole cache lines.
Whole cache lines, which is 64 bytes.
And the difference between 64 bytes and a 512-byte disk access is actually less than a 10-to-1 ratio.
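The granularity gap being compared here is trivial to check (512 bytes being the traditional disk sector size, 64 bytes the x86 cache line):

```python
# Processors move 64-byte cache lines; block storage moves 512-byte sectors.
# The access-granularity gap between "memory" and "storage" is only 8:1.

CACHE_LINE_BYTES = 64
SECTOR_BYTES = 512

ratio = SECTOR_BYTES // CACHE_LINE_BYTES
print(f"{ratio}:1")   # 8:1, i.e. "less than 10 to 1"
```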
Not enough to worry about. Yeah, truthfully, when I say byte addressability,
what I really mean is direct memory addressing that it is a load store device,
not that there's some software layer that does RDMA that makes it look like a load store device.
Yeah, I suspect that a lot of that's going away.
SNIA, the Storage Networking Industry Association, has been doing an awful lot to add things to the interface between the application program and the actual memory or storage that facilitates what you're talking about of addressing persistent storage at the byte level. Yeah, and at some point, we're saying that storage is memory, and memory is storage,
and it's the interface that matters.
Does it plug into the DIMM socket, or does it plug into the PCIe lane?
And it kind of doesn't matter.
I'm looking at the next four or five years as being really exciting from a software development point of view because, you know, the version of an in-memory database like SAP HANA that takes advantage of a next-gen memory layer and RDMAs every write from one node to another instead of having to have the big honking all-flash array on the back end to absorb the writes in case there's a power fail. Just looks like an incredible engine
for doing analysis. We haven't even figured out what we would need to do with yet.
Yeah, but what I find the people in SNIA are doing is they're identifying all kinds of bottlenecks when you do that. And with RDMA, probably the worst one is this: DMA used to just kind of push the processor aside and have the disk, with some kind of a smart interface, talk directly to the memory and actually chunk stuff in there. And now DMA goes through the processor.
And tomorrow, that's not going to happen.
Yeah, well, that's not going to start showing up the way it does with DRAM.
Yeah.
But, you know, one other thing that's a big problem is that once you start going through the processor, then every time that an RDMA happens,
then processor cache lines get removed.
And so you'll have a processor that's whizzing along
and all of a sudden its performance goes down.
And the reason why was because somebody went
and did something completely unrelated.
But the processor happened to be the conduit
through which everything was going.
It's like a party line on a telephone, you know.
You can't – graybeards would understand this.
I don't think anybody would.
But, yeah, you know, you're trying to make a phone call.
You pick up the phone and your neighbor's on the phone yakking away.
You can't do anything.
Let me try to understand this.
Now, if it's a memory device, then the processor caches start to take effect.
If it's a storage device, it's a different animal.
But RDMA still goes through the processor caches the way it's hooked up right now.
But this is an implementation.
It is.
It's an implementation problem.
So if you want it to not go through the caches, fine.
Or you want to have a separate cache just for the data layer, that's fine.
Except that the only one that properly answers that question is Intel.
Yes, absolutely.
And mind you, it's a
chip spin or two to get there, right?
If we're talking about this...
They've done it.
If we know about it and they don't, they're way stupider than we give them credit for.
So, you know, my biggest challenge was with storage class memories. You know, the Crossbar chip was, like, I don't know, eight megabits or something like that. The Everspin stuff, I have no idea, but it's not gigabytes, right? No, no, no. Yeah, they're way smaller. And this is one of the things where you now get into the economics of memory manufacturing rather than the technology. The Everspin stuff, I believe, is on something like a 90-nanometer process, which is where DRAM was a long time ago. I think it was 2005 that DRAM was there. And so their chip density, the number of bits they can get on a chip, is higher than what the DRAMs of 2005 were able to do. But they're only selling tens of millions of chips a year, while DRAM is selling tens of billions of chips per year.
So you look at that and you say, okay, what does that mean to everything? Well, nobody is going to want to invest the $6 billion on a fab. For a fab, right. Yeah, yeah, for a product that's only selling in the tens of millions of dollars or hundreds of millions of dollars a year. Is the fab you would need to make MRAM or Crossbar or any of the other non-3D-Crosspoint competitors that different from a standard CMOS fab, that you can't do what the fabless semiconductor guys do and just have TSMC make it?
The chemical stuff is so different, right? Yeah, well, the chemical stuff is different, but that's not really the big issue, because these chemicals, fortunately, are friendly to silicon. There was a company, Ramtron, back in the 80s, that introduced something that was a lead-based ferroelectric memory, and that went over like a lead balloon. Uh-huh. The problem is that lead ions are extremely mobile, you know, and now I'm parroting something that I don't understand. That's okay, my degree's in chemistry, I do. Oh, okay, well, maybe you can explain it to the audience after I say what the problem was. And that is that if you ever brought any lead in direct contact with the silicon, then apparently a lead ion or two would go floating around through the silicon and find whatever they could wreak the most havoc with and decide that was where they were going to settle. Yeah. So, okay: at that atomic level, with the way the outer electron shells on silicon and lead are arranged, to a single lead ion, silicon is like Jell-O.
And it will just squirm its way through, following some really minor differences in charge.
It's a whole other dimension to you, Howard.
I know about the explosive stuff, but this is –
It's the same thing.
It's the chemistry.
I took quantum mechanics. Well, you remember, when we were at Intel, it was like, I don't understand quantum mechanics better than anybody else doesn't understand quantum mechanics.
And so the lead ions eventually chase whatever source of charge there is and neutralize it.
Okay.
They get wherever the charge is and they absorb an electron.
So that was bad.
The stuff we have today is better?
It is.
And so it's much more silicon-friendly.
Now, where you have a problem, there are a couple of things.
One is that TSMC and the other foundries, GlobalFoundries and UMC and Winbond and Samsung and Intel, all of them, the way that they work is that they sell wafers.
You know, okay, a typical DRAM wafer, a NAND flash wafer, is going to cost about $1,600 to produce.
TSMC's wafers probably cost about twice as much, because of the fact that they don't run a highly efficient manufacturing plant with only one product going through it in huge volumes. Right, all the switchovers add to a huge cost. They do, yeah, simply because you'll have idle equipment all over the place and that kind of stuff while things are changing around. Well, and you always have to throw out the first pancake. Yeah, yeah, that's true too, I hadn't thought of that. No, but I'm sure that the first X wafers that are part of a batch after a switchover have lower yield?
I think with TSMC, that's not really a problem, but there are still going to be lower yields than you would have if the entire one of TSMC's fabs was running only one product all the time.
Yeah, you get better the more you specialize.
Yeah, yeah, very much so.
And so, you know, the cost of these technologies, if they were run at TSMC, might be about double what it would be if they were run on their own fab, the way that DRAM and NAND flash are.
So that automatically pushes the cost up.
Wouldn't moving from 90 nanometers down to whatever TSMC could do make up for that? Yeah, but what I'm saying is that TSMC's most aggressive process is no more aggressive than what DRAM and NAND flash are being built on. Right, but I'm thinking, Everspin is now on, you know, something that's four times that big. And what my thought is, wouldn't Everspin be better off taping it out in whatever TSMC can do, and getting that four-times boost, before they have to start building their own fab?
Yeah, okay.
Well, the thing that they're up against there
is that the more aggressive processes
have more aggressive or more expensive mask sets.
It's really hard to make. The mask set is kind of like the negatives for photography, which graybeards understand, but the younger audience won't.
Yes.
Feel free to explain semiconductor manufacturing.
No, no, no. Well, as much as you need to, to answer the questions, because I'm sure our audience doesn't understand it. Yeah, no, and I don't really want to. But if you're going to produce high-resolution chips at these very, very small processes, which is basically what a small process is, just increased resolution: if you're going to increase the resolution of what you're printing with a negative, then you have to have a higher-resolution negative. And the higher the resolution of the negative, the more expensive it is. It costs close to $100 million now to produce a mask set for the 14-nanometer node, which is the most aggressive thing that TSMC does.
And so then the question is, okay... That sound was my jaw hitting the desk. Oh, okay. Because I'm thinking, well, you do this all in CAD, and then you send it over to that machine over there and the masks come out, and, well, that machine over there was $10 million, but we'll work it off eventually. Is it design validation and stuff that costs so much?
Holy moly.
Yeah, yeah.
And, you know, that explains the popularity of using a processor to do things that you used to use an individual custom chip to do.
Or an ASIC where I only have to tape out a corner of it.
Holy moly.
Yeah, or an FPGA.
Okay, the audience is usually impressed when Ray's mind is blown, but mine is now.
Yeah, it's still $100 million. You know, it used to be like $20 million, or $5 to $10 million for an ASIC. This is serious stuff. I mean, we're not talking that long ago. I mean, Jesus Christ, we were doing ASICs in 2000. They were, I don't know, five million. Yeah, but Ray, Jim's talking about a complete custom chip. Yes, these are custom chips. These were, you know, this was not normal stuff, right? No, but the whole idea of an ASIC is that it's simpler. You have fewer options in how much you can customize it, but it's much simpler to customize. All right, go on.
So anyway, today, Everspin is selling, let's say generously,
$100 million to $200 million worth of product a year. But I think it's probably under $100 million.
The ability to fund a near $100 million mask set is just not there.
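The amortization math behind that point: a mask set is a fixed cost, so what it adds per chip depends entirely on unit volume. The volumes below are rough stand-ins for the "tens of millions" versus "tens of billions" of chips per year mentioned earlier; they are illustrative, not Everspin's or DRAM makers' actual numbers.

```python
# A fixed ~$100M mask set amortized over very different annual unit volumes.
# Volumes are rough stand-ins for the figures quoted in the discussion.

MASK_SET_COST = 100e6

volumes = {"MRAM-scale": 50e6, "DRAM-scale": 20e9}   # chips per year, assumed
per_chip = {name: MASK_SET_COST / vol for name, vol in volumes.items()}

for name, cost in per_chip.items():
    print(f"{name}: ${cost:.4f} of mask cost amortized per chip")
```

At tens of millions of units, the mask set alone adds a couple of dollars to every chip; at DRAM volumes, it's half a cent, which is why the incumbents can afford aggressive nodes and the niche players can't.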
Okay. And why haven't, you know, Samsung or SK or somebody else decided that this is an opportunity that Intel and Micron are going to have, and snapped up Crossbar or Everspin or somebody and gone after it? Well, they did just talk about ReRAM and stuff like that.
Yeah, Toshiba's going to be producing ReRAM in conjunction with the company formerly known as SanDisk.
It's now part of WD.
Oh, and also Toshiba on its own is going to be doing MRAM like Everspin.
Yeah, and Samsung bought Grandis years ago, like eight years ago, in order to be able to do their own MRAM. But they also, Samsung actually shipped cell phones using what they call PRAM, which Intel and Micron call PCM.
You know, Samsung certainly knows how to do that technology.
Right, but that's all still, but all of this is still much lower densities than we would care about for anything but a cache in a storage system, right?
Yeah, but it's lower densities because they're not putting it on the most aggressive processes,
and they're not putting it on the most aggressive processes because of that cost.
If you're going to have your researchers do a science experiment,
you'd rather have them not use all the most expensive things to do the science experiment with,
unless the experiment is about those expensive things.
Right. Yeah, but that's kind of a chicken and egg thing with respect to the market, right?
I mean, there is. You know, I think 3D XPoint, to some extent, has indicated that there's the presence of a market here.
Yeah, there is. And I think I mentioned before that 3D XPoint will fit in the market if it's cheaper than DRAM and faster than flash.
You know, those are the two things it has to do.
And it's really clear how fast it's going to be, because they've got engineers and physicists who can tell them that; it's the cost thing that's the big problem.
And you said chicken and egg, and I've been throwing that term around like crazy about 3D XPoint, because 3D XPoint's cost is not going to be able to be lower than DRAM's until the volume, the scale of manufacture, approaches that of DRAM.
So while they're building up the market.
No.
Yeah, and, you know, they're going to have to build up the market by selling
it below cost.
For Intel, that makes a whole lot of sense, because Intel is going to be able to sell a more expensive processor. You know, they'll get $50 more for their processor by losing $10 a year.
Yeah, storage-class memory is going to have to be 10 billion chips a year before it gets the cost down. It's never...
Yeah. Oh, well, you know, they're trying.
No, but that's the answer.
And so SK and Toshiba and Samsung can sit on the sidelines until the market develops,
but Intel is more willing to lose money on it in the short term because they're making it up on the other side.
Yeah.
So that answers the question. I wondered why SK and Toshiba and the like were letting Intel and Micron have a lead in this next-generation memory business, but now I understand: since there's a lot of money that has to be lost, they're more willing to let Intel lose the money, and Intel is more willing to lose the money because they're making it up on the processor side.
Yeah, I still have my doubts about these companies being able to do it, though.
I think everybody who tries ramping a new technology is probably going to lose a lot of money on the way toward ramping it.
Is it a lot of money, or a slightly smaller amount of money, if you're trying to reinvent a wheel somebody else has already made?
Yes. Yeah. And if the market's a ready market, then it's going to be a whole lot easier to convince management or your investors to do that.
Yeah, and we see this with 3D NAND, right? I mean, Samsung was sort of a leader in that space, and these other guys, Micron and Toshiba, have taken up the call as well, right?
Yeah, yeah.
The reason, though, everybody wants to do 3D NAND is because there isn't a way to get NAND costs to continue to go down the way that Moore's Law prescribes that semiconductor prices have to go down unless you go to 3D.
There just isn't any way other than 3D to get there.
Right. And Jim, didn't Samsung go 3D earlier than everybody else because they had problems shrinking their planar process?
They did have
that, but also something that Samsung was hoping to do was to get their scale up high enough that
they'd get the manufacturing efficiencies before everybody else and they'd be able to be profitable
while everybody else was still struggling. I think that that has eluded them so far.
But they are getting something else, which is very, very important to Samsung,
is creating the illusion that they are more technically advanced than every other company.
And the way that they worked that before was when they introduced their 27-nanometer NAND flash, they called it 20-nanometer class.
And their competitors: when Samsung had 27 nanometers, Hynix had 26 nanometers, Intel and Micron had 25 nanometers, and Toshiba and SanDisk had 24 nanometers.
But the press said that Samsung was ahead, because Samsung had 20 nanometers, because Samsung called theirs 20-nanometer class.
Yeah.
That's been the case for a few generations now.
You know, they got down to where they were saying 10 nanometer class.
And a friend of mine jokes, you know, what are they going to do next?
Say that they have zero-nanometer class?
That would be really impressive.
Yeah. What they did instead was they just introduced 3D NAND earlier than anybody else was willing to. Everybody else didn't want to do it until they saw a path to profits, whereas Samsung said, this is a way to keep the illusion going that we're in a technology leadership
position. And, you know, they are in a technology leadership position now. But when they started
this escapade, they certainly weren't. And it has, you know, certainly aided them in their
chase for market share. Okay. I need to ask another question. So what about this Flash
Matrix? Did you take a look at that, Jim? Yeah, I did. And let me tell you, the first thing
that popped into my mind is, who's this company who's doing this? You guys understand the storage
business a whole lot better than I do. Is Toshiba known for doing anything in the world of storage?
I mean, disk drives and stuff like that, but not systems, per se.
Storage systems. They have a little bit more than
you see here in the home
market, but not much.
In Japan, yeah.
It's not like Fujitsu, who
we never see here, but is really strong
in the home market. But they
have done some things, but nothing
outstanding.
And they're not a supercomputer company, are they?
Toshiba? No.
So, yeah, it seems kind of odd that they'd think anybody would come to them for a solution like this.
But it's an interesting solution.
You know, what I noticed was they gave all these chip counts.
And I learned later on that I should have been taking photographs of the slides because they're not going to be giving the presentation to the Flash Memory Summit people to put on the website.
But, you know, and maybe we could give a shout out to people to send a photograph if they took them at the summit.
But the architecture, the way that I understood it was that you have a matrix of flash nodes, and they weren't very clear about what those were, but, you know, there's several ways to do that.
And all of these flash nodes can communicate with each other, both north-south and east-west.
And at the edges of the thing, on the east-west edges, let's say, there's a processor on either side.
And apparently all of this flash can be uniformly accessed by all processors at all times.
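For listeners trying to picture the topology Jim is describing, a tiny sketch may help. This is purely illustrative: the grid size, the routing rule, and placing one processor per row off each east-west edge are my assumptions, not anything from Toshiba's slides. It just shows why every flash node in such a matrix stays within a few hops of a processor:

```python
# Purely illustrative model of the Flash Matrix topology as described on the
# podcast: a 2D grid of flash nodes with processors off the east and west
# edges. All dimensions and the routing rule are assumptions.

def hops_to_nearest_processor(x, cols):
    """Hops from node column x to the closer same-row edge processor.

    West-edge processors sit at column -1, east-edge processors at column
    `cols`, so only east-west hops are needed to reach a same-row processor.
    """
    return min(x + 1, cols - x)

cols, rows = 8, 8  # assumed grid size
worst = max(hops_to_nearest_processor(x, cols) for x in range(cols))
print(worst)  # 4 -- no node in an 8-wide mesh is more than 4 hops from a CPU
```

In a real mesh the FPGAs would forward traffic node to node, but the hop arithmetic is the same.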
OK. Yeah. All over PCIe kind of thing, right?
Well, did they say it was over PCIe?
Yeah, I think so.
I didn't make this presentation, but it's sounding very DSSD-like.
Yes and no.
So obviously DSSD is shared storage connected to servers' PCIe buses,
but this thing has the servers.
I mean, it's got all the compute embedded in the matrix.
Right, right.
I'm just thinking of the internals of the DSSD system.
Yeah, and then it looks like, Ray?
Yes?
They're just eliminating the cables.
Yeah.
Yeah, I suppose.
But it also looks like they've got special... Because in the DSSD architecture, that's PCIe over the cable.
Yeah, but they've got an FPGA on each of these flash modules, I'll call it, that talks PCIe in four directions and stuff like that.
Right. And they're all plugged into, effectively, a motherboard.
And then the motherboard has connections to the compute engines as well.
So it's a system in a box. Do they push functions like data protection down to the FPGAs in the grid?
That would be really clever.
I don't know.
Because if you've got that distributed FPGA model, then in those FPGAs you could do n-dimensional RAID via erasure coding, you could do background scrubbing, all of that stuff.
Nobody mentioned anything about RAID, background scrubbing, or any of that stuff. There might be some checks on stuff.
Yeah. You know, guys, this reminds me an awful lot of the hypercube when it first came out in the 1970s.
Yeah.
And the guy said, I can hook up memories and processors, and if you multiply the operations per second of all the processors by each other,
then you end up getting something equivalent to one of the IBM high-performance systems.
And then he ended up trying to code it, and he gave up.
It could be they've got one of those, where it's a wonderful dream of some hardware person who never even thought about any of the software that's needed to support it.
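For what it's worth, the distributed data protection Howard speculates about is easy to sketch in its simplest form. The following is a pure illustration of single-parity erasure coding striped across flash modules; the module count, block size, and scheme are my assumptions, not anything Toshiba presented:

```python
# Illustrative only: single-parity protection across flash modules, the
# simplest case of the erasure coding Howard imagines the grid FPGAs doing.
# Module count and block size are assumptions, not Toshiba's design.

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Four hypothetical flash modules, each holding a 4-byte data block.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
        b"\xaa\xbb\xcc\xdd", b"\x05\x06\x07\x08"]
parity = xor_blocks(data)  # written to a fifth module

# If module 2 fails, its block is the XOR of the survivors plus parity.
rebuilt = xor_blocks([data[0], data[1], data[3], parity])
print(rebuilt == data[2])  # True
```

Real systems use stronger codes (Reed-Solomon and friends) to survive multiple failures, but the rebuild-from-survivors idea is the same.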
Yeah, well, that's one of the things I love about the Flash Memory Summit,
is there's the mix of the things that will be in our data centers in two to five years.
But, you know, the new, crazy, innovative, wonderful ideas and just the new crazy ideas.
Hard to tell between the two, I might add.
Well, sometimes it's obvious.
Sometimes you only can tell the truly crazy in retrospect.
I looked at the Flash Matrix, and I thought maybe this was like an InfiniBand type of thing from Toshiba directly.
So why are they getting into the system-level Proc Matrix?
Did they announce this as a product, or did they just say,
we built this thing, isn't it cool, we get to do a presentation?
Yeah, I don't remember a product announcement.
Right, because, you know, then we're at, like, the HPE The Machine level. It's like, we're doing this because it has great PR value and gets people thinking about it.
They were showing it, but I didn't see it actually running any applications. You know, what they're saying is probably: if Seagate can put 60 terabytes of flash into a 3.5-inch container, then we can do this, by golly.
Yeah, right. It might not be practical.
Yeah, yeah. It's like the database acceleration engine that the guys from HDS from Tokyo were showing.
Nobody really plans to make it a product, but it's way cool.
Yeah.
Well, folks, we've kind of run out of time here.
Is there any last questions you might have, Howard?
No, I think we're good.
Hey, Jim, is there anything you want to say to the Greybeards audience here?
You know, you guys are looking older than the last time I talked to you.
Oh, wait, wait.
There is one thing I did want Jim to talk about.
In your update presentation, you acted like you were in Trading Places and timed the market to predict when the next flash glut, when we should all buy big SSDs, is going to be. So could you spend, like, two minutes explaining how
it's boom and bust, and every once in a while there's a buying opportunity,
and when you expect to see it? Yeah, I don't really call flash a buying opportunity.
What you've got to look at is whether or not you want to be holding on to inventory. Something that I've been talking about for a while now that hasn't really happened the
way that I was expecting because of demand issues is a shortage for NAND flash. 3D NAND is two years
late from where it was supposed to be. And typically in that kind of a thing, there should
be a shortage because the NAND flash manufacturers plan, they figure out where they think demand is coming from and how much demand for gigabytes is going to grow every year.
And then they build their plants and they have like, they need to put these plants in place two years ahead and they build these plants to be able to satisfy that demand. And they don't want to build them too fast because the profits they make in a shortage
make up for the losses they make in a glut, right?
Yeah, yeah, exactly.
So they want it to be a nice, even-keeled kind of a thing.
But two years ago, they thought 3D NAND would be selling the majority of NAND flash gigabytes by now, maybe even, you know,
90% of them. And so they built their capacity according to that. And 3D NAND just hasn't
materialized the way that people thought, because it's just a very, very difficult process. It's got
a number of things that have never been tried before. So the longer that draws on, then the
less likely it is that the NAND flash manufacturers are going to be able to keep pace
with growing demand for NAND flash. And when that happens, then we'll end up with a growing
shortage that will keep prices stable because for some reason that I've never been able to really
understand, memory prices don't go up a whole lot during a shortage. They just go flat.
Right, which for those of us who buy retail becomes subjectively going up because we're
so used to them going down. Yeah, yeah. And when I say they don't go up,
they might go up 10%, maybe in an extreme case 20%. But during a collapse, you'll typically see a 60% price drop, and you'll never see a 60% increase. Prices will fall toward the manufacturing cost of planar NAND, and 3D NAND's cost actually has to go below that manufacturing cost. Until that happens,
which is anybody's guess as to when that's going to happen,
then we'll have a shortage of growing importance.
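A quick back-of-the-envelope check makes Jim's asymmetry concrete. Using only the percentages he quotes (the starting index and everything else here is arbitrary, for illustration only), a 60% collapse is far larger than any plausible run of shortage-era increases can undo:

```python
# Arithmetic on Jim's figures: shortages lift prices maybe 10-20%,
# while a glut can cut them 60%. The starting price index is arbitrary.
start = 100.0
after_glut = start * (1 - 0.60)           # a 60% drop
rise_to_recover = start / after_glut - 1  # fractional rise needed to get back

print(after_glut)                 # 40.0
print(round(rise_to_recover, 2))  # 1.5 -- a 150% rise is needed to undo a 60% drop

# Even five consecutive "extreme case" 20% rises barely claw it back:
p = after_glut
for _ in range(5):
    p *= 1.20
print(round(p, 1))                # 99.5
```

Percentage moves aren't symmetric, which is why the busts dominate the cycle the way Jim describes.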
Once they've solved that problem, you know,
and gotten the last thing squared away,
and 3D NAND becomes a better technology, a more manufacturable technology,
then all of a sudden you'll see the bottom fall out of the market
because there will be an oversupply of chips,
and everybody who is producing the chips will be lowering their prices
to try to take the market away from their competition.
So for lack of any concrete date of when I can say this will happen,
I'm just saying the middle of 2018 is when this
is going to happen. So we'll have relatively stable pricing. And then all of a sudden in 2018,
you'll see a 60% or larger price drop. Okay. I would have to assume that, you know,
that transition is going to come earlier for Samsung than for everybody else.
Is there usually a laggard who suffers when a transition like this happens?
There always is a laggard.
What's interesting, though, with your presumption about Samsung is that all of the NAND flash manufacturers are using different processes from each other, which is unusual.
Usually they all pretty much do the same thing.
But each one of them thinks that they have secret sauce.
Well, yeah, exactly.
And, you know, for example, for Micron, they're not using a charge trap.
And a charge trap has been particularly tricky for nearly every company that's made it.
There's one company that has been very successful with it, and that's Spansion, which is now part of Cypress Semiconductor. They make NOR flash using charge
traps. And everybody else who has tried making charge traps has failed. Had Spansion not been
there, I don't think anybody would have ever tried using a charge trap for 3D NAND. But everybody's
3D NAND uses charge traps, except for Micron. So Micron decided to avoid that. And there's always
a possibility that while everybody else is wrestling the charge-trap monster, Micron will just zoom ahead of them. The SanDisk-Toshiba process is a much, much simpler process than either Micron's or Samsung's. And it could be that that simplicity...
Simple is sometimes really good.
Yeah, yeah. And it might give them a real edge.
So, you know, I don't think that you can yet tell who's going to win this race.
God, you sound like a memory handicapper here, Jim. The whole discussion here is kind of... and it's Samsung by a nose, followed by...
Yeah, this is interesting. Well, it could be a whole other podcast on that stuff.
All right.
I guess we better call it a day.
All right.
Well, good night, Gracie.
Yeah.
It's been a pleasure to have Jim with us on our podcast.
I appreciate being with you.
But next month, we will talk to another storage technology person.
Any questions you want us to ask, please let us know.
That's it for now.
Bye, Howard.
Bye, Ray.
Until next time, thanks again, Jim.
Bye, Howard and Ray.
All right. Thank you.