The Good Tech Companies - When Will Infrastructure Companies See Gains from Generative AI?

Episode Date: May 29, 2024

This story was originally published on HackerNoon at: https://hackernoon.com/when-will-infrastructure-companies-see-gains-from-generative-ai. As Generative AI apps move ...toward production, the stage is set for companies to start seeing real, consumption-based gains contributing to their bottom lines. Check more stories related to machine-learning at: https://hackernoon.com/c/machine-learning. You can also check exclusive content about #generative-ai, #retrieval-augmented-generation, #future-of-ai, #adoption-of-ai, #genai-applications, #good-company, #uses-of-genai, #genai-projects, and more. This story was written by: @datastax. Learn more about this writer by checking @datastax's about page, and for more stories, please visit hackernoon.com. With the consumption that GenAI apps drive, companies like Microsoft, Google, and even Oracle are starting to report results from AI. Outside of the realm of hyperscalers, other AI infrastructure companies will likely start to highlight lifts in the earnings reports they release in January, February, and March of next year.

Transcript
Starting point is 00:00:00 This audio is presented by Hacker Noon, where anyone can learn anything about any technology. When Will Infrastructure Companies See Gains from Generative AI? By DataStax. A lot of questions are swirling about the state of generative AI right now. How far along are companies with their bespoke Gen AI efforts? Are organizations actually building AI applications using their own proprietary data in ways that move the needle? What kind of architecture is required? These questions aren't surprising. There's a huge range of opinions about AI out there, from unabashed optimism to jaded cynicism, mixed with a lot of hype. I've found that it helps to clarify the state of affairs in the Gen AI
Starting point is 00:00:39 marketplace across three areas: what the market wants to see, fears about what might happen, and what will happen, and how, in 2024. What does everyone want, and what are they afraid of? Gen AI exploded on the scene in late 2022 when OpenAI released ChatGPT and showed how powerful and accessible this kind of technology could be. The excitement about the potential upside of AI was everywhere. In short order, Gen AI was going to be infused into every application at every enterprise. Investors envisioned a hockey-stick-like growth curve for companies that provide the infrastructure to support Gen AI. The naysayers, on the other hand, envisioned a dystopian AI future that's a cross between Westworld and Black Mirror. Others warn of an AI bubble. From an investment perspective, some say it's like crypto all over again: lots of excitement and hype, and then a smoking crater.
Starting point is 00:01:32 I think both of these fears are unfounded. Sure, with every new technology wave, there'll be bad actors using Gen AI for the wrong reasons. And the excitement about the possibilities of Gen AI is everywhere. It does have a bubbly feel to it and might even be more strident than the crypto buzz. But the big difference between Gen AI and crypto is the fact that there are many, many real use cases for the former across organizations and across industries. In crypto, there was one strong use case, financial transactions between untrusted parties, aka money laundering. That's something that the mainstream isn't quite as interested in. Right now, the state of affairs for Gen AI applications reminds me of e-commerce in the
Starting point is 00:02:14 late 1990s, when companies were trying to figure out how to make it safe to use credit cards over the internet. It took a little while for organizations to figure out how to do it securely, but once they did, suddenly everyone had an e-commerce site. The parallel I see in Gen AI right now: how to ensure that language models don't return inaccurate responses by hallucinating. The good news? That's been figured out, thanks to Retrieval-Augmented Generation, or RAG. More on that below.
Starting point is 00:02:42 Where are we now, and where are we going? A lot of what we saw last year were proof-of-concept Gen AI projects. Apps to demonstrate to a company's leadership what's possible. But very few companies have moved beyond that to build applications that are in full production. By production, I mean that an organization has an AI application that is being used by customers or employees in a non-prototype way. In other words, it's available as a routine part of activities within some segments of business operations. It might be the front office, it might be what's behind a call to customer service,
Starting point is 00:03:15 but it's somewhere near the mainstream part of business. Walmart is a good early example of this. The retailer announced in January that it has added Gen AI-powered search to its shopping app. Apple is reportedly testing a Gen AI tool to help its employees provide speedier technical support. Volkswagen just announced an in-house lab to develop Gen AI apps for navigation and infotainment in its automobiles. Until we start seeing more examples like these, Gen AI is going to linger a little longer in the early stage of what Gartner calls the hype cycle. That said, we're not as far from reaching the plateau of productivity as some might think. As I mentioned earlier, trusting a model's output has been a hurdle for organizations still grappling with how to produce relevant and accurate large language model (LLM) responses by reducing hallucinations. RAG, which provides models with
Starting point is 00:04:06 additional data or context in real time from other sources, most often a database that can store vectors, is being employed now to help solve this problem. This technological advancement is key to developing domain-specific, bespoke Gen AI applications built on organizations' most valuable asset: their own data. While RAG has emerged as the de facto method for getting enterprise context into Gen AI applications, fine-tuning, in which a pre-trained model is trained further on a subset of data, is often mentioned as well. There are times when this method can be useful, but RAG is the right choice if there's any concern for privacy, security, or speed.
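The RAG pattern described here can be sketched in a few lines: retrieve the snippets of an organization's own data most relevant to a user's question, then prepend them to the prompt before it reaches the model. The following is a minimal, illustrative sketch; the corpus, the word-overlap scoring, and the prompt template are hypothetical stand-ins, not any particular vendor's API (a real system would use an embedding model and a vector database for retrieval):

```python
# Minimal sketch of Retrieval-Augmented Generation (RAG):
# retrieve relevant snippets from a private corpus, then build a
# context-grounded prompt for the LLM. The corpus and the scoring
# function below are toy stand-ins for a real vector database.

CORPUS = [
    "Returns are accepted within 30 days of purchase.",
    "Gift cards cannot be refunded or exchanged.",
    "Shipping is free on orders over $50.",
]

def score(query: str, doc: str) -> int:
    """Toy relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the top-k most relevant documents from the corpus."""
    return sorted(CORPUS, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Prepend retrieved context so the model answers from
    enterprise data instead of inventing an answer."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Grounding the prompt this way is what lets the model answer from the organization's own data rather than from whatever its training run happened to contain.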
Starting point is 00:04:45 Regardless of how context is added to the application, the big question I often get from the investors I've been speaking with is: when will companies start to make money from Gen AI apps? My response? Most of the enterprises you track are consumption-based businesses. Many are now supporting the experiments, proofs of concept (POCs), and niche apps that their customers have built. Those don't do much in the way of consumption. But this is starting to change as major AI applications go from POC into true production. I predict this is going to happen in a significant way by the end of 2024. It will come to fruition in the second half of 2024
Starting point is 00:05:22 starting in two places. First, it's taking hold in retail. See the Walmart example mentioned earlier. You'll also see widespread adoption in what I call the AI intranet area: chat-with-PDF apps, knowledge bases, and internal call centers. With the consumption that these kinds of apps drive, companies like Microsoft, Google, and even Oracle are starting to report results from AI. Outside of the realm of hyperscalers, other AI infrastructure companies will likely start to highlight lifts in the earnings reports they release in January, February, and March of next year. The path to production for Gen AI applications: the groundwork has already been laid for consumption-based AI infrastructure companies. We've already seen strong, commercial proof points that show what's possible for a large
Starting point is 00:06:09 base of domain-specific, bespoke applications, from creative AI apps (Midjourney, Adobe Firefly, and other image generators, for example) to knowledge apps like GitHub Copilot (used by over 1 million developers), Glean, and others. These applications have enjoyed great adoption and have driven significant productivity gains. Progress on bespoke apps is most advanced in industries and use cases that need to facilitate delivering knowledge to the point of interaction. The knowledge will come from their own data, using off-the-shelf models (either open source or proprietary), RAG, and the cloud provider of their choice. Three elements are required for enterprises to build bespoke Gen AI apps that are ready
Starting point is 00:06:50 for the rigors of functioning at production scale: smart context, relevance, and scalability. Smart context. Let's take a quick look at how proprietary data is used to generate useful, relevant, and accurate responses in Gen AI applications. Applications take user input in the shape of all kinds of data and feed it into an embedding engine, which essentially derives meaning from the data, retrieves information from a vector database using RAG, and builds the smart context that the LLM can use to generate a contextualized, hallucination-free response that's presented to the user in real time.
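The retrieval step behind smart context can be illustrated with a toy similarity search: embed the query, compare it against stored vectors, and return the closest documents. This is a hedged sketch only; the word-count "embedding" and the in-memory index are stand-ins for a real embedding model and a real vector database:

```python
# Sketch of similarity search over a toy "vector database":
# embed the query, rank stored vectors by cosine similarity,
# and return the nearest documents as context for the LLM.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a word-count vector (stand-in for a learned model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Documents stored alongside their embeddings, as a vector store would.
DOCS = [
    "reset your password from the account settings page",
    "contact support to close your account",
    "passwords must be at least twelve characters",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]

def search(query: str, k: int = 1) -> list[str]:
    """Similarity search: the k stored documents nearest the query."""
    qv = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]
```

Note the non-determinism the transcript mentions: unlike an exact-match lookup, nearest-neighbor results depend on how text is embedded and how the search is executed, which is why relevance tuning matters.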
Starting point is 00:07:28 Relevance. This isn't a topic you hear much about at operational database companies. But in the field of AI and vector databases, relevance is a mix of recall and precision that's critical to producing useful, accurate, non-hallucinatory responses. Unlike traditional database operations, vector databases enable semantic or similarity search, which is non-deterministic in nature. Because of this, the results returned for the same query can differ depending on the context and how the search process is executed. This is where accuracy and relevance play a key role in how the vector database operates in real-world applications. Natural interaction requires that the results returned by a similarity search are accurate and relevant to the requested query. Scalability. Gen AI apps that go beyond POCs and
Starting point is 00:08:11 into production require high throughput. Throughput is essentially the amount of data that can be stored, accessed, or retrieved in a given amount of time. High throughput is critical to delivering real-time, interactive, data-intensive features at scale. Writes often involve billions of vectors from multiple sources, and Gen AI applications can generate massive numbers of requests per second. Wrapping up: as with earlier waves of technology innovation, Gen AI is following an established pattern, and all signs point to it moving even faster than previous tech revolutions. If you cut through all the negative and positive hype about it, it's clear that promising progress is being made by companies working to move their POC Gen AI apps to production.
Starting point is 00:08:54 And companies like my employer, DataStax, that provide the scalable, easy-to-build-on foundations for these apps will start seeing the benefits of their customers' consumption sooner than some might think. Learn more about how DataStax enables customers to get their Gen AI apps to production. Thank you for listening to this Hackernoon story, read by Artificial Intelligence. Visit hackernoon.com to read, write, learn and publish.
