The Good Tech Companies - When Will Infrastructure Companies See Gains from Generative AI?
Episode Date: May 29, 2024. This story was originally published on HackerNoon at: https://hackernoon.com/when-will-infrastructure-companies-see-gains-from-generative-ai. As Generative AI apps move ...toward production, the stage is set for companies to start seeing real, consumption-based gains contributing to their bottom lines. Check more stories related to machine-learning at: https://hackernoon.com/c/machine-learning. You can also check exclusive content about #generative-ai, #retrieval-augmented-generation, #future-of-ai, #adoption-of-ai, #genai-applications, #good-company, #uses-of-genai, #genai-projects, and more. This story was written by: @datastax. Learn more about this writer by checking @datastax's about page, and for more stories, please visit hackernoon.com. With the consumption that GenAI apps drive, companies like Microsoft, Google, and even Oracle are starting to report results from AI. Outside of the realm of hyperscalers, other AI infrastructure companies will likely start to highlight lifts in the earnings reports they release in January, February and March of next year.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
When Will Infrastructure Companies See Gains from Generative AI? By DataStax.
A lot of questions are swirling about the state of generative AI right now.
How far along are companies with their bespoke gen AI efforts?
Are organizations actually building AI applications using their own proprietary
data in ways that move the needle? What kind of architecture is required? These questions aren't surprising. There's a huge
range of opinions about AI out there, from unabashed optimism to jaded cynicism, mixed
with a lot of hype. I've found that it helps to clarify the state of affairs in the Gen AI
marketplace across three areas: what the market wants to see, fears about what might happen, and what will happen, and how, in 2024. What does everyone want, and what are they afraid of? Gen AI exploded
on the scene in late 2022 when OpenAI released ChatGPT and showed how powerful and accessible
this kind of technology could be. The excitement about the potential upside of AI was everywhere.
In short order, Gen AI was going to be infused into every application at every enterprise.
Investors envisioned a hockey stick-like growth curve for companies that provide the infrastructure
to support Gen AI. The naysayers, on the other hand, envisioned a dystopian AI future that's
a cross between Westworld and Black Mirror. Others warn of an AI bubble. From an investment perspective,
some say it's like crypto all over again. Lots of excitement and hype, and then a smoking crater.
I think both of these fears are unfounded. Sure, with every new technology wave,
there'll be bad actors using Gen AI for the wrong reasons. And the excitement about the
possibilities of Gen AI is everywhere. It does have a bubbly
feel to it and might even be more strident than the crypto buzz. But the big difference between
Gen AI and crypto is the fact that there are many, many real use cases for the former across
organizations and across industries. In crypto, there was one strong use case, financial transactions
between untrusted parties, aka money laundering. That's something that the mainstream isn't quite as interested in.
Right now, the state of affairs for Gen AI applications reminds me of e-commerce in the
late 1990s, when companies were trying to figure out how to make it safe to use credit cards over
the internet. It took a little while for organizations to figure out how to do it
securely, but once they did, suddenly everyone had an e-commerce site.
The parallel I see in Gen AI right now: how to ensure that language models don't return inaccurate responses by hallucinating.
The good news?
That's been figured out, thanks to Retrieval Augmented Generation, or RAG.
More on that below.
Where are we now, and where are we going?
A lot of what we saw last year were
proof-of-concept Gen AI projects. Apps to demonstrate to a company's leadership what's
possible. But very few companies have moved beyond that to build applications that are in full
production. By production, I mean that an organization has an AI application that is
being used by customers or employees in a non-prototype way. In other
words, it's available as a routine part of activities within some segments of business
operations. It might be the front office, it might be what's behind a call to customer service,
but it's somewhere near the mainstream part of business. Walmart is a good early example of this.
The retailer announced in January that it has added Gen AI-powered search to its shopping app. Apple is reportedly testing a Gen AI tool to help its employees provide speedier technical
support. Until we start seeing more examples like this, Gen AI is going to linger a little longer
in the early stage of what Gartner calls the hype cycle. Volkswagen just announced an in-house lab
to develop Gen AI apps for navigation and infotainment applications for its automobiles. That said, we're not as far from reaching the
plateau of productivity as some might think. As I mentioned earlier, trusting model output has
been a hurdle for organizations that are still grappling with how to produce relevant and
accurate large language model (LLM) responses by reducing hallucinations. RAG, which provides models with
additional data or context in real time from other sources, most often a database that can store
vectors, is being employed now to help solve this problem. This technology advancement is a key to
developing domain-specific, bespoke Gen AI applications built on organizations' most
valuable asset, their own data.
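The RAG pattern just described can be sketched in a few lines. Everything here is illustrative: the character-frequency "embedding", the in-memory corpus, and the prompt template are stand-ins for a real embedding model, vector database, and LLM call.

```python
# Minimal sketch of the RAG pattern described above, using a toy
# character-frequency "embedding" in place of a real embedding model
# and an in-memory list in place of a vector database.

def embed(text):
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    counts = [text.lower().count(ch) for ch in alphabet]
    norm = sum(c * c for c in counts) ** 0.5 or 1.0
    return [c / norm for c in counts]

def cosine(a, b):
    # Vectors from embed() are unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, corpus, k=2):
    # Rank stored documents by similarity to the query embedding.
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

def build_prompt(query, corpus):
    # Retrieved passages become the "context" the LLM must answer from,
    # which is what keeps responses grounded in an organization's own data.
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Refunds are processed within five business days.",
    "Our stores are open from 9am to 9pm daily.",
    "Gift cards never expire.",
]
prompt = build_prompt("when do refunds arrive", corpus)
```

A production system swaps each stand-in for real infrastructure, but the shape stays the same: embed, retrieve, then prompt with the retrieved context.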
While RAG has emerged as the de facto method for getting enterprise context into Gen AI
applications, fine-tuning, when a pre-trained model is trained further on a subset of data,
is often mentioned as well. There are times when this method can be useful,
but RAG is the right choice if there's any concern for privacy, security, or speed.
Regardless of how context is added to the application, the big question I often get from
investors is: when will companies start to make money from Gen AI apps?
My response? Most of the enterprises you track are consumption-based businesses.
Many are now supporting the experiments, proofs of concept (POCs), and niche apps that their
customers have built.
Those don't do much in the way of consumption. But this is starting to change as major AI
applications start to go from POC into true production. I predict this is going to happen
in a significant way by the end of 2024. It will come to fruition in the second half of 2024
starting in two places. First, it's taking hold in retail.
See the Walmart example mentioned earlier. You'll also see widespread adoption in what I call the
AI intranet area: chat with PDF, knowledge bases, and internal call centers. With the consumption
that these kinds of apps drive, companies like Microsoft, Google, and even Oracle are starting to report results
from AI. Outside of the realm of hyperscalers, other AI infrastructure companies will likely
start to highlight lifts in the earnings reports they release in January, February, and March of
next year. The path to production for Gen AI applications: the groundwork has already been
laid for consumption-based AI infrastructure companies. We've already seen strong, commercial proof points that show what's possible for a large
base of domain-specific, bespoke applications. From creative AI apps, Midjourney, Adobe
Firefly, and other image generators, for example, to knowledge apps like GitHub Copilot, used by over 1
million developers, Glean, and others. These applications have enjoyed great
adoption and have driven significant productivity gains. Progress on bespoke apps is most advanced
in industries and use cases that need to facilitate delivering knowledge to the point of interaction.
The knowledge will come from their own data, using off-the-shelf models, either open-source
or proprietary, RAG, and the cloud provider of their choice.
Three elements are required for enterprises to build bespoke Gen AI apps that are ready
for the rigors of functioning at production scale: smart context, relevance, and scalability.
Smart context. Let's take a quick look at how proprietary data is used to generate useful,
relevant, and accurate responses in Gen AI applications.
Applications take user input in
the shape of all kinds of data and feed it into an embedding engine, which essentially derives
meaning from the data, retrieves information from a vector database using RAG, and builds the
smart context that the LLM can use to generate a contextualized, hallucination-free response
that's presented to the user in real-time.
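That pipeline can be stubbed out so the flow is visible end to end. The names below (`embed_fn`, `vector_search`, `call_llm`) are hypothetical placeholders, not a real embedding model, vector store, or LLM API.

```python
# A stubbed version of the pipeline described above: user input -> embedding
# -> vector retrieval -> smart context -> LLM. Every stage is a placeholder.

def embed_fn(text):
    # Placeholder embedding: a 2-D vector of text length and vowel count.
    return [len(text), sum(text.count(v) for v in "aeiou")]

DOCS = [
    "RAG supplies models with real-time context from a vector database.",
    "Walmart added Gen AI-powered search to its shopping app.",
]
INDEX = [(doc, embed_fn(doc)) for doc in DOCS]  # indexing step

def vector_search(query_vec, k=1):
    # Nearest neighbour by Euclidean distance; real stores use ANN indexes.
    dist = lambda v: sum((a - b) ** 2 for a, b in zip(query_vec, v)) ** 0.5
    return [doc for doc, vec in sorted(INDEX, key=lambda p: dist(p[1]))][:k]

def call_llm(prompt):
    # Stub for a call to a hosted or open-source LLM.
    return f"[grounded answer based on: {prompt[:30]}...]"

def answer(user_input):
    query_vec = embed_fn(user_input)              # derive meaning
    context = " ".join(vector_search(query_vec))  # retrieve via RAG
    prompt = f"Context: {context}\nUser: {user_input}"  # smart context
    return call_llm(prompt)
```

Each stub would be replaced by a real service in production, but the division of labor between embedding, retrieval, and generation is the point of the sketch.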
Relevance. This isn't a topic you hear much about at operational database companies.
But in the field of AI and vector databases, relevance is a mix of recall and precision that's critical to producing useful, accurate, non-hallucinatory responses.
Unlike traditional database operations, vector databases enable semantic or similarity search, which is
non-deterministic in nature. Because of this, the results returned for the same query can be
different depending on the context and how the search process is executed. This is where accuracy
and relevance play a key role in how the vector database operates in real-world applications.
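Recall and precision here carry their usual information-retrieval meaning, which a toy result set makes concrete (the document IDs and "relevant" set below are made up for illustration):

```python
# Toy illustration of precision and recall for a similarity search result.
# The returned IDs and ground-truth relevant set are invented for the example.

def precision_recall(returned, relevant):
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned)  # how much of what came back is relevant
    recall = len(hits) / len(relevant)     # how much of the relevant set came back
    return precision, recall

# An index returned 4 neighbours; 3 of the 5 truly relevant docs are among them.
p, r = precision_recall(returned=[1, 2, 3, 9], relevant=[1, 2, 3, 4, 5])
# p == 0.75, r == 0.6
```

A vector database that tunes its index purely for speed can silently trade away recall, which is why relevance deserves scrutiny in real-world deployments.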
Natural interaction requires that the results returned on a similarity search
are accurate and relevant to the requested query. Scalability. Gen AI apps that go beyond POCs and
into production require high throughput. Throughput essentially is the amount of data that can be
stored, accessed, or retrieved in a given amount of time. High throughput is critical to delivering
real-time, interactive, data-intensive features
at scale. Writes often involve billions of vectors from multiple sources, and Gen AI
applications can generate massive amounts of requests per second. Wrapping up, as with earlier
waves of technology innovation, Gen AI is following an established pattern, and all signs point to it
moving even faster than previous tech revolutions. If you cut through all the negative and positive hype about it, it's clear that promising progress
is being made by companies working to move their POC Gen AI apps to production.
And companies like my employer DataStax that provide the scalable, easy-to-build-on
foundations for these apps will start seeing the benefits of their customers' consumption sooner
than some might think. By Ed Anuff, DataStax. Learn more about how DataStax enables customers to get their
Gen AI apps to production.
Thank you for listening to this Hackernoon story, read by Artificial Intelligence.
Visit hackernoon.com to read, write, learn and publish.