Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 09x02: Moving Beyond Text for Agentic AI Applications with ApertureData

Episode Date: October 6, 2025

Our online interactions include audio, video, and sensor data, but most AI applications are still focused on text. This episode of Utilizing Tech considers how we can integrate multimodal data with agentic applications with Vishakha Gupta, founder and CEO of ApertureData, Frederic Van Haren of HighFens, and Stephen Foskett of Tech Field Day. After decades of developing AI models to process spoken word, images, video, and other multimodal data, the ascendance of large language models has focused attention largely on text. This is changing, as AI applications increasingly leverage multimodal data, including text, audio, video, and sensors. Many agentic applications still pass data as structured or unstructured text, but it is possible to use multimedia data as well, for example passing a clip of a video from agent to agent if the system has true multimodal understanding. Enterprise applications are moving beyond text to include voice and video, data in PDFs like charts and diagrams, medical sensors and images, and more.

Guest: Vishakha Gupta, CEO and Founder, ApertureData

Hosts:
Stephen Foskett, President of the Tech Field Day Business Unit and Organizer of the Tech Field Day Event Series
Frederic Van Haren, Founder and CTO of HighFens, Inc.
Guy Currier, Chief Analyst at Visible Impact, The Futurum Group

For more episodes of Utilizing Tech, head to the dedicated website and follow the show on X/Twitter, on Bluesky, and on Mastodon.

Transcript
Starting point is 00:00:00 Our online interactions include audio, video, and sensor data, but most AI applications are still focused on text. This episode of Utilizing Tech considers how we can integrate multimodal data with agentic applications, with our conversation with Vishakha Gupta, founder and CEO of ApertureData, Frederic Van Haren, and myself, Stephen Foskett. Welcome to Utilizing Tech, the podcast about emerging technology from Tech Field Day, part of the Futurum Group. This brand new season focuses on practical applications of agentic AI and related innovations in artificial intelligence. I'm your host, Stephen Foskett, organizer of the Tech Field Day event series and host of Utilizing Tech now for nine seasons. Joining me this week as my co-host is Frederic Van Haren, who's been present for a lot of those seasons. Welcome, Frederic, to the show.
Starting point is 00:00:51 Yeah, thank you once again. I'm here as a co-host. So my name is Frederic Van Haren. I'm the founder and CTO of HighFens, which is an HPC and AI consulting and services company. And Frederic and I have been talking about practical applications for AI for a long time. But one of the things that always sticks in my craw, I'm not sure what a craw is, but something sticks there, is that when people talk about AI, they too often focus only on text. Basically, it's chatbot or bust. And that's great.
Starting point is 00:01:25 And in fact, there's a lot that you can do with text. But text isn't the whole world. And Frederic, I mean, your background in AI, you didn't start with text. No, definitely not. You know, as you know, my background is in speech. And that's the language we use to communicate with people. But we have to understand that people are more visual than learning from text. So a multimodal approach to AI.
Starting point is 00:01:53 It's definitely something we're looking forward to in this agentic AI world. Absolutely. And I think that, you know, just regular people who are going to be sort of wondering, why are we always talking about, you know, documents and books and web pages and stuff? Why aren't we talking about literally everything we interact with today, which is audio, which is images, which is video, documents, and so on? And that is why we have an exciting guest to kick this season off. Today, we have Vishakha Gupta, founder and CEO of ApertureData,
Starting point is 00:02:30 who is somebody that I spoke with earlier this year, and they are really focused on multimodal data. Welcome to the show. Thank you so much, Stephen and Frederic. I'm really happy to be here. So tell us a little bit more about your background and yourself and what you're focused on. I'm happy to.
Starting point is 00:02:48 So, yeah, as you mentioned, right now, I'm co-founder and CEO of ApertureData. Prior to that, I was at Intel Labs for over seven years, which is where we started working on this problem. And, you know, just looking at it from a researcher standpoint and now through the journey at ApertureData, it's been more like a product and user standpoint and business standpoint. For my background, I have a PhD in computer science from Georgia Tech and a master's from Carnegie Mellon, and I got my undergrad in computer science from BITS Pilani in India.
Starting point is 00:03:23 So it's been, you know, one part of life where it's a lot of, you know, computer science, deep research, working really, like, you know, on underlying systems, the hypervisors. We were one of the first teams to virtualize NVIDIA GPUs back when it wasn't even so popular in data centers to be offered in cloud environments. And then now this has been a completely different part of the journey, you know, I sometimes miss coding too, but it's a lot more about people than it used to be before. And most importantly, it's a lot about very exciting use cases and applications
Starting point is 00:04:01 that have emerged in this last decade, just as, you know, we have witnessed the progress of machine learning from the very basic, like, you know, the very first ImageNet thing that made it like, oh, now we can actually automatically understand what's in an image, to now AI agents trying to, like, you know, order our plane tickets for us to plan this perfect vacation. Well, it does seem, though, that the progress of AI, you're right, that it started, as Frederic said, it started with speech processing for the most part. A lot of the applications early in the Utilizing journey, back when we started this podcast, we talked about processing images and detecting objects and processing
Starting point is 00:04:45 video and sound and all sorts of other data sources too, you know, biomechanical information from sensors and so on. But it seems like the prevalence of large language models has just crushed any kind of discussion of anything that's not text. Are you seeing that as well? I think a lot of the practical applications, you know, the approach I end up seeing for people a lot of the times is, well, they're asked to go use AI. So this is like, you know, you kind of have to, you know, we work with a lot of, you know, medium to large enterprises, some startups. And we see this very often. There is always this like, how can we use AI? If they're doing it right, it would be more from the perspective of, there is this business problem, can AI help us here, and then
Starting point is 00:05:34 watch the process. But then sometimes it's like, we got to get on the AI bandwagon. There is a lot of funding for it. Right. So there is like a spectrum of people. But the common thing is like, okay, look, especially when you think about larger companies, data is very siloed, right? Especially if they're collecting, if they're going beyond simple tabular data, going beyond text, that means sometimes it'll be like images, videos, audio, they get organized in places that most of the company doesn't know. There are, like, a few people in some team that will have access to it. So just the process of bringing things together and then making sure that the models are up to par,
Starting point is 00:06:15 And then can you actually take all of that and combine that into a model that can give you the right answers? That's a very intensive process, which is what makes it like, okay, well, let's prove the value with text first, and then we'll see, right? Unfortunately, though, the problem is, in some cases, text is sufficient. A lot of the times people are like, you know, let's say, there is an example where you have a lot of
Starting point is 00:06:50 PDFs, and granted, it took some while to start parsing PDFs to the level of, you know, actually understanding tables and images within PDFs too. It looks simple, but it's not always. But like, you know, if you're trying to understand a lot of reports that you do internally and enable a RAG chatbot, that's the kind of application that you can pretty easily start off with. And, you know, so if you start seeing the ROI, good. But if it is the sort of application where a lot of information was stuck in these other data types and you started just with text, it is quite possible that people arrive at the wrong conclusion that AI doesn't work. And I've seen that through a lot of, you know, even before we got to the whole RAG and agentic stories, like the agentic world that we are in today, even when people
Starting point is 00:07:35 were just trying to do like, you know, let's say e-commerce personalized recommendation. People wanted to use visual similarity search to recommend products that look like this, because, you know, we are very visual people, like you said. We look at something and that's what attracts us, not the text description of the product, right? And a lot of the times, teams would start building it, but they didn't have tools to be able to query and see what their image data sets look like. So they would just train the models, you know, with partial knowledge, not do the best job. And then the recommendations weren't as good as, you know, just based on your friend bought this and so you should buy this sort of stuff. And so the conclusion used to be, well,
Starting point is 00:08:20 AI is not helping us. And so that's why I kind of warn in terms of like, you know, text is a good start, but there are a lot of cases where you got to bring in the other signals. And it looks very daunting because the tooling also has been pretty broken. But that's kind of why we are here. Right. So you talked a little bit about multimodal. I mean, it's already difficult enough to build models just for text or audio or video, let alone the different data types, right?
Starting point is 00:08:55 Videos notoriously are much bigger, binary, very difficult to analyze. Text is easy to read, but much smaller than video. So how do you deal with all those different data types and those different models, into one final model, so to speak? So the way I look at multimodal data and what it would take to unlock everything that it offers, I look at it in three sections, so to speak.
Starting point is 00:09:27 One is the models, what you're referring to. And, you know, there are a lot of vision language models now that are doing really well. I mean, I've recently been reading all about, like, generative, on the Sora side, but also in interpreting, like if you look at some of the Gemini output and stuff like that, they can really do a very good job understanding what's happening in image and video. The second aspect is processing. Well, there is no lack of processing today, I might say. I mean, NVIDIA really has changed the name of that game, and there's a lot of
Starting point is 00:10:01 other inference providers and some more niche companies coming up in that space. And then the third aspect is data management. And I feel like this is where the biggest gap exists. If you look through the evolution of machine learning, you know, we used to see all these papers where, oh, you know, we are now able to detect like this tiniest bit of, like a dog face in the image. And then you went and deployed it in like a medical imaging sort of scenario. And I literally have an example I ran where the brain lobe, like the brain scan was classified
Starting point is 00:10:36 as a telephone. So, you know, those sorts of things, they happen because, like, data has always been the differentiator. The better data you train with, the more representative data you give to your models, the better the outcome is. And it remains true to this day. But unfortunately, the solutions are still not there yet because changing data systems is a very involved and very complicated process. It doesn't have to be, but
Starting point is 00:11:13 That's how we've always seen it. Like, you know, I mean, there's so much, and like people think so much in terms of SQL and relational tables. Sometimes I run into people, it's like they won't use anything that doesn't support SQL. But that's not the right approach. The thing is, what solves your problem? If AI needs to see all of the data, you need a data foundation that allows you to put all of the data, make it searchable, and make it easy to navigate. That has always been our driving
Starting point is 00:11:36 principle behind this, because if you want to go, and, you know, earlier we were talking about going from shallow intelligence to deep intelligence, we are, you know, like I was saying, the models really are very advanced now. The processing power, you know, it's growing constantly and accomplishing a lot. If we don't solve the data problem, if we don't build the right foundation for data layers and instead keep cobbling together a bunch of tools that make you inefficient, that create inconsistencies, that don't let you scale as much as you can, you're still never going to fully go into deep intelligence territory. Yeah, it's so true.
Starting point is 00:12:15 And what I'm seeing, unfortunately, is that a lot of agentic applications are still focused on structured textual data. In other words, if we're going to have this process pass data on to this next process, in many cases, what it's doing is it's devolving, you know, let's say visual data into a, in some cases, a large set of text that describes that visual data in either free-form text or in, you know, structured data and then passes that to the next thing. Is it possible for agents to pass the actual image or some abstraction of that image between agents in these systems? It absolutely is possible, right?
Starting point is 00:13:03 So as we were building, so as you know, the product that my company offers is called ApertureDB. It's this unique vector graph hybrid database that we have purpose-built for multimodal AI. So there was always this aspect of, you know, if someone says, I want to find images similar to this image. There's, of course, the vector search angle. So you need embedding generation and things like that. But how were people
Starting point is 00:13:32 going to give you the image? Because they would have to give you the image to put in the embeddings, and then you can do vector search, right? They would have to, and let's say you were going even further. You wanted to find video clips that had, let's say, kids playing in it. You would have to be able to understand components of the videos, you know, generate embeddings from those and be able to search through them. But it's not just a vector search part because what ends up happening is, let's say you want clips where kids are playing in the video, right? Maybe the entire video is 30 minutes long and there is only like a two-minute section in which the kids are playing. A true multimodal AI database should allow you to
Starting point is 00:14:17 search, decode the video, go into that two-minute part, and just transfer that. Now, why is that important? Because I mean, imagine, videos, like you mentioned earlier, videos are really large. Are you going to be transferring the whole, like, you know, 30-minute-long video between different components that need to operate on it, or do you want to just take the two-minute sections that are relevant and pass those along? So there is this whole efficiency angle to it, and, you know, not having to wait for hours for something to happen. That can only be enabled when you introduce true multimodal understanding in your database, and that's kind of what we did, because you can literally say, I want all the video clips in which a person was smoking
Starting point is 00:15:06 or not smoking, and I want them returned to me in thumbnail size. This is one query to ApertureDB. It does the decoding. It takes out the parts that are interesting, and it bundles up the clips and sends them to the next stage. And you're not duplicating any of this information. Because remember, you got to think about scale. We are going to operate, we're operating on petabytes or maybe even zettabytes of data, right? And when we represent videos in our database, that is the original video file. But then we very smartly use the graph structure that we have to represent all the regions of interest in it. It can be interesting frames. It can be interesting clips in it. It's the same logic with images. You know, sometimes your cameras might be really high resolution and
Starting point is 00:16:01 capture a very wide angle. But all you care about is that person that's standing on the street. Why are you transferring all those pixels between the different stages? So to your original question in terms of why can't you transfer some of these other data, can you transfer these other data types? I think one is the protocol that allows you to define this step, and I think there is still some room to improve. Like, we had to come up with a different query language to support all of this. We actively chose not to implement, like, an SQL or a Cypher-based query language because
Starting point is 00:16:38 they were too restrictive in what we were trying to do. Now we have built plugins to make them compatible because a lot of other tooling lives in that world, but we started out without hindering ourselves and we had to introduce the exchange like, you know, okay, this is how you are going to give us blobs of various types. This is how we are going to demarcate one blob from the other and the metadata we return is going to tell you what the rest of it means and things like that. So we are able to do it. And, you know, we work with PDF, audio, images, videos, and it works great. And of course, in the back end, we've introduced the whole performance and scale and, you know, understanding
Starting point is 00:17:23 that these data types are different. You have, you know, more parallelism requirements. There is less dependency among, you know, individual parts of the data. So there's all that stuff that goes into the architecture to make it high performance and efficient. And then there isn't the protocol to enable, like, to define that language. So it's very much possible to do it. Right. Yeah, data movement has always been a problem.
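A minimal sketch of that clip-passing idea, assuming a hypothetical ClipRef structure and a stand-in find_clips helper rather than ApertureDB's actual query language or wire format. Only a reference to the relevant segment, not the full video file, moves from one agent to the next:

from dataclasses import dataclass

@dataclass
class ClipRef:
    # Hypothetical reference to a region of interest inside a stored video.
    video_id: str          # identifier of the original video in the database
    start_s: float         # start of the relevant segment, in seconds
    end_s: float           # end of the relevant segment, in seconds
    thumbnail_jpeg: bytes  # tiny preview so downstream agents can triage cheaply

def find_clips(query_text: str) -> list[ClipRef]:
    # Stand-in for a multimodal database query such as "all clips in which a
    # person is smoking, returned in thumbnail size". A real backend would run
    # the embedding search and the video decoding server-side; this returns a
    # hard-coded example so the sketch runs.
    return [ClipRef("warehouse_cam_03", 421.0, 540.0, b"")]

def downstream_agent(clip: ClipRef) -> str:
    # Second agent in the chain: it only ever sees the short clip reference.
    return f"reviewing {clip.video_id} from {clip.start_s:.0f}s to {clip.end_s:.0f}s"

# Agent A retrieves just the relevant sections; Agent B works on those
# references, never on the full 30-minute files.
for clip in find_clips("person smoking"):
    print(downstream_agent(clip))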
Starting point is 00:17:47 And as people collect more data, I think, the data movement by itself will get worse and worse. So how do people interact with your platform? Do you integrate with frameworks or orchestrators or how do people use and consume the platform? Yeah, I mean, you know, we started out with a database. No one wants to really think about a database. It needs to be hidden behind stuff. And so, yeah, we have, you know, originally when we started, because we were looking at a lot of training and inference sort of use cases,
Starting point is 00:18:20 we integrated with PyTorch, TensorFlow, Vertex AI, these sorts of frameworks. Then, you know, with RAG, in the RAG world, basically we introduced LangChain and LlamaIndex integrations, and now we are looking into agentic memory frameworks to integrate with them. We also do, you know, so ApertureDB has grown from just being a database into this entire platform where you can not just manage and search the data, but, you know, we have introduced workflows to make it easy to upload data, to generate embeddings, to extract information. I mean, we have a workflow where, you know, you give it a URL and out comes a chatbot. You don't have to worry about segmentation, like, you know, embedding models and things like that. And we have these various, like, you know, MCP server plugins, SQL server plugins. So that has grown now into a platform. So people can interact with ApertureDB directly on our cloud platform. They can use the Community Edition. We can do
Starting point is 00:19:23 a VPC deployment, but we also have our own UI. And I'm actually really pleased recently with all the developments that have happened in our UI, because you can literally go to one of the tabs in the UI, type a text question, and if you've, like, ingested the data and generated embeddings, it will show you, okay, these are the images that match, these are the PDFs that match, these are the videos that match, and all of this on one interface.
Starting point is 00:19:51 and there is still so much to improve there. Yeah, so the workflows or the plugins are kind of starter kits then, I guess, for the consumers. So can you talk a little bit about how the platform works with enabling autonomous and semi-autonomous AI agents? Right. So there are different ways that you can go about it. Like, you know, a lot of the agents behind the scenes, when they want to interact with data,
Starting point is 00:20:25 they basically might, you know, just do vector search queries and then, you know, implement their own LLM, like, do the semantic search and feed things to the LLMs and then generate the responses. So that's, like, the very fundamental way, where, you know, you can just use the vector search support we have. You can enhance that with graph RAG, sort of, you know, like, RAG improvements to start including the knowledge contained in the graph. But where we are seeing this go is essentially introducing this memory interface.
Starting point is 00:20:54 Because if you look at, you know, the memory frameworks, there's the component that actually takes user logs, user questions, and extracts preferences and, you know, relevant personas and stuff. But underneath, it ends up storing this in either vector databases or a combination of vector and graph databases or just simple text logs. And ApertureDB is perfect for storing all of that stuff. I mean, the throughput and latency we offer in terms of updates and queries, it's phenomenal. And so it makes a really, you know, good foundation. And so now the thing we are working on is like, okay, what's that memory layer, right?
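As a rough illustration of that memory-layer idea, the sketch below stores extracted user preferences as embeddings and recalls the closest ones at question time. It assumes the sentence-transformers package and a toy in-memory store; a production memory layer would sit on a vector/graph database like the one discussed here rather than a Python list:

import numpy as np
from sentence_transformers import SentenceTransformer  # assumed to be installed

model = SentenceTransformer("all-MiniLM-L6-v2")

class MemoryStore:
    # Toy agent memory: embed each remembered fact, recall by cosine similarity.
    def __init__(self) -> None:
        self.texts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, fact: str) -> None:
        self.texts.append(fact)
        self.vectors.append(model.encode(fact, normalize_embeddings=True))

    def recall(self, question: str, k: int = 2) -> list[str]:
        q = model.encode(question, normalize_embeddings=True)
        scores = [float(np.dot(q, v)) for v in self.vectors]
        top = np.argsort(scores)[::-1][:k]
        return [self.texts[i] for i in top]

memory = MemoryStore()
memory.add("User prefers window seats on flights")
memory.add("User is vegetarian")
memory.add("User's home airport is BOS")
print(memory.recall("book a flight and order a meal"))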
Starting point is 00:21:34 Like, you know, we start by integrating. We will basically integrate with some of the frameworks already out there, so the agents can really, like, make use of the memory and scale through what ApertureDB offers. So I love this talk of, you know, moving beyond text. I mean, that's the premise here at the beginning. But I wonder if you could help us with some examples or some ideas about moving beyond video, too. Because, of course, multimodal data, it doesn't just mean video.
Starting point is 00:22:08 It means all sorts of data types that are, you know, varied. So what other data types, beyond audio, video, and obviously text are people looking at with agentic applications. And what are some of the use cases for that? Well, you know, I mean, Stephen, as we talked about, a lot of people are still on text. We are not even at the audio, video stage. But, you know, so there is a lot going on with voice. So there's a lot of audio information.
Starting point is 00:22:38 I think people have realized there's a lot. They can even, like, you know, even before getting to voice and videos, there's a lot going on with PDFs. Because, I mean, you know, we all create so many reports, right? And we like to put tables to summarize. We put charts there. We have these pie charts and, you know, sometimes we put pictures to show the way, like, you know, how our architecture looks. So there is a lot that goes in.
Starting point is 00:23:04 And parsing PDFs in itself, and extracting information to start making sense of them, and, you know, at what point do you, how do you segment it and things like that? That stuff involves a lot of work. So I've been seeing, like, people who have managed to go beyond text, a lot of the times it's like, actually this is the kind of progression. You start text vector search, right? Then you start realizing, well, you know, your text is giving you some more relationships about things. And so can we connect and start, you know, utilizing the relations around it? So it naturally kind of progresses into, well, a graph sort of notion. Can we build a knowledge graph? Can we use that information to improve the responses?
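A simplified sketch of that progression, with made-up documents, a made-up edge list, and a stand-in embedding function: step one is plain text vector search, and step two follows graph relationships out from the vector hits, which is the essence of the knowledge-graph or graph-RAG improvement being described:

import numpy as np

# Toy corpus and toy relationships, purely illustrative.
docs = {
    "d1": "Quarterly report: revenue grew 12 percent in EMEA.",
    "d2": "EMEA growth was driven by the new retail partnership.",
    "d3": "Office plants were replaced in March.",
}
edges = {"d1": ["d2"], "d2": ["d1"], "d3": []}  # knowledge-graph style links

def embed(text: str) -> np.ndarray:
    # Stand-in embedding: hashed bag of words. A real system would use a model.
    vec = np.zeros(64)
    for token in text.lower().split():
        vec[hash(token) % 64] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def vector_search(query: str, k: int = 1) -> list[str]:
    # Step one of the progression: plain text vector search.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: -float(np.dot(q, embed(docs[d]))))
    return ranked[:k]

def graph_expanded_search(query: str, k: int = 1) -> list[str]:
    # Step two: follow relationships out from the vector hits.
    hits = vector_search(query, k)
    expanded = list(hits)
    for h in hits:
        for neighbor in edges[h]:
            if neighbor not in expanded:
                expanded.append(neighbor)
    return expanded

print(vector_search("why did EMEA revenue grow?"))           # text-only retrieval
print(graph_expanded_search("why did EMEA revenue grow?"))   # plus connected context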
Starting point is 00:23:48 Then it moves into, like, PDFs, and that automatically gets into, like, parsing images and stuff. Of course, voice AI companies, I think, Frederic, you would know a lot more on this one. They are starting to get, you know, voice, in my understanding, started with, like, let's convert this audio into the transcript and, again, go back to the vector search, where I think there is an increased understanding around, like, hey, if you did that, you lose the emotions. You lose the background information, and that's sometimes really important. So you go beyond that. If you get to videos, there are some cases, especially in medical imaging sort of cases, you know, where the scans, like, you know, nowadays a lot of the CT scans or ultrasounds can be pretty, like, 3D formats.
Starting point is 00:24:36 The neural scans can be in a different format. So when you go into more specific, like more domain-specific use cases, then the file formats start to be different. So that is, you know, can you understand the medical imaging file formats and start to enable the medical co-pilot sort of use cases, right? Because, I mean, patient information is naturally multimodal. Then there are satellite imaging sorts of use cases. Like, you know, what can you gain? And that can feed into, you know, traffic sort of things. It can feed into agriculture sort of things,
Starting point is 00:25:12 but something that encapsulates the GIS formats and understands the different layering, like a satellite with different resolutions, how do you align pictures from all of those? So there are different formats for that. And I think there's a lot of development needed on those applications from the model side too. So I think that we are going to see those things come,
Starting point is 00:25:37 you know, there'll be more dedicated companies whose first job is even figuring out the models to operate on these sorts of images. I mean, in the past, when we looked at medical imaging formats, we essentially would slice them up. Like, DICOM is like a series of images, and we would slice it up into PNG files, the usual image format, because DICOM itself contained too much. So that is, like, you know, but that's when you become very domain-specific. Yeah, I'm glad that you brought up medical, because I think that that's definitely an area where we're going to see a lot of development in this. But also, as you talked about, a lot of geographical data, I was talking to somebody who's working on drone technology, and
Starting point is 00:26:21 they are working with everything from, you know, GPS data streams to topographic information, like you mentioned, to, you know, real-time feeds from sensors. And all of these things have to be integrated and localized and plotted together. It was a really interesting conversation, a little bit beyond me, but I could understand the challenges, because there, you know, it's not just video, it's not just maps, it's not just text, it's all of these things as well as LiDAR and radar and, you know, cameras, and all of this had to be integrated. So I think that increasingly that's what the challenge is going to be: how do we integrate all of this data in a way that an AI agent can
Starting point is 00:27:10 understand and act on without just overwhelming it with data? I think you bring up a great point. And, you know, like anything in AI right now, it's a two-part thing, you know: there's the model, it needs to start having an understanding of it. And, you know, the multimodal models are definitely, you know, advancing rapidly. And there is that data part. So, you know, one of the unique aspects, so why did we bring a graph into the picture? Originally it wasn't because we were thinking there were going to be all these knowledge graph use cases and things like that
Starting point is 00:27:42 we brought it in because it gave us a good way to represent relationships and it was flexible to let us represent whatever data type we wanted to represent in it so in our same graph structure and we use a property graph structure for that reason instead of the RDF graphs that come in that just like you know we don't do the subject predicate object representation we do the full like If there is a representation for people, it'll be like a person, node in the graph, you know, name, last name, and all that stuff. But it can be very easily connected to another node, that's a picture of that person or that's connected to like video clips of that person. And all of these special data types videos, you know, we can introduce LIDARs, documents, all these things have representation in the graph. So you can go from one type to another, and in the same query, you can be saying, I want all of the various data things associated with Stephen and Frederick together, like whatever they appeared together, whether there was a text description,
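As a loose sketch of that property-graph idea, the snippet below links a person node to image and video-clip nodes so that one traversal gathers every modality tied to that person. The Node and connect structures, and the asset paths, are hypothetical illustrations, not ApertureDB's actual schema or API:

from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                                  # "person", "image", "video_clip", ...
    props: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # connected nodes, any modality

def connect(a: Node, b: Node) -> None:
    # Undirected edge between two nodes in the toy property graph.
    a.links.append(b)
    b.links.append(a)

# One person node connected to multimodal assets (identifiers are made up).
stephen = Node("person", {"first_name": "Stephen", "last_name": "Foskett"})
headshot = Node("image", {"uri": "s3://example-bucket/stephen.jpg"})
interview = Node("video_clip", {"video_id": "ep0902", "start_s": 60, "end_s": 180})
connect(stephen, headshot)
connect(stephen, interview)

def everything_about(person: Node) -> list[dict]:
    # One traversal returns every connected asset, regardless of modality.
    return [{"kind": n.kind, **n.props} for n in person.links]

print(everything_about(stephen))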
Starting point is 00:28:47 Yeah, I think one of the problems too is that not only is there a large amount of data, but the amount of metadata associated with the data is also getting more complex, right? So you talked a little bit about the medical and geographical. I mean, the amount of metadata surrounding it is creating an additional problem in the complexity of the model. So one of the questions I had for
Starting point is 00:29:32 you was, how do you see agentic AI evolving in the next 12 to 18 months? I mean, if you look at MCP servers, they are maybe less than around a year old; it's going so fast. What's your vision for agentic AI in the next 12 to 18 months? I think there could be a lot more focus on what it means to get agents in production. You know, we've built a lot of toy agents. We've built a lot of, like, you know, agents that are starting to do some serious work. But I think, especially in larger companies, you know, now it's time to go from POCs into production, which means really answering all these questions. So all that we discussed, you know, how does, how does it get the maximum ROI? You have to start thinking about your stack. Like, are you going to go a framework way? What framework is the best?
Starting point is 00:30:21 What sort of models give you the least amount of hallucination and get you the most distance in terms of, you know, your particular use case? So like, you know, we work a lot in retail and e-commerce, and there are personalized recommendations. Sometimes that doesn't require you to be 100% precise. You know, you're recommending products. You're telling them what they can buy. It's okay if, like, one of the products you recommend doesn't exactly fall in that umbrella. But we also work with some medical co-pilot use cases, and there it becomes very important that you do not hallucinate. So the guardrails become really important. So there'll be a lot more increased understanding in terms of, okay, for the vertical that you are
Starting point is 00:31:00 in, what are you okay accepting and what are you not? And then what does it mean? If you want to go in production, what are all the data types you're going to have to involve, what teams have to come together to put this information, what are the guardrails that are going to be, how are we going to evaluate, how are we going to observe and monitor this stuff, how are we going to capture user preferences at scale without disturbing their experience? I feel like there's going to be a lot more, you know, focused and organized efforts, and so the tools, like, you know, platforms like ours become really key in making that happen. I do wish, though, there is also some effort around taming compute. We've been throwing so much power, and you can see these numbers about,
Starting point is 00:31:45 you know, the electricity consumption for AI applications has, like, literally been drying reservoirs in places because of cooling. I really hope there is some effort around that too, to reduce the energy consumption. You know, it's interesting. I was just going to say, it's almost like people need some kind of special database that can handle all this multimodal data and maybe a platform that could bring it all together. Yeah, it is. I think what people need to know is that such technology exists and that it is possible to bring together various data types with AI applications and that, you know, people are
Starting point is 00:32:25 working on this, because I wonder how many people are just, you know, sort of dismissing it out of hand and saying, like, we just can't handle this or we don't know how to handle this. So I guess what do you see happening next from the industry overall in terms of integrating multimodal data with agentic AI? I think it's going to, I think it's going to increase at a much more rapid pace. I mean, you know, any time the big, big
Starting point is 00:33:00 companies start talking so heavily about it. You know, they start talking like six to nine months early because they're trying to build up hype around it. But if you went to NVIDIA GTC earlier this year or Google Next or, like, you know, re:Invent late last year, multimodal was already the thing and agents were already the thing. And it was naturally like multimodal AI agents, right? But of course, the practicality follows a little bit behind all of this. So, yeah, so I think we'll see a lot faster adoption, especially, like, you know, we are in production, so, you know, people can really unlock the data part, and the moment you unlock data, the compute is ready. All right.
Starting point is 00:33:41 Well, thank you so much for this. It's been, it's been very thought-provoking, as was, you know, our previous conversation. And I hope that our listeners are starting to say, wait a second, maybe it's not,
Starting point is 00:34:12 you know, just about text and just about structured data and, you know, passing JSON between, you know, agents and things like that. Maybe it's, maybe it's more than that. And hopefully that's the sort of thing that can come from this season of Utilizing Tech, where we're going to be talking to a bunch of folks who are doing some really cool things
Starting point is 00:34:37 Yeah, so I am very active on LinkedIn, so please connect with me on LinkedIn. I suppose you'll share the profile as part of the description. For ApertureData, the best place, you can go to aperturedata.io or docs.aperturedata.io, and I would say give it a try. The cloud has a free trial. So if you sign up on cloud.aperturedata.io, you can try out the database, you can try out our various workflows that make it really easy to ingest existing, you know, data examples, run some embeddings, try out the UI, everything is there. And if you are concerned about privacy, because, you know, you work at a company that won't let you send data to a SaaS tool, then we also have a free Community
Starting point is 00:35:11 Edition on Docker Hub, and you can definitely try all the database features through that as well. And we would really like to grow our community. We have a Slack channel, and we really, you know, amplify people who build and contribute to the set of applications that can help end users. So for sure, looking forward to such contributions. And more multimodal agents built on top of ApertureDB. Yeah, I can't wait to see what people build. And Frederic, how about you?
Starting point is 00:35:42 Yeah, I'm also active on LinkedIn. You can find me as Frederic Van Haren, and on the HighFens site, and you will see both of us at AI Field Day, which is coming up real soon here at the end of October. We're pretty excited to be bringing together a cool group of companies talking about various elements and aspects of AI, some of whom you will hear about on this episode or this, I'm sorry, on this season of Utilizing Tech, and hopefully some of whom we will connect with further. If you are excited about AI and agentic AI and where this is all going, do check out the Tech Field Day website. Also check out TechFieldDay.com.
Starting point is 00:36:26 That's the website. Techstrong AI is our media site. And also, we're going to be launching another podcast, a weekly podcast focused on AI as well. So keep an eye out for that. So thank you so much for joining us and listening to this episode
Starting point is 00:36:51 of Utilizing Tech. You'll find this podcast in your favorite podcast application. Just search for Utilizing Tech. You'll also find us on YouTube. If you enjoyed this discussion, we'd love to hear from you. Please give us a rating. Please give us a review. Please subscribe. This podcast is brought to you by Tech Field Day, which is part of the Futurum Group. For show notes and more episodes, though, head over to our dedicated website, which is UtilizingTech.com, or connect with us on X/Twitter, Bluesky, Mastodon, or, yeah, LinkedIn.
Starting point is 00:37:14 You can look for Utilizing Tech. Thanks for listening, and we will see you next week. Thanks.
