Orchestrate all the Things - Amazon Neptune introduces a new Analytics engine and the One Graph vision. Featuring Brad Bebee & Denise Gosnell, Amazon Neptune General Manager & Principal Product Manager
Episode Date: November 29, 2023

Amazon Neptune, the managed graph database service by AWS, makes analytics faster and more agile while introducing a vision aiming to simplify graph databases. It's not every day that you hear product leads questioning the utility of their own products. Brad Bebee, the general manager of Amazon Neptune, was all serious when he said that most customers don't actually want a graph database. However, that statement needs contextualization. If Bebee had meant that in the literal sense, the team that he and Amazon Neptune Principal Product Manager Denise Gosnell lead would not have bothered developing and releasing a brand new analytics engine for their customers.

We caught up with Bebee and Gosnell to discuss Amazon Neptune's new features and the broader vision. We cover where Amazon Neptune fits in the AWS vision of data management, and how the new analytics engine provides a single service for graph workloads, high performance for graph analytic queries and graph algorithms, and vector store and search capabilities for Generative AI applications. We also share insights on the One Graph vision, the road from serverless to One Graph via HPC, as well as vectors and Graph AI.

Article published on Orchestrate all the Things: https://linkeddataorchestration.com/2023/11/29/amazon-neptune-introduces-a-new-analytics-engine-and-the-one-graph-vision/

00:00:00 Introduction
00:01:44 Amazon Neptune & AWS vision of data management
00:05:35 The Importance of Graph Databases
00:08:55 Amazon Neptune Use Cases
00:13:13 Introduction to Amazon Neptune Analytics
00:15:20 Key Features of Neptune Analytics
00:17:40 Use Cases for Neptune Analytics
00:21:10 Preparing Data for Generative AI Applications
00:23:37 Neptune Analytics Use Cases and Deployment
00:26:43 Pricing and Roadmap Q&A
00:48:46 Conclusion
 Transcript
    
                                         Welcome to Orchestrate all the Things.

                                         I'm George Anadiotis and we'll be connecting the dots together.

                                         Stories about technology, data, AI and media, and how they flow into each other, shaping our lives.

                                         Amazon Neptune, the managed graph database service by AWS,

                                         makes analytics faster and more agile,

                                         while introducing a vision aiming to simplify graph databases.

                                         It's not every day that you hear product leads questioning the utility of their own products.

                                         Brad Bebee, the general manager of Amazon Neptune,

                                         was all serious when he said that most customers don't actually want a graph database.

                                         However, that statement needs contextualization.

                                         If Bebee had meant that in the literal sense,

                                         the team that he and Amazon Neptune Principal Product Manager Denise Gosnell lead

                                         would not have bothered developing and releasing a brand new analytics engine for their customers.

                                         We caught up with Bebee and Gosnell to discuss Amazon Neptune's new features and the broader vision.

                                         We cover where Amazon Neptune fits in the AWS vision of data management,

                                         and how the new analytics engine provides a single service for graph workloads,

                                         high performance for graph analytic queries and graph algorithms,

                                         and vector store and search capabilities for generative AI applications.

                                         We also share insights on the One Graph vision,

                                         the road from serverless to One Graph via HPC,

                                         as well as vectors and graph AI. I hope you will enjoy this.
                                         
                                         If you like my work on Orchestrate all the Things,

                                         you can subscribe to my podcast, available on all major platforms,

                                         my self-published newsletter, also syndicated on Substack,

                                         Hackernoon, Medium, and DZone,

                                         or follow Orchestrate all the Things on your social media of choice.
                                         
    
                                         Hi, I'm Brad Bebee.
                                         
                                         I'm the general manager of Amazon Neptune and Amazon Timestream.
                                         
                                         Neptune is AWS's managed graph database service,
                                         
                                         and Timestream is AWS's managed time series database service.
                                         
                                         A little over seven years ago, I joined AWS from a small open source graph database company
                                         
                                         to launch a managed graph database service.
                                         
                                         And today, I'm really excited to talk to you
                                         
                                         about Amazon Neptune Analytics
                                         
    
                                         and give you a preview of what's going to be coming soon.
                                         
                                         But the first thing I wanted to do
                                         
                                         was to give you a little bit of an overview
                                         
                                         about how we at AWS are thinking about our vision
                                         
                                         of data management.
                                         
                                         And at AWS, our vision is an end-to-end architecture
                                         
                                         where customers don't have to worry about how their data is stored or managed.
                                         
                                         To do that, we really have three different pillars.
                                         
    
                                         The first is having the most comprehensive set of services to store, analyze, and share your data.
                                         
                                         The second is having solutions that make it easy to connect all of your data between the services that you want to use.
                                         
                                         And the third, which is very important, is having the right governance and policy solutions in place so that you know that your teams
                                         
                                         can use the data effectively and quickly, but within policy and regulatory guidelines.
                                         
                                         At AWS, from a database perspective, we have the most complete set of both relational and
                                         
                                         purpose-built databases.
                                         
                                         And of course, today we're going to focus on one of my favorites,
                                         
                                         which are graph databases.
                                         
    
                                         Graphs are awesome.
                                         
                                         And the reason that graphs are awesome is because they allow you to innovate
                                         
                                         based on the relationships in your data.
                                         
                                         In a graph data model, relationships are a first-class entity,
                                         
                                         which means you can ask questions and build applications that explore these
                                         
                                         relationships and the connections in your data.
                                         
                                         The challenge is that when you access a graph,
                                         
                                         the way that you need to touch the data is often random. And so if you think about
                                         
    
                                         how one person is connected to another, is connected to another, the way that you lay out
                                         
                                         and store that data internally in a system makes it very difficult to predict how you're going to
                                         
                                         access it. So it's hard due to the random data access. And when you want to ask generalized graph questions or do high-performance graph processing,
                                         
                                         you often get the best result by using a purpose-built graph solution.
                                         
                                         Amazon Neptune is AWS's fully managed graph database solution.
                                         
                                         It's purpose-built for processing graphs,
                                         
                                         and it's designed for interactive graph applications
                                         
                                         where you need to store billions of relationships,
                                         
    
                                         in fact, up to 128 terabytes of graph data,
                                         
                                         and support interactive navigation
                                         
                                         with parameterized searches from one to three hops.
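To make "parameterized searches from one to three hops" concrete, here is a minimal sketch of a hop-bounded breadth-first traversal over an in-memory adjacency list. It is not Neptune's API or how Neptune executes queries; the function name `neighbors_within_hops` and the toy `follows` data are assumptions made up for illustration.

```python
from collections import deque


def neighbors_within_hops(graph, start, max_hops):
    """Return {node: hop_count} for nodes reachable from start in 1..max_hops hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]  # the start node itself is not a neighbor
    return seen


# Hypothetical toy data: who follows whom.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": ["frank"],
}

# The start node and hop limit are the "parameters" of the search.
result = neighbors_within_hops(follows, "alice", 3)
```

In this toy graph, `frank` is four hops from `alice`, so a three-hop search does not return him.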
                                         
                                         Neptune offers customers the most choice

                                         of open-source and open-standard query languages, supporting both the labeled property graph and the Resource Description Framework graph models and the three query languages of openCypher, Apache TinkerPop Gremlin, and RDF SPARQL.

                                         And did I mention that Neptune also provides a serverless deployment option and global AWS regional deployments?
                                         
                                         One of the things that most excites me is that every day, thousands of customers create tens of thousands of different Neptune instances.
                                         
                                         And you can see from the customer logos here
                                         
    
                                         that the kinds of use cases that customers do with graphs
                                         
                                         are very broad.
                                         
                                         When we look across our business,
                                         
                                         we really see four different areas
                                         
                                         where we're seeing traction with customers.
                                         
                                         The first are knowledge graphs,
                                         
                                         which is how is information related.
                                         
                                         And we see customers using this for information retrieval, as a precursor for machine learning types of applications,
                                         
    
                                         and increasingly with various different kinds of Gen AI use cases.
                                         
                                         A brief example is Siemens, which is using a knowledge graph to power a digital twins use case,

                                         where they provide a query service for their digital twins that's connected by a knowledge graph.
                                         
                                         The second major use case that we see from customers are identity graphs.
                                         
                                         And these are using many different observations of customers or users or devices, and then using relationships between those observations,
                                         
                                         often in conjunction with different kinds of analytics,
                                         
                                         to be able to try and understand and create a 360-degree view of the customer.
                                         
    
                                         So of all the interactions that I have across all this data,
                                         
                                         who are the actual customers behind the scenes?
                                         
                                         How can I help understand the customer journeys? How can I use that as a precursor to various different kinds of
                                         
                                         fraud applications? Fraud, of course, is a classic graph use case. It's really only limited by the
                                         
                                         ingenuity and the creativity of those who commit fraud.
                                         
                                         But the kinds of fraud that we see customers using Neptune to detect are dealing with the relationships in the data.
                                         
                                         So they're looking at transactions or groups of individuals and trying to understand how those individuals are related to be able to do fraud detection.
                                         
                                         A very fun example of this is Games24x7, which is an online gaming

                                         company in India, and they play rummy for money. And one of the behaviors that they saw was that

                                         multiple groups of people were playing multiple tables of rummy at the same time, and they were

                                         colluding. And so by using a graph to look at the relationships

                                         across players at the same time,
                                         
                                         they're able to detect this particular pattern
                                         
                                         of collusion-based fraud.
                                         
                                         So it's a fun and interesting example.
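One way to picture this kind of relationship-based check is to count how many concurrent tables each pair of players shares. The sketch below is only an illustration under assumed data, not the method Games24x7 or Neptune uses; `shared_table_pairs`, the `min_shared` threshold, and the table data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def shared_table_pairs(tables, min_shared=2):
    """Return {(player_a, player_b): count} for pairs seated together at >= min_shared tables."""
    pair_counts = Counter()
    for players in tables.values():
        # Each unordered pair at a table is an edge in the co-play graph.
        for a, b in combinations(sorted(set(players)), 2):
            pair_counts[(a, b)] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_shared}


# Hypothetical snapshot of tables running at the same time.
tables = {
    "t1": ["p1", "p2", "p7"],
    "t2": ["p1", "p2", "p8"],
    "t3": ["p1", "p2", "p9"],
    "t4": ["p3", "p4", "p7"],
}

suspicious = shared_table_pairs(tables)
```

Here `p1` and `p2` sit together at three concurrent tables, so they are the only pair flagged. A pair count is a weak signal on its own; it only narrows down which relationships an analyst might look at.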
                                         
                                         And the last major use case that we see from customers
                                         
    
                                         are security graphs, also a classic graph space. We see
                                         
                                         customers using the connections between their devices and networks to help them understand
                                         
                                         their cloud security posture, to do detection about data exfiltration and data flows, to manage
                                         
                                         policies for identity and access control. This has been one of our fastest growing segments over the last few years.

                                         I saw that you have some examples on each of these slides for these use cases.

                                         And I think that for the identity, fraud, and security graphs, those are probably, at least, handled by your property graph engine.

                                         The knowledge graph case is probably typically handled by the RDF engine.
                                         
                                         I think I also looked around a little bit and I saw you have pre-configured notebooks
                                         
    
                                         for those three use cases for identity and all of them are property graphs, right?
                                         
                                         Yeah. So let me answer them in reverse order.
                                         
                                         So we do have the Amazon Neptune notebooks, which are Jupyter notebooks that we provide in open source.

                                         And they give you examples for both how to use graphs, how to use graph databases, and in particular for the different use cases.

                                         And you're correct in that for the fraud, identity graph, and security graph use cases,

                                         the examples that we provide are with Apache TinkerPop Gremlin.
                                         
                                         And we do have a knowledge graph use case,
                                         
                                         which I believe we have both an RDF and a property graph one.
                                         
    
                                         We do, you know, I think that I see a mix of different uses of graph models.
                                         
                                         I think the first assumption that I might make would be that most customers
                                         
                                         are using RDF to build knowledge graphs.
                                         
                                         And while we do see that many customers are using RDF to build knowledge graphs,
                                         
                                         particularly those who are really thinking deliberately about their information
                                         
                                         architecture and their information models,
                                         
                                         we also see a large number of customers choosing to build knowledge graphs
                                         
                                         with property graph.
                                         
    
                                         And I think that that's speaking a lot to the value that customers see
                                         
                                         by relating the data and that it sort of transcends the choice
                                         
                                         of the graph model for those kinds of use cases.
                                         
                                         Okay, cool. Thanks. I'll let you pick up from there.
                                         
                                         Yeah, no, sure, no problem.
                                         
                                         So one of my favorite examples
                                         
                                         from the security graph space is Wiz.
                                         
                                         Wiz is a very fast growing security ISV.
                                         
    
                                         They have cloud security posture management software.
                                         
                                         They have lots of research services.
                                         
                                         But the thing that is
                                         
                                         really interesting about Wiz is the way that they're using the graph to really help you
                                         
                                         understand why findings are important. So, you know, in the security space, it's very easy to
                                         
                                         get overwhelmed by alerts and things that seem scary that your detection systems are finding.
                                         
                                         And it's challenging to really understand, of all the things that we've found,
                                         
                                         which ones are the most important for you to prioritize
                                         
    
                                         or for you to ask your IT teams to prioritize fixing.
                                         
                                         And what you see here is an example of Wiz's application.
                                         
                                         And you can see that they're using the graph to help understand why a particular vulnerability or detection is important to fix.
                                         
                                         And so in this case, what you're seeing is that the reason something is important is because this particular detection means that one of your business applications is connected to the Internet, versus a developer system or something that was standalone and maybe had other kinds of defense-in-depth pieces.
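The idea of using a graph to explain why a finding matters can be sketched as a path search: if there is a path from the Internet to the affected resource, that path is the explanation. This is a simplified illustration, not Wiz's implementation; `explain_exposure` and the `network` data are hypothetical.

```python
from collections import deque


def explain_exposure(graph, source, target):
    """Return one shortest path from source to target as a list of nodes, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical connectivity graph: an edge means "can reach".
network = {
    "internet": ["load_balancer"],
    "load_balancer": ["billing_app"],
    "billing_app": ["vulnerable_vm"],
    "dev_laptop": ["test_vm"],
}

exposed_path = explain_exposure(network, "internet", "vulnerable_vm")
standalone_path = explain_exposure(network, "internet", "test_vm")
```

In this toy data the vulnerable VM is reachable from the Internet through a business application, while the test VM is not reachable at all, which is the kind of distinction that would drive prioritization.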
                                         
                                         So I think it's really interesting that, you know, their use of the graph is for explainability,
                                         
                                         and that's, you know, just really helps them be differentiating in their offering. So with that, I'm very excited to turn it over to Denise
                                         
                                         to talk about our new offering, Neptune Analytics.
                                         
                                         Thank you so much, Brad and George. My name is Denise Gosnell. I joined the Neptune team a little
                                         
    
                                         over a year ago, and it's been a privilege to get to be a part of this team and join here to talk to you all and to share where we are going with our new analytics engine,
                                         
                                         Neptune Analytics. Amazon Neptune Analytics is a new analytics engine for Amazon Neptune so that
                                         
                                         our customers can make better data discoveries by analyzing large
                                         
                                         amounts of graph data with billions of connections incredibly quickly. So far, there are three main
                                         
                                         features or main ways to think about Amazon Neptune Analytics that our customers in a beta
                                         
                                         program have been loving the most. The first of which is that it's a single service for working with your graphs. You can
                                         
                                         invoke popular graph algorithms, you can run low latency queries, and perform vector similarity
                                         
                                         search all from a single API. This API supports openCypher, which is a really popular open source
                                         
    
                                         graph query language. The second thing our customers have been most excited about is that
                                         
                                         it's incredibly fast. So far, we've seen that our high-performance graph computing techniques have
                                         
                                         proven to be about 100 times faster for loading data in, and we've got 20 times faster scans and
                                         
                                         about 200 times faster columnar scans when you are running graph analytic queries and performing
                                         
                                         graph algorithms. The third thing our customers have loved the most is how much easier it is to start to build generative AI applications quickly.
                                         
                                         You can store and search vectors within Neptune Analytics by storing embeddings on nodes.
                                         
                                         And we also can use the LangChain library to translate natural language questions into openCypher
                                         
                                         queries, really lowering that bar to entry to working with graphs and working with graph
                                         
    
                                         algorithms.
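As a rough picture of what searching vectors stored on nodes means, the sketch below ranks nodes by cosine similarity between a query vector and an `embedding` property. It is plain Python and does not use the Neptune Analytics API; the property name, the function names, and the three-dimensional toy embeddings are assumptions, and it expects non-zero vectors of equal length.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_similar(nodes, query, k):
    """Return the ids of the k nodes whose embedding is most similar to query."""
    scored = [
        (cosine_similarity(props["embedding"], query), node_id)
        for node_id, props in nodes.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [node_id for _, node_id in scored[:k]]


# Hypothetical nodes, each carrying an embedding as a property.
nodes = {
    "doc1": {"embedding": [1.0, 0.0, 0.0]},
    "doc2": {"embedding": [0.9, 0.1, 0.0]},
    "doc3": {"embedding": [0.0, 1.0, 0.0]},
}

nearest = top_k_similar(nodes, [1.0, 0.0, 0.0], 2)
```

A real vector store would use an index rather than scoring every node, but the ranking it approximates is the same.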
                                         
                                         So far, our customers have been using Neptune Analytics in three unique ways, the first of which

                                         is that they're using it to perform ephemeral analytics.
                                         
                                         So imagine
                                         
                                         you have a workflow where you just need to spin up a graph really quickly, run some analysis,
                                         
                                         and turn it off. That's one of the main ways our customers have been using it, and it's giving you
                                         
                                         an overall lower total cost of ownership for your graph analytics workflows. The second way
                                         
                                         our customers have been loving using Neptune Analytics is for performing low latency analytical queries.
                                         
    
                                         The best way to think about that is that there are many established ML pipelines with feature tables so that you can perform real time predictions off those features.
                                         
                                         Now our customers are able to run incredibly high concurrent query workloads to augment their existing feature tables with new analytics about their graph structure.
                                         
                                         That gives their ML models much higher prediction rates and overall higher end user engagement.
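Augmenting a feature table with analytics about graph structure can be as simple as adding a per-entity graph measure as a new column. The sketch below adds node degree; it is an assumption-laden illustration rather than a Neptune Analytics workflow, and `add_degree_feature`, `account_id`, `avg_amount`, and the edge list are hypothetical.

```python
from collections import Counter


def add_degree_feature(feature_rows, edges):
    """Return copies of feature_rows with a 'degree' column computed from an undirected edge list."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    # Counter returns 0 for accounts that have no edges.
    return [{**row, "degree": degree[row["account_id"]]} for row in feature_rows]


# Hypothetical existing feature table and transaction graph.
feature_rows = [
    {"account_id": "a1", "avg_amount": 42.0},
    {"account_id": "a2", "avg_amount": 7.5},
    {"account_id": "a3", "avg_amount": 19.0},
]
edges = [("a1", "a2"), ("a1", "a3")]

augmented = add_degree_feature(feature_rows, edges)
```

Degree is the simplest structural feature; the same pattern applies to any per-node score a graph algorithm produces.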
                                         
                                         The third way our customers are loving using Neptune Analytics is for doing vector search
                                         
                                         and then building Gen AI applications. Like we mentioned, you can perform a vector similarity
                                         
                                         search when you store your embeddings in Neptune Analytics, and then we also have a
                                         
                                         much easier way to translate those English questions into graph queries because the way we
                                         
                                         think about data just so happens to fit really well with how a graph structures it. I'd like to
                                         
    
                                         go a little bit deeper and just show you all some stories about how our customers have been using
                                         
                                         each of these three types of use cases in a beta program that we've been running.
                                         
                                         So for ephemeral analytics, there is a financial services company that has been able to increase its rate of successfully identifying fraud at the point of sale
                                         
                                         from about 17 to 58 percent. And they've been doing that by quickly spinning up a graph, loading their data,
                                         
                                         studying specific structural properties about it, and then turning it off. And that type of
                                         
                                         investigation for their analysts has helped them identify those new patterns of fraud much faster.
                                         
                                         Because as we say, and as our experience is showing, fraudsters are only as creative as
                                         
                                         well, their minds. And so you've
                                         
    
                                         got to be able to quickly find those patterns and be able to deploy new ways to fight against them.
                                         
                                         There's also a large media and technology company we've been working with who has loved the massive
                                         
                                         simplification that the ephemeral analytics workflows have brought for their data science
                                         
                                         teams. So what they've been able to do is to replace their data science pipeline with spinning up a graph, extracting those insights from algorithms and combinations of queries, and then turning it off.
                                         
                                         So that new process for them has offered an overall lower total cost for their data science team, and it's been a ...

                                         With our customers in our beta program for doing low latency analytical queries, we've been working with a social media company that used to need about 14 hours to load over 10 billion edges into a graph,
                                         
                                         understand specific properties about their recommendations
                                         
                                         and their friend engagements to then augment their ML pipelines.
                                         
                                         Now they can do that in about two hours,
                                         
    
                                         and they're able to run much higher concurrent queries
                                         
                                         to augment those feature tables and to get those graph stats in their ML pipelines.
                                         
                                         Also, Amazon.com has been able to reduce time to resolution by about 25% for investigating fraud cases. Again, very similarly, as you heard, by being able to extract a graph feature by using
                                         
                                         Neptune Analytics and then augment their understanding of the ML predictability
                                         
                                         of those features for their ML pipelines so that they can have a much faster resolution
                                         
                                         on finding that fraud when it's emerging very quickly.
                                         
                                         Now, generative AI.
                                         
                                         Everyone wants to talk about how people are building generative AI, and we've got some more
                                         
    
                                         stories about how our customers are working with us to build them with Neptune Analytics.
                                         
                                         So first off, one of the biggest themes that we hear about is how generative AI is helping customers make data discoveries.
                                         
                                         And that is exactly how we have been working with a large healthcare products company to do so.
                                         
                                         Specifically, they want to create scientifically aware search: translating proteins into vectors, doing similarity search, and then having the ability to find out what's similar and explain why.
                                         
                                         It gives you that much better way to discover new connections in your data.
                                         
                                         And it's really exciting and it's very interesting, especially right now on the Neptune team, to see how our customers are innovating in that fashion. We're also working with a very large online retail store
                                         
                                         who needs to make sure that they can quickly identify and flag pirated material that's being
                                         
                                         listed and sold. So you can imagine that you might have a piece of content that you know is
                                         
    
                                         pirated, and you can combine vector similarity search and a knowledge graph to say, well, for
                                         
                                         this piece of pirated material, find other items
                                         
                                         that are very similar to it. And then you can traverse your knowledge graph to determine other
                                         
                                         sellers, listers, and buyers or patterns of how that pirated material is being listed and sold
                                         
                                         on the website. It's all about giving a service to our customers that is incredibly fast at detecting emerging patterns in as near real time as possible.
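The two-step pattern described here, a vector similarity search followed by a knowledge graph traversal, can be sketched in plain Python. This is only an illustration under stated assumptions, not Neptune Analytics code: the item names, the embeddings, the `listed_by` mapping, and the 0.95 threshold are all made up for the example, and a real deployment would run both steps inside the graph service.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for listed items (assumed data, illustration only).
embeddings = {
    "item_known_pirated": [0.9, 0.1, 0.0],
    "item_a": [0.88, 0.12, 0.01],
    "item_b": [0.0, 0.2, 0.95],
    "item_c": [0.85, 0.2, 0.05],
}

# Hypothetical knowledge graph edges: item -> seller who listed it.
listed_by = {
    "item_known_pirated": "seller_1",
    "item_a": "seller_2",
    "item_b": "seller_3",
    "item_c": "seller_1",
}

def similar_items(seed, threshold=0.95):
    """Step 1: vector similarity search around a known pirated item."""
    seed_vec = embeddings[seed]
    return sorted(
        item for item, vec in embeddings.items()
        if item != seed and cosine(seed_vec, vec) >= threshold
    )

def sellers_to_review(seed, threshold=0.95):
    """Step 2: traverse from the seed and its look-alikes to their sellers."""
    items = [seed] + similar_items(seed, threshold)
    return sorted({listed_by[item] for item in items})
```

Calling `sellers_to_review("item_known_pirated")` first collects the look-alike items and then follows the graph edges to the sellers behind them, which is the explicit-plus-implicit combination the speaker describes.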
                                         
                                         When we've been working with our customers for doing generative AI applications, we have been working very closely to determine how they're going to be deploying Neptune Analytics within their workflows so that they can most quickly build generative AI apps. So there's two ways
                                         
                                         that we've been working with them. There's the perspective of how our customers and users are
                                         
                                         going to be using Neptune Analytics and then using generative AI. But then there's also the
                                         
    
                                         perspective of how you're going to prepare your data and get it ready in a generative AI app.
                                         
                                         Let's talk about those in reverse order.
                                         
                                         So for getting your data ready, there's a need to have processes where I can imagine you have a large amount of training data that sits in a data lake.
                                         
                                         And you're going to need to use one of the well-established tools in Amazon or in AWS's tool suite like AWS Glue or Amazon EMR
                                         
                                         to process that data. And then typically our customers for Neptune Analytics are storing it
                                         
                                         in one of two places. You might need to store it in Amazon Neptune Analytics itself or Amazon
                                         
                                         Neptune, or you might just be storing it in S3. So once you process your data out,
                                         
                                         you can put it in Neptune or you can put it in S3. And from each of those locations,
                                         
    
                                         it's incredibly fast to get it into Neptune Analytics to use in your application.
                                         
                                         Once you're also pre-processing your data, you might want to learn embeddings off of that. And
                                         
                                         that's when, on a second note, you might want to use maybe the
                                         
                                         open source LangChain library, or you might want to use SageMaker or Amazon Bedrock to extract
                                         
                                         embeddings about your data to then also persist in Neptune Analytics to use for vector similarity
                                         
                                         search. So those are two ways to look at how customers are extracting data from their data
                                         
                                         lakes, storing it in the Neptune service, and then using Amazon
                                         
                                         tools like Amazon Bedrock to get embeddings to set it up. Now let's look at the other side,
                                         
    
                                         how our end users or how our customers' end users are using generative AI applications.
                                         
                                         They are going to probably start with invoking a query, and they're absolutely loving hitting
                                         
                                         that LangChain library from open source
                                         
                                         to translate a human question into a graph query
                                         
                                         because that's our favorite part about working with graphs.
                                         
                                         The way we think and speak about data
                                         
                                         naturally maps into that connected
                                         
                                         and natural way to work with data.
                                         
    
                                         Once they have their query,
                                         
                                         those queries are being run against Neptune Analytics
                                         
                                         at incredibly high speeds with high concurrency
                                         
                                         so that you're able to get answers back, rewrap them, and use another large language model to make them much easier to understand and then return it to the end user.
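The end-user flow just described (question, graph query, results, readable answer) can be sketched as a small pipeline. This is a rough sketch under assumptions: the three steps are passed in as plain callables, the stand-in functions below are hypothetical and do not call LangChain, Amazon Bedrock, or Neptune Analytics, and the openCypher-style query string is only an example.

```python
from typing import Callable, List

def answer_question(
    question: str,
    to_graph_query: Callable[[str], str],
    run_query: Callable[[str], List[dict]],
    summarize: Callable[[str, List[dict]], str],
) -> str:
    """Question -> graph query -> results -> readable answer."""
    query = to_graph_query(question)   # e.g. an LLM prompted with the graph schema
    rows = run_query(query)            # e.g. a call to the graph service
    return summarize(question, rows)   # e.g. a second LLM call to rephrase results

# Stand-in implementations so the sketch runs without any external service.
def fake_to_graph_query(question: str) -> str:
    return "MATCH (p:Person {name: 'Brad'})-[:FRIEND]->(f) RETURN f.name AS name"

def fake_run_query(query: str) -> List[dict]:
    return [{"name": "Denise"}, {"name": "George"}]

def fake_summarize(question: str, rows: List[dict]) -> str:
    names = ", ".join(row["name"] for row in rows)
    return f"Brad's friends are: {names}."

answer = answer_question(
    "Who are Brad's friends?",
    fake_to_graph_query,
    fake_run_query,
    fake_summarize,
)
```

Keeping the three steps as separate callables mirrors the point about working backwards from the end user: each stage can be swapped or tuned for latency without touching the others.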
                                         
                                         We like to talk about that because it's really important to see as the generative AI space is moving so quickly, it's important to see and start
                                         
                                         to understand the patterns in which people are deploying Neptune Analytics and deploying graph
                                         
                                         technology to be used within a generative AI app. And you got to consider both sides. You got to
                                         
                                         understand how you're going to prepare the data. And then you also need to work backwards from how
                                         
    
                                         your end customer is going to use it so that you can architect it to be as fast as possible.

                                         So let's talk about pricing here for a second.
                                         
                                         When you start to look at Neptune Analytics and its pricing, its pricing is going to be based on
                                         
                                         memory optimized units. So we're going to be pricing this based on how much compute that you
                                         
                                         use for Neptune Analytics per hour. You're going to have essentially a
                                         
                                         capacity, a provision of memory, and it's going to be associated to different compute and network
                                         
                                         resources. And there's a price per hour for how much compute that you're going to be using.
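As a rough sketch of how a per-hour capacity price plays out, the arithmetic below compares a graph that stays provisioned all month with one that is spun up for a couple of hours a day. Every number here is a hypothetical placeholder, not a published AWS price, and the function name is made up for the example.

```python
def graph_cost(capacity_units, hours, price_per_unit_hour):
    """Cost of keeping a graph provisioned: capacity x hours x hourly rate."""
    return capacity_units * hours * price_per_unit_hour

# All numbers below are hypothetical placeholders, not published AWS prices.
PRICE_PER_UNIT_HOUR = 0.50   # assumed rate per memory-optimized unit per hour
CAPACITY_UNITS = 128         # assumed provisioned capacity

always_on = graph_cost(CAPACITY_UNITS, 24 * 30, PRICE_PER_UNIT_HOUR)  # 720 hours
ephemeral = graph_cost(CAPACITY_UNITS, 2 * 30, PRICE_PER_UNIT_HOUR)   # 2 hours per day
```

Under these assumed numbers the ephemeral pattern is billed for 60 hours instead of 720, which is the intuition behind the "spin up, run analysis, turn it off" use case.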
                                         
                                         Our customers have been loving this because it drastically simplifies how you create your graphs.
                                         
                                         You're not having to think about the instances and making all of those subsequent choices. You can just specify the maximum capacity of a new graph in terms of
                                         
    
                                         gigabytes of memory. And then the last thing that our customers have been loving is that the
                                         
                                         capacity can be automatically determined when you're importing your data from S3 or you're
                                         
                                         importing your data from Neptune with an overall max capacity so they can
                                         
                                         control their budget. So to kind of recap Neptune Analytics and where we're going,
                                         
                                         Neptune Analytics is a new analytics engine for Amazon Neptune that's incredibly fast. It's about
                                         
                                         100 times faster than our existing solutions for doing graph analytics today. You can receive
                                         
                                         incredibly fast responses to analytics. It's tuned for those memory intensive graph computations,
                                         
                                         and it's built for use cases that are ephemeral.
                                         
    
                                         Spin up a graph, run analytics, turn it off.
                                         
                                         Or use cases that require a lot of highly concurrent low latency queries,
                                         
                                         like augmenting established machine learning pipelines
                                         
                                         with new graph analytic features.
                                         
                                         Or for those building in the greenfield to build more
                                         
                                         generative AI applications, Neptune Analytics is built to support vector similarity search
                                         
                                         and other integrations like with large language models stored in Amazon Bedrock.
                                         
                                         Our customers are using this to make data discoveries and to use both the explicit
                                         
    
                                         modeling of a knowledge graph
                                         
                                         and the implicit search of similarity search from vectors
                                         
                                         to really do some fascinating,
                                         
                                         to build some fascinating new use cases
                                         
                                         when you combine those two together.
                                         
                                         The overall simplicity of where we're going
                                         
                                         with Neptune Analytics to have that single API
                                         
                                         is one of the most loved features so far.
                                         
    
                                         You can load, query, and analyze graphs all from a single API.
                                         
                                         And the simple pricing model is making it a lot easier
                                         
                                         for our customers to make choices and get started.
                                         
                                         That is where we're going with Neptune Analytics.
                                         
                                         It is an incredibly exciting time here.
                                         
                                         And thank you so much for having us to get to talk about it.
                                         
                                         Great. Thanks for the introduction. And I do have a number of questions, actually.
                                         
                                         And to be honest, that all sounds pretty interesting and impactful.
                                         
    
                                         So based on what you said, it sounds like your users are already making good use of it.
                                         
                                         And what I'm trying to figure out here, though, is where does that all stand,
                                         
                                         let's say, in relation to what you already had?

                                         Because, as far as I know, Neptune already supported some kind of analytics capability and, more specifically,

                                         some kind of algorithms that you could run out of the box. And I know that you already had

                                         two engines running under the hood: you have the RDF engine and the property graph engine.
                                         
                                         So is this a new engine on its own,
                                         
                                         or is it some kind of add-on or enhancement
                                         
    
                                         or new features to the existing engines?
                                         
                                         Yeah, great question, George.
                                         
                                         So to answer one of your questions,
                                         
                                         Neptune Analytics complements Neptune by offering in-database algorithms.
                                         
                                         So when you have your data in Neptune, you can connect, you can essentially spin up a graph with Neptune Analytics,
                                         
                                         connect the endpoint to the ARN of your Neptune cluster, and it'll automatically ETL your data from Neptune into Neptune Analytics.
                                         
                                         Neptune Analytics is an in-memory processing engine that offers in-database
                                         
                                         algorithms. So that's a difference and an improvement for Neptune's customers today.
                                         
    
                                         I think it's also a question about the kinds of use cases. I think that when I was talking
                                         
                                         earlier about graphs (still excited about them), you know, we talked about random data access. And, you know, Neptune databases use a pretty traditional database type of architecture,

                                         you know, where we separate compute and storage.

                                         You can store very large graphs, and the instances are used to answer graph queries.
                                         
                                         And so that works really well for interactive OLTP-like graph applications, where over time you have kind of your hot working set of data that's in the instances.
                                         
                                         And you're answering questions over specific parts of the graph because those queries are parameterized with, you know,
                                         
                                         I'm looking for friends of Brad or contacts of George or those kinds of
                                         
                                         things.
                                         
    
                                         Often for the use cases like the ephemeral analytics,
                                         
                                         some of the low latency analytic queries,
                                         
                                         you need to ask questions over the entire graph.
                                         
                                         And so from that perspective, what we found is that we needed to build an in-memory type
                                         
                                         or in-memory optimized architecture to be able to store and partition the data in a way that was optimized
                                         
                                         for questions where you might have to look over all of the graph at the same time to answer them.
                                         
                                         So if you think about your graph algorithms, you know, ranking kinds of operations, clustering
                                         
                                         kinds of operations, things where you really want to find trends across your whole graph or find insights across your whole graph
                                         
    
                                         versus just answering parameterized pieces. And so that's kind of why, from that perspective,
                                         
                                         Neptune Analytics has a little bit of a different architecture because it's built
                                         
                                         to solve a slightly different use case on the graph problems.
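The contrast drawn here, parameterized lookups versus questions over the whole graph, can be illustrated with a small sketch. This is not how Neptune or Neptune Analytics is implemented; it is a plain-Python PageRank by power iteration, with a made-up three-node graph, shown only to make visible that a ranking algorithm reads every edge on every pass while a "friends of Brad" lookup touches one adjacency list.

```python
def neighbors(graph, node):
    """Parameterized lookup: touches only one node's adjacency list."""
    return graph[node]

def pagerank(graph, damping=0.85, iterations=50):
    """Whole-graph ranking: every iteration reads every edge."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for n in nodes:
                    new_rank[n] += damping * rank[node] / len(nodes)
        rank = new_rank
    return rank

# Hypothetical toy graph; every target must also appear as a key.
graph = {
    "brad": ["denise", "george"],
    "denise": ["brad"],
    "george": ["brad"],
}
ranks = pagerank(graph)
```

On a graph with billions of edges, the inner loops of `pagerank` are the part that benefits from a memory-optimized, partitioned layout, whereas `neighbors` only needs the hot working set.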
                                         
                                         Okay, well, that makes sense. Actually, it also sounds a lot like the reasoning I heard from

                                         the people at Neo4j who, as you probably know, previously released

                                         an analytics engine that is based on parallel processing.

                                         And they gave precisely the same reasoning that you just gave me for why they did it and actually how
                                         
                                         they did it as well, I think. So they said they basically introduced some parallelism
                                         
                                         in order to be able to achieve the speedup for cases that have to do with global graphs
                                         
    
                                         and so on. So that makes me wonder, how did you implement your own solution? And perhaps
                                         
                                         if maybe you also did something similar.

                                         So, I mean, I think that, you know, we're both, you know, we both see a broad subset of graph customers and graph use cases.
                                         
                                         So, you know, I'm always excited to see what Neo4j launches.
                                         
                                         I think they've got a great product team and a great engineering team.
                                         
                                         I think, you know, from our perspective, we have, you may or may not be aware, but we have several different, there's a role at Amazon called an Amazon Scholar.
                                         
                                         And Amazon Scholars are people in research or academia who often will spend time working with Amazon and with the service team.
                                         
                                         And we have several different Amazon Scholars as part of the Neptune team. And so one of the things that we leveraged was techniques that are coming
                                         
                                         from high performance computing processing of large scale graphs.
                                         
    
                                         And so that's really where we've taken the inspiration for kind of memory

                                         optimized graph partitioning, and how to write algorithms over those

                                         kinds of in-memory optimized graph partitions.
                                         
                                         I'm not as familiar with the specifics of Neo4j's implementation,
                                         
                                         but in terms of the parallel processing memory optimized pieces,
                                         
                                         those are things that are pretty well understood from the high-performance computing community. Really, the difference is that for HPC researchers,
                                         
                                         they're often solving a very specific graph problem on a very specific graph.
                                         
                                         And as we mentioned, graphs have random data access.
                                         
    
                                         And so as a service that has to solve graph problems for many customers, one of our challenges was we had to generalize the techniques that can work well for high performance computing for general graph processing.
                                         
                                         We don't know what the shape of a customer's data is going to look like, and we don't know what questions that they're going to ask.
                                         
                                         So what we really did with Neptune Analytics
                                         
                                         was we took what we saw was sort of the best
                                         
                                         of high-performance computing for graphs
                                         
                                         and tried to build it into a service
                                         
                                         for general graph processing
                                         
                                         that can give good performance for graph analytics
                                         
    
                                         for cases where you need to look over the whole graph.
                                         
                                         Yeah.
                                         
                                         Another thing that sort of stood out for me was that, well, those three areas,
                                         
                                         let's say, that you highlighted in the presentation, I tend to think of them as
                                         
                                         somewhat orthogonal. So, you know, for a similar workload, what you basically need to do is speed
                                         
                                         up the loading process by a lot. And in my mind, at least, that doesn't necessarily have to do with

                                         the algorithms or parallelism or whatever it is that you do

                                         in the actual query execution

                                         after you have loaded that data. So I think that you
                                         
                                         probably did something which is like, I don't know, Amazon storage
                                         
                                         specific there in order to
                                         
                                         enable that speedup in loading the data set.

                                         So yeah, I think there were a couple of things that

                                         enabled it. One was very much moving to more parallelism in loading and changing the way
                                         
                                         that we partition the data. So that was definitely a key part of it.
                                         
                                         You know, with Neptune databases,
                                         
                                         one of the things that really makes it unique for customers is how easy it is to provide high availability
                                         
    
                                         and read replicas.
                                         
                                         And that was created by leveraging some storage technology
                                         
                                         that was originally built for other databases in AWS.
                                         
                                         And for Neptune Analytics,
                                         
                                         we're leveraging some other AWS technology,
                                         
                                         you know, that uses a log-based storage mechanism. And that's also how we're able to both load data very quickly, but also provide
                                         
                                         durability and strong consistency guarantees for many of those low latency type of applications.
                                         
                                         So we've built a lot of things for graphs, but

                                         like we did with the Neptune database,

                                         we are leveraging some unique innovations that are within AWS as well.
                                         
                                         The other thing that I saw mentioned at some point was that the goal is to have one single API to access all of these features.
                                         
                                         And so OpenCypher was specifically mentioned there.

                                         I'm wondering if using OpenCypher

                                         is actually the only way to use these new analytics features or whether it's also available

                                         using Gremlin and SPARQL.

                                         We will be supporting Gremlin and SPARQL after we become generally

                                         available. The notion about using a single API is more about being able to manage your end-to-end workflow of doing graph analytics from one endpoint.
                                         
    
                                         Being able to do that from one endpoint is a massive simplification from what our customers have been telling us.
                                         
                                         I've been working within the Gremlin community for about a decade, so I'm very much looking forward to bringing Gremlin to Neptune Analytics very shortly.
                                         
                                         One other comment on that, George,
                                         
                                         you know,
                                         
                                         I think you had asked a question earlier about kind of how the data vision
                                         
                                         for graph customers or something along those lines.
                                         
                                         And I think one of the things that I've learned is that customers,
                                         
                                         I would say,
                                         
    
                                         and this sounds a little heretical,
                                         
                                         but I think that most customers don't actually want a graph database.
                                         
                                         And what I mean here is that they want graphs
                                         
                                         and they want to store and query their graphs,
                                         
                                         but they don't want to create instances and clusters
                                         
                                         and have another database management system in their IT infrastructure.
                                         
                                         And so part of what we're really excited about with Neptune Analytics is this idea that, you know,
                                         
                                         the fundamental unit of using this new engine, this new service is a graph.
                                         
    
                                         Like that's what you operate on.
                                         
                                         You create graphs, query graphs, you store graphs,
                                         
                                         you select the multi-AZ availability constraints and policies that you want to associate with it.
                                         
                                         And you don't have to manage clusters. You don't have to set those things up.
                                         
                                         And so, you know, we think that that is going to enable people to use graphs for more problems because it just reduces the kinds of overhead, you know,
                                         
                                         that they have to use before they can get started with it.
                                         
                                         So I think that, you know, on one hand, the analytics and the performance
                                         
                                         and the algorithms and the vectors are super cool and really exciting.
                                         
    
                                         But when I think about really impacting how customers are going to think about
                                         
                                         using graphs and think about using them in many different places,
                                         
                                         we may look back and find that this graph API abstraction here was really probably the most impactful thing.
                                         
                                         We'll see. Time will tell.
                                         
                                         Okay. Well, yeah, the way you talk about it actually does sound important.
                                         
                                         And in fact, I think I may have overlooked it initially.

                                         When you said that the goal is to give people a single API

                                         for accessing graphs, I didn't quite realize that this is what you meant.

                                         I thought it meant that there would be a common API entry point,

                                         which makes total sense. But what you're saying is slightly different.
                                         
                                         It sounds like what you're saying is, well, basically we're doing away with the need
                                         
                                         to spin up nodes and provision instances and everything.
                                         
                                         So here's an API that you can use to create
                                         
                                         and manipulate your graph and that's all you need to do.
                                         
                                         Yeah, so I mean, with Neptune Analytics, you create a graph, you specify some characteristics of it in terms of capacity, minimums and maximums that you want to consume.
                                         
                                         You specify characteristics related to policies for access control and those kinds of things.
                                         
    
                                         And also characteristics in terms of availability,
                                         
                                         particularly whether you want to have single availability zone or multi
                                         
                                         availability zone.
                                         
                                         And you don't have to do other things.
                                         
                                         You don't have to select instances and build other clusters.
                                         
                                         And so I think we're going to learn a lot.
                                         
                                         You know, we may not have it exactly right,
                                         
                                         but we really think that moving away from the database management system abstraction
                                         
    
                                         of consuming graphs and moving more towards a simpler API that gets customers storing and
                                         
                                         querying their graph data faster, it feels like the right thing based on what we've learned from
                                         
                                         customers.

                                         Absolutely. Absolutely. Because it's really, at the end of the day, about helping our customers deliver end value as fast as possible and making sure that we've abstracted that in the right way to help them discover data insights or build up workflows that are going to help their applications, and to do that work as fast as they can with as few choices as possible along the way. Absolutely, Brad.

                                         Okay, so the way you're describing it, it sounds like something like the equivalent of
                                         
                                         Lambda functions for graphs. So you don't have to specify much. It's just you don't have to spin up
                                         
                                         an instance. You don't have to define your API, let's say, speaking about the equivalent in terms of programming in the same way that you
                                         
                                         don't have to provision nodes or anything. You can just spin up a Lambda function and you don't care
                                         
                                         about anything else. You can just spin up your graph and you're done pretty much.
                                         
    
                                         That's definitely the vision. And I'll be honest with you, I don't think we're 100% there at launch,
                                         
                                         but we're far closer to being there than we are with the Neptune databases.
                                         
                                         And so, like I said, we're really excited about it.
                                         
                                         And maybe we should mark that George called it.
                                         
                                         We'll have something as popular as Lambda functions, but for doing graph workloads.
                                         
                                         So thank you for that, George.
                                         
                                         Well, you know, actually, like I said, initially, I sort of missed that point.
                                         
                                         And so you may as well want to highlight it a bit more.
                                         
    
                                         Yeah, we'll think about that. It's a good call out.
                                         
                                         I think it's something that we'll go back and sort of think about.
                                         
                                         It's also kind of a nice theoretical lead-in to say, like, from a database vendor, that they don't think graph databases are the right thing.
                                         
                                         So we'll see. That's probably pretty memorable.
                                         
                                         It is. Indeed it is. So going back to your new features, as I said, I wasn't too surprised to see that you now support embeddings and vectors and all that, because it's a theme these days.

                                         I think pretty much every database vendor, not just graph database vendors,

                                         is either doing this or is in the process of doing it,

                                         because, of course, there is a lot of demand for it.

                                         I think that's probably what you're seeing from your customers as well, and

                                         what made you implement this feature.

                                         This brings up a typical dilemma.

                                         There are many vector databases out there, and you get the classic type of question:

                                         which one is best?

                                         Should you choose a specialized vector database,

                                         which tries to be faster and will give you more choice in terms of algorithms and embedding support? Or can you go with your existing database,

                                         whether graph or otherwise, which will add some vector capabilities and will be

                                         roughly good enough for what you need to do, but is not going to be like best of breed, let's say?
                                         
    
                                         Did you also get that type of question from your customers?
                                         
                                         Yeah, I mean, we absolutely do get the question about purpose-built vector database
                                         
                                         versus vector capabilities in other databases.
                                         
                                         And I think, you know, the way that we're thinking about it is sort of a yes and
                                         
                                         yes kind of answer, which is that absolutely customers will need purpose-built vector
                                         
                                         databases for certain things. But, you know, one of the benefits of putting vectors into your
                                         
                                         existing databases is that it makes it a lot easier and faster for customers to use and they don't have to move data around.
                                         
                                         So I think that what you'll see from our database offerings
                                         
    
                                         is kind of both of those thoughts,
                                         
                                         where we want to meet customers where they are
                                         
                                         by giving them vector search within the databases that they're using.
                                         
                                         And there are some other use cases
                                         
                                         where purpose-built specialized vector performance makes a lot of sense.
                                         
                                         For Neptune Analytics, I think that the thing that we're really most excited about is how
                                         
                                         you combine vectors with graph searches and graph algorithms.
                                         
                                         And we haven't quite figured out the right way to talk about it
                                         
    
                                         or the moniker for it,
                                         
                                         but internally we sort of talk about kind of vector-guided navigation.
                                         
                                         And what this means is that you're using the explicit relationships
                                         
                                         and properties in your graph,
                                         
                                         and you're combining them at certain points
                                         
                                         with the statistical capabilities that you get from vector similarity search.
                                         
                                         And by doing both of those things together, you're able to get a better outcome because you can both leverage the power of the statistical techniques and the explainability of the explicit side.
                                         
                                         That's not to mention that the other use case for vectors in the graph is vectors that are not necessarily coming from LLMs,
                                         
    
                                         but vectors that are coming out of GNNs.

                                         So the other thing that you can do, and I think this is a more advanced use case, but for those customers who need it, it's really important, is you can store the embeddings that are coming out of your GNNs back into Neptune Analytics.

                                         And then you can do cosine similarity type searches. And if you want to do link prediction on your graph, which is super important for many of these marketing and recommendation, targeted content, and fraud types of use cases,

                                         then that's another capability that's also enabled by having vectors in the

                                         graph.
                                         
                                         And so, you know,
                                         
                                         we're really thinking about the vector capabilities of Neptune Analytics as

                                         vector-guided navigation,

                                         graphs and vectors being better together, versus, you know, trying to be like

                                         a Pinecone or a Milvus, if you will, on the vector side.
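The idea of vector-guided navigation described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not Neptune Analytics code: the function names, the toy graph, and the two-dimensional embeddings are all assumptions made up for the example. It first restricts candidates to nodes reachable through explicit edges, then ranks them by cosine similarity, which is also the basic shape of embedding-based link prediction.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nodes_within_hops(adjacency, start, max_hops):
    """Breadth-first search: every node reachable from start in at most max_hops edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(start)
    return seen


def vector_guided_candidates(adjacency, embeddings, start, max_hops=2, top_k=3):
    """Use explicit edges to pick candidates, then rank them by embedding similarity."""
    candidates = nodes_within_hops(adjacency, start, max_hops)
    already_linked = set(adjacency.get(start, ()))
    scored = [
        (node, cosine_similarity(embeddings[start], embeddings[node]))
        for node in candidates
        if node not in already_linked
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: a small undirected graph and made-up embeddings.
adjacency = {
    "alice": ["bob"],
    "bob": ["alice", "carol", "dave"],
    "carol": ["bob"],
    "dave": ["bob"],
    "erin": [],
}
embeddings = {
    "alice": [1.0, 0.0],
    "bob": [0.5, 0.5],
    "carol": [0.9, 0.1],
    "dave": [0.0, 1.0],
    "erin": [1.0, 0.0],
}

ranked = vector_guided_candidates(adjacency, embeddings, "alice")
for node, score in ranked:
    print(f"{node}: {score:.3f}")
```

In this toy data, "erin" has exactly the same embedding as "alice" but is never suggested, because there is no path between them in the graph. A pure vector search would have returned her first; the graph step is what keeps the result explainable in terms of explicit relationships.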
                                         
                                         Yeah, I think that makes sense, and I was also going to ask you about that
                                         
                                         because I know, I recall from the last time we spoke that back then you had just released Neptune ML.

                                         So graph neural networks, basically. And it seems like a natural fit, since you're now also adding vector capabilities, to somehow intermingle those.

                                         Yeah, and you may or may not be aware, but the Deep Graph Library that

                                         was part of Neptune ML has expanded into something called GraphStorm, also an open source project.
                                         
                                         And GraphStorm is really more about the APIs around building and deploying GNNs. And so I think that
                                         
                                         if you look forward into the future, you'll see things from both us and the AWS ML teams that make it a lot easier to store and load embeddings and compute embeddings over graphs between Neptune Analytics and GraphStorm.
                                         
    
                                         No, I wasn't aware of that, so thanks for the pointer. I'll check it out.
                                         
                                         For sure.
                                         
                                         So you already sort of outlined one of the things that you will be working on in the future.
                                         
                                         What else do you have in your roadmap?
                                         
                                         I mean, I think that, you know, so you talked about the query language pieces. You know, I think that one of the things that we want to do is make sure that we can, you know,

                                         provide customers who want to do TinkerPop and SPARQL queries with their analytics and vectors that capability as well.

                                         And then I think, as you'll recall, one of our visions about graphs is that the distinctions between the property graph and the RDF model are, you know, really more distracting to customers than they are helpful.

                                         And, you know, so I think that one of the things that you'll see in the future is some of our One Graph vision starting to be realized within the Neptune Analytics platform.
                                         
    
                                         So, for example, you know,
                                         
                                         if you can imagine leveraging the relatively large amount of publicly
                                         
                                         available RDF data with, you know,
                                         
                                         graph algorithms written over property graphs,
                                         
                                         those kinds of use cases I think that, you know,
                                         
                                         we're really interested in learning from customers how they want to use them.
                                         
                                         And that's part of the reason that we feel like now is the right time for us to release Neptune Analytics.
                                         
                                         We feel like there's a good core capability of use cases that people can do.
                                         
    
                                         But there's also pieces where we're really looking for customer feedback and to learn how customers apply these use cases.
                                         
                                         And just to also echo absolutely to what Brad just mentioned
                                         
                                         about increasing the number of query languages
                                         
                                         and starting to deliver on our One Graph vision.
                                         
                                         Brad also mentioned a few times so far about our vision
                                         
                                         for really simplifying the experience for working with graphs.
                                         
                                         And just to echo it,
                                         
                                         when we ship, we've got one experience, but we're really looking forward to working backwards from
                                         
    
                                         customer requests right afterwards. I think one of the main themes that you're also going to see
                                         
                                         from us is really making it as easy as possible to work with graphs end-to-end in a workflow.
                                         
                                         When we've been talking with our customers, particularly those who are building out new
                                         
                                         Gen AI applications, they've made it very clear to us that being able to easily integrate
                                         
                                         these new features when they build Gen AI applications is one of their most important
                                         
                                         criteria.
                                         
                                         So we're really looking forward to continuing to build out on our vision for abstracting how you use a graph and build a graph workflow in your application,
                                         
                                         because making that as easy as possible is clearly one of the biggest priorities from our customers.
                                         
    
                                         Thanks for sticking around. For more stories like this, check the link in bio and follow Linked Data Orchestration.
                                         
