The Good Tech Companies - Developing Next-Gen Data Solutions: SingleStore, MinIO, and the Modern Datalake Stack

Episode Date: June 5, 2024

This story was originally published on HackerNoon at: https://hackernoon.com/developing-next-gen-data-solutions-singlestore-minio-and-the-modern-datalake-stack. The integration of SingleStore, a cloud-native database known for its speed and versatility, with MinIO forms an important brick in the modern datalake stack. Check more stories related to cloud at: https://hackernoon.com/c/cloud. You can also check exclusive content about #minio, #minio-blog, #data-solutions, #singlestore, #modern-datalake, #cloud-native, #database, #good-company, and more. This story was written by: @minio. Learn more about this writer by checking @minio's about page, and for more stories, please visit hackernoon.com. SingleStore is a cloud-native database designed for data-intensive workloads. It compiles SQL queries into machine code and can be deployed in various environments, including on-premises installations, public/private clouds, and containers via the Kubernetes operator.

Transcript
Starting point is 00:00:00 This audio is presented by Hacker Noon, where anyone can learn anything about any technology. Developing Next-Gen Data Solutions: SingleStore, MinIO, and the Modern Datalake Stack, by MinIO. SingleStore is a cloud-native database designed for data-intensive workloads. It is a distributed, relational SQL database management system that supports ANSI SQL and is recognized for its speed in data ingest, transaction processing, and query processing. SingleStore can store relational, JSON, graph, and time-series data, catering to blended (HTAP) workloads spanning both OLTP and OLAP use cases. It compiles SQL queries into machine code and can be deployed in various environments, including on-premises installations, public/private clouds, and containers via the Kubernetes operator.
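To make the blended OLTP/OLAP idea above concrete, here is a minimal SQL sketch of a single table serving both kinds of query. The database, table, and column names are hypothetical illustrations (not from the article), and running it assumes access to a SingleStore cluster:

```sql
-- Hypothetical example: one table handling OLTP-style point operations
-- and OLAP-style aggregations, with a JSON column for semi-structured data.
CREATE DATABASE IF NOT EXISTS demo;
USE demo;

CREATE TABLE events (
    id BIGINT AUTO_INCREMENT,
    user_id BIGINT,
    payload JSON,              -- semi-structured event data
    created_at DATETIME(6),
    PRIMARY KEY (id),
    SORT KEY (created_at)      -- columnstore sort key for analytical scans
);

-- OLTP-style: single-row insert and point lookup
INSERT INTO events (user_id, payload, created_at)
    VALUES (42, '{"action": "login"}', NOW(6));
SELECT payload FROM events WHERE user_id = 42;

-- OLAP-style: aggregation over the same table
SELECT user_id, COUNT(*) AS actions
FROM events
GROUP BY user_id
ORDER BY actions DESC;
```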
Starting point is 00:00:49 Modern Datalake architecture. In the modern datalake architecture, SingleStore fits squarely into the processing layer. This layer is where processing engines for transformations, serving up data to other tools, data exploration, and other use cases live. Processing-layer tools like SingleStore work well with others; often, multiple processing-layer tools sip from the same data lake. Usually, this design is implemented in the case of tool specialization. For example, super-fast, in-memory data processing platforms like SingleStore, with hybrid vector and full-text search, are optimized for AI workloads, particularly for generative AI use cases. Prerequisites. To complete this tutorial,
Starting point is 00:01:31 you'll need to get set up with some software. Here's a breakdown of what you'll need. Docker Engine: this powerful tool allows you to package and run applications in standardized software units called containers. Docker Compose: this acts as an orchestrator, simplifying the management of multi-container applications. It helps define and run complex applications with ease. Installation. If you're starting fresh, the Docker Desktop installer provides a convenient one-stop solution for installing both Docker and Docker Compose on your specific platform (Windows, macOS, or Linux). This often proves to be easier than downloading and installing them individually.
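The repo's Compose file is not reproduced in this transcript. As a rough sketch of the three-service layout it describes — a SingleStore database, a MinIO instance, and a setup container that depends on MinIO — it might look something like the following; the image tags, credentials, ports, and bucket name are assumptions for illustration, not taken from the repo:

```yaml
# Hypothetical docker-compose sketch of the stack described in the tutorial.
services:
  singlestore:
    image: singlestore/cluster-in-a-box   # assumed image
    ports:
      - "8080:8080"   # SingleStore UI
      - "3306:3306"   # SQL endpoint
    environment:
      LICENSE_KEY: "<your-license-key>"
      ROOT_PASSWORD: "<your-root-password>"

  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    ports:
      - "9000:9000"   # S3-compatible API
      - "9001:9001"   # MinIO web console

  mc:
    image: minio/mc
    depends_on:
      - minio
    # Waits for MinIO, registers it as a host, creates a bucket,
    # and makes it public, then exits (mirroring the steps narrated above).
    entrypoint: >
      /bin/sh -c "
      until mc alias set local http://minio:9000 minioadmin minioadmin; do sleep 1; done &&
      mc mb local/books &&
      mc anonymous set public local/books"
```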
Starting point is 00:02:11 Once you've installed Docker Desktop or the combination of Docker and Docker Compose, you can verify their presence by running the following commands in your terminal (for example, docker --version and docker compose version). You'll also need a SingleStore license, which you can get here. Keep note of both your license key and your root password. A random root password will be assigned to your account, but you'll be able to change it using the SingleStore UI. Getting started. This tutorial depends on this repository. Clone the repo into a location of your choice. The most important file in this repo is the one which describes a Docker environment with a SingleStore database,
Starting point is 00:02:45 a MinIO instance, and an mc container which depends on the MinIO service. The mc container contains a script that first waits until MinIO is accessible, adds MinIO as a host, creates the bucket, uploads a file containing book data, sets the bucket policy to public, and then exits. Using a document editor, replace the placeholders with your license key and root password. In a terminal window, navigate to where you cloned the repo and run the following command to start up all the containers. Open up a browser window, navigate to http://localhost:8080/ and log in with username "root" and your root password. Check MinIO. Navigate to
Starting point is 00:03:26 http://127.0.0.1:9001 to launch the MinIO web UI. Log in with the username and password. You'll see that the mc container has made a bucket and that there is one object in the bucket. Explore with SQL in SingleStore. Navigate to the SQL editor and run the following commands. This SQL script initiates a sequence of actions to handle data related to classic books. It starts by establishing a new database. Within this database, a table is created, designed to hold details such as title, author, and publication date. Following this, a pipeline is set up to extract data from an S3 bucket and load it into the table. Configuration parameters for this pipeline, including region,
Starting point is 00:04:11 endpoint URL, and authentication credentials, are defined. Subsequently, the MinIO pipeline is activated to commence the process of data retrieval and population. Once the data is successfully loaded into the table, a select query retrieves and displays all records stored in it. Following the completion of data extraction and viewing, the pipeline is halted and removed, the table is dropped from the database, and the database itself is removed, ensuring a clean slate and concluding the data management operations. This script should get you started playing around with data in MinIO and SingleStore. Build on this stack. This tutorial swiftly sets up a robust data stack that allows for experimentation with storing, processing, and querying data in object storage. The integration
Starting point is 00:04:56 of SingleStore, a cloud-native database known for its speed and versatility, with MinIO forms an important brick in the modern datalake stack. As the industry trend leans towards the disaggregation of storage and compute, this setup empowers developers to explore innovative data management strategies. Whether you're interested in building data-intensive applications, implementing advanced analytics, or experimenting with AI workloads, this tutorial serves as a launching pad. We invite you to build upon this data stack, experiment with different datasets and configurations, and unleash the full potential of your data-driven applications. For any questions or ideas, feel free to reach out
Starting point is 00:05:35 to us at hello@min.io or join our Slack channel. Thank you for listening to this HackerNoon story, read by Artificial Intelligence. Visit HackerNoon.com to read, write, learn, and publish.
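The SQL sequence narrated in the transcript — create a database and table, create and start a pipeline from the bucket, query the results, then tear everything down — might be sketched as follows. All names, endpoints, and credentials here are hypothetical stand-ins for the ones elided in the audio, and running it assumes the SingleStore and MinIO containers from the tutorial are up:

```sql
CREATE DATABASE IF NOT EXISTS books_db;   -- hypothetical database name
USE books_db;

CREATE TABLE books (                      -- hypothetical table name
    title VARCHAR(255),
    author VARCHAR(255),
    publication_date VARCHAR(32)
);

-- Pipeline pulling from the MinIO bucket via the S3-compatible API;
-- bucket name, endpoint, and credentials are assumed values.
CREATE PIPELINE books_pipeline AS
    LOAD DATA S3 'books'
    CONFIG '{"region": "us-east-1", "endpoint_url": "http://minio:9000"}'
    CREDENTIALS '{"aws_access_key_id": "minioadmin",
                  "aws_secret_access_key": "minioadmin"}'
    INTO TABLE books
    FIELDS TERMINATED BY ',';

START PIPELINE books_pipeline;

-- View the loaded records
SELECT * FROM books;

-- Tear down for a clean slate
STOP PIPELINE books_pipeline;
DROP PIPELINE books_pipeline;
DROP TABLE books;
DROP DATABASE books_db;
```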
