The Good Tech Companies - Developing Next-Gen Data Solutions: SingleStore, MinIO, and the Modern Datalake Stack
Episode Date: June 5, 2024. This story was originally published on HackerNoon at: https://hackernoon.com/developing-next-gen-data-solutions-singlestore-minio-and-the-modern-datalake-stack. The integration of SingleStore, a cloud-native database known for its speed and versatility, with MinIO forms an important brick in the modern datalake stack. This story was written by: @minio. SingleStore is a cloud-native database designed for data-intensive workloads. It compiles SQL queries into machine code and can be deployed in various environments, including on-premises installations, public/private clouds, and containers via the Kubernetes operator.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
Developing Next-Gen Data Solutions: SingleStore, MinIO, and the Modern Datalake Stack, by MinIO.
SingleStore is a cloud-native database designed for data-intensive workloads.
It is a distributed, relational SQL database management system that supports ANSI SQL and is recognized for its speed in data ingest, transaction processing, and query processing. SingleStore can store relational, JSON, graph, and time-series data, catering to blended (HTAP) workloads spanning both OLTP and OLAP use cases. It compiles SQL queries into machine code and can be deployed in various environments, including on-premises installations, public/private clouds, and containers via the Kubernetes operator.
Modern Datalake Architecture
In the modern datalake architecture, SingleStore fits squarely into the processing layer. This layer is where processing engines live: engines for transformations, for serving up data to other tools, for data exploration, and for other use cases. Processing-layer tools like SingleStore work well with others; often, multiple processing-layer tools sip from the same data lake. Usually, this design is implemented in the case of tool specialization. For example, super-fast, in-memory data processing platforms like SingleStore, with hybrid vector and full-text search, are optimized for AI workloads, particularly for generative AI use cases.
Prerequisites
To complete this tutorial,
you'll need to get set up with some software. Here's a breakdown of what you'll need. Docker Engine: this powerful tool allows you to package and run applications in standardized software units called containers. Docker Compose: this acts as an orchestrator, simplifying the management of multi-container applications; it helps define and run complex applications with ease. Installation: if you're starting fresh, the Docker Desktop installer provides a convenient one-stop solution for installing both Docker and Docker Compose on your specific platform (Windows, macOS, or Linux). This often proves to be easier than downloading and installing them individually.
Once you've installed Docker Desktop or the combination of Docker and Docker Compose,
you can verify their presence by running the following command in your terminal.
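The command itself didn't survive transcription; a typical verification, assuming Docker Desktop or a Compose v2 plugin install, is:

```shell
# Print versions to confirm both tools are installed and on your PATH.
docker --version
docker compose version
```

If the second command fails on an older install, Compose v1 exposes the same check as docker-compose --version.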
You'll also need a SingleStore license, which you can get here. Keep note of both your license key and your root password. A random root password will be assigned to your account, but you'll be able to change your root password using the SingleStore UI.
Getting Started
This tutorial depends on this repository.
Clone the repo into a location of your choice. The most important file in this repo is the Docker Compose file, which describes a Docker environment with a SingleStore database, a MinIO instance, and an mc container that depends on the MinIO service. The mc container runs a script that first waits until MinIO is accessible, adds MinIO as a host, creates the bucket, uploads a file containing book data, sets the bucket policy to public, and then exits.
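As a sketch, such an entrypoint script might look like the following. Every name here (the local alias, the classic-books bucket, the books.txt file, and the minioadmin credentials) is a hypothetical placeholder, not a value from the repo; recent MinIO Client releases use mc anonymous set for bucket policy, while older releases use mc policy set:

```shell
#!/bin/sh
# Retry until the MinIO server answers and the alias is registered.
until mc alias set local http://minio:9000 minioadmin minioadmin; do
  sleep 1
done
# Create the bucket, upload the book data, and make the bucket publicly readable.
mc mb local/classic-books
mc cp /tmp/books.txt local/classic-books/
mc anonymous set public local/classic-books
```

The retry loop matters because Compose starts the containers concurrently: mc usually comes up before the MinIO API is ready to accept connections.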
Using a document editor, replace the placeholders with your license key and root password. In a terminal window, navigate to where you cloned the repo and run the following command to start up all the containers. Open a browser window, navigate to http://localhost:8080/ and log in with the username `root` and your root password.
Check MinIO
Navigate to http://127.0.0.1:9001 to launch the MinIO web UI. Log in with the MinIO username and password. You'll see that the mc container has made a bucket and that there is one object in it.
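For reference, the startup and login steps above typically boil down to a single Docker Compose invocation; the ports below assume the defaults used in this walkthrough:

```shell
# Build (if needed) and start every service in the background.
docker compose up -d
# SingleStore Studio: http://localhost:8080/   (user: root)
# MinIO Console:      http://127.0.0.1:9001/
```

Running with -d detaches the containers; drop the flag if you want to watch the startup logs, and use docker compose down when you're finished.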
Explore with SQL in SingleStore
Navigate to the SQL editor and run the following commands.
This SQL script initiates a sequence of actions to handle data related to classic books. It starts by establishing a new database. Within this database, a table is created, designed to hold details such as title, author, and publication date. Following this, a pipeline is set up to extract data from an S3-compatible bucket and load it into the table. Configuration parameters for this pipeline, including region, endpoint URL, and authentication credentials, are defined. Subsequently, the pipeline is activated to commence the process of data retrieval and population. Once the data is successfully loaded, a select query retrieves and displays all records stored in the table. Following the completion of data extraction and viewing, the pipeline is stopped and removed, the table is dropped from the database, and the database itself is removed, ensuring a clean slate and concluding the data management operations. This script should get you started playing around with data in MinIO and SingleStore.
Build on This Stack
This tutorial swiftly sets up a robust data stack that allows
for experimentation with storing, processing, and querying data in object storage. The integration
of SingleStore, a cloud-native database known for its speed and versatility, with MinIO forms an important brick in the modern datalake stack. As the industry trend leans toward the disaggregation of storage and compute,
this setup empowers developers to explore innovative data management strategies.
Whether you're interested in building data-intensive applications,
implementing advanced analytics, or experimenting with AI workloads,
this tutorial serves as a launching pad. We invite you to build upon this data stack,
experiment with different datasets and configurations, and unleash the full potential of your data-driven applications. For any questions or ideas, feel free to reach out to us at hello@min.io or join our Slack channel. Thank you for listening to this HackerNoon story,
read by Artificial Intelligence. Visit HackerNoon.com to read, write, learn and publish.
