The Good Tech Companies - Meet New & Improved BigQuery: Single, Unified AI-Ready Data Platform
Episode Date: July 19, 2024
This story was originally published on HackerNoon at: https://hackernoon.com/meet-new-and-improved-bigquery-single-unified-ai-ready-data-platform. Google has gone a step further and unified key Google Cloud data analytics capabilities under BigQuery, now the single, AI-ready data analytics platform. Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data-analytics, #google-bigquery, #bigquery-and-google-cloud, #ai-integration, #big-query-and-gemini, #good-company, #hackernoon-top-story, #real-time-data-analytics, and more. This story was written by: @googlecloud. Learn more about this writer by checking @googlecloud's about page, and for more stories, please visit hackernoon.com.
We've gone a step further and unified key Google Cloud data analytics capabilities under BigQuery, which is now the single, AI-ready data analytics platform. BigQuery incorporates key capabilities from multiple Google Cloud analytics services into a single product experience that offers the simplicity and scale you need to manage structured data in BigQuery tables, unstructured data like images, audio and documents, and streaming workloads, all with the best price-performance.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
Meet new and improved BigQuery, single, unified AI-ready data platform, by Google Cloud.
Using large language models, LLMs, with your business data can give you a competitive advantage,
but to realize this advantage, how you structure, prepare, govern, model, and scale your data
matters.
Tip: New Google Cloud customers from Hacker Noon receive $300 plus an additional $50
in free credits to test, deploy, and explore Google Cloud for 90 days.
Get started using the link here.
80% of data leaders believe that the lines between data and AI are blurring.
Tens of thousands of organizations already choose BigQuery and
its integrated AI capabilities to power their data clouds. But in a data-driven AI era,
organizations need a simple way to manage all of their data workloads.
We've gone a step further and unified key Google Cloud data analytics capabilities under BigQuery,
which is now the single, AI-ready data analytics platform. BigQuery incorporates key capabilities
from multiple Google Cloud analytics services into a single product experience that offers the
simplicity and scale you need to manage structured data in BigQuery tables, unstructured data like
images, audio and documents, and streaming workloads, all with the best price-performance.
BigQuery helps you scale your data and AI
foundation with support for all data types
and open formats. Eliminate the need for upfront sizing and simply bring your data at any
scale with a fully managed, serverless workload management model and universal metastore.
Increase flexibility and agility for data teams to collaborate by bringing multiple languages
and engines (SQL, Spark, Python) to a single copy of data. Support
the end-to-end data to AI lifecycle with built-in high availability, data governance, and enterprise
security features. Simplify analytics with a unified product experience designed for all
data users and AI-powered assistive and collaboration features. With your data in
BigQuery, you can quickly and efficiently bring generative AI to your data and take advantage of LLMs.
BigQuery simplifies multimodal generative
AI for the enterprise by making Gemini models available through BigQuery ML and BigQuery
DataFrames. It helps you unlock value from your unstructured data with its expanded integration
with Vertex AI's document processing and speech-to-text APIs
and its vector capabilities to enable AI-powered search for your business data.
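To make the idea of AI-powered search over vectors concrete, here is a minimal conceptual sketch in plain Python. It is not BigQuery's API: the document names, the three-dimensional embeddings, and the helper functions `cosine_similarity` and `top_k` are all hypothetical, chosen only to show how a query embedding is ranked against stored embeddings by similarity.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, rows, k=2):
    """Return the ids of the k rows whose embeddings are most similar to query."""
    scored = [(cosine_similarity(query, emb), doc_id) for doc_id, emb in rows]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical toy data: (document id, embedding). Real embeddings would come
# from an embedding model and have hundreds of dimensions.
documents = [
    ("invoice_policy", [0.9, 0.1, 0.0]),
    ("travel_policy", [0.1, 0.9, 0.1]),
    ("refund_faq", [0.8, 0.2, 0.1]),
]

query_embedding = [1.0, 0.0, 0.0]
nearest = top_k(query_embedding, documents, k=2)
print(nearest)
```

A managed platform performs the same ranking at scale, typically with an index so that not every row has to be compared against the query.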
The insights from combining your structured and unstructured data can be used to further
fine-tune your LLMs.
Support for all data types and open formats
Customers use BigQuery to manage all data types, structured and unstructured, with fine-grained
access controls and integrated governance. BigLake, BigQuery's unified storage engine, supports
open table formats, which let you use existing open-source and legacy tools to access structured and
unstructured data while benefiting from an integrated data platform. BigLake supports
all major open table formats, including Apache Iceberg, Apache Hudi and now Delta Lake,
natively integrated with BigQuery. It provides a fully managed experience for Iceberg, including
DDL, DML and streaming support. Your data teams need access to a universal definition of data,
whether structured, unstructured or in open formats. To support this, we are launching
BigQuery Metastore, a managed, scalable runtime
metadata service that provides universal table definitions and enforces fine-grained access
control policies for analytics and AI runtimes. Supported runtimes include Google Cloud,
open-source engines (through connectors), and third-party partner engines.
Use multiple languages and serverless engines on a single copy of data
Customers increasingly want to run multiple languages and engines on a single copy of their
data, but the fragmented nature of today's analytics and AI systems makes this challenging.
You can now bring the programmatic power of Python and PySpark right to your data without
having to leave BigQuery. BigQuery DataFrames brings the power of Python together with the
scale and ease of
BigQuery with a minimal learning curve. It implements over 400 common APIs from pandas
and scikit-learn by transparently and optimally converting methods to BigQuery SQL and BigQuery
ML SQL. This breaks the barriers of client-side capabilities, allowing data scientists to explore,
transform and train on terabytes of data with the processing
horsepower of BigQuery. Apache Spark has become a popular data processing runtime,
especially for data engineering tasks. In fact, customers' use of serverless Apache Spark in
Google Cloud increased by over 500% in the past year [1]. BigQuery's newly integrated Spark engine
lets you process data using PySpark as you do with
SQL. Like the rest of BigQuery, the Spark engine is completely serverless, with no need to manage
compute infrastructure. You can even create stored procedures using PySpark and call them from your
SQL-based pipelines.
Make decisions and feed ML models in near real-time
Data teams are also increasingly being asked to deliver real-time analytics and AI solutions, reducing the time between signal, insight, and action.
BigQuery now helps make real-time streaming data processing easy with new support for
continuous queries: unbounded SQL queries that process data the moment it arrives, expressed as
SQL statements. BigQuery continuous queries can amplify downstream SaaS applications,
like Salesforce, with the real-time enterprise knowledge of your data and AI platform.
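The difference between an ordinary query and an unbounded one can be sketched in plain Python. This is a conceptual illustration only, not BigQuery's continuous query syntax: the `event_stream` source, the `continuous_filter` function, and the field names are hypothetical. The point is that each row is evaluated as it arrives, roughly as a `WHERE amount >= 100` filter would be, rather than after the whole input has been collected.

```python
from typing import Iterable, Iterator

def continuous_filter(events: Iterable[dict], min_amount: float) -> Iterator[dict]:
    """Emit a result for each qualifying event as soon as that event arrives."""
    for event in events:
        if event["amount"] >= min_amount:
            # Project only the columns a downstream application needs.
            yield {"order_id": event["order_id"], "amount": event["amount"]}

def event_stream() -> Iterator[dict]:
    """Hypothetical stand-in for a stream; a real stream would never end."""
    yield {"order_id": "a1", "amount": 20.0}
    yield {"order_id": "a2", "amount": 250.0}
    yield {"order_id": "a3", "amount": 980.0}

# Results are produced one at a time; list() is used here only because the
# toy stream is finite.
alerts = list(continuous_filter(event_stream(), min_amount=100.0))
print(alerts)
```

Because the function is a generator, a consumer can act on the first matching row without waiting for later rows, which is the property that shortens the gap between signal and action.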
In addition, to support open-source streaming workloads, we are announcing a preview of Apache
Kafka for BigQuery. Customers can use Apache Kafka to manage streaming data workloads and
feed ML models without the need to worry about version upgrades, rebalancing, monitoring and other operational headaches.
Scale analytics and AI with governance and enterprise features
To make it easier for you to manage, discover, and govern data, last year we brought data governance
capabilities like data quality, lineage and profiling from Dataplex directly into BigQuery.
We will be expanding BigQuery to include Dataplex's enhanced search capabilities, powered by a
unified metadata catalog, to help
data users discover data and AI assets, including models and datasets from Vertex AI.
Column-level lineage tracking in BigQuery is now available in Preview,
which will be followed by a preview for lineage for Vertex AI pipelines.
Governance rules for fine-grained access control are also in preview,
allowing businesses to define governance policies based on metadata.
For customers looking for enhanced redundancy across geographic regions,
we are introducing Managed Disaster Recovery for BigQuery.
This feature, now in preview, offers automated failover of compute and storage and will offer a new cross-regional service-level agreement (SLA) tailored for business-critical workloads.
The Managed Disaster Recovery feature provides standby compute capacity in the secondary region,
included in the price of BigQuery's Enterprise Plus Edition.
A unified experience for all data users
As Google Cloud's single integrated platform for data analytics, BigQuery unifies how data teams work together with BigQuery Studio.
Now generally available, BigQuery Studio gives data teams a collaborative data workspace that
all data practitioners can use to accelerate their data to AI workflows. BigQuery Studio
lets you use SQL, Python, PySpark, and natural language in a single
unified analytics workspace, regardless of the data's scale, format or location.
All development assets in BigQuery Studio are enabled with full lifecycle capabilities,
including team collaboration and version control.
Since BigQuery Studio's launch at Next '23, hundreds of thousands of users are actively
using the new interface [2].
Gemini in BigQuery for AI-assistive and collaborative experiences
We announced several new innovations for Gemini in BigQuery that help data teams with AI-powered
experiences for data preparation, analysis and engineering as well as intelligent recommendations
to enhance user productivity and optimize costs. BigQuery Data Canvas, an AI-centric experience with natural language input,
makes data discovery, exploration, and analysis faster and more intuitive.
AI-augmented data preparation in BigQuery helps users to cleanse and wrangle their data and build
low-code visual data pipelines or rebuild legacy pipelines. Gemini in BigQuery also helps you write and
edit SQL or Python code using simple natural language prompts, referencing relevant schemas
and metadata.
How Deutsche Telekom is innovating with the BigQuery platform
"Deutsche Telekom built a horizontally scalable data platform in an innovative
way that was designed to meet our current and future business needs. With BigQuery at the center of our enterprise's One Data Ecosystem, we created a
unified approach to maintain a single source of truth while fostering
decentralized usage of data across all of our data teams. With BigQuery and Vertex AI, we built a
governed and scalable space for data scientists to experiment and productionize AI models while maintaining data sovereignty and federated access controls.
This has allowed us to quickly deploy practical usage of LLMs to turbocharge our
data engineering lifecycle and unleash new business opportunities."
Ashutosh Mishra, VP of Data Architecture, Deutsche Telekom
Start building your AI-ready data platform
To learn more and start building your AI-ready data platform, start exploring the next
generation of BigQuery today. Read more about the latest innovations for Gemini in BigQuery and an
overview of what's next for data analytics at Google Cloud.
1. Google internal data: YoY growth of data processed using Apache Spark on Google Cloud compared with Feb '23.
2. Since the August 2023 announcement of BigQuery Studio,
monthly active users have continued to grow.
Originally published here.
Contributed by Google Cloud
Thank you for listening to this HackerNoon story,
read by Artificial Intelligence.
Visit HackerNoon.com to read, write, learn and publish.