Explore New BigQuery Capabilities for 2025

Vadym Kramarenko

Growth Marketing Manager

September 11, 2024

BigQuery keeps improving, offering new features that help data analysts work with their data more efficiently. These updates are handy for managing large datasets with tools that make data integration, query optimization, and reporting smoother.

Whether combining data from different sources or refining complex queries, BigQuery’s new features are built to make your job easier and more effective.

The 2024 updates to BigQuery introduce a host of powerful features that can optimize their workflows, reduce manual effort, and focus more on deriving actionable insights from their data.

Overview of BigQuery's Latest Enhancements and Updates

In 2024, BigQuery introduced new features that significantly improve data management and analysis.

These updates are focused on four key areas:

Lakehouse Foundation: Strengthening data integration and management across hybrid environments.
Data + AI: Advancing AI integration and machine learning tools for more intelligent data analysis.
Enterprise Class: Enhancing scalability, security, and enterprise-level functionality.
Migrations & Data Pipelines: Simplifying data migrations and streamlining pipeline operations.

These updates are set to enhance efficiency and drive innovation, elevating your data strategy to the next level. Below, we break down these categories and the availability status of each feature.

Exploring New Lakehouse Foundation Capabilities

BigQuery's Lakehouse Foundation represents a significant step in unifying data management and analytics. It merges the strengths of data lakes and warehouses, allowing seamless data integration, storage, and analysis.

New and Generally Available Lakehouse Foundation Features

The 2024 update rolls out several new features under the "generally available" option of the Lakehouse Foundation. These enhancements include Materialized Views, improved JSON functions, and optimized query performance, all designed to make data management smoother and more efficient.

Materialized Views for BigLake Metadata Cache-enabled Tables

Materialized Views for BigLake metadata cache-enabled tables can reference structured data stored in Cloud Storage and Amazon S3. These views bring significant benefits, including automatic refresh, smart tuning, and the capability to pre-aggregate, pre-filter, and pre-join data stored outside BigQuery.

Enhanced JSON Functions for Improved Data Handling

The Enhanced JSON Functions in BigQuery now provide more powerful tools for parsing, querying, and manipulating JSON data. These enhancements simplify extracting specific JSON elements, managing nested structures, and converting JSON data into tabular formats, making working with complex data types within BigQuery easier.

Manifest Files for External Table Integration

BigQuery’s Manifest Files for External Table Integration enable seamless querying of data stored in open formats like Parquet and ORC. By using manifest files, you can define and manage data locations outside BigQuery, allowing efficient integration and querying of external datasets without importing them into BigQuery, simplifying data management.

Query Execution Graph for Optimized Query Planning

BigQuery's Query Execution Graph is a visual tool that breaks down query execution plans. Illustrating the query in stages and showing how data flows makes it easier to spot bottlenecks and inefficiencies. This allows for targeted improvements, ensuring better resource use across complex datasets.

External Query SQL Pushdown for Enhanced Performance

BigQuery’s External Query SQL Pushdown boosts performance by delegating specific operations, like filtering and column selection, directly to external data sources like Cloud SQL or Cloud Spanner. By processing only the necessary data at the source, this approach reduces data transfer back to BigQuery, reducing execution time and costs.

Recursive CTEs for Complex Query Construction

BigQuery’s Recursive Common Table Expressions (CTEs) simplify the construction of complex queries by allowing repeated execution of a CTE until a specific condition is met. This feature is handy for tasks like hierarchical data retrieval, graph traversal, and iterative calculations, making it easier to handle complex data relationships within a single query.

Cross-user Result Cache for Faster Query Responses

In BigQuery's Enterprise and Enterprise Plus editions, the Cross-user Result Cache speeds up query responses by allowing users with the right permissions to access cached results from other users within the same project. The cached result is stored in the querying user’s anonymous dataset for 24 hours.

High Cardinality Join Insights for Better Data Relationships

BigQuery's High Cardinality Join Insights feature helps you optimize queries that involve large, complex datasets. Identifying and analyzing high-cardinality joins provides detailed insights into join patterns that might slow down performance. This allows users to refine queries and improve data relationships.

Subscriber Usage Metrics for Analytics Hub Optimization

BigQuery’s Subscriber Usage Metrics offer deep insights into how subscribers use shared datasets within the Analytics Hub. These metrics allow data providers to track access patterns, query frequency, and data usage trends, helping them optimize their offerings, allocate resources more effectively, and improve overall performance.

Primary and Foreign Key Table Constraints for Data Integrity

BigQuery's Primary and Foreign Key Table Constraints are essential for maintaining data integrity by enforcing relationships between tables. Primary keys ensure each row is uniquely identified, while foreign keys link related tables. These constraints help keep data consistent and accurate across large datasets, supporting reliable database operations and preventing errors.

Object Tables for Structured Data Management

BigQuery’s Object Tables allow for better-structured data management by enabling the analysis of unstructured data stored in Cloud Storage. These read-only tables can support inference with BigQuery ML and allow you to join results with structured data. They also use access delegation, ensuring secure and controlled access to the underlying Cloud Storage objects.

Preferred Tables for BI Engine Performance

The Preferred Tables feature in BigQuery’s BI Engine enhances performance for visual reporting tools like Looker by prioritizing certain tables for in-memory analysis. This makes query execution faster and dashboards more responsive. By designating preferred tables, you can streamline data processing and deliver real-time insights more effectively in BI workflows.

Case-insensitive Collation for Consistent Data Matching

BigQuery's Case-insensitive Collation feature ensures consistent data matching by ignoring case differences in comparisons, sorting, and other operations that support collation. This feature includes the COLLATE function and options for applying collation to specific columns or entire tables, improving data accuracy across queries.

Default Values on Columns for Simplified Data Entry

BigQuery’s Default Values on Columns feature simplifies data entry by automatically assigning predefined values to columns when no explicit value is provided during data insertion. This reduces the chances of manual errors and streamlines the data-loading process by automating the assignment of common or expected values.

Delta Lake BigLake Tables

Delta Lake is now supported in BigLake, enabling you to create petabyte-scale data tables with enhanced reliability, schema enforcement, and ACID transactions. This integration provides an optimized solution for handling large-scale data lakes and ensures efficient querying, even for massive datasets.

RANGE Data Type JSON Encoding

JSON encoding for the RANGE data type is now supported in BigQuery, allowing structured and efficient representation of RANGE values in JSON format. This feature improves interoperability between BigQuery and systems or applications that utilize JSON, making data exchange and handling more seamless.

Manifest File Support for Amazon S3 and Azure Blob Storage

BigQuery now supports manifest files for external tables stored in Amazon S3 and Azure Blob Storage. This addition allows you to define how data is loaded from these storage sources, enhancing flexibility and control over data ingestion while streamlining workflows for managing external data sources.

Analytics Hub Features

New features have been added to the Analytics Hub, including public discoverability, a Featured section, public URLs for data listings, and Pub/Sub sharing in preview. These enhancements simplify the process of discovering, sharing, and managing data across organizations, fostering greater collaboration and data exchange.

Collation Support

Additional collation support has been introduced, particularly for the NULLIF conditional expression, which now respects collation settings when comparing values. This update extends support for structured data types like STRUCT, enabling more accurate comparisons when performing operations that involve different collation settings. Now generally available (GA).

Analytics Hub Data Egress

Analytics Hub Data Egress is now generally available in the Analytics Hub, allowing data publishers to enforce restrictions on copying or exporting shared data. This feature enhances data security and governance by giving publishers control over how their data is consumed and managed by subscribers.

AWS Glue Federated Datasets Creation in Google Cloud Console

You can now create AWS Glue federated datasets directly through the Google Cloud Console, simplifying the process of connecting BigQuery to AWS Glue databases. This integration enhances cross-cloud data collaboration by allowing users to manage datasets across multiple platforms with ease.

Analytics Hub Subscription Management

Analytics Hub Subscription Management is now generally available (GA) in the Analytics Hub, allowing data publishers to manage their subscribers, monitor subscription details, and revoke access to shared datasets as needed. This feature provides greater control over data distribution and subscriber engagement.

Analytics Hub Provider Usage Metrics

Analytics Hub Provider Usage Metrics for Analytics Hub is now generally available (GA), giving data publishers the ability to track how their shared data is being used. Metrics include job details, rows processed, and organizational consumption, offering valuable insights into data utilization and impact.

AWS Glue Federated Dataset Connection

AWS Glue federated datasets can now be connected at the dataset level within BigQuery. This feature enables seamless data integration and querying across cloud platforms, allowing users to work with AWS Glue databases without needing to duplicate or move data between environments

Quantified LIKE Operator

The quantified LIKE operator allows pattern matching against a list or array of patterns using LIKE ANY, LIKE SOME, and LIKE ALL conditions. This feature enhances query flexibility by enabling more complex string matching operations, making it easier to filter and process data based on patterns.

BigQuery Studio General Availability

BigQuery Studio is now generally available, allowing users to save, share, and manage versions of code assets such as notebooks and queries. This feature enhances collaboration and project management by centralizing code resources, improving accessibility and version control for teams working on data projects.

INFORMATION_SCHEMA.WRITE_API_TIMELINE Views for BigQuery Storage Write API Statistics

The INFORMATION_SCHEMA.WRITE_API_TIMELINE views provide per-minute aggregated statistics on BigQuery Storage Write API ingestion. This allows users to monitor and troubleshoot data ingestion processes more effectively, identifying performance bottlenecks and optimizing API usage in real-time.

BQ.REFRESH_EXTERNAL_METADATA_CACHE Procedure for Selective Metadata Refresh of BigLake Tables

You can now selectively refresh the metadata cache for BigLake tables using the BQ.REFRESH_EXTERNAL_METADATA_CACHE procedure. This capability optimizes query performance by ensuring the most up-to-date metadata is used, reducing query latency and improving overall efficiency when working with BigLake tables.

BigQuery Cross-Cloud Features for Materialized Views and Cross-Cloud Joins

BigQuery cross-cloud features now support materialized views over Amazon S3 BigLake tables and cross-cloud joins, enabling efficient querying and data management across cloud environments. This feature fosters greater interoperability, allowing businesses to leverage data from multiple clouds seamlessly.

High Cardinality Joins

BigQuery now provides query performance insights for high cardinality joins, a feature that is generally available (GA). This addition helps identify and resolve performance issues when joining large datasets with high cardinality, improving query optimization and execution efficiency.

Materialized Views

New features are now generally available (GA) in queries and materialized views, including HAVING MAX and HAVING MIN clauses for the ANY_VALUE function, as well as MAX_BY and MIN_BY functions. These enhancements improve query flexibility and allow for more complex data aggregation and retrieval operations.

BigQuery JSON Functions

BigQuery has introduced new JSON functions like JSON_ARRAY, JSON_OBJECT, and JSON_REMOVE, which provide enhanced capabilities for working with JSON data. These functions simplify the process of manipulating and querying JSON objects, making it easier to handle semi-structured data within BigQuery.

Manifest Files for External Tables

BigQuery now supports using manifest files for external tables, providing more control over how data is imported from external storage systems. This feature streamlines data ingestion by allowing users to define specific file lists for loading, improving data management and query efficiency.

Analytics Hub Shared Datasets Metrics Tracking

BigQuery now allows users to track the usage metrics of shared datasets within Analytics Hub, including job details, project or organization consumption, and rows or bytes processed. This feature helps data providers monitor and analyze how their shared data is being utilized across subscribers.

GRANT/REVOKE Access to Materialized Views

BigQuery now supports GRANT and REVOKE access to materialized views with SQL statements. This feature simplifies the management of data access permissions, allowing administrators to control who can view and query materialized views directly through SQL commands.

Primary and Foreign Key Constraints

BigQuery now supports managing primary and foreign key constraints through the BigQuery API, as well as viewing them in the console. This feature enforces data integrity rules within tables, making it easier to maintain relationships between datasets and ensure data consistency.

Metadata Caching for BigLake Tables

Metadata caching is now available for BigLake tables that reference Amazon S3 data, currently in preview. Cached metadata can improve query performance by reducing the time spent retrieving metadata from external storage systems, allowing for faster query execution.

Metadata Caching for BigLake Tables

Enhancing query performance for BigLake tables and object tables is now possible with the use of cached metadata, reducing query latency and retrieval times by storing frequently accessed metadata, leading to more efficient and faster query execution.

Apache Iceberg Tables

BigQuery now supports querying Apache Iceberg tables created by open-source engines. This feature, now generally available (GA), enhances interoperability with Iceberg, a popular open table format for handling large-scale, distributed data, providing users with more flexibility in their data management workflows.

BigLake Metastore

BigLake Metastore is now generally available (GA), providing a centralized service to manage metadata for Iceberg tables across multiple data sources. This feature simplifies the management and querying of Iceberg tables, allowing for more efficient use of BigLake resources.

BigQuery Query Execution Graph

The query execution graph in BigQuery helps diagnose query performance issues by providing visual insights into query stages, resource usage, and optimization opportunities. This feature aids users in troubleshooting performance bottlenecks and improving overall query execution efficiency.

BigQuery Object Tables

Object tables in BigQuery allow users to analyze and perform inference on unstructured data stored in Cloud Storage. This feature expands BigQuery’s capabilities by enabling advanced analytics and machine learning tasks directly on object data, such as images, audio, and video files.

Custom Dual-Regions for Cloud Storage

BigQuery now supports custom dual-regions for Cloud Storage in BigLake and non-BigLake external tables, providing enhanced data durability and availability. This feature allows users to store data across two distinct geographic regions, ensuring business continuity and compliance with data residency requirements.

Case-Insensitive Collation

Case-insensitive collation support is now generally available (GA), providing enhanced query capabilities with features like MIN, MAX, COUNT with DISTINCT, and PERCENTILE_DISC window functions. This update improves the consistency and accuracy of string comparison operations across views, materialized views, and table functions.

Case-Insensitive Dataset and Tables

BigQuery now allows users to create case-insensitive datasets and tables, providing more flexibility when managing and querying data. This feature, now generally available (GA), reduces potential errors caused by case sensitivity and simplifies data handling for users who work with mixed-case data.

BigQuery Recursive CTEs

BigQuery now supports the WITH RECURSIVE clause, allowing users to create recursive common table expressions (CTEs) in their queries. This feature is useful for querying hierarchical data, such as organizational charts or file directories, enabling more advanced data modeling and querying patterns.

Upcoming Lakehouse Foundation Innovations

The upcoming updates to Lakehouse Foundation in BigQuery are set to bring better tools, improve data integration, and boost performance, making data management smarter and more efficient.

BigQuery Studio for Integrated Development and Management

BigQuery Studio is a unified, collaborative workspace to accelerate data-to-AI workflows by integrating SQL, Python, and Spark within a single platform. It enhances productivity through seamless data preparation, analysis, and machine learning while offering robust security, governance, and AI-powered code assistance.

Partition Skew Insights for Optimized Query Performance

Currently, in Preview, BigQuery's Partition Skew Insights tool helps identify uneven data distribution in JOIN operations, which can lead to performance bottlenecks. This feature highlights oversized partitions that are resistant to further splitting, guiding users to rebalance data distribution.

Stored Procedures for Apache Spark Integration

BigQuery’s new Stored Procedures for Apache Spark integration allow complex data processing tasks to be executed within Spark directly from BigQuery using SQL. This feature improves how Spark and BigQuery work together, making workflows simpler and increasing the efficiency and scalability of data processing across platforms.

AWS Glue Federated Dataset for Enhanced Data Connectivity

The AWS Glue Federated Dataset feature in BigQuery enhances data connectivity by enabling queries on data stored in AWS without needing to move or copy it. This seamless integration supports comprehensive analytics across platforms while maintaining the original data in its AWS environment, providing users with a broader reach for their data analysis.

Metadata Caching for Amazon S3 Tables

BigQuery’s Metadata Caching for object tables and BigLake tables, including those using Amazon S3, significantly improves query performance by storing metadata like file names and partition details. This reduces query latency by avoiding the need to repeatedly retrieve metadata from external sources, which is especially beneficial for queries involving numerous files or Hive partition filters, ensuring faster data processing.

Data Clean Rooms for Secure Data Collaboration

BigQuery's Data Clean Rooms offer a secure environment for data collaboration, allowing multiple parties to analyze and share insights without exposing sensitive data. Built on the Analytics Hub platform, these clean rooms ensure privacy and compliance, enabling secure data exchanges while maintaining strict data governance standards.

Quantitative LIKE Operator for Precise Query Filtering

BigQuery’s Quantitative LIKE Operator enhances query precision by allowing pattern matching with numeric values within JSON data. This operator enables more accurate filtering based on numeric conditions, providing query flexibility and specificity, mainly when working with complex, nested JSON structures.

Expanded Query Support for Search Indexes

The Expanded Query Support for Search Indexes in BigQuery improves data retrieval capabilities by allowing more complex queries on indexed data, including pattern matching and range queries. This enhancement boosts query performance, making it easier to search and analyze large datasets, especially in text-heavy or unstructured data environments.

SQL Grouping Sets (Cube, Rollup) for Advanced Data Aggregation

BigQuery's SQL Grouping Sets, including Cube and Rollup functions, offer advanced data aggregation capabilities. These functions enable users to generate multiple grouping combinations within a single query, allowing for more comprehensive analysis and reporting.

Cube aggregates across all possible dimensions, while Rollup creates hierarchical groupings, facilitating deeper insights and more thorough data summarization.

Text Analysis Configurations for Improved Search Capabilities

BigQuery introduces advanced text analysis configurations to enhance search capabilities. These include options for CREATE SEARCH INDEX, LOG_ANALYZER, PATTERN_ANALYZER, and the new TEXT_ANALYZE function.

These features support advanced processing functions like ML.BAG_OF_WORDS, ML.TF_IDF and various distance measures enable more precise and sophisticated text analysis and search operations.

Native Integration for Looker Studio

BigQuery’s Native Integration with Looker Studio allows seamless data visualization and reporting. Users can connect BigQuery datasets directly to Looker Studio, enabling real-time analysis and the creation of interactive dashboards.

Subscription Management in Analytics Hub for Simplified Service Oversight

BigQuery’s Subscription Management in Analytics Hub streamlines service oversight by allowing users to manage data subscriptions efficiently. This feature offers a centralized platform to monitor, update, and cancel subscriptions, ensuring easy access to shared datasets and better control over data usage across projects.

Routines in Analytics Hub for Optimized Workflows

Routines in Analytics Hub are designed to optimize workflows by enabling the sharing and execution of reusable SQL scripts, functions, and stored procedures across different projects and organizations. This feature enhances collaboration and streamlines complex data operations, allowing for more efficient data analysis and consistent task execution.

Restrict Data Egress in Analytics Hub for Enhanced Security

BigQuery’s Analytics Hub includes the ability to restrict data egress, which enhances security by controlling how shared datasets are accessed and transferred. This feature ensures that sensitive data remains within authorized environments, preventing unauthorized export or misuse while enabling collaborative data analysis and sharing across projects.

Non-incremental Materialized Views for Efficient Data Refresh

BigQuery’s Non-incremental Materialized Views allow for a complete data refresh by fully rebuilding the view rather than updating only the changed data. This approach is ideal for scenarios requiring comprehensive data recalculations, ensuring accuracy and consistency in complex datasets.

Unicode Column Naming for Universal Data Compatibility

BigQuery’s Unicode Column Naming feature enhances universal data compatibility by supporting the use of Unicode characters in column names. This update accommodates global languages and symbols, enabling more intuitive and culturally relevant data schemas and improving accessibility and usability across diverse datasets and international data environments.

GROUP BY and SELECT DISTINCT with ARRAY and STRUCT

The GROUP BY clause and SELECT DISTINCT can now be used with ARRAY and STRUCT data types. This enhancement allows for better aggregation and filtering of complex data structures, enabling users to perform more advanced queries and analytics on structured and nested data.

JSON_KEYS Function

The JSON_KEYS function is now available for extracting unique JSON keys from a JSON expression. This function provides flexibility, as it works with JSON Path to match various JSON data structures, allowing users to effectively manipulate and extract information from JSON objects.

Materialized Views on Apache Iceberg Table

BigQuery now allows the creation of a materialized view over Apache Iceberg tables that are partition-aligned with the base table. This feature ensures optimal performance and compatibility with Iceberg, and includes an updated version of JDBC for improved connectivity.

Search Index Optimization for INT64 and TIMESTAMP Data Types

Search indexes in BigQuery can now be used to optimize lookups on INT64 and TIMESTAMP data types. This update improves query performance by allowing faster searches and better indexing, making data retrieval more efficient for large-scale datasets.

Reference Iceberg Tables in Materialized Views

You can now reference Iceberg tables in materialized views without the need to migrate data to BigQuery-managed storage. This integration with Iceberg improves flexibility and performance, enabling efficient querying of external data sources.

BigQuery Materialized View Recommender

The materialized view recommender in BigQuery identifies potential cost savings from using materialized views based on past queries. This tool helps optimize costs by recommending views that can speed up query performance while reducing compute costs.

Incremental Materialized Views

Incremental materialized views now support LEFT OUTER JOIN and UNION ALL operations. This feature, currently in preview, allows users to build more complex queries while maintaining the performance advantages of incremental data processing in materialized views.

Support for Delta Lake Format for Amazon S3 and Azure Tables

BigQuery now supports the Delta Lake format for tables stored in Amazon S3 and Azure. This update provides better data management and querying capabilities for data stored in external cloud storage services, offering improved performance and flexibility.

Cross-Cloud Joins Between Google Cloud and BigQuery Omni Regions

BigQuery now supports cross-cloud joins between Google Cloud and BigQuery Omni regions. This enables queries to span multiple cloud environments, providing businesses with greater flexibility in accessing and analyzing distributed datasets across different geographic regions.

BigQuery Cross-Cloud Materialized Views

Materialized views can now be created over Amazon S3 metadata cache-enabled BigLake tables to avoid egress costs and improve performance. This cross-cloud capability allows users to optimize queries while reducing data movement and associated costs.

Query Performance Insights for Partition Skew

BigQuery now provides new insights to detect and analyze partition skew in queries. This feature helps users identify uneven data distribution across partitions, which can lead to performance bottlenecks, allowing them to optimize query execution and resource usage.

Apache Spark Stored Procedures

Stored procedures for Apache Spark are now available in BigQuery without requiring enrollment. This feature enables users to execute Spark code within BigQuery environments, improving integration and providing more options for data processing and analysis.

Grouping Sets

BigQuery now supports GROUP BY GROUPING SETS, GROUP BY CUBE, and GROUP BY ROLLUP clauses, along with the new GROUPING function. These features allow users to produce aggregated data for multiple grouping sets and check whether values in the GROUP BY clause are aggregated. Currently in preview.

Google Cloud Blockchain Analytics

Google Cloud Blockchain Analytics datasets are now available in preview through the Public Datasets Program and Analytics Hub. This includes Tron Mainnet, Optimism Mainnet, Avalanche Contract Chain, Fantom Opera, Ethereum Mainnet, Arbitrum One Chain, and Cronos Mainnet Chain datasets.

BigQuery Native Integration in Looker Studio

BigQuery’s native integration in Looker Studio is now in preview. This enables enhanced monitoring features for Looker Studio queries, improves query performance, and supports a range of BigQuery features, allowing for more efficient data analysis and visualization.

BigQuery Studio

BigQuery Studio provides enhanced tools for data exploration, analysis, and inference, including the use of Python notebooks powered by Colab Enterprise. This feature allows users to collaborate and manage their data workflows more effectively, enhancing productivity.

Analytics Hub and Routine Linked Dataset

Analytics Hub now supports the use of routines in linked datasets, enabling users to define and execute complex operations on shared datasets. This feature enhances the flexibility of working with shared data across different teams and organizations.

Unicode Column Naming

BigQuery now supports Unicode column naming, allowing the use of international character sets, including alphanumeric and special characters. Existing columns can be renamed to use these new capabilities, which provide greater flexibility when working with multilingual datasets. This feature is in preview.

Authorized Stored Procedures

Authorized stored procedures are now in preview, allowing users to share stored procedures with other users or groups without providing direct access to the underlying tables. This feature enhances security and data governance by separating permissions for procedures and data access.

BigQuery Primary and Foreign Key Constraints

BigQuery now allows users to define primary and foreign key constraints for tables using the BigQuery API or console. This feature improves data integrity by ensuring proper relationships between tables, which is critical for maintaining consistency in relational databases.

Dive into New Data + AI Capabilities

BigQuery’s latest Data and AI capabilities are designed to transform data analysis and machine learning. These new features empower organizations to leverage advanced AI tools, streamline data processing, and enhance predictive modeling.

Generally Available Data + AI Features in BigQuery

Beyond AI, BigQuery's Generally Available Data offers a powerful suite of tools for real-time predictions, advanced data analysis, and seamless machine learning integration. These capabilities make it easier for organizations to process and analyze large datasets with ready-to-use AI solutions.

Custom Holiday Modeling in BQML for Tailored Predictions

BigQuery ML's Custom Holiday Modeling feature enhances time-series forecasting by letting users define and incorporate custom holidays into their models. This capability improves prediction accuracy by accounting for specific events not covered in standard holiday regions, enabling tailored forecasting that meets unique business needs.

BigQuery ML Inference Engine for Real-time Predictions

The BigQuery ML Inference Engine enables real-time predictions by allowing you to run machine learning inferences directly within BigQuery. You can use imported custom models, remotely hosted models on Vertex AI, and Google’s pre-trained Cloud AI models.

This integration reduces data movement, cuts costs, and simplifies machine learning deployment using familiar SQL, all while enhancing security and scalability.

BigQuery ML Feature Preprocessing for Optimized Model Training

The ML.TRANSFORM function in BigQuery ML optimizes model training by preprocessing feature data based on transformations defined during the model’s training phase. This ensures consistency and accuracy in the training and prediction stages, maintaining transparency throughout the process.

Text Embedding Support in BigQuery ML for Advanced Text Analysis

BigQuery ML now supports text embeddings, allowing for advanced text analysis directly within SQL. You can generate embeddings using models like BERT, NNLM, SWIVEL, and text embedding-gecko for sentiment analysis and semantic search tasks.

These embeddings integrate seamlessly with BigQuery ML models.

Multivariate Time Series Forecasting for Complex Data Predictions

BigQuery ML’s Multivariate Time Series Forecasting feature enables the prediction of complex data patterns by simultaneously modeling multiple time-dependent variables. This feature uses ARIMA_PLUS with external regressors (XREG) to enhance forecasting accuracy by incorporating additional factors. It is ideal for complex scenarios such as sales forecasting and financial planning.

New Explainable AI Capabilities for Transparent Machine Learning

BigQuery ML's new Explainable AI (XAI) capabilities bring transparency to machine learning by offering insights into how models arrive at their predictions. These tools help identify the key factors influencing outcomes, allowing users to understand better and refine their models. XAI supports compliance, improves decision-making, and ensures ethical AI practices in data-driven environments.

Gemini in BigQuery Features

Features such as data insights, data canvas, SQL and Python code assistance, and partitioning/clustering recommendations are now generally available. These enhancements empower users to derive insights from their data faster, optimize data workflows, and improve query efficiency across large datasets.

Anomaly Detection with BigQuery ML

Anomaly detection using BigQuery ML now supports multivariate time series (ARIMA_PLUS_XREG) models, and this capability is generally available. This feature helps detect unusual patterns in data, enabling users to identify anomalies in complex datasets and improve forecasting accuracy.

Time Series and Range Functions

BigQuery now includes support for time series and range functions to assist in advanced data analysis. These functions allow users to analyze temporal data, manage date ranges, and execute more granular queries across various datasets, improving data exploration.

DLP Functions for Encryption/Decryption with AES-SIV

BigQuery now supports DLP functions for encryption and decryption between BigQuery and Sensitive Data Protection using the AES-SIV encryption method. This update enhances data security by allowing for more robust encryption and decryption workflows in BigQuery environments.

BigQuery ML Data Preprocessing Features

BigQuery ML introduces the ML.TRANSFORM function and transform-only models to streamline data preprocessing, decoupling it from the actual model training process. These features allow users to automate preprocessing tasks, improving the quality of training data and overall model performance.

BigQuery DataFrames General Availability

BigQuery now supports DataFrames, offering server-side processing with popular pandas and scikit-learn APIs for Python. This feature allows data scientists to leverage familiar Python libraries for data analysis and machine learning, while benefiting from BigQuery's scalability.

Create and Run Spark Stored Procedures in Python, Java, and Scala

You can now create and run Spark stored procedures in Python, Java, and Scala within BigQuery. Users can take advantage of the PySpark editor to write and execute Spark procedures, which enhances data processing workflows for large-scale datasets.

BigQuery ML Text Embedding Features with Remote Models and TensorFlow Models

BigQuery ML now supports text embedding with Vertex AI textembedding-gecko models*, as well as popular TensorFlow models like NNLM, SWIVEL, and BERT. These features help perform natural language processing tasks, such as text classification and sentiment analysis, within BigQuery.

Generative AI Features with Gemini-pro Vertex AI LLM and ML.GENERATE_TEXT Function

BigQuery introduces generative AI features using the Gemini-pro Vertex AI LLM, allowing for natural language processing tasks via the ML.GENERATE_TEXT function. These capabilities extend BigQuery's ability to generate and analyze text data, integrating with BigQuery DataFrames for powerful inference tasks.

View Query Plans for SQL Pushdowns in Federated Queries

You can now view query plans in BigQuery to analyze the details of SQL pushdowns in federated queries. This feature provides deeper insights into query performance, helping users optimize data retrieval from external data sources by viewing how SQL is executed in federated queries.

BigQuery ML: Residual Column in ML.EXPLAIN_FORECAST Output

BigQuery ML now adds a residual column to the output of the ML.EXPLAIN_FORECAST function for ARIMA_PLUS models. This new addition helps users better understand the differences between forecasted and actual values, providing more comprehensive insights into model performance.

BigQuery ML: Performance Improvements with APPROX_GLOBAL_FEATURE_CONTRIB

Performance improvements in BigQuery ML can now be achieved using the APPROX_GLOBAL_FEATURE_CONTRIB argument for boosted tree and random forest models. This feature enhances the scalability of these models by reducing computational overhead while maintaining accurate feature contribution analysis.

ST_LINEINTERPOLATEPOINT Geography Function Support

BigQuery now supports the ST_LINEINTERPOLATEPOINT geography function, which retrieves a specific point along a linestring based on a fractional distance. This function is useful for geospatial analysis and helps users perform more precise calculations on geographic data.

Analytics Hub Listings with Customer-Managed Encryption Keys (CMEK)

Analytics Hub listings can now include data encrypted with customer-managed encryption keys (CMEK). This feature gives data providers greater control over data security, ensuring that datasets shared through Analytics Hub maintain strong encryption standards.

Operational Health Administrative Resource Charts

Operational Health administrative resource charts are now in preview, providing visual insights into slot and shuffle usage, job concurrency, errors, and other critical metrics. These charts enable users to monitor the performance and health of their BigQuery resources more effectively.

Vertex AI LLM Support in BigQuery

SQL syntax updates in BigQuery allow users to access text generation and embedding LLMs with expanded region support. This integration with Vertex AI enables powerful generative AI tasks, improving the ability to handle natural language processing within BigQuery queries.

BigQuery ML Point-in-Time Lookup

BigQuery ML introduces point-in-time lookup functions, now generally available (GA). Users can specify a point-in-time cutoff when retrieving features for model training or inference, preventing data leakage and ensuring more accurate historical data analysis across entities.

BigQuery ML Features for Time Series Forecasting

BigQuery ML features for time series forecasting are now generally available (GA). These include setting forecast value limits via FORECAST_LIMIT_LOWER_BOUND and FORECAST_LIMIT_UPPER_BOUND, and specifying custom holiday modeling in time series models. The updated ML.EXPLAIN_FORECAST function explains holiday effects in the model.

BigQuery Geography Functions

BigQuery now offers new geography functions, such as ST_LINESUBSTRING and ST_HAUSDORFFDISTANCE, enabling users to perform advanced geospatial analysis. These functions enhance the ability to handle and analyze geographic data with more precision.

BigQuery Array Subscript Operator

BigQuery introduces the array subscript operator, which allows users to directly retrieve a value from an array by its index. This operator simplifies queries involving arrays by providing a straightforward method for accessing individual elements.

Generative AI in BigQuery ML

BigQuery ML now supports generative AI features, including the ML.GENERATE_TEXT function and integration with Vertex AI LLM models. These tools enable natural language generation and analysis directly within BigQuery, enhancing data processing capabilities for text-related tasks.

BigQuery ML Inference Features

BigQuery ML’s inference features are now generally available (GA). Users can import models like ONNX, XGBoost, and TensorFlow Lite, host them on Vertex AI Prediction, and perform inference with pretrained models for tasks like natural language processing and computer vision.

Create Data Profiles

You can now create data profiles in BigQuery by scanning tables to monitor data quality. These features help users gain deeper insights into their data and ensure accuracy and reliability by profiling tables for inconsistencies and anomalies. Generally available (GA).

Protocol Buffer (Protobuf)

BigQuery now allows users to export data as Protocol Buffer (Protobuf) columns using user-defined functions. This feature is generally available, offering better support for serializing structured data and improving interoperability across different applications.

Multivariate Time Series Forecasting

Multivariate time series forecasting using the ARIMA_PLUS_XREG model in BigQuery ML is now generally available (GA). This feature enables users to forecast time series data while incorporating additional feature columns, improving predictive accuracy for complex datasets.

EXTERNAL_QUERY SQL Pushdown

EXTERNAL_QUERY SQL pushdown optimizes data retrieval from external sources like Cloud SQL or Cloud Spanner. This feature reduces execution time and cost by minimizing data transfers and applying column pruning and filter pushdowns in federated queries, now generally available (GA).

JSON Data Mapping for Cloud Spanner

BigQuery now supports JSON data type mapping for federated queries in Cloud Spanner. This update enhances compatibility between Cloud Spanner databases and BigQuery, allowing users to handle JSON data more efficiently in federated query workflows.

BigQuery Rounding Modes

BigQuery introduces support for ROUND_HALF_EVEN and ROUND_HALF_AWAY_FROM_ZERO rounding modes for NUMERIC or BIGNUMERIC columns. These modes allow users to specify how values should be rounded, improving control over numeric data operations.

AutoML Tables Model Features

BigQuery’s AutoML Tables model features are now generally available (GA), including availability in additional regions and CMEK support. New optimization objectives like MAXIMIZE_PRECISION_AT_RECALL and MAXIMIZE_RECALL_AT_PRECISION have also been added to enhance model performance.

ALTER TABLE RENAME COLUMN

The ALTER TABLE RENAME COLUMN statement is now generally available in BigQuery. This update allows users to rename columns in existing tables using a simple SQL command, improving the management and organization of database schemas.

Vertex AI Managed Pipelines

More than 20 BigQuery ML components are now available for Vertex AI Managed Pipelines, benefiting AI/ML users by enabling seamless integration with Vertex AI. These components improve automation, model deployment, and tracking for machine learning workflows in BigQuery.

Sparse Input Support in BigQuery ML

Sparse input support in BigQuery ML model training is now generally available (GA). This feature improves model training for datasets where values are mostly zero or empty, enhancing the ability to process and analyze sparse data more effectively.

BigQuery Model Registry Integration

BigQuery ML now integrates with the Vertex AI Model Registry, enabling users to register, monitor, and deploy models directly from the registry. This integration streamlines MLOps workflows, providing versioning, tracking, and model management for BigQuery ML models.

BigQuery ML Integration with Vertex AI Model Registry

BigQuery ML’s integration with Vertex AI Model Registry is generally available, providing capabilities like model versioning, deployment, and MLOps features such as model monitoring. This allows users to manage BigQuery ML models in Vertex AI, benefiting from advanced model governance and deployment options.

Preview Data + AI Enhancements

BigQuery's Preview Data and AI Enhancements bring cutting-edge features that significantly expand the platform's data processing and machine learning capabilities. These upcoming tools promise advanced AI integration, improved data analysis, and enhanced predictive modeling.

Gemini in BigQuery for Next-Level AI Integration

Gemini in BigQuery is set to revolutionize data preparation and analytics. It introduces AI-driven tools that simplify data cleansing, wrangling, and pipeline management. With natural language prompts for SQL and Python code assistance, it reimagines analytics workflows through a visual data canvas and offers performance optimization recommendations.

DocAI with Object Tables for Transparent Machine Learning

BigQuery's integration with DocAI using Object Tables allows for transparent machine learning. This feature extracts and organizes document insights directly within BigQuery, making it easier to manage, query, and analyze unstructured data. It ensures a seamless process while maintaining transparency and traceability in machine learning workflows.

Audio Transcription with Object Tables for Enhanced Data Accessibility

With BigQuery’s ML.TRANSCRIBE function combined with Object Tables, you can seamlessly transcribe audio files directly within BigQuery. This feature enhances data accessibility by converting audio content into searchable text, allowing for efficient analysis and querying, all while integrating smoothly with your existing BigQuery workflows.

Vertex AI PaLM in BigQuery ML for Advanced Natural Language Processing

The integration of Vertex AI PaLM with BigQuery ML takes natural language processing to the next level. Using the ML.GENERATE_TEXT function, users can perform tasks like content generation, text summarization, rephrasing, sentiment analysis, and critical information extraction from large text datasets - all without needing complex external tools.

Vertex AI Managed Pipelines for Automated ML Workflow Management

Vertex AI Managed Pipelines enhance BigQuery ML by automating ML workflow management. This includes online model serving, allowing real-time predictions by deploying BigQuery ML models to Vertex AI endpoints.

Additionally, robust MLOps capabilities automate model monitoring and retraining, ensuring your machine-learning processes remain accurate and efficient.

Integrated Colab Notebooks for Collaborative Data Analysis

BigQuery now integrates with Colab Notebooks, enabling seamless collaboration for data analysis. This integration lets users run SQL queries, visualize results, and build machine learning models directly within Colab.

Combining BigQuery's powerful data processing with Colab's interactive environment it fosters teamwork, streamlines workflows, and enhances productivity across data teams.

BigQuery DataFrames API for Seamless Data Manipulation

The BigQuery DataFrames API allows for seamless data manipulation, enabling users to work with BigQuery data directly in Python, similar to Pandas DataFrames.

This integration simplifies data processing tasks like filtering, aggregating, and joining datasets, while maintaining BigQuery's scalability and performance – making it an ideal tool for data scientists and analysts.

BigQuery ML Point-in-time Lookup Functions for Accurate Historical Analysis

BigQuery ML’s Point-in-time Lookup functions are essential for accurate historical analysis, as they enable the retrieval of data as it existed at a specific time. This capability is crucial for precise time-series forecasting. It ensures that models are trained on accurate historical data, which in turn improves the reliability and accuracy of predictions.

Remote Models in BigQuery ML based on Anthropic Claude Model

BigQuery ML now supports remote models based on the Anthropic Claude model available in Vertex AI for generative natural language tasks. This feature allows users to leverage cutting-edge AI models for tasks such as text generation and other NLP applications directly within BigQuery.

Python Code Completion

Python code completion is now available for all BigQuery projects. This enhancement improves the productivity of Python developers by providing automatic suggestions and completions while writing Python code in BigQuery, reducing the likelihood of errors and improving coding efficiency.

Insights and Recommendations Page

BigQuery introduces a new Insights and Recommendations page in the Google Cloud Console. This feature helps users identify areas for improvement in their queries and resources, offering recommendations to optimize performance, reduce costs, and improve data efficiency.

ML.GENERATE_EMBEDDING Function for Remote Models

The ML.GENERATE_EMBEDDING function now includes a new output_dimensionality argument for use with remote models in Vertex AI. This argument provides greater flexibility by allowing users to control the dimensionality of the generated embeddings, improving performance in machine learning tasks.

TreeAH Vector Index: ScaNN-Powered Batch Query Optimization in Preview

The TreeAH index for vector indexes is optimized for batch queries using Google's ScaNN algorithm. This update improves performance for large-scale vector searches, making it faster and more efficient to process complex queries that require similarity-based searches in BigQuery.

BigQuery Continuous Queries in Preview: Real-Time Data Processing

A new JupyterLab plugin for BigQuery enables users to explore data, work with DataFrames, and deploy notebooks to Cloud Composer. This feature enhances the integration between BigQuery and JupyterLab, allowing for seamless development and deployment of data science projects.

Supervised Tuning on BigQuery ML Remote Model (gemini-1.0-pro-002)

You can now perform supervised tuning on a BigQuery ML remote model using the gemini-1.0-pro-002 model. This feature enhances model training by allowing users to fine-tune models for specific tasks, improving overall performance and accuracy.

Supervised Tuning Using BigQuery DataFrames Python API

The BigQuery DataFrames Python API now supports supervised tuning with the GeminiTextGenerator class, allowing users to utilize the fit() and score() methods for fine-tuning models. This feature is designed to enhance the accuracy of text-based machine learning models within BigQuery.

Scheduled Notebooks

BigQuery introduces a new feature for scheduling Jupyter notebooks, allowing users to automate notebook execution at specific intervals. This update is especially useful for routine tasks such as data analysis, reporting, and model training, improving overall productivity.

SQL Code Generation

SQL code generation is now available for all BigQuery projects. This feature simplifies the creation of SQL queries by automatically generating optimized SQL code based on user inputs, making it easier to handle complex queries and reducing manual effort.

Text Embedding and Multimodal Embedding Support

BigQuery ML expands support for text embedding and multimodal embeddings, allowing users to create embeddings from multiple data types, including text, images, and videos. This feature improves machine learning tasks by offering more comprehensive representations of diverse data inputs.

_CHANGE_SEQUENCE_NUMBER

BigQuery now allows the definition of a _CHANGE_SEQUENCE_NUMBER for change data capture (CDC). This feature is in preview and helps manage the ordering of streaming UPSERT operations, ensuring data consistency and reducing conflicts during data ingestion in real-time applications.

Generative AI

Generative AI features in BigQuery are in preview, including remote models based on Vertex AI gemini-1.5-flash and the ML.GENERATE_TEXT function. These features support text generation, audio transcription, and document classification, allowing for advanced AI tasks across a wide range of content types.

BigQuery ML-Gen AI

BigQuery ML now offers Generative AI features with grounding and safety attributes in Vertex AI Gemini models. This includes the ground_with_google_search argument for additional context in AI outputs and the safety_settings argument to filter responses. Users can also create video embeddings in preview.

BigQuery ML Generative AI Features

BigQuery ML now includes Generative AI features, such as text generation and text embeddings using Vertex AI models. These capabilities allow users to perform sophisticated natural language tasks and generate embeddings for text-based data within BigQuery.

JavaScript User-Defined Aggregate Functions

BigQuery now supports JavaScript UDAFs with the new CREATE AGGREGATE FUNCTION statement. This feature enables users to define custom aggregation logic in JavaScript, allowing for more complex data processing and calculations directly within BigQuery queries.

SQL Code Generation for All BigQuery Projects

SQL code generation is now available in preview for all BigQuery projects, with support for Gemini setup. This feature automates SQL query creation, helping users generate optimized code quickly for complex queries and improving overall productivity.

User-Defined Aggregate Functions (UDAFs) Supporting SQL Expressions

BigQuery now supports user-defined aggregate functions (UDAFs) using SQL expressions, which can be created with the CREATE AGGREGATE FUNCTION statement. This allows users to define and reuse custom aggregation logic, improving flexibility in complex queries.

BigQuery ML: Multimodal and Expanded Embedding Support

BigQuery ML now supports creating multimodal embeddings for various data types, including PCA, autoencoder, and matrix factorization models. This update enhances BigQuery’s ability to process diverse data types and generate comprehensive embeddings for machine learning tasks.

Data Canvas Creation in BigQuery Studio

BigQuery Studio introduces data canvas for discovering, transforming, querying, and visualizing data through a graphical interface. This feature simplifies data workflows by offering an intuitive environment for users to interact with their data without relying solely on SQL.

Gemini in BigQuery: New Features in Public Preview

BigQuery's Gemini features are now in public preview, including data insights, data canvas, SQL and Python code assistance, and recommendations for views, partitioning, and Spark tuning. These capabilities enhance BigQuery’s ability to optimize and simplify data analytics tasks.

Notebook Size

The maximum notebook size in BigQuery has been increased from 10 MB to 20 MB. This update is currently in preview and allows users to work with larger notebooks, supporting more extensive data analysis and complex machine learning models within BigQuery.

Lists of All Saved Queries and All Notebooks

Users can now view lists of all saved queries and all notebooks in their BigQuery project. This feature, available in preview, improves project management by making it easier to organize and access saved work across different queries and notebooks.

Anomaly Detection in BigQuery ML Multivariate Time Series Models

Anomaly detection using BigQuery ML multivariate time series models like ARIMA_PLUS_XREG is now available in preview. This feature helps users identify unusual patterns in large datasets, improving forecasting accuracy and detecting outliers in complex time series data.

BigQuery ML Preprocessing Features

BigQuery ML introduces new preprocessing features, including the ML.TRANSFORM function and transform-only models, which allow users to preprocess data separately from model training. This update improves flexibility in handling raw data, optimizing it for machine learning workflows.

AI Features in BigQuery

New AI features in BigQuery include the ability to process documents and transcribe audio from BigQuery object tables. These features, in preview, use remote models based on Document AI and Speech-to-Text APIs to perform tasks like document processing and audio transcription.

BigQuery Advanced Text Analysis Functions

BigQuery adds support for advanced text processing functions such as ML.BAG_OF_WORDS and ML.TF_IDF. These functions improve the ability to analyze textual data within BigQuery, helping users process and extract valuable insights from large text datasets.

BigQuery ML Lookup Functions

BigQuery ML introduces point-in-time lookup functions for model training and inference, preventing data leakage by retrieving features only as they existed at a specific time. This feature ensures that historical data is correctly aligned when building machine learning models.

Data Clean Rooms in BigQuery

BigQuery now supports data clean rooms, a secure environment where multiple parties can analyze shared data without revealing the underlying information. This feature ensures that data privacy is maintained while allowing collaborative data analysis across organizations.

Duet AI in BigQuery

Duet AI in BigQuery is an AI-powered collaborator that assists with SQL query completion, generation, and explanation. This feature helps users create optimized SQL queries, improving their ability to interact with and analyze data more effectively.

BigQuery DataFrames

BigQuery DataFrames enables users to perform data analysis and machine learning tasks using a Python API, including bigframes.pandas and bigframes.ml. This feature allows data scientists to work with familiar Python libraries in the scalable environment of BigQuery.

BigQuery ML Text Embedding Features

BigQuery ML now supports text embeddings using the Vertex AI PaLM APIs, improving natural language processing tasks such as text classification, sentiment analysis, and entity recognition. This feature enhances the ability to work with text data in machine learning applications.

BigQuery ML XGBoost Version Option

BigQuery ML now allows users to specify the XGBoost version during model training, enabling them to choose between version 0.9 or 1.1. This update provides greater flexibility when training models, allowing users to tailor their approach to specific tasks.

TRANSFORM Clause

BigQuery now supports the TRANSFORM clause, which allows users to train models that can be exported in Tensorflow SavedModel format. This feature, currently in preview, improves the flexibility of machine learning models by enabling easy model export and deployment.

BigQuery ML Support for Image Analytics

BigQuery ML now supports image analytics, allowing users to perform tasks like object detection and OCR using vision models. This feature enables advanced image processing directly within BigQuery, expanding its capabilities beyond text and tabular data analysis.

Discover New Enterprise-Class Features Capabilities

BigQuery's new enterprise features strengthen data security, management, and performance. They provide customized solutions for real-time updates, secure operations, and smoother workflows, helping organizations improve their data strategies on a larger scale.

Ready-to-use Enterprise-Class Features

BigQuery’s ready-to-use enterprise-class features are crafted to meet the complex needs of modern, data-driven organizations. These features provide tailored solutions for data management, security, and performance.

BigQuery Editions for Tailored Data Solutions

BigQuery offers three editions - Standard, Enterprise, and Enterprise Plus, each tailored to support different workloads and features at varying price points. These editions can be enabled during BigQuery capacity reservation, allowing organizations to optimize costs and performance.

Change Data Capture Using the Storage API for Real-Time Updates

With the Storage API, BigQuery’s Change Data Capture (CDC) enables real-time updates by directly capturing and replicating changes from your data sources into BigQuery. This feature ensures your data is always up-to-date, facilitating timely analysis and decision-making, while supporting seamless data integration across platforms.

Storage Write API Multiplexing for Efficient Data Streaming

BigQuery's Storage Write API Multiplexing enables efficient data streaming by allowing multiple streams to write data concurrently into BigQuery tables. This feature improves throughput and reduces latency, making it ideal for high-velocity data environments where real-time data ingestion is crucial for maintaining up-to-date analytics and supporting dynamic business needs.

Query Inspector for Deep Query Analysis

BigQuery’s Query Inspector offers deep insights into query performance and execution. It helps you identify bottlenecks, optimize resource usage, and troubleshoot inefficiencies. With visualizations and comprehensive metrics, Query Inspector ensures your queries run efficiently, improving overall data processing.

Table Clones for Fast Data Duplication

BigQuery’s Table Clones feature provides a lightweight, writable copy of an existing table, known as the base table. These clones allow you to duplicate data without incurring additional storage costs unless the cloned data differs from the base table. This feature is ideal for developing, testing, and creating sandboxes, offering data manipulation and analytics flexibility while ensuring efficient storage usage.

Dynamic Data Masking for Enhanced Data Security

BigQuery’s Dynamic Data Masking strengthens data security by allowing you to obscure sensitive column data for specific user groups. This feature integrates with column-level access control, enabling fine-grained control over who can view or manipulate sensitive information, ensuring that data privacy is maintained across different teams and use cases.

Query Queues for Optimized Data Workflows

BigQuery's Query Queues feature optimizes data workflows by dynamically managing the number of queries that can run concurrently based on available compute resources.

Additional queries are queued until resources become available, ensuring efficient resource allocation and fair project scheduling. Users can control the maximum concurrency target within reservations, guaranteeing a minimum slot capacity for each query.

While BigQuery's Query Queues focus on optimizing query execution by managing concurrent queries based on available resources, tools like OWOX Transformation and Dataform address the challenge of data transformation itself. Both OWOX Transformation and Dataform help teams streamline the process of turning raw data into structured, meaningful datasets, but they differ in their approach.

Here is a comparison table between Dataform and OWOX Transformation.

Purpose
- Dataform: Manages SQL-based data transformation and workflow automation.
- OWOX Transformation: Automates marketing data transformation with customizable no-code/low-code templates.
Concurrency Management
- Dataform: Focuses on dependency management in SQL workflows to optimize data processing.
- OWOX Transformation: Focuses on automating data processing tasks like normalization, blending, and attribution.
Resource Allocation
- Dataform: Automates resource allocation for SQL transformations, ensuring efficient workflow execution.
- OWOX Transformation: Allocates resources for transforming, blending, and preparing marketing data for analysis.
Fair Scheduling
- Dataform: Manages execution order in SQL workflows, ensuring that dependencies are resolved before task execution.
- OWOX Transformation: Provides a structured approach to transforming and unifying marketing data across sources.
Control Options
- Dataform: Allows users to define workflows using SQL and JavaScript, with automated dependency resolution and version control.
- OWOX Transformation: Offers a visual, SQL-first approach with customizable templates, allowing for easy setup and maintenance.
Use Case
- Dataform: Ideal for developers and data engineers who need to manage complex SQL transformations and automate data pipelines.
- OWOX Transformation: Best for marketers and data professionals needing to unify and transform marketing data, focusing on ROAS, LTV, CAC, and omnichannel performance.

Physical Bytes for Storage Billing for Transparent Cost Management

BigQuery's Physical Bytes storage billing model allows customers to reduce costs by leveraging data compression, charging based on the actual physical storage rather than logical bytes. This option offers significant cost savings while maintaining data accessibility and query performance, providing a more transparent and efficient billing method.

Configure Time Travel Windows for Data Recovery

BigQuery's Time Travel feature allows you to configure data retention windows, enabling recovery of changed or deleted data for up to seven days. You can set the time travel window from a minimum of two to a maximum of seven days, with longer windows offering extended recovery periods and shorter windows reducing storage costs, particularly under the physical storage billing model.

Authorized Stored Procedures for Secure Data Operations

BigQuery's Authorized Stored Procedures enhance data security by allowing users to execute SQL code with elevated permissions, independent of the caller's access rights. This ensures that sensitive operations are performed securely, enabling controlled access to data while maintaining strict governance and compliance standards across your organization's data workflows.

Data Lineage, Profiles, and Data Quality for Governance

BigQuery enhances governance with features like Data Lineage, Profiles, and Data Quality. Data Lineage tracks data movement and transformations across systems, Data Profiles provide insights into data structure and content, and Data Quality tools ensure data accuracy and reliability. Together, these features enable robust data management and compliance.

Dataset Tag Support for Better Data Management

BigQuery's Dataset Tag Support enhances data management by allowing users to organize and categorize datasets, tables, and views with custom tags. These tags enable easier data discovery and improved access control in large data environments, simplifying the application of consistent governance and policy enforcement across your organization.

Workload Management Updates

Workload management now offers precise autoscaling, scaling up immediately in increments of 50 slots. Users can also adjust capacity commitments and reduce capacity consecutively without a one-minute delay. All these features are now generally available (GA), improving workload flexibility and performance.

Monitor Job Activity Effortlessly with GA Jobs Explorer

The administrative jobs explorer is now generally available for monitoring job activity across your organization. This tool provides a streamlined interface to quickly track and analyze job execution, making it easier for administrators to maintain operational efficiency and monitor resource usage.

Modify Rows with DML in BigQuery via Storage Write API (GA)

You can now use DML to update rows that were recently written to BigQuery tables via the Storage Write API. This feature, now generally available (GA), provides more flexibility in modifying data, improving real-time data handling and making updates more efficient.

Configure Default Storage Billing Model for New Datasets

You can now configure the default storage billing model for new datasets in BigQuery. This feature simplifies billing management, allowing organizations to specify storage pricing models for datasets at the time of creation, ensuring cost predictability.

Tags on BigQuery Tables

IAM policies can now be applied conditionally to BigQuery tables using tags. This feature allows more granular access control by tagging resources with specific labels and applying security or management policies based on those tags, improving data governance.

Global Rate Limits on BigQuery Omni Connection Creation and Use

Global rate limits on BigQuery Omni connection creation and use have replaced regional limits for AWS and Azure. This update standardizes connection limits, ensuring consistent usage and preventing resource overload across different cloud environments when using BigQuery Omni.

Slot Recommender

The slot recommender for BigQuery editions analyzes historical usage data to recommend the optimal capacity for slot purchasing. This feature, now generally available (GA), helps users make better decisions for managing edition and on-demand workloads, improving cost-efficiency and resource allocation.

IAM Conditions

You can now use IAM conditions to control access to BigQuery resources. This feature, generally available (GA), allows administrators to create conditional access policies based on context, such as time or resource attributes, providing more detailed security management for BigQuery resources.

Microsoft Entra Group Access to BigQuery in Power BI via Workforce Identity Federation

Users in Microsoft Entra groups can now access BigQuery data in Power BI using Workforce Identity Federation. This integration enables seamless identity management between Microsoft and Google Cloud, making it easier for organizations to collaborate across platforms.

Differential Privacy

Differential privacy features in BigQuery are now generally available. This feature allows users to protect the privacy of individual data entries while performing aggregate data analysis, ensuring compliance with privacy regulations while maintaining the integrity of the data.

Minimum Duration Between Scheduled Queries Reduced to 5 Minutes

The minimum duration between scheduled queries has been reduced from 15 minutes to 5 minutes. This update allows users to execute queries more frequently, improving real-time data processing and enabling faster insights from regularly updated datasets.

Efficient Deletion of Partitions Using DML in BigQuery

Users can now use DML statements to efficiently delete entire partitions in BigQuery without scanning bytes or consuming slots. This feature improves the performance of partition deletion, saving resources and optimizing data management processes for large datasets.

Custom Data Masking and Entity Resolution Features in BigQuery

BigQuery introduces custom data masking routines to protect sensitive data. Additionally, entity resolution capabilities using identity providers like LiveRamp allow users to link entities across datasets more accurately, improving privacy and data quality in marketing and analytics workflows.

Cloud Console Updates: Sort Query Results by Column

Users can now sort query results by column directly in the Cloud console. This update provides a more user-friendly experience when analyzing query results, allowing for easier sorting and visualization of data without needing to modify SQL queries.

Caches Results

You can now use cached results from the same query issued by other users in the same project when using the Enterprise or Enterprise Plus edition of BigQuery. This feature, now generally available (GA), improves performance by reducing redundant query execution.

BigQuery Change Data Capture (CDC)

BigQuery now supports change data capture (CDC), enabling real-time processing of streamed changes using the BigQuery Storage Write API. This feature helps users synchronize data changes efficiently, improving real-time data handling and reducing latency in data updates.

Adding Descriptions to the Columns of a View

Adding descriptions to the columns of a view is now generally available (GA). Users can use the CREATE VIEW or ALTER COLUMN DDL statements to add metadata descriptions, improving documentation and clarity for team collaboration on shared datasets.

BigQuery Administrator

As a BigQuery administrator, you can now monitor your organization's slots utilization and BigQuery jobs' performance over time using the administrative query inspector. This feature is now generally available, improving oversight of resource usage and query efficiency.

BigQuery Storage Write API Multiplexing

BigQuery Storage Write API multiplexing is now generally available (GA). This feature allows users to use multiplexing in the default stream to write to multiple destination tables with shared connections, improving the efficiency of data ingestion across multiple tables.

BigQuery Query Queues

Query queues in BigQuery now determine concurrency based on available slots, improving query performance and reducing wait times. This feature optimizes resource allocation, ensuring that queries are executed more efficiently during peak usage periods.

Spanner Data Boost

Spanner Data Boost lets you execute analytics queries and data exports with near-zero impact to existing workloads on your provisioned Spanner instance. This feature is now generally available (GA) in several regions, enhancing Spanner’s capabilities for running high-performance queries alongside transactional workloads.

Load Data with SQL Statement

Users can now load data into BigQuery tables from formats like Avro, CSV, JSON, ORC, or Parquet using the LOAD DATA SQL statement. This feature simplifies the data loading process, providing flexibility in importing various data formats into BigQuery.

Fail-Safe Period

BigQuery introduces a fail-safe period providing seven extra days of data storage after the time travel window for emergency recovery. There will be no billing for this period until July 17, 2023, ensuring data protection and recovery options without additional costs.

Physical Bytes for Billing

BigQuery now supports billing based on physical bytes used for storage, including time travel and fail-safe storage. This update provides more accurate billing, allowing users to manage storage costs better by understanding how their data storage impacts overall charges.

Time Travel Window Configuration

You can now configure the time travel window duration between 2 to 7 days in BigQuery. This feature enhances data management flexibility, giving users more control over how long they can access historical data for recovery or auditing purposes.

BigQuery Support for Compliance Programs

BigQuery now offers support for compliance programs like ITAR and EU regions with sovereignty controls. This feature ensures that organizations meet regulatory requirements, especially those operating in sectors with strict data privacy and sovereignty regulations.

TRUNCATE TABLE

BigQuery now supports TRUNCATE TABLE for multi-statement transactions, which is generally available (GA). This feature allows users to remove all rows from a table in a transaction, simplifying the process of clearing table data while ensuring consistency.

Information Schema

BigQuery’s INFORMATION_SCHEMA views that show table storage metadata are now generally available (GA). Users can access the TABLE_STORAGE and TABLE_STORAGE_BY_ORGANIZATION views to get snapshots of storage usage for tables at the project or organization level, improving transparency and reporting.

BigQuery VPC Service Controls

The VPC Service Controls perimeter now protects the BigQuery Reservation API, improving security by ensuring that only authorized resources and users can interact with the reservation system, helping to protect against unauthorized access and data breaches.

BI Engine Top Tables Cached Bytes

You can now view BI Engine Top Tables Cached Bytes, BI Engine Query Fallback Count, and Query Execution Count as dashboard metrics for BigQuery. These features, now generally available (GA), help users monitor BI Engine’s performance and query optimization.

Table Clones Feature

BigQuery’s table clones feature is now generally available (GA). This feature allows users to create clones of tables for testing or backup purposes without duplicating the data, saving on storage costs and providing more flexibility in data management.

Dynamic Data Masking for REPEATED Mode

Dynamic data masking in BigQuery has been updated to support REPEATED mode on RECORD columns. This feature improves security by allowing users to mask sensitive data within nested fields, providing better control over data visibility.

BigQuery BI Engine Reservation Updates

Updates to preferred tables for BI Engine reservations now propagate in under 10 seconds. This feature improves BI Engine’s responsiveness, ensuring that updates to table preferences are applied quickly, minimizing delays in query performance enhancements.

Maximum Result Size (20 GiB Logical Bytes) When Querying Azure or Amazon Simple Storage Service

The maximum result size when querying Azure or Amazon S3 data has been increased to 20 GiB logical bytes, and this feature is now generally available (GA). Querying these external storage services is now subject to updated quotas and limitations, improving query capabilities.

BigQuery Partner Center

The BigQuery Partner Center is now generally available (GA), providing a platform to discover and try validated partner applications. Additionally, the Google Cloud Ready - BigQuery initiative has added 14 new partners, expanding the ecosystem of BigQuery-integrated tools and services.

Compute (Analysis)

BigQuery’s compute (analysis) feature is now generally available (GA) in three new editions: Standard, Enterprise, and Enterprise Plus. These editions support autoscaling slots, helping organizations optimize their resource usage and costs based on their data processing needs.

Autoscaling Slots

Autoscaling slots are now generally available (GA). Slots created during the preview have been upgraded to the BigQuery Enterprise edition, allowing for automatic scaling based on workload demand, improving resource utilization and query performance during peak times.

BigQuery Lineage Tab

The lineage tab in BigQuery tracks how data moves and transforms across tables and views within the system. This feature provides transparency in data workflows, helping users visualize dependencies and data transformations, improving auditability and troubleshooting.

Default Values on Columns

Users can now set default values on columns in BigQuery tables. This feature, now generally available (GA), simplifies data management by allowing default values to be applied during data insertion, reducing the need for manual handling of missing values.

bq Commands

You can now run bq commands using service account impersonation. This feature is generally available (GA) and improves security by allowing users to delegate access permissions without sharing credentials, enabling more controlled access to BigQuery resources.

Explorer Pane

In the Explorer pane, the resource corresponding to the focused tab is now selected. Additionally, users can click Show more to see all resources at the current level. Both features are now generally available (GA), improving navigation and resource visibility.

Session Ends

The following generally available (GA) features have been added for sessions:

Temporary functions are now maintained until the session ends.
Statements that include the TEMP keyword can also include the OR REPLACE and IF NOT EXISTS keywords, improving session functionality and flexibility.

Future Enterprise Solutions in Preview

BigQuery's Future Enterprise Solutions in Preview offers a sneak peek into advanced tools designed to elevate data management, security, and performance. These upcoming features promise to equip organizations with cutting-edge capabilities for improved data governance, auditing, and cross-region data handling.

Table Change History for Enhanced Data Auditing

BigQuery's Table Change History feature takes data auditing to the next level by tracking and recording changes to tables. It allows you to review historical data states, understand modifications, and ensure compliance with auditing requirements. A detailed log of alterations makes monitoring and maintaining data integrity much easier.

Table Storage Information Schema for Transparent Storage Insights

The Table Storage Information Schema in BigQuery provides a clear snapshot of table storage usage and materialized views. It details logical, physical, and time travel bytes, helping you monitor storage consumption, plan for future growth, and understand update patterns. This transparency is key to efficient storage management and cost optimization.

Cross-Region Table Copy for Seamless Data Movement

BigQuery's Cross-Region Table Copy feature enables smooth data movement between regions, allowing you to replicate tables across different geographic locations. This capability ensures data availability, supports disaster recovery, and enhances data accessibility for global teams, making it easier to manage and safeguard your data in a distributed environment.

Cross-Region Dataset Replication for Data Protection

The Cross-Region Dataset Replication in BigQuery enhances data protection by creating a secondary replica of your dataset in a specified region. This replica is updated asynchronously, ensuring redundancy across multiple zones and regions, offering an additional layer of security.

Modify Recently Streamed Data for Greater Flexibility

BigQuery’s DML (Data Manipulation Language) now supports modifying recently streamed data, providing greater flexibility in managing your datasets. This feature lets you update or delete rows streamed into BigQuery within the last seven days, enabling timely corrections and adjustments that enhance accuracy and operational efficiency.

Partitioning and Clustering Recommender for Optimized Data Organization

BigQuery’s Partitioning and Clustering Recommender improves data organization by analyzing workload execution over 30 days to identify suboptimal table scans. Using machine learning, it estimates potential performance improvements and cost savings, offering tailored recommendations for partitioning and clustering based on your specific read-and-write patterns.

Custom Data Masking Functions for Tailored Data Protection

BigQuery’s Custom Data Masking Functions provide tailored data protection by allowing user-defined functions (UDFs) to mask sensitive data in columns. Supporting most data types, this feature offers flexibility in how data is masked while ensuring security.

These custom routines are ideal for implementing specialized data protection strategies that meet your organization’s unique needs.

Differential Privacy for Enhanced Data Security

BigQuery’s Differential Privacy feature boosts data security by adding controlled noise to query results, safeguarding individual data points from being identified. This ensures that sensitive information remains confidential while allowing meaningful aggregate analysis, making it essential to maintain privacy in datasets containing personally identifiable information (PII).

Shared Encryption Functions with Cloud SDP for Unified Security

BigQuery’s Shared Encryption Functions, in collaboration with Cloud Sensitive Data Protection (SDP), offer unified security by enabling the encryption and decryption of sensitive data directly within queries. This feature simplifies the management of encrypted data, ensuring it remains secure during processing without compromising performance.

IAM Conditional Policies for Granular Access Control

BigQuery’s IAM Conditional Policies offer precise control over who can access what and when. Permissions are granted based on specific conditions, like time constraints or operational contexts. This enhances security by allowing suitable access rules for projects, datasets, tables, and other resources, ensuring users have access exactly when needed.

IAM Deny Policies for Stronger Security Enforcement

BigQuery’s IAM Deny Policies strengthen security by explicitly blocking access to specific actions or resources, regardless of other permissions that may have been granted. This feature allows organizations to enforce stricter security measures, ensuring critical resources are protected by preventing unauthorized actions, even if users have broad permissions in other system areas.

Short Query Optimized Mode

BigQuery introduces short query optimized mode, which delivers lower latency for small queries. This feature enhances performance for simple queries by reducing execution time, making it faster to retrieve results from smaller datasets and enabling more efficient use of BigQuery resources.

Table Explorer in BigQuery

The new Table Explorer feature in BigQuery allows users to easily examine table data and create exploration queries. This intuitive interface improves data discovery and query generation, making it simpler for users to interact with their data without needing complex SQL commands.

New IAM Deny Policies

IAM deny policies now support additional permissions, including bigquery.tables.getData, which can block table reads. Administrators should be cautious when applying these policies as they may restrict essential access. Use this feature carefully to control table access in special cases.

Removal of bigquery.datasets.update Permission Check

Starting September 17, 2024, the bigquery.datasets.update permission check will be removed for authorized datasets. This change simplifies access control by eliminating an extra permission check, reducing administrative complexity for authorized users while maintaining security for sensitive datasets.

Secure SAP Datasphere Connections with Network Attachments

You can now configure SAP Datasphere connections using network attachments for enhanced security. This feature, currently in preview, allows users to securely link SAP Datasphere with BigQuery, ensuring a protected data flow through trusted networks and improving enterprise data governance.

CHANGES Change History Function

BigQuery introduces the new CHANGES function, which provides a history of table changes over a defined time window. This feature allows users to track data modifications, offering greater transparency and accountability for auditing and data analysis purposes.

Secure AWS S3 and Azure Blob Access with BigQuery Omni VPC Allowlists

BigQuery Omni now supports VPC allowlists, enabling restricted access to AWS S3 and Azure Blob Storage from specific VPCs. This feature, in preview, enhances security by ensuring that only authorized networks can access cloud storage, protecting data from unauthorized access.

Cloud Console Updates

Users can now drag a tab in the details pane to open a new column and compare multiple tabs side by side. Tabs can also be moved within the current or adjacent columns, improving navigation and workflow efficiency. Cloud Console Updates feature is currently in preview.

BigQuery Managed Disaster Recovery

BigQuery Managed Disaster Recovery offers failover and redundant compute capacity, ensuring that businesses using the Enterprise Plus edition have high availability and disaster recovery options in place. This feature protects against unexpected failures by automatically providing backup resources.

Subqueries in Row-Level Access Policies

BigQuery now supports subqueries in row-level access policies, allowing for more complex access control based on dynamic conditions. This feature enhances security and flexibility by enabling administrators to define access rules using more sophisticated queries to filter sensitive data.

Undelete a Dataset Within the Time Travel Window

You can now undelete a dataset within your time travel window to recover it to its previous state before deletion. This feature provides a safety net for accidental deletions, improving data recovery capabilities without incurring additional downtime or loss of data.

Change History in INFORMATION_SCHEMA Views

New INFORMATION_SCHEMA views display the history of configuration changes at both the organization and project levels. The ORGANIZATION_OPTIONS_CHANGES and PROJECT_OPTIONS_CHANGES views offer detailed insights into the changes made to configuration settings, enhancing auditing and compliance tracking.

Tags on BigQuery Tables for IAM Access Control

You can now use tags on BigQuery tables to conditionally grant or deny access using IAM policies. This feature enhances security management by allowing fine-grained access control based on specific tags, ensuring that data governance policies are followed.

BigQuery Slot Estimator for On-Demand Workloads

The slot estimator in BigQuery supports cost-optimal commitment and autoscale recommendations for on-demand workloads. This tool analyzes historical usage to recommend the best configuration for slot purchases, helping organizations optimize their costs while meeting performance needs.

BigQuery Cross-Region Table Copying

BigQuery now allows users to copy tables across different regions, enabling efficient data movement and replication across geographic locations. This feature supports global data operations, making it easier to ensure data availability and redundancy in multi-region environments.

DML

You can now use data manipulation language (DML) to modify rows that have been recently written by the Storage Write API. This feature, in preview, allows for real-time data updates and ensures that recent data can be adjusted without additional batch processing.

BigQuery Administrative Resource Charts

BigQuery introduces new administrative resource charts that allow users to view resource utilization at the project level. These charts provide insights into slot usage, job concurrency, and overall system performance, helping administrators monitor and optimize their BigQuery resources.

Custom Data Masking

Custom data masking now supports an expanded list of functions, including SHA hash functions with salt. This feature, currently in preview, enhances data security by providing more customizable options for masking sensitive information within datasets, protecting privacy while allowing analysis.

Cloud Console Updates

New features in preview for the Cloud console include a recently accessed resources section and query result visualizations in the Chart tab. These updates make it easier to access frequently used resources and visualize query results without needing external tools.

Slot Estimator Enhancements

The slot estimator has been enhanced to offer cost-optimal commitment and autoscale suggestions based on editions pricing and past performance metrics. These improvements help organizations make better decisions when purchasing slots, reducing costs while maintaining optimal performance.

BigQuery Partitioning and Clustering Recommender

The partitioning and clustering recommender identifies opportunities to apply partitioning or clustering to tables, potentially saving costs and improving query performance. This feature analyzes table usage patterns and provides actionable recommendations to optimize data organization and efficiency.

Check out New Migrations & Data Pipelines Capabilities

BigQuery’s latest updates introduce powerful new migrations and data pipeline features designed to simplify cross-platform data integration and streamline data transformations. These capabilities enhance the efficiency of data transfers, support seamless migration from other platforms, and optimize pipeline management.

Current Tools for Migrations & Data Pipelines

BigQuery offers robust tools designed to make data transformation and integration more seamless. These tools are essential for efficient data ingestion, streamlined pipeline management, and smoother platform transitions.

Dataform for Seamless Data Transformation

Dataform is a powerful tool that simplifies data transformation within the ELT (Extract, Load, Transform) process by enabling declarative SQL workflows in BigQuery. It supports version control, automated quality testing, and Git integration, which helps you manage dependencies and schedule executions efficiently.

Ingest Data from Azure Blob Storage for Cross-Platform Integration

BigQuery facilitates secure data ingestion from Azure Blob Storage, enabling seamless platform integration. By utilizing Azure federated identity, administrators can grant a Google service account access to Azure applications, ensuring secure and efficient data access.

This feature significantly enhances the ability of data analysts to query and manage data across different cloud environments.

Debugging in Interactive SQL Translator for Smoother Migrations

BigQuery’s Interactive SQL Translator has advanced debugging tools that make migrations smoother by translating legacy SQL code into BigQuery SQL. It offers step-by-step guidance, helping you identify and resolve potential issues, simplifying the migration process, and reducing errors.

Incremental Data Transfers in BigQuery Data Transfer Service

The BigQuery Data Transfer Service now enables incremental data transfers, allowing users to migrate only updated or new data from Teradata data warehouses to BigQuery. This feature improves efficiency by reducing the volume of data transferred, cutting down migration time with large-scale data migrations.

Federated Queries for AlloyDB

BigQuery now supports federated queries for AlloyDB, enabling seamless querying of data stored in AlloyDB without the need for data movement or duplication. This feature allows users to run SQL queries across both BigQuery and AlloyDB, simplifying data analysis and integration for hybrid cloud environments.

EXPORT DATA to Bigtable

BigQuery now allows direct data exports to Bigtable using the EXPORT DATA statement, enabling reverse ETL workflows. This feature simplifies data integration by allowing you to seamlessly export processed data from BigQuery into Bigtable, which is ideal within Bigtable environments.

BigQuery Migration Assessment Features

BigQuery migration assessment features are now generally available, providing automatic dataset creation for storing assessment results. Users can monitor progress with partial reports, view estimated completion times, and access detailed information or errors through the assessment details page for improved migration transparency.

Redshift Migration Assessment

The BigQuery migration assessment for Amazon Redshift helps evaluate the complexity of migrating data from Redshift to BigQuery. It provides insights into migration readiness, identifies potential challenges, and streamlines the transition process for smoother data warehouse migration to BigQuery.

Translation Configurations in BigQuery Interactive SQL Translator

The BigQuery interactive SQL translator now allows you to specify translation configurations, offering enhanced control over SQL translation processes. You can also use it to debug batch SQL translator jobs, improving accuracy and efficiency during SQL migration and transformation tasks.

BigQuery Data Transfer Service for Product Targeting Report

The BigQuery Data Transfer Service now supports transferring the Product Targeting report from Google Merchant Center. This allows businesses to seamlessly import product targeting data into BigQuery for deeper analysis, helping optimize marketing strategies and product performance tracking across platforms.

Salesforce Data Cloud Data to BigQuery

BigQuery now supports integration with Salesforce Data Cloud, allowing you to add Salesforce Data Cloud data directly into BigQuery for comprehensive analysis. This integration streamlines data workflows, enabling businesses to unify customer data and gain deeper insights across both platforms.

Federated Workforce Identities for Data Transfers

The BigQuery Data Transfer Service now supports federated workforce identities, enabling secure and seamless data transfers from most data sources. This feature simplifies authentication, allowing organizations to use existing workforce identity solutions for enhanced security and streamlined data transfer processes.

Apache Hive Connector

The Apache Hive connector for BigQuery is now generally available, facilitating seamless migration of data analytics pipelines. This connector allows users to directly access and query data in Apache Hive from BigQuery, simplifying analytics workflows and enhancing data pipeline integration.

YAML Files to Transform SQL Code

BigQuery now supports using YAML configuration files to transform SQL code during SQL query translation from source databases. These files can be used with the batch SQL translator, interactive SQL translator, and batch translation Python client, enhancing flexibility in SQL migrations.

Data Migrations & Pipelines Features in Preview

BigQuery is rolling out preview features to enhance data migrations and pipelines. It offers advanced tools that improve media integration, streamline advertising analytics, and support scalable storage solutions.

Display & Video 360 Transfers for Seamless Media Integration

The Display & Video 360 Transfers feature in BigQuery automates the integration of campaign data from Display & Video 360 into BigQuery. This seamless process makes it easier to analyze advertising performance, offering deeper insights and enabling more effective optimization of digital media campaigns.

Search Ads 360 Transfers for Enhanced Advertising Analytics

BigQuery’s Search Ads 360 Transfers feature simplifies the integration of search campaign data from Search Ads 360 into BigQuery. Automating this process allows for a more comprehensive analysis of search ad performance, providing deeper insights and enabling the fine-tuning of advertising strategies directly within BigQuery.

Export Data to Bigtable for Scalable Storage Solutions

With the Export to Bigtable feature, BigQuery makes it easy to move large datasets from BigQuery to Bigtable. This capability supports scalable storage solutions, making it ideal for managing high-throughput, low-latency workloads where fast access to large volumes of structured data is crucial across multiple applications.

Migration Assessment for Apache Hive

BigQuery’s Migration Assessment for Apache Hive helps you plan and evaluate the transition of your existing Hive data warehouse to BigQuery. This tool generates a detailed report that assesses storage costs, identifies cost-saving opportunities, and provides a comprehensive migration plan, ensuring a smooth and efficient migration of your workloads to BigQuery.

Migration Assessment for Snowflake

Similarly, BigQuery’s Migration Assessment for Snowflake assists in planning and executing the migration of your Snowflake data warehouse to BigQuery. The assessment produces reports estimating storage costs, identifying potential savings, and outlining a detailed migration plan, facilitating a seamless and efficient transfer of your data workloads to BigQuery.

Configuration YAML Files for SQL Query Translation

In preview, BigQuery allows using configuration YAML files to optimize and enhance the performance of translated SQL queries from your source database. This feature offers greater control over SQL translations, enabling fine-tuning for improved efficiency and execution in BigQuery.

Migration Assessment for Amazon Redshift, Teradata, or Snowflake

The Migration Assessment for Amazon Redshift, Teradata, or Snowflake now generates a dataset containing only highly aggregated assessment results, with no query logs or sensitive information. This ensures privacy, allowing you to securely share assessment insights with users outside your project while maintaining data integrity.

EXPORT DATA statements for Reverse ETL to Spanner

You can now utilize EXPORT DATA statements to perform reverse ETL, transferring BigQuery data directly to Spanner. This capability enhances data workflows by enabling seamless export of processed data into Spanner for operational use without complex integrations.

SQL Translation Tools for Multiple SQL Dialects

BigQuery's SQL translation tools, including the interactive SQL translator, translation API, and batch SQL translator, now support translating SQL dialects such as IBM DB2 SQL, Greenplum SQL, and SQLite into GoogleSQL. These tools streamline the migration process for various database systems.

Gemini-enhanced Translation Rules

BigQuery now allows you to create Gemini-enhanced translation rules for the interactive SQL translator, enabling customized SQL translations to fit your migration requirements. These rules provide flexibility and control, improving the accuracy and efficiency of SQL migrations across databases.

Federated Queries for SAP Datasphere

BigQuery now supports federated queries for SAP Datasphere, allowing you to query data directly from SAP Datasphere without transferring it to BigQuery. This enhances data analysis and integration capabilities, streamlining workflows for hybrid cloud environments and SAP systems.

Integrate Data from Facebook Ads, Oracle, Salesforce, and More into BigQuery

The BigQuery Data Transfer Service now supports data transfers from Facebook Ads, Oracle, Salesforce, Salesforce Marketing Cloud, and ServiceNow. This capability allows seamless integration of data from these sources into BigQuery, enabling comprehensive analysis across platforms and improving data-driven decision-making processes.

BigQuery Migration Assessment for Apache Hive

BigQuery Migration Assessment for Apache Hive is now available in preview. This feature allows you to evaluate the complexity of migrating data from Apache Hive to BigQuery. It helps identify potential challenges and simplifies planning for a smoother, more efficient data migration process.

Transfer Display & Video 360 Campaign Data to BigQuery

The BigQuery Data Transfer Service now supports transferring campaign reporting and configuration data from Display & Video 360 into BigQuery. This integration enables comprehensive analysis of campaign performance, allowing for more informed decision-making and optimization of digital advertising strategies across platforms.

Filter Data with INSERT INTO SELECT Statement

The INSERT INTO SELECT statement now allows you to filter data from files stored in Amazon S3 and Azure Blob Storage and append it directly into BigQuery tables. This feature streamlines data ingestion and transformation from external cloud storage sources into BigQuery.

Create Tables Using CREATE TABLE AS SELECT

The CREATE TABLE AS SELECT statement now allows you to filter data from files stored in Amazon S3 and Azure Blob Storage, transferring the filtered results directly into BigQuery tables for streamlined data processing and integration.

Uncover BigQuery Insights with OWOX Reports Extension for Google Sheets

The OWOX Reports Extension for Google Sheets is a handy tool that integrates BigQuery data into your spreadsheets, making data analysis more accessible and efficient. With this extension, data analysts can pull data directly from BigQuery into Google Sheets, allowing for real-time analysis, reporting, and visualization - all without the need to switch between platforms.

The OWOX Reports Extension also facilitates collaboration by allowing multiple team members to work on the same Google Sheet simultaneously, boosting transparency and teamwork. Its intuitive interface lets users easily manipulate and analyze BigQuery data, even if they are unfamiliar with SQL.

Exploring New BigQuery Capabilities for 2025: What’s Available and What’s Next

Overview of BigQuery's Latest Enhancements and Updates

Exploring New Lakehouse Foundation Capabilities

New and Generally Available Lakehouse Foundation Features

Materialized Views for BigLake Metadata Cache-enabled Tables

Enhanced JSON Functions for Improved Data Handling

Manifest Files for External Table Integration

Query Execution Graph for Optimized Query Planning

External Query SQL Pushdown for Enhanced Performance

Recursive CTEs for Complex Query Construction

Cross-user Result Cache for Faster Query Responses

High Cardinality Join Insights for Better Data Relationships

Subscriber Usage Metrics for Analytics Hub Optimization

Primary and Foreign Key Table Constraints for Data Integrity

Object Tables for Structured Data Management

Preferred Tables for BI Engine Performance

Case-insensitive Collation for Consistent Data Matching

Default Values on Columns for Simplified Data Entry

Delta Lake BigLake Tables

RANGE Data Type JSON Encoding

Manifest File Support for Amazon S3 and Azure Blob Storage

Analytics Hub Features

Collation Support

Analytics Hub Data Egress

AWS Glue Federated Datasets Creation in Google Cloud Console

Analytics Hub Subscription Management

Analytics Hub Provider Usage Metrics

AWS Glue Federated Dataset Connection

Quantified LIKE Operator

BigQuery Studio General Availability

INFORMATION_SCHEMA.WRITE_API_TIMELINE Views for BigQuery Storage Write API Statistics

BQ.REFRESH_EXTERNAL_METADATA_CACHE Procedure for Selective Metadata Refresh of BigLake Tables

BigQuery Cross-Cloud Features for Materialized Views and Cross-Cloud Joins

High Cardinality Joins

Materialized Views

BigQuery JSON Functions

Manifest Files for External Tables

Analytics Hub Shared Datasets Metrics Tracking

GRANT/REVOKE Access to Materialized Views

Primary and Foreign Key Constraints

Metadata Caching for BigLake Tables

Metadata Caching for BigLake Tables

Apache Iceberg Tables

BigLake Metastore

BigQuery Query Execution Graph

BigQuery Object Tables

Custom Dual-Regions for Cloud Storage

Case-Insensitive Collation

Case-Insensitive Dataset and Tables

BigQuery Recursive CTEs

Upcoming Lakehouse Foundation Innovations

BigQuery Studio for Integrated Development and Management

Partition Skew Insights for Optimized Query Performance

Stored Procedures for Apache Spark Integration

AWS Glue Federated Dataset for Enhanced Data Connectivity

Metadata Caching for Amazon S3 Tables

Data Clean Rooms for Secure Data Collaboration

Quantitative LIKE Operator for Precise Query Filtering

Expanded Query Support for Search Indexes

SQL Grouping Sets (Cube, Rollup) for Advanced Data Aggregation

Text Analysis Configurations for Improved Search Capabilities

Native Integration for Looker Studio

Subscription Management in Analytics Hub for Simplified Service Oversight

Routines in Analytics Hub for Optimized Workflows

Restrict Data Egress in Analytics Hub for Enhanced Security

Non-incremental Materialized Views for Efficient Data Refresh

Unicode Column Naming for Universal Data Compatibility

GROUP BY and SELECT DISTINCT with ARRAY and STRUCT

JSON_KEYS Function

Materialized Views on Apache Iceberg Table

Search Index Optimization for INT64 and TIMESTAMP Data Types

Reference Iceberg Tables in Materialized Views

BigQuery Materialized View Recommender

Incremental Materialized Views

Support for Delta Lake Format for Amazon S3 and Azure Tables

Cross-Cloud Joins Between Google Cloud and BigQuery Omni Regions

BigQuery Cross-Cloud Materialized Views

Query Performance Insights for Partition Skew

Apache Spark Stored Procedures

Grouping Sets