Modifying Tables in BigQuery: A 2025 DDL Guide

Alyona Samovar

Senior Digital Analyst

July 22, 2024

DDL (Data Definition Language) statements in BigQuery are crucial for defining, creating, and modifying tables within a data warehouse environment. These commands let users manage the schema directly; most of them change structure or metadata rather than the stored rows, although some, such as DROP COLUMN, do remove data.

Using DDL, users can alter table structures, precisely controlling how data is stored, accessed, and managed. This ability to change table structures is crucial for maintaining data integrity and optimizing database performance as organizational data needs evolve.

This article specifically focuses on the ALTER DDL statement, detailing the methods for modifying existing tables. For those interested in learning how to create new BigQuery tables or delete existing ones, we recommend reading our article covering CREATE and DROP DDL statements, which provides comprehensive coverage of these functions.

What are DDL Statements in BigQuery?

Data Definition Language (DDL) statements are vital for defining and managing the structure of database objects in BigQuery. These commands enable you to create, modify, and delete schemas, tables, views, and more. Efficient use of DDL statements helps maintain organized, scalable, and high-performing databases. Whether you are a data engineer, SQL developer, or data analyst, mastering DDL statements is essential for optimizing data workflows and ensuring data integrity.

Modifying Tables with BigQuery ALTER Statements

BigQuery's ALTER statements are essential for changing the structure and properties of existing tables without disrupting the overall database. These statements allow you to add or drop columns, rename tables, change column data types, and set various table options.

ALTER TABLE SET OPTIONS

The ALTER TABLE SET OPTIONS statement in BigQuery modifies table properties such as description, expiration time, and labels without altering the table's structure. This capability is crucial for maintaining and managing table metadata, helping organizations keep their data organized and compliant with data governance policies.

Syntax of ALTER TABLE SET OPTIONS

ALTER TABLE [IF EXISTS] table_name
SET OPTIONS(table_set_options_list);

ALTER TABLE [IF EXISTS] table_name: Specifies the table to be altered. The optional [IF EXISTS] clause ensures that an error is not thrown if the table does not exist.
SET OPTIONS(table_set_options_list): Lists the options to be set or modified, such as description, expiration timestamp, labels, etc.

Example of ALTER TABLE SET OPTIONS

This example demonstrates how to update the description and add a label for a table named employee_records in BigQuery, ensuring the table exists before applying the changes.

ALTER TABLE IF EXISTS dataset.employee_records
SET OPTIONS (
 description = 'Employee records updated with recent hires',
 labels = [("department", "hr"), ("confidential", "true")]
);

Here:

ALTER TABLE IF EXISTS dataset.employee_records: Targets the employee_records table in the dataset and checks if it exists before applying any changes.
description = 'Employee records updated with recent hires': Updates the table's description to reflect recent changes in the data.
labels = [("department", "hr"), ("confidential", "true")]: Sets the table's labels for better classification and management, indicating the department responsible and the confidentiality status. Setting labels this way replaces any labels the table already had.
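
The same statement can also manage a table's lifetime. The sketch below assumes the dataset.employee_records table from the example above and uses the expiration_timestamp table option; the seven-day window is an arbitrary value chosen for illustration.

```sql
-- Sketch: expire the table seven days from now (the interval is an arbitrary example).
ALTER TABLE IF EXISTS dataset.employee_records
SET OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
);
```

Setting an option to NULL clears it, so expiration_timestamp = NULL should remove the expiration again.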

ALTER TABLE ADD COLUMN

The ALTER TABLE ADD COLUMN statement in SQL allows you to add one or more new columns to an existing table, enhancing the database's functionality without disrupting existing data. This command is crucial for adapting the database structure to evolving data requirements, offering flexibility and convenience.

Syntax of ALTER TABLE ADD COLUMN

ALTER TABLE table_name
ADD COLUMN [IF NOT EXISTS] column_name column_schema [, ...]

ALTER TABLE table_name: Specifies the table to which you want to add a column. Replace table_name with the name of your table.
ADD COLUMN [IF NOT EXISTS]: The ADD COLUMN clause tells the database to add a new column. The optional IF NOT EXISTS is a safety feature that prevents errors if the column already exists.
column_name: Replace column_name with the name you want to assign to the new column.
column_schema: Defines the data type and any options for the new column, such as INT64, STRING, or DATE.
[, ...]: Indicates that you can repeat the ADD COLUMN clause to add multiple columns in a single statement.

Example of ALTER TABLE ADD COLUMN

Let's consider a practical example of adding a column to an existing table. We'll use a table named Employees, which originally only held employee IDs and names. We want to add an email address and a hiring date.

ALTER TABLE dataset.Employees
ADD COLUMN email STRING,
ADD COLUMN hire_date DATE;

Here:

ALTER TABLE dataset.Employees: This indicates that we are modifying the Employees table in dataset.
ADD COLUMN email STRING: Adds a new column named email of type STRING, suitable for storing email addresses. STRING has no length limit unless you declare one, for example STRING(255).
ADD COLUMN hire_date DATE: Adds another column named hire_date that stores date values, representing the date an employee was hired.
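
A column definition can carry more than a plain type. The following is a minimal sketch, assuming the same dataset.Employees table; the phone and address columns are hypothetical and exist only to show IF NOT EXISTS, a column description, and a nested STRUCT type.

```sql
-- Sketch: phone and address are hypothetical columns added for illustration.
ALTER TABLE dataset.Employees
ADD COLUMN IF NOT EXISTS phone STRING OPTIONS (description = "Work phone number"),
ADD COLUMN IF NOT EXISTS address STRUCT<city STRING, postal_code STRING>;
```

Columns added this way are nullable, so existing rows simply show NULL for them until values are written.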

ALTER TABLE RENAME TO

The ALTER TABLE RENAME TO command in SQL is pivotal for renaming existing database tables, aligning table names with revised naming conventions, or clarifying the data they store. This command ensures clarity, aligns with new standards, and simplifies names for better recall.

Syntax of ALTER TABLE RENAME TO

ALTER TABLE [IF EXISTS] old_table_name
RENAME TO new_table_name;

ALTER TABLE [IF EXISTS]: This prefix indicates the start of the command to alter the table structure. The optional IF EXISTS is used to prevent errors if the specified table does not exist.
old_table_name: The current name of the table you wish to rename.
RENAME TO new_table_name: This part specifies the new name for the table.

Example of ALTER TABLE RENAME TO

In this example, we change the name of the table customer_details in dataset to updated_customer_records, illustrating how to update table names to reflect current data usage or organizational changes.

ALTER TABLE dataset.customer_details
RENAME TO updated_customer_records;

Here:

ALTER TABLE dataset.customer_details: Specifies that the table named customer_details within dataset is selected for renaming.
RENAME TO updated_customer_records: The table’s name will be changed to updated_customer_records, which may reflect a more current or specific use of the data it holds.

ALTER TABLE DROP COLUMN

The ALTER TABLE DROP COLUMN command is used in SQL to remove one or more columns from an existing table. This functionality is crucial for maintaining and optimizing the database structure. The operation should be used with caution to avoid unintentionally losing data that may still have relevance or be needed for compliance purposes.

Syntax of ALTER TABLE DROP COLUMN

ALTER TABLE table_name
DROP COLUMN [IF EXISTS] column_name [, ...]

ALTER TABLE table_name: Specifies the table from which columns will be dropped. Replace table_name with the actual name of the table.
DROP COLUMN: This command indicates that one or more columns will be removed from the table.
[IF EXISTS]: An optional clause that prevents errors by only dropping the column if it exists in the table.
column_name: The name of the column to be dropped. To drop more than one column at a time, repeat the DROP COLUMN clause for each column, separated by commas.

Example of ALTER TABLE DROP COLUMN

This example demonstrates removing columns from a table in a database using the ALTER TABLE DROP COLUMN command. It specifically focuses on removing the Name and Location columns from the all_users_table in mydataset.

ALTER TABLE mydataset.all_users_table
DROP COLUMN Name,
DROP COLUMN IF EXISTS Location;

Here:

ALTER TABLE mydataset.all_users_table: Specifies the all_users_table in the mydataset dataset that is being modified.
DROP COLUMN Name: Removes the Name column from the table. It does not include the IF EXISTS clause, so the command will error out if the column does not exist.
DROP COLUMN IF EXISTS Location: Removes the Location column only if it exists, preventing any errors if the column is not found in the table.
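
Because dropped columns take their data with them, one cautious pattern is to copy the table first. This is only a sketch: all_users_table_backup is a hypothetical name, and it assumes the CREATE TABLE ... COPY form of BigQuery DDL is acceptable for your storage budget.

```sql
-- Sketch: all_users_table_backup is a hypothetical name for the safety copy.
CREATE TABLE mydataset.all_users_table_backup
COPY mydataset.all_users_table;

-- Drop the columns only after the copy exists.
ALTER TABLE mydataset.all_users_table
DROP COLUMN IF EXISTS Name,
DROP COLUMN IF EXISTS Location;
```

Once the change has been verified, the backup table can be removed with DROP TABLE.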

ALTER COLUMN SET OPTIONS

The ALTER COLUMN SET OPTIONS command in SQL modifies the properties of a column in an existing table without altering its data type or structure. This functionality is particularly useful for updating metadata, such as a column's description, to improve the clarity of the database schema.

Syntax of ALTER COLUMN SET OPTIONS

ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column_name 
SET OPTIONS(column_set_options_list)

ALTER TABLE [IF EXISTS] table_name: Specifies the table containing the column to be altered. The optional IF EXISTS clause ensures that the command only executes if the specified table exists, avoiding errors.
ALTER COLUMN [IF EXISTS] column_name: Targets the specific column to be modified. Including IF EXISTS is optional but recommended to prevent errors if the column does not exist.
SET OPTIONS(column_set_options_list): Defines the new options for the column, such as its description. column_set_options_list should contain key-value pairs that specify these options.

Example of ALTER COLUMN SET OPTIONS

This example shows how to use the ALTER COLUMN SET OPTIONS command to update the metadata of a column named quantity in the all_products_table within mydataset by setting a new description.

ALTER TABLE mydataset.all_products_table
ALTER COLUMN quantity
SET OPTIONS (
 description="Quantity in stock"
)

Here:

ALTER TABLE mydataset.all_products_table: Indicates the all_products_table in the mydataset dataset that is being modified.
ALTER COLUMN quantity: Specifies the quantity column for which the options are being set.
SET OPTIONS(description="Quantity in stock"): Updates the description option for the quantity column to "Quantity in stock," providing clearer information about what the column represents.
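
To confirm that the description was applied, the dataset's metadata views can be queried. The sketch below assumes the same mydataset.all_products_table and reads the INFORMATION_SCHEMA.COLUMN_FIELD_PATHS view, which exposes column descriptions.

```sql
-- Sketch: read back the description of the quantity column.
SELECT column_name, data_type, description
FROM mydataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE table_name = 'all_products_table'
  AND column_name = 'quantity';
```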

ALTER COLUMN DROP NOT NULL

The ALTER COLUMN DROP NOT NULL command in SQL is used to modify a column in a table to accept null values, useful for adjusting the database schema to accommodate changes in data requirements or application logic. By allowing null values, this command provides flexibility in handling data that may be optional or currently unavailable.

Syntax of ALTER COLUMN DROP NOT NULL

ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column_name DROP NOT NULL

ALTER TABLE [IF EXISTS] table_name: Specifies the table that contains the column from which the NOT NULL constraint will be removed. The IF EXISTS clause is optional and helps to avoid errors if the specified table does not exist.
ALTER COLUMN [IF EXISTS] column_name: Identifies the column that will be altered. The IF EXISTS clause is optional and prevents errors if the column does not exist.
DROP NOT NULL: This command removes the NOT NULL constraint from the specified column, allowing it to store null values.

Example of ALTER COLUMN DROP NOT NULL

This example demonstrates how to use the ALTER COLUMN DROP NOT NULL command to modify the user_status column in the user_accounts_table within mydataset by removing its NOT NULL constraint.

ALTER TABLE mydataset.user_accounts_table
ALTER COLUMN user_status
DROP NOT NULL

Here:

ALTER TABLE mydataset.user_accounts_table: Specifies the user_accounts_table within the mydataset dataset that is being modified.
ALTER COLUMN user_status: Targets the user_status column for modification.
DROP NOT NULL: Removes the NOT NULL constraint from the user_status column, allowing the column to accept null values, which can be useful for handling users whose status might not be determined at the time of data entry.
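
A quick way to check the result is to look at the column's nullability in the metadata. This sketch assumes the same mydataset.user_accounts_table; after the constraint is dropped, is_nullable is expected to read YES for user_status.

```sql
-- Sketch: inspect nullability of the user_status column.
SELECT column_name, data_type, is_nullable
FROM mydataset.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'user_accounts_table'
  AND column_name = 'user_status';
```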

ALTER COLUMN SET DATA TYPE

The ALTER COLUMN SET DATA TYPE command in SQL is designed to change the data type of an existing column within a table. This is particularly useful when the needs of your data storage evolve, requiring columns to handle different types or sizes of data.

Syntax of ALTER COLUMN SET DATA TYPE

ALTER TABLE [IF EXISTS] table_name
ALTER COLUMN [IF EXISTS] column_name 
SET DATA TYPE data_type

ALTER TABLE [IF EXISTS] table_name: Specifies the table containing the column whose data type is to be altered. The IF EXISTS is optional and ensures that the operation only proceeds if the table exists, preventing errors.
ALTER COLUMN [IF EXISTS] column_name: Identifies the column to be altered. The IF EXISTS clause is optional but useful to avoid errors if the column does not exist.
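SET DATA TYPE data_type: Assigns a new data type to the selected column. The data_type should be a valid BigQuery data type like INT64, FLOAT64, etc. BigQuery supports only certain conversions here, such as INT64 to FLOAT64 or NUMERIC, or widening a parameterized type.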

Example of ALTER COLUMN SET DATA TYPE

This example demonstrates how to change the data type of the price column in the product_data table from an integer to a float, which allows the column to hold fractional prices.

CREATE TABLE mydataset.product_data(price INT64);
ALTER TABLE mydataset.product_data 
ALTER COLUMN price SET DATA TYPE FLOAT64;

Here:

CREATE TABLE mydataset.product_data(price INT64);: Initially creates a table named product_data in mydataset with a column price of type INT64.
ALTER TABLE mydataset.product_data ALTER COLUMN price SET DATA TYPE FLOAT64;: Changes the data type of the price column to FLOAT64 to accommodate decimal values.
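
Two other conversions of the same kind are sketched below. The product_codes table and its columns are hypothetical; the example assumes that widening a parameterized STRING and coercing INT64 to NUMERIC are among the conversions BigQuery accepts for this statement.

```sql
-- Sketch: product_codes and its columns are hypothetical.
CREATE TABLE mydataset.product_codes (
  code STRING(10),
  units INT64
);

-- Widen the maximum string length from 10 to 20 characters.
ALTER TABLE mydataset.product_codes
ALTER COLUMN code SET DATA TYPE STRING(20);

-- Coerce INT64 to NUMERIC.
ALTER TABLE mydataset.product_codes
ALTER COLUMN units SET DATA TYPE NUMERIC;
```

Conversions that narrow a type, such as STRING to FLOAT64, are not handled by this statement and need an approach based on CAST instead.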

ALTER TABLE RENAME COLUMN

The ALTER TABLE RENAME COLUMN command is used in SQL to rename one or more columns within an existing table. This capability is particularly useful when you need to make column names more descriptive or align them with changes in database design and data usage.

Syntax of ALTER TABLE RENAME COLUMN

ALTER TABLE [IF EXISTS] table_name
RENAME COLUMN [IF EXISTS] column_to_column[, ...]
column_to_column :=
 column_name TO new_column_name

ALTER TABLE [IF EXISTS] table_name: Specifies the table in which columns are to be renamed. The IF EXISTS is optional and ensures the command executes only if the specified table exists.
RENAME COLUMN [IF EXISTS] column_to_column: This command directs the renaming of one or more columns. The optional IF EXISTS for each column ensures that the command only attempts to rename columns that exist.
column_name TO new_column_name: This is the specific syntax to rename a column, where column_name is the existing name and new_column_name is the new name to be assigned.

Example of ALTER TABLE RENAME COLUMN

This example demonstrates renaming multiple columns in the employee_records table to make the column names more intuitive.

ALTER TABLE mydataset.employee_records
 RENAME COLUMN EmpID TO EmployeeID,
 RENAME COLUMN IF EXISTS Dept TO Department

Here:

ALTER TABLE mydataset.employee_records: Specifies the table employee_records in mydataset that is being modified.
RENAME COLUMN EmpID TO EmployeeID: Renames the column EmpID to EmployeeID.
RENAME COLUMN IF EXISTS Dept TO Department: Renames the column Dept to Department, only if Dept exists.

ALTER COLUMN SET DEFAULT

The ALTER COLUMN SET DEFAULT command defines or changes the default value of a column within an existing table. This functionality is vital for ensuring that new records have a predefined value when no value is specified during data insertion, helping maintain data integrity and consistency.

Syntax of ALTER COLUMN SET DEFAULT

ALTER TABLE [IF EXISTS] table_name 
ALTER COLUMN [IF EXISTS] column_name
SET DEFAULT default_expression;

ALTER TABLE [IF EXISTS] table_name: Specifies the table that contains the column for which the default value is being set. The optional IF EXISTS clause prevents errors by ensuring the table exists before attempting the alteration.
ALTER COLUMN [IF EXISTS] column_name: Indicates the specific column to be modified. The optional IF EXISTS is recommended to avoid errors if the specified column does not exist.
SET DEFAULT default_expression: Assigns a new default value to the column. The default_expression can be a constant value, an expression, or a function that SQL supports.

Example of ALTER COLUMN SET DEFAULT

This example demonstrates setting a new default value for the registration_date column in the user_data table, assigning the current date as the default value so that each new record is stamped with the date it was created.

ALTER TABLE mydataset.user_data
ALTER COLUMN registration_date
SET DEFAULT CURRENT_DATE();

Here:

ALTER TABLE mydataset.user_data: Specifies the user_data table within the mydataset that is being modified.
ALTER COLUMN registration_date: Targets the registration_date column for modification.
SET DEFAULT CURRENT_DATE(): Sets the default value for the registration_date column to the current date, ensuring that new entries without a specified registration date will automatically have the date of entry recorded.
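
The effect of a default shows up only when an insert omits the column. The sketch below assumes mydataset.user_data also has a user_id column (hypothetical, not named in the example above) and that any other columns are nullable; it then removes the default again with ALTER COLUMN DROP DEFAULT.

```sql
-- Sketch: user_id is a hypothetical column; registration_date is omitted
-- from the INSERT so that the default applies.
INSERT INTO mydataset.user_data (user_id)
VALUES (1001);

-- Remove the default again; later inserts that omit the column get NULL.
ALTER TABLE mydataset.user_data
ALTER COLUMN registration_date DROP DEFAULT;
```

Changing or dropping a default affects only rows inserted afterwards; existing rows keep the values they already have.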

ALTER SCHEMA SET DEFAULT COLLATE Statement

The ALTER SCHEMA SET DEFAULT COLLATE statement in SQL specifies or changes the default collation for a database schema. Collation determines how string comparison is performed within the database, influencing sorting and comparison operations based on linguistic rules.

Syntax of ALTER SCHEMA SET DEFAULT COLLATE Statement

ALTER SCHEMA [IF EXISTS]
[project_name.]dataset_name
SET DEFAULT COLLATE collate_specification

ALTER SCHEMA: Indicates that the command will modify the schema properties.
[IF EXISTS]: This optional clause ensures that the command will only execute if the specified schema exists, which helps avoid errors.
[project_name.]dataset_name: Specifies the dataset to modify, optionally including the project name if it is not the default or current project.
SET DEFAULT COLLATE: This command sets the default collation for the dataset.
collate_specification: Specifies the collation setting to be applied. This defines how string comparisons are performed in the dataset.

Example of ALTER SCHEMA SET DEFAULT COLLATE Statement

This example shows how to set the default collation for the mydataset dataset to 'und:ci', so that string comparisons and sorting in tables created afterwards are case-insensitive by default.

ALTER SCHEMA IF EXISTS
mydataset
SET DEFAULT COLLATE 'und:ci'

Here:

ALTER SCHEMA IF EXISTS mydataset: This part of the statement is used to modify an existing schema (in BigQuery, a dataset) named mydataset. The IF EXISTS clause ensures that the operation only proceeds if the specified schema (mydataset) exists. If it doesn't exist, the statement won't cause an error.
SET DEFAULT COLLATE 'und:ci': The SET DEFAULT COLLATE clause specifies the default collation (string sorting and comparison rules) for the schema. In this case, the collation is set to 'und:ci'. 'und' stands for an undetermined locale, meaning no language-specific rules are applied. ':ci' indicates case-insensitive sorting and comparison rules.
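
To see what 'und:ci' changes, it can help to compare two literals with and without the collation. This is a minimal sketch using the COLLATE function; the column aliases are made up for illustration.

```sql
-- Sketch: compare the same pair of strings with and without 'und:ci'.
SELECT
  COLLATE('Hello', 'und:ci') = 'hello' AS equal_ignoring_case,
  'Hello' = 'hello' AS equal_by_default;
```

The first comparison should evaluate to true and the second to false, which is the behavior tables created in the dataset after this change pick up by default.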

ALTER SCHEMA SET OPTIONS Statement

The ALTER SCHEMA SET OPTIONS statement is an SQL command used to modify various settings of a database schema, such as default table expiration or case sensitivity. This flexibility allows database administrators to configure schemas according to specific operational requirements or data governance policies.

Syntax of ALTER SCHEMA SET OPTIONS Statement

ALTER SCHEMA [IF EXISTS]
[project_name.]dataset_name
SET OPTIONS(schema_set_options_list)

ALTER SCHEMA: This initiates the command to modify schema settings.
[IF EXISTS]: An optional clause that ensures the command only executes if the specified schema exists, thus preventing errors.
[project_name.]dataset_name: Specifies the dataset to be modified, with an optional project name if not using the default project.
SET OPTIONS: Indicates that options for the schema are being set or modified.
schema_set_options_list: A list of key-value pairs that specify the options to be updated. These options can include settings like default table expiration days, case sensitivity, and more.

Example of ALTER SCHEMA SET OPTIONS Statement

This example demonstrates how to set the default table expiration for the mydataset dataset to 4 days, which helps manage data lifecycle by automatically deleting tables after the specified period.

ALTER SCHEMA mydataset
SET OPTIONS(
default_table_expiration_days=4
)

Here:

ALTER SCHEMA mydataset: Specifies the mydataset dataset for modification.
SET OPTIONS(default_table_expiration_days=4): Sets the default expiration for tables in the dataset to 4 days, automating data retention policies.
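
Several options can be changed in one statement, and NULL clears an option. The sketch below assumes the same mydataset dataset; the description text and the label are hypothetical values.

```sql
-- Sketch: set a description and a label, and clear the default table expiration.
ALTER SCHEMA IF EXISTS mydataset
SET OPTIONS (
  description = "Datasets for HR reporting",
  labels = [("team", "hr")],
  default_table_expiration_days = NULL
);
```

The default expiration applies to tables created after it is set, so clearing it does not change the expiration already stamped on existing tables.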

ALTER SCHEMA ADD REPLICA Statement

The ALTER SCHEMA ADD REPLICA statement in SQL is used to add a replica to an existing dataset, enhancing data availability and read performance across geographically dispersed locations. This command is vital for businesses operating on a global scale, as it ensures faster data access and increased resilience against regional outages.

Syntax of ALTER SCHEMA ADD REPLICA Statement

ALTER SCHEMA [IF EXISTS]
[project_name.]dataset_name
ADD REPLICA replica_name 
[OPTIONS(add_replica_options_list)]

ALTER SCHEMA: This initiates the command to modify the schema.
[IF EXISTS]: An optional clause that ensures the command only executes if the specified schema exists, preventing errors from non-existent schemas.
[project_name.]dataset_name: Specifies the dataset to which a replica will be added. The dataset can optionally include the project name if not within the default project.
ADD REPLICA replica_name: Directs the addition of a new replica with the given name.
[OPTIONS(add_replica_options_list)]: Optional list of key-value pairs that define settings for the replica, such as its geographic location.

Example of ALTER SCHEMA ADD REPLICA Statement

This example demonstrates adding a new replica named Asia to the global_data dataset within the enterprise_solutions project, specifying its location in Asia to improve data access speed and reliability in the region.

ALTER SCHEMA enterprise_solutions.global_data
ADD REPLICA `Asia` OPTIONS(location=`asia`);

Here:

ALTER SCHEMA enterprise_solutions.global_data: Specifies the global_data dataset within the enterprise_solutions project, indicating where the replica is to be added.
ADD REPLICA `Asia`: Adds a new replica named Asia to the dataset.
OPTIONS(location=`asia`): Sets the geographic location of the new replica to Asia, enhancing local read performance and availability.

ALTER SCHEMA DROP REPLICA Statement

The ALTER SCHEMA DROP REPLICA statement in SQL is crucial for removing a replica from an existing dataset, a process often needed when adjusting to changes in data storage strategy or cost management. Removing a replica can help in reducing overheads and focusing on regions that provide the most strategic value for data access and application performance.

Syntax of ALTER SCHEMA DROP REPLICA Statement

ALTER SCHEMA [IF EXISTS] dataset_name
DROP REPLICA replica_name

ALTER SCHEMA: Initiates the command to modify the schema.
[IF EXISTS]: An optional clause that ensures the command only executes if the specified dataset exists, preventing errors from attempting to modify a non-existent dataset.
dataset_name: Specifies the dataset from which the replica will be removed.
DROP REPLICA replica_name: Indicates the specific replica to be removed from the dataset.

Example of ALTER SCHEMA DROP REPLICA Statement

This example shows how to remove a replica named Asia from the mydataset dataset, adjusting the replication setup to better align with the current operational requirements and data usage patterns.

ALTER SCHEMA IF EXISTS mydataset
DROP REPLICA `Asia`

Here:

ALTER SCHEMA IF EXISTS mydataset: Confirms that changes will only be made if the mydataset dataset exists, ensuring the operation does not fail due to the dataset's absence.
DROP REPLICA `Asia`: Specifies the removal of the Asia replica from the mydataset dataset, effectively reducing the replication across regions and possibly optimizing related costs and management efforts.

ALTER ORGANIZATION SET OPTIONS Statement

The ALTER ORGANIZATION SET OPTIONS command in SQL is used to configure or update settings at an organizational level, impacting how various data operations are handled across all projects and services. This command is particularly useful for standardizing operational settings such as time zones, data handling policies, and query timeouts.

⚠️ Caution: This statement modifies critical settings that affect your entire organization. Ensure comprehensive understanding and consult with your cloud administrator before making changes to avoid unintended impacts across projects and resources.

Syntax of ALTER ORGANIZATION SET OPTIONS Statement

ALTER ORGANIZATION
SET OPTIONS (
 organization_set_options_list
);

ALTER ORGANIZATION: This command specifies that the modifications will be applied at the organization level.
SET OPTIONS: Indicates that options are being set or updated.
(organization_set_options_list): A list enclosed in parentheses that contains key-value pairs defining the settings to be applied. These options can manage default behaviors for time zones, timeouts, security settings, etc.

Example of ALTER ORGANIZATION SET OPTIONS Statement

This example demonstrates setting the default time zone to America/New_York and the default query job timeout to two hours for an organization, enhancing operational alignment with regional business hours and data processing needs. The region-east-us prefix is a placeholder for the location qualifier; substitute the qualifier for your own location, such as region-us.

ALTER ORGANIZATION
SET OPTIONS (
 `region-east-us.default_time_zone` = "America/New_York",
 `region-east-us.default_query_job_timeout_ms` = 7200000
);

Here:

ALTER ORGANIZATION: Modifies settings at the organization level.
`region-east-us.default_time_zone` = "America/New_York": Sets the default time zone for the East US region to America/New_York.
`region-east-us.default_query_job_timeout_ms` = 7200000: Sets the default query job timeout to 7200000 milliseconds (or two hours), which helps in managing longer-running queries effectively.

ALTER PROJECT SET OPTIONS Statement

The ALTER PROJECT SET OPTIONS command is a valuable SQL tool for configuring and customizing project-level settings in a database environment. This command allows administrators to define or adjust settings such as default time zones, query timeouts, and encryption keys, enabling more tailored and efficient project management.

⚠️ Caution: Using this statement can change project-level settings, affecting data management. Carefully review these changes to prevent potential issues with data handling.

Syntax of ALTER PROJECT SET OPTIONS Statement

ALTER PROJECT project_id
SET OPTIONS (project_set_options_list);

ALTER PROJECT: This command starts the modification process for project-level settings.
project_id: Specifies the unique identifier of the project whose options you want to alter.
SET OPTIONS: This clause is used to define the settings that need to be updated or adjusted.
project_set_options_list: A list of key-value pairs enclosed in parentheses. Each pair defines a specific project setting to be set or adjusted.

Example of ALTER PROJECT SET OPTIONS Statement

This example showcases how to reset various project-level settings to their default values (NULL) in project_apac, effectively clearing custom configurations like time zones and encryption keys to revert to default behaviors.

ALTER PROJECT project_apac
SET OPTIONS (
 `region-apac.default_time_zone` = NULL,
 `region-apac.default_kms_key_name` = NULL,
 `region-apac.default_query_job_timeout_ms` = NULL,
 `region-apac.default_interactive_query_queue_timeout_ms` = NULL,
 `region-apac.default_batch_query_queue_timeout_ms` = NULL);

Here:

ALTER PROJECT project_apac: Indicates that the options are being reset for the project_apac.
`region-apac.default_time_zone` = NULL: Clears the custom time zone setting, reverting to the system default.
`region-apac.default_kms_key_name` = NULL: Removes any custom encryption key settings.
`region-apac.default_query_job_timeout_ms` = NULL, etc.: Resets various timeout settings to their defaults, ensuring no custom timeouts interfere with the project's operations.
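
The same statement sets values as well as clearing them. In the sketch below, my-project and the region-us qualifier are placeholders, and the time zone and one-hour timeout are arbitrary example values.

```sql
-- Sketch: my-project and the region-us qualifier are placeholders.
ALTER PROJECT `my-project`
SET OPTIONS (
  `region-us.default_time_zone` = "America/Los_Angeles",
  `region-us.default_query_job_timeout_ms` = 3600000
);
```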

ALTER BI_CAPACITY SET OPTIONS Statement

The ALTER BI_CAPACITY SET OPTIONS command in SQL is designed to adjust the Business Intelligence (BI) capacity settings within a project. This includes modifying the allocated memory size and specifying preferred tables for BI operations.

⚠️ Caution: This statement adjusts computational resources for your BI projects. Improper configurations can severely impact query performance and cost efficiency. Ensure that these changes are in line with your performance expectations and budgetary needs before implementation.

Syntax of ALTER BI_CAPACITY SET OPTIONS Statement

ALTER BI_CAPACITY `project_id.location_id.default`
SET OPTIONS(bi_capacity_options_list)

ALTER BI_CAPACITY: Initiates the modification of BI capacity settings.
project_id.location_id.default: Specifies the BI capacity resource in a structured format, indicating the project ID, location ID, and typically the default BI capacity setting.
SET OPTIONS: This clause is used to define or modify the BI capacity options.
bi_capacity_options_list: A list of key-value pairs enclosed in parentheses that specify various settings such as size in gigabytes or preferred tables for BI operations.

Example of ALTER BI_CAPACITY SET OPTIONS Statement

This example demonstrates how to allocate 300 GB of BI Engine capacity to the default BI capacity setting in the your-project within the europe-west1 region, without specifying any preferred tables.

ALTER BI_CAPACITY `your-project.europe-west1.default`
SET OPTIONS(
 size_gb = 300
)

Here:

ALTER BI_CAPACITY: Specifies the BI capacity setting being modified.
your-project.europe-west1.default: Indicates the specific BI capacity setting in the your-project within the europe-west1 region.
size_gb = 300: Allocates 300 GB to the BI capacity, enhancing the available resources for BI processing.

Advanced Strategies for Table Management in BigQuery with ALTER

Advanced table management strategies in BigQuery involve leveraging ALTER statements to optimize data workflows and ensure efficient schema evolution. Let's explore these techniques:

Utilizing ALTER TABLE to Add Top-Level Fields

Using the ALTER TABLE command to add top-level fields in BigQuery allows for dynamic schema changes without disrupting existing data. This technique is useful for evolving table structures to accommodate new data requirements.

Example: Adding a Top-Level Field

Imagine we are tasked with adding a new field email of type STRING to the employee table in the mydataset dataset.

ALTER TABLE mydataset.employee
ADD COLUMN IF NOT EXISTS email STRING;

INSERT INTO mydataset.employee (id, name, email, department)
VALUES (101, "Alice", "alice@example.com", "HR");

Here:

ALTER TABLE mydataset.employee: Specifies the table to be altered.
ADD COLUMN IF NOT EXISTS: Ensures the new column is added only if it does not already exist.
email STRING: Defines the new column name and its data type.
INSERT INTO mydataset.employee (id, name, email, department): Specifies the columns for the new record.
VALUES (101, "Alice", "alice@example.com", "HR"): Provides the values for each column, including the newly added email field.

Modifying Data Types with CAST and ALTER COLUMN

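Combining CAST with ALTER TABLE statements allows for safe and efficient changes to data types in BigQuery. ALTER COLUMN SET DATA TYPE supports only a limited set of conversions, so other changes, such as STRING to FLOAT64, are made by adding a new column, copying the converted values into it, and swapping it in for the old one. This technique ensures data integrity while updating the schema to meet new requirements.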

Example: Changing Data Type

Imagine we need to change the data type of the salary field from STRING to FLOAT64 in the employee table within the mydataset dataset.

ALTER TABLE
mydataset.employee
ADD
COLUMN new_salary FLOAT64;

UPDATE
mydataset.employee
SET
new_salary = CAST(salary AS FLOAT64)
WHERE
salary IS NOT NULL;

ALTER TABLE
mydataset.employee
DROP COLUMN salary;

ALTER TABLE
mydataset.employee
RENAME COLUMN new_salary TO salary;

Here:

ALTER TABLE mydataset.employee ADD COLUMN new_salary FLOAT64: This statement adds a new column named new_salary to the employee table in the mydataset dataset. The data type of the new column is FLOAT64, a double-precision floating-point number.
UPDATE mydataset.employee SET new_salary = CAST(salary AS FLOAT64) WHERE salary IS NOT NULL: This statement updates the values in the new_salary column. It casts the existing salary values (assumed to be in a different data type) to FLOAT64. The update only occurs for rows where the salary is not null.
ALTER TABLE mydataset.employee DROP COLUMN salary: This statement removes the old salary column from the employee table.
ALTER TABLE mydataset.employee RENAME COLUMN new_salary TO salary: This statement renames the new_salary column to salary.

In summary, this sequence of SQL statements adds a new FLOAT64 column, copies the converted salary values into it, drops the old STRING column, and renames the new column to salary.
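
CAST raises an error if any salary string is not a valid number, which would make the UPDATE fail. A minimal pre-check, assuming the same mydataset.employee table, is to look for such values with SAFE_CAST before starting the migration:

```sql
-- Find salary strings that cannot be converted to FLOAT64.
-- SAFE_CAST returns NULL instead of raising an error on invalid input.
SELECT
  salary,
  COUNT(*) AS row_count
FROM
  mydataset.employee
WHERE
  salary IS NOT NULL
  AND SAFE_CAST(salary AS FLOAT64) IS NULL
GROUP BY
  salary
ORDER BY
  row_count DESC;
```

If this query returns rows, either clean those values first or use SAFE_CAST in the UPDATE statement and accept NULL for them.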

Replacing Tables with CREATE OR REPLACE TABLE

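The CREATE OR REPLACE TABLE statement in Google BigQuery replaces an existing table with a new definition, typically the result of a query, in a single step. This command is particularly useful when you need to modify a table’s schema or redefine its data without manually dropping and recreating the table. Because the table is rewritten in full, the query scans all of the source data.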

Example of Using CREATE OR REPLACE TABLE

The following example demonstrates how to use the CREATE OR REPLACE TABLE statement to update the structure of a table named sales in the dataset_name dataset. This example includes adding a new computed column for the total sales amount.

CREATE OR REPLACE TABLE dataset_name.sales AS
SELECT
    order_id,
    product_id,
    quantity,
    price_per_unit,
    quantity * price_per_unit AS total_sales
FROM
    dataset_name.sales;

Here:

CREATE OR REPLACE TABLE dataset_name.sales AS: This command replaces the existing sales table in the dataset_name dataset with a new definition.
SELECT order_id, product_id, quantity, price_per_unit: Selects the order_id, product_id, quantity, and price_per_unit columns from the existing table.
quantity * price_per_unit AS total_sales: Adds a new computed column total_sales that multiplies quantity by price_per_unit.
FROM dataset_name.sales: Specifies the source table from which the data is being selected and restructured.
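
Because the statement overwrites the table in place, it can be worth keeping a copy before running it. The following is a sketch under assumptions, not part of the original example: sales_backup is a hypothetical table name, and the restore step is shown commented out.

```sql
-- Keep a copy of the current table before it is replaced.
-- sales_backup is a hypothetical name used only for this sketch.
CREATE TABLE dataset_name.sales_backup
CLONE dataset_name.sales;

-- If the replaced table turns out to be wrong, restore it from the copy.
-- Run this only when a rollback is actually needed.
-- CREATE OR REPLACE TABLE dataset_name.sales AS
-- SELECT * FROM dataset_name.sales_backup;
```

Once the new structure has been verified, the backup can be removed with DROP TABLE.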

Best Practices for Using DDL ALTER Statements in BigQuery

Data Definition Language (DDL) statements are crucial for managing and defining the schema of database tables in BigQuery. Adhering to best practices when using ALTER statements keeps database operations smooth and avoids disrupting ongoing processes.

Incremental Schema Evolution

Use DDL statements to modify schemas incrementally as your data needs evolve. BigQuery supports adding columns to existing tables without downtime or data copying, which is particularly useful when gradually integrating new data sources or adapting to slowly changing data formats. This approach reduces disruption and allows for a more flexible and scalable schema design.

Use DDL for Bulk Schema Modifications

For bulk changes to a table’s schema, such as adding multiple columns at once, it is best to batch the modifications into a single DDL statement. This approach simplifies your scripts and minimizes potential errors.

Combining several schema alterations into one DDL command streamlines the update process, improves maintainability, and reduces the risk of inconsistencies.
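
As an illustration, a single ALTER TABLE statement can carry several ADD COLUMN clauses separated by commas. The column names below (phone, hire_date, is_remote) are hypothetical and only serve the sketch:

```sql
-- Add several columns in one statement instead of one ALTER TABLE per column.
-- phone, hire_date, and is_remote are hypothetical column names.
ALTER TABLE mydataset.employee
  ADD COLUMN IF NOT EXISTS phone STRING,
  ADD COLUMN IF NOT EXISTS hire_date DATE,
  ADD COLUMN IF NOT EXISTS is_remote BOOL;
```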

Backfill Historical Data When Needed

After adding new columns to a table, consider whether it is necessary to backfill historical data. If so, use DDL statements in combination with DML (Data Manipulation Language) operations to populate the new columns accordingly. This ensures that all data, old and new, adheres to the updated schema. Plan backfilling during off-peak hours to minimize the impact on system performance and use batch processing for efficiency.
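
A backfill of this kind might look like the sketch below. It assumes a hypothetical employment_status column on the mydataset.employee table and the value 'active' for historical rows; adjust both to your own data.

```sql
-- 1. DDL: add the new column. Existing rows hold NULL at this point.
ALTER TABLE mydataset.employee
ADD COLUMN IF NOT EXISTS employment_status STRING;

-- 2. DDL: set a default so that future inserts omitting the column get a value.
ALTER TABLE mydataset.employee
ALTER COLUMN employment_status SET DEFAULT 'active';

-- 3. DML: backfill historical rows that are still NULL.
UPDATE mydataset.employee
SET employment_status = 'active'
WHERE employment_status IS NULL;
```

The WHERE clause limits the update to rows that have not been filled yet, so the statement can be rerun safely if a backfill is interrupted.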

Troubleshooting Common DDL Errors When Working with ALTER

Working with DDL statements, especially ALTER statements in BigQuery, can occasionally lead to errors that disrupt your workflow. You can maintain smooth operations and prevent data management issues by understanding these common errors and their solutions.

Error When Renaming a Table Currently Receiving Streaming Data

⚠️ Error: "Cannot rename table currently receiving streaming data"

Attempting to rename a table that is actively receiving streaming data results in an error because BigQuery requires that streaming to the table be paused. This is necessary to ensure data consistency and to finalize all pending data operations before the rename operation.

✅ Solution:

To successfully rename a table that is currently receiving streaming data, you need to follow a few steps. First, pause the streaming data to the table by stopping the processes or systems that are sending data to it. Next, verify that BigQuery has completed processing all incoming data and that the table is no longer receiving streams.

Once you have confirmed that streaming is inactive, proceed with renaming the table using the ALTER TABLE RENAME TO statement. After the rename operation is successful, you can resume the data streaming processes to the newly renamed table.

ALTER TABLE mydataset.original_table
RENAME TO new_table_name;

Handling "Column Already Exists" Error When Using ALTER TABLE ADD COLUMN

⚠️ Error: "Column already exists: column_name"

When attempting to add a new column using the ALTER TABLE ADD COLUMN statement, an error occurs if the column name already exists in the table and the IF NOT EXISTS clause is not used. This prevents unintentional duplication and possible conflicts in the table schema.

✅ Solution:

To avoid this error, always use the IF NOT EXISTS clause when adding new columns to ensure that the operation does not attempt to add a column that already exists. This clause checks the existing table schema and only adds the new column if it does not find a column with the same name. If adding multiple columns, ensure each column name is unique and does not conflict with existing columns.

ALTER TABLE mydataset.mytable
ADD COLUMN IF NOT EXISTS new_column STRING;

Deepening Your Understanding of Data Definition Language

This article has detailed the use of ALTER statements within Data Definition Language (DDL), an essential tool for modifying existing database structures in BigQuery to adapt to evolving business needs and data schemas.

While ALTER statements are pivotal for ongoing database management, they represent only one aspect of DDL capabilities. For a comprehensive understanding of DDL, you may refer to our previous article, where we thoroughly explored the CREATE and DROP statements.

These foundational statements are crucial for initially setting up and effectively removing database structures, providing you with full control over your data environment from creation to deletion.

If you want to explore more of BigQuery, here are some other useful groups of functions:

Conditional Expressions: Execute logic-based actions to return values depending on specific conditions, using expressions like CASE and IF.
String Functions: These functions adjust and format text data, including operations like CONCAT, SUBSTRING, UPPER, and LOWER.
Conversion Functions: Transform data from one format to another with functions such as CAST and SAFE_CAST.
Array Functions: Employ functions to create, modify, and examine arrays, such as ARRAY_AGG, UNNEST, and ARRAY_LENGTH.
Timestamp Functions: Manage and adjust timestamp data using functions like CURRENT_TIMESTAMP, TIMESTAMP_ADD, and FORMAT_TIMESTAMP.
Datetime Functions: Handle operations involving date and time data types with functions like CURRENT_DATETIME, DATETIME_ADD, and EXTRACT.

Boost Your Data Analysis with the OWOX BI BigQuery Reports Extension

Elevate your data analytics capabilities with the OWOX BI BigQuery Reports Extension. This vital tool seamlessly connects BigQuery with Google Sheets, giving your team a user-friendly platform to pull and analyze data effortlessly.

Non-technical users can interact with complex datasets, generate automated reports, and derive actionable insights directly from Google Sheets. Simplify your big data approach and empower your team to make informed business decisions effectively. Begin using the OWOX BI BigQuery Reports Extension today and transform your data interaction experience.
