
ACT 400: AES Data Encryption & Decryption with Data Distiller

Secure your sensitive data with AES encryption - a robust, industry-standard way to protect customer information, while easily decrypting it when needed.

Prerequisites

ACT 100: Dataset Activation with Data Distiller

Download the file:

Ingest the data as healthcare_customers dataset using this:

PREP 500: Ingesting CSV Data into Adobe Experience Platform

Also recommended

ACT 300: Functions and Techniques for Handling Sensitive Data with Data Distiller

Why Support AES (Advanced Encryption Standard)?

AES (Advanced Encryption Standard) support in Data Distiller enhances data security and aligns with industry standards. AES is the most popular symmetric encryption algorithm, widely trusted for its speed, efficiency, and strong security across industries like finance, healthcare, and cloud services. Its ability to encrypt large volumes of data efficiently makes it a better choice for bulk encryption than asymmetric algorithms like RSA, which, while highly secure, are slower and typically reserved for specific tasks like key exchanges and digital signatures.

Data Distiller includes support for encryption modes like GCM (Galois/Counter Mode), which is the most favored mode due to its dual ability to provide both encryption and data integrity. This makes it ideal for protecting sensitive data in secure communications, cloud storage, and large-scale enterprise operations.

In comparison to asymmetric encryption like RSA, which requires different keys for encryption and decryption, AES uses a single key, making it not only faster but also easier to manage in environments where large amounts of data need to be securely processed and stored. While RSA is excellent for securing small, highly sensitive pieces of data and key exchanges, AES is the gold standard for encrypting bulk data efficiently and securely.

AES support in Data Distiller ensures fast, scalable, secure, and robust data protection needed to meet regulatory standards like GDPR and HIPAA, while also offering high performance for enterprise use cases.

AES and Its Encryption Modes in Data Distiller

AES (Advanced Encryption Standard) is one of the most widely used and trusted methods for encrypting data. It’s employed globally to secure sensitive information, from financial transactions to personal communications. AES works by converting plain text data into an unreadable format, known as ciphertext, using a secret key. Only someone with the correct key can decrypt the data back into its original form.

AES in Data Distiller supports two key sizes: 128-bit and 256-bit. The larger 256-bit key provides stronger security, and AES-256 is also the more widely used of the two. It is well suited to safeguarding sensitive data in industries like finance, healthcare, and government. AES-256 combines strong security with good performance, making it the preferred choice for robust encryption needs, especially where long-term data protection is critical.

However, AES doesn’t work alone: it uses different modes to encrypt and process data. These modes define how data is broken down and transformed, offering varying levels of security and performance. The two modes covered here are GCM (Galois/Counter Mode) and ECB (Electronic Codebook Mode), each serving different purposes.

GCM (Galois/Counter Mode) is highly regarded for its speed and security. It not only encrypts data but also ensures that it hasn’t been tampered with, making it ideal for secure communications. GCM is especially useful in scenarios where both confidentiality and data integrity are important.

ECB (Electronic Codebook Mode) is the simplest and fastest mode, but also the least secure. In ECB, each block of data is encrypted independently, meaning identical pieces of input will result in identical encrypted output. While this makes ECB efficient, it can expose patterns in the data, making it less suitable for sensitive information.

Along with these modes, AES often relies on padding to ensure that data fits perfectly into the blocks required for encryption. For example, PKCS padding is commonly used to fill gaps when data doesn't perfectly match the block size. In some modes, like GCM, padding isn't required, making the encryption process more efficient.

The most popular mode of operation for AES encryption is GCM (Galois/Counter Mode). GCM is widely favored because it provides both data confidentiality (encryption) and data integrity (authentication) in a highly efficient manner. Its ability to ensure that data hasn't been tampered with while being transmitted, combined with its speed and performance, makes it ideal for modern applications, including secure communications, cloud services, and network encryption. GCM’s versatility and security features have made it the go-to mode in many industry-standard implementations.

Together, AES and its modes offer a versatile set of tools for protecting data in a wide range of scenarios, from high-security communications to everyday data protection. Whether you need speed, security, or flexibility, AES provides the foundation for keeping sensitive information safe.

AES Encryption Syntax

The generalized syntax is:

aes_encrypt(expr, key [, mode [, padding]])
  • expr: The data to be encrypted.

  • key: The binary key (use UNHEX() for hexadecimal key).

    • 16 bytes for AES-128.

    • 32 bytes for AES-256.

  • mode (optional): Encryption mode (case-insensitive).

    • 'ECB': Electronic CodeBook mode.

    • 'GCM': Galois/Counter Mode (default mode).

  • padding (optional): Padding scheme (case-insensitive).

    • 'NONE': No padding (for 'GCM' mode only).

    • 'PKCS': Public Key Cryptography Standards padding (for 'ECB' mode).

    • 'DEFAULT': Uses 'NONE' for 'GCM' and 'PKCS' for 'ECB'.
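As a minimal sketch of the syntax above, the query below writes out every argument instead of relying on the defaults. It assumes the healthcare_customers dataset from the prerequisites has customer_id and email columns, and it uses the demonstration key from this tutorial rather than a real secret.

```sql
-- Sketch: encrypt the email column with mode and padding written out.
-- 'GCM' with 'DEFAULT' padding selects the same settings as omitting both arguments.
SELECT
    customer_id,
    HEX(AES_ENCRYPT(
        email,
        UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9'),  -- 64 hex characters = 32 bytes, so AES-256
        'GCM',
        'DEFAULT'
    )) AS encrypted_email_hex  -- HEX() is only there to make the binary result displayable
FROM
    healthcare_customers;
```

Spelling out the mode and padding makes the query longer, but it keeps the encryption settings visible next to the matching decryption call.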

AES Decryption Syntax

The generalized syntax is:

aes_decrypt(expr, key [, mode [, padding]])
  • expr: The binary data to be decrypted (typically stored as hex, so use UNHEX()).

  • key: The binary key (use UNHEX() for hexadecimal key).

    • 16 bytes for AES-128.

    • 32 bytes for AES-256.

  • mode (optional): Decryption mode (must match the encryption mode).

    • 'ECB': Electronic CodeBook mode.

    • 'GCM': Galois/Counter Mode (default mode).

  • padding (optional): Padding scheme (must match the encryption padding).

    • 'NONE': No padding (for 'GCM' mode only).

    • 'PKCS': Public Key Cryptography Standards padding (for 'ECB' mode).

    • 'DEFAULT': Uses 'NONE' for 'GCM' and 'PKCS' for 'ECB'.
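To illustrate the "must match" requirement, here is a small round-trip sketch that encrypts a literal and immediately decrypts it with the same key, mode, and padding. The email literal is a made-up placeholder and the key is the demonstration key from this tutorial.

```sql
-- Sketch: encrypt a literal and decrypt it again in one expression.
-- The binary output of AES_ENCRYPT is passed straight to AES_DECRYPT,
-- so the ciphertext needs no HEX()/UNHEX() conversion.
SELECT
    CAST(
        AES_DECRYPT(
            AES_ENCRYPT(
                'jane.doe@example.com',  -- placeholder value, not from the dataset
                UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9'),
                'GCM',
                'DEFAULT'
            ),
            UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9'),  -- same key as encryption
            'GCM',      -- same mode as encryption
            'DEFAULT'   -- same padding as encryption
        ) AS STRING
    ) AS round_trip_email;
```

The query should return the original literal, which makes it a quick way to check a key and mode combination before running it against a dataset.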

Understanding GCM and ECB Modes

GCM and ECB are different methods (or modes) of encrypting data. GCM (Galois/Counter Mode) is like locking your data with a secure padlock, but with an additional layer of protection to ensure that no one has tampered with it. This mode not only encrypts the data but also verifies its integrity, making it highly secure and fast. It is often used for secure communication, where speed and data integrity are critical.

ECB (Electronic Codebook Mode) treats each chunk of data the same way, without any chaining. It’s like putting each letter of a message in the same type of envelope, without considering the surrounding letters. This makes ECB fast but predictable, as identical chunks of data will produce identical encrypted output. Because of this, ECB is considered less secure than GCM since it can reveal patterns in the data.
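The pattern leakage described above can be checked with a short query. This is a sketch, assuming the healthcare_customers dataset from the prerequisites has an email column. It encrypts every email in ECB mode and lists any ciphertext that occurs more than once.

```sql
-- Sketch: show that ECB maps identical inputs to identical ciphertext.
WITH EcbEncrypted AS (
    SELECT
        HEX(AES_ENCRYPT(email, UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9'), 'ECB', 'PKCS')) AS encrypted_email_hex
    FROM
        healthcare_customers
)
-- Any row returned is a ciphertext shared by two or more customers,
-- which reveals that those customers have the same email without decrypting anything.
SELECT
    encrypted_email_hex,
    COUNT(*) AS occurrences
FROM
    EcbEncrypted
GROUP BY
    encrypted_email_hex
HAVING
    COUNT(*) > 1;
```

If every email in the dataset is unique, the query returns no rows. The leak only shows up when values repeat, which is common for columns such as city, diagnosis code, or status.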

What is Padding

In encryption, padding refers to filling in extra spaces when the data doesn’t perfectly fit the required block size (usually 16 bytes). Imagine you have a box that fits exactly 16 letters, but your message is only 13 letters long. Padding is like adding extra filler to make the message fit perfectly.

PKCS (Public Key Cryptography Standards) is a widely used method for padding. It adds extra bytes to fill the gaps, making sure the data fits the block size. When the data is decrypted, the system knows how to remove the padding. In contrast, NONE means no padding is added. Data Distiller allows this for GCM only: GCM processes data as a stream (counter mode) rather than requiring whole blocks, so it works for any data length without padding.
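The effect of PKCS padding shows up in the size of the ciphertext. The sketch below makes one assumption that is not taken from this tutorial: that LENGTH() is available in Data Distiller and returns the number of bytes for a binary value, as it does in Spark SQL. The two literals are made-up inputs of 13 and 16 bytes.

```sql
-- Sketch: compare ciphertext sizes under ECB with PKCS padding.
SELECT
    -- 13-byte input: 3 bytes of padding fill it up to one 16-byte block
    LENGTH(AES_ENCRYPT('HELLO, WORLD!',
                       UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9'),
                       'ECB', 'PKCS')) AS bytes_for_13_byte_input,
    -- 16-byte input: already block-aligned, so PKCS appends a full extra block (32 bytes in total)
    LENGTH(AES_ENCRYPT('SIXTEEN BYTE MSG',
                       UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9'),
                       'ECB', 'PKCS')) AS bytes_for_16_byte_input;
```

PKCS always adds at least one byte of padding, even to block-aligned input. That guarantee is how decryption can tell padding apart from data and strip it reliably.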

Key Generation

AES is a type of symmetric encryption. In symmetric encryption, the same key is used for both encrypting and decrypting data. This means that the person or system encrypting the data and the one decrypting it must both have access to the same secret key. Since AES is symmetric, the security of the system depends on keeping the key confidential. If someone gains access to the key, they can both encrypt and decrypt the data. Before using these functions, you will need to generate a key, securely track it, and store it in a secure vault.

Generate a 16-Byte Key

-- Generate a random 16-byte key (32 hexadecimal characters)
SELECT 
  UPPER(SUBSTRING(SHA2(CAST(RAND() AS STRING), 256), 1, 32)) AS generated_16_byte_key;

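The query above generates hexadecimal characters, but the aes_encrypt and aes_decrypt functions require binary values. Therefore, you need to use the unhex(generated_16_byte_key) function in Data Distiller to convert the hexadecimal key into the required binary format. Note that RAND() is not a cryptographically secure random source, so keys generated this way are best kept to demonstrations like this one.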

Generate a 32-Byte Key

-- Generate a random 32-byte key (64 hexadecimal characters)
SELECT 
  UPPER(SHA2(CAST(RAND() AS STRING), 256)) AS generated_32_byte_key;

The query above generates hexadecimal characters, but the aes_encrypt and aes_decrypt functions require binary values. Therefore, you need to use the unhex(generated_32_byte_key) function in Data Distiller to convert the hexadecimal key into the required binary format.
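Before using a generated key, it can help to confirm that it converts to the expected number of bytes. The sketch below wraps the 32-byte key query in a CTE and checks both representations. It assumes LENGTH() is available in Data Distiller and returns the byte count for binary values, as in Spark SQL. The CTE name GeneratedKey is illustrative.

```sql
-- Sketch: generate a key and verify its size before and after UNHEX().
WITH GeneratedKey AS (
    SELECT
        UPPER(SHA2(CAST(RAND() AS STRING), 256)) AS generated_32_byte_key
)
SELECT
    generated_32_byte_key,
    LENGTH(generated_32_byte_key) AS hex_characters,           -- 64 hexadecimal characters
    LENGTH(UNHEX(generated_32_byte_key)) AS key_length_bytes   -- 32 bytes, the size needed for AES-256
FROM
    GeneratedKey;
```

The same check works for the 16-byte key. There the expected values are 32 hexadecimal characters and 16 bytes.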

AES-256 Encryption & Decryption with GCM (Default Mode, No Padding)

Let us demonstrate how encryption and decryption work. Note that the HEX and CAST functions are used here only to display the results, because binary values cannot be displayed in the Data Distiller Query Pro Mode Editor. You should remove them when using these two functions in practice:

WITH EncryptedData AS (
    -- Step 1: Encrypt the email and convert the encrypted binary data into a readable hex string
    SELECT
        customer_id,
        HEX(AES_ENCRYPT(email, UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9'))) AS encrypted_email_hex
    FROM
        healthcare_customers
)
-- Step 2: Decrypt the encrypted email and cast it back to STRING
SELECT
    customer_id,
    encrypted_email_hex,  -- Display encrypted email as hex string
    CAST(AES_DECRYPT(UNHEX(encrypted_email_hex), UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9')) AS STRING) AS decrypted_email
FROM
    EncryptedData;

The result should be:

Demonstration of AES encryption and decryption in Data Distiller

AES-256 Encryption & Decryption with ECB Mode and PKCS Padding

WITH EncryptedData AS (
    -- Step 1: Encrypt email using AES-256 with ECB mode and PKCS padding
    SELECT
        customer_id,
        HEX(AES_ENCRYPT(email, UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9'), 'ECB', 'PKCS')) AS encrypted_email_hex
    FROM
        healthcare_customers
)
-- Step 2: Decrypt the encrypted email using the same key, mode, and padding
SELECT
    customer_id,
    encrypted_email_hex,
    CAST(AES_DECRYPT(UNHEX(encrypted_email_hex), UNHEX('6BB8E32DB365D1953C95377C547330B52FAF9C35C9350A2BA1FC5CB4651D28E9'), 'ECB', 'PKCS') AS STRING) AS decrypted_email
FROM
    EncryptedData;

The result should be:

Demonstration of AES encryption and decryption in Data Distiller

The Genius of Galois: His Math Powers Modern Encryption

Galois

GCM (Galois/Counter Mode) is a mode of operation for encryption that ties back to the innovative work of mathematician Évariste Galois, whose contributions to abstract algebra, specifically Galois fields, play a pivotal role in how GCM operates.

What makes GCM special—and really cool—is that it combines both encryption and authentication in a highly efficient way, ensuring not only that data is protected, but also that it hasn’t been tampered with during transmission. This dual capability is crucial for modern data security.

At the heart of GCM's strength is its use of Galois fields, a concept developed by Galois in the 19th century, which involves operations on finite sets of numbers. In GCM, these fields enable fast and secure mathematical operations that verify data integrity while keeping the encryption itself highly efficient.

What’s particularly cool about this is that Galois, who tragically died young, couldn’t have foreseen how his abstract work in algebra would one day become foundational in securing digital communications in the 21st century. By leveraging the power of Galois fields, GCM mode manages to be both faster and more secure than many other encryption modes, making it a go-to solution for protecting sensitive data, especially in high-performance environments like cloud computing and secure messaging.

So, when using AES with GCM mode, you’re benefiting from the mathematical genius of Galois—applying 19th-century mathematics to cutting-edge digital encryption!
