Adobe Data Distiller Guide
  • Adobe Data Distiller Guide
  • What is Data Distiller?
  • UNIT 1: GETTING STARTED
    • PREP 100: Why was Data Distiller Built?
    • PREP 200: Data Distiller Use Case & Capability Matrix Guide
    • PREP 300: Adobe Experience Platform & Data Distiller Primers
    • PREP 301: Leveraging Data Loops for Real-Time Personalization
    • PREP 302: Key Topics Overview: Architecture, MDM, Personas
    • PREP 303: What is Data Distiller Business Intelligence?
    • PREP 304: The Human Element in Customer Experience Management
    • PREP 305: Driving Transformation in Customer Experience: Leadership Lessons Inspired by Lee Iacocca
    • PREP 400: DBVisualizer SQL Editor Setup for Data Distiller
  • PREP 500: Ingesting CSV Data into Adobe Experience Platform
  • PREP 501: Ingesting JSON Test Data into Adobe Experience Platform
  • PREP 600: Rules vs. AI with Data Distiller: When to Apply, When to Rely, Let ROI Decide
  • Prep 601: Breaking Down B2B Data Silos: Transform Marketing, Sales & Customer Success into a Revenue
  • Unit 2: DATA DISTILLER DATA EXPLORATION
    • EXPLORE 100: Data Lake Overview
    • EXPLORE 101: Exploring Ingested Batches in a Dataset with Data Distiller
    • EXPLORE 200: Exploring Behavioral Data with Data Distiller - A Case Study with Adobe Analytics Data
    • EXPLORE 201: Exploring Web Analytics Data with Data Distiller
    • EXPLORE 202: Exploring Product Analytics with Data Distiller
    • EXPLORE 300: Exploring Adobe Journey Optimizer System Datasets with Data Distiller
    • EXPLORE 400: Exploring Offer Decisioning Datasets with Data Distiller
    • EXPLORE 500: Incremental Data Extraction with Data Distiller Cursors
  • UNIT 3: DATA DISTILLER ETL (EXTRACT, TRANSFORM, LOAD)
    • ETL 200: Chaining of Data Distiller Jobs
    • ETL 300: Incremental Processing Using Checkpoint Tables in Data Distiller
    • [DRAFT]ETL 400: Attribute-Level Change Detection in Profile Snapshot Data
  • UNIT 4: DATA DISTILLER DATA ENRICHMENT
    • ENRICH 100: Real-Time Customer Profile Overview
    • ENRICH 101: Behavior-Based Personalization with Data Distiller: A Movie Genre Case Study
    • ENRICH 200: Decile-Based Audiences with Data Distiller
    • ENRICH 300: Recency, Frequency, Monetary (RFM) Modeling for Personalization with Data Distiller
    • ENRICH 400: Net Promoter Scores (NPS) for Enhanced Customer Satisfaction with Data Distiller
  • Unit 5: DATA DISTILLER IDENTITY RESOLUTION
    • IDR 100: Identity Graph Overview
    • IDR 200: Extracting Identity Graph from Profile Attribute Snapshot Data with Data Distiller
    • IDR 300: Understanding and Mitigating Profile Collapse in Identity Resolution with Data Distiller
    • IDR 301: Using Levenshtein Distance for Fuzzy Matching in Identity Resolution with Data Distiller
    • IDR 302: Algorithmic Approaches to B2B Contacts - Unifying and Standardizing Across Sales Orgs
  • Unit 6: DATA DISTILLER AUDIENCES
    • DDA 100: Audiences Overview
    • DDA 200: Build Data Distiller Audiences on Data Lake Using SQL
    • DDA 300: Audience Overlaps with Data Distiller
  • Unit 7: DATA DISTILLER BUSINESS INTELLIGENCE
    • BI 100: Data Distiller Business Intelligence: A Complete Feature Overview
    • BI 200: Create Your First Data Model in the Data Distiller Warehouse for Dashboarding
    • BI 300: Dashboard Authoring with Data Distiller Query Pro Mode
    • BI 400: Subscription Analytics for Growth-Focused Products using Data Distiller
    • BI 500: Optimizing Omnichannel Marketing Spend Using Marginal Return Analysis
  • Unit 8: DATA DISTILLER STATISTICS & MACHINE LEARNING
    • STATSML 100: Python & JupyterLab Setup for Data Distiller
    • STATSML 101: Learn Basic Python Online
    • STATSML 200: Unlock Dataset Metadata Insights via Adobe Experience Platform APIs and Python
    • STATSML 201: Securing Data Distiller Access with Robust IP Whitelisting
    • STATSML 300: AI & Machine Learning: Basic Concepts for Data Distiller Users
    • STATSML 301: A Concept Course on Language Models
    • STATSML 302: A Concept Course on Feature Engineering Techniques for Machine Learning
    • STATSML 400: Data Distiller Basic Statistics Functions
    • STATSML 500: Generative SQL with Microsoft GitHub Copilot, Visual Studio Code and Data Distiller
    • STATSML 600: Data Distiller Advanced Statistics & Machine Learning Models
    • STATSML 601: Building a Period-to-Period Customer Retention Model Using Logistics Regression
    • STATSML 602: Techniques for Bot Detection in Data Distiller
    • STATSML 603: Predicting Customer Conversion Scores Using Random Forest in Data Distiller
    • STATSML 604: Car Loan Propensity Prediction using Logistic Regression
    • STATSML 700: Sentiment-Aware Product Review Search with Retrieval Augmented Generation (RAG)
    • STATSML 800: Turbocharging Insights with Data Distiller: A Hypercube Approach to Big Data Analytics
  • UNIT 9: DATA DISTILLER ACTIVATION & DATA EXPORT
    • ACT 100: Dataset Activation with Data Distiller
    • ACT 200: Dataset Activation: Anonymization, Masking & Differential Privacy Techniques
    • ACT 300: Functions and Techniques for Handling Sensitive Data with Data Distiller
    • ACT 400: AES Data Encryption & Decryption with Data Distiller
  • UNIT 9: DATA DISTILLER FUNCTIONS & EXTENSIONS
    • FUNC 300: Privacy Functions in Data Distiller
    • FUNC 400: Statistics Functions in Data Distiller
    • FUNC 500: Lambda Functions in Data Distiller: Exploring Similarity Joins
    • FUNC 600: Advanced Statistics & Machine Learning Functions
  • About the Authors
Powered by GitBook
On this page
  • Prerequisites
  • Case Study Overview
  • Key Concepts
  • Calculate Marginal Returns per Channel
  • Calculate the Break-Even Point for Each Channel
  • Breakeven Point Formula
  • Overall Breakeven Analysis Across Channels
  • Dynamic Visualization Using Date Filtering
  • Seasonal Analysis
  • Event-Based Marketing
  • Analyzing External Influences on Marginal Returns
  • Key Findings by Channel
  • Analysis
  • Channel Segmentation Based on Performance Metrics
  • Training Data Clustering Results
  • Test Data Clustering Results
  • Analysis
  • Optimal Budget Optimization Across Channels: A Simple Approach
  • Analysis
  1. Unit 7: DATA DISTILLER BUSINESS INTELLIGENCE

BI 500: Optimizing Omnichannel Marketing Spend Using Marginal Return Analysis

Analyzing marketing effectiveness across various channels using

Last updated 6 months ago

In this tutorial, we explore how to optimize marketing spend across various channels by leveraging Data Distiller to analyze and visualize marketing effectiveness. The dataset includes about a million records of marketing activities spanning multiple channels—paid search, social media, email marketing, and display ads—across different dates, promotional activities, and economic conditions. Each record captures key data points such as marketing spend, revenue generated, promotional activity status, and economic condition for a given day. The assumption is that you have acquired this data from the various channels and have harmonized the data using Data Distiller.

Prerequisites

Download the sample data is:

using the following tutorial:

Case Study Overview

The primary objective of this case study is to help a fictional company, "RetailHub Inc.," determine the marginal returns of each marketing channel. The analysis will enable RetailHub to:

  1. Identify which marketing channels yield the highest returns relative to investment.

  2. Detect the break-even point where additional marketing investment no longer yields profitable returns.

  3. Understand how external factors like promotional activities and economic conditions influence channel effectiveness.

Note that the data has been standardized and harmonized using Data Distiller. Even without loading the data into AEP, you can open the CSV file in Excel to see the standardization:

Data harmonization is a foundational use case in Data Distiller, with the primary goal of bringing omni-channel data into a consistent format within the system. Harmonization is a crucial preprocessing step for analyzing datasets that may come from various sources and formats. For the dataset you provided, harmonization ensures that all data entries—across different marketing channels, economic conditions, and promotional activities—are standardized, consistent, and ready for effective analysis in our use case:

  1. Date Standardization:

    • Ensuring that dates are in a consistent format (e.g., YYYY-MM-DD or MM/DD/YYYY) allows for accurate time-based filtering and sorting.

    • Any time components, if not relevant, might be normalized to midnight or removed for simplicity, focusing on daily summaries.

  2. Consistent Channel Naming: Marketing channels like "paid_search," "social_media," "display_ads," and "email_marketing" appear with consistent naming. Harmonizing ensures that channels are not duplicated with different spellings or formats (e.g., "Paid Search" vs. "paid_search").

  3. Spend and Revenue Harmonization:

    • The spend and revenue columns need to be in a consistent currency and format (e.g., USD, with two decimal places).

    • Handling missing or zero values to avoid skewed marginal return calculations. For example, filling missing spend or revenue values with a logical default or calculating them based on available data if appropriate.

  4. Categorization of Promotional Activity:

    • The promo_activity column uses binary values (0 and 1) to indicate whether a promotion was active or not. Harmonizing categorical data in this way enables grouping and filtering without ambiguity.

    • This could involve mapping different promotional statuses (like "active," "on hold," etc.) to a binary format, ensuring simplicity in downstream analysis.

  5. Economic Condition Standardization:

    • The economic_condition column uses consistent categories like "Good," "Average," and "Poor" to represent economic conditions. Harmonizing these categories ensures that they are standardized and can be used effectively in groupings.

    • If the data sources used different terms for economic conditions, harmonization would involve mapping these variations to standard categories for consistency.

  6. Dealing with Outliers:

    • Identifying and handling outliers in spend and revenue ensures that extreme values don’t distort marginal return calculations.

    • Harmonization may involve detecting unusual spikes or drops in spend/revenue and determining whether they’re genuine or need adjustment or removal.

  7. Null Handling:

    • Ensuring there are no null values in critical columns like spend, revenue, and channel to prevent errors in calculations.

    • Null values might be imputed or flagged for exclusion in the analysis if they cannot be filled logically.

  8. Data Type Harmonization: Ensuring that numeric columns like spend and revenue are in a numeric data type (float or integer), while categorical fields like channel and economic_condition are in string format.

Remember that the revenue in our dataset is attributed revenue by channel: Each row's revenue value is tied to the channel listed in the channel column, indicating that it represents the outcome (revenue) generated as a result of marketing efforts on that specific channel. For example, if the channel is "social_media," then the revenue reflects only the income generated through social media activities on that date.

Key Concepts

  1. Calculating Marginal Return per Channel: Using SQL window functions, calculate marginal returns to compare incremental revenue with incremental spend on each channel. This helps identify which marketing channels provide the best returns over time.

  2. Identifying the Break-even Point: Define the break-even point where the marginal return becomes less than 1 (indicating diminishing returns). Highlight this point in the data to guide marketing budget allocation decisions.

  3. Dynamic Visualization by Date Range: Learn how to adjust date ranges to dynamically view changes in marginal return over specific timeframes. This allows RetailHub to analyze the effectiveness of marketing spend during promotional periods or under varying economic conditions.

  4. Analyzing External Influences: Examine how factors like promotions and economic conditions impact marginal returns across channels, allowing for more informed, context-aware budgeting decisions.

Calculate Marginal Returns per Channel

A key question to ask is: What is the additional revenue generated from every additional dollar spent? This analysis helps answer that by calculating the marginal return, which represents the incremental revenue generated for each incremental spend. Analyzing marginal returns across channels and time allows companies to optimize their marketing budgets by identifying the most effective channels and understanding when additional spend starts yielding diminishing returns.

For example, if a channel’s marginal return drops below a certain threshold (e.g., less than 1, meaning incremental revenue is less than incremental spend), it may indicate that additional spending on that channel is no longer cost-effective.

WITH marginal_return AS (
    SELECT
        date,
        channel,
        spend,
        revenue,
        LAG(revenue) OVER (PARTITION BY channel ORDER BY date) AS prev_revenue,
        LAG(spend) OVER (PARTITION BY channel ORDER BY date) AS prev_spend
    FROM marketing_data
)
SELECT
    date,
    channel,
    spend,
    revenue,
    COALESCE((revenue - prev_revenue) / NULLIF(spend - prev_spend, 0), 0) AS marginal_return
FROM marginal_return
WHERE prev_spend IS NOT NULL;

Calculate the Break-Even Point for Each Channel

We are now going to calculate the break-even point for marketing spend on each channel for each date. The break-even point represents the moment when additional spend on a given marketing channel no longer yields proportional increases in revenue—essentially, when the marginal return on investment drops below 1. The reason why we are computing this on a daily basis is:

  • Evaluate Daily Channel Efficiency: By calculating the break-even point on a daily basis, we can assess how efficiently each marketing channel is performing each day. Marketing effectiveness can fluctuate due to various factors, such as promotions, seasonal changes, or economic conditions. Calculating this daily allows us to capture these variations in real time.

  • Identify Diminishing Returns: The marginal return calculation (revenue generated by the incremental spend) tells us if each additional dollar spent on a channel still brings in more than a dollar in revenue. When marginal return falls below 1 on a specific date, it indicates diminishing returns, meaning the channel’s current spend level may no longer be cost-effective.

  • Optimize Budget Allocation: Knowing the daily break-even point helps make dynamic decisions on where to allocate marketing budgets. For instance, if a specific channel’s marginal return is consistently below 1, it might signal that budgets could be more effective if reallocated to other channels with higher marginal returns.

  • Context-Aware Decision-Making: Since this dataset includes external factors (like promo_activity and economic_condition), we can interpret the break-even point in the context of these factors. For example, if marginal returns are higher during promotional periods, it may make sense to increase spend during promotions and decrease it when promotions are inactive.

Breakeven Point Formula

The break-even point is determined by calculating the marginal return for each channel on each date, using the following approach:

  • Marginal Return

Revenue−Previous RevenueSpend−Previous Spend \frac{\text{Revenue} - \text{Previous Revenue}}{\text{Spend} - \text{Previous Spend}} Spend−Previous SpendRevenue−Previous Revenue​
  • If marginal return < 1, the additional spend on that date and channel yielded less revenue than the spend itself, indicating that the break-even point has been reached or exceeded.

To identify the break-even point where incremental return is less than incremental spend, we can calculate the marginal return and then filter for instances where this value drops below 1. This approach helps pinpoint the point where additional investment in a marketing channel does not yield a proportionate return, marking the point of diminishing returns. Here’s a more detailed breakdown of this process:

WITH marginal_return AS (
    SELECT
        date,
        channel,
        spend,
        revenue,
        LAG(revenue) OVER (PARTITION BY channel ORDER BY date) AS prev_revenue,
        LAG(spend) OVER (PARTITION BY channel ORDER BY date) AS prev_spend
    FROM marketing_data
),
calculated_margins AS (
    SELECT
        date,
        channel,
        spend,
        revenue,
        prev_revenue,
        prev_spend,
        COALESCE((revenue - prev_revenue) / NULLIF(spend - prev_spend, 0), 0) AS marginal_return
    FROM marginal_return
    WHERE prev_spend IS NOT NULL
)
SELECT
    date,
    channel,
    spend,
    revenue,
    marginal_return,
    CASE WHEN marginal_return < 1 THEN 'Break-even' ELSE 'Above break-even' END AS status
FROM calculated_margins;

Overall Breakeven Analysis Across Channels

Conducting a break-even analysis across the entire time period for all channels provides essential insights for strategic budgeting and resource allocation. By calculating the average marginal return over the full dataset, we gain a high-level view of each channel's effectiveness, revealing whether additional spending generally yields proportional revenue. This analysis is valuable for several reasons:

  • Long-term Channel Effectiveness: Averages over time help identify channels that consistently underperform (marginal return below 1), indicating that these channels may not justify continued investment.

  • High-level Budgeting and Simplified Decision-making: For executives and budget owners, a channel-level view simplifies decisions, making it easier to identify where to increase or decrease budgets. This approach avoids overreacting to short-term fluctuations and focuses on sustainable channel performance.

  • Baseline for Future Comparisons: Establishing a long-term average serves as a baseline to assess the impact of future strategic adjustments, allowing teams to track if new tactics improve returns over time.

  • Cross-channel Optimization: With insights on which channels consistently deliver value, marketing resources can be allocated to synergistic, high-performing channels, enhancing overall ROI and aligning spend with business goals.

WITH marginal_return AS (
    SELECT
        date,
        channel,
        spend,
        revenue,
        LAG(revenue) OVER (PARTITION BY channel ORDER BY date) AS prev_revenue,
        LAG(spend) OVER (PARTITION BY channel ORDER BY date) AS prev_spend
    FROM marketing_data
),
calculated_margins AS (
    SELECT
        channel,
        COALESCE((revenue - prev_revenue) / NULLIF(spend - prev_spend, 0), 0) AS marginal_return
    FROM marginal_return
    WHERE prev_spend IS NOT NULL
)
SELECT
    channel,
    AVG(marginal_return) AS avg_marginal_return,
    CASE WHEN AVG(marginal_return) < 1 THEN 'Break-even' ELSE 'Above break-even' END AS status
FROM calculated_margins
GROUP BY channel
ORDER BY channel;

This yields

The analysis in the screenshot shows the average marginal return for each marketing channel (display_ads, email_marketing, paid_search, and social_media) over the entire time period. Here’s what the results indicate:

  1. Above Break-even: All channels have an average marginal return greater than 1, which means that, on average, each channel generates more revenue than the amount spent. In other words, for every dollar invested, each channel yields more than a dollar in revenue. The status column confirms that each channel is "Above break-even," implying that, over the entire time period, none of the channels is losing money on average.

  2. Comparing Channel Effectiveness: Among the four channels: Email marketing has the highest average marginal return (approximately 1.76), suggesting it’s the most effective channel in terms of return on investment. Social media follows, with an average marginal return of about 1.62. Display ads and paid search are slightly lower but still above 1, with average marginal returns of approximately 1.27 and 1.49, respectively. This suggests that, although all channels are profitable, email marketing and social media may be the most efficient in terms of generating revenue relative to spend.

There are some implications of the above findings:

  • Budget Allocation: Since email marketing and social media offer the highest returns, it may be wise to prioritize spending on these channels. Conversely, while display ads and paid search are still profitable, they might be considered for optimized or reduced spending if budget reallocation is an option.

  • Channel Strategy: This analysis provides a baseline understanding of long-term performance. Channels with the highest average marginal returns could be candidates for further investment, while those with lower returns could be evaluated for strategy adjustments to improve their effectiveness.

Dynamic Visualization Using Date Filtering

It’s often helpful to examine performance metrics (like marginal return) over particular date ranges to understand how certain periods affect marketing channel effectiveness. For example:

Seasonal Analysis

During holiday seasons, promotional periods, or economic shifts, specific channels might perform better or worse.

SELECT
    date,
    channel,
    spend,
    revenue,
    COALESCE((revenue - prev_revenue) / NULLIF(spend - prev_spend, 0), 0) AS marginal_return,
    CASE WHEN COALESCE((revenue - prev_revenue) / NULLIF(spend - prev_spend, 0), 0) < 1 THEN 'Break-even' ELSE 'Above break-even' END AS status
FROM (
    SELECT
        date,
        channel,
        spend,
        revenue,
        LAG(revenue) OVER (PARTITION BY channel ORDER BY date) AS prev_revenue,
        LAG(spend) OVER (PARTITION BY channel ORDER BY date) AS prev_spend
    FROM marketing_data
    WHERE date BETWEEN '2024-12-01' AND '2025-01-15' -- Holiday season date range
) AS seasonal_analysis
WHERE prev_spend IS NOT NULL
ORDER BY channel, date;

Event-Based Marketing

Analyzing a date range around specific campaigns or product launches helps reveal the immediate impact of these events on returns.

SELECT
    date,
    channel,
    spend,
    revenue,
    COALESCE((revenue - prev_revenue) / NULLIF(spend - prev_spend, 0), 0) AS marginal_return,
    CASE WHEN COALESCE((revenue - prev_revenue) / NULLIF(spend - prev_spend, 0), 0) < 1 THEN 'Break-even' ELSE 'Above break-even' END AS status
FROM (
    SELECT
        date,
        channel,
        spend,
        revenue,
        LAG(revenue) OVER (PARTITION BY channel ORDER BY date) AS prev_revenue,
        LAG(spend) OVER (PARTITION BY channel ORDER BY date) AS prev_spend
    FROM marketing_data
    WHERE date BETWEEN '2024-04-01' AND '2024-04-15' -- Campaign launch date range
) AS event_analysis
WHERE prev_spend IS NOT NULL
ORDER BY channel, date;

Analyzing External Influences on Marginal Returns

External factors, like active promotions or varying economic conditions, often play a critical role in marketing effectiveness. For example:

  • Promotional Impact: Promotional periods may lead to a higher return on marketing spend, justifying increased spending during these times.

  • Economic Sensitivity: Different channels may perform better or worse depending on the economic conditions, helping organizations make strategic adjustments in spend allocation during economic downturns or booms.

By grouping marginal returns by promo_activity and economic_condition, analysts can discover which conditions are most favorable for each channel, enabling them to plan marketing budgets with these factors in mind.

WITH marginal_return AS (
    SELECT
        date,
        channel,
        spend,
        revenue,
        promo_activity,
        economic_condition,
        LAG(revenue) OVER (PARTITION BY channel ORDER BY date) AS prev_revenue,
        LAG(spend) OVER (PARTITION BY channel ORDER BY date) AS prev_spend
    FROM marketing_data
),
calculated_margins AS (
    SELECT
        date,
        channel,
        spend,
        revenue,
        promo_activity,
        economic_condition,
        COALESCE((revenue - prev_revenue) / NULLIF(spend - prev_spend, 0), 0) AS marginal_return
    FROM marginal_return
    WHERE prev_spend IS NOT NULL
)
SELECT
    channel,
    promo_activity,
    economic_condition,
    AVG(marginal_return) AS avg_marginal_return,
    COUNT(*) AS observations
FROM calculated_margins
GROUP BY
    channel,
    promo_activity,
    economic_condition
ORDER BY
    channel,
    promo_activity DESC,
    economic_condition;

Key Findings by Channel

  1. Display Ads:

    • When promotions are active (promo_activity = 1), display_ads performs best under "Good" economic conditions, with an average marginal return of 2.1446, indicating that each additional dollar spent on display ads generates more than double in revenue.

    • Under "Poor" economic conditions, even with promotions, display_ads still show a strong marginal return (1.7376), suggesting resilience in lower economic conditions.

    • When there is no promotion (promo_activity = 0), display_ads marginal returns drop significantly under "Average" conditions, showing a negative marginal return (-0.0211). This means that additional spend actually correlates with a decline in revenue, suggesting that display ads may be inefficient without promotional support in average economic times.

  2. Email Marketing:

    • Email marketing shows high marginal returns when promotions are active across all economic conditions, with values ranging from 1.6464 under "Average" conditions to 2.2325 under "Poor" conditions.

    • This channel appears to be especially effective during promotions, showing consistently high marginal returns, even in challenging economic conditions.

    • Without promotions, email marketing still maintains decent marginal returns across all economic conditions, with values around 1.5 or higher. This stability suggests that email marketing is a robust channel that performs well with or without promotions.

  3. Paid Search:

    • Paid search shows strong performance under "Good" economic conditions, both with promotions (average marginal return of 2.1201) and without (average marginal return of 2.0134).

    • During "Poor" economic conditions with no promotions, paid search's marginal return drops significantly, even turning negative (-0.1821), suggesting that paid search might be ineffective in low economic periods without promotional support.

    • Overall, paid search performs best in "Good" economic times but shows vulnerability in weaker economies if promotions are not running.

  4. Social Media:

    • Social media shows varied marginal returns and is the only channel with fewer records (observations). Under "Average" economic conditions with promotions active, the average marginal return is 0.8451, suggesting that this channel is not as efficient as others during promotions.

    • There is insufficient data in the displayed portion to analyze performance under "Good" or "Poor" economic conditions, indicating that further data collection might be needed to fully evaluate social media’s performance.

Analysis

  • Effectiveness of Promotions by Channel:

    • Display ads and email marketing show strong positive responses to promotions across all economic conditions, suggesting they should be prioritized for promotional campaigns. Display ads are highly effective in "Good" conditions, while email marketing is consistent even in poor economies.

    • Paid search performs well during "Good" economic conditions with or without promotions, but its performance drops significantly without promotions in weaker economies. It may be more cost-effective to support paid search with promotions during challenging economic periods.

  • Economic Sensitivity:

    • Display ads and paid search appear sensitive to economic conditions, performing significantly better in "Good" economic times. Without promotions, they may even produce negative returns in average or poor economies, indicating the importance of aligning spend on these channels with favorable economic periods.

    • Email marketing is the least sensitive to economic conditions, maintaining positive returns across all conditions. This robustness suggests that it could be a reliable channel for steady investment regardless of economic shifts.

  • Channel-Specific Budget Allocation:

    • Given the marginal returns observed, email marketing should receive a stable or even increased allocation due to its resilience and effectiveness across conditions.

    • Display ads should be prioritized for promotional periods and particularly leveraged when economic conditions are favorable.

    • Paid search should be monitored closely and potentially reduced during economic downturns, especially if promotions are not planned, as it shows poor performance under these conditions.

    • Social media might require further evaluation or data collection due to its limited observations, but it appears less effective during promotional periods compared to other channels.

This analysis highlights that promotions and economic conditions significantly impact the effectiveness of each marketing channel. Email marketing is a strong performer in both good and poor economies, making it a reliable investment, while display ads and paid search show high returns in favorable conditions or during promotions. Social media may require further data for a full assessment but appears less effective during promotions in average economic conditions.

Channel Segmentation Based on Performance Metrics

The goal of clustering for channel segmentation based on performance is to group similar marketing channels or time periods according to key performance metrics, such as marginal return, spend, and revenue. By identifying patterns and grouping channels with similar characteristics, we gain insights into high-performing and low-performing segments, which help guide strategic budget decisions.

To achieve this, we first prepare the data by ensuring that each record includes essential metrics—marginal return, spend, and revenue—along with additional contextual features, like promotional activity or economic conditions if available. It’s important to normalize or scale these metrics so that each one contributes equally to the clustering process.

Once we define the features—such as [avg_spend, avg_revenue, avg_marginal_return]—we combine them into a single vector for each channel or time period. Running the chosen clustering algorithm then assigns each data point (channel or date) a cluster label, grouping similar records together. Finally, we analyze and interpret each cluster to understand its performance characteristics. High-performing clusters might demonstrate high returns relative to spend, making them strong candidates for increased budget allocation. Conversely, low-performing clusters may indicate poor returns, suggesting areas where budget cuts or optimizations are needed.

CREATE TABLE IF NOT EXISTS calculated_margins AS
WITH marginal_return AS (
    SELECT
        date,
        channel,
        spend,
        revenue,
        promo_activity,
        economic_condition,
        LAG(revenue) OVER (PARTITION BY channel ORDER BY date) AS prev_revenue,
        LAG(spend) OVER (PARTITION BY channel ORDER BY date) AS prev_spend
    FROM marketing_data
)
SELECT
    date,
    channel,
    spend,
    revenue,
    promo_activity,
    economic_condition,
    COALESCE((revenue - prev_revenue) / NULLIF(spend - prev_spend, 0), 0) AS marginal_return
FROM marginal_return
WHERE prev_spend IS NOT NULL;

-- Step 0: Split Data into Training (80%) and Testing (20%)

-- Assuming calculated_margins is the table with data including marginal returns per channel
CREATE TABLE channel_data_split AS
SELECT *,
    CASE
        WHEN RAND() < 0.8 THEN 'train' ELSE 'test' END AS data_split
FROM calculated_margins;

-- Step 1: Create the K-Means Model on the Training Data Only
CREATE OR REPLACE MODEL channel_performance_clustering
TRANSFORM(vector_assembler(array(avg_spend, avg_revenue, avg_marginal_return)) AS features)
OPTIONS (
    MODEL_TYPE = 'KMEANS',
    NUM_CLUSTERS = 4,               -- Adjust the number of clusters based on analysis needs
    MAX_ITER = 20                   -- Set maximum iterations for convergence
)
AS 
SELECT 
    channel,
    AVG(spend) AS avg_spend,
    AVG(revenue) AS avg_revenue,
    AVG(marginal_return) AS avg_marginal_return
FROM 
    channel_data_split
WHERE data_split = 'train'
GROUP BY 
    channel;

-- Step 2: Store Clustering Results for Training Data
-- Create a table to store the clusters and predictions for each channel based on performance
-- Create a table to store predictions for the test data, which will be used for evaluating the model
--CREATE TABLE IF NOT EXISTS channel_cluster_train AS
SELECT *
FROM MODEL_PREDICT(channel_performance_clustering, 1, 
    SELECT 
        channel,
        AVG(spend) AS avg_spend,
        AVG(revenue) AS avg_revenue,
        AVG(marginal_return) AS avg_marginal_return
    FROM channel_data_split
    WHERE data_split = 'train'
    GROUP BY channel
);


-- Step 3: Store Clustering Results for Testing Data
-- Create a table to store predictions for the test data, which will be used for evaluating the model
--CREATE TABLE IF NOT EXISTS channel_cluster_test AS
SELECT *
FROM MODEL_PREDICT(channel_performance_clustering, 1, 
    SELECT 
        channel,
        AVG(spend) AS avg_spend,
        AVG(revenue) AS avg_revenue,
        AVG(marginal_return) AS avg_marginal_return
    FROM channel_data_split
    WHERE data_split = 'test'
    GROUP BY channel
);

The results are for training data:

The results for the test data are:

The clustering results from the training and test data provide insights into how different marketing channels are segmented based on their performance metrics (average spend, average revenue, and average marginal return). Let’s interpret the clustering assignments and what they suggest about each channel’s performance characteristics.

Training Data Clustering Results

  1. Cluster Assignments:

    • Paid Search: Assigned to cluster 1 with an average spend of approximately 5003.84, average revenue of 8757.87, and an average marginal return of 1.43.

    • Display Ads: Assigned to cluster 2 with an average spend of around 5001.57, average revenue of 8753.31, and an average marginal return of 0.96.

    • Social Media: Assigned to cluster 0 with an average spend of 4998.77, average revenue of 8749.15, and an average marginal return of 1.67.

    • Email Marketing: Assigned to cluster 3 with an average spend of 5003.85, average revenue of 8755.74, and an average marginal return of 1.80.

  2. Cluster Interpretations:

    • Cluster 1 (e.g., Paid Search): Channels in this cluster have moderate marginal returns and generate a reasonable balance between spend and revenue. They are effective but may not maximize revenue as efficiently as higher-return clusters.

    • Cluster 2 (e.g., Display Ads): This cluster has the lowest marginal return (0.96), indicating that these channels might not be cost-effective, as they generate less than a dollar in revenue for each dollar spent. Channels in this cluster may need optimization or reduced spending.

    • Cluster 0 (e.g., Social Media): Channels in this cluster show good performance, with a high marginal return (1.67), indicating that these channels are more effective at converting spend into revenue.

    • Cluster 3 (e.g., Email Marketing): This cluster has the highest marginal return (1.80), suggesting that email marketing is the most efficient channel in terms of revenue generation. Channels in this cluster should be prioritized for budget allocation due to their high ROI.

Test Data Clustering Results

  1. Cluster Assignments:

    • Paid Search: Assigned to cluster 1 with an average spend of 5009.28, average revenue of 8773.25, and an average marginal return of 1.39.

    • Display Ads: Assigned to cluster 0 with an average spend of 4999.25, average revenue of 8745.20, and an average marginal return of 2.35.

    • Social Media: Assigned to cluster 1 with an average spend of 5014.96, average revenue of 8781.21, and an average marginal return of 1.44.

    • Email Marketing: Assigned to cluster 1 with an average spend of 5012.91, average revenue of 8775.92, and an average marginal return of 1.59.

  2. Cluster Interpretations:

    • In the test data, Display Ads is uniquely assigned to cluster 0 with a significantly higher marginal return (2.35). This suggests that display ads might perform better under certain conditions or periods, achieving a much higher return than observed in the training data. This might be due to factors like specific promotional activity or seasonal influences during the test period.

    • Cluster 1 for the test data includes Paid Search, Social Media, and Email Marketing, all with moderate-to-high marginal returns ranging from 1.39 to 1.59. This indicates that, for this test period, these channels perform efficiently and consistently, converting each dollar spent into more than a dollar in revenue.

Analysis

  • Performance Consistency Across Periods: Email Marketing and Social Media generally remain in high-performing clusters across both training and test periods, suggesting stable effectiveness. These channels should be considered for higher budget allocations.

  • Cluster Variability for Display Ads: Display Ads shows significant variability, shifting from a low-performing cluster (2 in training) to a high-performing cluster (0 in test). This might indicate that Display Ads’ effectiveness is highly influenced by external factors (e.g., timing, promotional campaigns). Further investigation could help pinpoint specific conditions that make this channel more effective.

  • Budget Allocation Strategies

    • Channels in high-performing clusters with high marginal returns (such as Email Marketing and Social Media in the training period, and Display Ads in the test period) should be prioritized for budget increases.

    • Channels in lower-performing clusters with marginal returns below 1 (such as Display Ads in the training period) should either undergo optimization or receive reduced budgets unless specific conditions that improve their performance can be identified.

Optimal Budget Optimization Across Channels: A Simple Approach

Our goal now is to create an optimization model to maximize overall marginal return across all channels within a given budget.

To create an optimization model for budget allocation across marketing channels, we’ll design a model that maximizes the overall marginal return within a set budget constraint. Here, we’ll use linear programming techniques (since marginal return can typically be approximated as a linear function of budget) to distribute the budget across channels in an optimal way. This model will suggest how much budget to allocate to each channel to maximize total return, subject to budget limits and any specified minimum or maximum spend requirements.

Currently, Data Distiller does not support linear programming models natively. However, budget allocation across omni-channel campaigns can still be achieved using combination techniques, which are effective in this context.

To set up a budget optimization, we’ll generate combinations of budget allocations across channels within a defined range and filter out combinations that meet the total budget constraint and other conditions. This approach allows us to approximate an optimal allocation across four channels (email_marketing, paid_search, social_media, and display_ads) to maximize returns based on each channel's marginal return.

Assumptions

  • Total Budget Constraint: The combined spend across all channels cannot exceed a specified total budget.

  • Minimum and Maximum Spend per Channel: Each channel has a predefined minimum and maximum budget allocation.

  • Marginal Return for Each Channel: Each channel has an expected marginal return, which allows us to calculate the total return for each combination.

The objective function that we have here is:

Total Return=(email_marketing×1.8)+(paid_search×1.4)+(social_media×1.5)+(display_ads×1.0) \text{Total Return} = (\text{email\_marketing} \times 1.8) + (\text{paid\_search} \times 1.4) + (\text{social\_media} \times 1.5) + (\text{display\_ads} \times 1.0) Total Return=(email_marketing×1.8)+(paid_search×1.4)+(social_media×1.5)+(display_ads×1.0)
WITH email_marketing_values AS (
    SELECT explode(sequence(1000, 5000, 500)) AS email_marketing
),
paid_search_values AS (
    SELECT explode(sequence(1500, 6000, 500)) AS paid_search
),
social_media_values AS (
    SELECT explode(sequence(1200, 4500, 500)) AS social_media
),
display_ads_values AS (
    SELECT explode(sequence(800, 3000, 500)) AS display_ads
),
PossibleAllocations AS (
    SELECT 
        em.email_marketing,
        ps.paid_search,
        sm.social_media,
        da.display_ads,
        (em.email_marketing * 1.8 + ps.paid_search * 1.4 + sm.social_media * 1.5 + da.display_ads * 1.0) AS total_return,
        (em.email_marketing + ps.paid_search + sm.social_media + da.display_ads) AS total_spend
    FROM 
        email_marketing_values AS em
    CROSS JOIN 
        paid_search_values AS ps
    CROSS JOIN 
        social_media_values AS sm
    CROSS JOIN 
        display_ads_values AS da
)
SELECT 
    email_marketing, 
    paid_search, 
    social_media, 
    display_ads,
    total_spend, 
    total_return
FROM 
    PossibleAllocations
WHERE 
    total_spend <= 15000  -- Total budget constraint
ORDER BY 
    total_return DESC
LIMIT 100;  -- Select the allocation with the highest return

Remember that although we get lots of combinations that satisfy the budget constraint (and better), you need to choose the highest return i.e the top result which can be got by changing LIMIT 100 to LIMIT 1.

The results are the following:

The additional rows display the top 100 budget allocations sorted by total return in descending order, with returns ranging from 22,500 to 23,100. This range suggests there are multiple budget allocation combinations that yield returns close to the maximum. In some cases, slightly adjusting the budget between channels—such as increasing allocation for paid_search while reducing spend on social_media or display_ads—still meets the budget constraint and achieves a high return, though not as high as the optimal allocation. Notably, email_marketing and paid_search consistently receive high allocations (around 4,500 to 5,000) across the top results, indicating their higher marginal returns and prioritization in the budget. In contrast, social_media and display_ads exhibit more variability in their allocations, suggesting these channels are adjusted as needed to meet the total budget constraint while maximizing total return.

There are multiple combinations and we can do a tradeoff here. However, if we seek the optimized solution, we get:

The optimal allocation with the highest return shows the budget distribution across the four channels—email_marketing, paid_search, social_media, and display_ads—that maximizes the total return within the specified budget constraint. In this allocation, email_marketing and paid_search each receive a budget of 5,000, social_media receives 4,200, and display_ads is allocated 800. The total spend is exactly 15,000, which meets the budget constraint, and the total return achieved is 23,100, representing the maximum achievable return within these constraints.

Analysis

  • Prioritization of Channels:

    • Email Marketing and Paid Search consistently receive high allocations across the top results, implying that they likely have higher marginal returns relative to other channels. It would be beneficial to prioritize these channels when allocating future budgets.

    • Social Media and Display Ads should be treated as flexible channels, where spending can be adjusted to optimize within the remaining budget after prioritizing the higher-return channels.

  • Budget Flexibility:

    • For budgets slightly below 15,000, you might still achieve close to optimal returns by slightly reducing spend on lower-return channels like Display Ads or Social Media.

    • This flexibility can be useful for scenarios where the budget fluctuates or where strict budget adherence is essential.

  • Maximizing Returns within Constraints:

    • The results provide a clear strategy for allocating the budget across multiple channels to achieve the highest possible return, given the budget constraint and marginal return values for each channel.

    • This approach can be repeated with updated marginal returns or adjusted budget limits to refine allocation strategies based on real-time data or changing economic conditions.

PREP 500: Ingesting CSV Data into Adobe Experience Platform
55MB
marketing_data.csv
Data harmonization is an assumed step.
Marginal returns calculation
Overall breakeven analysis
Date filtering to the holidays
Campaign launch date range.
Factor analysis.
Training data clusters.
Test data clusters.
Optimization solution.
Optimal budget allocation
Page cover image