ENRICH 300: Recency, Frequency, Monetary (RFM) Modeling for Personalization with Data Distiller
Learn how to leverage RFM modeling to enhance real-time customer personalization and drive targeted marketing strategies.
Understanding customer behavior is crucial for optimizing marketing strategies, and a variety of models exist to help businesses do just that. One of the most well-known is RFM (Recency, Frequency, Monetary), which segments customers based on their purchasing patterns, but it's just the beginning. Other models, such as Customer Lifetime Value (CLV) and Propensity Models, provide deeper insights into customer value, loyalty, and engagement. These models, along with tools like Customer Satisfaction (CSAT) and Behavioral Segmentation, allow businesses to tailor marketing strategies, whether in B2C or B2B contexts. By leveraging these analytical frameworks, companies can focus on the most relevant customer groups, improve personalization, and drive sustainable growth through data-driven decision-making.
Marketers use these models to gain deeper insights into customer behavior, segment audiences effectively, and optimize marketing strategies. These models help in several key areas:
Personalization: Models like RFM allow marketers to target the right customers with tailored messages based on their purchase history, engagement, and value to the business.
Resource Allocation: By identifying high-value customers, marketers can prioritize resources and efforts on the most profitable segments or those needing retention strategies.
Improved Customer Experience: Models like RFE (a variation of RFM) help marketers understand how engaged customers are and how likely they are to recommend the brand, guiding improvements in customer experience.
Data-Driven Decision Making: These models turn complex data into actionable insights, enabling marketers to make informed decisions, such as which segments to focus on, which campaigns to run, and how to optimize customer journeys.
Maximizing ROI: By using models to focus on the most promising customer groups, marketers can enhance the efficiency of their campaigns, leading to better returns on marketing investments.
RFM, shorthand for Recency (R), Frequency (F), and Monetary (M), represents a data-driven approach to customer segmentation and analysis. This methodology delves into three pivotal dimensions of customer behavior: the recency of purchase, the frequency of engagement, and the monetary value spent. Through the quantification of these parameters, businesses attain valuable insights into distinct customer segments, empowering the formulation of customized marketing strategies that effectively cater to individual customer needs.
RFE (Recency, Frequency, Engagement) is similar to RFM but emphasizes how recently and frequently a customer engages with the brand or product, without focusing on monetary value. It is commonly used in subscription or engagement-driven models where customer interaction is a key metric. The main factors it measures are user activity, interactions, and time spent with the brand.
The RFM model classifies customers based on their transactional behaviors, utilizing three key parameters:
Recency gauges the time elapsed since a customer's last purchase, providing insights into engagement levels and future transaction potential.
Frequency assesses the frequency of customer interactions, serving as an indicator of loyalty and sustained engagement.
Monetary value measures the total spending of customers, emphasizing their value to the business.
The combination of these factors enables businesses to assign numerical scores to each customer, typically on a scale from 1 to 4, where lower scores signify more favorable outcomes in our specific use case. For instance, a customer scoring 1 in all categories is deemed the "best," showcasing recent activity, high engagement, and substantial spending.
Derived from research in direct mail marketing, RFM analysis aligns with the Pareto Principle, suggesting that 80% of sales emanate from 20% of customers. Employing the RFM model allows businesses to adeptly segment their customer base, predict future purchasing behaviors, and tailor marketing initiatives to optimize engagement and profitability.
While RFM is often associated with B2C marketing due to its focus on customer behavior and purchasing patterns, RFM can also be highly valuable in B2B (Business-to-Business) contexts.
In B2B, RFM can be adapted to measure the activity of business clients based on things like:
Recency: How recently a client engaged with your company, whether through a purchase, inquiry, or other forms of communication.
Frequency: How often a client engages with your business, attends meetings, or makes purchases.
Monetary: The financial value of the client’s transactions or deals over time.
For example, B2B use cases can use RFM to segment clients based on their purchasing behavior or engagement levels, helping to inform account management, upsell opportunities, and personalized marketing strategies. The core principles of RFM are flexible enough to apply to both B2B and B2C environments.
RFM proves invaluable for comprehending customer dynamics and refining marketing strategies, with key advantages including:
Enhanced Revenue through Precision Targeting
Tailoring messages and offers to specific customer segments optimizes revenue by boosting response rates, retention, satisfaction, and Customer Lifetime Value (CLV).
Effectively predicts future customer behavior by leveraging recency, frequency, and monetary metrics.
Allows precise messaging alignment, optimizing recommendations for frequent high-spenders and fostering loyalty among smaller spenders.
Objective Customer Segmentation and Decision Support
Provides an objective, numerical depiction of customers, simplifying segmentation without necessitating advanced expertise or software.
Assigns rankings on a scale, with lower rankings indicating a higher likelihood of future transactions.
Facilitates easy interpretation of intuitive outputs, supporting decision-making and strategy formulation.
Insights into Revenue Sources and Customer Dynamics
Offers insights into revenue sources, underscoring the significance of repeat customers and guiding efforts to enhance customer satisfaction and retention.
Emphasizes the need for balancing customer engagement, ensuring top customers are not over-solicited while nurturing lower-ranking customers through targeted marketing efforts.
Like any other approach, RFM also has limitations:
Simplicity and Generalization: RFM provides a straightforward framework but may oversimplify customer behavior, assuming uniformity within segments based on recency, frequency, and monetary values.
Equal Weighting of Factors: The model assigns equal importance to recency, frequency, and monetary values, potentially misrepresenting customer value as one factor might be more critical than another in certain cases.
Limitations in Contextual Understanding: RFM lacks consideration for context, failing to account for product-specific characteristics or nuances in customer preferences, resulting in potential misinterpretations of purchasing behaviors.
RFM (Recency, Frequency, Monetary) segments can be dynamically integrated into real-time personalization strategies by leveraging customer behaviors to tailor interactions instantly. As customer data is updated in real time, businesses can adjust their personalization efforts based on the latest RFM scores. For example, a customer who recently made a high-value purchase might see personalized product recommendations or loyalty rewards immediately upon their next visit, while a less engaged customer could receive a targeted offer or incentive to re-engage. This real-time adaptation ensures that customers receive highly relevant and timely content, enhancing their overall experience and increasing the likelihood of conversions.
Once these attributes or base segments are created in Real-Time Customer Profile, they become available for personalization at the Edge (e.g., Adobe Target, Offer Decisioning) and for Streaming Activation through platforms like Adobe Journey Optimizer and Streaming Destinations.
Luma has recently opened a new website in a new country selling only 7 products. The prices are shown below.
Radiant Tee: $20
Breathe-Easy Tank: $25
Hero Hoodie: $50
Aspire Fitness Jacket: $80
Quest Lumaflex Band: $15
Push It Messenger Bag: $45
Overnight Duffle: $60
Users explore the website to browse various products and have the option to log in with their email address at any time. As they navigate, they can add items to their cart, proceed to checkout, place an order, and receive a web order confirmation. Some users may also choose to call the toll-free number to cancel their order. Additionally, users often manage their cookies, frequently clearing them. A portion of these users participate in the loyalty program. To add an extra layer of privacy, all identifying information has been anonymized using Data Distiller.
As the Marketing Manager at "The Luma Store," your aim is to target customers based on their past behavior using RFM segmentation. This involves ranking customers by their recency, frequency, and monetary value scores on a scale of 1 to 4. The RFM model assigns each customer a score for these three factors, with 1 being the highest and 4 the lowest. Your goal is to construct an effective marketing strategy by creating customer segments. You have been assigned some requirements:
A customer can only belong to one of the 6 segments. This is not a hard requirement in practice, but the marketing department wants to tailor a consistent message to their customer by ensuring that they belong to a single segment.
Customers should be bucketed into the following 6 segments in the following priority order:
Core - Your Best Customers
Highly ranked in every category, these customers respond well to loyalty programs
They transact frequently, spend generously, and exhibit brand loyalty.
On a scale of 1 to 4, these would rank the highest among all the dimensions, i.e. Recency=1, Frequency=1 and Monetary=1.
Loyal - Your Loyal Customers
Customers with top scores for frequency, indicating frequent transactions
Although they may not be the highest spenders, they exhibit consistent loyalty
On a scale of 1 to 4, these would rank the highest along the Frequency dimension, i.e. Frequency=1 for all values of Recency and Monetary.
Whales - Your Highest-Paying Customers
Customers with top marks for monetary value, signifying high spending.
On a scale of 1 to 4, these would rank the highest along the Monetary dimension, i.e. Monetary=1 for all values of Recency and Frequency.
Promising - Your Faithful Customers
Customers who transact frequently but spend less compared to other segments.
In this case, we will assume that they are frequent i.e. Frequency is (1,2,3) and spend not so much i.e. Monetization is (2,3,4).
Rookies - Your Newest Customers
Newest customers who have recently transacted but have low frequency scores.
In this case, we will assume that they are very recent i.e. Recency is 1 with lowest frequency i.e. Frequency is 4.
Slipping - Once Loyal, Now Gone
Formerly loyal customers who have become inactive or less frequent.
Presents an opportunity for retention efforts, such as discount pricing and exclusive offers, to win them back.
In our case, we will assume Recency is (2,3,4) and Frequency is lowest equal to 4.
While these requirements might seem like a simple assignment in this tutorial, this is exactly the type of analysis and requirements generation your marketing team should be doing.
First, you'll need to establish an RFM scale and determine the level of granularity for each dimension—how many categories will be used for Recency, Frequency, and Monetary value.
Next, you'll define how customers are categorized into these segments. In our example, the criteria are structured to ensure that customer segments don’t overlap. This was done deliberately to prevent conflicts in personalization strategies.
Additionally, pay attention to the taxonomy—the naming of segments plays a key role in aligning your team around these well-recognized foundational segments. Clear and consistent segment names help foster a shared understanding and focus, ensuring that everyone is on the same page when strategizing and executing marketing efforts.
As a marketer, you’re not expected to be writing or understanding SQL all day. The whole purpose of RFM (Recency, Frequency, Monetary) analysis is to have these attributes prepared so you can use them for audience analysis, activation, and personalization. Typically, data engineers, architects, or your marketing ops team will handle the technical work, while you’ll focus on consuming and applying the results. That’s even more reason to be kind to your data teams!
But if you're curious about SQL, don’t worry—it’s not as hard as it seems. SQL operates on similar principles to working with Excel. The main limitation of Excel is that it struggles with large, complex datasets and can’t handle high volumes of events. That’s why tools like Data Distiller exist, designed to process trillions of records in one go.
Keep in mind that all the RFM attributes created in Data Distiller are automatically added to the Real-Time Customer Profile. Once they’re in there, they become available for audience creation and activation across social media and paid media channels. They’re also ready to use as audiences in Adobe Journey Optimizer. And here’s the real advantage: these attributes are available for edge personalization through Adobe Target or even Offer Decisioning.
Also, RFM attributes are calculated for each individual customer. You can also add this data as a lookup table in Customer Journey Analytics, allowing you to analyze every journey within the context of RFM attributes.
Lastly, the same RFM attributes can be used to enrich the B2B Real-Time Customer Profile, which enables account segmentation and personalization of buying groups in Adobe Journey Optimizer's B2B edition. Essentially, this means that the entire Adobe DX (Digital Experience) portfolio can be activated using these attributes. Whether it's for precise account-based marketing, personalized experiences, or optimizing journeys for B2B audiences, these RFM attributes play a crucial role in driving effective personalization and engagement across Adobe’s ecosystem.
So, the big question you should be asking your data team isn’t how to build the RFM attributes, but rather how to gain access to them. Specifically, you should ask what data they are calculated on, how frequently they are updated, and how fresh the data is. Understanding these factors will help ensure that your audience analysis, segmentation, and personalization strategies are based on up-to-date and relevant insights.
But just in case you want to know how the SQL works, read on.
Here are the steps we will follow:
We will start by exploring the web transaction data to gain insights into essential fields such as customer ID, timestamps, and order totals.
Once the data is fully understood, we will calculate RFM metrics for each customer: Monetary (M), representing the total amount spent; Frequency (F), counting the number of purchases; and Recency (R), measuring the days since the most recent purchase. Each RFM dimension will be divided into quartiles, resulting in 64 distinct segments in this three-dimensional space.
We'll then visualize the distribution of these segments using dashboards to ensure accuracy.
Once verified, we will automate the process of updating the Real-Time Customer Profile or Customer Journey Analytics. This segmentation will enable the creation of audience profiles based on marketing requirements, enhancing the Real-Time Customer Profile with RFM attributes for more personalized marketing and engagement strategies.
If you are unfamiliar with certain concepts in Adobe Experience Platform, it is recommended that you review the tutorial provided below:
The data has been generated in CSV format to capture the essence of the use case. In practice, you would typically source this data from Adobe Analytics, Adobe Commerce, or Adobe Web/Mobile SDK. The key takeaway is that you'll need to apply the techniques outlined in this tutorial to extract the relevant events and fields into a canonical CSV format using Data Distiller. The main goal is to work with only the necessary fields and keep the data as flat as possible, while maintaining practicality.
Download the above data locally.
Name the dataset as luma_web_dataset and follow the steps outlined here:
Since we are loading the CSV file directly, there is no need to create an XDM schema (whether it's record, event, or other B2B styles). Instead, we will be working with an Ad Hoc schema. While Data Distiller can work with any schema, when we prepare the final dataset for hydration into the Real-Time Customer Profile, we will use a Record XDM schema.
Data verification and exploration involve executing SELECT queries to inspect, validate, and analyze data to ensure that it has been accurately translated during the ingestion process. This process helps identify any discrepancies, inconsistencies, or missing information in the data.
Let us access the Data Distiller Query Pro Mode Editor and execute the following query:
Navigate to Queries->Create Query
Paste and execute the following query:
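A minimal exploration query might look like the following sketch. It assumes the dataset was ingested under the name luma_web_dataset as described above; the LIMIT value is an arbitrary choice for illustration.

```sql
-- Inspect a sample of the ingested web events.
-- luma_web_dataset is the dataset name chosen earlier in this tutorial.
SELECT *
FROM luma_web_dataset
LIMIT 100;  -- arbitrary cap, adjust or remove as needed
```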
Observe the following in the results:
The products column is the list of products associated with the event type.
The first 9 records from the top of the result set actually map out a typical customer journey that started with some browsing and ended in a purchase.
Observe how a Purchase ID gets attached at the order step as purchase_id.
If you scroll further down, you will see some of the customers have a loyalty ID associated with them.
The list of products is provided as a comma-separated list. While this isn't relevant for the RFM tutorial, if we were conducting a product affinity analysis, flattening this data would be a key step.
Remember, our RFM model only focuses on the recency, frequency, and monetary value of all purchases made. We are not concerned with engagement (page views) or the checkout process. We must also exclude all orders that were cancelled, because they do not contribute to a valid calculation. We would need to deal with cancellations differently.
First we will create a Data Distiller View. Copy and execute the following SQL in the Data Distiller Query Pro Mode Editor:
Remember, we are selecting all the non-null purchase IDs that had a cancellation associated with them and aggregating them with a GROUP BY. The purchase IDs that we get as a result set need to be excluded from our dataset.
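A sketch of such a view is shown below. Only luma_web_dataset and purchase_id come from this tutorial. The column name event_type, the event value 'cancellation', and the view name orders_cancelled are assumptions made for illustration and should be replaced with whatever your data actually uses.

```sql
-- Collect every purchase ID that has at least one cancellation event.
-- ASSUMPTION: event_type, 'cancellation', and orders_cancelled are
-- hypothetical names; substitute the ones present in your dataset.
CREATE VIEW orders_cancelled AS
SELECT purchase_id
FROM luma_web_dataset
WHERE event_type = 'cancellation'
  AND purchase_id IS NOT NULL      -- ignore events with no purchase attached
GROUP BY purchase_id;              -- one row per cancelled purchase
```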
VIEWs behave like virtual datasets and so naming them helps in reusing them throughout the code.
Then we will select the purchase IDs that are not in the view and retain them.
As you type multiple queries into the Data Distiller Query Pro Mode Editor, make sure you highlight and execute the query of interest:
Let us now exclude all events that are not orders:
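Putting both filters together, the query could look roughly like this. As before, event_type, the value 'order', and the view orders_cancelled are hypothetical names used only for this sketch.

```sql
-- Keep only order events whose purchase ID was never cancelled.
-- ASSUMPTION: event_type / 'order' / orders_cancelled are illustrative names.
SELECT *
FROM luma_web_dataset
WHERE event_type = 'order'
  AND purchase_id NOT IN (SELECT purchase_id FROM orders_cancelled);
```

Because the cancelled-orders view filters out NULL purchase IDs, the NOT IN comparison does not run into the usual NULL pitfall where a single NULL in the subquery makes the predicate match nothing.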
You should now have the result set on which we will create the RFM model.
At this point in time, it is a good idea to name the query as a template RFM_{YourName}. Just click the arrow button at the bottom right to create a Data Distiller Template. You can also click the menu icon at the top left corner to make more space for the editor.
If you leave the Data Distiller Editor inactive for more than 30 minutes, you'll encounter a notification that the database connection has been lost when you try to use it again. This happens because the system requires you to refresh the page to re-establish the connection. To avoid losing any work, be sure to save your template before refreshing the page. Remember to execute all the SQL code that has temp tables as those are only persisted for the session.
If you want to delete a view then use the following syntax:
DROP VIEW IF EXISTS order_data;
But remember that VIEWs have dependencies: if a view is being used within other views, then you will need to drop those views first. For this, you will need to manually examine the code or follow the hints from the error message itself, i.e. it will list the dependent views.
To start the development of an RFM model, the first step is to calculate three scores for each customer: Recency, Frequency, and Monetary value. These scores are derived from raw data collected through customer interactions and past purchase transactions. Just as a recap:
Recency reflects the time elapsed since the customer's last purchase, considering their entire history with us.
Frequency denotes the total number of purchases made by the customer over their entire history.
Monetary represents the overall amount of money spent by the customer across all transactions during their entire tenure with us.
Let's delve into how we can leverage the raw data to compute these essential scores.
We are augmenting the query developed in the previous section by choosing the email address as our userid, as every order requires an email login. We also use the TO_DATE row-level function in Data Distiller to convert the timestamp to a date. The total_revenue currently reflects the price for each individual transaction. Later, we will aggregate this value by summing it up for each email ID.
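The projection described above can be sketched as follows. TO_DATE, purchase_id, total_revenue, and purchase_date are taken from this tutorial; the source column names email, price_total, timestamp, and event_type, as well as the view orders_cancelled, are assumptions for illustration.

```sql
-- One row per valid order, reduced to the fields RFM needs.
SELECT
    email              AS userid,         -- ASSUMPTION: email column name
    purchase_id        AS purchaseid,
    price_total        AS total_revenue,  -- ASSUMPTION: per-order price column
    TO_DATE(timestamp) AS purchase_date   -- ASSUMPTION: event timestamp column
FROM luma_web_dataset
WHERE event_type = 'order'                -- ASSUMPTION: see lead-in
  AND purchase_id NOT IN (SELECT purchase_id FROM orders_cancelled);
```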
The results should look like this:
Next, we will create a TEMP TABLE (temporary table) to cache the results of the previous query for the duration of our session. Unlike VIEWs, which execute the underlying query each time they are called, TEMP TABLEs store the data in memory, similar to how tables are persisted in the AEP Data Lake. Utilizing TEMP TABLEs and VIEWs enhances the modularity and readability of your code.
Remember that TEMP TABLEs (a feature of Data Distiller) use the Ad Hoc Query Engine and hence do not use up the Batch Query Engine. This means all of the above data exploration can happen without using the Batch Query Engine as long as the query is within reason, i.e. it completes before the 10-minute timeout. If you have a very large dataset, you should explore the ANALYZE TABLE command to create dataset samples. The only problem with TEMP TABLEs is that they cannot be used as part of materializing the data in the data lake, which makes them well suited for data exploration tasks only.
Copy and paste the following command to create a TEMP TABLE:
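A temporary table caching the order-level result could be created along these lines. The table name order_data_temp and the source column names (email, price_total, timestamp, event_type) are hypothetical, as is the orders_cancelled view.

```sql
-- Cache the cleaned order rows for the rest of the session.
-- ASSUMPTION: order_data_temp and the source column names are illustrative.
CREATE TEMP TABLE order_data_temp AS
SELECT
    email              AS userid,
    purchase_id        AS purchaseid,
    price_total        AS total_revenue,
    TO_DATE(timestamp) AS purchase_date
FROM luma_web_dataset
WHERE event_type = 'order'
  AND purchase_id NOT IN (SELECT purchase_id FROM orders_cancelled);
```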
The result will be the following:
Since we will be materializing the results later, we will be using VIEWs instead of TEMP TABLEs.
Copy, paste, and execute the following query:
The results will be:
DATEDIFF(CURRENT_DATE, MAX(purchase_date)) AS days_since_last_purchase calculates the number of days between today and the customer's most recent purchase date.
Create a VIEW to simplify the code:
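As a sketch, the per-customer aggregation wrapped in a view might read as follows. The DATEDIFF expression and the output column names come from this tutorial; the view name rfm_values is hypothetical, and order_data is assumed to be a view exposing userid, purchaseid, total_revenue, and purchase_date for valid orders.

```sql
-- One row per customer with the raw R, F, and M values.
-- ASSUMPTION: rfm_values is an illustrative name; order_data is assumed
-- to hold one row per valid (non-cancelled) order.
CREATE VIEW rfm_values AS
SELECT
    userid,
    DATEDIFF(CURRENT_DATE, MAX(purchase_date)) AS days_since_last_purchase, -- Recency
    COUNT(purchaseid)                          AS orders,                   -- Frequency
    SUM(total_revenue)                         AS total_revenue             -- Monetary
FROM order_data
GROUP BY userid;
```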
We have 4 slots for each dimension, and we need to arrange all the values into 4 bins from highest to lowest.
Copy paste and execute the following SQL code:
The NTILE window function is a way to divide data into equal-sized groups, or "buckets." In our query, it helps categorize customers into 4 equal groups (quartiles) based on their recency, frequency, and monetization values:
Frequency: Customers are ranked based on how many purchases they've made, i.e. orders. The ones with the most orders are placed in group 1, and those with the fewest orders are in group 4.
Monetization: This column ranks customers by how much total revenue they've generated (total_revenue). The highest spenders are placed in group 1, and the lowest spenders are in group 4.
Recency: The query ranks all customers based on how long it's been since their last purchase (days_since_last_purchase). It divides them into 4 groups, where the customers who purchased most recently are in group 1, and the ones who haven't purchased for the longest time are in group 4.
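The quartile scoring described above can be sketched like this. NTILE and the column names are taken from this tutorial; the source view name rfm_values is a hypothetical stand-in for the view holding the per-customer values.

```sql
-- Score each customer 1..4 on every dimension, where 1 is best.
-- ASSUMPTION: rfm_values is an illustrative view name.
SELECT
    userid,
    days_since_last_purchase,
    orders,
    total_revenue,
    NTILE(4) OVER (ORDER BY days_since_last_purchase ASC) AS recency,      -- 1 = most recent
    NTILE(4) OVER (ORDER BY orders DESC)                  AS frequency,    -- 1 = most orders
    NTILE(4) OVER (ORDER BY total_revenue DESC)           AS monetization  -- 1 = highest spend
FROM rfm_values;
```

Note that NTILE assigns buckets by row position in the sort order, so customers with identical values that straddle a quartile boundary can end up with different scores. If that matters for your use case, consider adding a tie-breaking column to each ORDER BY.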
The results should look like this:
Let us make sure we create the VIEW for this as well:
Since we have the RFM scores, we can slot them into different segments as per the requirements listed in the case study section.
Observe the use of CASE statements with logical conditions that can be used to set the value of the RFM_Model variable.
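One way to express the six segments in priority order is sketched below. The conditions follow the requirements from the case study section; the source view name rfm_scores and the exact label strings are assumptions for illustration.

```sql
-- Assign exactly one segment per customer. CASE evaluates its branches
-- top to bottom, which implements the priority order of the requirements.
-- ASSUMPTION: rfm_scores and the label strings are illustrative.
SELECT
    userid,
    days_since_last_purchase,
    orders,
    total_revenue,
    recency,
    frequency,
    monetization,
    CASE
        WHEN recency = 1 AND frequency = 1 AND monetization = 1 THEN 'Core'
        WHEN frequency = 1                                      THEN 'Loyal'
        WHEN monetization = 1                                   THEN 'Whales'
        WHEN frequency IN (1, 2, 3) AND monetization IN (2, 3, 4) THEN 'Promising'
        WHEN recency = 1 AND frequency = 4                      THEN 'Rookies'
        WHEN recency IN (2, 3, 4) AND frequency = 4             THEN 'Slipping'
        ELSE 'Unclassified'  -- not expected when all scores are in 1..4
    END AS rfm_model
FROM rfm_scores;
```

Because the first matching branch wins, a customer with Frequency=1 and Monetary=1 but Recency=3 lands in Loyal rather than Whales. Walking through the branches also shows that every combination of scores from 1 to 4 is covered: once Frequency=1 and Monetary=1 are handled, Frequency 2 or 3 falls into Promising and Frequency 4 splits into Rookies or Slipping by Recency.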
The results are shown below:
Create a VIEW to save the RFM segments, scores and values:
An important task at this point is to start visualizing the slices of the RFM cube so that we can get a sense of what the distribution of customers looks like.
First, you need to complete the following prerequisite:
It is recommended that you also read through this:
Let us create a data model so that the Dashboards can recognize the data and allow us to build charts. Copy, paste, and execute the following piece of code:
Let us make sure we understand the above code:
CREATE DATABASE lumainsights: This creates a new database named lumainsights that will store and organize the data for insights.
WITH (TYPE=QSACCEL): The TYPE=QSACCEL indicates that the database is optimized for query acceleration. This is used to improve the speed of dashboard queries, which is crucial for dashboards and analytics use cases where performance is key.
ACCOUNT=acp_query_batch: This specifies the Data Distiller account used for batch query processing. If you do not have the Data Distiller license, this account will not exist.
WITH (TYPE=QSACCEL, ACCOUNT=acp_query_batch) specifies that the database should be created in the Accelerated Store specifically and not on the AEP Data Lake. AEP Dashboards can only work on datasets in the Accelerated Store.
CREATE SCHEMA lumainsights.lumakpimodel: This creates a schema named lumakpimodel under the lumainsights database. A schema is a logical container for organizing database objects like tables and views.
lumainsights.lumakpimodel is the data model, and using the ALTER MODEL command, it is renamed to luma_dash for easy readability in dashboards.
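Assembled from the pieces explained above, the three statements would look roughly like this. The CREATE DATABASE and CREATE SCHEMA clauses are quoted from this tutorial; the RENAME TO form of ALTER MODEL is an assumption about the exact syntax and should be checked against the Data Distiller documentation.

```sql
-- Create a database in the Accelerated Store (not on the data lake).
CREATE DATABASE lumainsights WITH (TYPE=QSACCEL, ACCOUNT=acp_query_batch);

-- Create the schema that acts as the data model for dashboards.
CREATE SCHEMA lumainsights.lumakpimodel;

-- Give the data model a friendlier name for the dashboard dropdown.
-- ASSUMPTION: the RENAME TO clause is the presumed form of ALTER MODEL.
ALTER MODEL lumainsights.lumakpimodel RENAME TO luma_dash;
```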
We first need to create an empty table. Observe the WHERE condition, where a contradiction results in no rows being returned and hence an empty table is created.
Insert the RFM_MODEL_SEGMENT data into this table:
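The two steps can be sketched together as follows. The table name fact_rfm_model and the view RFM_MODEL_SEGMENT come from this tutorial; the column list is assumed to match the view, and 1 = 2 is just one possible contradiction.

```sql
-- Step 1: create an empty table with the right shape in the Accelerated Store.
-- The WHERE clause is always false, so no rows are copied.
-- ASSUMPTION: the column list mirrors the RFM_MODEL_SEGMENT view.
CREATE TABLE IF NOT EXISTS lumainsights.lumakpimodel.fact_rfm_model AS
SELECT
    userid,
    days_since_last_purchase,
    orders,
    total_revenue,
    recency,
    frequency,
    monetization,
    rfm_model
FROM RFM_MODEL_SEGMENT
WHERE 1 = 2;

-- Step 2: load the segment data, keeping the same column order.
INSERT INTO lumainsights.lumakpimodel.fact_rfm_model
SELECT
    userid,
    days_since_last_purchase,
    orders,
    total_revenue,
    recency,
    frequency,
    monetization,
    rfm_model
FROM RFM_MODEL_SEGMENT;
```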
Let us retrieve the results of the query. Observe that we just use the name of the table because this table name is unique across the data lake and the accelerated store. If you fully qualify the table name with the dot notation, i.e. lumainsights.lumakpimodel.fact_rfm_model, you will get the same result.
The results of the query will be the same as the VIEW on the data lake:
We will be using SQL to build charts for our dashboard:
Navigate to the AEP left sidebar and click on Dashboards->Create Dashboard
Name the dashboard as RFM_Dashboard. Click on Query Pro Mode. This will open up the Data Distiller Editor within the context of Dashboard workflows. Click on Enter SQL.
Note that this feature of using SQL to author charts in Query Pro Mode is only available in Data Distiller.
In the Data Distiller Editor that opens, please make sure you choose luma_dash as the data model from the dropdown and execute the following query:
The results will look like this. Click Select.
Choose Marks->Table. Then click on the + and add Header 1. Add Column and keep adding all the attributes. Name the table as RFM by User. You should get a preview that looks like this with 5 columns (instead of all the attributes shown). This is expected as the View More feature in the table will show all the columns and all the rows.
Click on Save and Close. Resize the table widget so that it covers the width of the dashboard. Then click Save. After saving, click Cancel to exit the Edit mode.
Click on the ellipsis and then on View More.
You will get all the records, which you can scroll or paginate through. Click on Download CSV in the top right corner to download up to 500 rows of data per page. If you move to the next page, you can download that data as well.
Click on ViewSQL
As an exercise, create a bar chart titled Users by RFM Segment. Click Edit->Add Widget->Enter SQL. Make sure that luma_dash is chosen as the data model from the dropdown.
Use the following code:
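A query that counts users per segment, suitable as the input for such a bar chart, might look like this. It assumes the fact table exposes the columns userid and rfm_model; the alias user_count is a hypothetical name.

```sql
-- Number of users in each RFM segment, largest segment first.
-- ASSUMPTION: column names userid and rfm_model; user_count is illustrative.
SELECT
    rfm_model,
    COUNT(userid) AS user_count
FROM fact_rfm_model
GROUP BY rfm_model
ORDER BY user_count DESC;
```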
The bar chart can be built like this. This is pretty easy to do and you should try this on your own.
If you click the Export button in the top right corner of the dashboard, you will have the option to print or save the dashboard as a PDF. This is what your dashboard should look like as a PDF:
These dashboards are highly beneficial because the Data Distiller Scheduling feature allows us to automatically generate fresh fact tables as soon as new data is available. For the end marketer, this means they can simply view the dashboards without needing to write any code or perform manual data analysis.
We are now ready to hydrate the Real-Time Customer Profile. First, we will create a new dataset on the data lake and then mark it for Profile.
You can also read up more about the theory behind this here:
Create the empty dataset first. We will need a primary identity as this dataset will be ingested into the Profile Store that needs a partition key.
Make sure that you have Email available as an identity namespace. You can check this here:
Once the dataset is created, you should be able to go to Datasets->Browse->adls_rfm_profile and see that the dataset is empty.
You will also see that it creates a proper XDM Individual Profile Schema with custom fieldgroups if you browse to Schemas->Browse->adls_rfm_profile. You need to copy the tenant name, which is _pfreportingonprod (in my case), at the very top of the schema.
Here is some explanation of what is happening in the code:
userId text: Defines a column named userId of data type text. This column will store the user identifiers. The datatype is string.
PRIMARY IDENTITY NAMESPACE 'Email': This specifies that userId is the primary identity for the records in this table and belongs to the identity namespace 'Email'.
Primary Identity: In Adobe Experience Platform, the primary identity is the unique identifier used to merge customer data across different datasets for the Real-Time Customer Profile.
Identity Namespace 'Email': Indicates that the values in userId are email addresses and belong to the predefined identity namespace for emails. This helps in unifying profiles based on email addresses.
days_since_last_purchase integer: Stores the number of days since the user's last purchase; the datatype is a whole number. The same applies to orders integer, recency integer, frequency integer, and monetization integer.
total_revenue decimal(18, 2): Has a precision of up to 18 digits in total and a scale of 2 digits after the decimal point.
rfm_model text: Holds additional information about the RFM segment applied to the user. The data type is string.
The clause WITH (LABEL = 'PROFILE') indicates that the table is marked as a Profile dataset in Adobe Experience Platform (AEP). Datasets labeled with 'PROFILE' are enabled for Real-Time Customer Profile, meaning that data ingested into these datasets contributes to building unified customer profiles. Additionally, while the Identity Graph/Store processes all records, it will skip reading them if no additional identities (beyond the primary identity) are present. The Identity Graph is designed to identify and associate two or more identities within each attribute or event record, and without such associations, no further action is taken on these records.
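Assembled from the clauses explained above, the table definition would look roughly like this. Every column, datatype, and clause is taken from the explanation; the column order and the IF NOT EXISTS guard are assumptions.

```sql
-- Empty Profile-enabled dataset keyed on email.
-- ASSUMPTION: column order and IF NOT EXISTS are illustrative choices.
CREATE TABLE IF NOT EXISTS adls_rfm_profile (
    userId                   text PRIMARY IDENTITY NAMESPACE 'Email',
    days_since_last_purchase integer,
    orders                   integer,
    total_revenue            decimal(18, 2),
    recency                  integer,
    frequency                integer,
    monetization             integer,
    rfm_model                text
) WITH (LABEL = 'PROFILE');
```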
We will now insert the data from the RFM_MODEL_SEGMENT View into the adls_rfm_profile dataset that has been marked for Real-Time Customer Profile.
This code takes some time to run because it operates in Batch Mode, which involves spinning up a cluster to execute the query. The process includes reading data from the data lake into the cluster, performing the necessary processing, and then writing the results back to the data lake. The cluster spin-up and shutdown process can take several minutes, contributing to the overall execution time. This is typical for batch processing workloads where resources are provisioned dynamically for each job.
Observe that the order of the fields in the SELECT query of the INSERT statement matches, one-to-one, the order of the fields in RFM_MODEL_SEGMENT. This ensures that the values from RFM_MODEL_SEGMENT are inserted correctly into the corresponding fields in the target structure or table. Maintaining this strict alignment is crucial to avoid mismatches between the source and target fields during data insertion.
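The ordering matters because, in standard SQL, INSERT ... SELECT pairs columns by position rather than by name. The sketch below illustrates this with Python's built-in sqlite3 module rather than Data Distiller. The table names source_view and target and the sample values are hypothetical, and it assumes that Data Distiller applies the same positional rule, as the paragraph above implies.

```python
import sqlite3

# In-memory database used purely for illustration (not Data Distiller).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_view (userId TEXT, orders INTEGER, recency INTEGER);
    CREATE TABLE target      (userId TEXT, orders INTEGER, recency INTEGER);
    -- Hypothetical sample row: 12 orders, recency score 3.
    INSERT INTO source_view VALUES ('user0076@example.com', 12, 3);
""")

# Aligned: the SELECT list follows the target column order.
conn.execute("INSERT INTO target SELECT userId, orders, recency FROM source_view")

# Misaligned: orders and recency are swapped in the SELECT list.
# Columns are matched by position, so the values land in the wrong
# target columns and no error is raised.
conn.execute("INSERT INTO target SELECT userId, recency, orders FROM source_view")

rows = conn.execute(
    "SELECT userId, orders, recency FROM target ORDER BY rowid"
).fetchall()
for row in rows:
    print(row)
```

The second row ends up with the orders and recency values exchanged, which is the kind of silent mismatch that strict field alignment prevents.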
The keyword Struct is used because _pfreportingonprod is treated as an object, that is, a structured data type that encapsulates multiple fields. By using Struct, you group the fields (such as userId, days_since_last_purchase, orders, etc.) into a single object so that they can be handled together as a unit. This is useful when you need to insert or manage multiple fields as a single entity within an object such as _pfreportingonprod.
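Conceptually, Struct turns a flat row into a nested object under the tenant namespace. The following Python sketch is an analogy only, not Data Distiller syntax; the function name wrap_in_tenant_struct and the sample values are hypothetical.

```python
def wrap_in_tenant_struct(flat_row: dict, tenant: str = "_pfreportingonprod") -> dict:
    """Nest every field of a flat row under a single tenant-namespace key."""
    return {tenant: dict(flat_row)}

# Hypothetical flat row, as a view might return it.
flat_row = {
    "userId": "user0076@example.com",
    "days_since_last_purchase": 3,
    "orders": 12,
}

nested = wrap_in_tenant_struct(flat_row)
print(nested)
```

After wrapping, the individual fields are no longer top-level columns; they are reached through the tenant object, for example nested["_pfreportingonprod"]["orders"].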
Do not worry about having added data to this dataset for Profile. You can simply delete the dataset or use the DROP TABLE command. Deleting the dataset will remove all corresponding data from the Real-Time Customer Profile, including the Identity Store. This means any graph links or identity associations created from the dataset will also be deleted. It is the fastest and most efficient way to remove data from the Real-Time Customer Profile and ensure that no related data remains in the Identity Graph.
Once the insert has completed, go to Datasets->Browse->adls_rfm_profile to confirm that the dataset contains data. It should have 2000 rows.
To see if the data has been loaded into Profile, navigate to Customer->Profile->Browse. Choose Email as the Identity Namespace and enter the value user0076@example.com.
Navigate to Customer->Audiences->Create Audience->Build Rule
Click on Attributes->XDM Individual Profile
Click on the folder that has the same name as the tenant namespace Pfreportingonprod.
Custom attributes created in Data Distiller can be found in this folder.
You can easily drag and drop the Rfm_Model attribute to begin building an audience. Keep in mind that these attributes can be utilized for Edge, Streaming, and Batch Audiences.
Even though the Profile has been populated, the Rule Builder may not display the attributes. To resolve this, click the settings icon on the Fields sidebar to the left and select the option labeled "Show all XDM Fields."
RFM data can be used as a lookup table in Adobe’s Customer Journey Analytics (CJA) to enhance the analysis of customer behavior. To do this, you would first upload the RFM dataset as a lookup table into CJA. This dataset typically includes key metrics such as Recency (how recently a customer made a purchase), Frequency (how often they purchase), and Monetary (how much they spend). The lookup table should include a common identifier, such as email or customer ID, which will be used to connect the RFM data to other journey datasets in CJA.
Once uploaded, you would configure the lookup relationship by mapping the RFM attributes (e.g., Recency, Frequency, and Monetary scores) to the corresponding customer profile data in CJA. This enables the RFM scores to enrich the event-level journey data, allowing for more granular and targeted analysis. For example, you could analyze how customers with high-frequency scores interact with different touchpoints in their journey, or track conversion rates for high-value customers across different campaigns.
By integrating RFM data as a lookup, you unlock the ability to create segments based on behavioral insights and incorporate them into dashboards, reports, and personalized marketing efforts. Additionally, RFM-enriched data can be utilized in real-time to power dynamic journey flows, enabling personalized experiences based on past behaviors. This method ensures you can continually refine and enhance customer experiences across all channels by leveraging both historical RFM data and real-time journey events in Customer Journey Analytics.
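In CJA the lookup relationship is configured in the product rather than written as code, but the underlying logic is a left join on the shared identifier. The Python sketch below illustrates only that join logic. All data, the enrich function, and the threshold of 4 for a "high" frequency score are hypothetical assumptions.

```python
from collections import defaultdict

# Hypothetical lookup table keyed by email, holding RFM scores.
rfm_lookup = {
    "user0076@example.com": {"recency": 4, "frequency": 4, "monetization": 3},
}

# Hypothetical event-level journey data.
events = [
    {"email": "user0076@example.com", "touchpoint": "email_click", "converted": True},
    {"email": "user0076@example.com", "touchpoint": "web_visit", "converted": False},
    {"email": "unknown@example.com", "touchpoint": "web_visit", "converted": False},
]

def enrich(events: list, lookup: dict) -> list:
    """Attach RFM attributes to each event via the shared email key (left join)."""
    return [{**event, **lookup.get(event["email"], {})} for event in events]

enriched = enrich(events, rfm_lookup)

# Count events per touchpoint for customers with a high frequency score.
HIGH_FREQUENCY = 4  # assumed cutoff, for illustration only
high_frequency_touchpoints = defaultdict(int)
for event in enriched:
    if event.get("frequency", 0) >= HIGH_FREQUENCY:
        high_frequency_touchpoints[event["touchpoint"]] += 1

print(dict(high_frequency_touchpoints))
```

Events whose identifier has no match in the lookup table are kept but carry no RFM attributes, so they fall out of any RFM-based breakdown.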