Predict subscription churn

You can create predictions to fill in missing values in your customer profiles. More complete profile data lets you target your customers more effectively. However, predictions in Customer Insights - Data cover more than missing information; they're often about customer behavior, such as whether a customer might stop using your services.

Customer Insights - Data includes a prebuilt customer churn prediction model that can help predict whether a customer is at risk of no longer using your company's subscription products or services. You configure the customer churn prediction model from the predictions page.

Anyone who configures the model needs contributor permissions in their Customer Insights - Data environment. They should also have a strong understanding of what churn means for your business.

The process to configure the customer churn model is as follows:

Name model - Specifies the name of your prediction model that will be displayed in Customer Insights - Data and the name of the output entity that will be created to store data that is related to your prediction.
Preferences - Defines what constitutes churn for your organization, such as how many days to wait after a subscription end date before it's considered churned, and the time period for predicting attrition before a subscription end date.
Add required data - Defines the relevant fields that the model will use to predict which customers are at a higher risk of churn, including defining subscription details and activities that are used to support the subscription.
Update schedule - Defines how often to retrain the model for your predictions.
Review and run - Allows you to review the prediction details before running the prediction for the first time.

Name your model

The first step in the process is to provide names for your model and the output entity, just as you do when creating a missing-values prediction model. The model name is how you'll identify the model in the application. The output entity is created automatically when the model runs for the first time, and any relevant data, such as scoring details, is written to it.

Define preferences

Two key items must be defined when you're creating the model to ensure that churn and attrition can be accurately predicted. The first is how many days a customer has to renew after a subscription ends before they're considered churned. For example, your organization might not process a renewal until after the current subscription has expired. If that process takes seven days to complete, you would want to provide a seven-day buffer. You provide this value in the Days since subscription ended field.

The second is when the system should start to determine the risk of a customer churning. For example, you might set this value to 90 days to align with your organization's marketing retention efforts. The right value depends heavily on your organization's specific business requirements. Predicting churn risk for longer or shorter periods of time could make it more difficult to address the factors in your churn risk profile.
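The two preferences can be pictured as simple date arithmetic. The following is a minimal sketch of that idea, not the product's implementation: the function names, parameters, and the choice to treat a renewal on the last buffer day as on time are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional


def is_churned(subscription_end: date,
               renewal_date: Optional[date],
               days_since_ended: int) -> bool:
    """Return True if no renewal happened within the buffer after the end date.

    days_since_ended plays the role of the "Days since subscription ended"
    preference. Treating a renewal on the final buffer day as on time is an
    assumption of this sketch.
    """
    deadline = subscription_end + timedelta(days=days_since_ended)
    return renewal_date is None or renewal_date > deadline


def in_prediction_window(subscription_end: date,
                         today: date,
                         window_days: int) -> bool:
    """Return True if the subscription ends within the next window_days days."""
    return today <= subscription_end <= today + timedelta(days=window_days)
```

With a seven-day buffer, a subscription that ended on January 1 and was renewed on January 8 would not count as churned in this sketch, while a renewal on January 9 would.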

Add required data

To ensure that your model can make accurate predictions, you need to specify the subscription and customer activity data that the model will consider when making the prediction. Before you begin this process, make sure that the data sources that contain customer subscription and activity information have been ingested into the application.

The first item that you'll configure is the subscription history information. The data should include subscription identifiers that distinguish the individual subscriptions, along with customer identifiers so that the subscriptions can be matched to your customers. It should also include event dates (start dates, end dates, and the dates on which subscription events occurred) and details about whether a subscription recurs and how often.

When you first configure the subscription history data, it needs to be related to your primary customer entity. Customer Insights - Data will prompt you to define the relationship between the two if one doesn't already exist.

After identifying the entity that you want to use and defining the relationship, you need to map the semantic fields to attributes within your subscription history entity. This approach will help the application better identify information, such as when a subscription ends, when determining churn.

When you're mapping to history data, make sure that your subscription history data has fields that include the following information:

Subscription ID - A unique identifier of a subscription.
Subscription End Date - Date that the subscription expires for the customer.
Subscription Start Date - Date that the subscription starts for the customer.
Transaction Date - Date that a change happened, such as buying or canceling the subscription.
Is it a recurring subscription - A true/false field that determines if the subscription will renew with the same subscription ID without customer intervention.
Recurrence Frequency (in months) - Subscription renewal period represented in months.
Subscription Amount (optional) - The amount of currency that a customer pays for the subscription renewal. It can help identify patterns for different levels of subscriptions.
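Before mapping, it can help to check that your source table has a column for each of the fields listed above. The following is a small sketch of such a check; the column names are hypothetical placeholders rather than names the application requires, and the customer identifier is included because subscriptions must be matched to customers.

```python
# Hypothetical column names; replace them with the names used in your
# own subscription history table.
REQUIRED_SUBSCRIPTION_FIELDS = {
    "customer_id",                  # needed to match subscriptions to customers
    "subscription_id",
    "subscription_start_date",
    "subscription_end_date",
    "transaction_date",
    "is_recurring",
    "recurrence_frequency_months",
}
OPTIONAL_SUBSCRIPTION_FIELDS = {"subscription_amount"}


def missing_subscription_fields(columns):
    """Return the required fields that have no matching column, sorted by name."""
    return sorted(REQUIRED_SUBSCRIPTION_FIELDS - set(columns))
```

An empty list means every required field has a candidate column; the optional amount field is never reported as missing.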

After you have defined the subscription data, complete the same process to add customer activity data. During the process, you'll define an activity type that matches to the type of customer activity that you're configuring.

When mapping your activity data, make sure that the data source includes the following information:

Primary key - A unique identifier for an activity, such as a website visit or a usage record showing that the customer viewed a TV show episode.
Timestamp - The date and time of the event that the primary key identifies.
Event - The name of the event that you want to use. For example, a field called UserAction in a streaming video service could have the value of Viewed.
Details - Detailed information about the event. For example, a field called ShowTitle in a streaming video service could have the value of a video that a customer watched.

Note

You'll need to have at least two activity records for 50 percent of the customers whom you want to calculate churn for. If customer activities are located in multiple entities, you'll need to repeat this process for each entity that contains customer activities.
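One way to check the activity requirement in the note before configuring the model is to compute the share of customers who have at least two activity records. This is a hedged sketch: the function name and inputs are hypothetical, and comparing the result against 0.5 with "greater than or equal" is an interpretation of the note rather than a documented rule.

```python
from collections import Counter


def activity_coverage(customer_ids, activity_customer_ids, min_records=2):
    """Return the fraction of customers with at least min_records activities.

    customer_ids: IDs of the customers you want churn predictions for.
    activity_customer_ids: one customer ID per activity record.
    """
    customers = set(customer_ids)
    if not customers:
        return 0.0
    counts = Counter(activity_customer_ids)
    covered = sum(1 for customer in customers if counts[customer] >= min_records)
    return covered / len(customers)
```

For example, if four customers are in scope and only two of them have two or more activity records, the coverage is 0.5. If activities are spread across several entities, you would combine their customer IDs into one list before calling the function.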

Set schedule and review configuration

Before you can start consuming your prediction model, set a frequency to retrain your model. This setting is important to ensure the accuracy of predictions as new data is ingested in Customer Insights - Data. While every organization will be different, most businesses can retrain once each month and receive good accuracy for their prediction.

Before completing your prediction model, review the configuration to ensure that it has been configured correctly based on your needs and then make changes, as necessary. After all values are configured correctly, you can save and run to begin the prediction process.

Note

Depending on the volume of data that is used in the prediction, the prediction process can take several hours to complete.

Review prediction status and results

After a prediction has been configured and run, you can view it by going to Predictions under the Intelligence area and selecting the My Predictions tab.

Each defined prediction model will display the following information:

Prediction name - The name of the prediction that is provided when you're creating it.
Prediction type - The type of model that is used for the prediction.
Output entity - Name of the entity to store the output of the prediction.
Predicted field - This field is populated only for some types of predictions and isn't used in subscription churn prediction.
Status - The current status of the prediction's run.
- Queued - The prediction is currently waiting for other processes to run.
- Refreshing - The prediction is currently running the score stage of processing to produce results that will flow into the output entity.
- Failed - The prediction has failed.
- Succeeded - The prediction has succeeded.
Edited - The date when the configuration for the prediction was changed.
Last refreshed - The date when the prediction refreshed results in the output entity.

You can perform actions such as editing the prediction model, refreshing the data, viewing the details, or deleting the prediction by selecting the vertical ellipsis.

When you open a prediction to view the results, three primary sections of data will display within the results page: Training model performance, Likelihood to churn (number of customers), and Most influential factors.

Training model performance

The Training model performance section shows a score that indicates the performance of the prediction. This score will help you make the decision to use the results that are stored in the output entity.

The three possible scores are A, B, and C.

Scores are determined based on the following rules:

A - Model accurately predicted at least 50 percent of the total predictions, and the percentage of accurate predictions for customers who churned is greater than the historical average churn rate by at least 10 percent of the historical average churn rate.
B - Model accurately predicted at least 50 percent of the total predictions, and the percentage of accurate predictions for customers who churned is greater than the historical average churn rate by up to 10 percent of the historical average churn rate.
C - Model accurately predicted less than 50 percent of the total predictions, or the percentage of accurate predictions for customers who churned is less than the historical average churn rate.
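The scoring rules above can be written as a short decision function. This is an illustrative sketch only: the function and parameter names are hypothetical, all values are expressed as fractions between 0 and 1, and the handling of exact boundary values (for example, a churn accuracy exactly equal to the historical rate) is an assumption, because the rules as stated don't spell it out.

```python
def performance_grade(overall_accuracy: float,
                      churn_accuracy: float,
                      historical_churn_rate: float) -> str:
    """Map model metrics to an A, B, or C grade following the rules above.

    overall_accuracy: share of all predictions that were correct.
    churn_accuracy: share of correct predictions for customers who churned.
    historical_churn_rate: historical average churn rate.
    """
    # C: fewer than half of all predictions correct, or worse than the baseline.
    if overall_accuracy < 0.5 or churn_accuracy < historical_churn_rate:
        return "C"
    # A: beats the baseline by at least 10 percent of the baseline itself.
    if churn_accuracy >= historical_churn_rate * 1.10:
        return "A"
    # B: at or above the baseline, but by less than 10 percent of it.
    return "B"
```

With a historical churn rate of 20 percent, the threshold for an A is 22 percent (20 percent plus 10 percent of 20 percent), not 30 percent.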

Likelihood to churn (number of customers)

The Likelihood to churn (number of customers) section shows groups of customers based on their predicted risk of churn. This data is useful if you later want to create a segment of customers with high churn risk, because it helps you decide where the cutoff for segment membership should be.
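To see how a cutoff choice affects segment size, you could count customers per risk band yourself from the scores in the output entity. The sketch below assumes churn scores between 0 and 1 and uses hypothetical cutoffs of 0.33 and 0.66; neither the band names nor the cutoffs come from the product.

```python
def count_by_risk_band(scores, low_cutoff=0.33, high_cutoff=0.66):
    """Count customers in low, medium, and high churn-risk bands.

    scores: iterable of churn scores, assumed to be between 0 and 1.
    The cutoffs are illustrative defaults; adjust them to your own needs.
    """
    counts = {"low": 0, "medium": 0, "high": 0}
    for score in scores:
        if score < low_cutoff:
            counts["low"] += 1
        elif score < high_cutoff:
            counts["medium"] += 1
        else:
            counts["high"] += 1
    return counts
```

Running the function with a few different values of high_cutoff shows how many customers a "high risk" segment would contain at each cutoff.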

Most influential factors

Many factors are considered when you create your prediction. Each factor has its importance calculated for the aggregated predictions that a model creates. You can use these factors to help validate your prediction results. You can also use this information later to create segments that could help influence churn risk for customers.

For more information, see Predict subscription churn.
