Introduction
Static model configurations use traditional machine learning techniques to predict outcomes from a set of features, using models that are updated by offline retraining. In situations where the behaviour being predicted is variable over time, Dynamic Interaction configurations can prove to be more effective. In this lesson we outline the approach to migrating a static model configuration to a Dynamic Interaction configuration.
The migration consists of the following steps:
- Selecting a Dynamic Interaction algorithm
- Creating your Dynamic Interaction configuration
- Duplicating the static model Deployment Configuration and updating it to use the Dynamic Interaction configuration
- Adjusting your pre and post scoring logic to use the Dynamic Interaction configuration
- Testing your Dynamic Interaction configuration
- Running your Dynamic Interaction configuration in parallel to the static model in production
- Using a Network Runtime to route a portion of the traffic to the new Dynamic Interaction configuration
- Using the Network Runtime to test multiple Dynamic Interaction configurations
Below we give more details on implementing each of these steps.
Selecting a Dynamic Interaction algorithm
There are a number of different Dynamic Interaction algorithms available, which are described in detail in the Dynamic Models section of the documentation. Here we give a brief overview of the algorithms:
- \(\epsilon\)-greedy: This is the simplest algorithm. A portion (\(\epsilon\)) of the recommendations are made at random, with the remainder being the best performing offer, given the values of the contextual variables. This is a good algorithm to use when you want to explore your prediction space and have a clean set of data to use for further modelling or when you want the behaviour of the algorithm to be as explainable as possible.
- Ecosystem Rewards: This algorithm uses a Thompson Sampling approach to rank offers. A Beta distribution is generated and updated for each option and combination of contextual variable values and options are scored by sampling from the Beta distributions. The Ecosystem Rewards algorithm provides a good balance between learning and explainability, and it has more optionality in how historical data is used in the learning process than the other algorithms.
- Bayesian Probabilistic: This algorithm uses a Naive Bayes model to score options, with a number of approaches available to impact how missing data in the Naive Bayes training is handled. This algorithm has less focus on balancing exploration and exploitation and instead uses a larger number of features to aim to improve the prediction accuracy. While still explainable, this algorithm is less interpretable than the Ecosystem Rewards and \(\epsilon\)-greedy algorithms.
- Q-learning: The Q-learning algorithm allows for specific rewards and policies to be taken into account. However, it is the most complex algorithm to implement as the reward function must be implemented using the java plugin system.
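To make the Thompson Sampling idea behind the Ecosystem Rewards algorithm concrete, the sketch below ranks options by drawing a sample from a Beta distribution per option. This is an illustration only, not the platform implementation: the option names and success/failure counts are invented, and the real algorithm also maintains separate distributions per combination of contextual variable values.

```python
import random

def thompson_select(counts):
    """Pick an option by sampling each option's Beta distribution.

    counts maps an option name to its (successes, failures) totals.
    The option with the highest sampled value wins, so well-performing
    options are picked most often while the others are still explored.
    """
    best_option, best_draw = None, -1.0
    for option, (successes, failures) in counts.items():
        draw = random.betavariate(successes + 1, failures + 1)
        if draw > best_draw:
            best_option, best_draw = option, draw
    return best_option

# Offer A has a much better take-up history, so it should win most draws
random.seed(7)
counts = {"offer_a": (80, 20), "offer_b": (20, 80)}
picks = [thompson_select(counts) for _ in range(1000)]
```

Because the sampled values are random, the weaker offer is still selected occasionally, which is what keeps the algorithm learning.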
While one of these algorithms should be selected initially, it is possible to test multiple algorithms in parallel using the Network Runtime and then select the best performing algorithm based on the results of the tests. This is described in more detail in the last section of this lesson.
Creating your Dynamic Interaction configuration
Once you have selected a Dynamic Interaction algorithm, you will need to create the Dynamic Interaction configuration. Here we will discuss the Ecosystem Rewards algorithm as an example, but the same approaches apply to the other algorithms.
The first step is to select the contextual variables or features that you want to use to inform the Dynamic Interaction learning process. One option for this when converting from a static model is to use the features with the highest variable importance from the static model. If you want to create new variables to use or combine existing variables you can set up Virtual Variables to do this.
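If the static model's variable importances are available, selecting an initial set of contextual variables can be as simple as taking the top few features. The sketch below assumes the importances have already been extracted into a dictionary; the feature names and scores are illustrative.

```python
def top_features(importances, k=2):
    """Return the k feature names with the highest importance scores."""
    return sorted(importances, key=importances.get, reverse=True)[:k]

# Illustrative variable importances taken from a static model
importances = {"age_band": 0.42, "region": 0.31, "tenure": 0.18, "device": 0.09}
contextual_variables = top_features(importances, k=2)
```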
To implement your configuration you can follow the configuration documentation or follow the Dynamic Interaction user guide.
Set up the Deployment Configuration
When migrating from a static model configuration to a Dynamic Interaction configuration, the easiest way to create the Deployment Configuration is by duplicating the existing static model Deployment Configuration and updating it to use the Dynamic Interaction configuration. Changing the version of the static Deployment Configuration and clicking Update will create a copy that you can use for this purpose. At the same time you will probably want to change the name of the Deployment Configuration to match the name of the Dynamic Interaction configuration you created in the previous step.
In your new Deployment Configuration, select the New Knowledge option and deselect the Prediction Model and Model Selector options. Scroll down to the New Knowledge accordion, expand it and select the Dynamic Interaction configuration that you created in the previous step.
You will also need to update the pre and post scoring logic in the Plugins accordion. This is discussed in more detail in the next section.
Adjusting your pre and post scoring logic
You will need to adjust your pre and post scoring logic to use the Dynamic Interaction configuration. If you do not have custom pre and post scoring logic, this is as simple as switching templates in your Deployment Configuration. Otherwise, you will need to make some minor changes to your existing pre and post scoring logic.
Pre scoring logic
The pre scoring logic will need to be updated if you are using contextual variables and the contextual variable values are being looked up from a data source. In this case you should either use the PreScoreDynamic.java template prescore or, if you have existing custom prescoring logic, ensure that the pre scoring class extends PreScoreSuper and then call the getDynamicSettings and getPrepopulateContextualVariables methods, as per the code snippet below:
params = getDynamicSettings(mongoClient, params);
params = getPrepopulateContextualVariables(params);
Post scoring logic
If you are not using custom post scoring logic, you can use the PlatformDynamicEngagement.java template post score. If you have existing custom post scoring logic, you will need to make the following key changes, which are illustrated in PlatformDynamicEngagement.java:
- Extract the results of the Dynamic Interaction scoring from params
- Loop through the Options Store when processing options, rather than the Offer Matrix or Model Scoring results
- Decide how to handle options which are in the Options Store but not in the Offer Matrix
- Get the offer score by reading the arm_reward from the option in the Options Store
- Add the Dynamic Interaction specific outputs to the API response for logging, explainability and to enable the online learning process
To extract the results of the Dynamic Interaction scoring from params, check that your post scoring logic extends PostScoreSuper and use the following code snippet:
/***************************************************************************************************/
/** Standardized approach to access dynamic datasets in plugin.
* The options array is the data set/feature_store that's keeping track of the dynamic changes.
* The optionParams is the parameter set that will influence the real-time behavior through param changes.
*/
/***************************************************************************************************/
JSONArray options = getOptions(params);
JSONObject optionParams = getOptionsParams(params);
JSONObject locations = getLocations(params);
JSONObject contextual_variables = optionParams.getJSONObject("contextual_variables");
JSONObject randomisation = optionParams.getJSONObject("randomisation");
/***************************************************************************************************/
/** Test if contextual variable is coming via api or feature store: API takes preference... */
if (!work.has("contextual_variable_one")) {
if (featuresObj.has(contextual_variables.getString("contextual_variable_one_name")))
work.put("contextual_variable_one", featuresObj.get(contextual_variables.getString("contextual_variable_one_name")));
else
work.put("contextual_variable_one", "");
}
if (!work.has("contextual_variable_two")) {
if (featuresObj.has(contextual_variables.getString("contextual_variable_two_name")))
work.put("contextual_variable_two", featuresObj.get(contextual_variables.getString("contextual_variable_two_name")));
else
work.put("contextual_variable_two", "");
}
/***************************************************************************************************/
These variables will be used rather than domainsProbabilityObj to get the scoring results. You can look at PlatformDynamicEngagement.java for an example of how this and the subsequent snippets can be used.
To loop through the Options Store when processing options, you can use the following code snippet:
int[] optionsSequence = generateOptionsSequence(options.length(), options.length());
String contextual_variable_one = String.valueOf(work.get("contextual_variable_one"));
String contextual_variable_two = String.valueOf(work.get("contextual_variable_two"));
for(int j : optionsSequence) {
JSONObject option = options.getJSONObject(j);
If an option is in the Options Store but not in the Offer Matrix, you can either ignore that option or generate a default Offer Matrix entry for it and record a warning in the logs:
/** Skip the item if offer matrix does not contain option */
/*
if (!offerMatrixWithKey.has(option.getString("optionKey")))
continue;
*/
/** Generate default offer matrix entry if offer is not in the Offer Matrix */
String offer = option.getString("optionKey");
if (!offerMatrixWithKey.has(option.getString("optionKey"))) {
JSONObject singleOffer = defaultOffer(offer);
offerMatrixWithKey.put(option.getString("optionKey"), singleOffer);
LOGGER.warn("BEWARE, DEFAULT OFFER GENERATED. IN OPTIONS STORE AND NOT OFFER MATRIX: " + option.getString("optionKey"));
}
To get the score for the offer from the current option in the loop, you can use the following code snippet:
double p = 0.0;
double arm_reward = 0.001;
double learning_reward = 1.0;
if (option.has("arm_reward")) {
p = (double) option.get("arm_reward");
} else {
p = arm_reward;
}
arm_reward = p;
To add the Dynamic Interaction specific outputs to the API response, you can use the following code snippet:
/** Add dynamic interaction specific outputs to the API response */
finalOffersObject.put("p", p);
if (option.has("contextual_variable_one"))
finalOffersObject.put("contextual_variable_one", option.getString("contextual_variable_one"));
else
finalOffersObject.put("contextual_variable_one", "");
if (option.has("contextual_variable_two"))
finalOffersObject.put("contextual_variable_two", option.getString("contextual_variable_two"));
else
finalOffersObject.put("contextual_variable_two", "");
double alpha = (double) DataTypeConversions.getDoubleFromIntLong(option.get("alpha"));
double beta = (double) DataTypeConversions.getDoubleFromIntLong(option.get("beta"));
finalOffersObject.put("alpha", alpha);
finalOffersObject.put("beta", beta);
if (!option.has("weighting"))
finalOffersObject.put("weighting", -1.0);
else
finalOffersObject.put("weighting", (double) DataTypeConversions.getDoubleFromIntLong(option.get("weighting")));
finalOffersObject.put("arm_reward", arm_reward);
finalOffersObject.put("learning_reward", learning_reward);
/* Debugging variables */
if (!option.has("expected_takeup"))
finalOffersObject.put("expected_takeup", -1.0);
else
finalOffersObject.put("expected_takeup", (double) DataTypeConversions.getDoubleFromIntLong(option.get("expected_takeup")));
if (!option.has("propensity"))
finalOffersObject.put("propensity", -1.0);
else
finalOffersObject.put("propensity", (double) DataTypeConversions.getDoubleFromIntLong(option.get("propensity")));
if (!option.has("epsilon_nominated"))
finalOffersObject.put("epsilon_nominated", -1.0);
else
finalOffersObject.put("epsilon_nominated", (double) DataTypeConversions.getDoubleFromIntLong(option.get("epsilon_nominated")));
If you want to add a check to confirm that any contextual variables are being correctly processed, you can use the following code snippet in the loop through the Options Store:
String contextual_variable_one_Option = "";
if (option.has("contextual_variable_one") && !contextual_variable_one.equals(""))
contextual_variable_one_Option = String.valueOf(option.get("contextual_variable_one"));
String contextual_variable_two_Option = "";
if (option.has("contextual_variable_two") && !contextual_variable_two.equals(""))
contextual_variable_two_Option = String.valueOf(option.get("contextual_variable_two"));
if (contextual_variable_one_Option.equals(contextual_variable_one) && contextual_variable_two_Option.equals(contextual_variable_two)) {
Testing your Dynamic Interaction configuration
Once you have deployed your Dynamic Interaction Deployment Configuration, you can run tests by making individual API calls or by running a simulation. This can be done in Python or using the workbench. In the workbench, use the API Management functionality for individual calls or the Simulation functionality for running a simulation. In Python, these tests can be run as per the example below.
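The snippet below is a minimal sketch of such a test, assuming the runtime exposes an HTTP scoring endpoint. The URL and the payload field names are illustrative assumptions, so take the exact values for your deployment from the API Management screen in the workbench.

```python
import json
from urllib.request import Request, urlopen

# Assumed endpoint for the deployed runtime -- replace with your own
RUNTIME_URL = "http://localhost:8091/invocations"

def build_payload(customer, campaign, params=None):
    """Assemble the JSON body for a single scoring call.

    The field names here are illustrative; check your runtime's API
    documentation for the exact contract.
    """
    return {
        "campaign": campaign,
        "customer": customer,
        "params": json.dumps(params or {}),
    }

def score(customer, campaign, params=None):
    """Make an individual scoring call and return the parsed response."""
    body = json.dumps(build_payload(customer, campaign, params)).encode("utf-8")
    request = Request(RUNTIME_URL, data=body,
                      headers={"Content-Type": "application/json"})
    with urlopen(request) as response:
        return json.loads(response.read())
```

A simulation can then be as simple as calling score in a loop over a sample of customers and tallying the offers returned.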
Run your Dynamic Interaction configuration in parallel
To run initial tests of your Dynamic Interaction configuration in production you can run the Dynamic Interaction deployment in parallel to the existing static model deployment. This can be done by following the Testing Dynamic Interaction Deployments guide.
Route some traffic to your Dynamic Interaction configuration
Once you have completed the testing of your Dynamic Interaction deployment you can route a portion of the traffic currently going to the static model deployment to the Dynamic Interaction deployment. This can be done using the Network Runtime functionality. The Network Runtime allows you to route traffic in a variety of ways. The experiment_selector network type is likely to be a good option for this, but the type to use can be evaluated based on the requirements of your use case.
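The effect of routing a fixed portion of traffic can be illustrated with a deterministic hash-based split like the one below. This is only a sketch of the behaviour, not the Network Runtime implementation; the actual split is configured in the network configuration.

```python
import hashlib

def route(customer_id, portion=0.1):
    """Send roughly `portion` of customers to the dynamic deployment.

    Hashing the customer id makes the assignment deterministic, so a
    given customer keeps seeing the same deployment across calls.
    """
    digest = hashlib.sha256(str(customer_id).encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "dynamic_interaction" if bucket < portion * 100 else "static_model"
```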
Test multiple Dynamic Interaction configurations
Once you have a Dynamic Interaction configuration running in production, it is good practice to test multiple Dynamic Interaction configurations. This can be done using the Network Runtime functionality. The Network Runtime allows you to route traffic to multiple Dynamic Interaction configurations and compare the results. The experiment_selector network type is likely to be a good option for this, but the type to use can be evaluated based on the requirements of your use case.