Risk Aversion
Risk Aversion uses mean-variance utility from financial portfolio theory. Each offer's score is penalized by its variance: offers with uncertain outcomes score lower than offers with consistent (even if slightly lower) acceptance rates.
Algorithm
Config value: "approach": "behaviorAlgos", "sub_approach": "riskAversion"
For each offer, the algorithm computes a risk-adjusted utility score:
\(p = \frac{\text{response\_count}}{\text{count}}\)
\(\text{variance} = p \times (1 - p)\)
\(\text{utility} = p - \frac{\gamma}{2} \times \text{variance}\)
Where \(\gamma\) is the risk aversion coefficient. Scores are aggregated per offer and normalized to sum to 1.0.
Because the Bernoulli variance \(p(1-p)\) peaks at \(p = 0.5\), an offer with a lower acceptance rate and lower variance (e.g. 10% acceptance, variance 0.09) can beat an offer with 50% acceptance and maximal variance (0.25) when the risk aversion coefficient is sufficiently high (here, \(\gamma > 5\)).
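The scoring step above can be sketched in a few lines of Python. This is illustrative only: the field names (`count`, `response_count`) mirror the formulas, but the platform's internal implementation may differ.

```python
def risk_adjusted_scores(offers, gamma=1.0):
    """Score each offer by mean-variance utility, then normalize to sum to 1.0.

    offers: dict mapping offer name -> {"count": int, "response_count": int}
    gamma:  risk aversion coefficient (riskAversionCoefficient)
    """
    utilities = {}
    for name, stats in offers.items():
        p = stats["response_count"] / stats["count"]    # acceptance rate
        variance = p * (1 - p)                          # Bernoulli variance
        utilities[name] = p - (gamma / 2) * variance    # risk-adjusted utility
    total = sum(utilities.values())
    return {name: u / total for name, u in utilities.items()}

scores = risk_adjusted_scores({
    "offer_a": {"count": 1000, "response_count": 100},  # p = 0.10
    "offer_b": {"count": 1000, "response_count": 500},  # p = 0.50
})
```

At the default \(\gamma = 1.0\), offer_b still wins (0.375 vs 0.055 before normalization); raising \(\gamma\) progressively closes the gap.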
Parameters
- riskAversionCoefficient (\(\gamma\)): Higher values penalize variance more heavily. Default: 1.0. Range: 0 (risk-neutral, equivalent to raw acceptance rate) to >2 (highly risk-averse).
- Processing Window: Time window in milliseconds for historical data.
- Historical Count: Max records to process per update cycle.
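Putting the parameters together, the randomisation object might look like the fragment below. The `approach` and `sub_approach` values are as documented above; the remaining key names (`riskAversionCoefficient`, `processing_window`, `processing_count`) follow the parameter list and the `create_online_learning` arguments in the example, but should be checked against your platform version.

```json
{
  "randomisation": {
    "approach": "behaviorAlgos",
    "sub_approach": "riskAversion",
    "riskAversionCoefficient": 1.0,
    "processing_window": 2592000000,
    "processing_count": 5000
  }
}
```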
Cold Start
Recommendations are always returned. The real-time training path in RollingBehavior always produces a scored options array, regardless of whether the algorithm has interaction history:
- No history: Every offer in the options store receives a uniform random score. All offers are ranked and passed to the post-score class.
- Partial history: Offers scored by Risk Aversion receive their risk-adjusted utility; offers the algorithm has not seen receive a random score. Both are included in the result.
- Established history: Risk Aversion scores drive the ranking for all known offers.
The scored options are then sorted by arm_reward and handed to the configured dynamic post-score class (e.g. PlatformDynamicEngagement), which controls the final offer selection, eligibility filtering, and response formatting.
In short, the runtime always returns recommendations: offers are ranked randomly during cold start, Risk Aversion's mean-variance utility scores progressively replace the random rankings as interaction data accumulates, and the post-score class determines the final presentation.
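The cold-start behaviour can be sketched as follows. Names like `arm_reward` come from the description above; the function shape and history structure are illustrative assumptions, not the platform's actual API.

```python
import random

def score_options(options, history, gamma=1.0):
    """Blend risk-adjusted scores with random cold-start scores.

    options: list of offer names from the options store
    history: dict mapping offer name -> {"count": int, "response_count": int}
             for offers the algorithm has seen
    """
    scored = []
    for offer in options:
        stats = history.get(offer)
        if stats and stats["count"] > 0:
            # established history: risk-adjusted utility
            p = stats["response_count"] / stats["count"]
            reward = p - (gamma / 2) * p * (1 - p)
        else:
            # cold start: uniform random score so the offer still participates
            reward = random.random()
        scored.append({"offer": offer, "arm_reward": reward})
    # sort by arm_reward before handing off to the post-score class
    return sorted(scored, key=lambda o: o["arm_reward"], reverse=True)
```

Every offer in the options store appears in the output, so the downstream post-score class always has a full ranking to filter and format.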
When To Use
- When consistency matters more than maximum expected return
- When you want to avoid “volatile” offers with unpredictable acceptance rates
- Financial services, insurance, compliance-driven contexts where predictability is valued
- After sufficient interaction history has accumulated (the platform handles cold start with random scores until then)
When NOT To Use
- When you want to maximize expected acceptance regardless of variance
- When all offers have similar variance (the algorithm adds no value)
- When you need built-in exploration (combine with epsilon at the deployment level)
Example
from prediction.apis import deployment_management as dm
from prediction.apis import online_learning_management as ol
from prediction import jwt_access
# Placeholder credentials - replace with values for your environment
ecosystem_username = "user@ecosystem.ai"
ecosystem_password = "change-me"
mongo_user = "mongo_user"
mongo_password = "mongo_password"

auth = jwt_access.Authenticate("http://localhost:3001/api", ecosystem_username, ecosystem_password)
deployment_id = "demo-risk-aversion"
online_learning_uuid = ol.create_online_learning(
auth,
algorithm="ecosystem_rewards",
name=deployment_id,
description="Risk Aversion configuration",
feature_store_collection="set_up_features",
feature_store_database="my_mongo_database",
options_store_database="my_mongo_database",
options_store_collection="demo-deployment_options",
randomisation_processing_count=5000,
randomisation_processing_window=2592000000,
contextual_variables_offer_key="offer",
create_options_index=True,
create_covering_index=True
)
online_learning = dm.define_deployment_multi_armed_bandit(epsilon=0, dynamic_interaction_uuid=online_learning_uuid)
parameter_access = dm.define_deployment_parameter_access(
auth,
lookup_key="customer_id",
lookup_type="string",
database="my_mongo_database",
table_collection="customer_feature_store",
datasource="mongodb"
)
deployment_step = dm.create_deployment(
auth,
project_id="demo-project",
deployment_id=deployment_id,
description="Risk Aversion demo deployment",
version="001",
plugin_post_score_class="PlatformDynamicEngagement.java",
plugin_pre_score_class="PreScoreDynamic.java",
scoring_engine_path_dev="http://localhost:8091",
mongo_connect=f"mongodb://{mongo_user}:{mongo_password}@localhost:54445/?authSource=admin",
parameter_access=parameter_access,
multi_armed_bandit=online_learning
)

The approach should be set to behaviorAlgos and sub_approach to riskAversion in the randomisation object. A longer processing_window (e.g. 30 days, the 2592000000 ms used above) is recommended to produce stable variance estimates.