Risk Aversion
Risk Aversion uses mean-variance utility from financial portfolio theory. Each offer's score is penalized by its variance: offers with uncertain outcomes score lower than offers with consistent (even if slightly lower) acceptance rates.
Algorithm
Config value: "approach": "behaviorAlgos", "sub_approach": "riskAversion"
For each offer, the algorithm computes a risk-adjusted utility score:
\(p = \frac{\text{response\_count}}{\text{count}}\)
\(\text{variance} = p \times (1 - p)\)
\(\text{utility} = p - \frac{\gamma}{2} \times \text{variance}\)
Where \(\gamma\) is the risk aversion coefficient. Scores are aggregated per offer and normalized to sum to 1.0.
Because the Bernoulli variance \(p(1-p)\) peaks at \(p = 0.5\), an offer with a lower acceptance rate and lower variance (e.g. 10% acceptance, variance 0.09) can beat an offer with 50% acceptance and maximal variance (0.25) when the risk aversion coefficient is sufficiently high (here, \(\gamma > 5\)).
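The scoring step above can be sketched in a few lines of Python. This is illustrative only: the field names (`count`, `response_count`) mirror the formulas, but the platform's internal implementation may differ.

```python
def risk_adjusted_scores(offers, gamma=1.0):
    """Score each offer by mean-variance utility, then normalize to sum to 1.0.

    offers: dict mapping offer name -> {"count": int, "response_count": int}
    gamma:  risk aversion coefficient (riskAversionCoefficient)
    """
    utilities = {}
    for name, stats in offers.items():
        p = stats["response_count"] / stats["count"]    # acceptance rate
        variance = p * (1 - p)                          # Bernoulli variance
        utilities[name] = p - (gamma / 2) * variance    # risk-adjusted utility
    total = sum(utilities.values())
    return {name: u / total for name, u in utilities.items()}

scores = risk_adjusted_scores({
    "offer_a": {"count": 1000, "response_count": 100},  # p = 0.10
    "offer_b": {"count": 1000, "response_count": 500},  # p = 0.50
})
```

At the default \(\gamma = 1.0\), offer_b still wins (0.375 vs 0.055 before normalization); raising \(\gamma\) progressively closes the gap.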
Parameters
- riskAversionCoefficient (\(\gamma\)): Higher values penalize variance more heavily. Default: 1.0. Range: 0 (risk-neutral, equivalent to raw acceptance rate) to >2 (highly risk-averse).
- Processing Window: Time window in milliseconds for historical data.
- Historical Count: Max records to process per update cycle.
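Putting the parameters together, the randomisation object might look like the fragment below. The `approach` and `sub_approach` values are as documented above; the remaining key names (`riskAversionCoefficient`, `processing_window`, `processing_count`) follow the parameter list and the `create_online_learning` arguments in the example, but should be checked against your platform version.

```json
{
  "randomisation": {
    "approach": "behaviorAlgos",
    "sub_approach": "riskAversion",
    "riskAversionCoefficient": 1.0,
    "processing_window": 2592000000,
    "processing_count": 5000
  }
}
```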
Cold Start
Recommendations are always returned. The real-time training path in RollingBehavior always produces a scored options array, regardless of whether the algorithm has interaction history:
- No history: Every offer in the options store receives a uniform random score. All offers are ranked and passed to the post-score class.
- Partial history: Offers scored by Risk Aversion receive their risk-adjusted utility; offers the algorithm has not seen receive a random score. Both are included in the result.
- Established history: Risk Aversion scores drive the ranking for all known offers.
The scored options are then sorted by arm_reward and handed to the configured dynamic post-score class (e.g. PlatformDynamicEngagement), which controls the final offer selection, eligibility filtering, and response formatting.
In short, the runtime always returns recommendations: offers are ranked randomly during cold start, Risk Aversion's mean-variance utility scores progressively replace the random rankings as interaction data accumulates, and the post-score class determines the final presentation.
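The cold-start behaviour can be sketched as follows. Names like `arm_reward` come from the description above; the function shape and history structure are illustrative assumptions, not the platform's actual API.

```python
import random

def score_options(options, history, gamma=1.0):
    """Blend risk-adjusted scores with random cold-start scores.

    options: list of offer names from the options store
    history: dict mapping offer name -> {"count": int, "response_count": int}
             for offers the algorithm has seen
    """
    scored = []
    for offer in options:
        stats = history.get(offer)
        if stats and stats["count"] > 0:
            # established history: risk-adjusted utility
            p = stats["response_count"] / stats["count"]
            reward = p - (gamma / 2) * p * (1 - p)
        else:
            # cold start: uniform random score so the offer still participates
            reward = random.random()
        scored.append({"offer": offer, "arm_reward": reward})
    # sort by arm_reward before handing off to the post-score class
    return sorted(scored, key=lambda o: o["arm_reward"], reverse=True)
```

Every offer in the options store appears in the output, so the downstream post-score class always has a full ranking to filter and format.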
When To Use
- When consistency matters more than maximum expected return
- When you want to avoid “volatile” offers with unpredictable acceptance rates
- Financial services, insurance, compliance-driven contexts where predictability is valued
- After sufficient interaction history has accumulated (the platform handles cold start with random scores until then)
When NOT To Use
- When you want to maximize expected acceptance regardless of variance
- When all offers have similar variance (the algorithm adds no value)
- When you need built-in exploration (combine with epsilon at the deployment level)
Example
from prediction.apis import deployment_management as dm
from prediction.apis import online_learning_management as ol
from prediction import jwt_access
# Placeholder credentials - replace with values for your environment
ecosystem_username = "user@ecosystem.ai"
ecosystem_password = "change-me"
mongo_user = "mongo_user"
mongo_password = "mongo_password"

auth = jwt_access.Authenticate("http://localhost:3001/api", ecosystem_username, ecosystem_password)
deployment_id = "demo-risk-aversion"
online_learning_uuid = ol.create_online_learning(
auth,
algorithm="ecosystem_rewards",
name=deployment_id,
description="Risk Aversion configuration",
feature_store_collection="set_up_features",
feature_store_database="my_mongo_database",
options_store_database="my_mongo_database",
options_store_collection="demo-deployment_options",
randomisation_processing_count=5000,
randomisation_processing_window=2592000000,
contextual_variables_offer_key="offer",
create_options_index=True,
create_covering_index=True
)
online_learning = dm.define_deployment_multi_armed_bandit(epsilon=0, dynamic_interaction_uuid=online_learning_uuid)
parameter_access = dm.define_deployment_parameter_access(
auth,
lookup_key="customer_id",
lookup_type="string",
database="my_mongo_database",
table_collection="customer_feature_store",
datasource="mongodb"
)
deployment_step = dm.create_deployment(
auth,
project_id="demo-project",
deployment_id=deployment_id,
description="Risk Aversion demo deployment",
version="001",
plugin_post_score_class="PlatformDynamicEngagement.java",
plugin_pre_score_class="PreScoreDynamic.java",
scoring_engine_path_dev="http://localhost:8091",
mongo_connect=f"mongodb://{mongo_user}:{mongo_password}@localhost:54445/?authSource=admin",
parameter_access=parameter_access,
multi_armed_bandit=online_learning
)

The approach should be set to behaviorAlgos and sub_approach to riskAversion in the randomisation object. A longer processing_window (e.g. 30 days, the 2592000000 ms used above) is recommended to produce stable variance estimates.