
Coverage-Aware Thompson

An extension of Thompson Sampling that boosts under-exposed offers so niche and long-tail items get a fair chance alongside popular ones. It adjusts priors by exposure, draws from Beta posteriors, and applies an inverse-popularity factor; optional \(\epsilon\) mixing adds uniform exploration across the catalog.

Algorithm

Config value: "approach": "behaviorAlgos", "sub_approach": "coverageAwareThompson"

Prior adjustment (exposure-shaping of the Beta prior):

\(\beta_{\text{prior}} = 1.0 + \text{exposure}^{\gamma}\)

Thompson draw:

\(\theta \sim \mathrm{Beta}(\alpha, \beta)\)

Score (inverse popularity and optional multipliers):

\(\text{score} = \theta \cdot \frac{1}{(\text{exposure} + 1)^{\gamma}} \cdot \text{variableMultiplier}\)

Optional \(\epsilon\) exploration (after normalization):

\(\text{finalScore} = (1 - \epsilon) \cdot \text{normalized} + \frac{\epsilon}{|\text{offers}|}\)

Higher \(\epsilon\) allocates more mass uniformly across arms, improving coverage at the cost of short-term exploitation.
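The four formulas above can be combined into a short scoring sketch. This is a minimal illustration, not the platform's implementation: the bookkeeping that maps observed successes and failures onto \(\alpha\) and \(\beta\) (successes added to \(\alpha\), failures added to the exposure-shaped \(\beta_{\text{prior}}\)) and the `multiplier` field are assumptions for illustration.

```python
import random

def coverage_aware_thompson_scores(offers, gamma=1.0, epsilon=0.0, rng=None):
    """Sketch of Coverage-Aware Thompson scoring.

    offers maps an offer id to a dict with observed 'successes',
    'failures', 'exposure', and an optional 'multiplier'.
    The alpha/beta bookkeeping below is an assumed convention.
    """
    rng = rng or random.Random()
    raw = {}
    for offer, stats in offers.items():
        exposure = stats["exposure"]
        # Exposure-shaped prior: beta_prior = 1.0 + exposure^gamma
        beta_prior = 1.0 + exposure ** gamma
        # Assumed posterior bookkeeping: successes grow alpha, failures grow beta
        alpha = 1.0 + stats["successes"]
        beta = beta_prior + stats["failures"]
        theta = rng.betavariate(alpha, beta)        # Thompson draw from Beta(alpha, beta)
        boost = 1.0 / (exposure + 1.0) ** gamma     # inverse-popularity factor
        raw[offer] = theta * boost * stats.get("multiplier", 1.0)
    # Normalize, then optionally mix in a uniform distribution over the catalog
    total = sum(raw.values())
    normalized = {o: s / total for o, s in raw.items()}
    n = len(offers)
    return {o: (1.0 - epsilon) * s + epsilon / n for o, s in normalized.items()}
```

Because the normalized scores sum to one, the \(\epsilon\)-mixed final scores also sum to one, and every offer is guaranteed at least \(\epsilon / |\text{offers}|\) of the probability mass.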

Parameters

  • gamma (\(\gamma\)): Popularity-penalty exponent; controls how strongly under-exposed offers are boosted. Default: 1.0.
  • epsilon (\(\epsilon\)): Additional uniform exploration over the offer set. Default: 0.0. Typical values for extra coverage: 0.05 to 0.1.
  • Processing Window: Time window in milliseconds for historical data.
  • Historical Count: Max records to process per update cycle.
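Putting the config value and the parameters together, a randomisation object might look like the following. The `approach` and `sub_approach` values come from the documentation above; the exact key names for the remaining fields (`gamma`, `epsilon`, `processing_window`, `historical_count`) are assumptions based on the parameter list and may differ in your deployment schema.

```json
{
  "randomisation": {
    "approach": "behaviorAlgos",
    "sub_approach": "coverageAwareThompson",
    "gamma": 1.0,
    "epsilon": 0.05,
    "processing_window": 604800000,
    "historical_count": 5000
  }
}
```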

Cold Start

Recommendations are always returned. Coverage-Aware Thompson has the strongest cold-start handling among behavioral algorithms:

  • No history: The RollingBehavior layer assigns uniform random scores to every offer. The algorithm itself also handles this well — default \(\alpha = 1.0\), \(\beta = 1.0\), and \(\text{exposure} = 0\) yield \(\mathrm{Beta}(1,1)\) (uniform sampling).
  • Inverse-popularity \((\text{exposure}+1)^{-\gamma}\) is maximized when exposure is zero, so unseen offers receive the strongest relative boost.
  • As data accumulates, the Beta posteriors sharpen and the inverse-popularity term naturally balances popular vs. niche offers.
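A quick numeric check of the cold-start claim: the inverse-popularity factor \((\text{exposure}+1)^{-\gamma}\) equals 1 (its maximum) at zero exposure for any \(\gamma\), and decays as exposure grows, so an unseen offer's Thompson draw is never damped.

```python
def boost(exposure, gamma=1.0):
    # Inverse-popularity factor: (exposure + 1)^(-gamma)
    return 1.0 / (exposure + 1.0) ** gamma

unseen = boost(0)     # 1.0 for any gamma: unseen offers keep their full draw
popular = boost(99)   # 0.01 with gamma=1.0: a 100x damping at 99 exposures
```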

The scored options are then sorted by arm_reward and handed to the configured dynamic post-score class, which controls the final offer selection and response formatting.

Coverage-Aware Thompson is the best cold-start choice when catalog fairness and tail coverage matter. Its Beta(1,1) prior and inverse-popularity boost give unseen offers maximum exposure from day one. The post-score class determines the final presentation.

When To Use

  • Catalog coverage or fairness requirements (every offer should eventually get trials)
  • Promoting the long tail of niche offers
  • Mitigating popularity bias where raw counts dominate rankings

When NOT To Use

  • When you want to converge quickly to a single best offer with minimal exploration
  • When catalog coverage is not a business concern and raw performance ranking is enough

Example

from prediction.apis import deployment_management as dm
from prediction.apis import online_learning_management as ol
from prediction import jwt_access

auth = jwt_access.Authenticate(
    "http://localhost:3001/api", ecosystem_username, ecosystem_password
)

deployment_id = "demo-coverage-aware-thompson"

online_learning_uuid = ol.create_online_learning(
    auth,
    algorithm="ecosystem_rewards",
    name=deployment_id,
    description="Coverage-Aware Thompson configuration",
    feature_store_collection="set_up_features",
    feature_store_database="my_mongo_database",
    options_store_database="my_mongo_database",
    options_store_collection="demo-deployment_options",
    randomisation_processing_count=5000,
    randomisation_processing_window=604800000,
    contextual_variables_offer_key="offer",
    create_options_index=True,
    create_covering_index=True
)

online_learning = dm.define_deployment_multi_armed_bandit(
    epsilon=0.05,
    dynamic_interaction_uuid=online_learning_uuid
)

parameter_access = dm.define_deployment_parameter_access(
    auth,
    lookup_key="customer_id",
    lookup_type="string",
    database="my_mongo_database",
    table_collection="customer_feature_store",
    datasource="mongodb"
)

deployment_step = dm.create_deployment(
    auth,
    project_id="demo-project",
    deployment_id=deployment_id,
    description="Coverage-Aware Thompson demo deployment",
    version="001",
    plugin_post_score_class="PlatformDynamicEngagement.java",
    plugin_pre_score_class="PreScoreDynamic.java",
    scoring_engine_path_dev="http://localhost:8091",
    mongo_connect=f"mongodb://{mongo_user}:{mongo_password}@localhost:54445/?authSource=admin",
    parameter_access=parameter_access,
    multi_armed_bandit=online_learning
)

Set approach to behaviorAlgos and sub_approach to coverageAwareThompson in the randomisation object. Configure gamma and epsilon there to tune inverse-popularity strength and uniform exploration. Note that the Python define_deployment_multi_armed_bandit(epsilon=...) API controls a deployment-level epsilon separately from the randomisation object; keep both layers consistent with your intent.
