
Two-Tower Module

Overview

The Two-Tower module brings the dual-encoder retrieval pattern to ecosystem.Ai. It trains two neural “towers”:

  • a user (customer) tower that maps a customer’s features into a p-dimensional vector, and
  • an item (offer/product) tower that maps each candidate offer into the same p-dimensional space.

Compatibility between a customer and an offer is the similarity of their two vectors: a dot product or, when the vectors are unit-normalized, the cosine. Because the two towers are independent at inference time, item vectors can be precomputed once, and a customer can then be matched against thousands of offers using only vector arithmetic. This is what makes two-tower models the standard choice for large-scale candidate retrieval.
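As a rough illustration of the matching step described above, the following is a minimal sketch in plain Python, not the runtime's actual SimilarityScorer. The helper names (l2_normalize, dot, rank_offers), the offer IDs, and the 3-dimensional vectors are all hypothetical and chosen only to keep the example small.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must share the same dimension")
    return sum(x * y for x, y in zip(a, b))

def rank_offers(user_vec, item_vecs, top_k=3):
    """Return the top_k (offer_id, score) pairs, highest similarity first."""
    u = l2_normalize(user_vec)
    scored = [(offer_id, dot(u, v)) for offer_id, v in item_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical item embeddings, normalized once ahead of time (offline).
ITEM_VECS = {
    "offer_a": l2_normalize([1.0, 0.0, 0.0]),
    "offer_b": l2_normalize([1.0, 1.0, 0.0]),
    "offer_c": l2_normalize([0.0, 0.0, 1.0]),
}

# Per request, only the user vector is new; the item side is a lookup.
ranking = rank_offers([2.0, 0.0, 0.0], ITEM_VECS, top_k=2)
print(ranking)
```

Because the item vectors are already unit length, the per-request work in this sketch is one normalization of the user vector plus one dot product per candidate offer.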

Where it fits in ecosystem.Ai

| Stage | Component | Repo |
|---|---|---|
| Train towers + export embeddings | Two-Tower algorithm (H2O Deep Learning) | ecosystem-workbench2 |
| PyTorch training + serving (optional) | /pytorch sidecar | ecosystem-notebooks |
| Real-time scoring | similarity model type + SimilarityScorer + PostScoreTwoTower | ecosystem-runtime |
| Storage | embedding collections in MongoDB | shared |

The module is engine-agnostic: towers may be trained with H2O Deep Learning (the built-in workbench path) or PyTorch (the notebooks sidecar). Either way the runtime consumes embedding vectors, never the model itself, so real-time scoring carries no model-inference cost in the hot path.

“Two-tower” refers to the two encoders trained against a single similarity objective. It is distinct from the “two-stage” (retrieval-then-ranking) system architecture, although two-tower models are the usual choice for the retrieval stage of such systems.

Lifecycle at a glance

  1. Data Preparation — interaction rows in logging.ecosystemruntime_flatten become a training frame.
  2. Model Training — dual towers are trained and embeddings are produced.
  3. Offline Scoring — concept tests and batch scoring validate the run and write recommendations.
  4. Real-Time Scoring — the runtime ranks offers per request using precomputed (or live PyTorch) embeddings.
  5. Reference material: the PyTorch Serving and API Reference pages cover the serving sidecar and the full request/response contracts.

Key properties

  • Shared embedding space — both towers output the same dimension (embedding_dim, default 32).
  • Similarity score — dot product of L2-normalized vectors (equivalent to cosine). See Architecture & Theory.
  • Decoupled inference — item vectors precomputed; only the user vector is needed per request.
  • Runtime integration — a dedicated similarity model type bypasses H2O and dynamic scoring and routes to the reusable SimilarityScorer.
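
The "equivalent to cosine" note in the similarity-score property can be written out as a short identity. The symbols here are notation introduced for illustration and are not names from the module: u is the user-tower output, v is the item-tower output, both in R^p with p = embedding_dim, and s is the resulting score.

```latex
\hat{u} = \frac{u}{\lVert u \rVert_2}, \qquad
\hat{v} = \frac{v}{\lVert v \rVert_2}, \qquad
s(u, v) = \hat{u} \cdot \hat{v}
        = \frac{u \cdot v}{\lVert u \rVert_2 \, \lVert v \rVert_2}
        = \cos \theta_{u,v}
```

Under this identity the score always lies in [-1, 1] and depends only on the angle between the two vectors, not on their lengths.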