Two-Tower Module
Overview
The Two-Tower module brings the dual-encoder retrieval pattern to ecosystem.Ai. It trains two neural “towers”:
- a user (customer) tower that maps a customer’s features into a p-dimensional vector, and
- an item (offer/product) tower that maps each candidate offer into the same p-dimensional space.
Compatibility between a customer and an offer is the similarity of their two vectors — a dot product (or, on unit-normalized vectors, the cosine). Because the two towers are independent at inference time, item vectors can be precomputed once and a customer is matched against thousands of offers with nothing more than vector math. This is what makes two-tower models the standard choice for large-scale candidate retrieval.
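The matching step described above can be sketched in a few lines. This is a minimal illustration, not the module's implementation: the function names (`dot`, `top_k_offers`), the offer IDs, and the 3-dimensional vectors are hypothetical, and a plain dictionary stands in for wherever the precomputed item vectors are stored.

```python
from typing import Dict, List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must share the same dimension")
    return sum(a * b for a, b in zip(u, v))


def top_k_offers(
    user_vec: Sequence[float],
    item_vecs: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every precomputed item vector against one user vector
    and return the k highest-scoring (offer_id, score) pairs."""
    scored = [(offer_id, dot(user_vec, vec)) for offer_id, vec in item_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical precomputed item vectors (already unit length).
ITEM_VECS = {
    "offer_a": [1.0, 0.0, 0.0],
    "offer_b": [0.0, 1.0, 0.0],
    "offer_c": [0.6, 0.8, 0.0],
}

# Hypothetical user vector, as the user tower might produce for one request.
user_vec = [0.8, 0.6, 0.0]

ranked = top_k_offers(user_vec, ITEM_VECS, k=2)
print(ranked)
```

Only the user vector changes per request; the item dictionary is built once and reused, which is why the per-request cost is limited to one dot product per candidate offer.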
Where it fits in ecosystem.Ai
| Stage | Component | Repo |
|---|---|---|
| Train towers + export embeddings | Two-Tower algorithm (H2O Deep Learning) | ecosystem-workbench2 |
| PyTorch training + serving (optional) | /pytorch sidecar | ecosystem-notebooks |
| Real-time scoring | similarity model type + SimilarityScorer + PostScoreTwoTower | ecosystem-runtime |
| Storage | embedding collections in MongoDB | shared |
The module is engine-agnostic: towers may be trained with H2O Deep Learning (the built-in workbench path) or PyTorch (the notebooks sidecar). Either way the runtime consumes embedding vectors, never the model itself, so real-time scoring carries no model-inference cost in the hot path.
“Two-tower” refers to the two encoders trained against a single similarity objective. It is distinct from the “two-stage” (retrieval-then-ranking) system architecture, although two-tower models are the usual choice for the retrieval stage of such systems.
Lifecycle at a glance
- Data Preparation: interaction rows in `logging.ecosystemruntime_flatten` become a training frame.
- Model Training: dual towers are trained and embeddings are produced.
- Offline Scoring: concept tests and batch scoring validate the run and write recommendations.
- Real-Time Scoring: the runtime ranks offers per request using precomputed (or live PyTorch) embeddings.
- PyTorch Serving and API Reference cover the serving sidecar and the full request/response contracts.
Key properties
- Shared embedding space: both towers output the same dimension (`embedding_dim`, default `32`).
- Similarity score: dot product of L2-normalized vectors (equivalent to cosine). See Architecture & Theory.
- Decoupled inference: item vectors are precomputed; only the user vector is needed per request.
- Runtime integration: a dedicated `similarity` model type bypasses H2O and dynamic scoring and routes to the reusable `SimilarityScorer`.
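The equivalence between the dot product of L2-normalized vectors and cosine similarity can be checked numerically. The sketch below is illustrative only: the helper names (`l2_normalize`, `cosine_similarity`) and the 2-dimensional example vectors are assumptions chosen for readability, not part of the module or of `SimilarityScorer`.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must share the same dimension")
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two raw (unnormalized) vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


# Hypothetical raw tower outputs before normalization.
user_raw = [3.0, 4.0]
offer_raw = [4.0, 3.0]

# Normalize once, then score with a plain dot product.
score = dot(l2_normalize(user_raw), l2_normalize(offer_raw))

print(score, cosine_similarity(user_raw, offer_raw))
```

Normalizing at export time means the scoring path never needs the division by vector norms: the stored vectors already have unit length, so a plain dot product yields the cosine.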