Access & Operations
Access control
The module is gated by the two_tower capability. Grant it to the roles that
should train, batch-score, and deploy two-tower recommenders. The workbench
Solutions hub surfaces the module at the /two-tower route.
Deployment configuration
Real-time scoring is configured entirely through campaign properties (no model file is loaded):
predictor.model.type=similarity
plugin.prescore=com.ecosystem.plugin.customer.PrePredictTwoTower
plugin.postscore=com.ecosystem.plugin.customer.PostScoreTwoTower
# optional live user embedding via the PyTorch sidecar
predictor.twotower.user.embed=pytorch:http://ecosystem-notebooks:8010:two_tower_user_v1

| Property | Required | Meaning |
|---|---|---|
| predictor.model.type=similarity | yes | bypass H2O + dynamic; use cosine/dot scoring |
| plugin.prescore | yes | PrePredictTwoTower (loads embeddings) |
| plugin.postscore | yes | PostScoreTwoTower (ranks offers) |
| predictor.twotower.user.embed | optional | live user vector via api:pytorch |
| mojo.key | omit | not used in similarity mode |
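The table above can be turned into a quick pre-deployment check. The following is a minimal sketch, not part of the product: `parse_properties` and `check_two_tower_properties` are hypothetical helper names, and the sketch assumes the campaign properties are available as plain `key=value` text with `#` comments, as in the listing above.

```python
# Hypothetical helper: validates campaign properties against the table above.
# The required keys and values are copied from this page; the function names
# and the plain-text input format are assumptions made for illustration.
REQUIRED = {
    "predictor.model.type": "similarity",
    "plugin.prescore": "com.ecosystem.plugin.customer.PrePredictTwoTower",
    "plugin.postscore": "com.ecosystem.plugin.customer.PostScoreTwoTower",
}


def parse_properties(text):
    """Parse key=value lines, ignoring blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_two_tower_properties(props):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, expected in REQUIRED.items():
        actual = props.get(key)
        if actual is None:
            problems.append(f"missing required property: {key}")
        elif actual != expected:
            problems.append(f"{key} should be {expected}, found {actual}")
    # mojo.key is listed as "omit": it is not used in similarity mode.
    if "mojo.key" in props:
        problems.append("mojo.key is set but is not used in similarity mode")
    return problems
```

Calling `check_two_tower_properties(parse_properties(text))` on the listing above should return an empty list; the optional `predictor.twotower.user.embed` property is accepted but not required.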
Prerequisites
- A trained run (two_tower_runs) from workbench2 or the PyTorch sidecar.
- Embeddings available to the runtime, via either:
  - precomputed vectors in two_tower_user_embeddings / two_tower_item_embeddings, or
  - a reachable PyTorch sidecar for the user vector plus precomputed item vectors.
In similarity mode the runtime computes scores purely from vectors. If neither
precomputed embeddings nor the sidecar is available, offers cannot be ranked.
Confirm the embedding collections are populated (or the sidecar responds) before
routing traffic.
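The readiness rule implied by the prerequisites can be written as a small predicate. This is a sketch under stated assumptions: `can_rank_offers` is a hypothetical name, the two counts are assumed to come from counting documents in two_tower_user_embeddings and two_tower_item_embeddings, and `sidecar_reachable` is assumed to come from whatever health probe you run against the PyTorch sidecar. Note that both prerequisite options rely on precomputed item vectors; only the user vector has two possible sources.

```python
def can_rank_offers(user_embedding_count, item_embedding_count, sidecar_reachable):
    """Return True if the runtime has the vectors it needs to rank offers.

    Hypothetical pre-flight check, not a product API:
    - item vectors must be precomputed in both prerequisite options;
    - the user vector may be precomputed or served live by the sidecar.
    """
    has_item_vectors = item_embedding_count > 0
    has_user_vector_source = user_embedding_count > 0 or sidecar_reachable
    return has_item_vectors and has_user_vector_source
```

For example, `can_rank_offers(0, 500, True)` is true (live user vectors plus precomputed item vectors), while `can_rank_offers(1000, 0, True)` is false because no item vectors exist.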
Deployment checklist
- Train towers (H2O via workbench2, or PyTorch via the sidecar).
- Export user/item embeddings to MongoDB (or stand up the sidecar).
- Set predictor.model.type=similarity and the two plugin properties.
- (Optional) Set predictor.twotower.user.embed for live user vectors.
- Smoke test with POST /invocations and confirm final_result ranks offers.
- Verify logging rows are written (see Logging & Reporting).
Operations notes
- Latency is dominated by vector math and (optionally) a single sidecar call per request; it is independent of model size.
- Retraining produces a new run_id; re-export embeddings and update the collections. The runtime reads the latest vectors keyed by customer_id / offer.
- Cold-start offers without an embedding are skipped during ranking (logged and continued), so a missing vector never fails a request.
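The cold-start behaviour described above can be illustrated with a short ranking sketch. It is not the PostScoreTwoTower implementation: `cosine`, `rank_offers`, the in-memory `item_embeddings` dictionary keyed by offer, and the logger name are all assumptions chosen for illustration, and cosine similarity is used here as one of the two scoring options (cosine/dot) named in the property table.

```python
import logging
import math

logger = logging.getLogger("two_tower_sketch")  # hypothetical logger name


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_offers(user_vector, candidate_offers, item_embeddings):
    """Score each candidate offer against the user vector, best first.

    Offers with no entry in item_embeddings (cold start) are logged and
    skipped, so a missing vector never raises.
    """
    scored = []
    for offer in candidate_offers:
        item_vector = item_embeddings.get(offer)
        if item_vector is None:
            logger.warning("no embedding for offer %s; skipping", offer)
            continue
        scored.append((offer, cosine(user_vector, item_vector)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

The work per request is one similarity computation per candidate offer, which is consistent with the latency note above: cost grows with the number of candidates and the embedding dimension, not with the size of the trained towers.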
Related pages: Real-Time Scoring, PyTorch Serving, API Reference.