
MLRun Module

Overview

The MLRun module turns the Ecosystem.Ai Workbench2 console into a full ML lifecycle hub. From a single configuration you can:

  • pull data from MongoDB;
  • draft engineered feature pipelines through a natural-language AI pipeline assistant;
  • train sklearn, xgboost, lightgbm, or pytorch (tabular wrapper) models against a containerised trainer;
  • emit the entire lifecycle as a runnable Python project (feature pipeline, training, scoring, K8s deploy); and
  • deploy the trained model, alongside the Java-based ecosystem-runtime, onto Docker Desktop Kubernetes with two clicks.

The configuration editor is a 13-tab IDE for one MLRun configuration:

Overview · Project · Data Source · Features · H2O Frame · Training · Models · Training Runs · Adapters · Python · Runs · Deployments · Generate

For a screenshot-led walkthrough of every tab see Console Tour.

Two opinionated reference use-cases ship out of the box and seed real training runs on first install:

  • Customer Spend Risk — binary classification on master.bank_transactions, demonstrating numeric + categorical feature engineering and risk scoring against an ecosystem-runtime pod.
  • Customer Personality — multiclass classification on master.bank_customer, predicting personality from generic demographic fields (age, gender, income band, region, life stage).
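To make the Spend Risk target concrete, the frequent/declined derivation can be sketched in plain Python. This is a hedged illustration only: the field names (`status`) and the ratio threshold are assumptions, not the seeded schema or the actual aggregation.

```python
# Illustrative sketch of a Spend Risk style binary target: a customer is
# labelled risky (1) when declined transactions are frequent relative to
# total activity. Field names and threshold are hypothetical.

def spend_risk_label(transactions, declined_ratio_threshold=0.2):
    """Return 1 (risky) or 0 for one customer's list of transaction docs."""
    total = len(transactions)
    if total == 0:
        return 0  # no activity, no evidence of risk
    declined = sum(1 for t in transactions if t.get("status") == "declined")
    return 1 if declined / total >= declined_ratio_threshold else 0

customer_txns = [
    {"amount": 120.0, "status": "approved"},
    {"amount": 75.5, "status": "declined"},
    {"amount": 20.0, "status": "approved"},
]
label = spend_risk_label(customer_txns)
```

In the real use-case this derivation runs server-side as a MongoDB aggregation over master.bank_transactions rather than in application code.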

Benefits

  • Single-pane lifecycle: feature engineering, training, scoring, and deployment in one console.
  • AI pipeline assistant: describe the target collection and the prediction goal in plain English; the assistant samples the schema, drafts the aggregation pipeline, infers the target column / problem type, persists the configuration, and (optionally) trains every selected framework end-to-end. Seven pipeline_kind presets (auto, numeric, categorical, temporal, mixed, aggregates, type_coercion) shape the generated MongoDB aggregation.
  • Bring-your-own framework: sklearn, xgboost, lightgbm, and a pytorch tabular wrapper using identical contracts.
  • Reproducible by design: every console action is exposed in a generated Python project (feature_pipeline.py, train_model.py, score_model.py, deploy_to_k8s.py) so the same lifecycle can run in CI, in a notebook, or in an MLRun job. The Python tab adds toggles for training scripts, the scoring driver, and the Kubernetes deployer, so you only emit what you need.
  • Coexistence with MLRun CE: when MLRun Community Edition is enabled, runs land in the MLRun project store and the trainer sidecar continues to handle compute. See MLRun Community Edition.
  • Runtime-native scoring: trained adapters generate ecosystem-runtime-compatible logging payloads, so production scoring goes through the same audited paths as the rest of the platform.
  • Cascade-safe configurations: deleting a configuration removes the linked feature pipelines, feature sets, training runs, and any K8s deployments scoped to its project_id, with a confirmation dialog that prints the final delete counts.
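To give a feel for how a pipeline_kind preset could shape the generated MongoDB aggregation, here is a minimal sketch that builds a numeric-style pipeline as plain Python dicts. The helper name and stage layout are assumptions for illustration; the assistant's actual output depends on the sampled schema.

```python
# Hypothetical sketch of a "numeric" pipeline_kind preset: project the
# numeric feature fields (null-filled with 0) plus the target column,
# after filtering out rows with a missing target.

def build_numeric_pipeline(numeric_fields, target_field):
    """Return a MongoDB aggregation pipeline (list of stage dicts)."""
    projection = {f: {"$ifNull": [f"${f}", 0]} for f in numeric_fields}
    projection[target_field] = 1  # pass the target through unchanged
    return [
        {"$match": {target_field: {"$ne": None}}},
        {"$project": projection},
    ]

pipeline = build_numeric_pipeline(["age", "income"], "personality")
```

A pipeline like this would be passed to `collection.aggregate(pipeline)` by the generated feature_pipeline.py; other presets (categorical, temporal, aggregates, type_coercion) would add their own stages in the same list-of-stages form.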

Architecture

```
┌──────────────────────────────┐
│     Workbench2 frontend      │
│        /mlrun-console        │
└──────────────┬───────────────┘
               │
┌──────────────┴───────────────┐
│ Workbench2 backend (FastAPI) │
│   /api/v1/mlrun-runtime/*    │
│   /api/v1/k8s/*              │
└─────┬───────────────────┬────┘
      │ feature           │ k8s api
      │ pipeline          │
      ▼                   ▼
┌──────────────────┐  ┌──────────────────────────────────┐
│ MongoDB          │  │ Docker Desktop Kubernetes        │
│ master.*         │  │ ┌────────────────┐               │
│ ecosystem_meta.* │  │ │ mlrun-trainer  │ (8003)        │
└──────────────────┘  │ │ sklearn / xgb  │               │
                      │ │ lgbm / pytorch │               │
                      │ └────────────────┘               │
                      │ ┌──────────────────────┐         │
                      │ │ ecosystem-runtime    │ (8091)  │
                      │ │ Java Spring scoring  │         │
                      │ └──────────────────────┘         │
                      │ ┌──────────────────────┐         │
                      │ │ MLRun Community Ed.  │         │
                      │ │ (optional, Helm)     │         │
                      │ └──────────────────────┘         │
                      └──────────────────────────────────┘
```
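The deploy step of the generated project (deploy_to_k8s.py) ultimately amounts to applying Kubernetes manifests through the /api/v1/k8s/* path shown above. A minimal sketch of what a Deployment manifest for the scoring runtime could look like; the image name, label scheme, and helper function are hypothetical, and only the container port (8091) is taken from the architecture above.

```python
# Hypothetical sketch of a Deployment manifest for the Java scoring
# runtime, built as a plain dict. project_id scoping mirrors the
# cascade-delete behaviour described earlier; everything else is assumed.

def runtime_deployment(project_id, image="ecosystem/ecosystem-runtime:latest"):
    """Return an apps/v1 Deployment dict for one project's runtime pod."""
    name = f"ecosystem-runtime-{project_id}"
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": {"project_id": project_id}},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name, "project_id": project_id}},
                "spec": {
                    "containers": [{
                        "name": "ecosystem-runtime",
                        "image": image,
                        "ports": [{"containerPort": 8091}],  # scoring port
                    }]
                },
            },
        },
    }

manifest = runtime_deployment("spend-risk")
```

A manifest in this shape could be submitted with the official Kubernetes Python client (`AppsV1Api.create_namespaced_deployment`) or serialised to YAML for `kubectl apply`.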

What ships in the seed

The seeder script (backend/scripts/seed_mlrun_use_cases.py) creates one project, one feature pipeline, one feature set, one configuration, and four training runs (sklearn / xgboost / lightgbm / pytorch) per use-case. With both use-cases enabled you get:

| Asset | Spend Risk | Customer Personality |
|---|---|---|
| Source collection | master.bank_transactions | master.bank_customer |
| Target derivation | aggregation pipeline (frequent / declined) | $personality |
| Problem type (per use-case) | binary_classification | multiclass_classification |
| Frameworks trained | sklearn · xgboost · lightgbm · pytorch | sklearn · xgboost · lightgbm · pytorch |
| Activity log entry | SEED_MLRUN_USE_CASE | SEED_MLRUN_USE_CASE |

(Screenshot: MLRun console, configurations list.)

Where to next

When the goal is purely to import already-trained MOJO models from an MLFlow registry into the Java runtime, see MLFlow Integration. The MLRun module is the right choice when you also want to train models in-platform from feature-engineered MongoDB collections.
