Senior reinforcement learning builder

Milano
Neutralis S.R.L
Published 26 November
Description

About Neutralis

Neutralis is building the learning brain for industrial heat-pump plants. We fuse model-based RL with digital twins and strict safety constraints to turn messy plant telemetry into better decisions, hour by hour. This is paper-to-plant work with real impact on energy, reliability, and decarbonization.

The challenge

Industrial plants are complex, safety-critical, and non-stationary. Off-policy data, partial observability, actuator limits, drift, and human-in-the-loop operations make naïve RL fail fast. Your mission is to own a safe, reproducible path from data to control: offline → simulated → shadow → live, with guardrails at every step.

What you'll do

* Own the RL/control roadmap: architect offline RL + model-based control with a digital twin in the loop; define safety envelopes and verification gates.
* Build the pipeline: data curation, policy learning, simulation/gym environments, evaluation harnesses, and promotion criteria from sim to plant.
* Ship reproducible research to production: baselines, ablations, and clear experiment tracking; transform results into services/APIs.
* Lead and mentor a 15–20 person cohort of MSc/PhD thesis students and research engineers; set standards for code, experiments, and writing.
* Partner with domain experts (HVAC/OT/BMS) on constraints, actuation limits, failure modes, and alarm triage.
* Land safety: define fallback controllers, interlocks, and shadow-mode strategies; quantify risk and uncertainty.
* Collaborate across the stack with our FastAPI services, time-series store, and observability/ML Ops.
* Communicate: write crisp technical notes, contribute to publications where useful, and present results to partners.

What you'll bring

* Track record shipping RL/controls for physical systems (energy, robotics, process, automotive, etc.).
* Deep hands-on skill in offline RL (e.g., CQL/IQL/TD3-BC) and model-based RL/MPC; comfort with system identification and constrained optimization.
* Strong engineering in Python and PyTorch or JAX; experience with experiment tracking (MLflow/W&B), containers, and CI.
* Rigor around evaluation and safety: distribution shift, uncertainty, guardrails, fallback policies.
* Ability to lead, mentor, and scale a research-engineering team.
* Clear writing and stakeholder communication.
* Degree in CS/EE/ME/Controls or equivalent experience.

Nice to have

* Familiarity with OT/BMS/historians (OPC UA, Modbus, BACnet, PI), time-series modeling, anomaly detection.
* Experience with digital twins/simulation, domain randomization, and sim-to-real transfer.
* MLOps in AWS; FastAPI, PostgreSQL + a time-series DB.
* Italian language skills.

Why Neutralis

* Hard problems, real plants: your work moves real energy, not just a leaderboard.
* Ownership: technical stewardship from first principles to deployment.
* Talent platform: lead a serious thesis cohort and shape a next-gen team.
* Impact: measurable COP uplift, energy savings, reliability gains.
* Compensation: competitive package with meaningful equity; conference and equipment budget.

Location & working model

On-site in Milan (primary). Some flexibility for exceptional candidates. Occasional visits to partner sites.

What success looks like (6–12 months)

* A documented, reproducible RL pipeline from data → policy → evaluation → shadow.
* Benchmarked policies that outperform baselines in sim and shadow with clear safety margins.
* A mentored student cohort delivering publishable experiments and production-ready components.
* An accepted path to controlled live trials with partners.

How to apply

Apply on LinkedIn or send a short note with "RL — Senior", a link to work you're proud of (GitHub/Google Scholar/website), and availability. DMs welcome.

* Neutralis is an equal-opportunity employer. We value clarity, safety, and results over pedigree. If you've shipped control systems that matter, we want to hear from you.
