We are a fast-growing HR tech startup backed by leading international VCs, having raised €9M+ from 360 Capital, IFF, Kfund, and 14Peaks. Our team of 30+ professionals is passionate about shaping the future of talent assessment. Skillvue is an AI-powered skills assessment SaaS platform that helps companies hire top-skilled candidates and measure employee skills, culture, and leadership at scale to upskill and grow their workforce.

In this role you will own end-to-end ML systems: model training, fine-tuning, deployment, monitoring, and cost/performance optimization. You will partner closely with organizational psychologists, people scientists, and software engineers to productionize LLMs, real-time conversational agents, and ML pipelines. You will report to the Head of AI & Science and drive engineering best practices, reliability, and reproducibility across the stack.
Design, build, and maintain end-to-end ML platforms and pipelines: data ingestion, feature engineering, training, validation, deployment, and monitoring.
Develop, fine-tune, and deploy LLMs and GenAI services for assessment tasks (prompt engineering, instruction tuning, RLHF/IL, retrieval-augmented generation).
Build infrastructure-as-code (Terraform/CloudFormation) for reproducible environments and secure, compliant deployments.
Create automated CI/CD for data, models, and infrastructure (model/data versioning, reproducible training runs, canary/blue-green deployments).
Implement monitoring, observability, drift detection, and alerting for model performance and data pipeline health.
Integrate vector databases, retrieval pipelines, and caching strategies for RAG systems.
Ensure data and model governance: lineage, access controls, privacy safeguards, and auditability.
Bachelor’s or Master’s degree in Computer Science or a related field.
7+ years of experience in ML engineering/MLOps delivering production ML products.
3+ years of hands-on experience training and deploying GenAI/LLMs in production.
Strong production experience on AWS (SageMaker, Lambda, ECS/EKS; Bedrock experience is a plus).
Proven track record building highly scalable services and real-time systems.
Experience with infrastructure-as-code (Terraform, CloudFormation) and container orchestration (Docker, Kubernetes).
Proficiency in Python and TypeScript; solid software engineering practices and Git workflows.
Experience implementing model monitoring, drift detection, and A/B testing for ML models.
Fluency in English (C1) and strong communication skills for cross-functional collaboration.
Experience with distributed training frameworks (e.g., Horovod, DeepSpeed with ZeRO) and model parallelism.
Familiarity with feature stores and online/offline serving (e.g., Feast, Tecton).
Contributions to open-source ML infrastructure, published ML blog posts, or conference papers.
Opportunity to shape and scale AI systems at an early-stage company with real product impact.
Remote work (within EU timezones).
Competitive compensation, flexible work, and budget for conferences, training, and research resources.
A collaborative, flat environment where engineering leadership influences product and research direction.