<h3>Overview</h3>
<p>ALLSIDES is redefining how the world experiences 3D content. We combine physically accurate scanning and generative AI to power content creation workflows for e-commerce, virtual environments, and immersive experiences. Our clients include global brands like adidas, Meta, Amazon, and Zalando. We operate a rapidly scaling photorealistic 3D scanning operation, capturing tens of thousands of assets annually while training next-generation AI models. As an NVIDIA Inception member, we collaborate with leading research institutions and actively participate in top-tier conferences in 3D computer vision and AI. More info:</p>
<p>About ALLSIDES</p>
<p>Position Overview</p>
<p>We're looking for an <b>Infrastructure DevOps Engineer</b> to build and maintain the foundation of our compute infrastructure. You'll work on hardware provisioning, networking, container orchestration, and deployment pipelines across cloud and on-premise environments. This role focuses on making our multi-GPU clusters reliable, our deployments reproducible, and our developers productive.</p>
<h3>Main Responsibilities</h3>
<ul>
<li>Provision, configure, and maintain heterogeneous compute clusters (CPU/GPU) across multiple physical locations</li>
<li>Implement dynamic compute and storage provisioning based on workload demands</li>
<li>Design storage solutions at both hardware and software level (NAS, distributed filesystems, storage tiering)</li>
<li>Implement and manage container orchestration systems (Kubernetes, Docker) for development and production workloads</li>
<li>Design and maintain infrastructure as code using tools like Terraform and Ansible</li>
<li>Build and optimize job scheduling and resource allocation systems (Slurm, Kubernetes)</li>
<li>Set up monitoring, alerting, and observability infrastructure (Prometheus, Grafana, IPMI)</li>
<li>Profile and optimize system-level performance: GPU utilization, memory bandwidth, I/O throughput, network latency</li>
<li>Manage networking, VPNs, and secure access across distributed systems</li>
<li>Handle reliability concerns: hardware failure detection, job checkpointing, disaster recovery</li>
</ul>
<h3>Qualifications</h3>
<ul>
<li>Strong Linux system administration knowledge</li>
<li>Experience with containerization (Docker) and orchestration (Kubernetes)</li>
<li>Knowledge of infrastructure as code (Terraform, Ansible)</li>
<li>Experience with HPC clusters and job scheduling (Slurm)</li>
<li>Familiarity with monitoring solutions (Prometheus, Grafana)</li>
<li>Understanding of networking principles and implementation</li>
<li>Experience with hardware infrastructure management (IPMI, BMC, server maintenance)</li>
<li>Knowledge of storage systems design (NFS, Ceph, distributed filesystems)</li>
</ul>
<h3>Nice to Have</h3>
<ul>
<li>Experience with cloud services (AWS or others)</li>
<li>Familiarity with bare-metal provisioning (MaaS)</li>
</ul>
<h3>What we offer</h3>
<ul>
<li>Compensation that reflects your experience, including stock options</li>
<li>Lunch vouchers for working days</li>
<li>Relocation assistance</li>
<li>Flexible working hours and work-from-home policy</li>
<li>Family-friendly environment</li>
<li>Amazing office space in South Tyrol, located at the Durst Group</li>
<li>Personal and professional growth opportunities</li>
</ul>
<p>You don't have to tick every box to apply; your drive and passion matter most!</p>
<p>This role is located on-site in Brixen/Bressanone, Italy. If you are interested, please apply with your CV attached to</p>