About ALLSIDES

ALLSIDES is redefining how the world experiences 3D content. We combine physically accurate scanning and generative AI to power content creation workflows for e-commerce, virtual environments, and immersive experiences. Our clients include global brands like adidas, Meta, Amazon, and Zalando. We operate a rapidly scaling photorealistic 3D scanning operation, capturing tens of thousands of assets annually while training next-generation AI models. As an NVIDIA Inception member, we collaborate with leading research institutions and actively participate in top-tier conferences in 3D computer vision and AI.

More info: https://www.allsides.tech | https://blogs.nvidia.com/blog/covision-adidas-rtx-ai/

Position Overview

We're looking for an Infrastructure & DevOps Engineer to build and maintain the foundation of our compute infrastructure. You'll work on hardware provisioning, networking, container orchestration, and deployment pipelines across cloud and on-premise environments.
This role focuses on making our multi-GPU clusters reliable, our deployments reproducible, and our developers productive.

Main Responsibilities

- Provision, configure, and maintain heterogeneous compute clusters (CPU/GPU) across multiple physical locations
- Implement dynamic compute and storage provisioning based on workload demands
- Design storage solutions at both hardware and software level (NAS, distributed filesystems, storage tiering)
- Implement and manage container orchestration systems (Kubernetes, Docker) for development and production workloads
- Design and maintain infrastructure as code using tools like Terraform and Ansible
- Build and optimize job scheduling and resource allocation systems (Slurm, Kubernetes)
- Set up monitoring, alerting, and observability infrastructure (Prometheus, Grafana, IPMI)
- Profile and optimize system-level performance: GPU utilization, memory bandwidth, I/O throughput, network latency
- Manage networking, VPNs, and secure access across distributed systems
- Handle reliability concerns: hardware failure detection, job checkpointing, disaster recovery

Qualifications

- Strong Linux system administration knowledge
- Experience with containerization (Docker) and orchestration (Kubernetes)
- Knowledge of infrastructure as code (Terraform, Ansible)
- Experience with HPC clusters and job scheduling (Slurm)
- Familiarity with monitoring solutions (Prometheus, Grafana)
- Understanding of networking principles and implementation
- Experience with hardware infrastructure management (IPMI, BMC, server maintenance)
- Knowledge of storage systems design (NFS, Ceph, distributed filesystems)

Nice to Have

- Experience with cloud services (AWS or others)
- Familiarity with bare-metal provisioning (MaaS)

What We Offer

- Compensation that reflects your experience, including stock options
- Lunch voucher for working days
- Relocation assistance
- Flexible working hours and work-from-home policy
- Family-friendly environment
- Amazing office space in South Tyrol, located at the Durst Group
- Personal and professional growth opportunities

You don't have to tick every box to apply. Your drive and passion matter most!

This role is located on-site in Brixen/Bressanone, Italy. If you are interested, please apply with your CV attached to careers@allsides.tech