STACK is a trusted global partner in digital infrastructure, delivering scalable AI/ML and Cloud data center solutions to tackle global challenges with speed, scale, certainty, and responsibility. With a client-first focus and conscientious development and operational practices, we empower innovative companies to tackle their demand curve.
General Purpose of the Position
As a Critical Operations Engineer (COE), you are a business-hours-based technical specialist providing engineering depth, governance, and reliability focus to support safe operations, asset performance, and lifecycle management across STACK EMEA Data Centers. You will maintain and govern high-quality operating procedures and ensure disciplined change execution with proper planning, prerequisites, and acceptance criteria. You will provide technical oversight of vendor maintenance and installations, contribute to incident management through diagnosis, recovery planning, and corrective actions, and embed lessons learned into procedures and preventive maintenance.
Key Responsibilities
Create, review, maintain, and improve SMPs, SOPs, MOPs, and EOPs, ensuring technical accuracy, risk controls, monitoring points, rollback/recovery steps, acceptance criteria, and evidence requirements
Ensure procedure quality and maintain strict document control discipline
Lead or support pre-job briefs and method reviews, verifying all prerequisites (spares, tools, isolations, monitoring, communications/escalation, rollback) before execution
Support PM planning and optimisation, including task quality, frequencies, compliance, and elimination of recurring defects through structured reviews and lessons learned
Support identification of critical spares, spares strategy, and vendor repair workflows for key assets
Maintain high-quality technical records to support KPIs, audits, and customer assurance
Provide technical oversight of vendors and contractors for maintenance, repairs, and installations, validating methods, test results, and close-out evidence
Deliver technical diagnosis and recovery planning for complex issues, supporting stabilization and restoration activities
Provide coaching and technical guidance to technicians and operations teams, contributing to training materials, competency development, and scenario drill preparation
Required Skills and Background
Experience in facilities, operations, or engineering within mission-critical environments; Data Center or hyperscale experience preferred
Competence in operating, maintaining, and troubleshooting complex M&E systems, including UPS, generators, LV distribution, transformers, cooling systems, controls/monitoring, and fire/life safety interfaces
Strong understanding of safe systems of work, risk controls, and disciplined change execution in live operational environments
Proficient with monitoring platforms (BMS/SCADA, EPMS/DCIM desirable) and OMS workflow discipline
Ability to read, interpret, and explain electrical single-line diagrams and system schematics accurately
Skilled in analysing technical problems
Good command of written and spoken English
Excellent organisation, prioritisation, and stakeholder management skills
Continuous improvement mindset with focus on standardisation and raising engineering governance standards