Test Triangle
Farringdon, Central London, United Kingdom
DevOps Engineer
Description
DevOps Engineer, EMEIA Infrastructure
ROLE OVERVIEW
We are looking for a skilled and pragmatic DevOps Engineer to own and evolve our infrastructure across the EMEIA region. This is a dual-horizon role: you will keep our existing VM-based systems healthy while leading a greenfield effort to design and build the managed environment that those solutions will migrate onto.
A significant proportion of what we build is produced rapidly using AI-assisted, structured development. That means our solutions can move from idea to deployment faster than ever, and our infrastructure needs to keep pace. We need someone who thrives in a fast-moving, ambiguous environment, can absorb change quickly, and treats adaptability as a core part of the job rather than an occasional demand.
The new managed environment is most likely to be based on Kube — Apple’s internal Kubernetes (EKS) deployment — though the final architecture will be a team decision and AWS@Apple remains an option for workloads requiring greater control. You will help inform that decision and then own the build-out, regardless of which direction is chosen.
You will work closely with data engineers, developers, and analysts, acting as the infrastructure backbone for a team that moves quickly and expects you to move with it. The role also involves working directly with third-party vendors who support some of the tools being deployed, and collaborating with teams outside of EMEIA — including WorldWide — to align on standards, share solutions, and resolve cross-regional dependencies.
KEY RESPONSIBILITIES
Platform Migration & Environment Design
- Lead the design and build-out of a new managed container environment to replace existing VM-based infrastructure — the most likely candidate is Kube (Apple’s internal Kubernetes/EKS cluster), but the final decision will be made collaboratively as a team
- Contribute meaningfully to the environment selection decision: weigh trade-offs between managed solutions (Kube) and more directly controlled alternatives (AWS@Apple), considering maintenance overhead, operational control, and team capability
- Own the migration of existing VM-based workloads onto the new platform, managing sequencing, risk, and continuity of service throughout
- Establish and maintain the standard workflow for deploying solutions: build locally → containerise → publish to Kube → configure connectivity to Apple internal system dependencies
Apple Internal Networking & Connectivity
- Configure and maintain networking between Kube and Apple’s internal systems, including Shield, Snowflake, Appleconnect, Floodgate, and any other platform dependencies the team relies on
- Own namespace and compute provisioning on the shared Kube cluster, ensuring workloads are appropriately isolated and correctly configured
- Manage credentials, service accounts, and access controls across the full connectivity chain — from container to downstream service
- Act as the go-to expert on how things connect within Apple’s internal network topology
Infrastructure Management
- Own and manage cloud infrastructure across EMEIA using internal cloud tooling (cloud.apple.com and connected systems including Shield)
- Manage certificates, firewalls, resource pools, networking, and access controls
- Ensure infrastructure is appropriately sized, resilient, and cost-efficient
- Maintain accurate documentation of infrastructure topology and configuration
VM Provisioning & Automation (Existing Estate)
- Maintain and operate existing virtual machines, primarily on RHEL, while migration to the new environment is in progress
- Build and maintain standardised, repeatable provisioning processes (e.g. via Ansible, Terraform, or equivalent IaC tooling)
- Manage package deployment, software repositories, databases, and web servers
- Own the patching and update lifecycle for managed systems
Monitoring & Reliability
- Implement and maintain monitoring, alerting, and observability across both the existing VM estate and the new container environment
- Proactively identify risks, bottlenecks, and failure patterns before they impact users
- Define and track appropriate SLIs/SLOs for critical services
- Conduct post-incident reviews and drive lasting improvements
Supporting AI-Augmented Development
- A large proportion of the solutions you will support are built rapidly using structured AI-assisted development — you must be comfortable working with codebases and configurations that evolve quickly, may not have deep documentation histories, and may have been substantially generated with AI tooling
- Provide the infrastructure scaffold that allows AI-assisted solutions to move from local development to production reliably and safely
- Be a pragmatic partner to developers: unblock deployment quickly, catch infrastructure-level risks early, and help establish patterns that make rapid iteration safe at scale
- Actively use AI tools (e.g. Claude, Copilot, or similar) to accelerate your own work: writing scripts, diagnosing issues, generating runbooks, reviewing configurations
Diagnosis & Incident Response
- Take ownership of vague or ambiguous production issues (e.g. “it’s running slow”, “the server keeps falling over”) and drive them through to resolution
- Deliver short-term fixes rapidly to restore service, while tracking and delivering long-term root cause resolutions
- Maintain a pragmatic balance between speed-of-recovery and quality-of-fix
Essential
- Proven experience in a DevOps, infrastructure, or platform engineering role
- Hands-on experience with Kubernetes — deploying, configuring, and operating workloads in a shared or managed cluster environment
- Experience containerising applications: writing Dockerfiles, managing images, publishing to a registry, and debugging container-level issues
- Strong networking fundamentals: DNS, TLS/SSL certificates, firewall rules, load balancing, VPNs, and service-to-service connectivity
- Comfort operating in environments where the architecture is still being defined — able to contribute to the decision, then execute once direction is set
- Hands-on experience with RHEL (or equivalent enterprise Linux) — provisioning, hardening, package management (yum/dnf), systemd services
- Experience managing cloud infrastructure, ideally in an enterprise private/hybrid cloud environment
- Experience with infrastructure-as-code or configuration management tooling (e.g. Terraform, Ansible, Puppet, or similar)
- Solid scripting ability in Bash and at least one higher-level language (Python preferred)
- Experience with monitoring and observability tooling (e.g. Prometheus, Grafana, Datadog, or similar)
- Strong incident diagnosis skills — able to work from vague symptoms to root cause using logs, metrics, and reasoning
- Comfortable working with AI-generated or AI-assisted codebases: reading, extending, and debugging solutions without a full traditional authorship history
- Clear written and verbal communication — able to translate infrastructure complexity for non-technical stakeholders
Desirable
- Experience with AWS or AWS@Apple, particularly EKS
- Familiarity with Apple’s internal platform tooling: Kube, Shield, Appleconnect, Floodgate, or similar
- Experience integrating with Snowflake, including managing drivers, credentials, and network access
- Experience with CI/CD pipelines (GitLab CI, Jenkins, GitHub Actions, or similar)
- Exposure to security tooling, vulnerability scanning, or compliance frameworks (e.g. CIS Benchmarks)
- Familiarity with secrets management tooling (Vault, CyberArk, or similar)
- Experience working in a regulated or enterprise environment with change management processes
WAYS OF WORKING
- You are comfortable with genuine ambiguity — including at the architectural level — and can make progress and contribute to decisions without waiting for everything to be resolved
- You default to automation: if you do something twice, you script it; if you do it three times, you build a process
- You adapt quickly: the tools, environments, and solutions you support can change fast, and you treat that as normal rather than exceptional
- You are pragmatic under pressure: you know when to stop the bleeding first and fix it properly later
- You are self-directed and comfortable owning problems end-to-end with minimal hand-holding
- You are a willing partner to developers who move fast — you keep up, add guardrails where they matter, and don’t become a bottleneck
WHAT SUCCESS LOOKS LIKE
- A new managed container environment is designed, built, and running — with existing VM-based workloads migrated onto it in a controlled, sequenced way
- The standard deployment path (build → containerise → publish → connect) is well-established, documented, and easy for the team to use
- Connectivity from the new environment to Apple internal systems (Snowflake, Appleconnect, Shield, Floodgate, etc.) is reliable, well-understood, and correctly secured
- Teams are unblocked quickly when they need new integrations, access, or capabilities — even when the solutions they are deploying have been built at speed
- Production issues are resolved rapidly, with lasting fixes following close behind
- Monitoring catches issues before users do
- The infrastructure estate — both old and new — is well-documented, well-understood, and in a known-good state
Job ID: f14adde5-5748314640