Vonage, a leader in WebRTC communications, is looking for a Site Reliability Engineer to be based out of our Barcelona office.
We believe that there shouldn’t be walls between operations and development and we have embraced the DevOps movement.
As a Site Reliability Engineer, you will work as part of the development team to build automation and tools that deploy, monitor, and maintain the platform's health and meet its targeted SLOs and SLAs.
What you'll do
Lead the effort in ensuring reliability of the platform.
Create software and tooling that improve the performance, stability, and reliability of the platform.
Work as part of a development team.
Monitor application metrics to help improve software performance.
Build solutions that are highly resilient, scalable, and secure.
Bring a wide breadth of knowledge across software, infrastructure, and security.
Adopt best practices and champion an engineering culture emphasizing Agile.
What's required for application
Proven experience building, supporting, and architecting high-availability cloud infrastructure.
Experience working with monitoring, logging, and alerting solutions and their related tools.
Experience with tooling such as Terraform, Ansible, Docker, Kubernetes, and Chef.
Fluent and comfortable working with Cloud Infrastructure.
Ability to read, write, and troubleshoot software code.
Good understanding of CI/CD tools.
Champion of DevSecOps using tools such as HashiCorp Vault, KMS, and Secrets Manager.
Experience with software development, algorithms, data structures, and systems design.
Understanding of monitoring tools such as Datadog, ELK, and Grafana.
Bachelor's degree (or higher) in Computer Science and / or related work experience.
Nice to have, but not required
Working knowledge of other AWS services such as Glacier, Elastic Container Service (ECS), Elastic MapReduce (EMR), and DynamoDB.
Automation and orchestration tools such as Jenkins.
Ruby or Java development skills.
Data pipeline knowledge, especially with tools like MapReduce, Kafka, and the ELK stack.