Sr. Application Operations Engineer
4 days ago

What you will do

  • Train and mentor, in a coach / player role, a team of first responders as Application Operations Specialists.
  • Monitor and respond to incidents relating to all Mitek SaaS Products and Critical services.
  • Escalate incidents and issues, and take ownership of the escalation process, outside of the Application Operations Team.
  • Assist in implementing, modifying, and tuning application monitoring based on Cloud Engineering or Software Engineering recommendations.
  • Assist with production deployments and system upgrades.
  • Monitor systems and applications to proactively identify problems and perform periodical health checks.
  • Communicate problem and incident management updates to impacted business users, including the actions taken to resolve them.
  • Maintain a knowledge base of common resolution and recovery actions for all critical systems and applications.
  • Provide responses to internal customers' trouble, request, or break / fix tickets in a timely fashion and in compliance with NOC and Cloud Operations team standards.
  • Create / develop automation or procedures to address incidents or requests.
  • Assist in the development, improvement, and implementation of Problem and Incident Management processes consistent with ITIL and COBIT best practices.
  • Measure and report monthly on production metrics, including but not limited to "Uptime", using metrics and SLAs for each technology area.
  • Establish minimum Runbook requirements for all critical systems and applications, and establish a process to keep Runbooks current.
  • Provide support for root cause analysis and preventative analysis of incidents.
  • Assist leadership in the development of training documents and tutorials.

Qualifications

  • 5-8 years of IT / Development experience, including Network Operations Center and 24 / 7 operations.
  • Bachelor's Degree in Computer Science, Engineering, Information Technology, or related field preferred.
  • Excellent written and verbal English communication skills.
  • Ability to lead complex, evidence-based troubleshooting efforts.
  • Excellent documentation skills regarding system issues, troubleshooting steps, resolution, and communication with stakeholders.
  • Experience with Software Change Management, Production Incident Management, Problem Management, System & Application Monitoring and Logging.
  • Willing to work flexible hours including night and / or swing shifts and to be part of an on-call rotation.

Skills you bring

  • Experience with both Linux and Windows operating systems administration.
  • Experience with system and application health monitoring and alerting tools such as Grafana, Zabbix, Elasticsearch, Nagios, and Kibana.
  • Working knowledge of basic network and routing concepts.
  • Experience in a scripting language such as Bash, Python, or PowerShell.
  • Experience and proven success working in a highly collaborative environment.
  • May be required to lift up to 30 pounds.

Nice skills to bring

  • Knowledge of ITIL and COBIT reference frameworks
  • Experience with Configuration Management tools such as Chef, Ansible, Puppet
  • Experience with Cloud Service Providers such as AWS
  • Event Log Correlation / Security Event & Incident Management
  • Knowledge of REST APIs
  • Experience in operating SaaS