Sr Application Operations Engineer
5 days ago

What you will do

  • Train and mentor, in a coach / player role, a team of first responders as Application Operations Specialists.
  • Monitor and respond to incidents relating to all Mitek SaaS Products and Critical services.
  • Escalate incidents and issues, and take ownership of the escalation process, outside of the Application Operations Team.
  • Assist in implementing, modifying, and tuning application monitoring based on Cloud Engineering or Software Engineering recommendations.
  • Assist with production deployments and system upgrades.
  • Monitor systems and applications to proactively identify problems and perform periodic health checks.
  • Communicate problem and incident management updates to impacted business users, including the actions taken to resolve them.
  • Maintain a knowledge base of common resolution and recovery actions for all critical systems and applications.
  • Respond to internal customers' trouble, request, or break / fix tickets in a timely fashion and in compliance with NOC and Cloud Operations team standards.
  • Create / develop automation or procedures to address incidents or requests.
  • Assist in development, improvement and implementation of the processes for Problem and Incident Management consistent with ITIL and COBIT best practices.
  • Measure and report monthly on production metrics, including but not limited to "Uptime", using metrics and SLAs for each technology area.
  • Establish minimum Runbook requirements for all critical systems and applications and establish a process to keep Runbooks current.
  • Provide support for root cause analysis and preventative analysis of incidents.
  • Assist leadership in the development of training documents and tutorials.

Qualifications

  • 5-8 years of IT / Development experience including Network Operations Center and 24 / 7.
  • Bachelor's Degree in Computer Science, Engineering, Information Technology, or related field preferred.
  • Excellent written and verbal English communication skills.
  • Ability to lead complex, evidence-based troubleshooting efforts.
  • Excellent documentation skills regarding system issues, troubleshooting steps, resolution, and communication with stakeholders.
  • Experience with Software Change Management, Production Incident Management, Problem Management, System & Application Monitoring and Logging.

Skills you bring

  • Experience with both Linux and Windows operating systems administration.
  • Experience with system and application health monitoring and alerting tools such as Grafana, Zabbix, Elasticsearch, Nagios, and Kibana.
  • Working knowledge of basic network and routing concepts.
  • Experience in a scripting language such as Bash, Python, or PowerShell.
  • Experience and proven success working in a highly collaborative environment.
  • May be required to lift up to 30 pounds.

Nice skills to bring

  • Knowledge of ITIL and COBIT reference frameworks
  • Experience with Configuration Management tools such as Chef, Ansible, Puppet
  • Experience with Cloud Service Providers such as AWS
  • Event Log Correlation / Security Event & Incident Management
  • Knowledge of REST APIs
  • Experience in operating SaaS