We at Confiz are hiring an Application Support Engineer with hands-on experience in monitoring tools, backend API troubleshooting, and incident management. Join our team to ensure seamless application performance and drive operational excellence.
Responsibilities
Monitor applications, systems, and hardware in the processing environment for abnormal processing conditions.
Troubleshoot technical issues with the Development and Quality Assurance teams and assist them with issue resolution.
Maintain the scheduling system and procedures for the processing environment.
Plan and execute change, problem, and incident management activities, production processes, controls, and service requests.
Research and resolve user problems as well as issues with software systems, operations processing, and assigned processing environments.
Use tools and resources effectively to solve system problems.
Utilize manuals, write-ups, and other tools to aid problem-solving and contribute to the body of knowledge.
Follow escalation procedures to resolve processing and user problems in a timely manner and meet service levels and other standards.
Provide subject matter expertise in application and infrastructure support, including incident, change, and problem management; controls; monitoring; the ITIL framework; knowledge management systems; contract metrics and SLAs; and tools such as Remedy, ServiceNow, and JIRA.
Identify and use tools, technology, or processes to improve the processing environment or its efficiency.
Maintain operations support disaster recovery/business continuity plans for assigned environments.
Coordinate and conduct disaster recovery testing, document results, and note modifications for future testing.
Requirements
Bachelor's or Master's degree in Computer Science or a related field, with 3-5 years of experience.
Experience working in a technical support role is required.
Hands-on experience with Azure Application Insights for log analysis and troubleshooting.
Good understanding of backend APIs, including the request/response journeys initiated from the front end.
Hands-on experience testing backend APIs.
Skilled in writing and optimizing SQL queries to extract relevant telemetry data.
Strong experience with monitoring tools such as New Relic, Splunk, or Grafana.
Ability to create, modify, and interpret dashboards and alerts based on application health and performance.
Capable of identifying and analyzing application issues based on logs and metrics.
Take ownership of P1 and P2 incidents, ensuring timely resolution and communication.
Proactively escalate issues to L3 engineers or developers when required.
Understand and perform Root Cause Analysis (RCA) for incidents.
Conduct impact analysis to evaluate scope and user impact of an issue.
Work well in a team environment and collaborate effectively with cross-functional teams.
Follow through on post-incident actions and ensure continuous improvement in monitoring and support processes.