W2 ONLY
Key Responsibilities:
• Deliver incident management and advanced-level L1/L2 support for internal applications across public cloud platforms, with a strong emphasis on AWS.
• Serve as the initial point of contact for application developers via a ticketing system.
• Communicate effectively with users at various organizational levels.
• Implement and use automation to support the scalability of the environment.
• Optimize operational processes to enhance efficiency, reliability, and security.
• Train users to self-diagnose and troubleshoot issues for expedited resolution.
• Conduct thorough investigations into issues to identify root causes and document strategies to prevent recurrence.
• Provide support for public cloud environments, particularly AWS.
• Manage events and incidents efficiently.
• Develop and implement scalable automation processes to handle tasks in a large-scale environment.
• Analyze and debug incidents, following up to gather feedback and prevent future issues.
• Support different development environments, including Unix, Linux, Mainframe, and Windows.
Required Skills and Experience:
• Proficiency with the software development life cycle (SDLC), including the ability to read Java and Python code.
• Hands-on scripting experience (Unix shell, Python).
• Extensive cloud experience, particularly with AWS.
• Expertise in Kubernetes.
• Strong troubleshooting and diagnostic skills for security and access issues in a large enterprise environment.
• Database administration skills (Oracle, Cassandra, CockroachDB), including performance tuning, connectivity, backups, indexes, and monitoring alarms.
• Middleware and messaging experience (Kafka, MQ).
• Experience with Tomcat.
• System engineering and administration skills (Unix/Linux).
• Familiarity with monitoring tools and ticketing systems.
• Commitment to automating processes for continuous improvement.
• Excellent communication skills.
• Ability to analyze details, understand incident causation, and implement preventive measures to ensure reliability and security.