Job Description:
Long-term on-site placement at a world-renowned sporting goods company.
We are seeking an experienced Site Reliability Engineer (SRE) to ensure the high availability and reliability of our consumer-facing applications. The ideal candidate will have a strong background in supporting retail industry applications, preferably with hands-on experience in AWS, Grafana, and Java development. Strong English communication skills are also essential for this role.
Key Responsibilities:
- Implement and maintain site reliability processes and systems.
- Provide service outage escalation response and guidance alongside software engineers.
- Review monitoring metrics and assess what they indicate about current system behavior.
- Research and implement new tools and technologies to solve problems more efficiently.
- Conduct root cause analysis of production issues, including complex backend troubleshooting and debugging.
- Collaborate with cross-functional teams to achieve reliability excellence.
Preferred Experience:
- 5+ years of proven work experience as a Site Reliability Engineer or in a similar role, particularly in the retail industry.
- Hands-on experience supporting consumer-facing applications.
- In-depth knowledge of AWS services and best practices for cloud infrastructure.
- Proficiency with Grafana for monitoring and observability.
- Strong Java development skills, with experience implementing API functionality using REST, JSON, or similar technologies.
- Experience with statistical or reliability software packages.
Skills:
- Expertise in troubleshooting Linux servers.
- Strong coding skills in at least one programming language, with a preference for Java.
- Familiarity with software engineering best practices, including testing, continuous integration, and continuous delivery.
- A passion for solving problems using open-source software.
- The ability to thrive in a rapidly evolving, globally distributed environment.
- Strong English communication skills, both written and verbal.