Senior Site Reliability Engineer, Observability, FedRAMP
Company: Splunk Inc.
Location: Washington
Posted on: March 4, 2025
Job Description:
Splunk, a Cisco company, is building a safer and more resilient
digital world with an end-to-end full stack platform made for a
hybrid, multi-cloud world. Leading enterprises use our unified
security and observability platform to keep their digital systems
secure and reliable. Our customers love our technology, but it's
our caring employees that make Splunk stand out as an amazing
career destination. No matter where in the world or what level of
the organization, we approach our work with kindness. So bring your
work experience, problem-solving skills and talent, of course, but
also bring your joy, your passion and all the things that make you,
you. Come help organizations be their best, while you reach new
heights with a team that has your back.

Join us on the Splunk
TechOps team, working on our vision to make machine data
accessible, usable, and valuable to everyone! You will configure
and maintain our customer-facing SaaS product, Splunk Cloud. Come
join a team that is striving for operational awesomeness and trying
to automate the world. We have a large AWS presence, and you should
have experience with AWS architecting, deployments, and networking.
This is an incredible opportunity to use your existing cloud
experience and drive the growth of Splunk Cloud.

Meet the Products and Technology Team
Want to build security and
observability products people love AND work with people as smart
(and humble) as you are? Our products and technology team delivers
digital resilience at enterprise scale with a self-service Splunk
portfolio that offers unified security analytics, full stack
observability and real-time visibility of streaming data. Learn
more about the team, meet our leaders, and hear from Splunk
technologists and engineers at .

Responsibilities:
- You are passionate about building and running distributed
systems at scale in production. You understand the challenges and
trade-offs to be made when building and deploying systems to
production.
- You constantly consider "How can I automate this process?"
- Knowledge of best practices related to security, performance,
and disaster recovery.
- Skilled in identifying performance bottlenecks, spotting
anomalous system behavior, and determining the root cause of
incidents.
- Experience monitoring cloud environments using tools like
Splunk, VictorOps and SignalFx.
- You care about good documentation and appreciate how it allows
a distributed team to function.
- Ability to tackle complex problems, resolve operational issues,
and interact with vendors to find solutions.
- Comfortable working with critical, customer-facing issues and
able to prioritize quickly when escalations happen.

Requirements:
- Extensive experience as a Linux system administrator supporting
enterprise computing platforms and systems.
- Expertise in public cloud (AWS, GCP, Azure) and in container and
orchestration tools (Docker, Kubernetes).
- Knowledge and understanding of OpenTelemetry.
- Deep understanding of logging, monitoring, tracing, and
alerting practices in large-scale distributed systems.
- Proficiency with programming languages like Python along with
shell scripting to automate tasks.
- Experience supporting customer-facing SaaS infrastructure or
similar cloud-related services.
- Experience in administering or architecting distributed Splunk
and Observability environments.
- Experience in setting up SLOs & SLIs.

Splunk is an Equal Opportunity Employer
At Splunk, we believe creating a culture of
belonging isn't just the right thing to do; it's also the smart
thing. We prioritize diversity, equity, inclusion, and belonging to
ensure our employees are supported to bring their best, most
authentic selves to work where they can thrive. Qualified
applicants receive consideration for employment without regard to
race, religion, color, national origin, ancestry, sex, gender,
gender identity, gender expression, sexual orientation, marital
status, age, physical or mental disability or medical condition,
genetic information, veteran status, or any other consideration
made unlawful by federal, state, or local laws. We consider
qualified applicants with criminal histories, consistent with legal
requirements.

Base Pay Range
SF Bay Area, Seattle Metro, and New York City Metro Area
Base Pay Range: $174,800.00 - $240,350.00 per year
California (excludes SF Bay Area), Washington (excludes Seattle Metro), Washington DC Metro, and Massachusetts
Base Pay Range: $157,320.00 - $216,315.00 per year
All other cities and states excluding California, Washington, Massachusetts, New York City Metro Area and Washington DC Metro Area
Base Pay Range: $139,840.00 - $192,280.00 per year

Splunk provides flexibility and choice in the
working arrangement for most roles, including remote and/or
in-office roles. We have a market-based pay structure which varies
by location. Please note that the base pay range is a guideline and
for candidates who receive an offer, the base pay will vary based
on factors such as work location as set out above, as well as the
knowledge, skills and experience of the candidate. In addition to
base pay, this role is eligible for incentive compensation and may
be eligible for equity or long-term cash awards.

Benefits are an
important part of Splunk's Total Rewards package. This role is
eligible for a competitive benefits package which includes medical,
dental, vision, a 401(k) plan and match, paid time off and much
more! Learn more about our next-level benefits at .
Keywords: Splunk Inc., Washington DC, Senior Site Reliability Engineer, Observability, FedRAMP, Engineering, Washington, DC