Awesome Site Reliability Engineering

A curated list of Site Reliability and Production Engineering resources.

What is Site Reliability Engineering?

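"Fundamentally, it's what happens when you ask a software engineer to design an operations function." - Ben Treynor Sloss, Vice President of Engineering at Google, founder of Google SRE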

Contributing

Please take a look at the contribution guidelines first.

Contents

Culture
Education
Books
Hiring
Reliability
Monitoring & Observability & Alerting
On-Call
Post-Mortem
Capacity Planning
Service Level Agreement
Performance
Misc Articles
Blogs
Conferences & Meetups
Twitter
SRE Tools
SRE Resources

Culture

Education

Books

Hiring

Reliability

Monitoring & Observability & Alerting

On-Call

Post-Mortem

Capacity Planning

Service Level Agreement

Performance

Programming

Misc Articles

Real-time Messaging

#sre channel at Hangops Slack - Discussion of Site Reliability Engineering generally.
#incident_response channel at Hangops Slack - Discussion about Incident Response.
USENIX SREcon Slack

Blogs

Brendan Gregg's Blog - Highly Technical Blog Posts About Systems Internals, Performance and SRE.
Everything Sysadmin - Blog Posts About SysAdmin/DevOps/SRE by Tom Limoncelli.
High Scalability - Technical Blog Posts About Systems Architecture.
rachelbythebay - Technical Blog Posts.
SRE Weekly - Weekly Site Reliability Newsletter.
Production Ready - A mailing list about building resilient infrastructure and tools.
Susan J. Fowler - Various blog posts about SRE, Software Engineering and Microservices.
SysAdvent - One article for each day of December, ending on the 25th article.
Operations for Developers - A collection of resources for developers to strengthen their Ops skills.
Stephen Thorne's Blog - Blog Posts About SRE.
Increment - A digital magazine about how teams build and operate software systems at scale.
O’Reilly Systems Engineering and Operations Newsletter - Weekly systems engineering and operations news and insights from industry insiders.
GopherSRE - Blog Posts about Go and SRE.
Cindy Sridharan - Blog posts about distributed systems and their management.
Blameless Blog - Blog posts about SRE culture and practices.
Resilience Roundup - Weekly analysis of Resilience Engineering and Human Factors research designed for software systems.
Squadcast Blog - Blog posts about SRE best practices, reliability, on-call and incident management.

Conferences & Meetups

SREcon Conferences - The Official SRE Conference.
LISA Conferences - Prominent Conference About SysAdmin/DevOps/SRE.
SRE Tech Talks - SRE Talks Hosted by Google.
South Bay Site Reliability Engineering (Sunnyvale, CA) Meetup - A Group For Individuals Who Tackle Reliability Challenges For Web-Scale Systems.
San Francisco Reliability Engineering - A Group Of People Who Are Passionate About Reliable, Performant Software Systems.
Front Range Site Reliability Engineering - SRE Meetup in Boulder/Denver/Golden/DTC/FoCo area.
Site Reliability Engineering Munich, Germany - SRE Meetup in the greater area of Oktoberfest city.
ADDO - All Day DevOps - A 24 hour conference that is completely online and free.
Site Reliability Engineering Paris, France - SRE Meetup in the city of light.

Twitter

Google SRE Twitter Account - Google's SRE Twitter Account.
SREBook - The Official Twitter Account of Site Reliability Engineering Book.
SREcon - SREcon's Official Twitter Account.
SREWorkbook - The Official Twitter Account of Site Reliability Workbook.
The SRE Dev - SRE-related Posts from dev.to.
Twitter SRE - The Official Twitter Account of Twitter's SRE team.
Twitter SRE Weekly - The Official Twitter Account of SRE Weekly Newsletter.
USENIX Association - The Official USENIX Twitter Account.

SRE Tools

Awesome SRE Tools - A curated list of Site Reliability and Production Engineering tools.
List of Continuous Integration services

SRE Resources

Google运维揭密 (Google Operations Revealed)
