Today at my internship I merged a PR that broke the master branch. Surprisingly, it wasn’t a big deal. As soon as the issue was reported to my team, I knew which commit was to blame. I pushed a branch that reverted the commit and waited for it to pass CI. On my mentor’s recommendation, I sent a Slack message to the whole engineering group letting them know that someone had broken master and a fix was incoming.
Yesterday, this site went down for about four hours. Complaints started rolling in from my millions of ardent followers, spurring me into action. Join me as I deconstruct what went wrong, how I fixed it, and how I tried to prevent the problem from occurring again.