destroytoday.com

The build failed, but I swear it wasn’t me

Last night, I was relaxing on the couch, watching a movie, when I decided to merge a small pull request. The PR adds a tiny feature that lets you click a date label in Cushion to navigate to that date, a simple front-end addition that doesn’t touch anything on the backend. The PR’s checks all passed, so merging it to staging should have been uneventful, or so I thought. Immediately, I received a `build failed` notification.

I opened the logs and saw that the CI tests failed on setup. Looking closer, they failed while running the task that migrates the database in preparation for the tests. The error showed:

Gem::ConflictError: Unable to activate activesupport-4.1.16, because json-2.3.0 conflicts with json (~> 1.7, >= 1.7.7)

I scratched my head and thought this was strange because I didn’t touch activesupport or json. The especially strange part was that it said json-2.3.0 conflicted even though that isn’t the version I have installed. That right there should’ve clued me in to the culprit, but I figured I’d spend a couple more hours spinning with confusion until my brain fully digested that.

I searched Google for the error, but nothing came even close. I even tried updating activesupport, only to find that its json dependency had been removed because it didn’t actually need it. That was lovely to read, considering that this dependency conflict was the one thing preventing me from doing anything. Unfortunately, the major upgrade from 4.x.x to 6.x.x caused a few of my tests to fail and I wasn’t ready to troubleshoot that at 11pm on a Monday, so I went back to the build logs and looked even closer.

Something else seemed strange that I hadn’t noticed before. During the setup phase, Heroku’s Ruby buildpack runs rake db:migrate, which seemed normal until I remembered that Cushion’s setup script actually runs rake db:migrate db:views, which also sets up the Postgres views. That made me do a double-take and realize that the CI setup had been running both! What a waste! That still wasn’t the culprit for the gem conflict, but it helped me home in on it.

Looking between the two rake calls, my subconscious said, “Okay, I’m done having fun. Here’s the answer,” and let me see the real issue. Heroku’s task was called with rake db:migrate, but my task was run with bundle exec rake db:migrate db:views. Anyone who has spent any time running a Ruby app knows this hiccup all too well. Prefixing rake with bundle exec tells Ruby to use the gem versions specified in Cushion’s Gemfile.lock file. Without the prefix, rake loads whatever gem versions happen to be installed in the environment, which explains why the conflict named a version of json that my lockfile doesn’t specify.
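To make the conflict concrete, here is a minimal sketch that uses RubyGems’ own requirement matching to check the constraint from the error message. The requirement string and the 2.3.0 version come straight from the error; the 1.8.6 version is only an assumed example of a 1.x release a lockfile might pin, and the constant and method names are hypothetical.

```ruby
require "rubygems"

# Requirement that activesupport 4.1.16 places on json, copied from the error.
JSON_REQUIREMENT = Gem::Requirement.new("~> 1.7", ">= 1.7.7")

# Returns true when the given json version string satisfies the requirement.
# (Hypothetical helper, for illustration only.)
def json_requirement_satisfied?(version_string)
  JSON_REQUIREMENT.satisfied_by?(Gem::Version.new(version_string))
end

# 2.3.0 is the version named in the error, picked up without bundle exec.
# 1.8.6 is an assumed stand-in for a version pinned in Gemfile.lock.
["2.3.0", "1.8.6"].each do |version|
  puts "json #{version} satisfies #{JSON_REQUIREMENT}: #{json_requirement_satisfied?(version)}"
end
```

The `~> 1.7` part allows any 1.x release from 1.7 onward but excludes 2.0 and above, so a json 2.x gem picked up outside of Bundler can never satisfy it.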

Unfortunately, you can’t stop the Heroku Ruby buildpack from running rake db:migrate, so I had to do some unsavory things that I’m not proud of in order to make everything work. First, I changed my db:migrate task to be a no-op that simply logs that it should no longer be used to actually migrate the database. Then, I created a new task that actually performs the migrations, and called this one from my test-setup script, prefixed with bundle exec. I also made sure to update my Heroku apps’ release phase to use the new task. This fixed the issue, set up my test suite successfully, and unblocked me from moving forward.
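The workaround could look roughly like the sketch below. This is not Cushion’s actual Rakefile: the task name db:apply_migrations, the run_migrations helper, and the MIGRATION_LOG constant are all hypothetical stand-ins, and the real migration logic is replaced by a placeholder.

```ruby
require "rake"

# Records calls so the sketch can be exercised without a database.
# (Hypothetical; a real task would run the app's migrations instead.)
MIGRATION_LOG = []

# Placeholder for the real migration logic.
def run_migrations
  MIGRATION_LOG << :migrated
end

namespace :db do
  desc "No-op, kept only because the buildpack calls it without bundle exec"
  task :migrate do
    puts "db:migrate is a no-op. Use `bundle exec rake db:apply_migrations` instead."
  end

  desc "Actually run the migrations (hypothetical task name)"
  task :apply_migrations do
    run_migrations
  end
end
```

With something like this in place, the buildpack’s bare rake db:migrate call does nothing harmful, while the setup script and release phase would call bundle exec rake db:apply_migrations so the locked gem versions are used.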

This experience further fuels my negative attitude toward dependencies. For any of my Ruby gems, even if they were updated, their versions are locked, so I know they’ll work. The buildpack, however, isn’t version-locked, so any update affects anyone who uses it from then on. I’ve already seen a GitHub issue pop up for this from someone else, but affecting their actual deploy. If this were preventing a serious launch, I’d be livid. Hopefully it’s corrected swiftly and further updates aren’t rolled out as hastily. Until then, I feel better knowing that I don’t need to wait for the fix.

Update

I opened a Heroku support ticket and they responded quickly to acknowledge the issue. Then the buildpack maintainer replied, apologized profusely, and mentioned that they tested the buildpack with plenty of other Heroku apps, but the testing period didn’t catch this issue. He also told me that they rolled back the buildpack.

Now that I’ve given it some time, I realize that I was a bit heated when I insinuated that these updates were rolled out hastily. Obviously, that’s not the case—we’ve all pushed code that included something unexpected. I should’ve been better than that.

Also, correcting my uninformed claim that buildpacks can’t be version-locked, I discovered that you can lock a buildpack version by using its GitHub URL with the version tag appended after a `#` (e.g., https://github.com/heroku/heroku-buildpack-ruby#v215).