Vendoring Dependencies in Go


The most common question I'm asked about Go by people working in other languages is how we deal with external dependencies. With solutions like pip in Python, npm in Node, various Java/JVM build systems like Maven/SBT, and cargo in Rust, the absence of a standard dependency management tool can feel like a void when looking at building production systems in Go.

When getting started with Go, the simplest way to retrieve code is through the go get command, which pulls down a package and stores it within the GOPATH, resolving dependencies and retrieving them as necessary. This command will always grab whatever is the latest version of the code and can update existing repositories with the addition of a -u flag. This is the process that's used in the official getting started documentation.

The origins of this approach come from the way Google writes code internally. In the Go FAQ they address the question of managing package versions with an answer that basically recommends that authors of Go code use package paths to differentiate code, follow the Go 1 compatibility guidelines to not break existing APIs, and copy code which may change directly into the source tree of projects. This approach purportedly works well for Google because they use a billion line monorepo for almost all of their code, so everything can live together in the same GOPATH. A comprehensive story for vendoring external dependencies isn't a necessity for them.

Until recently, these workflow guidelines from the FAQ were the extent of the official vendoring story for Go. The ease of go get, combined with the lack of any comprehensive official story for how dependencies should be managed, caused a degree of fragmentation within this piece of the Go ecosystem.

Early days at Meta

While many tools existed at this stage, few of them were truly satisfactory. Starting in our very early days of development and extending a surprisingly long time forward, the way we handled both pinning and updating dependencies at Meta was through a cached GOPATH for each repository on our build servers. Our process for updating dependencies was pretty simple: we would clear the cached GOPATH, and everything would be brought up to the latest version in the next build. For local development, we just relied on go get -u to update our dependencies. While rife with potential issues, this approach was functional, and it carried us through the very early phases of product development. There weren't more than a handful of times where we had actual issues stemming from this system.

However, as we began to grow our development team and roll out our closed beta with users coming onto the product, the handful of issues we had seen became too many to tolerate. Additionally, we really needed the ability to create reproducible builds, which would allow buggy deployments to be easily rolled back. Reversing an update to the cached directory on the build servers was cumbersome at best, and when issues arose, rather than rolling back, we usually took the approach of just grinding through the bugs to get a fix out as fast as possible, which is neither a reliable nor a sustainable approach.

Our requirements

Working with codebases written by third parties is a pivotal component of the work done by nearly all software developers. At Meta, we pull in existing functionality for everything from HTTP middleware, to semantic version parsing, to API client libraries. Most of these codebases have active communities using and working with them, and some are undergoing active development and change. Because of this, it's important to be able to know what version of these external libraries a particular running service is using. On top of that, while the Go community values stability very highly, because tools around dependency management haven't existed for most of the language's history, many large, mature, useful projects don't really use semantic versions for release tags, making the ability to pin exact commits crucial for reproducibility.

Additionally, our own system is broken up into many separate codebases, encouraging us to keep services loosely coupled and simplifying our setup for independent, automated deployments that are run many times each day. While a good deal of our services are fairly stable in the functionality they provide, most are under some development, and a few are very heavily worked with. Additionally, we have a few library codebases containing common functionality. The way we keep everything up to date needs to be easy to manage with precision, and not so cumbersome that we're discouraged from doing so.

Another fundamental requirement was support for Google App Engine, which we use heavily within our infrastructure. Any dependency management tool we were to pick would have to be compatible with the various tools provided by the App Engine SDK, including both local development and deployment.

A light in the distance

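The release of Go 1.5 introduced experimental support for vendored packages within a special vendor/ directory, detailed in a design document leading up to the release and enabled by setting the GO15VENDOREXPERIMENT environment variable. In Go 1.6, this functionality became standard and is enabled by default.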

The vendor/ directory acts in a very similar way to the $GOPATH/src folder, and the go tooling uses this directory to look for packages before looking within the global GOPATH location.
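To make that lookup order concrete, here is a minimal sketch of how a vendor-aware resolver could enumerate the places to look for an import. It is a simplification and not the go tool's actual implementation: the function name candidateDirs and the example paths are made up for illustration, and it ignores the standard library and the requirement that a candidate directory actually contain Go files.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// candidateDirs lists, in priority order, the directories a vendor-aware
// lookup would try for importPath when compiling the package in importerDir.
// srcRoot stands in for $GOPATH/src. All paths are slash-separated.
// This is a hypothetical helper, not part of the go tool.
func candidateDirs(srcRoot, importerDir, importPath string) []string {
	var dirs []string
	srcRoot = path.Clean(srcRoot)
	dir := path.Clean(importerDir)

	// Walk upward from the importing package toward srcRoot, so the
	// vendor/ directory nearest to the importer is tried first.
	for dir == srcRoot || strings.HasPrefix(dir, srcRoot+"/") {
		dirs = append(dirs, path.Join(dir, "vendor", importPath))
		if dir == srcRoot {
			break
		}
		dir = path.Dir(dir)
	}

	// Only after every vendor/ directory has been considered does the
	// lookup fall back to the shared copy under $GOPATH/src.
	dirs = append(dirs, path.Join(srcRoot, importPath))
	return dirs
}

func main() {
	// Example paths are assumptions chosen for illustration.
	for _, d := range candidateDirs(
		"/go/src",
		"/go/src/example.com/app/cmd/server",
		"github.com/pkg/errors",
	) {
		fmt.Println(d)
	}
}
```

The first candidate that exists wins, which is why a copy under a project's own vendor/ directory shadows whatever version happens to be sitting in the global GOPATH.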

This capability makes vendored dependencies first-class citizens within the standard Go build system, and it opened the door for us to migrate our old cached GOPATH system to a real dependency management tool.

Surveying our options

The official Go repository on GitHub has a wiki article on package management tools which provides a fairly complete list of the available options.

Some of these tools work by modifying the GOPATH environment variable to use local dependencies. With App Engine, this doesn't work when you're trying to run multiple services at the same time, each with a different set of dependencies. The App Engine SDK works by specifying all modules that the developer wants to run in a single command, and since the GOPATH can only be set to a single value, there's no good way to use different dependencies with different services. Furthermore, the introduction of vendor/ made this approach largely unnecessary.

One tool that has gained a lot of traction in the community is gb, which actually does away with the standard Go toolchain entirely. It provides a project-based structure for working with Go code, and in doing so eliminates the issues that come from the go tool itself. While it has gained quite a lot of popularity, because it does away with the GOPATH and standard Go toolchain, we hit some roadblocks trying to get it working with the App Engine SDKs that we couldn't work our way around.

The tool we eventually settled on is glide, a package manager that takes its inspiration from other existing tools, as its authors describe in the README:

Are you used to tools such as Cargo, npm, Composer, Nuget, Pip, Maven, Bundler, or other modern package managers? If so, Glide is the comparable Go tool.

While the majority of our backend systems at Meta are written in Go, we also make extensive use of Python (using pip for dependencies), JavaScript (using both npm and bower), and Swift and Objective-C for our Mac app (using CocoaPods). Thus, a tool that has similarities to other package managers is immediately appealing.

Our team also liked the declarative nature of stating dependencies in a glide.yaml file, the presence of the glide.lock file specifying exact versions of all dependencies, and the ability to choose whether or not vendored packages are copied into source control. Because we have so many repositories, many of which use each other, we have chosen not to copy vendored dependencies into source control, and glide fully supports this.

As an example for comparison, here's a side by side of an example glide.lock file on the left, and an example package.json file on the right. Both these come from the official documentation for getting started with the tools.

Pain points using Glide

While migrating all our codebases to Glide was a huge step forward, the tool is still under active development, and while it's improving steadily, there are some features which we have felt a need for recently and issues we have run into with everyday use.

One feature we feel a need for quite often is the ability to update a single dependency in isolation. It's not uncommon that we want to update a single external dependency to bring in some new feature we've added to an internal codebase, without updating all the external dependencies at the same time. This can be achieved by just updating the commit hash in the glide.lock file, but that's best avoided. There is an open issue for this feature, and an open pull request which will bring that functionality in, so we look forward to having that functionality available soon!
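As a rough illustration of what "updating a single dependency in isolation" means, the sketch below models a lock file as a list of pinned commits and moves exactly one pin while leaving the rest untouched. The Pin type, the updateOne function, and the package names and commit strings are all hypothetical and are not Glide's data model or API. The sketch also does not handle transitive dependencies, which a real tool would need to re-resolve for the updated package.

```go
package main

import "fmt"

// Pin is a hypothetical, simplified stand-in for one lock file entry:
// a package import path and the exact commit it is pinned to.
type Pin struct {
	Name   string
	Commit string
}

// updateOne returns a copy of pins in which only the named package has
// been moved to newCommit. The boolean reports whether the package was
// found. The input slice is never modified.
func updateOne(pins []Pin, name, newCommit string) ([]Pin, bool) {
	out := make([]Pin, len(pins))
	copy(out, pins)
	for i := range out {
		if out[i].Name == name {
			out[i].Commit = newCommit
			return out, true
		}
	}
	return out, false
}

func main() {
	// Package names and commit identifiers are placeholders.
	locked := []Pin{
		{Name: "example.com/internal/lib", Commit: "aaa111"},
		{Name: "example.com/third/party", Commit: "bbb222"},
	}

	updated, ok := updateOne(locked, "example.com/internal/lib", "ccc333")
	fmt.Println("found:", ok)
	for _, p := range updated {
		fmt.Println(p.Name, p.Commit)
	}
}
```

Every pin other than the named one keeps its commit, which is the property we want from a single-dependency update and the one a full glide update does not give us.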

We've also seen strange non-determinism in the behavior of glide update, which is supposed to update all dependencies to the latest version. The state of the local vendor/ folder seems to have an effect on how dependencies are updated, so the same update command executed for the same repository on two separate computers could have different results. In some situations, we've even seen dependencies get downgraded because of this behavior. There is a related issue slated for the 0.12.0 milestone (the current release is 0.11.1), so hopefully this will be resolved soon.

Concluding thoughts

While there's still much work to be done to get vendoring in Go to a reliable state, tons of progress has been made recently, and there are a variety of tools to pick from when looking for the one that best suits your needs or the needs of your team. Vendoring in Go is becoming a solved problem, with community action providing solutions.

An ongoing discussion on the golang-nuts mailing list is setting up a working group to discuss further improvements to package management in Go, and the result of this effort will then likely be the official approach recommended for the Go language. These changes will hopefully serve to make the vendoring process clearer for new Gophers and unify the existing ecosystem of package managers.