Go dependencies and binary size
🤔 The issue
As an app developer, how do you check what's included in your binary, so that it isn't bloated by unused dependencies?
As a library developer, how do you design your library so that users don't see their binary size explode?
To answer these questions, we need to understand how the Go toolchain compiles your app.
💡 This post doesn't address other factors that may affect binary size, such as compilation flags.
Hypothesis
👨‍💻 I'm not an expert in compilers or the Go toolchain, so the vocabulary used in this post may be incorrect. Please contact me if you find mistakes!
Let's take a simple app with only one external dependency as an example:
In this scenario, github.com/gin-gonic/gin is listed as a direct dependency in mod1's go.mod.
We can postulate that since we depend on mod1, we'll embed gin in our binary as well. At least that's what I imagined, coming from Python. That wouldn't be good news, since we only need func2 from package b, which doesn't need gin.
👨🏻‍🔬 Experimenting
Test repo
In order to test this hypothesis, I threw together a small experiment in the form of a repository containing the app described in the graph above, plus a few variants.
It contains two directories:
- app: a Go module intended as an application. It contains 4 packages compilable as standalone binaries:
  - a, a2 and b depend exclusively on the matching packages from mod1
  - control depends only on the stdlib's fmt and is used as a control for what a small Go binary should weigh
- mod1: a Go module intended to represent a library that app includes. It contains 3 packages:
  - a: depends on a large external dependency (github.com/gin-gonic/gin) in its non-test file, and also on some other external dependencies used exclusively in the test files, whether in the same package or in the a_test package
  - a2: same as a but doesn't contain the test files
  - b: depends only on fmt
Observations
When building the binaries from inside app, we observe the following:
$ make build
mkdir -p ./dist
GOARCH=amd64 GOOS=darwin go build -o ./dist/a ./a
GOARCH=amd64 GOOS=darwin go build -o ./dist/a2 ./a2
GOARCH=amd64 GOOS=darwin go build -o ./dist/b ./b
GOARCH=amd64 GOOS=darwin go build -o ./dist/control ./control
# on linux: stat --printf="%s %n\n" ./dist/*
$ stat -f "%z %N" ./dist/*
7261232 ./dist/a
7261232 ./dist/a2
2030336 ./dist/b
2030336 ./dist/control
$ shasum -a 256 ./dist/*
53b15316f40d69af54fb18ac9ec427b40c52fe9e7c852e2ebc3931f90ae851cb ./dist/a
e5251a45a82b28cf8bc2276bf82a6ca2cf29cd05e274b0e07d87c401ba74322d ./dist/a2
3feac94eec4c560a656aab6c9b3d07378f843189a9a9273c7a5351695d53bf9e ./dist/b
7bf645706dc55aede94ec9a373dbca7f4a06383fdea723b7237f5b343ee28b19 ./dist/control
- The sizes of a and a2 are equal, so the dependencies used only in a's test files aren't built into the final binary.
- The sizes of a and b are different: b is smaller, as expected (about 2.0 MB versus 7.3 MB). The dependencies of a and a2 therefore aren't built into b, despite being listed in mod1's and app's go.mod.
- The sizes of b and control are equal, which is interesting because the two programs don't do the same thing. The hashes show that the binaries do differ, so the extra code pulled in through mod1's b is too small to change the file size, probably because it is absorbed by padding in the executable.
The depth package can also help us visualize the dependencies as a tree.
$ go install github.com/KyleBanks/depth/cmd/depth@latest
go: downloading github.com/KyleBanks/depth v1.2.1
$ make deptree
mkdir -p ./deptree
depth ./a > ./deptree/a
depth ./a2 > ./deptree/a2
depth ./b > ./deptree/b
depth ./control > ./deptree/control
$ tail -1 ./deptree/a
113 dependencies (58 internal, 55 external, 0 testing).
$ diff ./deptree/{a,a2}
1,2c1,2
< ./a
<   └ example.com/mod1/a
---
> ./a2
>   └ example.com/mod1/a2
$ grep -rn "tonic" ./deptree/a
$ command cat ./deptree/b
./b
  └ example.com/mod1/b
    └ fmt
2 dependencies (1 internal, 1 external, 0 testing).
$ command cat ./deptree/control
./control
  └ fmt
1 dependencies (1 internal, 0 external, 0 testing).
I'm not including the full dependency trees for a and a2 because they're quite large. What's interesting to note is that they're identical, the only difference being the name of the package imported at the root. There's also no trace of the test-only dependency github.com/loopfz/gadgeto in the dependency tree of a (the grep for "tonic" returns nothing), which is what we expected.
Conclusions
Based on these observations, my conclusion is that the Go toolchain treats the package, not the module, as the unit of compilation. This means that since our example app only depends on package b from mod1, package a and its dependencies won't be included in the binary. The go.mod file of our app will list example.com/mod1 as a direct dependency, but won't include github.com/gin-gonic/gin.
The real dependency graph looks more like this:
🎯 Takeaways
As a library developer, move functions, structs and interfaces with large dependencies to a separate package when it makes sense. For example, if you're building a logging library and want to offer a middleware for gin, extract it to its own package. That way, users who don't need the middleware won't pull gin into their binary.
As an app developer, before adding a new dependency for a very small subset of its features, consider copying the code instead. In our previous example, if func2 is small enough, and since it only depends on fmt, it should be easy to copy and maintain alongside your code rather than depending on an external package.