In my previous blog post, I showed Bazel in action by building a Solr Cloud package. In this post, I’m going to explain a bit more about Bazel.
Bazel is the open source version of Google’s internal build tool, Blaze. Bazel is currently in beta, but a number of companies already use it in production, and it has some quite interesting features. Bazel has a good caching mechanism: it caches all input files, all external dependencies, and so on. Before running the actual build, Bazel first checks whether its existing cache is valid. If it is, Bazel then checks for changes to the input files and dependencies, and rebuilds the package only if it detects any. We can also use Bazel to build our test targets and have it run our unit/integration tests for the built targets. Bazel can also detect cyclic dependencies within the code. Another important feature is sandboxing. On Linux, Bazel can run builds and tests inside a sandboxed environment and can detect file leaks or broken dependencies. This works because, in sandbox mode, Bazel mounts only the specified input files and data dependencies into the sandbox environment.
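To get a feel for why sandboxing catches broken dependencies, here is a small Python sketch of the idea. It is only an illustration and not how Bazel implements its sandbox: run_in_sandbox is a hypothetical helper of mine, and it copies files into a scratch directory instead of mounting them.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_in_sandbox(script, declared_inputs, source_dir):
    """Run `script` in a scratch directory holding only the declared inputs.

    Hypothetical helper for illustration, not a Bazel API.
    Returns the exit code of the script.
    """
    with tempfile.TemporaryDirectory() as sandbox:
        for name in declared_inputs:
            # Only declared files are made available; anything else is absent.
            shutil.copy2(os.path.join(source_dir, name), os.path.join(sandbox, name))
        # Point PYTHONPATH at the sandbox so imports resolve there and
        # not against the original source directory.
        env = dict(os.environ, PYTHONPATH=sandbox)
        result = subprocess.run(
            [sys.executable, script],
            cwd=sandbox,
            env=env,
            capture_output=True,
            text=True,
        )
        return result.returncode
```

If a script imports a module that was never declared as an input, the import fails inside the scratch directory and the exit code is non-zero, even though the same script would run fine from the source tree.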
Bazel Build Flow
Let’s see how the Bazel build process works. The first thing we need is a WORKSPACE file. A Bazel workspace is a directory that contains the source files for one or more software projects, as well as a WORKSPACE file and BUILD files that contain the instructions that Bazel uses to build the software. It also contains symbolic links to output directories in the Bazel home directory.
Let’s create a simple workspace for testing:
$ mkdir bazel-test && cd bazel-test
$ touch WORKSPACE
Now I’m going to build a simple Python package. hello.py is a simple Python script which imports a hello function from dep.py. So our primary script is hello.py, which has a dependency on dep.py.
vagrant@trusty-docker:~/bazel-test$ cat hello.py
from dep import hello
print hello("Building a simple python package with Bazel")
vagrant@trusty-docker:~/bazel-test$ cat dep.py
def hello(msg):
    return msg
Bazel’s build command looks for a BUILD file in the target location. This file should contain the necessary Bazel build rules. Bazel’s Python rule documentation lists the rules that are supported. Applying this to our test scripts, we are going to build a py_binary for hello.py, and this binary has a py_library dependency on dep.py. So our final BUILD file will be:
py_library(
    name = 'dep',
    srcs = ['dep.py'],
)
py_binary(
    name = 'hello',
    srcs = ['hello.py'],
    deps = [':dep'],  # our dependency on `dep.py`
)
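The deps attributes in a BUILD file form a graph of targets, and that graph is what makes it possible to spot cyclic dependencies. As a rough illustration (plain Python, not Bazel’s own algorithm; find_cycle and the dictionaries are made up for this example), a depth-first search can report a cycle like this:

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of target names, or None.

    `deps` maps each target name to the list of targets it depends on.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    stack = []  # targets on the current DFS path

    def visit(node):
        state[node] = IN_PROGRESS
        stack.append(node)
        for dep in deps.get(node, []):
            if state.get(dep, UNSEEN) == IN_PROGRESS:
                # dep is already on the current path, so we looped back to it.
                return stack[stack.index(dep):] + [dep]
            if state.get(dep, UNSEEN) == UNSEEN:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[node] = DONE
        return None

    for target in deps:
        if state.get(target, UNSEEN) == UNSEEN:
            cycle = visit(target)
            if cycle:
                return cycle
    return None


# The graph from our BUILD file: hello depends on dep, dep on nothing.
print(find_cycle({"hello": ["dep"], "dep": []}))
# A deliberately broken graph where dep also depends on hello.
print(find_cycle({"hello": ["dep"], "dep": ["hello"]}))
```

For our BUILD file the function finds nothing, while the second graph yields the path that loops back on itself.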
Now that we have the BUILD file, let’s kick off a build:
vagrant@trusty-docker:~/bazel-test$ bazel build hello
............
INFO: Found 1 target...
Target //:hello up-to-date:
bazel-bin/hello
INFO: Elapsed time: 4.564s, Critical Path: 0.03s
Woohoo, Bazel has built the package for us. Now if we check our workspace, we will see a bunch of bazel-* symlinks. These symlinks point into the Bazel home directory, where our final build output lives.
vagrant@trusty-docker:~/bazel-test$ tree -d
.
├── bazel-bazel-test -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/__main__
├── bazel-bin -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/bin
├── bazel-genfiles -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/genfiles
├── bazel-out -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/__main__/bazel-out
└── bazel-testlogs -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/testlogs
So our new Python binary is available at bazel-bin/hello. Bazel also creates something called runfiles, which lives next to the binary. Bazel links our dependencies (input files and data dependencies) into this runfiles folder; as the listing below shows, the entries are symlinks rather than copies.
-r-xr-xr-x 1 vagrant vagrant 4364 Feb 19 20:13 bazel-bin/hello
vagrant@trusty-docker:~/bazel-test$ ls -l bazel-bin/hello
hello hello.runfiles/ hello.runfiles_manifest
vagrant@trusty-docker:~/bazel-test$ ls -l bazel-bin/hello.runfiles/__main__/
total 4
lrwxrwxrwx 1 vagrant vagrant 31 Feb 19 20:13 dep.py -> /home/vagrant/bazel-test/dep.py
lrwxrwxrwx 1 vagrant vagrant 130 Feb 19 20:13 hello -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/bin/hello
lrwxrwxrwx 1 vagrant vagrant 33 Feb 19 20:13 hello.py -> /home/vagrant/bazel-test/hello.py
If we look inside our Python binary, bazel-bin/hello, it’s nothing but a wrapper script that locates our runfiles directory, adds the runfiles path to the PYTHONPATH environment variable, and then invokes our hello.py file. In the beginning, I mentioned that Bazel has a good caching mechanism. Let’s re-run the build command and look at the output, especially the time taken to complete the build.
vagrant@trusty-docker:~/bazel-test$ bazel build hello
INFO: Found 1 target...
Target //:hello up-to-date:
bazel-bin/hello
INFO: Elapsed time: 0.247s, Critical Path: 0.00s
Let’s compare the build times of the two runs. The first build took ~4.5 sec, but the second took only ~0.2 sec. This is because Bazel didn’t run a real build the second time. It verified the input files against its cache and found no changes.
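The "check the inputs against the cache" step can be pictured with a small Python sketch. This is a simplification under my own assumptions (a JSON file as the cache, one digest per target) and not Bazel’s actual cache format; digest_inputs and needs_rebuild are hypothetical names.

```python
import hashlib
import json
import os


def digest_inputs(paths):
    """Hash the names and contents of all input files into one hex digest."""
    h = hashlib.sha256()
    for path in sorted(paths):
        h.update(path.encode("utf-8"))
        with open(path, "rb") as f:
            h.update(f.read())
    return h.hexdigest()


def needs_rebuild(target, inputs, cache_file):
    """Return True if the inputs changed since the last recorded build.

    When a rebuild is needed, the new digest is stored in the cache file.
    """
    cache = {}
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            cache = json.load(f)
    current = digest_inputs(inputs)
    if cache.get(target) == current:
        return False  # nothing changed, the previous output can be reused
    cache[target] = current
    with open(cache_file, "w") as f:
        json.dump(cache, f)
    return True
```

Calling needs_rebuild twice in a row on unchanged files returns True the first time and False the second, which mirrors the slow first build and the near-instant second one above.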
Now let’s add a simple unit test and see how Bazel can run it.
vagrant@trusty-docker:~/bazel-test$ cat hello_test.py
import unittest
from dep import hello

class TestHello(unittest.TestCase):
    def test_hello(self):
        # assertEqual replaces the deprecated assertEquals alias
        self.assertEqual(hello("test message"), "test message")

if __name__ == '__main__':
    unittest.main()
Now let’s add a py_test rule to our BUILD file so that Bazel can use it with bazel test.
py_test(
    name = "hello_test",
    srcs = ["hello_test.py"],
    deps = [
        ':dep',
    ],
)
We have the py_test rule, so now let’s run the bazel test command and verify.
vagrant@trusty-docker:~/bazel-test$ bazel test hello_test
INFO: Found 1 test target...
Target //:hello_test up-to-date:
bazel-bin/hello_test
INFO: Elapsed time: 2.255s, Critical Path: 0.06s
//:hello_test PASSED in 0.0s
Executed 1 out of 1 test: 1 test passes.
Woohoo, the test seems to run fine. Now let’s manually break the test and see whether Bazel picks up the failure as well.
vagrant@trusty-docker:~/bazel-test$ bazel test hello_test
INFO: Found 1 test target...
FAIL: //:hello_test (see /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/testlogs/hello_test/test.log).
Target //:hello_test up-to-date:
bazel-bin/hello_test
INFO: Elapsed time: 0.199s, Critical Path: 0.05s
//:hello_test FAILED in 1 out of 2 in 0.0s
/home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/testlogs/hello_test/test.log
Executed 1 out of 1 test: 1 fails locally.
Bingo, Bazel detects the test failure too. During our build we saw that Bazel caches the build and doesn’t re-run it unless it detects changes to the dependencies. Now let’s see whether Bazel does the same with tests by running the test command twice.
vagrant@trusty-docker:~/bazel-test$ bazel test hello_test
INFO: Found 1 test target...
Target //:hello_test up-to-date:
bazel-bin/hello_test
INFO: Elapsed time: 0.169s, Critical Path: 0.04s
//:hello_test PASSED in 0.0s
Executed 1 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
vagrant@trusty-docker:~/bazel-test$ bazel test hello_test
INFO: Found 1 test target...
Target //:hello_test up-to-date:
bazel-bin/hello_test
INFO: Elapsed time: 0.087s, Critical Path: 0.00s
//:hello_test (cached) PASSED in 0.0s
Executed 0 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
Bingo, we can see the (cached) marker in the output of the second test run. So, as with the build, Bazel caches test results too.
Customizing Bazel Rules
py_binary, py_library, etc. are the default Python rules that ship with Bazel. As with any other product, we might end up in cases where we need custom rules to solve our specific needs. The good news is that Bazel comes with an extension language called Skylark. With Skylark, we can create custom build rules matching our requirements. Skylark’s syntax is pretty similar to Python’s. I’ll be writing a more detailed blog post on Skylark soon 🙂
Conclusion
Though Bazel is still in beta, it seems to be a really interesting tool for building hermetic packages. Bazel can detect cyclic dependencies and dependency leaks, which is really important. Its caching also helps us build packages faster than we could with traditional build tools, since unchanged targets are not rebuilt.