Michael Crosby


Dockerfile deep dive

Docker is an amazing project. There is plenty of great documentation and tutorials online to get you up and running with Docker, building images, and deploying your apps, and there is not much I can say that others have not said better. So instead I am going to start a series of posts on Docker internals: how to hack on Dockerfiles, how to add new instructions, and how other features are implemented in the code base.

Getting started

The first thing you need to do to begin hacking on Docker is fork the repo on GitHub. If you already have a fork, make sure it stays up to date with the master branch. Docker development is moving very fast, so set an upstream remote in your fork so that you can pull and rebase changes from the main repository. With that in place, git remote -v should list both remotes:

origin  git@github.com:crosbymichael/docker.git (fetch)
origin  git@github.com:crosbymichael/docker.git (push)
upstream        git://github.com/dotcloud/docker.git (fetch)
upstream        git://github.com/dotcloud/docker.git (push)

Next, make sure you have pulled down all the upstream changes before creating a dev branch.

╭─michael@codeassistant  ~GOPATH/src/github.com/dotcloud/docker  ‹master›
╰─<<: git pull upstream master
remote: Counting objects: 133, done.
remote: Compressing objects: 100% (65/65), done.
remote: Total 74 (delta 49), reused 34 (delta 9)
Unpacking objects: 100% (74/74), done.
From git://github.com/dotcloud/docker
 * branch            master     -> FETCH_HEAD
 19 files changed, 203 insertions(+), 50 deletions(-)

Volumes

Now that we are ready to start coding, we need to know what we will be implementing. Currently you can only specify volumes for your container when calling docker run -v. We want to be able to specify volumes inside a Dockerfile instead of on the command line. I'll start by creating a branch called buildfile-volumes.

There are two files that we will be focusing on: buildfile.go and builder.go. buildfile.go is where we parse the Dockerfile and build an image, and builder.go is where we combine the image config with the config from the command line to build the container.

Buildfile.go

Let's jump in and see how Dockerfiles work. Near the bottom of the file you will see the Build method. Parsing the Dockerfile is very simple, as it should be. It starts by iterating over all the lines in the file, skipping comments and empty lines.

line, err := file.ReadString('\n')
if err != nil {
    if err == io.EOF && line == "" {
        break
    } else if err != io.EOF {
        return "", err
    }
}
line = strings.Trim(strings.Replace(line, "\t", " ", -1), " \t\r\n")
// Skip comments and empty line
if len(line) == 0 || line[0] == '#' {
    continue
}
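
To experiment with this loop outside of docker, here is a minimal, self-contained sketch. It reuses the trimming and skipping logic from the excerpt above, but the instructionLines function and the sample input are my own illustration and not part of the docker code base.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// instructionLines is a hypothetical helper: it returns the trimmed,
// non-empty, non-comment lines of a Dockerfile-style input.
func instructionLines(src string) ([]string, error) {
	file := bufio.NewReader(strings.NewReader(src))
	var lines []string
	for {
		line, err := file.ReadString('\n')
		if err != nil {
			if err == io.EOF && line == "" {
				break // nothing left to read
			} else if err != io.EOF {
				return nil, err
			}
			// io.EOF with data: the last line had no trailing newline,
			// so fall through and process it.
		}
		line = strings.Trim(strings.Replace(line, "\t", " ", -1), " \t\r\n")
		// Skip comments and empty lines.
		if len(line) == 0 || line[0] == '#' {
			continue
		}
		lines = append(lines, line)
	}
	return lines, nil
}

func main() {
	input := "# base image\nFROM ubuntu\n\n\tRUN mkdir /test\nVOLUME [\"/test\"]"
	lines, err := instructionLines(input)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Note that the EOF check only breaks out of the loop when the line is empty, which is what lets a final line without a trailing newline still be processed.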

Next we get the instruction, e.g. RUN, and the arguments to the instruction. Once we have both, docker uses reflection to call the correct method on the buildfile.

instruction := strings.ToLower(strings.Trim(tmp[0], " "))
arguments := strings.Trim(tmp[1], " ")
stepN += 1

method, exists := reflect.TypeOf(b).MethodByName("Cmd" + strings.ToUpper(instruction[:1]) + strings.ToLower(instruction[1:]))
if !exists {
    fmt.Fprintf(b.out, "# Skipping unknown instruction %s\n", strings.ToUpper(instruction))
    continue
}
ret := method.Func.Call([]reflect.Value{reflect.ValueOf(b), reflect.ValueOf(arguments)})[0].Interface()
if ret != nil {
    return "", ret.(error)
}

To get the method of the buildfile by name, we prepend 'Cmd' and make sure that the first char of the instruction is upper case and the rest of the instruction is lowercase. Therefore, if we want to add a new instruction VOLUME we need a new method on the buildfile called CmdVolume.
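
The name-based dispatch can be tried in isolation with a small sketch like the one below. The demoBuilder type and the splitInstruction, methodName, and dispatch helpers are hypothetical names for illustration. I am also assuming that tmp in the excerpt comes from splitting the line on the first space, which the excerpt does not show.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// demoBuilder is a hypothetical stand-in for docker's buildFile type.
type demoBuilder struct {
	log []string // records which handlers ran, for demonstration only
}

// CmdRun handles a RUN instruction.
func (b *demoBuilder) CmdRun(args string) error {
	b.log = append(b.log, "run:"+args)
	return nil
}

// CmdVolume handles a VOLUME instruction and rejects empty arguments.
func (b *demoBuilder) CmdVolume(args string) error {
	if args == "" {
		return fmt.Errorf("Volume cannot be empty")
	}
	b.log = append(b.log, "volume:"+args)
	return nil
}

// splitInstruction assumes a line has the form "INSTRUCTION arguments".
func splitInstruction(line string) (string, string, error) {
	tmp := strings.SplitN(line, " ", 2)
	if len(tmp) != 2 {
		return "", "", fmt.Errorf("invalid line: %q", line)
	}
	return strings.ToLower(strings.Trim(tmp[0], " ")), strings.Trim(tmp[1], " "), nil
}

// methodName maps an instruction such as "volume" to "CmdVolume".
func methodName(instruction string) string {
	if instruction == "" {
		return ""
	}
	return "Cmd" + strings.ToUpper(instruction[:1]) + strings.ToLower(instruction[1:])
}

// dispatch looks up the handler by name and calls it through reflection.
// The first result reports whether a handler exists for the instruction.
func dispatch(b *demoBuilder, instruction, args string) (bool, error) {
	method, exists := reflect.TypeOf(b).MethodByName(methodName(instruction))
	if !exists {
		return false, nil
	}
	in := []reflect.Value{reflect.ValueOf(b), reflect.ValueOf(args)}
	if ret := method.Func.Call(in)[0].Interface(); ret != nil {
		return true, ret.(error)
	}
	return true, nil
}

func main() {
	b := &demoBuilder{}
	for _, line := range []string{"RUN mkdir /test", "VOLUME /test", "BOGUS x"} {
		instruction, args, err := splitInstruction(line)
		if err != nil {
			panic(err)
		}
		found, err := dispatch(b, instruction, args)
		fmt.Println(instruction, found, err)
	}
	fmt.Println(b.log)
}
```

The nice property of this design is that adding an instruction means adding one method with the right name. The trade-off is that a misspelled method name is only noticed at run time, when the instruction is reported as unknown.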

func (b *buildFile) CmdVolume(args string) error {
        // TODO: parse args and record the volumes on b.config
        return nil
}

Now we need to think about what type of args will be passed to the VOLUME instruction. We want to support one or more volumes passed as the args. The standard is to pass a JSON array, just like you do with CMD.

VOLUME /redis/data
# or
VOLUME ["/redis/data", "/redis/tmp"]

So the method for CmdVolume needs to look like this.

func (b *buildFile) CmdVolume(args string) error {
        if args == "" {
                return fmt.Errorf("Volume cannot be empty")
        }

        // Try to parse the args as a JSON array of paths; if that fails,
        // treat the whole argument as a single volume path.
        var volume []string
        if err := json.Unmarshal([]byte(args), &volume); err != nil {
                volume = []string{args}
        }
        if b.config.Volumes == nil {
                b.config.Volumes = NewPathOpts()
        }
        // Record each path in the image config.
        for _, v := range volume {
                b.config.Volumes[v] = struct{}{}
        }
        // Commit the updated config as a new image layer.
        if err := b.commit("", b.config.Cmd, fmt.Sprintf("VOLUME %s", args)); err != nil {
                return err
        }
        return nil
}

Basically, if we are able to unmarshal the value into a JSON array, we add each of those values to the config; otherwise we treat the whole argument as a single volume path.
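
The JSON-or-plain-string fallback is easy to check on its own. The following is a minimal sketch using only the standard library. parseVolumes is a hypothetical helper that returns a plain map instead of touching b.config, so it runs without the rest of docker.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// parseVolumes is a hypothetical helper mirroring the argument handling
// in CmdVolume: a JSON array yields one entry per element, anything else
// is treated as a single path.
func parseVolumes(args string) (map[string]struct{}, error) {
	if args == "" {
		return nil, fmt.Errorf("Volume cannot be empty")
	}
	var volume []string
	if err := json.Unmarshal([]byte(args), &volume); err != nil {
		// Not a JSON array of strings: use the raw argument as the path.
		volume = []string{args}
	}
	volumes := make(map[string]struct{})
	for _, v := range volume {
		volumes[v] = struct{}{}
	}
	return volumes, nil
}

func main() {
	for _, args := range []string{`/redis/data`, `["/redis/data", "/redis/tmp"]`} {
		volumes, err := parseVolumes(args)
		if err != nil {
			panic(err)
		}
		// Sort the keys so the printed order is stable.
		paths := make([]string, 0, len(volumes))
		for p := range volumes {
			paths = append(paths, p)
		}
		sort.Strings(paths)
		fmt.Println(args, "=>", paths)
	}
}
```

Using a map with empty struct values gives set semantics, so listing the same path twice still results in a single volume.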

Now we need to make sure that when a container is built, any volumes added to the image config are copied over to the container config. Containers are built in the builder.go file, and we need to make sure the image config is merged with the user config. One thing to note is that the user config (values that come in from the command line on docker run) takes precedence over image config values.

The MergeConfig function is located in utils.go. This is where we will add the new lines for merging the Volumes if no userConf Volumes exist.

func MergeConfig(userConf, imageConf *Config) {
        if userConf.Hostname == "" {
                userConf.Hostname = imageConf.Hostname
        }
        // ...
        if userConf.Cmd == nil || len(userConf.Cmd) == 0 {
                userConf.Cmd = imageConf.Cmd
        }
        if userConf.Dns == nil || len(userConf.Dns) == 0 {
                userConf.Dns = imageConf.Dns
        }
        if userConf.Entrypoint == nil || len(userConf.Entrypoint) == 0 {
                userConf.Entrypoint = imageConf.Entrypoint
        }
        // New code: fall back to the image's volumes when the user set none
        if userConf.Volumes == nil || len(userConf.Volumes) == 0 {
                userConf.Volumes = imageConf.Volumes
        }
}
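
To see the precedence rule in action, here is a small sketch with a cut-down config. demoConfig and mergeDemoConfig are hypothetical names that only model three of the fields. The real Config struct has many more.

```go
package main

import "fmt"

// demoConfig is a hypothetical, cut-down version of docker's Config.
type demoConfig struct {
	Hostname string
	Cmd      []string
	Volumes  map[string]struct{}
}

// mergeDemoConfig copies a value from the image config only when the
// user config left that field empty, so user values always win.
func mergeDemoConfig(userConf, imageConf *demoConfig) {
	if userConf.Hostname == "" {
		userConf.Hostname = imageConf.Hostname
	}
	// len of a nil slice or map is 0, so one check covers both cases.
	if len(userConf.Cmd) == 0 {
		userConf.Cmd = imageConf.Cmd
	}
	if len(userConf.Volumes) == 0 {
		userConf.Volumes = imageConf.Volumes
	}
}

func main() {
	image := &demoConfig{
		Hostname: "img",
		Cmd:      []string{"redis-server"},
		Volumes:  map[string]struct{}{"/test": {}},
	}
	// The user overrides Cmd but says nothing about hostname or volumes.
	user := &demoConfig{Cmd: []string{"/bin/bash"}}
	mergeDemoConfig(user, image)
	fmt.Println(user.Hostname, user.Cmd, len(user.Volumes))
}
```

One consequence of this all-or-nothing merge is that if the user passes any volume on the command line, the image's volumes are not added to that set. They are replaced entirely.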

Unit tests and Docs

The last changes that we have to make are to update the unit tests and documentation with the new instruction. Obviously you will need to execute the unit tests to make sure that your changes did not break anything.

Now it is time to build your docker binary and start user testing.

I wrote a small Dockerfile to test the new instruction.

FROM ubuntu

RUN mkdir /test

VOLUME ["/test"]

There are a few tests that we can run to verify this change. The first thing that we need to do is run docker build . to make sure the instruction is processed.

docker build .
Uploading context 10240 bytes
Step 1 : FROM ubuntu
---> 8dbd9e392a96
Step 2 : RUN mkdir /test
---> Using cache
---> 5f7c55f3749c
Step 3 : VOLUME ["/test"]
---> Running in 5c453826366d
---> f738fc81dc78
Successfully built f738fc81dc78

Next we can run docker inspect on the image id to make sure that the volume is added to the config.

//docker inspect <imageid>
{   "Volumes": {
       "/test": {}
    },
}

Now run a container from the image, make sure the volume mounted, and inspect the container for the volume mount point.

//docker inspect <containerid>
{"Volumes": {
    "/test": "/var/lib/docker/volumes/da262abbea2ee31ad27287155246cc00e8ec66fc87a503bf230514cb1e8cb3ca/layer"
},
"VolumesRW": {
    "/test": true
}}

If everything looks good and your tests complete successfully, review your diff carefully, commit, push, and open a pull request. You can view the pull request for this change on GitHub. Hopefully the maintainers of the project will not require too many changes to your pull request, but always be open to changes and suggestions.

Happy hacking.
