Docker Workshop

By Finn Kempers

Done by Tim Fletcher, DevOps Freelancer working at Cubits in Berlin. @timjdfletcher

Slack: https://slack.flossuk.org

Schedule of day
  • “Containers, Docker, how it works”
  • “Exploring simple docker containers”
  • Lunch
  • “Deploying simple Rancher Cluster”
  • “Commissioning a service on Rancher”
Slides and files

https://drive.google.com/drive/folders/0Bz5TtpiW4_X1bWR4M0NHdHhfRzQ

Docker Introduction

It is is a isolated environment to work in. A lot of history with Unix and FreeBSD with containers in jail format and the like. However Docker is unique due to its strong customisation of restrictions. Google itself is infamous for nigh relying on containers due to its ease of use and security. Docker itself is beyond a container, it made an ecosystem that revolved around it too.

Virtual Machines are only similar in concept to containers. They are essentially a computer inside a computer but does have the problem that things can break out the hypervisor. Generally however very secure and close to full hardware “bare metal” performance. A very well established technology and simple to comprehend.

Containers are much more popular in cloud environments. They usually run inside virtual machines. The difference is that you are isolating processes, not a complete hardware setup. Containers are however not possible to “move” live, you kill it and then start it elsewhere. They should be seen as a running process within a restricted environment, hence a container. The security however is questionable, it depends simply on how well it is managed and can require time to customise to fit your needs. Docker containers only use process cache and not the kernel or such to worry about, so it is a lot more efficient. However generally Kubernetes or the like is used for putting containers out efficiently. Either way the big difference to VM’s to consider is that they share the same OS and same appropriate bins/libraries depending on your customisation. Slide 5 on “An Introduction to Docker” has a diagram of this.

Docker as a platform breaks down into several parts, see Slide 6 of the above mentioned. This will be broken down below.

The Docker Engine has been through a number of changes and has been standardised, now running with RunC. In itself it hides a lot of the more complicated stuff but can be user customised, but essentially it is what runs the guts of the Docker management.

Docker images are essentially tarballs, very akin to Vagrant files if one is familiar with them. THe images are layered and thus will “commit” the change, so that if an image wants Apache added, you install Apache and you add a layer with that in the image to the container. They are all cached on file system and are immutable file systems. They only contain changes, which add on to the base image of the container, which is usually the OS such as Debian. The layers are essentially like git in concept.

As such, also much like git, these layers are often publicly available in repositories such as Docker Hu and Quay which are similar to Github. Companies such as DIgitalOcean provide private container registries. All of this allows for branching, deduplication and add security reviewing to container images, so you can check they are up to date, since they still need to be kept up to date.

In this presentation, an example docker image is pulled made by Tim and it is noted that the image is downloaded in layers, for which it will build upon. The layers themselves appear to be identified with sha256 keys. “docker image inspect [docker image file url]” will present various information on the docker image. “docker image history” on the other hand will present the history of the layers. Due to this, secrecy of past errors is very difficult to mask. This does however also mean that anyone can assess the deployment of a container.

The Dockerfile is basically a bash script which builds the container image layer by layer and thus will also show where a layer may error. Not required for docker container building, but it’s the usual method.

The example dockerfile presented has an apt-get update and then a string of installs required, then doing some basic settings such as chmod, then explicitly defines the user to nobody to avoid being root. Entrypoints allows for changing how you run commands and presents for greater flexibility to allow making different containers for different environments. “docker run -it timjdfletcher/example-docker-container /bin/bash” let go into the docker container files itself and then looked at the example entry point with:

#!/bin/bash -e

# This if matchs the following:

# if $1 (argv[0]) is unset (ie it's null)

# or anything begining with a - (etc -hello or -test)

if [[ -z $1 ]] || [[ ${1:0:1} == '-' ]] ; then

  exec  /app/hello "$@"

else

# If we have been called with a variable that doesn’t start with a – then assume it’s a progrom and run it

# This is handy for container debugging

  exec "$@

Note that the entrypoint is useful to make sure bash is installed if running certain commands.

While building, a docker build will have layers cached of those already built and then will use the cache to build layers previously done to make builds a lot faster. This shouldn’t be done for deployment though as cache assumptions aren’t always perfect. This allows for building containers of older tools such as ssh or java and running isolated old versions of software within the container on a modern OS.

Using the dockerfile provided at https://github.com/TimJDFletcher/docker-workshop/blob/master/example-docker-container/Dockerfile, from the example container a build was made, the build of which illustrates the process the container goes through when building.

Docker allows for environment variables which compose of shell variables, basic config files and can attach network devices and tty to it. In an exercise we run busybox, a tiny tool very common in Android root management and we can run inside it with “docker run -it busybox”. With all these tools, it is possible to customise the environment of the container, defining networking, user account management and the like. It does however mean people can see these variables so not ideally secure.

Docker networking uses network namespaces that are virtualised inside the containers. All ports by default are blocked, so containers via the dockerfile will have certain ports open, then using NAT or proxy to get networking from host to containers, like the VSwitches on VMWare. Orchestration tooling solves a lot of the problem that is setting up networking on docker containers. Docker containers by default can talk to each other on the same network namespace but not to the outside world. “docker run -p 80:80 nginx” will however publish the port and make it accessible. This runs docker-proxy, which manages the traffic, however there are “better” ways of doing it and should mostly be used for quick fix service running.

Docker compose is a Docker tool that takes a prewritten YAML file and sets up whatever is defined, so in the example stack used there is a database, web server and a load balancer. While this allows us to with the example start up a web server very quickly, the data is all in a container, including the database. If the container is deleted, so is the database. There is however a way around this. Without this workaround with an in memory database, nodes are set up and duplicate each other to ensure there is always several nodes up with the database in a three node cluster.

Docker volume is essentially are a defined directory in the host filesystem, which are mounted to the container. NFS and the like also exist for solutions to allow for permanent storage for containers and file management. In the exercise we create a container with volume named html-volume, which we then run. See slide 21 for more details. It essentially demonstrates that we can edit a file on the filesystem and the container will at the same time have access to that file and using it live. Using “docker exec” one can access and be “inside” the running container. These docker volumes can be defined in the dockerfile, with a definition for example of the volume path in “dbdata” and then defining dbdata in the mysql part of the dockerfile. This allowed us to spin up a wordpress site, configure it, use “docker-compose down” to kill it, then spin it back up and find the site intact as all the mysql data is intact on the filesystem.

Docker Container Orchestration

Rancher is a UI tool for managing Rancher, which runs on a single container on the “master host”. It is simple but has a lot of power and allows for scaling and the like. The master Rancher server manags the container deployment, along with a metadata service which includes all the various details and DNS.

This afternoon is an exercise working with Rancher. The chosen master host, of which I am the victim of for my row of seats, run the master Rancher host. Various people then join as a host to the server with a given URL, which then run their hosts with the stacks. We then installed the global service Janitor which cleans up “dead” containers, running as a service across all hosts, hence “global”. There is control of the hosts and the services running from Rancher, however the access to that is available to everyone connected. However, you can set up Rancher environments, which allow for management of users and user access control and removes a lot of duplication for variables, such as running a given set of services.

Joining a Rancher master is incredibly easy as noted before by simply using a supported Docker on Linux. It also via docker-machine supports directly starting systems, allowing for very quick easy scaling to start up systems, such as needing 5 machines.

Rancher Networking is IPSec therefore allowing for all the containers in the Rancher environment to be able to talk to each other directly, which allows for large scale batch processes and various other networking processes, while offering security too. There are various security measures that can be implemented which since it is IPSec, is easy for sysadmins to generally implement.

Stacks must have a unique name, but allow for descriptions and the import of a docker-compose or rancher-compose.yml. With these stacks created, you can then create services which are scalable to a number of containers to run an instance on every host, along with the image to run and a various number of customisation options. With this, you have options such as in our exercise, pinging between hosts, or the stack as a whole, as the processes are tied together to the stack.

The Rancher Catalog itself from which the Janitor service was installed from offes a range of official Rancher services along with many made by the community (to an extent verified by Rancher), with many options for things such as GoCD stack, Minecraft server, Gogs git service and many others.

Containers that run alongside other containers are known as psychic containers. Those are however a more advanced topic that are not covered in this tutorial.

Using API keys we then set up our own wordpress stacks within Rancher, using access and secret keys created through Rancher. From there we modified the rancher-compose to be similar to the previous docker-compose, but with unique wordpress directory and db names, to ensure it didn’t clash with other stacks.

Hot Based Proxies known as Load Balancers are used in the Rancher world for getting traffic from the outside world into a container, known as Ingress with Kubernetes. The packets are delivered to the edge of the container which is then used. Allows for vhosts, http routing and the like.

Using a Load Balancer, we have made the containers accessible to the outside world with the URL of the machine (though firewalled to within the university). This meant we could using the Load Balancer “serve” the wordpress service that was running on each machine within the master host. When traffic hits the DNS of say, Docker-7, it routes that traffic to the wordpress service of Docker-7.

Using the Rancher UI, you can scale up and duplicate processes, increasing the number of containers, while theoretically still working well. Duplicating processes such as Apache does lead to memory constraint issues based on the resources available, but theoretically otherwise it will simply scale without error.

Leave a Reply

Your e-mail address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.