How to containerise a third-party tool for SIP
This guide gives you step-by-step instructions on how to containerize a project on the SIP. As an example we will use the open source version of SonarQube. SonarQube is a great tool for code analysis which can help improve code quality and presents information about quality in a web interface. You can find the project on Gitlab.
Introduction
Goal
This guide will show you how to containerize an application for the SIP. Please note that many of the steps described in this guide are highly dependent on the app you're trying to containerize. It is therefore not possible to write a guide that gives you detailed step-by-step instructions for every application. However, this guide will (hopefully) help you get an idea of how to approach the containerization of your own tools.
If you have any feedback or improvements on this guide, don't hesitate to contact vseth-it-support@ethz.ch.
Prerequisites
For this guide you will need:
- Basic understanding of `docker` and `docker-compose`, as well as an installation on your local machine.
- Understanding of `git` and `gitlab` (you should be able to add files to a project, make commits and push them).
If these requirements are a hurdle for you, no problem! In the Technology Stack section of the documentation you can find some useful information about the technologies that we are using.
Outline
The guide consists of the following steps:
- Creating and setting up a repository for your project.
- Adding a `README.md` to your repository.
- Writing a `Dockerfile` for your repository.
- Adding `cinit` to start the container.
- Setting up `docker-compose` for your local development.
- Connecting to other systems (such as databases).
- Documenting your setup.
- Deployment.
1. Creating and setting up a repository for your project.
The first step is to create a repository in the VSETH Gitlab group and configure the Teamcity build server. Please note that all projects that need to be deployed have to be created in a subgroup of the VSETH Gitlab group for our build system to work correctly.
To setup a new repository please write an email to vseth-it-support@ethz.ch with the following information:
- The name of the repository that should be created.
- The Gitlab group the repository should be created in.
- A (very) short description of the project.
- Which people need access to the repository (list of nethz identifiers).
We know it sucks that you have to write an email just to get a repository set up, but we're actively working on a better way to do this. In the future every developer will be able to create new projects through a simple web interface.
Example
In our example I sent the following message:
Dear VSETH ISG
I would like to containerize the SonarQube project for VSETH, which we will use to monitor our code quality.
To do this, please create a repository on Gitlab and configure the buildserver. The details are:
* name: `SonarQube`
* gitlab-group: `0403-isg/sip-apps`
* description: `Open source tool for static code analysis`
* people who need access: `lukasre`
Best regards,
Lukas
2. Adding a `README.md` to your repository
A very important step that should not be forgotten! Every repository should contain a `README.md` giving other devs information about what the project does and how it is set up. At this step you probably want to add a basic description of the project to the `README.md` file. It's always good practice to add information about your setup to the `README.md` file while you're setting up the project (and not just at the end). Any webpage you read while learning how to set up the project is a great candidate to put into the `README.md`. Keep in mind that there are very few projects with too much documentation, but a lot of projects with too little.
Example
For the SonarQube example I added the following `README.md` containing a basic description:
```markdown
# SonarQube

This repository contains a container to run [SonarQube](https://www.sonarqube.org/) on SIP. SonarQube is a static code analysis tool which helps enforce code standards among developers.
```
3. Writing a `Dockerfile` for your project
Now we're getting to the fun stuff: let's write the Dockerfile. At a high level, these are the steps you need to take:
- Decide which base image you're going to use: note that all applications running on SIP need to use one of the provided base images.
- Gather information about how to install your application on a Debian-based system. Also check if there is already a Dockerfile on Dockerhub that you can (re)use.
- Check that all the technologies you want to use are part of the Technology Stack.
- Write the Dockerfile.
- Check that you followed the best practices for docker containers.
Example
For our example we follow the steps outlined above:
Base Image
Since SonarQube is a Java-based application and we probably don't need to run a reverse proxy in front of it, we just use the `base` image and install the Java Runtime.
Installation information
With a quick internet search we find the Install the Server page in the SonarQube documentation. Also, since we want to use OpenID Connect (OIDC) to let users easily log in to the system, we search for a way to configure SonarQube to use OIDC and find the sonar-auth-oidc plugin, which does exactly what we want.
Checking the Technology Stack
Before moving on to writing the Dockerfile, we quickly check whether the SIP provides all the additional systems we need for our installation, such as databases, and whether we follow the best practices.
- Database: SonarQube supports the PostgreSQL database system, which is the preferred database solution.
- OpenID Connect: We want to use OpenID Connect to authenticate users in our system, since this is also the preferred way to do this on the SIP.
Write your Dockerfile.
Now we can start writing the `Dockerfile`. I've added some comments to the file to give you more information about each step.
```dockerfile
# @Copyright VSETH - Verband der Studierenden an der ETH Zürich.
# @Author Lukas Reichart <lukas.reichart@vseth.ethz.ch>
FROM eu.gcr.io/vseth-public/base:delta

# Install needed dependencies:
# * default-jre-headless: SonarQube is a java application and therefore needs
#   the Java Runtime (good idea to use headless)
# * wget, unzip: to download & unzip the SonarQube binary
# * jq: in the post-start-config.sh script we need to parse a JSON response
#   from the SonarQube API. For that we use jq.
RUN apt install -y default-jre-headless wget unzip jq gpg

# Specify the version of SonarQube.
# In general only fixed versions should be used inside of containers (never use
# something like "latest"), otherwise a future container build may fail.
ENV BUILD_SONARQUBE_VERSION="8.6.1.40680"

# Download the binary package from the sonarqube website and verify the download.
RUN set -x \
    # pub   2048R/D26468DE 2015-05-25
    #       Key fingerprint = F118 2E81 C792 9289 21DB CAB4 CFCA 4A29 D264 68DE
    # uid   sonarsource_deployer (Sonarsource Deployer) <infra@sonarsource.com>
    # sub   2048R/06855C1D 2015-05-25
    && (gpg --batch --keyserver ha.pool.sks-keyservers.net --recv-keys F1182E81C792928921DBCAB4CFCA4A29D26468DE \
    || gpg --batch --keyserver ipv4.pool.sks-keyservers.net --recv-keys F1182E81C792928921DBCAB4CFCA4A29D26468DE) \
    && curl -o sonarqube.zip -fSL https://binaries.sonarsource.com/Distribution/sonarqube/sonarqube-${BUILD_SONARQUBE_VERSION}.zip \
    && curl -o sonarqube.zip.asc -fSL https://binaries.sonarsource.com/Distribution/sonarqube/sonarqube-${BUILD_SONARQUBE_VERSION}.zip.asc \
    && gpg --batch --verify sonarqube.zip.asc sonarqube.zip \
    && unzip sonarqube.zip \
    && mv sonarqube-${BUILD_SONARQUBE_VERSION} sonarqube \
    # Set permissions so the app-user can access the files
    && chown -R app-user:app-user sonarqube \
    && rm sonarqube.zip

# Volume for documentation purposes
VOLUME "/app/sonarqube/data"

# Specify the version of the OIDC plugin we're going to use.
ENV BUILD_SONARQUBE_OIDC_PLUGIN_VERSION="2.0.0"

# Install the OIDC plugin by loading the .jar file into `extensions/plugins/`
ADD --chown=app-user:app-user https://github.com/vaulttec/sonar-auth-oidc/releases/download/v${BUILD_SONARQUBE_OIDC_PLUGIN_VERSION}/sonar-auth-oidc-plugin-${BUILD_SONARQUBE_OIDC_PLUGIN_VERSION}.jar sonarqube/extensions/plugins/sonar-auth-oidc-plugin.jar

# Replace USER in bin/linux-x86-64/sonar.sh with the default user of cinit (app-user)
RUN sed -i -e "s/#RUN_AS_USER=.*/RUN_AS_USER=app-user/g" sonarqube/bin/linux-x86-64/sonar.sh

# Copy the cinit file.
# Note that we're renaming the file, which is best practice.
COPY cinit.yml /etc/cinit.d/sonarqube.yml

# Copy the pre- and post-start-config.sh scripts into the container.
COPY post-start-config.sh pre-start-config.sh /app/
```
Checking best practices
todo
4. Adding cinit to start the container
On the SIP we use a special tool called cinit to start processes inside containers. Cinit solves a few problems with starting processes inside containers; you can find more information about that in the cinit documentation. Cinit is configured through a simple YAML file that has to be copied into the `/etc/cinit.d` folder of the container.
Example
Writing a cinit file is pretty straightforward: you just add an entry for each program you want to start in the container.
The cinit file for SonarQube looks as follows:
```yaml
---
programs:
  - name: sonarqube
    user: app-user
    group: app-user
    path: /app/sonarqube/bin/linux-x86-64/wrapper
    args:
      - "/app/sonarqube/conf/wrapper.conf"
      - wrapper.syslog.ident=SonarQube
    workdir: /app
    capabilities:
      - CAP_NET_BIND_SERVICE
```
A few points about the cinit file:
- We add the `CAP_NET_BIND_SERVICE` capability to the SonarQube app, so it can bind to port 80. Remember: cinit drops all privileges and you have to explicitly ask for the capabilities your application needs.
- We run the application as `app-user`, as suggested by the docker best practices of the SIP.
For the cinit file to work we have to add it to our Docker build, by adding the following line to the `Dockerfile`:
```dockerfile
# Copy the cinit file.
# Note that we're renaming the file, which is best practice.
COPY cinit.yml /etc/cinit.d/sonarqube.yml
```
Note that we rename the cinit file from `cinit.yml` to `sonarqube.yml`. In general you should always rename your cinit file to the name of your application, because there might be other cinit files inside the `/etc/cinit.d` folder (e.g. from the base image) and you don't want to accidentally overwrite the cinit file of your base image.
5. Setting up `docker-compose` for local development
When you're containerizing an application, you probably want to (and should) try out your configuration on your local machine before deploying it. At the moment we use `docker-compose` to test locally. `docker-compose` allows us to easily start multiple containers and connect them to each other over a network. If your application uses a database or any other systems, you probably want to add those to the docker-compose setup as well, to make it easier for other developers to start your app locally.
Example
I created a `docker-compose.yml` in the project with the following content:
```yaml
version: "3"
services:
  postgres:
    container_name: pg-docker-sonarqube
    environment:
      - POSTGRES_PASSWORD=docker
      - POSTGRES_USER=postgres
    ports:
      - "5432:5432"
    image: postgres
  sonarqube:
    ports:
      - "80:80"
    container_name: sonarqube
    build: .
    environment:
      - SIP_POSTGRES_SONARQUBE_USER=postgres
      - SIP_POSTGRES_SONARQUBE_PW=docker
      - SIP_POSTGRES_SONARQUBE_SERVER=pg-docker-sonarqube
      - SIP_POSTGRES_SONARQUBE_PORT=5432
      - SIP_POSTGRES_SONARQUBE_NAME=postgres
      - CUSTOM_ADMIN_PASSWORD=hello
      - SIP_AUTH_SONARQUBE_CLIENT_ID=sonarqube
      - SIP_AUTH_OIDC_CONFIG_URL=https://auth.vseth.ethz.ch/auth/realms/VSETH/.well-known/openid-configuration
    depends_on:
      - "postgres"
```
The docker-compose file just specifies that we want to build the `Dockerfile`. Please note that I already added a postgres container to the `docker-compose.yml`, since we're going to configure the database in the next step. I've also added the environment variables that will be used to configure the database (`SIP_POSTGRES_*`) and authentication (`SIP_AUTH_*`).
6. Connecting to other systems
Most containers that are deployed (maybe with the exception of some small static sites) also depend on other systems like databases or authentication. If you need to connect your project to another system, the application you want to containerize will need to be configured to do so.
To understand how to do this, we need to take a quick detour and talk about how an application is started on the SIP:
- Your `Dockerfile` is built on the build server and the resulting image is pushed into the SIP registry.
- Before your container is started, a program called the SIP-Manager takes care of configuring the systems you want to use. For example: it will create a database on the shared database server for your application to use.
- The container is started on the SIP Kubernetes cluster (container orchestration system).
Now your container is running and it would like to connect to its database. How does the container know how to connect to its database and which user + password combination to use for the connection? SIP solves this by defining a set of environment variables that will be provided to your container.
Your job now is to get the configuration provided by those environment variables into the tool you are containerizing. Different tools use different configuration mechanisms:
- Some tools read configuration from environment variables.
- Some tools use configuration files.
- Other tools write their configuration directly to a database.
Often it is necessary to write a small bash script that runs before your container's main process, takes the values in the environment variables and writes them to the config file of the tool.
The specification of which environment variables will be defined for your application can be found in the sip.yml (SIP Application Template) documentation.
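Since cinit only passes through the variables you explicitly list, a missing variable otherwise surfaces only as an obscure application error much later. A small helper in your start-up scripts can make this fail fast instead; this is a sketch of my own (the `require_env` function is not part of SIP or cinit):

```shell
#!/bin/sh
# Hypothetical helper (the function name is mine, not part of SIP): abort the
# start-up script early if a required SIP-provided variable is missing, so a
# misconfiguration is reported clearly instead of as a cryptic tool error.
require_env() {
  for name in "$@"; do
    # Indirect variable lookup, POSIX-sh compatible.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "error: required environment variable ${name} is not set" >&2
      return 1
    fi
  done
}
```

A pre-start script could then begin with `require_env SIP_POSTGRES_SONARQUBE_USER SIP_POSTGRES_SONARQUBE_PW` before touching any config files.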
Example
In our example we want to connect the SonarQube container to a database as well as the VSETH authentication system by using the OIDC protocol. Let's start with the database:
Connecting to the database
As mentioned before, we want to use a postgres database. The database connection of SonarQube is configured through a file called `sonar.properties`, which can be found in the `conf/` folder. The values we need to set are `sonar.jdbc.username`, `sonar.jdbc.password` and `sonar.jdbc.url`.
To do this I wrote a small bash script `pre-start-config.sh` which just uses the `sed` command to search for and set the values.
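The script itself is not reproduced in this guide, but a minimal sketch of what such a `pre-start-config.sh` could look like is shown below. The property keys come from the SonarQube documentation; the `set_prop` helper and the exact `sed` expressions are my own assumptions, not the script from the repository:

```shell
#!/bin/sh
# Hypothetical sketch of pre-start-config.sh: write the SIP_POSTGRES_*
# environment variables into the sonar.properties keys SonarQube reads on boot.
set -eu

# Allow overriding the path for local testing; in the container this would be
# the real SonarQube config file.
PROPS="${PROPS:-/app/sonarqube/conf/sonar.properties}"

# The stock sonar.properties ships with the keys commented out, so we replace
# the whole (possibly commented) line with the configured value.
set_prop() {
  key="$1"; value="$2"
  sed -i "s|^#\{0,1\}${key}=.*|${key}=${value}|" "${PROPS}"
}

configure_database() {
  jdbc_url="jdbc:postgresql://${SIP_POSTGRES_SONARQUBE_SERVER}:${SIP_POSTGRES_SONARQUBE_PORT}/${SIP_POSTGRES_SONARQUBE_NAME}"
  set_prop "sonar.jdbc.username" "${SIP_POSTGRES_SONARQUBE_USER}"
  set_prop "sonar.jdbc.password" "${SIP_POSTGRES_SONARQUBE_PW}"
  set_prop "sonar.jdbc.url" "${jdbc_url}"
}

# Only run when the SIP variables are actually present (cinit injects them at
# container start).
if [ -n "${SIP_POSTGRES_SONARQUBE_SERVER:-}" ]; then
  configure_database
fi
```

Note that `sed -i` edits the file in place, which is fine here because the container's filesystem is recreated on every start.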
Connecting to the Authentication System.
The SonarQube application should use the VSETH authentication system to enable SSO (Single Sign-On) for users. This has the big advantage that end users can use their university login instead of having to remember another username + password combination for every service. The VSETH authentication system is already used in several applications, for example the Wiki. It uses the industry standard OpenID Connect. Chances are high that you have already used this protocol yourself: it is the protocol behind the "Login with Facebook, Google, Github, Twitter etc." functionality that many websites offer. The high-level functionality of the protocol is quite simple: the application that wants to log in a user (in our example SonarQube) redirects the user to the authentication server, where the user has to authenticate herself; in the VSETH case this is done using SWITCH AAI. After the authentication has been successful, the user is redirected back to SonarQube. From the redirect the application receives a cryptographically signed token that identifies the user and that the application can use to e.g. get the email address of the user.
Many applications and frameworks already have support for the OpenID Connect protocol, so if you need to integrate it yourself, it is probably a good idea to do a quick web search for "$nameOfTool OpenID Connect integration". SonarQube is no exception and already has a plugin: https://github.com/vaulttec/sonar-auth-oidc. Configuring such a plugin is generally easy, however the devil is in the details: sometimes different configuration options have different names etc. If you're stuck, consider asking for help in the #authentication channel in the VSETH IT Slack.
Putting it all together
Of course to execute the scripts they need to be copied into the container:
```dockerfile
# Copy the pre- and post-start-config.sh scripts into the container.
COPY post-start-config.sh pre-start-config.sh /app/
```
Note: The docker COPY instruction also copies the file permissions from your host system to the target system. That means you have to make sure that all `.sh` scripts are executable on your system (you can do this by running `chmod +x yourScript.sh`).
To start the scripts inside the container we also need to update our cinit file, which now looks like this:
```yaml
---
programs:
  - name: pre-start-config
    user: app-user
    group: app-user
    path: /app/pre-start-config.sh
    env:
      - SIP_POSTGRES_SONARQUBE_USER:
      - SIP_POSTGRES_SONARQUBE_PW:
      - SIP_POSTGRES_SONARQUBE_SERVER:
      - SIP_POSTGRES_SONARQUBE_PORT:
      - SIP_POSTGRES_SONARQUBE_NAME:
    before:
      - sonarqube
  - name: post-start-config
    user: app-user
    group: app-user
    path: /app/post-start-config.sh
    env:
      - SIP_AUTH_OIDC_CONFIG_URL:
      - SIP_AUTH_SONARQUBE_CLIENT_ID:
      - CUSTOM_ADMIN_PASSWORD:
    after:
      - pre-start-config
  - name: sonarqube
    user: app-user
    group: app-user
    path: /app/sonarqube/bin/linux-x86-64/wrapper
    args:
      - "/app/sonarqube/conf/wrapper.conf"
      - wrapper.syslog.ident=SonarQube
    workdir: /app
    capabilities:
      - CAP_NET_BIND_SERVICE
```
Please note:
- Besides the main application we also execute two scripts, `pre-start-config` and `post-start-config`, to configure the system. This is a pattern that you will see often when containerizing 3rd party tools: you need to write a small bash script to map the environment variables specified by the SIP specification to the config file of the respective application.
- You have to specify all the environment variables that your program should have access to (by default cinit doesn't pass any environment variables through to your program).
7. Document your setup
As already mentioned, it is very important that you add some documentation to your project so other developers can understand what you have done.
In general you should make sure that at least these points are covered in the `README.md` file of your project:
- Local development: How to start the application locally?
- Setup: What systems does your tool use?
- Configuration: How is configuration injected into your container?
You may also want to consider adding a FAQ section at the end of the `README.md` file for points that are worth documenting but don't fall into any of the other categories.
And last but not least: no setup is perfect. While you were setting up your project, you probably encountered several things that could be improved in the future. This is very valuable insight and should not be forgotten! I highly recommend documenting those points as well, by creating Gitlab issues for them.
Example
For my project I wrote the following README file:
SonarQube
This repository contains a container to run SonarQube on SIP. SonarQube is a static code analysis tool which helps enforce code standards among developers.
Local Development
To run the container locally make sure you have `docker-compose` installed.
Start locally
To start the container run:
```shell
docker-compose up
```
Rebuild
If you make any changes to files in the repository and you want to rebuild the container before running it, you have to use the following command:

```shell
docker-compose up --build
```

since `docker-compose up` alone won't rebuild the container.
Setup
Database
SonarQube supports quite a few databases to store its data. We use PostgreSQL since this is the preferred option of the SIP.
OpenID Connect
We use the sonar-auth-oidc plugin to set up SonarQube with the VSETH Authentication service. The plugin is installed at Docker build time.
Env Variables
To run the container the following environment variables are used:
| Variable Name | Description |
| --- | --- |
| SIP_POSTGRES_SONARQUBE_USER | The database user that should be used for connecting to the database. |
| SIP_POSTGRES_SONARQUBE_PW | The password of the user. |
| SIP_POSTGRES_SONARQUBE_SERVER | The domain / address of the database server. |
| SIP_POSTGRES_SONARQUBE_PORT | The port the database server is listening on. |
| SIP_POSTGRES_SONARQUBE_NAME | The name of the database to connect to. |
| CUSTOM_ADMIN_PASSWORD | The admin password that will be set for the admin user of SonarQube. |
| SIP_AUTH_SONARQUBE_CLIENT_ID | The client id of the SonarQube application in Keycloak. |
| SIP_AUTH_OIDC_CONFIG_URL | The URL where the OIDC configuration of Keycloak can be loaded from. |
Configuration of Sonarqube
There are configuration options that have to be set at container start-up time and cannot be set at build time, such as the database connection (because the connection details are injected into the container by the SIP using environment variables).
Pre Start Configuration
The `pre-start-config.sh` script contains the configuration which has to run before SonarQube has started, such as:
- Setting the database connection.
- Setting the port the application will run on (80)
Post Start Configuration
The `post-start-config.sh` script contains the configuration which has to run after SonarQube has started, such as:
- Overwriting the default admin password to the password value defined in the environment.
- Setting the OpenID Connect configuration.
These actions are performed using the SonarQube API and therefore have to be executed after the SonarQube server has started.
Deployment
todo: keycloak configuration
FAQ
Why not use nginx as a proxy?
The only real reason I could think of for putting a proxy in front of the application is to do TLS termination in the proxy. But since this app is intended to be deployed to the SIP, where TLS termination is done by the ingress server and not by the pods themselves, there is no need to do that.
Why not set the OpenID Connect configuration manually?
Of course, all the configuration made in the `post-start-config.sh` script could easily be done using the admin interface of SonarQube. However, the OIDC configuration is injected into the container by the SIP and it may change (e.g. if the auth server is migrated to a different domain), so it is best practice to configure the application on container startup with the values provided by the environment variables.
I opted to use Gitlab issues as the place for tracking improvements, since I like working with Gitlab issues, but that's just personal preference.
8. Deploy
TODO
Improvements
You did it! I hope this basic guide on containerization helped you, and if you have any questions or improvements, don't hesitate to contact me.