GitLab CI/CD Best Practices I Recommend After 2 Years of Experience

Tin Plavec
7 min read · Mar 5, 2023


I’ve been writing pipelines for GitLab CI/CD for more than 2 years now. Here are the practices I follow, or at least try to. They will help speed up your pipeline and make it more readable and easier to manage.


1. Never install stuff in script

I’m sure there are a number of occasions when you’re writing a new CI/CD job and realize you’re missing a package to run some command. You do what’s easiest: add apt-get install, apk add, pip install or similar.

script:
  - apt-get update
  - apt-get install -y kubectl git
  - git clone
  - kubectl apply

However, this slows down your pipeline, because every time the job runs, it installs all these packages again. Instead, look for an image on Docker Hub that already has what you need, or build your own Docker image.

FROM ubuntu:20.04
# Note: kubectl is not in Ubuntu's default apt repositories;
# you need to add the Kubernetes apt repository first.
RUN apt-get update && apt-get install -y kubectl git

Most often, we end up building a custom Docker image, since it’s hard to find the exact image you need on the Internet, and even if you do find it, using it brings security risks. So, we developed a CI/CD pipeline and established a special directory in our Docker repo just for these images. Basically, it’s CI/CD automation for building CI/CD images 😄.
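As an illustration, one of those image-building jobs could look roughly like this. This is a sketch using kaniko (which builds images without a Docker daemon); the directory layout, stage name, and image tag are assumptions for the example, not our actual setup:

```yaml
# Hypothetical job: rebuild the deploy image when its Dockerfile changes.
Build deploy image:
  stage: Build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  rules:
    - changes:
        - ci-images/deploy/Dockerfile
  before_script:
    # Authenticate kaniko against the GitLab container registry
    - mkdir -p /kaniko/.docker
    - echo "{\"auths\":{\"${CI_REGISTRY}\":{\"auth\":\"$(echo -n ${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD} | base64)\"}}}" > /kaniko/.docker/config.json
  script:
    - /kaniko/executor
      --context "${CI_PROJECT_DIR}/ci-images/deploy"
      --dockerfile "${CI_PROJECT_DIR}/ci-images/deploy/Dockerfile"
      --destination "${CI_REGISTRY_IMAGE}/deploy:v1.0.0"
```

With rules:changes, the image is only rebuilt when its Dockerfile actually changes, so the job costs nothing on ordinary pipeline runs.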

2. Standardize keyword order in jobs

Maybe it doesn’t seem so important at first, but this standardization will increase code readability, especially for complex CI/CD jobs. When reviewing someone else’s code, you’ll instantly see if a keyword is missing or wrongly defined, and it will be much easier to compare two jobs. A common convention would be to go alphabetical, but that is actually not intuitive. For example, the artifacts keyword would come near the front, while script would be near the end of the job definition. In reality, however, script runs first, and only then are artifacts saved. Alphabetical keyword order therefore reads poorly and may confuse developers new to CI/CD. So, after some thought, I developed a more intuitive order based on the chronological order of job execution.

Simple test:
  extends:
    - .job-template
  stage: Test
  needs: []
  rules:
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
  tags:
    - kubernetes
  image: ubuntu
  environment:
    name: sandbox
  variables:
    FILE: README.md
  before_script:
    - echo "Running in Kubernetes cluster" > new-file.txt
  script:
    - test -f ${FILE}
  allow_failure: false
  artifacts:
    paths:
      - new-file.txt
  cache:
    key: $CI_COMMIT_REF_SLUG
    paths:
      - bin/

The extends keyword comes first. It signals that the keywords that follow may override the template. Then stage, needs and rules define where, and whether, the job will be scheduled in the pipeline. After the job is scheduled, it runs on a runner matching the tags, using the specified Docker image. The environment and variables are then set, and finally the script is run.

After we started using this order, writing and reading jobs became more understandable and easier to explain to juniors.

3. Version your CI code when including

GitLab offers you a cool feature to include CI/CD configuration from a local file, remote file, another repo, or a template. For example:

include:
  - project: 'my-group/ci-project'
    file: '/ci/docker-build.yml'

However, this always includes the latest version of the file on the default branch. At first, this may seem simplest, but it’s not a good practice, in the same way that tagging Docker images with latest is not a good practice. The second you create another repo that includes this file, you’ll realize you need to version your CI files. Therefore, tag the CI repo with v1.0.0 and do something like this:

include:
  - project: 'my-group/ci-project'
    ref: 'v1.0.0'
    file: '/ci/docker-build.yml'

This way you can easily update docker-build.yml by creating a new tag, without breaking the repos that include the file. ref can be a tag, a branch name, or even a commit SHA. If you include a remote YAML file, you obviously can’t use ref; in that case, you can put the file version at the end of the file name.
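A versioned remote include could then look like this (the URL is a made-up placeholder):

```yaml
include:
  - remote: 'https://example.com/ci/docker-build-v1.0.0.yml'
```

When you release a new version, you publish the file under a new name and bump the version in the including repos.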

4. Write comments

No matter if your CI/CD pipeline is simple or complex, it’s going to be much more readable if you write comments. When we started with GitLab CI/CD, we didn’t write any comments. Sooner rather than later, only the author of the code knew what it did, and if the code was written a while ago, no one could easily understand it. Our pipeline structure is complex, with a lot of including, extending, and advanced rules to trigger jobs. One simple way to make extending clearer is to write down the origin of each template. Also, be sure to explain rules clauses and how you use variables.

Deploy to K8s cluster:
  extends:
    - .project-variables # from project-config.yml
    - .deploy-job # from another-repo/core.yml
  rules:
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH # if default branch
      variables:
        DEPLOY_TIER: "production" # override default value
    - if: $CI_COMMIT_REF_NAME =~ /staging/ # if includes string 'staging'
      variables:
        DEPLOY_TIER: "staging" # override default value

Perhaps you don’t have to be as thorough as this example, but still, take some time to write helpful comments, or else you will have to explain your code time and time again to every colleague who tries to read it. It’s also useful to write explanations in a job’s script, just as you would in a Python script.

Test:
  script:
    - |
      if run-test-1; then
        echo "Test 1 PASS"
      else
        exit 2 # give warning
      fi
    - |
      if run-test-2; then
        echo "Test 2 PASS"
      else
        exit 1 # give error
      fi
  allow_failure:
    exit_codes:
      - 2

Finally, if your pipeline becomes too complex, it’s helpful to write docs that explain it at a high level. There’s never enough documentation 😅.

5. Using alpine images is not such a benefit actually

It is often recommended to use small alpine images to speed up your pipeline jobs. While these images are pulled faster, they often cause unexpected errors. If you have a simple application to build, alpine will work, but as you extend your app with more complex functionality, something will likely break and you will have to switch to the full image instead. Also, Alpine’s package repositories are known to drop old versions of packages after some time.

The only real benefit of using alpine images for CI/CD jobs is faster image download, and even this benefit is not significant, for two reasons. Firstly, the time difference for the image download is usually 10–20 seconds. GitLab Runners run in the cloud with fast Internet access, so downloading 1 GB more does not take significantly longer. Secondly, GitLab Runners cache images, especially if you run your own runners. For example, we run our runners in a Kubernetes cluster, where each job is spawned as a pod. When the first job triggers the pull of an image, that image is cached inside the cluster. When the job is triggered again, the image is available almost instantly, because it is not actually pulled from the Internet but from the cache.

Setting up your own GitLab Runners is free and easier than you think, and they run perfectly well on a low-cost server. For these reasons, I believe it makes no sense to use alpine images for CI/CD jobs. However, if you’re experienced and know all the problems that can arise with alpine, it may still make sense for you.
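If you run your own runners with the Docker executor, you can make this image caching explicit through the runner’s pull policy. A config.toml sketch follows; whether if-not-present is acceptable depends on how much you trust the images already cached on the host:

```toml
# config.toml excerpt for a self-managed GitLab Runner (Docker executor)
[[runners]]
  [runners.docker]
    # Use the locally cached image if present,
    # instead of contacting the registry on every job
    pull_policy = "if-not-present"
```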

6. Build stageless pipelines

Since GitLab 14.2 it has been possible to create stageless pipelines. Such a pipeline runs all jobs in one single stage, and the job order is determined solely by the needs keyword.

stages:
  - Pipeline

Build frontend:
  stage: Pipeline
  needs: []

Build backend:
  stage: Pipeline
  needs: []

Deploy frontend:
  stage: Pipeline
  needs:
    - Build frontend

Deploy backend:
  stage: Pipeline
  needs:
    - Build backend

The example above is simple, but it shows how needs improves your pipeline. If you used stages instead, the build jobs would be in the first stage and the deploy jobs in the second stage. That way, Deploy backend would need (no pun intended) to wait for both Build frontend and Build backend to finish. With needs, it only waits for Build backend, which makes much more sense.

Furthermore, you can make needs optional. If the needed job exists in the pipeline, the dependent job will wait for it to finish. If the needed job doesn’t exist in the current pipeline, the dependent job will just run immediately.

Check backend working:
  stage: Pipeline
  needs:
    - job: Deploy backend
      optional: true

7. Use the Kubernetes agent

If you’re deploying to Kubernetes, like we do, you’ll love GitLab’s agent for Kubernetes. The agent connects GitLab repos with Kubernetes clusters, and setting it up is actually fairly easy. You create the agent on GitLab for a project or a group, get the token, and then install the agent into a Kubernetes cluster with Helm. And that’s it. The agent will automatically inject the Kubernetes context (kubeconfig) you select into your CI/CD jobs, so you can simply run kubectl apply and helm install as much as you want.
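In a job, selecting the injected context then looks something like this (the project path, agent name, and manifest directory are placeholders):

```yaml
Deploy with agent:
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  script:
    # The context is named <path/to/agent-project>:<agent-name>
    - kubectl config use-context my-group/ci-project:my-agent
    - kubectl apply -f manifests/
```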

There is also another way to use the agent. It’s called GitOps workflow, where you just push manifests to the repo and the agent will automatically scan and deploy them. This workflow, however, requires more setup on the cluster side.

8. Read the official documentation

Most of my knowledge of GitLab CI/CD doesn’t come from tutorials; it comes from reading the documentation. Simple tutorials may be good for beginners, but if you read the documentation you can truly become an expert in the field. GitLab’s Keyword reference explains almost all of the pipeline functionality available, and it is frequently updated as new GitLab versions are released. Just glance through that doc, and I’m sure you’ll find some cool functionality you were not aware of.



Written by Tin Plavec

I write about DevOps, CI/CD, Bash, Docker, GitLab and tech in general. plavy.me
