Saturday, February 22, 2025

Timeout for manual jobs in GitLab

GitLab pipelines have a timeout feature: running jobs are terminated once they reach the configured time limit. However, if the pipeline contains a manual job, it may wait in that state forever, because a job waiting for a manual action is not actively running. A timeout for this case is missing. See also: https://gitlab.com/gitlab-org/gitlab-runner/-/issues/29574.

But we do not need to wait for GitLab to implement that feature.

Our solution consists of an assisting watchdog job that runs in parallel, in a separate stage. As this requires a new `.gitlab-ci.yml` file, we will use an extra `util` repository to hold it. The util pipeline invokes a shell script that simply checks whether the elapsed time is already above the limit we have declared. We need to pass it some arguments: the parent project ID and pipeline ID, so we can check the state and eventually terminate the pipeline, and the timeout value itself, so the solution stays flexible and we avoid hardcoding it in the watchdog.
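The core check the watchdog performs can be sketched as a small shell function. This is only an illustration: the name `is_timed_out` and its argument order are assumptions, not something taken from the final script.

```shell
# Sketch only: is_timed_out is a hypothetical helper name.
# Usage: is_timed_out START_SECONDS NOW_SECONDS LIMIT_SECONDS
# Succeeds (exit status 0) when more than LIMIT_SECONDS have passed
# between START_SECONDS and NOW_SECONDS.
is_timed_out() {
    [ $(( $2 - $1 )) -gt "$3" ]
}
```

With timestamps taken from `date +%s`, a call such as `is_timed_out "$START_TIMESTAMP" "$(date +%s)" "$MAX_PIPELINE_TIMEOUT"` answers whether the limit has been exceeded.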

In the parent pipeline we need to add a `watchdog` stage with a trigger job that starts the utility watchdog pipeline and passes the required variables to it.

The main pipeline:

stages: [watchdog, build, deploy]

watch_pipeline:
  stage: watchdog
  trigger:
    project: your-group/util
    branch: main
  variables:
    PARENT_PIPELINE_ID: $CI_PIPELINE_ID
    PARENT_PROJECT_ID: $CI_PROJECT_ID   # used by watchdog.sh in the cancel API call
    PARENT_PROJECT_URL: $CI_PROJECT_URL
    MAX_PIPELINE_TIMEOUT: 3600          # 1 hour

# the build and deploy jobs follow here

Here is the `.gitlab-ci.yml` in the util project:

stages:
  - watchdog

watch_pipeline:
  stage: watchdog
  image: find_some_image_that_has_bash_curl_jq
  script:
    - bash ./watchdog.sh
  allow_failure: true
  when: always
  parallel: 1

And the last piece is the script itself:

#!/bin/bash
set -euo pipefail

printf "PARENT_PROJECT_URL=%s\nPARENT_PIPELINE_ID=%s\nMAX_PIPELINE_TIMEOUT=%s\n" \
  "$PARENT_PROJECT_URL" "$PARENT_PIPELINE_ID" "$MAX_PIPELINE_TIMEOUT"

echo "Watching pipeline $PARENT_PIPELINE_ID in $PARENT_PROJECT_URL"
echo "Max time: $MAX_PIPELINE_TIMEOUT seconds"

START_TIMESTAMP=$(date +%s)
readonly START_TIMESTAMP

# Keep polling: a single check right after start would always see ~0 seconds elapsed.
# Note: the watchdog job's own timeout must be longer than MAX_PIPELINE_TIMEOUT.
while true
do
    CURRENT_TIMESTAMP=$(date +%s)
    ELAPSED_TIME=$((CURRENT_TIMESTAMP - START_TIMESTAMP))

    if [ "$ELAPSED_TIME" -gt "$MAX_PIPELINE_TIMEOUT" ]
    then
        echo "Cancelling pipeline: ${PARENT_PIPELINE_ID}!"

        curl --request POST \
             --header "PRIVATE-TOKEN: $TOKEN" \
             --header "Accept: application/json" \
             "$CI_API_V4_URL/projects/$PARENT_PROJECT_ID/pipelines/$PARENT_PIPELINE_ID/cancel"
        break
    fi

    sleep 60   # polling interval in seconds
done
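The watchdog could also check the state of the parent pipeline, so it can stop early once the pipeline has finished on its own. The following is a minimal sketch, not part of the script above: the helper names `pipeline_is_finished` and `fetch_pipeline_status` are hypothetical, the list of final statuses is an assumption about which states should end the watch, and it relies on the `status` field returned by the GitLab "get a single pipeline" API endpoint.

```shell
# Sketch only: both helper names are hypothetical.

# Succeeds when the given status means the pipeline will not progress any further,
# so there is nothing left to watch. A pipeline waiting on a manual job
# (status "manual") is deliberately treated as not finished.
pipeline_is_finished() {
    case "$1" in
        success|failed|canceled|skipped) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the current status of the parent pipeline.
# Assumes TOKEN, CI_API_V4_URL, PARENT_PROJECT_ID and PARENT_PIPELINE_ID are set
# and that curl and jq are available in the job image.
fetch_pipeline_status() {
    curl --silent --fail \
         --header "PRIVATE-TOKEN: $TOKEN" \
         "$CI_API_V4_URL/projects/$PARENT_PROJECT_ID/pipelines/$PARENT_PIPELINE_ID" \
      | jq -r '.status'
}
```

Inside a polling loop this could be used as `if pipeline_is_finished "$(fetch_pipeline_status)"; then exit 0; fi`, placed before the elapsed-time check.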
  

The last thing required is to add a Personal Access Token as the `TOKEN` variable in the util project's GitLab CI/CD settings (Settings > CI/CD > Variables). Make it protected and masked, so it is only available on protected branches and is not exposed in job logs. Unfortunately, `CI_JOB_TOKEN` does not work in this case, due to missing permissions.
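Because a protected variable is not exposed to jobs running on unprotected branches, it can help to fail early with a readable message when `TOKEN` or one of the passed variables is missing. A minimal sketch, assuming the variable names used in this post (the function name `check_watchdog_env` is hypothetical):

```shell
# Sketch only: check_watchdog_env is a hypothetical helper name.
# Aborts the script with a readable message when a variable the watchdog
# relies on is unset or empty.
check_watchdog_env() {
    : "${TOKEN:?TOKEN is not set, add it as a CI/CD variable}"
    : "${PARENT_PROJECT_ID:?PARENT_PROJECT_ID was not passed by the parent pipeline}"
    : "${PARENT_PIPELINE_ID:?PARENT_PIPELINE_ID was not passed by the parent pipeline}"
    : "${MAX_PIPELINE_TIMEOUT:?MAX_PIPELINE_TIMEOUT was not passed by the parent pipeline}"
}
```

Calling `check_watchdog_env` at the top of `watchdog.sh` would stop the job immediately instead of failing later in the `curl` call.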