The retry keyword in a GitLab CI/CD job allows you to specify that a job should be automatically re-executed if it fails. This is particularly useful for jobs that might fail due to transient issues like network glitches, timeouts with external services, or brief runner problems, rather than an actual problem with the job's script or code.
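In its simplest form, retry takes a number of additional attempts; it can also be expanded into a block with max and when keys. A minimal sketch of both forms (job names and scripts are placeholders, not from the example below):
YAML
unit_tests:
  stage: test
  retry: 2                        # up to 2 automatic retries (3 attempts in total)
  script:
    - ./run_tests.sh              # placeholder command

release:
  stage: deploy
  retry:
    max: 1                        # up to 1 automatic retry
    when:
      - runner_system_failure     # only retry on infrastructure problems
      - stuck_or_timeout_failure
  script:
    - ./deploy.sh                 # placeholder command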
Example .gitlab-ci.yml with retry:
This example simulates a job that might fail on its first attempt but succeed on a retry.
YAML
# .gitlab-ci.yml
stages:
  - test
  - deploy

default:
  image: alpine:latest

# Job 1: A job that might fail due to a transient issue (simulated)
flaky_test_job:
  stage: test
  retry: 2 # Allow up to 2 retries (3 attempts in total: initial + 2 retries)
  script:
    # CI_JOB_ATTEMPT is available from GitLab 15.5; fall back to 1 otherwise
    - 'echo "--- Running Flaky Test Job (Attempt: ${CI_JOB_ATTEMPT:-1}) ---"'
    - echo "Simulating a test that sometimes fails..."
    # A counter file simulates success on a later attempt. This assumes the
    # untracked file survives between attempts; it will not with a clean
    # checkout per attempt, which is the default behavior.
    - ATTEMPT_FILE=".job_attempt_counter"
    - |
      if [ ! -f "$ATTEMPT_FILE" ]; then
        echo "First attempt: simulating failure."
        echo "1" > "$ATTEMPT_FILE"    # Record first attempt
        exit 1                        # Fail the job
      elif [ "$(cat "$ATTEMPT_FILE")" = "1" ]; then
        echo "Second attempt: simulating another failure for demonstration."
        echo "2" > "$ATTEMPT_FILE"    # Record second attempt
        exit 1                        # Fail the job again
      else
        echo "Third attempt (or later, if retried manually): test passes!"
        rm "$ATTEMPT_FILE"            # Clean up
        exit 0                        # Succeed the job
      fi

# Job 2: Another job with more specific retry conditions
deploy_to_staging:
  stage: deploy
  retry:
    max: 1                        # Allow up to 1 retry (2 attempts in total)
    when:                         # Conditions under which to retry
      - runner_system_failure     # Retry if the runner had a system issue
      - stuck_or_timeout_failure  # Retry if the job got stuck or timed out
      - script_failure            # Also retry on general script failures for this example
  script:
    - 'echo "--- Deploying to Staging (Attempt: ${CI_JOB_ATTEMPT:-1}) ---"'
    # Random number between 1 and 3 (via awk, since alpine's shell lacks $RANDOM)
    - RANDOM_NUMBER=$(awk 'BEGIN { srand(); print int(rand() * 3) + 1 }')
    - |
      if [ "$RANDOM_NUMBER" -ne 3 ] && [ "${CI_JOB_ATTEMPT:-1}" -lt 2 ]; then
        echo "Deployment to staging failed due to a simulated transient issue (e.g. network timeout)."
        exit 1
      else
        echo "Deployment to staging successful!"
        exit 0
      fi
Explanation:
retry keyword:
- Used within a job definition to configure its retry behavior.
- It tells GitLab to re-run the job automatically if it fails, up to a specified number of times and under certain conditions.

flaky_test_job (simple retry count):
- retry: 2 is the simpler way to define retries. The number 2 means the job will be retried up to 2 times after its initial failure, giving a maximum of 3 attempts in total (1 initial attempt + 2 retries).
- GitLab allows a maximum of 2 retries, so valid values are 0, 1, or 2.
- If when: is not specified, it defaults to always, so the job is retried on any failure reason.
- Script logic: the script uses a file (.job_attempt_counter) to simulate a scenario where the first two attempts fail and the third attempt succeeds. CI_JOB_ATTEMPT (available from GitLab 15.5) is a predefined variable that indicates the current attempt number for a job, starting from 1; the script falls back to 1 on older versions. Note that the counter file only survives between attempts if the runner preserves the working directory.

deploy_to_staging (detailed retry configuration):
- retry: starts the retry configuration block.
- max: 1 specifies the maximum number of retries (up to 1 retry, meaning 2 attempts in total).
- when: lists the failure reasons for which the job should be retried; the job is only retried if its failure reason is in this list.
  - runner_system_failure: a problem with the GitLab Runner itself (e.g. a crash or unexpected shutdown).
  - stuck_or_timeout_failure: the job was stuck or exceeded its execution timeout.
  - script_failure: a command in the script section exited with a non-zero status code.
- Other common when values include unknown_failure, api_failure, missing_dependency_failure, and job_execution_timeout. You can list multiple reasons; if any of them occurs, the job is retried.
- Script logic: the script uses a random number to simulate a deployment that might fail on the first attempt but succeed on the retry. It fails when RANDOM_NUMBER isn't 3 on the first attempt, and succeeds on the second attempt (when CI_JOB_ATTEMPT is 2).
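If several jobs need the same behavior, recent GitLab versions also accept retry under the default: section, so every job inherits it unless it overrides it. A brief sketch, assuming your GitLab version supports retry in default::
YAML
default:
  image: alpine:latest
  retry:
    max: 2
    when:
      - runner_system_failure
      - stuck_or_timeout_failure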
Key Concepts and Behavior:
- Purpose: To handle intermittent or transient failures automatically, improving pipeline reliability without manual intervention for minor, non-deterministic issues.
- Number of Retries: The integer value for retry or retry:max specifies the number of additional attempts after the first failure. So retry: 1 means one initial attempt and one retry (2 attempts total). The maximum allowed value is 2 (3 attempts total).
- Retry Conditions (when): If only retry: <count> is used, the job is retried on any failure (the default when: always). retry:when gives you fine-grained control over why a job should be retried, which is useful to avoid retrying jobs that fail due to genuine code errors, where a retry won't help (see the sketch after this list).
- Job Logs: Each attempt of a retried job is logged separately in the GitLab UI, allowing you to see the output and failure reason for each attempt.
- Final Status: If all allowed attempts (initial + retries) fail, the job is marked as failed in the pipeline. If any attempt succeeds, the job is marked as successful.
- Idempotency: Jobs that are configured to retry should ideally be idempotent. This means running the job multiple times should have the same effect as running it once successfully (e.g., it shouldn’t create duplicate resources or have unintended side effects on repeated runs).
- CI_JOB_ATTEMPT Variable: Starting from GitLab 15.5, this predefined variable can be used within your script to know which attempt is currently running. This is useful for implementing conditional logic for retries within your script.
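A sketch of the fine-grained pattern referenced above: retry only on infrastructure failures, and log the attempt number. This assumes CI_JOB_ATTEMPT is available as described; the job name and test command are placeholders.
YAML
integration_tests:
  stage: test
  retry:
    max: 2
    when:
      - runner_system_failure      # infrastructure problems only,
      - stuck_or_timeout_failure   # so genuine code errors fail immediately
  script:
    - 'echo "Attempt ${CI_JOB_ATTEMPT:-1} of this job"'
    - ./run_integration_tests.sh   # placeholder command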
When to Use retry:
- Flaky Tests: Tests that occasionally fail due to timing issues, external service flakiness, or unstable test environments.
- Network-Dependent Operations: Jobs that interact with external services over a network that might experience temporary outages or glitches (e.g., fetching dependencies, deploying to a cloud provider); see the sketch after this list.
- Resource Contention: Situations where a job might fail if a shared resource (like a database connection pool on a test server) is temporarily exhausted.
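A hedged sketch of the network-dependent case (the download URL and job name are placeholders, and an alpine-based image is assumed for apk):
YAML
fetch_dependencies:
  stage: test
  retry:
    max: 2
    when:
      - script_failure             # e.g. a registry returning a transient error
      - stuck_or_timeout_failure
      - api_failure
  script:
    - apk add --no-cache curl                                          # alpine image assumed
    - curl --fail --retry 3 -O https://example.com/toolchain.tar.gz    # placeholder URL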
Do not overuse retry to mask fundamental problems in your code or scripts. If a job consistently fails, it needs to be fixed, not just retried. retry is for genuinely transient issues.