GitLab Pipeline – parallel – What is parallel in GitLab CI/CD?

The parallel keyword in a GitLab CI/CD job definition allows you to run multiple instances of the same job concurrently. This is extremely useful for:

  • Speeding up test suites: By splitting tests across several parallel jobs (test sharding/splitting).
  • Building for multiple targets/architectures: Running the same build script with different configuration variables.
  • Processing data in chunks: Dividing a large dataset or a list of items among parallel jobs.

GitLab provides two main ways to use parallel:

  1. Fixed number of parallel jobs: parallel: <number>
  2. Matrix of parallel jobs: parallel:matrix: (using variable combinations)

Example .gitlab-ci.yml with parallel:

YAML

# .gitlab-ci.yml

stages:
  - test
  - build

default:
  image: alpine:latest

# Job 1: Demonstrates a fixed number of parallel jobs for sharding tests
parallel_test_execution:
  stage: test
  parallel: 3 # Run this job 3 times in parallel
  script:
    - echo "--- Parallel Test Execution ---"
    # CI_NODE_INDEX is 1-based (e.g., 1, 2, 3)
    # CI_NODE_TOTAL is the total number of parallel jobs (e.g., 3)
    - echo "This is test job instance $CI_NODE_INDEX of $CI_NODE_TOTAL."
    - echo "Simulating test execution for a subset of tests..."
    # Example: Logic to select a subset of tests based on CI_NODE_INDEX
    - |
      if [ "$CI_NODE_INDEX" -eq 1 ]; then
        echo "Running test group 1 (e.g., unit tests part 1)"
        sleep 5
      elif [ "$CI_NODE_INDEX" -eq 2 ]; then
        echo "Running test group 2 (e.g., unit tests part 2)"
        sleep 7
      elif [ "$CI_NODE_INDEX" -eq 3 ]; then
        echo "Running test group 3 (e.g., integration tests part 1)"
        sleep 6
      fi
    - echo "Test instance $CI_NODE_INDEX finished."

# Job 2: Demonstrates a matrix of parallel jobs for different configurations
build_multi_target:
  stage: build
  script:
    - echo "--- Building for Target: $TARGET_OS, Arch: $TARGET_ARCH ---"
    - echo "This is job instance $CI_NODE_INDEX of $CI_NODE_TOTAL in the matrix."
    - echo "Fetching dependencies for $TARGET_OS-$TARGET_ARCH..."
    - sleep 10 # Simulate build time
    - echo "Build complete for $TARGET_OS-$TARGET_ARCH."
    - mkdir -p "build_output/${TARGET_OS}_${TARGET_ARCH}"
    - echo "binary_for_${TARGET_OS}_${TARGET_ARCH}" > "build_output/${TARGET_OS}_${TARGET_ARCH}/app"
  parallel:
    matrix:
      # First set of combinations
      - TARGET_OS: ["linux", "windows"]
        TARGET_ARCH: ["amd64"]
      # Second set of combinations (can have different variables or values)
      - TARGET_OS: ["darwin"]
        TARGET_ARCH: ["amd64", "arm64"]
      # A single specific combination
      - TARGET_OS: ["linux"]
        TARGET_ARCH: ["arm64"]
        EXTRA_FLAG: ["--optimized"] # You can add other variables
  artifacts:
    paths:
      - build_output/
    expire_in: 1 hour

Explanation:

  1. The parallel keyword:
    • Used within a job’s definition to specify that the job should be run multiple times in parallel.
  2. parallel_test_execution Job (Fixed Number):
    • parallel: 3:
      • This tells GitLab to create and run 3 instances of the parallel_test_execution job concurrently (assuming enough runners are available).
      • In the GitLab UI, these jobs will appear as parallel_test_execution 1/3, parallel_test_execution 2/3, and parallel_test_execution 3/3.
    • Predefined CI/CD Variables:
      • $CI_NODE_INDEX: A 1-based index for the current parallel job instance (e.g., 1, 2, or 3 in this case).
      • $CI_NODE_TOTAL: The total number of parallel jobs in this set (e.g., 3 in this case).
    • Script Logic: The script uses $CI_NODE_INDEX to simulate running different groups of tests. This is a common pattern for test sharding, where a large test suite is divided among parallel jobs to reduce overall execution time.
  3. build_multi_target Job (Matrix):
    • parallel:matrix: Defines a list of variable combinations. GitLab creates a separate parallel job instance for each unique combination.
    • Variable Combinations:
      • The first matrix entry generates 2 jobs:
          - TARGET_OS: ["linux", "windows"]
            TARGET_ARCH: ["amd64"]
        1. TARGET_OS=linux, TARGET_ARCH=amd64
        2. TARGET_OS=windows, TARGET_ARCH=amd64
      • The second matrix entry generates 2 more jobs:
          - TARGET_OS: ["darwin"]
            TARGET_ARCH: ["amd64", "arm64"]
        1. TARGET_OS=darwin, TARGET_ARCH=amd64
        2. TARGET_OS=darwin, TARGET_ARCH=arm64
      • The third matrix entry generates 1 job:
          - TARGET_OS: ["linux"]
            TARGET_ARCH: ["arm64"]
            EXTRA_FLAG: ["--optimized"]
        1. TARGET_OS=linux, TARGET_ARCH=arm64, EXTRA_FLAG=--optimized
      • Total Jobs: In this example, the build_multi_target job with this matrix configuration generates 2 + 2 + 1 = 5 parallel job instances. Each instance has TARGET_OS and TARGET_ARCH set for its combination; EXTRA_FLAG is set only in the instance created by the third entry, and the example script does not use it.
    • Script Logic: The script uses these environment variables (e.g., $TARGET_OS, $TARGET_ARCH) to perform actions specific to that combination, like building for a particular operating system and architecture.
    • Artifacts: Each parallel job instance can produce its own artifacts. The example shows creating artifacts specific to the build target.
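
The if/elif chain in parallel_test_execution hard-codes one branch per instance. A more general way to shard is to hand out test files round-robin using the two predefined variables. The following is a minimal sketch, not a definitive implementation: the job name sharded_tests, the tests/ directory, and the *_test.sh naming convention are assumptions made for illustration and are not part of the example above.

```yaml
# Hypothetical job: shards shell-based test files across 3 instances.
sharded_tests:
  stage: test
  image: alpine:latest
  parallel: 3
  script:
    - |
      # Assumption: each test is a shell file named tests/*_test.sh,
      # and file names contain no spaces.
      # Sorting gives every instance the same file order, so the shards
      # do not overlap.
      i=0
      for f in $(find tests -name '*_test.sh' | sort); do
        # File number i goes to instance (i mod CI_NODE_TOTAL) + 1,
        # because CI_NODE_INDEX starts at 1.
        if [ $(( i % CI_NODE_TOTAL + 1 )) -eq "$CI_NODE_INDEX" ]; then
          echo "Instance $CI_NODE_INDEX of $CI_NODE_TOTAL runs $f"
          sh "$f"
        fi
        i=$(( i + 1 ))
      done
```

With 7 test files and parallel: 3, instance 1 would run files 0, 3 and 6, instance 2 files 1 and 4, and instance 3 files 2 and 5 (counting from 0 in sorted order). Round-robin splits by file count, not by run time, so shards can still finish at different times.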

Key Concepts and Behavior:

  • Purpose: To significantly speed up pipelines by running identical or similar tasks concurrently or to build/test across a range of configurations.
  • Runner Availability: The actual level of parallelism depends on how many jobs your GitLab Runners can pick up at once. If you define parallel: 10 but your runners can only execute 2 jobs concurrently, only 2 instances run at a time and the rest wait in the queue.
  • Job Naming:
    • For parallel: <number>, jobs are named job_name X/Y.
    • For parallel:matrix:, jobs are named after the matrix values of each combination, for example build_multi_target: [linux, amd64], rather than with an X/Y suffix.
  • CI_NODE_INDEX and CI_NODE_TOTAL: These are crucial for dividing work among the parallel instances, especially when using parallel: <number>. For parallel:matrix:, you typically rely on the matrix variables themselves to differentiate the work.
  • Resource Consumption: Using parallel can consume more CI/CD minutes and runner resources, so use it judiciously where the speed benefits outweigh the costs.
  • Dependencies and Artifacts: Each parallel job instance is treated as a separate job. A later job that lists a parallelized job in needs waits for all of its instances; depending on a single matrix instance is possible with needs:parallel:matrix, but is more advanced. Because artifacts from several instances can be downloaded into the same workspace, give them unique paths (as build_output/${TARGET_OS}_${TARGET_ARCH} does in the example) or aggregate them in a later stage.
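
To make the dependency behaviour concrete, the sketch below adds two consumer jobs to the example pipeline. It assumes a third stage named deploy has been added to stages, and the job names package_all and deploy_linux_amd64 are hypothetical. The needs:parallel:matrix form is only available in more recent GitLab versions, so check the documentation for your instance before relying on it.

```yaml
# Assumes "deploy" has been added to stages, after "build".

# Waits for every build_multi_target instance and downloads all of
# their artifacts into the same workspace.
package_all:
  stage: deploy
  needs:
    - job: build_multi_target
  script:
    - ls -R build_output/

# Waits only for the linux/amd64 instance of the matrix.
deploy_linux_amd64:
  stage: deploy
  needs:
    - job: build_multi_target
      parallel:
        matrix:
          - TARGET_OS: linux
            TARGET_ARCH: amd64
  script:
    - cat build_output/linux_amd64/app
```

The first job works without collisions because each build instance writes to its own subdirectory of build_output/. The second job can start as soon as its one matrix instance finishes, without waiting for the other four builds.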

Using parallel effectively can drastically improve the efficiency of your CI/CD pipelines, especially for time-consuming tasks like extensive test suites or multi-platform builds.
