performance-pipeline¶

Component providing templates for running performance benchmarks across multiple machines with result processing and optional GitHub benchmark reporting.

Purpose¶

The performance-pipeline component provides:

Multi-Machine Templates: Job templates for performance tests on selected machines
Processing Job Template: Template for processing raw benchmark results
GitHub Reporting Templates: Optional templates for posting results to GitHub

Use this component when you have performance benchmarks and want to run them consistently across machines.

When to Use¶

Enable performance pipeline when:

You have performance-critical code
You want to track performance trends over time
You need to detect performance regressions in PRs
You have dedicated performance benchmarks

Consider skipping if:

You’re just getting started with CI (add later)
You don’t have performance tests yet
CI time is already constrained

Tip

Run performance tests on 1-2 machines (not all) to balance coverage and CI time. Recommended: Tioga for AMD GPU, Matrix for NVIDIA GPU, or Dane for CPU.

Inputs¶

Core Inputs¶

Input	Required	Description
`job_cmd`	Yes	Command to run performance benchmarks
`perf_processing_cmd`	Yes	Command to process raw results into standardized format

GitHub Reporting (Optional)¶

Input	Required	Description
`github_token`	No	GitHub token with `repo` and `workflow` scopes
`github_project_name`	No	GitHub repository name (for reporting)
`github_project_org`	No	GitHub organization name (for reporting)

Machine-Specific Allocations¶

Specify allocation for each machine you want to test on:

Input	Required	Description
`dane_perf_alloc`	No	Dane allocation args (if omitted, no Dane perf jobs)
`matrix_perf_alloc`	No	Matrix allocation args
`corona_perf_alloc`	No	Corona allocation args
`tioga_perf_alloc`	No	Tioga allocation args
`tuolumne_perf_alloc`	No	Tuolumne allocation args

Note

Only machines with *_perf_alloc specified will run performance jobs. This allows selective testing (e.g., only Tioga and Matrix).

Artifact Configuration¶

Input	Required	Description
`perf_artifact_dir`	No	Directory for artifacts (default: “.”)
`perf_results_file`	No	Raw results filename (default: “results.txt”)
`perf_processed_file`	No	Processed results filename (default: “processed.json”)

Exported Templates¶

.custom_perf¶

Purpose: Base customization template for all performance jobs.

Usage: Override in .gitlab/custom-jobs.yml for project specific perf setup.

# In .gitlab/custom-jobs.yml
.custom_perf:
  before_script:
    - echo "Performance environment setup"
    - export OMP_NUM_THREADS=1
    - export CUDA_VISIBLE_DEVICES=0

.perf_on_<machine>¶

Purpose: Performance job template for each machine.

Usage: Extend this template to create performance jobs.

# In .gitlab/jobs/performances.yml
tioga-perf:
  extends: .perf_on_tioga
  variables:
    BENCHMARK_CONFIG: "large"

Available templates:

.perf_on_dane
.perf_on_matrix
.perf_on_corona
.perf_on_tioga
.perf_on_tuolumne

.results_processing¶

Purpose: Template for processing raw benchmark results.

Usage: Automatically used by the component (typically not extended directly).

Runs after all performance jobs complete to process raw results.

.results_reporting¶

Purpose: Template for GitHub benchmark reporting.

Usage: Automatically used when GitHub inputs provided.

Reports processed results to GitHub as benchmark comments.

.convert_to_gh_benchmark¶

Purpose: Converts processed results to GitHub benchmark format.

Usage: Internal template (not typically extended).

.report_to_gh_benchmark¶

Purpose: Posts benchmark data to GitHub.

Usage: Internal template (not typically extended).

Workflow¶

When configured, the performance pipeline creates this job flow:

1. Performance Jobs (parallel across machines)
   ├─ .perf_on_dane (if dane_perf_alloc set)
   ├─ .perf_on_matrix (if matrix_perf_alloc set)
   └─ .perf_on_tioga (if tioga_perf_alloc set)
   │
   ↓ (each produces raw results)
   │
2. Results Processing
   └─ .results_processing
      Runs: perf_processing_cmd
      Input: perf_results_file from all jobs
      Output: perf_processed_file (JSON)
   │
   ↓
   │
3. GitHub Reporting (if token provided)
   ├─ .convert_to_gh_benchmark
   │  Converts JSON to GitHub benchmark format
   │
   └─ .report_to_gh_benchmark
      Posts results to GitHub PR/commit

Data Formats¶

The component allows user-defined formats for raw results and requires a specific JSON format for processed results to enable GitHub reporting.

Your job_cmd produces results in whatever format your perf_processing_cmd expects (CSV, JSON, Custom format, etc.).

Your perf_processing_cmd should output JSON in GitHub benchmark format:

[
    {
      "name": "MatrixMultiply",
      "unit": "GFLOPS",
      "value": 234.5
    },
    {
      "name": "VectorAdd",
      "unit": "ms",
      "value": 12.3
    }
]

Note

As of v2026.02.2, the component makes commit_hash available to your processing script for inclusion in exported data.

Examples¶

Minimal Example¶

Basic performance testing on one machine:

# .gitlab-ci.yml
stages:
  - prerequisites
  - build-and-test
  - performance-measurements

performance-measurements:
  extends: [.performance-measurements]
  trigger:
    include:
      - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/performance-pipeline@v2026.02.2
        inputs:
          job_cmd: "./scripts/run-benchmarks.sh"
          tioga_perf_alloc: "--nodes=1 --exclusive --time-limit=15m"
          perf_processing_cmd: "python scripts/process-results.py"
      - local: '.gitlab/jobs/performances.yml'

# .gitlab/jobs/performances.yml
tioga-benchmark:
  extends: .perf_on_tioga

Multi-Machine Example¶

Performance testing on multiple machines:

performance-measurements:
  extends: [.performance-measurements]
  trigger:
    include:
      - component: .../performance-pipeline@v2026.02.2
        inputs:
          job_cmd: "./scripts/run-benchmarks.sh"
          # CPU testing
          dane_perf_alloc: "--nodes=1 --exclusive --time=15"
          # NVIDIA GPU testing
          matrix_perf_alloc: "--nodes=1 --exclusive --time=15 --gres=gpu:1"
          # AMD GPU testing
          tioga_perf_alloc: "--nodes=1 --exclusive --time-limit=15m -g 1"
          perf_processing_cmd: "python scripts/process-results.py"
          github_token: $GITHUB_STATUS_TOKEN
          github_project_name: $GITHUB_PROJECT_NAME
          github_project_org: $GITHUB_PROJECT_ORG
      - local: '.gitlab/jobs/performances.yml'

# .gitlab/jobs/performances.yml
dane-perf:
  extends: .perf_on_dane
  variables:
    CONFIG: "cpu"

matrix-perf:
  extends: .perf_on_matrix
  variables:
    CONFIG: "cuda"

tioga-perf:
  extends: .perf_on_tioga
  variables:
    CONFIG: "rocm"

Conditional Execution¶

Run performance tests only on specific branches:

performance-measurements:
  extends: [.performance-measurements]
  rules:
    # Auto-run on main/develop
    - if: '$CI_COMMIT_BRANCH == "main" || $CI_COMMIT_BRANCH == "develop"'
      when: on_success
    # Manual on PRs
    - if: '$CI_MERGE_REQUEST_ID'
      when: manual
    # Never on other branches
    - when: never
  trigger:
    include:
      - component: .../performance-pipeline@v2026.02.2
        inputs: { ... }

With Custom Artifacts¶

Customize artifact handling:

performance-measurements:
  trigger:
    include:
      - component: .../performance-pipeline@v2026.02.2
        inputs:
          job_cmd: "./run-benchmarks.sh"
          perf_processing_cmd: "./process-results.sh"
          perf_artifact_dir: "benchmark-results"
          perf_results_file: "raw-data.csv"
          perf_processed_file: "summary.json"
          tioga_perf_alloc: "--nodes=1 --exclusive --time-limit=30m"

Script Examples¶

Benchmark Script¶

Example run-benchmarks.sh:

#!/bin/bash
set -e

# Run your benchmarks
./build/benchmarks/matrix_multiply --iterations=100 > results.txt
./build/benchmarks/vector_add --size=1000000 >> results.txt

echo "Benchmarks complete"

Processing Script¶

Example process-results.py:

#!/usr/bin/env python3
import json
import sys

# Read raw results
with open('results.txt', 'r') as f:
    lines = f.readlines()

# Parse and convert to GitHub benchmark format
benchmarks = []
for line in lines:
    if 'GFLOPS' in line:
        name, value = line.split(':')
        benchmarks.append({
            'name': name.strip(),
            'unit': 'GFLOPS',
            'value': float(value.strip())
        })

with open('processed.json', 'w') as f:
    json.dump(benchmarks, f, indent=2)

print(f"Processed {len(benchmarks)} benchmarks")

Common Issues¶

Missing Machine Allocation¶

Symptom: No performance jobs run for a machine

Solution: Ensure machine-specific allocation input is provided:

inputs:
  tioga_perf_alloc: "--nodes=1 --exclusive --time-limit=15m"

Machines without *_perf_alloc are skipped.

Processing Command Fails¶

Symptom: Results processing job fails

Possible causes:

Script not found: Verify perf_processing_cmd path
Wrong format: Check output format matches expected JSON
Missing dependencies: Script may need Python/tools not loaded
File not found: Check perf_results_file name matches output

GitHub Reporting Not Working¶

Symptom: Results don’t appear on GitHub

Checklist:

github_token provided and valid
Token has repo and workflow permissions
github_project_name and github_project_org correct
Processing produced valid JSON in correct format

Long Running Benchmarks¶

Symptom: Performance jobs timeout

Solutions:

Increase *_perf_alloc time limit
Reduce benchmark iterations
Run benchmarks on fewer configurations

Result File Not Found¶

Error: “results.txt not found” in processing step

Solution: Ensure your benchmark script produces the expected filename:

inputs:
  perf_results_file: "results.txt"  # Must match your script output

Or adjust the input to match your actual filename.