RADIUSS Shared CI Explained¶

This guide explains how RADIUSS Shared CI works internally - the architecture, design patterns, and implementation details.

Repository Structure¶

The repository contains both component-based (current) and legacy (compatibility) implementations:

radiuss-shared-ci/
├── templates/              # Component implementations (GitLab 17.0+)
│   ├── base-pipeline/
│   ├── dane-pipeline/
│   ├── matrix-pipeline/
│   ├── tioga-pipeline/
│   ├── tuolumne-pipeline/
│   ├── corona-pipeline/
│   ├── performance-pipeline/
│   ├── utility-draft-pr-filter/
│   ├── utility-branch-skip/
│   └── utility-job-override-check/
├── pipelines/              # Legacy include-based files
├── utilities/              # Legacy utility files
├── customization/          # Template files for users
├── examples/               # Example configurations
└── docs/                   # Documentation source

Component-Based Implementation¶

Location: templates/*/template.yml

Components follow GitLab CI Components specification:

# templates/dane-pipeline/template.yml
spec:
  inputs:
    job_cmd:
      type: string
    shared_alloc:
      type: string
      default: "OFF"
---
# Component implementation
.job_on_dane:
  tags: [shell, dane]
  script:
    - ${{ inputs.job_cmd }}

Users consume components via:

include:
  - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/dane-pipeline@v2026.02.2
    inputs:
      job_cmd: "./script.sh"
      shared_alloc: "--nodes=1"

Legacy Implementation¶

Location: pipelines/*.yml, utilities/*.yml

Legacy files use traditional include: project: syntax:

include:
  - project: 'radiuss/radiuss-shared-ci'
    ref: 'v2026.02.2'
    file: 'pipelines/dane.yml'

Both implementations provide the same functionality. Components add input validation and catalog integration.

Pipeline Architecture¶

Parent-Child Pattern¶

RADIUSS Shared CI uses GitLab’s parent-child pipeline pattern:

Parent pipeline (.gitlab-ci.yml): - Can be extended by users as usual - Checks machine availability - Triggers child pipelines for each machine
Child pipelines (machine-specific): - Execute build/test jobs - Report results independently to GitHub

This pattern allows:

Independent machine status reporting to GitHub
Parallel execution across machines
Clean separation of concerns

Machine Abstraction¶

Each machine has:

Scheduler type (SLURM or Flux)
Allocation syntax
Runner tags
Job templates

Machine components abstract these differences:

# Dane (SLURM)
.job_on_dane:
  tags: [shell, dane]
  # SLURM-specific allocation logic

# Tioga (Flux)
.job_on_tioga:
  tags: [shell, tioga]
  # Flux-specific allocation logic

Users extend machine templates without worrying about scheduler details.

Shared Allocations¶

Shared allocation pattern:

allocate-resources job requests allocation and names it.
Jobs find allocation ID using name and run within it.
release-resources job cancels allocation.

Jobs can use jobs-stage-1, jobs-stage-2, jobs-stage-3 stages between allocation and release.

Individual allocation alternative: each job requests its own resources.

Component Implementation¶

Input Specification¶

Components define typed inputs in spec section:

spec:
  inputs:
    job_cmd:
      type: string
      description: "Command to execute for build and test"
    shared_alloc:
      type: string
      default: "OFF"
      description: "Shared allocation parameters or OFF"

GitLab validates inputs before pipeline execution. Inputs are the preferred way to pass parameters to components instead of variables.

Template Export¶

Our components typically export templates for user jobs:

# Component exports this
.job_on_dane:
  stage: jobs-stage-1
  tags: [shell, slurm, dane]
  script:
    - ${{ inputs.job_cmd }}

# User extends it
my-job:
  extends: .job_on_dane
  variables:
    COMPILER: "gcc"

This allows users to define multiple jobs with different configurations.

Variable Passing¶

Variables flow from parent to child via component inputs:

# Parent (.gitlab-ci.yml)
variables:
  JOB_CMD: "./script.sh"

dane-pipeline-trigger:
  trigger:
    include:
      - component: .../dane-pipeline@version
        inputs:
          job_cmd: $JOB_CMD  # Pass parent variable

Child pipeline jobs receive variables with RSCI_ prefix (v2026.02.1+).

Warning

Avoid setting variables with RSCI_ prefix by yourself (be it in UI or in .gitlab-ci.yml) as they are reserved for components intenal use and may cause conflicts.

GitHub Integration¶

Status Reporting¶

Each child pipeline reports status to GitHub using:

GITHUB_PROJECT_NAME and GITHUB_PROJECT_ORG identify repository
GITHUB_STATUS_TOKEN provides API access
Status context: gitlab-ci/<machine> or custom via CI_STATUS_CONTEXT

Machine checks must set ASSOCIATED_CHILD_PIPELINE to use the same context as build-and-test status reports.

This creates separate status checks per machine on GitHub pull requests.

Reproducers¶

Jobs print reproducer instructions to help debug failures locally:

### CI job ${CI_JOB_ID} reproducer on dane (${SYS_TYPE})
working_dir=...
mkdir -p $working_dir
cd $working_dir
git clone ...
[...]
# Commands to recreate environment and run test

Reproducer includes clone, environement setup, allocation parameters and job command.

File Organization¶

Repository README¶

Location: README.md (repository root)

The README serves dual purposes:

GitLab CI/CD Catalog entry: - Summary of component capabilities - Table of contents (for multiple components) - ## Components section with subsections per component - Each component: description, usage example, link to published component - ## Contribute section - Important: Do not duplicate input documentation (inputs appear automatically on component pages)
GitHub repository landing page: - Overview for potential users - Installation instructions - Links to resources

GitLab uses README content when displaying the project in CI/CD Catalog. Follow GitLab’s guidelines for component project README structure.

Customization Templates¶

Location: customization/

Template files users copy and customize:

gitlab-ci.yml - Main CI file template
subscribed-pipelines.yml - Machine triggers (legacy)
custom-jobs-and-variables.yml - Customization template (legacy)
jobs/*.yml - Per-machine job templates

These files show recommended patterns but users can modify as needed.

Examples¶

Location: examples/

Complete working examples demonstrating:

Component-based setup
Job customization
Machine-specific configurations
Performance testing

Examples use current best practices and latest version.

Documentation¶

Location: docs/sphinx/

Sphinx documentation organized as:

getting_started/ - New user guides
user_guide/ - Setup and usage
reference/ - Component API reference
dev_guide/ - Developer documentation

Built with Sphinx and published to ReadTheDocs.

Design Decisions¶

Why Components?¶

Components provide several advantages over legacy includes:

Type Safety: Input validation catches configuration errors before pipeline runs
Catalog Integration: Components appear in GitLab CI/CD Catalog with documentation
Cleaner Syntax: Named inputs clearer than variable-based configuration
Versioning: @version syntax explicit in component reference

Why Parent-Child Pipelines?¶

Parent-child pattern chosen for:

Independent Reporting: Each machine reports separately to GitHub
Parallel Execution: Machines run concurrently without blocking
Configuration and UI Scalability: Easy to add new machines, new jobs, and clear display of status

Why Shared Allocations?¶

Shared allocations offer:

Faster Job Startup: Jobs run immediately within existing allocation
Resource Efficiency: One allocation shared across jobs, with controlled concurrency
Ease of configuration: Only resource and duration of the shared allocation, not each individual job

Trade-off: All jobs must complete within allocation time limit.

Individual allocations alternative when jobs have different resource needs.

Implementation Details¶

Machine Check Pattern¶

Machine checks verify availability before triggering child pipeline:

dane-up-check:
  extends: [.dane, .machine-check]
  variables:
    ASSOCIATED_CHILD_PIPELINE: "dane-build-and-test"

Check uses ASSOCIATED_CHILD_PIPELINE to link with child pipeline status on GitHub.

If machine down, check fails and child pipeline doesn’t trigger, saving resources.

Allocation Naming¶

Shared allocations use unique names to allow multiple pipelines on same machine:

ALLOC_NAME="${ALLOC_NAME:-${CI_PROJECT_NAME}_ci_${CI_PIPELINE_ID}}"

This prevents conflicts when multiple pipelines run concurrently.

Jobs find allocation by name rather than storing ID in artifacts.

Stage Organization¶

Child pipelines have three job stages between allocation and release:

jobs-stage-1 - Default, most jobs run here
jobs-stage-2 - Jobs that depend on stage-1
jobs-stage-3 - Jobs that depend on stage-2

This allows users to define job dependencies without modifying stage list.

RSCI Variable Prefix¶

Components set variables with RSCI_ prefix (v2026.02.1+) to avoid conflicts with user variables.

Users should not set RSCI_ variables themselves.