RADIUSS Shared CI Explained¶
This guide explains how RADIUSS Shared CI works internally - the architecture, design patterns, and implementation details.
Repository Structure¶
The repository contains both component-based (current) and legacy (compatibility) implementations:
radiuss-shared-ci/
├── templates/ # Component implementations (GitLab 17.0+)
│ ├── base-pipeline/
│ ├── dane-pipeline/
│ ├── matrix-pipeline/
│ ├── tioga-pipeline/
│ ├── tuolumne-pipeline/
│ ├── corona-pipeline/
│ ├── performance-pipeline/
│ ├── utility-draft-pr-filter/
│ └── utility-branch-skip/
├── pipelines/ # Legacy include-based files
├── utilities/ # Legacy utility files
├── customization/ # Template files for users
├── examples/ # Example configurations
└── docs/ # Documentation source
Component-Based Implementation¶
Location: templates/*/template.yml
Components follow GitLab CI Components specification:
# templates/dane-pipeline/template.yml
spec:
  inputs:
    job_cmd:
      type: string
    shared_alloc:
      type: string
      default: "OFF"
---
# Component implementation
.job_on_dane:
  tags: [shell, dane]
  script:
    - $[[ inputs.job_cmd ]]
Users consume components via:
include:
  - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/dane-pipeline@v2026.02.2
    inputs:
      job_cmd: "./script.sh"
      shared_alloc: "--nodes=1"
Legacy Implementation¶
Location: pipelines/*.yml, utilities/*.yml
Legacy files use traditional include: project: syntax:
include:
  - project: 'radiuss/radiuss-shared-ci'
    ref: 'v2026.02.2'
    file: 'pipelines/dane.yml'
Both implementations provide the same functionality. Components add input validation and catalog integration.
Pipeline Architecture¶
Parent-Child Pattern¶
RADIUSS Shared CI uses GitLab’s parent-child pipeline pattern:
Parent pipeline (.gitlab-ci.yml):
- Can be extended by users as usual
- Checks machine availability
- Triggers child pipelines for each machine
Child pipelines (machine-specific):
- Execute build/test jobs
- Report results independently to GitHub
This pattern allows:
Independent machine status reporting to GitHub
Parallel execution across machines
Clean separation of concerns
Machine Abstraction¶
Each machine has:
Scheduler type (SLURM or Flux)
Allocation syntax
Runner tags
Job templates
Machine components abstract these differences:
# Dane (SLURM)
.job_on_dane:
  tags: [shell, dane]
  # SLURM-specific allocation logic
# Tioga (Flux)
.job_on_tioga:
  tags: [shell, tioga]
  # Flux-specific allocation logic
Users extend machine templates without worrying about scheduler details.
Shared Allocations¶
Shared allocation pattern:
allocate-resources job requests allocation and names it.
Jobs find allocation ID using name and run within it.
release-resources job cancels allocation.
Jobs can use jobs-stage-1, jobs-stage-2, jobs-stage-3 stages
between allocation and release.
Individual allocation alternative: each job requests its own resources.
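The shared allocation lifecycle can be sketched in shell. This is a minimal simulation, not the components' implementation: the scheduler is replaced by a temporary registry file so the example runs anywhere, and the function names (allocate, find_alloc_id, release) are hypothetical. On a real machine these steps would call SLURM or Flux instead.

```shell
#!/bin/sh
# Simulated allocation registry: one "ID NAME" pair per line.
# A real pipeline would query SLURM or Flux instead of this file.
REGISTRY="$(mktemp)"
NEXT_ID=100

# allocate NAME: register a named allocation (stands in for allocate-resources).
allocate() {
    NEXT_ID=$((NEXT_ID + 1))
    printf '%s %s\n' "$NEXT_ID" "$1" >> "$REGISTRY"
}

# find_alloc_id NAME: print the ID of the allocation with that name, if any.
find_alloc_id() {
    awk -v name="$1" '$2 == name { print $1; exit }' "$REGISTRY"
}

# release NAME: drop the allocation (stands in for release-resources).
release() {
    awk -v name="$1" '$2 != name' "$REGISTRY" > "$REGISTRY.tmp"
    mv "$REGISTRY.tmp" "$REGISTRY"
}

allocate "myproject_ci_1234"
allocate "otherproject_ci_5678"

# A job only knows the allocation name and looks up the ID when it starts.
ALLOC_ID="$(find_alloc_id "myproject_ci_1234")"
echo "jobs run inside allocation ${ALLOC_ID}"

release "myproject_ci_1234"
```

Because jobs look up the allocation by name when they start, no ID has to be handed from the allocation job to the jobs that follow it.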
Component Implementation¶
Input Specification¶
Components define typed inputs in spec section:
spec:
  inputs:
    job_cmd:
      type: string
      description: "Command to execute for build and test"
    shared_alloc:
      type: string
      default: "OFF"
      description: "Shared allocation parameters or OFF"
GitLab validates inputs before pipeline execution. Inputs are the preferred way to pass parameters to components instead of variables.
Template Export¶
Our components typically export templates for user jobs:
# Component exports this
.job_on_dane:
  stage: jobs-stage-1
  tags: [shell, slurm, dane]
  script:
    - $[[ inputs.job_cmd ]]
# User extends it
my-job:
  extends: .job_on_dane
  variables:
    COMPILER: "gcc"
This allows users to define multiple jobs with different configurations.
Variable Passing¶
Variables flow from parent to child via component inputs:
# Parent (.gitlab-ci.yml)
variables:
  JOB_CMD: "./script.sh"
dane-pipeline-trigger:
  trigger:
    include:
      - component: .../dane-pipeline@version
        inputs:
          job_cmd: $JOB_CMD # Pass parent variable
Child pipeline jobs receive variables with RSCI_ prefix (v2026.02.1+).
Warning
Avoid setting variables with the RSCI_ prefix yourself (whether in the UI or in
.gitlab-ci.yml): they are reserved for the components' internal use and may
cause conflicts.
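One way to catch accidental use of the reserved prefix is a small lint step over the CI file. The sketch below is an illustration only: find_reserved_vars is a hypothetical helper, RSCI_EXAMPLE is a made-up variable name, and the check cannot see variables set in the GitLab UI.

```shell
#!/bin/sh
# find_reserved_vars FILE: print lines that define a key starting with RSCI_.
# Returns 0 when at least one such definition is found, non-zero otherwise.
find_reserved_vars() {
    grep -nE '^[[:space:]]*RSCI_[A-Za-z0-9_]*[[:space:]]*:' "$1"
}

# Demo on a throwaway file; RSCI_EXAMPLE is a made-up variable name.
CI_FILE="$(mktemp)"
cat > "$CI_FILE" <<'EOF'
variables:
  JOB_CMD: "./script.sh"
  RSCI_EXAMPLE: "do-not-set"
EOF

if find_reserved_vars "$CI_FILE"; then
    echo "warning: RSCI_ variables are reserved for the components"
fi
```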
GitHub Integration¶
Status Reporting¶
Each child pipeline reports status to GitHub using:
GITHUB_PROJECT_NAME and GITHUB_PROJECT_ORG identify repository
GITHUB_STATUS_TOKEN provides API access
Status context: gitlab-ci/<machine> or custom via CI_STATUS_CONTEXT
Machine checks must set ASSOCIATED_CHILD_PIPELINE to use the same context
as build-and-test status reports.
This creates separate status checks per machine on GitHub pull requests.
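How these variables could combine into a status report is sketched below. This is an assumption about the mechanics, not the components' actual code: the function names are hypothetical, and the endpoint is GitHub's commit status REST API. The sketch only builds the request body; report_status is defined but never called.

```shell
#!/bin/sh
# status_context MACHINE: default to gitlab-ci/<machine>, unless
# CI_STATUS_CONTEXT is set to a custom value.
status_context() {
    printf '%s\n' "${CI_STATUS_CONTEXT:-gitlab-ci/$1}"
}

# status_payload STATE MACHINE: JSON body for a GitHub commit status.
# STATE is one of: error, failure, pending, success.
status_payload() {
    printf '{"state": "%s", "context": "%s"}\n' "$1" "$(status_context "$2")"
}

# report_status STATE MACHINE SHA: send the status with curl.
# Defined for illustration only; it is not called in this sketch.
report_status() {
    curl --silent --request POST \
        --header "Authorization: Bearer ${GITHUB_STATUS_TOKEN}" \
        --header "Accept: application/vnd.github+json" \
        --data "$(status_payload "$1" "$2")" \
        "https://api.github.com/repos/${GITHUB_PROJECT_ORG}/${GITHUB_PROJECT_NAME}/statuses/$3"
}

# Print the body that would be sent for a pending Dane pipeline.
status_payload pending dane
```

Since the context string differs per machine, GitHub shows each report as a separate check on the same commit.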
Reproducers¶
Jobs print reproducer instructions to help debug failures locally:
### CI job ${CI_JOB_ID} reproducer on dane (${SYS_TYPE})
working_dir=...
mkdir -p $working_dir
cd $working_dir
git clone ...
[...]
# Commands to recreate environment and run test
The reproducer includes the clone, environment setup, allocation parameters, and job command.
File Organization¶
Repository README¶
Location: README.md (repository root)
The README serves dual purposes:
GitLab CI/CD Catalog entry:
- Summary of component capabilities
- Table of contents (for multiple components)
- ## Components section with subsections per component
- Each component: description, usage example, link to published component
- ## Contribute section
- Important: Do not duplicate input documentation (inputs appear automatically on component pages)
GitHub repository landing page:
- Overview for potential users
- Installation instructions
- Links to resources
GitLab uses README content when displaying the project in CI/CD Catalog. Follow GitLab’s guidelines for component project README structure.
Customization Templates¶
Location: customization/
Template files users copy and customize:
gitlab-ci.yml - Main CI file template
subscribed-pipelines.yml - Machine triggers (legacy)
custom-jobs-and-variables.yml - Customization template (legacy)
jobs/*.yml - Per-machine job templates
These files show recommended patterns but users can modify as needed.
Examples¶
Location: examples/
Complete working examples demonstrating:
Component-based setup
Job customization
Machine-specific configurations
Performance testing
Examples use current best practices and the latest version.
Documentation¶
Location: docs/sphinx/
Sphinx documentation organized as:
getting_started/ - New user guides
user_guide/ - Setup and usage
reference/ - Component API reference
dev_guide/ - Developer documentation
Built with Sphinx and published to ReadTheDocs.
Design Decisions¶
Why Components?¶
Components provide several advantages over legacy includes:
- Type Safety
Input validation catches configuration errors before pipeline runs
- Catalog Integration
Components appear in GitLab CI/CD Catalog with documentation
- Cleaner Syntax
Named inputs clearer than variable-based configuration
- Versioning
The @version syntax makes the version explicit in the component reference
Why Parent-Child Pipelines?¶
Parent-child pattern chosen for:
- Independent Reporting
Each machine reports separately to GitHub
- Parallel Execution
Machines run concurrently without blocking
- Configuration and UI Scalability
Easy to add new machines, new jobs, and clear display of status
Why Shared Allocations?¶
Shared allocations offer:
- Faster Job Startup
Jobs run immediately within existing allocation
- Resource Efficiency
One allocation shared across jobs, with controlled concurrency
- Ease of configuration
Only the resources and duration of the shared allocation need to be specified, not those of each individual job
Trade-off: All jobs must complete within allocation time limit.
Individual allocations are the alternative when jobs have different resource needs.
Implementation Details¶
Machine Check Pattern¶
Machine checks verify availability before triggering child pipeline:
dane-up-check:
  extends: [.dane, .machine-check]
  variables:
    ASSOCIATED_CHILD_PIPELINE: "dane-build-and-test"
Check uses ASSOCIATED_CHILD_PIPELINE to link with child pipeline status
on GitHub.
If the machine is down, the check fails and the child pipeline is not triggered, saving resources.
Allocation Naming¶
Shared allocations use unique names to allow multiple pipelines on same machine:
ALLOC_NAME="${ALLOC_NAME:-${CI_PROJECT_NAME}_ci_${CI_PIPELINE_ID}}"
This prevents conflicts when multiple pipelines run concurrently.
Jobs find allocation by name rather than storing ID in artifacts.
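The effect of the expansion above can be checked in isolation. In this small sketch the project name and pipeline IDs are placeholder values (GitLab sets CI_PROJECT_NAME and CI_PIPELINE_ID in real jobs), and alloc_name is a hypothetical wrapper around the same expression.

```shell
#!/bin/sh
# alloc_name: same expansion as above, wrapped in a function.
alloc_name() {
    printf '%s\n' "${ALLOC_NAME:-${CI_PROJECT_NAME}_ci_${CI_PIPELINE_ID}}"
}

unset ALLOC_NAME
CI_PROJECT_NAME="myproject"   # placeholder; set by GitLab in CI

# Two pipelines of the same project get distinct allocation names.
CI_PIPELINE_ID=1001
FIRST="$(alloc_name)"
CI_PIPELINE_ID=1002
SECOND="$(alloc_name)"
echo "$FIRST $SECOND"

# A name set beforehand takes precedence over the generated default.
ALLOC_NAME="my-custom-name"
alloc_name
```

The pipeline ID is what keeps the default names distinct; a name set beforehand overrides the default, so uniqueness is then up to whoever sets it.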
Stage Organization¶
Child pipelines have three job stages between allocation and release:
jobs-stage-1 - Default, most jobs run here
jobs-stage-2 - Jobs that depend on stage-1
jobs-stage-3 - Jobs that depend on stage-2
This allows users to define job dependencies without modifying stage list.
RSCI Variable Prefix¶
Components set variables with RSCI_ prefix (v2026.02.1+) to avoid conflicts
with user variables.
Users should not set RSCI_ variables themselves.
See Also¶
Developer How-To - Step-by-step procedures for common tasks
Contributing - Contribution guidelines
Developer Troubleshooting - Common development issues