.. ##
.. ## Copyright (c) 2026, Lawrence Livermore National Security, LLC and
.. ## other RADIUSS Project Developers. See the top-level COPYRIGHT file for
.. ## details.
.. ##
.. ## SPDX-License-Identifier: (MIT)
.. ##

.. _choosing-your-path-label:

*******************
Choosing Your Path
*******************

After completing the :doc:`five-minute-setup`, this guide helps you decide
which machines, components, and features to enable for your project.

.. _components-vs-legacy:

========================
Quick Decision Tree
========================

.. code-block:: text

   ┌─ Using GitLab < 17.0?
   │  └→ Use legacy include-based approach
   │     See: user_guide/setup-legacy
   │
   ├─ Using GitLab 17.0+?
   │  ├─ New project?
   │  │  └→ Use components (you're on the right path!)
   │  │
   │  └─ Existing project with legacy setup?
   │     └→ Migrate to components
   │        See: user_guide/components_migration
   │
   └─ What do you need?
      ├─ Build and test only → Machine components
      ├─ Performance tracking → + performance-pipeline
      ├─ Skip draft PRs → + utility-draft-pr-filter
      └─ Skip non-PR branches → + utility-branch-skip

=========================
Choosing Machines
=========================

RADIUSS Shared CI supports multiple LC machines. Choose based on your needs:

Available Machines
==================

.. list-table::
   :header-rows: 1
   :widths: 15 15 20 50

   * - Machine
     - Scheduler
     - Architecture
     - Best For
   * - **Dane**
     - SLURM
     - Intel Sapphire Rapids
     - CPU-only development, build and test workloads
   * - **Matrix**
     - SLURM
     - Intel Sapphire Rapids + NVIDIA H100 GPUs
     - CUDA development, H100 GPU testing
   * - **Corona**
     - Flux
     - AMD Rome + AMD MI50 GPUs
     - ROCm development, AMD GPU testing
   * - **Tioga**
     - Flux
     - AMD Trento + AMD MI250X GPUs
     - ROCm development, production-like environment
   * - **Tuolumne**
     - Flux
     - AMD EPYC + AMD MI300A APUs
     - ROCm development, latest AMD hardware

Decision Criteria
=================

**Q: Which machine should I start with?**

Start with **one machine** where your team already has:

- Working build process
- Allocation or queue access
- Familiarity with the environment

Common starting points:

- **CUDA projects**: Start with Matrix
- **ROCm projects**: Start with Tioga or Tuolumne
- **CPU-only projects**: Prefer Dane, which has no GPUs

**Q: How many machines should I enable?**

Start small, expand gradually:

1. **One machine** - Get CI working reliably
2. **Two machines** - Add a second architecture (e.g., CUDA + ROCm)
3. **Multiple machines** - Full coverage once CI is stable

**Q: Should I enable all machines?**

No! Only enable machines where:

- You have active development
- You have allocation/access
- The architecture matters for your project

Each additional machine increases CI time and resource usage.

Adding Machines
===============

To add a machine, add its check and trigger jobs to ``.gitlab-ci.yml``:

.. code-block:: yaml

   # Add to existing stages
   stages:
     - prerequisites
     - build-and-test

   # Add machine check
   dane-up-check:
     extends: [.dane, .machine-check]
     variables:
       ASSOCIATED_CHILD_PIPELINE: "dane-build-and-test"

   # Add machine pipeline
   dane-build-and-test:
     needs: [dane-up-check]
     extends: [.dane, .build-and-test]
     trigger:
       include:
         - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/dane-pipeline@v2026.02.2
           inputs:
             job_cmd: $JOB_CMD
             shared_alloc: "--reservation=ci --exclusive --nodes=1 --time=30"
             job_alloc: "--reservation=ci --overlap --nodes=1"
             github_project_name: $GITHUB_PROJECT_NAME
             github_project_org: $GITHUB_PROJECT_ORG
         - local: '.gitlab/jobs/dane.yml'

Then create ``.gitlab/jobs/dane.yml`` with your jobs.
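
As a rough illustration, a job file could look like the sketch below. The
``.job_on_dane`` template name, the job names, and the ``COMPILER`` variable
are assumptions made for this example; check the quick reference for the
templates the Dane component actually provides.

.. code-block:: yaml

   # .gitlab/jobs/dane.yml (illustrative sketch, not a definitive layout)
   # Assumption: the Dane component exposes a ".job_on_dane" template.
   gcc-build:
     extends: [.job_on_dane]
     variables:
       COMPILER: "gcc"    # hypothetical variable read by your job_cmd script

   clang-build:
     extends: [.job_on_dane]
     variables:
       COMPILER: "clang"  # hypothetical variable read by your job_cmd script

Each job here differs only in the variables it passes, so the build logic
stays in the script referenced by ``job_cmd``.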

See: :doc:`../user_guide/quick-reference` for all machine configurations.

Temporarily Disabling Machines
===============================

To temporarily disable a machine without removing its configuration:

.. code-block:: yaml

   variables:
     ON_DANE: "OFF"  # Disables Dane CI

This is useful during:

- Machine outages
- Debugging issues on specific machines
- Testing changes on a subset of machines

=============================
Choosing Utility Components
=============================

Utility components add optional behavior to your pipeline.

utility-draft-pr-filter
=======================

**Purpose**: Skip CI on draft pull requests to save resources.

**When to use**:

- Your team uses GitHub draft PRs
- You want to avoid running CI until a PR is ready for review
- You want to save CI time and machine resources

**How it works**:

1. Checks whether the current commit belongs to a draft PR
2. If yes, reports "Draft PR - CI skipped" to GitHub and exits
3. If no, continues with normal CI

**Adding it**:

.. code-block:: yaml

   include:
     - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/utility-draft-pr-filter@v2026.02.2
       inputs:
         github_token: $GITHUB_STATUS_TOKEN
         github_project_name: $GITHUB_PROJECT_NAME
         github_project_org: $GITHUB_PROJECT_ORG
         always_run_pattern: "^(main|develop)$"  # Optional: branches that always run

**Configuration**:

- ``always_run_pattern``: Regex matching branches that bypass the filter and
  always run CI (e.g., protected branches should run even when the commit
  comes from a draft PR)

utility-branch-skip
===================

**Purpose**: Skip CI on branches that aren't associated with a pull request.

**When to use**:

- You only want CI on PRs, not on random branch pushes
- You want to reduce noise from experimental branches
- You have many feature branches

**How it works**:

1. Checks whether the current branch has an open PR
2. If there is no PR, reports "Not a PR - CI skipped" and exits
3. If PR exists, continues with normal CI

**Adding it**:

.. code-block:: yaml

   include:
     - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/utility-branch-skip@v2026.02.2
       inputs:
         github_token: $GITHUB_STATUS_TOKEN
         github_project_name: $GITHUB_PROJECT_NAME
         github_project_org: $GITHUB_PROJECT_ORG

.. warning::
   This will skip CI on your main/develop branches unless they have PRs.
   Consider your workflow before enabling.

.. note::
   The draft PR filter and the branch skip utility can be used together,
   depending on your needs.
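
A combined setup might look like the following sketch, which simply merges
the two ``include`` entries shown above into one list. The order of the
entries is an assumption; how the two filters interact is not specified here,
so verify the behavior on a test branch.

.. code-block:: yaml

   include:
     # Skip pushes that are not associated with a pull request
     - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/utility-branch-skip@v2026.02.2
       inputs:
         github_token: $GITHUB_STATUS_TOKEN
         github_project_name: $GITHUB_PROJECT_NAME
         github_project_org: $GITHUB_PROJECT_ORG
     # Skip pull requests that are still marked as draft
     - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/utility-draft-pr-filter@v2026.02.2
       inputs:
         github_token: $GITHUB_STATUS_TOKEN
         github_project_name: $GITHUB_PROJECT_NAME
         github_project_org: $GITHUB_PROJECT_ORG
         always_run_pattern: "^(main|develop)$"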

============================
Performance Pipeline
============================

**Purpose**: Run performance benchmarks and report results to GitHub.

**When to use**:

- You have performance-critical code
- You want to track performance over time
- You want to detect performance regressions in PRs

**Requirements**:

- Performance test script that produces results
- (Optional) Processing script to format results
- GitHub token with ``repo`` and ``workflow`` permissions for reporting

Decision Questions
==================

**Q: Should I enable performance testing?**

Consider enabling if:

- Performance is critical to your project
- You have dedicated performance benchmarks
- You can afford the additional CI time

Skip if:

- You're just getting started with CI
- You don't have performance tests yet
- CI time is already too long

.. note::
   To reduce the burden of running performance tests, see "How often should
   performance tests run?" below for scheduling options.

**Q: Which machines should run performance tests?**

Typically, run performance tests on:

- **Fewer machines** than regular CI (1-2 representative machines)
- **Consistent hardware** (same machine for trend tracking)
- **Production-like environments** (e.g. Tuolumne for ROCm)

**Q: How often should performance tests run?**

Common patterns:

- **On main/develop only**: Use rules to restrict to protected branches
- **Scheduled**: Use GitLab schedules for nightly performance runs
- **Manual on PRs**: Set ``when: manual`` for PR performance testing
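
The scheduled and manual patterns can be combined in one ``rules`` list. The
fragment below is a sketch that relies on GitLab's predefined
``CI_PIPELINE_SOURCE`` variable; it shows only the ``rules`` part of the job
and assumes the ``trigger`` section stays as in your existing configuration.

.. code-block:: yaml

   performance-measurements:
     extends: [.performance-measurements]
     rules:
       # Run automatically in scheduled (e.g., nightly) pipelines
       - if: '$CI_PIPELINE_SOURCE == "schedule"'
         when: on_success
       # Otherwise, offer the job as a manual action
       - when: manual
         allow_failure: true  # keeps an unstarted manual job from blocking the pipeline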

Adding Performance Pipeline
============================

Basic setup:

.. code-block:: yaml

   stages:
     - prerequisites
     - build-and-test
     - performance-measurements  # Add this stage

   performance-measurements:
     extends: [.performance-measurements]
     rules:
       # Only on main/develop, or manual on PRs
       - if: '$CI_COMMIT_BRANCH == "main" || $CI_COMMIT_BRANCH == "develop"'
         when: on_success
       - when: manual
     trigger:
       include:
         - component: $CI_SERVER_FQDN/radiuss/radiuss-shared-ci/performance-pipeline@v2026.02.2
           inputs:
             job_cmd: "./scripts/run-benchmarks.sh"
             dane_perf_alloc: "--reservation=ci --nodes=1 --exclusive --time=30"
             perf_processing_cmd: "./scripts/process-results.py"
             github_token: $GITHUB_STATUS_TOKEN
             github_project_name: $GITHUB_PROJECT_NAME
             github_project_org: $GITHUB_PROJECT_ORG
         - local: '.gitlab/jobs/performances.yml'

Then create ``.gitlab/jobs/performances.yml`` with performance jobs.

See: :doc:`../user_guide/quick-reference` for full performance pipeline reference.

=====================================
Service User vs Personal Account
=====================================

This choice is covered in :doc:`prerequisites`, but it is worth revisiting here:

**Use Service User If**:

- ✓ Multiple team members need to manage CI
- ✓ You want consistent permissions across runs
- ✓ Disk quota is a concern
- ✓ Project is long-term / production

**Use Personal Account If**:

- ✓ Solo developer / small project
- ✓ Quick prototyping / temporary project
- ✓ Can't get service account approved
- ✓ Okay with quota limitations

.. note::
   You can always migrate from personal to service account later by changing
   the ``LLNL_SERVICE_USER`` variable.
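
For illustration, the switch could be as small as the fragment below. The
account name is hypothetical, and placing the variable in ``.gitlab-ci.yml``
(rather than in the project's CI/CD settings) is an assumption; follow
:doc:`prerequisites` for the authoritative location.

.. code-block:: yaml

   variables:
     # Hypothetical service account name; replace with your approved account
     LLNL_SERVICE_USER: "myproject-ci"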

=====================
Allocation Strategy
=====================

Understanding Shared Allocations
=================================

On SLURM and Flux machines, you can use either of two allocation modes:

**Shared Allocation** (recommended):

- One allocation for all jobs on that machine
- Jobs run sequentially or in parallel within the allocation
- Easier to configure (adapt top-level allocation parameters as needed)
- Harder to understand resource usage (scheduling less transparent)

**Individual Allocations** (``shared_alloc: "OFF"``):

- Each job gets its own allocation
- Simpler to understand
- More work to maintain: for optimal use, you must define allocation
  parameters for each job

.. note::
   We recommend shared allocations for regular CI jobs, and individual
   allocations for performance jobs to avoid interference.

Choosing Allocation Size
=========================

Consider:

**Time**:

- Start conservative
- Monitor actual job duration
- Add buffer for variability
- Do not exceed the queue's time limit

**Resources**:

- Match your typical build (1 node usually sufficient)
- Over-allocate a shared allocation only to pack more jobs in parallel;
  otherwise the extra resources are wasted
- Can override per-job if needed

.. note::
   For flexibility, we advise against setting a time limit for jobs running
   under a shared allocation: the top-level allocation suffices.

Example configurations:
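
The two sketches below contrast the allocation modes using the Dane component
inputs from "Adding Machines". The shared-allocation values are copied from
that example; the ``job_alloc`` flags and the time value in the
individual-allocation variant are illustrative assumptions, not
recommendations.

.. code-block:: yaml

   # Shared allocation: one top-level allocation, jobs overlap inside it
   inputs:
     shared_alloc: "--reservation=ci --exclusive --nodes=1 --time=30"
     job_alloc: "--reservation=ci --overlap --nodes=1"  # no per-job time limit

.. code-block:: yaml

   # Individual allocations: each job requests its own resources
   inputs:
     shared_alloc: "OFF"
     job_alloc: "--reservation=ci --nodes=1 --time=15"  # assumed per-job limit

In the shared variant the time limit lives only on the top-level allocation,
in line with the note above; in the individual variant each job carries its
own limit.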

==============
Next Steps
==============

After deciding your configuration:

**Implement Your Choices**:

- Add chosen machines to ``.gitlab-ci.yml``
- Create job files for each machine
- Add utility components if desired
- Configure performance pipeline if needed

**Learn More**:

- **Detailed setup**: :doc:`../user_guide/setup-with-components`
- **How-to guides**: :doc:`../user_guide/how_to`
- **Quick reference**: :doc:`../user_guide/quick-reference`

**Get Help**:

- **Troubleshooting**: :doc:`../dev_guide/troubleshooting`
- **GitHub issues**: https://github.com/LLNL/radiuss-shared-ci/issues

.. seealso::

   - :doc:`../user_guide/concepts` - Architecture and design
   - :doc:`../user_guide/components_migration` - Migrate from legacy
   - ``examples/example-gitlab-ci.yml`` - Complete working example