Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPIR-V] Add pre-commit CI workflow #74092

Merged
merged 6 commits into from
Dec 7, 2023
Merged

Conversation

sudonatalie
Copy link
Member

Add a pre-commit CI workflow for the experimental SPIR-V backend. This action should only run when SPIR-V target or test files are modified. The codegen-spirv tests don't run as part of check-all because the SPIR-V backend is still experimental.

Depends on #73371 (for a green tree)

@sudonatalie sudonatalie requested a review from tstellar December 1, 2023 15:39
@sudonatalie
Copy link
Member Author

@tstellar We'd like to set up a precommit CI on this experimental backend, so I'm looking for feedback on the right way to accomplish that. I see that this workflow is currently only used for release checks. Does this seem like a reasonable use too or should we look into other options like Buildkite?

Add a pre-commit CI workflow for the experimental SPIR-V backend. This
action should only run when SPIR-V target or test files are modified.
The `codegen-spirv` tests don't run as part of `check-all` because the
SPIR-V backend is still experimental.

Depends on llvm#73371 (for a green tree)
Copy link
Collaborator

@tstellar tstellar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We depend a lot on ccache to get faster builds, do you know if adding DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=spirv makes any changes to header files that may invalidate the ccache data?

Would it make sense to start by having this job manually triggered either by a comment or a label?

How important is it to run this job on Windows and Mac OS? We are resource constrained on those OSes.

.github/workflows/llvm-project-tests.yml Outdated Show resolved Hide resolved
.github/workflows/spirv-tests.yml Outdated Show resolved Hide resolved
@sudonatalie sudonatalie force-pushed the pr-action branch 5 times, most recently from 27b204c to 7a39e34 Compare December 5, 2023 20:54
@sudonatalie
Copy link
Member Author

@tstellar Thanks for taking a look!

We depend a lot on ccache to get faster builds, do you know if adding DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=spirv makes any changes to header files that may invalidate the ccache data?

Do you have any suggestions for how I'd go about checking this? I've never used ccache before, but I just ran a couple tests with it using separate clean build directories and -DCMAKE_CXX_COMPILER=/usr/lib/ccache/clang++ -DCMAKE_C_COMPILER=/usr/lib/ccache/clang but the builds take ~the same time with the same CMake args regardless of whether the cache is cleaned or populated.

Would it make sense to start by having this job manually triggered either by a comment or a label?

In my opinion, I think the benefit of these checks would be that they'd run automatically, so people touching SPIR-V backend files get that feedback on their PR even if they forget/don't know that this bot exists. However, I understand we're using shared resources here, so if a manual trigger is a necessary compromise to start with then I'd certainly rather that than nothing.

How important is it to run this job on Windows and Mac OS? We are resource constrained on those OSes.

Much less important since the behavior shouldn't diverge on different platforms, so Linux only is fine. Still working on figuring out how to configure that with GitHub actions though, my first several attempts to override runs-on from the calling workflow were unsuccessful so I'm going to try to find a better way to test it than repeatedly force-pushing here.

@sudonatalie sudonatalie marked this pull request as ready for review December 5, 2023 22:37
@llvmbot
Copy link
Member

llvmbot commented Dec 5, 2023

@llvm/pr-subscribers-github-workflow

Author: Natalie Chouinard (sudonatalie)

Changes

Add a pre-commit CI workflow for the experimental SPIR-V backend. This action should only run when SPIR-V target or test files are modified. The codegen-spirv tests don't run as part of check-all because the SPIR-V backend is still experimental.

Depends on #73371 (for a green tree)


Full diff: https://github.com/llvm/llvm-project/pull/74092.diff

2 Files Affected:

  • (modified) .github/workflows/llvm-project-tests.yml (+20-9)
  • (added) .github/workflows/spirv-tests.yml (+29)
diff --git a/.github/workflows/llvm-project-tests.yml b/.github/workflows/llvm-project-tests.yml
index 996cfe41f047f..6751bde4a11a9 100644
--- a/.github/workflows/llvm-project-tests.yml
+++ b/.github/workflows/llvm-project-tests.yml
@@ -10,6 +10,11 @@ on:
         required: false
       projects:
         required: false
+      extra_cmake_args:
+        required: false
+      os_list:
+        required: false
+        default: '["ubuntu-latest", "windows-2019", "macOS-11"]'
   workflow_call:
     inputs:
       build_target:
@@ -20,6 +25,19 @@ on:
         required: true
         type: string
 
+      extra_cmake_args:
+        required: false
+        type: string
+
+      os_list:
+        required: false
+        type: string
+        # Use windows-2019 due to:
+        # https://developercommunity.visualstudio.com/t/Prev-Issue---with-__assume-isnan-/1597317
+        # We're using a specific version of macOS due to:
+        # https://github.com/actions/virtual-environments/issues/5900
+        default: '["ubuntu-latest", "windows-2019", "macOS-11"]'
+
 concurrency:
   # Skip intermediate builds: always.
   # Cancel intermediate builds: only if it is a pull request build.
@@ -35,14 +53,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        os:
-          - ubuntu-latest
-          # Use windows-2019 due to:
-          # https://developercommunity.visualstudio.com/t/Prev-Issue---with-__assume-isnan-/1597317
-          - windows-2019
-          # We're using a specific version of macOS due to:
-          # https://github.com/actions/virtual-environments/issues/5900
-          - macOS-11
+        os: ${{ fromJSON(inputs.os_list) }}
     steps:
       - name: Setup Windows
         if: startsWith(matrix.os, 'windows')
@@ -85,7 +96,7 @@ jobs:
           # This should be a no-op for non-mac OSes
           PKG_CONFIG_PATH: /usr/local/Homebrew/Library/Homebrew/os/mac/pkgconfig//12
         with:
-          cmake_args: '-GNinja -DLLVM_ENABLE_PROJECTS="${{ inputs.projects }}" -DCMAKE_BUILD_TYPE=Release -DLLDB_INCLUDE_TESTS=OFF -DCMAKE_C_COMPILER_LAUNCHER=sccache -DCMAKE_CXX_COMPILER_LAUNCHER=sccache'
+          cmake_args: '-GNinja -DLLVM_ENABLE_PROJECTS="${{ inputs.projects }}" -DCMAKE_BUILD_TYPE=Release -DLLDB_INCLUDE_TESTS=OFF -DCMAKE_C_COMPILER_LAUNCHER=sccache -DCMAKE_CXX_COMPILER_LAUNCHER=sccache ${{ inputs.extra_cmake_args }}'
           build_target: '${{ inputs.build_target }}'
 
       - name: Build and Test libclc
diff --git a/.github/workflows/spirv-tests.yml b/.github/workflows/spirv-tests.yml
new file mode 100644
index 0000000000000..0e1d0caca410c
--- /dev/null
+++ b/.github/workflows/spirv-tests.yml
@@ -0,0 +1,29 @@
+name: SPIR-V Tests
+
+permissions:
+  contents: read
+
+on:
+  workflow_dispatch:
+  pull_request:
+    paths:
+      - 'llvm/lib/Target/SPIRV/**'
+      - 'llvm/test/CodeGen/SPIRV/**'
+      - '.github/workflows/spirv-tests.yml'
+
+concurrency:
+  # Skip intermediate builds: always.
+  # Cancel intermediate builds: only if it is a pull request build.
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
+
+jobs:
+  check_spirv:
+    if: github.repository_owner == 'llvm'
+    name: Test SPIR-V
+    uses: ./.github/workflows/llvm-project-tests.yml
+    with:
+      build_target: check-llvm-codegen-spirv
+      projects:
+      extra_cmake_args: '-DLLVM_TARGETS_TO_BUILD="" -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD="SPIRV"'
+      os_list: '["ubuntu-latest"]'

@tstellar
Copy link
Collaborator

tstellar commented Dec 6, 2023

Do you have any suggestions for how I'd go about checking this? I've never used ccache before, but I just ran a couple tests with it using separate clean build directories and -DCMAKE_CXX_COMPILER=/usr/lib/ccache/clang++ -DCMAKE_C_COMPILER=/usr/lib/ccache/clang but the builds take ~the same time with the same CMake args regardless of whether the cache is cleaned or populated.

I think the way to test it would be:

# Configure with default targets:
cmake -G Ninja -S llvm -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=/usr/lib/ccache/clang++ -DCMAKE_C_COMPILER=/usr/lib/ccache/clang

# Build llvm:
ninja -C build

# Save ccache stats
ccache -s > phase1.stats

# Rebuild llvm:
ninja -C build

# Save ccache stats
ccache -s > phase2.stats

# Edit CMakeCache.txt and enable SPIRV
vim build/CMakeCache.txt
-> LLVM_EXPERIMENTAL_TARGETS_TO_BUILD:STRING=SPIRV

# Rebuild llvm:
ninja -C build

# Save ccache stats:
ccache -s > phase3.stats

If it worked, when you compare the phase2.stats and phase3.stats output it should should mostly an increase in the Hits category. I did this test locally and it appears that adding -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD:STRING=SPIRV does not invalidate the cache, which is good, but the SPIRV backend failed to build, so you might want to double check my work here.

@sudonatalie
Copy link
Member Author

If it worked, when you compare the phase2.stats and phase3.stats output it should should mostly an increase in the Hits category. I did this test locally and it appears that adding -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD:STRING=SPIRV does not invalidate the cache, which is good, but the SPIRV backend failed to build, so you might want to double check my work here.

Looks the same for me and the build succeeded.

@sudonatalie
Copy link
Member Author

Looks the same for me and the build succeeded.

Actually I just pulled upstream and the SPIR-V backend is broken at tip-of-tree at the moment. Fix in #74660.

Copy link
Collaborator

@tstellar tstellar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the caveat that if this ends up being too resource intensive, we may need to switch it to be manually triggered or disable it until we can get more resources.

@sudonatalie sudonatalie merged commit 155a013 into llvm:main Dec 7, 2023
3 of 4 checks passed
@sudonatalie sudonatalie deleted the pr-action branch December 7, 2023 14:42
@michalpaszkowski
Copy link
Member

I noticed that some recent runs are failing (e.g. this one) on the Setup ccache step:

/usr/bin/docker exec  9478d844c1eb517a405cdbc48edf97b2a87b3e96623dd6b74fc397265a41e6cc sh -c "cat /etc/*release | grep ^ID"
Install sccache
  /usr/bin/sh -xc curl -L 'https://github.com/mozilla/sccache/releases/download/v0.7.6/sccache-v0.7.6-x86_64-unknown-linux-musl.tar.gz' | tar xzf - -O --wildcards '*/sccache' > '/usr/local/bin/sccache'
  + curl -L https://github.com/mozilla/sccache/releases/download/v0.7.6/sccache-v0.7.6-x86_64-unknown-linux-musl.tar.gz
  /usr/bin/sh: 1: cannot create /usr/local/bin/sccache: Permission denied
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                   Dload  Upload   Total   Spent    Left  Speed
  
    0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  
    0 9007k    0  4937    0     0  14185      0  0:10:50 --:--:--  0:10:50 14185
  curl: ([23](https://github.com/llvm/llvm-project/actions/runs/12620519287/job/35166312082#step:7:24)) Failure writing output to destination
  Error: Restoring cache failed: Error: The process '/usr/bin/sh' failed with exit code 2

@tstellar Do you have any suggestions what might be causing this issue?

@tstellar
Copy link
Collaborator

tstellar commented Jan 6, 2025

@michalpaszkowski I have not seen this before.

@boomanaiden154
Copy link
Contributor

We recently switched the container to use a non-root user by default. I'm guessing that is what's causing the issue as the default user will not have permission to install applications.

@VyacheslavLevytskyy
Copy link
Contributor

This now always leads to fails in Github Actions for linux runs, and this doesn't depend on the sub-project. I can see the same fail here, for example: https://github.com/llvm/llvm-project/actions/runs/12631817480/job/35194267947

I guess it's a blocker now that causes failures in every sub-project that relies on .github/workflows/llvm-project-tests.yml. @boomanaiden154 @tstellar May I ask your advice, what you consider the best way forward to fix/mitigate this?

@tstellar
Copy link
Collaborator

tstellar commented Jan 7, 2025

It looks like the fix may be to give the container user write access to /usr/local/. For workarounds, if you switch to ccache and pre-install it using apt before the action runs, I think that will work.

@boomanaiden154
Copy link
Contributor

It's probably better to just add the gha to the sudoers file and give it passwordless sudo access and then modify the other actions to install things with sudo. I can work on a patch and hopefully get it up by EOD.

@VyacheslavLevytskyy
Copy link
Contributor

VyacheslavLevytskyy commented Jan 8, 2025

It's probably better to just add the gha to the sudoers file and give it passwordless sudo access and then modify the other actions to install things with sudo. I can work on a patch and hopefully get it up by EOD.

@boomanaiden154 Can we give the container user write access to /usr/local/ to proceed with the simplest solution?

@VyacheslavLevytskyy
Copy link
Contributor

@boomanaiden154 May I ask you what do you think about this idea to give the container user write access to /usr/local/ ?

@boomanaiden154
Copy link
Contributor

May I ask you what do you think about this idea to give the container user write access to /usr/local/ ?

#122221 already fixed the issue.

@VyacheslavLevytskyy
Copy link
Contributor

May I ask you what do you think about this idea to give the container user write access to /usr/local/ ?

#122221 already fixed the issue.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants