alpine aarch64 build hanging downloading jar #3245

Closed
andrew-m-leonard opened this issue Nov 8, 2023 · 15 comments · Fixed by #3252

@andrew-m-leonard
Contributor

andrew-m-leonard commented Nov 8, 2023

The jdk21u alpine aarch64 linux builds are hanging during download of jackson-core.jar:
https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk21u/job/jdk21u-alpine-linux-aarch64-temurin/2/console
Possibly specific to node dockerhost-equinix-ubuntu2204-armv8-1

16:06:12  download-jackson-core:
16:06:12       [echo] Downloading jackson-core
16:06:12       [echo] Executing macro download-file
16:06:12       [echo] File to download: https://ci.adoptium.net/view/all/job/build.getDependency/lastSuccessfulBuild/artifact/sbom_dependencies/jackson-core.jar
16:06:12       [echo] Destination: build/jar/jackson-core.jar
16:06:12       [echo] Download tool: curl

steelhead31 self-assigned this Nov 10, 2023

@sxa
Member

sxa commented Nov 10, 2023

I note that the CycloneDX bits are running ant using JDK 11, which I've generally found to be unreliable (we stopped using it for the Jenkins agents...), so that may be related. At the point where it hangs there's no obvious sign of a curl process on the machine (I'm assuming it calls the curl utility as opposed to linking directly against libcurl).

@steelhead31
Contributor

This appears to work OK when I run a container with the same base image on the same host. I'll continue investigating.

967687c0ff22:~/temurin-build/build-farm$ curl https://ci.adoptium.net/view/all/job/build.getDependency/lastSuccessfulBuild/artifact/sbom_dependencies/jackson-core.jar --output bob.jar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  448k  100  448k    0     0   357k      0  0:00:01  0:00:01 --:--:--  357k

@sxa
Member

sxa commented Nov 13, 2023

Yeah it doesn't seem to get as far as actually calling curl from the ant file.
While it might be interesting to swap in another provider's JDK11 and see if the problem still occurs, I feel the right answer here is to make 17 (maybe 21 in future) the default JDK on our Alpine images.
I'm not sure if it's the recommended way to do it (I feel Alpine probably has an equivalent of update-alternatives, but I'm not sure what it is), but we could run ln -sf jdk-17 /usr/lib/jvm/default_jvm after installing Temurin 17. A better solution, though, would be to install only openjdk17 and not openjdk11.

I tried running a test and managed to overload the machine a little so I haven't got proof that what I've said above will fix it yet 😇
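
The symlink approach suggested above could be wrapped in a small helper. This is only a sketch: the function name `set_default_jvm` is hypothetical, and the JVM root and JDK directory name are assumptions that would need checking against the actual Alpine image.

```shell
# Hypothetical helper: repoint the default JVM symlink at a chosen JDK directory.
# The root (e.g. /usr/lib/jvm) and the name (e.g. jdk-17) are assumptions.
set_default_jvm() {
  jvm_root="$1"
  jdk_name="$2"
  if [ ! -d "${jvm_root}/${jdk_name}" ]; then
    echo "No such JDK directory: ${jvm_root}/${jdk_name}" >&2
    return 1
  fi
  # -n replaces an existing default_jvm symlink instead of creating
  # a new link inside the directory it currently points to
  ln -sfn "${jdk_name}" "${jvm_root}/default_jvm"
}
```

It would be called as `set_default_jvm /usr/lib/jvm jdk-17` after the JDK is installed. The `-n` flag matters when `default_jvm` already points at another JDK directory.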

@sxa
Member

sxa commented Nov 13, 2023

Hmmm, I've just tried again and been able to run make-adopt-build-farm.sh in docker containers with both JDK11 and 17 as the first in the PATH, and they both worked 🙄

But I am a little worried about the fact that the machine does appear to be getting slow - we should perhaps keep an eye on the load on it through the day to see if it behaves

@steelhead31
Contributor

I'm 90% certain now that it relates to the JDK being used to run the CycloneDX bits: when running with the jdk11 on the container created by the Jenkins build it hangs; when I switch it to use jdk21 (the first official release for Alpine on aarch64) it works fine.

@sxa
Member

sxa commented Nov 14, 2023

I was convinced of that too until I started up a container and ran a build myself with the default jdk11 in the past and it seemed to work ok which somewhat confused me ... Unless that test was invalid somehow .. Or I just got REALLY lucky since 11 was quite temperamental. May be worth trying that test yourself standalone (outside Jenkins)

@sxa
Member

sxa commented Nov 14, 2023

FYI it's the hangs that mean I'm not considering releasing a Temurin11 on that platform

@sxa
Member

sxa commented Nov 14, 2023

> Yup, I've just run a build with jdk17 as the default, and that has downloaded 3 or 4 jars ok, then hung...

OK THAT'S odd ...17 has been pretty reliable in my experience. I was tempted to adjust the code to print out java -version before invoking ant for the CycloneDX stuff, just to be 100% certain which JDK it's using ...
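
Printing the Java version before the ant step could look something like the sketch below. The function name `report_java` is hypothetical, and the lookup order (`JAVA_HOME` first, then `PATH`) is an assumption about how the launcher resolves Java rather than something confirmed in this thread.

```shell
# Hypothetical helper: report which java would be picked up before a build step.
# Assumes JAVA_HOME/bin/java is preferred, falling back to the first java on PATH.
report_java() {
  if [ -n "${JAVA_HOME:-}" ] && [ -x "${JAVA_HOME}/bin/java" ]; then
    java_bin="${JAVA_HOME}/bin/java"
  else
    java_bin="$(command -v java || true)"
  fi
  if [ -z "${java_bin}" ]; then
    echo "No java found via JAVA_HOME or PATH" >&2
    return 1
  fi
  echo "Using java: ${java_bin}"
  # java -version writes to stderr, so merge it into stdout for the build log
  "${java_bin}" -version 2>&1
}
```

Calling `report_java` immediately before the ant invocation would put both the resolved path and the version string into the console log.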

@steelhead31
Contributor

> Yup, I've just run a build with jdk17 as the default, and that has downloaded 3 or 4 jars ok, then hung...
>
> OK THAT'S odd ...17 has been pretty reliable in my experience. I was tempted to adjust the code to print out java -version before invoking ant for the CycloneDX stuff, just to be 100% certain which JDK it's using ...

Yup, it was still defaulting to 11. I have a run in progress that should hopefully be more productive.

@steelhead31
Contributor

I've tried with several versions of Java (11, 17 and 21); all show the same intermittent hangs, at different points during the curl downloads of individual dependencies.

@steelhead31
Contributor

steelhead31 commented Nov 14, 2023

The issue is definitely tied to JDK 11 being present. In particular, I suspect this piece of code from build.sh:

setupAntEnv() {
  local javaHome=""

  if [ ${JAVA_HOME+x} ] && [ -d "${JAVA_HOME}" ]; then
    javaHome=${JAVA_HOME}
  elif [ ${JDK17_BOOT_DIR+x} ] && [ -d "${JDK17_BOOT_DIR}" ]; then
    javaHome=${JDK17_BOOT_DIR}
  elif [ ${JDK8_BOOT_DIR+x} ] && [ -d "${JDK8_BOOT_DIR}" ]; then
    javaHome=${JDK8_BOOT_DIR}
  elif [ ${JDK11_BOOT_DIR+x} ] && [ -d "${JDK11_BOOT_DIR}" ]; then
    javaHome=${JDK11_BOOT_DIR}
  elif [ ${BUILD_CONFIG[JDK_BOOT_DIR]+x} ] && [ -d "${BUILD_CONFIG[JDK_BOOT_DIR]}" ]; then
  # fall back to use JDK_BOOT_DIR which is set in make-adopt-build-farm.sh
    javaHome="${BUILD_CONFIG[JDK_BOOT_DIR]}"
  else
    echo "Unable to find a suitable JAVA_HOME to build the cyclonedx-lib"
    exit 2
  fi
  echo "${javaHome}"
}
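
The precedence in that function can be exercised in isolation. The sketch below is a trimmed copy of its first two branches under a hypothetical name, `pick_ant_java_home`, run against throwaway directories. It only illustrates the selection order and is not evidence that this was the cause of the hang.

```shell
# Trimmed, hypothetical variant of the first branches of setupAntEnv, showing that
# an inherited JAVA_HOME is chosen ahead of JDK17_BOOT_DIR.
# It tests for a non-empty value rather than ${VAR+x}; the -d check makes the
# outcome the same for an empty variable.
pick_ant_java_home() {
  if [ -n "${JAVA_HOME:-}" ] && [ -d "${JAVA_HOME}" ]; then
    echo "${JAVA_HOME}"
  elif [ -n "${JDK17_BOOT_DIR:-}" ] && [ -d "${JDK17_BOOT_DIR}" ]; then
    echo "${JDK17_BOOT_DIR}"
  else
    echo "Unable to find a suitable JAVA_HOME" >&2
    return 2
  fi
}

# Throwaway directories standing in for real JDK installs
demo_root="$(mktemp -d)"
mkdir "${demo_root}/jdk-11" "${demo_root}/jdk-17"

# JAVA_HOME pointing at the JDK 11 directory: it is selected first
JAVA_HOME="${demo_root}/jdk-11" JDK17_BOOT_DIR="${demo_root}/jdk-17" pick_ant_java_home

# With JAVA_HOME unset, the JDK 17 boot directory is selected instead
( unset JAVA_HOME; JDK17_BOOT_DIR="${demo_root}/jdk-17" pick_ant_java_home )
```

The first call prints the `jdk-11` path and the second prints the `jdk-17` path, so any `JAVA_HOME` that leaks into the build environment overrides the JDK 17 boot directory.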

@steelhead31
Contributor

OK, regardless of what is set in the image, I think the docker build of the image replaces the jdk17 home with the default one of the Java agent running on Jenkins. I think this is why the build works on dockerhost-equinix-ubuntu2004-armv8-1 and not on dockerhost-equinix-ubuntu2204-armv8-1.

@steelhead31
Contributor

Ok, so changing the jdk used to run the jenkins agent makes no difference, but removing jdk11 from the alpine build image works.
