The jmx_prometheus_javaagent does not work in the SparkApplication driver pod #75722

Open
nozhang opened this issue Dec 12, 2024 · 4 comments
Assignees
jotamartos
Labels
in-progress, spark, tech-issues (The user has a technical issue about an application)

Comments


nozhang commented Dec 12, 2024

Name and Version

bitnami/spark:3.5.3

What architecture are you using?

amd64

What steps will reproduce the bug?

I downloaded jmx_prometheus_javaagent-0.11.0.jar into the image:

FROM bitnami/spark:3.5.3
USER root

RUN mkdir -p /prometheus

ADD https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.11.0/jmx_prometheus_javaagent-0.11.0.jar /prometheus/
RUN chmod 644 /prometheus/jmx_prometheus_javaagent-0.11.0.jar

My SparkApplication.yaml looks like this:

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-sample-job
  namespace: spark3-jobs
spec:
  type: Scala
  mode: cluster
  image: above image link
  imagePullPolicy: Always
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.3.jar
  arguments:
  - "1000000"
  sparkVersion: "3.5.3"
  restartPolicy:
    type: Never
  driver:
    labels:
      version: "3.5.3"
...
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "8090"
      prometheus.io/path: "/metrics"
  executor:
    labels:
      version: "3.5.3"
...
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "8090"
      prometheus.io/path: "/metrics"
  monitoring:
    exposeDriverMetrics: true
    exposeExecutorMetrics: true
    prometheus:
      jmxExporterJar: /prometheus/jmx_prometheus_javaagent-0.11.0.jar
      port: 8090
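
For reference, below is a sketch of roughly what this monitoring block should translate to on the driver and executor JVMs. This is only an assumption for comparison against the ps output further down: the Spark Operator normally injects these options itself, and the config-file path /etc/metrics/conf/prometheus.yaml is assumed, not copied from my cluster.

# Assumed equivalent sparkConf (illustrative only; the operator is expected to generate this)
sparkConf:
  "spark.driver.extraJavaOptions": "-javaagent:/prometheus/jmx_prometheus_javaagent-0.11.0.jar=8090:/etc/metrics/conf/prometheus.yaml"
  "spark.executor.extraJavaOptions": "-javaagent:/prometheus/jmx_prometheus_javaagent-0.11.0.jar=8090:/etc/metrics/conf/prometheus.yaml"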

What is the expected behavior?

The driver pod should run jmx_prometheus_javaagent just as the executor pods do, with port 8090 listening, so that curl http://127.0.0.1:8090/metrics works inside the pod.

What do you see instead?

When I exec into the driver pod, port 8090 is not listening and the agent jar is not running:

$ metric-tests kubectl exec -it spark-sample-job-1-driver  -nspark3-jobs -- sh
$ netstat -nult
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp6       0      0 10.50.58.217:7078       :::*                    LISTEN
tcp6       0      0 10.50.58.217:7079       :::*                    LISTEN
tcp6       0      0 :::4040                 :::*                    LISTEN
$ ps -aux |grep java
root           1  124 15.4 5915776 2464812 ?     Ssl  10:10   2:48 /opt/bitnami/java/bin/java -cp /opt/bitnami/spark/conf/:/opt/bitnami/spark/jars/* -Xmx2G --add-exports java.base/sun.nio.ch=ALL-UNNAMED -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false org.apache.spark.deploy.SparkSubmit --deploy-mode client --conf spark.jars.ivy=/tmp/.ivy --conf spark.driver.bindAddress=10.50.58.217 --conf spark.executorEnv.SPARK_DRIVER_POD_IP=10.50.58.217 --conf spark.driver.extraJavaOptions=--add-exports java.base/sun.nio.ch=ALL-UNNAMED --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.3.jar 1000000
root         158  0.0  0.0   3324  1596 pts/0    S+   10:13   0:00 grep java

Additional information

I tried the newer jmx_prometheus_javaagent-1.0.0.jar and it has the same problem.
In the same Kubernetes cluster, with the same Spark Operator, namespace, and YAML configuration, the apache/spark:3.5.3 image works.

@nozhang added the tech-issues (The user has a technical issue about an application) label Dec 12, 2024
@github-actions bot added the triage (Triage is needed) label Dec 12, 2024
@github-actions bot removed the triage (Triage is needed) label Dec 13, 2024
@github-actions bot assigned jotamartos and unassigned carrodher Dec 13, 2024
@jotamartos
Contributor

Hi @nozhang,

Thank you for using Bitnami. Extra jars need to be placed in /opt/bitnami/spark/jars/, as described in this section of the README file. You can see that Spark includes the jars in that folder when running the start command:

spark-1  | Spark Command: /opt/bitnami/java/bin/java -cp /opt/bitnami/spark/conf/:/opt/bitnami/spark/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host 74924de04c29 --port 7077 --webui-port 8080

If you think the Bitnami solution can be improved, you can always follow the contributing guides to suggest any change. The team will be more than happy to review the changes.


nozhang commented Dec 24, 2024

Thanks @jotamartos.
I tried putting the jar in /opt/bitnami/spark/jars/ and also changed the YAML to the setting below.

  monitoring:
    exposeDriverMetrics: true
    exposeExecutorMetrics: true
    prometheus:
      jmxExporterJar: /opt/bitnami/spark/jars/jmx_prometheus_javaagent-0.11.0.jar
      port: 8090

However, nothing seems to have changed:

# ps -aux |grep java
root           1  284  1.7 12364056 2277544 ?    Ssl  09:18   3:48 /opt/bitnami/java/bin/java -cp /opt/bitnami/spark/conf/:/opt/bitnami/spark/jars/* -Xmx2G --add-exports java.base/sun.nio.ch=ALL-UNNAMED -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false org.apache.spark.deploy.SparkSubmit --deploy-mode client --conf spark.jars.ivy=/tmp/.ivy --conf spark.driver.bindAddress=10.50.34.130 --conf spark.executorEnv.SPARK_DRIVER_POD_IP=10.50.34.130 --conf spark.driver.extraJavaOptions=--add-exports java.base/sun.nio.ch=ALL-UNNAMED --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi local:///opt/bitnami/spark/examples/jars/spark-examples_2.12-3.5.3.jar 1000000
root         234  0.0  0.0   3324  1588 pts/0    S+   09:19   0:00 grep java
# netstat -nltp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp6       0      0 10.50.34.130:7078       :::*                    LISTEN      1/java
tcp6       0      0 10.50.34.130:7079       :::*                    LISTEN      1/java
tcp6       0      0 :::4040                 :::*                    LISTEN      1/java
# ls -al /opt/bitnami/spark/jars/ |grep jmx_prometheus_javaagent
-rw-r--r-- 1 root  root    368881 Dec 24 09:16 jmx_prometheus_javaagent-0.11.0.jar


github-actions bot commented Jan 9, 2025

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

@github-actions bot added the stale (15 days without activity) label Jan 9, 2025
@jotamartos removed the stale (15 days without activity) label Jan 10, 2025
@jotamartos
Contributor

Hi @nozhang,

I used the docker-compose file to mount the file and launch the container. I do not know what that "jmxExporterJar" parameter is or what you are using to deploy the container. I suggest you take a look at the docker-compose file and try launching the container that way.
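
For illustration, a minimal docker-compose sketch of that approach (the service name, environment, and the host-side jar path are assumptions, not copied from the actual compose file):

# docker-compose.yml sketch: mount the exporter jar into the jars directory (assumed layout)
services:
  spark:
    image: bitnami/spark:3.5.3
    environment:
      - SPARK_MODE=master
    volumes:
      - ./jmx_prometheus_javaagent-0.11.0.jar:/opt/bitnami/spark/jars/jmx_prometheus_javaagent-0.11.0.jar:ro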

If you are using the Bitnami Spark chart, you will need to use the extraVolumeMounts parameter to mount that jar into the pod.
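
If it helps, a hedged values sketch of that idea (the parameter paths master.extraVolumes / master.extraVolumeMounts and the hostPath location are assumptions; check the values of the chart version you deploy):

# values.yaml sketch: mount a host directory containing the agent jar into the pod (assumed parameters)
master:
  extraVolumes:
    - name: jmx-exporter
      hostPath:
        path: /opt/agents   # assumed host location of jmx_prometheus_javaagent-0.11.0.jar
  extraVolumeMounts:
    - name: jmx-exporter
      mountPath: /prometheus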
