Unable to pull jar file from JFrog Artifactory repo when running Spark job on K8's

11/8/2018

I am trying to run spark job on Kubernetes cluster but it fails with class not found exception. The reason which I feel is that it is not able to pull the jar file from the JFrog Artifactory repository. Any suggestions on what can be done?

Can we include something in the parameters of spark submit or create a password file?

-- bharath reddy
apache-spark
artifactory
jar
kubernetes

1 Answer

11/8/2018

You didn't mention how you are pulling the jar when you tested your job locally, or perhaps you haven't tested it yet. As per Advanced Dependency Management:

Spark uses the following URL scheme to allow different strategies for disseminating jars: hdfs:, http:, https:, ftp: - these pull down files and JARs from the URI as expected

And:

Users may also include any other dependencies by supplying a comma-delimited list of Maven coordinates with --packages. All transitive dependencies will be handled when using this command. Additional repositories (or resolvers in SBT) can be added in a comma-delimited fashion with the flag --repositories. (Note that credentials for password-protected repositories can be supplied in some cases in the repository URI, such as in https://user:password@host/.... Be careful when supplying credentials this way.)

If your JFrog repo or jar file requires credentials, it looks like you will have to pass the credentials in the URL: https://user:password@host/...
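For example, a spark-submit invocation along these lines should work (a sketch only; the Artifactory host, repo path, Maven coordinates, and image name are placeholders you would replace with your own values):

```shell
# Pull dependencies from a password-protected Artifactory Maven repo.
# USER/PASS and the URLs below are illustrative placeholders.
spark-submit \
  --master k8s://https://<k8s-apiserver-host>:443 \
  --deploy-mode cluster \
  --class com.example.MyJob \
  --packages com.example:my-library:1.0.0 \
  --repositories https://USER:PASS@myartifactory.example.com/artifactory/libs-release \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  https://USER:PASS@myartifactory.example.com/artifactory/libs-release/com/example/my-job/1.0.0/my-job-1.0.0.jar
```

Note that embedding credentials in the URL exposes them in logs and process listings, so prefer a read-only Artifactory API token over a real password if you go this route.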

-- Rico
Source: StackOverflow