Hot questions for using Ubuntu with Apache Spark


Question:

I've installed Spark 2.1.1 on Ubuntu and no matter what I do, it doesn't seem to agree with the java path. When I run "spark-submit --version" or "spark-shell" I get the following error:

/usr/local/spark/bin/spark-class: line 71: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin//bin/java: No such file or directory

Now obviously the "/bin//bin/java" is problematic, but I'm not sure where to change the configuration. The spark-class file has the following lines:

if [ -n "${JAVA_HOME}" ]; then
  RUNNER="${JAVA_HOME}/bin/java"

I was originally using a version of Spark built for Hadoop 2.4, and when I changed the line to RUNNER="${JAVA_HOME}" it would give me either the error "[path] is a directory" or "[path] is not a directory." This was after also trying multiple path permutations in /etc/environment.

What I now have in /etc/environment is:

JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/"

This is the current java setup that I have:

root@ubuntu:~# update-alternatives --config java
There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

My .bashrc has the following:

export SPARK_HOME="/usr/local/spark"
export PATH="$PATH:$SPARK_HOME/bin"

Can anyone advise 1) which files I need to change and 2) how I need to change them? Thanks in advance.

The spark-class file is at the link below, just in case:

http://vaughn-s.net/hadoop/spark-class


Answer:

In the /etc/environment file, replace

JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/"

with

JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64/jre/"

then execute

source /etc/environment 

Also, RUNNER="${JAVA_HOME}/bin/java" in spark-class should be kept as it is.
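
To verify the fix, a quick check like the following should work (a minimal sketch; it assumes the corrected JAVA_HOME value above and the PATH entries from the question's .bashrc):

source /etc/environment
echo "$JAVA_HOME"                  # should print /usr/lib/jvm/java-8-openjdk-amd64/jre/
"$JAVA_HOME/bin/java" -version     # should report the OpenJDK 8 runtime
spark-submit --version             # should no longer fail in spark-class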

Question:

I'm trying to learn Apache Spark and I have a problem with a simple example, but I cannot find a solution. I'm working on Ubuntu 13.04 with Java 7 (Oracle) and Scala 2.9.3. When I try to run the SparkPi example I get this output:

filippo@filippo-HP-Pavilion-dv6-Notebook-PC:/usr/local/spark$ ./bin/run-example SparkPi 10
java.lang.ClassNotFoundException: org.apache.spark.examples.SparkPi
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:270)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:337)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties

This is the example shown in the Spark documentation, but I don't understand what the problem is :(


Answer:

You may have downloaded a source release rather than a pre-built one?

To build and assemble with sbt, you can run sbt assembly in the Spark root directory.
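
For example, assuming the source release is unpacked at /usr/local/spark as in the question (Spark releases of that era bundle an sbt launcher at sbt/sbt; building from source can take a while):

cd /usr/local/spark
sbt/sbt assembly                # builds Spark plus the examples assembly jar
./bin/run-example SparkPi 10    # should now resolve org.apache.spark.examples.SparkPi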

Question:

I have created a Spark application in Eclipse, then used Maven to build and package it. Because my OS is Windows, I had to add this line to my code to be able to run it on Windows:

System.setProperty("hadoop.home.dir", "C:\\hadoop\\");

But now, because I want to access a cluster, I have to use Ubuntu, and I want to use my jar. I know that line will cause an error because that path doesn't exist on Ubuntu, so I'm asking for any suggestion to fix this easily.


Answer:

While running the jar, you can specify the property on the command line:

java -Dhadoop.home.dir=/var/hadoop -jar spark.jar 

For that you may have to remove that line from your code, and you can supply the Hadoop home at run time instead.
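
If you launch with spark-submit rather than plain java, the same property can be passed to the driver JVM with the --driver-java-options flag (a sketch; the --class value here is a placeholder for your actual main class):

spark-submit \
  --class com.example.Main \
  --driver-java-options "-Dhadoop.home.dir=/var/hadoop" \
  spark.jar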