How to check the Spark version

Apache Spark, Cloudera CDH

Apache Spark Problem Overview


As the title says, how do I find out which version of Spark is installed on CentOS?

The current system has CDH 5.1.0 installed.

Apache Spark Solutions


Solution 1 - Apache Spark

If you use spark-shell, the version appears in the banner at startup.

Programmatically, SparkContext.version can be used.
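
For example, a minimal PySpark sketch (assuming PySpark is installed; the printed string will reflect your installation):

from pyspark import SparkContext

# Create (or reuse) a local SparkContext and read its version property.
sc = SparkContext.getOrCreate()
print(sc.version)
sc.stop()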

Solution 2 - Apache Spark

Open a Spark shell terminal and run sc.version

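As a minimal sketch in the PySpark shell, where sc is predefined at startup:

print(sc.version)  # prints the version string of the running installation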

Solution 3 - Apache Spark

You can use the spark-submit command:

spark-submit --version

Solution 4 - Apache Spark

In a Spark 2.x program or shell, use

spark.version

where the spark variable is a SparkSession object.
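
A minimal PySpark sketch (the app name here is only illustrative):

from pyspark.sql import SparkSession

# spark.version reports the version of the running Spark build.
spark = SparkSession.builder.appName("version-check").getOrCreate()
print(spark.version)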

Using the console logs at the start of spark-shell

[root@bdhost001 ~]$ spark-shell
Setting the default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/

Without entering the interactive shell:

spark-shell --version

[root@bdhost001 ~]$ spark-shell --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/
                        
Type --help for more information.

spark-submit --version

[root@bdhost001 ~]$ spark-submit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/
                        
Type --help for more information.

Solution 5 - Apache Spark

If you are using Databricks in a notebook, just run:

spark.version

Solution 6 - Apache Spark

If you are using pyspark, the Spark version in use can be seen beside the Spark logo in the startup banner, as shown below:

manoj@hadoop-host:~$ pyspark
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

Using Python version 2.7.6 (default, Jun 22 2015 17:58:13)
SparkContext available as sc, HiveContext available as sqlContext.
>>>

If you want to get the Spark version explicitly, you can use the version property of SparkContext, as shown below:

>>>
>>> sc.version
u'1.6.0'
>>>

Solution 7 - Apache Spark

Whichever shell command you use, spark-shell or pyspark, the startup banner shows the Spark logo with the version beside it.

$ pyspark
Python 2.6.6 (r266:84292, May 22 2015, 08:34:51) [GCC 4.4.7 20120313 (Red Hat 4.4.7-15)] on linux2
...
Welcome to
...   version 1.3.0

Solution 8 - Apache Spark

Use the command below to get the Spark version:

spark-submit --version

Solution 9 - Apache Spark

If you want to print the version programmatically, use:

from pyspark.sql import SparkSession

# Build (or reuse) a local SparkSession and read the version from its SparkContext.
spark = SparkSession.builder.master("local").getOrCreate()
print(spark.sparkContext.version)

Solution 10 - Apache Spark

If you are on a Zeppelin notebook, you can run:

sc.version 

To know the Scala version as well, you can run:

util.Properties.versionString

Solution 11 - Apache Spark

If you want to get it programmatically from a Python script, you can use this script.py:

from pyspark import SparkConf
from pyspark.context import SparkContext

# Create a SparkContext with a default configuration and print its version.
sc_conf = SparkConf()
sc = SparkContext(conf=sc_conf)
print(sc.version)

Run it with python script.py or python3 script.py.

The script above also works in the Python shell.

Note that calling print(sc.version) directly in a script, without creating the SparkContext first, won't work; you will get this error: NameError: name 'sc' is not defined.

Solution 12 - Apache Spark

Most of the answers here require initializing a SparkSession. This answer shows a way to get the version statically from the library, here in an Ammonite Scala REPL:

@ org.apache.spark.SPARK_VERSION
res4: String = "2.4.5"
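
A PySpark analogue (assuming the pyspark package is installed), which likewise needs no running Spark session:

import pyspark

# The installed package exposes its version as a plain string attribute.
print(pyspark.__version__)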

Solution 13 - Apache Spark

If, like me, you are running Spark inside a Docker container with little access to spark-shell, you can run a Jupyter notebook, build a SparkContext object called sc in the notebook, and read the version as shown in the code below:

docker run -p 8888:8888 jupyter/pyspark-notebook   # in the shell where Docker is installed

import pyspark
sc = pyspark.SparkContext('local[*]')
sc.version

Solution 14 - Apache Spark

Try this way:

import scala.util.Properties.versionString
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder
  .appName("my_app")
  .master("local[6]")
  .getOrCreate()
println("Spark Version: " + spark.version)
println("Scala Version: " + versionString)

Solution 15 - Apache Spark

To print Spark's version from the shell, the following works:

SPARK_VERSION=$(spark-shell --version &> tmp.data ; grep version tmp.data | head -1 | awk '{print $NF}';rm tmp.data)
echo $SPARK_VERSION
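
A similar approach as a Python sketch, without the temporary file (it assumes spark-submit is on the PATH and that the banner goes to stderr):

import re
import subprocess

# Run spark-submit --version and capture both output streams.
proc = subprocess.run(["spark-submit", "--version"],
                      capture_output=True, text=True)
# Pull the first dotted version number out of the banner.
match = re.search(r"version\s+(\d+(?:\.\d+)+)", proc.stdout + proc.stderr)
print(match.group(1) if match else "version not found")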

Solution 16 - Apache Spark

A non-interactive way, which I use for installing the matching PySpark version on AWS EMR:

# pip3 install pyspark==$(spark-submit --version 2>&1| grep -m 1  -Eo "([0-9]{1,}\.)+[0-9]{1,}") 
Collecting pyspark==2.4.4

With spark-shell:

#  spark-shell --version 2>&1| grep -m 1  -Eo "([0-9]{1,}\.)+[0-9]{1,}"
2.4.4

With spark-submit:

# spark-submit --version 2>&1| grep -m 1  -Eo "([0-9]{1,}\.)+[0-9]{1,}"
2.4.4

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | HappyCoding | View Question on Stackoverflow
Solution 1 - Apache Spark | Shyamendra Solanki | View Answer on Stackoverflow
Solution 2 - Apache Spark | Venu A Positive | View Answer on Stackoverflow
Solution 3 - Apache Spark | Ozgur Ozturk | View Answer on Stackoverflow
Solution 4 - Apache Spark | mrsrinivas | View Answer on Stackoverflow
Solution 5 - Apache Spark | Pat | View Answer on Stackoverflow
Solution 6 - Apache Spark | Manoj Kumar G | View Answer on Stackoverflow
Solution 7 - Apache Spark | Murari Goswami | View Answer on Stackoverflow
Solution 8 - Apache Spark | Swift user | View Answer on Stackoverflow
Solution 9 - Apache Spark | Julian2611 | View Answer on Stackoverflow
Solution 10 - Apache Spark | HISI | View Answer on Stackoverflow
Solution 11 - Apache Spark | Piko Monde | View Answer on Stackoverflow
Solution 12 - Apache Spark | Dyno Fu | View Answer on Stackoverflow
Solution 13 - Apache Spark | yogender | View Answer on Stackoverflow
Solution 14 - Apache Spark | Julio Delgado | View Answer on Stackoverflow
Solution 15 - Apache Spark | Khetanshu | View Answer on Stackoverflow
Solution 16 - Apache Spark | Valeriy Solovyov | View Answer on Stackoverflow