We were looking for a solution to provide a PySpark notebook for analysts. The steps below set up a virtual environment and a local Spark installation.
mkdir project-folder
cd project-folder
mkvirtualenv notebook
pip install jupyter
Check that the notebook opens in the browser with the command below:
jupyter notebook
Quit the server in the terminal with Ctrl+C, then y.
To enable Spark in the notebook, add the following to .bashrc or .bash_profile:
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook
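As a quick sanity check, you can confirm the two settings are visible in your shell before launching pyspark (a minimal sketch; the exports are repeated here only so the snippet is self-contained):

```shell
# In practice these come from .bashrc / .bash_profile;
# re-exported here so the check stands on its own.
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook

# pyspark will launch this driver command with these options.
echo "$PYSPARK_DRIVER_PYTHON $PYSPARK_DRIVER_PYTHON_OPTS"
```

If this does not print "jupyter notebook", re-source the profile file (source ~/.bashrc) or open a new terminal.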
I have already downloaded the Spark tarball and extracted it into /Users/ashokagarwal/devtools/.
Now open a terminal and run the command below:
/Users/ashokagarwal/devtools/spark/bin/pyspark
This will open the notebook in a browser. Choose New -> Python 2.
spark.sparkContext.parallelize(range(10)).count()
df = spark.sql('''select 'spark' as hello ''')
df.show()
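For reference, here is what the two smoke tests above are expected to produce, sketched in plain Python (no Spark needed; this only illustrates the expected values, not how Spark computes them):

```python
# The first test counts the ten elements Spark parallelizes from range(10).
expected_count = len(range(10))
print(expected_count)  # 10

# The SQL test returns a single row with one column named "hello";
# df.show() renders it as a small one-row table.
expected_row = {"hello": "spark"}
print(expected_row["hello"])  # spark
```

If the notebook cell returns 10 and df.show() prints a one-row table with the value "spark", the local Spark setup is working.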