
Spark-submit Fails To Detect The Modules Installed With Pip

I have a Python script with the following third-party dependencies:

import boto3
from warcio.archiveiterator import ArchiveIterator
from warcio.recordloader import ArchiveLoadFailed
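For context, these packages would typically be installed with pip for the same interpreter that spark-submit will use; a minimal sketch (not from the original question):

```
# Install the dependencies for the interpreter Spark will run with
python -m pip install boto3 warcio
```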

Solution 1:

All of the checks mentioned in the other solution came back OK, but setting PYSPARK_PYTHON is what solved the issue for me.
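For reference, here is a minimal sketch of how that can look. The interpreter path and the script name my_job.py are placeholders for your own setup, not values from the original question:

```
# Point the driver and the executors at the interpreter that actually
# has the pip-installed modules (path below is only an example).
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/python3

# The same interpreters can also be set per job via Spark configuration
# (my_job.py is a placeholder for your script).
spark-submit \
  --conf spark.pyspark.python=/usr/bin/python3 \
  --conf spark.pyspark.driver.python=/usr/bin/python3 \
  my_job.py
```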

Solution 2:

Before running spark-submit, open a Python shell and try importing the modules there. Also check which Python interpreter (i.e. which python path) opens by default; a quick check is sketched below.
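Assuming a Unix-like shell, the check could look like this:

```
# Which interpreter does "python" resolve to, and which version is it?
which python
python --version

# Do the third-party imports succeed in that same interpreter?
python -c "import boto3, warcio; print('imports ok')"
```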

If you are able to import these modules successfully in the Python shell (the same Python version you are trying to use with spark-submit), check the following:

In which mode are you submitting the application? Try standalone mode, or if you are on YARN, try client mode. Also try adding export PYSPARK_PYTHON=(your python path) before submitting.
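Putting these pieces together, a submission on YARN in client mode might look like the sketch below; the interpreter path and my_job.py are placeholders for your own environment:

```
# Use the interpreter that has the pip-installed modules for both
# driver and executors, then submit in YARN client mode
# (my_job.py is a placeholder for your script).
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/python3

spark-submit \
  --master yarn \
  --deploy-mode client \
  my_job.py
```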
