Java Gateway Process Exited Before Sending its Port Number
Issue
When running an EMR-connected Jupyter notebook, you see the following error when
running the Tecton SDK function get_historical_features():
~/.local/lib/python3.8/site-packages/pyspark/java_gateway.py in launch_gateway(conf, popen_kwargs)
107 if not os.path.isfile(conn_info_file):
--> 108 raise Exception("Java gateway process exited before sending its port number")
109
Exception: Java gateway process exited before sending its port number
The above exception was the direct cause of the following exception:
[... Python stack trace ...]
Note that other Tecton SDK functions, such as get_data_source() and
list_feature_views(), may continue to work properly.
Resolution
The most likely cause is that the notebook is connected to a Python kernel instead of a PySpark kernel. When you create a notebook in EMR, you can choose from a number of kernels, and you should choose PySpark. If you have already created the notebook, you can click in the upper-right-hand corner and select PySpark instead of Python.
get_historical_features() explicitly uses Spark to execute. Some other Tecton
functions, however, only make calls to Tecton APIs to retrieve data, which does
not require a Spark context.
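To confirm which kind of kernel a notebook is attached to before calling a Spark-backed function, a quick check along these lines may help. This is only a sketch: it assumes that the PySpark kernel exposes a ready-made session under the name spark or sc (a common convention, not something this article confirms for every EMR setup), and has_spark_session is a hypothetical helper, not part of the Tecton SDK.

```python
def has_spark_session(namespace):
    """Return True if the namespace holds a non-None `spark` or `sc` object.

    Assumption: a PySpark kernel predefines one of these names, while a
    plain Python kernel defines neither.
    """
    return any(namespace.get(name) is not None for name in ("spark", "sc"))


# Run this in a notebook cell; globals() is the cell's own namespace.
if has_spark_session(globals()):
    print("A Spark session is available in this kernel.")
else:
    print("No Spark session found; the notebook may be using a plain Python kernel.")
```

If the second message appears, switching the kernel to PySpark as described above is the first thing to try.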