I need to implement exception handling in PySpark. What are the best ways to do this? Can I achieve it with Python's try/except? Is that recommended for distributed computations?
Below are some of my requirements.
#1) How to handle network and memory issues. Based on known issues, I need to add restart logic.
#2) The Spark job runs in standalone mode using Docker (1 master, 2 slaves). The job actually fails, but the master web UI (4040) does not show its status as FAILED; it still shows FINISHED. I need to handle such errors.
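To make the two requirements concrete, here is a rough sketch of what I have in mind, not something I have validated on the cluster. It wraps the job in a plain Python retry loop with exponential backoff and returns a non-zero exit code when the job finally fails. The names `run_with_retries`, `main`, and `TRANSIENT_ERRORS` are my own placeholders, and the choice of which exception types count as "known, retryable issues" is an assumption. Since Spark transformations are lazy, I assume the wrapped callable has to trigger an action for executor errors to surface in the driver at all.

```python
import sys
import time

# Assumption: these stand in for the "known issues" worth retrying.
# A real job would likely need to inspect Spark/Py4J errors as well.
TRANSIENT_ERRORS = (ConnectionError, TimeoutError)


def run_with_retries(job, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call job(); retry on transient errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except TRANSIENT_ERRORS as exc:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the failure
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({exc!r}); retrying in {delay}s",
                  file=sys.stderr)
            sleep(delay)


def main(job, **retry_kwargs):
    """Return a process exit code: 0 on success, 1 on any unhandled failure."""
    try:
        run_with_retries(job, **retry_kwargs)
    except Exception as exc:
        print(f"job failed: {exc!r}", file=sys.stderr)
        return 1
    return 0


# Hypothetical usage in the driver script, where run_job() triggers an action:
#   sys.exit(main(run_job))
```

I do not know whether exiting the driver with a non-zero code is enough to change the status the UI reports, which is part of what I am asking.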
Any inputs and references would be helpful. Thanks!!