According to both Link1 and Link2, my Airflow DAG run is failing with the error
`INFO - Task exited with return code -9`, which indicates the task process was killed because it ran out of memory. My DAG run has 10 tasks/operators, and each task simply:
- queries one of my BigQuery tables, and
- writes the results to a collection in my Mongo database.
The 10 BigQuery tables range in size from 1 MB to 400 MB, and together total ~1 GB. My Docker container has the default 2 GB of memory, which I've increased to 4 GB, yet a few of the tasks still fail with this error. This confuses me, since 4 GB should be plenty of memory for ~1 GB of data. I'm also concerned that these tables will grow over time (a single table's query result could reach 1–2 GB), and I'd like to avoid these return code -9 errors then.
Any thoughts or advice would be greatly appreciated. I'm not quite sure how to handle this, since the whole point of the DAG is to transfer data from BigQuery to Mongo daily, so the query results held in memory by each task are necessarily fairly large, given the table sizes.
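One direction I've been considering is to stream the BigQuery result iterator and write to Mongo in bounded batches, so that only one batch of rows is ever materialized at a time instead of the full table. A rough sketch of what I mean (the function names, client objects, and `batch_size` here are illustrative, not my actual code):

```python
from itertools import islice

def chunks(rows, size):
    """Yield lists of at most `size` items from any iterator,
    without materializing the whole result set at once."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch

def transfer_table(bq_client, mongo_collection, table_id, batch_size=10_000):
    # Hypothetical task body: iterate over the query result pages and
    # insert into Mongo one bounded batch at a time, rather than building
    # a single list of all rows in memory.
    rows = bq_client.query(f"SELECT * FROM `{table_id}`").result()
    for batch in chunks(rows, batch_size):
        mongo_collection.insert_many([dict(r) for r in batch])
```

Would something like this be the right way to bound per-task memory, or is there a more standard Airflow pattern for large BigQuery-to-Mongo transfers?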