Getting "No module named pandas" error in Airflow even after pandas was successfully installed

12/13/2018

How can I resolve the error "No module named pandas" when one task in an Airflow DAG uses pandas successfully and another does not?

I cannot work out why I am getting the "No module named pandas" error.

I have checked via pip3 freeze and yes, the desired pandas version does show up.

I have deployed this using Docker on a Kubernetes cluster.

-- aviral sanjay
airflow
kubernetes
pandas
python-3.x

1 Answer

12/18/2018

Pandas is generally required by Airflow, and some hooks use it to return DataFrames. It's possible that Airflow was installed with pip rather than pip3, so it was added as a Python 2 package instead of a Python 3 one (although installing Airflow with pip should also have installed pandas, judging by its setup.py).
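One way to test the pip versus pip3 theory is to ask each interpreter directly whether it can import pandas. The following is only a rough sketch: the helper name can_import and the list of interpreter names are assumptions for illustration, and the names on your image may differ.

```python
import shutil
import subprocess
import sys


def can_import(interpreter, module):
    """Return True/False if `interpreter` can import `module`.

    Returns None when the interpreter cannot be found on PATH.
    `can_import` is a hypothetical helper, not part of Airflow.
    """
    path = shutil.which(interpreter)
    if path is None:
        return None
    # Run "import <module>" in a child process of that interpreter;
    # a non-zero exit code means the import failed there.
    result = subprocess.run(
        [path, "-c", "import " + module],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Interpreter names are guesses; adjust them to what the container has.
    for name in ("python", "python2", "python3", sys.executable):
        print(name, "->", can_import(name, "pandas"))
```

If one interpreter reports True and another reports False, the failing task is probably running under the one without pandas.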

Which operator in your DAG raises this error? Do you have any PythonVirtualenvOperators, or BashOperators that run Python from the command line, and that therefore might not share the environment in which you confirmed pandas is installed?
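To compare the environments of the working task and the failing task, it may help to log which interpreter each one runs and where (if anywhere) it finds pandas. This is a minimal sketch; describe_environment is a hypothetical helper name, not an Airflow API.

```python
import importlib.util
import sys


def describe_environment(modules=("pandas",)):
    """Report the running interpreter and where each module would load from.

    A module maps to None when this interpreter cannot find it.
    """
    report = {
        "executable": sys.executable,       # path of the Python binary in use
        "version": sys.version.split()[0],  # e.g. "3.6.7"
        "modules": {},
    }
    for name in modules:
        spec = importlib.util.find_spec(name)
        report["modules"][name] = spec.origin if spec is not None else None
    return report


if __name__ == "__main__":
    print(describe_environment())
```

Assuming the tasks are PythonOperators, the function could be passed as python_callable (or called at the top of the existing callables) in both the succeeding and the failing task, and the two task logs compared. A differing executable path, or None for pandas in only one of them, would point to the tasks running in different environments.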

-- dlamblin
Source: StackOverflow