I am calling this line:
lang_modifiers = [keyw.strip() for keyw in row["language_modifiers"].split("|") if not isinstance(row["language_modifiers"], float)]
This seems to work where row["language_modifiers"] is a word (atlas method, central), but not when it comes up as nan.
I thought my if not isinstance(row["language_modifiers"], float) would catch the cases where things come up as nan, but that is not the case.
Background: row["language_modifiers"] is a cell in a tsv file, and comes up as nan when that cell was empty in the tsv being parsed.
Kenil Vasani
You are right, such errors are mostly caused by NaN representing empty cells.
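The guard in the question does not help because the if clause of a list comprehension only filters the items the iterable produces, and the iterable, row["language_modifiers"].split("|"), is evaluated first. On a NaN cell that call fails before the check ever runs. A minimal sketch of moving the check in front of the split, assuming pandas is available (the helper name parse_modifiers is made up for illustration):

```python
import pandas as pd


def parse_modifiers(cell):
    """Split a "|"-separated cell into stripped keywords; empty list if the cell is missing."""
    # pd.isna is True for NaN and None, so .split() is only reached for real strings.
    if pd.isna(cell):
        return []
    return [keyw.strip() for keyw in cell.split("|")]


print(parse_modifiers("atlas method | central"))  # ['atlas method', 'central']
print(parse_modifiers(float("nan")))              # []
```

With such a helper, the line from the question could become lang_modifiers = parse_modifiers(row["language_modifiers"]).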
It is common to filter out such data, before applying your further operations, using this idiom on your dataframe df:
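A minimal sketch of that kind of filter, assuming pandas and a small made-up frame standing in for the parsed tsv. Only the column name language_modifiers comes from the question; the name column and all values are invented for illustration:

```python
import pandas as pd

# Made-up stand-in for the parsed tsv; None becomes a missing value.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "language_modifiers": ["atlas method | central", None, "central"],
})

# Keep only the rows whose cell is not missing, then split as before.
df_clean = df[df["language_modifiers"].notna()]

for _, row in df_clean.iterrows():
    lang_modifiers = [keyw.strip() for keyw in row["language_modifiers"].split("|")]
    print(row["name"], lang_modifiers)
```

Note that this drops the rows with empty cells entirely, which may or may not be what you want if the other columns of those rows still matter.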
Alternatively, it may be handier to use the fillna() method to impute (replace) null values with a default. E.g. all null or NaN values can be replaced with the average value of their column, or with a value like the empty string "" or another default.
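A minimal sketch of both replacements, assuming pandas. The score column is hypothetical and only there to show the column-average case; the data is made up:

```python
import pandas as pd

# Made-up data: one text column and one numeric column, each with a missing cell.
df = pd.DataFrame({
    "language_modifiers": ["atlas method | central", None],
    "score": [1.0, None],
})

# Text column: replace missing cells with an empty string.
df["language_modifiers"] = df["language_modifiers"].fillna("")
# Numeric column: replace missing cells with the column mean.
df["score"] = df["score"].fillna(df["score"].mean())

# "".split("|") gives [""], so drop empty pieces after splitting.
modifiers = [
    [keyw.strip() for keyw in cell.split("|") if keyw.strip()]
    for cell in df["language_modifiers"]
]
print(modifiers)
```

After filling with "", every cell is a string, so .split("|") no longer fails; the extra if keyw.strip() keeps the formerly empty cells from turning into a list containing one empty string.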