Removing Stopwords From List Of Lists
I would like to know how I can remove specific words, including stopwords, from a list of lists like this: my_list = [[], [], ['A'], ['SB'], [], ['NMR'], [], ['ISSN'], [], []]
Solution 1:
Loop through my_list, replacing each nested list with a copy that has the stop words removed. You can use set difference to remove the words, but note that this also collapses duplicate words and does not preserve their order.
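# Requires NLTK and its stopwords corpus: nltk.download('stopwords')
from nltk.corpus import stopwords
stop_words = stopwords.words('english')
my_list = [list(set(sublist).difference(stop_words)) for sublist in my_list]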
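To see what the set-based approach does to a sublist, here is a minimal sketch. The three-word stop_words list is a hypothetical stand-in for stopwords.words('english') so the snippet runs without NLTK, and the sample sublist is made up for illustration.

```python
# Hypothetical stand-in for stopwords.words('english'); not the real NLTK list.
stop_words = ["a", "the", "of"]

# Made-up sublist with a duplicate and an uppercase stop word.
sublist = ["the", "NMR", "of", "NMR", "A"]

# Set difference: duplicates collapse and the resulting order is arbitrary.
via_set = list(set(sublist).difference(stop_words))

# A plain filter keeps duplicates and the original order.
via_filter = [word for word in sublist if word not in stop_words]

print(sorted(via_set))  # sorted only to make the output stable
print(via_filter)
```

Both variants keep 'A' here, because the comparison is case sensitive and the stand-in list only contains the lowercase 'a'.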
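Comparing case-insensitively is a little more complicated, because you can't use the built-in set difference method. NLTK's English stop words are all lowercase, so a word such as 'A' is only caught if you lowercase each word before checking it: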
my_list = [[word for word in sublist if word.lower() not in stop_words] for sublist in my_list]
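Since the question also asks about removing specific words in addition to stopwords, the same comprehension can be wrapped in a small helper. This is only a sketch: remove_words, extra_words and the short stop_words list are hypothetical names and data introduced for illustration, with the hard-coded list standing in for stopwords.words('english').

```python
def remove_words(nested, words_to_remove):
    """Return a new list of lists without the given words (case-insensitive)."""
    # Lowercase once and use a set so each membership test is cheap.
    blocked = {word.lower() for word in words_to_remove}
    return [[word for word in sublist if word.lower() not in blocked]
            for sublist in nested]


my_list = [[], [], ['A'], ['SB'], [], ['NMR'], [], ['ISSN'], [], []]

stop_words = ["a", "an", "the"]  # hypothetical stand-in for the NLTK list
extra_words = ["ISSN"]           # hypothetical extra words to drop

cleaned = remove_words(my_list, stop_words + extra_words)
print(cleaned)

# Optionally drop the sublists that are empty after filtering.
non_empty = [sublist for sublist in cleaned if sublist]
print(non_empty)
```

Building the lowercase set once avoids scanning the whole stop-word list for every word, which matters more as the nested list grows.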