ValueError: Length of values does not match length of index in nested loop

I’m trying to remove the stopwords from each row of a column. I already ran word_tokenize from nltk on the column, so each row is now a list of tokens. I’m trying to remove the stopwords with the nested list comprehension below, but it raises ValueError: Length of values does not match length of index. How can I fix this?

import pandas as pd
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

data = pd.read_csv(r"D:/python projects/read_files/spam.csv",
                   encoding="latin-1")
data = data[['v1', 'v2']]
data = data.rename(columns={'v1': 'label', 'v2': 'text'})

stopwords = set(stopwords.words('english'))

data['text'] = data['text'].str.lower()
data['new'] = [word_tokenize(row) for row in data['text']]
data['new'] = [word for new in data['new'] for word in new if word not in stopwords]
Vice Professor Asked on May 19, 2020 in No Category.
1 Answer(s)

ValueError: Length of values does not match length of index is raised because the list you are assigning as a new column does not have the same length as the DataFrame's index. Here the last list comprehension flattens the tokens of every row into one long list of words, so it has one entry per remaining word instead of one entry per row. You need to make sure that the list you assign to a new column has exactly one element for each row of the DataFrame.
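As a minimal sketch of the difference, the example below compares the flattened comprehension with a per-row one. To keep it runnable without nltk or spam.csv, it uses hypothetical stand-ins that are not from the question: a tiny stop_words set and a plain list of already tokenized rows.

```python
# Hypothetical stand-ins (not from the question): a tiny stopword set and
# pre-tokenized rows, so the sketch runs without nltk or the CSV file.
stop_words = {"the", "is", "a", "to"}
rows = [
    ["go", "to", "the", "shop"],
    ["this", "is", "a", "test"],
    ["free", "entry"],
]

# Flattened comprehension (same shape as in the question): produces one
# list of all remaining words, so its length differs from the row count.
flat = [word for row in rows for word in row if word not in stop_words]

# Per-row comprehension: produces one filtered list per row, so the
# result has the same length as the input.
per_row = [[word for word in row if word not in stop_words] for row in rows]

print(len(rows), len(flat), len(per_row))
```

Applied to the code in the question, the same idea would probably look like data['new'] = [[word for word in row if word not in stopwords] for row in data['new']], or equivalently data['new'].apply(lambda row: [w for w in row if w not in stopwords]). Either form returns one list per row, so the assignment matches the index length.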

 

Default Answered on April 21, 2021.
