Dropping rows in Dataframe: labels not contained in axis

December 01, 2017, at 09:15 AM

I have a table with three columns: user_id, book_id and rating. So, one row shows what rating a user gave to a book.

I'm trying to remove rows that correspond to users who rated fewer than 10 books. I did something similar to what is described in the answers to the question "Remove low frequency values from pandas.dataframe". Here is my code:

threshold = 10
value_counts = ratings['user_id'].value_counts()
to_remove = value_counts[value_counts <= threshold].index
ratings.drop(to_remove, axis=0, inplace=True)

When I run it, I get an error in the last line:

ValueError: labels [40518 21743 30824 <...> 47178 46308 30460] not contained in axis

The table has 979478 rows, so the rows with these indices should exist. What am I doing wrong?

Answer 1

Use isin. Because user_id is a column and not the index, .drop cannot be used here: drop looks for the given values among the index labels, not in the user_id column.

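threshold = 10
value_counts = ratings['user_id'].value_counts()
# Users with fewer than `threshold` ratings (the question's <= would also drop users with exactly 10)
to_remove = value_counts[value_counts < threshold].index
# Keep rows whose user_id is not in to_remove; assign the result, because .loc returns a new DataFrame
ratings = ratings.loc[~ratings['user_id'].isin(to_remove), :]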
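To see the difference on something small, here is a minimal sketch on toy data. The user IDs, book IDs, and the threshold of 2 are made up for illustration and are not from the question. It first shows that drop treats the user IDs as row labels and fails, then applies the isin filter, and finally shows an alternative (not part of the original answer) that uses groupby with transform to attach each user's rating count to every row.

```python
import pandas as pd

# Hypothetical toy data: user 101 rated three books, user 102 rated one.
ratings = pd.DataFrame({
    "user_id": [101, 101, 101, 102],
    "book_id": [10, 11, 12, 10],
    "rating": [5, 4, 3, 2],
})
threshold = 2  # small value chosen so the toy data shows the effect

counts = ratings["user_id"].value_counts()
to_remove = counts[counts < threshold].index  # contains user IDs, not row labels

# drop() looks the values up in the row index (0..3 here), so user ID 102
# is not found. Older pandas raises ValueError, newer versions raise KeyError.
drop_failed = False
try:
    ratings.drop(to_remove, axis=0)
except (KeyError, ValueError):
    drop_failed = True

# Filter on the column instead: keep rows whose user_id is not in to_remove.
filtered = ratings[~ratings["user_id"].isin(to_remove)]

# Alternative: compute each row's group size and compare it to the threshold.
group_sizes = ratings.groupby("user_id")["user_id"].transform("size")
filtered_alt = ratings[group_sizes >= threshold]
```

Both filters keep the original row labels, so call reset_index(drop=True) on the result if you want a fresh 0..n-1 index. Note also that if a user ID happens to coincide with an existing row label, drop will not raise at all and will silently remove the wrong row.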