I have a table with three columns: user_id, book_id and rating. So, one row shows what rating a user gave to a book.
I'm trying to remove the rows that correspond to users who rated fewer than 10 books. I did something similar to what is described in the answers to the question "Remove low frequency values from pandas.dataframe". Here is my code:
threshold = 10
value_counts = ratings['user_id'].value_counts()
to_remove = value_counts[value_counts <= threshold].index
ratings.drop(to_remove, axis=0, inplace=True)
When I run it, I get an error in the last line:
ValueError: labels [40518 21743 30824 <...> 47178 46308 30460] not contained in axis
The table has 979478 rows, so the rows with these indices should exist. What am I doing wrong?
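Use isin. Because user_id is a column and not the index, we cannot use .drop here: with axis=0, .drop looks up the given labels in the row index, while to_remove holds user_id values.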
threshold = 10
value_counts = ratings['user_id'].value_counts()
to_remove = value_counts[value_counts <= threshold].index
ratings = ratings.loc[~ratings['user_id'].isin(to_remove), :]
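To check the filtering on something small, here is a minimal sketch with made-up data (the toy ratings frame and the threshold of 2 are assumptions for illustration, not the asker's table). Note that the comparison <= threshold in the code above also removes users with exactly threshold ratings; if the goal is to drop only users with fewer than threshold ratings, a strict < is probably what is wanted, and that is what the sketch uses. It also shows a groupby/transform variant that should give the same result without building to_remove.

```python
import pandas as pd

# Hypothetical toy data: user 1 rated three books, user 2 one book, user 3 two books.
ratings = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 3, 3],
    "book_id": [10, 11, 12, 10, 11, 13],
    "rating": [5, 4, 3, 2, 5, 1],
})
threshold = 2  # assumed for the toy data: keep users with at least 2 ratings

# Option 1: count ratings per user, then filter rows with isin.
counts = ratings["user_id"].value_counts()
to_remove = counts[counts < threshold].index  # strict "<": fewer than threshold
kept_isin = ratings[~ratings["user_id"].isin(to_remove)]

# Option 2: attach each user's rating count to every row, then filter on it.
per_row_counts = ratings.groupby("user_id")["user_id"].transform("size")
kept_transform = ratings[per_row_counts >= threshold]

print(kept_isin)
```

Both options return a new DataFrame and leave the original row index untouched; call reset_index(drop=True) on the result if consecutive row labels are needed afterwards.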