Learn, Share, Build

September 23, 2017, at 02:10 AM

I have a large sparse dataframe, and I'd like to automatically remove the columns whose number of non-zero elements is below a certain percentage of the total number of rows. The column names are dynamic, so in principle I know neither the number of columns nor their names in advance.

Thank you!

Answer 1

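Pandas has a dropna function, which has a thresh parameter. Set it to the minimum number of non-NA values a column must have in order to be kept. Note that dropna counts missing (NaN) values, not zeros, so this works directly only if the empty cells of your sparse dataframe are stored as NaN. So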

df.dropna(thresh=int(len(df)*0.8), axis=1)

will drop the columns in which fewer than 80% of the rows hold a non-NA value. It returns a new dataframe, so assign the result back if you want to keep it.
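If the empty cells are stored as literal zeros rather than NaN, dropna will not treat them as missing. A minimal sketch of one way to handle that case is below; the helper name drop_sparse_columns, the min_fraction parameter, and the sample data are made up for illustration and are not part of pandas.

```python
import pandas as pd


def drop_sparse_columns(df, min_fraction):
    """Keep only columns whose share of non-zero, non-NaN entries
    is at least min_fraction (hypothetical helper, not a pandas API)."""
    # True where a cell holds a real, non-zero value
    present = df.ne(0) & df.notna()
    # Mean of a boolean column = fraction of rows that are True
    fraction_present = present.mean(axis=0)
    # Boolean mask over column labels, so column names need not be known
    return df.loc[:, fraction_present >= min_fraction]


# Illustrative data: 'a' is 80% non-zero, 'b' 20%, 'c' 60%
df = pd.DataFrame({
    "a": [1, 0, 3, 4, 5],
    "b": [0, 0, 0, 0, 7],
    "c": [1, 2, 0, 0, 5],
})

result = drop_sparse_columns(df, min_fraction=0.5)
print(list(result.columns))
```

Because the mask is built from the column statistics themselves, this does not depend on knowing how many columns there are or what they are called.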
