Iterate through Python for loop more quickly

March 23, 2022, at 03:30 AM

I have a pandas data frame (called "ud_flex" below) whose columns include position_name, wk_rank_orig and fpts. It has over 27 million rows, and I'm trying to iterate through it to do a calculation for each row. Below is the calculation that I'm using:

def set_fpts(pos, rank, curr_fpts):
    if pos == "RB" and rank >= 3.0:
        return 0
    elif pos == "WR" and rank >= 4.0:
        return 0
    elif (pos == "TE" or pos == "QB") and rank >= 2.0:
        return 0
    else:
        return curr_fpts

Here is the for loop that I've created:

players = ud_flex.shape[0]
for i in range(0,players):
    new_fpts = set_fpts(ud_flex.iloc[i]['position_name'], ud_flex.iloc[i]['wk_rank_orig'], ud_flex.iloc[i]['fpts'])
    ud_flex.at[i, 'fpts_orig'] = new_fpts

Does anyone have any suggestions for how to speed up this loop? It's currently taking nearly an hour! Thanks!

Answer 1

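You could start by reordering the checks so that the function exits sooner:

def set_fpts(pos, rank, curr_fpts):
    # Assumes pos is always one of "RB", "WR", "TE" or "QB".
    if rank >= 4:
        # Every position is zeroed at rank 4 or worse.
        return 0
    if rank < 2:
        # No position is zeroed below rank 2.
        return curr_fpts
    if pos in ("TE", "QB"):
        return 0
    if rank >= 3 and pos == "RB":
        return 0
    return curr_fpts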
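
Before swapping in a reordered function, it may be worth checking that it agrees with the original on every case. The sketch below compares the question's set_fpts with a hypothetical reordering (set_fpts_early_exit is an illustrative name, not something from the question), assuming the only positions are RB, WR, TE and QB:

```python
from itertools import product


def set_fpts(pos, rank, curr_fpts):
    # Original logic from the question.
    if pos == "RB" and rank >= 3.0:
        return 0
    elif pos == "WR" and rank >= 4.0:
        return 0
    elif (pos == "TE" or pos == "QB") and rank >= 2.0:
        return 0
    else:
        return curr_fpts


def set_fpts_early_exit(pos, rank, curr_fpts):
    # Hypothetical reordering; assumes pos is one of RB, WR, TE, QB.
    if rank < 2:
        return curr_fpts
    if rank >= 4 or pos in ("TE", "QB"):
        return 0
    if rank >= 3 and pos == "RB":
        return 0
    return curr_fpts


positions = ["RB", "WR", "TE", "QB"]
ranks = [1.0, 2.0, 3.0, 4.0, 5.0]

# Collect every (position, rank) pair where the two versions disagree.
mismatches = [
    (pos, rank)
    for pos, rank in product(positions, ranks)
    if set_fpts(pos, rank, 9.9) != set_fpts_early_exit(pos, rank, 9.9)
]
print(mismatches)
```

An empty list means the two versions agree on the grid. Note that the function is still called once per row, so the gain from reordering alone is probably small compared with avoiding the Python loop altogether.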

Answer 2

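In general, iterating through a pandas data frame row by row is slow, so it's not surprising that your for-loop approach is taking a while. Each ud_flex.iloc[i] call builds a new Series for that row, and your loop does this three times per row.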

I suspect that the following alternative should work more quickly for a data frame of your size.

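# True for rows whose points should be set to 0
mask = (((ud_flex['position_name'] == "RB") & (ud_flex['wk_rank_orig'] >= 3))
        | ((ud_flex['position_name'] == "WR") & (ud_flex['wk_rank_orig'] >= 4))
        | (ud_flex['position_name'].isin(["TE", "QB"]) & (ud_flex['wk_rank_orig'] >= 2)))
# Keep fpts where the mask is False and write 0 elsewhere.
# A single assignment also avoids chained indexing on a column that may not exist yet.
ud_flex['fpts_orig'] = ud_flex['fpts'].where(~mask, 0)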
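
Another vectorized option, offered only as a sketch, is to look up a per-position cutoff and compare the rank column against it in one pass. The cutoffs dictionary and the tiny sample frame below are made up for illustration; the column names are the ones used in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the real ud_flex frame.
ud_flex = pd.DataFrame({
    "position_name": ["RB", "RB", "WR", "WR", "TE", "QB", "QB"],
    "wk_rank_orig": [2.0, 3.0, 3.0, 4.0, 2.0, 1.0, 5.0],
    "fpts": [10.5, 8.0, 12.0, 7.5, 6.0, 20.0, 15.0],
})

# Rank at or beyond which each position's points are zeroed (taken from set_fpts).
cutoffs = {"RB": 3.0, "WR": 4.0, "TE": 2.0, "QB": 2.0}

# Positions missing from cutoffs map to NaN; comparing against NaN is False,
# so those rows keep their fpts, matching the else branch of set_fpts.
cutoff = ud_flex["position_name"].map(cutoffs)
zeroed = ud_flex["wk_rank_orig"] >= cutoff

ud_flex["fpts_orig"] = np.where(zeroed, 0.0, ud_flex["fpts"])
print(ud_flex)
```

Keeping the thresholds in one dictionary means a new position or a changed cutoff only touches that mapping, not the boolean expression.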