How to groupby Month and Year and then sum total in Pandas?

122
January 26, 2022, at 04:30 AM

In my dataset, I have 3 columns... I am looking to group by month and year... however i am also looking to group by Name and sum the Price.

Here is a mock data set I've made:

import pandas as pd
# initialise data of lists.
data = {'Name':['A', 'B', 'A', 'C', 'C', 'A', 'B', 'A', 'B','B','B', 'C', 'C', 'A', 'C', 'B'], 
'Date': ['06/01/19', '06/11/19', '06/25/19', '06/05/19', '06/02/19', '06/13/19', '06/21/19', '03/09/20', 
'03/17/20', '03/22/20', '06/30/20', '06/22/20', '06/10/20', '07/05/20', '07/25/20', '07/21/20'], 
'Price': [10, 27, 8, 10, 38, 38, 93, 12, 55, 39, 52, 62, 25, 10, 39, 37]}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
print(df)
totalSum = df.groupby([df['Date'].dt.year, df['Date'].df.month]).agg({'Price':sum})

The output should look something like:

06/2019
    A   56
    B   120
    C   48
03/2020
    A   12
    B   94
...

etc.

Answer 1

Good try, but the column names are required to groupby, not the value of something like you tried. Thus, I updated df['Date']' form such as '2019-06' and used 'Date' and 'Name' columns for groupby, as follows:

import pandas as pd
# initialise data of lists.
data = {
    'Name': ['A', 'B', 'A', 'C', 'C', 'A', 'B', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'B'],
    'Date': ['06/01/19', '06/11/19', '06/25/19', '06/05/19', '06/02/19', '06/13/19', '06/21/19', '03/09/20', '03/17/20', '03/22/20', '06/30/20', '06/22/20', '06/10/20', '07/05/20', '07/25/20', '07/21/20'],
    'Price': [10, 27, 8, 10, 38, 38, 93, 12, 55, 39, 52, 62, 25, 10, 39, 37]
}
# Create DataFrame
df = pd.DataFrame(data)
# to make 06/01/19 to 06/01/2019
df['Date'] = df['Date'].apply(lambda x: x[:6]+'20'+x[6:])  
# to remove Day and leave only Year-Month
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y').apply(lambda x: x.strftime('%Y-%m'))  
# Print the output.
totalSum = df.groupby(by=['Date', 'Name']).agg({'Price': sum})
print(totalSum)
#              Price
#Date    Name       
#2019-06 A        56
#        B       120
#        C        48
#2020-03 A        12
#        B        94
#2020-06 B        52
#        C        87
#2020-07 A        10
#        B        37
#        C        39
Answer 2

You were almost there, but for it to work, you needed first to call pandas' to_datetime() method twice to generate the years and the months based on the 'Date' and to use Name' as an additional argument for the groupbby:

totalSum = df.groupby([pd.to_datetime(df['Date']).dt.year,
                       pd.to_datetime(df['Date']).dt.month,
                       'Name']).agg({'Price': sum})
totalSum
Out[17]: 
                Price
Date Date Name       
2019 6    A        56
          B       120
          C        48
2020 3    A        12
          B        94
     6    B        52
          C        87
     7    A        10
          B        37
          C        39
Rent Charter Buses Company
READ ALSO
Receive messages/events from SQS and map them to Dynamodb

Receive messages/events from SQS and map them to Dynamodb

I am trying to figure out a way where I can read my SQS events and map them to DynamoDB using JavaI have a SQS in AWS which will get some messages having some information e

135
Why does jQuery or a DOM method such as getElementById not find the element?

Why does jQuery or a DOM method such as getElementById not find the element?

What are the possible reasons for documentgetElementById, $("#id") or any other DOM method / jQuery selector not finding the elements?

46
How to remove multiple children in one call - Realtime Firebase

How to remove multiple children in one call - Realtime Firebase

I wan to delete multiple children with known keyI searched all the related posts but it doesn't seem to work

122
Image data not saving to MongoDB

Image data not saving to MongoDB

Preface: using Nodejs w/Express to build a blog website and using multer to upload an image to mongoDB

123