Return all rows after a pandas groupby (i.e. not a reduced number of rows, one per unique value of the group key)


The code below, taken from the tutorials, produces the following output:



Code:


import pandas as pd
import numpy as np

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C': np.random.randn(8),
                   'D': np.random.randn(8)})

print(df)

# Select the numeric columns explicitly: recent pandas versions raise
# a TypeError when mean() meets the string column 'B'
grouped = df.groupby('A')[['C', 'D']].mean()
print(grouped)



Result:


     A      B         C         D
0  foo    one -0.787410 -0.857863
1  bar    one  0.140572  1.330183
2  foo    two -0.770166  2.123528
3  bar  three -0.965523  0.771663
4  foo    two  0.215037 -0.597935
5  bar    two -1.023839 -0.248445
6  foo    one -1.377515  2.041921
7  foo  three -0.314333  1.379423
            C         D
A
bar -0.616263  0.617800
foo -0.606877  0.817815



However, I would like to keep all the rows, with C and D in each row replaced by the mean for that row's group, as in the following:


0  foo    one -0.606877  0.817815
1  bar    one -0.616263  0.617800
2  foo    two -0.606877  0.817815
3  bar  three -0.616263  0.617800
4  foo    two -0.606877  0.817815
5  bar    two -0.616263  0.617800
6  foo    one -0.606877  0.817815
7  foo  three -0.606877  0.817815



I am open to using any other library as well. I just need to do this quickly and efficiently in Python 3.



Thanks in advance




2 Answers



Use GroupBy.transform and specify the columns to aggregate:




cols = ['C','D']
df[cols] = df.groupby('A')[cols].transform('mean')
print(df)
     A      B         C         D
0  foo    one  0.444616 -0.232363
1  bar    one  0.173897 -0.603437
2  foo    two  0.444616 -0.232363
3  bar  three  0.173897 -0.603437
4  foo    two  0.444616 -0.232363
5  bar    two  0.173897 -0.603437
6  foo    one  0.444616 -0.232363
7  foo  three  0.444616 -0.232363
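
If the original C and D values are still needed, the same transform call can feed new columns instead of overwriting the old ones. The following is only a sketch under a few assumptions: the `_mean` suffix, the `means` and `out` names, and the fixed random seed are illustrative choices, not part of the answer above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed (assumption) so runs are repeatable
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': rng.standard_normal(8),
    'D': rng.standard_normal(8),
})

cols = ['C', 'D']

# transform returns one row per original row, aligned on df's index
means = df.groupby('A')[cols].transform('mean')

# Keep the raw values and attach the group means under new, hypothetical names
out = df.join(means.add_suffix('_mean'))
print(out)
```

Because the result of transform shares the index of df, the join lines up row by row and no merge on the key column is needed.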





Wow. Magic. Thanks
– Uğur Dinç
5 hours ago





@UğurDinç - You are welcome!
– jezrael
5 hours ago



You could also use apply. Perform the operation on each group, but return all the rows of the group.




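def my_func(x):
    # x is the sub-DataFrame for one group; copy it to avoid in-place edits
    x = x.copy()
    # Example operation: overwrite D with the group's mean of C
    x["D"] = x.C.mean()
    return x

grouped = df.groupby('A', as_index=False).apply(my_func)
print(grouped)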
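
To reproduce the output asked for in the question with apply, the function can replace both C and D with their group means. This is a rough sketch rather than the answerer's code: `fill_with_group_mean` is a hypothetical helper, the seed is an assumption, and the value columns are selected before apply so the function never sees the grouping column. The last line compares the result with the transform approach.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed (assumption) so runs are repeatable
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': rng.standard_normal(8),
    'D': rng.standard_normal(8),
})

cols = ['C', 'D']

def fill_with_group_mean(group):
    # Hypothetical helper: replace every value column with its group mean
    group = group.copy()
    for col in cols:
        group[col] = group[col].mean()
    return group

# group_keys=False keeps the original row labels instead of adding a key level
via_apply = df.groupby('A', group_keys=False)[cols].apply(fill_with_group_mean)
via_apply = via_apply.sort_index()  # make sure rows are in the original order

via_transform = df.groupby('A')[cols].transform('mean')
print(np.allclose(via_apply, via_transform))
```

apply calls a Python function once per group, so on large frames the built-in transform('mean') is usually the faster of the two.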





