Mapping values inside pandas column

I used the code below to map the value 2 in the S column to 0, but it didn't work. Any suggestions on how to solve this?
N.B.: I want to use an external function inside the map.

df = pd.DataFrame({'Age':[30,40,50,60,70,80],'Sex': ['F','M','M','F','M','F'],'S': [1,1,2,2,1,2]})

def app(value):
    for n in df['S']:
        if n == 1:
            return 1
        if n == 2:
            return 0

df["S"] = df.S.map(app)

8 Answers
Use eq to create a boolean series, then convert that boolean series to int with astype:

df['S'] = df['S'].eq(1).astype(int)

or, equivalently:

df['S'] = (df['S'] == 1).astype(int)

Output:

   Age Sex  S
0   30   F  1
1   40   M  1
2   50   M  0
3   60   F  0
4   70   M  1
5   80   F  0

Hmm, this is much faster than assigning via loc
– user3483203
yesterday

@user3483203 you can try mask, which should be faster :-) df.S.mask(df.S>1,0)
– Wen
yesterday

Yep, much faster, I need to use mask more :D
– user3483203
yesterday
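For reference, the mask suggestion from the comments can be sketched as below, assuming the same example frame as in the question. Series.mask replaces the values where the condition is True and leaves the others untouched. No timing claim is made here; this only illustrates the call.

```python
import pandas as pd

# Same example data as in the question.
df = pd.DataFrame({'Age': [30, 40, 50, 60, 70, 80],
                   'Sex': ['F', 'M', 'M', 'F', 'M', 'F'],
                   'S': [1, 1, 2, 2, 1, 2]})

# Where the condition (S > 1) is True, substitute 0; other values are kept.
df['S'] = df['S'].mask(df['S'] > 1, 0)
print(df['S'].tolist())
```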

Don't use apply, simply use loc to assign the values:

df.loc[df.S.eq(2), 'S'] = 0

   Age Sex  S
0   30   F  1
1   40   M  1
2   50   M  0
3   60   F  0
4   70   M  1
5   80   F  0

If you need a more performant option, use np.select. This is also more scalable, as you can always add more conditions:

df['S'] = np.select([df.S.eq(2)], [0], 1)
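To illustrate the "add more conditions" point, here is a minimal sketch. The extra value 3 and its target -1 are hypothetical and not part of the question; they only show how a second condition/choice pair slots in.

```python
import numpy as np
import pandas as pd

# The trailing 3 is a hypothetical extra category, added for illustration.
df = pd.DataFrame({'S': [1, 1, 2, 2, 1, 2, 3]})

# Conditions are checked in order; each one is paired with the choice
# at the same position. Rows matching no condition get the default.
conditions = [df['S'].eq(2), df['S'].eq(3)]
choices = [0, -1]

df['S'] = np.select(conditions, choices, default=1)
print(df['S'].tolist())
```

Supporting another value only means appending one entry to each list.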

You're close, but you need a few corrections. Since you want to use a function, remove the for loop and replace n with value, so that the function works on the single value it is passed. You can also use apply instead of map; on a Series, both call the function once per element. See this answer for how to properly use apply vs applymap vs map.

def app(value):
    if value == 1:
        return 1
    elif value == 2:
        return 0

df['S'] = df.S.apply(app)

   Age Sex  S
0   30   F  1
1   40   M  1
2   50   M  0
3   60   F  0
4   70   M  1
5   80   F  0
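To make the difference concrete, the sketch below puts the question's function next to the corrected one. The names app_original and app_fixed are made up for this comparison. The original ignores its argument and returns as soon as the loop reaches the first element of df['S'], which is 1, so every row maps to 1.

```python
import pandas as pd

df = pd.DataFrame({'S': [1, 1, 2, 2, 1, 2]})

def app_original(value):
    # `value` is never used: the loop always starts at the first element
    # of df['S'] (which is 1) and returns immediately.
    for n in df['S']:
        if n == 1:
            return 1
        if n == 2:
            return 0

def app_fixed(value):
    # Works on the single value passed in by map/apply.
    if value == 1:
        return 1
    elif value == 2:
        return 0

broken = df['S'].map(app_original).tolist()
fixed = df['S'].map(app_fixed).tolist()
print(broken)
print(fixed)
```

With the loop removed, Series.map works as well as Series.apply here, since both call the function once per element.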

If you only wish to change values equal to 2, you can use pd.DataFrame.loc:

df.loc[df['S'] == 2, 'S'] = 0

pd.Series.apply is not recommended, as it is just a thinly veiled, inefficient loop.

You could use .replace as follows:
df["S"] = df["S"].replace([2], 0)
This replaces every 2 with 0 in one line.
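If several values need substituting, replace also accepts a dict. The sketch below assumes a hypothetical extra value 3 (mapped to 9) that is not in the question's data; values without a dict entry are left as they are.

```python
import pandas as pd

# The 3 is a hypothetical extra value, added only for illustration.
s = pd.Series([1, 2, 3, 2, 1])

# Keys are the values to look for, dict values are their replacements.
replaced = s.replace({2: 0, 3: 9})
print(replaced.tolist())
```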

Go with a vectorized NumPy operation:

df['S'] = np.abs(df['S'] - 2)

and stand out from the competition in interviews and SO answers :)
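A small sketch of what the arithmetic does, assuming the column only ever holds 1 and 2 as in the question. The second series, with a hypothetical 3, shows that the trick depends on that assumption.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 1, 2, 2, 1, 2])

# |1 - 2| = 1 and |2 - 2| = 0, so 1 stays 1 and 2 becomes 0.
flipped = np.abs(s - 2)

# Hypothetical extra value: |3 - 2| = 1, so a 3 becomes
# indistinguishable from an original 1.
with_extra = np.abs(pd.Series([1, 2, 3]) - 2)

print(flipped.tolist())
print(with_extra.tolist())
```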

>>> df = pd.DataFrame({'Age':[30,40,50,60,70,80],'Sex': ['F','M','M','F','M','F'],'S': [1,1,2,2,1,2]})
>>> def app(value):
...     return 1 if value == 1 else 0
...
>>> # or: app = lambda value: 1 if value == 1 else 0
>>> df["S"] = df["S"].map(app)
>>> df
   Age  S Sex
0   30  1   F
1   40  1   M
2   50  0   M
3   60  0   F
4   70  1   M
5   80  0   F

You can do:

import numpy as np

df['S'] = np.where(df['S'] == 2, 0, df['S'])
