How to unpack multiple dictionary objects inside list within a row of dataframe?

I have a dataframe in which every row holds a single list of dictionaries, and the lists differ in length from row to row, as below:
ID unnest_column
1 [{'abc': 11, 'def': 1},{'abc': 15, 'def': 1},
{'abc': 16, 'def': 1},
{'abc': 17, 'def': 1},
{'abc': 18, 'def': 1, 'ghi': 'abc'},
{'abc': 23, 'def': 'xxx', 'def': 1},
{'abc': 23, 'def': 'xxx', 'def': 2},
{'abc': 23, 'def': 'xxx', 'def': 4}]
2 [{'abc': 11, 'def': 1}]
How do I unpack the dictionaries in each list and turn their keys into columns?
new df potentially, not sure exactly how it will look, just need keys into columns:
id abc def ghi
1 2 3 abc
1 Answer
IIUC, from
import pandas as pd
df = pd.DataFrame()
df['x'] = [[{'QuestionId': 11, 'ResponseId': 1},{'QuestionId': 15, 'ResponseId': 1},
{'QuestionId': 16, 'ResponseId': 1},
{'QuestionId': 17, 'ResponseId': 1},
{'QuestionId': 18, 'ResponseId': 1, 'Value': 'abc'},
{'QuestionId': 23, 'DataLabel': 'xxx', 'ResponseId': 1},
{'QuestionId': 23, 'DataLabel': 'xxx', 'ResponseId': 2},
{'QuestionId': 23, 'DataLabel': 'xxx', 'ResponseId': 4}],
[{'QuestionId': 11, 'ResponseId': 1}]]
You can sum your lists to aggregate them, and use the DataFrame constructor:
new_df = pd.DataFrame(df.x.values.sum())
DataLabel QuestionId ResponseId Value
0 NaN 11 1 NaN
1 NaN 15 1 NaN
2 NaN 16 1 NaN
3 NaN 17 1 NaN
4 NaN 18 1 abc
5 xxx 23 1 NaN
6 xxx 23 2 NaN
7 xxx 23 4 NaN
8 NaN 11 1 NaN
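Summing lists builds a new, longer list at every step, so with many rows it may be worth flattening with itertools.chain.from_iterable instead. A minimal sketch, assuming a shortened copy of the example data in a column named x:

```python
from itertools import chain

import pandas as pd

# Shortened copy of the example data (assumption: list column is named "x").
df = pd.DataFrame({
    "x": [
        [{"QuestionId": 11, "ResponseId": 1},
         {"QuestionId": 18, "ResponseId": 1, "Value": "abc"}],
        [{"QuestionId": 11, "ResponseId": 1}],
    ]
})

# Flatten the per-row lists into one list of dicts without repeated copying.
flat = list(chain.from_iterable(df["x"]))

# Keys missing from a dict become NaN in the resulting frame.
new_df = pd.DataFrame(flat)
print(new_df)
```

The result should match what the sum-based version gives for the same data; only the flattening step differs.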
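If you want to maintain the original indexes, you can build an inds list and pass it as the index argument to the constructor:
# Series.iteritems() was removed in pandas 2.0; items() is the replacement.
# Repeat each row label once per dictionary in that row's list.
inds = [i for i, v in df.x.items() for _ in v]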
pd.DataFrame(df.x.values.sum(), index=inds)
DataLabel QuestionId ResponseId Value
0 NaN 11 1 NaN
0 NaN 15 1 NaN
0 NaN 16 1 NaN
0 NaN 17 1 NaN
0 NaN 18 1 abc
0 xxx 23 1 NaN
0 xxx 23 2 NaN
0 xxx 23 4 NaN
1 NaN 11 1 NaN
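The same index-preserving result can also be sketched with Series.explode, assuming pandas 0.25 or later and that no row holds an empty list (an empty list explodes to NaN, which this sketch does not handle). The frame is a shortened copy of the example data:

```python
import pandas as pd

# Shortened copy of the example data (assumption: list column is named "x").
df = pd.DataFrame({
    "x": [
        [{"QuestionId": 11, "ResponseId": 1},
         {"QuestionId": 18, "ResponseId": 1, "Value": "abc"}],
        [{"QuestionId": 11, "ResponseId": 1}],
    ]
})

# One dict per row; the original row label is repeated for each dict.
exploded = df["x"].explode()

# Build the columns from the dicts and reuse the repeated labels as the index.
new_df = pd.DataFrame(exploded.tolist(), index=exploded.index)
print(new_df)
```

Here explode does the label bookkeeping that the inds list does by hand.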
This is awesome and way better than using apply(pd.Series) and stack, but with this method, how do you keep the id number? Is there a way to link this number in your new_df? Just a curiosity :)
– Ben.T
yesterday
@Ben.T there is a way to maintain the IDs :) will edit.
– RafaelC
yesterday
@RafaelC smart and easy. already upvote so can't do more ^^ but I keep in mind this method :)
– Ben.T
yesterday
@RafaelC thank you very much. Works well.
– RustyShackleford
12 hours ago
this works. However the dataframe has roughly 30 other rows, could this operation be done in the same dataframe while maintaining the shape? If not no worries I will accept the answer
– RustyShackleford
yesterday
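If the concern in the comment above is keeping the frame's other columns, one possible sketch is to unpack into a frame indexed by the original row labels and join it back. The ID and other columns below are hypothetical stand-ins, and the row count necessarily grows to one row per dictionary, so the original shape cannot be kept exactly:

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2],
    "other": ["a", "b"],  # hypothetical stand-in for the remaining columns
    "x": [
        [{"QuestionId": 11, "ResponseId": 1},
         {"QuestionId": 18, "ResponseId": 1, "Value": "abc"}],
        [{"QuestionId": 11, "ResponseId": 1}],
    ],
})

# Unpack the dicts, keeping the original row label for each one.
exploded = df["x"].explode()
unpacked = pd.DataFrame(exploded.tolist(), index=exploded.index)

# join aligns on the index, so each original row is repeated once per dict.
result = df.drop(columns="x").join(unpacked)
print(result)
```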