How to unpack multiple dictionary objects inside a list within a row of a dataframe?

I have a dataframe where every row holds a single list of dictionaries, and the lists differ in size from row to row, as below:

    ID  unnest_column
    1   [{'abc': 11, 'def': 1}, {'abc': 15, 'def': 1}, {'abc': 16, 'def': 1},
         {'abc': 17, 'def': 1}, {'abc': 18, 'def': 1, 'ghi': 'abc'},
         {'abc': 23, 'def': 'xxx', 'def': 1}, {'abc': 23, 'def': 'xxx', 'def': 2},
         {'abc': 23, 'def': 'xxx', 'def': 4}]
    2   [{'abc': 11, 'def': 1}]

How do I unpack the dictionaries in the list and make the key values columns?

The new df could look something like this. I'm not sure exactly how it will look, I just need the keys as columns:

    id  abc  def  ghi
    1   2    3    abc

1 Answer

IIUC, from

    import pandas as pd

    df = pd.DataFrame()
    df['x'] = [
        [{'QuestionId': 11, 'ResponseId': 1},
         {'QuestionId': 15, 'ResponseId': 1},
         {'QuestionId': 16, 'ResponseId': 1},
         {'QuestionId': 17, 'ResponseId': 1},
         {'QuestionId': 18, 'ResponseId': 1, 'Value': 'abc'},
         {'QuestionId': 23, 'DataLabel': 'xxx', 'ResponseId': 1},
         {'QuestionId': 23, 'DataLabel': 'xxx', 'ResponseId': 2},
         {'QuestionId': 23, 'DataLabel': 'xxx', 'ResponseId': 4}],
        [{'QuestionId': 11, 'ResponseId': 1}],
    ]

You can sum your lists to concatenate them into one flat list of dictionaries, and pass that to the DataFrame constructor:

    new_df = pd.DataFrame(df.x.values.sum())

      DataLabel  QuestionId  ResponseId Value
    0       NaN          11           1   NaN
    1       NaN          15           1   NaN
    2       NaN          16           1   NaN
    3       NaN          17           1   NaN
    4       NaN          18           1   abc
    5       xxx          23           1   NaN
    6       xxx          23           2   NaN
    7       xxx          23           4   NaN
    8       NaN          11           1   NaN
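The reason summing works is that adding two Python lists concatenates them, so the sum over the column is one flat list of dictionaries. Here is a minimal sketch on a made-up two-row frame (the sample values and the names `flat_sum` and `flat` are just for illustration). It also shows `itertools.chain.from_iterable`, which should give the same flat list without building a new intermediate list at every addition:

```python
from itertools import chain

import pandas as pd

# Illustrative data: each cell of column "x" is a list of dictionaries.
df = pd.DataFrame({
    "x": [
        [{"QuestionId": 11, "ResponseId": 1},
         {"QuestionId": 18, "ResponseId": 1, "Value": "abc"}],
        [{"QuestionId": 11, "ResponseId": 1}],
    ]
})

# list + list concatenates, so summing the object array flattens the column.
flat_sum = df.x.values.sum()

# Same flat list, built in a single pass.
flat = list(chain.from_iterable(df.x))

# Keys missing from a dictionary become NaN in the resulting frame.
new_df = pd.DataFrame(flat)
print(new_df)
```

On a small frame either way is fine; the `chain` version is only worth it if the column is long.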

If you want to maintain the original indexes, you can build an inds list that repeats each row's index once per dictionary and pass it as the index argument to the constructor:

    # Series.iteritems() was removed in pandas 2.0; items() is the equivalent.
    inds = [i for i, v in df.x.items() for _ in v]
    pd.DataFrame(df.x.values.sum(), index=inds)

      DataLabel  QuestionId  ResponseId Value
    0       NaN          11           1   NaN
    0       NaN          15           1   NaN
    0       NaN          16           1   NaN
    0       NaN          17           1   NaN
    0       NaN          18           1   abc
    0       xxx          23           1   NaN
    0       xxx          23           2   NaN
    0       xxx          23           4   NaN
    1       NaN          11           1   NaN
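Since the unpacked frame carries the original row index, it can be joined back to the other columns of the source frame. A rough sketch, assuming the frame has an `ID` column next to the list column (the sample data and the names `unpacked` and `result` are made up for illustration):

```python
import pandas as pd

# Illustrative data: an ID column plus a column of lists of dictionaries.
df = pd.DataFrame({
    "ID": [1, 2],
    "x": [
        [{"QuestionId": 11, "ResponseId": 1},
         {"QuestionId": 23, "DataLabel": "xxx", "ResponseId": 2}],
        [{"QuestionId": 11, "ResponseId": 1}],
    ],
})

# Repeat each row's index label once per dictionary in that row's list.
inds = [i for i, v in df.x.items() for _ in v]

# Flatten the lists and label every dictionary with its source row.
unpacked = pd.DataFrame(df.x.values.sum(), index=inds)

# Join on the index: each original row is repeated once per dictionary.
result = df.drop(columns="x").join(unpacked)
print(result)
```

The join is on the index, so this relies on the source frame having a unique index. Every other column is repeated for each dictionary that came from its row.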

This works. However, the dataframe has roughly 30 other rows; could this operation be done in the same dataframe while maintaining the shape? If not, no worries, I will accept the answer.
– RustyShackleford
yesterday

This is awesome and way better than using apply(pd.Series) and stack, but with this method, how do you keep the id number? Is there a way to link this number in your new_df? Just a curiosity :)
– Ben.T
yesterday

@Ben.T there is a way to maintain the IDs :) will edit.
– RafaelC
yesterday

@RafaelC smart and easy. Already upvoted so I can't do more ^^ but I'll keep this method in mind :)
– Ben.T
yesterday

@RafaelC thank you very much. Works well.
– RustyShackleford
12 hours ago
