Matplotlib bar chart from two variable column - Pandas data manipulation (mostly)

NOTE: Solution Posted Below!!!

I have a time-indexed table with a column BLIP that has only two values, "XX" and "YY". The goal is to show a count of "XX" and "YY" per period, with the "YY" bars drawn below the x axis. I'm trying to create the correct data structure from a pandas table using code from Wes McKinney's book on Data Analysis (pg 26, I think):

df = base_df.drop(columns=dropcols).set_index('Created')
group = ['f2', 'BLIP']
df0 = df_minus.groupby(group)
agg_counts = df0.size().unstack().fillna(0)
indexer = agg_counts.sum(1).argsort()
count_subset = agg_counts.take(indexer).copy()
table = count_subset.groupby('BLIP').resample('MS').count().unstack('BLIP')['BLIP']
chart = table.plot.bar(title=chart_title, x=None, color=['green', 'red', 'grey']);

The line

agg_counts = df0.size().unstack().fillna(0)

results in the following error:

TypeError: 'numpy.int32' object is not callable
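
One possible cause, offered only as a guess: the code as posted calls size() on a GroupBy object, which is valid, so the error suggests that df0 was actually a plain DataFrame when the line ran. DataFrame.size is an attribute holding the number of elements, so calling it fails with this kind of TypeError, while GroupBy.size() is a method. A minimal sketch with made-up data (the column values here are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for the real table.
df = pd.DataFrame({"f2": ["Y", "R", "Y"], "BLIP": ["XX", "YY", "XX"]})

# DataFrame.size is an attribute (rows * columns), not a method,
# so calling it raises a TypeError ("... object is not callable").
try:
    df.size()
    called_ok = True
except TypeError as exc:
    called_ok = False
    print("TypeError:", exc)

# GroupBy.size() is a method and returns the row count per group.
agg_counts = df.groupby(["f2", "BLIP"]).size().unstack().fillna(0)
print(agg_counts)
```

If this is what happened, checking type(df0) just before the failing line would confirm it.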

I found this gem of a snippet here, but can't find the documentation to decipher it.

data['values'].plot(kind='bar', color=data.positive.map({True: 'g', False: 'r'}))
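
For what it's worth, the color argument in that snippet is just a Series of color codes built with Series.map, which looks each value up in the dict. A small sketch of what the mapping produces, using hypothetical data; the same lookup works with string keys, so a separate True/False column should not be needed for BLIP:

```python
import pandas as pd

# Hypothetical data shaped like the snippet's `data` frame.
data = pd.DataFrame({"values": [3, -2, 5, -1]})
data["positive"] = data["values"] > 0

# Series.map replaces each value with its dict entry: one color per bar.
bool_colors = data["positive"].map({True: "g", False: "r"})
print(bool_colors.tolist())

# The same idea applied directly to a two-valued string column.
blip = pd.Series(["XX", "YY", "XX"])
blip_colors = blip.map({"XX": "g", "YY": "r"})
print(blip_colors.tolist())
```

The resulting Series (or its list form) can then be passed as the color argument when plotting.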

This seems like it would be very simple, but I'm quite wrapped around the axle on this.

The pandas table format is something like

create_date  f1  f2  f3  BLIP  f5...
dt_stamp     X   Y   Z   XX    K1
dt_stamp     S   R   Y   YY    K3
dt_stamp     P   P   T   XX    K1

and so on.

Per Jesse's suggestion I tried

df_plus = df[df['BLIP'] == 'XX']
df_minus = df[df['BLIP'] == 'YY']
ax = plt.axes()
ax.bar(df_plus.index, df_plus['BLIP'], width=0.4, color='g')
ax.bar(df_minus.index, df_minus['BLIP'], width=0.4, color='r')
ax.autoscale()
plt.show()

This resulted in

ValueError: shape mismatch: objects cannot be broadcast to a single shape

Solution in its entirety

import matplotlib.pyplot as plt

# base_df (indexed by timestamp) and graph_images_dir are defined elsewhere
df = base_df
chart_title = 'YTD CR Trend Summary'
width, height = 8, 6

df_plus = df[df['BLIP'] == 'XX']
df_minus = df[df['BLIP'] == 'YY']

# Monthly counts; the 'YY' counts are negated so they plot below the x axis
p = df_plus.resample('MS').count()['BLIP'].fillna(0)
n = df_minus.resample('MS').count()['BLIP'].apply(lambda x: int(-x))
print(chart_title, len(df), p.sum(), n.sum())

plt.clf()
fig = plt.figure(figsize=(width, height))
ax = plt.axes(label=chart_title)  # label suppresses warning
if p.sum() != 0:
    ax.bar(p.index, p, width=10, color='g')
if n.sum() != 0:
    ax.bar(n.index, n, width=10, color='r')
plt.suptitle(chart_title, fontsize=11)

filename = f'{graph_images_dir}{chart_title}.png'
print(f'Saving {filename}')
plt.savefig(filename, bbox_inches='tight', pad_inches=0.5, dpi=200)
plt.show()

I would like to know if I can use the data.positive.map to directly map to the "XX" values in BLIP or if I have to create a new field with True and False values.
– Harvey
yesterday

It seems as if I have to go from this 3NF dataframe to some sort of object that gives me BLIP.value_counts by month, and then I somehow need to map those counts to the plot properly.
– Harvey
yesterday

1 Answer

You can plot it manually using matplotlib:

import matplotlib.pyplot as plt

ax = plt.axes()
ax.bar(table.index, table['XX'], width=0.4, color='g')
ax.bar(table.index, table['YY'], width=0.4, color='r')
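
The snippet above assumes a table with one row per period and one column per BLIP value. A minimal sketch of one way to build such a table and draw the "YY" bars below the axis follows; the sample data is hypothetical, and pd.Grouper with unstack is a swapped-in approach, not the code from the question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data: a timestamp index and a two-valued BLIP column.
idx = pd.to_datetime(
    ["2019-01-03", "2019-01-20", "2019-02-11", "2019-02-15", "2019-03-02"]
)
df = pd.DataFrame({"BLIP": ["XX", "YY", "XX", "XX", "YY"]}, index=idx)

# Count rows per month start and BLIP value, then pivot BLIP into columns.
table = (
    df.groupby([pd.Grouper(freq="MS"), "BLIP"])
    .size()
    .unstack(fill_value=0)
)

# Negate the YY counts so those bars hang below the x axis.
yy_below = -table["YY"]

fig, ax = plt.subplots()
# With a datetime x axis, bar width is measured in days.
ax.bar(table.index, table["XX"], width=10, color="g")
ax.bar(table.index, yy_below, width=10, color="r")
ax.axhline(0, color="black", linewidth=0.8)
```

Drop the matplotlib.use("Agg") line and add plt.show() to display the figure interactively. Because both columns share table.index, the x values and heights always have matching lengths.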

I divided the table into two dataframes, df_pos and df_neg, then tried ax.bar(table.index, df_pos['BLIP'], width=0.4, color='g') and got this error: ValueError: shape mismatch: objects cannot be broadcast to a single shape
– Harvey
yesterday

That means your table.index does not have the same length as df_pos['BLIP']. You can use df_pos.index instead.
– Jesse Bakker
yesterday
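
The length mismatch described in this comment can be checked before plotting. A small sketch with hypothetical data, where df stands in for the full table and df_pos for the filtered subset:

```python
import pandas as pd

# Hypothetical stand-ins: df plays the role of the full table.
idx = pd.to_datetime(
    ["2019-01-03", "2019-01-20", "2019-02-11", "2019-02-15", "2019-03-02"]
)
df = pd.DataFrame({"BLIP": ["XX", "YY", "XX", "XX", "YY"]}, index=idx)
df_pos = df[df["BLIP"] == "XX"]

# Filtering drops rows, so the full index is longer than the subset column.
full_len = len(df.index)
subset_len = len(df_pos["BLIP"])
print(full_len, subset_len)

# The subset's own index always matches its own column.
lengths_match = len(df_pos.index) == len(df_pos["BLIP"])
print(lengths_match)
```

Matching lengths only removes the broadcast error; the bar heights still need to be numeric counts rather than the "XX" strings themselves.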

Solution posted above. Jesse's suggestion is correct, but table['XX'] should be table['BLIP']. Warning: do not code when sleep deprived!!
– Harvey
19 hours ago
