Average the sum of rows without a creating new column in Excel

Multi tool use
Average the sum of rows without a creating new column in Excel
Here's a sample of my matrix:
A B C D E
1 0 0 1 1
0 0 0 0 0
0 0 1 1 0
0 2 1
You can think of each row as a respondent and each column as an item on a questionnaire.
My goal is to take an average of the sum of each row (i.e. total score for each respondent) without creating a new column AND accounting for the fact that some or all of the entries in a given row are empty (e.g., some respondents
missed some items [see row 5] or didn't complete the questionnaire entirely [see row 3]).
The desired solution for this matrix = 1.67, whereby
[1+0+0+1+1 = 3] + [0+0+0+0+0 = 0] + [0+0+1+1+0 = 2]/3 == 5/3 = 1.67
As you can see, we have averaged over three values despite there being five rows because one has missing data.
I am already able to take an average of the sum of rows which are only summed for non-missing entries, e.g.,:
=AVERAGE(IF(AND(A1<>"",B1<>"",C1<>"",D1<>"",E1<>""),SUM(A1:E1)),IF(AND(A2<>"",B2<>"",C2<>"",D2<>"",E2<>""),SUM(A2:E2)),IF(AND(A3<>"",B3<>"",C3<>"",D3<>"",E3<>""),SUM(A3:E3)),IF(AND(A4<>"",B4<>"",C4<>"",D4<>"",E4<>""),SUM(A4:E4)),IF(AND(A5<>"",B5<>"",C5<>"",D5<>"",E5<>""),SUM(A5:E5)))
However, this results in a value of 1 because it treats any row with some or all values values as = 0.
It does the following:
[1+0+0+1+1 = 3] + [0+0+0+0+0 = 0] + [0+0+0+0+0 = 0] + [0+0+1+1+0 = 2] + [0+0+0+0+0 = 0]/4 == 5/5 = 1
Does anyone have any ideas about how to adapt the current code to average over non-missing values or an alternative way of achieving the desired result?
You are correct and I have updated the answer to make this more explicit. Cheers, PJnin
– PyjamaNinja
3 hours ago
2 Answers
2
You can do this more concisely with an array formula, but the short answer to fix up your existing formula is, if you have a blank cell in your sheet somewhere (say it's F1) AVERAGE will ignore blank cells so change your formula to
=AVERAGE(IF(AND(A1<>"",B1<>"",C1<>"",D1<>"",E1<>""),SUM(A1:E1),F1),IF(AND(A2<>"",B2<>"",C2<>"",D2<>"",E2<>""),SUM(A2:E2),F1),IF(AND(A3<>"",B3<>"",C3<>"",D3<>"",E3<>""),SUM(A3:E3),F1),IF(AND(A4<>"",B4<>"",C4<>"",D4<>"",E4<>""),SUM(A4:E4),F1),IF(AND(A5<>"",B5<>"",C5<>"",D5<>"",E5<>""),SUM(A5:E5),F1))
This would be one array formula version of your formula - it uses OFFSET to pull out each row of the matrix then SUBTOTAL to see if every cell in that row has a number in it. Then it uses SUBTOTAL again to work out the sum of each row and AVERAGE to get the average of rows.
=AVERAGE(IF(SUBTOTAL(2,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))=COLUMNS(A1:E1),SUBTOTAL(9,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1))),""))
Has to be entered as an array formula using CtrlShiftEnter
Note 1 - some people don't like using OFFSET because it is volatile - you can use matrix multiplication instead but it's arguably less easy to understand.
Note 2 - I used "" instead of referring to an empty cell. Interesting that the non-array formula needed an actual blank cell but the array formula needed an empty string.
You can omit the empty string
=AVERAGE(IF(SUBTOTAL(2,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))=COLUMNS(A1:E1),SUBTOTAL(9,OFFSET(A1,ROW(A1:A5)-ROW(A1),0,1,COLUMNS(A1:E1)))))
Many thanks, Tom! I have a crude awareness of array formula but it might be helpful since I need to create the formulae for 5k+ rows. Would you be able to demonstrate what it may look like?
– PyjamaNinja
2 hours ago
Tom Sharpe you have opened up a new world for me!
– PyjamaNinja
46 mins ago
Interesting isn't it? Enjoy.
– Tom Sharpe
15 mins ago
Basically, what you're describing here for your desired result is the =AVERAGEA()
function
=AVERAGEA()
The Microsoft Excel AVERAGEA function returns the average (arithmetic
mean) of the numbers provided. The AVERAGEA function is different from
the AVERAGE function in that it treats TRUE as a value of 1 and FALSE
as a value of 0.
With that in mind, the formula should look like this.
=SUM(AVERAGEA(A1:A4),AVERAGEA(B1:B4),AVERAGE(C1:C4),AVERAGEA(D1:D4),AVERAGEA(E1:E4))
Produces the expected result:
Note, if you want to ROUND()
the result to two digits, add the following formula to it:
ROUND()
=ROUND(SUM(AVERAGEA(A1:A4),AVERAGEA(B1:B4),AVERAGE(C1:C4),AVERAGEA(D1:D4),AVERAGEA(E1:E4)), 2)
Thank you for taking the time to do this, but I am limited to averaging horizontally not vertically. This is because some rows include partially missing data (e.g., see row 5 in the example above), and I would like to ignore these rows entirely.
– PyjamaNinja
3 hours ago
@PyjamaNinja I don't understand. This is the equivalent to ignoring, it even returns your desired result.
– Rawrplus
2 hours ago
Apologies - perhaps my description is still not clear enough so I will try to demonstrate. If you add a '2' to cell B3 in the spreadsheet you pasted, the result changes to 2.17 right? I want to ignore a row even if it includes some values. I only want to analyze full rows. I still +1'd your answer because it was helpful and informative. Hopefully that clarifies things. PJnin
– PyjamaNinja
1 hour ago
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
So if you have (say) one missing value in a row, you want to take the average of the other four values, and if all the values in a row are missing, you want to ignore that row completely?
– Tom Sharpe
4 hours ago