Mastering the venerable SUMPRODUCT function is a requirement for any Excel Hero.
From my perspective, Excel would be a very different application if the SUMPRODUCT function had never been developed to the degree it is today, and it is the primary reason I do not use OpenOffice. It's an amazing function that can be marshaled to solve a dizzying array of problems. It's not an Array Function, meaning it does not need to be entered with the Control-Shift-Enter key combination, and so it does not need the fancy parenthesis that adorn normal Array Functions. However, it is a function that works on arrays, and is actually slightly quicker than an equivalent CSE Array Function.
One would never guess from the innocuous name the shear power and versatility embedded within the SUMPRODUCT function. Not only is it well worth your time to examine this function in detail, it is quite literally a prerequisite to honing your Excel abilities to the Excel Hero level. Rarely do I build a model that does not make extensive use of the SUMPRODUCT function.
From a maths perspective, SUMPRODUCT calculates the dot product of two vectors (arrays), but realize that it can be used with many more than two arrays. A better name might have been the SumOfProducts function, as this in a nutshell is what it does. Consider the following image:

Notice that Column D is the product of the three columns to its left. Cell D2 contains the formula =A2*B2*C2 and that formula is copied down as far as the columns go (Row 30 in this case). The entire purpose for Column D is to get intermediary values so that we can later sum them. Cell F4 shows this final calculation. For some spreadsheets this intermediary column approach is wanted for presentation purposes. However, sometimes all we want is the ultimate total, the sum of all those products. And that's exactly what the SUMPRODUCT function does. Notice the formula in the Formula Bar. It results in the same total.
There are some points to note here.
Firstly, notice that the arrays I've used are the entire columns (A:A, etc.). I've done this on purpose to point out that doing so is a bad idea. This was not even possible prior to Excel 2007 and would result in an error. Excel 2007 is happy to accommodate the request, but it puts a massive hit on the instance of Excel that workbook is running in. Always try to limit the number of cells that any function of formula must process. A better formula would have been:
=SUMPRODUCT(A2:A30,B2:B30,C2:C30)
The SUMPRODUCT function multiplies elements of different arrays that hold the same position in those arrays and when done, sums all those products. This trivial example made this obvious by showing the function at work on three columns. But understand that those arrays could have just as easily been rows instead of columns, or they could have been rectangular ranges of any arbitrary dimension. They could have been array constants, i.e. {2,3,4,1,9,8} or Named Ranges, or the results of a boolean comparison (simple or complex), a custom mixture of many native functions, an Array Function, or even a User Defined Function. The one overriding requirement is that the arrays must be rectangular, contiguous, and of the same dimension. So if Array 1 is a column of 29 elements (as in the above image) then so must be all of the other ranges.
The next point is that this simple example had just three columns (arrays). The SUMPRODUCT function is just as happy with 1 array or up to 30 arrays, each separated by a comma. Thirty is the maximum number of arguments that any Excel worksheet function can support.
Ok, so SUMPRODUCT is adept at summing the products of equivalently placed elements in arrays. Why is that so powerful? Because it can be used in many non-obvious ways to calculate totals and mine data.
One way that it shines is calculating sums, counts, averages, and other metrics of list data based on criteria. The result here is basically analogous to a SQL database query that produces a total with a WHERE clause. For example the following SQL query:
SELECT SUM(OrderPrice) FROM Orders WHERE Customer='Hansen'
could be emulated in Excel with any one of these SUMPRODUCT formulas:
=SUMPRODUCT((OrderPrice),--(Customer="Hansen"))
=SUMPRODUCT((OrderPrice),1*(Customer="Hansen"))
=SUMPRODUCT((OrderPrice),(Customer="Hansen")*1)
=SUMPRODUCT((OrderPrice),(Customer="Hansen")+0)
=SUMPRODUCT((OrderPrice),(Customer="Hansen")^1)
=SUMPRODUCT((OrderPrice),SIGN(Customer="Hansen"))
=SUMPRODUCT((OrderPrice),N(Customer="Hansen"))
=SUMPRODUCT((OrderPrice),(Customer="Hansen")*True)
=SUMPRODUCT((OrderPrice)*(Customer="Hansen"))
Some notes are in order here.
The terms OrderPrice and Customer could be Named Ranges (they would basically be defined as columns in a list or table and would each be of the same length). Those Named Ranges (I actually prefer to refer to them as Named Formulas) could be dynamic ranges. Or conversely the terms could be replaced with direct cell references, such as C2:C1000 for OderPrice and B2:B1000 for Customer. Or the terms could be replaced by almost any combination of functions that returns an array. Please note that for some reason, SUMPRODUCT always fails when using an IF function on any of the terms inside the SUMPRODUCT, which is OK because we can use boolean logic to eliminate the need for branching. Please read I Heart IF for details.
The order of the terms is irrelevant. In any of the above examples, the answer is the same, whether the OrderPrice term or the Customer term comes first.
The first eight variations above all have two things in common. They use a comma to separate the terms, and each is employing a slightly different technique to coerce a lone boolean value (True or False) into their numeric equivalents, 1 or 0. If we fail to do this coercion on LONE boolean values, the SUMPRODUCT function always returns the value of 0. Of the eight different coercion methods, the double unary method (--) is the quickest.
The ninth variation (the last one) above is the syntax I personally use most often. The entire calculation is done in the first argument so no commas are required and since the boolean term is being multiplied by the OderPrice term, there is no need to further coerce the boolean into a number. The multiplication has already done.
This method is shortest in punctuation and therefore more concise. But it is imperceptibly slower. However, the trade off is greater flexibility. Since all of the work is being done in the first argument to the SUMPRODUCT function, there is no longer a limit of 30 terms. Using this syntax you can combine as many terms as you need. And most important of all, this method allows for more robust logic in the criteria. Specifically, we can specify OR clauses. For example the following SQL query:
SELECT SUM(OrderPrice) FROM Orders WHERE Customer='Hansen' OR Customer='Jones'
could be emulated in Excel with this SUMPRODUCT formula:
=SUMPRODUCT((OrderPrice) * ((Customer="Hansen")+(Customer="Jones")))
When combining the boolean terms (the ones that evaluate to True or False), they are easy to read if you remember that the plus symbol means OR, while the multiplication symbol means AND. Using this technique, you can tabulate very specific records indeed. For example, consider the following SQL query:
SELECT SUM(OrderPrice) FROM Orders WHERE (Customer='Hansen' OR Customer='Jones') AND State='CA' AND SalesDate>=#1/1/2010# And SalesDate<=#1/31/2010#
This query just says to total all of the orders in January of 2010 from California for the customers Hansen or Jones. It could be achieved in Excel using this SUMPRODUCT formula:
=SUMPRODUCT((OrderPrice) * ((Customer="Hansen")+(Customer="Jones")) * (State="CA")*(SalesDate>="1/1/2010") * (SalesDate<="1/31/2010"))
For me this is even more "readable" than the SQL database query.
No matter how precise your criteria, which basically means no matter how many criteria terms you need to combine, the SUMPRODUCT function stands ready to tally the specified records, when used with my preferred syntax of combining everything into the first argument.
Another advantage (in my opinion) to using this syntax style is that it acts differently than the comma separated form when there is text in the summing array (i.e. OrderPrice in our examples) in any record returned by the criteria. Sometimes erroneous text can be in a column of data. The comma separated method will treat such text as a 0, combining everything in the first term will result in an error when the text is found in criteria specific records. I prefer to be notified with the error rather than silently calculating an incorrect result.
You'll notice that in all of these examples, we are summing the OrderPrice column. If instead we wanted to count the number of records that satisfied the criteria, all we need to do is drop the OrderPrice term from the formula. So if we wanted to know how many orders there were in January of 2010 from California for the customer Hansen or Jones, we would use this formula:
=SUMPRODUCT(((Customer="Hansen")+(Customer="Jones")) * (State="CA") * (SalesDate>="1/1/2010") * (SalesDate<="1/31/2010"))
Notice that the only difference here is that the OrderPrice term is gone. And finally if we wanted to know the Average amount of those orders we would just divide the total amount by the number of orders. This makes for a long formula, but now that you understand the syntax, it's easy to follow. This very specific average would be calculated with this formula:
=SUMPRODUCT((OrderPrice) * ((Customer="Hansen")+(Customer="Jones")) * (State="CA") * (SalesDate>="1/1/2010") * (SalesDate<="1/31/2010")) / SUMPRODUCT(((Customer="Hansen")+(Customer="Jones")) * (State="CA") * (SalesDate>="1/1/2010") * (SalesDate<="1/31/2010"))It should be noted here that sometimes when the formulas grow this large, it can be helpful to name them (or parts of them) in the Name Manager.
These type of SUMPRODUCT calcs can become very interesting when the individual terms are used with OFFSET or INDEX or INDIRECT or MOD/ROW. These enhancements go beyond the scope of this post, but they enable such refinements as considering only every nth record in the tabulations, changing the scope of the array, and many other fascinating feats.
Experienced users may point out that most of the above can be done by using the SUMIF or COUNTIF (or in Excel 2007 SUMIFS and COUNTIFS) functions. This is only partially true. The SUMIF and COUNTIF functions can only work with one criterion. The SUMIFS, COUNTIFS, and AVERAGEIFS function of Excel 2007 can work with up to 29 criteria, but they are limited to AND Logic between the criteria with no way to specify OR Logic.
Really experienced users might point out that all of the above can be done with Array Formulas. While this is true, the SUMPRODUCT function is optimized and is roughly 10% faster than an equivalent Array Formula, and there is no need for the Control-Shift-Enter formula entry.
All of this just scratches the surface of what the SUMPRODUCT function can accomplish. It can be used in countless situations that have nothing to do with database calculations. For example, if you wanted to know if the number in A1 was a prime number, you could use this non-Array Formula:
=SUMPRODUCT(1*(MOD(A1,ROW(INDIRECT("1:"&INT(A1^0.5))))=0))=1
SUMPRODUCT can be used to do a multi-column sort by formula.
When a SUMPRODUCT function is combined with Conditional Formatting some very interesting results can be achieved.
Another extremely important use of the SUMPRODUCT function which opens an entire universe of new possibilities is to use it to conduct bitwise logical operations such as Logical AND, OR, XOR, NOT, IMP, or EQ. This is the direct manipulation of the bits inside of an integer. There are countless uses of bit manipulation and this post can get you started.
Finally, amongst Excel gurus there is considerable debate about the best way to do database type calculations in Excel. While SUMPRODUCT as demonstrated in this post is very capable, there are alternatives. No one solution is the perfect answer to every situation. I have been conducting timing trials and will soon share detailed results for many different techniques. In the meantime it should be enough to know that SUMPRODUCT is perfectly fine when it is scanning several thousand records, even tens of thousands. However, if you are using it for hundreds of thousands of records than an alternative is likely in order.
![Reblog this post [with Zemanta]](http://img.zemanta.com/reblog_e.png?x-id=22105e25-4acf-4b1d-97e7-ea111af465c8)
For AND Criteria and for Excel 2007 and Greater
SUMIFS/COUNTIFS is the fastest followed by DSUM-SUM followed by SUMPRODUCT
For AND Criteria and for Excel 2003 and Below
DSUM-SUM is the fastest
For OR Criteria and for any version of Excel
DSUM-SUM is faster than SUMPRODUCT but SUMPRODUCT is more flexible
When some criterias are in Columns and some in Rows there is no alternate to SUMPRODUCT
There is one more method of coercion
=Sumproduct(N(condition1),N(Condition2)...(RngToSum))which some believe to be faster than --
I have a workbook where i have speeds of different formulas...I could e-mail it to you if you want to compare the results with your work
@Sam - Your results seem to line up with my own.
Have you tested SIGN as a method to coerce a lone boolean term inside a SUMPRODUCT? How about raising the term to the first power, as in (term^1)?
On a different note it's obvious that doing one MATCH column for a bunch of INDEX lookups is faster than including the MATCH inside each formula, but have you tested using one INDEX formula array-entered over an entire range, as opposed to independent INDEX formulas in all the cells?
I would be very interested if you would like to send me your file: daniel [dot] ferry [at] gmail [dot] com
Thanks,
Daniel Ferry
excelhero.com/blog
I tried the Sign Option V/s -- option.
The -- option seems to be faster
1 Match column and several Index's is definetly faster than several match/index and vlookups.
But 1 Match and a Single Array entered formula in each column is slightly faster than the above.
Thanks for sharing this!
@sam -
DSUM may be fast, but it's damn awkward to use since it needs criteria RANGES. Yeah, you could set up something with What-If tables, but it's still awkward.
@Harlan.
DSUM alone is awkward. But DSUM-SUM combo can be dragged down/across like SUMPRODUCT
Eg: =DSUM(Data,B$1,$A$1:$A2)-SUM(B$1:B1)
Hi Daniel,
What exactly are you missing in the OpenOffice implementation of SUMPRODUCT ?
Hi Daniel, can I choose the largest number from unsorted data ie, there are three column contain name, age and criteria (child, young, old) how to choose the oldest for each criteria without sort it? In this case can I combine sumproduct and large function?
@irwan satya-
Are you trying to return the entire record of the oldest for each criteria or just the age?
I ask because SUMPRODUCT only returns numeric answers.
If you send me you spreadsheet, I can take a look for you.
daniel dot ferry at gmail dot com
Regards,
Daniel Ferry
excelhero.com/blog
Daniel,
just read your comment on Chandoo's post about sumproduct, and directed me to this one. Absolutely fantastic definition. I am getting such a fan of this formula, that i was wandering how I survided all these year without using it....
All the best,
Martin.
@Martin -
Thanks. There's so many uses for SUMPRODUCT that I was thinking about writing a follow-up article if I can find the time.
Anyways, welcome to my blog. I hope you become a regular.
Regards,
Daniel Ferry
excelhero.com
Hi Daniel,
Can you please advise me how to use sumproduct to determine the following:
(1) Can you use SumProduct to highlight & sort if data in in all cells, over four columns are in credit. Do use SumProduct & Conditional formatting to do this, (ie Monthly, Quarterly, Yearly & 5 yearly).
How to calculate if all 4 rows are in credit, then sort by highest to lowest balance
ASX Code Company name Sector Market Cap 1 Month Trailing Return 3 Month Trailing Return 1 Year Trailing Return 5 Year Avg. Annual Return
FGL Foster's Group Limited Alcohol/Tobacco 10,327,660,923 -1.70% 0.00% 11.00% 5.10%
ANZ ANZ Banking Group Ltd Banks 60,844,189,521 17.20% 12.10% 60.90% 8.80%
(2) Place the sum of the four cells in column I and/or pivot table
(3) How do you Add the maximum & minimum dollar amount per month, quarter, Six monthly & Yearly.
(4) Do have a document that details all the uses of SumProduct function.
Refer row examples below:
Regards,
Ron.
I have a spreadsheet where I'm trying to use SUMPRODUCT but I need some help.
The data is a monthly calendar where individuals are on call.
column A is days of the week (M-Sunday)
colum b is the individuals' initials like "VG"
I would like to create a chart that shows the number of days VG works on weekends (for example)
But what if the person only works a Saturday or Sunday? Also if they do work both days, we would count that as working one weekend...if they work just Saturday or Sunday it would be counted as working 1/2 the weekend.
I can get the formula to count either Saturday or Sunday but not both.
I can provide the chart if that would help.
Thanks so much.
B
@Bill -
Yeah. Send me your file and I'll take a look. It's kind of difficult without actually seeing it.
Regards,
Daniel Ferry
excelhero.com
Please, Display more examples.
Can you please send the file which contain the excel formula.
Neeraj
Excel Enthusiast
sandesh2neeraj@gmail.com
@Neeraj -
Which file are you referring to?
There is no sample file for this article.
Regards,
Daniel Ferry
excelhero.com
Awesome post Daniel,
If you could find the time to do that followup post, I promise to read it.
Joe