January 2010 Archives

5 And 3 Is 1

| 89 Comments | 0 TrackBacks

Ok. So I haven't lost it. In arithmetic 5 and 3 is 8, but in bitwise operations 5 And 3 is 1.

Bitwise operations manipulate the individual bits within a number. An easy way to visualize this is to use the binary number system. Humans are not designed to process binary efficiently (we prefer the decimal system), but computers are. In fact, all information is stored and processed on your computer in binary.

binary_welcome_mat.jpg

This door mat says, "welcome" in binary if you take it 8 digits at a time and convert those chunks into ASCII.

 

Binary is a number system composed entirely of just 0 and 1. Any number conceivable can be expressed in binary; large numbers will have a lot of digits, many times more so that in decimal.

Here's a chart of some normal decimal numbers and their binary equivalent:

1            0001
2            0010
3            0011
4            0100
5            0101
6            0110
7            0111
8            1000
9            1001

In bitwise operations the AND operator compares the bits (the 1s and 0s) of two input numbers and produces a 1 in the output for a given digit, if and only if both of the input numbers have a 1 in the same digit. That's a mouthful. Visually it's simple:

0101
0011
0001

Do you see how the only digit that satisfies the AND operator is in the ones place. It just so happens that this example while expressed in binary, was in fact in decimal: 5 AND 3 = 1.

There are several other bitwise operators. In addition to AND, there are OR, XOR, IMP, EQ, and NOT. Each has a different affect on the bits and thus produces a different output. For example, the OR operator produces a 1 in the output for a given digit, if either (or both) of the input numbers have a 1 in the same digit. Visually:

0101
0011
0111

In decimal: 5 OR 3 = 7.

Being able to perform these types of manipulations from Excel worksheet formulas would be very useful. Oddly, Microsoft has never included these functions in Excel. But we can synthetically reproduce this functionality by combining several native functions.

We will be inputing decimal numbers and our formulas will be converting them to binary, manipulating the bits, and finally returning to us a decimal result. Nice and clean, so how's it done?

The first thing we need to do is determine the maximum size number that we wish to manipulate. For this post, we want to work with numbers between decimal 0 and 255. This requires 8 bits (8 digits) in binary. Under this scheme here are the binary representations of 0 and 255:

00000000
11111111

It turns out that we can get the binary bit (0 or 1) of each place of a decimal number by a fairly simple formula. Since we are talking the conversion to binary here, it is no great surprise that the number 2 plays a significant role. Starting with the first place value (the binary digit to the extreme right) and working to the left, the binary digits are found by simply taking our decimal number and dividing it by 2 raised to the power of that place position, minus 1. We then convert the result to an integer  to remove any decimal point places. Finally, we return the remainder after dividing the integer result by 2. Again that's a mouthful, but the process can be concisely described with this formula:

=MOD(INT(dec_val/2^(bin_place_val-1)),2)

We are working with 8 bit numbers in this post, so we would need to perform this calculation 8 times to get all of the binary digits for the decimal number.

Applying this to the two decimal numbers 222 and 127, we would have used the above formula 16 times and produced these two binary numbers:

11011110
01111111

Now let's say we wanted to perform the bitwise AND operation. Just by looking at them we can see the binary answer should be:

01011110

To find the AND of two place inputs, all we need to do is multiply the two together. So the first place (right to left) is 0 * 1, the second place is 1 * 1, etc. Continuing the process for 8 digits does in fact produce the same output as we found by just looking (directly above).

So that's it. We have our AND of 222 and 127. The only thing left to do is to convert it to decimal. To do this we just multiply each digit of our result by 2 raised to that digit's binary place value minus one, and then sum all of the products.

Wait a minute. Sum all of the products? Where have we heard that before? Oh yes, our friend the SUMPRODUCT function. So with one SUMPRODUCT formula we can accomplish all of these steps, including converting the two decimal inputs into 8 bit binary numbers, multiplying them together to get the bitwise AND result and converting this result back to a decimal number.

To make the formula manageable, we should first create a Named Formula in the workbook that will be used to represent the 8 bit conversions. The first place value uses 2^(1-1). The second place value uses 2^(2-1). The third uses 2^(3-1). Doing this for each of the 8 bits, we get the following array of conversion constants to hold in a Named Formula:

={1;2;4;8;16;32;64;128}

Let's name this formula: b

to stand for BITS.


So here is our SUMPRODUCT formula to do the bitwise AND manipulation of two decimal numbers (dec1 and dec2):


=SUMPRODUCT( b * MOD(INT(dec1/b),2) * MOD(INT(dec2/b),2) )


I like to put the b as the first term as a sort of heads up that this is a bitwise calculation. Then scanning further into the formula I can clearly see the multiplication symbol separating the next two terms, and I can see instantly that this is a bitwise AND formula.

Doing the bitwise OR is nearly the same. Instead of multiplying the bits, we add them. And then we use the SIGN of those sums so that digit values cannot be larger than one. It looks like this:


=SUMPRODUCT( b * SIGN(MOD(INT(dec1/b),2) + MOD(INT(dec2/b),2)) )


Again, the b at the beginning tells me this is a bitwise calc. The SIGN tells me this is a bitwise OR.


A supercharged method of table lookups is to use bitwise AND, and a bitmask  to decode multiple values from one decimal number. Here is a post on that.


The attached spreadsheet details all of this for AND, OR, XOR, IMP, EQ, and NOT.

Bitwise operations can be used to solve many challenging problems is spreadsheet design.

Here are some posts and sample workbooks that make use of bitwise operations:

LED RSS News Ticker

Unbreakable Cypher

Formula Based Sudoku Solver













 

Enhanced by Zemanta

The Venerable SUMPRODUCT

| 151 Comments | 0 TrackBacks

Mastering the venerable SUMPRODUCT function is a requirement for any Excel Hero.

It's one of the most useful in the roster of Microsoft Excel functions.

From my perspective, Excel would be a very different application if the SUMPRODUCT function had never been developed to the degree it is today. It's an amazing function that can be marshaled to solve a dizzying array of problems. It's an Array Function in that it operates on arrays, but it does not need to be entered with the Control-Shift-Enter key combination, and so it does not need the fancy parenthesis that adorn normal Array Formulas. However, it is a function that works on arrays, and is actually slightly quicker than an equivalent CSE Array Function.


excel-hero-academy-web-banner1.jpg


One would never guess from the innocuous name the shear power and versatility embedded within the SUMPRODUCT function. Not only is it well worth your time to examine this function in detail, it is quite literally a prerequisite to honing your Excel abilities to the Excel Hero level. Rarely do I build a model that does not make extensive use of the SUMPRODUCT function.

From a maths perspective, SUMPRODUCT calculates the dot product of two vectors (arrays), but realize that it can be used with many more than two arrays.  A better name might have been the SumOfProducts function, as this in a nutshell is what it does.  Consider the following image:

sumproduct.png

Notice that Column D is the product of the three columns to its left. Cell D2 contains the formula =A2*B2*C2 and that formula is copied down as far as the columns go (Row 30 in this case). The entire purpose for Column D is to get intermediary values so that we can later sum them. Cell F4 shows this final calculation. For some spreadsheets this intermediary column approach is wanted for presentation purposes. However, sometimes all we want is the ultimate total, the sum of all those products. And that's exactly what the SUMPRODUCT function does. Notice the formula in the Formula Bar. It results in the same total.

There are some points to note here. 

Firstly, notice that the arrays I've used are the entire columns (A:A, etc.). I've done this on purpose to point out that doing so is a bad idea.  This was not even possible prior to Excel 2007 and would result in an error. Excel 2007 is happy to accommodate the request, but it puts a massive hit on the instance of Excel that workbook is running in. Always try to limit the number of cells that Microsoft Excel functions or formulas must process. A better formula would have been:

=SUMPRODUCT(A2:A30,B2:B30,C2:C30)

The SUMPRODUCT function multiplies elements of different arrays that hold the same position in those arrays and when done, sums all those products. This trivial example made this obvious by showing the function at work on three columns. But understand that those arrays could have just as easily been rows instead of columns, or they could have been rectangular ranges of any arbitrary dimension. They could have been array constants, i.e. {2,3,4,1,9,8} or Named Ranges, or the results of a boolean comparison (simple or complex), a custom mixture of many native Microsoft Excel functions, an Array Function, or even a User Defined Function. The one overriding requirement is that the arrays must be rectangular, contiguous, and of the same dimension. So if Array 1 is a column of 29 elements (as in the above image) then so must be all of the other ranges.

The next point is that this simple example had just three columns (arrays). The SUMPRODUCT function is just as happy with 1 array or up to 30 arrays, each separated by a comma. Thirty is the maximum number of arguments that Microsoft Excel functions can support. 

Ok, so SUMPRODUCT is adept at summing the products of equivalently placed elements in arrays. Why is that so powerful? Because it can be used in many non-obvious ways to calculate totals and mine data.

One way that it shines is calculating sums, counts, averages, and other metrics of list data based on criteria. The result here is basically analogous to a SQL database query that produces a total with a WHERE clause. For example the following SQL query: 

SELECT SUM(OrderPrice) FROM Orders WHERE Customer='Hansen'

could be emulated in Excel with any one of these SUMPRODUCT formulas:

=SUMPRODUCT((OrderPrice),--(Customer="Hansen"))
=SUMPRODUCT((OrderPrice),1*(Customer="Hansen"))
=SUMPRODUCT((OrderPrice),(Customer="Hansen")*1)
=SUMPRODUCT((OrderPrice),(Customer="Hansen")+0)
=SUMPRODUCT((OrderPrice),(Customer="Hansen")^1)
=SUMPRODUCT((OrderPrice),SIGN(Customer="Hansen"))
=SUMPRODUCT((OrderPrice),N(Customer="Hansen"))
=SUMPRODUCT((OrderPrice),(Customer="Hansen")*True)
=SUMPRODUCT((OrderPrice)*(Customer="Hansen"))

Some notes are in order here. 

The terms OrderPrice and Customer could be Named Ranges (they would basically be defined as columns in a list or table and would each be of the same length). Those Named Ranges (I actually prefer to refer to them as Named Formulas) could be dynamic ranges. Or conversely the terms could be replaced with direct cell references, such as C2:C1000 for OderPrice and B2:B1000 for Customer. Or the terms could be replaced by almost any combination of Microsoft Excel functions that returns an array. Please note that for some reason, SUMPRODUCT always fails when using an IF function on any of the terms inside the SUMPRODUCT, which is OK because we can use boolean logic to eliminate the need for branching. Please read I Heart IF for details.

The order of the terms is irrelevant. In any of the above examples, the answer is the same, whether the OrderPrice term or the Customer term comes first.

The first eight variations above all have two things in common. They use a comma to separate the terms, and each is employing a slightly different technique to coerce a lone boolean value (True or False) into their numeric equivalents, 1 or 0. If we fail to do this coercion on LONE boolean values, the SUMPRODUCT function always returns the value of 0. Of the eight different coercion methods, the double unary method (--) is the quickest. 

The ninth variation (the last one) above is the syntax I personally use most often. The entire calculation is done in the first argument so no commas are required and since the boolean term is being multiplied by the OderPrice term, there is no need to further coerce the boolean into a number. The multiplication has already done. 

This method is shortest in punctuation and therefore more concise. But it is imperceptibly slower. However, the trade off is greater flexibility. Since all of the work is being done in the first argument to the SUMPRODUCT function, there is no longer a limit of 30 terms. Using this syntax you can combine as many terms as you need. And most important of all, this method allows for more robust logic in the criteria. Specifically, we can specify OR clauses. For example the following SQL query: 

SELECT SUM(OrderPrice) FROM Orders WHERE Customer='Hansen' OR Customer='Jones'

could be emulated in Excel with this SUMPRODUCT formula:

=SUMPRODUCT((OrderPrice) * ((Customer="Hansen")+(Customer="Jones")))

When combining the boolean terms (the ones that evaluate to True or False), they are easy to read if you remember that the plus symbol means OR, while the multiplication symbol means AND. Using this technique, you can tabulate very specific records indeed. For example, consider the following SQL query:

SELECT SUM(OrderPrice) FROM Orders WHERE (Customer='Hansen' OR Customer='Jones') AND State='CA' AND SalesDate>=#1/1/2010# And SalesDate<=#1/31/2010#

This query just says to total all of the orders in January of 2010 from California for the customers Hansen or Jones. It could be achieved in Excel using this SUMPRODUCT formula:

=SUMPRODUCT((OrderPrice) * ((Customer="Hansen")+(Customer="Jones")) * (State="CA")*(SalesDate>="1/1/2010") * (SalesDate<="1/31/2010"))

For me this is even more "readable" than the SQL database query. 

No matter how precise your criteria, which basically means no matter how many criteria terms you need to combine, the SUMPRODUCT function stands ready to tally the specified records, when used with my preferred syntax of combining everything into the first argument.

Another advantage (in my opinion) to using this syntax style is that it acts differently than the comma separated form when there is text in the summing array (i.e. OrderPrice in our examples) in any record returned by the criteria. Sometimes erroneous text can be in a column of data. The comma separated method will treat such text as a 0, combining everything in the first term will result in an error when the text is found in criteria specific records. I prefer to be notified with the error rather than silently calculating an incorrect result.


You'll notice that in all of these examples, we are summing the OrderPrice column. If instead we wanted to count the number of records that satisfied the criteria, all we need to do is drop the OrderPrice term from the formula. So if we wanted to know how many orders there were in January of 2010 from California for the customer Hansen or Jones, we would use this formula:

=SUMPRODUCT(((Customer="Hansen")+(Customer="Jones")) * (State="CA") * (SalesDate>="1/1/2010") * (SalesDate<="1/31/2010"))

Notice that the only difference here is that the OrderPrice term is gone. And finally if we wanted to know the Average amount of those orders we would just divide the total amount by the number of orders. This makes for a long formula, but now that you understand the syntax, it's easy to follow. This very specific average would be calculated with this formula:

=SUMPRODUCT((OrderPrice) * ((Customer="Hansen")+(Customer="Jones")) * (State="CA") * (SalesDate>="1/1/2010") * (SalesDate<="1/31/2010"))    /    SUMPRODUCT(((Customer="Hansen")+(Customer="Jones")) * (State="CA") * (SalesDate>="1/1/2010") * (SalesDate<="1/31/2010"))

It should be noted here that sometimes when the formulas grow this large, it can be helpful to name them (or parts of them) in the Name Manager.

These type of SUMPRODUCT calcs with multiple criteria can become very interesting when the individual terms are used with OFFSET or INDEX or INDIRECT or MOD/ROW. These enhancements go beyond the scope of this post, but they enable such refinements as considering only every nth record in the tabulations, changing the scope of the array, and many other fascinating feats.


Experienced users may point out that most of the above can be done by using the SUMIF or COUNTIF (or in Excel 2007 SUMIFS and COUNTIFS) functions. This is only partially true. The SUMIF and COUNTIF functions can only work with one criterion. The SUMIFS, COUNTIFS, and AVERAGEIFS function of Excel 2007 can work with up to 29 criteria, but they are limited to AND Logic between the criteria with no way to specify OR Logic.

Really experienced users might point out that all of the above can be done with Array Formulas. While this is true, the SUMPRODUCT function is optimized and is roughly 10% faster than an equivalent Array Formula, and there is no need for the Control-Shift-Enter formula entry.

All of this just scratches the surface of what the SUMPRODUCT function can accomplish. It can be used in countless situations that have nothing to do with database calculations. For example, if you wanted to know if the number in A1 was a prime number, you could use this non-Array Formula:

=SUMPRODUCT(1*(MOD(A1,ROW(INDIRECT("1:"&INT(A1^0.5))))=0))=1

SUMPRODUCT can be used to do a multi-column sort by formula.

When a SUMPRODUCT function is combined with Conditional Formatting some very interesting results can be achieved.

Another extremely important use of the SUMPRODUCT function which opens an entire universe of new possibilities is to use it to conduct bitwise logical operations such as Logical AND, OR, XOR, NOT, IMP, or EQ. This is the direct manipulation of the bits inside of an integer. There are countless uses of bit manipulation and this post can get you started.

Finally, amongst Excel gurus there is considerable debate about the best way to do database type calculations in Excel. While SUMPRODUCT as demonstrated in this post is very capable, there are alternatives. No one solution is the perfect answer to every situation. I have been conducting timing trials and will soon share detailed results for many different techniques. In the meantime it should be enough to know that SUMPRODUCT is perfectly fine when it is scanning several thousand records, even tens of thousands. However, if you are using it for hundreds of thousands of records than an alternative is likely in order.




And here are some other reference articles at Excel Hero:

The Imposing INDEX
The Venerable SUMPRODUCT
- I Heart IF
5 And 3 Is 1
Live Calendar Musings
Which Function To Use
Excel VBA Shortcut Range References
Excel Partial Match Database Lookup
Excel High Precision Engineering Chart #1
Excel Zip Code Lookup
Excel Dynamic Checkmark
Animated Business Chart
Excel Acrobat PDF Form Filler



And for something completely unique, here's a list of animated charts on Excel Hero:

 - Excel, A Presentation Platform (Number Spiral)
 - Lilac Chaser (Optical Illusion)
 - Stereokinetic (Optical Illusion)
 - Illusory Contours (Optical Illusion)
 - Breathing Square (Optical Illusion)
 - Enigma (Optical Illusion)
 - Two Sinusoids (Optical Illusion)
 - Perpetual Collisions (Optical Illusion)
 - Freezing Rotation (Optical Illusion)
 - Reverse Spoke Illusion (Optical Illusion)
 - Stepping Feet Radial Illusion (Optical Illusion)
 - Swimming Fish (Optical Illusion)
 - Mutually Interfering Shapes (Optical Illusion)



Learn Advanced Excel!

Advanced and Creative Formulas and advanced charting are just a couple of the literally dozens of advanced methods that my students in the inaugural class of the Excel Hero Academy are learning. EHA is Excel online training.

I've received dozens of messages from students telling me that the academy is the best Excel training they have ever found at any price and that it is possibly the best investment they've ever made. Wow.

In a few weeks I will be offering the course again. There's already over 250 people on the interest list for the next class. Why not join us? Just fill the form below and I'll send you more information as we get closer!

Try it. It truly is one of the best online courses Excel Hero Academy.



Enhanced by Zemanta

I Heart IF

| 206 Comments | 0 TrackBacks

Do you love IF()?

The IF worksheet function in Excel is essential.

In my experience, after SUM and AVERAGE, it is one of the first functions that budding Excel users discover with an Eureka moment.  They start feeling empowered and realize for the first time that they can control Excel and make interesting things happen in their spreadsheet.

IF is very important and sometimes a situation demands that the logic of a compound formula branch, i.e. if some situation, then do this, otherwise do that.


I_heart_if.png
However, IF is also the most overused function by far, especially once the modeler develops the expertise to create nontrivial spreadsheets. Nested IF functions often riddle these spreadsheets, sometimes to very deep levels. Besides making a spreadsheet unnecessarily complicated, there are very real limits to how deeply one can nest IF functions. In Excel versions prior to 2007 this limit was 7 levels deep. Excel 2007 raised this limit to 64, but I would suggest that if you need more than a few levels your approach probably needs tweaking.

So what are some alternatives to IF()?

 
I'll give you three, none of which have an arbitrary limit of conditions.

The first two are fairly simple: Table Lookup and Boolean Logic.

Suppose you had a situation where you wanted your formula to use a certain value based upon the contents of the Cell A2. You might have a nested formula that looks something like: 

=IF(A2="red",25,IF(A2="white",101,IF(A2="blue",74,0)))

A very simple formula, it results in a value of 25 for red in A2, 101 for white, 74 for blue, and finally 0 for any other color or no color at all. But what if you had more than seven colors where you needed to supply values in this formula. It simply would not work. In fact Excel won't even let you enter such a formula: when you hit enter it will pop up an error message informing you that you cannot nest the IF function to that many levels and the formula will be summarily rejected (Excel 2007 will let you get away with it, but please don't). A much better idea would be to create a lookup table somewhere in your workbook with one column for all of your colors and another for all of their corresponding values. Then you could do a simple Vlookup of the table to return the correct value for any color. The table could be extended to thousands of rows and our formula would stay just as concise as it was for three colors. Excel has quite a few functions that can lookup values in such tables not just Vlookup.

The second approach uses boolean logic. The following formula produces the exact same result as our original nested IF formula, but notice that it uses no IF functions at all:

=(A2="red")*25 + (A2="white")*101 + (A2="blue")*74

This is a much more elegant formula. It's shorter. It's easier to read. And, it's easier for Excel to calculate. There is no branching at all. There's one phrase for each color and the calculation engine just plows through and adds up the totals for each phrase. So how does this work? Consider the following:

=(A2="red")

This simple comparison will result in a value of TRUE if A2 has "red" in it (just to be clear: the string of letters, not the background color). If there is anything else or even nothing in A2, then the result will be FALSE. True and False are actual values in Excel. If you were to enter this formula in some cell, you would see the word TRUE or the word FALSE in that cell, depending on what was in A2. It's kind of interesting when you think that only one value out of an infinite number of values in A2 will result in a True. False will be the result for anything else (is that infinity minus one?)

I need to point out that this type of comparison does not distinguish between upper and lower case, so "Red", "RED", "rEd", and "reD" will all be counted as "red." If case sensitivity is important in your circumstance you should change the formula to =EXACT(A2,"red"). The EXACT() function will return True only if the two strings are identical including the case of each character. But for now let's assume that case is irrelevant as it often is.

You'll notice that our first phrase in my boolean logic replacement formula above has the "*25" appended. Here we are exploiting the fact that Excel stores the value of True internally as the number 1 and the value of False as the number 0. Let's assume that the string "blue" was in A2 and watch below how Excel gets to the correct result:

=(A2="red")*25 + (A2="white")*101 + (A2="blue")*74

=(False)*25 + (False)*101 + (True)*74

=(0)*25 + (0)*101 + (1)*74

=74

In practice, since there is no branching in the logic, Excel can calculate this style of formula at lightning speed, and as in the Table Lookup method, there is no limit to the number of phrases (colors in this case) you can include in the formula. This method is potent and will be used extensively in a vast variety of situations here at Excel Hero.

So there you have two fairly simple alternatives to blindly using the IF function. Both have no limits on the number of conditions, and both are much easier to read and easier for Excel to execute. In our mantra of "interactive, faster, shorter, easier to maintain, or simpler" they trump the nested IF approach. If the IF function has made you feel empowered, just wait until you master these two techniques!

Ok. I said I would share three alternative techniques. The third technique is to encode multiple answers into one integer value and to decode them using bit inspection and a bitmask. This is the most potent tactic of all, but it's way beyond the scope of this post. To understand this advanced technique, please read my post, 5 And 3 Is 1 and then study how a bitmask and bit inspection are used to decode the characters in my LED RSS News Ticker.

Finally remember that IF() is not evil, just precious. There are times when it's unavoidable, and it would be a shame to hit the nesting limit simply because several layers of unnecessary nesting have already occurred in a formula. Personally, I make a game out of it and pride myself on using as few (mostly none) IF functions as I can. Try it. You'll soon be treating the IF function as if it were a precious jewel not to be squandered, and as a byproduct your spreadsheet models will be much better.

As a footnote to this post I'd like to point out that prior to Excel 2007, one of the uses of the IF function that was unavoidable was error catching. Say you developed a beautiful custom formula and it produced the perfect result except that sometimes because of inputs beyond your control, the formula resulted in one of Excel's error values. In addition your model required that the result of your fancy custom formula be used as an input somewhere else, and the error values cascaded causing errors all over the place (error in = error out). The only fix for this type of scenario is to trap the error with an IF function that looks something like this:

=IF(ISERROR(MyFancyFormula),0,MyFancyFormula)

Heinous, to be sure. It requires doubling your fancy formula just to prevent Excel's error message from laying havoc to our model; and it forced us to use an IF function. Thankfully, in Excel 2007, Microsoft introduced a new function called IFERROR(). So now the Excel calc engine does not need to calculate MyFancyFormula twice just to accommodate the possibility of an error. Here is how the same situation is handled in Excel 2007 and 2010:

=IFERROR(MyFancyFormula,0)



LED RSS News Ticker

| 12 Comments | 0 TrackBacks
Why on earth would anyone want to build a simulated LED news ticker display in Excel? While certainly not the most efficient way to get one's news, this modest project is interesting on a number of levels, starting with how deftly it demonstrates the power of boolean bitmasks.

news_ticker.png


Please download the RSS News Ticker workbook.

Ask yourself how you would set about using the Excel charting and calculation engines to graph any alphanumeric message. Let's go ahead and stipulate an X Y (Scatter) chart type. But without using VBA how would you calculate the values for the chart series without knowing the message in advance? I suspect that many would conclude that a number of nested IF functions for each pixel would be in order. But be careful. There are at least 68 characters that need to be displayed (26 capitalized letters, 10 numeric digits, and a bevy of punctuation and special characters); in reality there could be hundreds of unique characters, but 68 would be the minimum to make a useful display.  Any strategy chosen should be able to handle all of the characters one might wish to include in your character set.

Excel versions prior to 2007 were limited to 7 nested IF functions.  While 2007 allows for 64 levels of nesting, that would be insanely unmanageable and would only allow for 64 characters; not to mention the horror it would be to suffer such a slow, bloated formula is 560 cells! That's how many points this chart has. 

The next strategy considered would probably be a lookup table using Index/Match, Offset/Match, or Vlookup. Each character in this LED display has 35 pixels, so you would be looking at 35 columns of Yes/No type data for each character in your character set. This is a viable solution, while the nested IF formula is definitely not.

However, I'd like to share with you an elegant approach that you may not have considered. In most programming languages (Excel formulas included) the use of boolean bitmasks can often dramatically reduce program looping and branching. Here is a Wikipedia article that explains the concept. In my last post I explained in detail the technique for achieving bitwise logical operations between two values, such as Logical AND, OR and XOR. 

These operators are available in VBA, but tragically they are not natively available in the roster of Excel worksheet functions (I'm not talking about the normal And and Or worksheet functions. Bitwise AND and OR do something quite different). But we can synthetically reproduce these by using Excel's indomitable function, the flexile Sumproduct. This strategy is blazingly quick and since the bitmask allows us to bit decode a column of 7 pixels from one number using the synthetic AND operator, we can reduce our lookup table from 35 columns per character to just 5 columns, one for each column of 7 pixels in a character. And instead of meticulously placing 35 distinct formulas in 16 character positions for the chart, we can use one standard formula copied to all 560 cells.

In our eternal quest of "faster, shorter, easier to maintain, or simpler" this bitmask technique wins hands down as the optimum non-VBA method to graph a message on our simulated LED RSS News Ticker. The technique is versatile and once you open your mind to it, you will find countless uses.  

Three years ago I used the very same technique in the crafting of an Excel formula based Soduku solver. All other solvers that I've seen use brute force VBA algorithms to crack a puzzle. Employing the bitmask technique, I was able to create a solver that uses accepted Sudoku techniques (human logic, such as Naked Triples, Remote Pairs, and Sword-Fish) to solve any puzzle and it interactively shows the user which technique is the one that worked to solve each cell.  I'll post about that later, as it has some other very interesting Excel tactics.

Another interesting Excel project where I've used this strategy was to produce an unbreakable (even in theory) encryption cypher by using the synthetic XOR operator on a message and a one-time-pad composed of a feed of genuine random numbers produced by the radioactive decays detected by a Geiger-Müller tube interfaced to a computer at Fourmilab in Switzerland! These are esoteric uses to be sure, but the bitmask strategy can be used in many everyday situations.

Please download the accompanying workbook. It's definitely worth examining. By adjusting the Interval setting, you may notice that those 560 Logical AND calcs via Sumproduct are faster than Excel's charting engine.  Try setting the Interval to "0." Remarkable, really.  

Another interesting aspect of this project are the Array Formulas used to parse the XML RSS feed from CNN and Fox News. I'm using Microsoft's MSXML2.XMLHTTP COM object to grab the RSS feeds, so it would have been simple enough to also use it to wend through the XML trees and parse out the news items, but alas Fox News embeds illegal characters in their feed.  So rather than ferreting them out, I chose to parse both feeds with an Array Formula and then concatenate the resulting array with VBA. Its an interesting approach which works quite well. Array formulas are not the scary monsters that many people fear. When used with understanding they are the most concise programming language I have ever encountered in my 25-year programming career. It is unfortuante that most users rarely go there. We will try to change that here with their judicious employment.

Take notice of the Character Set table in column A of the Font sheet. It was tedious ordering this table so that the MATCH formulas on the LED sheet could lookup each character without using the "Exact match" setting. The "Less than" setting works at binary sort speed and so I wanted to use that setting. Having Excel sort the table of characters DID NOT get the characters in the proper order for MATCH to work successfully; some characters worked, others did not. Through trial and error I arrived at the current table order, and this seems to work. If anyone knows of a reference that shows the sort order the "Less than" setting for MATCH works under, I would be very grateful if you left me a note.

I've taken care to align the 560 calculated cells directly under their corresponding chart pixels so it's easy to decipher how the process works. The final point of note here is that I used the WinINet library from within VBA to test for an Internet connection prior to retrieving a feed. This allows the program to gracefully restrict the user to displaying the Custom Text field when not connected. WinINet is the best method I've ever found for testing if a connection to the internet exists; it returns instantly, unlike some other methods that take seconds to time out.

Please download the RSS News Ticker workbook.


Live Calendar Musings

| 33 Comments | 0 TrackBacks

Let's jump in and look at a very simple project that really is nothing more than presentation. The project makes its own information out of thin air. Many people like to have a small calendar on their spreadsheet or report that always shows the current date. This project will make a live calendar that can be included in any spreadsheet in just seconds after you learn how.


calendar.png

This is an example of what the calendar can look like. It can be placed anywhere on a worksheet. While simple to construct, it does use some concepts you may not have heard of. 


SEVEN QUICK STEPS TO BUILD A LIVE CALENDAR

  1. Define three named ranges using the Define Name dialog:
    • DaysAndWks ={0,1,2,3,4,5,6} + {0;1;2;3;4;5} * 7
    • DateOfFirst =Date(Year(Now()),Month(Now()),1)
    • Calendar =DaysAndWks + DateOfFirst - Weekday(DateOfFirst,2)
  2. Select cells C3:I8. Press F2 on the keyboard and array enter the formula: =Calendar
  3. Center format those cells and apply the cumstom number format of "d".
  4. In C1 and C2 enter the formula: =C5
  5. Apply the custom number format of "ddd" and center text to C2. Apply the custom number format of "mmmm yyyy" to C1.
  6. Copy C2 to C2:I2. Apply "Center Across Selection" and a background color to C1:I1. 
  7. Adjust the column widths to your preference and format a box around the calendar.


If that's all Greek to you, don't despair. Someone with the experience to understand those instructions could complete them in less than a minute! If you're in the Greek camp, keep reading...


Here is a video of me constructing the live calendar in real time.  It's just under 2 minutes, but I was talking while doing it ;)


Start a new workbook and right-click on the Sheet3 tab at the bottom of the window. Select Rename and change the name of the tab to Scratch.  Select columns C, D, E, F, G, H, and "I" at the top of the sheet by clicking on C and before releasing the click, drag to the right until the mouse pointer is hovering over I.  Now let go of the click and right-click anywhere in the shaded, selected area.  Select Column Width... from the context menu and type the figure 4.29 into the box.  Click on OK.  To make this calendar we first need a grid of numbers ranging from 0 to 41.  The first thing you need to learn is the Auto-Fill feature of Excel.  Instead of typing 0,1,2,3,4,5,6 in the first row and 7,8,9,10,11,12 in the next and continuing that for a total of six rows, the Auto-Fill will make quick work of this task.  Just enter 0 and 1 into cells C3 and D3, and enter 7 and 8 into cells C4 and D4.


Now just below the 8 and slightly to the right you'll find a tiny, black, square anchor. Click on this and drag to the right to include everything through the "I" column and let go of the click.  Now click the anchor again and drag down to include all rows through 8 and let go of the click.  It should look like this:

calendar_part_1.PNG



You've just entered 42 numbers instantly and accurately, certainly better than typing them all.  Excel is quite smart and can figure out many patterns.  Try this technique in different situations.  It will often save you lots of time.  If it doesn't produce the results you wanted, just click the undo command from the Edit menu.

These figures are still static numbers however.  What we really need is one formula that will calculate all of them.  When designing spreadsheets, this should always be your goal, crafting a single formula to fill a large range of cells.  So let's make one. Select all the cells that have numbers in them.  You can do this using the Edit menu and then Go To Special...  Select Constants and click on OK.  [Excel 2007 Note: Go To Special... is under the binoculars on the Home tab of the ribbon.]

After the cells are selected, press F2 on the keyboard.  All formulas should begin with an equal sign.  Type ={0,1,2,3,4,5,6} without the quotation marks and then on the keyboard press Control-Shift-Enter together.  That is, press three buttons on the keyboard simultaneously.  This places the formula into the formula bar at the top. This formula will apply to all of the selected cells.

For our very first formula in this blog we are using an advanced type called an array formula that most users never discover.  As an Excel Hero you will use them frequently. They do impressive calculations, some that are not possible any other way.  Some refer to them as CSE formulas because of the keys on the keyboard required to input one.  If you had not hit all three keys simultaneously the formula would be a normal formula and would work incorrectly. In this case we have entered an array of constants within our array formula. The fancy parentheses tell Excel that we are entering an array of constants.  When we press Control-Shift-Enter on the keyboard, Excel will enter another pair of fancy parentheses around the entire formula.  Excel does this for us; we need only type the fancy parentheses around the array of constants.  At this point, it should look like this:

calendar_part_2.PNG

What this has done is given us daily offsets, meaning Sunday is 0, Monday is 1, etc. If we add these to weekly offsets then we should have our 0 to 41, produced by formulas. Let's replace the daily offset with a tiny formula for weekly offsets.  Make sure the cells in the calendar are all selected like before and then press F2 and type the following into the Formula Bar:

 ={0;1;2;3;4;5}*7  

Now press Control-Shift-Enter on the keyboard again. The asterisk means multiply. Notice that we have another array of constants surrounded by the fancy parentheses.  You'll see that the result in the cells is quite different. We now have weekly offsets.  Notice that the constants are separated by semicolons instead of commas.  In Excel arrays, commas separate items horizontally going across a row. Semicolons separate items vertically going down a column. This formula is telling Excel to take the first item, 0, and multiply it by 7 and place the result in every cell of the first row.  This process continues for each respective row. It should look like this:

calendar_part_3.PNG

Now let's combine the two formulas.  Select the cells and type into the Formula Bar:

={0,1,2,3,4,5,6} + {0;1;2;3;4;5}*7 

and voila, we have our numbers back, but calculated by formula. All that is happening here is that each cell is now the result of both our previous formulas added together.  You should have this:

calendar_part_4.PNG

Let's change the format of our calendar cells.  Select the cells, right-click and choose Format Cells...  On the Format Cell dialog make sure that the Number tab is selected at the top.  Under Category select Custom and in the box directly under "Type", enter "d" without the quotation marks and click OK at the bottom.  This tells Excel to present each number as a day of the month.  The numbers in the Calendar look a little better. We now have 31 as the maximum.  Now select the cells again, press F2 on the keyboard and add the following to our existing formula without the quotations:

+ Date(Year(Now()),Month(Now()),1)

Remember to use Control-Shift-Enter on the keyboard to enter this array formula. The entire formula should now be:

={0;1;2;3;4;5}*7 + {0,1,2,3,4,5,6} + Date(Year(Now()),Month(Now()),1)

While this may look completely foreign, don't be intimidated.  The phrase we just added is nothing more than a formulaic way to identify the date that is the first of THIS MONTH.  This is exciting because this allows our calendar to display the LIVE date when we are working on whatever worksheet we put it in.

excel_dates.png

In simple English we are adding to our existing offsets the date in which the year is now, the month is now, and the day is 1.  Depending on what month you are reading this, your calendar will now look something like:

calendar_part_5.PNG

OK, but the 1st was not a Sunday!  Let's fix that.  Excel has dozens of built-in worksheet functions to do all sorts of different calculations and lookups.  One of these is the WEEKDAY function.  It simply takes any date, January 18, 2010 for example, and calculates the day of the week in numerical representation (Monday = 1, Tuesday = 2, etc.).  In this step, we need to determine what day of the week the 1st was and subtract that number from our existing formula. This is the last step, the last fragment of our formula!

If all of this is new, your head may be spinning, but please hang in there.  This type of project will seem old-hat in no time, I promise.  We are learning a new way of thinking, instructing Excel to do the hard work.  The formula fragment that we need for the last part then is:  - (WEEKDAY(1st of month, 2)).  Of course, when we really enter it, we need to replace 1st of month with the fragment that we used in the previous step to calculate the 1st.  Look at the ", 2" near the end.  This is an option to the WEEKDAY function.  Placing a 2 here instructs the WEEKDAY function to make Monday = 1.  The final, entire formula, array entered of course, should be:

={0;1;2;3;4;5}*7 + {0,1,2,3,4,5,6} + Date(Year(Now()),Month(Now()),1) - Weekday(   Date(Year(Now()),Month(Now()),1), 2)

Now, while all the cells are still selected, center format them.  Just click on the center format button at the top of the window.  Let's add the Day titles.  In cell C2 we are going to enter a tiny, normal formula (NOT a CSE):

=C5

Let's immediately format this one cell to change its presentation.  Right click on it, choose Format Cells..., Number tab, Custom, and enter "ddd" in the Type box (without the quotation marks).  While you're at it also change the justification to Center Text on the Alignment tab and click on OK. 
 
Now click on the tiny anchor at the bottom of this cell and drag to the right through column "I" and let go of the click.  Let's also add the month title at the top.  In cell C1, enter the formula:

=C5

Yes this is the exact same formula we put into C2.  We'll just format it differently.  Before we do, select all the cells at the top of the calendar at the same time, C1 through I1.  Right click on this selected area and go to the Alignment tab of the Format Cells dialog once again. Under Horizontal, select Center Across Selection. Switch over to the Number tab and under Category choose Custom and in the Type box, enter: "mmmm" without the quotes.  Switch over to the Fill tab and click on one of the colors for Background Color; click OK.  Wow.  It really looks like a calendar now. Your calendar should look similar to:

calendar_part_6.PNG

What about the dates before and after this month?  Can't we make them dimmer? Select all of the cells with numbers in them.  Click on the Format menu at the top of Excel and select Conditional Formatting...  Under Condition 1, select "Formula Is" and then enter this small formula into the box just to the right:

=Month(Now())<>Month(C3) 

and click the Format button.  On the Format Cells dialog, pick a very light grey under Color.  Click on OK and OK again.  Go back to the Conditional Formatting dialog and click the Add button on the bottom.  Under Condition 2, select "Formula Is" and enter:

=C3=INT(Now())

and click the Format button.  On the Format Cells dialog, pick white under color and Bold above that.  Switch over to the Patterns tab and choose a background color for the current day highlight.  Click OK and OK again.  Now select all of the cells in the calendar, including the titles and place a box around them by right-clicking on the selection, Format Cells..., switch to the Border tab and click on the Outline button; click OK at the bottom.  And we're done!  It should look like this:

calendar.png

While normal formatting is useful and essential for presentation, Conditional Formatting kicks that up a few notches, as Emeril might say.  Conditional Formatting is extremely powerful and we will use it extensively throughout Excel Hero. 

conditional_formatting.png


This tiny project has been instructive.  We've learned:
  • There are many ways to do just about anything in Excel.
  • Better methods are faster, shorter, easier to maintain, or simpler, and in that order. 
  • Excel can auto-fill cells by recognizing a pattern.  This is done by clicking and dragging the anchor.
  • Formulas should always start with an equal sign.
  • Excel has dozens of built-in formulas.
  • These can be combined to create countless custom formulas.
  • CSE or array formulas are potent and are entered by Control-Shift-Enter.
  • Arrays of constants need to be surrounded by fancy parentheses. 
  • Horizontal elements are comma separated; vertical elements are semicolon separated.
  • The date system used by Excel makes date arithmetic easy.
  • Formatting can dramatically change the way information looks.
  • Conditional Formatting takes presentation to the next level.

So what can I do with this live calendar?  For starters, you can take a picture of it and place the picture anywhere in your workbook.  Two things are interesting about this. The picture floats above the Excel grid and is not limited to the underlying rows and columns of where you place it.  This can really help in formatting your entire page for presentation.  Secondly, the picture itself is live, so if the original calendar is updated either by the computer clock or by changes you make to the original's layout or design, the copy updates as well. 

I will introduce one more idea in this post.  What if our long, custom formula is really annoying, or just confusing?  Excel offers a robust subsystem that allows you to assign names to your custom formulas.  They modestly call this feature, "Named Ranges."  In Excel a range is a group of cells (a single cell is also considered a range).  

Most instruction on Excel teaches that the idea behind named ranges is that you can select some cells (or one) and invent a name for that range, which the workbook will remember.  This can make your custom formulas much easier to read than all of the cell references.  However, in my opinion, the fact that you can name a range, is almost beside the point.  If you were to name cell C5 as MyFirstRange, Excel stores this in a workbook table as MyFirstRange:  =C5.  As you can see it is a formula, because the definition of the name starts with an equal sign.  

Extreme kudos are in order for the designers of Excel on this topic because what this enables is Named Formulas.  In fact this feature of Named Ranges should have been called Named Formulas.  You can have just about any formula, simple or extremely complex, defined as a named formula.  The name does not even need to refer to a range at all.  

For example, the following are all valid named ranges: =5, ={1,2,3}, =Now(), ="Excel Hero", =SQRT(ROW())-1.  That's five named range definitions that have no reference to any range whatsoever.  I could just as easily give you a thousand.  From now on this blog will always refer to named ranges as Named Formulas.  

To define a named formula for a worksheet or workbook, just click on the Insert menu at the top of Excel, select Name, and then Define.  On the Define Name dialog, create a name at the top and type a formula at the bottom, and then click on Add or OK.  Later you can manage your named formulas by modifying and deleting.  We will be delving deeply into that later.  What if we define three named formulas for our calendar as follows?

  • DaysAndWks        ={0,1,2,3,4,5,6} + {0;1;2;3;4;5}*7 
  • DateOfFirst    =Date(Year(Now()),Month(Now()),1)
  • Calendar  =DaysAndWks + DateOfFirst - WEEKDAY(DateOfFirst,2)

We would then be able to array-enter our custom formula for all of the day cells as: 

       =Calendar

Is that a little easier to read?  What could be simpler?  Notice that none of these formulas refers to a cell or a range.  They truly are named formulas, not named ranges.

[Excel 2007 Tip: Conditional Formatting is on the Home tab of the ribbon; select Manage Rules, then New Rule, click on "Use a formula to determine which cells to format" and enter the first formula into the box below, click OK and then New Rule to add the second formula in the same way.  Pick the colors as described at left.]


This is not difficult stuff at all; it just may be a lot of new things all at once.  Please stick with it.  You'll learn that a lot of these menus that we clicked through have shortcut alternatives that eliminate most of the work.  If you keep reading this blog, the concepts and processes will become so old-hat that you will be able to construct this entire calendar from scratch in a new workbook in less than a minute.


Once again, here is that video of me constructing the live calendar in real time.  It's just under 2 minutes, but I was talking while doing it ;)


What's an Excel Hero?

| 7 Comments | 0 TrackBacks
An Excel Hero saves money.  Lots of money.
NASDAQ in Times Square, New York City, USA.

Image via Wikipedia


The vast majority of users are doing some business activity with Excel.  An Excel Hero looks at an activity and sees how to use advanced techniques to save time.  It might be a small formula; it might be an entire project integrated with a database that can eliminate wasted time, possibly in your own job, department or division.

You may even be able to eliminate your position... or your office!  All businesses that succeed are ever mindful of the bottom line and that translates into keeping expenses in check.  In most organizations, labor is the number one cost.  Look for ways to prune and you'll definitely be a hero.  Believe me, just about every company has wasteful activities done by staff by hand or on computer.  Ironically they're often done in Excel.  And that's the crux of this book.  Excel offers so many ways to do things that well-intentioned managers and executives choose simple over better.  Simple to them, that is.

My goal is to post entries that will spark your imagination, so that you'll be anxious to craft time-saving solutions immediately.  Go ahead.  Money saved today is better than money saved once you become a master.  Keep returning and I promise to keep posting new techniques.  At first you may learn how to shave minutes even hours on a daily task.  Later you may learn how to eliminate the task from your business altogether!  I want this blog to be that valuable.

Just the other day I delivered a project to one of my clients.  They are in the business of modeling subtle and complex trends in historical stock market data, tracking dozens of indicators such as business cycles, credit availability, and sector risk to see what affect myriad variations of these have on selected portfolio returns.  Using Excel, it was taking a minimum of 10 analyst-days of highly compensated analyst time to complete each simulation.  My Excel solution for them completes a simulation in less than five minutes and makes no mistakes.  They run thousands of simulations.  My project eliminated a department.  I've done this type of thing many times.  You can too.

When you start saving your company serious time, you'll be noticed.  You'll have wonderful choices.   As you read this blog always ask yourself, can I use this to save my company money? 

In short order you will be an Excel Hero!


Excel, a Remarkable Tool

| 0 Comments | 0 TrackBacks
What is Excel and what can I do with it?
Microsoft Excel (Windows)

Image via Wikipedia


Excel is a calculator.  You can add with it.  This is short and true, and misses the point completely.  

Outside of Windows, Excel is the most successful software ever.  Tens of millions of people use it every day as part of their work or at home.  It is used to store information, sort, filter, present, and oh yes, calculate.  It's used to make shopping lists, budgets, and to design the upcoming manned mission to Mars.  Of the millions of daily users of this amazing product, only a tiny fraction of a percent know how to do more than sum a column of figures.  That's tragic, when you consider the depth of capability Excel offers to analyze and visualize data and to solve problems.

If you have the ability to understand a problem, this blog is designed to help you judge whether Excel would be a useful tool for solving the problem.  It will illuminate dozens of simple, yet powerful techniques that once understood, will enable you to direct Excel in unexpected ways.  If you have never used Excel before, if you use it every day and wish you knew how to lighten your workload, even if you are an expert, you can be an Excel Hero and this blog will show you the path.

I have been using Excel since the very beginning and Lotus 123 before that.  As a consultant, I have crafted literally thousands of projects, solving problems for businesses large and small using Excel and Access.  This blog will draw from that experience and teach you how to do things that you likely never imagined before.

Most activities with Excel can be classified into three broad groups.  1. IMPORTING information (data) into Excel.  2. Manipulating data for ANALYSIS.  3. Polishing the data for PRESENTATION and reporting.  Some projects focus in just one of these categories; others will require all three. 

As you will discover, any given task in Excel will likely have many different solutions. I once won a formula competition on MrExcel.com's Challenge of the Month.  There were over 500 unique formulas to solve the contest problem entered by people the world 'round. Over 500 solutions to one problem!  I was honored that MrExcel.com's readers chose my formula as the best of the bunch.

What I want you to learn is that there may not be a "best" method for doing a task, but some are clearly better than others.  I would say that a method is better if it is interactive, faster, shorter, easier to maintain (because you are the one that will maintain your projects), or simpler, and in that order.
Reblog this post [with Zemanta]

About this Archive

This page is an archive of entries from January 2010 listed from newest to oldest.

February 2010 is the next archive.

Find recent content on the main index or look in the archives to find all content.