Data labels are terribly tedious to apply to slope charts, since these labels have to be positioned to the left of the first point and to the right of the last point of each series. This means the labels have to be tediously selected one by one, even to apply “standard” alignments.

I recently wrote a post called Slope Chart with Data Labels which provided a simple VBA procedure to add data labels to a slope chart; the procedure simplified the problem caused by positioning each data label individually for each point in the chart.

The problem is that often points are located close to each other; the result: overlapping data labels. I showed a feature in my commercial software that lets you fine-tune the position of individual data labels. But even with that helpful tool, it is still a tedious process.

A more inclusive VBA procedure that would take care of overlapping data labels is a much more difficult task. I thought about it for a while. Since each set of labels is aligned horizontally with the points they describe, I would only have to worry about vertical positioning. I would have to sort the labels by vertical position, then see if a label overlapped any subsequent label. Also, I should allow some overlap, since labels include a small white margin around their text. But it seemed like a lot of work.

My friend and colleague Andy Pope came to the rescue. He sent an email with a nifty VBA routine he’d worked out. I remembered that long ago Andy had a prototype label untangler, and I know I played with it. But what he sent me now was much better than a prototype. I tested it and liked it, then I thought about how I would build my own.

Andy’s routine used a class module to create a DataLabel object for each label that had to be realigned. His routine inserts each DataLabel object into a VBA Collection in such a way that the Collection is sorted by vertical position. Then the code loops through the labels: if two were to overlap, the upper label is moved up a tiny bit and the lower one down a little bit. This must be done iteratively because moving two labels apart might move either one closer to a third label. After looping until all labels were separated by a certain amount, the code finishes.

I decided to build a slightly different routine, because I already have a slope chart feature in my commercial software (which will be released in a new build soon), and I’d like something that worked with it. To compile all the labels, the program builds a two-column VBA array, with series numbers in the first column and vertical position in the second. The code bubble-sorts this array by the second column. Then it loops through the series numbers in a nested loop, to compare each label with every other label.

My entire module is shown below. I declare a couple of constants used in the positioning. When I move labels I do so by a pixel at a time, which in Windows is specified by `MoveIncrement = 0.75`

(0.75 points, to be exact). If I want to run this on a Mac, I must change `MoveIncrement`

to 1, because this is how the Mac measures screen distances (one of the myriad incompatibilities we have to deal with). After a few trials, I set an `OverlapTolerance`

of 0.45, meaning I would let labels overlap by 45% of their height.

The routine that does the work is `FixTheseLabels`

. It first loops through the series of the chart, and if the series has a valid label on the point being tested (we text point 1 for the left side labels and point 2 for the right), then the series number and the top position of the label are stored in the array. If the series has no valid data label, then zeros are stored in the array.

When the array is populated, the code calls `BubbleSortArrayByColumn`

(a routine I cobbled together years ago) to sort by the top position of the labels. After sorting the labels, the label top position in the second column of the array is no longer needed.

The code loops through the array, getting the series number from the first column; if it reads a zero, it means there isn’t a valid label, so it skips to the next array row. The program gets one label from the first loop and one from the second. If the top of the second label is less than `OverlapTolerance`

(0.45 or 45%) higher than the bottom position of the first, the first label moves up and the second moves down.

`TEST_FixTheseLabels`

is what I use to fix the labels on a selected chart. The two lines in this sub can be placed before the `ExitSub`

label in the `ApplySlopeChartDataLabels`

procedure in my previous post.

```
Option Explicit
Const MoveIncrement As Double = 0.75 ' 0.75 points = 1 pixel
Const OverlapTolerance As Double = 0.45
Sub TEST_FixTheseLabels()
FixTheseLabels ActiveChart, 1, xlLabelPositionLeft
FixTheseLabels ActiveChart, 2, xlLabelPositionRight
End Sub
Sub FixTheseLabels(cht As Chart, iPoint As Long, LabelPosition As XlDataLabelPosition)
Dim nLabels As Long
nLabels = cht.SeriesCollection.Count
Dim vDataLabels As Variant
ReDim vDataLabels(1 To nLabels, 1 To 2)
Dim iLabel As Long
For iLabel = 1 To nLabels
Dim srs As Series
Set srs = cht.SeriesCollection(iLabel)
If srs.Points(iPoint).HasDataLabel Then
Dim dlbl As DataLabel
Set dlbl = srs.Points(iPoint).DataLabel
If dlbl.Position <> LabelPosition Then
dlbl.Position = LabelPosition
DoEvents
DoEvents
End If
If IsNumeric(dlbl.Height) Then
vDataLabels(iLabel, 1) = iLabel
vDataLabels(iLabel, 2) = dlbl.Top
Else
vDataLabels(iLabel, 1) = 0
vDataLabels(iLabel, 2) = 0
End If
Else
vDataLabels(iLabel, 1) = 0
vDataLabels(iLabel, 2) = 0
End If
Next
BubbleSortArrayByColumn vDataLabels, 2
Do
Dim DidNotOverlap As Boolean
DidNotOverlap = True
Dim FirstIndex As Long, SecondIndex As Long
For FirstIndex = 1 To nLabels - 1
If vDataLabels(FirstIndex, 1) > 0 Then
Dim FirstLabel As DataLabel
Set FirstLabel = cht.SeriesCollection(vDataLabels(FirstIndex, 1)). _
DataLabels(iPoint)
For SecondIndex = FirstIndex + 1 To nLabels
If vDataLabels(SecondIndex, 1) > 0 Then
Dim SecondLabel As DataLabel
Set SecondLabel = cht.SeriesCollection(vDataLabels(SecondIndex, 1)). _
DataLabels(iPoint)
If FirstLabel.Top + FirstLabel.Height * (1 - OverlapTolerance) > _
SecondLabel.Top Then
DidNotOverlap = False
FirstLabel.Top = FirstLabel.Top - MoveIncrement
SecondLabel.Top = SecondLabel.Top + MoveIncrement
End If
End If
Next
End If
Next
If DidNotOverlap Then Exit Do
Dim LoopCounter As Long
LoopCounter = LoopCounter + 1
If LoopCounter > 30 * nLabels Then Exit Do
Loop
End Sub
Sub BubbleSortArrayByColumn(MyArray As Variant, iSortCol As Long)
Dim FirstItem As Long, LastItem As Long
FirstItem = LBound(MyArray, 1)
LastItem = UBound(MyArray, 1)
Dim LastSwap As Long
LastSwap = LastItem
Do
Dim LoopCounter As Long
LoopCounter = 1 + LoopCounter
If LoopCounter > 10000 Then Exit Do
Dim IndexLimit As Long
IndexLimit = LastSwap - 1
LastSwap = 0
Dim iRow As Long
For iRow = FirstItem To IndexLimit
If (MyArray(iRow, iSortCol) > MyArray(iRow + 1, iSortCol)) Then
' if the items are not in order, swap them
Dim jCol As Long
For jCol = LBound(MyArray, 2) To UBound(MyArray, 2)
Dim TempValue As Variant
TempValue = MyArray(iRow, jCol)
MyArray(iRow, jCol) = MyArray(iRow + 1, jCol)
MyArray(iRow + 1, jCol) = TempValue
Next
LastSwap = iRow
End If
Next
Loop While LastSwap
End Sub
```

Here is the chart with overlapping data labels, before running my routine. Half of the labels are illegible.

Here is the chart after running the routine, without allowing any overlap between labels (`OverlapTolerance`

= zero). All labels can be read, but the space between them is greater than needed (you could almost stick another label between any two adjacent labels here), and some labels have moved far from the points they label.

Finally, here is the chart after running the routine with `OverlapTolerance`

= 0.45. Labels are closer together, but not too close, and some labels did not have to be moved (for example, see the lowest three labels on the right side of the chart). I tried a series of values for the overlap tolerance: zero tolerance obviously led to excessive spacing, 50% led to labels being a little too close, 40% was a little too far apart and some labels were moved that shouldn’t have to be. 45% seems to be the Goldilocks setting, though this probably varies with the font used for the labels.

- Prevent Overlapping Data Labels in Excel Charts
- Apply Custom Data Labels to Charted Points
- Label Each Series in a Chart
- Label Last Point for Excel 2007
- Add Totals to Stacked Bar Chart

The post Prevent Overlapping Data Labels in Excel Charts appeared first on Peltier Tech.

]]>This brief tutorial shows how to construct a slope chart in Excel, provides a simple VBA procedure to apply and format data labels quickly, and shows a finished chart with some manual repositioning of overlapping labels.

In Format all data labels at once on the Mr Excel forum, a user was frustrated with having to format data labels on his slope chart one point at a time, which is a very tedious and frustrating experience. One solution to such tedious tasks is VBA, and I wrote a procedure that would apply and format all data labels at once. The procedure ran instantly, compared to the many minutes it would take to create and format the labels by hand.

A slope chart is a nice way to compare two sets of matching data, for example, before and after sales data. The before points are plotted in the left side of a line chart, and the after points are plotted on the right, and you can trace the performance of each item by tracking the sloped lines connecting the points.

Edward Tufte introduced Slope Graphs in *The Visual Display of Quantitative Information*; examples of slope charts and tutorials showing how to create them in Excel abound on the interwebs.

Slope charts are not specifically built into Excel, but line charts are, so slope charts are easy to do. The complication is with data labels.

This is the slope chart data in the Mr Excel post:

Select the data and insert a line chart to create the slope chart.

Excel always tries to minimize the number of series and maximize the number of points per series, which is usually the way you want your chart to be plotted. But a slope chart has multiple series and only two points per series, so you need to switch rows and columns:

Usually the chart Excel inserts is larger than I need, but in this case, there is a lot of data and I need lots of room for my labels, so I have made the chart taller and wider.

The chart has a legend, but this legend is nearly useless. The best practice is to apply labels right on the chart. It can be done manually, but Excel first adds default labels above the points showing just the values. We need to select the labels for each series individually to add the series names, then we need to select the left and right labels separately to position them to the left of the left category or to the right of the right category. Bo-o-o-oring!

VBA to the rescue!

The simple procedure below first removes the useless legend, which lets the actual data fill the space more effectively. Then it cycles through the series, adding data labels with leader lines to each series, specifies that the labels show series name and value, colors the label text to match the series lines, prevents wrapping of longer labels, and positions each label on the appropriate side of the chart.

```
Sub ApplySlopeChartDataLabels()
If ActiveChart Is Nothing Then
MsgBox "Select a chart and try again!", vbExclamation
GoTo ExitSub
Else
With ActiveChart
.HasLegend = False
Dim iSeries As Long
For iSeries = 1 To .SeriesCollection.Count
With .SeriesCollection(iSeries)
Dim iColor As Long
iColor = .Format.Line.ForeColor.RGB
.HasDataLabels = True
.HasLeaderLines = True
With .DataLabels
.ShowValue = True
.ShowSeriesName = True
.Font.Color = iColor
.Format.TextFrame2.WordWrap = False
With .Item(1)
.Position = xlLabelPositionLeft
End With
With .Item(2)
.Position = xlLabelPositionRight
End With
End With
End With
Next
End With
End If
ExitSub:
End Sub
```

Simply select the chart and run the code. The easiest way is to press Alt+F8, select ApplySlopeChartDataLabels, and click Run.

The chart with labels looks like this:

That was fast, but there are numerous overlapping data labels. Writing VBA to alleviate this is rather complicated (I’ve tried!). You need to select each label (use two single clicks: one to select the series of labels, another to select a specific label) and drag it into position with the mouse.

I’ve taken some of the tedium out of this experience with the *Move/Resize Chart Elements* function in Peltier Tech Charts for Excel, my commercial charting add-in. With it, I can select a set of labels, or even an individual label, and apply a built-in position, or move it by one or more pixels or points at a time, with the arrow buttons on the dialog or with the arrow keys on my keyboard.

Using my utility it took about five minutes to disentangle all the data labels. Labels in the default left or right position do not have leader lines connecting the labels to their respective points, but once a label is moved, its leader line appears. The result looks good and is easily digested:

- Apply Custom Data Labels to Charted Points
- Label Each Series in a Chart
- Label Last Point for Excel 2007
- Add Totals to Stacked Bar Chart

The post Slope Chart with Data Labels appeared first on Peltier Tech.

]]>I recently wrote Dynamic Array Histogram, a tutorial showing how to build a histogram with a normal curve overlay. It worked great, except that the chart was hard-coded to the worksheet ranges with the data. When I changed the Dynamic Array formulas, the spill ranges changed size, and I had to manually adjust the chart data ranges. Not too big a deal for just a few series, but still an inconvenience.

Here is what the data looks like. Cell D7 contains the following formula, which spills the numbers from 161 to 175 into the highlighted range D7:D21.

`=SEQUENCE(H3+1-G3,1,G3,1)`

I’ve written many tutorials about Dynamic Charts (see the list at the end of this article). Ordinarily I would generate some Names (aka dynamic range names) using a formula like this

`=OFFSET(Sheet1!$D$6,1,0,COUNT(Sheet1!$D$6:$D$100),1)`

then add the Names to the chart.

But we don’t need to use an OFFSET or other function to determine the size of the range of data. It’s a Dynamic Array, which we can reference using a hash sign (#): The entire spill range of the Dynamic Array formula in D7 is referenced by D7#.

We still need to use Names in the dynamic chart, so click Define Name on Excel’s Formulas tab, give our Name a good name (`xWeights`

), for Scope select `Sheet1`

, and use `=Sheet1!$D$7#`

as the Refers To formula. If you click in the Name textbox and back in the Refers To box, the range defined by the formula will be highlighted as shown.

Similarly we had Dynamic Array formulas in cells E7 and F7, which spilled into E7# and F7#. We define Names `yCount`

and `yCurve`

using `=Sheet1!$E$7#`

and `=Sheet1!$F$7#`

. These are the Names we will use in our dynamic chart.

Press Ctrl+F3 to open the Name Manager and see all of the Names. As above, clicking in the Refers To box will highlight the range defined by the formula.

*A note about naming Names. I use a prefix of x or y for chart data which will be used as X or Y values. I used to use a prefix of cht, but a fluky behavior in Excel 2007 and more recent versions changed that. If the name of a Name begins with R or C (think Row or Column), the name cannot easily be used in the series formula. If your language of Excel is not English, you can’t use the letters that the words for Row and Column begin with in your language.*

Start by selecting the range of data (D6:F21), including the header row. And note that the top left cell D7 is blank, to help Excel determine that the first column is X values. Insert a column chart, then right click on the Curve series, choose Change Series Chart Type, and select a line style.

When the chart is selected, the data range is highlighted.

Select the Count series (the blue columns). The data is highlighted, and this `SERIES`

formula appears in the formula bar.

`=SERIES(Sheet1!$E$6,Sheet1!$D$7:$D$21,Sheet1!$E$7:$E$21,1)`

This means the series name is in `Sheet1!$E$6`

, the X values are in `Sheet1!$D$7:$D$21`

, the Y values are in `Sheet1!$E$7:$E$21`

, and it is the first series in the chart.

Right in the formula bar, change `Sheet1!$D$7:$D$21`

to `Sheet1!xWeights`

and `Sheet1!$E$7:$E$21`

to `Sheet1!yCount`

. Click Enter, so the `SERIES`

formula becomes

`=SERIES(Sheet1!$E$6,Sheet1!xWeights,Sheet1!yCount,1)`

The worksheet data is still highlighted.

Repeat the procedure with the Curve series, changing the formula from

`=SERIES(Sheet1!$F$6,Sheet1!$D$7:$D$21,Sheet1!$F$7:$F$21,2)`

to

`=SERIES(Sheet1!$F$6,Sheet1!xWeights,Sheet1!yCurve,2)`

In the Dynamic Array Histogram tutorial, I showed how to make a multiple-column Dynamic Array from a single formula. This makes defining our Names more complicated, but only slightly so.

The Dynamic Array formula in cell D4 spills into multiple rows and columns.

The formula `=D4#`

(entered into cell I4) spills into the same size range.

But I can use `INDEX`

to return a portion of the Dynamic Array in D4#. For example, `INDEX(D4#,1,1)`

returns the cell in the first row and first column of D4#.

To get the first column, I specify the row as zero, in `INDEX(D4#,0,1)`

. I could leave out the zero altogether as long as I have the right amount of commas: `INDEX(D4#,,1)`

.

If I’d wanted just the first row, I use zero (or a blank) for the column index: `INDEX(D4#,1,0)`

or `INDEX(D4#,1,)`

.

As it turns out, I could use `INDEX(D4#,0,0)`

or `INDEX(D4#,,)`

to reference the entire array.

I define my Names as above, name: `xWeights`

, scope: `Sheet2`

, refers to: `=INDEX(Sheet2!$D$4#,0,1)`

.

Define `yCount`

and `yCurve`

as `=INDEX(Sheet2!$D$4#,0,1)`

and `=INDEX(Sheet2!$D$4#,0,1)`

, create and format the chart, and plug in the Names for the hard-coded addresses.

Here are more articles on the Peltier Tech blog that talk about dynamic charts.

- Dynamic Charts
- Easy Dynamic Charts Using Lists or Tables
- Dynamic Charts in Excel 2016 for Mac
- Dynamic Chart using Pivot Table and Range Names
- Dynamic Chart using Pivot Table and VBA
- Dynamic Ranges to Find and Plot Desired Columns
- Split Data Range into Multiple Chart Series without VBA
- VBA to Split Data Range into Multiple Chart Series
- Dynamic Chart Source Data (VBA)
- Dynamic Chart with Multiple Series
- Display One Chart Dynamically and Interactively

The post Dynamic Charts Using Dynamic Arrays appeared first on Peltier Tech Blog.

]]>*Written by Ron de Bruin, Microsoft Excel MVP.*

A lot of Excel procedures are packaged in Add-Ins. An Add-In is basically a workbook containing some code, that has some other unique properties. An add-in usually has some user interface elements, such as dialogs (UserForms), buttons on the ribbon, or buttons on context (right-click) menus, so you can access its features.

The add-in workbook itself is invisible, though it has one and often more hidden worksheets, where information related to the add-in may be stored. The add-in can be installed, which means it opens every time Excel opens, so that its functionality is always available. An Excel add-in can also be opened like a regular file when you need it, without installing it, though it is still hidden.

Very often, an Excel add-in is packaged in a zip file. If you download a zip file in safari it will automatically Unzip the file and you will find the add-in in your Download folder, if not double click the add-in and it will unzip.

The best place to store the add-in is in the User Add-in Library, which is:

`/Users/UserName/Library/Group Containers/UBF8T346G9.Office/User Content/Add-Ins`

To get to this folder, open a Finder window and click on Go in the Finder menu bar; if you press and hold the Option key you will see the hidden Library folder in the list so you can click on it to open it. Now drill down to Group Containers, `UBF8T346G9.Office`

, User Content, and then Add-Ins. The strange label “UBF8T346G9.Office” was assigned by Apple to Microsoft, and we have to live with it.

You can actually store the add-in in almost any convenient folder you want, but when you use the Add-Ins folder (above), the add-in is listed in the Excel Add-ins dialog, and you will not have to browse to find it and add it to this list.

Tip: with the Add-ins folder open in Finder, use the shortcut `Command+Control+t`

to add a link to this folder in the Finder Favorites sidebar so it is easy to access the folder another time.

**Pro Tip:** Go to Setup your Mac for Mac Office. Among other hints, this page includes a script that will install shortcuts to the Add-Ins folder and several other folders on the Mac desktop.

To install the add-in, follow these simple steps.

- Open Excel.
- From the top-level Mac menubar, click the Tools menu and select Excel Add-ins… to open the Add-ins dialog.
- If the Add-In is listed, simply check it in the list.
- If the Add-In is not listed, use Browse to select it, and click Open.
- Press OK.
- Done!

To uninstall the add-in, follow these steps.

- From the Mac menubar, click Tools and select Excel Add-ins… to open the Add-ins dialog.
- Find the Add-In in the list, and simply uncheck it.
- Press OK.

The add-in will remain on the list, to make it easy to reinstall at some future time.

When you install an add-in, it will load whenever Excel opens, so it is always ready to use. If you don’t want an Excel add-in open all the time, you can simply start it when you need it, using `File menu > Open`

in Excel or double-clicking on the add-in file in Finder. This opens the add-in, but just like an installed add-in, there is no visible workbook or worksheet in Excel.

I’ve known Ron de Bruin for over 20 years, when we were both lurking on the Microsoft news groups, in awe of the first generation of Excel MVPs, like John Walkenbach, Chip Pearson, Bob Umlas, Ron Bovey, and Stephen Bullen. We answered more than a few questions and learned all about Excel. I’ve worked quite a bit with Excel for Mac, but Ron has become an expert on all things Mac and Excel. I became an Excel MVP in 2001, and Ron followed soon after, in 2002. Ron has been a strong advocate for users of Excel on the Mac. Ron covers many Excel and VBA topics on Excel Automation, and he covers Mac-specific Excel issues at Mac Excel Automation.

The post Install an Excel Add-In in Excel for Mac appeared first on Peltier Tech Blog.

]]>My friend Thom and I had a discussion about tracking weight and using histograms to show the shape of the weight distribution. He told me he disliked the native Excel Histograms, and I agreed. I’ve written a lot of articles about histograms (see the list at the end of this article), and my commercial software, Peltier Tech Charts for Excel, provides a couple types of histogram which are much more flexible than Excel’s.

I was going to show Thom how to build his own histograms, with a frequency table and all, and I thought, you know, all those new functions and features will make this easier. So I’m going to build a histogram using Dynamic Arrays, and show you how easy it can be.

I’m using my recorded weights for 2020, which has a column for date and one for weight. The dates are in A2:A288 and the weights are in B2:B288. Easy peasy.

I like to make a little summary table when I do an analysis like this. Below I’ve included the number of data points (Count), the Mean and Standard Deviation, and the Minimum and Maximum values. The formulas are `=COUNT(B2:B288)`

, `=AVERAGE(B2:B288)`

, `=STDEV(B2:B288)`

, `=MIN(B2:B288)`

, and `=MAX(B2:B288)`

.

Let’s make a list of weights. I’ll use `=UNIQUE(B2:B88)`

to produce a list of weights in column B. But let’s also sort the list, using `=SORT(UNIQUE(B2:B288))`

. The dynamic array formula starts in cell D6, and spills down as far as it needs to, in this case to cell D19. The spill range is indicated by the blue shadowed border of D6:D19. This spilling into appropriate-size ranges makes Dynamic Arrays flexible and powerful.

Given the weights we need counts for, we can use a simple COUNTIF formula. In cell E6 I have `=COUNTIF(B2:B288,D6#)`

. The # symbol after D6 in the formula means Excel will use the entire Dynamic Array defined in D6, and spill the results starting in cell E6, however long it may be.

And we see the Dynamic Array result in E6:E19

We can calculate the points we need for a Normal Curve using the `NORM.DIST`

function. The function in cell F6, which spills into F6:F19, is `=NORM.DIST(D6#,E3,F3,FALSE),`

using the mean and standard deviation calculated in cells E3 and F3.

Select the range D6:F19 and insert a clustered column chart.

Right click on one of the visible data points, and choose Change Series Chart Type from the pop-up menu. Select the Curve series, and change the chart type to Line in the dropdown, and check the Secondary Axis checkbox.

It’s a good start, but still a bit rugged.

Format the Curve series line to use a Smoothed Line, and change the Gap Width of the columns to 50.

Adjust the secondary axis scale: I typed 0.00 for Minimum so it was no longer automatic and 0.205 for Maximum so the curve and bars were about the same height.

Finally I hid the secondary vertical axis tick labels (position = none) and deleted the legend.

The bars and the normal curve are not perfectly aligned, because there’s a longer tail at the higher weights, but that’s not a problem.

So it’s a pretty good chart. Or is it…??

Did you notice that there was no value 174 between 173 and 175? The problem with using `UNIQUE`

is that it only gives you what values are in the range, not every value you might expect.

I’ll fix this by using `SEQUENCE(rows, columns, start, increment)`

rather than `UNIQUE(range)`

. Cell D6 has the formula `=SEQUENCE(H3+1-G3,1,G3,1)`

. The number of rows is the max plus one minus the min, `H3+1-G3`

; the number of columns is `1`

, the starting value is the minimum, `G3`

, and the increment is `1`

. This Dynamic Array formula requires an extra row (for the previously missing 174) but the Count and Curve Dynamic Arrays keep up easily.

The chart now shows all categories, with a zero-height bar at 174.

That’s even better. But is it good enough?

Thom’s data is different from mine: I record weights as whole number pounds, but Thom records tenths of pounds. This requires a few changes.

If I regress to the first example and use `=SORT(UNIQUE())`

, I get a list of every unique value in the data range. Obviously this isn’t what I want; I really just want whole numbers in the chart’s X values.

If I use the `SEQUENCE`

approach with `=SEQUENCE(H3+1-G3,1,G3,1)`

in cell D6, I still encounter a problem because the minimum isn’t a whole number.

I need to adjust my calculated Min and Max, using `=FLOOR(MIN(B2:B288),1)`

and `=CEILING(MAX(B2:B288),1)`

to give me whole numbers in cells G3 and H3. The function in D6, `=SEQUENCE(H3+1-G3,1,G3,1),`

now provides what I need

But now the `COUNTIF()`

function in cell E6 falls flat. As written, the function looks for exact matches with the results from the D6# Dynamic Array, counting only 23 of the 287 weights.

I need a smarter counting function in cell E6, so I will use `=COUNTIFS(B2:B288,">="&D6#,B2:B288,"<"&D6#+1)`

, which counts values between D6$ and D6#+1. This counts all 287 of these values.

The same `NORM.DIST`

function as before gives me the normal curve coordinates. Cell F2 has the formula `=NORM.DIST(D6#,E3-0.45,F3,FALSE)`

.

The chart is nearly identical to the one with whole number weights above.

All the charts so far suffer from a certain asymmetry. Because it’s easier for my weight to float upward a bit for a few days than downward, there is a longer upward tail on the distribution, and the bulge of the chart is off center.

I can fix that with a further modification to the Min and Max in my summary table. To make the chart symmetrical, I need as much space above and below the mean.

The distance from the mean to the max is `MAX(B2:B288)-E3`

, while the distance from the mean to the min is `E3-MIN(B2:B288)`

. The symmetric space above and below the mean is given by `MAX(MAX(B2:B288)-E3,E3-MIN(B2:B288)`

, so the new min and max values in G3 and H3 ARE:

`=FLOOR(E3-MAX(MAX(B2:B288)-E3,E3-MIN(B2:B288)),1)`

=CEILING(E3+MAX(MAX(B2:B288)-E3,E3-MIN(B2:B288)),1)

When these are used as the bounds, I get the following Dynamic Arrays:

When I create a chart as above, the bulge of values is centered in the chart. This satisfies my internal aesthetic. Because of the longer tail at the top end, the curve and bars are not perfectly aligned, but I won’t dispute the data.

If the Dynamic Arrays are becoming easy for you, we can take it further, using the new `LET`

function. With `LET`

, I can define inputs and intermediate calculations, and use them in downstream calculations. I’m continuing with whole number data from here on, but these principles could be applied to either case.

My formula in cell D6 is shown below. I define the input range of weights, `rng`

, and the calculated minimum and maximum values, `datamin`

and `datamax`

. I compute `delta`

, the larger of the spans between the max or the min and the mean. Based on `delta`

I compute my new minimum and maximum values, `newmin`

and `newmax`

. I determine my list of `weights`

, then do my `COUNTIF`

and `NORM.DIST`

as in the individual Dynamic Arrays in D6:F6. By using `CHOOSE({1,2,3},...`

I can output all of these from a single formula. It’s mind-boggling at first, but also exciting.

`=LET(rng,B2:B288,`

`datamin,MIN(rng),`

`datamax,MAX(rng),`

`avg,AVERAGE(rng)`

,`delta,MAX(datamax-avg,avg-datamin),`

`newmin,FLOOR(avg-delta,1),`

`newmax,CEILING(avg+delta,1),`

`weights,SEQUENCE(newmax+1-newmin,1,newmin,1),`

`CHOOSE({1,2,3},`

`weights,`

`COUNTIF(rng,weights),`

`NORM.DIST(weights,avg,STDEV(rng),FALSE)`

`)`

`)`

With this large formula in cell D6, here is my new Dynamic Array formula. Just like the previous one, but it takes one formula, not three.

Since the calculations are identical, the resulting chart is identical.

Is your mind blown yet? If not, read on.

Dynamic Array formulas came first, and they awed and amazed us all. Then came the `LET`

function, which allowed us to input arguments and perform intermediate calculations leading to our desired results. But `LAMBDA`

takes Excel an order of magnitude further, allowing us to define a formula, then use it as a custom function wherever we need it.

Using `LAMBDA`

I’m going to define a function `HistoNormData`

, which will allow me to input a range, such as my weights, and spit out a data range that I can use in a histogram.

The bulk of my `LAMBDA`

formula will be the `LET`

formula from the previous section. I input the data range into `LAMBDA`

, pass it into the `LET`

, and output the chart data range. The LAMBDA formula looks like this:

`=LAMBDA(`

rng,`LET(datamin,MIN(rng),`

`datamax,MAX(rng),`

`avg,AVERAGE(rng)`

,`delta,MAX(datamax-avg,avg-datamin),`

`newmin,FLOOR(avg-delta,1),`

`newmax,CEILING(avg+delta,1),`

`weights,SEQUENCE(newmax+1-newmin,1,newmin,1),`

`CHOOSE({1,2,3},`

`weights,`

`COUNTIF(rng,weights),`

`NORM.DIST(weights,avg,STDEV(rng),FALSE)`

`)`

`)`

I can’t use it like this. But I can enter it into a formula and append values for the LAMBDA arguments. My only argument is rng, and I want to use B2:B288, so I enter the formula in cell D6, and append the range address in parentheses at the end of the formula:

`=LAMBDA(`

rng,`LET(datamin,MIN(rng),`

`datamax,MAX(rng),`

`avg,AVERAGE(rng)`

,`delta,MAX(datamax-avg,avg-datamin),`

`newmin,FLOOR(avg-delta,1),`

`newmax,CEILING(avg+delta,1),`

`weights,SEQUENCE(newmax+1-newmin,1,newmin,1),`

`CHOOSE({1,2,3},`

`weights,`

`COUNTIF(rng,weights),`

`NORM.DIST(weights,avg,STDEV(rng),FALSE)`

`)`

`)`

(B2:B288)

This approach helps to debug the `LAMBDA`

formula.

The result of the LAMBDA formula is identical to that of the LET formula in the previous section, and the resulting chart is also identical. Since the LAMBDA works out in this test, I’m ready to convert it into a custom function. This is done using Excel’s `Defined Name`

infrastructure.

On the Formulas tab of the ribbon, click `Define Name`

. When the New Name dialog pops up, enter the name of the custom function in the Name textbox, and enter the formula (not including the arguments in parentheses at the end) in the Refers To textbox. It’s easiest to just copy and paste the formula that you worked on in the Excel formula bar above: you can make the Name dialog larger, but you can’t make the Refers To box more than one row of text tall. Microsoft assures us they are working on a better formula editing experience, and we can’t wait.

Click OK and the custom function is created.

The function is used like any others in Excel. Cell D6 contains the formula

`=HistoNormData(B2:B288)`

and my three columns of values are output in the sheet, spilling to fill as many rows as are needed.

The range and chart are identical to what we’ve already seen above, but I can easily use my `HistoNormData`

function to compute similar output ranges for other data in the same workbook. For example, in the worksheet shown below, I have a much larger range of data. I entered this formula in cell D6

`=HistoNormData(B2:B783)`

And I get a corresponding chart of the larger data set, without any additional work.

Names are defined for a given workbook, so you would have to define your custom function in any workbook where it is needed. But it’s easier than you think: you use the custom LAMBDA function on a worksheet, then copy that worksheet to another workbook, and the custom function is also copied to the new workbook.

- Histogram With Normal Curve Overlay
- Histograms Using Excel XY Charts
- Histogram Using XY and/or Area Charts
- Filled Histograms Using Excel XY-Area Charts
- Histogram on a Value X Axis
- Histogram with Actual Bin Labels Between Bars
- Peltier Tech Histogram

The post Dynamic Array Histogram appeared first on Peltier Tech Blog.

]]>I’ve written several articles and done a few presentations about the new functions in Excel, such as Dynamic Arrays, the LET function, and the latest, the LAMBDA function. As I continue to build VBA-based workbooks and add-ins, I realize how much easier these features make the projects I’m working on, and I am planning to release Microsoft 365 versions of these projects.

Someone recently asked me how to test in VBA whether a user’s computer has these features. I realize that this is something I’ll need to do, so I did a little work to develop a VBA test.

I started by putting some new formulas into a worksheet, then I checked what happened to these formulas when the workbook was opened in an older version of Excel (I used Excel 2013). Then I checked what happened when I tried to use VBA to insert such formulas into a worksheet.

Here is my simple test worksheet in the latest Insiders build of Excel.

I named cell B2 `TestDynamicArray`

for future reference, and entered a Dynamic Array formula in that cell.

Cell B5 is named TestLET and contains a simple LET formula.

Cell B7 is named TestLAMBDA and contains a simple LAMBDA formula.

Cells C2, C5, and C7 (formatted with blue text) use the FORMULATEXT function to show the formulas in column B.

The SEQUENCE function develops a list of numbers using this syntax:

`=SEQUENCE(rows,columns,start,step)`

Where `rows`

and `columns`

indicates the size of the resulting array’s spill area, `start`

is the first number in the sequence, and `step`

is how much greater each number is than the previous one. Any missing parameter is assumed to be 1.

In cell B2 I entered this formula:

`=SEQUENCE(2)`

in order to get a spill range of two cells tall and one cell wide.

My simple LET formula in cell B5:

`=LET(one,1,two,2,one+two)`

basically says, let the value of `one`

be 1 and the value `two`

be 2, then return the result of `one+two`

. And the result is 3, yay arithmetic.

My equally simple LAMBDA formula in cell B7:

`=LAMBDA(one,two,one+two)(1,2)`

says input two variables `one`

and `two`

, then return `one+two`

. The values in parentheses after the LAMBDA closing parenthesis, 1 and 2, are used as the inputs `one`

and `two`

. The result again is 3.

When I select a cell and type `?ActiveCell.Formula`

into the Immediate Window of the VBA Editor, the formula is returned. Here are the formulas of my three cells:

These look just like the formulas we entered, and which `FORMULATEXT`

showed us above. Boring, I know.

I booted up a virtual machine that has Windows 10 and Excel 2013. Here is how my modern functions look in an obsolete version of Excel:

The calculated values look okay, but that’s because Excel has preserved them, without recalculating them. Here is what VBA says the formulas are in my test cells:

Note that the new functions have a prefix of `_xlfn`

and the arguments in the LET and LAMBDA formulas have a prefix of `_xlpm`

. I think Excel saves new functions with this prefix to warn old versions of Excel that these are formulas from the future, and old Excel shows the stored cell values without spewing errors.

If you enter a formula with these prefixes in a version of Excel that recognizes the function name after the prefix, the prefixes are stripped away and the formula just works fine.

If you try to re-enter one of these prefix-laden formulas in an old version of Excel, the formulas are entered, but you get a `#NAME?`

error. You get the same error if you try to enter the modern formula in old Excel without the prefix.

One way to test for the availability of these functions is to hide a worksheet like this in a workbook or add-in. We can write simple functions in VBA to test whether the formulas are recognized, simply by checking for formulas with a prefix of `_xlfn`

.

Remembering that I named the cells with these formulas TestDynamicArray, TestLET, and TestLAMBDA, the VBA code can examine the formulas. My module looks like this, with the prefix I’m looking for declared in a constant at the top, followed by three test functions.

The functions have the same structure. Find the cell, extract its formula, and return `True`

if it does not begin with the prefix.

```
Option Explicit
Const TestString As String = "=_xlfn."
Public Function TestDynamicArray() As Boolean
Dim rTest As Range
Set rTest = ThisWorkbook.Names("TestDynamicArray").RefersToRange
Dim sFmla As String
sFmla = rTest.Formula
TestDynamicArray = Left$(sFmla, Len(TestString)) <> TestString
End Function
Public Function TestLET() As Boolean
Dim rTest As Range
Set rTest = ThisWorkbook.Names("TestLET").RefersToRange
Dim sFmla As String
sFmla = rTest.Formula
TestLET = Left$(sFmla, Len(TestString)) <> TestString
End Function
Public Function TestLAMBDA() As Boolean
Dim rTest As Range
Set rTest = ThisWorkbook.Names("TestLET").RefersToRange
Dim sFmla As String
sFmla = rTest.Formula
TestLAMBDA = Left$(sFmla, Len(TestString)) <> TestString
End Function
```

You can test for each function as you need it, or you can test for all of them at once:

```
Public Function TestNewFeatures()
Dim TestResult As String
TestResult = "Testing..." & vbNewLine
TestResult = TestResult & "Dynamic Arrays: " & TestDynamicArray & vbNewLine
TestResult = TestResult & "LET Function: " & TestLET & vbNewLine
TestResult = TestResult & "Lambda Function: " & TestLAMBDA
MsgBox TestResult, vbInformation + vbOKCancel, "Testing Excel for New Features"
End Function
```

Here is the resulting message box when tested in the latest Insiders build of Excel:

Here’s the result in Excel 2013:

I wouldn’t necessarily bother my user with this message box, but it’s a good example of how to call the test functions from code.

I might instead show a message explaining that his Excel installation is woefully out-of-date and some of the fabulous features of my most excellent program will not work.

Instead of cluttering my workbook or add-in with a dummy worksheet that contained sample formulas, I can write a VBA procedure that tested for all of the new functions at once.

This VBA function opens a new workbook, inserts each of the new

Public Function TestExcelForNewFunctions(ByRef bDynamicArray As Boolean, _ ByRef bLET As Boolean, ByRef bLAMBDA As Boolean) As Boolean Application.ScreenUpdating = False '[see note 1 below] Dim ws As Worksheet Set ws = Workbooks.Add.Worksheets(1) ' test Dynamic Array With ws.Range("B2") On Error Resume Next '[see note 2] .Formula2 = "=SEQUENCE(2)" '[see note 3] bDynamicArray = .HasSpill On Error GoTo 0 If Not bDynamicArray Then '[see note 4] If .HasFormula Then bDynamicArray = .Text = "#SPILL!" End If End If If Not bDynamicArray Then GoTo ExitFunc '[see note 5] End With ' test LET With ws.Range("B5") On Error Resume Next '[see note 2] .Formula = "=LET(one,1,two,2,one+two)" On Error GoTo 0 bLET = .Text = "3" '[see note 6] If Not bLET Then GoTo ExitFunc '[see note 5] End With ' test LAMBDA With ws.Range("B7") On Error Resume Next '[see note 2] .Formula = "=LAMBDA(one,two,one+two)(1,2)" On Error GoTo 0 bLAMBDA = .Text = "3" End With TestExcelForNewFunctions = bDynamicArray And bLET And bLAMBDA ExitFunc: ws.Parent.Close False Application.ScreenUpdating = True End Function

Here are the notes referenced in the code above:

- Turn off
`ScreenUpdating`

so nobody notices the test workbook being opened and closed. - Use
`On Error Resume Next`

to avoid run-time errors while entering formulas. Important for Dynamic Arrays, but not really needed for LET or LAMBDA. `Formula2`

means enter the formula as a Dynamic Array formula; plain`Formula`

would simply enter a one-cell implicit intersection formula, like`=@SEQUENCE(2)`

.`bDynamicArray`

might be`False`

even in modern Excel, if somehow a value in a cell blocked the spill range, so we test for a spill error.- If
`bDynamicArray`

is`False`

, there’s no need to test for LET;

if`bLET`

is`False`

, there’s no need to test for LAMBDA. - The cell would display 3 if the formula worked, or most likely
`#NAME?`

if it didn’t; I decided to test for 3 in case some other error message appeared.

I would call it as follows, though as before I may decide the change the message to an alert that the user’s Excel version is not up to date.

```
Public Sub TestExcel()
Dim bTestDA As Boolean, bTestLet As Boolean, bTestLambda As Boolean
Dim TestAll As Boolean
TestAll = TestExcelForNewFunctions(bTestDA, bTestLet, bTestLambda)
Dim TestResult As String
TestResult = "Testing..." & vbNewLine
TestResult = TestResult & "Dynamic Arrays: " & bTestDA & vbNewLine
TestResult = TestResult & "LET Function: " & bTestLet & vbNewLine
TestResult = TestResult & "Lambda Function: " & bTestLambda
MsgBox TestResult, vbInformation + vbOKCancel, "Testing Excel for New Functions"
End Sub
```

The message boxes are the same as for the previous VBA test, because I simply reused that bit of the code.

The post VBA Test for New Excel Functions appeared first on Peltier Tech Blog.

]]>I’ve written a tutorial about making Step Charts in Excel. Given a timeline of points where a value is set and where that value is constant until the next time it is changed, I showed how to build a line chart that showed the periods between changes as horizontal lines, and the changes as vertical lines. I will show how to use VBA to convert a line chart to a step chart.

For example, here is a table and chart of first class postage rates in the US. Obviously, connecting the data points with diagonal lines does not reflect the price between points.

Here is the original data again, and I’ve added dummy points at the beginning and end of the arbitrary time period I’m watching.

I’ll start my chart data range using all but the first X values (dates) and all but the last X values. These points stagger the values to one date later.

We need to connect the two sets of points: a horizontal line from each blue (original) point to the next orange (staggered point), and a vertical line from each orange point to the next blue point.

And we don’t need those markers.

If we append the intact original data range below the staggered data, and make the chart using this new extended range, Excel internally sorts the dates, and produces the steps we want.

This isn’t the most difficult chart protocol I’ve ever described, but it requires an extended data range and a bit of work. Wouldn’t it be nice to run a simple macro and make it happen?

We’re going to do just that, with a short VBA procedure which will hack the chart SERIES formula to mimic generating the extended data range above. I’ve written about the SERIES formula in The Excel Chart SERIES Formula, and I wrote a tutorial about using VBA to hack the SERIES formula in Switch X and Y Values in a Scatter Chart.

If I select the chart series, I can see the series formula in Excel’s Formula Bar. (The artificial endpoints are colored red.)

The formula looks like this:

`=SERIES(Postage!$C$2,Postage!$B$3:$B$17,Postage!$C$3:$C$17,1)`

The four arguments are:

`=SERIES(SeriesName,XValues,YValues,Index)`

As shown in the manual approach above, I need to combine all but the first X value with all of the X values, and all but the last Y value with all of the Y values. In the series above, our X values will become

`Postage!$B$4:$B$17 and Postage!$B$3:$B$17`

and our Y values will become

`Postage!$C$3:$C$16 and Postage!$C$3:$C$17`

You can use multiple-area ranges if you separate each area’s address with commas, and surround them with parentheses:

`(Postage!$B$4:$B$17,Postage!$B$3:$B$17)`

(Postage!$C$3:$C$16,Postage!$C$3:$C$17)

Our final SERIES formula will look like this:

`=SERIES(Postage!$C$2,(Postage!$B$4:$B$17,Postage!$B$3:$B$17),`

`(Postage!$C$3:$C$16,Postage!$C$3:$C$17),1)`

- Start with the active chart (and abort if there is none).
- Get the SERIES formula for each series.
`=SERIES(SeriesName,XValues,YValues,Index)`

- Split this formula at the commas to create an array.
`{"=SERIES(SeriesName","XValues","YValues","Index)"}`

- Find the range of X values, then find the range from the second cell to the last, get the address, and append the address of the original complete range.
- Find the range of Y values, then find the range from the first cell to the next-to-last, get the address, and append the address of the original complete range.
- Finally, put the new X and Y references into the array, and join them into a new SERIES formula, and put that into the chart.

Here is the procedure, which took me about 15 minutes to write.

Sub LineToStepChart() If Not ActiveChart Is Nothing Then Dim iSrs As Long For iSrs = 1 To ActiveChart.SeriesCollection.Count Dim srs As Series Set srs = ActiveChart.SeriesCollection(iSrs) Dim nPts As Long nPts = srs.Points.Count Dim SrsFmla As String SrsFmla = srs.Formula '' =SERIES(name,xvalues,yvalues,number) Dim vFmla As Variant vFmla = Split(SrsFmla, ",") '' array {"=SERIES(name","xvalues","yvalues","number)"} If iSrs = 1 Then ' only needed on first series since all series use the same X values ' fix X values: all but first point then all points Dim sXVals As String sXVals = vFmla(LBound(vFmla) + 1) Dim sXSheet As String sXSheet = Left(sXVals, InStr(sXVals, "!")) Dim rXVals As Range Set rXVals = Range(sXVals) Set rXVals = Range(rXVals.Cells(2), rXVals.Cells(nPts)) Dim sNewXVals As String sNewXVals = "(" & sXSheet & rXVals.Address & "," & sXVals & ")" End If ' fix Y values: all but last point then all points Dim sYVals As String sYVals = vFmla(LBound(vFmla) + 2) Dim sYSheet As String sYSheet = Left(sYVals, InStr(sYVals, "!")) Dim rYVals As Range Set rYVals = Range(sYVals) Set rYVals = Range(rYVals.Cells(1), rYVals.Cells(nPts - 1)) Dim sNewYVals As String sNewYVals = "(" & sYSheet & rYVals.Address & "," & sYVals & ")" '' array {"=SERIES(name","new xvalues","new yvalues","number)"} vFmla(LBound(vFmla) + 1) = sNewXVals vFmla(LBound(vFmla) + 2) = sNewYVals SrsFmla = Join(vFmla, ",") '' =SERIES(name,new yvalues,new xvalues,number) srs.Formula = SrsFmla Next End If End Sub

After selecting the last chart above (with markers removed) and running this procedure, here is the resulting chart and series formula:

Note that this only works for Line or Area charts, for X values which are dates or at least treated numerically. Line charts sort numerical X values before plotting them, which makes this macro approach so easy. If the X values are text (non-numeric labels), the following chart results:

It did what I specified, but with a text axis, it failed to produce steps. We could test for the right kind of axis, and the right kind of data, but that’s a lot of work for a quick and dirty program.

Step Charts are included in the many custom chart types available in Peltier Tech Charts for Excel 4.0.

Click the button to open the Step Chart dialog, or click the dropdown arrow. The first menu item also opens the dialog, while the second will convert your existing chart to a step chart using an approach like the one above, but with more error checking and flexibility built into it. Actually, it converts a copy of your chart, in case you change your mind.

The Step Chart dialog lets you select your data range and specify its characteristics, then select a chart type.

The result looks great, but it’s much easier than the manual approach. The chart is built on a new worksheet because of the data manipulation required; below I’ve simply pasted the chart next to the original data.

Even if your X values are non-numeric text, the program will give you a step chart.

Visit Peltier Tech Charts for Excel 4.0 to purchase a license to this powerful Excel add-in.

The post Convert Line Chart to Step Chart with VBA appeared first on Peltier Tech Blog.

]]>“Nice” axis scales allow all of your data to be shown, and depending on chart type, improve resolution by minimizing white space between plotted data and the edges of the chart. These scales also allow for a reasonable number of tick marks and gridlines, and have tick marks at human-friendly intervals. A friendly scale is 20-25-30-35-40; an unfriendly scale is 17-24-31-38.

I’ve written before about how you can Calculate Nice Axis Scales in Your Excel Worksheet and Calculate Nice Axis Scales in Excel VBA. The problem with the first of these is that it requires setting up a block of cells do do intermediate calculations, and the problem with the second is that, well, it uses VBA. VBA is supported only for Windows and Mac, not online or mobile, and its use raises security concerns.

New features in Microsoft 365 (a/k/a Office 365) give us the flexibility of VBA functions, but allow the formulas to reside in the worksheet, using worksheet functions. In this tutorial I will make use of `LET`

, `LAMBDA`

, and Dynamic Arrays.

I’ve cleaned up the workbook I used for this exercise, and you can download it from this link:

Calculate Nice Axis Scales with LET and LAMBDA.xlsx

The `LET `

function allows you to define names in the formula for values or intermediate calculations. The syntax is:

`=LET(Name1,NameValue1,Result)`

`=LET(Name1,NameValue1,Name2,NameValue2,...,Result)`

Each name must begin with a letter but not look like a cell reference (e.g., `ABCD123`

is valid, but `A1`

is not). Each name value as well as the result can be a cell reference (`A1`

is allowed here), a constant (`3.14159`

), or a calculation (`2*Input`

). You can specify up to 126 pairs of names and name values.

A couple illustrations:

`=LET(Input,A1,2*Input)`

`=LET(Side1,A1,Side2,A2,SQRT(Side1^2+Side2^2))`

`=LET(Side1,A1,Side2,A2,SumSquares,Side1^2+Side2^2,SQRT(SumSquares))`

These `LET`

formulas are more complicated than a regular formula would be, but `LET`

allows a given value to be looked up or calculated once and reused in the calculation.

`LET`

was introduced several months ago, and has now propagated out to all Microsoft 365 subscribers.

The `LAMBDA`

function is similar to `LET`

, and can incorporate `LET`

, but it is much more powerful. You define inputs and intermediate calculations, and output a result. Syntax:

`=LAMBDA(ParameterOrCalculation1,ParameterOrCalculation2,...)`

You can specify up to 253 parameters.

Repeating the illustrations above:

`=LAMBDA(Input,2*Input)`

`=LAMBDA(Side1,Side2,SQRT(Side1^2+Side2^2))`

`=LAMBDA(Side1,Side2,LET(SumSquares,Side1^2+Side2^2,SQRT(SumSquares)))`

You can’t use these as shown in the worksheet. You must either define a name using the LAMBDA formula as its definition, then call it from the worksheet, or you can enter the LAMBDA followed by a comma-delimited list of parameters in parentheses.

So I can define a name called `DoubleFunction`

, using the first `LAMBDA`

formula above, and call it in the worksheet like this:

`=DoubleFunction(A1)`

You can define another Name, call it `Hypotenuse`

, and use the second or third `LAMBDA`

formulas, and call it like this:

`=Hypotenuse(A1,A2)`

Alternatively, the first `LAMDBA`

formula above can be used in a worksheet cell like this:

`=LAMBDA(Input,2*Input)(A1)`

and the second like this:

`=LAMBDA(Side1,Side2,SQRT(Side1^2+Side2^2))(A1,A2)`

Using `LAMBDAs`

in the worksheet isn’t useful by itself, though it makes it easy to test your formulas before using them to define a Name. And if you forget to append the parameters to the formula, you will get a `#CALC!`

error in the cell.

You unleash the power of `LAMBDAs`

when you use one in a Name, because you can then use this as a function anywhere in the workbook, without repeating the function each time. When the formula needs to be modified, it only needs to be modified in the Name definition, not in each place it is used, so each time it is used, it uses the same formula.

`LAMBDA`

has been introduced to Office Insiders, so regular Microsoft 365 subscribers will have to wait a few months or so.

Dynamic Arrays have now been part of Microsoft 365 for some time. These formulas provide many great capabilities, and the internet is loaded with great examples. I will use them to output three values (axis minimum, axis maximum, and axis major unit) using only one formula.

I can use `CHOOSE`

to convert separate values into a Dynamic Array. The syntax is:

`=CHOOSE(index_num,value1,value2,value3,…)`

If `index_num`

is 1, `CHOOSE`

returns `value1`

, in `index_num`

is 2, it returns `value2`

, etc. If I enter an array of values for `index_num`

, `CHOOSE`

returns an array, as follows:

`=CHOOSE({1;2},Value1,Value2)`

returns a vertical array of `Value1`

and `Value2`

, because `{1;2}`

is a vertical array.

In the same way,

`=CHOOSE({1,2},Value1,Value2)`

returns a horizontal array of `Value1`

and `Value2`

, because `{1,2}`

is a horizontal array .

Alternatively, I can wrap my function in `TRANSPOSE()`

to convert a vertical array to horizontal. This is the better approach, so I only need to create one function to produce my output, and I can use `TRANSPOSE`

only if I need that output in a horizontal array.

`=CHOOSE({1;2},Value1,Value2)`

When I compute my axis scales, I usually output them in a column of two or three cells.

The algorithm to calculate axis scales has a few parts. First, we need to check that the minimum is in fact less than the maximum, and if not, switch them; if the minimum and maximum are the same, we need to move them away from each other.

Given appropriate minimum and maximum values, we need to move them slightly further apart to ensure that there is a small margin between all points and the edges of the plot area.

Finally, given the magnitudes of these values, we need to select an appropriate major scale unit (tick spacing), then find the minimum and maximum axis scale values that are a multiple of the major scale unit.

Using names `min_0`

and `max_0`

for the inputs in A1 and A2, I can use this `LET`

formula to list the inputs in ascending order, :

`=LET(min_0,A1,max_0,A2,CHOOSE({1;2},MIN(min_0,max_0),MAX(min_0,max_0)))`

I can use this LAMBDA in the worksheet to get the same result:

`=LAMBDA(min_0,max_0,CHOOSE({1;2},MIN(min_0,max_0),MAX(min_0,max_0)))(A1,A2)`

Or I can define a Name (Formula tab > Define Name), enter a function name like `MinMax0`

, and enter this formula in the Refers To box:

`=LAMBDA(min_0,max_0,CHOOSE({1;2},MIN(min_0,max_0),MAX(min_0,max_0)))`

Editing formulas in the Define Name dialog is no fun, since the box is small and there is no IntelliSense. But once you have it working in the worksheet with the arguments in parentheses at the end, you can just paste in the formula.

The best part is that you can now use this function anywhere, like this:

`=MinMax0(A1,A2)`

Since I want to use the adjusted min and max in subsequent calculations, I want them in the form of additional names within the formulas. This means I’ll have to use `LET`

within my `LAMBDA`

.

In my `LET`

formula, I’ll introduce two new names, `min_1`

and `min_2`

for the intermediate calculations:

`=LET(min_0,A1,max_0,A2,min_1,MIN(min_0,max_0),max_1,MAX(min_0,max_0),CHOOSE({1;2},min_1,max_1))`

These formulas are getting longer, but you can accommodate them in the Formula Bar. You can type Alt+Enter to insert a new line in the formula, then pad each line with space characters to provide indenting.

My LAMBDA with the nested LET looks like this in the worksheet:

`=LAMBDA(min_0,max_0,LET(min_1,MIN(min_0,max_0),max_1,MAX(min_0,max_0),CHOOSE({1;2},min_1,max_1)))(A1,A2)`

Here it is nicely formatted in the Formula Bar:

And I can define a new Name called `MinMax1`

that refers to this `LAMBDA`

:

`=LAMBDA(min_0,max_0,LET(min_1,MIN(min_0,max_0),max_1,MAX(min_0,max_0),CHOOSE({1;2},min_1,max_1)))`

And I call it like this:

`=MinMax1(A1,A2)`

Now I can adjust my sorted minimum and maximum values to allow for a small margin, while also adjusting for equal min and max values. I need to add names `min_2`

and `max_2`

for the extended calculations.

Here is my new `LET`

formula:

`=LET(min_0,A1,max_0,A2,min_1,MIN(min_0,max_0),max_1,MAX(min_0,max_0),min_2,IF(min_1=0,0,IF(min_1>0,MAX(0,min_1-(max_1-min_1)/100),min_1-(max_1-min_1)/100)),max_2,IF(max_1=0,IF(min_1=0,1,0),IF(max_1<0,MIN(0,max_1+(max_1-min_1)/100),max_1+(max_1-min_1)/100)),CHOOSE({1;2},min_2,max_2))`

or if it’s easier to read:

`=LET(`

`min_0,A1,`

`max_0,A2,`

`min_1,MIN(min_0,max_0),`

`max_1,MAX(min_0,max_0),`

`min_2,`

`IF(min_1=0,`

`0,`

`IF(min_1>0,`

`MAX(0,min_1-(max_1-min_1)/100),`

`min_1-(max_1-min_1)/100`

`)`

`),`

`max_2,`

`IF(max_1=0,`

`IF(min_1=0,1,0),`

`IF(max_1<0,`

`MIN(0,max_1+(max_1-min_1)/100),`

`max_1+(max_1-min_1)/100`

`)`

`),`

`CHOOSE({1;2},min_2,max_2)`

`)`

My new `LAMBDA`

:

`=LAMBDA(min_0,max_0,LET(min_1,MIN(min_0,max_0),max_1,MAX(min_0,max_0),min_2,IF(min_1=0,0,IF(min_1>0,MAX(0,min_1-(max_1-min_1)/100),min_1-(max_1-min_1)/100)),max_2,IF(max_1=0,IF(min_1=0,1,0),IF(max_1<0,MIN(0,max_1+(max_1-min_1)/100),max_1+(max_1-min_1)/100)),CHOOSE({1;2},min_2,max_2)))(A1,A2)`

or

`=LAMBDA(`

`min_0,max_0,`

`LET(`

`min_1,MIN(min_0,max_0),`

`max_1,MAX(min_0,max_0),`

`min_2,`

`IF(min_1=0,`

`0,`

`IF(min_1>0,`

`MAX(0,min_1-(max_1-min_1)/100),`

`min_1-(max_1-min_1)/100`

`)`

`),`

`max_2,`

`IF(max_1=0,`

`IF(min_1=0,1,0),`

`IF(max_1<0,`

`MIN(0,max_1+(max_1-min_1)/100),`

`max_1+(max_1-min_1)/100`

`)`

`),`

`CHOOSE({1;2},min_2,max_2)`

`)`

`)(A1,A2)`

I can define a Name called `MinMax2`

that uses this `LAMBDA`

, without the trailing `(A1,A2)`

.

I need to take the latest calculations, perform some intermediate calculations using names `delta`

, `power`

, and `factor`

, do a quick lookup, then calculate `min_3`

, `max_3`

, and `major_3`

to be output into a three-cell vertical array.

The final `LET`

:

`=LET(`

`min_0,A1,max_0,A2,`

`min_1,MIN(min_0,max_0),`

`max_1,MAX(min_0,max_0),`

`delta,IF(min_1=max_1,9,max_1-min_1),`

`min_2,`

`IF(min_1=0,`

`0,`

`IF(min_1>0,`

`MAX(0,min_1-delta/100),`

`min_1-delta/100`

`)`

`),`

`max_2,`

`IF(max_1=0,`

`IF(min_1=0,1,0),`

`IF(max_1<0,`

`MIN(0,max_1+delta/100),`

`max_1+delta/100`

`)`

`),`

`power,LOG10(max_2-min_2),`

`factor,10^(power-INT(power)),`

`major_3,`

`XLOOKUP(`

`factor,`

`{0,2.1,5,10},`

`{0.2,0.5,1,2},,`

`-1`

`)*10^INT(power),`

`min_3,major_3*INT(min_2/major_3),`

`max_3,`

`major_3*`

`IF(max_2/major_3=INT(max_2/major_3),`

`max_2/major_3,`

`INT(max_2/major_3)+1`

`),`

`CHOOSE({1;2;3},min_3,max_3,major_3)`

`)`

And the final `LAMBDA`

:

`=LAMBDA(`

`min_0,max_0,`

`LET(`

`min_1,MIN(min_0,max_0),`

`max_1,MAX(min_0,max_0),`

`delta,IF(min_1=max_1,9,max_1-min_1),`

`min_2,`

`IF(min_1=0,`

`0,`

`IF(min_1>0,`

`MAX(0,min_1-delta/100),`

`min_1-delta/100`

`)`

`),`

`max_2,`

`IF(max_1=0,`

`IF(min_1=0,1,0),`

`IF(max_1<0,`

`MIN(0,max_1+delta/100),`

`max_1+delta/100`

`)`

`),`

`power,LOG10(max_2-min_2),`

`factor,10^(power-INT(power)),`

`major_3,`

`XLOOKUP(`

`factor,`

`{0,2.1,5,10},`

`{0.2,0.5,1,2},,`

`-1`

`)*10^INT(power),`

`min_3,major_3*INT(min_2/major_3),`

`max_3,`

`major_3*`

`IF(max_2/major_3=INT(max_2/major_3),`

`max_2/major_3,`

`INT(max_2/major_3)+1`

`),`

`CHOOSE({1;2;3},min_3,max_3,major_3)`

`)`

`)(A1,A2)`

While it’s easier to read in expanded form, this last one doesn’t even fit on my large secondary monitor. It’s only about 500 characters, though, and Excel’s limit is 8192 characters in a formula.

That last `LAMBDA`

can be used to define a Name called `MinMax3`

, but I decided that `MinMaxMajor`

is a better name.

The next thing I want to try is to refactor this last `LAMBDA`

function into several smaller ones. The first will convert the separate inputs into an array, the second will ensure the values in the array are sorted, the third will add the small margins, the fourth will perform the `XLOOKUP`

, and the fifth will compute the axis scale parameters.

But my brain is tired, so I’ll do this another day.

- Calculate Nice Axis Scales with LET and LAMBDA
- Calculate Nice Axis Scales in Your Excel Worksheet
- Calculate Nice Axis Scales in Excel VBA
- Chart UDF to Control Axis Scale
- How Excel Calculates Automatic Chart Axis Limits
- Reciprocal Chart Axis Scale
- Custom Axis Labels and Gridlines in an Excel Chart
- Custom Axis, Y = 1, 2, 4, 8, 16
- Logarithmic Axis Scales
- Link Excel Chart Axis Scale to Values in Cells
- Consistent Axis Scales Across Multiple Charts
- Gantt Chart with Nice Date Axis
- Select Meaningful Axis Scales
- Bar Chart Value Axis Scale Must Include Zero

The post Calculate Nice Axis Scales with LET and LAMBDA appeared first on Peltier Tech Blog.

]]>I’m pleased to announce a major upgrade, to **Peltier Tech Charts for Excel 4.0**. The previous major upgrade to 3.0 was over five years ago, in September 2015, and the last minor build release was in January of this year.

The new package lists for the same $99 as its predecessor, but there is a $20 discount for Cyber Monday, which extends through December 12. Enter the code `Cyber2020`

in the shopping cart to take advantage of this discount.

Additionally, if you are currently using the now-obsolete version of the software, you can upgrade for half price. Email Peltier Tech to request your discount code.

You can skip the following description and go directly to the new product’s web page: **Peltier Tech Charts for Excel 4.0**.

The interface has been redesigned, and all dialogs have been given a facelift. You can see this in the dialog below for what I call Layer Charts, which is a new chart type. Layer Charts separate each series in a chart into its own panel, so they do not obscure each other, and so each has its own baseline.

I wrote about this kind of chart many years ago in Stacked Charts With Vertical Separation.

Another new chart type is a Step Chart, which uses data points to describe horizontal and vertical steps, rather than connecting points directly. My tutorial Step Charts in Excel described these.

Both of these new chart types allow you to convert an existing chart of suitable type into the new type, rather than finding the data and creating a new chart from scratch.

I’ve introduced yet another Waterfall Chart variation, which allows you to specify free-standing values which are independent of the rise and fall of the intermediate waterfall values. Here is the Extended Waterfall Chart:

Many existing charts and related features have been enhanced. For example, Marimekko and Cascade charts now look better when the input values contain both positive and negative values.

In addition to the obvious, visible changes, there have been many improvements behind the scenes that make the program work better and in some cases faster.

You can read more about this new software, and purchase your own license, at **Peltier Tech Charts for Excel 4.0**. Don’t forget the coupon code `Cyber2020`

to get a $20 discount.

The post Charts for Excel Upgrade appeared first on Peltier Tech Blog.

]]>A recent post in the Mr Excel forum requested a conditional donut chart. The chart had to have 31 sections, one for each day of the month, and each section had to be colored according to a code in a worksheet cell. I started working on a solution, but I ran out of time. I tried returning to the post later, but I can’t find it, and Google can’t find it either.

It was a great question, though, so I’ll reproduce a simpler variation of it here.

The donut chart must have eight slices, and the color of each slice must reflect a value in the worksheet.

The required color for each slice is shown in the worksheet range to the left of the chart. I have colored the cells red, yellow, and green based on whether they contain the letter “R”, “Y”, or “G”. This is easy to do in the worksheet, but there is no mechanism to change colors like this in a chart.

You could do this with VBA, of course. I’ve written a tutorial with the necessary code: VBA Conditional Formatting of Charts by Category Label. But this is a problem which does not require VBA. You can make it work just as well with a handful of clever worksheet formulas, without requiring that someone enable macros in your workbook, and without coming up with a mechanism to run the code when the data changes.

While our conditional donut chart must have eight *visible* slices, it will have three times that amount, and two thirds will be hidden.

I’ve reproduced the slice and color data in this screenshot. You can’t make a meaningful chart of this data, of course, so below it I have constructed an expanded data region. The Slice column repeats the slice number three times for the three colors, and the Color column repeats all three colors for each slice, not just the one which appears in the first range.

Next I have added a column to the lower range, with a header of “Ones” and with a value of 1 in every cell. I can finally make a donut chart. With the chart selected, you can see from the data highlights in the worksheet that both the Slice and Color columns are used for category labels, and the Ones column is used for the donut slice values. The category labels are seen in the legend, and the donut slices are all the same size, 1/24 of the donut circumference.

Let’s do a little formatting. Double click on one of the donut slices to open the Format Data Series task pane, and under Series Options, change the Donut Hole Size from the default 75% to 50%. Click the plus icon that floats alongside the chart, and check Data Labels. Double click on one of the labels to open the Format Data Labels task pane, and under Label Options, check Category Name and uncheck Value. Finally, delete the legend.

I started formatting the slices by first making them all gray. This made it somewhat easier to see where I was in the process. To change all slices to the same color, click once on a slice to select all slices, then choose the color in any convenient way (Format Data Series task pane, ribbon control, or right-click pop-up). At the same time, keep the white border between slices, but change it from 1.5 points to 0.75.

Now, slice by slice, apply the colors. Here I have colored the slices in the first sector of the chart. To color an individual slice, click once to select all slices, then click to select a single slice, then apply the color.

Now finish coloring the slices. Here’s a trick that will work faster than clicking on and choosing its color, then clicking on the next, etc. Single click twice on the first slice to select it, then apply its red color. Then hold down Ctrl while you click the right arrow key three times, which brings you to the next red slice, and click the F4 function key. This should apply the same red color to this slice. Repeat the Ctrl+Right Arrow and F4 sequence until the red slices are formatted, then repeat for the yellow and green slices.

Now the chart is fully formatted. We just need to build the formulas that show and hide slices according to the data in the upper range.

Add two more columns to the second data range, and insert the headers “Labels” and “Values”. Ignore the Labels for now, but enter this formula into the range F13:F36:

`=IF(INDEX($C$3:$C$10,B13)=C13,1,0)`

The result is a column of zeros and ones, with two zeros and a single one for each slice number, according to the colors specified in C3:C10.

Click on the chart and notice the highlighted ranges in the worksheet. Drag the edge of the blue region so the highlight includes the Values column instead of the Ones column. The red highlight should move with the blue highlight if you had selected the chart or the plot area (which is the square that includes the donut). If you had selected one or all of the donut slices, the red highlight would not have followed the blue, but you can simply move it yourself.

Note that there is now only one visible slice for each of the eight sectors of the donut, and these are colored according to the values in C3:C10. Unfortunately we have a bunch of overlapping labels that we don’t want.

These labels correspond to the hidden slices, which are still in the chart. We need another formula that will let us show only the visible slice numbers. So enter this formula into the range E13:E36:

`=IF(F13=1,B13,"")`

Now select the pie slices (all of them or just one), drag any corner of the purple highlight so it occupies a single column, then drag the edge of the purple highlight so it includes the Labels column. The extraneous labels are gone, and we only see the slice numbers.

This is our conditional donut chart. You can select and drag the lower data range off to the side of the chart; you can even cut it and paste it onto another worksheet, out of the way. Move the chart closer to the data it indirectly reflects.

When the colors for each slice are changed in the worksheet, the chart instantly reflects the change.

Change the colors in the worksheet again, and the chart updates again.

And again, and again,…

- Conditional Formatting of Excel Charts
- Conditional XY Charts Without VBA
- Invert if Negative Formatting in Excel Charts
- Highlight Min and Max Data Points in an Excel Chart
- Split Data Range into Multiple Chart Series without VBA
- Conditional Formatting of Lines in an Excel Line Chart Using VBA
- VBA Conditional Formatting of Charts by Value and Label
- VBA Conditional Formatting of Charts by Series Name
- VBA Conditional Formatting of Charts by Value
- VBA Conditional Formatting of Charts by Category Label

The post Conditional Donut Chart appeared first on Peltier Tech Blog.

]]>You can spend five minutes fixing up your data, or five hours working on a chart with the wrong data.

A user on the Mr Excel forum asked about creating a chart from unsuited data. He asked, “Is there a way to do this without modifying the data table?”

My reply started with “I know you don’t want to hear this, but your data is in the wrong arrangement.” Fortunately, the data was simple and the arrangement was consistent, so it was easy to create a *chart staging area* to prepare the data for the chart.

I’ve recreated the data below. Obviously someone went to a lot of trouble laying out the data and formatting it just right. This becomes a problem when people become too attached to their fancy display.

The objective is to plot each product by month, for a single year. The chart needs three lines, one for each product. You can tell that it’s not possible using this data directly.

In general, one would have to rearrange the data, feed it into a pivot table, and create a pivot chart from that. But sometimes you’re lucky enough to be able to write a few formulas instead.

Here is the chart staging area I was able to construct. I listed the months down the side (N3:N14) and the products across the top (O2 to Q2). I put the year in N2, and highlighted it with light gold so a user knew it was important.

Rather than require the user to type in a new year, and possible type an invalid year, I set up data validation in cell N2. Click Data Validation on the Data tab, select List from the Allow dropdown, then select the years in the first row of the original data range (C2:L2).

The magic formula is in cell O3 of the chart staging area. It’s a relatively simple INDEX formula with a few MATCHes to find the right cell of the original data range.

`=INDEX($C$3:$L$38,`

`MATCH($N3,$A$3:$A$38,0)+MATCH(O$2,$B$3:$B$5,0)-1,`

`MATCH($N$2,$C$2:$L$2,0))`

I created a dynamic title for the chart with this formula in cell N17:

`=N2&" Product Sales by Month"`

Finally I selected the data range and inserted a line chart.

Note that I have included the blank row below the chart staging area. This adds a blank category to the chart with no data points, creating space for the data labels I added to the series. These labels, with font color to match the data points, make it easier to identify the data than the legend, so I deleted the legend.

To get the chart title into the chart, I selected the chart title, typed an equals sign in the formula bar, then clicked on cell N17 and pressed Enter. I also aligned the title with the edge of the plot area.

Sometimes (most times?) your data will be too complicated or too irregular to use simple INDEX/MATCH formulas to build a staging area. In the old days we would rely on copy and paste and a lot of one-off lookup formulas, but I’ll show how easy it is to stage this data with Power Query.

First select your data (or one cell in the data range) and on the Data tab of the ribbon, in the Get & Transform Data group, click From Table/Range. If the data isn’t already in a Table, you will be prompted to create one. Make sure the data is correctly identified, and check the My Data Has Headers box. The Table is created, and the Power Query editor opens up and shows the data.

I often remove the step of the query that assigns variable types, because I take care of that later. I selected the first two columns, right clicked in the headers, and selected Unpivot Other Columns.

This gives me the four-column data arrangement I will need for a Pivot-Table-based chart staging area. Here is where I changed the data types of the columns: Month and Product to text, Attribute to whole number, and Value to decimal number.

Then I renamed Attribute to Year and Value to Sales.

Finally I dragged the Year column to the first column of the table.

I clicked Close & Load and landed the query into a Table on a new worksheet.

It’s always better to start with an orderly Table as your data, and base your chart data and any tabulated displays on this.

From the Table above, I created a Pivot Table on a new worksheet, with Year in the Filters area, Month in the Rows area, Product in the Columns area, and Sales in the Values area.

I created a Pivot Chart, then added a Slicer based on the Year field of the Pivot Table. It’s very easy to select a year, even easier than with the data validation cell dropdown I used in the first approach.

With the data in a well-structured Table, we can create another pivot table to mimic the original data layout. I put Year in the Columns area, Month and Product in the Columns area, and Sales in the Values area. I chose the Repeat All Item Labels in the Report Layout dropdown on the Pivot Table Design tab.

The same effort that went into formatting the original data can be reproduced on this Pivot Table.

The post Prepare Your Data in a Chart Staging Area appeared first on Peltier Tech Blog.

]]>Often when I plot data in a column, line, or area chart, I want to plot additional points on that chart. For this I use an XY Scatter type series for the extra data points. These added points may be used for additional labels or other purposes.

There are a few tricks for positioning of XY data points. I use these tricks in many of my tutorials, and I describe them in the protocol. But it’s probably good to have a single dedicated page, like this, dedicated to precision positioning of these extra points.

Long ago, when this blog was very young and I was not so old, I wrote Stacked vs. Clustered, which compared stacked and clustered column charts and described what each is well suited for. I included the following throwaway image; well, I considered it a throwaway, until a reader asked how I added the lines and markers.

Thanks for the nudge, dear reader. What follows is the protocol for adding those markers so precisely to the chart.

Here is my data and my starting column chart.

What I could do, of course, is add an XY Scatter series on the secondary axis, then adjust my X and Y values and the secondary axis limits until the points are positioned appropriately. But that is tedious to do in the first place, and if the original data changes, the tweaked X and Y values would probably need readjusting.

But Excel is nothing if not extremely flexible. I can plot my XY data points on the same primary axis as the column data, with my tweaked data points based on the column chart configuration. This means that all XY points will stay in the same position relative to the column chart data, with a minimum of adjustment.

The two charts below illustrate how scatter chart X values can be calculated based on the column chart’s configuration.

First of all, each category of the column chart has a number, from 1 to the total number of categories. The chart below has categories numbered from 1 to 3. If I use these numbers for my scatter chart X values, the points would be centered on the categories (between the orange and blue columns).

I merely have to calculate how far to the left and right of each category I need my XY data points to be. I need to know my Gap Width and I need to know how many series there are.

The Gap Width is a number, stated as a percentage of the width of a column, that tells me how wide the gap is between the clusters of columns. A Gap Width of 100 means that the white space between clusters is 100% as wide as a column. In the schematic, I have indicated that the columns are 100 percent wide, so the distance between cluster centers is 100 times the number of series, plus the Gap Width. In this case, the Gap Width is 200%, so my categories are 600% from center to center.

The X axis of a line chart or an area chart work in exactly the same way. A bar chart is a different story; the approach is similar, but it is more complicated, and you do need to use secondary axes.

The amount my data points need to be offset is easy to calculate. For example, to center a point on the tallest bar in the chart, I start with the category position, then add half the bar’s width divided by the distance to the next category:

`X = 1 + 50 / (4 * 100 + Gap Width)`

which turns out to be 1.08333333

Since I want my points at the top of the columns, I use the same Y values for the XY points.

Below is my data for the XY points.

I put the name of the category in the first column, and the offset (in column width percentages) from the center of the category in the second. The formula in cell C9 is

`=MATCH(A9,$B$1:$D$1,0)+B9/(4*100+$F$9)`

The MATCH function looks up the category name and returns the number of the category (1 for North, etc.). Cell F9 contains the Gap Width.

I’ve inserted a blank row between categories, so there is a gap in the line between categories.

Copy the XY data in C9:D22. Select the chart, and use Paste Special from the Home tab of the ribbon. Click on New Series, Series in Columns, Series Names in First Row, and Categories in First Column.

Because it’s a column chart, Excel adds the data as another set of columns.

Right click on any series in the chart, and choose Change Series Chart Type. Scroll to the bottom of the list of series and select the added series (“Connect”). Use the Chart Type dropdown to select the XY Scatter with Markers and Straight Lines option (*NOT* a Line Chart option), and uncheck the Secondary Axis checkbox.

This results in the XY data points we want, except the lines and markers are in an ugly shade of green (well, for this particular color scheme).

But we know how to fix that.

If I change the Gap Width in my chart, from 200% above to 75%, the XY data points are no longer lined up.

But all that is needed is to change the Gap Width in cell F9 of my calculation range; the X values recalculate and the chart looks great again.

The post Precision Positioning of XY Data Points appeared first on Peltier Tech Blog.

]]>Microsoft is partnering with NASA to bring you the **Day of Data**, where you can explore how data powers astronauts, space missions, and our world! Join us over the next few weeks to celebrate **World Spreadsheet Day** and 20 years of humans living and working aboard the **International Space Station**. This is the place to find space-themed trivia, an exciting visualization challenge, and live events with an astronaut!

For more info, check out Microsoft Tech Community blog post: **Celebrate World Spreadsheet Day with the Day of Data!**

The post Welcome to the Day of Data! appeared first on Peltier Tech Blog.

]]>Last week, I wrote about Watching my Weight with SPC (Statistical Process Control).

I followed that post with with Dynamic Arrays, XLOOKUP, LET where I showed off some new Excel features that made it easy to manipulate my data so I could plot the data in a box plot.

Following the theme of *Excel is Evolving, Are You?* I’ve decided to show how to use Power Query to perform these same manipulations.

My data data is in a two-column Table of dates and corresponding weights; I’ve named the Table *DataDateWeight*. In my last article, I used a combination of Dynamic Arrays, XLOOKUP, and the new LET function to build a table in the worksheet, starting in cell D2, which has month as column headers and day numbers in the first column of each row. This will produce a grid of weights with a few missing values. I can feed this directly into Peltier Tech Charts for Excel, the VBA add-in I use to make box plots.

*Power Query* is a powerful set of tools that let you import data from a variety of places and then manipulate that data to make it useful. In their infinite wisdom, Microsoft has incorporated Power Query into Excel (as well as Power BI); in their infinite ability to confuse, they have renamed the Excel version *Get and Transform*. They’re really synonyms, since what Power Query does is Get and Transform data.

Power Query makes up the left portion of Excel’s Data tab. If you click the *Get Data* down arrow, you expose a cascading menu of data sources, including *From Table/Range*, which we’ll use to manipulate our data which is already in Excel.

You can also just click the *From Table/Range* button on the ribbon.

This opens up the marvelous and at first scary Power Query Editor. We see the contents of our Table in the editor window. Over in the task pane on the right, I have renamed the Query to *DateWeight* (the default query name is the name of the source Table).

The Applied Steps section in the task pane shows the contents of this query that I’m writing. So far we see *Source* and *Changed Type*. When we started, Power Query used the Table as its source, then applied its best guess at what data type was present in each column. Looks good, as we can see from the *Date* and *123* (whole number) icons in the header row.

The formula bar shows the current operation, in a specialized language called *M Code*. For the most part, you don’t need to be too concerned with these formulas, at least not until you’ve become familiar with Power Query.

At any time, I can go back and click on a previous step, and Power Query will update the window to show the state of the data at that step, with that step’s *M Code* in the formula bar. I can also go back and insert or remove a step, taking care not to break my whole query in the process.

I need a column with the day of the month, and I need to transform the Date column to the month abbreviations. Then I need to pivot the month abbreviations so I end up with one column for each month.

Let’s start with the day of the month. Click *Custom Column* on the *Add Column* tab of the ribbon.

This pops up a Custom Column formula editor.

Power Query has an amazing IntelliSense system, as you’ll see.

First I changed the new column name to *Day*. Then I typed *day *after the equals sign. Power Query shows a list of possible commands matching the word *day*, and I scrolled down until I found *Date.Day*.

I clicked on *Date.Day*, which Power Query inserted into the formula, and it popped up a description of what I needed to use in this formula. There’s also a hint at the bottom of the window next to the yield icon. What I need is the Date column, which appears in the list on the right.

Double clicking on the *Date* column in the list enters it in the formula. Power Query updates the yield icon warning to say I need a closing parenthesis. Actually it says *Token RightParen expected*, but pretty soon you’ll learn what these messages mean.

Type the closing parenthesis, and the icon changes to a green checkmark, and the message becomes surprisingly comprehensible, *No syntax errors have been detected.*

Click OK and the column is added. Congratulations, you’ve just written your first statement in *M Code*. Note the new *Added Column* entry under *Applied Steps*.

The *abc* icon in the new column header indicates that the Day column is formatted as text. Right click the column header, choose *Change Type* > *Whole Number*.

The icon changes to the *123* whole number icon. That’s another *M Code* statement, by the way. You didn’t even know you were writing it, but Power Query records every step you take, so you can repeat the query, or edit it as your model evolves.

We can transform our *Date* column in place. On the *Transform *tab, click on *Date*, then choose *Month *> *Name of Month*.

Our dates have been converted into month names, and the data type icon has changed from a miniature calendar to the *abc* text icon. Nice, now let’s abbreviate it. On the Transform tab, click *Extract* > *First Characters*.

In the *Extract First Characters* dialog, enter 3, then click OK.

The month names are now abbreviated.

With the Date column still selected, click on *Pivot Column* on the *Transform* tab.

The *Pivot Column* dialog asks which column to use for values. If we made a Pivot Table in regular Excel, *Month* would be in the *Columns* area, *Day* in the *Rows* area, and *Weight* in the *Value* area. So select *Weight *and click OK.

Power Query pivots our months into column headers, and it looks like we’re done with our query. The column data types are all *123* whole numbers, and everything looks great.

Click on the *Advanced Editor* button on the *Home* tab.

The *Advanced Editor* shows our entire query in *M Code*. If we wanted to and if we were smart enough, we could edit the code here.

It’s time to put the data back into Excel, so we can do something useful with it. The first button on the *Home* tab is *Close & Load*. We’re going to load the query into a Table on our worksheet. Out of the box, Power Query’s default is to load into a Table, but you can change your default. My default is to load the data into the workbook’s Data Model (like an advanced pivot cache) without a Table, so I’ll click *Close & Load To…*

In the *Import Data* dialog, I’ve clicked *Table* and selected the cell where I want my Table to go.

Click OK and wait a moment for Excel to refresh the query, and the Table appears. Power Query Tables use green formatting instead of the default Excel blue.

Now all I have to do is select the Table and insert my box plot.

The query does not immediately update when the data changes, so it’s not dynamic the way my formula-based approach from the previous tutorial. However, I can refresh the query using the Refresh button on the Ribbon, or by right-clicking on the Table and choosing Refresh from the pop up menu.

I can get to any queries in my workbook by clicking *Queries & Connections* on the *Data* tab, which pops up a task pane.

When I mouse over the query in the task pane, a preview of the query pops up, showing a portion of the query results and some information about the query. I can also edit or delete the query here, and take other actions. If a query is deleted, the table remains in the worksheet, but it is no longer linked to the query or its underlying data.

There are many places where you can read about Power Query. The following is an incomplete list of blogs and tutorials written by colleagues.

- Ken Puls, Excel Guru, Power Query Blog Archives
- John MacDougall, How To Excel, Power Query Archives
- Mark Proctor, Excel Off The Grid, Power Query Introduction
- Jon Acampora, Excel Campus, Power Query
- Gašper Kamenšek, Excel Unplugged, Power BI Archive
- Mynda Treacy, My Online Training Hub, Power Query Archives

The post Use Power Query to Manipulate Your Data appeared first on Peltier Tech Blog.

]]>Last week, I wrote my first blog post in a few months. I wrote about Watching my Weight with SPC (Statistical Process Control).

In the first comment, Derek asked for a box plot, so I quickly recompiled my data and built one.

Several new features of Office 365 made it very easy to mush my data around, and I realized that this is a good example to show them in action. I used Dynamic Arrays, the `XLOOKUP`

function, and the new `LET`

function.

Joe McDaid of Microsoft’s Excel team wrote about Dynamic Arrays in Dynamic Arrays and New Functions in Excel! These formulas break the old paradigm of one formula per cell. In classic Excel, you needed a formula for every value you calculated. Sure, we had array formulas, but you had to include all output cells together and enter an array formula in all of the cells at once. If you didn’t use the right number of formulas, or if your model expanded, you had to rebuild the formula. Ugh! Plus array formulas, amirite?

The new Dynamic Arrays allow you to enter a formula into one cell. Excel figures out how many cells you need, and “spills” the formula into the required range of cells. One formula, multiple cells. And with a retooling of Excel’s underlying grid, any formula can become a Dynamic Array.

In Preview of Dynamic Arrays in Excel, Joe tells us that Dynamic Arrays have rolled out to Office 365 subscribers, with users in the Semi-Annual (Targeted) channel getting them starting last month. I presume that users in the Semi-Annual channel getting them soon as well.

I used a few Dynamic Array Formulas in my example.

For years there has been a debate about which is better, `VLOOKUP`

or `INDEX/MATCH`

. I don’t know why there was a debate, because `INDEX/MATCH`

is clearly the superior choice. Like Zoolander, `VLOOKUP`

can’t turn left, and there were many things that could go wrong with your lookups. `INDEX/MATCH`

has its own problems as well, mostly because of the default `Match_Type`

setting which could lead to bad results.

In Announcing `XLOOKUP`

, again by Joe McDaid, we learned of this new lookup function that eliminates the perils and shortcomings of `VLOOKUP`

. It’s simpler to use and more flexible, and best of all, it begins with `XL`

. You can read documentation in `XLOOKUP`

Function. `XLOOKUP`

is on about the same release schedule as Dynamic Arrays.

According to Announcing `LET`

, the new function allows you to assign a name to a value or intermediate calculation at the beginning of the formula, then use this named argument in the main calculation. The `LET`

function improves performance by computing an intermediate calculation just once and allowing reuse of the computed value, and it helps readability of your formulas by allowing you to use descriptive labels for these intermediate calculations.

`LET`

is only available to Office Insiders at this writing, so you’ll have to be patient.

In my previous post, I examined fluctuations in my body weight between last September and this April. The data is in a two-column Table. I will construct a table in the worksheet, starting in cell D2, which has month and year as column headers and day numbers in the first column of each row. This will produce a grid of mostly weights with some missing values, and it feeds directly into Peltier Tech Charts for Excel, the VBA add-in I use to make box plots.

I need to show the months from September through April across the top. I will use the new `SEQUENCE`

function to make a list. In cell E2, I enter the formula

=SEQUENCE(1,8,9,1)

Which tells Excel to make a sequence with 1 row, 8 columns, start the sequence at 9, and increment by 1. And you see the Dynamic Array in E2:L2 filled with this sequence. If the active cell is within this range, the entire Dynamic Array range is highlighted as shown.

But Jon, you may say, you need the months to go up to 12, then start again at 1. And the answer is, no I don’t. A few rows below the first Dynamic Array, I have written a temporary formula to check the dates:

=DATE(2019,E2#,1)

Which tells Excel to give me a date using 2019 as the year, `E2#`

as the month, and 1 as the day. The hash or pound sign appended to `E2#`

tells Excel to use the Entire Dynamic Array that is anchored in cell E2. So I’ve entered this formula in cell E6 only, but it spills into the range E6:L6 to accommodate `E2#`

.

And we see that if I define a date using the year 2019 and the month 13, it uses 12 months to increment the year to 2020, and uses the leftover month, so I get January 2020. Well, you probably new that, but it’s a nice little trick.

In row 9 I repeat the previous formula, but with 31 days. Excel does the same thing with excess days, and converts September 31 to October 1, etc.

So Dynamic Arrays, pretty cool, right?

Now to get the dates I want, I’ll wrapt `SEQUENCE`

in the `DATE`

function:

=DATE(2019,SEQUENCE(1,8,9,1),1)

and Excel gives me the first of each month in the study.

Now let’s fill in the column of days. I’ll use another `SEQUENCE`

function in cell D3, an easier one this time:

=SEQUENCE(31,1,1,1)

Which is a sequence of 31 rows and 1 column, starting with 1 and incrementing by 1.

In cell E3, I entered a formula to calculate the dates for my lookups.

=DATE(YEAR(E2#),MONTH(E2#),D3#)

Meaning, give me the data using the year and month from the date `E2#`

and the day from `D3#`

. This new Dynamic Array fills up the whole rectangular range defined by the two Dynamic Arrays it references. Note the extra dates in the September, November, February, and April columns.

The dates are fine (except for those extras, which we’ll take care of soon). So let’s wrap the date in `XLOOKUP`

.

=XLOOKUP(DATE(YEAR(E2#),MONTH(E2#),D3#), Table6[Date], Table6[Weight], "")

I’ve put each argument onto its own line for clarity. Basically, I’m telling Excel to look for the date we’ve calculated, look in the Date column of Table 6, and if you find it, return the value from the Weight column of Table 6. And if you don’t find it, give me a cell that looks blank with “”.

And here are the weights, including some duplications: note that the 10/1/2019 weight of 167 appears at the top of the October column but also at the bottom of the September column.

So how do I eliminate those duplicates? Even though Excel computes `DATE(2019,9,31)`

using month 9, the result is month 10. So I’ll enter a formula that checks whether the month calculated in the lookup cell matches the month in the header cell: if so, perform the `XLOOKUP`

, but if not, return “”.

=IF(MONTH(DATE(YEAR(E2#),MONTH(E2#),D3#))=MONTH(E2#), XLOOKUP(DATE(YEAR(E2#),MONTH(E2#),D3#), Table7[Date],Table7[Weight],""),"")

Clever trick to skip those duplicates, Jon, but that’s an unwieldy formula. That big chunk `MONTH(DATE(YEAR(E2#),MONTH(E2#),D3#)`

appears twice, and we’ve all written monstrosities which has more than one chunk like this repeated more than twice.

So I’m going to take the new function `LET`

out for a spin. I’ll create an argument called `TheDate`

, and it will stand for `MONTH(DATE(YEAR(E2#),MONTH(E2#),D3#`

). Then in the calculation, I’ll use `TheDate`

where the big chunk was before.

=LET(TheDate,DATE(YEAR(E2#),MONTH(E2#),D3#), IF(MONTH(TheDate)=MONTH(E2#), XLOOKUP(TheDate,Table8[Date],Table8[Weight],""),""))

This formula is only a few characters shorter than the original, but it’s certainly a lot easier to read. And if I had to change something in the definition of `TheDate`

, I would only have to change it in one place, and not worry that I forgot to change it somewhere else.

Now all I have to do is select the data and insert my box plot.

The post Dynamic Arrays, XLOOKUP, LET – New Excel Features appeared first on Peltier Tech Blog.

]]>I’ve been working on a Statistical Process Control project for a client, building a workbook to automate construction of control charts. Years ago I wrote a tutorial called Introducing Control Charts (Run Charts). Many processes, in manufacturing, in business, or in nature, show fluctuations in their outputs. We can use Statistical Process Control (SPC) techniques to monitor these processes and ensure the fluctuations stay within expected limits.

I was looking for data to proof out the tool I was building, and I thought I could use my weight as a decent data set. My wife bought a new digital scale in 2006, and I’ve been weighing myself almost every day since then. And being an Excel jock, I put my measurements into a spreadsheet.

In the chart below, you can see how I fluctuated around 200 lb for over a decade. Then 20 months ago my wife and I joined Weight Watchers, and over the course of 6 or 8 months I lost 40 lb.

I thought looking at the past few months would be a good way to illustrate the use of SPC to track a process. This exercise will construct a series of control charts of this data.

I first learned about Statistical Process Control as a practitioner and as a trainer, while employed as a scientist/engineer for a large manufacturing corporation. One of the resources we had was a deceptively small book called Understanding Variation: The Key to Managing Chaos by Donald J. Wheeler.

*Understanding Variation: The Key to Managing Chaos by Donald J. Wheeler*

There are many other information sources about SPC and control charts. The National Institute of Standards and Technology (NIST) has an online Engineering Statistics Handbook, which has a chapter on Univariate and Multivariate Control Charts. Wikipedia has brief articles with many references covering SPC and Control Charts. And Google shows about 1.2 billion results for SPC and 0.5 billion results for Control Charts.

The first step is to identify the data and get it into a form where it can be analyzed. I decided to track from 1-Sept-2019 to 1-Feb-2020. Below is the top of my data worksheet, with a few calculations. The data is in three columns of an Excel Table named Table_1. The first two columns are date and weight, manually entered. The third column is Moving Range (MR), which we will use as a measure of variability in the data. The formula in cell C2 and filled down the Table column is

=IFERROR(ABS([@Weight]-OFFSET([@Weight],-1,0)),NA())

Essentially it determines the absolute value of my change in weight from one day to the next. Any error in the calculation (such as trying to subtract the column header) returns `NA()`

, or the `#N/A`

error.

I’ve calculated some values in a range beside the table, and I’ll explain them as I go along. The little table below the calculations show the formulas I’ve used. I’ve also named these cells as indicated, to make it easier to use the cells in formulas.

The next step is to plot the data. I’ve made two charts, one of my weight, the other of the calculated moving range. We look first for any obvious issues in the data, such as the spike late in September. If you look at the data above, apparently I gained 18 lb one day, and lost it the next. A more likely explanation is that I transposed digits in 168 and instead entered 186 in the worksheet. I’ll deal with this data issue soon, but for now I’ll continue with the SPC construction.

I added the calculated items as columns in my Table to make it easier to chart them. Having named the cells, I could use simple formulas in the Table: `=Mean`

in cell D2, `=LCL`

in cell E2, etc.

Among my calculations are averages of the weight data (Mean) and of the moving range data (MR Bar). Let’s add these as green horizontal lines to the weight and MR charts for reference.

So far, so good. Now let’s add a measure of “allowable” or “acceptable” variation. If the process is following statistical rules and its variability follows a normal distribution, we would use multiples of sigma, the standard deviation, to identify limits. According to the definition of a normal distribution, 68.3% of values fall within ±1 standard deviation of the mean, 95.5% fall within ±2 sigma, and 99.7% fall within ±3 sigma of the mean. By convention, 3 sigma is commonly used to identify acceptable variations.

We could measure the sample’s standard deviation (SD) directly, multiply it by 3, and use this to determine our limits. But using moving range is more robust, since outliers and non-normal distributions have a greater effect on sigma than on moving range.

The average moving range, or MR Bar, is used to calculate control limits. Less commonly, the median of the moving range is used to compute these limits.

First we determine MR UCL, which is the Upper Control Limit on the moving range, by multiplying the average moving range by 3.268. This is plotted to the moving range chart as a horizontal orange line (bottom chart below). We would expect 99.7% of our MR values to fall below this limit.

In the same way, we calculate the UCL and LCL (Upper and Lower Control Limits) of our individual data. We multiply MR Bar by 2.67, and add it to or subtract it from the mean to get our limits. These are plotted on our chart of individual values as horizontal orange lines (top chart below). Again, we expect 99.7% of our individuals to fall between these two lines.

These charts of measurements along with means and limits are called Control Charts. The chart of individual values is called an I Chart (no, not “eye chart”), and the moving range chart is the MR Chart. Together they are referred to as an IMR (sometimes ImR) Chart.

Our ±3 SD limits are shown in the dashed red lines below (they are calculated as LCL 2 and UCL 2). They fall pretty far outside the MR-based control limits. All points fall well within the SD-based limits, except for the one obvious outlier.

In fact, because the outlier causes two excessive moving range values, the MR-based limits are also too wide, and would lead us to accept points that would otherwise be out of control.

The spike in my weight in September is a “special cause” variation, because it is a one-off problem. Since it is obviously not a valid measurement, we can attribute it to a recording error, and ignore it. We want to remove this value from our moving range calculations, since it resulted in limits which were too wide.

The other variation we see in the timeline is “common cause” variation. It comes from variations in inputs, like exercise, meals, and other factors, which are themselves subject to normal variation.

In my adjusted table below (Table_2), I’ve added two columns. Wt 2 simply repeats the data in Weight, using the Table formula `=[@Weight]`

. I can replace any special cause deviation with `=NA()`

or `#N/A`

in this column. MR 2 uses the same formula as MR, based on the Wt 2 column:

=IFERROR(ABS([@[Wt 2]]-OFFSET([@[Wt 2]],-1,0)),NA())

Where there was one bad weight and two bad moving ranges, we now have `#N/A`

values in the table, which we can ignore in the chart and in our other calculations.

When we plot our individual and moving range values, the chart scales now show much narrower ranges, and there are no longer any obvious outliers: there is one high individual value and corresponding moving range in January, a few low weights in November, and a few high weights in December.

Let’s add our means and control limits, and see what we have. The MR chart shows the outlying value in late January, and four more moving range values that are just at the limit. In the individuals chart, the low values are within the limits (“in control”) while the high values we eyeballed before are above the UCL (“out of control”).

When values are out of control, we have to examine the process, to ensure that nothing is wrong with our process, and that nothing has changed. I can actually explain some of the variations. On Thanksgiving, I ran a “Turkey Trot” with my daughter, so for a couple weeks I was running more than my usual 3 miles a day: thus the few low values in November. And of course, the few values of 172 coincide with the Christmas and New Year’s holidays.

Below I’ve plotted the SD-based limits along with the MR-based limits. The limits are much closer to each other and closer to the mean than when the outlier was included in the calculations.

Here I’ve plotted these control limits as calculated with and without the outlier. The outlier had a substantial effect on the limits, especially on the SD limits.

When the variation fits a normal distribution, the two sets of limits are close together, with the MR-based limits wider sometimes and the SD-based limits wider other times. The larger the data set, the closer they will be.

For the rest of this analysis, I’ll ignore sigma and stick to MR-based calculations.

We can enhance our IMR Chart by highlighting points which are out of control. I’ve added two columns to my table to support this. Wt X has this formula

=IF(OR([@[Wt 2]]<=LCL,[@[Wt 2]]>=UCL),[@[Wt 2]],NA())

which shows the value from Wt 2 if it falls outside the control limits, and `#N/A`

otherwise. MR X has this formula

=IF([@[MR 2]]>=MR_UCL,[@[MR 2]],NA())

which again shows the value from MR 2 if it falls above the control limit, otherwise `#N/A`

.

I’ve added these columns to my IMR Chart as red/orange markers.

There are other features of control charts that indicate a process which is out of control. These are conditions which are not expected to be found in about 99.7% of cases. Here are a handful of common out-of-control rules; the first one is the one I highlighted above.

- One point beyond 3-sigma control limits
- 2 of 3 points outside 2-sigma on same side of mean
- 4 of 5 points outside 1-sigma on same side of mean
- 8 consecutive points outside 1-sigma on both sides of mean
- 15 consecutive points inside 1-sigma on both sides of mean
- 9 consecutive points on same side of mean
- 6 consecutive points moving in same direction
- 14 consecutive points alternating up and down

Advanced SPC software highlights any of these situations, in addition to the 3-sigma violations.

To show how to manage a growing data set, I added ten more weeks of my weight tracking.

Typically, when a process is determined to be steady, the limits are calculated and frozen, then these are extended forward. This is illustrated below: the frozen limits were calculated from September through February, indicated with solid lines, and extended into April, shown with dashed lines.

Where I had a few values above the UCL in December and January, I now had several below the LCL and only a few above the mean in February and beyond.

This is evidence of a process shift. Several of the additional rules mentioned at the end of the last section would have been triggered. Checking my exercise records gives us an explanation. For much of the period from September through January, I was running 3 miles a day, four or five days a week. The weather in February was rather mild, so I increased my mileage to about 3.5 miles a day, six days a week.

The control charts below show control limits calculated over the entire range. The process change is still noticeable, but it’s not as clear as with the frozen and extended limits above.

Another problem with continually recalculating limits is that the limits move over time. Points which were in control at one time may be pushed out of control by later measurements. A December point at 170 which was in control when the limits were frozen is now out of control under the newly computed limits.

We can overcome this concern by staging our analysis, that is, computing different limits for different subsets of our data. In my latest Table below, I’ve added a column named Stage, which contains 1 for the first stage and 2 for the second; these can be entered manually or with a formula, which for example increments the stage number on a given date. The control limits are computed separately for different stages.

The IMR Chart below shows a staged analysis. Stage 1 looks familiar; the UCL for both MR and Individuals are slightly lower because the large MR late in January coincided with the process change. The violations in stage 1 are the same as before; the few outliers in stage 2 would have been well within the stage 1 limits, but are actually above the stage 2 UCL.

It’s common practice not to compute a separate average moving range for all stages, especially if the stages have smaller numbers of points, but instead use an overall MR Bar. If I had used a combined measure of variation, stage 2 would not have had any outliers.

The post Watching my Weight with SPC (Statistical Process Control) appeared first on Peltier Tech Blog.

]]>VBA provides a number of ways to interact with users. `MsgBox`

lets you send a message to a user, and get a simple response (yes/no or okay/cancel) in return. `InputBox`

lets you ask a user to input some information. And you can design a whole UserForm as a custom dialog.

We will use an InputBox to get a range from our user. There are two kinds of `InputBox`

: regular `InputBox`

that accepts text input from the user, and `Application.InputBox`

, which lets you require a certain type of information from the user. Since we need a range, we’ll use `Application.InputBox`

. The required syntax is:

`Function InputBox(Prompt As String, [Title], [Default], _`

`[Left], [Top], [HelpFile], [HelpContextID], [Type])`

You supply a `Prompt`

to tell the user what you need, and optionally a `Title`

for the `InputBox`

and a `Default`

value. `Left`

and `Top`

position the `InputBox`

but haven’t worked in recent memory. `HelpFile`

and `HelpContextID`

are used if you have help content for the `InputBox`

. Finally, `Type`

describes the type of data you want; for our purposes, type 8 is used for a range.

Here is a simple VBA procedure that uses `Application.InputBox`

. It uses simple prompt, title, and default arguments, plus the 8 for range, and the in between arguments are left blank. When a range is returned, those cells are filled red.

Sub GetUserRange1() Dim UserRange As Range Set UserRange = Application.InputBox("Prompt", "Title", "$A$1", , , , , 8) UserRange.Interior.Color = vbRed End Sub

When the code runs, here is how it looks. The default range is highlighted with a dark dashed outline.

When another range is selected, its address appears in the InputBox, and the new range is highlighted a dark dashed outline.

When we then click OK, the selected cell is filled red.

If we try to enter anything that isn’t a range, we get a warning.

If we give up and hit cancel, we get a run time error. The statement calling `InputBox`

is highlighted.

The warning is fine, but we don’t want to saddle our poor user with the run time error, so we will make a minor modification to the code. Actually, two modifications. First, we wrap the call to `InputBox`

in `On Error Resume Next`

and `On Error GoTo 0`

. Then, if the user has canceled, `UserRange`

will not have been assigned, so it is `Nothing`

, and if this is the case, we’ll bail out before trying to format a nonexistent range, which would lead to another error.

Sub GetUserRange2() Dim UserRange As RangeOn Error Resume NextSet UserRange = Application.InputBox("Prompt", "Title", "$A$1", , , , , 8)On Error GoTo 0If UserRange Is Nothing ThenExit SubUserRange.Interior.Color = vbRed End Sub

Execution exits the sub above in two places, either `Exit Sub`

if `UserRange Is Nothing`

or `End Sub`

at the end. If you have a bit of OCD like I do, you’d prefer the code to always exit at one place, so I usually use this structure instead:

Sub GetUserRange2A() Dim UserRange As Range On Error Resume Next Set UserRange = Application.InputBox("Prompt", "Title", "$A$1", , , , , 8) On Error GoTo 0If Not UserRange Is Nothing ThenUserRange.Interior.Color = vbRed End If End Sub

Let’s punch up the code with a few text variables, and allow for a smart default range (or you can omit the default to leave out blank if that makes more sense).

Sub GetUserRange3() DimPromptAs StringPrompt= "Select a Range" DimTitleAs StringTitle= "Select Range" DimDefaultAs StringDefault= "$A$1" If TypeName(Selection) = "Range" ThenDefault= Selection.Address(True, True) End If Dim UserRange As Range On Error Resume Next Set UserRange = Application.InputBox(Prompt,Title,Default, , , , , 8) On Error GoTo 0 If Not UserRange Is Nothing Then UserRange.Interior.Color = vbRed End If End Sub

So now that we know how to get a range from the user, let’s put it to better use than just formatting a range to have a red background.

In this example, we’ll ask the user for a range containing series names, and we’ll apply the range to the active chart’s series one by one, until we run out of series in the chart or cells in the selected range.

Since we’re working on the active chart, the user must select a chart before running the code. We’ll use this structure to make sure there is an active chart:

Sub DoSomethingToActiveChart()If ActiveChart Is Nothing ThenMsgBox "Select a chart and try again.", vbExclamation, "No Chart Selected" Else ' do our thing here End If End Sub

Often I just bail out if there’s no active chart, but it’s nice to let the user know why nothing happened when they clicked a button.

In the procedure below, we check first for an active chart, then we ask for a range containing series names. Then we make sure the range is a single row or column (not strictly necessary, but otherwise it gets *complicated*), and step through the series of the chart, using each cell in the range as the title of the series.

Sub SelectRangeWithSeriesNames() If ActiveChart Is Nothing Then MsgBox "Select a chart and try again.", vbExclamation, "No Chart Selected" Else Dim Prompt As String Prompt = "Select a range that contains series names for the active chart." Dim Title As String Title = "Select Series Names" Dim SeriesNameRange As Range On Error Resume Next Set SeriesNameRange = Application.InputBox(Prompt, Title, , , , , , 8) On Error GoTo 0 If Not SeriesNameRange Is Nothing Then If SeriesNameRange.Rows.Count = 1 Or SeriesNameRange.Columns.Count = 1 Then With ActiveChart Dim iSrs As Long For iSrs = 1 To .SeriesCollection.Count If iSrs <= SeriesNameRange.Cells.Count Then .SeriesCollection(iSrs).Name = _ "=" & SeriesNameRange.Cells(iSrs).Address(, , , True) End If Next End With Else MsgBox "Select a range with one row or one column", vbExclamation, _ "Must be One Row or Column" End If End If End If End Sub

Here is a chart. The highlighted data range shows that the Y values are in C5:E10, and the category labels (X values) in B5:B10, but no series names are highlighted. The series names Series1 etc. in the legend also indicate that no series names have been specified.

Let’s use the labels in C2:E2 for series names. Run the code: the `InputBox`

is waiting for a range to be selected.

Select a range: it is highlighted with a dark dashed outline.

Click OK and the selected range is now highlighted as the series names, and the chart legend shows these labels as series names.

The above procedure can be used to assign series names (legend entries) for a chart that has none, or to replace the existing names (entries) in a chart that already has them.

If there are not enough cells in the selected range, some series do not get new names assigned. If there are more cells than series, the code ignores the excess cells.

In much the same way, we can ask a user for a range containing category labels (X axis values) to assign to a chart.

Sub SelectRangeWithCategoryLabels() If ActiveChart Is Nothing Then MsgBox "Select a chart and try again.", vbExclamation, "No Chart Selected" Else Dim Prompt As String Prompt = "Select a range that contains category labels for the active chart." Dim Title As String Title = "Select Category Labels" Dim CategoryLabelRange As Range On Error Resume Next Set CategoryLabelRange = Application.InputBox(Prompt, Title, , , , , , 8) On Error GoTo 0 If Not CategoryLabelRange Is Nothing Then If CategoryLabelRange.Rows.Count = 1 Or CategoryLabelRange.Columns.Count = 1 Then With ActiveChart If CategoryLabelRange.Cells.Count = .SeriesCollection(1).Points.Count Then .SeriesCollection(1).XValues = CategoryLabelRange Else MsgBox "Select a range with the correct number of points.", vbExclamation, _ "Wrong Number of Points" End If End With Else MsgBox "Select a range with one row or one column", vbExclamation, _ "Must be One Row or Column" End If End If End If End Sub

The above procedure can be used to assign category labels (X axis values) for a chart that has none, or to replace the existing labels in a chart that already has them.

The selected range must have one row and multiple columns or one column and multiple rows. (In some cases, a chart can have category labels that use multiple rows/columns, but that is a lot more complicated than needed for this example.) If this condition is not met, the code tells the user to try again.

Here is a chart. The highlighted data range shows that the Y values are in D3:F8, and the series names in D2:F2, but no categories are highlighted. The category labels 1, 2, 3, etc. along the X axis also indicate that no categories have been specified.

Let’s use the labels in B3:B8 for categories. Run the code: the `InputBox`

is waiting for a range to be selected.

Select a range: it is highlighted with a dark dashed outline.

Click OK and the selected range is now highlighted as the category range, and the horizontal axis displays these labels as categories.

The above procedure can be used to assign categories (X values) for a chart that has none, or to replace the existing categories in a chart that already has them.

If the selected range has the wrong number of cells for the points in a series, the procedure tells the user to select a range with the right number of labels.

If we construct a chart from an ideally-shaped range, there is a predictable alignment of X values, Y values, and series names in the worksheet. In this chart, the series data is in columns. The Y values for a series are in a vertical range, and the series name is in the cell directly above these Y values. The category labels (X values) are in the column to the left of the Y values of the first series.

In the next chart, the series data is in rows. The Y values for a series are in a horizontal range, and the series name is in the cell directly to the left these Y values. The X values are in the row above the Y values of the first series.

If for some reason the chart was constructed without series names or without category labels, but the missing elements are in the right alignment with respect to the Y values, we can find the cells with the series names or category labels and assign them to the chart.

The procedure below processes each series in the active chart. First it looks up the SERIES formula, and extracts its arguments into an array. The code finds the address of the Y values, which is in the third argument, and locates the associated range. Then it determines whether the data is by column or by row, identifies the cell with the series name, and uses this cell’s address as the name of the series.

Sub FindAndApplyNamesToSeries() If Not ActiveChart Is Nothing Then With ActiveChart Dim srs As Series For Each srs In .SeriesCollection' series formulaDim sFmla As String sFmla = srs.Formula' just the argumentssFmla = Mid$(Left$(sFmla, Len(sFmla) - 1), InStr(sFmla, "(") + 1)' split into an arrayDim vFmla As Variant vFmla = Split(sFmla, ",")' Y values are the 3rd argumentDim sYVals As String sYVals = vFmla(LBound(vFmla) + 2)' the Y value rangeDim rYVals As Range Set rYVals = Range(sYVals)' by row or column?If rYVals.Rows.Count > 1 Then' by column, so use cell above column of Y valuesDim rName As Range Set rName = rYVals.Offset(-1).Resize(1) ElseIf rYVals.Columns.Count > 1 Then' by row, so use cell to left of Y valuesSet rName = rYVals.Offset(, -1).Resize(, 1) Else' one cell only: who knows?Set rName = Nothing End If If Not rName Is Nothing Then srs.Name = "=" & rName.Address(, , , True) End If Next End With End If End Sub

The data range for the chart below is intact, but somehow, the series names were not associated with it.

Select the chart and run the procedure, and it applies the series names.

If the series in the chart already had names, this procedure overwrites those names with the names it finds in the worksheet.

The procedure below processes the first series in the active chart. First it looks up the SERIES formula, and extracts its arguments into an array. The code finds the address of the Y values, which is in the third argument, and locates the associated range. Then it determines whether the data is by column or by row, identifies the parallel range with the categories (X values), and uses this range as the categories for the first series, and therefore, for the chart.

Sub FindAndApplyCategoriesToChart() If Not ActiveChart Is Nothing Then With ActiveChart Dim srs As Series Set srs = .SeriesCollection(1)' series formulaDim sFmla As String sFmla = srs.Formula' just the argumentssFmla = Mid$(Left$(sFmla, Len(sFmla) - 1), InStr(sFmla, "(") + 1)' split into an arrayDim vFmla As Variant vFmla = Split(sFmla, ",")' Y values are the 3rd argumentDim sYVals As String sYVals = vFmla(LBound(vFmla) + 2)' the Y value rangeDim rYVals As Range Set rYVals = Range(sYVals)' by row or column?If rYVals.Rows.Count > 1 Then' by column, so use column to left of Y valuesDim rXVals As Range Set rXVals = rYVals.Offset(, -1) ElseIf rYVals.Columns.Count > 1 Then' by row, so use row above Y valuesSet rXVals = rYVals.Offset(-1) Else' one cell only: who knows?Set rXVals = Nothing End If If Not rXVals Is Nothing Then srs.XValues = rXVals End If End With End If End Sub

The data range for the chart below is intact, but somehow, the categories were not attached to it.

Select the chart and run the procedure, and it applies the categories to the chart.

If the series in the chart already had categories, this procedure overwrites those categories with the new labels it finds in the worksheet.

For completeness, here is the routine that lets the user select the chart source data for the active chart.

Sub SelectChartSourceDataRange() If ActiveChart Is Nothing Then MsgBox "Select a chart and try again.", vbExclamation, "No Chart Selected" Else Dim Prompt As String Prompt = "Select a range that contains source data for the active chart." Dim Title As String Title = "Select Chart Source Data Range" Dim ChartSourceData As Range On Error Resume Next Set ChartSourceData = Application.InputBox(Prompt, Title, , , , , , 8) On Error GoTo 0 If Not ChartSourceData Is Nothing Then ' if a range was selected If ChartSourceData.Rows.Count >= ChartSourceData.Columns.Count Then Dim DataOrientation As XlRowCol DataOrientation = xlColumns Else DataOrientation = xlRows End If ActiveChart.SetSourceData ChartSourceData, DataOrientation End If End If End Sub

If you follow this blog closely, you may have heard of Jon’s Toolbox, a new Excel add-in that I released recently. I’ve already made a few changes to this utility, thanks to suggestions from readers like you.

Jon’s Toolbox already had a feature that made it easy to select a new source data range for a chart. While finishing up this article, I realized that these features should be included in the software.

The features have only been added to the DEV version of the software on my computer, but in a week or so I will release a new build that includes these new functions. If you already have Jon’s Toolbox, subscribe to my newsletter to hear when the new build is live. If you don’t have Jon’s Toolbox, you can get it at the bottom of my article Jon’s Toolbox – A New Utility from Peltier Tech.

The post Assign Chart Series Names or Categories with VBA appeared first on Peltier Tech Blog.

]]>My colleague Frédéric LeGuen, an Excel MVP from France who runs the ExcelExercise web site and YouTube channel, asked me a tricky question last week. He wanted a way to visually track visitors to a museum, knowing how many arrived and left during a given hour. He tried a candlestick chart first, which is great for monitoring stock data, and a waterfall chart, which is great for tracking a quantity subject to up and down changes. The problem is that both of these charts show only up *or* down values in a given time period, while we want to show up *and* down for each time period.

We have the following data, which lists the number of visitors entering (Entrée) and leaving (Sortie) during the hour, and the net visitors . The formula in cell D2, filled down the list, is

=SUM(B$2:B2)-SUM(C$2:C2)

So let’s plot the data in a clustered column chart to see what we have.

No, that doesn’t show us anything. How about a stacked column chart?

What we want will look like this, but have blue arrows before each column showing visitors who have arrived and orange arrows after each column showing visitors who have left.

There is an easy way to do this, using all columns and the sometimes tricky custom error bars. And there is a hard way, using XY series and tricky custom error bars. There’s another hard way, which uses easier error bars but more complicated stacked columns that require a complicated data layout, so I think I’ll avoid that one.

Starting with the same initial data as above, I made a clustered column chart using time (Heure) as X values and Visitors as Y values. I’ve changed the series overlap from the default of -27% to 0% and the gap width from the default of 219% to 100%. I don’t know who thought up those defaults.

I copied the Visitors data, selected the chart, and pasted. Two series with the same values and name, that’s what I wanted. Really.

Then I repeated the copy and paste, so I have the same data in the chart three times. This will give me the total visitors as the middle column, with a column before and after this where I can draw my arrows.

I changed the colors of the bars, because I want the totals to be gray, and I want blue and orange to represent increases and decreases. And I’ve renamed the outer bars Before and After.

I added error bars to the Before and After series, using the little plus-sign icon floating beside the chart.

I don’t need to display the blue and orange columns anymore, so I’ve formatted them with no fill, to be transparent.

Time to format the error bars. Be sure to choose the *No End Caps* option, and then under *Error Amount*, select *Custom*, and then click the *Specify Value* button.

I used the data in the Entrée and Sortie columns as custom values. When you click the *Specify Value* button, you get this tiny little dialog, with even tinier little edit boxes for selecting the range. The values are initially `={1}`

for both: change the positive value to `={0}`

and select the Entrée values for the negative.

I’ve been complaining about the tiny Custom Error Bars dialog for a long time. Here’s what a more reasonably sized dialog would look like. See that, Microsoft? I did it in 5 minutes in MS Paint.

Note, for the Sortie error bar values, they are off by one row, so you need to select the range starting in the second cell, so use $C$3:$C$11. Here’s how the custom error bars look.

Format the error bars with the blue and orange colors. Make the lines a bit thicker; I’ve used 1.5 pt which is 2 pixels. Give the blue error bars a starting arrowhead, and the orange error bars an ending arrowhead.

A little clean up. I deleted the legend, and adjusted the chart data range to leave out the meaningless blank at 18:00.

You have some leeway in the spacing of the columns, by adjusting the gap width. In the chart above, the gap width is 100, meaning the space between the clusters is 100% as wide as a single column. The hidden columns that the arrows occupy are 100% of this width as well, so there is a reasonably wide space between arrows.

If you want to decrease the spacing, you can decrease the gap. The chart below has the minimum gap width of 0%.

If you had kept the default gap width of 219%, the columns and arrows would be rather far apart.

You can set the gap width as high as 500%, which looks so ridiculous that I’m not showing it..

The easy approach used two column chart series to hang the error bars on. The hard approach uses two XY scatter chart series. I’ve inserted a column of X values before the Y values (Visitors) in my data.

I selected the time (Heure) and Visitors, and created a column chart. I’ve recolored it gray, and applied overlap of 0% and gap width of 150%.

I copied the X and Visitors data in columns D and E, selected the chart, and used Paste Special (Ribbon > Home tab > Paste dropdown > Paste Special) to add the data to the chart as a new series in columns, series name in first row, categories in first column.

The result is a second set of columns with the same values as the first. I’ve recolored the columns blue and renamed them Entrée.

I right-clicked on the new series and chose Change Series Chart Type from the pop-up menu. I changed the chart type for Entrée to Scatter (arrow 1), and I unchecked the Secondary Axis box beside it (arrow 2).

The result is a combination chart with a set of gray columns and a set of blue markers, which I have reformatted into a large blue circle with no fill color.

The X values I’ve used, 1, 2, 3, etc., align with the bars at the first, second, third, etc. categories. What I need is to subtract a little bit from these X values to move the blue circles to the left of the columns, and add the same amount to these X values so another set of circles will sit to the right of the columns.

So I’ve added two more columns to my data, X-Before and X-After. I put a small value into cell D13. The formulas in the added columns are:

Cell E2: =D2+$D$13 Cell F2: =D2-$D$13

Keeping my original X values in their own column and using a cell to hold my small amount to add and subtract make it easier to make adjustments later, to cells I can see, rather than to formulas where the amounts are hidden.

I changed the X values of my Entrée series to X-Before.

Looking good. So I copied my X-After and Visitors data, and used Paste Special as before to add a third series to the chart. Since I’ve already changed the previous series to a scatter type on the primary axis, Excel made my new series also a scatter type on the primary axis. I’ve renamed the new series Sortie, and made it a large orange circle with no fill.

So far so good, as long as Disney doesn’t come after my for trademark infringement.

So let’s bring this puppy home. Add error bars to the Entrée and Sortie series, using the plus icon floating beside the chart.

Select and delete the horizontal error bars.

Let’s format the error bars. Choose the *No End Caps* option, and under *Error Amount*, select *Custom*, and then click the *Specify Value* button.

Use the data in the Entrée and Sortie columns as custom values for the error bars. You get this tiny little dialog I showed before, with microscopic edit boxes for selecting the range. The error bar values are initially `={1}`

for both: change the positive value to `={0}`

and for negative select the Entrée values.

Isn’t this Custom Error Bars dialog difficult to use? Here’s a better sized dialog. See that, Microsoft? I fixed it in 5 minutes using MS Paint.

Note that the Sortie error bar values are off by one row, so you need to select the range starting in the second cell, so use $C$3:$C$11. Here’s how the custom error bars look.

Format the error bars with the blue and orange colors and the lines thicker; I’ve used 1.5 pt (2 pixels). Give the blue error bars a starting arrowhead, and the orange error bars an ending arrowhead.

Format the Entrée and Sortie series to use no markers.

Finally a little clean-up. Delete the legend. Make all of the chart series one point shorter to remove the meaningless 18:00 category (stop at row 10 instead of row 11). Set the Y axis minimum to zero (the arrows between 12:00 and 13:00 actually drop slightly into negative territory, but that’s not crucial to the story).

The chart above used a gap width of 125% and a value of 0.333 to add or subtract from X to position the Entrée and Sortie arrows. You can adjust these together to change the chart’s appearance. Below, for example, I’ve applied a gap width of 75% and used an X increment of 0.41.

The hourly flow chart in this tutorial uses a similar approach to that in Revenue Chart Showing Year-Over-Year Variances.

There is a somewhat dated but still relevant tutorial on this blog that tells you all about Custom Error Bars in Excel Charts. Here is the cramped Custom Error Bars dialog in Excel 2007 and Windows 7, even narrower than Excel 365’s.

Here is a complete list of error bar related articles on the Peltier Tech blog.

- Error Bars in Excel 2007 Charts
- Custom Error Bars in Excel Charts
- Floating Bars in Excel Charts
- Error Bar Gantt Chart
- Callout Labels with Error Bars
- Step Charts in Excel
- Swimmer Plots in Excel
- High-Low Line Alternatives in Excel Charts
- Custom Axis Labels and Gridlines in an Excel Chart
- Add a Horizontal Line to a Column or Line Chart: Error Bar Method
- Add a Vertical Line to a Column or Line Chart: Error Bar Method

The post Hour by Hour Flow Chart appeared first on Peltier Tech Blog.

]]>**Jon’s Toolbox** is a simple Excel VBA add-in that provides a handful of useful functions. I use it when I am preparing or teaching classes, writing blog posts, and working on projects.

Jon’s Toolbox is not a replacement for Peltier Tech Charts for Excel, nor does it require Peltier Tech Charts for Excel in order to operate properly. The two utilities run independently, and I have both installed on each of my computers.

I’ll do a quick run-through of the menu controls to give you a sense of what Jon’s Toolbox is all about. Peltier Tech Charts for Excel is a large collection of advanced features, for creating custom chart types which are not native to Excel, for manipulating charts, and for controlling data that charts are linked to. Jon’s Toolbox is a self-contained set of functions that help with quickly inserting data and charts, and with quickly formatting charts and worksheets.

This is a quick look at the custom ribbon tab of Jon’s Toolbox. Click on the image to see it full size in a new browser tab.

The ribbon includes groups of controls, such as *Insert Data – Chart – Pivot*, *Chart and Font Size*, *Chart Tools*, *Range Tools*, *Range and Chart*, plus a teaser at the end, *From the Blog*.

*Insert Data – Chart – Pivot* has controls to quickly insert data, charts, or a whole data-pivot table-pivot chart ensemble. This is nice when I’m running a demo of my other software, preparing for a class, writing documentation or a blog post. It’s especially nice if I’m running a class and I want to demo a feature of Excel and I want some data for a chart quickly, and I don’t want to make everyone watch me mistype a bunch of numbers.

*Insert Data Range* inserts a handy dummy data set, ready for me to demonstrate Excel’s features, or the features of some software I’ve written. If the active sheet is blank, it puts the data there, otherwise it inserts the data onto a clean worksheet.

Nothing fancy, but it’s a good simple data range for demonstration purposes.

There are chart buttons for all of the chart types I routinely use. Yes, it includes pie charts, get over it.

Each of the buttons is a split button. I’ll illustrate the functionality with the *Column Chart* button.

If you click on the icon (left), the program creates the chart type shown. If you click on the dropdown (middle), you get a dropdown list with the options for that chart type (right).

When you use one of the options from the dropdown, the split button icon changes to that chart type, which is created next time you click on the icon.

The way a chart is inserted varies, depending on what is selected.

- If no data is selected, the program inserts a new worksheet, it inserts the dummy data range (see
*Data Range*above), and it inserts a chart one row below the data range using this data. - If a data range is selected, or if a cell in a block of data is selected, a new chart of the selected type is inserted, one row below the data range.
- If a chart is selected, the program inserts a new chart of the selected type, offset slightly from the original chart, using the same data as the previously active chart.

The charts created here are essentially the same as those from Excel’s own Insert tab, the same size and same series formatting, but they follow my own preferences for that chart type.

For example. Excel’s column charts have too much space between clusters, and unneeded space between columns in a cluster. Jon’s Toolbox fixes this.

Excel’s bar charts draw the axis in the opposite order of how the data appears in the worksheet. Jon’s Toolbox fixes this, while also changing to an aspect ratio that better accommodates the chart.

In addition to being spelled with too many silent letters, Excel’s doughnut charts have too fat a white line surrounding the individual points, the default hole size of 75% is ridiculously large, and the legend is placed below the chart, shrinking the doughnut’s height and leaving wide swaths of white space on either side of the doughnut. Jon’s Toolbox creates a donut chart spelled with no excessive letters; with a thin white outline around the points; with a default hole size of 66% for one ring (one series), 50% for two rings, and 33% for three or more rings; and with its legend on the right side of the chart, allowing the donut to fill more of the space occupied by the chart. In addition, Jon’s chart is closer to square, which means the chart takes up less space.

*Data, Table, and Chart* inserts a new worksheet. The worksheet contains a Table of data, a Pivot Table constructed using the data in this Table, and a Pivot Chart based on this Pivot Table. Click on the screenshot to view in a new window.

This is a great way to quickly show someone about Tables and Pivot Tables and Charts.

Are the default charts in Excel the right size for your purposes? In the US, the defaults are 5 inches wide and 3 inches tall, and frankly, that’s way larger than most of my charts need to be. Also, that 14-point chart title is much too large.

The *Chart and Font Size* group of controls provides the following predefined set of alternative sizes for charts and their fonts.

The different chart and font sizes look something like this. Note that they are independently applied to each chart. For my blog posts, I can easily fit two of the small charts, usually with medium font, side by side within a column of text.

Like many of the features of Jon’s Toolbox, if you select multiple charts before clicking the button, the formatting is applied to all selected charts.

Jon’s Toolbox has a handful of useful chart tools.

The default legend in an Excel chart is pretty widely spaced out (below left). *Condense Legend* closes it up (below right).

You can easily *Reverse Series* (and then unreverse them) in a chart, for example in a clustered column chart…

…or in a stacked column chart.

By default, Excel plots a bar chart with the vertical axis in exactly the opposite order as the data appears in the worksheet. This makes sense on some level, but not from the standpoint of usability. Click *Flip Bar Chart* and fix your bar charts instantly. (Click it again to get the upside-down version back.)

Note that if you use Jon’s Toolbox to insert a bar chart, it is automatically plotted the right way.

The default Excel color palette is not too bad, but if large areas of your chart are filled with the colors, the chart may seem oversaturated (below left). In Excel’s color pickers, the next lighter shade is 40% brighter, which may make the charts washed out. *Apply 20% Brightness* applies an intermediate shade to the selected series (below right). Note that you have to select and format each series separately.

Finally, the *Copy Bitmap* button copies the active chart to the clipboard in bitmap format, saving you several clicks (Home tab > Copy dropdown > Copy as Picture > select Appearance and Format). Simply switch to PowerPoint or your image editing software and Paste.

Jon’s Toolbox contains a set of range formatting tools.

When you select a chart or a series in a chart, Excel highlights the associated data in the worksheet.

*Chart Data* applies this formatting more permanently to the worksheet. The colors are the same, it only lacks the small square handles on the corners of the shaded regions. Great for tutorials and illustrations, though.

Click the *Chart Data* icon, and different things happen depending on what is selected.

- If a chart is selected, the data range for the whole chart is formatted.
- If a series in a chart is selected, the data range for that series is formatted.
- If a range is selected, the range is formatted as if it were used in a chart. If more columns that rows are selected, the range is formatted as if series are plotted by row, otherwise it is formatted as if series are plotted by column.

If you need more control over the formatting, click the *Chart Data* dropdown.

You can format a range including X and Y values and series names as if plotted by row or by column. You can format a range as if it contains only series names, X values, or Y values.

In addition, you can format a range so it looks like a selected range (below left), as a copied range (below center), or by clicking the two buttons in succession, as a selected range that’s been copied (below right).

I have found these very useful when writing blog posts, class notes, or documentation for my software.

Excel has an extensive built-in Borders menu.

And the borders it creates are stark, in-your-face dark black. Sometimes I even want to pluck out my own eyeballs. (My daughter once determined that any statement is funnier if you say “eyeballs” than if you just say “eyes”, and I agree. She’s a scientist working on visual perception using primates.)

Below are Outside Borders, Thick Outside Borders, All Borders, and a combination of All Borders and Thick Outside Borders. Pretty harsh.

*Nice Borders* applies more visually appealing borders to a selected range. Medium is a medium gray, darker than the default cell boundaries, but not as irritating as black, while light is a light gray similar to the default cell boundary color. “Nice” borders means a medium gray outline with light gray internal boundaries, and the other options format internal borders and outlines, or just outlines, with medium or light gray borders.

These are illustrated below with cell gridlines (Gridlines on the View tab) displayed on the left and turned off on the right.

*Medium Borders* applies medium gray borders to each cell in the selection.

*Medium Outline* applies a medium gray border around the outer cell boundaries of the selection.

*Light Borders* applies light gray borders to each cell in the selection. The light gray is indistinguishable from the default cell gridlines. I use this to put gridlines only on a certain range, while turning off gridlines elsewhere.

*Light Outline* applies the light gray border around the outer cell boundaries of the selection.

*Nice Borders* applies a medium gray outline around the selection and light gray borders to cell boundaries within the selection.

If I am making screenshots for documentation, I will often either hide the default gridlines and use *Light Borders* (first screenshot below), or I will show the gridlines and use *Nice Borders* (second screenshot below).

When you select one of these Border options, it becomes the new icon and default action for the command.

*Clear All Formats* will clear all formats, except for number formatting, from a selected range. This is nice, because sometimes you need to get rid of some ugly border and fill colors, but recreating number formats is a chore.

In the Clear All Formats dropdown, the options are to

- Clear border formatting,
- Clear border and fill formatting,
- Clear all formatting
*except for number formatting*.

*Hide Page Breaks* simply removes those unsightly page break dashed lines that make me crazy.

The *Range and Chart* group has some controls that help you control how the chart interacts with the worksheet.

*Select Data* allows you to quickly select a new source data range for the active chart.

*Cover with Chart* lets you select a range for the active chart to cover. The chart is repositioned and resized to cover the selected range.

*Stretch Chart* resizes the active chart (or all selected charts) to fill the cells it partially covers.

*Center Chart* moves the active chart (or all selected charts) so it is centered within the range of cells it covers.

Finally, Jon’s Toolbox includes some nice little programs that were previously presented in the Peltier Tech Blog.

These programs were featured in the following blog posts:

- Add Totals to Stacked Column Chart (but it works on stacked bar, stacked area, and stacked line charts, too)
- Trendline Calculator for Multiple Series
- Built In and Custom Lists in Excel

For a limited time only, Jon’s Toolbox is available for $29. Click this button to make your purchase.

Check your shopping cart with this button.

If you purchase multiple licenses, you qualify for a volume discount, shown in the table below. The discount is automatically applied in the shopping cart when you enter your quantity.

Like all of Peltier Tech’s software products, Jon’s Toolbox is guaranteed for 90 days from the date of purchase. If you are unsatisfied for any reason, contact Peltier Tech directly to request a refund. Any feedback is appreciated, but none is necessary.

Each user of Jon’s Toolbox requires a license. Each license allows the user to install Jon’s Toolbox on up to two computers of which they are the primary user. If a user gets a new computer, the software can be uninstalled from the old machine and reinstalled onto the new one.

Jon’s Toolbox is a simple Excel VBA add-in, which can be installed in Excel for Windows or Excel for Mac. Follow instructions at Install an Excel Add-In, and you’ll be working in no time.

Contact Peltier Tech with any questions about Jon’s Toolbox.

The post Jon’s Toolbox – A New Utility from Peltier Tech appeared first on Peltier Tech Blog.

]]>My friend and colleague Patrick Matthews, a former Excel MVP, posted a screenshot of an unusual bar chart on his Facebook page. The chart was taken from *What does the public say about impeaching Trump?*, the last section of a Washington Post article titled What happens next in the impeachment of President Trump? Patrick’s comment says it all: “Bar lengths on a chart, what do they even mean?”

At the risk of opening a torrent of political comments, I’ve reproduced the chart here.

Take a close look at the bar lengths in the first chart. The 12% bar is over half as long as the 85% bar, where in a bar chart with proportional bars, the 12% bar should be about 1/7 as long. But at least the 49% bar is slightly longer than the 47% bar, and they are in between the 12% and 85% bars. The same holds true of the bar lengths in the second chart.

Someone responded to Patrick’s post, wondering how they came up with those bar lengths. After the analysis in the previous paragraph, I replotted the data, set the axis scales to -100% to +100%, and set the vertical axis to cross at -100% on the horizontal axis. Nailed it!

Well, not exactly. As I sometimes do, I overanalyzed the charts. I’ve stripped most of the text from the WaPo graphic, replaced the outlines of my charts with red lines, and stretched my charts so they overlaid the WaPo plot.

It turns out that the axis minimum was really -92%, so my wild guess of -100% was pretty good. I’ve set the gridline spacing so that 0% and +92% are shown on the chart, and the far right edge of the plot area is at +100%.

I don’t think the graphic artist really used an axis minimum of -92%. I’m sure they started with 0%, then decided to fill in some white space by dragging the left edges of each bar while keeping the right edge in place. They filled in the space, all right. But by doing so, they obscured the differences between the values.

It’s the same issue that occurs when people start their axis at a value greater than zero, so the differences between values are accentuated. But now the axis and the bars start well below zero, and the differences are minimized.

My next step was to take my two charts, and set their axis minimum to 0%. These two charts now accurately show the relative percentages.

Those last two charts were a big improvement. But if we’re expected to compare the values, shouldn’t the bars all be in a single chart? Below I plotted the negative of one set of data, so the bars stretch in opposite directions, the way they do in population pyramids. Let’s call this a diverging bar chart.

Then I remembered why I dislike population pyramids, as I discussed ages ago in Tornado Charts and Dot Plots. It’s hard to compare bars that reach away from each other. It would be easier to compare the values of any two bars if they start at one horizontal position (the vertical axis) and stretch in the same direction (to the right). So I created this clustered bar chart:

An alternative is to plot one set of bars from left to right, and the other from right to left. It’s a converging rather than a diverging bar chart. This makes individual bars more difficult to compare, as in the population pyramid lookalike above. But the white spaces clustered between the colored bars represent the percentages of each category who have no opinion.

What do you think? Not about the topic of the chart, but about the construction of the chart. Do you prefer the diverging bar chart, the clustered bar chart, the converging (stacked) bar chart, or something else entirely?

The post Bar lengths on a chart, what do they even mean? appeared first on Peltier Tech Blog.

]]>