When Windows detects that a file has come from a computer other than the one you’re using, it marks the file as coming from the web, and blocks the file. This includes files emailed to you and files downloaded from the web. Here is how blocked files behave, and the steps you must take to unblock macros.

If you open a blocked file, Excel opens it in Protected View. The ribbon is hidden, and a warning appears across the top of the window. You can view but not edit the workbook.

When you click on Enable Editing, Excel reopens the workbook normally. But since it contains VBA, you get a message that macros are blocked. You can dismiss the message, but you can’t run macros from the workbook.

If the VBA Editor window is open, instead of the Security Risk message, you get a Security Notice informing you that macros are blocked, and your only option is to Disable Macros.

Close the file and unblock it as shown below.

Blocked add-ins exhibit their own strange behaviors. If you try to install a blocked add-in, you may see a message that Protected View is not supported for that document type. Or you may have been able to install the add-in, but now, although it is checked in the Add-Ins dialog, it doesn’t appear as expected in Excel’s ribbon. If you experience issues like this, try unblocking the add-in using the same protocol below, then install it as usual. These steps are also outlined in my article, Install an Excel Add-In.

To unblock a file that came from the web, find the file in Windows File Explorer. If you unblock a zip file, all files within the zip file will be unblocked.

Right-click on the file and click on Properties at the bottom of the context menu that pops up. This opens the Properties dialog. Note the warning, click to check the Unblock checkbox, and click OK.

Excel opens the workbook normally, but since it contains VBA, you get a Security Warning that macros have been disabled. Click the Enable Content button to enable macros.

If the VBA Editor window is open, instead of the Security Warning, you get a Security Notice informing you that macros are blocked, and you have the option to Enable Macros.

Now you’re ready to roll.

The protocol above works for the most common security settings. But your IT department may have imposed different, more stringent settings. If IT has blocked macros, you could talk to them about it, or maybe just quit your job.

To find your settings, go to File > Options, and click on Trust Center.

Click the Trust Center Settings button, then click on Macro Settings.

This is my preferred setting:

**Disable VBA Macros With Notification**will open with macros disabled but provide an option to enable macros. This is my preferred option.

Here are the other options:

**Disable VBA Macros Without Notification**will open with macros disabled and provide no option to enable macros, and won’t even tell you about the macros.**Disable VBA Macros Except Digitally Signed Macros**will open with macros disabled unless the workbook has been digitally signed and the associated certificate has been trusted.**Enable VBA Macros**will enable any and all macros without any notification.*DO NOT USE THIS OPTION.*

There are a couple of other items you can think about, which don’t affect VBA being blocked.

**Enable Excel 4.0 Macros when VBA Macros Are Enabled**enables the old macros when the new macros are enabled. I keep it checked because I feel that I’ve already trusted the document when I clicked on Enable Macros.**Trust Access to the VBA Project Object Model**will allow VBA macros to modify VBA projects, that is, write code and modify other settings. Check this box if you are developing VBA code that needs these capabilities, but in general you should probably keep it unchecked.

On some new computers, Windows 11 comes installed in S Mode, which is an ultra-secure mode. You may not even be allowed to download a file that contains VBA, and Windows will inform you incorrectly that the file has some kind of evil virus. I believe S Mode is limited to the Home edition of Windows, so you can deal with it yourself. Removing S Mode is a simple operation, then you can download and unblock files as described above.

A potentially dangerous macro has been blocked

Macros from the internet will be blocked by default in Office

The post Unblocking and Enabling Macros appeared first on Peltier Tech.

]]>Using ChatGPT and other AI tools to write Excel formulas has become popular. Bill Jelen aka Mr Excel has a simple test to see whether the particular AI engine is reliable: ask for a formula to determine whether a number is a prime number. Until recently, the response is this formula to test cell A2:

`=ISPRIME(A2)`

This makes use of the well-known but *nonexistent* `ISPRIME`

function (which does not exist in Google Sheets either). ChatGPT is now using a newer body of knowledge to generate its answers, but it still includes “may occasionally generate incorrect information” among its limitations; Bing has a new chat search feature that also uses an updated data source.

According to Wikipedia, a prime number is a natural number (a non-negative integer) greater than 1 that has as its natural number factors only itself and 1. Five is a prime number because its only factors are 5 and 1, i.e., 5 = 5 x 1. Six is not a prime number because it can be factored into other natural numbers, i.e., 6 = 6 x 1 = 3 x 2. To test whether a number is prime, we must divide it by all numbers greater than 1 but less than itself; if the remainder of any of these divisions is zero, the number is not prime. In fact, it is only necessary to divide by numbers less than the square root of the number being tested, because any factor greater than the square root is multiplied by a factor less than the square root.

As I mentioned, ChatGPT and Bing have improved their prime number formulas. Bing proposed two formulas, one of which was correct, and after several tries, ChatGPT’s fifth suggestion also was correct. I won’t clutter the internet with the incorrect responses, in case this article becomes part of the universe of knowledge about prime numbers and they are misunderstood to be valid formulas.

I also used a regular web search and found candidate formulas on various websites. Like the AI searches, these results were not necessarily correct. I should note that neither regular nor AI-assisted searches led to an Excel Lambda function that produced correct results.

I modified the two correct formulas and produced these:

```
=IFS(A2<2,FALSE,A2=2,TRUE,A2<>INT(A2),FALSE,A2>2,AND(MOD(A2,ROW(INDIRECT("2:"&ROUND(SQRT(A2),0))))<>0))
=IFS(A2<2,FALSE,A2=2,TRUE,A2<>INT(A2),FALSE,A2>2,MIN(MOD(A2,ROW(INDIRECT("2:"&ROUND(SQRT(A2),0)))))>0)
```

These are array formulas, and in older versions of Excel must be entered with Ctrl+Shift+Enter. In Excel 365, a simple Enter will suffice.

`ROW(INDIRECT("2:"&ROUND(SQRT(A2),0)))`

produces a list of row numbers between 2 and the square root of the number in A2, a clever construction that predates the modern `SEQUENCE`

function. `MOD`

performs the division and tells us the remainder. `AND`

tests whether all remainders are not equal to zero (i.e., it’s prime); alternatively, `MIN`

finds the smallest remainder, which is compared to zero (in which case it’s not prime). The first couple of IFS terms take care of 2, which is prime, numbers less than 2, which are not prime, and non-integers, which are not prime.

I don’t know which variation is better, `AND(MOD())`

or `MIN(MOD())`

. Both produce the correct results, and one is a character longer than the other. But later I’ll describe why I selected the MIN approach for my custom ISPRIME function.

I find that some formulas are easier to build from the inside out. I’ll illustrate my approach here for both variations.

Our first step is to build an array of potential factors. I entered numbers from 0 to 200 in A2:A202, then entered the following formula into B2 and copied it down to B202:

`=LAMBDA(n,SEQUENCE(1,SQRT(n),2))(A2)`

The top part of the worksheet looks like this. Each value in column A has a horizontal array of factors. B2 contains a #CALC! error, because it’s asking for an array (sequence) of zero values. Rows 3:5 have one-element arrays containing the factor 2, but larger values in column A produce larger arrays with factors 2, 3, 4, etc.

The next step is to divide the input number by the factors and check the remainders. The formula is

`=LAMBDA(n,MOD(n,SEQUENCE(1,SQRT(n),2)))(A2)`

and the worksheet now looks like this. The array of factors in each row has been converted into an array of remainders.

To use the `AND`

approach, we must first compare the remainders to zero. Our working Lambda becomes

`=LAMBDA(n,MOD(n,SEQUENCE(1,SQRT(n),2))>0)(A2)`

and the worksheet now looks like this. Remainders equal to zero become `FALSE`

and those greater than zero become `TRUE`

.

Now we can use AND to test whether all Boolean values in an array are TRUE, indicating a prime number. Our formula evolves to

`=LAMBDA(n,AND(MOD(n,SEQUENCE(1,SQRT(n),2))>0))(A2)`

and the worksheet changes to this. Note that the array in each row has been distilled into a single value.

The above correctly tells us which values are prime (`TRUE`

), except for 0, 1, and 2, and any non-integer values. We update the Lambda to include these cases:

`=LAMBDA(n,IFS(n<2,FALSE,n=2,TRUE,n<>INT(n),FALSE,n>2,AND(MOD(n,SEQUENCE(1,SQRT(n),2))<>0)))(A2)`

Finally we have our `ISPRIME`

Lambda function, `AND`

version.

First we identify our minimum remainder using MIN in the formula

`=LAMBDA(n,MIN(MOD(n,SEQUENCE(1,SQRT(n),2))))(A2)`

This transforms the array in each row into a single value.

Now we test whether the minimum remainder is greater than zero, indicating a prime number, using this formula.

`=LAMBDA(n,MIN(MOD(n,SEQUENCE(1,SQRT(n),2)))>0)(A2)`

This converts the column of zero and one values to `TRUE`

and `FALSE`

values.

This correctly shows which values are prime (`TRUE`

), except for 0, 1, and 2, and any non-integer values. We modify our Lambda to include these cases:

`=LAMBDA(n,IFS(n<2,FALSE,n=2,TRUE,n<>INT(n),FALSE,n>2,MIN(MOD(n,SEQUENCE(1,SQRT(n),2)))>0))(A2)`

Now we have our `ISPRIME`

Lambda function, `MIN`

version.

Our Lambda formulas so far have this syntax:

`=LAMBDA(input,calculation)(reference)`

We turn our Lambda into a custom function by defining a Name, and using the formula above without the reference:

`=LAMBDA(n,IFS(n<2,FALSE,n=2,TRUE,n<>INT(n),FALSE,n>2,MIN(MOD(n,SEQUENCE(1,SQRT(n),2)))>0))`

Go to Formulas > Define Name to open the New Name dialog. Enter “ISPRIME” in Name, “Input a number to determine whther it is prime” in Comment, and the Lambda formula in Refers To.

Click OK to create the function. We can open the Name Manager (Ctrl+F3) to see this and any other custom functions in the workbook.

I stated earlier that I didn’t know which version of the `ISPRIME`

formula was better, the one using `AND`

or the one using `MIN`

. I selected the `MIN`

version as my custom function because it converted the intermediate calculation to a single value one step earlier than the `AND`

version. A more rigorous approach would enter each function into many cells and measure how long Excel takes to recalculate each one.

Click in a cell where you want the new ISPRIME function and start typing ISPRIME. Before you even finish, IntelliSense will find the function and show the description in a tooltip.

Click Tab, and Excel inserts the full name and opening parenthesis. Click on the cell with a potential prime number that you want to test, and click Enter. Here is the sheet with the new function filled down:

- ISPRIME Lambda Function
- Lambda Moving Average Formulas
- Improved Excel Lambda Moving Average
- Excel Lambda Moving Average
- LAMBDA Function to Build Three-Tier Year-Quarter-Month Category Axis Labels
- Dynamic Array Histogram
- Calculate Nice Axis Scales with LET and LAMBDA
- VBA Test for New Excel Functions
- Dynamic Arrays, XLOOKUP, LET – New Excel Features

The post ISPRIME Lambda Function appeared first on Peltier Tech.

]]>In two recent articles, Excel Lambda Moving Average and Improved Excel Lambda Moving Average, I described my efforts to build a Lambda moving average formula that worked the way I wanted it to. I’ve included average calculations for the first few data points before the number of points being averaged is met. I’t’ve also included only counting points within a set number of days, even if there are missing days within the range.

Here I will compare three different moving averages, the Simple Moving Average (SMA), the Weighted Moving Average (WMA), and the Exponential Moving Average (EMA), often called the Exponentially Weighted Moving Average (EWMA). I will develop Lambda functions for each, and I will show how to build up a data range that will enable a chart to update when the size of the range changes.

Often data exhibits variation, and moving averages allow us to smooth out this variation. Our examples will apply seven-point moving averages to the following data set, consisting of 25 points that roughly follow a sine wave.

The downloadable workbook I used for this article is Lambda Moving Averages.xlsx.

My previously cited efforts involved Simple Moving Averages. A Simple Moving Average simply averages a certain number of values to produce the averaged values. For our 7-point moving average, the contribution of each value to the last average is shown below. The last seven points equally contribute 1/7 of their value (0.142857…) to the average. Often at the beginning of the data range, when fewer values are available than the number to be averaged (7 in my example), no average is calculated. I prefer to modify the formula so it averages the available values.

Here is my Simple Moving Average Lambda formula from the recent article; the data to be averaged is located in A2:A26, and the number of points to average (7 for these examples) is in cell K1. These inputs are appended to the Lambda formula in parentheses.

```
=LAMBDA(arr, n,
LET(
sma, MAKEARRAY(
ROWS(arr),
1,
LAMBDA(r, c,
LET(
movarr, TAKE(TAKE(arr, r), -n),
AVERAGE(movarr)
)
)
),
sma
)
)(A2:A26, K1)
```

This formula is entered into cell C2, and here are the calculated Simple Moving Averages.

You can use the Advanced Formula Environment (or AFE instead of its full name which is rather long to cram into the ribbon) to edit the Lambda function.

The AFE is easier to use than the built-in Formula Bar editor, it includes all of the usual IntelliSense, and it smartly adds line breaks and spaces to make the formula easier to read. Here is what the formula looks like in the AFE; Grid is selected in the second row indicating that the AFE is evaluating a cell formula. (Note: I do most of my editing in the old-school Formula Bar and in Notepad++, but I’m starting to transition to the AFE).

We can use that Lambda function, without the parenthetical arguments at the end, to define a Name that will serve as a function, as a User-Defined Function if you will; we can use this function anywhere in the workbook. On the Formulas tab of the ribbon, click on Define Name to pop up the New Name dialog.

Enter a function name (in Name), a brief description (in Comment), and the formula (in Refers To).

Click Enter, and the SMA function becomes available for use throughout the workbook.

This Name (John Walkenbach coined the phrase “Named Formula” decades ago) can also be added using the Advanced Formula Environment. and of course, the AFE can be used to edit any existing Named Formulas. Our SMA Named Formula is shown below in the AFE; Names is selected in the second row to indicate that the AFE is evaluating a Named Formula. Note that it does not show the outermost Lambda, and Formula Name and its Arguments are specified in their own sections of the dialog. This was confusing to me the first several times I tried using the AFE in this way, but I’m learning.

The function SMA can be used anywhere; it is used in cell E2 below as simply

`=SMA(A2:A26,K1)`

The results of this formula in E2 (spilling down to C26) match those of the Lambda function in cell C2.

I usually create my Lambda formulas in a cell, as in cell C2 above, then copy and paste it into a Name, where it becomes a reusable user-defined function.

I promised a technique to make a chart recognize the dynamic nature of the Dynamic Arrays produced by these Lambda functions. Charts dynamically update when their data is contained in a Table, and the Table resizes; see Easy Dynamic Charts Using Lists or Tables. Microsoft has recently extended this dynamic behavior to Dynamic Arrays; it’s not as reliable (yet) but it makes dynamic charts work despite Dynamic Arrays not being compatible with Tables. The only catch is that, at least so far, all of the chart’s data has to reside in the same Dynamic Array.

So I want to build a Dynamic Array with column headers, X values (point numbers), and Y values (original data and moving average). The headers are a simple array, `{"Point","Data","SMA"}`

. The point numbers are a simple `SEQUENCE`

function, the data simply links to the source data, and `SMA`

is calculated using our Lambda function. We can use `HSTACK`

to stack the point numbers, data values, and moving averages, then use `VSTACK`

to stack the headers on top of this `HSTACK`

ed block. Our formula is

```
=LAMBDA(array, num,
LET(
output, VSTACK(
{
"Point",
"Data",
"SMA"
},
HSTACK(
SEQUENCE(
ROWS(array)
),
array,
SMA(array, num)
)
),
output
)
)(A2:A26, K1)
```

This formula is entered into cell G1, and it produces the range used to plot the data.

Here is our data plotted with the Simple Moving Average.

Most moving averages are based on historical data, so they lag behind the data. Therefore the Simple Moving Average is a trailing measure. It does not reach as high or low as the original data, and it changes direction later than the original data. But it is also smoother than the original data. The more points being averaged, the greater the lag and the smoother the curve.

To illustrate how Dynamic Arrays can make a chart dynamic, let’s look at the following chart and DA source data. The top left cell uses the previous formula, but its final arguments are `(A2:A15, K1)`

, so it is only looking at 14 data points.

If all we do is change our arguments to `(A2:A26, K1)`

, the Dynamic Array expands to 25 points, and the chart automatically expands to include all 25 points. This behavior was astounding in Excel 2003 when its new Lists feature (renamed Tables in Excel 2007) allowed charts to react when their source data ranges expanded, and it is astounding in Excel 365 when combined with equally astounding Dynamic Arrays.

This link between a chart and a Dynamic Array is not as “smart” as the one between a chart and a Table. But if you are using an XY Scatter Chart, or if you are only making simple changes, like adding rows, the dynamics are amazing.

A Weighted Moving Average applied different weights to the data values to produce the averages. In principle you could apply whatever weights you want, but for a moving average, often a triangular pattern is used, which gives more emphasis to the most recent values. These weights are the array {1, 2, 3, … 7}. The last value is multiplied by 7, the next to last by 6, all the way to the seventh from last by 1; these are added together and divided by the sum of the weights, which is 28. These normalized weights for our 7-point moving average, the contribution of each value to the last average calculated, is shown below.

Here is my Weighted Moving Average Lambda formula; it is not much more complicated than the Simple Moving Average. As before, the data to be averaged, in A2:A26, and the number of points to average, in cell K1, are appended in parentheses to the Lambda formula.

```
=LAMBDA(arr, n,
LET(
wma, MAKEARRAY(
ROWS(arr),
1,
LAMBDA(r, c,
LET(
movarr, TAKE(TAKE(arr, r), -n),
m, ROWS(movarr),
weight, SEQUENCE(m) * 2 / m / (m + 1),
SUM(movarr * weight)
)
)
),
wma
)
)(A2:A26, K1)
```

I’ll go through the same progression as I did while developing the Simple Moving Average function above. If you get bored, feel free to scroll along at your own pace.

This formula is entered into cell C2, and here are the calculated Weighted Moving Averages.

We can follow the same procedure as above, using the Define Names dialog or the Advanced Formula Environment to define a Named Formula called WMA. We can then call this formula using a simple formula.

`=WMA(A2:A26,K1)`

This formula is entered into cell E2 and displays the same calculations as the longer LAMBDA in C2.

I didn’t need to duplicate the calculations, but I wanted to make sure they matched.

We can write another LAMBDA to stack together headers, point numbers, original data, and calculated Weighted Moving Averages.

```
=LAMBDA(array, num,
LET(
output, VSTACK(
{
"Point",
"Data",
"WMA"
},
HSTACK(
SEQUENCE(
ROWS(array)
),
array,
WMA(array, num)
)
),
output
)
)(A2:A26, K1)
```

Here is the result when we enter this formula into cell G1; we can use this range to plot the data.

Again, the duplication isn’t strictly necessary, but it serves to validate the results. In fact, in my original workbook I had extra columns with conventional formulas to calculate my moving averages.

Here is the data plotted with the Weighted Moving Average.

Like the Simple Moving Average, the Weighted Moving Average lags behind the original data while smoothing it. We’ll compare the various moving averages later.

An Exponential Moving Average is calculated using this formula:

EMA_{n} = k * X_{n} + (1 – k) * EMA_{n-1}

where X_{n} is the current value, EMA_{n} is the current calculation, EMA_{n-1} is the previous calculation, and k is the Exponential Smoothing Factor.

**Note:** The Exponential Moving Average (EMA) is sometimes referred to as the Exponentially Weighted Moving Average (EMWA).

The smoothing factor can be somewhat arbitrary, but when used in conjunction with N-point Simple or Weighted Moving Averages, it is often computed as

k = 2 / (N + 1)

For our example, k = 2 / (N + 1) = 2 / (7 + 1) = 2 / 8 = 0.25. This results in an effective weighting profile as shown below. The latest point is weighted the most, the previous point somewhat less, and earlier points increasingly less, but the weight never drops to zero.

We can see that the weighting fits an exponential distribution if we plot the weights on a scatter plot and draw a trendline (below left); on a semilog plot (below right), the trendline is a straight line.

Our LAMBDA uses SCAN to calculate the Exponential Moving Average. The first value is the first element of the input array, and each subsequent calculation multiplies the previous value by (1 – k) and adds it to k times the current value.

```
=LAMBDA(arr, n,
LET(
k, 2 / (n + 1),
ema, SCAN(
INDEX(arr, 1),
arr,
LAMBDA(a, b, a * (1 - k) + b * k)
),
ema
)
)(A2:A26, K1)
```

I’m repeating the same progression as with the Simple and Weighted Moving Average functions above. Feel free to jump ahead.

Entered into Cell C2, this formula produces the following calculations.

Let’s define another formula, EMA, to complete our family of Moving Average LAMBDAs. We call it like this:

`=EMA(A2:A26,K1)`

Entered into cell E2, we compute the moving averages as shown.

Again, we can build a LAMBDA to combine headers and columns of values into a single Dynamic Array.

```
=LAMBDA(array, num,
LET(
output, VSTACK(
{
"Point",
"Data",
"SMA",
"WMA",
"EMA"
},
HSTACK(
SEQUENCE(
ROWS(array)
),
array,
SMA(
array,
num
),
WMA(
array,
num
),
EMA(array, num)
)
),
output
)
)(A2:A26, I1)
```

The Dynamic Array appears when this formula is entered into cell G1.

That data can be plotted, to show the the Exponential Moving Average behaves in a similar way to the others, trailing the actual data while smoothing it out.

Like the Simple and Weighted Moving Averages, the Exponential Moving Average lags behind the original data.

Since we have already defined our Moving Average functions, we can write the following LAMBDA Dynamic Array formula to build a block of cells that we can use for a chart.

```
=LAMBDA(array, num,
LET(
output, VSTACK(
{"Point", "Data", "SMA", "WMA", "EMA"},
HSTACK(
SEQUENCE(ROWS(array)),
array,
SMA(array, num),
WMA(array, num),
EMA(array, num)
)
),
output
)
)(A2:A26, I1)
```

Entered into cell C1, the formula produces this data in the worksheet.

Here is that block of data plotted in an XY Scatter Chart.

All three of the Moving Averages lag changes in the data. The Simple Moving Average lags more than the others, because the weights for the last few values are lower than for the other averages. The Weighted Moving Average follows the data most closely, because it only includes the last seven points in its calculations, not all points in a long tail like the Exponential Moving Average.

If we look at a different data set, again roughly a sine wave but including many more points, and average more points, we can compare the three Moving Averages in more detail.

Our user-defined LAMBDA functions are saved in the workbook as Names, or Named Formulas. You can access them in the Name Manager on the Formulas tab of the ribbon (or using Ctrl+F3). The LAMBDAs are listed along with any other Names defined in the workbook. Select one to see a bit more of its definition in the Refers To box at the bottom (unfortunately it does not support line wrapping, right Microsoft?), and click the Edit button to see it in more detail (also not line-wrapped).

You can also use the Advanced Formula Editor, selecting names in the second row to view all of your Named LAMBDAs. Click on the pencil icon next to a LAMBDA to view it in more detail.

When you have created LAMBDA functions, they live in a specific workbook, and are not available in other workbooks. However, if you copy a worksheet from the workbook that has the LAMBDAs into a workbook without the LAMBDAs, your LAMBDAs go with the sheet into the second workbook. This is convenient, but also can cause confusion, as anyone knows if they have worked extensively with Names.

Moving Average, Weighted Moving Average, and Exponential Moving Average

How Is the Exponential Moving Average (EMA) Formula Calculated?

The post Lambda Moving Average Formulas appeared first on Peltier Tech.

]]>- Cornerstone Articles (Important and Popular Posts)
- Axis Labels
- Axis Scales
- Chart Events
- Combination Charts
- Conditional Formatting of Charts
- Custom Chart Types
- Data Labels
- Dynamic Arrays, LET, and LAMBDA
- Dynamic Charts
- Error Bars
- Floating Bars
- Gantt Charts
- Gaps
- Histograms
- Legends
- Marimekko Charts
- Pivot Tables and Charts
- SERIES Formula
- Slope Charts
- Statistical Process Control
- Statistics
- Tables
- Trendlines and Regression

- Good Chart Data
- The Excel Chart SERIES Formula
- Install an Excel Add-In
- Excel Waterfall Charts (Bridge Charts)
- Clustered and Stacked Column and Bar Charts
- Gantt Charts in Microsoft Excel
- Step Charts in Excel
- Fill Under or Between Series in an Excel XY Chart
- Conditional Formatting of Excel Charts
- Referencing Pivot Table Ranges in VBA
- Excel Interpolation Formulas
- Unlink Chart Data

- LAMBDA Function to Build Three-Tier Year-Quarter-Month Category Axis Labels
- Text Labels on a Horizontal Bar Chart in Excel
- Reciprocal Chart Axis Scale
- Custom Axis Labels and Gridlines in an Excel Chart
- Chart with a Dual Category Axis
- Using Pivot Table Data for a Chart with a Dual Category Axis
- Stagger Axis Labels to Prevent Overlapping
- Axis Labels on Small Charts
- Column Chart with Category Axis Labels Between Columns
- Axis Labels That Don’t Block Plotted Data
- Individually Formatted Category Axis Labels

- LAMBDA Function to Build Three-Tier Year-Quarter-Month Category Axis Labels
- Excel Chart with Square Gridlines
- Calculate Nice Axis Scales with LET and LAMBDA
- Calculate Nice Axis Scales in Your Excel Worksheet
- Calculate Nice Axis Scales in Excel VBA
- Chart UDF to Control Axis Scale
- How Excel Calculates Automatic Chart Axis Limits
- Reciprocal Chart Axis Scale
- Custom Axis Labels and Gridlines in an Excel Chart
- Custom Axis, Y = 1, 2, 4, 8, 16
- Logarithmic Axis Scales
- Link Excel Chart Axis Scale to Values in Cells
- Consistent Axis Scales Across Multiple Charts
- Gantt Chart with Nice Date Axis
- Select Meaningful Axis Scales
- Bar Chart Value Axis Scale Must Include Zero
- Text Labels on a Horizontal Bar Chart in Excel

- Chart Events in Microsoft Excel
- Chart Event to Highlight a Series
- Chart Event Class Module to Highlight a Series
- Highlight a Series with a Click or a Mouse Over

- Clustered Column and Line Combination Chart
- Precision Positioning of XY Data Points
- Add a Horizontal Line to an Excel Chart
- Horizontal Line Behind Columns in an Excel Chart
- Bar-Line (XY) Combination Chart in Excel
- Salary Chart: Plot Markers on Floating Bars
- Fill Under or Between Series in an Excel XY Chart
- Fill Under a Plotted Line: The Standard Normal Curve
- Excel Chart With Colored Quadrant Background

- Conditional Formatting of Excel Charts
- Conditional XY Charts Without VBA
- Invert if Negative Formatting in Excel Charts
- Conditional Donut Chart
- Highlight Min and Max Data Points in an Excel Chart
- Split Data Range into Multiple Chart Series without VBA
- Conditional Formatting of Lines in an Excel Line Chart Using VBA
- VBA Conditional Formatting of Charts by Value and Label
- VBA Conditional Formatting of Charts by Series Name
- VBA Conditional Formatting of Charts by Value
- VBA Conditional Formatting of Charts by Category Label

- Excel Waterfall Charts (Bridge Charts)
- Excel Box and Whisker Diagrams (Box Plots)
- Clustered and Stacked Column and Bar Charts
- Step Charts in Excel
- Cycle Plots in Excel
- Diverging Stacked Bar Charts
- See Marimekko Charts
- See Gantt Charts
- See Slope Charts

- Slope Chart with Data Labels
- Prevent Overlapping Data Labels in Excel Charts
- Apply Custom Data Labels to Charted Points
- Label Each Series in a Chart
- Label Last Point for Excel 2007
- Add Totals to Stacked Bar Chart
- Custom Axis Labels and Gridlines in an Excel Chart
- Text Labels on a Horizontal Bar Chart in Excel

- Improved Excel Lambda Moving Average
- Excel Lambda Moving Average
- LAMBDA Function to Build Three-Tier Year-Quarter-Month Category Axis Labels
- Dynamic Array Histogram
- Calculate Nice Axis Scales with LET and LAMBDA
- VBA Test for New Excel Functions
- Dynamic Arrays, XLOOKUP, LET – New Excel Features

- Dynamic Charts
- Easy Dynamic Charts Using Lists or Tables
- Dynamic Charts in Excel 2016 for Mac
- Dynamic Chart using Pivot Table and Range Names
- Dynamic Chart using Pivot Table and VBA
- Dynamic Ranges to Find and Plot Desired Columns
- Split Data Range into Multiple Chart Series without VBA
- VBA to Split Data Range into Multiple Chart Series
- Dynamic Chart Source Data (VBA)
- Dynamic Chart with Multiple Series
- Display One Chart Dynamically and Interactively

- Error Bars in Excel 2007 Charts
- Custom Error Bars in Excel Charts
- Floating Bars in Excel Charts
- Error Bar Gantt Chart
- Callout Labels with Error Bars
- Step Charts in Excel
- Swimmer Plots in Excel
- High-Low Line Alternatives in Excel Charts
- Custom Axis Labels and Gridlines in an Excel Chart
- Add a Horizontal Line to a Column or Line Chart: Error Bar Method
- Add a Vertical Line to a Column or Line Chart: Error Bar Method
- Hour by Hour Flow Chart
- Revenue Chart Showing Year-Over-Year Variances

- Floating Bars in Excel Charts
- Salary Chart: Plot Markers on Floating Bars
- Low-Medium-High Charts
- Excel Waterfall Charts (Bridge Charts)
- Excel Box and Whisker Diagrams (Box Plots)
- Microsoft Excel Stock Charts
- Candlestick Alternative: Individually Colored Up-Down Bars
- Swimmer Plots in Excel
- Colored Vertical Band Across an Excel Chart
- Stacked Column Charts that Cross the X Axis
- See Gantt Charts

- Gantt Charts in Microsoft Excel
- Gantt Chart with Nice Date Axis
- Error Bar Gantt Chart
- Repeated Gantt Chart to Track Players’ Ice Time
- Gantt Chart for Repeated Tasks
- Easier Gantt Chart for Repeated Tasks
- Gantt Chart with Repeated Tasks via Excel XY Chart

- Plot Blank Cells and #N/A in Excel Charts
- Mind the Gap – Charting Empty Cells (obsolete)
- Another approach to plotting gaps in Excel charts (obsolete)
- VBA Approaches to Plotting Gaps in Excel Charts (obsolete)
- Area Chart With Gap (obsolete)

- Dynamic Array Histogram
- Histogram With Normal Curve Overlay
- Histograms Using Excel XY Charts
- Histogram Using XY and/or Area Charts
- Filled Histograms Using Excel XY-Area Charts
- Histogram on a Value X Axis
- Histogram with Actual Bin Labels Between Bars
- Peltier Tech Histogram

- Order of Series and Legend Entries in Excel Charts
- Legend Entry Tricks in Excel Charts
- Legends in Excel Charts – Formats, Size, Shape, and Position
- Double Legend in a Single Chart
- Slim Down Those Fat Legend Keys

- Marimekko Charts
- The Problem with Marimekkos
- Marimekko Replacement – Overlapping Bars (Easy)
- Marimekko Replacement – Overlapping Bars (Hard)
- Marimekko Replacement – 2 by 2 Panel

- Using Pivot Tables in Microsoft Excel
- Referencing Pivot Table Ranges in VBA
- Pivot Table Conditional Formatting with VBA
- Grouping by Date in a Pivot Table
- Clean Up Date Items in An Excel Pivot Table
- Explore Your Data with Pivot Tables
- Preliminary Data Exploration with Excel Pivot Tables
- Prepare Your Data in a Chart Staging Area
- Working with Pivot Charts in Excel
- Pivot Chart Formatting Changes When Filtered
- Copy a Pivot Table and Pivot Chart and Link to New Data
- Making Regular Charts from Pivot Tables
- Update Regular Chart when Pivot Table Updates
- Create and Update a Chart Using Only Part of a Pivot Table’s Data
- Dynamic Chart using Pivot Table and Range Names
- Dynamic Chart using Pivot Table and VBA
- Using Pivot Table Data for a Chart with a Dual Category Axis

- The Excel Chart SERIES Formula
- Change Series Formula – Improved Routines
- How to Edit Series Formulas
- Simple VBA Code to Manipulate the SERIES Formula and Add Names to Excel Chart Series
- Edit Series Formulas
- Add One Trendline for Multiple Series
- Unlink Chart Data

- Slope Chart with Data Labels
- How to Make Slope Graphs in Excel
- Funny Conclusions from a Dual Pie Chart

- Watching my Weight with SPC (Statistical Process Control)
- SPC Approach to Browser Stats
- PBCharts Inflation Analysis
- Polynomial Fit vs. Statistical Process Control
- Use a Chart to See Patterns in Your Data
- Types of Control Charts
- Introducing Control Charts (Run Charts)
- Statistical Process Control

- Excel Box and Whisker Diagrams (Box Plots)
- See Histograms
- See Statistical Process Control
- See Trendlines and Regression

- Trendlines and Chart Types in Excel
- Add One Trendline for Multiple Series
- Trendline Calculator for Multiple Series
- Trendline Fitting Errors
- Choosing a Trendline Type
- Plot Two Time Series and Trendlines with Different Dates
- Polynomial Fit vs. Statistical Process Control
- Moving Averages
- Stacked Column Chart with Stacked Trendlines
- Deming Regression
- Deming Regression Utility
- LOESS Smoothing in Excel
- LOESS Utility for Excel

The post Index of Peltier Tech Blog appeared first on Peltier Tech.

]]>I recently wrote about how I wrote an Excel Lambda Moving Average formula. I started with internet searches, which led to formulas which didn’t work or formulas which were too complicated for me to understand. After a brief and amusing foray into ChatGPT, I built myself a workable formula that worked by generating a running sum, subtracting an earlier point’s running sum from the current point’s running sum, and dividing by the number of points. This worked fine for a simple N-point moving average, but it broke down for N-day moving averages where some days in the data set may be missing. And there are probably better ways to do the moving average without calculating running sums.

I developed the following LAMBDA moving average formula, as described in my previous post:

=LAMBDA(datarange,numpoints, LET( runsum,SCAN(,datarange,LAMBDA(a,b,a+b)), BYROW( SEQUENCE(ROWS(datarange)), LAMBDA(x, IF( x<numpoints, NA(), (INDEX(runsum,x)-IF(x=numpoints,0,INDEX(runsum,x-numpoints))) /numpoints) ) ) ) )

Let me partially deconstruct the formula below, showing internal calculations leading to the moving average. The `LAMBDA`

takes the data range and `numpoints`

, the number of points in the moving average, as arguments. The first column contains 25 values in the data range. Column 2 is the element number `x`

of the output array which is created using `BYROW`

. The third column contains the running sum, created using `SCAN`

. The fourth column is the same running sum offset by `numpoints`

. The fifth column shows the moving average calculation, which is the running sum minus the running sum `numpoints`

points ago divided by `numpoints`

. Where `x`

is less than `numpoints`

, the result is #`N/A`

, because there aren’t enough points to divide by `numpoints`

. Where `x`

equals `numpoints`

, zero is subtracted from the running sum, since the running sum of zero points is zero.

Here is how it looks in action. The data range is in B5:B29, the number of points being averaged is in C2, and the moving range formula is entered into cell C5 and spills down to C29. The first several calculated values are #N/A, until we have as many points as we are averaging (the number in C2).

I exchanged several comments on my earlier post with a smart reader named Henk-Jan van Well, who suggested several improvements to the moving average algorithm, starting with `CHOOSEROWS`

.

`CHOOSEROWS`

allows you to select which rows of an array to use. `CHOOSEROWS(array,3)`

returns the 3rd row of that array. `CHOOSEROWS(array,{2,4})`

returns the 2nd and 4th rows. Our approach uses `SEQUENCE`

to generate a list of row numbers to return, and we simply use `AVERAGE`

to, well, average these rows.

=LAMBDA(datarange,numpoints, MAKEARRAY( ROWS(datarange),, LAMBDA(r,c, IF( r<numpoints, NA(), AVERAGE( CHOOSEROWS( datarange, SEQUENCE(numpoints,,1+r-numpoints) ) ) ) ) ) )

Here is a deconstruction of the formula. Again, the `LAMBDA`

moving average takes the data range and `numpoints`

, the number of points in the moving average, as arguments. The data range is in the first column, the output row number `r`

is in the first row (I’m only showing the first 18 points, but you get the idea). For the first 4 points (less than `numpoints`

), the function returns `#N/A`

. After that, `CHOOSEROWS`

selects the indicated values from the data range. These are averaged in the last row, which is returned by the function.

The calculations and chart look the same as above.

Henk-Jan also mentioned an averaging scheme used by some econometricians for points that occur before the official number of points to average. I don’t like that definition of averaging early points, but I have an alternative, which simply averages however many points there are. This actually simplifies our formula.

=LAMBDA(datarange,numpoints, MAKEARRAY( ROWS(datarange),, LAMBDA( r,c, AVERAGE( CHOOSEROWS( datarange, SEQUENCE( MIN(r,numpoints),, MAX(1,1+r-numpoints) ) ) ) ) ) )

Here is a deconstruction of the new formula. The `LAMBDA`

takes the data range and `numpoints`

as arguments. The data range is in the first column, the output row number `r`

is in the first row (only showing the first 18 points). `CHOOSEROWS`

selects the indicated values from the data range; note that fewer than numpoints are selected for the first few columns. The selected values are averaged in the last row, which is returned by the function.

The results look the same as before, except that the moving average begins right at the first point.

The array functions TAKE and DROP allow you to keep or remove rows and columns from the beginning and end of an array. This is simpler than generating a list of row numbers to pass into CHOOSEROWS to define a moving array to average.

For each row r of the moving average array we create, we `TAKE`

the first r rows, then we `DROP`

all but `numpoints`

from the beginning (and if r is less than `numpoints`

, we don’t `DROP`

any rows).

=LAMBDA(datarange,numpoints, MAKEARRAY(ROWS(datarange),1, LAMBDA(r,c, LET( movingdatarange, DROP( TAKE(datarange,r), MAX(0,r-numpoints) ), AVERAGE(movingdatarange) ) ) ) )

I’ve only deconstructed a few points below. The `LAMBDA`

moving average takes the data range (which is `r`

rows tall) and `numpoints`

as arguments. The function uses TAKE to take the first `r`

rows, then uses DROP to drop the first `r`

–`numpoints`

rows, or zero if `r`

<`numpoints`

. For point 3, the function takes the first 3 rows, then drops zero rows, and returns the average of these three values. For point 10, the function takes the first 10 rows, drops 5 rows (`r`

–`numpoints`

= 10-5), and averages these 5 (`numpoints`

) rows. For point 23, the function takes the first 23 rows, drops 18 rows (`r`

–`numpoints`

= 23-5), and averages these 5 (`numpoints`

) rows.

The result looks the same, with the moving range starting right at the beginning. The LAMBDA formula may be a bit shorter, but it also seems more reliable when the number of points may vary (as when averaging by dates below).

An easier alternative is the following, where we `TAKE`

the first r rows for each array element, then `TAKE`

the last `numpoints`

rows of this. If you try to `TAKE`

more than the number of rows in an array, you simply get the entire array.

=LAMBDA(datarange,numpoints, MAKEARRAY(ROWS(datarange),1, LAMBDA(r,c, LET( movingdatarange, TAKE( TAKE(datarange,r), -numpoints ), AVERAGE(movingdatarange) ) ) ) )

Again I’ve only deconstructed a few points below. The `LAMBDA`

takes the data range (which is `r`

rows tall) and `numpoints`

as arguments. The function uses TAKE to take the first `r`

rows, then uses `TAKE`

again to take the last `numpoints`

rows; if `r`

<`numpoints`

then `TAKE`

only takes as many rows as are available. For point 3, the function takes the first 3 rows, then takes all available rows, and returns the average of these three values. For point 10, the function takes the first 10 rows, takes the last 5 (`numpoints`

) of these rows, and averages these rows. For point 23, the function takes the first 23 rows, then keeps the last 5 (`numpoints`

) of these rows, and averages these rows.

Same results, but the function is a few characters shorter and may be easier to understand.

If a range has values for all dates, then a moving average by date is the same as a moving average by point. A seven-day moving average is a seven-point moving average. But if some dates are missing, then a seven-point moving average will average points from more than seven days. I want to average only data which falls within a seven-day window, which means averaging fewer than seven points if some days are missing. The screenshot below illustrates the problem. There will be a missing value when a measurement is not made on a given day. Note for example, that 1/9/2023 and 1/10/2023 are missing.

Here is the `LAMBDA`

formula, which takes the dates, the data values to be averaged, and `numdays`

, the number of days to average, as inputs.

=LAMBDA(daterange,datarange,numdays, MAKEARRAY(ROWS(datarange),1, LAMBDA(r,c, LET( firstdate, XMATCH( INDEX(daterange,r)-numdays+1, daterange, 1), movingdatarange, DROP( TAKE(datarange,r), MAX(0,firstdate-1) ), AVERAGE(movingdatarange) ) ) ) )

In the first section of the deconstruction below I show row number (tan-shaded), the end date for that average, which is the date for that row number, the allowed start date, which is the end date minus `numdays`

+1, and the actual start date, which is the earliest date on or after the allowed start date. The blue shaded dates show which rows have different allowed and actual start dates.

In the second section of the deconstruction, I show dates (gold-shaded) and values in the first two columns. In the first row is the row number `r`

of the output array (tan-shaded). The remaining columns show the dates that fall between the allowed start date and end date for each row. The green-shaded columns show which columns contain fewer than `numdays`

dates. Not very many of the rows actually average `numdays`

values.

The LAMBDA moving average formula uses the `TAKE`

/`DROP `

approach from the previous section to average the appropriate values.

Below is the output of the formula.

Here is comparison of a 7-point (orange line) and a 7-day (blue line) moving average. The blue shaded cells show where the two are not the same.

This is the case when I track my weight: while traveling or when I wake up too late or am just too busy in the morning to weigh myself, then I don’t have a row for the date I’ve missed. This formula treats such occasions nicely.

You can click on this link to download a workbook that contains these examples: Improved Excel Lambda Moving Average.xlsx.

- Excel Lambda Moving Average
- LAMBDA Function to Build Three-Tier Year-Quarter-Month Category Axis Labels
- Dynamic Array Histogram
- Calculate Nice Axis Scales with LET and LAMBDA
- VBA Test for New Excel Functions
- Dynamic Arrays, XLOOKUP, LET – New Excel Features

The post Improved Excel Lambda Moving Average appeared first on Peltier Tech.

]]>When averaging time-series data, you often want to smooth out peaks and valleys. A moving average is an easy way to smooth your data. When I track my weight, for example, I use a 7-day moving average. This smooths out peaks associated with weekends when I might go out to eat and enjoy a beer or two.

The image below shows 25 random data points and a five-point moving average. The points were generated with this Dynamic Array formula in cell B5:

=RANDARRAY(25,,0,10,TRUE)

and the moving average was calculated with this formula in cell C5, filled down to C29:

=IF( COUNT(OFFSET($B5,0,0,-$C$2,1))=$C$2, AVERAGE(OFFSET($B5,0,0,-$C$2,1)), NA() )

*When building large, complicated LAMBDA formulas, it has become common to enhance the readability of the formulas with line feeds (use Alt+Enter to insert a line feed in the Formula Bar) and spaces. I find it helps with older formulas as well.*

This moving average formula needs to be placed in each row of the moving average range. If it’s in a Table, that’s no big deal, because adding more data will automatically fill the formula into added Table rows.

But the data was the result of a Dynamic Array formula in just cell B5, and the output spilled down as far as the formula required. It would be nice to build a Dynamic Array formula for moving average which is written just in cell C5 but spills down as far as the Dynamic Array it averages.

There are many formulas you can use to calculate a moving average, using variations of INDEX and OFFSET formulas. Incidentally, if you don’t need the moving average values in the worksheet, you can use an Excel chart’s trendline feature to display the moving average.

I tried my hand at writing my own formula and got stuck almost immediately. I searched Bingle to see what I could find.

I found a lot of possible answers. The ones that seemed easy didn’t work. The ones that worked were very complicated, and I really didn’t understand them very well. I finally settled on one from Lambda Moving Average – calculate rolling sum in Excel (and much more). The original version of this function included a parameter that lets you choose whether to calculate a moving average or other moving statistical functions. I cleaned out all the other functions and was left with the moving average below:

=LAMBDA(x,window, LET( _x, x, _w, window, _thk, LAMBDA(x, LAMBDA(x)), _fn, _thk(LAMBDA(x, AVERAGE(x))), _i, SEQUENCE(ROWS(x)), _s, SCAN( 0, _i, LAMBDA(a, b, IF(b < _w, NA(), _thk(MAKEARRAY(_w, 1, LAMBDA(r, c, INDEX(_x, b - _w + r)))))) ), _out, SCAN(0, _i, LAMBDA(a, b, _fn()(INDEX(_s, b, 1)()))), _out ) )

As I said, it works fine, but I don’t really understand how it works. I’m reluctant to include it in a project for a client if I don’t grok it, but I’ve implemented it in some of my own workbooks. It seems to run slowly, probably because for each element of the original array, it generates a subset of that array to calculate an average. For an array with hundreds of points, that adds up to hundreds of smaller arrays.

I also went to ChatGPT to see what it could tell me. It showed me lots of code samples and several formulas that resembled Excel formulas. But many formulas had errors, and no formula that had no errors returned a moving average.

I couldn’t wrap my tired, old brain around the algorithm above, but sometimes my brain gets bored, looks out the window, and surprises me with what it comes up with. And my approach isn’t rocket science, but it is less cumbersome than creating multitudes of arrays which all include partial duplicates of the original array’s values. I can calculate the running sum of the original array, subtract the running sum from an earlier row, and divide by the number of points, and I’ll have my moving averages.

First, I’ll show how it works. Here’s my data in the second column below. For reference I’ve inserted a sequence number in the first column. The gold-shaded range in the third column contains the running sum of the data in the second column. The fourth column contains the same shaded running sum, offset by 5 rows so I can compute my 5-point moving average. The first four cells of a 5-point moving average are not calculated; I’ve entered #N/A so they are not plotted.

Delta is the difference between the two running sums, that is, the intermediate moving sum, and Average is Delta divided by 5.

The first 5-point average is calculated for the 5th value, where the running sum is 39: 39 divided by 5 is 7.8.

The second calculation is for the 6th point, where the running sum is 46. But we only want the sum of points 2 through 6, so we subtract the running sum for point 1, which is 8. (46-8)/5 is 7.6. And so on.

Now let’s get it into a single formula.

The following range illustrates the steps toward building the formula; several steps are just me learning how some new Excel functions work. Column B contains the original formula, and column C is my old-style formula-in-every-cell to calculate the moving average. I’ve hidden columns D:I.

Column J spits out the original data range. This isn’t necessary in the final formula, but I was gaining confidence with `BYROW`

. `BYROW`

works by defining an array, passing it row by row into a `LAMBDA`

function, and building an array of the results of that `LAMBDA`

for each row. The array passed in is a simple `SEQUENCE`

, from 1 to the number of rows in the original data range. Each element of the sequence is passed as `x`

into `LAMBDA`

, which returns the `x`

th element of the data range.

=LET( datarange,B5#, BYROW( SEQUENCE(ROWS(datarange)), LAMBDA(x,INDEX(datarange,x)) ) )

The running sum in column K is easy to calculate with the new Dynamic Array helper functions. Like `BYROW`

, `SCAN`

passes each element of an array into a `LAMBDA`

function, which calculates each element of the output array, Again, this isn’t strictly necessary, but I was learning about `SCAN`

.

The first argument of `SCAN`

is the starting value (zero since it’s missing), the second is the original data range. `LAMBDA`

accepts the starting value `a`

and the data value `b`

then applies the function `a+b`

to generate the output value. This output becomes the new starting value `a`

, which is added to the next data value `b`

, etc.

=LET( datarange,B5#, SCAN(,datarange,LAMBDA(a,b,a+b)) )

The running sum in column L is a bit more convoluted, but it’s leading to my ultimate formula. `SCAN`

is used as above to generate the running sum, but instead of spitting it out into the worksheet, it is stored in the name `runsum`

. Then `BYROW`

is used to return each element of `runsum`

.

=LET( datarange,$B$5#, runsum,SCAN(,datarange,LAMBDA(a,b,a+b)), BYROW( SEQUENCE(ROWS(datarange)), LAMBDA(x,INDEX(runsum,x)) ) )

If I can get the `x`

th element of `runsum`

, I can also get the element `numpoints`

before that and subtract it. If `x`

is less than `numpoints`

, this corresponds to an early point which displays `#N/A`

. If `x`

is equal to `numpoints`

, it’s the first calculated value, and there is no running sum to subtract, so I subtract zero. After subtracting to get the intermediate sum, I divide by `numpoints`

to get the average.

=LET( datarange,$B$5#, numpoints,$C$2, runsum,SCAN(,datarange,LAMBDA(a,b,a+b)), BYROW( SEQUENCE(ROWS(datarange)), LAMBDA(x, IF( x<numpoints, NA(), (INDEX(runsum,x)-IF(x=numpoints,0,INDEX(runsum,x-numpoints))) /numpoints) ) ) )

I can rewrite this as a `LAMBDA`

:

=LAMBDA(datarange,numpoints, LET( runsum,SCAN(,datarange,LAMBDA(a,b,a+b)), BYROW( SEQUENCE(ROWS(datarange)), LAMBDA(x, IF( x<numpoints, NA(), (INDEX(runsum,x)-IF(x=numpoints,0,INDEX(runsum,x-numpoints))) /numpoints) ) ) ) )($B$5#,$C$2)

Note that the `LAMBDA`

formula above ends with `($B$5#,$C$2)`

, which is how these arguments are entered into the formula. To make this a reusable function, copy the formula without these arguments and their parentheses. Go to Formulas tab > Define Name. Enter a function name and short description, paste the formula into the Refers To box, and press Enter.

Using the Define Name method is suboptimal. The entire LAMBDA function cannot be viewed at once, and you lose the line feeds and white space. You could also use the Advanced Formula Environment, a free add-in from Microsoft.

You can now use this formula throughout the workbook using this simple syntax:

=MovingAverage(B5#,C2)

- Dynamic Array Histogram
- VBA Test for New Excel Functions
- Calculate Nice Axis Scales with LET and LAMBDA
- Dynamic Arrays, XLOOKUP, LET – New Excel Features

The post Excel Lambda Moving Average appeared first on Peltier Tech.

]]>A certain data layout can produce a chart axis which divides and subdivides the categories into logical subcategories, such as years, quarters, and months in the following chart.

Generally, this data layout must be produced by hand, because it relies on an arrangement of filled and blank cells to help Excel parse the data into subcategories. In this post I’ll show how a LAMBDA formula can build the range for you.

Earlier this month I had the honor and pleasure of participating in Excel Days 2022, held in Sofia, Bulgaria. On one day I held my Advanced Excel Charting Masterclass for a group of about 25; on another I presented a conference session entitled “The Best Excel Charting Tips and Tricks” to a crowd of 300 or more. That’s me standing in front of the largest LED screen I’ve ever used.

One of the tips was that using a blank top-left cell helps Excel parse the X values, Y values, and series names (highlighted purple, blue, and red respectively in the screenshot below), especially if the X values and series names consist of numbers (years, for example). The top-left cell, or TLC, is filled with light gold.

Many people know about this TLC trick.

What I call TLC+ extends this approach to identify multiple columns to use as X values, using blank cells to identify grouping of categories and subcategories. This example groups months into quarters. Again, blank cells are filled with light gold, and the axis categories and subcategories are highlighted with the same color.

Many people don’t know about grouping axis categories in this way, although they’ve likely seen it in a pivot chart that has multiple fields in the rows area of its pivot table. So this trick was new to much of my audience.

TLC+ doesn’t stop there. The next example shows months grouped into quarters, and quarters grouped into years. I’ve applied the same highlighting as above.

This grouping can be extended much farther; the limit is not with Excel’s technical capability but more with the legibility of the output. Once I built a chart with 9 or 10 levels of grouping in its category axis. By this point, the chart consisted of a very wide category axis and very little room for data in the plot area.

Speaking of legibility, those month abbreviations are difficult to read, since they have been rotated upward. But you can use a one-letter abbreviation for each month, by entering an actual date in the cells, and applying the custom number format below. One “m” in the custom format string results in a one- or two-digit month number; two results in a two-digit month number with a leading zero if necessary. Use three m’s for the three-letter month abbreviation, and four m’s for the full month name. Finally, five m’s will give you just the first initial of each month. This may be confusing if only one or two months are abbreviated this way, since there are multiple months for J, M, and A. But when a whole year of months is abbreviated like this, people recognize the months.

This “mmmmm” trick amazed one audience member so much that she tested whether “ddddd” would produce the one-letter abbreviations of the days of the week. She reported that, sadly, it does not.

This is the same data as above, with one-letter month abbreviations that improve legibility of the axis labels.

Those are three great tips: top-left cell, category label grouping, and the custom number format for one-letter month abbreviations. But what is the most important charting trick?

As I am energetically explaining below, the most important charting trick is to get the data right before you make your chart.

I promised a formula to construct a grouped data range for a chart’s category axis. Here is how I built my LAMBDA.

My initial approach was to build a string of years, quarters, and months delimited by commas and semicolons, then use the recently introduced TEXTSPLIT function to break that into the required grid.

I typed a long string into cell B2 below, and this formula in cell B5 produced the grouped dates in B5:D28 (with a blue border intended to resemble the Dynamic Array formula highlighting).

`=TRANSPOSE(TEXTSPLIT(B2,",",";",FALSE))`

Some people like to use spaces to make formulas easier to read; Excel ignores the spaces.

`=TRANSPOSE(TEXTSPLIT(B2 , "," , ";" , FALSE))`

Some will even write the formula on multiple lines for clarity, with spaces to provide indentation; Excel ignores the line feeds as well as the spaces. You can press Alt+Enter to insert a line feed within a formula. I’m not sure extra spaces and line feeds are necessary for this relatively simple formula, but they will help in a little while.

```
=TRANSPOSE(
TEXTSPLIT(
B2 ,
"," ,
";" ,
FALSE
)
)
```

The chart was a test that the output would work for my axis groupings.

My initial thought was to input an initial year and number of years into my LAMBDA, construct the long, delimited string, then TEXTSPLIT it. But I did a little experimenting first.

I used the DATE function to produce a list of months. DATE(year,month,day) produces a numerical date for month/day/year (or day/month/year outside the US). To get sequential months, I used the SEQUENCE function in the month argument of the DATE function.

All I needed was the initial month, as shown below. Using year 2021 and months 1 through 12, I get January through December of 2021. But them month 13 spills into the next year, giving me January of 2022. This makes it easier than having to calculate when the year turned over.

The screenshot below shows my approach. I listed the months in column B. In column D, I listed the year, but only if the month was January. In column E I listed the quarter, but only for the first month of each quarter. In column F I listed the first initial of each month using the “mmmmm” custom number format.

Column G holds some dummy data, so I could test out a chart, and it worked nicely.

I needed to condense all of this into one formula, so I started with the following LET function. I input the first year and number of years, then calculated my list of dates. Then I did a bit of arithmetic to calculate quarter number from the month: January is quarter number 1, February is 1-1/3, March is 1-2/3, April is 2, etc.

The CHOOSE function’s syntax the way I used it is something like this:

```
CHOOSE({column 1, column 2, column 3},
formula for column 1,
formula for column 2,
formula for column 3)
```

The CHOOSE function is a great way to build up a multiple-column Dynamic Array.

```
=LET(
firstyear,2021,
numyears,2,
dates,DATE(firstyear,SEQUENCE(12*numyears),1),
qtr,(MONTH(dates)+2)/3,
CHOOSE(
{1,2,3},
IF(MONTH(dates)=1,TEXT(dates,"yyyy"),""),
IF(qtr=INT(qtr),"Q"&qtr,""),
TEXT(dates,"mmmmm")
)
)
```

This milti-line, indented formula is much easier to read than the no-white-spaces version.

This formula produces the desired output.

I rearranged my LET to produce the following LAMBDA function. When used in a cell like this, the inputs to the LAMBDA are enclosed in parentheses after the LAMBDA’s closing parenthesis. The LAMBDA produces output identical to the LET output above.

```
=LAMBDA(
firstyear,numyears,
LET(
dates,DATE(firstyear,SEQUENCE(12*numyears),1),
qtr,(MONTH(dates)+2)/3,
CHOOSE(
{1,2,3},
IF(MONTH(dates)=1,TEXT(dates,"yyyy"),""),
IF(qtr=INT(qtr),"Q"&qtr,""),
TEXT(dates,"mmmmm")
)
)
)
(2021,2)
```

Here is how to convert the LAMBDA into a reusable user-defined function. Click Define Name on the Formulas tab to open the New Name dialog. Give the function a descriptive name, document it with a comment, then enter the LAMBDA formula (without the trailing inputs) into the Refers To box, and click Enter.

To simplify creating, editing, and deploying LAMBDA functions, Microsoft has introduced the Advanced Formula Environment. You can get it from Insert > Get Add-Ins. I have not used it here.

When you start typing the LAMBDA name into a cell, Excel’s IntelliSense suggests the full name of the function, and shows the comment, just like any built-in function.

Click the tab key and Excel fills in the function name with the opening parenthesis, and shows what arguments are expected.

I’ve used my LAMBDA function in cell B3 below, and the output works just fine in the chart. I’ve hard-coded 2020 and 3 as the starting year and number of years, but I could have linked to cells with these values.

Here I am showing how a chart looks when the top-left cell is not blank.

- Calculate Nice Axis Scales with LET and LAMBDA
- Dynamic Array Histogram
- VBA Test for New Excel Functions

- Text Labels on a Horizontal Bar Chart in Excel
- Reciprocal Chart Axis Scale
- Custom Axis Labels and Gridlines in an Excel Chart
- Chart with a Dual Category Axis
- Using Pivot Table Data for a Chart with a Dual Category Axis
- Stagger Axis Labels to Prevent Overlapping
- Axis Labels on Small Charts
- Column Chart with Category Axis Labels Between Columns
- Axis Labels That Don’t Block Plotted Data
- Individually Formatted Category Axis Labels

- Excel Chart with Square Gridlines
- Calculate Nice Axis Scales with LET and LAMBDA
- Calculate Nice Axis Scales in Your Excel Worksheet
- Calculate Nice Axis Scales in Excel VBA
- Chart UDF to Control Axis Scale
- How Excel Calculates Automatic Chart Axis Limits
- Reciprocal Chart Axis Scale
- Custom Axis Labels and Gridlines in an Excel Chart
- Custom Axis, Y = 1, 2, 4, 8, 16
- Logarithmic Axis Scales
- Link Excel Chart Axis Scale to Values in Cells
- Consistent Axis Scales Across Multiple Charts
- Gantt Chart with Nice Date Axis
- Select Meaningful Axis Scales
- Bar Chart Value Axis Scale Must Include Zero
- Text Labels on a Horizontal Bar Chart in Excel

The post LAMBDA Function to Build Three-Tier Year-Quarter-Month Category Axis Labels appeared first on Peltier Tech.

]]>**tldr;** Stick to XY Scatter charts if you need trendlines for your data. Line charts may misrepresent the relationships in your data.

A user had problems with my Trendline Calculator for Multiple Series and sent me his workbook. It turns out, he was using the program on a Line chart, and I recalled that Line charts can have problems when calculating trendlines.

Here is some simple data, plotted in a Line chart. A trendline has been calculated, and the formula and R² are shown in the chart. Wow, that’s a very nice straight line fit, with all points exactly on the line.

But wait! Look at the X axis: the labels go from 1 to 2, then to 5 and 6, then to 9 and 10, and the spacing between labels is equal! That happens because unless Excel recognizes the X values as dates, it treats them as non-numeric text labels, and spaces them evenly as categories across the chart. Excel also uses the category number, not the inherent numerical value of the category label, when calculating the trendline formula. So Excel uses 1 through 6 as X values, which match the Y values perfectly, resulting in a perfect fit to the formula `y = x`

.

The solution, of course, is to use an XY Scatter chart instead. X and Y values are all treated as numeric, and these numbers are used as is when calculating the trendline formula. The horizontal spacing of the points matches the X values, and we see that while the fit is rather close, the points are not all exactly aligned with the trendline.

Another solution would be to format the X axis as a Date axis. This forces Excel to treat the numbers numerically, even though they are not dates. Now the horizontal spacing of the points matches their true values, and the trendline matches the one calculated for the YX chart above.

But using an XY Scatter chart type is better than using a line chart in most cases, even with a Date axis (unless you are working with dates). The X axis does not begin at zero, for one thing. Other chart formatting is designed for line charts (you can change the formatting, but why bother?).

When inserting a chart, a user encounters a set of icons similar to those below. The Line chart icon shows data points connected by lines, while the XY Scatter icon shows points without connecting lines. Through inexperience or haste, I think a lot of people insert Line charts because they want lines connecting their points.

But this is not the difference between Line and XY charts. In both cases, you can format your data with or without markers, with or without connecting lines. The difference is in how the two chart types treat X values. As we saw above, Line charts will treat numerical X values as non-numeric labels, potentially spoiling your whole day.

- Add One Trendline for Multiple Series
- Trendline Calculator for Multiple Series
- Trendline Fitting Errors
- Choosing a Trendline Type
- Plot Two Time Series And Trendlines With Different Dates
- Polynomial Fit vs. Statistical Process Control
- Moving Averages
- Stacked Column Chart with Stacked Trendlines
- Deming Regression
- Deming Regression Utility
- LOESS Smoothing in Excel
- LOESS Utility for Excel

The post Trendlines and Chart Types in Excel appeared first on Peltier Tech.

]]>There are occasions when you may want to break the link between a chart and its underlying data. Maybe you copied the chart from another workbook, and you no longer have access to that workbook. Maybe you want to avoid the headaches that may arise from pasting a chart into PowerPoint or another program. Maybe you’re just tired of seeing this warning when you open the file:

There are several ways to disconnect your chart from its data source.

First let’s review chart data. I’ve written a lot about chart data, including

- Good Chart Data – The definitive description
- The Excel Chart SERIES Formula – also definitive
- Change Series Formula – Improved Routines
- How to Edit Series Formulas
- Simple VBA Code to Manipulate the SERIES Formula and Add Names to Excel Chart Series
- Edit Series Formulas

Below is a simple chart. A series is selected so the SERIES formula appears in the formula bar and the ranges in the formula are highlighted in the worksheet.

The SERIES formula looks like this:

`=SERIES(Sheet1!$C$2,Sheet1!$B$3:$B$8,Sheet1!$C$3:$C$8,1)`

The arguments in the formula describe the sources of the data.

`=SERIES(Series_Name,X_Values,Y_Values,Plot_Order)`

`Series_Name`

can be a link to a worksheet range, text (enclosed in double quotes), or blank.`X_Values`

can be a link to a worksheet range, an array enclosed in curly braces, or blank (and the chart will use 1, 2, 3, … for its X values).`Y_Values`

can be a link to a worksheet range, or an array enclosed in curly braces.`Plot_Order`

is an whole number between 1 and the number of series in the chart, signifying the order in which the series is drawn (complicated by chart type and axis group).

The cell addresses in the SERIES formula always use absolute references, such as $A$1, not relative references, like A1. But if you manually type relative references and press Enter, they will be converted to absolute references. The addresses always include the worksheet name.

If the chart links to data in another open Excel workbook, the SERIES formula includes the workbook name in square brackets before the worksheet name.

`=SERIES('[Data Source.xlsm]Sheet1'!$C$2,'[Data Source.xlsm]Sheet1'!$B$3:$B$8,'[Data Source.xlsm]Sheet1'!$C$3:$C$8,1)`

If the chart links to data in a closed Excel workbook, the SERIES formula includes the path, then the workbook name in square brackets, and finally the worksheet name.

`=SERIES('C:\Long Path[Data Source.xlsx]Sheet1'!$C$2,'C:\Long Path[Data Source.xlsx]Sheet1'!$B$3:$B$8,'C:\Long Path[Data Source.xlsx]Sheet1'!$C$3:$C$8,1)`

One way to represent an unlinked chart is to copy a picture of the chart, then paste it where desired.

Select the chart, then on the Home tab of Excel’s ribbon, under the Copy dropdown, select Copy as Picture…

… then select the appropriate options (usually Bitmap instead of Picture; I haven’t been able to figure out the difference between on screen vs. as printed) …

Then go to the other application, and Paste.

The disadvantage to this technique is that the pasted picture is no longer an Excel chart. You can no longer format any of the chart elements (rescale the axes, change marker styles or colors, etc.). Therefore, this method is unsuitable for use within Excel.

You can unlink chart data and still retain the actual chart with its formatting capabilities by editing the SERIES formula. Recall that the series formula in our first chart above was:

`=SERIES(Sheet1!$C$2,Sheet1!$B$3:$B$8,Sheet1!$C$3:$C$8,1)`

where the arguments referred to various links to the series data

`=SERIES(Series_Name,X_Values,Y_Values,Plot_Order)`

Select the series so that the SERIES formula appears in the formula bar, click in the formula bar so that the cursor is in the formula, and press F9. This keystroke converts references in the formula to their values:

`=SERIES("My Data",{"Jan","Feb","Mar","Apr","May","Jun"},{93,76,116,286,225,327},1)`

`Series_Name`

becomes `"MyData"`

, `X_Values`

becomes `{"Jan","Feb","Mar","Apr","May","Jun"}`

, and `Y_Values`

becomes `{93,76,116,286,225,327}`

. `Plot_Order`

is unchanged, of course, because it can only be a number, never a reference.

Press Esc to revert to the formula with references, or press Enter to keep the formula with hard-coded values.

If you select just one of the references in the formula, the F9 key only converts that reference to its value. These SERIES formulas are all valid:

`=SERIES("My Data",Sheet1!$B$3:$B$8,Sheet1!$C$3:$C$8,1)`

=SERIES(Sheet1!$C$2,{"Jan","Feb","Mar","Apr","May","Jun"},Sheet1!$C$3:$C$8,1)

=SERIES(Sheet1!$C$2,Sheet1!$B$3:$B$8,{93,76,116,286,225,327},1)

=SERIES(Sheet1!$C$2,{"Jan","Feb","Mar","Apr","May","Jun"},{93,76,116,286,225,327},1)

Any repetitive task that you can do manually, VBA can do faster with much less tedium.

To unlink chart data from all series in the active chart, simply run this code:

Sub DelinkChartFromData0() If ActiveChart Is Nothing Then MsgBox "Select a chart and try again", vbExclamation, "No Active Chart" Else Dim srs As Series For Each srs In ActiveChart.SeriesCollection ' Convert X Values to arrays of values srs.XValues = srs.XValues ' Convert Y Values to arrays of values srs.Values = srs.Values ' Convert series name to text srs.Name = srs.Name Next srs End If End Sub

That’s nice enough, but I like to adjust a sub like this by including an argument, so I can pass in the object I want to process from any entry point. The function corresponding to this is

Sub DelinkChartFromData1(cht As Chart) Dim srs As Series For Each srs In cht.SeriesCollection ' Convert X Values to arrays of values srs.XValues = srs.XValues ' Convert Y Values to arrays of values srs.Values = srs.Values ' Convert series name to text srs.Name = srs.Name Next srs End Sub

To process the active chart, I would call it with this entry point:

Sub DelinkActiveChartFromData1() If Not ActiveChart Is Nothing Then DelinkChartFromData1 ActiveChart End If End Sub

To process all charts on the active sheet, I would use this:

Sub DelinkAllChartsFromData1() Dim chob As ChartObject For Each chob In ActiveSheet.ChartObjects DelinkChartFromData1 chob.Chart End If End Sub

To select on or more charts to process, and ignore the rest, I use this approach:

Sub DelinkSelectedChartsFromData1() If Not ActiveChart Is Nothing Then DelinkChartFromData1 ActiveChart ElseIf TypeName(Selection) = "DrawingObjects" Then Dim shp As Shape For Each shp In Selection.ShapeRange If shp.HasChart Then DelinkChartFromData1 shp.Chart End If Next End If End Sub

In fact, the last sub is all I need, since it does the active chart if there is one, replacing `DelinkActiveChartFromData1`

, and it does any selected charts, so I could select all charts and run it to mimic `DelinkAllChartsFromData1`

.

I was working on an example to show that it works for chart with lots of data points, and I happened to use dates as my X values.

I used `DelinkSelectedChartsFromData1`

and got the resulting chart.

The SERIES formula is pretty long, but is nowhere near the limit (see Excel Chart Series Size Limits).

`=SERIES("Value",{36526,36557,36586,36617,36647,36678,36708,36739,36770,36800,36831,36861,36892,36923,36951,36982,37012,37043,37073,37104,37135,37165,37196,37226,37257,37288,37316,37347,37377,37408,37438,37469,37500,37530,37561,37591,37622,37653,37681,37712,37742,37773,37803,37834,37865,37895,37926,37956,37987,38018,38047,38078,38108,38139,38169,38200,38231,38261,38292,38322,38353,38384,38412,38443,38473,38504,38534,38565,38596,38626,38657,38687,38718,38749,38777,38808,38838,38869,38899,38930,38961,38991,39022,39052,39083,39114,39142,39173,39203,39234,39264,39295,39326,39356,39387,39417,39448,39479,39508,39539,39569,39600,39630,39661,39692,39722,39753,39783,39814,39845,39873,39904,39934,39965,39995,40026,40057,40087,40118,40148,40179,40210,40238,40269,40299,40330,40360,40391,40422,40452,40483,40513,40544,40575,40603,40634,40664,40695,40725,40756,40787,40817,40848,40878,40909,40940,40969,41000,41030,41061,41091,41122,41153,41183,41214,41244,41275,41306,41334,41365,41395,41426,41456,41487,41518,41548,41579,4160`

9,41640,41671,41699,41730,41760,41791,41821,41852,41883,41913,41944,41974,42005,42036,42064,42095,42125,42156,42186,42217,42248,42278,42309,42339,42370,42401,42430,42461,42491,42522,42552,42583,42614,42644,42675,42705,42736,42767,42795,42826,42856,42887,42917,42948,42979,43009,43040,43070,43101,43132,43160,43191,43221,43252,43282,43313,43344,43374,43405,43435,43466,43497,43525,43556,43586,43617,43647,43678,43709,43739,43770,43800,43831,43862,43891,43922,43952,43983,44013,44044,44075,44105,44136,44166,44197,44228,44256,44287,44317,44348,44378,44409,44440,44470,44501,44531,44562,44593,44621,44652,44682,44713},{1.59180141468432,2.98367529941487,3.53968832672474,4.166896463019,5.49680350931305,6.60295325255014,7.92084805050608,8.42969525036707,9.03114848703803,10.5276619316506,11.6955893543266,12.7967815520366,13.9389611218954,14.7527663401269,15.056329735329,16.5993323777924,17.3515548037874,18.5751579754327,19.5310520490484,20.1024924707731,21.0585061447489,22.5723596205116,23.6423776067453,24.1433918431237,25

.8809999743178,26.9737430334888,27.2520490776847,28.9524027272893,29.1845029748314,30.9521946135505,31.6902106518488,32.6795741650195,33.0489731513874,34.3260591677559,35.6901951915975,36.5728985726155,37.399073960589,38.3469907111962,39.9779540070637,40.6458619797685,41.8270884435173,42.4800724464723,43.5366272985535,44.8875676117536,45.2053138791033,46.1944737321369,47.3914095233474,48.8633128550379,49.5985742034472,50.7164885501005,51.653002976414,52.173798695412,53.797369677567,54.336065160202,55.2589150631074,56.4371639546982,57.3504724537478,58.7576090168706,59.8397486773469,60.5570284284911,61.2384044080596,62.8073951248627,63.0734139584437,64.023944739976,65.4569502217557,66.441196439216,67.8040167607499,68.4732296280772,69.383544349419,70.6736709532193,71.8772550111681,72.7285185169436,73.0304159679703,74.4543215360901,75.6139703970569,76.6795157883573,77.9474917376374,78.8675574588127,79.2434751623046,80.8059856927601,81.9149434410426,82.0822571567436,83.1165049108337,84.5766331546624,85.0485197930

954,86.4516756787662,87.103472072234,88.6166607931416,89.5569751902823,90.2997637601853,91.4861009853597,92.738947366043,93.1212971468925,94.0166978760269,95.1888503720746,96.8377506737342,97.3036391294494,98.126217895739,99.339467893605,100.259054804445,101.840983590986,102.741718964783,103.082626950564,104.362149881911,105.760866364373,106.396030094412,107.226463877175,108.638218224396,109.379526574316,110.109027223762,111.420326031626,112.61771263781,113.468694206656,114.619213431217,115.278679680971,116.466011457757,117.537068146506,118.753523891166,119.704656141103,120.131070166454,121.822525894342,122.977159915521,123.640235472236,124.383000359254,125.608871736392,126.188489945214,127.414144604553,128.39601244846,129.115889297518,130.08797401699,131.159077879218,132.410010962127,133.535396508137,134.718791752223,135.443733080568,136.661870239714,137.29210822438,138.55656753413,139.308552070942,140.27987742426,141.456569779806,142.774325544055,143.243355927806,144.505167482867,145.111184143168,146.30748

3083571,147.464847220224,148.218427246658,149.769670436945,150.776656088281,151.515371748972,152.408346863735,153.29435331721,154.857749604455,155.212400832703,156.31398296817,157.430827877581,158.068214294863,159.246613771057,160.346285238108,161.703341756362,162.939983684318,163.581827103078,164.41963298477,165.632147153497,166.697413290467,167.83618109404,168.878480595484,169.817382059171,170.125826262302,171.174106119149,172.182319412083,173.683778271622,174.350253903931,175.280205756401,176.628020353313,177.602771096537,178.468667385597,179.645508668238,180.131396317678,181.258745156011,182.238164094744,183.821618545389,184.261328363069,185.932899850116,186.187494356367,187.586391685274,188.120772297618,189.794395110081,190.509727284739,191.179753691424,192.54771311589,193.831688905835,194.638416224214,195.906316127413,196.845527552348,197.02489756276,198.382679905726,199.860935825007,200.422827058144,201.110300744418,202.209577950068,203.499633489411,204.12638600329,205.314473205528,206.095829209215,20

7.070675561553,208.875077186077,209.071848934263,210.37641200788,211.34927750657,212.666535668482,213.400957572161,214.438684951166,215.464275837814,216.580366174425,217.797909408301,218.167376086905,219.366623356776,220.548163602681,221.316902416248,222.044068195978,223.356501739512,224.984206101239,225.14746583972,226.275382045765,227.310864151405,228.851337545147,229.175137589426,230.520419824664,231.361806010337,232.175186942816,233.450560796329,234.269397102542,235.272652848491,236.560089595496,237.571235289495,238.499139612489,239.133135844129,240.582849816789,241.827775202926,242.833982805547,243.988020788098,244.077263106469,245.085954088788,246.671199871053,247.086129757983,248.030390822573,249.043987276631,250.071893428251,251.525524796095,252.600631102785,253.754358498755,254.665241202158,255.625959629045,256.98261131318,257.320204395814,258.444509485282,259.533974191495,260.492799645349,261.603579334917,262.194476107706,263.615554634444,264.987806666296,265.830581271325,266.39664304882,267.294282

876198,268.348630086833,269.253293622165,270.082557692269},1)

Unfortunately, the X axis tick labels have lost their date formatting. In the sub, the `srs.XValues = srs.XValues`

converted the input dates into numbers, because internally Excel stores dates as the number of days since 1 January 1900. Easy enough to apply the date format manually.

Apparently, the nice spacing that comes with an actual date axis is gone. The chart above has dates, but the axis is a simple category axis. I’ll have to adjust the procedure to prevent a date axis from being changed into a category axis. Before I convert the links to values, I will apply the number format to the axis ticks. Then if it is a date axis, I will make sure the chart treats it as one; this is complicated by the fact that Excel often applies the date category type automatically based on the data.

Sub DelinkChartFromData2(cht As Chart) Dim iGrp As XlAxisGroup For iGrp = xlPrimary To xlSecondary Dim ax As Axis Set ax = cht.Axes(xlCategory, iGrp) With ax ' apply formats .TickLabels.NumberFormat = .TickLabels.NumberFormat If IsDateAxis(ax) Then ' apply date type ax.CategoryType = xlTimeScale End If End With Next Dim srs As Series For Each srs In cht.SeriesCollection ' Convert X Values to arrays of values srs.XValues = srs.XValues ' Convert Y Values to arrays of values srs.Values = srs.Values ' Convert series name to text srs.Name = srs.Name Next srs End Sub

This function tests whether an axis is a date category type. The `BaseUnit`

property is undefined unless the axis is a date axis.

Function IsDateAxis(ax As Axis) As Boolean If ax.Type = xlCategory Then Dim vTest As Variant On Error Resume Next vTest = ax.BaseUnit IsDateAxis = (Err.Number = 0) On Error GoTo 0 End If End Function

The corresponding entry point that I call the above with is familiar:

Sub DelinkSelectedChartsFromData2() If Not ActiveChart Is Nothing Then DelinkChartFromData2 ActiveChart ElseIf TypeName(Selection) = "DrawingObjects" Then Dim shp As Shape For Each shp In Selection.ShapeRange If shp.HasChart Then DelinkChartFromData2 shp.Chart End If Next End If End Sub

The resulting chart is now indistinguishable from the original:

It’s easy to link many of a chart’s text elements to a worksheet range. Select the text element, click in the formula bar, type `=`

and click on the cell or range containing the text you want displayed. The result is a link formula like `=Sheet1!$A$1`

, and the text element updates dynamically to display whatever is in the reference. This works for the chart title, axis titles, data labels, and textboxes and other shapes that contain text.

If you’re delinking the chart’s data, you probably want to delink the titles in the chart. A simple VBA routine to do just that is shown below. For each possible title, see if the formula begins with an equals sign (if not, the formula just shows the text), and if so, replace the title’s text with the title’s text.

Sub UnlinkTitles() If Not ActiveChart Is Nothing Then With ActiveChart If .HasTitle Then If Left$(.ChartTitle.Formula, 1) = "=" Then ' convert chart title link to text .ChartTitle.Text = .ChartTitle.Text End If End If Dim iAx As XlAxisGroup For iAx = xlCategory To xlSeriesAxis Dim iGrp As XlAxisType For iGrp = xlPrimary To xlSecondary If .HasAxis(iAx, iGrp) Then With .Axes(iAx, iGrp) If .HasTitle Then If Left$(.AxisTitle.Formula, 1) = "=" Then ' convert axis title link to text .AxisTitle.Text = .AxisTitle.Text End If End If End With End If Next Next End With End If End Sub

Let’s merge this into our last Delink The Chart routine (I’ve also included the test for a date axis:

Sub DelinkChartFromData3(cht As Chart) With cht If .HasTitle Then If Left$(.ChartTitle.Formula, 1) = "=" Then ' convert chart title link to text .ChartTitle.Text = .ChartTitle.Text End If End If Dim iAx As XlAxisType For iAx = xlCategory To xlSeriesAxis Dim iGrp As XlAxisGroup For iGrp = xlPrimary To xlSecondary If .HasAxis(iAx, iGrp) Then Dim ax As Axis Set ax = .Axes(iAx, iGrp) With ax If .HasTitle Then If Left$(.AxisTitle.Formula, 1) = "=" Then ' convert axis title link to text .AxisTitle.Text = .AxisTitle.Text End If End If If iAx = xlCategory Then ' apply formats .TickLabels.NumberFormat = .TickLabels.NumberFormat If ax.Type = xlCategory Then Dim vTest As Variant On Error Resume Next vTest = ax.BaseUnit If (Err.Number = 0) Then ' apply date type ax.CategoryType = xlTimeScale End If On Error GoTo 0 End If End If End With End If Next Next Dim srs As Series For Each srs In .SeriesCollection ' Convert X Values to arrays of values srs.XValues = srs.XValues ' Convert Y Values to arrays of values srs.Values = srs.Values ' Convert series name to text srs.Name = srs.Name Next srs End With End Sub

And we’ll call it using the familiar entry point:

Sub DelinkSelectedChartsFromData3() If Not ActiveChart Is Nothing Then DelinkChartFromData3 ActiveChart ElseIf TypeName(Selection) = "DrawingObjects" Then Dim shp As Shape For Each shp In Selection.ShapeRange If shp.HasChart Then DelinkChartFromData3 shp.Chart End If Next End If End Sub

The post Unlink Chart Data appeared first on Peltier Tech.

]]>Last week we released the PBCharts (Process Behavior Charts) tool for performing SPC control chart analysis. I’ve been working on PBCharts for well over a year, and we had an extended beta testing period, during which we cleaned up numerous issues. I thought I’d walk through an analysis to show how easily PBCharts does its work.

My colleague posted a quick analysis of inflation since the start of 2019, and that looked like a good data set to analyze. The data file CPI YoY Pct.xlsx (click to download) looks like this, showing date and year-over-year percentage increase of the consumer price index:

I’ve described the manual process of generating control charts in Introducing Control Charts (Run Charts), based on Donald Wheeler’s Understanding Variation, an excellent introductory text on this topic. It can be tedious to set up your own control charts, but PBCharts makes it easy to apply the techniques to lots of data.

PBCharts uses a specially formatted Excel template for its data, calculations, and charts. A PBCharts workbook has four visible worksheets: Data, Run Chart, MR Chart, and I Chart.

There are a few ways to get the data into the PBCharts file. The PBCharts ribbon presents us with a few options.

** Import Data into PBCharts:** Click this button, then browse to a data file (CSV, TXT, or Excel) and PBCharts will import data from the selected file into a new PBCharts file.

** Analyze Selected Data in PBCharts:** Select your data range, or select one cell in the data range, and click this button, and PBCharts will populate a new PBCharts file with the selected data, or if a single cell is selected, with the larger data range that contains the selected cell.

** Blank PBCharts File:** Click this button to create a new, blank PBCharts file, then paste your data into the Data sheet of the PBCharts file.

The inflation data looks like this in the PBCharts data worksheet:

In general, you will have more than one column of data to analyze. For example, the view below shows the inflation data above, with another column showing a three-month moving average. The cell colors tell you about the data: the gray cells indicate that the cell contains a formula, the red cells indicate that they are blank. The blue header cell tells you which column is currently being analyzed (PBCharts analyzes one column at a time). If you click on a different header cell, PBCharts will focus the analysis on the newly selected column.

This isn’t really a manual for PBCharts, just a demonstration, but I’ll show the rest of the ribbon, so my discussion below makes sense. I’ve already shown the section that is used to manage PBCharts files.

The Analysis section of the ribbon includes controls for selecting the column to be analyzed, as well as controls for defining stages, trends, and excluded points, for labeling points, and more.

The Charts section provides a means to override axis scales, to set and show a target value, and to select a run test to apply. You can choose a chart style (bold, medium, or light), you can save charts as text files or export them to PowerPoint or Word, and you can define a dashboard-like layout of multiple charts.

Click on the Run Chart worksheet, and you see a run chart of the data in the highlighted column of the data sheet. It starts out pretty boring in 2019, then in early 2020 there is a dip, followed by another boring stretch, and starting in early 2021, inflation started ramping upwards.

You can select whether to test the run chart for certain conditions using the dropdown in the Charts section of the ribbon. The run chart above has no test.

The chart below has one test, Runs About Median, selected. The median is displayed as a green horizontal line, and every time the data crosses the median, a label is displayed along the top of the chart.

Our data displays 6 runs about the median. For a data set of this size, we would expect 21 runs about median, or between 17 and 26 with a p-value of 0.05. Since our runs are so low, the data is classified as clustering.

The chart below has another test, Runs Up or Down, selected. This counts each time the points reverse direction.

We would expect 27 runs up or down, or 23 to 31 with p=0.05, for a data set of this size. Instead, we see a much lower number, 14, which indicates trending behavior. Not surprising, given the obvious upward trend over the last third of the chart.

The run chart tests tell us that the data is not uniformly distributed, but we can still use control charts to analyze the behavior.

We turn to the MR (Moving Range) Chart. This shows us how much the data changes from one value to the next and is a measure of the variability in the data. We see immediately that there are two peaks in the MR data, indicating large changes in the data, and possibly process changes.

The green horizontal line shows the mean moving range value, and the red horizontal line shows the upper natural process limit (or upper control limit) on the moving range. The red square points indicate points which are out of statistical control, that is, which violate one of the rules, and the red labels at the top of the chart show which rules have been violated. There are four possible rules for MR charts, shown below, and our analysis is checking two of them, in bold. The two obvious peaks exceed the upper control limit.

I’ll show the I (Individuals or individual values) Chart, but it’s not usable yet. The green line shows the mean of the data, and the red lines show the upper and lower natural process limits (control limits). There’s a lot of red, which means that the average and control limits calculated so far do not describe the data well.

The red square markers indicate points which are out of control, and the text above the chart indicate which rules were violated. There are eight I Chart rules built into PBCharts, and we are checking four of them.

Let’s return to the MR chart and label a couple of points. I’ll label March 2020 “Covid” since it’s when we in the United States started severely limiting our activities due to the Covid pandemic. I’ll label March 2021 “Suez Canal” because that’s the month when the huge cargo ship ran aground in the Suez Canal, blocking all traffic. That event didn’t by itself cause the runaway inflation, but it marked the point when we became aware of how problems with shipping, supply chains, and logistics began to hurt the global economy.

Now back to the I chart. I’ll click the Stages button on the ribbon, and define two new stages starting in April 2020 and April 2021 (the first stage automatically starts at the beginning of the data set).

Next I’ll click the Trends button, and assign a linear trend to the third stage.

The I chart now looks well behaved. There is an initial horizontal section, then an intermediate horizontal section at a lower inflation value, indicative of reduced economic activity. Finally, there is the final increasing section. We still see a few red out-of-control points, three at the beginning of the middle Covid section, and one at the transition between the Covid static range and the final increasing stage.

The MR chart shows no point exceeding the control limit where Covid started, though we still have the over-the-limit point where global shipping became an issue. I’m not going to be concerned with these violations.

We could modify the I chart further, using the Exclusions dialog to exclude the three out-of-control points at the beginning of the Covid stage. Doing so causes three other points out-of-control, so I would probably just leave those points in the analysis.

The important thing to ask now, is whether the inflation rate will keep increasing. The US Treasury has raised its prime rate in March and again in May to try to slow inflation. There is no sign that this is having any effect, since the latest data points fall within the process limits and no other assignable causes are seen. But we can keep watching, add data every month, and follow the trends.

PBCharts allows us to select a period of time, and lock in the calculated limits. It is common practice, once a process is deemed stable, to lock the limits before plotting subsequent points. This way adding more data will not affect the displayed limits, and the added data is directly compared to the stable process behavior.

We can use the inflation rate data to illustrate locking of calculated process limits. Since the Fed first increased interest rates in March, let’s back up the data to that point, ignoring the last two points. Here is the I chart.

Now I’ll lock the limits in the last stage (April 2021 to March 2022). The dark circles have changed to squares to indicate the points used to calculate the locked limits.

The limits are the same in both charts. Let’s view the final stage side by side.

Let’s add back the data for April and May. The difference in the two charts is subtle: notice that the last two circles are closer to the green line without locked limits (left) than to the green line with locked limits (right). This is because the added points have changed the slopes of the green lines and red limits, rotating the right sides of these lines slightly downward.

Now let’s suppose that the Fed’s interest rate action has had an effect. To simulate this, and I’ll admit it’s a huge assumption on my part, I’ll repeat the last two actual points for April and May as presumed additional points for June and July. The unlocked green and red lines of the left-hand chart below have rotated further, so that the green line is now below the points for April and May, and the imaginary points for June and July are above the lower red line. The locked limits in the right-hand chart haven’t moved. The points for April and May are still below the green line, and the new points for June and July are below the red line and highlighted as out-of-control points.

These out-of-control points are a signal that something in the process has changed. We might presume that increasing interest rates has stopped the climb in the inflation rate.

PBCharts is a VBA add-in for Excel. It is limited to Windows, but it runs on any supported versions of Excel, from Excel 2013 through Microsoft 365.

To get your copy of PBCharts, visit the PBCharts website. You can download PBCharts on a 14-day free trial basis, or you can purchase a one-year license for $99.00.

And readers of this blog get a discount of $10.00 by entering the coupon code

. This special price of $89.00 is good until 6 July, 2022.*PeltierTech*

The post PBCharts Inflation Analysis appeared first on Peltier Tech.

]]>How many points can I plot in each series of my chart? How large of a VBA array can I plot in my chart? Two good questions, which I’ll investigate here.

A colleague emailed me asking about the VBA array size limit for plotting in a chart. He said he thought the limit was 32,000 points, but couldn’t find any official documentation of this, and his trials only worked for half that many points. I couldn’t find any documentation of any limit on how large a VBA array can be used to populate a chart.

The number of points in a chart series populated by worksheet ranges is limited by available memory, as the spec states. This limit can be greater, perhaps much greater, than the number of cells in a worksheet column.

The number of points in a chart series populated by a VBA array is 32,000 if the array is a 2-dimensional vertical array. The limit drops to 16,384 if the array is a 1-dimensional horizontal array.

I’ll start with the VBA question.

If you generate data in VBA using arrays, you can plot this data in two ways:

- Put the arrays into a worksheet, and plot the ranges that contain the data;
- Put the arrays directly into the chart.

Below is a simple VBA procedure that generates small arrays for the X and Y data, creates a scatter chart with one series, and puts the arrays into the `.Values`

and `.XValues`

properties of the series.

```
Sub ChartWithVBAArrays()
' declare arrays
Dim X(1 To 10) As Variant
Dim Y(1 To 10) As Variant
' populate arrays
Dim i As Long
For i = 1 To 10
X(i) = i
Y(i) = i
Next
' create the chart
Dim cht As Chart
Set cht = ActiveSheet.Shapes.AddChart2(240, xlXYScatter).Chart
' populate the chart
With cht.SeriesCollection.NewSeries
.Name = "VBA Arrays"
.Values = Y
.XValues = X
End With
End Sub
```

The resulting chart (Scatter with Markers and No Lines) looks like this.

Here is what the SERIES formula looks like:

`=SERIES("VBA Arrays",{1,2,3,4,5,6,7,8,9,10},{1,2,3,4,5,6,7,8,9,10},1)`

The arrays are simply shown as comma-separated lists of values, enclosed in curly braces.

Easy enough. Now let’s see how much data we can put into those arrays.

I made two procedures: `PlotManyPoints`

which had two arguments, the number of points to (try to) plot, and the chart to plot those points in; and `TEST_PlotManyPoints`

, which pops up an `InputBox`

asking me how many points I wanted to plot.

```
Sub TEST_PlotManyPoints()
Dim cht As Chart
' in case I forget to select the chart
If Not ActiveChart Is Nothing Then
Set cht = ActiveChart
Else
Set cht = ActiveSheet.ChartObjects(1).Chart
End If
' ask how many points
Dim n As Long
n = Application.InputBox("How many points?", , , , , , , 1)
' run the procedure below
PlotManyPoints n, cht
End Sub
Sub PlotManyPoints(n As Long, cht As Chart)
Dim x As Variant, y As Variant
ReDim x(1 To n), y(1 To n)
' build arrays
Dim i As Long
For i = 1 To n
x(i) = i
y(i) = i
Next
' apply arrays to chart
With cht.SeriesCollection(1)
.Values = y
.XValues = x
Dim Points As Long
Points = .Points.Count
End With
' report results
ActiveSheet.Range("B3:B4").Value2 = WorksheetFunction.Transpose(Array(n, Points))
End Sub
```

The chart was already present, so the program simply replaced the chart’s existing data with the new data, and it recorded in the worksheet how many points were in the arrays and how many points were plotted in the chart.

I started small, with 10 points. Here is the output worksheet with an XY Scatter Chart with Lines and No Markers.

Then I tried 100 points, 1000 points, 10,000 points…

I’ve written extensively about Excel’s chart SERIES formula:

- The Excel Chart SERIES Formula
- Change Series Formula – Improved Routines
- How to Edit Series Formulas
- Edit Series Formulas
- Delink Excel Chart Data

The chart SERIES formula keeps showing all of the X and Y values in their arrays for longer than I expected. Here is the SERIES formula for 10 points:

`=SERIES(,{1,2,3,4,5,6,7,8,9,10},{1,2,3,4,5,6,7,8,9,10},1)`

Here it is for 100 points:

`=SERIES(,{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100},{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100},1)`

Here it is for, well, I won’t show you 1000 points, but the series formula showed all 1000 X and Y values. In fact, the series formula was complete all the way up to 1042 points. The formula was 8221 characters long, slightly more than the stated limit of 8192 in Microsoft’s documentation. If I try to edit the formula and hit Enter, Excel warns me:

When I plot 1039 points, the SERIES formula has 8191 characters. I can edit this formula without changing its length, and Excel will accept the change.

Beyond 1042 points, Excel starts leaving off characters in the array of Y values, then of the X values, to keep to the almost-limit of 8221. At 2000 points, the end of the 8221-character formula looks “broken”:

`,1852,1853,1854,1855,1856,1857,1858,1859,1},{1,2,3,4,5,6,7,8,9,}`

The X values are truncated in the middle of point 1860, the Y values after point 9, and we’ve lost the final plot order argument and closing parenthesis. Despite the lack of fidelity of the series formula display, however, Excel plods, I mean, plots along.

After reopening a workbook that has such “broken” formulas, the formulas can no longer be read. If you select the series, Excel leaves the formula bar blank, and shows this message.

You can still format the series and the rest of the chart. And you can still access all 32,000 data points in VBA using the `.XValues`

and `.Values`

properties of the series object.

In the old days (Excel 2003), the limit on the SERIES formula was 1024 characters, each of the four arguments was limited to one-fourth of this, and if the SERIES formula was “broken” by having too many characters, Excel would refuse to plot anything. The limit using this methodology was 81 points.

If your data had more than one or two digits, the limit quickly became smaller. Try more than the limit and Excel gave you a nasty error.

Can you believe I opened up Excel 2003 to try this? I keep a virtual machine around with Excel 2003 just to try stuff like this, because I sure can’t rely on my memory. And by the way, get offa my lawn!

When I tried 100,000 points, I hit another limit: the arrays contained 100,000 points, but the chart only accepted 16,384 of them.

So my result validates what my colleague told me: a VBA array can populate much less than the 32,000 points that we unofficially think is the limit.

But that number 16,384 is not a nice round number like 32,000. In fact, it looks like a power of 2. It happens to be 2^14, which is also the number of columns in an Excel worksheet.

A one-dimensional array such as I used in this code is a horizontal array. Perhaps the chart can only accept a horizontal array with as many values as there are columns.

We can make a two-dimensional array in VBA readily enough. I’ve modified my earlier routine `PlotManyPoints`

to do just that. My arrays `x`

and `y`

are dimensioned with `1 to n rows`

and `1 to 1`

columns.

```
Sub PlotManyPoints(n As Long, cht As Chart)
Dim x As Variant, y As Variant
' 2-dimensional vertical arrays
ReDim x(1 To n, 1 To 1), y(1 To n, 1 To 1)
' build arrays
Dim i As Long
For i = 1 To n
x(i, 1) = i
y(i, 1) = i
Next
' apply arrays to chart
With cht.SeriesCollection(1)
.Values = y
.XValues = x
Dim Points As Long
Points = .Points.Count
End With
' report results
ActiveSheet.Range("B3:B4").Value2 = WorksheetFunction.Transpose(Array(n, Points))
End Sub
```

I ran this for 10, 100, 1000, 10,000, and finally 100,000 points, and I hit the limit of 32,000 points when I tried to add 100,000.

So 32,000 points is the limit for VBA arrays as chart series source data. If you need more, you could do one of two things:

- Put the longer arrays into a worksheet, and plot the ranges that contain this data;
- Break the arrays into smaller 32,000-element arrays, and use these smaller arrays to populate separate series in the chart. Even with the limit of 255 series per chart (which has not changed), you’re allowed 8,160,000 points.

So where does this VBA array limit of 32,000 come from? Microsoft’s own Excel specifications and limits informs us that Excel 2007 had a limit of 32,000 points per chart series, but the limit for Excel 2010 and later is “Limited by available memory”. I can’t find anything other than hearsay and speculation about the limit on VBA arrays.

I suspect that somewhere in Excel’s source code, there is still a hard-coded 32,000 chart series limit that nobody can find. This limit is ignored for worksheet data but is still enforced intentionally or otherwise for VBA data.

One reason people cite for using VBA arrays instead of worksheet ranges is that the arrays result in smaller workbooks. I made four workbooks to investigate this. One workbook contained data for 32,000 points, but no chart. One workbook contained the data and a chart that plotted the data. One workbook contained no data but had a chart that was copied from one of the first two workbooks, so it was still linked to the workbook with the data. The last workbook had no data, and a chart populated with 32,000 row VBA arrays. The results are summarized in the table below.

The workbook containing only the data needs 413 kb of storage. Add a chart, and the workbook increases to 721 kb. Interestingly enough, if the chart is alone in the workbook and links to a different workbook, the file size is about the same as if its parent workbook contained the data, 724 kb. Most interesting, the workbook with only the VBA-populated chart, with its “broken” and hidden SERIES formula, was smallest of all, at 316 kb.

So the conventional wisdom is correct, and a VBA-populated chart requires a smaller workbook than a chart that links to worksheet data. In any case, these are not enormous files, and I feel safer with the data accessible in the worksheet rather than hidden by VBA.

What does “Limited by available memory” mean?

To finish my analysis, I thought it would be informative to see how many regular worksheet data points I could squeeze into a chart series.

I set up data in columns A and B of a worksheet that start with 1 in row 1 and continue down to 1,048,576 in row 1,048,576 (the last row in the worksheet). I plotted this in an XY chart (line and no markers) and got all of the points into the chart.

Note that hard-coded values work much nicer than formulas in these million-plus-row ranges. It takes a long time to calculate such a lot of cells.

So the limit appears to be 1,048,576 points.

Then I thought it might be possible to extend this. Let’s examine the following ranges and charts, and apply some SERIES formula magic.

I can plot A1:B4, shown in the first chart below.

The SERIES formula looks like this:

`=SERIES(Sheet1!$B$1,Sheet1!$A2:$A4,Sheet1!$B2:$B4,1)`

I can add the data in D1:E4 as a second series (orange) in the middle chart, with the following formula:

`=SERIES(Sheet1!$E$1,Sheet1!$D2:$D4,Sheet1!$E2:$E4,2)`

But I can also combine the X and Y values to create a single series using both sets of data, as shown in this series formula and the third chart below:

`=SERIES("A,D vs B,E",(Sheet1!$A$2:$A$4,Sheet1!$D$2:$D$4),(Sheet1!$B$2:$B$4,Sheet1!$E$2:$E4$),1)`

The X values are `Sheet1!$A$2:$A$4`

and `Sheet1!$D$2:$D$4`

, comma-separated and enclosed in parentheses. Likewise, the Y values are `Sheet1!$B$2:$B$4`

and `Sheet1!$E$2:$E$4`

, also comma-separated and enclosed in parentheses.

Let’s add more full columns of X and Y data.

I’ve already plotted the first series above with this formula:

```
=SERIES("A vs B",Sheet1!$A:$A,Sheet1!$B:$B,1)
```

I can add another series with the data in columns D and E with this formula:

`=SERIES("D vs E",Sheet1!$D:$D,Sheet1!$E:$E,2)`

Which produces this chart:

But as I’ve shown above, I can combine the two sets of X and Y data into one series’ worth of X and Y data, using this formula:

```
=SERIES("A,D vs B,E",(Sheet1!$A:$A,Sheet1!$D:$D),(Sheet1!$B:$B,Sheet1!$E:$E),1)
```

When I first entered this SERIES formula, I got an error, and I thought that I’d found a hard limit. But I went back and fixed a typo, and it all worked. I ended up with this chart:

So I don’t know what the limit is, except that it is in fact “Limited by available memory,” and this limit is more than the number of rows in a worksheet column. I leave it as an exercise for the reader to determine how many full columns they can squeeze into the SERIES formula.

When the data is comprised of static values in cells, it isn’t too slow to build and display these charts. If you have formulas that calculate the X and Y values, though, this calculation takes a minute or more each time, and it becomes very tedious while you wait for Excel to start responding again.

I tabulated file sizes for 32,000 points above to show that populating charts with VBA arrays can reduce file sizes. For completeness here are file sizes needed for charting full columns of data. These are large files. I’ll let you draw any further conculsions.

The post Excel Chart Series Size Limits appeared first on Peltier Tech.

]]>I’m not sure what to call this chart, other than it’s a combination chart (stacked column and XY scatter) and it requires some additional data manipulation. The chart shows test results from a test program that includes different factor levels: Groups (Alpha, Beta, Gamma), Classes (High and Low), and Treatments (A, B, C), with four replications per set of factors. The intent is to show each replication, while also showing how the groups, classes, and treatments compare against each other.

A few years ago I wrote Make Technical Dot Plots in Excel, which showed each test result from a relatively simple test program. A typical one of these charts is shown here, where all results from each of the three test conditions are shown.

The above chart shows three treatments, but what if we have other factors in the test program. We have two classes and three groups, which might refer to any other way to categorize testing conditions. Age groups of test subjects, species of an infectious agent being targeted, season of the year, etc.

A comment to the article cited above posted a data set, part of which I’ve shown below (after anonymizing it). You can download a CSV file with the data set here: test_results.csv. For each combination of Group, Class, and Treatment, there are Four replications. We want to plot each of these replications in the final chart.

Here is how the first several rows of data look when opened in Excel and converted into a Table.

When I first get my hands on a new data set, I like to make a chart or three to see if there are any obvious insights. My first chart was a simple line chart of the data column. The X-axis is simply the point number, from 1 to 72 (the number of results in the program).

Nothing is immediately obvious, other than the scores being skewed toward the upper end of the range (more 2s and 3s on a scale of 0 to 3).

I change this chart to use the Treatment column for category labels. No insights come to light.

I further adjusted the chart to include the Class column to the category labels. Nothing but clutter along the axis.

When I included the Group column to the category axis labels, the clutter increased.

With the appropriate data layout, multiple columns of category labels can bring order to the chart. This is obviously not the appropriate data layout, but we’ll get to it shortly.

Sometimes a Pivot Table helps with displaying data. The Picot Table below has Group, Class, and Treatment in the Rows area, Rep in the Columns area, and the Scores in the Values area. Each test result appears in the cross-tabbed Pivot Table, with one set of conditions per row and the four associated test results in that row. Easier to see all at once, but not (yet) easier to find any meaning.

Here is a pivot chart, type clustered column. It shows all results, but the bars are so pinched together it’s hard to make sense of it. That horizontal axis looks better, though, since it is laid out in a much nicer way than the cluttered one above.

A line chart of the same pivot data looks a bit less cluttered, but it’s hard to see how many points might be occupying the same space. It would be better if we could move some left or right by a small amount (the way the columns are spread out in the previous chart), but line charts do not allow that.

How do we get that nice layout? I’ve described it in Chart with a Dual Category Axis and numerous other articles, but I’ll describe it again. Below is the Pivot Table, where I’ve highlighted the chart source data. The blue highlights indicate the Y Values (the test results in the Pivot Table’s Values area), the red highlights indicate the Series Names (in the Columns area), and the purple highlights indicate the X Values or Category Labels (in the Rows area).

There are three columns of category labels, just like in the original Table of data. What makes the axis work nicely in the Pivot Chart is that repeated labels are replaced by blanks. These blanks tell Excel how to construct the labels: place A-B-C along the axis, center Low and High under successive blocks of A-B-C, and center Alpha, Beta, and gamma under blocks of Low and High.

What we can do is use these three columns of the Pivot Table for our chart’s category axis, and then combine this with XY Scatter series where we calculate X values that include the lateral offset to reduce overlapping of the markers. (We could also construct a range like this by hand, but the Pivot Table is here, so let’s use it.)

My previous article, Clustered Column and Line Combination Chart, shows how to calculate X values to position XY Scatter markers precisely over columns in a chart. We’ll use a similar approach here. I’ve columns to the Table to accommodate these calculations.

Each combination of factors in the Table results in one category in the Pivot Chart and in our ultimate chart. So we need to calculate Category Number (“CatNum” in the Table). The formula in cell F3 (which Excel fills into the whole column of the Table) is

=INT((ROW()-ROW($F$2)+3)/MAX([Rep]))

Each replicated test takes up a fraction of the width of each category (“Frac” in the table), with the first replication at zero and the last replication at 1. The formula in G3 is

=([@Rep]-1)/(MAX([Rep])-1)

I want to leave space around the reps within a category, leaving a space between categories. This is like the gap width in a column chart. I’ve placed a gap width of 0.5 in cell $N$3, and this formula in cell H3 (under Decimal)

=($N$3/2+[@Frac]*(1-$N$3)

Finally, I add this Decimal amount to CatNum minus one in I2 (under X) using the formula

=[@CatNum]-1+[@Decimal]

I certainly could have combined all of these calculations into a single “X” column, but for ease of explanation, I used several columns for intermediate calculations. Many Excel users love their big ugly formulas that take up fewer columns, but our column limit now stretches beyond 256 columns, while our cognitive limit seems to have shrunk. Helper columns are your friends.

Let’s just see how it looks. First I’ll plot the Score column in an XY Scatter chart, using lines and markers. Without specifying X values, Excel will just use the counting numbers 1, 2, 3, up to the number of points.

Similar to our first line chart at the beginning of this article. Those connecting lines need to go, and let’s format the markers to have no fill, which lets us see more easily when points overlap.

Now let’s use the calculated X values for the data. There are 18 categories, so let’s set the axis maximum to 18. The data is also grouped into threes, that is, A, B, and C, so let’s set the axis major unit to 3 and minor unit to 1. We have four replications within each vertical slice of the chart.

We can make one further improvement. The tests within each category would stand out from other categories more visibly if we plot each category in a different color. To do this, I’ve added three more columns to the Table, labeled A, B, and C for the three treatments. The formula in J2, filled into the three added columns, is

=IF($C3=J$2,$E3,NA())

Here I’ve plotted the three different treatment columns in place of the one Score column, and colored each series distinctly.

We’re going to make a combination chart. We’ll start with a column chart using the Rows area of the Pivot Table as category labels and averages for each category of tests as the column Y values. To this, we’ll add XY Scatter series using our calculations in the data Table for X values and the test results as Y values.

We can’t use a Pivot Chart, since a Pivot Chart is constrained to plot all data in the Pivot Table and no data from outside the Pivot Table. But that’s no impediment, as I described in Making Regular Charts from Pivot Tables.

I’ve added three columns, A, B, and C. to the right of the Pivot Table for the average test result values (much as I did for the test results in the Table). These will serve as a background to the individual test results. The formula in X4 (filled into the range X4:Z21) is:

=IF($R4=X$3,AVERAGE($S4:$V4),0)

One thing to be careful of is the sorting in the original data and in the Pivot Table. The original data lists Class in the order Low-High. Default sorting in the Pivot Table lists Class in the opposite order, High-Low. You can change the order by clicking on a pivot item label and dragging it to where you want it.

Select the columns of average data; include the header row and Excel will use these cells as series names. Then insert a column chart. I’ve made a stacked column chart, so the overlap is 100% and all columns are centered within their categories. The chart’s source data is highlighted in the worksheet.

Now add the category labels from the Pivot Chart. I’ll add them bit by bit so you can see how Excel builds up the multiple tiers of labels.

You could use the Select Data dialog to add the labels, but I find that tedious. It’s much easier to edit the `SERIES`

formula. Select the blue columns and this will appear in the Formula Bar (the precise address will depend on the location of the Pivot Table in the sheet):

=SERIES(Sheet1!$X$3,,Sheet1!$X$4:$X$21,1)

The syntax of the SERIES function is

=SERIES([series name],[X values],[Y values],[series number])

Since there are no X values specified, Excel just uses 1, 2, 3, etc. in the chart. Do assign X values, put your cursor between the two commas where the X values should go, select the range in the sheet, then press Enter. The formula should now look like this:

=SERIES(Sheet1!$X$3,Sheet1!$R$4:$R$21,Sheet1!$X$4:$X$21,1)

You only need to add this to one series, since all series (except for XY Scatter series) share X values/category labels.

The worksheet now shows these highlights, and the column of Treatments is used as category labels in the chart.

Let’s include the column of Class labels in column Q. Simply edit the series formula to include the two-column range:

=SERIES(Sheet1!$X$3,Sheet1!$Q$4:$R$21,Sheet1!$X$4:$X$21,1)

The highlighted regions and the chart show the adjustment.

One more time: edit the SERIES formula to include Group in column P, so the formula looks like this. Of course, I could have just selected all three columns of the Rows area two steps ago, but I wanted to show how the category label structure develops.

=SERIES(Sheet1!$X$3,Sheet1!$P$4:$R$21,Sheet1!$X$4:$X$21,1)

The worksheet highlights and chart axis show the change.

We have the nice category axis we wanted, now let’s do a little cleanup. Set the scale of the Y-axis to minimum=0, maximum=3, major unit=1. Select one of the column series and set gap width to zero so the columns fill the width of each category. Lighten up the colors: I set the transparency to 75% or so, which makes the colors light and lets the gridlines show through.

Now it’s time to add the actual test results. Select and copy the last four columns of the Table (X, A, B, and C) including the column headers. Select the chart, click Paste Special from the Paste dropdown on the Home tab of the ribbon, and make the appropriate selections in the dialog: Add as New Series, Values in Columns, Series Names in First Row, Categories in First Column.

You can also Paste Special with Alt+E+S, which is a legacy of the old menu structure of Excel 97-2003, and which is permanently ingrained in my muscle memory. Another more recent shortcut that I can never remember is Ctrl+Alt+V.

Excel adds the new series as more sets of stacked columns.

That’s easy to fix. Right-click on any series in the chart, and choose Change Series Chart Type. This rich dialog pops up with a preview of the chart and a list of all series in the chart. Note that all series are Stacked Column.

Change each of the newly added series to Scatter using the Chart Type dropdowns. Excel will check the Secondary Axis box for each, which in this case is what we want.

The resulting chart is starting to look good.

Format the scale on the secondary (upper) horizontal axis: minimum=0, maximum=18, major unit=3, minor unit=1.

Add major and minor secondary vertical gridlines.

Format the secondary (upper) horizontal axis to show no labels. Delete the secondary (right-hand-side) vertical axis.

Finally, format the XY Scatter series so the markers have no fill. This makes overlapping points easier to see.

The post Combination Chart for Multi-Factor Test Results appeared first on Peltier Tech.

]]>Combination charts in Excel are pretty easy, once you figure them out. But sometimes they present challenges. If you make a combination clustered column chart and line chart, it takes special treatment to align the markers over the columns.

I will warn you that adding several series of lines to a chart with several series of columns might make your chart cluttered and difficult to read, especially if the lines and columns need different axes to show different types of data (for example, dollar sales vs. percentages of target). If that’s the case, you should break down the data into more easily digestible pieces, and use multiple charts for improved clarity.

I often use this technique when I need some kind of special highlighting or labeling, and my added data is partly or totally hidden (by formatting with no markers and no lines).

We’ll start with data for three categories (Alpha, Beta, and Gamma) and three series (Red, Green, and Blue). These might correspond to three companies over three years, or any other grouping you have in mind.

In addition, we want to display columns and lines for each series, showing perhaps target sales values as columns and actual sales as markers on the lines. Here is our data and the separate column and line charts.

Making a combination chart is pretty easy. Start by plotting all of the data using one of the chart types in the finished chart. I started with a column chart, but it would work the same if you start with a line chart.

Right-click on any series in the chart, and choose Change Series Chart Type from the pop-up menu to open the Change Chart Type dialog

*If you right-click just anywhere on the chart, the menu option is Change Chart Type, and when you select Combo, Excel will use its own favorite combination, and you will have to override Excel’s choices. In this case it would have been okay, but generally Excel chooses something other than what you want.*

Combo is selected in the list along the left of the dialog, there is a preview of the chart, and a list of all series in the chart, with a dropdown to select the chart type and a checkbox to select the axis of each series.

Change the chart type of each of the last three series to Line with Markers, and leave the Secondary Axis checkboxes unchecked.

The resulting combination chart looks just like the preview (well, perhaps I’ve formatted it a bit).

That was pretty easy, and it looks pretty good, except for one thing. The markers are all centered in each category and are not aligned over their respective columns. The red, green, and blue markers are all centered on the green column.

This is a consequence of how Excel draws line charts. Each data point fits into a category, and it is centered on a category. Unlike clustered column charts, where the points (the columns) are distributed within a category according to properties Gap Width and Overlap.

So we can’t use line chart series to align markers over columns, but all is not lost. We can instead use XY scatter series, because Excel will plot them along the category axis as if the category axis has a numerical scale.

Let’s examine our simple column chart, with three text labels (categories) along its category axis.

If we plot XY scatter data on the chart, Excel treats the categories as if the first category is at X=1, the second at X=2, and so on.

For the XY scatter data, we can consider the axis as a continuous numerical scale starting at the first category number minus 0.5 and ending at the last category number plus 0.5, or in our example, from 0.5 to 3.5. Each category takes up the space from the category number minus 0.5 to plus 0.5.

We can calculate X values for our XY data to position the markers wherever we want them. If we had used X values of 1, 2, and 3, our XY series would line up on the green columns just like our line chart series. But we can see that our Red X values need to be a little less and the Blue X values a little more than the category number.

How do we know the precise X values? We can guess and adjust them by trial and error, but we can also calculate them easily enough (you’re not afraid of a few formulas, are you??).

Excel’s column and bar charts use two parameters, Gap Width and Overlap, to control how columns and bars are distributed within their categories.

Gap Width is the space between bars in adjacent categories, given as a percentage of the width of a column in the chart. The default is 219%, which means the gap is 2.19 times the width of a column.

Overlap is the amount that columns in one series overlap columns in the next series within a category. The default is -27% (below left), meaning that there is a space 0.27 times the width of a column between adjacent series. A positive overlap of 27% (below right) means that the columns actually overlap by 0.27 times the width of a column.

I don’t know where Microsoft came up with these default values, but they are easy to adjust. Select any series and press Ctrl+1 (the universal shortcut to open the formatting user interface for the selected object in Excel). In the task pane, move the sliders or type in the desired values. I usually choose a gap width of 0.5 to 1.5 (and most frequently 0.75 to 1.0), and an overlap of zero (so adjacent series of columns are touching).

So let’s set up the calculations. Here are some variables:

`BW: width of a bar (or column) = 100`

GW: gap width as percentage of BW

OL: overlap as percentage of BW

iSrs: the number of the series

nSrs: the number of series

iCat: the number of the category

The X value of a point is the X value of the left edge of a category plus the distance the point is from that edge of the category divided by the width of a category.

The left edge of a category is the category number minus 0.5:

`Category Edge = iCat - 0.5`

The red marker is half a gap width plus half a bar width from the edge of the category.

The green marker is half a gap width plus one and a half bar widths plus one overlap from the edge of the category.

The blue marker is half a gap width plus two and a half bar widths plus two overlaps from the edge of the category.

The general expression is:

`Marker Distance = 0.5 * GW + (iSrs - 0.5) * BW - (iSrs - 1) * OL`

The width of a category is the gap width (half on each side) plus the bar width times the number of series plus the overlap times the number of series minus one. Actually minus the overlap, because a negative overlap adds to the total width:

`Category Width = GW + nSrs * BW - (nSrs - 1) * OL`

The complete expression for the X value of a marker is:

`X = iCat - 0.5 + [0.5 * GW + (iSrs - 0.5) * BW - (iSrs - 1) * OL] / [GW + nSrs * BW - (nSrs - 1) * OL]`

We can simplify this formula since all red markers are offset within a category by the same amount, as are all blue markers, and all red markers. We can then calculate each offset only once and use a lookup formula for each point.

`X = iCat + Xoffset`

Xoffset = - 0.5 + [ 0.5 * GW + (iSrs - 0.5) * BW

Here is the data range I have set up:

B2:H6 contains my original column and line chart data. B15:D18 is where I calculate the Xoffset values based on values of Gap and Overlap entered in the appropriate cells. The formula in D16 (filled down to D18) is:

`=($B$16/2+($C16-0.5)*100-($C16-1)*$B$18)/($B$16+MAX($C$16:$C$18)*100-(MAX($C$16:$C$18)-1)*$B$18)-0.5`

B8:H13 contains the helper row of category numbers in column B and the helper row of series numbers in row 13. There are X and Y values for the three XY series. The Y values come from F4:H6 in the table above. The X values in cell C10 (and copied into C10:C12, E10:E12, and G10:G12) are calculated by this formula:

`=$B10+XLOOKUP(C$13,$C$16:$C$18,$D$16:$D$18)`

There are several ways to generate the combination chart. You can start with a column chart with three series then add the XY data. To add the Red XY data, copy the range C9:D12, select the chart, and choose Paste Special from the Paste dropdown on the Home tab of the ribbon. Select the appropriate options in the dialog: New Series, Values in Columns, Series Names in First Row, Categories in First Column, and DO NOT CHECK Replace Existing Categories.

The data is added as another column series; we’ll fix that shortly. Repeat with the Green data in E9:F12 and the Blue data in G9:H12. Here is the column chart:

Convert to a combination chart as we did above for the column-line chart. Right-click on any series, and select Change Series Chart Type from the pop-up menu. Change the chart type of the last three series to Scatter with Straight Lines and Markers, and UNCHECK the Secondary Axis checkbox for all XY series.

Another way to generate the combination chart is to build a column chart using the block of data in C3:H6, changing the last three series to XY Scatter, then changing the data for these three series. Change the chart type before changing the data.

Changing the data is easy. You could right=click on the chart and choose Select Data from the pop-up menu and wrangle with that uncomfortable dialog, but my favorite way is to edit the SERIES formula directly. For example, the formula for the red scatter series starts out as:

`=SERIES(Sheet1!$G$3,Sheet1!$C$4:$C$6,Sheet1!$G$4:$G$6,4)`

but I edit it to:

`=SERIES(Sheet1!$D$9,Sheet1!$C$10:$C$12,Sheet1!$D$10:$D$12,4)`

You can type the new row and column addresses, or you can select a reference in the formula (select both the sheet name and the cell address), then drag to select the new data range in the worksheet with the mouse.

Here is the complete data range plus the chart.

Looks good. Let’s check with the same gap width but a positive overlap.

Also looks good. How about a smaller gap and zero overlap?

Still looks good. What if we add a series and a category?

Looks good, except that all those lines make the chart a bit cluttered. What if we plot markers without lines?

The calculations work and the cleaned-up chart looks pretty good.

I wrote a similar post a couple years ago about Precision Positioning of XY Data Points. If the explanation above isn’t clear, you might try this other article.

In addition, I have written several more articles about combination charts:

- Add a Horizontal Line to an Excel Chart
- Horizontal Line Behind Columns in an Excel Chart
- Bar-Line (XY) Combination Chart in Excel
- Salary Chart: Plot Markers on Floating Bars
- Fill Under or Between Series in an Excel XY Chart
- Fill Under a Plotted Line: The Standard Normal Curve
- Excel Chart With Colored Quadrant Background

The post Clustered Column and Line Combination Chart appeared first on Peltier Tech.

]]>Everybody loves their Excel tips and tricks. I know I’m famous for my extensive collection of Excel charting tricks. But the most important charting trick is **Get The Data Right**. The secret of successful Excel charting begins and ends with **Good Chart Data**. I teach it in all of my charting workshops and seminars, so I was surprised that I did not have a blog post dedicated to good chart data. But now I do.

What do I mean by “good” data? Of course, “good” has to do with the quality of the data. Where is the data from? How was the data measured or collected? Is the source reliable and trustworthy?

All of this is important, for any data you consume in Excel or in any other program. But right here, “good” data means data that can easily be rendered in a chart, without having to make excessive adjustments to the chart. Good chart data may not be good display data, but data that has been optimized for display, say in that annual shareholder’s report, is almost guaranteed to be bad for charting.

“Good” data has a layout that makes chart creation easy for you, so Excel knows how to partition the data between the important parts of the chart. X values or categories. Y values. Series names.

“Good” data also has as little formatting as possible. Enough formatting to make it readable, but not so much that it causes retina pain.

You can make good charts from bad data, but it will take longer, and you will have to muck around with the chart for a while to make it work. It’s better to spend five minutes with the data now than to spend five hours trying to clean up a chart later.

Good chart data has the following characteristics:

- The data exists in a contiguous range: no blank rows or columns
- The data is minimally and consistently formatted: easy to validate visually
- The data is aligned with Y values in columns (shaded blue below)
- X values are located in the leftmost column (purple)
- Series names are located in the topmost row (red)
- The top left cell may provide some magical behavior (gold)
- The data may exist in an Excel Table

What if you want to plot your data but also display it in an optimally formatted way? The good new is, you can. The bad news is, it takes more work. You should have two different data ranges: one arranged for best charting outcomes, the other formatted for visual consumption. Both of these ranges should link to the same original source, to make sure changes are reflected everywhere, and you maintain one version of the data.

A good chart data range is contiguous, that is, it is comprised of data in a single block of cells. There are no completely blank rows or columns separating the data into separate areas.

Why is contiguous data important? Because Excel tries to be smart when you click a button. If you select any single cell in the range shown above and click any chart button to insert a chart, Excel tries to determine the extent of your data. Excel will go from the active cell up, down, left, and right until it finds either the edges of the worksheet or a blank row or column, and it will include all of the data within this range for the chart. This is so much easier than having to select the entire data range, especially if the data extends beyond the first visible screen.

Excel does the same hunt for the current region when you convert a range to a Table, or create a Pivot Table, or do any number of other things in Excel.

It’s perfectly fine to have blank cells within the data range, as long as there are no completely blank rows or columns.

Keep in mind that successful charting means that these “blank” cells should be completely empty. A cell with a formula that returns “” is not blank, because it contains a formula. If you have the formula return NA() instead of “”, the cell will not look blank, it will contain the #N/A error. This looks ugly, but a chart treats most #N/A values as blank cells, while it treats “” as text. Text might be plotted as a zero, or it might mess up an axis.

A cell that contains a space character is also not blank, because it contains that space character. Sometimes people get in the habit of typing the space instead of just skipping the cell. This kind of non-blank cell is hard to fix because both the cell and the formula bar appear blank.

You should only use as much formatting as you need to help you quickly scan the data.

The range below is a bit over-formatted. The header row is bold and multi-colored, and the red and yellow are distracting and may eventually cause eyestrain. Bold text and light gray fills might be okay for headers, though they are unnecessary if the headers follow one or more blank rows.

The dark borders on all of the cells break up the appearance of the data range and the dark lines compete for attention with the numbers. The default light gray cell borders are sufficient. I’ve programmed some buttons in my general purpose and charting software to apply light and medium gray borders to selected ranges.

The green shading can be distracting. (See my arbitrary pattern? All cells with 3 are shaded, while b and d are shaded because they have no 3 in their rows. I don’t know why.) If you find shading helps to identify certain values, don’t manually color the cells: use Conditional Formatting instead, so the shading goes away if the condition is no longer met.

You should avoid centering your data. It might look “nice”, but centering hides important characteristics of the data. By default, text is left-aligned in cells and numbers are right-aligned. A common feature of “bad” data is numbers stored as text, and centering everything hides this distinction.

Here is a data range with all of its cells centered. There are some small triangular flags in some cells that indicate a possible error, but I think most of us have learned to ignore warnings like this.

When we uncenter the data, it’s easy to detect the cells with numbers stored as text, even if we’ve ignored the green flags.

We can select the flagged cells, click the little warning dropdown, and convert them to numbers.

Now that we’ve converted text to numbers, we still have difficulty parsing the numbers, because they have inconsistent numbers of decimal digits. Some numbers look larger than others, but they merely have a longer tail after the decimal point.

When we apply a consistent format, we can tell that the larger numbers are in fact longer.

It’s now very easy to identify the largest numbers in each column. You might want to check that value of 345 in the “beta” column since it’s 100 times as large as the rest of the column.

Excel charts can work with data in columns or in rows. You can use either arrangement and sometimes one just works better than the other. If a chart’s source data has more rows than columns, Excel creates the chart with series in columns. A pivot chart always plots the pivot table with series in columns.

It is probably good practice to get used to using data in columns, because of the way a database table is structured. A database has fields and records. A field is a variable or measurement, such as date, eye color, serial number. A record is a single instance of a set of values for these fields.

When printed on paper or viewed onscreen (or imported into Excel), a database table is shown as a grid of rows and columns. Each column is a field or a variable, and each row is a record. The first row of the database table is a header row that contains the field label. Usually one column of the table, often the first column, contains a unique identifier or key for that particular record.

In the table above, we have fields for TLC, alpha, beta, and gamma. We have records for a, b, c, d, e, and f.

Like a database table, chart data should have a header row. When data is plotted with series in columns, this header row is used for series names, that is, the labels that appear in the legend of the chart.

Database tables have one header row only. Chart data ranges usually only have one row, but you can use more, often to good effect.

Subtotals and totals help to understand data, but they have no place in the source data for a chart. Suppose I have monthly values I want to plot.

If I put subtotal rows into my source data, it breaks up the visual appearance, so it’s hard to scan the individual values for discrepancies.

The quarterly subtotals disrupt the flow of data in the chart, and the much larger magnitude of the subtotals shrinks the monthly values.

If I include yearly totals, the quarterly subtotals shrink, and the monthly values are lost in the weeds.

Note that you can filter the quarterly and yearly categories from the chart in recent versions of Excel. Also note that a pivot chart will only show values from the pivot table and not totals and subtotals.

Just as a database table has a unique key field, the data range for a chart should have a header column identifying each row. This column is generally used for labels or values which are plotted along the X-axis of a chart.

This X-axis column should be the first column, to the left of any Y-axis values. This makes charting easier because Excel looks to the left for X values. But you would be amazed at how many people have trouble plotting their data when they have placed their Y values to the left of their X values. You can always tell Excel which data comes from where, but it is a lot more work, especially if you have to do it repeatedly, again and again, ad nauseam (can you feel the tedium?).

Another trick to help Excel identify your X values is to make the column of X values different from the Y values.

Probably the most common way to make the first column different is by filling it with text. While month names are a component of a date, a list of month names is text, as shown here.

When you select the data range (or one cell inside the data range), Excel uses the text labels in the first column as X-axis (category axis) labels, it uses the other columns as Y-axis (value axis) values, one column per series, and it uses the labels in the header row as series names. Here are line and column charts using this data (and area charts work the same way). Months from the first column are automatically placed along the category axis, headers from the first row are automatically used as legend entries (series names). and each column of Y values is plotted as a distinctly formatted series.

It’s the same with a bar chart, except that the category and value axes are switched. That’s right, in a bar chart, the X-axis is vertical and the Y-axis is horizontal. But the origin of the axes is at the bottom left in all of these charts, so values increase from left to right (like months advance from left to right in a line or column chart). Similarly, months advance from Jan to Jun moving bottom to top in a bar chart (like values increase from bottom to top in a line or column chart).

Confused? Since the months are listed from top to bottom in the worksheet, it would make sense to plot them from top to bottom in the chart. But there is logic in how Excel does it, and it is also pretty easy to fix when you know how. See Excel Plotted My Bar Chart Upside-Down for the simple technique.

XY Scatter charts are different: they have numerical axes for both X and Y (category and value) axes. When you plug in text for the X values, the chart doesn’t know what to do. Normally text is considered to have a value of zero, but for X values in an XY chart, Excel substitutes the counting numbers 1, 2, 3, up to the number of points. (Excel also does this if no X values have been specified.)

What if ever have an XY chart with the numbers 1, 2, 3, etc. along the axis, and you know you selected numerical data for the X values? Check the range of X values: there is probably a number stored as text, or an actual text label, somewhere in the range.

Using dates in the first column is another way to make the first column different. Excel recognizes the date formatting of the cell and parses the column as X values. An added bonus is that line, column, area, and bar charts have a special date type of axis (as opposed to the text axis shown above) that provides unique formatting options. This range has dates in the first column; notice that the dates are not uniformly spaced, but are taken on the 1st and 8th of each month.

Here are line and column charts made from this data. The line chart looks great and shows some of the enhanced date axis features. The data points are not equally spaced but reflect the non-uniform spacing of the data. Also, there is a tick mark and axis label on the first of each month, despite the nonuniform month lengths.

The column chart is wacky though. To accommodate the nonuniform data spacing, the chart has a slot for each day, even days without data. So each column chart series has to appear within the slot for its given day, and there are lots of days in between. This column chart should really be called a toothpick chart.

You can always change the axis type to text, and the column chart will look normal. But you lose the non-uniform nature of the dates and the first-of-the-month labels.

An XY scatter chart treats dates for X values just like any numerical X values. You can see the non-uniform spacing of the data.

But the axis is not labeled as naturally as the line chart above. Using the value axis algorithm, Excel picked an axis that begins at 44,180 and increases to 44,280 in steps of 20. (Excel stores dates as whole numbers starting on 1 January 1900, so 44,180 is 15 December 2020 and 44,280 is 25 March 2021.) You can format the XY chart’s axis to begin on the 1st of a month, and have a tick label on the 1st of the next month. But for a non-trivial number of months, it’s impossible to repeat this pattern with Excel’s default axis labels.

If you have numerical X values in the first column, they won’t be different from the numerical Y values in the other columns. *You* know those values in the first column are years, and I know they are years, and the header label even says “Year”, but Excel simply recognizes them as numbers.

This doesn’t matter for an XY Scatter chart: Excel almost always treats the first column of numbers as X values. This XY chart shows years along the X-axis, as intended.

When Excel created this line chart, it saw the numbers in the first column and decided to plot them as Y values. There are two consequences of this: First, there is a series of values near 2000 floating far above the intended Y values in the chart; second, no X values were specified, so Excel simply used the counting numbers 1, 2, 3, etc.

You could avoid this by selecting just the Y values when the chart is created, and specifying the X values later. You could also do this with years by entering dates in the first column (1/1/2015, 1/1/2016, etc.) and formatting them using a custom number format of YYYY, which will display just the year numbers. Since the column now is formatted as dates, Excel will plot them the way you want.

You cannot avoid this by formatting the first column of numbers as text, Excel is too smart, and it converts the numeric text as numbers, and therefore as Y values.

You may have noticed the label “TLC” in the header row of the first column. TLC stands for **Top Left Cell**, and it is a little piece of magic for Excel chart data.

One way to make the first column different from the rest is to clear the contents of the Top Left Cell, as shown below.

We already know that an XY Scatter chart will plot the first column of numbers as X values with a label in the top left cell. But it also works if the top left cell is blank.

The top left cell works its magic in line (and area, column, and bar) charts. This line chart was generated automatically with the column below the blank top left cell as X values, and the columns below actual labels as Y values. Finally, a way to plot numbers as X values in a line chart without worrying about formats (text or dates).

Here is an important difference between XY scatter charts and line charts. Just because you can trick Excel into plotting numbers as X-axis values in a line chart, you can’t trick Excel into plotting those X values as numbers. The next section shows a few different sets of data that will help illustrate this difference.

Evenly spaced X values seem to be similarly plotted in XY and line charts. In the XY chart, the X-axis begins at zero and extends beyond the highest X value. In the line chart, the first X-axis label is the first X value and the last X-axis label is the last X value, without the padding found in a scatter chart’s default axis values.

Unevenly spaced X values are plotted differently in XY and line charts. In the XY chart, the X-axis begins at zero and extends beyond the highest X value, and data points are plotted unevenly, reflecting the unevenness in the X values. In the line chart, the first X-axis label is the first X value and the last X-axis label is the last X value, data points and X-axis labels are uniformly spaced regardless of their apparent numerical values, and there are no X-axis labels for missing X values.

It is obvious that the line chart treats the X values as non-numeric text labels, ignoring any apparent numerical values. This is because the default axis for non-date X values is a text axis (below left). We can change this to a date axis (below right), and we see the data points are now plotted according to their unevenly-spaced numerical values. The first X-axis label is still the first X value and the last X-axis label is still the last X value. (Excel treats the numbers as dates with a format of “D”, so only the day shows.)

When numbers are out of order, the above behavior is even more different. The XY scatter chart draws the points and the lines connect them in order, moving left or right as the X values decrease or increase. The line chart shows the points in the order they appear in the data range: the first X-axis label is still the first X value and the last X-axis label is still the last X value, so the labels are out of order.

When we convert the text axis (below left) to a date axis (below right), we see that not only have the points been spaced according to their non-uniform values, but that the dates have been internally sorted prior to plotting. This internal sorting is a feature of charts with a date axis. The first X-axis label is the *smallest* X value and the last X-axis label is the *largest* X value, so the labels are in order and the points are plotted left to right.

Let’s look at the same charts with dates instead of regular numbers. The XY scatter chart (left) and line chart (right) plot evenly spaced dates along the X axis in a similar way. The data is evenly spaced, so the data points are plotted evenly. The XY chart extends its X-axis a little bit below to a little bit above its data range, while the line chart uses the earliest date as the first axis label and the latest date as the last axis label.

The XY scatter and line charts also plot unevenly spaced dates similarly. The points are spaced unevenly to reflect the pattern in the data. As before, the XY chart extends its X-axis a little bit below to a little bit above its data range, while the line chart uses the earliest date as the first axis label and the latest date as the last axis label.

Out of order dates are plotted in an XY scatter chart in the order they appear in the worksheet. The lines connecting the points start at the first point, and move left or right for earlier or later dates. In the line chart, the lines always connect from left to right, reflecting the internal sorting that takes place. As always, the XY chart extends its X-axis a little bit below to a little bit above its data range, while the line chart uses the earliest date as the first axis label and the latest date as the last axis label.

We’ve seen how a blank top left cell can help Excel to parse data correctly into X and Y values and series names.

But this concept of blank cells is more magical than that. Suppose you have two columns of category labels, for years and quarters. Each of these label columns has a blank cell in its header (blank cells are shaded gold). Each year only appears once, next to the first quarter, and there are blank cells next to the other quarters.

Excel sees the two blanks in the top left, and uses both columns as category labels. There are two rows of labels, with quarter labels in the first row and years in the second. Each year label is centered under the corresponding quarter labels, and a vertical tick mark extends from the axis to help delineate the labels.

This is a very nice effect, only available for line, column, area, and bar charts that are using a text style X-axis. It will not work for a date axis or for an XY scatter chart’s X-axis: in those cases you will get the numbers 1, 2, 3, etc.

You may have seen this kind of data arrangement in a Pivot Table, with multiple fields in the rows area, and noticed the multiple-tier category labels in the corresponding Pivot Chart. But you’re not stuck needing a Pivot Table, you can build this data layout yourself. You are not limited to two tiers of labels, either: below you can see a three-tiered axis, and I’ve seen it used for 5 or 6 tiers.

What if you indicate to Excel that you have two rows of series names? The same effect applies. The North label is combined with the alpha and beta series names, What if you indicate to Excel that you have two rows of series names? The same effect applies. The North label is combined with the alpha and beta series names, while the South labels is combined with the gamma and delta series names, to generate compound names.

You can combine the effects of multiple-tier category axis labels and compound series names in the same chart, as shown below.

Note that these blank cells must be totally blank, and not just look blank like a formula that returns a value of “”. And while a chart can treat #N/A as a blank in a chart’s Y values, #N/A will be treated as a text label, not as a blank, when used in the series name and category labels regions of the data range.

When a chart’s data doesn’t conform to these definitions of “good” data, the Select Source Data dialog shows only a blank for the chart data range. Below the range selection box, you are told “The data range is too complex to be displayed.” This means that the data is irregular. Not all series start and end at the same row, perhaps, or the series have different numbers of points. Perhaps the series names are misaligned from the Y values, or the series are plotted out of order. Anything that prevents Excel from indicating a nice rectangular block of data.

When the data does conform to our “good” data definition, the Select Source Data dialog happily displays the address of the data range.

In fact, the Select Source Data dialog is a bit more forgiving that my rules. If the data is separated by complete blank rows or columns, but otherwise fits within a rectangular range, the dialog shows the addresses of the various areas of the data range.

An Excel Table is a special data structure which provide advanced data handling capabilities; I have converted my data into a Table below. The header row has buttons for filtering and sorting of the Table. You can add a total row if desired. There are numerous styles you can apply, some of them approaching hideous; the Table below shows the default style.

There are many benefits to storing your chart data in a Table. The major benefit is that when you add data to the row below or the column to the right of a Table, the Table automatically expands to include the added data. What’s more, any formulas that reference a Table column or row will update automatically if the Table expands or contracts. This includes chart SERIES formulas and the Chart Data Range formula in the Select Source Data dialog. So if your chart uses a Table like this for its source data, when you add rows of data, each series in the chart will add the corresponding points; when you add columns to the Table, your chart will add series. Dynamic charts made easy!

One drawback to using a Table for your chart’s source data is that the header row of a Table can contain no blank cells. This means that a lot of the blank-cell based data parsing, especially the top left cell magic, may not work with a Table. But if your first column(s) are not numerical, Excel will still automatically parse them into X values. And you could still select just part of the Table, create your chart, then manually specify the X values.

The post Good Chart Data appeared first on Peltier Tech.

]]>Excel does a pretty good job of scaling its chart axes. But there are times when you wish it would do a better job. One case is shown in XY Scatter chart below. All of the points have X and Y values between 0 and 7, but because the chart itself is rectangular, the gridlines are spaced differently along the X and Y axes. Wouldn’t it be nicer if the spacing was the same along both axes, giving you square gridlines?

There are a few ways to accomplish this, not including the tedious manual method of clicking and dragging with the mouse or trying a sequence of values for the axis maximums to try to make it work out. I’ll use VBA to handle this task.

The first approach works by measuring the chart’s plot area dimensions, locking in the axis scale parameters, and using the scale to determine how far apart the gridlines are horizontally and vertically. Then the axis which has the larger spacing has its maximum increased so its gridline spacing shrinks to match the spacing on the perpendicular axis.

I wrote this function: pass in the chart you want to be squared up, and the function does its job.

```
Function SquareGridChangingScale(myChart As Chart)
With myChart
' get plot size
With .PlotArea
Dim plotInHt As Double, plotInWd As Double
plotInHt = .InsideHeight
plotInWd = .InsideWidth
End With
' Get axis scale parameters and lock scales
With .Axes(xlValue)
Dim Ymax As Double, Ymin As Double, Ymaj As Double
Ymax = .MaximumScale
Ymin = .MinimumScale
Ymaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
With .Axes(xlCategory)
Dim Xmax As Double, Xmin As Double, Xmaj As Double
Xmax = .MaximumScale
Xmin = .MinimumScale
Xmaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
' Tick spacing (distance)
Dim Ytic As Double, Xtic As Double
Ytic = plotInHt * Ymaj / (Ymax - Ymin)
Xtic = plotInWd * Xmaj / (Xmax - Xmin)
' Keep plot size as is, adjust max scales
If Xtic > Ytic Then
.Axes(xlCategory).MaximumScale = plotInWd * Xmaj / Ytic + Xmin
Else
.Axes(xlValue).MaximumScale = plotInHt * Ymaj / Xtic + Ymin
End If
End With
End Function
```

I’ll describe several ways to call this function later, but for now, this line of code should be fine, either in your own sub or just in the Immediate Window:

`SquareGridChangingScale ActiveChart`

The squared-up chart is shown below. The gridlines are square, accomplished by changing the X-axis maximum to 12.9777. Good thing we have VBA to calculate this for us.

There is a strange blank edge to the chart, but you could make it look less strange by formatting the plot area border to match the axes.

Let’s try with another chart. This is like the first, but the X values are twice as large as before, leading to a different scale.

This is the resulting chart. Again, the gridlines are square and the right edge looks blank. But we see another problem. The X-axis tick spacings are 2 units apart, compared to the Y-axis with its 1 unit spacing.

I’ll modify my earlier procedure by adding an optional argument `EqualMajorUnit`

. If this is True, the code will apply the same spacing to both axes before adjusting the axis maximum values. If it is False or omitted, the code will ignore tick spacing.

```
Function SquareGridChangingScale(myChart As Chart, Optional EqualMajorUnit As Boolean = False)
With myChart
' get plot size
With .PlotArea
Dim plotInHt As Double, plotInWd As Double
plotInHt = .InsideHeight
plotInWd = .InsideWidth
End With
' Get axis scale parameters and lock scales
With .Axes(xlValue)
Dim Ymax As Double, Ymin As Double, Ymaj As Double
Ymax = .MaximumScale
Ymin = .MinimumScale
Ymaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
With .Axes(xlCategory)
Dim Xmax As Double, Xmin As Double, Xmaj As Double
Xmax = .MaximumScale
Xmin = .MinimumScale
Xmaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
If EqualMajorUnit Then
' Set tick spacings to same value
Xmaj = WorksheetFunction.Min(Xmaj, Ymaj)
Ymaj = Xmaj
.Axes(xlCategory).MajorUnit = Xmaj
.Axes(xlValue).MajorUnit = Ymaj
End If
' Tick spacing (distance)
Dim Ytic As Double, Xtic As Double
Ytic = plotInHt * Ymaj / (Ymax - Ymin)
Xtic = plotInWd * Xmaj / (Xmax - Xmin)
' Keep plot size as is, adjust max scales
If Xtic > Ytic Then
.Axes(xlCategory).MaximumScale = plotInWd * Xmaj / Ytic + Xmin
Else
.Axes(xlValue).MaximumScale = plotInHt * Ymaj / Xtic + Ymin
End If
End With
End Function
```

Here is our chart after running the second version of the function.

That’s better, the grids are square and have equal spacing. Now the unfinished edge is along the top of the chart. Again, matching the plot area line color to the axes makes it look less awkward.

We achieved square gridlines above by keeping the plot area fixed and adjusting the axis scales. But what if we shrink the plot area by the amount needed to square up the gridlines? We’ll get blank space along the edge of the chart without some gridlines hanging off in space, and we can then center the plot area within the chart.

Here’s the new function. Note that I’ve kept the `EqualMajorUnit`

argument.

```
Function SquareGridChangingPlotSize(myChart As Chart, Optional EqualMajorUnit As Boolean = False)
With myChart
' get plot size
With .PlotArea
Dim plotInHt As Double, plotInWd As Double
plotInHt = .InsideHeight
plotInWd = .InsideWidth
End With
' Get axis scale parameters and lock scales
With .Axes(xlValue)
Dim Ymax As Double, Ymin As Double, Ymaj As Double
Ymax = .MaximumScale
Ymin = .MinimumScale
Ymaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
With .Axes(xlCategory)
Dim Xmax As Double, Xmin As Double, Xmaj As Double
Xmax = .MaximumScale
Xmin = .MinimumScale
Xmaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
If EqualMajorUnit Then
' Set tick spacings to same value
Xmaj = WorksheetFunction.Min(Xmaj, Ymaj)
Ymaj = Xmaj
.Axes(xlCategory).MajorUnit = Xmaj
.Axes(xlValue).MajorUnit = Ymaj
End If
' Tick spacing (distance)
Dim Ytic As Double, Xtic As Double
Ytic = plotInHt * Ymaj / (Ymax - Ymin)
Xtic = plotInWd * Xmaj / (Xmax - Xmin)
' Adjust plot area size, center within space
If Xtic < Ytic Then
.PlotArea.InsideHeight = .PlotArea.InsideHeight * Xtic / Ytic
.PlotArea.Top = .PlotArea.Top + (.ChartArea.Height - .PlotArea.Height - .PlotArea.Top) / 2
Else
.PlotArea.InsideWidth = .PlotArea.InsideWidth * Ytic / Xtic
.PlotArea.Left = .PlotArea.Left + (.ChartArea.Width - .PlotArea.Width - .PlotArea.Left) / 2
End If
End With
End Function
```

When we run this code, we get square gridlines with no funny gridline extensions and no large blank areas in the chart. The plot area is nicely centered.

This is much better than the first approach. Let’s see how it works with the other data.

Okay, we forgot to use `EqualMajorUnit`

= True, so the square grid has different tick spacing on the X and Y axes. Let’s try again.

Yes, this second function is a major improvement over the first. It may even be better to set the default value of `EqualMajorUnit`

to True.

When the second function resized the plot area, we ended up with a bit of white space in the resulting chart. In some cases, this amount of white space was substantial. What if we shrink the whole chart, not just the plot area, and absorb the excess white space?

```
Function SquareGridChangingChartSize(myChart As Chart, Optional EqualMajorUnit As Boolean = False)
With myChart
' get plot size
With .PlotArea
Dim plotInHt As Double, plotInWd As Double
plotInHt = .InsideHeight
plotInWd = .InsideWidth
End With
' Get axis scale parameters and lock scales
With .Axes(xlValue)
Dim Ymax As Double, Ymin As Double, Ymaj As Double
Ymax = .MaximumScale
Ymin = .MinimumScale
Ymaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
With .Axes(xlCategory)
Dim Xmax As Double, Xmin As Double, Xmaj As Double
Xmax = .MaximumScale
Xmin = .MinimumScale
Xmaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
If EqualMajorUnit Then
' Set tick spacings to same value
Xmaj = WorksheetFunction.Min(Xmaj, Ymaj)
Ymaj = Xmaj
.Axes(xlCategory).MajorUnit = Xmaj
.Axes(xlValue).MajorUnit = Ymaj
End If
' Tick spacing (distance)
Dim Ytic As Double, Xtic As Double
Ytic = plotInHt * Ymaj / (Ymax - Ymin)
Xtic = plotInWd * Xmaj / (Xmax - Xmin)
' Adjust chart size
If Xtic < Ytic Then
.Parent.Height = .Parent.Height - .PlotArea.InsideHeight * (1 - Xtic / Ytic)
Else
.Parent.Width = .Parent.Width - .PlotArea.InsideWidth * (1 - Ytic / Xtic)
End If
End With
End Function
```

Some caveats apply to this approach. When you resize the chart, the chart title may decide it needs a line break, which will change the plot area size, and make the gridlines not square. Here are the results of our charts from the two data sets, without fixing the tick spacing mismatch of the second data set.

Here is the chart for the second data set, with `EqualMajorUnit`

set to True.

The only way to improve my functions is to ignore the first and combine the last two, passing in the parameter `ShrinkChart`

which tells the function whether to adjust the plot area (if False) or chart size (if True).

```
Function SquareXYChartGrid(myChart As Chart, ShrinkChart As Boolean, _
Optional EqualMajorUnit As Boolean = False)
With myChart
' get plot size
With .PlotArea
Dim plotInHt As Double, plotInWd As Double
plotInHt = .InsideHeight
plotInWd = .InsideWidth
End With
' Get axis scale parameters and lock scales
With .Axes(xlValue)
Dim Ymax As Double, Ymin As Double, Ymaj As Double
Ymax = .MaximumScale
Ymin = .MinimumScale
Ymaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
With .Axes(xlCategory)
Dim Xmax As Double, Xmin As Double, Xmaj As Double
Xmax = .MaximumScale
Xmin = .MinimumScale
Xmaj = .MajorUnit
.MaximumScaleIsAuto = False
.MinimumScaleIsAuto = False
.MajorUnitIsAuto = False
End With
If EqualMajorUnit Then
' Set tick spacings to same value
Xmaj = WorksheetFunction.Min(Xmaj, Ymaj)
Ymaj = Xmaj
.Axes(xlCategory).MajorUnit = Xmaj
.Axes(xlValue).MajorUnit = Ymaj
End If
' Tick spacing (distance)
Dim Ytic As Double, Xtic As Double
Ytic = plotInHt * Ymaj / (Ymax - Ymin)
Xtic = plotInWd * Xmaj / (Xmax - Xmin)
If ShrinkChart Then
' Adjust chart size
If Xtic < Ytic Then
.Parent.Height = .Parent.Height - .PlotArea.InsideHeight * (1 - Xtic / Ytic)
Else
.Parent.Width = .Parent.Width - .PlotArea.InsideWidth * (1 - Ytic / Xtic)
End If
Else
' Adjust plot area size, center within space
If Xtic < Ytic Then
.PlotArea.InsideHeight = .PlotArea.InsideHeight * Xtic / Ytic
.PlotArea.Top = .PlotArea.Top + (.ChartArea.Height - .PlotArea.Height - .PlotArea.Top) / 2
Else
.PlotArea.InsideWidth = .PlotArea.InsideWidth * Ytic / Xtic
.PlotArea.Left = .PlotArea.Left + (.ChartArea.Width - .PlotArea.Width - .PlotArea.Left) / 2
End If
End If
End With
End Function
```

Here is how you might call the function from one Sub, which determines which charts were selected and applies the function to each.

```
Sub SquareXYGridOfSelectedCharts()
If Not ActiveChart Is Nothing Then
SquareXYChartGrid ActiveChart, True, True
ElseIf TypeName(Selection) = "DrawingObjects" Then
Dim shp As Shape
For Each shp In Selection.ShapeRange
If shp.HasChart Then
SquareXYChartGrid shp.Chart, True, True
End If
Next
Else
MsgBox "Select one or more charts and try again.", vbExclamation, "No Chart Selected"
End If
End Sub
```

- Calculate Nice Axis Scales with LET and LAMBDA
- Calculate Nice Axis Scales in Your Excel Worksheet
- Calculate Nice Axis Scales in Excel VBA
- Chart UDF to Control Axis Scale
- How Excel Calculates Automatic Chart Axis Limits
- Reciprocal Chart Axis Scale
- Custom Axis Labels and Gridlines in an Excel Chart
- Custom Axis, Y = 1, 2, 4, 8, 16
- Logarithmic Axis Scales
- Link Excel Chart Axis Scale to Values in Cells
- Consistent Axis Scales Across Multiple Charts
- Gantt Chart with Nice Date Axis
- Select Meaningful Axis Scales
- Bar Chart Value Axis Scale Must Include Zero
- Text Labels on a Horizontal Bar Chart in Excel

The post Make Excel Chart Gridlines Square appeared first on Peltier Tech.

]]>