Uncle Josh Crunches Some Numbers

I guess it has been a couple of weeks since I started gathering data, and Stephanie is out of town this week, so I should probably look at some of the numbers I did collect. What else am I going to do, clean the kitchen?

My goal of starting by comparing 5 full years from “back then” to 5 full years of “just now” isn’t coming along as well as I’d hoped because data for the 30’s and 40’s is sparse. I plugged my data into Google Sheets, which has some cool graphing features I didn’t know about until I asked for a graph.

Here are the high temperatures for 2011-2015:

Daily High Temperature (F) 2011-2015

It’s more of a saw curve than I’d expect. Here is the Daily Delta for that time period:

Changes in Daily High 2011-2015

This is the sort of noise I expected to see. The Delta is simply (Tomorrow’s High)-(Today’s High). I included the high for Jan 1 2016 for the Delta on Dec 31 2015. That’s 1,826 data points in both sets.
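For anyone who would rather script this than use a spreadsheet, here is a minimal Python sketch of the Delta calculation. The function name daily_deltas is made up for illustration, and the sample highs are the four days from November 4-7 2015 discussed further down, not the full dataset.

```python
def daily_deltas(highs):
    """Return (tomorrow's high) - (today's high) for each consecutive pair of days."""
    return [tomorrow - today for today, tomorrow in zip(highs, highs[1:])]


# Highs for 11/4/2015 through 11/7/2015 (illustrative variable name, values from the post).
sample_highs = [53, 53, 64, 53]
print(daily_deltas(sample_highs))  # [0, 11, -11]
```

A list of N highs only yields N-1 deltas, which is why the Delta for Dec 31 2015 needs the high for Jan 1 2016.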

To find spikes in temperature change, I can measure the distance between large jumps in temperature. What I don’t know is how to determine if a change is severe enough. My daily deltas range from -20 to +21 degrees Fahrenheit. So I’ll start by saying anything over 15 degrees, positive or negative, is a severe delta. This seems reasonable. On Monday you dress for an 80 degree day; if Tuesday is 65 degrees you change everything, or if Tuesday is 95 you modify again.

But these could be markers between warm periods and cool periods. A 20 degree jump followed by several days of changes between -5 and 5 degrees is not a spike, but a warm spell or a cool spell. I need better definitions and formulas. One way is to compare a 3-day delta to the sum of the absolute values of the deltas for those three days. Here’s an example:

On 11/5/2015 it was 53 degrees (and on 11/4/2015). On 11/6 it was 64. Then on 11/7 it was 53 again. I built a list of these 3-day deltas and compared them to the total absolute value of the change, dividing the latter by the former (using the latter if the former was 0), and this gives me a “score” of 22. We experienced 22 degrees of change in three days, with no net change in overall temperature. This was a spike, but only 11 degrees, so maybe not severe, and I’m sure we all said “oh it’s nice and warm today” and gave the weather gods a bilabial fricative on the 7th.


That’s a moderate spike. I want to find the big ones. I calculated this “3-day ratio” over five years. My highest ratio was 30 degrees, my lowest -29 degrees. I should just use the absolute values of these numbers, and probably will.

The three highs on July 18-20 2014 were 80, 87, 72, and 79. Wait. That’s four days. That’s a lot of thermal variation over a 4-day period.


This means everything I wrote about the “3-day delta” and “3-day ratio” is in fact about 4-day deltas and 4-day ratios. It’s a good thing I’m only explaining the process and not the results, isn’t it?
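To make the windowed calculation concrete, here is a rough Python sketch of the ratio as I described it: total absolute change divided by net change, falling back to the total when the net change is 0. The name window_score is an invention for this example, and the sketch keeps the sign of the net change rather than taking the absolute value.

```python
def window_score(highs):
    """Score a window of consecutive daily highs.

    Total absolute change divided by net change, or just the total
    absolute change when the net change is zero (avoids dividing by zero).
    """
    deltas = [b - a for a, b in zip(highs, highs[1:])]
    net_change = sum(deltas)                    # where we ended up vs. where we started
    total_change = sum(abs(d) for d in deltas)  # how far the temperature traveled
    if net_change == 0:
        return total_change
    return total_change / net_change


print(window_score([53, 53, 64, 53]))  # 11/4-11/7/2015 -> 22
print(window_score([80, 87, 72, 79]))  # July 2014 window -> -29.0
```

The second window travels 29 degrees and ends 1 degree lower than it started, which is where a ratio of -29 comes from.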

I did the same calculation using “2-Day Deltas” and “2-Day Ratios” and I went ahead and used the absolute value of the calculation. The maximum ratio in this calculation is 24. That occurred between January 19-21 2012. Highs were 53, 41, 53. (Yes, I realize my calculation means that I’d score roughly a 24 if the high of the 21st was 52, 53, or 54 degrees. I can live with that to avoid division by zero errors.) This is the kind of switch I’m looking for, but what it tells me, as this is the maximum change, is that I’m not going to see any one day spike greater than 15 degrees. 24 is my maximum ratio, and that’s 12 down and 12 back up again.

I found a 23 degree 2-Day Ratio on October 11-13 2012. Highs were 66, 54, and 65.

Thinking about this makes me consider some folly. Let’s face it, if October 13th had been 64 degrees, my ratio would have been 11 (22 degrees of total change over a net change of 2), a much lower score that may not stand out even though the difference between 66 and 64 degrees is about the same as between 66 and 65.

So maybe I can just use the absolute value of any “2-Day Delta” to find these spikes. The 2-Day Delta is, well, maybe a bit more spreadsheet math to explain my thinking. All the highs are in Column B:

  • High Delta (Column X): B3-B2
  • High Delta ABS (Column Y): ABS(X2)
  • 2-Day Actual (Column Z): SUM(X2:X3)
  • 2-Day Delta (Column AA): SUM(Y2:Y3)
  • 2-Day Ratio: IFERROR(ABS(AA2/Z2),AA2)
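The same columns can be sketched in Python, for anyone following along without a spreadsheet. This is only an approximation of the formulas above: the function name two_day_columns and the dictionary keys are invented for the example, and where IFERROR catches any error, this version only handles the divide-by-zero case.

```python
def two_day_columns(highs):
    """Rebuild the spreadsheet columns for every run of three consecutive highs."""
    deltas = [b - a for a, b in zip(highs, highs[1:])]  # Column X: High Delta
    abs_deltas = [abs(d) for d in deltas]               # Column Y: High Delta ABS
    rows = []
    for i in range(len(deltas) - 1):
        actual = deltas[i] + deltas[i + 1]              # Column Z: 2-Day Actual
        delta = abs_deltas[i] + abs_deltas[i + 1]       # Column AA: 2-Day Delta
        # 2-Day Ratio: fall back to the 2-Day Delta when the actual change is zero.
        ratio = delta if actual == 0 else abs(delta / actual)
        rows.append({"actual": actual, "delta": delta, "ratio": ratio})
    return rows


print(two_day_columns([53, 41, 53]))  # January 19-21 2012
print(two_day_columns([66, 54, 65]))  # October 11-13 2012
```

The January window comes out with an actual change of 0 and a ratio of 24; the October window has an actual change of -1 and a ratio of 23.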

My maximum 2-Day Delta is 35. This happened on April 26-28 2015. The Highs were 61, 82, 68. That is a spike. Thirty-five degrees of change over two days, with only 7 degrees change at the end of it.

July 12-14 2014: Highs were 93, 74, 84. A 19-degree drop and 11 degrees back up.

June 30-July 2 2014: Highs were 85, 99, 82. (July 3 was 74, July 4 was 80 – almost makes me want to count that one, too.)

Those are the only 2-Day Deltas over 30 in my 5-year dataset. I think I’m missing something. Let’s look at a frequency table of the 2-Day Deltas:

2-Day Delta Frequency

I’m not sure if this tells me anything other than 95.35% of my 2-Day Deltas are less than 20 degrees.
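A frequency table like this is easy to sketch in Python. Everything here is illustrative: the function names, the 5-degree bin width, and the sample list are my assumptions. The sample is just the six 2-Day Deltas that can be worked out from windows quoted in this post, not the full five-year dataset, so the share it prints has nothing to do with the 95.35% figure.

```python
from collections import Counter


def frequency_table(values, bin_width=5):
    """Count values per bin, keyed by the lower edge of each bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def share_below(values, cutoff):
    """Fraction of values strictly below the cutoff."""
    return sum(1 for v in values if v < cutoff) / len(values)


# Six 2-Day Deltas derived from windows quoted in the post (illustrative sample only).
sample_two_day_deltas = [11, 22, 23, 24, 31, 35]
print(frequency_table(sample_two_day_deltas))  # {10: 1, 20: 3, 30: 1, 35: 1}
print(share_below(sample_two_day_deltas, 20))
```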

The 2-Day Deltas don’t tell me if the two days are big movements in the same direction or in opposite directions. That’s what the ratio was supposed to tell me.

2-Day Ratios

Does mapping the 2-Day Ratios in a simple bar chart illustrate what’s happening? Nope. That just tells me where these shifts appear but doesn’t tell me if it’s a proper spike. I have a vague definition of “spike” here. I don’t think it’s a mathematical formula I need, but a logical formula. A spike occurs IF the total change in temperature over two days is more than 20 degrees AND the actual change over that time period is between -3 and 3 degrees. That may work.

That gives me Spikes on the following dates: 5/3/2011, 1/19/2012, 10/11/2012, 6/21/2013, 5/17/2014, 5/21/2014, and 11/5/2015. Actually, the spikes occurred on the day after all of those dates. But that only includes changes between -3 and 3 degrees exclusive. If I include -3 and 3, I have 13 dates in the five-year period. This, I think, gives me something to work with.
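Here is how that logical test might look in Python, as a sketch rather than the exact spreadsheet formula I used. The function name is_spike and its parameters (min_total, max_net, inclusive) are made up for illustration; the inclusive flag switches between the “exclusive” and “include -3 and 3” versions of the rule.

```python
def is_spike(three_highs, min_total=20, max_net=3, inclusive=False):
    """Apply the spike rule to three consecutive daily highs.

    Spike = total change over the two days is more than min_total degrees
    AND the net change is between -max_net and max_net degrees.
    """
    first, middle, last = three_highs
    total_change = abs(middle - first) + abs(last - middle)
    net_change = last - first
    if inclusive:
        net_is_small = abs(net_change) <= max_net
    else:
        net_is_small = abs(net_change) < max_net
    return total_change > min_total and net_is_small


print(is_spike((53, 41, 53)))  # January 19-21 2012 -> True
print(is_spike((66, 54, 65)))  # October 11-13 2012 -> True
print(is_spike((61, 82, 68)))  # April 26-28 2015 -> False
```

The April 2015 window has the biggest total change in the dataset (35 degrees), but it ends 7 degrees warmer than it started, so under this rule it reads as the start of a warm spell rather than a spike.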

There’s a lot of testing here to find a metric. My next steps will be getting more early data, applying the same ideas to that set, and then trying to find a way to compare the two.