# Adding a Linear Trend Line to a Chart

August 30, 2011 42 Comments

I must admit, I have long lamented that Web Intelligence doesn’t have built in trend lines for charts. Of course, that doesn’t mean you can have one. You just have to build it manually, and that can be a lot of work. So this post is about how to do a linear trend line, also known as a Linear Regression. This technique makes extensive use of Calculation Contexts.

We’ll use the eFashion universe for this example, so you can follow along with all the steps. Start by creating a query with Store name, Year, Month name, and Sales revenue. Run the query, and remove Store name and Year from the block, and put a custom sort on Month name to get the months in chronological order. Finally, add report filters for Store name and Year to the Report Filter toolbar. Your report should now look like this.

OK, let’s get started. We’ll need to create quite a few variables along the way.

1. Create a variable called “n”. This will simply calculate the number of values in the block. So the formula is as follows:

=Count([Month Name])In Block

Add this new variable to the block, and verify that it shows 12 on every row.

2. The next variable simply numbers the rows in the block, so we’ll call it “x” as it numbers the values along the X axis of the chart. The formula for this one is:

=Sum(RunningCount([Month Name]))

If you add [Line Num] to the block, it should number the rows, 1 – 12.

3. Now we get into the heavy calculations. Call the next variable “a” with the following formula:

=(Sum([x] * [Sales revenue]) In Block * [n] In Block) – (Sum([x]) In Block * Sum([Sales revenue]) In Block)

You can add the [a] variable to the block, just to verify that you get the same number on each row. Notice the “In Block” context used several times in this formula. In Block is an output context operator that tells the formula to output one number for the entire block. This is why we get the same value on each row.

4. Moving along in order, let’s call the next variable “b”, with the following formula:

=([n] * [n] * (Sum([x] * [x]) In Block / [n] – (Average([x]) In Block * Average([x]) In Block)))

The first part of this formula multiplies the squares of [n] and [x], outputing one number for the block, dividing it by the number of values in the block. It divides that by the number of values minus the square of the average [x] values. Sound confusing? Trust me, we’re on the right path, but we’re not done yet.

5. OK, this brings us to the next variable, which we’ll call “c”. This one is pretty simple:

=[a]/[b]

Of course, since both [a] and [b] return one value for all rows, we can expect [c] to return one value for all rows.

6. We’re in the home stretch now. Add another variable, called “d”, with the following formula:

=(Average([Sales revenue]) – [c] * Average([x])) In Block

Again, this one should return one value for all rows. Any time you see “In Block” as the output context for a block, you can expect it to return one value for all rows.

7. OK, one final variable! We’ll call this last variable “Trend”. This one has the following formula:

=[d] + [c] * [x]

That’s all the variables we need. We are now ready to plot this on a chart. So, remove everything from the block except the Month name, Sales revenue, and Trend objects. Finally, turn the block into a line chart, and format as desired. You should end up with something like this:

As you select different years, or stores, from the filters, above, the trend line will adjust accordingly.

Give it a shot with your charts and let me know what you think?

Michael

I like this solution much better than another I found on the BOB board. I have a question, however. Neither the other solution nor this solution will work if I want to trend year over year. For example when I bring in 2010 and 2011 (against my data) the trend line does not work. Ideally our clients would like to see at least two years trending. Any idea how to do that with your solution?

Hey, Roland. You broke this solution already? I’m impressed. This solution should work with any data. I’d have to see the layout of your data to try to figure out why it isn’t working (Isn’t that a great “Consultant” type answer?).

Michael

The solution works perfectly with our data, “except” when I try to include mutliple years. As far as seeing our data layout, I guess I need to get our mutual friend Chris to reach out to you. :~)

Roland

Works nicely for weekly data. I just change the [Month Name] variable into my [week] variable. It even works for weeks spanning several months (In my case, I’m generating trend line for 17 weeks data)

Michael – great stuff. Thanks for sharing 🙂

My pleasure, Ferdie.

Michael,

Thank you for sharing. This works great with my rolling 12 week data. I have a question, since I am trending over a rolling 12 week, will this still work if comes January 2013 when my weekly data will be split between 2012 and 2013? This is awesome. Thank you.

Thanks, Tiffany. I’m glad it works for you. When 2013 rolls around, this will still work, as long as your data remains in chronological order.

Thank you Michael for the quick reply, One more question, my trend line tend to go from hgh to low instead of low to high. I reviewed the formulas and everything looks correct. I think I might have missed something.

-tiffany

Hi, Tiffany. Without looking at your report, I’m at a loss as to why it’s going in the wrong direction. Make sure your values on the X-Axis are sorted in the right order.

,

This is really a nice one..

We have implemented a trend line one almost a year back using interpolation function. It works wonders in some particular conditions

Michael, you are the man! This is a spot on linear regression solution practically explained, as oppose to a bunch of discrete mathematics terms. Just don’t understand why people saying things like “works on our data”, or “works on weekly data”, etc. This is a generic solution that will work on ANY data.

Thanks, David. I’m glad that you found it useful.

Michael, thank you for sharing. Works like a charm.

Very useful Post Michael. Can you tell me if this works for Bubble Charts too?

I’ve never tried it with a bubble chart, but I doubt that it would give the expected results, since all measures on a bubble chart are shown in bubbles. So you can’t have a trend line on a bubble chart.

Hello Michael, my question would be, is there any possibility to make this trendline if i have data for n month and i have to show this n month trend plus for the future n+2 month also in the same chart?

Greg, I’m not sure I understand the question. Have you posted this on BOB?

Hello Michael, let’s put it in this way, it is not enough to put the trend line on the existing data, i would like to predict this to the future. Like make the trend line as long through the months as i want. What do you think, is it possible? No i have not posted it to BOB yet.

Hi, Greg,

Well, where there’s a will, there’s a way, right? So, I would imagine that it’s possible. So, if you figure it out before I do, post the solution here. 🙂

Very nice explanation… I just followed step-by-step your instructions and I was able to add the trendline. Thanks very much.

Hi Michael,

Thanks for the helping to achieve trendline in webi :). Trendline worked perfectly in Webi rich client but same report exported in infoview i am getting trendline in negative values, could be pls help in identifying it?

Hi, Sindhu. That is strange. I haven’t seen that before. Try posting about this on BOB, and see if anyone else has seen this.

This is Fantastic. It works a treat for my purposes

You will never know how much I appreciate this posting. Thank you Sir!

You’re quite welcome. I glad you like it.

Thanks for the post. Could you add two sentences about the calculation of the slope? Somehow I’m stumped.

Is it not working for you?

Hello Michael,

Thanks for your post But I have some issue with this formula.

I tried with my set of data but I don’t get expected results.

Indeed, with only positive values among 8 months my trend line ends in negative values.

I think I have follow your hints but maybe I missed something.

If I export my results in an excel file and build a tendancy line on my set of data I get something which seems relevant.

Could you have time if I send you my file to have a look and tell me what I’m doing wrong?

Your method seems to be so useful and other people don’t face my issue so that’s why I find it strange…

Thanks in advance

Regards,

R.B.

Check your calculations, and, if you still can’t figure it out, try posting the issue on BOB,

Michael-excellent, excellent, excellent. I’m very new to using BI-Web Intelligence XI (and really the BI community as a whole). And your descriptions were easy and straightforward to use. Two questions: is there a way to remove the data markers on the trend line. I see you can ‘vary’ the data markers and this will at least give some distinction. And two, do you know of way to ‘orient’ the labels for the X-axis? I know there’s orientation for Y-axis, but I didn’t know if there’s a way to do this for the X-axis in some other feature.

Again, great explanation for trend lines!!

Jason

Hi Michael, to be honest I am lost. Can you please share picture with removed columns (points 3-6)? Thank you.

Regards,

Martin

Hi, Martin. I’m not sure what you want a picture of. If you try this with eFashion, you can copy the formulas right out the post and paste them in your formula editor.

Hi Michael,

thank you for response, but i don’t know eFashion. What I would like achieve is to be able generate this trendline in PHP language. But to do that, I have to understand formula clearly. I thought that I will be able simulate it in the Excel, but with no luck. I do not exactly know which formula apply for whole column or for a row only. If you give me some hint how reproduce in Excel, it will be helpful. Thank you.

Unfortunately, I’m not familiar enough with Excel to create these formulas there, especially since Excel already had a built in trend line capability. So I won’t be able to help you with that.

Any hints on how to get this working with multiple years.

I substituted Week Number for Month Number and for a single year it works great, but when multiple (3) years are selected (year is assigned to Region Colour) it causes things to go a bit wacky and have 3 parallel trend lines.

Looking at the values I think it’s because the running sum of variable X is not getting reset for each year and hence affects the ensuing calculations.

Ok after some playing around here are the amended variables that also cater for multiple years.

Pretty much just added “In [Year] ” everywhere there is an “In Block” in variables n, a, b and d.

Added “;([Year])” to variable x to reset the running count for each year.

Variables c and Trend remain the same since they are just built using other variables.

n

=Count([Month]) In ([Year]) In Block

x

=Sum(RunningCount([Month];([Year])))

a

=(Sum([x] * [Sales Revenue]) In ([Year]) In Block * [n] In ([Year]) In Block) – (Sum([x]) In ([Year]) In Block * Sum([Sales Revenue]) In ([Year]) In Block)

b

=([n] * [n] * (Sum([x] * [x]) In ([Year]) In Block / [n] – (Average([x]) In ([Year]) In Block * Average([x]) In ([Year]) In Block)))

d

=(Average([Sales Revenue]) – [c] * Average([x])) In ([Year]) In Block

Well done, Nick. Thanks for the update.

Nix, I was excited to find this comment as I’m trying something similar but instead of year to Regional color I’m adding stores. I’m able to reset x by using forall ([Stores]). Where you added ‘in [Year]’ I replaced with ‘in [Stores]’. Not quite there yet but something to work with. I also asked for help on BOB also.

If you know offhand what I may be missing from what you’ve done, I would love a reply.

Thanks.

I was also able to add a dimension such as “Store” to Regional Color data assignment to retrieve a separate line for each “Store”. My bigger challenge came when client wanted a trend line out 6 months on top of last 31 days. I decided it was impossible as I couldn’t associate a Store with a future date. I could use interpolation function but that shows the one line for each store with the first 31 days of actual and then a linear line for next 6 months. It’s not idea for client, but I don’t know of any other option.

Thank you for the awesome explanation 🙂 .Is this trend line calculation is valid for all kind of calculations ?

You can use these calculations to add trend lines to any kind of data.