Solving Aggregate Aggravation

The problem with Aggregate Awareness in a universe is that, when setting it up, you can’t skip any steps. Wouldn’t it be great if we could simplify it by skipping a step here and there? Well, it just doesn’t work that way. So, in this post, we’re going to go carefully through all the required steps (There are five of them) in setting up Aggregate Awareness in a universe. We’ll also explore some unconventional uses for the feature. By the time we’re done, my hope is that you’ll have a quick reference of Aggregate Awareness that you will find useful over and over again.

What are the Five Steps to Setting up Aggregate Awareness?
I’ll start by listing the five steps to setting up Aggregate Awareness in a SAP BusinessObjects universe. We’ll dig deeper into each step in the next section.

  1. Insert the aggregate table into the universe and add the appropriate joins.
  2. Create a context for the aggregate table.
  3. Redefine existing objects, using the Aggregate_Aware() function.
  4. Define Aggregate Navigation.
  5. Test.

Don’t even think about skipping a step. It won’t work.

NOTE: These steps are the same regardless of which tool you use.

Digging into the Five Steps

Step 1: Insert the aggregate table into the universe and add the appropriate joins

First of all, what is an aggregate table? An aggregate table is a table that has measure values already aggregated according to the needs for reporting. For example, you may have a SALE table that contains every sales transaction. But for reporting purposes, you need sales summed up by day for each branch. So, your DBA, or ETL developer, creates a script that extracts the transactional data, sums it up by day and branch, and inserts it into another table. This is a Daily Aggregate table.

Even though you may have multiple aggregate tables in the database, set them up in the universe, one at a time, with Aggregate Awareness. This will make it easier to troubleshoot any issues that may arise in the process. So, insert the table into your Structure Pane, or Data Foundation, and add any joins that may be appropriate for that table. If it is a completely self-contained table, then you may not need any joins. In reality, that is extremely rare. Don’t forget to set the appropriate cardinalities for the joins. As a rule, the aggregate table should be at the “many” end of all of its joins.

Step 2: Create a Context for the Aggregate Table

Keep in mind that an aggregate table is a fact table. And, like all fact tables, should have its own context. So, create a context for the aggregate table. Include all joins that are attached to the aggregate table, as well as any other joins that may be needed for that context. If the aggregate table is a self-contained table, with no joins, then a context will not be needed. But again, this is extremely rare. As a rule, you should run an integrity check at this point, just to make sure you haven’t missed any joins.

Step 3: Redefine existing objects, using the Aggregate_Aware() function

The Aggregate_Aware() function has a minimum of 2 arguments, and no maximum. Each argument is a complete Select statement for that object, in order of preference. For example, if we are setting this up for Sales Revenue, the first choice would be to get the data from the aggregate table. The second choice would be to get the data from the transaction table. So, the select for your Sales Revenue object might look something like this:

Aggregate_Aware(SUM(AGG_TABLE.REVENUE),SUM(TRANS_TABLE.REVENUE))

Since this is a measure object, we must include the aggregate function, SUM, for each argument. If this is a new object, you may need to set the object type and projection aggregate on the properties tab of the object’s dialog box.

Keep in mind that there may also be dimension objects that could access this aggregate table. If your aggregate table has dimension columns, related dimension objects will also need to be redefined.

Step 4: Define Aggregate Navigation

Aggregate Navigation is a tool that allows us to decide which objects in a universe will be incompatible with the aggregate table. For example, if your aggregate table has data aggregated by Branch and Month, then you cannot use the Week Number object with this table. Therefore, if the Week Number object is used in a query, you cannot use the aggregate table for that query. So, we need to tell the universe that the Week Number object is incompatible with this table.

To set the Aggregate Navigation, in Designer, go to Tools – Aggregate Navigation. In Information Design Tool, go to Actions – Set Aggregate Navigation (Make sure you are in the Business Layer). This opens the Aggregate Navigation window. On the left, select the aggregate table. Then, on the right, put a check mark on each object that is incompatible with the aggregate table. DO NOT select objects that are compatible with the aggregate table.

NOTE: If you have set your contexts and cardinalities correctly, you can click the “Detect Incompatibility” button at the bottom. My experience is that this is about 95% accurate, so you will need to select the aggregate table on the left again, and go through and verify that the correct objects are checked.

Step 5: Test

That’s right. You have to test it to make sure you got it right. Try creating some queries, at various levels of aggregation, to make sure you get the correct SQL. Do not skip this step, even if you think you never make mistakes. 😉 And, after you test it, test it some more. Make sure you’re getting the same results at various levels of aggregation. If you get different totals from the aggregate table and the transaction table, there might be a problem with the ETL process that populates the aggregate table. So make sure you test that, as well. If all goes well, you are ready for user acceptance testing (UAT).

Other uses for Aggregate Awareness

Although this functionality was originally developed to simplify the use of aggregate tables in a universe, it can also be used for other purposes within the universe. Let’s explore a few of these.

Building a universe on a Data Marsh

What is a Data Marsh? I’m glad you asked. I coined the term years ago, when I had to build a universe on this type of data source. The client had been running SQL queries against various systems within the company and loading the results into individual tables in a database. Although the data was often related between the various systems, no cleansing of the data had taken place, and no dimensional modeling had been done. There was simply no way to reliably join the various tables.

So, using Aggregate Awareness, I created a single set of dimension objects in the universe, using the Aggregate_Aware() function to select the different tables from which the data was stored. For example, several of these tables had a column for Sales Rep. So, I created a Sales Rep object, using aggregate awareness to select the needed columns from the various tables.

Next, I used Aggregate Navigation to make sure that the correct table was used for the Sales Rep value, depending on which measure was chosen in the query. Measure A, for example was in Table A. So, Measure A was marked as incompatible with Table B and Table C. Therefore, if Measure A was used in a query, the Sales Rep object would use the Sales Rep column from Table A.

Of course, I would prefer to never build a universe on a Data Marsh. But sometimes, you have to work with whatever you are given. A little creativity can go a long way.

Resolving a Fan Trap with Aggregate Awareness

In reality, Fan Traps are fairly rare in data models. But, when it does happen, there are multiple ways to resolve the Fan Trap. What is a Fan Trap? It happens when you get a measure from one table, and a dimension from another table in a one-to-many relationship. The measure, in essence, gets “fanned out”, or multiplied by the number of dimension values retrieved.

Fan Trap

In the example, above, the Sales Revenue will be multiplied by the number of rows retrieved from the INVOICE_DETAIL table. This is what happens in a Fan Trap. The classic way to resolve this is to create an alias of INVOICE_HEADER table, and join the alias to the CUSTOMER table, putting that join in a separate context. Then, get the Sales Revenue from the alias, instead of the original table. Quite honestly, this is the method I usually use to solve a fan trap.

However, a Fan Trap can also be resolved using Aggregate Awareness. As in the classic resolution, created an Alias of the INVOICE_HEADER table, and join it to the CUSTOMER table.

Fan Trap ResolvedNext, define the Sales Revenue as follows:

@Aggregate_Aware(SUM(INVOICE_HEADER.REVENUE),SUM(ALIAS_INVOICE_HEADER))

Next, use Aggregate Navigation to make any objects that come from the INVOICE_DETAIL table, incompatible with the INVOICE_HEADER table. So, Sales Revenue will come from the INVOICE_HEADER table, unless an object is chosen from the INVOICE_DETAIL table.

Conclusion

Is this an exhaustive list of uses for the Aggregate_Aware() function? Probably not. But that wasn’t my intention in writing this article. My intention was to document the steps in setting it up, as well as provide the most common uses. If you have other creative uses for this function, feel free to share it in the comments. Thanks for reading.

Advertisements

10 Responses to Solving Aggregate Aggravation

  1. Hi Michael,
    Great article.

    When creating Measures for Aggregate Awareness, I typically have the following:

    SUM(Aggregate_Aware(AGG_TABLE.REVENUE,TRANS_TABLE.REVENUE))

    instead of

    Aggregate_Aware(SUM(AGG_TABLE.REVENUE),SUM(TRANS_TABLE.REVENUE))

    It renders the same SQL, and saves on typing. Are you aware of any issues of having SUM outside of Aggregate_Aware, rather than inside?

    Greg

  2. Krysztof von Murphy says:

    Great article, I could not write it better.

    I use AA for aggregates of course (less and less necessary nowadays for my customers with small databases), but I avoid it like plague for other uses. I prefer to use contexts, aliases, views, whatever. I understand how AA works, but my universes are often maintained by less knowledgeable people, and they WILL forget the incompatibilities.

    I have tried to put AA on dimension objects (year/month in a fact table, year in aggregate, not time tables), it was okay for single SQL queries, but Webi was taking the wrong tables when mixing different measures coming from both tables. I should reproduce it and check the comptabilities again. Do you confirm that it should work? In an extreme case, that means that we could get rid of dimension tables…

    I agree with Greg, I prefer SUM (Aggregate… (… , …)), and I often use an intermediary object continaing only the Aggregate. One reason if that I have many COUNT(DISTINCT …) or AVG () for my customers, and that changing the formula resets the projection function! And I hate to repeat even a small formula like SUM().

    • Excellent points, Krysztof. In a correct data model, you really wouldn’t have dimension values in an aggregate table. Rather, you should have foreign keys to link to dimension tables. That way, there’s never a need to use AA on dimension objects.

      • kvonmurphy says:

        I’ve just tested “aggregated dimensions” on BI 4.1 SP2: just a simple example with a yearly and a daily fact tables, an aggregated measureA on both tables, dimension tables Dim_Day and Dim_Year, and a [Year] object which is @Aggregate_Aware (Dim_year.year , Dim_day.year).

        That worked OK as long as I was using only my aggregated measureA and my aggregated Year, and other “standard” dimensions. When adding other measures from the fact tables, Webi was unable to create two SQL, each one using the appropriate source for Year. I understand the WIS 00014 limitation on a query with AggegreateAwareMeasureA, Yearly measureB, Daily MeasureC and Year, but it should have worked with at least two of three measures. Sometimes I had a SUM(measureA) alone in a SQL, sometimes the Year alone. I saw no logic nor consistency so I think it is a bug. Of course, I’ve tried mor or less everything with incompatibilities.

        I don’t think that it is a YABI4B (Yet Another BI4 Bug), as I remember similar problems with XI 3.1.

        To summarize: in my experience, aggregated dimensions work only if you can live with queries with only one measure. Useful for a “Data Marsh”, but more a work around than a real feature.

        I’m rather worried because I’ve seen a customer whose fact tables are copied and pasted from SAP R/3 tables (that makes simple ETL flows) ; they do not have a time table dimension so they create their time axis with Aggregated dimensions. In their case, that works but I have not stress-tested their universes.

  3. Judy Mulders says:

    Michael, great explanation. I would add that proper object descriptions are helpful when using @AA so that your users don’t try incompatible objects – of course you have to train them to read the descriptions. 🙂 Judy

    • Thanks, Judy. I think you know me well enough to know that descriptions are one of my pet peeves. 🙂 I agree completely with you. Every object should have a user friendly description. After all, we are building this universe for our customers, so we better do everything we can to make it easy to use for them.

  4. Dave Rathbun says:

    One more unconventional use of the function is to solve the oft-asked question on BOB: “How can I prevent users from using object X and Y together?” I’ve been meaning to write this up on my own blog for years.

  5. Pingback: Enhancing Universes With Super Charged Aggregate Aware - EV Technologies, Inc.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: