Performing a Universe Review

Over the past decade, I have built many universes, taught the universe design class, and performed many universe reviews for various clients. Through it all, I’ve developed a standard methodology that I use for reviewing universes. In fact, I highly recommend developing a governance process for universes (as well as documents) that includes this review before universes are put into production. Remember, the users see the universe as their primary entry point into the systems that contain the data they need. If the users have a bad experience, they will have a bad impression of BusinessObjects, as well as any system attached to BusinessObjects.

When I perform a review of a universe, I document my findings into three categories:

Performance: These are issues that will affect the performance of queries built on this universe. We want to implement techniques that will maximize performance.

User Friendliness: We want to give the users a positive, professional experience. We don’t want them seeing the universe as poorly developed, or unprofessional.

Designer Friendliness: Be nice to your fellow designers. Eventually, someone else will need to work on the universes that you’ve built. So make sure that they can see quickly what you did, and why you did it.

Let’s start with performance related items. Then we’ll move on to user friendly topics, and finally, designer friendly topics.

Performance

Performance tuning varies depending on the database you’re using, so get familiar with it. Your DBA can help with this.

The first thing I go through is the settings in the universe Parameters box, starting with the universe connection. In the connection settings, there are a few items that can impact performance:

  • Connection pool mode: This should always be set to Disconnect after each transaction. This prevents users from tying up sessions at the database. If the database is maxed out on sessions, new connections will have to wait for a session to free up.
  • Array fetch size: This determines the number of rows returned in each fetch (or packet) from the database. The higher you set this, the fewer packets will need to be sent across the network. Depending on the database, I usually find somewhere between 300 and 1000 works well. Your mileage may vary.

Next, I look at the universe limits. These are not strictly related to performance, but can be an indicator of how the universe will perform. The universe should have a reasonable row limit and time limit. If either is set too high, I ask why. For example, if a universe has a row limit of 1 million rows, something is seriously wrong somewhere.

The next thing I look at are the Parameters on the Parameter tab of the Parameters box (there’s a tongue twister). Some of these settings can affect performance. Here are some:

  • ANSI92: This will force the universe to generate ANSI-92 standard SQL in the queries, with joins expressed in the FROM clause (JOIN … ON) rather than the WHERE clause, which many database optimizers handle more efficiently.
  • FILTER_IN_FROM: This will put the joins and filters in the FROM clause of the SQL. This is particularly helpful if you have partitioned tables in the database, or are using a Massively Parallel Processing (MPP) database, such as Teradata. Note: FILTER_IN_FROM cannot be used unless ANSI92 is set to Yes.
  • JOIN_BY_SQL: This can be useful with universes that have contexts. When set to Yes, it will merge the “Joined” queries across multiple contexts into a single SQL query with multiple Selects, and a FULL OUTER JOIN between each. This can reduce the volume of data returned to the report, as it causes the database to merge the results, rather than having the report merge the results.
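
To make the first two settings concrete, here is a sketch of the kind of SQL they change. The table and column names are invented for illustration; your universe will generate its own:

```sql
-- ANSI92 = No (legacy syntax): joins are buried in the WHERE clause
SELECT c.customer_name, SUM(o.amount)
FROM orders o, customers c
WHERE o.customer_id = c.customer_id
  AND o.order_date >= DATE '2014-01-01'
GROUP BY c.customer_name;

-- ANSI92 = Yes: joins move to explicit JOIN ... ON syntax in the FROM clause
SELECT c.customer_name, SUM(o.amount)
FROM orders o
  INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= DATE '2014-01-01'
GROUP BY c.customer_name;

-- FILTER_IN_FROM = Yes: the filter condition moves into the FROM clause too,
-- which can let a partitioned or MPP database prune work earlier
SELECT c.customer_name, SUM(o.amount)
FROM orders o
  INNER JOIN customers c ON o.customer_id = c.customer_id
    AND o.order_date >= DATE '2014-01-01'
GROUP BY c.customer_name;
```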

Next, I start looking through the universe for transformations. A transformation is any SQL that transforms the results. For example, CASE statements, or Substring functions, perform transformations on the data. Transformations can exist in Derived Tables, Joins, Filters or Objects. You can’t avoid having transformations in universes. However, the more you have, the slower your queries will be. So I look for transformations that are costly, and could be pushed down to the ETL process. That way, the data is transformed in advance, rather than on the fly.
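
As a hypothetical example of a push-down candidate (the table and column names are invented):

```sql
-- In the universe: a CASE transformation evaluated on every query, every row
CASE
  WHEN orders.status_cd = 'O' THEN 'Open'
  WHEN orders.status_cd = 'C' THEN 'Closed'
  ELSE 'Unknown'
END

-- Pushed down to ETL: the warehouse stores a pre-computed status_desc column,
-- and the object's Select becomes a plain column reference
orders.status_desc
```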

I also look for usage of Index Awareness in the universe. This feature can reduce the number of tables being accessed, thereby reducing the number of joins in the SQL. It can also make better use of indexes in the database.
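
A sketch of the effect, with hypothetical tables and a made-up key value:

```sql
-- Without index awareness: filtering on a name forces a join to the lookup table
SELECT SUM(s.revenue)
FROM sales s
  INNER JOIN customers c ON s.customer_id = c.customer_id
WHERE c.customer_name = 'Acme Corp';

-- With index awareness: the key value is substituted for 'Acme Corp',
-- so the lookup table (and its join) drops out of the SQL entirely
SELECT SUM(s.revenue)
FROM sales s
WHERE s.customer_id = 1042;
```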

If there are any isolated tables, I check each one to see if any visible objects are built from them. Needless to say, such objects, when used in a query, could cause a Cartesian product.
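
For example, an object built on a table with no join path can generate something like this (made-up tables and row counts):

```sql
-- No join condition between the two tables: every customer row is paired
-- with every product row (a Cartesian product)
SELECT c.customer_name, p.product_name
FROM customers c, products p;
-- e.g. 10,000 customers x 5,000 products = 50,000,000 rows
```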

I look through every measure object, to see if it has an aggregate function. Every measure object should have an aggregate function in the Select. If you don’t want the data to aggregate, make it a dimension, not a measure. I have seen many universes with measure objects that only have a projection aggregate, not a Select aggregate. That can cause huge performance issues.
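
To illustrate the difference with a hypothetical table and column:

```sql
-- Select aggregate: the database groups and sums, so only one row per
-- dimension value crosses the network
SUM(sales.revenue)

-- Projection-only aggregate: the Select is just the bare column, so the
-- database ships every detail row and the report sums them locally
sales.revenue
```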

There may be other things I notice along the way that affect performance, but these are the most obvious ones.

User Friendliness

Whenever I’m working on a universe, the most important thing for me to remember is that I am doing this for the users. They are my customers, and it’s important for them to be happy. If they’re not happy with the universe, they’ll go elsewhere to get the information they need, so keeping them happy is key to maintaining the entire system. That means making the universes user friendly. When doing a universe review, these are some of the things I look for.

Let’s start with one of my basic rules of user friendliness: Nothing that the user sees should have underscores. Why? Because it tells the user that “this is going to be technical”. IT people love using underscores, and for very good reasons. But the rest of the world doesn’t. So keep it friendly. Don’t use underscores in the universe name, context names, class names, or object names. They should all be in proper case.

So, I first start with the name of the universe, and make sure it is user friendly. Then I look at the description. The universe description is our chance to tell the user what kind of information they can get when using this universe, and how up to date the data is. Again, it should be in user friendly terms.

In the universe Parameters, make sure that Cartesian Products is set to Prevent, not Warn. Most users won’t know what a Cartesian product is, so don’t let it happen.

If the universe has multiple fact tables, I take a look at the contexts. They should have a user friendly name and description. If the user is prompted for a context, the name and description should be friendly enough to help them make the right choice.

Next I look at the names of the classes and objects. Again, the name should be user friendly. Examples of object names that are NOT user friendly include the following:

  • CUSTOMER_ID
  • Customer Id
  • Customer Addr
  • Phone No
  • Start Dt

User friendly names would include the following:

  • Customer ID
  • Customer Address
  • Phone Number
  • Start Date

You get the idea. Make the names simple, but friendly. If more information is needed, include that in the object’s description.

Speaking of descriptions, every object should have a description. Don’t assume that the name is obvious, and a description isn’t needed. So I look at every object to see if it has a description, and whether the description is user friendly. If you really want to get fancy, you can even color code your object descriptions. Also, I check to see if the descriptions have misspelled words, or poor grammar. We want the users to see the universe as professionally built.

If I see any objects with a Where clause defined, I raise a red flag on that. It may be OK, but, as a rule of thumb, we want to avoid using the Where clause of an object, as it could cause conflicts between multiple objects.
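
Here’s a sketch of the kind of conflict I mean, with invented objects and columns:

```sql
-- Object "US Customers"       Where: customers.country = 'US'
-- Object "Canadian Customers" Where: customers.country = 'CA'
-- Put both objects in one query, and the generated SQL includes:
WHERE customers.country = 'US'
  AND customers.country = 'CA'
-- ...which can never be true, so the query silently returns zero rows.
```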

I also check to make sure there aren’t too many classes, and they aren’t more than 3 levels deep. We don’t want the users to have to go on a treasure hunt just to find the objects that they need. Rather than too many classes, we can use Separator Objects, as described in this previous post.

While I’m checking the objects, I also check the List of Values (LOV) for each object. Measure objects, of course, should not have a LOV. LOVs should also be avoided for date objects, as they can be very large, and are not needed. For all other objects, there are ways to make the LOVs user friendly, such as adding columns to them, or using database delegation for large lists.

I also look for objects with the same name. Every object should have a unique name. This avoids confusion with the users.

One thing that often is overlooked in universes is object formatting. Every numeric object, whether detail, dimension, or measure, should be formatted in the universe.

Finally, I look at the hierarchies in the universe. Default hierarchies should never be used, as most of them won’t make sense for drilling. So make sure that hierarchies are set to custom only.

Designer Friendliness

We all like to think that we will be the only ones to work on our universes, but the reality is that, at some point, someone else will need to work on a universe that you created. So, as part of reviewing a universe, I make sure that best practices are implemented to make the universe easier for the next person to work on.

Let’s start with something really simple. Every LOV should have a name that relates it to the object to which it belongs. So I look to see if the LOV has the default name (bad) or a meaningful name (good). Sometimes we need to track down the LOV file on a user’s PC, and meaningful names make that easier.

If the cardinalities are not set on the joins, it makes no difference to the users, or the database. But they should all be set anyway, for several reasons:

  • It makes the structure pane easier to “read”, as you can more easily identify fact tables.
  • It makes detection routines, such as loop detection, work correctly.

If I see any isolated tables, I expect to see some verbiage next to them explaining why they are there. Perhaps they have been aliased, and the aliases are all that is being used. In fact, it can also be helpful to add some text next to alias tables explaining what they are used for.

Documentation can be very helpful, not just for other designers, but even for the original developer. You can add notes in the structure pane. You can add information in the Comments field of the universe Parameters box. You can also add comments in the Select of your objects. But make sure you use comment tags, so the database knows to ignore them.
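
For example, on most databases a comment in an object’s Select can be written with /* */ tags (the column name and wording here are invented):

```sql
SELECT
  /* Net revenue in USD; returns are already excluded upstream in the ETL */
  SUM(sales.net_revenue_usd)
```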

Conclusion

Before a new universe goes into production, you can perform this universe review as part of the governance process. By doing this, you can make sure that the universe will make a good first impression on the users. And that’s very important.

Do you have a formal governance process for universe development? Does it include such a review? Feel free to comment, below.

25 Responses to Performing a Universe Review

  1. Dallas Marks says:

    Thanks for writing this- lots of juicy tidbits throughout.

    Hopefully tools like Lifecycle Manager will help developers stop and review before moving universes to production.

  2. Jansi D says:

    I liked the push down strategy. You do your job (planning for a good optimized universe & also leave a good base for other designers and report developers) and let them do their job (ETL guys and DBAs). 🙂

  3. Chris Pohl says:

    good stuff michael … I’ll add this: (under User Friendliness > LOV’s) I also check PRIMARY KEYS for a LOV … It doesn’t make any sense to associate a LOV on something that’s a primary key … not natively anyway, i usually advise to create a prompted filter if it’s a requirement.

    Keep em coming

  4. John Vincent says:

    Hi Michael,

    Thanks for these checkpoints. I’m currently designing my first relational universe and your post is really a big help.

    I think it’s also a good idea if you add a review of those universes created on top of BW. 🙂

    Cheers,
    John

  5. david lai says:

    Hi Michael,
    Great checklist reference for us to use!

    Thanks for the post!

  6. Some great reminders in there of best practices that everyone should consider. I wonder if anyone has any extra comments regarding the new IDT style Universes. We are currently reviewing and extending our best practices to include areas such as multi data sources, BEx via BICS etc.

  7. Arjun says:

    Michael, It is very informative. Thanks.

  8. Pingback: Web Intelligence Performance Tips « Michael’s BI Safari

  9. Raghu says:

    Thanks for sharing these important checkpoints….

    Raghu

  10. Great post Michael. Would like to add a few checks that we do for Universe Review:-

    1. We ensure that Univ objects which represent Flags, Status codes etc have all their possible values listed in the description, with the meaning of each value clearly explained:-
    Incident Status:
    O – Open
    W – Work InProgress
    H – Hold
    R – Resolved
    C – Closed

    2. We also ensure that all amount based measures have the mention of the currency in which the object returns the value. Likewise for Quantity and UOM.

    3. We make a specific mention in any object which has a DB index built on it, so that users can use it in filtering conditions to ensure performance whenever possible.

    4. We also specify the source of data for the object if the source is known not to change. Instead of specifying the table name, we specify the exact source system e.g. Work Order Data from SAP ERP etc.

    5. We also ensure that the designer has created what we call “Automation Objects”. These are nothing but variations of the System Date, such as :-

    Current Month,
    1st Day of Current Month,
    Previous Month,
    1st Day of Current Year,
    Last Day of Current Year

    Objects like these help in creating scheduled reports without having to change the scheduling parameters on Month or Year or Week Change.

    Regards,
    Kashif

  11. Gary says:

    Thank you for the tips. One question, what is the difference between projection aggregate and select aggregate?

    • The select aggregate is what goes into the Select of the object, such as SUM(table.column). The projection aggregate is selected on the Properties tab of the Properties box of an object. It determines how the object will aggregate when projected onto the report.

  12. Suresh.Inspire1 says:

    Great One Michael…

    Double like for this… “Remember, the users see the universe as their primary entry point into the systems that contain the data they need. If the users have a bad experience, they will have a bad impression of BusinessObjects, as well as any system attached to BusinessObjects”…

  13. Sherry West says:

    Thanks for the previous day formula; I have tried so many with so many failures. Do you have a solution for #MULTIVALUE in a crosstab? I have written variables for question/answer responses for patient follow-up, but in this one area they have selected multiple answers and I really need them to show in the same cell.

  14. Namrata says:

    Hi Micheal,

    I’ve been working on improving the performance of our Webi reports and have come across your very informative blog.
    Can you please let me know how to use selection instead of projection to sum up the universe measures. I’ve been trying to use the formula given below but am not able to figure out how to code this in the select section of the object definition. I keep getting errors while parsing saying that the supplied XML is not valid.

    Formula used: SUM(“Set”[, “Numeric Expression”] ) returns the sum of a numeric expression evaluated over a set

    Formula definition: SUM([Measures].[4N5EYRU1CB28PV6SI9ZOOUUYM]) where this code is the measure as given in the select statement.

    Also our universe is based on OLAP BW query. So we don’t have any exclusive tables in the universe.

    • Namrata,

      I don’t write XML, so I’m not sure how to do that. I believe you don’t need an aggregate function in the select for OLAP queries, since the data is already aggregated in the cube.

      • Namrata says:

        Thanks for the reply Michael. Just to clarify I’m not coding any XML in the universe but just trying to use the Sum function given under Numeric functions in the edit window of the select section of a Universe variable.

        Quoting from your blog ” I have seen many universes with measure objects that only have a projection aggregate, not a Select aggregate. That can cause huge performance issues.”

        So does the above statement only apply to OLTP or RDBMS (like Oracle) databases and not for OLAP (like SAP) as data is already aggregated in the cube?

      • Yes, that applies to relational data sources, not cube based universes.

  15. Hi Michael,
    I use BO since 1996, my start was with release 3!
    My DB is Oracle, and when the BO SQL created has an IN () inside the WHERE condition with more than 1000 items, I receive a SQL error from the DB.
    This is a problem for me because when I have a Web Intelligence report with 2 queries, and I need to filter one query using the result of the other query, the BO SQL uses the IN() and a Data Provider variable that points to the other query, but if that query extracts more than 1000 items, I get a SQL error.
    Do you think there is a way to set the universe parameters to generate the SQL in a different way, to bypass this problem? Like “…. IN ( select Data Provider Values from dual )… ”
    I don’t want to modify the BO SQL manually.

    Thanks
    Massimo Agosta
    Italy

    • Hi, Massimo;
      This is a limitation of your Oracle database. Talk to your DBA and find out if that can be modified at the database. If not, you can get around the limitation by writing a subquery instead of using the Query on Query feature. This works if both queries come from the same database. Subqueries have no limit.
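
      A sketch of the subquery approach, with made-up tables standing in for the two data providers:

      ```sql
      -- One query with a subquery replaces the two-query IN (<value list>);
      -- the database resolves the inner query itself, so no 1000-item list
      -- is ever built on the client side
      SELECT o.order_id, o.amount
      FROM orders o
      WHERE o.customer_id IN (
        SELECT c.customer_id
        FROM customers c
        WHERE c.region = 'EMEA'
      );
      ```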

  16. prakash says:

    Hi Michael,

    We are using BOXI R3. We have one report which is based on one derived table. Previously, when we generated the report for a specific month, it executed within 2 minutes. Currently, when we refresh the report for that specific month, the report is not generated even after 40 minutes. We have not made any changes at the derived table / universe level, so we still wonder what went wrong with the universe. When we copy the same derived table query and run it in the Oracle 11 DB for that specific month, the query executes within 19 seconds.
    Please advise what can be done at the universe level, so that the report can generate as before.

    • Hi, Prakash. Something had to have changed somewhere. Check the universe, report, and server. Find out what changed. Try various scenarios, like running the report from Webi Rich Client, both on the server, and on your machine.
