The Importance of Governance

I don’t know if this is a new phenomenon, or if I’m just more observant of it, but there seems to be a trend in BI today that is rather troubling: the lack of governance. What is governance, you ask? That’s a good question.

Before anything is published into a BI system, it should be subjected to a process that insures that it meets certain standards. In other words, someone needs to look at it and see if it is formatted correctly, has the right naming convention, includes the right information, or disclaimers, etc. This process is known as governance.

In the world of SAP BusinessObjects, governance can be applied to Connections, Universes, Documents, Dashboards, Workspaces, etc. Let’s take a look at the things that can be applied to the governance process.

  1. Naming Standards
    1. Should define a guideline for everything that can be built in BusinessObjects
    2. The goal is to create a consistent experience for users, and an easily maintainable and supportable system for developers and administrators
    3. Naming conventions can be applied to the following objects:
      1. Database connections
      2. Universe names
      3. Universe class names
      4. Universe object names
      5. InfoView folder names
      6. Document names
        1. SAP BusinessObjects Web Intelligence (Webi)
        2. SAP Crystal Reports
        3. Xcelsius (SAP BusinessObjects Dashboards)
        4. Other
  2. Security – three approaches:
    1. See everything except what is forbidden
    2. See nothing except what is needed
    3. Hybrid (usually the preferred approach)
  3. Reporting Guidelines
    1. Define how a report is made public
      1. How it is done now?
      2. What works well?
      3. What doesn’t work well?
      4. How should it be done?
    2. Establish formatting standards for public documents
      1. Creates a common look and feel to all public reports
      2. Makes it easier for users to find what they need on public reports
  4. Best Practices
    1. Job scheduling
      1. Limit the number of users who can schedule, to one or two per group
      2. Take advantage of Events
      3. Don’t schedule everything for the same time
      4. Apply a global limit to instances
      5. Avoid scheduling the same document multiple times
    2. Universe techniques
      1. Keep universes small and focused (no more than 700 objects)
      2. Never use underscores in anything that a user will see
      3. Give the universe a user friendly name and description
      4. Set the universe connection to “Disconnect after each transaction”
      5. Set the array fetch size to an appropriate level
      6. Set reasonable universe limits
      7. Set parameters appropriate to your database
      8. Every object should have a user friendly description
      9. Never have more than 3 levels of classes/subclasses
      10. Use proper case for object names
      11. Use full words for object names
        1. Use “Date” rather than “Dt”
        2. Use “Number” rather than “No” or “Nbr”
      12. Use Index Awareness to improve performance
      13. Every measure object should have an aggregate function in the select
      14. Format all numeric objects in the universe
      15. LOVs should have meaningful names
      16. No two objects should have the same name
      17. Use only custom hierarchies
      18. Set cardinalities on all joins

Now that we know what’s possible in a governance process, let’s take a deeper look at some of these.

Naming Convention Suggestions:

Connections: The connection name should be devised so that it includes information about the database that it is connecting to. For example:

Syntax: Environment_Platform_Database

Example: DEV_TD_CSR (Development Environment, Teradata, CSR database)

Universes: Universe long names should match the application that the data belongs to, as well as the focus of the universe. For example, if the data is from the CSR application, and the focus of the universe is sales, the universe could be named “CSR Sales”. The idea is to use a name that makes sense to the users.

Universe Class and Object Names: These should be consistent between universes. For example, if the Month object from Universe A, and the Month Name object from Universe B, return the same values, they should both be called the same thing. This creates a consistent user experience, and avoids confusion.

Documents: It can be very helpful to begin the name of a document with a code that identifies the universe or group that it belongs to, as well as sequential number. For example:

Syntax: Universe-SequentialNumber Document Name

Example: CS0001 Quarterly Sales By Title

In this example, CS represents Customer Service, and 0001 is simply a sequential number. This type of naming allows the documents to be sorted easily by universe, and also helps the user to specify which document they are referring to when they need to discuss documents.

Security

As a starting point, we should review the current security, and see what works well, and what doesn’t. If the current security meets the needs of the organization, then we can keep the current security, and tweak as needed. The three common approaches to security are as follows:

See everything except what is forbidden. In this approach, everyone sees everything by default. Then, content is hidden that is not allowed to be seen by specific users. This approach can be very hard to manage.

See nothing except what is needed. In this approach, the default is to see nothing, and only grant access to content that is required. This is easier to maintain, but can create a bureaucracy that will only frustrate users looking for content that they can’t see.

Hybrid. In this approach, most, but not all, content is hidden by default. Some content is deemed appropriate for all users, and, therefore, is granted to all users. Other content is only granted as needed. This is the easiest to maintain, and is also the best for the users. Users don’t want to see everything, as the volume of content can be overwhelming. By limiting them to what they need, as well as what they are allowed to see, their view becomes more manageable.

Reporting Guidelines

There should be a standard format applied to all public documents. This common “look and feel” makes it easier for users, as they always know what they’re looking at, and always know where to find what they need on the report. They also can tell, very easily, when they are looking at a document that has not received official approval. Standards may include, but are not limited to, the following:

  • Cover page that includes standard information about the document:
    • Document name
    • Document author
    • Document description
    • Universe(s) in use in this document
    • Last Refresh Date/Time
    • Query filters, include prompts
    • Disclaimers
    • Company logo
      • Which logo?
      • Size
      • Position
    • Standard report colors
    • Document/Report naming conventions

All documents will need to pass the governance process before being placed into a public folder. This process should be streamlined, so as not to create a bottleneck.

There should be one or two people, in each group, who are trained to apply the governance process to a document, and who have permission to add objects to a public folder. These people will be responsible for making sure that all documents meet enterprise standards before being published.

Conclusion

Everything we do should be designed to make life better for the users. We need to treat them like customers, and we want their business. So make sure that your governance process is designed to do just that. Don’t let it become a hurdle that the users need to jump over. Streamline the process as much as possible.

Setting up a governance process is a difficult process. It needs to involve key users who will give input of what the standards should be. The problem is that everyone will feel strongly that their way is best. So your job is to try to find a solution that meets everyone’s needs as much as possible. Give it a shot, and let me know how it goes.

Migration from Deski to Web Intelligence

As you probably know by now, with the release of SAP BusinessObjects Business Intelligence 4.0 (BI4), Desktop Intelligence (Deski) is officially retired. Of course, we’ve known that this day would come for nearly ten years, and some people started early in migrating their Deski documents to Web Intelligence (Webi). But many also delayed until the last minute, waiting for Webi to mature a little more. But the time for waiting is over, and it’s now time to put Deski behind us. So, to that end, I’d like to share some experience in converting those Deski documents to Webi.

The Right Tool for the Right Job

Most of your Deski documents can be converted to Webi with the help of the Report Conversion Tool. If you haven’t used this tool yet, try it out with one of your Deski Documents, and see how it does. I’ve used it many times, with varying degrees of success. Sometimes it converts complex documents, with no issues, much to my surprise. Sometimes it fails to convert very simple documents, much to my dismay. :-(

So, my first piece of advice is, if it fails to convert a simple document, don’t waste time trying figure out why. Simply rebuild the document, from scratch, in Webi. I hate to admit it, but I learned this lesson the hard way.

I hate failure, and am usually not willing to give up easily. One time, I ran a Deski document through the Report Conversion Tool (RCT), and it failed. I couldn’t figure out why. The document had one query, one table, and one report, with only one simple variable. I spent two hours trying to figure out why it wouldn’t convert successfully. Then it dawned on me. I could rebuild it in Webi in five minutes. It wasn’t worth the time.

So, for simple reports, try it once with the RCT. If it fails, simply rebuild the document.

Did it Work?

The RCT has three potential outcomes for each document:

  • Converted Successfully
  • Partially Converted
  • Not Converted

If the conversion is successful, compare the Deski version with the Webi version. Make sure the formatting is correct. Many times, it will need some touching up. And make sure the data is the same. Not all functions work exactly the same, so you may need to modify the formulas in some variables to get the correct results.

If the document is partially converted, compare the two, and see what didn’t convert. For example, if your Deski document has any macros in it, they won’t be converted, as Webi doesn’t support that feature. You may need to get creative to find another way to get the same results.

If the document is not converted at all, the RCT will tell you why. Then you get to decide if it’s easier to try to solve the problem, or simply rebuild the document.

Freehand SQL

I’ve struggled a long time with this. Best practices tell us that we should never be using Freehand SQL to build reports. It defeats the purpose of using BusinessObjects. It is very difficult to maintain. And it often violates corporate security policy. But that hasn’t stopped people from building these reports. Sometimes it’s just easier to write the SQL by hand, rather than modify the universe to accommodate the report’s requirements. I get that.

So what do we do with these documents? Well, the good news is that you can use the RCT to convert them to Webi documents. However, since Webi doesn’t support Freehand SQL, how does it work? The RCT will convert the document by generating a universe with a derived table in it. The Freehand SQL from the Deski document will become the derived table in the universe. If the document has multiple Freehand SQL data providers, each will become an individual derived table in the same universe.

In fact, one such universe will be generated for all Deski documents that use the same database connection, and have Freehand SQL data providers.

But don’t leave it there. Once converted, start migrating these new Webi documents to real production universes. Simply change the universe for each query in the Webi document from the auto-generated universe, to your production universes. Finally, after you have all the new Webi documents attached to production universes, you can delete the auto-generated universe.

By the way, this also applies to Webi queries that were built with universes, but then the SQL was edited.

Marketing and Political Unrest

One of the most challenging tasks in converting from Deski to Webi is the fear of change coming from the user community. Let’s face it, once you get good at a tool, you don’t want to be forced to change to another tool. So, we need to make this as easy as possible for the users. This starts with two very important jobs:

  • Management Support: You have to have management behind this decision. If they aren’t, you can find yourself fighting political battles, and not always winning.
  • Marketing: You need a campaign to “sell” Webi to the user community. You want them on your side.

Both of these are interconnected. If the user community screams loud enough, management will not support the project. So, get a few key, influential users, who are respected throughout the organization, involved in the project. They will be your marketing evangelists among the rest of the community. They can help get the rest of users excited about this change.

Management will support the project based on the cost savings realized from supporting one tool, rather than two tools (One of which is reaching “End of Life”).

Are You On Board?

I’ve seen many posts on BOB from IT folks complaining of this change. Some have threatened to leave BusinessObjects and move to another tool. And if that decision makes good business sense, then do it. But don’t change tool sets simply because BusinessObjects, and SAP, have decided to drop Deski.

I grew up on Deski. I have spent hundreds, if not thousands, of hours developing reports in Deski. I become extremely competent with the tool, and even taught the classes for many years. But over the last five years, I’ve used it very little. I’ve focused my attention on Webi, and I’m glad I did. I like Webi very much. And I don’t miss all the quirks and bugs (Remember the Unhandled Exception Error?) in Deski. Yes, I know, Webi isn’t perfect. But it’s a great tool. So, make sure that you’re learning it, and excited about it, before you begin your conversion project.

Conclusion

The bottom line is that Deski is now retired. Honestly, I’m happy about that. As much as I loved that tool, it’s much easier to focus on one less tool in the tool belt. So, I hope this article has given you some useful tips converting those Desktop Intelligence documents to Web Intelligence.

What is your experience? Do you have any additional tips you could share?

Performing a Universe Review

Over the past decade, I have built many universes, taught the universe design class, and performed many universe reviews for various clients. Through it all, I’ve developed a standard methodology that I use for reviewing universes. In fact, I highly recommend developing a governance process for universes (as well as documents), that includes this review before universes are put into production. Remember, the users see the universe as their primary entry point into the systems that contain the data they need. If the users have a bad experience, they will have a bad impression of BusinessObjects, as well as any system attached to BusinessObjects.

When I perform a review of a universe, I document my findings into three categories:

Performance: These are issues that will affect the performance of queries built on this universe. We want to implement techniques that will maximize performance.

User Friendliness: We want to give the users a positive, professional experience. We don’t want them seeing the universe as poorly developed, or unprofessional.

Designer Friendliness: Be nice to your fellow designers. Eventually, someone else will need to work on the universes that you’ve built. So make sure that they can see quickly what you did, and why you did it.

Let’s start with performance related items. Then we’ll move on to user friendly topics, and finally, designer friendly topics.

Performance

This can vary depending on the database you’re using. Get familiar with the database you’re using. Your DBA can help with this.

So the first thing I go through are the settings in the universe parameters box, starting with the universe connection. In the connection settings there are a few items that can impact performance:

  • Connection pool mode: This should always be set to Disconnect after each transaction. This prevents users from tying up sessions at the database. If the database is maxed out on sessions, then new sessions will have to wait for a session.
  • Array fetch size: This determines the number of rows return in each fetch (or packet) from the database. The higher you set this, the fewer packets will need to be sent across the network. Depending on the database, I usually find somewhere between 300 and 1000 works well. Your mileage may vary.

Next, I look at the universe limits. These are not strictly related to performance, but can be an indicator of how the universe will perform. The universe should have a reasonable row limit and time limit. If either are set too high, I ask why. For example, if a universe has a row limit of 1 million rows, something is seriously wrong somewhere.

The next thing I look at are the Parameters on the Parameter tab of the Parameters box (there’s a tongue twister). Some of these settings can affect performance. Here are some:

  • ANSI92: This will force the universe to generate ANSI standard SQL in the queries, often using the best performance methods.
  • FILTER_IN_FROM: This will put the joins and filters in the FROM clause of the SQL. This is particularly helpful if you have partitioned tables in the database, or are using a Massively Parallel Processing (MPP) database, such as Teradata. Note: FILTER_IN_FROM cannot be used unless ANSI92 is set to Yes.
  • JOIN_BY_SQL: This can be useful with universes that have contexts. When set to Yes, it will merge the “Joined” queries across multiple contexts into a single SQL query with multiple Selects, and a FULL OUTER JOIN between each. This can reduce the volume of data returned to the report, as it causes the database to merge the results, rather than having the report merge the results.

Next, I start looking through the universe for transformations. A transformation is any SQL that transforms the results. For example, CASE statements, or Substring functions, perform transformations on the data. Transformations can exist in Derived Tables, Joins, Filters or Objects. You can’t avoid having transformations in universes. However, the more you have, the slower your queries will be. So I look for transformations that are costly, and could be pushed down to the ETL process. That way, the data is transformed in advance, rather than on the fly.

I also look for usage of Index Awareness in the universe. This feature can reduce the number of tables being accessed, thereby reducing the number of joins in the SQL. It can also make better use of indexes in the database.

If there are any isolated tables, I check each one to see if any visible objects are built from them. Needless to say, such objects, when used in a query, could cause a Cartesian product.

I look through every measure object, to see if it has an aggregate function. Every measure object should have an aggregate function in the Select. If you don’t want the data to aggregate, make it a dimension, not a measure. I have seen many universes with measure objects that only have a projection aggregate, not a Select aggregate. That can cause huge performance issues.

There may be other things I notice along the way that affect performance, but these are the most obvious ones.

User Friendliness

Whenever I’m working on a universe, the most important thing for me to remember is that I am doing this for the users. They are my customers, and it’s important for them to be happy. If they’re not happy with the universe, they’ll go elsewhere to get the information they need. So, keeping them happy is key to maintaining the entire system. So, we want the universes to be user friendly. So, when doing a universe review, these are some of the things I look for.

Let’s start with one of my basic rules of user friendliness: Nothing that the user sees should have underscores. Why? Because it tell the user that “this is going to be technical”. IT people love using underscores, and for very good reasons. But the rest of the world doesn’t. So keep it friendly. Don’t use underscores in the universe name, context names, class names, or object names. They should all be in proper case.

So, I first start with the name of the universe, and make sure it is user friendly. Then I look at the description. The universe description is our chance to tell the user what kind of information they can get when using this universe, and how up to date the data is. Again, it should be in user friendly terms.

In the universe Parameters, make sure that Cartesian Products is set to Prevent, not Warn. Most users won’t know what a Cartesian product is, so don’t let it happen.

If the universe has multiple fact tables, I take a look at the contexts. They should have a user friendly name and description. If the user is prompted for a context, the name and description should be friendly enough to help them make the right choice.

Next I look at the names of the classes and objects. Again, the name should be user friendly. Example of object names that are NOT user friendly would include the following:

  • CUSTOMER_ID
  • Customer Id
  • Customer Addr
  • Phone No
  • Start Dt

User friendly names would include the following:

  • Customer ID
  • Customer Address
  • Phone Number
  • Start Date

You get the idea. Make the names simple, but friendly. If more information is needed, include that in the object’s description.

Speaking of descriptions, every object should have a description. Don’t assume that the name is obvious, and a description isn’t needed. So I look at every object to see if it has a description, and the description is user friendly. If you really want to get fancy, you can even color code your object descriptions. Also, I check to see if the descriptions have misspelled words, or poor grammar. We want the users to see the universe as professionally built.

If I see any objects with a Where clause defined, I raise a red flag on that. It may be OK, but, as a rule of thumb, we want to avoid using the Where clause of an object, as it could cause conflicts between multiple objects.

I also check to make sure there aren’t too many classes, and they aren’t more than 3 levels deep. We don’t want the users to have to go on a treasure hunt just to find the objects that they need. Rather than too many classes, we can use Separator Objects, as described in this previous post.

While I’m checking the objects, I also check the List of Values (LOV) for each object. Measure objects, of course, should not have a LOV. LOVs should also be avoided for date objects, as they can be very large, and are not needed. For all other objects, there are ways to make the LOVs user friendly, such as adding columns to them, or using database delegation for large lists.

I also look for objects with the same name. Every object should have a unique name. This avoids confusion with the users.

One thing that often is overlooked in universes is object formatting. Every numeric object, whether detail, dimension, or measure, should be formatted in the universe.

Finally, I look at the hierarchies in the universe. Default hierarchies should never be used, as most of them won’t make sense for drilling. So make sure that hierarchies are set to custom only.

Designer Friendliness

We all like to think that we will be the only one to work on our universes, but the reality is that, at some point in time, someone else will need to work on a universe that I created. So, as part of reviewing a universe, I make sure that best practices are implemented to make this universe easier for the next person to work on.

Let’s start with something real simple. Every LOV should have a name that relates it to the object to which it belongs. So I look to see if the LOV has the default name (bad), or a meaningful name (good). Sometimes we need to track down the LOV file on a user’s PC. So meaningful names makes it easier.

If the cardinalities are not set on the joins, it makes no difference to the users, or the database. But they should all be set anyway, for several reasons:

  • It makes the structure pane easier to “read”, as you can more easily identify fact tables.
  • It makes detection routines, such as loop detection, work correctly.

If I see any isolated tables, I expect to see some verbiage next to them explaining why they are there. Perhaps they have been aliased, and the aliases are all that is being used. In fact, it can also be helpful to add some text next to alias tables explaining what they are used for.

Documentation can be very helpful, not just for other designers, but even for the original developer. You can add notes in the structure pane. You can add information in the Comments field of the universe Parameters box. You can also add comments in the Select of your objects. But make sure you use comment tags, so the database knows to ignore them.

Conclusion

Before a new universe goes into production, you can perform this universe review as part of the governance process. By doing this, you can make sure that the universe will make a good first impression on the users. And that’s very important.

Do you have a formal governance process for universe development? Does it include such a review? Feel free to comment, below.

Web Intelligence Performance Tips

The recommendations in this document are intended to help improve the general performance of BusinessObjects Web Intelligence documents. To that end, we’ll begin with an overview of Web Intelligence work flows that affect document performance.

Refreshing a Document

When a Web Intelligence document is refreshed, either manually or by the Job Server, the queries in that document are run sequentially. The first query to run is the query on the first tab in the Query Panel. Then the second query runs, and so on. The tabs can be reordered, if needed, to reorder the run sequence.

Once the query results are retrieved and stored in the document, Web Intelligence begins the process of combining results, where necessary, and calculating any variables or formulas in the document. Finally, the reports are rendered with the results.

Documents with many variables and/or many rows of data, can take a substantial amount of time to perform the needed calculations. Therefore, it is recommended to push as many calculations as possible to the universe. This allows the data to be calculated by the database, rather than the report server.

Queries that span multiple contexts will generate multiple SQL statements. How these statements are run, and combined, will depend on the JOIN_BY_SQL parameter in the universe. By default, Web Intelligence will generate a separate SQL statement for each context, as well as each measure from a different table. It will then run those SQL statement in parallel, and merge the results after the data has been retrieved. If the JOIN_BY_SQL parameter is set to Yes, then Web Intelligence will generate a single SQL query, with multiple Selects, and a FULL OUTER JOIN between each Select. The results will then be combined by the database, and the final results will be returned to the document. This can result in a much smaller and more efficient dataset being returned to the document.

However, make sure that queries do not contain objects that are not required for the reports. This is especially true of measure objects, since these typically come from one context only. Removing one unneeded measure can remove one SQL statement from the database.

Rendering a Document

Once the query, or queries, complete, the document will merge the results of multiple queries, then perform any and all needed calculations before rendering the results. When a user goes from one report tab to another, the new tab will run it’s calculations again, and then, once complete, display that tab for the user.

Reports that contain many calculations, in the form of formulas or variables, may render very slowly, depending on the volume of data contained in the report. For example, a report that contains 10,000 rows of data, and 10 columns of formulas or variables, will need to perform 100,000 calculations before rendering the results.

Therefore, significant performance improvements can be gained by reducing the number of rows contained on the report, as well as reducing the number of calculations needed. As mentioned in the previous section, it is much better to do calculations in the universe, than in the document. If the calculations are done in the universe, then the database performs the calculations once, and delivers them to the document. However, if the calculations are done in the report, they will happen when the query is completed, as well as when a user clicks on a different tab. Of course, documents will always have calculations in them, but we want to reduce them as much as possible, especially when there is much data in the report.

Data Reduction

As mentioned, one of the most common reasons for poor report performance is too much data in the report. Often, the reason for too much data is the need for multiple levels of data for drilling. Each level of a hierarchy that is added to a document causes the result set to grow exponentially, along with the document size. However, when data sets are large, alternate techniques need to be employed to reduce the amount of data in the document.

Using OpenDocument: One such technique involves creating a high level document, with only the data needed to display the top level before drilling. Then, using the OpenDocument function, we can create a URL that, when clicked on, will drill into a second document, passing values into prompts in that document. For example, if the user clicks on a Region to drill into that region, the Region name, or ID, is passed to the second document, which is then refreshed for that Region only. This technique keeps the first document small and agile, and keeps the second document small as well. Note that OpenDocument can be used in conjunction with standard drilling. A document can be set up to drill one or two levels within the document, then further drilling can be enabled with the OpenDocument function.

Using Query Drill: Another technique, which is similar to the OpenDocument technique, is Query Drill. The difference is that Query Drill operates entirely within one document. The document is built with only high level data. Then, when a user clicks on a value to drill into it, the query is rerun, with an additional query filter added, which will filter the results to those results needed for the drill. For example, if a user clicks on the West Region, the query is rerun, with a new query filter, where Region = ‘West’. This technique also keeps the document size much smaller.

Conclusion

There are many factors involved in improving the performance of a web Intelligence document. The techniques presented here are only a few. There may be things that can be improved in the universe, or at the database, to improve performance. It’s important to optimize your reporting in any and all places available.

Where is Your Data?

Over the years, I have noticed a rather disturbing trend in businesses, both large and small. While sales and operational data is often stored in databases, forecast data is often maintained in spreadsheets. Good forecasting is critical to an organizations ability to deliver goods or services. The failure to do proper forecasting can be disastrous for any size organization. If we produce too much product, we waste money. If we produce too little product, we run the risk of disappointing, and losing, our customers. We have to forecast demand as accurately as possible, for every product or service that we offer.

Part of the process of refining the forecasting process involves a comparison of forecasted results to actual results. If the forecast was significantly wrong, we need to find out why, and use that information to improve the process next time.

Nearly every organization that I have worked with has an interest in reporting actual results compared to forecasted results. That’s the good news. The bad news is that, with forecasted results stored in spreadsheets, comparing the two becomes challenging. Why is this so? Isn’t it pretty simple to build reports using data from databases, and data from spreadsheets, at the same time? Yes, it is. But let’s look at reality.

Let’s consider an organization that has ten different product lines, and a manager for each line. Each of the ten managers does their own forecasting, in their own spreadsheets, with their own formatting. The reporting team has to pull data from ten different spreadsheets, into a single report, and compare the results with actuals, retrieved from a database. However, each of the ten spreadsheets are formatted differently. The data in them is not always in a table format, which is how we need it.

So, the reporting team is tasked with having to reformat the spreadsheets, so the data can be brought into the reports. And each month, they go through the same process, reformatting the spreadsheets. Even if they can convince each manager to use the same format, which would be a miracle, what’s to keep one manager from changing his/her mind down the road, and changing the format again?

I can tell you from experience, reporting from spreadsheets is a challenging task, at best.

Considering the importance of forecasting, I can’t help wonder why forecasting is done with spreadsheets. Why isn’t it done with forecasting applications that store the data in a database? In the long run, the cost of the forecasting application would be far outweighed by the cost savings in eliminating the maintenance of spreadsheets.

Yes, there are plenty of organizations that store forecasted data in databases. These are the ones who have already seen the value of doing it right. It’s a pleasure to build reports for them. But they are few and far between.

Where does your organization store its forecast data? Tell me what you think.

Sessions at SAP BusinessObjects User Conference

Will you be attending the SAP BusinessObjects User Conference in Orlando this year? If so, I’d love to meet you. I’ll be speaking at a couple of sessions. Please come and say hi to me. :-)

Building a Data Warehouse: Building a Universe

In this session, we will look at the fundamentals of building a universe on a star schema data warehouse. Participants will watch as a universe is built, incorporating best practices in universe design. Topics will include basic concepts, SQL tricks, contexts, common mistakes, and documentation. Look at the best ways to make the universe user friendly and designer friendly.

Key Learning Points

  • Hear why good preparation is 80% of the work.
  • See why there’s more to building a universe than entering SQL.
  • Learn how to make sure that the users will understand this universe.

Note that this session is part of the Building a Data Warehouse series.

SAP BusinessObjects Security Made Easy

If you have worked with SAP BusinessObjects security, you probably have experienced a fair amount of pain trying to get it right. Yes, it can be confusing. However, there are some cool tricks that you can implement to make it much easier. This presentation will go over the basics of security, including some fundamental concepts. From there, you will explore some little-understood features of the Central Management Console (CMC) that can make security much easier. Some of the key concepts will include:

  • Setting security at the root folder
  • Using access levels
  • How to use the “Everyone” group
  • Dual security architecture
  • Best practices

ASUGNEWS Studio

In addition to my sessions, I’ll be joining Jamie Oswald and Dave Rathbun in the ASUGNews Studio for a live interview session on Tuesday morning at 9 AM. The studio will be located on the Show Floor. It should be fun, as the three of us in one place could mean non-stop laughter.

DSLayered Podcast

Eric Vallo is planning on recording a podcast from the conference, all about the early days of BOB. So, he’s pulled together a group of folks that have been there from the beginning, including Amy Miller and Susan Collins. They ran the List Serve that predated BOB. Of course, it will also include Dave Rathbun, Steve Krandel, myself, and perhaps a few special surprises. Note that this may be recorded after the conference, depending on everyone’s schedules, and a few technical details.

The Bottom Line

So, the bottom line is that I’d like to meet as many people as possible at the conference this year. So, look me up. I’ll also be attending the BI4 Launch Party, sponsored by Steven Lucas and Miko Yuk. See you there!

Display Input Control Values on Report

A common requirement in the reporting world is to display filtered values on a report. This could include Query Filters (including prompted values), Report Filters (including Drill Filters), or Block Filters. Of course, displaying Query Filters on the report is fairly simple. If the Query has one or more prompts, you would use the UserResponse() function to display the selected value(s) on the report. Otherwise, you could use a formula like this:

=Substr(QuerySummary();Pos(QuerySummary();”Filters”);Length(QuerySummary()))

Or, of course, you could simply type the values into a cell. The same goes for Block Filters. Since they aren’t dynamic, you can simply type them into a cell.

However, there is no function to display the values selected in an Input Control. This question was recently posted on BOB: “How do I display the values selected in an Input Control on the report?” The solution was fairly simple, but sometimes the simple solutions can be the hardest to find.

To demonstrate this, I’ll use the eFashion universe. I started by creating the following query:

After running the query, I removed the Store name column from the block. This column will be used for the Input Control.

Next, I created an Input Control as follows:

  • Object: Store name
  • Type: Check boxes (Multiple)
  • Number of lines: 14

For all other settings, use the defaults. Now, as you select one or more Stores from the Input Control, the values in the block adjust accordingly. Of course, we can’t see on the report which value(s) are selected. So here’s how we accomplish this:

From the Templates tab of the Report Manager, expand Tables, and drag a Vertical table on to the report. This gives you a blank table, with one column. From the Data tab of the Report Manager, drag the Store name object into the center of one of the cells in the table. At this point, the table should be showing all Stores.

Next, click the Edit button on the Input Control.

In the Edit Input Control box, go to the Dependencies tab. You see Block 1 is selected. Select Block 2 as well.

Click OK. Now, try it out. As you select values in the Input Control, Block 1 continues to show the filtered data, while Block 2 shows the selected Stores.

There you go. Now all you have to do is format as needed, and you’re done. Let me know what you think.

Does Drilling Need to be Painful?

I never know where the inspiration might come from for a blog post. This one just dropped in my lap, so to speak. A co-worker had some questions about drilling in Web Intelligence, and what impact it would have on the database. Well, the answer was a little more complex than I expected. So that got me wondering if other people might be confused about the various drilling options. So this is all about drilling in Web Intelligence, and when to use which options.

Options

Lets start by looking at the drilling options in Web Intelligence, and what they mean. Since drilling works the same in both Rich Client and Web Client, I’ll be working in Rich Client for this discussion, but will mention the difference in work flows, when appropriate.

In Rich Client, when you go to Tools – Options, you will find that the Options dialog box has a Drilling tab. In the Web Client, you will go to InfoView preferences – Web Intelligence to find these options.

The first option allows you to decide what happens when you click the Drill button. By default, when you go into Drill mode, the current report will be placed into drill mode. However, if you select the other option, “Start drill on duplicate report”, then entering drill mode will cause the current report to be duplicated, with the duplicate report being placed into drill mode. In this method, the original report remains in Edit mode, rather than Drill mode.

Next, we have an option that says, “Prompt if drill requires additional data”. When drilling on a document, ideally, the data needed for drilling should already be in the document. But this isn’t always the case. You may need to drill beyond what is in the document. This is called drilling out of scope. More on this later. However, this option allows you to be prompted if your drill action will require the query to be run. Think of it as a warning system, letting you know that drilling may take longer than expected. If you choose this option, you will get a dialogue box, when drilling out of scope, that allows you to select all the needed levels for additional drilling.

“Synchronize drill on report blocks” is an option that should be turned on by default, in my humble opinion. If you have two blocks on a report, such as a table and chart, that show the same data, this option allows you to drill down on both blocks with one click. If you don’t turn this option on, then one block will drill, and the other will be filtered for the drilled value.

If you’ve done any drilling in Web Intelligence, you’re probably aware of the Drill Filter Toolbar that appears above the report when drilling.

When you drill, combo boxes appear in this toolbar that allow you to drill sideways by selecting a different value for the drilled dimension. As the text in the toolbar indicates, you can also drag dimensions onto this toolbar from the Data tab of the Report Manager, and use them as simple report filters. The final option, “Hide Report Filter toolbar on startup”, will prevent this toolbar from appearing when you start drill mode.

Note that any changes you make to these options will take effect the next time you open a document.

Setting up Drill

In order to drill in a Web Intelligence document, Hierarchies (Drill Paths, in 4.x) need to be created in the universe, by the universe designer. This isn’t covered in this post. We’ll assume that the designer has already done this.

When building a query, we typically set up the drill path by defining the Scope of Analysis. This simply means selecting which objects from a Hierarchy we wish to use for the drill path. To do this, start by exposing the Scope of Analysis pane (Click the 3rd button on the toolbar). Then switch the Universe pane from “Display by objects” to “Display by hierarchies”.

Note that it is not necessary to Display by hierarchies. This view is helpful to see the hierarchies that are built in the universe. However, if you are familiar with the hierarchies, you can leave “Display by objects” selected.

You can now drag the objects from the Universe pane to the Scope of Analysis pane that you wish to use for drilling. These objects will be included in the query results, but will not be shown on the report until you start drilling to them. You don’t need to select all the objects from a hierarchy, only the ones you need for drilling.

In the query above, let’s say that we selected Year, Quarter and Month for the Scope of Analysis. On the report, we can drill from Year to Quarter, and Quarter to Month, using the query results stored in the document. However, if we want to drill from Month to Week, we will be drilling beyond the Scope of Analysis defined in the query. This is called drilling out of scope, and requires the query to be rerun. Drilling out of scope is generally not desirable, as it adds additional overhead to the database, and takes longer to drill. Therefore, it’s best to define the needed levels in the Scope of Analysis.

However, it’s also not good to define too many levels in the Scope of Analysis, as each additional level increases the size of the result set stored in the document, exponentially. This can cause the document to open slowly, import and export slowly, and perform sluggishly, especially when refreshing the query(ies). The bottom line is that you want to find the balance between too many levels and too few levels.

Query Drill

As mentioned, for each additional level added to the Scope of Analysis, the size of the result set grows exponentially. This can cause documents to bloat to huge sizes. Query Drill is designed to alleviate this problem. However, it can also be a dangerous feature. Here’s how it works.

Rather than defining a Scope of Analysis, you would simply create your query as usual, and run the query. Once you have the query run, open the Document Properties (File – Properties). Under Document options, you will see a check box labeled “Use query drill”. Click the check box to activate Query drill.

Now, when you drill, for example, on 2003 in the Year column, Web Intelligence will rerun the query, adding a query filter for Year=2003. If you then drill on Q4, the query will be run again, with another query filter added for Quarter=Q4. This keeps the result set nice and small, but allows drilling to all levels needed.

So why is this a dangerous feature? First of all, every time a user drills down, up, or sideways, the database is hit with another query. This can add tremendous overhead to the database, especially if there are many users using this feature.

Second, depending on how long it takes for the query to run, users will have to wait for the results of their drilling action. This can cause users to become annoyed with drilling, and BusinessObjects in general. Therefore, Query Drill should be used with extreme caution.

Administration

Yes, there’s some administration to be done with drilling. So, if you’re not an administrator, you can skip this section, although you may find it interesting.

There are some security settings that can be used to control the way users use drill. In the CMC, under Applications – Web Intelligence, there are some advanced settings that apply to drilling. The first one is “Enable drill mode”. With this setting, you can completely eliminate a user’s ability to drill. If you don’t want certain users to use drilling, use this setting to deny drilling.

“Extend scope of analysis” allows a person to drill out of scope. Since drilling out of scope adds overhead to the database, you can limit users ability to drill out of scope by denying this right.

So, with these two settings, you can have some control over drilling. Use these cautiously, as drilling is one of the most useful features of Web Intelligence.

Conclusion

The questions from my co-worker had to do with the impact of drilling on the database. Of course, my answer was along the lines of “It depends”. There are numerous factors that can affect the impact on the database. Hopefully, this article gives you the information you need to make an educated decision regarding the options for drilling. Let me know you have any special tricks or guidelines for drilling.

Adding a Linear Trend Line to a Chart

I must admit, I have long lamented that Web Intelligence doesn’t have built in trend lines for charts. Of course, that doesn’t mean you can have one. You just have to build it manually, and that can be a lot of work. So this post is about how to do a linear trend line, also known as a Linear Regression. This technique makes extensive use of Calculation Contexts.

We’ll use the eFashion universe for this example, so you can follow along with all the steps. Start by creating a query with Store name, Year, Month name, and Sales revenue. Run the query, and remove Store name and Year from the block, and put a custom sort on Month name to get the months in chronological order. Finally, add report filters for Store name and Year to the Report Filter toolbar. Your report should now look like this.

OK, let’s get started. We’ll need to create quite a few variables along the way.

1. Create a variable called “n”. This will simply calculate the number of values in the block. So the formula is as follows:

=Count([Month Name])In Block

Add this new variable to the block, and verify that it shows 12 on every row.

2. The next variable simply numbers the rows in the block, so we’ll call it “x” as it numbers the values along the X axis of the chart. The formula for this one is:

=Sum(RunningCount([Month Name]))

If you add [Line Num] to the block, it should number the rows, 1 – 12.

3. Now we get into the heavy calculations. Call the next variable “a” with the following formula:

=(Sum([x] * [Sales revenue]) In Block * [n] In Block) – (Sum([x]) In Block * Sum([Sales revenue]) In Block)

You can add the [a] variable to the block, just to verify that you get the same number on each row. Notice the “In Block” context used several times in this formula. In Block is an output context operator that tells the formula to output one number for the entire block. This is why we get the same value on each row.

4. Moving along in order, let’s call the next variable “b”, with the following formula:

=([n] * [n] * (Sum([x] * [x]) In Block / [n] – (Average([x]) In Block * Average([x]) In Block)))

The first part of this formula multiplies the squares of [n] and [x], outputing one number for the block, dividing it by the number of values in the block. It divides that by the number of values minus the square of the average [x] values. Sound confusing? Trust me, we’re on the right path, but we’re not done yet.

5. OK, this brings us to the next variable, which we’ll call “c”. This one is pretty simple:

=[a]/[b]

Of course, since both [a] and [b] return one value for all rows, we can expect [c] to return one value for all rows.

6. We’re in the home stretch now. Add another variable, called “d”, with the following formula:

=(Average([Sales revenue]) – [c] * Average([x])) In Block

Again, this one should return one value for all rows. Any time you see “In Block” as the output context for a block, you can expect it to return one value for all rows.

7. OK, one final variable! We’ll call this last variable “Trend”. This one has the following formula:

=[d] + [c] * [x]

That’s all the variables we need. We are now ready to plot this on a chart. So, remove everything from the block except the Month name, Sales revenue, and Trend objects. Finally, turn the block into a line chart, and format as desired. You should end up with something like this:

As you select different years, or stores, from the filters, above, the trend line will adjust accordingly.

Give it a shot with your charts and let me know what you think?

Removing the Confusion from Calculation Contexts

If you’ve read enough of my blog posts, you’re probably aware that I’ve been using BusinessObjects for a long time, and, during the early years, I was pretty clueless. One of the topics that had me completely stumped was Calculation Contexts. I attended the classes, and learned a lot, but could not grasp the concept of Calculation Contexts. So, I studied the training manuals, but it just wasn’t sinking in. So, I studied the user guides, but it still wasn’t making sense. So, I decided that I just wasn’t smart enough to understand this concept, and did my best to live without it.

It wasn’t until 2002, while attending the BusinessObjects User Conference that I finally got it. I was attending a session by my good friend Dave Rathbun. One of his topics was Calculation Contexts. Dave explained them in a slightly different way than the books. Suddenly the light bulb went on, and it hasn’t dimmed since. In 2004, I started teaching BusinessObjects classes, and I always taught Calculation Contexts the way I learned it, and it seemed to make more sense than what was in the books. So, my first piece of advice is, if you really want to understand Calculation Contexts, don’t read the books. :-)

What is Calculation Context?

The bottom line is that all calculations in Web Intelligence, by default, happen in the context of where they appear on a report. The same formula, in different places, can give different results. Often, the default context is exactly what we want. But sometimes it isn’t, and it’s then that we need to understand how to manipulate the contexts.

Default Calculation Context

Let’s start with a simple example, using the infamous eFashion universe. Create and run the following query:

Next, section the report on Store name. Add a Sum to the Sales revenue column. Notice that, even though the formula for the Sum is the same in every section, it yields different results in each section:

=Sum([Sales revenue])

This is because the calculation takes place in the context of where it appears on the report. In this case, the formula is within a section, so it outputs a different value for each section.

Next, drag the cell with the sum, out of the block, and drop it next to the section header cell.

Notice that the results are the same as when the sum was in the block. This is because the formula is still in the context of the section, Store name, and therefore, outputs a different value for each Store name.

Next, drag the cell with the sum up and to the left, next to the Report Title cell, but hold down the Ctrl key before you drop it. This creates a copy of the cell.

This time, after dropping the cell, you will see a different number. The number you see now is the grand total for the whole report. Once you move the formula outside the context of a section, it returns results in the context of the whole report.

But what if you want to calculate the percentage that each store makes up of the grand total? You would need the following calculation: (Revenue for each store) / (Grand Total Revenue)

But how do you get the grand total to appear in the section? Well, before we answer that question, let’s talk about Calculation Contexts, and what makes them up.

Components of Calculation Contexts

Every calculation in Web Intelligence has an Input and an Output context.

  • Input Context: Determines what values are taken into the calculation. Think of it as, “I need to take into the calculation, one number for every __________?
  • Output Context: Determines the output level, or dimensions, by which the calculation is output. Think of it as, “I need this formula to put out one number for every __________?

The syntax for Calculation Context is as follows:

=Aggregate Function([Measure] Input Context ) Output Context

Notice that the Input Context is inside the aggregate function, and the Output Context is outside the aggregate function. If you don’t specify an Input or Output Context, Web Intelligence will use the defaults. More on defaults later. By the way, while Calculation Contexts are most often used with aggregate functions, they are sometimes used without functions.

How to use Calculation Contexts

Let’s try a couple examples to demonstrate how these contexts work. Add another report to the document that we have already started, and place a block on it with Store name and Sales revenue.

Select the Sales revenue column, and add an Average calculation. In the footer of the block, you should see the following:

Next, we’re going to put an Alerter on the Sales revenue column, to highlight all values that fall below average. In order to reference the average of 2,799,016 from an Alerter, we will need to turn the Average into a Variable. So, select the cell that has the Average value in it, and click the Create Variable button on the Formula Toolbar.

Give the Variable the name, Average. Now we can create the Alerter. Select the Sales revenue column, and then click the Alerters button on the Reporting Toolbar.

In the Alerters box, click the New button, to create a new Alerter. Name the Alerter, Below Average. In the Sub-Alerter section, set the following conditions and formats:

Click OK, then OK in the Alerters box. At this point, the Alerter is applied to the Sales revenue column. But we don’t see any values highlighted in bold red. Why not? Well, it’s all about Calculations Contexts.

To see why this isn’t working, add the Average variable to the block, as the third column.

The value that we would expect to see, 2,799,016, isn’t showing up. Instead we’re seeing the same values as we see in the Sales revenue column. So let’s explore why.

As I mentioned, if we don’t specify an Input or Output Context, Web Intelligence will use the default contexts. And in this case, we didn’t specify any contexts in the Average variable. So what is the default? Well, in the body of a block, the default Input Context is the dimension(s) displayed in the block. And, the default Output Context is the dimension(s) displayed in the block.

If you were going to calculate the average manually, you would need to know what the numbers are for each store, sum them up, then divide by the count of stores. In other words, you would need to take into the calculation one number for each store, and you would want to output one number for the whole block of stores.

Well, that’s exactly what we need the Calculation Contexts to do. Since the default Input Context is the dimension(s) displayed in the block, and, in our case, the dimension displayed in the block is Store name, the default Input Context is to take in one value for each Store name. So, the default Input Context is fine.

So the problem must be the Output Context. Again, the default Output Context is the dimension(s) displayed in the block. In our case, therefore, the default Output Context is to output one number for each Store name.

So, what is happening here is that we’re taking into the calculation one number for each Store, and outputting one number for each Store. Clearly, that isn’t very useful. What we really want is to take in one number for each Store (the default), and output one number for the entire block (not the default). So, we need to specify the correct Output context.

So, let’s open up the Average variable, so we can modify the Output Context. Click at the end of the formula, outside the aggregate function, and enter the following:

In Block

So the entire formula should now be:

=Average([Sales revenue])In Block

The “In Block” part of the formula tells the formula to output one number for the entire block. Click OK, and confirm that you wish to modify the variable. Now, not only is the Average variable giving us the same value on every row, but the Alerter has now started working.

OK, so now that we have this working correctly, we can remove the Average column from the block. Our next step is to add another variable to the block. This time, we want to display the most revenue earned by each store in any one year. So, create another measure variable, called Maximum, with the following formula:

=Max([Sales revenue])

Add this new variable to the block as the third column. Does it look familiar? Just like the Average variable, Maximum is just repeating the Sales revenue values. This can’t possibly be correct! You’re right. It’s not correct. So let’s figure out why.

Since we didn’t specify an Input or Output Context, the formula is using the defaults. Therefore, it is taking in one number for each Store name, and outputting one number for each Store name. What do we want it to do? Well, in order to know the most revenue each Store earned in any one Year, we need to know what the values are for each Store, for each Year. In other words, we need to take into the calculation both Store name and Year, not just Store name. And, we want to output one number for each Store name, which is the default, so we don’t need to specify an Output Context.

So, open up the Maximum variable, and click just inside the closing parenthesis, so we can specify the Input Context. Add the following:

In([Store name];[Year])

So, the whole formula should now look like this:

=Max([Sales revenue]In([Store name];[Year]))

Notice that this time, we put parenthesis around the objects used in the context. When you use a keyword, such as Block or Report as the Context, you don’t put parenthesis around it. But when you use objects, like Store name or Year, you always put parenthesis around them.

Click OK, and confirm that you want to modify this variable. Now, you see very different values.

Are these numbers correct? Well, if you want to validate the results, add another block to the right with Store name, Year, and Sales revenue, and see if the Maximum variable is showing the highest annual revenue for each Store. You will find that, yes, the results are correct.

Now, let’s take a brief look at the first report that we sectioned on Store name. If we want to sidplay the percentage that each Store makes up of the total, we would use the following formula within the section:

=Sum([Sales revenue])/Sum([Sales revenue])In Report

The second part of the formula, with “In Report” as the Output Context, specifies that we want it to output one number for the whole report. Therefore, we are dividing the Sales revenue for each section, by the Sales revenue for the whole report. Format the cell as a percentage, and you’re done.

Using Other Context Keywords

So far, we have explored the Input and Output context using the “In” keyword. But there are more than just one keyword that can be used in Contexts. Let’s take a quick look at ForEach and ForAll.

ForEach can be used to add dimensions to the default context. For example, in the Maximum variable we just created, we specified both Store name and Year for the Input Context. If we had used “ForEach” instead of “In”, we would not have needed to specify Store name in the Context, as Store name is the default. Test this by creating another variable, called Max ForEach. Enter the following formula:

=Max([Sales revenue]ForEach([Year]))

Go back to the second report in our document, and add the new variable to the block as the fourth column. Notice that it gives the same results as Maximum. Use ForEach when you need the formula to be dynamic. If you add additional dimensions to the block, they will automatically be added to the Input Context.

ForAll is the opposite of ForEach. While ForEach adds dimensions to the defaults context, ForAll removes dimensions from the default Context.

OK, so let’s take a quick look at one more Context keyword: Where

The Where keyword allows you to add a filter to a formula. For example, if we wanted to split the revenue for each year into separate columns, we could create a variable for each year, using the Where keyword. Let’s try one.

Create a variable called Revenue 2001 as a measure, with the following formula:

=Sum([Sales revenue])Where([Year]=”2001″)

Add this variable to the existing block. In this way, you can easily filter a formula for one or more values of a dimension. Of course, this could also allow you to calculate the variance between two years. For example, if you wanted the variance between the 2002 revenue and the 2001 revenue, you would use the following formula:

=Sum([Sales revenue])Where([Year]=”2002″) - Sum([Sales revenue])Where([Year]=”2001″)

There are a couple rules when using the Where keyword:

  • Following the Where keyword, always use parenthesis for the condition.
  • In the condition, the value to the right of the equal sign must always be a constant, not a variable.

Conclusion

This post turned out to be much longer than I expected when I started it. I guess I got carried away. But I think that it’s very important to understand Calculation Contexts. Just remember, the Input Context says, “I need to take into the calculation, one number for every what?” And the Output Context says, “I want to put out one number for every what?” Of course, the more you practice it, the easier it will get over time. So, practice much, and let me know how it goes.

Follow

Get every new post delivered to your Inbox.

Join 296 other followers