A few Report Studio tricks

Building reports in Cognos can be a fairly arduous process. A mistake early in development can cause headaches later on. Quickly and efficiently building reports is best, but what if you have to go back and fix something?

Fixing Extended Data Items

One of the most common problems I run into is people using extended data items to build a dimensional report. Once the data item has been created, it can’t easily be changed. What if you need to change the level, or even the hierarchy, the item is based on? You’ll need to delete it and rebuild it from scratch. Instead, it would be easier to convert it to a full expression.

There is a function that allows this, but it is normally inaccessible. We can easily reveal it by using the browser’s developer tools in Report Studio. I prefer Firefox, which is the browser shown in the animation. Simply copy the following code into the console and run it.

// Move the hidden "Convert to V5 Data Items" item into the Tools menu
document.getElementById('mnuToolsPopup').firstChild.firstChild.appendChild(
  document.getElementById('mnuConvertV5DataItems').parentNode.parentNode
);

Convert to V5 Data Items

As you can see in the animation, the option magically appears under tools, and will convert every extended data item into a normal data item. It’s worth mentioning that this is not supported by IBM, and it is not possible to convert data items back into Extended Data Items, so tread carefully with this.

Avoid Extra Master/Detail relationships by using list headers

I’m not a big fan of the section functionality. It works by creating another list, a list in a list, connected with master/detail. In some cases Cognos can handle this efficiently, with master/detail optimization and well-built queries. In most cases, however, Cognos will generate an extra query for each separate detail.

How can we avoid this? By careful grouping and setting of list headers/footers, we can replicate the appearance of a sectioned report. In the image below, we have two identical looking lists. One is using the default section function, and one is some very clever formatting. Can you guess which is which?

List Headers vs Sectioning

The year is grouped and removed from the list, headers and footers are set, the list header is set to “start of details”, and extra rows/fields are included for the padding effect. All small details that go a long way toward making things look right.

The report xml is attached below, but can you figure it out yourself?

Caching charts to speed up run time

This is actually something I’m not too proud of. At my current client we have a few reports with embedded graphs. Hundreds upon hundreds of embedded graphs. Each graph can take around 300 ms to render, which increases the runtime of the report from 30 seconds, without graphs, to around 9 minutes. How can we get around this? Well, in this case, the graphs were fairly simple: a colored bar showing 0-100% against a light gray background. There are a few variations – 12 pixels tall vs. 6 pixels tall, and depending on the type it can be blue, green, or purple. In total there were 606 possible graphs (it’s actually 602, as the 0% graphs all look identical, but let’s not get technical here).
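A quick sanity check on that count:

```javascript
// 101 percentage values (0-100) x 2 heights x 3 colors
const total = 101 * 2 * 3;            // 606 files
// At 0% the bar is invisible, so per height only one of the three colors
// is distinct: two redundant files for each of the two heights.
const distinct = total - 2 * (3 - 1); // 602 distinct images
console.log(total, distinct);         // 606 602
```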

To start, the names of the graphs had to be consistent: Bar55B.png would be a bar set to 55% and colored blue. We can then use an image item with the source set to a report expression: '\images\charts\Bar' + number2string(floor([Measure]*100)) + 'B.png'. Each row generates the image tag, and doesn’t have to take the time to actually build the graph image. But how can we generate those images?
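The naming logic in that expression can be sketched in plain JavaScript. This is just an illustration of the scheme, with the color letter pulled out as a parameter (my own addition; the real expression hard-codes it):

```javascript
// Illustrative sketch of the naming scheme behind the report expression.
// measure is a ratio between 0 and 1; colorLetter is 'B', 'G', or 'P'.
function barImage(measure, colorLetter) {
  const pct = Math.floor(measure * 100); // 0.55 -> 55
  return '\\images\\charts\\Bar' + pct + colorLetter + '.png';
}

console.log(barImage(0.55, 'B')); // \images\charts\Bar55B.png
```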

The easiest way is to use a numbers dimension. Don’t have a numbers dimension? Use a date dimension – you just need to massage it to work the way you want. For example, with the samples I’m using _days_between([Sales (query)].[Time].[Date],2010-01-01) and filtering that <=100. Build the graph in each row, and run it to make sure. The output should look like this: graphs

But saving each of those images would be tedious, so how can we automate it? Modern web browsers have a number of options. I’m using an addon called Save Images, which will automatically save each image from the tab to a specified folder. The addon saves and numbers the images, starting at 1, in the order they appear in the output, so it’s important to sort the list so the 0% graph lands at the very end. That way you only have to rename Bar101 to Bar0, instead of subtracting 1 from every graph.

save images

Once the images have been saved, we need to do a bit of post processing.

First, rename Bar101 to Bar0. Next, notice something interesting: the height of each image is 16 pixels, but we need 6 or 12. To fix this we can use a batch image processor such as XnConvert, which can modify both the size and the colors of the graphs. The color replacement tool offers a number of ways of selecting the desired colors, including a color picker.

Simply select the color to change and the color you want.
xnconvert colors

Next we resize
xnconvert resize

Finally, renaming:
xnconvert rename
This will take the original name, BarN, and append GL (green, large).

And when we press the convert button…
xnconvert rename

We can repeat, very quickly, for every variation we need. In my case it only took 6 rounds of changes, about a minute in total.
xnconvert done

Now instead of relying on Cognos to generate hundreds (if not thousands) of graphs, we simply use an image item with the source set to a report expression.
'..\images\Bar\ipr\Bar' + number2string(floor([Query1].[Percentage]*100)) + [Color letter] + [Large or Small] + '.png'

The difference in runtime is phenomenal. First, we no longer have a master/detail relationship, so that’s an additional N database connections that aren’t being opened (yes, I know M/D optimization negates that, but it only works in DQM and it’s spotty). And most importantly, that’s thousands of images Cognos doesn’t have to generate (300 ms * 1500 images = 7.5 minutes).

section-or-formatting.txt (204 downloads)
graphs.txt (161 downloads)

Takeaways from IBM Insight 2015

Unlike previous years, this year was almost all work. I didn’t really have any free time (what little I had was devoted to the fine art of inebriation, with many thanks to Motio for doing their part), so I couldn’t give a daily overview. On the flip side, I did have much greater access to the IBM developers this time, and had the chance to ask a few of my most urgent questions. This is actually the primary reason I attend IBM Insight – the chance to look them in the eye and hopefully get some straight answers. There were a few people who dropped by the booth to chat, but apparently I was never there. Sorry! I have a tendency to wander around and collect pens/USB sticks/tchotchkes for the kids. If it’s still relevant, drop me a line and we’ll chat.

I’m sure everyone is wondering about the new version of Cognos. The most important thing first. I do not need to change the name of my blog – IBM is sticking with the Cognos moniker. It is now christened Cognos Analytics, but for the purpose of abbreviations I’ll just call it C11.

The authoring environment has undergone a major refactoring. The Report Studio we all know and love (well, I love it) has received a substantial face-lift. The Report Viewer has also seen many changes. Previously an authored report was essentially static; you could, in theory, write JS to manipulate objects on the page, but that always came with some level of risk. Dashboarding and datasets have also gone under the knife and have emerged substantially improved.

Unfortunately I don’t have access to the C11 demo yet, so any screenshots will be from YouTube videos. As soon as I do get access, I’ll try to publish some articles with original media.

Authoring
The immediate reaction is that it seems to be a completely new tool. The menu bar at the top is gone, clicking on certain regions opens up a circular context menu, and the toolbox is actually arranged in a logical format.

This is actually still using rsapp.htm. In theory, all of the same functionality is still there. The locations for the menu options have been moved, and this time they actually feel logical. To get to the report properties, for example, you don’t go through the File menu option – you actually click on the report object. Additional report properties have also been moved here, so it does make things a little easier to find. Moving things around does have its drawbacks – in the hands-on demo it took me a minute to find the location to switch from an individual page to a specific query. A few other features took me some time to find.

When a table or block is dragged in, a plus icon appears in the center. Clicking on that creates a radial context menu.

The items that appear in the menu are the items from the “pinned” section in the toolbox. Easily changed by report authors.

Another very positive change is that the query explorer now shows all objects associated with each query. Expand a query, and jump to an object by clicking on it. The more complex reports tend to accumulate a lot of QueryNs, so this should make cleaning up reports much faster.

Reports built in this new RS version will automatically open in the new Report Viewer (this is a report-level property which can be changed). The new report viewer should work with the Prompt API (though the hands-on was throwing an error when I tried), but all other JS is likely to cause issues. This should be okay, as much of the JS I write provides functionality similar to what is now built in. The report viewer appears to be a modified Workspace Advanced: end users can make simple modifications to the report output, such as re-sorting or adding basic calculations in lists or crosstabs. Users will only be able to save those changes if they have write access to the report, or if they save a report view in their own folders. There are plans to extend the published API, but I’ve heard no specifics yet.

Reports upgraded to C11 will continue to run in the old report viewer! This means there is a very good chance I won’t get frantic calls to fix broken reports. As usual I received no promises – only “in theory” and “should”.

Some of my biggest requests, master/details on singletons (everyone should vote on it here: https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=62883), and adding native functions to the dimensional function set (TM1 especially, it’s an IBM product! Why can’t I use the TM1 functions?! I should stop before this turns into a TM1 rant.) have not yet materialized.

Analysis Studio and Query Studio are still included in this version! While the direction is to remove them eventually, there are a few gaps that couldn’t be overcome – specifically, getting the filters to show at the top. But this is the last version with AS and QS, for reals this time!

Datasets are very interesting.
Unlike previous versions, you can create datasets on existing packages. Drag in the fields you want and Cognos will save the results in a local table – by default, on a DB2 instance installed automatically by Cognos. This is essentially a materialized view, but I don’t have any information on scheduling it. I am guessing there will be governors administrators can set on this: number of rows, size of output, max runtime of query, and similar settings.

This is a completely web based tool, which is supposed to be simple enough for end users. In theory it should automatically determine how the joins should be built. See more here: https://www.youtube.com/watch?v=cYlbiWeBgtA

These datasets are designed to feed the new dashboarding system, which is truly impressive.
The dashboarding tool is obviously designed for analysts and end users. It does not offer the wide range of options and capabilities we’ve come to love in RS. However, it does many things that RS can’t. The tool only works with datasets, so the response time is fast – the data is already aggregated to the user’s needs and it’s automatically using DQM.

Drag in a measure, and it will create a block with that value. Drag an attribute on top of the measure, and it turns into a bar chart. Drag in another measure and it adds another bar, or add an attribute and it turns into a cluster or a scatterplot. We can turn this into another data container, like a list, with a simple right-click. Now for the interesting bit – drag in another measure onto the screen. The list just created will filter that measure. Just like the “Global Filters” from Workspace, you can filter any object by clicking on any item in that list. Furthermore, users can now drag in an iterator, similar to the item in Active Reports. But this one can automatically loop through members.

I can’t do it justice without an animated screenshot. Instead, just watch the video here: https://www.youtube.com/watch?v=bRbulHoUQC4

Navigating
The Cognos Connection screen is completely revamped. We won’t have the same system of tabs across the top and folder navigation. I’m personally sceptical of this, and nobody’s been able to give a satisfactory answer about where the tabs from C10 will actually go. This is my biggest concern when it comes to customer adoption.

The back end is mostly unchanged. Reports still use the same XML format, and the Cognos content store is (supposedly) still the same. This fact, coupled with the option to use the old Report Viewer, gives me hope that upgrading should be quick and painless.

On the non-Cognos-centric side, Watson is being pushed hard now. There’s even a way to import the results from Cognos reports directly into the Watson web app. It seems to work through a CMS call from your browser. Let’s say you have a list report: you provide the location of that report, answer all of the prompts, and your browser runs it in the background and sends the results back to Watson. They warned that this can be very slow with large datasets. Stretch, walk around, get an expensive coffee from that little shop on the other side of town slow. Watson is now powering several different applications, some of which were showcased throughout Insight. Research, shopping, travel, hospitals, robots – there seems to be very little that Watson can’t do.

As usual, IBM Insight was a lot of fun and I strongly recommend attending. The networking possibilities alone are phenomenal, and coupled with inexpensive certification testing and shiny toys (I got to fly a drone, saw a 3D-printed car driven by a robot, played with a Van de Graaff generator, and probably had more interesting experiences I’m forgetting), they make Insight a unique experience.

Addendum: My apologies for the lack of updates and replies to comments recently, my current client is taking all of my time. The current phase of the project is nearly over, so I’m going to soon have some time to go over my list.

Another look at RAVE

I will be leading a 5 hour session on RAVE at the Schaumburg IBM Offices near Chicago August 12, 2015. Space is limited, so make sure to register early and often! Details can be found here: http://events.r20.constantcontact.com/register/event?oeidk=a07eb54blte62fd6806&llr=nuulfnqab

IBM released their new visualization engine quite some time ago, and great strides have been made in improving its usability. Sadly the documentation is still as opaque as ever, and outside of the IBM Visualization forum there’s not much support.

I’ve recently had the opportunity to work with it a great deal, as my current client has some very interesting requirements. This article will actually serve as the basis of a talk I’ve been asked to give to the other developers at this client. I am planning a series of articles on RAVE (as a quick side note, I find simply writing about methodologies teaches me just as much as reading about them – everyone should stop what they’re doing and start their own blogs), and this may recap some of what I’ve written previously.

To begin, let’s talk about the language behind RAVE, vizJSON.

As you can see from the name, vizJSON is based on JSON (JavaScript Object Notation). The basic structure is:

1. Objects: {}
2. Arrays: []
3. Key/Value Pairs: "Key":"Value"

The value of a key can be a number, a string, an array, or an object. An array can contain one or more values or objects. With the exception of the vis object itself, every value must have a key.

An example:

"label":[{"content":["The range is "]},{"content":[{"$ref":"start"}," – ", {"$ref":"end"}],"style":{"font":{"weight":"bold"}}}]

This is actually much less complicated than it looks at first glance.

To begin, this is an element label. It will appear on or near the element to which it is attached. The label is composed of two parts (notice the array attached to label contains two objects). The first object contains a single key, content, which itself is an array containing a single value. The second object contains two keys, content and style. This content will render values found in the dataset (which I’ll explain later), and has a style applied to it.

The entire graph is contained within an object. Each object type has a specific schema. For example, the data object won’t expect an axis object, but a dimension will. At the same time, some objects, such as style, can be associated with many different parents.

At the top level, there are three essential objects. Without any one of these, no graph will be displayed.

1. Data
2. Grammar
3. Version
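Putting those three together, a minimal skeleton looks something like the following. I’m sketching it as a JavaScript object literal so the nesting is easy to see; the version number, ids, and sample rows are placeholders of my own, not from any shipping bundle – treat it as a sketch of the shape described above, not a validated spec:

```javascript
// Skeleton of a vizJSON document: data, grammar, and version are the
// three required top-level objects. All ids and values are placeholders.
const viz = {
  "version": "3.2",
  "data": [
    { "id": "myData",
      "fields": [ { "id": "category", "label": "Category",
                    "categories": ["a", "b"] },
                  { "id": "value", "label": "Value" } ],
      "rows": [ [0, 10], [1, 20] ] }
  ],
  "grammar": [
    { "coordinates": { "dimensions": [ /* axes go here */ ] },
      "elements": [ { "type": "interval",
                      "position": [ { "field": { "$ref": "value" } },
                                    { "field": { "$ref": "category" } } ] } ] }
  ]
};

console.log(JSON.stringify(viz).length > 0); // true
```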

The other graph objects include: parameters, legends, legendPosition, size, style, and copyright. Parameters allow report developers to modify the appearance of the graph at run time. Legends and legendPosition control, surprise surprise, the legend. Size and style control the size and formatting of the graph. And copyright allows me to sue anyone who uses my visualizations without first paying me a king’s ransom.

RAVE, and vizJSON with it, has gone through many revisions. Certain attributes have changed – features have been added and removed. Version allows the engine to properly identify which version of vizJSON the visualization was written for, and behave accordingly.

Data

The data array controls the datasets available to the visualization. Each dataset is represented in Cognos as a separate query. This gives us flexibility we’ve never had before: previously, graphs could only be built off a single query, necessitating inefficient queries for the more complex graphs.

An example of this might be a bullet or a thermometer chart embedded in a list. There are two sections; the first section controls the scale of the graph, the second controls the position for a notch. In a standard Cognos graph, we would need a complex query to get the red/yellow/green bands and the notch position.

horizontal thermometer

In this graph, we show various bands, a target, and an actual. What’s not immediately visible is the first band does not start at 0. This graph is actually possible to accomplish using standard Cognos Graphs, but it is an ugly solution involving all sorts of trickery. Because it has to be a single query, a case statement is used. Additionally, because this is a master/detail, the score band query cannot be cached, and the tables are queried for each detail graph.

By using RAVE, we can set up two data sets. The first pulls the score band data. We pull score band label, start, and end. If we are using a single score band tier across the entire m/d list, this query can be cached.

The data set looks like:

      {
         "id":"zones",
         "fields":
         [
            {
               "id":"series",
               "label":"abc",
               "categories":
               [
                  "red",
                  "yellow",
                  "green"
               ]
            },
            {
               "id":"start",
               "label":"start"
            },
            {
               "id":"end",
               "label":"end"
            }
         ],
         "rows":
         [
            [0,160,359],
            [1,360,559],
            [2,560,760]
         ]
      }

The first field in that dataset is categorical. In the sample data, it’s populated with red, yellow, and green. The rows section (containing the actual data) refers to the sample categories using a zero-based index. Once these appear in Cognos, the report developer won’t have to worry about this.
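The index lookup can be illustrated in a few lines of JavaScript (the variable names are mine, not part of RAVE):

```javascript
// Resolve the zero-based category indexes in each row back to their labels.
const categories = ["red", "yellow", "green"];
const rows = [[0, 160, 359], [1, 360, 559], [2, 560, 760]];

const zones = rows.map(([series, start, end]) => ({
  series: categories[series], // index 1 -> "yellow"
  start,
  end
}));

console.log(zones[1]); // { series: 'yellow', start: 360, end: 559 }
```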

The second data set creates the dot; only a single value is needed.

      {
         "id":"scores",
         "fields":
         [
            {
               "id":"score"
            }
         ],
         "rows":
         [
            [460]
         ]
      }

A quick call to the fact, with no joins, will be significantly faster than trying to join the two queries.

Grammar
Once we have the sample data to work with, the grammar describes the graphs. Grammar is an array, so we can actually have more than one graph in a single visualization. We can control the position of the different graphs – they can be next to each other vertically or horizontally, they can overlap, they can even be nested. Imagine a tree map nested inside the bars of a waterfall.

Inside a grammar we have coordinates and elements. Coordinates is an object that holds the dimensions, and the dimensions attribute is an array of edges for a specific graph. While it’s easy to imagine a two-dimensional graph, what can we do with more?

axes

In this visualization, we have an interval with three dimensions. We have the X Axis (measures, 1st dimension), the Y Axis (the categories, 2nd dimension), and the Z Axis (the series, 3rd dimension).

A graph need not have any axes. While frowned upon by data visualization experts, we can build a graph by showing packed bubbles, encoding a value by size and color.

bubbles

This graph has no axes – no dimensions. Instead, it’s positioning the bubbles using a specific method. There are a number of positioning algorithms. From positioning or nesting in any number of dimensions, to figuring out how to place a node when there are no dimensions.

A dimension is often most useful when we have ticks and labels describing various points on the axis. In the clustered interval graph above, two out of three axes had labels. In vizJSON, each dimension has one or more scales, and an axis line. The scale of a graph is defined by the element tied to it; a scale always has at least one span, and you can add more to show, say, a line over a bar.

There are various settings to control the axis line and ticks. If it’s a categorical scale, such as the Y and Z Axes above, do you want to show labels? How often do you want to show the measure for the Y Axis? In the previous example, the series was encoded by color, so there was no need to place labels.

Now let’s talk about the shapes themselves: bars, lines, circles, polygons, and more. While an interval is not much use in a scatterplot, points can be placed almost anywhere. Some seem less obvious. What would you call the shape in a treemap? Or a heatmap? Or the paths between network nodes? A single basic element type can wear many hats.

Let’s talk about two basic shapes, intervals and points.

An interval will build a rectangle from one point to another. The most basic interval generates the rectangle from the position of the field assigned to it.

Let’s take a dataset:

{
   "fields":
   [
      {
         "id":"categories",
         "label":"abc",
         "categories":
         [
            "abc",
            "abc",
            "abc"
         ]
      },
      {
         "id":"series",
         "label":"abc",
         "categories":
         [
            "abc",
            "abc",
            "abc"
         ]
      },
      {
         "id":"size",
         "label":"abc"
      },
      {
         "id":"end",
         "label":"abc"
      }
   ],
   "rows":
   [
      [0,0,15,22],
      [0,1,17,23],
      [0,2,16,24],
      [1,0,24,34],
      [1,1,22,27],
      [1,2,26,33],
      [2,0,28,43],
      [2,1,16,34],
      [2,2,18,22]
   ]
}

Apply an interval element:

"type":"interval",
"position":
[
   { "field":{"$ref":"size"} },
   { "field":{"$ref":"categories"} },
   { "field":{"$ref":"series"} }
]

That interval is built inside a grammar with three dimensions. It will attempt to build a rectangle, starting at 0 and running to the value of “size”, split at the positions for “categories” and “series”. There’s a little more to it than shown here: there are three dimensions listed, and the element has the “cluster” transform applied to it.

interval

Interval also allows us to build a rectangle based on start and end values.

"type":"interval",
"position":
[
   { "field":{"$ref":"start"} },
   { "field":{"$ref":"end"} },
   { "field":{"$ref":"categories"} },
   { "field":{"$ref":"series"} }
]

In general, bars should begin at zero; 40 sales are obviously twice as many as 20. But is 40°C twice as hot as 20°C? Setting start and end ranges is useful when building thermometer graphs. We can use the interval to build the structure, the ranges, as the base for another indicator.

range

Where an interval describes a range or area, points usually describe position – with the caveat that size, shape, and color can all be modified based on other criteria.

Let’s shake things up. We’ll go with a dataset that returns score bands, like I described at the beginning of the article, and another dataset that returns a single score value.

   "data":
   [
      {
         "id":"scores",
         "fields":
         [
            {"id":"score"}
         ],
         "rows":
         [
            [460]
         ]
      },
      {
         "id":"zones",
         "fields":
         [
            {
               "id":"series",
               "label":"abc",
               "categories":
               [
                  "red",
                  "yellow",
                  "green"
               ]
            },
            {
               "id":"start",
               "label":"start"
            },
            {
               "id":"end",
               "label":"end"
            }
         ],
         "rows":
         [
            [0,160,359],
            [1,360,559],
            [2,560,760]
         ]
      }
   ]

With those two datasets, we need to find a way to build the ranges, and put a dot over the correct position.

In this case our grammar will contain two elements, an interval and a point. Remember how the interval can have start and end? This is easy. We can also apply a simple palette, changing based on the series.

               {
               "type":"interval",
               "position":
               [
                  { "field":{"$ref":"start"}},
                  { "field":{"$ref":"end"}},
                  { "value":"35%"}
               ],
               "color":
               [
                  {
                     "id":"_PALETTE_",
                     "field":
                     {
                        "$ref":"series"
                     },
                     "modifies":"both",
                     "palette":
                     [
                        "#BC202E",
                        "#F49020",
                        "#38B14A"
                     ]
                  }
               ],
               "style":
               {
                  "width":"8px"
               }
            },

The color attribute is an aesthetic: basically, conditional formatting. Almost any style can be changed on the fly; I’ll go into more detail on this in a later article. The style here defines the size of the bar. In this case, because it’s feeding a pixel-perfect report, it’s set at exactly 8 pixels wide. The big gotcha here is the “value”:”35%”. In this graph, even though there is a single bar, there are two dimensions. The “value”:”35%” simply instructs RAVE to push the element 35% down the scale, with 100% being the top and 0% being the bottom.

vert therm 1

This alone will create a nice bar. Now let’s get the dot on it.

             {
               "type":"point",
               "position":
               [
                  {"field":{"$ref":"score"}},
                  {"value":"35%"}
               ],
               "style":
               {
                  "width":"9px",
                  "height":"9px",
                  "outline":"white",
                  "fill":"black",
                  "stroke":
                  {
                     "width":"1px"
                  }
               },
               "label":
               [
                  {
                     "content":
                     [
                        {
                           "$ref":"score"
                        }
                     ],
                     "style":
                     {
                        "align":"start",
                        "font":
                        {
                           "weight":"bold",
                           "size":"24px",
                           "family":"Helvetica"
                        }
                     }
                  }
               ]
            }

This will create not only a dot, but a label. The dot is a black circle (the default symbol) with a white border. The label is defined to be aligned “start”, which in this case is left.
vert therm 2

This is close to the example I showed at the beginning, but there’s a lot more. In future posts I’ll write about summary functions (to get the beginning of the green range), sorting (so the report developer doesn’t need to sort it in the report), and transposing (so it’s horizontal instead of vertical).

We’ll also talk about parameters, so developers can choose their own colors, and the bundle definition.

Attached is the vizJSON for the above graph. You’ll need the RAVE customizer which can be found here. Can you figure out how to transpose the graph? How about finding the max start value and putting a triangle under the bar?

vertical-thermometer.txt (274 downloads)