In the previous post (http://dataminingreporting.weebly.com/2/post/2011/03/how-to-create-crosstabs-in-knime.html) we created crosstabs with their totals as absolute counts. Now we want to keep the same structure but express the numbers as percentages.

The "Pivoting" node has no percentage output. It offers only two options: counting the co-occurrences of values or calculating an aggregation, such as the sum or the mean. No "%" aggregation is listed among the possible aggregations.

Let's expand the workflow that was used to create the crosstabs, reported here.

The total number of rows of the original data table, counted by a "GroupBy" node, is then transformed into a workflow variable and used to calculate the percent values from the absolute values on the output port of the "Pivoting" node.

As with counting the total number of rows, the ratio values can be calculated with a number of different nodes or groups of nodes, mainly the "Math Formula" node or the "Java Snippet" node. We chose a "Java Snippet" node that returns a double array, with the following code lines:

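// One ratio per pivot column: A, B, and C.
Double[] pct = new Double[3];

// Converting each count to Double keeps the division from being an integer division.
pct[0] = new Double($A$) / $${ICount(class)}$$;

pct[1] = new Double($B$) / $${ICount(class)}$$;

pct[2] = new Double($C$) / $${ICount(class)}$$;

return pct;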

$${ICount(class)}$$ is the workflow variable obtained from the "GroupBy" node; it contains the total number of rows of the original data table.
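To make the arithmetic easier to check outside KNIME, here is a minimal stand-alone sketch of the same calculation in plain Java. The class name PivotToRatio, the method toRatios, and the sample counts are hypothetical and only serve as illustration; inside the workflow the counts come from the columns A, B, and C and the total comes from the workflow variable.

```java
class PivotToRatio {

    // Divides each count of one pivot-table row by the total number of rows.
    // Returns one ratio per pivot column, in the same order as the input.
    static double[] toRatios(int[] counts, int totalRows) {
        if (totalRows <= 0) {
            throw new IllegalArgumentException("totalRows must be positive");
        }
        double[] ratios = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            // Cast first so the division is floating-point, not integer.
            ratios[i] = (double) counts[i] / totalRows;
        }
        return ratios;
    }

    public static void main(String[] args) {
        // Hypothetical pivot row: counts for columns A, B, and C,
        // out of 20 rows in the original data table.
        int[] counts = {5, 10, 2};
        // prints [0.25, 0.5, 0.1]
        System.out.println(java.util.Arrays.toString(toRatios(counts, 20)));
    }
}
```

The explicit cast plays the same role as new Double(...) in the snippet: without it, dividing two integers would truncate every ratio to 0.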

The ratio components in the double array are then extracted by means of a "Split Collection Column" node and all the other columns are filtered out.

The "Rename" node then assigns the names of the original columns (A, B, and C) to the ratio components.

The figure below shows the sub-workflow contained in the metanode named "Pivot table to %".

Notice that at this point the final data table does not contain the percentage values, but just the ratios. However, we can choose to represent the ratios as percentages by right-clicking the column headers in the data table view and choosing the appropriate Renderer (see figure below).
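The difference between the stored ratio and its percentage rendering can be illustrated in plain Java. This is only an analogy for what a renderer does, not KNIME's own implementation; the class name RatioRendering and the method asPercent are hypothetical.

```java
class RatioRendering {

    // Formats a ratio for display only; the numeric value is not changed.
    static String asPercent(double ratio) {
        java.text.NumberFormat fmt =
            java.text.NumberFormat.getPercentInstance(java.util.Locale.US);
        fmt.setMaximumFractionDigits(1);
        return fmt.format(ratio);
    }

    public static void main(String[] args) {
        double ratio = 0.25;                  // value stored in the table
        System.out.println(ratio);            // prints 0.25
        System.out.println(asPercent(ratio)); // prints 25%
    }
}
```

As with the renderer, any node further down the workflow would still receive 0.25, not 25.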

If you want the output data to always contain percentages, rather than ratios rendered as percentages, add a *100 to the calculations in the "Java Snippet" node.
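A minimal plain-Java sketch of that scaled calculation, assuming integer counts and a positive total; the class name PivotToPercent, the method toPercent, and the sample numbers are hypothetical.

```java
class PivotToPercent {

    // Same division as in the Java Snippet, scaled by 100 so that the stored
    // value is already a percentage (25.0 rather than 0.25).
    static double toPercent(int count, int totalRows) {
        // 100.0 is a double, so the whole expression is floating-point.
        return 100.0 * count / totalRows;
    }

    public static void main(String[] args) {
        // Hypothetical counts: 5 and 10 rows out of 20.
        System.out.println(toPercent(5, 20));  // prints 25.0
        System.out.println(toPercent(10, 20)); // prints 50.0
    }
}
```

With values stored this way, the percentage renderer should no longer be applied, since it would multiply by 100 a second time.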