Thinking in terms of data matrix
All data analytics tool work on a data matrix, i.e. they make available algorithms and operations already implemented to work on a data table structure. We have all experienced that, when working with a data analytics tool, we need to change our way of thinking from a sequential series of operations, like in a Java program, to matrix based operations. Indeed, you can probably implement any "for" cycle by means of appropriate matrix based operations. Going along this way of thinking, centered on the data table, looping should be used with parsimony and only as the last desperate resource.
Goal: performing a mathematical operation on ALL available numerical columns
However, there are situations when looping seems impossible to avoid. One of these seems to be the looping on columns. Let’s suppose that we want to perform some mathematical operation on all available numerical columns and let’s suppose that we have many of these numerical columns. The only way to go here seem to be looping across all columns and performing the same mathematical operation on all of them, as explained in Solution 1.
Solution 1: Looping on all columns (not thinking in terms of data matrix)
We start a loop on all columns with the “Column List Loop Start” node, we rename the current loop column with a general neutral temporary name, we perform the math operation on the column, we rename the current column with a new name (original name + a suffix, for example), and finally we append the new column to the previously processed columns.
Solution 2: Unpivoting instead of Looping
However, solution 1 is not thinking in terms of data matrix. There must be a way of avoiding using the loop nodes and making the workflow execution much faster!
The key node here is the “Unpivoting” node. The “Unpivoting” node rotates the selected columns into rows; that is it changes the data table as shown below.
The “Pivoting” node uses values in column RowIDs as group values, values in column ColumnNames as pivot values, and values in First(ColumnValues) as cell values.
The solution 2 workflow, using the combination of “Unpivoting”, “Math Formula”, and “Pivoting” node, is shown below.