Radial Vis Gadgets

The RadialVisGadgets package provides Shiny gadgets for interactive radial visualizations. Interacting with the gadgets supports exploratory data analysis, and they can be used at any point during an analysis. They allow you to explore the underlying nature of the data in tasks such as cluster analysis and outlier detection, e.g., by investigating the effect of specific dimensions on the separation of the data.

Star Coordinates
RadViz

Star Coordinates

The goal of Star Coordinates (SC) is to generate a configuration of the dimensional vectors that reveals the underlying nature of the data. Let’s look at the well-known Iris dataset [1].

library(RadialVisGadgets)
library(datasets)
data(iris) 

result <- StarCoordinates(iris)

Sepal.Length	Sepal.Width	Petal.Length	Petal.Width	Species
5.1	3.5	1.4	0.2	setosa
4.9	3.0	1.4	0.2	setosa
4.7	3.2	1.3	0.2	setosa
4.6	3.1	1.5	0.2	setosa
5.0	3.6	1.4	0.2	setosa
5.4	3.9	1.7	0.4	setosa

One can observe four numerical attributes and one factor. The traditional Star Coordinates approach is defined for numerical attributes only. Therefore, by default, the gadget attempts to convert all factors to numerical attributes. This can be disabled with numericRepresentation = FALSE, as described below.

Following the traditional approach [2], the five attributes (the four numerical attributes plus the converted Species factor) are placed at equal angle steps from each other.

Initial Configuration for Iris Dataset
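The equal-angle placement can be illustrated in base R. This is a minimal sketch, not the package's internal code: it assumes the factor is converted to its integer codes, that each attribute is rescaled to 0..1 (rather than mean-centered), and that the i-th axis is the unit vector at angle 2*pi*(i-1)/n. The helper name sc_project is hypothetical.

```r
# Minimal sketch of a Star Coordinates projection (not the package internals).
data(iris)

# Assumption: factors are converted to their integer codes.
X <- data.matrix(iris)

# Assumption: each column is rescaled to the range 0..1.
X01 <- apply(X, 2, function(v) (v - min(v)) / (max(v) - min(v)))

# One unit axis vector per attribute, placed at equal angle steps.
n <- ncol(X01)
theta <- 2 * pi * (seq_len(n) - 1) / n
axes <- cbind(x = cos(theta), y = sin(theta))   # n x 2 matrix
rownames(axes) <- colnames(X01)

# Hypothetical helper: each sample is the sum of the axis vectors,
# weighted by the sample's attribute values.
sc_project <- function(values, axes) values %*% axes

points2d <- sc_project(X01, axes)
head(points2d)
```

Moving an axis in the gadget corresponds to changing one row of axes in this sketch; the projected points are then recomputed with the same matrix product.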

You can move your mouse towards the ends of the dimensional vectors. The circle at the end of a vector will be highlighted, as you can see in the figure below.

Movement of dimensional axes

You can move these axes in order to create a configuration that you believe suitable and brush a selection of points.

Brushing over points

Buttons

Done: Once a selection has been made, press the Done button. The Star Coordinates gadget will return a list containing the projection matrix, a logical vector of the selection, and the projected points (in 2D).

Screenshot: Takes a screenshot of the configuration at its current state

Zoom in/out: Self-explanatory. Zooms in and out of the plot

Hints: Used when colorVar and clusterFunc are provided. Creates the hints for the current configuration.

# names(result)
# [1] "Proj.Matrix"      "Selection"        "Projected.Points"
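Because the gadget is interactive, the following usage sketch builds a stand-in list with the same three element names instead of calling StarCoordinates. The values inside the stand-in are made up for illustration, and it is an assumption that Selection holds one logical entry per row of the input data (iris has no missing rows, so no rows would be dropped).

```r
data(iris)

# Stand-in for the list returned by the gadget. Only the three element
# names come from the package output; the contents are invented here.
set.seed(1)
result <- list(
  Proj.Matrix      = matrix(rnorm(10), ncol = 2),
  Selection        = iris$Petal.Length < 2.5,
  Projected.Points = matrix(rnorm(2 * nrow(iris)), ncol = 2)
)

# Keep only the brushed samples and inspect them.
selected <- iris[result$Selection, ]
table(selected$Species)

# The same logical vector can subset the projected points.
selected_points <- result$Projected.Points[result$Selection, , drop = FALSE]
dim(selected_points)
```

With a real result, the brushed rows could be passed on to further analysis in the same way.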

Orthographic Star Coordinates Approach

Orthographic Star Coordinates (OSC) are supported by passing approach="OSC" to StarCoordinates. With this approach, the dimensional vectors are constrained under the conditions described by Lehmann & Theisel [3], and the axes are reconditioned with every movement. The interaction is the same as before.

StarCoordinates(iris, approach="OSC")

Initial OSC Configuration for Iris Dataset

Numeric Representation = FALSE

The traditional approach [2] was defined for numerical attributes only. However, [4] extended the approach to mixed datasets. The axes for the factor dimensions are divided according to the frequency of each categorical value within the categorical dimension. Given that the 3 species labels are uniformly distributed, 2 ticks appear, separating the 3 blocks of the categorical dimension.

Initial Configuration for Iris Dataset with non-numerical representation
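The frequency-based division of a factor axis can be illustrated with a few lines of base R. This is only a sketch of the idea for an axis of unit length, not the package's drawing code; the variable names are illustrative.

```r
data(iris)

# Relative frequency of each categorical value.
freq <- prop.table(table(iris$Species))

# Cumulative frequencies give the block boundaries along a unit-length axis.
# The last boundary (1) is the end of the axis, so only the interior
# boundaries are drawn as ticks.
bounds <- cumsum(freq)
ticks <- head(bounds, -1)
ticks
```

For iris, each species covers one third of the axis, so the two ticks fall at one third and two thirds of its length. A factor with unbalanced levels would produce blocks of different widths.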

By clicking on an axis, you can activate it. The categorical value blocks are then visible on the selected factor.

Selection of factor dimensions

By double-clicking on a categorical block, the value the block represents is highlighted. If another categorical block is then selected by double-clicking, the two blocks swap positions, which allows you to reorder the categorical values within a dimension. You can cancel a categorical selection by double-clicking a second time on the same block.

Selection of categorical value within factor

Labels in analysis

StarCoordinates(iris, colorVar="Species")

By passing the name of a factor dimension in colorVar, the analysis can be performed on labeled data.
The points are then coloured according to the selected dimension. Both the "Standard" and "OSC" approaches are available for labeled and unlabeled analysis.

Star Coordinates with labels

Hints

Hints describe possible movements and are available if a label and a function are provided. In that case, a button named Hint will appear. An increase in the value of the function is taken as an increase in projection quality, i.e., larger values are better. Details on the usage of hints are given in [4]. The thickness of the segments represents the increase in quality. In the figure below, the hints imply that moving Petal.Width down will result in an increase in quality. The absolute maximum increase in quality is shown in the Hint button, allowing for early termination. The hints are computed on demand only and are based on the current vector configuration. Once a movement is performed, the hints will disappear.

library(clValid)
func <- function(points, labels){ dunn(Data=points, clusters=labels)}
StarCoordinates(iris, colorVar="Species", clusterFunc = func)

Hints on Star Coordinates
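Any function that takes the projected points and the labels and returns a single number, with larger values meaning better, fits the description above. As an illustration that does not depend on clValid, here is a base-R sketch of the Dunn index. The name dunn_base is hypothetical, and it is an assumption that points arrives as a numeric matrix with one row per sample and labels as a vector of the same length, mirroring the signature of func above.

```r
# Hypothetical base-R quality function: Dunn index, i.e. the smallest
# between-cluster distance divided by the largest within-cluster distance.
dunn_base <- function(points, labels) {
  d <- as.matrix(dist(points))             # pairwise Euclidean distances
  labels <- as.integer(as.factor(labels))  # accept factors or characters
  same <- outer(labels, labels, "==")      # TRUE where two samples share a label
  off_diag <- row(d) != col(d)             # ignore each sample's distance to itself
  min_between <- min(d[!same])
  max_within  <- max(d[same & off_diag])
  min_between / max_within
}

# Two well separated groups score higher than two adjacent ones.
labels <- rep(c("a", "b"), each = 3)
group_a <- rbind(c(0, 0), c(1, 0), c(0, 1))
far  <- rbind(group_a, group_a + 10)  # second group shifted far away
near <- rbind(group_a, group_a + 1)   # second group shifted by one unit
dunn_base(far, labels) > dunn_base(near, labels)
```

A function like this could be supplied as clusterFunc in the same way as func, under the assumption stated above about the argument types.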

Notes On Data Processing

Missing Data: Only complete cases are used, i.e., rows with missing data are removed.
Zero Variance: Dimensions with zero or close to zero variance are removed.
Scaling: If the values are not mean centered, each dimension is scaled from 0..1.
Mean Centered: A normalization step as described in [5] is performed if meanCentered = TRUE (default).
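The first three steps can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: the helper name preprocess_sketch and the variance threshold var_tol are hypothetical, and the mean-centered normalization of [5] is not reproduced here.

```r
# Hypothetical helper illustrating the steps for the non-mean-centered case.
preprocess_sketch <- function(df, var_tol = 1e-8) {
  # Missing data: keep complete cases only.
  df <- df[complete.cases(df), , drop = FALSE]
  m <- data.matrix(df)  # factors become integer codes

  # Zero variance: drop columns whose variance is (close to) zero.
  keep <- apply(m, 2, var) > var_tol
  m <- m[, keep, drop = FALSE]

  # Scaling: rescale every remaining column to the range 0..1.
  apply(m, 2, function(v) (v - min(v)) / (max(v) - min(v)))
}

toy <- data.frame(
  a = c(1, 2, NA, 4),     # one missing value, so row 3 is removed
  b = c(5, 5, 5, 5),      # constant column, removed
  c = c(10, 20, 30, 40)
)
out <- preprocess_sketch(toy)
out
```

In this toy example, three rows and the columns a and c remain, each scaled so that its minimum is 0 and its maximum is 1.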

[1] Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188.
[2] Kandogan, E. (2001, August). Visualizing multi-dimensional clusters, trends, and outliers using star coordinates. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 107-116).
[3] Lehmann, D. J., & Theisel, H. (2013). Orthographic star coordinates. IEEE Transactions on Visualization and Computer Graphics, 19(12), 2615-2624.
[4] Matute, J., & Linsen, L. (2020). Hinted Star Coordinates for Mixed Data. In Computer Graphics Forum (Vol. 39, No. 1, pp. 117-133).
[5] Rubio-Sánchez, M., & Sanchez, A. (2014). Axis calibration for improving data attribute estimation in star coordinates plots. IEEE Transactions on Visualization and Computer Graphics, 20(12), 2013-2022.

RadViz

RadViz’s goal is to generate a configuration which reveals the underlying nature of the data for cluster analysis, outlier detection, and exploratory data analysis, e.g., by investigating the effect of specific dimensions on the separation of the data.
Each dimension is assigned to a point, known as a dimensional anchor, on the unit circle. Each sample is projected according to its relative attraction to each of the anchors. We continue with the iris dataset.

irisWOCat <- iris
irisWOCat["Species"] <- NULL 
RadViz(irisWOCat)

Initial Configuration
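The attraction-based projection can be illustrated in base R. This is a minimal sketch of the common RadViz formulation, not the package's internal code: it assumes each column is rescaled to 0..1 and that the anchors start at equal angle steps on the unit circle. All variable names are illustrative.

```r
data(iris)
X <- as.matrix(iris[, 1:4])  # numerical attributes only

# Assumption: each column is rescaled to 0..1 so that all weights are
# non-negative.
X01 <- apply(X, 2, function(v) (v - min(v)) / (max(v) - min(v)))

# Dimensional anchors at equal angle steps on the unit circle.
n <- ncol(X01)
theta <- 2 * pi * (seq_len(n) - 1) / n
anchors <- cbind(x = cos(theta), y = sin(theta))
rownames(anchors) <- colnames(X01)

# Each sample is a weighted average of the anchors. The weights are the
# sample's attribute values divided by their sum, so a row whose values
# are all zero would have no defined position (iris has no such row).
w <- X01 / rowSums(X01)
points2d <- w %*% anchors
head(points2d)
```

Because every sample is a weighted average of the anchors, all projected points fall inside the unit circle. The division by the row sum makes the mapping non-linear in the attribute values.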

RadViz is not defined for non-numerical dimensions, and given its non-linear behavior in generating the projection, it would be “even more” misleading to convert the factors to numeric. As with Star Coordinates, we can interact with the plot in order to change the projection. The anchors, represented by the circles, can be moved around the unit circle.

Moving Anchors

However, a factor dimension can still be used for coloring the points according to a label. This can be done by supplying the name of the column as the color variable.

RadViz(iris, "Species")

RadViz with labels

Buttons

Done: Once a selection has been made, press the Done button. The RadViz gadget will return a list containing the anchor locations, a logical vector of the selection, and the projected points (in 2D).

Screenshot: Takes a screenshot of the configuration at its current state

Zoom in/out: Self-explanatory. Zooms in and out of the plot

Notes On Data Processing

Missing Data: Only complete cases are used, i.e., rows with missing data are removed.
Zero Variance: Dimensions with zero or close to zero variance are removed.
Scaling: The values of each dimension are scaled from 0..1.

Sharko, J., Grinstein, G., & Marx, K. A. (2008). Vectorized radviz and its application to multiple cluster datasets. IEEE Transactions on Visualization and Computer Graphics, 14(6), 1444-1451.

Jose Matute