Visualisation of hierarchical multivariate data: Categorisation and case study on hate speech
Ecem Kavaz, Anna Puig, Inmaculada Rodríguez, Reyes Chacón, David De-La-Paz, Adrià Torralba-Agell, Montserrat Nofre, and Mariona Taule
Journal of Information Visualization, 2023
Multivariate hierarchical data plays an important role in many applications. Finding the visualisation that best fits a given dataset is crucial for exploring and understanding the relationships within the data. This paper proposes a categorisation of hierarchical data into Elongated and Compact structures, based on the inner shape of the hierarchy (such as the connectivity degree of the internal nodes and the number of nodes), that can be applied to any hierarchical data. Based on this taxonomy, we explore implicit and explicit layouts (Tree, Circle Packing, Force and Radial) to provide users with a complete view of the data. We hypothesise that Tree and Circle Packing fit Elongated structures, and that Force and Radial fit Compact ones. In addition, we cluster multivariate features to embed them in the hierarchical layouts. Specifically, we propose two glyphs, one-by-one and all-in-one, and we hypothesise that the one-by-one glyph is the more suitable for showing the distribution of several features along with the hierarchical structure. To validate our hypotheses, we conducted a user study with 35 participants using a hate speech annotated corpus built from 4359 comments posted in online Spanish newspapers. The results indicate that users preferred the Tree layout over the other three layouts (Circle Packing, Force and Radial) with both types of structure, Elongated and Compact (EC and CC). However, when we restricted the analysis to the Radial and Force layouts, both scored significantly higher with Compact than with Elongated data. Moreover, participants scored the one-by-one glyph higher than the all-in-one glyph, but the difference was not significant.