It is known that the curvature of a data space plays a role in data analysis. For example, the Fréchet mean (intrinsic mean) of a probability measure on a non-positively curved metric space always exists and is unique. In this paper, we use the curvature of data spaces in a novel manner. We develop a methodology for data analysis based on empirically constructed geodesic metric spaces. In the population version, the distance between two points is defined by the amount of probability mass accumulated in travelling between them, and a geodesic metric arises from taking the shortest path. Such metrics are then transformed in a number of ways to produce families of geodesic metric spaces. Empirical versions of the geodesics allow computation of intrinsic means and associated measures of dispersion. One version of the empirical geodesic is based on metric graphs computed from the sample points. For certain parameter ranges the spaces become CAT(0) spaces and the intrinsic means are unique. In the graph case, a minimal spanning tree obtained as a limiting case is CAT(0). In other cases, the aggregate squared distance from a test point has local minima that yield information about clusters. This is particularly relevant for metrics based on so-called metric cones, which allow extensions to CAT(κ) spaces. We illustrate the methods on real data. This paper is a summary of a longer version [5], which contains the proofs of the theorems and further details.
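
As a rough illustration of the graph-based empirical geodesic and the intrinsic mean described above, the following minimal sketch builds a neighbourhood graph on sample points, computes shortest-path distances, and evaluates the aggregate squared distance at each sample point. The neighbourhood rule (joining points within a fixed `radius`), the restriction of candidate means to the sample points, and all function names are assumptions made for illustration; they are not the construction of the paper or of [5].

```python
import math


def graph_geodesics(points, radius):
    """Shortest-path distances on a graph that joins sample points
    whose Euclidean distance is at most `radius` (assumed rule)."""
    n = len(points)
    # Edge weights: Euclidean length if the edge is kept, infinity otherwise.
    geo = [
        [
            math.dist(points[i], points[j])
            if math.dist(points[i], points[j]) <= radius
            else math.inf
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Floyd-Warshall: relax every pair through every intermediate vertex.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = geo[i][k] + geo[k][j]
                if via_k < geo[i][j]:
                    geo[i][j] = via_k
    return geo


def aggregate_squared_distance(geo):
    """Sum of squared geodesic distances from each vertex to all vertices."""
    return [sum(d * d for d in row) for row in geo]


def empirical_intrinsic_mean(points, radius):
    """Index of the sample point minimising the aggregate squared distance.
    Restricting the search to sample points is a simplifying assumption."""
    scores = aggregate_squared_distance(graph_geodesics(points, radius))
    return min(range(len(points)), key=scores.__getitem__)


# Hypothetical L-shaped sample: the graph geodesic follows the bend.
sample = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
geodesics = graph_geodesics(sample, radius=1.1)
mean_index = empirical_intrinsic_mean(sample, radius=1.1)
```

On this toy sample the geodesic distance between the two end points is measured along the bend rather than straight across, and the minimiser of the aggregate squared distance is the corner point. When the graph is not a tree, the same score can have several local minima over the vertices, which is the sense in which the abstract relates local minima to clusters.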
