Preferring either t-SNE or UMAP over the other is difficult to justify. There is no evidence per se that the UMAP algorithm has any advantage over t-SNE in terms of preserving global structure.
Both algorithms should be used cautiously, and with an informative initialization by default.
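As a rough illustration of what an informative initialization can look like, the sketch below computes a PCA-based starting layout with plain NumPy. The helper name `pca_init`, the synthetic data, and the rescaling to a standard deviation of `1e-4` are assumptions made for this example, not something prescribed by the text above.

```python
import numpy as np

def pca_init(X, n_components=2, target_std=1e-4):
    """Project X onto its top principal components and rescale the result.

    Hypothetical helper for illustration; target_std=1e-4 is an assumed value.
    """
    Xc = X - X.mean(axis=0)                       # center each feature
    # SVD of the centered data: rows of Vt are the principal axes,
    # ordered by decreasing singular value
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Y = Xc @ Vt[:n_components].T                  # coordinates on the top axes
    # Shrink the layout so the first coordinate has a small spread
    return Y / np.std(Y[:, 0]) * target_std

rng = np.random.default_rng(0)
# Synthetic data whose features have decreasing variance
X = rng.normal(size=(200, 10)) * np.linspace(5.0, 0.5, 10)
init = pca_init(X)
print(init.shape, np.std(init, axis=0))
```

An array like `init` can then be handed to the embedding algorithm instead of a random start: `openTSNE.TSNE` takes it through its `initialization` argument and `umap.UMAP` through `init`. Both also accept string shortcuts, so it is worth checking the documentation of the installed versions before computing the layout by hand.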
In all embeddings, distances between clusters of points can be completely meaningless. It is often impossible to represent complex topologies in 2 dimensions, and embeddings should be approached with the utmost care when attempting to interpret their layout.
The only certainty is that points lying close together in the embedding are similar to each other.
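One way to check how far local neighbourhoods can be trusted is to measure how many nearest neighbours survive the embedding. The following is a minimal sketch, assuming Euclidean distances and a brute-force neighbour search; the function names `knn_indices` and `knn_preservation` and the toy "embeddings" are made up for this example.

```python
import numpy as np

def knn_indices(Z, k):
    """Indices of the k nearest neighbours of every row of Z (Euclidean, self excluded)."""
    sq = np.sum(Z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T   # squared pairwise distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]

def knn_preservation(X, Y, k=10):
    """Mean fraction of each point's k neighbours in X that are also its neighbours in Y."""
    nx, ny = knn_indices(X, k), knn_indices(Y, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(nx, ny)]
    return float(np.mean(overlap)) / k

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))          # stand-in for high-dimensional data
Y_same = X.copy()                       # perfect "embedding": neighbourhoods unchanged
Y_random = rng.normal(size=(300, 2))    # layout unrelated to X

score_same = knn_preservation(X, Y_same)
score_random = knn_preservation(X, Y_random)
print(score_same, score_random)
```

A score close to 1 means neighbourhoods are kept, while a score near `k / (n - 1)` is what an unrelated layout would give by chance. The metric says nothing about distances between clusters, which is consistent with the caution above.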
These methods don't work well when the intrinsic dimensionality of the data is higher than 2.
High-dimensional data sets typically have a lower intrinsic dimensionality $d \ll D$; however, $d$ may still be larger than 2, and preserving distances faithfully in a 2D embedding might not always be possible.
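To make the $d \ll D$ situation concrete, the sketch below builds synthetic data that lives on a 5-dimensional linear subspace of a 50-dimensional space and estimates $d$ from the PCA spectrum. This is only a linear proxy: for curved manifolds PCA tends to overestimate the intrinsic dimensionality. The function name `linear_intrinsic_dim` and the 95% variance threshold are assumptions chosen for the example.

```python
import numpy as np

def linear_intrinsic_dim(X, var_threshold=0.95):
    """Smallest number of principal components explaining var_threshold of the variance."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)           # singular values, descending
    explained = np.cumsum(s**2) / np.sum(s**2)        # cumulative explained variance
    # first index at which the cumulative share reaches the threshold, as a count
    return int(np.searchsorted(explained, var_threshold) + 1)

rng = np.random.default_rng(0)
D, d, n = 50, 5, 1000
latent = rng.normal(size=(n, d))                      # true low-dimensional coordinates
Q, _ = np.linalg.qr(rng.normal(size=(D, d)))          # orthonormal basis of a d-dim subspace
X = latent @ Q.T + 0.01 * rng.normal(size=(n, D))     # embed in D dims, add small noise

d_hat = linear_intrinsic_dim(X)
print("ambient D =", D, "estimated d =", d_hat)
```

Here the data is low-dimensional compared with its ambient space, yet the estimate is still larger than 2, which is exactly the case where a 2D embedding has to distort some distances.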
When using either UMAP or t-SNE, one must take care not to overinterpret the structure of the embedding or the distances within it.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import cm
import seaborn as sns

# Notebook display settings: high-resolution figures on a white background
%config InlineBackend.figure_format = 'retina'
%config InlineBackend.print_figure_kwargs = {'facecolor': "w"}

import openTSNE, umap

# Report the library versions used in this notebook
print('openTSNE', openTSNE.__version__)
print('umap', umap.__version__)