Efficiently storing pairwise distances in a big matrix in R
I am looking for an efficient way to store pairwise distances in a large
distance matrix. Suppose I have a function dist_pair(g.idx) that takes a
vector of gene indices g.idx as input, e.g. g.idx=c(1,2) for gene 1 and
gene 2, and returns the distance between those two genes.
I have 10,000 genes, and the matrix dist_mat should hold all pairwise
distances so that it can be passed to a hierarchical clustering function
such as hclust in R. Note that only choose(10000,2) values need to be
stored in dist_mat. Is there an efficient way to do this? I am worried
that for loops in R will be too slow, and I am looking for a better
approach. Thanks!
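One possible direction, as a minimal sketch rather than a definitive answer: hclust accepts a "dist" object, which is just a numeric vector of the choose(n,2) lower-triangle values plus a few attributes, so the full n x n matrix never has to be built. In the sketch below, the toy matrix expr and the body of dist_pair are assumptions standing in for the real data and the real function; only the name dist_pair and its g.idx argument come from the question. The result vector is preallocated and filled one gene at a time, in the order a "dist" object expects.

```r
# Toy data standing in for the real genes (assumption): 6 genes, 4 values each.
set.seed(1)
n <- 6
expr <- matrix(rnorm(n * 4), nrow = n)

# Hypothetical stand-in for the dist_pair(g.idx) described in the question.
dist_pair <- function(g.idx) {
  sqrt(sum((expr[g.idx[1], ] - expr[g.idx[2], ])^2))
}

# Preallocate one slot per unordered pair; no n x n matrix is created.
d_vec <- numeric(choose(n, 2))
pos <- 0
for (i in seq_len(n - 1)) {
  js <- (i + 1):n
  # Distances from gene i to every later gene j > i. This is the order in
  # which a "dist" object stores its lower triangle (column by column).
  d_vec[pos + seq_along(js)] <- vapply(
    js, function(j) dist_pair(c(i, j)), numeric(1)
  )
  pos <- pos + length(js)
}

# Wrap the vector as a "dist" object so hclust can consume it directly.
dist_mat <- structure(d_vec, Size = n, Diag = FALSE, Upper = FALSE,
                      class = "dist")

hc <- hclust(dist_mat)
```

With n = 10000 the vector holds 49,995,000 doubles, roughly 400 MB. Because the vector is preallocated, the outer loop itself adds little overhead; most of the running time is likely to be the calls to dist_pair, so vectorising that function over many pairs at once (if its definition allows) would probably help more than removing the loop.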