幂迭代法应用:pivotMDS--大图数据的稀疏化布局

Posted fahaizhong

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了幂迭代法应用:pivotMDS--大图数据的稀疏化布局相关的知识,希望对你有一定的参考价值。

论文解读

首先pivotMDS是通过p个锚点,做p次单源最短路,得到一个n*p(d_{ij})矩阵。通过这样一个n*p(d_{ij})计算得到一个C矩阵,(C in R^{n imes k})
C矩阵计算方式:
[ egin{equation} C_{ij} = -frac{1}{2} (delta_{ij}^2 - frac{1}{n} sum_{r=1}^{n}delta_{rj}^2 - frac{1}{k} sum_{s=1}^{k}delta_{is}^2 + frac{1}{nk} sum_{r=1}^{n}sum_{s=1}^{k}delta_{rs}^2) end{equation} ]
[ B = C^TC ]
[ v_1 = poweriteration(B)_{max} ]
[ v_2 = poweriteration(B)_{sec} ]
[ x = Cv_1, y=Cv_2 ]
其中poweriteration由上一节幂迭代法求特征值的方法求得最大和次大特征向量。

代码

读取图数据

#
def readGraph(graph_name,fmt):
    if 'mat' in fmt:
        mat_data = io.loadmat(graph_name + '.mat')
        graph = mat_data['Problem']['A'][0][0]
        return graph
    elif 'txt' in fmt or 'csv' in fmt:
        df = pd.read_csv(graph_name + '.txt',sep=' ',names=["src","tgt"])
#         df["data"] = np.ones_like(df["src"].values)
        row = df["src"].values
        col = df["tgt"].values
        data = np.ones_like(row)
        graph = csc_matrix((data, (row, col)), shape=(max(row.max(),col.max()) + 1, max(row.max(),col.max()) + 1)) 
        return graph
    else:
        print("error:not supported format.")
        return None
graph_name = '../data/bus1138'
graph = readGraph(graph_name,'txt')

数据可以从tamu稀疏矩阵数据库,其中mat文件为matlab格式文件,txt为空格分隔的文本文件,共两列,分别表示连接一条边的两个点。

计算C矩阵和B矩阵(C^TC)

pivot_Point = np.random.randint(graph.shape[0],size = NP)
d = csgraph.shortest_path(graph, directed=False, unweighted=True,indices=pivot_Point)
d[np.isinf(d)] = 0
(k,n) = d.shape
d2  = d**2
deltaIS = d2.sum(axis=0)/k
deltaRJ = d2.sum(axis=1)/n
sumALL = d2.sum()/(n*k)
C = np.zeros((n,NP),dtype=np.float64)

@jit(nopython=True) 
def getC(C):
    for i in range(n):
        for j in range(k):
            C[i][j] = -1.0/2 * (d[j][i]**2 - deltaRJ[j] - deltaIS[i] + sumALL)
    return C
C = getC(C)
B = np.dot(C.T,C)

幂迭代法

def power_iteration(A, num_simulations: int):
    b_k = np.random.rand(A.shape[1])
    
    for _ in range(num_simulations):
        b_k1 = np.dot(A, b_k)
        b_k1_norm = np.linalg.norm(b_k1)
        b_k = b_k1 / b_k1_norm
 
    return b_k

V_1 = power_iteration(B, 100).reshape(1,-1)
lbd  = np.dot(V_1,np.dot(B,V_1.T)) 
B_2 = B - lbd / np.linalg.norm(V_1)**2 * np.dot(V_1.T,V_1)
V_2 = power_iteration(B_2, 100)

结果

pos = np.zeros((n,2))
pos[:,0] = np.dot(C,V_1.reshape(-1,1)).reshape(-1)
pos[:,1] = np.dot(C,V_2.reshape(-1,1)).reshape(-1)


X = pos.copy()
plt.figure(figsize=(32,32))
plt.axis('equal')
ax = plt.axes()
ax.set_xlim(min(X[:,0])-10, max(X[:,0])+10)
ax.set_ylim(min(X[:,1])-10, max(X[:,1])+10)
# ax.plot(X[:,0],X[:,1],'*',color='b',markersize=0.4)
lines = []
for i,j in zip(*graph.nonzero()):
    lines.append([X[i], X[j]])

lc = mc.LineCollection(lines, linewidths=.3, colors='#0000007f')
ax.add_collection(lc)
plt.savefig('../result/' + graph_name + '.pdf', dpi=1000)
plt.show()

最终效果如下:

技术图片
欢迎访问个人博客。转载请注明出处。

以上是关于幂迭代法应用:pivotMDS--大图数据的稀疏化布局的主要内容,如果未能解决你的问题,请参考以下文章

迭代稀疏矩阵并连接每一行的数据和索引

@算法 - 5@ 牛顿迭代法的应用——多项式开方,对数,指数,三角与幂函数

poj 3735 稀疏矩阵矩阵快速幂

poj 3735 Training little cats 矩阵快速幂+稀疏矩阵乘法优化

数据分箱

特征离散化的意义