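As I currently understand it, DTW (dynamic time warping) measures how similar two sequences of unequal length are. I haven't yet worked out how to turn the distance into a similarity score, but for now here is the code that computes the distance between the two sequences: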
import numpy as np
from sklearn.metrics.pairwise import manhattan_distances


def dtw_distance(ts_a, ts_b, d=lambda x, y: manhattan_distances([[x]], [[y]])[0, 0], mww=10000):
    """
    Computes the DTW distance between two time series.
    Args:
        ts_a: time series a
        ts_b: time series b
        d: distance function (Manhattan distance here, i.e. the absolute difference)
        mww: max warping window, int, optional (default = 10000, effectively unbounded)
    Returns:
        the cost matrix and the DTW distance
    """
    # Create the cost matrix filled with inf so cells outside the window are never chosen
    ts_a, ts_b = np.array(ts_a), np.array(ts_b)
    M, N = len(ts_a), len(ts_b)
    cost = np.full((M, N), np.inf)
    # Initialize the first row and column
    cost[0, 0] = d(ts_a[0], ts_b[0])
    for i in range(1, M):
        cost[i, 0] = cost[i - 1, 0] + d(ts_a[i], ts_b[0])
    for j in range(1, N):
        cost[0, j] = cost[0, j - 1] + d(ts_a[0], ts_b[j])
    # Populate the rest of the cost matrix within the window
    for i in range(1, M):
        for j in range(max(1, i - mww), min(N, i + mww)):
            choices = cost[i - 1, j - 1], cost[i, j - 1], cost[i - 1, j]
            cost[i, j] = min(choices) + d(ts_a[i], ts_b[j])
    # Return the cost matrix and the DTW distance for the given window
    return cost, cost[-1, -1]
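For reference, the loops above can be read as the usual DTW recurrence. The notation here is my own shorthand rather than anything from the code: $D$ stands for `cost`, $a$ and $b$ for `ts_a` and `ts_b`, $w$ for `mww`, and indices are 0-based.

```latex
\begin{aligned}
D(0,0) &= d(a_0, b_0) \\
D(i,0) &= D(i-1,0) + d(a_i, b_0), && 1 \le i < M \\
D(0,j) &= D(0,j-1) + d(a_0, b_j), && 1 \le j < N \\
D(i,j) &= d(a_i, b_j) + \min\bigl\{ D(i-1,j-1),\; D(i,j-1),\; D(i-1,j) \bigr\},
  && 1 \le i < M,\ \max(1,\, i-w) \le j < \min(N,\, i+w)
\end{aligned}
```

The returned distance is $D(M-1, N-1)$, the bottom-right cell of the matrix.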
Computing the distance between the two sequences [0, 0, 1, 3, 4, 5, 5, 5, 6, 6] and [0, 1, 2, 3, 4, 5, 15, 5, 6, 6] gives 11:
dtw_distance([0, 0, 1, 3, 4, 5, 5, 5, 6, 6],[0, 1, 2, 3, 4, 5, 15, 5, 6, 6])
(array([[ 0., 1., 3., 6., 10., 15., 30., 35., 41., 47.],
[ 0., 1., 3., 6., 10., 15., 30., 35., 41., 47.],
[ 1., 0., 1., 3., 6., 10., 24., 28., 33., 38.],
[ 4., 2., 1., 1., 2., 4., 16., 18., 21., 24.],
[ 8., 5., 3., 2., 1., 2., 13., 14., 16., 18.],
[13., 9., 6., 4., 2., 1., 11., 11., 12., 13.],
[18., 13., 9., 6., 3., 1., 11., 11., 12., 13.],
[23., 17., 12., 8., 4., 1., 11., 11., 12., 13.],
[29., 22., 16., 11., 6., 2., 10., 11., 11., 11.],
[35., 27., 20., 14., 8., 3., 11., 11., 11., 11.]]),
11.0)
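On the open question of turning the distance into a similarity: one common convention, shown here only as a sketch and not as a settled answer, maps a non-negative distance to the interval (0, 1] with 1 / (1 + distance). The function name `distance_to_similarity` and the `scale` parameter are hypothetical names introduced for illustration.

```python
def distance_to_similarity(distance, scale=1.0):
    """Map a non-negative distance to a similarity in (0, 1].

    A distance of 0 gives similarity 1; larger distances approach 0.
    `scale` (an assumed tuning knob) sets the distance at which the
    similarity drops to 0.5.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if scale <= 0:
        raise ValueError("scale must be positive")
    return 1.0 / (1.0 + distance / scale)


# Identical sequences (distance 0) are maximally similar.
print(distance_to_similarity(0.0))
# The DTW distance of 11 from the example above, with the default scale.
print(distance_to_similarity(11.0))
```

Because DTW distances grow with sequence length, a sensible `scale` probably depends on the data, for example a typical distance between unrelated sequences.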