Graduation Project Notes
1. Cosine similarity
https://zh.wikipedia.org/wiki/%E4%BD%99%E5%BC%A6%E7%9B%B8%E4%BC%BC%E6%80%A7
2. L1 and L2 norm
http://www.chioka.in/differences-between-l1-and-l2-as-loss-function-and-regularization/
3. pathlib
https://docs.python.org/3/library/pathlib.html
4. partial()
https://zhuanlan.zhihu.com/p/47124891
5. multiprocessing
https://docs.python.org/3/library/multiprocessing.html
6. tqdm
Displays a progress bar.
https://tqdm.github.io/docs/tqdm/
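A minimal usage sketch (time.sleep stands in for real work; the desc label is arbitrary):

from tqdm import tqdm
import time

# Wrapping any iterable in tqdm() renders a live progress bar.
for _ in tqdm(range(100), desc="processing"):
    time.sleep(0.01)  # stand-in for real work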
7. join
https://www.w3schools.com/python/ref_string_join.asp
8. typing
https://docs.python.org/3/library/typing.html
9. librosa
https://librosa.org/doc/latest/index.html
10. normalize_volume
https://dsp.stackexchange.com/questions/8785/how-to-compute-dbfs
11. mel spectrogram
https://zhuanlan.zhihu.com/p/421460202
https://wenku.baidu.com/view/8a41d8a280d049649b6648d7c1c708a1284a0a26.html
12. torch.rand vs. torch.randn
rand() samples from the uniform distribution on [0, 1).
randn() samples from the standard normal distribution (mean 0, variance 1).
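A quick illustration of the difference:

import torch

u = torch.rand(2, 3)   # uniform samples in [0, 1)
n = torch.randn(2, 3)  # standard normal samples (mean 0, std 1)
print(u.min() >= 0, u.max() < 1)  # tensor(True) tensor(True)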
13. In PyTorch, a trailing underscore (e.g. add_) indicates that the operation is performed in place.
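For example:

import torch

x = torch.ones(3)
y = x.add(1)   # out-of-place: returns a new tensor, x unchanged
x.add_(1)      # in-place: modifies x itself
print(x)       # tensor([2., 2., 2.])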
14. PyTorch linear layer (nn.Linear)
input: vector of shape (m,)
output: vector of shape (n,)
weight matrix: (m, n) in the math view, so out = input @ weight_matrix
(Note that nn.Linear(m, n) actually stores its weight with shape (n, m) and computes y = x @ W.T + b.)
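A minimal shape check (the sizes m=4, n=3 are arbitrary):

import torch
import torch.nn as nn

m, n = 4, 3
layer = nn.Linear(m, n)
print(layer.weight.shape)  # torch.Size([3, 4]) -- stored as (n, m)

x = torch.randn(m)
out = layer(x)             # computes x @ layer.weight.T + layer.bias
print(out.shape)           # torch.Size([3])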
15. biLSTM (bidirectional LSTM)
https://www.jianshu.com/p/471bdbd0170d
https://zhuanlan.zhihu.com/p/40119926
16. PyTorch nn.Embedding
https://www.jianshu.com/p/63e7acc5e890
17. Highway network
Used to increase network depth and improve the performance of deep networks.
https://en.wikipedia.org/wiki/Highway_network
https://zhuanlan.zhihu.com/p/35019701
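A minimal sketch of a single highway layer, following y = T(x) * H(x) + (1 - T(x)) * x; the class name HighwayLayer and the ReLU nonlinearity are illustrative choices, not taken from a particular implementation:

import torch
import torch.nn as nn

class HighwayLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.H = nn.Linear(dim, dim)  # nonlinear transform
        self.T = nn.Linear(dim, dim)  # transform gate

    def forward(self, x):
        h = torch.relu(self.H(x))
        t = torch.sigmoid(self.T(x))  # gate values in (0, 1)
        return t * h + (1 - t) * x    # blend transform and identity path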
18. RNN's input_size and hidden_size
input:
(bs, len, input_size), assuming batch_first=True
output:
(bs, len, hidden_size), assuming batch_first=True
input_size is just the dimension of a single input element: for example, if each character is embedded into a vector with dim == 512, the input is (bs, len, embed_dim), and that embed_dim is the input_size.
hidden_size is just the dimension of the hidden state.
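A shape check using the 512-dim embedding example above (batch size and sequence length are arbitrary):

import torch
import torch.nn as nn

bs, seq_len, embed_dim, hidden = 8, 20, 512, 256
rnn = nn.LSTM(input_size=embed_dim, hidden_size=hidden, batch_first=True)

x = torch.randn(bs, seq_len, embed_dim)   # (bs, len, input_size)
out, (h, c) = rnn(x)
print(out.shape)                          # torch.Size([8, 20, 256])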
19. LSA (Location-Sensitive Attention)
https://paperswithcode.com/method/location-sensitive-attention
https://www.zhihu.com/question/68482809/answer/264632289
https://cloud.tencent.com/developer/article/1614072
20. zoneout
Zoneout is a "dropout" along the RNN's time dimension: during training, each hidden unit either keeps its value from the previous time step or is updated as usual. It is not a separate kind of cell, but a training-time trick. Dropout is the general deep-learning technique of randomly deactivating neurons during training to improve generalization and suppress overfitting; zoneout instead randomly "deactivates" the RNN cell's update, skipping a step for the affected units.
https://www.zhihu.com/question/332535296/answer/733037609
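A minimal sketch of the zoneout update for one time step, assuming a per-unit keep probability p; the standalone function zoneout is made up for illustration (in practice this logic lives inside the RNN cell's step):

import torch

def zoneout(h_prev, h_new, p: float, training: bool):
    # With probability p each unit keeps its previous value;
    # otherwise it takes the freshly computed one.
    if training:
        mask = torch.bernoulli(torch.full_like(h_prev, p))
        return mask * h_prev + (1 - mask) * h_new
    # At eval time, use the expectation of the stochastic update.
    return p * h_prev + (1 - p) * h_new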
21. WaveRNN
https://www.jianshu.com/p/b3019f2773ed
https://zhuanlan.zhihu.com/p/105788551
22. ResBlock
https://zhuanlan.zhihu.com/p/161639679
23. kernel_size=1 in conv
nn.Conv1d with a kernel size of 1 and nn.Linear give essentially the same result. The only differences are the initialization procedure and how the operations are applied (which has some effect on speed). Note that using a linear layer should be faster, as it is implemented as a simple matrix multiplication (plus adding a broadcast bias vector).
https://stackoverflow.com/questions/55576314/conv1d-with-kernel-size-1-vs-linear-layer
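A quick sanity check of this claim, copying the conv weights into a linear layer so both compute the same function:

import torch
import torch.nn as nn

conv = nn.Conv1d(in_channels=8, out_channels=4, kernel_size=1)
lin = nn.Linear(8, 4)

# Conv1d weight has shape (out, in, 1); drop the kernel dim to match Linear.
with torch.no_grad():
    lin.weight.copy_(conv.weight.squeeze(-1))
    lin.bias.copy_(conv.bias)

x = torch.randn(2, 8, 10)                           # (batch, channels, length)
out_conv = conv(x)                                  # (2, 4, 10)
out_lin = lin(x.transpose(1, 2)).transpose(1, 2)    # same shape
print(torch.allclose(out_conv, out_lin, atol=1e-6)) # True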
24. torch.repeat
https://pytorch.org/docs/stable/generated/torch.Tensor.repeat.html
>>> x = torch.tensor([1, 2, 3])
>>> x.repeat(4, 2)
tensor([[1, 2, 3, 1, 2, 3],
        [1, 2, 3, 1, 2, 3],
        [1, 2, 3, 1, 2, 3],
        [1, 2, 3, 1, 2, 3]])
>>> x.repeat(4, 2, 1).size()
torch.Size([4, 2, 3])
Take x.repeat(4, 2, 1) as an example: three repeat factors (4, 2, 1) are given but x is one-dimensional, so x is first implicitly unsqueezed twice to shape (1, 1, 3), and then those three dimensions are repeated 4, 2, and 1 times respectively.