
L10-Training I

Training I

Activation Functions

- Sigmoid function: $\sigma(x) = \frac{1}{1 + e^{-x}}$
  - not zero-centered
  - saturates at both ends
  - outputs are always positive, so the gradients on the weights are always all positive or all negative :(
  - exp() is expensive to compute, but that is not really a problem on GPUs
- tanh function: $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$
  - a rescaled variant of sigmoid (zero-centered, but it still saturates)
- ReLU function: $f(x) = \max(0, x)$
  - does not saturate (in the positive region)
  - fast to compute
  - not zero-centered
  - dead ReLU problem ==> leaky ReLU
- Leaky ReLU function: $f(x) = \max(0.01x, x)$
  - fixes the dead ReLU problem ==> PReLU function: make the 0.01 slope a learnable parameter
- ELU function: $f(x) = \begin{cases} x & x \geq 0 \\ \alpha(e^x - 1) & x < 0 \end{cases}$

Data Preprocessing

See the related DATA-100 course material.
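As a quick reference, here is a minimal sketch of these activations in PyTorch; the 0.01 slope for Leaky ReLU/PReLU and `alpha=1.0` for ELU are assumed values matching the formulas above, and everything comes from the standard `torch` / `torch.nn.functional` API:

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-5, 5, steps=11)

sigmoid = torch.sigmoid(x)      # 1 / (1 + e^{-x}): saturates, always positive
tanh = torch.tanh(x)            # zero-centered sigmoid variant, still saturates
relu = F.relu(x)                # max(0, x)
leaky = F.leaky_relu(x, 0.01)   # max(0.01x, x): avoids dead ReLU
elu = F.elu(x, alpha=1.0)       # x for x >= 0, alpha * (e^x - 1) for x < 0

# PReLU: same shape as Leaky ReLU, but the negative slope is a learnable parameter
prelu = torch.nn.PReLU(init=0.01)
out = prelu(x)
```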

L9-Hardware and Software

Hardware and Software

Hardware

- eecs 598.009 GPU programming! I would actually really like to learn some CUDA programming
- TensorFlow supports TPUs; what about PyTorch?
- The computation graph is stored in GPU memory

Software

- The point of deep learning frameworks:
  - allow rapid prototyping
  - automatically compute gradients
  - run it all efficiently on GPUs (or else)
- PyTorch: defining sigmoid as a custom autograd Function reduces the number of computation-graph nodes, because the backward pass is rewritten by hand

```python
import torch

class Sigmoid(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        # Forward pass: compute sigmoid and stash the input for backward
        y = 1 / (1 + torch.exp(-input))
        ctx.save_for_backward(input)
        return y

    @staticmethod
    def backward(ctx, grad_output):
        # Backward pass: recompute sigmoid and apply its local gradient y * (1 - y)
        input, = ctx.saved_tensors
        y = 1 / (1 + torch.exp(-input))
        return grad_output * y * (1 - y)
```
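A short usage sketch (not from the original notes): a custom Function is invoked through `.apply`, and `torch.autograd.gradcheck` can verify the hand-written backward against numerical gradients (it expects double-precision inputs):

```python
# Hypothetical usage of the Sigmoid Function defined above
x = torch.randn(4, dtype=torch.double, requires_grad=True)

y = Sigmoid.apply(x)   # one graph node instead of several elementwise ops
y.sum().backward()     # runs the hand-written backward
print(x.grad)

# Check the custom backward against numerical gradients
from torch.autograd import gradcheck
print(gradcheck(Sigmoid.apply, (x,)))  # True if the gradient matches
```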