In neural networks, the softmax function and the cross entropy loss are usually used as a set, so it is convenient to think about the differential of the combined Softmax - cross entropy layer. In this article, I will share with you how to derive the differential of the Softmax - cross entropy layer. As prior knowledge, reading the linked articles is recommended :)
0. Differential of Cross Entropy
Let $y_k$ be the output of the softmax function and $t_k$ be the correct label given as a one-hot vector. Then the cross entropy takes the form
$$\mathrm{Cross\ Entropy} = -\sum_{k=1}^{n} t_k \log y_k$$
Obviously, we can derive the differential of the cross entropy with comparative ease :) Letting $E$ be the cross entropy,
$$\frac{\partial E}{\partial y_k} = -\frac{t_k}{y_k}$$
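As a quick sanity check, here is a minimal NumPy sketch (the function and variable names are my own choice, not from the original article) that evaluates the cross entropy and its gradient $-t_k / y_k$ on a toy example:

```python
import numpy as np

def cross_entropy(y, t):
    # E = -sum_k t_k * log(y_k)
    return -np.sum(t * np.log(y))

def cross_entropy_grad(y, t):
    # dE/dy_k = -t_k / y_k
    return -t / y

# toy example: 3-class probabilities and a one-hot target
y = np.array([0.7, 0.2, 0.1])
t = np.array([1.0, 0.0, 0.0])

print(cross_entropy(y, t))        # -log(0.7) ≈ 0.3567
print(cross_entropy_grad(y, t))   # ≈ [-1.4286, -0., -0.]
```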
1. Differential of the Softmax - Cross Entropy Layer
As we discussed here, the differential of the softmax function can be written as follows, where $a = (a_1, a_2, \cdots, a_n)$ is the input and $y = (y_1, y_2, \cdots, y_n)$ is the output of the softmax function:
$$\frac{\partial y_l}{\partial a_k} =
\begin{cases}
y_k(1 - y_k) & (k = l) \\
-y_k y_l & (k \neq l)
\end{cases}$$
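To make the case analysis concrete, the sketch below (my own illustration, not part of the original derivation) builds the full Jacobian matrix of the softmax output with respect to its input and checks it against a numerical difference:

```python
import numpy as np

def softmax(a):
    # numerically stable softmax
    e = np.exp(a - np.max(a))
    return e / np.sum(e)

def softmax_jacobian(y):
    # J[l, k] = dy_l/da_k = y_k*(1 - y_k) if k == l else -y_k*y_l
    return np.diag(y) - np.outer(y, y)

a = np.array([1.0, 2.0, 0.5])
y = softmax(a)

# numerical check of the Jacobian via central differences
eps = 1e-5
num = np.zeros((len(a), len(a)))
for k in range(len(a)):
    d = np.zeros_like(a); d[k] = eps
    num[:, k] = (softmax(a + d) - softmax(a - d)) / (2 * eps)

print(np.allclose(softmax_jacobian(y), num, atol=1e-7))  # True
```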
Combining this with the result above via the chain rule, the differential of the Softmax - cross entropy layer can be derived as below :)
$$\frac{\partial E}{\partial a_k}
= \sum_{i=1}^{n} \frac{\partial E}{\partial y_i}\frac{\partial y_i}{\partial a_k}
= -\frac{t_k}{y_k} y_k (1 - y_k) + \sum_{i \neq k} \frac{t_i}{y_i} y_i y_k
= t_k y_k - t_k + y_k \sum_{i \neq k} t_i
= y_k - t_k \quad \left(\because \sum_{i=1}^{n} t_i = 1\right)$$
Thus, we obtain an incredibly simple result for the differential:
$$\frac{\partial E}{\partial a_k} = y_k - t_k$$
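The result $\partial E / \partial a_k = y_k - t_k$ can also be confirmed numerically. This is a minimal sketch, reusing the assumed `softmax` helper from above; the analytic gradient is simply `softmax(a) - t`:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / np.sum(e)

def loss(a, t):
    # cross entropy of softmax(a) against the one-hot target t
    return -np.sum(t * np.log(softmax(a)))

a = np.array([1.0, 2.0, 0.5])
t = np.array([0.0, 1.0, 0.0])

# analytic gradient of the Softmax - cross entropy layer
analytic = softmax(a) - t

# numerical gradient via central differences
eps = 1e-5
numeric = np.zeros_like(a)
for k in range(len(a)):
    d = np.zeros_like(a); d[k] = eps
    numeric[k] = (loss(a + d, t) - loss(a - d, t)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-7))  # True
```

This simple combined gradient is also why many frameworks fuse the softmax and cross entropy into a single layer: the backward pass reduces to $y - t$, which is cheaper and more numerically stable than chaining the two gradients separately.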