下面是使用numpy的广义解,以及与tensorflow和scipy的正确性比较:
数据准备:
import numpy as np
np.random.seed(2019)
batch_size = 1
n_items = 3
n_classes = 2
logits_np = np.random.rand(batch_size,n_items,n_classes).astype(np.float32)
print('logits_np.shape', logits_np.shape)
print('logits_np:')
print(logits_np)
输出:
logits_np.shape (1, 3, 2)
logits_np:
[[[0.9034822 0.3930805 ]
[0.62397 0.6378774 ]
[0.88049906 0.299172 ]]]
使用tensorflow的Softmax:
import tensorflow as tf
logits_tf = tf.convert_to_tensor(logits_np, np.float32)
scores_tf = tf.nn.softmax(logits_np, axis=-1)
print('logits_tf.shape', logits_tf.shape)
print('scores_tf.shape', scores_tf.shape)
with tf.Session() as sess:
scores_np = sess.run(scores_tf)
print('scores_np.shape', scores_np.shape)
print('scores_np:')
print(scores_np)
print('np.sum(scores_np, axis=-1).shape', np.sum(scores_np,axis=-1).shape)
print('np.sum(scores_np, axis=-1):')
print(np.sum(scores_np, axis=-1))
输出:
logits_tf.shape (1, 3, 2)
scores_tf.shape (1, 3, 2)
scores_np.shape (1, 3, 2)
scores_np:
[[[0.62490064 0.37509936]
[0.4965232 0.5034768 ]
[0.64137274 0.3586273 ]]]
np.sum(scores_np, axis=-1).shape (1, 3)
np.sum(scores_np, axis=-1):
[[1. 1. 1.]]
使用scipy的Softmax:
from scipy.special import softmax
scores_np = softmax(logits_np, axis=-1)
print('scores_np.shape', scores_np.shape)
print('scores_np:')
print(scores_np)
print('np.sum(scores_np, axis=-1).shape', np.sum(scores_np, axis=-1).shape)
print('np.sum(scores_np, axis=-1):')
print(np.sum(scores_np, axis=-1))
输出:
scores_np.shape (1, 3, 2)
scores_np:
[[[0.62490064 0.37509936]
[0.4965232 0.5034768 ]
[0.6413727 0.35862732]]]
np.sum(scores_np, axis=-1).shape (1, 3)
np.sum(scores_np, axis=-1):
[[1. 1. 1.]]
Softmax使用numpy (https://nolanbconaway.github.io/blog/2017/softmax-numpy):
def softmax(X, theta = 1.0, axis = None):
"""
Compute the softmax of each element along an axis of X.
Parameters
----------
X: ND-Array. Probably should be floats.
theta (optional): float parameter, used as a multiplier
prior to exponentiation. Default = 1.0
axis (optional): axis to compute values along. Default is the
first non-singleton axis.
Returns an array the same size as X. The result will sum to 1
along the specified axis.
"""
# make X at least 2d
y = np.atleast_2d(X)
# find axis
if axis is None:
axis = next(j[0] for j in enumerate(y.shape) if j[1] > 1)
# multiply y against the theta parameter,
y = y * float(theta)
# subtract the max for numerical stability
y = y - np.expand_dims(np.max(y, axis = axis), axis)
# exponentiate y
y = np.exp(y)
# take the sum along the specified axis
ax_sum = np.expand_dims(np.sum(y, axis = axis), axis)
# finally: divide elementwise
p = y / ax_sum
# flatten if X was 1D
if len(X.shape) == 1: p = p.flatten()
return p
scores_np = softmax(logits_np, axis=-1)
print('scores_np.shape', scores_np.shape)
print('scores_np:')
print(scores_np)
print('np.sum(scores_np, axis=-1).shape', np.sum(scores_np, axis=-1).shape)
print('np.sum(scores_np, axis=-1):')
print(np.sum(scores_np, axis=-1))
输出:
scores_np.shape (1, 3, 2)
scores_np:
[[[0.62490064 0.37509936]
[0.49652317 0.5034768 ]
[0.64137274 0.3586273 ]]]
np.sum(scores_np, axis=-1).shape (1, 3)
np.sum(scores_np, axis=-1):
[[1. 1. 1.]]