11. Given a single-layer neural network (no hidden layer) with input $\mathbf{x} = [2, -1, 3]$, weights $\mathbf{w} = [0.5, 0.3, -0.2]$, and bias $b = 0.1$, compute:

(1) the output using Sigmoid activation;

(2) the binary cross-entropy loss, given the true label $y = 1$.
12. A two-layer neural network has the following structure:

- Input layer: 3 neurons
- Hidden layer: 2 neurons, ReLU activation
- Output layer: 1 neuron, Sigmoid activation

Given the input $\mathbf{x} = [1, 0, -1]$ and the weights

$$W^{(1)} = \begin{bmatrix} 0.2 & 0.1 & 0.3 \\ -0.1 & 0.2 & 0.1 \end{bmatrix}, \quad \mathbf{b}^{(1)} = \begin{bmatrix} 0.1 \\ 0.2 \end{bmatrix}$$

$$W^{(2)} = \begin{bmatrix} 0.3 & 0.4 \end{bmatrix}, \quad b^{(2)} = 0.1$$

compute the network output $\hat{y}$.
===== 2.7 Answers and Explanations =====
11. **Solution**:

(1) Linear transformation:

$$z = \mathbf{w}^T \mathbf{x} + b = 0.5 \times 2 + 0.3 \times (-1) + (-0.2) \times 3 + 0.1$$

$$= 1.0 - 0.3 - 0.6 + 0.1 = 0.2$$

Sigmoid output:

$$\hat{y} = \sigma(0.2) = \frac{1}{1 + e^{-0.2}} = \frac{1}{1 + 0.819} \approx 0.550$$

(2) Binary cross-entropy loss:

$$\mathcal{L} = -[y \log(\hat{y}) + (1-y)\log(1-\hat{y})]$$

$$= -[1 \times \log(0.550) + 0 \times \log(0.450)]$$

$$= -\log(0.550) \approx 0.598$$
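The arithmetic can be verified numerically. Below is a minimal NumPy sketch (an illustrative check, not part of the original exercise) that reproduces both results:

<code python>
import numpy as np

# Problem 11: single-layer network with Sigmoid output.
x = np.array([2.0, -1.0, 3.0])   # input
w = np.array([0.5, 0.3, -0.2])   # weights
b = 0.1                          # bias

# (1) Linear transformation, then Sigmoid activation.
z = w @ x + b                    # 1.0 - 0.3 - 0.6 + 0.1 = 0.2
y_hat = 1.0 / (1.0 + np.exp(-z))

# (2) Binary cross-entropy loss with true label y = 1.
y = 1.0
loss = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

print(f"z = {z:.4f}")            # z = 0.2000
print(f"y_hat = {y_hat:.4f}")    # y_hat = 0.5498, i.e. ~0.550
print(f"loss = {loss:.4f}")      # loss = 0.5981, i.e. ~0.598
</code>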
12. **Solution**:

**Hidden layer**:

$$\mathbf{z}^{(1)} = W^{(1)} \mathbf{x} + \mathbf{b}^{(1)}$$

$$= \begin{bmatrix} 0.2 \times 1 + 0.1 \times 0 + 0.3 \times (-1) + 0.1 \\ -0.1 \times 1 + 0.2 \times 0 + 0.1 \times (-1) + 0.2 \end{bmatrix}$$

$$= \begin{bmatrix} 0.2 - 0.3 + 0.1 \\ -0.1 - 0.1 + 0.2 \end{bmatrix} = \begin{bmatrix} 0.0 \\ 0.0 \end{bmatrix}$$

Applying ReLU:

$$\mathbf{a}^{(1)} = \text{ReLU}([0.0, 0.0]^T) = [0.0, 0.0]^T$$

**Output layer**:

$$z^{(2)} = W^{(2)} \mathbf{a}^{(1)} + b^{(2)} = 0.3 \times 0 + 0.4 \times 0 + 0.1 = 0.1$$

Applying Sigmoid:

$$\hat{y} = \sigma(0.1) = \frac{1}{1 + e^{-0.1}} \approx \frac{1}{1 + 0.905} \approx 0.525$$
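As a check, the full forward pass can be reproduced with a few lines of NumPy (an illustrative sketch, not part of the original solution):

<code python>
import numpy as np

# Problem 12: two-layer network, ReLU hidden layer, Sigmoid output.
x  = np.array([1.0, 0.0, -1.0])
W1 = np.array([[ 0.2, 0.1, 0.3],
               [-0.1, 0.2, 0.1]])
b1 = np.array([0.1, 0.2])
W2 = np.array([[0.3, 0.4]])
b2 = np.array([0.1])

# Hidden layer: linear transformation, then ReLU.
z1 = W1 @ x + b1                  # [0.0, 0.0]
a1 = np.maximum(z1, 0.0)          # ReLU(0) = 0, so a1 = [0.0, 0.0]

# Output layer: linear transformation, then Sigmoid.
z2 = W2 @ a1 + b2                 # 0.3*0 + 0.4*0 + 0.1 = 0.1
y_hat = 1.0 / (1.0 + np.exp(-z2))

print(f"a1 = {a1}")               # [0. 0.]
print(f"y_hat = {y_hat[0]:.4f}")  # 0.5250, i.e. ~0.525
</code>

Because both hidden pre-activations land exactly at 0, the ReLU units contribute nothing and the output is determined by the output-layer bias alone.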
----

**End of Chapter**