Stanford CS231n Course Notes

I. Notes on Some APIs

1. plt.rcParams

Purpose: sets matplotlib's configuration parameters.

Example:

import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

2. autoreload

Purpose: while debugging, if the source code changes, the modules imported in IPython are reloaded automatically.

Example:

%load_ext autoreload
%autoreload 2
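Outside IPython the magics above are not available. As a rough sketch of what reloading means, the standard library's importlib.reload can re-execute a module by hand after its source has changed. The module name reload_demo_module and the helper reload_demo are made up for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def reload_demo():
    """Import a throwaway module, edit its source, and reload it by hand."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # avoid a stale .pyc masking the edit
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "reload_demo_module.py"  # hypothetical module
        source.write_text("VALUE = 1\n")
        sys.path.insert(0, tmp)
        try:
            module = importlib.import_module("reload_demo_module")
            before = module.VALUE
            source.write_text("VALUE = 20\n")  # simulate editing the file
            module = importlib.reload(module)  # re-run the module's code
            after = module.VALUE
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("reload_demo_module", None)
            sys.dont_write_bytecode = old_flag
    return before, after


before, after = reload_demo()
print(before, after)
```

With %autoreload 2, IPython performs this kind of reload for you before each cell runs, so the manual call is only needed in plain Python sessions.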


3. np.flatnonzero()

Purpose: returns the indices of the nonzero elements of the flattened array.

Example:

import numpy as np
x = np.arange(-2, 3)
print(x)
y = np.flatnonzero(x)
print(y)

Output:

[-2 -1  0  1  2]
[0 1 3 4]

np.flatnonzero(y_train == y)

Purpose: finds the positions of the labels that belong to class y.

Example:

z = np.flatnonzero(x == -1)
print(z)

Output:

[1]
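A small sketch of how this pattern can group sample indices by class. The label array y_train_demo is made-up data, not the CIFAR-10 labels used in the assignment.

```python
import numpy as np

# Hypothetical labels for seven training samples, three classes.
y_train_demo = np.array([0, 2, 1, 2, 0, 1, 2])

# For every class, collect the indices of the samples carrying that label.
indices_per_class = {
    int(c): np.flatnonzero(y_train_demo == c) for c in np.unique(y_train_demo)
}

for c, idxs in indices_per_class.items():
    print(c, idxs)
```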

4. np.random.choice

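Signature: numpy.random.choice(a, size=None, replace=True, p=None)

Purpose: randomly samples values from a.

Details:

Parameter	Meaning
a	a 1-D array or an int; an int n stands for np.arange(n)
size	shape of the output array
replace	whether to sample with replacement
p	probabilities associated with the entries of a

Example: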

print(np.random.choice(7,4))  #[0 6 4 6]

Explanation: draws 4 integers at random from 0 to 6 (7 itself is excluded), with replacement. This is equivalent to np.random.randint(0, 7, 4).

print(np.random.choice(7,4,p=[0,0.1,0.3,0.2,0,0.2,0.2]))

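Explanation: each value in p is the probability of drawing the corresponding element of a, so p must have the same length as a and sum to 1.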
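A short sketch of how replace and p change the samples. The seed is fixed only to make the run repeatable; the variable names are illustrative.

```python
import numpy as np

np.random.seed(0)

# With replacement (the default): the same value may appear more than once.
with_replacement = np.random.choice(7, 4)

# Without replacement: all drawn values are distinct.
without_replacement = np.random.choice(7, 4, replace=False)

# With p: values whose probability is 0 (here 0 and 4) are never drawn.
weighted = np.random.choice(7, 1000, p=[0, 0.1, 0.3, 0.2, 0, 0.2, 0.2])

print(with_replacement)
print(without_replacement)
print(np.unique(weighted))
```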

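5. -1 in reshape

Purpose: lets NumPy infer the length of that dimension from the total number of elements and the other dimensions.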

# Reshape the image data into rows
X_train = np.reshape(X_train, (X_train.shape[0], -1))
X_test = np.reshape(X_test, (X_test.shape[0], -1))
print(X_train.shape, X_test.shape)

Output:

(5000, 3072) (500, 3072)
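To see where 3072 comes from, here is a sketch with a made-up batch of five 32x32 RGB images (the array images_demo is an assumption standing in for X_train): the -1 is resolved to 32 * 32 * 3.

```python
import numpy as np

# Hypothetical batch: 5 images, 32x32 pixels, 3 color channels.
images_demo = np.zeros((5, 32, 32, 3))

# Keep the first axis and let NumPy infer the second one.
rows = np.reshape(images_demo, (images_demo.shape[0], -1))

print(rows.shape)  # (5, 3072)
```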

6. np.linalg.norm

Signature:

x_norm=np.linalg.norm(x, ord=None, axis=None, keepdims=False)

Purpose: computes a vector or matrix norm.

Example:

difference = np.linalg.norm(dists - dists_two, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')

Output:

Difference was: 0.000000
Good! The distance matrices are the same

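Note: to make sure the vectorized code is correct, its result is compared with the result obtained earlier. There are many ways to check whether two matrices are equal; one of the simplest is the Frobenius norm of their difference, which is the square root of the sum of the squared differences of all elements. Equivalently, reshape the two matrices into vectors and compute the Euclidean distance between them.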
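The equivalence stated above can be checked numerically. This is a minimal sketch on small random matrices; a and b are made-up stand-ins for dists and dists_two.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 3))
b = a + 1e-3 * rng.normal(size=(4, 3))  # a slightly perturbed copy of a

# Frobenius norm of the difference.
fro = np.linalg.norm(a - b, ord='fro')

# The same quantity written out: square root of the sum of squared differences.
by_hand = np.sqrt(np.sum((a - b) ** 2))

# The same quantity again: Euclidean distance between the flattened matrices.
as_vectors = np.linalg.norm(a.ravel() - b.ravel())

print(fro, by_hand, as_vectors)
```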

7. *args, **kwargs

*args collects any number of positional (unnamed) arguments; it is a tuple.

**kwargs collects the keyword arguments; it is a dict.

Example:

def foo(*args,**kwargs):
    print('args=',args)
    print('kwargs=',kwargs)
    print('************')

foo(1,2,3)
foo(a=1,b=2,c=3)
foo(1,2,a=3)

Output:

args= (1, 2, 3)
kwargs= {}
************
args= ()
kwargs= {'a': 1, 'b': 2, 'c': 3}
************
args= (1, 2)
kwargs= {'a': 3}
************

Example:

# Let's compare how fast the implementations are
def time_function(f, *args):
    """
    Call a function f with args and return the time (in seconds) that it took to execute.
    """
    import time
    tic = time.time()
    f(*args)
    toc = time.time()
    return toc - tic

two_loop_time = time_function(classifier.compute_distances_two_loops, X_test)
print('Two loop version took %f seconds' % two_loop_time)
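As a possible variation, the same idea can forward keyword arguments too and hand back the function's result. The name time_call is hypothetical, and time.perf_counter is swapped in for time.time because it is intended for measuring short durations.

```python
import time


def time_call(f, *args, **kwargs):
    """Call f(*args, **kwargs); return its result and the elapsed seconds."""
    start = time.perf_counter()
    result = f(*args, **kwargs)  # unpack the tuple and the dict again
    elapsed = time.perf_counter() - start
    return result, elapsed


# Positional and keyword arguments are both passed through to sorted().
result, elapsed = time_call(sorted, [3, 1, 2], reverse=True)
print(result, elapsed)
```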

II. How KNN Works

1. compute_distances_two_loops

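Principle: the L2 distance between the i-th test point and the j-th training point, summed over the feature index k:

$d_{ij} = \sqrt{\sum_k \left(X_{ik} - X^{\text{train}}_{jk}\right)^2}$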

    def compute_distances_two_loops(self, X):
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
        #####################################################################
        # TODO:                                                             #
        # Compute the l2 distance between the ith test point and the jth    #
        # training point, and store the result in dists[i, j]. You should   #
        # not use a loop over dimension.                                    #
        #####################################################################
                dists[i,j] = np.sqrt(np.dot(X[i] - self.X_train[j],X[i] - self.X_train[j]))
        #####################################################################
        #                       END OF YOUR CODE                            #
        #####################################################################
        return dists
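A standalone sketch of the same double loop, written as a plain function so it can be checked on points whose distances are known (the 3-4-5 triangle). The function name l2_two_loops and the demo arrays are assumptions made for this example.

```python
import numpy as np


def l2_two_loops(X, X_train):
    """L2 distance between every row of X and every row of X_train."""
    num_test, num_train = X.shape[0], X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        for j in range(num_train):
            diff = X[i] - X_train[j]  # difference vector, computed once
            dists[i, j] = np.sqrt(np.dot(diff, diff))
    return dists


X_demo = np.array([[0.0, 0.0], [3.0, 4.0]])
X_train_demo = np.array([[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]])

# Row 0 holds the distances 5, 0, 10; row 1 holds 0, 5, 5.
dists_demo = l2_two_loops(X_demo, X_train_demo)
print(dists_demo)
```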

2. compute_distances_one_loop

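Principle: for a fixed test point i, the distances to all training points j are computed at once:

$d_{ij} = \lVert X_i - X^{\text{train}}_j \rVert_2$

This relies on NumPy broadcasting: X[i, :] is subtracted from every row of self.X_train.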

    def compute_distances_one_loop(self, X):
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            #######################################################################
            # TODO:                                                               #
            # Compute the l2 distance between the ith test point and all training #
            # points, and store the result in dists[i, :].                        #
            #######################################################################
            dists[i, :] = np.linalg.norm(X[i, :] - self.X_train[:], axis=1)
            #######################################################################
            #                         END OF YOUR CODE                            #
            #######################################################################
        return dists
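The broadcasting step can be followed through the array shapes. In this sketch x_demo plays the role of X[i, :] and X_train_demo the role of self.X_train; both are made-up data.

```python
import numpy as np

X_train_demo = np.array([[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]])  # shape (3, 2)
x_demo = np.array([0.0, 0.0])                                   # shape (2,)

# Broadcasting stretches x_demo across the 3 rows of X_train_demo.
diff = x_demo - X_train_demo            # shape (3, 2)

# axis=1 takes the norm of each row, giving one distance per training point.
row = np.linalg.norm(diff, axis=1)      # shape (3,), values 5, 0, 10

print(diff.shape, row.shape, row)
```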

3. compute_distances_no_loops

    def compute_distances_no_loops(self, X):
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        #########################################################################
        # TODO:                                                                 #
        # Compute the l2 distance between all test points and all training      #
        # points without using any explicit loops, and store the result in      #
        # dists.                                                                #
        #                                                                       #
        # You should implement this function using only basic array operations; #
        # in particular you should not use functions from scipy.                #
        #                                                                       #
        # HINT: Try to formulate the l2 distance using matrix multiplication    #
        #       and two broadcast sums.                                         #
        #########################################################################
        dists += np.sum(np.multiply(X, X), axis=1, keepdims=True).reshape(num_test, 1)
        dists += np.sum(np.multiply(self.X_train, self.X_train), axis=1, keepdims=True).reshape(1, num_train)
        dists += -2 * np.dot(X, self.X_train.T)
        dists = np.sqrt(dists)
        #########################################################################
        #                         END OF YOUR CODE                              #
        #########################################################################
        return dists
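The three terms added to dists follow from expanding the squared distance: $\lVert x - y \rVert^2 = \lVert x \rVert^2 + \lVert y \rVert^2 - 2\,x \cdot y$. The sketch below is a standalone version of that idea, checked against a direct broadcast computation on random data. The function name l2_no_loops is an assumption, and the np.maximum clamp is an addition that is not in the code above: it guards against tiny negative values caused by floating-point rounding before the square root is taken.

```python
import numpy as np


def l2_no_loops(X, X_train):
    """Pairwise L2 distances via ||x||^2 + ||y||^2 - 2 x.y, without loops."""
    test_sq = np.sum(X ** 2, axis=1, keepdims=True)  # (num_test, 1)
    train_sq = np.sum(X_train ** 2, axis=1)          # (num_train,)
    cross = X @ X_train.T                            # (num_test, num_train)
    squared = test_sq + train_sq - 2.0 * cross       # two broadcast sums
    return np.sqrt(np.maximum(squared, 0.0))         # clamp rounding noise


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 5))
X_train_demo = rng.normal(size=(8, 5))

# Reference: build every pairwise difference explicitly, then take row norms.
reference = np.linalg.norm(
    X_demo[:, None, :] - X_train_demo[None, :, :], axis=2
)
max_gap = np.max(np.abs(l2_no_loops(X_demo, X_train_demo) - reference))
print(max_gap)
```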

III. Notes on the Assignment Questions

Inline Question #1: Notice the structured patterns in the distance matrix, where some rows or columns are visibly brighter. (Note that with the default color scheme black indicates low distances while white indicates high distances.)

What in the data is the cause behind the distinctly bright rows?
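What causes the columns?

Answer: A bright row means that the corresponding test sample differs strongly from all training samples; that test sample may be noticeably too bright, too dark, or color-shifted, or the training data may contain bad samples. A bright column means that all test samples differ strongly from the corresponding training sample; that training sample may be noticeably too bright, too dark, or color-shifted.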

Inline Question 2 We can also use other distance metrics such as L1 distance. The performance of a Nearest Neighbor classifier that uses L1 distance will not change if (Select all that apply.):

1. The data is preprocessed by subtracting the mean.
2. The data is preprocessed by subtracting the mean and dividing by the standard deviation.
3. The coordinate axes for the data are rotated.
4. None of the above.

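Your Answer: 1, 2

Your explanation: Options 1 and 2 apply the same linear transformation to the coordinates of every sample. The shift cancels when two points are subtracted, and a common scale factor k > 0 multiplies every L1 distance by k, so if |x| + |y| + |z| + ... is the smallest before the transformation, k|x| + k|y| + k|z| + ... is still the smallest afterwards. The nearest neighbors, and therefore the result under L1 distance, do not change. This assumes that option 2 divides every coordinate by the same scalar standard deviation.

Option 3 rotates the coordinate axes, which changes the L1 distance but not the L2 distance. A rotation by β gives [x1, y1]T = [[cosβ, sinβ], [-sinβ, cosβ]] [x, y]T, i.e. x1 = x cosβ + y sinβ and y1 = -x sinβ + y cosβ, so x1² + y1² = x² + y² and the L2 distance is unchanged, while |x1| + |y1| generally differs from |x| + |y|.

L1 depends on the choice of coordinate system, so it suits data whose individual coordinates have a specific meaning; L2 does not. When two vectors differ, L2 is less tolerant of large differences than L1: it prefers several moderate differences to one very large difference.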
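A numerical sketch of the argument, using two made-up 2-D points and a rotation by 45 degrees. The helper names l1 and l2 and the shift and scale values are chosen only for illustration.

```python
import numpy as np


def l1(u, v):
    return np.sum(np.abs(u - v))


def l2(u, v):
    return np.sqrt(np.sum((u - v) ** 2))


p = np.array([0.0, 0.0])
q = np.array([1.0, 1.0])

# Rotation matrix for beta = 45 degrees, same layout as in the explanation.
beta = np.pi / 4
R = np.array([[np.cos(beta), np.sin(beta)],
              [-np.sin(beta), np.cos(beta)]])

l1_before, l2_before = l1(p, q), l2(p, q)
l1_after, l2_after = l1(R @ p, R @ q), l2(R @ p, R @ q)

# Shift by 5 and divide by 2: every L1 distance is simply halved.
l1_scaled = l1((p - 5.0) / 2.0, (q - 5.0) / 2.0)

print(l1_before, l1_after)   # L1 changes under rotation
print(l2_before, l2_after)   # L2 does not
print(l1_scaled)             # L1 scaled by the common factor 1/2
```

Because the same factor applies to every pair of points, the ordering of neighbors under L1 is preserved by the shift and scale, but not necessarily by the rotation.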

Inline Question 3 Which of the following statements about k-Nearest Neighbor (k-NN) are true in a classification setting, and for all k? Select all that apply.

1. The training error of a 1-NN will always be better than that of 5-NN.
2. The test error of a 1-NN will always be better than that of a 5-NN.
3. The decision boundary of the k-NN classifier is linear.
4. The time needed to classify a test example with the k-NN classifier grows with the size of the training set.
5. None of the above.

Your Answer: Statements 1,4 are true

Your explanation:

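1: With k=1 only the nearest point decides the label. On the training set the nearest point of each sample is the sample itself, so there is no training error. With k=5 the training error depends on the outcome of the vote and can be nonzero.

2: The smaller k is, the more the classifier overfits noisy data and the worse it generalizes, so k=1 is not necessarily better than k=5.

3: First, KNN is not a linear classifier, because its output is not a linear function of its input. Second, the KNN decision boundary is made up of many small linear pieces, so it is linear only locally.

4: The number of training points that must be searched grows with the size of the training set.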
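Point 1 can be illustrated with a minimal k-NN sketch on random data. The function knn_predict, the random points, and the random labels are all assumptions made for this example; it also assumes that no two training points coincide, which is what makes each point its own nearest neighbor.

```python
import numpy as np


def knn_predict(X_train, y_train, X, k):
    """Predict labels for X by majority vote among the k nearest training points."""
    dists = np.linalg.norm(X[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]  # indices of the k closest
    votes = y_train[nearest]                    # their labels, shape (n, k)
    return np.array([np.bincount(row).argmax() for row in votes])


rng = np.random.default_rng(0)
X_train_demo = rng.normal(size=(40, 2))
y_train_demo = rng.integers(0, 3, size=40)  # labels unrelated to position

# Evaluate on the training set itself.
err_1 = np.mean(knn_predict(X_train_demo, y_train_demo, X_train_demo, 1) != y_train_demo)
err_5 = np.mean(knn_predict(X_train_demo, y_train_demo, X_train_demo, 5) != y_train_demo)
print(err_1, err_5)
```

The cost of each prediction is dominated by the distance computation against every training point, which is also why point 4 holds.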

4. Discussion of Results

Two loop version took 24.132196 seconds
One loop version took 45.021950 seconds
No loop version took 0.465832 seconds

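Reason:

The one-loop version takes longer than the two-loop version, most likely because every iteration allocates a new block of memory for the temporary array produced by the broadcast subtraction.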
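A rough back-of-the-envelope sketch of how large that temporary array is, assuming the shapes printed earlier (5000 training rows of 3072 features) and 64-bit floats. The dtype is an assumption; the timings themselves are not reproduced here.

```python
import numpy as np

num_train, dim = 5000, 3072                   # shapes from the reshape step
itemsize = np.dtype(np.float64).itemsize      # 8 bytes, assuming float64 data

# Size of X[i, :] - self.X_train, allocated once per test point.
bytes_per_temp = num_train * dim * itemsize
mib_per_temp = bytes_per_temp / 2**20

print(bytes_per_temp, mib_per_temp)  # 122880000 bytes, 117.1875 MiB
```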

Stanford CS231n Course Notes: assignment1 KNN

Views: 1307 2026-05-08
