CS231n

Assignment 1-1 (KNN)

by 민지기il 2024. 3. 18.

KNN practice

 

Expanding the squared L2 distance, ||x - y||^2 = ||x||^2 - 2 x·y + ||y||^2, is the trick behind the fully vectorized distance computation.
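A minimal sketch of how that expansion turns into the no-loop distance computation (written here as a standalone function for illustration; in k_nearest_neighbor.py the same idea goes inside compute_distances_no_loops, where the training data is stored as self.X_train):

import numpy as np

def l2_distances_no_loops(X, X_train):
    # ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2, applied row-wise via broadcasting.
    # X: (num_test, D), X_train: (num_train, D) -> dists: (num_test, num_train)
    test_sq = np.sum(X ** 2, axis=1, keepdims=True)   # (num_test, 1)
    train_sq = np.sum(X_train ** 2, axis=1)           # (num_train,)
    cross = X.dot(X_train.T)                          # (num_test, num_train)
    return np.sqrt(test_sq - 2.0 * cross + train_sq)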

Cross-Validation

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

X_train_folds = []
y_train_folds = []
################################################################################
# TODO: #
# Split up the training data into folds. After splitting, X_train_folds and #
# y_train_folds should each be lists of length num_folds, where #
# y_train_folds[i] is the label vector for the points in X_train_folds[i]. #
# Hint: Look up the numpy array_split function. #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

X_train_folds=np.array_split(X_train, num_folds)
y_train_folds=np.array_split(y_train, num_folds)



# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}



################################################################################
# TODO: #
# Perform k-fold cross validation to find the best value of k. For each #
# possible value of k, run the k-nearest-neighbor algorithm num_folds times, #
# where in each case you use all but one of the folds as training data and the #
# last fold as a validation set. Store the accuracies for all fold and all #
# values of k in the k_to_accuracies dictionary. #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

for k in k_choices:
    k_to_accuracies[k] = []
    for i in range(num_folds):
        # Combine every fold except fold i into one training set
        X_train_fold = np.concatenate([x for num, x in enumerate(X_train_folds) if num != i])
        y_train_fold = np.concatenate([x for num, x in enumerate(y_train_folds) if num != i])

        classifier.train(X_train_fold, y_train_fold)
        # Predicted labels for the held-out fold (X_train_folds[i])
        y_fold_pred = classifier.predict(X_train_folds[i], k=k, num_loops=0)
        num_correct = np.sum(y_fold_pred == y_train_folds[i])

        accuracy = float(num_correct) / X_train_folds[i].shape[0]
        k_to_accuracies[k].append(accuracy)


# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print('k = %d, accuracy = %f' % (k, accuracy))

 

This is the process of finding the best value of k for the k-NN classifier using k-fold cross-validation.

 

1. Split the training data into num_folds folds.

X_train_folds = np.array_split(X_train, num_folds)

y_train_folds = np.array_split(y_train, num_folds)
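Note that np.array_split, unlike np.split, also works when the number of examples is not evenly divisible by num_folds; the leading folds simply get one extra element. A quick check:

import numpy as np

folds = np.array_split(np.arange(10), 3)   # uneven split: fold sizes 4, 3, 3
print([f.tolist() for f in folds])         # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]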

 

2. In the training step, train the model on all folds except one; in the validation step, measure its accuracy on the held-out fold. Repeat this for every fold of the k-fold cross-validation, and store the accuracies for each value of k in the k_to_accuracies dictionary.

for k in k_choices:
    k_to_accuracies[k] = []
    for i in range(num_folds):
        # Concatenate every fold except the one at index i into a single training array
        X_train_fold = np.concatenate([x for num, x in enumerate(X_train_folds) if num != i])
        y_train_fold = np.concatenate([x for num, x in enumerate(y_train_folds) if num != i])

        classifier.train(X_train_fold, y_train_fold)
        y_fold_pred = classifier.predict(X_train_folds[i], k=k, num_loops=0)
        num_correct = np.sum(y_fold_pred == y_train_folds[i])

        accuracy = float(num_correct) / X_train_folds[i].shape[0]
        k_to_accuracies[k].append(accuracy)
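Once k_to_accuracies is filled in, the notebook picks the k with the best cross-validated accuracy and re-evaluates on the test set. A rough sketch of that last step (variable names like X_test / y_test follow the notebook; choosing by mean accuracy across folds is one reasonable rule):

# Average the per-fold accuracies and take the best k
mean_acc = {k: np.mean(v) for k, v in k_to_accuracies.items()}
best_k = max(mean_acc, key=mean_acc.get)
print('best k = %d (mean cross-validation accuracy = %f)' % (best_k, mean_acc[best_k]))

# Retrain on the full training set with best_k and evaluate on the test set
classifier.train(X_train, y_train)
y_test_pred = classifier.predict(X_test, k=best_k, num_loops=0)
test_accuracy = float(np.sum(y_test_pred == y_test)) / X_test.shape[0]
print('test accuracy: %f' % test_accuracy)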

 

Attached files: k_nearest_neighbor.py (0.01MB), knn.ipynb (0.53MB)

 

 

It feels like working through the textbook and then getting hit with problem 30 on the CSAT. I clearly know the concepts, but I can't figure out how to implement them in code.

At first I tried to avoid googling as much as possible, but it can't be helped.

I'm working through it while organizing my notes as much as I can, and when I get stuck I consult ChatGPT or Google.

 

**Code notes

# np.argsort

1.

Given a = np.array([4, 2, 1, 7]),

s = a.argsort()

print(s) prints [2, 1, 0, 3], and

print(a[s]) prints [1, 2, 4, 7].

2.

Equivalently, with a = np.array([4, 2, 1, 7]),

s = np.argsort(a)

print(a[s]) gives the same sorted array.
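For context, this is where argsort shows up in the assignment: inside predict_labels, the labels of the k nearest training points are gathered and majority-voted. A sketch of that inner step (assuming dists[i] holds the distances from test point i to every training point, as in k_nearest_neighbor.py):

# Labels of the k training points closest to test example i
closest_y = self.y_train[np.argsort(dists[i])[:k]]
# Majority vote; np.argmax on the bincount breaks ties toward the smaller label
y_pred[i] = np.argmax(np.bincount(closest_y))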
