Computer Vision – Assignment 1
Image Classification using a Bag-of-Words model
o 80% of the assignment marks will be awarded for correctness of results
o 20% of the assignment marks will be awarded for the quality of the accompanying report
Submission Instructions
o Send all solutions as a single PDF document containing your answers, results, and discussion
of the results. Attach the source code for the programming problems as separate files.
Before you start, download “COMP338_Assignment1_Dataset.zip” from Canvas. After unzipping the file, you will find five classes in the dataset: airplanes, cars, dog, faces, keyboard. There are 70 training
images and 10 test images for each class.
Step 1. (20 marks) Feature extraction
- Extract the SIFT features from the training and the test images. You need to implement the SIFT descriptor yourself, without calling the built-in SIFT functions in OpenCV (as the algorithm is patented). However, there are a few SIFT implementations available on GitHub that you can use as a reference.
- Explain the main steps to extract SIFT features with your code in the report.
Step 2. (20 marks) Dictionary generation
Create a dictionary by clustering the extracted SIFT descriptors from the training images. Use a dictionary of 500 words.
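
The assignment does not prescribe a clustering algorithm. K-means is a common choice for building a visual dictionary, so the sketch below assumes it. It is a minimal plain-Python version of Lloyd's algorithm, written for readability rather than speed. The function name `kmeans` and its parameters are illustrative, and a real run on 128-dimensional SIFT descriptors with 500 words would need a vectorized implementation.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm (assumed clustering method): return k cluster centres."""
    rng = random.Random(seed)
    # Initialise the centres from k randomly chosen data points.
    centres = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: group each point with its nearest centre.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centres[j]))
            groups[nearest].append(p)
        # Update step: move each centre to the mean of its group.
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centre
                centres[j] = [sum(col) / len(group) for col in zip(*group)]
    return centres
```

Called as `kmeans(descriptors, 500)` on the pooled training descriptors, the returned centres would play the role of the codewords. The fixed iteration count and random initialisation are simplifications; the result can depend on the seed.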
Step 3. (20 marks) Image representation with a histogram of codewords
- Calculate the Euclidean distance between image descriptors and codewords, i.e., the cluster centres.
- Assign each descriptor in the training and test images to the nearest codeword cluster.
- Visualize some image patches that are assigned to the same codeword.
- Represent each image in the training and the test dataset as a histogram of visual words, i.e., using
the Bag of Words representation. Normalize the histograms by their L1 norm.
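
The assignment and normalization steps above can be sketched as follows. This is an illustrative plain-Python version, assuming each descriptor and each codeword is a sequence of numbers of equal length; the name `bow_histogram` is hypothetical.

```python
import math


def bow_histogram(descriptors, codewords):
    """Return the L1-normalised histogram of visual words for one image."""
    counts = [0] * len(codewords)
    for d in descriptors:
        # Index of the codeword with the smallest Euclidean distance to d.
        nearest = min(range(len(codewords)), key=lambda j: math.dist(d, codewords[j]))
        counts[nearest] += 1
    total = sum(counts)
    # Counts are non-negative, so the L1 norm is simply their sum.
    if total == 0:  # image with no descriptors: return an all-zero histogram
        return [0.0] * len(codewords)
    return [c / total for c in counts]
```

After normalization the bins sum to 1, so images with different numbers of detected keypoints become comparable.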
Step 4. (10 marks) Classification
Apply a nearest neighbor classifier: calculate the L2 distances, i.e., Euclidean distances, between the normalized histograms of the test and the training images; assign each test image the label (i.e., the
class) of its nearest neighbor in the training set.
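
A minimal sketch of this 1-nearest-neighbor rule, assuming the histograms are equal-length lists of floats and `train_labels[i]` is the class of `train_hists[i]` (all names are illustrative):

```python
import math


def nn_classify(test_hist, train_hists, train_labels):
    """Return the label of the training histogram closest in L2 distance."""
    best = min(
        range(len(train_hists)),
        key=lambda i: math.dist(test_hist, train_hists[i]),
    )
    return train_labels[best]
```

Ties are broken in favour of the earliest training image, which is an arbitrary choice of this sketch.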
Step 5 (10 marks) Evaluation
- Compute and report the overall classification error and the classification error for each class.
- Compute and show the confusion matrix.
- For each class, show some images that are correctly classified and some images that are incorrectly classified. Can you explain some of the failures?
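
One possible way to compute these quantities is sketched below. It assumes the convention that rows of the confusion matrix are true classes and columns are predicted classes; the function names are illustrative.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[i][j] counts images of true class i predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def error_rates(matrix):
    """Return (overall error, list of per-class errors) from a confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    overall = 1 - correct / total
    # Per-class error: fraction of images of that true class that were misclassified.
    per_class = [
        1 - row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]
    return overall, per_class
```

With 10 test images per class, each row of the matrix sums to 10, so the per-class error is the number of off-diagonal entries in that row divided by 10.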
Step 6 (10 marks)
Replace the L2 distances with the histogram intersection between two histograms. Repeat Steps 4 and 5.
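
As a sketch, the histogram intersection of two histograms is the sum of their bin-wise minima. Note that it is a similarity rather than a distance, so the nearest neighbor is the training image with the largest value, not the smallest. The function names below are illustrative.

```python
def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; equals 1.0 for identical L1-normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def nn_classify_intersection(test_hist, train_hists, train_labels):
    """Return the label of the training histogram with the largest intersection."""
    best = max(
        range(len(train_hists)),
        key=lambda i: histogram_intersection(test_hist, train_hists[i]),
    )
    return train_labels[best]
```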
Step 7 (10 marks)
- Perform the classification experiment using a very small dictionary and report the classification error and confusion matrices. Hint: Cluster the descriptors into a small number of clusters (e.g. 20).
- Explain the drop in the performance. To support your argument, you may want to perform step 2.5 in order to visualize descriptors that belong to some of the visual words.