CA Assignment 2


Data Clustering


Submission Instructions

Submit via Canvas the following three files (please do NOT zip files into an archive):


  1. the source code for all your programs (do not provide IPython/Jupyter/Colab notebooks; instead submit standalone code in a single .py file),


  2. a README file (plain text) describing how to compile/run your code to produce the various results required by the assignment, and


  3. a PDF file of no more than 5 pages providing the answers to the questions. Your answers should be succinct, but complete and clear. The clarity and presentation of the report will be assessed.

It is extremely important that you provide all the files described above and not just the source code!

Objectives

This assignment requires you to implement the k-means and k-medians clustering algorithms using the Python programming language.
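
To illustrate how the two algorithms relate, here is a minimal sketch of a shared assign-then-update loop. It rests on several assumptions that the handout does not fix: k-means is taken to use squared Euclidean distance with a mean update, k-medians is taken to use Manhattan distance with a coordinate-wise median update (a common formulation), the representatives are initialised from k randomly chosen data points, and the loop stops when assignments no longer change. The function name `cluster` and its parameters are hypothetical.

```python
import random
import numpy as np

def cluster(X, k, use_median=False, max_iter=100, seed=0):
    """Assign-then-update clustering: k-means (default) or k-medians.

    Assumed conventions (not specified by the handout):
      k-means   -> squared Euclidean distance, mean update
      k-medians -> Manhattan (L1) distance, coordinate-wise median update
    """
    rng = random.Random(seed)
    # Initialise representatives with k distinct data points.
    init = rng.sample(range(len(X)), k)
    centres = np.array(X[init], dtype=float)
    assign = np.full(len(X), -1)
    for _ in range(max_iter):
        diff = X[:, None, :] - centres[None, :, :]   # shape (n, k, d)
        if use_median:
            dist = np.abs(diff).sum(axis=2)
        else:
            dist = (diff ** 2).sum(axis=2)
        new_assign = dist.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break                                    # assignments are stable
        assign = new_assign
        for j in range(k):
            members = X[assign == j]
            if len(members) == 0:
                continue                             # keep old centre if cluster is empty
            if use_median:
                centres[j] = np.median(members, axis=0)
            else:
                centres[j] = members.mean(axis=0)
    return assign, centres

# Toy data: two well-separated groups of three points each.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
means_assign, _ = cluster(X_toy, 2)
medians_assign, _ = cluster(X_toy, 2, use_median=True)
```

The only difference between the two variants in this sketch is the distance used for assignment and the statistic used for the update; other initialisation or stopping rules are equally possible.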

NOTE

No credit will be given for implementing any other type of clustering algorithm, or for using an existing clustering library instead of implementing the algorithms yourself.

You are allowed to use


  • the numpy library for accessing data structures such as numpy.array;


  • the random module;


  • matplotlib for plotting; and


  • pandas.read_csv, csv.reader, or similar modules only for reading data from the files.


However, it is not a requirement of the assignment to use any of those libraries. You must provide a README file describing how to run your code to reproduce your results. Programs that do not run will result in a mark of zero!

Assignment description

In the assignment, you are required to cluster words belonging to four categories: animals, countries, fruits and veggies. The words are arranged into four different files that you will find in the archive CA2data.zip. The name of a file is the true label of all objects in the file. Each line starts with a word, followed by 300 features (a word embedding) describing the meaning of that word.
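
As a rough illustration of that file layout, the sketch below parses lines of the form `word f1 f2 ... f300` into a word list, a label list and a numpy feature matrix. It assumes the fields are separated by whitespace, which the description does not state. The function name `parse_category` and the toy three-feature lines are hypothetical stand-ins for the real 300-feature files.

```python
import numpy as np

def parse_category(lines, label):
    """Parse lines of 'word f1 f2 ...' into (words, labels, features).

    Assumes whitespace-separated fields (an assumption; adjust the split
    if the real files use another delimiter).
    """
    words, rows = [], []
    for line in lines:
        parts = line.split()
        if not parts:                      # skip blank lines
            continue
        words.append(parts[0])             # first entry is the word
        rows.append([float(x) for x in parts[1:]])
    features = np.array(rows, dtype=float)
    labels = [label] * len(words)          # file name acts as the true label
    return words, labels, features

# Toy stand-in for one file; real lines carry 300 features per word.
toy_lines = ["cat 0.1 0.2 0.3", "dog 0.0 0.5 0.4", ""]
words, labels, X = parse_category(toy_lines, "animals")
```

Because the function accepts any iterable of strings, an open file object for one of the four data files could be passed in place of `toy_lines`, with the file name as `label`.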

Questions/Tasks


  1. (25 marks) Implement the k-means clustering algorithm to cluster the instances into k clusters.


  2. (25 marks) Implement the k-medians clustering algorithm to cluster the instances into k clusters.


  3. (10 marks) Run the k-means clustering algorithm you implemented in part (1) to cluster the given instances. Vary the value of k from 1 to 9 and compute the B-CUBED precision, recall, and F-score for each set of clusters. Plot k on the horizontal axis and the B-CUBED precision, recall and F-score on the vertical axis in the same plot.


  4. (10 marks) Now re-run the k-means clustering algorithm you implemented in part (1) but normalise each object (vector) to unit ℓ2 length before clustering. Vary the value of k from 1 to 9 and compute the B-CUBED precision, recall, and F-score for each set of clusters. Plot k on the horizontal axis and the B-CUBED precision, recall and F-score on the vertical axis in the same plot.


  5. (10 marks) Run the k-medians clustering algorithm you implemented in part (2) over the unnormalised objects. Vary the value of k from 1 to 9 and compute the B-CUBED precision, recall, and F-score for each set of clusters. Plot k on the horizontal axis and the B-CUBED precision, recall and F-score on the vertical axis in the same plot.


  6. (10 marks) Now re-run the k-medians clustering algorithm you implemented in part (2) but normalise each object (vector) to unit ℓ2 length before clustering. Vary the value of k from 1 to 9 and compute the B-CUBED precision, recall, and F-score for each set of clusters. Plot k on the horizontal axis and the B-CUBED precision, recall and F-score on the vertical axis in the same plot.


  7. (10 marks) Compare the different clusterings you obtained in (3)-(6). Discuss in which setting you obtained the best clustering for this dataset.
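
Tasks (3)-(6) rely on two operations that the handout names but does not define: ℓ2 normalisation and the B-CUBED scores. The sketch below shows one way to compute both. It assumes the usual per-item B-CUBED definition: for each object, precision is the fraction of objects in its cluster that share its true label, and recall is the fraction of objects with its true label that fall in its cluster; both are averaged over all objects. Taking the F-score as the harmonic mean of the averaged precision and recall is also an assumption, since some formulations average a per-object F-score instead. The function names `normalise_l2` and `bcubed` are hypothetical.

```python
from collections import Counter
import numpy as np

def normalise_l2(X):
    """Scale each row of X to unit L2 length; all-zero rows are left as zeros."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0            # avoid division by zero
    return X / norms

def bcubed(labels, clusters):
    """Return (precision, recall, f_score) averaged over all objects."""
    n = len(labels)
    cluster_size = Counter(clusters)               # objects per cluster
    label_size = Counter(labels)                   # objects per true label
    pair_size = Counter(zip(clusters, labels))     # objects per (cluster, label)
    precision = recall = 0.0
    for lab, clu in zip(labels, clusters):
        same = pair_size[(clu, lab)]               # includes the object itself
        precision += same / cluster_size[clu]
        recall += same / label_size[lab]
    precision /= n
    recall /= n
    # Assumed convention: harmonic mean of the averaged precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Toy example: four objects, two true labels, two clusters.
toy_labels = ["a", "a", "b", "b"]
toy_clusters = [0, 0, 0, 1]
p, r, f = bcubed(toy_labels, toy_clusters)   # p = 2/3, r = 3/4, f = 12/17

X_unit = normalise_l2(np.array([[3.0, 4.0], [0.0, 0.0]]))
```

With k = 1 every object lands in the same cluster, so recall is 1 and precision is low; as k grows, precision tends to rise while recall falls, which is the trade-off the requested plots are meant to show.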
