Unsupervised_learning

Task 1:

The file cluster1.csv contains 500 data points, each with two features. Python is used to partition the data set into three clusters with k-means (the cluster count of three is given in the problem statement of the coursework assignment). The resulting clusters are plotted. The k-means algorithm is run with different centroid seeds, and the clustering performance is evaluated visually.
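The seed comparison described above could be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn is used, that cluster1.csv has two numeric columns and no header row, and the helper names `load_points` and `kmeans_over_seeds` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def load_points(path):
    # Assumption: the CSV holds two numeric feature columns and no header row.
    return np.loadtxt(path, delimiter=",")


def kmeans_over_seeds(points, n_clusters=3, seeds=(0, 1, 2, 3, 4)):
    """Fit k-means once per seed and return {seed: (labels, inertia)}."""
    results = {}
    for seed in seeds:
        # init="random" with n_init=1 makes each run depend on its seed alone,
        # so differences between the resulting partitions stay visible.
        model = KMeans(
            n_clusters=n_clusters, init="random", n_init=1, random_state=seed
        )
        labels = model.fit_predict(points)
        results[seed] = (labels, model.inertia_)
    return results
```

Calling `kmeans_over_seeds(load_points("cluster1.csv"))` would give one labelling per seed. Each labelling can then be drawn as a scatter plot coloured by label for the visual comparison. The inertia (within-cluster sum of squares) is returned only as a numeric cross-check alongside the plots.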

Give reasons why the data set contains five clusters rather than three.

Task 2:

The file cluster2.csv contains 1000 data points, each with two features. For this data set, neither the number of clusters nor the best-suited model (for example K-means or GMM) is specified. The code therefore evaluates each model while identifying the optimal cluster count.
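One way to carry out such an evaluation is sketched below. The scoring criteria are assumptions, since the task text does not name any: the silhouette score is used for both models and the Bayesian information criterion (BIC) for the GMM. The function names `score_cluster_counts` and `best_k_by_silhouette` are hypothetical, and scikit-learn is assumed.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture


def _silhouette_or_nan(points, labels):
    # The silhouette score is undefined when only one cluster is populated.
    if len(np.unique(labels)) < 2:
        return float("nan")
    return silhouette_score(points, labels)


def score_cluster_counts(points, k_values=range(2, 9), seed=0):
    """Return one row per k: (k, k-means silhouette, GMM silhouette, GMM BIC)."""
    rows = []
    for k in k_values:
        km_labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(points)
        gmm = GaussianMixture(n_components=k, n_init=5, random_state=seed).fit(points)
        gmm_labels = gmm.predict(points)
        rows.append(
            (
                k,
                _silhouette_or_nan(points, km_labels),
                _silhouette_or_nan(points, gmm_labels),
                gmm.bic(points),  # lower BIC indicates a better fit/complexity trade-off
            )
        )
    return rows


def best_k_by_silhouette(rows):
    # Pick the k whose k-means silhouette score is highest.
    return max(rows, key=lambda row: row[1])[0]
```

A higher silhouette score and a lower BIC both point towards a better cluster count. The two criteria need not agree, so the scores are best read alongside scatter plots of the resulting partitions.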

Portfolio_2_task_2_1.ipynb
Portfolio_2_task_2_2.ipynb
README.md
cluster1.csv
cluster2.csv

NithinMathewJosephAston/Unsupervised_learning
