Here is a basic KMeans algorithm in Spark/Scala. It uses plain arrays and makes no use of MLlib. If you are in a hurry and want to build on something basic, this can be a good starting point. The input is a tab-delimited text file in which each row represents a point. Set numClusters (the number of clusters you'd like to have) and numIterations before using it. At the end, the code prints out the centers and the center to which each point is mapped. Please let me know if you have any questions ([email protected]).
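The steps described above (parse tab-delimited rows into points, assign each point to its nearest center, recompute the centers, repeat) can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: it runs on plain Scala collections so that it needs no Spark dependency, and the object name `KMeansSketch`, the helper names, the sample rows, and the choice to seed the centers with the first `numClusters` points are all assumptions made for the example. Only `numClusters` and `numIterations` come from the description.

```scala
object KMeansSketch {
  type Point = Array[Double]

  // Parse one tab-delimited row into a point.
  def parseLine(line: String): Point =
    line.trim.split("\t").map(_.toDouble)

  // Squared Euclidean distance between two points of equal dimension.
  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Index of the center nearest to p (ties go to the lowest index).
  def closestCenter(p: Point, centers: Array[Point]): Int =
    centers.indices.minBy(i => squaredDistance(p, centers(i)))

  // Component-wise mean of a non-empty group of points.
  def mean(points: Seq[Point]): Point = {
    val dim = points.head.length
    Array.tabulate(dim)(d => points.map(_(d)).sum / points.size)
  }

  def run(points: Seq[Point], numClusters: Int, numIterations: Int): Array[Point] = {
    // Assumption: seed the centers with the first numClusters points.
    var centers: Array[Point] = points.take(numClusters).toArray
    for (_ <- 0 until numIterations) {
      // Assignment step: group points by the index of their nearest center.
      val groups = points.groupBy(p => closestCenter(p, centers))
      // Update step: move each center to the mean of its group;
      // a center with no assigned points stays where it is.
      centers = centers.indices
        .map(i => groups.get(i).map(mean).getOrElse(centers(i)))
        .toArray
    }
    centers
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample rows standing in for the tab-delimited input file.
    val rows = Seq("1.0\t1.0", "1.0\t2.0", "9.0\t9.0", "10.0\t9.0")
    val points = rows.map(parseLine)
    val centers = run(points, numClusters = 2, numIterations = 5)

    // Print the centers, then the center index each point maps to.
    centers.foreach(c => println(c.mkString("\t")))
    points.foreach { p =>
      val label = closestCenter(p, centers)
      println(p.mkString("\t") + " -> center " + label)
    }
  }
}
```

In a Spark version, the points would presumably live in an RDD and the grouping and averaging would be done with RDD operations instead of `groupBy` on a local `Seq`; the per-point logic in `closestCenter` would stay the same.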
kardes/spark-scala-KMeans