University of Westminster Software Engineering Degree Program Machine Learning and Data Mining Coursework

thejanv/ML_Coursework

ML_Coursework

1st Objective (partitioning clustering)

  1. Pre-processing tasks (2 marks for scaling and 5 marks for outlier detection/removal)
  2. Determine the number of cluster centres by showing all necessary steps/methods via “automated” tools (1 mark for each one of these “automated” tools)
  3. K-means analysis for the chosen k (all attributes used) and show all requested outputs
  4. Show the silhouette plot (2 marks) and provide related discussion on this output, following this k-means attempt (2 marks)
  5. Apply PCA to the vehicle dataset and show all related R-outputs (2 marks). Create a new dataset whose attributes are the leading PCs, keeping as many as are needed for the cumulative score to exceed 92%, and provide a discussion for your choice (2 marks).
  6. Determine the number of cluster centres by showing all necessary steps/methods via “automated” tools (1 mark for each one of these “automated” tools)
  7. K-means analysis for this “pca-based” dataset for the chosen k and show all requested outputs
  8. Show the silhouette plot (2 marks) and provide related discussion on this output, following this “pca-based” k-means attempt (2 marks)
  9. Implement and show the Calinski-Harabasz index. Provide a brief discussion on the outcome of this index.
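
As an illustration of the Calinski-Harabasz step, the index can be computed directly from a set of points and their cluster labels: it is the ratio of between-cluster dispersion (divided by k - 1) to within-cluster dispersion (divided by n - k), so higher values suggest better separated clusters. The sketch below is a minimal, standard-library Python version and not the coursework's own implementation (the tasks above ask for R outputs); the function name `calinski_harabasz` and the toy data in the usage note are assumptions made for this example.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Return the Calinski-Harabasz index for points with cluster labels.

    points: list of equal-length numeric lists (one row per observation).
    labels: list of cluster labels, one per row (hypothetical input format).
    """
    n = len(points)
    dims = len(points[0])

    # Group the observations by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("the index needs 2 <= k < n clusters")

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dims)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # between-cluster sum of squares
    within = 0.0   # within-cluster sum of squares
    for members in clusters.values():
        centre = mean(members)
        between += len(members) * sq_dist(centre, overall)
        within += sum(sq_dist(p, centre) for p in members)

    if within == 0:
        return float("inf")  # every point sits exactly on its centre
    return (between / (k - 1)) / (within / (n - k))
```

For example, `calinski_harabasz([[0, 0], [0, 2], [10, 0], [10, 2]], [0, 0, 1, 1])` scores a labelling that separates the two groups of points much higher than the labelling `[0, 1, 0, 1]`, which mixes them. Comparing the index across candidate values of k follows the same pattern.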

2nd Objective (MLP)

  1. Brief discussion of the various methods used for defining the input vector in electricity load forecasting problems
  2. Evidence of various adopted input vectors and the related input/output matrices for both “AR” (4 marks) and “NARX” (3 marks) based approaches
  3. Evidence of correct normalisation/de-normalisation (3 marks) and brief discussion of its necessity for MLP networks (3 marks)
  4. Implement a number of MLPs for the “AR” approach, using various internal structures (layers/nodes), input variables and network parameters, and show their performances (based on testing data) in the comparison table through the provided statistical indices (4 marks for structures with different input vectors, 8 marks for different internal NN structures).
  5. Discussion of the meaning of these four statistical indices (2 marks for each index)
  6. Creation of the comparison matrix for the “AR” case
  7. Discuss the issue of “efficiency” with your two best NN structures (for the “AR” approach)
  8. Implement a number of MLPs for the “NARX” approach, following the same procedure as the previous “AR” case. Provide a brief discussion. (2 marks for structures with different input vectors, 4 marks for different internal NN structures, 2 marks for the comparison table and 2 marks for the discussion).
  9. Provide your best results both graphically (your prediction output vs. desired output) and via performance indices (2 marks for the graphical display and 2 marks for showing the requested statistical indices)
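
The input/output matrix and normalisation steps for the “AR” approach can be sketched as follows. This is a minimal standard-library Python illustration under stated assumptions, not the coursework's own code: the helper names (`make_ar_matrix`, `min_max_fit`, `normalise`, `denormalise`) are hypothetical, min-max scaling is assumed as the normalisation method, and RMSE and MAE are shown only as two common error measures because the four statistical indices are not named above.

```python
import math


def make_ar_matrix(series, lags):
    """Build AR inputs [y(t-lags), ..., y(t-1)] and targets y(t)."""
    inputs, targets = [], []
    for t in range(lags, len(series)):
        inputs.append(series[t - lags:t])
        targets.append(series[t])
    return inputs, targets


def min_max_fit(values):
    """Return the (min, max) pair used for scaling and for undoing it."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("cannot scale a constant series")
    return lo, hi


def normalise(values, lo, hi):
    """Map values into [0, 1] using the fitted min and max."""
    return [(v - lo) / (hi - lo) for v in values]


def denormalise(values, lo, hi):
    """Map scaled values back to the original units."""
    return [v * (hi - lo) + lo for v in values]


def rmse(actual, predicted):
    """Root mean squared error."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

With a series of six load values and `lags=2`, `make_ar_matrix` yields four input rows of two lagged values each and four matching targets. Predictions produced on the scaled data would be passed through `denormalise` before computing the error measures, so that the reported errors are in the original load units. A “NARX” variant would append exogenous columns to each input row in the same loop.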