best_FL_global_model.pt is selected from n-1 rounds of learning #802
-
It looks like best_FL_global_model.pt is only ever selected and saved before aggregation (EventType.BEFORE_AGGREGATE), using the initial metrics (MetaKey.INITIAL_METRICS) that the client sites compute before training (ValidateType.BEFORE_TRAIN_VALIDATE). In any given round, those metrics describe the global model the clients received, i.e. the output of the previous round. This essentially means that the "best" FL global model can only be chosen from the models up to round n-1; the global model produced by the final round n is never validated. So if the global model improved in round n, then during cross-site eval we see that SRV_FL_global_model.pt actually outperforms SRV_best_FL_global_model.pt. I'm wondering, could an additional event be fired, or a method called, after the rounds are complete to perform one final validation of the current FL global model across all sites and fire EventType.GLOBAL_BEST_MODEL_AVAILABLE accordingly?
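To make the off-by-one concrete, here is a toy simulation of the selection behaviour described above. It does not use any NVFlare APIs; the function name, the data layout, and the metric values are all made up for illustration, and it only assumes that each round scores the model the clients received rather than the one they produce.

```python
# Toy simulation only: no NVFlare calls, all names and numbers are hypothetical.

def select_best_before_aggregate(scores_by_version):
    """Pick the best model using only pre-training validation.

    scores_by_version[k] is the validation score of the global model after
    k rounds (k = 0 is the initial model). In round r the clients validate
    the model they received, i.e. version r - 1, so the model produced by
    the last round is never scored.
    """
    n_rounds = len(scores_by_version) - 1
    best_version, best_score = None, float("-inf")
    for r in range(1, n_rounds + 1):
        received = r - 1  # version validated at the start of round r
        score = scores_by_version[received]
        if score > best_score:
            best_version, best_score = received, score
    return best_version, best_score


# Made-up metric that is still improving over 4 rounds (versions 0..4).
scores = [0.50, 0.61, 0.70, 0.74, 0.79]
best_version, best_score = select_best_before_aggregate(scores)
print(best_version, best_score)  # 3 0.74
print(scores[-1])                # 0.79, the final model, never considered
```

With a metric that is still rising, the selected "best" model is version n-1 even though version n scores higher, which matches what cross-site eval shows.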
Replies: 3 comments 15 replies
-
Yes, your observation is correct. You could add a task to the workflow that does a final evaluation before completing the run.
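As a rough sketch of what that final step would need to decide (plain Python, not NVFlare code; `pick_best_with_final_eval` and its arguments are hypothetical names, and how the extra task collects `final_score` from the sites is left out):

```python
# Sketch only: hypothetical helper, not part of NVFlare.

def pick_best_with_final_eval(in_round_scores, final_version, final_score):
    """Compare the models validated during training with the final model.

    in_round_scores: {model_version: score} gathered before each aggregation.
    final_version, final_score: the last global model and its score from the
    extra evaluation task run after the last round.
    """
    candidates = dict(in_round_scores)  # copy so the caller's dict is untouched
    candidates[final_version] = final_score
    best_version = max(candidates, key=candidates.get)
    return best_version, candidates[best_version]


in_round = {0: 0.50, 1: 0.61, 2: 0.70, 3: 0.74}

# Model still improving: the final model becomes the best one.
print(pick_best_with_final_eval(in_round, 4, 0.79))  # (4, 0.79)

# Model got worse in the last round: the earlier best is kept.
print(pick_best_with_final_eval(in_round, 4, 0.72))  # (3, 0.74)
```

The final model only replaces the saved best model when it actually scores higher, so the existing selection still protects against a late drop in validation performance.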
-
Hi, does this mean that the metric (e.g., accuracy) of SRV_best_FL_global_model could be worse than that of SRV_FL_global_model? I am seeing this situation now.
-
It could be, if your model is still converging. Typically, best global model selection is useful for avoiding overfitting to the training data, on the assumption that validation performance will decrease later in training.