
[ Back to index ]

# MLCommons Task Force on Automation and Reproducibility

## Goals

## Mission

This task force was established by MLCommons and the cTuning foundation in 2022 to apply the automation and reproducibility methodology and open-source tools developed with ACM, IEEE and the cTuning foundation to run MLPerf benchmarks out of the box across software, hardware, models and data from any vendor, with the help of the MLCommons CM automation language and the MLCommons CK playground.
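
As a brief illustration, here is a minimal sketch of driving the CM automation language from Python via the `cmind` package. It assumes CM is installed with the `mlcommons@ck` script repository pulled; the `detect,os` script tags and the `quiet` flag are assumptions that may differ across CM versions.

```python
# Minimal sketch of driving the MLCommons CM automation language from Python.
# Assumes CM is installed and the main script repository has been pulled:
#   pip install cmind
#   cm pull repo mlcommons@ck
import cmind

# Run a CM script by its tags; CM resolves and caches dependencies
# behind a unified interface. The 'detect,os' tags are an assumption
# and may differ across versions of the CM script repository.
result = cmind.access({'action': 'run',
                       'automation': 'script',
                       'tags': 'detect,os',
                       'quiet': True})

# CM calls return a dictionary; a non-zero 'return' code signals an error.
if result['return'] > 0:
    raise RuntimeError(result.get('error', 'CM script failed'))
```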

We use this open-source technology to organize reproducibility, replicability and optimization challenges to reproduce results from research papers and MLPerf submissions, optimize them for accuracy, performance, power consumption, size, cost and other metrics, and validate them in real-world applications.

We successfully validated the latest version of this open technology during the 1st collaborative challenge to run the MLPerf inference v3.0 benchmark across diverse models, software and hardware from Neural Magic, Qualcomm, Nvidia, Intel, AMD, Microsoft, Amazon, Google, Krai, cKnowledge, the cTuning foundation, OctoML, Deelvin, DELL, HPE, Lenovo, Hugging Face and Apple. CK and CM have helped automate more than 80% of all recent MLPerf inference benchmark submissions (and 98% of all power results), make them more reproducible and reusable, and achieve record inference performance on the latest Qualcomm and Nvidia devices.

Our ultimate mission is to help all MLCommons members and the community slash their benchmarking, development, optimization and operational costs and accelerate innovation: they should be able to use the CK playground and the CM language to automatically generate the most efficient, reproducible and deployable application from the most suitable combination of software, hardware and models, based on their requirements, constraints and MLPerf results.

## Discussions

## Chairs and Tech Leads

## Development plan

### 2023

* DONE: prototype the CM (CK2) automation to let the community submit MLPerf inference v3.0 results across any software and hardware stack; our technology powered 4,000+ results across diverse cloud and edge platforms with different versions of PyTorch, ONNX, TFLite, TF and TVM targeting diverse CPUs and GPUs, announced at the beginning of April.
* Prototype an open-source on-prem CK platform with a public API to automate SW/HW co-design for AI, ML and other emerging workloads based on user requirements and constraints.
* Run a collaborative CK challenge for the community to reproduce, optimize and submit results to MLPerf inference v3.0 (98% of all results were automated by the MLCommons CK technology).
* Launch a new CK challenge to help MLCommons organizations and the community use our platform to prepare, optimize and compare their MLPerf inference v3.1 submissions on any SW/HW stack.
* Enhance the MLCommons CK2/CM automation meta-framework to support our platform across any SW/HW stack from MLCommons members and the community (a hypothetical usage sketch follows this list).
* Enhance the MLPerf C++ inference template library (MITL) to run and optimize MLPerf inference across any AI/ML/SW/HW stack.
* Enhance the light MLPerf inference application to benchmark any ML model on any SW/HW stack without requiring data sets or accuracy checks.
* Enhance our platform and automation framework to support reproducibility initiatives and studies at conferences and journals across rapidly evolving software, hardware and data (in collaboration with the cTuning foundation, ACM, IEEE and NeurIPS).
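
As a hypothetical illustration of the meta-framework item above, the following sketch launches an MLPerf inference benchmark run through the same `cmind` Python API. The `run-mlperf,inference,_find-performance` tags and the `model`, `device` and `implementation` flags are assumptions based on public CM documentation and may differ between CM versions.

```python
# Hypothetical sketch of preparing an MLPerf inference run via the CM
# Python API. All tags and flag names below are assumptions based on
# public CM documentation; consult the mlcommons@ck repository for the
# exact interface of your CM version.
import cmind

result = cmind.access({
    'action': 'run',
    'automation': 'script',
    'tags': 'run-mlperf,inference,_find-performance',
    'model': 'resnet50',            # benchmark model (assumed flag)
    'device': 'cpu',                # target device (assumed flag)
    'implementation': 'reference',  # reference implementation (assumed flag)
    'quiet': True,
})

if result['return'] > 0:
    raise RuntimeError(result.get('error', 'MLPerf run via CM failed'))
```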

### 2022

Archive of 2022 tasks.

## Resources

## Acknowledgments

This task force is supported by MLCommons, the cTuning foundation, cKnowledge and individual contributors.