Prophecis v0.2.0

@alexzyWu alexzyWu released this 15 Mar 12:50
· 73 commits to master since this release

Prophecis 0.2.0 release

Prophecis v0.2.0 mainly publishes the distributed modeling module (DI). The module is based on FfDL and mainly provides single-machine modeling and distributed TensorFlow tasks.

Enhancement

[1] Add Prophecis-DI Rest Module. #12
[2] Add Prophecis-DI Trainer & JobMonitor Module, which is responsible for managing the task lifecycle. #13
[3] Add Prophecis-DI LCM Module, which is responsible for task scheduling and for building single-machine and distributed tasks. #8
[4] Add Prophecis-DI Storage Module, which is responsible for operating storage backends such as Minio, ES, and Mongo. #14
[5] Add Log in CLI Program, a command-line interface tool. #11

Bugfix

[1] Fix Helm Chart Setting Error. #16

The Prophecis-DI module is built on FfDL. The main modifications are as follows:

[1] Integrate Kubeflow Arena to provide distributed TensorFlow task capability.
[2] Change how single-machine modeling tasks are created: remove the helper and jobmonitor from the task, and deploy a Job instead of a Pod.
[3] Change the log collection service to a DaemonSet and the collection tool to Fluent Bit.
[4] Move task status updates to an independent service, the job monitor.
[5] Add user GUID control in the container data directory.
[6] Enhance the CLI: add parameter replacement for YAML templates, and change the train command to use a WebSocket connection that provides logs and state.
[7] Change the code file storage server to Minio.
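
To illustrate the YAML template parameter replacement mentioned in item [6], here is a minimal Python sketch. It is not the Prophecis CLI implementation: the `${name}` placeholder syntax, the `render_template` helper, and the template field names are all assumptions made for illustration.

```python
from string import Template

# Hypothetical job template; the field names are illustrative only and
# are not taken from the Prophecis or FfDL manifest format.
JOB_TEMPLATE = """\
name: ${job_name}
framework:
  name: tensorflow
  command: python train.py --epochs ${epochs}
learners: ${learners}
"""


def render_template(template_text: str, params: dict) -> str:
    """Replace ${key} placeholders with values from params.

    Raises KeyError if the template references a parameter
    that was not supplied.
    """
    return Template(template_text).substitute(params)


rendered = render_template(
    JOB_TEMPLATE,
    {"job_name": "mnist-demo", "epochs": 5, "learners": 2},
)
print(rendered)
```

Failing on a missing parameter, rather than leaving the placeholder in place, means an incomplete manifest is rejected before a task is submitted.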