🐛 Bug: Ersilia fetch/serve fails but model appears on catalog #1505
Comments
This bug is very similar to what @Abellegese was describing having faced with BentoML. But in any case, we should just delete the model and its artifacts if it fails to fetch. We presently only do that if the model fails to generate a Standard Model Example, as you can see in this snippet from the referenced code:

```python
fr = await self._fetch(model_id)
if fr.fetch_success:
    try:
        self._standard_csv_example(model_id)
    except StandardModelExampleError:
        self.logger.debug("Standard model example failed, deleting artifacts")
        do_delete = yes_no_input(
            "Do you want to delete the model artifacts? [Y/n]",
            default_answer="Y",
        )
        if do_delete:
            md = ModelFullDeleter(overwrite=False)
            md.delete(model_id)
        return FetchResult(
            fetch_success=False,
            reason="Could not successfully run a standard example from the model.",
        )
    else:
        self.logger.debug("Writing model source to file")
        model_source_file = os.path.join(
            self._model_path(model_id), MODEL_SOURCE_FILE
        )
        try:
            os.makedirs(self._model_path(model_id), exist_ok=True)
        except OSError as error:
            self.logger.error(f"Error during folder creation: {error}")
        with open(model_source_file, "w") as f:
            f.write(self.model_source)
        return FetchResult(
            fetch_success=True, reason="Model fetched successfully"
        )
else:
    return fr
```

I think we should encapsulate all BentoML-related sub-process calls with a general …
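A minimal sketch of what such a general wrapper might look like (the names `CommandError` and `run_cli_command` are hypothetical, not part of the Ersilia codebase; a failure anywhere in the sub-process call surfaces as one exception type, so the caller can delete the model artifacts in a single `except` clause):

```python
import subprocess


class CommandError(Exception):
    """Raised when an external CLI call fails (hypothetical helper)."""


def run_cli_command(executable, args, timeout=300):
    # Run an external command, e.g. run_cli_command("bentoml", ["--version"]),
    # and surface any failure mode (missing binary, timeout, non-zero exit)
    # as a single CommandError so the fetcher can clean up in one place.
    try:
        result = subprocess.run(
            [executable] + list(args),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        raise CommandError(f"{executable} failed to start: {exc}") from exc
    if result.returncode != 0:
        raise CommandError(
            result.stderr.strip() or f"{executable} exited with {result.returncode}"
        )
    return result.stdout
```

The caller would then wrap every BentoML invocation in a single `try`/`except CommandError` block and trigger `ModelFullDeleter` from there.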
Yes, exactly @DhanshreeA @GemmaTuron. A quick fix is to do the following:

```shell
pip uninstall bentoml
# then just call this bentoml command
bentoml --version
```
@OlawumiSalaam this might be interesting, and definitely more of a deep dive than your current task. Please take a look when you can.
eos69p9_serve.txt
Describe the bug.
Hi,
I tried to get a model through the CLI directly using the serve command (Docker inactive, so it will try from S3), but it crashed (see attached error log). Nonetheless, if immediately after I run

```shell
ersilia catalog --local --more
```

the model appears in the catalog. The model source does not appear, which indicates the fetch has failed, but it would be good to have a way to catch this, or to add a note telling users that a certain model is not working.
I'll tag this as an addition as it is not critical.
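One way the catalog could catch this is a sketch like the following, assuming a successful fetch finishes by writing a model-source file into the model folder (as in the snippet quoted in the comment above; the function name and default file name here are hypothetical):

```python
import os

# Hypothetical marker file name; in Ersilia this would be MODEL_SOURCE_FILE.
SOURCE_MARKER = "model_source.txt"


def fetch_looks_complete(model_dir, source_file=SOURCE_MARKER):
    # A model folder left behind by a crashed fetch will typically be
    # missing the source marker written at the very end of a successful
    # fetch; a catalog listing could use a check like this to flag or
    # hide such broken models instead of showing them as available.
    return os.path.isdir(model_dir) and os.path.isfile(
        os.path.join(model_dir, source_file)
    )
```

A check like this runs cheaply at catalog time, so partially fetched models could be annotated as "incomplete" rather than listed as usable.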
Describe the steps to reproduce the behavior
No response
Operating environment
Ubuntu 24.02 LTS