Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ModelArtifactConfig creation refactor #206

Merged
merged 3 commits into from
Feb 13, 2024

Conversation

masahi
Copy link
Member

@masahi masahi commented Feb 13, 2024

As of #192, StagingEngine needs to have model_artifact_config, which is created in the worker process. Creating the config in the constructor again, via get_model_artifact_config, is problematic for PT models since there is no build artifact for PT models (hence get_model_artifact_config cannot be used).

This PR does a refactor around how ModelArtifactConfig is constructed. Rather than creating it inside PagedCacheModel, we are now creating it at the beginning of the engine initialization. So the created config can be passed to all of StagingEngine, StagingEngineWorker, and SyncEngine.

Note that the same update needs to be applied to ollm in the next revision bump.

It also fixes the issue of SamplingParams not being verified correctly due to how it is initialized.

@sunggg @yelite

Copy link
Member

@sunggg sunggg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one comment.

# ignore_eos=request.ignore_eos,
# use_beam_search=request.use_beam_search,
)
sampling_params = SamplingParams()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we simply do

sampling_params = SamplingParams(
   presence_penalty=request.presence_penalty,
   ....
) 

Although we need to handle checks like if request.presence_penalty is not None:, I think this way we can make sure that these new values can go through the __post_init__.
And I think sampling_params.verify() can also stay within __post_init__ if this works.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that approach first, but I didn't want to repeat the default values specified in sampling_params.py here.

@sunggg sunggg merged commit 11ae521 into octoml:batch-serving Feb 13, 2024
1 check passed
Lunderberg pushed a commit to Lunderberg/mlc-llm that referenced this pull request Feb 27, 2024
This PR brings the database merge feature, which automatically merges
all database under the specified `--db-path` (which is `./log_db/` by
default).

For example, when there are two database under `./log_db/`:
`./log_db/vicuna` and `./log_db/redpajama`, we won't need to manually
specify one database to use. Instead, `build.py` will now load each
of the database and merge them into one in memory.

We are also backward compatible. Say if we have only one database which
is `./log_db` itself, we are still able to load it in.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants