diff --git a/docs/book/.gitbook/assets/vertexai_experiment_tracker_tb.png b/docs/book/.gitbook/assets/vertexai_experiment_tracker_tb.png new file mode 100644 index 00000000000..7db462df919 Binary files /dev/null and b/docs/book/.gitbook/assets/vertexai_experiment_tracker_tb.png differ diff --git a/docs/book/.gitbook/assets/vertexai_experiment_tracker_ui.png b/docs/book/.gitbook/assets/vertexai_experiment_tracker_ui.png new file mode 100644 index 00000000000..6a0fabed811 Binary files /dev/null and b/docs/book/.gitbook/assets/vertexai_experiment_tracker_ui.png differ diff --git a/docs/book/component-guide/experiment-trackers/vertexai.md b/docs/book/component-guide/experiment-trackers/vertexai.md new file mode 100644 index 00000000000..92d5464f2c1 --- /dev/null +++ b/docs/book/component-guide/experiment-trackers/vertexai.md @@ -0,0 +1,315 @@ +--- +description: Logging and visualizing experiments with Vertex AI Experiment Tracker. +--- + +# Vertex AI Experiment Tracker + +The Vertex AI Experiment Tracker is an [Experiment Tracker](./experiment-trackers.md) flavor provided with the GCP ZenML integration. It uses the [Vertex AI tracking service](https://cloud.google.com/vertex-ai/docs/experiments/intro-vertex-ai-experiments) to log and visualize information from your pipeline steps (e.g., models, parameters, metrics). + +## When would you want to use it? + +[Vertex AI Experiment Tracker](https://cloud.google.com/vertex-ai/docs/experiments/intro-vertex-ai-experiments) is a managed service by Google Cloud that you would normally use in the iterative ML experimentation phase to track and visualize experiment results. That doesn't mean that it cannot be repurposed to track and visualize the results produced by your automated pipeline runs, as you make the transition toward a more production-oriented workflow. 
+ +You should use the Vertex AI Experiment Tracker: + +* if you have already been using Vertex AI to track experiment results for your project and would like to continue doing so as you incorporate MLOps workflows and best practices into your project through ZenML. +* if you are looking for a more visually interactive way of navigating the results produced by your ZenML pipeline runs (e.g., models, metrics, datasets). +* if you are building machine learning workflows in the Google Cloud ecosystem and want a managed experiment tracking solution tightly integrated with other Google Cloud services. + +You should consider one of the other [Experiment Tracker flavors](./experiment-trackers.md#experiment-tracker-flavors) if you have never worked with Vertex AI before and would rather use another experiment tracking tool that you are more familiar with, or if you are not using GCP. + +## How do you configure it? + +The Vertex AI Experiment Tracker flavor is provided by the GCP ZenML integration. You need to install it on your local machine to be able to register a Vertex AI Experiment Tracker and add it to your stack: + +```shell +zenml integration install gcp -y +``` + +### Configuration Options + +To properly register the Vertex AI Experiment Tracker, you can provide several configuration options tailored to your needs. Here are the main configurations you may want to set: + +* `project`: Optional. GCP project name. If `None`, it will be inferred from the environment. +* `location`: Optional. GCP location where your experiments will be created. If not set, defaults to `us-central1`. +* `staging_bucket`: Optional. The default staging bucket to use to stage artifacts. In the form `gs://...` +* `service_account_path`: Optional. A path to the service account credential JSON file to be used to interact with Vertex AI Experiment Tracker. 
Please check the [Authentication Methods](vertexai.md#authentication-methods) chapter for more details. + +With the project, location and staging_bucket, registering the Vertex AI Experiment Tracker can be done as follows: + +```shell +# Register the Vertex AI Experiment Tracker +zenml experiment-tracker register vertex_experiment_tracker \ + --flavor=vertex \ + --project= \ + --location= \ + --staging_bucket=gs:// + +# Register and set a stack with the new experiment tracker +zenml stack register custom_stack -e vertex_experiment_tracker ... --set +``` + +### Authentication Methods + +Integrating and using a Vertex AI Experiment Tracker in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Implicit Authentication_ method. However, the recommended way to authenticate to the Google Cloud Platform is through a [GCP Service Connector](../../how-to/infrastructure-deployment/auth-management/gcp-service-connector.md). This is particularly useful if you are configuring ZenML stacks that combine the Vertex AI Experiment Tracker with other remote stack components also running in GCP. + +> **Note**: Regardless of your chosen authentication method, you must grant your account the necessary roles to use Vertex AI Experiment Tracking. +> * `roles/aiplatform.user` role on your project, which allows you to create, manage, and track your experiments within Vertex AI. +> * `roles/storage.objectAdmin` role on your GCS bucket, granting the ability to read and write experiment artifacts, such as models and datasets, to the storage bucket. + +{% tabs %} +{% tab title="Implicit Authentication" %} +This configuration method assumes that you have authenticated locally to GCP using the [`gcloud` CLI](https://cloud.google.com/sdk/gcloud) (e.g., by running gcloud auth login). 
+ +> **Note**: This method is quick for local setups but is unsuitable for team collaborations or production environments due to its lack of portability. + +We can then register the experiment tracker as follows: + +```shell +# Register the Vertex AI Experiment Tracker +zenml experiment-tracker register \ + --flavor=vertex \ + --project= \ + --location= \ + --staging_bucket=gs:// + +# Register and set a stack with the new experiment tracker +zenml stack register custom_stack -e vertex_experiment_tracker ... --set +``` + +{% endtab %} + +{% tab title="GCP Service Connector (recommended)" %} +To set up the Vertex AI Experiment Tracker to authenticate to GCP, it is recommended to leverage the many features provided by the [GCP Service Connector](../../how-to/infrastructure-deployment/auth-management/gcp-service-connector.md) such as auto-configuration, best security practices regarding long-lived credentials and reusing the same credentials across multiple stack components. + +If you don't already have a GCP Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You have the option to configure a GCP Service Connector that can be used to access more than one type of GCP resource: + +```sh +# Register a GCP Service Connector interactively +zenml service-connector register --type gcp -i +``` + +After having set up or decided on a GCP Service Connector to use, you can register the Vertex AI Experiment Tracker as follows: + +```shell +# Register the Vertex AI Experiment Tracker +zenml experiment-tracker register \ + --flavor=vertex \ + --project= \ + --location= \ + --staging_bucket=gs:// + +zenml experiment-tracker connect --connector + +# Register and set a stack with the new experiment tracker +zenml stack register custom_stack -e vertex_experiment_tracker ... 
--set +``` + +{% endtab %} + +{% tab title="GCP Credentials" %} +When you register the Vertex AI Experiment Tracker, you can [generate a GCP Service Account Key](https://cloud.google.com/docs/authentication/application-default-credentials#attached-sa), store it in a [ZenML Secret](../../getting-started/deploying-zenml/secret-management.md) and then reference it in the Experiment Tracker configuration. + +This method has some advantages over the implicit authentication method: + +* you don't need to install and configure the GCP CLI on your host +* you don't need to care about enabling your other stack components (orchestrators, step operators and model deployers) to have access to the experiment tracker through GCP Service Accounts and Workload Identity +* you can combine the Vertex AI Experiment Tracker with other stack components that are not running in GCP + +For this method, you need to [create a user-managed GCP service account](https://cloud.google.com/iam/docs/service-accounts-create) and then [create a service account key](https://cloud.google.com/iam/docs/keys-create-delete#creating). + +With the service account key downloaded to a local file, you can register a ZenML secret and reference it in the Vertex AI Experiment Tracker configuration as follows: + +```shell +# Register the Vertex AI Experiment Tracker and reference the ZenML secret +zenml experiment-tracker register \ + --flavor=vertex \ + --project= \ + --location= \ + --staging_bucket=gs:// \ + --service_account_path=path/to/service_account_key.json + +# Register and set a stack with the new experiment tracker +zenml stack register custom_stack -e vertex_experiment_tracker ... --set +``` + +{% endtab %} +{% endtabs %} + +## How do you use it? + +To be able to log information from a ZenML pipeline step using the Vertex AI Experiment Tracker component in the active stack, you need to enable an experiment tracker using the `@step` decorator. 
Then use Vertex AI's logging or auto-logging capabilities as you normally would. + +Here are two examples demonstrating how to use the experiment tracker: + +### Example 1: Logging Metrics Using Built-in Methods + +This example demonstrates how to log time-series metrics using `aiplatform.log_time_series_metrics` from within a Keras callback, and how to use `aiplatform.log_metrics` to log specific metrics and `aiplatform.log_params` to log experiment parameters. The logged metrics can then be visualized in the UI of Vertex AI Experiment Tracker and in the integrated TensorBoard instance. + +> **Note:** To use the autologging functionality, ensure that the `google-cloud-aiplatform` library is installed with the Autologging extension. You can do this by running the following command: +> ```bash +> pip install google-cloud-aiplatform[autologging] +> ``` + +```python +import numpy as np +import tensorflow as tf +from google.cloud import aiplatform +from zenml import step + +class VertexAICallback(tf.keras.callbacks.Callback): + def on_epoch_end(self, epoch, logs=None): + logs = logs or {} + metrics = {key: value for key, value in logs.items() if isinstance(value, (int, float))} + aiplatform.log_time_series_metrics(metrics=metrics, step=epoch) + + +@step(experiment_tracker="") +def train_model( + config: TrainerConfig, + x_train: np.ndarray, + y_train: np.ndarray, + x_val: np.ndarray, + y_val: np.ndarray, +): + aiplatform.autolog() + + ... + + # Train the model, using the custom callback to log metrics into experiment tracker + model.fit( + x_train, + y_train, + validation_data=(x_val, y_val), + epochs=config.epochs, + batch_size=config.batch_size, + callbacks=[VertexAICallback()] + ) + + ... + + # Log specific metrics and parameters + aiplatform.log_metrics(...) + aiplatform.log_params(...) +``` + +### Example 2: Uploading TensorBoard Logs + +This example demonstrates how to use an integrated TensorBoard instance to directly upload training logs. 
This is particularly useful if you're already using TensorBoard in your projects and want to benefit from its detailed visualizations during training. You can initiate the upload using `aiplatform.start_upload_tb_log` and conclude it with `aiplatform.end_upload_tb_log`. Similar to the first example, you can also log specific metrics and parameters directly. + +> **Note:** To use TensorBoard logging functionality, ensure you have the `google-cloud-aiplatform` library installed with the TensorBoard extension. You can install it using the following command: +> ```bash +> pip install google-cloud-aiplatform[tensorboard] +> ``` + +```python +import numpy as np +import tensorflow as tf +from google.cloud import aiplatform +from zenml import step +from zenml.client import Client + + +@step(experiment_tracker="") +def train_model( + config: TrainerConfig, + gcs_path: str, + x_train: np.ndarray, + y_train: np.ndarray, + x_val: np.ndarray, + y_val: np.ndarray, +): + # get current experiment and run names + experiment_tracker = Client().active_stack.experiment_tracker + experiment_name = experiment_tracker.experiment_name + experiment_run_name = experiment_tracker.run_name + + # define a TensorBoard callback, logs are written to gcs_path + tensorboard_callback = tf.keras.callbacks.TensorBoard( + log_dir=gcs_path, + histogram_freq=1 + ) + # start the TensorBoard log upload + aiplatform.start_upload_tb_log( + tensorboard_experiment_name=experiment_name, + logdir=gcs_path, + run_name_prefix=f"{experiment_run_name}_", + ) + # pass the TensorBoard callback so that logs are actually written to gcs_path + model.fit( + x_train, + y_train, + validation_data=(x_val, y_val), + epochs=config.epochs, + batch_size=config.batch_size, + callbacks=[tensorboard_callback], + ) + + ... + + # end the TensorBoard log upload + aiplatform.end_upload_tb_log() + + aiplatform.log_metrics(...) + aiplatform.log_params(...) 
+``` + +{% hint style="info" %} +Instead of hardcoding an experiment tracker name, you can also use the [Client](../../reference/python-client.md) to dynamically use the experiment tracker of your active stack: + +```python +from zenml.client import Client + +experiment_tracker = Client().active_stack.experiment_tracker + +@step(experiment_tracker=experiment_tracker.name) +def tf_trainer(...): + ... +``` + +{% endhint %} + +### Experiment Tracker UI + +You can find the URL of the Vertex AI experiment linked to a specific ZenML run via the metadata of the step in which the experiment tracker was used: + +```python +from zenml.client import Client + +client = Client() +last_run = client.get_pipeline("").last_run +trainer_step = last_run.steps.get("") +tracking_url = trainer_step.run_metadata["experiment_tracker_url"].value +print(tracking_url) +``` + +This will be the URL of the corresponding experiment in Vertex AI Experiment Tracker. + +Below are examples of the UI for the Vertex AI Experiment Tracker and the integrated TensorBoard instance. + +**Vertex AI Experiment Tracker UI** +![VertexAI UI](../../.gitbook/assets/vertexai_experiment_tracker_ui.png) + +**TensorBoard UI** +![TensorBoard UI](../../.gitbook/assets/vertexai_experiment_tracker_tb.png) + +### Additional configuration + +For additional configuration of the Vertex AI Experiment Tracker, you can pass `VertexExperimentTrackerSettings` to specify an experiment name or choose a previously created TensorBoard instance. + +> **Note**: By default, Vertex AI will use the default TensorBoard instance in your project if you don't explicitly specify one. 
+ +```python +import numpy as np +from zenml import step +from zenml.integrations.gcp.flavors.vertex_experiment_tracker_flavor import VertexExperimentTrackerSettings + + +vertexai_settings = VertexExperimentTrackerSettings( + experiment="", + experiment_tensorboard="TENSORBOARD_RESOURCE_NAME" +) + +@step( + experiment_tracker="", + settings={"experiment_tracker": vertexai_settings}, +) +def step_one( + data: np.ndarray, +) -> np.ndarray: + ... +``` + +Check out [this docs page](../../how-to/pipeline-development/use-configuration-files/runtime-configuration.md) for more information on how to specify settings. + +
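Experiment names set through `VertexExperimentTrackerSettings` are validated against the pattern `[a-z0-9][a-z0-9-]{0,127}`, and the tracker normalizes pipeline and run names before handing them to Vertex AI. The following is a minimal standalone sketch of that normalization, assuming the same regular expressions used in this PR; `normalize_experiment_name` and `is_valid_experiment_name` are hypothetical helper names for illustration, not part of the ZenML API.

```python
import re

# Pattern that the settings validator in this PR enforces for experiment names.
EXPERIMENT_NAME_PATTERN = re.compile(r"^[a-z0-9][a-z0-9-]{0,127}$")


def normalize_experiment_name(name: str) -> str:
    """Hypothetical helper mirroring the tracker's private `_format_name`."""
    # Trim whitespace, lowercase, and replace every disallowed character with "-".
    cleaned = re.sub(r"[^a-z0-9-]", "-", name.strip().lower())
    # Cap the length at 128 characters and drop trailing hyphens.
    return cleaned[:128].rstrip("-")


def is_valid_experiment_name(name: str) -> bool:
    """Hypothetical helper: check a name against the validator's pattern."""
    return bool(EXPERIMENT_NAME_PATTERN.match(name))


candidate = normalize_experiment_name("My Experiment Name")
print(candidate, is_valid_experiment_name(candidate))
```

Normalizing a name yourself before passing it as `experiment` may help avoid a validation error at step configuration time. Note that this normalization does not guard against a leading hyphen, so an input such as `_draft` would still fail the pattern.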
diff --git a/src/zenml/integrations/gcp/__init__.py b/src/zenml/integrations/gcp/__init__.py index 0c3508546d8..231d3b9f62e 100644 --- a/src/zenml/integrations/gcp/__init__.py +++ b/src/zenml/integrations/gcp/__init__.py @@ -30,6 +30,7 @@ GCP_ARTIFACT_STORE_FLAVOR = "gcp" GCP_IMAGE_BUILDER_FLAVOR = "gcp" +GCP_VERTEX_EXPERIMENT_TRACKER_FLAVOR = "vertex" GCP_VERTEX_ORCHESTRATOR_FLAVOR = "vertex" GCP_VERTEX_STEP_OPERATOR_FLAVOR = "vertex" @@ -70,6 +71,7 @@ def flavors(cls) -> List[Type[Flavor]]: from zenml.integrations.gcp.flavors import ( GCPArtifactStoreFlavor, GCPImageBuilderFlavor, + VertexExperimentTrackerFlavor, VertexOrchestratorFlavor, VertexStepOperatorFlavor, ) @@ -77,6 +79,7 @@ def flavors(cls) -> List[Type[Flavor]]: return [ GCPArtifactStoreFlavor, GCPImageBuilderFlavor, + VertexExperimentTrackerFlavor, VertexOrchestratorFlavor, VertexStepOperatorFlavor, ] diff --git a/src/zenml/integrations/gcp/experiment_trackers/__init__.py b/src/zenml/integrations/gcp/experiment_trackers/__init__.py new file mode 100644 index 00000000000..16ab7042f80 --- /dev/null +++ b/src/zenml/integrations/gcp/experiment_trackers/__init__.py @@ -0,0 +1,18 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express +# or implied. See the License for the specific language governing +# permissions and limitations under the License. 
+"""Initialization for the VertexAI experiment tracker.""" + +from zenml.integrations.gcp.experiment_trackers.vertex_experiment_tracker import ( # noqa + VertexExperimentTracker, +) + +__all__ = ["VertexExperimentTracker"] diff --git a/src/zenml/integrations/gcp/experiment_trackers/vertex_experiment_tracker.py b/src/zenml/integrations/gcp/experiment_trackers/vertex_experiment_tracker.py new file mode 100644 index 00000000000..08ecfed36c4 --- /dev/null +++ b/src/zenml/integrations/gcp/experiment_trackers/vertex_experiment_tracker.py @@ -0,0 +1,211 @@ +# Copyright (c) ZenML GmbH 2022. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# https://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express +# or implied. See the License for the specific language governing +# permissions and limitations under the License. 
+"""Implementation of the VertexAI experiment tracker for ZenML.""" + +import re +from typing import TYPE_CHECKING, Dict, Optional, Type, cast + +from google.api_core import exceptions +from google.cloud import aiplatform +from google.cloud.aiplatform.compat.types import execution + +from zenml.constants import METADATA_EXPERIMENT_TRACKER_URL +from zenml.experiment_trackers.base_experiment_tracker import ( + BaseExperimentTracker, +) +from zenml.integrations.gcp.flavors.vertex_experiment_tracker_flavor import ( + VertexExperimentTrackerConfig, + VertexExperimentTrackerSettings, +) +from zenml.integrations.gcp.google_credentials_mixin import ( + GoogleCredentialsMixin, +) +from zenml.logger import get_logger +from zenml.metadata.metadata_types import Uri + +if TYPE_CHECKING: + from zenml.config.step_run_info import StepRunInfo + from zenml.metadata.metadata_types import MetadataType + +logger = get_logger(__name__) + + +class VertexExperimentTracker(BaseExperimentTracker, GoogleCredentialsMixin): + """Track experiments using VertexAI.""" + + @property + def config(self) -> VertexExperimentTrackerConfig: + """Returns the `VertexExperimentTrackerConfig` config. + + Returns: + The configuration. + """ + return cast(VertexExperimentTrackerConfig, self._config) + + @property + def settings_class(self) -> Type[VertexExperimentTrackerSettings]: + """Returns the `BaseSettings` settings class. + + Returns: + The settings class. + """ + return VertexExperimentTrackerSettings + + def prepare_step_run(self, info: "StepRunInfo") -> None: + """Configures a VertexAI run. + + Args: + info: Info about the step that will be executed. + """ + self._initialize_vertex(info=info) + self.experiment_name = self._get_experiment_name(info=info) + self.run_name = self._get_run_name(info=info) + + def get_step_run_metadata( + self, info: "StepRunInfo" + ) -> Dict[str, "MetadataType"]: + """Get component- and step-specific metadata after a step ran. 
+ + Args: + info: Info about the step that was executed. + + Returns: + A dictionary of metadata. + """ + experiment_name = self._get_experiment_name(info=info) + run_name = self._get_run_name(info=info) + tensorboard_resource_name = self._get_tensorboard_resource_name( + experiment=experiment_name + ) + dashboard_url = self._get_dashboard_url(experiment=experiment_name) + return { + METADATA_EXPERIMENT_TRACKER_URL: Uri(dashboard_url), + "tensorboard_resource_name": tensorboard_resource_name or "", + "vertex_run_name": run_name, + } + + def _format_name(self, name: str) -> str: + return re.sub(r"[^a-z0-9-]", "-", name.strip().lower())[:128].rstrip( + "-" + ) + + def _get_experiment_name(self, info: "StepRunInfo") -> str: + """Gets the experiment name. + + Args: + info: Info about the step that will be executed. + + Returns: + The experiment name. + """ + settings = cast( + VertexExperimentTrackerSettings, self.get_settings(info) + ) + name = settings.experiment or info.pipeline.name + return self._format_name(name) + + def _get_run_name(self, info: "StepRunInfo") -> str: + """Gets the run name. + + Args: + info: Info about the step that will be executed. + + Returns: + The run name. + """ + return self._format_name(info.run_name) + + def _get_dashboard_url(self, experiment: str) -> str: + """Gets the experiment dashboard URL. + + Args: + experiment: The name of the experiment. + + Returns: + The experiment dashboard URL. + """ + resource = aiplatform.Experiment(experiment_name=experiment) + return resource.dashboard_url + + def _get_tensorboard_resource_name(self, experiment: str) -> Optional[str]: + resource = aiplatform.Experiment( + experiment_name=experiment + ).get_backing_tensorboard_resource() + resource_name = ( + str(resource.resource_name) if resource is not None else None + ) + return resource_name + + def _initialize_vertex(self, info: "StepRunInfo") -> None: + """Initializes a VertexAI run. + + Args: + info: Info about the step that will be executed. 
+ """ + settings = cast( + VertexExperimentTrackerSettings, self.get_settings(info) + ) + experiment = self._get_experiment_name(info=info) + run_name = self._get_run_name(info=info) + credentials, project = self._get_authentication() + logger.info( + f"Initializing VertexAI with experiment name {experiment} " + f"and run name {run_name}." + ) + + aiplatform.init( + project=project, + location=self.config.location, + experiment=experiment, + experiment_tensorboard=settings.experiment_tensorboard, + staging_bucket=self.config.staging_bucket, + credentials=credentials, + encryption_spec_key_name=self.config.encryption_spec_key_name, + network=self.config.network, + api_endpoint=self.config.api_endpoint, + api_key=self.config.api_key, + api_transport=self.config.api_transport, + request_metadata=self.config.request_metadata, + ) + + try: + aiplatform.start_run( + run=run_name, + tensorboard=settings.experiment_tensorboard, + resume=True, + ) + except exceptions.NotFound: + aiplatform.start_run( + run=run_name, + tensorboard=settings.experiment_tensorboard, + resume=False, + ) + + logger.info( + f"VertexAI experiment dashboard: {self._get_dashboard_url(experiment=experiment)}" + ) + logger.info( + f"Tensorboard resource name: {self._get_tensorboard_resource_name(experiment=experiment)}" + ) + + def cleanup_step_run(self, info: "StepRunInfo", step_failed: bool) -> None: + """Stops the VertexAI run. + + Args: + info: Info about the step that was executed. + step_failed: Whether the step failed or not. 
+ """ + state = ( + execution.Execution.State.FAILED + if step_failed + else execution.Execution.State.COMPLETE + ) + aiplatform.end_run(state=state) diff --git a/src/zenml/integrations/gcp/flavors/__init__.py b/src/zenml/integrations/gcp/flavors/__init__.py index 73bb6259aa5..e70f4937594 100644 --- a/src/zenml/integrations/gcp/flavors/__init__.py +++ b/src/zenml/integrations/gcp/flavors/__init__.py @@ -21,6 +21,10 @@ GCPImageBuilderConfig, GCPImageBuilderFlavor, ) +from zenml.integrations.gcp.flavors.vertex_experiment_tracker_flavor import ( + VertexExperimentTrackerConfig, + VertexExperimentTrackerFlavor, +) from zenml.integrations.gcp.flavors.vertex_orchestrator_flavor import ( VertexOrchestratorConfig, VertexOrchestratorFlavor, @@ -35,6 +39,8 @@ "GCPArtifactStoreConfig", "GCPImageBuilderFlavor", "GCPImageBuilderConfig", + "VertexExperimentTrackerFlavor", + "VertexExperimentTrackerConfig", "VertexOrchestratorFlavor", "VertexOrchestratorConfig", "VertexStepOperatorFlavor", diff --git a/src/zenml/integrations/gcp/flavors/vertex_experiment_tracker_flavor.py b/src/zenml/integrations/gcp/flavors/vertex_experiment_tracker_flavor.py new file mode 100644 index 00000000000..74f868b6756 --- /dev/null +++ b/src/zenml/integrations/gcp/flavors/vertex_experiment_tracker_flavor.py @@ -0,0 +1,209 @@ +# Copyright (c) ZenML GmbH 2022. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# https://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express +# or implied. See the License for the specific language governing +# permissions and limitations under the License. 
+"""Vertex experiment tracker flavor.""" + +import re +from typing import TYPE_CHECKING, Any, Dict, Optional, Type, Union + +from google.cloud.aiplatform import utils +from pydantic import field_validator + +from zenml.config.base_settings import BaseSettings +from zenml.experiment_trackers.base_experiment_tracker import ( + BaseExperimentTrackerConfig, + BaseExperimentTrackerFlavor, +) +from zenml.integrations.gcp import ( + GCP_RESOURCE_TYPE, + GCP_VERTEX_EXPERIMENT_TRACKER_FLAVOR, +) +from zenml.integrations.gcp.google_credentials_mixin import ( + GoogleCredentialsConfigMixin, +) +from zenml.models import ServiceConnectorRequirements +from zenml.utils.secret_utils import SecretField + +if TYPE_CHECKING: + from zenml.integrations.gcp.experiment_trackers import ( + VertexExperimentTracker, + ) + + +class VertexExperimentTrackerSettings(BaseSettings): + """Settings for the VertexAI experiment tracker. + + Attributes: + experiment: The VertexAI experiment name. + experiment_tensorboard: The VertexAI experiment tensorboard. + """ + + experiment: Optional[str] = None + experiment_tensorboard: Optional[Union[str, bool]] = None + + @field_validator("experiment", mode="before") + def _validate_experiment(cls, value: str) -> str: + """Validates the experiment name matches the regex [a-z0-9][a-z0-9-]{0,127}. + + Args: + value: The experiment. + + Returns: + The experiment. + """ + if value and not re.match(r"^[a-z0-9][a-z0-9-]{0,127}$", value): + raise ValueError( + "Experiment name must match regex [a-z0-9][a-z0-9-]{0,127}" + ) + return value + + +class VertexExperimentTrackerConfig( + BaseExperimentTrackerConfig, + GoogleCredentialsConfigMixin, + VertexExperimentTrackerSettings, +): + """Config for the VertexAI experiment tracker. + + Attributes: + location: Optional. The default location to use when making API calls. If not + set defaults to us-central1. + staging_bucket: Optional. The default staging bucket to use to stage artifacts + when making API calls. 
In the form gs://... + network: + Optional. The full name of the Compute Engine network to which jobs + and resources should be peered. E.g. "projects/12345/global/networks/myVPC". + Private services access must already be configured for the network. + If specified, all eligible jobs and resources created will be peered + with this VPC. + encryption_spec_key_name: + Optional. The Cloud KMS resource identifier of the customer + managed encryption key used to protect a resource. Has the + form: + ``projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key``. + The key needs to be in the same region as where the compute + resource is created. + api_endpoint (str): + Optional. The desired API endpoint, + e.g., us-central1-aiplatform.googleapis.com + api_key (str): + Optional. The API key to use for service calls. + NOTE: Not all services support API keys. + api_transport (str): + Optional. The transport method which is either 'grpc' or 'rest'. + NOTE: "rest" transport functionality is currently in a + beta state (preview). + request_metadata: + Optional. Additional gRPC metadata to send with every client request. + """ + + location: Optional[str] = None + staging_bucket: Optional[str] = None + network: Optional[str] = None + encryption_spec_key_name: Optional[str] = SecretField(default=None) + api_endpoint: Optional[str] = SecretField(default=None) + api_key: Optional[str] = SecretField(default=None) + api_transport: Optional[str] = None + request_metadata: Optional[Dict[str, Any]] = None + + @field_validator("location", mode="before") + def _validate_location(cls, value: str) -> str: + """Validates if provided location is valid. + + Args: + value: The gcp location name. + + Returns: + The location name. + """ + utils.validate_region(value) + return value + + +class VertexExperimentTrackerFlavor(BaseExperimentTrackerFlavor): + """Flavor for the VertexAI experiment tracker.""" + + @property + def name(self) -> str: + """Name of the flavor. 
+ + Returns: + The name of the flavor. + """ + return GCP_VERTEX_EXPERIMENT_TRACKER_FLAVOR + + @property + def docs_url(self) -> Optional[str]: + """A URL to point at docs explaining this flavor. + + Returns: + A flavor docs url. + """ + return self.generate_default_docs_url() + + @property + def sdk_docs_url(self) -> Optional[str]: + """A URL to point at SDK docs explaining this flavor. + + Returns: + A flavor SDK docs url. + """ + return self.generate_default_sdk_docs_url() + + @property + def logo_url(self) -> str: + """A URL to represent the flavor in the dashboard. + + Returns: + The flavor logo. + """ + return "https://public-flavor-logos.s3.eu-central-1.amazonaws.com/experiment_tracker/vertexai.png" + + @property + def config_class(self) -> Type[VertexExperimentTrackerConfig]: + """Returns `VertexExperimentTrackerConfig` config class. + + Returns: + The config class. + """ + return VertexExperimentTrackerConfig + + @property + def implementation_class(self) -> Type["VertexExperimentTracker"]: + """Implementation class for this flavor. + + Returns: + The implementation class. + """ + from zenml.integrations.gcp.experiment_trackers import ( + VertexExperimentTracker, + ) + + return VertexExperimentTracker + + @property + def service_connector_requirements( + self, + ) -> Optional[ServiceConnectorRequirements]: + """Service connector resource requirements for service connectors. + + Specifies resource requirements that are used to filter the available + service connector types that are compatible with this flavor. + + Returns: + Requirements for compatible service connectors, if a service + connector is required for this flavor. 
+ """ + return ServiceConnectorRequirements( + resource_type=GCP_RESOURCE_TYPE, + ) diff --git a/tests/integration/integrations/gcp/experiment_trackers/__init__.py b/tests/integration/integrations/gcp/experiment_trackers/__init__.py new file mode 100644 index 00000000000..9542a8dd283 --- /dev/null +++ b/tests/integration/integrations/gcp/experiment_trackers/__init__.py @@ -0,0 +1,16 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express +# or implied. See the License for the specific language governing +# permissions and limitations under the License. +"""Initialization of the VertexAI experiment tracker.""" + +from zenml.integrations.gcp.experiment_trackers.vertex_experiment_tracker import ( # noqa + VertexExperimentTracker, +) diff --git a/tests/integration/integrations/gcp/experiment_trackers/test_vertex_experiment_tracker.py b/tests/integration/integrations/gcp/experiment_trackers/test_vertex_experiment_tracker.py new file mode 100644 index 00000000000..46c6a39371f --- /dev/null +++ b/tests/integration/integrations/gcp/experiment_trackers/test_vertex_experiment_tracker.py @@ -0,0 +1,154 @@ +# Copyright (c) ZenML GmbH 2022. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at: +# +# https://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express +# or implied. See the License for the specific language governing +# permissions and limitations under the License. + +import re +from contextlib import ExitStack as does_not_raise +from datetime import datetime +from uuid import uuid4 + +import pytest +from mock import MagicMock + +from zenml.enums import StackComponentType +from zenml.integrations.gcp.experiment_trackers.vertex_experiment_tracker import ( + VertexExperimentTracker, +) +from zenml.integrations.gcp.flavors.vertex_experiment_tracker_flavor import ( + VertexExperimentTrackerConfig, +) +from zenml.stack import Stack + + +@pytest.fixture(scope="session") +def vertex_experiment_tracker() -> VertexExperimentTracker: + """Returns a Vertex experiment tracker.""" + return VertexExperimentTracker( + name="", + id=uuid4(), + config=VertexExperimentTrackerConfig(), + flavor="vertex", + type=StackComponentType.EXPERIMENT_TRACKER, + user=uuid4(), + workspace=uuid4(), + created=datetime.now(), + updated=datetime.now(), + ) + + +def test_vertex_experiment_tracker_stack_validation( + vertex_experiment_tracker, + local_orchestrator, + local_artifact_store, +) -> None: + """Tests that a stack with a Vertex experiment tracker is valid.""" + with does_not_raise(): + Stack( + name="", + id=uuid4(), + orchestrator=local_orchestrator, + artifact_store=local_artifact_store, + experiment_tracker=vertex_experiment_tracker, + ).validate() + + +def test_vertex_experiment_tracker_attributes( + vertex_experiment_tracker, +) -> None: + """Tests that the basic attributes of the Vertex experiment tracker are set correctly.""" + assert ( + vertex_experiment_tracker.type == StackComponentType.EXPERIMENT_TRACKER + ) + assert
vertex_experiment_tracker.flavor == "vertex" + + +def is_valid_experiment_name(name: str) -> bool: + """Check if the experiment name matches the required regex.""" + EXPERIMENT_NAME_REGEX = re.compile(r"^[a-z0-9][a-z0-9-]{0,127}$") + return bool(EXPERIMENT_NAME_REGEX.match(name)) + + +@pytest.mark.parametrize( + "input_name,expected_output", + [ + ("My Experiment Name", "my-experiment-name"), + ("My_Experiment_Name", "my-experiment-name"), + ("MyExperimentName123", "myexperimentname123"), + ("Name-With-Dashes---", "name-with-dashes"), + ("Invalid!Name", "invalid-name"), + (" Whitespace Name ", "whitespace-name"), + ("UPPERCASE", "uppercase"), + ("a" * 140, "a" * 128), # Truncated to 128 chars + ("special&%_characters", "special---characters"), + ], +) +def test_format_name(vertex_experiment_tracker, input_name, expected_output): + """Test the name formatting function.""" + formatted_name = vertex_experiment_tracker._format_name(input_name) + assert ( + formatted_name == expected_output + ), f"Failed for input: '{input_name}'" + assert is_valid_experiment_name( + formatted_name + ), f"Formatted name '{formatted_name}' does not match the regex" + + +@pytest.mark.parametrize( + "input_name,expected_output", + [ + ("My Experiment", "my-experiment"), + ("Another Experiment", "another-experiment"), + (None, "default-experiment"), + ("", "default-experiment"), + ], +) +def test_get_experiment_name( + vertex_experiment_tracker, input_name, expected_output +): + """Test the experiment name generation function.""" + mock_settings = MagicMock() + mock_settings.experiment = input_name + vertex_experiment_tracker.get_settings = MagicMock( + return_value=mock_settings + ) + + info = MagicMock() + info.pipeline.name = "default-experiment" + + experiment_name = vertex_experiment_tracker._get_experiment_name(info) + assert ( + experiment_name == expected_output + ), f"Failed for input: '{input_name}'" + assert is_valid_experiment_name( + experiment_name + ), f"Generated experiment 
name '{experiment_name}' does not match the regex" + + +@pytest.mark.parametrize( + "input_name,expected_output", + [ + ("Run-001", "run-001"), + ("AnotherRun", "anotherrun"), + ("run_with_special_chars!@#", "run-with-special-chars"), + ], +) +def test_get_run_name(vertex_experiment_tracker, input_name, expected_output): + """Test the run name generation function.""" + info = MagicMock() + info.run_name = input_name + + run_name = vertex_experiment_tracker._get_run_name(info) + assert run_name == expected_output, f"Failed for input: '{input_name}'" + assert is_valid_experiment_name( + run_name + ), f"Generated run name '{run_name}' does not match the regex"
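Reviewer note: the parametrized cases in `test_format_name` and `test_get_run_name` pin down the normalization rules fairly tightly (lowercase, one dash per invalid character, no dash collapsing, edge dashes stripped, 128-character cap). The sketch below is one function that satisfies those cases. It is inferred from the tests only and is not the actual `VertexExperimentTracker._format_name` implementation; `format_name` and `MAX_NAME_LENGTH` are hypothetical names introduced for illustration.

```python
import re

# Assumed cap: the test regex allows one leading character plus up to 127 more.
MAX_NAME_LENGTH = 128


def format_name(name: str) -> str:
    """Hypothetical stand-in for `_format_name`, inferred from the test cases."""
    lowered = name.strip().lower()
    # Replace each character outside [a-z0-9-] with a dash, one for one,
    # so "&%_" becomes "---" rather than a single collapsed dash.
    dashed = re.sub(r"[^a-z0-9-]", "-", lowered)
    # Strip edge dashes, truncate, then strip any trailing dash that
    # truncation may have exposed.
    return dashed.strip("-")[:MAX_NAME_LENGTH].rstrip("-")
```

Note that an empty input would yield an empty string here, which does not match the test regex; the tests suggest the real code handles that case in `_get_experiment_name` by falling back to the pipeline name.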