# 3.3. Entrypoints | ||
|
||
## What are package entrypoints? | ||
|
||
Package entrypoints are mechanisms in Python packaging that expose a package's scripts and utilities to end users. They make a package's functionality easier to reach, whether from a command-line interface (CLI) or from other software packages.
|
||
To elaborate, entrypoints are specified in a package's setup configuration, marking certain functions or classes to be directly accessible. This setup benefits both developers and users by simplifying access to a package's capabilities, improving interoperability among different software components, and enhancing the user experience by providing straightforward commands to execute tasks. | ||
|
||
## Why do I need to set up entrypoints? | ||
|
||
Entrypoints are essential for making specific functionalities of your package directly accessible from the command-line interface (CLI) or to other software. By setting up entrypoints, you allow users to execute components of your package directly from the CLI, streamlining operations like script execution, service initiation, or utility invocation. Additionally, entrypoints facilitate dynamic discovery and utilization of your package's functionalities by other software and frameworks, such as Apache Airflow, without the need for hard-coded paths or module names. This flexibility is particularly beneficial in complex, interconnected systems where adaptability and ease of use are paramount. | ||
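
To make the idea of dynamic discovery more concrete, here is a minimal sketch of how another program could list the command-line entrypoints installed in the current environment. It uses the standard library's `importlib.metadata`; the helper name `find_console_scripts` is hypothetical and only serves the illustration.

```python
from importlib.metadata import entry_points


def find_console_scripts(prefix=""):
    """Return {command name: 'module:function'} for installed CLI entrypoints."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select()
        scripts = eps.select(group="console_scripts")
    else:
        # Python 3.8 and 3.9: entry_points() returns a dict keyed by group
        scripts = eps.get("console_scripts", [])
    return {ep.name: ep.value for ep in scripts if ep.name.startswith(prefix)}
```

Once the `bikes` package is installed, calling `find_console_scripts("bikes")` should include its command, with no module path hard-coded in the caller.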
|
||
## How do I create entrypoints with Poetry?
|
||
Creating entrypoints with Poetry involves specifying them in the `pyproject.toml` file under the `[tool.poetry.scripts]` section. This section outlines the command-line scripts that your package will make available: | ||
|
||
```toml | ||
[tool.poetry.scripts] | ||
bikes = 'bikes.scripts:main' | ||
``` | ||
|
||
In this syntax, `bikes` represents the command users will enter in the CLI to activate your tool. The path `bikes.scripts:main` directs Poetry to execute the `main` function found in the `scripts` module of the `bikes` package. Upon installation, Poetry generates an executable script for this command, integrating your package's functionality seamlessly into the user's command-line environment, alongside other common utilities: | ||
|
||
```bash | ||
$ poetry run bikes one two three | ||
``` | ||
|
||
This snippet runs the `bikes` entrypoint from the CLI and passes three positional arguments: `one`, `two`, and `three`.
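
For reference, the target of `bikes.scripts:main` could look like the sketch below. This is an assumption about what `bikes/scripts.py` might contain, not the package's actual code; the `describe` helper and the printed message are made up for illustration. The generated command calls `main()` without arguments, so the function falls back to `sys.argv`.

```python
import sys
from typing import Optional, Sequence


def describe(args: Sequence[str]) -> str:
    """Build a short message describing the received arguments."""
    return f"bikes received {len(args)} argument(s): {', '.join(args)}"


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Entrypoint: read CLI arguments, do the work, return an exit code."""
    # When called by the generated script, argv is None and sys.argv is used.
    args = list(sys.argv[1:] if argv is None else argv)
    print(describe(args))
    return 0
```

Accepting an optional `argv` keeps the function easy to call from tests, and the returned integer becomes the exit code of the command.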
|
||
## How can I use this entrypoint in other software? | ||
|
||
Once a package with entrypoints is installed, other software can call those entrypoints. For example, in Apache Airflow you can add a task to a Directed Acyclic Graph (DAG) that runs one of your CLI tools as part of an automated workflow. Airflow's `BashOperator` or `PythonOperator` can invoke the tool directly; the example below instead submits the entrypoint to Databricks as a Python wheel task:
|
||
```python
from airflow import DAG
from datetime import datetime, timedelta
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

# Define default arguments for your DAG
default_args = {...}

# Create a DAG instance
with DAG(
    'databricks_submit_run_example',
    default_args=default_args,
    description='An example DAG to submit a Databricks job',
    schedule_interval='@daily',
    catchup=False,
) as dag:
    # Define a task to submit a job to Databricks
    # (a real run also needs a cluster specification, omitted here)
    submit_databricks_job = DatabricksSubmitRunOperator(
        task_id='main',
        json={
            "python_wheel_task": {
                "package_name": "bikes",
                "entry_point": "bikes",
                "parameters": ["one", "two", "three"],
            },
        },
    )

    # Set task dependencies and order (if you have multiple tasks)
    # In this simple example, there's only one task
    submit_databricks_job
```
|
||
In this example, `submit_databricks_job` is a task that executes the `bikes` entrypoint. | ||
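
Outside of a scheduler, any Python program can call an installed command in much the same way a shell-based task would: by starting it as a subprocess. The sketch below assumes nothing about `bikes` itself; `run_cli` is a hypothetical helper that runs any command and returns its exit code and standard output.

```python
import subprocess
from typing import Tuple


def run_cli(command: str, *args: str) -> Tuple[int, str]:
    """Run a command with arguments and return (exit code, captured stdout)."""
    completed = subprocess.run(
        [command, *args],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        check=False,          # let the caller decide how to handle failures
    )
    return completed.returncode, completed.stdout
```

With the package installed, `run_cli("bikes", "one", "two", "three")` would run the same command as the CLI example, and a non-zero exit code signals a failure to the caller.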
|
||
## How can I use this entrypoint from the command-line (CLI)? | ||
|
||
Once your Python package has been packaged with Poetry and a wheel file is generated, you can install and use the package directly from the command-line interface (CLI). Here are the steps to accomplish this: | ||
|
||
1. **Build your package:** Use Poetry to compile your project into a distributable format, such as a wheel file. This is done with the `poetry build` command, which generates the package files in the `dist/` directory. | ||
|
||
```bash | ||
poetry build | ||
``` | ||
|
||
2. **Install your package:** Use `pip` to install the generated wheel file (`*.whl`) into your Python environment. The shell expands the pattern `dist/bikes*.whl` to the wheel file that Poetry created in the `dist/` directory, and `pip install` installs it.
|
||
```bash | ||
pip install dist/bikes*.whl | ||
``` | ||
|
||
3. **Run your package from the CLI:** After installation, you can invoke the package's entrypoint, defined in your `pyproject.toml` file, directly from the command line. In this case, run the `bikes` command followed by any arguments your entrypoint accepts. Separate the arguments with spaces unless your documentation or help command says otherwise.
|
||
```bash | ||
bikes one two three | ||
``` | ||
|
||
## What should be the inputs and outputs of my entrypoint?
|
||
**Inputs** for your entrypoint can vary based on the requirements and functionalities of your package but typically include: | ||
|
||
- **Configuration files (e.g., JSON, YAML, TOML):** These files can define essential settings, parameters, and options required for your tool or package to function. Configuration files are suited for static settings that remain constant across executions, such as environment settings or predefined operational parameters. | ||
- **Command-line arguments (e.g., --verbose, --account):** These arguments provide a dynamic way for users to specify options, flags, and parameters at runtime, offering adaptability for different operational scenarios. | ||
|
||
**Outputs** from your entrypoint should be designed to provide valuable insights and effects, such as: | ||
|
||
- **Side effects:** The primary purpose of your tool or package, which could include data processing, report generation, or initiating other software processes. | ||
- **Logging:** Detailed logs are crucial for debugging, monitoring, and understanding how your tool or package operates within larger systems or workflows. | ||
|
||
Careful design of your entrypoints' inputs and outputs ensures your package can be integrated and used efficiently across a wide range of environments and applications, maximizing its utility and effectiveness. |
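
The inputs and outputs described above can be sketched as a single entrypoint. This is a minimal illustration under stated assumptions, not the package's actual design: configuration files are assumed to be JSON (to stay within the standard library), `--verbose` is the only flag, and the names `build_parser` and `load_configs` are hypothetical.

```python
import argparse
import json
import logging
from typing import Any, Dict, Optional, Sequence

logger = logging.getLogger("bikes")


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line inputs: config file paths and flags."""
    parser = argparse.ArgumentParser(prog="bikes")
    parser.add_argument("configs", nargs="*", help="paths to JSON config files")
    parser.add_argument("--verbose", action="store_true", help="enable debug logs")
    return parser


def load_configs(paths: Sequence[str]) -> Dict[str, Any]:
    """Merge JSON config files in order; later files override earlier ones."""
    merged: Dict[str, Any] = {}
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            merged.update(json.load(handle))
    return merged


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Entrypoint: parse inputs, log progress, run the job, return an exit code."""
    args = build_parser().parse_args(argv)
    logging.basicConfig(level=logging.DEBUG if args.verbose else logging.INFO)
    config = load_configs(args.configs)
    logger.info("Loaded %d setting(s) from %d file(s)", len(config), len(args.configs))
    # The side effects of the tool (processing, reporting, ...) would start here.
    return 0
```

Keeping static settings in files and run-specific options in flags means the same command can be reused across environments by swapping only the configuration files.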