feat(external_data_files): load data from other YAML files #1880
base: master
@copier-org/maintainers this one was a bit hard to get right, but I think it's OK now. The whole point of this PR is to open a new avenue for template composability. Now you can have a template that reads information from answers written for other templates. Both templates can still stay independent, but when combined they can offer some extra features. For example, some default values in one template can depend on the answers for another template, and the other template's answers can be accessed in the current template.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1880      +/-   ##
==========================================
+ Coverage   97.76%   97.81%   +0.05%
==========================================
  Files          49       49
  Lines        5232     5267      +35
==========================================
+ Hits         5115     5152      +37
+ Misses        117      115       -2
Before reviewing the implementation in detail, would you mind discussing the design first?
Even prior to this PR, multiple Copier templates could be applied to the same subproject. But each template was applied completely independently, and when some templates had overlapping questions that should be answered the same for consistency, then it was the user's responsibility to keep the answers in sync. The way I understand this PR, it aims to improve the UX in this regard by allowing a child template to load external data (e.g., answers from an answers file from a previously applied parent template) and, e.g., use them as default values for questions in the child template, facilitating the synchronization of semantically identical questions across a parent and child template.
That said, I have some doubts about the current design:
- A child template makes assumptions about the answers structure of a parent template. Although there is a graceful fallback when, e.g., the parent template's answers file is not found, the child template's external data loading is tailored to the parent template. This limits template composition with enhanced UX to composing the child template with a specific parent template.
- Enhanced UX for template composition relies on the order in which templates are applied. A child template that reuses answers from a parent template must be applied after the parent template has been applied.
- Although answers from a parent template may be used as default values for questions in a child template, the questions are asked nevertheless and may be answered differently, which may lead to inconsistencies.
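The behavior under discussion (load a parent's answers if they exist, fall back gracefully if they don't, and use them only as defaults) can be sketched roughly as follows. This is an illustration, not Copier's implementation: `load_external_data`, `default_for`, and the file name are hypothetical, and JSON stands in for YAML so that the sketch needs only the standard library.

```python
import json
import tempfile
from pathlib import Path


def load_external_data(path: Path) -> dict:
    """Return the parsed file, or an empty dict when the file is missing."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}


def default_for(question: str, external: dict, fallback: str) -> str:
    """Use the parent's answer as the default when there is one."""
    return external.get(question, fallback)


with tempfile.TemporaryDirectory() as tmp:
    parent_answers = Path(tmp) / "parent-answers.json"  # hypothetical file name

    # Child applied before the parent: no answers file yet, so the fallback wins.
    before = default_for("project_name", load_external_data(parent_answers), "my-project")

    # Child applied after the parent: the parent's answer becomes the default.
    parent_answers.write_text(json.dumps({"project_name": "acme"}), encoding="utf-8")
    after = default_for("project_name", load_external_data(parent_answers), "my-project")
```

The two calls make the order dependence from the second bullet visible: the same child produces a different default depending on whether the parent's answers file already exists.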
I have been thinking about a similar feature myself but with a different design, inspired by how, e.g., React components are composed.
Instead of coupling the child template with the parent template, how about adding a way for a parent template to configure child templates to be applied? The new parent template may have a copier.yml file like this:
_templates:
  # A Copier template for generating a Python project using
  # Poetry as package manager
  https://git.example.com/copier-python.git:
    _ref: v1.2.3 # optional but highly recommended
    project_name: "{{ project_name }}"
    python_versions: "{{ python_versions }}"
  # A Copier template for generating GitHub Actions configs for
  # a Python project with support for several package managers
  # including Poetry
  https://git.example.com/copier-python-github-actions.git:
    _ref: v2.3.4 # optional but highly recommended
    python_versions: "{{ python_versions }}"
    package_manager: poetry # hardcoded because `copier-python` uses Poetry

# This Copier template must define all questions needed for itself
# and all child templates.
project_name:
  type: str
  help: The name of the Python project

python_versions:
  type: str
  help: The supported Python versions
  choices: &python_versions
    - "3.9"
    - "3.10"
    - "3.11"
    - "3.12"
    - "3.13"
  default: *python_versions
  multiselect: true
The answers of this template are recorded in its answers file, and any answers files that are rendered by the child templates are omitted because they aren't needed.
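To make the data flow of this proposal concrete, here is a rough Python sketch of how a parent's answers might be mapped onto each child template. Everything in it is an assumption for illustration: `_templates` is only the key proposed above, `child_data` and `render` are hypothetical helpers, and the regex substitution is a tiny stand-in for real Jinja rendering.

```python
import re

# Matches simple `{{ name }}` placeholders only; real Jinja is far richer.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(value, answers: dict):
    """Replace `{{ name }}` placeholders in strings; pass other values through."""
    if not isinstance(value, str):
        return value
    return PLACEHOLDER.sub(lambda m: str(answers[m.group(1)]), value)


def child_data(templates: dict, answers: dict) -> dict:
    """Build, per child template URL, the ref and data passed down by the parent."""
    plan = {}
    for url, mapping in templates.items():
        plan[url] = {
            "ref": mapping.get("_ref"),
            "data": {k: render(v, answers) for k, v in mapping.items() if k != "_ref"},
        }
    return plan


templates = {
    "https://git.example.com/copier-python.git": {
        "_ref": "v1.2.3",
        "project_name": "{{ project_name }}",
    },
    "https://git.example.com/copier-python-github-actions.git": {
        "_ref": "v2.3.4",
        "package_manager": "poetry",  # hardcoded, as in the YAML above
    },
}
plan = child_data(templates, {"project_name": "acme"})
```

Only the parent's answers enter `child_data`, which is the point of the design: the children never see each other, and the parent decides what each one receives.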
I see a few advantages of this design:
- There is hardly any coupling between child templates, as the parent template defines the exposed questionnaire, passes the answers down to the child templates as needed, and orchestrates their application.
- There is no order to remember in which templates must be applied because the parent template orchestrates the application of the child templates.
- There might even be a way to handle multi-template merge conflicts (during both "copy" and "update" operations) via some kind of N-way merge (just dreaming, I have no proof of concept).
- There could even be a hierarchy of child templates. It's important to note that this hierarchy is different from what we have discussed in "Updatable templates for template hierarchies and reuse" (#934).
But I also see a few disadvantages of this design:
- Composing templates requires one additional template that acts as the root/parent template to orchestrate the child templates' applications. So, ad-hoc template composition is not supported – at least I don't see how it would be possible.
- Some questions from the child templates need to be (approximately) duplicated in the root/parent template. While this sounds redundant and cumbersome, it's the same situation that, e.g., React component developers are facing. I think there's no way around it.
WDYT, @yajo?
That other design you propose is nice, but I think that this design is
simpler in some senses:
- It's just a little addition that allows Copier to include external data.
- This data can be answers or other kinds of data generated by other
non-Copier systems.
- If that external data happens to be answers for another template, you
can reuse answers.
- Your template already has full access to that new external data, so
you don't really *need* to use it as a default for your questionnaire. That's
just an example use. You can just do {{ _ext.some.thing | d("nothing") }}
in your template directly.
- It allows updating both templates at a different pace.
- It's done already :D
- It allows both templates to stay "independent".
- If we ever implement the other approach, this won't hurt it.
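For readers less familiar with Jinja, the intent of an expression like {{ _ext.some.thing | d("nothing") }} (use the external value if present, otherwise a default) can be mirrored in plain Python. This is only an analogy: `lookup` is a hypothetical helper, and it does not claim to reproduce how Jinja resolves undefined attributes.

```python
def lookup(data: dict, dotted_path: str, default=None):
    """Walk nested dicts along `a.b.c`; return `default` if any key is missing."""
    current = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


ext = {"some": {"thing": "value"}}
found = lookup(ext, "some.thing", "nothing")    # external data is present
missing = lookup({}, "some.thing", "nothing")   # external data is absent
```

The second call is the case that keeps both templates independent: when the other template was never applied, the template still renders with its own default.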
Now, thinking of your proposal, if you really need that, can't you do it
already? Just by using a clever combination of git submodules, !include
tags, and symlinks in a parent template that includes 2 child ones... my
intuition is that it'll work out-of-the-box right now.
Finally, before reviewing the implementation, go review the test. It will
help you understand the interface better, I guess.
Thanks for your comments, as always :)
Thanks for the clarifications, this was quite helpful. Indeed, template composition as shown is only one example of the feature; it's actually more general – and lightweight, which is nice.
I've left a few remarks related to documentation and license poisoning.
> Now, thinking of your proposal, if you really need that, can't you do it
> already? Just by using a clever combination of git submodules, !include
> tags, and symlinks in a parent template that includes 2 child ones... my
> intuition is that it'll work out-of-the-box right now.
Almost, I believe. Mapping answers in the parent template to questions in a child template seems to be missing though. When using !include, the questionnaire of a child template is simply merged with the parent template, which is not the same as passing down answers in the parent template as data to the child template. Also, !include doesn't offer encapsulation of child templates, e.g. templates with different Jinja settings wouldn't be properly supported.
But as you said, the feature of this PR does not block or prevent my suggested design from becoming available in the future. 👍
It's not really clear for me why should I use this instead of --data-file argument
The setting is called _external_data_files, but the key used in the template is different: "{{ _ext.parent1.name }}".
Maybe use the same term in both places to make it clearer, e.g. "{{ _external_data_file.parent1.name}}".
When composing templates, it's often needed to be able to load answers from other templates that you know are usually combined with yours. Or any other kind of external data. @moduon MT-8282
All addressed. Could you please review again?
LGTM 👍, just a minor remark regarding type hints.
@@ -329,6 +329,14 @@ def exclude(self) -> tuple[str, ...]:
            )
        )

    @cached_property
    def external_data(self) -> Dict[str, str]:
I think we can simply use dict[str, str] here for the return type and omit the import above:
- def external_data(self) -> Dict[str, str]:
+ def external_data(self) -> dict[str, str]:
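As a small, self-contained illustration of the suggestion: on Python 3.9 and later, the built-in `dict` can be subscripted directly in annotations, so `typing.Dict` is not needed. The class below is hypothetical and is not Copier's code; only the annotation style and the use of `cached_property` follow the diff above.

```python
from functools import cached_property


class TemplateConfig:  # hypothetical stand-in for the real class
    def __init__(self, raw: dict) -> None:
        self._raw = raw

    @cached_property
    def external_data(self) -> dict[str, str]:
        # Built-in generics (PEP 585) need no `from typing import Dict`.
        # The "external_data" key here is illustrative only.
        return {str(k): str(v) for k, v in self._raw.get("external_data", {}).items()}


config = TemplateConfig({"external_data": {"parent1": "parent-answers.yml"}})
paths = config.external_data
```

Because of `cached_property`, the dictionary is computed on first access and the same object is returned afterwards.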
fix
Since you removed
LGTM. I simply second @hparfr here:
> It's not really clear for me why should I use this instead of --data-file argument
A small note could be added in each to make the distinction. My own, quick wording:
> The CLI argument --data-file is not the same thing as the template configuration external_data: the former lets you provide a dictionary of answers matching a template questionnaire as a user, while the latter lets you enhance the Jinja rendering context of your template as a template writer.
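The distinction in that wording can be pictured with a conceptual Python sketch. It is not how Copier is implemented: `ask` and `rendering_context` are hypothetical helpers, and the `_ext` namespace is taken from the examples earlier in this thread.

```python
def ask(questions: dict, data_file: dict) -> dict:
    """User side: data-file values pre-answer questions; others keep their defaults."""
    return {name: data_file.get(name, default) for name, default in questions.items()}


def rendering_context(answers: dict, external: dict) -> dict:
    """Template-writer side: external data is added under `_ext`, next to the answers."""
    return {**answers, "_ext": external}


# A user supplies answers that match the questionnaire.
answers = ask(
    {"project_name": "my-project", "license": "MIT"},
    {"project_name": "acme"},
)

# A template writer exposes extra data to the rendering context.
context = rendering_context(answers, {"parent1": {"name": "parent"}})
```

In this picture the data file only ever fills in questions the template asks, while external data never answers a question by itself; it is just extra material the template can read.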
Something like that.