Issue #10437 - Unify Deployer ContextProvider #12583

Draft
wants to merge 52 commits into
base: jetty-12.1.x

joakime
Contributor

@joakime joakime commented Nov 26, 2024

A single scanner for all Environments.
Environment attributes control the environment-specific deployment configuration.
Existing property behaviors are maintained.

Currently a WIP (needs more testing and documentation)

@joakime joakime added the Enhancement and Bug labels Nov 26, 2024
@joakime joakime requested a review from sbordet November 26, 2024 21:48
@joakime joakime self-assigned this Nov 26, 2024
@joakime joakime changed the base branch from jetty-12.0.x to jetty-12.1.x November 26, 2024 21:48
@joakime joakime linked an issue Nov 26, 2024 that may be closed by this pull request
@joakime joakime marked this pull request as ready for review December 3, 2024 00:01
@janbartel janbartel requested a review from gregw December 10, 2024 04:35
@sbordet
Contributor

sbordet commented Dec 10, 2024

I'd like to see:

  • alphabetical sort of scanned files
  • DeploymentManager does not need to know AppProviders. It's AppProviders that call the DeploymentManager to deploy Apps. AppProviders can just be added as beans to DeploymentManager
  • Documentation
  • ContextProvider renamed to something more telling. DeploymentScanningAppProvider? The word "Context" has no meaning in this class, and the current class name hides the fact that it is a ScanningAppProvider.

@joakime joakime requested a review from sbordet January 15, 2025 21:45
@joakime joakime marked this pull request as draft January 16, 2025 00:14
@joakime
Contributor Author

joakime commented Jan 16, 2025

Not ready yet, more changes coming (based on conversation with @sbordet)

* A Unit of deployment, a basename and all the associated
* paths that belong to that basename, along with state.
*/
public class Unit
Contributor

I think that Unit should be promoted to a top level class, and App demoted to an implementation detail. The Unit class better captures what we're dealing with - a possible collection of files and/or directories that taken all together represent a deployable. The App is really just a subset of that information.

Contributor Author

Unit is a component of a disk-based, hot-deployment, disk-scanning, local-system-only approach to deployment (our abstract ScanningAppProvider and its ContextProvider implementation).

Not all users of DeploymentManager or all implementations of AppProvider even use the local disk as a source for their applications. Many come from external sources, some come from build plugins, some come as a result of an action from some other trigger.

Making Unit the top level is the wrong approach.

Contributor

Can you clarify succinctly the difference between App and Unit using language like "A Unit is a ....", "An App is a ..."? Looking at the code App has Path references and so does Unit, so in that sense they both refer to resources that can be disk based resources, so that can't be the difference.

* @return The name of the {@link org.eclipse.jetty.util.component.Environment} this provider is for.
* @deprecated not used by all AppProviders, no generic replacement provided.
Contributor

I think this interface is not named accurately, and the class javadoc is misleading. According to its API, it isn't creating or providing any App instances to the DeploymentManager at all. Looking at its API, its function is really to create a ContextHandler instance, after some other class passes in an already created instance of App.

Contributor Author

  • Unit: the way ScanningAppProvider tracks its collection of Path objects for an App via its basename (as long as the basename exists, the Unit exists).
  • App: the fundamental unit used to track the App through the deployment lifecycle (created as part of a new deployment, then used as the reference for moving through the lifecycle, including the undeploy/remove steps).
  • ContextHandler: the feature of the Jetty Server that represents the instantiated and live App.

While all of these seem like the same "thing" that could be tracked in a single place, they are really 3 levels of abstraction, each with its own lifecycle. The Unit has the longest lifecycle, followed by App with a shorter lifecycle, and finally ContextHandler with the shortest (from the point of view of a deployment manager).

A Unit exists for as long as there are Paths with a basename being tracked. (only lives in ScanningAppProvider)
An App only exists if there is an AppProvider that creates it. (note that ScanningAppProvider can have a Unit with no App if there is no main deployment path present for that Unit)
A ContextHandler only exists if there is a need for it in the lifecycle binding. (If a step in the lifecycle needs a ContextHandler then that's when it gets created or accessed. Our standard lifecycle bindings will use the ContextHandler for deploy/start/stop/undeploy, but not the other bindings)

Contributor Author

The ContextHandler can even change as the App moves through the registered lifecycle bindings.
For example, a custom binding could wrap the ContextHandler to add an auditing layer.

Contributor

The lifecycle relationships are such that neither an App nor a ContextHandler (in the sense of deployer) can exist without a Unit. Sure, ContextHandler and App have their own lifecycles, but they ultimately depend on the existence or otherwise of the Unit.

Contributor

ContextHandlerFactory would be a better name for this interface

Contributor Author

Another feature of Unit is that it represents a group of paths that influence the deployment process.

One aspect not made clear is that the environment configuration files (webapps/<env>.xml, webapps/<env>-<name>.xml, webapps/<env>.properties, and webapps/<env>-<name>.properties) are also considered a Unit (with basename <env>) that is tracked by the Scanner (this is a feature that's been in Jetty 12.0.x since its inception). If any of the files in that Unit is changed, all associated Apps in that environment are also hot-redeployed.
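
The environment-file grouping described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `EnvUnitSketch`, `environmentOf`, and `appsToRedeploy` are hypothetical names, files are represented by plain file names, and apps by a map of app name to environment name. None of this is taken from the PR's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: names and data shapes are assumptions, not the PR's API.
class EnvUnitSketch
{
    // Maps "<env>.xml", "<env>-<name>.xml", "<env>.properties" and
    // "<env>-<name>.properties" to the environment basename "<env>".
    static Optional<String> environmentOf(String fileName, Set<String> envNames)
    {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0)
            return Optional.empty();
        String ext = fileName.substring(dot + 1);
        if (!ext.equals("xml") && !ext.equals("properties"))
            return Optional.empty();
        String stem = fileName.substring(0, dot);
        for (String env : envNames)
        {
            if (stem.equals(env) || stem.startsWith(env + "-"))
                return Optional.of(env);
        }
        return Optional.empty();
    }

    // Returns the (sorted) names of the apps that would be hot-redeployed
    // when the given file changes; empty if it is not an environment file.
    static List<String> appsToRedeploy(String changedFile, Set<String> envNames,
                                       Map<String, String> appEnvironments)
    {
        List<String> result = new ArrayList<>();
        Optional<String> env = environmentOf(changedFile, envNames);
        if (env.isPresent())
        {
            for (Map.Entry<String, String> entry : appEnvironments.entrySet())
            {
                if (entry.getValue().equals(env.get()))
                    result.add(entry.getKey());
            }
            Collections.sort(result);
        }
        return result;
    }
}
```

For example, with environments `core` and `ee10`, a change to `ee10-extra.properties` would select every app whose environment is `ee10`, while a change to `demo.war` would select none through this path.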

Contributor

Getting a good simple description of what a Unit is would be good. Then perhaps a better name would come from that.

return app;
}

public void setApp(App app)
Contributor

Creating the App instance external to this class loses the relationship between the set of resources that this Unit is holding, one of which must be used by the App. Having this setter means that the App can have been created using any old path, not necessarily one of the resources associated with this Unit.

Contributor Author

That is 100% correct.

A custom AppProvider doesn't have to have the concept of Path (main or otherwise) or anything like that.
This Unit class wouldn't be used in that scenario.

Contributor

I think the distinction between Unit and App is not helped by the: a) naming; b) multiple transformations.

Currently this PR has:

  1. a Map<Path, Notification> is used to update a Set<Unit>
  2. each Unit is converted to a primary Path
  3. each primary Path is used to create an App
  4. each App is used to create a ContextHandler

Going from step 2 to step 3 feels a bit wrong. Can we go directly from a Unit to an App? Internally this might use a primary Path, but making this step explicit feels unnecessary.

Also, I get it that a Unit is more about the discovered state of deployable files, and is almost like a single event (with ADDED, REMOVED, CHANGED), but the fact that you keep a set of existing Units makes it feel like more than an event and closer to what an App is (i.e. the representation of something that can be deployed).

Would it be better to move Unit more towards an event, and not keep them around between scans? I know we need the UNCHANGED files, but they can be discovered by existence using name relationships rather than keeping a Set<Unit>. Unit could then be renamed to Scanned or ScanSet or something else to imply that it is the grouped results of a single scan.
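
One way to picture the "grouped results of a single scan" idea is the sketch below. It assumes file names stand in for Path objects and that the basename is simply the name without its last extension; `ScanSetSketch`, `Notification`, `basename`, and `group` are hypothetical names used for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: groups one scan's notifications by basename.
// Nothing is kept between scans; each call produces fresh groups.
class ScanSetSketch
{
    enum Notification { ADDED, CHANGED, REMOVED }

    // Assumption: the basename is the file name without its last extension.
    static String basename(String fileName)
    {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? fileName : fileName.substring(0, dot);
    }

    // TreeMap keeps both the basenames and the file names in alphabetical order.
    static Map<String, Map<String, Notification>> group(Map<String, Notification> scan)
    {
        Map<String, Map<String, Notification>> sets = new TreeMap<>();
        for (Map.Entry<String, Notification> entry : scan.entrySet())
        {
            sets.computeIfAbsent(basename(entry.getKey()), k -> new TreeMap<>())
                .put(entry.getKey(), entry.getValue());
        }
        return sets;
    }
}
```

In this shape, files that did not change in the scan are not present in the group at all, so they would have to be found by checking for sibling files with the same basename, as suggested above.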

* {@link org.eclipse.jetty.util.resource.ResourceFactory#newResource(String)}
* @return The App object for this particular context definition file.
* @param path The file that the main point of deployment (eg: a context XML, a WAR file, a directory, etc)
* @return The App object for this particular context.
*/
protected App createApp(Path path)
Contributor

If the Unit produced the App then that would better control that the path must be one of those known about by the Unit. In fact, you should probably be able to navigate from an App to the parent Unit also.

Contributor Author

Not all Unit instances have an App (e.g. environment config files are grouped into a Unit by environment name, and trigger synthetic Changes to all registered Apps that are managed by ContextProvider and tracked by our Units).

App instances from other (custom) AppProviders do not have a Unit and are not seen by our ContextProvider.

Only our ContextProvider needs to know the App for the Unit.
Keep in mind that Unit is just ours, not a general purpose API, like App is.

Contributor

I feel it is wrong that both App and Unit persist. App should be the long-lived result of a stream of Unit events. Having them both persist is confusing.

* @return the main deployable path
*/
@Override
protected Path getMainDeploymentPath(Unit unit)
Contributor

It is a bit strange that the logic that knows how to group different paths into a single unit is in the internal DeploymentUnits, whilst the logic to pick which of them is the main deployment path is here, in an entirely different place. Surely both bits of logic are related and should be in the same class.

Contributor Author

Picking the main path seems more like a ContextProvider concern.
The Unit and DeploymentUnits are the go-between from the Scanner events to the ContextProvider.

Contributor

I don't think so, because even working out if a Unit is ADDED, CHANGED or REMOVED needs knowledge of deployable files, so adding a priority of deployable files is just a tiny step

if (dirs.size() > 1)
throw new IllegalStateException("More than 1 Directory for deployable " + asStringList(dirs));

throw new IllegalStateException("Unable to determine main deployable for " + unit);
Contributor

Is this really an ISE? If we have just created foo.d or foo.properties, then we don't have enough to deploy anything, but I don't think it is an exceptional condition. Perhaps a null return would be better, reflecting a unit that does not have a deployable path.

Contributor Author

Good point.
We need a testcase for this scenario.

What would you expect the result to be?
Kind of feels like a "core" deployment, not a servlet deployment.

Contributor

Just return null if there is no deployable file.
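
A minimal sketch of the null-return behaviour suggested in this thread, assuming file names stand in for Path objects. The selection order (XML first, then WAR, then directory) and the rule that a name without a dot is a directory are assumptions made only for the example; `MainPathSketch` and `mainDeploymentPath` are hypothetical names.

```java
import java.util.List;

// Hypothetical sketch: returns null instead of throwing when a unit
// has nothing deployable (for example only foo.properties or foo.d).
class MainPathSketch
{
    static String mainDeploymentPath(List<String> names)
    {
        String xml = null;
        String war = null;
        String dir = null;
        for (String name : names)
        {
            if (name.endsWith(".xml"))
                xml = name;
            else if (name.endsWith(".war"))
                war = name;
            else if (!name.contains("."))
                dir = name; // assumption: no extension means a directory
        }
        // Assumed priority: XML, then WAR, then directory.
        if (xml != null)
            return xml;
        if (war != null)
            return war;
        return dir; // null when the unit has no deployable path
    }
}
```

A caller would then treat a null result as "nothing to deploy yet" rather than as an error.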


// Calculate state of unit from Path states.
Unit.State ret = null;
for (Unit.State pathState : paths.values())
Contributor

The algorithm here needs to be well documented in a comment so we can check if it is correct. A unit is:

  • REMOVED if it is empty or all its deployable paths (xml, war, directory) are REMOVED
  • CHANGED if it has at least one path that is not UNCHANGED. If that one path is ADDED, then it must have another path that is not ADDED
  • ADDED if it has one or more deployable paths that are ADDED and none that are not ADDED

Note the inclusion of the concept of a deployable path (xml, war, directory), as only changes to them may result in ADDED or REMOVED. However a change to a non deployable path (properties or .d) can result in a CHANGED.
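
The rules listed above could be sketched as follows. This is one possible reading, not the PR's implementation: `UnitStateSketch`, `isDeployable`, and `calcState` are hypothetical names, file names stand in for Path objects, and a name without a dot is assumed to be a directory. Note that under this reading a unit with no deployable path at all is reported as REMOVED, which may or may not be the intended behaviour.

```java
import java.util.Map;

// Hypothetical sketch of a unit-state calculation that distinguishes
// deployable paths (xml, war, directory) from non-deployable ones.
class UnitStateSketch
{
    enum State { UNCHANGED, ADDED, CHANGED, REMOVED }

    // Assumption: xml, war, or a name without any extension (a directory).
    static boolean isDeployable(String name)
    {
        return name.endsWith(".xml") || name.endsWith(".war") || !name.contains(".");
    }

    static State calcState(Map<String, State> paths)
    {
        int deployable = 0;
        int added = 0;
        int removed = 0;
        boolean anyChange = false;
        for (Map.Entry<String, State> entry : paths.entrySet())
        {
            State state = entry.getValue();
            if (state != State.UNCHANGED)
                anyChange = true;
            if (!isDeployable(entry.getKey()))
                continue;
            deployable++;
            if (state == State.ADDED)
                added++;
            if (state == State.REMOVED)
                removed++;
        }
        // REMOVED: empty, or every deployable path is REMOVED.
        if (removed == deployable)
            return State.REMOVED;
        // ADDED: every deployable path is ADDED (at least one exists here).
        if (added == deployable)
            return State.ADDED;
        // CHANGED: any other path that is not UNCHANGED.
        return anyChange ? State.CHANGED : State.UNCHANGED;
    }
}
```

With this reading, an existing `foo.properties` plus a newly added `foo.war` yields ADDED, and a removed `foo.war` with `foo.properties` left behind yields REMOVED.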

Contributor

I don't think the implementation is right (also I think my description above might need improvement as well).

Contributor Author

I added some new javadoc, and some tests of Unit.calcState(), and fixed the implementation some to address these concerns better.

https://github.com/jetty/jetty.project/blob/fix/12.1.x/unify-deploy/jetty-core/jetty-deploy/src/test/java/org/eclipse/jetty/deploy/providers/UnitTest.java

Contributor

I still think it is not right.

* <dt>ADDED</dt>
* <dd>All Path states are in ADDED state</dd>
* <dt>CHANGED</dt>
* <dd>At least one Path state is CHANGED, or there is a variety of states</dd>
Contributor

I don't think this is exactly correct.

If we had a Unit that only contained non-deployable Paths (e.g. foo.properties) and then we ADDED a foo.war the result should be ADDED not CHANGED.

* <dt>UNCHANGED</dt>
* <dd>All Path states are in UNCHANGED state</dd>
* <dt>ADDED</dt>
* <dd>All Path states are in ADDED state</dd>
Contributor

I don't think this is correct, see example below

* <dt>CHANGED</dt>
* <dd>At least one Path state is CHANGED, or there is a variety of states</dd>
* <dt>REMOVED</dt>
* <dd>All Path states are in REMOVED state, or there are no Paths being tracked</dd>
Contributor

Again, if all deployable files (xml, war, directory) are removed, leaving only a property file, then the result should be REMOVED even though not all paths are removed.

I think this algorithm definitely needs the concept of deployable files (xml, war, directory).

Labels
Bug For general bugs on Jetty side Enhancement
Projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

Review DeploymentManager and ScanningAppProvider
5 participants