Disjoint assemblies: how to treat them? #59
Comments
What exactly do you mean by disjoint assemblies? From the meaning of the word and the description, I understand that a disjoint assembly is a crystal containing two or more different assemblies that do not share any biological interface. An example would be a C3 assembly with A3 stoichiometry and a C2 assembly with B2 stoichiometry in the AU. In that case, I would find it uninteresting to list them together and display all the combinations (A3+B, A2+B2), because we really need to consider them as different assemblies (although they happen to be in the same crystal). I would rather treat them separately and show one line for each assembly (one line for the C3-A3 assembly and another for the C2-B2 assembly). This could solve the combinatorial explosion in displaying the results (and could be used as a heuristic for computation), because the graphs of disjoint heteromeric assemblies become independent homomeric graphs, and it could also improve the interpretation.
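To make the "independent graphs" point concrete, here is a minimal sketch that splits a set of chains into complexes by taking connected components of the interface graph. The function name, chain labels, and the list of engaged interfaces are hypothetical and only illustrate the idea; this is not the project's actual implementation.

```python
from collections import defaultdict

def connected_complexes(chains, interfaces):
    """Group chains into complexes (connected components of the interface graph)."""
    adjacency = defaultdict(set)
    for a, b in interfaces:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    complexes = []
    for start in chains:
        if start in seen:
            continue
        # Iterative depth-first search from a chain not yet assigned to a complex.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        complexes.append(sorted(component))
    return complexes

# Hypothetical crystal: a C3 trimer of A chains and a C2 dimer of B chains,
# with no engaged interface between any A chain and any B chain.
chains = ["A1", "A2", "A3", "B1", "B2"]
engaged = [("A1", "A2"), ("A2", "A3"), ("A3", "A1"), ("B1", "B2")]
complexes = connected_complexes(chains, engaged)
```

Under these assumptions the A3 and B2 groups come out as separate components, which is what would allow them to be scored or displayed independently.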
By disjoint assemblies we mean assemblies that do not share interfaces, e.g. stoichiometry A3C3 + stoichiometry B2. We only allow this in heteromeric cases. In homomeric cases we don't allow them because they would violate the isomorphism rule. The idea of using one line for each of them in the WUI display is good. One problem with it is that so far each line of the assembly results page corresponds to a fully covering assembly (i.e. an assembly that covers all components in the crystal). Breaking that would require a few changes in data structures.
I think it is important that the assembly diagram displays a complete covering of the unit cell. This is also required for the latticeGraph to be consistent. I think we do want to be able to handle such cases in our scoring function, because we would like to list author annotations like 1e94 (eppic-science#63) that are co-crystals. These would presumably get a penalty. With regard to the combinatorial explosion issue, I would suggest that we restrict the main assembly generation procedure to non-co-crystals, or at most 2 disjoint complexes. Then we rely on the heuristic generation procedure to supply common co-crystals (like "all monomers"). BTW, I think that co-crystals will not turn out to be particularly uncommon due to crystallization factors like nanobodies and DARPins, which would often be classified as xtal interfaces. So it may be worth including a restriction like "one complex plus some monomers".
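As a rough sketch of the restriction suggested above, a candidate assembly could be filtered by the number of disjoint complexes it contains, with an optional exception for "one complex plus some monomers". The function name, parameters, and the representation of an assembly as a list of chain lists are all assumptions made for illustration.

```python
def passes_disjointness_filter(complexes, max_complexes=2, allow_monomers=True):
    """Decide whether a candidate assembly is kept by the main generation step.

    `complexes` is a list of complexes, each given as a list of chain labels
    (a hypothetical representation, not the project's data structure).
    """
    # Non-co-crystals and small co-crystals are always kept.
    if len(complexes) <= max_complexes:
        return True
    # Optional exception: at most one multimeric complex, everything else monomers.
    if allow_monomers:
        multimers = [c for c in complexes if len(c) > 1]
        return len(multimers) <= 1
    return False

dimer_plus_monomers = [["A1", "A2"], ["B1"], ["B2"]]
three_dimers = [["A1", "A2"], ["B1", "B2"], ["C1", "C2"]]
kept = passes_disjointness_filter(dimer_plus_monomers)
rejected = passes_disjointness_filter(three_dimers)
```

Note that with `allow_monomers=True` this sketch also keeps an "all monomers" assembly, whereas the comment above proposes leaving that case to the heuristic generation procedure; the exact policy would need to be decided.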
Thank you for the explanations. Now that I understand the problem better, I have been thinking more about how to display the results. I agree that assemblies should cover the full unit cell, and that our data structures are designed for that, but the display should focus on the biological significance of the assemblies, and disjoint assemblies are by definition independent (co-crystals, like joining multiple independent crystals in one). Maybe we can think of a way to keep the internal representation (data structures) the same, but adapt the display. An idea I came up with is using the ID column to include multiple values if the assembly is disjoint. That way we could specify all the combinations of disjoint assemblies with very few rows. Now we are displaying multiple values in the ... As an example, the permutations for an A6(D3)+B6(D3) disjoint assembly are currently represented as:
The new representation would be:
It is just an idea, so if the implementation is very difficult and the number of cases is small (or these assemblies always have a low score), it will probably not be worth implementing. Another issue that might arise is how to handle the 3D lattice graph and assembly diagram.
This issue overlaps a bit with #101, which is more focused on the WUI aspect of disjoint assemblies.
Interesting idea. Perhaps we should distinguish between an assembly (full covering of the unit cell, formerly sometimes called the superassembly) and a complex (unique connected component of an assembly). This could in general be a many-to-many relationship (depending on what properties we assign to each concept). One problem I see with an interface like this is that it's not clear which complexes are compatible. For instance, your table above doesn't include combination entries like (AB) or (AB)6. So how would we express in the WUI situations like "A6 requires one of the B* complexes but is incompatible with (AB)*"? I like the idea of reducing visual redundancy, but I worry that it would require much more sophisticated users.
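One way to picture the assembly/complex distinction and its many-to-many relationship is the following minimal sketch. The class names, fields, and example values are hypothetical and do not correspond to existing data structures in the code base.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Complex:
    """One unique connected component, e.g. stoichiometry "A6" with symmetry "D3"."""
    stoichiometry: str
    symmetry: str

@dataclass(frozen=True)
class Assembly:
    """A full covering of the unit cell, made of one or more distinct complexes."""
    assembly_id: int
    complexes: frozenset  # frozenset of Complex

def assemblies_containing(target, assemblies):
    """Return the IDs of all assemblies in which a given complex takes part."""
    return [a.assembly_id for a in assemblies if target in a.complexes]

# Hypothetical example: the same A6 complex takes part in two assemblies,
# and each assembly contains two complexes (many-to-many).
a6 = Complex("A6", "D3")
b6 = Complex("B6", "D3")
b2 = Complex("B2", "C2")
assemblies = [
    Assembly(1, frozenset({a6, b6})),
    Assembly(2, frozenset({a6, b2})),
]
a6_ids = assemblies_containing(a6, assemblies)
b2_ids = assemblies_containing(b2, assemblies)
```

Keeping complexes immutable and hashable lets the same complex object be shared between assemblies, which is the property a reduced display would rely on.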
I did not include combination entries because I only wanted to show the differences in disjoint assemblies, but the idea is that if they are not disjoint the display is the same as it is now. Assuming that in the case above both A and B are C6 instead of D3 and that they have interfaces between them, the table would continue as follows:
With this, all possible combinations would be covered, with assemblies 1 to 16 being disjoint (the display has been reduced) and assemblies 17 to 20 being combined AB. The situation you described is expressed by the assembly ID. Because A6 does not have any ID in the range 17-20, it is incompatible with any of the (AB)* complexes.
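The ID-based compatibility rule described above can be sketched as follows. The specific complex names and the way IDs 1 to 16 are assigned are assumptions chosen only to match the counts mentioned in this comment (16 disjoint assemblies, 4 combined ones); the actual table rows may differ.

```python
from itertools import product

# Hypothetical complexes for chains A and B, plus combined (AB)* complexes.
a_complexes = ["A", "A2", "A3", "A6"]
b_complexes = ["B", "B2", "B3", "B6"]
combined = ["AB", "(AB)2", "(AB)3", "(AB)6"]

# Map each complex (one display row) to the set of assembly IDs it takes part in.
ids = {}
for assembly_id, (a, b) in enumerate(product(a_complexes, b_complexes), start=1):
    ids.setdefault(a, set()).add(assembly_id)  # disjoint assemblies 1..16
    ids.setdefault(b, set()).add(assembly_id)
for assembly_id, ab in enumerate(combined, start=17):
    ids[ab] = {assembly_id}                    # combined assemblies 17..20

def compatible(x, y):
    """Two complexes are compatible if they share at least one assembly ID."""
    return bool(ids[x] & ids[y])

rows_needed = len(ids)              # rows in the reduced display
a6_with_b = compatible("A6", "B")
a6_with_ab6 = compatible("A6", "(AB)6")
```

In this sketch the reduced display needs 12 rows to describe 20 assemblies, and compatibility between two rows is read off from whether their ID sets intersect.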
This would require introducing another layer to the assembly hierarchy, so it's unrealistic for a 3.0 release. For now we need to just display lots of redundant assemblies.
Following our assembly rules, at the moment we consider disjoint assemblies as valid assemblies (this only affects heteromeric protein crystals).
That introduces some issues in how to handle them:
We introduced them because we know of some examples where co-crystallization seems plausible, and thus where a disjoint prediction would be desirable. See for instance: 2xqw, 1bui