ENH make change percentiles plot to histogram #97
Thanks!! |
@justinshaffer, what do you think about this plot type for alpha diversity? |
Feedback from the comms team: "I get that the participant is supposed to see what their alpha-diversity is compared to the two countries that were studied. I just don’t get the how or what it means in comparison. They need to be able to walk away knowing how the information provided directly relates to them and how to interpret it with the least information given as possible." |
@wasade Thanks! I think it's clear and looks good. I'm not sure how widely histograms are understood, so I suggest making things super clear, for example by changing the x-axis label to 'proportion of individuals', or even better to percentages, and having the y-axis include a second line of text or similar that clarifies that lower values are less diverse and higher values are more diverse (e.g., "<-- lower diversity / higher diversity -->"). It's also tempting to place additional markers other than 'You' for fun, but I understand that using anything implying 'sick' or 'healthy' is dangerous. But what about the average for, say, healthy infants, just to highlight that increases in Faith's PD are associated with age, at least early on? Just thoughts - I think it's good as is! |
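For reference, the labeling suggestions above could be sketched as a Plotly layout object roughly like this. This is only an illustration: `buildAlphaLayout` is a hypothetical helper, not code from this PR, and it assumes alpha diversity sits on the x-axis with the share of individuals on the y-axis (as later comments in this thread suggest), with bar heights given as fractions between 0 and 1.

```javascript
// Hypothetical helper (not from this PR): builds a Plotly layout object.
// Assumption: x-axis = alpha diversity metric, y-axis = fraction of individuals.
function buildAlphaLayout(metricName) {
  return {
    xaxis: {
      title: {
        // "<br>" starts a second line inside a Plotly axis title.
        text: metricName + '<br>← lower diversity | higher diversity →',
      },
    },
    yaxis: {
      title: { text: 'Percentage of individuals' },
      // d3-format string: a value of 0.25 is rendered as "25%".
      tickformat: '.0%',
    },
    // Draw the per-country histograms on top of each other.
    barmode: 'overlay',
  };
}

const layout = buildAlphaLayout("Faith's PD");
console.log(layout.xaxis.title.text);
```

The resulting object would be passed as the layout argument when the plot is drawn; the exact wording of the labels is of course up for discussion.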
@gwarmstrong, any update here? |
Looks great, George!
Is there a way to include the metric name in the x-axis title above or
below what you have currently? Or is the metric name now in the plot title
(e.g., 'Richness')? If that's the case I think it works as is!
Justin
…On Thu, Mar 11, 2021 at 2:46 PM George Armstrong ***@***.***> wrote:
Here are some updates to the wording on the plot!
[image: Screen Shot 2021-03-11 at 2 45 42 PM]
<https://user-images.githubusercontent.com/19470970/110865459-80970780-8278-11eb-9d16-c4f8386d65ed.png>
--
Justin Shaffer, PhD
IRACDA Postdoctoral Fellow
Rob Knight Group
Department of Pediatrics, School of Medicine
University of California, San Diego
justinshafferbio.wordpress.com
|
Hi George,
Thanks! I think that is more clear.
Sorry if I'm providing contradicting suggestions, but I think the title
should read 'Microbial Alpha-Diversity by Country', as my understanding is
that richness and Faith's PD are both distinct alpha-diversity metrics. I
think species richness is technically defined as 'the count of the number
of species', so could be equivalent to 'observed_otus' or
'number_of_distinct_features', and when visualizing those metrics I think
we should use 'richness' in their place. I think many folks may use
'alpha-diversity' and 'richness' interchangeably, and arguably for this
type of education it might be better to lean on the side of 'easier to
understand' than 'technically correct'. It's probably worth getting others'
opinions as I'm a stickler for terminology and jargon.
Justin
…On Thu, Mar 11, 2021 at 4:32 PM George Armstrong ***@***.***> wrote:
The metric is Faith's PD. I could include it in the x-axis title, like so:
[image: Screen Shot 2021-03-11 at 4 31 55 PM]
<https://user-images.githubusercontent.com/19470970/110873765-539e2100-8287-11eb-915f-5ce7529c1cd3.png>
|
Awesome! Looks great. Nice work!
…On Thu, Mar 11, 2021 at 5:02 PM George Armstrong ***@***.***> wrote:
All good! I think that explanation makes sense. Here is the plot produced
with the latest updates:
[image: Screen Shot 2021-03-11 at 5 01 11 PM]
<https://user-images.githubusercontent.com/19470970/110875971-7a5e5680-828b-11eb-92f7-47bf2b13939d.png>
|
Thank you both!!! This is great! |
While we're here, I do want to make one more plug for cumulative density. In the cumulative histogram below, compared to the histograms shown so far, someone can much more easily see the percentage of samples that have a higher/lower alpha diversity compared to their sample, by looking at the y coordinate of their "You" line. E.g., this sample's alpha diversity is greater than about 60% of US samples and ~55% of UK samples, which at least more directly answers a question I might have when looking at this plot. It is pretty hard to get this from the probability density histogram. Also, the plot is just prettier when the alpha diversity values are not present in all bins (e.g., left tail of the histograms shown above). @justinshaffer any ideas on how we could make the cumulative density aspect more approachable for general participants? |
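To make the "y coordinate of the 'You' line" concrete, here is a minimal sketch of the underlying calculation: the fraction of a country's samples whose alpha diversity is at or below the participant's value. The function name and the numbers are made up for illustration; nothing here is taken from the PR's code.

```javascript
// Fraction of reference samples with alpha diversity <= the participant's value.
// This is the height of the cumulative curve where the "You" line crosses it.
function fractionAtOrBelow(values, you) {
  if (values.length === 0) return NaN; // no reference samples to compare against
  let count = 0;
  for (const v of values) {
    if (v <= you) count += 1;
  }
  return count / values.length;
}

// Made-up numbers, for illustration only.
const usa = [4.1, 6.3, 7.8, 9.0, 12.5];
const you = 8.2;
// Three of the five values are <= 8.2, so this is 60.
const pct = Math.round(100 * fractionAtOrBelow(usa, you));
console.log(`Your sample is at least as diverse as about ${pct}% of these samples.`);
```

The same number could be printed next to the plot as plain text, which may help participants who do not read cumulative plots often.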
One concern I have here is that the cumulative plots don't align on the right side, which makes it look like an artifact of the visualization, although I agree it is prettier in that it is smoothed. Could the histogram be expressed with some type of smoothing function, maybe? |
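One possible smoothing approach is a Gaussian kernel density estimate drawn as a filled line instead of bars. The sketch below is an assumption about how that could look, not something this PR implements: `gaussianKde` and `kdeTrace` are hypothetical names, and the bandwidth would need to be chosen by hand or by some rule of thumb. Note that the y-axis then shows a density rather than a proportion of samples.

```javascript
// Gaussian kernel density estimate evaluated at each x position in `grid`.
function gaussianKde(values, grid, bandwidth) {
  const norm = 1 / (values.length * bandwidth * Math.sqrt(2 * Math.PI));
  return grid.map((x) => {
    let sum = 0;
    for (const v of values) {
      const z = (x - v) / bandwidth;
      sum += Math.exp(-0.5 * z * z);
    }
    return norm * sum;
  });
}

// Hypothetical helper: wraps the estimate in a Plotly line trace with a filled area.
function kdeTrace(name, values, grid, bandwidth) {
  return {
    type: 'scatter',
    mode: 'lines',
    fill: 'tozeroy', // shade the area between the curve and y = 0
    name: name,
    x: grid,
    y: gaussianKde(values, grid, bandwidth),
  };
}
```

A larger bandwidth gives a smoother curve but hides more detail, so it would be worth checking the result against the raw histogram before showing it to participants.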
Hi George,
Sure thing!
- Yes why are the colors not aligned on the right-hand side of the plot?
Does this matter?
- I agree with everything you said re: this being much easier to pull
something meaningful out of. Sorry this may be dumb but I don't look at
these often - is it really the case that 60% of the blue data is to the
left of the 'You' line - rather than to the right of it? I'm squinting my
eyes and I think the left-hand tails are artifactually making the left-hand
part of each dataset seem smaller than what is on the right-hand side.
- I think all we would need to include with this figure for folks to
understand is a small bit of text describing essentially what you already
wrote - you could squeeze this into the x-axis label if preferred -
something like: 'For each country, the point where your line crosses the
curve indicates the % of people whose samples are less diverse than yours!'
Justin
…On Fri, Mar 12, 2021 at 10:31 AM George Armstrong ***@***.***> wrote:
[image: Screen Shot 2021-03-12 at 10 19 45 AM]
<https://user-images.githubusercontent.com/19470970/110981851-a2e26100-831c-11eb-8600-22bc73c611d7.png>
|
I have not yet figured out a way to extend the colors to the right for a cumulative histogram in Plotly. I can work more on figuring this out if it matters, but if it is not supported by Plotly then this ticket gets a lot more open-ended.
The "left seeming smaller than the right" impression might come from reading the plot as area under the curve. It is not necessarily the case that 60% of the area under the CDF is to the left of the 'You' line. Instead, the blue USA bar having a height of 60% at the 'You' line indicates that 60% of samples from the USA have an alpha diversity <= the x-value at the 'You' line. The height of the bar is cumulative frequency, not a count of samples. |
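One way to sidestep the right-edge alignment question, whatever Plotly's built-in cumulative histogram does or does not support, would be to precompute the cumulative fractions on a bin grid shared by all countries and plot those values directly. The sketch below assumes this approach; `sharedEdges`, `cumulativeOnGrid`, and `cumulativeTrace` are hypothetical helpers, and the data are made up.

```javascript
// Upper bin edges spanning all groups, so every curve ends at the same x value.
function sharedEdges(groups, nBins) {
  let lo = Infinity;
  let hi = -Infinity;
  for (const group of groups) {
    for (const v of group) {
      if (v < lo) lo = v;
      if (v > hi) hi = v;
    }
  }
  const step = (hi - lo) / nBins;
  // Use `hi` itself for the last edge to avoid floating-point drift.
  return Array.from({ length: nBins }, (_, k) =>
    k === nBins - 1 ? hi : lo + (k + 1) * step
  );
}

// Cumulative fraction of samples <= each edge (edges must be increasing).
function cumulativeOnGrid(values, edges) {
  const sorted = [...values].sort((a, b) => a - b);
  let i = 0;
  return edges.map((edge) => {
    while (i < sorted.length && sorted[i] <= edge) i += 1;
    return i / sorted.length;
  });
}

// Hypothetical helper: a step-shaped Plotly line trace of the cumulative fractions.
function cumulativeTrace(name, values, edges) {
  return {
    type: 'scatter',
    mode: 'lines',
    line: { shape: 'hv' }, // horizontal-then-vertical steps
    name: name,
    x: edges,
    y: cumulativeOnGrid(values, edges),
  };
}

// Made-up data: the two groups have different maxima.
const usaValues = [2, 4, 6, 8, 10];
const ukValues = [3, 5, 7];
const edges = sharedEdges([usaValues, ukValues], 4); // [4, 6, 8, 10]
const traces = [
  cumulativeTrace('USA', usaValues, edges),
  cumulativeTrace('UK', ukValues, edges),
];
```

Because both traces are evaluated on the same edges, each one reaches 1.0 at the right-most edge, and the height at any x is exactly the cumulative frequency described above rather than an area.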
That makes sense - thanks for the clarification!
|
It would be nice to have this merged, as it's one of the few result types we have for skin/oral samples. Please let me know whether this will be completed or should be closed. |
This PR should address the plotting concerns in #94, including:
Sample below:
Note that it seems git's diffing algorithm gave some weird results starting around 398/433, beginning with `function updateSimilarity(similarityData, state){`. To the best of my knowledge, I did not modify anything below that.