Sequences categorized into types of pseudogenes #56

navkahlon240 · 2023-04-10T14:06:25Z

Hi, thank you for this awesome pipeline for pseudogene analysis. I wanted to know whether I can get the FASTA sequences categorized as short, long, fragmented, and intergenic. The log shows the total number of short, long, fragmented, and intergenic pseudogenes, but is there any way to get the nucleotide sequences grouped by category, i.e. which sequences are short, long, or fragmented? I am interested in doing further analysis on the long sequences.

Thanks.

mitchso · 2023-04-10T15:55:12Z

Hi,

The categorical information for each pseudogene is found in the GFF output file. From there you can identify the locus tags associated with the group of pseudogenes you are interested in analyzing further, and then pull the sequences that correspond to those locus tags from the fasta files.
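A minimal sketch of that two-step workflow, using only the Python standard library. It is an illustration under several assumptions, not part of the pipeline: it assumes the category appears as text somewhere in the GFF attribute column, that each feature carries a `locus_tag=...` attribute, and that the locus tag appears as a whitespace-separated token in the FASTA header. The function names and the keyword are hypothetical; check a few lines of your own output files and adjust accordingly.

```python
import re


def locus_tags_by_keyword(gff_lines, keyword):
    """Collect locus tags of GFF features whose attribute column mentions `keyword`.

    Assumption: the category text and a `locus_tag=...` entry both live in
    column 9 (attributes) of a tab-separated GFF line.
    """
    tags = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip GFF directives, comments, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete feature line
        attributes = cols[8]
        if keyword.lower() not in attributes.lower():
            continue
        match = re.search(r"locus_tag=([^;]+)", attributes)
        if match:
            tags.append(match.group(1))
    return tags


def select_fasta_records(fasta_lines, wanted_tags):
    """Return (header, sequence) pairs whose header contains a wanted locus tag.

    Assumption: the locus tag is one of the whitespace-separated tokens
    in the FASTA header line.
    """
    wanted = set(wanted_tags)
    records = []
    header, chunks = None, []
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: store the previous one if it was wanted.
            if header is not None and wanted & set(header.split()):
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    # Store the final record if it was wanted.
    if header is not None and wanted & set(header.split()):
        records.append((header, "".join(chunks)))
    return records
```

To use it, open the GFF and FASTA output files from your own run, pass the file objects (or lists of lines) to the two functions in turn, and write the returned records to a new FASTA file. Matching on whole header tokens rather than substrings avoids picking up `tag_10` when only `tag_1` was requested.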

Hope this helps!
Mitch

liamfriar · 2023-04-13T15:45:55Z

Hi,

I also love the tool. The "Reason(s):" list appears to always be blank when the reason is that the feature was input as a pseudogene. It is still relatively easy to parse because of the pseudogene vs. pseudogene candidate designation in the .gff. I bring it up because when I then ran reannotate, it always reported 0 input pseudogenes. Maybe that is just how reannotate works, but I thought it might have something to do with the missing annotation in the .gff file. In "annotate.py", it looks like the pseudogene reason strings are sometimes saved in reason_dict, sometimes as pseudo_reasons, and sometimes as pseudo_candidate_reasons, so maybe these objects aren't all communicating with each other properly?

Thanks again. Great tool!

mitchso · 2023-04-16T21:50:59Z

Thanks for bringing this to my attention! I'll clean up the labelling and data structure soon.
Best,
Mitch
