Importing distance.cmp into R #77

Open
UniAlberta opened this issue Oct 5, 2024 · 7 comments

Comments

@UniAlberta

Hi, I ran sourmash sketch dna ~/*.fastq.gz on my FASTQ files and then sourmash compare *.sig -o distances.cmp -k 31. Now I have the distances.cmp output file and I want to use it in R to plot an ordination. I'm not sure how to import a .cmp file into R. Is there any code for that?
Thanks

@ctb
Contributor

ctb commented Oct 5, 2024

hi @UniAlberta, you'll want to use the --csv output instead - the .cmp file is a numpy binary matrix file that is probably more difficult to read into R!

Here's some example code: https://sourmash.readthedocs.io/en/latest/other-languages.html#r-code-for-working-with-compare-output
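For anyone who wants to check the CSV before moving to R, here is a minimal Python sketch using only the standard library. It assumes the layout that the linked R example implies: a header row of signature names followed by a square matrix of similarity values, with no row-name column. The function names and the sample data are made up for illustration, and the conversion to distance as 1 - similarity is a common choice for ordination, not something sourmash does for you.

```python
import csv
import io


def read_compare_csv(handle):
    """Return (labels, matrix) from a compare-style CSV file handle.

    Assumed layout: first row holds the signature names, every later
    row holds one row of the square similarity matrix.
    """
    reader = csv.reader(handle)
    labels = next(reader)
    matrix = [[float(value) for value in row] for row in reader if row]
    if len(matrix) != len(labels) or any(len(row) != len(labels) for row in matrix):
        raise ValueError("expected a square matrix matching the header row")
    return labels, matrix


def to_distance(matrix):
    """Convert similarities in [0, 1] to distances as 1 - similarity."""
    return [[1.0 - value for value in row] for row in matrix]


# Made-up example data, not real sourmash output.
example = io.StringIO("a.fastq.gz,b.fastq.gz\n1.0,0.25\n0.25,1.0\n")
labels, similarity = read_compare_csv(example)
distance = to_distance(similarity)
```

To use it on a real file, open the CSV with open("distance.cmp.csv", newline="") and pass the handle to read_compare_csv.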

@UniAlberta
Author

UniAlberta commented Oct 5, 2024

Thanks for your reply. I have all my .sig files, and when I try to run sourmash compare *.sig --csv distance.cmp.csv, I get this error: ModuleNotFoundError: No module named 'numpy'. Could you please help me fix it?
Here's the listing from my smash environment:
(smash) ...@...:~/miniforge3/envs/smash/lib/python3.9/site-packages/sourmash$ ls
__init__.py _lowlevel.py cli commands.py fig.py lca nodegraph.py sbt_storage.py sig utils.py
__main__.py _lowlevel__ffi.py command_compute.py compare.py hll.py logging.py np_utils.py sbtmh.py signature.py version.py
__pycache__ _lowlevel__lib.so command_sketch.py exceptions.py index.py minhash.py sbt.py search.py sourmash_args.py

@ctb
Contributor

ctb commented Oct 5, 2024

that's weird - what command did you use to install sourmash?

In any case, after activating the conda environment, you should be able to use

pip install numpy

or

conda install numpy

@UniAlberta
Author

I followed the installation instructions at https://sourmash.readthedocs.io/en/latest/tutorial-install.html. I ran conda install numpy, and then, in the smash environment, I ran sourmash compare *.sig --csv distance.cmp.csv, which gives me the same error: ModuleNotFoundError: No module named 'numpy'.
Have you updated the tool? I ran it last week and it worked.

@ctb
Contributor

ctb commented Oct 6, 2024

Nope, no updates. And in any case that wouldn't have broken your conda environment!

I'm wondering if maybe your conda environment activation is somehow messed up - try logging in again/starting a new shell, and then activating the smash environment again. conda list should show that numpy is installed, along with sourmash.
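One way to check this from inside Python is sketched below. It is a rough diagnostic, not part of sourmash: it prints which interpreter is running and whether numpy and sourmash can be found from it. The function names are made up for illustration.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


def environment_report(modules=("numpy", "sourmash")):
    """Describe the running interpreter and whether each module is visible."""
    lines = [f"interpreter: {sys.executable}", f"prefix: {sys.prefix}"]
    for name in modules:
        status = "found" if module_available(name) else "MISSING"
        lines.append(f"{name}: {status}")
    return lines


if __name__ == "__main__":
    print("\n".join(environment_report()))
```

Run it with the python command from the activated smash environment. If the interpreter path does not point inside that environment, the shell is probably picking up a different Python, which would explain a package installed in one place being invisible in another.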

@UniAlberta
Author

I tried pip3 install numpy and it worked. Thanks for your help.

@ctb
Contributor

ctb commented Oct 7, 2024

fantastic!

please feel free to ask for help here or elsewhere - more people monitor https://github.com/dib-lab/sourmash/issues so that's a good place to go :)
