Packaging and csv export #3

Merged
merged 18 commits into from
Nov 27, 2024

Conversation

NithinMathewJosephAston
Contributor

I have reworked the tool to take the input file via the argparse module and to generate output CSV files containing Barcodes, Features, and Annotations. The build is meant to be used like an executable, hence this format. To execute the code: python3 ./cloupe.py <input_file path>

Once the script is executed it will generate .csv files in the current working directory.
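
For readers following along, a minimal sketch of such an argparse entry point is shown here. It is not the code from this PR: the `--out-dir` option and the `sample.cloupe` file name are hypothetical and used only for illustration.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the command line; argv=None makes argparse read sys.argv."""
    parser = argparse.ArgumentParser(
        description="Export barcodes, features and annotations as CSV files."
    )
    parser.add_argument("input_file", type=Path, help="path to the input file")
    # Hypothetical option (not in the PR): choose where the CSV files go.
    parser.add_argument(
        "--out-dir",
        type=Path,
        default=Path.cwd(),
        help="directory for the generated .csv files (default: current directory)",
    )
    return parser.parse_args(argv)


# Passing an explicit list makes the parser easy to exercise without a shell.
args = parse_args(["sample.cloupe"])
print(args.input_file, args.out_dir)
```

Accepting `argv` as a parameter keeps the parser testable, and defaulting the output directory to the current working directory preserves the behaviour described above.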

Contributor

prete commented Nov 22, 2024

Add gitignore to the root of the repo:
https://github.com/github/gitignore/blob/main/Python.gitignore


# Final csv addition
with open(barcode_clusters, 'w') as barcode_clusters_file:
    barcode_clusters_file.write("Barcodes,{},{}\n".format(cloupe.celltracks[0]["Name"],cloupe.celltracks[1]["Name"]))
Contributor

This code ONLY works if there are two celltracks.
What if there are none? What if there's only one? What if there are 99?
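
A minimal sketch of a header that adapts to the number of celltracks follows. It assumes only what the quoted line shows, namely that each entry of `cloupe.celltracks` is a mapping with a `"Name"` key; the function name `cluster_header` and the sample track names are hypothetical.

```python
def cluster_header(celltracks):
    """Build the CSV header row for any number of celltracks (0, 1, 99, ...)."""
    return ["Barcodes"] + [track["Name"] for track in celltracks]


print(cluster_header([]))
print(cluster_header([{"Name": "Graph-based"}, {"Name": "K-Means"}]))
```

With no celltracks the header degrades to a single `Barcodes` column instead of raising an `IndexError`.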

barcode_clusters_file.write(
    "\n".join(
        [
            re.sub(r"[()' ]", "", str(pair))  # Remove parentheses, single quotes, and spaces
Contributor

This is also hard-coded to support only 2 celltracks.
It should be dynamic, based on the number of items in cloupe.celltracks.

re.sub(r"[()' ]", "", str(pair)) is very odd
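
One way to avoid formatting tuples with `str()` and then stripping characters with `re.sub` is to build each row as a list and let `csv.writer` do the joining. The sketch below assumes the per-celltrack values are available as one list per track, aligned with the barcodes; that layout, the name `cluster_rows`, and the sample data are assumptions for illustration, not the PR's actual data model.

```python
import csv
import io


def cluster_rows(barcodes, track_values):
    """Yield one row per barcode: the barcode, then its value in each celltrack."""
    if not track_values:
        # No celltracks at all: emit barcode-only rows.
        for barcode in barcodes:
            yield [barcode]
        return
    # zip(*track_values) turns per-track lists into per-barcode tuples.
    for barcode, values in zip(barcodes, zip(*track_values)):
        yield [barcode, *values]


buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerows(cluster_rows(["AAAC-1", "AAAG-1"], [[1, 2], ["T cell", "B cell"]]))
print(buffer.getvalue())
```

Because the values are never passed through `str(tuple)`, spaces and quotes inside a value (such as `T cell`) survive unchanged.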


collection = []
# writing the barcodes to the csv files
with open(barcodes, 'w') as barcodes_file:
Contributor

prete commented Nov 22, 2024

Could each of the writes (barcodes, features, and annotations) be its own function? If so, should they be methods of the Cloupe class?

Have a look at https://docs.python.org/3.9/library/csv.html for writing CSVs with quoting and all that fun.
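
As an illustration of what the csv module handles automatically, here is a small sketch of one write as its own function. The function name `write_features` and the feature names are hypothetical; only `csv.writer`, `csv.QUOTE_MINIMAL`, and `csv.reader` are taken from the standard library.

```python
import csv
import io


def write_features(handle, features):
    """Write one feature per row; handle is any text file-like object."""
    writer = csv.writer(handle, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(["Features"])
    writer.writerows([name] for name in features)


buffer = io.StringIO()
write_features(buffer, ["GeneA", "Gene B, isoform 2"])
print(buffer.getvalue())

# Reading the text back recovers the original values, comma included.
print(list(csv.reader(io.StringIO(buffer.getvalue()))))
```

With `QUOTE_MINIMAL`, only fields that contain the delimiter, the quote character, or a line break are quoted. When writing to a real file, the csv documentation recommends opening it with `newline=""`.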

Contributor Author

I have been trying various combinations of the csv writer with the data that we have. I found it difficult to write the CSV properly, even when using the quoting and quotechar options along with the delimiter. I was wondering whether I could continue without the csv module. Let me know your thoughts.

Regarding moving the writes into methods: should I remove the main function entirely, so that cloupe.py contains only the class?

Contributor

prete commented Nov 22, 2024

Related to #2

@NithinMathewJosephAston NithinMathewJosephAston marked this pull request as ready for review November 26, 2024 07:57
@NithinMathewJosephAston
Contributor Author

Hi Martin,
I have added most of the requirements that you mentioned previously. Have a look and let me know what else needs to be added or changed.

Contributor

prete commented Nov 26, 2024

  • Comma missing in pyproject.toml at the end of this line:
    "License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)"
  • version and author are in this file. If you'll have version there (which is good), version in pyproject.toml should be a dynamic property
  • Remove the subfolder cloupe_package_nj9, just leave the code at src/ level.

src/cloupe.py Outdated
def spatial_plot(self):
    spatial_plot_attributes = vars(self)
    spatial_embedding = spatial_plot_attributes["projections"]['Spatial']
    plt.scatter(x=spatial_embedding[0], y=spatial_embedding[1])
Contributor

Didn't we want to dump the projections as CSV rather than plot them?

Contributor Author

If I do that, wouldn't the file format have to change to xls, because CSV doesn't have that capability? Correct me if I am wrong.

Contributor

Only get the coordinates, no plotting, for example:

  • projection_tsne.csv
  • projection_Spatial.csv

Each file being:

Barcodes  x    y     z
ACTCAT-1  0.3  0.1   0.0
ACTCAT-1  0.7  0.1   0.0
...       0.0  0.0   0.0
ACTCTT-1  1.0  0.45  0.0

Contributor Author

Quick question: why is there a z coordinate? Do I need to set it to 0.0 by default?

Contributor

I think the number of axes may differ between projections: some may have 2, others 3.

When working with a specific projection, you can get the number of dimensions from the first element of "Dims", which is the number of columns in the table.

cols, rows = tuple(projection["Dims"])

Instead of x,y,z you can use d0,d1,d2 matching each dimension in the projection.
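
Put together, the suggestion could look roughly like the sketch below. It assumes the coordinates are available as one sequence per dimension (as the `spatial_embedding[0]` / `spatial_embedding[1]` indexing in the reviewed code suggests) and that `"Dims"` holds `[columns, rows]`; the function name `write_projection`, the separate `columns` argument, and the sample barcodes are hypothetical.

```python
import csv
import io


def write_projection(handle, barcodes, projection, columns):
    """Write Barcodes,d0,d1,... rows for a single projection."""
    n_dims, n_rows = tuple(projection["Dims"])  # first entry: number of columns
    if len(columns) != n_dims or any(len(col) != n_rows for col in columns):
        raise ValueError("coordinate data does not match Dims")
    if len(barcodes) != n_rows:
        raise ValueError("expected one barcode per row")
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Barcodes"] + [f"d{i}" for i in range(n_dims)])
    # zip(*columns) turns per-dimension sequences into per-barcode coordinates.
    for barcode, coords in zip(barcodes, zip(*columns)):
        writer.writerow([barcode, *coords])


buffer = io.StringIO()
write_projection(
    buffer,
    barcodes=["BC-1", "BC-2", "BC-3"],
    projection={"Dims": [2, 3]},
    columns=[[0.3, 0.7, 1.0], [0.1, 0.1, 0.45]],
)
print(buffer.getvalue())
```

A two-dimensional projection then gets only `d0,d1`, so no placeholder `0.0` column is needed for a missing third axis.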

Contributor Author

Done

Contributor Author

NithinMathewJosephAston commented Nov 27, 2024

I have added the projection methods. Let me know if there is anything I can do to improve or tweak the performance of those functions. Also, do I need to add methods for projections like UMAP and Fiducials?
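
One possible way to avoid a separate method per projection is to derive the output file name from the projection's name, so that any projection present in the file is handled by the same code path. This is only a sketch of that idea, not something decided in this thread; the helper name `projection_filename` and the sanitising rule are assumptions.

```python
import re


def projection_filename(name):
    """Map a projection name to a file name such as projection_Spatial.csv."""
    # Replace runs of characters that are awkward in file names with "_".
    safe = re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_")
    return f"projection_{safe}.csv"


for name in ["Spatial", "UMAP", "Fiducials"]:
    print(name, "->", projection_filename(name))
```

Whether every projection has one row per barcode (Fiducials, for instance) would need to be checked against the data before writing a `Barcodes` column for it.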

@prete prete merged commit 9d873fd into main Nov 27, 2024
@prete prete deleted the Dev_nmj branch November 27, 2024 15:55
2 participants