-
Notifications
You must be signed in to change notification settings - Fork 0
poloxu/CSE280A_BFBSimulator
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Documentation for BFB Simulations Lena Krockenberger 8/17/22 BFBSimulator Parameters: usage: BFBSimulator.py num_cycles chr_str start_pos end_pos genome [-h] [-l L] [-ld {expo,simple}] [-df DF] [-dl DL] [-dd {expo,simple}] [-xf XF] [-xl XL] [-xd {expo,simple}] [-o O] [-rev] [-p P] [-b B] [-bl BL] positional arguments: num_cycles Number of BFB cycles to simulate chr_str Name of the chromosome to simulate BFB on start_pos Start position of the genome section to simulate BFB on (0-indexed, inclusive) end_pos End position of the genome section to simulate BFB on (0-indexed, exclusive) genome FASTA file of the genome optional arguments: -h, --help show this help message and exit -l L Mean length of the BFB segments (default 600000) -ld {expo,simple} Distribution of the BFB segment lengths. "expo"(default): exponential; "simple": all use the same value as the mean length -df DF Probability for each BFB segment to have one deletion (default 0.1) -dl DL Mean deletion length of the BFB segments (default 50000) -dd {expo,simple} Distribution of the BFB segment deletion lengths. "expo"(default): exponential; "simple": all use the same value as the mean deletion length -xf XF Probability for each BFB segment to have one duplication (default 0.1) -xl XL Mean duplication length of the BFB segments (default 10000) -xd {expo,simple} Distribution of the BFB segment duplication lengths. "expo"(default): exponential; "simple": all use the same value as the mean duplication length -o O Prefix of the output files -rev If this option is specified, BFB simulation will append string to the 5' side of the region -p P For each segment, the probability that deletion is simulated before duplication (default 0.5) -b B Probability that there is a deletion when appending duplicated segment. (default 0.1) -bl BL Length of deletion when appending duplicated segment. (default 50000) Simple Case - (low segment count, low copy number, no indels, no end deletions): Ex with ERBB2: num_cycles 5 chr_str 17 start_pos 33600000 end_pos 36600000 (add 3,000,000 to start_pos) genome new_hg38.fa (fasta file name) -l L default -ld {expo,simple} default -df DF 0 (don’t want deletions) -dl DL 0 (don’t want deletions) -dd {expo,simple} simple -xf XF 0 (don’t want duplications) -xl XL 0 (don’t want duplications) -xd {expo,simple} simple -o O outputs/simple/ERBB2 (can add file paths to output prefixes) -rev defualt -p P 0 (no deletions) -b B 0 (no end deletions) -bl BL 0 (no end deletions) Final command: ./BFBSimulator.py 5 17 33600000 36600000 new_hg38.fa -df 0 -dl 0 -dd simple -xf 0 -xl 0 -xd simple -o outputs/simple/ERBB2 -p 0 -b 0 -bl 0 Intermediate Case - (regular segment number and copy number, end deletions, no indels): Ex with ERBB2: num_cycles 10 (larger number of cycles increases copy number) chr_str 17 start_pos 33600000 end_pos 39600000 (add 6,000,000 to start_pos) genome new_hg38.fa (fasta file name) -l L default -ld {expo,simple} default -df DF 0 (don’t want deletions) -dl DL 0 (don’t want deletions) -dd {expo,simple} simple -xf XF 0 (don’t want duplications) -xl XL 0 (don’t want duplications) -xd {expo,simple} simple -o O outputs/end_Deletions/ERBB2 (can add file paths to output prefixes) -rev default -p P 0 (no deletions) -b B 1 (always simulate end deletion when appending new segment) -bl BL 50000 Final Command: ./BFBSimulator.py 10 17 33600000 39600000 new_hg38.fa -df 0 -dl 0 -dd simple -xf 0 -xl 0 -xd simple -o outputs/end_Deletions/ERBB2 -p 0 -b 1 -bl 50000 Complex Case - (regular segment number and copy number, end deletions, indels): Ex with ERBB2: num_cycles 10 chr_str 17 start_pos 33600000 end_pos 39600000 (add 6,000,000 to start_pos) genome new_hg38.fa (fasta file name) -l L default -ld {expo,simple} default -df DF default -dl DL default -dd {expo,simple} default -xf XF default -xl XL default -xd {expo,simple} default -o O outputs/complex/ERBB2 (can add file paths to output prefixes) -rev default -p P default -b B default -bl BL default Final Command: ./BFBSimulator.py 10 17 33600000 39600000 new_hg38.fa -o outputs/complex/ERBB2 Script for Running Simulations – simulations.sh Usage: ./simulations.sh Parameters: none When running simulations.sh, 5 different oncogenes are simulated through each of the 3 cases (simple, intermediate, and complex). The 5 genes include: ERBB2, MYC, EGFR, BCL2, and PIK3CA. Simple case: 1 simulation per gene with the same parameters for “simple case” detailed above (5 simulations total) Intermediate case: 3 simulations per gene with the same parameters for “intermediate case” detailed above (15 simulations total) Complex case: 2 simulations per gene with the same parameters for “complex case” detailed above (10 simulations total) In total, simulations.sh runs 30 simulations of varying complexity. To modify details of the simulations, the script must be edited.
About
No description, website, or topics provided.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published