Patterned, random guide generation and scoring

Aaron McKenna edited this page Sep 5, 2020 · 1 revision

Generate random guides matching a set pattern

This tutorial assumes you've followed the Quickstart on the main page and that you've indexed a real genome you want to work with. For the human genome, that looks like this:

java -Xmx6000m -jar FlashFry-assembly-<version_number_here>.jar \
index \
--tmpLocation ./tmp \
--database hg38_database \
--reference ../genome/hg38.fa \
--enzyme spcas9ngg

I've recently added the ability to generate patterned random guide sequences in FlashFry. Targets are generated from a set pattern, written as a comma-separated list of FASTA (IUPAC) nucleotide codes, such as N for any base and D for any base except C. Some examples are below:

Generate a known base editing sequence

Our first example is a set of targets that can be edited by the ABEMAX construct (see supplemental figure 8 from this paper for the editing window). Let's say we want a C at the 5th position, and no other C within two bases on either side of it. We want a random set of targets that generally look like this:

--**C**-------------NGG

Here dashes represent any base, stars indicate bases other than C, and a Cas9 PAM sequence sits at the end. We'd specify this using the following FlashFry command:

java -Xmx6000m -jar FlashFry-assembly-<version_number_here>.jar random \
-randomCount=10000 \
-patterned N,N,D,D,C,D,D,N,N,N,N,N,N,N,N,N,N,N,N,N \
-enzyme SPCAS9 \
-outputFile abemax_targets.txt

We can look at the output file and compare it to our design:

--**C**-------------NGG
CGGTCAAGTACAGCCTAGGATGG
TCTGCTTCAAGCTCAATGAATAG
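
If you'd rather check the output programmatically than by eye, the comparison can be sketched in a few lines of Python. This is only an illustration, not FlashFry code: `IUPAC` and `matches_pattern` are hypothetical names, only the codes used in this tutorial are listed, and the PAM is ignored so that just the 20 patterned bases are compared.

```python
# Hypothetical helper (not part of FlashFry): test a target against a
# comma-separated pattern of IUPAC codes, ignoring the trailing PAM.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "D": "AGT",   # any base except C
}

def matches_pattern(target, pattern):
    codes = pattern.split(",")
    protospacer = target[:len(codes)]  # drop the PAM at the end
    if len(protospacer) != len(codes):
        return False
    return all(base in IUPAC[code] for base, code in zip(protospacer, codes))

PATTERN = "N,N,D,D,C,D,D,N,N,N,N,N,N,N,N,N,N,N,N,N"
for target in ("CGGTCAAGTACAGCCTAGGATGG", "TCTGCTTCAAGCTCAATGAATAG"):
    print(target, matches_pattern(target, PATTERN))
```

Both example targets above pass this check; a target with a C at position 3, 4, 6, or 7 would not.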

Generate matching degenerate bases

Patterns can also have degenerate bases that match each other. For instance, if we want two Ns to be the same base every time, we can add a number after the code so that FlashFry 'remembers' the result. Let's say we want the two sets of stars on each side of the C in the pattern above to match. We'd update our pattern as follows:

-patterned N,N,D1,D2,C,D1,D2,N,N,N,N,N,N,N,N,N,N,N,N,N \

This gives results that look like the following (pattern on the top):

--**C**-------------NGG
TCGTCGTTGGATTCACTAATCGG
TTGACGAGGTTTAGGAACCGCAG
GGTTCTTCGGAAGAAGCTTTTAG

The first has GT on each side of the C, the second GA, and the third TT.
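
To make the 'remembering' idea concrete, here is a minimal Python sketch of how such a pattern could be sampled. It is an assumption about the behavior described above, not FlashFry's actual implementation: `random_target` is a hypothetical name, and the sketch assumes a remembered base is keyed on the whole token (so `D1` and `N1` would be separate slots).

```python
import random
import re

# Subset of IUPAC codes used in this tutorial (illustrative only).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "D": "AGT"}

def random_target(pattern, rng=random):
    """Draw one sequence from a pattern such as 'N,N,D1,D2,C,D1,D2'."""
    remembered = {}  # token such as 'D1' -> base chosen the first time it appeared
    bases = []
    for token in pattern.split(","):
        code, slot = re.fullmatch(r"([A-Z])(\d*)", token).groups()
        if slot and token in remembered:
            base = remembered[token]         # reuse the earlier draw
        else:
            base = rng.choice(IUPAC[code])   # fresh random draw
            if slot:
                remembered[token] = base
        bases.append(base)
    return "".join(bases)

pattern = "N,N,D1,D2,C,D1,D2,N,N,N,N,N,N,N,N,N,N,N,N,N"
seq = random_target(pattern)
# The two bases before the C always equal the two bases after it.
print(seq, seq[2:4] == seq[5:7])
```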

Designed complementary bases

Lastly, we can set up complementary bases, where a remembered base is complemented when it is reused. For instance, if we want palindromic targets, where one half of the target is the reverse complement of the other, we can add the dash character after a remembered base. To make a fully palindromic target, we'd specify it as follows:

-patterned N1,N2,N3,N4,N5,N6,N7,N8,N9,N10,N10-,N9-,N8-,N7-,N6-,N5-,N4-,N3-,N2-,N1- \

We get results that look like this (split at the N10/N10- junction, with the last three bases being the PAM):

TCACCGGGCA TGCCCGGTGAGAG
TTAGGGTCTT AAGACCCTAAGGG
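
A quick way to confirm that these targets are palindromic is to compare the 20 patterned bases with their own reverse complement. The Python below is a small sketch for checking output by hand (the name `reverse_complement` is ours, not a FlashFry function), with the space removed from the two results above:

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

for target in ("TCACCGGGCATGCCCGGTGAGAG", "TTAGGGTCTTAAGACCCTAAGGG"):
    protospacer, pam = target[:20], target[20:]
    # A palindromic protospacer equals its own reverse complement.
    print(protospacer, pam, protospacer == reverse_complement(protospacer))
```

Both lines print True for the final field.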