iFISH probe design
ggirelli/iFISH-probe-design (2.0.4)
We included two algorithms in ifpd: the first designs a single probe in a genomic region of interest (gROI), and the second designs a number of homogeneously spread probes in a gROI (i.e., a spotting probe).
Both algorithms rely on features calculated for either single probes or spotting probes. The following sections describe these features and the algorithms in more detail.
The single probe design algorithm is implemented in the ifpd_query_probe script.
size: defined as the distance between the genomic coordinates of the first base covered by the first oligo and the last base covered by the last oligo.
centrality: defined as one minus the ratio between (a) the distance from the gROI midpoint to the probe midpoint and (b) the gROI's half-size. It spans from 0, when the probe midpoint sits on the gROI's border, to 1, when the probe midpoint coincides with the gROI's midpoint.
homogeneity of inter-oligo distance: defined as the reciprocal of the standard deviation of the inter-oligo distance. The distance between two consecutive oligos is the difference in genomic coordinates between the last base covered by the first oligo and the first base covered by the second oligo.

The single probe design algorithm requires the following inputs:
A gROI, e.g., chr1:1000000-1001000.
The order of the features, by default: (1) size, (2) homogeneity, and (3) centrality.

The algorithm performs the following steps:
Filter out the candidate probes whose first feature (size by default) falls outside a range around the best value. This step behaves differently depending on the feature, using the following ranges:
size: min(size) ± F*min(size)
homogeneity: max(homogeneity) ± F*max(homogeneity)
centrality: max(centrality) ± F*max(centrality)
Sort the candidate probes by the second feature (homogeneity by default), from the best to the worst value. This step behaves differently depending on the feature, sorting as follows:
size: from min(size) to max(size)
homogeneity: from max(homogeneity) to min(homogeneity)
centrality: from max(centrality) to min(centrality)
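
The feature definitions and the filter and sort steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ifpd_query_probe implementation. The names probe_features, filter_by_feature, and sort_by_feature are hypothetical. Oligos are assumed to be (start, end) pairs sorted by position, the population standard deviation is assumed, centrality is computed as one minus the midpoint-distance ratio so that it reaches 1 at the gROI midpoint, and a zero standard deviation is mapped to an infinite homogeneity.

```python
import math
from statistics import pstdev

# Which end of each feature counts as "best" (hypothetical helper table).
BEST = {"size": min, "homogeneity": max, "centrality": max}


def probe_features(oligos, groi_start, groi_end):
    """Compute size, homogeneity and centrality for one candidate probe."""
    first_start, last_end = oligos[0][0], oligos[-1][1]
    size = last_end - first_start
    # Inter-oligo distance: end of one oligo to the start of the next one.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(oligos, oligos[1:])]
    spread = pstdev(gaps) if gaps else 0.0
    # Assumption: perfectly even spacing gets an infinite homogeneity.
    homogeneity = 1 / spread if spread > 0 else float("inf")
    groi_mid = (groi_start + groi_end) / 2
    probe_mid = (first_start + last_end) / 2
    half_size = (groi_end - groi_start) / 2
    centrality = 1 - abs(groi_mid - probe_mid) / half_size
    return {"size": size, "homogeneity": homogeneity, "centrality": centrality}


def filter_by_feature(candidates, feature, F):
    """Keep candidates within best ± F*best for the given feature."""
    best = BEST[feature](c[feature] for c in candidates)
    if math.isinf(best):
        # inf - F*inf is undefined, so keep only the candidates at the best value.
        return [c for c in candidates if c[feature] == best]
    low, high = best - F * best, best + F * best
    return [c for c in candidates if low <= c[feature] <= high]


def sort_by_feature(candidates, feature):
    """Sort from best to worst: ascending for size, descending otherwise."""
    return sorted(candidates, key=lambda c: c[feature], reverse=BEST[feature] is max)


# Toy example on the gROI chr1:1000000-1001000 (coordinates are made up).
groi = (1_000_000, 1_001_000)
probes = [
    [(1_000_100, 1_000_140), (1_000_160, 1_000_200), (1_000_230, 1_000_270)],
    [(1_000_400, 1_000_440), (1_000_450, 1_000_490), (1_000_500, 1_000_540)],
    [(1_000_050, 1_000_090), (1_000_300, 1_000_340), (1_000_900, 1_000_940)],
]
candidates = [probe_features(p, *groi) for p in probes]
kept = filter_by_feature(candidates, "size", F=0.5)
ranked = sort_by_feature(kept, "homogeneity")
```

In this toy example the third candidate is much larger than the smallest one, so the size filter drops it, and the evenly spaced second candidate ranks first when sorting by homogeneity.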
The spotting probe design algorithm is implemented in the ifpd_query_set script.
homogeneity of inter-probe distance and probe size: defined as the average of two reciprocals, that of the standard deviation of inter-probe distance and that of the standard deviation of probe size. The distance between two consecutive probes is the difference in genomic coordinates between the last base covered by the last oligo in the first probe and the first base covered by the first oligo in the second probe.

The spotting probe design algorithm requires the following inputs:
A gROI, e.g., chr1:1000000-1001000.
The order of the features, by default: (1) size, (2) homogeneity, and (3) centrality.

The algorithm works as follows:
Calculate the homogeneity of inter-probe distance and probe size for each candidate probe set, and use it, alongside the number of probes, to sort the sets in decreasing order (some sets might have fewer probes due to a lack of oligonucleotides in a window).
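
The ranking of candidate probe sets described above can be sketched as follows. This is a minimal illustration, not the ifpd_query_set implementation. The names reciprocal_std, set_homogeneity, and rank_probe_sets are hypothetical. Probes are assumed to be (start, end) intervals sorted by position, the population standard deviation is assumed, and a zero standard deviation is mapped to an infinite reciprocal. The text does not say which sorting key takes precedence, so this sketch assumes the number of probes comes first and homogeneity breaks ties.

```python
from statistics import pstdev


def reciprocal_std(values):
    """Reciprocal of the population standard deviation (inf when it is zero)."""
    spread = pstdev(values) if values else 0.0
    return 1 / spread if spread > 0 else float("inf")


def set_homogeneity(probes):
    """Average of the reciprocal std of inter-probe distances and of probe sizes."""
    # Inter-probe distance: end of one probe to the start of the next one.
    distances = [nxt[0] - cur[1] for cur, nxt in zip(probes, probes[1:])]
    sizes = [end - start for start, end in probes]
    return (reciprocal_std(distances) + reciprocal_std(sizes)) / 2


def rank_probe_sets(probe_sets):
    """Sort sets in decreasing order of (number of probes, homogeneity)."""
    return sorted(
        probe_sets,
        key=lambda s: (len(s), set_homogeneity(s)),
        reverse=True,
    )


# Toy probe sets (coordinates are made up).
set_a = [(0, 100), (200, 310), (400, 500)]  # evenly spread, similar sizes
set_b = [(0, 100), (150, 400), (900, 950)]  # uneven spacing and sizes
set_c = [(0, 100), (200, 300)]              # one probe missing
ranked_sets = rank_probe_sets([set_c, set_b, set_a])
```

Under the stated assumption, set_c ranks last despite its perfectly uniform probes, because it contains fewer probes than the other two sets.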