ifpd

iFISH probe design

ggirelli/iFISH-probe-design (2.0.4)

Content

Algorithms

We included two algorithms in ifpd: the first designs a single probe in a genomic region of interest (gROI), and the second designs a number of homogeneously spread probes in a gROI (i.e., a spotting probe).

Both algorithms are based on the calculation of features related to either single or spotting probes. The following sections describe both the features and the algorithms in more detail.

Single probe design

This algorithm is implemented in the ifpd_query_probe script.

Features

Algorithm

The single probe design algorithm requires the following inputs:

The algorithm performs the following steps:

  1. Retrieve all oligonucleotides in the region of interest, from the database.
  2. Identify all sets of NO consecutive oligonucleotides from the retrieved ones.
  3. Calculate the three features for each oligonucleotide set.
  4. Discard all sets that have a priority 1 feature (size by default) outside a range around the best value. This step behaves differently depending on the feature, using the following ranges:
    • size: min(size)±F*min(size)
    • homogeneity: max(homogeneity)±F*max(homogeneity)
    • centrality: max(centrality)±F*max(centrality)
  5. Sort the remaining sets based on the priority 2 feature (homogeneity by default), from the best to the worst value. This step behaves differently depending on the feature, sorting as follows:
    • size: from min(size) to max(size)
    • homogeneity: from max(homogeneity) to min(homogeneity)
    • centrality: from max(centrality) to min(centrality)
  6. Provide as output the first set in the sorted list, which is considered to be the optimal probe.
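The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of ifpd_query_probe: the function names `consecutive_sets` and `pick_probe`, the parameter `f` (standing for F), and the dictionary layout of a candidate set are all hypothetical, and the three feature values are assumed to have been computed already (step 3).

```python
from typing import Dict, List, Optional, Sequence

# For size the best value is the minimum; for the other two it is the maximum.
SMALLER_IS_BETTER = {"size": True, "homogeneity": False, "centrality": False}


def consecutive_sets(oligos: Sequence, n_oligos: int) -> List[tuple]:
    """Step 2: every run of n_oligos consecutive oligos (a sliding window)."""
    return [tuple(oligos[i:i + n_oligos])
            for i in range(len(oligos) - n_oligos + 1)]


def pick_probe(candidates: List[Dict], f: float = 0.1,
               priority1: str = "size",
               priority2: str = "homogeneity") -> Optional[Dict]:
    """Steps 4-6: filter on the priority 1 feature, sort on priority 2.

    Each candidate is a dict holding precomputed feature values
    (hypothetical layout, chosen for this sketch only).
    """
    if not candidates:
        return None
    values = [c[priority1] for c in candidates]
    best = min(values) if SMALLER_IS_BETTER[priority1] else max(values)
    # Step 4: keep only the sets within best ± f * best.
    low, high = best - f * abs(best), best + f * abs(best)
    kept = [c for c in candidates if low <= c[priority1] <= high]
    # Step 5: sort from the best to the worst priority 2 value.
    kept.sort(key=lambda c: c[priority2],
              reverse=not SMALLER_IS_BETTER[priority2])
    # Step 6: the first set in the sorted list is taken as the optimal probe.
    return kept[0]
```

With the default priorities, a candidate that is slightly larger than the smallest one can still be selected if it falls within the size range and has a higher homogeneity.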

Spotting probe design

This algorithm is implemented in the ifpd_query_set script.

Features

Algorithm

The spotting probe design algorithm requires the following inputs:

The algorithm works as follows:

  1. Retrieve all oligonucleotides in the region of interest, from the database.
  2. Identify all sets of NO consecutive oligonucleotides from the retrieved ones.
  3. Calculate the three features for each oligonucleotide set.
  4. Divide the region into NP+1 windows.
  5. For each of the first NP windows, run the single probe design algorithm and identify the optimal probe.
  6. Define the set of NP optimal probes as a spotting probe.
  7. Shift the windows by W and repeat steps 4-6 until the whole region has been covered (e.g., if W is 0.1, after 11 iterations).
  8. For each spotting probe, calculate the homogeneity of inter-probe distance and probe size, and use it together with the number of probes to sort the spotting probes in decreasing order (some spotting probes might have fewer probes due to a lack of oligonucleotides in a window).
  9. Provide as output the first spotting probe in the sorted list, which is considered to be the optimal spotting probe.
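The windowing and ranking steps can be sketched as follows. This is a rough illustration, not the code of ifpd_query_set. It assumes that W is a fraction of the window size, which is consistent with the 11 iterations quoted for W = 0.1. It also assumes that spotting probes are ranked first by number of probes and then by a precomputed homogeneity value. The names `window_layouts` and `best_spotting_probe` and the dictionary layout are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple


def window_layouts(start: float, end: float, n_probes: int,
                   w: float = 0.1) -> Iterator[List[Tuple[float, float]]]:
    """Steps 4-7: yield, for each shift, the first n_probes windows.

    The region is split into n_probes + 1 windows; shifting the first
    n_probes windows by w * window_size at a time covers the region.
    """
    window_size = (end - start) / (n_probes + 1)
    n_shifts = round(1 / w) + 1  # e.g. w = 0.1 gives 11 layouts
    for k in range(n_shifts):
        offset = start + k * w * window_size
        yield [(offset + i * window_size, offset + (i + 1) * window_size)
               for i in range(n_probes)]


def best_spotting_probe(spotting_probes: List[Dict]) -> Optional[Dict]:
    """Steps 8-9: prefer more probes, then higher homogeneity.

    Each spotting probe is a dict with a "probes" list and a precomputed
    "homogeneity" value (hypothetical layout, chosen for this sketch only).
    """
    return max(spotting_probes,
               key=lambda s: (len(s["probes"]), s["homogeneity"]),
               default=None)
```

In a full pipeline, each window of each layout would be passed to the single probe design algorithm, and the resulting probes collected into one spotting probe per layout before ranking.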