Deep Learning and CRISPR Off-Target Effect Prediction — A Walkthrough

A Bit About CRISPR

If you haven’t come across CRISPR before, it’s basically a method of editing your genes. What are genes? Loosely speaking, they’re stretches of DNA 🧬 in your genome (that is, all the DNA in your body) that code for specific proteins. These proteins might give you a certain eye 👁 colour or hair colour, or might help your cells grow and divide. Proteins play a role in essentially every trait you have, and in the processes going on in each of your trillions of cells every second.

The central dogma of molecular genetics. Weird two-stranded molecule, to another weird single-stranded molecule, to a weird blob, to, finally, a weird ball of strings

Guide RNA Design and Problems

As I said before, the guide RNA molecule guides the CRISPR system to the correct spot on the DNA. It’s like the GPS 🗺️ that guides you to the nearest McDonald’s 🍟. Now, you might think that the CRISPR system will follow these directions really closely.

Off-target effects take place because of small mismatches between the gRNA and DNA that the CRISPR system accepts
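
To make the idea of a mismatch concrete, here is a minimal Python sketch that lists the positions where a guide and a candidate DNA site disagree. The two 23-letter sequences are made up for illustration, and the sketch assumes both are written in the same A/C/G/T alphabet and the same orientation, so a mismatch is simply a position where the letters differ.

```python
# Illustrative sequences only (not data from the article): 20 bases plus a 3-base PAM.
GUIDE = "GAGTCCGAGCAGAAGAAGAAGGG"
SITE = "GAATCCGAGCCGAAGAAGAAGGG"  # differs from GUIDE at two positions


def mismatch_positions(grna, dna):
    """Return the 0-based positions where the two sequences differ."""
    if len(grna) != len(dna):
        raise ValueError("sequences must have the same length")
    return [i for i, (g, d) in enumerate(zip(grna, dna)) if g != d]


positions = mismatch_positions(GUIDE, SITE)
print(positions)  # [2, 10]
```

A site like this one, with only two mismatches out of 23 positions, is the kind of near-match that the CRISPR system may still accept.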

The Alternative

How can we solve this problem?

The Benefit of Deep Learning

  1. It’s faster ⏱️
  2. It’s better at analyzing unclear relationships 📈

The Data

To train the models for off-target prediction, we need to give them both the mismatch information and the observed off-target effects. In this case, our gRNA sequences consist of 23 bases, including the 3-base PAM sequence. This means that each data point has length 23. But neural networks work best with numerical inputs, so we must convert the 4 letters that code the gRNA (A, C, T, G) into numerical 🔢 vectors.

This is how the OR operator combines both sequences to create a collated sequence that can communicate the mismatches between the complementary guide RNA and the target DNA
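
Here is a rough Python sketch of that encoding step. It assumes each letter becomes a 4-element one-hot vector and that the two encoded sequences are combined position by position with a bitwise OR. The channel order (A, C, G, T), the function names, and the example sequences are my own assumptions for illustration.

```python
# Assumed channel order; any fixed order works as long as it is used consistently.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def one_hot(seq):
    """Encode a sequence as a list of 4-element one-hot vectors."""
    return [ONE_HOT[base] for base in seq]  # raises KeyError on unknown letters


def encode_pair(grna, dna):
    """OR the two one-hot encodings position by position."""
    if len(grna) != len(dna):
        raise ValueError("sequences must have the same length")
    return [
        [g | d for g, d in zip(g_vec, d_vec)]
        for g_vec, d_vec in zip(one_hot(grna), one_hot(dna))
    ]


# Illustrative 23-letter sequences (not data from the article).
guide = "GAGTCCGAGCAGAAGAAGAAGGG"
site = "GAATCCGAGCCGAAGAAGAAGGG"

encoded = encode_pair(guide, site)
print(encoded[0])  # [0, 0, 1, 0]  G vs G: a match keeps a single 1
print(encoded[2])  # [1, 0, 1, 0]  G vs A: a mismatch shows up as two 1s
```

The result has 23 rows of 4 values. A matched position still looks like an ordinary one-hot vector, while a mismatched position carries two 1s, so the mismatch information is visible to the network in a single input.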

The Feedforward Neural Network

The first model is a feedforward neural network (FNN), also known as a multi-layer perceptron. This is the most basic form of neural network, with an input layer, one or more hidden dense layers, and, in this case, an output layer with two neurons.

The basic structure of an FNN
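
To show how data flows through that structure, here is a minimal forward pass in plain Python. This is only a sketch: the weights are random rather than trained, and the hidden layer sizes (32 and 16), the ReLU activation, and the softmax on the two output neurons are my assumptions, not details taken from the experiment. The input size of 92 assumes a 23 x 4 encoded sequence flattened into one vector.

```python
import math
import random


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias for each neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random, untrained weights and zero biases (illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """Hidden layers use ReLU; the final layer uses softmax."""
    *hidden, output = layers
    for weights, biases in hidden:
        x = relu(dense(x, weights, biases))
    weights, biases = output
    return softmax(dense(x, weights, biases))


rng = random.Random(0)
sizes = [92, 32, 16, 2]  # hypothetical: 23 * 4 inputs, two hidden layers, two outputs
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

x = [rng.choice([0.0, 1.0]) for _ in range(92)]  # stand-in for an encoded sequence
probs = forward(x, layers)
print(probs)  # two probabilities, one per output neuron
```

With two output neurons, one natural reading is that each neuron scores one class (off-target or not), and training would adjust the weights so that those scores line up with the labelled data.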

The Convolutional Neural Network: How It Works

The second type of model is the convolutional neural network (CNN), which has a similar structure to an FNN, but with a few special properties.

A CNN in action, classifying people, buses, and cars on a busy street (somewhere in Europe I think?). Pretty neat 😎 huh?

The convolutional layer at work, using a preset filter to create a feature map with all the important information in a more concise form
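
Here is a small Python sketch of what sliding a filter over an encoded sequence looks like. It assumes a one-dimensional convolution along the sequence positions, with no padding and a step of one position. The toy input and the all-ones filter are chosen by hand for illustration; in a trained CNN, the filter values are learned from the data.

```python
def conv1d(matrix, kernel):
    """Slide the kernel along the rows of the matrix and sum the elementwise products.

    matrix: list of positions, each a list of channel values
    kernel: list of k rows with the same number of channels
    """
    k = len(kernel)
    feature_map = []
    for start in range(len(matrix) - k + 1):
        total = 0
        for offset in range(k):
            for value, weight in zip(matrix[start + offset], kernel[offset]):
                total += value * weight
        feature_map.append(total)
    return feature_map


# Toy encoded sequence: 5 positions x 4 channels, with a mismatch at position 2
# (a mismatched position has two 1s instead of one).
encoded = [
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 0],
]

# Hand-picked filter covering 2 positions: it counts the 1s in each window.
kernel = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

print(conv1d(encoded, kernel))  # [2, 3, 3, 2]
```

The feature map is shorter than the input (4 values instead of 5 positions), and the two windows that overlap the mismatch stand out with a higher value. That is the "more concise form" idea in miniature.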

The Results

So, this all sounds cool, but did it work?

The performance of all the different models in the experiment, with the best performance for the standard CNN (CNN_std) and the FNN with 3 hidden layers (FNN_3layer)

Next Steps

Now, this is great news. The first time deep learning was applied to off-target effect and site prediction, it performed extraordinarily well and outperformed the existing methods it was compared against.
