Diced#
A Rust re-implementation of the MinCED algorithm to Detect Instances of CRISPRs in Environmental Data.
Overview#
MinCED is a method developed by Connor T. Skennerton to identify Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in isolate and metagenomic-assembled genomes. It was derived from the CRISPR Recognition Tool. It uses a fast scanning algorithm to identify candidate repeats, combined with an extension step to find maximally spanning regions of the genome that feature a CRISPR repeat.
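The scanning idea described above can be illustrated with a small, self-contained sketch. This is not the MinCED or Diced code: the function name `find_candidate_repeats` and all of its parameters (`seed_len`, `min_gap`, `max_gap`, `min_copies`) are hypothetical, and the extension step is left out. It only shows how a short seed can be searched for at regular intervals to flag a candidate repeat array.

```python
def find_candidate_repeats(sequence, seed_len=8, min_gap=20, max_gap=60, min_copies=3):
    """Return (seed, positions) pairs for seeds that recur at regular intervals.

    Simplified illustration only: a real CRISPR finder would also extend each
    seed into a full-length repeat and validate the spacers between copies.
    All parameter names and defaults here are assumptions made for the sketch.
    """
    candidates = []
    i = 0
    while i + seed_len <= len(sequence):
        seed = sequence[i:i + seed_len]
        positions = [i]
        while True:
            last = positions[-1]
            # The next copy must start between min_gap and max_gap characters
            # after the end of the previous copy.
            lo = last + seed_len + min_gap
            hi = last + seed_len + max_gap + seed_len
            nxt = sequence.find(seed, lo, hi)
            if nxt == -1:
                break
            positions.append(nxt)
        if len(positions) >= min_copies:
            candidates.append((seed, positions))
            # Skip past the array that was just reported.
            i = positions[-1] + seed_len
        else:
            i += 1
    return candidates


if __name__ == "__main__":
    # Toy input: three copies of a 10-nt repeat separated by 25-nt spacers.
    repeat = "GTTTCAGACG"
    toy = (repeat + "ATCCATTACCATACTTCAACTATCA"
           + repeat + "CCATTCAATACCTTAACATCTACAT"
           + repeat + "ACTACTTCAT")
    for seed, positions in find_candidate_repeats(toy):
        print(seed, positions)
```

Searching for a short seed first keeps the scan cheap; the more expensive work of growing the seed into a full repeat only needs to happen for the few positions that pass this filter.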
Diced is a Rust reimplementation of the MinCED method, using the original Java code as a reference. It produces exactly the same results as MinCED, corrects some bugs, and is much faster. The Diced implementation is available as a Rust library for convenience.
Diced is a Python package, so you can add it as a dependency to your
project, and stop worrying about the minced binary invoking the Java
Virtual Machine.
Directly pass sequence data as Python str objects, and retrieve the
results using an iterator.
The Java code uses a handwritten implementation of the Boyer-Moore algorithm,
while the Rust implementation uses the str::find method of the standard
library, which uses the Two-way algorithm.
The Rust code powering Diced is zero-copy: it will work without copying the sequence data from the Python memory space.
Get the same results as MinCED v0.4.2+10f0a26e.
Get the pre-built wheels from PyPI for a fast installation on a
variety of machines, or compile from source using maturin.
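As a rough illustration of the sequence-based interface mentioned above, the sketch below wraps the scan in a small helper. It assumes a top-level `diced.scan` function that accepts a `str` and yields objects with `start`, `end`, and `repeats` attributes; these names should be checked against the API reference before relying on them. The helper name `summarize_crisprs` is hypothetical.

```python
def summarize_crisprs(sequence):
    """Yield (start, end, repeat_count) for each CRISPR reported for `sequence`.

    Assumption: `diced.scan(sequence)` returns an iterator of CRISPR objects
    exposing `start`, `end`, and `repeats`. Verify against the API reference.
    """
    # Imported lazily so this helper can be defined even if diced is missing.
    import diced

    for crispr in diced.scan(sequence):
        yield crispr.start, crispr.end, len(crispr.repeats)


# Example use, given a genome already loaded as a plain Python string:
#     for start, end, count in summarize_crisprs(genome):
#         print(start, end, count)
```

Because the results come back through an iterator, they can be consumed one at a time without first collecting every CRISPR in a list.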
Setup#
Run pip install diced in a shell to download the latest release
from PyPI, or have a look at the Installation page to find
other ways to install diced.
Library#
License#
This library is provided under the GNU General Public License v3.0 or later. The code for this implementation was derived from the MinCED source code, which is available under the GPLv3 as well. See the Copyright Notice section for more information.
This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original MinCED authors. It was developed by Martin Larralde during his PhD project at the Leiden University Medical Center in the Zeller team.