Diced#

A Rust re-implementation of the MinCED algorithm to Detect Instances of CRISPRs in Environmental Data.

Overview#

MinCED is a method developed by Connor T. Skennerton to identify Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in isolate and metagenomic-assembled genomes. It was derived from the CRISPR Recognition Tool. It uses a fast scanning algorithm to identify candidate repeats, combined with an extension step to find maximally spanning regions of the genome that feature a CRISPR repeat.

Diced is a Rust reimplementation of the MinCED method, using the original Java code as a reference. It produces exactly the same results as MinCED, corrects some bugs, and is much faster. The Diced implementation is available as a Rust library for convenience.
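To make the scanning idea above more concrete, here is a minimal sketch of a k-mer scan for candidate repeats. It is not the Diced or MinCED code: the function name `candidate_repeats`, its parameters (`k`, `min_period`, `max_period`, `min_count`) and their default values are hypothetical, and the extension step is left out entirely. It only shows how a plain substring search can locate a short word that recurs at bounded intervals.

```python
from typing import Iterator, List, Tuple


def candidate_repeats(
    sequence: str,
    k: int = 8,            # hypothetical search window length
    min_period: int = 20,  # hypothetical minimum distance between repeat starts
    max_period: int = 80,  # hypothetical maximum distance between repeat starts
    min_count: int = 3,    # hypothetical minimum number of occurrences
) -> Iterator[Tuple[str, List[int]]]:
    """Yield (k-mer, start positions) for k-mers recurring at bounded intervals."""
    n = len(sequence)
    i = 0
    while i + k <= n:
        kmer = sequence[i:i + k]
        positions = [i]
        while True:
            last = positions[-1]
            # Only accept the next occurrence if it starts between
            # `min_period` and `max_period` characters after the last one.
            lo = last + min_period
            hi = min(n, last + max_period + k)
            j = sequence.find(kmer, lo, hi)
            if j == -1:
                break
            positions.append(j)
        if len(positions) >= min_count:
            yield kmer, positions
            i = positions[-1] + k  # resume after the last occurrence
        else:
            i += 1


# Toy input: one 8-letter repeat, three copies, separated by 22-letter spacers.
repeat = "GGGGCCCC"
toy = repeat + "A" * 22 + repeat + "T" * 22 + repeat
print(list(candidate_repeats(toy)))  # [('GGGGCCCC', [0, 30, 60])]
```

A real implementation would then extend each occurrence to the left and right to recover the full repeat, as described above.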

Batteries-included

Diced is distributed as a Python package, so you can add it as a dependency of your project and no longer need the minced binary, or the Java Virtual Machine it invokes, at runtime.

Flexible I/O

Directly pass sequence data as Python str objects, and retrieve the results using an iterator.

Fast

The Java code uses a handwritten implementation of the Boyer-Moore algorithm, while the Rust implementation relies on the str::find method of the standard library, which implements the Two-Way string-matching algorithm.

Memory-efficient

The Rust code powering Diced is zero-copy: it works directly on the sequence data held in Python memory, without copying it.

Consistent results

Get the same results as MinCED v0.4.2+10f0a26e.

Pre-built packages

Get the pre-built wheels from PyPI for a fast installation on a variety of machines, or compile from source using maturin.
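The Flexible I/O point above can be sketched as follows. This is a minimal sketch, assuming that `diced.scan` accepts a `str` and returns an iterator of objects exposing `start`, `end` and `repeats` attributes; check these names against the API reference before relying on them. The helper `summarize_crisprs` is hypothetical, and takes the scanning function as a parameter so the snippet itself does not require Diced to be installed.

```python
from typing import Callable, Iterable, Iterator, Tuple


def summarize_crisprs(
    sequence: str,
    scan: Callable[[str], Iterable],
) -> Iterator[Tuple[int, int, int]]:
    """Yield (start, end, number of repeats) for each CRISPR found by `scan`.

    `scan` is assumed to behave like `diced.scan`: it takes the sequence
    as a `str` and returns an iterable of objects with `start`, `end`
    and `repeats` attributes (assumed names, not verified here).
    """
    for crispr in scan(sequence):
        yield crispr.start, crispr.end, len(crispr.repeats)
```

With Diced installed, the call would presumably look like `summarize_crisprs(sequence, diced.scan)`, where `sequence` is a genome loaded as a Python `str`. The results are consumed lazily, one CRISPR at a time.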

Setup#

Run pip install diced in a shell to download the latest release from PyPI, or have a look at the Installation page to find other ways to install diced.

License#

This library is provided under the GNU General Public License v3.0 or later. The code for this implementation was derived from the MinCED source code, which is available under the GPLv3 as well. See the Copyright Notice section for more information.

This project is in no way affiliated with, sponsored, or otherwise endorsed by the original MinCED authors. It was developed by Martin Larralde during his PhD project at the Leiden University Medical Center in the Zeller team.