Recent Package Updates
2025-07-15: rdp-classifier-2.14-1 (Bayesian classifier of taxonomic data)
The RDP Classifier is a naive Bayesian classifier which was developed to provide rapid taxonomic placement based on rRNA sequence data. The RDP Classifier can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed by Bergey's Trust. It provides taxonomic assignments from domain to genus, with confidence estimates for each assignment. The RDP Classifier is not limited to the bacterial taxonomy proposed by Bergey's Manual; it worked equally well when trained on the NCBI taxonomy, and it can likely be adapted to additional phylogenetically coherent bacterial taxonomies. The updated RDP Classifier now also works on fungal LSU sequences.
Wang, Q., G. M. Garrity, J. M. Tiedje, and J. R. Cole. 2007. Naive Bayesian Classifier for Rapid Assignment of rRNA Sequences into the New Bacterial Taxonomy. Appl Environ Microbiol. 73(16):5261-7; doi: 10.1128/AEM.00062-07 [PMID: 17586664]
commit log from Hanspeter Niederstrasser ([email protected]): rdp-classifier: v2.14

2025-07-15: vcftools-0.1.17-1 (Tools for VCF files)
A program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide methods for working with VCF files: validating, merging, comparing, and calculating some basic population genetic statistics.
Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinformatics. 2011;27(15):2156-8. Epub 2011/06/10. doi: 10.1093/bioinformatics/btr330. PubMed PMID: 21653522; PubMed Central PMCID: PMC3137218.
commit log from Hanspeter Niederstrasser ([email protected]): vcftools: 0.1.17

2025-07-15: bcftools-1.22-1 (Tools for VCF/BCF files)
BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatically even when streaming from a pipe. Indexed VCF and BCF will work in all situations. Un-indexed VCF and BCF and streams will work in most, but not all situations. In general, whenever multiple VCFs are read simultaneously, they must be indexed and therefore also compressed. BCFtools is designed to work on a stream. It regards an input file "-" as the standard input (stdin) and outputs to the standard output (stdout). Several commands can thus be combined with Unix pipes.
commit log from Hanspeter Niederstrasser ([email protected]): bcftools: 1.22

2025-07-15: muscle-5.3-2 (Protein multiple sequence alignment software)
MUSCLE is public domain multiple alignment software for protein and nucleotide sequences. MUSCLE stands for multiple sequence comparison by log-expectation.
R.C. Edgar (2021) "MUSCLE v5 enables improved estimates of phylogenetic tree confidence by ensemble bootstrapping" https://www.biorxiv.org/content/10.1101/2021.06.20.449169v1.full.pdf
commit log from Hanspeter Niederstrasser ([email protected]): muscle: v5.3

2025-07-15: bedtools-2.31.1-1 (Utilities for comparing genomic features)
The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM. Using BEDTools, one can develop sophisticated pipelines that answer complicated research questions by "streaming" several BEDTools together. The following are examples of common questions that one can address with BEDTools.
- Intersecting two BED files in search of overlapping features.
- Culling/refining/computing coverage for BAM alignments based on genome features.
- Merging overlapping features.
- Screening for paired-end (PE) overlaps between PE sequences and existing genomic features.
- Calculating the depth and breadth of sequence coverage across defined "windows" in a genome.
- Screening for overlaps between "split" alignments and genomic features.
Quinlan AR and Hall IM, 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26, 6, pp. 841-842.
commit log from Hanspeter Niederstrasser ([email protected]): bedtools 2.31.1

2025-07-15: libstaden-read11-shlibs-1.14.9-2 (Library for reading/writing DNA seq. results)
A fully developed set of DNA sequence assembly (Gap4 and Gap5), editing and analysis tools (Spin) for Unix, Linux, MacOSX and MS Windows.
commit log from Hanspeter Niederstrasser ([email protected]): libstaden-read: add packaging notes

2025-07-15: bamtools-2.5.3-1 (Tools for BAM alignment files)
Command-line toolkit for reading, writing, and manipulating BAM (genome alignment) files.
commit log from Hanspeter Niederstrasser ([email protected]): bamtools: v2.5.3

2025-07-15: samtools-1.22-1 (Tools for SAM alignment files)
SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.
commit log from Hanspeter Niederstrasser ([email protected]): samtools 1.22

2025-07-15: fasttree-2.2.0-1 (Fast inference of phylogenetic trees)
FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory. For large alignments, FastTree is 100-1,000 times faster than PhyML 3.0 or RAxML 7.
Price, M.N., Dehal, P.S., and Arkin, A.P. (2009) FastTree: Computing Large Minimum-Evolution Trees with Profiles instead of a Distance Matrix. Molecular Biology and Evolution 26:1641-1650, doi:10.1093/molbev/msp077.
commit log from Hanspeter Niederstrasser ([email protected]): fasttree: v2.2.0

2025-07-15: mothur-1.48.3-1 (Microbial ecology software suite)
Microbial ecology software suite
commit log from Hanspeter Niederstrasser ([email protected]): mothur: v1.48.3

2025-07-15: libhts3-shlibs-1.22-1 (Library for high-throughput sequencing data)
HTSlib is an implementation of a unified C library for accessing common file formats, such as SAM, CRAM, VCF, and BCF, used for high-throughput sequencing data. It is the core library used by samtools and bcftools.
HTSlib: C library for reading/writing high-throughput sequencing data. James K Bonfield, John Marshall, Petr Danecek, Heng Li, Valeriu Ohan, Andrew Whitwham, Thomas Keane, Robert M Davies. GigaScience, Volume 10, Issue 2, February 2021, giab007, https://doi.org/10.1093/gigascience/giab007
commit log from Hanspeter Niederstrasser ([email protected]): libhts3: v1.22
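
Each entry above opens with a header of the form "date: name-version-revision (summary)". For readers who want to process this list with a script, the following is a minimal Python sketch of splitting such a header into its parts. The names `PackageUpdate`, `HEADER_RE`, and `parse_header` are hypothetical and not part of any packaging tool, and the header shape is an assumption inferred from the entries listed here.

```python
import re
from typing import NamedTuple


class PackageUpdate(NamedTuple):
    """One parsed entry header (hypothetical helper type)."""
    date: str
    name: str
    version: str
    revision: str
    summary: str


# Assumed header shape: "<YYYY-MM-DD>: <name>-<version>-<revision> (<summary>)"
HEADER_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}): (\S+) \((.*)\)$")


def parse_header(line: str) -> PackageUpdate:
    """Split an entry header into date, name, version, revision, summary."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an entry header: {line!r}")
    date, full_name, summary = match.groups()
    # Package names may themselves contain hyphens
    # (e.g. "libstaden-read11-shlibs"), so split from the right.
    name, version, revision = full_name.rsplit("-", 2)
    return PackageUpdate(date, name, version, revision, summary)


if __name__ == "__main__":
    entry = parse_header("2025-07-15: bcftools-1.22-1 (Tools for VCF/BCF files)")
    print(entry.name, entry.version, entry.revision)  # prints: bcftools 1.22 1
```

Splitting from the right is the one design choice worth noting: it keeps hyphenated package names intact, at the cost of assuming that neither the version nor the revision contains a hyphen.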