Lab wiki • Lab manual, protocols, etc. Login required.
Calendars • Meetings and instruments.
NCBI • Central NCBI databases for nucleotide and protein sequences (among other things).
ENA • A European nucleotide database similar to NCBI's.
JGI/IMG • A database for well-annotated microbial (meta)genomes and transcriptomes from the DOE. Downloadable sequences and metadata.
UniProt • A central repository for well-annotated protein sequences (and metadata, and AlphaFold predictions). Likely to contain anything from NCBI's NP and NR databases, and the ENA, but not necessarily NCBI WGS or JGI/IMG.
InterPro • An EBI-hosted central repository for protein family information (includes PFAM and TIGRfam data); also includes classification tools, profile hidden Markov models, etc.
bioinformatics.org • Basic nucleotide-based sequence analysis/manipulation. Everyone uses it for generating the reverse complement of sequences because it is the first Google result.
Jalview • Free Java-based tool for visualizing and manipulating sequences. OS-agnostic, and lighter-weight than things like uGene for basic alignment and visualization.
EFI-EST • A web tool to construct sequence similarity networks for your protein family. Designed for UniProt integration, but can handle custom .fasta input.
BLAST • Alignment-based search for similar protein or nucleic acid sequences. Servers targeting specific databases available via NCBI, UniProt, IMG (login may be required for full functionality), etc.
Diamond • A sequence aligner for protein and translated DNA searches (download only) that is faster than BLAST on large databases but is still less widely used.
MAFFT • A tool for accurate protein and nucleic acid sequence alignment. Server available (and it can also be integrated into Jalview and Geneious).
MUSCLE • Another tool for accurate alignment of protein or nucleic acid sequences. Download for local install or run via Jalview or Geneious. Either MAFFT or MUSCLE is generally a better choice than ClustalΩ in terms of accuracy.
NGPhylogeny.fr • A server with a bunch of alignment and phylogeny tools.
HMMER • A toolset for profile hidden Markov model use and construction. EBI provides a more limited online version of these tools.
InterProScan • EBI search tool to identify protein or domain families from the InterPro database in a protein sequence.
HHblits and HHpred • Part of an extensive toolset from the Max Planck Institute with a web interface. HHblits searches HMM databases, looking for remote homology (while HHpred takes into account structure), but many additional tools (MAFFT and MUSCLE installs, secondary structure predictors, CLANS, DIAMOND-DeepClust and MMSeqs2 implementation, etc.) are also available.
Expasy • From the SIB, a collection of many online tools for predicting protein sequence properties. Frequently used: STRING, ProtParam, PeptideMass.
SignalP • Predict signal peptides in proteins.
DeepTMHMM • Predict transmembrane domains in proteins.
TopCons • More transmembrane prediction.
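For scripted work, the reverse-complement operation mentioned under bioinformatics.org above is easy to do locally. A minimal Python sketch, assuming plain DNA input containing only A, C, G, T, and N (the function name is illustrative, and other IUPAC ambiguity codes are passed through unchanged):

```python
# Translation table mapping each base to its complement (both cases, plus N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Characters outside A/C/G/T/N are left as-is (an assumption of this sketch).
    """
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGGCC"))  # GGCCAT
```

For anything beyond quick checks (ambiguity codes, RNA, file parsing), an established library is likely the safer choice.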
protein structure tools
AlphaFold • Predict protein structures with this Colab version (use a cluster install for the full version)!
ColabFold • Predict protein structures, more configurably (multiple Colab notebooks available)! See also localColabFold.
RoseTTAFold • Predict protein structures - also via the Baker lab's server (login required).
DALI • A server to look for structural homology in the PDB with a source structure. Recently added AlphaFold DB searching.
FoldSeek • An additional structural homology search server - looks at non-PDB sources (like UniProt AlphaFold predictions) but by default won't delve as deeply in the PDB. Can also be installed locally. DALI is probably better for accuracy at the moment.
MEGA • A downloadable program with highly configurable phylogeny tools for smaller sequence sets.
FastTree • Does what it says on the tin: makes trees - even really, really large ones - fast. Not an online implementation.
IQ-TREE • A newer maximum-likelihood tree tool (I'd recommend it over RAxML or PhyML these days except for large datasets, where FastTree still wins.)
NGPhylogeny.fr • A server with a bunch of alignment and phylogeny tools.
ggTree • A very flexible tree visualization package in R with a predictably real learning curve.
figTree • A clunky but widespread OS-agnostic program for tree visualization.
Dendroscope • A differently clunky alternative tree viewing program.
iTOL • An online tool for tree viewing and annotation, but subscriptions now limit some of the options.
Phylo.io • A quick tree viewer.
antiSMASH • An excellent server (offline implementations are also available) for predicting biosynthetic gene clusters in contigs or genomes. Best for
Prism • An alternate BGC discovery tool that has some strengths for NRPS and PKS systems particularly.
MiBIG • A database of natural product biosynthetic gene clusters with experimental evidence (hooked into antiSMASH, with annotation for each BGC.)
BiGSCAPE-Corason • Tool for exploring BGC similarity; relies on antiSMASH, which can be a challenge for new BGC types.
prettyClusters • (My) sequence-independent tool for exploring gene clusters without relying on previous BGC annotations. Accepts GenBank and IMG input. Sadly still just an R package (maybe you can help...?)
EFI-GNT • A web tool for a sequence-dependent way of exploring and visualizing genomic neighborhoods (requires an SSN for a gene of interest; relies on UniProt as a database source.)
NPAtlas • Database of info for known natural products.
StreptomeDB • Streptomyces natural product database.
CAGECAT • Online implementation of clinker (for visualizing genomic neighborhoods) and cblaster (a multiblast tool for gene clusters).
OpenWetWare • A general molecular/synthetic/micro biology wiki. Lots of protocols.
IDT • As a primer manufacturer, IDT has a useful set of nucleotide tools - OligoAnalyzer (for primer QC) and PrimerQuest (for qRT-PCR primer design) are particularly helpful.
NEB • Similarly, NEB has a very useful set of molecular biology tools. Especially helpful - NEBcloner (for traditional cloning, including digests, ligation, mutagenesis, etc.), NEBuilder (for Gibson/HiFi assembly).
Primer3Plus • Online primer design tool, very customizable.
genomes and transcriptomes
bowtie • A free and well-established short read aligner. Works with tophat if you are operating in a system where splicing is relevant.
velvet • Freeware for de novo assembly of genomes.
prokka • Program for preliminary annotations for prokaryotic genomes. I'd currently recommend it over RAST, which can make non-standard GenBank files.
PGAP • The NCBI-supported prokaryotic genome annotation pipeline.
Galaxy • A solid web platform for 'omics work.
kBase • A web platform that implements a lot of genome- and systems biology-related programs. Powerful but workflow setup can be really finicky.
roary • A pipeline for pangenome identification.
Cluster 3 • A simple program with a basic GUI for doing a bunch of hierarchical clustering type analyses.
JavaTreeView • Another simple program with a GUI for viewing clustered heatmaps (including Cluster 3 output with trees).
ActinoBase • An exceedingly helpful wiki for all things actinomycete.
Practical Streptomyces Genetics (pdf) • A.K.A. "the Streptomyces Bible."
EcoliWiki • A wiki devoted to all things E. coli.
BioCyc • From genomes to metabolic models. Also has tools for mapping omics data onto models! (See also kBase, sorta.)
NPDC • Natural products-focused portal for the Scripps strain collection. BLASTable.
NotVoodoo • A classic synthetic organic chemistry lab bible.
Schlenk Line Survival Guide • Exactly what it sounds like, with many helpful figures.
SDBS • A database for MS, H/C NMR, IR, RR, and EPR for small organic compounds.
SpectraBase • Another broad database for NMR, MS, UV-vis (!), and other info for small organic compounds. Availability is sometimes spotty, but does have links to papers.
PubChem • Reference info for chemicals (exact mass, solubility, suppliers, etc.)
SciFinder • Search for papers, suppliers, syntheses, etc. for a given compound (or even for structurally similar things.) Requires login.
XCMS • Online metabolomics analyses. (An R package is also available.)
GNPS • Online molecular networking for mass spectrometry datasets! Developed by natural products people.
mzMine • Downloadable open-source program (OS-agnostic) with a GUI for mass spec data processing.
Metlin • Database with searchable MS2 datasets. (The Gen2 version is not free though.)
CFM-ID • MS2 prediction and assignments.
MASST • Search a single MS2 spectrum against public GNPS libraries.
PeptideMass • Expasy's digest prediction tool (good for exact mass prediction for MS on digested peptides).
ChemCalc • Online tools for exact masses, including molecular formula prediction from monoisotopic masses, peptide MS2 fragmentation, and isotopic distribution.
enviPat • Simulate isotopic distributions at various charges and resolutions from chemical formulas.
UWPR proteomics • Additional implementations of several useful proteomics-related calculators. Note: the amino acid reference masses are for amino acids in peptides (i.e. in amide bonds, without terminal amine or carboxyl groups.)
BMRB • A database of NMR spectra for biomolecules and metabolites.
NMRShiftDB • A database of NMR spectra for small organic compounds.
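The note under UWPR proteomics above matters in practice: because residue masses describe amino acids within amide bonds, the mass of an intact peptide is the sum of its residue masses plus one water (the terminal H and OH). A minimal Python sketch of that arithmetic, assuming unmodified peptides of the 20 standard amino acids and commonly tabulated monoisotopic values (check them against your preferred reference; the function names are illustrative):

```python
# Monoisotopic residue masses (Da) for amino acids within a peptide chain.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O (terminal H + OH)
PROTON = 1.007276   # mass of a proton, for positive-mode m/z


def peptide_mass(sequence: str) -> float:
    """Monoisotopic neutral mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def peptide_mz(sequence: str, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a positive integer charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


for z in (1, 2, 3):
    print(z, round(peptide_mz("PEPTIDE", z), 4))
```

Modifications, disulfides, and cyclic peptides all change the arithmetic, so the dedicated calculators listed above remain the better option for real assignments.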