Workflows for proteomics data from medicine and life sciences

The research area "Medical Bioinformatics" provides numerous bioinformatics tools and workflows for common tasks in medicine and the life sciences, such as protein identification and quantification and statistical data analysis.

In addition, as part of our services we create customized bioinformatics solutions, up to and including machine learning, and support the interpretation of results for scientific publications (Turewicz et al. 2017). Please feel free to contact us about this at any time. We place great emphasis on usability when developing our tools and workflows. This page gives an overview of selected workflows.

Quality control on raw data and identifications

MaCProQC

In this workflow, raw mass spectrometric data are evaluated and various metrics are calculated to assess data quality. Some metrics are based on the raw data alone, such as the number of MS1 and MS2 spectra recorded, total ion currents (TICs), and precursor charges. In addition, peptides are identified using various search engines, and further metrics are derived from these identifications, for example the number of identified peptides and proteins or the proportion of missed cleavages.

The workflow includes various graphical representations of these metrics. A principal component analysis is calculated from the metrics and can reveal systematic biases in the data. Outlier samples may become visible, which then need to be re-measured or removed. After quality control, further analysis of the data can continue.
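As an illustration of one identification-based metric, the proportion of missed cleavages could be computed along the following lines. This is a minimal sketch and not the workflow's actual implementation: it assumes tryptic digestion (cleavage after K or R, but not before P), and the function names are hypothetical.

```python
def missed_cleavages(peptide: str) -> int:
    """Count internal tryptic cleavage sites in a peptide sequence.

    Assumption: trypsin cleaves after K or R unless the next residue is P.
    The last residue is skipped because it is the expected cleavage site.
    """
    count = 0
    for i in range(len(peptide) - 1):
        if peptide[i] in "KR" and peptide[i + 1] != "P":
            count += 1
    return count


def missed_cleavage_proportion(peptides) -> float:
    """Fraction of peptides that contain at least one missed cleavage."""
    peptides = list(peptides)
    if not peptides:
        return 0.0
    with_missed = sum(1 for p in peptides if missed_cleavages(p) > 0)
    return with_missed / len(peptides)
```

A high proportion across a sample can point to incomplete digestion, which is why the metric is useful for quality control.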

Workflow on KNIME Hub

Identification

PIA

PIA is a toolbox for protein inference and identification analysis of mass spectrometry data. With PIA you can compare and combine the results of popular mass spectrometry search engines. The main focus of PIA is on its integrated inference algorithms, i.e., the inference of proteins from identified peptides. It also allows viewing peptide-spectrum matches (PSMs), calculating FDR values across different search engine results, and visualizing the relations between PSMs, peptides, and proteins.
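To make the idea of protein inference concrete, the sketch below uses a generic greedy parsimony approach: it repeatedly reports the protein that explains the most peptides not yet explained. This is an illustrative assumption and not PIA's own algorithm; the function name and the input format (a mapping from peptide to candidate protein accessions) are hypothetical.

```python
def greedy_parsimony(peptide_to_proteins):
    """Return a small list of proteins that explains all identified peptides.

    peptide_to_proteins: dict mapping a peptide sequence to the list of
    protein accessions it could originate from.
    """
    # Invert the mapping: protein -> set of peptides it can explain.
    protein_to_peptides = {}
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides.setdefault(protein, set()).add(peptide)

    unexplained = set(peptide_to_proteins)
    reported = []
    while unexplained:
        # Pick the protein covering the most unexplained peptides;
        # sorting first makes ties resolve alphabetically.
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & unexplained),
            default=None,
        )
        if best is None:
            break
        covered = protein_to_peptides[best] & unexplained
        if not covered:
            break  # remaining peptides map to no protein
        reported.append(best)
        unexplained -= covered
    return reported
```

Shared peptides are the reason inference is needed at all: a peptide that maps to several proteins does not by itself prove that any one of them is present.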

Publications:
Workflow on KNIME Hub

Quantification

CalibraCurve

CalibraCurve is a flexible tool for the calibration of targeted mass spectrometry-based measurements. It automatically determines linear dynamic ranges and quantification limits for targeted proteomics and similar assays. The software uses a variety of metrics to assess calibration accuracy and provides intuitive visualizations.
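The general idea behind assessing a calibration curve can be sketched as follows: fit a line to known concentrations and measured responses, back-calculate each concentration from the fit, and report the percent bias. This is a simplified illustration using unweighted least squares; the tool's actual regression model and metrics are not described here, and the function names are hypothetical.

```python
def fit_line(concentrations, responses):
    """Unweighted least-squares fit: response = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(concentrations, responses)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def percent_bias(concentrations, responses, slope, intercept):
    """Percent deviation of each back-calculated concentration from the known one."""
    return [
        100.0 * ((y - intercept) / slope - x) / x
        for x, y in zip(concentrations, responses)
    ]
```

Concentration levels whose bias stays within a chosen tolerance can then be taken as candidates for the linear range; the tolerance itself is a design decision that depends on the assay.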

Publication:
Workflow on KNIME Hub

KNIME workflow for protein quantification

This workflow includes conversion of raw data to mzML files, peptide identification (using various search engines), protein inference, feature detection, and quantification. It also creates mzIdentML files that can be used for upload to PRIDE. The workflow uses PIA for protein inference and OpenMS KNIME nodes for the other steps. It is adapted to our local infrastructure and runs on an internal KNIME server. The workflow can be made available on request, but it will then require some customization.

Quality control and normalisation of quantitative data

QC_quant

This workflow is designed for quality control of quantitative proteomics data. Data are visualized by the number of valid values, boxplots, and principal component analysis. The data can be normalised via median, quantile, or LOESS normalisation, and the result can be assessed using MA plots. The workflow offers various settings, e.g., for log transformation of the data and for the colours of the graphs.

Workflow on GitHub (R scripts with tutorial)
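Median normalisation and the values behind an MA plot can be sketched as follows. The workflow itself is provided as R scripts; this Python sketch is only an illustration under stated assumptions: intensities are already log2-transformed, missing values are represented as `None`, and the function names are hypothetical.

```python
import statistics


def median_normalise(samples):
    """Shift each sample so that all samples share the same median.

    samples: dict mapping a sample name to a list of log2 intensities
    (None marks a missing value). The common target is the mean of the
    per-sample medians.
    """
    medians = {
        name: statistics.median([v for v in values if v is not None])
        for name, values in samples.items()
    }
    target = statistics.mean(list(medians.values()))
    return {
        name: [None if v is None else v - medians[name] + target for v in values]
        for name, values in samples.items()
    }


def ma_values(sample_a, sample_b):
    """Return (M, A) pairs for proteins quantified in both samples.

    M is the log2 difference, A the average log2 intensity.
    """
    points = []
    for a, b in zip(sample_a, sample_b):
        if a is None or b is None:
            continue
        points.append((a - b, (a + b) / 2))
    return points
```

After a successful normalisation, the M values of an MA plot should scatter around zero across the whole range of A.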

Statistical analysis

Differential analysis of protein data

Using this workflow, protein measurements from two groups of samples can be statistically compared using a t-test. The resulting p-values are adjusted for multiple testing and plotted together with the fold changes in a volcano plot.
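The adjustment and plotting steps can be sketched as follows. The text does not state which correction method is used, so the Benjamini-Hochberg procedure shown here is an assumption, as are the function names. The p-values are taken as given, for example from a t-test per protein.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def volcano_point(mean_group_a, mean_group_b, adjusted_p):
    """Coordinates of one protein in a volcano plot.

    x: log2 fold change of group B relative to group A (raw-scale means),
    y: -log10 of the adjusted p-value.
    """
    return math.log2(mean_group_b / mean_group_a), -math.log10(adjusted_p)
```

Proteins with a large absolute fold change and a small adjusted p-value end up in the upper left and upper right corners of the volcano plot.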