Secondary Structure Content Prediction

Predict the secondary structure content of your protein of interest from an unassigned 1H,15N-HSQC peak list. The prediction is based on a machine learning (CatBoost) model that was trained and optimized on HSQC data of 3514 carefully selected PDB structures.

1. Data Upload
  • HSQC demo spectrum
  • Record a 1H,15N-HSQC spectrum, pick the backbone amide peaks (excluding side-chain NH2 resonances), and upload the peak list as a *.csv file. Files generated with the export function in TopSpin's Peaks tab, as well as backbone peak lists of any BMRB entry, can be uploaded directly. Examples can be downloaded here.
2. Binning
  • HSQC demo spectrum with grid
  • Your spectrum is replicated and divided into three different grids. The size of each grid has been optimized for the prediction of α-helix, β-sheet and random coil, respectively. Peaks in each quadrant of the grids are counted (binned), while peaks outside the grid area (here in red) are ignored.
3. ML-based Prediction
  • Demo pie chart with results
  • Our CatBoost model predicts the amount of α-helix, β-sheet and random coil based on the binned peaks in each quadrant. Furthermore, a heatmap of SHAP values is generated, visualizing each quadrant's contribution to the final prediction.
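
The binning in step 2 can be illustrated with a minimal sketch. It is not the tool's actual implementation: the column names (`1H`, `15N`), the grid limits, and the number of bins are hypothetical values chosen for the example, since the optimized grid sizes are not given here. The sketch counts the peaks that fall into each grid cell, skips peaks outside the grid area, and flattens the counts into a feature vector of the kind a model could take as input.

```python
import csv
import io


def read_peaks(csv_text, h_col="1H", n_col="15N"):
    """Parse a peak list into (1H ppm, 15N ppm) tuples.

    The column names are assumptions made for this example.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row[h_col]), float(row[n_col])) for row in reader]


def bin_peaks(peaks, h_range, n_range, n_h_bins, n_n_bins):
    """Count peaks per grid cell; peaks outside the grid area are ignored."""
    h_min, h_max = h_range
    n_min, n_max = n_range
    counts = [[0] * n_h_bins for _ in range(n_n_bins)]
    for h, n in peaks:
        if not (h_min <= h < h_max and n_min <= n < n_max):
            continue  # outside the grid area
        col = int((h - h_min) / (h_max - h_min) * n_h_bins)
        row = int((n - n_min) / (n_max - n_min) * n_n_bins)
        # Guard against floating-point rounding at the upper edge.
        counts[min(row, n_n_bins - 1)][min(col, n_h_bins - 1)] += 1
    return counts


# Hypothetical peak list; the fourth peak lies outside the grid below.
csv_text = "1H,15N\n8.25,120.5\n7.10,110.2\n9.40,128.0\n11.50,118.0\n8.30,121.0\n"
peaks = read_peaks(csv_text)

# Illustrative grid: 4 bins along 1H (6-10 ppm), 3 bins along 15N (100-130 ppm).
counts = bin_peaks(peaks, (6.0, 10.0), (100.0, 130.0), n_h_bins=4, n_n_bins=3)

# Flatten row by row into one feature vector per grid.
features = [c for row in counts for c in row]
print(features)
```

In the workflow described above, this binning would be repeated for each of the three grids, giving one set of counts per secondary structure class.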

Drag and drop your peak list (*.csv) here to start, or click in this area to select a file manually.