Features
E-SNPs&GO is a fast and accurate method that, given an input protein sequence and a single residue variation, predicts whether the variation is disease-related.
The prediction relies on an input encoding completely based on protein language models and embedding techniques, specifically devised to encode protein sequences and GO functional annotations.
Input
The user can submit one or more single residue variations occurring in one or more protein sequences.
Each unique sequence must be provided in a FASTA-like format.
The header of each sequence must contain a protein identifier followed, after a space, by the list of variations separated by commas, in the format [wildtype residue][position][variant residue].
The input is validated before the job is submitted. The server accepts at most 1000 unique sequences in a single job. After submission, the user is redirected to a new page indicating that the job has been accepted.
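The input rules above can be checked locally before submitting. The following Python sketch is not the server's own validation code. It assumes that headers start with ">" as in standard FASTA, that positions are 1-based, that only the 20 standard residues are allowed, and that the wildtype residue must match the sequence at the given position. The names parse_input and validate are hypothetical.

```python
import re

# Assumption: only the 20 standard amino acids are accepted.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
# [wildtype residue][position][variant residue], as described above.
VARIATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")
MAX_SEQUENCES = 1000  # per-job limit stated above


def parse_input(text):
    """Split FASTA-like text into (identifier, variations, sequence) tuples."""
    records = []
    # Assumption: each header line starts with ">" as in standard FASTA.
    for block in text.split(">")[1:]:
        lines = block.strip().splitlines()
        if not lines:
            continue
        header = lines[0].strip()
        sequence = "".join(line.strip() for line in lines[1:]).upper()
        # Identifier, then a space, then the comma-separated variations.
        identifier, _, variation_field = header.partition(" ")
        variations = [v.strip() for v in variation_field.split(",") if v.strip()]
        records.append((identifier, variations, sequence))
    return records


def validate(records):
    """Return a list of human-readable problems (empty if none were found)."""
    errors = []
    if len(records) > MAX_SEQUENCES:
        errors.append(f"too many sequences: {len(records)} > {MAX_SEQUENCES}")
    for identifier, variations, sequence in records:
        if not variations:
            errors.append(f"{identifier}: no variations listed")
        for variation in variations:
            match = VARIATION_RE.match(variation)
            if match is None:
                errors.append(f"{identifier}: malformed variation {variation!r}")
                continue
            wildtype, variant = match.group(1), match.group(3)
            position = int(match.group(2))  # assumption: 1-based position
            if wildtype not in AMINO_ACIDS or variant not in AMINO_ACIDS:
                errors.append(f"{identifier}: unknown residue in {variation}")
            elif not 1 <= position <= len(sequence):
                errors.append(f"{identifier}: position out of range in {variation}")
            elif sequence[position - 1] != wildtype:
                errors.append(f"{identifier}: wildtype mismatch in {variation}")
    return errors


example = ">P1 A2G,K3R\nMAKT\n>P2 Q1L\nMSS\n"
records = parse_input(example)
problems = validate(records)
for problem in problems:
    print(problem)
```

In this made-up example, the first record passes every check, while the second is reported because its sequence does not have Q at position 1.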
Output
Results are displayed in a tabular format, with a section for each submitted unique sequence.
In each section, one line per variation is shown, with the following columns:
- Variant: Identifier of the variation
- Pathogenicity class: Predicted class, either Pathogenic or Benign
- Pathogenicity probability: Calibrated probability that the variation is Pathogenic
- Reliability Index: Integer from 0 to 10, where 0 and 10 correspond to the minimum and maximum confidence in the prediction, respectively
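This page does not state how the class and the Reliability Index are derived from the probability. The Python sketch below shows one common convention, given here purely as an assumption: a 0.5 decision threshold, and an index that scales the distance of the probability from 0.5 onto the 0 to 10 range. The function names are hypothetical, and the server may compute these values differently.

```python
def predicted_class(p_pathogenic, threshold=0.5):
    """Assumed rule: Pathogenic when the probability reaches the threshold."""
    return "Pathogenic" if p_pathogenic >= threshold else "Benign"


def reliability_index(p_pathogenic):
    """Assumed rule: scale the distance from 0.5 to an integer in [0, 10]."""
    if not 0.0 <= p_pathogenic <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return round(20 * abs(p_pathogenic - 0.5))


for p in (0.0, 0.5, 0.9, 1.0):
    print(p, predicted_class(p), reliability_index(p))
```

Under this assumed rule, a probability of exactly 0.5 gives the minimum index (0), and probabilities of 0 or 1 give the maximum (10).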