Filters protein identification engine results by different criteria.
| potential predecessor tools | IDFilter | potential successor tools |
| MascotAdapter (or other ID engines) | | PeptideIndexer |
| IDFileConverter | | ProteinInference |
| FalseDiscoveryRate | | IDMapper |
| ConsensusID | | |
This tool is used to filter the identifications found by a peptide/protein identification tool like Mascot. Different filters can be applied; each one is described in the parameter listing below.
To enable any of the filters, just change their default value. All active filters will be applied in order.
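The activation rule above can be sketched in Python. This only illustrates the described behaviour and is not IDFilter's implementation: the `Filter` class, the `apply_filters` helper, the example hits, and the assumption that a higher score is better are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Filter:
    """Hypothetical stand-in for one IDFilter criterion."""
    name: str
    default: float
    value: float
    keep: Callable[[dict, float], bool]

    @property
    def active(self) -> bool:
        # A filter counts as enabled once its value differs from the default.
        return self.value != self.default


def apply_filters(hits: List[dict], filters: List[Filter]) -> List[dict]:
    """Apply every active filter in list order; inactive filters are skipped."""
    for f in filters:
        if not f.active:
            continue
        hits = [h for h in hits if f.keep(h, f.value)]
    return hits


# Made-up peptide hits; assumes a score type where higher is better.
hits = [
    {"sequence": "PEPTIDEK", "score": 42.0},
    {"sequence": "ACDK", "score": 55.0},
    {"sequence": "LONGERPEPTIDER", "score": 12.0},
]

filters = [
    Filter("score:pep", 0.0, 20.0, lambda h, v: h["score"] >= v),
    Filter("min_length", 0, 6, lambda h, v: len(h["sequence"]) >= v),
    # Left at its default value, so this filter is not applied.
    Filter("best:n_peptide_hits", 0, 0, lambda h, v: True),
]

kept = apply_filters(hits, filters)
print([h["sequence"] for h in kept])
```

Only the first two filters run here, because the third one is still at its default value.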
The command line parameters of this tool are:
IDFilter -- Filters results from protein or peptide identification engines based on different criteria.
Version: 2.0.0 May 16 2015, 09:22:21, Revision: GIT-NOTFOUND
Usage:
IDFilter <options>
Options (mandatory options marked with '*'):
-in <file>* Input file (valid formats: 'idXML')
-out <file>* Output file (valid formats: 'idXML')
Filtering by precursor RT or m/z:
-precursor:rt [min]:[max] Retention time range to extract. (default: ':')
-precursor:mz [min]:[max] Mass-to-charge range to extract. (default: ':')
-precursor:allow_missing When filtering by precursor RT or m/z, keep peptide IDs with missing
precursor information ('RT'/'MZ' meta values)?
Filtering by peptide/protein score. To enable any of the filters below, just change their default value. All
active filters will be applied in order:
-score:pep <score> The score which should be reached by a peptide hit to be kept. The score
is dependent on the most recent(!) preprocessing - it could be Mascot
scores (if a MascotAdapter was applied before), or an FDR (if FalseDiscoveryRate
was applied before), etc. (default: '0')
-score:prot <score> The score which should be reached by a protein hit to be kept. Use in
combination with 'delete_unreferenced_peptide_hits' to remove affected
peptides. (default: '0')
Filtering by significance threshold:
-thresh:pep <fraction> Keep a peptide hit only if its score is above this fraction of the peptide
significance threshold. (default: '0')
-thresh:prot <fraction> Keep a protein hit only if its score is above this fraction of the protein
significance threshold. Use in combination with 'delete_unreferenced_peptide_hits'
to remove affected peptides. (default: '0')
Filtering by whitelisting (only instances also present in a whitelist file can pass):
-whitelist:proteins <file> Filename of a FASTA file containing protein sequences.
All peptides that are not a substring of a sequence in this file are removed.
All proteins whose accession is not present in this file are removed.
(valid formats: 'fasta')
-whitelist:by_seq_only Match peptides with FASTA file by sequence instead of accession and
disable protein filtering.
Filtering by blacklisting (only instances not present in a blacklist file can pass):
-blacklist:peptides <file> Peptides having the same sequence and modification assignment as any
peptide in this file will be filtered out. Use with the 'blacklist:ignore_modifications'
flag to only compare by sequence.
(valid formats: 'idXML')
-blacklist:ignore_modifications Compare blacklisted peptides by sequence only.
Filtering by RT predicted by 'RTPredict':
-rt:p_value <float> Retention time filtering by the p-value predicted by RTPredict.
(default: '0' min: '0' max: '1')
-rt:p_value_1st_dim <float> Retention time filtering by the p-value predicted by RTPredict for first
dimension. (default: '0' min: '0' max: '1')
Filtering by mz:
-mz:error <float> Filtering by deviation to theoretical mass (disabled for negative values).
(default: '-1')
-mz:unit <String> Absolute or relative error. (default: 'ppm' valid: 'Da', 'ppm')
Filtering best hits per spectrum (for peptides) or from proteins:
-best:n_peptide_hits <integer> Keep only the 'n' highest scoring peptide hits per spectrum (for n>0).
(default: '0' min: '0')
-best:n_protein_hits <integer> Keep only the 'n' highest scoring protein hits (for n>0). (default:
'0' min: '0')
-best:strict Keep only the highest scoring peptide hit.
Similar to n_peptide_hits=1, but if there are two or more highest scoring
hits, none are kept.
-min_length <integer> Keep only peptide hits with a length greater than or equal to this value. Value
0 will have no filter effect. (default: '0' min: '0')
-max_length <integer> Keep only peptide hits with a length less than or equal to this value. Value 0
will have no filter effect. Value is overridden by min_length, i.e. if
max_length < min_length, max_length will be ignored. (default: '0' min:
'0')
-min_charge <integer> Keep only peptide hits for tandem spectra with charge greater than or equal
to this value. (default: '1' min: '1')
-var_mods Keep only peptide hits with variable modifications (fixed modifications
from SearchParameters will be ignored).
-unique If a peptide hit occurs more than once per PSM, only one instance is
kept.
-unique_per_protein Only peptides matching exactly one protein are kept. Remember that isoforms
count as different proteins!
-keep_unreferenced_protein_hits Proteins not referenced by a peptide are retained in the ids.
-remove_decoys Remove proteins according to the information in the user parameters.
Usually used in combination with 'delete_unreferenced_peptide_hits'.
-delete_unreferenced_peptide_hits Peptides not referenced by any protein are deleted in the ids. Usually
used in combination with 'score:prot' or 'thresh:prot'.
Common TOPP options:
-ini <file> Use the given TOPP INI file
-threads <n> Sets the number of threads allowed to be used by the TOPP tool (default: '1')
-write_ini <file> Writes the default configuration file
--help Shows options
--helphelp Shows all options (including advanced)
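As an illustration of how the options above combine into one call, the following Python sketch assembles an IDFilter command line. The helper `build_idfilter_command`, the file names, and the chosen parameter values are hypothetical; only the option names are taken from the listing above. The sketch only builds and prints the command and does not run the tool.

```python
import shlex


def build_idfilter_command(in_file, out_file, options=None, flags=()):
    """Return an argument list for IDFilter (hypothetical helper).

    options: mapping of option name (without leading '-') to its value.
    flags:   option names that take no value.
    """
    cmd = ["IDFilter", "-in", in_file, "-out", out_file]
    for name, value in (options or {}).items():
        cmd += [f"-{name}", str(value)]
    for flag in flags:
        cmd.append(f"-{flag}")
    return cmd


# Illustrative values only; suitable settings depend on the data and on
# which score type the preceding tool wrote into the idXML file.
cmd = build_idfilter_command(
    "search_results.idXML",
    "filtered.idXML",
    options={
        "best:n_peptide_hits": 1,
        "mz:error": 10,
        "mz:unit": "ppm",
        "min_length": 6,
    },
    flags=["remove_decoys", "delete_unreferenced_peptide_hits"],
)

print(shlex.join(cmd))
```

On a machine with OpenMS installed, the list could be passed to `subprocess.run(cmd, check=True)`. Passing it as a list avoids any shell quoting issues with the file names.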
INI file documentation of this tool:
| OpenMS / TOPP release 2.0.0 | Documentation generated on Sat May 16 2015 16:13:42 using doxygen 1.8.9.1 |