One of the main goals of proteomics is to answer biological and molecular questions about a set of identified proteins. To do so, one has to extract and collect the existing biological data for every protein from public repositories and then analyze and organize the collected data. Given the complexity of this task and the huge amount of data available, gathering this information by hand is impractical, which makes automatic methods of data collection necessary.
Within a proteomics context, we have developed PIKE (Protein Information and Knowledge Extractor), which solves this problem by automatically accessing several public information systems and databases across the Internet. The PIKE bioinformatics tool starts with a set of identified proteins, listed as accession codes from the most common protein databases, and retrieves all pertinent, up-to-date information from the most relevant databases.
Once the search is complete, PIKE summarizes the information for every single protein in several file formats that allow the information to be shared and exchanged with other software tools. In our opinion, PIKE represents a major step forward in information gathering and drastically reduces the manual database validation required for large proteomic studies.
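The workflow described above (start from a list of accession codes, query several sources for each protein, then export one summary per protein) can be illustrated with a minimal Python sketch. This is not PIKE's implementation: the function names `collect_annotations` and `to_tsv`, the accession codes, and the two in-memory stand-in sources are all hypothetical, and a real tool would replace the stand-ins with clients for actual databases.

```python
import csv
import io
from typing import Callable, Dict, Iterable

# Hypothetical record type: annotation field name -> value.
Record = Dict[str, str]
# A "source" is any callable that maps an accession code to a record
# and raises LookupError when it has no entry for that code.
Source = Callable[[str], Record]


def collect_annotations(accessions: Iterable[str],
                        sources: Dict[str, Source]) -> Dict[str, Record]:
    """Query every source for every accession and merge the fields."""
    results: Dict[str, Record] = {}
    for acc in accessions:
        merged: Record = {"accession": acc}
        for source_name, fetch in sources.items():
            try:
                fields = fetch(acc)
            except LookupError:
                fields = {}  # this source has nothing for the protein
            for key, value in fields.items():
                # Prefix each field with its source to avoid name clashes.
                merged[f"{source_name}.{key}"] = value
        results[acc] = merged
    return results


def to_tsv(results: Dict[str, Record]) -> str:
    """Serialize the merged records as a tab-separated table."""
    extra = sorted({k for rec in results.values() for k in rec
                    if k != "accession"})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["accession"] + extra,
                            delimiter="\t", restval="", lineterminator="\n")
    writer.writeheader()
    for rec in results.values():
        writer.writerow(rec)
    return buffer.getvalue()


# Made-up stand-ins for remote databases (assumptions, not real data).
def fake_source_a(acc: str) -> Record:
    table = {"ACC001": {"name": "protein one"},
             "ACC002": {"name": "protein two"}}
    return table[acc]  # KeyError is a LookupError


def fake_source_b(acc: str) -> Record:
    table = {"ACC001": {"function": "example function"}}
    return table[acc]


if __name__ == "__main__":
    merged = collect_annotations(["ACC001", "ACC002"],
                                 {"a": fake_source_a, "b": fake_source_b})
    print(to_tsv(merged), end="")
```

Treating each database as an interchangeable callable keeps the collection loop independent of any particular repository, and a tab-separated export is only one of many exchange formats such a tool could emit.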