Computing and Electronic Engineering
This research proposes an application of Genetic Programming (GP) that identifies file contents by analysing only the raw binary streams, without the need for any metadata. The proposed methodology represents the data in a new form, referred to as GP-Fileprint. The new representation is used to analyse files and classify their contents. This can be particularly useful for applications such as email spam filtering, virus detection, forensic analysis, and network security. Experimental results show that GP compares very well with established classification algorithms in terms of the accuracy achieved.


