Automated methods for taxon identification and for the extraction of morphological character states are increasingly important in evolutionary biology, ecology and taxonomy: on the one hand, the number of taxonomic experts able to perform identification manually is decreasing, and on the other hand, the amount of data to be processed (e.g. available digital specimen scans and images) is increasing. Powerful automated methods would allow a higher throughput and greater repeatability in taxon identification.
The Intelligent Vision Systems group at the University of Bonn and our group recently developed a system to identify plant species from scans of herbarium specimens (Grimm et al., 2016). Its initial test dataset consisted of a selection of specimens of fern species, on which the software achieved recognition accuracies between 94 % and 100 %.
In this thesis, additional data sets containing specimens of species from other plant taxa shall be tested with the software to explore its capabilities. Recognition accuracies shall be evaluated for herbarium specimens of differing quality and for taxa of differing morphological diversity, and algorithmic extensions can be made to improve the software.
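As a rough illustration of what evaluating recognition accuracies per taxon could look like, the following minimal Python sketch computes the fraction of correctly identified specimens for each taxon. The function name, the label format and the example labels are assumptions made for illustration only; they are not taken from the software described above.

```python
from collections import defaultdict


def per_taxon_accuracy(true_labels, predicted_labels):
    """Return {taxon: fraction of its specimens identified correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    total = defaultdict(int)    # specimens seen per true taxon
    correct = defaultdict(int)  # correctly identified specimens per taxon
    for truth, prediction in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {taxon: correct[taxon] / total[taxon] for taxon in total}


# Hypothetical example data (not results from the actual software)
true_labels = ["Asplenium", "Asplenium", "Dryopteris", "Dryopteris"]
predicted_labels = ["Asplenium", "Dryopteris", "Dryopteris", "Dryopteris"]
accuracies = per_taxon_accuracy(true_labels, predicted_labels)
print(accuracies)
```

Reporting accuracy per taxon rather than as a single overall value makes it easier to see whether specimen quality or morphological diversity affects some taxa more than others.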
Depending on the progress made, an additional task can be to test whether the software is able to group specimens by the presence of certain morphological character states (e.g. types of leaf margins) instead of distinguishing between taxa. An important question here is how the computer vision feature detection algorithms used (e.g. SIFT) relate to the morphological features that botanists use when identifying taxa manually.
Example output of our software showing SIFT points in a herbarium specimen.
What we offer
- Individualized supervision and advanced training in bioinformatics methods.
- Co-authorship of a journal publication, depending on the progress made.
What we expect
- Interest in working in bioinformatics/biodiversity informatics and computer vision.
Further information
- Our paper on the initial version of the software: Image-Based Identification of Plant Species Using a Model-Free Approach and Active Learning
- Another program developed by the two groups, which identifies plants from images of their leaves: LeafNet, a method based on a convolutional neural network (CNN)
- The paper on LeafNet
- Other software developed in our group
- A review paper providing an overview of the field: Plant species identification using digital morphometrics: A review
- A Nature opinion paper describing the need for automated identification
If you are interested in working on this topic in your bachelor's or master's thesis or in a master's research module, please contact Ben Stöver. (Working on other bioinformatics topics related to our research and software is also possible.)