There are two more controls that determine how the recognizer will be constructed:
The Maximum Complexity control limits the size of the recognizer to the specified number of "states". If the training data is highly varied with vocalizations consisting of many syllable types, more complexity (and more training data) may be required to accurately model the vocalization. For readers with more experience in pattern recognition techniques, Song Scope makes use of Hidden Markov Models, and this control limits the number of states used to generate a model for the vocalization.
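To give a rough sense of why more states tend to call for more training data, the sketch below counts the table entries in a generic Hidden Markov Model. It is an illustration only: it assumes a fully connected model with one diagonal-covariance Gaussian per state, which may not match the model structure Song Scope actually uses, and `hmm_parameter_count` is a hypothetical helper name.

```python
def hmm_parameter_count(n_states, n_dims):
    # Assumption: fully connected HMM, one diagonal-covariance Gaussian per state.
    # This counts raw table entries, not free parameters.
    initial = n_states                 # initial state probabilities
    transitions = n_states * n_states  # state-to-state transition probabilities
    emissions = n_states * 2 * n_dims  # a mean and a variance per dimension per state
    return initial + transitions + emissions

# Compare a few state limits for a 6-dimensional feature vector.
for n_states in (8, 16, 32):
    print(n_states, hmm_parameter_count(n_states, n_dims=6))
```

Under these assumptions the transition table grows with the square of the number of states, so doubling the state limit more than doubles what has to be estimated from the training data.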
The Maximum Resolution control limits the size of spectral "feature vectors" as described in Dimension Reduction. Many bird vocalizations are "narrow band", meaning they have tight spectral components representing "whistle-like" sounds. These vocalizations are not particularly complex, and a feature vector of only 6 or so dimensions often provides sufficient spectral resolution. On the other hand, vocalizations rich in spectral complexity may require more dimensions to represent them accurately. You should also be aware that low quality (e.g. open microphone) recordings may require a lower resolution to match the "fuzzier" spectral resolution, while a higher resolution may be more suitable for higher quality (e.g. parabolic or otherwise very high signal-to-noise ratio) recordings.
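The effect of feature vector size on a narrow band sound can be sketched as follows. This example uses simple band averaging as a stand-in; the method Song Scope actually uses is the one described in Dimension Reduction and may differ. The function name `reduce_spectrum` and the 24-bin test frame are assumptions made for illustration.

```python
def reduce_spectrum(power_spectrum, n_dims):
    # Stand-in technique: split the spectrum into n_dims contiguous bands
    # and average the values in each band.
    n_bins = len(power_spectrum)
    if not 1 <= n_dims <= n_bins:
        raise ValueError("n_dims must be between 1 and the number of bins")
    features = []
    for i in range(n_dims):
        start = i * n_bins // n_dims
        end = (i + 1) * n_bins // n_dims
        band = power_spectrum[start:end]
        features.append(sum(band) / len(band))
    return features

# A hypothetical whistle-like frame: energy concentrated in two adjacent bins.
frame = [0.0] * 24
frame[9] = frame[10] = 1.0

low_res = reduce_spectrum(frame, 6)    # 6 bands of 4 bins each
high_res = reduce_spectrum(frame, 12)  # 12 bands of 2 bins each
```

In this sketch the 6-dimensional vector still places the whistle in a single band, which is in line with the observation that a small feature vector is often enough for narrow band vocalizations.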
Once you are satisfied with the selection of training data and parameter settings, press the "Generate Recognizer" button. Song Scope will then build several candidate models, trying different numbers of syllable types from simple to more complex. For each candidate, and for each annotation id, Song Scope builds the model with the vocalizations marked with that id excluded, and then tests the model against the excluded vocalizations. This process can take quite a bit of time if you have a lot of training data and need to build very complex models. Even on fairly fast machines, it may take 30 minutes to an hour to build some models. Fortunately, this is not something you will need to do very often.
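The hold-out procedure described above resembles leave-one-group-out cross validation, with the annotation id acting as the group. The following is a minimal sketch of that idea, not Song Scope's implementation: `train_model` and `score_fit` are hypothetical stand-ins (a trivial mean-based "model" over single numbers), and the data layout is assumed for illustration.

```python
from collections import defaultdict
from statistics import mean

def train_model(vocalizations, n_syllable_types):
    # Hypothetical stand-in for model building: the "model" is just the mean.
    # n_syllable_types is accepted but unused by this toy model.
    return mean(vocalizations)

def score_fit(model, vocalization):
    # Hypothetical fit score: values closer to 0 indicate a better match.
    return -abs(vocalization - model)

def cross_train(training_data, n_syllable_types):
    """training_data: list of (annotation_id, vocalization) pairs."""
    by_id = defaultdict(list)
    for annotation_id, voc in training_data:
        by_id[annotation_id].append(voc)

    scores = []
    for held_out_id, held_out in by_id.items():
        # Build the model from every vocalization NOT marked with this id...
        rest = [v for aid, vs in by_id.items() if aid != held_out_id for v in vs]
        model = train_model(rest, n_syllable_types)
        # ...then test it against the excluded vocalizations.
        scores.extend(score_fit(model, v) for v in held_out)
    return scores

# Made-up data: three annotation ids, one number per "vocalization".
data = [("a", 1.0), ("a", 1.2), ("b", 0.9), ("c", 1.1)]
scores = cross_train(data, n_syllable_types=2)
```

Because a separate model is built for every annotation id and every candidate number of syllable types, the total work grows with both, which is consistent with the long build times mentioned above.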
When recognizer generation completes, the "Recognizer Information" section of the recognizer control panel displays information about the generated recognizer, as described below. The most important value is the cross training result, which measures how well the model is expected to perform. Some of the other results relate to details of the algorithms and are of little consequence to most users.
Cross Training:
Cross training shows the average and standard deviation of the "fit" of the vocalizations whose annotation ids were excluded while the model was built. A low score may indicate that the generated model does not accurately represent the vocalization. If this is the case, a more complex model may be required (by adjusting the maximum complexity setting), more training data may be needed, or the vocalization may need to be split into subclasses.
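As a small illustration of how such a summary could be computed, the sketch below takes a list of held-out fit scores and reports their average and standard deviation. The scores are made up, `summarize_fit` is a hypothetical helper, and the use of the sample standard deviation is an assumption; Song Scope may compute the figure differently.

```python
from statistics import mean, stdev

def summarize_fit(scores):
    # Returns (average, sample standard deviation) of the held-out fit scores.
    return mean(scores), stdev(scores)

# Hypothetical fit scores for vocalizations that were excluded during training.
held_out_scores = [72.0, 68.5, 75.0, 40.0, 70.5]
avg, sd = summarize_fit(held_out_scores)
print(f"Cross Training: {avg:.2f} +/- {sd:.2f}")
```

In this made-up data a single poorly fitting vocalization lowers the average and widens the standard deviation, which is the kind of pattern that might suggest splitting the vocalization into subclasses.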
Total Training:
Total training shows the average and standard deviation of the "fit" of all the training data against the final model, which is built from all of the training data. It will typically show a slightly higher score and a slightly smaller standard deviation than the cross training result described above.
Model States:
Indicates the size of the model as a number of states.
Feature Vector:
Indicates the number of dimensions in the spectral feature vectors, the same as the Maximum Resolution control setting.
Syllable Types:
Indicates the number of different syllable classes that were used to construct the final model. Song Scope tries different values up to one quarter of the maximum model complexity and chooses the value that scored highest during cross training.
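The selection rule described above can be sketched as a search over candidate counts. This is an illustration under stated assumptions: the candidates are taken to start at 1, the upper limit is rounded down with integer division, and `cross_training_score` is a hypothetical callable that returns the cross training score for a given number of syllable types.

```python
def choose_syllable_types(max_complexity, cross_training_score):
    # Assumption: candidates run from 1 up to a quarter of the maximum
    # complexity, rounded down, with at least one candidate.
    upper = max(1, max_complexity // 4)
    candidates = range(1, upper + 1)
    # Keep the candidate whose cross training score is highest.
    return max(candidates, key=cross_training_score)

# Hypothetical scores, for illustration only.
example_scores = {1: 61.0, 2: 70.5, 3: 74.2, 4: 73.8}
best = choose_syllable_types(16, lambda n: example_scores[n])
```

With a maximum complexity of 16, the sketch considers 1 through 4 syllable types and keeps the one with the highest score in the example table.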
State Usage:
Indicates the average and standard deviation of the number of different states traversed by each vocalization.
Mean Symbols:
Indicates the average and standard deviation of the number of symbols contained within each vocalization.
Mean Duration:
Indicates the average and standard deviation of the duration of each vocalization.
From the "File" menu, select "Save..." to save the Song Scope Recognizer to a .ssr file. The filename will be used as the name of the recognizer.