Susan Wethington
Hummingbird Monitoring Network
With the grant of two licenses of Wildlife Acoustics Kaleidoscope Pro 4.1 software with acoustic cluster analysis, the Hummingbird Monitoring Network (HMN) could now analyze recordings taken in fields of hummingbird-visited flowers during southbound migration. The science objectives of the study are to determine how weather, plant phenology, and abundance of available nectar influence hummingbird migration. The community objectives of the study are to employ and engage high school students in STEM (Science, Technology, Engineering, and Mathematics) activities.
In 2013 and 2014, we recorded daytime activity of hummingbirds in seven flower patches for five weeks during southbound migration in the Chiricahua Mountains of southeastern Arizona. In 2015 and 2016, we worked with Song Scope software to build recognizers of hummingbird sounds. That effort had limited success, and we were eager to learn the Kaleidoscope software. During the spring semester of 2017, Patagonia High School students easily learned how to use Kaleidoscope and began identifying clusters with hummingbird chip notes, vocalizations, and wing trills. By the end of the semester's program, students had iteratively defined clusters and were beginning to refine the classifiers that identify hummingbird chip notes and vocalizations to species. Refinement of the classifiers will continue during the fall semester, with the goal of having complete classifiers for the three hummingbird species known to have used these flower patches. Upon completion of the hummingbird classifiers, HMN's science collaborators will complete the analyses for the study.
Building classifiers with the Kaleidoscope software was an excellent project for high school students. They became proficient at identifying hummingbird sounds and classifying clusters into different vocalization categories. Our workflow was somewhat unusual in that it was multi-threaded: two students, each using a license of Kaleidoscope, built classifiers from different recordings. We then wanted to combine the classifiers and re-run the cluster analysis to continue refining them. We were unable to figure out how to do this, so we contacted Wildlife Acoustics' technical support team and worked with Chris Warren. He quickly helped us combine the two efforts, and he answered additional questions that arose throughout the semester.
We think passive recording is an excellent field technique; we have encouraged others to use it, and to engage high school students in helping build the classifiers. We thank Wildlife Acoustics for this grant, and particularly thank Chris Warren for his timely and extremely helpful guidance as we learned how to use Kaleidoscope.
No progress has been made on this project since the last report. Building the hummingbird classifiers is part of a STEM program with Patagonia Union High School. At the end of the program last March, students had iteratively defined clusters and were beginning to refine the classifiers to identify hummingbird chip notes and vocalizations to species. Due to lack of funding for the PASEO program (Patagonia After School Employment Opportunities), it was not offered to students this fall semester. Nick Botz, the high school student who mastered Kaleidoscope, is an accomplished musician and a strong science student; he will be working with HMN from late November to January, and we expect to complete the hummingbird classifiers by the end of his employment. In 2018, we will begin integrating the Kaleidoscope results with the environmental data to identify the weather and climate factors influencing hummingbird migration.
The purpose of HMN's study is to see how weather patterns affect the migration of hummingbirds in our area. To do that, they need to be able to accurately estimate the number of hummingbirds visiting a flower patch in a set period of time. The data for the study came from seven sites in the Chiricahua Mountains: Barfoot Park, Coal Pit, El Coronado, Long Park, Onion Saddle, Saulsberry, and Turkey Creek. The hummingbirds tagged at these sites were Broad-tailed (BTLH), Black-chinned (BCHU), Rufous (RUHU), and Magnificent (MAHU). Recording took place during August and September of 2013 and 2014 with Wildlife Acoustics' Song Meter devices. It was these recordings that would become the key to proceeding with the study.
That is where Wildlife Acoustics' Kaleidoscope 4.1.0a comes in. This software allows you to build a classifier, a special cluster.kcs file that Kaleidoscope uses to process audio recordings, producing a cluster.csv spreadsheet (viewable in Excel) as its results table. The end result is a machine that can pick out each individual hummingbird sound across weeks of recordings for our review. The final step is to combine all the resulting spreadsheets into one, telling how many hummingbirds visited the flower patches on each day of the study.
Kaleidoscope uses a small set of sample recordings to create a classifier, which can then be used on new recordings to sort vocalizations into 'clusters', or groups of vocalizations with similarities. It is the job of the human user to train the classifier to discriminate between different species. Processing a set of recordings for the first time creates a cluster.csv and a cluster.kcs file. The cluster.csv is a spreadsheet containing all the metadata from the scan; this is what the human user opens and edits within Kaleidoscope. Rescanning the edited cluster.csv creates a new cluster.kcs, and that is the file the software uses to cluster new recordings with its many complex models.
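For readers who want to inspect a results table outside of Kaleidoscope, the cluster.csv can also be opened with any spreadsheet program or scripted. Below is a minimal Python sketch using pandas; the column names (IN FILE, OFFSET, DURATION, TOP1MATCH, MANUAL ID) are assumptions based on our copies of the file and may differ between Kaleidoscope versions.

```python
# Minimal sketch: summarize a Kaleidoscope cluster.csv with pandas.
# Column names below are assumptions; check your own file's header row.
import pandas as pd

df = pd.read_csv("cluster.csv")

# Each detection row records which cluster Kaleidoscope assigned it to
# (TOP1MATCH) and any label a human entered (MANUAL ID).
print(df["TOP1MATCH"].value_counts())               # detections per cluster
print(df["MANUAL ID"].value_counts())               # detections per manual label
print(df[["IN FILE", "OFFSET", "DURATION"]].head())  # where each sound sits
```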
The entire process can be hard to grasp, so Wildlife Acoustics has provided a series of tutorial videos on their website to train new users. They also conducted a free workshop at the US Fish and Wildlife Service office, where representatives answered questions and ran training simulations; this was very helpful in getting familiar with the software.
The Cornell Lab of Ornithology's Macaulay Library contains audio samples of every species in the study, so it proved extremely useful for discovering the subtle differences between vocalizations. The website uses a black-against-white style for its spectrogram plots, which makes it a little more difficult to compare and contrast with Kaleidoscope's dotted green-against-black style. Audio files from the Macaulay Library can be downloaded as MP3s, then run through the media.io engine to convert them to WAV files, allowing them to be viewed in Kaleidoscope. This was just for reference; the Macaulay downloads were not mixed into the field recordings.
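The same conversion can also be scripted instead of done through the media.io website. Here is a hedged sketch using the pydub library (a swapped-in alternative, not what we used); it requires ffmpeg to be installed, and the folder name is a hypothetical placeholder.

```python
# Sketch: batch-convert downloaded Macaulay MP3 reference clips to WAV
# so Kaleidoscope can open them. Assumes pydub and ffmpeg are installed;
# the "macaulay_clips" folder is a hypothetical placeholder.
from pathlib import Path
from pydub import AudioSegment

for mp3 in Path("macaulay_clips").glob("*.mp3"):
    clip = AudioSegment.from_mp3(str(mp3))
    clip.export(str(mp3.with_suffix(".wav")), format="wav")
```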
Here is the colorless Macaulay version of a RUHU vocalization...
...And here is the exact same vocalization when processed through Kaleidoscope.
Kaleidoscope was advantageous here for another reason: it allows the user to pick the kHz range they want to hear during playback. This meant that loud, distracting background noise (visible at the bottom of the spectrogram above) could be filtered out, quite literally giving a clearer picture of the vocalizations.
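Outside Kaleidoscope, the same kind of band-limited listening can be approximated with a band-pass filter. The sketch below assumes SciPy and the soundfile library, a recording sampled above 24 kHz, and a 3-12 kHz passband matching the chip range described later; it is an illustration, not the mechanism Kaleidoscope itself uses.

```python
# Sketch: band-pass a field recording to 3-12 kHz so low-frequency
# background noise is suppressed, roughly like Kaleidoscope's playback
# frequency range. Assumes scipy and soundfile are installed and the
# file's sample rate exceeds 24 kHz; the file name is hypothetical.
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

audio, fs = sf.read("patch_recording.wav")
sos = butter(4, [3000, 12000], btype="bandpass", fs=fs, output="sos")
filtered = sosfiltfilt(sos, audio, axis=0)
sf.write("patch_recording_filtered.wav", filtered, fs)
```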
The first step of building any classifier is to select a set of recordings and run them through Kaleidoscope's simplest action: 'scan and cluster recordings to create cluster.kcs and cluster.csv.' In the tutorial videos, the creators recommend using training recordings, but none were available, so the classifier was created from the field recordings themselves. The sample recordings were selected by site and date using data from the tagging study that accompanied the recording study: the sites with the most tagged hummingbirds of each species were chosen, in the hope of obtaining enough vocalizations to include every species in the classifier. With that, the first scan commenced.
Once the recordings were scanned, they could be examined in the Kaleidoscope viewer. It was then time to determine the correct signal parameters for the classifier. Kaleidoscope's signal parameters are a range of length (seconds) and frequency (Hz) allowed for scanning. The idea was to adjust them to fit snugly around a single chip note so that noises such as frog trills, which sound nothing like a hummingbird chip, would be left out.
The method for figuring out the signal parameters was simple: take the tallest chip note and the widest, and measure their ranges on the x (seconds) and y (Hz) axes. Here the tallest chip ran from 3 kHz to 12 kHz and the widest chip from 0 seconds to 0.05 seconds, so those values were plugged into the signal parameters. Once Kaleidoscope processed a file, it would then ignore, for example, a Black-capped Chickadee, whose calls run over a second long and therefore fall outside the x range.
To isolate the chips so the classifier would count each one individually, the inter-syllable gap was set to 0.01 seconds, the lowest allowed, so that two chips in rapid succession would still be separated.
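Those same measurements can also be pulled straight from a results table rather than read off the spectrogram by eye. A minimal sketch follows, again assuming the cluster.csv carries DURATION (seconds), Fmin, and Fmax (Hz) columns and a CHIP label in MANUAL ID; the names may differ in your version.

```python
# Sketch: derive candidate signal parameters from already-detected chips.
# Assumes cluster.csv has DURATION (s), Fmin and Fmax (Hz) columns and
# that chips were labeled CHIP in MANUAL ID; names are assumptions.
import pandas as pd

df = pd.read_csv("cluster.csv")
chips = df[df["MANUAL ID"] == "CHIP"]

print("max duration (s):", chips["DURATION"].max())  # widest chip
print("min frequency (Hz):", chips["Fmin"].min())    # bottom of tallest chip
print("max frequency (Hz):", chips["Fmax"].max())    # top of tallest chip
```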
The next step in the process is labeling: going through every detection and entering a name for it in the MANUAL ID column. This can be done two ways, with two different kinds of results. The first is clicking 'Bulk ID' and assigning a species identification to an entire cluster of detections at once. This is far quicker and produces a simple classifier with low accuracy. The second method is viewing every single detection and assigning it its own species identification. With a results table containing hundreds of thousands of detections, this is far more tedious and time-consuming, but the result is an advanced classifier with an accuracy of about 89%. The CHIP classifier was made using both methods. First, a simple classifier was made by entering either CHIP or NOTCHIP through the 'Bulk ID' tab. The .csv file was then processed through the 'rescan recordings and edited cluster.csv to create new cluster.kcs with pairwise classifiers and cluster.csv' action, separating the hummingbird sounds from everything else. Work could then commence on the advanced classifier by entering species names into the MANUAL ID column.
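For a sense of what the bulk method does to the table, here is a scripted equivalent (an illustration, not Kaleidoscope's own mechanism): every detection in a chosen cluster receives the same MANUAL ID before the rescan. The cluster name and column names are assumptions.

```python
# Sketch: a scripted equivalent of 'Bulk ID' -- label every detection in
# a whole cluster at once. The cluster name "0001" and the column names
# are assumptions; inspect your own cluster.csv first.
import pandas as pd

df = pd.read_csv("cluster.csv")

# Suppose cluster "0001" was judged to contain hummingbird chips and the
# rest was noise: label entire clusters in one pass.
df.loc[df["TOP1MATCH"] == "0001", "MANUAL ID"] = "CHIP"
df.loc[df["TOP1MATCH"] != "0001", "MANUAL ID"] = "NOTCHIP"
df.to_csv("cluster.csv", index=False)  # ready for Kaleidoscope's rescan action
```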
Out of the original four species present in the field study, the MAHU chips were so few and far between that the software left them out after the first rescan. That turned out not to be a problem: data from on-the-scene monitoring show that BTLH, BCHU, and RUHU have higher populations in our area and migrate much farther than MAHU, making them more suitable subjects for the study.
Here is a group of BTLH chips. What distinguishes them by sound is that they have less of a staccato chipping quality and more of a smooth chirp, like that of a sparrow. On the spectrogram plot, you can see that the notes are shorter (span a smaller range of kHz) and are formed by two overlapping, slanted, slightly curved lines. In most cases, the BTLH's signature wing noise was also present in the background, as you can see in the above example.
Here is a group of BCHU chips. Their sound is shorter than that of the BTLH, but still very smooth sounding. The best way to describe this would be to compare it to a note played on an instrument, which always starts with an attack: that hard consonant you get from picking the string, followed by the smooth vowel of the note itself as the string resonates. Both the BTLH and BCHU chips have a weak attack. On the spectrogram, the chips have a wider range of kHz that stretches down between 1 and 2 kHz. The energy of the sound (shown in Kaleidoscope by the degree of brightness in green) is concentrated in the lower end. Its shape consists of multiple stubby, downward-slanting lines stacked on top of each other, with the largest and heaviest sticking out a bit at the bottom, creating a cute little 'foot'.
Last but definitely not least, as this was by far the most common chip, we have the RUHU. Its chips are staccato, high-pitched, and have a strong attack. The range of kHz is about the same as the BCHU's but moved up with a higher minimum and maximum value (the top of the chip note is cut off by the viewer in the image). There is a lot of energy in the middle of the chip, with a very slight widening in its otherwise straight-up-and-down shape.
Once every detection had been manually labeled, the file was ready to be rescanned again and become an advanced classifier. After just one rescan, the classifier was still full of false positives. Creating a high-accuracy classifier is an iterative process, with each rescan weeding out a little more of the false positives. The classifier reaches its maximum accuracy once a rescan consistently produces 10% or fewer false positives. From the first basic classifier scan to the final advanced rescan, it ultimately took 11 rescans to finalize the chip classifier.
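One way to track that stopping rule between rescans is to compare the classifier's assignments against the human labels in the table. A hedged sketch follows, with the same assumed column names and the assumption that TOP1MATCH holds the classifier's CHIP/NOTCHIP verdict after a rescan.

```python
# Sketch: estimate the false-positive rate after a rescan by counting
# detections the classifier called CHIP that a human relabeled otherwise.
# Column names and label values are assumptions.
import pandas as pd

df = pd.read_csv("cluster.csv")
flagged = df[df["TOP1MATCH"] == "CHIP"]             # classifier says chip
false_pos = (flagged["MANUAL ID"] != "CHIP").sum()  # human disagrees

rate = false_pos / len(flagged)
print(f"false-positive rate: {rate:.1%}")  # rescan again if above 10%
```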
With the 2013 and 2014 data extracted, it was time for our first big milestone: getting results from the chip classifier. We collated every cluster.csv into one giant Excel file, which fit within Excel's 1,048,576-row limit only because we removed the NOTHUM detections. Using the count of CHIP detections and the date and time each recording was taken, we produced a graph of hummingbird activity per day.
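The collation itself is mechanical and can be scripted. The sketch below uses pandas; the folder layout, the SM3 file-naming convention, and the column names follow our data and are otherwise assumptions.

```python
# Sketch: combine every site's cluster.csv, drop NOTHUM rows to stay
# under Excel's row limit, and count CHIP detections per day.
from pathlib import Path
import pandas as pd

frames = [pd.read_csv(f) for f in Path("results").glob("*/cluster.csv")]
df = pd.concat(frames, ignore_index=True)
df = df[df["MANUAL ID"] != "NOTHUM"]   # discard non-hummingbird detections

# SM3 file names encode the start time, e.g. SITE_20130815_061500.wav;
# this parsing is an assumption about the naming convention.
df["date"] = df["IN FILE"].str.extract(r"_(\d{8})_")[0]
daily = df[df["MANUAL ID"] == "CHIP"].groupby("date").size()
print(daily)                           # chips detected per day
daily.to_csv("daily_chip_counts.csv")
```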
These results were still made from only half the data. It was time to move on from chip notes to vocalizations.
The chip classifier was only half the battle. To include every sound made by the hummingbirds in the study, a second classifier had to be made that would have different signal parameters in order to capture hummingbird vocalizations, which are snippets of hummingbird song with a clear beginning and end.
Here, although there are multiple sounds, they all make up one vocalization, like words in a phrase. The signal parameters were therefore adjusted to fit a much wider domain, as pictured.
That meant starting from the very beginning, using the same recordings that were selected for the chip classifier because of their relative abundance of hummingbirds.
The signal parameters were found for the vocalizations using the same method as for the chips. This time the vocalizations with the largest range of length (seconds) and frequency (Hz) were used to set the signal parameters. Predictably enough, the frequency did not need to be changed, but the length was extended to 4.2 seconds. Changing the inter-syllable gap was very important, since it allowed the vocalization and all its syllables to be grouped together instead of being pulled apart and counted separately the way the chips were. Kaleidoscope's default for this setting is 0.35 seconds, which ended up being enough to detect entire vocalizations.
Here is a BCHU vocalization about three seconds long. The floating, blocky noises are a shrill, sustained screech. The vertical bars with lines running up and down their heights make a similar sound, but shorter and over a lower range, which makes it sound richer. Then you can see the main substance of the vocalization, the chip-like sounds. They differ from chips in that they are wider (longer in seconds) and mushed together into a vocalization.
Just as the MAHU chips were so underrepresented that the software would not cluster them for the chip classifier, BTLH vocalizations did not make it into the vocalization classifier. However, a few unexpected Calliope Hummingbird detections did get clustered and incorporated into the classifier. A surprise, to be sure, but a welcome one...
As was the case with the chip classifier, one scan was not nearly enough. To remove as many false positives as possible and reach a final product, the cluster.csv had to be processed through the 'rescan recordings and edited cluster.csv to create new cluster.kcs with pairwise classifiers and cluster.csv' action 6 times.
Andrea Nieto and Gabriela Samaniego installing recorders in a flower patch; Andrea then conducts a hummingbird census to calibrate the abundance of hummingbirds heard in the patch.
In 2013 and 2014, passive recordings were made with seven Wildlife Acoustics Song Meters (SM3) in seven flower patches for five weeks each year during southbound migration in the Chiricahua Mountains of southeastern Arizona. During this time, weekly field surveys were conducted to estimate hummingbird activity and floral abundance so that abundance estimates from the recordings could be calibrated. The science objectives of the study are to determine how weather, plant phenology, and abundance of available nectar influence hummingbird migration. The community objectives are to employ and engage high school students in STEM (Science, Technology, Engineering, and Mathematics) activities.
In 2015 and 2016, we worked with Song Scope software to build recognizers of hummingbird sounds. That effort had limited success, and we were eager to learn Kaleidoscope, the replacement software for Song Scope. With the grant of two licenses of Kaleidoscope at the end of 2017, we were prepared to continue extracting hummingbird vocalizations from the recordings.
PASEO students Nick Botz and Hectar Parra, with music teacher Jason Schreiber, learning Kaleidoscope in the Patagonia High School library.
In 2017, Patagonia High School students started learning Kaleidoscope and began identifying clusters with hummingbird chip notes, vocalizations, and wing trills. By the end of that semester's program, students had begun to iteratively define clusters and refine the classifiers. Funding then lapsed, and the project was placed on hold.
In 2018 and 2019, we hired Patagonia High School sophomore Nick Botz to continue building classifiers with Kaleidoscope. Nick is a talented musician with an interest in science, the ideal student to continue this project. He became proficient at identifying hummingbird sounds and classifying clusters into different vocalization categories. During the summer of 2018, we attended a day-long workshop on Kaleidoscope offered by Wildlife Acoustics through the USFWS office in Tucson.
Nick Botz with one of the many instruments that he plays well.
This workshop helped identify ways in which we could still improve the development of the classifiers. In spring 2019, Nick was confident that he had extracted all the hummingbird vocalizations from the recordings. His final task was to write a report describing how he developed the classifiers, one that could educate the next person working on the project. His report follows this summary. Upon finishing this project, Nick enrolled in an online audio engineering course. He is using Audacity software to mix and edit tracks, and suddenly finds himself looking at waveform plots again!
Now we are collaborating with scientists in the School of Natural Resources and the Environment at the University of Arizona to integrate the weather, field, and vocalization data so we can explore how weather, plant phenology, and abundance of available nectar influence hummingbird migration. The resulting science depends on having fully extracted all hummingbird vocalizations. If a Wildlife Acoustics technician reads Nick's report, we would appreciate learning whether there is anything else we could have done to improve the extractions.