I’m happy to announce that this year (2020), the international Detection and Classification of Acoustic Scenes and Events (DCASE) challenge will host the task of automated audio captioning!
Automated audio captioning is Task 6 of the DCASE 2020 challenge, and it is based on the Clotho dataset (more info about Clotho in this announcement).
To help you jump-start your methods, we have released a set of coding tools and a baseline system on GitHub. The coding tools include code for handling the Clotho dataset (available here) and for evaluating the results of your system (available here). Of course, you can also start directly from the baseline system, whose code is available here.
You can find detailed info about the challenge set-up, the employed dataset, and the results of the baseline system on the web page of the automated audio captioning task of the DCASE 2020 challenge.
Every submission to the automated audio captioning task should be accompanied by a report, which will be submitted to the DCASE challenge.
Also, we warmly invite you to submit more exotic approaches for automated audio captioning to the DCASE workshop!