In micro-task crowdsourcing, each task is usually performed by
several workers. This redundancy allows researchers to leverage
measures of agreement among workers on the same task to estimate
the reliability of the collected data and to better understand
the answering behavior of participants.
While many measures of inter-annotator agreement have been
proposed, they are known to suffer from a number of problems
and anomalies. In this paper, we identify the main limitations
of existing agreement measures in the crowdsourcing context, by
means of both toy examples and real-world crowdsourcing data,
and we propose a novel agreement measure based on probabilistic
parameter estimation that overcomes these limitations. We
validate the new agreement measure and show its flexibility
compared to existing measures.