DE | FR

Thank you for participating in the Citizen Science survey! We really appreciate your help with rating emojis and, now that the survey period is over, we are pleased to share a little bit more about what we were up to! In this first blogpost, we wanted to give a little insight into what was going on behind the scenes, and what the thought processes and decisions during the preparation and development of our study.

The goal of our project is to examine how people accommodate their emoji usage, in other words: how do we adapt what emojis we use, but also how and when we use them? Specifically, our team is interested in studying at the multilingual context of Switzerland, to be able to examine the country’s texting habits across the different languages. Therefore, we used Swiss text messages (based on the What’s Up Switzerland? Corpus, of which you can find more information here), limited to Swiss German and French data, for mainly practicality reasons.

In order to be able to analyze larger trends in emoji usage, and to be able to look at the evolution of their usage over time, it is necessary to look at every single emoji in detail and to answer a series of questions related to its use in that context. This is, of course, not feasible when working with hundreds of WhatsApp conversations, and especially in a team of 7, where not everyone understands all the languages or dialects in the conversations. So, we decided to open up the project and to invite the general population to contribute, and help us out by rating the emojis. Since the information we need related to the emojis is based on comprehension of the text messages and on general interpretation of the emoji, it was a perfect study for a “citizen science” approach, since the people rating them didn’t need to have any kind of formal training, but needed to be able to understand the messages.

A big challenge, however, was to figure out how we can make sure that what we ask the citizen scientists is, most importantly, clear, and straightforward to implement for them in the context of emoji usage. We also needed to make sure that it could translate to models and terminology that we could use in our analysis and in our interdisciplinary work. And vice-versa: the terms from our fields needed to be made understandable so that the ratings we got would accurately match what we would be interpreting out of them. Therefore, technical terms like “arousal” and “valence” were formulated into “intensity” and “positivity”, since we esteemed them easier to understand. But from this, other discussions emerged: how should they be rated? On a scale of 1-10 how positive the emoji is? Or a 5-point scale going from negative to positive, where the emojis are placed? And also, what is the opposite of intense? After discussion, we selected the term “calm”, although it is not a perfect match (if you can think of a good term, please don’t hesitate to get in touch with us and share your thoughts!).

As you might imagine, with 600 WhatsApp conversations, all of varying lengths, and all containing emojis, it would have been difficult to have them all be rated. Especially since, because emojis are so variable in their meaning and interpretation from one individual to the next, we wanted to have several people rate each emoji. Therefore, we couldn’t possibly give all emojis to be rated, that would have been a terrible idea. Instead, we decided to limit them. But how do you limit them? The first approach was to limit the WhatsApp conversations we chose. Since we’re interested in how people adapt their emoji usage, we needed the emojis rated to come from chats that were long enough that we could consider there to be some long-term adaptation. Also, we selected chats in Swiss French and Swiss German where we had demographic information about the people texting: there may be some underlying trends based on gender or age, or something else, of the participants.

Next, we brainstormed about what communicative situation would serve two purposes. One: what situation appears regularly enough in a WhatsApp chat and is limited enough that we could draw the conclusion that there is accommodation going on? Two: what situation is clear enough out of context for our citizen scientists to be able to understand the emojis and to be able to rate them? After some deliberation and maybe a lightbulb going off, we decided to work with beginnings of conversations: they are regular enough that there would be enough data to work with across one conversation, and emojis occur regularly in them. Also, since it is the beginning of a conversation, the citizen scientists should have all the context necessary to understand what is going on.

However, you may, bright reader, already have realized an issue here, either through your experience as an avid texter, or because you gave these selection criteria a little bit of thought. A great thing about texting is that we can have very quick written conversations with others that go by much faster than if we were to wait for letters, or even emails. However, texting is still not as quick and instantaneous as spoken conversations. There is still time that goes by between texts and that time does not always correlate to an ending or a beginning of a conversation. And that was a challenge for us. Sometimes new conversations are started after a small amount of time has gone by, and sometimes a single conversation can span over days with long periods of inactivity between messages. And every time that single conversation has another message, despite it being hours later, it is another response, and not the initiation of a new conversation. So, how to go about this? We decided that there was no perfect way of handling this, and decided to use the cut-off period of 2 hours: any time there had been inactivity for at least 2 hours and a new message was sent in, and there was an emoji in the first 7 messages after that, we considered it a sample. The next time there was a 2 hour cut, then it was the next sample. And so on!

Once we had finally selected how to go about it, and then selected our chats (we won’t go into detail just now), we set up the survey, and bam! A fantastic 452 users rated a total of 643 sample screenshots, with each screenshot being rated 6 times! We are so so thankful for each of them that helped us out and, we are very excited to share what we discovered thanks to them! Keep your eyes on this page to find out more about what emojis are used in which way most frequently, and more!