MA Research: Data

3/6/2015

In my last 'MA Research' post, I explained that my research is based on a corpus (basically a database) of text written by adult learners. Today's post is about where I got that text from, the ethical implications of that, and the decisions this led me to make about how to use the text.

Existing studies of adults learning music tend to use techniques such as interviews or surveys to find out what these adults think about their experiences. Stephanie Pitts's fascinating recent book, Chances and Choices, which looks at the impact of music education on lifelong musical involvement, is based on research which asked participants to write a musical 'life story' (an autobiography of their musical experiences), and this includes the experiences of adult learners. These studies give us valuable and detailed insight into the thoughts of adult learners - but only a small group of them, who have chosen to take part in a research project.

To try to access a bigger group of adult learners, I turned to one of the biggest sources of text around - the internet. People write online - in forums, blogs, discussion groups, etc. - about learning music as an adult. They compare their experiences, ask each other questions, discuss their problems and successes. The internet gives us access to a huge amount of text, and the corpus approach is perfectly suited to analysing it. The text can be downloaded from web pages, turned into plain text, 'tidied up' (the time-consuming bit, removing extraneous text such as sidebars or forum headings), then fed into the corpus analysis software, ready to explore. My data consists of a 500,000-word corpus of such text.
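For anyone curious what the 'turn it into plain text and tidy it up' step might look like, here is a minimal sketch in Python. It is not the tool I actually used; the list of tags to skip and the names PostTextExtractor and html_to_text are made up for illustration, and real forum pages usually need site-specific rules on top of this.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as page furniture rather than post text.
# This list is an assumption for illustration; each site needs its own rules.
SKIP_TAGS = {"script", "style", "nav", "aside", "header", "footer"}


class PostTextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # how many skipped tags we are currently inside
        self.chunks = []      # pieces of text we decided to keep

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped region.
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the kept text of an HTML page as one plain-text string."""
    parser = PostTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


sample = (
    "<html><body><aside>Forum links</aside>"
    "<p>I started piano at 40.</p>"
    "<script>var x=1;</script></body></html>"
)
print(html_to_text(sample))
```

In this toy page the sidebar and the script are dropped and only the sentence inside the paragraph survives. The time-consuming part in practice is working out which bits of each site count as 'extraneous'.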

Perfect? Not completely. Just as existing studies are only analysing responses from those who've chosen to take part in their research, this approach only analyses texts from people who post online. So it's not exhaustive, and maybe there is potential to combine the two approaches in future (one of the aims of my project is to see whether my results complement or differ from existing studies which have used different methods).

The other, bigger, issue is an ethical one. In 'traditional' research, participants are normally informed about what they're doing, give consent, and are aware that what they're writing is being used for research purposes. Internet research is still a fairly new field, and the ethical guidelines there aren't quite so clear-cut. On the one hand, there's the position that participants should be informed and give consent in the traditional way. On the other, there are two arguments. Firstly, that (unless password-protected) this information is already in the public domain, so is available to 'use', much like analysing an article or a letter in a newspaper - but some disagree, saying that people 'feel' that internet communities are private even if they technically aren't, and this should be taken into account. Secondly, and stemming from this idea of 'community', is the idea that announcing you're doing some research on some online text can disrupt that community. People may no longer feel 'safe' to post whatever they've been posting before, or may feel that they have to edit their text in some way because of the 'presence' of a researcher (as they might do in a traditional interview/survey), and so stop using the online 'spaces' in the same way as they did before, whether to express their thoughts and feelings around a subject or as a support system.

My decision-making on what approach to take was informed by reading about what social media users think about online research (for example, this report from NatCen Social Research). The overwhelming answer from this research and other online research guidelines is... it depends (on the type of research, the type of website or social media, the topic of the research). But the main guidance is to make sure you've considered the issues and come up with an approach that takes these into account.

Corpus linguistics helps us out again here (and thank you to researchers at the ESRC Centre for Corpus Approaches to Social Sciences for advice on this, via my supervisor). Because I'm looking at patterns in the text, for example the terms which are most frequently associated with the word teacher, rather than examining individual participants' responses, I'm able to anonymise the data. I don't mention the sites I've downloaded the data from; I don't include any real names, user names, or identifying details in my analysis, and I'm being particularly careful about the traceability of any quotes. It's not a perfect solution to all the issues, but in research - much like in music - perfection is elusive, perhaps impossible, and not actually necessary. I'm taking an approach that I've thought through and feel comfortable enough with to use (and I've got ethical approval from my department, which is always reassuring!).
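To give a feel for the kind of pattern mentioned above (the terms most often found near the word teacher), here is a rough sketch in Python. It simply counts the words that appear within a few words of a chosen term. The function name collocates and the window size are my own choices for illustration; proper corpus software usually ranks collocates with statistical measures rather than raw counts, so treat this as a simplification.

```python
import re
from collections import Counter


def collocates(text, node, window=4):
    """Count words appearing within `window` words either side of `node`."""
    # Very simple tokenisation: lower-case runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            start = max(0, i - window)
            # Words before and after the node, excluding the node itself.
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


# Invented example sentences, not taken from my corpus.
example = "My teacher is patient. A patient teacher helps."
print(collocates(example, "teacher", window=2).most_common(3))
```

Notice that the result is just a tally of words. Nothing in it points back to who wrote which sentence, which is what makes this kind of analysis easier to anonymise.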
