MICER17 Reflection 6: Georgios Tsaparlis

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Prof. Georgios Tsaparlis finished up the day with the RSC Education Award lecture on problem-solving. My takeaways from this session were to do with the long-lasting problem of … problems! Dorothy Gabel observed in 1984 (the year of my birth) that students will frequently attempt to use algorithmic approaches without understanding the underlying problem – it seems that students never change.

Students are also adept at turning problems into exercises – using familiarity to drop the cognitive load of the task at hand beneath their own working memory capacity, and in so doing becoming adept at that which was once challenging, but without understanding the problem. It reminds me of the “novice, expert, strategic” approaches to problem solving, where we all collectively attempt to reduce complexity and our cognitive load.

MICER17 Reflection 5: Keith Taber

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Keith S Taber (editor of CERP) gave a fantastic double session on research ethics, and on the importance of having a widely-known middle initial. The pre-reading for this session inspired thought, once more, around what really constitutes educational research. Keith has a number of editorials on this, with the opinion that studying a local implementation of a generally effective pedagogical technique is not really research. To be research, it should have control data – and unless the control data is from previous years, splitting students into cohorts and running a control in a way known to be disengaging is potentially unethical, unless the technique is legitimately novel. In that case, it should be studied alongside best practice rather than placebo. (The reference escapes me, but it puts me in mind of a flaw in medicinal chemistry statistics where a new intervention is significant against placebo, but not significant against existing best practice – which is itself not significant against placebo – leading to inappropriate conclusions.)

What are some of the reasons these studies happen anyway? Perhaps institutional resistance (Does it work here? Prove it before you change something properly), and perhaps personal doubt (I know it works, but will it work in my hands?). Do I, as a physical scientist, simply trust educational research findings less? Does the increased variation of human research scare me? I would suggest framing both of these issues the same way: We have to put the onus on the person resisting change, whether ourselves or our institution, to prove that the literature supporting change is flawed beyond simply saying “It might not work in our context”.

My takeaway from Keith’s talk was his walk through notable failures of ethics in the history of medicine and psychology: Although the Stanford prison experiment wasn’t on the agenda, we looked at Milgram and Tuskegee, and discussed the factors that can lead researchers into a situation that is grossly unethical when observed externally. Milgram tells us that people will follow the suggestions of authority into deeply uncomfortable places – deferring our moral judgement in the process. Do we as experimenters (or interviewers) risk accidentally expressing our authority in inappropriate ways? Or can we collectively deceive ourselves that the course of action we are on is justified by the tenets of utilitarianism, as in the extreme example of the Tuskegee study?

My table had a particularly insightful discussion around the purpose of the debrief – voluntary consent that only becomes informed at the conclusion of the experiment, lest the information affect the outcome. In the Stanford Prison Experiment, trauma was inflicted that goes beyond a simple debrief or disclaimer – and has left people deeply affected, even decades later. Is it ethical to traumatise someone if it’s all explained later as fakery? We thought probably not.

All this might seem like a far cry from educational ethics, but badly-implemented research could see students subjected to inappropriately difficult tests, potentially harming their self-efficacy and even challenging their self-belief. Poorly designed studies can also waste valuable donated time. We also risk a lack of oversight if we are the gatekeepers of our own students – departmental or faculty ethics boards are meant to provide this oversight, but it often amounts to nothing more than a rubber stamp. If we run an experiment with students who view us as a lecturer or leader, can we be sure they feel no implicit coercion? No link between participation and good grades?

We then had an extensive discussion of ethics in publication: pointing out the limitations of your findings, not mis-citing sources, and when (and when not) to reveal personally-identifying information. Keith identified a number of “cargo cult behaviours” (my own words) that were seen as making research ethical. Destroying research data and anonymising participants were two given examples, and I would add university ethics boards to this under certain circumstances – it is possible for a group of people used to assessing medical interventions to rubber-stamp an educational ethics application, but that does not prevent the possibility of straying into subtly coercive behaviour as an interviewer/experimenter. I have no oversight just because my forms are in order!

For a far more elegant summary of the talk, Dr Kristy Turner was also at the conference and sketched several of the talks; her tweet is embedded below with permission, gratefully received!

MICER17 Reflection 4: Graham Scott

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Drilling down into data collection methods more was Graham Scott, talking about interviews for data collection. Many useful dos and don’ts were shared, such as the importance of interviewing in a neutral, distraction-free environment, without a strong power imbalance between the interviewer and interviewee. Lecturers interviewing their students, and vice versa, were both problematic!

The importance of testing out your data collection was re-iterated (an emergent theme for this conference). Pilot your interview on a single participant, as you may discover whole questions and subject areas that deserve an entry in your rota. The idea of open questions to prompt discussion in focus groups was also raised, with interviewees provided with lists of prompt questions to fill dry spells. It also helps speed up transcription of group conversation if the mediator addresses people by name!

Our table picked, as a group to interview, students who don’t turn up for lectures. Conversation largely focused on getting people to engage with the interview itself, with the possibility of telephone interviews or even instant messaging. Telephone interviews are both hindered and helped by the lack of body language: you cannot read student emotions, but nor can you prejudice the conversation with body language of your own.

Some other takeaways from this session were references: firstly, a paper from Graham exploring the motivations to share educational practice among biology educators. Why do we bother publishing, or talking, or attending conferences?

Secondly, a look at barriers to the adoption of fieldwork, where teachers were given a presentation of exemplar good practice, followed by a single question: “Why won’t this work in your context?” It’s something of a personal weakness (using my local context as a reason not to trust education research findings), so I imagine some of the findings will dovetail nicely with Terry McGlynn’s outstanding blog piece from last year, Education research denialism in STEM faculty. We are all in thrall to pragmatic teaching factors, but perhaps part of the reason we get stuck in a loop of “can’t fix the leak, too busy bailing” is because we just don’t trust the sealant?

For a far more elegant summary of the talk, Dr Kristy Turner was also at the conference and sketched several of the talks; her tweet is embedded below with permission, gratefully received!

MICER17 Reflection 3: Orla Kelly

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Dr Orla Kelly talked about evaluating classroom practice, how we do it, and why we do it. The session focused a lot on action research, and then on the different ways we can collect data.

As a novice educator and mostly-practitioner, I have had very little idea about exactly what Action Research is – but having had several cracks at understanding it at previous conferences, I think I’m finally starting to get it. I think it simply refers to any systematic evaluation of one’s own performance during teaching, with the feedback acted on immediately. Last year, I ran a lecturer evaluation survey mid-way through my lecture and changed my practice as a result – could this be action research?

Regardless of what it is, action research needs to have both research and action! As an example from her own practice, Orla talked about one of her papers where she collected student data about a scheme to introduce problem-based learning into recipe-based labs, an endeavour which was only partially successful, for a number of reasons.

A chunk of the session was spent discussing the pros and cons of different types of data gathering and handling – my group looked at focus groups specifically. Our table had a divergent conversation about grounded theory, and the idea that if you bring any pre-conceived notions into the interview or the data analysis, it’s not grounded theory. While the meat of this session was extremely useful to all, I am not really in a position to conduct data collection outside of survey design for now.

As I am not well-versed enough in the topic of Orla’s talk, I highly recommend you check out her sessional pre-reading – lest my absence of words be seen as problematic.

MICER17 Reflection 2: Stewart Kirton

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Dr Stewart Kirton continued the theme of proper statistical handling of data, set by Fraser Scott last year. We were specifically looking at Likert scales, and ways of developing and handling them. The importance of piloting your study was raised for the first (but not the last) time, both on fellow education researchers and students, to find out if the question is as understandable as you think. I do not do enough of this!

We then discussed several implementation dos and don’ts around Likert scales: I was wary of mixing questions with different types of responses, but this is fine – as is using questions with even numbers of responses. Although it wasn’t explicitly mentioned, I wonder if people prefer to give Likert scales an odd number of values in order to provide a neutral option. The neutral option itself may be desirable precisely because Likert scales usually do not provide a “do not know” option – something Stewart encouraged us to do if appropriate!

Two caveats do apply around responses, however, which I was ignorant of and whose rules I have frequently broken:

  • Don’t mix positive and negative wording in questions
  • Possible responses shouldn’t be clustered at the extreme ends of the scale (Endorsability).

Negatively worded questions can potentially influence participants whereas positive questions don’t, but the far bigger source of bias comes from mixing the two wordings together – this should be avoided even at the cost of a universally-negative questionnaire! For the questionnaire itself, Stewart advocated writing somewhere around 10-12 questions, and then keeping the best 6-8 of these. I imagine that if you needed more than this, then your research question may be inappropriately broad (with reference to Suzanne’s session).

The meat of this session, which spilled over into post-conference discussion, was around the issue of averaging Likert scale data. In brief, the numbers associated with Likert scale data are more or less arbitrary (ordinal data) but we frequently treat them mathematically (interval data). The gap between 1 and 2 may not be the same as the gap between 2 and 3 – so applying mountains of statistics is often inappropriate and time-consuming (and a bane of certain peer reviewers’ lives!). Rather, Stewart suggests strategies around binning the data into binary choices – “Very NSS”, as Simon Lancaster put it on the day.

We frequently want to compare pre- and post-intervention data, so another strategy might be to look at percentage shifts in each response, or to look at how individual student responses changed over time. I’ve committed the sin of Likert averages more than once, and had previously wrestled with standard deviations as a method of conveying answer distributions, without understanding why it felt “wrong”. Now I do!
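The two strategies above can be sketched in a few lines of code. This is my own minimal illustration, not anything presented at the session, and the response data is entirely hypothetical: five-point responses are collapsed into binary agree/disagree counts (dropping the neutral midpoint rather than averaging ordinal codes), and the pre/post comparison is expressed as a shift in percentage agreement.

```python
from collections import Counter

def bin_likert(responses, agree={4, 5}, disagree={1, 2}):
    """Collapse 5-point Likert responses into binary agree/disagree counts,
    discarding the neutral midpoint (3) instead of averaging ordinal codes."""
    counts = Counter(responses)
    n_agree = sum(counts[v] for v in agree)
    n_disagree = sum(counts[v] for v in disagree)
    return n_agree, n_disagree

def percent_agree(responses):
    """Percentage of non-neutral respondents who agreed."""
    n_agree, n_disagree = bin_likert(responses)
    total = n_agree + n_disagree
    return 100 * n_agree / total if total else 0.0

# Hypothetical pre- and post-intervention responses to a single question
pre = [1, 2, 2, 3, 3, 4, 4, 5, 2, 1]
post = [3, 4, 4, 5, 5, 4, 2, 5, 4, 3]

shift = percent_agree(post) - percent_agree(pre)
print(f"Agreement shifted by {shift:+.1f} percentage points")  # +50.0 here
```

Tracking individual students over time would instead mean pairing each student's pre and post responses and counting who moved between the bins, which needs per-student identifiers rather than the anonymous pooled lists used here.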

I had multiple takeaways from this session, but my favourite is probably endorsability as a way to answer the same sort of inherent question that leads people to average Likert scale data. For example, when testing the effectiveness of an intervention, in the past I might have looked for a numerical change in the average response to a single question, whereas the principle of endorsability would have me provide several questions, assessing student comfort in low-, medium- and high-stakes situations.

For a far more elegant summary of the talk, Dr Kristy Turner was also at the conference and sketched several of the talks; her tweet is embedded below with permission, gratefully received!

MICER17 Reflection 1: Suzanne Fergus

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Dr Suzanne Fergus led a discussion around the need to conduct rigorous, quantitative research – putting me rather in mind of several editorials from Keith Taber around what constitutes chemistry education research and how to spot quality in it. You can test an innovation in your local context, and find that it works – but is this truly research? An account of this development will be a valuable resource for the community, but does it merit publication specifically as research, and if not, was it ethical to deprive a control group of the innovation?

Before the conference, we shared our thoughts on what prevents us from conducting educational research – while time was a factor, a lack of confidence and of social science grounding was perhaps the main cause for concern. In my own journey, this is because I received many years of formal and informal training as a chemist but virtually none as an education researcher, and that missing grounding makes the challenges loom all the larger.

We looked at the importance of writing a good research question, and ways in which people get it wrong. We all tried our hand at writing a research question, and I realised that any question I could come up with would be answered by exploring gaps in my knowledge of the literature – am I cut out to be a researcher yet? At best, I came up with a fusion of my two interests, Peer Instruction and the laboratory – but rather than a research question, it prompted me to look for literature around student-student interactions in the laboratory.

Suzanne then shared some of her own research, including a study looking into assessing competency in the laboratory. It rather puts me in mind of Keele’s practical exams and digital badging – things I’m keen to adopt in my own practice. But, do my specific challenges require research? All the areas for improvement I’ve identified so far in the courses I’ve taken over are crying out for better application of existing good practice, rather than novel research.

For a far more elegant summary of the talk, Dr Kristy Turner was also at the conference and sketched several of the talks; her tweet is embedded below with permission, gratefully received!

MICER17 conference reflections

On Friday 19th May, it was the second annual Methods In Chemistry Education Research conference, held once more at the Royal Society of Chemistry in London and run by Michael Seery. Again, the conference was a great opportunity to learn about and discuss the tools, methods, and philosophies used when conducting research into chemistry education. It was particularly great to see an increasing number of chemistry education researchers at the postgraduate level, and there will be a specific satellite meeting for these folks at ViCE/PHEC in August!


I’ve written up my notes from MICER17 and will post them over the next few days as a series of blogs – but these only reflect what I took from the day. Do not read them as accurate summaries, or I shall have done a deep injustice to each of the presenters, who may otherwise feel on reading these accounts that I have grossly missed the points of their sessions. These summaries will turn into links as I upload each blog piece in turn.

Session 1: Dr Suzanne Fergus
tl;dr: Educational research must flow from a good, narrow, collaboratively-defined research question.

Session 2: Dr Stewart Kirton
tl;dr: Don’t average Likert scale data. Don’t mix positive and negative questions. And don’t ask questions that only have extreme answers!

Session 3: Dr Orla Kelly
tl;dr: Action Research needs to have both Research and Action. Use an appropriate method of data collection, and document your failures!

Session 4: Prof. Graham Scott
tl;dr: There are many more ways to skew interviews than it seems.

Session 5: Prof. Keith Taber
tl;dr: Research ethics are really complex, and all of us bear the dark seeds of utilitarian tyranny, however improbable it seems.

Session 6: Prof. Georgios Tsaparlis
tl;dr: Students have been attempting to solve problems in a non-problem-solving way since before I was born.

Final reflections

MICER17 was once again a fantastic opportunity to swim in a professional community that’s committed to producing rigorous research into educational theory and practice. It reminds me to keep my aspirations high, and not to just settle for being a practitioner who occasionally reads journals – I may have left laboratory research behind in 2015, but the curious itch is as undiminished as ever!