ViCEPHEC17 roundup!

Another summer, another fantastic ViCEPHEC to provide a vital burst of excitement and ideas for the new semester. My third such conference, I’m starting to feel like I might be finding my feet a little – to echo Michael’s recent blog, I’m starting to put down the overwhelming feeling of wanting to implement everything I see.

Immediately before the conference, there were two great events. The first was Labsolutely Fabulous, a showcase of laboratory experiments and practical work; I was woefully late for it, but still managed to whoosh around a few demonstrations, including some great microscale work from Bob Worley of CLEAPSS, much discussed but never before met. I also came away with a pipe-cleaner molecular model of water from Kristy Turner, which graced my lanyard for the rest of the conference and will grace my living-room from now on. The second was a satellite meeting of the Teaching Fellows Network, much-renamed and invaluable.

One theme that stood out for me at the Teaching Fellows Network, and across the whole conference, was the difficulty of challenging signature pedagogies. I saw case after case where introducing too much innovative practice, too quickly, would result in student rebellion or poor satisfaction scores. Despite research indicating that teaching quality is unrelated to student satisfaction, we saw multiple cases of academics being punished and in some cases denied promotion on the basis of poor reception. Simon Lancaster even found himself in the position of potentially having to advise himself to reduce the extent of flipping in a course, faced with a direct inverse correlation between learning gain and student satisfaction – the peril of being a director of teaching whose remit includes your own…

It’s become increasingly clear that setting the tone of the culture is important to elicit change – both within a department, but also in the places our students come from. It’s easy to blame “the system” for unhelpful student preconceptions, but when that’s a code for blaming secondary education, then it’s even more vital to listen to the intersecting experience of teacher-lecturers like Kristy Turner, David Read, and Sir John Holman. It’s hardly a problem unique to our little corner of humanity that we have a need of casting less blame and building more bridges.

My main new piece of good practice came right at the start of the conference, from Suzanne Fergus, who gives voice to a habit I’ve used haphazardly and accidentally – Put the why first. Give your lecture a context in day 1, minute 1. I’ve long argued that it’s far more important to spend time making your subject relevant and engaging than it is to cram another sliver of content in, and it’s great to have a voice with some weight that I can cite. Suzanne also spoke about Miller’s pyramid of competency in lab skills – things I’ll be looking into locally in the next year.

Continuing the lab skills theme, Robin Stoodley of UBC presented work they had done to categorise the cognitive tasks of the undergraduate teaching laboratory – revealing a real narrowness of experience, with many of the tasks being repeated across many experiments, and many only appearing in a single experiment – organic chemistry being a particular culprit for narrowness of experience. This framework would, I think, be useful to categorise experimental work from all families of Domin’s descriptors, useful to me as I begin to add elements of inquiry to my first year curriculum (very much following some unpublished work of Jenny Burnham in this area).

Finally in the lab theme, Jenny Slaughter presented an important observation that echoes what we already know to be true: student retention is directly linked to interaction with graduate teaching assistants! It’s a powerful reminder not to neglect the training of these students, as they represent most of the direct staff contact between students and the university, certainly in first year. Also vitally important in the lab is safety education. Both Liverpool and Bristol deny student entry to the lab without a passing mark on a pre-lab safety quiz, and James Gaynor of Liverpool spoke of a robust and integrated approach to H&S that involves giving students access to official COSHH forms directly, as part of lab preparation.

Hopefully, in editing down my 10,000 words of conference notes into this single blog post, I’ve also managed to reduce my cache of new ideas for implementation down into something small enough to tackle before #vicephec18…

Bring on the next semester!


Career progression in UK Chemistry HE

I’ve been working full-time in Higher Education for about two years now, and the precursor scramble of postdocs, contracts, and CV buffing has left imprints of an interest in what makes a person appealing – initially to interview panels, but latterly transferred to the internal promotional power structures of universities.

At last year’s ViCEPHEC16 in Southampton, Jenny Burnham led a satellite pre-meeting of chemistry teaching fellows (and other early-career teachers), where the focus was career progression (there is some scholarship on this from outwith chemistry, but not much). As part of the discussion, we were tasked with identifying our own institution’s career progression criteria. A theme emerged of a balance between leadership and scholarship. Based on this and other discussions around career tracks within peer support groups at Strathclyde, I’ve added a third vertex of good practice; neglected though it may be in many institutions, I started my career at Glasgow, loosely under the guidance of Bob Hill, who I believe (though I may be wrong) was prof’d on the basis of just being a damn fine teacher.

Anyway, the three points of the “promotion triangle” I’ve identified are:

  1. Pedagogical research
  2. Influence
  3. Good practice

A university’s promotion criteria will usually favour one or two of these over another, and a mismatch between these and personal strengths can be as frustrating as it is prevalent – for every good practitioner bemoaning the need to publish, there’s a pedagogical researcher being told to stop writing and start teaching. Defining them is also fuzzy:

  • Elements 1 and 2 are related – high-impact publications of research could be taken as influence, but publishing an account of practice would probably not be recognised as an academic, scholarly, (REFable…) work.
  • Element 2 is probably the hardest to pin down – external examination, conference presentations, textbook authorship, institutional education policy – these can all contribute.
  • Element 3 is probably the hardest to evidence, and a lot more has been written, and better than I can, on the tyranny of student awards, and the role of likability in the TEF.

Rather than try to pretend I’m particularly well-read in this area, I instead want to bring some questions out of this: what would it look like to become a professor in these areas? Are there more? Has anyone ever been promoted for administrative excellence? Do you think this paradigm should be de-emphasised or dismantled? Does it work for anyone? Is it still sexist? Have I asked too many questions?

Answers on a tweetcard, discussion needed and valuable!

MICER17 Reflection 6: Georgios Tsaparlis

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Prof. Georgios Tsaparlis finished up the day with the RSC Education award lecture on problem-solving. My takeaways from this session were to do with the long-lasting problem of … problems! Dorothy Gabel observed in 1984 (the year of my birth) that students will frequently attempt to use algorithmic approaches without understanding the underlying problem – it seems that students never change.

Students are also adept at turning problems into exercises – using familiarity to drop the cognitive load of the task at hand underneath their own working memory capacity, and in so doing becoming adept at that which was once challenging, but without understanding the problem – it reminds me of the “novice, expert, strategic” approaches to problem solving, where we all collectively attempt to reduce complexity and our cognitive load.

MICER17 Reflection 5: Keith Taber

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Keith S Taber (editor of CERP) gave a fantastic double session on research ethics, and the importance of having a widely-known middle initial. The pre-reading for this session inspired thought, once more, around what really constitutes educational research. Keith has a number of editorials on this, with the opinion that studying a local implementation of a generally effective pedagogical technique is not really research. In this case, to be research it should have control data, and unless the control data is from previous years, splitting into cohorts and running a control in a way known to be disengaging is potentially unethical unless the technique is legitimately novel; in which case, it should be studied alongside best practice, rather than placebo. (The reference escapes me, but it puts me in mind of a flaw in medicinal chemistry statistics where a new intervention is significant against placebo, but not significant against Existing Best Practice, which itself is not significant against placebo, leading to inappropriate conclusions.)

What are some of the reasons these studies happen anyway? Perhaps institutional resistance (Does it work here? Prove it before you change something properly), and perhaps personal doubt (I know it works, but will it work in my hands?). Do I, as a physical scientist, simply trust educational research findings less? Does the increased variation of human research scare me? I would suggest framing both of these issues the same way: We have to put the onus on the person resisting change, whether ourselves or our institution, to prove that the literature supporting change is flawed beyond simply saying “It might not work in our context”.

My takeaway from Keith’s talk was his walk through notable failures of ethics in the history of medicine and psychology: although the Stanford prison experiment wasn’t on the agenda, we looked at Milgram and Tuskegee, and discussed the factors that can lead researchers into a situation that is grossly unethical when observed externally. Milgram tells us that people will follow the suggestions of authority into deeply uncomfortable places – deferring our moral judgement in the process. Do we as experimenters (or interviewers) risk accidentally expressing our authority in inappropriate ways? Or can we collectively deceive ourselves that the course of action we are on is justified by the tenets of utilitarianism, as in the extreme example of the Tuskegee incident?

My table had a particularly insightful discussion around the purpose of the debrief – voluntary consent that only becomes informed at the conclusion of the experiment, lest the information affect the outcome. In the Stanford Prison Experiment, trauma is inflicted that goes beyond a simple debrief or disclaimer – and has left people deeply affected, even decades later. Is it ethical to traumatise someone if it’s all explained later as a fakery? We thought probably not.

All this might seem like a far cry from educational ethics, but badly-implemented research could see students subject to inappropriately difficult tests, potentially harming their self-efficacy and even challenging self-belief. Poorly designed studies can also waste valuable donated time. We also risk a lack of oversight if we are the gatekeepers of our own students – departmental or faculty ethics boards are meant to provide this oversight, but it often amounts to nothing more than a rubber stamp. If we run an experiment with students that view us as a lecturer or leader, can we be sure they feel no implicit coercion? No link between participation and good grades?

We then had an extensive discussion of ethics in publication. Pointing out the limitations of your findings. Not mis-citing sources. When and when not to reveal personally-identifying information. Keith identified a number of “cargo cult behaviours” (my own words), which were seen as making research ethical. Destroying research data and anonymising participants were two given examples and I would add university ethics boards to this under certain circumstances – in that it is possible for a group of people used to assessing medical interventions to rubber-stamp an educational ethics application, but that does not prevent the possibility of straying into subtly coercive behaviour as an interviewer/experimenter. I have no oversight just because my forms are in order!

For a far more elegant summary of the talk, Dr Kristy Turner was also at the conference and sketched several of the talks; her tweet is embedded below with permission, gratefully received!

MICER17 Reflection 4: Graham Scott

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Drilling down into data collection methods more was Graham Scott, talking about interviews for data collection. Many useful do’s and don’ts were shared, such as the importance of interviewing in a neutral distraction-free environment, without a strong power imbalance between the interviewer and interviewee. Lecturers interviewing their students, and vice versa, were both problematic!

The importance of testing out your data collection was re-iterated (an emergent theme for this conference). Pilot your interview on a single participant, as you may discover whole questions and subject areas that deserve an entry in your rota. The idea of open questions to prompt discussion in focus groups was also raised, with interviewees provided with lists of prompt questions to speed up dry spells. It also helps speed up transcription of group conversation if the mediator addresses people by name!

Our table picked, as a group to interview, those students who don’t turn up for lectures. Conversation largely focused around getting people to engage with the interview itself, with the possibility of telephone interview or even instant messaging. Telephone interviews are both hindered and helped by the lack of body language, either in reading student emotions or in avoiding prejudicing the conversation with body language of your own.

Some other takeaways from this session were references: Firstly, a paper from Graham exploring the motivations to share educational practice, in biology educators. Why do we bother publishing, or talking, or attending conferences?

Secondly, a look at barriers to the adoption of fieldwork, where teachers were given a presentation of exemplar good practice, followed by a single question: “Why won’t this work in your context?” It’s something of a personal weakness (using my local context as a reason not to trust education research findings) so I imagine some of the findings will dovetail nicely with Terry McGlynn’s outstanding blog piece from last year, Education research denialism in STEM faculty. We are all in the thrall of pragmatic teaching factors, but perhaps part of the reason we get stuck in a loop of “can’t fix the leak, too busy bailing” is because we just don’t trust the sealant?

For a far more elegant summary of the talk, Dr Kristy Turner was also at the conference and sketched several of the talks; her tweet is embedded below with permission, gratefully received!

MICER17 Reflection 3: Orla Kelly

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Dr Orla Kelly talked about evaluating classroom practice, how we do it, and why we do it. The session focused a lot on action research, and then on the different ways we can collect data.

As a novice educator and mostly-practitioner, I still have very little idea about exactly what Action Research is – having had several cracks at understanding it at previous conferences, I think I’m finally starting to get it. I think it simply refers to any systematic evaluation of one’s own performance, during teaching, and the immediate feedback. Last year, I ran a lecturer evaluation survey mid-way through my lecture and changed my practice as a result – could this be action research?

Regardless of what it is, action research needs to have both research and action! As an example from her own practice, Orla talked about one of her papers where she collected student data about a scheme to introduce problem-based learning into recipe-based labs, an endeavour which was only partially successful, for a number of reasons.

A chunk of the session was spent discussing the pros and cons of different types of data gathering and handling – my group looked at focus groups specifically. Our table had a divergent conversation about grounded theory, and the idea that if you bring any pre-conceived notions into the interview or the data analysis, it’s not grounded theory. While the meat of this session was extremely useful to all, I am not really in a position to conduct data collection outside of survey design for now.

As I am not well-versed enough in the topic of Orla’s talk, I highly recommend you check out her sessional pre-reading – lest my absence of words be seen as problematic.

MICER17 Reflection 2: Stewart Kirton

This is a reflection on a specific MICER17 conference session; for an overview of the conference, start reading here.

Dr Stewart Kirton continued the theme of proper statistical handling of data, set by Fraser Scott last year. We were specifically looking at Likert scales, and ways of developing and handling them. The importance of piloting your study was raised for the first (but not the last) time, both on fellow education researchers and students, to find out if the question is as understandable as you think. I do not do enough of this!

We then discussed several implementation dos and don’ts around Likert scales: I was wary of mixing questions with different types of responses, but this is fine – as is using questions with even numbers of responses. Although it wasn’t explicitly mentioned, I wonder if people prefer to give Likert scales an odd number of values in order to provide a neutral option. The neutral option itself may be desirable precisely because Likert scales usually do not provide a “do not know” option – something Stewart encouraged us to do if appropriate!

Two caveats do apply around responses, however, which I was ignorant of and have broken the rules around frequently:

  • Don’t mix positive and negative wording in questions
  • Possible responses shouldn’t be clustered at the extreme ends of the scale (Endorsability).

Negatively worded questions can potentially influence participants whereas positive questions don’t, but the far bigger source of bias comes from mixing the two wordings together – this should be avoided even at the cost of a universally-negative questionnaire! For the questionnaire itself, Stewart advocated writing somewhere around 10-12 questions, and then keeping the best 6-8 of these. I imagine that if you needed more than this, then your research question may be inappropriately broad (with reference to Suzanne’s session).

The meat of this session, which spilled over into post-conference discussion, was around the issue of averaging Likert scale data. In brief, the numbers associated with Likert scale data are more or less arbitrary (ordinal data) but we frequently treat them mathematically (interval data). The gap between 1 and 2 may not be the same as the gap between 2 and 3 – so applying mountains of statistics is often inappropriate and time-consuming (and a bane of certain peer reviewers’ lives!). Rather, Stewart suggests strategies around data binning into binary choices – “Very NSS”, as Simon Lancaster put it on the day.
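
To make the binning idea concrete for myself, here is a minimal sketch in Python. It assumes a five-point scale coded 1 to 5, with 4 and 5 counted as "agree"; the coding, the threshold, the function name, and the sample data are all my own illustrative assumptions rather than anything Stewart prescribed.

```python
from collections import Counter


def percent_agree(responses, agree_codes=(4, 5)):
    """Return the percentage of responses that fall in the 'agree' bin.

    Assumes responses are integer codes (hypothetically 1-5), and that
    the codes listed in agree_codes count as agreement.
    """
    if not responses:
        raise ValueError("no responses to summarise")
    counts = Counter(responses)
    agree = sum(counts[code] for code in agree_codes)
    return 100 * agree / len(responses)


# Made-up responses to a single question, for illustration only.
responses = [1, 2, 2, 3, 4, 4, 5, 5, 5, 3]
print(f"{percent_agree(responses):.0f}% agree")
```

Reporting "50% agree" rather than "mean 3.4" avoids pretending that the gaps between the codes are equal, at the cost of throwing away the distinction between "agree" and "strongly agree".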

We frequently want to compare pre- and post-intervention data, so another strategy might be to look at percentage shifts in each response, or to look at how individual student responses changed over time. I’ve committed the sin of Likert averages more than once, and had previously wrestled with standard deviations as a method of conveying answer distributions, without understanding why it felt “wrong”. Now I do!
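
Both of those strategies can be sketched in a few lines of Python. This is only a rough illustration: it assumes a five-point scale coded 1 to 5, and that the pre and post lists are paired by student in the same order. The function names and the data are hypothetical.

```python
from collections import Counter


def response_percentages(responses, scale=(1, 2, 3, 4, 5)):
    """Percentage of responses at each point on the scale."""
    if not responses:
        raise ValueError("no responses to summarise")
    counts = Counter(responses)
    return {code: 100 * counts[code] / len(responses) for code in scale}


def percentage_shift(pre, post, scale=(1, 2, 3, 4, 5)):
    """Change in percentage points for each response, post minus pre."""
    before = response_percentages(pre, scale)
    after = response_percentages(post, scale)
    return {code: after[code] - before[code] for code in scale}


def individual_changes(pre, post):
    """Count students who moved up, moved down, or stayed the same.

    Assumes pre[i] and post[i] belong to the same student.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must be paired by student")
    moves = {"up": 0, "down": 0, "same": 0}
    for before, after in zip(pre, post):
        if after > before:
            moves["up"] += 1
        elif after < before:
            moves["down"] += 1
        else:
            moves["same"] += 1
    return moves


# Made-up paired responses from five students, for illustration only.
pre = [2, 2, 3, 3, 4]
post = [3, 2, 4, 4, 5]
print(percentage_shift(pre, post))
print(individual_changes(pre, post))
```

Neither summary needs a mean or a standard deviation, so neither relies on treating the ordinal codes as interval data.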

I had multiple takeaways from this session, but my favourite is probably endorsability as a way to answer the same sort of inherent question that leads people to average Likert scale data. For example, when testing the effectiveness of an intervention in the past, I may have looked for a numerical change in the average response to a single question, whereas the principle of endorsability would have me providing several questions, assessing student comfort in low-, medium- and high-stakes situations.

For a far more elegant summary of the talk, Dr Kristy Turner was also at the conference and sketched several of the talks; her tweet is embedded below with permission, gratefully received!