Sep 16, 2024

User-centred evaluation

Evals are hard. Get closer to the users.

Andy Bell

With LLMs it's easy to build apps that sort of work. It is much harder to make apps that actually work. And it is hard to even know where you are on the continuum between ‘sort of’ and ‘actually’.

But it is essential to know. Clients want to know what’s being achieved with their investment. Organisational leaders want to validate a new strategy and make future decisions. Risk managers want to know that a new way of doing things will improve safety and compliance, not threaten it. The development team needs to know what needs improving.

Recently, on our ‘intelligent case notes’ project for the Citizens Advice Scotland Extra Help Unit, we tried to answer this question of algorithm performance in a way that would give all stakeholders context and confidence.

We explored questions like: Was HelpFirst ever missing clients with risks it should have picked up? How did the HelpFirst score compare to the opinion of an experienced practitioner?

But each question we asked opened up a new rabbit hole of complexity. As we looked at the tradeoff between accuracy, precision and recall in different scenarios, we had to address the fact that human practitioners often disagree in their own assessment of a case, and therefore it’s often hard to get a ‘ground truth’ to evaluate against.
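To make that rabbit hole concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the labels are invented (they are not HelpFirst or Extra Help Unit data), and the helper names `precision_recall` and `cohen_kappa` are hypothetical. It scores the same system output against two practitioners in turn, then measures how much the two practitioners agree with each other using Cohen's kappa.

```python
def precision_recall(predicted, actual):
    """Precision and recall for binary flags (True = 'at risk')."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)        # flagged, and reference agrees
    fp = sum(p and not a for p, a in pairs)    # flagged, but reference disagrees
    fn = sum(a and not p for p, a in pairs)    # missed a case the reference flagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_a = sum(rater_a) / n  # share of cases rater A flags
    p_b = sum(rater_b) / n  # share of cases rater B flags
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Invented labels for eight cases (True = flagged as at risk).
system         = [True, True,  False, True,  False, False, True,  False]
practitioner_1 = [True, False, False, True,  False, True,  True,  False]
practitioner_2 = [True, False, False, False, False, True,  False, False]

print(precision_recall(system, practitioner_1))
print(precision_recall(system, practitioner_2))
print(cohen_kappa(practitioner_1, practitioner_2))
```

With these invented labels, the system scores 0.75 precision and 0.75 recall against the first practitioner but 0.25 and 0.5 against the second, and the two practitioners agree with a kappa of only 0.5. The same output looks good or poor depending on whose judgement is treated as the 'ground truth'.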

Before we knew it, our report had gone from 6 pages to 40 pages. It was impenetrable. Stakeholders did not find it useful for making informed decisions about the system performance.

Then we tried another approach to evaluation, one that gives senior managers a much more holistic view of system performance.

Rather than directly evaluating the algorithms, we switched to evaluating the test users’ experience of the system.

In the pre-launch testing phase, we’ve had several experienced caseworkers using the system on a trial basis. Every month we ask 7 standard questions. Here’s what’s happened as we’ve iterated on the algorithms in response to their feedback:

Feedback from User Steering Group. (Green means good)

We think user-centred evaluation works better because what you’re trying to evaluate is so complex.

Imagine you hired an identical assistant for each practitioner, and they all had a 3-month probation period. You could try and evaluate the assistants based on a few quantitative metrics of output or efficiency. Or, you could ask the staff working with them a few questions about whether they want to keep working with their assistant, why, and what could be better.

We are delighted that, as of this August, there is unanimous feedback that life is better with HelpFirst!

The Priority Services Register (PSR) is a key tool that energy suppliers use to fulfil their responsibilities to vulnerable customers. But there isn't just one register. Every energy company has their own PSR and the application forms vary unpredictably from supplier to supplier. 

As part of our CivTech Challenge, we’ve been researching best practice across the industry. We were left with lots of questions:

  • Why is 'restricted hand movement' a vulnerability that almost all suppliers assess?
  • Why are archaic phrases like ‘bedridden’ used?
  • Why does only one supplier check if their vulnerable customers use ‘electric showering’? 

Alas, we weren't able to fully answer these questions. But here’s a visual guide to various PSR forms, so you can get an overview of the landscape.  

Overview

We accessed PSR application forms for Ovo Energy, British Gas, SSE¹, Octopus, EDF, Shell and Utilita. For many other suppliers, access is restricted.

The first observation is that the application forms are extremely varied: 

  • We’ve grouped questions into categories to make things a bit easier to read; however, the forms themselves come in very different structures. Some offer all their options in one long list, some separate them into smaller sections. Some only show certain sections once a customer has selected a particular option (e.g. selecting ‘sight loss’ gets you extra questions on the Shell application).
  • Several vulnerabilities are only mentioned by one supplier. Only one asks about autism, and another asks about breathing difficulties. The following options only showed up once: ‘female presence preferred’, ‘longer time to answer the door’ and ‘bedridden’.
  • For sensory needs: ‘blind’ and ‘partially sighted’ are separate options in all the forms. 'Hearing impairment' and 'deaf' are combined in half the forms and the former is not asked at all in one. This may be contentious, as hearing impairment and being deaf are very different conditions.
  • Some suppliers include options for accessible information provision in the same form (e.g. braille or large print letters). Others link to an additional form, or do not reference it at all.
  • When temporary conditions are mentioned, only some suppliers allow the customer to select a date when they believe the condition will no longer apply.
  • Most of the forms are multiple choice, limiting customers to what the supplier chooses to ask about. Occasionally the supplier (e.g. EDF) gives the customer a larger space to talk about their conditions, equipment and needs in more detail.

Next, we dive deeper into the application forms.

Medical Conditions²

[Infographic: which suppliers ask about each medical condition.]
Key: question asked; question asked (with variations); question not asked.
Suppliers: Ovo, BG, SSE, Octopus, EDF, Utilita, Shell.
Questions: Hearing; Speech impairment; Poor sense of smell/taste; Mental health; Dementia(s)/cognitive impairment; Non-English speaker; Chronic or serious illness; Partially sighted; Blind; Developmental condition; Restricted hand movement; Pensionable age; Physical impairment; Unable to answer door; Learning difficulties; Arthritis; Anxiety or depression; Heart condition; Dyslexia; Autism; Living alone; Bedridden; Breathing difficulties; Carer; 75+; Receive disability benefits.

EDF’s application form has the highest number of options related to medical conditions (20 in total) with British Gas and Utilita featuring the lowest (13).  EDF also features options which cover multiple medical conditions (e.g. 'breathing difficulties', 'disability benefits') more frequently than other suppliers. SSE has the highest number of options for learning and mental health related conditions (including 'dyslexia', 'autism', 'learning difficulties' and 'anxiety/depression').

There is some overlap within options, which could be confusing. For example, SSE lists both ‘developmental condition’ and ‘autism’ separately, even though the latter is a type of the former. Another example is the ‘mental ill health’ and ‘anxiety/depression’ options, again found in the SSE form. It is not clear if customers should tick both or only the more specific option.

All organisations feature options to indicate older age; however, they specify a variety of lower thresholds, including 60+, 65+, 'pensionable age' or 'pensioner'. British Gas have two separate options relating to older age ('pensionable age (65 and over)' and 'age 75 and over').

There is some degree of consistency across organisations. This appears to be where specific conditions have been mentioned within the Ofgem guidance (for instance, 'restricted hand movement' appears in all but one form, in spite of the fact this is a very specific need).

Language Used

The language used across suppliers is very inconsistent. SSE uses ‘hard of hearing’ and ‘deaf’ to describe hearing loss-related needs, while other suppliers employ terms such as ‘hearing impairment’ or ‘hearing impaired’.

Some options have multiple potential meanings: ‘carer’ could refer to the respondent either needing a carer or being a carer for someone else.

All suppliers ask about speech and language difficulties and broader language barriers. However, there is no shared way of asking whether a customer speaks English. Variations include: 'unable to communicate in English', 'language barrier' and 'foreign language speaker'.

‘Unable to communicate in English’ (used by Octopus and Ovo) is somewhat ambiguous. Customers might take it to mean having a different first language or having a speech condition. The requirements are quite different: with the former you could use an interpreter or multilingual support, with the latter you would need different support.

Medical Equipment³

[Infographic: which suppliers ask about each type of medical equipment.]
Key: question asked; question asked (with variations); question not asked.
Suppliers: Ovo, BG, SSE, Octopus, EDF, Utilita, Shell.
Questions: Stairlift/hoist/electric bed; Heart/lung or ventilator machine; Dialysis, feeding pump, automated medication; Nebuliser or apnoea monitor; Careline/telecare system; Medicine refrigeration; Water dependent; Medically dependent on showering/bathing; Oxygen concentrator; Oxygen use; MDE electric showering; Mains powered electric medical equip; Life support; Wheelchair user; Medical or other critical dependency.

Options Offered

British Gas do not offer any specific options for types of medical equipment: they solely offer the generic category ‘mains powered electric medical equipment’. All other organisations surveyed have more specific options. These are broadly consistent across suppliers, with a few options appearing less widely (e.g. ‘wheelchair’, ‘MDE electric showering’).

Most organisations (bar British Gas and EDF) also ask about reliance on water.

Language Used

It is unclear what is meant by the ‘life support’ option used by EDF. Often the phrase ‘life support machine’ refers to a ventilator, but EDF also have a separate option for ‘heart and lung ventilators’. It could mean life support as a condition or set of needs, but that seems too broad for the PSR.

Temporary Changes

In a rare show of unanimity, all suppliers offer the same options for temporary changes.

[Infographic: all seven suppliers (Ovo, BG, SSE, Octopus, EDF, Utilita, Shell) ask each of these questions.]
Questions: Young adult householder <18; Children age 5 and under; Temporary life change (bereavement/pregnancy); Post hospital recovery.

Other Questions

[Infographic: which suppliers ask each of the other questions.]
Key: question asked; question not asked.
Suppliers: Ovo, BG, SSE, Octopus, EDF, Utilita, Shell.
Questions: Additional presence preferred; Nominee; Regular meter readings; Move meter to support access; Extra time to answer door; Female presence preferred; Duplicate bill to family; Power of attorney; Financial difficulties.

Passwords 

All suppliers offered the option of setting up a password or PIN. This is usually so a technician can state this password as an additional security measure on home visits. Two suppliers required a 6-letter password, one an 8-letter password and one a 10-letter password.  A final supplier did not specify length.  An unfortunate side effect of this variation is that if an individual were to move supplier, they may need to change their password and remember a new one. (Note: not shown in an infographic.)

Life Scenarios

Departing from the multiple-choice standard, Shell veer into first-person narratives. In the ‘Nominee Scheme’ section of their form, they feature an additional tick box option: ‘I can be easily confused and worried by communications from my energy supplier’. When asking about meter support they offer: ‘I have a prepayment meter and no-one in my household is able to safely read it or top it up’.

Accessibility Information⁴

[Infographic: which suppliers ask about each accessible information format.]
Key: question asked; question asked (with variations); question not asked.
Suppliers: Ovo, BG, SSE, Octopus, EDF, Utilita, Shell.
Questions: Large print letter; Braille; Audio; Alternative language: please specify; Black and white letter; Letter; Colour contrast; Arial font; Large print letter in black and white.

Organisations vary in whether they include accessibility questions on their PSR form. Ovo offers seven different accessibility options for receiving information, while Shell offers a single, broad checkbox.

Conclusions

Suppliers diverge considerably in what information they collect on their customers to register them for Priority Services support.

On our travels we encountered the aspiration to create a more standardised or universal PSR. Initiatives like the Vulnerability Registration Service and Experian’s Support Hub aim in this direction. In the future we are keen to explore the user experience of these services and how they aid vulnerable customers.  

In the meantime, we hope this analysis will prove useful if you are looking to improve the experience of vulnerable energy customers. If you have any questions or comments, contact harriet@helpfirst.ai. We’d love to hear from you!

Footnotes
  1. SSE was acquired by Ovo Energy in 2020.  They hadn't completed their move over when we started this research and were still registering people to their PSR. We’ve included them in this analysis as their approach was interesting with many mental health and developmental condition-type questions.
  2. Some questions have been condensed in the infographics. Numbers referenced in the discussion refer to the full options as available on the questionnaires, but the infographics demonstrate a condensed version for brevity and ease of visualisation. Full original data is available on request.
  3. 'Heart/lung machine & ventilator' is the most common formulation of the question regarding this equipment. However, EDF separates it into two questions: 'heart/lung machine' and 'ventilator'.
  4. Octopus and British Gas do not ask about accessibility. Shell only offers a general ‘accessible information’ needs tick box if the customer has earlier selected that they have a visual impairment. This does not mean they do not record this information elsewhere; where these questions did not appear on their forms, we were not able to verify what (or whether) they ask about accessible information.