
Empathy in AI for Caregivers.
A six-month mixed methods study on how caregivers experience AI empathy, and what that tells us about designing for care.
AI is being layered into caregiving without much study of how it affects an already-stressed group.
About 53 million Americans care for an older adult, and burnout is common. Roughly a third of those caregivers are themselves age 65 or older.
As clinical access shrinks and AI products get more capable, caregivers are increasingly using ChatGPT, Gemini, and voice assistants for everything from looking up symptoms to processing grief. Most of this happens outside any clinical or research framework, with limited visibility into whether AI is helping or making things harder.
I designed this study to answer:
1. What kind of empathy can AI actually provide, given how clinical research defines empathy?
2. Where does AI care differ from human care, and what can be learned from those differences?
3. How do caregivers, especially those at risk of burnout, experience that gap in the tools they already use?
The stakes: automating emotional and clinical care for an already-stressed group can harm them when AI's limits are not visible to the people relying on it. Comparing AI care with human care in this context shows what AI can plausibly do and where it should defer.
Lead UX researcher and first author.
Responsibilities
- Systematic literature review
- Survey instrument design and distribution
- Design and execution of interview-based study
- Participant recruitment
- Thematic analysis
- Synthesis into design implications
- First author on manuscript
Constraints
- Caregivers are time-poor and rarely respond to recruitment
- Caregivers became the proxy for older adults due to IRB restrictions
- Small sample size by necessity
- Remote sessions only
Study Design & Execution
Two connected studies, sequenced so the first informed the second.
The SLR examined the field's existing claims about empathic AI. The empirical phase tested those claims against how caregivers actually use these tools.
Phase 1: Systematic Literature Review
Co-authored with Dr. Kristina Shiroma (LSU). Methodology: PRISMA 2020.
Phase 2: Empirical follow-up with caregivers
Solo. Survey plus cognitive walkthroughs / interviews.
- Survey (n=16). Family and professional caregivers, distributed via support networks, Reddit, and Facebook groups.
- Cognitive walkthroughs / interviews (n=8). 5 family caregivers, 3 professional caregivers, all currently caring for adults aged 60+.
- Study materials. Real Reddit posts from r/CaregiverSupport and r/dementia, paired with the actual responses that ChatGPT, Gemini, Siri, and Alexa produced for each post.
- Recruitment. 15+ organizations contacted. Final pool from assisted-living partners, AI-using home-care companies, and online communities.
Extracted key findings
AI can offer one part of empathy.
- Clinical and psychological research defines empathy in three parts: cognitive, affective, and compassionate.
- AI can mechanically perform cognitive empathy.
- Affective and compassionate empathy require feeling and motivated action and are outside what AI can do.
The field has been measuring empathy with non-clinical tools.
- Existing AI research has largely used self-reports and usability instruments to measure empathy.
- The construct under study is therefore closer to user experience than to empathy as clinical care defines it.
6 of 8 family caregivers used AI heavily for information work.
Caring for someone with dementia (or another chronic condition) typically means navigating multiple health practitioners, decoding insurance, and managing intersecting health conditions where one diagnosis interacts with another.
Family caregivers used AI to:
- Learn medical concepts they hadn't trained for
- Coordinate information across specialists
- Look up drug interactions
- Translate clinical terminology
- Verify advice they had already received from doctors
Caregivers with stronger human support networks were more critical of AI as a replacement.
- 3 of 8 interview participants explicitly preferred human support over AI for emotional needs.
- Each had access to high-quality human support: a sibling co-caregiver, a partner, or a peer support group.
- Their critique mapped to the empathy framework: AI lacks the affective and compassionate dimensions a peer with shared lived experience provides.
“I don't trust AI to provide me with emotional support in my life because it doesn't have lived experiences.”
Professional caregivers were more optimistic about AI as a workflow tool.
3 professional caregivers in this study used AI for:
- Clinical information search during work
- Their own emotional regulation after difficult shifts
- Exploring whether AI could serve as a companion for the older adult in their care
“I am a student and I juggle studies with my care work, which can be quite exhausting, long hours and emotional drain. So I appreciate being able to quickly look up information and get my questions answered.”
Voice assistants failed the emotional test.
- For the heaviest emotional prompts (caregiver burnout, end-of-life care), Siri and Alexa returned “I'm not quite sure how to help you with that,” or surfaced search-engine snippets unrelated to the emotional content.
- Voice is the modality most older adults can actually use, which makes this a meaningful product gap.
Translated findings into 5 design implications.
Personalize by caregiver context.
Family and professional caregivers used AI for different things and had different tolerance for emotive framing. A product targeting both groups should differentiate by context, role, and time pressure. A product targeting one group should be built specifically for that group.
Add human-in-the-loop and external support paths.
AI should not be the endpoint for a stressed user. Build in connections to peer caregiver communities, professional networks, and clinician handoffs. The findings suggest the most stressed caregivers are also the ones most likely to over-rely on AI without those paths in place.
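One way to operationalize this implication: a lightweight check that routes high-distress messages toward human support instead of letting the AI respond alone. This is a minimal, hypothetical sketch; the trigger phrases, function names, and resource list are illustrative assumptions, not part of the study or any real product.

```python
# Hypothetical human-in-the-loop escalation check. CRISIS_SIGNALS and
# SUPPORT_PATHS are illustrative placeholders, not findings from the study.
CRISIS_SIGNALS = ("burnout", "can't cope", "end-of-life", "grief")

SUPPORT_PATHS = [
    "Peer caregiver community (e.g., a moderated support group)",
    "Professional caregiver network or social worker",
    "Clinician handoff for medical questions",
]

def needs_human_path(message: str) -> bool:
    """Return True when a caregiver message should surface human support options."""
    text = message.lower()
    return any(signal in text for signal in CRISIS_SIGNALS)

def respond(message: str) -> list[str]:
    """Attach human support paths to the AI response when distress is detected."""
    return SUPPORT_PATHS if needs_human_path(message) else []
```

A production system would use a far more robust distress classifier than keyword matching; the point is structural: the escalation paths exist in the product, and the AI is never the endpoint for a stressed user.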
Make performative empathy toggleable.
Some caregivers want clinical information without emotive framing and described emotive AI as deceptive. A toggle that lets users switch between warm and direct response styles addresses this preference without removing warmth for users who want it.
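As a sketch of how this toggle could work in practice, the response style could simply switch the system prompt sent to the model. Everything here (the `ResponseStyle` enum, `build_system_prompt`, and the prompt wording) is a hypothetical illustration, not an implementation from the study.

```python
# Hypothetical response-style toggle implemented as a system-prompt switch.
# Names and prompt text are illustrative assumptions.
from enum import Enum

class ResponseStyle(Enum):
    WARM = "warm"      # emotive framing, for users who want it
    DIRECT = "direct"  # information-first, for users who find emotive AI deceptive

def build_system_prompt(style: ResponseStyle) -> str:
    base = "Answer caregiving questions accurately and state your limitations."
    if style is ResponseStyle.WARM:
        return base + " Use a supportive, empathetic tone."
    return base + " Use a neutral tone with no emotive framing."
```

The design choice worth noting is that the toggle changes framing only, never the informational content, so switching styles cannot cost a caregiver substance.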
Be transparent about limitations.
"I can help you find information, but I cannot replace a clinician or someone who has lived through this" is more honest than performed warmth and reduces over-reliance for the caregivers most likely to act on AI advice without verifying it.
Design for older caregivers, including digital literacy.
About a third of caregivers are themselves age 65 or older. Products in this space need to be built with their accessibility, digital literacy, and varied tech comfort in mind. Digital literacy support, onboarding, and accessible interaction patterns are essential rather than optional. Designers should not assume the typical user of caregiving AI is a tech-fluent younger adult.
Triangulated literature and lived experience to surface where AI care helps caregivers, where it falls short, and what the difference reveals for design.
What I learned
- The dearth of research is itself a finding. I expected to find more empirical work on older adults using general-purpose AI than I did. The gap signals two things: a lack of theoretical and clinical grounding in how these technologies are designed, and likely low adoption among the population they are often pitched to. This shaped both the framing of the SLR and the choice to focus the empirical phase on caregivers.
- The SLR sharpened the empirical study. Doing the literature review first changed how I designed Phase 2. The three-types empathy framework, the gaps in measurement, and the recurring use of empathy as a UX construct directly shaped what I asked caregivers and how I structured the test artifacts.
- Hard-to-reach populations need design, not just outreach. Caregivers don't have time to participate in studies, and IRB constraints around vulnerable groups added another layer. I learned how to recruit a hard-to-reach population without any industry partner, using cold outreach to support communities and gradually building trust.
What's next
The SLR is under peer review at JMIR with co-author Dr. Kristina Shiroma.