Testing the Extents of AI Empathy: A Nightmare Scenario

by Simon Y. Blackwell (@anywhichway) · 37 min read · March 28th, 2024
Too Long; Didn't Read

This document describes an evaluation of how various AI assistants handle empathetic conversations. The AIs evaluated were Claude, Gemini, ChatGPT, Willow, Pi.ai, Mistral, and a customized version of Claude. Each AI was prompted with scenarios involving sadness, happiness, or nightmares. Responses were assessed on expression of sympathy, attempts to understand the user, space given for emotions, quality of advice, affirmative conversation, manifestation of empathy, and escalation of serious issues. Willow and Pi.ai demonstrated the most empathy; base Claude and Gemini needed prompting, and Mistral required API access. The customized Claude performed well against the benchmarks.
Simon Y. Blackwell (@anywhichway)

Working in the clouds around Seattle on open source projects. Sailing when it's clear.
