We've analyzed how ChatGPT responds to users based on their name, using language model research assistants to protect privacy.
Creating our models takes more than dataβwe also carefully design the training process to reduce harmful outputs and improve usefulness. Research has shown that language models can still sometimes absorb and repeat social biases from training data, such as gender or racial stereotypes.
In this study, we explored how subtle cues about a user's identityβlike their nameβcan influence ChatGPT's responses. This matters because people use chatbots like ChatGPT in a variety of ways, from helping them draft a resume to asking for entertainment tips, which differ from the scenarios typically studied in AI fairness research.
While previous research has focused on third-person fairness, where institutions use AI to make decisions about others, this study examines first-person fairness, or how biases affect users directly in ChatGPT.