Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- LLMs are used to generate content for a range of tasks
- Need to ensure models are aligned with human preferences and do not produce unsafe, inaccurate or toxic outputs
- Alignment techniques can mitigate safety concerns and improve model capabilities
- Personalising LLMs through micro-level preference learning processes may result in models better aligned with each user
- Normative challenges in defining the bounds of a societally acceptable and safe degree of personalisation
Paper Content
Introduction
- LLMs have improved in recent years and are being used in a wide range of applications
- ChatGPT was released in 2022 and reached over 100 million users in two months
- LLMs are being aligned with human preferences using reward learning
- More reward learning can lead to LLMs having polarised views on certain topics
- Current implementations of reward learning are limited and rely on a small group of people
- This paper presents a taxonomy and policy framework for explicit personalisation of LLMs
- Personalised LLMs could provide tailored assistance and adapt to diverse groups
- Personalised LLMs could also reinforce biases and narrow information diets
- A policy framework is needed to govern personalised LLMs safely and ethically
Background
- AI systems are being applied to complex tasks
- Alignment is desirable to avoid undesirable behaviours
- Alignment is a technical challenge
- Recent works have tried to align LLMs with human preferences
- Three axes of alignment: what, what, who
- Implicit personalisation is occurring to meet expectations of non-representative crowdworkers
From implicit to explicit personalisation
- Given the diversity of human values and preferences, and the importance of pragmatics for contextual understanding, the aim to fully align models across human populations may be a futile one.
- Personalisation in internet technologies and NLP is not new.
- The technical apparatus for effective feedback learning exists.
- LLMs are designed for adaptation via transfer learning.
- The HuggingFace hub is evidence of the demand for customisation.
- Recent industry model developments and announcements suggest the imminent possibility of personalisation.
A taxonomy of the benefits and risks from personalised LLMs
- Consider the effects of personalised LLMs at individual and societal levels
- Benefits and risks of personalised LLMs are summarised in Table 1
- Benefits and risks can be directly related to each other
- Benefits and risks can accumulate at the societal level
- Constructed taxonomy through increased stakeholder involvement, allowing for a more participatory and inclusive approach
- Personalised LLMs can increase efficiency, utility, autonomy, and connection
- Increased efficiency is the inverse of quality of service harms
- Increased utility is the inverse of discrimination and exclusion harms
- Increased autonomy is the inverse of representational harms
- Increased connection is the inverse of emotional harms
- Cost incurred by end-users in providing personalised feedback to an LLM
Societal level
- Personalised LLMs may better adapt to the needs of marginalised communities
- Personalised LLMs could improve access to resources and reduce cost
- Personalisation democratises how values or preferences are embedded into an LLM
- Personalised LLMs could improve work productivity
- Personalised LLMs could entrench the digital divide
- Personalised LLMs could lead to polarisation and a breakdown of social cohesion
- Personalised LLMs could be used for malicious purposes
- Personalised LLMs could lead to labour displacement
- Personalised LLMs could have a large environmental impact
A three-tiered policy framework for personalised LLMs
- Proposes a new policy framework for managing benefits and risks of personalised LLMs
- Provides a principled and holistic way of deciding how personalisation should be managed
The limits of personalisation
- Deciding the limits of personalisation is subjective and contentious
- Requires deciding which aspects of model behaviour can be personalised, and in what ways
- Restrictions and requirements should be applied to different types of personalised LLM outputs
How people interact with LLMs
- Machine learning workflows have changed over the past five years
- Pre-trained models are used instead of training from scratch
- Generative models are fine-tuned on in-domain data
- Model provider, application provider and end-user are involved in creating an LLM application
- Personalisation of LLMs needs to be managed to ensure benefits and mitigate risks
A three-tiered policy framework
- Three-tiered policy framework proposed
- Tier One: Immutable restrictions, e.g. terrorist content, CSAM, threats of violence or sexual assault
- Tier Two: Optional restrictions and requirements, based on the values and preferences of the actor setting them
- Tier Three: Tailored requirements, user decides personal preferences within boundaries set by higher tiers
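
The way the tiers nest can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `deciding_tier`, the category labels, and the `"block"`/`"allow"` preference values are all hypothetical, chosen only to show that a lower tier can add restrictions but cannot lift one set by a higher tier.

```python
from typing import Mapping, Optional, Set

# Tier One: immutable restrictions. The labels are hypothetical stand-ins
# for the examples listed above (terrorist content, CSAM, threats).
TIER_ONE_BLOCKED = frozenset({"terrorist_content", "csam", "threats_of_violence"})


def deciding_tier(
    category: str,
    tier_two_blocked: Set[str],
    user_prefs: Mapping[str, str],
) -> Optional[int]:
    """Return the tier (1, 2 or 3) that blocks `category`, or None if allowed.

    Tiers are checked from highest to lowest, so a user preference of
    "allow" never overrides a Tier One or Tier Two restriction.
    """
    if category in TIER_ONE_BLOCKED:
        return 1  # immutable: nobody can opt out
    if category in tier_two_blocked:
        return 2  # optional restriction chosen by an actor above the user
    if user_prefs.get(category) == "block":
        return 3  # tailored requirement set by the end-user
    return None  # nothing blocks this category


if __name__ == "__main__":
    prefs = {"csam": "allow", "profanity": "allow", "spoilers": "block"}
    for cat in ("csam", "profanity", "spoilers", "poetry"):
        print(cat, deciding_tier(cat, {"profanity"}, prefs))
```

Checking the tiers in a fixed order is one simple way to encode "within boundaries set by higher tiers"; a real system would also need to handle requirements (things an output must do), not only restrictions.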
Discussion
- Personalisation of LLMs is a likely pathway for their continued expansion
- Taxonomy of benefits and risks from personalised LLMs
- Policy framework to govern benefits and risks
- Assumption that personalised LLMs are technically feasible
- Demand for customised and highly-adapted LLMs
- Technical decision-points and engineering challenges
- Challenges to implementing and enforcing policy framework
- Plans for maturing research and iterating on taxonomy
Technical challenges
- Learning from human feedback requires less data than pre-training
- Exact amount of data needed for personalisation is unclear
- Cold start problem exists when no feedback data points are available
- Data format and quality are important
- Model scale may not contribute significantly to performance
- Alignment tax must be balanced to prevent overfitting
Policy framework enforcement
- Personalised LLMs have wide-reaching benefits and risks
- Framework outlines properties for appropriate governance
- Policy enforcement should minimise both violations and friction
- Must comply with existing regulations
- Difficult to regulate dynamic and distributed systems
- Responsibility is distributed among model providers, application providers and end-users
- Open question of principles to define organisational bounds
Next steps
- Work relies on informed speculation about future development and governance of personalised LLMs
- Affordances, constraints and harms depend on design, use and safeguards
- Future iteration of taxonomy and policy framework will involve interviews with end-users, providers and policymakers
- Aim is to avoid long lags in understanding, documenting and governing harms from personalised LLMs