Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • GPT models and related technologies could have implications for the US labor market.
  • A new rubric was used to assess occupations based on their correspondence with GPT capabilities.
  • 80% of the US workforce could have at least 10% of their work tasks affected by GPTs.
  • 19% of workers may see at least 50% of their tasks impacted.
  • Impact is not limited to industries with higher recent productivity growth.
  • GPTs exhibit characteristics of general-purpose technologies.

Paper Content

Introduction

  • Recent progress in generative AI and large language models (LLMs)
  • LLMs can process and produce various forms of sequential data
  • A new rubric proposed to measure the overall exposure of tasks to GPTs
  • 19% of jobs have at least 50% of their tasks exposed to GPTs
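  • Considering existing language and code capabilities alone, without additional software or modalities, only 3% of U.S. workers have over half of their tasks exposed to LLMs
  • Accounting for complementary software and other generative models, up to 49% of workers could have half or more of their tasks exposed to LLMs
  • Occupations with higher wages generally show higher exposure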
  • Science and critical thinking skills negatively correlated with exposure
  • Programming and writing skills positively associated with LLM exposure
  • Jobs with higher barriers to entry tend to experience more exposure to LLMs
  • Information processing industries exhibit high exposure
  • Weak connection between productivity growth and overall GPT exposure
  • GPTs are general-purpose technologies
  • Contributions: measurements of LLM impact potential, and a demonstration that LLMs can be used to develop such measurements efficiently and at scale

Literature review

The advancement of large language models

  • LLMs have risen to prominence in AI research
  • LLMs can tackle complex language-based tasks
  • Factors such as increased model parameter count, greater training data volume, and enhanced training configurations have fueled progress
  • LLMs excel in diverse applications like translation, classification, creative writing, and code generation
  • Methods like fine-tuning and reinforcement learning from human feedback (RLHF) have improved the steerability, reliability, and utility of LLMs
  • LLMs have potential to program and control other digital tools
  • LLMs are becoming increasingly integrated into specialized applications
  • LLMs may be unreliable for various tasks due to issues such as factual inaccuracies, inherent biases, privacy concerns, and disinformation risks
  • LLMs can become valuable assets in machine learning model development
  • LLMs can contribute to economic decision-making at the task level
  • LLM performance may continue to improve, but the models bring a variety of serious risks

The economic impacts of automation technologies

  • Research has shown that technological progress raises the demand for skilled workers over unskilled workers
  • Studies have explored the effects of technological change and automation on workers within a task-based framework
  • Workers involved in routine and repetitive tasks are at a higher risk of technology-driven displacement
  • New technology increases the need for a wider array of labor-intensive tasks
  • Automation technologies have resulted in wage inequality in the US
  • Various approaches have been used to estimate the overlap between AI capabilities and the tasks and activities workers undertake in different occupations
  • Realization of general purpose technologies’ full potential requires extensive co-invention
  • Many studies of machine learning technologies focus on systems-level adoption
  • Task-level information is used to assess whether LLMs fulfill GPT criteria
  • Findings are aggregated to occupations and industries to capture the overall potential exposure in the contemporary U.S. labor market

Methods and data collection

Data on activities and tasks performed by occupation in the US

  • Used O*NET 27.2 database
  • Contains information on 1,016 occupations
  • Detailed Work Activities (DWAs) and tasks
  • Sample of tasks and DWAs in Table 1
  • 19,265 tasks with task description and occupation
  • Most tasks associated with one or more DWAs
  • 2,087 DWAs with most connected to one or more tasks

Data on wages, employment, and demographics

  • Obtained employment and wage data from 2020 and 2021 Occupational Employment series
  • Dataset includes occupational titles, number of workers, and employment projections for 2031
  • Typical education and on-the-job training required for entry in an occupation
  • Used the BLS-recommended crosswalk to O*NET to link the O*NET task and DWA dataset with the BLS data
  • BLS Labor Force Demographics derived from Current Population Survey

Exposure

  • Human ratings and GPT-4 ratings were collected using an exposure rubric
  • Rubric defines exposure as a measure of whether access to a GPT or GPT-powered system would reduce the time required for a human to perform a specific DWA or complete a task by at least 50 percent
  • Three primary measures of exposure were constructed from the task labels: α, β, and ζ
  • α is the share of tasks labeled E1 (directly exposed), a lower bound on the proportion of exposed tasks within an occupation
  • β is the sum of E1 and 0.5*E2; the 0.5 weight accounts for the additional investment needed to deploy complementary tools
  • ζ is the sum of E1 and E2, an upper bound on exposure
  • β is used for the remainder of the analysis, which treats directly exposed tasks as twice as exposed as tasks requiring complementary innovation
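
The roll-up from task labels to occupation-level measures can be sketched as follows. This is a minimal illustration, assuming each task carries exactly one rubric label ("E0" for no exposure, "E1" for direct exposure, "E2" for exposure through complementary software). The function name `exposure_measures`, the dictionary keys, and the sample labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def exposure_measures(task_labels):
    """Return occupation-level exposure shares from per-task rubric labels.

    task_labels: list of strings, each "E0", "E1", or "E2"
    (a hypothetical encoding chosen for this sketch).
    """
    if not task_labels:
        raise ValueError("an occupation needs at least one labeled task")
    counts = Counter(task_labels)
    n = len(task_labels)
    e1 = counts["E1"] / n  # share of tasks directly exposed
    e2 = counts["E2"] / n  # share exposed only with complementary software
    return {
        "lower_bound": e1,               # E1 only
        "half_weighted": e1 + 0.5 * e2,  # E1 + 0.5 * E2
        "upper_bound": e1 + e2,          # E1 + E2
    }


# Hypothetical occupation with 10 tasks: 2 direct, 4 via tooling, 4 unexposed.
labels = ["E1"] * 2 + ["E2"] * 4 + ["E0"] * 4
measures = exposure_measures(labels)
```

With these made-up labels the three measures come out near 0.2, 0.4, and 0.6, which shows how the half weight places the middle measure between the two bounds.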

Limitations of our methodology

  • Validity of task-based framework
  • Relative vs. absolute measures
  • Lack of expertise and task interpretation
  • Forward-looking and subject to change
  • Sources of disagreement between humans and GPT-4
  • Human raters and GPT-4 ratings show high degree of agreement
  • Judgments may be biased because the human annotators lack occupational diversity
  • High-quality labels would require input from workers engaged in the occupations being rated
  • GPT-4 sensitive to wording, order, composition, detail, and definitions
  • Iterating on prompt can enhance agreement between model and rubric
  • GPT-4 capable of applying intricate taxonomies

Results

  • General-purpose technologies are rare and have pervasive, long-term impacts.
  • GPTs can influence labor, productivity, and capital input.
  • GPTs have the potential to affect a diverse range of occupations and wage structures.

Summary statistics

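  • Average occupation-level values suggest about 15% of tasks within an occupation are directly exposed to GPTs
  • The average rises to over 30% when tasks exposed through complementary software are counted at half weight, and to over 50% when they are counted in full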
  • 80% of workers belong to an occupation with at least one task exposed to GPTs
  • 19% of workers are in an occupation where over half the tasks are labeled as exposed
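
Worker-level statistics of this kind weight each occupation by its employment. A minimal sketch, assuming each occupation is given as a pair of employment count and exposure score in [0, 1]; the function name `share_of_workers_at_or_above`, the inclusive threshold, and the sample numbers are hypothetical.

```python
def share_of_workers_at_or_above(occupations, threshold):
    """Fraction of all workers whose occupation's exposure is >= threshold.

    occupations: list of (employment, exposure) pairs.
    """
    total = sum(emp for emp, _ in occupations)
    if total <= 0:
        raise ValueError("total employment must be positive")
    exposed = sum(emp for emp, exp in occupations if exp >= threshold)
    return exposed / total


# Hypothetical data: (number of workers, occupation-level exposure score).
occupations = [(500, 0.05), (300, 0.25), (150, 0.55), (50, 0.80)]
share_10 = share_of_workers_at_or_above(occupations, 0.10)
share_50 = share_of_workers_at_or_above(occupations, 0.50)
```

In this toy dataset half of the workers are in occupations with at least 10% exposure and one fifth are in occupations with at least 50% exposure; the real figures depend on the actual employment and exposure data.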

Wages and employment

  • Exposure intensity across the economy is displayed in Figure 3
  • Exposure is measured in terms of total workers and total occupations
  • Worker concentration in occupations is not highly correlated with occupational exposure to GPTs or GPT-powered software
  • Human and GPT-4 annotations exhibit qualitative similarities and tend to correlate
  • Higher wages are associated with increased exposure to GPTs
  • Potential exposure to GPTs has little correlation with current employment levels

Skill importance

  • Skill importance and exposure measures are related
  • Science and critical thinking skills are negatively associated with exposure
  • Programming and writing skills are positively associated with exposure

Barriers to entry

  • Job Zone is a proxy for barriers to entry
  • Median income increases with Job Zone
  • Exposure to GPTs increases from Job Zone 1 to 4, then decreases or remains similar at Job Zone 5
  • Higher wage occupations tend to be more exposed to GPTs
  • Workers holding Bachelor’s, Master’s, and professional degrees are more exposed to GPTs than those without formal educational credentials
  • The least exposed jobs require the longest on-the-job training and may offer a lower payoff in income
  • Jobs with no on-the-job training required or only internship/residency required yield higher income but are more exposed to GPTs
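
The comparison across Job Zones amounts to grouping occupations by zone and averaging their exposure. The sketch below assumes an unweighted mean over occupations and uses made-up rows; `mean_exposure_by_group` is a hypothetical helper, and the paper may weight occupations differently.

```python
from collections import defaultdict


def mean_exposure_by_group(rows):
    """Average exposure per group, for rows of (group, exposure) pairs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, exposure in rows:
        sums[group] += exposure
        counts[group] += 1
    # Sort keys so the result lists groups in ascending order.
    return {group: sums[group] / counts[group] for group in sorted(sums)}


# Hypothetical occupations: (Job Zone, exposure score).
rows = [
    (1, 0.0), (1, 0.1),
    (2, 0.2),
    (3, 0.3), (3, 0.5),
    (4, 0.5), (4, 0.6),
    (5, 0.5),
]
by_zone = mean_exposure_by_group(rows)
```

The sample values were chosen so that the mean rises from Zone 1 to Zone 4 and dips slightly at Zone 5, mirroring the pattern described above; they are not the paper's numbers.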

Validation of measures

Comparison to earlier efforts

  • Aim to build on previous studies of occupational exposure to AI and automation
  • Previous studies used a variety of methods
  • Mapping text descriptions of tasks to descriptions of technological advances in patents
  • Linking capabilities of AI systems to occupational abilities
  • Mapping results of AI task benchmark evaluations to worker tasks
  • Expert labeling of automation potential for certain occupations
  • Developing a rubric for evaluating the “suitability for machine learning”
  • Summary statistics on many of these prior efforts
  • Methodology builds upon the SML approach
  • Results of OLS regressions of new LLM exposure measurements on occupation-level exposure measures
  • Four separate output variables representing new scores
  • Generally positive and statistically significant correlations between LLM exposure measures and previous measurements
  • Encouragingly, SML exposure scores show significant and positive associations
  • Webb software and AI patent-based measures, SML, and normalized routine cognitive and manual scores all exhibit positive associations
  • Low correlations with Felten et al. and Frey and Osborne
  • 28-40% of the variance in the new exposure measures is not explained by the earlier measurements
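
The link between a regression's fit and its "unexplained variance" can be illustrated with a one-regressor ordinary least squares fit, where the unexplained share is 1 minus R². This is a simplified sketch: the regressions summarized above use several prior measures at once, and the function name `simple_ols` and the sample scores are assumptions made for illustration.

```python
def simple_ols(xs, ys):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept, r2)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    ss_tot = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("inputs must not be constant")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    # Residual sum of squares around the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r2 = 1 - ss_res / ss_tot
    return slope, intercept, r2


# Hypothetical occupation-level scores: a prior measure and a new measure.
prior_scores = [0.1, 0.2, 0.3, 0.4, 0.5]
new_scores = [0.15, 0.2, 0.4, 0.35, 0.6]
slope, intercept, r2 = simple_ols(prior_scores, new_scores)
unexplained = 1 - r2  # share of variance the prior measure does not account for
```

A positive slope with a nonzero unexplained share is the pattern described above: the new measure moves with earlier ones but also carries information they do not capture.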

Discussion

GPTs as a general-purpose technology

  • GPTs could be classified as a general-purpose technology if they meet three criteria
  • GPTs are improving in capabilities over time
  • GPTs can have pervasive impacts across the economy
  • Complementary innovations enabled by GPTs can have widespread application to economic activity
  • Adoption and use of LLMs is becoming increasingly widespread
  • Adoption of LLMs will vary across different economic sectors due to various factors

Implications for US public policy

  • Automation technologies, including LLMs, have been linked to economic disparity and labor disruption.
  • Results from the US suggest the need for policy preparedness for the potential economic disruption posed by LLMs.
  • Prior work has suggested policy directions related to education, worker training, and safety net programs.

Limitations and future work

  • Study has limitations that need further investigation
  • Focus on US restricts generalizability
  • Need to extend scope and share methods
  • Need to explore GPT adoption patterns and actual capabilities/limitations of state-of-the-art models
  • Need to consider vision capabilities in ratings

Conclusion

  • Generative Pre-trained Transformers (GPTs) generate profound transformations
  • 19% of jobs have at least 50% of their tasks exposed to GPTs
  • GPTs can have pervasive impacts across a wide swath of occupations in the US
  • GPTs can augment or displace human labor
  • GPTs can impact job quality, inequality, and skill development
  • New rubric for understanding LLM capabilities and their potential effects on jobs
  • Direct exposure (E1) - Writing and transforming text and code according to complex instructions
  • Exposure by LLM-powered applications (E2) - Summarizing documents longer than 2000 words and answering questions about those documents
  • Exposure given image capabilities (E3) - Reading text from PDFs, scanning images, or creating or editing digital images according to instructions
  • No exposure (E0) - Tasks requiring a high degree of human interaction, precise measurements, reviewing visuals in detail, use of a hand or walking, making decisions that might impact human livelihood, existing technology not powered by an LLM
  • Basic skills - Developed capacities that facilitate learning or the more rapid acquisition of knowledge
  • Content - Background structures needed to work with and acquire more specific skills in a variety of different domains
  • Process - Procedures that contribute to the more rapid acquisition of knowledge and skill across a variety of domains
  • Programming - Writing computer programs for various purposes
  • Impact potential is present across nearly all industries, with wide heterogeneity
  • Productivity growth since 2012 and exposure to LLM technologies appear unrelated
  • Occupations with the highest exposure according to each measurement
  • Regression of occupation-level, human-annotated exposure to GPTs on skill importance
  • Mean exposure to GPTs by job zone