Words matter: AI can predict salaries based on the text of online job openings

The job landscape in the United States is changing dramatically: The COVID-19 pandemic has redefined essential work and sent workers out of the office. New technologies are changing the nature of many professions. Globalization continues to push jobs to new locations. And concerns about climate change are adding jobs in the alternative energy sector and cutting them from the fossil fuel industry.

Amid this turmoil in the workplace, workers, employers and policymakers alike could benefit from understanding which job characteristics lead to higher wages and mobility, says Sarah Bana, a postdoctoral researcher at Stanford’s Digital Economy Lab, part of the Stanford Institute for Human-Centered Artificial Intelligence. And, she notes, there is now a large dataset that can help provide that insight: the text of millions of online job openings.

“Online data offers us a great opportunity to measure what matters,” she says.

Indeed, using artificial intelligence (AI) and machine learning, Bana recently showed that the words used in a dataset of more than a million online job openings explain 87% of the variation in salaries across much of the labor market. It is the first work to use such a large dataset of postings and to look at the relationship between postings and salaries.

Bana also experimented with injecting new text — adding a skill certificate, for example — into relevant job postings to see how those words changed salary forecasts.

“It turns out that we can use job posting text to evaluate the salary-relevant characteristics of jobs in near real time,” says Bana. “This information could make applying for jobs more transparent and improve our approach to staff education and training.”

An AI dataset of 1 million vacancies

To analyze how online job posting text relates to salaries, Bana obtained more than one million pre-pandemic job listings from Greenwich.HR, which collects millions of job listings from online job boards.

She then used BERT, one of the most advanced natural language processing (NLP) models available, to train an NLP model using the text of over 800,000 job openings and the associated salary data. When she tested the model with the remaining 200,000 job openings, it accurately predicted the corresponding salaries 87% of the time. In comparison, using only the job titles and geographic locations of the job postings yielded accurate predictions only 69% of the time.
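The hold-out evaluation described above can be sketched in a few lines. This is a minimal illustration, not Bana's code: it assumes the 87% figure refers to the share of salary variation explained (R²), as the earlier paragraph puts it, and the helper names `split_train_test` and `r_squared` are hypothetical.

```python
import random

def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and hold out a fraction for testing (assumed 80/20 split)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

def r_squared(actual, predicted):
    """Share of the variance in `actual` that `predicted` accounts for."""
    mean = sum(actual) / len(actual)
    ss_total = sum((y - mean) ** 2 for y in actual)
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total

# Usage: train any model on `train`, predict salaries for `test`,
# then compare predictions with the held-out salaries via r_squared.
train, test = split_train_test(range(10))
```

A score of 1.0 means the predictions match every held-out salary, while 0.0 means they do no better than always guessing the average salary.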

In follow-up work, Bana will try to characterize the contribution of different words to the salary forecast. “Ideally, we will color words within postings from red to green, pairing the darker red words with a lower salary and the darker green words with a higher salary,” she says.

The Value of Upskilling: A Text Injection Experiment

To determine which skills are important for salary forecasting, Bana used a text-injection approach: to certain relevant job openings, she added short sentences indicating that the job requires a particular career certification, such as those listed in Indeed.com’s 10 in-demand career certifications (and how to get them). Obtaining these certifications can be costly, with prices ranging from about $225 to about $2,000. But until now, there has been no way to determine whether the investment pays off in salary.
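The text-injection idea can be sketched as follows. This is an illustrative sketch only: `toy_predict` is a hypothetical stand-in for a trained salary model, and its dollar figures are invented for the example rather than taken from Bana's results.

```python
from typing import Callable

def injection_effect(posting: str, injected: str,
                     predict: Callable[[str], float]) -> float:
    """Change in predicted salary after appending `injected` to a posting."""
    baseline = predict(posting)
    modified = predict(posting + " " + injected)
    return modified - baseline

def toy_predict(text: str) -> float:
    """Hypothetical model: a flat base salary plus a bonus for one keyword."""
    salary = 50_000.0  # invented base salary
    if "certification" in text.lower():
        salary += 5_000.0  # invented premium
    return salary

# Compare the forecast for a posting with and without the added sentence.
effect = injection_effect(
    "Business analyst role in Chicago.",
    "Requires an agile analysis certification.",
    toy_predict,
)
```

With a real model in place of `toy_predict`, the same before-and-after comparison could be repeated for each certification of interest.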

Bana’s experiment found that some certifications (such as the IIBA Agile Analysis certification) deliver significant salary gains quickly, while others (such as the Cisco Certified Internetwork Expert) do so more slowly. That is valuable information for workers who want to know how investments in skills training will affect their salaries and prospects, Bana says.

Employees are not the only ones who benefit from this information, Bana notes. Employers can use these results to better invest in human capital, she says. For example, if machine learning models show a gradual shift from some tasks to others, employers would be warned in advance and could retrain certain employees.

And policymakers considering which vocational training programs to promote would equally benefit from understanding which skills are increasing or decreasing in economic value.

To that end, Bana and her colleagues are currently working on a companion paper identifying which jobs disappear from job openings over time and which new jobs appear.

In the future, Bana hopes that textual analysis of job openings can lead to a web-based application where employees or companies can explore the value added by further training or by moving to a new geographic location.

“Right now there isn’t much clarity about a path to higher earnings,” Bana says. “Tools like these can help job seekers improve their job prospects, employers develop their workforce, and policymakers respond to immediate changes in the economy.”

Katharine Miller is a contributing writer for the Stanford Institute for Human-Centered AI.

This story originally appeared on hai.stanford.edu. Copyright 2022
