Quantitative layer

Job-advertising corpus.

A total of 6,206 job sources (public portals, commercial boards and employer ATS feeds) were scraped in fifteen languages. The collection window runs from 1 January 2022 to 31 December 2025.

AI keyword lexicon.

A 220-term multilingual list captured AI references in the title, duties and skills fields of each posting; fastText similarity handled synonyms.
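A minimal sketch of how lexicon tagging of this kind might look. The four-term lexicon, the field names (`title`, `duties`, `skills`) and the helper names are assumptions made for illustration; the actual 220-term list and the similarity step for synonyms are not reproduced here.

```python
from __future__ import annotations

import re

# Illustrative mini-lexicon (assumption): the real list has 220 terms.
AI_LEXICON = [
    "machine learning",
    "deep learning",
    "apprentissage automatique",
    "künstliche intelligenz",
]

# One case-insensitive pattern; the lookarounds stop a term from
# matching inside a longer word.
_PATTERN = re.compile(
    r"(?<!\w)(" + "|".join(re.escape(t) for t in AI_LEXICON) + r")(?!\w)",
    re.IGNORECASE,
)

FIELDS = ("title", "duties", "skills")  # hypothetical field names


def ai_terms(posting: dict) -> set[str]:
    """Return the lexicon terms found in any of the tagged fields."""
    found = set()
    for field in FIELDS:
        text = posting.get(field) or ""
        found.update(m.group(1).lower() for m in _PATTERN.finditer(text))
    return found


def has_ai_reference(posting: dict) -> bool:
    """True if at least one lexicon term appears in the posting."""
    return bool(ai_terms(posting))
```

Exact matching like this only covers terms that are literally in the list; synonyms and paraphrases would still need a similarity step on top.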

Vacancy universe.

After de-duplication and language harmonisation, 31.2 million unique European postings remained. Of these, 7.47 million (about 24%) contained at least one AI reference and formed the core dataset.
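The de-duplication rule itself is not described here, so the following is only a sketch under an assumed rule: two postings count as duplicates when their normalised title, employer and location coincide. The field names are hypothetical. The last lines reproduce the AI share from the figures reported above.

```python
import hashlib
import re


def posting_key(title, employer, location):
    """Normalise the identifying fields and hash them into a stable key."""
    parts = [
        re.sub(r"\s+", " ", p.strip().lower())
        for p in (title, employer, location)
    ]
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


def deduplicate(postings):
    """Keep the first posting seen for each key (assumed duplicate rule)."""
    seen = set()
    unique = []
    for p in postings:
        key = posting_key(p["title"], p["employer"], p["location"])
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique


# Share of AI-referencing postings, from the reported totals.
ai_share = 7.47e6 / 31.2e6
print(f"{ai_share:.1%}")  # 23.9%
```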

Wage data.

Advertised salaries were matched to Eurostat’s Structure of Earnings Survey (wage-index adjusted) and approximately 195,000 Glassdoor/Indeed self-reports.
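The exact adjustment is not specified, but a wage-index adjustment is commonly a simple rescaling of a reference-year figure by the ratio of two index values. The sketch below assumes that form; the function name and all numbers are hypothetical.

```python
def index_adjust(wage_ref: float, index_ref: float, index_target: float) -> float:
    """Rescale a reference-year wage to a target year using a wage index."""
    if index_ref <= 0:
        raise ValueError("index_ref must be positive")
    return wage_ref * index_target / index_ref


# Hypothetical values for illustration only: a reference-year median of
# 40,000 and an index that rose from 100 to 108.
adjusted = index_adjust(wage_ref=40_000.0, index_ref=100.0, index_target=108.0)
print(adjusted)  # 43200.0
```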

Task & automation data.

The OECD Skills-for-Jobs database (v 2024-2) provides task-exposure scores that inform role deep-dives.

[Figure: scraping, cleaning, AI-tagging and analytics pipelines]

01 Methodology

Qualitative layer

Limitations & mitigations

Language nuance. Multi-word competence phrases, e.g. « analyse des données » (data analysis), risk being under-counted. fastText similarity and manual checks correct the largest misses.

Unadvertised hiring. Internal promotions and referral pipelines are invisible to scraping. Interview probes capture qualitative offsets.

Self-reported wages. Glassdoor data skew toward English-speaking countries; Eurostat medians anchor national differentials.

Rapid tool churn. New AI product names appear faster than any static lexicon can track; a monthly refresh loop keeps keyword capture current.
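One way a refresh step could surface candidate terms is sketched below. This is an assumption for illustration, not the procedure used for this study: it counts skill strings from already-tagged postings that the lexicon does not yet cover. The function name, the `min_count` threshold and the input shape are hypothetical.

```python
from collections import Counter


def candidate_terms(tagged_skill_lists, lexicon, min_count=2):
    """Return uncovered skill strings seen at least `min_count` times."""
    known = {t.lower() for t in lexicon}
    counts = Counter(
        skill.strip().lower()
        for skills in tagged_skill_lists
        for skill in skills
        if skill.strip().lower() not in known
    )
    return [term for term, n in counts.most_common() if n >= min_count]
```

Candidates produced this way would include generic skills as well as new AI product names, so a manual review before adding anything to the lexicon would still be needed.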
