
HR Data Science: Salary Prediction at Trivago and StepStone

Dear guest!

On Tuesday evening, 29 August 2023, I was at the Düsseldorf Data Science Meetup at Trivago, the online hotel search company, in their extraordinary headquarters in the Medienhafen of Düsseldorf, my city of birth.

I left my office in Solingen early that afternoon because I was meeting my business friend Dominik Rühl before the event, and walked in the sun from the Düsseldorfer Landtag (state parliament) on the Rhine to our meeting point at the UCI cinema near Trivago.

Stefan Klemens with Dominik Rühl at the meetup. Photo: Stefan Klemens

It was good to see Dominik after about five years, and we had a fruitful exchange on our common topics of artificial intelligence (AI), recruitment, skills, and digital assessment, as well as some private matters. He is now working as an HR & Recruiting Manager at Advance Business Partner GmbH, based nearby in the city of Neuss on the other side of the Rhine. The consulting company focuses on mobility services in areas such as recruitment, innovation, and transformation management.

Although the summer weather in Germany has been pretty unstable this year, we enjoyed sitting outside with our drinks at the unique brewery bar Eigelstein.

Find out more about the Düsseldorf Data Science Meetup Group with its interests in Data Science, Machine Learning and Python/R, on this website.

Arriving at Trivago

The Trivago building as seen from the north-east of the Medienhafen Düsseldorf. Photo: Stefan Klemens

At 6 pm it was time to walk to the nearby Trivago building, finished in 2018. The distinctive, modern entrance area and the café behind it offer a glimpse of how the interior of the building is decorated (see this article and this article about the New Work culture at Trivago and the architecture of the headquarters' spaces).

Surprisingly, we and one other guest were the first participants to arrive (ok, it was half an hour before the official start, and the talks started even later), but we were soon picked up by Gina from Trivago. Together we (and a cart full of pizza in yellow boxes for the data people) took one of the elevators up to the event location on the top floor.

A stunning view of the south-west skyline from the roof terrace greeted us, and Dominik, the arriving participants, and I enjoyed drinks and pizza before the event started at 7 pm.

Our co-host Aida Orujova gave us a very warm welcome, introduced the speakers, and broke the ice by asking who was from data science, who was from engineering, and who was just there to learn more about salaries.

Co-host and moderator of the evening Aida Orujova welcoming the Data Science crowd. Photo: Stefan Klemens (with her approval)

First talk: Alexander Fischer, Trivago

Alexander Fischer from Trivago started with a talk about his passion for the programming and statistics software R and about his (and the economists') “Swiss knife” approach to predicting outcome variables: linear regression. He showed how he and his team used this classical algorithm with the R package fixest and the Python package PyFixest to predict wages from the variables education and ability (e.g. intelligence).

When presenting the problem with this approach (“The error term is correlated with the dependent variable”), he referred to a recent study using data from 59,000 Swedish men, published in 2023 by Marc Keuschnigg, Arnout van de Rijt, and Thijs Bol in the European Sociological Review (number 20, pages 1-14) and titled “The plateauing of cognitive ability among top earners” (online article published here on January 28, 2023).
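To make the problem a bit more tangible, here is a minimal sketch in plain Python with simulated data. Everything in it is an assumption for illustration: the numbers (a true return of 0.08 per year of education, an ability effect of 0.20), the data-generating process, and the helper names do not come from the talk or the study. In the usual textbook formulation the trouble starts when the error term is correlated with a regressor: if ability is left out of the model, it ends up in the error term, and because ability also drives education, the naive slope overstates the return to education.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit of y on x (with intercept)."""
    return cov(x, y) / cov(x, x)


def residualize(y, x):
    """Residuals of y after a one-regressor OLS fit on x."""
    b = ols_slope(x, y)
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]


def simulate(n=20000, seed=42):
    """Hypothetical data: ability raises both education and log wage."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    education = [12 + 2 * a + rng.gauss(0, 1) for a in ability]
    log_wage = [
        1.0 + 0.08 * e + 0.20 * a + rng.gauss(0, 0.3)
        for e, a in zip(education, ability)
    ]
    return ability, education, log_wage


ability, education, log_wage = simulate()

# Ability omitted: in expectation the slope is 0.08 + 0.20 * 2 / 5 = 0.16.
naive = ols_slope(education, log_wage)

# Ability controlled for (Frisch-Waugh-Lovell: partial it out of both sides).
controlled = ols_slope(
    residualize(education, ability), residualize(log_wage, ability)
)

print(f"naive: {naive:.3f}  controlled: {controlled:.3f}  true: 0.080")
```

In real data ability is rarely measured this cleanly, which is why simply adding it as a control is often not an option.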

Since A/B testing (that is, a randomized design with experimental and control groups) is not feasible here (it would mean randomly sending one group of individuals to college for, say, one more year), the classical solution in the social sciences and psychology is the quasi-experiment, as described in the standard book “Quasi-Experimentation: Design and Analysis Issues for Field Settings” by Cook and Campbell (1979).

Since years of education, the predictor of the wage, cannot be manipulated experimentally, Alexander instead used a variable called “distance to college” as a naturally occurring source of differences in people's years of education.
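A setup like this is commonly called an instrumental variable approach. The following is only a minimal sketch in plain Python on simulated data, not the model from the talk: the data-generating process, all coefficients, and the variable names are assumptions. It relies on two conditions that have to be argued for in any real application: distance shifts education, and distance is unrelated to the unobserved ability. With a single instrument, the estimate is the ratio of two covariances.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate(n=20000, seed=7):
    """Hypothetical data with an unobserved ability confounder."""
    rng = random.Random(seed)
    distance, education, log_wage = [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)      # never seen by the analyst
        dist = rng.uniform(0, 50)      # km to college, independent of ability
        edu = 16 - 0.1 * dist + 2 * ability + rng.gauss(0, 1)
        wage = 1.0 + 0.08 * edu + 0.20 * ability + rng.gauss(0, 0.3)
        distance.append(dist)
        education.append(edu)
        log_wage.append(wage)
    return distance, education, log_wage


distance, education, log_wage = simulate()

# Naive OLS of log wage on education: biased upward by the omitted ability.
ols_estimate = cov(education, log_wage) / cov(education, education)

# Single-instrument IV estimator: cov(z, y) / cov(z, x).
iv_estimate = cov(distance, log_wage) / cov(distance, education)

print(f"OLS: {ols_estimate:.3f}  IV: {iv_estimate:.3f}  true: 0.080")
```

In this simulation the IV estimate should land close to the assumed true value of 0.08, while the naive OLS slope stays clearly above it. With real data the instrument is usually much weaker, and the estimate correspondingly noisier.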

The data scientist from Trivago further pointed out in his “The Secret Sauce” slide that once the role of companies is taken into account in the corresponding regression model, the computation becomes quite demanding (millions of employees, thousands of companies, 20 years of data). But of course he presented a solution for it (and it was not Spark!).

At the end, with the help of the programming language Python and the package PyFixest, Alexander showed that salary prediction can be done, and he answered questions from the audience.
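One common way to take companies into account without estimating thousands of company dummies is the within transformation: subtract each company's mean from every variable and run the regression on what is left. The sketch below illustrates this for a single set of company effects on simulated data. It is an illustrative assumption, not necessarily the solution shown on the slide, and the coefficients, sizes, and helper names are made up.

```python
import random
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract the group mean from every value (within transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit of y on x (with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n_firms=200, workers_per_firm=50, seed=1):
    """Hypothetical data: better-paying firms also hire more educated staff."""
    rng = random.Random(seed)
    firm, education, log_wage = [], [], []
    for f in range(n_firms):
        premium = rng.gauss(0, 1)  # firm-specific pay level
        for _ in range(workers_per_firm):
            edu = 12 + 1.5 * premium + rng.gauss(0, 2)
            wage = 1.0 + 0.08 * edu + 0.5 * premium + rng.gauss(0, 0.3)
            firm.append(f)
            education.append(edu)
            log_wage.append(wage)
    return firm, education, log_wage


firm, education, log_wage = simulate()

# Pooled regression ignores firms and absorbs the firm premium into the slope.
pooled = ols_slope(education, log_wage)

# Within regression: one pass over the data, no dummy matrix required.
within = ols_slope(
    demean_by_group(education, firm), demean_by_group(log_wage, firm)
)

print(f"pooled: {pooled:.3f}  within: {within:.3f}  true: 0.080")
```

The demeaning step needs only one pass over the data and one running sum per company, which is why it scales to large panels. With several sets of fixed effects at once (for example employees and companies), a single demeaning step is no longer enough, and the sketch does not cover that case.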

Second talk: Michael Matuschek & Tim Elfrink, StepStone

In the second talk this evening we learned from Michael Matuschek and Tim Elfrink how StepStone is predicting the salaries of all kinds of jobs for their salary products.

Michael began the session and gave an overview of StepStone's salary products, which include the Salary Planner, Salary on Listings, and auto-generated Salary SEO pages.

A 2020 study and earlier research showed that for 96 % of respondents salary is the most important criterion when choosing a job (flexible working hours, career and training opportunities, and corporate culture reach only 90 % or 91 %).

Michael Matuschek with Tim Elfrink from StepStone answering a question from the audience. Photo: Stefan Klemens. Thanks to both for approving the picture!

Michael also told us about the challenges of predicting salaries at StepStone: data distribution and features (for example, more white-collar jobs and little part-time data), the gender pay gap, quality assurance, feature engineering, the underlying model and the algorithm used, as well as the metrics (main business KPIs) accuracy and generalisation.

After him, Tim Elfrink took the mic and explained the broader infrastructure of the prediction system on AWS and the automated deployment of the model. Other headings in his presentation were, for example: Creating scalable infrastructure and development environment.

A number of questions (and some hints for improving their model) came from the participants, and Michael and Tim happily answered them.

Closing, socks, and outlook

At 8.30 pm presenter Aida Orujova returned to the stage and thanked all guests and speakers for being there. Like several others, I took the chance to talk with some participants (see header picture) before I had to catch my tram home.

Trivago logo in front of the building after sunset. Photo: Stefan Klemens

My second Düsseldorf Data Science Meetup was another wonderful experience (read about my first here), and the scheduled next event in October 2023 is of course on my list.

Oh, one last thing (we learned this from the Apple guy, right?) that I have not mentioned yet. Before the start, participants could grab one, two, or three promotional gifts from Trivago, as shown in the picture: one for writing by hand (still common among a few people, I was told), one for storing big data in a small piece of metal, and one to keep your feet between 28 °C and 33 °C (the surface temperature of the extremities, as I learned while writing this sentence) when outside temperatures fall in late autumn.

Promotional gifts for the participants of the Düsseldorf Data Science Meetup from Trivago. Photo: Stefan Klemens

As I like to test digital and analogue things (I score high on openness to experience [see the Big Five Personality Traits] and on curiosity, which is one of my signature strengths according to the VIA model), the usefulness of the Trivagonian socks in preventing cold toes needed to be put to the test as well.

Note: If you would like to know more about psychological traits and their psychometric assessment for HR recruiting, selection, and development, have a look at my work as a Work Psychologist as presented here: https://www.digitalassessment.de/

I can say that my feet got warmer, but the real test will of course be conducted in the colder times that are coming soon to Germany, perhaps as a case study (N = 1) with more treatments such as Stepstonian, Sipgatian, and Quantopian fabric as well as a control (no treatment, that is, walking without socks; I am preparing for that right now!). I will report on it! 😉 And perhaps you want to join the experiment to raise the “N”, so that the results become more valid?

Me testing the Trivago socks, and smiling as I realize the double meaning of the words and symbols matching the two main areas of the company. Photo: Stefan Klemens

With this admittedly rather light-hearted ending, I would like to thank the organizers and speakers for this evening, and Trivago for hosting the meetup! Will we see each other next time at a Düsseldorf Data Science Meetup (or at another place if you like)?

Many greetings and all the best to you!

Stefan Klemens

PS: Want to exchange ideas on people analytics, digital assessment or artificial intelligence in HRM? Then network, write a message and/or make an appointment for an online meeting. Or the classic way: phone call.

And: You like my work and the content I regularly share? Then I’m happy about a Like or comment on LinkedIn. Thank you! 🙂 🙋‍♂️🌳

By Stefan Klemens

Stefan Klemens works as a People & Digital HR Analyst and founded Schorberg Analytics in 2022. A graduate psychologist (Diplom-Psychologe) and trained bank clerk, he has worked in human resource management since 2006, focusing on online assessment, online surveys, and work, health, and personality. Before that, he was a staff member in the Department of Work and Organizational Psychology at Bergische Universität Wuppertal and an employee at Stadtsparkasse Düsseldorf. Since 2020 he has focused on people analytics, data science, and artificial intelligence. He is also the founder and administrator of the LinkedIn group "Wirtschaftspsychologie Region Düsseldorf" (on XING until 2022). One of his main concerns is combining numbers and statistics with intuition and heuristics to reach the best possible decisions in human resource management.