Informative Job Search

Student Data Science Project

To help navigate the complexities of today’s job search, a group of UBC Master of Data Science (MDS) Computational Linguistics students (Max Ahluwalia, Tushar Choudhary, Daniel Jimenez, Waquas Mohammad) worked on a project to help job seekers find the right role for their career aspirations while also being confident about job stability.

This project was part of their Advanced Corpus Linguistics course. 

The project centers on a comprehensive job search and analytics dashboard that combines data on job postings and layoffs, two widely discussed topics in the industry at present.

To aggregate their raw, unannotated data, the students created Python scripts to scrape LinkedIn (for new job posts) and Global News (for layoff articles). 

The students used Doccano, an open-source data labeling tool, to label the data. For job posts, the labeled features were company name, job title, job level, location, date, compensation, and type (part-time, contract, temporary, or full-time). For layoff articles, the labeled features were job posts, company name, location, industry, number of impacted employees, and date.
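To illustrate how labeled spans like these can become structured fields, here is a minimal Python sketch. It assumes a JSONL export in which each line holds a "text" string and a "label" list of [start, end, name] triples; the sample sentence, the label names, and the helper spans_to_record are hypothetical and are not taken from the students' project.

```python
import json
from collections import defaultdict


def spans_to_record(line: str) -> dict:
    """Turn one exported JSONL line into {label_name: [labeled text, ...]}.

    Assumes the line has a "text" field and a "label" field made of
    [start, end, name] character-offset triples (an assumed export layout).
    """
    doc = json.loads(line)
    record = defaultdict(list)
    for start, end, name in doc.get("label", []):
        record[name].append(doc["text"][start:end])
    return dict(record)


# Hypothetical annotated job post, invented for illustration only.
sample = json.dumps({
    "text": "Acme Corp is hiring a Senior Data Analyst in Vancouver.",
    "label": [
        [0, 9, "company_name"],
        [22, 41, "job_title"],
        [45, 54, "location"],
    ],
})

record = spans_to_record(sample)
print(record)
```

Keeping each label as a list allows for documents where the same feature is annotated more than once, which is common in news articles that mention several companies or locations.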

In addition, the students found publicly available semi-structured data about recent layoffs, which added 2,000 rows to the layoff data they had already gathered from news articles. The students also used TechCrunch for additional layoff information.

The students used advanced data analysis techniques to extract insights from the collected data. One key analysis examined trends in the number of layoffs and job postings over the past few months. By analyzing data from selected timeframes, the students identified patterns and correlations that give users valuable historical context and support predictive modeling.
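As a rough illustration of this kind of trend analysis, the sketch below counts events per month and computes a Pearson correlation between the two monthly series. The article does not describe the students' actual method, so this is only one plausible approach; the dates are invented toy data, and the helpers monthly_counts and pearson are hypothetical names.

```python
from collections import Counter
from math import sqrt


def monthly_counts(dates):
    """Count events per 'YYYY-MM' month from ISO 'YYYY-MM-DD' date strings."""
    return Counter(d[:7] for d in dates)


def pearson(xs, ys):
    """Pearson correlation of two equal-length series.

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented toy data, not results from the project.
layoff_dates = ["2023-01-05", "2023-01-20", "2023-02-11",
                "2023-03-02", "2023-03-15", "2023-03-28"]
posting_dates = ["2023-01-03", "2023-01-09", "2023-01-30",
                 "2023-02-14", "2023-02-21", "2023-03-07"]

layoffs = monthly_counts(layoff_dates)
postings = monthly_counts(posting_dates)

# Align both series on the same sorted list of months.
months = sorted(set(layoffs) | set(postings))
layoff_series = [layoffs[m] for m in months]
posting_series = [postings[m] for m in months]

r = pearson(layoff_series, posting_series)
print(months, layoff_series, posting_series, round(r, 2))
```

With only a few months of data, a correlation like this is descriptive at best; it shows whether the two counts moved together in the selected timeframe, not why.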

These predictive models considered factors such as economic indicators, industry trends, and past job market fluctuations, enabling users to make informed decisions and adapt their strategies accordingly. The students also implemented Elasticsearch in their tool to provide efficient and comprehensive job search, so users could access relevant job postings.
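For readers unfamiliar with Elasticsearch, a job search of this kind is typically expressed as a JSON query body. The sketch below builds one in Python using the standard bool, multi_match, and term query types. The field names (job_title, company_name, location) mirror the labeled features above but are assumptions about the index, the term filter assumes location is mapped as a keyword field, and build_job_query is a hypothetical helper rather than the students' code.

```python
import json


def build_job_query(keywords, location=None, size=10):
    """Build an Elasticsearch search body for job postings.

    Full-text match on assumed title/company fields, with an optional
    exact-match filter on an assumed keyword-mapped location field.
    """
    bool_query = {
        "must": [
            {
                "multi_match": {
                    "query": keywords,
                    "fields": ["job_title", "company_name"],
                }
            }
        ]
    }
    if location:
        # Filters narrow the results without affecting relevance scores.
        bool_query["filter"] = [{"term": {"location": location}}]
    return {"size": size, "query": {"bool": bool_query}}


body = build_job_query("data analyst", location="Vancouver")
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent to an index's search endpoint; the sketch stops short of that because the index name and connection details are not given in the article.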

The biggest challenge the students faced was that news articles were quite unstructured and therefore inconsistent in the information they included. The team sometimes found it difficult to get sufficient data from the annotated news articles.

If the students had more time with the project, they would expand the scope to include layoff data outside of larger cities (i.e., Toronto, Vancouver, Seattle, and New York).

They would also use some of the other features they annotated, such as compensation or job type, since these could provide insights into the value companies place on their employees or, conversely, their reluctance to adequately compensate, value, or recognize employees' contributions. Understanding these dynamics would also help assess corporate culture and the true importance placed on employee welfare and satisfaction.

Ultimately, the students hope their solution gives people the tools and insights they need to navigate job opportunities and stay informed about layoffs in the market.