…but nobody is thinking about it.
That was the sub-title to a presentation I gave recently at the Careers Show in London on 4th March, 2015. The headline title was “Using Real LMI to Inform Career Decisions.” LMI of course being Labour Market Information. I had such a positive response immediately after and subsequently that I decided to blog the presentation for a wider audience than the 150 – 200 people who attended.
The content is very much written from the perspective of a business using technology to support career decisions. Our tools are used to help make what for many are life defining or life changing decisions and as such we have a responsibility to ensure that they are the best they can be. We therefore assess new technologies, and the sources of data and information used in our tools, very carefully before deciding to use them, to ensure they really are helping our users and those supporting them make those decisions.
Use of the term “Real” implies that there might be something about LMI that is “NOT Real”. But what I really meant was understanding the difference between what we at CASCAID are calling “Static LMI” and “Live LMI”; the difference between what we think is going on versus what is really going on.
Static LMI is the type of LMI used by many websites and online tools to provide information that includes hours, salaries, current workforce levels, future trends and opportunities.
The characteristics of each type are as follows:
“Static LMI” draws upon annual surveys based on limited sample sizes (by comparison to the size of the real economy). It uses imperfect classification systems such as SOC codes. There is a lot of aggregation and extrapolation of data. Rational models and economic rules of thumb are applied and an assumption that you can use past trends to predict the future is often made. The information relied upon is often dated being as much as 3 or 4 years old in some cases.
By contrast, “Live LMI” is based on real job vacancies posted online by employers. From these vacancies it is possible to glean real salary data, job titles, locations, skills and qualifications in demand, sectors and occupations currently advertising and much more.
What is the problem with current sources of LMI?
About 18 months or so ago, everybody seemed to be talking about LMI. It appeared on virtually every seminar and conference agenda; CPD events were advertised and run; products of various sorts both online and offline started to introduce it. Many an official report made the point that young people in particular needed access to good quality LMI to inform career choices.
Around the same time we became aware of the ‘LMI for all’ project, run by the University of Warwick’s Institute for Employment Research and funded by the UK Commission for Employment and Skills. In summary ‘LMI for all’ is an open data project delivered via:
“an online data portal, which connects and standardises existing sources of high quality, reliable labour market information (LMI) with the aim of informing careers decisions. This data is made freely available via an Application Programming Interface (API) for use in websites and applications.”
So it is specifically targeted at organisations and individuals like us with an interest in using LMI within online products. “Brilliant!” we thought. A bringing together of all the reputable sources of LMI with all the heavy lifting done and best of all…FREE. What more could we want?
However, on closer inspection we began to uncover problems that gave us a bit of a headache. The sources of data had all of the characteristics identified above under the Static LMI heading. Our view was that if the data is used without overtly and transparently identifying or qualifying its limitations, there is a high risk of imparting erroneous information that could lead to bad or ill-informed decisions on the part of users or clients of products making use of such sources.
So the aim of this article is to:
- Highlight the issues in the current sources of LMI
- Provide some exemplars of how the data is being used identifying why caution might be needed
- Highlight the possible issues and consequences of using or relying on unmediated static LMI
- Float some ideas for “Future LMI”
Why does it matter?
In CASCAID’s view LMI should be about connecting labour market demand with supply; matching individuals with real opportunities whether that be in work, training or education. It should help to identify the gap between where I am now in relation to the labour market and where I want to be or where the opportunities are for jobs; gaining new skills through training or new qualifications through education and progression.
In that scenario it is vital that the information used to inform those choices is accurate, relevant and in real-time.
What’s on offer and what are the issues with current sources of LMI?
The main official sources of LMI come from the Office for National Statistics (ONS). To save you searching through multiple pages and digging around the website to find relevant data and information, the ONS have also developed NOMIS, which pulls together all official sources of Labour Market Information “to give you free access to the most detailed and up-to-date UK labour market statistics from official sources.” In many senses this is a bit like ‘LMI for all’, but without the API for developers; ‘LMI for all’ also has vacancy data via Universal Jobmatch.
Both ONS and NOMIS provide good high level background macroeconomic information, although in some cases it does take time to trawl through spreadsheets. But there isn’t a lot of granularity from an individual user’s point of view; there isn’t much actionable intelligence. If you want high level ward, town or county data to inform high level policy or strategy, it’s good. But how useful is this to an individual when making a decision about jobs to apply for, training to undertake or first career choices? It’s also worth noting that much of the data is static. It isn’t frequently updated, being based on census returns or annual surveys. It is certainly not user friendly.
‘LMI for all’ promotes itself as:
“an online data portal ….developed by the UKCES to bring together existing sources of LMI that can inform people’s decisions about their careers…(that)…will put people in touch with some of the most robust LMI from our national surveys / sources therefore providing a common and consistent baseline for people to use alongside wider intelligence.”
It draws its data from these sources:
- Employment, projected employment and replacement demands from Working Futures
- Pay and earnings based on the Annual Survey of Hours and Earnings and the Labour Force Survey
- Hours based on the Annual Survey of Hours and Earnings
- Unemployment rates based on the Labour Force Survey
- Skills shortage vacancies based on the Employer Skills Survey
- Skills, Abilities and Interests based on the US O*NET system
- Occupational descriptions from the ONS Standard Occupational Classifications
- Current vacancies available from Universal JobMatch
- Higher education destinations data from HESA
For each of the above data sources ‘LMI for all’ provide a reasonably detailed summary description of the source, which also includes “Known quality issues” and statements around “Accuracy of data”. The current version of this document is dated June 2014. That is an indicator in itself that much of the data used is static and now well out of date. The latest data included is from 2013; some dates back to 2011 or further.
You can of course read the document for yourself but in summary there is:
- Much aggregation, estimation and extrapolation
- The application of rounding, economic rules of thumb and statistical methods
- Reliance on assumptions that are at best questionable in a dynamic 21st century, technologically driven world and at worst totally misplaced
- Use of data that is drawn from annual surveys
- Use of data from cross sectional surveys
- Data that is dated
- Data that cannot be drilled into beyond national level for Wales, Northern Ireland or Scotland
- Data that cannot be drilled into beyond regional level in England
‘LMI for all’ do have a vacancy feed from Universal Jobmatch 1 which is updated regularly, but within the current API vacancy data retrieval is limited to a maximum of 50 results and the postcode filter is hard set to 50 miles. There is no SOC code or other form of categorisation and the only search filter allowed is by ‘keyword’. It is not possible to get to the raw data, so it is not possible to extract other information such as salary, skills or qualifications.
So we can find out where a single vacancy or group of vacancies might be based on a keyword search but it is hard to form a picture of what types of occupations are currently being advertised in which sectors in any specific geographic location. Very little meaningful analysis can be done on the data. As an online application developer it’s very limited if you are concerned about delivering good quality actionable intelligence and data to users and their advisers.
On a wider scale it’s also very concerning.
It’s concerning because whilst ‘LMI for all,’ to their credit, do identify all these issues for potential developers, there is a big question about whether developers using the ‘LMI for all’ API pass on details of those issues in a clear and transparent way to the users of their resources.
As far as it has been possible to ascertain, known flaws in the data are mostly not advertised. The assumptions used are not advertised or are hidden in the small print or in background reports that very few users (and a lot of time-poor advisers) ever really read. Information in resources, applications or websites is often presented as fact and in absolute terms. Would a user know to dig into the small print or supplementary reports? Would an adviser know that the source of the data itself issues cautions over its quality and validity?
One of the major issues with any source of data is how you classify and structure it for analysis purposes. Within Labour Market Information, a common approach is to use the Office for National Statistics Standard Occupational Classification (SOC) system. This is used as a classification method for careers information in many products. As such it is not bad but it does have limitations, particularly when used in conjunction with the collation of labour market information.
A typical occupational grouping looks like this:
Readers will note that “job” titles at the 4th digit level are very generic and high level. Anyone trying to classify a real job such as Customer Relationship Manager or Service Desk Manager has to make a judgement about which 4th digit code to apply, or else apply the catch all code 1139, which is a generic code covering jobs that are ‘not elsewhere classified’. In other words the job does not fit easily with any of the other 4th digit job titles.
This can and does result in aggregated data for salaries, hours, future predictions and current opportunities. There really needs to be a 5th level if SOC codes are to be truly reflective of what is happening in the real economy. Alternatively, the information presented needs to be connected to live sources of information so that users are able to see for themselves the real jobs that sit behind the data.
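To make the aggregation problem concrete, here is a minimal sketch in Python. The job titles, salaries and the decision to place them all under code 1139 are invented purely for illustration; they are not taken from any real survey or product. It simply shows how a single headline range for one code can sit a long way from any of the individual jobs behind it.

```python
from statistics import median

# Hypothetical job titles and salaries (GBP), invented purely for illustration.
# All are assumed to have been classified under the same catch-all SOC code.
jobs_in_code_1139 = {
    "Customer Relationship Manager": 32000,
    "Service Desk Manager": 41000,
    "Facilities Coordinator": 24000,
    "Programme Director": 78000,
}


def summarise(salaries):
    """Return the (lowest, median, highest) salary for a group of jobs."""
    values = sorted(salaries)
    return values[0], median(values), values[-1]


low, mid, high = summarise(jobs_in_code_1139.values())
print(f"Headline for the code: £{low:,} to £{high:,}, median £{mid:,.0f}")

# The headline hides how far each real job sits from the published median.
for title, salary in jobs_in_code_1139.items():
    print(f"  {title}: £{salary:,} ({salary - mid:+,.0f} against the median)")
```

A user shown only the headline line has no way of telling which of the underlying jobs, if any, it actually describes.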
What does it add up to?
If there are known and acknowledged flaws in the data combined with the limitations of the SOC code system, overlaid with economic assumptions based on limited and dated surveys and access is by uninformed users or advisers, you have to ask the questions: LMI for who? LMI for what?
All of the above is of course an academic but logical analysis of current sources of LMI. Here are some examples of how such LMI data is used in live and available products. The first two are anonymised; the third is not as it is a publicly funded resource from a Government funded agency.
The first example is from a freely available website. The target market users of this website can be as young as 12 years old. It provides information on a variety of careers and you can be matched to careers if you complete a (long) questionnaire that was originally designed for and aimed at adults in a completely different country.
Some basic descriptive information is provided for the job of Bee Keeper.
Scrolling down I can find out more information about the job including skills needed and other background information including salary which is presented thus:
To the informed eye this tells you that Bee Keeper has been classified in the catch all ‘not elsewhere classified’ SOC code. Therefore the salary data has been aggregated together with a host of other jobs in the Agriculture and fishing trades sector. Most users are unlikely to understand the significance of this.
One click down, if you pursue exploring this job further, you will see in the more detailed information the following:
Is this a real career opportunity? Is this an accurate salary? Isn’t this confusing? I can earn up to £24,000 doing this job….oh wait…maybe I can’t? Maybe I can’t do this as a job or career at all?
What is known is that as of the end of February 2015 there have been no advertised vacancies for a Bee Keeper within a 300 mile radius of the writer’s house since the beginning of the year. That covers all of England and Wales, Northern Ireland and a fair proportion of Scotland too. What is also known is that this is just one of many issues with the presentation of data and information on this site. And the site is not unique in presenting potentially misleading information.
In a different tool we find on the website marketing its features and benefits the following statement:
“our data is the hub around which all our tools and services are built, and so one of our most vital tasks is to make sure our data is up-to-date and that it takes into account any new datasets that would give our customers an even better insight into their local labour market than they are currently getting.”
Deeper in we see this quote:
“Our data is updated annually.”
So it isn’t that up to date, is it? The stated sources for the data are ‘LMI for all’ and ONS. But there is no mention of the known quality issues with the source data. Information is presented as follows. This particular example is for an Information Technology or Telecommunications Director in a specific geographic location in England:
Again, to the informed eye, it is possible to see that data has probably been aggregated to the 4th digit SOC code and probably encompasses a wide range of jobs and job titles from entry level to senior. Furthermore, how it is possible to arrive at such precise numbers for those currently employed in this role and the future prediction given the stated sources of data and the known issues with them is not clear. It can only be by extrapolation and the application of economic rules of thumb. It’s just not possible to know so precisely based on the data sources.
More likely is that those differing salary points reflect quite different roles, and indeed career paths, that have been aggregated into a single 4 digit SOC code. At the entry level this could be a Systems Administrator or similar. At the median wage level this could be a Technical Architect or a Senior Software Development or IT Service Desk Manager role, and at the high wage level this could be something like a Chief Information Officer role. The point is, we don’t know, there is no way of finding out, and no caveats about the method used or the quality of the data are given.
Forecasting the future
In December 2014 the UK Commission for Employment and Skills published a document aimed at teachers and young people describing ‘Careers of the future’. In it they identify the 40 top jobs of the future based on their research and analysis. It’s a well-designed resource intended to attract and capture young people’s and parents’ attention at key points in the document. This is just one example:
Labour market information about train drivers is presented as follows:
Notwithstanding the fact that, again, there have been no vacancies for Train Driver over the past couple of months within a 300 mile radius of the writer’s house, it does not take long to find the following using a Google search for “driverless trains”.
Source: The Independent
Is this a good long term career choice given the likely impact of new, planned for and proven technologies? A 16-18 year old today will be 26-28 in the mid-2020s, when TfL (and I am sure many other train operators) plan to implement these trains.
In the ‘Careers of the future’ report on page 7 the following statement is made:
“We don’t pretend to be able to predict the future, but we can get an idea of longer term job prospects based on past trends. We think this is a good basis for thinking about the future.”
Again, notwithstanding the fact that the presentation of information suggests that they do pretend to predict the future, it is unlikely that many young people will read that section (or teachers for that matter, given time constraints and other pressures). The sentence also contains an assumption that past trends are a good basis for thinking about the future. In a dynamic 21st century economy that assumption is highly flawed, and I can cite numerous examples in addition to the one above. Kodak and Instagram spring to mind in the first instance.
‘Careers of the future’ is supported by a Background Report that runs to 115 pages. In its foreword the report states:
“we mobilise robust business and labour market research to inform choice, practice and policy.”
It goes on to say that:
“Access to high quality intelligence about the world of work is also critical for individuals”
They are right, it is. Further on they say that:
“…research on ‘Careers of the future’ seeks to deploy the UK Commission’s intelligence about the labour market for the specific purpose of supporting individual career choices.”
This is all very laudable. But as you start to read through the background report, the caveats start to appear. From pages 16 and 17 the following:
“The detailed projections present a carefully considered view of what the future might look like assuming that past patterns of behaviour and performance are continued over the longer term.”
“The results are indicative of general trends and orders of magnitude and are not intended to be perceived as definitive”
“Forecasting is as much an art as a science and requires considerable judgement on the part of the researcher especially when the forecast horizon is as much as 10 years ahead. Any errors in the forecaster’s ability to predict the future will result in inaccuracies in the projections.”
“The extent to which the historical base is inaccurate due to data limitations further exacerbates this problem. There are margins of error associated with the estimates from key official statistical sources used and these are carried over into Working Futures.”
So, if it’s not definitive; contains margins of error and acknowledged inaccuracies; requires considerable judgement and assumes the past is a good indicator of the future, is it acceptable to present the information in a way that implies accuracy and that it is fact like this one for Police Officer?
You will note the asterisk next to the number 193,000. A footnote appears on the page in small print that states:
*Employment figures are based on Labour Force Survey and do not correspond to official statistics on police workforce numbers produced by Home Office.
A quick check of the Home Office figures shows that: “There were 129,584 full-time equivalent (FTE) police officers in the 43 police forces of England and Wales as at 31 March 2013”
That’s only a c. 50% error rate! And if the Labour Force Survey is so wrong about Police Officers, what about the stated numbers in the other ‘Careers of the future’? How can we have any confidence in them? Note this is the only qualification I can find in the main elements of ‘Careers of the future’. It tells you that there is an anomaly but it doesn’t tell you how big that anomaly is. And if they know there is an anomaly of such huge proportions how can they even consider using the data in this way? How many will check as I did?
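For anyone who wants to check the arithmetic behind that figure, a quick sketch in Python follows. It uses only the two numbers quoted above and assumes they can be compared directly, which the footnote itself suggests may not be the case. How large the discrepancy looks depends on which figure you treat as the baseline.

```python
# Figures as quoted in this article.
lfs_estimate = 193_000      # 'Careers of the future' figure (Labour Force Survey based)
home_office_fte = 129_584   # Home Office FTE officers, England and Wales, 31 March 2013

gap = lfs_estimate - home_office_fte

# Overstatement measured against the Home Office figure.
over_home_office = gap / home_office_fte * 100

# The same gap expressed as a share of the published estimate.
share_of_estimate = gap / lfs_estimate * 100

print(f"Gap: {gap:,} officers")
print(f"Relative to the Home Office figure: {over_home_office:.1f}%")
print(f"As a share of the published estimate: {share_of_estimate:.1f}%")
```

Measured against the Home Office count the published number is a little under 50% too high; measured against the published number itself, about a third of it is unaccounted for. Either way it is not a rounding error.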
To be absolutely clear and fair, all the caveats about the data and the approach are stated, but they are hidden in the small print or in the background report as I have highlighted. We all know that these are rarely read by those who really need to know the issues. They are not stated clearly enough for a young person to understand them easily. All a young person will see and look at is the glossy designs and cool layouts that contain very specific and precise data presented as if it were fact.
Don’t get me wrong. Discussions about the future are useful and important, but if information is presented in a misleading way, with a lack of transparency, in situations where there might be a lack of mediation and support, then there is a significant risk of ill-informed decisions.
Is it intentional?
No. All of these resources are well intended. But they are poorly executed. They do not put the user or purported beneficiary at the heart of what they do. They assume that because the data they rely on comes from so-called ‘reputable’ sources it must be good and reliable. There is also a good chance that those accessing the data and making use of it to design websites or other resources are not experts or professionals in the business of supporting young people to make good career choices. There may also be an assumption that because it’s online, ‘digital natives’ know implicitly how to make best use of it ‘because they just do’.
It’s a recipe for bad decisions, and those in the education technology business and elsewhere (especially Government funded or commissioned bodies using taxpayers’ money) have a duty, if not a moral imperative, to ensure that technology and data are used responsibly, with care and with thought, especially where life defining decisions are being made.
Traditional sources of Labour Market Information are thus inherently limited. They rely on census and survey data on employment that can take months or longer to collect, compile and publish. By then the conditions and trends they describe may already have shifted.
In a dynamic 21st century labour market can we really afford such a time lag when making decisions about first career choices, re-employment, workforce development or business growth?
Online job postings are an expression of employers’ actual demand for workers. They possess rich potential to provide actionable intelligence and real time insights that change from day to day, week to week and month to month.
Analysis can be done on the content of those adverts providing insight into the following:
- Job function
- Employer industry
- Educational requirement
- Common and Specialised Skills
- Duration and level of experience
- Plurality (i.e. does this represent just one job?)
- Normalised salary
- Intermediation (i.e. was this posted by a recruiter?)
- Required certifications or licenses
This provides a much better insight into what is actually happening in the labour market and what has actually happened. It is possible to identify past trends at a local, regional or national level based on a much larger sample than a survey could ever provide and without the need for subjective extrapolations.
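As a rough illustration of the kind of analysis described above, here is a minimal Python sketch that counts which skills are being asked for in a given location. The postings, the field names and the values are all made up for the example and do not represent any real vacancy feed or its format.

```python
from collections import Counter

# Hypothetical, hand-made postings. The field names are assumptions for this
# sketch, not the format of any real vacancy feed.
postings = [
    {"title": "Service Desk Manager", "location": "Leeds",
     "skills": ["ITIL", "team leadership"]},
    {"title": "Systems Administrator", "location": "Leeds",
     "skills": ["Linux", "ITIL"]},
    {"title": "Technical Architect", "location": "London",
     "skills": ["cloud design", "Linux"]},
]


def skills_in_demand(postings, location):
    """Count how often each skill appears in postings for one location."""
    counts = Counter()
    for posting in postings:
        if posting["location"] == location:
            counts.update(posting["skills"])
    return counts


leeds = skills_in_demand(postings, "Leeds")
for skill, n in leeds.most_common():
    print(f"{skill}: mentioned in {n} Leeds posting(s)")
```

With real postings in place of the invented ones, the same few lines could be run daily, which is exactly the sort of refresh rate an annual survey cannot offer.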
Thus we could start to do things like this:
How useful would data and information like this be to individuals, businesses, strategic planning bodies and education institutions? Individuals would know whether their aspirations to follow a particular career path were viable in any given geographic location. They would know what skills (hard and soft) employers were currently looking for. They would know which employers were currently recruiting and where.
Businesses could gain insight into competitor activities or the current levels of local demand and salaries being paid for given occupations. Strategic planning authorities would be able to track which sectors were growing or in decline from an employment point of view and which skills were in highest demand enabling better decisions around resource allocation, skills development initiatives and infrastructure planning. Education institutions could analyse their curriculum provision and ensure that they were preparing students with the right hard and soft skills to succeed in the local or regional economy thus better meeting the needs of employers at the same time.
This is real time actionable intelligence that cannot be gleaned from current static sources of LMI which at best can only give a high level snapshot of the past and at worst are potentially highly misleading.
The next step would be to start to connect the demand side of the economy more directly with supply. What if we could build a large user base with CVs or profiles? What if we could analyse those CVs and profiles for skills and experience and match the individual to live labour market opportunities based on their current skills, qualifications and experience? Better still, what if we could suggest further skills and qualifications to those individuals that would open up new, better or alternative career paths if they got them?
What if we could do matching, exploration and referral activities that analyse and catalogue information directly from CVs and job postings? We would have the beginnings of a real time understanding of skills gaps across the economy – locally, regionally and nationally.
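Purely to show the shape of the idea, here is a very simple Python sketch of matching a profile to postings and listing the skills that are missing. It is not a description of any existing CASCAID tool. The profile, the postings and the scoring rule (the share of a posting’s listed skills that the person already has) are all assumptions chosen for the example.

```python
# Hypothetical postings and profile, invented for illustration only.
postings = [
    {"title": "Service Desk Manager", "skills": {"ITIL", "team leadership"}},
    {"title": "Systems Administrator", "skills": {"Linux", "ITIL"}},
    {"title": "Technical Architect",
     "skills": {"cloud design", "Linux", "stakeholder management"}},
]
profile_skills = {"ITIL", "Linux"}


def match(profile_skills, postings):
    """Rank postings by the share of required skills the profile already has."""
    results = []
    for posting in postings:
        required = posting["skills"]
        held = required & profile_skills
        coverage = len(held) / len(required) if required else 0.0
        missing = sorted(required - profile_skills)
        results.append((posting["title"], coverage, missing))
    # Best covered postings first.
    return sorted(results, key=lambda r: r[1], reverse=True)


for title, coverage, missing in match(profile_skills, postings):
    gap = ", ".join(missing) if missing else "none"
    print(f"{title}: {coverage:.0%} of skills held; still needed: {gap}")
```

The list of missing skills is the interesting part: summed across many profiles and many postings, it is the start of the skills gap picture described above.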
Of course it would not be perfect information. No data set can ever be perfect. There is still a hidden jobs market that is not advertised. But it would be a much more accurate and up to date real time picture of what is going on, based on real activity rather than subjective forecasts and limited, dated survey data.
What we are talking about is a lifelong SATNAV for earning and learning that provides a set of career management tools for any eventuality that might arise in a person’s career. A tool that sets out to solve the problem:
Help me understand:
- Where I am in the labour market
- What are my strengths and weaknesses
- What opportunities are available to me now
- What opportunities could be available to me if I did this
- What threats might be around the corner
That is our vision and aim. The current sources of LMI fall well short of that.
1 There are well documented issues with the quality of Universal Jobmatch data but that is for another post.