Building Data Science Capabilities Means Playing the Long Game


Are you frustrated with the challenge of finding and retaining Data Science hires who don’t require a lot of time and investment to develop?  You’re not alone.

Recently, at a Machine Learning conference, we met an executive from a well-known data-driven corporation who expressed to us his weariness from the ongoing Data Science talent shortage. He lamented that he just couldn’t find “good Data Science people who he wouldn’t have to spend a ton of time training”.  His perspective was understandable.  This company has been investing in Data Science capabilities for years, well ahead of its competitors, and is still struggling to find and develop seasoned Data Science talent.

Why?  Because the competition for top analytical talent continues to intensify and retention of that talent has become very difficult and expensive - even for the most data-driven brand name companies.  Moreover, technology and all things computer science continue to evolve at lightning speed while zettabytes of data pile up.  In the face of these mounting challenges, perhaps it’s time organizations rethink their approach to hiring and creating Data Science teams.

So then, what to do?  Well, the US government has given us a clue.  Following the recent executive order on maintaining American leadership in artificial intelligence, former US Chief Technology Officer Megan Smith urged school children to learn to code and play with Raspberry Pis as a matter of national necessity.

The powers that be are sending a simple but critical message to the world:  When it comes to Data Science and Artificial Intelligence, plan and invest for the long-term development of skills.  The consequences of not doing so are simple too. Your struggle to incorporate Data Science talent into your organization will never cease.

That’s all well and good, you say, but in the face of the aforementioned zettabytes of data being produced daily and the urgency to get value out of it, companies need to focus on the immediate Data Science needs of today.  They need Data Scientists now. They can’t afford to wait for current 8th graders to skill up and graduate.  So, what options do they have?

QuantHub’s Experience with Data Science Skills


At QuantHub, we see actual Data Science skills test scores time and time again, from candidates ranging from entry-level Data Scientists and Data Engineers to Ph.D. candidates.  And you know what?  Skills are all over the place.  Rare is the candidate who knocks it out of the park and beats our QuantHub "mascot", Chip.  Nevertheless, we firmly believe that there is a good base of Data Science talent “potential” out there just waiting to be developed further.  And therein lies the key to alleviating some of that weariness our friend from the conference is feeling.

So, Let’s Define the Long-Term Game McKinsey-style

We’re big fans of McKinsey.  They tend to be at the forefront of everything and have been quite active in researching the realm of Data Science for a long time.  So, we decided to look into how long McKinsey has been promoting the idea of hiring Data Scientists to get some perspective.

Over ten years ago, in 2007, McKinsey published “Eight Business Technology Trends to Watch”.  In that article, it identified “putting more science into management” as a trend and noted that “technology is helping managers exploit ever-greater amounts of data to make smarter decisions and develop the insights that create competitive advantages and new business models.” Yeah, that’s right – they said that 12 whole years ago.

Then in 2011, McKinsey wrote a report called “Big Data: The next frontier for innovation, competition, and productivity” in which it hailed big data as a “growing torrent” with all kinds of stats to prove it.  Five years later, in 2016, it produced a follow-up study, “The Age of Analytics: Competing in a Data-Driven World”.  Here’s the perspective it gave three years ago:

As we take stock of the progress that has been made over the past five years, we see that companies are placing big bets on data and analytics. But adapting to an era of more data-driven decision making has not always proven to be a simple proposition for people or organizations. Many are struggling to develop talent, business processes, and organizational muscle to capture real value from analytics. This is becoming a matter of urgency, since analytics prowess is increasingly the basis of industry competition, and the leaders are staking out large advantages. Meanwhile, the technology itself is taking major leaps forward—and the next generation of technologies promises to be even more disruptive. Machine learning and deep learning capabilities have an enormous variety of applications that stretch deep into sectors of the economy that have largely stayed on the sidelines thus far.

Fast forward to 2018 and McKinsey publishes “Analytics Comes of Age” where it submits that “data are the coins of the realm.”  Translation: Data is very valuable, and oh, by the way, we’ve been telling you to develop capabilities and talent to derive value from data for over 10 years.  Time’s up.

What does this mean?  If you’d jumped on the bandwagon back in 2007 and started investing in big data analytics back then, in theory, you might have waited over ten years before your analytics investments “came of age” and became valuable “coin”.

[Figure: timeline of the McKinsey publications on data and analytics discussed above]

Why then wouldn’t the same timeline and investment be granted to employees working in this field?

A Long-Term Road Map for Developing Data Science Capabilities

As we mentioned before, there is Data Science talent potential out there, but much of it is imperfect.  To tap into this talent and develop Data Science capabilities, companies need to devise a long-term plan and then implement it.

We’ve put together a roadmap for developing a multi-pronged, long-term approach to “training good people” to become even better Data Scientists.

1.     Give up on the unicorn, for real

Like the executive we met at the Machine Learning conference, you can’t just continue to bury your head in the sand out of frustration and ignore the reality of today’s Data Science job market.  You’re just not going to find the exact candidate you need, now or ever.  When faced with pressing challenges such as those that big data presents, it’s easy to fantasize about a perfect solution that doesn’t exist.  So, as the folks at Forrester Research recently said, "beware of unicorns".  Recognize that you are looking for unicorns in an effort to get to a solution quickly.  Accept that you are not going to find those people and move on to the next step.

2.     Recognize and fix your recruitment process flaws

A recent study by the Initiative for Analytics and Data Science Standards found a big disconnect between the skills that Data Scientists and Data Engineers list on their LinkedIn resumes and the skills that employers require for Data Science and analytical roles.  While candidates list a wide range of hard and soft skills applicable to the field of Data Science, employers are recruiting for specific technical skills, especially specific programming languages.

Of the top 10 required skills listed by employers for Data Science positions, almost all were programming languages, while actual Data Scientists typically have several non-programming skills such as data analysis and statistics, as well as other relevant skills such as data wrangling that were completely missing from employer requirements.

There is a lot of other evidence out there indicating that the recruitment process for Data Science talent is flawed – whether it be companies not knowing what skills they are hiring for, HR managers not being able to properly screen candidates, or requiring advanced degrees that are unnecessary.   Perhaps the company that this executive worked for has been using the same recruitment and retention strategies for so long that it hasn’t adapted its hiring and development process to the job market and the talent that is out there now.

Regardless, the ongoing disconnect between Data Science recruitment requirements and the reality of the Data Science candidate pool is surely fueling an inability to find “good people” to train and develop.  They are simply being overlooked in the recruitment process.

3.     Develop a portfolio view of Data Science talents

Many organizations don’t yet recognize the mix of talent and skills required to be successful when applying Data Science to business strategy.   The reality is that no one, or even two, Data Scientists can cover all the bases, nor should they be expected to.   You should, therefore, aim to develop a portfolio of individuals who collectively possess the talents you need.

In our recent webinar, Data Science executives from Protective Life and Regions Bank both described similar strategies of hiring a diversity of Data Science talent in terms of hard and soft skill sets and even educational degrees.  As experienced Data Analytics Managers in highly quantitative industries, these business leaders recognized that this is the only way to build and retain successful long-term Data Science capabilities in their organizations.

The first step to doing this is to define the talents you need as an organization over time, rather than defining a team member who has specific skills, or a skillset needed for a specific data project.  Rather than thinking about investing $100,000+ in recruiting for a specific role, think about ways you can invest in developing the collection of skills and talents you need as an organization to be successful in your data efforts.  You’d probably find that these go far beyond the typical R coding and SQL capabilities that most companies recruit for when planning for Data Science initiatives.

A properly developed Data Science strategy may need functional domain knowledge to help identify high-value use cases, design-thinking skills to help conceptualize a solution, finance skills to create a compelling business case, data engineering and wrangling skills to provide access to the right data in the form needed, and machine learning skills to drive the execution of AI technologies. Can you really get that all in one or two people? Probably not.
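For readers who like to see the idea in code, here is a minimal Python sketch of a portfolio view: it checks which needed skill areas a team covers collectively rather than per person. The skill areas are taken from the paragraph above; the team members, their skills, and the function names are purely hypothetical and only illustrate the approach.

```python
# Skill areas named in the paragraph above (the labels are our shorthand).
REQUIRED_SKILLS = {
    "domain knowledge",
    "design thinking",
    "finance",
    "data engineering",
    "machine learning",
}

# Hypothetical team; names and skills are invented for illustration.
TEAM = {
    "analyst_a": {"domain knowledge", "finance"},
    "engineer_b": {"data engineering"},
    "scientist_c": {"machine learning", "data engineering"},
}


def coverage_gaps(required, team):
    """Return the required skills that nobody on the team has."""
    covered = set().union(*team.values())
    return required - covered


def skill_depth(required, team):
    """Count how many team members hold each required skill."""
    return {
        skill: sum(skill in skills for skills in team.values())
        for skill in sorted(required)
    }


if __name__ == "__main__":
    print("Gaps:", coverage_gaps(REQUIRED_SKILLS, TEAM))
    print("Depth:", skill_depth(REQUIRED_SKILLS, TEAM))
```

In this made-up example no single person covers everything, yet the team as a whole is missing only one skill area, which is the kind of gap a portfolio view is meant to surface before you write the next job description.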

4.     Cross train on skills and talents

Similar to the previous point, another deliberate approach that Protective Life takes is what they dub “cross-pollination”.    This means they hire a diverse set of people with different but needed skills and then put those people in a room working together for a long time with lots of coffee and downtime.  This provides plenty of opportunity over time for skills exchange and knowledge sharing. They also send these people out into the organization regularly to collaborate and have coffee with business people. The key to the success of cross-pollination is purposefully creating the “space” for it to happen naturally.

5.     Aim for lifetime learning over degrees and immediate skills

The debate still rages on as to whether strict adherence to specific degree qualifications is driving a lot of the perceived Data Science talent shortage.  For instance, it is common in the Pharmaceutical industry for someone with only an undergraduate degree working in AI-based research to have no chance at promotion unless they earn a master’s or Ph.D.  This kind of exacting requirement for promotion is simply unnecessary and short-sighted, and likely a reflection of corporate unwillingness or inability to invest in the long-term training and development of junior Data Science talent.  As a result, the talent that is recruited, seeing no career advancement, will either underperform or seek development opportunities elsewhere.

In addition, those self-taught Data Science Bootcamp graduates may not have the most polished coding skills.  However, they’ve got one demonstrated skill: a drive to learn challenging material and develop difficult skillsets.  It’s no secret that the technology of Data Science and business intelligence changes rapidly. Coding languages will come and go. Technologies will evolve rapidly.   Someone who is self-motivated to learn and develop over time is someone that you can invest in over time and in doing so, retain in your organization.

Ask yourself: Who is more loyal? A PhD who can get a high paying job anywhere, anytime?  Or a bright, but self-educated graduate who is eager to grow and learn and will seek to further their job skills - even on their own time?

6.     Build in a career path that aims to develop Data Science talent

One strategy that Regions Bank takes is that it allows and encourages its Data Scientists and Analysts to move around the organization.  It openly discusses and plans for this with its new hires and in annual performance reviews.  So, an entry-level Data Scientist will come into a department and know that in a few years, after learning all there is to learn in that department, they will have the opportunity to move to a new area of the organization where they can continue to develop, contribute and be recognized.

This strategy feeds well into the very simple fact that right about the time your best Data Scientists are hitting their stride they may also be thinking about starting a family and will thus be looking for new opportunities and challenges within your organization, rather than face a move.   They will likely plan to stay with your company longer when they see that they have a future and that they can start thinking about other life goals.

In addition to this, companies should start creating a hybrid career development track for all employees, one that runs parallel to both Data Science and the business side and allows business people to move into analytics and vice versa.  For example, if your company can afford it, you might develop education subsidy programs that support employees who are interested in pursuing advanced analytical degrees. Tuition grants can be given in exchange for your employees agreeing to stay with your company for a certain number of years to "pay back" the favor.

7.     Democratize Data Science

The real long-term solution to training and developing Data Scientists comes with democratizing Data Science. Companies must find ways to stop relegating Data Science knowledge to a handful of highly specialized, Ph.D.-clad quasi-unicorns.  The responsibility of Data Science is growing faster than ever. This puts tremendous strain on the Data Science team itself, as well as on recruiters and managers.

So what McKinsey, Harvard, and companies like Airbnb have been advocating for is the democratization of Data Science, both as a way to address the shortage of Data Science talent and, more importantly, as a critical step toward extracting the true value of Data Science.  Companies are setting up data science universities to train all management, and hiring for “hybrid” roles such as visualization specialists to help bridge the gap between Data Scientists and the rest of the organization.

Some companies are actively implementing automated tools, pre-trained models and self-service analytics to make Data Science techniques and skills more available to the rest of the organization.  The democratization of Data Science over time may relieve the pressure on the Data Science team and promote a fairer judgment of how trained they really need to be.

Admittedly most companies are far from democratizing Data Science. But getting there could go faster than you would think.

Ten Years From Now…

It’s clear that there is no single way of meeting the broad array of Data Science needs in any immediate way without taking a few risks on talent hiring and planning for future development of Data Science talent in new ways.  Organizations need to take steps to alleviate the unrealistic demands now being placed on individual Data Science candidates and the people trying to hire them.  This requires taking a long-term view toward developing a more balanced set of capabilities within the organization.

Many of the proposed steps don’t really require much in the way of additional investment. Bootcamps and LinkedIn Learning are not all that expensive compared to the cost of recruiting and hiring a “ready-made” Data Scientist.  Rather, what is required is an acceptance that hiring ready-made Data Scientists is really a fading option.  Start working on your long-term roadmap now, and maybe in 10 years when McKinsey reports that AI has “come of age” you’ll be armed with a team of homegrown and loyal Data Scientists ready to take advantage of all that the future of Data Science has to offer.