If different people buy the same items at the grocery store, will their taste in movies also strongly overlap? Can a company recognize when someone tries to make a fraudulent payment? Is a home buyer getting a fair price? Those are the sorts of problems that data scientists tackle.

“Data science is the marriage of statistics and computer science,” says Janet Kamin, chief admissions officer at NYC Data Science Academy. “It is the art of finding patterns and insights in large sets of data that allow you to make better decisions or learn things you couldn’t otherwise learn.” The demand for data scientists is booming across industries—retail, automotive, banking, health care, and more. It’s also growing in the nonprofit and government sectors. (See the plot on page 22.)

Demand for data scientists is booming. Shown here is the relative growth in US data science job postings. (Data courtesy of Indeed.com.)

In parallel with the data explosion, boot camps on how to deal with the data have mushroomed over the past few years. Most tech boot camps are in coding, Web and mobile app development, and user experience design, but more and more are popping up with a focus on big data and data science. With the aim of helping people transition into the field, the data-focused camps offer various combinations of coding, math and statistics, and machine learning, plus hands-on experience and assistance with landing a job.

“Data in industry has been having a moment for the last 5 to 10 years,” says Berian James, who moved into data science from astrophysics. “It’s transformative, and I felt I would regret not being part of it.”

“There is a huge glut of PhDs who can’t get tenure-track positions,” says Ryan Orban, chief technology officer at Galvanize, which offers data science programs and other tech-sector educational and entrepreneurial services. “We wanted to be a bridge by creating a hands-on, practical curriculum. Industry is looking to hire people who have not just a theoretical understanding but the ability to deliver on day one by building models and solving problems.”

“We give participants three things,” says Kim Nilsson, an astrophysicist and founder of the London-based Science to Data Science (S2DS) boot camp. “We teach them how to commercialize their skills.” For example, boot campers learn how to use Apache’s analytics platform Spark, which is popular in industry and rare in academia. “Second, they learn about working in a commercial environment, with deadlines, intense teamwork, and the culture of business. Academics are meticulous, they want to get things 110% right. In business, 80% is good enough. And third, we help them form a network.” (See the interview with Nilsson on Physics Today’s website.)

Boot camps typically last from 5 to 12 weeks. They can be on site or online. Some cater to people with a degree in a science, technology, engineering, or math (STEM) field. Some are extremely selective, accepting just a few percent of applicants, while others are open to anyone who can pay; the cost varies from free to around $20 000.

For example, the five-week-long S2DS camp, which accepts mainly STEM PhDs, costs £800 ($1000), including housing. Companies pay a fee to have S2DS participants work on a project; in return, the companies get the results from the project and the opportunity to recruit the participants. NYC Data Science Academy charges its broader mix of participants $16 000 for a 12-week course. “We look for people with a STEM background,” Kamin says. “But we will take a risk on people who are uniquely talented and motivated—like lawyers and people in marketing and psychology.” About one-third of the academy’s boot campers hold PhDs, about one-half hold master’s degrees, and some hold bachelor’s degrees, she adds. In a third model, di-Academy in Brussels trains people sponsored by their current or prospective employers, who pay about €15 000 ($16 700) per person for a 12-week data science boot camp.

The typical boot camp attendee is in their mid-20s to early 30s, but ages range from the early 20s into the 50s; women make up from 20% to nearly 40% of attendees. Most participants need to know at least some programming going in.

Claire Lackner, a former astrophysicist, at Insight Data Science, where she got the training and connections she needed to move into a data science career. She is now a data scientist at Element Analytics.

Attendees and instructors at the close of a 12-week boot camp at Data Science Retreat in Berlin. The attendees had just presented their individual projects to recruiting companies. Founder Jose Quesada is second from right.

The average annual salary for a data scientist in the US is $117 000, according to the job-tracking website Indeed.com. Testimonials on boot camp websites showcase physicists turned data scientists working at startups and at well-known companies like Twitter, LinkedIn, and Google.

Kamin notes that physicists rarely come in knowing what they want to do after they complete boot camp. “They are not the ones who say, ‘I want to analyze medical records.’ Instead, they come in saying, ‘I love breaking down big problems into small pieces.’ They are agnostic about what they do; they just want to solve problems.”

Benjamin Arar earned his bachelor’s degree in physics from Princeton University in 2013. A year later he was frustrated: He was getting into machine learning on his own, but he was getting nowhere in his search for employment. Early this year Arar attended the inaugural 12-week session of the RSquare Edge boot camp in New York City; the startup gave him a partial scholarship for the $19 000 program. “It was intense,” he says. “Every four weeks we switched to a new set of classes with homework and a project.”

Within two months of completing the program, Arar had two job offers: one from a startup where he would create predictive models for media companies, and one from a company where he would build real-time analytic and machine-learning tools for marketing applications.

James, the former astrophysicist, first heard about the opportunities in data science when he was a postdoc at the University of California, Berkeley. He applied to Insight Data Science for a free seven-week program for PhD scientists. After attending the boot camp in Palo Alto, he got a job at the San Francisco–based payment-processing company Square, where he wrote algorithms that used machine learning to detect fraud. Four years on, he now manages a team of about 20 data scientists at the company.

James was originally drawn to physics by his desire to learn and understand the universe. He was happy to find that the opportunities for learning in industry “accelerated,” he says. “It’s not as much depth as in an academic career, but the variety of things to work on is big—fraud is just one component; marketing, sales, these will be rich fields for decades for data science.”

Martina Pugliese began looking around at career options as a doctoral student in physics at Sapienza University of Rome. Her PhD work applying mathematical theories from complex systems to linguistics “was the embryo of my path in data science,” she says. “I did a good amount of data mining and numerical modeling, but not much machine learning.”

Attending an S2DS boot camp in 2014 helped in two ways, says Pugliese. It provided access to companies, and by doing a real-life project during the boot camp she proved herself as a data scientist. She is now at Mallzee, a startup company in Edinburgh, Scotland, where she works on an app that personalizes clothing recommendations and sells the items it lists.

Before the recommendations can be personalized, though, clothing data from retailers’ websites need to be standardized and classified. For example, grabbed text might read, “Beautiful pencil skirt in blue, wear it with ankle boots in brown.” Pugliese uses the probability of words appearing together and applies machine learning to train an algorithm to recognize the product as a skirt. In addition, she feeds data on individuals’ preferences, collected from their viewing and buying histories, into an algorithm that displays personalized recommendations.
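To make the classification step concrete, here is a minimal sketch of one way a word-probability classifier could label such a description. It assumes a naive Bayes model with add-one smoothing; the article does not say which algorithm Mallzee uses. The training descriptions, the category labels, and the function names (tokenize, train, classify) are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled product descriptions (invented for this sketch).
TRAINING = [
    ("Pencil skirt in grey, wear it with heels", "skirt"),
    ("Pleated midi skirt in navy, pair with ankle boots", "skirt"),
    ("Denim mini skirt with button front", "skirt"),
    ("Leather ankle boots with block heel", "boots"),
    ("Suede knee-high boots in brown", "boots"),
    ("Chelsea boots in black leather", "boots"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    return cleaned.split()


def train(examples):
    """Count how often each word appears in descriptions of each category."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the category with the highest naive Bayes log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Prior: the share of training descriptions in this category.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one smoothing keeps unseen words from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TRAINING)
    text = "Beautiful pencil skirt in blue, wear it with ankle boots in brown."
    print(classify(text, *model))  # prints "skirt"
```

Even though the sample text mentions boots, words such as “pencil,” “wear,” and “it” appear more often in the skirt descriptions, so the combined word probabilities favor the skirt category. A production system would train on far more data and would likely use an established machine-learning library.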

Although Pugliese has never been interested in clothes, she is interested in understanding people’s behavior. Her work, she says, “uses an enormous amount of data, and you can see the psychology of people. And you never get bored. The problems are not meant to be solved in months or years. They are meant to be solved at a quick pace. It gets messy, but it’s fun.”

Data science, says Pugliese, is so fashionable at the moment that it presents a cultural challenge: Data scientists have to get up to speed in the field they are working in to be able to ask the right questions, and people on the business side have to learn what data scientists can actually glean from the data—they should not expect a “silver bullet.”

Luigi Scorzato is a data scientist at the consulting firm Accenture in its Geneva office, where he works with clients to improve their computing architectures and their models for handling growing amounts of data. The work, he says, “is not less intellectually challenging or interesting than what I was doing in particle physics.” Most of the time in particle physics, he says, “you are involved with problems that are not really fundamental.” Now, he says, “what I find most interesting is to see problems solved and to help in innovation.”

The technologies used in industry change fast, as do the problems. But research, particularly in physics, will remain extremely useful for implementing the most innovative solutions, says Scorzato.