Best data scientists in the world

By Julia Morgan

The machine learning industry has grown enormously since its inception in the mid-20th century, and data scientists have driven much of that growth. Some have contributed groundbreaking research on neural networks and deep learning, while others have built industry-leading applications of machine learning. The data scientists listed here have contributed both to machine learning research and, as industry leaders, to its applications. Many have produced innovations that underpin the millions of AI applications in use today. Here is our list of the best data scientists in the world, leading a field of AI that is changing and evolving every day.

Yann LeCun

LeCun is one of the best-known data scientists of our time. He is widely known as the Director of AI Research at Facebook, but it is his industry-changing inventions that earned him a spot at the top of this list. LeCun holds 14 registered US patents and pioneered convolutional neural networks (CNNs), one of the main reasons deep learning is what it is today. During his time at Bell Labs, his group made numerous contributions to computer science, such as the DjVu image compression technology (Yann LeCun, Yoshua Bengio, Léon Bottou, and Patrick Haffner) and support vector machines (by Vladimir Vapnik). He is also one of the recipients of the Turing Award for his work in deep learning.

Dr. DJ Patil

Patil is credited with helping to coin the job title “data scientist” during his days as chief scientist at LinkedIn. He has been a leader in the field and has shaped data policy in the U.S., serving as the first Chief Data Scientist at the White House under President Obama. He has been responsible for spreading Big Data across industries and government agencies, and even helped Facebook create its early data science programmes. He has helped establish data science as an enterprise strategy, and his book “Building Data Science Teams” has guided many enterprises in building their own data science teams.

Yoshua Bengio

Along with Yann LeCun, Bengio has been called one of the “Godfathers of deep learning” and was one of the winners of the prestigious Turing Award. His work has been essential to the progress of artificial intelligence. He helped to build the technique that combines large amounts of data with many-layered artificial neural networks, which are inspired by the brain. His work pioneered the use of neural networks to learn distributed representations for words (word embeddings), as well as the use of soft attention mechanisms. Both innovations led to major advances in natural language processing. Along with Yann LeCun and Geoff Hinton, Bengio received the award for making deep neural networks a “critical component of computing”. He remains an active leader in academia and in AI research, and has advised and co-founded several AI start-ups, including Element AI.

Corinna Cortes

Cortes is also an alumna of the famous Bell Labs group responsible for some of the most influential innovations and discoveries in AI. She is well known for her contributions to the theoretical foundations of support vector machines (SVMs), a supervised learning method, for which she and Vladimir Vapnik jointly received the 2008 Paris Kanellakis Theory and Practice Award. She is also known for her work on data mining in very large data sets, for which she was awarded the AT&T Science and Technology Medal in 2000.

Leslie Kaelbling

Prof. Kaelbling is a leading professor and researcher at MIT. She has done substantial research on designing situated agents, mobile robotics, reinforcement learning, and decision-theoretic planning. She is the founder and editor-in-chief of the Journal of Machine Learning Research. She has contributed significantly to the field of robotics and is widely recognized for adapting partially observable Markov decision processes (POMDPs) from operations research for application in artificial intelligence and robotics. Kaelbling received the IJCAI Computers and Thought Award in 1997 for applying reinforcement learning to embedded control systems and developing programming tools for robot navigation.

Nando de Freitas

De Freitas is a top researcher in the fields of neural networks and deep learning, reinforcement learning, apprenticeship learning and teaching, goal and program discovery, transfer and multi-task learning, and reasoning and cognition. His many academic papers are regarded as authoritative in machine learning. He is a Professor of Computer Science at Oxford University as well as a senior staff research scientist at Google DeepMind in London.

Caitlin Smallwood

Smallwood is the VP of Data Science and Analytics at Netflix. She has applied data science to deliver real business value, leading a team that drives predictive decision models, algorithm and machine learning research, and experimentation science for all parts of the Netflix business. Her career has included working at Yahoo! as the director of data solutions and at PwC as a senior manager in quantitative consulting.

Chris Mattmann

Mattmann is a Principal Data Scientist and Chief Architect at NASA’s Jet Propulsion Laboratory. He helped develop the third generation of the Apache Object Oriented Data Technology (OODT) data processing and information integration system. He also helped invent Apache Tika and Apache Nutch, widely used software frameworks for content detection and analysis. He has continued his work on Dark Web and automated data processing technologies, leading research teams working with DARPA and NASA JPL on the Memex project, which involves data discovery and dissemination from the Dark Web.

Andrew Ng

Ng co-founded and led Google Brain, a deep learning artificial intelligence research team at Google. He later served as VP and chief scientist at Baidu. While he is best known for his online courses in machine learning and deep learning, he has also contributed to much of today’s research in deep learning. As a former Bell Labs member, he conducted research on reinforcement learning, model selection, and feature selection. He is one of the most influential and well-known data scientists in the world.

Gideon Mann

Gideon Mann is the head of Data Science at Bloomberg, guiding the strategic direction for machine learning, natural language processing, and search on the core terminal. At Bloomberg, his team has worked on the company-wide data science platform, natural language question answering, and deep learning text processing, among other products. He also founded and leads the Data for Good Exchange (bloomberg.com/d4gx), an annual conference on data science applications for social good, and is a core member of the Shift Commission on Work, Workers and Technology (shiftcommission.work). Mann graduated from Brown University in 1999 and received a Ph.D. from The Johns Hopkins University in 2006. His focus at Hopkins was natural language processing, with a dissertation on multi-document fact extraction and fusion. After a short post-doc at UMass-Amherst working on problems in weakly supervised machine learning, he moved to Google Research in NYC in 2007. In addition to conducting academic research, his team at Google built core internal machine learning libraries and publicly released the Google Prediction API and coLaboratory, a collaborative IPython application. He joined Bloomberg’s leadership team in the CTO department in 2014.

Kira Radinsky

Radinsky specializes in predictive data mining and gained recognition after her software predicted the first cholera outbreak in Cuba in 130 years. The prediction was based on a pattern identified by mining 150 years of data from various sources: in poor countries, floods within a year after a drought are often followed by a cholera outbreak. At Microsoft Research she led research in the field of Web Dynamics and Temporal Information Retrieval. She has also served as the director of data science at eBay.

Angela Bassa

Angela Bassa is an expert in building and leading data teams. An MIT-trained and Edelman-award-winning mathematician, she has over 15 years of experience across industries—spanning finance, life sciences, agriculture, marketing, energy, and software. Angela leads the Analytics, Data Science, and Machine Learning teams at iRobot, where she helps bring intelligence to a global fleet of millions of consumer robots. She is also a renowned keynote speaker and author, with credits including the Wall Street Journal and Harvard Business Review.

Jeff Hammerbacher

Hammerbacher was one of the first employees of Facebook, responsible for building the data team and developing the technology it required to handle the unprecedented amount of data the firm was generating. After Facebook he founded Cloudera, where he also served as Chief Scientist. His teams were responsible for driving many of the applications of statistics and machine learning at Facebook, as well as building out the infrastructure to support these tasks for massive data sets. He has been a critical contributor to creating three of the most popular open source databases (Cassandra, Hive, and Impala), which have greatly impacted the big data and machine learning industries.

Dr. Eva-Marie Muller-Stuler

Muller-Stuler is the Chief Data Scientist and Analytics & AI practice leader at IBM MEA. She is the author and co-author of numerous papers and publications, as well as a member of various mathematical societies, including Women in Machine Learning. While serving as Chief Data Scientist at KPMG in Decision Science, she led the development of ground-breaking new approaches to building strategic, data-driven, real-time decision engines.

Monica Rogati

Rogati is a leader in the data science community. She was Vice President of Data at Jawbone from 2013 to 2015 and was a senior data scientist at LinkedIn for five years, where she built the initial version of LinkedIn’s job matching system and the first machine learning model for LinkedIn’s “People You May Know” feature. She is currently an equity partner at Data Collective (DCVC) and a scientific advisor to CrowdFlower. She contributes greatly to the field of data science as a speaker and thought leader.

Andrew Therriault

Andrew Therriault has led data science and analytics teams for major organizations including the City of Boston, the Democratic National Committee (as Director of Data Science), and Facebook, with a focus on solving real-world problems. He has spent the past fifteen years working with data as a practitioner, a researcher, an educator, an advisor, and a leader, specializing in local government and Big Tech. He was appointed as Boston’s first-ever Chief Data Officer, leading a team of more than twenty staff, contractors, and fellows working in the areas of data science, data engineering, performance management, open data, and geospatial analytics. In that role he developed strategies, policies, and programs to promote the use of data and analytics throughout city government while managing the associated privacy, security, legal, and political risks. He also guided the design and implementation of new data science and data warehousing platforms to deliver enterprise-grade scalability, reliability, security, and performance for analytics projects.

Jeremy Stanley

Stanley is the VP of data science at Instacart. Instacart is currently applying deep learning to help shoppers work through the shopping list more efficiently. His team uses AI and deep learning to predict the sequence in which shoppers pick items in specific store locations, in some cases saving more than 10% of in-store time. He has been instrumental in applying data science to real-world problems and in showing how AI technologies are enabling the on-demand delivery economy, along with the implications these advancements will have for the retail industry.

Kamelia Aryafar

As Executive Vice President, Chief Algorithms and Analytics Officer, and board member for Overstock.com, Dr. Kamelia Aryafar leads the company’s analytics, data platform, technology, machine learning, data science, data engineering, and algorithmic product functions across the marketing, customer, sourcing, and website verticals. Dr. Aryafar worked at Etsy with different product teams across several platforms to integrate analytics, machine learning (ML), and artificial intelligence (AI) algorithms throughout the organization. Since joining Overstock, her teams have integrated state-of-the-art ML and AI algorithms across various product teams, including search and discovery, personalization, experimental design and analytics, pricing, ranking, sort and navigation, recommender systems, marketing, CRM, advertising technologies, email, sourcing, and supply chain. From the algorithms that power the search experience to the analytics and data strategy and platform used to personalize that experience, she is helping customers discover the most relevant products and information to meet their needs and inspire their wants.

Lili Jiang

Jiang is the Head of Data Science at Quora, where she leads a team of data scientists in Mountain View, CA. Her team delivers data insights and pinpoints growth opportunities related to Quora’s content recommendation system, which curates personalized knowledge for hundreds of millions of users. She has led projects focused on using data insights to identify opportunities in user activation and retention channels. She graduated from Harvard with a degree in Physics. At Harvard, she conducted biophysics research that constructed Markov chain and biased random walk models to understand protein-RNA interaction and energy barriers.

Richard Socher

Socher is the chief scientist (EVP) at Salesforce, where he leads teams working on fundamental research, applied research, product incubation, CRM search, customer service automation, and a cross-product AI platform for unstructured and structured data. He has previously been an adjunct professor at Stanford’s computer science department and was the founder and CEO/CTO of MetaMind, which was acquired by Salesforce in 2016. In 2014.

While these are just a few of the data scientists we have recognized, many more reputable data scientists are building the AI of the future and also deserve recognition. We hope to see more machine learning heroes thrive, and more stars join this list. Hopefully this list will inspire others to join the field, and will give other data scientists a place to learn about new opportunities in AI, whether in a corporate enterprise, in research, or for social causes.