A Glimpse into The Future of Data Science

Juan Manuel Ortiz de Zarate

19.03.2021

TABLE OF CONTENTS

How did we get here?

The Data Science Revolution

What's next?


According to many prominent thinkers, we are currently living through the Fourth Industrial Revolution, a rapidly transformative period that—much like its steam-powered, electric, and computer-driven predecessors—promises to completely revolutionize how our society functions.

New technological trends like the Internet of Things (IoT), Big Data, and Machine Learning are ushering in “changes so profound that, from the perspective of human history, there has never been a time of greater promise or potential peril,” asserts Klaus Schwab, executive chairman of the World Economic Forum and popularizer of the term. In this article, we explore the history of this revolution, provide some current examples, and consider some of its future implications for data science specialists.

How did we get here?

Google search interest in the terms "Data Science" and "Machine Learning" since 2004

The theories that lie at the heart of recent advances in Data Science have existed for a long time, some of them for centuries. For example, Bayes’ theorem on conditional probability dates to the 18th century, the Gaussian distribution dates to the early 19th century, the Pearson correlation coefficient to the end of that century, and Markov chains to the early 20th century. So, why did it take so long for all these incredible advances to arrive? The simple explanation is that only recently has the technological infrastructure (i.e. big computing, cloud storage, and heightened bandwidth capacity) been developed to truly harness these theories. Data science has also come to include many frameworks; if you are interested in the top 10 Python frameworks for data scientists, make sure to check our article.

At the beginning of the past century, computing was done by electromechanical machines that required physical movement to execute simple operations. As a result, these proto-computers managed fewer than 0.001 calculations per second, far below what is required to produce artificial intelligence. Fortunately, computing capacity has grown at a consistently exponential rate since then, now reaching an incredible 5 billion calculations per second with the Intel Core i7 QUAD processor.

Source: Ray Kurzweil, "The singularity is near: When humans transcend biology"

Similarly, storage capacity has also increased exponentially, rising from a mere 10 megabytes in 1980 to hundreds of gigabytes in many devices today. And that is without counting cloud storage, which is functionally unlimited. By the same token, international bandwidth use in 2004 was less than 1 terabyte per second, while now it is more than 1,000 times that.
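To get a feel for what "exponential" means here, the figures quoted above can be converted into a number of doublings. The following is a minimal Python sketch; the helper name `doublings` is our own, and the inputs are simply the round numbers from the text, not measured data.

```python
import math


def doublings(start: float, end: float) -> float:
    """Return how many times `start` must double to reach `end`."""
    if start <= 0 or end <= 0:
        raise ValueError("both values must be positive")
    return math.log2(end / start)


# Figures taken from the article text (rounded, illustrative only).
compute = doublings(0.001, 5e9)  # calculations per second, then vs. now
bandwidth = doublings(1, 1000)   # "more than 1,000 times" the 2004 level

print(f"Computing speed: about {compute:.1f} doublings")
print(f"International bandwidth: about {bandwidth:.1f} doublings")
```

Read this way, the jump in computing speed corresponds to a few dozen doublings, which is why steady exponential growth over a century produces such a dramatic gap.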

As a consequence of such bandwidth growth, people can constantly connect online via their phones, tablets, smart TVs, watches, and (of course) personal computers. Such connections generate information continuously, which is then sent to cloud servers and saved for later analysis (made possible by the aforementioned storage capacity growth). That enormous amount of information can then be harnessed to train complex mathematical models that predict future behaviour.

You might not be aware of the scale at which your information is collected, or that your actions are being used to build predictive machinery. However, all of us are the guinea pigs of these AI (Artificial Intelligence) models. Some of the most illustrative examples are social networks like Facebook, Twitter, Instagram, and TikTok, which are constantly learning your preferences, tastes, and opinions. This information then shapes the content they show you in an attempt to maximize your engagement; after all, the more you use their application, the more they increase their returns.

Social networks are not the only area where AI is being used to improve services; there are many, many more. Below we will highlight some other examples of how Data Science impacts our reality. Are you curious about joining the Data Science Revolution? Here at Pangea, we have many companies specializing in Data Science, so simply tell us what you need and we’ll connect you with up to 5 companies within 72h that match your needs, all for free!

The Data Science Revolution

Making the National Football League Smarter

To begin with, we are going to talk about sports analytics, specifically as used in the NFL. As any fantasy football aficionado will tell you, a staggering amount of data has been collected over the years about these games, including stats on both team strategy and individual performance. Recently, advances in technology—like chips placed in players’ shoulder pads—have allowed teams to measure the exact speed and position of a given player on the field.

Having all this information available has generated radical changes in teams' strategies. One of the first teams that used analytics to re-think their movements was the Philadelphia Eagles, who have embraced tracking and math-based decision making in everything from player and scheme fit to training and fitness monitoring. But perhaps the most important change was incorporating statistical analysis into in-game decision making.

Source: Diego Wedgwood / Pangea.ai

After analyzing historical data, the analysts realized what many armchair statisticians have been saying for years: in many (if not most) fourth-down situations, it is preferable to risk a final play rather than kick a field goal or punt the ball away to the opponent. The Eagles became one of the most aggressive teams on fourth down and even had an analyst on hand during games to tell the coach when to go for it.
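The reasoning behind that advice can be framed as an expected-value comparison. The sketch below is an illustration only: the function names are our own, and the probabilities and point values are hypothetical placeholders, not real NFL data or the Eagles' actual model.

```python
def expected_points_go(p_convert: float, ep_success: float, ep_fail: float) -> float:
    """Expected points from attempting the fourth-down conversion."""
    return p_convert * ep_success + (1.0 - p_convert) * ep_fail


def break_even_probability(ep_success: float, ep_fail: float, ep_kick: float) -> float:
    """Conversion probability above which going for it beats kicking."""
    return (ep_kick - ep_fail) / (ep_success - ep_fail)


# Hypothetical values for a single game situation (assumptions, not real data).
P_CONVERT = 0.60   # chance of converting the fourth down
EP_SUCCESS = 3.5   # expected points if the conversion succeeds
EP_FAIL = -1.5     # expected points if it fails (opponent takes over)
EP_KICK = 0.3      # expected points from punting or kicking instead

go_value = expected_points_go(P_CONVERT, EP_SUCCESS, EP_FAIL)
threshold = break_even_probability(EP_SUCCESS, EP_FAIL, EP_KICK)
decision = "go for it" if go_value > EP_KICK else "kick"

print(f"Expected points when going for it: {go_value:.2f}")
print(f"Break-even conversion probability: {threshold:.2f}")
print(f"Decision: {decision}")
```

With these made-up inputs, going for it is worthwhile whenever the estimated conversion probability is above the break-even threshold; a real model would estimate each input from historical play-by-play data.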

During the 2017 season, this advice directly contributed to the famous “Philly Special” trick play that helped the Eagles secure the first Super Bowl win in franchise history. In a copy-cat league like the NFL, it wasn’t long before other teams adopted this strategy, resulting in a 40% increase in fourth-down attempts between 2017 and 2019. Indeed, with two teams getting knocked out of the most recent playoffs partially due to suboptimal fourth-down strategy, analytics will no doubt be even more in demand in football.

Improving mental health

One of the most complex tasks in psychiatry is to diagnose and treat serious mental illnesses. The difficulty lies in the absence of precise and objective clinical tests of the kind used in other areas of medicine. As a result, many patients are misdiagnosed: they are either told they have a mental illness they do not have, or an illness they do have goes undetected.

However, a recent article published in the prestigious scientific journal World Psychiatry suggests a hopeful solution. The authors have developed a machine learning model that can detect a psychotic break more than two years in advance with an accuracy of 83%, far higher than the 30% rate obtained by doctors. This technique could also be applied to other conditions, such as Parkinson's disease, dementia, depression, or bipolar disorder.

The technique consists of a natural language processing (NLP) model that quantifies the characteristic language disturbances of patients with schizophrenia. NLP is one of the most popular areas of machine learning, best known for applications such as the virtual assistants Siri and Alexa, but here it shows that AI also has the potential to deliver very real benefits for healthcare.

Deciding about our freedom

In recent years, artificial intelligence has also impacted criminal law in the United States, where machine learning models have become tools that help judges set sentences in at least 14 states. The most popular of these systems, COMPAS (Correctional Offender Management Profiling for Alternative Sanctions), analyzes historical data from North American prisons to predict the probability that a subject will reoffend. To do this, it takes into account a series of criminogenic factors (causes of crime) observed in populations with characteristics similar to those of the person being evaluated.

The idea behind implementing this tool was to improve the productivity of judges by automating "routine" jobs that often come in large quantities. However, in practice the results have been damning: a group of analysts tested the model and determined it had a clear bias, predicting a much higher recidivism score for black people than white people. The reason behind this bias was the sad fact that prison populations are disproportionately black in the United States, due to its complex history of racism. So in this case, the automation unintentionally uncovered systemic bias in the historical rulings of judges that the algorithm was trained on.
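One common way to check a risk model for this kind of bias is to compare its error rates across groups, for instance how often people who did not reoffend were nevertheless flagged as high risk. The sketch below illustrates that idea only; it is not the analysts' actual method, the group labels "A" and "B" are hypothetical, and the records are synthetic, not COMPAS data.

```python
from collections import defaultdict


def false_positive_rates(records):
    """Compute the false positive rate per group.

    `records` is an iterable of (group, predicted_high_risk, reoffended).
    A false positive is someone flagged as high risk who did not reoffend.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            negatives[group] += 1
            if predicted_high_risk:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


# Synthetic records for two hypothetical groups (illustrative only).
records = [
    ("A", True, False), ("A", True, False), ("A", False, False),
    ("A", False, False), ("A", True, True),
    ("B", True, False), ("B", False, False), ("B", False, False),
    ("B", False, False), ("B", True, True),
]

rates = false_positive_rates(records)
for group, rate in sorted(rates.items()):
    print(f"Group {group}: false positive rate = {rate:.2f}")
```

A large gap between the groups' rates would suggest that the model's mistakes fall more heavily on one group, which is the kind of disparity described above.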

What's next?

As we have seen, Big Data and Machine Learning are already affecting our lives in both positive and negative ways, but what does the future hold in store? Read on to find out, but if you want to hear from an expert, just tell us what you need and we’ll connect you with up to 5 companies within 72h that have plenty of Data Scientists on staff!

According to the neuroscientist and philosopher Sam Harris, human beings will never stop looking for improvements in knowledge and technology, considering the value they provide to us as a species. Indeed, the policies of most governments and corporations are overwhelmingly directed towards growth and automation, so it seems unlikely that the rate of progress and innovation will slow any time soon.

Thus, as humans continue to develop increasingly intelligent machines, sooner or later we will reach the inevitable “singularity,” or a future where machines achieve an intelligence equal to or greater than that of the human being and are capable of learning on their own and improving themselves without needing our input.

As we have seen in this article, current AI models require human guidance in selecting the datasets they are trained on. In practice, this means that while these models can perform some tasks faster than the human brain, they still cannot do everything we can. (If you need data scientists to help with your AI, make sure to read our article about the best practices when hiring data scientists.) But what happens when the algorithms have made us obsolete?

On this, the experts in the field disagree. On the one hand, there is philosopher Nick Bostrom, who argues that if we instill AI with human values, everything will be fine. On the other hand, notable figures like Bill Gates, Steve Wozniak, and Stephen Hawking have warned that "the development of full artificial intelligence could spell the end of the human race."

So which vision of this science fiction is correct?

While it is impossible to know, perhaps a new discourse on these topics, the philosophy of artificial intelligence, can help. In this new field, there are three currents of thought about what the not-so-distant future holds for us:

1. Artificial intelligence models will serve as a mirror that makes us see the terrible errors and biases in our own behavior. This will prompt a deep reflection on humanity that will make us a better society.
2. Artificial intelligence will sharpen and worsen our biases and errors, inevitably leading to chaos and significant social deterioration.
3. The singularity will produce machines better than us, and they will teach us how to do things in a better way. Or they will simply decide to erase us because we are beyond repair...

What do you think? Will artificial intelligence, machine learning, and data science make our lives better or worse? Is the future already written, or can we still steer the course towards a better destination? One way or another, we will find out soon enough!
