22/09/2020 - Digital
We’re hearing more and more about data science. This is a very recent discipline, but one which has grown quickly over recent years. Jérôme Feroldi, data scientist at Smartengo, explains exactly what this expression covers.
Hello Jérôme. What’s the purpose of data science?
Data science aims to explore and analyze large volumes of (digital) data from various sources, using mathematical algorithms and statistics (such as “machine learning”), in order to optimize business processes, assist with decision-making or add value (informative, economic…).
It is much in demand due to the crosscutting nature of the skills it requires. Data science is indeed situated at a crossroads between:
What are the concrete applications of data science?
The main applications of data science can be sorted into four groups: optimization, automation, creation and prediction.
In concrete terms, many of them are present in our daily lives. There’s data science in your search engine, in your YouTube or Deezer recommendations, on e-commerce websites, in self-driving cars and in your voice assistant. It’s also used to filter spam and moderate sensitive content.
In companies, data science is involved in all kinds of jobs and activities: finance, marketing, products, logistics and supply chain, etc. This is naturally also the case at Vallourec, where more than a hundred initiatives based on data science have been identified: at VAM®, in finance, in production (mills) and naturally for Smartengo.
What are the links with Artificial Intelligence and Big Data?
Artificial Intelligence refers to the techniques used to imitate the mechanisms of the human brain: image recognition, predictive models for various phenomena (the weather, purchasing behavior, etc.), filtering abusive comments…
Big Data refers more generally to the enormous volumes of data being processed, along with the computing power needed to handle them (conventional IT tools cannot properly process such quantities). It can be described along three dimensions, the “3 Vs”: “Volume”, linked to the growth in exchanges and the explosion of data (and therefore more servers and personnel); “Variety” of data types; and “Velocity”, meaning real-time collection and processing. Data is the “new oil” of data science.
What do you believe the future holds for this discipline?
It’s reasonable to assume that in the near future more and more environments (services via applications, connected items, etc.) will generate ever greater volumes of data. This will result in more powerful and complex algorithms, but also in cloud environments.
At the same time, it is very likely that two challenges will become increasingly important: the protection of privacy and the limitation/reduction of the energy footprint of these resource-hungry techniques.