Segmenting with big data analytics and Python: A quantitative exploratory analysis of household savings

Festa, G;
2023-01-01

Abstract

According to the national balance sheets of the most advanced economies, despite a recent sharp decline in per capita net wealth, Italian private households remain among the wealthiest and least indebted in Europe. Recently, the COVID-19 outbreak triggered a further surge in household savings worldwide, particularly in advanced economies, including Italy. This study shows that combining advanced analytics tools and big data analytics with information on household saving behaviour may support data-driven decision-making in managing complex relationships in the financial arena. More specifically, using exploratory and predictive analyses based on big data analytics and machine learning, the study aims to provide extensive customer profiling in the Italian household saving sector. Household savings profiles were defined from the information provided by big data analysis. To this end, the first phase of the study considered the hardware and software requirements for data processing, and data collection followed the extract, transform, load (ETL) process. The contribution of the study lies in the data analytics results obtained on a dataset covering the purchasing behaviour of almost 20 million postal savers. The clustering algorithm is highly efficient and scales well to large datasets; K-means clustering, in particular, can be implemented within the MapReduce computational framework. Therefore, the overall procedure proposed here can be easily extended to big data using parallel computing and software implementing MapReduce, such as Hadoop and Spark.
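
The claim that K-means fits the MapReduce framework can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names (`map_phase`, `reduce_phase`, `kmeans_mapreduce`), the tuple-based data layout, and the fixed iteration count are all assumptions made for illustration. The map step assigns each point in a partition to its nearest centroid and emits per-cluster partial sums and counts; the reduce step merges those partials so that new centroids can be computed.

```python
from functools import reduce
import math


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def map_phase(chunk, centroids):
    """Map: for one data partition, emit {cluster: (coordinate sums, count)}."""
    partial = {}
    for point in chunk:
        k = nearest(point, centroids)
        sums, count = partial.get(k, ([0.0] * len(point), 0))
        partial[k] = ([s + x for s, x in zip(sums, point)], count + 1)
    return partial


def reduce_phase(a, b):
    """Reduce: merge two partial results by adding sums and counts per cluster."""
    merged = dict(a)
    for k, (sums, count) in b.items():
        if k in merged:
            m_sums, m_count = merged[k]
            merged[k] = ([s + t for s, t in zip(m_sums, sums)], m_count + count)
        else:
            merged[k] = (sums, count)
    return merged


def kmeans_mapreduce(chunks, centroids, iterations=10):
    """Run a fixed number of K-means iterations over partitioned data.

    `chunks` is a list of partitions (hypothetical layout); each map call is
    independent, so in Hadoop or Spark the partitions would be processed in
    parallel. Here they are processed sequentially for simplicity.
    """
    for _ in range(iterations):
        partials = [map_phase(chunk, centroids) for chunk in chunks]
        totals = reduce(reduce_phase, partials, {})
        # A cluster that received no points keeps its previous centroid.
        centroids = [
            tuple(s / totals[k][1] for s in totals[k][0]) if k in totals else centroids[k]
            for k in range(len(centroids))
        ]
    return centroids
```

Because each partition contributes only sums and counts, the amount of data exchanged between the map and reduce steps depends on the number of clusters rather than the number of records, which is what makes the approach attractive for very large datasets.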

Use this identifier to cite or link to this document: https://hdl.handle.net/11575/141641