DATA SCIENCE IN ECONOMICS LABORATORY
The course is designed so that students achieve the following learning outcomes:
Knowledge and ability to understand
The course covers methodological and applied aspects of selected data science methods. In particular, students will analyze several types of economic and business data using specific statistical techniques. The course also encourages students to become proficient in the open-source statistical package R.
Ability to apply knowledge and understanding
By the end of the course, students will be able to analyze databases, including large ones, with sophisticated statistical methods, drawing on case studies carried out in the statistical software R. The knowledge acquired will allow them to critically interpret economic and/or business relationships.
1. Network Data Analysis
2. Advanced Regression Methods
3. Cluster Analysis
4. Time Series
1. Network Data Analysis
1.1. Introduction
1.2. Graph types: Directed and Undirected
1.3. Visualization and descriptive analysis of Network data
1.4. Graphic layout
1.5. Connections and Contiguity Matrices
1.6. Metrics and taxonomy of Network data
1.7. Use of Network data in Classification and Forecasting
1.8. Network data collection with R
1.9. Applications
2. Advanced Regression Methods
2.1. Local polynomial regression: non-parametric regression
2.2. Variable selection via penalized regression models
2.3. LASSO
2.4. Quantile regression
2.5. Applications
3. Cluster Analysis
3.1. Introduction
3.2. Distances between Units and between Groups
3.3. Hierarchical clustering
3.4. Dendrogram: visualization of the aggregation process
3.5. Limits of Hierarchical Clustering
3.6. Non-Hierarchical Clustering: The k-Means Algorithm
3.7. Applications
4. Time Series
4.1. Introduction
4.2. Collecting time series data with R
4.3. Components of a time series
4.4. ARIMA models: Identification and Estimation
4.5. Using ARIMA models for forecasting
4.6. Other forecasting methods
4.7. Applications
Course slides.
James G, Witten D, Hastie T, Tibshirani R (2013). An Introduction to Statistical Learning with Applications in R. Springer.
Further readings:
Giudici P, Figini S (2009). Applied Data Mining for Business and Industry. Wiley.
Ledolter J (2013). Data Mining and Business Analytics with R. Wiley.
Shmueli G, Bruce PC, Yahav I, Patel NR, Lichtendahl KC Jr (2018). Data Mining for Business Analytics. Wiley.
Lectures.
R practice and exercises.
Knowledge and ability to understand
Learning is assessed through a written test and an interview. The test consists of theoretical questions and empirical exercises covering the whole program, with particular attention to the use of R in statistical analyses of real case studies. The final grade, expressed on a 30-point scale, takes into account both the test and the interview.
Ability to apply knowledge and understanding
The test and the interview also assess students' ability to apply advanced data science models to specific case studies.
E-mail: benedett@unich.it
In the first semester, the professor meets students by appointment only (benedett@unich.it). In the second semester, office hours are held on Wednesdays from 4 pm to 6 pm in the DEC office, 2nd floor, Viale della Pineta, 4.