BERGAMO
General Information
Activity Period
Syllabus
Learning Objectives
“Data production and analysis” offers a solid background for professional activities in the field of data production, collection, treatment and analysis. It aims to support students in empirical research across different fields (such as statistics, economics and social sciences). The course focuses on the official data production system, within the framework of modernization and coordination among official statistics producers.
On the one hand, the course guides and encourages a professional approach grounded in the need for official statistics, possibly supplemented with other types of available sources.
On the other hand, it introduces the most common data collection and production methods (including both official and survey data frameworks). Such data can be used in various fields (economic and social data, health, and so forth) for various purposes.
Although the course focuses mostly on official statistics, what is learned will provide a solid guide to producing or collecting data in any other field, beyond the official statistics framework. Students will learn to apply adequate methods at the different steps of the data production or collection process (such as quality checks, missing data treatment, imputation procedures and outlier detection) in order to check or enhance data quality. Students will also practice the data production and analysis “cycle”, working on practical cases in both lectures and laboratories.
The ability to process, analyze and model data will also be developed in practice, by learning to use (and to program) one of the most widely used statistical software packages worldwide: SAS.
The main specific course objectives are:
• Students will be able to design and manage data production processes for different fields.
• Students will be able to evaluate and enhance data quality (knowing the main dimensions of data quality, learning how to detect and evaluate data issues, and applying suitable methods to improve data quality).
• Students will learn how to analyse data with a hierarchical structure (multilevel modelling).
• Students will learn to choose critically among, and properly use, different types of data sources (including censuses, cross-sectional or longitudinal surveys, and administrative sources).
• Students will become familiar with the most common business concepts linked to official statistics production (e.g., Generic Statistical Business Process Model - GSBPM, data archiving, metadata management, statistical standard classification, imputation).
The course is fully consistent with the educational aims of the EMOS (European Master in Official Statistics) label, of which it is a milestone, as well as with those of the Master course EDA (Economics and Data Analysis).
Prerequisites
No prerequisites
Teaching Methods
The course includes both lectures and lab sessions, with constant teacher-student interaction and discussion; active participation is encouraged.
Thematic seminars and workshops (e.g., on programming with SAS or on further specialized topics) will also be offered.
Personal research projects may also be proposed to students.
Assessment
The course exam is organized into two parts, corresponding to the topics of Modules 1 and 2; if possible, there will be a gap of at least 5 to 10 days between the two parts of the exam.
For each of the two parts the maximum score is 31 (corresponding to 30 cum laude). The final overall grade is the simple average of the two scores.
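As an illustration only, the averaging rule above can be sketched in a few lines of Python. The function name `final_score` is hypothetical, and since no rounding rule is stated here, the sketch applies none.

```python
def final_score(module1: float, module2: float) -> float:
    """Simple average of the two module scores (no rounding applied)."""
    for score in (module1, module2):
        # Each part is graded out of 31, where 31 stands for 30 cum laude.
        if not 0 <= score <= 31:
            raise ValueError(f"score out of range: {score}")
    return (module1 + module2) / 2


print(final_score(28, 31))  # 29.5
```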
Each module evaluation can be based on:
• Module 1: theoretical written final exam including true/false (T/F) or multiple choice questions, open-ended questions and/or questions of other types (e.g., brief exercises or applications). In addition, the ability to use SAS will also be assessed through a practical data analysis challenge and/or through personal analyses developed by students. The written exam can be replaced by an oral exam covering the same topics.
• Module 2: written exam including theoretical questions (T/F, multiple choice and open-ended questions) and exercises.
The final score of each module's exam can be supplemented by:
• Presentations (by individual students or by groups) of case studies, research results and/or in-depth discussions of specific course topics.
• Evaluation of assignments set by the teachers, including case studies, reports and presentations to classmates.
• Other periodic evaluations during the course.
The average overall exam score will be published on the “sportello internet”; for Module 1, detailed exam scores will also be published on the eLearning platform.
Contents
The course is organized into two consecutive and complementary modules.
Module 1 (teacher: Daniele Toninelli)
• GSBPM (Generic Statistical Business Process Model) step by step: how an official statistics business model is organized and works within the framework of data production processes.
• Role of metadata: how metadata enhance the quality standards of the information produced and support best practices in clearly communicating, sharing and visualizing statistical or quantitative outputs of any type.
• Data editing and imputation methods in practice: how to professionally fix the most common issues with collected raw data, optimizing their quality, their usability and their reliability.
• Introduction to the SAS statistical software by means of a user-friendly interface: how to use SAS Enterprise Guide and first steps in SAS programming.
• Multilevel modelling: when it is useful, how to estimate such models, how to interpret and use the main output for decision making in practical contexts.
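To give a flavour of the data editing and imputation topic listed above, here is a minimal sketch in Python (the course itself uses SAS). It flags outliers with Tukey's interquartile-range fences and fills missing values with the median; these two techniques, the function names and the `incomes` data are all illustrative assumptions, not the specific methods taught in the course.

```python
from statistics import median, quantiles


def iqr_outliers(values, k=1.5):
    """Return the observed values lying outside Tukey's fences."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = quantiles(observed, n=4)  # quartiles of the observed data
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in observed if v < low or v > high]


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical raw data: two missing values and one suspicious record.
incomes = [21, 23, None, 22, 25, 24, 250, None, 20]
print(iqr_outliers(incomes))   # [250]
print(impute_median(incomes))  # [21, 23, 23, 22, 25, 24, 250, 23, 20]
```

The median is used as the fill value because, unlike the mean, it is barely affected by the outlying record.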
Module 2 (teacher: Annamaria Bianchi)
• The steps in the data production process and the decisions required at each step. How to be aware of the interactions between the different steps, and their pros and cons for statistical purposes.
• Probability-based surveys: basic concepts, sampling methods, mode of data collection, errors and total survey error paradigm, quality framework, European Statistics Code of Practice, questionnaire design, non-response analysis and non-response correction methods, estimation.
• Sample selection, estimation and non-response analysis with SAS.
• Non-probability samples: convenience samples, quota samples, volunteer web panels.
• Coverage and self-selection problems in non-probability samples.
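As a rough illustration of the estimation and non-response correction topics listed for probability-based surveys, the sketch below weights a simple random sample and inflates the respondents' weights to compensate for non-response. It is written in Python rather than SAS, uses a single weighting class, and all names and figures are hypothetical.

```python
def ht_total(values, weights):
    """Weighted estimate of a population total: sum of w_i * y_i."""
    return sum(w * y for y, w in zip(values, weights))


def nonresponse_adjusted_weights(design_weights, responded):
    """Inflate respondents' weights so they also represent non-respondents.

    Single weighting class: each respondent weight is multiplied by
    (sum of all sampled weights) / (sum of respondent weights).
    """
    total = sum(design_weights)
    resp_total = sum(w for w, r in zip(design_weights, responded) if r)
    factor = total / resp_total
    return [w * factor for w, r in zip(design_weights, responded) if r]


# Hypothetical simple random sample: n = 5 units drawn from N = 100.
N, n = 100, 5
design_weights = [N / n] * n          # each sampled unit represents 20 units
responded = [True, True, False, True, True]
y_respondents = [10, 12, 9, 11]       # values observed for the 4 respondents

weights = nonresponse_adjusted_weights(design_weights, responded)
print(weights)                           # [25.0, 25.0, 25.0, 25.0]
print(ht_total(y_respondents, weights))  # 1050.0
```

The adjustment assumes that non-respondents resemble respondents within the class; when that assumption fails, the estimate remains biased, which is one reason non-response analysis is treated as a separate topic.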
Other Information
The course will include the launch of the official SAS Certification path: "Towards the SAS Certification". Students will have the chance to start the process of obtaining the SAS Certification in SAS Base Programming.
Additional information about the course is available on the course eLearning page (for the enrollment key, write to the teachers).