About the event
Data in the form of numbers, text, photos and other formats are abundant in the media, on the internet, in social networks, in scientific documents and elsewhere, and help us understand our surroundings. Such knowledge helps organizations, business owners and researchers achieve their goals with greater quality and precision. In the Event on Play with Real Data, teams interested in working with data receive a real data set and analyze it with any analytical method, competing alongside other teams. The judging committee reviews each team's output, and the winning teams are announced at the end of the event. Information on how to participate, the schedule and the evaluation criteria is available on this page. We hope that by participating in this event you will enjoy a pleasant scientific competition.
History of the event
The first Event on Play with Real Data was organized in 2021 by Bu-Ali Sina University, jointly with the Scientific Data Analysis Team, as one of the capacity-building programs for statistics. The event was sponsored by the International Statistical Institute, the World Bank, the International Society for Business and Industrial Statistics, Day Insurance Company, the National Brain Mapping Laboratory, Lund University and the Actuarial Society of Iran. About 350 people registered for the event and more than 150 took part in its competition section. The first event used four data sets in the fields of insurance, medicine, protein structures and time series, and after the competition four teams were awarded certificates and cash prizes.
The second Event on Play with Real Data was organized by Bu-Ali Sina University in 2023. It was sponsored by the International Society for Business and Industrial Statistics, the Statistical Learning and Data Science section of the ASA, TehranRe, the National Center for Health Insurance Research, the Water and Waste Water Company of Hamedan, the Iranian Statistical Society and the Actuarial Society of Iran. The second event used three data sets in the fields of insurance, medicine and transportation, and after the competition three teams were awarded certificates and cash prizes.
New track
In the first and second Event on Play with Real Data, all participants competed together in a single pool. For the third event, the organizers decided to run the competition at two levels:
- Amateur: This level is offered for the first time and is intended for participants who enjoy working with data but have little experience and use simple descriptive tools such as charts, tables and statistical indicators;
- Professional: This level is intended for participants who have experience in data analysis and use analytical models (including statistical models, machine learning models, artificial intelligence and data mining algorithms) in addition to simple descriptive tools.
Outline of the program
Opening webinar:
- Introducing event details and answering questions;
- Scientific speeches;
- Introducing datasets.
- Team arrangement;
- Choosing a data set;
- Conducting the competition;
- Judging the contest.
Closing webinar:
- Event report;
- Announcing winners.
Criteria for success in the competition
- The visual quality of the report in terms of fonts, structure and layout;
- The quality of the report's content in terms of writing style, phrasing and correctness of meaning;
- The correctness of the methodology and of its application to the problem;
- The quality of the oral presentation, compliance with the time limit and readiness to answer questions.
Subject of Data sets
This event considers three data sets in the following areas:
- Medical insurance;
- Drug prescriptions;
- Vehicle accidents.
Further information about the data sets will be given on this webpage and in the opening webinar.
Some notes about the data sets
- The first 30 rows of each data set can be downloaded as an Excel file by clicking on the name of the data set above;
- All three data sets are of a size that can be viewed and managed with common software such as Microsoft Excel:
  - The medical insurance data consists of three sheets: the first has about 290,000 rows and 9 columns, the second about 50,000 rows and 7 columns, and the third about 100,000 rows and 5 columns;
  - The drug prescription data has about 150,000 rows and 18 columns;
  - The vehicle accident data has about 90,000 rows and 30 columns;
- The amateur-level data is a random sample of 50% of the rows described above.
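As an illustration only, the 50% random selection mentioned for the amateur level could be sketched as follows. The function name `sample_half`, the fixed seed and the stand-in row list are assumptions made for this example; the organizers' actual sampling procedure is not described on this page.

```python
import random


def sample_half(rows, seed=2024):
    """Return a random 50% subset of rows, keeping the original row order."""
    rng = random.Random(seed)      # fixed seed (an assumption) so the draw is repeatable
    k = len(rows) // 2             # 50% of the rows, rounded down
    chosen = sorted(rng.sample(range(len(rows)), k))  # sample indices without replacement
    return [rows[i] for i in chosen]


# Hypothetical stand-in for a data set with about 90 thousand rows
rows = list(range(90_000))
amateur_rows = sample_half(rows)
print(len(amateur_rows))  # 45000
```

Sampling row indices without replacement ensures that no row appears twice in the reduced data set.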
Scientific committee
- David Banks, Duke University, USA
- Daniel R. Jeske, University of California, USA
- Javad Faradmal, Medical University of Hamedan, Iran
- Luca Frigau, Cagliari University, Italy
- Ozan Kocadagli, Mimar Sinan University, Turkey
- Paulo Rodrigues, Federal University of Bahia, Brazil
- Rahim Mahmoudvand, Bu-Ali Sina University, Iran
- Seyed Yaser Samadi, Southern Illinois University, USA
- Tahir Ekin, Texas State University, USA
- Ashraf Daneshkhah, Bu-Ali Sina University, Iran
Organizing Committee
- Rahim Mahmoudvand, Bu-Ali Sina University, Iran
- Fatemeh Moameri, Bu-Ali Sina University, Iran
- Zahra Seifi, AGNA, Iran
- Mahlagha Moameri, AGNA, Iran
- Soraya Moamer, Medical University of Hamedan, Iran
Activity | Deadline |
Registration | 06 Oct |
Opening webinar | 07 Oct |
Team arrangement | 08 Oct |
Receive data set file | 09 Oct |
Data contest and report submission | 10-16 Oct |
Review and selection of top teams | 12-19 Oct |
Preparing videos for oral presentation | 19-22 Oct |
Presentation webinar | 22 Oct |
Announcing winners | 23 Oct |
Only webinars
- Free of charge
Webinars and competition
- 15 USD; the payment link will be sent by email after registration
Important points
- To participate in the contest, the registration form must be completed and submitted by 23:59 on 6 Oct (IRST time zone).
- All team members must be registered for this event. Otherwise, the selected data will not be sent.
- It is necessary to respect the principles of professional ethics in conducting the competition.
- Team members commit to using the received data only for this event.
Registration form
Registration is closed
Important points
- To participate in the contest, the following form must be completed and submitted by 23:59 on 8 Oct (IRST time zone).
- All team members must be registered for this event. Otherwise, the selected data will not be sent. The receipt for the registration payment must be uploaded in the form.
- Each team can choose only one data set, and the choice cannot be changed after selection.
- It is necessary to respect the principles of professional ethics in conducting the competition.
- Team members commit to using the received data only for this event.
Team arrangement and data selection
Time is over
Prizes for amateur team winners
First place
150 USD
Second place
125 USD
Third place
100 USD
Sponsors and collaborators
Saman Insurance Company
This company works on different business lines including life and non-life insurance as well as reinsurance. Saman Insurance Company supported us financially.
Ma Insurance Company
This company works in different areas of the insurance industry in Iran and provided a real data set for this event.
International Society for Business and Industrial Statistics
ISBIS is one of the associations of the International Statistical Institute and has supported us several times to date.
TehranRe
This company works in the reinsurance field and supported us for the second time.
Bu-Ali Sina University
This university is a co-organizer of the event.
Statistical Learning Laboratory
SaLLy is a laboratory at the Federal University of Bahia in Brazil. It is a hub for statistical and data science research, training and consultancy, and it supported this event.
National Center for Health Insurance Researches
This center is one of the data providers of the event and supported us for the second time.
Iranian Mathematical Society
This society supported the event for the first time.
Actuarial Society of Iran
This society sponsored the first and second events and supported us for the third time.
Statistical Society of Iran
This is the national statistical society of Iran and supported us for the second time.
International Society for Statistical Engineering
This society's aims and mission are closely aligned with those of the event, and it supported us.