Census project manager Liina Osila: people rightfully expect us to use register data
Most of the preparations for the Population and Housing Census beginning at the end of the year have already been made, but the most important part remains. As the Population and Housing Census project manager, Liina Osila has been overseeing the preparations since the end of last year. She gives an overview of the progress and shares the practical information that people are most keen to know.
Before you joined Statistics Estonia, you managed the crisis helpline project at the Emergency Response Centre. What prompted you to accept a new challenge, and has your experience so far matched your expectations?
The only time I had previously dealt with demography was back in my university days, but the census seemed like an extremely interesting challenge from a project management perspective. Such an opportunity is very rare. When I accepted the new challenge, I expected the job to have a mission and a clear practical output. The census fulfils both of these criteria. I expected this job to be exciting and interesting and not always easy. So far, my expectations have been met.
What has been the biggest surprise in your new role?
This job is very challenging mentally. It is definitely much more exciting than might at first appear. To illustrate, it is like a show that has been in production for almost ten years while the audience will experience it only for a few hours. The upcoming census is even more special as it will mostly be based on data from registers. This means that the people in Estonia will not have to devote too much of their valuable time. Since we are using registers, it might not be obvious to everyone how much work is actually done behind the scenes to make the census a success.
When I joined the process, I was surprised to find out that several key things were still undecided, even though we are now so close to the census and a lot of the preparations have been made. For example, the exact dates of the census were still unclear. The list of characteristics to be studied with the Population and Housing Census has been agreed, but we are still working on the wording of the questions. For me, personally, all of these various issues are very interesting and enlightening.
You started this job right before the beginning of the test survey. How did the census rehearsal go and what were the main takeaways?
In my opinion, the test survey was a success. We received valuable feedback, which was our goal. We wanted to know how respondents understand the questions, whether the questionnaire is user-friendly, and how, given the coronavirus pandemic, to get answers from those who for some reason did not complete the online questionnaire. We had an incident where the online census platform stopped working. This was also an important lesson for us – we need to fine-tune all the technical aspects to prevent such problems during the actual census. In addition, the rehearsal showed that people may be sceptical of interviewers or afraid to talk to them on the phone because they suspect that the interviewer might be part of a scam. All of this is essential feedback for us. These issues have to be dealt with to ensure a smooth census process, and the good news is that we still have time for preparations.
Speaking of time, there are now under 300 days left until the census. How far have you come with preparations? Will you finish on time?
We will finish on time; there is no doubt about that. Ideally, we want to finish everything with time to spare, so that we can go over everything once more before the census and prevent any surprises. At the moment, we are making the final additions and corrections to the questionnaire. In May, we will test all the technological solutions. We will be able to breathe easy once we have made sure that the technical side works. The operational reliability of the online questionnaire is essential, especially considering that the coronavirus pandemic might last for a while. Also, we are still making improvements to the methodology and to various registers, to guarantee the timely transfer of the specific data required. There are other tasks on our list, but we are working as hard as we can to make sure everything goes well during the census.
The coming census can basically be divided into two: one part of the census will be based on registers and the second part will be traditional where each resident can contribute. Let’s start with the register-based component, as this is the first time that register data are used on this scale in Estonia. What does it mean that the census is based on register data?
Certain characteristics – data on people’s educational attainment, ethnic nationality or legal marital status – are already available from various state databases. A register-based census means that if the data are already recorded somewhere, we do not have to ask people for this information in the census. For example, we will not ask whether or where a person works, because that information comes from the Employment Register and the Estonian Unemployment Information System.
Figuratively speaking, someone at Statistics Estonia pushes a button and all these data on individuals arrive straight from existing registers. Of course, it is not really that simple, but the use of registers means that a person does not have to do much as a respondent. We hope that each person finds a moment before the census to check that their details in the Population Register are correct.
If the data are available and can be retrieved from registers at any moment, why do we still need a separate full census?
One of the principles of censuses is a pre-agreed census moment, that is, the critical moment to which all the collected data should refer. This ensures comparable data as at a specific moment in time. Another principle is that censuses must be universal and regular. Again, this ensures that the data can be compared across time and analysed, with the output used to make the right decisions. It will provide a thorough overview of how life in Estonia has changed over ten years.
Which data are obtained from registers, that is, what are the things that people will no longer be asked about this time?
Most of the data are available in various databases. People might recall that in previous censuses they had to complete a long questionnaire of several pages, whereas now the self-reported component fits onto a single page. In total, there are nearly 30 databases in Estonia that can be used for census purposes. Some characteristics are based on combined data from multiple registers. For example, if different registers show different educational attainment for a person, we will compare the sources and use the most recently updated data. In the case of mother tongue, we will compare the data in the Estonian Education Information System and in the Population Register, and we will also refer to the data of the 2011 census and the information from the Police and Border Guard Board.
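The rule described above for conflicting sources (compare the registers and keep the most recently updated value) can be illustrated with a minimal sketch. This is not Statistics Estonia's actual procedure; the record structure, the register names and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class RegisterRecord:
    """One value for a characteristic as reported by one register (hypothetical structure)."""
    source: str    # name of the register, illustrative only
    value: str     # e.g. a person's educational attainment
    updated: date  # when the register last updated this value


def most_recent_value(records: Sequence[RegisterRecord]) -> Optional[str]:
    """Return the value from the most recently updated record, or None if there are no records."""
    if not records:
        return None
    return max(records, key=lambda r: r.updated).value


# Two hypothetical registers disagree about one person's educational attainment.
records = [
    RegisterRecord("register_a", "upper secondary", date(2015, 6, 20)),
    RegisterRecord("register_b", "bachelor's degree", date(2019, 6, 15)),
]
resolved = most_recent_value(records)
```

Here the value from `register_b` is kept because it carries the later update date. A real reconciliation would also have to decide what to do when update dates are missing or equal, which this sketch does not address.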
What are the other registers that will be used in the census?
All the registers used in the census are listed on the census website at rahvaloendus.ee/en. The Population Register and the National Register of Buildings are the most familiar ones, but we will also obtain data, for example, from the Causes of Death Register, the Traffic Register, the Land Register, the e-File system etc.
How are the register data obtained?
We have agreements with all the registers specifying which data we require, when and in which format the data must be submitted, and what the data quality should be. At the agreed time, all the data will be transferred through the secure X-tee environment to our database. With some databases, we have yet to complete the transition to automatic data transmission via X-tee, but there are just a few of these databases and we are working on ensuring automatic data exchange with those sources as well.
You have mentioned data quality a few times. What does quality mean when it comes to data?
Quality means that the data are accurate and true, and allow researchers, politicians and others to draw the right conclusions. As an example, take the recent initiative where the eesti.ee portal was used to invite people to get the coronavirus vaccine. There were many people whose profile in that portal specified an email address that was no longer used or had not been forwarded. It means that the data quality was poor, as the data were not accurate and did not help as much as expected.
There are those who doubt the quality of register data. What would you say to them and what has Statistics Estonia done to ensure the quality of data in the relevant registers?
As it was known that this census would be based on registers, the improvement of databases began even before the 2011 census. Between the previous population census and today, there have been 16 million euros’ worth of investments, and half of these have been made by ministries to improve registers and data quality. In addition, we have conducted two pilot censuses to test the quality of data. Thus, we have known the problematic areas for a long time and we have actively addressed these issues. Nothing has been left to chance. We are confident that the data in Estonian state registers are of very high quality and provide a true picture of our people and their dwellings.
How exactly have these investments improved data quality?
The majority of the investments have been related to the development of automated data transmission. This ensures that we will not have to process data as Excel tables and that information will be transmitted automatically, without errors, via correct and secure X-tee channels. Much has already been accomplished. For example, as a result of the (still ongoing) development of the Register of Buildings, a huge amount of accurate data has been added to the register regarding the year of construction, size, water supply, heating and other details of buildings. Also, the register-based census was a key impetus for employers to start recording each employee’s occupational title and workplace address in the Employment Register. This, in turn, allows more efficient collection of remuneration and workforce data, leading to more accurate statistics on wages and the labour market.
When it comes to the quality of register data, Statistics Estonia also has a request for people. What exactly would you like everyone to do?
We ask people to review and update their data. Of course, people cannot access all the registers themselves, but the most important thing to do is to check your data in the Population Register. Go and see whether your data are all available in that register and all correct. People have generally become cautious about sharing their data, but most of us do not really know what kind of data about us are stored in different registers. According to our latest public opinion survey, just 17% of people had logged into the e-Population Register over the last year. However, on a positive note, 87% of the respondents were prepared to update their data in that register. We really hope that people will take the time to review their data.
As we know, there are some data that are not available in registers but are still required for the census. Let’s talk about the survey component of the census. What will people be asked?
We apply the once-only principle in data collection. It means that if the data are available in a register, we will not ask people to provide the same data once more. We only ask for data not found in registers. We ask some things based on a person’s perceptions and beliefs, because it has been nationally agreed that such information on certain issues is required. There are just a few things like that: we have questions about knowledge of languages and dialects, religion, and health-related limitations. At the moment, we are still testing whether registers provide enough reliable data about migration. If the quality is suitable, we will use registers for this information. Otherwise, we will ask a question about it.
If you are conducting a general census, why not ask people more questions?
The census has its own purpose; it is not a big survey where we ask things just out of interest. We believe that before we ask a question, the users of that information – politicians, researchers, public servants – must be able to explain why they need such data and which decisions they cannot make without these data. We are the ones to reach out to people and they want us to tell them why. Therefore, it is our job to thoroughly consider what we are going to ask people, and it is our duty to ask the stakeholders for their justification.
Also, we are a digital society and people rightfully expect that they will not be asked for data already available in a register somewhere. Asking for something more than once affects our reputation and reliability, which are the key to people trusting us with information in the first place. We will not get their trust if we cannot explain what the data will be used for, which problem is going to be solved, or what the immediate gain is for the individual sharing their data.
It took a long time to decide whether the survey component of the census would include the entire population of Estonia or just a certain part of it. This matter was settled at the end of February. Please explain who will be participating in the survey component of the census.
The combined method means that data on all residents of Estonia are collected from registers, and there will also be a sample survey with those 5 to 7 separate questions on knowledge of languages and dialects, religion, and health. The survey will be obligatory for a certain number of people randomly sampled across Estonia – the sample size will be about 60,000 people. Everyone will receive an e-mail invitation to participate in the e-census. People living at the addresses included in the sample will additionally be notified by mail. At first, everyone can give the answers online. If the respondents in the sample do not complete the e-census, they will be contacted by our interviewer for a phone or face-to-face interview depending on the circumstances. I want to stress that all the people of Estonia can answer the online questionnaire. We encourage everyone to do it. Everyone’s answers count and are very important.
Why should someone who is not in the sample go and voluntarily complete the online questionnaire?
The more answers we get, the higher the quality and accuracy of our data. The selected sample ensures the necessary level of quality and accuracy, but each additional answer further improves the final results.
What led to the decision to not survey the full population? Does the sample survey satisfy Statistics Estonia?
The potential advantages of both methods were assessed and this was found to be the most sensible solution. A full survey is significantly more expensive but, at the same time, does not guarantee greater quality of the results proportional to the expenditure. Also, Estonia is a digital society and this means keeping databases up to date. It is clear that the fast availability and analysis of high-quality data is already crucial and will become increasingly important in the future, which is why we must take these steps.
Does the sample survey provide a good enough picture of Estonia as a whole?
Yes, it does. This method is fully accepted internationally and meets all the quality requirements defined for us.
What about people who are not in the sample? How can they be motivated to participate, even though they are not required to answer the questionnaire and will not be contacted by an interviewer?
By participating in the e-census, everyone contributes to providing reliable data for decision-making, thus ensuring better decisions about our everyday life. The greater the accuracy of our data, the better the information that our decisions are based on. For example, we will be able to determine the areas with a higher share of residents with health-related limitations, we will know their age, and can use this information to plan and improve health care and social services.
What happens if someone included in the sample is unwilling to answer?
If a person has been chosen for the sample of our survey component, they are obliged to participate under the Official Statistics Act. If they refuse to give answers, they are actually in breach of the law. The Official Statistics Act stipulates that respondents are required to answer all questions of a census and provide true and complete answers, with the exception that questions about beliefs are answered on a voluntary basis. This means that a person may choose not to answer questions about religion or health, for example.
Why should people answer the questions about health and religion?
The questions about health and religion have been added due to domestic need. Statistics Estonia conducts censuses in accordance with the Official Statistics Act which stipulates that, among other things, religion and the existence of a long-term illness or health problem and its impact on normal activities are part of the data additionally collected and processed due to the domestic need for official statistics. The existence of a health problem and religious affiliation are based on perceptions and beliefs, and such information is not collected in registers. Therefore, questions about these aspects have been added to the sample survey questionnaire of the census. Health data allow the state as well as local governments to better plan health care and social services, for example. The questions regarding health and religion are voluntary, meaning that respondents may choose not to answer.
What will a person gain by spending their time on answering the questionnaire?
I agree that policy-making is a complicated process where it is often hard to see the clear link between the use of specific data and the ensuing decision. Still, I would say that answering these questions is an investment, not a waste of time. All of us want the decisions regarding our everyday life to be informed and based on high-quality and accurate data. That is why everyone’s contribution matters – the more true and complete answers we get, the higher the quality of the collected data, allowing us to make decisions that address actual needs. We ask policy-makers and researchers to explain why they need specific data, in order to make sure that people are not burdened unnecessarily.
How do you plan to reach the people in the sample? We know that the contact details for some people may be missing in the Population Register. Is that going to be a problem?
For most people, the contact information is available in the register. But we plan to run publicity campaigns encouraging people to update their data in the Population Register. However, if the people living at the sampled addresses cannot complete the online questionnaire, for various reasons, they will receive a visit from an interviewer. During the census, we are likely to find out that there are some addresses where no one lives. This is also important information in the context of the Population and Housing Census.
How many answers do you hope to receive online, and how many people will have to be contacted by interviewers?
Ideally, we would, of course, like everyone to complete the survey online. But that is not realistic and it is completely understandable that this might not be an option for everyone. Last time, the census questionnaire was completed online by 67% of the population. This time, we want to achieve a rate higher than that, to live up to our reputation as a digital society. People only have to answer a small number of questions, which should take less than 10 minutes. I hope that everyone finds the time. We have estimated that we will need about 130 interviewers to complete the survey component of the census.
As for the interviewers, do you plan for them to also make home visits during the census or will they only contact respondents over the phone?
We prefer communication over the phone, especially in the current circumstances. During the test survey, the interviewers did not physically visit any respondents. At this stage of preparations, we are considering various options and going through all the scenarios.
The official census moment is 31 December 2021, but when does the census exactly take place? When can people access the e-census? When are they interviewed by phone or face-to-face?
We will receive the data from registers at the beginning of 2022 over several months, but all the data will refer to the census moment. The census questionnaire, which everyone can access online, can be answered from 28 December 2021 until 15 January 2022. After that, it will take a few days to determine the respondents in the sample who did not complete the e-census. And then, from 22 January to 20 February, interviewers will contact these remaining respondents.
What is the census moment? Will an interviewer knock on someone’s door at midnight on New Year’s Eve?
It will not be like that. It means that all the data must be reported as at the official census moment. If a person answers the questionnaire on 10 January, their answers must reflect the situation as it was at 00:00 on 31 December. For example, if a child was born to the respondent on 5 January, this fact will not be recorded in the census because the child had not been born yet as at 31 December. The data are collected as at a specific moment to ensure comparability, including internationally.
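The census-moment rule can be expressed as a simple date comparison. The sketch below is only an illustration, assuming the census moment of 00:00 on 31 December 2021 and assuming that an event falling exactly on the moment counts; the function name is hypothetical and the example times of day are made up.

```python
from datetime import datetime

# Census moment as stated in the interview: 00:00 on 31 December 2021.
CENSUS_MOMENT = datetime(2021, 12, 31, 0, 0)


def counts_in_census(event_time: datetime) -> bool:
    """Return True if an event (e.g. a birth) had already happened at the census moment.

    Assumption: an event exactly at the census moment is counted.
    """
    return event_time <= CENSUS_MOMENT


# A child born on 5 January 2022 is not recorded, even if the parent
# fills in the questionnaire on 10 January 2022.
born_after = counts_in_census(datetime(2022, 1, 5, 12, 0))

# A child born on 30 December 2021 is recorded.
born_before = counts_in_census(datetime(2021, 12, 30, 8, 0))
```

The date on which the questionnaire is answered plays no role in the check; only the time of the event relative to the census moment matters.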
How many interviewers will Statistics Estonia need? When will recruitment begin?
Based on our calculations, we will use about 130 interviewers. In addition, we have 40 interviewers already employed by Statistics Estonia. So, there will be about 170 interviewers in total during the population census. The recruitment will begin in the autumn.
When will the census results be complete and made public?
The exact timeframe has not been agreed yet, but the results will be published bit by bit. In general, all the information will be compiled and published no later than within the year following the census, that is, in 2022.
What will the census data be used for? What changes in Estonia will the data lead to in the next decade?
The ultimate use of the data depends primarily on researchers, politicians and other stakeholders who will rely on the data in decision-making, thereby shaping the everyday life of all of us. The fact is that with better databases and more accurate data they can make better decisions. The current coronavirus crisis has shown that data quality is essential, whether in connection with mask distribution, vaccine roll-out or allowing travelling to Estonian islands or abroad.