Collecting accommodation data via the Internet

- the Norwegian experience

by

Anne Mari Auno

Senior Executive Officer, Statistics Norway

tel: +47 62 88 54 01, e-mail:

for

The 7th International Forum on Tourism Statistics, 9-11 June 2004, Stockholm

Theme no: 5

Abstract:

Over the two previous years, Statistics Norway has made several moves to increase the response rates, improve the quality of the input data, decrease the response burden and increase the timeliness of our accommodation statistics. The main changes have been the introduction of a fine for non- or late-respondents, optical registration of paper questionnaires, receiving reports directly from the establishments' booking systems and receiving questionnaires via the Internet.

We evaluate in this paper the impact of collecting questionnaires via the Internet on the response burden, the data quality and the time of response compared with other types of data collection used in the statistics on hotels and similar establishments. The overall conclusion is that all targets have been met and that the web portal has been a success.

Contents

Background

IDUN - the web portal

Response rates

Time of response

The size of the Internet respondents

Data quality

Response burden

Operating the production system

Conclusion

Background

Over the last few years, Statistics Norway has made several moves to increase the response rates, improve the quality of the input data, decrease the response burden and increase the timeliness of our accommodation statistics. The main changes have been the introduction of a fine for non- or late-respondents, optical registration of paper questionnaires, receiving reports directly from the establishments' booking systems, receiving questionnaires via the Internet and publishing the results free of charge on the Internet, including in StatBank Norway.

These changes are not unique to the accommodation statistics. They are part of overall changes in Statistics Norway, founded on, inter alia, the need to shorten the time between the collection and the dissemination of data and to minimise the response burden. One means of achieving this is to use information technology more efficiently. Statistics Norway aims to offer an electronic option for delivering data for all statistics by mid-2004. By electronic options we mean everything from specially designed software to reporting the answers by e-mail.

This paper focuses on our experience with receiving questionnaires for the statistics on hotels and similar establishments via the Internet, based on the first eight months of operation.

The accommodation statistics were among the first statistics in Statistics Norway to use the Internet to collect data. This area was chosen partly because tourism businesses were thought to be highly willing to use the Internet, so that the system would be tested thoroughly. In addition, it was necessary to offer methods that enable the establishments to answer on time. The answers are due by the fifth of the following month. Those who do not answer within 20-25 days are fined. Until 2003 we accepted self-made reports and answers sent by telefax. Once we started reading the questionnaires optically, the only acceptable options were our own questionnaires returned by post and data reported directly from the booking system. The latter method enables the respondents to answer quickly, but has been adopted by only a few hotels.

IDUN - the web portal

For the reference month of July 2003 the web questionnaire for the hotel statistics was launched through the IDUN web portal. IDUN is an abbreviation for information and data exchange with enterprises, and is intended to be a scalable and general solution for web-based data collection from enterprises.

To get started the establishments need a computer with Internet access. As all data, both drafts and final reports, are stored centrally at Statistics Norway, one does not have to use the same computer each time. All data are encrypted.

To log on the establishments use a username and a password, which are the same for all surveys. After entering the correct survey, the establishments have to verify the information about the business, such as postal address and location.

On the next page one fills in the questionnaire, which is built up identically to the paper questionnaire. The major difference is that there are checks controlling that all boxes that should be filled in have been filled in and that the answers are consistent and reasonable.

The checks have two alternative outcomes: definitive errors, which are mistakes that have to be corrected before the data can be sent, and possible errors, which indicate that some of the data may be wrong but do not prevent sending. Ideally, we would like the respondents to return perfect data only. However, by making the checks definitive we may end up with no data at all, or with the respondent choosing to return the questionnaire by post with all the errors still on it. As definitive checks make the respondents do the data revision for us, they would also increase the response burden, which is the opposite of our intention. Instead we have kept the definitive checks to a minimum and made warnings pop up when a box that should have been filled in is empty, when the totals do not add up and so on. In this way we aim to make the respondents improve their answers. The following analysis will show the results so far.
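The distinction between the two outcomes can be illustrated with a minimal Python sketch. It is not the IDUN implementation: the field names (`reference_month`, `rooms_available`, `guest_nights_total`, `guest_nights_by_purpose`) and the choice of which check is definitive are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CheckResult:
    # Definitive errors block sending; possible errors only raise warnings.
    definitive: list = field(default_factory=list)
    possible: list = field(default_factory=list)

    @property
    def can_send(self):
        # The questionnaire may be sent as long as no definitive error remains.
        return not self.definitive


def check_questionnaire(answers):
    """Run hypothetical checks on a dict of questionnaire answers."""
    result = CheckResult()

    # Definitive check (assumed): the month of reference must be stated.
    if not answers.get("reference_month"):
        result.definitive.append("reference_month is missing")

    # Possible check: a box that should have been filled in is empty.
    for name in ("rooms_available", "guest_nights_total"):
        if answers.get(name) is None:
            result.possible.append(f"{name} is empty")

    # Possible check: the parts should add up to the stated total.
    parts = answers.get("guest_nights_by_purpose")
    total = answers.get("guest_nights_total")
    if parts is not None and total is not None and sum(parts.values()) != total:
        result.possible.append("guest nights by purpose do not add up to the total")

    return result


# Hypothetical report: the parts sum to 800, but the stated total is 900.
report = {
    "reference_month": "2003-07",
    "rooms_available": 40,
    "guest_nights_total": 900,
    "guest_nights_by_purpose": {"business": 300, "holiday": 500},
}
outcome = check_questionnaire(report)
print(outcome.can_send)
print(outcome.possible)
```

In this sketch the inconsistent total produces a warning, but the report can still be sent, which mirrors the decision to keep definitive checks to a minimum.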

In addition, for questions the respondents often find difficult, there are pop-up textboxes explaining the question further. If one needs even more information, there are both links to extensive written instructions and telephone numbers for technical support and support related to the questionnaire itself.

Finally, the data can be sent off, and the respondents get a transcript of their answers and a reference number on the screen. The data are downloaded every night from the central area to the production system.

Response rates

Apart from the web portal, Statistics Norway has made further moves to increase the response rates for the accommodation statistics. The most significant change was the introduction, from February 2002, of a fine for all non-respondents, i.e. everyone not answering at all or answering after the final figures are made. This is the main reason why the response rate has increased. Still, we experienced an all-time high in the first month with the web portal.

In the figure on the next page the means of replying are illustrated. 'Optically read' and 'manual registration' are both answers returned on paper, but the latter are either telefaxes, self-made reports or questionnaires that cannot be read optically, for instance because red ink was used or the questionnaire was torn.

Here one can see that the share of those reporting via their booking system is increasing, although slowly, and that the share of those preferring the Internet has become significant in the data collection process. After six months with the web portal, one in four reported their figures electronically.

At the same time, the share of questionnaires having to be registered manually was down to 17 per cent. This is less than a third of the share registered manually two years earlier.

An unforeseen advantage of the electronic options is that hotels that have not received a questionnaire report figures to us if they are in business. We send questionnaires to those that have told us at an earlier stage that they will be open in the month concerned. Previously, some hotels rang to inform us about changes and asked for a questionnaire too. The number has however increased lately, improving our population.

Time of response

The users of the hotel statistics have high demands with regard to data freshness. The current target is to publish final figures four to five weeks after the month of reference. Consequently it is important that we receive the answers at an early stage, enabling us to revise the data properly.

In the next figure one can see the minimum, maximum and average share of responses by time and means of reply for the period July 2003 to February 2004 inclusive. The measuring points are the date when the answers are originally due and the date after the first reminder respectively.

The respondents reporting via their booking systems were the fastest ones to reply overall. After five days 68 per cent had answered on average for both electronic options. This was better than the best month for those answering by post, and far better than their average response time.

The differences were less evident by the 15th. Of those reporting via their booking systems, 97 per cent on average had answered after a fortnight. The respondents reporting via the Internet were not far behind in speed. As there were twice as many respondents answering by post as electronically, the overall average was 90 per cent by the 15th.

We have also found that the data collection via the Internet is very vulnerable to computer and network problems. In a couple of the months our system was down for a day or two during the first five days of collecting data, resulting in the respondents returning the answers by post instead. The time before the data can be loaded into our systems is then delayed firstly by the post and secondly by the questionnaires having to be opened manually and read optically. The latter is done three times in the data collection period to optimise the routines both of those registering the data and of those revising the data. With the current routines, the time from when the respondent returns the questionnaire until it is loaded into the production system will be anything between three and ten days.

When the web portal becomes more stable, it is fair to believe that both the share of respondents answering by this means and the time of response will be influenced positively.

We also found that the average response rate for those answering by post has decreased significantly after launching the electronic options. In the period 2000 to 2002 the average response rate by the fifth was 65 per cent. For the period used in the figure on the previous page the average was only 51 per cent. From this, one can conclude that those already replying quickly have chosen to report the answers electronically.

The size of the Internet respondents

There is a constant demand for new data by the users. One way of being able to give the users timely statistics is to estimate figures instead of waiting until the end of the data collection period before publishing the results. To do this one has to be sure that the data available are representative.

In this paper we have not made a full analysis of this. To get an idea of whether the data are representative, however, we have looked at the share of guest nights the Internet respondents constitute.

The figure on the previous page shows that the share of guest nights reported via the Internet in February 2004 was 32 per cent. In the same month the share of questionnaires reported via the Internet was 23 per cent. The previous months show the same tendency. This means that the Internet respondents are larger in terms of guest nights than the average respondent.
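The size difference can be quantified with a short Python calculation based on the two February 2004 shares given above. This is a rough sketch: it assumes that both shares refer to the same set of received questionnaires and that each questionnaire corresponds to one establishment; the variable names are our own.

```python
# Shares reported for February 2004 (per cent).
internet_share_of_guest_nights = 32.0
internet_share_of_questionnaires = 23.0

# Guest nights per Internet questionnaire, relative to the average questionnaire.
internet_vs_average = internet_share_of_guest_nights / internet_share_of_questionnaires

# The same ratio for all other respondents (booking system and post together).
others_vs_average = (100 - internet_share_of_guest_nights) / (
    100 - internet_share_of_questionnaires
)

# Guest nights per Internet questionnaire, relative to the other respondents.
internet_vs_others = internet_vs_average / others_vs_average

print(round(internet_vs_average, 2))  # 1.39
print(round(internet_vs_others, 2))   # 1.58
```

Under these assumptions an Internet respondent reported about 1.4 times as many guest nights as the average respondent, and about 1.6 times as many as a respondent using one of the other means.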

Whether one will be able to produce preliminary figures based on the data from the Internet respondents cannot be concluded here. We can however conclude that the findings do not exclude this option.

Data quality

The quality of the data is measured by the number and types of warnings per questionnaire. We have three main types of checks. Firstly there are the ones checking for errors or changes in the background data. The second main type is the checks controlling that the questionnaires are fully completed and that the figures given are consistent. Finally there are checks controlling for extreme values.

The time spent on revising the data is believed to be shortened by collecting the data electronically. As described in the section 'IDUN - the web portal' we aim to improve the answers received via the Internet by informing the respondents about possible errors and giving more details about the data asked for. In the figure below the results measured by number and type of warnings are given.

Columns (1) to (5) are shares in per cent; columns (6) and (7) are average numbers of warnings per questionnaire.
(1) Share of warnings, all warnings
(2) Share of questionnaires received by means of reply
(3) Share of warnings for the figures given, all such warnings
(4) Share of warnings for the figures given, definitive errors
(5) Share of warnings for the figures given, possible errors
(6) Number of warnings, all questionnaires
(7) Number of warnings, all questionnaires with warnings

                   (1)    (2)    (3)    (4)    (5)    (6)    (7)
Booking system     3,1    3,7    1,8    0,4    2,8    1,2    1,7
Internet           8,6   17,7    7,8    6,0    9      0,7    1,6
Post              88,3   78,6   90,3   93,6   88,2    1,7    2,5

Note: All figures are average figures for the period July 2003 to February 2004 inclusive.

The first two columns show that the Internet questionnaires account for a much smaller share of the warnings (roughly 9 per cent) than of the questionnaires received (roughly 18 per cent). The questionnaires received by post, in contrast, account for about 88 per cent of the warnings but only about 79 per cent of the questionnaires. From this we can conclude that the checks and the information on the Internet have made the respondents report better figures.
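The comparison of the first two columns can be made explicit by dividing each means' share of warnings by its share of questionnaires. The Python sketch below does this with the figures from the table; the ratio itself is our own illustrative measure and is not used elsewhere in the paper. A value below 1 means that a means of reply generates fewer warnings than its share of the questionnaires would suggest.

```python
# (share of all warnings, share of questionnaires received), in per cent,
# taken from the first two columns of the table.
shares = {
    "Booking system": (3.1, 3.7),
    "Internet": (8.6, 17.7),
    "Post": (88.3, 78.6),
}

# Ratio of warning share to questionnaire share for each means of reply.
ratios = {
    mode: round(warnings / questionnaires, 2)
    for mode, (warnings, questionnaires) in shares.items()
}

for mode, ratio in ratios.items():
    print(mode, ratio)
# Booking system 0.84
# Internet 0.49
# Post 1.12
```

On this measure the Internet questionnaires carry about half as many warnings as their share of the questionnaires would imply, while the postal questionnaires carry somewhat more.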

This conclusion is further underlined in the last two columns. The average number of warnings on all questionnaires is significantly lower for the Internet questionnaires than for those received via the booking systems and on paper.

The middle columns cover all warnings not related to the background data. Definitive errors have to be corrected and possible errors have to be revised, but not necessarily corrected. Here one can see that both electronic modes have improved even further compared with paper questionnaires.

The time spent on revising the data has been reduced as a consequence of this.

Response burden

Most of the hotels collect the required data for internal use. Consequently we only judge the response burden as the time spent on filling in the questionnaire. Whether one does this on paper and posts it afterwards or logs on to the Internet and completes the questionnaire there, it takes approximately the same amount of time. Judging from the feedback from our users, however, reporting the data via the Internet is found to be easier and more enjoyable. In other words, the psychological effect plays a positive role.

By end-2004 we will be able to offer statistics on each establishment in return. The establishments can receive this after having logged on to the IDUN web portal. Consequently the establishments will not have to spend time making the calculations themselves. This is also believed to make the response burden feel lower.

Operating the production system

The benefits of the web portal for the respondents and the users are undoubtedly many, as illustrated in this paper. Statistics Norway also benefits by using fewer resources on the collection of paper questionnaires and on revising the data. Many of these resources, however, are spent again on updating and operating a more extensive production system. This has not been unproblematic in the initial phase, as the areas mentioned require different skills. Consequently the manpower situation has been influenced negatively. In the long run this will even out.

Conclusion

Using the Internet to collect data for the hotel statistics has been a success. Statistics Norway receives the data faster and the data quality is higher than for paper questionnaires. In addition the respondents perceive the response burden to be lower than earlier.

On the negative side we have found that the resources saved on registering and revising the data are now spent on operating the production system. Consequently we have not gained anything resource-wise yet.

We also have to be careful not to make ourselves dependent on the Internet answers, for instance with respect to the time of dissemination, until the web portal has definitely become stable.
