CEIRC Survey on Managing Usage Statistics

February 2007.

Results compiled by Neil Renison and Diane Costello.

Introduction

In February/March 2007 all member institutions of the CAUL/CEIRC consortium were invited to respond to a questionnaire seeking information about their practice in collecting statistics on the use of online information resources and their requirements for a proposed workshop on this topic.

Of the 47 Australian and New Zealand universities involved in CEIRC, 45 responded. Another 4 responses were received from non-university participants. All 49 institutions reported collecting and using these statistics to some degree. The high participation rate and commonly expressed enthusiasm for the workshop indicate the importance placed on measuring use, and perhaps the difficulty experienced in doing this successfully and economically.

Questions and Responses

Most questions could be answered with a simple yes or no; where elaboration was necessary it was specifically requested; otherwise respondents could comment as they wished.

Part I: Collecting statistics from Information Providers

1. Only COUNTER compliant statistics are collected and processed. (Yes/No):

Three institutions responded yes, though one of these added the comment “Unless no COUNTER statistics are available for that resource”.

Summary of comments and additional information:

One institution’s response speaks for almost all: "We take COUNTER statistics by preference where they are available, but will take others if that is all there is." While there is caution in the use of non-COUNTER statistics, the need for data seems to override any misgivings.

2. Statistics are collected from the provider in-house manually. (Yes/No):

All 49 respondents replied yes.

Summary of comments and additional information:

All respondents report doing some statistics collection manually, even those institutions which use ScholarlyStats for part of the collection. Practice includes a mixture of online harvesting and scheduled emailed reports.

3. Statistics are collected from the provider using an in-house system (ILMS, ERM or other). (Yes plus System name/No):

Only 6 institutions reported using some technology-based system for in-house collection.

Summary of comments and additional information:

While only a minority use any form of automated system for in-house data collection, the comments were divided between those that use (or plan to use) their ERM to collect COUNTER and COUNTER-like data, and those using different measures acquired through systems like SFX, EZProxy and "click-through" counting.

4. Statistics are collected by a commercial provider, e.g. MPS ScholarlyStats, SerialsSolutions. (Yes plus Provider name/No):

A total of 14 institutions use a commercial provider.

Summary of comments and additional information:

MPS ScholarlyStats is the dominant statistical service provider (11 clients), though some cited SerialsSolutions and ISI/Journal Use Reports. Lack of comprehensive coverage was identified as a problem with both ScholarlyStats (“it would only track a limited number of resources”) and SerialsSolutions (the e-journal portal “is only one way that our electronic titles can be accessed, so the statistics don't show the complete picture”).

5. SUSHI is used by your institution or an external provider in statistics collection. (Yes/No):

Just 2 “pioneers” reported using SUSHI.

Summary of comments and additional information:

While comments here and later indicated an interest in SUSHI, only two institutions are active, and one reported technical problems as well as suppliers not being fully “SUSHI ready”.
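
For readers unfamiliar with the protocol, SUSHI (NISO Z39.93) wraps a COUNTER report request and response in SOAP messages. The Python sketch below illustrates the general shape of such a request only; the endpoint URL, requestor and customer IDs are hypothetical, and the exact envelope and report release accepted vary by provider and SUSHI version, so this should be checked against a provider's own documentation.

    # A minimal, illustrative SUSHI request (all identifiers hypothetical).
    import urllib.request

    SUSHI_ENDPOINT = "https://stats.example.com/sushi"  # hypothetical endpoint

    envelope = """<?xml version="1.0" encoding="UTF-8"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <ReportRequest xmlns="http://www.niso.org/schemas/sushi"
                       Created="2007-02-01T00:00:00Z" ID="req-001">
          <Requestor><ID>our-institution</ID></Requestor>
          <CustomerReference><ID>customer-123</ID></CustomerReference>
          <ReportDefinition Name="JR1" Release="2">
            <Filters>
              <UsageDateRange>
                <Begin>2007-01-01</Begin>
                <End>2007-01-31</End>
              </UsageDateRange>
            </Filters>
          </ReportDefinition>
        </ReportRequest>
      </soap:Body>
    </soap:Envelope>"""

    request = urllib.request.Request(
        SUSHI_ENDPOINT,
        data=envelope.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        print(response.read().decode("utf-8"))  # COUNTER report wrapped in SOAP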

6. Usage statistics are collected from sources other than Information Providers, e.g. EZProxy. (Yes/No):

Almost half (23 respondents) collect statistics from sources other than the Information Providers.

Summary of comments and additional information:

EZProxy log analysis is the most common alternative source, whether in use or under investigation (15 institutions). SFX/MetaLib and SerialsSolutions, which create e-resource usage statistics as a by-product of their own use, were also mentioned, as well as two institution-specific local sources.
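
To illustrate what such log analysis involves, here is a minimal Python sketch that counts requests per vendor host from a proxy log. It assumes (hypothetically) a common-log-format file named ezproxy.log with full URLs in the quoted request field; real EZProxy installations define their own log formats, so the parsing would need adjusting.

    # A minimal sketch: count proxied requests per vendor host.
    # Assumptions (hypothetical): common-log-format lines, with the requested
    # URL, including its host, in the quoted request field.
    from collections import Counter
    from urllib.parse import urlparse

    counts = Counter()
    with open("ezproxy.log") as log:           # hypothetical log file name
        for line in log:
            parts = line.split('"')
            if len(parts) < 2:
                continue                       # skip malformed lines
            request = parts[1]                 # e.g. GET http://host/path HTTP/1.0
            fields = request.split()
            if len(fields) < 2:
                continue
            host = urlparse(fields[1]).netloc  # provider host serving the request
            if host:
                counts[host] += 1

    for host, n in counts.most_common(10):     # ten most-used providers
        print(f"{n:8d}  {host}")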

Part II: Processing usage statistics

7. Statistics are processed in-house manually. (Yes/No):

In all, 48 institutions (of 49 respondents) reported some in-house manual processing of the statistics collected.

Summary of comments and additional information:

Nearly every institution does some processing, though the survey did not draw out how comprehensive this is. Some advised that processing was limited and/or done on an “as needed” basis. Even the libraries which outsourced processing reported that “those statistics not processed by ScholarlyStats are done manually”.

8. Statistics are processed in-house automatically. (Yes plus name of system or software used /No):

There were 4 institutions responding yes, but those responses are qualified in the following comments.

Summary of comments and additional information:

The question was intended to reveal whether any institutions had developed (or purchased) technologies that would help them process COUNTER or vendor-supplied statistics. The 4 yes responses all apply to quite different sources of data. Notably, the comment that “we do have some scripts that obtain statistics and load selected journal title statistics into a database” came from an institution answering no, but seemingly making some progress.
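
A script of the kind quoted above might look like the following sketch, which loads selected journal title statistics from a COUNTER-style CSV report into a SQLite table. The file name, column headings, tracked titles and reporting period are all hypothetical; actual report layouts differ between vendors.

    # A minimal sketch: load selected journal title statistics into a database.
    import csv
    import sqlite3

    TRACKED_TITLES = {"Journal of Examples", "Annals of Sample Data"}  # hypothetical

    conn = sqlite3.connect("usage_stats.db")
    conn.execute(
        """CREATE TABLE IF NOT EXISTS journal_usage (
               title TEXT, period TEXT, downloads INTEGER,
               PRIMARY KEY (title, period))"""
    )

    with open("jr1_2007.csv", newline="") as f:    # hypothetical report export
        for row in csv.DictReader(f):
            if row["Title"] in TRACKED_TITLES:     # keep only selected titles
                conn.execute(
                    "INSERT OR REPLACE INTO journal_usage VALUES (?, ?, ?)",
                    (row["Title"], "2007-01", int(row["Full-Text Downloads"])),
                )

    conn.commit()
    conn.close()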

9. Statistics are processed by a commercial provider. (Yes plus Provider name/No):

Of the 8 yes responses, one might be attributed to Question 8 above.

Summary of comments and additional information:

While a minority of institutions outsource processing, ScholarlyStats is the provider of choice (or the only option). "Voyager and MetaLib/SFX", reported here, might have been categorised as in-house processing, assuming these are systems managed by the institution, even if as part of a consortium.

10. What are the “end-products” of the processing? (List and describe all.):

There were 42 libraries reporting some end products.

Summary of comments and additional information:

The end products reported (with occurrences) were: consolidated annual reports on e-resource use (9), database of statistics (2), graphs (3), lists (e.g. most heavily used or zero use) (2), printouts (1), renewal reports (1), reports showing increased/decreased use (1), ScholarlyStats reports (2), spreadsheets (31), time series reports (1), web pages (4).

Respondents also added comments anticipating later questions on what the statistics would be used for: budget submissions (1), calculations of cost per download/search (8), calculation of increase/decrease of use (2), determining ROI (when matched with other indicators) (1), library annual or quarterly reports (5), library 'instruction’ (1), library promotion (2), purchase or renewal decisions and proposals (10).
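
The most frequently cited calculations are simple arithmetic: cost per download is the annual subscription cost divided by the year's full-text downloads, and increase/decrease of use is the year-on-year percentage difference in downloads. A worked sketch in Python, with invented figures, follows.

    # Worked example of the calculations cited above (all figures invented).
    subscription_cost = 25_000.00   # annual cost of the resource
    downloads_2006 = 8_200          # full-text downloads, previous year
    downloads_2007 = 9_450          # full-text downloads, current year

    cost_per_download = subscription_cost / downloads_2007
    change_in_use = (downloads_2007 - downloads_2006) / downloads_2006 * 100

    print(f"Cost per download: ${cost_per_download:.2f}")   # $2.65
    print(f"Change in use:     {change_in_use:+.1f}%")      # +15.2%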

Part III: Utilising usage statistics

11. Data is used in collection development and deciding on renewals. (Yes/No)

There were 48 libraries reporting use for collection management activities.

Summary of comments and additional information:

Nearly all libraries use the data in collection development, though not necessarily in all cases, and usage may be only one factor among others: “It's not the major factor, but is important”.

12. Data is used for library promotion and funding submissions. (Yes/No)

A majority of libraries, 34 in all, reported some use.

Summary of comments and additional information:

Most use the collected statistics for library promotion or funding submissions (either one and not the other, or both), although one library reported that “The data is not specifically used for these purposes”. Practice varies from “Selected electronic resource usage statistics are published in the Library's biannual newsletter in March each year” to “Occasionally”.

Part IV: Workshop content

13. Have you, or someone else at your institution, acquired skills or experience in relevant areas and would be willing to contribute to the workshop? (Yes/No - if yes, please elaborate).

There were 7 respondents who said yes.

Summary of comments and additional information:

One objective here was to allow respondents to self-assess their level of expertise and confidence in this area. Most respondents were very modest, if not dismissive, about their level of expertise, perhaps unfairly so. We have seven or eight potential contributors to the proposed workshop that was the stimulus for this survey.

14. Are you aware of software/ERM/ILMS developments in the collection, processing or utilisation of usage statistics which you would like to hear more about? (Yes/No - if yes, please elaborate).

Responses from 31 respondents indicated a lively interest in the use of technology and standards.

Summary of comments and additional information:

Most were aware of recent technological developments and products and wanted more information on SUSHI, SerialsSolutions, various ILMS/ERM systems, ScholarlyStats, Journal Usage Reports and EZProxy log analysis. Swets and Ebsco were also mentioned in the context of SUSHI. One respondent who “would really like focus to be on stuff that can be used now” may have been speaking for many.

15. Proposed sessions include: In-house collection of usage statistics, MPS ScholarlyStats, SUSHI, in-house processing of usage statistics, ISI Journal Usage Reports, presenting usage data, using the data for collection evaluation and development, using the data for library promotion and funding. Are there topics here of no interest or important topics we haven't listed? (Yes/No - if yes to either please elaborate).

There were 28 respondents confirming interest in the proposed topics or adding more.

Summary of comments and additional information:

The common response was that all the proposed topics are of interest. Other topics that people wanted to include were:

  • COUNTER update.
  • Data analysis for the different categories of e-resources.
  • Data mining proxy servers - EZProxy or institutional servers - and software for analysis.
  • Discussion of different indices that can be developed from the statistics.
  • Effect of federated search tools on usage statistics.
  • How to be smart with Excel.
  • How to involve Reference staff in using empirical information to promote and teach e-research.
  • How to use the COUNTER XML schema.
  • Negotiations with non-COUNTER providers.
  • Specific ERM/ILMS products and vendors such as Innovative, ExLibris, Endeavor.
  • Usage of free resources and institutional repositories: how will we measure it?
  • Usage statistics for e-books.

Part V: Optional

16. Add anything else you want to contribute, answers to questions we didn't ask, additional matters you want covered in the programme, etc.

There were 9 contributions added to the survey.

The comments had little in common, so all are provided here in brief:

  • A discussion of what these statistics can be used for, what kind of leverage, especially from actual examples. Also, how user behaviour surveys could support findings to provide context. Also, the problem with some information that I 'know' is not accurate. What do others do about this?
  • Extremely interested to learn from others.
  • I feel guilty for not making more use of all the data we have.
  • I would like to know more about the analytical side, what are common / key figures to produce?
  • I'd also like to hear more about the Swinburne experience with obtaining EBL usage statistics and using these to automatically drive selection decisions. It would be good to have a bit of focus on ebook usage.
  • Inconsistent data from vendors; some more time-consuming to collect than others; inconsistent delivery practice. Meaningful analysis and comparison of data is often near impossible.
  • Interested also in staffing models to do e-statistics work.
  • Please consider making the workshop open to a wide audience.
  • Use of ROI in the academic sector. Federated search statistics.

Conclusions

The survey shows that most institutions are involved in manual collection and processing of data and want to improve their practice, while at the same time exploring opportunities for automated or semi-automated processing, or outsourcing of much of the task.

It also provided reassurance that a workshop on these issues will be well received, especially if it focuses on practice and practicalities. However, there does seem to be a wider range of interests than would be possible to address in one day, and some specialised areas (such as EZProxy log analysis) that might attract a different audience.

Appendix 1. Individual Comments

1. Only COUNTER compliant statistics are collected and processed:

  • All statistics are collected, but we try to collect the statistics that could be compared.
  • Both are collected. Although non-COUNTER statistics are not strictly comparable, they can at least provide trend information.
  • Both COUNTER and others are collected.
  • Collecting and collating all that we get from vendors.
  • Many of the databases from which we collect statistics are COUNTER compliant, but there are some databases for which we collect statistics which are not e.g. Business Monitor Online and Factiva. Some databases may be only partially COUNTER compliant e.g. Informit.
  • We collect statistics from as many vendors as possible and practical, providing we feel they are of use to Library management. COUNTER statistics are very much preferred.
  • No, not all of our statistics are COUNTER compliant yet.
  • No, not all subscribed electronic services are COUNTER compliant (e.g. World News Connection, British Education Index). Business and legal databases in particular provide less well-defined data and, given the cost of these services, the situation is less than ideal.
  • Non-compliant statistics are massaged into "COUNTER like" form as far as is possible.
  • Only about 90% are COUNTER compliant.
  • The focus has been on COUNTER compliant only, but now that we have ScholarlyStats I want to look at non-COUNTER products, and am looking at the EZProxy logs to see if they can help.
  • Unless no COUNTER statistics are available for that resource.
  • Usage statistics for most platforms/databases are collected.
  • Usage statistics from locally hosted electronic journals are not COUNTER compliant; however we know how they are determined.
  • We collect all available statistics, however those which are not COUNTER compliant are considered indicative rather than actual.
  • We also record usage statistics from our federated & SFX search products, which are not COUNTER compliant.
  • We collect a mixture of statistics and not all are COUNTER compliant.
  • We collect COUNTER compliant statistics where available, but sometimes we have to use other statistics as well.
  • We collect from a range of vendors. Naturally we need to be careful when comparing them, but non-COUNTER statistics are better than none at all.
  • We prefer it, but use ScholarlyStats and if the only statistics we have are not COUNTER compliant we will use them in some cases.
  • We systematically collect COUNTER statistics from our major datasets, particularly “Journal Report 1”. However, we also collect statistics for services that are not COUNTER compliant. Statistics, both COUNTER and other, are also collected as needed from all resource providers.
  • We take COUNTER statistics by preference where they are available, but will take others if that is all there is.
  • We'll take any statistics that the suppliers provide.
  • While COUNTER compliant statistics are preferred we need to know our usage for all products.

2. Statistics are collected from the provider in-house manually:

  • A mix of manual collection and use of a commercial provider.
  • Collected from publisher administration site or via email to Library Systems (these are then manually processed). Not all services/publishers provide statistics.
  • Either collected, or if available, by scheduled email delivery.
  • For resources not covered by ScholarlyStats and/or non-COUNTER Compliant.
  • Many are set up as scheduled reports that are emailed monthly.
  • Sometimes.
  • Stats are collected once per month by a library officer. No assessment is done, only collection.
  • Those statistics not accessible through ScholarlyStats are manually collected.
  • We have to go to each supplier's site and extract the statistics. It is somewhat time-consuming as each site has different navigation, metaphors or jargon.
  • When we can find time to do it. For some I've set up scheduled reports to appear in my email inbox.
  • While most statistics are obtained by downloading from the Vendor's administration page, some statistics are sent as files to a designated contact.
  • Yes and no. Currently use ScholarlyStats where available. Other statistics are either collected manually or automatically emailed to us for processing.
  • Yes, for non ScholarlyStats resources 2007 onwards. Before 2007, all were collected manually.
  • Yes, some suppliers email us statistics when we request them.

3. Statistics are collected from the provider using an in-house system (ILMS, ERM or other):

  • However, we are in the process of implementing Meridian, Voyager's ERMS.
  • Innovative ERM.
  • Locally hosted electronic journal collection has a usage module.
  • Looking at using EZProxy web logs.
  • Not yet! We are in the process of setting up III (Innovative) ERM, which will enable us to collect COUNTER-compliant statistics automatically.
  • SFX.
  • SFX.
  • There is no in-house system for collecting statistics - this is a future project?
  • We are currently investigating existing ERMs with a view to implementation during Q4, 2007.
  • We use a statistics package called Urchin to collect some statistics usage. We have a Connect (web) page for each database and collect statistics (using Urchin) for hits on that page - but this does not count whether the user continued into the database; so this is only a rough indication of use.
  • We use Innovative's ERM, but due to technical issues have not been able to get the statistics import or AutoStats features working.

4. Statistics are collected by a commercial provider, e.g. MPS ScholarlyStats, SerialsSolutions:

  • Currently exploring subscribing to a serial collection service.
  • Have investigated ScholarlyStats, but found it unsuitable for our purposes due to the relatively small proportion of our vendors included in their database. The pricing model was also unsuitable.
  • MPS ScholarlyStats - currently 21 vendors. We have also recently purchased Thomson JUR, but are still setting this up.
  • No, not yet. Products are being considered.
  • ScholarlyStats, Serials Solutions.
  • ScholarlyStats.
  • ScholarlyStats.
  • ScholarlyStats.
  • ScholarlyStats.
  • Statistics are collected from provider in-house manually.
  • Use ScholarlyStats.
  • Used to use MPS (ScholarlyStats). At the moment a high priority is to amalgamate the database use data - searches and turnaways. The first-tier pricing for ScholarlyStats is 9 platforms. In 2006 ScholarlyStats had only 7 database vendors as options, and of these, turnaways were only available for 4 of the 7 platforms. So in 2006 we opted to have 9 journal vendors in our ScholarlyStats service. Since the CEIRC offers are vendor by vendor, the cross-vendor comparisons given by ScholarlyStats were not being used. (Instead the native interface statistics are consulted.) However this service could be very useful when/if we need to look at cross-platform use.
  • We also have SerialsSolutions statistics.
  • We also pull some usage statistics from SerialsSolutions.
  • We have purchased ScholarlyStats for 2007 and will test its suitability for our purposes.
  • We looked at ScholarlyStats, but decided that the cost / benefit wasn't that great because it would only track a limited number of resources.
  • We use ScholarlyStats for around 30 different platforms, e-journal and bibliographic databases.
  • We use SerialsSolutions but have never had time to make use of their statistics.
  • We use the SerialsSolutions 'e-journal portal' and can view click-throughs from searches performed in the portal. The e-journal portal is only one way that our electronic titles can be accessed, so the statistics don't show the complete picture.
  • Yes, we use ScholarlyStats for some of our statistics.

5. SUSHI is used by your institution or an external provider in statistics collection:

  • At last check, none of our providers supplied SUSHI-compliant statistics, despite their claims (EBSCO/EBSCO EJS). We are also having technical issues importing SUSHI statistics via our ERM's AutoStats feature.
  • Don't know.
  • Don't use SUSHI.
  • Interested in acquiring an ERM which is SUSHI compliant.
  • Investigating SUSHI.
  • Keeping a watching brief and will explore at a later stage of SUSHI development.
  • Only used it recently to collect Ebsco EJS statistics.

6. Usage statistics are collected from sources other than Information Providers, e.g. EZProxy: