Issue 7

Access to all regular articles. This page is intended for printing purposes. Note that the internal links to references will not work correctly.


Etc Articles


Software in use: Comparing Externally-Hosted Web Statistics and Purchased Statistics Services

In this final issue we will be comparing the two types of statistics software that have been used during the production of Exploit Interactive.

The Value of Statistics

Benjamin Disraeli once contended that "There are three kinds of lies: lies, damned lies, and statistics." Of course this is not always the case; some statistics can even prove to be useful! However, whether we like them or not, statistics are often a required part of EC Web sites and are usually necessary to justify spending to project funders. Web site statistics can illustrate that your site is increasing in popularity or show that further dissemination is needed. They can also confirm who your users are and where they come from, which in turn helps you to bring in more users. Some statistical analysis programs can even help resolve problem areas such as broken links. For more on providing regular performance indicators for your funding body see Brian Kelly's article in Exploit Interactive issue 5 [1].

Nevertheless, Disraeli's reservation should be taken as a word of warning. Statistics can be misleading and should always be viewed as objectively as possible. There are quite often influencing factors that can cause unexpected results. One way of getting clearer figures is to use more than one statistical analysis system. This article will first look at some statistics covering the period from the launch of the Exploit Interactive Web magazine to the time of writing (7th August 2000). For this we will primarily use the WebTrends software, because the other software only began building up statistics in January 2000; a few added features of SiteMeter will also be considered. Secondly, the two types of software used to create these statistics will be compared and a list of relevant positive and negative factors drawn up.

Externally-Hosted Web Statistics Services (SiteMeter and Nedstat)

The two pieces of externally-hosted statistical software used on Exploit Interactive are SiteMeter [2] and NedStat [3]. Both were reviewed in the Software in Use column in issue 5 [4]. SiteMeter gives general site statistics and has been running on the Exploit Interactive server since 4th January 2000. NedStat is used to provide statistics for individual pages. The benefits of using externally-hosted statistical software were discussed in Ariadne issue 23 [5].

Purchased Statistics Services (WebTrends Professional Suite)

The WebTrends [6] Professional Suite was installed on a server in May 2000. It claims to be a complete Web site management suite that includes Web Site Traffic Analysis, Link Analysis, Content Management and Visualization, Alerting, Monitoring and Recovery, and Proxy Server Traffic Analysis and Reporting. We have yet to find out how most of these work!

Exploit Interactive Statistics

The WebTrends software analyses logfiles. A report was created for the entire site for the date range 23rd April 1999 14:46:15 to 7th August 2000 16:32:38, the time Exploit Interactive has been running. This report had a number of filters switched on so that page hits from the main authoring PC and from Harvester, a Web crawler, would be ignored.
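The kind of filtering described above can be pictured with a short sketch. It is only an illustration and not how WebTrends implements its filters: the record fields, the address standing in for the authoring PC and the 'harvest' user-agent substring are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical values: the article gives neither the real address of the
# authoring PC nor the exact user-agent string sent by the crawler.
EXCLUDED_HOSTS = {"192.0.2.10"}
EXCLUDED_AGENT_SUBSTRINGS = ("harvest",)


@dataclass
class LogRecord:
    host: str        # address the request came from
    path: str        # resource requested
    user_agent: str  # browser or crawler identification


def keep(record: LogRecord) -> bool:
    """Return True if the record should be counted in the report."""
    if record.host in EXCLUDED_HOSTS:
        return False
    agent = record.user_agent.lower()
    return not any(s in agent for s in EXCLUDED_AGENT_SUBSTRINGS)


def apply_filters(records):
    """Drop records from excluded hosts and excluded crawlers."""
    return [r for r in records if keep(r)]
```

Filtering before counting means that the totals reported reflect readers rather than the editor's own page checks or automated crawling.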

Figure 1: WebTrends Summary

WebTrends found the total number of hits from the launch until 7th August to be 733,476. This may seem like a huge number, but to WebTrends the total number of hits is a count of all the successful hits, including HTML pages, pictures, forms, scripts and files downloaded. It is probably more useful to consider the total number of page views received, which was 230,768, an average of 488 a day. With page views, only the hits to documents and forms are counted. The total number of visitor sessions (hits to the site by a single visitor within a given timeframe) came to 48,802, an average of 103 per day.
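The difference between hits, page views and visitor sessions may be easier to see in code. The sketch below is a rough approximation and not WebTrends' own method: the list of suffixes treated as pages, the 30-minute session timeout and the use of the host name to identify a visitor are all assumptions. The last few lines check the daily averages against the report's date range, assuming whole days and rounding down.

```python
from datetime import datetime, timedelta

# Assumption: which requests count as "documents and forms" is a guess.
PAGE_SUFFIXES = (".html", ".htm", "/", ".cgi")
# Assumption: an idle gap of more than 30 minutes starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def summarise(requests):
    """requests: iterable of (host, timestamp, path, status), in time order.

    Returns a tuple (hits, page_views, visitor_sessions).
    """
    hits = page_views = sessions = 0
    last_seen = {}  # host -> time of that host's previous successful hit
    for host, when, path, status in requests:
        if not 200 <= status < 300:
            continue  # only successful requests are counted as hits
        hits += 1
        if path.lower().endswith(PAGE_SUFFIXES):
            page_views += 1
        previous = last_seen.get(host)
        if previous is None or when - previous > SESSION_TIMEOUT:
            sessions += 1
        last_seen[host] = when
    return hits, page_views, sessions


def daily_average(total, start, end):
    """Whole-number average per day over the reporting period."""
    return total // (end - start).days


# Date range and totals as quoted in the article.
start = datetime(1999, 4, 23, 14, 46, 15)
end = datetime(2000, 8, 7, 16, 32, 38)
print(daily_average(230_768, start, end))  # page views per day
print(daily_average(48_802, start, end))   # visitor sessions per day
```

Under these assumptions the period is 472 whole days, and the two printed values agree with the averages of 488 and 103 quoted above.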

WebTrends allows you to view a whole host of statistic types in a number of different ways. The full report is divided up into General Statistics, Resources Accessed, Visitors & Demographics, Activity Statistics, Technical Statistics, Referrers & Keywords and Browsers & Platforms.

Top 5 Articles

Some of the more interesting information includes the most requested pages: unsurprisingly, the Exploit Interactive Home Page, followed by the issue home pages. The most requested articles were:

  1. Exploit Interactive Issue 1 News Article: Are You Linking to a Porn Site? [7]
  2. Exploit Interactive Issue 4: Building Europe's Largest Library [8]
  3. Exploit Interactive Issue 1: Oiling the Works: the PRIDE Project Develops an Information Brokerage Service [9]
  4. Exploit Interactive Issue 2: Newspaper Clippings in a Digital World: The LAURIN Project [10]
  5. Exploit Interactive Issue 1: Web Technologies - The Development Of Web Protocols [11]

The most accessed article was the Exploit Interactive Issue 1 news article "Are You Linking to a Porn Site?". This page probably proved to be the most popular because it has 'porn' in its title, so we should really discount it. The second most accessed article was "Building Europe's Largest Library" from Issue 4. The top 5 articles are listed above; obviously, articles in earlier issues will have had more time to accumulate hits.

Other interesting data includes the most active city, which is London closely followed by Bath; both are in the UK. The top geographic regions are Western Europe with 32.9% followed by North America with 30.2% and then region unspecified with 22.9%.

Top 5 Referrers

  1. http://www.exploit-lib.org/ - 9,670
  2. http://www.altavista.com/ - 1,714
  3. http://hul.helsinki.fi/ - 1,673
  4. http://www.ukoln.ac.uk/ - 1,060
  5. http://www.google.com/ - 929

Excluding 'no referrer' (26,923 hits) and Exploit Interactive's own domain, the top referrer was found to be Altavista.com. The referrer is the page, identified here by its domain name, from which the user followed a link to your site. This makes Altavista the top search engine, followed by Google, Yahoo and Excite.
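A referrer table of this kind can be produced by tallying the host names found in the referrer field of each log entry. The sketch below shows one way this might be done and is not a description of WebTrends itself. It assumes the referrer values have already been extracted from the log, that a missing referrer is recorded as an empty string or "-", and that the site's own host is www.exploit-lib.org; the sample values are made up.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumption: the site's own host, taken from the article's reference URLs.
OWN_HOST = "www.exploit-lib.org"


def referrer_hosts(referrers, own_host=OWN_HOST):
    """Count referrer host names, skipping empty and internal referrers.

    referrers: iterable of referrer values; "" or "-" means no referrer.
    """
    counts = Counter()
    for value in referrers:
        if value in ("", "-"):
            continue  # 'no referrer': typed URL, bookmark and so on
        host = urlsplit(value).hostname
        if host is None or host == own_host:
            continue  # unparseable value, or a link from within the site
        counts[host] += 1
    return counts


# Made-up sample values, for illustration only.
sample = [
    "-",
    "http://www.exploit-lib.org/issue1/",
    "http://www.altavista.com/cgi-bin/query?q=exploit",
    "http://www.altavista.com/cgi-bin/query?q=library",
    "http://hul.helsinki.fi/links.html",
]
print(referrer_hosts(sample).most_common(2))
```

Leaving out internal referrers is a design choice: links followed within the site say nothing about how readers found the magazine in the first place.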

Figure 2: WebTrends Top Search Engines

The top 10 keywords used on search engines were found to be exploit, porn, Web, interactive, library, information, e-commerce, aggressor, server and libraries. The top browser was Netscape, used by 37.58% of readers, followed by Internet Explorer (IE), used by 35.01% of readers. These results are quite surprising, although IE did account for more visitor sessions.
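It may seem odd that one browser can lead in the overall percentages while another leads in visitor sessions. The report does not say exactly what the percentages measure, but if we assume they are shares of hits, the two rankings can differ whenever users of one browser make more hits per session. The figures below are made up purely to illustrate this.

```python
from collections import Counter

# Made-up sessions: (browser, number of hits in that session).
sessions = [
    ("Netscape", 12),
    ("Netscape", 9),
    ("IE", 4),
    ("IE", 5),
    ("IE", 6),
]

by_hits = Counter()
by_sessions = Counter()
for browser, hits in sessions:
    by_hits[browser] += hits      # total hits made with this browser
    by_sessions[browser] += 1     # number of sessions using this browser

print(by_hits.most_common(1)[0][0])      # browser with the most hits
print(by_sessions.most_common(1)[0][0])  # browser with the most sessions
```

Here Netscape leads on hits (21 against 15) while IE leads on sessions (3 against 2), so it is worth checking which unit a browser table is based on before drawing conclusions.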

One of the added features of SiteMeter is the ability to track visits to the site by time zone. Each offset from Coordinated Universal Time (UTC) is shown against the percentage of users in that zone. For Exploit Interactive the highest percentage, at 47%, is in Western Europe. The second highest group of users is East America and Africa with 15%, and then Eastern Europe with 11%. These percentages do seem to vary throughout the day depending on which area of the world is 'awake'.

Figure 3: SiteMeter Site by Time Zone.

Comparing Statistics Services

The two types of statistics services used have been compared on a number of key areas.

Cost
  Externally-hosted services: Free to use, no maintenance costs.
  Purchased services: UKOLN paid £500 for one server licence.

Installation Effort
  Externally-hosted services: None, though the icon will need to be added to each page.
  Purchased services: There is an effort in installing the software. Once installed it is fairly simple to run a report, though to get to grips with the full functionality of the software you may need to spend some time reading the manual.

Access to Information
  Externally-hosted services: By default everyone has access to your statistics. If you are working on a dissemination project then this may be acceptable or even requisite. However, you may not want users to be able to see your statistics. Public access to information can be suppressed from the manager page. Once logged in it is possible to make your site's statistics password protected and even remove your site from any public site lists.
  Purchased services: Access to statistical reports is initially limited to those with access to the report. If wanted, HTML pages can be put on the Web.

Stability of Service
  Externally-hosted services: If the external site or the network needed to access it goes down then the statistics are not available. This may also affect the recording of statistics.
  Purchased services: The statistics software is hosted on your own server so is always available unless your own network is down.

Format of Reports
  Externally-hosted services: The reports appear in HTML format accessible from your Web site. The format of the pages cannot be changed. If you wish to create a paper copy of the statistics you have to print each page individually, though it is possible to create CSV files. You can also get email statistics of your site but these are not as comprehensive.
  Purchased services: The reports produced are highly customisable. They can be modified in style and customised to use certain terminology. Reports can be produced in HTML for Web use, MS Word for printed copies and MS Excel for statistical analysis.

Information Recorded
  Externally-hosted services: The information is captured through the downloading of the image icon. This means that information from 'text only' browsers or browsers with their images switched off is lost. Hits are also not recorded for users who flick from page to page without giving the icon time to load. This is only a relatively small percentage of users but will affect your statistics. To record all the hits you must remember to add the icon to every page you would like counted. This has not been done on Exploit Interactive.
  Purchased services: The server log files are used to create the statistics. However, not all the data available is captured, due to caching. Users who have viewed a page before may be looking at a copy of the Web page that is held on their local drive, and therefore their intention to view a page is not recorded. Before a report is run you can choose to filter out certain pieces of ambiguous data (see Filtering).

Availability
  Externally-hosted services: Up to date statistics are available virtually instantaneously. However, sometimes data is only available for a limited amount of time.
  Purchased services: A report must be run to retrieve the statistics. The time this takes will depend on the length of time you need the data for and the popularity of the Web site.

Filtering
  Externally-hosted services: Limited filtering capabilities to support data mining.
  Purchased services: Filtering is possible. Areas that are particularly useful are removing Web crawlers such as Harvest and Robots. You can also remove internal users.

Other
  Externally-hosted services: Having an extra icon on your page may affect the time it takes for a page to download. SiteMeter uses dedicated servers that are directly connected to the US Internet backbone so the time delay should be minimal.
  Purchased services: WebTrends provides information on more areas. It is possible to obtain lists of broken links; there are externally hosted services that can do this too, such as Link Alarm [12].

Conclusions

In his article on using externally-hosted statistical software Brian Kelly concluded that "Web statistics services such as Nedstat and SiteMeter appear to be very useful. They provide a cheap alternative to purchased statistics services and work very effectively with few serious drawbacks". He also pointed out, however, that they have limitations in how much information is retrieved and what information is available for viewing. Their main disadvantages are their inability to record hits from 'text only' browsers and their restrictions on what data you can see.

Purchased statistics services such as WebTrends offer more data and more manipulation of statistics into reports, though they have their own disadvantages due to caching. They are probably more useful to Web authors with a number of sites to manage and a need for highly comprehensive reports.

A combination of the two statistical services has given Exploit Interactive access to a full range of data. Information such as top referrers and search engines used has been very useful in targeting dissemination for Exploit Interactive and the new pan-European Web magazine, Cultivate Interactive [13]. The SiteMeter and NedStat services have meant that authors can see how well their articles are being received and how well the magazine as a whole is doing. Although statistics are not always perfect, when used in a comparative way they can give a good idea of the success of your site. With such information to hand you, as a Web author or Web site manager, can revise your site to make it even more successful.

References

  1. Performance Indicators for Web Sites, Brian Kelly, Exploit Interactive, issue 5, April 2000
    URL: <http://www.exploit-lib.org/issue5/indicators/>
  2. SiteMeter
    URL: <http://www.sitemeter.com/>
  3. Nedstat
    URL: <http://uk.nedstat.net/>
  4. Software in Use: Externally-Hosted Statistical Software, Brian Kelly, Exploit Interactive, issue 5, April 2000
    URL: <http://www.exploit-lib.org/issue5/software-used/>
  5. Using Externally-Hosted Web Services, Brian Kelly, Ariadne, Issue 23, 2 March 2000
    URL: <http://www.ariadne.ac.uk/issue23/web-focus/>
  6. Web Trends Enterprise Solutions, WebTrends
    URL: <http://www.webtrends.com/>
  7. Are You Linking To A Porn Site?, Brian Kelly, Exploit Interactive, issue 1, 10 April 1999
    URL: <http://www.exploit-lib.org/issue1/webtechs/>
  8. Building Europe's Largest Library, Steve Coffman, Exploit Interactive issue 4, January 2000
    URL: <http://www.exploit-lib.org/issue4/ell/>
  9. Oiling the Works: the PRIDE Project Develops an Information Brokerage Service, The PRIDE project team, Exploit Interactive, issue 1, 10 April 1999
    URL: <http://www.exploit-lib.org/issue1/pride/>
  10. Newspaper Clippings in a Digital World: The LAURIN Project, Günter Mühlberger, Exploit Interactive, issue 2, 20 July 1999
    URL: <http://www.exploit-lib.org/issue2/laurin/>
  11. The Development Of Web Protocols And Formats, Brian Kelly, Exploit Interactive, issue 1, 10 April 1999
    URL: <http://www.exploit-lib.org/issue1/web/>
  12. Link Alarm
    URL: <http://linkalarm.com/>
  13. Cultivate Interactive
    URL: <http://www.cultivate-int.org/>

Author Details

Marieke Napier
Information Officer
UKOLN
University of Bath
Bath
England
BA2 7AY

URL: <http://www.ukoln.ac.uk>
Email: m.napier@ukoln.ac.uk

Marieke is editor of Exploit Interactive and Cultivate Interactive Web magazines.

For citation purposes:
Marieke Napier, "Software in use: Comparing Externally-Hosted Web Statistics and Purchased Statistics Services", Exploit Interactive, issue 7, 2nd October 2000
URL: <http://www.exploit-lib.org/issue7/statistics/>


Job Postings from around Europe: Projects, Networking, Libraries

Welcome to Exploit Interactive's Jobs Section. If your organisation has position openings for Telematics Projects, Networking, or Library related work, details (as shown below) will now need to be sent to cultivate-editor@ukoln.ac.uk

Information Officer for the Distributed National Electronic Resource (DNER) and Resource Discovery Network (RDN)

The Joint Information Systems Committee (JISC) is building a Distributed National Electronic Resource (DNER); a national managed information environment for further and higher education. The DNER is a co-ordinated and comprehensive collection of high quality digital resources for use in learning, teaching and research. The RDN is one important service component of the DNER, and the postholder will be required to work closely with both RDN and DNER staff to realise the overall objectives of the DNER. The Information Officer will support the implementation and ongoing development of an ambitious programme of PR and communication activity for both the DNER and the RDN, and assist in creating widespread awareness across the education, public, heritage and commercial sectors using a variety of print and electronic media. The successful applicant will be creative and dynamic and able to employ their previous experience, excellent communication, IT and research skills to:

The post is available for two years. Secondment arrangements will be considered. Salary will be within the ALC1/2 scale: £18,909 - £27,347 per annum, inclusive of London Allowance.

For further details and an application form, please send a large self-addressed envelope to:
Emma Hammond
Personnel Department
King's College London
James Clerk Maxwell Building
57 Waterloo Road
London SE1 8WA
Phone: 0151-794-2696
Fax: 0151-794-2681
Email: emma.hammond@kcl.ac.uk, quoting reference E2/QJ/148/00.

Closing date: Tuesday 10th October 2000.

~

Graduate Library Assistant, University of Liverpool Library

There is a vacancy for one full-time Graduate Library Assistant in the Special Collections and Archives Division of the University Library.

The Library Assistant is required to assist a team of librarians and archivists in the day to day running of the Special Collections and Archives Reading Room in the Sydney Jones Library. Library experience is not essential but there is a need to work accurately in a well-organised manner and to assist researchers in a friendly and efficient way. General enquiry, clerical, and preservation work is also part of the job. Duties will include work with the Innopac computerised library system and other online access tools. Full training will be given.

The post is specifically intended to provide pre-course experience to graduates wishing to progress to a course in Library and Information Science or Archives Administration. The post is tenable for one year.

Further particulars and application form may be obtained from: Maureen Watry
Phone: 0151-794-2696
Fax: 0151-794-2681
Email: mwatry@liverpool.ac.uk

To apply please send a CV and covering letter to:
Maureen Watry
Head of Special Collections and Archives
Sydney Jones Library
University of Liverpool
PO Box 123
Liverpool
L69 3DA
Closing date: Friday 6th October 2000.

~