
News Article: Are You Linking To A Porn Site?

Many web managers became concerned in early April on hearing the news that an innocuous web site had transformed into a porn site. Brian Kelly describes the event and its implications.

The Incident

Web managers in British universities no doubt expected a leisurely return to work after the Easter break. But on reading a message entitled "Webtechs porn warning", sent to the website-info-mgt (and web-support) Mailbase mailing lists [1], they found their priorities had changed.

The message, sent on Wednesday 7th April, announced that "The registration for the domain www.webtechs.com appears to have lapsed and been taken over by an "adult content" site. The problem is that there are pages around bearing their "validated HTML" gif linked to the former validation service http://www.webtechs.com/html-val-svc/ which is now a source of cyberporn." To make matters worse, "If you use Analog to process your server stats, the "validated HTML" gif and link may be compiled in and be appearing automatically at the end of the reports." [2]

This incident generated a great deal of activity over the next few days. Web managers quickly learnt the syntax of the AltaVista search engine in order to discover sites which contained links to Webtechs. Using the search term link:www.webtechs.com resulted in 25,533 hits: clearly too many for even the most determined of web managers to check, but a useful indication of the size of the problem.

Using the search term link:www.webtechs.com host:ac.uk enabled the search to be restricted to UK academic web sites. As shown in Figure 1, over 5,000 pages contained a link to Webtechs.

Figure 1: Using AltaVista to Discover Web Sites Containing Links to Webtechs

Further analysis indicated the following numbers of affected pages in a variety of communities.

Table 1: Numbers of pages with links to Webtechs by community

| Community | No. of affected pages |
| --- | ---: |
| All web pages | 25,533 |
| UK Academic community (.ac.uk domain) | 5,026 |
| Other UK web sites (.uk excluding .ac.uk domain) | 230 |
| US Academic community (.edu domain) | 7,610 |
| Non profit making organisations (.org domain) | 1,065 |
| Government organisations (.gov domain) | 185 |
| Network organisations (.net domain) | 2,035 |
| Military organisations (.mil domain) | 30 |

A list of the numbers of affected pages in several EU countries is given in Table 2.

Table 2: Numbers of pages with links to Webtechs in several EU countries

| Community | No. of affected pages |
| --- | ---: |
| UK web sites (.uk domain) | 5,256 |
| Irish web sites (.ie domain) | 34 |
| French web sites (.fr domain) | 196 |
| Belgian web sites (.be domain) | 66 |
| German web sites (.de domain) | 575 |
| Italian web sites (.it domain) | 173 |
| Spanish web sites (.es domain) | 158 |
| Portuguese web sites (.pt domain) | 11 |
| Swedish web sites (.se domain) | 224 |
| Finnish web sites (.fi domain) | 230 |
| Norwegian web sites (.no domain) | 54 |

Please note that the figures given in Tables 1 and 2 may not be completely accurate, owing to limitations in AltaVista's searching capabilities (e.g. the domain www.it.kth.se is included in Italy's total due to the presence of .it in the domain name), the date on which the AltaVista robot last trawled, etc.
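
The over-counting for Italy noted above is what happens when a domain is matched as a substring rather than as the final part of the host name. The following is a minimal sketch of the difference, not a description of how AltaVista worked: the function names are illustrative, and www.example.it is a hypothetical host used only for comparison.

```python
from urllib.parse import urlsplit


def naive_match(url: str, domain: str) -> bool:
    """Substring test: true if '.<domain>' appears anywhere in the URL."""
    return ("." + domain.lower()) in url.lower()


def host_in_domain(url: str, domain: str) -> bool:
    """Suffix test: true only if the host is the domain or ends with '.<domain>'."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    domain = domain.lower().strip(".")
    return host == domain or host.endswith("." + domain)


urls = [
    "http://www.it.kth.se/stats/",       # Swedish host that contains ".it"
    "http://www.example.it/index.html",  # hypothetical Italian host
]
for url in urls:
    print(url, naive_match(url, "it"), host_in_domain(url, "it"))
```

The substring test counts both hosts as Italian, whereas the suffix test counts only the second. Anyone rebuilding the tables from a list of URLs could apply the suffix test to avoid this particular error.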

The Role of Analog

Analysis of these hits indicated that large numbers of them contained "Web Server Statistics for ..." in their title. These pages contained statistical summaries of web site traffic generated by the Analog software, as illustrated in Figure 2.


Figure 2: Output From Analog Contains Icon Pointing to Webtechs

The Analog software was developed at the University of Cambridge. It is a widely used and freely available package for analysing web server log files. Stephen Turner, the author of Analog, has documented a solution to the problem. He described [3] how this problem occurred in old versions of Analog and that an upgrade to the software would fix the problem.

Stephen also described the background to the incident: "According to Webtechs, they have not intentionally given up control of their domain, but Internic (or Network Solutions, or whatever we have to call them now) lost their registration even though it was properly paid up, and "Virtual Domain Buyers" took it over. Webtechs are trying to get it back. So maybe it will all become right again soon."

The Implications

Implications For Projects

The figures given in Tables 1 and 2 probably exaggerate the numbers of pages which contain inadvertent links to pornography. A great many of the pages have, no doubt, been created automatically by the Analog software. These pages are unlikely to be visited frequently, and visitors are unlikely to click on an icon to be found at the bottom of such pages.

However, the Webtechs incident has some worrying implications, especially for projects which have their own domain name. As described by Kelly and Peacock [4], a number of EU Telematics for Libraries projects have their own domain name, such as MALVINE (at http://www.malvine.org/). Within the EU's Telematics Application Programme the DESIRE project (at http://www.desire.org/) also has its own domain name.

Unless selling a popular domain name to a porn company is used as an exit strategy (!) there is a clear need for projects to be aware of the dangers of reuse of their domain name once the project has completed and no further funding is available to pay for the domain name. Similarly funding bodies, such as the European Union, should be alert to these dangers - especially as once a project has finished, there will normally be nobody left to deal with any such incidents.

Implications For Web Managers

What steps should managers of web services take to ensure that they are not inadvertently pointing to porn sites? Clearly search engines such as AltaVista can be used to check if a site contains such pointers - although, of course, this is not an infallible solution, and a search across the web server's file store may be a better one.
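
A search across the file store could be sketched along the following lines. This is an illustration under stated assumptions rather than a recommended tool: it assumes the pages are static .html or .htm files under a single directory, it only looks at href and src attributes, and the function and class names are hypothetical.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def links_to_host(html_text, bad_host):
    """Return the links in html_text whose host is bad_host or a subdomain of it."""
    parser = LinkCollector()
    parser.feed(html_text)
    bad_host = bad_host.lower()
    hits = []
    for link in parser.links:
        try:
            host = (urlsplit(link).hostname or "").lower()
        except ValueError:
            continue  # skip links that cannot be parsed as URLs
        if host == bad_host or host.endswith("." + bad_host):
            hits.append(link)
    return hits


def scan_tree(root, bad_host):
    """Yield (path, links) for each HTML file under root that links to bad_host."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".html", ".htm")):
                path = os.path.join(dirpath, name)
                # latin-1 decodes any byte sequence, so old pages never fail to load
                with open(path, encoding="latin-1") as handle:
                    hits = links_to_host(handle.read(), bad_host)
                if hits:
                    yield path, hits
```

Calling scan_tree with the document root and "webtechs.com" would list each affected file together with the offending links. Pages generated on the fly, or held outside the scanned directory, would not be found this way.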

If links to the Webtechs web site are found, what should the web manager do? It could be argued that, unless the page is directly managed by the web manager, it would not be proper - and perhaps even illegal - to tamper with someone else's web pages. On the other hand, removing such links may be desirable - if not for legal reasons then to save embarrassment. Can you imagine giving a training course and clicking on the icon to demonstrate an HTML validation service, and then going to a porn site?

Implications For Mirroring Services

The Webtechs HTML validation service was very popular in the mid 1990s, when HTML authoring first became popular. Shortly afterwards the HENSA service (described elsewhere in Exploit Interactive [5]) provided a UK mirror of the service. In November 1998, following discussion on the web-support Mailbase list, Dave Beckett announced that the Webtechs HTML validation service had been replaced by the W3C HTML validation service [6]. The popularity of the service had made HENSA aware that the Webtechs service was very dated, and Webtechs' failure to respond to requests for an update prompted HENSA to replace it with W3C's validation service. This was, perhaps, fortunate for HENSA, as an automated process for mirroring the validation service could have resulted in HENSA hosting a porn site!

Although HENSA are probably sufficiently experienced in mirroring services not to be caught in this way, it does illustrate some potential problems for sites mirroring web services.

Conclusions

No doubt the Irish Catholic women's ordination campaign [7] and the First Baptist Church of Sausalito [8] would be embarrassed by the links they are now hosting. Sadly porn companies now appear to be actively purchasing expired domain names. Tony Grimes, the Internet Marketing Executive for Macmillan Publishers has had similar experiences:

"I work on the Macmillan Reference website which used to own the groveartmusic.com url. Unfortunately, our registration [of] this url lapsed when we stopped using it last year. Since then it has been purchased by a Dutch company for pornographic purposes. Obviously this means that there are many websites containing this link who believe that they are actually linking to Grove Dictionaries of Music."

A number of solutions to the problem of bona fide web sites transforming into pornographic sites have been discussed. The Computer Science department at the University of Kent at Canterbury were in a position to quickly regenerate their list of publications since the pages were generated from a database. Jon Knight, Loughborough University, has suggested that information gateways could store an MD5 checksum of catalogued resources and provide warnings if the page changes [9]. Dan Brickley, University of Bristol, has proposed use of "a smarter link checker which periodically consults a PICS metadata label bureau and asks it for a description of each site, e.g. using a pornography-filtering ratings vocabulary like RSACi." [10].
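
The checksum idea can be illustrated with a short sketch. It is not Jon Knight's implementation: the function names are hypothetical, and the page contents are passed in as bytes so that the example runs without network access. A real gateway would fetch each catalogued URL before comparing.

```python
import hashlib


def md5_of(content: bytes) -> str:
    """Return the MD5 checksum of a page's content as a hex string."""
    return hashlib.md5(content).hexdigest()


def changed_resources(stored, fetched):
    """Return the URLs whose current content no longer matches the stored checksum.

    stored  maps URL -> checksum recorded when the resource was catalogued
    fetched maps URL -> content (bytes) retrieved on the latest check
    """
    return sorted(
        url
        for url, body in fetched.items()
        if url in stored and md5_of(body) != stored[url]
    )


# Checksums recorded at cataloguing time (illustrative content only)
catalogue = {
    "http://www.webtechs.com/html-val-svc/": md5_of(b"<h1>HTML Validation Service</h1>"),
}

# Content seen on a later visit
latest = {
    "http://www.webtechs.com/html-val-svc/": b"<h1>Something else entirely</h1>",
}

for url in changed_resources(catalogue, latest):
    print("Warning: content has changed since cataloguing:", url)
```

A checksum reports any change at all, including harmless edits, so a warning of this kind is best treated as a prompt for a cataloguer to look at the page again rather than as evidence that the site has been taken over.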

System administrators may also provide solutions. David Hastings, a Systems Administrator at the Oxford University Computing Services (OUCS), regards the Webtechs incident as annoying rather than serious. OUCS have blocked access to the Webtechs site's numeric address in their cache configuration file, and sent out a warning message to webmasters in Oxford University. However, as David pointed out, users outside Oxford University will not be affected by their cache filter, and will still be able to follow links to the porn site from a page at Oxford University. As this "won't do anything to enhance the reputation of the University!" they are in the process of removing links to webtechs.com from Oxford web servers. David made two additional suggestions:

  1. There ought to be a mechanism within the UK academic community for notifying webmasters of dubious sites.
  2. IP addresses shouldn't be recycled so quickly. Phone numbers are held for 2 years or so before they are re-used, so why not something similar for IP addresses?

In the longer term we may see more elegant protocol solutions. For example, hypertext links could be managed in external link databases (as in the proposed XLink protocol [11]), or web sites could be digitally signed [12].

Although solutions may be provided in the future, web managers are still left to deal with the problems today. This article concludes with thoughts from Rebecca Linford, the University Web Administrator at the University of Dundee:

In the case of Webtechs, the thought of our institutional web site linking to a pornographic site, was probably more worrying than the reality. The pages linking to Webtechs were low profile, and received very few hits - the statistics pages were out-of-date and not of general interest; the pages listing html validators were primarily for internal users, rather than for promoting the University to an external audience.

The problem was easily "fixed" on official web pages - those responsible for the pages with links to WebTech were identified using our database of web administrators. Details were also distributed to our internal web administrators' mailing list. There are concerns that personal web pages could still have links to the site, as they will not be indexed by Altavista nor by our own engine. Although this could have a negative effect on the image of the institution (especially if published in the press!), the University does have disclaimers in place to highlight that such pages are the responsibility of the individual.

Here are some thoughts raised by the Webtechs example (with particular reference to this institution):

Feedback

If you have any comments on this article, please contact the author (B.Kelly@ukoln.ac.uk) or the editor (exploit-editor@ukoln.ac.uk).

References

  1. website-info-mgt Mailbase list archives, Mailbase
    URL: <http://www.mailbase.ac.uk/lists/website-info-mgt/>
  2. "Webtechs porn warning", Janet Wheeler, Posting to website-info-mgt Mailbase list archives
    URL: <http://www.mailbase.ac.uk/lists/website-info-mgt/1999-04/0002.html>
  3. Webtechs & pornography, Stephen Turner, University of Cambridge
    URL: <http://www.statslab.cam.ac.uk/~sret1/analog/webtechs.html>
  4. Web Technologies: URLs for Telematics for Libraries Project Pages, Brian Kelly and Ian Peacock, Exploit Interactive, Issue 1
    URL: <http://www.exploit-lib.org/issue1/urls/>
  5. Look in the Mirror for Bandwidth Savings, Sally Hadland, Exploit Interactive, Issue 1
    URL: <http://www.exploit-lib.org/issue1/hensa/>
  6. Re: Validators, Dave Beckett, Posting to web-support Mailbase list
    URL: <http://www.mailbase.ac.uk/lists/web-support/1998-11/0131.html>
  7. Irish Catholic women's ordination campaign, Home Page
    URL: <http://homepages.iol.ie/~duacon/basic.htm>
  8. What We Believe and Preach, First Baptist Church of Sausalito
    URL: <http://marin.org/npo/firstbc/kerygma.html>
  9. Re: www.webtechs.com is now a porn site, Jon Knight, Posting to lis-elib Mailbase list
    URL: <http://www.mailbase.ac.uk/lists/lis-elib/1999-04/0009.html>
  10. Re: www.webtechs.com is now a porn site, Dan Brickley, Posting to lis-elib Mailbase list
    URL: <http://www.mailbase.ac.uk/lists/lis-elib/1999-04/0012.html>
  11. What Are .. XLink and XPointer?, Ariadne, Issue 16
    URL: <http://www.ariadne.ac.uk/issue16/what-is/>
  12. Digital Signature Initiative Overview, W3C
    URL: <http://www.w3.org/DSig/>

Author Details

Brian Kelly
UK Web Focus
Email: B.Kelly@ukoln.ac.uk
UKOLN: http://www.ukoln.ac.uk/
Tel: +44 1225 323943
Address: UKOLN, University of Bath, Bath, BA2 7AY

Brian Kelly is employed as UK Web Focus, at UKOLN (UK Office for Library and Information Networking) at the University of Bath, England. Brian's responsibilities include keeping the UK Higher Education community informed of web developments.

For citation purposes:
Brian Kelly, "Are You Linking To A Porn Site?" Exploit Interactive, issue 1, 10 April 1999
URL: <http://www.exploit-lib.org/issue1/webtechs/>