Living in the Shadow of the World Wide Web: Lesser Known Network Resources


By Matt Ernst <erns0637@pacificu.edu>
Junior, Computer Science Major, Pacific University

Since its popularization in the early to mid 1990s, the Internet has become to many people synonymous with the World Wide Web. In reality, the Internet predates the Web considerably. Much of the traffic that flows on the Internet has nothing to do with the Web; it travels via other protocols, some older than HTTP and some newer.

FTP
A history and technical presentation of FTP, the File Transfer Protocol, may be found in RFC 959 (an RFC is a "Request for Comments" document). This document, although written in 1985, still governs most modern FTP implementations. The first FTP document, RFC 114, dates from 1971 - truly antediluvian by computer standards. FTP has stood the test of time, remaining a valuable and widely-used service.

The File Transfer Protocol is, as its name indicates, a protocol for transferring files across a network. FTP enables a Macintosh to copy files from a mainframe and transfer them to a PC, all while smoothing over the hardware and filesystem differences between the different platforms. The ability to coherently transfer files between different machines is almost a given today, thanks in part to FTP's birth years before the web. FTP is still widely used to transfer files over a network - for example, to upload and modify content on a remote web server.

HTTP is perfectly capable of serving files of all sorts, not just images and text. So why is FTP still used? FTP servers and clients are considerably simpler than those for HTTP (and therefore smaller). FTP is also better suited to two-way file transfer: it's easy enough to download via HTTP, but uploading is more complex. Finally, the FTP protocol, unlike HTTP, keeps track of the state a user is in. HTTP sends information off to a client and has no idea what the client is up to until the client responds in some fashion (usually by sending another request). FTP keeps track of users as they begin and cease using the FTP service. As a result, FTP servers can limit the number of simultaneous connections they allow, the number of connections per client, and the number of file transfers (both uploads and downloads). HTTP does not easily allow such fine-grained control.
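To make the idea of connection limits concrete, here is a minimal sketch in Python of the bookkeeping a stateful server might do. It is a toy model, not code from any real FTP server: the class name SessionLimiter, its methods, and the example addresses are all assumptions made for illustration.

```python
from collections import Counter


class SessionLimiter:
    """Toy model of the per-server and per-client limits a stateful server can enforce."""

    def __init__(self, max_total: int, max_per_client: int) -> None:
        self.max_total = max_total
        self.max_per_client = max_per_client
        self.active = Counter()  # client address -> number of open sessions

    def login(self, client: str) -> bool:
        """Admit the client only if neither limit would be exceeded."""
        if sum(self.active.values()) >= self.max_total:
            return False
        if self.active[client] >= self.max_per_client:
            return False
        self.active[client] += 1
        return True

    def logout(self, client: str) -> None:
        """Release one session slot when the client disconnects."""
        if self.active[client] > 0:
            self.active[client] -= 1


# Hypothetical clients: the third attempt hits the per-client limit,
# the fifth hits the server-wide limit.
limiter = SessionLimiter(max_total=3, max_per_client=2)
attempts = ["10.0.0.1", "10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.3"]
results = [limiter.login(client) for client in attempts]
print(results)
```

The server can only keep this kind of count because it knows when each session begins and ends, which is exactly the state that plain HTTP does not track.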

A critical part of FTP's control is that every client connecting to an FTP service must identify itself with a login and a password. It is often desirable to give the general public access to certain files via FTP, so it is traditional to have an account called "anonymous" that will accept anything for its password (a valid e-mail address may be requested as the password, but this is impossible to enforce). This read-only account allows access to whatever public files the server's administrator has made available. Private accounts are protected by passwords that are not widely shared (it is worth noting, however, that FTP does not encrypt anything it sends, passwords included). Many ISPs grant private FTP accounts to their clients to allow remote modification of the clients' web content. Software companies grant FTP logins and passwords to their customers so that they can download beta products, patches, or information not available to the general public. Many private FTP servers, run via personal broadband connections, make available copied games, music, movies, and software. FTP servers conventionally listen for commands on port 21 and use port 20 for data transfers, but (like other services) may be configured to run on alternate ports.

When should you turn to FTP instead of the web? If you are looking for files instead of text and pictures, FTP is a good bet. It's easy to access anonymous FTP sites. Most modern browsers can at least download via FTP, although a dedicated client is nicer. For example, if you point your browser to ftp.netscape.com/pub/, you'll find hundreds of files, hierarchically organized, and you won't be prompted to fill out any forms before you download any of them. FTP servers are not automatically indexed by search engines, so it can be hard to find what you're looking for. Happily, Lycos has a dedicated FTP search tool. It's a fantastic resource for finding older or obscure versions of software.
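For readers who would rather script an anonymous visit than use a browser, the following is a rough sketch using Python's standard ftplib module. The function names and the placeholder e-mail address are my own assumptions, and the example host is simply the one named above, which may no longer answer. Only the URL parsing is actually run here; list_public_directory needs a live server.

```python
from ftplib import FTP
from urllib.parse import urlsplit


def parse_ftp_url(url: str) -> tuple[str, int, str]:
    """Split an ftp:// URL into (host, port, path), defaulting to control port 21."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.port or 21, parts.path or "/"


def list_public_directory(url: str, email: str = "guest@example.com") -> list[str]:
    """Log in as "anonymous" and return the names in the directory the URL points to."""
    host, port, path = parse_ftp_url(url)
    with FTP() as ftp:
        ftp.connect(host, port, timeout=30)
        # By convention the password is an e-mail address; the server cannot verify it.
        ftp.login(user="anonymous", passwd=email)
        ftp.cwd(path)
        return ftp.nlst()


# Parsing works offline; calling list_public_directory() requires a reachable server.
print(parse_ftp_url("ftp://ftp.netscape.com/pub/"))
```

Note that nothing in this exchange is encrypted, so the sketch is suitable only for public, anonymous access.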

Telnet
Telnet was defined in 1980 in RFC 764. It was another considerable step forward for interoperability between different platforms. FTP allowed different machines to send files to each other; telnet allowed users to use one machine from another. In 1980 virtually every computer (apart from some experimental research machines) utilized a command-line interface. Users typed out text commands and the computer responded with more text. Telnet allowed this interaction to take place over a network. Users could use a low-powered terminal device or workstation to connect to a much more powerful computer, or to a computer miles away. Then they could issue commands and view output as if they were sitting right next to the machine in question. Telnet-accessible shell accounts from an ISP were popular during the early to mid 1990s, so that customers could perform modifications and use tools not accessible via FTP. Telnet is clearly showing its age and should be replaced on any public-network machine where it is currently available. It is an insecure protocol: everything, passwords included, crosses the network unencrypted and is very easy to intercept. Telnet commonly runs on port 23.
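To illustrate why interception is so easy, here is a small Python sketch that recovers the readable text from a captured Telnet byte stream by skipping the protocol's command sequences. The function name visible_text and the "captured" bytes are hypothetical; the command codes are the standard Telnet ones (IAC, WILL, WONT, DO, DONT, SB, SE). It is a simplified reading of the protocol, not a complete parser.

```python
IAC, SB, SE = 255, 250, 240          # "interpret as command", subnegotiation begin/end
NEGOTIATION = {251, 252, 253, 254}   # WILL, WONT, DO, DONT: each is followed by one option byte


def visible_text(stream: bytes) -> str:
    """Drop Telnet command sequences and return the remaining payload as text."""
    out = bytearray()
    i = 0
    while i < len(stream):
        byte = stream[i]
        if byte != IAC:
            out.append(byte)
            i += 1
            continue
        command = stream[i + 1] if i + 1 < len(stream) else None
        if command == IAC:                 # escaped literal 0xFF
            out.append(IAC)
            i += 2
        elif command in NEGOTIATION:       # IAC <verb> <option>
            i += 3
        elif command == SB:                # skip to the closing IAC SE
            end = stream.find(bytes([IAC, SE]), i + 2)
            i = len(stream) if end == -1 else end + 2
        else:                              # other two-byte commands, or a truncated stream
            i += 2
    return out.decode("latin-1")


# Hypothetical capture of a client's side of a login: option negotiation, then typed text.
captured = bytes([IAC, 251, 24]) + b"alice\r\n" + bytes([IAC, 253, 1]) + b"hunter2\r\n"
print(visible_text(captured).split())
```

No key or decryption step appears anywhere in the sketch; once the handful of command bytes are skipped, the login name and password are simply there to be read.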

Secure Shell
Secure Shell (SSH) is not just one tool but a suite of network tools that increase the security of remote computer operations. It can act as an encrypted, secure replacement for both FTP and telnet. It is an especially good idea to replace telnet with SSH. More precisely, it's a good idea to replace it with OpenSSH, a completely free workalike of the commercial SSH. OpenSSH was created by the security-oriented OpenBSD project. It is primarily oriented toward Unix systems, but there are versions available for Windows and the Macintosh as well. The OpenSSH homepage has a number of links to clients for various platforms.

NNTP
The Internet has been a social medium for decades. In the pre-Web era Usenet was one of the most popular forums for discussing everything under the sun, and it remains so today. NNTP is the Network News Transfer Protocol. Defined in RFC 977, this protocol is one of the relatively few standardized things about Usenet. The one other standard, specifying the format of Usenet messages, is found in RFC 1036. NNTP is the protocol by which messages are read from and posted to Usenet (also known as NetNews).

What is Usenet? Usenet is a shared, worldwide, hierarchical system of various "newsgroups" catering to specific topics. It is similar to a very large bulletin board system with a good dose of anarchy thrown in. The "big eight" hierarchies begin with comp, humanities, misc, news, rec, sci, soc, and talk, and are fairly orderly. There is an established procedure to be followed when calling for the creation of a new group, and for voting on that motion. Some newsgroups may be moderated, so that no message ("post") appears without the approval of the group's human moderator. Groups are dedicated to specific topics and discussion of other topics within the group is generally discouraged (and frequent nonetheless). The group's name generally reflects the interests of the group. For example, rec.pets.cats is for discussing pet cats. The newsgroup rec.pets.cats.anecdotes is for telling and discussing funny stories about your pet cat. The newsgroup rec.pets.cats.announce is a moderated newsgroup containing various compilations of Frequently Asked Questions (FAQs).
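The dotted names map naturally onto a tree. As a small illustration (a sketch of my own, not how any particular news server stores its groups), the following Python function nests the example group names above into dictionaries; build_hierarchy is a hypothetical name.

```python
def build_hierarchy(groups: list[str]) -> dict:
    """Nest dotted newsgroup names into a tree of dictionaries."""
    tree: dict = {}
    for name in groups:
        node = tree
        for part in name.split("."):
            # Descend one level per name component, creating levels as needed.
            node = node.setdefault(part, {})
    return tree


groups = ["rec.pets.cats", "rec.pets.cats.anecdotes", "rec.pets.cats.announce"]
tree = build_hierarchy(groups)
print(sorted(tree["rec"]["pets"]["cats"]))
```

Each component of a name narrows the topic, which is why a newsreader can present the groups as a browsable outline.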

Typical Usenet discussions ("threads") will begin with a single post either asking a question, making an observation or proposal, or telling a story. If anyone else who reads the message is interested, they may respond with further questions, a story of their own, an answer, hostility (a "flame"), or whatever they please. Sometimes threads grow to substantial size while focusing on the original topic and remaining civil. This is rare. Most long threads either exhibit severe topic drift or degenerate into "flame wars", online arguments with two or more factions that sometimes exhibit great wit and verbal skill but usually consist of unsophisticated personal attacks repeated with greater intensity as time goes on. Topic drift and flame wars are frequently combined. Topics often drift toward gun control, environmental politics, and Nazi Germany, at which point the stage is usually set for a flame war. Once the participants are exhausted and nobody has posted a new reply to the thread in a while, the thread is generally considered "closed." Replies to messages are sometimes posted as separate messages, especially in popular groups, since it is easy to get lost in the background otherwise.
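Newsreaders typically reconstruct a thread by following each message's reference to the message it answers. The Python sketch below assumes a simplified model in which every post records a single parent ID (real messages carry this information in header lines); the post data, the IDs, and the function name thread_lines are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical posts: (message_id, parent_id or None, subject).
posts = [
    ("<1@example>", None, "Which litter do you use?"),
    ("<2@example>", "<1@example>", "Re: Which litter do you use?"),
    ("<3@example>", "<2@example>", "Re: Which litter do you use?"),
    ("<4@example>", "<1@example>", "Gun control (was: Which litter do you use?)"),
]


def thread_lines(posts):
    """Return the thread as indented subject lines, replies nested under their parent."""
    children = defaultdict(list)
    for message_id, parent_id, subject in posts:
        children[parent_id].append((message_id, subject))

    lines = []

    def walk(parent_id, depth):
        # Visit each reply in posting order, indenting one level per generation.
        for message_id, subject in children[parent_id]:
            lines.append("  " * depth + subject)
            walk(message_id, depth + 1)

    walk(None, 0)
    return lines


print("\n".join(thread_lines(posts)))
```

The last post in the sample shows topic drift in miniature: it is still a reply within the thread, but its subject has already wandered elsewhere.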

It is critical that the protocol for sending and receiving Usenet messages over a network is standardized, because Usenet only exists as a distributed entity. There is no central Usenet server, no central Usenet authority, and no external control of Usenet. It is possible to remove a site from the WWW by legal force. It is impossible to remove a message from Usenet by legal force unless the authority in question is able to wrangle cooperation from the thousands of Usenet providers and Usenet archives worldwide. This is especially relevant to the binary groups on Usenet.

Unlike most groups on Usenet, binary groups are not primarily concerned with discussion. They are concerned with the sharing of files - whether software, pictures, sounds, or videos. Binary groups contain huge, multipart messages consisting of computer files that have been converted to ASCII characters transmittable over Usenet. Many Usenet providers carry few or no binary groups, because the bandwidth requirements are tremendous and content is often illegal (such as copied software and movies) or sexually explicit. Tamer binary groups catering to (for example) 3D artists and model train enthusiasts also exist.
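The conversion to ASCII and the splitting into parts can be sketched in a few lines of Python. This example uses Base64 from the standard library as a stand-in; it is an assumption made for convenience, and binary groups have historically used other text encodings such as uuencode. The function names, line length, and part size are likewise arbitrary choices for illustration.

```python
import base64


def encode_multipart(data: bytes, lines_per_part: int = 2, line_length: int = 60) -> list[str]:
    """Encode bytes as ASCII text and split the text into message-sized parts."""
    text = base64.b64encode(data).decode("ascii")
    lines = [text[i:i + line_length] for i in range(0, len(text), line_length)]
    return [
        "\n".join(lines[i:i + lines_per_part])
        for i in range(0, len(lines), lines_per_part)
    ]


def decode_multipart(parts: list[str]) -> bytes:
    """Reassemble the parts in order and recover the original bytes."""
    return base64.b64decode("".join(parts).replace("\n", ""))


payload = bytes(range(256))            # stand-in for an image or program file
parts = encode_multipart(payload)
print(len(parts), decode_multipart(parts) == payload)
```

Base64 turns every 3 bytes into 4 characters, so the encoded text is about a third larger than the original file. That overhead, multiplied across very large files, is part of why binary groups consume so much bandwidth.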

Beginning in the mid 1990s with America Online providing Usenet access, the quality of discussion on Usenet deteriorated considerably. Millions of new users with no sense of etiquette or respect for the existing culture they began participating in drove away many former Usenet enthusiasts and diluted the quality of discussions for those who remained. Since then Usenet has also suffered repeated assaults from automated spamming and garbage flooding programs. And, since there is no central authority for Usenet, it has not been possible to stop the assaults. Many smaller groups with fewer messages have been killed off by spam and garbage (computer-generated messages full of random words, posted with false identities, intended to make groups less readable).

In spite of what might seem like a mountain of problems, Usenet is still one of the best electronic resources in existence. Beginning in 1995, a company called Dejanews began archiving the complete text of every non-binary newsgroup it could find. Dejanews reinvented itself as Deja and eventually went under, but Google bought the Usenet archive. Google continues to host and maintain the archive as Google Groups. One can read specific newsgroups on Google Groups, or perform keyword-based searches. Google Groups recently took a remarkable step forward when Google managed to find, and make available online, Usenet archives dating back to 1981.

If you want to make the final leap and begin participating in Usenet, there are a number of options. Many large ISPs (like Verizon, AOL, or Earthlink) already offer a large number of newsgroups. All you need to do is configure a client with the proper information and you can begin participating. If your ISP does not carry newsgroups, or does not carry some you are interested in, there are a number of companies providing unfiltered, complete Usenet access for a monthly fee, like Giganews. If you already have basic Internet access but would like to participate in (non-binary) Usenet groups, it is possible to register for a genuine, free of charge news feed at news.cis.dfn.de.

Hotline
Hotline was the first peer-to-peer file sharing system of which I am aware. Released in 1997, it never achieved the phenomenal popularity of Napster but remains an important part of the electronic landscape. Unlike everything previously mentioned in this article, the Hotline protocol assumes that Hotline clients are also servers, and servers are also clients. That is, every instance is a peer of every other instance. Also unlike the previously mentioned protocols, the Hotline protocol is unpublished. Documents describing it exist, but they were obtained by reverse-engineering the original software.

An awareness of Hotline is valuable because this widely used program has often been overlooked in the excitement attending Napster and its successors. Yet users may transfer tremendous amounts of information through Hotline, bogging down the network for other users. At Pacific University, highly publicized music swapping programs like Napster quickly had their bandwidth limited so that individuals wouldn't exhaust the available network capacity. While Napster was throttled to slow downloads of 4-megabyte MP3s, my roommate was happily downloading 700-megabyte movie files via Hotline, often starting 3 or 4 downloads at a time so he could just let his computer chug away while he went about his day.

Hotline and Hotline knockoffs are first-rate sources for the Usual Suspects of pornography, pirated movies, pirated music, pirated games, and pirated software. All of these things are bandwidth-intensive, so even schools with no moral or legal qualms may wish to set policies regulating Hotline use. In reality, regulating any particular client or protocol will just push users to new software. It makes more sense, ultimately, to assign static IP addresses to networked computers and monitor each address for excessive traffic.
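Monitoring by address rather than by program can be quite simple in principle. The Python sketch below assumes the network equipment can export records of the form (source address, bytes transferred); the record format, the sample addresses, the one-gigabyte limit, and the function name heavy_users are all hypothetical.

```python
from collections import defaultdict

# Hypothetical flow records: (source address, bytes transferred).
flows = [
    ("192.168.1.10", 4_000_000),
    ("192.168.1.22", 700_000_000),
    ("192.168.1.10", 4_000_000),
    ("192.168.1.22", 700_000_000),
    ("192.168.1.31", 250_000),
]


def heavy_users(flows, limit_bytes):
    """Sum traffic per address and return the addresses that exceed the limit."""
    totals = defaultdict(int)
    for address, size in flows:
        totals[address] += size
    return {addr: total for addr, total in totals.items() if total > limit_bytes}


print(heavy_users(flows, limit_bytes=1_000_000_000))
```

Because the tally ignores which protocol produced the traffic, it flags the roommate with two 700-megabyte downloads no matter what software he switches to next. This only works if each address stays tied to one machine, which is the reason for assigning static IP addresses.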

In Conclusion
The World Wide Web gets the lion's share of attention when it comes to popular reporting about the Internet. Ultimately, the Internet is only a framework over which limitless different services can run. An awareness and knowledge of some of these different services is the first step toward using them and dealing with them.