(A "how to" for young web seekers)

How to research, evaluate and collate web material
by A+heist (heavily edited by fravia+)

First published: mid February 2008
updated: 17/Feb/2008

Introduction (by fravia+)
How to research, evaluate and collate web material
by A+heist
De materia inveniendis   ~   Evaluation tricks
Finding valid "reconstructed" images
Finding more information (for social engineering)
Other resources

(by fravia+)

Old A+heist reappeared out of the blue with an interesting short essay: how to prepare a piece of research at pre-university level, full-fledged and thoroughly valid, on a given topic (in his example: Dinosauria) in a very short amount of time. As the Author himself points out: "Unfortunately, finding journals and texts is often easier than evaluating them".
A problem I have noticed (observing my own kids) is that most "school-searches" are nowadays made just using google and wikipedia. Not "starting" from google and wikipedia: "just".
Please understand me right: I admire wikipedia and think it is one of the best knowledge tools around. But it is still a tool for general, unevaluated knowledge. It should be used as a springboard, not as a swimming pool. Otherwise, since most pre-university students (and alas many university students as well) don't know any evaluation lore at all, such an approach could lead to catastrophic results from a scientific point of view. A "research" made using just google and wikipedia, while still barely acceptable for pre-university students, is surely NOT a good idea for university level students. In fact there are many better, more effective, relatively simple and scientifically more solid ways to proceed, as this paper by our fellow +HCUker A+heist intends to prove... strange as it might seem... arriving at a wealth of scientific material after having STARTED from wikipedia and after having used almost exclusively the main search engines.
Here lies the very worthiness of this paper: our readers will be able to apply its many searching-teachings (beautifully summarized in the "conclusions") to many different (and "saurian-unrelated") targets.

How to research, evaluate and collate web material
by A+heist

De materia inveniendis 

OK, this is going to be a broad list of possible 'angles' you could use when searching for scientific material on the web. Among other aims, we intend to find which Authors, texts and even languages and specific universities are the "top of the tip" for... let's say... "dinosauria-related" sciences. We don't want "second hand" or "second tier" knowledge: since we are searchers, and we "own" the web (being able to grab her most intimate parts :-), we might as well go for the absolute best.
Let's first of all just check what the searchscape looks like using google and wikipedia.
As you can easily see from this English wikipedia entry (but it would be foolish not to quickly check at least its DE and FR counterparts), the most interesting parts for us, because we are trying to establish a foothold quickly, are the modern definition (where, should we have forgotten, we are informed that paleontologists study such matters and that taxonomy discussions are bound to be peppered around) and the Notes and references part, which we will use to try to gather a useful bibliography.

The 'creationist' crap

Unfortunately some searches are bound to be politically sensitive, and the "Dinosauria" search is one of these. A quick glance at that very Notes and references part will show that we have already encountered a problem that we will meet again and again: the crap "creationist discussion", a religious-oriented backward fight going on in the States, that is of no scientific interest whatsoever and that we will consider just useless background noise, and from now on nuke from our desired results.
So a dinosaur -creationist -creationism parameter will later come in handy in all our queries.
Incidentally, while for other search-targets it is usually very useful to visit and comb any messageboard focused on the topic you are researching, this 'creationist' crap unfortunately means that searchers should steer clear of all "paleontology" messageboards, since they will be infested by trolls and shills.

In this paper I will use google, yahoo and, occasionally, altavista and A9. Be aware of the huge differences among the various main search engines: google is globally the most effective, but its indexes are often quite stale; yahoo has the biggest (and freshest) index, but it is heavily dependent on .com sites; altavista is much too easily spammed... and so on.
Well, of course when starting with ole google a parameter like dinosaur -creationist -creationism is not enough, given the arrogance of the commercial invaders of our web, therefore our searchstrings need a little more cleaning just to start with: dinosaur -buy -youtube -homepage -shop -creationist -creationism -film -quiz -Bar-B-Que -download, or maybe we should try a completely different approach: dinosaur site:edu.
As you can see, if you tried the queries I listed, we are already fishing in cleaner waters.
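The two cleaning approaches just described (negative terms and a site: restriction) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name build_query is made up for this example, and it merely assembles the text you would type into the search box, it does not talk to any search engine.

```python
def build_query(keyword, exclude=(), site=None):
    """Assemble a query string: keyword, then -term for each noise word, then an optional site: filter."""
    parts = [keyword]
    parts += ["-" + term for term in exclude]
    if site:
        parts.append("site:" + site)
    return " ".join(parts)


# Noise words taken from the cleaned searchstring quoted in the text.
noise = ["buy", "youtube", "homepage", "shop", "creationist",
         "creationism", "film", "quiz", "Bar-B-Que", "download"]

print(build_query("dinosaur", exclude=noise))
print(build_query("dinosaur", site="edu"))
```

Keeping the noise words in a list makes it easy to reuse the same "cleaning" on every later query and to add a new term whenever a new kind of rubbish shows up in the results.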

We were pointing out the "Notes and references" part of the main 'entry' to our target on wikipedia. When searching, however, we have to keep in mind that many gems often "lurk better" in the dark web when using specific searchterms: on wikipedia for instance the same "Notes and references" list for Allosaurus ("strange lizard" because "distinguished from any known Dinosaurs by the vertebrae, which are peculiarly modified to ensure lightness": aka "the wolf of the late Jurassic") could represent a valid starting point for our query. The interesting fact (which represents at the same time an important caveat for all wikipedia users!) is that the references for our specific genus "Allosaurus" are on wikipedia much more exhaustive than those we found for its own suborder Theropoda. Clearly an Allosaurus aficionado has invested a lot of time in this entry.

Time to change perspective and start our search from a DIRECTORY, not from a search engine. Of course we'll start with DMOZ. Let's dig: Dinosaurs... not much here, as a quick glance at the results shows, so we'll leave dmoz this time.

Let's try some nice web-combing. I want to gather quickly the names (and web-addresses) of the most important journals, since I'll need them to assess the value of the Authorities, universities and books we have to find. A search for paleontology journals, or even better, vertebrate paleontology journals, should cut the mustard quickly, and indeed it does: a bunch of first results have been collected here. Note that this long list of 'paleontological' journals should be further used as a "springboard" in order to quickly gather further angles and be able to finetune your specific queries. An example is using three 'promising' titles in order to gather savvy sites: +journals "Cretaceous Research" Paleobiology "Revue de Paleobiologie" should help us to find further lists of journals, if needed, whereas "Dinosaur Paleontology" "vertebrate paleontology" should quickly give us also non-journal links. Note how the searchscape changes if we add to this query "The Society of Vertebrate Paleontology": "The Society of Vertebrate Paleontology" "Dinosaur Paleontology" "vertebrate paleontology".

I have decided to call this well-known trick (using elements of your previous query to broaden the next query) the accumulative law of websearching, since afaik there's no precise terminology for it.
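A minimal sketch of this accumulation, in Python, might look as follows. The function names (quote, accumulate, to_query) are invented for the illustration; the search terms are the ones used in the journal queries above. The order of the terms in the output is simply "old terms first", which is an arbitrary choice of this sketch.

```python
def quote(term):
    """Wrap multi-word terms in double quotes so that engines treat them as exact phrases."""
    return f'"{term}"' if " " in term else term


def accumulate(previous_terms, new_terms):
    """Return the previous terms plus any new ones not already present, keeping their order."""
    combined = list(previous_terms)
    for term in new_terms:
        if term not in combined:
            combined.append(term)
    return combined


def to_query(terms):
    """Join the terms into the string you would type into the search box."""
    return " ".join(quote(term) for term in terms)


first = ["Dinosaur Paleontology", "vertebrate paleontology"]
second = accumulate(first, ["The Society of Vertebrate Paleontology"])

print(to_query(first))
print(to_query(second))
```

Each interesting name found in the results of one query (a journal title, a society, an Author) becomes a candidate term for the next call to accumulate.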

Let's shift focus again. The point here is to use our "dinosaur*" example to show how you can quickly but effectively search the web for scholarly material, bypassing all the commercial guet-apens we could encounter on the road.
So use your fantasy when poking the web... ...you get the idea. Now you should be able to carry on on your own... but there's still one point that needs to be addressed: collation.

Wikipedia is quite imprecise about collation, and defines it as "the assembly of written information into a standard order". This is not true. Collation is the assembling of all the information you have found (not necessarily "written": why? Just because historically only texts have been collated?) integrating such information in a coherent presentation in order to offer the best possible result. This means among other things: eliminating doubles, choosing the best sources, choosing among variants, and so on.
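One of the collation steps named above, eliminating doubles, is easy to sketch in Python. This is an illustration only: the note structure (folder, title, url) and the function names are assumptions made for this example, the URLs are placeholders, and "choosing the best sources" remains a judgment that no such script can make for you.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Build a comparison key: lowercase host without 'www.', path without trailing slash, plus query."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def collate(findings):
    """Eliminate doubles (the first note wins), then order what is left by folder and title."""
    unique = {}
    for item in findings:
        unique.setdefault(normalize(item["url"]), item)
    return sorted(unique.values(), key=lambda item: (item["folder"], item["title"]))


# Hypothetical notes; the URLs are placeholders, not real sources.
notes = [
    {"folder": "journals", "title": "Paleobiology", "url": "http://www.example.org/paleobiology/"},
    {"folder": "universities", "title": "Bristol", "url": "http://example.edu/bristol"},
    {"folder": "journals", "title": "Paleobiology (again)", "url": "http://example.org/paleobiology"},
]

for note in collate(notes):
    print(note["folder"], "|", note["title"], "|", note["url"])
```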

Since you will need to "collate" all the material you will gather, how should you "prepare" it?
There are two easy main options: saving sessions and taking notes.
If you are using the browser Opera (considered the quickest among the graphical browsers and usually preferred by searchers), you can save the snippets of interest you find into a track of "bread crumbs" using its Notes function. You highlight a target text found on a webpage, then right click on it and choose the "save to note" option (or press CTRL+SHIFT+C, like the copy command, but with an extra "shift"). To see all your notes just press CTRL+ALT+E (or click "tools" and then "notes"). The webpage you took the note from is automatically stored, together with the note. Hover over the note in order to see the URL, or just double click on the note to go back to the page where you got it from.
Do not forget that you can edit those notes!
Clicking anywhere into the text of a note (on the right side), you can write and delete text, copy it to the clipboard, or insert it from there (works just like a text editor). You can also start an empty note (CTRL+ALT+E and then click on "new note") and write in it whatever you fancy.
This makes Notes a great tool for preparing an excerpt from a webpage.
When, for example, the information you need is "hidden" in a lengthy piece of prose, the Note function can be used to jot down just a few headwords as you read. That way you don't have to re-read the entire document if you need it again, and you can later use the snippets you highlighted in your wordprocessor.
This makes Notes a very useful function for students! (Notes are great for saving forum posts, draft submissions, login details and eReceipts).
Finally do learn how to create FOLDERS inside your notes, and then just shift the new notes inside the appropriate folder. When preparing this paper I had just four folders: "journals", "tips", "universities" and "alltherest", and collation was easy afterwards.

You can also (and you should) save your global search session, freezing a given moment of your search activity in time. I usually name sessions with the current date and time, something like "110208_1500", but you could name the session you have started following this paper -say- "dinosaurs". Get used to saving your sessions when searching: it will be a life saver later, when collating results.
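If you want to generate such date-and-time labels automatically (for session names, or for folders where you store saved pages), a small Python sketch is enough. It assumes that "110208_1500" means day, month, two-digit year, then hour and minute; that reading is an assumption based on the European date style used in this paper, and the function name is made up for the example.

```python
from datetime import datetime


def session_name(moment=None):
    """Return a DDMMYY_HHMM label for the given moment (or for now, if none is given)."""
    moment = moment or datetime.now()
    return moment.strftime("%d%m%y_%H%M")


# 11 February 2008, 15:00
print(session_name(datetime(2008, 2, 11, 15, 0)))  # 110208_1500
print(session_name())                              # label for the current moment
```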

Evaluation tricks

Readers must learn to evaluate what they find on the web. Evaluation refers to the process of analyzing (and ordering) the data you have gathered in order to determine whether you are carrying out your searches effectively, and the extent to which you are achieving your stated objectives and anticipated results.

There are many tricks you can use: one of the simplest I know of is to search for images, not only for texts: you can easily "rewind" your images search back to interesting SITES starting from the images you found. This is a very interesting trick, since it allows you to use your "visual evaluation" skills. The sites you'll click onto are bound to be (most probably) interesting BECAUSE you can very quickly evaluate the worthiness of the images from the thumbnails, and hence the worthiness of the site itself. Here is an example for Apatosaurus: where would you click?
More on images searching below.
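The "rewinding" from an image back to its site can be done by hand, cutting the address back one directory at a time, or sketched in Python as below. The function name and the example URL are hypothetical; whether a given directory actually answers with something useful (an index page, a gallery, a 403) is for you to find out.

```python
from urllib.parse import urlsplit


def parent_directories(image_url):
    """List every directory above the image, from the nearest one up to the site root."""
    parts = urlsplit(image_url)
    segments = [s for s in parts.path.split("/") if s][:-1]  # drop the file name
    base = f"{parts.scheme}://{parts.netloc}/"
    urls = []
    while segments:
        urls.append(base + "/".join(segments) + "/")
        segments.pop()
    urls.append(base)
    return urls


# Placeholder address, not a real image.
for url in parent_directories("http://example.org/paleo/gallery/apatosaurus.jpg"):
    print(url)
```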

Unfortunately, finding journals and texts is often easier than evaluating them. This is due to the fact that we are searching a matter that we don't yet master: we are "learning as we go", so to say. Hence we have to rely on what other people assert (which means we have the added need to evaluate the "authority" of such people); or we have to rely on our own feelings (often useful, at times misleading); or we have to rely on the old evaluation rules like, for books and texts, their intended audience and the purpose of the information they deliver.

Let's try for instance to find the best universities for our dinosauria. There cannot be a "best university for paleontology": different universities have different strengths in different areas of paleontology, usually depending on the interests of individual professors. And your interests also matter. You can find out which professors are interesting to you by reading their published papers in the professional journals that we have found above, but with any journal we have a further -more global- problem: is the content freely available (as it should be, and as is more and more frequent) or is that content still "jailed" inside an obsolete proprietary database?
Fortunately, many journals (though by no means all) now offer at least partial access to the full text of the papers they publish. Here are a few examples:
Journal of Paleontology (some articles can be retrieved as full texts), Paleobiology (some articles can be retrieved as full texts) and Palaios (some articles can be retrieved as full texts).
Journal of Vertebrate Paleontology (articles cannot be retrieved as full texts).
It is worth pointing out that even when searching a journal outside the English speaking area, for instance a FRENCH journal like the Annales de Paléontologie, you are nowadays bound to find most articles in English. This is important for your decisions about future studies, since in the European Union almost all countries that have a "small" language (Denmark, Holland, Finland...) offer more and more university level courses in English. So you are not confined to the UK and Ireland :-)

But the fact that a journal has a closed database doesn't mean you have to give up on it: even the content of those journals that still follow the obsolete and absurd "pay per view" model, instead of publishing the full content on the web, will be available for free anyway in most good public libraries and in university and college libraries.

Consulting the journals you'll be able to find out which universities are most likely to offer a good paleontological preparation, which could be useful if you are considering studying later in a paleontological faculty. In that case -again- you should make an effort to contact directly those professors whose work has interested you, by letter, email or phone, and arrange to visit their departments. Do not hesitate: in general so few people ever care about their work that you can be assured they will bend over backwards in order to help you with good advice and suggestions. This will help you learn more about graduate programs and possibilities.

So how do we find the "best universities"? Well we could first gather which universities have a paleontology faculty (university NEAR paleontology). But the "journals" approach is less biased and more rewarding. After a quick glance at the journals we can tentatively draw some conclusions (we have excluded universities that might be very good but do not offer english courses):
In the States the University of Pennsylvania seems to be an excellent choice for anything related to paleontology, especially if the critters you want to work with are dinosaurs. This university, one of the very few places in the States where you can actually take a degree in paleobiology, has the oldest tradition of paleo education in the New World.

In the European Union the university of Bristol seems to have good paleobiologists.

Elsewhere the Moscow university for Geology and Paleontology has been considered "the best university & faculty forever".

If you see your future inside paleontology, check also the FAQs and other "career advice" stuff listed in the 'mailing lists' linked below.

Quickly gathered from the web, with some comments:
  • Weishampel, D.B., P. Dodson, and H. Osmólska (eds.) (2004). The Dinosauria. 2nd edition. University of California Press, Berkeley. 833 pp. Simply put, the definitive text on dinosaurs. A bit technical for beginners, but exhaustive in detail; a "must have".
  • Benton, Michael J. (2004). Vertebrate Palaeontology, Third Edition. Blackwell Publishing, 472 pp. A basic textbook on vertebrate paleontology by a Bristol professor (three editions, published in 1990, 1997, and 2005), designed for paleontology graduate courses in biology and geology as well as for the interested layman.
  • Euan Clarkson’s Invertebrate Palaeontology and Evolution (published first by George Allen & Unwin and now by Blackwells) has for many years led the field in palaeontology textbooks for undergraduate courses. Although in recent years a number of new teaching texts have been published, none can rival Clarkson.
  • Milsom Clare & Rigby Sue (2004). Fossils at a Glance: "a solid addition to any paleo library".
  • R. McNeill Alexander, Dynamics of dinosaurs and other extinct giants, as the title implies ("extinct giants" is clearly appealing to the laymen), this is a 'reduced' work from this expert (of course we can quickly find out what this professor ever published :-)
    "In this book Alexander exercises his considerable expertise in engineering; he tackles questions of dinosaur weight, gait, agility, behavior, and metabolism from a mechanical perspective, and emphasizes methods a paleontologist can use to infer or calculate whole animal structure and function from bones and trackways. Flying and marine reptiles and giant birds and mammals are also included. For the intelligent lay reader who wonders how scientific reconstructions of fossil animals can be created; and for the professional seeking an approachable version of topics covered in Alexander's more technical works".

Finding valid "reconstructed" images

Images searching is an art apart, therefore I'll simply refer you to fravia's very useful section. For our dinosauria, however I would try the following queries:
  1. Aeolosaurus note...
    • how a specific species, here "Aeolosaurus" ("wind reptile", alluding to the windy Patagonian region of southern Argentina where the fossil was found), gives us less "noisy" results; however most images seem to be just one and the same (Lucas Fiorelli's)
  2. Titanosauridae note...
    • how enlarging the image query to a specific group, here "Titanosauria" (titanic reptiles), gives us more interesting results;
  3. A simple query like "Where to Find Great Dinosaur Pictures" will bring us at once to http://www.search4dinosaurs.com/. Now, applying even the simplest evaluation parameters, it is obvious that this site is utterly unscientific. Yet it has gathered a bonanza of pictures that could still be useful if used cum grano salis.
Quickly scanning the web for images inside generic "dinosaurs" pirated books, the reader will realize that the available literature (ease of availability being unfortunately, on the web, directly proportional to "celebrity") is nothing to write home about, especially if compared with the "scholar" results we were able to fetch with the approaches described above (for instance: here). However, some less "scientific" images-oriented books are easily available on the web, probably because they satisfy the unwasheds' desires for nice images.
In less than 15 minutes (February 2008, YMMV), with the usual techniques, I could find and download for free the following books:
Norman_Dinosaurs-A Very Short Introduction-0192804197.pdf
This last book has many interesting images, but they still lack the "expressiveness" I would like to have when writing a dissertation or a paper on such matters (this is just me, YMMocV).

See: everything depends on the KIND of images you are searching for. As stated, if you want scientific images you are better served by the images you'll find perusing the journals we have combed, but if you would be more satisfied with less exact but MUCH more impressive (if probably a tad unscientific) images of our dinosauria, you might also shift the seeking focus onto some side alleys. For instance the most able among dinosaur "comics" designers: Ricardo Delgado (found through the "age of reptiles" search above, images: here right and on the journals page). I suppose all his books must have landed in the public domain, since you can easily find all of them on the web for free in their "comic book archive files" format.

A small digression about .cba, .cbz, .cbr or .cbt formats
If you want to seek "comic" books on the web, be aware that they will be most probably stored in .cba, .cbz, .cbr or .cbt format (Comic Book Archive files) which consist of a series of image files archived in ZIP, RAR, or more rarely TAR or ACE formats with the filename extension renamed to .cbz, .cbr, .cbt and .cba respectively. Such .cba, .cbz, .cbr or .cbt files typically contain PNG or JPEG files. Occasionally GIF, BMP, and TIFF are seen. You can open them directly, on Linux, using evince, or just rename them to their respective archive format and extract all the images into a folder.
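Since a .cbz file is just a renamed ZIP archive, you do not even need to rename it: Python's standard zipfile module will read it as it is. The sketch below covers only the ZIP-based .cbz case (a .cbt could be handled in a similar way with the standard tarfile module, while .cbr and .cba need external tools); the function names and the file name in the final comment are hypothetical.

```python
import zipfile

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff")


def list_pages(archive):
    """Return the image file names inside a .cbz (ZIP) archive, sorted by name."""
    with zipfile.ZipFile(archive) as cbz:
        return sorted(name for name in cbz.namelist()
                      if name.lower().endswith(IMAGE_EXTENSIONS))


def extract_pages(archive, target_dir):
    """Unpack only the image files into target_dir and return their names."""
    pages = list_pages(archive)
    with zipfile.ZipFile(archive) as cbz:
        cbz.extractall(target_dir, members=pages)
    return pages


# Usage, with a hypothetical file name:
# pages = extract_pages("some_comic.cbz", "pages")
```

Sorting by name usually gives the reading order, because the pages inside such archives tend to be numbered (001.jpg, 002.jpg...), but that is a convention, not a guarantee.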

Finding more information

Sources of More Information
(social engineering)

Paleontological Research Institution
1259 Trumansburg Road
Ithaca, NY 14850
(607) 273-6623

The Paleontological Society
P.O.Box 1897
Lawrence, KS 66044-8897

The Society of Vertebrate Paleontology
P.O. Box 809183
Chicago, IL 60680

A somewhat informal forum created by professionals (as part of the Dinosaur Mailing List, see below) 
The Dinosaur Society
200 Carlton Ave.
East Islip, NY 11730

Department of Earth Sciences 
University of Bristol 
Wills Memorial Building 
Queens Road 
Bristol, England BS8 1RJ 
Phone	0117-954-5400 
Fax	0117-925-3385

Other resources

(worth a try)


We think that the approaches described above should be enough to allow anyone to start collecting almost anything the web has to offer on the target "Dinosauria"... of course the real purpose of this paper is NOT to explain how to specifically search for "Dinosauria", but rather to show some useful approaches that searchers could use and apply for all sort of targets.

Let's finally summarize what we were able to see today:
  • How to start with wikipedia and google, and some limits of both wikipedia (individual specific entries might be more exhaustive than individual global entries) and google (stale indexes);
  • The importance of "negative" parameters to avoid noise (dinosaur -creationist -creationism);
  • The possibility of limiting scholar searches to .edu sites;
  • The fact that the open directory DMOZ (usually highly considered by searchers) in this case doesn't help much;
  • How to quickly prepare a list of "authoritative" Journals using journal titles to find other journals (the "accumulative" law of websearching);
  • How to quickly find out what a professor ever published;
  • Some examples of using fantasy when poking the web with queries;
  • How to collate your results (Opera: taking notes and saving sessions);
  • How to use an image search in order to find new sites and evaluate quickly the worthiness of such sites;
  • How to (try to) bypass the "deep web" classical problem of closed databases and jailed information;
  • Some universities that could be useful, approaching professors;
  • Establishing a first quick bibliography;
  • How to search for images and the inverse proportion between ease of availability and lack of "celebrity";
  • Comic formats (a digression);
  • Real-life addresses for social engineering purposes;
  • Mailing lists and other bizarre (but useful) resources.
Have phun enjoying your websearching überpowers!
                                                                                                                                                                (c) A+heist 2008

