This is the first classroom. Today I have invited as teacher ~S~ Humphrey P., aka "HP" or "Humph", a master searcher I had the honour to meet on my old messageboards. Just read the text below, where he goes to great lengths, with the help of Iefaf, to enter a web database, and I'm sure you'll understand and enjoy...
This thread was originally at http://www.insidetheweb.com/messageboard/mbs.cgi?acct=mb959559
First classroom
Cat burglars in the museum after dark
by ~S~ Humphrey P., January 2000

Thread slightly edited by fravia+

It begins with an interesting note by Iefaf:
A side finding worth a bookmark:
Gallica 2000

"70,000 digitized documents, more intuitive navigation: this
new version of Gallica is the most important update since
this server was created in October 1997.

Today the reader has access to a multimedia library whose
documentary resources range from the Middle Ages to the beginning
of the 20th century.

Still images from the prestigious collections of the BnF, printed
works digitized in image mode, and documents in text mode make up
one of the most important digital libraries on the net."

(Ah those arrogant French;)

"For more complex searches, use the interface of the catalogue
of digitized documents."



The very moment Iefaf points out the link http://gallica.bnf.fr/Fonds_Mosaiques/ (visit it and try NOW yourself if you can get at the images BEFORE proceeding with this text...) he receives an immediate and powerful reply by HP:

OK, let's work on this one


N° d'image: n° 008
Bibliothèque Nationale : ancien Hôtel Mazarin. N° Atget : 4540.
Photographie positive sur papier albuminé d'après négatif sur verre
au gélatinobromure ; 21,5 x 17 cm (épr.).
[Cote : BNF - Est. Eo 109b bte 4 ; n° micr. T039534]

http://gallica.bnf.fr/scripts/mediator.exe?L=03100020&I=0000008&F=A = little one
http://gallica.bnf.fr/scripts/mediator.exe?L=03100020&I=0000008&F=C = big one
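
The two addresses above differ only in the F parameter. Here is a minimal Python sketch of swapping it, assuming (from these two examples alone) that L names a collection, I an image number and F a size variant. The helper name variant_url is made up for illustration; it is not anything the site provides.

```python
from urllib.parse import urlsplit, parse_qs, urlencode, urlunsplit

# The "little one" URL quoted in the post above.
SMALL = "http://gallica.bnf.fr/scripts/mediator.exe?L=03100020&I=0000008&F=A"

def variant_url(url: str, variant: str) -> str:
    """Return the same URL with its F parameter replaced (hypothetical helper)."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)         # e.g. {'L': [...], 'I': [...], 'F': ['A']}
    params["F"] = [variant]                # assumption: F selects the image size
    query = urlencode(params, doseq=True)  # parameters keep their original order
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))

print(variant_url(SMALL, "C"))
```

Whether values of F other than A and C exist is not known from this thread.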

Displayed in a separate window, Netscape reports the following:
JPEG image 512x728 pixels
[[La] Bibliothèque nationale][[Image_fixe]]/Eugène Atget,photogr.

When saved, the name is: mediator.exe

Let's have a look...

Inside mediator.exe
FF D8 FF E0 00 10 4A 46 49 46 00 01 00 01 00 48 - ......JFIF.....H

a JPEG, with a wrapper? and a technology attached too!: 
LEAD Technologies Inc. V1.01

No wrapper - Just renaming mediator.exe to 0000008C.jpg works fine.
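
The same check can be done without a hex editor. Below is a minimal Python sketch, assuming only what the dump above shows: the file begins with the JPEG start bytes FF D8 FF and carries the "JFIF" tag at offset 6. The function names are illustrative.

```python
from pathlib import Path

JPEG_START = b"\xff\xd8\xff"  # first three bytes seen in the dump above

def looks_like_jpeg(data: bytes) -> bool:
    """True if the data begins with the JPEG start-of-image bytes."""
    return data.startswith(JPEG_START)

def is_jfif(data: bytes) -> bool:
    """True if the data is a JPEG whose first segment carries the JFIF tag."""
    return looks_like_jpeg(data) and data[6:11] == b"JFIF\x00"

def rename_if_jpeg(path: Path, new_name: str) -> Path:
    """Rename a downloaded file only if its header says it is a JPEG."""
    with path.open("rb") as fh:
        head = fh.read(16)
    if not looks_like_jpeg(head):
        raise ValueError(f"{path} does not start with a JPEG signature")
    return path.rename(path.with_name(new_name))

# The 16 bytes quoted from inside mediator.exe.
header = bytes.fromhex("FF D8 FF E0 00 10 4A 46 49 46 00 01 00 01 00 48")
print(looks_like_jpeg(header), is_jfif(header))
```

With the saved file at hand, rename_if_jpeg(Path("mediator.exe"), "0000008C.jpg") would do the renaming described above.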

Just guessing:
-1- should be two subdirectories off from 03100020, called A and C, 
    and inside C is 0000008.
-2- Or, with the proper naming (as is meticulously demonstrated)
    everything could be within a big .zip file, with mediator.exe
    doing the extracting.

"Directory listing denied
 Listing the contents of this virtual directory is not allowed."

Well. They are polite.

Anyway, question: Skulking around a public metropolis like this
one at gallica.bnf.fr, where are the backdoors and backstairs and
skeleton keys? This is off in the realm of sysadmin and "hacking"
(the newspaper definition) I suppose. But there must be a list of
mundane capers.

"How to gain root" in alltheweb got me 26 of them. One underground
stream will filter into another, and soon we'll have pure water, I

Hmmm. at the bottom of your 478K list of nnnnnnnn.htm is

"... the Gallica site is currently experiencing some disturbances."

That galliant should have his heart examined.
(cat burglars in the museum after dark... shhhh..)

Humphrey P            

F' course these searchers did not miss the port I have highlighted in red above...

A good many of the old volumes are available in PDF format.
Did you notice the funny ports used, like :8091
AFAIAC "How to gain roof access" is more appropriate, and feline (Alltheweb=0 hit;)
Dictionnaires & encyclopédies
http://gallica.bnf.fr/dictionnaires.htm

Encyclopédies & Dictionnaires généraux

Dictionnaire d'Histoire & de Géographie
- Topographie

Dictionnaires biographiques

Dictionnaires de Droit

Dictionnaires d'Économie
- Économie rurale & domestique

Dictionnaires d'Esthétique

Dictionnaires de Langue française
- Synonymes
- Locutions proverbiales
- Analogies
- Étymologies
- Rimes
- Argot

Dictionnaires multilingues & en langues étrangères
My pick
Dunn, Oscar, Glossaire franco-canadien et vocabulaire de locutions vicieuses usitées au Canada (1880)
A tool for the old red rebus?
O'Kelly de Galway, Alphonse-Charles-Albert, Dictionnaire archéologique et explicatif de la science du blason. (1901)
and the tempting
Herbelot, Barthélemy d', Bibliothèque orientale ou Dictionnaire universel contenant généralement tout ce qui regarde la connaissance des peuples de l'Orient... (1697)

Dictionnaires de Philosophie & de Théologie

Dictionnaires de Sciences
- Chimie
- Histoire naturelle
- Mathématiques
- Médecine & Pharmacie
- Physique
- Sciences de l'ingénieur

Dictionnaires des Sciences politique & administrative

And then the hunt goes on...

I was toying with the idea that a Webget-type file downloader could eventually lead us, little sneakers, to the hidden [note: hidden, not forbidden] trove. So using the command:

wget -t 45 -a log.txt -A 0*.htm http://gallica.bnf.fr/

could lead to any directory containing the raw meat gutter cats love so much. Unfortunately, after trying the following combinations:

0* 0*. o*.htm "o*.htm" "o*.ht*" o*.ht*

I simply get stuck, no further than basic.htm. If you are still interested in the "where are the big pictures" question and want to try Webget, a "SvD Recommended tm" tool with the manual at http://www.gnu.ai.mit.edu/manual/wget/ (as altern.org doesn't let me in at the moment), be my mate on the gutter, the last step to the roof, hopefully.
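
A side note on those patterns: some begin with the digit 0 and others with the letter o, and they do not select the same files. Here is a minimal Python sketch of the difference, using the standard fnmatch module as a stand-in for shell-style wildcard matching (an approximation, not wget's own code). The file names are hypothetical, modelled on the nnnnnnnn.htm list mentioned earlier in the thread.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, for illustration only.
names = ["00000008.htm", "basic.htm", "ouvrages.htm", "0000008C.jpg"]

# Three of the patterns tried above (digit zero versus letter o).
patterns = ["0*.htm", "o*.htm", "o*.ht*"]

def accepted(candidates, pattern):
    """Return the names a shell-style wildcard pattern would let through."""
    return [name for name in candidates if fnmatchcase(name, pattern)]

for pattern in patterns:
    print(pattern, "->", accepted(names, pattern))
```

As far as I can tell from the wget manual, accept lists are documented among the recursive retrieval options, so the command may also need -r before -A has anything to filter. An unquoted pattern can also be expanded by the shell before wget ever sees it.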
Yes, it is down. Well, I will spend some time reading the manual, instead. And seeing what I'd grabbed from svd, before. Perhaps webget was amongst all that. Then, again, webget is on GNU somewhere. (svd will be proud of you! You're actually getting me to do something with svd's treasures...) Hmmm. jeff should be following this too.

This is not an ftp entry, but rather an http entry, asking the same questions. "Welp, here we are in /pub/ foyer. What can we do next? Where is the employee entrance? the delivery boy's entrance? the maintenance man's entrance? the elevator?" Can there be an unlisted door which isn't locked? I know there can be an unsecured door behind a locked door. NY Times was like that for a while. The Guardian. Different criteria for accessing the site through an internal page than through the basic.htm default page.

~ Cookie detour: Perhaps the return visit is expedited by a cookie. (Let me think, was I paying attention, then?) Hmmm. The contents of a cookie could be anything... I wonder if the cookie crunchers have made a cookie museum with a little 3x5" card listing each one's key features? Hmmm, why one for NY Times, two for multimania, five for mail.yahoo? Must be accessing the same cookie that many different times. How can that be secure? Surely I could ask a different computer about someone else's cookie... let's see, there was a security level about answering only the address which issued the cookie. Then, there must be a cookie register to keep track of that. And how can you get anywhere by denying cookies? Surely you need one for Amazon.com. Perhaps some others are not necessary, and the cookie pusher wiser by experience.

Anyway, once you were in, you didn't need to come in through the default page. And I suspected that you never did need to come in through the default page's signup process. And, then, again, there are cookies and there are access rights, and where you got it doesn't seem to matter much. PointCast seemed to sign you up with WSJ, NYTimes, LATimes without much trouble.

~ Detour rejoins delivery boy knocking on front door: But the concept isn't any different than bypassing the "name, address, and personal information" page and going right to the ftp address for the demo.zip you want to download.

0* 0*. o*.htm "o*.htm" "o*.ht*" o*.ht*

Hmmm. now, why that sequence is obvious to you - there must be a closed set of possibilities which I hadn't known before. Well, it's right there in the manual...RTFM: examples, robots,,,

~ Up on the rooftop, reindeer pause; Out jumps good old Santa Claus. Down through the chimbly ...
http://www.chebucto.ns.ca/~av359/xmas/carols/roof.html

~ .fr standard file formats. Tried a .pdf. Specifically, the get_page.pdf on page: http://catalognum.bnf.fr:8091/i-full?585&1 Not a .pdf. Seems to be a prefix within... .fr is a good deal like the old IBM and the new Microsoft: would rather do it their way. Speak bnf.fr or die.

~ Webget, for the gnu of it... smarter than the average bot, hmm?

Humphrey P
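
On the cookie detour's question of who may read whose cookie: the browser itself decides which site gets a cookie back, by comparing host names. Here is a simplified Python sketch of the domain-matching rule as it was later written down in RFC 6265. Browsers of 2000 followed Netscape's older and looser description, so treat this as an illustration only; domain_match is an illustrative name, not a library call.

```python
import ipaddress

def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Would a cookie set for cookie_domain be sent back to request_host?"""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    if host == domain:
        return True
    try:
        ipaddress.ip_address(host)  # numeric addresses only match exactly
        return False
    except ValueError:
        pass
    # Otherwise the host must be a subdomain: it has to end in "." + domain.
    return host.endswith("." + domain)

print(domain_match("mail.yahoo.com", "yahoo.com"))   # a subdomain of the issuer
print(domain_match("www.nytimes.com", "yahoo.com"))  # a different site
```

So no central "cookie register" is needed: each cookie is stored with the domain that set it, and the comparison happens on the visitor's own machine.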
