The Forge Reference Project

 

Topic: Google indexing the Forums?
Started by: Seth L. Blumberg
Started on: 7/26/2002
Board: Site Discussion


On 7/26/2002 at 3:48pm, Seth L. Blumberg wrote:
Google indexing the Forums?

Check out what I found while ego-surfing.

Can we get a /robots.txt that excludes search engines from the Forums, please? I don't necessarily want a prospective employer to be looking at my Forge postings.

Forge Reference Links:
Topic 8

Message 2851#27746





On 7/26/2002 at 3:51pm, Clinton R. Nixon wrote:
RE: Google indexing the Forums?

We most certainly can.

Now, this is a funny question coming from the technical proficiency guy, but: how exactly do we do that?

- Clinton

Message 2851#27747





On 7/26/2002 at 3:57pm, Zak Arntson wrote:
RE: Google indexing the Forums?

This is for Google, but it may be useful for other engines: http://www.google.com/webmasters/3.html#removed. It's a start.

Message 2851#27750





On 7/26/2002 at 3:59pm, Matt Snyder wrote:
RE: Google indexing the Forums?

Clinton R Nixon wrote: We most certainly can.

Now, this is a funny question coming from the technical proficiency guy, but: how exactly do we do that?

- Clinton


If I understand it rightly, you do something like this:


# robots.txt for http://www.yoursite.com
# This file is for restricting access to parts of the web server
# to all robots who use the Robot Exclusion Standard.

User-Agent: *
Disallow: /forum


(Edit: that character after "User-Agent:" should be an asterisk.)

Each "Disallow: /whatever" line names another directory to exclude. Then, you save this file as "robots.txt" in your root web directory.

That's how we do it at the web site I work for.
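
For anyone who wants to check the rules before uploading the file, here is a minimal sketch using Python's standard urllib.robotparser module. It only tests how a standards-following robot would read the two lines posted above; the host name www.yoursite.com is the placeholder from the example, and the helper name is_allowed is made up for illustration. It says nothing about whether a particular search engine actually honors the file.

```python
from urllib.robotparser import RobotFileParser

# The rules as posted above; the host name used below is a placeholder.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /forum
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    `is_allowed` is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.yoursite.com/forum/viewforum.php?f=1",
        "http://www.yoursite.com/index.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, url))
```

Note that the rule is matched as a path prefix, so "Disallow: /forum" also covers any other path that merely starts with "/forum"; writing "/forum/" is narrower if that matters.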

Message 2851#27752





On 7/26/2002 at 4:08pm, Clinton R. Nixon wrote:
RE: Google indexing the Forums?

Got it - I've added a /robots.txt file, and added META tags that should prevent the pages from being indexed, as well.

I'll contact Google and ask them to remove www.indie-rpgs.com/forum from their cached files.

Message 2851#27754





On 7/26/2002 at 4:16pm, Le Joueur wrote:
I'm Not a Web Designer (Nor Do I Play One on TV)

Clinton R Nixon wrote: Now, this is a funny question coming from the technical proficiency guy, but: how exactly do we do that?

Forgive my ignorance, but don't you just put this in the header:

<META NAME="robots" CONTENT="NOINDEX, NOFOLLOW">

A little bit of code in the source that the php pulls the header from should do the trick, right?
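
One way to confirm the tag really ends up in the served pages is to parse the HTML and look for it. The sketch below uses Python's standard html.parser; the class name RobotsMetaParser, the function robots_directives, and the sample page are all invented for illustration and are not taken from the forum software.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> matches.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the page's meta tags."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical page of the kind a shared PHP header template might emit.
SAMPLE_PAGE = """<html><head>
<title>Forum index</title>
<META NAME="robots" CONTENT="NOINDEX, NOFOLLOW">
</head><body></body></html>"""

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_PAGE)))
```

Because the tag lives in each page rather than in one file at the site root, it has to be emitted by whatever template produces every forum page, which is why the shared header is the natural place for it.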

BTW, thanks for pointing me at pMachine; have you switched the reviews to it recently?

Fang Langford

Message 2851#27756





On 7/26/2002 at 4:22pm, Clinton R. Nixon wrote:
Re: I'm Not a Web Designer (Nor Do I Play One on TV)

Le Joueur wrote:
A little bit of code in the source that the php pulls the header from should do the trick, right?

BTW, thanks for pointing me at pMachine; have you switched the reviews to it recently?


Yup - it works great for them. I'm glad you like it.

Message 2851#27758





On 7/26/2002 at 9:34pm, Victor Gijsbers wrote:
RE: Google indexing the Forums?

What exactly is the rationale behind this? I mean, it's your site, you can do with it whatever you wish. But this forum is full of interesting material, and I think it's rather strange to ensure that people won't be able to find it.

Message 2851#27806





On 7/26/2002 at 9:40pm, Clinton R. Nixon wrote:
RE: Google indexing the Forums?

It's so people's names can't be found while searching the Internet. Don't think we won't still be indexed - the Forge will be. However, the front page of the forums (which is all that would be indexed anyway, for reasons I can explain if you like, but are technical and boring) doesn't convey much information, and might have my, Seth's, your, or anyone's name on it.

Many people would rather their names not pop up when people like prospective employers search the Internet. I can completely understand this, especially since I almost got fired from a job about 5 months ago because I said something on my personal journal that was disparaging to a co-worker.

Message 2851#27808





On 7/27/2002 at 11:49am, Victor Gijsbers wrote:
RE: Google indexing the Forums?

Clinton R Nixon wrote: the front page of the forums (which is all that would be indexed anyway, for reasons I can explain if you like, but are technical and boring) doesn't convey much information, and might have my, Seth's, your, or anyone's name on it


Ah, if that is the only thing which would be indexed, not much is lost. Does the 'technical and boring' reason have anything to do with the fact that the threads themselves aren't static html but rather database entries retrieved by your php-scripts?

Message 2851#27853





On 7/27/2002 at 3:34pm, Clinton R. Nixon wrote:
RE: Google indexing the Forums?

Victor,

It's because you access forums and threads with URLs like http://www.indie-rpgs.com/forum/viewforum.php?f=1, which are composed of a web page + arguments sent to that web page. Google, and other indexes, only grab the web page without any arguments.

If our forum system created URLs like http://www.indie-rpgs.com/forum/site_discussion/2341, then the individual threads would be indexed.
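
To make the "web page + arguments" distinction concrete, here is a small sketch using Python's standard urllib.parse. It only shows how the two URL styles decompose; the function name split_page_and_arguments is invented for illustration, and the sketch does not test what any search engine actually does with the arguments.

```python
from urllib.parse import urlsplit, parse_qs


def split_page_and_arguments(url: str):
    """Split a URL into its page part and its query-string arguments.

    Hypothetical helper: returns (page, arguments), where `arguments`
    maps each argument name to the list of values given for it.
    """
    parts = urlsplit(url)
    page = f"{parts.scheme}://{parts.netloc}{parts.path}"
    arguments = parse_qs(parts.query)
    return page, arguments


if __name__ == "__main__":
    urls = [
        "http://www.indie-rpgs.com/forum/viewforum.php?f=1",
        "http://www.indie-rpgs.com/forum/viewforum.php?f=2",
        "http://www.indie-rpgs.com/forum/site_discussion/2341",
    ]
    for url in urls:
        page, arguments = split_page_and_arguments(url)
        print(page, arguments)
```

The first two URLs share the same page part and differ only in the f argument, so an indexer that discarded arguments would see them as one page. The path-style URL carries everything in the page part and has no arguments to discard.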

Forge Reference Links:
Board 1
Topic 2341

Message 2851#27858





On 7/27/2002 at 10:01pm, Victor Gijsbers wrote:
RE: Google indexing the Forums?

This is strange, because to test my theory, I tried a Google search for my nickname at http://gathering.tweakers.net. It's a forum with URLs like "http://gathering.tweakers.net/showtopic.php/220386/1/100", without arguments. As I have over 7000 posts there, I should have found quite a lot. But I didn't find a single thing, nothing at all.

Might this have something to do with the fact that the last part of the url (the '/1/100' here) depends on options in the user's profile? (1/100 means: go to the first page, with 100 posts per page.)

Hm, maybe I'd better ask one of the DB admins on the Gathering of Tweakers itself about this. Never mind.

Message 2851#27873





On 7/29/2002 at 5:07am, Paul Czege wrote:
RE: Google indexing the Forums?

Clinton R Nixon wrote: It's because you access forums and threads with URLs like http://www.indie-rpgs.com/forum/viewforum.php?f=1, which are composed of a web page + arguments sent to that web page. Google, and other indexes, only grab the web page without any arguments.

They've got this:

http://www.indie-rpgs.com/forum/viewforum.php?f=2

Paul

Forge Reference Links:
Board 1
Board 3
Board 2

Message 2851#27966
