URL.txt Newbie Question

For general issues related to PWB v2.

Moderators: Tyler, Scott, PWB v2 Moderator

mtaylor
Observer
Posts: 5
Joined: Thu Jan 16, 2003 5:56 pm

Post by mtaylor »

We just upgraded to v2.04 and I have a question about URL.txt. In the old PWB version we set up an onlyaccess.txt list for our public access catalogs, and when someone tried to go somewhere that wasn't on the list, it sent them back to the home page. Is there a way to do this in the new version? I've been looking in the .INI file but haven't been able to find that setting. Thanks.

Scott
Site Admin
Posts: 2539
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN

Post by Scott »

PWB v2 uses a single URL filter file. The separate only-access and no-access lists have been replaced by a "+" or "-" prefix on each entry in that one file.

Check out this post.

http://www.teamsoftwaresolutions.com/ph ... c.php?t=13

--Scott

mtaylor
Observer
Posts: 5
Joined: Thu Jan 16, 2003 5:56 pm

Post by mtaylor »

I've been playing around with URL.txt and I almost have it working. I have it set up like this:

-all
+library.ppld.org
+http://www.e-toolbox.com/
+www.libraryhq.com/uhtbin/getenrich
+www.referenceusa.com
+catalog.ppld.org
+etc....

The only problem is that I can still get to Google. When I try to do a search it stops me, but I don't understand why I can reach Google in the first place.

Also in the old version of PWB you had this line in the .INI:

OnDenyAccess=Home

Is there a way I can do this with the new version? Thanks.

Scott
Site Admin
Posts: 2539
Joined: Mon Dec 16, 2002 12:31 pm
Location: Rochester, MN

Post by Scott »

Something in your URL file is allowing access to Google. The filters use an in-string function that looks for each filter string anywhere inside the URL to decide access rights. To get a list of the URLs PWB is seeing, enable the history file, URL tracking, and access logging.

[Security]
...
WriteHistoryFile=True
TrackURL=True
CheckURLAccess=True
...
LogAccess=True
...

By examining the history file you can determine what PWB is allowing access to and the URLs that PWB is parsing.
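
To illustrate why one loose entry can open up an unexpected site, here is a rough Python sketch of an in-string filter. It is not PWB's code: the function name `is_allowed`, the "last matching rule wins" ordering, the allow-by-default behavior, and the `+www.` entry are all assumptions made for the example.

```python
def is_allowed(url, rules):
    """Evaluate '+'/'-' substring rules against a URL.

    Assumed semantics (not confirmed for PWB): rules are read top to
    bottom, 'all' matches every URL, and the last matching rule wins.
    """
    allowed = True  # assumption: a URL that matches no rule is allowed
    url = url.lower()
    for rule in rules:
        rule = rule.strip()
        if len(rule) < 2 or rule[0] not in "+-":
            continue  # skip blank or malformed lines
        sign, pattern = rule[0], rule[1:].strip().lower()
        # In-string test: the pattern may appear anywhere in the URL.
        if pattern == "all" or pattern in url:
            allowed = (sign == "+")
    return allowed


strict = ["-all", "+library.ppld.org", "+catalog.ppld.org"]
loose = strict + ["+www."]  # hypothetical overly broad entry

print(is_allowed("http://catalog.ppld.org/search", strict))
print(is_allowed("http://www.google.com/", strict))
print(is_allowed("http://www.google.com/", loose))
```

Under these assumptions the strict list blocks Google, while the single short `+www.` entry lets it through, because that string occurs inside the Google URL. Comparing each entry in the filter file against the URLs recorded in the history file is a quick way to spot this kind of accidental match.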

PWB v2.04 revision 4, now in beta testing, adds regular expressions to the URL parsing functions. This will allow further fine-tuning of the URL filter file.

PWB v2.04 revision 4 beta
http://www.teamsoftwaresolutions.com/beta/PWBv204r4.zip

Guide to using regular expressions
http://etext.lib.virginia.edu/helpsheets/regex.html
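
As a general illustration of what regular expressions add over plain in-string matching, the sketch below uses Python's `re` module. The pattern and the function name `is_catalog_url` are hypothetical, and the exact regular expression syntax that PWB accepts in the filter file may differ, so treat this as a demonstration of the idea only.

```python
import re

# Hypothetical pattern: match only URLs whose host is exactly
# catalog.ppld.org, anchored at the start of the URL.
CATALOG_ONLY = re.compile(r"^https?://catalog\.ppld\.org(/|$)", re.IGNORECASE)


def is_catalog_url(url):
    """Return True when the URL starts with the catalog host."""
    return CATALOG_ONLY.search(url) is not None


tricky = "http://www.google.com/search?q=catalog.ppld.org"

# A plain substring test is fooled by the host name in the query string.
print("catalog.ppld.org" in tricky)
# The anchored pattern is not.
print(is_catalog_url(tricky))
print(is_catalog_url("http://catalog.ppld.org/"))
```

The anchor (`^`) and the escaped dots are what make the difference: the pattern has to match from the beginning of the URL instead of anywhere inside it.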

Currently PWB v2 does not have an equivalent of OnDenyAccess=Home, but I have added on-deny-access actions to our to-do list.

--Scott

mtaylor
Observer
Posts: 5
Joined: Thu Jan 16, 2003 5:56 pm

Post by mtaylor »

I checked my URL.txt file and can't see why it's allowing people to get to Google. Could you take a look at it and see what's up? Thanks.

http://bluebeetle.org/URL.txt

spragers
Benefactor
Posts: 153
Joined: Fri Dec 27, 2002 9:11 am

Post by spragers »

Hi,

Ran a test using your URL file in place of our own. I couldn't load http://www.google.com

Are you sure you have CheckURLAccess=True?

mtaylor
Observer
Posts: 5
Joined: Thu Jan 16, 2003 5:56 pm

Post by mtaylor »

CheckURLAccess is set to true... Here is my .INI file:

http://bluebeetle.org/PWB.ini

Alachua
Observer
Posts: 7
Joined: Tue Jun 24, 2003 8:43 am
Location: Gainesville, Fl

Post by Alachua »

mtaylor wrote: I checked my URL.txt file and can't see why it's allowing people to get to Google. Could you take a look at it and see what's up? Thanks.

http://bluebeetle.org/URL.txt

I think if you add

-google

this would stop anyone from getting there.

Bob. 8)

mtaylor
Observer
Posts: 5
Joined: Thu Jan 16, 2003 5:56 pm

Post by mtaylor »

I've tried that... they can still get to the google.com home page. If they try to do a search it blocks them. I'd like it to not let them get to google.com at all.

spragers
Benefactor
Posts: 153
Joined: Fri Dec 27, 2002 9:11 am

Post by spragers »

First, try fixing these lines:

SearchPage=http://http://catalog.ppld.org
WebMailPage=http://http://catalog.ppld.org

I seem to recall that the default setting for the search page is www.google.com. Maybe (and I could be WAY off here) they are clicking the search button, and since the doubled http:// makes the value invalid, PWB is falling back to www.google.com for the search page?
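
If you want to check the rest of the .INI file for the same mistake, a small script can do it. This is only a sketch: the function name `find_doubled_schemes` is made up for the example, and it assumes the file uses plain key=value lines with `;` comments and `[Section]` headers.

```python
import re

# Matches a value that begins with two or more URL schemes in a row,
# e.g. "http://http://catalog.ppld.org".
DOUBLED_SCHEME = re.compile(r"^(https?://){2,}", re.IGNORECASE)


def find_doubled_schemes(ini_text):
    """Return (key, value, suggested_fix) for each value with a repeated scheme."""
    problems = []
    for line in ini_text.splitlines():
        stripped = line.strip()
        # Skip section headers, comments, and lines without a key=value pair.
        if "=" not in stripped or stripped.startswith((";", "[")):
            continue
        key, value = stripped.split("=", 1)
        value = value.strip()
        match = DOUBLED_SCHEME.match(value)
        if match:
            # Keep one scheme and everything after the repeated prefix.
            fixed = match.group(1) + value[match.end():]
            problems.append((key.strip(), value, fixed))
    return problems


sample = (
    "SearchPage=http://http://catalog.ppld.org\n"
    "WebMailPage=http://http://catalog.ppld.org\n"
    "CheckURLAccess=True\n"
)
for key, value, fixed in find_doubled_schemes(sample):
    print(f"{key}: {value} -> {fixed}")
```

The script only reports the suspect lines and a suggested correction; it does not modify the file, so you can review each change before editing the .INI by hand.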

spragers
Benefactor
Posts: 153
Joined: Fri Dec 27, 2002 9:11 am

Post by spragers »

Also, I believe you should have those IP addresses in the IP.TXT file, not in the URL.TXT file.
