> pictools > doc > urlHandler
Pictools: How Things Work
V3.3 (435)
Handles requests for missing pages; strategies include the badlinks map and servepicture
Usage: in .htaccess, unsuccessful requests are redirected to this script

Arriving URL requests are processed by the server. Pictools uses mod_rewrite, but I ran into some deficiencies, so requests that do not map immediately to a page are routed to urlHandler.php. The original URL is fetched via $sp->request_uri(). The goal of the analysis is to choose one of these outcomes:
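The .htaccess routing described above might look roughly like this. This is an illustrative sketch, not the rules actually shipped with Pictools; only the destination script name comes from this document:

```apache
# Hypothetical sketch: route any request that does not resolve to an
# existing file or directory to urlHandler.php for analysis.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ /urlHandler.php [L]
# urlHandler.php then recovers the original request via $sp->request_uri()
```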

  • serve an appropriate page
  • reply with a permanent redirect so that the browser will no longer request the initial URL (the mapping table is urlmap.sq3)
  • fail via forPhyspics.failpage
  • redirect to tarpit.php, a black hole; this is implemented as a SUCCEED with a revised URL of /tarpit.php
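These outcomes can be modeled as a small enumeration. The following is an illustrative Python sketch (the actual handler is PHP, and the type names here are invented for clarity):

```python
from dataclasses import dataclass
from enum import Enum, auto

class Outcome(Enum):
    SERVE = auto()     # serve an appropriate page (SUCCEED)
    REDIRECT = auto()  # permanent redirect; mapping table is urlmap.sq3
    FAIL = auto()      # fail via forPhyspics.failpage
    # TARPIT is not a separate outcome: it is a SERVE of /tarpit.php

@dataclass
class Result:
    outcome: Outcome
    url: str           # the (possibly revised) URL to serve or redirect to

def tarpit() -> Result:
    """The tarpit 'outcome' is a SUCCEED with a revised URL."""
    return Result(Outcome.SERVE, "/tarpit.php")
```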

As you add items to the site, you may find that some names are inaccessible. This happens for names listed in /admin/urlmap.xslx and for a few other names listed in urlHandler.php. The map or urlHandler must sometimes be revised to solve such problems.

These strategies are tried, in order:

  • rewrite "Cincinatti" to "Cincinnati" in current URL
  • redirect requests for /admin/ to TARPIT
  • send requests ending with xmlrpc to TARPIT
  • parse the URL into dir, file, and extension; if the parse fails, FAIL
  • if there is no file, set file to index.php (or an alternative, if one exists)
  • rewrite localcaptions.cap to index.php (for a directory built from a captions file segment)
  • if the current URL refers to an existing file, SUCCEED to it
  • if the URL, or some initial string of it, is in the remap database, set the current URL to the mapped value from the database (many malicious strings map to TARPIT). The database test increments a count of how often each entry is selected.
  • (at this point the code makes several tests to repair requests for pages that have moved. These are commented out for distribution.)
  • if the current URL corresponds to an existing file, SUCCEED to it
  • if the request is for an image in a segment directory, find the captions file and SUCCEED via servepicture.php
  • if the database check found a match, REDIRECT with the resulting url
  • if the request is for xxx/index.php, convert it to xxx/
  • FAIL

From time to time (daily), URL requests arrive for pages that do not exist. These may be from

  • user typos
  • web crawlers following obsolete links to files that have moved or vanished
  • buggy web crawlers
  • webscum seeking vulnerabilities

The known-to-be-evil requests are forwarded to the tarpit and ignored. Others are handled with box404 and logged for the administrator. The log is displayed when the administrator views command.php. When I see such errors, I process them as described in urlmap.

Testing urlHandler

The first column of worksheet 'Tests and Results' in badlinks.xslx is a list of URLs to visit for testing. The rest of the sheet holds predicted and observed results. Copy the first column to the server file /admin/urlTest.txt and then browse to /admin/testUrlHandler.php. The output will be the three columns seen in worksheet 'Tests and Results'. I generally compare the old and new results with cell formulas of the form =X1=Y1.

In urlTest.txt, blank lines are ignored and lines beginning with a hash are special. A #! line terminates the test. A ## line is ignored. Other hash lines are displayed in the output.
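The urlTest.txt line rules amount to a few lines of code. An illustrative Python sketch (the real harness is testUrlHandler.php, and the function name here is invented):

```python
def select_test_urls(lines):
    """Apply the urlTest.txt line rules: blank lines are skipped,
    '#!' ends the test, '##' lines are dropped, other hash lines
    pass through for display, and everything else is a URL to visit."""
    urls, display = [], []
    for raw in lines:
        line = raw.strip()
        if line == "":
            continue                  # blank lines are ignored
        if line.startswith("#!"):
            break                     # '#!' terminates the test
        if line.startswith("##"):
            continue                  # '##' lines are ignored
        if line.startswith("#"):
            display.append(line)      # other hash lines are displayed
            continue
        urls.append(line)             # a URL to test
    return urls, display
```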

Copyright © 2016 ZweiBieren, All rights reserved. Jul 17, 2016 18:07 GMT Page maintained by ZweiBieren