OpenBSD-4.6/gnu/usr.bin/lynx/samples/cernrules.txt

# This files contains examples and an explanation for the RULESFILE / RULE
# feature.
#
# Rules for Lynx are experimental.  They provide a rudimentary capability
# for URL rejection and substitution based on string matching.
# Most users and most installations will not need this feature, it is here
# in case you find it useful.  Note that this may change or go away in
# future releases of Lynx; if you find it useful, consider describing your
# use of it in a message to <lynx-dev@nongnu.org>.
#
# Syntax:
# =======
# Summary of common forms:
#
#   Fail           URL1
#   Map            URL1  URL2      [CONDITION]
#   Pass           URL1  [URL2]    [CONDITION]
#   Redirect       URL1  URL2      [CONDITION]
#   RedirectPerm   URL1  URL2      [CONDITION]
#   UseProxy       URL1  PROXYURL  [CONDITION]
#   UseProxy       URL1  "none"    [CONDITION]
#
#   Alert          URL1  MESSAGE   [CONDITION]
#   AlwaysAlert    URL1  MESSAGE   [CONDITION]
#   UserMsg        URL1  MESSAGE   [CONDITION]
#   InfoMsg        URL1  MESSAGE   [CONDITION]
#   Progress       URL1  MESSAGE   [CONDITION]
#
# As you may have guessed, comments are introduced by a '#' character.
# Rules have the general form
#   Operator  Operand1  [Operand2]  [CONDITION]
# with words separated by whitespace.  Words containing space can be quoted
# with "double quotes".  Although normally this should not be necessary
# necessary for URLs, it has to be used for MESSAGE Operands in Alert etc.
# See below for an explanation of the optional CONDITION.
#
# Recognized operators are
#
#   Fail  URL1
# Reject access to this URL, stop processing further rules.
#
#   Map   URL1  URL2
# Change the current URL to URL2, then continue processing.
#
#   Pass  URL1  [URL2]
# Accept this URL and stop processing further rules; if URL2
# is given, apply this as the last mapping.
# See the next item for reasons why you generally don't want to "pass"
# a changed URL.
#
#   RedirectTemp       URL1  URL2
#   RedirectPerm       URL1  URL2
#   Redirect [STATUS]  URL1  URL2
# Stop processing further rules and redirect to URL2, just as if lynx had
# received a HTTP redirection with URL2 as the new location.  This means that
# URL2 is subject to any applicable permission checking, if it passes a new
# request will be issued (which may result in a new round of rules checking,
# with a new "current URL") or the new URL might be taken from the cache, and,
# after successful loading, lynx's idea of what the loaded document's URL is
# will be fully updated.  All this does not happen if you just "pass" a changed
# URL (or let it fall through), so this is generally the preferred way for
# substituting URLs. 
# If the RedirectPerm variant is used, or if the optional word is supplied and
# is either "permanent" or "301", act as if lynx had received a permanent
# redirection (with HTTP status 301).  In most cases this will not make a
# noticeable difference.  Lynx may cache the location in a special way for 301
# redirections, so that the redirection is followed immediately the next time
# the same original URL is accessed, without re-checking of rules.  Therefore
# the permanent variant should never be used if the desired outcome of rules
# processing depends on variable conditions (see CONDITIONS below) or on
# setting a special flag (see next item).
#
#   PermitRedirection  URL1
# Mark following redirection as permitted, and continue processing.  Some
# redirection locations are normally not allowed, because permitting them in a
# response from an arbitrary remote server would open a security hole, and
# others are not allowed if certain restrictions options are in effect.  Among
# redirection locations normally always forbidden are lynxprog:  and lynxexec: 
# schemes.  With "default" anonymous restrictions in effect, many URL schemes
# are disallowed if the user would not be allowed to use them with 'g'oto. 
# This rule allows to override the permission checking if rules processing ends
# with a Redirect (including the RedirectPerm or RedirectTemp forms).  It is
# ignored otherwise, in particular, it does not influence acceptance if rules
# processing ends with a "Pass" and a real redirection is received in the
# subsequent HTTP request.  If redirections are chained, it only applies to the
# redirection that ends the same rules cycle.  Note that the new URL is still
# subject to other permission checks that are not specific to redirections; but
# using this rule may still weaken the expected effect of -anonymous,
# -validate, -realm, and other restriction options, including TRUSTED_EXEC and
# similar in lynx.cfg, so be careful where you redirect to if restrictions are
# important!
#
#   UseProxy  URL1  PROXYURL
# Stop processing further rules, and force access through the proxy given by
# PROXYURL.  PROXYURL should have the same form as required for foo_proxy
# environment variables and lynx.cfg options, i.e., (unless you are trying to
# do something unusual) "http://some.proxy-server.dom:port/".  This rule
# overrides any use of a proxy (or external gateway) that might otherwise apply
# because of environment variables or lynx.cfg options, it also overrides any
# "no_proxy" settings.
#
#   UseProxy  URL1  none
# Mark request as NOT using any proxy (or external gateway), and continue
# processing(!).  For a request marked this way, any subsequent UseProxy
# rule with a PROXYURL will be ignored, and any use of a proxy (or external
# gateway) that might otherwise apply because of environment variables or
# lynx.cfg options will be overridden.  Note that the marking will not
# survive a Redirect rule (since that will result, if successful, in a
# new request).
#
#   Alert         URL1  MESSAGE
#   AlwaysAlert   URL1  MESSAGE
#   UserMsg       URL1  MESSAGE
#   InfoMsg       URL1  MESSAGE
#   Progress      URL1  MESSAGE
# These produce various kinds of statusline messages, differing in whether
# a pause is enforced and in its duration, immediately when the rule is
# applied.  AlwaysAlert shows the message text even in non-interactive mode
# (-dump, -source, etc.).  Rule processing continues after the message is
# shown.  As usual, these rules only apply if URL1 matches.  MESSAGE is
# the text to be displayed, it can contain one occurrence of "%s" which
# will be replaced by the current URL, literal '%' characters should be
# doubled as "%%".
#
# Rules are processed sequentially first to last for each request, a rule
# applies if the current URL matches URL1.  The current URL is initally the
# URL for the resource the user is trying to access, but may change as the
# result of applied Map rules.  case-sensitive (!) string comparison is used,
# in addition URL1 can contain one '*' which is interpreted as a wildcard
# matching 0 or more characters.  So if for example
# "http://example.com/dir/doc.html" is requested, it would match any of
# the following:
#   Pass  http:*
#   Pass  http://example.com/*.html
#   Pass  http://example.com/*
#   Pass  http://example*
#   Pass  http://*/doc.html
# but not:
#   Pass  http://example/*
#   Pass  http://Example.COM/dir/doc.html
#   Pass  http://Example.COM/*
#
# If a URL2 is given and also contains a '*', that character will be
# replaced by whatever matched in URL1.  Processing stops with the
# first matching "Fail" or "Pass" or when the end of the rules is reached.
# If the end is reached without a "Fail" or "Pass", the URL is allowed
# (equivalent to a final "Pass *").
#
# The requested URL will have been transformed to Lynx's normal
# representation.  This means that local file resources should be
# expected in the form "file://localhost/<path using slash separators>",
# not in the machine's native representation for filenames.
#
# Anyone with experience configuring the venerable CERN httpd server will
# recognize some of the syntax - in fact, the code implementing rules goes
# back to a common ancestor.  But note the differences: all URLs and URL-
# patterns here have to be given as absolute URLs, even for local files.
# (Absolute URLs don't imply proxying.)
#
# CONDITIONS
# ----------
# All rules mentioned can be followed by an optional CONDITION, which can
# be used to further restrict when the rule should be applied (in addition
# to the match on URL1).  A CONDITION takes one of the forms
#   "if"     CONDITIONFLAG
#   "unless" CONDITIONFLAG
# and currently two condition flags are recognized:
#   "userspecified"   (or abbreviated "userspec")
#   "redirected"
# To explain these, first some terms need to be defined.  A "request"
# is...
# 
# A user action (like following a link, or entering a 'g'oto URL) can either be
# rejected immediately (for example, because of restrictions in effect, or
# because of invalid input), or can generate a "request".  For the purpose of
# this discussion, a "request" is the sequence of processing done by lynx,
# which might ultimately lead to an actual network request and loading and
# display of data; a request can also result in rejection (for example, some
# restrictions are checked at this stage), or in a redirection.  A redirection
# in turn can be rejected (which makes the request fail), or can automatically
# generate a new request.  A "request chain" is the sequence of one or more
# requests triggered by the same user event that are chained together by
# redirections.
# For each request, some URL schemes are handled (or rejected) specially, see
# Limitation 1 below, the others are passed to the generic access code.  Rules
# processing occurs at the beginning of the generic access code, before a
# request is dispatched to the scheme-specific protocol module (but after
# checking whether the request can be satisfied by re-displaying an already
# cached document).
# With these definitions, the meaning of the possible CONDITIONFLAGS:
# 
#   if redirected
# The rule applies if the current request results from a redirection;
# whether that was a real HTTP redirection or one generated by a rule
# in the previous request makes no difference.  In other words, the
# condition is true if the current request is not the first one in the
# request chain.
#
#   if userspecified
# The rule applies if the initial URL of the request chain was specified
# by the user.  Lynx marks a request as "user specified" for URLs that
# come from 'g'oto prompts, as well as for following links in a bookmark
# or Jump file and some other special (lynx-generated) pages that may
# contain URLs that were typed in by the user.
# Note that this is not a property of the request, but of the whole request
# chain (based on where the first request's URL came from).  The current
# URL may differ from what the user typed
# - because of initial fixups, including conversion of Guess-URLs and file
#   paths to full URLs,
# - because of Map rules applied, and/or
# - because of a previous redirection.
# So to make reasonably sure a suspicious or potentially dangerous URL has
# been entered by the user, i.e. is not a link or external redirection
# location that cannot be trusted, a combination of "userspecified" and
# "redirected" flags should be used, for example
#   Fail URL1 unless userspecified
#   Fail URL1 if redirected
#   ...
#
# CAVEAT
# ======
# First, to squash any false expectations, an example for what NOT TO DO.
# It might be expected that a rule like
#   Fail  file://localhost/etc/passwd		# <- DON'T RELY ON THIS
# could be used to prevent access to the file "/etc/passwd".  This might
# fool a naive user, but the more sophisticated user could still gain
# access, by experimenting with other forms like (@@@ untested)
# "file://<machine's domain name>/etc/passwd" or "/etc//passwd"
# or "/etc/p%61asswd" or "/etc/passwd?" or "/etc/passwd#X" and so on.
# There are many URL forms for accessing the same resource, and Lynx
# just doesn't guarantee that URLs for the same resource will look the
# same way.
#
# The same reservation applies to any attempts to block access to unwanted
# sites and so on.  This isn't the right place for implementing it.
# (Lynx has a number of mechanisms documented elsewhere to restrict access,
# see the INSTALLATION file, lynx.cfg, lynx -help, lynx -restrictions.)
#
# Some more useful applications:
#
# 1. Disabling URLs by access scheme
# ----------------------------------
#   Fail  gopher:*
#   Fail  finger:*
#   Fail  lynxcgi:*
#   Fail  LYNXIMGMAP:*
# This should work (but no guarantees) because Lynx canonicalizes
# the case of recognized access schemes and does not interpret
# %-escaping in the scheme part (@@@ always?)
#
# Note that for many access schemes Lynx already has mechanisms to
# restrict access (see lynx.cfg, -help, -restrictions, etc.), others
# have to be specifically enabled.  Those mechanisms should be used
# in preference.
# Note especially Limitation 1 below.
# This can be used for the remaining cases, or in addition by the
# more paranoid.  Note that disabling "file:*" will also make many
# of the special pages generated by lynx as temporary files (INFO,
# history, ...) inaccessible, on the other hand it doesn't prevent
# _writing_ of various temp files - probably not what you want.
#
# You could also direct access for a scheme to a brief text explaining
# why it's not available:
#   Redirect news:*   http://localhost/texts/newsserver-is-broken.html
#
# 2. Preventing accidental access
# -------------------------------
# If there is a page or site you don't want to access for whatever
# reason (say there's a link to it that crashes Lynx [don't forget to
# report a bug], or if that starts sending you a 5 Mb file you don't
# want, or you just don't like the people...), you can prevent yourself
# from accidentally accessing it:
#    Fail  http://bad.site.com/*
#
# 3. Compressed files
# -------------------
# You have downloaded a bunch of HTML documents, and compressed them
# to save space.  Then you discover that links between the files don't
# work, because they all use the names of the uncompressed files.  The
# following kind of rule will alow you to navigate, invisibly accessing
# the compressed files:
#   Map file://localhost/somedir/*.html file://localhost/somedir/*.html.gz
# or, perhaps better:
#   Redirect file://localhost/somedir/*.html file://localhost/somedir/*.html.gz
#
# 4. Use local copies
# -------------------
# You have downloaded a tree of HTML documents, but there are many links
# between them that still point to the remote location.  You want to access
# the local copies instead, after all that's why you downloaded them.  You
# could start editing the HTML, but the following might be simpler:
#  Map http://remote.com/docs/*.html file://localhost/home/me/docs/*.html
# Or even combine this with compressing the files:
#  Map http://remote.com/docs/*.html file://localhost/home/me/docs/*.html.gz
#
# Again, replacing the "Map" with "Redirect" is probably better - it will
# allow you to see the _real_ location on the lynx INFO screen or in the
# HISTORY list, will avoid duplicates in the cache if the same document is
# loaded with two different URLs, and may allow you to 'e'dit the local
# from within lynx if you feel like it.
#
# 5. Broken links etc.
# --------------------
# A user has moved from http://www.siteA.com/~jdoe to http://siteB.org/john,
# or http://www.provider.com/company/ has moved to their own server
# http://www.company.com, but there are still links to the old location
# all over the place; they now are broken or lead to a stupid "this page
# has moved, please update your bookmarks. Refresh in 5 seconds" page
# which you're tired of seeing.  This will not fix your bookmarks, and
# it will let you see the outdated URLs for longer (Limitation 3 below),
# but for a quick fix:
#   Redirect   http://www.siteA.com/~jdoe/*      http://siteB.org/john/*
#   Redirect   http://www.provider.com/company/* http://www.company.com/*
#
# You could use "Map" instead of "Redirect", but this would let you see the
# outdated URLs for longer and even bookmark them, and you are likely to
# create invalid links if not all documents from a site are mapped
# (Limitation 3).
#
# 6. DNS troubles
# ---------------
# A special case of broken links.  If a site is inaccessible because the
# name cannot be resolved (your or their name server is broken, or the
# name registry once again made a mistake, or they really didn't pay in
# time...) but you still somehow know the address; or if name lookups are
# just too slow:
#   Map   http://www.somesite.com/*  http://10.1.2.3/*
# (You could do the equivalent more cleanly by adding an entry to the hosts
# file, if you have access to it.)
#
# Or, if a name resolves to several addresses of which one is down, and the
# DNS hasn't caught up:
#   Map   http://www.w3.org/*    http://www12.w3.org/*
#
# Note that this can break access to some name-based virtually hosted sites.
#
# In this case use of "Map" is probably preferred over "Redirect", as long
# as the URL on the left side contains the real and preferred hostname or
# the problem is only temporary.
#
# 7. Avoid redirections
# ---------------------
# Some sites have a habit to provide links that don't go to the destination
# directly but always force redirection via some intermediate URL.  The
# delay imposed by this, especially for users with slower connections and
# for overloaded servers, can be avoided if the intermediate URLs always
# follow some simple pattern: we can then anticipate the redirect that will
# inevitably follow and generate it internally.  For example,
#   Redirect http://lwn.net/cgi-bin/vr/*    http://*
#
# Warning: The page authors may not like this circumvention.  Often the
# redirection is wanted by them to track access, sometimes in connection
# with cookies.  Some sites may employ mechanisms that defeat the shortcut.
# It is your responsibility to decide whether use of this feature is
# acceptable.  (But note that the same effect can be achieved anyway for
# any link by editing the URL, e.g. with the ELGOTO ('E') key in Lynx, so
# a shortcut like this does not create some new kind of intrusion.)
#
# 8. Detailed proxy selection
# ---------------------------
# Basic use for this one should be obvious, if you have a need for it.
# It simply allows selecting use (or non-use) of proxies on a more detailed
# level than the traditional <scheme>_proxy and no_proxy variables, as well
# as using different proxies for different sites.
# For example, to request access through an anonymizing proxy for all pages
# on a "suspicious" site:
#   UseProxy  http://suspicious.site/*  http://anonymyzing.proxy.dom/
# (as long as all URLs really have a matching form, not some alternative
# like <http://suspicious.site:80/> or <http://SuSpIcIoUs.site/>!)
#
# To access some site through a local squid proxy, running on the same host
# as lynx, except for some image types (say because you rarely access images
# with lynx anyway, and if you do, you don't want them cached by the proxy):
#   UseProxy  http://some.site/*.gif  none
#   UseProxy  http://some.site/*.jpg  none
#   UseProxy  http://some.site/*      http://localhost:3128/
# Note that order is important here.
#
# To exempt a local address from all proxying:
#   UseProxy  http://local.site/*  none
#
# Note however that for some purposes the "no_proxy" setting may be better
# suited than "UseProxy ... none", because of its different matching logic
# (see comments in lynx.cfg).
#
# 9. Invent your own scheme
# -------------------------
# Suppose you want to teach lynx to handle a completely new URL scheme.
# If what's required for the new scheme is already available in lynx in
# _some_ way, this may be possible with some inventive use of rules.
# As an example, let's assume you want to introduce a simple "man:" scheme
# for showing manual pages, so (for a Unix-like system, at least) "man:lynx"
# would display the same help information as the "man lynx" command and so
# on (we ignore section numbers etc. for simplicity here).
# First, since lynx doesn't know anything about a "man:" scheme, it will
# normally reject any such URLs at an early stage.  However, a trick exists
# to bypass that hurdle: define a man_proxy environment variable *outside of
# lynx, before starting lynx* (it won't work in lynx.cfg), the actual value
# is unimportant and won't actually be used.  For example, in your shell:
#   export man_proxy=X
#
# If you already have some kind of HTTP-accessible man gateway available,
# the task then probably just amounts to transforming the URL into the right
# form.  For one such gateway (in this case, a CGI script running on the
# local machine), the rule
#   Redirect man:* http://localhost/cgi-bin/dwww?type=runman&location=*/
# or, alternatively,
#   UseProxy man:* none
#   Map      man:* http://localhost/cgi-bin/dwww?type=runman&location=*/
# does it, for other setups the right-hand side just has to be modified
# appropriately.  The "UseProxy" is to make sure the bogus man_proxy gets
# ignored.
#
# If no CGI-like access is available, you might want to invoke your system's
# man command directly for a man: URL.  Here is some discussion of how this
# could be done, and why ultimately you may not want to do it; this is also
# an opportunity to show examples for how some of the rules and conditions
# can be used that haven't been discussed in detail elsewhere.
# Lynx provides the lynxexec: (and the similar lynxprog:) scheme for running
# (nearly) arbitrary commands locally.  At the heart of employing it for
# man: would be a rule like this:
#   Redirect          man:*  "lynxexec:/usr/bin/man *"
# (It is a peculiarity of this scheme that the literal space and quoting
# are necessary here.  Also note that Map cannot be used here instead of
# Redirect, since lynxexec, as a special kind of URL, needs to be handled
# "early" in a request.)
# Of course, execution of arbitrary commands is a potentially dangerous
# thing.  lynxexec has to be specifically enabled at compile time and in
# lynx.cfg (or with command line options), and there are various levels
# of control, too much to go into here.  It is assumed in the following that
# lynxexec has been enabled to the degree necessary (allow /usr/bin/man
# execution) but hopefully not too much.
# What needs to be prevented is that allowing local execution of the man
# command might unintentionally open up unwanted execution of other commands,
# possibly by some trick that could be exploited.  For example, redirecting
# man:* as above, the URL "man:lynx;rm -r *" could result in the command
# "man lynx;rm -r *" executed by the system, with obvious disastrous results.
# (This particular example won't actually work, for several reasons; but
# for the purpose of discussion let's assume it did, there may be similar
# ones that do.)
# Because of such dangers, redirection to a lynxexec: is normally never
# accepted by lynx.  We need at least a PermitRedirection rule to override
# this protective limitation:
#   PermitRedirection man:*
#   Redirect          man:*  "lynxexec:/usr/bin/man *"
# But now we have potentially opened up local execution more than is
# acceptable via the man: scheme, so this needs to be examined.
# There are two aspects to security here: (1) restricting the user, and (2)
# protecting the user.  The first could also be phrased as protecting the
# system from the user; the second as preventing lynx (and the system) from
# doing things the user doesn't really want.  Aspect (1) is very important
# for setups providing anonymous guest accounts and similarly restricted
# environments.  (Otherwise shell access is normally allowed, and trying to
# protect the system in lynx would be rather pointless.)  As far as access
# to some URLs is concerned, the difference can be characterized in terms of
# which sources  of URLs are trusted enough to allow access: for (1), only
# links occurring in a limited number of documents are trusted enough for
# some (or all) URLs, user input at 'g'oto prompts and the like is not (if
# not completely disabled).  For (2) and assuming a user with normal shell
# privileges, the user may be trusted enough to accept any URL explicitly
# entered, but URLs from arbitrary external sources are not - someone might
# try to use them to trick the user (by following an innocent-looking link)
# or lynx (by following a redirection) into doing something undesirable.
#
# In the following we are concerned with (2); it is assumed that providers
# of anonymous accounts would not want to follow this path, and would have
# no need for additional schemes that imply local execution anyway.  (For
# one thing, with the man example they would have to carefully check that
# users cannot break out of the man command to a local shell prompt.)
#
# Getting back to the example, it was already mentioned that lynx does not
# allow redirections to lynxexec.  In fact this continues to be disallowed
# for real redirection received from HTTP servers.  But we have introduced
# a new man: scheme, and the lynx code that does the redirection checking
# doesn't know anything about special considerations for man: URLs, so
# an external HTTP server might send a redirection message with "Location:
# man:<something>", which lynx would allow, and which would in turn be
# redirected by our rule to "lynxexec:/usr/bin/man <something>".  Unless
# we are 100% sure that either this can never happen or that the lynxexec
# URL resulting from this can have no harmful effect, this needs to be
# prevented.  It can be done by checking for the "redirected" condition,
# either by putting something like (the first line is of course optional)
#   Alert  man:*  "Redirection to man: not allowed" if redirected
#   Fail   man:*                                    if redirected
# somewhere before the Redirect rule, or, reversing the logic, by adding
# a condition to the redirection rules, i.e. they become
#   PermitRedirection man:*                             unless redirected
#   Redirect          man:*  "lynxexec:/usr/bin/man *"  unless redirected
# (actually, putting the condition on either one of the rules would be
# sufficient).  The second variant assumes that the attempted access to
# man: via redirection will ultimately fail because there is no other way
# to handle such URLs.
#
# The above should take care of rejecting man: URLs from redirections, but
# what about regular links in HTML (like <A HREF="man:...">)?  As long as
# it can be assumed that the user will always inspect each and every link
# before following it, and never follow a link that can have harmful effect,
# no further restrictions are necessary.  But this is a very big assumption,
# unrealistic except perhaps in some single-user setups where the user is
# is identical with the rule writer.  So normally most links have to be
# regarded as suspect, and only URLs entered by the user can be accepted:
#   Alert  man:*  "Redirection to man: not allowed" if redirected
#   Fail   man:*                                    if redirected
#   Alert  man:*  "Link to man: not allowed"        unless userspecified
#   Fail   man:*                                    unless userspecified
#
# With these restrictions we have limited the ways our new man: scheme can
# be used rather severely, to the point where its usefulness is questionable.
# In addition to 'g'oto prompts, it may work in Jump files; also, should
# links to man:<something> appear in HTML text, the user could retype them
# manually or use the ELGOTO ('E') command with some trivial editing (like
# adding a space) to "confirm" the URL.  Even if the precautions outlined
# above are followed: THIS TEXT DOES NOT IMPLY ANY PROMISE THAT, BY FOLLOWING
# THE EXAMPLES, LYNX WILL BE SAFE.  On the other hand, some of the precautions
# *may* not be necessary: it is possible that careful use of TRUSTED_EXEC
# options in lynx.cfg could offer enough protection while making the new
# scheme more useful.
#
# If all this seems a bit too scary, that's intentional; it should be noted
# that these considerations are not in general necessary for "harmless" URL
# schemes, but appropriate for this "extreme" example.  One last remark
# regarding the hypothetical man scheme: instead of implementing it through
# "lynxexec:" or "lynxprog:", it would be somewhat safer to use "lynxcgi:"
# instead if it is supported.  A simple lynxcgi script would have to write
# the man page to stdout (either converted to text/html or as plain text,
# preceded by an appropriate Content-Type header line), and all necessary
# checking for special shell characters would be done within the script -
# lynx does not use the system() function to run the script.
#
# Other Limitations
# =================
# First, see CAVEAT above.  There are other limitations:
#
# 1. Applicable URL schemes
# -------------------------
# Rules processing does not apply to all URL schemes.  Some are
# handled differently from the generic access code, therefore rules
# for such URLs will never be "seen".  This limitation applies at
# least to lynxexec:, lynxprog:, mailto:, LYNXHIST:, LYNXMESSAGES:,
# LYNXCFG:, and LYNXCOMPILEOPTS: URLs.  You shouldn't be tempted
# to try to redirect most of these schemes anyway, but this also
# makes it impossible to disable them with "Fail" rules.
#
# Also, a scheme has to be known to Lynx in order to get as far as
# applying rules - you cannot just define your own new foobar: scheme
# and then map it to something here, but see Application 9, above,
# for a workaround.
#
# 2. No re-checking
# -----------------
# When a URL is mapped to a different one, the new URL is not checked
# again for compliance with most restrictions established by -anonymous,
# -restrictions, lynx.cfg and so on.  This can be regarded as a feature:
# it allows specific exceptions.  Of course it means that users for
# whom any restrictions must be enforced cannot have write access to a
# personal rules file, but that should be obvious anyway!
# This limitation does not applies if "Redirect" is used, in that case
# the new URL will always be re-examined.
#
# 3. Mappings are invisible
# -------------------------
# Changing the URL with "Map" or "Pass" rules will in general not be
# visible to the user, because it happens at a late stage of processing
# a request (similar to directing a request through a proxy).  One
# can think of two kinds of URL for every resource: a "Document URL" as
# the user sees it (on INFO page, history list, status line, etc.), and
# a "physical URL" used for the actual access.  Rules change only the
# physical URL.  This is different from the effect of HTTP redirection.
# Often this is bad, sometimes it may be desirable.
#
# Changing the URL can create broken links if a document has relative URLs,
# since they are taken to be relative to the "Document URL" (if no BASE tag
# is present) when the HTML is parsed.
#
# This limitation does not apply if "Redirect" is used - the new location
# will be visible to the user, and will be used by lynx for resolving
# relative URLs within the document.
#
# 4. Interaction with proxying
# ----------------------------
# Rules processing is done after most other access checks, but before
# proxy (and gateway) settings are examined.  A "Fail" rule works
# as expected, but when the URL has been mapped to a different one,
# the subsequent proxy checking can get confused.  If it decides that
# access is through a proxy or gateway, it will generally use the
# original URL to construct the "physical" URL, effectively overriding
# the mapping rules.  If the mapping is to a different access scheme
# or hostname, proxy checking could also be fooled to use a proxy when
# it shouldn't, to not use one when it should, or (if different proxies
# are used for different schemes) to use the wrong proxy.  So "just
# don't do that"; in some cases setting the no_proxy variable will help.
# Example 3 happens to work nicely if there is a http_proxy but no
# ftp_proxy.
#
# This limitation does not come into play if a "UseProxy" rule is applied,
# in either of its two forms: with a PROXYURL, proxying is fully under
# the control of the rules author, and with "none", subsequent proxy
# and gateway checking is completely disabled.  It is therefore a good
# idea to combine any "Map" and "Pass" rules that might result in passing
# the changed URL with explicit "UseProxy" rules, if the rules file is
# expected to be used together with proxying; or else always use "Redirect"
# instead of simple passing.
#
# 5. Case-sensitive matching
# --------------------------
# The matching logic is generic string-based.  It doesn't know anything
# about URL syntax, and so it cannot know in which parts of a URL case
# matters and where it doesn't.  As a result, all comparisons are case-
# sensitive.  If (a limited number of) case variations of a URL need
# to be dealt with, several rules can be used instead of one.
# In particular, this makes "UseProxy ... none" in some ways more limited
# than a no_proxy setting.
#
# 6. Redirection differences
# --------------------------
# For some URLs lynx does never check after a request whether a redirection
# occurs; that makes the "Redirect" rule useless for such URLs (in addition
# to those mentioned under limitation 1.).  Some of them are some gopher
# types, telnet: and similar in most situations, newspost: and similar,
# lynxcgi:, and some other private types.  Trying to redirect these will
# make access fail.  You probable don't want to change such URLs anyway,
# but if you feel you must, try using "Map" and "Pass" instead.
#
# The -noredir command line option only applies for real HTTP redirection
# responses, Redirect rules are still applied.  Also for certain other
# command line options (-mime_header, -head) and command keys (HEAD) lynx
# shows the redirection message (or part of it) in case of a real HTTP
# redirection, instead of following the redirection.  Here, too, a Redirect
# rule remains effective (there is no redirection message to show, after all).
#
# 7. URLs required
# ----------------
# Full absolute URLs (modulo possible "*" matching wildcards) are required
# in rules.  Strings like "www.somewhere.com" or "/some/dir/some.file" or
# "www.somewhere.com/some/dir/some.file" are not URLs.  Lynx may accept
# them as user input, as abbreviated forms for URLs; but by the time the
# rules get checked, those have been converted to full URLs, if they can
# be recognized.  This also means that rules cannot influence which strings
# typed at a 'g'oto prompt are recognized for URLs - rules processing kicks
# in later.