Creating a Policy Reference File

P3P uses a file called a policy reference file to tell clients what policies are in use on a site, and what parts of the site each policy covers. In essence, a policy reference file is a map of the P3P policies on that site. The IBM P3P Policy Editor contains a wizard which steps you through the process of creating a policy reference file.

Some Web sites will use just one P3P policy to cover the entire site. In this case, the policy reference file is quite simple. You'll need to give the URL where that P3P policy is published on the Internet, and then the reference file generator will do the rest of the work. It will create a reference file which gives the location of the P3P policy, and then state that this policy covers all URLs and all cookies from the site.
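As a rough illustration of what such a single-policy reference file can look like, the sketch below builds one with Python's standard library. The element and attribute names follow the W3C P3P 1.0 reference file format; the function name, the policy URL, and the use of /* as the "everything" pattern are assumptions made for this example, and the output is not necessarily what the IBM P3P Policy Editor itself produces.

```python
import xml.etree.ElementTree as ET

# Namespace used by P3P 1.0 policy reference files.
P3P_NS = "http://www.w3.org/2002/01/P3Pv1"


def single_policy_reference(policy_url: str) -> str:
    """Build a reference file in which one policy covers all URLs and cookies.

    Hypothetical helper for illustration only.
    """
    meta = ET.Element("META", {"xmlns": P3P_NS})
    refs = ET.SubElement(meta, "POLICY-REFERENCES")
    ref = ET.SubElement(refs, "POLICY-REF", {"about": policy_url})
    # Cover every URL on the site.
    ET.SubElement(ref, "INCLUDE").text = "/*"
    # Cover every cookie, whatever its name, value, domain, or path.
    ET.SubElement(
        ref,
        "COOKIE-INCLUDE",
        {"name": "*", "value": "*", "domain": "*", "path": "*"},
    )
    return ET.tostring(meta, encoding="unicode")


# The policy URL below is a made-up example location.
print(single_policy_reference("http://www.example.com/privacy/p3p/policy1.xml"))
```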

Other times, a site will need to use more than one policy to cover the site. In this case, the policy reference file will contain a list of the P3P policies which are used by the site. For each P3P policy, you'll need to enter the information described in the sections below.

URL Patterns in Policy Reference Files

By default, a policy in a policy reference file does not apply to any URLs on the site. You must specify the URLs on the site which it covers. To do this, add each URL on the site which you wish to cover. The policy reference file should contain URLs relative to the root of the site; thus to cover the homepage (index.html) of www.example.com, the policy reference file would say that it covers /index.html.

It's obviously impractical to list every URL on the site in a policy reference file. Therefore, P3P lets policy reference files contain URL patterns. A URL pattern contains one or more wildcards, represented by * characters. A * character in a URL pattern stands for any sequence of 0 or more characters.

One common way to use wildcards in policy reference files is to indicate subtrees of the site to cover. For example, a policy might cover all content in the /servlet and /cgi-bin directory trees. This could be done by putting /servlet* and /cgi-bin* in the list of covered URLs for that policy.
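The wildcard rule described above can be sketched in a few lines of Python. This is only a simplified model of how a client might match a URL against a pattern: the function name url_matches is hypothetical, and the sketch ignores details such as URL escaping that a real P3P client would have to handle.

```python
import re


def url_matches(pattern: str, url: str) -> bool:
    """Return True if url matches pattern, where '*' means zero or more characters."""
    # Escape the literal pieces of the pattern, then join them with '.*'
    # so that each '*' in the pattern can stand for any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, url, flags=re.DOTALL) is not None


print(url_matches("/index.html", "/index.html"))     # exact match, no wildcard
print(url_matches("/servlet*", "/servlet/Login"))    # inside the /servlet tree
print(url_matches("/servlet*", "/cgi-bin/search"))   # outside the /servlet tree
```

Note that under this rule /servlet* also matches a URL such as /servlet-old/page, because the * can stand for any characters, not just a path below /servlet.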

URL Include and Exclude Lists

For each policy, a policy reference file contains a list of URL patterns which the policy covers (known as the include list), as well as a list of URL patterns which the policy does not cover (known as the exclude list). A policy covers every URL which matches its include list, except for those which also match its exclude list.

This can be very handy when using wildcards. For example, consider the case where a certain policy is to cover everything in the /servlet directory tree except for /servlet/ExcludeMe. This can be done by entering /servlet* in the URL include list, and entering /servlet/ExcludeMe in the exclude list.
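The interaction of the two lists can be modeled as follows. This is a minimal sketch, not a description of any particular client: the function names are hypothetical, and the wildcard matching is the simplified "* means zero or more characters" rule.

```python
import re


def url_matches(pattern: str, url: str) -> bool:
    """Return True if url matches pattern, where '*' means zero or more characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, url, flags=re.DOTALL) is not None


def policy_covers(include: list, exclude: list, url: str) -> bool:
    """A policy covers a URL that matches the include list but not the exclude list."""
    included = any(url_matches(p, url) for p in include)
    excluded = any(url_matches(p, url) for p in exclude)
    return included and not excluded


include = ["/servlet*"]
exclude = ["/servlet/ExcludeMe"]

print(policy_covers(include, exclude, "/servlet/Login"))      # covered
print(policy_covers(include, exclude, "/servlet/ExcludeMe"))  # explicitly excluded
print(policy_covers(include, exclude, "/index.html"))         # never included
```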

Cookies in Policy Reference Files

A policy reference file also states which of the site's cookies are covered by each policy. As with URLs, a policy does not cover any cookies unless they are listed as being covered in the policy reference file. Similarly to URLs, there is a cookie-include list and a cookie-exclude list for each policy. The policy covers all of the cookies in the cookie-include list except those in the cookie-exclude list.

While URLs are covered simply by the URL name (or a wildcard which is matched against the URL name), cookies are more complex. A cookie has four important components. All four of these components are chosen by the application which sets the cookie:

  1. The cookie's name. When a server sets a cookie, it assigns a name to the cookie. Sometimes cookie names are easy mnemonics (like VISITORID), while other times they are not very meaningful (like YX). The application which is using the cookie will look up the cookie by its name.
  2. The cookie's value. The cookie's value is the information that the Web application has stored in the cookie. This can be a numeric value, such as "1234567", or a string, such as "OPT_OUT".
  3. The cookie's domain. This is the set of Web servers, given by their hostnames, that the cookie will be sent to. Generally, this is either the name of the server which set the cookie (like www.example.com), or else the domain of the server (like .example.com, which means "all servers whose names end in .example.com").
  4. The cookie's path. While this field is not often used, the Web server can indicate that the cookie should only be sent back to a certain tree of URLs on the site. This is given by a URL prefix: the cookie will be sent to all URLs which start with the specified string.

When cookies are being added to the include or exclude list for a policy, values for any of these four fields can be given. There is an implied "and" across the fields selected: a cookie matches an entry only if it matches every field given in that entry. For example, if the policy reference file says that policy /privacy/p3p/policy2.xml covers cookies named Foo with a domain of server2.example.com, then that policy only covers cookies named Foo which will be sent back to server2.example.com.

Since the cookie include and exclude lists are lists, they can have as many entries as needed. This is useful, for example, if you wish to cover cookies named Cookie1 and cookies named Cookie2.
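The implied "and" across fields, combined with the "or" across list entries, can be sketched like this. The dictionary representation of cookies and list entries, the function names, and the use of exact string comparison are all simplifying assumptions for the example; a real client may apply wildcard and domain-matching rules that are not modeled here.

```python
def entry_matches(entry: dict, cookie: dict) -> bool:
    """An entry matches only if every field it specifies equals the cookie's field."""
    # Fields that the entry leaves out place no restriction on the cookie.
    return all(cookie.get(field) == wanted for field, wanted in entry.items())


def policy_covers_cookie(include: list, exclude: list, cookie: dict) -> bool:
    """Covered if any include entry matches and no exclude entry matches."""
    included = any(entry_matches(e, cookie) for e in include)
    excluded = any(entry_matches(e, cookie) for e in exclude)
    return included and not excluded


# One entry with two fields: name AND domain must both match.
include = [{"name": "Foo", "domain": "server2.example.com"}]

foo_on_server2 = {"name": "Foo", "value": "1234567",
                  "domain": "server2.example.com", "path": "/"}
foo_on_server1 = {"name": "Foo", "value": "1234567",
                  "domain": "server1.example.com", "path": "/"}

print(policy_covers_cookie(include, [], foo_on_server2))
print(policy_covers_cookie(include, [], foo_on_server1))

# Two entries: a cookie named Cookie1 OR a cookie named Cookie2 is covered.
two_names = [{"name": "Cookie1"}, {"name": "Cookie2"}]
print(policy_covers_cookie(two_names, [], {"name": "Cookie2", "value": "x",
                                           "domain": "www.example.com", "path": "/"}))
```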

Order in Policy Reference Files

Clients process policy reference files in top-to-bottom order. When they find a match for a specific URL or cookie, they stop processing. Thus if a site will be using multiple policies, then you should place the most-specific policies first in the policy reference file, and the more general ones last. For example, consider a site using three policies:

  1. /privacy/p3p/policy1.p3p covers all URLs in the /servlet tree
  2. /privacy/p3p/policy2.p3p covers all cookies named VISITORID
  3. /privacy/p3p/policy3.p3p covers all other URLs and cookies

If these policies are entered in this order in the policy reference file, then they will work as expected. However, if /privacy/p3p/policy3.p3p is listed first, clients would stop at that policy for every URL and cookie and never reach the other two, so it would apply to the entire site (unless it explicitly excludes the items the other two policies cover).
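The effect of ordering can be seen in a small first-match sketch. It models URLs only and leaves out the cookie policy for brevity; the function names and the /* pattern standing for "all other URLs" are assumptions made for this example.

```python
import re


def url_matches(pattern: str, url: str) -> bool:
    """Return True if url matches pattern, where '*' means zero or more characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, url, flags=re.DOTALL) is not None


def first_matching_policy(references: list, url: str):
    """Walk the references top to bottom and stop at the first policy that matches."""
    for policy, patterns in references:
        if any(url_matches(p, url) for p in patterns):
            return policy
    return None  # no policy in the file covers this URL


# Most-specific policy first, general catch-all last.
specific_first = [
    ("/privacy/p3p/policy1.p3p", ["/servlet*"]),
    ("/privacy/p3p/policy3.p3p", ["/*"]),
]
# Same entries with the catch-all listed first.
general_first = list(reversed(specific_first))

print(first_matching_policy(specific_first, "/servlet/Login"))
print(first_matching_policy(general_first, "/servlet/Login"))
```

With the specific policy first, /servlet/Login resolves to policy1.p3p. With the catch-all first, the same URL resolves to policy3.p3p and policy1.p3p is never consulted.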

Additional help

More information about policy reference files, including guidance on how they are used on a site, can be obtained from the W3C's P3P deployment guide, which is available on the Web at http://www.w3.org/TR/p3pdeployment.