<pre class="metadata">
Title: Self-Review Questionnaire: Security and Privacy
Status: ED
TR: https://www.w3.org/TR/security-privacy-questionnaire/
ED: https://w3ctag.github.io/security-questionnaire/
Shortname: security-privacy-questionnaire
Repository: w3ctag/security-questionnaire
Level: None
Editor: Theresa O’Connor, w3cid 40614, Apple Inc. https://apple.com, [email protected]
Editor: Pete Snyder, w3cid 109401, Brave https://brave.com, [email protected]
Former Editor: Jason Novak, Apple Inc., https://apple.com
Former Editor: Lukasz Olejnik, Independent researcher, https://lukaszolejnik.com
Former Editor: Mike West, Google Inc., [email protected]
Group: tag
Markup Shorthands: css no, markdown yes
Local Boilerplate: status yes
Abstract: This document contains a set of questions to be used when
evaluating the security and privacy implications of web platform
technologies.
</pre>
<h2 id="intro">Introduction</h2>
When designing new features for the Web platform,
we must always consider the security and privacy implications of our work.
New Web features should always
maintain or enhance
the overall security and privacy of the Web.
This document contains a set of questions
intended to help <abbr title="specification">spec</abbr> authors
as they think through
the security and privacy implications
of their work.
It also documents mitigation strategies
that spec authors can use to address
security and privacy concerns they encounter as they work on their spec.
This document is itself a work in progress,
and there may be security or privacy concerns
which this document does not (yet) cover.
Please [let us know](https://github.com/w3ctag/security-questionnaire/issues/new)
if you identify a security or privacy concern
this questionnaire should ask about.
<h3 id="howtouse">How To Use The Questionnaire</h3>
Spec authors should work through these questions
early on in the design process,
when things are easier to change.
When privacy and security issues are only found later,
after the feature has shipped,
it's much harder to change the design.
If security or privacy issues are found late,
user agents may need to adopt breaking changes
to protect their users' privacy and security.
These questions should be kept in mind throughout work on any specification.
Spec authors should periodically revisit this questionnaire
to continue to consider the privacy and security implications of
their features, as their design changes over time.
<h3 id=reviews>TAG and PING reviews and this questionnaire</h3>
When authors request
a [review](https://github.com/w3ctag/design-reviews)
from the [Technical Architecture Group (TAG)](https://www.w3.org/2001/tag/),
the TAG asks that authors provide answers
to the questions in this document.
The [Privacy Interest Group (PING)](https://www.w3.org/Privacy/IG/)
also considers answers to these questions
while conducting
[privacy reviews](https://github.com/w3cping/privacy-reviews/issues).
The TAG and PING use this document
to record security and privacy questions
which come up during our reviews.
Working through these questions can save
both spec authors and the people performing design reviews
a lot of time.
To make it easier for anyone requesting a review
to provide their answers to these questions to the reviewers,
we've prepared [a list of these questions in Markdown](https://raw.githubusercontent.com/w3ctag/security-questionnaire/master/questionnaire.markdown).
<h2 id="questions">Questions to Consider</h2>
<h3 class=question id="purpose">
What information might this feature expose to Web sites or other parties,
and for what purposes is that exposure necessary?
</h3>
User Agents should only expose information to the web
when doing so is necessary to serve a clear user need.
Does your feature expose information to origins?
If so, how does exposing this information serve user needs?
Are the risks to the user outweighed by the benefits to the user?
If so, how?
See also
* [[DESIGN-PRINCIPLES#priority-of-constituencies]]
<h3 class=question id="minimum-data">
Is this specification exposing the minimum amount of information necessary
to power its features?
</h3>
A feature should only expose information
when doing so is absolutely necessary to satisfy its use cases.
If your feature exposes more information than is necessary,
why does it do so?
See also
* [[#data-minimization]]
<h3 class=question id="personal-data">
How does this specification deal with personal information,
personally-identifiable information (PII), or information derived from
them?
</h3>
Personal information is any data about a user
(for example, their home address),
or information that could be used to identify a user,
such as an alias, email address, or identification number.
Note: Personal information is
distinct from personally identifiable information
(<abbr title="personally identifiable information">PII</abbr>).
PII is a legal concept,
the definition of which varies from jurisdiction to jurisdiction.
When used in a non-legal context,
PII tends to refer generally
to information
that could be used to identify a user.
When exposing
personal information, PII, or derivative information,
spec authors must take steps to minimize the potential harm to users.
<p class=example>
A feature
which gathers biometric data
(such as fingerprints or retina scans)
for authentication
should not directly expose this biometric data to the web.
Instead,
it can use the biometric data
to look up or generate some temporary key which is not shared across origins
which can then be safely exposed to the origin. [[WEBAUTHN]]
</p>
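<div class=example>
As a non-normative sketch of that pattern (the relying-party and user
values below are illustrative placeholders, and in practice the challenge
and user id come from the server), a site can ask the platform
authenticator for an origin-scoped credential instead of ever touching
biometric data:
<pre highlight=js>
// Hedged sketch: the biometric match happens entirely on the user's
// device; the page only receives an opaque, origin-scoped credential.
async function register() {
  const credential = await navigator.credentials.create({
    publicKey: {
      challenge: crypto.getRandomValues(new Uint8Array(32)), // illustrative
      rp: { name: "Example RP" },                            // illustrative
      user: {
        id: crypto.getRandomValues(new Uint8Array(16)),      // illustrative
        name: "user@example.com",
        displayName: "Example User"
      },
      pubKeyCredParams: [{ type: "public-key", alg: -7 }], // ES256
      authenticatorSelection: { userVerification: "required" }
    }
  });
  return credential; // no fingerprint or retina data is ever exposed
}
</pre>
</div>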
Personal information, PII, or their derivatives
should not be exposed to origins
without [meaningful user consent](https://w3ctag.github.io/design-principles/#consent).
Many APIs
use the Permissions API to acquire meaningful user consent.
[[PERMISSIONS]]
Keep in mind
that each permission prompt
added to the web platform
increases the risk
that users will ignore
the contents of all permission prompts.
Before adding a permission prompt, consider your options for using
a less obtrusive way to gain meaningful user consent.
[[ADDING-PERMISSION]]
<p class=example>
`<input type=file>` can be used to upload
documents containing personal information
to websites.
It makes use of
the underlying native platform's file picker
to ensure the user understands
that the file and its contents
will be exposed to the website,
without a separate permissions prompt.
</p>
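<div class=example>
As a minimal, non-normative sketch of this model, script only ever sees
the specific file the user picked in the native chooser:
<pre highlight=js>
const input = document.querySelector("input[type=file]");
input.addEventListener("change", async () => {
  const file = input.files[0];
  if (!file) return;
  // Only this file is exposed; nothing about sibling files or directories.
  const text = await file.text();
  console.log(file.name, file.size, text.length);
});
</pre>
</div>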
See also
* [[#user-mediation]]
* [[DESIGN-PRINCIPLES#consent]]
<h3 class=question id="sensitive-data">
How does this specification deal with sensitive information?
</h3>
Personal information is not the only kind of sensitive information.
Many other kinds of information may also be sensitive.
What is or isn't sensitive information can vary
from person to person
or from place to place.
Information that would be harmless if known about
one person or group of people
could be dangerous if known about
another person or group.
Information about a person
that would be harmless in one country
might be used in another country
to detain, kidnap, or imprison them.
Note:
caste,
citizenship,
color,
credentials,
criminal record,
demographic information,
employment status,
ethnicity,
financial information,
health information,
location data,
marital status,
political beliefs,
profession,
race,
religious beliefs or nonbeliefs,
sexual preferences,
and
trans status
are all examples of sensitive information.
When a feature exposes sensitive information to the web,
its designers must take steps
to mitigate the risk of exposing the information.
<div class=example>
The Credential Management API allows sites
to request a user's credentials
from a password manager. [[CREDENTIAL-MANAGEMENT-1]]
If it exposed the user's credentials to JavaScript,
and if the page using the API were vulnerable to [=XSS=] attacks,
the user's credentials could be leaked to attackers.
The Credential Management API
mitigates this risk
by not exposing the credentials to JavaScript.
Instead, it exposes
an opaque {{FormData}} object
which cannot be read by JavaScript.
The spec also recommends
that sites configure Content Security Policy [[CSP]]
with reasonable [=connect-src=] and [=form-action=] values
to further mitigate the risk of exfiltration.
</div>
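<div class=example>
A hedged sketch of requesting a stored credential through the Credential
Management API; `sendToServer` is a hypothetical helper, since the exact
transport varies by site:
<pre highlight=js>
async function signIn() {
  const cred = await navigator.credentials.get({
    password: true,        // ask the password manager for a credential
    mediation: "optional"  // let the user agent decide whether to show UI
  });
  if (cred) {
    // Forward the credential to the server rather than inspecting it in
    // script, keeping it out of reach of injected code.
    await sendToServer("/login", cred);
  }
}
</pre>
</div>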
Many use cases
which require location information
can be adequately served
with very coarse location data.
For instance,
a site which recommends restaurants
could adequately serve its users
with city-level location information
instead of exposing the user's precise location.
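<div class=example>
A non-normative sketch of that kind of coarsening: round the coordinates
before using them, so the application works with roughly city-level
precision. `findRestaurantsNear` is a hypothetical application function.
<pre highlight=js>
navigator.geolocation.getCurrentPosition(pos => {
  const coarse = {
    lat: Math.round(pos.coords.latitude * 10) / 10,  // ~11 km per step
    lon: Math.round(pos.coords.longitude * 10) / 10
  };
  findRestaurantsNear(coarse);
});
</pre>
</div>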
See also
* [[DESIGN-PRINCIPLES#do-not-expose-use-of-assistive-tech]]
<h3 class=question id="persistent-origin-specific-state">
Does this specification introduce new state for an origin that persists
across browsing sessions?
</h3>
There are many existing mechanisms
origins can use to
store information about a user.
Cookies,
`ETag`,
`Last-Modified`,
{{localStorage}},
and
{{indexedDB}}
are just a few examples.
Allowing an origin
to store data
on a user’s device
in a way that persists across browsing sessions
introduces the risk
that this state may be used
to track a user
without their knowledge or control,
either in [=first-party-site context|first-=] or [=third-party context|third-party=] contexts.
One of the ways
user agents mitigate the risk
that client-side storage mechanisms
will form a persistent identifier
is by providing users with the ability
to clear out the data stored by origins.
New state persistence mechanisms
should not be introduced
without mitigations
to prevent them
from being used
to track users
across domains
or without giving users control
over clearing this state.
That said,
manually clearing storage
is something users do only rarely.
Spec authors should consider ways
to make new features more privacy-preserving without full storage clearing,
such as
reducing the uniqueness of values,
rotating values,
or otherwise making features no more identifying than is needed.
<!-- https://github.com/w3ctag/design-principles/issues/215 -->
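<div class=example>
A non-normative sketch of one such mitigation: an origin-stored identifier
that rotates on a fixed schedule, so stored state loses its tracking value
over time. The one-day rotation period is an illustrative policy choice,
not a normative value.
<pre highlight=js>
const ROTATION_MS = 24 * 60 * 60 * 1000; // one day (illustrative)

function getRotatingId() {
  const stored = JSON.parse(localStorage.getItem("session-id") || "null");
  const expired = stored ? Date.now() - stored.created >= ROTATION_MS : true;
  if (!expired) return stored.id;
  // Mint a fresh identifier once the old one has aged out.
  const fresh = { id: crypto.randomUUID(), created: Date.now() };
  localStorage.setItem("session-id", JSON.stringify(fresh));
  return fresh.id;
}
</pre>
</div>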
Also, keep in mind that
user agents make use of several different caching mechanisms.
Which, if any, caches will store this new state?
Are additional mitigations necessary?
<div class=example>
Service Workers
intercept all requests made by an origin,
which enables sites
to continue to function when the browser goes offline.
Because of this,
a maliciously-injected service worker
could compromise the user (as documented in [[SERVICE-WORKERS#security-considerations]]).
The spec mitigates the risks that
an [=active network attacker=] or an [=XSS=] vulnerability presents
by limiting service worker registration to [=secure contexts=].
[[SERVICE-WORKERS]]
</div>
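<div class=example>
A minimal, non-normative sketch of that restriction from the page's side.
User agents already reject registration on insecure origins; the explicit
check simply makes the intent visible in code:
<pre highlight=js>
if (window.isSecureContext) {
  if ("serviceWorker" in navigator) {
    // An active network attacker cannot inject a persistent worker here,
    // because registration only succeeds over an authenticated connection.
    navigator.serviceWorker.register("/sw.js")
      .catch(err => console.error("Registration failed:", err));
  }
}
</pre>
</div>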
<p class=example>
Platform-specific DRM implementations
(such as [=content decryption modules=] in [[ENCRYPTED-MEDIA]])
might expose origin-specific information
in order to help identify users
and determine whether they ought to be granted access
to a specific piece of media.
These kinds of identifiers
should be carefully evaluated
to determine how abuse can be mitigated;
identifiers which a user cannot easily change
are very valuable from a tracking perspective,
and protecting such identifiers
from an [=active network attacker=]
is vital.
</p>
<h3 class=question id="underlying-platform-data">
What information from the underlying platform, e.g. configuration data, is
exposed by this specification to an origin?
</h3>
Is the information exposed from the underlying platform consistent across
origins? This includes but is not limited to information relating to
the user configuration, system information including sensors, and
communication methods.
When a specification exposes specific information about a host to an origin,
if that information changes rarely and is not variable across origins, then
it can be used to uniquely identify a user across two origins, either
directly, because any given piece of information is unique, or because the
combination of disparate pieces of information is unique and can be used to
form a fingerprint [[FINGERPRINTING-GUIDANCE]]. Specifications and user agents
should address the risk of fingerprinting by carefully considering the surface
of available information, and the relative differences between software and
hardware stacks. Sometimes reducing fingerprintability may be as simple as
ensuring consistency, i.e. ordering the list of fonts, but sometimes it may be
more complex.
Such information should not be revealed to an origin without a user’s
knowledge and consent barring mitigations in the specification to prevent the
information from being uniquely identifying or able to unexpectedly
exfiltrate data.
<p class=example>
The `RENDERER` string exposed by some WebGL implementations
improves performance in some kinds of applications, but does so at the
cost of adding persistent state to a user's fingerprint. These kinds of
device-level details should be carefully weighed to ensure that the costs
are outweighed by the benefits.
</p>
<p class=example>
The {{NavigatorPlugins}} list exposed via the DOM practically never
changes for most users. Some user agents have taken steps to reduce the
entropy introduced by [disallowing direct enumeration of the plugin list](https://bugzilla.mozilla.org/show_bug.cgi?id=757726).
</p>
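<div class=example>
A non-normative illustration of the risk described above: a handful of
individually bland platform values, combined and hashed, already yields a
fairly stable and fairly unique identifier.
<pre highlight=js>
async function naiveFingerprint() {
  const parts = [
    navigator.userAgent,
    navigator.language,
    screen.width + "x" + screen.height,
    new Date().getTimezoneOffset()
  ].join("|");
  // Hash the combined surface into a single identifier-like digest.
  const digest = await crypto.subtle.digest(
      "SHA-256", new TextEncoder().encode(parts));
  return Array.from(new Uint8Array(digest),
      b => b.toString(16).padStart(2, "0")).join("");
}
</pre>
</div>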
<h3 class=question id="sensor-data">
Does this specification allow an origin access to sensors on a user’s
device?
</h3>
If so, what kind of sensors and information derived from those sensors does
this standard expose to origins?
Information from sensors may serve as a fingerprinting vector across origins.
In addition, sensor data reveals something about the user's device or
environment, and that fact itself might be what is sensitive. Also, as
technology advances, mitigations in place at the time a specification is
written may have to be reconsidered as the threat landscape changes.
Sensor data might even become a cross-origin identifier when the sensor reading
is relatively stable, for example over short time periods (seconds, minutes, even days), and
is consistent across origins. In fact, if two user agents expose the same
sensor data the same way, it may become a cross-browser, possibly even a cross-device identifier.
<p class=example>
For example, as gyroscopes advanced, their sampling rate had to be
lowered to prevent them from being used as a microphone
[[GYROSPEECHRECOGNITION]].
</p>
<p class=example>
Ambient light sensors could have allowed an attacker to exfiltrate whether or
not a user had visited given links [[OLEJNIK-ALS]].
</p>
<p class=example>
Even relatively short-lived data, like the battery status, may be able to
serve as an identifier if misused or abused [[OLEJNIK-BATTERY]].
</p>
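<div class=example>
A non-normative sketch of one common mitigation: requesting a capped
sampling frequency through the Generic Sensor API. User agents may clamp
the frequency further or require a permission.
<pre highlight=js>
if ("Accelerometer" in window) {
  const sensor = new Accelerometer({ frequency: 10 }); // at most 10 Hz
  sensor.addEventListener("reading", () => {
    console.log(sensor.x, sensor.y, sensor.z);
  });
  sensor.start();
}
</pre>
</div>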
<h3 class=question id="other-data">
What data does this specification expose to an origin? Please also
document what data is identical to data exposed by other features, in the
same or different contexts.
</h3>
As noted above in [[#sop-violations]], the [=same-origin policy=] is an
important security barrier that new features need to carefully consider.
If a specification exposes details about another origin's state, or allows
POST or GET requests to be made to another origin, the consequences can be
severe.
<p class=example>
Content Security Policy [[CSP]] unintentionally exposed redirect targets
cross-origin by allowing one origin to infer details about another origin
through violation reports (see [[HOMAKOV]]). The working group eventually
mitigated the risk by reducing a policy's granularity after a redirect.
</p>
<p class=example>
Beacon [[BEACON]] allows an origin to send POST requests to an endpoint
on another origin. They decided that this feature didn't add any new
attack surface above and beyond what normal form submission entails, so
no extra mitigation was necessary.
</p>
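<div class=example>
A minimal, non-normative sketch of Beacon's shape; the endpoint URL is
illustrative. The return value only signals that the request was queued,
and the response is never exposed to the sending page, which is why the
feature adds little beyond ordinary form submission.
<pre highlight=js>
const queued = navigator.sendBeacon(
    "https://analytics.example/collect",        // illustrative endpoint
    JSON.stringify({ event: "pagehide" }));
console.log("beacon queued:", queued);
</pre>
</div>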
<h3 class=question id="string-to-script">
Does this specification enable new script execution/loading mechanisms?
</h3>
* HTML Imports [[HTML-IMPORTS]] create a new script-loading mechanism, using
<{link}> rather than <{script}>, which might be easy to overlook when
evaluating an application's attack surface. The working group noted this
risk and ensured that HTML Imports interact reasonably with Content
Security Policy's [=script-src=] directive.
* Does your feature add a new string-to-script mechanism, such as {{eval()}}
or {{setTimeout()}} invoked with a string argument?
* Does it enable new ways to load or apply style?
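<div class=example>
Where a feature does add a string-to-script pathway, the risk is easy to
demonstrate. This non-normative snippet shows why the string forms are
more dangerous than their function-based equivalents:
<pre highlight=js>
// Each pair computes the same thing, but the string forms turn data into
// code and widen the injection surface.
setTimeout("console.log('string form')", 0);        // string-to-script
setTimeout(() => console.log("function form"), 0);  // safer equivalent

eval("1 + 1");     // string-to-script
(() => 1 + 1)();   // same computation without eval
</pre>
</div>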
<h3 class=question id="remote-device">
Does this specification allow an origin to access other devices?
</h3>
If so, what devices does this specification allow an origin to access?
Accessing other devices, both via network connections and via
direct connection to the user's machine (e.g. via Bluetooth,
NFC, or USB), could expose vulnerabilities: some of
these devices were not created with web connectivity in mind and may be
inadequately hardened against malicious input or against use on the web.
Exposing other devices on a user’s local network also has significant privacy
risk:
* If two user agents have the same devices on their local network, an
attacker may infer that the two user agents are running on the same host
or are being used by two separate users who are in the same physical
location.
* Enumerating the devices on a user’s local network provides significant
entropy that an attacker may use to fingerprint the user agent.
* If the specification exposes persistent or long lived identifiers of
local network devices, that provides attackers with a way to track a user
over time even if a user takes steps to prevent such tracking (e.g.
clearing cookies and other stateful tracking mechanisms).
* Direct connections might also be used to bypass security checks that
other APIs would provide. For example, attackers used the WebUSB API to
access other sites' credentials on a hardware security key, bypassing
same-origin checks in an early U2F API. [[YUBIKEY-ATTACK]]
<p class=example>
The Network Service Discovery API [[DISCOVERY-API]] recommended CORS
preflights before granting access to a device, and required user agents to
involve the user with a permission request of some kind.
</p>
<p class=example>
Likewise, Web Bluetooth [[WEB-BLUETOOTH]] has an extensive discussion of
such issues in [[WEB-BLUETOOTH#security-and-privacy]], which is worth
reading as an example for similar work.
</p>
<p class=example>
[[WEBUSB]] addresses these risks through a combination of user mediation /
prompting, secure origins, and feature policy.
See [[WEBUSB#security-and-privacy]] for more.
</p>
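<div class=example>
A non-normative sketch of WebUSB's user-mediation model; the vendorId
filter is illustrative. The chooser is the consent surface: the page only
ever learns about the single device the user picks.
<pre highlight=js>
const button = document.querySelector("#connect");
button.addEventListener("click", async () => {
  try {
    // Must be called from a user gesture; shows a device chooser.
    const device = await navigator.usb.requestDevice({
      filters: [{ vendorId: 0x1234 }] // hypothetical vendor
    });
    await device.open();
  } catch (e) {
    // The user dismissed the chooser or no matching device was found.
  }
});
</pre>
</div>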
<h3 class=question id="native-ui">
Does this specification allow an origin some measure of control over a user
agent's native UI?
</h3>
Features that allow for control over a user agent’s UI (e.g. full screen
mode) or changes to the underlying system (e.g. installing an ‘app’ on a
smartphone home screen) may surprise users or obscure security / privacy
controls. To the extent that your feature does allow for the changing of a
user agent’s UI, can it affect security / privacy controls? What analysis
confirmed this conclusion?
<h3 class=question id="temporary-id">
What temporary identifiers might this specification create or expose
to the web?
</h3>
If a standard exposes a temporary identifier to the web, the identifier
should be short-lived and should rotate on some regular duration to mitigate
the risk of this identifier being used to track a user over time. When a
user clears state in their user agent, these temporary identifiers should be
cleared to prevent re-correlation of state using a temporary identifier.
If this specification does create or expose a temporary identifier to the
web, how is it exposed, when, to what entities, and, how frequently is it
rotated?
Example temporary identifiers include TLS Channel ID, Session Tickets, and
IPv6 addresses.
The index attribute in the Gamepad API [[GAMEPAD]] — an integer that starts
at zero, increments, and is reset — is a good example of a privacy-friendly
temporary identifier.
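<div class=example>
A non-normative sketch of why that identifier is low-risk: `index` is just
a small slot number, enough to tell two controllers apart but useless for
tracking a user across sites or sessions.
<pre highlight=js>
window.addEventListener("gamepadconnected", e => {
  console.log("gamepad", e.gamepad.index, "connected:", e.gamepad.id);
});
// Entries may be null until the user interacts with a gamepad.
const pads = navigator.getGamepads();
</pre>
</div>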
<h3 class=question id="first-third-party">
How does this specification distinguish between behavior in first-party and
third-party contexts?
</h3>
The behavior of a feature should be considered not just in the context of its
being used by a first-party origin that a user is visiting, but also in the
context of its being used by an arbitrary third party that the first
party includes. When developing your specification, consider the implications
of its use by third-party resources on a page, and consider whether support
for use by third-party resources should be optional to conform to the
specification. If supporting use by third-party resources is mandatory for
conformance, please explain why and what privacy mitigations are in place.
This is particularly important as user agents may take steps to reduce the
availability or functionality of certain features to third parties if the
third parties are found to be abusing the functionality.
<h3 class=question id="private-browsing">
How does this specification work in the context of a user agent’s Private
Browsing or "incognito" mode?
</h3>
Each major user agent implements a private browsing / incognito mode feature
with significant variation across user agents in threat models,
functionality, and descriptions to users regarding the protections afforded
[[WU-PRIVATE-BROWSING]].
One typical commonality across user agents' private browsing / incognito
modes is that they maintain a set of state separate from that of the user
agent’s ‘normal’ mode.
Does the specification provide information that would allow for the
correlation of a single user's activity across normal and private browsing /
incognito modes? Does the specification result in information being written
to a user’s host that would persist following a private browsing / incognito
mode session ending?
There has been research into both:
* Detecting whether a user agent is in private browsing mode [[RIVERA]]
using non-standardized methods such as <code>[window.requestFileSystem()](https://developer.mozilla.org/en-US/docs/Web/API/Window/requestFileSystem)</code>.
* Using features to fingerprint a browser and correlate private and
non-private mode sessions for a given user. [[OLEJNIK-PAYMENTS]]
<h3 class=question id="considerations">
Does this specification have a "Security Considerations" and "Privacy
Considerations" section?
</h3>
Documenting the various concerns and potential abuses in "Security
Considerations" and "Privacy Considerations" sections of a document is a good
way to help implementers and web developers understand the risks that a
feature presents, and to ensure that adequate mitigations are in place.
Simply adding a section to your specification with yes/no responses to the
questions in this document is insufficient.
If it seems like a feature does not have security or privacy impacts,
then say so inline in the spec section for that feature:
> There are no known security or privacy impacts of this feature.
Saying so explicitly in the specification serves several purposes:
1. Shows that a spec author/editor has explicitly considered security and
privacy when designing a feature.
1. Provides some sense of confidence that there might be no such impacts.
1. Challenges security and privacy minded individuals to think of and find
even the potential for such impacts.
1. Demonstrates the spec author/editor's receptivity to feedback about such
impacts.
1. Demonstrates a desire that the specification should not introduce
   security and privacy issues.
[[RFC3552]] provides general advice on writing Security Considerations
sections. Generally, there should be a clear description of the kinds of
privacy risks the new specification introduces for users of the web
platform. Below is a set of considerations, informed by that RFC, for
writing a privacy considerations section.
Authors must describe:
1. What privacy attacks have been considered?
1. What privacy attacks have been deemed out of scope (and why)?
1. What privacy mitigations have been implemented?
1. What privacy mitigations have been considered and not implemented (and why)?
In addition, the attacks and mitigations considered must cover:
1. fingerprinting risk;
1. unexpected exfiltration of data through abuse of sensors;
1. unexpected usage of the specification / feature by third parties;
1. if the specification includes identifiers, what rotation period was
selected for the identifiers and why;
1. if the specification introduces new state to the user agent, what
guidance regarding clearing said storage was given and why;
1. a clear description of the residual risk to the user after the privacy
mitigations have been implemented.
The crucial aspect is to actually consider security and privacy. All new
specifications must have security and privacy considerations sections to be
considered for wide review. Interesting features added to the web platform
generally turn out to have security and/or privacy impacts.
<h3 class=question id="relaxed-sop">
Does this specification allow downgrading default security characteristics?
</h3>
Does this feature allow for a site to opt-out of security settings to
accomplish some piece of functionality? If so, in what situations does your
specification allow such security setting downgrading and what mitigations
are in place to make sure optional downgrading doesn't dramatically increase
risks?
* {{Document/domain|document.domain}}
* [[CORS]]
* [[WEBMESSAGING]]
* [[REFERRER-POLICY]]'s <a>"unsafe-url"</a>
<h3 class=question id="missing-questions">
What should this questionnaire have asked?
</h3>
This questionnaire is not exhaustive.
After completing a privacy review,
it may be that
there are privacy aspects of your specification
that a strict reading of, and response to, this questionnaire
would not have revealed.
If this is the case,
please convey those privacy concerns,
and indicate if you can think of improved or new questions
that would have covered this aspect.
Please consider [filing an issue](https://github.com/w3ctag/security-questionnaire/issues/new)
to let us know what the questionnaire should have asked.
<h2 id="threats">Threat Models</h2>
To consider security and privacy, it is convenient to think in terms of
threat models, a way to illuminate the possible risks.
There are some concrete privacy concerns that should be considered when
developing a feature for the web platform [[RFC6973]]:
* Surveillance: Surveillance is the observation or monitoring of an
individual's communications or activities.
* Stored Data Compromise: End systems that do not take adequate measures to
secure stored data from unauthorized or inappropriate access.
* Intrusion: Intrusion consists of invasive acts that disturb or interrupt
one's life or activities.
* Misattribution: Misattribution occurs when data or communications related
to one individual are attributed to another.
* Correlation: Correlation is the combination of various pieces of
information related to an individual or that obtain that characteristic
when combined.
* Identification: Identification is the linking of information to a
particular individual to infer an individual's identity or to allow the
inference of an individual's identity.
* Secondary Use: Secondary use is the use of collected information about an
individual without the individual's consent for a purpose different from
that for which the information was collected.
* Disclosure: Disclosure is the revelation of information about an
individual that affects the way others judge the individual.
* Exclusion: Exclusion is the failure to allow individuals to know about
the data that others have about them and to participate in its handling
and use.
In the mitigations section, this document outlines a number of techniques
that can be applied to mitigate these risks.
Enumerated below are some broad classes of threats that should be
considered when developing a web feature.
<h3 id="passive-network">
Passive Network Attackers
</h3>
A <dfn>passive network attacker</dfn> has read-access to the bits going over
the wire between users and the servers they're communicating with. She can't
*modify* the bytes, but she can collect and analyze them.
Due to the decentralized nature of the internet, and the general level of
interest in user activity, it's reasonable to assume that practically every
unencrypted bit that's bouncing around the network of proxies, routers, and
servers you're using right now is being read by someone. It's equally likely
that some of these attackers are doing their best to understand the encrypted
bits as well, including storing encrypted communications for later
cryptanalysis (though that requires significantly more effort).
* The IETF's "Pervasive Monitoring Is an Attack" document [[RFC7258]] is
useful reading, outlining some of the impacts on privacy that this
assumption entails.
* Governments aren't the only concern; your local coffee shop is likely to
be gathering information on its customers, your ISP at home is likely to
be doing the same.
<h3 id="active-network">
Active Network Attackers
</h3>
An <dfn>active network attacker</dfn> has both read- and write-access to the
bits going over the wire between users and the servers they're communicating
with. She can collect and analyze data, but also modify it in-flight,
injecting and manipulating JavaScript, HTML, and other content at will.
This is more common than you might expect, for both benign and malicious
purposes:
* ISPs and caching proxies regularly cache and compress images before
delivering them to users in an effort to reduce data usage. This can be
especially useful for users on low-bandwidth, high-latency devices like
phones.
* ISPs also regularly inject JavaScript [[COMCAST]] and other identifiers
[[VERIZON]] for less benign purposes.
* If your ISP is willing to modify substantial amounts of traffic flowing
through it for profit, it's difficult to believe that state-level
attackers will remain passive.
<h3 id="sop-violations">
Same-Origin Policy Violations
</h3>
The <dfn>same-origin policy</dfn> is the cornerstone of security on the web;
one origin should not have direct access to another origin's data (the policy
is more formally defined in Section 3 of [[RFC6454]]). A corollary to this
policy is that an origin should not have direct access to data that isn't
associated with *any* origin: the contents of a user's hard drive,
for instance. Various kinds of attacks bypass this protection in one way or
another. For example:
* <dfn local-lt="XSS">Cross-site scripting attacks</dfn> involve an
attacker tricking an origin into executing attacker-controlled code in
the context of a target origin.
* Cross-site request forgery attacks trick user agents into exerting a
user's ambient authority on sites where they've logged in by submitting
requests on their behalf.
* Data leakage occurs when bits of information are inadvertently made
available cross-origin, either explicitly via CORS headers [[CORS]],
or implicitly, via side-channel attacks like [[TIMING]].
<h3 id="third-party-tracking">
Third-Party Tracking
</h3>
Part of the power of the web is its ability for a page to pull in content
from third parties, from images to JavaScript, to enhance the content
and/or a user's experience of the site. However, when a page pulls in
content from third parties, it inherently leaks some information to those
parties: referrer information and other data that may be used to track
and profile a user. This includes the fact that cookies go back to the
domain that initially stored them, allowing for cross-origin tracking.
Moreover, third parties can gain execution power through third-party
JavaScript being included by a webpage. While pages can take steps to
mitigate the risks of third-party content and browsers may differentiate
how they treat first- and third-party content from a given page, the risk of
new functionality being executed by third parties rather than the first-party
site should be considered in the feature development process.
The simplest example is embedding a link to a resource that behaves differently
under specific conditions, for example depending on whether the user is or is
not logged in to the site. This may reveal that the user has an account on
that site.
<h3 id="legitimate-misuse">
Legitimate Misuse
</h3>
Even when powerful features are made available to developers, it does not
mean that every use of them is a good idea, or justified; in fact,
data privacy regulations around the world may even put limits on certain uses
of data. In a first-party context, a legitimate website is potentially
able to use powerful features to learn about a user's behavior or
habits. For example:
* Tracking the user while they browse the website, via mechanisms such as
mouse-move tracking
* Behavioral profiling of the user based on usage patterns
* Accessing powerful features that enable reasoning about the user's system,
the user themselves, or the user's surroundings, such as a webcam, Web
Bluetooth, or sensors
This point is admittedly different from the others: it underlines that even
if something is possible, it does not mean it should always be done, and that
a privacy impact assessment or even an ethical assessment may be warranted.
When designing a specification with security and privacy
in mind, both use and misuse cases should be in scope.
<h2 id="mitigations">
Mitigation Strategies
</h2>
To mitigate the security and privacy risks you’ve identified in your
specification as you’ve filled out the questionnaire,
you may want to apply one or more of the mitigations described below to your
feature.
<h3 id="data-minimization">
Data Minimization
</h3>
Minimization is a strategy that involves exposing as little information to
other communication partners as is required for a given operation to
complete. More specifically, it requires not providing access to more
information than was apparent in the user-mediated access or allowing the
user some control over which information exactly is provided.
For example, if the user has provided access to a given file, the object
representing that file should not make it possible to obtain information about
the file's parent directory and its contents, as that is clearly not what the
user expected to share.
In the context of data minimization it is natural to ask what data is passed
around between the different parties, how persistent the data items and
identifiers are, and whether there are correlation possibilities between
different protocol runs.
For example, the W3C Device APIs Working Group has defined a number of
requirements in their Privacy Requirements document. [[DAP-PRIVACY-REQS]]
Data minimization is applicable to specification authors and implementers, as
well as to those deploying the final service.
As an example, consider mouse events. When a page is loaded, the application
has no way of knowing whether a mouse is attached, what type of mouse it is
(e.g., make and model), what kind of capabilities it exposes, how many are
attached, and so on. Only when the user decides to use the mouse — presumably
because it is required for interaction — does some of this information become
available. And even then, only a minimum of information is exposed: you could
not know whether it is a trackpad for instance, and the fact that it may have
a right button is only exposed if it is used. For instance, the Gamepad API
makes use of this data minimization capability. It is impossible for a Web game
to know if the user agent has access to gamepads, how many there are, what
their capabilities are, etc. It is simply assumed that if the user wishes to
interact with the game through the gamepad then she will know when to action
it — and actioning it will provide the application with all the information
that it needs to operate (but no more than that).
The way in which the functionality is supported for the mouse is simply by
only providing information on the mouse's behaviour when certain events take
place. The approach is therefore to expose event handling (e.g., triggering
on click, move, button press) as the sole interface to the device.
Two features that have minimized the data they make available are:
* [[BATTERY-STATUS]] <q>The user agent should not expose high precision readouts</q>
* [[GENERIC-SENSOR]] <q>Limit maximum sampling frequency</q>,
<q>Reduce accuracy</q>
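<div class=example>
A non-normative sketch in the spirit of the [[BATTERY-STATUS]] guidance:
coarsen the reading before using it, so it contributes less fingerprinting
entropy.
<pre highlight=js>
if (navigator.getBattery) {
  navigator.getBattery().then(battery => {
    const coarse = Math.round(battery.level * 10) / 10; // steps of 0.1
    console.log("battery roughly at", coarse * 100, "%");
  });
}
</pre>
</div>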
<h3 id="privacy-friendly-defaults">
Default Privacy Settings
</h3>
Users often do not change defaults. As a result, it is important that the
default mode of a specification minimizes the amount, identifiability, and
persistence of the data and identifiers exposed. This is particularly true
if a protocol comes with flexible options so that it can be tailored to
specific environments.
<h3 id="user-mediation">
Explicit user mediation
</h3>
If the security or privacy risk of a feature cannot otherwise be mitigated in
a specification, optionally allowing an implementer to prompt a user may
be the best mitigation possible, with the understanding that it does not
entirely remove the privacy risk. If the specification does not allow
implementers to prompt, it may result in divergent implementations by
different user agents, as some user agents choose to implement a more
privacy-friendly version.
It is possible that the risk of a feature cannot be mitigated because the
risk is endemic to the feature itself. For instance, [[GEOLOCATION-API]]
reveals a user’s location intentionally; user agents generally gate access to
the feature on a permission prompt which the user may choose to accept. This
risk is also present and should be accounted for in features that expose
personal data or identifiers.
Designing such prompts is difficult, as is determining how long a granted
permission should last.
Often, the best prompt is one that is clearly tied to a user action, like the
file picker, where in response to a user action, the file picker is brought
up and a user gives access to a specific file to an individual site.
Generally speaking, the duration and timing of the prompt should be inversely
proportional to the risk posed by the data exposed. In addition, the prompt
should consider issues such as:
* How should permission requests be scoped? Especially when requested by an
embedded third party iframe?
* Should persistence be based on the pair of top-level/embedded origins or a
different scope?
* How can it be ensured that the prompt occurs in the context of requiring
the data, and at a time when it is clear to the user why the prompt is
occurring?
* Explaining the implications of permission before prompting the user, in a
way that is accessible and localized -- _who_ is asking, _what_ are they
asking for, _why_ do they need it?
* What happens if the user rejects the request at the time of the prompt, or
if the user later changes their mind and revokes access?
These prompts should also include considerations for what, if any, control a
user has over their data after it has been shared with other parties. For
example, are users able to determine what information was shared with other
parties?
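<div class=example>
A non-normative sketch of tying prompts to state the user already
expressed, via the Permissions API; `showMap` is a hypothetical
application callback.
<pre highlight=js>
async function maybeUseLocation() {
  const status = await navigator.permissions.query({ name: "geolocation" });
  if (status.state === "granted") {
    navigator.geolocation.getCurrentPosition(showMap);
  } else if (status.state === "prompt") {
    // Defer the request until a user action makes the reason obvious.
  }
  // React if the user later changes their mind and revokes access.
  status.addEventListener("change", () => console.log(status.state));
}
</pre>
</div>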
<h3 id="restrict-to-first-party">
Explicitly restrict the feature to first party origins
</h3>
As described in the "Third-Party Tracking" section, a significant feature of
the web is the mixing of first and third party content in a single page, but,
this introduces risk where the third party content can use the same set of web
features as the first party content.
Authors should explicitly specify the feature's scope of availability:
* When a feature should be made available to embedded third parties -- and
often first parties should be able to explicitly control that (using
iframe attributes or feature policy)
* Whether a feature should be available in the background or only in the
top-most, visible tab.
* Whether a feature should be available to offline service workers.
* Whether events will be fired simultaneously
Supporting third-party access to a feature should be optional for
conformance.
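<div class=example>
A non-normative sketch of the first kind of control named above; the
widget URL is illustrative. The embedding first party uses the iframe
`allow` attribute (Permissions Policy / feature policy) to deny features
to an embedded third party:
<pre highlight=js>
const frame = document.createElement("iframe");
frame.src = "https://third-party.example/widget"; // illustrative URL
frame.allow = "geolocation 'none'; camera 'none'"; // deny these features
document.body.append(frame);
</pre>
</div>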
<h3 id="secure-contexts">
Secure Contexts
</h3>
If the primary risk that you’ve identified in your specification is the
threat posed by an [=active network attacker=], offering a feature to an
insecure origin is the same as offering that feature to every origin because
the attacker can inject frames and code at will. Requiring an encrypted and
authenticated connection in order to use a feature can mitigate this kind of
risk.
Secure contexts also protect against [=passive network attackers=]. For