Missing evidence for core problem addressed by specification #215
I suggested in #214 (comment) that we have this spec cite https://w3c.github.io/fingerprinting-guidance/#avoid-passive-increases because it explains that "Passive fingerprinting allows for easier and widely-available identification, without opportunities for external detection, …". I think that explains why there's not much public data on how passive fingerprinting is used: by being passive, it evades such detection. That doesn't mean we should do nothing until a whistleblower comes forward, as you're arguing. We should work to improve the detectability of the available fingerprinting methods, which is what UA client hints do. This is a distinct purpose from @pes10k's issues, especially httpwg/http-extensions#767. He was objecting to moving active fingerprinting mechanisms from JavaScript to HTTP headers. The specific use of client hints to (eventually) replace the UA string instead turns a passive fingerprinting mechanism in an HTTP header into an active mechanism in a header. |
For a harm to occur there has to be a victim. If there were victims, they would come forward and there would be "public data" to use as evidence to justify the proposal. Unless there are victims who have been harmed, and that harm can be tied to the User-Agent, the problem is at best theoretical. The proposal should be amended to remove references to passive fingerprinting. |
It seems odd to base a proposal on an assumption, all the more so given the dominance of HTTPS connections where only parties designated by the destination page gain access to headers. What other harm should we assume is happening? And if passive fingerprinting really is a problem, rather than doing nothing I'm suggesting that there is already a perfectly good mechanism available to browser makers to decide how much entropy to reveal, and that this mechanism is already quite widely used. The User-Agent header is a SHOULD requirement since HTTP 1.0—browser makers are free to decide what to put in it in order to serve their users best. |
@miketaylr can you advise when we can expect the evidence requested to be provided? If you are not minded to provide any evidence can you please provide positive confirmation. I've observed in my analysis of FLoC for this comment an absence of evidence to justify Privacy Sandbox proposals. I'm keen to avoid investing even more of my company's money and time on UACH. This could be spent conducting more beneficial improvements and features. |
I'm wary of estimating timelines for any spec issue -- there are competing demands for time and attention, and sometimes standards work is slower than we all would like, unfortunately. But I would like to come to a resolution to this issue, one way or the other, before we progress the spec along the standards track. |
@ronancremin I agree with you here -- browsers can reduce the granularity of information available in the User-Agent header by default. But for use cases that require more entropy, UA-CH can provide an (active) opt-in mechanism for sites to request it from the user agent. |
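To make the opt-in flow described above concrete, here is a rough sketch in Python. It is a toy model of a user agent's decision, not how any browser actually implements it. The header names (`Accept-CH`, `Sec-CH-UA-*`) come from the UA-CH draft; the function names and the `allow_high_entropy` policy switch are hypothetical, for illustration only.

```python
from typing import Dict, Optional, Set

# Hints sent by default on every request (low entropy), per the UA-CH draft.
LOW_ENTROPY_HINTS = {"Sec-CH-UA", "Sec-CH-UA-Mobile", "Sec-CH-UA-Platform"}


def parse_accept_ch(header_value: str) -> Set[str]:
    """Split an Accept-CH response header into lower-cased hint names."""
    return {tok.strip().lower() for tok in header_value.split(",") if tok.strip()}


def hints_to_send(
    available: Dict[str, str],
    accept_ch: Optional[str] = None,
    allow_high_entropy: bool = True,  # hypothetical user-agent policy switch
) -> Dict[str, str]:
    """Return the hint headers a (toy) user agent would attach to a request.

    Low-entropy hints always go out. Anything else is sent only if the site
    asked for it via Accept-CH and the user agent's policy allows it.
    """
    requested = parse_accept_ch(accept_ch or "")
    sent = {}
    for name, value in available.items():
        is_default = name in LOW_ENTROPY_HINTS
        is_opted_in = allow_high_entropy and name.lower() in requested
        if is_default or is_opted_in:
            sent[name] = value
    return sent


device = {
    "Sec-CH-UA-Mobile": "?1",
    "Sec-CH-UA-Platform": '"Android"',
    "Sec-CH-UA-Model": '"Pixel 5"',
    "Sec-CH-UA-Platform-Version": '"11.0.0"',
}

first_request = hints_to_send(device)
after_opt_in = hints_to_send(device, accept_ch="Sec-CH-UA-Model")
restricted = hints_to_send(device, accept_ch="Sec-CH-UA-Model", allow_high_entropy=False)
```

The point of the sketch is that the request for extra detail is visible to the user agent (it arrives in `Accept-CH`), so the user agent can observe it and decline it.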
Here's an interesting article that talks about certain companies pitching passive HTTP-based fingerprinting, as a workaround for privacy policies: https://digiday.com/media/the-elephant-in-the-room-companies-persist-with-fingerprinting-as-a-workaround-to-apples-new-privacy-rules/.
In this case, it's not academic research about the possibility of covert tracking, but speaks to companies trying to offer this as a service today. |
Thanks for the article. It's difficult to be 100% sure about the exact data flow described but it seems pretty clear that it's describing a situation where an advertising company's SDK communicates with that company's back end servers, and then those servers in turn choose to pass the received information on to a third party. This would happen outside of the purview of Apple and the user would be oblivious. While I agree that this is fingerprinting, it certainly isn't passive; it is a deliberate choice of the advertising company. |
I think maybe we're not aligned on the terminology of passive vs active (and admittedly they could be seen as overloaded terms). Passive, in the context of fingerprinting, means that it can happen without the user agent's knowledge -- a browser, or user for that matter, has no idea what a company might do with its HTTP logs to create a fingerprint, for example. Active means there is some code that is run to probe characteristics of a device or browser to create a fingerprint. User agents do have the capability to know what APIs are being called in this instance (which is more or less what https://w3c.github.io/fingerprinting-guidance/#passive and https://w3c.github.io/fingerprinting-guidance/#active say). |
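As a hedged illustration of the passive case described above: the sketch below derives an identifier purely from fields a server already has in its request logs, with no code running on the client. The record layout (`ip`, `user_agent`, `accept_language`) and the function name are assumptions made for this example and do not describe any particular company's practice.

```python
import hashlib


def passive_fingerprint(log_record: dict) -> str:
    """Hash fields that arrive with every HTTP request into a short identifier.

    Nothing here requires running script in the browser, so the user agent
    has no way to observe that a fingerprint is being computed.
    """
    fields = ("ip", "user_agent", "accept_language")  # hypothetical log schema
    material = "\n".join(log_record.get(field, "") for field in fields)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]


record_a = {
    "ip": "203.0.113.7",
    "user_agent": "Mozilla/5.0 (Linux; Android 11; Pixel 5) Chrome/90.0.4430.91",
    "accept_language": "en-GB,en;q=0.9",
}
# Same client, but with a less granular User-Agent string.
record_b = dict(record_a, user_agent="Mozilla/5.0 (Linux; Android 10; K) Chrome/90.0.0.0")

fp_a = passive_fingerprint(record_a)
fp_a_again = passive_fingerprint(dict(record_a))
fp_b = passive_fingerprint(record_b)
```

Identical log fields always yield the same identifier, and the identifier changes when the User-Agent string changes, which is why the granularity of that header matters to this kind of scheme.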
I see the following issues with this article and the approach advocated.
In relation to the referenced draft fingerprinting document edited by Nick Doty: that document, and the ones it references, do not provide evidence of specific harms, only theoretical harms, without acknowledging the legal basis and benefits of probabilistic identifiers. I agree with Michael Zacharski in this article.
|
It is also worth noting that the UK's ICO and CMA are now working together concerning the justification for privacy based changes to the web. See this joint ICO and CMA announcement issued on 19th May 2021. @miketaylr - I assume you are representing Google rather than yourself in this forum as @yoavweiss the original active Google engineer and proposer seems to have become less active of late. Correct? If so will Google be presenting any information to this forum to justify this proposal/specification? I assume such information would exist within Google prior to expending the time and energies of their elite engineers. |
@miketaylr Can you advise when and where the proposed discussion will take place? Tagging @ronancremin and @jonarnes who will likely also be interested. I observe that the current draft of the document still contains the following in the introduction.
Focusing purely on the single issue of the many tens of millions of developer hours that will be required to upgrade data models, the ecosystem deserves a more thorough and evidence-backed justification. Further, given the facts contained in the Google Digital Advertising Antitrust Litigation filing, and the relationship between this proposal and Privacy Sandbox, it's essential to demonstrate that this is a net improvement to the web and not just to browser vendors. At the moment the document reads as if the justification is self-evident, which it is not. |
It's a shame this issue and other related issues were not raised in the TAG review on User-Agent Client Hints & UA Reduction. @miketaylr can you advise when the community will be able to review the evidence used? When will the discussion proposed be scheduled? |
Issues #314 and #315 advise that there is no impact associated with fingerprinting from the UA-CH changes. Google cited these issues in their October 2022 quarterly report to the CMA. Google also repeated them on their web pages that relate to User-Agent reduction. I therefore believe Google consider this evidence credible. The abstract of the UA-CH proposal states that a goal of the document and the associated changes is “avoiding the historical baggage and passive fingerprinting surface exposed by the venerable User-Agent header”. If true, then the evidence in #314 and #315 confirms that the proposal is not fit for purpose, as so-called "fingerprinting" is unaffected by the proposal. Further, the Information Commissioner's Office engaged Plum Consulting to review the literature associated with data protection harms. No literature associated with fingerprinting harms was identified. The report is available here. Further research is recommended. It now appears that the disruption and competition harms associated with the deployment of UA-CH and the associated User-Agent Reduction are no longer justified, as the proposal does not meet its stated goal, nor does it have any justification, as previously commented on this issue and evidenced in the ICO/Plum review. Please can Google (tagging @miketaylr, @cwilso, @yoavweiss) provide substantive commentary in the January 2023 quarterly report provided to the CMA and the industry under the commitments which I believe Google employees at W3C have now been trained in. |
The UA Client Hints proposal appears to be founded on the premise that passive fingerprinting is widespread and harmful, and thus is worth solving:
"Version numbers, platform details, model information, etc. are all broadcast along with every request, and form the basis for fingerprinting schemes of all sorts." (https://wicg.github.io/ua-client-hints/, March 17th 2021)
The proposal suggests that "Rather than broadcasting this data to everyone, all the time, user agents can make reasonable decisions about how to respond to given sites' requests for more granular data, reducing the passive fingerprinting surface area exposed to the network."
"Form the basis" is a strong statement and implies an accepted fact. What is absent from the proposal is any evidence for how widespread or harmful this practice is. The EFF's Panopticlick is often cited as evidence but this is mostly a demonstration of what's possible by combining both passive and active fingerprinting. It says nothing about how widespread the practice is. The W3C's Mitigating Browser Fingerprinting in Web Specifications document cites numerous academic studies of fingerprinting on the web but there is scarcely a mention of passive fingerprinting, nor any mention of how widespread it is.
Thus this proposal may be solving a theoretical rather than a real problem. Set against this is the fact that this proposal constitutes a significant change to web standards that have been in place for over two decades and to the ecosystem that has evolved on top of them. Furthermore, there has been significant disagreement over whether Client Hints in general constitute a worsening of user privacy on the web:
Browser makers are well placed to understand their users' needs and already have the ability to make reasonable decisions about what level of entropy to expose to websites via their User-Agent strings, as evidenced by Firefox, Brave etc.