The privacy threats of browser extension fingerprinting

I have already published two Chrome extensions in the Chrome Web Store, and I am currently working on my third for a European research project on disinformation on behalf of HU University of Applied Sciences Utrecht. I therefore may have a vested interest in presenting browser extensions in a positive light, because that helps me “sell” my work.
Sadly, reality does not care that much about my interests or my feelings, because a few weeks ago I read a 2024 paper which showed that the Chrome Web Store contains a large number of extensions that may not be safe. One of the papers cited in that paper is the subject of this week’s paper summary, which is about browser extension fingerprinting. It was published in 2020, so it is not very new, but it does do quite a good job of explaining how browser fingerprinting works.
The set of browser extensions that a user has installed can be used to uniquely identify them and track them across the Web, especially when combined with other browser or system information. This paper provides an overview of techniques that make browser extension fingerprinting possible and shows the extent of the privacy threat posed to users.
Browser extensions allow third-party developers to add new features and improve the browsing experience for users, for example by removing unwanted content such as advertisements and cookie pop-ups. Features are typically implemented by modifying web pages and executing scripts within the context of the page.
Web browsers provide no way for web pages to list the extensions that a user has installed in their browser, but their presence can still be detected indirectly. Extensions expose elements (e.g. icons) or exhibit behaviour that is observable. If an extension’s elements or behaviour are unique among all extensions, a web page may be able to uniquely identify (i.e. fingerprint) it. An attacker who is able to lure a user to a specially crafted page may learn a lot about them!
This paper describes in detail four techniques that can be used to generate such fingerprints.
Extensions declare so-called web-accessible resources (WARs) in their manifest file. Examples of WARs include an extension’s images, JavaScript, and CSS files. A web page can probe these resources to detect if the corresponding extension is present in a user’s web browser.
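To make this concrete, here is a minimal TypeScript sketch of WAR probing. It assumes Chrome's `chrome-extension://<extension-id>/<path>` URL scheme; the signature table and the `WarSignatures`, `Probe`, and `detectByWar` names are my own illustration and do not come from the paper. The probe is passed in as a parameter so that the logic can be tried outside a browser.

```typescript
// Hypothetical signature table: extension ID -> path of one web-accessible resource.
type WarSignatures = Record<string, string>;

// The minimal shape the sketch needs from a fetch-like function (an assumption).
type Probe = (url: string) => Promise<{ ok: boolean }>;

// Returns the IDs of all extensions whose resource could be loaded.
async function detectByWar(signatures: WarSignatures, probe: Probe): Promise<string[]> {
  const detected: string[] = [];
  for (const extensionId of Object.keys(signatures)) {
    const url = `chrome-extension://${extensionId}/${signatures[extensionId]}`;
    try {
      const response = await probe(url);
      if (response.ok) detected.push(extensionId);
    } catch {
      // A failed request is interpreted as "extension not installed".
    }
  }
  return detected;
}
```

In a browser, the probe would presumably be something like `(url) => fetch(url)`; whether a given resource can actually be loaded depends on how the extension declares it in its manifest.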
Creating behaviour-based signatures requires slightly more work. For example, some extensions may modify the DOM by adding new nodes, removing nodes, or changing existing nodes. To generate behavioural fingerprints, the researchers crafted a honeysite and built a system that automatically visits it three times with a single extension installed. Any changes to the honeysite’s DOM will be the result of the extension and not some other external factor, allowing an attacker to isolate the changes that are made by a particular extension.
This is of course easier said than done. Changes may be identical between page visits or vary slightly between visits, in which case the system must attempt to infer which parts of the identified pattern are static. This might not always work well. Moreover, multiple extensions may have very similar signatures. The system attempts to mitigate these issues by allowing room for some degree of mismatches and by combining multiple techniques.
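The idea of separating static from varying changes can be sketched as follows. This is a strong simplification and not the paper's actual algorithm: I assume each DOM change has already been reduced to a string, keep only the changes seen in every visit, and accept a match when a configurable share of the signature is observed. The names `DomChange`, `staticSignature`, `matchesSignature`, and the 0.8 default threshold are all made up for this example.

```typescript
// A DOM change reduced to a string, e.g. "added:div.price-badge" (hypothetical encoding).
type DomChange = string;

// Keep only the changes that occur in every visit: the "static" part of the pattern.
function staticSignature(visits: DomChange[][]): DomChange[] {
  const first = visits[0];
  if (first === undefined) return [];
  return first.filter(
    (change, index) =>
      first.indexOf(change) === index && // drop duplicates within the first visit
      visits.every((visit) => visit.indexOf(change) !== -1)
  );
}

// Tolerate some mismatches: match if enough of the signature was observed.
function matchesSignature(
  signature: DomChange[],
  observed: DomChange[],
  minOverlap: number = 0.8 // assumed threshold, not a value from the paper
): boolean {
  if (signature.length === 0) return false;
  const hits = signature.filter((change) => observed.indexOf(change) !== -1).length;
  return hits / signature.length >= minOverlap;
}
```

An empty signature never matches here, because an extension that leaves no stable trace cannot be told apart from no extension at all.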
Intra-communication-based enumeration takes advantage of the fact that an extension’s content scripts and a web page’s scripts can only communicate with each other by exchanging messages. By listening to all messages, a web page can eavesdrop on an extension and identify it based on the messages it sends.
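A rough sketch of how such eavesdropping could work is shown below. I assume that a signature is simply the set of keys that always appear in an extension's message payload; the extension names, the `MessageSignatures` type, and the `identifyByMessage` function are hypothetical. The matching logic is kept free of browser APIs so it can be run anywhere.

```typescript
// Hypothetical signatures: extension name -> keys that always appear in its messages.
type MessageSignatures = Record<string, string[]>;

// Returns the names of all extensions whose signature keys are present in the message.
function identifyByMessage(data: unknown, signatures: MessageSignatures): string[] {
  if (typeof data !== "object" || data === null) return [];
  const payload = data as Record<string, unknown>;
  return Object.keys(signatures).filter((name) => {
    const keys = signatures[name];
    return keys !== undefined && keys.length > 0 && keys.every((key) => key in payload);
  });
}

// In a page script this would be wired up roughly like so:
//   window.addEventListener("message", (event) => {
//     console.log(identifyByMessage(event.data, signatures));
//   });
```

Real messages are likely to need more careful matching (values, ordering, timing), so treat the key-based check as the simplest possible stand-in.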
Finally, there is inter-communication-based enumeration. Extensions can also fetch resources from the Internet. Using the Resource Timing API, a web page can query the list of all resources that were requested while it was loaded, including those requested by an extension. Fingerprints can (again) be generated by visiting the honeysite three times to detect whether an extension always fetches the same or different resources.
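The sketch below illustrates the matching step under some assumptions of my own: the page knows which resources it requested itself, and a signature is a URL prefix that an extension always fetches from. The `ResourceEntry` interface only mirrors the `name` field (the URL) of the entries returned by the Resource Timing API; the function names and example hosts are hypothetical.

```typescript
// Only the field we need from a resource timing entry: `name` holds the requested URL.
interface ResourceEntry {
  name: string;
}

// Everything the page did not request itself is potentially caused by an extension.
function unexpectedRequests(entries: ResourceEntry[], ownResources: string[]): string[] {
  return entries
    .map((entry) => entry.name)
    .filter((url) => ownResources.indexOf(url) === -1);
}

// Hypothetical signatures: extension name -> URL prefix it always fetches from.
function identifyByRequests(urls: string[], signatures: Record<string, string>): string[] {
  return Object.keys(signatures).filter((name) => {
    const prefix = signatures[name];
    return prefix !== undefined && urls.some((url) => url.indexOf(prefix) === 0);
  });
}

// In a page script the entries would come from:
//   performance.getEntriesByType("resource")
```

Only requests that actually show up in the page's own resource list can be observed this way.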
There are systems that aim to prevent extension enumeration by substituting attributes of injected DOM nodes with randomised values. This renders some of these techniques ineffective – but not all.
Extension enumeration is not only useful for identifying users but can also be used to infer sensitive information about a person, such as religion, ethnicity, political affiliation, or health issues.
The Chrome Web Store provides two publicly available sources of information. The extensions’ descriptions reveal information about their intended target audiences. For example, the description of my Jirleader extension contains the term “JIRA”, which means that its users most likely work with JIRA. Extension reviews are also a valuable source, as they are associated with the names of their reviewers.
The feasibility of inference attacks is demonstrated by first classifying 20,409 extensions into categories such as Computing, Social Networks, Jobs & Education, Health, and so on based on their descriptions.
Analysis of fingerprintable extensions shows this technique can identify 43.33% of extensions with a prevailing ethnicity. For example, while the FlashSaleTricks and Download Master extensions are both in English, most of their reviewers’ names are respectively Indian and Russian.
Similar insights can be gained by looking at the languages that are supported by extensions. 82.88% of extensions only support a single language, with more than a fifth of those supporting a language other than English. A considerable number support four or fewer languages. Overall, 25.66% of extensions can be used to reveal the language of a user.
When looking at reviewers’ genders, 1,448 extensions appear to have a user base that is dominated by one gender, which is often enough to determine the gender of a user who has installed such an extension.
Matching extension descriptions with publicly available wordlists of religious and medical terms allowed the researchers to identify 58 extensions related to religion and 62 that are health related (49 reveal health conditions). A similar technique was used to discover 307 extensions that provide insights into a user’s political inclination.
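Wordlist matching of this kind needs very little code. The sketch below is my own simplification (plain lowercase token matching, single-word terms only) and the example description and wordlist are invented; the paper does not specify that it works exactly like this.

```typescript
// Returns the wordlist terms that occur as whole words in the description.
function matchedTerms(description: string, wordlist: string[]): string[] {
  const tokens = description
    .toLowerCase()
    .split(/[^a-z0-9]+/) // split on anything that is not a letter or digit
    .filter((token) => token.length > 0);
  return wordlist.filter((term) => tokens.indexOf(term.toLowerCase()) !== -1);
}

// Invented example: a description and a tiny list of medical terms.
const exampleTerms = matchedTerms(
  "Track your diabetes and blood pressure from the toolbar",
  ["diabetes", "asthma"]
);
// exampleTerms is ["diabetes"]
```

Matching whole tokens rather than substrings avoids obvious false positives, but a real analysis would still need manual review of the hits.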
How feasible is it to perform de-anonymisation attacks, i.e. identify users solely based on their set of extensions? Surprisingly, the researchers found that even when only two extensions are detected, an attacker has a 77.5% chance of uniquely identifying a user within a set of almost 84,000 users. With four detected extensions, that probability increases to 94.5%!
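To get a feeling for what "uniquely identifying" means here, the following sketch computes the share of users whose detected extension set is shared with nobody else. It is a toy illustration with a made-up function name (`uniqueFraction`) and made-up data; it does not reproduce the paper's methodology or its numbers.

```typescript
// Fraction of users whose set of detected extensions is unique among all users.
function uniqueFraction(users: string[][]): number {
  if (users.length === 0) return 0;
  // Sort each set so that the order of detection does not matter.
  const keys = users.map((extensions) => extensions.slice().sort().join("|"));
  const counts = new Map<string, number>();
  for (const key of keys) {
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return keys.filter((key) => counts.get(key) === 1).length / users.length;
}

// Toy data: the first two users share the same set, the other two are unique.
const toyFraction = uniqueFraction([["a", "b"], ["b", "a"], ["a", "c"], ["d"]]);
// toyFraction is 0.5
```

The more extensions are detected per user, the less likely it becomes that two users end up with the same key, which matches the trend the researchers report.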
- Browser extensions may modify the DOM, load local or remote resources, or send messages
- Attackers can use this information to fingerprint (identify) browser extensions
- A browser extension’s description (and by extension the extension itself) can reveal sensitive information about its users