Eamonn O'Brien-Strain


Consider a person using a free Internet app. The Internet company wants to gather data about the user, while the user wants to protect their privacy. This creates a natural tension between the two parties.

Can we use game theory to model trust in a quantified way, by expressing this as a Prisoner's Dilemma?

|  | Company Cooperates | Company Defects |
| --- | --- | --- |
| Person Cooperates | company: Reward, person: Reward | company: Temptation, person: Sucker |
| Person Defects | company: Sucker, person: Temptation | company: Punishment, person: Punishment |

where $$ Temptation > Reward > Punishment > Sucker $$ We will call the case where both participants cooperate “trust”.

Game theory shows that if the game is played just once by rational players then both will defect.

However, if the game is played multiple times by the same players, then mutual cooperation, i.e. trust, can be a stable outcome, but only on the condition that $$ Reward > \frac{ Temptation + Sucker }{ 2 } $$ (otherwise the two players could do better by taking turns exploiting each other than by always cooperating). One way to use this model for the relationship between a person and an Internet company is as follows:

  • The person
    • is trying to maximize privacy
    • cooperates or defects by either consenting to or rejecting the sharing of non-essential data. This could be, for example, by logging in or staying logged out, or by accepting or rejecting cookies in a consent banner.
  • The company
    • is trying to maximize the amount of personal user data it can process
    • cooperates by practicing data minimization or defects by using excessive data

Without loss of generality, let's set the Sucker value to 0. This is the least amount of privacy for the user (if they consent to data sharing but the company uses excessive data), and the least amount of data for the company (if they practice data minimization but the user rejects data sharing).

Let's set the Punishment to 1. This is the value of privacy to the person and data to the company when the person rejects and the company uses excessive data.

Let's set the Reward to 2. This is the value of privacy to the person and data to the company when the person consents and the company practices data minimization.

Let's set the Temptation to 3. This is the most amount of privacy for the user (if they reject data sharing and the company practices data minimization), and the most amount of data for the company (if they use excessive data and the user consents to data sharing).

|  | Company practices data minimization | Company uses excessive data |
| --- | --- | --- |
| Person consents | Data=2, Privacy=2 | Data=3, Privacy=0 |
| Person rejects | Data=0, Privacy=3 | Data=1, Privacy=1 |

The above is an example of a payoff table that could result in trust (mutual cooperation) over the long term.
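
As a quick check, the example values above satisfy both the required ordering, $$ 3 > 2 > 1 > 0 $$ and the repeated-game condition, $$ Reward = 2 > \frac{ Temptation + Sucker }{ 2 } = \frac{ 3 + 0 }{ 2 } = 1.5 $$ so long-term trust is achievable with these payoffs.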

In general (assuming Sucker is zero), the condition for trust is: $$ Temptation < 2 Reward $$ So for trust to be possible:

  • When the company practices data minimization, for the person the privacy value of rejecting data sharing must be less than twice the value of consenting to data sharing.
  • When the user consents to data sharing, for the company the value of using excessive data must be less than twice the value of practicing data minimization.

So the lesson for companies is that for long-term trust there must be the following bounds on their use of data:

  • with a consenting user, derive more than half of the maximum value of the data even while practicing data minimization
  • minimize data use enough that the user gets more than half the maximum privacy value even when they consent to data sharing
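
To make these bounds concrete, here is a minimal C++ sketch (purely illustrative; the Payoffs struct and the function names are my own, not part of any existing system) that checks whether a payoff table is a Prisoner's Dilemma and whether repeated play can sustain trust:

#include <iostream>

// Illustrative payoff table; the field names mirror the terms used above.
struct Payoffs {
  double temptation, reward, punishment, sucker;
};

// True if the payoffs have the Prisoner's Dilemma ordering.
bool isPrisonersDilemma(const Payoffs &p) {
  return p.temptation > p.reward && p.reward > p.punishment &&
         p.punishment > p.sucker;
}

// True if mutual cooperation ("trust") can be stable in the repeated game:
// steady cooperation must beat taking turns at exploitation.
bool trustIsPossible(const Payoffs &p) {
  return p.reward > (p.temptation + p.sucker) / 2;
}

int main() {
  const Payoffs example{3, 2, 1, 0};  // Temptation, Reward, Punishment, Sucker
  std::cout << std::boolalpha << isPrisonersDilemma(example) << " "
            << trustIsPossible(example) << std::endl;  // prints: true true
}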

I remember it being transformational when, about 10 years ago, I upped my command-line game considerably by discovering that I could search through my shell history with Ctrl+R.

Now thanks to Atuin from @ellie@hachyderm.io I think there may be another quantum jump in my command-line productivity.

The same Ctrl+R now brings up a UI that looks like this: Screenshot of Atuin UI

It also has a live GitHub-like activity chart, which should update live as I continue to use the command line with Atuin enabled: a chart showing my command-line activity

Unsurprisingly, I learned that my most common command is git status.

I just installed it on my Linux laptop. I'll try installing it on a Chromebook too, and maybe on the Cloud server that runs this blog.

==================

Edit 2023-02-20 I also succeeded in installing and syncing on two more machines: a Google Cloud compute server and a Chromebook. The steps for these subsequent machines were as follows:

First, on the original machine, run

atuin key

Keep this window open as you will need to copy-paste it into the login phase below.

On the new machine:

bash <(curl https://raw.githubusercontent.com/ellie/atuin/main/install.sh)
atuin login
atuin import auto
atuin sync

The above worked fine on the Google Cloud compute server. However on the Chromebook I had to run

sudo apt install build-essential

to install the compiler.

Also I had to run the

bash <(curl https://raw.githubusercontent.com/ellie/atuin/main/install.sh)

at least twice, because the install script could not find any ready-built binaries and had to install some Rust infrastructure to build them.


When thinking about your digital privacy it is important to consider what threat model you are protecting against. You have to consider how important each threat is to you, and what you are willing to do to protect against it.

Privacy Threat Models

You can break down the data privacy threats into three layers: device, network, and servers.

And for each of these threats, you can consider

  1. What is the worst-case harm if your data was revealed to an unwanted party?
    • Mild annoyance (for example at ads you don't want)
    • General disquiet at being surveilled
    • Embarrassment, for example, someone finding you watching pornography
    • Being fired, for example in retaliation for labor organizing or whistle-blowing
    • Being prosecuted, for example by an authoritarian regime or by a US state enforcing regressive abortion laws
    • Being physically harmed or killed, for example, if you are the subject of intimate partner violence (domestic violence)
  2. What is the probability of that worst-case harm happening?
    • Are you a member of some vulnerable minority that has enemies?
    • Does your work mean you have highly valuable confidential information?
  3. Who are the stewards of your data? These are the people or institutions that have access to your data in the normal course of business.
  4. Who might be the unwanted parties who would cause you harm if they see your personal data?
  5. What mitigation can you do to reduce the probability of harm?
    • For device protection, you can use incognito or private browsing
    • For network protection, you can use a paid VPN (avoid most free VPNs, they actually reduce your privacy)
    • For server protection, use the privacy settings of each service to increase your privacy, and if you are in Europe reject consent for anything except essential cookies
  6. What is the cost of doing that mitigation, whether direct cost or reduction in usefulness of the service you are using?

As a summary, here is a framework for thinking about privacy threat models:

|  | Device | Network | Servers |
| --- | --- | --- | --- |
| Data under threat | on your physical device, or accessible in the cloud via your account | in transit | stored or logged in the cloud |
| Stewards of your data | Apple, Samsung, Google, Mozilla, Firefox, app developers, ... | coffee shop, airport, employer, Verizon, AT&T, Comcast, Akamai, Cloudflare, ... | Google, Facebook, Amazon, TikTok, ... |
| Who might take your data | family members, police, ... | employer, prosecutors, government security services, ... | prosecutors, government security services, hackers, ... |
| Good mitigation | incognito mode, private browser, don't log in | paid VPN, Tor | don't log in, reject cookies, modify privacy settings |
| Cost of mitigation | lose convenience of personalization | monetary cost, reduced speed | lose convenience of personalization |

I'm writing this post in #WriteFreely, a blogging platform that allows blogs to be followed and posts to be boosted on Mastodon.

Formatting

Mastodon only handles plain text, so the examples below probably won't appear properly there.

WriteFreely allows for formatting such as

  1. italics
  2. bold
  3. lists like this
    • including
    • nested
    • lists
  4. headers like “Formatting” above
  5. console.log("inline code segments")

You can also block-quote text like this.

console.log("Or you can add code blocks")
console.log("like this.")

Linking

You can also embed images

SFO

and links to arbitrary pages.

Length

WriteFreely posts can be arbitrarily long, as compared to Mastodon's limit of 500 characters.


  • Update: This is what the above part of the post looks like on Mastodon:

Screenshot of the above post before expansion

Note that the post is truncated so it is no longer than a typical Mastodon post. When you click “Read more” it expands out to the following.

Screenshot of the above post after expansion

As expected, formatting like italics and bold is omitted, but the biggest loss is that embedded images are not shown.


Mastodon is great, but sometimes you want to write something long-form without having to split it up into 500-character chunks. And sometimes you want a modicum of control over the formatting.

Now there is a new potential for a rebirth of the venerable blog form to complement Mastodon, while still being integrated with it in the larger Fediverse of federated servers.

This blog is an example. On Mastodon you can follow it by searching for

eob@eamonn.org

and clicking the Follow button. Then all future posts here will appear in your Mastodon feed.

(I'm still in the process of converting over posts from the previous version of this blog, so you may see a bunch of metadata cluttering the beginning of the older posts.)


I've long been passionate about issues of user privacy, so I jumped at the recent opportunity to spin up a new “Trust Experience” team in Google Search. Our team is responsible for aspects of the search user interfaces that affect user trust. (BTW, I'm hiring.)

I am excited about finally contributing to addressing what I think is the most difficult problem of privacy: how to give an average user transparency and meaningful control of their data, given that the systems that manipulate the data are huge, complex, and opaque to almost everyone.

I have a hypothesis that the key to solving this problem is to create a new kind of visualizable model that is a projection of how a user's data is being used.

Of course, our team has some mundane work to do too, including making sure the cookie consent dialogs meet all the regulatory mandates in different jurisdictions, and are not too annoying.


$$\text{width } 0.00391 \text{ centered at } -0.19854+i1.10018$$

I've now been using the Mandelbrot set images from my previous postings as my Google Meet video conferencing background for a few weeks, choosing a new one each day. It's been a fun conversation starter, and it's also indicative of what Googlers are like that a large proportion of people in my meetings know exactly what the image is.

I've now used up my first few batches of images. Time to find some more. Luckily there is an infinite number of them.

And what's more, I've been doing some experiments with hill-shading to make the images reveal more of the structure, and also playing with the color palette so that the black of the actual Mandelbrot set stands out better.

Shout out to @AnnaThieme for her article on coding up shaded relief for maps. With the help of her article I added the following to my C++ coloring code:

double hillshade(int ix, int iy) {
  // Values in the eight neighboring cells
  const double a = iterations(ix - 1, iy - 1);
  const double b = iterations(ix, iy - 1);
  const double c = iterations(ix + 1, iy - 1);
  const double d = iterations(ix - 1, iy);
  const double f = iterations(ix + 1, iy);
  const double g = iterations(ix - 1, iy + 1);
  const double h = iterations(ix, iy + 1);
  const double i = iterations(ix + 1, iy + 1);

  const double dzdx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * KERNELSIZE);
  const double dzdy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * KERNELSIZE);

  const double slope = atan(Z_FACTOR * sqrt(dzdx * dzdx + dzdy * dzdy));

  const double aspect = atan2(dzdy, -dzdx);
  double shade = ((cos(ZENITH) * cos(slope)) +
                  (sin(ZENITH) * sin(slope) * cos(AZIMUTH - aspect)));
  return shade < 0 ? 0 : shade;
}
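
The shade value can then be used to modulate each pixel's color. As a minimal sketch of the idea (illustrative only, not necessarily how the final images below were colored), the RGB channels of the base palette color can simply be scaled by the shade:

struct RGB {
  double r, g, b;  // each channel in [0, 1]
};

// Darken a base palette color by the hillshade value
// (0 = fully shadowed, 1 = fully lit).
RGB applyShade(const RGB &base, double shade) {
  return {base.r * shade, base.g * shade, base.b * shade};
}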

Here are some new interesting areas of the Mandelbrot set I've found:

-0.7436438870371587 + i0.1318259042053119, width 5×10⁻¹³

video image

-0.18271806444477 + i0.66140756855431, width 1.45×10⁻¹¹

video image

0.37001085813 + i0.67143543269, width 6×10⁻⁸

video image

-0.7345612674879727 + i0.3601896136089664, width 4.55×10⁻⁸

video image

0.1488658918 + i0.6424077239, width 5×10⁻⁷

image

-0.99920853376 + i0.30236435348, width 7.45×10⁻⁹

image

-0.7489856 + i0.055768, width 0.000244

image

And here are re-renderings of images from previous posts in this new style

Click the images to see at full resolution.

$$\text{width } 8 \text{ centered at } 0+i0$$

$$\text{width } 0.000244 \text{ centered at } -0.658448+i0.466852$$

$$\text{width } 0.000122 \text{ centered at } -0.715182+i0.2300282$$

$$\text{width } 0.000977 \text{ centered at } 0.284390+i0.013590$$

$$\text{width } 0.000977 \text{ centered at } 0.284430+i0.012732$$

$$\text{width } 4.657 \times 10^{-10} \text{ centered at } -0.13997533734+i0.992076239092$$

$$\text{width } 0.000977 \text{ centered at } -0.796186+i0.183227$$

$$\text{width } 1.19 \times 10^{-7} \text{ centered at } 0.250006+i0$$

$$\text{width } 7.45 \times 10^{-9} \text{ centered at } -1.9999117502+i0$$

$$\text{width } 0.00391 \text{ centered at } -0.19854+i1.10018$$

$$\text{width } 0.0001 \text{ centered at } -0.745263-i0.113042$$

$$\text{width } 2.5 \times 10^{-10} \text{ centered at } -0.749988802386+i0.006997251233$$

$$\text{width } 10^{-9} \text{ centered at } -1.67440967428+i0.00004716557$$

$$\text{width } 4.88 \times 10^{-13} \text{ centered at } -1.674409674652718+i0.000047165698791$$

$$\text{width } 2 \times 10^{-13} \text{ centered at } -1.674409674652720+i0.000047165698793$$

$$\text{width } 0.00000381 \text{ centered at } -1.28422516+i0.42732560$$

0.4464254440760-i0.4121418810742 2.328×10⁻¹⁰ wide

-1.292628079931202-i0.352667377259980 7.28×10⁻¹² wide

0.4177429339294358+i0.2102828457812882 2.27×10⁻¹³ wide

-0.8624892205832114+i0.21478927144381935 2.27×10⁻¹³ wide


0.4177429339294358+i0.2102828457812882 2.27×10⁻¹³ wide

Generating more Mandelbrot set details and videos zooming down to them.

See previous article for more details on this.

0.4464254440760-i0.4121418810742 2.328×10⁻¹⁰ wide:

video image

-1.292628079931202-i0.352667377259980 7.28×10⁻¹² wide:

video image

0.4177429339294358+i0.2102828457812882 2.27×10⁻¹³ wide:

video image

-0.8624892205832114+i0.21478927144381935 2.27×10⁻¹³ wide:

video image


$$\text{width } 0.00391 \text{ centered at } -0.19854+i1.10018$$

Motivation

More than thirty years ago an exhibition in London changed how I view the world.

It was called “Schönheit im Chaos / Frontiers of Chaos”, presented by the Goethe-Institut, the German cultural organization. I still have the exhibition book by H.O. Peitgen, P. Richter, H. Jürgens, M. Prüfer, and D. Saupe.

Schönheit im Chaos

It upended what I thought I knew about mathematics. It was astounding how much complexity could emerge from extremely basic calculations.

I was particularly drawn to the uncanny beauty and complexity of the Mandelbrot set, which is the set of complex values $$c$$ for which the iteration $$z \gets z^2 + c$$ does not diverge when starting with $$z = 0$$. This is so simply stated¹, and yet generates the images below, revealing endless variations of complexity as you zoom deeper into the set.

Early Difficulties

Over the years I made several attempts to generate these images myself, on the various workstations and PCs I had available at the time. But ultimately it was frustrating because each image took so long to compute that it was painfully slow to explore to any depth in the Mandelbrot set.

The reason for the long compute times is that for each of the $$c$$ values of a million or so pixels you need to iterate $$z \gets z^2 + c$$ until it diverges ($$\lvert z \rvert > 2$$). That can be fairly fast for pixels not in the Mandelbrot set (displayed with a color chosen according to how many iterations it took). But pixels that are in the Mandelbrot set (displayed as black) will never diverge, so you have to pick some maximum number of iterations before giving up and deeming a pixel to be non-diverging. The big problem is pixels that are outside the set but very close to it: they might require a large number of iterations before they diverge, especially at high zoom levels. To get a reasonably accurate set boundary you need the maximum-iterations threshold to be at least 1000 at moderate zoom and 10,000 or 100,000 at higher zooms. Thus you can easily be doing billions of complex-number calculations per image.
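
For example, a rough worst-case estimate (my own back-of-the-envelope figure) for a single full-HD image with a 10,000-iteration cap: $$ 1920 \times 1080 \approx 2.1 \times 10^{6} \text{ pixels} \times 10^{4} \text{ iterations} \approx 2 \times 10^{10} \text{ iterations per image} $$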

In the end the computers I had available at the time were just too slow.

Trying Again

During the 2020 Winter Holidays I started thinking about generating Mandelbrot sets again, for the first time in more than a decade.

I realized that one thing different was that even my now several-years-old Linux ThinkPad laptop was vastly faster than anything I tried using before, especially if I could use all eight cores in parallel.

The first question was what programming language to use. To get the performance I wanted, interpreted languages like Python and JavaScript were out of the question, and I also thought that VM languages like Java or a .NET language probably would not give me the control I wanted. So that left these choices:

  1. Rust is on my to-do list of languages to learn, but it was more than I wanted to tackle over the Holidays. Also, the memory-safety features, which are the big advantage of Rust, while important for large-scale programming, are not so important for this rather modest application.
  2. Go would have been fine, but in my past experience the libraries for graphics and color handling are fewer and harder to use than those for C++.
  3. C++ is what I ended up choosing. Carefully written, it can be faster than anything except possibly assembly-language programming (and I was not willing to go down that rathole).

The only other performance option I considered was to investigate using GPU computation — but I decided to leave that for another time.

As a first quick prototype, preserved in the now-obsolete SDL branch of the code on GitHub, I used the SDL graphics library to throw up a window on my laptop and set pixels:

int iterations(int maxIterationCount, const complex<double> &c) {
  complex<double> z = 0;
  for (int i = 0; i < maxIterationCount; ++i) {
    z = z * z + c;
    if (abs(z) > 2) {
      return i;
    }
  }
  return maxIterationCount;
}

The C++ code takes advantage of the std::complex type, which means that the code z = z * z + c looks pretty much like the mathematical expression $$z \gets z^2 + c$$.
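
As a quick sanity check of the function above (my own worked example): for $$c = 1$$ the iterates are $$1, 2, 5$$, so $$\lvert z \rvert$$ first exceeds 2 on the third pass and iterations returns 2; for $$c = 0$$, $$z$$ stays at 0, so the loop runs to the end, the function returns maxIterationCount, and the pixel is colored black.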

The code uses double, which I've found allows zooming down to widths of $$10^{-13}$$ before obvious numeric artifacts appear. A possible future extension of the project would be to use arbitrary-precision math to allow deeper zooms.
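
A rough back-of-the-envelope explanation of that limit (my estimate, assuming a 1920-pixel-wide image): at an image width of $$10^{-13}$$ the spacing between adjacent pixels is $$ \frac{10^{-13}}{1920} \approx 5 \times 10^{-17} $$ which is below the double-precision machine epsilon ($$\approx 2.2 \times 10^{-16}$$) for coordinates of magnitude around 1, so neighboring pixels start to round to the same value of $$c$$.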

I also made sure I was using all available processing power by running multiple threads in parallel, which was easy because each pixel can be calculated independently. The threads interleave which rows they are working on:

int threadCount = thread::hardware_concurrency();

void threadWorker(const Params &params, Image *img, int mod) {
  for (int iy = mod; iy < params.imgHeight; iy += threadCount)
    for (int ix = 0; ix < params.imgWidth; ++ix) {
      img->iterations(ix, iy) = iterations(params, ix, iy);
    }
  cout << "Finished thread " << mod << endl;
}

...

  vector<thread> threads;
  for (int mod = 0; mod < threadCount; ++mod) {
    threads.emplace_back(threadWorker, params, &img, mod);
  }
  for (auto &thread : threads) {
    thread.join();
  }

The good news was this experiment showed reasonable performance, generating megapixel images in seconds.

Architecture

However, this prototype had no UI (it had hardcoded parameters), so I needed to figure out how to put a UI on it. What I chose was a web app that operates as follows:

  1. The user opens a URL with the parameters encoded in the URL hash parameters (for example, #0_0_8_1000 is the default full view: centered at 0+i0, width 8, 1000 max iterations).
  2. The client JS decodes the parameters and sets the src attribute of the <img> tag to a URL that invokes the server JS.
  3. If the image file for these parameters has not already been generated, the server JS invokes the C++ binary, which generates the image file and writes it out to the server's static files directory.
  4. The server JS sends back a 302 response to the client, redirecting it to the static URL for the generated image file.
  5. The client JS updates its UI, and its hash parameters (enabling browser forward and back history to work).
  6. The user chooses some parameters and clicks somewhere in the image.
  7. The client JS updates the src attribute of the <img> tag according to the new parameters, which again invokes the server JS (go to step 3).
app.use(express.static('public'))

app.get('/image', (req, res) => {
  const { x, y, w, i } = req.query
  const imgPath = `/cache/mandelbrot_${x}_${y}_${w}_${i}.png`
  const imgFileName = `public${imgPath}`

  if (existsSync(imgFileName)) {
    console.log('Using existing cached ', imgFileName)
    res.redirect(imgPath)
    return
  }
  console.log('Generating ', imgFileName)
  const ls = spawn('./mandelbrot', [
    '-o', imgFileName,
    '-x', x,
    '-y', y,
    '-w', w,
    '-i', i,
    '-W', imgWidth,
    '-H', imgHeight
  ])

  ls.on('close', code => {
    console.log(`child process exited with code ${code}`)
    console.log('Generated ', imgFileName)
    res.redirect(imgPath)
  })
})

The server-side JavaScript code above is very straightforward. It provides one endpoint for serving the static image files and a /image endpoint for invoking the C++ binary and redirecting to the generated image file.

let busy = false

const setCursorMagnification = () => {
  const magnification = magnificationElement.valueAsNumber
  zoomElement.innerText = 1 << magnification
  imgElement.className = `cursor-${magnification}`
}

const setIterations = () => {
  const i = Math.pow(10, iLog10Element.valueAsNumber)
  iNewElement.innerText = i
}

const doit = () => {
  if (!window.location.hash) {
    window.location = '#0_0_8_1000'
  }
  const [x, y, w, i] = window.location.hash.substr(1).split('_')
  xElement.innerText = x
  yElement.innerText = y
  wElement.innerText = w
  iElement.innerText = i
  busy = true
  imgElement.className = 'cursor-busy'
  imgElement.setAttribute('src', `/image?x=${x}&y=${y}&w=${w}&i=${i}`)
  imgElement.onload = () => {
    setCursorMagnification()
    busy = false
  }
  imgElement.onclick = event => {
    if (busy) {
      imgElement.className = 'cursor-error'
      setTimeout(() => {
        if (busy) {
          imgElement.className = 'cursor-busy'
        }
      }, 200)
      return
    }
    const { offsetX, offsetY } = event
    const scale = w / imgWidth
    const height = scale * imgHeight
    const viewPortLeft = x - w / 2
    const viewPortTop = y - height / 2
    const newX = viewPortLeft + offsetX * scale
    const newY = viewPortTop + (imgHeight - offsetY) * scale
    const magnification = magnificationElement.valueAsNumber
    const zoom = 1 << magnification
    zoomElement.innerText = zoom
    window.location = `#${newX}_${newY}_${w / zoom}_${iNewElement.innerText}`
    doit()
  }
}

window.onload = () => {
  window.onhashchange = doit
  magnificationElement.onchange = setCursorMagnification
  iLog10Element.onchange = setIterations
  setIterations()
  doit()
}

The client JavaScript code has the plumbing to pass down the parameters from the HTML to the server via the image URL query parameters. It also has a simple mutex mechanism, using the busy Boolean, to prevent the page from spawning more than one concurrent image generation, and to modify the cursor to let the user know the state.

Results

Click any image to see in full resolution, $$1920 \times 1080$$, which is suitable for use as a video-conferencing background.

$$\text{width } 8 \text{ centered at } 0+i0$$

$$\text{width } 0.000244 \text{ centered at } -0.658448+i0.466852$$

$$\text{width } 0.000122 \text{ centered at } -0.715182+i0.2300282$$

$$\text{width } 0.000977 \text{ centered at } 0.284390+i0.013590$$

$$\text{width } 0.000977 \text{ centered at } 0.284430+i0.012732$$

$$\text{width } 4.657 \times 10^{-10} \text{ centered at } -0.13997533734+i0.992076239092$$

$$\text{width } 0.000977 \text{ centered at } -0.796186+i0.183227$$

$$\text{width } 1.19 \times 10^{-7} \text{ centered at } 0.250006+i0$$

$$\text{width } 7.45 \times 10^{-9} \text{ centered at } -1.9999117502+i0$$

$$\text{width } 0.00391 \text{ centered at } -0.19854+i1.10018$$

$$\text{width } 0.0001 \text{ centered at } -0.745263-i0.113042$$

$$\text{width } 2.5 \times 10^{-10} \text{ centered at } -0.749988802386+i0.006997251233$$

$$\text{width } 10^{-9} \text{ centered at } -1.67440967428+i0.00004716557$$

$$\text{width } 4.88 \times 10^{-13} \text{ centered at } -1.674409674652718+i0.000047165698791$$

$$\text{width } 2 \times 10^{-13} \text{ centered at } -1.674409674652720+i0.000047165698793$$

$$\text{width } 0.00000381 \text{ centered at } -1.28422516+i0.42732560$$

¹ Though it is a bit more complex when the complex numbers are decomposed into their real and imaginary components: $$ z \gets z^2 + c $$ where $$ z = x + i y $$ $$ c = a + i b $$ expands to $$ x \gets x^2 - y^2 + a $$ $$ y \gets 2 x y + b $$
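
In code this expansion needs temporaries, since both assignments must use the previous values of $$x$$ and $$y$$. A minimal sketch (for illustration; the actual program uses std::complex as shown above):

// One Mandelbrot iteration step with the complex arithmetic written out.
// Both updates must read the old x and y, hence the temporaries.
void iterate(double &x, double &y, double a, double b) {
  const double xNew = x * x - y * y + a;
  const double yNew = 2 * x * y + b;
  x = xNew;
  y = yNew;
}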


Looking at today's per-capita death rates:

  1. In the world rankings, the Latin American countries are still the worst affected, with some improvement for Mexico relative to the others.
  2. The worst seven US states are now Arizona and six Southern states, with Florida moving into fourth place.
  3. Three Arizona counties have moved into the worst four of all counties in the US, along with one Mississippi county.
  4. In California, Orange County has moved up into the top three behind Los Angeles and Imperial Counties.