Eamonn O'Brien-Strain


I've long been passionate about issues of user privacy, so I jumped at the recent opportunity to spin up a new “Trust Experience” team in Google Search. Our team is responsible for aspects of the search user interfaces that affect user trust. (BTW, I'm hiring.)

I am excited about finally contributing to addressing what I think is the most difficult problem of privacy: how to give an average user transparency and meaningful control over their data, given that the systems that manipulate the data are huge, complex, and opaque to almost everyone.

I have a hypothesis that the key to solving this problem is to create a new kind of visualizable model that is a projection of how a user's data is being used.

Of course, our team has some mundane work to do too, including making sure the cookie consent dialogs meet all the regulatory mandates in different jurisdictions, and are not too annoying.


$$\text{width } 0.00391 \text{ centered at } -0.19854+i1.10018$$

I've now been using the Mandelbrot set images from my previous postings as my Google Meet video conferencing background for a few weeks, choosing a new one each day. It's been a fun conversation starter, and it's also indicative of what Googlers are like that a large proportion of people in my meetings know exactly what the image is.

I've now used up my first few batches of images. Time to find some more. Luckily there is an infinite number of them.

And what's more, I've been doing some experiments with hill-shading to make the images reveal more of the structure, and also playing with the color palette so that the black of the actual Mandelbrot set stands out better.

Shout out to @AnnaThieme for her article on coding up shaded relief for maps. With the help of her article I added the following to my C++ coloring code:

double hillshade(int ix, int iy) {
  // Values in the eight neighboring cells
  const double a = iterations(ix - 1, iy - 1);
  const double b = iterations(ix, iy - 1);
  const double c = iterations(ix + 1, iy - 1);
  const double d = iterations(ix - 1, iy);
  const double f = iterations(ix + 1, iy);
  const double g = iterations(ix - 1, iy + 1);
  const double h = iterations(ix, iy + 1);
  const double i = iterations(ix + 1, iy + 1);

  // Estimate the gradient of the iteration-count surface from the
  // 3x3 neighborhood (Horn's method, as used for shaded relief in GIS)
  const double dzdx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * KERNELSIZE);
  const double dzdy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * KERNELSIZE);

  // Steepness of the surface, exaggerated by the Z_FACTOR constant
  const double slope = atan(Z_FACTOR * sqrt(dzdx * dzdx + dzdy * dzdy));

  // Direction the slope faces, lit from the direction given by the
  // ZENITH and AZIMUTH constants (defined elsewhere)
  const double aspect = atan2(dzdy, -dzdx);
  double shade = ((cos(ZENITH) * cos(slope)) +
                  (sin(ZENITH) * sin(slope) * cos(AZIMUTH - aspect)));
  return shade < 0 ? 0 : shade;
}

Here are some new interesting areas of the Mandelbrot set I've found:

-0.7436438870371587 + i0.1318259042053119 width 5×10<sup>-13</sup>

video image

-0.18271806444477 + i0.66140756855431, width 1.45×10<sup>-11</sup>

video image

0.37001085813 + i0.67143543269, width 6×10<sup>-8</sup>

video image

-0.7345612674879727 + i0.3601896136089664, width 4.55×10<sup>-8</sup>

video image

0.1488658918 + i0.6424077239, width 5×10<sup>-7</sup>

image

-0.99920853376 + i0.30236435348, width 7.45×10<sup>-9</sup>

image

-0.7489856 + i0.055768, width 0.000244

image

And here are re-renderings of images from previous posts in this new style.

Click the images to see them at full resolution.

$$\text{width } 8 \text{ centered at } 0+i0$$

$$\text{width } 0.000244 \text{ centered at } -0.658448+i0.466852$$

$$\text{width } 0.000122 \text{ centered at } -0.715182+i0.2300282$$

$$\text{width } 0.000977 \text{ centered at } 0.284390+i0.013590$$

$$\text{width } 0.000977 \text{ centered at } 0.284430+i0.012732$$

$$\text{width } 4.657 \times 10^{-10} \text{ centered at } -0.13997533734+i0.992076239092$$

$$\text{width } 0.000977 \text{ centered at } -0.796186+i0.183227$$

$$\text{width } 1.19 \times 10^{-7} \text{ centered at } 0.250006+i0$$

$$\text{width } 7.45 \times 10^{-9} \text{ centered at } -1.9999117502+i0$$

$$\text{width } 0.00391 \text{ centered at } -0.19854+i1.10018$$

$$\text{width } 0.0001 \text{ centered at } -0.745263-i0.113042$$

$$\text{width } 2.5 \times 10^{-10} \text{ centered at } -0.749988802386+i0.006997251233$$

$$\text{width } 10^{-9} \text{ centered at } -1.67440967428+i0.00004716557$$

$$\text{width } 4.88 \times 10^{-13} \text{ centered at } -1.674409674652718+i0.000047165698791$$

$$\text{width } 2 \times 10^{-13} \text{ centered at } -1.674409674652720+i0.000047165698793$$

$$\text{width } 0.00000381 \text{ centered at } -1.28422516+i0.42732560$$


Generating more Mandelbrot set details and videos zooming down to them.

See the previous article for more details.

0.4464254440760-i0.4121418810742 2.328×10<sup>-10</sup> wide:

video image

-1.292628079931202-i0.352667377259980 7.28×10<sup>-12</sup> wide:

video image

0.4177429339294358+i0.2102828457812882 2.27×10<sup>-13</sup> wide:

video image

-0.8624892205832114+i0.21478927144381935 2.27×10<sup>-13</sup> wide:

video image


$$\text{width } 0.00391 \text{ centered at } -0.19854+i1.10018$$

Motivation

More than thirty years ago an exhibition in London changed how I view the world.

It was called “Schönheit im Chaos / Frontiers of Chaos”, presented by the Goethe-Institut, the German cultural organization. I still have the exhibition book by H.O. Peitgen, P. Richter, H. Jürgens, M. Prüfer, and D. Saupe.

Schönheit im Chaos

It upended what I thought I knew about mathematics. It was astounding how much complexity could emerge from extremely basic calculations.

I was particularly drawn to the uncanny beauty and complexity of the Mandelbrot set, which is the set of complex values $$c$$ for which the iteration $$z \gets z^2 + c$$ does not diverge when starting with $$z = 0$$. This is so simply stated<sup>1</sup>, and yet it generates the images below, revealing endless variations of complexity as you zoom deeper into the set.

Early Difficulties

Over the years I made several attempts to generate these images myself, on the various workstations and PCs I had available at the time. But ultimately it was frustrating because each image took so long to compute that it was painfully slow to explore to any depth in the Mandelbrot set.

The reason for the long compute times is that for each of the $$c$$ values of the million or so pixels you need to iterate $$z \gets z^2 + c$$ until it is diverging ($$\lvert z \rvert > 2$$). That can be fairly fast for pixels not in the Mandelbrot set (displayed with a color chosen according to how many iterations they took). But pixels that are in the Mandelbrot set (displayed as black) will never diverge, so you have to pick some maximum number of iterations before giving up and deeming a pixel to be non-diverging. The big problem is pixels that are outside the set but very close to it: they might require a large number of iterations before they diverge, especially at high zoom levels. To get a reasonably accurate set boundary you need the maximum-iterations threshold to be at least 1,000 at moderate zoom and 10,000 or 100,000 at higher zooms. A two-megapixel image at 1,000 iterations per pixel is already two billion iterations, so you can easily be doing billions of complex-number calculations per image.

In the end the computers I had available at the time were just too slow.

Trying Again

During the 2020 Winter Holidays I started thinking about generating Mandelbrot sets again, for the first time in more than a decade.

I realized that one thing different was that even my now several-years-old Linux ThinkPad laptop was vastly faster than anything I tried using before, especially if I could use all eight cores in parallel.

The first question was what programming language to use. To get the performance I wanted, interpreted languages like Python and JavaScript were out of the question, and I also thought that VM languages like Java or a .NET language probably would not give me the control I wanted. So that left the choices:

  1. Rust is on my to-do list of languages to learn, but it was more than I wanted to tackle over the Holidays. Also, Rust's memory-safety features, which are its big advantage, are important for large-scale programming but not so important for this rather modest application.
  2. Go would have been fine, but in my past experience its libraries for graphics and color handling are fewer and harder to use than those for C++.
  3. C++ is what I ended up choosing. Carefully written, it can be faster than anything except possibly assembly-language programming (and I was not willing to go down that rathole).

The only other performance option I considered was to investigate using GPU computation — but I decided to leave that for another time.

As a first quick prototype, preserved in the now-obsolete SDL branch of the code on GitHub, I used the SDL graphics library to throw up a window on my laptop and set pixels.

int iterations(int maxIterationCount, const complex<double> &c) {
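  // Count iterations of z <- z*z + c until it escapes (|z| > 2);
  // if it never escapes, give up after maxIterationCount iterations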
  complex<double> z = 0;
  for (int i = 0; i < maxIterationCount; ++i) {
    z = z * z + c;
    if (abs(z) > 2) {
      return i;
    }
  }
  return maxIterationCount;
}

The C++ code takes advantage of the convenience of being able to use the std::complex type, which means that the code z = z * z + c looks pretty much like the mathematical expression $$z \gets z^2 + c$$.
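For contrast, here is a quick sketch of the same escape-time loop in JavaScript (my illustration, not part of the project). JavaScript has no built-in complex type, so the arithmetic has to be decomposed into real and imaginary parts, as in footnote 1:

function iterationsJs (maxIterationCount, a, b) {
  // c = a + ib, and z = x + iy starts at 0
  let x = 0
  let y = 0
  for (let i = 0; i < maxIterationCount; ++i) {
    // z <- z^2 + c expands to x <- x^2 - y^2 + a and y <- 2xy + b
    const xNew = x * x - y * y + a
    y = 2 * x * y + b
    x = xNew
    if (x * x + y * y > 4) { // equivalent to |z| > 2: diverging
      return i
    }
  }
  return maxIterationCount
}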

The code uses double which I've found allows zooming down to widths of $$10^{-13}$$ before obvious numeric artifacts appear. A possible future extension of the project would be to use arbitrary precision math to allow deeper zooms.

I also made sure I was using all available processing power by running multiple threads in parallel, which was easy because each pixel can be calculated independently. The threads interleave which rows they are working on:

int threadCount = thread::hardware_concurrency();

void threadWorker(const Params &params, Image *img, int mod) {
  for (int iy = mod; iy < params.imgHeight; iy += threadCount)
    for (int ix = 0; ix < params.imgWidth; ++ix) {
      img->iterations(ix, iy) = iterations(params, ix, iy);
    }
  cout << "Finished thread " << mod << endl;
}

...

  vector<thread> threads;
  for (int mod = 0; mod < threadCount; ++mod) {
    threads.emplace_back(threadWorker, params, &img, mod);
  }
  for (auto &thread : threads) {
    thread.join();
  }

The good news was this experiment showed reasonable performance, generating megapixel images in seconds.

Architecture

However, this prototype had no UI (it had hardcoded parameters), so I needed to figure out how to put a UI on it. What I chose was a web app that operates as follows:

  1. The user opens a URL with the parameters encoded in the URL hash parameters (for example #0_0_8_1000 encodes x, y, width, and maximum iterations).
  2. The client JS decodes the parameters and sets the src attribute of the <img> tag to a URL that invokes the server JS.
  3. If the image file for these parameters has not already been generated, the server JS invokes the C++ binary, which generates the image file and writes it out to the server's static files directory.
  4. The server JS sends back a 302 response to the client, redirecting it to the static URL for the generated image file.
  5. The client JS updates its UI and its hash parameters (enabling browser forward and back history to work).
  6. The user chooses some parameters and clicks somewhere in the image.
  7. The client JS updates the src attribute of the <img> tag according to the updated parameters to update the URL that invokes the server JS. (Go to step 3.)
The server-side JavaScript code is very straightforward. It provides one endpoint for serving the static image files and a /image endpoint for invoking the C++ binary and redirecting to the generated image file:

const express = require('express')
const { spawn } = require('child_process')
const { existsSync } = require('fs')

const app = express()
// imgWidth and imgHeight are constants defined elsewhere
app.use(express.static('public'))

app.get('/image', (req, res) => {
  const { x, y, w, i } = req.query
  const imgPath = `/cache/mandelbrot_${x}_${y}_${w}_${i}.png`
  const imgFileName = `public${imgPath}`

  if (existsSync(imgFileName)) {
    console.log('Using existing cached ', imgFileName)
    res.redirect(imgPath)
    return
  }
  console.log('Generating ', imgFileName)
  const ls = spawn('./mandelbrot', [
    '-o', imgFileName,
    '-x', x,
    '-y', y,
    '-w', w,
    '-i', i,
    '-W', imgWidth,
    '-H', imgHeight
  ])

  ls.on('close', code => {
    console.log(`child process exited with code ${code}`)
    console.log('Generated ', imgFileName)
    res.redirect(imgPath)
  })
})

The client-side JavaScript code is as follows (the variables ending in Element refer to elements of the HTML page):

let busy = false

const setCursorMagnification = () => {
  const magnification = magnificationElement.valueAsNumber
  zoomElement.innerText = 1 << magnification
  imgElement.className = `cursor-${magnification}`
}

const setIterations = () => {
  const i = Math.pow(10, iLog10Element.valueAsNumber)
  iNewElement.innerText = i
}

const doit = () => {
  if (!window.location.hash) {
    window.location = '#0_0_8_1000'
  }
  const [x, y, w, i] = window.location.hash.substr(1).split('_')
  xElement.innerText = x
  yElement.innerText = y
  wElement.innerText = w
  iElement.innerText = i
  busy = true
  imgElement.className = 'cursor-busy'
  imgElement.setAttribute('src', `/image?x=${x}&y=${y}&w=${w}&i=${i}`)
  imgElement.onload = () => {
    setCursorMagnification()
    busy = false
  }
  imgElement.onclick = event => {
    if (busy) {
      imgElement.className = 'cursor-error'
      setTimeout(() => {
        if (busy) {
          imgElement.className = 'cursor-busy'
        }
      }, 200)
      return
    }
    const { offsetX, offsetY } = event
    const scale = w / imgWidth
    const height = scale * imgHeight
    const viewPortLeft = x - w / 2
    const viewPortTop = y - height / 2
    const newX = viewPortLeft + offsetX * scale
    const newY = viewPortTop + (imgHeight - offsetY) * scale
    const magnification = magnificationElement.valueAsNumber
    const zoom = 1 << magnification
    zoomElement.innerText = zoom
    window.location = `#${newX}_${newY}_${w / zoom}_${iNewElement.innerText}`
    doit()
  }
}

window.onload = () => {
  window.onhashchange = doit
  magnificationElement.onchange = setCursorMagnification
  iLog10Element.onchange = setIterations
  setIterations()
  doit()
}

The client JavaScript code has the plumbing to pass down the parameters from the HTML to the server via the image URL query parameters. It also has a simple mutex mechanism using the busy Boolean to prevent the page from spawning more than one concurrent image generation, and it modifies the cursor to let the user know the state.

Results

Click any image to see it in full resolution, $$1920 \times 1080$$, which is suitable for use as a video-conferencing background.

$$\text{width } 8 \text{ centered at } 0+i0$$

$$\text{width } 0.000244 \text{ centered at } -0.658448+i0.466852$$

$$\text{width } 0.000122 \text{ centered at } -0.715182+i0.2300282$$

$$\text{width } 0.000977 \text{ centered at } 0.284390+i0.013590$$

$$\text{width } 0.000977 \text{ centered at } 0.284430+i0.012732$$

$$\text{width } 4.657 \times 10^{-10} \text{ centered at } -0.13997533734+i0.992076239092$$

$$\text{width } 0.000977 \text{ centered at } -0.796186+i0.183227$$

$$\text{width } 1.19 \times 10^{-7} \text{ centered at } 0.250006+i0$$

$$\text{width } 7.45 \times 10^{-9} \text{ centered at } -1.9999117502+i0$$

$$\text{width } 0.00391 \text{ centered at } -0.19854+i1.10018$$

$$\text{width } 0.0001 \text{ centered at } -0.745263-i0.113042$$

$$\text{width } 2.5 \times 10^{-10} \text{ centered at } -0.749988802386+i0.006997251233$$

$$\text{width } 10^{-9} \text{ centered at } -1.67440967428+i0.00004716557$$

$$\text{width } 4.88 \times 10^{-13} \text{ centered at } -1.674409674652718+i0.000047165698791$$

$$\text{width } 2 \times 10^{-13} \text{ centered at } -1.674409674652720+i0.000047165698793$$

$$\text{width } 0.00000381 \text{ centered at } -1.28422516+i0.42732560$$

1 Though it is a bit more complex when the complex numbers are decomposed into their real and imaginary components: with $$z = x + i y$$ and $$c = a + i b$$, the iteration $$z \gets z^2 + c$$ expands to $$x \gets x^2 - y^2 + a$$ and $$y \gets 2 x y + b$$.


Looking at today's per-capita death rates:

  1. In the world rankings, the Latin American countries are still the worst affected, with some improvement for Mexico relative to the others.
  2. The worst seven US states are now Arizona and six Southern states, with Florida moving into fourth place.
  3. Three Arizona counties have moved into the worst four of all counties in the US, along with one Mississippi county.
  4. In California, Orange County has moved up into the top three behind Los Angeles and Imperial Counties.

States with highest per-capita Covid death rates

Some highlights from today's graphs of per-capita Covid death-rates:

  1. In the ranking of countries, Latin America is doing worst, taking up the top seven spots. The United States is not far behind, in the middle of the pack, with the United Kingdom and Sweden not doing much better. The rest of Europe is now doing pretty well, which is quite a turnaround from earlier in the pandemic.
  2. In the US, looking at the ranking by state or by county shows what has been much reported in the news: states which escaped the initial impact of the pandemic now have the highest ranking, led by Arizona and Mississippi. But some of the states initially hit hard, particularly New Jersey, are also seeing high death rates again. New York is now doing quite well, at the bottom of the ranking, but California, which had done well in the past, is now climbing worryingly up out of the middle of the pack.
  3. In the ranking of California counties, Imperial County is still by far the worst affected, but this is off a very small population base. Among the counties with large populations there is a big divide between NorCal and SoCal, with the southern counties all high up in per-capita deaths and the Bay Area counties all low down.

Oh yes, in an ideal world we would use a debugger to understand what our code is doing, but setting up a development environment to allow debugging can sometimes be tricky or annoying to do.

So often we resort to the venerable art of “printf debugging”: adding judiciously placed print statements in our code.

In JavaScript this means simply adding console.log statements.

But if you are writing code in a functional style, it can sometimes mess up your code to add print statements.

For example, imagine you have a bug in some code that looks like this ...

const dotProd = (v1, v2) =>
  zip(v1, v2).map(pair => pair[0] * pair[1]).reduce((a, x) => a + x, 0)

... and you want to inspect the value of a + x returned by the reducer.

To use console.log you would have to rewrite the reducer to this inelegant form:

const dotProd = (v1, v2) =>
  zip(v1, v2).map(pair => pair[0] * pair[1]).reduce((a, x) => {
    const result = a + x
    console.log('Reducer returns', result)
    return result
  }, 0)

This really makes it inconvenient and error-prone to add these temporary logging statements, and to remove them when you're done.

So I published the pp function as a more elegant alternative.

To use it in Node JS, install with

npm install passprint

and then import it into your JavaScript file like so:

const { pp } = require('passprint')

and then you can do

const dotProd = (v1, v2) =>
  zip(v1, v2).map(pair => pair[0] * pair[1]).reduce((a, x) => pp(a + x), 0)

Note the additional pp(...) near the end. This is much less ugly than using console.log.

This will log out lines that look something like:

|||||||||MyClass.myFunction myFile.js:46:63 22ms 78878

where 22ms is the time elapsed since logging started and 78878 is the value of a + x. The number of | characters shows the depth of the call stack.

The pp function simply returns the value passed to it, so it can be used with minimal perturbation to your code. It is easy to add and remove.

You don't need any extra disambiguating message with pp because the log messages include the function name, file name, and line number where it was invoked from.
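Under the hood, something like pp can be implemented quite simply. Here is a minimal sketch of my own of such a pass-through logger (not the actual passprint source), deriving the call site and call depth from the V8 stack trace:

const startTime = Date.now()

// A pass-through logger in the spirit of pp (a sketch, not passprint itself)
const myPp = value => {
  // Frame 0 is "Error", frame 1 is this function, frame 2 is the call site
  const stack = new Error().stack.split('\n')
  const callSite = (stack[2] || '').trim().replace(/^at /, '')
  const depth = '|'.repeat(Math.max(stack.length - 3, 0)) // call-stack depth
  console.log(`${depth}${callSite} ${Date.now() - startTime}ms`, value)
  return value // return the value unchanged
}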

More details in the README on Github.

Feel free to give feedback on Twitter or by adding a GitHub issue.


Old vs. new design for this blog

[Edit 2023-02-05 – this post is referring to a previous instantiation of this blog]

I just updated the visual design of this site to be a little more interesting than the default theme I had been using since I ported the site to Jekyll. I decided to tweak the CSS to get a neumorphic effect, which is similar to Material Design except that instead of things floating above the surface, they are indented into it or raised out of it.

It's pretty easy to get the effect with a CSS box-shadow, and thanks to @adam_giebl there's a handy online neumorphism generator to get you started.

I also decided to give this blog a name. “Yak Shavings” seems appropriate, since a lot of the technical things I write about here are the result of me happily going off on repeatedly nested tangents as I noodle on my side projects.


Bajel

I’ve been programming on Unix and Linux machines for many decades now, and one universal I’ve leaned on is the default availability of make, which has been a labor-saving Swiss Army knife for setting up my build and development flows, or for organizing any random commands that I execute repeatedly in a directory. It’s a great memory aid, reminding me when I come back later how to work with the files in that directory.

Here are some examples of Makefiles I created a decade ago:

  1. When I was doing Palm webOS development I wrote a Makefile that has an eclectic collection of commands that comprised my hacked-together Palm build system.
  2. When I was doing Java web-server development, I wrote a blog post explaining how I used a Makefile to integrate the cool continuous-integration system of the day (Hudson) with the cool Java web framework of the day (Play Framework).

The things I like about make are that:

  1. It has very little boilerplate.
  2. It captures the commands, just as you type them in the command line.
  3. It captures the dependencies between the commands.
  4. The dependencies are simply the files generated by one command that are used by another command.
  5. The files can be file patterns, so you can express how one type of file is generated from another in general terms.
  6. It uses the timestamps of input and output files of the commands to determine if a command is out of date and needs to be run.
  7. It has variables to factor out common strings.
  8. It is ubiquitous and installed by default everywhere.

But I have become increasingly dissatisfied with make. It was fine-tuned for the development environment of its day (compiling C on UNIX), and though it is remarkably versatile it still has lots of vestiges of special built-in support for that development flow. And over the decades it has accumulated a lot of complexity in how patterns and variables work.

However, I never found any replacement system that has all the positive features of make. The nearest is Bazel, which is a multi-repo public version of Google’s internal monorepo build tool (which replaced an earlier make-driven build system at Google).

So I built my own tool, Bajel.

These days, most of my side projects use the npm ecosystem, so I built Bajel there. If you have npm installed, you can execute Bajel simply by doing

npx bajel

Bajel expects that the current directory contains a build file, which in its most straightforward form is a JSON file. However, Bajel also supports build files in other syntaxes, including YAML and TOML, which make for cleaner files.
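For example, a minimal build.json with two targets might look like this (a sketch assuming a direct object mapping of the YAML syntax shown below):

{
  "serve": {
    "deps": ["dist/index.cjs"],
    "exec": "python -m SimpleHTTPServer 8888"
  },
  "clean": {
    "exec": "rm -rf dist"
  }
}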

Here is an example build.yaml file, which is probably the closest of the available syntaxes to a classic Makefile:


TEST: ava test/contract_test.js

serve:
    deps:
        - dist/index.cjs
    exec: python -m SimpleHTTPServer 8888

test:
    deps:
        - test_default
        - test_contract_production
        - test_contract_development
        - test_contract_no_env

test_default:
    deps:
        - dist/index.cjs
    exec: ava

test_contract_production:
    exec: NODE_ENV=production $(TEST)

test_contract_development:
    exec: NODE_ENV=development $(TEST)

test_contract_no_env:
    exec: NODE_ENV=  $(TEST)

"perf.csv":
    deps:
        - src/node/perf.js
        - src/common/optimizer.js
    exec: node $<

"dist/index.cjs":
    deps:
        - rollup.config.js
        - src/node/index.js
        - src/common/index.js
        - src/common/random.js
        - src/common/color.js
        - src/common/optimizer.js
        - src/common/contract.js
        - src/common/random.js
    exec: rollup --config $<

publish:
    deps:
        - dist/index.cjs
    exec: npm publish

clean:
    exec: rm -rf dist

However, YAML as a syntax has fallen out of favor in some quarters, so you might prefer the following, which is semantically identical but in the nice clean TOML format.


TEST="ava test/contract_test.js"

[serve]
deps = ["dist/index.cjs"]
exec = "python -m SimpleHTTPServer 8888"

[test]
deps = [
    "test_default",
    "test_contract_production",
    "test_contract_development",
    "test_contract_no_env",
]

[test_default]
deps = ["dist/index.cjs"]
exec = "ava"

[test_contract_production]
exec = "NODE_ENV=production $(TEST)"

[test_contract_development]
exec = "NODE_ENV=development $(TEST)"

[test_contract_no_env]
exec = "NODE_ENV=  $(TEST)"

["perf.csv"]
deps = ["src/node/perf.js", "src/common/optimizer.js"]
exec = "node $<"

["dist/index.cjs"]
deps = [
    "rollup.config.js",
    "src/node/index.js",
    "src/common/index.js",
    "src/common/random.js",
    "src/common/color.js",
    "src/common/optimizer.js",
    "src/common/contract.js",
    "src/common/random.js",
]
exec = "rollup --config $<"

[publish]
deps = ["dist/index.cjs"]
exec = "npm publish"

[clean]
exec = "rm -rf dist"

Some features to note in the above:

  • TEST="ava test/contract_test.js" is setting a variable which is referred to later in the file as $(TEST). In TOML syntax all the variables have to be defined at the top of the file before anything else.
  • [serve] is the first target, and is the one whose action is executed by default when you do npx bajel. You can execute any other target by adding it as an argument to the command line, for example npx bajel test.
  • deps = ["dist/index.cjs"] says that serve has one dependency, the target dist/index.cjs. If that file does not exist or is older than any of its dependencies, then the action for dist/index.cjs is executed before serve is executed.
  • exec = "python -m SimpleHTTPServer 8888" specifies the shell command that executes for serve. This is executed just as if you had typed python -m SimpleHTTPServer 8888 on the command line.
  • exec = "rollup --config $<" contains $< which is replaced by the first dependency, in this case rollup.config.js. Also possible are $+ which is replaced by all the dependencies (blank-separated), and $@ which is replaced by the target.

Bajel implements all the features of make that I value (including % pattern matching, not shown in this example), but keeps things as simple as possible.
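For example, a pattern rule in the TOML syntax might look something like the following sketch (assuming make-like % semantics, with the matched stem substituted into the deps; see the README for the actual details):

["%.o"]
deps = ["%.c"]
exec = "gcc -c $< -o $@"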

For advanced use, there is also a JavaScript syntax, which allows the actions to be specified as JavaScript functions, making each target be like a cell in a spreadsheet. There’s also a markdown format, to allow for literate programming of build files.

For details, see the README on Github.

If you have any feedback, feel free to add a GitHub issue, or reach out to me on Twitter.

Naming note: “Bajel” is Bazel, but with a “j” for JavaScript. It is pronounced /ba-hel/ in the Spanish fashion, as it is a Spanish word for a sailing ship such as the Santa Ana, which is pictured at the top of this post.


Smoothish

[Updated 5-May-2020 to change the examples to the new 1.0.0 API.]

I published a new npm module called smoothish that smooths out time-series data without some of the drawbacks of the usual moving-point average.

When working on the visualization of per-capita COVID-19 death rates I needed a way to smooth out the curves of some noisy and incomplete data, and I wanted the data to extend up to the most recent day.

A standard moving average did not meet those requirements, so I ended up writing my own smoothing function, which you can use like a moving average, but which does not drop the points at the beginning or (more importantly) at the end.

It works by doing, at every point, a least-squares linear fit through the nearby points. It is flexible enough to handle missing points and points at the boundaries.
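To make that concrete, here is a minimal sketch of my own of the algorithm (not the actual smoothish source): at each index it fits a weighted straight line through the defined points, with weights falling off exponentially with distance, and takes the fitted value at that index.

const smoothSketch = (data, radius = 2) =>
  data.map((_, t) => {
    // Weighted sums for the least-squares fit of y = a + b*(i - t)
    let sw = 0; let sx = 0; let sy = 0; let sxx = 0; let sxy = 0
    data.forEach((y, i) => {
      if (y === undefined || y === null) return // skip missing points
      const w = Math.exp(-Math.abs(i - t) / radius) // exponential falloff
      const x = i - t
      sw += w
      sx += w * x
      sy += w * y
      sxx += w * x * x
      sxy += w * x * y
    })
    // Solve the weighted normal equations for the intercept a,
    // which is the fitted value at index t
    const det = sw * sxx - sx * sx
    return det === 0 ? sy / sw : (sy * sxx - sx * sxy) / det
  })

Because each point gets its own fit, points at the boundaries simply get an asymmetric neighborhood rather than being dropped, and missing values contribute nothing to the fit.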

It's easy to use. In your JavaScript project install it with:

npm install smoothish

Then using it is as simple as:

const smoothish = require('smoothish')

const daysPerMonth = [
    31, 28, undefined, 30, 31, null, 31, 31, null, 31, 30, 31]

smoothish(daysPerMonth)
// --> [ 30.0, 29.4, 29.8, 30.1, 30.5, 30.6, 30.8, 30.8, 30.8, 30.7, 30.6, 30.7 ]

Note that not only does it produce a smoothed output, it also interpolates missing values.

By default the function uses a radius of 2, indicating the width of the neighborhood of points to be considered. All points are actually considered by default, but closer points have more weight, falling off exponentially with a time constant equal to the radius.

The smoothish function also has options to do a moving average, with a step-function falloff just like a normal moving average. In that case a radius of 2 would specify a five-point moving average.

The smoothish function always returns the same number of output values as in the input.