How I implemented wc in the browser in 3 days

Building wc in the browser

From time to time I like to run wc -l on my source code to see how much code I wrote.
For those not in the know: wc -l shows the number of lines in files.
Actually, what I have to do is more like find -name "*.go" | xargs wc -l because wc isn’t particularly good at handling directories.
I just want to see the number of lines in all my source files, man. I don’t want to google the syntax of find and xargs for the hundredth time.
After learning about the File System API I decided to write a tool that does just that as a web app. No need to install software.
I did just that and you can use it yourself.
Here’s how it sees itself:
The rest of this article describes how I would have done it if I did it.

Building software quickly

It only took me 3 days, which is a testament to how productive the web platform can be.
My weapons of choice are:
For a small project Svelte and Tailwind CSS are arguably overkill. I used them because I standardized on that toolset. Standardization allows me to re-use prior experience and sometimes even code.

Why those technologies?

Svelte is React without the bloat. Try it and you’ll love it.
Tailwind CSS is CSS but more productive. You have to try it to believe it.
JSDoc is a happy medium between no types at all and TypeScript. I have great internal resistance to switching to TypeScript. Maybe 5 years from now.
And none of that would be possible without browser APIs that allow access to files on your computer. Which Firefox doesn’t implement because they are happy to lose market share to browsers that implement useful features. Clearly $3 million a year is not enough to buy yourself a CEO with an understanding of the obvious.

Implementation tidbits

Getting list of files

To get a recursive listing of files in a directory use showDirectoryPicker to get a FileSystemDirectoryHandle. Call dirHandle.values() to iterate over the directory entries (it returns an async iterator). Recurse if an entry is a directory.
Not all browsers support that API. To detect if it works:
/**
 * @returns {boolean}
 */
export function isIFrame() {
  let isIFrame = false;
  try {
    // in iframe, those are different
    isIFrame = window.self !== window.top;
  } catch {
    // do nothing
  }
  return isIFrame;
}

/**
 * @returns {boolean}
 */
export function supportsFileSystem() {
  return "showDirectoryPicker" in window && !isIFrame();
}
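Putting the check to use might look like the sketch below. pickDirectory is a hypothetical helper, not something from the wc source. It takes the window object as a parameter only so it can be exercised outside a browser, and it assumes the picker rejects with an AbortError when the user dismisses the dialog.

```javascript
/**
 * Hypothetical helper: ask the user to pick a directory.
 * @param {any} win window-like object, defaults to the global object
 * @returns {Promise<any|null>} directory handle, or null if the API is
 *   missing or the user cancelled
 */
async function pickDirectory(win = globalThis) {
  if (!("showDirectoryPicker" in win)) {
    // the browser doesn't implement the API
    return null;
  }
  try {
    return await win.showDirectoryPicker();
  } catch (e) {
    // the user closed the picker without choosing anything
    if (e && e.name === "AbortError") {
      return null;
    }
    throw e;
  }
}
```

Call it from a click handler: the picker only opens in response to a user gesture.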
Because people on Hacker News always complain about slow, bloated software I took pains to make my code fast. One of those pains was using an array instead of an object to represent a file system entry.
Wait, now HN people will complain that I’m optimizing prematurely.
Listen buddy, Steve Wozniak wrote assembly in hex and he liked it. In comparison, optimizing memory layout of most frequently used object in JavaScript is like drinking champagne on Jeff Bezos’ yacht.
Here’s a JavaScript trick for optimizing the memory layout of objects with a fixed number of fields: derive your class from Array.

Deriving a class from an Array

A little-known thing about JavaScript is that an Array is just an object and you can derive your class from it and add methods, getters and setters.
You get a compact layout of an array and convenience of accessors.
Here’s a sketch of how I implemented the FsEntry object:
// a directory tree. each entry is a fixed-size array:
// [handle, parentHandle, size, path, dirEntries, meta]
// handle is a file or directory handle
// dirEntries holds the children when the entry is a directory
// meta is an extra slot for the caller to stick additional data
// without the need to re-allocate the array
// if you need more than 1, use an object

// handle (file or dir), parentHandle (dir), size, path, dirEntries, meta
const handleIdx = 0;
const parentHandleIdx = 1;
const sizeIdx = 2;
const pathIdx = 3;
const dirEntriesIdx = 4;
const metaIdx = 5;

export class FsEntry extends Array {
  get size() {
    return this[sizeIdx];
  }

  // ... rest of the accessors
}
We have 6 slots in the array and we can access them as e.g. entry[sizeIdx]. We can hide this implementation detail behind a getter like the size getter shown above, so callers just write entry.size.
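For illustration, here is a fuller sketch of the same idea. The constructor signature and the accessor names beyond size are my assumptions, not the actual wc source. The Symbol.species override is there because built-in methods like map() would otherwise try to construct their result through our 3-argument constructor.

```javascript
const handleIdx = 0;
const parentHandleIdx = 1;
const sizeIdx = 2;
const pathIdx = 3;
const dirEntriesIdx = 4;
const metaIdx = 5;

class FsEntry extends Array {
  // map(), filter() etc. build their result via the species constructor;
  // returning plain Array keeps them away from our custom constructor
  static get [Symbol.species]() {
    return Array;
  }

  /**
   * @param {any} handle file or directory handle
   * @param {any} parentHandle handle of the parent directory, or null
   * @param {string} path
   */
  constructor(handle, parentHandle, path) {
    // fill all 6 slots up front so the array never has to grow
    super(handle, parentHandle, 0, path, null, null);
  }

  get handle() { return this[handleIdx]; }
  get parentHandle() { return this[parentHandleIdx]; }
  get name() { return this[handleIdx].name; }
  get isDir() { return this[handleIdx].kind === "directory"; }

  get size() { return this[sizeIdx]; }
  set size(n) { this[sizeIdx] = n; }

  get path() { return this[pathIdx]; }
  set path(p) { this[pathIdx] = p; }

  get dirEntries() { return this[dirEntriesIdx]; }
  set dirEntries(entries) { this[dirEntriesIdx] = entries; }

  get meta() { return this[metaIdx]; }
  set meta(m) { this[metaIdx] = m; }
}
```

To the caller an entry looks like a regular object (entry.size, entry.path), while Array.isArray(entry) is still true.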

Reading a directory recursively

Once you get a FileSystemDirectoryHandle by using window.showDirectoryPicker() you can read the content of the directory.
Here’s one way to implement a recursive read of a directory:
/**
 * @param {FileSystemDirectoryHandle} dirHandle
 * @param {Function} skipEntryFn
 * @param {string} dir
 * @returns {Promise<FsEntry>}
 */
export async function readDirRecur(
  dirHandle,
  skipEntryFn = dontSkip,
  dir = dirHandle.name
) {
  /** @type {FsEntry[]} */
  let entries = [];
  // @ts-ignore
  for await (const handle of dirHandle.values()) {
    if (skipEntryFn(handle, dir)) {
      continue;
    }
    const path = dir == "" ? handle.name : `${dir}/${handle.name}`;
    if (handle.kind === "file") {
      let e = await FsEntry.fromHandle(handle, dirHandle, path);
      entries.push(e);
    } else if (handle.kind === "directory") {
      let e = await readDirRecur(handle, skipEntryFn, path);
      e.path = path;
      entries.push(e);
    }
  }
  let res = new FsEntry(dirHandle, null, dir);
  res.dirEntries = entries;
  return res;
}
Function skipEntryFn is called for every entry and allows the caller to decide to not include a given entry. You can, for example, skip a directory like .git.
It can also be used to show progress of reading the directory to the user, as it happens asynchronously.
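A skip function that does both jobs could look like this. It's a minimal sketch: makeSkipFn and the onProgress callback are hypothetical names, and the directory list is just an example.

```javascript
// directories we never want to descend into
const skippedDirs = new Set([".git", "node_modules"]);

/**
 * Hypothetical factory for a skipEntryFn that also reports progress.
 * @param {(nSeen: number, dir: string) => void} onProgress
 * @returns {(handle: {kind: string, name: string}, dir: string) => boolean}
 */
function makeSkipFn(onProgress) {
  let nSeen = 0;
  return (handle, dir) => {
    // called once per directory entry, so it doubles as a progress hook
    nSeen++;
    onProgress(nSeen, dir);
    return handle.kind === "directory" && skippedDirs.has(handle.name);
  };
}
```

You would pass the result as the second argument, e.g. readDirRecur(dirHandle, makeSkipFn(updateStatus)), where updateStatus is whatever updates your UI.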

Showing the files

I use tables and I’m not ashamed.
It’s still the best technology to display, well, a table of values where cells are sized to content and columns are aligned.
Flexbox doesn’t remember anything across rows so it can’t align columns.
Grid can lay out things properly but I haven’t found a way to easily highlight the whole row when the mouse is over it. With CSS you can only target individual cells in a grid, not rows.
With table I just style <tr class="hover:bg-gray-100">. That’s Tailwind speak for: on mouse hover set background color to light gray.
A folder can contain other folders so we need recursive components to implement it. Svelte supports that with <svelte:self>.
I implemented it as a tree view where you can expand folders to see their content.
It’s one big table for everything but I needed to indent each expanded folder to make it look like a tree.
It was a bit tricky. I went with an indent property in my Folder component. It starts at 0 and goes +1 for each level of nesting.
Then I style the first file name column as <td class="ind-{indent}">...</td> and use those CSS styles:
<style>
  :global(.ind-1) {
    padding-left: 0.5rem;
  }
  :global(.ind-2) {
    padding-left: 1rem;
  }
  /* ... up to .ind-17 */
</style>
Except it goes to .ind-17. Yes, if you have deeper nesting, it won’t show correctly. I’ll wait for a bug report before increasing it further.
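An alternative that has no depth limit is to compute the padding instead of enumerating classes. This is not what wc does, just a sketch; indentStyle is a hypothetical helper and the 0.5rem step is taken from the CSS above.

```javascript
/**
 * @param {number} indent nesting level, 0 for the top level
 * @param {number} stepRem padding added per level, in rem
 * @returns {string} value for a style attribute
 */
function indentStyle(indent, stepRem = 0.5) {
  return `padding-left: ${indent * stepRem}rem`;
}
```

In a Svelte template that would be used as <td style={indentStyle(indent)}>...</td>.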

Calculating line count

You can get the size of a file from the File object returned by FileSystemFileHandle.getFile().
For source code I want to see number of lines. It’s quite trivial to calculate:
/**
 * @param {Blob} f
 * @returns {Promise<number>}
 */
export async function lineCount(f) {
  if (f.size === 0) {
    // empty files have no lines
    return 0;
  }
  let ab = await f.arrayBuffer();
  let a = new Uint8Array(ab);
  let nLines = 0;
  // if last character is not newline, we must add +1 to line count
  let toAdd = 0;
  for (let b of a) {
    // line endings are:
    // CR (13) LF (10) : windows
    // LF (10) : unix
    // CR (13) : mac
    // mac is very rare so we just count 10 (LF), which covers
    // both windows and unix line endings
    if (b === 10) {
      toAdd = 0;
      nLines++;
    } else {
      toAdd = 1;
    }
  }
  return nLines + toAdd;
}
It doesn’t handle Mac files that use CR for newlines. It’s ok to write buggy code as long as you document it.
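If CR-only files ever matter, a variant along these lines would cover them. It's a sketch, not the wc implementation: countLinesAnyEol is a hypothetical name and it works on the bytes directly (e.g. new Uint8Array(await f.arrayBuffer())). A CR counts as a line ending unless it's immediately followed by LF, in which case the LF does the counting.

```javascript
/**
 * Count lines, treating CR LF, LF and a lone CR as line endings.
 * @param {Uint8Array} bytes
 * @returns {number}
 */
function countLinesAnyEol(bytes) {
  const LF = 10;
  const CR = 13;
  let nLines = 0;
  // true for empty input, so an empty file has 0 lines
  let endsWithEol = true;
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    if (b === LF) {
      nLines++;
      endsWithEol = true;
    } else if (b === CR) {
      // for CR LF, let the LF that follows do the counting
      if (bytes[i + 1] !== LF) {
        nLines++;
      }
      endsWithEol = true;
    } else {
      endsWithEol = false;
    }
  }
  // a last line without a line ending still counts
  return endsWithEol ? nLines : nLines + 1;
}
```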
I also skip known binary files (.png, .exe etc.) and known “not mine” directories like .git and node_modules.
Small considerations like that matter.
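The binary check can be as simple as looking at the file extension. A minimal sketch, where isBinaryName is a hypothetical name and the extension list is only an example (the article mentions just .png and .exe):

```javascript
// example list, extend as needed
const binaryExts = new Set([".png", ".jpg", ".gif", ".ico", ".exe", ".dll", ".zip", ".pdf"]);

/**
 * @param {string} name file name
 * @returns {string} lower-cased extension including the dot, or ""
 */
function getExt(name) {
  const i = name.lastIndexOf(".");
  // i <= 0 means no dot at all, or a dotfile like ".gitignore"
  return i <= 0 ? "" : name.slice(i).toLowerCase();
}

/**
 * @param {string} name file name
 * @returns {boolean} true if the extension is on the binary list
 */
function isBinaryName(name) {
  return binaryExts.has(getExt(name));
}
```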

Remembering opened directories

I typically use it many times on the same directories and it’s a pain to pick the same directory over and over again.
A FileSystemDirectoryHandle can be stored in IndexedDB, so I implemented a history of opened directories as a store persisted in IndexedDB.
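Storing and loading handles with the raw IndexedDB API can be sketched like this. The database name, store name and function names are made up for illustration, and the idb parameter exists only so the code can be tried with a stand-in outside a browser. Handles are keyed by directory name here, which is a simplification: two directories with the same name would overwrite each other.

```javascript
const DB_NAME = "wc-history"; // hypothetical database name
const STORE = "dirs"; // hypothetical object store name

/** Open the database, creating the object store on first use. */
function openDb(idb = globalThis.indexedDB) {
  return new Promise((resolve, reject) => {
    const req = idb.open(DB_NAME, 1);
    req.onupgradeneeded = () => req.result.createObjectStore(STORE);
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

/** Remember a directory handle, keyed by its name. */
async function saveDirHandle(dirHandle, idb) {
  const db = await openDb(idb);
  return new Promise((resolve, reject) => {
    const tx = db.transaction(STORE, "readwrite");
    tx.objectStore(STORE).put(dirHandle, dirHandle.name);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}

/** Load all remembered directory handles. */
async function loadDirHandles(idb) {
  const db = await openDb(idb);
  return new Promise((resolve, reject) => {
    const req = db.transaction(STORE, "readonly").objectStore(STORE).getAll();
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}
```

Handles loaded this way still need their permissions re-checked before use.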

Asking for permissions

When it comes to accessing files and directories on disk you can’t ask for forgiveness, you have to ask for permission.
The user grants permissions in window.showDirectoryPicker() and the browser remembers them for a while, but they expire quite quickly.
You need to re-check and re-ask for permission to FileSystemFileHandle and FileSystemDirectoryHandle:
export async function verifyHandlePermission(fileHandle, readWrite) {
  const options = {};
  if (readWrite) {
    options.mode = "readwrite";
  }
  // Check if permission was already granted. If so, return true.
  if ((await fileHandle.queryPermission(options)) === "granted") {
    return true;
  }
  // Request permission. If the user grants permission, return true.
  if ((await fileHandle.requestPermission(options)) === "granted") {
    return true;
  }
  // The user didn't grant permission, so return false.
  return false;
}
If permissions did not expire, it’s a no-op. If they did, the browser will show a dialog asking for permissions.
If you ask for write permissions, Chrome will show 2 confirmation dialogs vs. 1 for read-only access.
I start with read-only access and, if needed, ask again to get write (or delete) permission.
Important note: requestPermission() only works in response to a user-initiated operation (the spec calls this transient user activation). That is separate from the secure context requirement, which means the page must be served from a trusted origin, e.g. over HTTPS or from localhost.
An example of a user-initiated operation is clicking a button. Code executed in an onclick handler runs with user activation.
For wc it’s not an issue because every filesystem operation is initiated by the user via a click.
If that’s not the case you’ll have to write gnarly code to force UI interaction from the user when permissions are not granted. For example: you show a dialog with a button the user needs to click to re-acquire permissions to the file or folder and then re-do the operation.
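The shape of that gnarly code might be something like the sketch below. withPermission and promptUser are hypothetical: promptUser stands for whatever UI you have that shows a dialog with a button, runs the given callback inside the click handler and resolves with the callback's result. That way requestPermission() is called during a user gesture.

```javascript
/**
 * Run an operation on a handle, re-acquiring read permission if needed.
 * @param {any} handle FileSystemFileHandle or FileSystemDirectoryHandle
 * @param {(h: any) => Promise<any>} operation what to do once access is granted
 * @param {(onClick: () => Promise<string>) => Promise<string>} promptUser
 *   hypothetical UI hook, see the description above
 */
async function withPermission(handle, operation, promptUser) {
  const options = { mode: "read" };
  let state = await handle.queryPermission(options);
  if (state !== "granted") {
    // permission expired: requestPermission must run inside a click handler
    state = await promptUser(() => handle.requestPermission(options));
  }
  if (state !== "granted") {
    throw new Error(`no permission to access ${handle.name}`);
  }
  return operation(handle);
}
```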

Deleting files and directories

Deleting files has nothing to do with showing line counts but it was easy to implement and useful, so I added it.
You need to remember FileSystemDirectoryHandle for the parent directory.
To delete a file: parentDirHandle.removeEntry("foo.txt")
To delete a directory: parentDirHandle.removeEntry("node_modules", {recursive: true})
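Both cases can be folded into one small function. A sketch, with deleteEntry being a hypothetical name; it assumes write permission on the parent directory was already granted, e.g. via verifyHandlePermission(parentDirHandle, true).

```javascript
/**
 * @param {any} parentDirHandle FileSystemDirectoryHandle of the parent directory
 * @param {{kind: string, name: string}} handle handle of the entry to delete
 * @returns {Promise<void>}
 */
async function deleteEntry(parentDirHandle, handle) {
  // a non-empty directory can only be removed with recursive: true
  const recursive = handle.kind === "directory";
  await parentDirHandle.removeEntry(handle.name, { recursive });
}
```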

Getting bit by a multi-threading bug

JavaScript doesn’t have multiple threads so you can’t have all those nasty multi-threading bugs? Right? Right?
Yes and no.
Async is not multi-threading but it does create non-obvious execution flows.
I had a bug: I noticed that some .txt files were showing line count of 0 even though they clearly did have lines.
I went bug hunting.
I checked the lineCount function. Seems ok.
I added console.log(), I stepped through the code. Time went by and my frustration level was reaching DEFCON 1.
Thankfully before I reached cocked pistol I had an epiphany.
You see, JavaScript has async where some code can interleave with some other code. The browser can splice those async “threads” with UI code.
No threads means there are no data races, i.e. writing memory values that another thread is in the middle of reading.
But we do have non-obvious execution flows.
Here’s how my code worked:
Async is great for users: calculating line counts could take a long time as we need to read all those files.
If this process wasn’t async it would block the UI.
Thanks to async there are enough checkpoints for the browser to process UI events in between processing files.
The issue was that the function to calculate line counts was using an array I got from reading a directory.
I passed the same array to Folder component to show the files. And I sorted the array to show files in human friendly order.
In JavaScript sorting mutates an array and that array was partially processed by line counting function.
If the series of events was unfortunate enough, I would skip some files in line counting. They would be re-sorted to a position that line counting thought it had already counted.
Result: no lines for you!
A happy ending and an easy fix: Folder makes a copy of the array so sorting doesn’t affect the line counting process.
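Here is a stripped-down reproduction of that kind of bug. It's not the wc code: visitAll stands in for the line counting loop and tick() stands in for reading a file.

```javascript
// yields to the event loop, like awaiting a file read would
const tick = () => new Promise((resolve) => setTimeout(resolve, 0));

/** Visit files one by one, yielding after each, and record the order. */
async function visitAll(files) {
  const visited = [];
  for (let i = 0; i < files.length; i++) {
    visited.push(files[i]);
    await tick();
  }
  return visited;
}

async function demo() {
  // bug: sort the same array in place while visitAll is in flight
  const shared = ["c.txt", "a.txt", "b.txt"];
  const p1 = visitAll(shared); // visits "c.txt", then yields
  shared.sort(); // now ["a.txt", "b.txt", "c.txt"]
  const buggy = await p1; // ["c.txt", "b.txt", "c.txt"], "a.txt" is skipped

  // fix: sort a copy, leave the original alone
  const original = ["c.txt", "a.txt", "b.txt"];
  const p2 = visitAll(original);
  const sorted = [...original].sort();
  const fixed = await p2; // ["c.txt", "a.txt", "b.txt"], nothing skipped

  return { buggy, fixed, sorted };
}
```

No data race in the multi-threading sense, yet a.txt is never visited and c.txt is visited twice.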

The future

No software is ever finished but I arrived at a point where it does the majority of the job I wanted so I shipped it.
There is a feature I would find useful: statistics for each extension.
How many lines in .go files vs. .js files etc.?
But I’m holding off implementing it until:
You can look at the source code. It’s source visible but not open source.
svelte programming
Mar 21 2023