Build Log: Personal Browsing History Graph (Hands-On)

Draft
In Progress

Prompt

  1. Web extension that tracks every page that you visit, and sends this data (in a meaningful way - preserving connections) up to a Dgraph instance
  2. [Not sure if this is necessary] A bridge that accepts data from the chrome extension and forwards it on to Dgraph
  3. [Use a Dgraph UI app if one exists] Web UI to browse your personal graph
  4. Do this on iOS as well if there’s a way to do this that isn’t annoying (VPN)

Checklist

Notes

  • The web extension could use Rust+WebAssembly
  • The DGraph docs make liberal use of name@. syntax without explaining it. After some digging:
  • I’m wondering if DGraph is possibly too heavy for this; it’s designed to be distributed (even in dev), and that’s unnecessary overhead.
  • Neo4J’s console UI looks much better than DGraph’s, and is probably going to be a lot better to learn with (if nothing else).
  • The syntax to insert things into Neo4j (at least for my needs) seems fairly straightforward. WebAssembly is the other big unknown; going to focus there now.

WebAssembly

  • The .wat text format (called wat for “WebAssembly Text”) uses S-expressions, and bears some resemblance to the Lisp family of languages like Scheme and Clojure.

  • WebAssembly has a very simple memory model. A wasm module has access to a single “linear memory”, which is essentially a flat array of bytes. This memory can be grown by a multiple of the page size (64K). It cannot be shrunk.

  • We use wasm-pack to orchestrate the following build steps:

    • Ensure that we have Rust 1.30 or newer and the wasm32-unknown-unknown target installed via rustup,
    • Compile our Rust sources into a WebAssembly .wasm binary via cargo,
    • Use wasm-bindgen to generate the JavaScript API for using our Rust-generated WebAssembly.
  • wee_alloc?

    wee_alloc: The Wasm-Enabled, Elfin Allocator.

    wee_alloc is focused on targeting WebAssembly, producing a small .wasm code size, and having a simple, correct implementation. It is geared towards code that makes a handful of initial dynamically sized allocations, and then performs its heavy lifting without any further allocations. This scenario requires some allocator to exist, but we are more than happy to trade allocation performance for small code size. In contrast, wee_alloc would be a poor choice for a scenario where allocation is a performance bottleneck.

  • https://rustwasm.github.io/wasm-bindgen/api/web_sys/
  • JavaScript’s garbage-collected heap — where Objects, Arrays, and DOM nodes are allocated — is distinct from WebAssembly’s linear memory space, where our Rust values live. WebAssembly currently has no direct access to the garbage-collected heap (as of April 2018, this is expected to change with the “Interface Types” proposal). JavaScript, on the other hand, can read and write to the WebAssembly linear memory space, but only as an ArrayBuffer of scalar values (u8, i32, f64, etc…). WebAssembly functions also take and return scalar values. These are the building blocks from which all WebAssembly and JavaScript communication is constituted.

  • wasm_bindgen defines a common understanding of how to work with compound structures across this boundary. It involves boxing Rust structures, and wrapping the pointer in a JavaScript class for usability, or indexing into a table of JavaScript objects from Rust. wasm_bindgen is very convenient, but it does not remove the need to consider our data representation, and what values and structures are passed across this boundary. Instead, think of it as a tool for implementing the interface design you choose.

  • TL;DR: Don’t pass large structures across the boundary; pass opaque handles and small result sets instead.
  • Initial Cell enum:
    #[wasm_bindgen]
    #[repr(u8)]
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum Cell {
        Dead = 0,
        Alive = 1,
    }
    
    • I don’t really understand repr(u8) yet, but it apparently uses a single byte to represent an enum value. (Couldn’t this be optimized further to a single bit per cell?)
    • wasm_bindgen makes this enum accessible from JS
    • It is important that we have #[repr(u8)], so that each cell is represented as a single byte. It is also important that the Dead variant is 0 and that the Alive variant is 1, so that we can easily count a cell’s live neighbors with addition.

  • Rust/WASM + requestAnimationFrame makes this really fast!
  • Generating (and allocating) a String in Rust and then having wasm-bindgen convert it to a valid JavaScript string makes unnecessary copies of the universe’s cells.

  • Rust-generated WebAssembly functions cannot return borrowed references.

Cypher

  • What are variables?

    A variable is restricted to a single statement. It may have different or no meaning in another statement.

  • Return all nodes?

    MATCH (n) RETURN n
    

Cayley

This appears to be a graph database that supports various backends (including SQLite!), potentially making it a lot more portable than Neo4j.

Edit