D3 for Author Visualization

1 What is D3?

D3 (Data-Driven Documents) is a JavaScript library that lets you manipulate the DOM (or draw directly to a canvas) based on a set of source data.

D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation. – https://d3js.org/

1.1 What can we do with D3?

Since D3 just provides the ability to easily manipulate the DOM (including SVG objects), its capabilities match those of HTML, CSS and SVG themselves - transitions, fancy text paths, 2D/3D transformations and graphical filters are all easily possible (without having to learn new technology).

For the prototype app made using D3, take a look at the demonstration examples here:

http://author-routeviz.baggale.yt/

(on-corporate network users, bless their souls, will need to look at http://author-routeviz.s3-website.eu-west-2.amazonaws.com/index.html#fe3b8f84-1fc5-463c-9fc6-c26a5fdcf543 instead)

The app parses live data from production Author and renders an interactive network graph in less than 600 lines of code (including the HTML and CSS).

1.2 The general flow of a D3 app

The core of the D3 framework is relatively minimalistic, consisting of a few different concepts from which more complex data visualisation flows can be built.

The overall flow of a D3 application is as follows:

1. Get your source data (e.g. an array of JavaScript objects).
2. Form a D3 selection.
3. Bind the data to the selection.
4. Create and/or delete elements (using the .enter() and .exit() methods) from the selection to synchronise it with the data.
5. Manipulate the created elements according to their bound data.
☕️

Figure 1: A lovingly drawn diagram showing the basic flow from data to presentation in D3

The code snippet below shows the link lines being created as part of the SVG diagram for the prototype routing visualisation app. Data is first bound with .data(), then SVG lines are created with .append("line") to match it: one line is made per entry in the links array.

Once created, the lines are styled according to their bound data: their CSS classes are set using .classed(), whose second parameter can be a function that receives the bound data object. This is a general pattern in D3: methods that operate on the elements of a selection (e.g. .attr() to set HTML attributes) take either a direct value to set or a callback which is passed the bound data for each element.

  // Create linkage lines for each element in links data
  const linkLines = globalGroup
    .selectAll("line")
    .data(links)
    .enter()
    .append("line")
    .classed("routing", (d) => d.type === ROUTING_RULE)
    .classed("routingElse", (d) => d.type === ROUTING_RULE_ELSE);
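To make the enter/exit step of the flow more concrete, here is a minimal plain-JavaScript model of a data join. It is a sketch only, not D3's implementation: joinByIndex is a hypothetical helper, and it assumes elements and data are matched by index (which is how .data() behaves when no key function is given).

```javascript
// Simplified model of a D3 data join (not D3's real implementation).
// Elements and data are matched by index, as .data() does when it is
// called without a key function.
function joinByIndex(existingElements, data) {
  const shared = Math.min(existingElements.length, data.length);
  return {
    // Elements that already have a datum: the "update" selection
    update: existingElements.slice(0, shared),
    // Data with no element yet: .enter() yields placeholders for these
    enter: data.slice(shared),
    // Elements with no datum left: .exit() yields these for removal
    exit: existingElements.slice(shared),
  };
}

// First render: no lines exist yet, so every link lands in "enter"
const firstRender = joinByIndex([], [{ type: "a" }, { type: "b" }, { type: "c" }]);

// Later render with fewer links: the surplus lines land in "exit"
const laterRender = joinByIndex(["line0", "line1", "line2"], [{ type: "a" }]);
```

In the snippet above, the selection is empty on first render, so all of the links fall into the enter group and each gets a line appended.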

1.3 Data structure

There’s no set way that data must be structured in order to use the core D3 functionality - it’s down to you how you want to interpret the data bound to your selection. D3 provides layouts, however, which do expect input in certain formats (specific to each layout type).

2 Layouts

D3 provides a variety of built-in functions which allow users to produce common data visualisation layouts with minimal effort.

See https://www.d3indepth.com/layouts/ for more detail & discussion of the different built-in layouts available and how to use them.

2.1 Force layout

For rendering (non-hierarchical) network graphs, the force layout is generally the layout of choice. It performs a physics simulation by applying forces to elements for a short period of time, allowing nodes in the graph to dynamically organise themselves into a suitable pattern (rather than having to specify an algorithm in advance).

The simulation can be seen in progress during the first few seconds of viewing the graph in the prototype app - see e.g. http://author-routeviz.baggale.yt/#53ec4df5-f1c8-41b2-a8b0-24772a06c0b0.

Figure 2: Three of the forces acting on nodes in the graph of the demo app

Here’s the code which creates the force layout in the prototype app:

  const simulation = d3
    .forceSimulation(data)
    .force("charge", d3.forceManyBody().strength(-10))
    .force("collide", d3.forceCollide(60))
    .force("center", d3.forceCenter(width / 2, height / 2).strength(0.75))
    .force(
      "y-axis-ordering",
      d3
        .forceY()
        .y((_, i) => i * 55)
        .strength(6)
    )
    .force("linkages", d3.forceLink().links(links).distance(100))
    .on("tick", handleTick);

The first two forces (charge and collide) push the question page nodes apart and stop them from overlapping, which improves readability and spacing in the graph.

The third center force serves to pull the entire network into the middle of the available SVG visible area (to prevent it from floating away from the viewport).

The y-axis-ordering force is applied so that earlier questions are encouraged to organise themselves before later questions vertically in the graph (using each element’s index; lower indices are attracted to lower Y values in the SVG chart).

The linkages force is applied in order to bind question nodes together according to the routing rules present in the questionnaire (described in d3-friendly format in the links array). These linkages tie question page nodes together and are the way in which Author question routing pathways are identified visually on the graph.
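As a rough illustration of how a positioning force such as y-axis-ordering moves a node, the sketch below models a single node over repeated ticks. It is a simplified model under stated assumptions, not D3's actual code: tickY is a hypothetical helper, alpha is held constant rather than decaying, the damping factor is assumed to mirror D3's default, and the strength of 0.1 is illustrative rather than the value used in the app.

```javascript
// Simplified model of one simulation tick for a single node under a
// forceY-style positioning force. The real simulation also decays alpha
// over time and sums several forces; here alpha is held constant.
const VELOCITY_DECAY = 0.6; // assumption: fraction of velocity kept per tick

function tickY(node, targetY, strength, alpha) {
  // The force nudges the velocity towards the target position...
  node.vy += (targetY - node.y) * strength * alpha;
  // ...then the velocity is damped and applied to the position.
  node.vy *= VELOCITY_DECAY;
  node.y += node.vy;
  return node;
}

// The node at index 2 is pulled towards y = 2 * 55, as in the app's .y() callback
const modelNode = { y: 0, vy: 0 };
for (let i = 0; i < 300; i++) {
  tickY(modelNode, 2 * 55, 0.1, 1);
}
```

After enough ticks the node settles close to its target y value; in the real graph the other forces pull against this, so the ordering is encouraged rather than enforced.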

2.2 Handling updates on simulation tick

When the forceSimulation function is called with the data array, D3 mutates the data in-place on each tick of the simulation, giving each entry (amongst other things) x and y attributes reflecting their new simulated position.

To make use of this, we provide a handleTick function passed via the .on("tick") method of the force simulation object which updates the positions within our SVG chart for each node:

  // handleTick: void -> void; called per tick of physics simulation
  const handleTick = () => {
    linkLines
      .each((d) => {
        const [targetX, targetY] = vectorShorten(
          d.target.x - d.source.x,
          d.target.y - d.source.y,
          radius + arrowHeadLength
        );
        d.line = { targetX, targetY };
      })
      .attr("x1", (d) => d.source.x)
      .attr("x2", (d) => d.line.targetX + d.source.x)
      .attr("y1", (d) => d.source.y)
      .attr("y2", (d) => d.line.targetY + d.source.y);

    nodes.attr("transform", (d) => `translate(${d.x}, ${d.y})`);
  };

This is mostly a straightforward updating of positions with some vector math thrown in to reduce the length of the lines by the length of the arrowhead markers used within the app (otherwise the arrow heads extend into the circles for the question nodes).

NB₁ - The x1 and x2 are SVG attributes specifying the start and end points of the lines respectively (ditto for y1 & y2).

NB₂ - The “nodes” are the question page nodes, which are implemented with SVG groups (as they consist of a circle and text which need to move together); for this reason we have to use a 2D transform to move them instead of specifying positional attributes.
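The vectorShorten helper called in handleTick is not shown in this document. The sketch below is one plausible implementation, assuming it takes the vector's x and y components plus a length to remove and returns the shortened components as a pair; the prototype's actual version may differ.

```javascript
// Hypothetical implementation: shorten the vector (dx, dy) by `amount`
// while keeping its direction. Returns the new [dx, dy] components.
function vectorShorten(dx, dy, amount) {
  const length = Math.hypot(dx, dy);
  // If the vector is no longer than the cut, collapse it to zero length
  if (length <= amount) return [0, 0];
  const scale = (length - amount) / length;
  return [dx * scale, dy * scale];
}

// A vector of length 50 (a 3-4-5 triangle scaled by 10), shortened by 10
const [shortX, shortY] = vectorShorten(30, 40, 10);
```

Because the result is relative to the line's start point, handleTick adds d.source.x and d.source.y back on when setting x2 and y2.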

3 Data representation

D3’s force layout places no constraints on the data representing the nodes in the network graph - these can be specified however we like. In the prototype app, the data for the question page nodes is an array of Javascript objects specifying question page alias, title, etc. These attributes can thereafter be read by the presentation code and used to e.g. create text elements with the alias (as used within the circles for each question page node).

The force layout does require that the links you specify are an array of objects containing source and target attributes which refer to indices in the data array:

[
    {source: <sourceindex>, target: <targetindex>},
]
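When the link force is initialised, D3 swaps each numeric source and target for a reference to the corresponding node object, which is why handleTick can read d.source.x directly. The sketch below models that step in plain JavaScript; resolveLinks is a hypothetical helper and the aliases are made-up sample data. Unlike D3, which updates the link objects in place, it returns copies.

```javascript
// Model of what the link force does on initialisation: numeric
// source/target indices become references to the node objects.
// (D3 mutates the links in place; this sketch returns new objects.)
function resolveLinks(nodeData, linkData) {
  return linkData.map((link) => ({
    ...link,
    source: nodeData[link.source],
    target: nodeData[link.target],
  }));
}

// Made-up sample data for illustration
const sampleNodes = [{ alias: "Q1" }, { alias: "Q2" }, { alias: "Q3" }];
const sampleLinks = [
  { source: 0, target: 1 },
  { source: 0, target: 2 },
];
const resolvedLinks = resolveLinks(sampleNodes, sampleLinks);
```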

We will require code to translate Author JSON into two separate arrays: one data array containing relevant information for each visible node in the graph (e.g. question pages) and one links array describing which indices of the data nodes link to which.

I’ve written some code to perform this process which can be viewed in full (with comments) here: survey-transform.js.
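For a sense of the general shape of such a translation, here is a minimal sketch. It is not the contents of survey-transform.js: the input shape (a flat pages array with id, alias, title and routesTo fields) and the toGraph function are assumptions made purely for illustration, and the real Author JSON is considerably richer.

```javascript
// Hypothetical input shape, invented for this example only.
const sampleQuestionnaire = {
  pages: [
    { id: "p1", alias: "Q1", title: "First", routesTo: ["p2", "p3"] },
    { id: "p2", alias: "Q2", title: "Second", routesTo: ["p3"] },
    { id: "p3", alias: "Q3", title: "Third", routesTo: [] },
  ],
};

function toGraph({ pages }) {
  // One node per page, keeping only what the presentation code reads
  const data = pages.map(({ alias, title }) => ({ alias, title }));
  // Map each page id to its array position so links can use indices
  const indexById = new Map(pages.map((page, i) => [page.id, i]));
  // One link per routing destination, skipping ids that do not resolve
  const links = pages.flatMap((page, source) =>
    page.routesTo
      .filter((id) => indexById.has(id))
      .map((id) => ({ source, target: indexById.get(id) }))
  );
  return { data, links };
}

const graph = toGraph(sampleQuestionnaire);
```

The resulting graph.data and graph.links are in the form that d3.forceSimulation and d3.forceLink().links() expect.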

4 Note re: app architecture

This app should be quite possible to achieve as a static website - there’s no need for any server-side processing.

This would make the “feature activated from eQ Author” criterion on the MVP document easy to accomplish. In the prototype app, the questionnaire ID is passed in using location.hash; the app then fetches the questionnaire directly from prod (using a CORS proxy for now, until we add the requisite headers to the Author API to do it directly). We could just have a button / link in Author which directs users to http://<urlofservice>/#<questionnaireID>

Would make deployment super simple too (and super cheap to “operate”) - can live in an S3 bucket; deployment is a simple upload / sync.
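The location.hash hand-off can be sketched as a small pure function. This is an illustrative sketch rather than the prototype's code: questionnaireIdFromHash is a hypothetical name, and it assumes the hash contains nothing but the questionnaire ID.

```javascript
// Pure helper so it can be exercised outside a browser; in the page it
// would be called as questionnaireIdFromHash(window.location.hash).
function questionnaireIdFromHash(hash) {
  // Drop the leading "#" and decode any percent-encoded characters
  const id = decodeURIComponent(hash.replace(/^#/, "")).trim();
  // Treat a missing or empty hash as "no questionnaire selected"
  return id === "" ? null : id;
}

const exampleId = questionnaireIdFromHash("#fe3b8f84-1fc5-463c-9fc6-c26a5fdcf543");
const missingId = questionnaireIdFromHash("");
```

Returning null for an empty hash gives the page a clear signal to show a prompt instead of attempting a fetch.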

5 Tutorials

Some useful tutorials: