The core idea is to diff the rendered HTML output instead of source data structures, given that apps render UI as HTML anyway. This bypasses the need for custom diff implementations.
Legend:
The diagram above shows two approaches to building diff views. The Custom Diff approach (🟧) requires building and maintaining specialized diff logic and custom renderers for each document type. This means more development effort and one-off solutions that don't generalize.
In contrast, the HTML Diff approach (🟩) leverages the fact that most applications ultimately render to HTML. Rather than building custom diff logic, it works directly with the rendered HTML output. Developers do not need to create diffed data structures or modify renderers: the approach takes your existing HTML and adds data-diff-status attributes to highlight changes. Because it depends only on the rendered output, it generalizes to any application UI that renders to HTML, making it a reusable solution.
HTML diff uses data-diff-key attributes to track elements across before/after states. Here's what happens:
Adding data-diff-key to elements only adds diff attributes—it doesn't modify the HTML structure:
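A minimal sketch of that behavior, written in Python purely for illustration. It assumes well-formed markup that Python's standard xml.etree.ElementTree can parse, and the names diff_html, keyed, and fingerprint are hypothetical helpers, not part of any library API. Elements are matched by data-diff-key, and the only change made to the "after" markup is an extra data-diff-status attribute. How the real tool places removed elements in its output is not described here, so the sketch only reports them in the returned status map.

```python
import copy
import xml.etree.ElementTree as ET

KEY = "data-diff-key"
STATUS = "data-diff-status"


def keyed(root):
    """Map each data-diff-key value to the element that carries it."""
    return {el.get(KEY): el for el in root.iter() if el.get(KEY) is not None}


def fingerprint(el):
    """Serialize an element without its trailing text, for comparison."""
    clone = copy.deepcopy(el)
    clone.tail = None
    return ET.tostring(clone, encoding="unicode")


def diff_html(before, after):
    """Return (annotated 'after' markup, {key: status}).

    Assumption: both inputs are well-formed fragments with a single root.
    """
    old = keyed(ET.fromstring(before))
    new_root = ET.fromstring(after)
    new = keyed(new_root)

    # Compute every status first, so annotating one element cannot
    # change the fingerprint of a keyed ancestor.
    statuses = {}
    for key, el in new.items():
        if key not in old:
            statuses[key] = "added"
        elif fingerprint(old[key]) != fingerprint(el):
            statuses[key] = "modified"
    for key in old:
        if key not in new:
            statuses[key] = "removed"  # reported only, not re-inserted

    # Annotate in place: attributes are added, no wrapper elements.
    for key, status in statuses.items():
        if key in new:
            new[key].set(STATUS, status)
    return ET.tostring(new_root, encoding="unicode", method="html"), statuses


if __name__ == "__main__":
    before = ('<table><tr><td data-diff-key="name">Widget</td>'
              '<td data-diff-key="price">$10</td></tr></table>')
    after = ('<table><tr><td data-diff-key="name">Widget</td>'
             '<td data-diff-key="price">$12</td>'
             '<td data-diff-key="stock">5</td></tr></table>')
    annotated, statuses = diff_html(before, after)
    print(annotated)
    print(statuses)
```

In this sketch the table keeps the same rows and cells it had in the "after" input. The changed cells only gain an attribute, which is why the technique is safe for tables.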
data-diff-status added: Elements get data-diff-status="added", "modified", or "removed"
Notice how in the diff result:
data-diff-status attribute is added to the <td> elements
The progression of diffing granularity:
data-diff-key only: data-diff-status attributes added to changed elements
data-diff-key + data-diff-mode="element": Explicit atomic element diffing (entire element marked as updated)
data-diff-key + data-diff-mode="words": Granular word-level diffing with spans
This approach makes data-diff-key safe for complex layouts like tables, where inserting wrapper elements would break the structure.
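To make the difference between the "element" and "words" modes concrete, here is a rough Python sketch built on the standard difflib module. The function names (diff_content, diff_words, span) and the choice to put data-diff-status on the generated spans are assumptions made for illustration. The text above only states that word mode produces spans, not what attributes they carry.

```python
import difflib
from html import escape


def span(status, words):
    """Wrap a run of words in a span (the attribute name is an assumption)."""
    return f'<span data-diff-status="{status}">{escape(" ".join(words))}</span>'


def diff_words(before_text, after_text):
    """Word-level diff: changed words are wrapped in spans."""
    old, new = before_text.split(), after_text.split()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    parts = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            parts.append(escape(" ".join(new[j1:j2])))
            continue
        if op in ("delete", "replace"):
            parts.append(span("removed", old[i1:i2]))
        if op in ("insert", "replace"):
            parts.append(span("added", new[j1:j2]))
    return " ".join(parts)


def diff_content(before_text, after_text, mode="element"):
    """Return the inner markup for a keyed element under the given mode.

    "element": the content is left untouched; the element itself would
               carry the status attribute (atomic diffing).
    "words":   changed words inside the element are wrapped in spans.
    """
    if mode == "words":
        return diff_words(before_text, after_text)
    return escape(after_text)


if __name__ == "__main__":
    print(diff_content("ships in 3 days", "ships in 5 days", mode="element"))
    print(diff_content("ships in 3 days", "ships in 5 days", mode="words"))
```

The sketch shows the trade-off: element mode never adds markup inside the element, while words mode inserts spans, so it suits text content more than structural containers such as table rows.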