Introducing hfviewer

Written by Embedl | May 4, 2026 10:01:13 AM

The Hugging Face ecosystem already has model cards, spaces, checkpoints, benchmarks, and demos. What it has still lacked is a fast, general-purpose way to see how a model is put together. We built hfviewer.com to fill that gap: paste a Hugging Face model URL, open an interactive architecture graph in the browser, and move between overview and detail without installing anything.

Why we built it

This is our way of giving back to the Hugging Face community.

We kept running into the same problem: a model card can tell you what a model is for, but it rarely helps you inspect the actual structure quickly. If you want to understand where the vision encoder enters, how the decoder repeats, whether the model routes through experts, or how a multimodal merge happens, you often end up reading config files, staring at code, or building your own mental graph from scattered clues.

hfviewer is meant to make that first architectural pass much faster. You can open a model directly from the Hugging Face URL, get a visual map in the browser, and then zoom from the broad system shape down into the more specific substructure that matters for understanding deployment, latency, and correctness.
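Without a tool like this, that first architectural pass usually starts from a raw config.json. As a rough illustration of the manual work involved, here is a minimal sketch that summarizes a config dict into a readable outline; the config excerpt and its field names are hypothetical, loosely modeled on common transformer configs:

```python
# Sketch of the manual config-reading pass described above.
# The config below is a hypothetical excerpt; field names are
# loosely modeled on common transformer configs, not a real model.
config = {
    "model_type": "toy-decoder",
    "hidden_size": 2048,
    "num_hidden_layers": 24,
    "num_attention_heads": 16,
    "num_experts": 8,  # would only appear in a mixture-of-experts model
    "vision_config": {"hidden_size": 1024, "num_hidden_layers": 12},
}

def summarize(cfg, indent=0):
    """Recursively render a config dict as an indented outline."""
    lines = []
    for key, value in cfg.items():
        pad = "  " * indent
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(summarize(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines

print("\n".join(summarize(config)))
```

Even this tidy outline only hints at structure; it says nothing about how the vision encoder feeds the decoder or where experts are routed, which is exactly the gap the graph view is meant to close.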

What hfviewer does

  • Open models directly from Hugging Face: paste a model URL or repo id and open the graph, with no local setup or notebook workflow.

  • Switch between overview and detail: granularity levels let you move from the high-level architecture down to more specific traced blocks and paths.
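The overview-to-detail idea can be shown with a toy example: a traced model is essentially a long list of module paths, and an overview collapses repeated numbered siblings (say, dozens of identical decoder layers) into one summary node. The module paths and the grouping rule below are hypothetical, chosen only to illustrate the principle, not how hfviewer is implemented:

```python
import re
from collections import OrderedDict

# Toy illustration of an overview view: collapse repeated numbered
# siblings (layers.0, layers.1, ...) into a single summary entry.
# These module paths are hypothetical.
paths = [
    "embed_tokens",
    "layers.0.self_attn", "layers.0.mlp",
    "layers.1.self_attn", "layers.1.mlp",
    "layers.2.self_attn", "layers.2.mlp",
    "lm_head",
]

def overview(paths):
    """Replace numeric path segments with '*' and count how many
    concrete paths collapse into each pattern."""
    counts = OrderedDict()
    for p in paths:
        pattern = re.sub(r"\.\d+\.", ".*.", p)
        counts[pattern] = counts.get(pattern, 0) + 1
    return counts

for pattern, n in overview(paths).items():
    print(f"{pattern}  (x{n})")
```

Expanding a collapsed node is then just the reverse mapping: listing the concrete paths behind one pattern.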

A new kind of interactive blog

One of the most interesting things hfviewer enables is not a prettier model page but a new kind of technical article. On the Gemma 4 family page, the blog text and the graph are connected: you can read a section about a particular architectural decision, jump into the corresponding part of the graph, and then move back into the article with the surrounding context still intact.

That matters because model understanding is rarely linear. Sometimes you start from prose and need to verify it visually. Sometimes you see a node, a route, or a merge in the graph and want the editorial explanation immediately. We think that graph-to-text and text-to-graph loop is a better format for ML communication than a static diagram dropped into a long post.

We are releasing this because we think architecture understanding should be easier to share, easier to discuss, and easier to build on. Again: This is our way of giving back to the Hugging Face community.