What should we do about developer documentation?

We are pretty disorganized with regard to our developer docs (where dev docs are defined as documents covering our workflow or internal technical information; user-facing docs are a separate matter). Surveys of our developers suggest that the lack of component-specific and system-wide documentation impedes productivity.

Currently, our dev docs are splattered around:

  • Internal wikis and posts, which are not visible to open source audiences.
  • Our GitHub wiki, which has no overarching organization, has many rotted links, and is not easily searchable (@ngimel describes it as “write-only”).
  • The codebase itself, which probably works the best currently but is not easily discoverable, searchable, or editable.

Looking around at other large open source projects, a common pattern seems to be to host a developer documentation site that organizes things more comprehensively. Ref:

I think the following properties are valuable for a docs thing:

  • Markdown support
  • Easy to edit, possibly not requiring code editing.
  • But still subject to review? (not sure about this, but arguably our GH wiki is a good example of what happens when you let folks dump docs)
  • Synced with code (possibly: add docs and code atomically, view dev docs at certain hash, etc.)
  • Good search

I guess some questions are:

  • Is it worth trying to find a good way to do the dev docs and consolidate them? Or are we fine with how things are currently?
  • Any thoughts on good candidates? e.g. docs.pytorch.org, GitBook, readthedocs.

GH explicitly states that they disabled crawling of wiki pages, and points people to GitHub Pages instead.

So, why don’t we just migrate our wiki to a GH page?

It’s a good option; we could probably set up Jekyll and just copypasta the current docs there.

I don’t think searching is the only thing we want to improve about the wiki, though: there’s a ton of stale content and no structure. I wonder how easy or hard it is to structure content with Jekyll.

I suggest we

  1. Make it searchable
  2. Remove stale pages
  3. Remove stale content
  4. Decide on what we want to cover
  5. Cover it
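For steps 2 and 3, a small script could give a first list of rotted links to triage. This is only a rough sketch, assuming the wiki has been cloned locally as a directory of Markdown files; the function name `find_broken_links` is made up for illustration, and it only checks relative links between pages, not external URLs or `[[Page]]`-style wiki links.

```python
import re
from pathlib import Path

# Matches Markdown inline links of the form [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as https: or mailto:.
SCHEME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_broken_links(wiki_dir):
    """Return (page, target) pairs whose relative link target is missing.

    Hypothetical helper: assumes `wiki_dir` is a local checkout of the wiki.
    """
    root = Path(wiki_dir)
    broken = []
    for page in sorted(root.rglob("*.md")):
        text = page.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # External URLs and in-page anchors are out of scope here.
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue
            path_part = target.split("#", 1)[0]
            candidate = page.parent / path_part
            # Links between wiki pages often omit the .md suffix, so try both.
            with_suffix = candidate.parent / (candidate.name + ".md")
            if not candidate.exists() and not with_suffix.exists():
                broken.append((page.relative_to(root).as_posix(), target))
    return broken
```

Running it over a checkout and sorting the output by page would at least show which pages are worth keeping and which are mostly dead links.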

“But still subject to review?”
I think optional review. A GitHub repo with a wide committer set might be OK for it.

We can start by setting up GitHub Pages. If we actually get it to a polished state, we could maybe expose it under pytorch.org in the future.

Content is more critical than format as long as it’s searchable. It definitely makes sense to do a pass on the existing wiki. It might also be worth attempting to describe the architecture in general (similar to the existing docs for the JIT) and pulling out the relevant parts of various dev-discuss posts.


FYI, the GitHub wiki is a git repo. While it doesn’t directly allow PRs, you can create a separate repo to “edit the wiki” with review, and just always mirror that to the wiki.
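Roughly, the mirroring step could look like the sketch below. The wiki's git remote is the repo URL with `.wiki.git` appended; everything else here is an assumption for illustration: the helper names, the `example-org/example-repo` URL, the `main` branch in the review repo, and `master` as the wiki's branch (check what your wiki actually uses). It only builds the `git push` command unless you explicitly ask it to run.

```python
import subprocess


def wiki_remote(repo_url):
    """Derive the wiki's git remote from a GitHub repository URL."""
    base = repo_url.rstrip("/")
    if base.endswith(".git"):
        base = base[: -len(".git")]
    return base + ".wiki.git"


def mirror_command(docs_checkout, repo_url, branch="main", wiki_branch="master"):
    """Build the git command that pushes the reviewed branch to the wiki.

    Branch names are assumptions; adjust them to match your repositories.
    """
    return [
        "git", "-C", docs_checkout,
        "push", "--force",
        wiki_remote(repo_url),
        f"{branch}:{wiki_branch}",
    ]


def mirror(docs_checkout, repo_url, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    cmd = mirror_command(docs_checkout, repo_url)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and URL, shown as a dry run only.
    mirror("./dev-docs", "https://github.com/example-org/example-repo")
```

A CI job on the review repo could call something like this after every merge, so the wiki stays a read-only mirror and all edits go through PRs.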

Python’s and Rust’s dev docs are pretty nice.
Python’s uses Sphinx, and I believe it doesn’t have a WYSIWYG editor, which @suo asked for as a nice-to-have in the first post, but VSCode supports Markdown editing with live previews, if that’s acceptable.

@dzhulgakov and I have been playing with GitBook, which seems quite nice:

  • Quip-ish WYSIWYG editor with inline diffs, comments, etc.
  • Bi-directional Git sync
  • Everything is Markdown
  • Free for open source projects
  • Indexed by search engines and stuff

In look and feel it is very similar to Rust’s mdBook (I have seen mdBook described explicitly as a GitBook clone somewhere). Example book from us messing around: About this guide - PyTorch Developer Documentation

Edit: oh, I saw that in the mdbook project description lol

MkDocs is a great, simple way to manage docs in Markdown. Material for MkDocs includes pretty good search and support for diagrams and MathJax. Getting started - Material for MkDocs