Generate a single-page index of documentation hosted in one or more GitHub organizations on github.com and/or one or more GitHub Enterprise instances.
This package is intended for organizations that host their documentation alongside code on GitHub (including GitHub Enterprise) and need a convenient single-page index to help people find things. It’s a small, opinionated, and purpose-specific tool, originally written so that my team could have a master index of our documentation (spread across github.com, two GitHub Enterprises, Confluence, and another intranet solution) without having to remember to add every new repository.
Full documentation is available on ReadTheDocs: http://github-docs-index.readthedocs.io/en/latest/
- Outputs docutils-ready reStructuredText for processing or conversion to a variety of formats.
- Supports a manually-curated “quick links” section at the top of the page, which can include arbitrary non-GitHub URLs
- Iterates repositories in any number of Organizations (or Users) on github.com and/or any number of GitHub Enterprise instances. Allows blacklisting or whitelisting repositories per-org/user and blacklisting organizations.
- Output sorted alphabetically and by last commit/update date.
- Configurable to show only repositories with GitHub Pages, a Repository URL, repositories with a README file longer than a specified length, repositories with a description, or all repositories.
- Option to ignore forks.
- Python API which allows injecting additional links to the chronological and alphabetical lists.
- GitHub tokens taken from environment variables.
- Configurable title, subtitle/header and footer; subtitle and footer can be overridden by environment variables.
- Python 2.7 or 3.4+ (currently tested with 2.7, 3.4, 3.5, 3.6)
- VirtualEnv and pip (recommended installation method; your OS/distribution should have packages for these)
It’s recommended that you install into a virtual environment (virtualenv / venv). See the virtualenv usage documentation for information on how to create a venv.
pip install github-docs-index
Configuration is provided by a YAML file; see Example Configuration for a detailed example. The YAML file must have a mapping/hash/dict at the top level. Keys are as follows:
- title, string, the title of the index rst document
- footer, optional, string, a footer line to include at the end of the index rst document. This configuration option is overridden by the GITHUB_DOCS_FOOTER environment variable if set.
- subtitle, optional, string, a subtitle/header line to include below the title of the index rst document. This configuration option is overridden by the GITHUB_DOCS_SUBTITLE environment variable if set.
- githubs, array/list of mappings/dicts specifying the github instances to query. Each item in the array has the following structure:
- token_env_var, string, name of the environment variable that contains the Personal Access Token for this github instance
- url, optional, string, URL to this GitHub instance for GitHub Enterprise instances. If not specified, defaults to
- orgs, optional, array of string organization names to scan repositories in. If neither this option nor “users” is specified, default to scanning all repos of orgs that the authenticated user belongs to.
- users, optional, array of string user names to scan repositories in. If neither this option nor “orgs” is specified, default to scanning all repos of orgs that the authenticated user belongs to.
- whitelist_repos, optional, array of string repository slugs (full names) in “owner_name/repo_name” format. If specified, only these repos will be included regardless of other configuration for this GitHub instance.
- blacklist_repos, optional, array of string repository slugs (full names) in “owner_name/repo_name” format to never include in the output documentation index.
- blacklist_orgs, optional, array of string organization names to ignore when scanning repos.
- ignore_forks, optional, boolean, default False. If True, do not include any repos that are forks in the listing.
- quick_links, optional, array/list of mappings/dicts specifying manually-curated links to add to the “Quick Links” section at the top of the document. Each item in the array has the following structure:
- title, string, the title/text of the link
- url, string, the URL to link to
- description, optional, string description to output after the link to the repo
- repo_criteria, array/list of strings or mappings determining which repos to include in the listing and what URL to set for them. For each repo, these are evaluated in order and the first match wins; if none match, the repo is not added to the list. Possible options are:
- homepage, string: if present, use the Repository Homepage URL (at the top of the repo page, next to the description) as the link. Only matches repos with a Homepage set.
- github_pages, string: if present, use the repo’s GitHub Pages URL as the link. Only matches repos with GitHub Pages enabled.
- readme: N, mapping/dict where “readme” is a string and N is an integer: match repos with a readme file of size N or greater, and link to the repo’s main HTML URL (GitHub web UI URL).
- description, string: match any repo with a description set, and link to the repo’s main HTML URL (GitHub web UI URL).
- all: match any/all repos, and link to the repo’s main HTML URL (GitHub web UI URL).
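The first-match-wins evaluation of repo_criteria can be sketched roughly as follows. This is only an illustration of the rule as described above, not the package's actual implementation: the pick_url function and the keys of the repo dictionaries (homepage, pages_url, readme_size, description, html_url) are hypothetical stand-ins, as are the example URLs.

```python
def pick_url(repo, criteria):
    """Return the link URL for ``repo`` from the first matching criterion.

    ``repo`` is a hypothetical dict describing one repository; returns None
    when no criterion matches, meaning the repo would be left out.
    """
    for criterion in criteria:
        if isinstance(criterion, dict):
            # A mapping such as {'readme': 100} carries a minimum README size.
            min_size = criterion.get('readme')
            if min_size is not None and repo.get('readme_size', 0) >= min_size:
                return repo['html_url']
        elif criterion == 'homepage' and repo.get('homepage'):
            return repo['homepage']
        elif criterion == 'github_pages' and repo.get('pages_url'):
            return repo['pages_url']
        elif criterion == 'description' and repo.get('description'):
            return repo['html_url']
        elif criterion == 'all':
            return repo['html_url']
    return None


criteria = ['homepage', 'github_pages', {'readme': 100}, 'description']

with_homepage = {
    'html_url': 'https://github.com/example/a',
    'homepage': 'https://a.example.com',
    'description': 'has a homepage and a description',
}
readme_only = {'html_url': 'https://github.com/example/b', 'readme_size': 250}
bare = {'html_url': 'https://github.com/example/c'}

for repo in (with_homepage, readme_only, bare):
    print(pick_url(repo, criteria))
```

Because order matters, putting all last acts as a catch-all, while leaving it out drops repos that match nothing else.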
Usage via the command line is straightforward for common use cases. The reStructuredText output is printed to STDOUT, and can be redirected to a file. For example, assuming you’ve already installed the package as shown above, and using example_config.yaml as an example:
# these next three environment variable names are specified in example_config.yaml
export GITHUB_TOKEN=yourToken
export GHE_TOKEN=anotherToken
export OTHER_GHE_TOKEN=yetAnotherToken
github-docs-index config.yaml > index.rst
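Since each GitHub instance reads its token from the environment variable named by token_env_var, a forgotten export is an easy mistake to make. A minimal pre-flight check might look like the sketch below. It works on an already-parsed configuration mapping; the missing_token_vars helper and the ghe.example.com URL are made up for illustration and are not part of github-docs-index.

```python
import os


def missing_token_vars(config, environ=None):
    """Return the token_env_var names in ``config`` that are unset or empty."""
    environ = os.environ if environ is None else environ
    missing = []
    for gh in config.get('githubs', []):
        name = gh['token_env_var']
        if not environ.get(name):
            missing.append(name)
    return missing


# A parsed configuration shaped like the "githubs" key described earlier.
config = {
    'title': 'Documentation Index',
    'githubs': [
        {'token_env_var': 'GITHUB_TOKEN'},
        {'token_env_var': 'GHE_TOKEN', 'url': 'https://ghe.example.com'},
    ],
}

# Pass a plain dict instead of os.environ so the check is easy to try out.
fake_env = {'GITHUB_TOKEN': 'yourToken'}
print(missing_token_vars(config, fake_env))
```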
This rst file can be converted to the format of your choice with any tool that understands reStructuredText input. For example, it can be converted to HTML using rst2html.py from the docutils package (pip install docutils):
rst2html.py --report=4 index.rst > index.html
You can see an example of the actual HTML output for my own github user in the source tree at Example Output.
github-docs-index can also be imported and used in other Python code. This can be especially useful for doing something with the raw rst output; here is an example that replicates the functionality of the above CLI examples in a single Python script:
#!/usr/bin/env python

# for generating the rst
from github_docs_index.config import Config
from github_docs_index.index_generator import GithubDocsIndexGenerator

# for docutils rst -> HTML
from docutils import core
from docutils.writers.html4css1 import Writer, HTMLTranslator

# this replicates "github-docs-index config.yaml" at the command line
g = GithubDocsIndexGenerator(Config('config.yaml'))
rst_string = g.generate_index()


# the code below here replicates "rst2html.py --report=4 index.rst > index.html"
class HTMLFragmentTranslator(HTMLTranslator):

    def __init__(self, document):
        HTMLTranslator.__init__(self, document)
        self.head_prefix = ['', '', '', '', '']
        self.body_prefix = []
        self.body_suffix = []
        self.stylesheet = []

    def astext(self):
        return ''.join(self.body)


html_fragment_writer = Writer()
html_fragment_writer.translator_class = HTMLFragmentTranslator
# publish_string returns bytes by default, hence the binary file mode
with open('index.html', 'wb') as fh:
    fh.write(core.publish_string(rst_string, writer=html_fragment_writer))
print('Output written to: index.html')
Adding Documentation From Other Sources
It’s also possible via the Python API to include arbitrary documents from sources other than GitHub in the index; they will be sorted into the chronological and alphabetical lists along with the GitHub repositories. This can be helpful if you have other sources of documentation, such as an intranet or wiki, that you can programmatically query. The only requirement is that each document has a URL, a title, a date (generally a created/modified/updated date) and, optionally, a short description. The github_docs_index.index_generator.GithubDocsIndexGenerator.generate_index method takes an optional additional_links argument, which is a list of instances of a subclass of github_docs_index.index_link.IndexLink. So long as the instances implement the three properties of IndexLink, they will be included in the documentation index. Here is a short, contrived example based on the code above which includes two other documents with hard-coded dates, titles and URLs; the generate_additional_links() function could be switched out for one which queries your alternate documentation stores and returns similar output.
#!/usr/bin/env python3

from datetime import datetime, timezone
from github_docs_index.config import Config
from github_docs_index.index_generator import GithubDocsIndexGenerator
from github_docs_index.index_link import IndexLink


class StaticLink(IndexLink):
    """This class implements the three property methods in IndexLink"""

    def __init__(self, title, url, sort_datetime, description=''):
        self._title = title
        self._url = url
        self._sort_datetime = sort_datetime
        self._description = description

    @property
    def sort_datetime(self):
        return self._sort_datetime

    @property
    def sort_name(self):
        return self._title.lower()

    @property
    def rst_line(self):
        r = '`%s <%s>`_' % (self._title, self._url)
        if self._description is not None and self._description.strip() != '':
            r += ' - ' + self._description
        return r


def generate_additional_links():
    return [
        StaticLink(
            'Some Document',
            'http://example.com/someDocument',
            datetime(2017, 6, 3, 12, 34, 41, tzinfo=timezone.utc),
            description='this is a document'
        ),
        StaticLink(
            'Other Document',
            'http://example.com/otherDocument',
            datetime(2018, 8, 12, 19, 24, 53, tzinfo=timezone.utc),
            description='this is another document'
        )
    ]


# this replicates "github-docs-index config.yaml" at the command line
g = GithubDocsIndexGenerator(Config('config.yaml'))
rst_string = g.generate_index(additional_links=generate_additional_links())
with open('index.rst', 'w') as fh:
    fh.write(rst_string)
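When dates come from another documentation store, it helps to make every sort_datetime timezone-aware before handing the links over, because Python refuses to compare naive and aware datetime objects. The sketch below is independent of github-docs-index: Doc is a hypothetical stand-in exposing only sort_name and sort_datetime, the titles are invented, and treating naive timestamps as UTC is an assumption you may need to change. The sort directions are for illustration only; nothing above says which order the generator itself uses.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Hypothetical stand-in for an IndexLink subclass; only the sort keys matter here.
Doc = namedtuple('Doc', ['sort_name', 'sort_datetime'])


def ensure_aware(dt):
    """Return ``dt`` unchanged if it is timezone-aware, else assume UTC."""
    if dt.tzinfo is None or dt.tzinfo.utcoffset(dt) is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt


docs = [
    # A naive timestamp, e.g. as parsed from a wiki export.
    Doc('wiki home', ensure_aware(datetime(2016, 6, 3, 12, 34, 41))),
    # An already-aware timestamp passes through untouched.
    Doc('api guide',
        ensure_aware(datetime(2015, 8, 12, 19, 24, 53, tzinfo=timezone.utc))),
]

alphabetical = sorted(docs, key=lambda d: d.sort_name)
newest_first = sorted(docs, key=lambda d: d.sort_datetime, reverse=True)

print([d.sort_name for d in alphabetical])
print([d.sort_name for d in newest_first])
```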
To install for development:
- Fork the github-docs-index repository on GitHub
- Create a new branch off of master in your fork.
$ virtualenv github-docs-index
$ cd github-docs-index && source bin/activate
$ pip install -e git+git@github.com:YOURNAME/github-docs-index.git@BRANCHNAME#egg=github-docs-index
$ cd src/github-docs-index
The git clone you’re now in will probably be checked out to a specific commit, so you may want to git checkout BRANCHNAME.
- pep8 compliant with some exceptions (see pytest.ini)
- 100% test coverage with pytest (with valid tests)
- testing is as simple as:
pip install tox
- If you want to pass additional arguments to pytest, add them to the tox command line after “--”, e.g. for verbose pytest output on py27 tests:
tox -e py27 -- -v
- Open an issue for the release; cut a branch off master for that issue.
- Confirm that there are CHANGES.rst entries for all major changes.
- Ensure that Travis tests are passing in all environments.
- Ensure that test coverage is no less than the last release (ideally, 100%).
- Increment the version number in github_docs_index/version.py and add version and release date to CHANGES.rst, then push to GitHub.
- Confirm that README.rst renders correctly on GitHub.
- Upload package to testpypi:
- Create a pull request for the release to be merged into master. Upon successful Travis build, merge it.
- Tag the release in Git, push tag to GitHub:
- tag the release. for now the message is quite simple:
git tag -s -a X.Y.Z -m 'X.Y.Z released YYYY-MM-DD'
- push the tag to GitHub:
git push origin X.Y.Z
- TravisCI will cut the release and upload to PyPI.
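If you script the tagging step, the two git commands above can be assembled from a version and a release date. This is a small sketch only; the tag_commands helper and the 1.2.3 version are invented for illustration, and it merely builds the argument lists rather than running git.

```python
from datetime import date


def tag_commands(version, released):
    """Return the git tag and git push argument lists for ``version``."""
    # Message format taken from the release steps: 'X.Y.Z released YYYY-MM-DD'
    message = '%s released %s' % (version, released.isoformat())
    return [
        ['git', 'tag', '-s', '-a', version, '-m', message],
        ['git', 'push', 'origin', version],
    ]


for cmd in tag_commands('1.2.3', date(2018, 8, 12)):
    print(cmd)
```

The lists are in the form accepted by subprocess.check_call, should you want to execute them.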
- Example Configuration
- Example Output
- github_docs_index package
- github_docs_index.config module
- github_docs_index.github_instance module
- github_docs_index.index_document module
- github_docs_index.index_generator module
- github_docs_index.index_link module
- github_docs_index.quick_link module
- github_docs_index.repo_link module
- github_docs_index.runner module
- github_docs_index.version module