Creating documentation from our code by our code
I hate having to write documentation as much as the next developer, especially when it’s for something as simple as the codes or underlying dependencies you use in your pipeline! Just imagine: you’re working in your Agile ways and you’re given a task to incorporate a new feature that uses some of the business’s codes, like ‘WEDPA’, ‘UUQL’ or some other strange hieroglyphic.
You get cracking with your development and your code speaks for itself, a work of art if you do say so yourself! You close the ticket and document how to use the feature, but then realise you’ve not documented any of its dependencies or what they mean! What’s more, the business might change those codes, and you would have to update the documentation every time codes were added or updated...
The Solution
- Create a table to hold all of your dependencies, describing in detail what each one does (even better, grab that data from a static data table, where whoever entered the data has already populated the descriptions for you).
- Load this table into memory if you haven't created it yourself in the pipeline (e.g. from database -> pandas).
- Push this table into a Confluence page where you store all of this information, so that it's easily readable and visible to the business, not just a CSV or a comment left somewhere in the code.
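The first two steps can be sketched roughly as follows. This is a minimal illustration using only the standard library: the `business_codes` table, its columns and the placeholder descriptions are all assumptions made up for the demo, and an in-memory SQLite database stands in for wherever your static data really lives. If the data is already in pandas, `pandas.read_sql_query` and `DataFrame.to_html` would do the same job as the two helpers here.

```python
import sqlite3
from html import escape


def load_codes(conn):
    """Return (columns, rows) from the hypothetical business_codes table."""
    cursor = conn.execute(
        "SELECT code, description FROM business_codes ORDER BY code"
    )
    columns = [col[0] for col in cursor.description]
    return columns, cursor.fetchall()


def to_html_table(columns, rows):
    """Render the rows as a plain HTML table, escaping every cell."""
    head = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table><tbody><tr>{head}</tr>{body}</tbody></table>"


# Demo data: an in-memory database standing in for the real static data table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE business_codes (code TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO business_codes VALUES (?, ?)",
    [
        ("WEDPA", "Placeholder description"),
        ("UUQL", "Placeholder description <draft>"),
    ],
)

columns, rows = load_codes(conn)
html_table = to_html_table(columns, rows)
print(html_table)
```

The resulting `html_table` string is what gets pushed to Confluence in the third step. Escaping each cell matters because the descriptions are entered by people and may contain characters such as `<` that would otherwise break the page markup.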
Demo Confluence environment
To get us up and running we can spin up two Docker containers, one running Confluence and the other running a Jupyter notebook. Be sure to follow the step-by-step instructions for getting your Confluence server up and running. It’s blissfully easy!
Creating and using a Confluence API wrapper
I have created a quick and dirty Confluence wrapper, which is freely available on GitHub. If you have any issues please raise them, and pull requests are most welcome.
We can install it by running:
pip install git+https://github.com/ghandic/confluenceapi.git
After that we should be good to go with the Jupyter notebooks.
First of all we will make our pages in Confluence, leaving some pages empty for the code to fill in and update on every production pipeline run.
Now we can add HTML content to that page by following the example notebook provided:
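The wrapper's own method names are not reproduced in this post, so as a rough sketch here is what a page update looks like against the Confluence REST API directly (`PUT /rest/api/content/{id}`), using only the standard library. The base URL, page id, title, version number and credentials below are assumptions for a local demo server, and the helper names are hypothetical, not part of the wrapper.

```python
import base64
import json
import urllib.request


def build_page_update(page_id, title, html, current_version):
    """Build the JSON body for PUT /rest/api/content/{id}."""
    return {
        "id": str(page_id),
        "type": "page",
        "title": title,
        "body": {"storage": {"value": html, "representation": "storage"}},
        # The update must carry the next version number of the page.
        "version": {"number": current_version + 1},
    }


def build_request(base_url, page_id, payload, user, password):
    """Wrap the payload in an authenticated PUT request (not sent yet)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/rest/api/content/{page_id}",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Assumed demo values: adjust to your own server, page and credentials.
payload = build_page_update(
    "12345", "Business codes", "<p>Updated by the pipeline</p>", 3
)
request = build_request("http://localhost:8090", "12345", payload, "admin", "admin")
# urllib.request.urlopen(request)  # uncomment to send to a running server
```

The current version number has to be read from the page first (a `GET` on the same URL returns it), because the update needs that number plus one. The HTML also needs to be well-formed, since the `storage` representation is XHTML-based.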
Another way we may want to document is by uploading files, maybe a picture (.png), a log file (.txt), etc. We can do this by using the following methods:
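As the wrapper's upload methods are not shown in this post, the sketch below goes through the underlying REST endpoint instead (`POST /rest/api/content/{id}/child/attachment`), which expects a multipart form with a `file` field and an `X-Atlassian-Token` header. The helper names, server URL, page id and file contents are assumptions for illustration only.

```python
import urllib.request
import uuid


def encode_multipart(filename, data, content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body with a 'file' field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_attachment_request(base_url, page_id, filename, data,
                             content_type, authorization=None):
    """Build (but do not send) the attachment upload request."""
    body, content_header = encode_multipart(filename, data, content_type)
    headers = {
        "Content-Type": content_header,
        # Required by Confluence for multipart uploads (XSRF protection).
        "X-Atlassian-Token": "nocheck",
    }
    if authorization:
        headers["Authorization"] = authorization
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/rest/api/content/{page_id}/child/attachment",
        data=body,
        method="POST",
        headers=headers,
    )


# Assumed demo values: a tiny log file attached to a hypothetical page id.
upload = build_attachment_request(
    "http://localhost:8090", "12345", "pipeline.log", b"run ok\n", "text/plain"
)
# urllib.request.urlopen(upload)  # uncomment to send to a running server
```

For a picture, the same call works with the file's bytes and `image/png` as the content type. A real server will also want credentials, passed here through the optional `authorization` argument.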
To see more examples check out the full GitHub repo.