Follow Google C++ Style Guide, with max-line-length extended to 120.
Run cpplint before committing code.
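For example, a cpplint invocation that matches the 120-character limit above (the file path is a placeholder):

```bash
# Check a C++ file with the extended 120-character line limit
cpplint --linelength=120 path/to/file.cc
```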
Follow Google Python Style Guide, with max-line-length extended to 120. Exceptions are allowed where it feels more natural to follow PyTorch style. For example, PyTorch allows relative imports and importing class names directly.
Run pylint before committing code. That doesn't mean every issue has to be corrected or every check manually disabled. Just make sure you are aware of the remaining issues and are comfortable with all of them. But don't leave lint errors in place: disable a check explicitly if it is not a real error.
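For example, a check that is not a real error can be silenced with an inline comment (the code below is illustrative only):

```python
# Illustrative: explicitly silence a naming check on a line where it is not a real error
xs = [1, 2, 3]
l = len(xs)  # pylint: disable=invalid-name
```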
Install pylint
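pylint is available from PyPI:

```bash
pip install pylint
```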
To check a file with pylint:
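For example (the file path is a placeholder; pass the 120-character limit explicitly if the project has no rcfile for it):

```bash
pylint --max-line-length=120 path/to/file.py
```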
yapf is an auto-formatting tool owned by Google (but not an official Google product). To save time arguing about code style during code review, formatting the code with yapf is a good option. Note that it doesn't reformat comments.
Install yapf
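yapf is also available from PyPI:

```bash
pip install yapf
```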
Format code with yapf
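One way to format a file in place with a Google-based style and the 120-character limit (the style options are a sketch; adjust them to the project's setup):

```bash
yapf -i --style='{based_on_style: google, column_limit: 120}' path/to/file.py
```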
There are Sublime and Vim plugins.
Use googletest for C++ code.
Use pytest for Python code.
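A minimal sketch of a pytest test file (file and function names are made up for illustration; pytest collects functions whose names start with test_):

```python
# test_example.py
def test_addition():
    assert 1 + 1 == 2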
To run all the tests:
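Assuming the tests are discoverable from the repository root:

```bash
pytest
```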
To run a particular test file:
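The path below is a placeholder for an actual test file in the repository:

```bash
pytest tests/test_example.py
```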
Quantization is a very overloaded word, and the many things related to it can create a lot of confusion. Let's avoid confusion as much as possible by following existing conventions. Generally, if there is a similar TensorFlow or numpy function, follow its convention. Although TensorFlow uses quantized, quantization, and quant, let's stick with the shortest one, quant, only.
When developing the quantized version of a function or module, add Quant to the class name and quant_ to the function name, e.g.
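A sketch of what that naming looks like; the class and function below are hypothetical placeholders, not an existing API:

```python
import torch
from torch import nn


class QuantLinear(nn.Linear):
    """Quantized counterpart of nn.Linear (illustrative only)."""


def quant_matmul(a, b):
    """Quantized counterpart of matmul (illustrative only)."""
    return torch.matmul(a, b)  # placeholder: a real version would quantize a and b first
```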
Add the prefixes quant_mode_, num_bits_, etc. to the names of tensors that will be quantized, e.g.
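For instance, a quantized matmul whose inputs a and b are both quantized might expose per-tensor settings named after those tensors. The signature and default values below are hypothetical, for illustration only:

```python
def quant_matmul(a, b,
                 quant_mode_a="per_tensor", num_bits_a=8,
                 quant_mode_b="per_tensor", num_bits_b=8):
    """Quantization settings are named after the tensor they apply to (illustrative only)."""
    ...
```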
Don't use the prefix/suffix weight or act if the tensor being quantized doesn't have it explicitly in its name. From the function's perspective, it takes tensors, not necessarily weight and activation tensors; e.g. a and b of matmul can each be either a weight or an activation.
The only convention we can adopt here is per_channel. For other cases, such as per-row or per-column scaling of a matrix multiply, there is no convention to follow. And although we usually use a scaling factor based on the absolute maximum value, there are other ways to decide it, such as KL-divergence.
Our API design is flexible enough to support any granularity of quantization. The main concept is axis.
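A minimal sketch of how an axis argument can express granularity: reducing the absolute maximum over every dimension except the given axis yields one scale per slice along that axis (per-channel), while axis=None collapses to a single per-tensor scale. The helper below is illustrative, not part of the actual API:

```python
import torch


def compute_amax(x, axis=None):
    """Return the max absolute value, kept per slice along `axis` (None means per tensor)."""
    if axis is None:
        return x.abs().max()
    # Reduce over all dimensions except `axis`, keeping one value per slice along it.
    reduce_dims = tuple(d for d in range(x.dim()) if d != axis)
    return x.abs().amax(dim=reduce_dims, keepdim=True)


# Example: per-channel amax along the output-channel axis of a conv weight
w = torch.randn(64, 32, 3, 3)
amax_per_channel = compute_amax(w, axis=0)    # shape (64, 1, 1, 1)
amax_per_tensor = compute_amax(w, axis=None)  # scalar
```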