Commit 360cc76f authored by Bharath Ramsundar
Cleaning up

parent d90214a4
+472 −475

File changed.

Preview size limit exceeded, changes collapsed.

+16 −16
@@ -66,22 +66,22 @@ fairly soon.
Testing Machine Learning Models
-------------------------------

-Testing the correctness of a machine learning model can be quite tricky to do
-in practice. When adding a new machine learning model to DeepChem, you should
-add at least a few basic types of unit tests:
-
-- Overfitting test: Create a small synthetic dataset and test that your model
-  can learn this dataset with high accuracy. For regression and classification
-  tasks, this should correspond to low training error on the dataset. For
-  generative tasks, this should correspond to low training loss on the dataset.
-- Reloading test: Check that a trained model can be saved to disk and reloaded
-  correctly. This should involve checking that predictions from the saved and
-  reloaded models
-  match exactly.
-
-Note that unit tests are not sufficient to gauge the real performance of a
-model. You should benchmark your model on larger datasets as well and report
-your benchmark results in the PR comments.
+Testing the correctness of a machine learning model can be quite
+tricky to do in practice. When adding a new machine learning model to
+DeepChem, you should add at least a few basic types of unit tests:
+
+- Overfitting test: Create a small synthetic dataset and test that
+  your model can learn this dataset with high accuracy. For regression
+  and classification tasks, this should correspond to low training error
+  on the dataset. For generative tasks, this should correspond to low
+  training loss on the dataset.
+- Reloading test: Check that a trained model can be saved to disk and
+  reloaded correctly. This should involve checking that predictions from
+  the saved and reloaded models match exactly.
+
+Note that unit tests are not sufficient to gauge the real performance
+of a model. You should benchmark your model on larger datasets as well
+and report your benchmark results in the PR comments.
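
The two test types described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not DeepChem's actual API: ``TinyLinearModel``, ``make_synthetic_dataset``, and ``mean_squared_error`` are hypothetical stand-ins for a real model class, dataset builder, and metric, and the ``1e-6`` error threshold is an arbitrary assumption.

```python
import os
import pickle
import tempfile


class TinyLinearModel:
    """Hypothetical stand-in model: fits y = w * x + b by gradient descent."""

    def __init__(self, learning_rate=0.3):
        self.learning_rate = learning_rate
        self.w = 0.0
        self.b = 0.0

    def fit(self, xs, ys, epochs=500):
        n = len(xs)
        for _ in range(epochs):
            # Gradients of the mean squared error with respect to w and b.
            residuals = [self.w * x + self.b - y for x, y in zip(xs, ys)]
            grad_w = sum(2.0 * r * x for r, x in zip(residuals, xs)) / n
            grad_b = sum(2.0 * r for r in residuals) / n
            self.w -= self.learning_rate * grad_w
            self.b -= self.learning_rate * grad_b

    def predict(self, xs):
        return [self.w * x + self.b for x in xs]

    def save(self, path):
        # Persist only the learned parameters.
        with open(path, "wb") as f:
            pickle.dump({"w": self.w, "b": self.b}, f)

    @classmethod
    def load(cls, path):
        with open(path, "rb") as f:
            state = pickle.load(f)
        model = cls()
        model.w, model.b = state["w"], state["b"]
        return model


def make_synthetic_dataset(n_samples=10):
    """Small noiseless dataset: evenly spaced x in [-1, 1], y = 3x + 1."""
    xs = [-1.0 + 2.0 * i / (n_samples - 1) for i in range(n_samples)]
    ys = [3.0 * x + 1.0 for x in xs]
    return xs, ys


def mean_squared_error(ys_true, ys_pred):
    return sum((t - p) ** 2 for t, p in zip(ys_true, ys_pred)) / len(ys_true)


def test_overfit():
    # Overfitting test: the model should reach low training error
    # on a small synthetic dataset.
    xs, ys = make_synthetic_dataset()
    model = TinyLinearModel()
    model.fit(xs, ys)
    assert mean_squared_error(ys, model.predict(xs)) < 1e-6


def test_reload():
    # Reloading test: predictions before saving and after reloading
    # should match exactly.
    xs, ys = make_synthetic_dataset()
    model = TinyLinearModel()
    model.fit(xs, ys)
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "model.pkl")
        model.save(path)
        reloaded = TinyLinearModel.load(path)
    assert reloaded.predict(xs) == model.predict(xs)
```

Functions named ``test_*`` like these are collected automatically by pytest. For a real model, the same structure would apply with the model's own fit, predict, save, and restore methods, and with a metric and threshold suited to the task.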

Type Annotations
----------------