Commit d1667d74 authored by nd-02110114

📝 update docs for get started and development

parent 9c509bbd
+1 −1
@@ -30,7 +30,7 @@ materials science, quantum chemistry, and biology.

## Requirements

-DeepChem currently supports Python 3.5 through 3.7 and requires these packages on any condition.
+DeepChem currently supports Python 3.6 through 3.7 and requires these packages on any condition.

- [joblib](https://pypi.python.org/pypi/joblib)
- [NumPy](https://numpy.org/)
+5 −1
@@ -35,7 +35,7 @@ release = deepchem.__version__
# ones.
extensions = [
    'sphinx.ext.autodoc', 'sphinx.ext.napoleon', 'sphinx.ext.doctest',
-    'sphinx.ext.linkcode', 'sphinx.ext.mathjax',
+    'sphinx.ext.linkcode', 'sphinx.ext.mathjax', 'sphinx.ext.autosectionlabel',
]

# Options for autodoc directives
@@ -59,6 +59,10 @@ source_suffix = '.rst'
# The master toctree document.
master_doc = 'index'

# autosectionlabel setting
autosectionlabel_prefix_document = True
autosectionlabel_maxdepth = 3

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.

docs/source/dataclasses.rst

deleted 100644 → 0
+0 −26
Data Classes
============
DeepChem featurizers often transform members into "data classes". These are
classes that hold all the information needed to train a model on that data
point. Models then transform these into the tensors for training in their
:code:`default_generator` methods.

Graph Convolutions
------------------

These classes document the data classes for graph convolutions. We plan to simplify these classes into a joint data representation for all graph convolutions in a future version of DeepChem, so these APIs may not remain stable.

.. autoclass:: deepchem.feat.mol_graphs.ConvMol
  :members:

.. autoclass:: deepchem.feat.mol_graphs.MultiConvMol
  :members:

.. autoclass:: deepchem.feat.mol_graphs.WeaveMol
  :members:

.. autoclass:: deepchem.feat.graph_data.GraphData
  :members:

.. autoclass:: deepchem.feat.graph_data.BatchGraphData
  :members:

docs/source/dataloaders.rst

deleted 100644 → 0
+0 −62
Data Loaders
============

Processing large amounts of input data to construct a :code:`dc.data.Dataset` object can take a fair amount of custom code. The :code:`dc.data.DataLoader` classes simplify this process by providing utilities to load and process large amounts of data.


DataLoader
----------

.. autoclass:: deepchem.data.DataLoader
  :members:

CSVLoader
^^^^^^^^^

.. autoclass:: deepchem.data.CSVLoader
  :members:

UserCSVLoader
^^^^^^^^^^^^^

.. autoclass:: deepchem.data.UserCSVLoader
  :members:

JsonLoader
^^^^^^^^^^
JSON is a flexible file format that is human-readable, lightweight,
and more compact than other open standard formats like XML. JSON files
are similar to Python dictionaries of key-value pairs. All keys must
be strings, but values can be any of (string, number, object, array,
boolean, or null), so the format is more flexible than CSV. JSON is
used for describing structured data and to serialize objects. Pandas
dataframes can be conveniently read from and written to JSON with
:code:`pandas.read_json` and :code:`pandas.DataFrame.to_json`.
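
The correspondence between JSON and Python dictionaries described above can be sketched with the standard-library :code:`json` module alone. This is only an illustration: the field names in :code:`record` are made up for the example and are not part of any DeepChem format.

```python
import json

# A record holding one value of each JSON type; the field names are hypothetical.
record = {
    "smiles": "CCO",             # string
    "weight": 46.07,             # number
    "tasks": ["tox", "sol"],     # array
    "meta": {"source": "demo"},  # object
    "active": True,              # boolean
    "label": None,               # null
}

# Serialize to JSON text, then parse it back into a Python dictionary.
text = json.dumps(record)
restored = json.loads(text)

# JSON keys must be strings, so a non-string key is coerced when serialized.
int_keyed = json.loads(json.dumps({1: "a"}))
```

After the round trip, :code:`restored` compares equal to :code:`record`, while the integer key in the last line comes back as the string :code:`"1"`.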

.. autoclass:: deepchem.data.JsonLoader
  :members:

FASTALoader
^^^^^^^^^^^

.. autoclass:: deepchem.data.FASTALoader
  :members:

ImageLoader
^^^^^^^^^^^

.. autoclass:: deepchem.data.ImageLoader
  :members:

SDFLoader
^^^^^^^^^

.. autoclass:: deepchem.data.SDFLoader
  :members:

InMemoryLoader
^^^^^^^^^^^^^^
The :code:`dc.data.InMemoryLoader` is designed to facilitate the processing of large datasets where you already hold the raw data in memory (for example, in a pandas dataframe).

.. autoclass:: deepchem.data.InMemoryLoader
  :members:

docs/source/datasets.rst

deleted 100644 → 0
+0 −41
Datasets
========

DeepChem :code:`dc.data.Dataset` objects are one of the core building blocks of DeepChem programs. :code:`Dataset` objects hold representations of data for machine learning and are widely used throughout DeepChem.

Dataset
-------
The :code:`dc.data.Dataset` class is the abstract parent class for all
datasets. This class should never be instantiated directly, but it
contains a number of useful method implementations.
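
The general pattern of an abstract parent that is never created directly but still supplies shared methods can be sketched as follows. This is a toy illustration using Python's :code:`abc` module, not DeepChem's implementation; :code:`ToyDataset`, :code:`InMemoryToyDataset`, :code:`get_rows`, and :code:`batches` are hypothetical names invented for the example.

```python
from abc import ABC, abstractmethod


class ToyDataset(ABC):
    """Hypothetical stand-in for an abstract dataset parent class."""

    @abstractmethod
    def get_rows(self):
        """Return all (x, y) pairs; subclasses decide where they are stored."""

    def __len__(self):
        # Shared implementation: works for any subclass that defines get_rows().
        return len(self.get_rows())

    def batches(self, batch_size):
        # Shared implementation: yield consecutive slices of the rows.
        rows = self.get_rows()
        for start in range(0, len(rows), batch_size):
            yield rows[start:start + batch_size]


class InMemoryToyDataset(ToyDataset):
    """Concrete subclass that keeps its rows in a plain Python list."""

    def __init__(self, xs, ys):
        self._rows = list(zip(xs, ys))

    def get_rows(self):
        return self._rows


dataset = InMemoryToyDataset([1, 2, 3], [0, 1, 0])
result = list(dataset.batches(batch_size=2))

# Creating the abstract parent itself raises TypeError.
try:
    ToyDataset()
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False
```

The subclass only says where the rows live; length and batching come from the parent.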

The goal of the :code:`Dataset` class is to be maximally interoperable with other common representations of machine learning datasets. For this reason we provide interconversion methods mapping from :code:`Dataset` objects to pandas dataframes, TensorFlow Datasets, and PyTorch datasets.

.. autoclass:: deepchem.data.Dataset
  :members:

NumpyDataset
------------
The :code:`dc.data.NumpyDataset` class provides an in-memory implementation of the abstract :code:`Dataset` which stores its data in :code:`numpy.ndarray` objects.

.. autoclass:: deepchem.data.NumpyDataset
  :members:

DiskDataset
-----------
The :code:`dc.data.DiskDataset` class allows for the storage of larger
datasets on disk. Each :code:`DiskDataset` is associated with a
directory in which it writes its contents to disk. Note that a
:code:`DiskDataset` can be very large, so some of the utility methods
to access fields of a :code:`Dataset` can be prohibitively expensive.
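
Why whole-dataset access gets expensive for directory-backed storage can be sketched with a toy example. The code below is not how :code:`DiskDataset` stores its data; the JSON shard layout and the helper names :code:`write_shards`, :code:`iter_shards`, and :code:`load_all` are assumptions made purely for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_shards(data_dir, rows, shard_size):
    """Write rows to numbered JSON shard files inside data_dir."""
    paths = []
    for i in range(0, len(rows), shard_size):
        path = Path(data_dir) / f"shard-{i // shard_size}.json"
        path.write_text(json.dumps(rows[i:i + shard_size]))
        paths.append(path)
    return paths


def iter_shards(data_dir):
    """Yield one shard at a time, so only one shard is in memory at once."""
    # Lexicographic sorting is enough here because the toy uses under 10 shards.
    for path in sorted(Path(data_dir).glob("shard-*.json")):
        yield json.loads(path.read_text())


def load_all(data_dir):
    """Materialize every row at once; this is the expensive access pattern."""
    return [row for shard in iter_shards(data_dir) for row in shard]


with tempfile.TemporaryDirectory() as data_dir:
    rows = [[i, i * i] for i in range(10)]
    shard_paths = write_shards(data_dir, rows, shard_size=4)
    shard_lengths = [len(shard) for shard in iter_shards(data_dir)]
    everything = load_all(data_dir)
```

Iterating shard by shard keeps memory use bounded by the shard size, whereas :code:`load_all` has to read every file before it returns.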

.. autoclass:: deepchem.data.DiskDataset
  :members:

ImageDataset
------------
The :code:`dc.data.ImageDataset` class is optimized to allow for convenient processing of image-based datasets.

.. autoclass:: deepchem.data.ImageDataset
  :members: