Commit b53e9cab authored by Bharath Ramsundar

More docs

parent 61e578ba
+14 −4
@@ -39,10 +39,10 @@ def load_qm9(
   """Load QM9 dataset

   QM9 is a comprehensive dataset that provides geometric, energetic,
-  electronic and thermodynamic properties for a subset of GDB-17 database,
-  comprising 134 thousand stable organic molecules with up to 9 heavy atoms.
-  All molecules are modeled using density functional theory
-  (B3LYP/6-31G(2df,p) based DFT).
+  electronic and thermodynamic properties for a subset of GDB-17
+  database, comprising 134 thousand stable organic molecules with up
+  to 9 heavy atoms.  All molecules are modeled using density
+  functional theory (B3LYP/6-31G(2df,p) based DFT).

   Random splitting is recommended for this dataset.

@@ -99,6 +99,16 @@ def load_qm9(
   save_dir: str
     a directory to save the dataset in

+  Note
+  ----
+  DeepChem 2.4.0 has turned on sanitization for SDF files by default.
+  For the QM9 dataset, this means that calling this function will
+  return a list of 132480 compounds instead of 133885 in the source
+  dataset file. This appears to be due to valence specification
+  mismatches in the dataset that weren't caught in earlier more lax
+  versions of RDKit. Note that this may subtly affect benchmarking
+  results on this dataset.
+
   References
   ----------
   .. [1] Blum, Lorenz C., and Jean-Louis Reymond. "970 million druglike small
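
To put the counts quoted in the added Note in perspective, here is a minimal sketch that works out how many compounds sanitization drops and what fraction of the source file that is. The helper names (`dropped_summary`, `count_parsed`) and the SDF path argument are hypothetical and are not part of DeepChem. `count_parsed` calls RDKit directly, not DeepChem's loader, and only illustrates how one might check the effect of the `sanitize` flag on a local copy of the file.

```python
# Counts quoted in the docstring Note added by this commit.
SOURCE_COUNT = 133885      # compounds in the source dataset file
SANITIZED_COUNT = 132480   # compounds returned with sanitization on


def dropped_summary(total, kept):
    """Return (number dropped, fraction dropped) for a filtered dataset."""
    dropped = total - kept
    return dropped, dropped / total


def count_parsed(sdf_path, sanitize):
    """Count the SDF records RDKit parses successfully.

    Hypothetical helper: requires RDKit and a local SDF file. This is
    not how DeepChem loads QM9, only a direct RDKit check.
    """
    # Imported lazily so the arithmetic below runs without RDKit installed.
    from rdkit import Chem

    supplier = Chem.SDMolSupplier(sdf_path, sanitize=sanitize)
    # Records that fail parsing or sanitization are yielded as None.
    return sum(mol is not None for mol in supplier)


dropped, fraction = dropped_summary(SOURCE_COUNT, SANITIZED_COUNT)
print(f"dropped {dropped} compounds ({fraction:.2%})")
```

With a local copy of the QM9 SDF file, comparing `count_parsed(path, sanitize=True)` against `count_parsed(path, sanitize=False)` should give a rough sense of how many records are lost to sanitization alone. The exact numbers may vary with the RDKit version.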