Commit

[python-package] [docs] Expand class docs for Dataset (microsoft#6558)
Plenitude-ai authored Jul 24, 2024
1 parent cbee5ee commit 3d8013c
Showing 5 changed files with 24 additions and 7 deletions.
2 changes: 1 addition & 1 deletion R-package/DESCRIPTION
@@ -63,4 +63,4 @@ Imports:
utils
SystemRequirements:
~~CXXSTD~~
-RoxygenNote: 7.3.1
+RoxygenNote: 7.3.2
9 changes: 7 additions & 2 deletions R-package/R/lgb.Dataset.R
@@ -758,8 +758,13 @@ Dataset <- R6::R6Class(
)

#' @title Construct \code{lgb.Dataset} object
-#' @description Construct \code{lgb.Dataset} object from dense matrix, sparse matrix
-#' or local file (that was created previously by saving an \code{lgb.Dataset}).
+#' @description LightGBM does not train on raw data.
+#' It discretizes continuous features into histogram bins, tries to
+#' combine categorical features, and automatically handles missing and
+#' infinite values.
+#'
+#' The \code{Dataset} class handles that preprocessing, and holds that
+#' alternative representation of the input data.
#' @inheritParams lgb_shared_dataset_params
#' @param data a \code{matrix} object, a \code{dgCMatrix} object,
#' a character representing a path to a text file (CSV, TSV, or LibSVM),
8 changes: 6 additions & 2 deletions R-package/man/lgb.Dataset.Rd

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion docs/env.yml
@@ -12,7 +12,7 @@ dependencies:
- r-markdown=1.12
- r-matrix=1.6_4
- r-pkgdown=2.0.7
-- r-roxygen2=7.3.1
+- r-roxygen2=7.3.2
- scikit-learn>=1.4.0
- sphinx>=6.0
- sphinx_rtd_theme>=2.0
10 changes: 9 additions & 1 deletion python-package/lightgbm/basic.py
@@ -1745,7 +1745,15 @@ def current_iteration(self) -> int:


class Dataset:
-"""Dataset in LightGBM."""
+"""
+Dataset in LightGBM.
+
+LightGBM does not train on raw data.
+It discretizes continuous features into histogram bins, tries to combine categorical features,
+and automatically handles missing and infinite values.
+
+This class handles that preprocessing, and holds that alternative representation of the input data.
+"""

def __init__(
self,
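
Editor's note: the new docstring says the Dataset discretizes continuous features into histogram bins and handles missing and infinite values. The sketch below illustrates that idea in plain Python only. It is not LightGBM's binning algorithm: the helper names `build_bin_edges` and `to_bin`, the equal-frequency rule, and the choice to give NaN its own bin while clamping infinities are all assumptions made for illustration.

```python
import math
from bisect import bisect_left


def build_bin_edges(values, max_bins):
    """Return sorted upper bin edges computed from the finite values only.

    Hypothetical helper: a simple equal-frequency rule, not LightGBM's.
    """
    finite = sorted(v for v in values if math.isfinite(v))
    distinct = sorted(set(finite))
    if len(distinct) <= max_bins:
        # Few distinct values: each one becomes its own bin edge.
        return distinct
    edges = []
    for i in range(1, max_bins + 1):
        # Index of the last element that falls in the i-th quantile slice.
        idx = math.ceil(i * len(finite) / max_bins) - 1
        edge = finite[idx]
        if not edges or edge > edges[-1]:
            edges.append(edge)
    return edges


def to_bin(value, edges):
    """Map a raw value to a bin index (assumed convention, see lead-in)."""
    if math.isnan(value):
        return len(edges)  # dedicated extra bin for missing values
    # +inf and -inf are clamped into the last and first regular bin.
    return min(bisect_left(edges, value), len(edges) - 1)


raw = [0.1, 0.4, 0.35, float("nan"), 2.5, float("inf"), 1.2, 0.9]
edges = build_bin_edges(raw, max_bins=3)
binned = [to_bin(v, edges) for v in raw]
```

After this step a model would see only the small integers in `binned`, which is the sense in which the docstring says training does not happen on raw data.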
