Overview of LightGBM in SynapseML

06/25/2024

LightGBM is an open-source, distributed, high-performance gradient boosting (GBDT, GBRT, GBM, or MART) framework. This framework specializes in creating high-quality and GPU-enabled decision tree algorithms for ranking, classification, and many other machine learning tasks. LightGBM is part of Microsoft's DMTK project.

Advantages of LightGBM

Composability: LightGBM models can be incorporated into existing SparkML pipelines and used for batch, streaming, and serving workloads.
Performance: LightGBM on Spark is 10-30% faster than SparkML on the Higgs dataset and achieves a 15% increase in AUC. Parallel experiments have verified that LightGBM can achieve a linear speed-up by using multiple machines for training in specific settings.
Functionality: LightGBM offers a wide array of tunable parameters that you can use to customize the decision tree system. LightGBM on Spark also supports new types of problems such as quantile regression.
Cross-platform: LightGBM on Spark is available on Spark, PySpark, and SparklyR.
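
To make the quantile regression mentioned above more concrete: a quantile model is usually fit by minimizing the pinball loss, which penalizes under-prediction and over-prediction asymmetrically. The following is a minimal pure-Python sketch of that loss for illustration only; it is not LightGBM's internal implementation, and the function names are hypothetical.

```python
def pinball_loss(y_true, y_pred, alpha):
    """Pinball (quantile) loss for one observation at quantile level alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    error = y_true - y_pred
    # Under-prediction (error >= 0) costs alpha per unit of error;
    # over-prediction (error < 0) costs (1 - alpha) per unit of error.
    return alpha * error if error >= 0 else (alpha - 1.0) * error


def mean_pinball_loss(y_true, y_pred, alpha):
    """Average pinball loss over paired sequences of targets and predictions."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one observation is required")
    return sum(pinball_loss(t, p, alpha) for t, p in pairs) / len(pairs)
```

With alpha set to 0.9, under-predicting by 2 units costs nine times as much as over-predicting by 2 units, which pushes the fitted value toward the 90th percentile of the target.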

LightGBM Usage

LightGBMClassifier: used for building classification models. For example, to predict whether a company will go bankrupt, we could build a binary classification model with LightGBMClassifier.
LightGBMRegressor: used for building regression models. For example, to predict housing prices, we could build a regression model with LightGBMRegressor.
LightGBMRanker: used for building ranking models. For example, to predict the relevance of website search results, we could build a ranking model with LightGBMRanker.
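
As a rough illustration of how the three estimators above might be constructed in PySpark, the sketch below assumes the `synapseml` package is installed on the cluster and that the training DataFrame already contains an assembled feature vector column. The column names (`features`, `label`, `query_id`), the `ESTIMATOR_PARAMS` table, and the `build_estimator` helper are hypothetical, and the parameter values are illustrative rather than recommended.

```python
# Hypothetical column names and illustrative settings; adjust to your data.
ESTIMATOR_PARAMS = {
    "classification": {
        "objective": "binary",
        "featuresCol": "features",
        "labelCol": "label",
    },
    "regression": {
        "objective": "regression",
        "featuresCol": "features",
        "labelCol": "label",
    },
    "ranking": {
        "featuresCol": "features",
        "labelCol": "label",
        "groupCol": "query_id",  # identifies which rows belong to one query
    },
}


def build_estimator(task):
    """Return an unfitted SynapseML LightGBM estimator for the given task."""
    if task not in ESTIMATOR_PARAMS:
        raise ValueError(f"unknown task: {task!r}")
    # Imported lazily so this module loads even where SynapseML is absent.
    from synapseml.lightgbm import (
        LightGBMClassifier,
        LightGBMRanker,
        LightGBMRegressor,
    )
    classes = {
        "classification": LightGBMClassifier,
        "regression": LightGBMRegressor,
        "ranking": LightGBMRanker,
    }
    return classes[task](**ESTIMATOR_PARAMS[task])
```

Each returned object is a standard SparkML estimator, so, assuming DataFrames named `train_df` and `test_df`, training and scoring would follow the usual pattern: `model = build_estimator("classification").fit(train_df)` followed by `model.transform(test_df)`. The estimator can also be placed as a stage in a SparkML `Pipeline`.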
