Package ml.dmlc.xgboost4j.java
Class Booster
java.lang.Object
ml.dmlc.xgboost4j.java.Booster
- All Implemented Interfaces:
Serializable, com.esotericsoftware.kryo.KryoSerializable
public class Booster
extends Object
implements Serializable, com.esotericsoftware.kryo.KryoSerializable
Booster for XGBoost. This is the model API that supports interactive (incremental) building of an XGBoost model.
Nested Class Summary
Nested Classes
static class - Supported feature importance types:
WEIGHT = number of nodes in which a feature was used to determine a split
GAIN = average information gain per split for a feature
COVER = average cover per split for a feature
TOTAL_GAIN = total information gain over all splits of a feature
TOTAL_COVER = total cover over all splits of a feature
-
Field Summary
Fields: logger, handle, version
-
Constructor Summary
Constructors: Booster(Map<String, Object> params, DMatrix[] cacheMats)
-
Method Summary
Methods
void boost(DMatrix dtrain, float[] grad, float[] hess) - Update with given grad and hess.
void dispose()
private static long[] dmatrixsToHandles(DMatrix[] dmatrixs) - Transfer a DMatrix array to a handle array (used for native functions).
String evalSet(DMatrix[] evalMatrixs, String[] evalNames, int iter) - Evaluate with the given DMatrices.
String evalSet(DMatrix[] evalMatrixs, String[] evalNames, int iter, float[] metricsOut) - Evaluate with the given DMatrices.
String evalSet(DMatrix[] evalMatrixs, String[] evalNames, IEvaluation eval) - Evaluate with a given customized Evaluation class.
String evalSet(DMatrix[] evalMatrixs, String[] evalNames, IEvaluation eval, float[] metricsOut)
protected void finalize()
final String getAttr(String key) - Get an attribute from the Booster.
Map<String,String> getAttrs() - Get attributes stored in the Booster as a Map.
private String[] getDumpInfo(boolean withStats) - Get the dump of the model.
private Map<String,Double> getFeatureImportanceFromModel(String[] modelInfos, String importanceType) - Get the importance of each feature based on information gain or cover.
Map<String,Integer> getFeatureScore(String featureMap) - Get the importance of each feature.
Map<String,Integer> getFeatureScore(String[] featureNames) - Get the importance of each feature with specified feature names.
private Map<String,Integer> getFeatureWeightsFromModel(String[] modelInfos) - Get the importance of each feature based purely on weight (number of splits).
String[] getModelDump(String featureMap, boolean withStats) - Get the dump of the model as a string array.
String[] getModelDump(String featureMap, boolean withStats, String format)
String[] getModelDump(String[] featureNames, boolean withStats) - Get the dump of the model as a string array with specified feature names.
String[] getModelDump(String[] featureNames, boolean withStats, String format)
Map<String,Double> getScore(String featureMap, String importanceType) - Get the feature importances for gain or cover (average or total).
Map<String,Double> getScore(String[] featureNames, String importanceType) - Get the feature importances for gain or cover (average or total), with feature names.
int getVersion()
private void init(DMatrix[] cacheMats) - Internal initialization function.
(package private) static Booster loadModel(InputStream in) - Load a new Booster model from a file opened as an input stream.
(package private) static Booster loadModel(String modelPath) - Load a new Booster model from modelPath.
(package private) int loadRabitCheckpoint() - Load the booster model from the thread-local rabit checkpoint.
float[][] predict(DMatrix data) - Predict with data.
float[][] predict(DMatrix data, boolean outputMargin) - Predict with data.
float[][] predict(DMatrix data, boolean outputMargin, int treeLimit) - Advanced predict function with all the options.
private float[][] predict(DMatrix data, boolean outputMargin, int treeLimit, boolean predLeaf, boolean predContribs) - Advanced predict function with all the options.
float[][] predictContrib(DMatrix data, int treeLimit) - Output feature contributions toward predictions of the given data.
float[][] predictLeaf(DMatrix data, int treeLimit) - Predict leaf indices given the data.
void read(com.esotericsoftware.kryo.Kryo kryo, com.esotericsoftware.kryo.io.Input input)
private void readObject(java.io.ObjectInputStream in)
void saveModel(OutputStream out) - Save the model to a file opened as an output stream.
void saveModel(String modelPath) - Save the model to modelPath.
(package private) void saveRabitCheckpoint() - Save the booster model into the thread-local rabit checkpoint and increment the version.
final void setAttr(String key, String value) - Set an attribute on the Booster.
void setAttrs(Map<String,String> attrs) - Set attributes on the Booster.
final void setParam(String key, Object value) - Set a parameter on the Booster.
void setParams(Map<String,Object> params) - Set parameters on the Booster.
void setVersion(int version)
byte[] toByteArray() - Save the model as a byte array representation.
void update(DMatrix dtrain, int iter) - Update the booster for one iteration.
void update(DMatrix dtrain, IObjective obj) - Update with a customized objective function.
void write(com.esotericsoftware.kryo.Kryo kryo, com.esotericsoftware.kryo.io.Output output)
private void writeObject(java.io.ObjectOutputStream out)
-
Field Details
-
logger
private static final org.apache.commons.logging.Log logger -
handle
private long handle -
version
private int version
-
-
Constructor Details
-
Booster
Booster(Map<String, Object> params, DMatrix[] cacheMats) throws XGBoostError
Create a new Booster with an empty stage.
- Parameters:
params - Model parameters
cacheMats - Cached DMatrix entries; prediction on these DMatrices is faster than on non-cached data.
- Throws:
XGBoostError
- native error
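This constructor is not called directly in typical use; a Booster is usually obtained from the companion XGBoost class in the same package. A minimal sketch, assuming the usual XGBoost.train entry point, a LIBSVM-format training file at a placeholder path, and illustrative parameter values:

    import java.util.HashMap;
    import java.util.Map;
    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.DMatrix;
    import ml.dmlc.xgboost4j.java.XGBoost;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class TrainExample {
      public static void main(String[] args) throws XGBoostError {
        // "train.libsvm" is a placeholder path to a LIBSVM-format file.
        DMatrix dtrain = new DMatrix("train.libsvm");

        Map<String, Object> params = new HashMap<>();
        params.put("objective", "binary:logistic");
        params.put("max_depth", 3);
        params.put("eta", 0.1);

        // Watches are evaluated after every boosting round.
        Map<String, DMatrix> watches = new HashMap<>();
        watches.put("train", dtrain);

        // XGBoost.train builds the Booster and runs 10 boosting rounds.
        Booster booster = XGBoost.train(dtrain, params, 10, watches, null, null);

        booster.dispose();
        dtrain.dispose();
      }
    }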
-
-
Method Details
-
loadModel
Load a new Booster model from modelPath.
- Parameters:
modelPath - The path to the model.
- Returns:
- The created Booster.
- Throws:
XGBoostError
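A minimal sketch of loading a previously saved model from a path and running prediction, assuming the public XGBoost.loadModel wrapper from the same package; the file names are placeholders:

    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.DMatrix;
    import ml.dmlc.xgboost4j.java.XGBoost;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class LoadModelExample {
      public static void main(String[] args) throws XGBoostError {
        // "model.bin" is a placeholder path to a model saved with saveModel.
        Booster booster = XGBoost.loadModel("model.bin");

        // "test.libsvm" is a placeholder file with the same feature layout as training.
        DMatrix dtest = new DMatrix("test.libsvm");
        float[][] preds = booster.predict(dtest);
        System.out.println("prediction for first row: " + preds[0][0]);

        dtest.dispose();
        booster.dispose();
      }
    }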
-
loadModel
Load a new Booster model from a file opened as an input stream. The assumption is that the input stream contains only one XGBoost model. This can be used to load existing booster models saved by other xgboost bindings.
- Parameters:
in - The input stream of the file.
- Returns:
- The created Booster.
- Throws:
XGBoostError
IOException
-
setParam
Set a parameter on the Booster.
- Parameters:
key - parameter name
value - parameter value
- Throws:
XGBoostError
- native error
-
setParams
Set parameters on the Booster.
- Parameters:
params - parameter key-value map
- Throws:
XGBoostError
- native error
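A short sketch of adjusting parameters on an existing Booster between boosting rounds; the booster argument is assumed to be a trained model as in the earlier training example, and the parameter names and values are illustrative:

    import java.util.HashMap;
    import java.util.Map;
    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class ParamExample {
      // Adjust parameters on an already-trained Booster before further boosting rounds.
      static void tuneParams(Booster booster) throws XGBoostError {
        // Change a single parameter.
        booster.setParam("eta", 0.05);

        // Set several parameters at once from a key-value map.
        Map<String, Object> updated = new HashMap<>();
        updated.put("max_depth", 4);
        updated.put("min_child_weight", 2);
        booster.setParams(updated);
      }
    }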
-
getAttrs
Get attributes stored in the Booster as a Map.
- Returns:
- A map containing attribute pairs.
- Throws:
XGBoostError
- native error
-
getAttr
Get an attribute from the Booster.
- Parameters:
key - attribute key
- Returns:
- attribute value
- Throws:
XGBoostError
- native error
-
setAttr
Set an attribute on the Booster.
- Parameters:
key - attribute key
value - attribute value
- Throws:
XGBoostError
- native error
-
setAttrs
Set attributes on the Booster.
- Parameters:
attrs - attribute key-value map
- Throws:
XGBoostError
- native error
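A minimal sketch of writing and reading Booster attributes; the booster argument is assumed to be an existing trained model, and the attribute names are arbitrary examples:

    import java.util.HashMap;
    import java.util.Map;
    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class AttrExample {
      static void tagModel(Booster booster) throws XGBoostError {
        // Store a single string attribute on the model.
        booster.setAttr("trained_by", "example-pipeline");

        // Store several attributes at once.
        Map<String, String> attrs = new HashMap<>();
        attrs.put("data_version", "2024-01");
        attrs.put("notes", "baseline run");
        booster.setAttrs(attrs);

        // Read attributes back.
        String trainedBy = booster.getAttr("trained_by");
        Map<String, String> all = booster.getAttrs();
        System.out.println(trainedBy + " / " + all.size() + " attributes");
      }
    }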
-
update
Update the booster for one iteration.
- Parameters:
dtrain - training data
iter - current iteration number
- Throws:
XGBoostError
- native error
-
update
Update with a customized objective function.
- Parameters:
dtrain - training data
obj - customized objective class
- Throws:
XGBoostError
- native error
-
boost
Update with given grad and hess (directly supplied first- and second-order gradients).
- Parameters:
dtrain - training data
grad - first-order gradient
hess - second-order gradient
- Throws:
XGBoostError
- native error
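A sketch of continuing training interactively with update and boost; the training file path and parameter values are placeholders, XGBoost.train and DMatrix.getLabel from the same package are assumed, and the squared-error gradients are just an illustrative custom objective:

    import java.util.HashMap;
    import java.util.Map;
    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.DMatrix;
    import ml.dmlc.xgboost4j.java.XGBoost;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class BoostExample {
      public static void main(String[] args) throws XGBoostError {
        DMatrix dtrain = new DMatrix("train.libsvm"); // placeholder path

        Map<String, Object> params = new HashMap<>();
        params.put("objective", "reg:squarederror");
        params.put("max_depth", 3);

        // Train an initial model for 5 rounds.
        Booster booster = XGBoost.train(dtrain, params, 5,
            new HashMap<String, DMatrix>(), null, null);

        // Continue with the built-in objective for a few more rounds via update().
        for (int iter = 5; iter < 8; iter++) {
          booster.update(dtrain, iter);
        }

        // One round with hand-computed squared-error gradients via boost():
        // grad = prediction - label, hess = 1.
        float[] labels = dtrain.getLabel();
        float[][] preds = booster.predict(dtrain, true); // raw margin scores
        float[] grad = new float[labels.length];
        float[] hess = new float[labels.length];
        for (int i = 0; i < labels.length; i++) {
          grad[i] = preds[i][0] - labels[i];
          hess[i] = 1.0f;
        }
        booster.boost(dtrain, grad, hess);

        dtrain.dispose();
        booster.dispose();
      }
    }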
-
evalSet
Evaluate with the given DMatrices.
- Parameters:
evalMatrixs - DMatrices for evaluation
evalNames - names for the eval DMatrices, used when reporting results
iter - current eval iteration
- Returns:
- eval information
- Throws:
XGBoostError
- native error
-
evalSet
public String evalSet(DMatrix[] evalMatrixs, String[] evalNames, int iter, float[] metricsOut) throws XGBoostError
Evaluate with the given DMatrices.
- Parameters:
evalMatrixs - DMatrices for evaluation
evalNames - names for the eval DMatrices, used when reporting results
iter - current eval iteration
metricsOut - output array containing the evaluation metrics for each evalMatrix
- Returns:
- eval information
- Throws:
XGBoostError
- native error
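A sketch of evaluating a trained model on several datasets at once; the booster argument is assumed to be a trained Booster and the file paths are placeholders:

    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.DMatrix;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class EvalExample {
      static void evaluate(Booster booster) throws XGBoostError {
        DMatrix dtrain = new DMatrix("train.libsvm"); // placeholder paths
        DMatrix dtest = new DMatrix("test.libsvm");

        // Returns one line of metrics, e.g. "[0]  train-logloss:...  test-logloss:...".
        String report = booster.evalSet(
            new DMatrix[]{dtrain, dtest},
            new String[]{"train", "test"},
            0);
        System.out.println(report);

        dtrain.dispose();
        dtest.dispose();
      }
    }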
-
evalSet
public String evalSet(DMatrix[] evalMatrixs, String[] evalNames, IEvaluation eval) throws XGBoostError
Evaluate with a given customized Evaluation class.
- Parameters:
evalMatrixs - evaluation matrices
evalNames - evaluation names
eval - custom evaluator
- Returns:
- eval information
- Throws:
XGBoostError
- native error
-
evalSet
public String evalSet(DMatrix[] evalMatrixs, String[] evalNames, IEvaluation eval, float[] metricsOut) throws XGBoostError
- Throws:
XGBoostError
-
predict
private float[][] predict(DMatrix data, boolean outputMargin, int treeLimit, boolean predLeaf, boolean predContribs) throws XGBoostError
Advanced predict function with all the options.
- Parameters:
data - data
outputMargin - whether to output the untransformed margin
treeLimit - limit on the number of trees; 0 means all trees
predLeaf - whether to predict leaf indices
predContribs - whether to predict feature contributions
- Returns:
- predict results
- Throws:
XGBoostError
-
predictLeaf
Predict leaf indices given the data.
- Parameters:
data - The input data.
treeLimit - Number of trees to include; 0 means all trees.
- Returns:
- The leaf indices of the instance.
- Throws:
XGBoostError
-
predictContrib
Output feature contributions toward predictions of the given data.
- Parameters:
data - The input data.
treeLimit - Number of trees to include; 0 means all trees.
- Returns:
- The feature contributions and bias.
- Throws:
XGBoostError
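A sketch of the leaf-index and feature-contribution outputs; the booster argument is assumed to be a trained Booster and the test file path is a placeholder:

    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.DMatrix;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class LeafContribExample {
      static void inspect(Booster booster) throws XGBoostError {
        DMatrix dtest = new DMatrix("test.libsvm"); // placeholder path

        // One leaf index per tree for each row (0 = use all trees).
        float[][] leaves = booster.predictLeaf(dtest, 0);

        // Per-feature contributions plus a bias term for each row.
        float[][] contribs = booster.predictContrib(dtest, 0);

        System.out.println("trees per row: " + leaves[0].length);
        System.out.println("contribution columns (features + bias): " + contribs[0].length);

        dtest.dispose();
      }
    }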
-
predict
Predict with data.
- Parameters:
data - DMatrix storing the input
- Returns:
- predict result
- Throws:
XGBoostError
- native error
-
predict
Predict with data.
- Parameters:
data - data
outputMargin - whether to output the untransformed margin
- Returns:
- predict results
- Throws:
XGBoostError
-
predict
Advanced predict function with all the options.
- Parameters:
data - data
outputMargin - whether to output the untransformed margin
treeLimit - limit on the number of trees; 0 means all trees
- Returns:
- predict results
- Throws:
XGBoostError
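A sketch of the three public predict overloads; the booster argument is assumed to be a trained Booster and the test file path is a placeholder:

    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.DMatrix;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class PredictExample {
      static void predictAll(Booster booster) throws XGBoostError {
        DMatrix dtest = new DMatrix("test.libsvm"); // placeholder path

        // Default: transformed predictions (e.g. probabilities for binary:logistic).
        float[][] probs = booster.predict(dtest);

        // Untransformed margin scores.
        float[][] margins = booster.predict(dtest, true);

        // Margin scores using only the first 5 trees.
        float[][] truncated = booster.predict(dtest, true, 5);

        System.out.printf("%f %f %f%n", probs[0][0], margins[0][0], truncated[0][0]);
        dtest.dispose();
      }
    }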
-
saveModel
Save the model to modelPath.
- Parameters:
modelPath - model path
- Throws:
XGBoostError
-
saveModel
Save the model to a file opened as an output stream. The model format is compatible with other xgboost bindings. The output stream can only save one xgboost model. This function will close the OutputStream after the save.
- Parameters:
out - The output stream
- Throws:
XGBoostError
IOException
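A sketch of a save/load round trip through streams, pairing saveModel(OutputStream) with loadModel(InputStream) via the public XGBoost.loadModel wrapper (assumed from the same package); the booster argument is a trained Booster and the file name is a placeholder:

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.XGBoost;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class StreamRoundTrip {
      static Booster roundTrip(Booster booster) throws XGBoostError, IOException {
        // saveModel closes the stream after writing, so no explicit close is needed here.
        booster.saveModel(new FileOutputStream("model.bin")); // placeholder file name

        // Load the model back from an input stream.
        try (FileInputStream in = new FileInputStream("model.bin")) {
          return XGBoost.loadModel(in);
        }
      }
    }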
-
getModelDump
Get the dump of the model as a string array.
- Parameters:
withStats - Controls whether the split statistics are output.
- Returns:
- dumped model information
- Throws:
XGBoostError
- native error
-
getModelDump
public String[] getModelDump(String featureMap, boolean withStats, String format) throws XGBoostError
- Throws:
XGBoostError
-
getModelDump
Get the dump of the model as a string array with specified feature names.
- Parameters:
featureNames - Names of the features.
- Returns:
- dumped model information
- Throws:
XGBoostError
-
getModelDump
public String[] getModelDump(String[] featureNames, boolean withStats, String format) throws XGBoostError
- Throws:
XGBoostError
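A sketch of dumping the trained trees as text; the booster argument is a trained Booster, and passing an empty feature-map string is an assumption that the default feature names (f0, f1, ...) are acceptable:

    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class DumpExample {
      static void printTrees(Booster booster) throws XGBoostError {
        // Empty feature map: trees are dumped with default feature names (f0, f1, ...);
        // withStats = true includes split statistics in the dump.
        String[] trees = booster.getModelDump("", true);
        for (int i = 0; i < trees.length; i++) {
          System.out.println("booster[" + i + "]:");
          System.out.println(trees[i]);
        }
      }
    }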
-
getFeatureScore
Get the importance of each feature with specified feature names.
- Returns:
- featureScoreMap - key: feature name, value: feature importance score; can be null.
- Throws:
XGBoostError
- native error
-
getFeatureScore
Get the importance of each feature.
- Returns:
- featureScoreMap - key: feature index, value: feature importance score; can be null.
- Throws:
XGBoostError
- native error
-
getFeatureWeightsFromModel
Get the importance of each feature based purely on weight (number of splits).
- Returns:
- featureScoreMap - key: feature index, value: feature importance score based on weight
- Throws:
XGBoostError
- native error
-
getScore
public Map<String,Double> getScore(String featureMap, String importanceType) throws XGBoostError
Get the feature importances for gain or cover (average or total).
- Returns:
- featureImportanceMap - key: feature index, value: feature importance score based on gain or cover
- Throws:
XGBoostError
- native error
-
getScore
Get the feature importances for gain or cover (average or total), with feature names.
- Returns:
- featureImportanceMap - key: feature name, value: feature importance score based on gain or cover
- Throws:
XGBoostError
- native error
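A sketch of the two importance views: split counts from getFeatureScore and gain-based scores from getScore. The booster argument is a trained Booster, the importance-type string "gain" is one of the types listed in the nested class summary, and the empty feature-map string is assumed to fall back to default feature names:

    import java.util.Map;
    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class ImportanceExample {
      static void showImportance(Booster booster) throws XGBoostError {
        // Number of splits per feature (weight-based importance).
        Map<String, Integer> splitCounts = booster.getFeatureScore("");

        // Average information gain per feature.
        Map<String, Double> gain = booster.getScore("", "gain");

        for (Map.Entry<String, Double> e : gain.entrySet()) {
          System.out.println(e.getKey() + " gain=" + e.getValue()
              + " splits=" + splitCounts.get(e.getKey()));
        }
      }
    }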
-
getFeatureImportanceFromModel
private Map<String,Double> getFeatureImportanceFromModel(String[] modelInfos, String importanceType) throws XGBoostError
Get the importance of each feature based on information gain or cover.
- Returns:
- featureImportanceMap - key: feature index, value: feature importance score based on information gain or cover
- Throws:
XGBoostError
- native error
-
getDumpInfo
Get the dump of the model.
- Parameters:
withStats - Controls whether the split statistics are output.
- Returns:
- dumped model information
- Throws:
XGBoostError
- native error
-
getVersion
public int getVersion() -
setVersion
public void setVersion(int version) -
toByteArray
Save the model as a byte array representation. Writing these bytes to a file gives a format compatible with other xgboost bindings. If Java natively supports the target file API (e.g. HDFS), use toByteArray and write the byte array directly.
- Returns:
- the saved byte array.
- Throws:
XGBoostError
- native error
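A sketch of serializing a trained Booster to raw bytes, for example to store the model in a blob store or send it over the network; the booster argument is a trained Booster and the output path is a placeholder:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import ml.dmlc.xgboost4j.java.Booster;
    import ml.dmlc.xgboost4j.java.XGBoostError;

    public class ByteArrayExample {
      static void saveBytes(Booster booster) throws XGBoostError, IOException {
        // Raw model bytes, compatible with other xgboost bindings.
        byte[] model = booster.toByteArray();

        // Here the bytes are simply written to a local file (placeholder path);
        // the same bytes could go to HDFS, S3, a database, etc.
        Files.write(Paths.get("model.bin"), model);
      }
    }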
-
loadRabitCheckpoint
Load the booster model from the thread-local rabit checkpoint. This is only used in distributed training.
- Returns:
- the stored version number of the checkpoint.
- Throws:
XGBoostError
-
saveRabitCheckpoint
Save the booster model into the thread-local rabit checkpoint and increment the version. This is only used in distributed training.
- Throws:
XGBoostError
-
init
Internal initialization function.
- Parameters:
cacheMats - The cached DMatrices.
- Throws:
XGBoostError
-
dmatrixsToHandles
Transfer a DMatrix array to a handle array (used for native functions).
- Parameters:
dmatrixs - the DMatrix array
- Returns:
- handle array for input dmatrixs
-
writeObject
- Throws:
IOException
-
readObject
- Throws:
IOException
ClassNotFoundException
-
finalize
-
dispose
public void dispose() -
write
public void write(com.esotericsoftware.kryo.Kryo kryo, com.esotericsoftware.kryo.io.Output output)
- Specified by:
write in interface com.esotericsoftware.kryo.KryoSerializable
-
read
public void read(com.esotericsoftware.kryo.Kryo kryo, com.esotericsoftware.kryo.io.Input input)
- Specified by:
read in interface com.esotericsoftware.kryo.KryoSerializable
-