Class Booster

java.lang.Object
ml.dmlc.xgboost4j.java.Booster
All Implemented Interfaces:
com.esotericsoftware.kryo.KryoSerializable, Serializable

public class Booster extends Object implements Serializable, com.esotericsoftware.kryo.KryoSerializable
Booster for XGBoost. This is a model API that supports interactive building of an XGBoost model.
  • Field Details

    • logger

      private static final org.apache.commons.logging.Log logger
    • handle

      private long handle
    • version

      private int version
  • Constructor Details

    • Booster

      Booster(Map<String,Object> params, DMatrix[] cacheMats) throws XGBoostError
      Create a new Booster with an empty stage.
      Parameters:
      params - Model parameters
      cacheMats - Cached DMatrix entries; prediction on these DMatrices will be faster than on non-cached data.
      Throws:
      XGBoostError - native error
  • Method Details

    • loadModel

      static Booster loadModel(String modelPath) throws XGBoostError
      Load a new Booster model from modelPath
      Parameters:
      modelPath - The path to the model.
      Returns:
      The created Booster.
      Throws:
      XGBoostError
    • loadModel

      static Booster loadModel(InputStream in) throws XGBoostError, IOException
      Load a new Booster model from a file opened as an input stream. The input stream is assumed to contain only one XGBoost model. This can be used to load existing Booster models saved by other XGBoost bindings.
      Parameters:
      in - The input stream of the file.
      Returns:
      The created Booster.
      Throws:
      XGBoostError
      IOException
    • setParam

      public final void setParam(String key, Object value) throws XGBoostError
      Set parameter to the Booster.
      Parameters:
      key - param name
      value - param value
      Throws:
      XGBoostError - native error
    • setParams

      public void setParams(Map<String,Object> params) throws XGBoostError
      Set parameters to the Booster.
      Parameters:
      params - parameters key-value map
      Throws:
      XGBoostError - native error
    • getAttrs

      public final Map<String,String> getAttrs() throws XGBoostError
      Get attributes stored in the Booster as a Map.
      Returns:
      A map containing attribute pairs.
      Throws:
      XGBoostError - native error
    • getAttr

      public final String getAttr(String key) throws XGBoostError
      Get attribute from the Booster.
      Parameters:
      key - attribute key
      Returns:
      attribute value
      Throws:
      XGBoostError - native error
    • setAttr

      public final void setAttr(String key, String value) throws XGBoostError
      Set attribute to the Booster.
      Parameters:
      key - attribute key
      value - attribute value
      Throws:
      XGBoostError - native error
    • setAttrs

      public void setAttrs(Map<String,String> attrs) throws XGBoostError
      Set attributes to the Booster.
      Parameters:
      attrs - attributes key-value map
      Throws:
      XGBoostError - native error
    • update

      public void update(DMatrix dtrain, int iter) throws XGBoostError
      Update the booster for one iteration.
      Parameters:
      dtrain - training data
      iter - current iteration number
      Throws:
      XGBoostError - native error
    • update

      public void update(DMatrix dtrain, IObjective obj) throws XGBoostError
      Update with a customized objective function.
      Parameters:
      dtrain - training data
      obj - customized objective class
      Throws:
      XGBoostError - native error
    • boost

      public void boost(DMatrix dtrain, float[] grad, float[] hess) throws XGBoostError
      Update with the given gradient and hessian.
      Parameters:
      dtrain - training data
      grad - first-order gradient
      hess - second-order gradient
      Throws:
      XGBoostError - native error
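      As an illustration of what grad and hess might contain, the sketch below computes them for binary logistic loss, assuming the predictions are raw margins and the labels are 0 or 1. The class LogisticGradients and its methods are hypothetical helpers, not part of xgboost4j. Each returned array has one entry per training row, which is assumed to be the layout boost expects for a single-output model.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of xgboost4j): gradient statistics for binary logistic loss. */
final class LogisticGradients {

    /** Logistic sigmoid, mapping a raw margin to a probability. */
    static float sigmoid(float margin) {
        return (float) (1.0 / (1.0 + Math.exp(-margin)));
    }

    /** First-order gradient of the log loss with respect to the margin: p - y. */
    static float[] grad(float[] margins, float[] labels) {
        float[] g = new float[margins.length];
        for (int i = 0; i < margins.length; i++) {
            g[i] = sigmoid(margins[i]) - labels[i];
        }
        return g;
    }

    /** Second-order gradient of the log loss with respect to the margin: p * (1 - p). */
    static float[] hess(float[] margins) {
        float[] h = new float[margins.length];
        for (int i = 0; i < margins.length; i++) {
            float p = sigmoid(margins[i]);
            h[i] = p * (1.0f - p);
        }
        return h;
    }

    public static void main(String[] args) {
        // Illustrative values only: two rows with margins 0 and 2, labels 1 and 0.
        float[] margins = {0.0f, 2.0f};
        float[] labels = {1.0f, 0.0f};
        System.out.println(Arrays.toString(grad(margins, labels)));
        System.out.println(Arrays.toString(hess(margins)));
    }
}
```

      Arrays produced this way could be passed as the grad and hess arguments, provided the margins were obtained for the same rows as dtrain.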
    • evalSet

      public String evalSet(DMatrix[] evalMatrixs, String[] evalNames, int iter) throws XGBoostError
      Evaluate with the given DMatrix array.
      Parameters:
      evalMatrixs - DMatrix array for evaluation
      evalNames - names of the evaluation DMatrix entries, used to label the results
      iter - current eval iteration
      Returns:
      eval information
      Throws:
      XGBoostError - native error
    • evalSet

      public String evalSet(DMatrix[] evalMatrixs, String[] evalNames, int iter, float[] metricsOut) throws XGBoostError
      Evaluate with the given DMatrix array.
      Parameters:
      evalMatrixs - DMatrix array for evaluation
      evalNames - names of the evaluation DMatrix entries, used to label the results
      iter - current eval iteration
      metricsOut - output array containing the evaluation metrics for each evalMatrix
      Returns:
      eval information
      Throws:
      XGBoostError - native error
    • evalSet

      public String evalSet(DMatrix[] evalMatrixs, String[] evalNames, IEvaluation eval) throws XGBoostError
      Evaluate with the given customized evaluation class.
      Parameters:
      evalMatrixs - evaluation matrix
      evalNames - evaluation names
      eval - custom evaluator
      Returns:
      eval information
      Throws:
      XGBoostError - native error
    • evalSet

      public String evalSet(DMatrix[] evalMatrixs, String[] evalNames, IEvaluation eval, float[] metricsOut) throws XGBoostError
      Throws:
      XGBoostError
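      The evalSet overloads return the evaluation information as a single String. The sketch below shows one way to pull the metric values out of such a string. It assumes a whitespace-separated layout such as "[3] train-rmse:0.5 test-rmse:0.25" (an iteration tag followed by name-metric:value tokens); that layout is an assumption about the native output, not something this page specifies. EvalInfoParser is a hypothetical helper.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of xgboost4j): parses an evaluation string into metrics. */
final class EvalInfoParser {

    /**
     * Splits the string on whitespace and reads every "name:value" token.
     * Tokens without a colon, such as a leading "[3]" iteration tag, are skipped.
     */
    static Map<String, Float> parse(String evalInfo) {
        Map<String, Float> metrics = new LinkedHashMap<>();
        for (String token : evalInfo.trim().split("\\s+")) {
            int colon = token.lastIndexOf(':');
            if (colon <= 0) {
                continue;
            }
            String name = token.substring(0, colon);
            float value = Float.parseFloat(token.substring(colon + 1));
            metrics.put(name, value);
        }
        return metrics;
    }

    public static void main(String[] args) {
        // Made-up sample string in the assumed layout.
        String sample = "[3]\ttrain-rmse:0.5\ttest-rmse:0.25";
        System.out.println(parse(sample));
    }
}
```

      When the values are needed directly, the overloads that take a metricsOut array avoid string parsing altogether.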
    • predict

      private float[][] predict(DMatrix data, boolean outputMargin, int treeLimit, boolean predLeaf, boolean predContribs) throws XGBoostError
      Advanced predict function with all the options.
      Parameters:
      data - data
      outputMargin - whether to output the raw, untransformed margin value
      treeLimit - limit on the number of trees; 0 means all trees.
      predLeaf - whether to output the predicted leaf indices
      predContribs - whether to output feature contributions
      Returns:
      predict results
      Throws:
      XGBoostError
    • predictLeaf

      public float[][] predictLeaf(DMatrix data, int treeLimit) throws XGBoostError
      Predict leaf indices given the data
      Parameters:
      data - The input data.
      treeLimit - Number of trees to include, 0 means all trees.
      Returns:
      The leaf indices of the instance.
      Throws:
      XGBoostError
    • predictContrib

      public float[][] predictContrib(DMatrix data, int treeLimit) throws XGBoostError
      Output feature contributions toward predictions of given data
      Parameters:
      data - The input data.
      treeLimit - Number of trees to include, 0 means all trees.
      Returns:
      The feature contributions and bias.
      Throws:
      XGBoostError
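      To make the "feature contributions and bias" layout concrete, the sketch below works on a single row of the returned array. It assumes, as in other XGBoost bindings, that each row holds one contribution per feature followed by a bias term in the last position, and that the entries sum to the raw margin prediction for that row. ContributionRows is a hypothetical helper.

```java
/** Hypothetical helper (not part of xgboost4j): inspects one row of contribution output. */
final class ContributionRows {

    /** Sums all entries of the row; under the stated assumption this is the raw margin. */
    static float rawMargin(float[] row) {
        float sum = 0.0f;
        for (float value : row) {
            sum += value;
        }
        return sum;
    }

    /**
     * Returns the index of the feature with the largest absolute contribution.
     * The last entry is treated as the bias and ignored; the row must contain
     * at least one feature entry besides the bias.
     */
    static int strongestFeature(float[] row) {
        int best = 0;
        for (int i = 1; i < row.length - 1; i++) {
            if (Math.abs(row[i]) > Math.abs(row[best])) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up row: two feature contributions followed by the bias.
        float[] row = {0.5f, -0.75f, 1.0f};
        System.out.println("margin = " + rawMargin(row));
        System.out.println("strongest feature index = " + strongestFeature(row));
    }
}
```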
    • predict

      public float[][] predict(DMatrix data) throws XGBoostError
      Predict with data
      Parameters:
      data - dmatrix storing the input
      Returns:
      predict result
      Throws:
      XGBoostError - native error
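      The result is a two-dimensional array with one row per input instance. As a sketch of post-processing, the helper below turns each row into a class label by taking the index of the largest value. This assumes a multi-class model whose rows contain one score per class (for example with a softprob-style objective); PredictionLabels is a hypothetical helper, not part of xgboost4j.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of xgboost4j): converts per-class scores to labels. */
final class PredictionLabels {

    /** Returns, for every row, the column index holding the largest score. */
    static int[] argmaxPerRow(float[][] predictions) {
        int[] labels = new int[predictions.length];
        for (int row = 0; row < predictions.length; row++) {
            int best = 0;
            for (int col = 1; col < predictions[row].length; col++) {
                if (predictions[row][col] > predictions[row][best]) {
                    best = col;
                }
            }
            labels[row] = best;
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up scores for two instances and three classes.
        float[][] predictions = {
            {0.1f, 0.7f, 0.2f},
            {0.6f, 0.3f, 0.1f}
        };
        System.out.println(Arrays.toString(argmaxPerRow(predictions)));
    }
}
```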
    • predict

      public float[][] predict(DMatrix data, boolean outputMargin) throws XGBoostError
      Predict with data
      Parameters:
      data - data
      outputMargin - whether to output the raw, untransformed margin value
      Returns:
      predict results
      Throws:
      XGBoostError
    • predict

      public float[][] predict(DMatrix data, boolean outputMargin, int treeLimit) throws XGBoostError
      Advanced predict function with all the options.
      Parameters:
      data - data
      outputMargin - whether to output the raw, untransformed margin value
      treeLimit - limit on the number of trees; 0 means all trees.
      Returns:
      predict results
      Throws:
      XGBoostError
    • saveModel

      public void saveModel(String modelPath) throws XGBoostError
      Save model to modelPath
      Parameters:
      modelPath - model path
      Throws:
      XGBoostError
    • saveModel

      public void saveModel(OutputStream out) throws XGBoostError, IOException
      Save the model to a file opened as an output stream. The model format is compatible with other XGBoost bindings. The output stream can only hold one XGBoost model. This function closes the OutputStream after saving.
      Parameters:
      out - The output stream
      Throws:
      XGBoostError
      IOException
    • getModelDump

      public String[] getModelDump(String featureMap, boolean withStats) throws XGBoostError
      Get the dump of the model as a string array
      Parameters:
      withStats - Controls whether the split statistics are output.
      Returns:
      dumped model information
      Throws:
      XGBoostError - native error
    • getModelDump

      public String[] getModelDump(String featureMap, boolean withStats, String format) throws XGBoostError
      Throws:
      XGBoostError
    • getModelDump

      public String[] getModelDump(String[] featureNames, boolean withStats) throws XGBoostError
      Get the dump of the model as a string array with specified feature names.
      Parameters:
      featureNames - Names of the features.
      Returns:
      dumped model information
      Throws:
      XGBoostError
    • getModelDump

      public String[] getModelDump(String[] featureNames, boolean withStats, String format) throws XGBoostError
      Throws:
      XGBoostError
    • getFeatureScore

      public Map<String,Integer> getFeatureScore(String[] featureNames) throws XGBoostError
      Get importance of each feature with specified feature names.
      Returns:
      featureScoreMap key: feature name, value: feature importance score, can be null.
      Throws:
      XGBoostError - native error
    • getFeatureScore

      public Map<String,Integer> getFeatureScore(String featureMap) throws XGBoostError
      Get importance of each feature
      Returns:
      featureScoreMap key: feature index, value: feature importance score, can be null.
      Throws:
      XGBoostError - native error
    • getFeatureWeightsFromModel

      private Map<String,Integer> getFeatureWeightsFromModel(String[] modelInfos) throws XGBoostError
      Get the importance of each feature based purely on weights (number of splits)
      Returns:
      featureScoreMap key: feature index, value: feature importance score based on weight
      Throws:
      XGBoostError - native error
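      Weight-based importance counts how often each feature is used in a split. A minimal sketch of that counting over dumped trees follows. It assumes the text dump marks every split as "[feature<threshold]", for example "0:[f0<0.5] yes=1,no=2,missing=1"; both the dump layout and the SplitCounter class are assumptions for illustration, not the actual implementation of this method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of xgboost4j): counts splits per feature in text dumps. */
final class SplitCounter {

    // Captures the feature name between '[' and '<' in a split such as "[f0<0.5]".
    private static final Pattern SPLIT = Pattern.compile("\\[([^<\\]]+)<");

    /** Returns a map from feature name to the number of splits that use it. */
    static Map<String, Integer> countSplits(String[] treeDumps) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tree : treeDumps) {
            Matcher matcher = SPLIT.matcher(tree);
            while (matcher.find()) {
                counts.merge(matcher.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up dumps in the assumed layout, one string per tree.
        String[] dumps = {
            "0:[f0<0.5] yes=1,no=2,missing=1\n\t1:leaf=0.1\n\t2:[f1<1.5] yes=3,no=4,missing=3\n"
                + "\t\t3:leaf=-0.2\n\t\t4:leaf=0.3\n",
            "0:[f0<2] yes=1,no=2,missing=1\n\t1:leaf=0.05\n\t2:leaf=-0.05\n"
        };
        System.out.println(countSplits(dumps));
    }
}
```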
    • getScore

      public Map<String,Double> getScore(String[] featureNames, String importanceType) throws XGBoostError
      Get the feature importances for gain or cover (average or total)
      Returns:
      featureImportanceMap key: feature index, values: feature importance score based on gain or cover
      Throws:
      XGBoostError - native error
    • getScore

      public Map<String,Double> getScore(String featureMap, String importanceType) throws XGBoostError
      Get the feature importances for gain or cover (average or total), with feature names
      Returns:
      featureImportanceMap key: feature name, values: feature importance score based on gain or cover
      Throws:
      XGBoostError - native error
    • getFeatureImportanceFromModel

      private Map<String,Double> getFeatureImportanceFromModel(String[] modelInfos, String importanceType) throws XGBoostError
      Get the importance of each feature based on information gain or cover
      Returns:
      featureImportanceMap key: feature index, value: feature importance score based on information gain or cover
      Throws:
      XGBoostError - native error
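      The difference between the total and average variants of gain can be sketched from a dump that includes split statistics. The code assumes each split line carries a "gain=" field after the "[feature<threshold]" marker, for example "0:[f0<0.5] yes=1,no=2,missing=1,gain=10.0,cover=4". The dump layout and the GainImportance class are illustrative assumptions rather than the method's real implementation; cover could be handled the same way by reading the "cover=" field.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of xgboost4j): total or average gain per feature. */
final class GainImportance {

    // Group 1: feature name of a split; group 2: the gain value on the same line.
    private static final Pattern SPLIT_WITH_GAIN =
        Pattern.compile("\\[([^<\\]]+)<[^\\n]*?gain=([-+0-9.eE]+)");

    /** Sums the gain per feature; if average is true, divides by the number of splits. */
    static Map<String, Double> gain(String[] treeDumps, boolean average) {
        Map<String, Double> total = new HashMap<>();
        Map<String, Integer> splits = new HashMap<>();
        for (String tree : treeDumps) {
            Matcher matcher = SPLIT_WITH_GAIN.matcher(tree);
            while (matcher.find()) {
                String feature = matcher.group(1);
                total.merge(feature, Double.parseDouble(matcher.group(2)), Double::sum);
                splits.merge(feature, 1, Integer::sum);
            }
        }
        if (average) {
            total.replaceAll((feature, sum) -> sum / splits.get(feature));
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up dumps in the assumed layout, one string per tree.
        String[] dumps = {
            "0:[f0<0.5] yes=1,no=2,missing=1,gain=10.0,cover=4\n\t1:leaf=0.1,cover=2\n"
                + "\t2:[f1<1.5] yes=3,no=4,missing=3,gain=3.0,cover=2\n",
            "0:[f0<2] yes=1,no=2,missing=1,gain=6.0,cover=4\n\t1:leaf=0.05,cover=2\n"
        };
        System.out.println("total:   " + gain(dumps, false));
        System.out.println("average: " + gain(dumps, true));
    }
}
```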
    • getDumpInfo

      private String[] getDumpInfo(boolean withStats) throws XGBoostError
      Get the dump of the model as a string array.
      Parameters:
      withStats - Controls whether the split statistics are output.
      Returns:
      dumped model information
      Throws:
      XGBoostError - native error
    • getVersion

      public int getVersion()
    • setVersion

      public void setVersion(int version)
    • toByteArray

      public byte[] toByteArray() throws XGBoostError
      Save the model as a byte array representation. Writing these bytes to a file gives a format compatible with other XGBoost bindings. If Java natively supports the HDFS file API, use toByteArray and write the resulting byte array.
      Returns:
      the saved byte array.
      Throws:
      XGBoostError - native error
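      Because toByteArray returns the model as a plain byte array, persisting it needs only standard file I/O. The sketch below writes such an array to a file and reads it back with java.nio.file.Files. The byte values are placeholders standing in for a real toByteArray result, and ModelBytesIo is a hypothetical helper, not part of xgboost4j.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper (not part of xgboost4j): stores and restores model bytes. */
final class ModelBytesIo {

    /** Writes the given bytes to the target file, replacing any existing content. */
    static void writeModel(Path target, byte[] modelBytes) throws IOException {
        Files.write(target, modelBytes);
    }

    /** Reads the whole file back into a byte array. */
    static byte[] readModel(Path source) throws IOException {
        return Files.readAllBytes(source);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes; in real use these would come from toByteArray().
        byte[] modelBytes = {1, 2, 3, 4};
        Path tmp = Files.createTempFile("booster", ".model");
        try {
            writeModel(tmp, modelBytes);
            byte[] restored = readModel(tmp);
            System.out.println("round trip ok: " + Arrays.equals(modelBytes, restored));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

      Bytes restored this way could be wrapped in a java.io.ByteArrayInputStream and handed to loadModel(InputStream).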
    • loadRabitCheckpoint

      int loadRabitCheckpoint() throws XGBoostError
      Load the booster model from thread-local rabit checkpoint. This is only used in distributed training.
      Returns:
      the stored version number of the checkpoint.
      Throws:
      XGBoostError
    • saveRabitCheckpoint

      void saveRabitCheckpoint() throws XGBoostError
      Save the booster model into thread-local rabit checkpoint and increment the version. This is only used in distributed training.
      Throws:
      XGBoostError
    • init

      private void init(DMatrix[] cacheMats) throws XGBoostError
      Internal initialization function.
      Parameters:
      cacheMats - The cached DMatrix.
      Throws:
      XGBoostError
    • dmatrixsToHandles

      private static long[] dmatrixsToHandles(DMatrix[] dmatrixs)
      Convert a DMatrix array to a handle array (used for native functions).
      Parameters:
      dmatrixs - the DMatrix array to convert
      Returns:
      handle array for input dmatrixs
    • writeObject

      private void writeObject(ObjectOutputStream out) throws IOException
      Throws:
      IOException
    • readObject

      private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException
      Throws:
      IOException
      ClassNotFoundException
    • finalize

      protected void finalize() throws Throwable
      Overrides:
      finalize in class Object
      Throws:
      Throwable
    • dispose

      public void dispose()
    • write

      public void write(com.esotericsoftware.kryo.Kryo kryo, com.esotericsoftware.kryo.io.Output output)
      Specified by:
      write in interface com.esotericsoftware.kryo.KryoSerializable
    • read

      public void read(com.esotericsoftware.kryo.Kryo kryo, com.esotericsoftware.kryo.io.Input input)
      Specified by:
      read in interface com.esotericsoftware.kryo.KryoSerializable