fit
- TimeSeriesCloudPredictor.fit(train_data: str | Path | DataFrame | None = None, *, predictor_init_args: Dict[str, Any], predictor_fit_args: Dict[str, Any] | None = None, tuning_data: str | Path | DataFrame | None = None, static_features: str | Path | DataFrame | None = None, id_column: str = 'item_id', timestamp_column: str = 'timestamp', framework_version: str = 'latest', job_name: str | None = None, instance_type: str = 'ml.m5.2xlarge', instance_count: int = 1, volume_size: int = 100, custom_image_uri: str | None = None, wait: bool = True, backend_kwargs: Dict | None = None, known_covariates: str | Path | DataFrame | None = None) → TimeSeriesCloudPredictor
Fit the predictor with SageMaker. This method first uploads the necessary configuration and training data to an S3 bucket, then launches a SageMaker training job with the AutoGluon training container.
- Parameters:
train_data (Union[str, pathlib.Path, pd.DataFrame]) – Training time series in long format, as a DataFrame or local/S3 path to a data file. See the TimeSeriesPredictor.fit docs for the expected format.
predictor_init_args (dict) – Arguments forwarded to TimeSeriesPredictor(). See the TimeSeriesPredictor docs for available options (e.g. target, prediction_length, freq, eval_metric, quantile_levels, known_covariates_names).
predictor_fit_args (Optional[dict], default = None) – Additional fit args forwarded to TimeSeriesPredictor.fit(). See the TimeSeriesPredictor.fit docs for available options. Must NOT contain train_data or tuning_data; pass those as explicit arguments above.
tuning_data (Optional[Union[str, pathlib.Path, pd.DataFrame]], default = None) – Optional tuning data in long format, as a DataFrame or local/S3 path to a data file.
static_features (Optional[Union[str, pathlib.Path, pd.DataFrame]], default = None) – Static (time-independent) features describing each individual time series.
id_column (str, default = "item_id") – Name of the column with the unique identifier of each time series (item).
timestamp_column (str, default = "timestamp") – Name of the column with the observation timestamps.
framework_version (str, default = 'latest') – Version of the AutoGluon training container. If 'latest', the latest available container version is used; otherwise the specified version is used. Ignored if custom_image_uri is set.
job_name (Optional[str], default = None) – Name of the launched training job. If None, CloudPredictor creates one with the prefix ag-cloudpredictor.
instance_type (str, default = 'ml.m5.2xlarge') – Instance type the predictor will be trained on with SageMaker.
instance_count (int, default = 1) – Number of instances used to fit the predictor.
volume_size (int, default = 100) – Size in GB of the EBS volume used to store input data during training. Must be large enough to hold the training data if File Mode is used (which is the default).
wait (bool, default = True) – Whether the call should wait until the job completes. Note that the function does not return immediately in either case, because some preparation is needed before fitting starts. Use get_fit_job_status to get the job status.
backend_kwargs (dict, default = None) –
Extra arguments to pass to the underlying backend. For the SageMaker backend, valid keys are:
- autogluon_sagemaker_estimator_kwargs
Extra arguments used to initialize AutoGluonSagemakerEstimator. See https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html#sagemaker.estimator.Estimator for all options.
- fit_kwargs
Extra arguments passed to fit. See https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html#sagemaker.estimator.Estimator.fit for all options.
- Return type:
TimeSeriesCloudPredictor object. Returns self.
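Example

The following is a minimal sketch of how the arguments above might be assembled, not an official usage pattern. The helper build_fit_kwargs, the S3 paths, and the column name "sales" are hypothetical and exist only for illustration. The sketch assumes that time_limit is accepted by TimeSeriesPredictor.fit() and that max_run is accepted by the SageMaker Estimator. It also enforces the rule that predictor_fit_args must not contain train_data or tuning_data. The actual fit call is left commented out because it requires AWS credentials and launches a billable training job.

```python
from typing import Any, Dict, Optional

# Keys that must be passed to fit() directly, never inside predictor_fit_args.
RESERVED_FIT_KEYS = {"train_data", "tuning_data"}


def build_fit_kwargs(
    train_data: str,
    target: str,
    prediction_length: int,
    predictor_fit_args: Optional[Dict[str, Any]] = None,
    max_run_seconds: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for TimeSeriesCloudPredictor.fit (hypothetical helper)."""
    fit_args = dict(predictor_fit_args or {})
    clashes = RESERVED_FIT_KEYS.intersection(fit_args)
    if clashes:
        raise ValueError(
            f"Pass {sorted(clashes)} to fit() directly, not via predictor_fit_args"
        )

    kwargs: Dict[str, Any] = {
        "train_data": train_data,
        # Forwarded to TimeSeriesPredictor().
        "predictor_init_args": {
            "target": target,
            "prediction_length": prediction_length,
        },
        # Forwarded to TimeSeriesPredictor.fit().
        "predictor_fit_args": fit_args,
        "instance_type": "ml.m5.2xlarge",
        "wait": False,
    }
    if max_run_seconds is not None:
        # Assumption: max_run is a SageMaker Estimator option that caps job runtime.
        kwargs["backend_kwargs"] = {
            "autogluon_sagemaker_estimator_kwargs": {"max_run": max_run_seconds}
        }
    return kwargs


fit_kwargs = build_fit_kwargs(
    "s3://my-bucket/train.csv",  # hypothetical S3 path
    target="sales",  # hypothetical target column
    prediction_length=7,
    predictor_fit_args={"time_limit": 600},
    max_run_seconds=3600,
)

# Launching the job needs AWS credentials, so the call is left commented out:
# from autogluon.cloud import TimeSeriesCloudPredictor
# predictor = TimeSeriesCloudPredictor(cloud_output_path="s3://my-bucket/output")
# predictor.fit(**fit_kwargs)
# print(predictor.get_fit_job_status())
```

Because wait is set to False in this sketch, the job status would be checked afterwards with get_fit_job_status rather than by blocking on fit.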