dsci_310_group_11_pkg.preprocess
Module Contents
Functions
- dsci_310_group_11_pkg.preprocess.preprocessor(df, tort)
DESCRIPTION: Splits the dataset in ‘df’ (a dataframe) into training and testing data, and generates a ‘target’ column for the ML model to classify, derived from the quality value of each example.
INPUTS: df - a dataframe object that contains the entire dataset to be split.
tort - a binary value (0 or 1) that specifies whether to return the training or the testing dataframe.
ACTION: Splits ‘df’ into training and testing data, then uses np.where to fill the target column: 0 for examples that have quality < 5, and 1 for examples that have quality > 5.
RETURNS: If tort is 0, returns the training data; otherwise returns the testing data.
TODO: Modularize param_grid values
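EXAMPLE: The behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the package's actual implementation: the name preprocessor_sketch, the test_frac and seed parameters, and the pandas-based random split are all assumptions made for the example. The documentation does not say which class a quality of exactly 5 receives, so the sketch assigns it 0, which is also an assumption.

```python
import numpy as np
import pandas as pd


def preprocessor_sketch(df, tort, test_frac=0.2, seed=123):
    """Hypothetical stand-in for preprocessor(df, tort); not the package code."""
    # Assumed split: hold out a random test_frac of the rows as testing data.
    test_df = df.sample(frac=test_frac, random_state=seed)
    train_df = df.drop(test_df.index)

    # tort == 0 selects the training data; any other value selects the testing data.
    part = (train_df if tort == 0 else test_df).copy()

    # target is 1 when quality > 5 and 0 otherwise.
    # Assumption: a quality of exactly 5 falls into class 0.
    part["target"] = np.where(part["quality"] > 5, 1, 0)
    return part


# Toy data with only a quality column, purely for illustration.
wine = pd.DataFrame({"quality": [3, 4, 5, 6, 7, 8, 4, 6, 7, 3]})
train = preprocessor_sketch(wine, 0)
test = preprocessor_sketch(wine, 1)
print(len(train), len(test))
```

Because the sketch fixes the seed, calling it once with tort=0 and once with tort=1 yields two non-overlapping parts of the same split. The real function may split differently.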