dsci_310_group_11_pkg.grapher

Module Contents

Functions

correlation_table(df)

DESCRIPTION: Displays a correlation table (correlation coefficient value of each variable to each other variable).

bar_chart(df)

DESCRIPTION: Displays a simple bar chart of the count of the quality variable.

class_report(pipe, X_test, y_test)

DESCRIPTION: Displays a heatmap of the count of the predicted case.

vis_tree(X_train, y_train)

DESCRIPTION: Displays a visual example of a decision tree for conceptual purposes.

compare_scores(lst)

DESCRIPTION: Displays a bar chart comparing the accuracy scores of each ML model in the ‘lst’ list.

show_coefficients(pipe, X_train)

DESCRIPTION: Displays a dataframe with the coefficients of the Logistic Regression model.

show_correct(pipe, X_test, y_test)

DESCRIPTION: Displays a dataframe with the True Positive + True Negative versus the False Positive + False Negative ratio of the classifier model.

dsci_310_group_11_pkg.grapher.correlation_table(df)

DESCRIPTION: Displays a correlation table (correlation coefficient value of each variable to each other variable).

INPUTS: df - A dataframe object containing prediction features.

ACTION: Inputs a dataframe and displays the correlation coefficients in a square grid.

RETURNS: The table as a display.
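As a rough illustration of the square grid described above, the following sketch computes pairwise correlation coefficients with pandas. It is not the package's implementation; the dataframe and its column names are hypothetical, and the package may style or display the table differently.

```python
import pandas as pd

# Hypothetical feature data; the column names are illustrative only.
df = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0],
    "pH": [3.51, 3.20, 3.26, 3.16, 3.30],
    "sulphates": [0.56, 0.68, 0.65, 0.58, 0.75],
})

# Pairwise Pearson correlation of every column with every other column.
# The result is a square dataframe indexed by column name on both axes.
corr = df.corr()
print(corr.round(2))
```

Each diagonal entry is 1 because every variable correlates perfectly with itself.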

dsci_310_group_11_pkg.grapher.bar_chart(df)

DESCRIPTION: Displays a simple bar chart of the count of the quality variable.

ACTION: Inputs a dataframe and displays the bar chart.

INPUTS: df - A dataframe object containing the quality variable.

RETURNS: The bar chart as a display.

TODO: 1. Modularize the variables that you can input into the chart. 2. Move from altair to matplotlib.
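The counts that such a bar chart encodes can be sketched with pandas alone. This is only an illustration under assumptions: the column is assumed to be named "quality" and the sample values are made up; the plotting step itself is omitted.

```python
import pandas as pd

# Hypothetical data; only the "quality" column matters for this chart.
df = pd.DataFrame({"quality": [5, 6, 5, 7, 6, 5]})

# Number of rows per quality value, ordered by quality.
# Each entry corresponds to the height of one bar.
counts = df["quality"].value_counts().sort_index()
print(counts)
```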

dsci_310_group_11_pkg.grapher.class_report(pipe, X_test, y_test)

DESCRIPTION: Displays a heatmap of the count of the predicted case.

ACTION: Inputs a model and testing data, and displays the heatmap.

INPUTS: pipe - a model

X_test - testing features data

y_test - testing label data

RETURNS: The heatmap as a figure.

dsci_310_group_11_pkg.grapher.vis_tree(X_train, y_train)

DESCRIPTION: Displays a visual example of a decision tree for conceptual purposes. The max_depth variable is limited to 3 so that the visualization is interpretable.

INPUTS: X_train - a dataframe object containing prediction features

y_train - a series object containing target variables.

ACTION: Inputs an X_train dataframe and y_train series and displays the decision tree model and each of its chosen parameter splits.

RETURNS: The decision tree model as a display.
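To show what limiting max_depth to 3 looks like in practice, here is a minimal sketch using scikit-learn. It prints the splits as text with export_text rather than drawing them, which is a substitution made to keep the example dependency-light; the training data and column names are hypothetical, and the package's own rendering may differ.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical training data; feature names and labels are illustrative.
X_train = pd.DataFrame({
    "alcohol": [9.0, 9.5, 10.0, 11.0, 12.0, 12.5],
    "pH": [3.5, 3.4, 3.3, 3.2, 3.1, 3.0],
})
y_train = pd.Series([0, 0, 0, 1, 1, 1], name="target")

# Cap the depth at 3 so the resulting tree stays small enough to read.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

# Text rendering of each chosen split (a stand-in for a graphical display).
print(export_text(tree, feature_names=list(X_train.columns)))
```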

dsci_310_group_11_pkg.grapher.compare_scores(lst)

DESCRIPTION: Displays a bar chart comparing the accuracy scores of each ML model in the ‘lst’ list.

INPUTS: lst - a list of floats (accuracy scores) of each model.

ACTION: Inputs a list (lst) of ML model accuracy scores, generates a dataframe named ‘report’ and turns this dataframe into a bar chart.

RETURNS: The bar chart where the highlighted bar is the highest score.
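The dataframe step described above might look roughly like the following sketch. The name ‘report’ comes from the description; the model names, column names, and the boolean flag used to mark the highest score are assumptions made for illustration, and the charting step is left out.

```python
import pandas as pd

# Accuracy scores, one per model (made-up values).
lst = [0.71, 0.78, 0.74]

# Hypothetical labels; the package may name or order models differently.
model_names = ["Model A", "Model B", "Model C"]

# Build the 'report' dataframe that a bar chart could be drawn from.
report = pd.DataFrame({"model": model_names, "accuracy": lst})

# Flag the best score so a chart can highlight that bar.
report["highest"] = report["accuracy"] == report["accuracy"].max()
print(report)
```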

dsci_310_group_11_pkg.grapher.show_coefficients(pipe, X_train)

DESCRIPTION: Displays a dataframe with the coefficients of the Logistic Regression model.

INPUTS: pipe - a pipeline object containing scikit-learn model transformers, and a scikit-learn model.

X_train - a dataframe object containing prediction features.

ACTION: Inputs a LogisticRegression model and an X_train dataset. Collects the pipe variables from the logistic regression's named_steps into an array called ‘flatten’, builds the dataframe ‘coeffs’ listing the model’s features against their coefficients, and sorts the values in descending order.

RETURNS: The dataframe, sorted in descending order by coefficient value.

Prints the coefficients of the regression model to show which values influence the model.
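One way to produce a features-versus-coefficients dataframe from a scikit-learn pipeline is sketched below. The names ‘flatten’ and ‘coeffs’ follow the description, but treating ‘flatten’ as the flattened coefficient array is an assumption, as are the step names, the scaler, and the training data.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical training data for a binary target.
X_train = pd.DataFrame({
    "alcohol": [9.0, 9.5, 10.0, 11.0, 12.0, 12.5],
    "pH": [3.5, 3.1, 3.4, 3.0, 3.3, 3.2],
})
y_train = pd.Series([0, 0, 0, 1, 1, 1])

# Step names here are assumptions; the package's pipeline may differ.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("logisticregression", LogisticRegression()),
])
pipe.fit(X_train, y_train)

# For a binary model coef_ has shape (1, n_features); flatten it to 1-D.
flatten = pipe.named_steps["logisticregression"].coef_.flatten()

# Pair each feature with its coefficient, largest coefficient first.
coeffs = (
    pd.DataFrame({"feature": X_train.columns, "coefficient": flatten})
    .sort_values("coefficient", ascending=False)
    .reset_index(drop=True)
)
print(coeffs)
```

Because the features are standardized in this sketch, the coefficient magnitudes are comparable across features; without scaling they would depend on each feature's units.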

dsci_310_group_11_pkg.grapher.show_correct(pipe, X_test, y_test)

DESCRIPTION: Displays a dataframe with the True Positive + True Negative versus the False Positive + False Negative ratio of the classifier model.

INPUTS: pipe - a pipeline object containing scikit-learn model transformers, and a scikit-learn model.

X_test - a dataframe object containing prediction features.

y_test - a series object containing target variables.

ACTION: Inputs a model (pipe), and testing data; calls predict on the test data and reports the correct classifications versus the incorrect classifications.

RETURNS: A dataframe with the correct classifications versus incorrect classifications.
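The predict-and-compare step can be sketched as follows. This is an illustration only: the pipeline, data, and the layout of the resulting dataframe (row labels and the "count" column) are assumptions, and the package may report the result in a different shape.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical training and testing data for a binary target.
X_train = pd.DataFrame({"alcohol": [9.0, 9.5, 10.0, 11.0, 12.0, 12.5]})
y_train = pd.Series([0, 0, 0, 1, 1, 1])
X_test = pd.DataFrame({"alcohol": [9.2, 10.4, 11.5, 12.2]})
y_test = pd.Series([0, 1, 1, 1])

# Any fitted classifier pipeline would do; this one is an assumption.
pipe = Pipeline([("scaler", StandardScaler()), ("model", LogisticRegression())])
pipe.fit(X_train, y_train)

# Compare predictions with the true labels element by element.
predictions = pipe.predict(X_test)
n_correct = int((predictions == y_test.to_numpy()).sum())  # TP + TN
n_incorrect = len(y_test) - n_correct                      # FP + FN

result = pd.DataFrame(
    {"count": [n_correct, n_incorrect]},
    index=["correct", "incorrect"],
)
print(result)
```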