Runs a series of data quality checks on a data frame, covering data structure, missing values, data accuracy, negative values, outliers, sample size, duplicate rows, and duplicate columns.

check_data_quality(df)

Arguments

df

A data frame to check.

Value

A message indicating the results of each data quality check.

Examples

df <- data.frame(w = c(7, 8, 180, 7), x = c("a", "b", "c", "a"),
                 y = c(4, NA, -6, 4), z = c(7, 8, 180, 7))

# Check the data quality of the example data frame
check_data_quality(df)
#> Number of rows: 4
#> 
#> Number of columns: 4
#> 
#> Column names: w, x, y, z
#> 
#> Column data types: numeric, character, numeric, numeric
#> 
#> Number of missing values: 1
#> 
#> Missing values found in the following columns:
#> y: 1
#> 
#> Data frame contains negative values.
#> 
#> Extreme values found in the numerical columns of the data.
#> 
#> Missing values detected in the data frame.
#> 
#> Duplicate rows found:
#> Row 4 is a duplicate of row 1
#> 
#> Duplicate columns found:
#> Column 'z' is a duplicate of column 'w'
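
Several of the findings reported above can be cross-checked by hand with base R. The following is a minimal sketch, not the implementation of check_data_quality(): the helper iqr_outliers() is hypothetical, and the 1.5 * IQR fence it uses is an assumed outlier rule, since this page does not say which rule the function applies.

```r
df <- data.frame(w = c(7, 8, 180, 7), x = c("a", "b", "c", "a"),
                 y = c(4, NA, -6, 4), z = c(7, 8, 180, 7))

# Missing values per column
na_counts <- colSums(is.na(df))

# Negative values in the numeric columns
num_cols <- vapply(df, is.numeric, logical(1))
has_negative <- any(df[num_cols] < 0, na.rm = TRUE)

# Hypothetical helper: values outside the 1.5 * IQR fences (assumed rule)
iqr_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- 1.5 * (q[2] - q[1])
  x[!is.na(x) & (x < q[1] - fence | x > q[2] + fence)]
}
outliers <- lapply(df[num_cols], iqr_outliers)

# Rows that repeat an earlier row
dup_rows <- which(duplicated(df))

# Columns whose values repeat an earlier column
dup_cols <- names(df)[duplicated(as.list(df))]
```

On the example data, dup_rows and dup_cols pick out row 4 and column z, which agrees with the duplicate messages above. A different outlier rule could flag different values than the function does.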