preprocess
Preprocess the database. The subroutine below summarises the
whole procedure. Note that the return types of the functions
are not fixed here; they are determined entirely by the module
phs.datasetFunctions, where the four functions are defined.
dataRaw = readDataFromZipArchive(archive, archive_fname)
dataClean = sanitiseData(dataRaw)
xTrainVal, yTrainVal, xTest, yTest = splitData(dataClean)
saveData(data_dir, xTrainVal, yTrainVal, xTest, yTest)
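The four calls above can be sketched end to end as follows. This is only a minimal illustration and not the actual phs.datasetFunctions code: the function bodies are hypothetical stand-ins. It assumes the archive member is a CSV file with a header row, that the target column is named "label", that cleaning means dropping rows with empty fields, and that the splits are written as JSON files. None of these details are stated by the source.

```python
import csv
import io
import json
import os
import random
import zipfile


def readDataFromZipArchive(archive, archive_fname):
    """Stand-in: read one CSV member of a zip archive into a list of row dicts."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(archive_fname) as fh:
            text = io.TextIOWrapper(fh, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


def sanitiseData(dataRaw):
    """Stand-in: drop every row that has a missing or empty field."""
    return [row for row in dataRaw
            if all(value not in (None, "") for value in row.values())]


def splitData(dataClean, label="label", test_fraction=0.2, seed=0):
    """Stand-in: shuffle reproducibly, then split into train/val and test parts."""
    rows = list(dataClean)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    test, train_val = rows[:n_test], rows[n_test:]

    def features(row):
        # Everything except the (assumed) label column counts as a feature.
        return {key: value for key, value in row.items() if key != label}

    xTrainVal = [features(row) for row in train_val]
    yTrainVal = [row[label] for row in train_val]
    xTest = [features(row) for row in test]
    yTest = [row[label] for row in test]
    return xTrainVal, yTrainVal, xTest, yTest


def saveData(data_dir, xTrainVal, yTrainVal, xTest, yTest):
    """Stand-in: write each split to <data_dir>/<name>.json."""
    os.makedirs(data_dir, exist_ok=True)
    splits = {"xTrainVal": xTrainVal, "yTrainVal": yTrainVal,
              "xTest": xTest, "yTest": yTest}
    for name, split in splits.items():
        path = os.path.join(data_dir, name + ".json")
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(split, fh)


def preprocess(archive, archive_fname, data_dir):
    """Chain the four steps in the order given in the subroutine."""
    dataRaw = readDataFromZipArchive(archive, archive_fname)
    dataClean = sanitiseData(dataRaw)
    xTrainVal, yTrainVal, xTest, yTest = splitData(dataClean)
    saveData(data_dir, xTrainVal, yTrainVal, xTest, yTest)
```

Since each step only consumes the output of the previous one, the intermediate types can change (for example to data frames or arrays) without touching the order of the calls, which matches the note that the types are left to the defining module.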
Usage:
Options: