polimorfo.utils package
Submodules
polimorfo.utils.datautils module
polimorfo.utils.datautils.download_file(name: str, url: str, file_hash=None, extract: bool = True, cache_dir: str = '~/.polimorfo', cache_subdir: str = 'datasets') → List[pathlib.Path]
    Downloads a file from a URL if it is not already saved.

    Arguments:
        name {str} – the name of the file (e.g. )
        url {str} – the URL of the file, or the id of the file when it is downloaded from Google Drive
    Keyword Arguments:
        file_hash {str} – the hash of the downloaded file (default: {None})
        extract {bool} – try to extract the file (default: {True})
        cache_dir {str} – the default folder where the file is saved (default: {carambola.utils.datautils.CACHE_DIR})
        cache_subdir {str} – the subdir where the file is downloaded (default: {carambola.utils.datautils.CACHE_SUBDIR})
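The "download only if not already saved" behaviour described for download_file can be illustrated with a minimal sketch. This is not polimorfo's implementation: the helpers `cached_path` and `fetch_if_missing` are hypothetical names, and the layout `cache_dir/cache_subdir/name` is an assumption based only on the documented defaults.

```python
from pathlib import Path
from typing import Callable


def cached_path(name: str, cache_dir: str = "~/.polimorfo",
                cache_subdir: str = "datasets") -> Path:
    """Hypothetical helper: where a file called `name` would be cached.

    Assumes the layout <cache_dir>/<cache_subdir>/<name>.
    """
    return Path(cache_dir).expanduser() / cache_subdir / name


def fetch_if_missing(name: str, fetch: Callable[[Path], None],
                     cache_dir: str = "~/.polimorfo",
                     cache_subdir: str = "datasets") -> Path:
    """Call `fetch(dst)` only when the cached file does not exist yet."""
    dst = cached_path(name, cache_dir, cache_subdir)
    if not dst.exists():
        dst.parent.mkdir(parents=True, exist_ok=True)
        fetch(dst)  # the caller decides how the bytes are obtained
    return dst
```

Passing the download step in as a callable keeps the caching logic independent of the transport, so the same check could sit in front of an HTTP or a Google Drive download.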
polimorfo.utils.datautils.download_from_gdrive(uri: str, dst_path: str) → pathlib.Path
    Downloads a file or folder from Google Drive.

    Given the URL https://drive.google.com/file/d/1EcUzQPNQXQGiHES9gU7oh-886wbBH3VF/view?usp=sharing, the id is 1EcUzQPNQXQGiHES9gU7oh-886wbBH3VF.

    Arguments:
        uri {str} – the id of the file to download, or the full Google Drive URL
        dst_path {str} – the path where the file is saved
    Returns:
        Path – the path where the file is saved
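Since `uri` may be either a bare id or a full sharing URL, the id has to be recovered from the URL at some point. A minimal sketch of that step follows, assuming the `/d/<id>/` URL shape shown in the example above. The helper `gdrive_file_id` is hypothetical and may differ from what polimorfo does internally.

```python
import re

# Assumption: sharing URLs carry the id in a "/d/<id>/" path segment.
_GDRIVE_ID = re.compile(r"/d/([A-Za-z0-9_-]+)")


def gdrive_file_id(uri: str) -> str:
    """Return the Google Drive id from a sharing URL, or `uri` unchanged."""
    match = _GDRIVE_ID.search(uri)
    return match.group(1) if match else uri


url = "https://drive.google.com/file/d/1EcUzQPNQXQGiHES9gU7oh-886wbBH3VF/view?usp=sharing"
print(gdrive_file_id(url))  # 1EcUzQPNQXQGiHES9gU7oh-886wbBH3VF
```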
polimorfo.utils.datautils.download_url(url: str, dst_path: str) → Tuple[pathlib.Path, int]
    Downloads a URL to the destination folder.

    Arguments:
        url {str} – [description]
        dst_path {str} – [description]
    Returns:
        Tuple[Path, int] – [description]
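The reference leaves the arguments and return value of download_url undescribed, so the following is only a rough sketch of the general technique using the standard library. The function `fetch_to_folder` is hypothetical, taking the file name from the URL path is an assumption, and so is returning the number of bytes written as the integer.

```python
import shutil
from pathlib import Path
from typing import Tuple
from urllib.parse import unquote, urlparse
from urllib.request import urlopen


def fetch_to_folder(url: str, dst_dir: str) -> Tuple[Path, int]:
    """Stream `url` into `dst_dir` and return (saved path, size in bytes).

    Assumption: the file name is the last segment of the URL path.
    """
    folder = Path(dst_dir)
    folder.mkdir(parents=True, exist_ok=True)
    filename = Path(unquote(urlparse(url).path)).name
    target = folder / filename
    # copyfileobj streams in chunks, so large files are not held in memory.
    with urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target, target.stat().st_size
```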
polimorfo.utils.datautils.extract_archive(file_path: str, dst_path: str = '', archive_format='auto') → List[pathlib.Path]
    Extracts the archive if it matches the tar, tar.gz, tar.bz or zip format.

    Arguments:
        file_path {str} – the path to the archive
    Keyword Arguments:
        dst_path {str} – the path where the archive is extracted (default: {None}, the directory where the archive is placed)
        archive_format {str} – the format of the archive (default: {'auto'}). Options are: 'auto', 'tar', 'zip'
    Returns:
        List[Path] – the paths where the files are saved
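Format detection of the kind the 'auto' option describes can be sketched with the standard `tarfile` and `zipfile` modules. The function `unpack` below is a hypothetical stand-in, not the library's code. Returning an empty list for a file that is not a recognised archive is an assumption, as is listing every archive member in the result.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import List


def unpack(file_path: str, dst_path: str = "",
           archive_format: str = "auto") -> List[Path]:
    """Extract a tar or zip archive and return the extracted member paths."""
    src = Path(file_path)
    # Fall back to the archive's own directory when no destination is given.
    dst = Path(dst_path) if dst_path else src.parent
    if archive_format in ("auto", "tar") and tarfile.is_tarfile(src):
        with tarfile.open(src) as archive:  # handles .tar, .tar.gz, .tar.bz2
            names = archive.getnames()
            archive.extractall(dst)
    elif archive_format in ("auto", "zip") and zipfile.is_zipfile(src):
        with zipfile.ZipFile(src) as archive:
            names = archive.namelist()
            archive.extractall(dst)
    else:
        return []  # assumption: unrecognised files are left untouched
    return [dst / name for name in names]
```

As with any extraction code, this sketch should only be pointed at archives from a trusted source, since member names inside an archive can refer to paths outside the destination folder.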
polimorfo.utils.datautils.validate_file(fpath, file_hash, algorithm='auto', chunk_size=65535)
    Validates a file against a sha256 or md5 hash.

    Arguments:
        fpath: path to the file being validated
        file_hash: The expected hash string of the file. The sha256 and md5 hash algorithms are both supported.
        algorithm: Hash algorithm, one of 'auto', 'sha256', or 'md5'. The default 'auto' detects the hash algorithm in use.
        chunk_size: Bytes to read at a time, important for large files.
    Returns:
        Whether the file is valid
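Chunked hash validation as described above can be sketched with `hashlib`. The function `file_matches_hash` is a hypothetical illustration rather than the library's code, and the way 'auto' is resolved here (a 64-character hex string is treated as sha256, anything else as md5) is an assumption.

```python
import hashlib


def file_matches_hash(fpath, file_hash, algorithm="auto", chunk_size=65535):
    """Return True when the file's digest equals `file_hash`."""
    # Assumption: a sha256 hex digest is 64 characters, an md5 digest is 32.
    use_sha256 = algorithm == "sha256" or (
        algorithm == "auto" and len(file_hash) == 64
    )
    hasher = hashlib.sha256() if use_sha256 else hashlib.md5()
    with open(fpath, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == file_hash
```

Reading in chunks is what makes `chunk_size` matter for large files: memory use stays bounded by the chunk size regardless of the file size.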