Merged
48 changes: 20 additions & 28 deletions R/list_files.R
@@ -1,55 +1,47 @@
 #' List files in a container
 #'
-#' Recursively (or not, if desired) lists all files found in a container. Search
-#' can be restricted to a particular 'subdirectory' of the container, and/or
-#' to files with a specific extension. The function assumes that all file names
-#' end with a ".ext" extension of some sort.
+#' Lists all files (recursively, if desired) found in a container within a
+#' given directory (`dir`). The search can be restricted to files with a
+#' specific extension.
 #'
 #' The function does not support filtering by file name, only by file extension.
 #'
 #' The returned file list (character vector) contains the full paths to the
-#' files, ready to be passed perhaps to a `read_azure_*` function, or further
-#' filtered by you. If you just want the names of the files without the folder
-#' path, use [basename()] to extract these.
+#' files, ready to be passed perhaps to a `read_azure_*` function, or filtered
+#' further. If you just want the names of the files without the folder path,
+#' use [basename] to extract these.
 #'
 #' @inheritParams read_azure_parquet
-#' @param path (optional) subdirectory of the container to list files within.
-#'  `""` (the root folder of the container) by default
+#' @param dir (optional) The directory of the container to list files within.
+#'  `""` (the root directory of the container) by default
 #' @param ext (optional) A string giving the extension of a particular file type
-#'  you want to restrict the list to. No need to include the initial ".". The
-#'  default, `""`, means no filtering by file extension will be applied. Can be
-#'  a regular expression.
-#' @param recursive A Boolean value: whether to list files recursively. `TRUE`
-#'  by default
+#'  to restrict the list to. No need to include the initial ".". The default,
+#'  `""`, means no filtering by file extension will be applied.
+#' @param recursive logical: whether to list files recursively. Default `FALSE`
 #'
 #' @importFrom rlang .data
 #' @returns A vector of file names, or an empty character vector if none found
 #' @examples \dontrun{
 #' list_files(get_container("example"), ext = "csv")
 #' }
 #' @export
-list_files <- function(container, path = "", ext = "", recursive = TRUE) {
-  stopifnot(rlang::is_character(c(path, ext), 2))
+list_files <- function(container, dir = "", ext = "", recursive = FALSE) {
+  stopifnot(rlang::is_character(c(dir, ext), 2))
   stopifnot(rlang::is_bool(recursive))
   pnf_msg <- ct_error_msg("Path {.val {path}} not found")
-  check_that(path, \(x) AzureStor::blob_dir_exists(container, x), pnf_msg)
+  check_that(dir, \(x) AzureStor::blob_dir_exists(container, x), pnf_msg)

-  tbl <- AzureStor::list_blobs(container, path, recursive = recursive)
-  if (nrow(tbl) > 0) {
-    ext_rx <- if (nzchar(ext)) sub("^\\.+", "", ext) else ".*" # nolint
-    tbl <- tbl |>
-      dplyr::filter(!.data[["isdir"]] & gregg(.data[["name"]], "\\.{ext_rx}$"))
-  }
+  ext_rx <- ifelse(nzchar(ext), gsub("^\\.+", "\\.", ext), ".*") # nolint
+  tbl <- AzureStor::list_blobs(container, dir, recursive = recursive) |>
+    dplyr::filter(!.data[["isdir"]] & gregg(.data[["name"]], "{ext_rx}$"))

-  # A zero-row tbl can result if `path` is initially empty, or via the filter
-  # step above. We handle this the same way, no matter which route led here.
+  # A zero-row tbl can result if the directory is actually empty, or via
+  # filtering out. We handle this the same way no matter which route led here.
   if (nrow(tbl) == 0) {
     fix_path <- \(p) sub("^/+$", "", sub("^([^/])(.*)", "/\\1\\2", p)) # nolint
     ext <- if (nzchar(ext)) paste0(" ", ext)
     msg <- "No{ext} files found in {.val [{container$name}]:{fix_path(path)}}"
-    if (rlang::is_interactive()) {
-      cli::cli_alert_info(msg)
-    }
+    cli::cli_alert_info(msg)
     invisible(character(0))
   } else {
     tbl[["name"]]
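The extension filtering described in the roxygen comments can be tried out without an Azure container. The following is a minimal base-R sketch, not the package's implementation: `filter_by_ext()` is a hypothetical helper introduced here for illustration, and it uses `grepl()` directly rather than the package's internal `gregg()` helper. It assumes the intended behaviour is that `"csv"` and `".csv"` are treated the same and that an empty `ext` applies no filter.

```r
# Hypothetical helper (not part of the package): keep only the file names
# that end in ".<ext>". An empty `ext` means no filtering is applied.
filter_by_ext <- function(files, ext = "") {
  stopifnot(is.character(files), is.character(ext), length(ext) == 1)
  if (!nzchar(ext)) {
    return(files)
  }
  # Drop any leading dots so that "csv" and ".csv" behave identically
  ext <- sub("^\\.+", "", ext)
  # Require a literal dot followed by the extension at the end of the name
  files[grepl(paste0("\\.", ext, "$"), files)]
}

# Made-up file names standing in for the `name` column of a blob listing
files <- c("data/a.csv", "data/b.parquet", "notes/c.csv", "d_csv")

filter_by_ext(files, "csv")            # "data/a.csv" "notes/c.csv"
filter_by_ext(files, ".csv")           # same result as above
basename(filter_by_ext(files, "csv"))  # "a.csv" "c.csv"
```

Anchoring the pattern on an escaped dot means a name such as `d_csv` is not mistaken for a CSV file. Full paths are returned, so `basename()` is applied afterwards when only the file names are wanted, as the documentation suggests.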
27 changes: 12 additions & 15 deletions man/list_files.Rd
