ClassLazyArray and Tensor share the same storage format, so a 'LazyArray' instance can be converted to a 'Tensor' instance without moving files.

lazyarray_to_tensor(arr, drop_partition = FALSE)

Arguments

arr

'LazyArray' instance, see ClassLazyArray

drop_partition

whether to drop partitions whose partition files are missing; only valid when arr$is_multi_part() is true. The result dimensions may differ from the input if drop_partition is true

Value

Tensor instance

Details

arr must either have multi-part mode turned off or be multi-part with mode=1. Using mode=2 might result in errors or data loss.

LazyArray allows missing partitions, and data on those partitions are treated as NAs. As a result, not all files required by Tensor may exist. In that case there are two options: create the missing partition files, or drop those partitions, since they contain only NAs anyway.

If drop_partition is set to false, the returned data dimension will be "as-is". However, if arr has missing partitions, new partition files will be created and filled with NAs. This is useful when the data dimension is required to be the same as the input.

If drop_partition is set to true, the partition files (obtained via arr$get_partition_fpath()) will be tested for existence, and partitions whose files are missing will be dropped from the result. For example, given an input with dimension 2x3x4, if the first two partitions are missing, i.e. arr[,,1:2] are NAs, then the returned tensor will have dimension 2x3x2 and keep only the partitions that already exist. This is especially useful when the data are very large.
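The dimension arithmetic described above can be sketched in base R. This is only an illustration, not the implementation of lazyarray_to_tensor: the helper expected_tensor_dim and the vector part_exists are hypothetical names, and the sketch assumes that partitions run along the last dimension, as in the arr[,,1:2] example.

```r
# Hypothetical helper (not part of any package): predicts the dimension of the
# result, assuming partitions are indexed by the last margin of the array.
#   dims        - dimension of the input array
#   part_exists - logical vector, one entry per partition file
expected_tensor_dim <- function(dims, part_exists, drop_partition = FALSE) {
  n_part <- dims[length(dims)]
  stopifnot(length(part_exists) == n_part)
  if (drop_partition) {
    # Only partitions whose files exist are kept
    dims[length(dims)] <- sum(part_exists)
  }
  dims
}

# A 2x3x4 input whose first two partitions are missing
dims <- c(2L, 3L, 4L)
part_exists <- c(FALSE, FALSE, TRUE, TRUE)

dim_drop <- expected_tensor_dim(dims, part_exists, drop_partition = TRUE)
dim_keep <- expected_tensor_dim(dims, part_exists, drop_partition = FALSE)

# The same idea on an in-memory array: dropping keeps only the existing slices
x <- array(NA_real_, dim = dims)
x[, , 3:4] <- seq_len(12)
x_drop <- x[, , part_exists, drop = FALSE]
```

Here dim_drop corresponds to the 2x3x2 case from the text, while dim_keep leaves the input dimension unchanged, matching the drop_partition = FALSE behavior where missing partitions are filled with NAs.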

Examples


if(interactive()){
path <- tempfile()
arr <- lazyarray::lazyarray(path, storage_format = 'double', dim = 2:4)

arr[,,1] <- 1:6

# arr is a 'LazyArray' and its partition files are
# missing except for the first one
arr[]

file.exists(arr$get_partition_fpath())

# only keep the existing partition
ts1 <- lazyarray_to_tensor(arr, drop_partition = TRUE)

# the last 3 partitions are dropped, resulting in a 2x3x1 dimension
ts1
dim(ts1)

# Fill in all missing partitions with NA
ts2 <- lazyarray_to_tensor(arr, drop_partition = FALSE)

# ts2 dimension is 2x3x4
ts2
ts2$subset(dim3 ~ dim3 == 4, data_only = TRUE, drop = TRUE)

}