Provides a hybrid data structure for 'HDF5' files

Value

none

self instance

self instance

subset of data

dimension of the array

data type; currently only character, integer, raw, double, and complex are supported, and all other types yield "unknown"

Author

Zhengjia Wang

Public fields

quiet

whether to suppress messages

Methods


Method finalize()

garbage collection method

Usage

LazyH5$finalize()


Method print()

overrides print method

Usage

LazyH5$print()


Method new()

constructor

Usage

LazyH5$new(file_path, data_name, read_only = FALSE, quiet = FALSE)

Arguments

file_path

where data is stored in 'HDF5' format

data_name

the data stored in the file

read_only

whether to open the file in read-only mode. Setting this to TRUE is highly recommended; otherwise the file connection is exclusive.

quiet

whether to suppress messages, default is false


Method save()

save data to a 'HDF5' file

Usage

LazyH5$save(
  x,
  chunk = "auto",
  level = 7,
  replace = TRUE,
  new_file = FALSE,
  force = TRUE,
  ctype = NULL,
  size = NULL,
  ...
)

Arguments

x

vector, matrix, or array

chunk

chunk size; its length should match the number of data dimensions

level

compression level, from 0 (no compression) to 9

replace

whether to replace the data if it already exists in the file

new_file

whether to remove the whole file, if it exists, before writing

force

if the file was opened in read-only mode, saving objects to it raises an error; use force=TRUE to force writing the data

ctype

data type, see mode, usually the data type of x. Try mode(x) or storage.mode(x) as hints.

size

deprecated; kept for compatibility

...

passed to the object's own open() method
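The arguments above can be exercised with a short round trip. This is a minimal sketch, not the package's reference usage: it assumes `LazyH5` is exported by the raveio package (this page does not name the package), and the file name, the dataset name "x", and the compression level are arbitrary illustrative choices.

```r
library(raveio)  # assumption: the package that exports LazyH5

# Small double matrix to write
x <- matrix(as.double(1:12), nrow = 3)
f <- tempfile(fileext = ".h5")

# Open in read/write mode (read_only = FALSE) and save the matrix;
# chunk has one entry per data dimension
writer <- LazyH5$new(file_path = f, data_name = "x",
                     read_only = FALSE, quiet = TRUE)
writer$save(x, chunk = dim(x), level = 4, replace = TRUE)
writer$close()

# Re-open read-only and read everything back with explicit indices
reader <- LazyH5$new(file_path = f, data_name = "x",
                     read_only = TRUE, quiet = TRUE)
y <- reader$subset(seq_len(3), seq_len(4))
reader$close()

# Compare values only, ignoring any difference in dim attributes
print(all.equal(as.vector(y), as.vector(x)))
```

Writing through a separate read/write object and reading through a read-only one follows the recommendation given for `read_only` in the constructor.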


Method open()

open connection

Usage

LazyH5$open(new_dataset = FALSE, robj, ...)

Arguments

new_dataset

only used when the internal pointer is closed, or to write the data

robj

data array to save

...

passed to createDataSet in hdf5r package


Method close()

close connection

Usage

LazyH5$close(all = TRUE)

Arguments

all

whether to close all connections associated with the data file. If true, all connections, including access from other programs, will be closed


Method subset()

subset data

Usage

LazyH5$subset(..., drop = FALSE, stream = FALSE, envir = parent.frame())

Arguments

drop

whether to apply drop() to the subset

stream

whether to read partial data at a time

envir

the environment in which i, j, ... are evaluated when they are expressions

i, j, ...

index along each dimension
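A sketch of calling `subset()` directly, assuming `LazyH5` and `save_h5` come from the raveio package (an assumption; this page does not name the package). Every dimension is indexed explicitly here, and the index vector `idx` is an illustrative variable that is looked up in the calling environment.

```r
library(raveio)  # assumption: the package that exports LazyH5 and save_h5

# Write a 10 x 10 x 10 array, mirroring the Examples section
x <- array(rnorm(1000), c(10, 10, 10))
f <- tempfile(fileext = ".h5")
save_h5(x, file = f, name = "x", chunk = c(10, 10, 10), level = 0)

dat <- LazyH5$new(file_path = f, data_name = "x", read_only = TRUE)

# Rows 2, 4, 6, all ten columns, third slice
idx <- c(2, 4, 6)
part <- dat$subset(idx, 1:10, 3, drop = TRUE)
dat$close()

# Compare against the same slice taken from the in-memory array
expected <- x[idx, 1:10, 3]
print(all.equal(as.vector(part), as.vector(expected)))
```

The comparison uses `as.vector()` on both sides so that it does not depend on whether singleton dimensions were dropped.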


Method get_dims()

get data dimension

Usage

LazyH5$get_dims(stay_open = TRUE)

Arguments

stay_open

whether to leave the connection open


Method get_type()

get data type

Usage

LazyH5$get_type(stay_open = TRUE)

Arguments

stay_open

whether to leave the connection open
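The two inspection methods can be used to check a dataset before reading any of it. The sketch below assumes `LazyH5` and `save_h5` are provided by the raveio package (an assumption; this page does not name the package) and passes `stay_open = FALSE` so no connection is left behind.

```r
library(raveio)  # assumption: the package that exports LazyH5 and save_h5

# Write a 10 x 10 x 10 double array, mirroring the Examples section
x <- array(rnorm(1000), c(10, 10, 10))
f <- tempfile(fileext = ".h5")
save_h5(x, file = f, name = "x", chunk = c(10, 10, 10), level = 0)

dat <- LazyH5$new(file_path = f, data_name = "x", read_only = TRUE)

# Query metadata only, closing the connection after each call
dims <- dat$get_dims(stay_open = FALSE)
type <- dat$get_type(stay_open = FALSE)

print(dims)
print(type)
```

Checking the dimensions first makes it possible to choose indices for `subset()` without loading the full array into memory.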

Examples

# Data to save
x <- array(rnorm(1000), c(10,10,10))

# Save to local disk
f <- tempfile()
save_h5(x, file = f, name = 'x', chunk = c(10,10,10), level = 0)
#> /var/folders/1y/56hdyx6x0_jb18k7b4ys9b6w0000gn/T//RtmpFQqvoC/file1c96757b3ff5 => x (Dataset Created) 
#> /var/folders/1y/56hdyx6x0_jb18k7b4ys9b6w0000gn/T//RtmpFQqvoC/file1c96757b3ff5 => x (Dataset Removed) 
#> /var/folders/1y/56hdyx6x0_jb18k7b4ys9b6w0000gn/T//RtmpFQqvoC/file1c96757b3ff5 => x (Dataset Created) 

# Load via LazyH5
dat <- LazyH5$new(file_path = f, data_name = 'x', read_only = TRUE)

dat

# Check whether the data is identical
range(dat - x)
#> [1] 0 0

# Read a slice of the data
system.time(dat[,10,])
#>    user  system elapsed 
#>   0.104   0.001   0.107