Read and write data frames from and to a fast-storage (fst) file.
Allows for compression and (file level) random access of stored data, even for compressed datasets.
Multiple threads are used to obtain high (de-)serialization speeds, but all background threads are re-joined before write_fst and read_fst return (reads and writes are stable).
When using a data.table object for x, the key (if any) is preserved, allowing storage of sorted data.
The functions read_fst and write_fst are equivalent to read.fst and write.fst (but the former syntax is preferred).
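The key preservation described above can be checked with a small round trip. This is a minimal sketch, assuming the data.table package is installed; the table `dt`, its column names, and the temporary file path are hypothetical and chosen only for illustration.

```r
library(fst)
library(data.table)

# Hypothetical sample data; the column names are illustrative only
dt <- data.table(id = c(3L, 1L, 2L), value = c("c", "a", "b"))
setkey(dt, id)  # sorts dt by id and marks id as its key

# Write to a temporary file so the working directory stays clean
path <- tempfile(fileext = ".fst")
write_fst(dt, path)

# Read back as a data.table so the stored key can be restored
dt2 <- read_fst(path, as.data.table = TRUE)

key(dt2)  # key columns of the table that was read back
```

Reading with the default as.data.table = FALSE returns a plain data frame, which has no key attribute to inspect.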
write_fst(x, path, compress = 50, uniform_encoding = TRUE)
read_fst(path, columns = NULL, from = 1, to = NULL, as.data.table = FALSE, old_format = FALSE)
write.fst(x, path, compress = 50, uniform_encoding = TRUE)
read.fst(path, columns = NULL, from = 1, to = NULL, as.data.table = FALSE, old_format = FALSE)
x: A data frame to write to disk.
path: Path to the fst file.
compress: Value in the range 0 to 100, indicating the amount of compression to use. Lower values mean larger file sizes. The default compression is 50.
columns: Column names to read. The default is to read all columns.
from: Read data starting from this row number.
to: Read data up until this row number. The default is to read to the last row of the stored dataset.
as.data.table: If TRUE, the result will be returned as a data.table object.
old_format: Must be FALSE; the old fst file format is deprecated and can only be read and converted with fst package versions 0.8.0 to 0.8.10.
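The columns, from, and to arguments can be combined to read a rectangular slice of a stored dataset. The sketch below assumes that both row bounds are inclusive, which is how the argument descriptions read; the data frame `df` and its columns are hypothetical.

```r
library(fst)

# Hypothetical sample data, written to a temporary file
df <- data.frame(id = 1:1000, score = seq(0.5, 500, by = 0.5))
path <- tempfile(fileext = ".fst")
write_fst(df, path)

# Read only column "id", rows 100 through 200
part <- read_fst(path, columns = "id", from = 100, to = 200)

dim(part)       # number of rows and columns actually read
range(part$id)  # first and last id in the selection
```

With inclusive bounds, the selection holds to - from + 1 rows, so checking dim(part) is a quick way to confirm how the bounds are interpreted in the installed version.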
read_fst returns a data frame with the selected columns and rows.
write_fst writes x to a fst file and invisibly returns x (so you can use this function in a pipeline).
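Because the input is returned invisibly, a write can sit in the middle of a chain of calls. The following is a minimal sketch; the data frame `df` is hypothetical, and the native pipe `|>` assumes R 4.1 or later (a magrittr pipe would work the same way).

```r
library(fst)

# Hypothetical sample data
df <- data.frame(id = 1:5, label = letters[1:5])
path <- tempfile(fileext = ".fst")

# The return value is the input itself; it is simply not auto-printed
returned <- write_fst(df, path)
identical(returned, df)

# Write to disk, then pass the same data on to the next step
n_rows <- df |> write_fst(path) |> nrow()
```

The invisible return only suppresses printing at the console; assigning or piping the result works like any other return value.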
library(fst)

# Sample dataset
x <- data.frame(A = 1:10000, B = sample(c(TRUE, FALSE, NA), 10000, replace = TRUE))

# Default compression
write_fst(x, "dataset.fst")       # file size: 17 KB
y <- read_fst("dataset.fst")      # read fst file

# Maximum compression
write_fst(x, "dataset.fst", 100)  # file size: 4 KB
y <- read_fst("dataset.fst")      # read fst file

# Random access
y <- read_fst("dataset.fst", "B")            # read selection of columns
y <- read_fst("dataset.fst", "A", 100, 200)  # read selection of columns and rows