Python API

This page provides detailed documentation for the tszip Python API.

Usage example

tszip can be used directly in Python to compress and decompress tree sequence files. Here, we run an msprime simulation and write the output to a .trees.tsz file:

import msprime
import tszip

ts = msprime.simulate(10, random_seed=1)
tszip.compress(ts, "simulation.trees.tsz")

# Later, we load the same tree sequence from the compressed file.
ts = tszip.decompress("simulation.trees.tsz")

Note

For very small simulations like this example, the tszip file may be larger than the original uncompressed file.

API

tszip.compress(ts, destination, variants_only=False)

Compresses the specified tree sequence and writes it to the specified path or file-like object. By default, compression is fully lossless, so the tree sequence is identical before and after compression. Setting the variants_only option to True selects lossy compression instead, which discards any information that is not needed to represent the variants (the variants themselves are stored losslessly).

Parameters:
  • ts (tskit.TreeSequence) – The input tree sequence.
  • destination (str, pathlib.Path or file-like object) – The path or file-like object to write the compressed file to.
  • variants_only (bool) – If True, discard all information not necessary to represent the variants in the input file.

tszip.decompress(path)

Decompresses the tszip compressed file and returns a tskit tree sequence instance.

Parameters:
  • path (str) – The location of the tszip compressed file to load.
Return type: tskit.TreeSequence
Returns: A tskit.TreeSequence instance corresponding to the specified file.