ScanImage Tiff Reader for Python

Install

The library is pip-installable for 64-bit Windows, OS X, or Linux. We test against Python 3.6 and Python 2.7.14.

pip install scanimage-tiff-reader

Examples

Read a volume from a tiff stack:

from ScanImageTiffReader import ScanImageTiffReader
vol = ScanImageTiffReader("my.tif").data()

About

The ScanImageTiffReader reads data from Tiff and BigTiff files recorded using ScanImage. It was written with performance in mind and provides access to ScanImage-specific metadata. It is also available for Matlab, Julia and C. There’s also a command line interface. This library should actually work with most tiff files, but as of now we don’t support compressed or tiled data.

Both ScanImage and this reader are products of Vidrio Technologies. If you have questions or need support, feel free to submit an issue or contact us.

ScanImage Tiff Files

ScanImage records images, volumes and video as Tiff or BigTiff files. For the most part Tiff and BigTiff are similar, but they are not the same; BigTiff enables storage of larger (>4GB) data sets. In addition to the image data, ScanImage stores metadata describing the microscope configuration and settings used during the acquisition. These are stored in the file itself.
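
One concrete difference between the two formats is the file header. According to the TIFF 6.0 and BigTIFF specifications, a file starts with a two-byte byte-order mark ("II" for little-endian, "MM" for big-endian) followed by a 16-bit version number: 42 for classic Tiff and 43 for BigTiff. The sketch below checks those four bytes. The helper name tiff_flavor is hypothetical and is not part of the ScanImageTiffReader API.

```python
import struct


def tiff_flavor(path):
    """Return 'tiff', 'bigtiff', or None based on the first four bytes of a file.

    Hypothetical helper for illustration; not part of ScanImageTiffReader.
    """
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        return None
    byte_order = header[:2]
    if byte_order == b"II":    # little-endian
        fmt = "<H"
    elif byte_order == b"MM":  # big-endian
        fmt = ">H"
    else:
        return None            # not a Tiff header
    (version,) = struct.unpack(fmt, header[2:4])
    return {42: "tiff", 43: "bigtiff"}.get(version)
```

This only inspects the header. It says nothing about whether the file was written by ScanImage or whether it uses compression or tiling.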

Some of this metadata is accessible using standard Tiff readers. The tiff format provides data fields (tags) that can be used to attach data describing each frame. Past versions of ScanImage would store a copy of the metadata in the “image description” tag of each frame. Much of that metadata is redundant, which can lead to longer load times and significant storage overhead in some cases.

Fortunately, the tiff format is very flexible and allows us to easily store the part of the metadata that doesn’t change over time in a dedicated block of the file. However, this metadata block is only accessible if you know where to look. The tiff files can still be read by any reader conforming to the baseline tiff specification, but those readers will not be aware of the ScanImage metadata block. The ScanImageTiffReader knows how to extract this metadata and also provides fast access to the images themselves.

ScanImage tiff files store the following kinds of data:

Kind          Description
data          The image data itself.
metadata      Frame-invariant metadata such as the microscope configuration and region of interest descriptions.
descriptions  Frame-varying metadata such as timestamps and I2C data.

The metadata sections themselves are encoded as either MATLAB-evaluable strings or JSON.

The ScanImage documentation has a very detailed description of how data is stored in a ScanImage Tiff.
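
Since a metadata section can arrive in either encoding, calling code usually has to check which one it received. The following is a minimal sketch under stated assumptions: it tries JSON first and otherwise splits "name = value" lines, assuming one assignment per line as in the per-frame description examples in this document. The function name decode_section is hypothetical. MATLAB-style values are left as unparsed strings, because evaluating them faithfully requires MATLAB.

```python
import json


def decode_section(text):
    """Classify a metadata string as JSON or MATLAB-style and parse it.

    Returns ("json", parsed_object) or ("matlab", dict_of_raw_strings).
    Hypothetical helper; not part of the ScanImageTiffReader API.
    """
    try:
        return "json", json.loads(text)
    except ValueError:
        pass  # not JSON; fall through to the line-based split

    fields = {}
    for line in text.splitlines():
        # Split on the first "=" only, so values may themselves contain "=".
        name, sep, value = line.partition("=")
        if sep:
            fields[name.strip()] = value.strip()
    return "matlab", fields
```

With a reader open, this could be applied to reader.metadata() or reader.description(0).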

Python Interface

About

The ScanImageTiffReader class reads Tiff files produced by ScanImage. The reader exposes functionality for extracting the image stack into a numpy array, extracting the image description from individual frames, and accessing other ScanImage-specific metadata stored in the file. Additionally, we put some effort into making the reader fast.

You can also use this reader for Tiff and BigTiff files that weren’t produced by ScanImage, but some might not work. In particular, the reader lacks support for compression and tiling. Let us know if you have a Tiff you’d particularly like to read.

Example:

>>> from ScanImageTiffReader import ScanImageTiffReader
>>> reader=ScanImageTiffReader("data/TR_003.tif")
>>> stack=reader.data()
>>> reader.close()

The reader supports Python’s with statement:

>>> from ScanImageTiffReader import ScanImageTiffReader
>>> with ScanImageTiffReader('data/TR_003.tif') as reader:
...     print(reader.shape())
[10, 512, 512]

Note that the dimensions of the numpy array are ordered [z, y, x].
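
To make the axis order concrete, here is a small sketch that uses a zero-filled numpy array as a stand-in for the result of reader.data(). The array contents and the chosen indices are illustrative assumptions. Only the shape matches the example above.

```python
import numpy as np

# Stand-in for reader.data(): 10 frames, each 512 rows (y) by 512 columns (x).
stack = np.zeros((10, 512, 512), dtype=np.int16)

nz, ny, nx = stack.shape         # axis 0 is z (frame), axis 1 is y, axis 2 is x

frame = stack[3]                 # one whole frame, shape (ny, nx)
row = stack[3, 100]              # row y=100 of frame 3, shape (nx,)
pixel = stack[3, 100, 200]       # single sample at z=3, y=100, x=200

mean_image = stack.mean(axis=0)  # average over frames, shape (ny, nx)
```

Indexing the first axis selects a frame, so iterating over the array yields one 2D image at a time.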

Just in case it’s useful, the API version used by the library can be queried by running:

>>> print(ScanImageTiffReader.api_version()) #doctest: +ELLIPSIS
Version ...

API

class ScanImageTiffReader.ScanImageTiffReader(filename)

Reader for ScanImage Tiff and BigTiff files.

The constructor opens a file and builds an index by scanning through the file.

static api_version()

Returns a string stating the version of the C library that this Python code is using.

Example:
>>> print(ScanImageTiffReader.api_version()) #doctest: +ELLIPSIS
Version ...

close()

Closes the open file context, releasing any bound resources.

data()

Returns a numpy array with the image stack.

Return type: numpy.ndarray
Example:
>>> v=ScanImageTiffReader("data/TR_003.tif").data()
>>> print(v.shape)
(10, 512, 512)
>>> print(v[0][0][0:10])
[6 6 7 5 7 8 8 9 6 9]

Note that the dimensions of the numpy array are ordered [z, y, x].

description(iframe)

Returns the contents of the image description tag for the indicated frame.

Parameters: iframe – An integer between 0 and len(self)-1
Return type: string
Example:
>>> with ScanImageTiffReader("data/res_00001.tif") as reader:
...     print(reader.description(0))
frameNumbers = 1
acquisitionNumbers = 1
frameNumberAcquisition = 1
frameTimestamps_sec = 0.000000000
acqTriggerTimestamps_sec = 0.000000000
nextFileMarkerTimestamps_sec = -1.000000000
endOfAcquisition = 0
endOfAcquisitionMode = 0
dcOverVoltage = 0
epoch = [1601  1  1  0  0 25.045]
auxTrigger0 = []
auxTrigger1 = []
auxTrigger2 = []
auxTrigger3 = []
I2CData = {}
<BLANKLINE>

Another example from a file that was saved using the json-style header:

>>> with ScanImageTiffReader("data/resj_2018a_00002.tif") as reader:
...     print(reader.description(0)) # doctest: +NORMALIZE_WHITESPACE
{
  "frameNumbers": 1,
  "acquisitionNumbers": 1,
  "frameNumberAcquisition": 1,
  "frameTimestamps_sec": 0.000000000,
  "acqTriggerTimestamps_sec": -0.000087000,
  "nextFileMarkerTimestamps_sec": -1.000000000,
  "endOfAcquisition": 0,
  "endOfAcquisitionMode": 0,
  "dcOverVoltage": 0,
  "epoch": [2018,10, 9,11,58,49.866],
  "auxTrigger0": [],
  "auxTrigger1": [],
  "auxTrigger2": [],
  "auxTrigger3": [],
  "I2CData": []
}
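
For files that use the json-style header, per-frame fields can be gathered by decoding each description in turn. The sketch below collects frameTimestamps_sec for every frame. It assumes the reader supports len(), as the iframe parameter description above suggests, and that every description is valid JSON. The function name frame_timestamps is hypothetical.

```python
import json


def frame_timestamps(reader):
    """Collect frameTimestamps_sec from every frame's JSON description.

    `reader` is any object providing description(i) and len(), which is
    assumed here to include an open ScanImageTiffReader.
    Hypothetical helper; not part of the library.
    """
    return [
        json.loads(reader.description(i))["frameTimestamps_sec"]
        for i in range(len(reader))
    ]
```

Descriptions stored as MATLAB-style text, like the first example above, are not JSON and would need line-based parsing instead.
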

dtype()

Returns the numpy dtype of the image data, equal to self.data().dtype.

Return type: numpy.dtype
Example:
>>> with ScanImageTiffReader("data/resj_00001.tif") as reader:
...     print(reader.dtype())
int16

metadata()

Reads the ScanImage metadata section from the file. This data section is not part of the Tiff specification, so common Tiff readers will not be able to access this data.

In ScanImage 2016, this is a JSON string. For previous versions of ScanImage, this is a bytestring that must be deserialized in MATLAB.

Return type: string
Example:
>>> import json
>>> with ScanImageTiffReader("data/resj_00001.tif") as reader:
...     o=json.loads(reader.metadata())
...     print(o["RoiGroups"]["imagingRoiGroup"]["rois"]["scanfields"]["affine"])
[[23.4, 0, -11.7], [0, 23.4, -11.7], [0, 0, 1]]
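
The affine value printed above has the form of a 3x3 matrix in homogeneous coordinates, so it can be applied to a 2D point with a few multiplications. The sketch below shows only that arithmetic. Which coordinate spaces the matrix maps between is defined by ScanImage and is not stated here, so the input points are illustrative assumptions, and apply_affine is a hypothetical helper.

```python
def apply_affine(affine, x, y):
    """Apply a 3x3 homogeneous affine matrix (nested lists) to the point (x, y).

    Hypothetical helper; not part of the ScanImageTiffReader API.
    """
    (a, b, tx), (c, d, ty), _ = affine  # last row is [0, 0, 1] for an affine map
    return (a * x + b * y + tx, c * x + d * y + ty)


# Matrix copied from the metadata example output.
affine = [[23.4, 0, -11.7], [0, 23.4, -11.7], [0, 0, 1]]

# Illustrative input points (an assumption, not taken from the file).
center = apply_affine(affine, 0.5, 0.5)
corner = apply_affine(affine, 0.0, 0.0)
```

For this matrix the point (0.5, 0.5) maps to approximately (0, 0) and (0, 0) maps to (-11.7, -11.7), which follows from the scale of 23.4 and the offset of -11.7 on each axis.
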

open(filename)

Opens a ScanImage tiff file for reading.

Called by the constructor. Normally you don’t need to call this function. It is possible, however, to reuse an instance of this class:

Example:
>>> reader=ScanImageTiffReader("data/TR_003.tif")
>>> stack=reader.data()
>>> reader.close()
>>> reader.open("data/resj_00001.tif")
>>> print(reader.shape())
[10, 512, 512]
>>> reader.close()

However, there’s probably no real reason to use this pattern over just constructing new readers.
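
As a sketch of the simpler alternative, the loop below constructs a fresh reader per file inside a with block, so each file is closed before the next one is opened. The function name shapes_by_file and its reader_factory parameter are hypothetical. The parameter exists only so the loop can be tried with a stand-in class when the library or the data files are not available.

```python
def shapes_by_file(paths, reader_factory=None):
    """Map each path to the shape reported by a freshly constructed reader.

    Hypothetical helper. By default it uses ScanImageTiffReader; pass a
    different factory to exercise the loop without real files.
    """
    if reader_factory is None:
        # Imported lazily so the sketch also runs with a stand-in factory.
        from ScanImageTiffReader import ScanImageTiffReader as reader_factory

    shapes = {}
    for path in paths:
        # The with block closes each file before the next one is opened.
        with reader_factory(path) as reader:
            shapes[path] = tuple(reader.shape())
    return shapes
```

For example, shapes_by_file(["data/TR_003.tif", "data/resj_00001.tif"]) would open and close each file in turn.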

shape()

Returns the dimensions of the image stack, in the same order as self.data().shape.

Example:
>>> with ScanImageTiffReader("data/resj_00001.tif") as reader:
...     print(reader.shape())
[10, 512, 512]
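
Because shape() and dtype() are available without calling data(), they can be used to estimate how much memory a stack will need before loading it. The helper below is a hypothetical sketch of that arithmetic, using the shape and dtype from the examples above.

```python
import numpy as np


def stack_nbytes(shape, dtype):
    """Bytes needed to hold an array with the given shape and dtype.

    Hypothetical helper; not part of the ScanImageTiffReader API.
    """
    count = 1
    for n in shape:
        count *= int(n)
    return count * np.dtype(dtype).itemsize


# 10 * 512 * 512 samples at 2 bytes each (int16) is 5242880 bytes, or 5 MiB.
nbytes = stack_nbytes([10, 512, 512], "int16")
```

With an open reader, the equivalent call would be stack_nbytes(reader.shape(), reader.dtype()).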

Changelog

Version Highlights
1.4
  • Support for loading a selected interval of frames
1.3
  • Python 3.6 support.
  • Made pip installable.
1.2
  • Fix: Properly return error messages to caller.
1.0
  • Initial release

Performance

The ScanImageTiffReader was designed to be fast. When done right, reading a Tiff file can have very low overhead. One should expect read performance to be roughly that of the bandwidth-limiting bottleneck, which is usually the drive used for storage.

Thanks to solid-state storage and aggressive caching of files by the operating system, read speeds of 500 MB/s or greater are very achievable. Realizing this bandwidth improvement can reduce read times by an order of magnitude or more.

Reading from an SSD (Samsung 840 Pro):

[Figure: Benchmark speeds for the storage drive used here.]

Given the drive benchmark, a roughly 6 GB file should take ~11.4 seconds to read, which works out to about 550 MB/s for the 6.277 GB file used here. Using the ScanImageTiffReader:

[Figure: Time to read a stack using the ScanImageTiffReader API in Julia. The choice of language doesn’t make a significant difference.]

We can compute the effective read bandwidth by dividing the total byte size of the file (6.277 GB) by the amount of time it took to read the file. In the Julia example above, this comes out to 430 MB/s.
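
The same arithmetic can be written out as a short sketch. It assumes decimal units (1 GB = 10**9 bytes, 1 MB = 10**6 bytes), which the text does not state explicitly. The measured read time itself is shown only in the figure, so the code derives the time implied by the reported 430 MB/s rather than asserting a measurement. The function name is hypothetical.

```python
def effective_bandwidth_mb_s(file_bytes, seconds):
    """Effective read bandwidth in MB/s, taking 1 MB as 10**6 bytes."""
    return file_bytes / 1e6 / seconds


# File size from the text, assuming 1 GB = 10**9 bytes.
FILE_BYTES = 6.277e9

# Read time implied by the 430 MB/s reported for the Julia run (about 14.6 s).
implied_seconds = FILE_BYTES / 1e6 / 430.0

# Bandwidth implied by the ~11.4 s estimate for the drive (about 550 MB/s).
estimate_mb_s = effective_bandwidth_mb_s(FILE_BYTES, 11.4)
```

To measure a real read, one could wrap the call to data() with time.perf_counter() and pass the elapsed time to the same function.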

Due to caching by the operating system, we sometimes exceed the expected speed of the hard drive. The behavior of this kind of file-system caching might vary between operating systems.