2 changes: 1 addition & 1 deletion +types/+core/Units.m
Original file line number Diff line number Diff line change
Expand Up @@ -131,7 +131,7 @@
obj.waveforms_index_index = p.Results.waveforms_index_index;
obj.waveforms_sampling_rate = p.Results.waveforms_sampling_rate;
obj.waveforms_unit = p.Results.waveforms_unit;

% Only execute validation/setup code when called directly in this class's
% constructor, not when invoked through superclass constructor chain
if strcmp(class(obj), 'types.core.Units') %#ok<STISA>
Expand Down
2 changes: 1 addition & 1 deletion .codespellrc
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
[codespell]
skip = *.html,*.svg,*fastsearch.m,*.yaml,*testResults.xml
ignore-words-list = DNE,nd,whos
ignore-words-list = DNE,nd,whos,ans
11 changes: 11 additions & 0 deletions docs/source/_static/css/custom.css
Original file line number Diff line number Diff line change
Expand Up @@ -72,3 +72,14 @@ button.copybtn {
border: none;
background: none;
}

.rst-content img.tutorial-media {
max-width: 100% !important;
height: auto;
display: block;
margin: 1rem auto;
}

.rst-content .highlight-text button.copybtn {
display: none !important;
}
43 changes: 19 additions & 24 deletions docs/source/_static/html/tutorials/basicUsage.html

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions docs/source/_static/html/tutorials/behavior.html

Large diffs are not rendered by default.

20 changes: 10 additions & 10 deletions docs/source/_static/html/tutorials/dynamic_tables.html

Large diffs are not rendered by default.

66 changes: 50 additions & 16 deletions docs/source/_static/html/tutorials/dynamically_loaded_filters.html

Large diffs are not rendered by default.

998 changes: 60 additions & 938 deletions docs/source/_static/html/tutorials/ecephys.html

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/source/_static/html/tutorials/icephys.html

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions docs/source/_static/html/tutorials/images.html

Large diffs are not rendered by default.

18 changes: 9 additions & 9 deletions docs/source/_static/html/tutorials/intro.html

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions docs/source/_static/html/tutorials/ogen.html

Large diffs are not rendered by default.

214 changes: 125 additions & 89 deletions docs/source/_static/html/tutorials/ophys.html

Large diffs are not rendered by default.

14 changes: 7 additions & 7 deletions docs/source/_static/html/tutorials/read_demo.html

Large diffs are not rendered by default.

42 changes: 24 additions & 18 deletions docs/source/_static/html/tutorials/read_demo_dandihub.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/source/_static/html/tutorials/remote_read.html

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions docs/source/_static/html/tutorials/scratch.html

Large diffs are not rendered by default.

461 changes: 454 additions & 7 deletions docs/source/pages/tutorials/basicUsage.rst

Large diffs are not rendered by default.

363 changes: 356 additions & 7 deletions docs/source/pages/tutorials/behavior.rst

Large diffs are not rendered by default.

592 changes: 584 additions & 8 deletions docs/source/pages/tutorials/convertTrials.rst

Large diffs are not rendered by default.

203 changes: 195 additions & 8 deletions docs/source/pages/tutorials/dataPipe.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,18 +4,205 @@ Advanced Writing Using DataPipes 🎬
===================================

.. image:: https://www.mathworks.com/images/responsive/global/open-in-matlab-online.svg
:target: https://matlab.mathworks.com/open/github/v1?repo=NeurodataWithoutBorders/matnwb&file=tutorials/dataPipe.mlx
:target: https://matlab.mathworks.com/open/github/v1?repo=NeurodataWithoutBorders/matnwb&file=tutorials/dataPipe.m
:alt: Open in MATLAB Online
.. image:: https://img.shields.io/badge/View-Full_Page-blue
.. image:: https://img.shields.io/badge/View-Rendered_Live_Script-blue
:target: ../../_static/html/tutorials/dataPipe.html
:alt: View full page
:alt: View rendered Live Script
.. image:: https://img.shields.io/badge/View-Youtube-red
:target: https://www.youtube.com/watch?v=PIE_F4iVv98&ab_channel=NeurodataWithoutBorders
:alt: View tutorial on YouTube

.. raw:: html
.. contents:: On this page
:local:
:depth: 2

<iframe class="autoresize"
src="../../_static/html/tutorials/dataPipe.html"
style="width:100%; border:none; display:block;">
</iframe>
How to use HDF5 compression with ``DataPipe``

**Authors:** Ivan Smalianchuk and Ben Dichter
**Contact:** smalianchuk.ivan@gmail.com, ben.dichter@catalystneuro.com
**Last Edited:** Jan 04, 2021

Neurophysiology data can be quite large, often tens of gigabytes per session and sometimes much larger. Here, we demonstrate MatNWB methods for dealing with large datasets: compression and iterative write. Both techniques use the ``types.untyped.DataPipe`` object, which sends specific instructions to the HDF5 backend about how to store data.

Compression - basic implementation
----------------------------------

To compress experimental data (in this case, a 3-D matrix with dimensions [250 250 70]), wrap it in a ``DataPipe`` object:

.. code-block:: matlab

DataToCompress = randi(100, 250, 250, 70);
compressedData = types.untyped.DataPipe('data', DataToCompress); % avoid naming the variable DataPipe, which shadows the class

This is the most basic way to achieve compression, and all of the optimization decisions are automatically determined by MatNWB.

Background
----------

HDF5 has a built-in ability to compress and decompress individual datasets. Applied intelligently, this can dramatically reduce the space used on the hard drive to represent the data. The end user does not need to worry about the compression status of a dataset; HDF5 automatically decompresses it on read.

The example above uses the default chunk size and compression level (3). To optimize compression, ``compressionLevel`` and ``chunkSize`` must be considered. ``compressionLevel`` ranges from 0 to 9, where 0 is the lowest level of compression and 9 is the highest. ``chunkSize`` is less intuitive to adjust; for compression to be applied, the chunk size must be no larger than the data size.
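Both options can be set explicitly when constructing the ``DataPipe``. The values below are illustrative, not recommendations:

.. code-block:: matlab

% Illustrative only: maximum compression with a manually chosen chunk size
DataToCompress = randi(100, 250, 250, 70);
pipe = types.untyped.DataPipe( ...
'data', DataToCompress, ...
'compressionLevel', 9, ... % 0 (none) to 9 (maximum, slowest to write)
'chunkSize', [25 25 10]); % must not exceed the data size in any dimension

Higher compression levels trade write speed for disk space; level 9 can be noticeably slower than the default for large arrays.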

``DataPipe`` Arguments
----------------------

.. list-table::
:widths: 25 75

* - ``maxSize``
- Sets the maximum size of the HDF5 dataset. Unless you are using iterative writing, this should match the size of ``data``. To append data later, set ``maxSize`` to the size of the full dataset. Use ``Inf`` for any dimension whose final size is unknown.
* - ``data``
- The data to compress. Must be numerical data.
* - ``axis``
- Set which axis to increment when appending more data.
* - ``dataType``
- Sets the type of the experimental data. Must be a numeric data type. Useful when using iterative write to append data, since appended data must have the same type. If ``data`` is provided and ``dataType`` is not, the type is inferred from the provided data.
* - ``chunkSize``
- Sets the chunk size for compression. Each dimension must be no larger than the corresponding dimension of ``maxSize``.
* - ``compressionLevel``
- Level of compression ranging from 0-9 where 9 is the highest level of compression. The default is level 3.
* - ``offset``
- Axis offset of dataset to append. May be used to overwrite data.

Chunking
--------

HDF5 datasets can be stored in either contiguous or chunked mode. Contiguous means that all of the data is written to one continuous block on the hard drive; chunked means that the dataset is automatically split into chunks distributed across the hard drive. The user does not need to know which mode is used; HDF5 gathers the chunks automatically on read. However, chunking is worth understanding because it can have a big impact on the space used and on read and write speed. When using compression, the dataset MUST be chunked: HDF5 cannot apply compression to contiguous datasets.

If ``chunkSize`` is not explicitly specified, ``DataPipe`` will determine an appropriate chunk size. However, you can optimize compression performance by specifying the chunk size manually with the ``chunkSize`` argument.

We can demonstrate the benefit of chunking with the following scenario. The code below uses ``DataPipe``'s default chunk size:

.. code-block:: matlab

fData = randi(250, 100, 1000); % Create fake data

% create an nwb structure with required fields
nwb = NwbFile( ...
'session_start_time', datetime('2020-01-01 00:00:00', 'TimeZone', 'local'), ...
'identifier', 'ident1', ...
'session_description', 'DataPipeTutorial');

fData_compressed = types.untyped.DataPipe('data', fData);

fdataNWB=types.core.TimeSeries( ...
'data', fData_compressed, ...
'data_unit', 'mV', ...
'starting_time', 0.0, ...
'starting_time_rate', 30.0);

nwb.acquisition.set('data', fdataNWB);

nwbExport(nwb, 'DefaultChunks.nwb');

This results in a file size of 47MB (too large), and the process takes 11 seconds (far too long). Setting the chunk size manually as in the example code below resolves these issues:

.. code-block:: matlab

fData_compressed = types.untyped.DataPipe( ...
'data', fData, ...
'chunkSize', [1, 1000], ...
'axis', 1);

With this change, the operation completes in 0.7 seconds and the resulting file is 1.1MB. The chunk size was chosen so that each chunk spans one row of the matrix.

Use the combination of arguments that fits your needs. When dealing with large datasets, you may want to use iterative write to stay within the bounds of your system memory, and chunking and compression to optimize storage, reading, and writing of the data.
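To confirm the chunk layout and compression filter actually written to disk, you can inspect the exported dataset with MATLAB's built-in ``h5info``. The dataset path below assumes the acquisition was named ``'data'`` as in the example above:

.. code-block:: matlab

% Inspect the chunking and compression of the exported dataset
info = h5info('DefaultChunks.nwb', '/acquisition/data/data');
disp(info.ChunkSize) % chunk dimensions as stored in the file
disp({info.Filters.Name}) % 'deflate' indicates gzip compression was applied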

Iterative Writing
-----------------

If experimental data is close to, or exceeds, the available system memory, performance issues may arise. To handle such large data, ``DataPipe`` supports iterative writing: a portion of the data is compressed and saved first, and additional portions are appended later.

To demonstrate, we create an NWB file with compressed time series data:

.. code-block:: matlab

dataPart1 = randi(250, 1, 10000); % "load" 1/4 of the entire dataset
fullDataSize = [1 40000]; % this is the size of the TOTAL dataset

% create an nwb structure with required fields
nwb=NwbFile( ...
'session_start_time', datetime('2020-01-01 00:00:00', 'TimeZone', 'local'), ...
'identifier', 'ident1', ...
'session_description', 'DataPipeTutorial');

% compress the data
fData_use = types.untyped.DataPipe( ...
'data', dataPart1, ...
'maxSize', fullDataSize, ...
'axis', 2);

Set the compressed data as a time series

.. code-block:: matlab

fdataNWB = types.core.TimeSeries( ...
'data', fData_use, ...
'data_unit', 'mV', ...
'starting_time', 0.0, ...
'starting_time_rate', 30.0);

nwb.acquisition.set('time_series', fdataNWB);

nwbExport(nwb, 'DataPipeTutorial_iterate.nwb');

To append the rest of the data, simply load the NWB file and use the append method:

.. code-block:: matlab

nwb = nwbRead('DataPipeTutorial_iterate.nwb', 'ignorecache'); %load the nwb file with partial data

Then "load" and append each of the remaining quarters of the large dataset:

.. code-block:: matlab

for i = 2:4 % iterating through parts of data
dataPart_i=randi(250, 1, 10000); % faked data chunk as if it was loaded
nwb.acquisition.get('time_series').data.append(dataPart_i); % append the loaded data
end

The ``axis`` property defines the dimension along which additional data is appended. In the example above, ``axis`` is 2 and ``maxSize`` is ``[1 40000]``, so each appended chunk grows the second dimension and the final dataset is 1x40000. If ``axis`` were set to 1 instead (with ``maxSize`` changed accordingly, e.g. ``[40000 1]``), the data would grow along the first dimension and the final dataset would be 40000x1.
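As a sketch, the same iterative write oriented along the first dimension might look like this (a hypothetical variant of the example above):

.. code-block:: matlab

% Hypothetical variant: grow the dataset along the first dimension
pipeAxis1 = types.untyped.DataPipe( ...
'data', randi(250, 10000, 1), ... % first quarter, as a column vector
'maxSize', [40000 1], ...
'axis', 1);
% subsequent pipeAxis1.append(...) calls now extend the first dimension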

TimeSeries example
------------------

The following example shows how to compress a time series and add it to an NWB file:

.. code-block:: matlab

fData = randi(250, 1, 10000); % create fake data

% create an nwb structure with required fields
nwb = NwbFile( ...
'session_start_time', datetime(2020, 1, 1, 0, 0, 0, 'TimeZone', 'local'), ...
'identifier', 'ident1', ...
'session_description', 'DataPipeTutorial');

ephys_module = types.core.ProcessingModule( ...
'description', 'holds processed ephys data');

nwb.processing.set('ephys', ephys_module);

% compress the data
fData_compressed = types.untyped.DataPipe( ...
'data', fData, ...
'compressionLevel', 3, ...
'chunkSize', [1 1000], ... % chunk size may not exceed the data size
'axis', 2);

Assign the data to the appropriate module and write the NWB file:

.. code-block:: matlab

fdataNWB=types.core.TimeSeries( ...
'data', fData_compressed, ...
'data_unit', 'mV', ...
'starting_time', 0.0, ...
'starting_time_rate', 30.0);

ephys_module.nwbdatainterface.set('data', fdataNWB);
nwb.processing.set('ephys', ephys_module);

% write the file
nwbExport(nwb, 'Compressed.nwb');
88 changes: 81 additions & 7 deletions docs/source/pages/tutorials/dimensionMapNoDataPipes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -6,14 +6,88 @@ Mapping Dimensions without DataPipes
.. image:: https://www.mathworks.com/images/responsive/global/open-in-matlab-online.svg
:target: https://matlab.mathworks.com/open/github/v1?repo=NeurodataWithoutBorders/matnwb&file=tutorials/dimensionMapNoDataPipes.mlx
:alt: Open in MATLAB Online
.. image:: https://img.shields.io/badge/View-Full_Page-blue
.. image:: https://img.shields.io/badge/View-Rendered_Live_Script-blue
:target: ../../_static/html/tutorials/dimensionMapNoDataPipes.html
:alt: View full page
:alt: View rendered Live Script


.. raw:: html
.. contents:: On this page
:local:
:depth: 2

<iframe class="autoresize"
src="../../_static/html/tutorials/dimensionMapNoDataPipes.html"
style="width:100%; border:none; display:block;">
</iframe>
This tutorial demonstrates how the dimensions of a MATLAB array map onto a dataset in HDF5. There are two main differences in the way MATLAB and HDF5 represent dimensions:

1. **C-ordering vs F-ordering:** HDF5 is C-ordered (row-major), meaning the last dimension of an array varies fastest in memory, whereas MATLAB is F-ordered (column-major), with the first dimension varying fastest. The result is that a dataset in HDF5 is effectively the transpose of the corresponding array in MATLAB.
2. **1-D data (i.e., vectors):** HDF5 can store truly 1-D arrays, but in MATLAB the lowest dimensionality of an array is 2-D.

Due to differences in how MATLAB and HDF5 represent data, the dimensions of datasets are flipped when writing to/from file in MatNWB. Additionally, MATLAB represents 1D vectors in a 2D format, either as row vectors or column vectors, whereas HDF5 treats vectors as truly 1D. Consequently, when a 1D dataset from HDF5 is loaded into MATLAB, it is always represented as a column vector. To avoid unintentional changes in data dimensions, it is therefore recommended to avoid writing row vectors into an NWB file for 1D datasets.
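The column-vector behavior can be reproduced with MATLAB's plain HDF5 functions, outside of NWB (the file name here is illustrative):

.. code-block:: matlab

% A 1-D HDF5 dataset always reads back into MATLAB as a column vector
h5create('vectorDemo.h5', '/vec', 10); % a truly 1-D dataset of 10 elements
h5write('vectorDemo.h5', '/vec', 1:10); % written from a 1x10 row vector
v = h5read('vectorDemo.h5', '/vec'); % v is returned as a 10x1 column vector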

Contrast this tutorial with the `dimensionMapWithDataPipes <dimensionMapWithDataPipes>`_ tutorial that illustrates how vectors are represented differently when using ``DataPipe`` objects within ``VectorData`` objects.

Create Table
------------

First, create a ``TimeIntervals`` table of height 10.

.. code-block:: matlab

% Define VectorData objects for each column
% 1D column
start_col = types.hdmf_common.VectorData( ...
'description', 'start_times column', ...
'data', (1:10)' ... % maps onto HDF5 dataset of size (10,)
);
% 1D column
stop_col = types.hdmf_common.VectorData( ...
'description', 'stop_times column', ...
'data', (2:11)' ... % maps onto HDF5 dataset of size (10,)
);
% 4D column
randomval_col = types.hdmf_common.VectorData( ...
'description', 'randomvalues column', ...
'data', rand(5,2,3,10) ... % maps onto HDF5 dataset of size (10, 3, 2, 5)
);

% 1D column
id_col = types.hdmf_common.ElementIdentifiers(...
'data', int64(0:9)'); % maps onto HDF5 dataset of size (10,)

% Create table
trials_table = types.core.TimeIntervals(...
'description', 'test dynamic table column',...
'colnames', {'start_time','stop_time','randomvalues'}, ...
'start_time', start_col, ...
'stop_time', stop_col, ...
'randomvalues', randomval_col, ...
'id', id_col ...
);

Export Table
------------

Create NWB file with ``TimeIntervals`` table and export.

.. code-block:: matlab

% Create NwbFile object with required arguments
file = NwbFile( ...
'session_start_time', datetime('2022-01-01 00:00:00', 'TimeZone', 'local'), ...
'identifier', 'ident1', ...
'session_description', 'test file' ...
);
% Assign to intervals_trials
file.intervals_trials = trials_table;
% Export
nwbExport(file, 'testFileNoDataPipes.nwb');

You can examine the dimensions of the datasets on file using `HDFView <https://www.hdfgroup.org/downloads/hdfview/>`_. Screenshots for this file are below.
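If you prefer to stay in MATLAB, ``h5disp`` and ``h5info`` can show the same information. The group path below assumes the default storage location for the trials table; note that MATLAB's high-level HDF5 functions report sizes in MATLAB's column-major order, the reverse of what HDFView displays:

.. code-block:: matlab

% Assumed path: the trials table is stored under /intervals/trials
h5disp('testFileNoDataPipes.nwb', '/intervals/trials');
info = h5info('testFileNoDataPipes.nwb', '/intervals/trials/randomvalues');
disp(info.Dataspace.Size) % MATLAB order, reverse of the on-disk C order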

.. image:: ../../_static/tutorials/media/dimensionMapNoDataPipes/image_0.png
:class: tutorial-media
:width: 950px
:alt: image_0.png

.. image:: ../../_static/tutorials/media/dimensionMapNoDataPipes/image_1.png
:class: tutorial-media
:width: 952px
:alt: image_1.png