Cambridge-ICCS · sjavis · Jun 1, 2026
diff --git a/docs/api/index.rst b/docs/api/index.rst
@@ -9,4 +9,5 @@ API Documentation
    TSTORMS <tstorms_api>
    Tempest Extremes <tempest_extremes_api>
    TRACK <track_api>
+   Preprocessing <preprocessing_api>
    Utility functions <utils_api>
diff --git a/docs/api/preprocessing_api.rst b/docs/api/preprocessing_api.rst
@@ -0,0 +1,6 @@
+Preprocessing functions
+=======================
+
+.. automodule:: tctrack.preprocessing
+   :members:
+   :member-order: bysource
diff --git a/docs/api/utils_api.rst b/docs/api/utils_api.rst
@@ -4,4 +4,5 @@ Utility functions
 .. automodule:: tctrack.utils
 
 .. autofunction:: load_tracker_metadata
+
 .. autofunction:: read_tracker_metadata
diff --git a/docs/data/preprocessing_data.rst b/docs/data/preprocessing_data.rst
@@ -6,6 +6,11 @@ we detail how to perform some of the typically required preprocessing steps usin
 cf-python library. Other tools can be used for the same tasks, however we focus on
 cf-python since it provides a uniform interface and it is a dependency of TCTrack.
 
+We also provide simple wrapper functions in :mod:`tctrack.preprocessing` that can
+simplify each of these tasks. Examples of these are given in each of the subsections
+below. These functions also return the fields so the output files do not need to be
+written every time.
+
 For full documentation of the routines described on these pages and more see the
 `cf python documentation <https://ncas-cms.github.io/cf-python/>`_.
 
@@ -31,6 +36,17 @@ cf-python:
     # Write the combined data to a single file
     cf.write(field, "combined-output.nc")
 
+Or, equivalently, in TCTrack you can use
+:func:`tctrack.preprocessing.select_time_range`, as below. All of the other
+preprocessing functions can also be used to combine files if a specific time range is
+not required.
+
+.. code-block:: python
+
+    tctrack.preprocessing.select_time_range(
+        input_files, ["1950-01-01", "1950-04-01"], output_file="combined-output.nc"
+    )
+
 Combine Variables
 -----------------
 
@@ -46,6 +62,14 @@ read them in separately and then write them together:
     # Write the combined fields to a single file
     cf.write([field1, field2], "combined_file.nc")
 
+Using TCTrack:
+
+.. code-block:: python
+
+    tctrack.preprocessing.read_files(
+        ["var1_file.nc", "var2_file.nc"], output_file="combined_file.nc"
+    )
+
 Separating Variables
 --------------------
 
@@ -61,6 +85,15 @@ If variables instead need to be separated into multiple files, such as in :doc:`
     cf.write(field1, "var1_file.nc")
     cf.write(field2, "var2_file.nc")
 
+Using TCTrack:
+
+.. code-block:: python
+
+    tctrack.preprocessing.separate_variables(
+        "combined_file.nc",
+        {"var1": "var1_file.nc", "var2": "var2_file.nc"},
+    )
+
 Subsampling
 -----------
 
@@ -99,6 +132,19 @@ To remove the single-valued coordinate from the field use cf-python's
     # or, for a new field
     field3 = field2.squeeze()
 
+Using TCTrack, single-valued coordinates can be removed using the ``squeeze`` argument
+(see the first example below).
+
+.. code-block:: python
+
+    field2 = tctrack.preprocessing.subsample_field(
+        "var1_file.nc", {"Z": [5]}, squeeze=True
+    )
+    field3 = tctrack.preprocessing.subsample_field("var1_file.nc", {"X": [0, 5]})
+    field4 = tctrack.preprocessing.subsample_field(
+        "var1_file.nc", {"Y": slice(3, -3, 2)}
+    )
+
 Operations
 ----------
 
@@ -116,12 +162,20 @@ For example, to calculate vorticity from coincident velocity data we can use ``c
     # calculate vorticity
     w_field = cf.curl_xy(u_field, v_field, radius="earth")
     w_field.nc_set_variable("vorticity")
-    w_field.set_property("standard_name", "atmosphere_upward_absolute_vorticity")
+    w_field.set_property("standard_name", "atmosphere_upward_relative_vorticity")
     w_field.set_property("units", "s-1")
 
     # Save the new variable to NetCDF
     cf.write(w_field, "vorticity_file.nc")
 
+Using TCTrack:
+
+.. code-block:: python
+
+    tctrack.preprocessing.calculate_vorticity(
+        "u_file.nc", "v_file.nc", output_file="vorticity_file.nc"
+    )
+
 Or to take a mean over a coordinate:
 
 .. code-block:: python
@@ -136,19 +190,32 @@ Or to take a mean over a coordinate:
     # Save the new variable to NetCDF
     cf.write(field_zonal_mean, "zonal_mean_file.nc")
 
+Using TCTrack:
+
+.. code-block:: python
+
+    tctrack.preprocessing.collapse_field(
+        "file.nc", "mean", "X", output_file="zonal_mean_file.nc"
+    )
+
 Setting Fill Values
 ^^^^^^^^^^^^^^^^^^^
 
-Sometimes it us useful to replace fill values after an operation before writing to file.
+Sometimes it is useful to replace the 'fill values' after an operation but before
+writing to file.
 This can be done using cf-python's ``filled`` routine.
-For example, after to set any null or masked values to ``0.0`` after calculating
-vorticity above use:
+For example, to set any null or masked values to ``0.0`` (e.g. after calculating
+vorticity above) use the following before writing to file.
 
 .. code-block:: python
 
     w_field.filled(fill_value=0.0, inplace=True)
 
-before writing to file.
+Using TCTrack:
+
+.. code-block:: python
+
+    tctrack.preprocessing.replace_fill_value(w_field, 0.0, output_file="output.nc")
 
 Set NetCDF Variable Name
 ------------------------
@@ -167,6 +234,17 @@ To set specfic NetCDF variable names for the fields and coordinates you can use
     # Save with the new netcdf variable names
     cf.write(field, "slp_file.nc")
 
+Using TCTrack:
+
+.. code-block:: python
+
+    tctrack.preprocessing.set_netcdf_variable_name(
+        "var1_file.nc",
+        "slp",
+        coord_names={"latitude": "lat"},
+        output_file="slp_file.nc",
+    )
+
 Regridding
 ----------
 
@@ -208,6 +286,22 @@ To regrid onto a new grid:
 
 Note that regridding can be performed inplace using ``inplace=True``.
 
+Using TCTrack:
+
+.. code-block:: python
+
+    # Regrid onto a different variable
+    tctrack.preprocessing.regrid_to_field(
+        "var1_file.nc", "var2_file.nc", output_file="var1_regridded.nc"
+    )
+
+    # Regrid onto a new grid
+    latitude = np.arange(-90, 91, 1)
+    longitude = np.arange(-180, 181, 1)
+    tctrack.preprocessing.regrid_to_lat_lon(
+        "var1_file.nc", latitude, longitude, output_file="var1_regridded.nc"
+    )
+
 Gaussian Grid
 ^^^^^^^^^^^^^
 
@@ -238,3 +332,11 @@ objects to be used for the regridding.
     # Regrid
     field = field.regrids((lat_coord, lon_coord), method="linear")
     field.nc_clear_dataset_chunksizes()  # Avoids a possible error when writing
+
+Using TCTrack:
+
+.. code-block:: python
+
+    tctrack.preprocessing.regrid_to_gaussian(
+        "var1_file.nc", 256, output_file="var1_regridded.nc"
+    )
diff --git a/docs/getting-started/index.rst b/docs/getting-started/index.rst
@@ -101,11 +101,10 @@ noting that we may need to add the library to the dynamic path e.g.::
 esmpy
 ~~~~~
 
-Any regridding of data with cf-python requires `esmpy
-<https://earthsystemmodeling.org/esmpy/>`_ and `ESMF
+Any preprocessing that involves regridding of data using :mod:`tctrack.preprocessing` or
+cf-python requires `esmpy <https://earthsystemmodeling.org/esmpy/>`_ and `ESMF
 <https://earthsystemmodeling.org/>`_ as dependencies. This is not needed directly in the
-TCTrack package but may be needed for initial pre-processing of data, such as in the
-tutorial and described in the :doc:`../data/preprocessing_data` page.
+tracking algorithms.
 
 These are not pip-installable but can be installed in a conda environment::
 

diff --git a/docs/getting-started/tutorial.rst b/docs/getting-started/tutorial.rst
@@ -90,9 +90,9 @@ before running.
 Pre-processing of Data
 ----------------------
 
-From inside the conda environment run the regridding script to pre-process the data::
+From inside the conda environment run the script to pre-process the data::
 
-    python regrid.py
+    python preprocess_data.py
 
 This will pre-process the downloaded data as required for our codes and place it in
 ``data_processed/``.

diff --git a/src/tctrack/__init__.py b/src/tctrack/__init__.py
@@ -1,5 +1,5 @@
 """Package providing tropical cyclone tracking utilities."""
 
-from tctrack import core, tempest_extremes, track, tstorms
+from tctrack import core, preprocessing, tempest_extremes, track, tstorms, utils
 
-__all__ = ["core", "tempest_extremes", "track", "tstorms"]
+__all__ = ["core", "preprocessing", "tempest_extremes", "track", "tstorms", "utils"]