4.1.1. What is PyMoDAQ’s Data?
Data in PyMoDAQ are objects with many characteristics:
a type: float, int, …
a dimensionality: Data0D, Data1D, Data2D and we will discuss about DataND
actual data as numpy arrays
Because of this variety, PyMoDAQ introduce a set of objects including metadata (for instance the time of acquisition)
and various methods and properties to manipulate those (getting name, slicing, concatenating…). The most basic object
DataLowLevel whose all data objects will inherit. It is very basic and will only store a name as a string and a
timestamp from its time of creation.
Then one have
DataBase objects that stores homogeneous data (data of same type) having the same shape as a list of numpy arrays.
Numpy is fundamental in python and it was obvious to choose that. However, instruments can acquire data having the same type and shape but from different channels. It then makes sense to have a list of numpy arrays.
Figure Fig. 4.2 presents the different types of data objects introduced by PyMoDAQ, which are also described below with examples on how to use them.
DataBase, see Data Management, is the most basic object to store data. It takes as argument a name, a DataSource, a DataDim, a DataDistribution, the actual data as a list of numpy arrays (even for scalars), labels (a name for each element in the list), eventually an origin (a string from which module it originates) and optional named arguments.
>>> import numpy as np >>> from pymodaq.utils.data import DataBase, DataSource, DataDim, DataDistribution >>> data = DataBase('mydata', source=DataSource['raw'],\ distribution=DataDistribution['uniform'], data=[np.array([1,2,3]), np.array([4,5,6])],\ labels=['channel1', 'channel2'], origin="documentation's code")
When instantiated, some checks are performed:
checking the homogeneity of the data
he consistency of the dimensionality and the shape of the numpy arrays
if no dimensionality is given, it is inferred from the data’s shape
Useful properties can then be used to check and manipulate the data. For instance one can check the length of the object (number of numpy arrays in the list), the size (number of elements in the numpy arrays), the shape (shape of the numpy arrays).
>>> data.dim <DataDim.Data1D: 1> >>> data.source <DataSource.raw: 0> >>> data.shape (3,) >>> data.length 2 >>> data.size 3
One can also make mathematical operations between data objects (sum, substraction, averaging) or appending numpy arrays (of same type and shape) to the data object and iterating over the numpy arrays with the standard for loop.
>>> for subdata in data: print(subdata) print(subdata.shape) [1 2 3] (3,) [4 5 6] (3,)
For a full description see What is PyMoDAQ’s Data?.
One course for data that are not scalar, a very important information is the axis associated with the data (one axis for waveforms, two for 2D data or more fro hyperspectral data). PyMoDAQ therefore introduces Axis and DataWithAxes objects.
The Axis object stores the information about the data’s axis
>>> from pymodaq.utils.data import Axis >>> axis = Axis('myaxis', units='seconds', data=np.array([3,7,11,15]), index=0) >>> axis Axis: <label: myaxis> - <units: seconds> - <index: 0>
It has a name, units, actual data as a numpy array and an index referring to which dimension of Data
the axis is referring to. For example,
index=0 for the vertical axis of 2D data and
index=1 for the
horizontal (or inversely, it’s up to you…).
Because there is no need to store a linearly spaced array, when instantiated, the
Axis object will, for linear
axis’s data replace it by None but compute an offset and a scaling factor
>>> axis.data None >>> axis.offset 3 >>> axis.scaling 4.0 >>> axis.size 4
Axis object has also properties and methods to manipulate the object, for instance to retrieve the associated numpy array:
>>> axis.get_data() array([ 3., 7., 11., 15.])
and mathematical methods:
>>> axis.mean() 11.0 >>> axis.find_index(11.0) 2
and a special slicer property to get subparts of the axis’s data (but as a new
>>> axis.iaxis[2:].get_data() array([11., 15.])
When dealing with data having axes, the
DataBase object is no more enough to describe the data.
PyMoDAQ therefore introduces
DataWithAxes which inherits from
DataBase and introduces more
metadata and functionalities.
>>> from pymodaq.utils.data import DataWithAxes >>> data = DataWithAxes('mydata', source=DataSource['raw'], dim=DataDim['Data2D'], \ distribution=DataDistribution['uniform'], data=[np.array([[1,2,3], [4,5,6]])],\ axes=[Axis('vaxis', index=0, data=np.array([-1, 1])), Axis('haxis', index=1, data=np.array([10, 11, 12]))]) >>> data <DataWithAxes, mydata, (|2, 3)> >>> data.axes [Axis: <label: vaxis> - <units: > - <index: 0>, Axis: <label: haxis> - <units: > - <index: 1>]
This object has a few more methods and properties related to the presence of axes. It has in particular an
AxesManager attribute that deals with the
Axis objects and the Data’s representation
Here meaning the data has a signal shape of
(2, 3). The notion of signal will be highlighted in the next
It also has a slicer property to get subdata:
>>> sub_data = data.isig[1:, 1:] >>> sub_data.data array([5, 6]) >>> sub_data = data.isig[:, 1:] >>> sub_data.data array([[2, 3], [5, 6]])
18.104.22.168. Uniform and Spread Data
So far, everything we’ve said can be well understood for data taken on a uniform grid (1D, 2D or more). But
some scanning possibilities of the DAQ_Scan (Tabular) allows to scan on specifics (and possibly random) values
of the actuators. In that case the distribution is
DataDistribution['spread']. Such distribution will be
differently plotted and differently saved in a h5file. It’s dimensionality will be DataND and a specific AxesManager
will be used. Let’s consider an example:
One can take images data (20x30 pixels) as a function of 2 parameters, say xaxis and yaxis non-uniformly spaced
>>> data.shape = (150, 20, 30) >>> data.nav_indexes = (0,)
The first dimension (150) corresponds to the navigation (there are 150 non uniform data points taken) The second and third correspond to signal data, here an image of size (20x30 pixels) so:
nav_indexesis (0, )
sig_indexesis (1, 2)
>>> xaxis = Axis(name=xaxis, index=0, data=...) >>> yaxis = Axis(name=yaxis, index=0, data=...)
both of length 150 and both referring to the first index (0) of the shape
In fact from such a data shape the number of navigation axes is unknown . In our example, they are 2. To somehow
keep track of some ordering in these navigation axes, one adds an attribute to the
Axis object: the
>>> xaxis = Axis(name=xaxis, index=0, spread_order=0, data=...) >>> yaxis = Axis(name=yaxis, index=0, spread_order=1, data=...)
This ordering will be very important for plotting of the data, see for instance below for an adaptive scan:
22.214.171.124. Special DataWithAxes
For explicit meaning, several classes are inheriting
DataWithAxes with adhoc attributes such as:
DataWithAxeswith its source set to
DataRawto be used within Instrument plugins
DataWithAxeswith its source set to
DataCalculatedto be used when processing data using ROI.
In general a given instrument (hence its PyMoDAQ’s Instrument plugin) will generate similar data (for instance several
Data1D waveforms for each channel of an oscilloscope). Such data can be completely defined using
DataWithAxes as we
However, when then plotting such data, the user can decide to use ROI to extract some meaningfull information to be
displayed in a live DAQ_Scan plot. This means that the corresponding DAQ_Viewer will produce both Data1D’s data but also
several Data0D’s ones depending on the number of used ROIs. To export (emit signals) or save (to h5), it would be much
better to have a specialized object to deal with these non-similar data. This is the role of the
DataToExport is a
DataLowLevel object with an extra attribute data, that is actually a list of
>>> from pymodaq.utils.data import DataToExport, DataRaw >>> dwa0D = DataRaw('dwa0D', data=[np.array(), np.array() , np.array()]) >>> dwa1D = DataRaw('dwa1D', data=[np.array([1, 2 , 3])]) >>> dte = DataToExport(name='a_lot_of_different_data', data=[dwa0D, dwa1D]) >>> dte DataToExport: a_lot_of_different_data <len:2>
It has a length of 2 because contains 2
DataWithAxes objects (dwa). One can then easily get the data from it :
>>> dte <DataRaw, dwa0D, (|1)>
or get dwa from their dimensionality, their name, the number of axes they have …
>>> dte.get_data_from_dim('Data1D').data <DataRaw, dwa1D, (|3)> >>> dte.get_names() ['dwa0D', 'dwa1D'] >>> dte.get_data_from_name('dwa0D') <DataRaw, dwa0D, (|1)>
Dwa can also be appended or removed to/from a
For more details see Union of Data