IRootLab
An Open-Source MATLAB toolbox for vibrational biospectroscopy
Dataset class.
X property
The X property has dimensions [no]x[nf]. Each row represents one physical spectrum. Each column represents a "feature".
In IRootLab, dataset classes are 0-based, so valid classes range from 0 to (nc-1). Classes correspond to elements in the classlabels property.
Negative classes have special meanings. See get_negative_meaning.m
The class labels may define a multi-level labelling system, with different levels separated by a vertical bar ("|"). See the example:
In the example above, the first level represents the cancer grade, whereas the second level represents the country. If a spectrum was taken from an individual who is from Ireland and has Low-grade cancer, its class will be 4 (remember that the classes are zero-based!). There are resources to work with class levels (see blmisc_classlabels_hierarchy).
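As an illustration of the multi-level labels described above, here is a minimal sketch in plain MATLAB. The label list is hypothetical (it is not the example from the original page); it is arranged so that 'Low-grade|Ireland' is the fifth label, i.e. class 4 in zero-based numbering. It does not call any IRootLab function.

```matlab
% Hypothetical two-level class labels: <grade>|<country>
classlabels = {'High-grade|Brazil', 'High-grade|Ireland', 'High-grade|UK', ...
               'Low-grade|Brazil',  'Low-grade|Ireland',  'Low-grade|UK'};

% Zero-based class number: position in classlabels minus one
class_no = find(strcmp(classlabels, 'Low-grade|Ireland')) - 1;

% Split one label into its levels using the "|" separator
levels = strsplit(classlabels{class_no + 1}, '|');
no_levels = numel(levels);
```

Here `levels{1}` holds the grade and `levels{2}` the country; the variable names `class_no`, `levels` and `no_levels` are chosen for this sketch only.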
Public Member Functions
function | irdata ()
Constructor.
function get | no (in data)
no getter
function get | nf (in data)
nf getter
function get | nonf (in data)
nonf getter
function get | nc (in data)
nc getter
function get | no_groups (in data)
no_groups getter
function get | width (in data)
function | get_groupidxs_from_groupcodes (in data, in codes)
function | get_obsidxs_from_groupidxs (in data, in groupidxs)
function | get_no_levels (in data)
Returns the number of levels in classlabels.
function | check (in data)
Checks if internal variables are synchronized with some troubleshooting.
function | import_from_struct (in data, in DATA)
function | eliminate_unused_classlabels (in data)
function | mount_from_signal (in signal, in no_inputs, in future)
Populates from a time series.
function | get_props_to_copy (in data)
Gets a list with all properties except the ones that will be split.
function | copy_emptyrows (in data)
function | split_map (in data, in map, in feamap, in fext)
function | split_splitidxs (in data)
Splits dataset into one or more datasets using its own splitidxs property.
function | map_rows (in data, in idxnew)
function | select_features (in data, in idxs)
function | transform_linear (in data, in L, in L_fea_prefix)
Transforms dataset using loadings matrix L.
function | get_fea_names (in data, in idxs)
Returns the names of the features.
function | make_groupnumbers (in data)
Fills in the groupnumbers property based on the groupcodes property.
function | assert_fix (in data)
Makes the dataset properties consistent with each other.
function | get_weights (in data, in exponent)
function | transpose2 (in data)
function | assert_not_nan (in data)
Asserts that there is no NaN in data.X.
function | get_description (in o, in flag_short)
function | setbatch (in o, in params)
Sets several properties of an object at once.
function | get_methodname (in o, in flag_short)
function | get_report (in o)
Object reports are plain text. HTML would be cool but c'mon, we don't need that sophistication.
function | get_html (in o, in flag_stylesheet)
function | get_params (in o, in data)
Calls Parameters GUI.
function | extract_log (in o)
function | get_ancestry (in o, in flag_title)
Public Attributes
Property | no
number of "observations" (e.g. spectra)
Property | nf
number of features (i.e., variables)
Property | nonf
a vector [no, nf]
Property | no_groups
number of groups
Property | nc
number of classes
Property | X
[no]x[nf] matrix. Data matrix
Property | classes
Property | classlabels
Cell of strings. Class labels.
Property | groupcodes
(optional) [no]x[1] Cell of strings. Group codes (e.g. patient names)
Property | obsnames
(optional) [no]x[1] Cell of strings. Observation names (e.g. file names of the individual spectra)
Property | filename
Property | filetype
mat or txt
Property | fea_x
feature x-axis
Property | fea_names
(optional) Cell of strings. Name of each feature
Property | xname
x-axis name, defaults to 'Wavenumber (cm^{-1})'
Property | xunit
x-axis unit, defaults to 'cm^{-1}'
Property | yname
y-axis name, defaults to 'Absorbance'
Property | yunit
y-axis unit, defaults to 'a.u.'
Property | height
Height of image. Spectra start counting from the bottom left upwards.
Property | width
Width of image. Width is actually calculated as no/height. If the result is not an integer, an error will occur.
Property | direction
Property | Y
Output (instead of classes). For regression instead of classification.
Property | groupnumbers
For easier access than groupcodes.
Property | obsids
Property | splitidxs
Property | title
Property | color
Protected Member Functions
function | do_get_html (in o)
HTML inner body.
function | do_get_report (in o)
Default report.
Protected Attributes
Property | rowfieldnames
fields to be split or merged when dataset is split or merged
Property | flags_cell
Property | classtitle
Class Title. Should have a descriptive name, as short as possible.
Property | short
Short for the method name.
Property | flag_params
Property | flag_ui
(GUI setting) Whether to "publish" in blockmenu and datatool. Note that a class can be "published" without a GUI (set flag_params=0 in this case, at the class constructor).
Property | moreactions
(GUI setting) String cell containing names of methods that may be called from the GUIs
function irdata::irdata ()
Constructor.
function irdata::assert_fix (in data)
Makes the dataset properties consistent with each other.
This is both an assertion routine and a "fixing" routine. The two parts are implemented sequentially, so it will be easy to split this in the future.
The assertion part runs a number of checks and throws an error if there is no hope of making the dataset consistent. Fatal problems will be:
The subsequent fix part may make a number of changes to the dataset:
function irdata::assert_not_nan (in data)
Asserts that there is no NaN in data.X.
function irdata::check (in data)
Checks if internal variables are synchronized with some troubleshooting.
function irdata::copy_emptyrows (in data)
Makes a copy with empty fields whose names are in .rowfieldnames. Additionally, resets .height.
function irdata::do_get_html (in o) [protected]
HTML inner body.
function do_get_report (in o) [protected, inherited]
Default report.
function irdata::eliminate_unused_classlabels (in data)
Retains only the labels corresponding to classes that exist in the dataset; classes are renumbered accordingly.
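The renumbering described above can be sketched in plain MATLAB as follows. This is an illustration of the idea, not IRootLab's implementation; the data are made up, and leaving negative classes untouched is an assumption of this sketch.

```matlab
classlabels = {'A', 'B', 'C', 'D'};   % hypothetical labels for classes 0..3
classes = [0; 3; 3; 0; -1; 3];        % classes 1 and 2 never occur; -1 is a "special" class

valid = classes >= 0;                 % assumption: negative classes are left as they are
used = unique(classes(valid));        % zero-based classes that actually occur, sorted

% Position of each valid class within "used" gives its new one-based number
[~, pos] = ismember(classes(valid), used);
new_classes = classes;
new_classes(valid) = pos - 1;         % back to zero-based: 0..numel(used)-1

% Keep only the labels that are still in use (row index keeps a row cell)
new_classlabels = classlabels(used' + 1);
```

After this, the old class 3 becomes class 1 and only the labels 'A' and 'D' remain.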
function extract_log (in o) [inherited]
function get_ancestry (in o, in flag_title) [inherited]
flag_title=1 |
function get_description (in o, in flag_short) [inherited]
Returns description string
Precedence according to flag_short:
flag_short=0 | I am sealing this to make sure that no class will try to improvise on this function.
function irdata::get_fea_names (in data, in idxs)
Returns the names of the features.
This function checks the fea_names property and, if it is empty, makes feature names on the fly using the fea_x property.
idxs | Optional list of indexes to be returned
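A minimal sketch of this fallback in plain MATLAB, using local variables instead of an irdata object. The format of the generated names ('%g' applied to each fea_x value) is an assumption made for illustration; the toolbox may format them differently.

```matlab
fea_x = [1800 1798 1796];   % hypothetical x-axis (wavenumbers)
fea_names = {};             % empty, so names have to be generated
idxs = [1 3];               % features whose names are requested

if isempty(fea_names)
    % Assumed naming scheme: the x-axis value printed as text
    names = arrayfun(@(x) sprintf('%g', x), fea_x(idxs), 'UniformOutput', false);
else
    names = fea_names(idxs);
end
```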
function irdata::get_groupidxs_from_groupcodes (in data, in codes)
Converts group codes to group indexes. Indexes will point to the "unique(data.groupcodes)" vector.
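The lookup against "unique(data.groupcodes)" can be sketched in plain MATLAB like this. The group codes are invented, and using ismember is one possible way to do the conversion, not necessarily what the toolbox does.

```matlab
groupcodes = {'P03'; 'P01'; 'P03'; 'P02'; 'P01'};   % hypothetical patient codes, one per spectrum
codes = {'P03', 'P01'};                             % codes to convert

ucodes = unique(groupcodes);                        % sorted: {'P01'; 'P02'; 'P03'}
[found, groupidxs] = ismember(codes, ucodes);       % groupidxs(i) is the position of codes{i} in ucodes
```

Any code that does not occur in the dataset would show up as `found(i) == 0`.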
function get_html (in o, in flag_stylesheet) [inherited]
flag_stylesheet=1 | Whether to include the stylesheet in the HTML
function get_methodname (in o, in flag_short) [inherited]
This is used only to compose sequence string e.g. xxx->yyy->zzz
flag_short=0 |
function irdata::get_no_levels (in data)
Counts to pre-allocate.
Loops again to fill.
Returns the number of levels in classlabels.
function irdata::get_obsidxs_from_groupidxs (in data, in groupidxs)
Converts group indexes to observation indexes. CAUTION: be sure that idxs_codes contains indexes that point to the "unique(data.groupcodes)" vector.
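A possible plain-MATLAB sketch of this conversion: each observation is first given the index of its group within unique(groupcodes), and the observations belonging to the requested groups are then collected. The data and the use of ismember/find are assumptions of this sketch.

```matlab
groupcodes = {'P03'; 'P01'; 'P03'; 'P02'; 'P01'};   % hypothetical group code per observation
groupidxs = [3 1];                                  % indexes into unique(groupcodes)

% groupnumbers(i) is the index of observation i's group within ucodes
[ucodes, ~, groupnumbers] = unique(groupcodes);

% Rows whose group is among the requested group indexes
obsidxs = find(ismember(groupnumbers, groupidxs));
```

Note that `find` returns the rows in ascending order, not grouped by the order of `groupidxs`; the actual function may order its output differently.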
function get_params (in o, in data) [inherited]
Calls Parameters GUI.
If flag_params, tries uip_<class>.m. If that fails, tries uip_<ancestor>.m, and so on.
function irdata::get_props_to_copy (in data)
This is the maximum number of rows of the dataset before something blows.
Mounts X and Y: each data row will stand for [s(n) s(n-1) s(n-2) ...]. This way, the dot product between the row and the coefficients of a linear filter is a causal convolution.
Gets a list with all properties except the ones that will be split.
function get_report (in o) [inherited]
Object reports are plain text. HTML would be cool but c'mon, we don't need that sophistication.
function irdata::get_weights (in data, in exponent)
Gets weights for each class.
Weights are inversely proportional to the number of observations in each class.
Weights are normalized, so that their sum equals one.
exponent | =1. Exponent to power all weights before they are normalized to sum=1
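The weighting rule above can be written out in plain MATLAB as a short sketch. It assumes non-negative classes with at least one observation each; the variable names are chosen here and are not taken from the toolbox.

```matlab
classes = [0; 0; 0; 1; 2; 2];                  % hypothetical zero-based classes
exponent = 1;                                  % default according to the parameter description

nc = max(classes) + 1;                         % number of classes
counts = accumarray(classes + 1, 1, [nc 1]);   % observations per class: [3; 1; 2]

w = (1 ./ counts) .^ exponent;                 % inversely proportional to class size, then powered
w = w / sum(w);                                % normalized so that the weights sum to one
```

With these counts the weights come out as [2; 6; 3]/11, so the smallest class (class 1) receives the largest weight.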
function irdata::import_from_struct (in data, in DATA)
Copies structure fields to object fields. Contains a dictionary with many old property names for backward compatibility. Also works when the input is an object.
function irdata::make_groupnumbers (in data)
Fills in the groupnumbers property based on the groupcodes property.
function irdata::map_rows (in data, in idxnew)
Maps rows. Single-output version of split_map()
Returns new object
function irdata::mount_from_signal (in signal, in no_inputs, in future)
Populates from a time series.
This function makes X and Y. X will be a Toeplitz matrix.
Inputs:
signal | vector s(n)
no_inputs | dimensionality of the input data space (aka number of features or nf)
future | "prediction task", which will be to predict s(n+future)
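To make the construction concrete, the following plain-MATLAB sketch builds such an X and Y from a toy signal. It assumes that each row holds [s(n) s(n-1) ... s(n-no_inputs+1)] and that only rows with a complete history and an available target are kept; how the toolbox treats the edges of the signal is not stated here, so that part is an assumption.

```matlab
signal = (1:10)';           % toy signal s(n)
no_inputs = 3;              % number of features (nf)
future = 1;                 % predict s(n+future)

N = numel(signal);
n = (no_inputs:(N - future))';          % time indexes with full history and a known target

X = zeros(numel(n), no_inputs);
for k = 1:no_inputs
    X(:, k) = signal(n - (k - 1));      % column k holds s(n-k+1)
end
Y = signal(n + future);                 % regression target for each row

% Each row dotted with filter coefficients h gives the causal filter output at time n
h = [0.5; 0.3; 0.2];
y_rows = X * h;
y_filter = filter(h, 1, signal);
```

X is constant along its diagonals (Toeplitz), and `y_rows` matches `y_filter(n)`, which is the causal-convolution property of this row layout.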
function get irdata::nc (in data)
nc getter
function get irdata::nf (in data)
nf getter
function get irdata::no (in data)
no getter
function get irdata::no_groups (in data)
no_groups getter
function get irdata::nonf (in data)
nonf getter
function irdata::select_features (in data, in idxs)
Manual feature selection.
Inputs:
idxs | list of column indexes to select, or cell thereof
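In terms of the underlying matrix, selecting features amounts to keeping a subset of columns of X together with the matching entries of the x-axis. The sketch below shows this on plain arrays with made-up values; it covers only the case where idxs is a single list (not a cell) and does not call the toolbox.

```matlab
X = magic(4);                        % 4 observations x 4 features (toy data)
fea_x = [1800 1798 1796 1794];       % hypothetical x-axis
idxs = [2 4];                        % columns to keep

X_sel = X(:, idxs);                  % same rows, selected columns only
fea_x_sel = fea_x(idxs);             % x-axis stays aligned with the columns
```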
function setbatch (in o, in params) [inherited]
Sets several properties of an object at once.
params | Cell following the pattern {'property1', value1, 'property2', value2, ...}
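The name/value pattern of params can be illustrated with a plain struct standing in for the object. This is a sketch of the calling convention only; the loop below is an assumption about how such a cell could be consumed, not the toolbox code.

```matlab
o = struct('title', '', 'color', []);                   % stand-in for an object with two properties
params = {'title', 'Training set', 'color', [1 0 0]};   % {'property1', value1, 'property2', value2}

for i = 1:2:numel(params)
    o.(params{i}) = params{i + 1};                      % dynamic field name, paired with its value
end
```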
function irdata::split_map (in data, in map, in feamap, in fext)
Splits dataset into one or more datasets using row maps
map | 1D or 2D cell array of row indexes
feamap | (optional
out | Matrix of datasets.
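The row-map idea can be sketched in plain MATLAB with structs standing in for datasets: each cell of map lists the rows that go into one output, and the output has the same shape as map. Only X and classes are split here, as an assumption for brevity; feamap and fext are not covered.

```matlab
X = (1:6)' * [1 10];                 % 6 observations x 2 features (toy data)
classes = [0; 0; 1; 1; 2; 2];
map = {[1 3 5], [2 4 6]};            % 1D cell array of row indexes

out = cell(size(map));               % one output "dataset" per map cell
for i = 1:numel(map)
    rows = map{i};
    out{i} = struct('X', X(rows, :), 'classes', classes(rows));
end
```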
function irdata::split_splitidxs (in data)
prepares a clone, except for the fields in rowfieldnames
maps the rowfieldnames fields be used or not as necessary and no error will occur.
Splits dataset into one or more datasets using its own splitidxs property.
map | 1D or 2D cell array of row indexes
out | Matrix of datasets.
function irdata::transform_linear (in data, in L, in L_fea_prefix)
Transforms dataset using loadings matrix L
data.X = data.X*L; data.xlabel = 'Factor'; data.ylabel = 'Score';
L | [nf]x[any] Loadings matrix
L_fea_prefix=[] | Prefix to make new feature names.
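The core of the transformation is the product X*L. The sketch below applies it to toy matrices and builds new feature names from a prefix; the naming scheme (prefix followed by the factor number) is an assumption made for illustration.

```matlab
X = [1 2 3; 4 5 6];                  % 2 observations x 3 features (toy data)
L = [1 0; 0 1; 1 1];                 % [nf]x[2] loadings matrix
prefix = 'PC';                       % hypothetical value for L_fea_prefix

X_new = X * L;                       % scores: one column per factor
nf_new = size(L, 2);

% Assumed naming scheme for the new features: prefix + factor number
fea_names_new = arrayfun(@(k) sprintf('%s%d', prefix, k), 1:nf_new, ...
                         'UniformOutput', false);
```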
function irdata::transpose2 (in data)
Changes direction and swaps width and height.
This is called "transpose2" because MATLAB objects have a built-in "transpose" already.
function get irdata::width (in data)
Property irdata::classes |
[no]x[1] vector. Classes. Zero-based (first class is class zero).
Classes may be negative, with special meanings for negative values (see get_negative_meaning.m)
Property classtitle [protected, inherited]
Property color [inherited]
Property irdata::direction |
Property irdata::fea_names |
Property flag_params [protected, inherited]
=1. (GUI setting) Whether to call a GUI when the block is selected in blockmenu.m. If true, a routine called "uip_"<class name> will be called.
Property flag_ui [protected, inherited]
Property irdata::groupcodes |
Property irdata::groupnumbers |
Property irdata::height |
Property moreactions [protected, inherited]
Property irdata::obsnames |
Property irdata::rowfieldnames [protected]
Property short [protected, inherited]
Property irdata::width |
Property irdata::xname |
Property irdata::Y |
Property irdata::yname |