Tutorial Video
Cell and Double Column Vectors
A table consists of a series of Variables (Columns). Let us first create these columns individually.
Names (Cell)
Names={'Simon';'Lisa';'Peter';'John';'Claire';'Ashley'}
class(Names)
Returns a cell.
Test1Results (Double)
Test1Results=[62;73;73;73;55;44]
class(Test1Results)
Returns a double meaning each number is stored using 64 bits.
Test2Results
Test2Results=[62;74;75;74;55;45]
class(Test2Results)
Returns a double meaning each number is stored using 64 bits.
Test3Results
Test3Results=[63;77;77;83;56;40]
class(Test3Results)
Returns a double meaning each number is stored using 64 bits.
DateEnrolled
DateEnrolled={'11/01/2019';'21/01/2019';'15/01/2019';'17/01/2019';'22/01/2019';'24/01/2019'}
class(DateEnrolled)
Returns a cell. Although this is in a date time format, it is input as a Cell Array (this will be amended later).
Creating a Table
You'll notice that we have created:
- Names (6 by 1 Cell)
- Test1Results (6 by 1 Double)
- Test2Results (6 by 1 Double)
- Test3Results (6 by 1 Double)
- DateEnrolled (6 by 1 Cell)
As these have an equal number of Rows (6) we can create a Table using these variables:
DataTable=table(Names,Test1Results,Test2Results,Test3Results,DateEnrolled)
Looking Up Properties
We can use dot indexing to lookup the table Properties. To the left hand side we have the Table Name we want to lookup and to the right hand side we have the Properties:
DataTable.Properties
If we want to lookup the Variable Names we can use:
DataTable.Properties.VariableNames
The output is a Cell Array. If we want to update it we can type in the following:
DataTable.Properties.VariableNames={'Names','Test1','Test2','Test3','DateStarted'}
Selecting a Variable (Column)
In a Table, each Variable becomes a Column. The above Table therefore consists of 5 Variables which can be accessed using dot indexing:
DataTable.Names
DataTable.Test1
DataTable.Test2
DataTable.Test3
DataTable.DateStarted
Sorting by Value
Let's sort DataTable alphabetically by the Variable 'Names'
DataTable=sortrows(DataTable,'Names','ascend')
Here we are updating DataTable so it is used both as an input and output of the function sortrows
. The second input is the input which we will use to sort the results and the third input is the direction we wish to sort the results out by, in this case we will sort alphabetically with A at the Top and Z at the bottom so we will use 'ascend'
Show this using MATLABs what you see is what you get interface.
Let's first create a new table DataTable2 which sorts the data by three layers the first layer being the value of Test1, the second being the value of Test2 and the third being the value of Test3. If one was using Excel, they would use Sort with the drop downs below.
In MATLAB we can use the sortrows
function again but this time with a cell array as the first input. This time we will create a new table as the output:
DataTable2=sortrows(DataTable,{'Test1','Test2','Test3'},'descend')
Operating on a Variable (Column)
The last column is a series of strings but we want them to be dates, we can convert them to dates using the function datetime
DataTable2.DateStarted=datetime(DataTable2.DateStarted)
Adding a Column to the Table
Supposing we want to calculate the average of the three tests and add it to a Column at the end of the table:
DataTable2.MeanTestResults=mean([DataTable2.Test1,DataTable2.Test2,DataTable2.Test3],2)
The first input of the function mean is the three Variables selected using dot indexing [DataTable2.Test1,DataTable2.Test2,DataTable2.Test3]
and the second input is the direction 2 meaning we act on Rows opposed to Columns.
The output is a new column DataTable2.MeanTestResults
which by default is added to the end of the Table.
Rearranging Variables (Columns)
Assuming we want the DateStarted to be the second column opposed to the last. We can rearrange the Variables and save the output to a new table using the Variable (column) numbers. The first input being :
meaning all rows are selected and the second input being the reordered variable (column) numbers
DataTable3=DataTable2(:,[1,5,2,3,4,6])
We can also rearrange the Variables and save the output to a new table using the Variable (column) names. Once again the first input being :
meaning all rows are selected and the second input being the reordered variable (column) names as a cell array
DataTable4=DataTable2(:,{'Names','DateStarted','Test1','Test2','Test3','MeanTestResults'})
Categorical Variable (Column)
DataTable4.FinalGrades={'A';'B';'C';'D';'E';'A'}
DataTable4.FinalGrades=categorical(DataTable4.FinalGrades)
Supposing we are interested in those who have a Grade 'A'
then we can created a Variable:
Aselect={'A'}
In order to use it with the Categorical Column, it itself must be Categorical:
Aselect=categorical(Aselect)
Now we can construct a Column Vector which will be 1 (true) when DataTable4.FinalGrades==ASelect
and 0 (false) when DataTable4.FinalGrades~=ASelect
LogicalRowSelect=DataTable4.FinalGrades==Aselect
We can then use this to Logically select the Rows where the DataTable4.FinalGrades
are 'A'
and all Variables (Columns):
DataTableA=DataTable4(LogicalRowSelect,:)
Supposing instead we want to add this as an additional Column to DataTable4:
DataTable4.IsA=LogicalRowSelect
Class of Variable (Column)
We can now check the class of each Variables:
class(DataTable4.Names)
cell
class(DataTable4.Test1)
double
class(DataTable4.Test2)
double
class(DataTable4.Test3)
double
class(DataTable4.MeanTestResults)
double
class(DataTable4.DateStarted)
datetime
class(DataTable4.FinalGrades)
categorical
class(DataTable4.IsA)
logical
Categorical Variable (Column) Dependent on Criteria
And supposing we have the following criteria:
Note that there are 5 Grades and 6 Levels (number of Grades plus 1) because each Grade has a lower and upper limit
Levels=[0;50;55;70;80;100]
Grades={'F';'D';'C';'B';'A'}
We can specify the FinalGrades as a categorical Array of the Grades which correspond to the Levels in the MeanTestResults.
DataTable4.FinalGrades=discretize(DataTable4.MeanTestResults,Levels,'categorical',Grades)
Removing a Variable (Column) from a Table
Since the Logical Column IsA, no longer makes sense we can remove it using:
DataTable4.IsA=[]
Merging Categories
Supposing Grades A, B and C correspond to a Pass and Grades D and F correspond to a Fail. We can use the Merge Categories Column to:
Creates a New Column from the merged categories:
DataTable4.Status=mergecats(DataTable4.FinalGrades,{'A','B','C'},'Pass')
Merges the remaining categories:
DataTable4.Status=mergecats(DataTable4.Status,{'D','F'},'Fail')
Note the line would update the values 'Pass' withe the corresponding 'A', 'B' and 'C' values:
DataTable4.Status=mergecats(DataTable4.FinalGrades,{'D','F'},'Fail')