Monday, 25 April 2016

Getting Started on R in Linux Mint Debian Edition 2: Loading a data file

Installing R is a breeze on Linux Mint Debian Edition 2. You can install it from the apt repository. The current version there is 3.1.1 as of 26 April 2016.

Prior to working with the dataset, it is good practice to check your current working directory and, if needed, set it to the directory you want to work in:

1) getwd()
2) setwd("/path/to/working/directory")
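
As a minimal sketch of this check-then-set pattern, the snippet below switches to R's temporary directory and back again. Using tempdir() as the target is only an assumption so that the example runs anywhere; substitute your own path in quotes.

```r
# Remember where we started so we can return afterwards.
old_dir <- getwd()

# tempdir() is a stand-in for your own path,
# e.g. "/path/to/working/directory".
target_dir <- tempdir()
setwd(target_dir)

# Verify that the change took effect. normalizePath() resolves
# symlinks so the two paths compare reliably.
in_target <- normalizePath(getwd()) == normalizePath(target_dir)
print(in_target)

# Restore the original working directory.
setwd(old_dir)
```

Note that the path passed to setwd() must be a quoted string; an unquoted path is a syntax error.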

Next, we can load the .csv data file we want to work on. For example, if the file is XYZ.csv and we want to store the dataset as ABC:
 ABC = read.csv("XYZ.csv")
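
To try read.csv() without a real data file, here is a small sketch that writes a made-up CSV to a temporary file and loads it. The column names (Student, Age, Transportation) follow the ones used later in this post, but the values are purely hypothetical.

```r
# Write a tiny, made-up CSV to a temporary file (hypothetical data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Student,Age,Transportation",
  "30,15,Bus",
  "22,16,Bike",
  "40,15,Walking",
  "20,16,Bus"
), csv_path)

# Load it the same way as XYZ.csv above.
ABC <- read.csv(csv_path)

# Confirm the number of rows/columns and the column names.
print(dim(ABC))
print(names(ABC))
```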

After loading the dataset, we can issue some basic commands to inspect it. Here ABC holds data on an educational institution of the same name:

I) summary(ABC)
II) str(ABC)
III) sort(table(ABC$Student))
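
The three commands can be tried on a small stand-in data frame. The values below are invented for illustration and are not from any real ABC dataset.

```r
# A hypothetical stand-in for the ABC dataset.
ABC <- data.frame(
  Student = c(30, 25, 30, 20),
  Age = c(15, 16, 15, 16),
  Transportation = c("Bus", "Bike", "Walking", "Bus")
)

# Per-column summary statistics (min, quartiles, mean, max, ...).
print(summary(ABC))

# Compact view of the structure: column types and first values.
str(ABC)

# Frequency of each distinct Student value, in increasing order of count.
counts <- sort(table(ABC$Student))
print(counts)
```

summary() and str() describe the whole data frame, while table() counts how often each distinct value occurs in a single column.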

A bit on analysing data in a dataset:

Using the functions tapply() and table():

For example, to cross-tabulate the Student population in ABC institution by Age, we apply table(). Note that table() returns counts; wrapping the call in prop.table() converts them to proportions:

table(ABC$Student, ABC$Age)
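
A short sketch of the difference between counts and proportions, again on invented data. table() on its own gives counts; prop.table() is the base R function that turns those counts into proportions, either of the grand total or, with margin = 1, within each row.

```r
# Hypothetical stand-in data.
ABC <- data.frame(
  Student = c(30, 25, 30, 20),
  Age = c(15, 16, 15, 16)
)

# Cross-tabulated counts: rows are Student values, columns are Age values.
counts <- table(ABC$Student, ABC$Age)
print(counts)

# Proportions of the grand total (all cells sum to 1).
props <- prop.table(counts)
print(props)

# Proportions within each row (each row sums to 1).
row_props <- prop.table(counts, margin = 1)
print(row_props)
```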

Further, if we need to find the average Student population for each mode of Transportation, such as Bus, Bike, Walking, etc.:

tapply(ABC$Student, ABC$Transportation, mean)

To view the result in increasing order, we can wrap tapply() in sort():

sort(tapply(ABC$Student, ABC$Transportation, mean))
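
Here is a small worked sketch of grouping with tapply() and then sorting, using invented numbers. tapply() splits the first argument by the groups in the second and applies the given function to each group.

```r
# Hypothetical stand-in data.
ABC <- data.frame(
  Student = c(30, 22, 40, 20),
  Transportation = c("Bus", "Bike", "Walking", "Bus")
)

# Mean of Student within each Transportation group.
means <- tapply(ABC$Student, ABC$Transportation, mean)
print(means)

# The same means in increasing order; the group names are kept.
sorted_means <- sort(means)
print(sorted_means)
```

With this data the Bus group averages (30 + 20) / 2 = 25, so the sorted order is Bike (22), Bus (25), Walking (40). Passing decreasing = TRUE to sort() reverses the order.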

To quit R without saving the workspace, we can use the command:
quit("no")

Alternatively, to save the workspace before quitting:
quit("yes")
 
