
R language first try


Why R?

R is a language and environment for creating statistical, data mining and data visualization applications.

It seems likely to feature more in the business world, given the strong drive towards advanced analytics and the fact that data science has become “a thing”. Microsoft also acquired Revolution Analytics, a company that focuses heavily on the R language.

R will most likely play an important role in the future for data-driven applications, especially when it comes to Microsoft’s data offerings. It can already be used within Azure ML.

For the Business Intelligence professional, it just makes sense to have some sort of literacy in this language. A developer can now supplement their toolbox with functionality that is not as easily achieved with the Microsoft SQL BI stack. For example, with a small amount of code you can bring in some Twitter content, mine the text, create a word cloud, and share the result with your users.

Timesheet Overview

For my first mini-project I created a small script that does the following:

  • Imports a timesheet for an incomplete month from Toggl (CSV)
  • Projects hours for the days left until the end of the month
  • Removes the public holidays, which are scraped from a web site
  • Plots the projected hours for the current month vs the total hours of a typical month


First install the needed packages
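A minimal sketch of this step. The package list is an assumption: ggplot2 is used for the plot later on, while rvest is only one possible choice for scraping. The helper `missing_packages` is a hypothetical name introduced for illustration.

```r
# Return the subset of pkgs that cannot be loaded on this machine.
# (missing_packages is a hypothetical helper, not part of any library.)
missing_packages <- function(pkgs) {
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

# Assumed package list; adjust it to whatever the script actually uses.
needed <- c("ggplot2", "rvest")
to_install <- missing_packages(needed)

# Install only what is missing, from an explicitly named CRAN mirror.
if (length(to_install) > 0) {
  install.packages(to_install, repos = "https://cloud.r-project.org")
}
```

Checking first means the script can be re-run without reinstalling everything each time.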


Import the timesheet data
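A rough sketch of the import, assuming the export has a "Start date" column and a "Duration" column in HH:MM:SS format. Both column names and all of the sample rows are made up for illustration; with a real export you would pass the file path to `read.csv()` instead of `text =`.

```r
# Made-up sample standing in for the exported CSV file.
csv_text <- "Start date,Duration
2015-05-18,07:30:00
2015-05-19,08:15:00
2015-05-20,06:45:00"

timesheet <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# read.csv() converts the header "Start date" to the syntactic name Start.date.
timesheet$date <- as.Date(timesheet$Start.date)

# Convert HH:MM:SS durations to decimal hours.
parts <- strsplit(timesheet$Duration, ":", fixed = TRUE)
timesheet$hours <- vapply(
  parts,
  function(p) sum(as.numeric(p) / c(1, 60, 3600)),
  numeric(1)
)

# Collapse multiple time entries into one row per day.
daily <- aggregate(hours ~ date, data = timesheet, FUN = sum)
print(daily)
```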


This is what the data set looks like so far:


Add a projection column
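One way this projection could look, as a sketch rather than the exact code used here. It assumes a `daily` data frame with `date` and `hours` columns (sample values are made up), fills the remaining weekdays of the month with 8 hours each, and flags those rows in a `projected` column.

```r
# Made-up logged days standing in for the imported timesheet.
daily <- data.frame(
  date  = as.Date(c("2015-05-18", "2015-05-19", "2015-05-20")),
  hours = c(7.5, 8.25, 6.75)
)

last_logged <- max(daily$date)

# Last day of the month = first day of the next month minus one day.
month_start <- as.Date(format(last_logged, "%Y-%m-01"))
month_end   <- seq(month_start, by = "month", length.out = 2)[2] - 1

# Days still to come; empty if the last logged day is already month end.
remaining <- if (last_logged < month_end) {
  seq(last_logged + 1, month_end, by = "day")
} else {
  as.Date(character(0))
}

# Keep weekdays only: %u gives 1 (Monday) to 7 (Sunday).
remaining <- remaining[as.integer(format(remaining, "%u")) <= 5]

projected <- data.frame(
  date      = remaining,
  hours     = rep(8, length(remaining)),   # assumed average working day
  projected = rep(TRUE, length(remaining))
)

daily$projected <- FALSE
total.projected.time <- rbind(daily, projected)
print(total.projected.time)
```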


Here’s what the total.projected.time data set looks like:


Remove the public holidays by reading the web page’s HTML. In this case they’re South African holidays. A special character was giving me problems, hence the find/replace.
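A sketch of the idea using base R only. The HTML fragment below is a made-up stand-in for the downloaded page (a real page could be fetched with `readLines()` on its URL), and the troublesome character is assumed to be a non-breaking space; the actual page and character may well differ.

```r
# Made-up fragment standing in for the holiday page's HTML.
html <- c(
  "<table>",
  "<tr><td>2015-05-01</td><td>Workers'\u00a0Day</td></tr>",
  "<tr><td>2015-06-16</td><td>Youth\u00a0Day</td></tr>",
  "</table>"
)

# Find/replace: swap the (assumed) non-breaking space for a normal space.
html <- gsub("\u00a0", " ", html, fixed = TRUE)

# Pull out every ISO-formatted date in the markup.
matches  <- regmatches(html, gregexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", html))
holidays <- as.Date(unlist(matches))

# Drop any day that falls on a holiday.
days <- data.frame(
  date  = as.Date(c("2015-04-30", "2015-05-01", "2015-05-04")),
  hours = 8
)
days <- days[!(days$date %in% holidays), ]
print(days)
```

For anything beyond a simple table, a proper HTML parser is likely to be more robust than a regular expression.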


Add the typical monthly hours, i.e. 8 hours per day, excluding weekends and holidays.
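The calculation can be sketched as a small function. `typical_month_hours` is a hypothetical name, and the month and holiday in the usage line are sample values.

```r
# Hours in a "typical" month: every weekday that is not a public holiday
# counts as hours_per_day hours. (Hypothetical helper for illustration.)
typical_month_hours <- function(any_date,
                                holidays = as.Date(character(0)),
                                hours_per_day = 8) {
  month_start <- as.Date(format(any_date, "%Y-%m-01"))
  month_end   <- seq(month_start, by = "month", length.out = 2)[2] - 1
  days        <- seq(month_start, month_end, by = "day")

  is_weekday <- as.integer(format(days, "%u")) <= 5  # 1 = Monday, 7 = Sunday
  is_holiday <- days %in% holidays

  sum(is_weekday & !is_holiday) * hours_per_day
}

# May 2015 has 21 weekdays; one of them (1 May) is a holiday.
typical.hours <- typical_month_hours(
  as.Date("2015-05-20"),
  holidays = as.Date("2015-05-01")
)
print(typical.hours)  # 160
```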


Below is the final data set. When I downloaded this it was the evening of the 20th, so I’m using 8 hours as an average of what I would typically work per day for the rest of the month. Looks like I’m lagging behind.


To plot the above figures I used the ggplot2 package:
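A minimal plotting sketch, assuming a simple two-bar comparison. The chart type is a guess at what such a plot might look like, and the figures in `summary_df` are placeholder values, not my actual hours.

```r
library(ggplot2)

# Placeholder values; in the real script these come from the data set above.
summary_df <- data.frame(
  measure = c("Projected hours", "Typical month hours"),
  hours   = c(150, 160)
)

# One bar per measure, with the legend hidden since the x axis labels suffice.
p <- ggplot(summary_df, aes(x = measure, y = hours, fill = measure)) +
  geom_col() +
  labs(title = "Projected vs typical monthly hours", x = NULL, y = "Hours") +
  theme(legend.position = "none")

print(p)
```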


Here’s the plot result:


I think I’ll try Python next. It is also a language of choice for data scientists, and even though it doesn’t have as long a data analysis history as R, it’s making quick strides with libraries like Pandas.



I'm a Business Intelligence Developer working for a financial services company. My focus is on the Microsoft suite of BI tools.
