R language first try

Why R?

R is a framework and language for creating statistical, data mining and data visualization applications.

It seems likely to feature more in the business world, given the strong drive towards advanced analytics and the fact that data science has now become “a thing”. Microsoft also acquired Revolution Analytics, a company that focuses heavily on the R language.

R will most likely play an important role in the future for data-driven applications, especially when it comes to Microsoft’s data offerings. It can already be used within Azure ML.

For the Business Intelligence professional, it just makes sense to have some sort of literacy around this language. A developer can now supplement their toolbox with functionality that may not be as easily done with the Microsoft SQL BI stack. For example, with a small amount of code one can bring in some Twitter content, mine it for text, create a word cloud, and share that content with users.

Timesheet Overview

For my first mini-project I created a small script that does the following:

  • Imports a timesheet for an incomplete month from Toggl (CSV)
  • Projects the days left till the end of the month
  • Removes the public holidays which are scraped from a web site
  • Plots the projected hours for the current month vs the total hours of a typical month

Code

First install the needed packages

ScreenClip
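In text form, a minimal sketch of this step might look like the following. It is not necessarily the code in the screenshot: only ggplot2 is named in this post, `rvest` is an assumed choice for the scraping step, and `install_if_missing` is a hypothetical helper name.

```r
# Packages assumed for this walkthrough: ggplot2 is named in the post,
# rvest is an assumption for the web-scraping step
needed <- c("ggplot2", "rvest")

# Hypothetical helper: install only the packages that are not present yet
# and return (invisibly) the names it had to install
install_if_missing <- function(pkgs) {
  to_install <- setdiff(pkgs, rownames(installed.packages()))
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Usage (needs network access and a configured CRAN mirror):
# install_if_missing(needed)
```

Checking `installed.packages()` first avoids re-downloading packages every time the script runs.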

Import the timesheet data

ScreenClip
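As a rough sketch of the import step, assuming the export has a `Start date` column and a `Duration` column in `HH:MM:SS` form (both column names and the sample rows are assumptions, not taken from the screenshot):

```r
# Sample rows standing in for the CSV export (column names are assumptions)
csv_text <- "Start date,Duration
2015-09-01,07:30:00
2015-09-01,01:00:00
2015-09-02,08:15:00
2015-09-03,06:45:00"

# For a real export, pass a file path instead of text = csv_text,
# e.g. read.csv("timesheet.csv", ...) (hypothetical file name).
# read.csv turns "Start date" into the syntactic name "Start.date".
timesheet <- read.csv(text = csv_text, stringsAsFactors = FALSE)
timesheet$Start.date <- as.Date(timesheet$Start.date)

# Convert "HH:MM:SS" durations to decimal hours
to_hours <- function(hms) {
  parts <- strsplit(hms, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(1, 1 / 60, 1 / 3600)),
         numeric(1))
}
timesheet$hours <- to_hours(timesheet$Duration)

# Total hours per day
daily <- aggregate(hours ~ Start.date, data = timesheet, FUN = sum)
names(daily) <- c("date", "hours")
```

Converting durations to decimal hours up front keeps the later projection arithmetic simple.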

This is what the data set looks like so far:

ScreenClip

Add a projection column

ScreenClip
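One way to sketch the projection in base R, assuming 8 hours for every remaining weekday of the month. The logged values and dates are illustrative, and the `type` column is an assumption added to tell actual and projected rows apart:

```r
# Hours logged so far (illustrative values), one row per day
logged <- data.frame(
  date  = as.Date(c("2015-09-17", "2015-09-18")),
  hours = c(7.5, 8.0)
)

# Assumption: the export was taken after the last logged day was complete
last_logged <- max(logged$date)

# Last day of the month: go to the 1st of next month, then back one day
first_of_month <- as.Date(format(last_logged, "%Y-%m-01"))
month_end <- seq(first_of_month, by = "month", length.out = 2)[2] - 1

# Remaining calendar days, if any
remaining <- if (last_logged < month_end) {
  seq(last_logged + 1, month_end, by = "day")
} else {
  as.Date(character(0))
}

# Keep weekdays only; POSIXlt wday is 0 for Sunday and 6 for Saturday
remaining <- remaining[!(as.POSIXlt(remaining)$wday %in% c(0, 6))]

# Projection: 8 hours assumed for every remaining working day
projected <- data.frame(date = remaining, hours = rep(8, length(remaining)))

logged$type    <- rep("actual", nrow(logged))
projected$type <- rep("projected", nrow(projected))
total.projected.time <- rbind(logged, projected)
```

Using `as.POSIXlt(...)$wday` rather than `weekdays()` keeps the weekend check independent of the system locale.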

Here’s what the total.projected.time data set looks like:

ScreenClip

Remove the public holidays by reading the web page’s HTML. In this case they’re South African holidays. A special character was giving me problems, hence the find/replace.

ScreenClip
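The idea can be sketched in base R against a small inline HTML snippet. Everything here is an assumption for illustration: the markup, the URL in the comment, and the guess that the troublesome character was a non-breaking space entity. A package such as `rvest` would be another option for a real page.

```r
# Illustrative HTML standing in for the holiday page (not the site used in the post)
html <- c(
  "<table>",
  "<tr><td>24&nbsp;September&nbsp;2015</td><td>Heritage Day</td></tr>",
  "<tr><td>16&nbsp;December&nbsp;2015</td><td>Day of Reconciliation</td></tr>",
  "</table>"
)
# For a live page: html <- readLines("https://example.com/holidays")  # hypothetical URL

# Find/replace the special character (assumed to be a non-breaking space entity)
html <- gsub("&nbsp;", " ", html, fixed = TRUE)

# Pull out "day Month year" strings
pattern <- "[0-9]{1,2} [A-Z][a-z]+ [0-9]{4}"
found <- unlist(regmatches(html, gregexpr(pattern, html)))

# Build dates via month.name so parsing does not depend on the locale
parts <- strsplit(found, " ", fixed = TRUE)
holidays <- as.Date(vapply(parts, function(p) {
  sprintf("%s-%02d-%02d", p[3], match(p[2], month.name), as.integer(p[1]))
}, character(1)))

# Drop the holidays from a projected set of working days (illustrative rows)
projected <- data.frame(
  date  = as.Date(c("2015-09-23", "2015-09-24", "2015-09-25")),
  hours = 8
)
projected <- projected[!(projected$date %in% holidays), ]
```

After the filter, 24 September (Heritage Day) no longer appears among the projected working days.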

Add the typical monthly hours, i.e. 8 hours per day excluding weekends and holidays.

ScreenClip
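A minimal sketch of that calculation, assuming the holidays are already available as a vector of dates. `typical_month_hours` is a hypothetical helper name, and the month and holiday are illustrative:

```r
# Hypothetical helper: typical hours for the month containing `any_day`,
# counting `hours_per_day` for each weekday that is not a public holiday
typical_month_hours <- function(any_day,
                                holidays = as.Date(character(0)),
                                hours_per_day = 8) {
  first <- as.Date(format(any_day, "%Y-%m-01"))
  last  <- seq(first, by = "month", length.out = 2)[2] - 1
  days  <- seq(first, last, by = "day")
  is_weekend <- as.POSIXlt(days)$wday %in% c(0, 6)  # 0 = Sunday, 6 = Saturday
  is_holiday <- days %in% holidays
  sum(!is_weekend & !is_holiday) * hours_per_day
}

# September 2015 has 22 weekdays; one holiday leaves 21, so 21 * 8 = 168
typical <- typical_month_hours(as.Date("2015-09-20"),
                               holidays = as.Date("2015-09-24"))
```

Passing the holidays as an argument keeps the function usable for months where the scrape returns nothing.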

Below is the final data set. When I downloaded this it was the evening of the 20th, so I’m using 8 hours as the average of what I would typically work per day to finish off the month. Looks like I’m lagging behind.

ScreenClip

To plot the above figures I used the ggplot2 library:

ScreenClip
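A comparable chart can be sketched as a simple bar plot. The chart type, labels, and totals below are assumptions for illustration rather than the figures or code from this post, and the plot is only built when ggplot2 is installed:

```r
# Illustrative totals in hours (not the figures from the post)
plot_data <- data.frame(
  category = c("Projected this month", "Typical month"),
  hours    = c(159.5, 168)
)

# Build the chart only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plot_data, aes(x = category, y = hours, fill = category)) +
    geom_col() +
    labs(x = NULL, y = "Hours", title = "Projected vs typical monthly hours") +
    theme(legend.position = "none")
  # print(p) draws the chart; ggsave("hours.png", p) would save it
  # (hypothetical file name)
}
```

Two bars side by side make the gap between projected and typical hours easy to see at a glance.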

Here’s the plot result:

Rplot02

I think I’ll try Python next. It is also a language of choice for data scientists, and even though it doesn’t have as long a data analysis history as R, it’s making quick strides with libraries like Pandas.

Author:

I'm a Business Intelligence Developer working for a financial services company. My focus is on the Microsoft suite of BI tools.