Showing posts with label R Programming Course.

Thursday, 9 July 2015

Starters Guide to Business Analytics Using Excel

Unknown


This post introduces the essential Excel skills you need to get by in the IT industry. By the end, you will have a sense of how capable Excel is as a tool for business analysis.

Basic Excel Skills

These are the basic Excel operations you should know in order to work with data. Nowadays a working knowledge of Excel is essential, and the basic skills are:
  • Familiarity with the Excel ribbons & user interface
  • Entering and formatting data
  • Calculating totals, averages & summaries using formulas
  • Highlighting data that meets certain conditions
  • Creating simple reports & charts
  • Understanding the importance of keyboard shortcuts & productivity tricks
This is not the complete list. I have only mentioned the skills you need day to day to get insights from data. Beyond these skills, Excel also comes with the Analysis ToolPak add-in.

Analytics with the Excel Analysis ToolPak

Many people talk about “Big Data” and Analytics, yet only a few understand what the terms really mean. Analytics essentially refers to the application of mathematics and statistics to datasets, and it is definitely not a new idea.
Nevertheless, certain aspects of it have changed over the past five years because of platforms like Hadoop, which make it possible to analyze unstructured data (i.e., data that is not organized into the basic row and column format of a spreadsheet or workbook).

So, once you know the basics of Excel, chances are you will want more. The reason is straightforward: anyone with good Excel skills is always in demand in industry. Your superiors will value you because you get things done quickly and with ease. Your colleagues may envy you because your workbooks are neat and easy to use. And you will want to keep learning, because you have seen the results Excel can deliver.

Tuesday, 2 June 2015

Essential Packages in R Programming Language

Unknown


R is one of the trending programming languages for data analysis, with an enormous number of packages contributed by developers from all fields and backgrounds. Around 4000 packages are listed on the CRAN website alone, but are they all essential? Certainly not.

This post introduces some of the essential R packages, drawn from different domains, that are most widely used in data analysis.

sqldf

sqldf is one of the core packages analysts use to run SQL queries on R data frames. It uses SQLite syntax by default. If you want to load data from an external source or a database, R has connecting drivers for most of them. Some examples are:

·         RODBC, RMySQL, RPostgreSQL and RSQLite for reading data from a database.
·         XLConnect and xlsx for reading and writing Microsoft Excel files from R.
·         foreign for reading SAS and SPSS datasets into R. It also helps you load data files from other programs.
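
To make the sqldf idea concrete, here is a minimal sketch, assuming the sqldf package is installed. The `sales` data frame and its column names are made up purely for illustration.

```r
library(sqldf)

# Hypothetical data frame, invented for this example
sales <- data.frame(
  region = c("North", "South", "North", "East"),
  amount = c(120, 80, 200, 50)
)

# Run an SQLite query directly against the data frame;
# sqldf looks up `sales` by name in the calling environment
totals <- sqldf("SELECT region, SUM(amount) AS total
                 FROM sales
                 GROUP BY region
                 ORDER BY total DESC")

print(totals)
```

The result is an ordinary data frame, so it can be passed straight on to other R functions.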

ggplot2

ggplot2 is the most essential of the data visualization packages and is widely used by R programmers. It is fundamentally an implementation of the grammar of graphics in R, and it lets you present your results in a more understandable way by building customizable plots.
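
As a rough sketch of what building a plot layer by layer looks like, assuming ggplot2 is installed. The `mtcars` dataset ships with R, and the choice of columns and labels here is only illustrative.

```r
library(ggplot2)

# Map data columns to visual properties, then add a layer of points
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  labs(x = "Weight (1000 lbs)",
       y = "Miles per gallon",
       colour = "Cylinders")

# Printing the object draws the plot on the active graphics device
print(p)
```

Each `+` adds another piece (a layer, labels, a theme), which is what makes the plots so easy to customize.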

  

plyr

Data manipulation is the most essential step in reshaping your data to fit your requirements. It is often said that almost 80% of analysis time goes into data preparation, and data manipulation is one of the steps within that preparation.
The plyr package helps a great deal here by providing functions for rearranging, subsetting, combining and summarizing datasets. plyr is recommended if you would otherwise be using the apply family of functions for data manipulation in R.
Some of the other essential packages used for data manipulation are:
·         lubridate, which mainly deals with handling dates and times.
·         stringr, which works with regular expressions and character strings.
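
The kind of group-wise summary plyr is used for can be sketched as follows, assuming plyr is installed. The built-in `mtcars` dataset stands in for your own data, and the output column names are my own choice.

```r
library(plyr)

# Split mtcars by number of cylinders, summarise each piece,
# and combine the results back into one data frame
summary_by_cyl <- ddply(mtcars, .(cyl), summarise,
                        mean_mpg = mean(mpg),
                        n_cars   = length(mpg))

print(summary_by_cyl)
```

This replaces what would otherwise be a `split()` followed by `sapply()` and some manual reassembly.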

randomForest

·         One of the major packages used for building non-linear models. It is simple to use and works on many different types of datasets.
·         An added advantage of this package is that it can be used for feature reduction.
·         For instance, when your dataset has more than 200 variables and you need to find the most relevant ones, the randomForest package has a variable importance function that scores every variable, so you can rank them and keep the top ones. If you want to start working on non-linear models, this package is a good place to begin.
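
Here is a small sketch of the variable importance idea, assuming randomForest is installed. It uses the built-in `iris` dataset, which has only four predictors rather than 200, and the number of trees is an arbitrary choice for illustration.

```r
library(randomForest)

set.seed(42)  # random forests are stochastic; fix the seed for repeatability

# Species is the outcome; the four measurements are the predictors
fit <- randomForest(Species ~ ., data = iris,
                    ntree = 200, importance = TRUE)

# importance() returns a matrix with one row per predictor
imp <- importance(fit)

# Rank predictors by mean decrease in Gini impurity, highest first
ranked <- imp[order(imp[, "MeanDecreaseGini"], decreasing = TRUE), ]
print(ranked)
```

Note that every variable gets a score. Deciding how many of the top-ranked ones to keep is up to you.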

caret

·         The caret package is used for building better predictive models. It covers data handling, feature selection, and building multiple predictive models using various techniques.
·         It runs validation checks and prints model performance diagnostics.
·         This certainly looks like a lot, and getting used to all these functions takes some time, but once you are through with that, model building becomes much more enjoyable. For this reason caret has become popular among R programmers in recent years, especially in the predictive analytics field.
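
A typical caret workflow might look something like the sketch below, assuming caret is installed. The 80/20 split, the 5-fold cross-validation and the k-nearest-neighbours method are all illustrative choices, not recommendations.

```r
library(caret)

set.seed(42)  # make the split and the resampling repeatable

# Hold out 20% of iris for testing, stratified by Species
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[idx, ]
test_set  <- iris[-idx, ]

# 5-fold cross-validation on the training set
ctrl <- trainControl(method = "cv", number = 5)

# Fit a k-nearest-neighbours classifier; caret tunes k for us
model <- train(Species ~ ., data = train_set,
               method = "knn", trControl = ctrl)

# Predict on the held-out data and print performance diagnostics
pred <- predict(model, newdata = test_set)
cm <- confusionMatrix(pred, test_set$Species)
print(cm)
```

Swapping in a different model is mostly a matter of changing the `method` argument, which is a large part of caret's appeal.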