R is open source software that is widely used for many different purposes. At its heart R is a programming language, but it is also useful for handling and analyzing data. Its visualization capabilities are stronger than those of some other statistical packages, and because it allows programming as well, it is more powerful than some other statistical tools for analysis. It has other capabilities too, such as writing web-based books and manuals. Although it might seem challenging to a beginner, it is well worth the effort to learn, as it opens up immense opportunities. There are many sources for learning R, and every learner is at a different stage of learning. I have personally been frustrated by not being able to find a single resource that puts together everything a beginner dealing with market research data needs. So, I set out to document it as a book using one of R’s own packages (more on this later).
Because R is open source, anyone can take the code, extend it, and add functionality. R is built by an international community of volunteers who contribute to and extend it in innumerable ways. As you learn it, my hope is that you will come to appreciate its power and versatility, which come from the contributions of many. R was started by Robert Gentleman and Ross Ihaka (R and R) of the R Core team as a free software environment for statistical computing and graphics, and it has since gained many contributors who allow it to keep growing as a platform. Having so many contributors means innovations never end. It also means that a beginner can feel lost, and that finding the best approach to solve a problem can take time. This blog is a primer on understanding R for fundamental data analysis in market research.
R versus RStudio
Along with R, a beginner also needs an understanding of RStudio, which is an IDE (Integrated Development Environment) for R. An IDE makes it easy to develop and test software. The basic open source version of RStudio is free. In addition, it provides an easy-to-use interface for getting the most out of R’s functionality, including the extensions of its capability known as packages. Although all of this can be accomplished in R alone, RStudio makes it more intuitive once you get used to its features. As of the writing of this blog, RStudio is also hosted in the cloud, with a similar interface, for those who do not want to bother with installation on a local computer. The interface discussed here for a beginner applies regardless of whether you use a local version or the cloud version.
R and its Packages
Integral to the usefulness of R is the number of packages created by its army of volunteer contributors and made available as add-ins to R’s community of users. A package may be thought of as a special software application designed to perform specific functions. Hadley Wickham refers to it as shareable code combined with data and documentation, with examples and a description of what the package does. As you start using the software, you will get more comfortable with the concept of packages, what they can help you accomplish, and the range of functionality available. As of the writing of this blog, Microsoft R Application Network showed over 17,000 packages. With a number this high, it is not possible to know, let alone master, every package. Since each package is meant for a different purpose, users need to familiarize themselves with the packages relevant to their own tasks.
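To make the idea of a package concrete, here is a minimal sketch of how a package is typically installed and loaded in an R session. It is only an illustration, not taken from the book: the helper name load_or_install is hypothetical, and the example deliberately uses the stats package, which ships with R, so nothing needs to be downloaded.

```r
# Packages that ship with R can be attached without installing anything.
library(stats)

# Hypothetical helper (an assumption for illustration, not part of R):
# install a package only if it is missing, then attach it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from a package repository; needs internet
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" is already installed, so this only attaches it.
load_or_install("stats")

# Count the packages installed on this machine (the number varies by setup).
n_installed <- nrow(installed.packages())
print(n_installed)
```

The same helper could be called with the name of any contributed package you need; in that case the first call would download it, and later calls would simply load it.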
Learn more here about the next steps in creating, reading, analyzing, and visualizing a dataset, which are typical actions in any market research project.
This blog is based on an excerpt from the book titled “R for Fundamental Data Analysis in Market Research,” written using R’s bookdown package.