Importing Data with read.csv and the readxl Package

Author

Valerio Licursi

Published

June 26, 2024

Introduction

This exercise will help you learn how to import data into R using the read.csv function from base R and the read_excel function from the readxl package. These are common methods for importing CSV and Excel files, respectively.

Prerequisites

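  • readxl package installed (install.packages("readxl"))
  • writexl package installed (install.packages("writexl")), used in Step 2 to create the sample Excel file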

Step 1: Importing CSV Data with read.csv

1. Create a Sample CSV File

First, we create a sample CSV file to work with.

write.csv(iris, "iris_sample.csv", row.names = FALSE)
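To see what row.names = FALSE changes in the file on disk, the following minimal sketch writes iris twice and prints the first raw lines of each file. The temporary paths created with tempfile() are an assumption made here so that nothing in the working directory is touched.

```r
# Write iris to a temporary CSV (hypothetical path, used only for illustration)
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Peek at the raw text: a header line followed by one line per observation
raw_lines <- readLines(csv_path, n = 3)
print(raw_lines)

# With the default row.names = TRUE, an extra unnamed first column is written
csv_with_rows <- tempfile(fileext = ".csv")
write.csv(iris, csv_with_rows)
rows_header <- readLines(csv_with_rows, n = 1)
print(rows_header)

# Remove the temporary files
unlink(c(csv_path, csv_with_rows))
```

Leaving the row names out keeps the file limited to the five data columns, which is why the import in the next step returns exactly five columns.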

2. Import the CSV File

Use the read.csv function to import the CSV file.

iris_csv <- read.csv("iris_sample.csv")
head(iris_csv)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
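head() does not show column types, so it can help to inspect them after the import. The sketch below assumes R 4.0.0 or later, where read.csv reads text columns as character by default, and uses a temporary file path chosen only for illustration.

```r
# Recreate the sample CSV in a temporary location (hypothetical path)
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Default import: text columns such as Species come back as character
iris_chr <- read.csv(csv_path)
str(iris_chr)

# Request factors explicitly to recover the original type of Species
iris_fct <- read.csv(csv_path, stringsAsFactors = TRUE)
print(levels(iris_fct$Species))

# Remove the temporary file
unlink(csv_path)
```

Whether to convert text to factors depends on the analysis; the character default is usually the safer starting point.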

Step 2: Importing Excel Data with readxl

1. Create a Sample Excel File

Next, create a sample Excel file. The readxl package only reads Excel files, so writing one requires the writexl package, which can be installed using install.packages("writexl").

# install.packages("writexl")
library(writexl)
write_xlsx(iris, "iris_sample.xlsx")

2. Import the Excel File

Use the read_excel function from the readxl package to import the Excel file.

library(readxl)
iris_excel <- read_excel("iris_sample.xlsx")
head(iris_excel)
# A tibble: 6 × 5
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
         <dbl>       <dbl>        <dbl>       <dbl> <chr>  
1          5.1         3.5          1.4         0.2 setosa 
2          4.9         3            1.4         0.2 setosa 
3          4.7         3.2          1.3         0.2 setosa 
4          4.6         3.1          1.5         0.2 setosa 
5          5           3.6          1.4         0.2 setosa 
6          5.4         3.9          1.7         0.4 setosa 
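Real workbooks often hold several sheets or only need to be read in part. As a rough sketch, the example below lists the sheets with excel_sheets and then uses the sheet, n_max, and range arguments of read_excel; the temporary path and the chosen cell range are assumptions for illustration.

```r
library(readxl)
library(writexl)

# Recreate the sample workbook in a temporary location (hypothetical path)
xlsx_path <- tempfile(fileext = ".xlsx")
write_xlsx(iris, xlsx_path)

# List the worksheets in the workbook before importing
sheets <- excel_sheets(xlsx_path)
print(sheets)

# Read only the first 10 data rows of the first sheet
iris_top <- read_excel(xlsx_path, sheet = 1, n_max = 10)

# Read a rectangular cell range: header row plus 5 data rows, first 3 columns
iris_range <- read_excel(xlsx_path, sheet = sheets[1], range = "A1:C6")
print(iris_range)

# Remove the temporary file
unlink(xlsx_path)
```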

Step 3: Comparing Imported Data

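Compare the first few rows of the data imported from the CSV and Excel files to confirm that they match. read.csv returns a data.frame and read_excel returns a tibble, so the two print differently (for example, 3.0 versus 3), but the underlying values are the same.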

head(iris_csv)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
head(iris_excel)
# A tibble: 6 × 5
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
         <dbl>       <dbl>        <dbl>       <dbl> <chr>  
1          5.1         3.5          1.4         0.2 setosa 
2          4.9         3            1.4         0.2 setosa 
3          4.7         3.2          1.3         0.2 setosa 
4          4.6         3.1          1.5         0.2 setosa 
5          5           3.6          1.4         0.2 setosa 
6          5.4         3.9          1.7         0.4 setosa 
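Comparing printed rows by eye only covers the first six observations. One possible way to check the whole dataset is sketched below: it converts the tibble to a plain data.frame and uses all.equal, which compares values with a small numeric tolerance. The object names and temporary paths are hypothetical and chosen for this sketch.

```r
library(readxl)
library(writexl)

# Recreate both sample files in temporary locations (hypothetical paths)
csv_path  <- tempfile(fileext = ".csv")
xlsx_path <- tempfile(fileext = ".xlsx")
write.csv(iris, csv_path, row.names = FALSE)
write_xlsx(iris, xlsx_path)

csv_df  <- read.csv(csv_path)
xlsx_df <- as.data.frame(read_excel(xlsx_path))  # tibble -> data.frame

# Same shape and same column names?
print(identical(dim(csv_df), dim(xlsx_df)))
print(identical(names(csv_df), names(xlsx_df)))

# Same values, allowing for tiny floating-point differences?
same <- isTRUE(all.equal(csv_df, xlsx_df))
print(same)

# Remove the temporary files
unlink(c(csv_path, xlsx_path))
```

all.equal is used instead of identical because numbers that pass through two different file formats may differ in the last few bits without being meaningfully different.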

Step 4: Cleaning Up

Remove the sample files created for this exercise.

file.remove("iris_sample.csv", "iris_sample.xlsx")
[1] TRUE TRUE
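file.remove returns FALSE and issues a warning for any file that does not exist, for example when the cleanup is run twice. A slightly more defensive variant, sketched here with files placed in tempdir() as an assumption, removes only the files that are actually present.

```r
# Hypothetical locations for the sample files, inside the session temp directory
sample_files <- file.path(tempdir(), c("iris_sample.csv", "iris_sample.xlsx"))

# Create only the CSV, so that one of the two files is deliberately missing
write.csv(iris, sample_files[1], row.names = FALSE)

# Keep only the paths that exist, then remove them
existing <- sample_files[file.exists(sample_files)]
removed  <- file.remove(existing)
print(removed)

# Confirm that nothing is left behind
print(file.exists(sample_files))
```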

Exercise for you: import the rnaseq.xlsx file.
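One possible starting point for the exercise is sketched below. It assumes that rnaseq.xlsx sits in the current working directory and that the data of interest is on the first sheet; neither is stated above, so adjust the path and the sheet argument as needed.

```r
library(readxl)

# Assumption: rnaseq.xlsx is in the current working directory
rnaseq_path <- "rnaseq.xlsx"

rnaseq <- NULL
if (file.exists(rnaseq_path)) {
  # See which sheets are available before choosing one
  print(excel_sheets(rnaseq_path))
  # read_excel reads the first sheet unless the sheet argument says otherwise
  rnaseq <- read_excel(rnaseq_path)
  print(head(rnaseq))
} else {
  message("File not found: ", rnaseq_path,
          " (working directory: ", getwd(), ")")
}
```

If the file is not found, getwd() shows where R is looking, which is the most common cause of import errors.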

