DATA MANIPULATION IN R
Web scraping with rvest (Import and export data)
Pattern matching with grepl() and grep() (String manipulation)
Pattern matching and replacement with gsub() and sub() (String manipulation)
Row and column sums and means (Data transformation)
Row and column names (Data transformation)
Concatenate strings with paste and paste0 (String manipulation)
Count the number of characters with nchar (String manipulation)
Read SQL databases (Import and export data)
lapply function (apply family)
Read Excel files (Import and export data)
Merge data frames (Data transformation)
Aggregate (Data transformation)
What is DATA MANIPULATION?
Data manipulation, also known as data wrangling, is the process of cleaning raw data and transforming it into a structured format suitable for analysis. It involves operations such as filtering, sorting, aggregating, merging, and reshaping data so that it is organized, understandable, and ready for analysis. Base R provides functions for all of these tasks, and dplyr is one of the most popular and widely used R packages for the same work.
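The operations named above can be sketched in a few lines of base R. This is a minimal illustration, not a complete workflow: the `sales` and `managers` data frames and their column names are hypothetical and were made up for the example.

```r
# Hypothetical example data (not from the original text)
sales <- data.frame(
  id     = c(1, 2, 3, 4, 5),
  region = c("North", "South", "North", "East", "South"),
  amount = c(120, 80, 200, 50, 150)
)
managers <- data.frame(
  region  = c("North", "South", "East"),
  manager = c("Ana", "Luis", "Marta")
)

# Filtering: keep the rows where amount is at least 100
large_sales <- sales[sales$amount >= 100, ]

# Sorting: order the rows by amount, largest first
sorted_sales <- sales[order(sales$amount, decreasing = TRUE), ]

# Aggregating: total amount per region
totals <- aggregate(amount ~ region, data = sales, FUN = sum)

# Merging: attach the manager of each region to every sale
sales_with_manager <- merge(sales, managers, by = "region")

# Transforming: add a derived column (amount in thousands)
sales$amount_k <- sales$amount / 1000
```

Each step returns a new object, so the original `sales` data frame is only changed by the last line, where a column is assigned explicitly.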
- Base R: Data manipulation in base R uses the functions that ship with a standard R installation to handle and transform data structures such as vectors, matrices, arrays, data frames, and lists.
- dplyr: dplyr is an R package designed for efficient, user-friendly data manipulation. It provides a set of functions that streamline data wrangling by offering a consistent grammar for manipulating data frames (including tibbles).
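To make the contrast between the two approaches concrete, the sketch below runs the same task both ways: keep sales of at least 80, total them per region, and sort the totals in descending order. The `sales` data frame is hypothetical, and the dplyr part is wrapped in a `requireNamespace()` check as an assumption that the package may not be installed.

```r
# Hypothetical example data (not from the original text)
sales <- data.frame(
  region = c("North", "South", "North", "East", "South"),
  amount = c(120, 80, 200, 50, 150)
)

# Base R: filter, aggregate, then sort, one intermediate object at a time
big <- sales[sales$amount >= 80, ]
base_totals <- aggregate(amount ~ region, data = big, FUN = sum)
base_totals <- base_totals[order(base_totals$amount, decreasing = TRUE), ]

# dplyr: the same steps written as a chain of verbs
# (this branch runs only if dplyr is installed)
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  dplyr_totals <- sales %>%
    filter(amount >= 80) %>%
    group_by(region) %>%
    summarise(amount = sum(amount)) %>%
    arrange(desc(amount))
}
```

Both versions should give the same totals per region. The dplyr version reads top to bottom as a sequence of named steps, which is what is usually meant by a consistent grammar.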