DATA TRANSFORMATION IN R
Data transformation covers techniques for filtering data by specific conditions, splitting data into smaller groups, sorting data by defined criteria, summarizing data with sums or averages, and combining several data sets into one.
BASE R
Base R can handle data manipulation tasks without external packages. Its built-in functions cover data selection, filtering, transformation, and summarization.
Row and column names
rownames() colnames() dimnames()
Subset data based on conditions
subset() $ [] [[]]
Absolute/relative frequency and contingency tables
table() prop.table() xtabs() addmargins()
Categorize numerical data
cut()
Split data based on groups
split() unsplit()
Aggregate data
aggregate()
Row and column sums and means
rowSums() colSums() rowMeans() colMeans()
rbind() and cbind() functions
rbind() cbind()
Merge data frames
merge()
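As a rough illustration of how several of the base R functions listed above fit together, here is a minimal sketch. The `sales` and `managers` data frames, their column names, and the cut points are made up for this example and do not come from any particular tutorial.

```r
# Hypothetical toy data, invented for illustration only
sales <- data.frame(
  id     = 1:6,
  region = c("north", "south", "north", "east", "south", "east"),
  units  = c(12, 7, 30, 18, 25, 3),
  price  = c(2.5, 4.0, 1.5, 3.0, 2.0, 5.0)
)

# Subset rows by a condition and keep only some columns
big <- subset(sales, units > 10, select = c(id, region, units))

# Absolute and relative frequencies of a categorical column
freq <- table(sales$region)
rel  <- prop.table(freq)

# Categorize a numeric column into labelled intervals:
# (0, 10], (10, 20] and (20, Inf]
sales$size <- cut(sales$units,
                  breaks = c(0, 10, 20, Inf),
                  labels = c("small", "medium", "large"))

# Split a vector into a list with one element per group
by_region <- split(sales$units, sales$region)

# Aggregate: total units per region
totals <- aggregate(units ~ region, data = sales, FUN = sum)

# Column means of the numeric columns
avgs <- colMeans(sales[, c("units", "price")])

# Merge with a second (also hypothetical) data frame on a shared key
managers <- data.frame(
  region  = c("north", "south", "east"),
  manager = c("Ana", "Ben", "Cho")
)
merged <- merge(sales, managers, by = "region")
```

Each call returns a new object and leaves its input untouched, so the steps can be run and inspected one at a time, for example with `print(totals)` or `str(merged)`.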
DPLYR PACKAGE
dplyr offers a clear and concise syntax for common tasks such as filtering, selecting, grouping, and merging data. It is often a more intuitive and efficient alternative to base R functions for data manipulation.
Select columns with dplyr
select() contains() where() matches() starts_with() ends_with() all_of() any_of()
Filter rows with dplyr
filter() slice()
Order rows with the arrange() function from dplyr
arrange() desc()
Rename columns with the rename() function from dplyr
rename() rename_with()
Create and modify columns with the mutate() function from dplyr
mutate() across()
Create statistical summaries with the summarise() function from dplyr
summarise() group_by()
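The dplyr verbs listed above are typically chained into a single pipeline. The sketch below is one possible arrangement, assuming the dplyr package is installed. The `sales` data frame and its column names are hypothetical and exist only for this example.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
sales <- data.frame(
  id     = 1:6,
  region = c("north", "south", "north", "east", "south", "east"),
  units  = c(12, 7, 30, 18, 25, 3),
  price  = c(2.5, 4.0, 1.5, 3.0, 2.0, 5.0)
)

result <- sales %>%
  select(region, units, price) %>%         # keep only the columns needed
  filter(units > 5) %>%                    # drop rows with few units
  mutate(revenue = units * price) %>%      # create a new column
  group_by(region) %>%                     # define the groups
  summarise(
    total_revenue = sum(revenue),          # one summary row per region
    mean_units    = mean(units)
  ) %>%
  arrange(desc(total_revenue))             # sort from highest to lowest

# Rename a column; the new name goes on the left-hand side
renamed <- rename(sales, quantity = units)
```

With this toy data, `result` has one row per region, ordered south, north, east by total revenue. The pipe `%>%` passes each intermediate result as the first argument of the next verb, which is why the data frame is named only once at the top.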