I have a large original data.frame that I filter to form smaller data.frames throughout the analysis.
The original data.frame has the format of:
> head(Moment_all)
    Exterior   Interior Sections Spacing     UG              Names
1 0.02736669 0.03067941    84-12      12  UG-84           Sample 1
2 0.53220402 0.53739861    124-9       9 UG-124             AASHTO
3 0.54016470 0.54016538    116-9       9 UG-116          Sample 10
4 0.54151540 0.51516650    124-9       9 UG-124           Sample 8
5 0.54663913 0.52989489    124-9       9 UG-124 Sample ./124-9-DIA
6 0.54960475 0.51772120    116-9       9 UG-116               Mean
I define the rows that I want to exclude for one of the subsets as:
notbaseline <- c(starts_with("Sample ./"),"Mean","AASHTO")
Then I define the new data.frame as:
data.exterior <- Moment_all[!grepl(paste(notbaseline,collapse = "|"),Moment_all$Names),]
The problem is that I cannot use starts_with when defining the filter, and the names I need to exclude are varied and don't always follow an obvious pattern that I can match directly.
Is there a better way of removing rows based on the character strings in the Names column?
Thank you!
2 Answers
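starts_with() is a tidyselect helper (re-exported by dplyr) that is meant for selecting columns by name, for example inside select(). It does not match values within a column, so it cannot be used to build a vector of row patterns.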
What you'd want to do is something like the below:
library(dplyr)
library(stringr)
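# Drop rows whose Names start with "Sample ./" (the "^" anchors the match at the start), plus the exact names "Mean" and "AASHTO"
filter(Moment_all, str_detect(Names, "^Sample \\./", negate = TRUE), !Names %in% c("Mean", "AASHTO"))
We can use base R with subset and startsWith: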
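# The prefix must be "Sample ./"; plain "Sample" would also drop rows such as "Sample 1".
# as.character() guards against Names being a factor, which startsWith() rejects.
subset(Moment_all, !startsWith(as.character(Names), "Sample ./") & !Names %in% c("Mean", "AASHTO"))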
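If the list of names to exclude keeps growing, one option is to keep the prefixes and the exact names in two separate vectors and combine them into a single logical mask. The following is only a minimal base R sketch under some assumptions: Names is taken to be a character column, the toy data frame reproduces just two columns from the head() output in the question, and drop_prefixes, drop_exact, has_prefix and keep are illustrative names that do not appear in the original post.

```r
# Toy data: two columns copied from the head() output in the question.
Moment_all <- data.frame(
  Exterior = c(0.02736669, 0.53220402, 0.54016470,
               0.54151540, 0.54663913, 0.54960475),
  Names = c("Sample 1", "AASHTO", "Sample 10",
            "Sample 8", "Sample ./124-9-DIA", "Mean"),
  stringsAsFactors = FALSE
)

# Illustrative (hypothetical) vectors: names to drop by prefix and by exact match.
drop_prefixes <- c("Sample ./")
drop_exact    <- c("Mean", "AASHTO")

# For each name, check whether it starts with any listed prefix.
# startsWith() compares fixed strings, so "." and "/" need no regex escaping.
has_prefix <- vapply(
  Moment_all$Names,
  function(nm) any(startsWith(nm, drop_prefixes)),
  logical(1),
  USE.NAMES = FALSE
)

# Keep rows that match neither a prefix nor an exact name.
keep <- !has_prefix & !(Moment_all$Names %in% drop_exact)
data.exterior <- Moment_all[keep, ]

print(data.exterior$Names)
```

With the six rows shown in the question, this keeps "Sample 1", "Sample 10" and "Sample 8". Adding another exclusion then only means appending a string to drop_prefixes or drop_exact, with no regular expression to maintain.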