r - dplyr - In R, how to go from a long to a wide data frame when the focal column holds multiple comma-separated values


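I have a list of movies and their directors, and I want to convert the directors into dummy variables (i.e., each director gets their own column, with a 1 if they directed that movie and a 0 if they did not). This is tricky because some movies have two directors; see the example below. df is the data I have, and df2 is what I want.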


movie <- c("Star Wars V", "Jurassic Park", "Terminator 2")
budget <- c(100, 300, 400)
director <- c("George Lucas, Lawrence Kasdan", "Steven Spielberg", "Steven Spielberg")

df <- data.frame(movie, budget, director)
df



movie <- c("Star Wars V", "Jurassic Park", "Terminator 2")
budget <- c(100, 300, 400)
GeorgeLucas <- c(1, 0, 0)
LawrenceKasdan <- c(1, 0, 0)
StevenSpielberg <- c(0, 1, 1)

df2 <- data.frame(movie, budget, GeorgeLucas, LawrenceKasdan, StevenSpielberg)
df2




One option is cSplit_e from the splitstackshape package:


library(splitstackshape)
library(dplyr)
library(stringr)

cSplit_e(df, 'director', sep = ",", type = 'character', fill = 0, drop = TRUE) %>%
  rename_at(vars(starts_with('director_')), ~ str_remove(., 'director_'))
#          movie budget George Lucas Lawrence Kasdan Steven Spielberg
# 1   Star Wars V    100            1               1                0
# 2 Jurassic Park    300            0               0                1
# 3  Terminator 2    400            0               0                1
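
If you would rather not depend on splitstackshape, a similar result can probably be reached with tidyr alone. The following is only a sketch, not part of the answer above: it assumes tidyr 1.1.0 or later (so that values_fill accepts a single value), and the names flag and wide are made up for illustration. The idea is to first lengthen the data to one row per movie/director pair, then widen it again with one column per director.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(
  movie    = c("Star Wars V", "Jurassic Park", "Terminator 2"),
  budget   = c(100, 300, 400),
  director = c("George Lucas, Lawrence Kasdan", "Steven Spielberg", "Steven Spielberg")
)

wide <- df %>%
  # One row per movie/director pair; the regex also consumes the space after each comma
  separate_rows(director, sep = ",\\s*") %>%
  # Indicator column (hypothetical name) that becomes the cell value after widening
  mutate(flag = 1) %>%
  # One column per director; movies not directed by that person are filled with 0
  pivot_wider(names_from = director, values_from = flag, values_fill = 0)

wide
```

As with the cSplit_e version, the new column names keep the space in each director's name, so refer to them with backticks (for example wide$`George Lucas`) or rename them afterwards if you need syntactic names such as GeorgeLucas.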



...