test_Gene2GO.txt
ChrSy.fgenesh.gene.10 GO:0003676
ChrSy.fgenesh.gene.12 GO:0003676, GO:0004523, GO:0015074
ChrSy.fgenesh.gene.14 GO:0004674, GO:0005509, GO:0005515, GO:0005524, GO:0006468, GO:0016021, GO:0030247
ChrSy.fgenesh.gene.17 GO:0003676, GO:0004190, GO:0006508, GO:0008270, GO:0015074
ChrSy.fgenesh.gene.21 GO:0004672, GO:0006468
ChrSy.fgenesh.gene.22 GO:0003676, GO:0004523, GO:0015074
ChrSy.fgenesh.gene.26 GO:0006508, GO:0008234
ChrSy.fgenesh.gene.27 GO:001602
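To follow along without the original annotation file, a small stand-in can be written from the first few rows of the listing. This is only a sketch. It assumes that a tab separates the gene ID from its GO list (which is what `sep = "\t"` in the read.table() call later in the post implies) and that the GO IDs on a line are separated by a comma and a space, as displayed. The name `sample_lines` is made up for illustration.

```r
# Hypothetical stand-in for test_Gene2GO.txt, built from rows of the listing.
# Assumption: a tab separates the gene ID from its comma-separated GO IDs.
sample_lines <- c(
  "ChrSy.fgenesh.gene.10\tGO:0003676",
  "ChrSy.fgenesh.gene.12\tGO:0003676, GO:0004523, GO:0015074",
  "ChrSy.fgenesh.gene.21\tGO:0004672, GO:0006468"
)
writeLines(sample_lines, "test_Gene2GO.txt")

# Read it back the same way the post does: two columns, V1 (gene) and V2 (GO list)
test <- read.table("test_Gene2GO.txt", sep = "\t", header = FALSE,
                   stringsAsFactors = FALSE)
str(test)
```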
Converting one column into multiple rows
- Method ①
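test <- read.table("test_Gene2GO.txt", sep = "\t", header = FALSE, stringsAsFactors = FALSE)
library(tidyverse)
## Split on the comma plus any following space; keep the long table as test1 for the later steps
test1 <- test %>% separate_rows(V2, sep = ",\\s*")
test1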
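- Method ②
## install.packages("splitstackshape")
## devtools::install_github("mrdwab/splitstackshape", ref = "devel")
## For some reason neither command worked for me; in the end I downloaded the package and installed it locally
library(splitstackshape)
cSplit(test, "V2", sep = ",", direction = "long")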
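If neither package is available, the same reshaping can be sketched in base R. The two-row data frame `demo` below is a made-up stand-in for the table read from test_Gene2GO.txt (its rows are copied from the listing), and `demo_long` is likewise just an illustrative name.

```r
# Base-R sketch of "one column into multiple rows" (no extra packages).
# demo is a stand-in for the table read from test_Gene2GO.txt.
demo <- data.frame(
  V1 = c("ChrSy.fgenesh.gene.10", "ChrSy.fgenesh.gene.12"),
  V2 = c("GO:0003676", "GO:0003676, GO:0004523, GO:0015074"),
  stringsAsFactors = FALSE
)

# Split each GO string on a comma plus any following whitespace
go_list <- strsplit(demo$V2, ",\\s*")

# Repeat each gene once per GO ID and flatten the list into one column
demo_long <- data.frame(
  V1 = rep(demo$V1, times = lengths(go_list)),
  V2 = unlist(go_list),
  stringsAsFactors = FALSE
)
print(demo_long)
```

The idea is the same as in the two methods above: split the string, then repeat the gene ID once per piece.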
Collapsing multiple rows back into two columns
- Method ①
## test1 is the long table (one GO ID per row) produced by the one-column-to-many-rows step
aggregate(test1, by = list(test1$V1), c) %>% transmute(Gene = Group.1, GOid = V2)
- Method ②
- Method ③
test1 %>%
  group_by(V1) %>%
  summarise(GO_ID = paste(V2, collapse = ","))
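For comparison, the collapse step can also be sketched in base R with aggregate() and paste(). The small `long` data frame is a made-up stand-in for the long table (its rows are taken from the sample listing), and the output column names `Gene` and `GO_ID` are chosen only for illustration.

```r
# Base-R sketch of the reverse step: one row per gene, GO IDs pasted together.
# long is a stand-in for the long table (one GO ID per row).
long <- data.frame(
  V1 = c("ChrSy.fgenesh.gene.21", "ChrSy.fgenesh.gene.21",
         "ChrSy.fgenesh.gene.26", "ChrSy.fgenesh.gene.26"),
  V2 = c("GO:0004672", "GO:0006468", "GO:0006508", "GO:0008234"),
  stringsAsFactors = FALSE
)

# collapse = "," is passed through to paste(), giving one string per gene
collapsed <- aggregate(V2 ~ V1, data = long, FUN = paste, collapse = ",")
names(collapsed) <- c("Gene", "GO_ID")
print(collapsed)
```

Unlike aggregating with c, which keeps the GO IDs of each gene as a vector, pasting with a separator yields a plain character column that can be written straight back to a text file.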
There are plenty of other simple commands for this... whatever solves the problem is good enough.
A small bonus
- https://smach.github.io/R4JournalismBook/HowDoI.html
- This site collects functions for many common tasks.
- It also comes with another small bonus: Practical R for Mass Communication and Journalism. That's right, this is a recently published book, and it is very readable.
As for how to download it, everyone has their own go-to channel.
One column into multiple rows:
- Use stringr::str_split() together with unnest() to split one column of a GO background file into multiple rows
- Use tidyr::separate_rows() to split one column into multiple rows
test <- tribble(
~gene, ~GO_ID,
"gene1", paste0("GO:1", ";", "GO:2", ";","GO:3")
)
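test %>%
  mutate(Go_id = stringr::str_split(GO_ID, ";")) %>%
  ## Name the list column explicitly; calling unnest() without it is deprecated in current tidyr
  unnest(Go_id) %>%
  select(gene, Go_id)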
test %>%
tidyr::separate_rows(GO_ID, sep = ";")