Brief introduction

I set aside some time to walk through the human GWAS workflow, which mainly consists of the following steps:

  • GWAS QC steps along with data visualization.
  • Dealing with population stratification, using 1000 genomes as a reference.
  • Association analyses of GWAS data.
  • Polygenic risk score (PRS) analyses.
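
As a rough illustration of the last step: a polygenic risk score is commonly computed as a weighted sum of a person's risk-allele counts, with the weights taken from GWAS effect sizes. The sketch below assumes hypothetical SNP IDs, effect sizes, and genotypes, and the function name `polygenic_risk_score` is invented for illustration. It is not the pipeline used in the post.

```python
def polygenic_risk_score(effect_sizes, dosages):
    """Weighted sum of allele dosages (0, 1 or 2) over SNPs present in both inputs."""
    shared_snps = effect_sizes.keys() & dosages.keys()
    return sum(effect_sizes[snp] * dosages[snp] for snp in shared_snps)


# Hypothetical per-SNP effect sizes (e.g. log odds ratios) and one person's dosages.
betas = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.125}
person = {"rs1": 2, "rs2": 1, "rs3": 0}

score = polygenic_risk_score(betas, person)
print(score)
```

SNPs missing from either input are simply skipped here; a real analysis would also need to handle strand flips, linkage disequilibrium, and p-value thresholds.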

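I run Windows but wanted to use Linux commands for some data analysis work. My first thought was that the Windows shell might be a good enough substitute: just install the software I needed in cmd or PowerShell. As it turned out, the tools I wanted do not support Windows at all. Fine, a virtual machine it is; my humble little computer has to take on yet another piece of software.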

Theory

Testing mean vectors when the two population covariance matrices are equal (but unknown)

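Let $X_{(\alpha)}$ ($\alpha = 1, \dots, n$) be a random sample from the population $N_p(\mu^{(1)}, \Sigma)$ and $Y_{(\alpha)}$ ($\alpha = 1, \dots, m$) be a random sample from the population $N_p(\mu^{(2)}, \Sigma)$, with the two samples mutually independent and $\Sigma > 0$ unknown.
We test $H_0: \mu^{(1)} = \mu^{(2)}$ against $H_1: \mu^{(1)} \neq \mu^{(2)}$.
Take the test statistic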
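
The standard choice in this setting is the two-sample Hotelling $T^2$ statistic. As a sketch, assume samples $X_{(1)}, \dots, X_{(n)}$ and $Y_{(1)}, \dots, Y_{(m)}$ from $p$-dimensional normal populations with means $\mu^{(1)}$, $\mu^{(2)}$ and a common covariance matrix $\Sigma$ (this notation is an assumption). The statistic can then be written as:

```latex
\[
A_1 = \sum_{\alpha=1}^{n} (X_{(\alpha)} - \bar{X})(X_{(\alpha)} - \bar{X})^{\top},
\qquad
A_2 = \sum_{\alpha=1}^{m} (Y_{(\alpha)} - \bar{Y})(Y_{(\alpha)} - \bar{Y})^{\top},
\qquad
S = \frac{A_1 + A_2}{n + m - 2},
\]
\[
T^2 = \frac{nm}{n+m}\,(\bar{X} - \bar{Y})^{\top} S^{-1} (\bar{X} - \bar{Y}),
\]
\[
F = \frac{n + m - p - 1}{(n + m - 2)\,p}\; T^2 \;\sim\; F(p,\; n + m - p - 1)
\quad \text{under } H_0 : \mu^{(1)} = \mu^{(2)} .
\]
```

Here $S$ is the pooled sample covariance matrix, and $H_0$ is rejected at level $\alpha$ when $F$ exceeds the upper $\alpha$ quantile of $F(p, n+m-p-1)$.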

This post was inspired by a question on Zhihu. La Vida Seguirá!

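Given …, for … define the integral operator

and we have

(For convenience, the functions here are real-valued, so I don't need to add any conjugates.)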

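Here I use one of my homework assignments to demonstrate multiple regression with several covariates and a single response variable.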

Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable.
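
To make the definition concrete, here is a minimal sketch of fitting an MLR by ordinary least squares, solving the normal equations $(X^{\top}X)\beta = X^{\top}y$ with plain Python. The helper names `solve_linear_system` and `fit_mlr` and the toy data are made up for illustration. The homework itself was presumably done in R with something like `lm()`, which is an assumption.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_mlr(predictors, response):
    """Return [intercept, b1, b2, ...] minimising the sum of squared errors."""
    design = [[1.0] + list(row) for row in predictors]  # prepend intercept column
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Toy data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise).
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]
coefficients = fit_mlr(X, y)
print([round(c, 6) for c in coefficients])
```

Solving the normal equations directly is fine for a small, well-conditioned example like this one; statistical software generally uses a QR decomposition for numerical stability.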

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. To download R, please choose your preferred CRAN mirror.

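I sheepishly admit that this post borrows heavily from Jimmy's blog.
goofy.jpg

Basic concepts

You need to be familiar with R's built-in datasets and the datasets that ship with R packages.

Understand qualitative variables and quantitative variables.

The main measures of central tendency for quantitative data are the mode, quantiles, and the mean.

The main measures of dispersion for quantitative data are the range, the variance and standard deviation, the standard score (z-score), the relative dispersion coefficient (coefficient of variation), and the skewness and kurtosis coefficients.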
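
The measures listed above can be sketched with Python's standard `statistics` module. The post itself works in R, so this is only an illustration, and the data values are made up. Skewness and kurtosis have several conventions; the simple moment-based definitions used here are one assumption among them.

```python
import statistics

values = [2, 4, 4, 5, 7, 9, 11]  # made-up quantitative data

# Central tendency
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
quartiles = statistics.quantiles(values, n=4)  # the three quartile cut points

# Dispersion
value_range = max(values) - min(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)      # sample standard deviation
z_scores = [(v - mean) / std_dev for v in values]
coef_variation = std_dev / mean

# Moment-based skewness and excess kurtosis (population moments)
n = len(values)
m2 = sum((v - mean) ** 2 for v in values) / n
m3 = sum((v - mean) ** 3 for v in values) / n
m4 = sum((v - mean) ** 4 for v in values) / n
skewness = m3 / m2 ** 1.5
excess_kurtosis = m4 / m2 ** 2 - 3

print(mean, median, mode, value_range, variance)
```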

vcf

What is a VCF and how should I interpret it?

VCF stands for Variant Call Format. It is a standardized text file format for representing SNP, indel, and structural variation calls.
A valid VCF file is composed of two main parts: the header, and the variant call records.
The header contains information about the dataset and relevant reference sources (e.g. the organism, genome build version etc.), as well as definitions of all the annotations used to qualify and quantify the properties of the variant calls contained in the VCF file.
For each site record, the information is structured into columns (also called fields) as follows:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA12878 [other samples…]
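
To illustrate the column layout, the sketch below splits one tab-separated record line into the named fields. The record is an invented example, the function name `parse_vcf_record` is hypothetical, and real files are better handled by a dedicated VCF library.

```python
VCF_FIXED_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT",
                     "QUAL", "FILTER", "INFO", "FORMAT"]


def parse_vcf_record(line, sample_names):
    """Split one VCF data line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_FIXED_COLUMNS, fields[:9]))
    record["POS"] = int(record["POS"])

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info = {}
    for item in record["INFO"].split(";"):
        key, _, value = item.partition("=")
        info[key] = value if value else True
    record["INFO"] = info

    # FORMAT names the colon-separated subfields of every sample column.
    format_keys = record["FORMAT"].split(":")
    record["samples"] = {
        name: dict(zip(format_keys, values.split(":")))
        for name, values in zip(sample_names, fields[9:])
    }
    return record


# Invented single-sample record, for illustration only.
line = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;DB\tGT:GQ:DP\t0|0:48:1"
record = parse_vcf_record(line, ["NA12878"])
print(record["CHROM"], record["POS"], record["samples"]["NA12878"]["GT"])
```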

What is Conda

Conda is an open source package management system and environment management system that runs on Windows, macOS and Linux. Conda quickly installs, runs and updates packages and their dependencies. Conda easily creates, saves, loads and switches between environments on your local computer. It was created for Python programs, but it can package and distribute software for any language.

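  • Package management

Installing, uninstalling, and updating packages is very convenient with conda.

If conda cannot install something, all you can do is fall back to wget [software download URL].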

SAM/BAM

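After the FASTQ files from sequencing have been mapped to the genome, the mapping results are represented in the unified SAM (Sequence Alignment/Map) format; BAM is the binary version of SAM (the "B" stands for binary). A SAM file consists of two parts: the header section and the alignment records.

  • Header section
    • @HD: the version of the format specification the file conforms to, and the sort order of the alignments
    • @SQ: reference sequence information
    • @RG: read group information for the aligned reads
    • @PG: information about the programs used
    • @CO: free-text comments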
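
The header record types above can be picked apart with a few lines of Python. This is a minimal sketch on invented header lines, and the function name `parse_sam_header` is hypothetical. For real files a library such as pysam is the usual choice.

```python
def parse_sam_header(lines):
    """Group SAM header lines by record type (HD, SQ, RG, PG, CO)."""
    header = {}
    for line in lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        fields = line.rstrip("\n").split("\t")
        record_type = fields[0][1:]  # drop the leading "@"
        if record_type == "CO":
            entry = "\t".join(fields[1:])  # comments are free text
        else:
            # Other header lines hold TAG:VALUE pairs.
            entry = dict(field.split(":", 1) for field in fields[1:])
        header.setdefault(record_type, []).append(entry)
    return header


# Invented example lines: five header records followed by one alignment.
sam_lines = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:248956422",
    "@RG\tID:sample1\tSM:sample1",
    "@PG\tID:bwa\tPN:bwa",
    "@CO\tfree-text comment",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
header = parse_sam_header(sam_lines)
print(sorted(header))
```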