天涯海角

My Web Home

Category Archives: Biology

A Guide to Agricultural Bioinformatics Databases: Plants

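Cereal crops http://harvest.ucr.edu/ USA
This database currently holds expression profile data, together with related molecular information, for barley, Brachypodium, citrus, coffee (Coffea), cowpea, soybean, rice, wheat, and other crops.

Cereal crops http://www.gramene.org/ USA
The Gramene Database holds genome information for a range of crops and provides advanced tools for analysis across genomes.

Crops http://ukcrop.net/ UK
UK CROPNET: the UK crop plant bioinformatics network. It hosts many databases and analysis tools developed in-house, and also collects related literature and information on US work in the field. (1996)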

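Rice http://ars-genome.cornell.edu/rice/ USA
GrainGenes is a genetics database for wheat, oats, and sugarcane, supported by the Plant Genome Program of the USDA and the National Agricultural Library.

Rice http://bioserver.myongji.ac.kr/ricemac.html South Korea
Korean rice genome database.

Rice http://cdna01.dna.affrc.go.jp/cDNA/ Japan
KOME: a molecular biology database for rice.

Rice http://cdna01.dna.affrc.go.jp/PIPE Japan
A tool for unified access to rice databases.

Rice http://drtf.cbi.pku.edu.cn/ China
Rice transcription factor database, covering all known and putative transcription factors from indica and japonica rice.

Rice http://gbrowse.ncpgr.cn/cgi-bin/gbrowse/japonica/ China
Rice genome annotation database.

Rice http://gene64.dna.affrc.go.jp/RPD/ Japan
Rice proteomics database.

Rice http://golgi.gs.dna.affrc.go.jp/SY-1102/rad/ Japan
RAD: rice genome annotation database.

Rice http://ine.dna.affrc.go.jp/giot/ Japan
INE: integrated rice genome data browser.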

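Rice http://mips.gsf.de/proj/plant/jsf/rice/index.jsp Germany
MOsDB: The MIPS Oryza sativa database. A rice genome database containing sequence data; mutant and expression profile information are to be integrated in the future.

Rice http://mpss.udel.edu/rice/ USA
Rice MPSS database: massively parallel sequencing data for rice.

Rice http://orygenesdb.cirad.fr/ France
OryGenesDB: an interactive tool for rice reverse genetics, with databases of T-DNA and Ds flanking sequence tags for rice genes, plus gene annotation.

Rice http://rapdb.dna.affrc.go.jp/ Japan
RAP-DB: Rice Annotation Project database.

Rice http://red.dna.affrc.go.jp/RED/ Japan
RED: rice expression profile database.

Rice http://redb.ncpgr.cn/ China
REDB: rice EST database.

Rice http://rgp.dna.affrc.go.jp/giot/INE.html Japan
The rice genome database INE integrates the genome information, cDNA information, genetic maps, and physical maps obtained so far from large-scale sequencing of rice, and keeps adding new information as rice sequencing proceeds.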

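Rice http://rice.big.ac.cn/rice/index2.jsp China
RISE: a rice information system, including up-to-date comprehensive information on the rice genome and comparative genomic analyses against other cereal crops.

Rice http://rice.genomics.org.cn/ China
Rice is a major food crop and a model species for cereal genome research. The Beijing Genomics Institute (BGI) is well known for its sequencing, informatics analysis, and biological research on the genomes of rice and other crops. To support this research, BGI built the rice information system (BGI-RIS), which integrates the latest data with comparative genomics analyses. To allow a comprehensive analysis of the two major rice subspecies, japonica and indica, BGI-RIS includes BGI's own indica sequence data and also collects genome and EST data for japonica and other known cereal crops. BGI-RIS annotates related genes, repetitive elements, gene duplications, and SNPs between the two subspecies.

Rice http://rice.plantbiology.msu.edu/ USA
Rice Genome Annotation Project.

Rice http://ricefox.psc.riken.jp/index.php?contetnstop Japan
RiceFOX: a database of Arabidopsis mutant lines overexpressing rice full-length cDNAs.

Rice http://ricegaas.dna.affrc.go.jp/ Japan
Rice GAAS: an automated annotation system for rice genome information.

Rice http://rkd.ucdavis.edu/ USA
Rice protein kinase database.

Rice http://rmd.ncpgr.cn/ China
RMD: rice mutant database.

Rice http://rmg.rice.dna.affrc.go.jp/ Japan
RMG: rice mitochondrial genome information database.

Rice http://signal.salk.edu/cgi-bin/RiceGE USA
RiceGE: Rice Functional Genomic Express Database.

Rice http://structure.rice.dna.affrc.go.jp/ Japan
RPSD: rice protein structure database.

Rice http://tlife.fudan.edu.cn/bgf/ China
BGF is a gene prediction tool for the rice genome.

Rice http://tos.nias.affrc.go.jp/ Japan
Rice Tos17 Insertion Mutant Database: insertion mutants of the rice retrotransposon Tos17.

Rice http://urgi.versailles.inra.fr/OryzaTagLine/ France
Rice T-DNA insertion mutant database, including analyses of flanking sequence tags for use in reverse genetics studies.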

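Rice http://www.iris.irri.org/ USA
Rice germplasm genotype database, together with rice functional genomics and proteomics data.

Rice http://www.ncgr.ac.cn/ China
China's rice genome project, targeting the indica subspecies of rice.

Rice http://www.pgcdna.co.jp/cgi-bin/wrdb/content.cgi Japan
WILD RICE DATABASE: a database of wild rice.

Rice http://www.plantgenomics.cn/ China
Plant: a regional genomics conference in China, covering many agriculture-related bioinformatics topics.

Rice http://www.retroryza.org USA
RetrOryza is a database of rice LTR retrotransposons. It provides information on the 242 families that have been functionally annotated so far, including the Primer Binding Site, PolyPurine Tract, and Target Site Duplication.

Rice http://www.riceweb.org/ Philippines
Information on rice production, markets, and related matters worldwide.

Rice http://www.shigen.nig.ac.jp/rice/oryzabase/top/top.jsp Japan
Integrated rice science database.

Rice http://www.shigen.nig.ac.jp/rice/oryzabase/top/top.jsp Japan
Rice genetics and genomics database, covering everything from classical rice genetics to the latest advances in genetic research, including some topics of active research.

Rice http://www.staff.or.jp/giot/INE.html Japan
INE rice gene database.

Rice http://www.tigr.org/tdb/rice/ USA
TIGR in the USA maintains several databases related to the rice genome, including a genome annotation database, a repeat sequence database, and a gene index.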

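Cotton http://algodon.tamu.edu/ USA
CottonDB: the cotton database maintained by the Southern Plains Agricultural Research Center in the USA.

Cotton http://cottondb.org/ USA
CottonDB contains genomic, genetic, and taxonomic data for cotton, and is continually updated with new data and information on cotton researchers.

Cotton http://www.cottonmarker.org/ USA
Cotton marker database, built collaboratively by several research groups; it holds a large amount of published sequence marker data.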

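Barley http://barley.ipk-gatersleben.de/ebdb.php3 Europe
The European Barley Database (EBDB) mainly collects data from the ECP/GR Working Group on barley, and is maintained by the IPK plant genome and crop research institute in Gatersleben, Germany.

Barley http://bioinf.scri.ac.uk/barley_snpdb/index.html UK
This online database holds information on SNPs in wheat and barley genes discovered at SCRI using a cross-sequencing approach. It is currently maintained by the SCRI plant bioinformatics group.

Barley http://www.shigen.nig.ac.jp/barley/ Japan
This database contains barley germplasm and genome analysis data collected by the bioresources research center of Okayama University, Japan.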

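Wheat http://pgrc.ipk-gatersleben.de/cr-est/ Germany
EST database for barley, wheat, legumes, and tomato.

Wheat http://synteny.nott.ac.uk/ UK
UK CropNet: this database mainly provides gene data for a range of crops, with gene databases for 18 species including Arabidopsis thaliana, Barley, Brassica spp., Forage Grasses, Millet and tef, Alfalfa, Chlamydomonas, and Dictyostelium.

Wheat http://wheat.pw.usda.gov/GG2/index.shtml USA
Cereal crop information database, including genetic information and genetic maps for wheat, barley, oats, rye, and triticale.

Wheat http://www.ecpgr.cgiar.org/databases/crops/wheat.htm France
A wheat database maintained by the crop production research institute of the Czech Republic.

Wheat http://www.shigen.nig.ac.jp/wheat/top.html Japan
Japanese wheat network, jointly maintained by six universities and research institutes.

Wheat http://www.tigr.org/tdb/e2k1/tae1/ USA
Wheat genome data maintained by the TIGR institute. It provides the wheat genome and gene annotation, and can be used for analyses such as genome annotation. It also provides homologous gene data for other cereals such as maize, barley, sorghum, and rice.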

Major English Bioinformatics Terms and Their Definitions

Abstract Syntax Notation (ASN.1)(NCBI发展的许多程序,如显示蛋白质三维结构的Cn3D等所使用的内部格式)
A language that is used to describe structured data types formally. Within bioinformatics, it has been used by the National Center for Biotechnology Information to encode sequences, maps, taxonomic information, molecular structures, and bibliographic information in such a way that it can be easily accessed and exchanged by computer software.
Accession number(记录号)
A unique identifier that is assigned to a single database entry for a DNA or protein sequence.
Affine gap penalty(一种设置空位罚分策略)
A gap penalty score that is a linear function of gap length, consisting of a gap opening penalty and a gap extension penalty multiplied by the length of the gap. Using this penalty scheme greatly enhances the performance of dynamic programming methods for sequence alignment. See also Gap penalty.
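As a minimal sketch of the definition above, the penalty can be computed as an opening cost plus an extension cost times the gap length. The function name and the default values 10.0 and 0.5 are illustrative assumptions, not values taken from this glossary.

```python
def affine_gap_penalty(length, gap_open=10.0, gap_extend=0.5):
    """Penalty for one gap of `length` positions.

    gap_open and gap_extend are assumed example values; real programs
    suggest values suited to the scoring matrix in use.
    """
    if length <= 0:
        return 0.0  # no gap, no penalty
    # opening penalty + extension penalty multiplied by the gap length
    return gap_open + gap_extend * length
```

With these assumed defaults, one gap of length 4 costs less than four separate gaps of length 1, which is the behaviour that favours fewer, longer gaps.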
Algorithm(算法)
A systematic procedure for solving a problem in a finite number of steps, typically involving a repetition of operations. Once specified, an algorithm can be written in a computer language and run as a program.
Alignment(联配/比对/联配)
Refers to the procedure of comparing two or more sequences by looking for a series of individual characters or character patterns that are in the same order in the sequences. Of the two types of alignment, local and global, a local alignment is generally the most useful. See also Local and Global alignments.
Alignment score(联配/比对/联配值)
An algorithmically computed score based on the number of matches, substitutions, insertions, and deletions (gaps) within an alignment. Scores for matches and substitutions are derived from a scoring matrix such as the BLOSUM and PAM matrices for proteins, and affine gap penalties suitable for the matrix are chosen. Alignment scores are in log odds units, often bit units (log to the base 2). Higher scores denote better alignments. See also Similarity score, Distance in sequence analysis.
Alphabet(字母表)
The total number of symbols in a sequence: 4 for DNA sequences and 20 for protein sequences.
Annotation(注释)
The prediction of genes in a genome, including the location of protein-encoding genes, the sequence of the encoded proteins, any significant matches to other proteins of known function, and the location of RNA-encoding genes. Predictions are based on gene models; e.g., hidden Markov models of introns and exons in protein-encoding genes, and models of secondary structure in RNA.
Anonymous FTP(匿名FTP)
When an FTP service allows anyone to log in, it is said to provide anonymous FTP service. A user can log in to an anonymous FTP server by typing anonymous as the user name and his E-mail address as a password. Most Web browsers now negotiate anonymous FTP logon without asking the user for a user name and password. See also FTP.
ASCII
The American Standard Code for Information Interchange (ASCII) encodes unaccented letters a-z, A-Z, the numbers 0-9, most punctuation marks, space, and a set of control characters such as carriage return and tab. ASCII specifies 128 characters that are mapped to the values 0-127. ASCII files are commonly called plain text, meaning that they only encode text without extra markup.
BAC clone(细菌人工染色体克隆)
Bacterial artificial chromosome vector carrying a genomic DNA insert, typically 100–200 kb. Most of the large-insert clones sequenced in the project were BAC clones.
Back-propagation(反向传输)
When training feed-forward neural networks, a back-propagation algorithm can be used to modify the network weights. After each training input pattern is fed through the network, the network’s output is compared with the desired output and the amount of error is calculated. This error is back-propagated through the network by using an error function to correct the network weights. See also Feed-forward neural network.
Baum-Welch algorithm(Baum-Welch算法)
An expectation maximization algorithm that is used to train hidden Markov models.
Bayes' rule(贝叶斯法则)
Forms the basis of conditional probability by calculating the likelihood of an event occurring based on the history of the event and relevant background information. In terms of two parameters A and B, the theorem is stated in an equation: the conditional probability of A, given B, P(A|B), is equal to the probability of A, P(A), times the conditional probability of B, given A, P(B|A), divided by the probability of B, P(B); that is, P(A|B) = P(A) P(B|A) / P(B). P(A) is the historical or prior distribution value of A, P(B|A) is a new prediction for B for a particular value of A, and P(B) is the sum of the newly predicted values for B. P(A|B) is a posterior probability, representing a new prediction for A given the prior knowledge of A and the newly discovered relationships between A and B.
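The rule can be illustrated with a small sketch for the two-outcome case (A or not A). The function and parameter names are hypothetical, and P(B) is obtained by summing the predicted values for B over both outcomes, as the definition describes.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A)."""
    # P(B) is the sum of the predicted values for B over both cases of A
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    # Posterior: P(A|B) = P(A) * P(B|A) / P(B)
    return prior_a * p_b_given_a / p_b
```

For example, a prior of 0.01 with P(B|A) = 0.9 and P(B|not A) = 0.1 gives a posterior of roughly 0.083, far higher than the prior but still small.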
Bayesian analysis(贝叶斯分析)
A statistical procedure used to estimate parameters of an underlying distribution based on an observed distribution. See also Bayes' rule.
Biochips(生物芯片)
Miniaturized arrays of large numbers of molecular substrates, often oligonucleotides, in a defined pattern. They are also called DNA microarrays and microchips.
Bioinformatics (生物信息学)
The merger of biotechnology and information technology with the goal of revealing new insights and principles in biology. Also, the discipline of obtaining information about genomic or protein sequence data. This may involve similarity searches of databases, comparing your unidentified sequence to the sequences in a database, or making predictions about the sequence based on current knowledge of similar sequences. Databases are frequently made publicly available through the Internet, or locally at your institution.
Bit score (二进制值/ Bit值)
The value S’ is derived from the raw alignment score S in which the statistical properties of the scoring system used have been taken into account. Because bit scores have been normalized with respect to the scoring system, they can be used to compare alignment scores from different searches.
Bit units
From information theory, a bit denotes the amount of information required to distinguish between two equally likely possibilities. The number of bits of information, N, required to convey a message that has M possibilities is log2 M = N bits.
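A short sketch of the relation log2 M = N, assuming equally likely possibilities; the function name is illustrative.

```python
import math

def bits_required(num_possibilities):
    """Bits needed to single out one of M equally likely possibilities."""
    return math.log2(num_possibilities)

# 4 nucleotides need 2 bits; 20 amino acids need about 4.32 bits
dna_bits = bits_required(4)
protein_bits = bits_required(20)
```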
BLAST (基本局部联配搜索工具,一种主要数据库搜索程序)
Basic Local Alignment Search Tool. A set of programs used to perform fast similarity searches. Nucleotide sequences can be compared with nucleotide sequences in a database using BLASTN, for example. Complex statistics are applied to judge the significance of each match. Reported sequences may be homologous to, or related to, the query sequence. The BLASTP program is used to search a protein database for a match against a query protein sequence. There are several other flavours of BLAST. BLAST2 is a newer release of BLAST that allows for insertions or deletions in the sequences being aligned. Gapped alignments may be more biologically significant.
Block(蛋白质家族中保守区域的组块)
Conserved ungapped patterns approximately 3-60 amino acids in length in a set of related proteins.
BLOSUM matrices(模块替换矩阵,一种主要替换矩阵)
An alternative to PAM tables, BLOSUM tables were derived using local multiple alignments of more distantly related sequences than were used for the PAM matrix. These are used to assess the similarity of sequences when performing alignments.
Boltzmann distribution(Boltzmann 分布)
Describes the number of molecules that have energies above a certain level, based on the Boltzmann gas constant and the absolute temperature.
Boltzmann probability function(Boltzmann概率函数)
See Boltzmann distribution.
Bootstrap analysis
A method for testing how well a particular data set fits a model. For example, the validity of the branch arrangement in a predicted phylogenetic tree can be tested by resampling columns in a multiple sequence alignment to create many new alignments. The appearance of a particular branch in trees generated from these resampled sequences can then be measured. Alternatively, a sequence may be left out of an analysis to determine how much the sequence influences the results of an analysis.
Branch length(分支长度)
In sequence analysis, the number of sequence changes along a particular branch of a phylogenetic tree.
CDS or cds (编码序列)
Coding sequence.
Chebyshev's inequality
The probability that a random variable deviates from its mean by at least k standard deviations is less than or equal to 1/k^2, that is, the square of 1 over the number of standard deviations from the mean.
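The bound can be checked empirically on any non-constant data set. The sketch below uses the standard form of Chebyshev's inequality, P(|X - mean| >= k·sd) <= 1/k^2, and a hypothetical helper name; it treats the sample as the whole population, so the population standard deviation is used.

```python
import statistics

def tail_fraction(values, k):
    """Fraction of values at least k standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD; assumes values are not all equal
    far = sum(1 for v in values if abs(v - mean) >= k * sd)
    return far / len(values)

sample = [1, 2, 3, 4, 100]  # made-up data with one extreme value
```

For any k > 0, `tail_fraction(sample, k)` should not exceed `1 / k**2`.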
Clone (克隆)
Population of identical cells or molecules (e.g. DNA), derived from a single ancestor.
Cloning Vector (克隆载体)
A molecule that carries a foreign gene into a host, and allows/facilitates the multiplication of that gene in a host. When sequencing a gene that has been cloned using a cloning vector (rather than by PCR), care should be taken not to include the cloning vector sequence when performing similarity searches. Plasmids, cosmids, phagemids, YACs and PACs are example types of cloning vectors.
Cluster analysis(聚类分析)
A method for grouping together a set of objects that are most similar from a larger group of related objects. The relationships are based on some criterion of similarity or difference. For sequences, a similarity or distance score or a statistical evaluation of those scores is used.
Cobbler
A single sequence that represents the most conserved regions in a multiple sequence alignment. The BLOCKS server uses the cobbler sequence to perform a database similarity search as a way to reach sequences that are more divergent than would be found using the single sequences in the alignment for searches.
Coding system (neural networks)
Regarding neural networks, a coding system needs to be designed for representing input and output. The level of success found when training the model will be partially dependent on the quality of the coding system chosen.
Codon usage
Analysis of the codons used in a particular gene or organism.
COG(直系同源簇)
Clusters of orthologous groups is a set of groups of related sequences in microorganisms and yeast (S. cerevisiae). These groups are found by whole proteome comparisons and include orthologs and paralogs. See also Orthologs and Paralogs.
Comparative genomics(比较基因组学)
A comparison of gene numbers, gene locations, and biological functions of genes in the genomes of diverse organisms, one objective being to identify groups of genes that play a unique biological role in a particular organism.
Complexity (of an algorithm)(算法的复杂性)
Describes the number of steps required by the algorithm to solve a problem as a function of the amount of data; for example, the length of sequences to be aligned.
Conditional probability(条件概率)
The probability of a particular result (or of a particular value of a variable) given one or more events or conditions (or values of other variables).
Conservation (保守)
Changes at a specific position of an amino acid (or, less commonly, DNA) sequence that preserve the physico-chemical properties of the original residue.
Consensus(一致序列)
A single sequence that represents, at each subsequent position, the variation found within corresponding columns of a multiple sequence alignment.
Context-free grammars
A recursive set of production rules for generating patterns of strings. These consist of a set of terminal characters that are used to create strings, a set of nonterminal symbols that correspond to rules and act as placeholders for patterns that can be generated using terminal characters, a set of rules for replacing nonterminal symbols with terminal characters, and a start symbol.
Contig (序列重叠群/拼接序列)
A set of clones that can be assembled into a linear order. A DNA sequence that overlaps with another contig. The full set of overlapping sequences (contigs) can be put together to obtain the sequence for a long region of DNA that cannot be sequenced in one run in a sequencing assay. Important in genetic mapping at the molecular level.
CORBA(国际对象管理协作组制定的使OOP对象与网络接口统一起来的一套跨计算机、操作系统、程序语言和网络的共同标准)
The Common Object Request Broker Architecture (CORBA) is an open industry standard for working with distributed objects, developed by the Object Management Group. CORBA allows the interconnection of objects and applications regardless of computer language, machine architecture, or geographic location of the computers.
Correlation coefficient(相关系数)
A numerical measure, falling between -1 and 1, of the degree of the linear relationship between two variables. A positive value indicates a direct relationship, a negative value indicates an inverse relationship, and the distance of the value away from zero indicates the strength of the relationship. A value near zero indicates no relationship between the variables.
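As an illustration, the sketch below computes the Pearson correlation coefficient, one common measure of linear relationship; the glossary does not say which coefficient it means, so this choice is an assumption.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

A perfectly direct relationship gives a value near 1 and a perfectly inverse one gives a value near -1.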
Covariation (in sequences)(共变)
Coincident change at two or more sequence positions in related sequences that may influence the secondary structures of RNA or protein molecules.
Coverage (or depth) (覆盖率/厚度)
The average number of times a nucleotide is represented by a high-quality base in a collection of random raw sequence. Operationally, a ‘high-quality base’ is defined as one with an accuracy of at least 99% (corresponding to a PHRED score of at least 20).
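The link between accuracy and PHRED score, and the resulting coverage figure, can be sketched as follows. The relation Q = -10·log10(error probability) is the usual PHRED definition; the function names and the representation of reads as lists of per-base quality scores are assumptions made for this example.

```python
def phred_to_error(q):
    """Error probability implied by a PHRED quality score q."""
    return 10 ** (-q / 10)

def coverage(reads_qualities, genome_length, min_q=20):
    """Average number of high-quality bases per genome position.

    reads_qualities: list of reads, each a list of per-base PHRED scores.
    """
    high_quality = sum(
        1 for read in reads_qualities for q in read if q >= min_q
    )
    return high_quality / genome_length
```

A PHRED score of 20 corresponds to an error probability of 0.01, i.e. 99% accuracy.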
Database(数据库)
A computerized storehouse of data that provides a standardized way for locating, adding, removing, and changing data. See also Object-oriented database, Relational database.
Dendrogram
A form of a tree that lists the compared objects (e.g., sequences or genes in a microarray analysis) in a vertical order and joins related ones by levels of branches extending to one side of the list.
Depth (厚度)
See coverage
Dirichlet mixtures
Defined as the conjugate prior of a multinomial distribution. One use is for predicting the expected pattern of amino acid variation found in the match state of a hidden Markov model (representing one column of a multiple sequence alignment of proteins), based on prior distributions found in conserved protein domains (blocks).
Distance in sequence analysis(序列距离)
The number of observed changes in an optimal alignment of two sequences, usually not counting gaps.
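A minimal sketch of this count for two already-aligned sequences of equal length, skipping gapped columns. The function name and the use of "-" as the gap character are assumptions.

```python
def alignment_distance(aligned_a, aligned_b, gap="-"):
    """Count mismatched columns in a pairwise alignment, ignoring gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for x, y in zip(aligned_a, aligned_b)
        if x != gap and y != gap and x != y
    )
```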
DNA Sequencing (DNA测序)
The experimental process of determining the nucleotide sequence of a region of DNA. This is done by labelling each nucleotide (A, C, G or T) with either a radioactive or fluorescent marker which identifies it. There are several methods of applying this technology, each with their advantages and disadvantages. For more information, refer to a current text book. High throughput laboratories frequently use automated sequencers, which are capable of rapidly reading large numbers of templates. Sometimes, the sequences may be generated more quickly than they can be characterised.
Domain (功能域)
A discrete portion of a protein assumed to fold independently of the rest of the protein and possessing its own function.
Dot matrix(点标矩阵图)
Dot matrix diagrams provide a graphical method for comparing two sequences. One sequence is written horizontally across the top of the graph and the other along the left-hand side. Dots are placed within the graph at the intersection of the same letter appearing in both sequences. A series of diagonal lines in the graph indicate regions of alignment. The matrix may be filtered to reveal the most-alike regions by scoring a minimal threshold number of matches within a sequence window.
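The unfiltered diagram can be sketched in a few lines. The function name and the characters used for dots and empty cells are arbitrary choices for this example.

```python
def dot_matrix(top_seq, left_seq, dot="*", empty="."):
    """Return one text row per character of left_seq.

    A dot is placed wherever the row character equals the column character.
    """
    return [
        "".join(dot if row_char == col_char else empty for col_char in top_seq)
        for row_char in left_seq
    ]

# Comparing a sequence with itself gives a full main diagonal
rows = dot_matrix("GATTACA", "GATTACA")
```

Printing `"\n".join(rows)` shows the main diagonal plus scattered off-diagonal dots from chance matches of single letters.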
Draft genome sequence (基因组序列草图)
The sequence produced by combining the information from the individual sequenced clones (by creating merged sequence contigs and then employing linking information to create scaffolds) and positioning the sequence along the physical map of the chromosomes.
DUST (一种低复杂性区段过滤程序)
A program for filtering low complexity regions from nucleic acid sequences.
Dynamic programming(动态规划法)
A dynamic programming algorithm solves a problem by combining solutions to sub-problems that are computed once and saved in a table or matrix. Dynamic programming is typically used when a problem has many possible solutions and an optimal one needs to be found. This algorithm is used for producing sequence alignments, given a scoring system for sequence comparisons.
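As a hedged illustration, the sketch below computes a global alignment score in the style of Needleman-Wunsch, filling a table row by row from saved sub-problem results. The scoring values (match 1, mismatch -1, gap -1) and the simple per-position gap cost are assumptions chosen for brevity, not recommended settings.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of a and b under a linear gap cost."""
    # prev[j] holds the best score for a[:i-1] aligned with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]
```

Only two rows are kept in memory because each cell depends on its upper, left, and upper-left neighbours; recovering the alignment itself would require keeping the full table.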
EMBL (欧洲分子生物学实验室,EMBL数据库是主要公共核酸序列数据库之一)
European Molecular Biology Laboratory. Maintains the EMBL database, one of the major public sequence databases.
EMBnet (欧洲分子生物学网络)
European Molecular Biology Network: http://www.embnet.org/ was established in 1988, and provides services including local molecular databases and software for molecular biologists in Europe. There are several large outposts of EMBnet, including EXPASY.
Entropy(熵)
From information theory, a measure of the unpredictable nature of a set of possible elements. The higher the level of variation within the set, the higher the entropy.
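One common way to quantify this is Shannon entropy in bits, sketched below for a string of symbols such as one alignment column. The glossary does not give a formula, so the use of Shannon's definition here is an assumption.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the symbol frequencies in `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(
        (n / total) * math.log2(n / total) for n in counts.values()
    )
```

A column of identical symbols has entropy 0, while a DNA column with all four bases equally frequent has entropy 2 bits.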
Erdos and Renyi law
In a toss of a “fair” coin, the number of heads in a row that can be expected is the logarithm of the number of tosses to the base 2. The law may be generalized for more than two possible outcomes by changing the base of the logarithm to the number of outcomes. This law was used to analyze the number of matches and mismatches that can be expected between random sequences as a basis for scoring the statistical significance of a sequence alignment.
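The statement can be written as a one-line calculation, assuming equally likely outcomes; the function name is hypothetical.

```python
import math

def expected_longest_run(num_trials, num_outcomes=2):
    """Expected longest run of one outcome: log of trials to base num_outcomes."""
    return math.log(num_trials, num_outcomes)
```

For 1024 fair coin tosses this gives about 10; with four equally likely outcomes, as for random DNA bases, the same number of trials gives about 5.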
EST (表达序列标签的缩写)
See Expressed Sequence Tag
Expect value (E)(E值)
E value. The number of different alignments with scores equivalent to or better than S that are expected to occur in a database search by chance. The lower the E value, the more significant the score. In a database similarity search, the probability that an alignment score as good as the one found between a query sequence and a database sequence would be found in as many comparisons between random sequences as was done to find the matching sequence. In other types of sequence analysis, E has a similar meaning.
Expectation maximization (sequence analysis)
An algorithm for locating similar sequence patterns in a set of sequences. A guessed alignment of the sequences is first used to generate an expected scoring matrix representing the distribution of sequence characters in each column of the alignment, this pattern is matched to each sequence, and the scoring matrix values are then updated to maximize the alignment of the matrix to the sequences. The procedure is repeated until there is no further improvement.
Exon (外显子)
Coding region of DNA. See CDS.
Expressed Sequence Tag (EST) (表达序列标签)
Randomly selected, partial cDNA sequence; represents its corresponding mRNA. dbEST is a large database of ESTs at GenBank, NCBI.
FASTA (一种主要数据库搜索程序)
The first widely used algorithm for database similarity searching. The program looks for optimal local alignments by scanning the sequence for small matches called "words". Initially, the scores of segments in which there are multiple word hits are calculated ("init1"). Later the scores of several segments may be summed to generate an "initn" score. An optimized alignment that includes gaps is shown in the output as "opt". The sensitivity and speed of the search are inversely related and controlled by the "k-tup" variable which specifies the size of a "word". (Pearson and Lipman)
Extreme value distribution(极值分布)
Some measurements are found to follow a distribution that has a long tail which decays at high values much more slowly than that found in a normal distribution. This slow-falling type is called the extreme value distribution. The alignment scores between unrelated or random sequences are an example. These scores can reach very high values, particularly when a large number of comparisons are made, as in a database similarity search. The probability of a particular score may be accurately predicted by the extreme value distribution, which follows a double negative exponential function after Gumbel.
False negative(假阴性)
A negative data point collected in a data set that was incorrectly reported due to a failure of the test. If the test had correctly measured the data point, the data would have been recorded as positive.
False positive (假阳性)
A positive data point collected in a data set that was incorrectly reported due to a failure of the test. If the test had correctly measured the data point, the data would have been recorded as negative.
Feed-forward neural network (前馈神经网络)
Organizes nodes into sequence layers in which the nodes in each layer are fully connected with the nodes in the next layer, except for the final output layer. Input is fed from the input layer through the layers in sequence in a “feed-forward” direction, resulting in output at the final layer. See also Neural network.
Filtering (window size)
During pair-wise sequence alignment using the dot matrix method, random matches can be filtered out by using a sliding window to compare the two sequences. Rather than comparing a single sequence position at a time, a window of adjacent positions in the two sequences is compared and a dot, indicating a match, is generated only if a certain minimal number of matches occur.
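The sliding-window filter can be sketched as follows. The function name, the return format (a set of window start positions), and the parameter names are assumptions for this example.

```python
def windowed_dots(seq_a, seq_b, window, min_matches):
    """Return (i, j) window starts where enough positions match.

    A dot is kept at (i, j) only if seq_a[i:i+window] and
    seq_b[j:j+window] agree in at least min_matches positions.
    """
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= min_matches:
                dots.add((i, j))
    return dots
```

With a window of 3 and a threshold of 3, comparing "ACGT" with itself keeps only the two diagonal windows, and isolated single-letter matches disappear.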
Filtering (过滤)
Also known as Masking. The process of hiding regions of (nucleic acid or amino acid) sequence having characteristics that frequently lead to spurious high scores. See SEG and DUST.
Finished sequence(完成序列)
Complete sequence of a clone or genome, with an accuracy of at least 99.99% and no gaps.
Fourier analysis
Studies the approximations and decomposition of functions using trigonometric polynomials.
Format (file)(格式)
Different programs require that information be specified to them in a formal manner, using particular keywords and ordering. This specification is a file format.
Forward-backward algorithm
Used to train a hidden Markov model by aligning the model with training sequences. The algorithm then refines the model to reduce the error when fitted to the given data using a gradient descent approach.
FTP (File Transfer Protocol)(文件传输协议)
Allows a person to transfer files from one computer to another across a network using an FTP-capable client program. The FTP client program can only communicate with machines that run an FTP server. The server, in turn, will make a specific portion of its file system available for FTP access, providing that the client is able to supply a recognized user name and password to the server.
Full shotgun clone (鸟枪法克隆)
A large-insert clone for which full shotgun sequence has been produced.
Functional genomics(功能基因组学)
Assessment of the function of genes identified by between-genome comparisons. The function of a newly identified gene is tested by introducing mutations into the gene and then examining the resultant mutant organism for an altered phenotype.
gap (空位/间隙/缺口)
A space introduced into an alignment to compensate for insertions and deletions in one sequence relative to another. To prevent the accumulation of too many gaps in an alignment, introduction of a gap causes the deduction of a fixed amount (the gap score) from the alignment score. Extension of the gap to encompass additional nucleotides or amino acid is also penalized in the scoring of an alignment.
Gap penalty(空位罚分)
A numeric score used in sequence alignment programs to penalize the presence of gaps within an alignment. The value of a gap penalty affects how often gaps appear in alignments produced by the algorithm. Most alignment programs suggest gap penalties that are appropriate for particular scoring matrices.
Genetic algorithm(遗传算法)
A kind of search algorithm that was inspired by the principles of evolution. A population of initial solutions is encoded and the algorithm searches through these by applying a pre-defined fitness measurement to each solution, selecting those with the highest fitness for reproduction. New solutions can be generated during this phase by crossover and mutation operations, defined in the encoded solutions.
Genetic map (遗传图谱)
A genome map in which polymorphic loci are positioned relative to one another on the basis of the frequency with which they recombine during meiosis. The unit of distance is centimorgans (cM), denoting a 1% chance of recombination.
Genome(基因组)
The genetic material of an organism, contained in one haploid set of chromosomes.
Gibbs sampling method
An algorithm for finding conserved patterns within a set of related sequences. A guessed alignment of all but one sequence is made and used to generate a scoring matrix that represents the alignment. The matrix is then matched to the left-out sequence, and a probable location of the corresponding pattern is found. This prediction is then input into a new alignment and another scoring matrix is produced and tested on a new left-out sequence. The process is repeated until there is no further improvement in the matrix.
Global alignment(整体联配)
Attempts to match as many characters as possible, from end to end, in a set of two or more sequences.
Gopher (一个文档发布系统,允许检索和显示文本文件)
A document distribution system that allows text files to be retrieved and displayed.
Graph theory(图论)
A branch of mathematics which deals with problems that involve a graph or network structure. A graph is defined by a set of nodes (or points) and a set of arcs (lines or edges) joining the nodes. In sequence and genome analysis, graph theory is used for sequence alignments and clustering alike genes.
GSS(基因综述序列)
Genome survey sequence.
GUI(图形用户界面)
Graphical user interface.
H (相对熵值)
H is the relative entropy of the target and background residue frequencies (Karlin and Altschul, 1990). H can be thought of as a measure of the average information (in bits) available per position that distinguishes an alignment from chance. At high values of H, short alignments can be distinguished from chance, whereas at lower H values, a longer alignment may be necessary (Altschul, 1991).
Half-bits
Some scoring matrices are in half-bit units. These units are logarithms to the base 2 of odds scores times 2.
Heuristic(启发式方法)
A procedure that progresses along empirical lines by using rules of thumb to reach a solution. The solution is not guaranteed to be optimal.
Hexadecimal system(16制系统)
The base 16 counting system that uses the digits 0-9 followed by the letters A-F.
HGMP (人类基因组图谱计划)
Human Genome Mapping Project.
Hidden Markov Model (HMM)(隐马尔可夫模型)
In sequence analysis, a HMM is usually a probabilistic model of a multiple sequence alignment, but can also be a model of periodic patterns in a single sequence, representing, for example, patterns found in the exons of a gene. In a model of multiple sequence alignments, each column of symbols in the alignment is represented by a frequency distribution of the symbols called a state, and insertions and deletions by other states. One then moves through the model along a particular path from state to state trying to match a given sequence. The next matching symbol is chosen from each state, recording its probability (frequency) and also the probability of going to that particular state from a previous one (the transition probability). State and transition probabilities are then multiplied to obtain a probability of the given sequence. Generally speaking, a HMM is a statistical model for an ordered sequence of symbols, acting as a stochastic state machine that generates a symbol each time a transition is made from one state to the next. Transitions between states are specified by transition probabilities.
Hidden layer(隐藏层)
An inner layer within a neural network that receives its input and sends its output to other layers within the network. One function of the hidden layer is to detect covariation within the input data, such as patterns of amino acid covariation that are associated with a particular type of secondary structure in proteins.
Hierarchical clustering(分级聚类)
The clustering or grouping of objects based on some single criterion of similarity or difference. An example is the clustering of genes in a microarray experiment based on the correlation between their expression patterns. The distance method used in phylogenetic analysis is another example.
Hill climbing
A nonoptimal search algorithm that selects the singular best possible solution at a given state or step. The solution may result in a locally best solution that is not a globally best solution.
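The behaviour described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `hill_climb`, the one-dimensional `landscape`, and the `neighbors` helper are all assumptions made up for the example.

```python
def hill_climb(score, start, neighbors, max_steps=1000):
    """Move to the best-scoring neighbor until no neighbor improves the score."""
    current = start
    for _ in range(max_steps):
        best = max(neighbors(current), key=score)
        if score(best) <= score(current):
            break  # no neighbor is better: a (possibly local) optimum
        current = best
    return current

# Hypothetical landscape: a local peak at x = 2 and the global peak at x = 8.
landscape = [0, 1, 3, 2, 1, 2, 4, 6, 9, 5]
score = lambda x: landscape[x]
neighbors = lambda x: [n for n in (x - 1, x + 1) if 0 <= n < len(landscape)]

print(hill_climb(score, 0, neighbors))  # gets stuck on the local peak
print(hill_climb(score, 5, neighbors))  # reaches the global peak
```

Starting from the left edge the search stops on the locally best position, which shows why the result depends on the starting point.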
Homology(同源性)
A similar component in two organisms (e.g., genes with strongly similar sequences) that can be attributed to a common ancestor of the two organisms during evolution.
Horizontal transfer(水平转移)
The transfer of genetic material between two distinct species that do not ordinarily exchange genetic material. The transferred DNA becomes established in the recipient genome and can be detected by a novel phylogenetic history and codon content compared to the rest of the genome.
HSP (高比值片段对)
High-scoring segment pair. Local alignments with no gaps that achieve one of the top alignment scores in a given search.
HTGS/HGT(高通量基因组序列)
High-throughput genome sequences.
HTML(超文本标识语言)
The Hyper-Text Markup Language (HTML) provides a structural description of a document using a specified tag set. HTML currently serves as the Internet lingua franca for describing hypertext Web page documents.
Hyperplane
A generalization of the two-dimensional plane to N dimensions.
Hypercube
A generalization of the three-dimensional cube to N dimensions.
Identity (相同性/相同率)
The extent to which two (nucleotide or amino acid) sequences are invariant.
Indel(插入或删除的缩略语)
An insertion or deletion in a sequence alignment.
Information content (of a scoring matrix)
A representation of the degree of sequence conservation in a column of a scoring matrix representing an alignment of related sequences. It is also the number of questions that must be asked to match the column to a position in a test sequence. For bases, the maximum possible number is 2, and for proteins, 4.32 (logarithm to the base 2 of the number of possible sequence characters).
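As a rough check of the numbers quoted above, the sketch below computes the maximum information per column (log2 of the alphabet size) and the information content of a single column as that maximum minus the observed Shannon uncertainty. The function name `column_information` and the example columns are assumptions for illustration, and no small-sample correction is applied.

```python
import math
from collections import Counter

def column_information(column, alphabet_size):
    """Information content (bits) of one alignment column.

    Computed as log2(alphabet_size) minus the Shannon uncertainty
    of the character frequencies observed in the column.
    """
    counts = Counter(column)
    total = len(column)
    uncertainty = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - uncertainty

print(round(math.log2(4), 2))          # maximum for bases
print(round(math.log2(20), 2))         # maximum for amino acids
print(column_information("AAAA", 4))   # fully conserved column
print(column_information("ACGT", 4))   # all four bases equally frequent
```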
Information theory(信息理论)
A branch of mathematics that measures information in terms of bits, the minimal amount of structural complexity needed to encode a given piece of information.
Input layer(输入层)
The initial layer in a feed-forward neural net. This layer encodes input information that will be fed through the network model.
Interface definition language
Used to define an interface to an object model in a programming language neutral form, where an interface is an abstraction of a service defined only by the operations that can be performed on it.
Internet(因特网)
The network infrastructure, consisting of cables interconnected by routers, that provides global connectivity for individual computers and private networks of computers. A second sense of the word internet is the collective computer resources available over this global network.
Interpolated Markov model
A type of Markov model of sequences that examines sequences for patterns of variable length in order to discriminate best between genes and non-gene sequences.
Intranet(内部网)
Intron (内含子)
Non-coding region of DNA.
Iterative(反复的/迭代的)
A sequence of operations in a procedure that is performed repeatedly.
Java(一种由SUN Microsystem开发的编程语言)
K (BLAST程序的一个统计参数)
A statistical parameter used in calculating BLAST scores that can be thought of as a natural scale for search space size. The value K is used in converting a raw score (S) to a bit score (S’).
K-tuple(字/字长)
Identical short stretches of sequences, also called words.
lambda (λ,BLAST程序的一个统计参数)
A statistical parameter used in calculating BLAST scores that can be thought of as a natural scale for scoring system. The value lambda is used in converting a raw score (S) to a bit score (S’).
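The conversion mentioned in the K and lambda entries is usually written S' = (lambda * S - ln K) / ln 2. The sketch below applies that formula; the function name `bit_score` and the parameter values in the example are illustrative assumptions, since real values of lambda and K depend on the scoring matrix and gap costs in use.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw score S to a bit score S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Illustrative parameter values only; real values depend on the scoring system.
print(round(bit_score(100, lam=0.267, k=0.041), 1))
```

Because the bit score folds lambda and K into the score itself, bit scores from searches run with different scoring systems can be compared directly.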
LAN(局域网)
Local area network.
Likelihood(似然性)
The hypothetical probability that an event which has already occurred would yield a specific outcome. Unlike probability, which refers to future events, likelihood refers to past events.
Linear discriminant analysis
An analysis in which a straight line is located on a graph between two sets of data points in a location that best separates the data points into two groups.
Local alignment(局部联配)
Attempts to align regions of sequences with the highest density of matches. In doing so, one or more islands of subalignments are created in the aligned sequences.
Log odds score(概率对数值)
The logarithm of an odds score. See also Odds score.
Low Complexity Region (LCR) (低复杂性区段)
Regions of biased composition including homopolymeric runs, short-period repeats, and more subtle overrepresentation of one or a few residues. The SEG program is used to mask or filter LCRs in amino acid queries. The DUST program is used to mask or filter LCRs in nucleic acid queries.
Machine learning(机器学习)
The training of a computational model of a process or classification scheme to distinguish between alternative possibilities.
Markov chain(马尔可夫链)
Describes a process that can be in one of a number of states at any given time. The Markov chain is defined by probabilities for each transition occurring; that is, probabilities of the occurrence of state s_j given that the current state is s_i. Substitutions in nucleic acid and protein sequences are generally assumed to follow a Markov chain in that each site changes independently of the previous history of the site. With this model, the number and types of substitutions observed over a relatively short period of evolutionary time can be extrapolated to longer periods of time. In performing sequence alignments and calculating the statistical significance of alignment scores, sequences are assumed to be Markov chains in which the choice of one sequence position is not influenced by another.
Masking (过滤)
Also known as Filtering. The removal of repeated or low complexity regions from a sequence in order to improve the sensitivity of sequence similarity searches performed with that sequence.
Maximum likelihood (phylogeny, alignment)(最大似然法)
The most likely outcome (tree or alignment), given a probabilistic model of evolutionary change in DNA sequences.
Maximum parsimony(最大简约法)
The minimum number of evolutionary steps required to generate the observed variation in a set of sequences, as found by comparison of the number of steps in all possible phylogenetic trees.
Method of moments
The mean or expected value of a variable is the first moment of the values of the variable, defined as that number from which the sum of deviations to all values is zero. The variance (the square of the standard deviation) is the second moment of the values about the mean, and so on.
Minimum spanning tree
Given a set of related objects classified by some similarity or difference score, the minimum spanning tree joins the most-alike objects on adjacent outer branches of a tree and then sequentially joins less-alike objects by more inward branches. The tree branch lengths are calculated by the same neighbor-joining algorithm that is used to build phylogenetic trees of sequences from a distance matrix. The sum of the resulting branch lengths between each pair of objects will be approximately that found by the classification scheme.
MMDB (分子建模数据库)
Molecular Modelling Database. A taxonomy assigned database of PDB (see PDB) files, and related information.
Molecular clock hypothesis(分子钟假设)
The hypothesis that sequences change at the same rate in the branches of an evolutionary tree.
Monte Carlo(蒙特卡罗法)
A method that samples possible solutions to a complex problem as a way to estimate a more general solution.
Motif (模序)
A short conserved region in a protein sequence. Motifs are frequently highly conserved parts of domains.
Multiple Sequence Alignment (多序列联配)
An alignment of three or more sequences with gaps inserted in the sequences such that residues with common structural positions and/or ancestral residues are aligned in the same column. Clustal W is one of the most widely used multiple sequence alignment programs.
Mutation data matrix(突变数据矩阵,即PAM矩阵)
A scoring matrix compiled from the observation of point mutations between aligned sequences. Also refers to a Dayhoff PAM matrix in which the scores are given as log odds scores.
N50 length (N50长度,即覆盖50%所有核苷酸的最大序列重叠群长度)
A measure of the contig length (or scaffold length) containing a ‘typical’ nucleotide. Specifically, it is the maximum length L such that 50% of all nucleotides lie in contigs (or scaffolds) of size at least L.
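The definition above translates directly into a short calculation: sort the contigs from longest to shortest and walk down the list until half of all nucleotides are covered. The function name `n50` and the contig lengths below are made up for the example.

```python
def n50(lengths):
    """Largest L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")

# Total is 290 bases; the two longest contigs (80 + 70 = 150) already cover half.
print(n50([80, 70, 50, 40, 30, 20]))
```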
Nats (natural logarithm)
A number expressed in units of the natural logarithm.
NCBI (美国国家生物技术信息中心)
National Center for Biotechnology Information (USA). Created by the United States Congress in 1988, to develop information systems to support the biological research community.
Needleman-Wunsch algorithm(Needleman-Wunsch算法)
Uses dynamic programming to find global alignments between sequences.
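A minimal sketch of the dynamic programming recurrence follows. It returns only the optimal global alignment score, without traceback, and assumes a simple match/mismatch scheme with a linear gap penalty; the function name and the default scores are illustrative choices, not values prescribed by the algorithm.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]

print(needleman_wunsch_score("ACGT", "AGT"))  # three matches and one gap
```

The score is read from the bottom-right cell because a global alignment must span both sequences end to end.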
Neighbor-joining method(邻接法)
Clusters together alike pairs within a group of related objects (e.g., genes with similar sequences) to create a tree whose branches reflect the degrees of difference among the objects.
Neural network(神经网络)
From artificial intelligence algorithms, techniques that involve a set of many simple units that hold symbolic data, which are interconnected by a network of links associated with numeric weights. Units operate only on their symbolic data and on the inputs that they receive through their connections. Most neural networks use a training algorithm (see Back-propagation) to adjust connection weights, allowing the network to learn associations between various input and output patterns. See also Feed-forward neural network.
NIH (美国国家卫生研究院)
National Institutes of Health (USA).
Noise(噪音)
In sequence analysis, a small amount of randomly generated variation in sequences that is added to a model of the sequences; e.g., a hidden Markov model or scoring matrix, in order to avoid the model overfitting the sequences. See also Overfitting.
Normal distribution(正态分布)
The distribution found for many types of data such as body weight, size, and exam scores. The distribution is a bell-shaped curve that is described by a mean and standard deviation of the mean. Local sequence alignment scores between unrelated or random sequences do not follow this distribution but instead the extreme value distribution which has a much extended tail for higher scores. See also Extreme value distribution.
Object Management Group (OMG)(国际对象管理协作组)
A not-for-profit corporation that was formed to promote component-based software by introducing standardized object software. The OMG establishes industry guidelines and detailed object management specifications in order to provide a common framework for application development. Within OMG is a Life Sciences Research group, a consortium representing pharmaceutical companies, academic institutions, software vendors, and hardware vendors who are working together to improve communication and inter-operability among computational resources in life sciences research. See CORBA.
Object-oriented database(面向对象数据库)
Unlike relational databases (see entry), which use a tabular structure, object-oriented databases attempt to model the structure of a given data set as closely as possible. In doing so, object-oriented databases tend to reduce the appearance of duplicated data and the complexity of query structure often found in relational databases.
Odds score(概率/几率值)
The ratio of the likelihoods of two events or outcomes. In sequence alignments and scoring matrices, the odds score for matching two sequence characters is the ratio of the frequency with which the characters are aligned in related sequences divided by the frequency with which those same two characters align by chance alone, given the frequency of occurrence of each in the sequences. Odds scores for a set of individually aligned positions are obtained by multiplying the odds scores for each position. Odds scores are often converted to logarithms to create log odds scores that can be added to obtain the log odds score of a sequence alignment.
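The relationship between multiplying odds scores and adding log odds scores can be checked numerically. In this sketch the function name `odds_score` and all frequencies are hypothetical values chosen for the example.

```python
import math

def odds_score(pair_freq, freq_a, freq_b):
    """Observed aligned-pair frequency divided by the frequency expected by chance."""
    return pair_freq / (freq_a * freq_b)

# Hypothetical frequencies for two independently aligned positions.
odds = [odds_score(0.02, 0.1, 0.1), odds_score(0.005, 0.1, 0.1)]

product = math.prod(odds)                   # multiply the odds scores
log_sum = sum(math.log2(o) for o in odds)   # or add the log odds scores (in bits)
print(product, 2 ** log_sum)                # both routes give the same alignment odds
```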
OMIM (一种人类遗传疾病数据库)
Online Mendelian Inheritance in Man. Database of genetic diseases with references to molecular medicine, cell biology, biochemistry and clinical details of the diseases.
Optimal alignment(最佳联配)
The highest-scoring alignment found by an algorithm capable of producing multiple solutions. This is the best possible alignment that can be found, given any parameters supplied by the user to the sequence alignment program.
ORF (开放阅读框)
Open Reading Frame. A series of codons (base triplets) which can be translated into a protein. There are six potential reading frames of an unidentified sequence; BLASTX (see BLAST) translates a nucleotide sequence in all six reading frames into protein, then attempts to align the results to sequences in a protein database, returning the results as a nucleotide sequence. The most likely reading frame can be identified using on-line software (e.g. ORF Finder).
Orthologous(直系同源)
Homologous sequences in different species that arose from a common ancestral gene during speciation; may or may not be responsible for a similar function. A pair of genes found in two species are orthologous when the encoded proteins are 60-80% identical in an alignment. The proteins almost certainly have the same three-dimensional structure, domain structure, and biological function, and the encoding genes have originated from a common ancestor gene at an earlier evolutionary time. Two orthologs I and II in genomes A and B, respectively, may be identified when the complete genomes of two species are available: (1) in a database similarity search of all of the proteome of B using I as a query, II is the best hit found, and (2) I is the best hit when II is used as a query of the proteome of A. The best hit is the database sequence with the lowest expect value (E). Orthology is also predicted by a very close phylogenetic relationship between sequences or by a cluster analysis. Compare to Paralogs. See also Cluster analysis.
Output layer(输出层)
The final layer of a neural network in which signals from lower levels in the network are input into output states where they are weighted and summed to give an output signal. For example, the output signal might be the prediction of one type of protein secondary structure for the central amino acid in a sequence window.
Overfitting
Can occur when using a learning algorithm to train a model such as a neural net or hidden Markov model. Overfitting refers to the model becoming too highly representative of the training data and thus no longer representative of the overall range of data that is supposed to be modeled.

P value (P值/概率值)
The probability of an alignment occurring with the score in question or better. The p value is calculated by relating the observed alignment score, S, to the expected distribution of HSP scores from comparisons of random sequences of the same length and composition as the query to the database. The most highly significant P values will be those close to 0. P values and E values are different ways of representing the significance of the alignment.
Pair-wise sequence alignment(双序列联配)
An alignment performed between two sequences.
PAM (可接受突变百分率/可以观察到的突变百分率,它可作为一种进化时间单位)
Percent Accepted Mutation. A unit introduced by Dayhoff et al. to quantify the amount of evolutionary change in a protein sequence. One PAM unit is the amount of evolution which will change, on average, 1% of amino acids in a protein sequence. A PAM(x) substitution matrix is a look-up table in which scores for each amino acid substitution have been calculated based on the frequency of that substitution in closely related proteins that have experienced a certain amount (x) of evolutionary divergence.
Paralogous (旁系同源)
Homologous sequences within a single species that arose by gene duplication. Genes that are related through gene duplication events. These events may lead to the production of a family of related proteins with similar biological functions within a species. Paralogous gene families within a species are identified by using an individual protein as a query in a database similarity search of the entire proteome of an organism. The process is repeated for the entire proteome and the resulting sets of related proteins are then searched for clusters that are most likely to have a conserved domain structure and should represent a paralogous gene family.
Parametric sequence alignment
An algorithm that finds a range of possible alignments based on varying the parameters of the scoring system for matches, mismatches, and gap penalties. An example is the Bayes block aligner.
PDB (主要蛋白质结构数据库之一)
Brookhaven Protein Data Bank. A database and format of files which describe the 3D structure of a protein or nucleic acid, as determined by X-ray crystallography or nuclear magnetic resonance (NMR) imaging. The molecules described by the files are usually viewed locally by dedicated software, but can sometimes be visualised on the world wide web.
Pearson correlation coefficient(Pearson相关系数)
A measure of the correlation between two variables that reflects the degree to which the two variables are related. For example, the coefficient is used as a measure of similarity of gene expression in a microarray experiment. See also Correlation coefficient.
Percent identity
The percentage of the columns in an alignment of two sequences that includes identical amino acids. Columns in the alignment that include gaps are not scored in the calculation.
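Percent identity, as defined above, can be computed from two aligned strings as follows. This is a small sketch: the function name `percent_identity`, the use of "-" as the gap character, and the example alignment are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of gap-free alignment columns holding identical characters."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns containing a gap in either sequence are not scored.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if gap not in (x, y)]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y)
    return 100.0 * identical / len(columns)

# Six gap-free columns, five of them identical.
print(round(percent_identity("ACD-EFG", "ACDKEYG"), 1))
```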
Percent similarity(相似百分率)
The percentage of the columns in an alignment of two sequences that includes either identical amino acids or amino acids that are frequently found substituted for each other in sequences of related proteins (conservative substitutions). These substitutions may be found in an amino acid substitution matrix such as the Dayhoff PAM and Henikoff BLOSUM matrices. Columns in the alignment that include gaps are not scored in the calculation.
Perceptron(感知器,模拟人类视神经控制系统的图形识别机)
A neural network in which input and output states are directly connected without intervening hidden layers.
PHRED (一种广泛应用的原始序列分析程序,可以对序列的各个碱基进行识别和质量评价)
A widely used computer program that analyses raw sequence to produce a ‘base call’ with an associated ‘quality score’ for each position in the sequence. A PHRED quality score of X corresponds to an error probability of approximately 10^(-X/10). Thus, a PHRED quality score of 30 corresponds to 99.9% accuracy for the base call in the raw read.
PHRAP (一种广泛应用的原始序列组装程序)
A widely used computer program that assembles raw sequence into sequence contigs and assigns to each position in the sequence an associated ‘quality score’, on the basis of the PHRED scores of the raw sequence reads. A PHRAP quality score of X corresponds to an error probability of approximately 10^(-X/10). Thus, a PHRAP quality score of 30 corresponds to 99.9% accuracy for a base in the assembled sequence.
Phylogenetic studies(系统发育研究)
PIR (主要蛋白质序列数据库之一,翻译自GenBank)
A database of translated GenBank nucleotide sequences. PIR is a redundant (see Redundancy) protein sequence database. The database is divided into four categories:
PIR1 – Classified and annotated.
PIR2 – Annotated.
PIR3 – Unverified.
PIR4 – Unencoded or untranslated.
Poisson distribution(帕松分布)
Used to predict the occurrence of infrequent events over a long period of time or when there are a large number of trials. In sequence analysis, it is used to calculate the chance that one pair of a large number of pairs of unrelated sequences may give a high local alignment score.
Position-specific scoring matrix (PSSM)(特定位点记分矩阵,PSI-BLAST等搜索程序使用)
The PSSM gives the log-odds score for finding a particular matching amino acid in a target sequence. Represents the variation found in the columns of an alignment of a set of related sequences. Each subsequent matrix column corresponds to the next column in the alignment and each row corresponds to a particular sequence character (one of four bases in DNA sequences or 20 amino acids in protein sequences). Matrix values are log odds scores obtained by dividing the counts of the residue in the alignment by the expected number of counts based on sequence composition, and converting the ratio to a log score. The matrix is moved along sequences to find similar regions by adding the matching log odds scores and looking for high values. There is no allowance for gaps. Also called a weight matrix or scoring matrix.
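The construction and use of such a matrix can be sketched as follows. The helper names `build_pssm` and `best_window`, the uniform background, the pseudocount of 1, and the four aligned DNA sites are assumptions chosen to keep the example small.

```python
import math

def build_pssm(aligned, alphabet="ACGT", background=None, pseudocount=1.0):
    """Return one {character: log2 odds score} dict per alignment column."""
    background = background or {c: 1 / len(alphabet) for c in alphabet}
    n = len(aligned)
    pssm = []
    for column in zip(*aligned):
        scores = {}
        for c in alphabet:
            # The pseudocount avoids taking the log of a zero frequency.
            freq = (column.count(c) + pseudocount) / (n + pseudocount * len(alphabet))
            scores[c] = math.log2(freq / background[c])
        pssm.append(scores)
    return pssm

def best_window(pssm, sequence):
    """Slide the matrix along the sequence (no gaps); return (start, score) of the best window."""
    width = len(pssm)
    windows = (
        (start, sum(col[ch] for col, ch in zip(pssm, sequence[start:start + width])))
        for start in range(len(sequence) - width + 1)
    )
    return max(windows, key=lambda w: w[1])

pssm = build_pssm(["TATA", "TATA", "TATT", "TACA"])
print(best_window(pssm, "GGCTATAGGC"))
```

The window score is a plain sum because log odds scores are additive across columns.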
Posterior (Bayesian analysis)
A conditional probability based on prior knowledge and newly evaluated relationships among variables using Bayes rule. See also Bayes rule.
Prior (Bayesian analysis)
The expected distribution of a variable based on previous data.
Profile(分布型)
A matrix representation of a conserved region in a multiple sequence alignment that allows for gaps in the alignment. The rows include scores for matching sequential columns of the alignment to a test sequence. The columns include substitution scores for amino acids and gap penalties. See also PSSM.
Profile hidden Markov model(分布型隐马尔可夫模型)
A hidden Markov model of a conserved region in a multiple sequence alignment that includes gaps and may be used to search new sequences for similarity to the aligned sequences.
Proteome(蛋白质组)
The entire collection of proteins that are encoded by the genome of an organism. Initially the proteome is estimated by gene prediction and annotation methods but eventually will be revised as more information on the sequence of the expressed genes is obtained.
Proteomics (蛋白质组学)
Systematic analysis of protein expression of normal and diseased tissues that involves the separation, identification and characterization of all of the proteins in an organism.
Pseudocounts
Small number of counts that is added to the columns of a scoring matrix to increase the variability either to avoid zero counts or to add more variation than was found in the sequences used to produce the matrix.
PSI-BLAST (BLAST系列程序之一)
Position-Specific Iterative BLAST. An iterative search using the BLAST algorithm. A profile is built after the initial search, which is then used in subsequent searches. The process may be repeated, if desired, with new sequences found in each cycle used to refine the profile. For details, see Altschul et al.
PSSM (特定位点记分矩阵)
See position-specific scoring matrix and profile.
Public sequence databases (公共序列数据库,指GenBank、EMBL和DDBJ)
The three coordinated international sequence databases: GenBank, the EMBL data library and DDBJ.
Q20 (Quality score 20)
A quality score of 20 or higher indicates that there is at most a 1 in 100 chance that the base call is incorrect. These are consequently high-quality bases. Specifically, the quality value "q" assigned to a basecall is defined as:
q = -10 x log10(p)
where p is the estimated error probability for that basecall. Note that high quality values correspond to low error probabilities, and conversely.
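The definition of q above, and the equivalent relationship quoted in the PHRED and PHRAP entries, can be checked with a short calculation. The function names are assumptions made for this sketch.

```python
import math

def quality_score(p):
    """Quality value q = -10 * log10(p) for an estimated error probability p."""
    return -10 * math.log10(p)

def error_probability(q):
    """Inverse relationship: p = 10 ** (-q / 10)."""
    return 10 ** (-q / 10)

for q in (10, 20, 30):
    p = error_probability(q)
    print(q, p, f"{100 * (1 - p):.1f}% accuracy")
```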
Quality trimming
This is an algorithm which uses a sliding window of 50 bases and trims from the 5′ end of the read followed by the 3′ end. With each window, the number of low quality (10 or less) bases is determined. If more than 5 bases are below the threshold quality, the window is incremented by one base and the process is repeated. When the low quality test fails, the position where it stopped is recorded. The parameters for window length, low quality threshold, and number of low quality bases tolerated are fixed. The positions of the 5′ and 3′ boundaries of the quality region are noted in the plot of quality values presented in the "Chromatogram Details" report.
Query (待查序列/搜索序列)
The input sequence (or other type of search term) with which all of the entries in a database are to be compared.
Radiation hybrid (RH) map (辐射杂交图谱)
A genome map in which STSs are positioned relative to one another on the basis of the frequency with which they are separated by radiation-induced breaks. The frequency is assayed by analysing a panel of human–hamster hybrid cell lines, each produced by lethally irradiating human cells and fusing them with recipient hamster cells such that each carries a collection of human chromosomal fragments. The unit of distance is centirays (cR), denoting a 1% chance of a break occurring between two loci.
Raw Score (初值,指最初得到的联配值S)
The score of an alignment, S, calculated as the sum of substitution and gap scores. Substitution scores are given by a look-up table (see PAM, BLOSUM). Gap scores are typically calculated as the sum of G, the gap opening penalty, and L, the gap extension penalty. For a gap of length n, the gap cost would be G+Ln. The choice of gap costs, G and L, is empirical, but it is customary to choose a high value for G (10-15) and a low value for L (1-2).
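A raw score under the G+Ln gap scheme can be computed for an existing alignment as sketched below. The function names, the default costs (G = 11, L = 1, picked from the customary ranges above), and the toy substitution function are assumptions; a real search would look substitution scores up in a PAM or BLOSUM table.

```python
def gap_cost(n, gap_open=11, gap_extend=1):
    """Cost of a gap of length n under the G + L*n scheme."""
    return gap_open + gap_extend * n

def raw_score(aligned_a, aligned_b, substitution, gap_open=11, gap_extend=1, gap="-"):
    """Sum substitution scores and subtract the cost of each run of gaps."""
    score, run, side = 0, 0, None
    for x, y in zip(aligned_a, aligned_b):
        here = "a" if x == gap else "b" if y == gap else None
        if here != side and run:
            # A gap run just ended (or switched sequence): charge for it.
            score -= gap_cost(run, gap_open, gap_extend)
            run = 0
        side = here
        if here is None:
            score += substitution(x, y)
        else:
            run += 1
    if run:
        score -= gap_cost(run, gap_open, gap_extend)
    return score

toy = lambda x, y: 5 if x == y else -4  # hypothetical substitution scores
print(raw_score("AC--GT", "ACTTGT", toy))  # four matches minus one gap of length 2
```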
Raw sequence (原始序列/读胶序列)
Individual unassembled sequence reads, produced by sequencing of clones containing DNA inserts.
Receiver operator characteristic
The receiver operator characteristic (ROC) curve describes the probability that a test will correctly declare the condition present against the probability that the test will declare the condition present when actually absent. This is shown through a graph of the test's sensitivity against one minus the test specificity for different possible threshold values.
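The points of such a curve can be generated by sweeping the threshold over the observed scores. The sketch assumes that higher scores mean "condition present" and that both classes occur at least once; the function name `roc_points` and the example data are made up.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per threshold.

    scores: test outputs, higher meaning "condition present".
    labels: True where the condition is actually present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        called = [s >= threshold for s in scores]
        true_pos = sum(c and l for c, l in zip(called, labels))
        false_pos = sum(c and not l for c, l in zip(called, labels))
        points.append((false_pos / negatives, true_pos / positives))
    return points

for point in roc_points([0.9, 0.8, 0.6, 0.4, 0.2], [True, True, False, True, False]):
    print(point)
```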
Redundancy (冗余)
The presence of more than one identical item represents redundancy. In bioinformatics, the term is used with reference to the sequences in a sequence database. If a database is described as being redundant, more than one identical (redundant) sequence may be found. If the database is said to be non-redundant (nr), the database managers have attempted to reduce the redundancy. The term is ambiguous with reference to genetics, and as such, the degree of non-redundancy varies according to the database manager’s interpretation of the term. One can argue whether or not two alleles of a locus define the limit of redundancy, or whether the same locus in different, closely related organisms constitutes redundancy. Non-redundant databases are, in some ways, superior, but are less complete. These factors should be taken into consideration when selecting a database to search.
Regular expressions
This computational tool provides a method for expressing the variations found in a set of related sequences including a range of choices at one position, insertions, repeats, and so on. For example, these expressions are used to characterize variations found in protein domains in the PROSITE catalog.
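As an illustration, a PROSITE-style pattern can be translated into a Python regular expression with a few string substitutions. The converter below is a simplified sketch that handles only the common pattern elements, and the function name `prosite_to_regex` and the test sequence are assumptions; the pattern itself is the PROSITE N-glycosylation site entry (PS00001).

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles only the common elements: x (any residue), [..] (allowed),
    {..} (forbidden), (n) / (n,m) repeats, and the < and > anchors.
    """
    regex = pattern.rstrip(".").replace("-", "")
    regex = regex.replace("{", "[^").replace("}", "]")   # forbidden residues
    regex = regex.replace("(", "{").replace(")", "}")    # repeat counts
    regex = regex.replace("x", ".").replace("<", "^").replace(">", "$")
    return regex

# PS00001: N, then anything but P, then S or T, then anything but P.
n_glyc = re.compile(prosite_to_regex("N-{P}-[ST]-{P}."))
print(n_glyc.pattern)
print([m.start() for m in n_glyc.finditer("MKNGSALNPTQ")])
```

The second N in the test sequence is followed by P, so only the first site matches.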
Regularization
A set of techniques for reducing data overfitting when training a model. See also Overfitting.
Relational database(关系数据库)
Organizes information into tables where each column represents the fields of information that can be stored in a single record. Each row in the table corresponds to a single record. A single database can have many tables and a query language is used to access the data. See also Object-oriented database.
Scaffold (支架,由序列重叠群拼接而成)
The result of connecting contigs by linking information from paired-end reads from plasmids, paired-end reads from BACs, known messenger RNAs or other sources. The contigs in a scaffold are ordered and oriented with respect to one another.
Scoring matrix(记分矩阵)
See Position-specific scoring matrix.
SEG (一种蛋白质程序低复杂性区段过滤程序)
A program for filtering low complexity regions in amino acid sequences. Residues that have been masked are represented as "X" in an alignment. SEG filtering is performed by default in the blastp subroutine of BLAST 2.0. (Wootton and Federhen)
Selectivity (in database similarity searches)(数据库相似性搜索的选择准确性)
The ability of a search method to locate members of a protein family without making a false-positive classification of members of other families.
Sensitivity (in database similarity searches)(数据库相似性搜索的灵敏性)
The ability of a search method to locate as many members of a protein family as possible, including distant members of limited sequence similarity.
Sequence Tagged Site (序列标签位点)
Short cDNA sequences of regions that have been physically mapped. STSs provide unique landmarks, or identifiers, throughout the genome. Useful as a framework for further sequencing.
Significance(显著水平)
A significant result is one that has not simply occurred by chance, and therefore is probably true. Significance levels show how likely a result is due to chance, expressed as a probability. In sequence analysis, the significance of an alignment score may be calculated as the chance that such a score would be found between random or unrelated sequences. See Expect value.
Similarity score (sequence alignment) (相似性值)
Similarity means the extent to which nucleotide or protein sequences are related. The extent of similarity between two sequences can be based on percent sequence identity and/or conservation. In BLAST similarity refers to a positive matrix score. The sum of the number of identical matches and conservative (high scoring) substitutions in a sequence alignment divided by the total number of aligned sequence characters. Gaps are usually ignored.
Simulated annealing
A search algorithm that attempts to solve the problem of finding global extrema. The algorithm was inspired by the physical cooling process of metals and the freezing process in liquids where atoms slow down in movement and line up to form a crystal. The algorithm traverses the energy levels of a function, always accepting energy levels that are smaller than previous ones, but sometimes accepting energy levels that are greater, according to the Boltzmann probability distribution.
Single-linkage cluster analysis
An analysis of a group of related objects, e.g., similar proteins in different genomes to identify both close and more distant relationships, represented on a tree or dendrogram. The method joins the most closely related pairs by the neighbor-joining algorithm by representing these pairs as outer branches on the tree. More distant objects are then progressively added to lower tree branches. The method is also used to predict phylogenetic relationships by distance methods. See also Hierarchical clustering, Neighbor-joining method.
Smith-Waterman algorithm(Smith-Waterman算法)
Uses dynamic programming to find local alignments between sequences. The key feature is that all negative scores calculated in the dynamic programming matrix are changed to zero in order to avoid extending poorly scoring alignments and to assist in identifying local alignments starting and stopping anywhere within the matrix.
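The zero floor described above is the only change needed to turn a global alignment recurrence into a local one. The sketch below returns just the best local score, without traceback, under an assumed match/mismatch scheme with a linear gap penalty; the function name and default scores are illustrative.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b with a linear gap penalty."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 option resets negative scores so an alignment can start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

# Only the shared "ACG" island aligns; the flanking residues do not match.
print(smith_waterman_score("TTTACGTTT", "GGGACGGGG"))
```

The answer is the maximum over the whole matrix, not the bottom-right cell, because a local alignment may end anywhere.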
SNP (单核苷酸多态性)
Single nucleotide polymorphism, or a single nucleotide position in the genome sequence for which two or more alternative alleles are present at appreciable frequency (traditionally, at least 1%) in the human population.
Space or time complexity(时间或空间复杂性)
An algorithm's complexity is the maximum amount of computer memory or time required for the number of algorithmic steps to solve a problem.
Specificity (in database similarity searches)(数据库相似性搜索的特异性)
The ability of a search method to locate members of one protein family, including distantly related members.
SSR (简单序列重复)
Simple sequence repeat, a sequence consisting largely of a tandem repeat of a specific k-mer (such as (CA)15). Many SSRs are polymorphic and have been widely used in genetic mapping.
Stochastic context-free grammar
A formal representation of groups of symbols in different parts of a sequence; i.e., not in the same context. An example is complementary regions in RNA that will form secondary structures. The stochastic feature introduces variability into such regions.
Stringency
Refers to the minimum number of matches required within a window. See also Filtering.
STS (序列标签位点的缩写)
See Sequence Tagged Site
Substitution (替换)
The presence of a non-identical amino acid at a given position in an alignment. If the aligned residues have similar physico-chemical properties the substitution is said to be "conservative".
Substitution Matrix (替换矩阵)
A matrix containing values proportional to the probability that amino acid i mutates into amino acid j for all pairs of amino acids. Such matrices are constructed by assembling a large and diverse sample of verified pairwise alignments of amino acids. If the sample is large enough to be statistically significant, the resulting matrices should reflect the true probabilities of mutations occurring through a period of evolution.
Sum of pairs method
Sums the substitution scores of all possible pairwise combinations of sequence characters in one column of a multiple sequence alignment.
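The column score described here can be sketched as follows. The scoring function `pair_score` (match +1, mismatch -1, residue against gap -2, gap against gap 0) is a toy assumption standing in for a real substitution matrix, and the function names are hypothetical.

```python
from itertools import combinations

def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Toy pairwise score; a real analysis would look up a substitution matrix."""
    if x == "-" and y == "-":
        return 0
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch

def sum_of_pairs_column(column, score=pair_score):
    """Sum the scores of every unordered pair of characters in one column."""
    return sum(score(x, y) for x, y in combinations(column, 2))

def sum_of_pairs_alignment(rows, score=pair_score):
    """Sum the column scores over all columns of an alignment (equal-length rows)."""
    return sum(sum_of_pairs_column(col, score) for col in zip(*rows))
```

A column of n characters contributes n(n-1)/2 pairs, so a fully conserved column of three sequences scores 3 under these toy values.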
SWISS-PROT (主要蛋白质序列数据库之一)
A non-redundant (See Redundancy) protein sequence database. Thoroughly annotated and cross referenced. A subdivision is TrEMBL.
Synteny
The presence of a set of homologous genes in the same order on two genomes.
Threading
In protein structure prediction, the aligning of the sequence of a protein of unknown structure with a known three-dimensional structure to determine whether the amino acid sequence is spatially and chemically compatible with that structure.
TrEMBL (蛋白质数据库之一,翻译自EMBL)
A protein sequence database of Translated EMBL nucleotide sequences.
Uncertainty(不确定性)
From information theory, a logarithmic measure of the average number of choices that must be made for identification purposes. See also Information content.
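Assuming the measure meant here is Shannon's entropy in bits, H = sum over symbols of p * log2(1/p), the uncertainty of one alignment column can be computed as below. The function name `uncertainty` is an assumption for this sketch.

```python
import math
from collections import Counter

def uncertainty(column):
    """Shannon uncertainty (entropy) of the symbols in a column, in bits."""
    counts = Counter(column)
    total = len(column)
    # Each symbol with frequency p contributes p * log2(1/p).
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

A column in which all four bases are equally frequent has 2 bits of uncertainty (two yes/no choices identify the base), while a fully conserved column has 0.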
Unified Modeling Language (UML)
A standard sanctioned by the Object Management Group that provides a formal notation for describing object-oriented design.
UniGene (人类基因数据库之一)
Database of unique human genes, at NCBI. Entries are selected by near identical presence in GenBank and dbEST databases. The clusters of sequences produced are considered to represent a single gene.
Unitary Matrix (一元矩阵)
Also known as Identity Matrix. A scoring system in which only identical characters receive a positive score.
URL(统一资源定位符)
Uniform resource locator.
Viterbi algorithm
Calculates the optimal path of a sequence through a hidden Markov model of sequences using a dynamic programming algorithm.
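A compact sketch of the dynamic programming involved is given below. The two-state model (H for a GC-rich region, L for an AT-rich region) and every probability in it are made-up values for illustration only. The code works in log space and assumes all probabilities are nonzero and the observation sequence is non-empty.

```python
import math

# Hypothetical toy model: H = GC-rich state, L = AT-rich state.
STATES = ("H", "L")
START = {"H": 0.5, "L": 0.5}
TRANS = {"H": {"H": 0.5, "L": 0.5}, "L": {"H": 0.4, "L": 0.6}}
EMIT = {"H": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
        "L": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def viterbi(obs, states=STATES, start_p=START, trans_p=TRANS, emit_p=EMIT):
    """Return (most probable state path, its log-probability)."""
    # V[t][s]: best log-probability of any path that ends in state s at position t
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, V[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1])
            V[t][s] = score + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, V[-1][last]
```

With this toy model, `viterbi("GGCA")` assigns the first three bases to H and the final A to L. Working in log space avoids numerical underflow on long sequences.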
Weight matrix
See Position-specific scoring matrix.

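Submitting to international journals: the submission and review process, and common terminology

I have looked at how authors, reviewers, the editorial office, and the editor-in-chief divide the work in some journals' online submission systems. Here is a walk-through of one typical example for newcomers.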
1. Author
How do you submit online? The general steps are:
Step 1: Log In
The login page gives you three options:
1. Log in with your known User ID and Password
2. Check to see if you have an existing account
3. Create a new account (register if you do not have one)
Step 2: Enter your Author Center
To begin a new submission, check a previous submission, continue a submission begun earlier, or submit a revised manuscript, choose Author Center.
Step 3: Inside Your Author Center
Existing manuscripts are found in one of three areas (these may differ from journal to journal):
Manuscripts to be Revised
Partially Submitted Manuscripts
Submitted Manuscripts
To start a NEW manuscript submission, choose the “Submit First Draft of New Manuscript” link.
Step 4: Entering Data
The following screens ask you to enter each piece of data associated with your manuscript. Most of this data will also be included in the text of your manuscript, but needs to be entered in this format in order to make the system searchable by these fields. It is used for screen display and e-mail notifications only. You cannot enter text into the Manuscript Data Summary table – scroll down each screen to enter the required information.
Press “Save and Continue” at the bottom of each screen in order to save all of your work. If you press the "Back" or "Forward" button on your browser your work will not be saved.
Step 5: Upload Your Manuscript
The File Manager is the area where you upload your files. Click on the Save and Continue button to get to the upload page.
Click on the Browse button in step #1. Locate your file and click on the name of the file to place it in the box. Select a file designation that corresponds with the file name in step #2. Choose whether the file is for review in step #3 (the default is yes); the file in the first position must be made available for review. Click the blue upload button in step #4.
Please refer to the “Author Instructions” for each specific journal to determine the journal's format preferences. Always view your proof carefully prior to submitting. You will not be able to change it once it’s submitted.
Your original file will be stored and will be located under “Original Files / Files not for review” on the right side of the screen. ManuscriptCentral will create files and place them under “Files for review.” You can make changes in step #5 before going to the next page. You can not make changes to your uploaded files. Click on "Save and Continue". A file that is not marked for review will not be seen by the editor.
Step 6: Submit Your Manuscript
Click on “View uploaded files” – always view your proof carefully prior to submitting. You will not be able to change it once it’s submitted. Then close the file, close the View uploaded files window and click on “Submit your manuscript”. After you choose to submit, you will see a confirmation screen. You will also receive an e-mail confirmation that you can save for future reference.
2. Reviewer
Step 1: Log In
Log into the website using your account information. This login information was emailed to you when you agreed to review. If you have lost this login information or didn’t receive it, you can use the "request account information" button located on the login screen. This will resend your login information to you.
Step 2: Enter your Reviewer Center
To begin reviewing your manuscripts click the Reviewer Center button. In your Reviewer Center there are two tables. The first table is labeled "Manuscripts Pending Review". This table is where manuscripts wait until they have been scored and submitted to the journal. The second table is labeled "Submitted Reviews". This table lists a history of all the manuscripts that you have reviewed in the past. Your Reviewer Center should look something like the picture below.
[img]http://scholarone.custhelp.com/cgi-bin/scholarone.cfg/php/enduser/fattach_get.php?p_sid=&p_tbl=10&p_id=183&p_created=1088689264[/img]
Step 3: Download or View the Manuscript Files
Clicking on the title of the manuscript will open a new window. The location of the manuscript title is shown in the picture below. From this window you can either download or view each of the files the author has uploaded for review.
To view the files: Simply click on any of the file names located in the first column of the new window.
To download the files: Right-click on the file name. Then choose "save target as" from the menu that appears. Next, choose a location to save the file on your computer. Finally, click Save at the bottom of the window.
[img]http://scholarone.custhelp.com/cgi-bin/scholarone.cfg/php/enduser/fattach_get.php?p_sid=&p_tbl=10&p_id=183&p_created=1088688992[/img]
Step 4: View Details and Score Manuscript
Once you are ready to begin scoring your manuscript click the "Review" button.
[img]http://scholarone.custhelp.com/cgi-bin/scholarone.cfg/php/enduser/fattach_get.php?p_sid=&p_tbl=10&p_id=183&p_created=1088688991[/img]
After you click the "Review" button you will get a new screen with written instructions provided by the journal. At the top of the screen there are two buttons. The first button is labeled "View Details". Click this button to see detailed information about the authors and their submission. The second button, "Score Manuscript", will allow you to fill out the scoresheet for the manuscript and enter your comments to Authors and the Editor.
[img]http://scholarone.custhelp.com/cgi-bin/scholarone.cfg/php/enduser/fattach_get.php?p_sid=*ezithfh&p_tbl=10&p_id=183&p_created=1088691314[/img]
Step 5: Save Your Review
Once you have completed the scoresheet you can save your work by clicking the "Save Review" button at the bottom of the screen.
[img]http://scholarone.custhelp.com/cgi-bin/scholarone.cfg/php/enduser/fattach_get.php?p_sid=*ezithfh&p_tbl=10&p_id=183&p_created=1088692501[/img]
Step 6: Submit Your Review
Finalize your submission on this last screen. Click the "Leave Open" button if you would like to save your review and come back to it at another time. Click the "Submit to Editor" button to complete your review.
[img]http://scholarone.custhelp.com/cgi-bin/scholarone.cfg/php/enduser/fattach_get.php?p_sid=*ezithfh&p_tbl=10&p_id=183&p_created=1088692500[/img]

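3. Editorial office
The editor is responsible for selecting reviewers. A reviewer must first be added to the database before the editorial office can add them as a reviewer for the manuscript. When an invitation is sent to a prospective reviewer, there are two possible replies: agree or decline. Until reviewers are found, the status is "Awaiting Reviewer Assignment". Once they are found, the manuscript moves to "Under review".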
4. Editor-in-Chief
Step 1: Log In
Note: Before you begin, you should be sure you are using version 7.0 or higher of Netscape or version 6.0 of Internet Explorer and Adobe Acrobat Reader 5.0. If you have an earlier version, you can download a free upgrade using the icons found on the Instructions and Forms link on every page of the Web site.
The login page gives you three options:
1. Log in with your known User ID and Password
2. Check to see if you have an existing account
3. Create a new account
How to Get Help
All pages have a “Get Help Now” button in the upper right corner. This link brings up a new window that has instructions, answers to Frequently Asked Questions, and a method to log a case with the support team.
Step 2: Enter your Editor-in-Chief Center
To check status of manuscripts in your charge, or to assign manuscripts to Editors, choose Editor-in-Chief Center.
Step 3: Inside Your Editor-in-Chief Center
The primary task in your Editor-in-Chief Center is to assign the Editor to new manuscripts. To view these, choose #1, Unassigned Manuscripts.
Step 4: Assigning Editor
The resulting list includes all manuscripts assigned to you that are awaiting Editor assignment. Click on the title to view the manuscript, or choose View Details to assign the Editor.
Page down.
As Editor-in-Chief, you can render an immediate decision or assign an Editor to initiate the peer review process. To assign the Editor, use the box on the right to search by name or area of expertise. To view an Editor’s history for this journal, click on the Editor’s name.
A window showing Editor history will appear. If this is the Editor you want to assign, choose the “Assign Editor” button.
A letter is produced from the template previously entered into the system. The letter can be modified prior to choosing the “Send Letter” button.
This manuscript has now been assigned and will no longer show on your list of those waiting to be assigned.

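"Under review" may refer to the editorial office's initial screening. Next comes choosing reviewers (Awaiting Reviewer Selection), then waiting for the reviewers to confirm (Awaiting Reviewer Assignment), and finally the review itself (Awaiting Reviewer Scores).

Awaiting Reviewer Selection and Awaiting Reviewer Assignment both mean that reviewers are being sought or assigned. This is the most time-consuming part of the first round of review, and the part that takes the most editorial effort.

1. Whether a journal sends a revision back to the reviewers depends heavily on the editorial office's workflow. Some journals require the reviewer who asked for a major revision to review again; at others the editorial office decides directly once the revision comes back. It also depends on whether the reviewer agreed, in the first round, to review again. Some reviewers ask to see the manuscript after revision, and the editorial office will then send it to them. And some editors have enough academic standing to decide on the revised manuscript themselves.
I do not know how it works at the journal mentioned in the post above; it has to be judged case by case.
2. If two reviewers disagree in the first round, a third reviewer is usually brought in. If they disagree at re-review, a new reviewer is generally not sought; the editorial office discusses the case and decides internally.
“Editor assigned” means an editor has been chosen for your manuscript and will handle it from here. It can also mean that the editor has contacted suitable reviewers from the reviewer pool (or from the author's list of suggested reviewers) and that these reviewers have said they have time and are willing to review the manuscript for the journal. Once a reviewer logs in and retrieves the manuscript, the system automatically changes the status to "under review". If “Editor assigned” appears again, the editor is about to take the next step in handling your manuscript.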

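Common status terms when submitting to international journals
1. Submitted to Journal
The status immediately after submission.
2. Manuscript received by Editorial Office
Your manuscript has reached the editorial office, which confirms that the submission succeeded.
3. With editor: if you were not asked to choose an editor at submission, the manuscript first goes to the editor-in-chief, who assigns it to another editor. Two further statuses can appear at this stage:
3.1. Awaiting Editor Assignment: a handling editor is being assigned.
Editor assigned means your manuscript has been given to an editor to handle.
3.2. Editor Declined Invitation: the editor declined to handle the manuscript.
4. Two statuses can then follow:
4.1. Decision Letter Being Prepared: the editor has made a decision without sending the manuscript to reviewers. In general experience, for a student author this probably means bad news: either (1) the English is too poor and the editor asks for revision, or (2) the content is too weak and the manuscript will be rejected. The exception is a leading figure in the field, whose manuscript may be accepted directly.
4.2. Reviewer invited: reviewers have been found and review begins.
5. Under review
This is likely to be a long wait. The earlier steps can also be slow, depending on how the editor handles things.
If an invited reviewer does not want to review, they decline and the editor invites someone else.
6. Required Reviews Completed
The reviewers' comments have been uploaded; review is finished and the manuscript awaits the editor's decision.
7. Evaluating Recommendation
The editor is evaluating the reviewers' comments; you will then receive the editor's decision.
8. Minor revision/Major revision: you can celebrate a little at this point, because a request for revision means acceptance is possible. I won't go into how to revise, except that modesty and care are essential.
9. Revision Submitted to Journal
Another cycle begins.
10. Accepted: congratulations.
11. Transfer copyright form: sign the copyright transfer agreement.
12. Uncorrected proof: the proof is waiting for you to check.
13. In Press, Corrected Proof: the article is in press and the proof has been checked by the author.
14. Manuscript Sent to Production: typesetting.
15. In production
Being published.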
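Some other common English terms:
camera-ready paper
A final manuscript that is ready for printing.
graphical abstract
A figure that highlights what is distinctive about your article, with a sentence or two of explanation.
running head or running title
The text shown in the page header of the published article (typically the running head appears on even pages, and the abbreviated English names of the first few authors on odd pages). It is usually a short phrase (a few words, not too long) summarizing the main content of your paper.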

“垃圾”DNA是宝[转]

一类非编码(Non-coding)的DNA分子(不表达蛋白质),曾经被认为是“垃圾”的DNA目前正成为遗传研究的新热点,“垃圾”DNA的真实含义早已变更,它仅仅是一个历史遗留的名称,实际上,“垃圾”DNA具有无与伦比的表达调控功能,2所顶级学府的2个独立团队在Science上发表关于垃圾DNA的最新研究进展,“垃圾”DNA实际上是促成我们人类的关键调控因子。
    有评论认为,“垃圾”DNA实际上是shape who we are的DNA序列,所以说现在科学家们更愿意称“垃圾”DNA为“非编码DNA”。所谓的垃圾DNA占据人类基因组98%的容量,这也是原来被误以为是垃圾DNA的原因。
    来自耶鲁大学和Wellcome Trust Genome Campus的两位科学家带领的研究团队发现,从统计学意义上来说,人与人之间的基因有99%的相似性,但是,每个个体的非编码DNA却有着显著的差异。这也就是为何我们的基因组大体上相似,却每个人都不一样,也可以说,非编码DNA促使每个人都变得独一无二。
     耶鲁大学
     Michael Snyder教授(目前已跳槽去斯坦福大学)带领的耶鲁大学的研究团队,在Science上发表文章Variation in Transcription Factor Binding Among Humans;同天Ewan Birney教授带领的Wellcome Trust Genome Campus的研究团队,在Science上发表文章Heritable Individual-Specific and Allele-Specific Chromatin Signatures in Humans。
    Michael Snyder教授的研究团队发现,转录因子蛋白粘附在非编码DNA的伸长链上,影响着临近的基因生产蛋白质,在不同人群基因组的不同位置发挥作用。人体中存在成百上千种不同的转录因子,目前Snyder教授的研究团队以2种转录因子为代表进行前驱性的研究,这2种转录因子分别是NF-kappa-B和Pol-Ⅱ。
     Snyder研究团队分析了2种转录因子在不同个体包括人类与猩猩基因组上的结合模式,结果发现,NF-kappa-B的结合模式有75%的相似性,余下的25%存在极大的差异,Pol-Ⅱ的结合模式有92.5%的相似性,余下的7.5%存在极大差异。
     Snyder研究小组对这些差异进行了功能上的研究,分析发现,转录因子在非编码DNA上的差异性的结合模式影响周围基因的表达,不仅与个体的独特性有关,与个体的疾病发生率也有很大关联,比如说,阿尔茨海默病、糖尿病、类风湿性关节炎和其他疾病。
     Snyder估计,这些差异不仅与疾病发生有关,与基因突变率也有关。
     Wellcome Trust Genome Campus
     Ewan Birney带领的研究团队则对转录因子在不同个体中作用模式的差异性产生的原因进行了分析。
     Ewan Birney是欧洲生物信息研究所的遗传学家,他们这次选择了2个家庭来做对比分析。
     研究发现,遗传继承的非编码DNA序列(周围的基因没有发生变异)将促使转录因子的结合位点保持与亲代一致。
    一旦非编码DNA发生变异,即便有时候是单个核苷酸发生改变都可能导致基因组的结构发生变化,导致某些DNA记忆单位缺失或是重复。这些缺失或是重复的变化最终导致疾病的发生,如阿尔茨海默病和肥胖症。
     加州大学的一场学家Kelly Frazer表示,这些新的研究解释了某些疾病发生的机制 ,比如说,我们常常去寻找一些疾病风险基因,但结果往往是,一些提高疾病风险的变异不存在于基因上,而是离基因很远。比如说,心肌梗塞患者60%以上的变异不在某个基因上,而是在非编码DNA上。
     这些新的发现给疾病的研究带来新的视野。
     附:以上的发现说明本人在论坛发表的
破解DNA的信息编码一文中的观点真核细胞的染色体的DNA序列以及由其转录的RNA序列中的非编码的序列包含了对真核细胞的生长和整个生命活动的控制、输入、循环、分支、判断、赋值等语句,同时DNA序列以及由其转录的RNA序列中的非编码的序列包含了真核细胞的生长和整个生命活动的容错编码是符合事实的.

 


找到科研创新点

周边的同学和朋友,经常为找不到创新点而烦恼,昨天与我的原来导师的师妹网上也聊起,向我求助。于是,我觉得这个问题估计对一些虫子都有帮助。我也是跨了几个学科,本科是工科,硕士是在国家重点实验室做实验的,估计与大多数虫子相同,博士学管理的,相当于跨了几个学科,现在做高校老师。有点心得,在此,献给各位虫子,希望能对一些虫子有帮助。
创新难,难创新,首先就是要找到创新的点,才能想实现创新的途径和方法。我觉得可以从如何几个方面:
1。科研扫盲,这是创新的第一步也是必要的一步。
首先是把导师,师兄,师姐的文章和论文,科学基金的申请成功报告,没中的申请报告,结题报告,横向课题的报告,咨询报告等全部浏览一遍,知道自己在什么领域,这个领域你的导师和前几届做什么,这个对于硕士来说,我觉得很有必要。这相当于给你科研扫盲,对于那些博士跨学科的来说,也是很有必要的。
2。寻找问题和分解问题,创新的源头。
如连问题都找不到或不知道如何分解问题,科研的基本功需要加强和科研思考的方式需要转换。 多参加知名专家或者基金委或者部委的讲座。这个可以听到很多现实问题的描述,不一定是怎么解决,可能是抛出了问题。问题导向,往往就是我们研究的出发点。还有有的虫子可以走捷径,就是关注当年国家基金(自然、社科、863,973等)申请指南和已经中标的基金项目,这些都有网站,上面都有每年中标项目和项目列表统计,多去看看。如果2007年,有个基金项目你正好赶兴趣,这时你正好处于选题时候,就可以选他,等那个基金结题了,你的博士论文也差不多了。尽管处于两个地方,但是肯定结果不一样。还有就是多观察和对经常见的问题问个为什么?不要相信任何权威,敢于对一切质疑。导师不一定是对的。许多重大创新都是建立对权威的挑战,这样的例子数不胜数。在我硕士是做实验的,我举个简单例子,农村的小孩大概都知道田边的稻子长得好,谷子饱满,我想大多数都知道阳光充足呀,肥料好呀,根系可以深入田埂吸收营养。但是估计有的农村出身的同学可能还注意到一个现象,就是长在田埂护边上的稻谷更好,这是为什么呢,平时护边上的稻子并没有被水淹没,所以这就一个问题。仅研究这个问题或现象,我原来的老板的团队就做了863,973项目,其实就是一个适度亏缺的问题。这个问题如果延伸到医学,你看那些得胃病的人,往往是饱一顿饿一顿,或者经常吃的很饱(据说经常吃的很饱容易变傻),其实如果我们让得胃病的人吃饭的时候“适度亏缺”不就容易了吗?接下来的问题就是:那么为什么适度亏缺就可以了?我们可以发明什么药物让这个人吃了这个药胃还没吃饱情况下就产生饱意或者适度亏缺呢?所以,问题就是要平时多观察一些细致的问题或者已经发生的问题,我们往往对我们习以为常的问题,不问为什么,建议大家看看每年搞笑诺贝尔奖的情况。比如现在学管理,管理的问题就多了,举个简单身边例子,读博士的时候,发觉大家相互交流很少,有的人不愿意把自己的想法说出来,所以,许多老板开什么周会月会,但是往往是气氛不热烈,老板说得多,那么为什么这样呢?你如果深究下去,会有很多值得研究的问题。问题不是没有,而是你没有观察,或者没有对经常的问题,问个为什么? 分解问题,我想学管理的大概知道WBS(工作分解结构),这个对于其他学科虫子来说,真的很有帮助,可以去google,百度搜索下。各个学科不一样,看了这个工具,大家结合自己学科捉摸吧。没有共同的经验,就是工具一样。最后提醒一下,在工作分解结构之前或者看问题之前,一定要高处着眼,低处着手。高处,就是你头脑里要想着你这个问题所有有关联的各个方面,而低处就是从叶子着手解决。
3。看文献——获取创新灵感或者解决问题方法的路径依赖。
看文献,不是看书。这个很多虫子也贡献了很多经验。但是我周围的人也知道小木虫,但是很多人看了那么多经验,可是看完了还是很困惑?原因何在呢?我观察了很周边的同学和同事,我发觉一个重要的就是动手太少,看纸质期刊太少。这个我想小木虫很多发SCI的,一般看国外期刊,但是现在很多图书馆的国外期刊也有纸质版本,看纸质版本,你可以浏览到你的这个领域顶级期刊相关的研究,一些人为什么没有找不到创新,有可能就是根本不知道自己研究的领域到底有那些方向。,随便浏览纸质版本,或许一个并不相关的问题,你无意中看到了,给了你启发,电脑搜索的电子文献往往我们是按主题或者关键词搜索的,请问,你能保证你提前设置的关键词是最新的吗?创新要看不同主题的文章,很多来源于交叉和其他学科。当然有的学科即使要创新也要需要实验设备支撑,这个也是不断磨合的过程。要想找到自己创新点,我觉得看文献很重要。如何看呢?首先,准备好一个不大不小的笔记本(可以命名为科研灵感本),最好有个厚重封皮,准备一支笔。摔开电脑,周边的同学许多很依赖电脑,存了很多文献,至于看了没有,估计大多数占空间。还有电脑一看,网络一开,你得思维无法完全集中于文献,一会儿QQ,一会儿小木虫论坛等等,打扰太多。灵感=心静+环境。去图书馆期刊阅览室,带着前面1,2想到的问题和听到的问题(也要记在你那个专门的科研灵感本上),静下心来,加起来的时间至少2个月,边看期刊的时候,如果闪现什么灵感,马上记下来,切记,一定要记下来,好记忆不如烂笔头,注明出处,你的灵感是解决什么问题的,这个文献给你的启示到底是什么,如果你当时沿着这个灵感还有其他想法,就沿着这个思路下去,直到你不知道写什么,那么就停止,看第二篇。看期刊,最好是从目录看起,稍微沾边的都要浏览一下。对于做实验的科研来说,一般期刊比较少,也比较专业,所以很快能看完。但是对于社科,比如,管理,经济,法学等学科来说,往往会涉及到很多期刊,所以时间很长,但是一定要静下心来,这个时间可以在一年级上完课就去读。对于社科的来说,往往创新不容易,我这里特别提醒一下,由于我原来学工科的,现在学管理,我觉得社科类的研究生一定要去浏览下工科类的杂志或者理科类的杂志或者交叉学科的杂志,一般会有大收获。我的几篇小论文都是启发来源于工科。另外,社科研究的问题往往是一个系统的问题。所以,只要是系统问题,工科类的控制类杂志(像控制与决策,电气自动化,机械工程等),系统类杂志(系统工程理论与实践,运筹学等),计算机类的杂志(计算机学报,计算机技术与应用,微型计算机系统等)都要看。即使就是工科的也可以去看,比如有点人研究水资源配置,显然就是一个系统问题。这些杂志很有帮助。
4.利用网络——创新帮助的好助手。
网络当然有很多专业论坛,数据库等,我都将其归类为电子文献。 如何看电子文献。首先得按主题分类,很多虫子都贡献了,这里不说了。按照上3,这样你从纸质期刊得到很多灵感,那么现在你把你的这些灵感关键词或者主题,从电子文献中去索取,也要按照3的办法,看的时候,马上记下来,或者建立一个WORD文档或者专门的软件,把感兴趣的截取下来,并在旁边注明给你得启发是什么,它有什么用处?这个很重要,有的人看了文献,就丢在一边,看得多,丢得多。另外,在看电子文献的时候,一定要关掉QQ和论坛等东西,不要让这些断了你灵感的来路
5.积累——创新的技巧和关键手段。
按3,4步骤,这样你就积累了很多灵感了。厚厚一个本子或者一个长篇word文件,这样重温一遍。请记住,没有积累,是没有创新的!!!!这个积累不是说把文献从数据库下载下来,放在计算就里面,而是你看了文献,你的随时闪光点或灵感的用笔记下来的看得见的积累。这样,你把你这些闪光点,找了相关文献,觉得可以写一篇小论文,就马上动手写,不要拖,不要找借口,要知道写作的激情会失去的,找不回来的。把小论文写好了,放在一段时间,再看,可以的话,修改后就投,如果觉得可以,投高一些杂志,觉得一般,投一般核心,觉得实在不咋样,就投哪些不是核心的。这里,我周围一些同学和同事,有个观念就是要发就发好的,我觉得这个不好,即使是一篇非核心的,看到自己的东西变成铅字了,心里还是会高兴的,这会给你极大的精神动力。如果,你的稿子就是要发SCI,EI,SSCI等,发了一年半载都难中,会打消你得积极性和使你苦闷,而一旦苦闷了,灵感就跑了。灵感是非常偏向哪些思想活跃的人,那些有精神的人。还有要注意,大的创新点是要靠小的创新点集成的。没有小的,那有大的。胜利的目标总是在不断的加油中接近的,小论文就算你的油,写写,你就顺了,这点对于社科类的虫子,估计很重要。
6.走向大自然——获取一颗创新的生态心。
现在其实我们很多解决的问题来源大自然,大自然是生命的来源,也是创新的生命起源。不管你是理论研究还是社会研究,保持一个生态心很重要,过于功利,浮躁,布满灰尘的心都是创新的杀手。走向大自然,不要逛什么街和超市。这点,估计有的人会说,这与找创新点有何关系?登高而望远,试问,你在那么喧闹的超市,那么多帅哥美女从你前面经过,你的神经会得到休息吗?你的思维会有闪光吗?所以,如果在实验室或者宿舍呆烦了,不知道怎么做。不如带上自己科研灵感本和笔,去郊外或者爬山,让自己的心胸开阔起来,说不定心中的苦闷气出去了,灵感就进来了。
总结:注意看交叉学科和其他学科的杂志,多记,心静,多动手,贴近大自然。
我的心得:没有不开窍的脑袋,只是方法不对。不是没有创新,而是积累不够。

研究生必须知道的生存法则

研究生必须知道的生存法则(一)

1 、教育出来的学生应达到两个基本目标:思想越走越远,行动越来越受约束。 ――“ 人,任何时候都要遵守游戏规则 ” 。
2 、做事情,关键在于自己能否用心。一旦用心,解决问题的思路就会多起来。做事用不用心,明眼人一眼就看得出来。自主性和主动性要靠自己去培养,更要靠自己的责任感来提升。喜欢耍小聪明的学生,恰恰是别人眼中最大的傻子。正所谓最笨的事情都是最聪明的学生做出来的。
3 、一个学生在一个学术团队中存活这么久,离开之际一定要在团队内部打上自己鲜明的烙印。
4 、学生每天的工作量是变化的,但是每天的节奏感应当是一样的。任务多的时候,只能尽量加快节奏;任务少的时候,千万不能减缓节奏,这样才能增加生活的充实度。 ―― 浮躁是维系节奏感的大敌。
5 、我不要求学生每天只睡一个小时,但是我可以要求学生每天少睡一个小时。 ―― 有感于有美国的大学生尝试在打工期间,坚持每天只睡一个小时。
6 、要多和比你强的人多交流,这些人能够帮助你少走弯路,而且他们身上的人脉资源也是完全不一样的。也不要看领导蠢,成天不做正经事,但他年轻时比你能干多了。
7 、新的问题出现时,就是学习的好机会降临了。人要在解决问题的相互折磨中才会共同进步。
8 、当一个人进入你的工作领地时,就算是一个陌生人,也要加以必要的招待。虽然他不一定会帮你,但很有可能他是来挑你毛病的人。如果你不去关注他,他会在你浑然不觉中致你于死地。 ――― 有感于办公室来人后,很多学生不理不睬的冷漠。
9 、学术上的成果,都应该有一个标准。这个标准从一开始就要设定在很高的层级,同时尽量考虑学生易犯的错误来制定。不如此,提交的成果不仅达不到高要求,还会让学生产生只是在不断完成任务的抱怨。而抱着想完成任务的心情,最终只能成为 “ 工匠 ” ,绝对成不了 “ 艺术大师 ” 。

研究生必须知道的生存法则(二)

1 、真正从事学术研究的人,注定是孤独的。孤独不是一种状态,是一种心态。心态的问题需要用心来克服,不是外在行为模式改变可以医治的。
2 、很多研究生和导师交往时,只有紧张的心态,而没有紧张感。紧张的心态往往导致学生做出错误的判断并一错再错,而紧张感则强调遇事三思并依正道而行。虽然一字之别,体现出的教育效果确实相去甚远。
3 、有的研究生以遇到能改变自己一生的导师为荣,有的研究生却只希望自主培养,导师只是其获取学术证书的跳板。这两者没有高下之分,只有匹配与否的区别。如果说研究生遇见好导师是人生的幸事,那么导师遇到好学生则可以称为更幸运的天赐。研究生如果足够超群,他把导师培养成知名学者就是一件不困难的事情。而这是目前科研群体较为常见的态势。
4 、不要把自己置于危险的境地。和比你更 “社会化” 的人比赛小聪明的多少,这就是最危险的事情。显而易见的是,导师比研究生拥有更高的 “社会化” 程度。
5 、一个研究生,要学会利用身边的资源把不知道变成知道,否则读什么研究生?做科学研究不能等、靠、要,要自己学会寻求解决问题的思路。很多事情导师也不熟悉,学生去请教导师,他也要依靠资料检索才有答案。学习,永远没有穷尽,对导师和研究生是一视同仁的。研究生尽量不要把应当属于自己挑的担子丢给导师,这样只会降低导师心目中的印象分。
6 、现代人的人际关系,其基础在于双方是否都拥有可供交换的等价资源。
7 、对于错误,学生应当具备辩证思维,就算做错了事情也没有关系。只有不断犯错误才能进步。总不犯错误就说明犯大错误的时机快来了;要不就是根本没有进步,连犯错误的机会都没有了。老师指出学生所犯的错误,不是要统计学生究竟犯了多少错误并据此来加以惩罚,而是要帮助学生不犯重复性的错误,并督促学生在修正错误方面加快进度和纠正态度。而现实的情况是:学生往往承认错误的速度很快,改正错误的速度很慢;并且,改正错误的速度远远赶不上犯新错误的速度。
8 、能力、机遇、意愿的乘积才是一个人完成某项工作任务的绩效得分。导师对于学生的奖励,不是奖励他的工作态度,而是奖励他的工作成果。
9 、研究生遇到一个棘手的难题,向导师请教是一条重要途径。不过,学生要学会听话听音。如果导师已经透露该问题有解决的途径,那就不要图省事,装聋作哑,甚至说什么“和导师相比自叹不如”的空话。有这个心思,就去好好思考如何通过“引智”来解决问题。在寻求别人帮助时,应当把眼界放宽,去寻找真正能够提供帮助的人,而不是仅仅局限在自己日常的活动区域中。很多学术问题,专业人士的思维方式是独具特色的。例如,要推导一个复杂的数学公式,最简单的方法就是请教数学院的研究生,而不是随便找两个人来问问就草草了事。
10 、弹簧定律:学生能力的成长是一个弹簧被拉伸的过程。弹簧被拉长,一旦松手,弹簧立即回到平衡位置;也就是说,学生能力有所增长,可一旦缺乏关注,学生所有的进步全部消失。若将弹簧拉长到极限,虽然不会弹回到原点,但是失去了弹性,这也就不再叫弹簧了;也就是说,倘若长时间对学生实施高压式的关注,这对学生是极端痛苦的过程,其心理可能会扭曲。

研究生必须知道的生存法则(三)

1 、学生完成科研任务的质量和时间成正比,但正比的系数是和能力成反比的。不是多花时间,就一定能得到满意结果,学术研究领域尤其如此。慢工固然可能出细活,但是能力不足,就不是快慢问题了,而是质量好坏问题。
2 、如果研究生对什么事情该做、什么事情不该做都难以辨别,那么,他做事的机会慢慢就减少了。须知:机会不是永远都有的,也不是一承认错误就能继续拥有的。研究生在导师那里长期兑现不了诺言,就算导师不批评,至少他不再信任你了。信任一旦失去,哪里还有做事的机会呢?永远记住这句话:信任失去的速度比建立的速度要快得多。
3 、一个工作任务没有处理好,在导师过问之前,学生主动向导师说明情况,学生的理由就是理由;如果学生一直保持沉默,直到老师过问才说明为难之处,此时学生的理由就变成了借口。
4 、“做任何事情都是要慢慢磨的”。这里的“磨”是指做事过程中的精心、细心、决心和用心,而不是时间上的拖拉。研究生必须具备“韧性”,正所谓水滴石穿。没有持之以恒的决心,激情往往就像太阳雨,来去迅速,这对一心期望学生进步的导师是最大的伤害。
5 、一个研究生,毕业五到十年以上,只要自己足够努力,在物质方面的需求应当得到基本满足,至少不用为五斗米折腰。因此,在研究生学习阶段,过多考虑未来究竟能否赚到足够多的钱,就好像每天担心十年后有没有足够多的空气供人呼吸一样,纯粹是浪费自己的青春。有这个时间,就多去思考该思考的问题。
6 、对老师的敬畏:畏其才能,敬其品格。如果对老师只剩下畏,却毫无敬,那就说明:学生在完成工作任务方面有所不足,而老师在完善人格方面尚需磨练。
7 、社会太浮躁,人心易躁动,这都怪不得任何个体。现行教育体制,已把高校退变为供人镀金的熔炉。任何从高校这个冶炼炉出产的制品,在烫金学位证书的光照下摇身一变成为一名金人。至于会不会一鸣惊人,那就看个人的造化。
8 、研究生不具备基本的科研素质,最好不要读研究生。导师缺乏指导研究生的能力和品格,还是少找几个学生为好。这两者都基于一个共同的观念:肚子里没货,到产房里去干什么?
9 、学生要有感恩之心。感恩,不是指学生飞黄腾达后给予曾经帮助过他的老师多少物质享受,而是要学生树立极强的社会责任感,在自身能力不断增长、占有的社会资源逐渐增多的情况下,在力所能及范围内去帮助那些需要帮助的弱势群体。不管学生以后是在什么样的工作岗位上,都应树立这一信念。
10 、低调是具备高调做事能力的人才能说的官话。首先要能做事,要能把事情尽量做好,才能选择以什么样的面目示人。有高调的做事能力,却以低调的方式展示,这样的官话一般人是学不来。

研究生必须知道的生存法则(四)

1 、成年人的世界不是看你说什么,而是看你做什么。知道一个道理,并不是成天把它挂在嘴上到处宣扬,而是要在行动中把它做出来产生实效,这才是真正的 “ 懂 ” 道理。研究生已是法律承认的成年人,如果长期把自己当未成年人看待,动不动就以“我是学生,是弱势群体”来回应所有的批评和指责,那么神仙也当不了你的导师。
2 、作为研究生,要有极强的上进心。学生刚进来时,大多是有远大志向的。缺少上进心,做事时只求过得去就行,不设定高标准来要求自己,这样只会浪费导师指导学生所付出的辛劳。
3 、专业只能让你有饭吃,而不能让你过好日子。人要吃饭就不得不有专业,可是有了专业以后,生活的乐趣就大大减少了。一辈子从事一个行当,一辈子只懂这一行,有什么乐趣?
4 、“无私” ,这是成大事者必备的技能。真正的豪杰,在“无私”上,当有更高的境界。王阳明就认为:“莫要轻看了豪杰。能做一番大事业的人,总有一段真挚的精神在内。”所谓“真挚的精神”,应该就是更高层面的“无私”吧。这在吕思勉所著“吕著三国史话”一书中“替魏武帝辨污”一节也有类似观点:“哪有盖世英雄,他的志愿只为自己、为子孙的道理?”研究生要在学术团队中生存,就不能太自利。你是团队的一员,必须为团队服务 。只有学会放弃自我,才能找到真正的自我。
5 、一个人穿惯了休闲装以后,穿正装会觉得不舒服。人一旦习惯了放松的状态后,再想绷紧会痛苦不堪。不要让身体太舒适。一旦懒散,就很难严谨起来。在精神方面,人越舒适越自由越好,但肉体方面要多加磨炼,要多吃一点苦,才能走更长远的路。只要你站得住,就不要坐下;只要你坐得住,就不要躺下。
6 、我一直认为,有人赏识你、提拔你,比你自己努力奋斗更重要。这个观点也许有人不同意,他们宁可选择自己奋斗。我宁愿有人提拔我、赏识我,实在没有办法时,我才会自己去奋斗。话又讲回来,一个人自己不努力、不奋斗,没有人会赏识你,也没有人会提拔你,这称为机遇与能力互为因果。
7 、成长是自己的事情,真正的成长,代表你需要对说过的每一句话负责任。责任,不是只关注自己,做好自己的工作;真正的责任是关爱身边的人。所以,话不能说满,但事情一定要做满。
8 、做事是要遵守制度的,否则定那么多制度干什么?制度是用来规范那些不值得尊重的人。只有自尊不足的人,才需要制度这一低层次的约束机制,高自尊的人更习惯于用内省来约束。如果你觉得制度制定得不合理,那就通过制度改革来解决制度问题,千万不要随便改变游戏规则,因为规则的制定者不是你,而是你的上级,你不能抢班夺权。
9 、导师不能轻易降低研究生的培养标准。对研究生的不忍心,就是导师的自私。对学生不忍心,等同于一步步把学生引诱到安逸的陷阱,进去容易出来难。一旦这种降低成为一种惯性,还会反噬自身,导师的自我要求也就相应放松,这是教学相长的反向解读。
10 、不要觉得自己的导师很蠢,成天不做正经事,但他年轻时也许比你能干多了。实在忍不住想鄙视一下自己的导师,千万不要让他知道,否则日子就不好混了。

科研语录和感谢

读博的一些体会
我导师说过一句让我感触很深的话:论文就是你的学术脸面,五年后大家查得到,十年后大家还查得到,甚至一百年后大家依然查得到。看到论文就知道你是什么样的人了,你现在干造假、发垃圾文、一稿多投之类事情,一百年后人家依然会耻笑你,骂你。发一篇假文章,你一辈子都直不起脸来。
风雨一年行
老板常常说:你要相信,在你做的这一块里,你比谁都牛,包括你的老板。
“不识庐山真面目,只缘身在此山中。”多给自己一个安静的时间和空间,去思考自己最近的得失,去条理最近的思路,避免盲目的做一个当局者。
人品是最高的学历。
要做科研,请先做人。要做人,请先做一个有人品,有人格的人。一切只因为,博士,硕士都只是一个称号,只有人品,才是人生的最高学历。
有时候我们会眼高手低,其实越是简单的事情里面,包含的往往是最深奥的道理,越简单的实验步骤,往往是我们整个实验失败的关键所在,是所谓细节决定成败。
先做人后科研
一个在我们试验室内工作的人员最基本的品质是诚实,他不需要告诉我们成功的经验.我们需要的是他告诉我们失败的教训。 这是我们实验室出高质量的文章的保证!
人人都希望凤凰涅磐,浴火重生,可是平平淡淡的真诚,一点一滴的积累,才是成功的希望。漫漫人海,相识就是缘分,真诚一点,互相合作不是很好吗?
做科研的一些感受
没有大树可以依靠,稿费自付,没有试验场所(只能在公共实验室进行)。在科研中,遇到这样那样的困难时,总以”穷且弥坚,不坠青云之志“自勉。人的成长过程都要经历困难和磨难,现在的牛人很多都是从困难的境况中走出的,因此我相信,只要努力奋斗,提高自己,机遇迟早会眷顾你,明天一定会更好!
yj
<士兵突击>经典台词也很适用于科研1.不抛弃,不放弃。
2.什么是有意义?有意义就是好好活。什么是好好活?好好活就是做很多 很多有意义的事。
3.人不能过得太舒服,太舒服就会出问题。
4.连长说过,日子就是问题叠着问题,要挺胸抬头去面对。
5.信念这玩意不是说出来的,是做出来的。光荣在于平淡,艰巨在于漫长。
6.我不玩牌,玩牌没意义。
7.别再让你爸叫你龟儿子了!
8.明明是个强人,天生一副熊样!
9.人不是做出来的,是活出来的。
几次失误带来的利益–科研中的柳暗花明
大家是否把整个心都投入到科研工作中去,是否吃饭睡觉都在思考你遇到的各种问题。不要怕别人说你痴说你傻。这点才是科研人的可爱之处。
人要学会为自己的行为买单
我们可以把握现在,我们可以珍视自己的每一次选择,可以慎重的决定每一次的问题,我们可以坚持自己的原则,虽然这可能意味着很痛苦,很无趣,甚至很凄清,但是,若干年后,当看到周围那些曾经潇洒的朋友满面愁容的懊恼,当看到曾经风光无限的人后悔,当看到昔日骄傲的人低头时,这份慎重,我想就会明白他的分量。
我们已经过去了可以逃避解释,逃避责任的年纪,我们已经过去了可以蒙混过关的年纪,我们需要为自己的一言一行负起责任,我们需要为自己的荒唐付出代价,我们需要为自己的每一个决定承担起后果——
这或者,也能让我们更快的成熟吧……
科研需要灵感和反常规
搞科研,一定要敢于挑战常规,敢于做大家认为不可能的事情。灵感是在努力过程中产生的,思想的火花一定不要放过,运气不是空穴来风。
    若果说日常科研是工作量,那么灵感就是突破和创新。
如何面对不良导师
心态最重要。任何事都有两方面,凡事望好处看,坏事也有好的一面。无论生活、学习还是工作只要对自己有帮助,感到收获,这就足够了。
我们活着,不就是时刻在接受考验吗?活着是苦的,酸的,所以活着才有滋味,才能体会到偶尔甜的兴奋!
如何做学生
做学生的要积极主动联系导师,不要等导师找你,凡是到导师找你的时候,我个人认为你这个学生就不合格!导师忙,你也忙,你比你导师还忙,这样的结果我看你的问题就来了,因为导师不知道你在忙什么,这个是最关键的,导师如何让你毕业?多联系导师,多沟通总是好的,人总归是人,个人感情因素是不可能排除的。
逆境中奋起——我的硕士科研
导师很重要,但最重要的还要靠自己。
没有谁做实验会一帆风顺,甚至你会遇到同样的实验自己做不出来而别人能做出来的情况。不要气馁,注意改进自己的操作。
做实验不能偷懒,该有的处理过程一步也不能少。否则你将付出超过原来几倍的力气去重复这个过程。
逝去的青春岁月
记得有人说过:
世界上最成功的人
往往不是最有才华的人,
而是最耐得住寂寞的人,
越是接近梦想,
道路便越艰难,
于是
成功的终点成为一种坚持,
一个梦想的实现,
支持精神的往往是
一种单纯的期待和坚持而已
谈谈我对创新的看法
在读研究生的时候,有位老先生告诉我无论什么领域,都会有相应基本理论。基本理论与实践之间的偏差和矛盾,就是悖论。研究悖论就是一个领域最好的研究。
努力弄清楚基本理论与实际的偏差是怎么一回事,问题就一定会问出来。这种创新才是你一辈子都不会愁多的。你一辈子也顶多解决一个两个。
博士选题体会!
在就读的过程中一定要想好未来,规划好未来,做那些东西,不要被老板左右。争取做些有实用价值的东西。也同时完成老板的任务。一句话,根据自己的未来的打算选择做什么,怎么做
在就读的一年级,要及时和老板沟通,不管他耐不耐烦,不管他有没有时间,不要去考虑他,及时了解自己对课题的理解和进度,及时调整更换课题。
是抓好英语水平的学习,及时了解外界的动向,不要等到找工作了,才发现原来自己已经世外桃源好几年了
你要考虑将来,肯定要花时间学习公共关系学,管理学,强化外语的学习等为未来做准备。
虫虫们,你还会选择做科研吗?还会选择读博吗?还会天真的迎难而上,听从老板的吩咐吗?一定要和老板讨价还价,否则你的未来他可不会管。
如何来规划自己的硕士研究生生活
我不再按本科时候的思维只要有课就上,我在研一只上我认为对我有用的课,读文献并在电脑上学习一些绘图软件和数据处理知识,现在看来效果很好,读研后使我彻底明白了一个道理:解放思想。
我们不能够一直去追求虚的东西,已经毕业的研究生,没有人看你的所谓的学习成绩优秀之类的证书,真正的评价指标还是你的paper
不要为那些不实在的东西去费太多心思,要把握高端,知道在哪个阶段我们最应该得到的是什么就够了。
分享我的经历
不要过于在乎导师对你不好,我们这代人将来应该再不会让我们的学生在这样埋怨我们就是了!作科研究得靠自己。SCI并不是什么神话,相当一部分SCI在几年之后就被后人发现是垃圾文章!坚持走自己的路,百倍的努力就会有结果。
毅力很重要!对你感兴趣的方向,就安心做下去,3年不行就5年,坚持就能出成果。我从小就喜欢天文学,自学了广义相对论和量子场论。对于一个工科背景的学生,能自学这两本书是很需要毅力的。若你也是学物理的,不信你可以试试看看能否拿下这两本书。
最后说一点技巧,想要很快出文章,先瞄准一个杂志,然后搜索他的某位审稿人最近的所有工作,你跟着这个方向作,文章容易被录用。
但是,并不全是这样,这就需要你加以筛选。注意,并不是去修改数据。你要对这些结果加以解释,或者加以猜测,这很重要,即便是你的猜测后来被证明是错误的,这都没关系。
七年之痒-我的科研经历
人们常说婚姻生活到了第七年就会没有激情和幻想,红灯暗起危机四伏,两个人在一起真的容易走到山穷水尽。虽然科研和学术上有很多新奇的东西不停地吸引人,我知道,只要在香港科大,还是面对同样的课题和环境,这种复杂的难以抑制的感觉很容易占据心灵,就像溺水一样难以呼吸和挣扎。我和学术的七年之痒如期而遇,真是束手无策。
学术之路根本不是象牙铺成的康庄大道,相反这是一条需要汗水泪水甚至是血水的羊肠小路。
写论文就像是拿着缺页的地图旅行,有很多歧途和风景,常常为到达了目的地之后发现自己的聪明而得意。也像是个小老鼠在迷宫里靠着嗅觉去寻找奶酪
在沙漠里钻井是找不到水的,遇到没有活力的课题就像在棉花堆里无处发力
论文撰写、修改、印刷和答辩都是表面形式,在这之前的思索和探索才是最美妙最幸福的,可惜形式从来都是压制本质的,就像我们往往看重衣着打扮而轻视营养健康一样。
等到想法完全实现那天,也许是几年后了,我能想象到那份平静而幸福的心情,应该有将重物搬到山顶的挑山工的解脱和艰辛、有刚生产后的孕妇的微笑和完成长篇小说的作家的兴奋吧。
人生的意义是为自己喜欢的事情奋斗同时有所收获,如果这种奋斗能给更多的人以推动和安慰,他的生活意义就更大。
如果我们选择了最能为人类福利而劳动的职业,那么,重担就不能把我们压倒,因为这是为大家而献身;那时我们所感到的就不是可怜的、有限的、自私的乐趣,我们的幸福将属于千百万人,我们的事业将默默地、但是永恒发挥作用地存在下去,面对我们的骨灰,高尚的人们将洒下热泪
我想一个从事研究或者艺术的人都会认为自己的职业是最有价值的,即使收入不高生活朴素,即使处于边缘受到冷嘲热讽,即使双脚扎满荆棘备受煎熬。但衣带渐宽终不悔,我相信那份真理和基于心灵的回报一定会在灯火阑珊闪耀。微笑与泪水不仅仅在离开这个世界的时候出现,它们会陪伴一生。生活要继续,学习和事业也要不断更新,这份七年之痒必将褪去,即使它仍然会偶尔发作,但留在心底的仍然是那一份永恒的真诚、执著、平静和幸福。
一个老博士的经验顺口溜
精益求精是目标,难得糊涂最重要
干不出来别上火,坚持就会出结果
不知道的就去问,千万不要闷头搞
定期要与老板聊,找不到人也得找
SCI重新意,写不出来也得写
EI刊源最好办,给钱基本差不多
发表文章老板抢,先斩后奏没商量
心情放松身体好,心理平衡最重要
唱歌健身都要搞,花前月下不能少
写给痛苦煎熬的4年级的博士们
你更要有信心把最艰苦,黎明前最黑暗的这一段熬过去,有空找几个哥们姐们喝几杯吧,大家骂骂倒霉的老板,交流一下哪个杂志发文章最快,发泄一下,放松一下. 明天继续干活,继续努力.一切都会好起来的
发表文章的成败经验
有思路就赶快写,科研也非常强调时效性
写文章就是体力活,只有勤奋,天天写,日日写,才能出成果。
一个研究生导师的肺腑之言
现在你阅读一篇自己研究方向的英文文献,在字典帮助下,能否在3小时(研究生)或6小时(本科生)内完全读懂和理解?
现在你阅读一篇本领域的文献时,能否自然地联想起3篇以上相关的文献? 如果您还做不到,那就还没有跨入研究的门槛,还需加油。

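Differences between ultrapure water, RO water, distilled water, double-distilled water, and deionized water

Ultrapure water: water from which nearly all conductive species have been removed, and in which non-dissociating colloids, gases, and organic matter have also been reduced to very low levels. Its resistivity is greater than 18 MΩ·cm, approaching the theoretical limit of about 18.2 MΩ·cm at 25 °C.

RO water: also called pure water. Water that has been filtered through a reverse osmosis membrane. The pore size of an RO membrane is generally between 10 Å and 100 Å, so it can remove more than 95% of ionic impurities.

Distilled water: water obtained by exploiting the differences in volatility among the components of a liquid mixture: the H2O is vaporized, and the vapor is then condensed and collected.

ddH2O: double-distilled H2O, water obtained by distilling twice.

Deionized water: water from which both the anions and the cations have been removed, mainly by means of an RO membrane and mixed-bed resin. Many people now also call RO water deionized water, which is inaccurate.

Ultrapure water is currently the purest grade, followed by two-pass RO water, double-distilled water (ddH2O), pure water (RO water), and distilled water.

Ultrapure water can be used for any laboratory purpose. It is especially suited to high-sensitivity ICP/MS, ppt-level analysis, and isotope analysis, to standard laboratories (disease control centers, drug testing institutes, quality inspection institutes, environmental monitoring stations, university research labs), and to high-end precision instruments of all kinds. Other pure water and double-distilled water can also be used, depending on the situation, when the requirements are not very strict.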

一个高人的实验注意事项——超级有用

实验中的每个环节都很重要
忽略任何一个都可能导致实验的失败
每个人在实验中都有自己的一些好的细小的习惯

01.例如tip头吸完后马上从移液器上取下来,以免忙乱的时候又以同一tip汲取另一种试剂

02.用完量桶或者烧杯后,及时清洗,并放入烘箱。
03.不管大小实验,做完后都要认真做记录。
04.一个枪头如果可以连续吸两次的话,也不能吸,主要是第二次吸时不准,同时也不能完全打出来。
05.做完实验擦干桌子,一切东西归位,有些东西可以回收的就回收,很多实验室的枪头是回收的
06.实验前认真设计,作实验时不胡思乱想,叽叽喳喳,做完后在海阔天空,认真分析试验数据
07.做之前认真设计,做时就不要老是想着结果,平心静气,注意细节,认真记录,因为失败是常事,不要回过头不知道该从哪里找原因。
08.每次的实验结束,要及时做好记录,不要等。好记性不如烂笔头,这虽然是俗语,但是不要放弃笔的重要性!
09.加入试剂之前,把它混匀一下,以免放置时间长了浓度不均
10.做完实验,清理实验台,以便下次能够更快、更好,更方便的继续工作!药品归位,仪器归位!
11.做实验一定要有计划,最好在有部分实验数据出来时,就着手写文章,这个不是为了出文章,而是在写文章的时候,可以清晰自己的思路,知道后面该先做什么,该重点做什么.不然有时候辛苦半死,数据太分散,不能说明问题.
12.不轻易用别人的东西,用时先打招呼,记录要详细,配液体时要记录试剂的批号等?
13.所有的东西都要即时标记好,日期,药品名称,浓度,等
14.移液枪用完之后要归到最大计量的位置,防止久而久之弹簧失去弹性。
15.不用酶标枪时不要一直拿在手里,不可以倒过来(特别是里面还有液体的时候)
16.不要一直说话、聊天~~
17.笔记要全~要多全有多全~
18.还有一点很关键,笔记要放好,最近我的笔记掉了~哭死
19.最后,做完实验要洗手~
20.离开实验室前或进入实验室前注意水,电是否安全
21.试验之前看资料的时候,最好规类存放,看过后记下重点内容,并将出处标明,以免日后找不到
22.一定要记着关水浴箱,切记切记。

实验记录要记录这些内容:
1、先写实验步骤再做实验,实验的间隙注意写下实验现象的描述,实验后注意实验数据的整理和分析。
2、试剂最好自己配,公用的试剂第一次用时要做质检,并注明时间和谁配的,实验室中每个人都应该有个单字母缩写的名字。
3、认真做好实验前的准备工作,包括试剂、仪器的准备,实验方法的确定,请教一下做过类似实验的人。实验不单单是做出来的,还要有你的思想在里面,否则和一个操作工有什么区别。
4、EP管上一定要做好标记,写上日期,同时记录本上应该有详细的情况。重要的东西可以再加贴一张标签纸,在冰箱中反复冻融会使字迹模糊,可以再用透明胶带纸贴在字外面。

多和大家讨论,同时多关注别人讨论的经验,这几乎是最快提高的捷径了!希望有帮助
     实验前多想想,不要着急动手,在脑子里把全部的实验过程过一遍,尽量做到一次成功。要学习美国人的精神,尽一切努力做好这一次,不要想着这一次作不出来再做一遍,这样永远都作不好实验。

     注意他人的安全,注意自己的安全,注意实验室的安全!
     作实验的时候不要想其他的事情,想其他事情的时候不要作实验!
    大胆尝试,不要怕做不出来。尝试越多,获得原创性发现的可能越大。

作好预实验,先动脑后动手,作完以后要思考。
实验过程中不要与人海聊,以免污染。
用过的仪器等恢复原样,清洁实验台面。

大家说的都是切身体会,相信对newcomer和其他人都有启发。
记得一个NOBEL获得者说过:在实验室工作时什么都不要想,要专心于手头的工作;作完实验,要海阔天空地联想, 特别是实验的新思路和方法。
我认为要有风险意思,有些实验你要费很长时间的,或要花很多钱的以及结果对你接下来的实验很重要的,最好做一份详尽的试验流程图贴在试验台前,需要特别加以注意的地方要特别标注,可以随时参照。

你一定要仔细准备,争取一步到位,这样不用又费力又费钱!实验前最好做一个flow chat!
做实验前,设计好时间安排,可以在有限的时间内做更多事.利用起离心和电泳的空闲时间
补上记录和作一些整理工作

最好不让干燥箱过夜
取用有盖子的器皿时,不能抓盖子,以免脱落!
湿手不可握光滑玻璃器皿,以免滑脱!

我觉得做实验要清楚自己所做实验的原理,这样通过仔细记录每步实验的操作,才能找出实验失败的原因。 不用别人的试剂。
所有的试剂都自己配,出了问题才好找原因。
不要太迷信kit,有时候土办法更有用。

首先要靠自己,但也要多交流
当天事要当天毕
认真记录
团结周围的实验同事,因为成也他们,败也他们,这是顺利完成的关键,毕竟良好的人际关系和环境是舒服的

1, 事前充分设计
2 事时认真观察,及时记录,无论好坏
3 事后及时处理数据。
4 遵守实验公约,东西及时归原位。
5 及时处理垃圾整理物件等等

正式实验前的预备实验很重要。这样可以把一些你原来根本就没想到的漏洞补上。还有就是动手前尽量完备自己的protocols。
作实验时认真思考,对每一个步骤都提前想想。实验完毕后,收拾好自己的桌面,在用枪时务必注意轻轻吸液,以免吸到枪里面,很容易就会将枪腐蚀了,而且容易污染以后的实验。
一次最好只跟改一个实验条件,不然就搞不清楚到底是哪个条件改变影响了实验结果我觉得做实验最重要的是耐心和细致.

还有一点就是要做到不耻下问!

实验前充分思考可能出现的问题和结果,试验过程中则不能被自己预计的结论左右。试验结束后尽快整理试验数据,该统计分析的就分析,该作图的作图。同时,查阅文献,对自己的试验结果能够准确解释

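Profiles of several plant virus diseases

Plant virus disease profiles
NO.1 Luffa (sponge gourd) virus disease. Overview: English name: Vegetable sponge mosaic
Pathogens: mainly Cucumber mosaic virus (CMV); also Muskmelon mosaic virus (MMV), Tobacco ringspot virus (TRSV), and Watermelon mosaic virus (WMV).
Hosts: luffa and others.
Symptoms: Infected young leaves show mottling of alternating light and dark green, or small chlorotic ring spots. Older leaves show yellow ring spots or a yellow-green mosaic; leaves become twisted, shrunken, or malformed, and necrotic spots develop at late stages. Infected fruit becomes spirally deformed, or small and twisted, with chlorotic lesions.
Causal agent: infection by several viruses.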

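NO.2 Cucumber virus diseases  A. Cucumber mosaic. English name: Cucumber mosaic
  Pathogens: Cucumber mosaic virus (CMV), Muskmelon mosaic virus (MMV), Tobacco mosaic virus (TMV), Cucumber green mottle mosaic virus (CGMMV), and others.
  Hosts: cucumber and other cucurbits (Cucurbitaceae).
  Damage: an important disease of cucumber. The proportion of diseased plants can exceed 30%, with a marked effect on yield and quality.
  Distribution: occurs throughout China. Disease is more severe in summer and autumn.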
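Symptoms:
  Early in infection the veins clear; within a few days a mosaic develops, with yellow-green or dark green blisters. The leaf surface is often wrinkled and uneven, and leaves show various malformations and become markedly narrower. Some diseased leaves are rough and leathery and lose their hairs. In some, the leaf base elongates and the lateral lobes become narrow and thin, giving a taut appearance, with a long slender "rat-tail" tip. In some, the leaf margins curl upward. Veins sometimes show dark brown necrosis, or lightning-shaped necrosis along the veins. Tobacco plants infected early are severely stunted, reaching less than half the height of healthy plants, with poorly developed roots.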
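Causal agents:
  (1) Cucumber mosaic virus: spherical particles, 28–30 nm in diameter. Dilution end point of sap 1,000–10,000-fold; thermal inactivation at 60–70 °C for 10 minutes; longevity in vitro 3–4 days; not tolerant of drying. Causes systemic mosaic on the indicator plants common tobacco, Nicotiana glutinosa, and datura, and also systemic mosaic on cucumber.
  (2) Muskmelon mosaic virus: thermal inactivation at 60–62 °C; dilution end point 2,500–3,000-fold; longevity in vitro 3–11 days. An important source of cucurbit virus disease in northwestern China.
  (3) Tobacco mosaic virus: rod-shaped particles. Thermal inactivation at 90–93 °C for 10 minutes; longevity in vitro 72–96 hours; survives for more than 30 years in dried diseased tissue.
  (4) Cucumber green mottle mosaic virus: rod-shaped particles, 300 nm × 18 nm. In ultrathin sections, the virus particles in cells are arranged into crystalline inclusion bodies. Thermal inactivation at 80–90 °C for 10 minutes; dilution end point 1,000,000-fold; longevity in vitro more than 1 year. Transmitted by sap (mechanical rubbing) and through soil; longevity in vitro several months to 1 year.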

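B. Cucumber green mottle mosaic. English name: Cucumber green mottle mosaic
Pathogen: Cucumber green mottle mosaic virus (CGMMV). On cucumber the virus has two variants, causing green mottle mosaic and yellow mottle mosaic.
Symptoms: the disease takes two forms, green mottle mosaic and yellow mottle mosaic.
(1) Green mottle mosaic type: on infected seedlings, the 2–3 leaves at the shoot tip show bright green or dark green mottling. The leaves are fairly flat, and the dark green mottled areas are raised. New leaves are dark green; later the veins clear and the leaves become smaller, so that plants are stunted and the leaves mottled and twisted. Infection is systemic. Infected fruit shows dark green blotches, sometimes with wart-like outgrowths, producing malformed fruit of reduced market value; severe cases cut yield by about 25%.
(2) Yellow mottle mosaic type: symptoms are similar to the green mottle mosaic type, but the leaves develop pale yellow star-shaped blisters and old leaves turn nearly white.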

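NO.3 Summer squash (zucchini) virus disease. English name: Summer squash virus disease. Also known as: summer squash mosaic.
  Pathogens: Cucumber mosaic virus, Muskmelon mosaic virus (MMV), and Tobacco mosaic virus.
  Hosts: summer squash, pumpkin, winter squash, watermelon, melon, wax gourd, and others.
  Damage: an important disease of summer squash. Losses are around 20% and can exceed 50% in epidemic years. Disease is milder under protected (greenhouse or tunnel) cultivation.
  Distribution: severe in open-field cultivation.
Symptoms:
  (1) Mosaic type: leaves show patches of alternating yellow and green, the leaf surface is uneven, new leaves are malformed, and internodes at the vine tips are shortened.
  (2) Fern-leaf type: plants are stunted; leaves are wrinkled and twisted; new leaves are long and narrow, shaped like chicken claws or threads.
  (3) Mixed type: both mosaic and fern-leaf symptoms occur together. Affected plants and their flowers develop poorly and fruit set is difficult; any fruit that does set is small, with wart-like bumps on the surface, and is not fit to eat.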

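NO.4 Cowpea virus disease. English name: Cowpea mosaic. Also known as: cowpea mosaic disease.
  Pathogens: mainly three: Cucumber mosaic virus (CMV), Cowpea aphid-borne mosaic virus (CABMV), and Broad bean wilt virus (BBWV).
  Damage: incidence is about 10% at the seedling stage. Disease is severe in early autumn, when incidence can reach 70%–80%, with a large effect on yield and quality. It is a major and damaging disease of cowpea.
  Distribution: occurs worldwide.
Symptoms: young leaves show mosaic, vein clearing, chlorosis, or malformation; the dark green areas on new leaves are slightly raised and wart-like. Some diseased plants develop brown sunken streaks, with necrosis of the mesophyll or veins. Diseased plants grow poorly and are stunted, with deformed flowers and few pods; the seeds show yellow-green blotches. In some plants the growing point dies, or necrosis starts from the tender shoots.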

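NO.5 Tomato virus disease  A Tomato streak virus disease  English name  Tomato spot virus
  Pathogen  Potato virus X (PVX); Tobacco mosaic virus (TMV).
  Hosts  Solanaceous vegetables.
  Damage  An important viral disease of tomato. Incidence is 5% to 20%, with marked yield loss in affected plants; in severe cases incidence can reach 100%.
  Distribution  Found everywhere, with localized outbreaks.
Symptoms: Early on, dark brown streak-like necrosis appears on stems, petioles, fruits, and other parts.
  (1) Leaves: Veins on the underside show purple-brown, oil-soaked streaks that later spread along the petiole to the stem and expand into necrotic streaks; in severe cases the plant dies. Upper leaves may or may not show a mosaic of alternating dark and light green; lower leaves show no obvious symptoms. Plants infected early have short internodes and small leaves.
  (2) Stems: On the upper and middle stem, short dark green sunken stripes appear first, then turn into dark, sunken, oil-soaked necrotic streaks. These gradually spread and merge, joining up and down the stem, and the plant slowly yellows, wilts, and dies.
  (3) Fruits: Fruits on diseased plants are deformed and hard. Lesions vary in size, most being small, light to dark brown, with an uneven surface; cut fruit shows brown streaks, and in severe cases the fruit browns and rots. The discoloration is confined to the epidermis and does not extend into the stem or the fruit interior, a feature that distinguishes this disease from tomato internal browning (筋腐病).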

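B Tomato spotted wilt virus disease  English name  Tomato spotted wilt virus
  Pathogen  Tomato spotted wilt virus (TSWV).
  Hosts  More than 900 plant species, including tomato, pepper, tobacco, Nicotiana glutinosa, zinnia, pea, potato, and lettuce, among crops in the Solanaceae, Asteraceae, and Fabaceae, and many ornamental varieties.
  Damage  An important disease of tomato that has been spreading in recent years and seriously affects yield and quality. It has become the most important viral disease in greenhouse production.
  Distribution  Mainly in the tropics; in recent years also in temperate regions.
Symptoms:
  (1) Seedlings: Newly emerged leaves roll upward and turn bronze, with many black dot-like lesions; veins on the underside turn purple. In some plants the growing point dies and brown necrotic streaks form on the stem; plants are stunted or grow on one side only, wilt in severe cases, and fail to flower and fruit normally.
  (2) Green fruit: Brown necrotic spots with a raised center develop, and fruits drop easily. On ripe fruit the lesions form concentric rings and later turn brown and necrotic. Symptoms at the blossom end resemble blossom-end rot, but in this disease the fruit epidermis turns brown and necrotic, which distinguishes it from blossom-end rot.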

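C Tomato fern-leaf virus disease  English name  Tomato brake leaf virus
  Pathogen  Cucumber mosaic virus (CMV).
  Hosts  124 plant species in 45 families. It attacks not only Solanaceae and Chenopodiaceae but also cucurbits and grasses (Poaceae) such as wheat and maize.
  Damage  An important and widespread disease of tomato. Severe in summer and autumn, when the proportion of diseased plants can exceed 50%; at worst the crop is a total loss.
  Distribution  Occurs throughout China.
Symptoms: Young leaves at the shoot tip are slender, the mesophyll degenerates, and leaves are extremely narrow with a twisted midrib; the blade rolls into a tube or spiral resembling a fern frond. Middle and lower leaves roll upward; plants are clearly stunted, with few, small fruits.

D Tomato mosaic virus disease  English name  Tomato mosaic
  Pathogen  Tobacco mosaic virus (TMV).
  Hosts  Tomato, eggplant, pepper, tobacco, spinach (Chenopodiaceae), and others; it cannot infect cucurbits or grasses.
  Damage  An important disease of tomato and the most common of the tomato virus diseases.
  Distribution  Occurs throughout China.
Symptoms: There are 2 forms.
      (1) Mosaic: Leaves show mottling in uneven shades of green but are not reduced in size or deformed, and the plant is not stunted. Little effect on yield.
      (2) Yellow-green leaves: The mosaic is pronounced and the leaf surface uneven; new leaves are small, slender, deformed, and twisted; veins turn purple; plants are stunted; flower bud differentiation declines, with heavy flower and bud drop; fruits are small, of poor quality, and blotchy. Yield is strongly affected, with diseased plants yielding 10% to 30% less than healthy ones.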

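E Tomato curly top virus disease  English name  Tomato curly top virus
  Pathogen  Tomato curly top virus (Beet curly top virus).
  Hosts  Plants in the Solanaceae, Cucurbitaceae, Brassicaceae, Fabaceae, and Chenopodiaceae.
Symptoms: Early infection often kills the whole plant.
  In plants infected later, the plant yellows or the growing point stands erect; plants are stunted; leaves become thick and brittle and curl upward; petioles bend downward; leaves turn dull yellow with purple veins; and fruits often ripen prematurely (false ripening). When infection occurs after fruit set, diseased and healthy fruits may appear on the same stem; immature fruits become dark brown and shriveled.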

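NO.6 Pepper virus disease  English name  Pepper mosaic virus
  Pathogen  More than 10 viruses cause pepper virus disease, 7 of which have been found in China: cucumber mosaic virus (CMV), 55%; tobacco mosaic virus (TMV), 26%; potato virus Y (PVY), 13%; tobacco etch virus (TEV), 11.8%; potato virus X (PVX), 10.4%; alfalfa mosaic virus (AMV), 2%; and broad bean wilt virus (BBWV), 1.4%. CMV can be divided into 4 strains: the severe mosaic, necrotic, mild mosaic, and banding strains.
  Damage  The main disease of pepper, widespread and highly damaging; when severe it causes marked yield loss or even total crop failure. Open-field crops typically lose 20% to 30% of yield, and incidence exceeds 50% in epidemic years. Milder under protected cultivation.
  Distribution  Occurs throughout China.
Symptoms: Mainly affects leaves and branches. Four symptom types are common: mosaic, yellowing, necrosis, and deformation.
  (1) Mosaic type: Young leaves at the top are puckered, with uneven mosaic blotches. Early on, the veins of young leaves clear.
  (2) Yellowing type: Leaves over the whole plant lose their green color and turn golden yellow; leaves drop early and plants senesce prematurely.
  (3) Necrotic type: Flowers, buds, and young leaves blacken, die, and drop; the shoot tip dies back and brown necrotic streaks appear on the stem; large dark brown ring patterns appear on leaves and fruits, and the fruit tip turns yellow.
  (4) Deformation: Leaves are slender, with veins pointing upward in a fern-leaf form; plants are stunted, with shortened internodes and stiff, undeveloped fruits, and leaves are dark green. Plants are small and bushy, with many branches and few fruits.
  In field production the mosaic type is the most common, and in most cases several types occur together.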
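
The shares quoted for the seven pepper virus sources found in China can be checked with simple arithmetic. This is a minimal sketch: the dictionary `shares` is a hypothetical name, and the numbers are copied from the pathogen list for this entry. The source does not say how the percentages were measured, so reading a total above 100% as evidence of mixed infection is an assumption, not something the text states.

```python
# Shares of each virus source as quoted in the text (percent).
shares = {
    "CMV": 55.0,
    "TMV": 26.0,
    "PVY": 13.0,
    "TEV": 11.8,
    "PVX": 10.4,
    "AMV": 2.0,
    "BBWV": 1.4,
}

total = sum(shares.values())
# Portion of the total that cannot be explained by single infections alone,
# assuming each percentage is a fraction of the same set of samples.
excess = total - 100.0

# Virus names ordered from most to least frequently detected.
ranked = sorted(shares, key=shares.get, reverse=True)

print(f"total = {total:.1f}%")
print(f"excess over 100% = {excess:.1f}%")
print("ranked:", ", ".join(ranked))
```

The quoted shares add up to more than 100%. One plausible reading, under the assumption above, is that some samples carried more than one virus at once.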

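A Pepper top blight virus disease  English name  Pepper top blight virus
  Pathogen  Broad bean wilt virus (BBWV).
  Hosts  25 plant species in the Fabaceae, Asteraceae, Solanaceae, Amaranthaceae, and other families.
Symptoms: In the field, diseased plants are dwarfed throughout, with yellowed, fern-like leaves; they set no fruit, or the fruits stay small and stop growing. Artificially inoculated leaves develop necrotic spots, which progress to systemic mottling or necrosis.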

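B Pepper yellow streak virus disease  English name  Pepper spotted wilt virus disease
  Pathogen  Tomato spotted wilt virus (TSWV).
  Hosts  Tomato, pepper, tobacco, Nicotiana glutinosa, zinnia, lettuce, and others.
Symptoms:
  Leaves near the growing point become deformed, chlorotic, or mosaic, later forming indistinct ring patterns. Brown streaks form on the stem around the growing point; in longitudinal section the vascular bundles are brown, and the epidermis at the stem base also browns, resembling Phytophthora blight. Diseased fruits show slight mottling, deformation, or irregular brown necrotic spots. Death of the growing point is characteristic of this disease.
  Symptoms of the yellow streak disease caused by TSWV are variable. Some cultivars develop systemic necrosis on the young apical stems and drop their leaves, and the new leaves that grow afterward show systemic mottling or severe distortion. Plants infected early are severely stunted.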

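C Pepper alfalfa mosaic virus disease  English name  Pepper alfalfa mosaic virus disease
  Pathogen  Alfalfa mosaic virus (AMV). A virus source newly found in China that attacks sweet and hot peppers; in Beijing it often occurs in mixed infection with TMV and CMV. It deserves attention.
Symptoms: After infection, the sweet pepper cultivar Qiemen (茄门) and moisture-tolerant hot peppers first develop chlorotic spots and ring spots on the leaves, 1 to 2 mm across and dark brown, after which the leaf tissue dies. Upper leaves develop systemic chlorotic ring spots or yellow mottling, mostly spreading along the veins; the youngest leaves are small, slightly twisted, and deformed. Under high humidity, necrotic stripes form on the veins or stems; fruits also show chlorotic mottling, and plants are slightly yellowed and dwarfed.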

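NO.7 Eggplant virus disease  English name  Eggplant mosaic virus
  Pathogen  Mainly tobacco mosaic virus (TMV), cucumber mosaic virus (CMV), broad bean wilt virus (BBWV), potato virus X (PVX), and others. TMV and CMV mainly cause mosaic symptoms, BBWV causes ring-dot necrosis, and PVX causes large ring dots.
Symptoms: Diseased leaves show mosaic or puckering, with purple-brown necrotic spots; some show ring-dot necrosis. In severe cases plants are small and bushy.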

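A Eggplant spotted wilt virus disease  Pathogen  Tomato spotted wilt virus (TSWV).
  Hosts  Systemically infects dahlia, zinnia, tomato, pepper, eggplant, Nicotiana glutinosa, lettuce, and others.

Symptoms: Infection is systemic and the whole plant is affected. Plants infected at the seedling stage grow slowly; later the leaves show uneven yellow-green mosaic or mottling, and irregular dark green markings develop on older leaves. Plants are dwarfed and set few or no fruits.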

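NO.8 Potato virus disease  English name  Potato virus
  Pathogen  Potato virus X (PVX), Potato virus V (PVV), Potato leafroll virus (PLRV), a viroid, and a mycoplasma-like organism.
  Hosts  Many plants.
  Damage  Fairly severe. In virus-infected potatoes the virus accumulates through the tubers, so tuber quality deteriorates and yield falls year by year. Yield losses are 20% to 50%, and more than 80% in severe cases.
  Distribution  Widespread.
Symptoms:
  The disease includes common mosaic, streak mosaic, leafroll, spindle tuber, and witches' broom.
  (1) Common mosaic: Symptoms include mild mosaic, necrotic leaf spots, stunting, and dieback from the bottom of the plant upward; tubers are smaller.
  (2) Streak mosaic: Leaves first show mottled mosaic or necrotic spots, later developing vein necrosis; brown streaks sometimes appear on the main stem, and leaves die completely but stay attached. Some cultivars show no necrosis, but plants are small, stems and leaves are brittle, and the mosaic leaves are crowded in clusters.
  (3) Leafroll: Leaf margins on diseased shoots roll upward; leaves are yellow-green and, in severe cases, tubular, but not puckered. Leaves are thick, brittle, and leathery, sometimes purple underneath. Severely diseased plants are small, and some die early.
  (4) Spindle tuber: Plants are normal or stunted, with few branches; the angle between leaf and stem narrows, top leaves stand erect, and leaf margins are wavy or roll upward. Leaves are small, dark green, stiff, and brittle, sometimes purple underneath. Tubers are elongated and pointed at one end, spindle-shaped, with a smooth surface that sometimes cracks. Eyes are more numerous and shallow, sometimes protruding.
  (5) Witches' broom: Plants are stunted, with pale green leaves; dozens of side shoots cluster in the leaf axils of the main stem, and these shoots are long, thin, weak, and cylindrical. Diseased plants do not flower but form many small tubers, which have a short dormancy and sprout readily.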

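NO.9 Watermelon virus disease  English name  Watermelon virus disease  Alias  Commonly called "dragon head" (龙头).
  Pathogen  Watermelon mosaic virus (WMV), Cucumber mosaic virus (CMV), and tobacco mosaic virus (TMV).
  Hosts  Watermelon mosaic virus has a narrow host range, infecting only plants in the Cucurbitaceae and Fabaceae. Cucumber mosaic virus has a host range of 117 species in 39 families. Cucumber green mottle mosaic virus infects cucumber, watermelon, bottle gourd, and melon.
  Damage  An important disease. Lightly affected fields lose 10% to 20% of yield and heavily affected fields more than 50%. It seriously reduces watermelon yield and quality and can cause total crop failure.
  Distribution  Occurs in many countries and regions.
Symptoms:
  Watermelon virus disease occurs as 2 types, mosaic and fern-leaf. In the mosaic type, leaves at the top of the plant show a mosaic of alternating dark and light areas; in the fern-leaf type, diseased leaves become narrow, elongated, puckered, and deformed. Mildly diseased plants bear small fruits. When the disease is severe, plants bear few or no fruits, shrivel, and have shortened stems; new vines are thin and twisted, flowers develop poorly, and fruit set is difficult. Fruits that do set develop poorly and are often deformed; in severe cases the fruit surface is uneven, the fruit is small, and the flesh is dark brown and of poor quality.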

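A Watermelon green mottle mosaic disease  English name  Watermelon green mottle mosaic
  Pathogen  Cucumber green mottle mosaic virus (CGMMV). On cucumber the virus has 2 variants, which cause green mottle mosaic and yellow mottle mosaic.
  Hosts  Cucumber, watermelon, bottle gourd, and melon.
  Damage  First occurred in China in 2005. The watermelon mosaic disease that broke out in the Liaozhong area seriously affected the quality of local watermelons and caused heavy economic losses.
Symptoms: Mainly affects leaves, fruit stalks, and fruits.
  (1) Seedling stage: Symptoms are obvious at this stage; irregular brown and green mottled lesions appear on new leaves.
  (2) Adult plants: Symptoms on adult plants are not obvious. Fruit stalks and fruits first develop brown spots; cut fruit shows reddish-brown, oil-soaked lesions, and the affected tissue rots and smells foul, making the fruit inedible.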

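NO.10 Hyacinth bean mosaic virus disease
  Pathogen  Soybean mosaic virus (SMV) and Cucumber mosaic virus (CMV).
  Damage  Virus disease is common and widespread in hyacinth bean, occurring mainly in open fields in summer and autumn. Typically about 5% of plants are diseased; in severe cases incidence exceeds 10%, with a marked effect on yield.
  Distribution  Found everywhere.
Symptoms:
  The disease mostly appears before or after flowering as systemic mosaic and mottling. Leaf growth is largely normal, with faint mottling of alternating pale yellow and green; leaves may be smaller or show vein clearing; in some plants the youngest leaves fail to unfold or internodes shorten, with twisting and deformation; some show systemic ring spots. Diseased plants are small.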

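NO.11 Peanut yellow mosaic virus disease  Alias  Peanut mosaic disease
  Pathogen  Cucumber mosaic virus (CMV). Strains differ markedly in pathogenicity on host plants; the strain prevalent on peanut is the CA strain.
  Hosts  Cucumber mosaic virus has a wide host range and can infect many crops and vegetables in the Cucurbitaceae, Solanaceae, Brassicaceae, and Fabaceae. The CA strain prevalent on peanut, however, has so far been found to infect only peanut and common bean naturally. Under artificial inoculation, the CA strain can infect 32 plant species in 6 families.
  Damage  A frequently epidemic disease. In epidemic years incidence can exceed 90%. It markedly reduces peanut quality and yield; early infection cuts yield by 30% to 40%.
  Distribution  Not yet reported from countries other than China. Epidemic mainly in peanut-growing areas around the Bohai Bay, including Hebei, Liaoning, Shandong, and Beijing.
Symptoms:
  Disease can be seen soon after emergence. Chlorotic yellow spots first appear on the young top leaves, which curl; these develop into a yellow mosaic of alternating yellow and green, netted vein clearing, green stripes, and other symptoms. Symptoms tend to lessen late in the disease.
  The typical yellow mosaic symptoms readily distinguish this disease from other peanut virus diseases. However, it often occurs together with peanut stripe virus disease, and the symptoms are then hard to tell apart.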

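A Peanut stripe virus disease  English name  Peanut stripe virus
  Alias  Peanut mild mottle disease.
  Pathogen  Peanut stripe virus (PStV), a member of the potato virus Y group.
  Hosts  Narrow range: peanut, soybean, sesame, dayflower (鸭跖草), and others.
  Damage  In peanut-growing areas of 8 provinces and municipalities, including Shandong, Hebei, Henan, Jiangsu, and Anhui, field incidence is above 50%, the disease is epidemic year after year, and many fields reach 100%. In the Yangtze basin and peanut-growing areas to its south, it occurs only sporadically in a few fields.
  Distribution  The most widely distributed virus disease of peanut.

Symptoms:
  Early on, distinct chlorotic spots and ring spots appear on the young top leaves, developing into a mild mottle of alternating light green and green, with broken green stripes along the lateral veins and oak-leaf mosaic patterns. Leaf symptoms usually persist until late in the growing season. Symptoms differ somewhat among cultivar types. Except for seed-borne and early infections, where symptoms are more severe and plants are slightly stunted, diseased plants are not stunted and leaves are not noticeably smaller. Sometimes two or three viruses infect together, producing compound symptoms dominated by mosaic.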

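B Peanut common mosaic disease  English name  Peanut common virus
  Alias  Peanut stunt virus disease, peanut common virus disease.
  Pathogen  Peanut stunt virus (PSV), a member of the cucumber mosaic virus group. Different strains of the virus exist.
  Hosts  Peanut, common bean, cowpea, yardlong bean, sword bean, soybean, Chenopodium amaranticolor, black locust, and others.
  Damage  An explosively epidemic disease: sporadic in ordinary years but severe in major epidemic years, when it causes heavy losses to peanut production. It strongly affects peanut; diseased plants form small and deformed pods, and plants infected early can lose 30% to 50% of yield.
  Distribution  Northern peanut-growing areas such as Henan, Hebei, Liaoning, and Shandong.
Symptoms:
  Vein clearing first appears on the young top leaves, with lateral veins noticeably paler and broader, or with chlorotic spots. This develops into a common mosaic of alternating light green and green, with small radiating green stripes and dots along the lateral veins. Leaves become narrow and small, with wavy, twisted margins. The disease clearly affects pod development, producing many small and deformed pods.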

C Peanut bud necrosis virus disease  English name  Peanut bud necrosis
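  Pathogen  Tomato spotted wilt virus (TSWV), a member of the tomato spotted wilt virus group.
  Hosts  Peanut, mung bean, soybean, sesame, pea, tomato, tobacco, potato, eggplant, pepper, pineapple, and others.
  Damage  The disease strongly affects peanut; early infection can cause total crop loss.
  Distribution  An important peanut disease in India, and in recent years spreading in peanut-growing areas of the southern United States. In China it occurs mainly in Guangdong and Guangxi, sporadically in most peanut fields, with incidence reaching up to 20%.
Symptoms:
  Many chlorotic yellow spots or ring spots accompanied by necrosis appear on the top leaves. The vascular bundles beneath the epidermis along the petioles and shoot tip turn necrotic brown, causing necrosis of the top leaves and growing point. Apical growth is suppressed, internodes shorten, leaves die, and plants are markedly stunted.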