Summary of Large and Small BEL Corpuses

The small and large corpora are intended primarily as a public source of examples of knowledge representation in BEL.

Small Corpus

The small corpus is derived primarily from full-text articles, including abstract and caption text, but not from tables or diagrams. It comprises approximately 2,000 BEL statements manually curated from 57 PubMed papers on related topics, with around 800 unique pieces of evidence and ~1,700 causal statements. These BEL statements represent observations from experiments performed in human, mouse, and rat.

Relationship              Count
increases                 997
decreases                 353
directlyIncreases         319
directlyDecreases         103
hasComponent              2
positiveCorrelation       74
subProcessOf              4
negativeCorrelation       24
isA                       1
association               10
causesNoChange            55
hasComponents             1
hasMembers                1
prognosticBiomarkerFor    2
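As a quick cross-check, the relationship counts for the small corpus can be summed and compared with the approximate figures quoted in the text. The following is a minimal sketch, not part of the original corpus tooling. It assumes that "causal statements" refers to the four relationship types increases, decreases, directlyIncreases, and directlyDecreases; the source does not state which types it counts as causal.

```python
# Relationship counts copied from the small-corpus table.
small_corpus_relationships = {
    "increases": 997,
    "decreases": 353,
    "directlyIncreases": 319,
    "directlyDecreases": 103,
    "hasComponent": 2,
    "positiveCorrelation": 74,
    "subProcessOf": 4,
    "negativeCorrelation": 24,
    "isA": 1,
    "association": 10,
    "causesNoChange": 55,
    "hasComponents": 1,
    "hasMembers": 1,
    "prognosticBiomarkerFor": 2,
}

# Assumption: these four types are what the text means by "causal".
CAUSAL_TYPES = ("increases", "decreases", "directlyIncreases", "directlyDecreases")

total = sum(small_corpus_relationships.values())
causal = sum(small_corpus_relationships[name] for name in CAUSAL_TYPES)

print(f"total relationships:  {total}")    # 1946
print(f"causal relationships: {causal}")   # 1772
print(f"causal share:         {causal / total:.1%}")
```

Under that assumption, the totals (1,946 statements, 1,772 of them causal) are roughly in line with the "approximately 2000" and "~1700" figures given for the small corpus.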

 

 

 

Other Functions           Count
biologicalProcess         511
pathology                 124
cellSecretion             13
degradation               33
cellSurfaceExpression     29
translocation             53

Activity Function         Count
kinaseActivity            539
transcriptionalActivity   262
catalyticActivity         137
gtpBoundActivity          83
phosphataseActivity       18
transportActivity         2
molecularActivity         103
peptidaseActivity         28

Abundance Function        Count
proteinAbundance          1067
complexAbundance          253
abundance                 590
rnaAbundance              191
compositeAbundance        52
geneAbundance             4
microRNAAbundance         35

Modification Type         Count
Phosphorylation           302
substitution              28
Acetylation               4
Farnesylation             2
Ubiquitination            9
Hydroxylation             22
Sumoylation               5
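The function tables above use the long-form BEL function names, while BEL statements are often written with short forms such as p() or kin(). The sketch below shows one plausible way to tally function usage from statement text. It is illustrative only and not necessarily how the counts on this page were produced. The helper `count_functions` and the sample statements are hypothetical: the gene, compound, and process names are placeholders and do not come from the corpus.

```python
import re
from collections import Counter

# BEL 1.0 short forms mapped to the long-form names used in the tables.
SHORT_TO_LONG = {
    "p": "proteinAbundance",
    "a": "abundance",
    "r": "rnaAbundance",
    "g": "geneAbundance",
    "m": "microRNAAbundance",
    "complex": "complexAbundance",
    "composite": "compositeAbundance",
    "bp": "biologicalProcess",
    "path": "pathology",
    "kin": "kinaseActivity",
    "tscript": "transcriptionalActivity",
    "cat": "catalyticActivity",
    "gtp": "gtpBoundActivity",
    "phos": "phosphataseActivity",
    "tport": "transportActivity",
    "act": "molecularActivity",
    "pep": "peptidaseActivity",
    "sec": "cellSecretion",
    "deg": "degradation",
    "surf": "cellSurfaceExpression",
    "tloc": "translocation",
}
LONG_NAMES = set(SHORT_TO_LONG.values())

# A function call is treated as a run of letters immediately followed by "(".
FUNCTION_CALL = re.compile(r"\b([A-Za-z]+)\(")


def count_functions(statements):
    """Count BEL functions by long-form name across statement strings."""
    counts = Counter()
    for statement in statements:
        for name in FUNCTION_CALL.findall(statement):
            long_name = SHORT_TO_LONG.get(name, name)
            # Skip anything that is not a known function, e.g. pmod().
            if long_name in LONG_NAMES:
                counts[long_name] += 1
    return counts


# Hypothetical statements with placeholder names (not taken from the corpus).
statements = [
    "kin(p(HGNC:GENE1)) directlyIncreases p(HGNC:GENE2, pmod(P, S, 9))",
    "tscript(p(HGNC:GENE3)) increases r(HGNC:GENE4)",
    "a(CHEBI:compoundX) decreases bp(GO:processY)",
]

counts = count_functions(statements)
for name, count in counts.most_common():
    print(f"{name:<26}{count}")
```

This simple pattern match does not handle quoted namespace values that themselves contain parentheses; a full BEL parser would be needed for exact counts.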

 

 

Large Corpus

The large corpus is a subset of the Selventa knowledgebase, consisting of approximately 80,000 BEL statements from ~16,300 citations. It does not include statements extracted from tables or high-throughput experiments, and the corresponding citations have not necessarily been fully curated, as they were for the small corpus. The statements are a collection of independent observations that were not selected to represent any specific biological process or signaling pathway. Transcriptional control statements were selected in a balanced manner, and protein signaling cascades were included in reasonable detail. There are ~59,500 causal statements. These BEL statements represent observations from experiments performed in human, mouse, and rat.

Relationship              Count
increases                 52377
decreases                 21583
directlyIncreases         5976
directlyDecreases         1215
negativeCorrelation       182
positiveCorrelation       35

Other Functions           Count
biologicalProcess         14792
pathology                 5369

Activity Functions        Count
kinaseActivity            14852
transcriptionalActivity   12804
catalyticActivity         7560
gtpBoundActivity          1484
phosphataseActivity       681

Abundance Functions       Count
proteinAbundance          75831
complexAbundance          7680
abundance                 23880
rnaAbundance              33770
geneAbundance             1422
microRNAAbundance         35

Modification Type         Count
Phosphorylation           7685
