General Evaluation Criteria

The task is evaluated on several sub-levels, as described below. This method of evaluation is intended to enable partial participation: even if not all levels are fulfilled, a reasonable evaluation score can be achieved. For example, teams that do not want to produce normalized entities (see "Evaluation on Term-Level") can use a placeholder instead, as explained below.


Evaluation on different Levels

During evaluation, we will split the BEL statements produced by the participants and evaluate them on the following levels:

Evaluation Example – full BEL statement and the parts evaluated on the different levels:

BEL Statement
p(HGNC:BCL2A1) decreases bp(GOBP:"apoptotic process")
act(p(MGI:Hras)) increases p(MGI:Mmp9)

Evidence Sentence
We demonstrate that the Bfl-1 protein suppresses apoptosis induced by the p53 tumor suppressor protein in a manner similar to other Bcl-2 family members such as Bcl-2, Bcl-xL and EBV-BHRF1.

Cells with activated ras demonstrated high level of expression of 72-kDa metalloproteinase (MMP-2, gelatinase A), and 92-kDa metalloproteinase (MMP-9, gelatinase B) compared with cells containing SV40 large T antigen alone.

Term-level Evaluation
bp(GOBP:"apoptotic process")
Function-level Evaluation
Relationship-level Evaluation
p(HGNC:BCL2A1) decreases bp(GOBP:"apoptotic process")
p(MGI:Hras) increases p(MGI:Mmp9)
Full-statement evaluation
p(HGNC:BCL2A1) decreases bp(GOBP:"apoptotic process")
act(p(MGI:Hras)) increases p(MGI:Mmp9)



Evaluation on Term-Level 

On Term-level, the correctness of BEL terms will be evaluated.

BEL terms are built from entities, their namespaces and associated abundance or process functions.

The evaluation of BEL terms includes the following:


BEL terms are evaluated at one sub-level:

1) "Term Level": Correctness of the complete BEL term. Are the abundance/process functions as well as the namespaces and identifiers correct?



Placeholder Argument Example: 

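For each BEL term format, the entries give a BEL term example, the associated term function (short function name, long function name, function type), the associated namespaces, and other acceptable functions.

Long Function Name: proteinAbundance()
Function Type: abundance
Associated Namespaces: HGNC, MGI, EGID
Other acceptable functions: g(), r(), m()
We recommend using the p() function only. The p() function will be accepted in place of all other possible functions (i.e. the functions g(), r() and m()).

BEL term Example: bp(GOBP:"cell migration")

Long Function Name: pathology()
Function Type: process
Associated Namespaces: MESHD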
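As an illustration of the term-level check described above, the following minimal Python sketch splits a simple short-form BEL term into function, namespace and identifier and compares it with a gold term. The names TERM_RE, parse_term and term_matches are hypothetical, the regular expression covers only simple terms (no nested functions, modifications or long function names), and this is not the official evaluation script.

```python
import re

# Hypothetical pattern for simple terms such as p(HGNC:BCL2A1) or
# bp(GOBP:"apoptotic process"); nested or modified terms are not covered.
TERM_RE = re.compile(r'^(\w+)\((\w+):("[^"]*"|[^(),]+)\)$')

# p() is accepted in place of g(), r() and m(), as stated above.
P_EQUIVALENTS = {"g", "r", "m"}


def parse_term(term):
    """Split a simple BEL term into (function, namespace, identifier)."""
    match = TERM_RE.match(term.strip())
    if match is None:
        raise ValueError(f"not a simple BEL term: {term!r}")
    function, namespace, identifier = match.groups()
    return function, namespace, identifier.strip('"')


def term_matches(predicted, gold):
    """True if function, namespace and identifier agree with the gold term."""
    p_func, p_ns, p_id = parse_term(predicted)
    g_func, g_ns, g_id = parse_term(gold)
    if p_func == "p" and g_func in P_EQUIVALENTS:
        p_func = g_func  # p() stands in for g(), r() and m()
    return (p_func, p_ns, p_id) == (g_func, g_ns, g_id)
```

Under these assumptions, term_matches("p(MGI:Hras)", "r(MGI:Hras)") is true, while a wrong identifier or namespace makes the comparison fail.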

Evaluation on Function-Level

On function-level, the correctness of discovered functions will be evaluated. Functions are only accepted together with their argument BEL terms.

Functions are evaluated at two sub-levels:

1) "Function Level": Correctness of functions together with their arguments. Is a function associated with the correct BEL terms?

2) "Secondary Function Level": Correctness of a function only, regardless of the correctness of its term-arguments (but BEL terms or placeholders need to be present).

For each of these sub-levels, an F-Score will be calculated.
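The text does not say which F-score variant is used. As a minimal sketch, assuming the balanced F-score (F1) over sets of predicted and gold items, the calculation could look as follows; the function name f_score is hypothetical.

```python
def f_score(predicted, gold):
    """Return (precision, recall, F1) for two collections of items.

    Assumption: items are compared by exact string equality and the
    balanced F-score (F1) is used; the task text does not specify this.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

For example, with gold functions act(p(MGI:Hras)) and deg(p(MGI:Ctnnb1)) and predictions act(p(MGI:Hras)) and tloc(p(MGI:Ctnnb1)), one of two predictions is correct and one of two gold items is found, so precision, recall and F1 are all 0.5.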



Placeholder Argument Example: 

The following functions are present in the test set and will be assessed:

Function Type | Function | Example | Comments


Full credit is given if at least one argument of the complex() function is correct.



Original statement:

deg(p(MGI:Ctnnb1)) decreases tloc(p(MGI:Ctnnb1),GOCCID:0005737,GOCCID:0005634)

Also Accepted:

deg(p(MGI:Ctnnb1)) decreases tloc(p(MGI:Ctnnb1))

Accepted Statement with Placeholders (only scores at sub-level 1):

deg(p(PH:placeholder)) decreases tloc(p(PH:placeholder))

Location arguments are not expected but will be accepted if syntactically correct. No credit will be given for location arguments.

The tloc() function will be accepted and credited in place of the following functions:

sec(), surf()
act(p(HGNC:MMP1)) increases deg(p(HGNC:COL2A1))



Original Statement:

bp(GOBP:"response to tumor cell") increases p(MGI:Bad,pmod(P,S,112))

Also Accepted:

bp(GOBP:"response to tumor cell") increases p(MGI:Bad,pmod(P))

Accepted Statement with Placeholders (only scores at sub-level 1):

p(PH:placeholder) increases p(PH:placeholder,pmod(P))

Only pmod(P) will be evaluated. Additional arguments are not expected but will be accepted if syntactically correct. No credit will be given for additional arguments.

The pmod() function will not be accepted or credited in place of any other function.



Original Statement (no credit will be given for the kin() function):

kin(p(MGI:Kdr)) increases p(MGI:Pecam1)

Accepted Statement:

act(p(MGI:Kdr)) increases p(MGI:Pecam1)

Accepted Statement with Placeholders (only scores at sub-level 1):

act(p(PH:placeholder)) increases p(PH:placeholder)

Credit will be given for the act() function only. The act() function will be accepted and credited in place of any other activity function listed.

The act() function will be accepted and credited in place of the following functions:

cat(), chap(), gtp(), kin(), pep(), phos(), ribo(), tscript(), tport()

These functions are not accepted for evaluation; it is necessary to use act() instead.
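Participants whose systems emit the specific activity or translocation functions could rewrite their output before submission. The following minimal sketch assumes short-form function names and replaces them by plain string rewriting; the names ACT_EQUIVALENTS, TLOC_EQUIVALENTS and normalize_functions are hypothetical, and quoted entity names containing text such as kin( are not handled.

```python
import re

# Activity functions for which act() must be used instead (listed above).
ACT_EQUIVALENTS = {"cat", "chap", "gtp", "kin", "pep", "phos", "ribo",
                   "tscript", "tport"}
# Functions for which tloc() is accepted and credited (listed above).
TLOC_EQUIVALENTS = {"sec", "surf"}

# A function name is a run of letters directly followed by "(".
FUNCTION_NAME_RE = re.compile(r"\b([A-Za-z]+)\(")


def normalize_functions(statement):
    """Rewrite specific activity/translocation functions to act()/tloc()."""
    def replace(match):
        name = match.group(1)
        if name in ACT_EQUIVALENTS:
            name = "act"
        elif name in TLOC_EQUIVALENTS:
            name = "tloc"
        return name + "("

    return FUNCTION_NAME_RE.sub(replace, statement)
```

Applied to kin(p(MGI:Kdr)) increases p(MGI:Pecam1), this sketch yields act(p(MGI:Kdr)) increases p(MGI:Pecam1), which matches the accepted statement shown above.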


Evaluation on Relationship-Level 

On relationship level, the relationship contained in a BEL statement will be evaluated. BEL relationships have a subject-predicate-object structure. The subject and the object are the entities involved in the relationship, in the format of BEL terms; the predicate is the relationship between them. Only these components of a BEL statement will be taken into account for the evaluation on relationship level. Other functions that are not term functions will be ignored.

Relationships will be evaluated on two levels:

1) "Relationship Level": Full Relationships; subject and object need to be correct.

2) "Secondary Relationship Level": Partial relationships; relationships containing two correct units out of subject, predicate, and object.

For both levels, an F-score will be calculated.
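To make the distinction between full and partial relationships concrete, the following minimal sketch counts how many of the three units (subject, predicate, object) of a predicted relationship agree with the gold relationship. It assumes that both statements have already been reduced to relationship level, that units are compared as plain strings, and that the "directly" variants are treated like their plain counterparts; split_relationship and correct_units are hypothetical names, not part of the official evaluation.

```python
import re

# Relationship keywords surrounded by whitespace; the capturing group keeps
# the predicate in the result of re.split().
RELATION_RE = re.compile(
    r"\s+(directlyIncreases|directlyDecreases|increases|decreases|association)\s+"
)

# Assumption: increases/decreases receive the same credit as the
# corresponding "directly" variants, so both are mapped to one form.
RELATION_EQUIVALENTS = {
    "directlyIncreases": "increases",
    "directlyDecreases": "decreases",
}


def split_relationship(statement):
    """Split 'subject predicate object' into its three units."""
    parts = RELATION_RE.split(statement.strip(), maxsplit=1)
    if len(parts) != 3:
        raise ValueError(f"no supported relationship in: {statement!r}")
    subject, predicate, obj = parts
    return subject, RELATION_EQUIVALENTS.get(predicate, predicate), obj


def correct_units(predicted, gold):
    """Number of units (0 to 3) on which predicted and gold agree."""
    pairs = zip(split_relationship(predicted), split_relationship(gold))
    return sum(p == g for p, g in pairs)
```

Under these assumptions, a count of 3 corresponds to a full relationship and a count of 2 to a partial relationship; a statement with an entity placeholder or with association as predicate reaches 2 when the remaining units are correct.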




The following relationships will be assessed:

Relationship | Alternative Symbol


Statement from Gold Standard:

kin(p(MGI:Kdr)) decreases p(MGI:Tek)

Expected/Evaluated on Relationship Level (only the relationship and its term-arguments):

p(MGI:Kdr) decreases p(MGI:Tek)

Examples for Partial Relationships/Use of Placeholders (only scoring at secondary relationship level evaluation):

Partial Relationship:

p(MGI:Kdr) decreases p(MGI:wrong identifier)

Entity Placeholder:

p(PH:Placeholder) decreases p(MGI:Tek)

Association Placeholder:

p(MGI:Kdr) association p(MGI:Tek)

Accepted Statement:

act(p(HGNC:MDM2)) directlyDecreases p(HGNC:TP53)

Also Accepted:

act(p(HGNC:MDM2)) decreases p(HGNC:TP53)
p(HGNC:MDM2) directlyDecreases p(HGNC:TP53)
p(HGNC:MDM2) decreases p(HGNC:TP53)


(either version is accepted and scores the same credit)

Partial Relationship:
p(HGNC:wrong identifier) directlyDecreases p(HGNC:TP53)

Entity Placeholder:

p(HGNC:MDM2) decreases p(PH:Placeholder)

Association Placeholder:

p(HGNC:MDM2) association p(HGNC:TP53)
decreases will be accepted in place of directlyDecreases, with the same credit given.

Statement from Gold Standard:

p(HGNC:ARRB1) increases kin(p(HGNC:MAPK1))

Expected/Evaluated on Relationship Level:

p(HGNC:ARRB1) increases p(HGNC:MAPK1)

Partial Relationship:

p(HGNC:wrong identifier) increases p(HGNC:MAPK1)

Entity Placeholder:

p(PH:Placeholder) increases p(HGNC:MAPK1)

Association Placeholder:

p(HGNC:ARRB1) association p(HGNC:MAPK1)

Accepted Statement:

p(HGNC:VEGFB) directlyIncreases act(p(HGNC:FLT1))

Also Accepted:

p(HGNC:VEGFB) increases act(p(HGNC:FLT1))
p(HGNC:VEGFB) directlyIncreases p(HGNC:FLT1)
p(HGNC:VEGFB) increases p(HGNC:FLT1)

(either version is accepted and scores the same credit)

Partial Relationship:
p(wrong namespace:wrong identifier) directlyIncreases p(HGNC:FLT1)

Entity Placeholder:

p(HGNC:VEGFB) increases p(PH:Placeholder)

Association Placeholder:

p(HGNC:VEGFB) association p(HGNC:FLT1)

increases will be accepted in place of directlyIncreases with the same credit given



Special Case: Example of a Relationship with complex as an argument

Statement from Gold Standard:

cat(complex(p(HGNC:CREBBP),p(HGNC:EP300))) increases p(HGNC:KLF1,pmod(A))

Expected/Evaluated on Relationship Level (only the relationship and its term-arguments):

complex(p(HGNC:CREBBP),p(HGNC:EP300)) increases p(HGNC:KLF1)
complex(p(HGNC:CREBBP)) increases p(HGNC:KLF1)
complex(p(HGNC:EP300)) increases p(HGNC:KLF1)
complex(p(HGNC:CREBBP),p(PH:Placeholder)) increases p(HGNC:KLF1)

(all above versions are accepted and score the same credit)

Apart from the abundance and process functions, the complex() function is the only function that is evaluated on function level as well as on relationship level.

A complex function is evaluated as correct if at least one of its term-arguments is correct.
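The rule for complex() can be sketched as follows, assuming that term-arguments are compared as plain strings and that both inputs are complex() calls; split_arguments and complex_is_correct are hypothetical names, and quoted entity names containing commas or parentheses are not handled.

```python
def split_arguments(call):
    """Return the top-level, comma-separated arguments of a call like complex(a,b)."""
    inner = call[call.index("(") + 1 : call.rindex(")")]
    args, current, depth = [], [], 0
    for char in inner:
        if char == "," and depth == 0:
            # A comma outside any nested parentheses ends an argument.
            args.append("".join(current).strip())
            current = []
            continue
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        current.append(char)
    if current:
        args.append("".join(current).strip())
    return args


def complex_is_correct(predicted, gold):
    """True if at least one term-argument of predicted also occurs in gold."""
    return bool(set(split_arguments(predicted)) & set(split_arguments(gold)))
```

With the gold term complex(p(HGNC:CREBBP),p(HGNC:EP300)), this sketch accepts complex(p(HGNC:EP300)) and complex(p(HGNC:CREBBP),p(PH:Placeholder)), but not a complex whose only argument is a placeholder.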


Full Statement 

This level evaluates whether a full BEL statement is correct and complete.


Overall Evaluation

A final overall score will be calculated from the results of all evaluation levels. Full discovered statements will be scored by their amount of coverage compared to the gold standard statement.