
Important News

BEL Track Task Description

Task 1

  • Short description: Given textual evidence for a BEL statement, generate the corresponding BEL statement.

  • Training data: A significant number of relationships systematically selected from the curated networks, with their evidence and the full BEL statement.

  • Test data: A smaller number of relationships from the same dataset. We provide only the evidence sentence and the participants have to generate the BEL statement.

  • Evaluation: We will compare a list of the n best BEL statements generated by the participant's system to the corresponding human-generated BEL statements. If a participant notices inconsistencies after the fully automated evaluation step, specific BEL statements can be re-submitted for manual evaluation, provided that their syntax has been verified.
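The comparison of system output against human-generated statements can be illustrated with a minimal sketch. It assumes exact matching of full statements after whitespace normalization, which is a simplification and not the official scoring procedure. The function names `normalize` and `score_statements` and the example statements are hypothetical.

```python
def normalize(statement):
    """Collapse runs of whitespace so trivially different spellings compare equal."""
    return " ".join(statement.split())


def score_statements(predicted, gold):
    """Return (precision, recall, f1) for predicted vs. gold BEL statements.

    Assumption: a prediction counts as correct only if it equals a gold
    statement exactly after whitespace normalization.
    """
    pred = {normalize(s) for s in predicted}
    ref = {normalize(s) for s in gold}
    true_pos = len(pred & ref)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if true_pos else 0.0
    return precision, recall, f1


# Illustrative statements only; they are not taken from the task data.
gold = ['p(HGNC:TNF) increases bp(GOBP:"apoptotic process")']
predicted = [
    'p(HGNC:TNF)  increases bp(GOBP:"apoptotic process")',
    'p(HGNC:TNF) decreases bp(GOBP:"apoptotic process")',
]
precision, recall, f1 = score_statements(predicted, gold)
```

Here one of the two predictions matches the single gold statement, so precision is lower than recall. A real evaluation may also give partial credit for correct fragments, which this sketch does not attempt.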

Task 2

  • Short description: Given a BEL statement, provide at most 10 additional evidence sentences.

  • Training data: Same data as for Task 1

  • Test data: BEL statements WITHOUT evidence. The participants have to provide at most 10 sentences (ranked by confidence) with different PMIDs that offer evidence for each BEL statement.

  • Evaluation: A team of experts will manually assess all evidence statements provided by the participants and classify them as correct or incorrect. We will then score each participant’s contribution using a ranking metric, such as TAP-k.
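The Task 2 constraints (at most 10 sentences per BEL statement, different PMIDs, ranked by confidence) can be sketched as a small selection step. This is an assumption about how a participant system might prepare its output, not part of the task definition. The `Evidence` class, the `select_evidence` function, and the placeholder PMIDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    pmid: str          # PubMed identifier of the source document
    sentence: str      # candidate evidence sentence
    confidence: float  # system score, higher means more confident


def select_evidence(candidates, limit=10):
    """Keep the best-scoring sentence per PMID, rank by confidence, cap at `limit`."""
    best_per_pmid = {}
    for cand in candidates:
        current = best_per_pmid.get(cand.pmid)
        if current is None or cand.confidence > current.confidence:
            best_per_pmid[cand.pmid] = cand
    ranked = sorted(best_per_pmid.values(), key=lambda e: e.confidence, reverse=True)
    return ranked[:limit]


# Placeholder PMIDs and sentences for illustration only.
candidates = [
    Evidence("1001", "Sentence A.", 0.9),
    Evidence("1001", "Sentence B.", 0.4),
    Evidence("1002", "Sentence C.", 0.7),
]
ranked = select_evidence(candidates)
```

Keeping only one sentence per PMID is the simplest way to satisfy the "different PMIDs" requirement; a system could equally choose among sentences from the same abstract by some other criterion.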

Biological Expression Language

The Biological Expression Language (BEL) is a language for representing scientific findings in the life sciences in a computable form. BEL represents these findings by capturing causal and correlative relationships in a given context, which includes information about the biological system and experimental conditions. The supporting evidence is captured and linked to the publication references. BEL is specifically designed to adopt external vocabularies and ontologies, so it represents life-science knowledge in a language and schema already known to the community.
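To make the subject, relationship, object shape of a causal BEL statement concrete, here is a minimal sketch that splits a simple statement into its three parts. It handles only a small subset of relationship keywords in their long form and does not cover nested statements, so it is not a BEL parser. The `Statement` class, the `split_statement` function, and the example statement are illustrative assumptions.

```python
import re
from dataclasses import dataclass

# Assumption: a small subset of BEL relationship keywords, long form only.
RELATIONS = ("directlyIncreases", "directlyDecreases", "increases", "decreases")

_PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?P<relation>" + "|".join(RELATIONS) + r")\s+(?P<object>.+)$"
)


@dataclass
class Statement:
    subject: str
    relation: str
    object_term: str


def split_statement(text):
    """Split 'subject relation object' into a Statement, or return None."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    return Statement(match["subject"], match["relation"], match["object"])


# Illustrative statement: a protein abundance increasing a biological process.
example = split_statement('p(HGNC:TNF) increases bp(GOBP:"apoptotic process")')
```

The namespace prefixes in the example (`HGNC`, `GOBP`) show how terms refer to external vocabularies rather than to names defined inside BEL itself.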


More Information about dataset files and format

Sample Data

BioCreative V Track 4 used version 1 of the sample data, which can be found here (Datasets -> Sample Corpus).

Training Data

The structure of the training data is identical to the structure of the sample data described above.

BioCreative V Track 4 used version 1 of the training data, which can be found here (Datasets -> Training Corpus).

Test Data

The test data will be released on June 14, 2015.

Test Set Task 1

  • Main File:  Test.sentence

    (Contains 100 Sentences)

    Test Set Task 1 - Second Stage  (deadline 00:01 AoE Thursday, June 18)

Task 1: Participants are asked to deliver BEL statements generated from the provided sentences (File: Test.sentence). The evaluation will be fully automated (as described under Evaluation Details of BioCreative BEL Task 1), using a manually created gold standard which includes all possible BEL statements generated from the input sentences.

The supporting file containing entities from the test set can be used to generate submissions on higher levels (Function Level, Relationship Level, Full Statement Level). Participation in stage 1 is not required for participation in stage 2.

Test Set Task 2

Task 2: Participants are asked to deliver corresponding evidence (i.e. one sentence per BEL statement, with the corresponding PubMed identifier) from any PubMed abstract. Evidence from full text will be accepted as well if the paper is available in open access format (in particular, the Open Access subset of PubMed Central).


The test data comes from the same pool as the training and sample data.

  • For Task 1 we only use sentences/PMIDs not occurring in the sample and training data. 
  • For Task 2 we expect evidence sentences which are not part of the sample and training data.
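For Task 2, one way to respect the second point is to drop candidate sentences that already occur in the sample or training data before ranking. The sketch below assumes that a case-insensitive, whitespace-normalized comparison is enough to detect such repeats; the function names `canonical` and `exclude_known` are hypothetical.

```python
def canonical(sentence):
    """Lowercase and collapse whitespace for a tolerant comparison."""
    return " ".join(sentence.lower().split())


def exclude_known(candidates, known_sentences):
    """Return the candidate sentences that do not occur in `known_sentences`."""
    known = {canonical(s) for s in known_sentences}
    return [c for c in candidates if canonical(c) not in known]


# Placeholder sentences for illustration only.
training = ["Protein A activates protein B."]
found = ["protein A  activates protein B.", "Protein C inhibits protein D."]
novel = exclude_known(found, training)
```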

Important Dates

Release training data: April 24, 2015

Release Test Data: Jun 14, 2015 

Submission of Results by Participants Deadline: Jun 16, 2015

Release of gold standard entities: Jun 17, 2015

Second submission deadline: Jun 18, 2015 (optional delivery of revised results of task 1 including gold standard entities)

Notification of results to participants: July 10, 2015 (results of task 1 might be announced earlier)

Deadline for delivery of system description papers: July 20, 2015 

Provide feedback on the papers: August 1, 2015

Camera-ready: August 15, 2015

Workshop: September 9-11, 2015

Submission Limits

TASK 1: 

We will accept 3 runs per participant/team. In each run, for each input sentence, up to 10 BEL statements or fragments will be considered.

Although you are free to use the runs as you prefer, in principle each run is meant to represent a different configuration of your system. 
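The Task 1 limits can be applied mechanically before submitting. The following sketch assumes a hypothetical in-memory layout in which each run maps a sentence ID to a list of BEL statements ordered best-first; the actual submission file format is not described here, and `cap_runs` is an illustrative name.

```python
MAX_RUNS = 3
MAX_STATEMENTS_PER_SENTENCE = 10


def cap_runs(runs):
    """Trim a submission to 3 runs and 10 statements per input sentence.

    Assumption: `runs` is a list of dicts, each mapping a sentence ID to a
    list of statements already sorted from best to worst.
    """
    return [
        {sid: stmts[:MAX_STATEMENTS_PER_SENTENCE] for sid, stmts in run.items()}
        for run in runs[:MAX_RUNS]
    ]


# Four runs with twelve statements for one sentence, for illustration only.
runs = [{"s1": [f"stmt{i}" for i in range(12)]} for _ in range(4)]
capped = cap_runs(runs)
```

Because only the first 10 statements per sentence are considered, ordering the statements by confidence before trimming matters more than the trimming itself.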

TASK 2: 

We will accept one run per participant/team, containing at most ten (independent) sentences ranked in order of relevance.

Since these sentences will be manually evaluated by a team of experts, we can only guarantee to evaluate the top five sentences for each submission. If resources allow, more sentences might be considered (depending on the number of submissions).

Main References

Task organizing committee:

  • Dr. Fabio Rinaldi (OntoGene, Switzerland)
  • Dr. Juliane Fluck (Fraunhofer SCAI Institute, Germany)
  • Dr. Sam Ansari (sbvIMPROVER, Switzerland)
  • Dr. Julia Hoeng (sbvIMPROVER, Switzerland)
  • Prof. Dr. Martin Hofmann-Apitius (OpenBEL Consortium and Fraunhofer SCAI Institute, Germany)


