BioCreative VI Track 3 (BEL Task 2017)

BEL Task

Extraction of causal network information using the Biological Expression Language (BEL)

Overview: Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. In BioCreative V, we tackled this complexity by extracting causal relationships represented in the Biological Expression Language (BEL). BEL is an advanced knowledge representation format designed to be both human readable and machine processable. Its smallest unit is a BEL statement (or BEL nanopub), expressing a single causal relationship. In the previous BioCreative V Track 4 (BEL Task), participants had only limited time to train on the data, and the evaluation environment became available only during the test phase. Furthermore, no training data was available for the second subtask, sentence classification. We therefore decided to present the same task again, based on new test data. This time, training data for both subtasks is available, and the evaluation environment can be used during the training phase. As before, the challenge is organized into two tasks that evaluate complementary aspects of the problem:

Task 1

  • Short description: Given a selected text passage (evidence sentence), construct the corresponding BEL statement.

  • Training data: A significant number of relationships systematically selected from the curated networks, with their evidence and the full BEL statement.

  • Test data: A smaller number of relationships from the same dataset. We provide only the evidence sentence and the participants have to generate the BEL statement.

  • Evaluation: We will compare a list of the n best BEL statements generated by the user system to the corresponding human-generated BEL statements. In case a user notices inconsistencies after the fully-automated evaluation step, specific BEL statements can be re-submitted for manual evaluation, given that their syntax has been verified.

Task 2

  • Short description: Given a BEL statement, find all available textual evidence.

  • Training data: Same data as for Task 1

  • Test data: BEL statements WITHOUT evidence. The participants have to provide at most 10 sentences (ranked by confidence) with different PMIDs that offer evidence for each BEL statement.

  • Evaluation: A team of experts will manually assess all evidence statements provided by the participants and classify them as correct or incorrect. We will then score each participant's contribution using a ranking metric, such as TAP-k.
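TAP-k builds on ordinary average precision over a ranked result list, adding a threshold term; the exact formula used for scoring is defined by the organizers. As a rough intuition for how a ranked evidence list is rewarded, the following is a minimal average-precision sketch (a simplification, not the official TAP-k implementation): precision is taken at the rank of each correct hit and averaged.

```python
def average_precision(ranked_hits, cutoff=None):
    """Mean of the precision values at the rank of each correct hit.

    ranked_hits: list of booleans, best-ranked first
                 (True = the expert judged the sentence correct).
    cutoff: optionally score only the top `cutoff` entries
            (e.g. the top 5 sentences guaranteed to be evaluated).
    """
    if cutoff is not None:
        ranked_hits = ranked_hits[:cutoff]
    precisions = []
    correct = 0
    for rank, hit in enumerate(ranked_hits, start=1):
        if hit:
            correct += 1
            # precision at this rank: correct hits so far / ranks seen so far
            precisions.append(correct / rank)
    if not precisions:
        return 0.0
    return sum(precisions) / len(precisions)
```

Note how the metric favors placing correct sentences near the top: a correct hit at rank 1 contributes precision 1.0, while the same hit at rank 10 contributes far less.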

Biological Expression Language

The Biological Expression Language (BEL) is a language for representing scientific findings in the life sciences in a computable form. BEL represents scientific findings by capturing causal and correlative relationships in a given context. This context includes information about the biological system and the experimental conditions. The supporting evidence is captured and linked to the publication references. BEL is specifically designed to adopt external vocabularies and ontologies, and therefore represents life-science knowledge in a language and schema known to the community.
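For illustration, a single causal statement with its evidence context might look like the following BEL Script sketch. The gene symbols, cell line, and citation fields are placeholders chosen for this example, not taken from the task data:

```
SET Citation = {"PubMed", "Journal name", "PMID placeholder"}
SET Evidence = "TNF stimulation increased NF-kB transcriptional activity."
SET CellLine = "HeLa"
p(HGNC:TNF) increases act(p(HGNC:NFKB1))
```

Here `p()` denotes a protein abundance, `act()` an activity, and `increases` the causal relationship; the namespaces (e.g. HGNC) are the external vocabularies mentioned above.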


More Information about dataset files and format

Sample Data

In BioCreative V Track 4, version 1 of the sample data was used; it can be found here (Datasets -> Sample Corpus).

Training Data

The structure of the training data is identical to the structure of the sample data described above.

In BioCreative V Track 4, version 1 of the training data was used; it can be found here (Datasets -> Training Corpus).

Test Data

Test Set Task 1

  • First Stage (deadline 00:01 AoE Tuesday, July 25, 2017)

    • Main File: Sentences

      (Contains 116 Sentences)

    • Support files

      • We will focus on a restricted set of frequent biological processes.
        Only the GO processes included in this list will be considered.

      •   BioC version

  • Second Stage (deadline 01:01 AoE Thursday, July 27, 2017)

    • Supporting File for second stage: Annotated Sentences (the file contains entity information with offsets and the associated normalized concept with the namespace)

Task 1: Participants are asked to deliver BEL statements generated from the provided sentences (File: Test.sentence). The evaluation will be fully automated (as described under Evaluation Details of BioCreative BEL Task 1), using a manually created gold standard which includes all possible BEL statements generated from the input sentences.

The supporting file containing entities from the test set can be used to generate submissions at higher levels (Function Level, Relationship Level, Full Statement Level). Participation in stage 1 is not required for participation in stage 2.

Test Set Task 2 (deadline 00:01 AoE Tuesday, July 25, 2017)

Task 2: Participants are asked to deliver corresponding evidence (i.e., evidence sentences, each with its PubMed identifier, for each BEL statement) from any PubMed abstract. Evidence from full text will be accepted as well, if the paper is available in an open-access format (in particular, the Open Access subset of PubMed Central).

Submission Limits

TASK 1: 

We will accept 3 runs per participant/team. In each run, for each input sentence, up to 10 BEL statements or fragments will be considered.

Although you are free to use the runs as you prefer, in principle each run is meant to represent a different configuration of your system. 

TASK 2: 

We will accept one run per participant/team, containing at most ten (independent) sentences ranked in order of relevance. Since these sentences will be manually evaluated by a team of experts, we can only guarantee to evaluate the top five sentences for each submission. If resources allow, more sentences might be considered (depending on the number of submissions).
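The submission rules above boil down to: at most one sentence per PMID, ranked by confidence, truncated to ten entries. A run can be normalized accordingly with a short helper; the in-memory tuple format below is hypothetical, used only to illustrate the constraints (the official submission format is described on the dataset pages):

```python
def prepare_run(candidates, max_sentences=10):
    """Normalize one Task 2 run for a single BEL statement.

    candidates: iterable of (pmid, sentence, confidence) tuples
    (a hypothetical in-memory form, not the official file format).
    Keeps only the highest-confidence sentence per PMID, ranks the
    survivors by confidence, and truncates to the submission limit.
    """
    best = {}
    for pmid, sentence, conf in candidates:
        # enforce "different PMIDs": keep the best sentence per abstract
        if pmid not in best or conf > best[pmid][2]:
            best[pmid] = (pmid, sentence, conf)
    # rank by confidence, highest first, and apply the 10-sentence cap
    ranked = sorted(best.values(), key=lambda t: t[2], reverse=True)
    return ranked[:max_sentences]
```

Since only the top five sentences are guaranteed to be evaluated, the ordering produced here matters more than the tail of the list.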

Important Dates

Release training data (Training-2015 + Sample-2015 + Test-2015): Already available at the Datasets page.

Evaluation website: 

Release test data: Jul 24, 2017 (Mon) (Jul 11, 2017 (Tue))

Submission of results (by participants) deadline: Jul 25, 2017 (Tue) (Jul 12, 2017 (Wed))

Release of gold standard entities: Jul 26, 2017 (Wed) (Jul 13, 2017 (Thu))

Second submission deadline: Jul 27, 2017 (Thu) (Jul 14, 2017 (Fri)) (optional delivery of revised results of task 1 including gold standard entities)

Notification of results to participants: Aug 18, 2017 (Fri) (Aug 4, 2017 (Fri)) (results of task 1 might be notified earlier)

Submission of the system description papers: Sep 3, 2017 (Sun) (Aug 20, 2017 (Sun))

Feedback on the papers: Sep 15, 2017 (Fri)

Camera-ready papers: Oct 1, 2017 (Sun)

Workshop: Oct 18-20, 2017 (Wed-Fri)

Main References

  • Short presentation of BEL task: bel.pdf
  • Google group for task participants: BEL task

Task organizing committee

  • Dr. Juliane Fluck (Fraunhofer SCAI Institute, Germany)
  • Sumit Madan (Fraunhofer SCAI Institute, Germany)
  • Dr. Justyna Szostak (Philip Morris International: PMI, Switzerland)
  • Prof. Dr. Martin Hofmann-Apitius (OpenBEL Consortium and Fraunhofer SCAI Institute, Germany)