Task 1
Short description: Given textual evidence for a BEL statement, generate the corresponding BEL statement.
Training data: A significant number of relationships systematically selected from the curated networks, with their evidence and the full BEL statement.
Test data: A smaller number of relationships from the same dataset. We provide only the evidence sentence; participants must generate the BEL statement.
Evaluation: We will compare a list of the n best BEL statements generated by each system to the corresponding human-generated BEL statements. If participants notice inconsistencies after the fully automated evaluation step, specific BEL statements can be re-submitted for manual evaluation, provided their syntax has been verified.
Task 2
Short description: Given a BEL statement, provide at most 10 additional evidence sentences.
Training data: Same data as for Task 1
Test data: BEL statements WITHOUT evidence. Participants must provide at most 10 sentences (ranked by confidence) with different PMIDs that offer evidence for each BEL statement.
Evaluation: A team of experts will manually assess all evidence statements provided by the participants and classify them as correct or incorrect. We will then score each participant's contribution using a ranking metric, such as TAP-k.
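As a loose illustration of rank-sensitive scoring (TAP-k itself is defined in its own publication and is not reproduced here), the sketch below computes plain average precision over one ranked list of accepted/rejected evidence sentences; correct sentences ranked early score higher than the same set ranked late.

```python
def average_precision(judgments):
    """Average precision for one ranked list of evidence sentences,
    where judgments[i] is True if the i-th sentence was judged correct.
    Illustration only; the task itself uses a TAP-k style metric."""
    hits = 0
    precision_sum = 0.0
    for rank, correct in enumerate(judgments, start=1):
        if correct:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

# Same two correct sentences, ranked first vs. ranked last.
print(average_precision([True, True, False, False]))   # 1.0
print(average_precision([False, False, True, True]))   # ≈ 0.417
```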
The Biological Expression Language (BEL) is a language for representing scientific findings in the life sciences in a computable form. BEL represents scientific findings by capturing causal and correlative relationships in a given context. This context includes information about the biological system and experimental conditions. The supporting evidence is captured and linked to the publication references. BEL is specifically designed to adopt external vocabularies and ontologies, and therefore represents life-science knowledge in a language and schema known to the community.
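For illustration, a BEL statement combines namespace-qualified entity terms with a causal relationship. The sketch below shows a hypothetical statement (not taken from the task data) together with a deliberately rough surface check; real BEL syntax is defined by the full specification linked below.

```python
import re

# A hypothetical BEL statement: subject term, relationship, object term,
# with entities qualified by a namespace (here HGNC gene symbols).
statement = "p(HGNC:AKT1) decreases kin(p(HGNC:GSK3B))"

# Very rough surface check: term, relationship keyword, term.
# The full BEL grammar (modifiers, nested terms, more relationship
# types) is defined in the BEL specification, not here.
pattern = re.compile(
    r"^\w+\(.+\)\s+"                                            # subject term
    r"(increases|decreases|directlyIncreases|directlyDecreases)\s+"
    r"\w+\(.+\)$"                                               # object term
)

print(bool(pattern.match(statement)))  # True
```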
The structure of the training data is identical to the structure of the sample data described above.
The test data will be released on June 14, 2015.
Test Set Task 1
- Main File: Test.sentence (contains 100 sentences)
Test Set Task 1 - Second Stage (deadline 00:01 AoE Thursday, June 18)
- Supporting File for second stage: Test.entities
Task 1: Participants are asked to deliver BEL statements generated from the provided sentences (file: Test.sentence). The evaluation will be fully automated (as described under Evaluation Details of BioCreative BEL Task 1), using a manually created gold standard that includes all possible BEL statements generated from the input sentences.
The supporting file containing entities from the test set can be used to generate submissions at higher levels (Function Level, Relationship Level, Full Statement Level). Participation in stage 1 is not required for participation in stage 2.
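The staged levels could, for example, be derived from a statement as in the sketch below. This is a simplified reading of the level names, assumes the basic '<term> <relationship> <term>' shape, and is not the official evaluation code.

```python
import re

def decompose(statement: str):
    """Split a simple BEL statement into pieces that a staged comparison
    could match: functions, the relationship triple, and the full form.
    Assumes the simplified shape '<term> <relationship> <term>'."""
    match = re.match(r"^(\S.*?\))\s+(\S+)\s+(\S.*\))$", statement)
    if match is None:
        raise ValueError("not in '<term> <relationship> <term>' shape")
    subject, relationship, obj = match.groups()
    return {
        # Function level: the BEL functions used, e.g. p, kin.
        "functions": sorted(set(re.findall(r"(\w+)\(", statement))),
        # Relationship level: subject / relationship / object split.
        "relationship": (subject, relationship, obj),
        # Full statement level: the exact statement string.
        "full": statement,
    }

levels = decompose("p(HGNC:AKT1) decreases kin(p(HGNC:GSK3B))")
print(levels["functions"])        # ['kin', 'p']
print(levels["relationship"][1])  # decreases
```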
Test Set Task 2
- Main File: Test.BEL (contains 100 BEL statements)
- Supporting File 1: TestBioC.xml
- Supporting File 2: test_graph_svg.zip
- Supporting File 3: Test-fragments.csv
- Supporting File 4: Test-entities.csv
Task 2: Participants are asked to deliver corresponding evidence (i.e., evidence sentences, each with its PubMed identifier, for each BEL statement) from any PubMed abstract. Evidence from full text will also be accepted if the paper is available in an open-access format (in particular, the Open Access subset of PubMed Central).
The test data comes from the same pool as the training and sample data.
- For Task 1 we only use sentences/PMIDs not occurring in the sample and training data.
- For Task 2 we expect evidence sentences which are not part of the sample and training data.
Release of training data: April 24, 2015
Release of test data: Jun 14, 2015
Submission of Results by Participants Deadline: Jun 16, 2015
Release of gold standard entities: Jun 17, 2015
Second submission deadline: Jun 18, 2015 (optional delivery of revised results of task 1 including gold standard entities)
Notification of results to participants: July 10, 2015 (Task 1 results may be announced earlier)
Deadline for delivery of system description papers: July 20, 2015
Provide feedback on the papers: August 1, 2015
Camera-ready: August 15, 2015
Workshop: September 9-11, 2015
For Task 1, we will accept 3 runs per participant/team. In each run, for each input sentence, up to 10 BEL statements or fragments will be considered.
Although you are free to use the runs as you prefer, in principle each run is meant to represent a different configuration of your system.
For Task 2, we will accept one run per participant/team, containing at most ten (independent) sentences ranked in order of relevance.
Since these sentences will be manually evaluated by a team of experts, we can only guarantee to evaluate the top five sentences for each submission. If resources allow, more sentences might be considered (depending on the number of submissions).
- BioCreative web site: http://www.biocreative.org/
- Short presentation of BEL task: bel.pdf
- Google group for task participants: http://tinyurl.com/om7stv9
- BEL documentation can be found here.
- Short video introduction about BEL (6 min): https://sbvimprover.com/sites/default/files/nvc-video1-using-bel.mp4
- Full BEL specification: http://wiki.openbel.org/download/attachments/819563/BEL%20V1.0%20Language%20Overview.pdf?api=v2
- sbvIMPROVER (large-scale manual curation of pathway networks): https://bionet.sbvimprover.com/
- Causal Biological Networks (CBN) database http://www.causalbionet.com/
Task organizing committee: