EXCITE - Extraction of Citations from PDF Documents

About Excite


Excite Team




WeST - The Institute for Web Science and Technologies

  • Prof. Dr. Steffen Staab

    Team Leader

    staab@uni-koblenz.de

  • Dr. Zeyd Boukhers

    Researcher

    boukhers@uni-koblenz.de

  • Martin Körner

    Researcher

    mkoerner@uni-koblenz.de

GESIS - Leibniz-Institut für Sozialwissenschaften

  • Dr. Philipp Mayr

    Team Leader

    philipp.mayr@gesis.org

  • Behnam Ghavimi

    Researcher

    behnam.ghavimi@gesis.org

  • Azam Hosseini

    Programmer

    azam.hosseini@gesis.org


  
  

External Supporter

Dr. Heinrich Hartmann

heinrich@heinrichhartmann.com


Software

Excite provides several services for extracting and parsing citations. All tools are licensed under Creative Commons Attribution-NonCommercial (CC BY-NC), and their source code is available on GitHub.

  • EXParser: A Python tool that extracts and segments references from PDF files using a feedback mechanism.

  • EXMatcher: An algorithm that matches reference strings to the corresponding items in a bibliographic corpus (such as Sowiport.org or related-work.net).

  • EXPublisher: A tool that converts EXCITE data to a JSON file following the OCC ontology.

  • EXRef-Identifier: An annotation tool for marking reference strings in text files, used to create a gold standard.

    A live demo is available.

  • EXRef-Segmentation: An annotation tool for manually parsing reference strings.

    A live demo is available.

  • RefExt: A Java tool that extracts references from PDF files using Conditional Random Fields (CRF).
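
To give a rough idea of what matching a reference string against a bibliographic corpus involves, here is a minimal sketch. It is not EXMatcher's algorithm: it swaps in plain string similarity from Python's standard library, and the corpus records, the record IDs, the helper names (normalize, best_match), and the 0.6 threshold are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and keep only letters, digits, and single spaces."""
    kept = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(kept.split())


def best_match(reference, corpus, threshold=0.6):
    """Return (record_id, score) for the most similar record, or None.

    The threshold is an arbitrary illustrative cut-off, not a tuned value.
    """
    ref_norm = normalize(reference)
    best = None
    for record_id, record_text in corpus.items():
        score = SequenceMatcher(None, ref_norm, normalize(record_text)).ratio()
        if best is None or score > best[1]:
            best = (record_id, score)
    if best is not None and best[1] >= threshold:
        return best
    return None


# Hypothetical two-record corpus, invented for illustration only.
corpus = {
    "rec-1": "Doe, J. (2010). A study of citation networks. Journal of Examples, 5(2), 10-20.",
    "rec-2": "Roe, R. (2015). Parsing PDF documents. Example Press.",
}
reference = "J. Doe: A Study of Citation Networks, Journal of Examples 5 (2), 2010"
match = best_match(reference, corpus)
print(match)
```

A real matcher would more likely compare segmented fields (authors, title, year) and query a search index instead of scanning every record, but the sketch shows the basic shape of the task: normalize, score candidates, and accept only matches above a threshold.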


News From Excite


Publications