Hindawi XML Corpus

To facilitate the use of its content for data mining, Hindawi makes its full corpus of XML content available for download as a single .zip file. The .zip file is organized in a two-level folder structure: first by publication year, then by journal. For example, the folder called "2011" contains one subfolder for each journal that published at least one article in 2011, and each of those subfolders holds the individual XML files for that journal's 2011 articles. The .zip file also contains an XML file called contents.xml, which provides an overview of all of the subfolders in the archive.
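The year/journal layout described above can be explored without extracting the archive. The following is a minimal sketch, not an official tool: the function names `group_members` and `index_corpus`, and the folder and file names in the usage note, are hypothetical. It assumes only that article files sit at `year/journal/file.xml`, possibly under an extra top-level folder, and it does not read contents.xml, whose schema is not described here.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_members(names):
    """Group archive member paths as {year: {journal: [file names]}}."""
    index = defaultdict(lambda: defaultdict(list))
    for name in names:
        parts = PurePosixPath(name).parts
        # Keep only .../year/journal/article.xml entries. This skips
        # directory entries and a top-level contents.xml.
        if len(parts) < 3 or not name.lower().endswith(".xml"):
            continue
        # Assumption: the last three path components are year, journal, file.
        year, journal, filename = parts[-3], parts[-2], parts[-1]
        index[year][journal].append(filename)
    return index


def index_corpus(zip_path):
    """Build the year/journal index for a downloaded corpus .zip file."""
    with zipfile.ZipFile(zip_path) as archive:
        return group_members(archive.namelist())
```

For instance, `index_corpus("corpus.zip")["2011"]` would map each journal folder found under "2011" to the list of article files it contains, where "corpus.zip" stands in for whatever name the downloaded file has.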

The content of this .zip file is updated daily, and the XML files in the corpus adhere to the JATS 1.1 DTD. If you have questions about Hindawi’s XML corpus download, please contact help@hindawi.com.
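Because the files follow JATS, standard metadata can be read with an ordinary XML parser. The sketch below pulls the article title from the `article-title` element, which JATS places under `front/article-meta/title-group`. The function name `article_title` is hypothetical, and the code assumes un-namespaced JATS element names. Files that use named character entities defined only in the DTD may need additional handling.

```python
import xml.etree.ElementTree as ET


def article_title(xml_text):
    """Return the article title from a JATS document, or None if absent."""
    root = ET.fromstring(xml_text)
    node = root.find(".//article-meta/title-group/article-title")
    if node is None:
        return None
    # Titles may contain inline markup such as <italic>; join all text
    # fragments and collapse runs of whitespace.
    return " ".join("".join(node.itertext()).split())
```

The same pattern extends to other JATS metadata, such as `journal-title` or `pub-date`, by changing the path passed to `find`.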

Download the Hindawi Corpus
