XML-Based IUPAC Standard for Experimental and Critically Evaluated Thermodynamic Property Data Storage and Capture
One of the activities of IUPAC’s Committee on Printed and Electronic Publication is a project to develop an XML-based standard for thermodynamic data communications. On 29 January 2004, the project task group, led by Michael Frenkel (National Institute of Standards and Technology, Boulder, USA), met at ESDU International plc, London, U.K.
Task group meeting participants: front, from left: Prof. W.A. Wakeham, Dr. A.R.H. Goodwin, Dr. A.I. Johns; rear, from left: Dr. M. Satyro, Dr. D. Lide, Dr. M. Frenkel, Dr. M. Schmidt, Prof. K.N. Marsh, Dr. J.W. Magee, and Dr. J.H. Dymond.
Of the 10 attendees, about half represented the data-supplying side of the thermodynamic data delivery process (in particular, major journals in the field, data books, and various data compilation documents), while the other half represented the data-receiving side (databases and chemical engineering software applications). The data-supplying side was represented by Prof. Kenneth Marsh (editor in chief of the Journal of Chemical and Engineering Data), Dr. J.W. Magee (associate editor of the Journal of Chemical and Engineering Data), Dr. A.R.H. Goodwin (editor of the Journal of Chemical Thermodynamics), Dr. D. Lide (editor in chief of the CRC Handbook of Chemistry and Physics), and Dr. R. Craven (coordinator of various data evaluation projects within ESDU). The data-receiving side was represented by Drs. M. Satyro, N.I. Johns, and M. Schmidt, who are responsible in their respective organizations for the development of major chemical engineering software and database products.
Dr. Frenkel emphasized the need for an international standard for thermochemical and thermophysical data storage and exchange (Jan-Feb 2004 CI, in print, p. 17) and described ThermoML, an XML-based structure being developed as a practical solution to this problem. ThermoML covers essentially all experimentally determined thermodynamic and transport property data (more than 120 properties in total) for pure compounds, multicomponent mixtures, and chemical reactions. The framework of ThermoML has been published in J. Chem. Eng. Data, 48, 2–11 (2003). It has been validated using the NIST/TRC SOURCE data archival system for 9000 data sets from 7500 publications. An extension of ThermoML for describing various measures of uncertainty and precision of thermodynamic data has also been published (see J. Chem. Eng. Data, 48, 1344–1359 (2003)). The next stage, which is in progress, involves the incorporation of predicted data, critically evaluated data, and fitting equations.
ThermoML has already been established for global data communication through guided data capture of thermophysical and thermochemical data from papers accepted for publication in scientific journals, or in publications from bodies such as IUPAC, or from measurements in industry. These data can then be read, using the appropriate software, by data-user groups such as industrial chemical engineers or academic researchers.
The Journal of Chemical and Engineering Data was the first journal to adopt ThermoML as the format for the exchange and storage of thermophysical property data. When letters of acceptance are sent to authors, they are invited to submit data files to TRC (Thermodynamics Research Center) at NIST (National Institute of Standards and Technology). The process has been in place for one year, during which time the compliance of authors has risen from 3% to 92%. The data are available to authors at <www.trc.nist.gov>. The Journal of Chemical Thermodynamics is the second journal to take advantage of this data archival and electronic dissemination scheme (see J. Chem. Thermodyn., 36, iv). It is expected that Fluid Phase Equilibria, the International Journal of Thermophysics, and Thermochimica Acta will follow shortly.
Prof. K.N. Marsh said that in order to establish a commonly accepted protocol for meta- and numerical data submission by authors of original publications, there had to be the following:
- an agreed-upon standard format for particular property data
- a mechanism, and an incentive, for data producers to submit their data
- a body to accept and verify the data
- the means for authors to access their own and other authors’ data
- access to the archived data for other researchers and engineers
Marsh firmly believed that the NIST/TRC scheme met all these objectives, which was why it had been implemented with his journal. The process was described in an editorial (J. Chem. Eng. Data, 48, 1 (2003)), and authors were sent a note on electronic data submission if their data were suitable for guided data capture (GDC). Once a manuscript is accepted, the authors are asked to download the GDC software, enter the data, and submit the file to NIST/TRC. The data are then checked, the authors are told of any inconsistencies, and they have the opportunity to correct the data prior to publication. This has led to an improvement in the quality of the publications.
Dr. M. Satyro said that small and medium-sized companies had no access to experts in thermodynamics, and so relied on commercial process simulators. The problem with this was that databases oriented toward process simulation had often been developed piecemeal, and there was an artificial separation of pure-component and mixture property data. Furthermore, the data had not been evaluated for uncertainties. As a result, it was impossible to quantify the quality of the simulator results, and simulated flash engine behavior was often poorly documented. Satyro gave some examples of serious differences between simulated results and actual behavior. He asserted that testing the quality of different thermodynamic models requires reliable thermodynamic data, and that these data need to be readily available in a form that can be integrated into a user’s program. The ultimate aim is to have reliable estimates of the uncertainties in data so that full error propagation can be carried out.
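To make the idea of error propagation concrete, the following is a minimal sketch, not the method used by any simulator mentioned at the meeting. It assumes independent inputs with known standard uncertainties and applies first-order (linear) propagation with numerically estimated partial derivatives. The function name `propagate_uncertainty` and the ideal-gas example are illustrative assumptions.

```python
import math
from typing import Callable, Sequence


def propagate_uncertainty(
    f: Callable[..., float],
    values: Sequence[float],
    uncertainties: Sequence[float],
    rel_step: float = 1e-6,
) -> float:
    """Return the standard uncertainty of f(*values).

    Assumes the inputs are independent and that f is close to linear
    over the range of each input uncertainty (first-order propagation).
    """
    variance = 0.0
    for i, (x, u) in enumerate(zip(values, uncertainties)):
        h = rel_step * max(abs(x), 1.0)  # step for the central difference
        up = list(values)
        down = list(values)
        up[i] = x + h
        down[i] = x - h
        dfdx = (f(*up) - f(*down)) / (2.0 * h)  # numerical partial derivative
        variance += (dfdx * u) ** 2
    return math.sqrt(variance)


def ideal_gas_molar_volume(temperature_k: float, pressure_pa: float) -> float:
    """Illustrative model only: V = R*T/p for an ideal gas, in m^3/mol."""
    gas_constant = 8.314462618  # J/(mol K)
    return gas_constant * temperature_k / pressure_pa


if __name__ == "__main__":
    # Hypothetical inputs: T = 300 K +/- 0.1 K, p = 101325 Pa +/- 50 Pa.
    u_v = propagate_uncertainty(
        ideal_gas_molar_volume, [300.0, 101325.0], [0.1, 50.0]
    )
    print(f"standard uncertainty of molar volume: {u_v:.3e} m^3/mol")
```

A calculation like this is only possible when each input value arrives with an uncertainty estimate, which is why the speaker stressed evaluated uncertainties in the source data.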
Satyro said that having a standard format for thermophysical property data would greatly simplify the input of verified data, with uncertainty estimates, to simulation packages. This would result in significant savings in process plant costs by eliminating the need for over-design on the current scale. He informed attendees that Virtual Materials Group has developed ThermoML file-reader software that populates the database feeding a simulation engine directly from the Web-based ThermoML file-dissemination system supported by NIST/TRC.
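The reader-to-database path described above can be sketched roughly as follows. This is a hedged illustration only: the XML layout, the element and attribute names (`DataReport`, `DataSet`, `Point`), the table name, and the numbers are all simplified placeholders invented for this example. They are not taken from the ThermoML schema or from the Virtual Materials Group software, and the values are not measured data.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML document; NOT the real ThermoML schema.
SAMPLE = """<DataReport>
  <DataSet compound="water" property="density" unit="kg/m3">
    <Point T="298.15" value="997.05" uncertainty="0.02"/>
    <Point T="323.15" value="988.03" uncertainty="0.02"/>
  </DataSet>
</DataReport>"""


def read_points(xml_text: str) -> list:
    """Parse the illustrative XML into flat rows, one per data point."""
    root = ET.fromstring(xml_text)
    rows = []
    for data_set in root.iter("DataSet"):
        for point in data_set.iter("Point"):
            rows.append((
                data_set.get("compound"),
                data_set.get("property"),
                data_set.get("unit"),
                float(point.get("T")),
                float(point.get("value")),
                float(point.get("uncertainty")),
            ))
    return rows


def load_into_db(rows: list) -> sqlite3.Connection:
    """Store the rows in an in-memory SQLite table (a stand-in database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE property_data ("
        "compound TEXT, property TEXT, unit TEXT, "
        "temperature_k REAL, value REAL, uncertainty REAL)"
    )
    conn.executemany(
        "INSERT INTO property_data VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    connection = load_into_db(read_points(SAMPLE))
    query = "SELECT temperature_k, value, uncertainty FROM property_data"
    for row in connection.execute(query):
        print(row)
```

Keeping the uncertainty alongside each value in the database is the design point here: downstream simulation code can then weight or propagate it rather than treating every number as exact.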
Current and Future Work
The task group members accepted the developed framework of the proposed XML-based IUPAC standard and the dictionary developed to describe uncertainties. They also approved the proposal that the “ThermoML” namespace be reserved on the IUPAC Web site for the standard being developed within the project (subsequently, this has been implemented, see
- inclusion of predicted data, critically evaluated data, and coverage of fitting equations in ThermoML
- inclusion of the IUPAC Chemical Identifier (see description of the IUPAC project at
- expansion of ThermoML to include electrolytes and molten salts, with possible extension to polymers
- continued collaboration with authors, editors, and publishers, in particular with the participating journals J. Chem. Eng. Data and J. Chem. Thermodyn.; extension to include Fluid Phase Equilib., Thermochim. Acta, and Int. J. Thermophys.; and exploration of links to J. Phys. Chem., J. Chem. Phys., and J. Solution Chem.
- continued collaboration with user groups, including the chemical process design community
Members approved the following publication plans:
- ThermoML coverage of predicted and critically evaluated data and fitting equations
- a description of thermochemical data communication
- a recommendation of ThermoML as the XML-based IUPAC standard for experimental and critically evaluated thermodynamic property data storage and capture
The next task group meeting will take place 17–21 August 2004 during the IUPAC Conference on Chemical Thermodynamics in Beijing, China. A final meeting (venue to be arranged) will be held in December 2004.
For more information, contact the Task Group Chairman Michael Frenkel <email@example.com>.
Page last modified 2 July 2004.
Copyright © 2003-2004 International Union of Pure and Applied Chemistry.