Script hitran-scraper.py

usage: hitran-scraper.py [-h] [-t T] [M] [I] [llzero] [llfin]

Retrieves molecular lines from the HITRAN database [Gordon2016]

This script uses web scraping and the HAPI to save molecular lines from the HITRAN database locally.

While the HAPI provides the downloading facility, web scraping is used to get the lists of molecules
and isotopologues from the HITRAN webpages and get the IDs required to run the HAPI query.

The script is typically invoked several times, each time with an additional argument.

References:

[Gordon2016] I.E. Gordon, L.S. Rothman, C. Hill, R.V. Kochanov, Y. Tan, P.F. Bernath, M. Birk,
    V. Boudon, A. Campargue, K.V. Chance, B.J. Drouin, J.-M. Flaud, R.R. Gamache, J.T. Hodges,
    D. Jacquemart, V.I. Perevalov, A. Perrin, K.P. Shine, M.-A.H. Smith, J. Tennyson, G.C. Toon,
    H. Tran, V.G. Tyuterev, A. Barbe, A.G. Császár, V.M. Devi, T. Furtenbacher, J.J. Harrison,
    J.-M. Hartmann, A. Jolly, T.J. Johnson, T. Karman, I. Kleiner, A.A. Kyuberis, J. Loos,
    O.M. Lyulin, S.T. Massie, S.N. Mikhailenko, N. Moazzen-Ahmadi, H.S.P. Müller, O.V. Naumenko,
    A.V. Nikitin, O.L. Polyansky, M. Rey, M. Rotger, S.W. Sharpe, K. Sung, E. Starikova,
    S.A. Tashkun, J. Vander Auwera, G. Wagner, J. Wilzewski, P. Wcisło, S. Yu, E.J. Zak,
    The HITRAN2016 Molecular Spectroscopic Database, J. Quant. Spectrosc. Radiat. Transf. (2017).
    doi:10.1016/j.jqsrt.2017.06.038.

positional arguments:
  M           HITRAN molecule number (default: (lists molecules))
  I           HITRAN isotopologue number (not unique, starts over at each
              molecule) (default: (lists isotopologues))
  llzero      Initial wavelength (Angstrom) (default: None)
  llfin       Final wavelength (Angstrom) (default: None)

optional arguments:
  -h, --help  show this help message and exit
  -t T        Table Name (default: (molecular formula))

This script belongs to package pyfant
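
The incremental invocation pattern described above (no arguments, then M, then M I llzero llfin) can be sketched with argparse. This is a hypothetical reconstruction based only on the usage line; it is not the script's actual source. The helper names build_parser and describe_action are assumptions, and so is the handling of a partially specified wavelength interval, which the help text does not document.

```python
import argparse


def build_parser():
    """Build a parser mirroring: hitran-scraper.py [-h] [-t T] [M] [I] [llzero] [llfin]."""
    parser = argparse.ArgumentParser(prog="hitran-scraper.py")
    parser.add_argument("-t", dest="T", default=None, help="Table name")
    # Every positional is optional (nargs="?"), so each call may add one more argument.
    parser.add_argument("M", type=int, nargs="?", help="HITRAN molecule number")
    parser.add_argument("I", type=int, nargs="?", help="HITRAN isotopologue number")
    parser.add_argument("llzero", type=float, nargs="?", help="Initial wavelength (Angstrom)")
    parser.add_argument("llfin", type=float, nargs="?", help="Final wavelength (Angstrom)")
    return parser


def describe_action(args):
    """Return a label for what a given argument combination asks for (illustrative only)."""
    if args.M is None:
        return "list molecules"
    if args.I is None:
        return "list isotopologues"
    if args.llzero is None or args.llfin is None:
        # Assumption: the behaviour for an incomplete interval is not documented.
        return "interval incomplete"
    return "download lines"


if __name__ == "__main__":
    for argv in ([], ["13"], ["13", "1", "3000", "7000"]):
        print(argv, "->", describe_action(build_parser().parse_args(argv)))
```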

Usage examples

$ hitran-scraper.py

List of all HITRAN molecules
============================

  ID  Formula    Name
----  ---------  --------------------
   1  H2O        Water
   2  CO2        Carbon Dioxide
   3  O3         Ozone
   4  N2O        Nitrous Oxide
   5  CO         Carbon Monoxide
   6  CH4        Methane
   7  O2         Molecular Oxygen
   8  NO         Nitric Oxide
   9  SO2        Sulfur Dioxide
  10  NO2        Nitrogen Dioxide
  11  NH3        Ammonia
  12  HNO3       Nitric Acid
  13  OH         Hydroxyl Radical
  14  HF         Hydrogen Fluoride
  15  HCl        Hydrogen Chloride
  16  HBr        Hydrogen Bromide
  17  HI         Hydrogen Iodide
  18  ClO        Chlorine Monoxide
  19  OCS        Carbonyl Sulfide
  20  H2CO       Formaldehyde
  21  HOCl       Hypochlorous Acid
  22  N2         Molecular Nitrogen
  23  HCN        Hydrogen Cyanide
  24  CH3Cl      Methyl Chloride
  25  H2O2       Hydrogen Peroxide
  26  C2H2       Acetylene
  27  C2H6       Ethane
  28  PH3        Phosphine
  29  COF2       Carbonyl Fluoride
  31  H2S        Hydrogen Sulfide
  32  HCOOH      Formic Acid
  33  HO2        Hydroperoxyl Radical
  34  O          Oxygen Atom
  36  NO+        Nitric Oxide Cation
  37  HOBr       Hypobromous Acid
  38  C2H4       Ethylene
  39  CH3OH      Methanol
  40  CH3Br      Methyl Bromide
  41  CH3CN      Methyl Cyanide
  43  C4H2       Diacetylene
  44  HC3N       Cyanoacetylene
  45  H2         Molecular Hydrogen
  46  CS         Carbon Monosulfide
  47  SO3        Sulfur trioxide

Now, to list isotopologues for a given molecule, please type:

    hitran-scraper.py <molecule ID>

where <molecule ID> is one of the IDs listed above.

Now suppose we want the OH molecule:

$ hitran-scraper.py 13

List of all isotopologues for molecule 'OH' (Hydroxyl Radical)
==============================================================

m_formula      ID    ID_molecule  Formula      AFGL_Code  Abundance
-----------  ----  -------------  ---------  -----------  ---------------
OH              1             13  (16)OH              61  0.997473
OH              2             13  (18)OH              81  0.002
OH              3             13  (16)OD              62  1.553710 × 10-4


Now, to download lines, please type:

    hitran-scraper.py 13 <isotopologue ID> <llzero> <llfin>

where <isotopologue ID> is one the numbers from the 'ID' column above,

and [<llzero>, <llfin>] defines the wavelength interval in Angstrom.

Now select the first isotopologue and specify a wavelength range of 3000 to 7000 Angstrom (near-ultraviolet to red):

$ hitran-scraper.py 13 1 3000 7000

Isotopologue selected:
======================

Field name    Value
------------  --------
m_formula     OH
ID            1
ID_molecule   13
Formula       (16)OH
AFGL_Code     61
Abundance     0.997473

Wavelength interval (air): [3000.0, 7000.0] Angstrom
Wavenumber interval (vacuum): [14289.61969369552, 33342.42546386186] cm**-1
Table name: '(16)OH'

Fetching data...
===
=== BEGIN messages from HITRAN API ===
===
BEGIN DOWNLOAD: (16)OH
  65536 bytes written to ./(16)OH.data
  65536 bytes written to ./(16)OH.data
  65536 bytes written to ./(16)OH.data
  65536 bytes written to ./(16)OH.data
  65536 bytes written to ./(16)OH.data
  65536 bytes written to ./(16)OH.data
  65536 bytes written to ./(16)OH.data
  65536 bytes written to ./(16)OH.data
  65536 bytes written to ./(16)OH.data
  65536 bytes written to ./(16)OH.data
Header written to ./(16)OH.header
END DOWNLOAD
                     Lines parsed: 3855
PROCESSED
===
=== END messages from HITRAN API ===
===
...done
Please check files '(16)OH.header', '(16)OH.data'
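
The output above reports the requested interval both in Angstrom and in cm**-1. As a rough illustration of how the two relate, the sketch below converts wavelength to wavenumber with the plain relation wavenumber = 1e8 / wavelength. The values printed by the script (about 14289.6 and 33342.4 cm**-1) differ slightly from the ones computed here because the script also converts between air and vacuum, a step this sketch deliberately omits. The function names are hypothetical and do not come from the script.

```python
def angstrom_to_wavenumber(wavelength_angstrom):
    """Convert a wavelength in Angstrom to a wavenumber in cm**-1 (no air/vacuum correction)."""
    # 1 cm = 1e8 Angstrom, so wavenumber [cm**-1] = 1e8 / wavelength [Angstrom].
    return 1.0e8 / wavelength_angstrom


def wavenumber_interval(llzero, llfin):
    """Return [lowest, highest] wavenumber for the wavelength interval [llzero, llfin]."""
    # Longer wavelengths map to smaller wavenumbers, so the bounds swap order.
    return sorted((angstrom_to_wavenumber(llzero), angstrom_to_wavenumber(llfin)))


if __name__ == "__main__":
    print(wavenumber_interval(3000.0, 7000.0))
```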

Quick note on the HITRAN API (HAPI)

The files created (‘(16)OH.header’, ‘(16)OH.data’) can be opened using the HAPI. They are also read by the application convmol.py.

The HAPI can be downloaded separately, but a copy is also bundled with the f311 package. The following example shows how the HITRAN data can be accessed from the Python console:

>>> from f311 import hapi
>>> hapi.loadCache()
Using .
(16)OH
                     Lines parsed: 3855
>>> oh_data = hapi.LOCAL_TABLE_CACHE["(16)OH"]
>>> oh_data.keys()
dict_keys(['data', 'header'])
>>> oh_data["data"].keys()
dict_keys(['ierr', 'gpp', 'molec_id', 'global_lower_quanta', 'sw', 'gamma_self', 'n_air', 'elower', 'line_mixing_flag', 'local_lower_quanta', 'gp', 'global_upper_quanta', 'gamma_air', 'local_upper_quanta', 'iref', 'local_iso_id', 'delta_air', 'nu', 'a'])
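
As a minimal sketch of working with the structure shown above, the function below takes a column dictionary shaped like oh_data["data"] and returns the strongest lines, using the 'nu' (wavenumber, cm**-1) and 'sw' (line intensity) columns. It assumes each column is a sequence of equal length. The function name strongest_lines is hypothetical, and the sample values are synthetic placeholders, not real HITRAN data; in practice you would pass oh_data["data"] instead.

```python
def strongest_lines(columns, n=3):
    """Return up to n (wavelength_angstrom, nu, sw) tuples, strongest line first."""
    # Pair each wavenumber with its intensity, then rank by intensity (descending).
    rows = zip(columns["nu"], columns["sw"])
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)[:n]
    # Vacuum wavelength in Angstrom is 1e8 / wavenumber in cm**-1.
    return [(1.0e8 / nu, nu, sw) for nu, sw in ranked]


if __name__ == "__main__":
    # Synthetic stand-in for oh_data["data"]; the values are illustrative only.
    sample = {
        "nu": [15000.0, 20000.0, 25000.0],
        "sw": [1.0e-25, 3.0e-24, 2.0e-26],
    }
    for wavelength, nu, sw in strongest_lines(sample, n=2):
        print(wavelength, nu, sw)
```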

To work with these data in your own code, consult the HAPI source code and manual; the library is thoroughly documented.

Within f311, the function f311.convmol.conv_hitran.hitran_to_sols() shows an example of how HITRAN data are used.