Work with obsSpace#

JEDI or GSI generates observation space diagnostic files, which contains original observation information, hofx (H(x), i.e. model counter-parts at the observataion locations), OMB (observation minus background), OMA as well as other information.

This notebook covers how to deal with JEDI diganostic files (which are also called output ioda files). We call them jdiag files.

import packages and define variables#

%%time 

# autoload external python modules if they changed
%load_ext autoreload
%autoreload 2

import os, sys
pyDAmonitor_ROOT=os.getenv("pyDAmonitor_ROOT")
if pyDAmonitor_ROOT is None:
    print("!!! pyDAmonitor_ROOT is NOT set. Run `source ush/load_pyDAmonitor.sh`")
else:
    print(f"pyDAmonitor_ROOT={pyDAmonitor_ROOT}\n")
sys.path.insert(0, pyDAmonitor_ROOT)
    
# import modules
import warnings
import math
import numpy as np
import uxarray as ux
import xarray as xr
import pandas as pd
import seaborn as sns  # seaborn handles NaN values automatically
import matplotlib.pyplot as plt
from netCDF4 import Dataset
from DAmonitor.base import query_dataset, query_data, query_obj, to_dataframe, aircraft_reject_list_to_df

#jdiag_file=f"{pyDAmonitor_ROOT}/data/ioda/ioda_aircar.2024050600.nc"
jdiag_file=f"{pyDAmonitor_ROOT}/data/mpasjedi/jdiag_aircar_t133.nc"
pyDAmonitor_ROOT=/home/Guoqing.Ge/pyDAmonitor
CPU times: user 3.27 s, sys: 1.05 s, total: 4.32 s
Wall time: 5.01 s

Use the obsSpace class to read jdiag files#

from DAmonitor.obs import obsSpace, fit_rate
# create a t133 object using the obsSpace class
t133=obsSpace(jdiag_file)
# check the t133 object
query_obj(t133)
['bt',
 'ds',
 'filepath',
 'groups',
 'locations',
 'metadata',
 'nlocs',
 'q',
 't',
 'u',
 'v']
np.set_printoptions(threshold=np.inf)  # print out all array values
# print(t133.t.aircraftFlightNumber)
# print(t133.t.stationIdentification)
print(t133.t.stationIdentification[10])
print(t133.t.aircraftFlightNumber[10])
#
np.set_printoptions(threshold=500) # don't print out all array values
print(t133.t.ObsValue)
KKJ1Z4BA
R33RPDYQ
[218.15 223.15 250.45 ... 224.15 222.15 224.15]
# query data
query_data(t133.q)  # since the t133 object does not contain q observations, so it will display meta information
stationIdentification, aircraftFlightPhase, timeOffset, prepbufrDataLvlCat, longitude, stationElevation, latitude, prepbufrReportType, pressure, dateTime, height, dumpReportType, aircraftFlightNumber, ObsError, ObsType, ObsValue, QualityMarker, Location
# query data
query_data(t133.t)
EffectiveError0, EffectiveError1, EffectiveError2, EffectiveQC0, EffectiveQC1, EffectiveQC2, stationIdentification, aircraftFlightPhase, timeOffset, prepbufrDataLvlCat, longitude, stationElevation, latitude, prepbufrReportType, pressure, dateTime, height, dumpReportType, aircraftFlightNumber, ObsBias0, ObsBias1, ObsBias2, ObsError, ObsType, ObsValue, QualityMarker, TempObsErrorData, hofx0, hofx1, hofx2, innov1, oman, ombg, Location

The above output shows that we reorganize the data structure based on the observation variable, i.e. t, q, uv, bt
We can access values easily using popular Python class strucutre, i.e. t133.t, t133.t.ObsValue, etc

Convert to Pandas DataFrame#

Converting jdiag data into Pandas DataFrame brings up lots of benefits and we can use utilize lots of mature DataFrames capabilities.
We can see this from the following cell.

pd.set_option('display.max_columns', None)  # show all dataframe columns
df = to_dataframe(t133.t)
df[df["QualityMarker"]>=12]  # filter the data frame
#df
EffectiveError0 EffectiveError1 EffectiveError2 EffectiveQC0 EffectiveQC1 EffectiveQC2 stationIdentification aircraftFlightPhase timeOffset prepbufrDataLvlCat longitude stationElevation latitude prepbufrReportType pressure dateTime height dumpReportType aircraftFlightNumber ObsBias0 ObsBias1 ObsBias2 ObsError ObsType ObsValue QualityMarker TempObsErrorData hofx0 hofx1 hofx2 innov1 oman ombg Location

Read the aircraft reject list from “current_bad_aircraft.txt”#

filepath=f"{pyDAmonitor_ROOT}/data/current_bad_aircraft.txt"
dflist = aircraft_reject_list_to_df(filepath)
dflist
tailID flagT flagW flagR FSL MDCRS N bs_T std_T bs_S std_S bs_D std_D bs_W std_W bs_RH std_RH fail_reason
0 00000320 T W - 26490 -------- 470 11.3 2.8 0.8 2.7 -8 11 1.9 4.0 0.0 0.0 Unknown(bias_Tstd_Tbias_DIR)
1 00000400 T W R 0 -------- 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 UND_Piper(resrch_T_W_R)
2 00000450 T W R 9086 -------- 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 UND_Piper(resrch_T_W_Rresrch_T_W_R)
3 00000451 T W R 9148 -------- 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 King_Air(resrch_T_W_RU_Wyo_T_W_R)
4 00000499 T W R 0 -------- 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 Unknown(TAM_FS_T_W_R)
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
278 N988DL - W - 250 C5M13RZA 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 MD88(md88_W)
279 N989DL - W - 251 ENXL3TBA 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 MD88(md88_W)
280 N990AK T - - 31354 VM3LX3RA 924 -0.4 15.6 0.3 2.1 -1 7 1.6 3.0 0.0 0.0 Unknown(std_T)
281 OE-LPL - W - 32404 UAV15PJA 310 0.9 0.9 1.1 2.2 18 63 18.1 21.8 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
282 OE-LPM - W - 32579 BGWUR1ZA 269 0.2 1.0 1.0 2.3 11 65 17.5 21.0 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)

283 rows × 18 columns

check if any tailID is in the jdiag stationIdentification field#

# all 283 tailIDs are NOT found in jdiag `stationIdentification` fields
mask = dflist["tailID"].isin(df["stationIdentification"])
print(len(dflist[mask]))
0
# this confirms all 283 rows are NOT found in jdiag `stationIdentification` fields
mask = ~dflist["tailID"].isin(df["stationIdentification"])
dflist[mask]
tailID flagT flagW flagR FSL MDCRS N bs_T std_T bs_S std_S bs_D std_D bs_W std_W bs_RH std_RH fail_reason
0 00000320 T W - 26490 -------- 470 11.3 2.8 0.8 2.7 -8 11 1.9 4.0 0.0 0.0 Unknown(bias_Tstd_Tbias_DIR)
1 00000400 T W R 0 -------- 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 UND_Piper(resrch_T_W_R)
2 00000450 T W R 9086 -------- 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 UND_Piper(resrch_T_W_Rresrch_T_W_R)
3 00000451 T W R 9148 -------- 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 King_Air(resrch_T_W_RU_Wyo_T_W_R)
4 00000499 T W R 0 -------- 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 Unknown(TAM_FS_T_W_R)
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
278 N988DL - W - 250 C5M13RZA 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 MD88(md88_W)
279 N989DL - W - 251 ENXL3TBA 0 0.0 0.0 0.0 0.0 0 0 0.0 0.0 0.0 0.0 MD88(md88_W)
280 N990AK T - - 31354 VM3LX3RA 924 -0.4 15.6 0.3 2.1 -1 7 1.6 3.0 0.0 0.0 Unknown(std_T)
281 OE-LPL - W - 32404 UAV15PJA 310 0.9 0.9 1.1 2.2 18 63 18.1 21.8 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
282 OE-LPM - W - 32579 BGWUR1ZA 269 0.2 1.0 1.0 2.3 11 65 17.5 21.0 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)

283 rows × 18 columns

check if any MDCRS is in the jdiag stationIdentification fields#

mask = dflist["MDCRS"].isin(df["stationIdentification"])
print(len(dflist[mask]))
dflist[mask]
53
tailID flagT flagW flagR FSL MDCRS N bs_T std_T bs_S std_S bs_D std_D bs_W std_W bs_RH std_RH fail_reason
28 N12005 - W - 23986 X0ZRVXRA 410 0.3 0.9 0.7 2.5 19 61 20.6 24.2 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
29 N12006 - W - 24180 GN0RPDJA 334 0.9 0.9 0.5 2.3 8 37 13.2 14.6 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
31 N12012 - W - 26076 LO5SVXJA 401 0.8 1.0 0.6 2.2 6 59 19.5 22.4 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
34 N12116 T - - 5159 SSQRPMJA 482 2.8 1.2 0.9 3.6 -1 8 2.0 4.9 0.0 0.0 Unknown(bias_T)
37 N14011 - W - 25904 5YSSATRA 554 0.6 0.9 0.5 2.3 4 57 15.8 18.4 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
38 N14016 - W - 29764 JS5BPMRA 390 0.8 1.0 0.3 2.5 9 55 15.1 17.6 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
43 N17002 - W - 23457 WSRSVRZA 280 0.5 1.2 0.4 2.2 0 51 13.5 15.6 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
45 N17017 - W - 29260 VBVTBYJA 501 1.1 1.0 0.8 2.5 1 61 14.4 17.3 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
46 N17963 - W - 13118 CKHSYBBA 242 1.1 0.9 0.5 2.2 4 59 14.5 17.1 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
48 N19986 - W - 27637 01CBZMBA 396 0.6 0.9 0.4 2.4 -5 52 11.6 13.8 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
50 N211WN - - R 11131 ZS4UOGBA 3228 0.1 0.8 0.3 1.8 -1 7 1.4 2.6 12.6 25.0 Unknown(bias_RHstd_RH)
51 N213WN - - R 11076 M5OEOGJA 3576 -0.2 0.7 0.3 1.7 -2 8 1.3 2.5 0.2 45.4 Unknown(std_RH)
54 N236WN - - R 11098 VXZUOFRA 3113 -0.1 0.7 0.4 1.7 -1 6 1.3 2.5 -52.1 21.9 Unknown(bias_RHstd_RH)
66 N267WN - - R 11368 US5EOFJA 3244 -0.2 0.8 0.3 1.7 -0 6 1.2 2.4 58.4 89.2 Unknown(bias_RHstd_RH)
78 N27901 - W - 11326 1NISBMJA 389 1.5 1.0 1.0 3.1 6 52 11.1 13.3 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
80 N27964 - W - 13170 3N5BFFBA 382 1.1 1.0 0.8 2.4 -6 64 13.1 16.0 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
84 N281WN - - R 11419 5EVEOGBA 3039 0.0 0.7 0.2 1.6 -0 6 1.2 2.2 57.2 85.6 Unknown(bias_RHstd_RH)
90 N28912 - W - 11984 4B1SV4JA 564 0.2 0.8 1.2 2.3 10 58 14.1 16.5 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
91 N28987 - W - 27599 MWFDFYRA 542 0.0 1.1 0.6 2.3 4 68 11.5 14.4 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
96 N29907 - W - 11481 GYOTA4ZA 484 0.3 0.8 0.6 2.5 10 52 12.6 14.7 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
100 N30913 - W - 12237 XNJCPKZA 481 0.2 0.8 0.7 2.5 8 47 14.8 16.7 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
109 N394UP T - - 26003 JEWWUUJA 645 3.4 1.0 -0.4 2.2 2 10 1.7 3.3 0.0 0.0 Unknown(bias_T)
118 N410UP - - R 701 PPJGU1JA 2154 1.3 0.9 0.5 2.3 -1 7 1.6 3.1 52.1 91.0 Unknown(bias_RHstd_RH)
123 N438WN - - R 11165 DVPUOFZA 2866 -0.0 0.8 0.2 1.6 -0 7 1.3 2.5 -11.0 65.6 Unknown(bias_RHstd_RH)
124 N442WN - - R 11179 ZGUUOGRA 3035 -0.2 0.8 0.4 1.9 -1 7 1.4 2.7 59.1 90.3 Unknown(bias_RHstd_RH)
128 N45905 - W - 11430 PIKRV4RA 420 0.8 0.9 0.8 2.2 17 53 19.7 22.4 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
129 N459WN - - R 10804 CGCEOHBA 3247 -0.0 0.7 0.2 1.6 -0 7 1.2 2.3 42.7 75.0 Unknown(bias_RHstd_RH)
135 N6715C T - - 2878 550IZAZA 239 2.3 0.9 0.8 2.8 -1 10 1.8 3.9 0.0 0.0 Unknown(bias_T)
136 N684UA T - - 24706 2G1IR4ZA 630 -5.2 6.0 0.3 3.7 -2 8 2.4 4.8 0.0 0.0 Unknown(bias_Tstd_T)
149 N801AC - W - 12328 VMUIZ4BA 416 1.3 0.8 0.3 2.4 11 45 16.5 18.3 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
150 N802AN - W - 12501 EPSEILRA 214 0.9 0.8 0.7 2.7 13 45 10.5 12.2 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
152 N804AN - W - 12482 0OEUILJA 436 0.7 0.8 0.6 2.5 11 45 11.9 13.7 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
153 N805AN - W - 12487 YX5EILZA 405 0.8 0.9 1.0 2.2 18 60 16.0 18.9 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
154 N806AA - W - 12575 WITICOZA 289 0.9 0.8 0.9 2.5 17 43 15.2 17.0 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
156 N806NW T - - 10675 AFIHLIJA 360 -2.6 4.3 0.9 2.6 -0 5 2.1 3.6 0.0 0.0 Unknown(bias_Tstd_T)
163 N814AA - W - 13179 LNWICORA 488 0.8 0.8 0.9 2.5 11 52 12.5 14.9 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
164 N815AA - W - 13251 J4RYCOJA 545 -0.2 0.9 0.6 2.2 12 46 15.1 16.9 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
165 N817AN - W - 13837 KCWEIERA 409 1.0 0.7 0.6 2.6 14 45 14.0 15.9 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
170 N823AN - W - 13667 BOZEILJA 399 -0.2 0.8 0.9 2.3 13 46 17.7 19.7 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
171 N824AN - W - 13693 12KEILJA 403 0.3 1.0 0.7 2.4 -1 40 11.5 12.9 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
176 N8310C - - R 12692 SQXIOYZA 2706 -0.1 0.6 0.3 1.7 -1 6 1.3 2.5 23.5 90.8 Unknown(bias_RHstd_RH)
178 N832AA - W - 18533 1AQICMZA 338 1.1 0.9 0.9 2.5 13 41 13.4 15.1 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
188 N870AX - W - 26291 JTFX3UJA 236 0.9 0.8 0.8 2.4 4 28 9.3 10.4 0.0 0.0 Unknown(std_Wrms_W)
191 N873BB - W - 27881 1FZYW3BA 248 0.1 0.9 0.8 2.2 4 40 9.5 10.8 0.0 0.0 Unknown(std_DIRstd_Wrms_W)
192 N874AN - W - 29211 BTXEILZA 335 1.4 0.7 0.8 2.7 8 31 9.8 11.1 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
193 N8755L T - - 28364 BNJLYXJA 2794 1.3 3.1 0.2 1.9 -1 6 1.4 2.7 0.0 0.0 Unknown(std_T)
200 N881BK - W - 29385 JJ3LILRA 479 1.0 0.9 0.5 2.4 10 54 14.1 16.2 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
201 N882BL - W - 30541 CNZ13PZA 514 0.8 0.9 0.8 2.5 16 44 13.0 14.8 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
202 N8838Q T - - 29942 PAKWX3RA 2987 -3.1 0.9 0.3 2.1 -1 7 1.5 3.1 0.0 0.0 Unknown(bias_T)
204 N884AA - W - 29680 2IZYCORA 248 0.6 0.8 0.4 2.5 21 68 22.8 26.8 0.0 0.0 Unknown(bias_DIRstd_DIRstd_Wrms_W)
206 N8863Q T - - 30549 0XKWXBZA 2333 0.3 2.3 0.3 1.9 -0 7 1.5 2.9 0.0 0.0 Unknown(std_T)
241 N944WN - - R 11935 NFFUOGZA 2948 -0.2 0.7 0.2 1.6 -1 6 1.2 2.3 -24.2 14.9 Unknown(bias_RH)
243 N945WN - - R 10980 XU4UOFBA 2798 0.1 0.7 0.2 1.6 -0 7 1.2 2.3 -25.5 14.5 Unknown(bias_RH)

check if any tailID or MDCRS is in the jdiag stationIdentification fields#

mask = dflist["tailID"].isin(df["aircraftFlightNumber"])
print(len(dflist[mask]))
mask = dflist["MDCRS"].isin(df["aircraftFlightNumber"])
print(len(dflist[mask]))
0
0

Plot hisgrams#

example 1#

plt.figure(figsize=(8, 5))
#sns.histplot(df["oman"], bins=50, kde=True, color="steelblue")
sns.histplot(t133.t.oman, bins=100, kde=True, color="steelblue")
plt.title("Histogram of oman")
plt.xlabel("oman values")
plt.ylabel("Density")
plt.tight_layout()
plt.show()
../_images/dde8c826a6fca515f181af7bb6ccfebc30d78e129408806f4bbb60b1b77031b4.png

example 2#

df_long = df[["oman", "ombg"]].melt(var_name="variable", value_name="value")

plt.figure(figsize=(8, 5))
sns.histplot(data=df_long, x="value", hue="variable", bins=50, kde=True, element="step", stat="count")
plt.title("Overlayed Histogram: oman vs ombg")
plt.tight_layout()
plt.show()
../_images/e19715ac050b305d9919dd8436db43308ac956d42bdb05346871b740e4a25525.png

example 3#

plt.figure(figsize=(8, 5))
sns.histplot(df["oman"], bins=100, kde=True, color="blue", label="oman", multiple="layer")
sns.histplot(df["ombg"], bins=100, kde=True, color="red", label="ombg", multiple="layer")

plt.title("Overlayed Histogram: oman vs ombg")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.legend()
plt.tight_layout()
plt.show()
../_images/b11fe3ff500439f7e710e4795c10f7213a460bd502b5f224d3436e5ebc1e6815.png

plot fitting rate#

The rate fitting to observtions is an important metric to evaluate data assimilation performance.
We don’t want a small fitting rate, which means very little observation impacts on the model forecast.
We don’t want a large fitting rate either, which means we fit too close to the obervations and the model forecast will crash. Ususally, a fitting rate of 20%~30% is expected.

# Filter valid data (both 'oman' and 'ombg' are not NaN)
valid_df = df[df["oman"].notna() & df["ombg"].notna()].copy()
valid_df = valid_df.dropna(subset=["height"])  # removes any rows in valid_df where height is missing (NaN)
# print(valid_df[valid_df["height"] < 0]["height"])   # negative height
dz = 1000
grouped = fit_rate(t133.t, dz=dz)

# 5. Plot vertical profile of fit_rate vs height
plt.figure(figsize=(7, 6))
plt.plot(grouped["fit_rate"], grouped["height_bin"], marker="o", color="blue")
# plt.axvline(x=0, color="gray", linestyle="--")  # ax vertical line

plt.xlabel("Fit Rate (%)")  # label change
plt.gca().xaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'{x*100:.0f}%'))  # format as %
plt.ylabel("Height Bin (m)")
plt.title("Vertical Profile of Fit Rate")

# Fine-tune ticks
plt.xticks(np.arange(0, 0.25, 0.05))  #, fontsize=12)
plt.yticks(np.arange(0, 13000, dz))  #, , fontsize=12)
# Add minor ticks
from matplotlib.ticker import AutoMinorLocator
plt.gca().xaxis.set_minor_locator(AutoMinorLocator())
plt.gca().yaxis.set_minor_locator(AutoMinorLocator())
# plt.grid(which='both', linestyle='--', linewidth=0.5)
plt.grid(True)

plt.ylim(0, 13000)  # set y-axis from 0 (bottom) to 13,000 (top)
plt.tight_layout()
plt.show()
OMA: bias=0.1127 rms=0.9431
OMB: bias=0.1390 rms=1.0462
Overall fit_rate: 9.8576%
0, -1000, 1.7179, 1.7704, 2.9680%
1, 0, 1.3425, 1.4839, 9.5256%
2, 1000, 0.8929, 0.9891, 9.7274%
3, 2000, 0.9511, 1.0712, 11.2078%
4, 3000, 0.8140, 0.9379, 13.2102%
5, 4000, 0.7072, 0.8306, 14.8529%
6, 5000, 0.7251, 0.8159, 11.1235%
7, 6000, 0.7037, 0.7727, 8.9264%
8, 7000, 0.7456, 0.8361, 10.8245%
9, 8000, 0.7729, 0.8456, 8.5992%
10, 9000, 0.7673, 0.8331, 7.9049%
11, 10000, 0.8056, 0.8697, 7.3711%
12, 11000, 1.1992, 1.3272, 9.6471%
13, 12000, 1.3538, 1.4615, 7.3744%
../_images/3a64bf15ce0ee9504e9a8fdba176a0aa3325f7cdf727510a4c26b1448234c58b.png
print(grouped["height_bin"])
0     -1000
1         0
2      1000
3      2000
4      3000
5      4000
6      5000
7      6000
8      7000
9      8000
10     9000
11    10000
12    11000
13    12000
Name: height_bin, dtype: int32

plot satellite observations#

load satellite data using obsSpace#

%%time

cris_file = f"{pyDAmonitor_ROOT}/data/mpasjedi/jdiag_cris-fsr_n20.nc"
obsCris = obsSpace(cris_file)
CPU times: user 3.19 s, sys: 7.45 s, total: 10.6 s
Wall time: 36.8 s
query_data(obsCris.bt, meta_exclude="sensorCentralWavenumber_") # removing the meta_exclude paramter will output all sensorCentralWavenumber_* information
DerivedObsError, DerivedObsValue, CloudDetectMinResidualIR, NearSSTRetCheckIR, ScanEdgeRemoval, GrossCheck, Thinning, UseflagCheck, EffectiveError0, EffectiveError1, EffectiveError2, EffectiveQC0, EffectiveQC1, EffectiveQC2, sensorZenithAngle, satelliteIdentifier, sensorScanPosition, instrumentIdentifier, longitude, latitude, sensorViewAngle, scanLineNumber, solarZenithAngle, sensorChannelNumber, sensorAzimuthAngle, fieldOfViewNumber, dateTime, solarAzimuthAngle, ObsBias0, ObsBias1, ObsBias2, constantPredictor, emissivityJacobianPredictor, hofx0, hofx1, hofx2, innov1, lapseRatePredictor, lapseRate_order_2Predictor, oman, ombg, sensorScanAnglePredictor, sensorScanAngle_order_2Predictor, sensorScanAngle_order_3Predictor, sensorScanAngle_order_4Predictor, Channel, Location
print(obsCris.bt.hofx0)
[[-- -- -- ... -- -- --]
 [222.63551330566406 224.9462127685547 222.70394897460938 ...
  291.0711669921875 291.02618408203125 290.8084411621094]
 [-- -- -- ... -- -- --]
 ...
 [-- -- -- ... -- -- --]
 [-- -- -- ... -- -- --]
 [-- -- -- ... -- -- --]]

assemble target obs array#

ncount=0
idx = []
idx2 = []
ch=61
for n in np.arange(len(obsCris.bt.ombg[:,ch])):
    #if obsCris.bt.CloudDetectMinResidualIR[n,ch] == 1: 
     if obsCris.bt.ombg[n,ch] > -200 and obsCris.bt.ombg[n,ch] < 200:
       idx.append(n)
       ncount = ncount + 1 

lat=obsCris.bt.latitude[idx]
lon=obsCris.bt.longitude[idx]
obarray=obsCris.bt.DerivedObsValue[idx,ch]
print(lon,lat,obarray)
print(ncount)
[-127.40092 -129.51625 -131.27661 ... -132.82477 -135.50513 -137.78625] [24.14625 24.23062 24.33297 ... 47.29415 47.29884 47.30563] [233.91612 235.93317 237.10016 ... 232.89937 233.92868 234.85745]
9521

prepare coloar map#

datmi = np.nanmin(obarray)  # Min of the data
datma = np.nanmax(obarray)  # Max of the data

if np.nanmin(obarray) < 0:
  cmax = datma
  cmin = datmi
  cmax=310
  cmin=200
  #cmax=1.0
  #cmin=-1.0
  cmap = 'RdBu'
else:
  #cmax = omean+stdev
  #cmin = np.maximum(omean-stdev, 0.0)
  cma = datma
  cmin = datmi
  cmax=310
  cmin=200
  #cmax=1.0
  #cmin=-1.0
  cmap = 'RdBu'
  cmap = 'viridis'
  cmap = 'jet'


cmin = 200.
cmax = 310.
conus_12km = [-150, -50, 15, 55]

color_map = plt.cm.get_cmap(cmap)
reversed_color_map = color_map.reversed()
units = 'K'
#units = '%'
fig = plt.figure(figsize=(10, 5))
/tmp/ipykernel_4082298/1360930195.py:30: MatplotlibDeprecationWarning: The get_cmap function was deprecated in Matplotlib 3.7 and will be removed in 3.11. Use ``matplotlib.colormaps[name]`` or ``matplotlib.colormaps.get_cmap()`` or ``pyplot.get_cmap()`` instead.
  color_map = plt.cm.get_cmap(cmap)
<Figure size 1000x500 with 0 Axes>

make the plot#

import cartopy.crs as ccrs
import matplotlib.ticker as mticker
import matplotlib.colors as colors

ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=0))

# Plot grid lines
# ----------------
gl = ax.gridlines(crs=ccrs.PlateCarree(central_longitude=0), draw_labels=True,
                  linewidth=1, color='gray', alpha=0.5, linestyle='-')
gl.top_labels = False
gl.xlabel_style = {'size': 10, 'color': 'black'}
gl.ylabel_style = {'size': 10, 'color': 'black'}
gl.xlocator = mticker.FixedLocator(
   [-180, -135, -90, -45, 0, 45, 90, 135, 179.9])
ax.set_ylabel("Latitude",  fontsize=7)
ax.set_xlabel("Longitude", fontsize=7)

# Get scatter data
# ------------------
sc = ax.scatter(lon, lat,
                c=obarray, s=4, linewidth=0,
                transform=ccrs.PlateCarree(), cmap=cmap, vmin=cmin, vmax = cmax, norm=None, antialiased=True)

# Plot colorbar
# --------------
cbar = plt.colorbar(sc, ax=ax, orientation="horizontal", pad=.1, fraction=0.06,ticks=[200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310])
#cbar = plt.colorbar(sc, ax=ax, orientation="horizontal", pad=.1, fraction=0.06,ticks=[-3, -2.5, -2, -1.5, -1, -0.5, 0, 0.5, 1.0, 1.5, 2.0, 2.5, 3 ])
#cbar = plt.colorbar(sc, ax=ax, orientation="horizontal", pad=.1, fraction=0.06,ticks=[0, 10, 20, 20, 40, 50, 60, 70, 80, 90, 100])
cbar.ax.set_ylabel(units, fontsize=10)
# Plot globally
# --------------
#ax.set_global()
#ax.set_extent(conus)
ax.set_extent(conus_12km)

# Draw coastlines
# ----------------
ax.coastlines()
ax.text(0.45, -0.1, 'Longitude', transform=ax.transAxes, ha='left')
ax.text(-0.08, 0.4, 'Latitude', transform=ax.transAxes,
        rotation='vertical', va='bottom')

#text = f"Total Count:{datcont:0.0f}, Max/Min/Mean/Std: {datma:0.3f}/{datmi:0.3f}/{omean:0.3f}/{stdev:0.3f} {units}"
#print(text)
#ax.text(0.67, -0.1, text, transform=ax.transAxes, va='bottom', fontsize=6.2)

dpi=100
gl.top_labels = False
plt.tight_layout()

# show plot
# -----------
# pname='test.png'
# plt.savefig(pname, dpi=dpi)
../_images/396d4c6f3df23f837b3762144330f93a0438dd7b09142dd91411e23212fbd961.png

For reference, using netCDF4 to read jdiag files directly#

dataset=Dataset(jdiag_file, mode='r')
query_dataset(dataset)
EffectiveError0
    airTemperature
EffectiveError1
    airTemperature
EffectiveError2
    airTemperature
EffectiveQC0
    airTemperature
EffectiveQC1
    airTemperature
EffectiveQC2
    airTemperature
MetaData
    stationIdentification, aircraftFlightPhase, timeOffset, prepbufrDataLvlCat, longitude, stationElevation, latitude, prepbufrReportType, pressure, longitude_latitude_pressure, dateTime, height, dumpReportType, aircraftFlightNumber
ObsBias0
    airTemperature
ObsBias1
    airTemperature
ObsBias2
    airTemperature
ObsError
    airTemperature, relativeHumidity, windEastward, pressure, windNorthward
ObsType
    specificHumidity, windEastward, airTemperature, windNorthward
ObsValue
    dewpointTemperature, specificHumidity, airTemperature, windEastward, windNorthward
QualityMarker
    airTemperature, pressure, windEastward, specificHumidity, height, windNorthward
TempObsErrorData
    airTemperature
hofx0
    airTemperature
hofx1
    airTemperature
hofx2
    airTemperature
innov1
    airTemperature
oman
    airTemperature
ombg
    airTemperature
Location

NOTE:#

We can see that jdiag files (or output ioda files) have the group/variable structure.
And we need to use, for example, obs.groups["oman"].variables["airTemperature"][:] to access data.

Wit the obsSpace class, we can access the data very easily and conveniently by using obs.t.oman