Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No module named 'pygribjump' #133

Open
jmb64 opened this issue May 16, 2024 · 18 comments
Open

No module named 'pygribjump' #133

jmb64 opened this issue May 16, 2024 · 18 comments
Labels
bug Something isn't working

Comments

@jmb64
Copy link

jmb64 commented May 16, 2024

What happened?

Hi,

  This issue is more of a question than a bug report. 

 A couple of month ago someone you probably know (Dr. Quintino) gave a video conference where I learned about "polytope", earthkit, ... etc.  A couple of slides showed performance results (timings of feature extraction using polytope and GribJump).  It prompted me to install the package with conda and play with a few examples. One of them was 3D_shipping_route_wave_model.py  making use of the FDB backend ...

python 3D_shipping_route_wave_model.py

Traceback (most recent call last):
File ".../3D_shipping_route_wave_model.py", line 8, in
from polytope.datacube.backends.fdb import FDBDatacube
File .../python3.10/site-packages/polytope/datacube/backends/fdb.py", line 4, in
import pygribjump as pygj
ModuleNotFoundError: No module named 'pygribjump'

I searched for pygribjump ... it was nowhere to be found.

I realize that the software is a work in progress. I just want to check with you that the module (GribJump, pygribjump)
is indeed not publicly available and the outcome is as expected ?

I look forward to see more performance results with earthkit/polytope. Your work is interesting and appears to be really tigthly
integrated to the software infrastructure at ecmwf. This is new stuff in comparison to what is done in the context of PANGEO
for example.

Anyway, thanks for your time,

Oh, and I did succeed in running some other tests though ... :-)

What are the steps to reproduce the bug?

none

Version

polytope 1.03

Platform (OS and architecture)

cat /etc/os-release --> Ubuntu 22.04.4 LTS uname -a ---> Linux 4.18.0-240.el8.x86_64

Relevant log output

none

Accompanying data

none

Organisation

Environment and Climate Change Canada

@jmb64 jmb64 added the bug Something isn't working label May 16, 2024
@jameshawkes
Copy link
Collaborator

Hi @jmb64,

You are quite right, we had not made gribjump public yet, but now we have :)
https://github.com/ecmwf/gribjump

It is very experimental and we are actively developing this whole stack. To use gribjump you'll need to build it, along with FDB (https://github.com/ecmwf/fdb) and it's dependencies. pygribjump needs to be pip installed directly from the git repository too, as its not on pypi yet.

To be honest the polytope-gribjump-fdb stack is not really in a state where we'd expect people to be able to deploy it succesfully, but you're welcome to try!

@jmb64
Copy link
Author

jmb64 commented May 20, 2024

Hi James,

Thanks for your reply.  I won't loose any sleep over it but I will probably try ... :-)

It seems that access to the python api pygribjump is rectricted though.
Here's what I get:


python3 -m pip install --upgrade git+ssh://git@github.com/ecmwf/gribjump.git@master

Collecting git+ssh://@github.com/ecmwf/gribjump.git@master
Cloning ssh://
@github.com/ecmwf/gribjump.git (to revision master) to /tmp/jmb001/203/pip-req-build-csew7wed
Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/ecmwf/gribjump.git' /tmp/jmb001/203/pip-req-build-csew7wed
Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights and the repository exists.


@jameshawkes
Copy link
Collaborator

Hi @jmb64, try pip installing it using https instead of ssh:

pip install --upgrade git+https://github.com/ecmwf/gribjump

@jmb64
Copy link
Author

jmb64 commented May 20, 2024

Right on. Thanks!

@jmb64
Copy link
Author

jmb64 commented May 23, 2024

Would you happen to have an example showing the content for this file ?

GRIBJUMP_CONFIG_FILE = "/path/to/gribjump/config.yaml"

@ChrisspyB
Copy link
Member

Hi @jmb64 ,
The config for GribJump is very minimal (for now), and in principle should be optional. For a local extraction on your machine, the following config.yaml should do the trick:

---
type: local

@jmb64
Copy link
Author

jmb64 commented May 28, 2024

Hi,

Ok, I probably got as far as i could without having an actual instance of FDB installed locally.  I thought I could maybe get away with plain grib file(s) on GPFS to mimick FDB (with indexes on the side ?). By the way how do you let your python scripts know  the location of your local FDB ?    As a suggestion, I don't know how complex this would be but  ... a simple "example FDB" setup would be useful for running tests outside the walls of ECMWF I guess. Apart from that I think my software installation appears to be ok. A few scripts are not completing successfully (polytope requests definition) but I suspect it is because they are outdated w/r to the api. This is on me. It is a work in progress. I was warned and that's fine :-)    

I will follow your development and docs .

Thanks.
Jean-Marc

@jmb64
Copy link
Author

jmb64 commented May 28, 2024

Hi Chris,

 Just to let you know that I managed to get things a little bit more "concrete" for myself with this FDB ... using 

readthedocs.io (fdb and earthkit-data). To answer my own question about an FDB example for testing ... i was able to create
the POSIX filesystem flavor of FDB from grib through earthkit. I also noticed the grib2fdb cli tool in my build of the whole shebang ... The hard part is not knowing much about the ECMWF culture (MARS language ...)
One last thing, my fdb build was incomplete (cli tools and tests/api...) and I solved it by adding libcurl (-lcurl) to a bunch
of link.txt files. I don't know if it was something I missed in the configuration (ecbuild) or if it should be corrected in fdb github ...

@jmb64
Copy link
Author

jmb64 commented May 30, 2024

Maybe one last question ...

Running this test
https://github.com/ecmwf/polytope/blob/develop/tests/test_fdb_datacube.py

it seems I cannot get "step" to make it as an FDB datacube axes.

This is the bit of code which I believe should create the datacube instance:


self.options = {
"config": [
{"axis_name": "step", "transformations": [{"name": "type_change", "type": "int"}]},
{"axis_name": "number", "transformations": [{"name": "type_change", "type": "int"}]},
{
"axis_name": "date",
"transformations": [{"name": "merge", "other_axis": "time", "linkers": ["T", "00"]}],
},
{
"axis_name": "values",
"transformations": [
{"name": "mapper", "type": "octahedral", "resolution": 1280, "axes": ["latitude", "longitude"]}
],
},
{"axis_name": "latitude", "transformations": [{"name": "reverse", "is_reverse": True}]},
{"axis_name": "longitude", "transformations": [{"name": "cyclic", "range": [0, 360]}]},
]
}
self.config = {"class": "od", "expver": "0001", "levtype": "sfc", "stream": "oper"}
self.fdbdatacube = FDBDatacube(self.config, axis_options=self.options)

Traceback (most recent call last):
File "...ECMWF/EARTHKIT_POLYTOPE/test_fdb_datacube.py", line 102, in
a.test_fdb_datacube()
File "... ECMWF/EARTHKIT_POLYTOPE/test_fdb_datacube.py", line 78, in test_fdb_datacube
result = self.API.retrieve(request)
File ".../.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/polytope.py", line 58, in retrieve
request_tree = self.engine.extract(self.datacube, request.polytopes())
File ".../.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/engine/hullslicer.py", line 140, in extract
self._unique_continuous_points(p, datacube)
File ".../.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/engine/hullslicer.py", line 27, in _unique_continuous_points
mapper = datacube.get_mapper(ax)
File ".../.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/datacube/backends/datacube.py", line 128, in get_mapper
return self._axes[axis]
KeyError: 'step'

I get this KeyError during the extraction ... which I figure is because "step" has not been added to axes ?

I added a few extra print ... in the

JMB: datacube.py create_axes_config: {'config': [{'axis_name': 'step', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'number', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'date', 'transformations': [{'name': 'merge', 'other_axis': 'time', 'linkers': ['T', '00']}]}, {'axis_name': 'values', 'transformations': [{'name': 'mapper', 'type': 'octahedral', 'resolution': 1280, 'axes': ['latitude', 'longitude']}]}, {'axis_name': 'latitude', 'transformations': [{'name': 'reverse', 'is_reverse': True}]}, {'axis_name': 'longitude', 'transformations': [{'name': 'cyclic', 'range': [0, 360]}]}]} [{'axis_name': 'step', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'number', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'date', 'transformations': [{'name': 'merge', 'other_axis': 'time', 'linkers': ['T', '00']}]}, {'axis_name': 'values', 'transformations': [{'name': 'mapper', 'type': 'octahedral', 'resolution': 1280, 'axes': ['latitude', 'longitude']}]}, {'axis_name': 'latitude', 'transformations': [{'name': 'reverse', 'is_reverse': True}]}, {'axis_name': 'longitude', 'transformations': [{'name': 'cyclic', 'range': [0, 360]}]}]

JMB: fdb.py self.fdb_coordinates.items() dict_items([('values', [])])

JMB: fdb.py name_values values []

JMB: opt axis_name='values' transformations=[MapperConfig(name='mapper', type='octahedral', resolution=1280, axes=['latitude', 'longitude'], local=None)]

JMB: self._axes {'latitude': <polytope.datacube.datacube_axis.FloatDatacubeAxis object at 0x1517b7e39c30>, 'longitude': <polytope.datacube.datacube_axis.FloatDatacubeAxis object at 0x1517b7e39cc0>}

JMB get_mapper self._axes= {'latitude': <polytope.datacube.datacube_axis.FloatDatacubeAxis object at 0x1517b7e39c30>, 'longitude': <polytope.datacube.datacube_axis.FloatDatacubeAxis object at 0x1517b7e39cc0>}

I believe latitude and longitude have been added to the datacube axes from these prints ...

Thanks.

JM

@jmb64
Copy link
Author

jmb64 commented May 30, 2024

Logging from polytope for
python test_fdb_datacube.py:

cat lis_test_fdb_datacube.log

INFO:root:Created an FDB datacube with options: {'config': [{'axis_name': 'step', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'number', 'transformations': [{'name': 'type_change', 'type': 'int'}]}, {'axis_name': 'date', 'transformations': [{'name': 'merge', 'other_axis': 'time', 'linkers': ['T', '00']}]}, {'axis_name': 'values', 'transformations': [{'name': 'mapper', 'type': 'octahedral', 'resolution': 1280, 'axes': ['latitude', 'longitude']}]}, {'axis_name': 'latitude', 'transformations': [{'name': 'reverse', 'is_reverse': True}]}, {'axis_name': 'longitude', 'transformations': [{'name': 'cyclic', 'range': [0, 360]}]}]}

DEBUG:root:Skipping /etc/polytope/config.json, file not found.
DEBUG:root:Skipping /etc/polytope/config.yaml, file not found.
DEBUG:root:Skipping /home/jmb001/.polytope.json, file not found.
DEBUG:root:Skipping /home/jmb001/.polytope.yaml, file not found.
DEBUG:root:Conflated config: {}

INFO:root:Axes returned from GribJump are: {}

INFO:root:Polytope created axes for: dict_keys(['latitude', 'longitude'])

@mathleur
Copy link
Member

mathleur commented Jun 3, 2024

Hi @jmb64,

It seems like maybe there isn't the right data for this particular example written on the FDB that Polytope is connecting to?

Unfortunately, I don't think the data that goes with these particular tests can be viewed publicly. However, I think it should be possible to adapt the Polytope options and datacube configuration to work with the data written on your local FDB.

The datacube axes created in the Polytope API aren't really specified by the options, but are built by "reading" the axes from the data on the FDB. The options only really help create an easy user interface to make requesting data more intuitive for users, such as being able to request data at longitude=420 (instead of having to request data at longitude=60) for example.

The options here refer to particular "transformations" on the data, such as cyclic axes or special grids, like octahedral grids for example (when the data grib file doesn't store the data along latitude/longitude, but rather along a combined "index" axis). There is also the type change transformation, which we use when the request type (step in integers for example) doesn't match how the data is stored on the FDB (in this case as strings), as well as the reverse transformation, when the data stored in the FDB on an axis is in decreasing order (for example latitude indexes are stored from 90 to -90 instead of -90 to 90).
The datacube configuration refers to the first-layer path pointing to the data stored on the local FDB.

I also think that currently, the Polytope version on develop doesn't yet work with the GribJump master branch. Instead, I believe the feature/gribjump_axis_branching branch of Polytope should work with the latest GribJump master branch.

I hope this helps, but I would be happy to answer more questions!

@jmb64
Copy link
Author

jmb64 commented Jun 3, 2024

Hi Mathilde,

       I appreciate you taking the time to offer these explanations. I kinda realize friday that actual data (FDB) was playing a 

major role in the datacube configuration :-) ... just by changing the FDB I noticed I was getting past the configuration (i.e.
step axes ...) issue . I tried retrieving a GRIB file output from the HRES WAM with the 2d spectrum (2dfd) ... but it is not
available it seems (damn! :-)) ... as you mentioned. I will see if I can rewrite an example akin to 3D_shipping_route_wave_model but with available data this time ! I will re-read your post carefully, i'm sure this will help, and might have some questions later ...

Regards,

Jean-Marc

PS: You probably know the answer to this one ... In Tiago's presentation (Bytes from Petabytes, slide 20) there's a graph showing linear scaling for the extraction process for 1000 to 4500 values on Lustre ... we're talking about double precision values right ? Hence the 36kB(4500 values)/730ms ---> 4-6 ms/value ... Are those results from /examples or /performance on github ?

@jmb64
Copy link
Author

jmb64 commented Jun 6, 2024

Hello again,

 I decided to adapt the polytope/FDB example "3D_shipping_route_wave_model.py", creating an FDB from era5 specific humidity in the marine boundary layer since I don't have access to the wave modele data (let,s say one is interested in shipborne lidar data , .. i.e. HALO/Wales)  using this CDS request:

#!/usr/bin/env python
import cdsapi

c = cdsapi.Client()

c.retrieve("reanalysis-era5-complete", {
"class": "ea",
"date": "2021-07-01/to/2021-07-31",
"expver": "1",
"levelist": "100/101/102/103/104/105/106/107/108/109/110/111/112/113/114/115/116/117/118/119/120/121/122/123/124/125/126/127/128/129/130/131/132/133/134/135/136/137",
"levtype": "ml",
"param": "133",
"step": "0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18",
"stream": "oper",
"time": "18:00:00",
"type": "fc",
"grid": "0.25/0.25"
}, "era5_202107_regular_ll_2.grib")

then create the local FDB

fdb write era5_202107_regular_ll_2.grib. I keep the same shapefile for shipping route (Shipping-Lanes-v1.shp)

  1. Ruuning the modified script for specific humidity the extraction process encounters a problem in GribJump:

GRIBJUMP_DEBUG Using FDB's (new) axes impl
GRIBJUMP_DEBUG Base request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened keys for request retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1,stream=oper,type=fc: [class=ea,date=20210702,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1,stream=oper,time=1800,type=fc]
GRIBJUMP_DEBUG Base request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=2,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=2,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened keys for request retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=2,stream=oper,type=fc: [class=ea,date=20210702,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=2,stream=oper,time=1800,type=fc]
GRIBJUMP_DEBUG Base request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=3,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened request: retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=3,stream=oper,type=fc
GRIBJUMP_DEBUG Flattened keys for request retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=3,stream=oper,type=fc: [class=ea,date=20210702,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=3,stream=oper,time=1800,type=fc]
GRIBJUMP_DEBUG Built flat keys
GRIBJUMP_DEBUG Built keyToExtractionItem
GRIBJUMP_DEBUG Union request retrieve,class=ea,date=20210702,time=1800,domain=g,expver=0001,levelist=137,levtype=ml,param=133,step=1/2/3,stream=oper,type=fc
GRIBJUMP_DEBUG Found 3 fields in 1 files
GRIBJUMP_DEBUG File map:
GRIBJUMP_DEBUG file=.../ECMWF/data/regular_ll/ea:0001:oper:20210702:1800:g/fc:ml.20240605.121554.hpcr5-vis5.6636793919176709.data, Offsets=[155833500, 234789140, 313744780, ]
GRIBJUMP_DEBUG Thread 22437929043520 starting
GRIBJUMP_DEBUG Thread 22437716297280 starting
GRIBJUMP_DEBUG Queued task 0
Queued task 0GRIBJUMP_DEBUG Waiting for 1 task...
GRIBJUMP_DEBUG task...
GRIBJUMP_DEBUG 2437716297280 new job
GRIBJUMP_DEBUG Cache directory
GRIBJUMP_DEBUG Warning, cache persistence is disabled
GRIBJUMP_DEBUG Cache file /fc:ml.20240605.121554.hpcr5-vis5.6636793919176709.data.gribjump does not exist
GRIBJUMP_DEBUG Thread 22437716297280 exception: Read access outside data section: 2984368 + 32 > 2076480
GRIBJUMP_DEBUG All tasks complete
GRIBJUMP_DEBUGGRIBJUMP_DEBUG Thread Thread 22437716297280 stopping (queue closed) data section: 2984368 +Thread Thread 22437716297280 stopping (queue closed) data section: 2984368 +

The offsets for step 1,2,3 seem right judging by fdb dump-index on the FDB.

  1. I'm using polytope (develop branch) and GribJump (develop branch, either may17 commit or latest june5 give the same
    result).

Debugging points to : GribJumpDataAccessor.h (line 35)

throw std::runtime_error("Read access outside data section: " + std::to_string(offset) + " + " + std::to_string(size) + " > " + std::to_string(data_section_size))

Does this look familiar to you ?
Again, I realize this is work in progress. Ans I'm not implying this is a bug since I barely know what I'm doing when
creating the polytope request ! :-)

  1. Full python script adaptation if ever you would consider running similar case.

cat 3D_shipping_route_era5_mabl_specific_humidity_jmb.py

import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
import pytest
from eccodes import codes_grib_find_nearest, codes_grib_new_from_file

from polytope.datacube.backends.fdb import FDBDatacube
from polytope.engine.hullslicer import HullSlicer
from polytope.polytope import Polytope, Request
from polytope.shapes import Ellipsoid, Path, Select
from tests.helper_functions import download_test_data
import logging

class TestReducedLatLonGrid:
def setup_method(self, method):
#JMB
logger = logging.getLogger(name)
logging.basicConfig(filename='lis_3D_shipping_route_era5_MABL_specific_humidity.log', encoding='utf-8', level=logging.DEBUG)

    self.options = {
        "config": [
            {"axis_name": "step", "transformations": [{"name": "type_change", "type": "int"}]},
            {"axis_name": "number", "transformations": [{"name": "type_change", "type": "int"}]},
            {
                "axis_name": "date",
                "transformations": [{"name": "merge", "other_axis": "time", "linkers": ["T", "00"]}],
            },
            {
                "axis_name": "values",
                "transformations": [
                    {"name": "mapper", "type": "reduced_ll", "resolution": 1441, "axes": ["latitude", "longitude"]}
                ],
            },

#JMB {"axis_name": "latitude", "transformations": [{"name": "reverse", "is_reverse": True}]},
{"axis_name": "longitude", "transformations": [{"name": "cyclic", "range": [0, 360]}]},
]
}

#ea:0001:oper:20210701:1800:g

self.config = {"class": "od", "stream": "wave"}

    self.config = {"class": "ea", "stream" : "oper" }
    
    self.fdbdatacube = FDBDatacube(self.config, axis_options=self.options)
    self.slicer = HullSlicer()
    self.API = Polytope(datacube=self.fdbdatacube, engine=self.slicer, axis_options=self.options)

def find_nearest_latlon(self, grib_file, target_lat, target_lon):
    messages = grib_file

    # Find the nearest grid points
    nearest_points = []
    for message in [messages[0]]:
        nearest_index = codes_grib_find_nearest(message, target_lat, target_lon)
        nearest_points.append(nearest_index)

    return nearest_points

@pytest.mark.internet
@pytest.mark.skip(reason="can't install fdb branch on CI")
def test_reduced_ll_grid(self):
    shapefile = gpd.read_file(".../ECMWF/EARTHKIT_POLYTOPE/examples/data/Shipping-Lanes-v1.shp")
    geometry_multiline = shapefile.iloc[2]
    geometry_object = geometry_multiline["geometry"]

    lines = []
    i = 0

    #JMB

print("geometry_object=",geometry_object)

print("len(geometry_object,geoms)=",len(geometry_object.geoms))

#JMB for line in geometry_object[:7]:
for line in list(geometry_object.geoms)[:7]:
for point in line.coords:
point_list = list(point)
if list(point)[0] < 0:
point_list[0] = list(point)[0] + 360
lines.append(point_list)

    # Append for each point a corresponding step

    new_points = []
    for point in lines[:7]:
        new_points.append([point[1], point[0], 1])
        
    print("new_points=",new_points)        
        
    # Pad the shipping route with an initial shape

    padded_point_upper = [0.24, 0.24, 1]
    padded_point_lower = [-0.24, -0.24, 1]

padded_point_upper = [5.0, 5.0, 1]

padded_point_lower = [-5.0, -5.0, 0]

    initial_shape = Ellipsoid(["latitude", "longitude", "step"], padded_point_lower, padded_point_upper)

    # Then somehow make this list of points into just a sequence of points

    ship_route_polytope = Path(["latitude", "longitude", "step"], initial_shape, *new_points)
    print("JMB ship_route_polytope=",ship_route_polytope)
    
    request = Request(
        ship_route_polytope,
        Select("date", [pd.Timestamp("20210702T180000")]),
        Select("domain", ["g"]),
        Select("expver", ["0001"]),
        Select("param", ["133"]),
        Select("levelist", ["137"]),                  
        Select("class", ["ea"]),            
        Select("stream", ["oper"]),
        Select("levtype", ["ml"]),
        Select("type", ["fc"]),
    )
    print("JMB request=",request)
    
    result = self.API.retrieve(request)
    result.pprint()

    print("JMB: after request result=",result) 
    lats = []
    lons = []
    eccodes_lats = []
    eccodes_lons = []
    tol = 1e-8

#JMB f = open("./tests/data/wave.grib", "rb")
f = open(".../ECMWF/data/regular_ll/202107/era5_202107_regular_ll.grib", "rb")
messages = []
print("JMB before codes_grib_new_from_file ",flush=True)
message = codes_grib_new_from_file(f)
print("JMB after codes_grib_new_from_file message=",message,flush=True)
messages.append(message)

    leaves = result.leaves
    for i in range(len(leaves)):
        #JMB
        print("JMB i, leaves=",i,leaves)
        cubepath = leaves[i].flatten()
        lat = cubepath["latitude"]
        lon = cubepath["longitude"]
        del cubepath
        lats.append(lat)
        lons.append(lon)
        nearest_points = codes_grib_find_nearest(message, lat, lon)[0]
        eccodes_lat = nearest_points.lat
        eccodes_lon = nearest_points.lon
        eccodes_lats.append(eccodes_lat)
        eccodes_lons.append(eccodes_lon)
        assert eccodes_lat - tol <= lat
        assert lat <= eccodes_lat + tol
        assert eccodes_lon - tol <= lon
        assert lon <= eccodes_lon + tol
        print(i)
    f.close()

    worldmap = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
    fig, ax = plt.subplots(figsize=(12, 6))
    worldmap.plot(color="darkgrey", ax=ax)

    plt.scatter(lons, lats, s=18, c="red", cmap="YlOrRd")
    plt.scatter(eccodes_lons, eccodes_lats, s=6, c="green")
    plt.colorbar(label="Temperature")
    plt.show()

a=TestReducedLatLonGrid()
a.setup_method(None)
a.test_reduced_ll_grid()


Cheers,

Jean-Marc

@mathleur
Copy link
Member

mathleur commented Jun 6, 2024

Hello @jmb64,

I think maybe this fails because of the grid option given to Polytope for the data. I think the grid keyword in the MARS requests transform the data automatically to a regular lat/lon grid and so the regular grid "transformation" would be needed in this case. I believe here the options should thus be

    self.options = {
    "config": [
        {"axis_name": "step", "transformations": [{"name": "type_change", "type": "int"}]},
        {"axis_name": "number", "transformations": [{"name": "type_change", "type": "int"}]},
        {
            "axis_name": "date",
            "transformations": [{"name": "merge", "other_axis": "time", "linkers": ["T", "00"]}],
        },
        {
            "axis_name": "values",
            "transformations": [
                {"name": "mapper", "type": "regular", "resolution": 360, "axes": ["latitude", "longitude"]}
            ],
        },
       {"axis_name": "latitude", "transformations": [{"name": "reverse", "is_reverse": True}]},
       {"axis_name": "longitude", "transformations": [{"name": "cyclic", "range": [0, 360]}]},
   ]
   }

I hope this helps!

@jmb64
Copy link
Author

jmb64 commented Jun 6, 2024

Hi Mathilde,

  Thanks for the help. I did mentionned I was not saying there was a bug ! :-)

End of listing with your suggestion:


JMB fdb.py assign_fdb_output_to_nodes: sorted_range_lengths= [2, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 3]
JMB fdb.py assign_fdb_output_to_nodes: len(sorted_fdb_range_nodes) 49
JMB fdb.py assign_fdb_output_to_nodes: len(sorted_range_lengths)= 49
JMB fdb.py assign_fdb_output_to_nodes: request_output_values= [[(array([0.01897763, 0.01898812]), array([3], dtype=uint64)), (array([0.0189309 , 0.01894377]), array([3], dtype=uint64)), (array([0.01861666, 0.01857756]), array([3], dtype=uint64)), (array([0.01925228, 0.01907585]), array([3], dtype=uint64)), (array([0.01757191]), array([1], dtype=uint64)), (array([0.01921032, 0.01924036]), array([3], dtype=uint64)), (array([0.01765965]), array([1], dtype=uint64)), (array([0.01865195, 0.01875971]), array([3], dtype=uint64)), (array([0.01804016]), array([1], dtype=uint64)), (array([0.01867436, 0.01855133]), array([3], dtype=uint64)), (array([0.01802109]), array([1], dtype=uint64)), (array([0.01856135, 0.01851414]), array([3], dtype=uint64)), (array([0.0183811]), array([1], dtype=uint64)), (array([0.01856325, 0.0186014 ]), array([3], dtype=uint64)), (array([0.01868962, 0.01883791]), array([3], dtype=uint64)), (array([0.019114 , 0.01894663]), array([3], dtype=uint64)), (array([0.01896332, 0.01910256]), array([3], dtype=uint64)), (array([0.01925371, 0.01914023]), array([3], dtype=uint64)), (array([0.01899813]), array([1], dtype=uint64)), (array([0.01924895]), array([1], dtype=uint64)), (array([0.0188484]), array([1], dtype=uint64)), (array([0.01918934, 0.01911066]), array([3], dtype=uint64)), (array([0.01867293, 0.01856039]), array([3], dtype=uint64)), (array([0.01952313, 0.01928614]), array([3], dtype=uint64)), (array([0.01834391, 0.01835249]), array([3], dtype=uint64)), (array([0.01955031, 0.01959131]), array([3], dtype=uint64)), (array([0.01830862]), array([1], dtype=uint64)), (array([0.01927422, 0.01930474]), array([3], dtype=uint64)), (array([0.01842688]), array([1], dtype=uint64)), (array([0.01934336, 0.01917456]), array([3], dtype=uint64)), (array([0.018414 , 0.01840876]), array([3], dtype=uint64)), (array([0.01935576, 0.01924084]), array([3], dtype=uint64)), (array([0.01839398, 0.01840304]), array([3], dtype=uint64)), (array([0.01930474, 0.01928519]), array([3], dtype=uint64)), (array([0.0183482]), array([1], dtype=uint64)), (array([0.01919602, 0.01917313]), array([3], dtype=uint64)), (array([0.01838063]), array([1], dtype=uint64)), (array([0.01911495, 0.01908301]), array([3], dtype=uint64)), (array([0.01833342, 0.01833294]), array([3], dtype=uint64)), (array([0.01902483, 0.01903055]), array([3], dtype=uint64)), (array([0.01832054, 0.01832245]), array([3], dtype=uint64)), (array([0.01890133, 0.01894282]), array([3], dtype=uint64)), (array([0.01829718]), array([1], dtype=uint64)), (array([0.01876972]), array([1], dtype=uint64)), (array([0.01830147]), array([1], dtype=uint64)), (array([0.018692]), array([1], dtype=uint64)), (array([0.01822947, 0.0182433 ]), array([3], dtype=uint64)), (array([0.01868342, 0.01871155]), array([3], dtype=uint64)), (array([0.0183544 , 0.01858614, 0.01869438]), array([7], dtype=uint64))]]
Traceback (most recent call last):
File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 2022, in shape
result = a.shape
AttributeError: 'list' object has no attribute 'shape'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/jmb001/jmscript/PY/ECMWF/EARTHKIT_POLYTOPE/3D_shipping_route_era5_mabl_specific_humidity_mathilde.py", line 169, in
a.test_reduced_ll_grid()
File "/home/jmb001/jmscript/PY/ECMWF/EARTHKIT_POLYTOPE/3D_shipping_route_era5_mabl_specific_humidity_mathilde.py", line 117, in test_reduced_ll_grid
result = self.API.retrieve(request)
File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/polytope.py", line 59, in retrieve
self.datacube.get(request_tree)
File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/datacube/backends/fdb.py", line 75, in get
self.assign_fdb_output_to_nodes(output_values, fdb_requests_decoding_info)
File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/polytope/datacube/backends/fdb.py", line 257, in assign_fdb_output_to_nodes
print("JMB fdb.py assign_fdb_output_to_nodes: np.shape(request_output_values)=",np.shape(request_output_values))
File "/home/jmb001/.conda/envs/ecmwf/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 2024, in shape
result = asarray(a).shape
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 3 dimensions. The detected shape was (1, 49, 2) + inhomogeneous part.
GRIBJUMP_DEBUGGRIBJUMP_DEBUG Thread Thread 22435342517824 stopping (queue closed)22435554211392 stopping (queue closed)

Thread Thread 22435342517824 stopping (queue closed)22435554211392 stopping (queue closed)

It seems we got output values from GribJump now and they sure look like surface specific humidities around 18-19 g/Kg ... which is nice. There's a glitch remaining ...which appears minor ?

What is the meaning of "resolution" :360 then ? I thought we had to specify the data resolution (0.25 deg in this case). I guess this information is retrieved from FDB (GRIB) then.

I was also wondering about request about NEMO output. The software is not able to deal with such a grid yet (ORCA025), am I right ?

Anyway. Your help is much appreciated!

Jean-Marc
:-)

@mathleur
Copy link
Member

mathleur commented Jun 6, 2024

Hi @jmb64 ,

This is great! It looks like it's almost working :)

I believe maybe this print line in the code might be causing issues:

print("JMB fdb.py assign_fdb_output_to_nodes: np.shape(request_output_values)=",np.shape(request_output_values))

One of the problems of Polytope which happens here is that the output of Polytope isn't necessarily "box"-shaped so most of the numpy functions like shape() do not work on Polytope's output!

We've also just updated the develop branch of Polytope yesterday which might fix some of the last bugs!

For the resolution, it's currently maybe a bit counterintuitive. It was initially built around the octahedral grid, where the resolution was the number x in Ox, so somewhat related to the number of data points on a latitude line. We've tried to keep it somewhat "uniform" for the different grids and so the resolution for the regular lat/lon grid is the number of points on a latitude line (in this case it's 90/0.25=360). I hope this explains a bit more!

Unfortunately, I do not believe we support ORCA grids yet.

I hope this helps!

@jmb64
Copy link
Author

jmb64 commented Jun 6, 2024

Mathilde,

I assure you ... this helps ! :-)

Screenshot 2024-06-06 142706

How can I reference the actual values (specific humidities) along the trajectory) in the script (duhhh) ?
I wrote them in fdb.py but ....

JM

@jmb64
Copy link
Author

jmb64 commented Jun 6, 2024

Mathilde,

Finally,

Using leaves[i].result (?) ...

Screenshot 2024-06-06 151209

JM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants