Releases: kraina-ai/overturemaestro
0.2.6
0.2.5
Added
- Option to pass list of `hierarchy_depth` values for multiple theme / type pairs
- Info about current theme / type pair to the `HierarchyDepthOutOfBoundsWarning`
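To illustrate what a list of `hierarchy_depth` values for multiple theme / type pairs can look like, here is a minimal sketch. The helper `resolve_hierarchy_depths` is hypothetical and not part of the library; it assumes that a single value applies to every pair and that a list must contain one value per pair.

```python
from typing import Optional, Union

ThemeTypePair = tuple[str, str]
Depth = Optional[int]


def resolve_hierarchy_depths(
    pairs: list[ThemeTypePair],
    hierarchy_depth: Union[Depth, list[Depth]],
) -> dict[ThemeTypePair, Depth]:
    """Hypothetical helper: map every theme / type pair to its own depth."""
    if isinstance(hierarchy_depth, list):
        # A list is assumed to hold exactly one value per pair, in order.
        if len(hierarchy_depth) != len(pairs):
            raise ValueError("Expected one hierarchy_depth value per theme / type pair")
        depths = hierarchy_depth
    else:
        # A single value (or None) is broadcast to all pairs.
        depths = [hierarchy_depth] * len(pairs)
    return dict(zip(pairs, depths))
```

Naming the pair next to its depth in this way also makes it straightforward to report which theme / type pair triggered an out-of-bounds warning.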
0.2.4
Added
- Places hierarchy based on the official taxonomy #63
- Option to change minimal confidence score for places and select only primary category for the wide form transformation #63
Changed
- Added option to use any non-negative integer as a `hierarchy_depth` value for wide form processing #64
- Shortened hash parts for generated file names to 8 characters per part
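The shortened hash parts can be pictured with the sketch below. It is an assumption-laden illustration: the choice of SHA-256, the underscore separator, and the function names `short_hash` and `build_file_name` are hypothetical and not taken from the library.

```python
import hashlib


def short_hash(value: str, length: int = 8) -> str:
    """Return the first `length` hex characters of the SHA-256 digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:length]


def build_file_name(*parts: str, extension: str = "parquet") -> str:
    """Hypothetical naming scheme: one shortened hash per part, joined by underscores."""
    return "_".join(short_hash(part) for part in parts) + f".{extension}"
```

Truncating each part keeps file names short at the cost of a somewhat higher collision chance per part.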
Fixed
- Bug where a constant value was overwritten instead of being copied before modification
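The class of bug described above can be sketched as follows. The constant `DEFAULT_COLUMNS` and the function `columns_with_extra` are hypothetical stand-ins, not the library's actual code; the point is only that the constant is copied before it is changed.

```python
import copy

# Hypothetical module-level constant, standing in for the one mentioned above.
DEFAULT_COLUMNS = {"places": ["id", "geometry"]}


def columns_with_extra(extra: str) -> dict[str, list[str]]:
    """Return the defaults plus one extra column, leaving the constant untouched."""
    columns = copy.deepcopy(DEFAULT_COLUMNS)  # copy first, then modify
    columns["places"].append(extra)
    return columns
```

A deep copy is used here because the nested lists would otherwise still be shared with the constant.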
0.2.3
0.2.2
Fixed
- Changed wide format definitions for different release versions
0.2.1
Added
- Wide format release index to precalculate all possible columns #43
- Flag `include_all_possible_columns` to keep or prune empty columns #43
- `overturemaestro.advanced_functions.wide_form.get_all_possible_column_names` for getting a list of all possible column names #46
- `overturemaestro.cache.clear_cache` function for clearing local release index cache from the API
0.2.0
Added
- Automatic total time wrapper decorator to aggregate nested function calls
- Parameter `columns_to_download` for selecting columns to download from the dataset #23
- Option to pass a list of pyarrow filters and columns for download for each theme type pair when downloading multiple datasets at once
- Automatic columns detection in pyarrow filters when passing `columns_to_download`
- New `advanced_functions` module with a `wide` format for machine learning purposes #38
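One way a total time wrapper can aggregate nested function calls is sketched below. This is not the library's implementation: the decorator name `track_total_time`, the module-level counter, and the example functions are all hypothetical, and the sketch is not thread-safe.

```python
import functools
import time
from typing import Any, Callable

_depth = 0  # current nesting level of decorated calls
total_times: list[float] = []  # one entry per outermost decorated call


def track_total_time(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical decorator: only the outermost decorated call records its time."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        global _depth
        start = time.perf_counter()
        _depth += 1
        try:
            return func(*args, **kwargs)
        finally:
            _depth -= 1
            if _depth == 0:
                # Back at the top level: store the time of the whole call tree.
                total_times.append(time.perf_counter() - start)

    return wrapper


@track_total_time
def download() -> str:
    return "downloaded"


@track_total_time
def convert() -> str:
    # Nested decorated call; its time is folded into the outer measurement.
    return download() + " and converted"
```

Calling `convert()` adds a single entry to `total_times`, even though two decorated functions ran.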
Changed
- Refactored available release versions caching #24
- Removed hive partitioned parquet schema columns from GeoDataFrame loading
Deprecated
- Nested fields in PyArrow filters in the CLI are now expected to be separated by a dot, not a comma #22
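The dot-separated form can be illustrated with a small parser. The function `split_nested_field` is a hypothetical sketch, not the CLI's own parsing code, and the field name used in the usage note is only an example.

```python
def split_nested_field(field: str) -> tuple[str, ...]:
    """Split a dot-separated field reference into its nested parts."""
    parts = tuple(part.strip() for part in field.split("."))
    if not all(parts):
        # Reject inputs such as "names..primary" or a trailing dot.
        raise ValueError(f"Empty path segment in field: {field!r}")
    return parts
```

For example, `split_nested_field("names.primary")` yields the parts `names` and `primary`, which could then be passed to `pyarrow.compute.field(*parts)` to reference the nested field.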
0.1.2
0.1.1
Changed
- Modified release index consolidation script
0.1.0
Added
- CLI #3
- Option to filter data with bounding box #4
- Tests for the library #6
- Automatic newest release version loading #7
- Library docs #2
- README content
- Verbosity modes
- Total operation time
- Overloads for the functions typing
- Function for displaying all available release versions
- GitHub Action workflows for docs deployment
Changed
- Moved location of the pregenerated release indexes to the global cache #19
- Moved `scikit-learn` and `polars` to the dedicated dependency group #9
- Sped up intersection algorithm
- Reduced number of max concurrent connections for parquet files download
Fixed
- Memory leak during concurrent parquet files download
- Added automatic retry for downloads with 10 retries
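An automatic retry with a limit of 10 retries can be sketched as below. This is a generic illustration, not the library's download code: the helper `call_with_retries`, its parameters, and the fixed delay are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    action: Callable[[], T],
    max_retries: int = 10,
    delay_seconds: float = 0.0,
) -> T:
    """Run `action`, retrying up to `max_retries` times after a failure."""
    for attempt in range(max_retries + 1):  # first try plus the retries
        try:
            return action()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            time.sleep(delay_seconds)
    raise RuntimeError("max_retries must be non-negative")
```

A real downloader would likely catch only network-related exceptions and wait longer between attempts.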