Releases: dyvenia/viadot
v2.1.2
v2.1.1
v2.1.0
What's Changed
- ✨ Add complete proxy settings in `SAPRFC` example by @trymzet in #403
- Update tests by @winiar93 in #396
- SQLServer To DuckDB by @angelika233 in #404
- ✨ Added databricks-connect support by @afraijat in #409
- ✨ Added databricks source to viadot by @afraijat in #434
- Decrease Docker image size by @trymzet in #458
- Added rollback feature, improved comments formatting by @afraijat in #452
- 🔊 Replaced Prefect logging with Python logging by @afraijat in #459
- Add databricks cleanup task by @trymzet in #502
- Fix orion prefect deployment by @trymzet in #515
- Fix import error from `datahub_cleanup_task` by @trymzet in #516
- ✨ Added `ExchangeRates` source to the library by @djagoda881 in #535
- ✨ Add `Sharepoint` 2.0 source by @trymzet in #534
- ♻️ Update `ExchangeRates` to use new config by @trymzet in #536
- 🔥 Remove unused requirements after having removed tasks by @trymzet in #537
- ♻️ Readd Databricks to init by @trymzet in #538
- 🔥 Databricks - remove `env` parameter by @trymzet in #539
- ♻️ Readding the `AzureDataLake` source by @trymzet in #540
- ✨ Add a default cluster port to `Databricks` source by @trymzet in #541
- 🐛 Add missing info about the required `org_id` credential by @trymzet in #542
- ✨ Add `TableDoesNotExist` exception by @trymzet in #543
- Add `from_df()` method to `Azure Data Lake` source by @trymzet in #546
- ✨ Added parameters to functions in config.py by @djagoda881 in #548
- ♻️ Update Databricks to work with 2.0 configs by @trymzet in #553
- ✨ Adding a column to the to_df function by @djagoda881 in #550
- ✨ Created decorator to include viadot source to df by @fgoiriz in #567
- ✨ Added migration of cloud_for_customers.py source to 2.0 by @fgoiriz in #552
- 🐛 Modified Sources init.py by @fgoiriz in #569
- 🐛 Databricks bug fix by @afraijat in #571
- Add SAPRFC source by @AnnaGerlich in #582
- Add s3 source by @trymzet in #587
- 🐛 Fix credential handling to handle AWS region by @trymzet in #588
- 🐛 Fix logger being called before initialization by @trymzet in #589
- 🐛 Fix credential handling when not specified by @AnnaGerlich in #590
- 🐛 Fixed and Extended S3 Source by @TillPickha in #603
- 🐛 Fix handling of empty viadot config by @trymzet in #611
- 📝 Update docs by @trymzet in #612
- Add VSCode setup by @trymzet in #613
- Remove prefect as base image by @trymzet in #617
- 📌 Fix boto dependency hell by @trymzet in #618
- 📌 Fix boto to particular version by @trymzet in #619
- Remove ARM architecture by @trymzet in #620
- Remove old dependencies by @trymzet in #622
- ➕ Add missing `pyyaml` dependency by @trymzet in #624
- ➕ Add `pydantic` dependency by @trymzet in #625
- ⬆️ Bump `viadot` package version by @trymzet in #626
- ⬆️ Upgrade dependencies to fix conflicts by @trymzet in #628
- ✨ Add `test_download_file()` test for `Sharepoint` source by @AnnaGerlich in #608
- 📝 Minor docs change by @trymzet in #635
- Added tests for the `RedshiftSpectrum` source by @AnnaGerlich in #636
- Fixed bug in exchange rates unit tests by @djagoda881 in #638
- Databricks replace bug by @djagoda881 in #658
- Utils response function by @djagoda881 in #661
- Genesys source migration by @djagoda881 in #665
- Genesys bug fix by @djagoda881 in #668
- Databricks snakecase column bug by @djagoda881 in #673
- 📝 Added howto migrate from viadot 1 to viadot 2 by @afraijat in #680
- ⚡️ Enhanced `S3()` and `RedshiftSpectrum()` sources by @AnnaGerlich in #678
- 📝 Refined viadot migration docs by @afraijat in #684
- Automatically create table folder in `RedshiftSpectrum.from_df()` by @trymzet in #693
- Cleanup compose by @trymzet in #696
- Upgrade Databricks connector for 11.3+ runtimes by @trymzet in #697
- ♻️ Standardize credentials validation by @trymzet in #699
- Databricks pandas types casting by @djagoda881 in #701
- ✨ Added `close_connection()` to `sap_rfc` by @AnnaGerlich in #709
- 🚑 Fixed AWS credentials handling by @AnnaGerlich in #713
- 🚑 Fixed typo in `chunksize` parameter name by @AnnaGerlich in #718
- ✨ Add MSSQL ODBC driver to image by @trymzet in #719
- Add Trino source by @trymzet in #726
- Source MinIO by @trymzet in #729
- ✨ Added `SAPRFCV2` source class by @AnnaGerlich in #835
- Update PyPI pipeline to use Trusted Publishing by @trymzet in #837
- ✨ Add `partition_cols` param to `Minio.from_df()` by @trymzet in #838
- Update version to 2.0a15 by @trymzet in #839
- ✨ Optimize `MinIO` and `Trino` sources by @trymzet in #845
- ⬆️ Bump version by @trymzet in #859
- Implement `recursive` param in `MinIO.rm()` by @trymzet in #861
- Add `Decimal` type support to Trino by @trymzet in #862
- 🔖 Bump version by @trymzet in #864
- Trino connection context manager by @trymzet in #865
- ⚡️ Performance upgrade of `SAPRFCV2` and `Dockerfile` update by @marcinpurtak in #863
- ➕ Add missing dependencies in `setup.py` by @djagoda881 in #871
- 🔖 Bumped `viadot2` version to `2.0a19` by @djagoda881 in #872
- 🐛 Added dependencies directly to `setup.py` by @djagoda881 in #875
- ⬆️ Upgraded dependencies versions by @djagoda881 in #876
- ✨ Added `validate()` util for dataframe validation by @burzekj in #869
- 🚀 Bumped viadot2 version to `2.0a20` by @djagoda881 in #877
- 🧱 Added new dependence management in the project by @djagoda881 in #878
- Bump `azure-identity` by @trymzet in #879
- ♻️ Apply downstream improvements to the Dockerfile by @trymzet in #881
- 👷 Add CI to 2.0 by @trymzet in #882
- 🐛 Fix pip not installing to viadot user's `.local` dir by @trymzet in #883
- 🐛 Fix Dockerfile build failing due to file permissions by @trymzet in #884
- 📝 Improve contributing docs by @trymzet in #889
- 🐛 Fixing crash when no data returned in `SAPRFCV2` by @marcinpurtak in #887
- ♻️ Added Databricks as optional dependency and source by @djagoda881 in #880
- 🚀 Added new functionality to saprfc source regarding where statement by @adrian-wojcik in #899
- Modified reading the file and inspect provided URL by @Rafalz13 in #914
- Bugfix/saprfcv2 column shift by @marcinpurtak in #916
- 📝 Make docs great again by @trymzet in #924
- 🐛 Fix MkDocs build failing due to incorrect YAML parsing ...
Viadot 0.4.26
What's Changed
Added
- Added option for `SAP RFC` connector to get credentials from Azure KeyVault or by passing a dictionary directly inside the flow.
Fixed
- Fixed the `if_exists` parameter definition in the `CreateTableFromBlob` task.
- Changed `requirements.txt` to bump the version of `dbt-sqlserver` in order to fix the `MAXRECURSION` error in dbt_run.
Changed
- Changed `dbt-sqlserver` version to `git+https://github.com/djagoda881/dbt-sqlserver.git@v1.3.latest_option_clause`.
Removed
- Removed `dbt-core==1.3.2` from `requirements.txt`.
- Removed copying files to conformed/ and operational/ directories when running `ADLSToAzureSQL` flow.
Shortcut
- 🎨 Delete promote_to task from the flow by @malgorzatagwinner in #849
- changed upstream order for df_validation by @dominikjedlinski in #868
- 🐛 Changed dbt packages in requirements.txt by @adrian-wojcik in #855
- Fixed the `if_exists` parameter in `CreateTableFromBlob` task by @Diego-H-S in #874
- ♻️ Changed credentials logic in sap_rfc to get credentials from KeyVault by @adrian-wojcik in #866
- Modify `TransformAndCatalogToLuma` process and commands by @Rafalz13 in #860
- 🐛 Fix bug related to logger.warning and tests in sap_rfc source by @adrian-wojcik in #890
- ♻️ Change `sap_credentials` variable to `credentials` by @adrian-wojcik in #891
- Release 0.4.26 PR by @Rafalz13 in #892
Full Changelog: v0.4.25...v0.4.26
Viadot 0.4.25
What's Changed
Added
- Added logic for if_empty param: `check_if_df_empty` task to `ADLSToAzureSQL` flow.
- Added `geopy` library to `requirements`.
- Added new parameter `validate_df_dict` to `ADLSToAzureSQL` class.
- Added new ViewType `agent_timeline_summary_view` to Genesys.
Shortcut
- GitHub action bug by @burzekj in #832
- Modified `if_empty` logic in `ADLSToAzureSQL` by @burzekj in #833
- Fixed json normalize by @KaterynaIurieva in #829
- `validate_df` implementation to `ADLSToAzureSQL` by @burzekj in #834
- Cleaned small flaws in tests by @KaterynaIurieva in #844
- New Viewtype agent timeline by @Diego-H-S in #846
- Added `geopy` library to requirements by @Rafalz13 in #841
- Release 0.4.25 PR by @Rafalz13 in #847
Full Changelog: v0.4.24...v0.4.25
Viadot 0.4.24
What's Changed
Fixed
- `task_utils/get_nested_value`: fixed an issue with a non-dict parameter passed without a level (1st workflow).
Shortcut
- Hotfix/check_nested_value non dict value handling by @marcinpurtak in #830
- Release 0.4.24 PR by @Rafalz13 in #831
Full Changelog: v0.4.23...v0.4.24
Viadot 0.4.23
What's Changed
Added
- Added tests for new functionalities in SAPRFC and SAPRFCV2 regarding passing credentials.
- Added new params for mapping and reordering DataFrame for `Genesys` task and flow.
- Added `get_task_logs` task to search for logs in the flow.
- Added `get_flow_run_id` task to find flow ID.
- Added `search_for_msg_in_logs` task used to control flows in multiflows by searching for a given log message from a given task.
- Added closing session to `SAPBW`.
- Added `CSV` as a new output extension to `SharepointListToADLS` flow.
Fixed
- Fixed creation of URL in `VidClub` source class. When `region=None`, the region parameter is not included in the URL.
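
The `region=None` behavior described above can be illustrated with a minimal sketch. This is not the `VidClub` implementation: the function name `build_url`, the base URL, and the `page` parameter are hypothetical and chosen only for illustration.

```python
from typing import Optional
from urllib.parse import urlencode


def build_url(base_url: str, source: str, region: Optional[str] = None, **params: str) -> str:
    """Build a request URL, leaving out `region` entirely when it is None.

    Hypothetical helper: names and URL layout are assumptions, not viadot's API.
    """
    query = dict(params)
    if region is not None:
        # Only add the region filter when the caller actually provided one.
        query["region"] = region
    url = f"{base_url}/{source}"
    return f"{url}?{urlencode(query)}" if query else url


# Without a region, no `region=` fragment ends up in the URL.
print(build_url("https://api.example.com", "jobs", region=None, page="1"))
# With a region, it is appended as a regular query parameter.
print(build_url("https://api.example.com", "jobs", region="bg", page="1"))
```

The point of the sketch is that the parameter is omitted rather than serialized as `region=None`, which an API would otherwise treat as a literal filter value.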
Changed
- `if_no_data_returned` added for the Sharepoint list flow, which can fail, warn when no data is returned, or skip (continue) execution in the old way.
- Changed `__init__` in the `SAPRFC` and `SAPRFCV2` source classes to raise a warning in Prefect when credentials are taken from DEV.
Shortcut
- Web msg logic fix by @burzekj in #820
- Sharepoint list to csv by @cgildenia in #823
- added conn.close after each session to sappw by @gwieloch in #825
- 🐛 Added warning logger for credential by @adrian-wojcik in #824
- Fix url in vidclub connector by @KaterynaIurieva in #822
- Feature/sharepoint multiflows control by @marcinpurtak in #826
- Improve `utils.py` test coverage by @Rafalz13 in #817
- Fix tests before release by @Rafalz13 in #827
- Release 0.4.23 PR by @Rafalz13 in #828
New Contributors
- @KaterynaIurieva made their first contribution in #822
Full Changelog: v0.4.22...v0.4.23
Viadot 0.4.22
What's Changed
Added
- Added `TM1` source class.
- Added `TM1ToDF` task class.
- Added `set_prefect_kv` parameter to `BigQueryToADLS` with `False` as a default. If there is a need to create a new pair in the KV Store, the parameter can be changed to `True`.
- Added `_rename_duplicated_fields` method to `SharepointListToDF` task class for finding and renaming duplicated columns.
- Added new view type `agent_interaction_view_type` in `Genesys` source.
- Added new logic for endpoint `users` in `Genesys` task.
- Added libraries `nltk` and `sklearn` to `requirements`.
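
As a rough illustration of what finding and renaming duplicated columns can look like, here is a minimal sketch. It is not the `_rename_duplicated_fields` method itself: the function name, the `_1`, `_2` suffix scheme, and the exact-match (case-sensitive) comparison are all assumptions made for this example.

```python
from collections import Counter
from typing import List


def rename_duplicated_fields(names: List[str]) -> List[str]:
    """Return column names with repeats made unique via numeric suffixes.

    Hypothetical helper: the suffix scheme is an assumption, not viadot's.
    """
    repeats: Counter = Counter()  # how many suffixes were tried per base name
    taken = set()                 # names already emitted
    renamed = []
    for name in names:
        candidate = name
        # Keep bumping the suffix until the candidate collides with nothing.
        while candidate in taken:
            repeats[name] += 1
            candidate = f"{name}_{repeats[name]}"
        taken.add(candidate)
        renamed.append(candidate)
    return renamed


print(rename_duplicated_fields(["Title", "Owner", "Title", "Title"]))
```

The first occurrence keeps its original name, so downstream code that already refers to it is unaffected; only later repeats get a suffix.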
Fixed
- Fixed bug for endpoint `conversations` in GET method in `Genesys` Task.
Changed
- Split test for `Eurostat` into source tests and task tests.
- Modified `SharepointList` source class:
  -> Docstrings update.
- Modified `SharepointToADLS` flow class:
  -> Docstrings update.
  -> Changed set_prefect_kv: bool = False to prevent forced KV store append.
- Modified `SharepointListToADLS` flow class:
  -> Changed set_prefect_kv: bool = False to prevent forced KV store append.
- Modified `SharepointList` source class:
  -> Docstrings update.
  -> Changed `_unpack_fields` method to handle Sharepoint MultiChoiceField type + small improvements.
  -> Changed `get_fields` method to handle special characters - different approach to call get() and execute_query().
  -> Renamed method from `select_expandable_user_fields` to `select_fields` + update for MultiChoiceField type.
  -> Changed `check_filters` method error messages and added more checks.
  -> Changed `operators_mapping` method error messages.
  -> Changed `make_filter_for_df` method error messages.
- Modified `SharepointListToDF` task class:
  -> Docstrings update.
- Split test for Eurostat into source tests and task tests.
- Modified `CustomerGauge` source class with simplified logic to return json structure.
- Expanded `CustomerGaugeToDF` task class with separate cleaning functions and handling nested json structure flattening with two new methods `_field_reference_unpacker` and `_nested_dict_transformer`.
- Changed `CustomerGaugeToADLS` to contain new arguments.
Shortcut
- Genesys agent interaction view type by @Diego-H-S in #803
- Added TM1 connector by @angelika233 in #801
- Eurostat tests improvements by @adrian-wojcik in #806
- ⚡️ Added `set_prefect_kv` parameter to `BigQueryToADLS` flow by @Rafalz13 in #809
- Improve test coverage for viadot source package by @Rafalz13 in #807
- Dev customer gauge unpacker by @hhnks in #800
- ✨ Add new requirements by @malgorzatagwinner in #813
- Feature/sharepoint list extension kpi5 by @marcinpurtak in #810
- web_msg logic fix by @burzekj in #812
- Fixed typo in tests by @marcinpurtak in #814
- ✨ new logic to extracting users from genesys by @burzekj in #811
- Revert of changes for desired columns by @marcinpurtak in #815
- Release 0.4.22 PR by @Rafalz13 in #816
New Contributors
- @malgorzatagwinner made their first contribution in #813
Full Changelog: v0.4.21...v0.4.22
Viadot 0.4.21
[0.4.21] - 2023-10-26
Added
- Added `validate_df` task to task_utils.
- Added `SharepointList` source class.
- Added `SharepointListToDF` task class.
- Added `SharepointListToADLS` flow class.
- Added tests for `SharepointList`.
- Added `get_nested_dict` to utils.py.
Fixed
Changed
- Changed `GenesysToCSV` logic for end_point == "conversations". Added new fields to extraction.
Viadot 0.4.20
What's Changed
Added
- Added `Office365-REST-Python-Client` library to `requirements`.
- Added the `GetSalesQuotationData` view in the `BusinessCore` source.
- Added new ViewType `queue_interaction_detail_view` to Genesys.
- Added new column `_viadot_source` to BigQuery extraction.
Changed
- Changed the flow name from `TransformAndCatalog` to `TransformAndCatalogToLuma`.
- Modified `add_viadot_metadata_columns` so that the decorator accepts a `source_name` parameter when applied to the `to_df` function or to the function where the DataFrame is generated.
- Changed the `SharepointToDF` task to apply `add_viadot_metadata_columns` with the value `source_name="Sharepoint"` after these changes.
- Changed `Mindful` credentials to be passed via the `auth` parameter instead of the `header`.
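
The parameterized-decorator pattern mentioned in the entries above can be sketched as follows. This is an assumption-laden illustration, not viadot's `add_viadot_metadata_columns`: the decorator name `add_source_column` and the class `FakeSharepoint` are hypothetical, and a plain dict stands in for a DataFrame so the example needs no third-party packages.

```python
import functools
from typing import Any, Callable


def add_source_column(source_name: str) -> Callable:
    """Decorator factory: tag the table returned by `func` with its source name.

    Hypothetical stand-in for a metadata decorator; only the `_viadot_source`
    column name is taken from the changelog.
    """
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            table = func(*args, **kwargs)
            # Item assignment also works on a pandas DataFrame, where a scalar
            # is broadcast to every row; here the table is a plain dict.
            table["_viadot_source"] = source_name
            return table
        return wrapper
    return decorator


class FakeSharepoint:
    """Hypothetical source class used only to demonstrate the decorator."""

    @add_source_column(source_name="Sharepoint")
    def to_df(self) -> dict:
        return {"id": [1, 2]}


print(FakeSharepoint().to_df())
```

Passing the name as a decorator argument lets a task that does not belong to a source class still label its output explicitly.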
Shortcut
- Add `GetSalesQuotationData` to Business core by @angelika233 in #752
- Mindful auth by @Diego-H-S in #744
- ♻️ Changed _and_add_viadot_metadata_columns_decorator and applied it t… by @adrian-wojcik in #758
- 🐛 Fixed bug in test after changing decorator by @adrian-wojcik in #760
- Genesys interaction view type by @Diego-H-S in #761
- Added Office365-REST-Python-Client to the requirements.txt by @burzekj in #753
- 📝 updated documentation by @angelika233 in #754
- ✨ Modify `TransformAndCatalog` flow by @Rafalz13 in #750
- Add `_viadot_source` column to `BigQuery` by @Rafalz13 in #763
- Correct tests before release by @Rafalz13 in #764
- Release 0.4.20 PR by @Rafalz13 in #765
Full Changelog: v0.4.19...v0.4.20