
v2.1.0

github-actions released this 22 Aug 09:19 · 1684 commits to main since this release · 46b51b7

What's Changed

New Contributors

Full Changelog: v0.4.3...v2.1.0

Old changelog

Added

  • Added new version of Genesys connector and test files.
  • Added new version of Outlook connector and test files.
  • Added new version of Hubspot connector and test files.
  • Added Mindful connector and test file.
  • Added sql_server_to_parquet Prefect flow.
  • Added sap_to_parquet Prefect flow.
  • Added duckdb_to_sql_server, duckdb_to_parquet, duckdb_transform Prefect flows.
  • Added bcp and duckdb_query Prefect tasks.
  • Added DuckDB source class.
  • Added sql_server_to_minio flow for Prefect (see the usage sketch after this list).
  • Added df_to_minio task for Prefect.
  • Added handling for DatabaseCredentials and Secret blocks in prefect/utils.py:get_credentials
  • Added SQLServer source and tasks create_sql_server_table, sql_server_to_df, sql_server_query
  • Added basename_template to MinIO source
  • Added _empty_column_to_string and _convert_all_to_string_type to convert data types to string.
  • Added na_values parameter to the Sharepoint class to parse N/A values coming from Excel file columns.
  • Added get_last_segment_from_url function to the Sharepoint source file.
  • Added validate function to viadot/utils.py
  • Fixed Databricks.create_table_from_pandas() failing to overwrite a table in some cases even with replace="True"
  • Enabled Databricks Connect in the image. To enable, follow this guide
  • Added Databricks source
  • Added ExchangeRates source
  • Added from_df() method to AzureDataLake source
  • Added SAPRFC source
  • Added S3 source
  • Added RedshiftSpectrum source
  • Added upload() and download() methods to S3 source
  • Added Genesys source
  • Fixed a bug in Databricks.create_table_from_pandas(). The function that converts column names to snake_case was not used in every case. (#672)
  • Added howto_migrate_sources_tasks_and_flows.md document explaining viadot 1 -> 2 migration process
  • RedshiftSpectrum.from_df() now automatically creates a folder for the table if not specified in to_path
  • Fixed a bug in Databricks.create_table_from_pandas(). The function now automatically casts DataFrame types. (#681)
  • Added close_connection() to SAPRFC
  • Added Trino source
  • Added MinIO source
  • Added gen_split() method to SAPRFCV2 class to allow looping over a DataFrame with a generator, which improves performance
  • Added adjust_where_condition_by_adding_missing_spaces() to SAPRFC. The function checks the raw SQL query and modifies it if needed.
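
Several of the new items above are Prefect building blocks. Below is a minimal sketch of how the new SQL Server task, the df_to_minio task, and a flow might be combined. The import paths, task names, and keyword arguments are assumptions for illustration only and are not confirmed by this changelog; check the viadot 2.x source for the actual signatures.

```python
# Hypothetical usage sketch of the new SQL Server pieces in a Prefect 2 flow.
# Import paths and keyword arguments below are assumptions, not the confirmed API.
from prefect import flow

from viadot.tasks import df_to_minio, sql_server_to_df  # assumed import path


@flow(name="sql-server-to-minio-example")
def sql_server_to_minio_example(query: str, path: str) -> None:
    """Download the result of a SQL Server query and upload it to MinIO."""
    df = sql_server_to_df(query=query, config_key="sql_server_dev")  # assumed kwargs
    df_to_minio(df=df, path=path, config_key="minio_dev")  # assumed kwargs


if __name__ == "__main__":
    sql_server_to_minio_example(
        query="SELECT TOP 10 * FROM sales.orders",
        path="s3://data-lake/raw/orders.parquet",
    )
```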

Changed

  • Changed location of task_utils.py and removed unused/prefect1-related tasks.
  • Changed the way of handling NA string values and mapped column types to str for Sharepoint source.
  • Added SQLServerToDF task
  • Added SQLServerToDuckDB flow which downloads data from a SQLServer table, loads it to a Parquet file, and then uploads it to DuckDB
  • Added complete proxy set up in SAPRFC example (viadot/examples/sap_rfc)
  • Added Databricks/Spark setup to the image. See README for setup & usage instructions
  • Added rollback feature to Databricks source
  • Changed all Prefect logging instances in the sources directory to native Python logging
  • Changed rm(), from_df(), to_df() methods in S3 Source
  • Changed get_request() to handle_api_request() in utils.py (see the call-site sketch after this list)
  • Changed SAPRFCV2.to_df() to replace the for loop with a generator
  • Updated Dockerfile to remove obsolete adoptopenjdk and replace it with temurin
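
For callers affected by the get_request() to handle_api_request() rename, a hedged sketch of the call-site change, assuming the keyword arguments shown:

```python
# Hypothetical call-site update after the get_request() -> handle_api_request()
# rename; the keyword arguments shown are assumptions, not the confirmed signature.
from viadot.utils import handle_api_request

# Before (viadot 1.x):
# response = get_request("https://api.example.com/v1/items", params={"limit": 10})

# After (viadot 2.x):
response = handle_api_request(
    url="https://api.example.com/v1/items",
    params={"limit": 10},  # assumed keyword
    method="GET",  # assumed keyword
)
print(response.status_code)
```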

Removed

  • Removed the env param from Databricks source, as the user can now store multiple configs for the same source using different config keys (see the sketch after this list)
  • Removed Prefect dependency from the library (Python library, Docker base image)
  • Removed catch_extra_separators() from SAPRFCV2 class
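
To illustrate the removed env parameter on the Databricks source, a hypothetical before/after using config keys; the import path and key names are assumptions for illustration:

```python
# Hypothetical before/after for the removed `env` parameter of the Databricks
# source; the config key names below are made up for illustration.
from viadot.sources import Databricks  # assumed import path

# Before (viadot 1.x): the environment was selected via the `env` parameter.
# databricks = Databricks(env="QA")

# After (viadot 2.x): store one config entry per environment and pick it by key.
databricks_qa = Databricks(config_key="databricks_qa")
databricks_prod = Databricks(config_key="databricks_prod")
```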

Fixed

  • Fixed the bcp Prefect task to run correctly.
  • Fixed a typo in credentials in the SQLServer source