Releases: dyvenia/viadot

Viadot 0.4.9

27 Sep 11:56
7c0e534

Added

  • Added a new column, _viadot_downloaded_at_utc, to Genesys files, containing the UTC datetime at which the file was created
  • Added SFTP source class SftpConnector
  • Added SFTP tasks SftpToDF and SftpList
  • Added SFTP flows SftpToAzureSQL and SftpToADLS
  • Added new source file mindful to connect with the Mindful API
  • Added new task file mindful to be called by the Mindful flow
  • Added new flow file mindful_to_adls to upload data from the Mindful API to ADLS
  • Added recursive parameter to AzureDataLakeList task
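The metadata column described above can be reproduced with plain pandas. The column name comes from the release notes; the helper below is a minimal illustrative sketch, not viadot's actual implementation:

```python
from datetime import datetime, timezone

import pandas as pd


def add_downloaded_at_utc(df: pd.DataFrame) -> pd.DataFrame:
    """Stamp each row with the UTC datetime at which the data was downloaded."""
    df = df.copy()  # avoid mutating the caller's frame
    df["_viadot_downloaded_at_utc"] = datetime.now(timezone.utc)
    return df


df = add_downloaded_at_utc(pd.DataFrame({"media_type": ["voice", "chat"]}))
```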

Viadot 0.4.8

06 Sep 14:22
361e998

Added

  • Added protobuf library to requirements

Viadot 0.4.7

06 Sep 12:45
520bb67

Added

  • Added new flow SQLServerTransform and new task SQLServerQuery to run queries on SQL Server
  • Added duckdb_query parameter to DuckDBToSQLServer flow to enable creating tables from the outputs of SQL queries
  • Added handling of empty DataFrames in set_new_kv() task
  • Added update_kv and filter_column params to SAPRFCToADLS and SAPToDuckDB flows and added set_new_kv() task in task_utils
  • Added Genesys API source Genesys
  • Added tasks GenesysToCSV and GenesysToDF
  • Added flows GenesysToADLS and GenesysReportToADLS
  • Added query parameter to PrefectLogs flow

Changed

  • Updated requirements.txt
  • Changed handle_api_response() method to support more request methods and added a context manager
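The context-manager idea behind that change can be sketched with the standard library alone (viadot's actual utility is built differently; the function body, parameters, and the "data:" URL demo below are all illustrative):

```python
import urllib.request
from contextlib import contextmanager


@contextmanager
def handle_api_response(url: str, method: str = "GET", data: bytes = None):
    """Dispatch on the HTTP method and guarantee the response is closed,
    which is the point of wrapping the call in a context manager."""
    request = urllib.request.Request(url, data=data, method=method)
    response = urllib.request.urlopen(request)
    try:
        yield response
    finally:
        response.close()


# "data:" URLs let the sketch run without a network connection
with handle_api_response("data:text/plain,hello") as response:
    body = response.read()
```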

Viadot 0.4.6

21 Jul 13:09
0088556

Added

  • Added rfc_character_limit parameter in SAPRFCToDF task, SAPRFC source, SAPRFCToADLS and SAPToDuckDB flows
  • Added on_bcp_error and bcp_error_log_path parameters in BCPTask
  • Added ability to process queries whose results exceed SAP's character-per-row limit in the SAPRFC source
  • Added new flow PrefectLogs for extracting all logs from Prefect with details

Changed

  • Changed CheckColumnOrder task and ADLSToAzureSQL flow to handle appending to a non-existing table
  • Changed tasks order in EpicorOrdersToDuckDB, SAPToDuckDB and SQLServerToDuckDB - casting
    DF to string before adding metadata
  • Changed add_ingestion_metadata_task() not to add the metadata column when the input DataFrame is empty
  • Changed check_if_empty_file() logic according to changes in add_ingestion_metadata_task()
  • Changed accepted values of if_empty parameter in DuckDBCreateTableFromParquet
  • Updated .gitignore to ignore files with *.bak extension and to ignore credentials.json in any directory
  • Changed logger messages in AzureDataLakeRemove task

Fixed

  • Fixed handling empty response in SAPRFC source
  • Fixed issue in BCPTask when the log file couldn't be opened
  • Fixed log being printed too early in Salesforce source, which would sometimes cause a KeyError
  • raise_on_error now behaves correctly in upsert() when receiving incorrect return codes from Salesforce

Removed

  • Removed option to run multiple queries in SAPRFCToADLS

Viadot 0.4.5

23 Jun 12:13
f35e6df

Added

  • Added error_log_file_path parameter in BCPTask that enables setting the name of the error log file
  • Added on_error parameter in BCPTask that specifies what to do if a BCP error occurs
  • Added error log file and on_bcp_error parameter in ADLSToAzureSQL
  • Added handling of POST requests in handle_api_response() and added it to the Epicor source
  • Added SalesforceToDF task
  • Added SalesforceToADLS flow
  • Added overwrite_adls option to BigQueryToADLS and SharepointToADLS
  • Added cast_df_to_str task in utils.py and added this to EpicorToDuckDB, SAPToDuckDB, SQLServerToDuckDB
  • Added if_empty parameter in DuckDBCreateTableFromParquet task and in EpicorToDuckDB, SAPToDuckDB,
    SQLServerToDuckDB flows to check if output Parquet is empty and handle it properly.
  • Added check_if_empty_file() and handle_if_empty_file() in utils.py
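The two empty-file utilities can be sketched as follows; the function names come from the release notes, but the bodies and the accepted if_empty values ("warn", "fail") are assumptions for illustration, not viadot's exact behaviour:

```python
import logging
import os

logger = logging.getLogger(__name__)


def check_if_empty_file(path: str) -> bool:
    """Treat both a missing file and a zero-byte file as empty."""
    return not os.path.isfile(path) or os.path.getsize(path) == 0


def handle_if_empty_file(path: str, if_empty: str = "warn") -> None:
    """React to an empty input file according to the if_empty policy."""
    if not check_if_empty_file(path):
        return
    if if_empty == "fail":
        raise ValueError(f"Input file {path} is empty.")
    logger.warning("Input file %s is empty.", path)
```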

Viadot 0.4.4

09 Jun 13:18
7d8f8b0

Added

  • Added new connector - Outlook. Created Outlook source, OutlookToDF task and OutlookToADLS flow.
  • Added new connector - Epicor. Created Epicor source, EpicorToDF task and EpicorToDuckDB flow.
  • Enabled Databricks Connect in the image. To enable, follow this guide
  • Added MySQL source and MySqlToADLS flow
  • Added SQLServerToDF task
  • Added SQLServerToDuckDB flow which downloads data from a SQL Server table, loads it into a Parquet file and then uploads it to DuckDB
  • Added complete proxy set up in SAPRFC example (viadot/examples/sap_rfc)

Changed

  • Changed default name for the Prefect secret holding the name of the Azure KV secret storing Sendgrid credentials

Viadot 0.4.3

28 Apr 15:12
9c7db6d

Added

  • Added adls_file_name in SupermetricsToADLS and SharepointToADLS flows
  • Added BigQueryToADLS flow class which enables extracting data from BigQuery
  • Added Salesforce source
  • Added SalesforceUpsert task
  • Added SalesforceBulkUpsert task
  • Added C4C secret handling to CloudForCustomersReportToADLS flow (c4c_credentials_secret parameter)

Fixed

  • Fixed get_flow_last_run_date() incorrectly parsing the date
  • Fixed C4C secret handling: tasks now read the secret directly as the credentials, rather than assuming it is a dict of the form {env: credentials, env2: credentials2} and accessing a specific environment's key inside it
  • Fixed utils.gen_bulk_insert_query_from_df() failing with > 1000 rows due to the INSERT clause limit by chunking the data into multiple INSERTs
  • Fixed MultipleFlows when only one flow is passed and when the last flow fails
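The chunking fix works around SQL Server's 1000-row limit for a single INSERT ... VALUES statement. A minimal sketch of the idea (the name mirrors the changelog entry but the signature is assumed; repr()-based value rendering is for illustration only, real code should use parameterized queries):

```python
def gen_bulk_insert_statements(table, columns, rows, chunk_size=1000):
    """Emit one INSERT statement per chunk of at most chunk_size rows,
    since SQL Server rejects more than 1000 row value expressions."""
    statements = []
    col_list = ", ".join(columns)
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        values = ",\n".join(
            "(" + ", ".join(repr(value) for value in row) + ")" for row in chunk
        )
        statements.append(f"INSERT INTO {table} ({col_list})\nVALUES {values};")
    return statements


# 2500 rows split into batches of 1000, 1000 and 500
stmts = gen_bulk_insert_statements(
    "sales", ["id", "amount"], [(i, i * 1.5) for i in range(2500)]
)
```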

Viadot 0.4.2

08 Apr 08:28
99a4c5b

Added

  • Added AzureDataLakeRemove task

Changed

  • Changed name of task file from prefect to prefect_data_range

Fixed

  • Fixed out of range issue in prefect_data_range

Viadot 0.4.1

07 Apr 13:54
a5bde2a

Changed

Hot fix - bumped version

Viadot 0.4.0

07 Apr 12:44
1e281f6

Added

  • Added custom_mail_state_handler function that sends mail notifications using a custom SMTP server
  • Added df_clean_column util task that removes special characters from a pandas DataFrame
  • Added MultipleFlows flow class which enables running multiple flows in a given order.
  • Added GetFlowNewDateRange task to change date range based on Prefect flows
  • Added check_col_order parameter in ADLSToAzureSQL
  • Added new source ASElite
  • Added KeyVault support in CloudForCustomers tasks
  • Added SQLServer source
  • Added DuckDBToDF task
  • Added DuckDBTransform flow
  • Added SQLServerCreateTable task
  • Added credentials param to BCPTask
  • Added get_sql_dtypes_from_df and update_dict util tasks
  • Added DuckDBToSQLServer flow
  • Added if_exists="append" option to DuckDB.create_table_from_parquet()
  • Added get_flow_last_run_date util function
  • Added df_to_dataset task util for writing DataFrames to data lakes using pyarrow
  • Added retries to Cloud for Customers tasks
  • Added chunksize parameter to C4CToDF task to allow pulling data in chunks
  • Added chunksize parameter to BCPTask task to allow more control over the load process
  • Added support for SQL Server's custom datetimeoffset type
  • Added AzureSQLToDF task
  • Added AzureSQLUpsert task

Changed

  • Changed the base class of AzureSQL to SQLServer
  • df_to_parquet() task now creates directories if needed
  • Added several more separators to check for automatically in SAPRFC.to_df()
  • Upgraded duckdb version to 0.3.2
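The automatic separator picking mentioned above can be illustrated with a simple frequency heuristic; the candidate list and function name below are assumptions for the sketch, not SAPRFC's actual separator set or API:

```python
def pick_separator(sample_line, candidates=("|", "\t", ";", "~", "^", ",")):
    """Return the candidate separator that occurs most often in the
    sample line, or None when no candidate appears at all."""
    best = max(candidates, key=sample_line.count)
    return best if sample_line.count(best) > 0 else None
```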

Fixed

  • Fixed bug with CheckColumnOrder task
  • Fixed OpenSSL config for old SQL Servers still using TLS < 1.2
  • BCPTask now correctly handles custom SQL Server port
  • Fixed SAPRFC.to_df() ignoring user-specified separator
  • Fixed temporary CSV generated by the DuckDBToSQLServer flow not being cleaned up
  • Fixed some mappings in get_sql_dtypes_from_df() and optimized performance
  • Fixed BCPTask - the case when the file path contained a space
  • Fixed credential evaluation logic (credentials is now evaluated before config_key)
  • Fixed "$top" and "$skip" values being ignored by C4CToDF task if provided in the params parameter
  • Fixed SQL.to_df() incorrectly handling queries that begin with whitespace

Removed

  • Removed autopick_sep parameter from SAPRFC functions. The separator is now always picked automatically if not provided.
  • Moved dtypes_to_json task to task_utils.py